Nov 30, 2009

Data Classification

Let's take a look at what 'Data' comprises for any enterprise. Consider a hypothetical company, "Orange", with 30,000 stores worldwide and 40,000 employees, operating in 10 different countries. It manufactures and sells around 15,000 varieties of products through different direct and indirect sales channels, has around 10,000 direct customers, and procures various raw and finished products from 1,500 suppliers across the globe. These staggering numbers of employees, locations, products, customers, suppliers, and so on represent facts or information about real-world objects. For this organization, 'Data' is the representation of these real-world objects. Data can represent virtual objects as well. Let's go through an illustration of a very common online shopping experience, which involves data representing objects from both worlds.

1.  The end customer logs in to the online shopping portal.
2.  The customer selects and configures the right product.
3.  The customer initiates the checkout process:

  • Provides a shipping address.
  • Provides payment details.
  • Confirms the order.
4.  The customer receives an email confirmation with an order tracking number and an estimated delivery date.

The entire process involves data flowing from one point to another and ends with shipment of the product. Product, customer, location, etc. represent real-world objects, but the information about the order, invoice, and receipt that is shared between two points in electronic format is transactional in nature and represents virtual objects. A pictorial representation of the data captured during online shopping is shown below:



In essence, the data of any enterprise can be classified into the following major categories:

Non-Transactional Data (Master Data) :

Non-transactional data represents the critical nouns of any business and generally provides information about people, locations, things, and concepts. It is a very common class of data that exists across all industries: retail, manufacturing, engineering, oil, pharmaceuticals, etc.

Transactional Data :

Transactional data relates to sales, invoices, receipts, claims, and other monetary and non-monetary interactions.
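To make the split between master and transactional data concrete, here is a minimal Python sketch. The class names, fields, and sample values are all hypothetical and chosen only for illustration. The point is that a transaction (an order) refers to master records (customer, product) by key rather than copying them.

```python
from dataclasses import dataclass
from datetime import date

# Master (non-transactional) data: long-lived "nouns" of the business.
# All names and fields here are illustrative assumptions.
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str
    shipping_address: str

@dataclass(frozen=True)
class Product:
    product_id: str
    description: str
    unit_price_cents: int  # integer cents avoid floating-point rounding

# Transactional data: an event that references master records by key.
@dataclass(frozen=True)
class OrderLine:
    product_id: str
    quantity: int

@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str
    order_date: date
    lines: tuple

def order_total_cents(order, products):
    """Sum quantity * unit price, looking each product up in the master data."""
    return sum(
        line.quantity * products[line.product_id].unit_price_cents
        for line in order.lines
    )

# Sample data (made up for this sketch).
products = {
    "P1": Product("P1", "Widget", 1250),
    "P2": Product("P2", "Gadget", 400),
}
customer = Customer("C1", "Jane Doe", "1 Main Street")
order = Order(
    "O1",
    customer.customer_id,
    date(2009, 11, 30),
    (OrderLine("P1", 2), OrderLine("P2", 3)),
)
total = order_total_cents(order, products)
```

The product and customer records would exist whether or not any order is ever placed, while the order only makes sense in terms of the master records it points to.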

Data About Data (Metadata) :

Metadata is data about other data. It contains details such as the data type (character, number, date, etc.) and comments about the data, and is generally maintained in some kind of repository.
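As a rough illustration, a metadata repository can be pictured as a small data dictionary that records the type and a comment for each field. The sketch below is in Python; the field names, the dictionary layout, and the `type_errors` helper are assumptions made for this example, not a reference to any particular metadata tool.

```python
from datetime import date

# Hypothetical metadata for a "customer" record: one entry per field,
# holding the expected data type and a free-text comment.
CUSTOMER_METADATA = {
    "customer_id": {"type": str, "comment": "Unique customer key"},
    "name": {"type": str, "comment": "Full name of the customer"},
    "signup_date": {"type": date, "comment": "Date the account was created"},
    "loyalty_points": {"type": int, "comment": "Current points balance"},
}

def type_errors(record, metadata):
    """Return the names of fields that are missing or have the wrong type."""
    return [
        field
        for field, meta in metadata.items()
        if not isinstance(record.get(field), meta["type"])
    ]

good_record = {
    "customer_id": "C1",
    "name": "Jane Doe",
    "signup_date": date(2009, 11, 30),
    "loyalty_points": 120,
}
# Same record, but the points balance arrived as text instead of a number.
bad_record = dict(good_record, loyalty_points="120")
```

Here the metadata itself holds no customer information. It only describes what a valid customer record looks like, which is what lets a generic check such as `type_errors` work on any record.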

Unstructured Data :

Unstructured data is found in a variety of formats, such as email, websites, white papers, and legal and marketing collateral.

Analytical Data :

As the word 'Analytical' suggests, this data assists a company's decision making. Analytical data comprises various forms of data assembled into one large data warehouse or multiple data marts to help the enterprise make wise and intelligent decisions, for example to improve sales, reduce costs, or identify latency in the supply chain.
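A very small sketch of the idea, in Python: transactional rows are rolled up into summaries that a decision maker can read at a glance. The sample rows and the `revenue_by` helper are hypothetical, and a real data warehouse would do this at far larger scale with dedicated tooling.

```python
from collections import defaultdict

# Made-up sales transactions; amounts are in integer cents.
SALES = [
    {"country": "US", "product_id": "P1", "amount_cents": 2500},
    {"country": "US", "product_id": "P2", "amount_cents": 1200},
    {"country": "DE", "product_id": "P1", "amount_cents": 800},
    {"country": "DE", "product_id": "P2", "amount_cents": 400},
]

def revenue_by(rows, key):
    """Total the sales amount for each distinct value of the given field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["amount_cents"]
    return dict(totals)

by_country = revenue_by(SALES, "country")
by_product = revenue_by(SALES, "product_id")
```

The same transactional rows can be summarized along different dimensions (country, product, and so on), which is roughly what a data mart offers for one area of the business.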
