Jun 26, 2010

Data Quality

Stumbled upon this very well-written article about 'evaluating the data capture process':
http://iaidq.info/publications/doc2/olson-2009-10.shtml

The post covers a very interesting aspect of the data capture process and a possible cause of poor data health. At the same time, I've noticed that in many instances the third party (the information provider) is unwilling to disclose certain information, or deliberately provides inaccurate information such as address, telephone number, SSN, or annual salary. For instance, a week ago I received a call from a salesman at a health insurance company. He enquired about my current health insurance policy and offered his products as well. During the discussion he gathered several pieces of information, such as my current address, phone number, and SSN. To avoid further calls, and of course for security reasons, I declined to give correct information. The salesman cannot validate all of this information, so the system that captures it ends up with data contaminated by wrong values.

Similarly, there are various other possible causes of bad data in a system, and such causes are inevitable. From a pure data standpoint, and based on the nature of the data, the system should provide a means to classify the information that is pertinent to other systems or transactions before it is used as 'information' or considered good data. The system should also allow rules to be configured so that the system or data owner is notified periodically about the state of the data, can drill down to the affected records, and can take corrective action.
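The idea of configurable rules that report on the state of the data could look roughly like the sketch below. The rule names, field names, and sample records are all made up for illustration, and this is not how any particular MDM product does it.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str                      # label shown to the data owner
    field: str                     # record field the rule inspects
    check: Callable[[str], bool]   # returns True when the value looks valid


# Hypothetical rules; a real system would load these from configuration.
RULES = [
    Rule("phone has 10 digits", "phone",
         lambda v: len(re.sub(r"\D", "", v)) == 10),
    Rule("ssn matches NNN-NN-NNNN", "ssn",
         lambda v: re.fullmatch(r"\d{3}-\d{2}-\d{4}", v) is not None),
    Rule("address is not blank", "address",
         lambda v: bool(v.strip())),
]


def evaluate(records, rules=RULES):
    """Return (record_id, rule_name) for every failed check (the drill-down list)."""
    failures = []
    for rec in records:
        for rule in rules:
            value = rec.get(rule.field, "")
            if not rule.check(value):
                failures.append((rec["id"], rule.name))
    return failures


def summarize(failures):
    """Count failures per rule, e.g. for a periodic notification to the data owner."""
    counts = {}
    for _, rule_name in failures:
        counts[rule_name] = counts.get(rule_name, 0) + 1
    return counts


# Made-up sample records.
records = [
    {"id": 1, "phone": "555-010-1234", "ssn": "123-45-6789", "address": "1 Main St"},
    {"id": 2, "phone": "000", "ssn": "n/a", "address": " "},
]
failures = evaluate(records)
report = summarize(failures)
```

Note that rules like these only catch malformed values. A well-formed but deliberately wrong phone number, as in the salesman example, would still pass and needs other means of verification.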

Apr 23, 2010

CRUD Matrix

One of the critical tasks in an MDM project is to identify and define owners for the various data elements. A sample spreadsheet like the one shown below can be used as a starting point to collect this information, which can then be used to define governance around those elements: who updates what, and who can view what?



Each cell can hold values such as C (Create), U (Update), R (Read Only), and D (Delete). A function whose cell for a data element contains C or U is a natural owner of that element. Such a spreadsheet also helps identify conflicts in ownership, which are possible in a large enterprise.
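The ownership rule above can be sketched in a few lines. The function and data element names here are invented, and storing several letters in one cell (such as "CRUD") is an assumption about how the spreadsheet might be filled in.

```python
# Rows are business functions, columns are data elements.
# All names and cell values are illustrative only.
crud_matrix = {
    "Product Management": {"Product Name": "CRUD", "List Price": "R"},
    "Finance":            {"Product Name": "R",    "List Price": "CU"},
    "Marketing":          {"Product Name": "U",    "List Price": "R"},
}


def find_owners(matrix):
    """Map each data element to the functions that can Create or Update it."""
    owners = {}
    for function, cells in matrix.items():
        for element, permissions in cells.items():
            if "C" in permissions or "U" in permissions:
                owners.setdefault(element, []).append(function)
    return owners


def find_conflicts(owners):
    """Elements with more than one natural owner need a governance decision."""
    return {element: fns for element, fns in owners.items() if len(fns) > 1}


owners = find_owners(crud_matrix)
conflicts = find_conflicts(owners)
```

In this made-up matrix, "Product Name" ends up with two candidate owners, which is exactly the kind of conflict the spreadsheet is meant to surface.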

Data Quality Dimensions

This is a very well-written article on data quality:
http://www.information-management.com/issues/2007_58/master_data_management_mdm_quality-10015358-1.html?ET=informationmgmt:e963:2046487a:&st=email

The most challenging aspect of any MDM initiative is raising awareness of data quality among the business and IT communities. In my current assignment, I interviewed a few data stewards and realized that the data collection process is still not streamlined in many industries. Data is extracted from spreadsheets and various design documents before it is entered into the system. In an engineering organization, a proper SDLC is followed to design, develop, test, and release the actual product. During this process, product-related information is captured from various departments such as Product Management, Marketing, Finance, Release, and Legal, and most of it is either in people's heads or documented in unstructured formats. Extracting such information from people's heads and putting it into a system consumes a considerable amount of time and effort. Also, non-availability of information leads to inconsistent and incomplete data. Data collection is a multi-step, iterative process, and defining SLAs for each step helps yield timely and accurate data.

Also, data definitions, dimensions, and domain values change over time, and not having a well-defined roadmap or process to manage such changes is another cause of inconsistent information.

Feb 23, 2010

MDM & Data Quality Vendors

List of various vendors having MDM/Data Quality offerings.


Acxiom
 www.acxiom.com/CDI
Alliance Consulting (Integrated Customer View)
www.alliance-consulting.com/$c1$MDM_Overview$214_1.htm
Amalto
http://www.amalto.com/
Amdocs
www.amdocs.com/Site/OLDOfferings/Offering%20Framework/IM/ECH.htm
Ascential Enterprise Integration Suite (IBM Information Server suite)
www.ascential.com/products/eis.html
BEA Liquid Data 
http://dev2dev.bea.com/aqualogic
Composite Software
www.compositesoftware.com/solutions/cdi_mdm.shtml
DataFlux
www.dataflux.com/Business-Issues/Unified-Enterprise-View/index.asp qMDM
Data Foundations
http://www.datafoundations.com/
DWL
www.306.ibm.com/software/data/masterdata/customer
Epiphany – Customer Relationship Backbone
www.infor.com/product_summary/crm/epiphany/ (now Infor)
Fair Isaac
www.fairisaac.com/Fairisaac/Solutions/Enterprise+Applications/
FullTilt Solutions
http://www.fulltiltsolutions.com/ Perfect Product Suite (now QAD)
Gaine Solutions
http://www.gainesolutions.com/
Global IDs
www.globalids.com/Global_Customer_Data_Integration_Solutions.html
GoldenGate Software
http://www.goldengate.com/

GoldenSource
http://www.thegoldensource.com/
Heiler Software
http://www.heiler.com/
Human Inference
www.humaninference.com/solutions/single-customer-view/
Hybris
http://www.hybris.com/
Hyperion – MDM Server
www.oracle.com/appserver/business-intelligence/data-relationship-management.html
IBM InfoSphere MDM Server
www.306.ibm.com/software/data/masterdata/customer
IBM WebSphere Information Integrator
www.ibm.com/software/data/integration/
Identity Systems
www.identitysystems.com/ (see Informatica)
Informatica
www.informatica.com/solutions/master_data_management
Initiate Systems
www.initiate.com/PRODUCTS/Pages/default.aspx
Innovative Systems
www.innovativesystems.com/customer-centric/index.php
IntelliSync
http://www.identitysystems.com/
Kalido – Master Data Manager
www.kalido.com/products/mdm
Liaison Technologies –
http://www.liaison.com/
MetaMatrix
www.metamatrix.com/l3_server.html
Microsoft MDM
www.microsoft.com/sharepoint/mdm/default.mspx
Nimaya IdentitySync
www.nimaya.com/products/ActionBridge.asp
ObjectStar
www.tibco.com/company/news/releases/2005/press654.jsp (now TIBCO CIM)
Oracle – Oracle Customers Online, Oracle Customer Data Hub, Citizen Data Hub, Financial Consolidation Hub, Financial Services Accounting Data Hub, Product Data Hub
www.oracle.com/data_hub/index.html
Orchestra Networks EBX.Platform –
http://www.orchestranetworks.com/
Purisma – Purisma Data Hub
http://www.purisma.com/
Riversand Technologies –
http://www.riversand.com/
Sanchez CRM – www.fidelityinfoservices.com/FNFIS/Markets/FinancialIndustries/MidTierLgBanking/DataAccessIntegraSolu
SAP – SAP NetWeaver MDM
www.sap.com/platform/netweaver/components/mdm/index.epx
SAS – OEMs/resells DataFlux
www.sas.com/technologies/dw/masterdatamgmt/index.html
Siebel UCM –
www.oracle.com/applications/crm/siebel/customer-data-integration/index.html
Silver Creek Systems 
http://www.silvercreeksystems.com/
Siperian Hub
 http://www.siperian.com/
Software AG webMethods ESB Platform – www.softwareag.com/Corporate/products/wm/integration/default.asp
SRD
www.03.ibm.com/industries/telecom/doc/content/solution/262178102.html
Stratature +EDM (Microsoft MDM) –
http://www.stratature.com/
Sun MDM Suite & Mural
http://www.sun.com/
Teradata MDM
www.teradata.com/master-data-management
TIBCO Collaborative Information Manager – www.tibco.com/software/master_data_management/collaborative_information_manager/default.jsp
Trillium
www.trilliumsoftware.com/home/products/enterprise-integration/cdi-mdm-connectors.aspx
VisionWare MultiVue
http://www.visionwareplc.com/

Data Quality Vendor Products

Address Doctor --
http://www.addressdoctor.com/
Business Objects (Firstlogic) - An SAP Company -- www.businessobjects.com/product/packages/integration.asp
DataFlux
http://www.dataflux.com/
Datanomics
http://www.datanomics.com/
Exeros
http://www.exeros.com/
Group 1 Software (Pitney Bowes)
www.g1.com/Solutions/Business
Harte Hanks Global Address
http://www.globaladdress.net/
Informatica (Similarity, Evoke)
www.informatica.com/products_services/data_quality/Pages/index.aspx
Melissa Data
http://www.melissadata.com/
Netrics
http://www.netrics.com/
Omikron
http://www.omikron.net/
Trillium (Harte Hanks company) –
http://www.trilliumsoftware.com/
Zoomix
http://www.zoomix.com/

Feb 10, 2010

Master Data

As mentioned in my previous post, master data represents the nouns of any enterprise. For Orange, the master data includes information about


The entities most widely treated as master data across industries are:

  • Products
  • Customers
  • Locations or Sites or Stores
  • Suppliers
  • Employees
  • Accounts
  • Rules or Policies

Nov 30, 2009

Data Classification

Let's take a look at what 'data' comprises for any enterprise. Consider a hypothetical company, "Orange", with 30,000 stores worldwide and 40,000 employees, operating in 10 different countries. It manufactures and sells around 15,000 varieties of products via direct and indirect sales channels, has around 10,000 direct customers, and procures raw and finished products from 1,500 suppliers across the globe. These staggering numbers of employees, locations, products, customers, suppliers, and so on represent facts or information about real-world objects. For this organization, 'data' is the representation of these real-world objects. Data can represent virtual objects as well. Let's go through an illustration of a very common online shopping experience, which involves data representing objects from both worlds.

1.  End customer logs in to the online shopping portal.
2.  Selects and configures the right product.
3.  Initiates the checkout process:

  • Provides shipping address.
  • Provides payment details.
  • Confirms the order.
4.  Customer receives an email confirmation with order tracking number and estimated delivery date.

The entire process involves data flowing from one point to another and ends with shipment of the product. Product, customer, location, etc. represent real-world objects, whereas information about orders, invoices, and receipts, which is shared between two points in electronic format, is transactional in nature and represents virtual objects. A pictorial representation of the data captured during online shopping is below:



In essence, data for any enterprise can be classified into the following major categories:

Non-Transactional Data (Master Data):

Non-transactional data represents the critical nouns of any business and generally provides information about persons, locations, things, and concepts. It is a very common classification of data that exists across all industries: retail, manufacturing, engineering, oil, pharmaceutical, etc.

Transactional Data:

Transactional data relates to sales, invoices, receipts, claims, and other monetary and non-monetary interactions.

Data About Data (Metadata):

This is data about other data. It contains details such as data type (character, number, date, etc.) and comments about the data, and is generally maintained in some kind of repository.

Unstructured Data:

This is data floating around or maintained in various formats such as email, websites, white papers, and legal and marketing collateral.

Analytical Data:

As the word 'analytical' suggests, it assists a company's decision making. Analytical data comprises various forms of data assembled into one giant data warehouse or multiple data marts to help the enterprise make wise and intelligent decisions, which might increase sales, reduce costs, or identify latency in the supply chain.
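To make the split between master and transactional data a little more concrete, here is a minimal sketch based on the online shopping example. The class names, fields, and sample values are assumptions made for illustration and do not come from any real system.

```python
from dataclasses import dataclass
from datetime import date


# Master data: long-lived "nouns" of the business (illustrative fields).
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Product:
    product_id: str
    description: str


# Transactional data: an event that refers to master data by ID.
@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str
    product_id: str
    quantity: int
    order_date: date


# Made-up sample data.
customers = {"C1": Customer("C1", "Jane Doe")}
products = {"P1": Product("P1", "Laptop")}
orders = [
    Order("O1", "C1", "P1", 1, date(2009, 11, 30)),
    Order("O2", "C1", "P1", 2, date(2009, 12, 1)),
]


def unresolved_references(orders, customers, products):
    """Return IDs of orders whose customer or product is missing from master data."""
    return [
        o.order_id
        for o in orders
        if o.customer_id not in customers or o.product_id not in products
    ]
```

The point of the sketch is that many transactions reuse the same few master records, so a wrong or missing master record affects every transaction that refers to it.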

Nov 21, 2009

Data Servicing: A Milestone Task

I landed on an MDM assignment a few years back and within six months realized the importance and enormous complexity of an MDM initiative. An MDM initiative is not just about deploying and configuring third-party software, or even writing your own (if you have the patience, time, and money to do so). It also involves adherence to standards, practices, and principles, and it requires transforming people's mindset and changing how they perceive 'data'.

In today's world, enterprises rely on a combination of homegrown software, written and matured over time, and specialized third-party software that performs specific functions supporting various business areas such as sales and marketing, finance, operations, HR, supply chain, customer relations, and service. All these complex systems are interlaced and rely on various kinds of data, which flows from one application to another in various formats and provides the desired optimization to the business.

I still remember buying my first car a few years back. Along with it came a booklet listing the various 'service' options available and the schedule for them. Each time my washed car came out of the service station, it felt like it had a new life. Buying a new car was a great decision, and timely maintenance is imperative for a smooth run and longevity. Any IT project involves development and configuration of an application plus various data conversion and enrichment activities, and once the application and data are married they deliver the desired output for the business. Similar to the car, an MDM initiative is a great decision, and timely servicing of both the application (software, hardware) and the data is imperative. To be successful with an MDM initiative, a task must be created for 'Data Servicing' to undertake the following:


Engagement from both IT and business is critical to the success of this task, and it should be a recurring task in the MDM project plan.