Data is currently the hot topic in IT, and there is no doubt that a good data strategy can supercharge an organisation’s ability to understand its customers and gain insight into its operations. As with everything, though, preparation matters: investing in making sure your organisation is ready to take on this opportunity can be the difference between a successful implementation of that data strategy and the creation of just another very costly data store. The following are data hygiene 101 practices that all organisations should consider, either before or in tandem with implementing a coherent data strategy. They are also a good reminder for organisations that have already taken the first steps of implementing a data strategy to routinely keep their house in order. I’ll call it the Data Strategy Spring Cleaning List.
What do you mean by that?
It might be a no-brainer for most, but understanding what you are talking about in an organisation is fundamental to implementing a good data strategy. Too often, organisations have several names for the same thing, or one name for several things: one department calls their customer a “Client”, another calls their customer a “User”, and yet another calls their employee a “User”. Having a consistent definition across the organisation is crucial for understanding and categorising data. I remember facilitating a 2-hour workshop about “Product” only to realise that we had two departments in absolute agreement but talking about two totally different things when they said the word “Product”. Establishing a common vocabulary across the organisation, and then understanding and rationalising the differences from department to department, is a crucial step on the road to a cohesive data strategy. Leveraging common industry data models is a good first step; building on these to capture the nuances of your particular organisation can then speed this process up dramatically.
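As a minimal sketch of what a common vocabulary can look like once it is written down, the snippet below maps each department’s local term to one agreed term. The department names, the terms and the `canonical_term` helper are all hypothetical, chosen only to mirror the “Client” and “User” example above.

```python
# Hypothetical glossary: (department, local term) -> agreed organisation-wide term.
# The entries are illustrative assumptions, not a recommended model.
GLOSSARY = {
    ("sales", "Client"): "Customer",
    ("support", "User"): "Customer",
    ("hr", "User"): "Employee",
}


def canonical_term(department: str, term: str) -> str:
    """Return the agreed term for a department's local term.

    Raises KeyError when no definition has been agreed yet, which is
    itself a useful signal that a conversation is still needed.
    """
    key = (department.lower(), term)
    if key not in GLOSSARY:
        raise KeyError(f"No agreed definition for {term!r} in {department!r}")
    return GLOSSARY[key]
```

Keying on the department as well as the term is the point of the exercise: the same word can legitimately resolve to two different things depending on who is speaking.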
Get your facts straight
Along the same lines as definitions, an organisation must be able to articulate the facts that underpin its operations in order to implement a data strategy successfully. Knowing how to calculate the profit for a department, and whether that is gross profit or net profit, is important to ensure data is treated consistently when it is turned into information. Understanding not only the Key Performance Indicators (KPIs) but also how to calculate them ensures there isn’t a proliferation of spreadsheets across the organisation using slightly different formulas. It is also an important step in building consensus within the organisation as a whole. It is hard enough for an external implementor to build the complex environment needed without also trying to garner agreement on the facts that underpin the very business itself. Implementors can provide guidance in this area, but at the end of the day it is the business that is the expert in its own domain. As with definitions, it is helpful to have a register of how calculations are performed in your organisation and to ensure it is applied uniformly across all departments.
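One way to picture a register of calculations is as a single place where each KPI has a name, a plain-language description and exactly one formula. The sketch below is an assumption-laden illustration: the `Kpi` class, the `KPI_REGISTER` name and the deliberately simplified profit definitions are mine, and the real definitions have to come from the business.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Kpi:
    """One agreed calculation: a name, a description and a single formula."""
    name: str
    description: str
    formula: Callable[[Mapping[str, float]], float]


# Hypothetical register. The formulas are simplified for illustration only.
KPI_REGISTER = {
    "gross_profit": Kpi(
        name="Gross profit",
        description="Revenue less cost of goods sold",
        formula=lambda f: f["revenue"] - f["cogs"],
    ),
    "net_profit": Kpi(
        name="Net profit",
        description="Gross profit less operating expenses and tax",
        formula=lambda f: f["revenue"] - f["cogs"] - f["operating_expenses"] - f["tax"],
    ),
}


def calculate(kpi_key: str, figures: Mapping[str, float]) -> float:
    """Apply the registered formula so every department gets the same answer."""
    return KPI_REGISTER[kpi_key].formula(figures)
```

With the same made-up figures, the two definitions of “profit” give visibly different answers, which is exactly the ambiguity the register is meant to remove.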
Where’s that smell coming from?
The next most important aspect of your spring-cleaning activities is understanding where all of your data lives, which systems use it and what they do with it. In many cases there may be several systems using the same piece of data: some of it will be moved manually between systems, some will be integrated, and some will look the same but be managed completely separately in each system. Having a register of systems and the data they rely on will help you understand the complexity of the environment. Knowing which systems modify data and which simply rely on it will show your implementation which systems to focus on. Uncovering unknown pockets of data may help you rationalise these systems before attempting to pull the data together within complex structures, or at least allow you to plan a final design that takes the complexity of the environment into account. Performing this task may also provide your executive with the impetus to implement a data strategy if you find your customer data is sitting under a desk somewhere on Level 6.
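A register of systems does not need to be sophisticated to be useful. The following sketch assumes a simple in-memory list; the system names, the entity names and the `writers_of` and `readers_of` helpers are hypothetical and only illustrate the distinction between systems that modify data and systems that merely rely on it.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SystemEntry:
    """One row in the register: a system and the data it touches."""
    name: str
    writes: Set[str] = field(default_factory=set)  # data the system modifies
    reads: Set[str] = field(default_factory=set)   # data the system only relies on


# Hypothetical register contents, for illustration only.
REGISTER = [
    SystemEntry("CRM", writes={"Customer"}, reads={"Product"}),
    SystemEntry("Billing", writes={"Invoice"}, reads={"Customer"}),
    SystemEntry("Marketing spreadsheet", writes={"Customer"}),
]


def writers_of(entity: str) -> List[str]:
    """Systems that modify the given data, sorted by name."""
    return sorted(s.name for s in REGISTER if entity in s.writes)


def readers_of(entity: str) -> List[str]:
    """Systems that rely on the given data without modifying it."""
    return sorted(
        s.name for s in REGISTER if entity in s.reads and entity not in s.writes
    )
```

In this made-up register, asking for the writers of “Customer” returns two systems, one of them a spreadsheet, which is the kind of unknown pocket of data this exercise is meant to surface.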
Has anyone seen my keys?
After you have found out where your data is, you need to understand how you identify data within each system (Primary and Surrogate Keys) and how you identify it outside of systems (Natural Keys), because this determines how you will bring your data together. The aim is to be able to uniquely identify each of the things you defined above. Does a supplier have the same identifier in the payments system as it does in the contracts system? Is your customer identified differently when they interact with you in store or online? This information will tell you how difficult it will be to bring disparate data sources together, and whether you can rely on system-to-system matching or will need more complex matching. It will also give you the opportunity to see whether you can enhance your source systems to allow for a more consistent strategy moving forward.
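To make the supplier question concrete, here is a rough sketch of matching records between two systems on a natural key when their system keys differ. Everything in it is assumed for illustration: the `normalise` and `match_on_natural_key` helpers, the idea that formatting differences are the only obstacle, and the assumption that each natural key appears at most once per system.

```python
from typing import Dict, List, Tuple


def normalise(natural_key: str) -> str:
    """Strip spaces and punctuation so formatting differences do not block a match."""
    return "".join(ch for ch in natural_key if ch.isalnum()).upper()


def match_on_natural_key(
    left: Dict[str, str], right: Dict[str, str]
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Match two systems' records on a shared natural key.

    `left` and `right` map each system's own key to the natural key it holds.
    Assumes a natural key appears at most once per system.
    Returns (matches, unmatched left keys, unmatched right keys).
    """
    right_index = {normalise(nk): key for key, nk in right.items()}
    matches: Dict[str, str] = {}
    unmatched_left: List[str] = []
    for key, natural_key in left.items():
        other = right_index.get(normalise(natural_key))
        if other is None:
            unmatched_left.append(key)
        else:
            matches[key] = other
    unmatched_right = sorted(set(right) - set(matches.values()))
    return matches, unmatched_left, unmatched_right
```

The size of the two unmatched lists is a quick, if crude, indicator of whether simple system-to-system matching will be enough or whether more complex matching is on the cards.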
What’s down the back of the couch?
The final spring-cleaning activity takes a little more effort but is definitely worth it if you happen to uncover some nasties. Profiling your data so that you know exactly what is being added to it will save you from inadvertently exposing data you shouldn’t. We’ve all put a hand down the back of the couch only to find the pen we lost three months ago or the peas our youngest just didn’t want to eat. The main culprits in your data are generally free-text fields with little to no validation. I’ve had the experience of finding credit card and passport details in notes fields, passwords in middle name fields, and disparaging comments about customers that, once data had been stitched together, formed part of corporate communications to the customer. Investing in understanding where these pieces of data may pose a risk to your organisation, and having suitable strategies to mitigate that risk, will ensure you don’t become tomorrow’s front-page story for all the wrong reasons.
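As one small example of what profiling a free-text field can involve, the sketch below scans a notes field for digit runs that look like payment card numbers and keeps only those that pass the Luhn checksum, which cuts down on false alarms from order or reference numbers. The function names, the `notes` field and the row layout are assumptions for illustration; a real profiling exercise would cover many more patterns (passport numbers, passwords, abusive language) and would usually rely on dedicated tooling.

```python
import re
from typing import Iterable, List, Mapping, Optional

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def passes_luhn(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right and sum."""
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_possible_card_numbers(text: str) -> List[str]:
    """Return digit strings in the text that look like card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if passes_luhn(digits):
            hits.append(digits)
    return hits


def profile_free_text(
    rows: Iterable[Mapping[str, Optional[str]]], field_name: str
) -> List[int]:
    """Return the positions of rows whose field may contain a card number."""
    return [
        index
        for index, row in enumerate(rows)
        if find_possible_card_numbers(row.get(field_name) or "")
    ]
```

A hit from a check like this is only a prompt for a human to look closer, not proof that a card number is present, and a clean result does not mean the field is safe.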
Data is a powerful thing in any organisation, and with a little bit of spring cleaning you can give your data strategy the best chance of success.