Five ways to innovate with data

Data innovation isn't just an opportunity for data tech or data product companies. You don't have to "do" data to be data-driven. And believe me, it is the data-savvy companies that have the competitive advantage over their peers in every sector, niche and at every stage of maturity.

A lot of what I write about in this blog is lessons I've learned the hard way through blood, sweat and lots and lots of tears as an entrepreneur. But data innovation is what I know and love and have been doing throughout my career, including in three of my own companies and in pretty much every job I've had since graduation. So, here, from my happy place, are my tips, potholes and bullsh*t alerts on how any business can innovate with data. To drive new products and services, and perhaps more importantly to drive competitiveness, efficiency and customer-centricity.

1. Understand what you really have

Even if the only data in your business lurks in a bulging filing cabinet and some incomprehensible Microsoft Access files, you still have data. Which is an opportunity. And also means the rules around data protection apply to you just as much as anyone else.

Chances are you have significantly more usable data than you realise - certainly from your internal and customer-facing digital interactions. It doesn't take much prep to start capturing and systematically utilising that data.

Data is nothing without action, so first you need to list out all your data sources - however incomplete - to understand what you have and what you need to do to unleash that potential.

Where to look and what to look for?

All the places where money touches the business (so not just finance, but if you have a website, the places where transactions and transactional records occur, also the tools, suppliers and technologies you are spending money on).
All the places where customers or potential customers touch the business (from sales lists to inbound enquiry forms, to your website, to your CRM system).
Shared repositories of know-how, IP, your tools, processes, rules and methods of work are also important and valuable data.
Last but not least - most overlooked and most likely to get you in legal hot water - the records you keep on current and former employees. For a lot of businesses, this is their largest source of highly personally identifiable information - and it is the data most likely to be left in unlocked filing cabinets and shared spreadsheets. This should be a major focus of your GDPR or data protection efforts.

Once you know what you have, the key to unleashing the power of that data is in finding commonalities between data sets that allow you to enrich, forecast or triangulate. So tying product sales to product returns to promotional offers. Linking payment types to costs to sources of customer enquiries. Marketing activity to sales or customer enquiries.
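To make the "tie sales to returns to promotions" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the field names (order_id, promo, amount), the SPRING10 promotion code and the four made-up sales records are not from any real system. The only point is that a shared key, here an order ID, lets two separate data sets answer a question neither can answer alone.

```python
from collections import defaultdict

# Hypothetical records; in practice these would come from your own systems.
sales = [
    {"order_id": 1, "product": "dress", "promo": "SPRING10", "amount": 60.0},
    {"order_id": 2, "product": "dress", "promo": None, "amount": 60.0},
    {"order_id": 3, "product": "scarf", "promo": "SPRING10", "amount": 20.0},
    {"order_id": 4, "product": "dress", "promo": "SPRING10", "amount": 60.0},
]
returns = [{"order_id": 1}, {"order_id": 4}]


def return_rate_by_promo(sales, returns):
    """Join sales to returns on order_id, then summarise by promotion."""
    returned_ids = {r["order_id"] for r in returns}
    totals = defaultdict(lambda: {"orders": 0, "returned": 0})
    for sale in sales:
        key = sale["promo"] or "no promo"
        totals[key]["orders"] += 1
        if sale["order_id"] in returned_ids:
            totals[key]["returned"] += 1
    # Share of orders under each promotion that came back.
    return {k: v["returned"] / v["orders"] for k, v in totals.items()}


rates = return_rate_by_promo(sales, returns)
```

In this toy data, two of the three promoted orders were returned and the unpromoted one was kept. That is exactly the kind of pattern worth checking against your real numbers before drawing any conclusion.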

You do not necessarily need to know how to do all this yourself - but knowing that it can be done, and sourcing the right tools and right people to help you achieve this is powerful and potentially transformational. (But please see Bullsh*t Bingo below, as you are now at risk of hype-driven exploitation!)

Also know that you must consider permissions, privacy and data protection. It is one thing to join non-personal data together, but if you are "back-filling" one set of personal information by joining it to another set, ensure you still have explicit consent to do so and that your customer clearly understands how and why you use their data. Ensure you comply with data protection regulations for your country and the countries where your customers are based. Sure there are loopholes, sure companies get away with very naughty things indeed. But you know - people, please. Don't be evil!!

So, without being evil, and with the data sources you have or can access, what amazing problems could you solve? Data drives huge decisions and opens up incredible opportunities. If you can imagine the question or define the problem, chances are that data used wisely will enable a solution.

2. Clean, classify, cluster, segment

So, where do you even begin? Before rushing off to hire rare-as-hen's-teeth data scientists, it is worth understanding what you'll likely be needing to undertake as a team. Understand this and then whether you directly hire, contract through data science marketplaces like Pivigo, or engage with students via projects like Data Lab, you'll need to broadly know how to scope out the skills you need.

It's all about the plumbing. Data uploading, ingestion, cleaning, validation and standardisation to the point it is workable is the least sexy and most critical step. Even data artists need their raw materials to be in shape first.

Classification of data is another essential task - machines, like people, need training. In my last company, we were able to very accurately predict if a specific customer would return a specific product they had bought. But in order to do that, within the data we had to be able to first classify not just products (to tell them apart from people and other fields), we also needed to build classification schemas for things like product types, colour and size. This sounds very basic but it's actually very complex.

Take, as an example, a size 12 ruby shimmer dress. Age 12, UK 12, US 12 (UK 16)? Ruby - colour or stone or girl's name? What colour - red, purple, other? Shimmer? Is that a colour, finish, material or new field? Does it make the original colour need to be reclassified as metallic or other? A fashion-minded human can take an informed guess because they have been trained over their lifetime to develop a mental classification system, however idiosyncratic or weird. Programmatic solutions (machine learning, algorithms etc) also have to be taught how to classify before higher analysis can begin.
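The ambiguity in that dress description can be sketched in a few lines of Python. This is not how my last company did it. It is a deliberately naive, hand-written lookup, and the VOCAB table and field names (size_uk, finish and so on) are hypothetical. What it shows is the first job of any classification schema: spotting which words could belong to more than one field, so that a rule, a trained model or a human has to decide.

```python
# Hypothetical, hand-written vocabulary: each token maps to every field
# it could plausibly belong to. A real schema would be far larger.
VOCAB = {
    "ruby": {"colour", "material", "name"},
    "shimmer": {"colour", "finish", "material"},
    "dress": {"product_type"},
    "12": {"size_uk", "size_us", "age"},
}


def candidate_fields(description):
    """Return, for each known token, the fields it might describe."""
    tokens = description.lower().split()
    return {t: sorted(VOCAB[t]) for t in tokens if t in VOCAB}


def ambiguous_tokens(description):
    """Tokens with more than one possible field need a rule or a human."""
    candidates = candidate_fields(description)
    return [t for t, fields in candidates.items() if len(fields) > 1]


result = ambiguous_tokens("Size 12 ruby shimmer dress")
```

Only "dress" is unambiguous here. Everything else in a five-word product title needs a decision, which is why this step sounds basic and isn't.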

Segmentation for personalisation. One of the highest value actions you, or a data analyst or data scientist working with you, can ever take is to help you find meaningful clusters and segments amongst your prospects, users, staff, customers or other business behaviours/groups. The point of data wisely used is that it allows you to stop blindly treating everyone and everything exactly the same, or like some fictitious average. Massive steps in responsiveness and efficiency can be achieved when you break things down into discretely different, but internally similar groups.

For example, without good data, I may have been watering all living things in my home - plants, dogs, husband, kids, cat - with the same average amount of water every day. Imagine the improvement in the existence of all when I start grouping plants together and the big and little mammals together and adjust their watering regimes accordingly! Non-evil segmentation, for sure.

There are a lot of clustering techniques and algorithms, but rarely a quick, easy and right first-time approach (see Bullsh*t Bingo below). Clustering and segmentation are iterative and evolve - the good news though is that this is one of the most powerful steps a business can take. Beware though, this has to be a data-driven process. Don't just segment customers based on hunches or invalid metrics and then try to torture your data until it surrenders and complies with your theory. Be ready to accept that the data may tell you everything you thought you knew was wrong. For example, a customer can turn from high value to low value once you look at groups of customers based on what they keep, rather than what they buy.
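Here is a small Python sketch of that keep-versus-buy effect. To be clear about what it is: a fixed cut-off applied to made-up numbers, not a clustering algorithm, and the customer names, amounts and the threshold of 200 are all assumptions chosen for illustration. Real segments should come out of your data, not out of a number someone picked.

```python
# Hypothetical totals per customer over some period.
customers = [
    {"customer": "A", "bought": 500.0, "returned": 450.0},
    {"customer": "B", "bought": 200.0, "returned": 0.0},
    {"customer": "C", "bought": 300.0, "returned": 30.0},
]


def segment(customers, metric, threshold):
    """Label each customer 'high' or 'low' on the chosen metric.

    metric is either 'bought' (gross spend) or 'kept' (spend minus returns).
    """
    labels = {}
    for c in customers:
        if metric == "bought":
            value = c["bought"]
        else:
            value = c["bought"] - c["returned"]
        labels[c["customer"]] = "high" if value >= threshold else "low"
    return labels


by_bought = segment(customers, "bought", threshold=200.0)
by_kept = segment(customers, "kept", threshold=200.0)
```

Measured on what they buy, customer A is the best of the three. Measured on what they keep, A drops to low value while B and C stay high. Same data, different question, opposite answer.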

3. Make data available to the whole business

If data is (cliche alert) the new oil, the business energy it generates is kinetic. Physics 101, kinetic energy is the one associated with movement, acceleration, action - you know, actually doing stuff.

Rule one of data. It is very comforting to have data, but it's completely worthless til you actually do something meaningful with it. Or your team and wider business stakeholders do something with it. That means you need to do two things - make it available in appropriately usable forms and explain why you care. And keep explaining and clarifying the why for a really long time.

If you don't explain to your whole business what is commercially important (to you, your clients, your investors, your market - the competition) then in the unlikely event you get past the terminally pointless pie chart stage, your analysis will never progress past interesting to become important. Data becomes a comfort blanket or a performing pet, not an enabler of action.

So depending on your technical sophistication, use tools, KPIs, presentation layers, shared datasets and collaborative insights, even simple APIs and feeds to make data available for everyone. Explain and document the facts of the data (i.e. time periods, field definitions, permissions, the basics needed to know what this stuff is) but don't be overly prescriptive. Instead, keep reinforcing why this matters to the business.

And why does it matter again? Sorry to deflate your ego, but almost certainly the most compelling, transformational, brilliant innovations in your business won't come solely out of your personal head. They will come from a member of your team tinkering with an idea around a problem they are very familiar with or care passionately about. This is why getting both data and commercial context into everyone's hands is essential, along with allowing people the space to find their own purpose and meaning. The HR admin is as likely to be the source of innovation as a product manager. As a leader, your job is to ensure that individual's fragile bubble of potential brilliance rises high and survives long enough for others to witness it and be moved to think or act.

4. Look for leading indicators

Data is flowing, people know why and, jubilation all round, they're even using it. Next pothole alert... There is a risk that your walls, intranets and shared folders may start being wallpapered in pretty but pointless retrospective analysis. You'll have enough cute pie charts to give you data diabetes.

This entire data innovation process is ultimately to enable you to look forward with improved certainty not backward with better charts. Happily, this is where very good analysis and decent data science can help - if focused on outcomes that are predictive or leading indicators of meaningful future behaviours. For example, Google is several weeks quicker and significantly more accurate at predicting the unemployment rate than the government statistics offices tasked with the job. Why? Google has access to search data - it can see leading indicators that people are going to become unemployed - searching about how to claim, redundancy, job hunting - whereas the government doesn't see that individual until maybe 8 weeks later when they sign on as unemployed.

Your business, industry, customer data will have leading indicators - this is a rich seam to mine for innovation and commercial opportunity.
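One simple way to go hunting for a leading indicator is to check whether one series predicts another a few periods later. The Python sketch below does this with a plain Pearson correlation at different lags. The weekly "searches" and "claims" figures are invented so that claims follow searches by exactly two weeks. They are not real Google or government data, and the function names are my own. Treat it as a starting point for exploration, not proof of cause and effect.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(indicator, outcome, lag):
    """Correlate the indicator at week t with the outcome at week t + lag."""
    return pearson(indicator[: len(indicator) - lag], outcome[lag:])


def best_lag(indicator, outcome, max_lag):
    """Lag (0..max_lag) at which the indicator tracks the outcome most closely."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(indicator, outcome, lag),
    )


# Made-up weekly counts: claims are built to echo searches two weeks later.
searches = [10, 12, 30, 14, 11, 25, 13, 10, 28, 12]
claims = [20, 21, 20, 24, 60, 28, 22, 50, 26, 20]

lead_weeks = best_lag(searches, claims, max_lag=3)
```

With real data the signal will be far messier than this, and a strong lagged correlation is only a prompt to test a next best action, not a guarantee it will work.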

When you find your leading indicators, next best actions are where to focus. Yes, we're edging towards machine learning and AI territory here but this can be also done manually, by, you know - doing stuff. Really sweat (as in simulate, model, test, or just plain take a punt on and see what happens) the next best actions. The biggest impact you will ever get is when you go from doing nothing to doing something. So try that first. After this first step, your improvements are incremental.

5. Bullsh*t Bingo

While it shouldn't put you off, there is so much hype and hysteria around data that it is definitely rich territory for all the usual charlatans, snake oil salesmen, plus a lot of starry-eyed wannabes who think having watched a YouTube video makes them a data scientist worth £80k. Here are a few of my current favourite red flags. I'll try and remember to update this list, so feel free to send me suggestions as you hear them:

K-means, neural nets and random forests. This is like showing up to a chef's interview saying I do frying, garlic and refrigeration. These are tools, ingredients and methods, not solutions. Each has its place and while I am glad you know how to use Wikipedia, now let's talk real-world solutions and applications... Seriously, there are lots of inexperienced wannabes who know enough jargon to get through the door, then plan to learn the rest (if required) on your time and money. In a startup or when building a data function for the first time, you simply cannot afford this. Push, push, push for outcomes and references for them as well.

Trust me I'm a data scientist. Er no. Sorry. But no. Not til you develop commercial curiosity and working pace, a commitment to basic data plumbing and an obsession with the words "why?" and "what if?" Push for show and tells around outcomes and avoid ego/interest driven work that displaces the boring but important. Also, ensure that the basics of data hygiene and quality are not ignored - most data is rubbish and needs work before being usable. Some data scientists - particularly the inexperienced - make dangerous assumptions that the data is fit for purpose and so skip past the cleaning and validation steps in a rush to get to the fun bit. Many years ago I told my then employer HP that "I can keep digging deeper into this pile of (data) crap if you wish, but it will remain crap and I will just smell worse". My point still stands.

We're a big data/Artificial Intelligence business. There is nothing wrong with being a small data and human intelligence business, unless possibly when you're trying to raise money. If your data comes in files you can readily open/store on your laptop, you're not a big data business - not yet. That doesn't matter - to start you need to take meaningful action, not to wait forever for better data. And most AI/machine learning uses of data are currently limited, specific and frankly still mostly the territory of the advanced class. And while you may be in the advanced class very soon, it is not the best place to start. My point is, you don't have to big up your data commitment - do important things with what you have and take it from there.

We have an algorithm, we're rich. Sorry, no. An algorithm is simply a specification of how to solve a class of computational problems. Often these are mathematical Lego kits, enabled by statistical software. You can't patent maths. You can only use it - hopefully in non-evil ways. It all comes back to action and business processes. And the good news, once you do have real invention or process innovation, this is potentially patentable (which, unfortunately, is still not the same as rich).

GDPR is the most terrifying threat your business has faced.... ever. One for the Brits. And anyone old enough to remember the Millennium Bug. Before you rush into hiring one of the many GDPR experts probably contacting you daily right now, read the Information Commissioner's overview on how GDPR overlaps with and differs from the existing UK Data Protection Act around consent and controlling, processing and deleting personal data. Be aware but don't panic.

So despite these red flags, every business has the potential to innovate with data. You just need to actually do something other than cling to it like a comfort blanket.

And if you need a little help with identifying that "something", why not consider one of my in-house data innovation workshops? Guaranteed to bust jargon, open minds, get hands-on and help your team on the path to unleashing your data potential!
