Forbes: 3 Major Mistakes Companies Make With Big Data And How To Fix Them
Investing in big data is all the rage right now. But as with many topics that become buzzworthy, its core value often gets lost in the hype. In customer conversations it has become increasingly clear that many companies are investing in the wrong places when it comes to data science, predictive analytics, or big data. They think they're taking shortcuts to execute quickly, but they are generally left with systems that fall far short of their business needs. Here are the three major mistakes I see in choosing big data solutions and how to overcome them:
Mistake #1: Thinking the Business Problem Will Be Solved on the First Try
I see this all the time: companies want a recommendation algorithm or some sort of machine learning capability. When broadly defined, their business problem seems straightforward enough, so they believe they should be able to simply buy a black-box solution or hire a person or firm to “build a model” and solve the problem.
As companies like Netflix show, customer data is a continuously changing environment. Netflix employs 300 people to maintain and improve its content recommendations, and it spends $150 million every year on recommending movies and TV shows to its members, because machine learning models have to be revisited and refined again and again to keep up.
Symptoms: Deliverables are delayed, everything requires manual effort, or systems don’t seem to return the business value promised.
Solution: No matter how simple the problem may seem at first, exceptions, edge cases, data hygiene challenges and integration issues will inevitably slow the execution and the results. The first application should be seen as an initial attempt to solve a complex problem, which will undoubtedly require constant iteration to be successful. Don’t think there’s a silver bullet or that you’ll be able to get there without a lot of effort.
Mistake #2: Leaving Your Business Problem to Be Solved by Data Scientists
There is no doubt that having experts in statistics, machine learning and other areas of data science is incredibly important to the success of an initiative. However, left to their own devices, data scientists will rarely be able to achieve the business results an organization needs. Data scientists typically focus on building new models and solving intricate equations, so the business problem, however obvious, often ends up not being a priority. Data scientists are only one part of the complex, cross-functional team required to create business value.
Symptoms: The deliverables never seem to precisely address the business challenges, processes require more work to maintain and execute, or bugs seem to constantly present themselves.
Solution: You will need, at minimum, four different roles on the team to be successful. We have found these essential positions to be:
Data Scientist: builds and iterates models in a particular language (such as R/Stata/SAS/Matlab/Python/C).
Business Analyst: provides a basic understanding of statistics and a robust understanding of the business problem.
Developer: operationalizes the process (i.e., takes action on the data science rather than relying on a manual process). Someone will need to normalize data, integrate systems and put the models into action.
Quality Assurance: provides troubleshooting along the way. With all the things that can go wrong, robust end-to-end quality assurance is absolutely key.
Mistake #3: Focusing on the Wrong Part of the Value Chain
A typical process has at minimum three key components:
Data Ingestion and Normalization: loading the system with data from various sources.
Modeling/Analysis: creating and refining various models to answer business questions.
Campaign Execution: putting the learnings from the models into action.
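For readers who think in code, the three components above can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the sample records, the field names, the "clicks per open" score and the 0.25 threshold are all hypothetical and do not come from any particular product or dataset.

```python
# Hypothetical records from two sources with inconsistent field names.
CRM_ROWS = [{"Email": " Ann@Example.com ", "opens": 8, "clicks": 4}]
ESP_ROWS = [{"email": "bob@example.com", "opens": 10, "clicks": 1}]


def ingest_and_normalize(*sources):
    """Stage 1: load rows from every source and map them onto one schema."""
    rows = []
    for source in sources:
        for record in source:
            email = (record.get("email") or record.get("Email")).strip().lower()
            rows.append(
                {"email": email, "opens": record["opens"], "clicks": record["clicks"]}
            )
    return rows


def model(rows):
    """Stage 2: a placeholder model that scores each customer by clicks per open."""
    return {
        row["email"]: row["clicks"] / row["opens"] if row["opens"] else 0.0
        for row in rows
    }


def execute_campaign(scores, threshold=0.25):
    """Stage 3: act on the scores by selecting customers at or above the threshold."""
    return sorted(email for email, score in scores.items() if score >= threshold)


rows = ingest_and_normalize(CRM_ROWS, ESP_ROWS)
scores = model(rows)
targets = execute_campaign(scores)
print(targets)
```

Even in a toy like this, the first and last functions are plumbing that looks much the same from one business to the next, while the middle function is where business-specific judgment goes.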
By all means, organizations large and small should be investing in better understanding their data, but companies routinely spend inordinate amounts of time trying to correct the parts that are unable to actually provide worth. Instead of moving data around and building only a basic model, companies should be focusing exclusively on the one step of the chain that will create business value: the modeling. The Modeling/Analysis phase contains real and unique business value that is worth investing in to build internal skills.
To achieve a successful modeling phase, businesses must first ask what key goal they are looking to improve, such as click-through rates, revenue attributed to email, or customer registration for a loyalty program. Then they should consider which aspect of their marketing they could adjust to add value, such as sending at a specific time of day, using unique messaging, featuring certain images, or leading with an attention-grabbing phrase. This sifting and analyzing will point to the model that most effectively turns customer data into ongoing value and helps reach key goals.
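As a rough sketch of that sifting, the snippet below picks one goal (click-through rate) and compares two adjustable variables by how much the goal moves across their values. The send log, the variable names and the "largest spread" rule are assumptions made up for illustration; a real analysis would need far more data and a proper controlled test before acting on the result.

```python
from collections import defaultdict

# Hypothetical send log: one row per email, with two adjustable variables.
SENDS = [
    {"send_hour": 9, "subject_style": "question", "clicked": True},
    {"send_hour": 9, "subject_style": "statement", "clicked": False},
    {"send_hour": 9, "subject_style": "question", "clicked": False},
    {"send_hour": 18, "subject_style": "statement", "clicked": True},
    {"send_hour": 18, "subject_style": "question", "clicked": True},
    {"send_hour": 18, "subject_style": "statement", "clicked": True},
]


def click_through_rates(sends, variable):
    """Return the click-through rate for each value of the given variable."""
    sent = defaultdict(int)
    clicked = defaultdict(int)
    for row in sends:
        sent[row[variable]] += 1
        clicked[row[variable]] += row["clicked"]  # True counts as 1
    return {value: clicked[value] / sent[value] for value in sent}


def spread(rates):
    """Gap between the best and worst value: a crude measure of leverage."""
    return max(rates.values()) - min(rates.values())


by_hour = click_through_rates(SENDS, "send_hour")
by_style = click_through_rates(SENDS, "subject_style")

# The variable whose values differ most on the goal is the first one to model.
most_promising = max(
    ("send_hour", "subject_style"),
    key=lambda variable: spread(click_through_rates(SENDS, variable)),
)
print(most_promising)
```

In this made-up log, send time separates the results and subject style does not, so send time would be the first variable worth modeling.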
Symptoms: Too much time and money is spent on integration or campaign execution to the detriment of the models. Or, not enough time is spent on those components and the project is continually plagued with data hygiene or execution issues.
Solution: Leverage tools that can accelerate your time to value by quickly and reliably building a foundation for execution across the entire process, allowing your team to focus on the areas that bring results.
As more companies venture to make sense of their big data, they’ll gain value as long as they keep their business goals in sight, accept that machine learning is a continual process with no shortcuts, and focus their modeling and analysis on the most valuable aspect of predictive analytics.
Erik Severinghaus is the founder & CEO of SimpleRelevance, a Chicago-based company focused on digital marketing personalization.