Selecting Your First Big Data Project – Compelling Drivers
Note: We are currently posting the top 5 posts of 2012, this week through Jan. 1. This post is #4 on the list and was originally published Mar. 20, 2012.
In this blog series, we will cover how to select, staff and plan your first big data project. Our recommendations are based on many years of experience that we have had working with a wide variety of customers in several industries. We won’t focus on specific technologies in this series. Instead, we will examine the organizational dynamics and lessons learned from how these projects go in real life, inside existing, often very busy IT infrastructures.
Every customer is different, so please take these as general guidelines rather than hard and fast rules. Please note that this is not focused on normal project management. We assume that you have adequate project management discipline in place, and we’re simply going to look at the dynamics that can be applied to your big data project.
Know What Your Compelling Drivers Are
The first, most obvious question is “Why do this at all?” There should be a compelling use case, a competitive driver, cost driver, or some other issue that has been identified where the application of big data technologies is in the critical path to solving the problem. Typical drivers include the information type (for example, under-utilized structured information sources), or the volume of information (retention of IP logs), but in any case, you need to identify exactly why you are pursuing this path.
One of the most important things you should look for is a compelling ROI (return on investment). That is to say, find something for which you can put a value on the cost of the problem before you plan a solution.
When calculating the problem’s cost, remember to add the cost of the effort you will put into solving it, both from a technology and labor point of view. You can then compare that to the “after” state to determine the overall value of the project. Now, it is important to understand that the first phase of a project may not, in and of itself, be ROI positive. But what is important is for you to have a line of sight through the completion of the entire use case of the project. It may be the second or third application of the technology that flips the switch to a positive ROI.
A good example of such a line of sight comes from one of our clients, who uses the natural language analytics capabilities of InfoSphere BigInsights to understand email correspondence. Through the BigInsights analysis, our client can identify problems in customer satisfaction before they manifest in an unhappy customer leaving the firm.
To begin this project, our client selected a subset of its data to analyze; in this case, the data represented just one region of the country. Once we analyzed the data from that region, it did, in fact, show a positive ROI. However, we didn’t base the model on achieving positive ROI from a single region. The project’s ROI plan was based on running all of the email across the client’s entireU.S.footprint through the system, not just one division. Staking out a path to ROI and getting agreement on it keeps everyone focused on what is practical.
Please note we’re not saying that pure experimentation – and going after R&D projects first as a way of understanding new technologies – is not a valid approach. What we are saying is it is important to differentiate between experimentation and your first “business” project. Do not confuse how you learn with how you implement something that ultimately has to go into production.
In future posts, we will cover: