Big data – Big Deal?

By Robert Szabo

Is data going to solve Friedrich von Hayek’s knowledge problem? Well, data is the gold of the early 21st century, but remember, gold is volatile.

We are living in an era of unprecedented growth in available structured and unstructured data. Like any unprecedented event, it has its popular phase of imagination, which motivates the markets to come up with creative, forward-looking concepts. There is nothing wrong with visions of development, but in many cases we need to distinguish realistic developments from mirages, because having access to data and correctly understanding the information it holds are two very different things.

Although the current momentum of data abundance may offer endless opportunities, it is important to manage expectations. Many companies have still not succeeded in implementing lean techniques, even some 40 years after Toyota successfully introduced the concept, and lean start-up models are only now being truly discovered. This may indicate that companies have difficulty understanding their own data, so an immediate jump to Big Data concepts may be confusing and may not bring the expected value. Correct data management and the information management that follows from it are the key principles and, at the same time, the main challenges.

The sheer quantity of historical information, combined with the most up-to-date statistical techniques, makes it possible to analyse past events and helps companies understand their customers faster and, thanks to the amount of data available, more precisely. We are surrounded by Big Data success stories from some of the big companies. However, since they do not state how much they knew about their customers to begin with, these stories may simply reflect how little they knew before, which is why any new knowledge gained, particularly from Big Data, is so heavily advertised. At the end of the day, any new knowledge obtained is a great achievement. The question, however, is whether the company could have achieved it with a simpler solution than Big Data, considering the burden of implementing it.

Another questionable point in the current development of Big Data is the perception that hypotheses and models will no longer be needed, because the vast amount of data will reveal the exact correlations and causations. Mathematically this might sound right, but we must not forget customer behaviour, which is a constantly changing variable. It changes because each customer subjectively interprets the surrounding variables he or she faces at every moment. Every single variable may influence customers differently and may lead to different behaviour. So far, Big Data has given only an ambiguous picture of the particular roles customers play in markets and of the jobs products fulfil. We might even be tempted to say that there are no behavioural patterns governing one’s activities, as the variables are practically limitless. Big Data predictions are currently based on ceteris paribus situations rather than on the social dimension of how data evolves. I am convinced that social scientists should and will play an ever more important role in data science by interpreting the social context of the data. Context analysis will be the key to properly utilising the opportunities rooted in Big Data concepts.

Even though Big Data concepts suggest that all the variables could be found and analysed, does faithful data amount to complete knowledge? Does it mean that unique products will be available to every customer and, moreover, that they will capture the whole consumer and producer surplus? I doubt it, at least not in the near future, so customer segmentation and data aggregation will remain with us. In other words, models of behaviour are likely to be used to ease the analysis, and working with aggregated customer segments will therefore ignore some of the specifics of local knowledge and fail to deliver the ultimate goal of Big Data.

Companies tend to implement Big Data concepts in order to reach the stage of one-to-one marketing. I agree that data is the gatekeeper to such a business model, and stating this goal can be read as an aspirational target that encourages continuous improvement, known as Kaizen in the Toyota Production System. However, not every industry can really achieve it in the foreseeable future. The first, challenging phase will be shaped by appropriate data aggregation and the subsequent noise cleaning needed to reach the desired data accuracy, while the later stages will focus on one-to-one production, most likely leveraging developments in 3D printing.

Undoubtedly, Big Data will help create more homogeneous and better distinguished customer segments and will help identify correlations never seen before. However, in the early stages not every correlation or segment discovered will be the best option for companies to pursue. Companies will have to prioritise their efforts until they reach the point of one-to-one marketing.
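
One reason why not every newly discovered correlation deserves attention can be shown with a small Python sketch. It is only an illustration under assumptions of my own (30 customers, purely random and unrelated variables, a fixed random seed), not a method used by any of the companies mentioned. It compares the strongest correlation found by chance among 10 candidate variables with the strongest one found among 1,000.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_customers, n_variables, rng):
    """Largest |correlation| between a random target and unrelated random variables."""
    # Hypothetical "customer behaviour" target: pure noise by construction.
    target = [rng.gauss(0, 1) for _ in range(n_customers)]
    best = 0.0
    for _ in range(n_variables):
        # Each candidate variable is also pure noise, unrelated to the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_customers)]
        best = max(best, abs(pearson(target, candidate)))
    return best


rng = random.Random(42)  # fixed seed so the run is repeatable
few = strongest_chance_correlation(30, 10, rng)
many = strongest_chance_correlation(30, 1000, rng)

print(f"strongest correlation among 10 unrelated variables:   {few:.2f}")
print(f"strongest correlation among 1000 unrelated variables: {many:.2f}")
```

None of these variables has anything to do with the target, yet the more of them we examine, the more impressive the best match tends to look. This is why a discovered correlation still needs a hypothesis and context before a company acts on it.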

The attractiveness of Big Data lies in the unknown correlations and the opportunity to discover them. I admit that, with exabytes of data, correlation seems to overrule any other mathematical or statistical model, but this may hold mainly in the natural sciences. End-customer behaviour develops differently, and we have learned many times from the past that ultimately there are no patterns of behaviour. Given the pace of current global collaboration and the resulting local adjustments of the variables, new variables are being created in every corner of our society faster than ever. Aggregation makes the data appear more stable and misses these small changes, which are the real opportunities and the keys to further success. To understand these changes better, smaller data would be more beneficial. Partitioning problems into small data sets might bring more effective results and could, moreover, serve as a learning step towards working better with data and, eventually, towards a better understanding of unrelated Big Data sets. Skipping this interim step in data science is ambitious and promises ambiguous results, whereas smaller data can focus directly on specific customer sets and, with a lower commitment, a quicker return may be expected. The key is not to have a large data set but to have a data set that will eventually help solve the problem.
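
The point about aggregation hiding small changes can be made concrete with a little arithmetic, sketched here in Python. All names and figures are hypothetical and chosen only for illustration: a customer base of 10,000, of which a small segment of 200 raises its average spend from 50 to 80 while everyone else stays the same.

```python
def mean_spend(segments):
    """Average spend per customer across all segments.

    `segments` maps a segment name to (number of customers, average spend).
    """
    total_customers = sum(size for size, _ in segments.values())
    total_spend = sum(size * spend for size, spend in segments.values())
    return total_spend / total_customers


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100 * (new - old) / old


# Hypothetical figures: only the small segment changes its behaviour.
before = {"mainstream": (9800, 50.0), "early_adopters": (200, 50.0)}
after = {"mainstream": (9800, 50.0), "early_adopters": (200, 80.0)}

aggregate_change = pct_change(mean_spend(before), mean_spend(after))
segment_change = pct_change(before["early_adopters"][1], after["early_adopters"][1])

print(f"change seen in the aggregate: {aggregate_change:.1f}%")  # 1.2%
print(f"change inside the segment:    {segment_change:.1f}%")    # 60.0%
```

In the aggregate, a 60% shift in one segment shows up as a change of roughly 1%, which is easy to dismiss as noise. Looking at the smaller data set directly is what makes the opportunity visible.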

In conclusion, Big Data is the driving force for future developments and productivity improvements, but given its current early phase and the intangibility of the value it brings, I would suggest focusing on smaller data. This approach might limit the risk to the return on investment and serve as a learning step for data science as well as the first step towards successfully implementing Big Data concepts. Smaller data analyses may shorten companies’ response time to the changing environment, which may result in fulfilling customer requirements better and faster and thus in winning customers and the market. Smaller data also limits the company’s risk and therefore improves its competitiveness from both perspectives: company operations and the market.

Big Data concepts suggest that distributed knowledge can be gathered and governed by a centralised approach while, through correct data management, decision making remains distributed. This may indicate that Big Data concepts are marginalising Hayek’s knowledge problem and are becoming the driving force behind knowledge-based markets. Before Big Data achieves its full potential, there are several obstacles to consider in methodology, accuracy, privacy and the interpretation of data context, but promising first steps have already been made.

