Registered: 1182202224 Posts: 99
Reply with quote #1
In his July/August/September 2012 article, titled "Big Data, Big Ruse," Stephen examines the term "Big Data," which is supposed to represent a paradigm-shifting innovation in the world of business intelligence (BI). However, Stephen contends that Big Data is an empty term that represents little more than the current round of marketing hype, trying to separate you from your BI dollars.
What are your thoughts on this article? We invite you to post your comments here.
Registered: 1348863542 Posts: 1
Reply with quote #2
Thanks, Stephen, for a long overdue public thrashing of this concept of "Big Data". I thought the most insightful thought from your piece was this: "The success of BI, however, cannot be measured in petabytes or any other unit of data volume. It must be measured in our increased ability to understand data and then make better decisions based on that understanding."
Not long ago, I was interviewing for an analytics leadership position, and the hiring manager asked, "What experience do you have with Big Data? We really want someone with big data experience." I asked her to define what she meant by big data. How big is big in her world? I politely informed her that she was asking the wrong question. What she really should have been asking is, "What kind of insights have you been able to derive from analysis of large data sets, and how did you achieve this?" I explained to her that the term big data is far too vague to be meaningful, that it is simply a product of the BI consulting and software vendor world, constantly searching for a new marketing angle and designed to take advantage of the weak-minded. In the end, big or little, the ultimate objective of the analyst is to derive insight and recommend action.
I would contend that the big data "movement" is inherently evil. It is intended to make data analysis seem so difficult and terrifying that you would naturally want to enlist the help of a software vendor and consultant. It implies that you don't really need to understand your data; rather, you can just turn to this magical tool set and it will derive all the insight you need, no fuss, no muss. In a world dominated by business leaders with little analytical ability or technical savvy, and an insatiable desire for immediate results and instant gratification, it's a powerful lure.
Registered: 1320901864 Posts: 3
Reply with quote #3
I am in full agreement about the misrepresentation that 'big data' gives. One particular misunderstanding is made clear by Matthew O’Kane's quote:
"The main driver of benefit is when predictive analytics is improved through the use of more varied and deeper data sets. It is this area where new techniques are required because the tried and tested regressions and decision trees won’t cut it any more."
It's completely the other way around: data mining techniques require sufficient amounts of data to properly learn. If you have 'too much data', then simply take subsamples. In most cases the analysis is still performed on samples. The deployment might be on a huge customer database. So the issues are choosing data that reflect the scenario of implementation ("are the holdout data similar to the context in which we'll deploy this algorithm?"), evaluating performance appropriately, and always, always, keeping the business objective and requirements in mind.
If 'big data' lures companies and individuals into analytics education, then that's great. We see a surge in 'business analytics' programs in business schools (see my blog post http://www.bzst.com/2012/08/the-mad-rush-masters-in-analytics.html). Being involved in this scene, I can attest to its usefulness all along.
__________________
Galit Shmueli
SRITNE Prof of Data Analytics and Associate Prof of Statistics & Information Systems
Indian School of Business
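As an illustration of the "simply take subsamples" point in the post above, here is a minimal Python sketch. It assumes the data arrive as an iterable of rows too large to hold in memory; the function names `reservoir_sample` and `split_holdout` are hypothetical and not from the post. Reservoir sampling is one common way to draw a uniform subsample in a single pass; the post does not name a specific method.

```python
import random


def reservoir_sample(rows, k, seed=0):
    """Draw k rows uniformly at random from an iterable of unknown length.

    Uses the classic single-pass reservoir method, so the full data set
    never has to fit in memory. If fewer than k rows exist, all are returned.
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def split_holdout(sample, holdout_fraction=0.25, seed=0):
    """Shuffle a sample and split it into (training, holdout) lists."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    cut = len(shuffled) - n_holdout
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Stand-in for a huge customer table: one million row ids.
    sample = reservoir_sample(range(1_000_000), k=1000, seed=42)
    train, holdout = split_holdout(sample, holdout_fraction=0.25, seed=42)
    print(len(train), len(holdout))
```

A purely random holdout only checks the model against data drawn the same way as the training rows. Whether it resembles the context in which the model will be deployed is the separate question the post raises, and code alone does not answer it.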
Registered: 1348995178 Posts: 190
Reply with quote #4
Over the last five or six years, my colleagues and I have been involved in several projects related to business data analysis. I can speak only from our experience in such projects about how and when the term "big data" came into the picture.
First of all, when I talk about data, I mainly mean data storage, data investigation, and data visualization. Data storage is already there and performs well in most cases: it can store billions of records, with at least one query-like language and other programming interfaces usable for data extraction. Data investigation is a complex process performed by analysts, based on several requests to IT professionals for data extraction. It involves in-depth study of the detailed data, statistical pattern detection, and so on. Data visualization is, in most cases, a direct result (or even a part) of data investigation, and follows display rules suited to the visualization knowledge of the target audience.
Most BI packages claim to provide solutions for all three: alternative data storage (optimized for fast queries), fast visual data investigation tools, and rich libraries/interfaces for data visualization that target several operating systems or browsing technologies (web, tablet apps, etc.).
“Big data” affects data storage technology and data investigation algorithms. It will not change the dashboard-like output, but it will change the display or extraction of detailed data. With small data, the raw data behind the aggregated dashboard figures can easily be displayed or saved in spreadsheets or text files. This changes when the raw data becomes too large, and several solutions are available: displaying data aggregated at a higher level than the raw data, paginated views, and so on. Common upper limits are a few dozen rows for web-like tables, one million for spreadsheets, and a few million for text files.
These limits stay the same whether the database holds a few million records or hundreds of billions. A larger amount of data will not change the fundamentals of data visualization. It will, however, change the storage of data and the investigation algorithms. If a BI package is sold as an all-in-one solution for storage, investigation, and visualization, then “big data” is going to be a decisive marketing term. From my point of view, the biggest issue is not the use of the term “big data” in a marketing campaign, but the way it diverts BI from its real purpose.
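The two workarounds mentioned in the post above, showing totals at a higher level than the raw data and paging through the raw rows, can be sketched in a few lines of Python. This is only an illustration under assumptions: records are plain dictionaries, and the names `aggregate`, `page`, `region`, and `amount` are hypothetical rather than taken from any BI package.

```python
from collections import defaultdict
from itertools import islice


def aggregate(records, key, value):
    """Sum `value` per distinct `key`, giving dashboard-level totals."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)


def page(records, page_number, page_size=25):
    """Return one page of raw records; page_number starts at 1."""
    if page_number < 1:
        raise ValueError("page_number starts at 1")
    start = (page_number - 1) * page_size
    return list(islice(records, start, start + page_size))


if __name__ == "__main__":
    # Small stand-in for a raw table that would not fit on one screen.
    records = [
        {"region": "North" if i % 2 == 0 else "South", "amount": 1.0}
        for i in range(100)
    ]
    print(aggregate(records, "region", "amount"))
    print(len(page(records, page_number=2, page_size=25)))
```

The page size stays the same whether the source holds a hundred rows or hundreds of billions, which is the post's point: volume changes how the rows are stored and fetched, not how many of them a person can read at once.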
Registered: 1330960889 Posts: 57
Reply with quote #5
You ask: "Does Big Data Affect Data Visualization?". Well, the biggest Big Data-related project I can think of is the LHC at CERN. I imagine the amount of data used in BI tools pales in comparison to the amount collected there (not to mention the amount that is filtered out and thrown away). Has there been any need for new kinds of data visualisations there? Not so far as I've seen.
Maybe science is completely different to BI in this respect, but I don't see why it would be. Maybe the people at CERN just aren't innovative enough... err maybe not.
Registered: 1349186749 Posts: 2
Reply with quote #6
I'm not altogether convinced that "more data vs. better understanding of data" is a totally valid argument. Sure, there is a clear benefit in better understanding of data, but should this challenge prohibit opening up new channels of data? Why can the two statements not exist at the same time?
It seems to be more of a frustration rather than anything else.
Registered: 1135986598 Posts: 823
Reply with quote #7
I'm not arguing against more data. I'm arguing that more data without the ability to make sense of data is not only useless, but distracting and overwhelming. More data is good when it is useful data that people can manage through skilled sensemaking. The gist of my argument is that the vendors are putting the cart before the horse and loading the cart to unmanageable proportions. We need better horses; or perhaps we actually need a car.
__________________
Stephen Few
Registered: 1349186749 Posts: 2
Reply with quote #8
Less garbage is better than more garbage. Sure. I'm on board. Thanks.
Registered: 1350443864 Posts: 2
Registered: 1135986598 Posts: 823
Reply with quote #10
Thanks for sharing your thoughtful perspective on the topic of Big Data. I wasn't aware of Kobielus' response to my article because I don't tweet. I avoid discussions in venues that restrict communication as Twitter does.
To set the record straight, I am not in the least cynical. What I am is skeptical. A person of intelligence with my experience who doesn't scrutinize each new panacea of the information age with a healthy dose of skepticism is either insane or is on the payroll of one or more of the companies that promote this stuff.
Check your logic again regarding my claim that we crave things that are BIG. The fact that we also crave a few things that are small in size does not negate my claim. By the way, we don't crave diamonds, caviar, or waistlines because they're small, but because they are perceived as having great (i.e., big) value. In other words, they make us feel big in the eyes of others. (Does anyone really like those little fish eggs?)
Another correction: I don't define Big Data only in terms of volume. I include in my definition every characteristic that I've encountered in the many papers, blogs, and articles that I've read on the topic. How does Kobielus' organization, IBM, define Big Data? I bet that you'll find many definitions from IBM and a fair amount of conflict between them.
Having been around awhile, I remember when the term Data Warehouse was brand new. At the time, I worked for a large bank that was what we called an "IBM shop." That meant that we ran our primary systems on a large IBM mainframe. As such, we had a dedicated IBM Account Manager. Because I knew that IBM was involved in the new venture of data warehousing, one day I asked our account manager to tell me how IBM defined the term "data warehouse." He looked at me, paused, and with a wry smile on his face said, "Steve, we at IBM define the term 'data warehouse' to mean whatever it is the client thinks it means." I understood and appreciated his candor. Essentially, he was admitting that, as a large vendor in the space, it was useful for IBM to leave the definition vague and malleable so that whatever it was that a client thought a data warehouse was, IBM could say "Yes, that's precisely how we define the term and if you sign right here we'll sell you one for a million dollars or so."
Unfortunately, 98% of what we hear about Big Data is marketing hype. Those of us who actually work with people to derive real value from data care little about the latest term for what we do -- Big Data, Analytics, Data Science -- we care about using data more effectively to create a better world. There is a big chasm between people who talk about this stuff and people who actually do it.
I don't know Kobielus and he doesn't know me. What I do know, however, is that as an employee of IBM he is paid to promote Big Data. I'm paid to teach people skills that they desperately need to use data more effectively. Slowly but surely, what I'm doing is making a difference. I know a lot of people, however, for whom Big Data (and the vendors like IBM that promote it) has created nothing but confusion, frustration, distraction from what really matters, and a huge loss of revenue. Perhaps Kobielus would like to meet a few of these folk and explain to them that my perspective on the topic is "profoundly shallow, cynical, and uninformed."
__________________
Stephen Few
Registered: 1350443864 Posts: 2
Reply with quote #11
First of all, the "small" thing was sort of a joke, but I guess you didn't see the humor in it.
I was sort of offended by your comment, "Those of us who actually work with people to derive real value from data."
So I stand by my review, elements of cynicism. See you in Las Vegas.
And, "Using data more effectively to create a better world?" I'd get on that bus too, but you know damned well the bulk of this stuff is used by government, military, law enforcement and making rich people and companies richer. Or maybe I'm just cynical.
-NR
__________________
Neil Raden
Registered: 1135986598 Posts: 823
Reply with quote #12
I didn't have you in mind when I made the comment that offended you. The association never crossed my mind. I was referring primarily to Kobielus. (What made you think I was referring to you?)
Regarding your cynicism about how data is usually used for less than good, I agree that this is true. This doesn't make me cynical, however, because I really do believe that great things can be accomplished by improved understanding derived from data. If I didn't believe this, I wouldn't do what I do. I'd probably be working for the World Health Organization or some other non-profit group.
Las Vegas? I have no plans to be in Las Vegas, which is one of my least favorite places in the world. Are you referring to TDWI? If so, I haven't participated in a TDWI conference for well over two years. I was eventually turned off by the degree to which TDWI accepts money from and grants favors in exchange to BI vendors.
__________________
Stephen Few
Registered: 1364844412 Posts: 1
Reply with quote #13
The essence of the matter seems to lie in Stephen's 3 questions about information on page 6 of 8:
1. What does it mean?
2. Why does it matter? (Not “Does it matter?” because it’s too tempting to respond “Yes” without really understanding why.)
3. How then should we respond?
Throughout my career - which dates to the early days of time-sharing (now called cloud computing?) - it's proven helpful to encourage ValueDialog around data and information. ValueDialog is a straightforward conversation that starts with "So, what's it worth?" Perhaps this is a 4th question for Stephen's list. Thanks for working to keep the focus on thoughtful decisions, alignment and action as the goal.