The Value of Big Data and Why It’s Difficult to Monetize

I recently attended a session on Autonomous Cars at the law firm Herbert Smith Freehills. It was an insightful session where the lawyers gave great presentations on the legal issues they advise on, such as M&A, regulation and product liability. However, one non-legal item they talked about was the ability of car manufacturers to “monetize data.” The idea of monetizing data comes up often, but it’s a lot harder than it sounds.

A decade ago, I was working for a large credit card company looking at new growth opportunities. We were convinced that we could become the most valuable company in the world. Our reasoning went like this. Google was worth billions of dollars. But Google’s value was based on what web links people clicked. We, as a credit card company, had data on what people actually bought. Because our data was more relevant to advertisers than Google’s data, we should clearly have been worth more than Google.

There was just one problem. While we had this data, so did Bank of America, Capital One, JP Morgan and every other bank. And everyone was looking to monetize their data.

Did I say one problem? It wasn’t just financial services companies looking to out-Google Google. The phone companies were in this game too. They were saying, “Hey, we should be the most valuable companies in the world. Google has data on where people go on the web, but we have data on where people actually are in the real world.” Suffice it to say, there was a lot of data around.

This reminded me of an article written about undersea cable capacity in the days of the telegraph. Andy Kessler shared the following cautionary tale:

After undersea telegraph messages were first sent between Newfoundland and Ireland in 1866, a half-dozen companies sprang up to relay messages between London and Paris and New York. Half the traffic was for stock trading. These companies charged up to $5 per word and could transmit 15 to 17 words per minute. Each thought it could generate revenues of $5 million dollars or more per year. It was easy to raise the $2 million it took to lay undersea cable and investors, who constantly dashed off telegrams themselves, were all too happy to lend money.

Each of these companies assumed that they’d have a monopoly on the market. But when many companies entered the market based on that same assumption, all of the excess capacity created a race to the bottom for telegraph message pricing, forcing many of the companies into bankruptcy.

So what makes Google different? I remember a discussion with stock analysts around that time. I had written a paper on Mobile Payments along with Citi’s Equity Analysts. The topic of data was very hot and various analysts asked me, “Who’s going to win the data game? Who has the best data?” I explained that the real differentiator, and what people will pay for, isn’t the data itself but what you can do with the data.

As the famous Harvard marketing professor Theodore Levitt said, “People don’t want to buy a quarter-inch drill. They want a quarter-inch hole!” In the data space, this would be, “People don’t want to buy data. They want to buy results!”

How Google Uses Big Data

The goal of a search engine is to find the most relevant documents. In the early days of search engines, things were relatively easy. You could:

  1. Examine web pages: Early search engines like Lycos and AltaVista would look at web pages and determine which ones were the most relevant. They would do this by looking at factors like the number of times a word was repeated or whether the search term was in the title of the document.
  2. Curate a directory: Yahoo, on the other hand, had humans hand-curating the web into a giant directory. This was relatively easy when the web only had a few thousand pages.

My Interpretation of the Early Web. With Only a Few Pages, Choosing a Winner Wasn’t That Difficult.
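
To make the first approach concrete, here is a minimal sketch of scoring pages by counting how often the search term appears, with a bonus when it shows up in the title. The function name `keyword_score`, the `title_bonus` weight and the sample pages are all made up for illustration; this is not how Lycos or AltaVista actually implemented ranking.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, title, body, title_bonus=5):
    """Score a page by query-term frequency, plus a bonus for title matches.

    title_bonus is an arbitrary illustrative weight, not a known value.
    """
    body_counts = Counter(tokenize(body))
    title_words = set(tokenize(title))
    score = 0
    for term in tokenize(query):
        score += body_counts[term]      # how often the term is repeated
        if term in title_words:
            score += title_bonus        # term appears in the page title
    return score


# Hypothetical pages: (title, body text)
pages = {
    "ibm.com": ("IBM", "IBM builds computers and software."),
    "spam.example": ("Cheap stuff", "ibm ibm ibm ibm ibm ibm ibm ibm buy cheap stuff"),
}

ranked = sorted(
    pages,
    key=lambda url: keyword_score("ibm", *pages[url]),
    reverse=True,
)
print(ranked)
```

In this toy data the page that simply repeats the keyword outranks the real one, which hints at why scoring on page text alone is easy to game.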

However, as the web grew, it became more and more difficult to manage search with these methods. Lycos and AltaVista were overwhelmed. Not only was it difficult to distinguish between multiple similar pages based on the text in the page, but web spammers were also trying to fool the search engines into promoting their pages. Yahoo had a problem hiring enough people to keep up with the quickly growing web. Neither strategy could scale.

The State of Web Search When Google Entered the Game. As the Web Started Exploding, Finding the Best Pages Became Increasingly Difficult.

Google went down a different path. Using an algorithm called PageRank (named after Larry Page) in a search engine originally called BackRub (oh those Googlers and their funny names), Google was able to make use of data that everyone else was overlooking. The links between pages were just as valuable as the data in the pages themselves. For example, any page can claim to be the authoritative page for IBM. But if 100 other pages link to IBM.com as the right answer, it’s easy to lift that one to the top.

Google Changed the Game by Using Links from Other Sites as a Measure of Quality
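
Here is a rough sketch of the idea of ranking pages by their inbound links, written as a textbook-style PageRank power iteration. The damping factor of 0.85, the iteration count and the tiny link graph are illustrative assumptions; Google’s production ranking is far more involved than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from the link graph.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not known values.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web: three sites link to ibm.com, nobody links to the imposter.
links = {
    "blog-a.example": ["ibm.com"],
    "blog-b.example": ["ibm.com"],
    "news-c.example": ["ibm.com", "blog-a.example"],
    "fake-ibm.example": [],
    "ibm.com": [],
}

rank = pagerank(links)
best = max(rank, key=rank.get)
print(best)
```

The imposter page can say whatever it likes about itself, but with no one linking to it, its rank stays near the baseline while ibm.com collects rank from every page that points to it.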

There are a few things to realize about Google’s use of data:

1.  Google didn’t have the “best” data. Yahoo had a more accurate method for categorizing the web. Having humans look at content gave better results for each individual page. Unfortunately for Yahoo, that method was too slow and expensive to sustain.

2.  The data didn’t cost Google anything. At the time, everyone was concentrating on the web pages themselves — not the linkages between the pages. This kind of information is often called “information exhaust” — information that’s a by-product of what you’re really looking for. It was already out there, free for anyone to use.

3.  It’s the capability that made the difference. While the data was free, it was up to Google to organize the data and make it useful. Going back to Levitt’s drill and hole (the “jobs to be done” idea), Google put this data to work solving a problem for users.

4.  More data is better. While other search engines were getting overwhelmed by the torrent of data from an explosion of web content, Google’s product actually benefited: the more links that point to a quality web page, the better Google’s search results become.

Google has been using this template for various other projects since it was founded. It can leverage data in some very creative and useful ways. Take location data, for example. If you have an Android phone or Google Maps on your phone, Google is keeping track of your location data. Google lets you take a look at your own location history. The data is useful to me, but it’s a bit odd seeing that Google holds a record of everywhere I’ve been.

An Example of Google Tracking Me Through the Day.

So how can Google use your location, along with that of others, to create value? Well, one way is to aggregate this data to show where there’s road traffic. If you have a lot of phones not moving on a road, then you can flag that road as congested. But where else could Google use this data? Google added a feature to Google Maps that lets you see how crowded a restaurant is at different times of day, based on how many cell phones it finds at the restaurant.

A Graph of Popular Times at Bubby’s Restaurant Compiled Through Location Data. Note the Popularity of Sunday Brunch.
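
Both of these aggregations are simple to sketch. The example below assumes a made-up data shape (tuples of road, phone and speed, and tuples of phone and hour) and arbitrary thresholds; it illustrates the idea only and says nothing about how Google actually processes location data.

```python
from collections import defaultdict


def congested_roads(pings, min_phones=3, max_speed_kmh=10.0):
    """Flag roads where several distinct phones are barely moving.

    pings is an iterable of (road_id, phone_id, speed_kmh) tuples.
    min_phones and max_speed_kmh are illustrative thresholds.
    """
    slow_phones = defaultdict(set)
    for road_id, phone_id, speed_kmh in pings:
        if speed_kmh <= max_speed_kmh:
            slow_phones[road_id].add(phone_id)
    return {road for road, phones in slow_phones.items() if len(phones) >= min_phones}


def popular_times(visits):
    """Count distinct phones seen at a venue for each hour of the day.

    visits is an iterable of (phone_id, hour) tuples for one venue.
    """
    phones_by_hour = defaultdict(set)
    for phone_id, hour in visits:
        phones_by_hour[hour].add(phone_id)
    return {hour: len(phones) for hour, phones in sorted(phones_by_hour.items())}


# Hypothetical pings: three slow phones on Main St, mostly fast traffic on the highway.
pings = [
    ("main-st", "a", 4.0),
    ("main-st", "b", 0.0),
    ("main-st", "c", 7.5),
    ("main-st", "a", 3.0),
    ("highway-1", "d", 95.0),
    ("highway-1", "e", 5.0),
]
jammed = congested_roads(pings)

# Hypothetical restaurant visits: busy at 11:00, quiet at 15:00.
visits = [("a", 11), ("b", 11), ("c", 11), ("a", 11), ("d", 15)]
busy = popular_times(visits)

print(jammed, busy)
```

Counting distinct phones rather than raw pings keeps one chatty device from making a road or a restaurant look busier than it is.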

It’s important to remember that Google did not have the best data to determine busy times at restaurants. Telephone companies and restaurant sites (e.g., Yelp, OpenTable) likely had better data. For example, OpenTable manages the reservations systems for many restaurants and actually knows how busy they are. But yet again, Google was the best at putting the data to work at solving this problem.

So let’s sum up. People still talk about monetizing data but their data isn’t as valuable as they think it is. There’s a lot of data out there that can solve problems and generate value. The tricky part is extracting the value from the data. Google did this in search and continues to do so in lots of other ways.

Note: Ben Thompson from Stratechery gave a similar talk about how Google works last week to kick off the University of Chicago Antitrust and Competition Conference.