What does e-commerce for data really mean?


The way that data is sold today is extremely opaque. Unlike Software-as-a-Service companies (Slack, Stripe, Shopify, etc.), Data-as-a-Service companies almost never put their pricing on their website (SafeGraph is a stark exception). In fact, most say very little about their product on their website at all. Instead, DaaS companies, and companies that sell data “on the side,” usually have a pretty generic website that says something like “We’ve got really interesting data in the category of X! Call us if you’re interested.”

On that phone call, a salesperson assesses the buyer’s interest in the data, their use case, their time horizon, and their willingness to pay. The salesperson then picks a number, an educated guess based on the conversation, but that number is unlikely to reflect the true market value of the data. This is partly because the provider may not know how interchangeable their data is or what competing datasets are going for. The provider may also not know what data the buyer already has, and the buyer is unlikely to say. Data becomes far more valuable when it is combined with other datasets, and to reveal the most potent combination would be to expose the buyer’s own competitive advantage.

This price opacity leads, as price opacity usually does, to an extremely inefficient market. Sellers and buyers have a hard time finding each other. When they do, the gap between what buyers are willing to pay and what sellers are willing to accept is wide. Buyers may want to buy just a single row of the provider’s data: the missing link in their chain of data that will bring untold fortune to their business. Sellers want to sell entire datasets, on recurring licenses, in order to boost their own revenues and please investors. This misalignment of incentives creates an opportunity for middlemen, such as data brokers and data marketplaces, to charge very high premiums for introducing buyer to seller. But these middlemen don’t do much for price transparency.

Our goal is to bring price transparency to data sales by encouraging sellers to post pricing for their data publicly, and by giving them market feedback. Sellers who have particularly “exclusive” datasets, that is, data they don’t intend to sell to many buyers, can choose to hide pricing on their storefront and build orders on behalf of customers at any price. But we hope that as more data sellers enter the market, price transparency becomes the norm. We also hope to offer guidance to our customers who are just getting started in the data business. A larger dataset does not always command a higher price: a dataset that has just 100 rows may be just as valuable as, or more valuable than, a dataset with one billion rows. Pricing may be dynamic: data matching certain filters, say a certain geographic area or company demographic, may be worth more than data matching others. And lastly, price may depend a lot on the target buyer. Corporate data buyers have different needs than financial ones.

“Productizing” Data

Datasets are hard to describe. They are a list of fields. They are a set of values at a point in time. You can’t take a picture of them or really capture what they are, because the data is always changing. And so when someone is interested in purchasing a dataset, they are left with approximations: a small sample, usually in the form of a spreadsheet. A data dictionary, which contains the list of fields in the dataset and their definitions. A marketing PowerPoint.

None of these artifacts are compelling in the way that shopping online is compelling. E-commerce broadly lets the customer experience the product without actually possessing it. It packages the product in an appealing way while also seamlessly delivering the product to customers as soon as they are ready to buy. It mimics in many ways the brick-and-mortar retail experience by taking advantage of our human psychology – the allure of branding, the satisfaction of seeking out and finding, and the avoidance of scarcity. Find what you want and buy it in just a few clicks. No conversation necessary.

For salespeople at data providers, the artifacts at their disposal are almost always limiting, and perpetually out of date. A sample set needs to be regenerated on the fly for every new potential customer. A data dictionary needs to be reviewed in case an engineer changed the name of a field. A PDF needs to be revised to reflect the fact that the data is now updated weekly instead of monthly. E-commerce lets providers break free of these artifacts. The data shop is never out of sync with the underlying data. Customers can customize their own sample. The storefront is the collateral.


Products without distribution are in no man’s land, sitting on the proverbial shelf in someone’s garage. Data is no different. Several data marketplaces have popped up in the past 18 months or so to streamline data distribution (Amazon and Snowflake each have their own, and there are a few startup competitors like Narrative and Datarade), but the value that these companies provide to data companies is in the infrastructure, not in the marketing. By providing the pipes to data buyers that already buy on Amazon, or already have their data in a Snowflake data warehouse, these companies pitch to data providers that buyers will flock to their product simply because it exists.

What these companies don’t do is help the providers in the marketplace actually stand out. They don’t make the products seem valuable. There may be a short description. There may be a free sample of the dataset – who knows when it’s from. But there is no easy way to find data in the marketplace, nor a compelling reason to buy.

As on the internet at large, data providers who want to build a brand need to control their own message. If their products are only listed in a crowded marketplace, it’s very unlikely that those products will stand out. It’s very unlikely that buyers will know what their data means, or why it’s valuable. In all likelihood, buyers will still end up back on the company’s website, clicking “Book a Demo” and talking to the salesperson to figure out what’s really going on.

Just as Shopify offers brick-and-mortar retailers the tools to easily build an online store, we offer DaaS companies the tools to easily build their online data shop. Customers can search and filter within the actual dataset to see if the data has what they need (there is even a “match” feature for stock tickers, town names, brands, etc. to see if the dataset contains what they’re really interested in). They can get an instant price quote based on their search criteria, and they can check out and get the data immediately as a zip file. What’s more, their search is saved, so they can come back to get a refreshed copy of the data when they need it, or subscribe for automatic updates.
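
To make the search, match, and quote flow concrete, here is a minimal sketch in Python. It is an illustration only, not our actual implementation: the sample rows, the `PriceRule` fields, and the `search`, `match`, and `quote` functions are all hypothetical names, and the pricing model (a flat per-row price with a minimum order) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass

# Hypothetical sample rows; a real storefront would query the live dataset.
ROWS = [
    {"ticker": "AAPL", "state": "CA", "visits": 1200},
    {"ticker": "MSFT", "state": "WA", "visits": 950},
    {"ticker": "AAPL", "state": "NY", "visits": 400},
    {"ticker": "TSLA", "state": "CA", "visits": 700},
]


@dataclass
class PriceRule:
    per_row: float        # assumed flat price per matching row
    minimum_order: float  # assumed floor applied to any non-empty order


def search(rows, **filters):
    """Return the rows whose fields equal every filter value."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def match(rows, field, wanted):
    """Report which of the buyer's values appear in the dataset."""
    present = {r[field] for r in rows}
    return {value: value in present for value in wanted}


def quote(rows, rule):
    """Price a search result: per-row price, subject to a minimum order."""
    if not rows:
        return 0.0
    return max(len(rows) * rule.per_row, rule.minimum_order)


rule = PriceRule(per_row=2.50, minimum_order=4.0)
california = search(ROWS, state="CA")
print(match(ROWS, "ticker", ["AAPL", "NFLX"]))
print(quote(california, rule))
```

In this sketch the quote depends only on how many rows the buyer's filters select. A seller who wants certain filters to cost more, such as a particular geographic area, could extend the hypothetical `PriceRule` with per-filter multipliers.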

Sellers get analytics on what data buyers are searching for within their data storefront. They get lead capture from buyers who aren’t yet ready to check out but who are interested and want to know more. They get order management, with customer names, the data they bought, and the price they paid all in one place. And most importantly, they get control over the messaging around their product – what makes this data special? Why should you buy it? What is it worth?

For questions, drop us a line at sales@getsyndetic.com.
