Applying design thinking to data

Design thinking has been around since the 1960s, but it really started to gain influence in the early ’90s, when IDEO developed its design process grounded in problem solving and empathic thinking. IDEO is generally credited with bringing design thinking out of academia at the Stanford Design School (also known as the d.school) and into the mainstream. Soon Apple, Google, GE, and Samsung popularized the rapidly iterative approach to design, TED Talks were filmed, and suddenly design thinking was everywhere. But what is it?

According to the Interaction Design Foundation, design thinking is an iterative process in which we seek to understand the user, challenge assumptions, and redefine problems in an attempt to identify alternative strategies and solutions that might not be instantly apparent with our initial level of understanding. It involves empathizing with our intended user to deeply understand how they will use our product, what problems they might encounter in doing so, and how we might eliminate those problems through constant iteration and refinement.

The data space is sorely lacking in design thinking.

Much of the focus on data in recent years has been on managing data as an asset rather than as a product. By thinking of data as something to extract value from first, rather than as something to be used first in order to extract value, the industry is missing a huge opportunity to make data more consumable, more usable, and more delightful. Today the process usually looks like this: one company buys or acquires data from several other places, its engineers normalize and transform it into a clean, consistent standard format, and then it sells that dataset on to other companies. Pretty straightforward.

But what if we applied a design thinking rubric to the process instead?

The principles of design thinking were first described by Nobel Prize laureate Herbert Simon in The Sciences of the Artificial in 1969. There are generally thought to be five stages of design thinking:

  1. Empathize
  2. Define
  3. Ideate
  4. Prototype
  5. Test

Empathy is the most important principle underpinning design thinking, because it requires you to put yourself in the mind of your user. As a data provider, what does this mean? Think about who you are selling your dataset to. Start with the company. Is it a corporate user? A hedge fund user? A venture capital user? Next, consider the actual person at that company. Are they a business person or a data scientist? Maybe both are involved in the decision to buy? Or is your dataset intended to be consumed directly by a system, such as a BI tool? Your data product may need to be quite different in order to meet the needs of these various consumers and their use cases.

For example, let’s say you have a data product you are targeting at hedge funds. What types of funds are you targeting? Some funds are quantitative, meaning that they make their trading decisions based on mathematical and statistical analysis. Some funds are fundamental, meaning that they make their trading decisions based on research, by studying trends in a sector or looking at company-specific events. The types of data that these funds buy, and from whom, may vary widely. Additionally, quant funds want data that they can feed into their statistical models. They prefer data to be as machine-readable and easy to manipulate as possible, so they typically want it in as raw a form as possible. Fundamental funds, on the other hand, may want insights and analytics to be pulled together into a report that can be readily consumed by an investment professional.
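To make the contrast concrete, here is a minimal sketch of the same small dataset delivered two ways: as raw, machine-readable CSV for a quant fund, and as a short plain-language summary for a fundamental fund. The tickers, field names, and the helper functions `as_raw_csv` and `as_summary` are all hypothetical and chosen only for illustration.

```python
import csv
import io
from statistics import mean

# Hypothetical daily observations; tickers and fields are made up.
observations = [
    {"date": "2024-01-02", "ticker": "AAA", "web_visits": 1000},
    {"date": "2024-01-03", "ticker": "AAA", "web_visits": 1100},
    {"date": "2024-01-02", "ticker": "BBB", "web_visits": 400},
    {"date": "2024-01-03", "ticker": "BBB", "web_visits": 300},
]


def as_raw_csv(rows):
    """Quant-style delivery: every row, untouched, in a machine-readable format."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def as_summary(rows):
    """Fundamental-style delivery: one readable line of insight per ticker."""
    by_ticker = {}
    for row in rows:
        by_ticker.setdefault(row["ticker"], []).append(row["web_visits"])
    return [
        f"{ticker}: average {mean(visits):.0f} visits/day over {len(visits)} days"
        for ticker, visits in sorted(by_ticker.items())
    ]


raw_feed = as_raw_csv(observations)
report_lines = as_summary(observations)
```

The underlying data is identical in both cases; only the packaging changes to fit the person (or system) on the receiving end.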

The definition phase involves defining your users’ needs and their problems. What are your user’s problems? Maybe your user is a retail brand that doesn’t have good data on the effectiveness of its marketing campaigns. Your data is going to be used by marketing professionals so that they can create more targeted and effective campaigns going forward. What are those users’ biggest problems with the data they already have? Is it not particularly usable because it is siloed in too many different systems, with no common way of linking the data together? Is it just stale and hasn’t been updated in a while? Is it incomplete in some way?
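Two of these questions, staleness and incompleteness, can be measured directly. The following is a rough sketch under our own assumptions: the campaign records, the field names, the 90-day threshold, and the helpers `completeness` and `stale_rows` are all invented for the example.

```python
from datetime import date

# Hypothetical rows from a marketing-campaign dataset; field names are illustrative.
rows = [
    {"campaign_id": "c1", "channel": "email", "spend": 1200.0, "updated": date(2024, 1, 5)},
    {"campaign_id": "c2", "channel": None, "spend": 800.0, "updated": date(2023, 6, 1)},
    {"campaign_id": "c3", "channel": "social", "spend": None, "updated": date(2024, 1, 20)},
]


def completeness(rows):
    """Return the fraction of non-missing values for each field."""
    fields = rows[0].keys()
    return {f: sum(r[f] is not None for r in rows) / len(rows) for f in fields}


def stale_rows(rows, as_of, max_age_days):
    """Return the IDs of rows last updated more than max_age_days before as_of."""
    return [
        r["campaign_id"]
        for r in rows
        if (as_of - r["updated"]).days > max_age_days
    ]


report = completeness(rows)
stale = stale_rows(rows, as_of=date(2024, 2, 1), max_age_days=90)
```

Here `report` shows which fields have gaps and `stale` lists the records that have not been refreshed recently. Putting numbers on these problems makes it easier to decide which one to tackle first.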

There is a concept in design thinking called wicked problems, which are seemingly intractable or especially tricky to solve. A wicked problem may involve incomplete, contradictory, changing or overlapping requirements. It may involve interdependencies that illuminate that one problem is actually just a symptom of a larger problem. What are the wicked problems in data?

We would propose that the fluid nature of data (that is, the fact that datasets change over time) is one of the wicked problems that both data buyers and sellers must reckon with in order to extract the most value out of the data they exchange. From the very outset of a sales discussion the data buyer must get to the heart of not only what this data is right now, but what it will be tomorrow. One month from now? One year from now? Will it become more valuable over time, or less? How often does it change? Why does it change?

Design thinking asks us to reframe wicked problems and to challenge assumptions, which leads us to the next phase: ideation. One of the reasons we started Syndetic was to challenge the assumption that data sales must involve an exchange of static documents which are meant to in some way “represent” the dataset. To us, there is no way to represent a dynamic product with a static document. That’s why we came up with this concept of a “live” data dictionary that is always hooked into the underlying dataset. But we went further than that. Why not eliminate the exchange entirely, and let the buyer start exploring the dataset itself right away? That way they can see how it is today, and how it is tomorrow. They can buy just a point-in-time sample, or they can subscribe to regular updates.
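As a toy illustration of the general idea (not a description of how Syndetic implements it), a "live" data dictionary can be thought of as a summary that is recomputed from the dataset whenever it is requested, rather than written once and filed away. The function `data_dictionary` and the sample records below are hypothetical.

```python
def data_dictionary(rows):
    """Summarize each field from the rows as they are right now."""
    if not rows:
        return {}
    summary = {}
    for field in rows[0].keys():
        values = [r[field] for r in rows if r.get(field) is not None]
        summary[field] = {
            "types": sorted({type(v).__name__ for v in values}),
            "non_null": len(values),
            "distinct": len(set(values)),
            "example": values[0] if values else None,
        }
    return summary


# Hypothetical dataset that changes between two requests.
dataset = [
    {"store": "A", "visits": 10},
    {"store": "B", "visits": None},
]
before = data_dictionary(dataset)

dataset.append({"store": "A", "visits": 12})
after = data_dictionary(dataset)
```

Because the summary is derived on demand, `after` reflects the newly added row without anyone editing a document by hand, which is exactly what a static PDF or spreadsheet cannot do.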

By this point you can start with the prototype phase, which is when your dataset becomes a product rather than just some data lying around in a database somewhere. You’ve identified who your target user is, how they are going to use the data, and what problems they have. But now you have to get it in the hands of users and see if they actually want to buy it. This is where a data storefront really shines, because in the old days of [phone rings…] “Hey, I saw that you guys sell ESG data, and I run an ESG fund, so what’s your data about?” it could take many weeks to months to even get your prototype in the hands of your user. They might ask for a sample of your data product, but how do you know what kind of sample to provide? Do they want 100 random rows of data? Do they want the entire dataset? Do they really just need the header names? Maybe they are only interested in a specific slice of your data – for example, if you sell foot traffic data, a buyer may only be interested in foot traffic to retail stores, and may not care about restaurants. An online data storefront helps you get right to the point because the buyer can filter down your dataset right from the beginning to “retail stores” and pull a sample directly from the site.
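The sample options described above (header names only, a handful of random rows, or a filtered slice) are simple to express in code. This is a minimal sketch over made-up foot traffic records; the venue names, the `category` field, and the helper functions are assumptions for illustration, not any particular storefront's API.

```python
import random

# Hypothetical foot traffic records.
foot_traffic = [
    {"venue": "Shop A", "category": "retail", "visits": 120},
    {"venue": "Cafe B", "category": "restaurant", "visits": 80},
    {"venue": "Shop C", "category": "retail", "visits": 45},
    {"venue": "Diner D", "category": "restaurant", "visits": 60},
    {"venue": "Shop E", "category": "retail", "visits": 200},
]


def header_names(rows):
    """Return just the column names, for buyers who only need the schema."""
    return list(rows[0].keys()) if rows else []


def random_sample(rows, n, seed=0):
    """Return up to n randomly chosen rows; the seed keeps the sample reproducible."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def filtered_slice(rows, **criteria):
    """Return only the rows whose fields match every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


schema = header_names(foot_traffic)
sample = random_sample(foot_traffic, 100)
retail_only = filtered_slice(foot_traffic, category="retail")
```

A buyer who only cares about retail stores gets `retail_only` immediately, without a round of emails to work out what kind of sample they actually wanted.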

As you drive more traffic to your storefront, you can start the final phase of the design thinking process: testing. You’ll learn how your customers want to slice and dice your dataset. You’ll learn what might be missing or incomplete. Perhaps most importantly, you’ll learn how to more accurately price your data, because you’ll have a better sense of what your data is worth. All of this happens through an iterative process in which you systematically improve your product, offer it to more customers, and improve it again. Running this type of testing, data collection, and iteration through a disjointed, phone- or email-based sales motion is almost impossible. With each lead that comes to your site, you will gain more information about your user, their wicked problems, and how your data might help them. And your dataset will become ever more elegantly designed.
