What’s your Unfair Data Advantage?
This is the year that companies are searching far and wide for the most fruitful opportunities for using generative AI.
Instead of copying your competitors’ pilots, it’s more useful to understand where generative AI can bring unique value.
One useful question to start that quest, which I’ve asked my clients in our Discover workshops, is: Where do you have an unfair advantage in data?
An Unfair Data Advantage comes from the intersection of three areas:
1. Real and significant customer needs
2. Unique data sources
3. Generative AI to democratize access to the data
1. Real and significant customer needs
This is an obvious but still often overlooked starting point. What are the biggest challenges your customers face? What are they struggling with frequently?
A customer need worth pursuing is an important, complex challenge in your customers’ lives that current solutions do not adequately address.
2. Unique data sources
What data reservoirs do you have that are hard to replicate? If you can access data streams from customers, experts, devices, or other sources that competitors cannot copy, you might be on to something.
Examples of unique data sources include:
a fleet of heavy machinery that produces data through sensors
millions of customer data points related to behavior like shopping
thousands of internal documents in your area of expertise
3. Generative AI to democratize access to the data
The next key is considering how generative AI can democratize access to this data.
How could gen AI make it easy to ask questions related to that data? How could your customers make better decisions with that data? Could your customers benefit from the data by just using their phone cameras?
Examples of Unfair Data Advantages
Here are some concrete examples of how these three pieces come together for the Unfair Data Advantage:
Legal AI startup Harvey took massive amounts of US case law examples to build their own custom Large Language Model to help attorneys with legal questions. The custom model easily outshines standard models like GPT-4 in complex legal questions.
Their Unfair Data Advantage came from training a custom model on a significant corpus of US-specific legal cases. This solves a real problem for lawyers: spending hours researching legal questions by plowing through endless case databases.
Education non-profit Zelma aims to democratize access to educational data. With Zelma, anyone can ask questions about the data, such as “ELA Scores Over Time in South Carolina by Gender.” In seconds, Zelma creates a simple data visualization to answer the question. This unprecedented access to data helps education leaders and parents make more informed decisions.
To achieve this, the team took all publicly available US school test data, cleaned it up, and built a custom solution for accessing it with GPT-4. Bringing these scattered data sources together and pairing them with GPT-4 is their Unfair Data Advantage.
Healthify helps anyone live a healthier life with personalized AI-powered nutritional coaching. Users snap photos of their food and get guidance on improving the nutritional value of their meals. Healthify’s Unfair Data Advantage is that they fine-tuned the standard GPT-4 model with their own unique knowledge of nutritional science.
Once you identify a potential area for your Unfair Data Advantage, it's time to create a pilot to test the assumption. Is it useful to your customers? Do you have the required data infrastructure in place? Are lighter AI development methods like retrieval-augmented generation (RAG) and prompt engineering sufficient, or should you fine-tune your own model?
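For readers who want to see what the lighter RAG route looks like in practice, here is a minimal Python sketch of its retrieval step. It is an illustration under stated assumptions, not a production design: the sample documents are made up, the function names (tokenize, cosine, retrieve, build_prompt) are hypothetical, and simple word-overlap scoring stands in for the embedding search a real pilot would more likely use. The sketch stops at building the prompt and does not call any model.

```python
import math
import re
from collections import Counter

# Made-up stand-ins for "unique data" (illustrative only, not real records).
DOCS = [
    "Excavator model X200 hydraulic pump should be inspected every 500 operating hours.",
    "Loyalty members who shop on weekends buy more fresh produce.",
    "Internal guideline: warranty claims must be filed within 30 days.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return up to k documents that share words with the question, best first."""
    query = Counter(tokenize(question))
    scored = [(cosine(query, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question, documents, k=2):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How often should the hydraulic pump be inspected?", DOCS, k=1))
```

The point of the sketch is that RAG leaves the model untouched: your data is looked up at question time and placed in the prompt, which is why it is usually cheaper to pilot than fine-tuning.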
Starting within your Unfair Data Advantage zone ensures you build something that can have lasting value for your customers and set your organization apart from your competitors.