AI and data: two lessons and more questions
Following up on my questions about AI and data. Some insightful reads, clarity on some things leading to more interesting questions...
A few weeks ago, I posed an open question about AI and data. Since then I have come across a number of articles that touch on this whole sphere of data, information and knowledge.
I still don’t have any answers but I think I now have some more interesting questions in my mind.
Start with two fundamentals:
GenAI is not designed for correct answers. For me, this means the best value will come from applying the technology in areas where there are no “right” answers.
LLMs make a huge quantity of data legible, accessible and usable that was previously incoherent. More and better data should lead to better answers to at least some questions.
Thought-provoking reads
I have noticed a number of people talking about this last part: how more data will lead to better answers and what value that might generate.
A very obvious example is this tweet from Jaya Gupta, which talks about decision traces and how these might be built into a context graph. Once you decode the new jargon, the meaning is clear:
“Once you have decision records, the “why” becomes first-class data. Over time, these records naturally form a context graph: the entities the business already cares about (accounts, renewals, tickets, incidents, policies, approvers, agent runs) connected by decision events (the moments that matter) and “why” links. Companies can now audit and debug autonomy and turn exceptions into precedent”
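As a rough illustration of what a decision record and a context graph might look like in code, here is a minimal Python sketch. The class and field names (DecisionRecord, ContextGraph, precedents, related_entities) are my own assumptions for illustration, not anything taken from the tweet, and the sample records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """One decision event: what was decided, about which entities, and why."""
    decision_id: str
    summary: str
    why: str                    # the rationale, kept as data rather than lost
    entities: tuple[str, ...]   # hypothetical ids, e.g. "account:acme"


class ContextGraph:
    """Entities connected through the decision events that touched them."""

    def __init__(self) -> None:
        self._by_entity: dict[str, list[DecisionRecord]] = defaultdict(list)

    def add(self, record: DecisionRecord) -> None:
        # Index the record under every entity it mentions.
        for entity in record.entities:
            self._by_entity[entity].append(record)

    def precedents(self, entity: str) -> list[DecisionRecord]:
        """Past decisions involving this entity, in the order they were added."""
        return list(self._by_entity.get(entity, []))

    def related_entities(self, entity: str) -> set[str]:
        """Entities linked to this one through at least one shared decision."""
        linked = {e for r in self._by_entity.get(entity, []) for e in r.entities}
        linked.discard(entity)
        return linked


# Invented sample data, purely for illustration.
graph = ContextGraph()
graph.add(DecisionRecord(
    "d1", "Approved a renewal discount", "Customer was hit by an outage",
    ("account:acme", "renewal:r-17", "incident:i-42"),
))
graph.add(DecisionRecord(
    "d2", "Escalated ticket to engineering", "Same root cause as the outage",
    ("account:acme", "ticket:t-9"),
))

for record in graph.precedents("account:acme"):
    print(f"{record.summary} (why: {record.why})")
```

The point of the sketch is only that the "why" sits next to the decision as an ordinary field, so it can be queried later as precedent. It says nothing about how such rationale would actually be captured, which is the hard part.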
Gupta quotes some specific examples. A more theoretical perspective came from Andrew Maxwell. He is talking less directly about the topic so can be excused for the lack of specifics. His focus is how AI changes entrepreneurship and he highlights one key area:
“When embodied in trained, purpose-built GPTs, AI can be used to systematically surface the fundamental questions that determine whether a technology will actually be adopted. It can force teams to confront who must change their behaviour, what incentives or risks they face, what workflows, regulations, or norms will resist that change, what evidence decision-makers would require before committing, and what must be true for a solution to scale beyond early enthusiasts.”
The two perspectives are different but strikingly similar: both authors believe AI can be used to look inside organisations and understand how they operate in new, more dynamic ways.
A more cautionary approach is visible in this post from Vaughn Tan. I will return to the substance of this post in another context. But for now just focus on this simple and crystal clear opening sentence:
“The most important knowledge in any organisation—what makes your work distinctively yours—can’t be written down or taught through training programmes. It’s tacit knowledge that must be learned through direct experience.”
Now we are starting to open up the problems with the simplistic view that more data solves previously intractable problems. My thinking on this was further clarified by a more left-field source: a brilliant piece of history writing from Michael Magoon on how people solved problems… before 1000 A.D.
Going right back to hunter-gatherers, he highlights two things:
“Learning occurred through imitation, storytelling, and hands-on practice, and competition between groups encouraged refinement of techniques.
Even so, problem-solving remained embedded within production itself. Technical knowledge was tacit, local, and tightly bound to specific environments.”
It is remarkable how similar his description of ancient learning is to the recommendations Vaughn Tan has for building a learning organisation:
“Organisations that successfully teach tacit knowledge embed learning into everyday work through three mechanisms: making work public and concrete, focusing feedback on outcomes rather than processes, and using concrete examples as anchors.”
Two lessons
You may be thinking this is going round in circles. Where is the clarity in all this?
The short answer is that it is far from complete. But this has clarified two general points in my mind:
There is never going to be a complete source of all data. Some of it lives in people’s heads. Beyond that, you would sometimes need to see inside multiple minds to get the complete picture.
No matter how much data you have, precedent is not a perfect guide to the future. I see a lot of commentary where people say “AI will be able to predict x.” No it will not. It may well be better and more reliable at some predictions than humans. But it will be just another tool, not a source of certainty.
More questions
Inevitably that opens up some more questions:
Closely linked to my original question: could data cleaning and fixing actually be a negative thing? Sorting everything into a nice neat order risks losing the messiness and richness of what really happened. Taking the “decision traces” example above, what would happen to those traces if the data had been cleaned?
Since AI will not provide the right answer, the real explosion will come when products find the sweet spot that combines “better than humans” and “trusted”. How can companies explore this surface?
Taking that a step further, from a business performance point of view those sweet spots will link efficiency and value. Cutting costs alone is never enough to generate true value. When and where will we see this approach in practice? I suspect there are examples out there already that meet this description.
In many applications, AI will only generate value when it has access to confidential business or personal data. This could be a minefield of permissions, access and friction. Which tools and approaches will simplify this and become ubiquitous?
If you would like to start a conversation about these questions, please get in touch.
Thanks for reading


