
The Data Quality Renaissance: Why Your AI is Only as Good as Your DBA

Craig Mullins explains why data quality is the ultimate competitive advantage for AI

TechChannel Data Management

For decades, the database community has marched to the beat of a familiar drum: “Garbage in, garbage out.” It’s a phrase so weathered and overused that it has almost lost its meaning. We nodded along during the early days of relational database adoption, we whispered it during the migration to the cloud, and we preached it during the Big Data era. But in 2026, as we sit squarely in the age of Agentic AI and autonomous enterprise workflows, “Garbage In, Garbage Out” is no longer just a warning; it is a potentially catastrophic business risk.

The industry is currently experiencing an age of AI reckoning. Organizations that spent the last three years pouring millions into large language models (LLMs) and generative AI are finding that their shiny new engines are stalling. The reason? The fuel is contaminated. We have reached the point where the most sophisticated AI architecture in the world cannot compensate for a poorly modeled, inconsistently defined or inaccurately maintained database.

In short: Data quality is the only competitive advantage left.

The Commodity of Intelligence

To understand why data quality has suddenly become the star of the show, we have to look at the democratization of AI. In 2026, high-performance AI models are essentially a commodity. Whether you are using open-source models or proprietary enterprise APIs, the intelligence is accessible to everyone, including your competitors.

If two rival banks are using the same underlying AI model to determine creditworthiness, the winner isn’t the one with the faster processor. The winner is the one whose data is cleaner, richer in history and more accurately labeled. When the algorithms are a level playing field, the data becomes the differentiator.

The ‘Hallucination’ of Bad Data

One significant issue with AI is its tendency to hallucinate. This is where AI confidently asserts something that isn’t true. While some hallucinations are inherent to the probabilistic nature of LLMs, a significant portion of enterprise AI failures are likely to be data hallucinations.

If your AI agent is tasked with summarizing customer health, but your Db2 tables contain three different “John Smiths” with overlapping addresses and conflicting transaction histories, the AI will do what it was trained to do: It will bridge the gaps. It will invent a narrative to connect the dots. In a traditional reporting environment, a human analyst might spot the duplicate and flag it. In an autonomous agentic workflow, the AI executes a decision based on that hallucination, perhaps triggering an incorrect credit limit or a faulty insurance claim.
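The duplicate-record problem described above can often be caught before an AI agent ever sees the data. Here is a minimal sketch in Python of a blocking-key duplicate scan; the field names and the matching rule are illustrative assumptions, not a real Db2 schema:

```python
from collections import defaultdict

def normalize(record):
    """Build a blocking key from name and address, ignoring case and spacing."""
    name = "".join(record["name"].lower().split())
    addr = "".join(record["address"].lower().split())
    return (name, addr)

def find_suspect_duplicates(records):
    """Group records whose normalized name and address collide.

    Returns only groups with more than one member -- candidates a human
    (or a stricter rule) should review before the data feeds an AI agent.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec)].append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}

customers = [
    {"id": 1, "name": "John Smith", "address": "12 Oak St"},
    {"id": 2, "name": "john  smith", "address": "12 OAK ST"},
    {"id": 3, "name": "Jane Doe", "address": "9 Elm Ave"},
]

dupes = find_suspect_duplicates(customers)
for key, recs in dupes.items():
    print(key, [r["id"] for r in recs])
```

A real deduplication pipeline would use fuzzier matching (edit distance, phonetic keys) and route collisions to a steward for review rather than auto-merging, but even a crude pass like this surfaces the conflicting “John Smiths” before an agent invents a narrative around them.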

Your database schema may also be causing problems for AI. Poorly named columns (e.g., COL1 or MISCV2) make it harder for an LLM to understand and reliably use your data, which degrades answer quality, increases hallucinations and causes brittle downstream pipelines built on that data.

The cost of a mistake in 2026 is higher than it was in 2020 because the speed of execution is faster. We are no longer just printing wrong numbers on a PDF; we are automating the wrong actions in real time.

The Rise of Data Contracts

One of the most significant shifts we are seeing in 2026 is the move away from passive data governance toward data contracts.

For years, DBAs and data stewards have played a game of catch-up, trying to clean data after it has already landed in the production tables. Data contracts flip this script, treating data quality as a precondition rather than an afterthought. A data contract is a formal agreement between a data provider (like an application service) and a data consumer (like an AI model or an analytics engine). It defines the schema, the semantics, the quality SLAs and the expectations for every piece of data exchanged.

If the incoming data doesn’t meet the contract—if a “Date” field is missing or a “Currency” code is invalid—the data is rejected at the gate. You can think of this as a return to the principles of database integrity enforcement (e.g., strong typing and referential integrity) but applied at the architectural level of the entire enterprise.
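In code, a data contract can start as a declarative schema checked at ingestion time, with non-conforming records rejected at the gate exactly as described above. A hedged sketch in Python; the field names, types and rejection behavior are assumptions for illustration, not a standard:

```python
from datetime import date

# Illustrative contract: required fields, expected types and value constraints.
VALID_CURRENCIES = {"USD", "EUR", "GBP", "JPY"}

CONTRACT = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "txn_date":    lambda v: isinstance(v, date),
    "currency":    lambda v: v in VALID_CURRENCIES,
    "amount":      lambda v: isinstance(v, (int, float)),
}

def validate(record):
    """Return a list of contract violations; an empty list means the record passes."""
    errors = []
    for field, check in CONTRACT.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not check(record[field]):
            errors.append(f"invalid value for {field}: {record[field]!r}")
    return errors

def ingest(record, table):
    """Reject at the gate: only contract-clean records land in the table."""
    errors = validate(record)
    if errors:
        raise ValueError("; ".join(errors))
    table.append(record)

table = []
good = {"customer_id": 7, "txn_date": date(2026, 1, 5), "currency": "USD", "amount": 19.95}
bad  = {"customer_id": 8, "currency": "XYZ", "amount": 5.0}  # missing txn_date, bad currency

ingest(good, table)
try:
    ingest(bad, table)
except ValueError as e:
    print("rejected:", e)
```

Production data contracts layer quality SLAs, versioning and ownership on top of this, but the core mechanic is the same: validation happens where the data enters, not after it has polluted the tables.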

The Mainframe as the ‘Single Source of Truth’

This brings us to the unique position of the IBM Z platform and Db2 for z/OS. In the rush to the cloud, many organizations created data silos where fragments of customer data were scattered across a dozen different SaaS platforms.

In 2026, the trend of data repatriation is being driven, at least partially, by the need for quality. Organizations are realizing that their most reliable, most audited and most truthful data still resides on the mainframe. By keeping AI processing close to the data, using technologies like the Telum processor’s integrated AI accelerator, organizations can eliminate the data drift that happens when information is extracted, transformed and loaded (ETL) into secondary systems.

Data movement and migration can be complex and prone to creating problems. When you move data, you lose context. When you lose context, you lose quality.

The DBA as the Chief Data Auditor

So, where does this leave the DBA? For a long time, there was a fear that automation and self-tuning databases would make the DBA obsolete. However, the opposite has happened.

The DBA of 2026 is shifting from a system maintainer to a data architect and auditor. While the system might handle its own backups and reorgs, only a human with deep domain knowledge can determine if a data model accurately reflects the business’s reality. This entails tackling many different issues, such as:

  • Semantic consistency – Ensuring that “Revenue” means the same thing in the mainframe COBOL app as it does in the Python-based AI agent.
  • Lineage tracking – Being able to prove to a regulator exactly where a piece of data came from before it influenced an AI decision.
  • Ethical guardrails – Identifying biases in the data (such as geographic or demographic gaps) that could lead an AI model to make discriminatory choices.

The Cost of Data Debt

We often talk about technical debt, which is the cost of choosing an easy solution now instead of a better one that takes longer. We need to start talking about data debt. Data debt is the accumulated cost of improper data handling, such as ignored duplicates, missing null-handling and inconsistent naming conventions. In the past, data debt was a nuisance. Today, it is essentially an AI tax. Every hour your data scientists spend wrangling or cleaning data is an hour not spent innovating. In some organizations, this tax can be as high as 80% of the total project cost.

The companies that will dominate moving forward are those that treat their databases not as storage bins, but as curated assets. They are the ones who realize that a smaller, gold-standard dataset is infinitely more valuable than a massive data swamp.

Conclusion: Back to Basics

As we look toward the future, the most modern thing a company can do isn’t to buy a new AI tool. It is to go back to the basics of rigorous data management.

We need to embrace:

  1. Strict schema enforcement: No more schema-on-read chaos.
  2. Automated quality checks: Real-time validation within the DBMS.
  3. A culture of stewardship: Recognizing that everyone who touches a keyboard is responsible for the integrity of the enterprise’s memory.
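The automated quality checks called for above can start small: a profiling pass over each batch before it loads, flagging null rates and key collisions. A sketch in Python; the thresholds and field names are illustrative assumptions:

```python
def quality_report(rows, required_fields, max_null_rate=0.05):
    """Compute simple batch-level quality metrics before the data loads.

    Flags any required field whose null rate exceeds the threshold, and
    any primary-key value that appears more than once.
    """
    n = len(rows)
    report = {"row_count": n, "violations": []}

    # Null-rate check on required fields.
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) is None)
        rate = nulls / n if n else 0.0
        if rate > max_null_rate:
            report["violations"].append(
                f"{field}: null rate {rate:.0%} exceeds {max_null_rate:.0%}"
            )

    # Uniqueness check on the key column.
    seen, dupes = set(), set()
    for r in rows:
        key = r.get("id")
        if key in seen:
            dupes.add(key)
        seen.add(key)
    if dupes:
        report["violations"].append(f"duplicate ids: {sorted(dupes)}")

    return report

batch = [
    {"id": 1, "revenue": 100.0},
    {"id": 2, "revenue": None},
    {"id": 2, "revenue": 250.0},
]
print(quality_report(batch, required_fields=["revenue"]))
```

In practice these rules belong as close to the DBMS as possible, e.g., as CHECK constraints, unique indexes and triggers in Db2 itself, so that the engine enforces quality rather than a downstream script. The sketch simply shows how little code is needed to stop a bad batch.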

AI is the engine of the modern enterprise, but data is the fuel. If we continue to focus on the engine while ignoring the quality of the fuel, we shouldn’t be surprised when our “intelligent” systems leave us stranded on the side of the road.

This “Data Quality Renaissance” isn’t just a trend; it’s a survival strategy. And for the DBAs who have been shouting this from the rooftops for 30 years? Your time may have finally come.
