Garbage In, Garbage Out? Trust In The Data Behind AI Is Vanishing


Trust in the data needed for data-driven decisions is falling — precipitously.

Data, data, everywhere, but...

Trust in the data behind artificial intelligence and other data-driven initiatives is going downhill, and fast. Fewer than half of business leaders say they have the data they need to pursue cutting-edge strategies, and that share has declined markedly over the past two years. That’s the word from a recent Salesforce survey of 552 business leaders.



For example, only 40% trust the reliability of their companies’ data – down from 54% in a similar survey conducted in 2023. Likewise, 41% say their data is relevant – down from 50% two years ago. And, stunningly, a mere 36% believe their data is accurate – down from 49% just two years ago.

While 63% of leaders say finding, analyzing, and interpreting data on their own is key to their jobs, 54% of them aren’t fully confident in their ability to do this.

What’s going on? Industry observers agree that data poses a problem for AI initiatives of all kinds. Executives understand “how subpar their data collection, cleansing, and curation process [is] that is fed into AI to create a decision engine,” said industry analyst Andy Thurai. “They know it is garbage in, garbage out. So, knowing they are feeding garbage in, would you feel comfortable using the decision?”

When it comes to automated processes, the stakes get even higher. “You better have data to back it up, and that’s a problem,” said Thurai. “Executives are afraid their data is not complete, not wholesome, not timely, stale, not reliable, or not accurate.”

Complicating the situation is the fact that many enterprises now use synthetic data to train their AI models when not enough data is available, or to maintain security. The problem is that “executives’ confidence that the models are trained on real-world data is not there,” Thurai added.

Part of the issue may also be the fact that applications designed for the 2020s are relying on technology designed in the 1990s. Enterprises “often struggle with a data infrastructure built over many years that utilizes many different technologies, and was built without a clear plan or direction,” said Dwaine Plauche, senior manager of product marketing for Aspen Technology. “This infrastructure involves myriad complex point-to-point data connections that do not meet stakeholders’ needs or 2025 cybersecurity requirements.”

What is required is a shift in thinking for the handling and positioning of enterprise data, Plauche advised. “Rethink the data infrastructure as an internal customer service. The goal should be providing internal customers or projects with needed data – a service-oriented mindset that is aligned with the organization’s strategy.”

Of course, AI itself can help manage and overcome issues with data. “An adage among veteran data scientists is that good modeling is 80% data preparation and only 20% modeling and analysis,” said Richard Sonnenblick, chief data scientist at Planview. “Generative AI has shown promise by working alongside data stores to create self-correcting pipelines that identify and filter errors in units, missing data, and redundant data, lowering the chances of leading a team to false conclusions. In addition, storage media that brings entity relationships front and center – such as graph databases – surface critical insights that generative AI models can highlight and explain to data consumers.”

Start with the business problem and work backwards, Sonnenblick advised. “AI for AI’s sake is unlikely to achieve useful results. Leaders should identify the desired outcome of a particular task that AI will support, such as decreased time to completion or a higher success rate.

“AI excels at identifying patterns in data that are too complex to be quickly recognized by humans, meaning it can provide powerful insights for reassigning tasks or reallocating resources to maximize margins or minimize downtime.”