The most challenging aspect of implementing gen AI for businesses is ensuring that the data is prepared.

  • The obstacle keeping tech leaders from implementing AI is not creating a model, but having data that is ready for it.
  • A global study of over 1,300 tech and data executives found that only 18% of companies are fully prepared for AI deployment, with their data being fully accessible and unified.
  • To safeguard private data, businesses must perform intricate data labeling and categorization.

As IT executives such as chief information officers know, deploying AI takes more than a solid use case.

According to Prukalpa Sankar, co-founder of data catalog and governance software company Atlan, what keeps tech leaders from deploying AI is not generating a model and rolling it out, but having data that is ready for AI. "Everybody's ready for AI except your data," Sankar said.

A global study of over 1,300 tech and data executives found that only 18% of companies are fully prepared for AI deployment, with another 40% considering themselves mostly ready but not quite there.

To achieve readiness, Sankar stated that companies must first overcome the challenge of consolidating and organizing their data, which is primarily the responsibility of data engineers. The goal is to unite data that was previously isolated in various business units in order to deploy it for a specific purpose.

To keep private data private, businesses must complete complex data tagging and classification. As Sankar stated, "The data that goes behind the question can be changed depending on who is asking it." For instance, a human resources chatbot can draw on payroll data, while a general-purpose chatbot cannot.
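The idea that the data behind an answer depends on who is asking can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the tag names, the chatbot names, and the `visible_records` helper are assumptions made for illustration, not Atlan's product or any company's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    tags: frozenset  # classification labels attached during data tagging


# Hypothetical policy: the set of tags each chatbot is allowed to read.
ALLOWED_TAGS = {
    "hr_chatbot": frozenset({"public", "payroll"}),
    "general_chatbot": frozenset({"public"}),
}


def visible_records(records, chatbot):
    """Return only the records whose tags are all permitted for this chatbot.

    Unknown chatbots and untagged records are denied by default.
    """
    allowed = ALLOWED_TAGS.get(chatbot, frozenset())
    return [r for r in records if r.tags and r.tags <= allowed]


# Illustrative data only.
RECORDS = [
    Record("Office hours are 9 to 5.", frozenset({"public"})),
    Record("March payroll run totals.", frozenset({"payroll"})),
    Record("Unclassified draft memo.", frozenset()),
]

hr_view = visible_records(RECORDS, "hr_chatbot")
general_view = visible_records(RECORDS, "general_chatbot")
```

In this sketch the HR chatbot sees both the public and the payroll record, the general chatbot sees only the public one, and the untagged memo is hidden from both. That last case is one reason tagging has to be complete before deployment.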

With AI, data governance isn't so cut and dry

Data governance, which involves managing data assets through policies, processes, and standards, encompasses all of this. According to Matt Carroll, CEO and co-founder of data security platform Immuta, data governance is not a new concept, but AI has transformed the way it is executed.

"As Carroll stated, "Traditional business intelligence, which we've been practicing for 30 years, had a well-structured and efficient governance process. However, when introducing AI, a different approach is necessary.""

To ensure the accuracy of AI models, businesses must continuously acquire new data from both internal and external sources.

According to Carroll, the readiness of AI depends on three factors: data retrieval, utilization, and observation of usage.

While a mature data governance pipeline is not widespread across industries, a 2024 AI readiness report from MIT revealed that data governance, trust, and security are given more attention in government and financial institutions compared to other industries. Carroll emphasized that this practice should not be limited to these industries, as all businesses that use generative or other AI solutions must balance IT, legal, and organizational executives' perspectives, as well as those of the departments they serve.

To ensure ongoing data readiness after deploying AI, Carroll suggests that companies establish an AI hotline. This can be a full-on hotline in larger companies or a managed Slack channel in smaller ones. The key is to have domain experts directly communicate with the engineering team about issues such as hallucinations or incorrect data tagging.

"Carroll suggested that a model review board could reevaluate or flag the feedback loop for retraining and revalidation, which is not a negative thing, but rather part of the game."

Beyond testing models continuously, companies should watch for unusual behavior to confirm the models still meet quality standards.

Companies get creative in getting ready for AI

Since companies began their AI deployment journeys, Sankar has observed them using AI readiness scores to quantify their progress in preparing data. These scores are typically measured on a scale of 1 to 5 and take various factors into account. According to Sankar, "Without measurement, nothing progresses."

Experts in the field of data management are noticing a trend where employees are being given the additional title of "data steward" alongside their primary role. This is particularly relevant for those who have expertise in a specific domain but are now responsible for managing a data set that may be used for AI. Additionally, highly specialized data governors, such as data governance executives or data management engineers, are becoming increasingly important and may become more prevalent in the future.

Sankar compared the data infrastructure ecosystem to a marketplace, stating that on one side are business-ready AI use cases and on the other side is complex data infrastructure.

Before pursuing AI readiness, organizations must first make the ethical decision of whether to expose certain types of data to their systems at all. This is a crucial step in determining data readiness, and it raises a question that is unpopular in the C-suite: "Should you do it at all?"

by Rachel Curry
