AI Depends On Your Precious Data: Don't Lose Control Over It

Maintaining control over your data isn't a limitation; it’s what ensures your long-term success.

Michel Tricot is Cofounder and CEO of Airbyte.

We've seen this repeated over and over: AI models are only as good as the data they’re trained on. The secret sauce is training your model on your proprietary data while ensuring that your data remains in your control.

“Your” data is referred to as “first-party data,” and it's the basis for “intelligence” in AI. In the race to AI, you want to feed models as much data as possible, pulling from every available source to optimize performance. But here’s the hard truth: You shouldn’t trade your first-party data for intelligence.



What I'm about to say won’t necessarily make me popular, but if you don’t control how your data is accessed and used, you could undermine the very thing that sets your AI apart. First, you must secure your data through rigorous access control and permissions. If you’re not prioritizing who can see and use your data, you risk losing control over your biggest and most valuable asset, weakening your competitive edge.

Second, you must safeguard content containing personally identifiable information (PII), sensitive business or financial data and other “private” information. When making data available to an AI model, think about who will have access to that data, or you risk exposing sensitive information and falling out of compliance, either with your own organizational rules or with government regulations, which can bring substantial fines. This issue is compounded when you consider how many times datasets will be used to feed different AI models.

Every time a dataset is used to train or feed a model, its access controls and permissions must be re-evaluated, even if the same dataset was reviewed at an earlier date, because access rights, user roles or regulatory requirements may have changed in the interim.

Well-established permissioning structures have long regulated access to data. That same rigor must be applied to the data being fed into AI models. AI must be built with access control by design, filtering what data the model can use.
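To make "access control by design" concrete, here is a minimal sketch in Python of filtering records before they reach a model. The `Record` type, the `allowed_roles` field and the `filter_for_model` function are hypothetical names invented for illustration; a real system would read permissions from its existing identity and access management layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A piece of first-party data plus the roles allowed to use it (hypothetical schema)."""
    record_id: str
    text: str
    allowed_roles: frozenset[str]


def filter_for_model(records, caller_roles):
    """Return only the records that at least one of the caller's roles may expose to a model."""
    roles = set(caller_roles)
    return [r for r in records if r.allowed_roles & roles]


records = [
    Record("r1", "Public product FAQ", frozenset({"support", "sales"})),
    Record("r2", "Quarterly revenue forecast", frozenset({"finance"})),
]

# Roles are passed in at call time, so the check reflects current permissions.
visible = filter_for_model(records, caller_roles={"support"})
print([r.record_id for r in visible])  # prints ['r1']
```

Because the caller's roles are supplied on every call rather than cached, the same dataset is re-checked each time it is pulled, which is one simple way to honor permissions that change over time.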

I know that the fear of being left behind makes it tempting to bypass the rigor that it takes to ensure data sovereignty. Worse yet, ready access to ChatGPT and DeepSeek makes it easier than ever for anyone in your business to input sensitive data into external AI systems, often without considering the risks. But the bottom line here is that in the race to AI, data sovereignty can't be forgotten, as first-party data is your most valuable business asset and your biggest differentiator.

So, what should you do?

1. Make access control a cornerstone of your AI strategy. Permissions should be built into your AI initiatives from the start, not treated as an afterthought. Before feeding any data into a model, ensure proper access controls are in place.

2. Continuously manage permissions. Every time data is pulled for AI use, permissions should be reviewed and updated to reflect changes in users, access rights and compliance requirements.

3. Pre-process and redact sensitive data. PII and other confidential data often hide within unstructured datasets. Before feeding data into an AI model, apply automated redaction and filtering to protect privacy.

4. Think long term. Managing access and permissions isn’t just about security; it’s about ensuring a scalable, sustainable AI strategy. Get it right from the start, or risk creating unnecessary complexity and operational challenges that will be difficult to resolve later.
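The redaction step in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete solution: the two patterns (email addresses and U.S. Social Security numbers in `NNN-NN-NNNN` form) and the `redact` function are hypothetical choices for the example, and real PII detection needs far broader coverage than simple regular expressions.

```python
import re

# Illustrative patterns only (assumed for this example); production
# redaction should cover many more PII types and formats.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# prints: Contact [EMAIL], SSN [SSN].
```

Running redaction before data reaches the model, rather than after, means sensitive values never enter training sets or prompts in the first place.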

In summary, don’t risk losing control of your data in the race to AI. Your first-party data is one of your most valuable assets—protect it. Maintaining control over your data isn't a limitation; it’s what ensures your long-term success.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.