What Does Datadog’s BYOC Pivot Mean for AI Governance?

Chloe Maraina is a powerhouse in the world of big data and business intelligence, possessing an uncanny ability to transform massive, chaotic datasets into clear, actionable narratives. Her career has been defined by navigating the sprawling complexities of multi-cloud environments, helping organizations transition from rigid, siloed SaaS models to flexible, federated architectures. As companies grapple with the explosive growth of telemetry in the age of agentic AI, Maraina provides a strategic vision for integrating observability with cost-effective data management. She currently focuses on the intersection of data science and cloud sovereignty, ensuring that as systems grow more intelligent, they remain both transparent and manageable for the modern enterprise.

In this discussion, we explore the seismic shifts occurring as major observability players move toward Bring-Your-Own-Cloud architectures and federated data access. We delve into the practicalities of managing petabyte-scale telemetry, the persistent risks of vendor lock-in despite the rise of open-source standards, and the emerging challenges of AI tokenomics. The conversation also highlights the evolution of security tools that now leverage telemetry to automatically identify “crown jewel” assets, moving beyond the era of manual tagging to provide real-time, context-aware protection in a world of skyrocketing data volumes.

With the recent shift toward Bring-Your-Own-Cloud models, why are we seeing such a massive departure from the traditional SaaS-only approach in the observability space?

The shift is really a response to the sheer gravity of data we are seeing today, especially as teams invest heavily in AI. We are no longer talking about gigabytes; many organizations are managing telemetry volumes at a staggering petabyte scale. When you are operating at that level across multiple geographies, the old model of shipping every single log or trace to a vendor’s cloud becomes a logistical and financial nightmare. Digital sovereignty is also a massive driver here, as many regions now have strict compliance requirements that practically demand data be stored on customer-controlled infrastructure. By supporting BYOC, platforms are finally acknowledging that the customer needs to maintain physical and legal control over their data while still benefiting from high-level analysis tools for their metrics and traces.

Federated search is gaining a lot of traction as a way to query data across different platforms like Snowflake or ClickHouse. How does this actually change the day-to-day operations for a team trying to keep costs under control?

Federated search is a game-changer because it allows a team to use a tool like Log Explorer to query data where it already lives, whether that’s in Databricks or an external S3 bucket. From an operational standpoint, this should significantly reduce the triple threat of ingest, storage, and compute costs that typically haunt large-scale deployments. You are essentially eliminating the “observability tax” that comes with moving data just to look at it. It allows engineers to keep their data fidelity intact without the massive overhead of centralizing everything into one proprietary silo. However, the real magic is in maintaining that operational context across disparate systems, so a developer doesn’t feel like they are jumping through hoops just to see a log entry that happens to be sitting in a different cloud.

Analysts often warn about the “degrees of lock-in” that still exist even when vendors embrace open standards like OpenTelemetry. What should an enterprise leader be looking for to ensure they aren’t accidentally painting themselves into a corner?

It is a subtle trap that many fall into because “open source” has become such a powerful marketing term. You might see a vendor offer their own distribution of OpenTelemetry, but the catch is that if you use the standard upstream collector, you only get the very basic functionality. To unlock the “good stuff”—things like advanced database monitoring, data observability, or intricate cloud network mapping—you are often forced to use the vendor’s proprietary agent. This makes it incredibly difficult to extract your data or switch providers later because your entire instrumentation layer is tied to those unique features. It’s a trade-off; you get a polished, fully integrated platform that works beautifully out of the box, but you have to accept that your exit strategy just became a lot more complicated and expensive.

The concept of “AI tokenomics” is becoming a major headache for IT managers. How are organizations supposed to govern the behavior and costs of these autonomous agents when the pricing models seem so volatile?

Managing AI costs right now feels like trying to herd a hundred cats; it is messy and full of surprises. We are seeing the introduction of specialized “Agent Consoles” that attempt to monitor usage and token costs for agents from providers like OpenAI and Anthropic, but the complexity is daunting. Each product, whether it’s APM, traces, or Real User Monitoring, is metered differently, and the bill is highly dependent on the workload. One developer might turn on high-cardinality tags or intensive instrumentation in the pipeline, and suddenly you are looking at a cost surprise that hits your budget overnight. To combat this, some are using “small language models” to handle the bulk of the detection traffic efficiently, only calling out to a high-cost frontier LLM for the most sophisticated scenarios to keep the whole operation cost-sustainable.
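As a rough illustration of the small-model-first strategy described above, the sketch below routes each event to a cheap model and escalates to an expensive one only when the cheap model is unsure. Everything here is an assumption made for illustration: the names `make_cascade`, `toy_small_model`, and `toy_frontier_model`, the confidence threshold, and the per-call costs are hypothetical and do not correspond to any vendor's product or pricing.

```python
from dataclasses import dataclass


@dataclass
class CascadeStats:
    """Running totals so the cost of the cascade can be inspected."""
    small_calls: int = 0
    frontier_calls: int = 0
    cost: float = 0.0


def make_cascade(small_model, frontier_model, threshold, small_cost, frontier_cost):
    """Build a classifier that tries the small model first.

    small_model(event) must return (label, confidence in [0, 1]).
    frontier_model(event) must return a label.
    Costs are in arbitrary, hypothetical units per call.
    """
    stats = CascadeStats()

    def classify(event):
        label, confidence = small_model(event)
        stats.small_calls += 1
        stats.cost += small_cost
        if confidence >= threshold:
            return label  # cheap path: the small model is confident enough
        # Expensive path: escalate only the uncertain events.
        stats.frontier_calls += 1
        stats.cost += frontier_cost
        return frontier_model(event)

    return classify, stats


# Toy stand-ins for real models (hypothetical keyword heuristics).
def toy_small_model(event):
    if "timeout" in event:
        return "anomaly", 0.95
    if "ok" in event:
        return "normal", 0.90
    return "unknown", 0.30


def toy_frontier_model(event):
    return "needs_review"


classify, stats = make_cascade(
    toy_small_model, toy_frontier_model,
    threshold=0.8, small_cost=1.0, frontier_cost=50.0,
)
results = [classify(e) for e in ["request ok", "db timeout", "weird payload"]]
```

In this toy run only the third event falls below the threshold, so the expensive model is called once. In practice the threshold would need tuning against real traffic, since it trades detection quality against spend.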

Security is also evolving, with new tools designed to automatically identify “crown jewel” systems. Why is this automated approach so much more effective than the traditional method of using manual tags?

Manual tagging has been a well-documented failure mode in large-scale enterprise environments for years. Tags go stale the moment a project shifts, ownership goes unclaimed as people leave the company, and security teams end up wasting precious hours chasing vulnerabilities in systems that don’t actually matter to the business. The move toward a Runtime Prioritization Engine changes the math because it uses actual telemetry to detect which systems are sensitive and important in real time. By automatically classifying these “crown jewels” based on how data flows through the network, you eliminate the human error inherent in manual labeling. It allows a security analyst to focus on remediation that actually impacts business risk rather than just checking boxes on an outdated spreadsheet.
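To make the idea of classifying systems from observed data flows a little more concrete, here is a deliberately simplified sketch. It is not how any particular Runtime Prioritization Engine works; the flow record fields (`src`, `dst`, `bytes`, `sensitive`), the service names, and the scoring rule (sensitive bytes received multiplied by the number of distinct callers) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def rank_crown_jewels(flows, top_n=3):
    """Rank destination services by a hypothetical importance score.

    Each flow is a dict with keys: src, dst, bytes, sensitive.
    Score = sensitive bytes received * number of distinct callers.
    """
    sensitive_bytes = defaultdict(int)
    callers = defaultdict(set)
    for flow in flows:
        callers[flow["dst"]].add(flow["src"])
        if flow["sensitive"]:
            sensitive_bytes[flow["dst"]] += flow["bytes"]

    scores = {svc: sensitive_bytes[svc] * len(callers[svc]) for svc in callers}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda svc: (-scores[svc], svc))[:top_n]


# Hypothetical flow records, as might be derived from network telemetry.
flows = [
    {"src": "checkout", "dst": "payments-db", "bytes": 500, "sensitive": True},
    {"src": "billing", "dst": "payments-db", "bytes": 300, "sensitive": True},
    {"src": "web", "dst": "image-cache", "bytes": 9000, "sensitive": False},
    {"src": "web", "dst": "checkout", "bytes": 200, "sensitive": True},
]
ranked = rank_crown_jewels(flows, top_n=2)
```

The point of the example is that the ranking comes from observed traffic rather than from labels someone had to remember to maintain: the high-volume image cache scores zero because nothing sensitive flows into it.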

What is your forecast for the future of observability costs as AI-based features become the standard?

I believe we are entering a period of significant price uncertainty where the “yet-to-be-determined cost” of AI features will force a major reckoning for IT budgets. While vendors are racing to announce a dizzying array of AI-native tools for remediation and prioritization, the underlying billing remains incredibly opaque. We will likely see a move toward more transparent, outcome-based pricing, but in the short term, organizations are going to struggle with the “observability tax” as they try to balance the need for deep insights with the skyrocketing costs of processing petabytes of telemetry. The winners will be the firms that can successfully use those smaller, more efficient models to filter the noise before the big, expensive AI engines even get involved.
