As a leading expert in business intelligence and data science, Chloe Maraina has a unique vantage point on one of the most pressing issues facing IT leaders today: the staggering growth of cloud costs. With a passion for translating complex data into clear, actionable stories, she helps organizations navigate the turbulent financial waters of digital transformation, particularly in the age of AI. We sit down with her to explore why cloud spending has quietly become the second-largest expense for many companies, trailing only their payroll.
In our conversation, Chloe unpacks the operational chaos caused by volatile cloud bills and the specific challenges of forecasting unpredictable AI workloads, which now account for a significant portion of these costs. She offers a clear-eyed perspective on striking a delicate balance between empowering developers and maintaining financial discipline, arguing for a cultural shift where cost becomes a shared business responsibility. We’ll move beyond the buzzwords to discuss concrete strategies for aligning IT and finance, eliminating the wasteful “cloud sprawl” left behind by forgotten experiments, and what the future holds for SaaS pricing as vendors grapple with their own mounting AI-driven expenses.
The article highlights that for many midsize IT firms, cloud costs are now the second-largest expense after labor, with some spending over 13% of revenue. Beyond the budget, what are the biggest operational challenges this creates, and what specific metrics do you use to track the impact on profit margins?
It’s a visceral challenge that goes far beyond a line item on a spreadsheet. When nearly a third of companies are seeing over 13% of their revenue eaten up by the cloud, the real problem becomes volatility. The data shows that for three-quarters of CFOs, these costs fluctuate between 5% and 10% month over month. Imagine trying to plan hiring, marketing campaigns, or R&D investment when one of your biggest expenses is that unpredictable. It creates a constant sense of uncertainty that paralyzes strategic planning. To combat this, we move beyond just tracking total spend. We relentlessly monitor the cost of goods sold (COGS) at the individual product level, specifically tracking the gross margin against its direct cloud consumption. This tells us not just what we’re spending, but whether the value we’re delivering justifies the cost.
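The per-product tracking she describes can be sketched in a few lines. This is a minimal illustration, not a production system; the product names, figures, and the 60% margin floor are all invented for the example.

```python
# Per-product gross margin against direct cloud consumption (COGS).
# All names and numbers below are illustrative.

def gross_margin(revenue: float, cloud_cogs: float) -> float:
    """Gross margin after direct cloud consumption, as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cloud_cogs) / revenue

products = {
    "analytics": {"revenue": 120_000.0, "cloud_cogs": 18_000.0},
    "reporting": {"revenue": 45_000.0, "cloud_cogs": 21_000.0},
}

for name, p in products.items():
    margin = gross_margin(p["revenue"], p["cloud_cogs"])
    flag = "OK" if margin >= 0.60 else "REVIEW"  # example margin floor
    print(f"{name}: margin {margin:.1%} [{flag}]")
```

The point of the flag is exactly what she argues: the question is not "what are we spending?" but "does the value delivered justify the cost?"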
Given that AI workloads account for 22% of cloud costs and introduce unpredictable “non-linear patterns” in spending, what specific strategies or tools have you found most effective for forecasting these expenses? Please share an anecdote about how you managed an unexpected training or inference spike.
Forecasting AI spend is like trying to predict the weather in a hurricane. Traditional, linear models that finance teams love are completely broken by the “non-linear patterns” of AI. You have these massive, sudden training spikes and unpredictable inference costs that are driven by user behavior. The most effective strategy is to build guardrails, not just forecasts. We had a situation where a developer kicked off a new model training experiment on a Friday afternoon. The consumption-based pricing started to climb from a few dollars to hundreds, then over a thousand. Instead of waiting for a bill at the end of the month, we had an automated alert that triggered not just an email, but a script that paused the job and notified the team lead. It’s about creating real-time controls that prevent that “non-virtuous cycle” where more powerful AI drives up compute and storage costs exponentially before anyone even notices.
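A guardrail of the kind she describes might look like the sketch below: a spend ceiling that pauses a runaway job rather than waiting for the monthly bill. The threshold is arbitrary, and `pause_job`/`notify` are stand-ins for provider-specific API calls, not real library functions.

```python
# Real-time spend guardrail: pause a training job once it crosses a ceiling.
# pause_job and notify are placeholders for provider-specific calls.

THRESHOLD_USD = 1_000.0  # per-job spend ceiling; tune per workload

def enforce_guardrail(job_id: str, current_spend: float, pause_job, notify) -> bool:
    """Pause the job and alert the team lead once spend crosses the ceiling.

    Returns True if the guardrail fired.
    """
    if current_spend >= THRESHOLD_USD:
        pause_job(job_id)
        notify(f"Job {job_id} auto-paused at ${current_spend:,.2f}")
        return True
    return False

# Tiny demo with recorded side effects instead of real API calls.
paused, alerts = [], []
enforce_guardrail("train-42", 1_250.0, paused.append, alerts.append)
print(paused, alerts)
```

The design choice worth noting is that the alert carries an action (pause) rather than just a notification, which is what turns a monitoring system into a real-time control.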
One expert suggests costs spiral when developers are given “near-infinite infrastructure choices” without economic guidance. How can IT leaders strike a balance between developer agility and financial accountability? Could you outline a step-by-step process for connecting cloud consumption directly to business value without stifling innovation?
That quote perfectly captures the heart of the problem. In the old data center days, a developer couldn’t just rack a new server without a purchase order. In the cloud, we’ve handed them a credit card with an infinite limit. The key isn’t to take the card away but to teach them how to spend wisely. A first step is implementing rigorous, mandatory tagging for every single resource. If it isn’t tagged to a team, a project, and a business goal, it doesn’t get deployed. Second, we connect that data directly to unit economics. We stop talking about the total cloud bill and start talking about the cloud cost per active user, per transaction, or per report generated. This makes the cost tangible. Finally, we embed cost analysis directly into the architectural design phase. Before a team builds a new feature, they must answer the question, “What is the economic goal of this workload?” This shifts the mindset from just building something that works to building something that works profitably.
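Two of those steps translate directly into code: the tagging gate and the unit-economics metric. The sketch below is illustrative; the exact tag keys and metrics would vary by organization.

```python
# Step 1: mandatory tagging gate — untagged resources don't get deployed.
# Step 2: unit economics — cloud cost per unit of business output.

REQUIRED_TAGS = {"team", "project", "business_goal"}

def deployment_allowed(tags: dict) -> bool:
    """Block any resource missing a mandatory, non-empty cost-attribution tag."""
    return REQUIRED_TAGS.issubset(tags) and all(tags[t] for t in REQUIRED_TAGS)

def cost_per_unit(total_cloud_cost: float, units: int) -> float:
    """Cloud cost per active user, per transaction, or per report generated."""
    if units <= 0:
        raise ValueError("need at least one unit of business output")
    return total_cloud_cost / units
```

In practice the gate would run as a policy check in the deployment pipeline, so the rule is enforced automatically rather than by review.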
The article argues that cost is a business decision, not just a technical one, and that FinOps can be a “one-time Band-Aid.” Beyond budgeting, what collaborative processes should a CIO and CFO establish to align cloud spending with strategic goals? Please describe a successful conversation framework.
Calling FinOps a “one-time Band-Aid” is spot on. It’s fantastic for finding obvious waste, like unused resources, but it doesn’t solve the core strategic issue: developers are often making huge financial decisions without any insight into revenue or margin. A truly successful collaboration between a CIO and CFO transforms the conversation from “How do we cut the cloud bill?” to “How do we invest our cloud budget for maximum return?” A great framework involves a joint quarterly planning session. The CIO presents the product roadmap, saying, “This new AI feature is our top priority. We project it will increase cloud spend by 20%, but our models show it will decrease customer churn by 15%.” The CFO can then work with them to model the impact on gross margins and ask, “Are there architectural trade-offs we can make to optimize that cost?” It becomes a partnership focused on hitting business goals, not just an arbitrary budget number.
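The trade-off in that example conversation can be made concrete with a toy gross-profit model. All figures here are invented for illustration; the point is only that the CIO's proposal and the CFO's question both fit in the same arithmetic.

```python
# Toy model of the CIO/CFO trade-off: extra retained revenue from the AI
# feature versus the extra cloud spend it requires. All numbers invented.

def gross_profit_delta(revenue: float, cloud_spend: float,
                       spend_growth: float, retained_revenue: float) -> float:
    """Net change in gross profit if the feature ships.

    spend_growth is a fraction (0.20 = +20% cloud spend); the other
    arguments are annual dollars.
    """
    extra_cost = cloud_spend * spend_growth
    return retained_revenue - extra_cost

# $10M revenue, $1.3M cloud spend (13% of revenue), +20% cloud spend,
# churn reduction assumed to be worth $600k in retained revenue.
delta = gross_profit_delta(10_000_000, 1_300_000, 0.20, 600_000)
print(f"net gross-profit impact: ${delta:,.0f}")
```

A positive delta supports the investment; the CFO's follow-up about architectural trade-offs is, in these terms, an attempt to shrink `extra_cost` without shrinking `retained_revenue`.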
We read about developers who “spin it up for an experiment, and then they forget it exists,” leading to extensive sprawl. Besides basic alerts, what are some of the most effective, automated, step-by-step procedures you recommend for identifying and eliminating this kind of cloud waste?
This is a classic and costly problem. That image of developers spinning up environments for an experiment and then just forgetting them is painfully accurate. Basic alerts for budget overruns are too little, too late. The most effective defense is proactive automation. First, we implement a ‘Time-to-Live’ policy for all non-production environments. When a developer creates a test server, it’s automatically tagged to be shut down in 48 hours unless they provide a justification to extend it. Second, we run nightly automated scripts that act like a cleanup crew, hunting for “orphan” resources—things like unattached storage volumes or idle databases that are just sitting there racking up charges. Finally, and this is crucial for culture, we publish a “Cloud Hygiene” dashboard that shows which teams are the most efficient. It’s not about shaming, but about making stewardship visible and creating a sense of collective ownership over our resources.
What is your forecast for how and when these rising AI cloud costs, currently being absorbed by vendors, will be passed on to customers, and what will that mean for the SaaS market’s competitive landscape?
My forecast is that the era of vendors swallowing these costs is rapidly coming to an end. It’s simply not sustainable, especially as AI’s share of cloud spend continues to scale. I predict that over the next 12 to 18 months, we will see a fundamental shift in SaaS pricing models. It won’t be a simple, across-the-board price hike, as that would increase churn in a fiercely competitive market. Instead, we’ll see the rise of hybrid and consumption-based pricing tiers specifically for AI-powered features. A standard subscription might include basic functionality, but leveraging advanced generative AI reports or predictive analytics will come with a usage-based fee. This will dramatically alter the competitive landscape. The winners will be the companies that have built the most economically coherent and efficient AI architectures. They will be able to offer these powerful features at a lower cost, giving them a massive advantage and putting immense pressure on competitors who haven’t gotten their own cloud houses in order.
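The hybrid pricing model she predicts reduces to simple billing arithmetic: a flat subscription plus a metered overage for AI features beyond an included allowance. The fee levels below are invented for illustration.

```python
# Hybrid SaaS bill: flat subscription plus metered AI-feature overage.
# base_fee, allowance, and unit_price are illustrative, not real pricing.

def monthly_invoice(base_fee: float, ai_units_used: int,
                    included_units: int, unit_price: float) -> float:
    """Flat subscription plus usage-based fee for AI units beyond the
    included allowance (e.g. generative reports or predictive queries)."""
    overage = max(0, ai_units_used - included_units)
    return base_fee + overage * unit_price

# $99/month plan with 1,000 AI units included, $0.05 per extra unit.
print(monthly_invoice(99.0, 1_200, 1_000, 0.05))
```

Her competitive argument follows directly: vendors with more efficient AI architectures can set a lower `unit_price` at the same margin, which is where the pricing pressure will bite.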
