By Rohit Ramanand, Group Vice President of Engineering, India, New Relic
Organisations are constantly under pressure to roll out new features and updated versions of their products to stay competitive. When OpenAI recently released its latest image-generation feature, it took an unpredictable turn as Studio Ghibli-style images took the internet by storm. The images representing users as characters from the Japanese films quickly went viral, especially in South Asian countries, and OpenAI gained millions of new users in a matter of hours.
Behind the glamour of this trend was a less publicised problem. Despite having some of the most skilled talent and cutting-edge technology, the sudden surge in traffic put immense pressure on ChatGPT’s infrastructure. It caused multiple operational challenges and delays in product releases, as shared by Sam Altman.
Such incidents present huge challenges for any business, including monetary losses. This is where observability powered by agentic AI capabilities comes into play. It is not just a remedy for outages or downtime, but a powerful tool that boosts productivity, prevents operational delays, predicts incidents, and keeps systems running smoothly to avoid losses.
Imagine a scenario where a single misconfigured API breaks your entire customer onboarding flow. This is common in applications built with interconnected services and components. Despite having best-in-class technologies, businesses still struggle to prevent outages and operational interruptions.
The root cause is simple: many organisations depend on numerous tools to manage their IT systems and business data. With a high volume of telemetry, alerts, and fragmented tools, the ability to make quick and accurate decisions becomes challenging. IT teams often struggle to connect the dots between each tool, making it hard to draw context-rich, reliable conclusions.
What makes things worse is that siloed data limits the accuracy of AI-generated insights. Since each tool only has a partial view of the ecosystem, AI systems lack full context, resulting in biased or incomplete recommendations. This slows down troubleshooting, increases mean time to resolution (MTTR), and ultimately leads to more frequent and costly downtime, as teams are unable to anticipate or prevent incidents, or even trace them back to their root cause.
This lack of visibility forces teams into a reactive rather than proactive approach, leading to operational inefficiencies and potential financial losses. Repeated outages not only hurt performance but also damage business reputation and erode customer trust. This is why businesses need a robust observability strategy that works in tandem with their entire workflow.
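To make the MTTR metric mentioned above concrete, here is a minimal Python sketch of how it is typically computed: the average time between an incident being detected and being resolved. The incident timestamps and the function name are hypothetical, chosen only for illustration; real figures would come from an organisation's own incident records.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average (resolved - detected) duration across incidents.

    `incidents` is a list of (detected, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so the sum stays a timedelta
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) timestamps.
incidents = [
    (datetime(2025, 4, 1, 9, 0), datetime(2025, 4, 1, 9, 45)),     # 45 minutes
    (datetime(2025, 4, 3, 14, 0), datetime(2025, 4, 3, 16, 0)),    # 120 minutes
    (datetime(2025, 4, 7, 22, 30), datetime(2025, 4, 7, 23, 45)),  # 75 minutes
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:20:00
```

Tracking this number over time is one simple way to check whether a change in tooling is actually shortening troubleshooting.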
Advanced observability platforms are strengthened by AI and, increasingly, feature agentic AI integrations. As businesses look to AI to automate tasks, the need for an observability platform that can communicate and coordinate with AI agents across the ecosystem becomes imperative. Comprehensive 360-degree visibility into the entire ecosystem helps streamline operations and ensure consistent business uptime.
Natively integrated observability tools with agentic AI capabilities make this possible. They do the heavy lifting for IT teams by automating tasks and workflows, even across external tools. Agentic integrations extend business-critical observability insights into popular workflow platforms, including ITSM and SDLC tools such as GitHub Copilot, Amazon Q Business, ServiceNow, and Gemini Code, which makes troubleshooting, incident prediction, and resolution much easier.
Intelligent orchestration ensures that the right agent is prioritised for each task, delivering highly relevant and accurate responses and recommendations. These integrations also enable an open ecosystem of agent-to-agent orchestrations connected via natural language APIs. This allows users to automate research and complex tasks across workstreams and turn disparate data sets into business-critical insights.
With the right observability platform, IT teams no longer need to spend hours sifting through large volumes of data to find the root cause of an issue. Observability platforms with agentic integrations help teams adopt a more proactive posture by surfacing actionable insights in context from all IT and business data, empowering intelligent decisions that drive growth and innovation. Looking back at the Ghibli trend and the pressure that followed on OpenAI’s infrastructure, an observability solution strengthened by agentic AI might have helped the company avoid such operational challenges.
These tools leverage response intelligence and predictions to contextualise telemetry, offering AI-strengthened impact analysis and faster incident mitigation. During a surge like the one OpenAI faced, this could have shortened resolution time, allowing teams to focus on new product releases instead of combating issues in real time. For any business, incidents or downtime can have a significant impact on revenue.
It’s always better to be proactive and prevent issues before they occur than to face costly damage and financial loss.
Ghibli Trend to ChatGPT Breakdowns: Why Observability with Agentic AI Integrations is a Game Changer
