
Early users are thrilled with how quickly and effectively Gemini Cloud Assist investigations helps them troubleshoot and resolve tough problems.
“At ZoomInfo, maintaining uptime is critical, but equally important is ensuring our engineers can swiftly and effectively troubleshoot complex issues. By integrating Gemini Cloud Assist investigations early into our development process, we’ve accelerated troubleshooting across all levels of our engineering team. Engineers at every experience level can now rapidly diagnose and resolve problems, reducing some resolution times from hours to minutes. This efficiency enables our teams to spend more energy innovating and less time on reactive problem-solving. Gemini Cloud Assist investigations isn’t just a troubleshooting aid; it’s a key driver of productivity and innovation.” – Yasin Senturk, DevOps Engineer at ZoomInfo
“I’m really impressed by how Gemini Cloud Assist Investigations feature in 2 minutes turned over with some valid suggestions on the potential root causes, and the first one being the actual culprit! I was able to mitigate the whole issue within an hour. Gemini Cloud Assist really saved my weekend!” – Chuanzhen Wu, SRE, Google Waze
Let’s walk through Gemini Cloud Assist investigations’ capabilities in a bit more detail.
Programmatic, proactive, and interactive access
You can start an investigation from several places in Google Cloud: from error messages in Logs Explorer, from specific product pages (such as Google Kubernetes Engine or Cloud Run), or from the central Investigations page, where you can provide context such as error messages, affected resources, and observation time. Gemini Cloud Assist investigations also provides an API, so you can integrate it into existing workflows, for example through Slack or other incident management tools. If resolving the root cause requires further assistance, you can open a Google Cloud support case that includes the investigation response, so support engineers can pick up from that point.
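To make the programmatic path more concrete, here is a minimal Python sketch of how an incident tool might bundle the context mentioned above (error message, affected resources, observation time) before handing it to an investigation. It does not call the actual API: the `InvestigationContext` class, the `context_from_alert` helper, the alert keys, and every payload field name are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class InvestigationContext:
    """Hypothetical container for the context an investigation starts from."""

    error_message: str
    affected_resources: list[str] = field(default_factory=list)
    observed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    window: timedelta = timedelta(minutes=30)  # how far back to look (assumed default)

    def to_payload(self) -> dict:
        """Serialize to a JSON-friendly dict. Field names are illustrative only."""
        return {
            "errorMessage": self.error_message,
            "affectedResources": sorted(set(self.affected_resources)),
            "observationStart": (self.observed_at - self.window).isoformat(),
            "observationEnd": self.observed_at.isoformat(),
        }


def context_from_alert(alert: dict) -> InvestigationContext:
    """Map an alert from a hypothetical incident tool onto the context above."""
    return InvestigationContext(
        error_message=alert["summary"],
        affected_resources=list(alert.get("resources", [])),
        observed_at=datetime.fromisoformat(alert["fired_at"]),
    )


if __name__ == "__main__":
    alert = {
        "summary": "HTTP 503 from checkout service",
        "resources": ["cloud-run/checkout", "sql/orders", "cloud-run/checkout"],
        "fired_at": "2025-01-15T10:00:00+00:00",
    }
    print(json.dumps(context_from_alert(alert).to_payload(), indent=2))
```

Keeping the context in one small, serializable structure means the same data can be posted to a chat channel, attached to a ticket, or sent to whatever request format the real API expects.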
Contextualization
An investigation can start from a natural language description, an error message, log snippets, or any combination of the information you have about your issue. It first gathers the initial context related to the issue, then builds a topology of the relevant resources and all the associated data sources that might provide insight into the root cause.
Investigations uses both public and private knowledge, playbooks informed by Google SRE and Google Cloud Support issues, and your topology, grounding itself in similar issues before generating a troubleshooting plan for your issue. This context becomes key in providing focused and comprehensive signal analysis.
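One way to picture the topology-building step is as a bounded graph traversal that starts at the affected resource and walks outward through its dependencies. The sketch below is only an illustration of that idea, not a description of how the product works internally; the `related_resources` function, the dependency graph, and the resource names are all made up for the example.

```python
from collections import deque


def related_resources(
    graph: dict[str, list[str]], start: str, max_hops: int = 2
) -> dict[str, int]:
    """Return every resource reachable from `start` within `max_hops`,
    mapped to its hop distance (breadth-first search)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(current, []):
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


if __name__ == "__main__":
    # Hypothetical dependency graph: resource -> resources it depends on.
    dependencies = {
        "cloud-run/checkout": ["sql/orders", "pubsub/events"],
        "sql/orders": ["vpc/default"],
        "vpc/default": ["nat/gateway"],
    }
    for name, hops in related_resources(dependencies, "cloud-run/checkout").items():
        print(f"{hops} hop(s): {name}")
```

Limiting the number of hops keeps the set of resources, and therefore the set of data sources to analyze, focused on what is plausibly related to the issue.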
Comprehensive signal analysis
Once the investigation runs, you’ll see the observations that it starts to collect from your project. The investigation goes beyond surface-level observations; it automatically analyzes critical data sources across your Google Cloud environment, including:
- Google Cloud logs: Sifting through vast log data to identify anomalies and critical events
- Cloud Asset Inventory: Understanding changes in your resource configurations and their potential impact
- Metrics (coming soon): Correlating performance data to pinpoint resource exhaustion or unexpected behavior
- Errors: Aggregating and categorizing errors to highlight patterns and recurring problems
- Log themes: Identifying common patterns and themes within log data to provide a higher-level view of issues
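To give a feel for what grouping log data into themes can look like, here is a small Python sketch that masks the variable parts of each message (UUIDs, IP addresses, numbers) and counts how often each resulting template occurs. This is a generic log-templating technique used here for illustration under simplifying assumptions; it is not the method Gemini Cloud Assist investigations uses, and the function names and sample messages are invented for the example.

```python
import re
from collections import Counter

# Order matters: mask the most specific patterns first.
_VARIABLE_PARTS = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def to_template(message: str) -> str:
    """Replace variable tokens with placeholders so similar messages collapse together."""
    for pattern, placeholder in _VARIABLE_PARTS:
        message = pattern.sub(placeholder, message)
    return message


def log_themes(messages: list[str], top: int = 3) -> list[tuple[str, int]]:
    """Return the `top` most frequent message templates with their counts."""
    return Counter(to_template(m) for m in messages).most_common(top)


if __name__ == "__main__":
    sample = [
        "Connection to 10.0.0.5 timed out after 30 seconds",
        "Connection to 10.0.0.9 timed out after 45 seconds",
        "User 42 logged in",
    ]
    for template, count in log_themes(sample):
        print(f"{count}x {template}")
```

Collapsing thousands of near-identical lines into a handful of templates is what turns raw logs into the higher-level view described in the list above.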
Source Credit: https://cloud.google.com/blog/products/management-tools/gemini-cloud-assist-investigations-performs-root-cause-analysis/