

Welcome to the July 16–31, 2025 edition of Google Cloud Platform Technology Nuggets. The nuggets are also available on YouTube.
Looking for a quick list of Google Cloud's AI announcements in July? Check out the roundup “What Google Cloud Announced in AI this month” and bookmark it.
As expected, there is a range of updates in the AI space for this edition. First up, Veo 3 and Veo 3 Fast on Vertex AI are now Generally Available (GA). Do note that an upcoming image-to-video capability for both models, which will allow users to animate static visuals, is highly anticipated. Check out the blog post.
The catalog of open models available as Model-as-a-Service (MaaS) offerings in Vertex AI Model Garden has seen the addition of DeepSeek R1. The blog post goes into detail on how you can enable the DeepSeek R1 service in your project, test it out in the UI, and integrate the API for inference purposes.
While the catalog of models available in the form of Model-as-a-Service (MaaS) may keep growing, what if you were tasked with implementing the end-to-end lifecycle of taking an open model from discovery to a production-ready endpoint on Vertex AI? It's a four-step process, outlined below and described in detail in the blog post:
- Part 1: Quickly choose the right base model
- Part 2: Start parameter efficient fine-tuning (PEFT) with your data
- Part 3: Evaluate your fine-tuned model
- Part 4: Deploy to a production endpoint
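The four parts above can be pictured as a simple pipeline skeleton. Every function and value below is hypothetical scaffolding to make the hand-offs between stages explicit; none of it is the Vertex AI SDK.

```python
# Hypothetical skeleton of the four-part lifecycle described above.
# These are NOT real Vertex AI APIs; they only illustrate the ordering
# and data hand-offs between the stages.

def choose_base_model(candidates, task):
    """Part 1: pick a base model, e.g. the first one tagged for the task."""
    return next(m for m in candidates if task in m["tasks"])

def peft_fine_tune(model, dataset):
    """Part 2: parameter-efficient fine-tuning (e.g. LoRA) on your data."""
    return {**model, "adapter": f"lora-{dataset['name']}"}

def evaluate(model, eval_set):
    """Part 3: score the tuned model; here, a dummy accuracy number."""
    return {"model": model["name"], "accuracy": 0.9}

def deploy(model):
    """Part 4: promote to a production endpoint; returns a fake endpoint id."""
    return f"endpoints/{model['name']}-{model['adapter']}"

candidates = [
    {"name": "gemma-2b", "tasks": ["chat"]},
    {"name": "llama-3-8b", "tasks": ["chat", "code"]},
]
base = choose_base_model(candidates, "code")
tuned = peft_fine_tune(base, {"name": "support-tickets"})
report = evaluate(tuned, eval_set=None)
endpoint = deploy(tuned)
print(endpoint)  # endpoints/llama-3-8b-lora-support-tickets
```

The point of the sketch is that each stage consumes the previous stage's output, so a failed evaluation in Part 3 can send you back to Part 1 or 2 before anything reaches production.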
A while back, when specialized vector databases were springing up, it was expected that eventually all the mainstream databases would support embeddings and vector search to remain relevant. If you are a Cloud SQL for MySQL user looking to get started with vector embeddings and search, check out this article, which highlights how the Vertex AI integration helps you generate vector embeddings using a simple SQL function, build persistent indexes on them, and perform approximate nearest neighbor (ANN) search over them. You can also leverage any Vertex AI model directly from MySQL, bringing advanced AI capabilities closer to your data.
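To see what vector search is doing conceptually, here is a plain-Python sketch of similarity search over embeddings: exact (brute-force) cosine distance rather than an ANN index, and made-up three-dimensional vectors instead of real model embeddings. The actual Cloud SQL for MySQL SQL functions are described in the linked article.

```python
import math

# Conceptual illustration of vector search. In Cloud SQL for MySQL the
# embedding generation and the (approximate) index search happen via SQL
# functions; here we mimic the idea with exact cosine distance in Python.

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# Toy "embeddings" for three documents (real ones have hundreds of dims).
docs = {
    "returns policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.2],
    "gift cards":     [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "how do I return an item?"

best = min(docs, key=lambda name: cosine_distance(query, docs[name]))
print(best)  # returns policy
```

An ANN index trades a little recall for speed by searching only a promising subset of vectors instead of every row, which is why it matters once tables grow large.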
Google Cloud announced general availability of a global endpoint for Anthropic’s Claude models on Vertex AI, significantly enhancing model availability and reliability. To account for data residency requirements, both regional and global endpoint locations are supported. Check out the blog post for more details, including a quick table that summarizes the differences between the two options.
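In practice the choice shows up in the request URL: a regional endpoint pins both the hostname and the location path to a specific region (for data residency), while the global endpoint uses the region-less host with `global` as the location. The path layout below is a sketch based on the common Vertex AI publisher-model URL scheme; confirm the exact format and model IDs against the docs.

```python
# Sketch of regional vs global Vertex AI endpoint URLs for a publisher
# model such as Claude. The path layout is an assumption based on the
# usual Vertex AI URL scheme; verify against the current documentation.

def vertex_model_url(project, model, location="global"):
    host = (
        "aiplatform.googleapis.com"                   # global: no region prefix
        if location == "global"
        else f"{location}-aiplatform.googleapis.com"  # regional: data residency
    )
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/anthropic/models/{model}:rawPredict"
    )

print(vertex_model_url("my-proj", "claude-sonnet-4"))
print(vertex_model_url("my-proj", "claude-sonnet-4", location="us-east5"))
```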
Google Cloud BigQuery is one of the most used services in Google Cloud, and with the rise of AI agents, it is no surprise that there is strong demand to connect these agents to BigQuery datasets. While you can build your own tools for the integration, a new first-party toolset for BigQuery, including tools to fetch metadata and execute queries, is now available. As the blog post states: “These official, Google-maintained tools provide a secure and reliable bridge to your data, and you can use them in two powerful ways: a built-in toolset in Google’s Agent Development Kit (ADK) or through the flexible, open-source MCP Toolbox for Databases.”
10 years is a long time, and Google Kubernetes Engine (GKE), which provides a best-in-class managed Kubernetes service, has led the way in deploying flexible and scalable computing applications in the cloud. Check out 10 years of GKE, an ebook that explores this period and how customers have innovated with it.
Secure Boot for Google Cloud AI workloads sounds like a complicated set of terms. A bootkit can bypass traditional security by compromising the system at startup, gaining high-level privileges to steal data or corrupt AI models, and the consequences can be damaging. The blog post covers Google Cloud’s Shielded VM offering, which includes Secure Boot to verify system integrity, and addresses the challenge of using Secure Boot with GPU drivers that are typically not officially signed. It walks through both automated and manual methods for signing GPU drivers and enabling Secure Boot on Google Cloud instances.
The first CISO Bulletin for July 2025 covers Big Sleep, an AI agent developed by Google DeepMind and Google Project Zero to proactively detect and neutralize security vulnerabilities in software; it has already identified and prevented a critical SQLite vulnerability before attackers could exploit it. Also included are details on improving email phishing detection in Gmail. Check it out.
The 2nd CISO Bulletin for July 2025 covers Google’s efforts to enhance cybersecurity measures, specifically focusing on tackling identity-based attacks. It highlights two key items: passkeys, designed to offer a more secure and simpler sign-in experience by eliminating traditional passwords, and Device Bound Session Credentials (DBSC), which aim to disrupt cookie theft by cryptographically linking authenticated sessions to specific devices. Check out the blog post for more details.
If you are interested in a quick review of all the announcements across data analytics, databases, and business intelligence, check out What's New with Google Data Cloud and bookmark it.
Dataproc has evolved considerably, from a basic managed service into a robust platform for analytics and AI workloads. Check out the blog post highlighting the advantages of using Google Cloud Dataproc for managing Apache Spark environments. Key features covered include the Lightning Engine for Spark, which significantly boosts performance, seamless integrations with Cloud Storage, BigQuery, and other AI/ML services, and enterprise-grade security features.
As a follow-up to the above article, Dataproc is well positioned to manage your AI/ML experience with Apache Spark thanks to a constant flow of new feature updates. These include ML Runtimes with pre-packaged GPU drivers and essential machine learning libraries, simplifying environment setup for both Dataproc on Compute Engine clusters and Google Cloud Serverless for Apache Spark. Developer experience also gets a boost: Spark application development is enabled within Colab Enterprise notebooks, VS Code, and JupyterLab, alongside support for distributed training and inference with GPU acceleration. Check out the blog post.
Looking to get inspired by what customers are building with Google Cloud AI? Check out the July edition of Cool Stuff Customers Built, which covers interesting innovations (as mentioned in the blog post):
- Box’s AI agents extracting insights with cross-platform agent integration.
- Schroders uses multiple connected agents to build a complex investment research system.
- Hypros’ IoT device can monitor patient distress in hospitals without individual monitoring.
- Formula E demonstrating whether regenerative braking can power an EV supercar for an entire lap.
- A unified data and AI platform at LVMH that can serve 75 distinct luxury brands.
- Alpian deploys AI to become the most advanced cloud-native private bank in Switzerland.
- A gen-AI powered tool estimates where potential home runs might land in the stands during the MLB All-Star Game.
- Oviva developed an AI-powered meal-logging app.
The key to fast adoption and acceptance of AI coding assistants is largely their ability to adapt to the specific rules of the organization and its users. Developers, and even non-developers, often have mini-workflows for certain tasks. Gemini CLI has introduced Custom Commands, which let you define these mini-workflows or tasks in a simple command definition file and set them up as commands. This is a great way to extend Gemini CLI's built-in commands and tools, and it puts the user in control. Check out this article that goes into the details of how to set up Custom Commands in Gemini CLI.
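As a rough illustration, a custom command definition might look like the following. This is a hypothetical example assuming the TOML layout described in the Gemini CLI documentation (a `description` and a `prompt` field, with the file name becoming the command name); check the linked article for the exact format.

```toml
# ~/.gemini/commands/changelog.toml — invoked as /changelog
# Hypothetical example; field names follow the Gemini CLI custom
# command docs ("description" and "prompt").
description = "Summarize recent commits into a changelog entry"
prompt = """
Look at the output of `git log --oneline -20` and draft a concise
changelog entry grouped into Features, Fixes, and Chores.
"""
```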
If you are invested in Google Developer technologies, a refreshed Google Developer Program forum has been launched at discuss.google.dev. The new forums are designed to help people build with Google technology. You will find discussion groups to engage with other developers and Google experts; how-to articles, reference architectures, and use cases; and a community of users looking to help. Read the blog for more details.
If you are looking to deploy computationally intensive AI inference, you quickly run into challenges ranging from inefficient load balancing and variable processing times to limited resource management and sub-optimal autoscaling. Enter GKE and the GKE Inference Gateway, which provides intelligent, AI-aware load balancing and resource management. Check out the blog post, which contains a detailed walkthrough of how you can do that today.
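For a feel of what "AI-aware" routing configuration looks like, here is a sketch of the two custom resources the open-source Gateway API Inference Extension (which GKE Inference Gateway builds on) introduces. The API version, kinds, and field names below are assumptions drawn from that extension and may have changed; verify against the current GKE documentation before applying anything.

```yaml
# Sketch of Gateway API Inference Extension resources. API versions and
# field names are assumptions from the open-source extension
# (inference.networking.x-k8s.io) — check the current GKE docs.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: vllm-pool
spec:
  selector:
    app: vllm-llama3        # Pods serving the model
  targetPortNumber: 8000
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: chat-model
spec:
  modelName: meta-llama/Llama-3.1-8B-Instruct
  criticality: Critical     # lets the gateway prioritize under load
  poolRef:
    name: vllm-pool
```

The pool groups the serving Pods, and the model resource attaches workload metadata (such as criticality) that the gateway can use when balancing and shedding load.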
Dataflow ML simplifies the process of building and running machine learning pipelines, specifically for tasks involving embeddings and RAG applications. Check out the blog post, which explains knowledge ingestion pipelines for populating vector databases like AlloyDB, including streaming versus batch processing and chunking data, MLTransform for generating embeddings, and an enrichment transform for RAG use cases. All of this takes just a few lines of code.
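Chunking, the step that splits documents before embeddings are generated, is easy to picture in plain Python. This sliding-window sketch is purely illustrative; it is not the Dataflow ML or MLTransform API itself.

```python
# Illustrative sliding-window chunker, the kind of splitting a knowledge
# ingestion pipeline performs before generating embeddings. Plain Python,
# not the Dataflow ML / MLTransform API.

def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into ~chunk_size-char pieces, carrying `overlap` chars over."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks

doc = "x" * 120
parts = chunk_text(doc, chunk_size=50, overlap=10)
print(len(parts), [len(p) for p in parts])  # 3 [50, 50, 40]
```

The overlap matters for RAG: it keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.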
AI assistance across Google Cloud has been brewing for a while. Application Monitoring in Google Cloud brings you a step closer to efficiently understanding how your applications are doing. App Hub brought together various Google Cloud resources under a single Application construct, and Application Monitoring takes it a step further by automatically labelling metrics, logs, and traces with your Application metadata, along with a set of curated dashboards to view that information under Monitoring. Additionally, Gemini Cloud Assist's investigation feature provides automated troubleshooting: it can take your logs and give you a first assessment of the issues and remedial measures. Check out the blog post for more details.
Cluster Director is Google Cloud's unified management plane designed to simplify the deployment and management of large-scale AI and HPC infrastructure. It has gained enhanced features like a simplified graphical user interface, a managed Slurm experience for job scheduling, and advanced observability tools for performance monitoring. Check out the blog post for more details.
There is a new monitoring library for Google Cloud TPUs (Tensor Processing Units), designed to help users optimize performance for AI training and inference workloads. The blog post details how the library provides real-time insights into metrics such as Tensor Core utilization and high-bandwidth memory (HBM) usage. The library can also be integrated into your Python applications.
Looking to build AI applications on Google Cloud and searching for guides to help you build a variety of them? Check out the blog post titled “25+ top Gen AI how-to guides for enterprise”, which features guides across categories like faster model deployment; building generative AI apps and multi-agent systems; fine-tuning, evaluation, and Retrieval-Augmented Generation (RAG); and integrations.
If you would like to share your Google Cloud expertise with your fellow practitioners, consider becoming an author for Google Cloud Medium publication. Reach out to me via comments and/or fill out this form and I’ll be happy to add you as a writer.
Have questions, comments, or other feedback on this newsletter? Please send Feedback.
If any of your peers are interested in receiving this newsletter, send them the Subscribe link.
Source Credit: https://medium.com/google-cloud/google-cloud-platform-technology-nuggets-july-16-31-2025-18f59d638059?source=rss—-e52cf94d98af—4