
Optimized networking via Private Service Connect (PSC)
While Dedicated Endpoints Public remain available for models that must be reachable over the public internet, we are expanding the networking options for Dedicated Endpoints with Google Cloud Private Service Connect (PSC). The new Dedicated Endpoints Private (via PSC) provide a secure, performance-optimized path for prediction requests. With PSC, traffic stays entirely within Google Cloud’s network, which offers significant benefits:
- Enhanced security: Requests originate from within your Virtual Private Cloud (VPC) network, eliminating public internet exposure for the endpoint.
- Improved performance consistency: Bypassing the public internet reduces latency variability.
- Reduced performance interference: PSC facilitates better network traffic isolation, mitigating potential “noisy neighbor” effects and leading to more predictable performance, especially for demanding workloads.
For production workloads that have strict security requirements and need predictable latency, Private Endpoints using Private Service Connect are the recommended configuration.
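To make the configuration more concrete, here is a minimal sketch of the request body you might send when creating a PSC-enabled endpoint. It assumes the field names `displayName`, `privateServiceConnectConfig`, `enablePrivateServiceConnect`, and `projectAllowlist` from the Vertex AI REST `Endpoint` resource; the helper name `build_psc_endpoint_body` and the example values are hypothetical, so check the current API reference before relying on it.

```python
import json


def build_psc_endpoint_body(display_name, allowed_projects):
    """Build a JSON-serializable body for creating a PSC-enabled endpoint.

    Field names are assumed from the Vertex AI REST Endpoint resource;
    this helper is illustrative and not part of any SDK.
    """
    # Sketch-level guard (not a documented API rule): insist on an explicit
    # list of consumer projects so the allowlist is never left empty by accident.
    if not allowed_projects:
        raise ValueError("provide at least one consumer project to allowlist")
    return {
        "displayName": display_name,
        "privateServiceConnectConfig": {
            # Assumed flag that turns on Private Service Connect for the endpoint.
            "enablePrivateServiceConnect": True,
            # Projects whose VPC networks may connect to the endpoint.
            "projectAllowlist": list(allowed_projects),
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    body = build_psc_endpoint_body("my-private-endpoint", ["my-consumer-project"])
    print(json.dumps(body, indent=2))
```

The sketch only assembles the payload. Sending it, and then creating the forwarding rule in your VPC that targets the endpoint's service attachment, are separate steps covered in the documentation.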
How Sojern is using the new Vertex AI Prediction Dedicated Endpoints to serve models at scale
Sojern is a marketing company focusing on the hospitality industry, matching potential customers to travel businesses around the globe. As part of their growth plans, Sojern turned to Vertex AI. Leaving their self-managed ML stack behind, Sojern can focus more on innovation, while scaling out far beyond their historical footprint.
Given the nature of Sojern’s business, their ML deployments follow a unique model that requires several high-throughput endpoints to be available and agile at all times, allowing for constant model evolution. Using Public Endpoints would have led to rate limiting and ultimately degraded the user experience; moving to a Shared VPC model would have required a major design change for existing consumers of the models.
With Private Service Connect (PSC) and Dedicated Endpoints, Sojern avoided hitting the quotas and limits enforced on Public Endpoints, while also avoiding a network redesign to accommodate Shared VPC.
The ability to quickly promote tested models, take advantage of the enhanced feature set of Dedicated Endpoints, and improve latency for their customers strongly aligned with Sojern’s goals. The Sojern team continues to onboard new models, always improving accuracy and customer satisfaction, powered by Private Service Connect and Dedicated Endpoints.
Get started
Are you struggling to scale your prediction workloads on Vertex AI? Check out the resources below to start using the new Vertex AI Prediction Dedicated Endpoints:
Documentation
GitHub samples
Your experience and feedback are important as we continue to evolve Vertex AI. We encourage you to explore these new endpoint capabilities and share your insights through the Google Cloud community forum.
Source Credit: https://cloud.google.com/blog/products/ai-machine-learning/reliable-ai-with-vertex-ai-prediction-dedicated-endpoints/