Enterprises across the world are processing enormous volumes of data. Palo Alto Networks ingests thousands of firewall logs, telemetry signals and threat events every second across its product portfolio. To support this scale, Palo Alto Networks operated 30,000 individual data pipelines, each with its own operational load. While this single-tenant architecture worked originally, it had begun to slow innovation, limit further scale and make onboarding new analytics use cases increasingly costly.
To support the next generation of security products, Palo Alto Networks partnered with Google Cloud to modernize its data processing landscape into a unified multi-tenant platform powered by Dataflow, Pub/Sub and BigQuery. This transformation became the foundation of Palo Alto Networks' Unified Data Platform (UDP), which now processes billions of events every day with improved agility, simpler operations and meaningful cost efficiency.
The challenge: A single-tenant architecture could not keep pace
Before migrating, Palo Alto Networks' data platform was built around a “one pipeline per tenant” model. Each tenant pipeline required its own configuration, troubleshooting, on-call rotations and capacity tuning. As Palo Alto Networks' usage grew, so did the friction:
- Brittle alerting and weekly operational overhead to support more than 30,000 pipelines that were processing a combined throughput of roughly 30 GB per second.
- Slow deployment cycles that made onboarding new tenants harder.
- Significant compute resources dedicated to each tenant, regardless of load.
- Engineering time spent managing infrastructure instead of building new analytics.
This model hindered operational agility and made it challenging to scale as new product lines expanded and data volumes increased.
The transformation: Embracing a new architectural paradigm with Dataflow
The turning point came when the team recognized that Google Cloud Dataflow's serverless, autoscaling architecture could support a completely different operating model. Instead of maintaining thousands of individual pipelines, Palo Alto Networks could unify workloads into a multi-tenant system where resources are shared intelligently across tenants.
Several core capabilities made this possible:
1. The architectural shift
Dataflow allowed the team to move from “one job per tenant” to a “shared resource pool” that can handle multiple tenants within a single architecture. This shift dramatically simplified operations and unlocked new efficiencies.
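In Apache Beam terms, a shared pool of this kind can be expressed as a single streaming pipeline that reads a common Pub/Sub topic and routes each event by tenant. The sketch below is illustrative only; the topic, field and table names are assumptions, not PANW's actual schema:

```python
# Illustrative multi-tenant streaming pipeline: one Dataflow job consumes a
# shared Pub/Sub topic and fans events out per tenant. The topic, field and
# table names are assumptions for illustration, not PANW's schema.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadSharedTopic" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/all-tenant-events")
            | "Parse" >> beam.Map(json.loads)
            # Route each event by its tenant identifier instead of running
            # a dedicated pipeline per tenant.
            | "WritePerTenant" >> beam.io.WriteToBigQuery(
                table=lambda row: (
                    f"example-project:security.events_{row['tenant_id']}"),
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)
        )


if __name__ == "__main__":
    run()
```

Because the destination table is chosen per element, onboarding a new tenant becomes a data change rather than a new deployment.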
2. Unlocking multi-tenancy at scale
Dataflow’s autoscaling engine manages fluctuating workloads with ease, accommodating the unpredictable spikes that are common in cybersecurity environments. This eliminated the need for manual capacity planning.
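As a rough illustration, horizontal autoscaling is enabled through standard Dataflow pipeline options; the worker bounds, project and region below are placeholders rather than PANW's configuration:

```python
# Illustrative pipeline options enabling Dataflow horizontal autoscaling for
# a streaming job. Worker bounds, project and region are placeholders.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="example-project",
    region="us-central1",
    streaming=True,
    # Dataflow grows and shrinks the worker pool with throughput and backlog,
    # up to the bound below; no manual capacity planning required.
    autoscaling_algorithm="THROUGHPUT_BASED",
    max_num_workers=50,
)
```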
3. Operational freedom
By using Flex Templates and Dataflow's managed service model, the team transformed its CI/CD process from week-long deployment cycles into a single-day workflow. Engineers no longer spend time managing infrastructure and can instead focus on analytics, threat detection and product innovation.
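A workflow of this kind might end with a CI job launching a prebuilt Flex Template programmatically. This sketch assumes the google-cloud-dataflow-client library; the project, bucket, job name and parameters are hypothetical:

```python
# Sketch of a CI/CD step that launches a prebuilt Flex Template
# programmatically. Assumes the google-cloud-dataflow-client library; the
# project, bucket, job name and parameters are hypothetical.
from google.cloud import dataflow_v1beta3

client = dataflow_v1beta3.FlexTemplatesServiceClient()

response = client.launch_flex_template(
    request=dataflow_v1beta3.LaunchFlexTemplateRequest(
        project_id="example-project",
        location="us-central1",
        launch_parameter=dataflow_v1beta3.LaunchFlexTemplateParameter(
            job_name="udp-ingest",
            # The template spec was built and pushed earlier in the CI run.
            container_spec_gcs_path="gs://example-bucket/templates/udp-ingest.json",
            parameters={
                "input_topic": "projects/example-project/topics/all-tenant-events",
            },
        ),
    )
)
print(f"Launched job {response.job.id}")
```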
4. Unified execution
With all jobs running on a shared Dataflow based platform, the team gained flexibility to move workloads across real time and batch systems without maintaining different codebases.
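Because Beam offers a unified programming model, the same transform can back both execution modes. A minimal sketch, with illustrative sources and enrichment logic:

```python
# One codebase, two execution modes: the same Beam transform runs in a
# streaming job (Pub/Sub source) or a batch backfill (Cloud Storage source).
# Sources, field names and logic are illustrative assumptions.
import json

import apache_beam as beam


class NormalizeEvent(beam.DoFn):
    """Shared enrichment logic used identically by batch and streaming."""

    def process(self, raw):
        event = json.loads(raw)
        event["severity"] = event.get("severity", "info").lower()
        yield event


def run(streaming: bool):
    with beam.Pipeline() as p:
        if streaming:
            raw = p | beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/all-tenant-events")
        else:
            # A batch backfill reads the same archived events from GCS.
            raw = p | beam.io.ReadFromText("gs://example-bucket/archive/*.json")
        _ = raw | beam.ParDo(NormalizeEvent())
```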
5. Observability
With Dataflow, the team relies on built-in logging and metrics to monitor pipeline health across both real time and batch workloads, providing clear visibility into performance without additional tooling. Dataflow exposes the full set of metrics required for on-call alerting, eliminating the need to build or maintain custom metrics in the PANW codebase. When alerts trigger, the Dataflow UI enables engineers to quickly identify performance bottlenecks and take corrective action.
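For example, an on-call check could read a built-in Dataflow metric such as job/system_lag straight from Cloud Monitoring, with no custom instrumentation in the pipeline. This sketch assumes the google-cloud-monitoring client library; the project and threshold are illustrative:

```python
# Sketch: an on-call check that reads Dataflow's built-in job/system_lag
# metric from Cloud Monitoring, so no custom metrics live in pipeline code.
# Assumes google-cloud-monitoring; project and threshold are illustrative.
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())

results = client.list_time_series(
    request={
        "name": "projects/example-project",
        "filter": 'metric.type = "dataflow.googleapis.com/job/system_lag"',
        "interval": monitoring_v3.TimeInterval(
            start_time={"seconds": now - 300}, end_time={"seconds": now}),
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    job = series.resource.labels["job_name"]
    latest = series.points[0].value
    # system_lag is reported in seconds; flag jobs lagging over five minutes.
    lag = latest.int64_value or latest.double_value
    if lag > 300:
        print(f"High system lag on {job}: {lag}s")
```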
Architecture overview
Source Credit: https://cloud.google.com/blog/topics/partners/palo-alto-networks-builds-a-multi-tenant-unified-data-platform/
