How Jina AI built its 100-billion-token web grounding system with Cloud Run GPUs

Editor’s note: The Jina AI Reader is a specialized tool that transforms raw web content from URLs or local files into a clean, structured, and LLM-friendly format. In this post, Han Xiao details how Cloud Run enables Jina AI to build a secure, reliable, and massively scalable web scraping system that remains economically viable. He also covers the collaboration, technical hurdles, and results behind Jina Reader, a web grounding system now processing 100 billion tokens daily.

Since Jina Reader launched in April 2024, its explosive growth to more than 10 million requests and 100 billion tokens served daily has confirmed huge demand for reliable, LLM-friendly web content. Jina Reader isn’t just another scraper; it takes a different approach to how AI systems consume web content by transforming raw, noisy web pages into clean, structured markdown.

The core challenge for any AI system processing web data is the “web grounding problem.” Modern websites are a chaotic mix of content, ads, tracking scripts, and dynamic JavaScript, so the useful signal is buried in noise. Traditional scrapers struggle with this complexity, often failing on dynamic single-page applications or producing unusable, ungrounded data for LLMs. Jina Reader’s breakthrough is ReaderLM-v2, a purpose-built 1.5-billion-parameter language model trained on millions of documents to understand web structure beyond simple rules and to extract the content that matters.
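To make the limits of “simple rules” concrete, here is a minimal sketch of a rule-based extractor written with Python’s standard library. It is purely illustrative and is not Jina Reader’s or ReaderLM-v2’s implementation: the class name `RuleBasedExtractor`, the `SKIP_TAGS` list, and the sample page are all assumptions made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical rule set: tags whose contents this toy extractor treats as noise.
SKIP_TAGS = {"script", "style", "nav", "aside", "footer"}
# Hypothetical mapping from heading tags to markdown prefixes.
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}


class RuleBasedExtractor(HTMLParser):
    """Toy HTML-to-markdown converter driven by fixed tag rules (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.prefix = ""     # markdown prefix for the current text block
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in HEADINGS:
            self.prefix = HEADINGS[tag]

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in HEADINGS:
            self.prefix = ""

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.lines.append(self.prefix + text)


def html_to_markdown(html: str) -> str:
    parser = RuleBasedExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines)


# Made-up sample page mixing content, navigation, a tracking script, and an ad.
page = """
<html><body>
<nav>Home | Pricing</nav>
<h1>Release notes</h1>
<script>trackVisitor();</script>
<p>Version 2 ships today.</p>
<div class="ad">Buy now!</div>
</body></html>
"""
result = html_to_markdown(page)
print(result)
```

The fixed rules remove the navigation bar and the tracking script, but the ad inside a generic `div` passes straight through because no rule anticipates it. Each new site layout would need another hand-written rule, which is the kind of brittleness a learned model is meant to avoid.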

Source Credit: https://cloud.google.com/blog/products/application-development/how-jina-ai-built-its-100-billion-token-web-grounding-system-with-cloud-run-gpus/