
Editor’s note: Today we hear about SmarterX, which gives retailers, manufacturers, and logistics companies AI-driven tools to sell, ship, store, and dispose of their products safely and compliantly, helping them minimize regulatory risk, maximize sales, and protect consumers and the environment. SmarterX uses BigQuery, Gemini, and Vertex AI to collect, process, and analyze vast amounts of unstructured regulatory and product data from across the web, then uses that data to train custom, highly accurate large language models (LLMs) for large consumer packaged goods brands and retailers that handle regulated products. Read on to learn how Google Cloud’s integrated, easy-to-use toolset is helping them accelerate product development.
EVP for product and technology at SmarterX, Russell Foltz-Smith views the world of retail through search-colored glasses.
“If universal product codes were really universal, looking for a product and all the information directly related to it would be a one-step process,” he proposes. “But in the real world, the ideal of universality just doesn’t exist.”
It’s a reality we all deal with dozens of times a day when we type something into a browser’s search bar: very few queries are guaranteed to return a single answer. Hence the need for what data scientists call “probabilistic search backed by algorithmic indexing and ranking strategies” (and what most of us call “googling”).
“In many ways, all data science and LLM-building boils down to accurate information retrieval,” adds Foltz-Smith. And he’s well-positioned to understand why.
SmarterX customers — consumer packaged goods brands, third-party retailers, distributors, and logistics companies — rely on SmarterX to make sense of the overwhelming volume of regulatory product data online. The platform helps ensure the way products are sold, shipped, stored, and disposed of complies with all applicable laws and regulations.
“SmarterX collects and indexes data, triangulates for missing data points, and provides a queryable interface that helps our customers minimize regulatory risk while maximizing sales,” Foltz-Smith explains. To do so, SmarterX hunts down regulatory information, using crawlers enabled by machine learning and natural language processing to locate, scrape, and parse data from websites, research papers, safety data sheets, and other nooks and crannies of the web where it may be tucked away.
“Google Cloud technologies are a perfect fit for our needs,” Foltz-Smith states. “At their core is the ability to surface the right search results from an inconceivably vast expanse of data where the inputs and outputs are not predetermined and the data itself is unstructured.”
Real-time data processing and fast, accurate model-building
To collect and store all that data, SmarterX employs BigQuery and Cloud Storage. “Our data sources are disparate and the formats unpredictable,” Foltz-Smith continues. “BigQuery accommodates unstructured and semi-structured data, then functions as a job engine, recursively cleansing, normalizing, schematizing, and classifying that data at runtime.”
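To make the cleanse, normalize, schematize, and classify steps concrete, here is a minimal sketch in plain Python. It is not SmarterX’s pipeline and does not use BigQuery; the field aliases, the three-field target schema, and the 60 °C cutoff are all assumptions chosen for illustration, not regulatory guidance.

```python
import re
from typing import Any, Optional

# Hypothetical aliases: field names as they might appear across sources,
# mapped onto one canonical schema.
FIELD_ALIASES = {
    "upc": "upc", "gtin": "upc", "barcode": "upc",
    "name": "product_name", "product": "product_name", "title": "product_name",
    "flash_point": "flash_point_c", "flashpoint": "flash_point_c",
}


def parse_celsius(value: Any) -> Optional[float]:
    """Parse '23 C', '73.4F', or a bare number (assumed Celsius) into Celsius."""
    match = re.search(r"(-?\d+(?:\.\d+)?)\s*°?\s*([CcFf])?", str(value))
    if not match:
        return None
    number = float(match.group(1))
    if (match.group(2) or "C").upper() == "F":
        number = (number - 32) * 5 / 9
    return round(number, 1)


def normalize(raw: dict) -> dict:
    """Cleanse one raw record and fit it to the canonical schema."""
    record = {"upc": None, "product_name": None, "flash_point_c": None}
    for key, value in raw.items():
        canonical = FIELD_ALIASES.get(key.strip().lower().replace(" ", "_"))
        if canonical is None:
            continue  # unknown fields are simply dropped in this sketch
        if canonical == "upc":
            record[canonical] = re.sub(r"\D", "", str(value)) or None
        elif canonical == "flash_point_c":
            record[canonical] = parse_celsius(value)
        else:
            record[canonical] = " ".join(str(value).split())
    return record


def classify(record: dict, threshold_c: float = 60.0) -> str:
    """Toy classifier; the default threshold is illustrative only."""
    flash_point = record.get("flash_point_c")
    if flash_point is None:
        return "unknown"
    return "flammable" if flash_point < threshold_c else "not_flagged"
```

Calling `normalize` on records with differently named fields yields dictionaries with the same three keys, which `classify` can then label. A real pipeline would also need to handle conflicting sources and fields this sketch silently drops.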
Google Cloud’s scalable compute resources and storage also enable real-time data processing. “We never have to worry about whether we have enough servers in a data center or adequate bandwidth,” Foltz-Smith adds. “Google Cloud hides all that complexity, so it’s handled automatically and cost-effectively.”
Further accelerating data processing is BigQuery’s integration with Gemini, which manages data-processing job queues and also forms the basis of many of the large language models, or LLMs, SmarterX builds for its clients. “Gemini is in part a collection of everything Google has already crawled, so we don’t need to re-crawl it ourselves,” Foltz-Smith notes. That makes model-building faster.
Built-in grounding (the ability to connect model output to verifiable information sources) makes Gemini a safer, more conscientious way to assemble data for SmarterX customers. And retrieval-augmented generation, or RAG, allows SmarterX to connect Gemini with customers’ proprietary databases, enhancing the LLMs’ accuracy and relevance while helping ensure the security of its customers’ data.
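The general shape of RAG can be sketched in a few lines of plain Python: retrieve the passages most relevant to a question, then place them in the prompt so the model answers from cited sources. This is a toy illustration, not SmarterX’s implementation or a Gemini or Vertex AI API call. The word-overlap scoring stands in for a real retriever, and the document format (`id` and `text` keys) and function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, text: str) -> int:
    """Count tokens shared by query and passage (a stand-in for real ranking)."""
    query_counts = Counter(tokenize(query))
    text_counts = Counter(tokenize(text))
    return sum(min(count, text_counts[token]) for token, count in query_counts.items())


def retrieve(query: str, docs: list[dict], k: int = 2) -> list[dict]:
    """Return up to k passages with a non-zero score, best first."""
    ranked = sorted(docs, key=lambda doc: score(query, doc["text"]), reverse=True)
    return [doc for doc in ranked[:k] if score(query, doc["text"]) > 0]


def build_prompt(query: str, passages: list[dict]) -> str:
    """Assemble a prompt that asks the model to answer only from cited sources."""
    sources = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}"
    )
```

The string returned by `build_prompt` would then be sent to whichever model is in use. Because the proprietary passages are supplied at query time rather than folded into the model’s weights, the data can stay in the customer’s own store.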
Source Credit: https://cloud.google.com/blog/products/data-analytics/smarterx-uses-google-ai-and-data-tools-to-build-custom-llms/