
import json

import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

generation_config_json = {
    "temperature": 0,
    "top_k": 1,
    "max_output_tokens": 1024,
    "response_mime_type": "text/plain",
}
generation_config = GenerationConfig(
    **generation_config_json
)
tools = Tool.from_google_search_retrieval(
    grounding.GoogleSearchRetrieval(
        # Optional: for dynamic retrieval
        dynamic_retrieval_config=grounding.DynamicRetrievalConfig(
            dynamic_threshold=0.06,
        )
    )
)
Calling vertexai.init initializes the Generative AI client and ensures it authenticates properly against your Google Cloud project. The generation_config (here built from the generation_config_json dictionary) sets the model's output parameters, such as temperature (which controls randomness in generation) and max_output_tokens (which caps the response length). The tools object enables Google Search retrieval, so the model can query web sources for relevant information and ground its answers in them.
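Because generation_config_json is a plain dictionary, configs elsewhere in the article are derived from it by dict spreading, where keys listed later override earlier ones. A minimal pure-Python sketch of that pattern (no Vertex AI call involved; the variable names here are illustrative):

```python
# Base config, mirroring generation_config_json above.
base_config = {
    "temperature": 0,
    "top_k": 1,
    "max_output_tokens": 1024,
    "response_mime_type": "text/plain",
}

# Derive a JSON-output variant: spread the base first, then override.
json_config = {
    **base_config,
    "response_mime_type": "application/json",  # the later key wins
}

print(json_config["response_mime_type"])  # → application/json
print(json_config["temperature"])         # → 0
```

This is exactly how the map_generation_config_json dictionary in classify_claim below reuses the base settings while switching the output format to JSON.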
def classify_claim(claim, article_content=None, classify_disinformation=False, classify_hate_speech=False):
    claim = f"Claim: {claim}"
    if article_content:
        claim += f"\n\nArticle about the claim: {article_content}"

    classify_targets = ["Hoax", "Verified"]
    if classify_disinformation:
        classify_targets += ["Disinformation"]
    if classify_hate_speech:
        classify_targets += ["Hate Speech"]

    map_generation_config_json = {
        **generation_config_json,
        "response_schema": {
            "type": "object",
            "properties": {
                "classification": {
                    "type": "string",
                    "enum": classify_targets,
                },
                "justification": {
                    "type": "string",
                },
            },
        },
        "response_mime_type": "application/json",
    }
    map_generation_config = GenerationConfig(
        **map_generation_config_json
    )

    searcher_model = GenerativeModel(
        model_name="gemini-1.5-pro",
        generation_config=generation_config,
        system_instruction=(
            f"The user will give you a claim. Your task is to generate a detailed justification based on the sources found on Google, and to specify which of these the claim is: {classify_targets}. Make your point clear on which one is most probable.\n"
            "Make a query using the language used in the claim, because it might help.\n"
            "A generally helpful guideline is to search 'is {claim} a hoax?'\n"
        )
    )
    mapper_model = GenerativeModel(
        model_name="gemini-1.5-flash",
        generation_config=map_generation_config,
        system_instruction=(
            "You are a mapper system that maps the output conclusion of a hoax classifier system into JSON format. Map the provided conclusions into JSON."
        )
    )

    # safety_config is assumed to be defined elsewhere
    search_result = searcher_model.generate_content(
        claim,
        tools=tools,
        safety_settings=safety_config,
    )
    response = mapper_model.generate_content(
        search_result.text,
        safety_settings=safety_config,
    )

    result = json.loads(response.text)
    result["links"] = [
        dict(uri=chunk.web.uri, site=chunk.web.title)
        for chunk in search_result.candidates[0].grounding_metadata.grounding_chunks
    ]
    return result
The classify_claim function accepts a textual claim and optional parameters: article_content, plus boolean flags that add disinformation and hate-speech categories. It constructs a prompt that includes the user's claim and any relevant article text, then determines which classification categories (Hoax, Verified, and so on) should be considered. Two models are created: searcher_model, whose system instruction guides it to query Google and produce a detailed, grounded justification of its classification, and mapper_model, which converts that free-text conclusion into structured JSON constrained by a response schema. The final result is assembled into a Python dictionary containing the classification, the justification, and a collection of link metadata (sources) extracted from the searcher model's grounding metadata.
Source Credit: https://medium.com/google-cloud/hoax-classification-with-vertex-ai-and-vertex-ai-endpoints-420a2bb441ed