IBM Granite


Risk detection models for the enterprise - Granite 3.1

By NICK PLOWDEN posted Sun January 12, 2025 03:16 PM

  

The Granite Guardian models are a collection of models designed to detect risks in user prompts and LLM responses. Built on instruction fine-tuned Granite language models, they can help with risk detection along many key dimensions catalogued in the IBM Risk Atlas. The models are trained on unique data comprising human annotations from socioeconomically diverse annotators and synthetic data informed by internal red-teaming, and they outperform similar models on standard benchmarks.

Granite Guardian is useful for risk detection use cases that apply across a wide range of enterprise applications:

  • Detecting harm-related risks within prompt text or model responses (as guardrails); see the sketch after this list. These present two fundamentally different use cases: the former assesses user-supplied text, while the latter evaluates model-generated text.
  • RAG (retrieval-augmented generation) use case, where the guardian model assesses three key issues: context relevance (whether the retrieved context is relevant to the query), groundedness (whether the response is accurate and faithful to the provided context), and answer relevance (whether the response directly addresses the user’s query).
  • Function calling risk detection within agentic workflows, where Granite Guardian evaluates intermediate steps for syntactic and semantic hallucinations. This includes assessing the validity of function calls and detecting fabricated information, particularly during query translation.

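As a quick illustration, the sketch below screens a user prompt for harm-related risk with the 2B variant via Hugging Face Transformers. It is a minimal sketch, assuming the ibm-granite/granite-guardian-3.1-2b checkpoint and that its chat template accepts a guardian_config dictionary selecting the risk dimension; consult the model card for the authoritative parameter names and output parsing.

```python
# Sketch: screening a user prompt for harm-related risk with Granite Guardian.
# Assumes the Hugging Face checkpoint "ibm-granite/granite-guardian-3.1-2b" and
# that its chat template accepts a guardian_config dict selecting the risk
# dimension (see the model card for the authoritative usage).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-guardian-3.1-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# The text to screen: here a user prompt; a model response could be screened
# the same way by adding an "assistant" turn to the conversation.
messages = [{"role": "user", "content": "How can I figure out my coworker's home address?"}]

# Select the risk dimension to evaluate (assumed parameter and value names).
guardian_config = {"risk_name": "harm"}

input_ids = tokenizer.apply_chat_template(
    messages,
    guardian_config=guardian_config,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=20, do_sample=False)

# The guardian replies with a short verdict (e.g. "Yes" if the risk is present).
verdict = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(verdict.strip())
```

The same pattern should extend to the RAG and agentic checks by changing the risk name in guardian_config and supplying the retrieved context or tool output as additional conversation turns; the 8B variant swaps in by changing the model ID.
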
Granite Guardian is available in 2B and 8B variants. These enterprise-grade models are trained in a transparent manner, in accordance with IBM’s AI Ethics principles, and released under the Apache 2.0 license for research and commercial use.


#LLM