Remote OpenClaw

azure-ai-evaluation-py

DevOps & Cloud

v0.1.0

Benign

Azure AI Evaluation SDK for Python.

11.7K downloads · 1.7K installs · by @thegovind

Setup & Installation

Install command

clawhub install thegovind/azure-ai-evaluation-py

If the CLI is not installed:

npx clawhub@latest install thegovind/azure-ai-evaluation-py

Or install with OpenClaw CLI:

openclaw skills install thegovind/azure-ai-evaluation-py

Or paste the repo link into your assistant's chat:

https://github.com/openclaw/skills/tree/main/skills/thegovind/azure-ai-evaluation-py

What This Skill Does

Python SDK for evaluating generative AI applications using Azure OpenAI. Supports quality metrics (groundedness, relevance, coherence), NLP-based scoring (F1, BLEU, ROUGE), and safety evaluations (violence, hate, self-harm). Results can be logged to Azure AI Foundry for tracking across runs.

Combines quality, NLP, and safety evaluators in one SDK with direct Azure AI Foundry integration, eliminating the need to wire together separate scoring libraries.
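To make the NLP-based scoring category concrete, here is a minimal sketch of a token-overlap F1 score between a response and a reference answer. It uses only the standard library and is not the SDK's implementation; the function name `token_f1` and the whitespace tokenization are assumptions chosen for illustration.

```python
from collections import Counter


def token_f1(response: str, ground_truth: str) -> float:
    """Token-overlap F1 between a response and a reference answer.

    Hypothetical helper for illustration; the SDK's own F1 evaluator
    may tokenize and normalize text differently.
    """
    pred = response.lower().split()
    ref = ground_truth.lower().split()
    if not pred or not ref:
        return 0.0
    # Multiset intersection: a shared token counts only as often as it
    # appears in both the response and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)  # share of response tokens that match
    recall = overlap / len(ref)      # share of reference tokens recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("The cat sat on the mat", "The cat sat"))
```

Overlap metrics like this need no model deployment, which is why they are often paired with model-graded metrics such as groundedness rather than used instead of them.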

When to Use It

Scoring RAG pipeline responses for groundedness against source documents
Running safety checks on chatbot outputs before production deployment
Batch evaluating a dataset of query/response pairs with multiple metrics
Logging evaluation runs to Azure AI Foundry for regression tracking
Building custom domain-specific evaluators for specialized content
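The batch and custom-evaluator use cases above can be sketched with the standard library alone. The example below assumes a custom evaluator is a plain callable that takes keyword arguments such as `response` and returns a dict of scores; `AnswerLengthEvaluator` and `run_batch` are hypothetical names, and `run_batch` is a stand-in for a batch run, not the SDK's own entry point.

```python
import json
from statistics import mean


class AnswerLengthEvaluator:
    """Hypothetical custom evaluator: checks a response against a word budget."""

    def __init__(self, max_words: int = 50):
        self.max_words = max_words

    def __call__(self, *, response: str, **kwargs) -> dict:
        # Extra fields in the record (e.g. query) are accepted and ignored.
        words = len(response.split())
        return {
            "word_count": words,
            "within_budget": float(words <= self.max_words),
        }


def run_batch(jsonl_text: str, evaluators: dict) -> dict:
    """Apply every evaluator to every JSONL record and average the scores.

    Stand-in for illustration only; assumes each line is a JSON object
    whose keys match the evaluators' keyword arguments.
    """
    rows = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        scores = {}
        for name, evaluator in evaluators.items():
            for metric, value in evaluator(**record).items():
                scores[f"{name}.{metric}"] = value
        rows.append(scores)
    # Aggregate each metric as a mean over all rows.
    metrics = {key: mean(row[key] for row in rows) for key in rows[0]} if rows else {}
    return {"rows": rows, "metrics": metrics}


if __name__ == "__main__":
    data = "\n".join([
        json.dumps({"query": "What is Azure?", "response": "A cloud platform."}),
        json.dumps({"query": "Define RAG.", "response": "Retrieval augmented generation."}),
    ])
    result = run_batch(data, {"length": AnswerLengthEvaluator(max_words=10)})
    print(result["metrics"])
```

Keeping per-row scores next to the aggregated metrics makes it easier to see which individual responses caused a regression between runs.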

Example Workflow

Here's how your AI assistant might use this skill in practice.

INPUT

User asks: Score my RAG pipeline responses for groundedness against source documents.

AGENT

1. Scoring RAG pipeline responses for groundedness against source documents
2. Running safety checks on chatbot outputs before production deployment
3. Batch evaluating a dataset of query/response pairs with multiple metrics
4. Logging evaluation runs to Azure AI Foundry for regression tracking
5. Building custom domain-specific evaluators for specialized content

OUTPUT

Azure AI Evaluation SDK for Python.

Security Audits

VirusTotal: Benign

OpenClaw: Benign

These signals reflect official OpenClaw status values. A Suspicious status means the skill should be used with extra caution.

Details

Language: Markdown

Last updated: Feb 26, 2026

Similar Skills

azure-ai-voicelive-py

Build real-time voice AI applications.

agent-framework-azure-ai-py

Build Azure AI Foundry agents.

azd-deployment

Deploy containerized applications to Azure Container Apps.

azure-ai-transcription-py

Azure AI Transcription SDK for Python.