DevOps & Cloud | Markdown

azure-ai-evaluation-py

Azure AI Evaluation SDK for Python.

Installs: 1.7K
Stars: 1
Forks: 0
Updated: Feb 26, 2026

Install command

clawhub install thegovind/azure-ai-evaluation-py

What it does

Python SDK for evaluating generative AI applications using Azure OpenAI. Supports quality metrics (groundedness, relevance, coherence), NLP-based scoring (F1, BLEU, ROUGE), and safety evaluations (violence, hate, self-harm). Results can be logged to Azure AI Foundry for tracking across runs.
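
To make the NLP-based scoring concrete, here is a minimal standard-library sketch of a token-overlap F1 score between a response and a reference answer. It only illustrates what this kind of metric measures and is not the SDK's implementation; the function name `token_f1` and the lowercase, whitespace-based tokenization are assumptions made for this example.

```python
from collections import Counter


def token_f1(response: str, ground_truth: str) -> float:
    """Token-overlap F1 between a response and a reference answer.

    Illustrative only: tokenization here is lowercase + whitespace split,
    which is an assumption and may differ from what the SDK does.
    """
    pred = response.lower().split()
    ref = ground_truth.lower().split()
    if not pred or not ref:
        return 0.0

    # Multiset intersection: a shared token counts at most as many times
    # as it appears in both the response and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred)  # share of response tokens that match
    recall = overlap / len(ref)      # share of reference tokens recovered
    return 2 * precision * recall / (precision + recall)
```

An identical response and reference score 1.0, and a response with no shared tokens scores 0.0. Unlike the model-graded quality metrics, a score like this needs no model call, which makes it cheap to run over a whole dataset.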

Why it's useful

Combines quality, NLP, and safety evaluators in one SDK with direct Azure AI Foundry integration, eliminating the need to wire together separate scoring libraries.

Use cases

Scoring RAG pipeline responses for groundedness against source documents
Running safety checks on chatbot outputs before production deployment
Batch evaluating a dataset of query/response pairs with multiple metrics
Logging evaluation runs to Azure AI Foundry for regression tracking
Building custom domain-specific evaluators for specialized content
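
The last use case, a custom domain-specific evaluator, can be sketched as a plain callable that takes a response and returns a dictionary of metrics. The class `RequiredTermsEvaluator`, its `term_coverage` and `missing_terms` keys, and the sample rows are all hypothetical. The sketch deliberately avoids importing the SDK and loops over the rows locally; whether a callable of this shape can be passed unchanged to the SDK's batch evaluation entry point is an assumption to check against the current SDK documentation.

```python
class RequiredTermsEvaluator:
    """Hypothetical domain-specific evaluator.

    Scores a response by the fraction of required domain terms it mentions.
    """

    def __init__(self, required_terms):
        self._terms = [term.lower() for term in required_terms]

    def __call__(self, *, response: str, **kwargs):
        # Extra row fields (e.g. "query") are accepted and ignored via **kwargs.
        text = response.lower()
        missing = [term for term in self._terms if term not in text]
        if self._terms:
            coverage = 1 - len(missing) / len(self._terms)
        else:
            coverage = 1.0  # nothing required, so nothing can be missing
        return {"term_coverage": coverage, "missing_terms": missing}


# Hypothetical query/response pairs standing in for a real dataset.
rows = [
    {
        "query": "What should I know about drug X?",
        "response": "Typical dosage is 10 mg; side effects are rare.",
    },
    {
        "query": "How do I take drug X?",
        "response": "Take it with food.",
    },
]

evaluator = RequiredTermsEvaluator(["dosage", "side effects"])

# Local batch loop: one result dict per row, then a simple aggregate.
results = [evaluator(**row) for row in rows]
mean_coverage = sum(r["term_coverage"] for r in results) / len(results)
```

Returning a dictionary rather than a bare number lets one evaluator report a score alongside supporting detail, such as which terms were missing.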
