ML Eval Framework

A lightweight harness for running offline LLM evaluations.

Python, PyTorch, Hugging Face

The harness scores previously generated large language model outputs offline. It supports custom metrics, batch evaluation, and structured output logging.
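
As a rough illustration of how those three features could fit together, here is a minimal sketch in plain Python. Every name in it (`exact_match`, `batched`, `evaluate`, `to_jsonl`, and the `id`/`prediction`/`reference` record fields) is hypothetical and chosen for this example; none of it is this project's actual API. It assumes model outputs have already been generated, so a metric is just a function from a prediction and a reference to a score.

```python
import json
from typing import Callable, Dict, Iterator, List

# NOTE: all names here are hypothetical, for illustration only.
# A custom metric maps (prediction, reference) to a float score.
Metric = Callable[[str, str], float]


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def batched(items: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def evaluate(
    examples: List[dict],
    metrics: Dict[str, Metric],
    batch_size: int = 2,
) -> List[dict]:
    """Score every example with every metric, one batch at a time."""
    records = []
    for batch in batched(examples, batch_size):
        for example in batch:
            scores = {
                name: metric(example["prediction"], example["reference"])
                for name, metric in metrics.items()
            }
            records.append({"id": example["id"], **scores})
    return records


def to_jsonl(records: List[dict]) -> str:
    """Serialize records as JSON Lines, one structured log entry per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


# Assumed input format: precomputed predictions paired with references.
examples = [
    {"id": 0, "prediction": "Paris", "reference": "paris"},
    {"id": 1, "prediction": "Berlin", "reference": "Bonn"},
    {"id": 2, "prediction": " 42 ", "reference": "42"},
]

records = evaluate(examples, {"exact_match": exact_match})
print(to_jsonl(records))
```

Adding another metric under this design only means adding another entry to the `metrics` dictionary. JSON Lines is used for the log in this sketch because each record can be appended and parsed independently; a real harness might choose a different format.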
