ML Eval Framework
A lightweight harness for running offline LLM evaluations.
Python, PyTorch, HuggingFace
A lightweight evaluation harness for running offline assessments of large language model outputs. Supports custom metrics, batch evaluation, and structured output logging.
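As a rough illustration of how the three features named above could fit together, here is a minimal sketch using only the Python standard library. It is not this project's actual API: the names `Metric`, `exact_match`, `batched`, and `evaluate`, the record fields `id`, `prediction`, and `reference`, and the JSON Lines log format are all assumptions made for the example.

```python
import io
import json
from typing import Callable, Dict, Iterator, List, TextIO

# Hypothetical metric signature (an assumption, not the project's API):
# a metric maps (prediction, reference) to a float score.
Metric = Callable[[str, str], float]


def exact_match(prediction: str, reference: str) -> float:
    """Custom metric: 1.0 if the strings match after trimming and lowercasing."""
    return float(prediction.strip().lower() == reference.strip().lower())


def batched(items: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_size records."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def evaluate(
    records: List[dict],
    metrics: Dict[str, Metric],
    log: TextIO,
    batch_size: int = 2,
) -> Dict[str, float]:
    """Score stored model outputs batch by batch and log one JSON line per record."""
    totals = {name: 0.0 for name in metrics}
    for batch in batched(records, batch_size):
        for rec in batch:
            scores = {
                name: fn(rec["prediction"], rec["reference"])
                for name, fn in metrics.items()
            }
            # Structured output: one JSON object per evaluated record.
            log.write(json.dumps({"id": rec["id"], **scores}) + "\n")
            for name, score in scores.items():
                totals[name] += score
    count = len(records)
    # Return the mean score per metric (0.0 when there are no records).
    return {name: (total / count if count else 0.0) for name, total in totals.items()}


# Usage with made-up records; a real run would load stored model outputs.
records = [
    {"id": 0, "prediction": "Paris", "reference": "paris"},
    {"id": 1, "prediction": "Berlin", "reference": "Madrid"},
    {"id": 2, "prediction": " 42 ", "reference": "42"},
]
log = io.StringIO()  # stand-in for an open log file
summary = evaluate(records, {"exact_match": exact_match}, log)
print(summary)
```

Keeping metrics as plain callables means a new metric is just another function added to the dictionary passed to `evaluate`, and writing one JSON object per line keeps the log easy to filter or load into other tools later.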