GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #agent #agentops #agents_sdk #ai #anthropic #autogen #cost_estimation #crewai #evals #evaluation_metrics #groq #langchain #llm #mistral #ollama #openai #openai_agents

AgentOps is a tool for monitoring and improving AI agents. It provides session replays, cost management for large language model (LLM) calls, and security checks to prevent data leaks. The platform tracks how your agents perform, interact with users, and use external tools, so you can identify problems quickly, optimize agent performance, and check compliance with safety standards. It integrates with popular platforms such as OpenAI and AutoGen, which keeps setup simple[1][3][5].
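
To make the cost-management idea concrete, here is a minimal sketch of per-session LLM cost tracking. It does not use the AgentOps API: the `Session` and `LLMCall` classes, the `example-model` name, and the prices are all hypothetical, chosen only to illustrate how token counts turn into a running cost.

```python
from dataclasses import dataclass, field

# Hypothetical per-1K-token prices in USD (assumption for illustration only;
# real prices depend on the provider and model).
PRICES_PER_1K = {
    "example-model": {"prompt": 0.5, "completion": 1.5},
}


@dataclass
class LLMCall:
    """Token usage for a single model call."""
    model: str
    prompt_tokens: int
    completion_tokens: int


@dataclass
class Session:
    """Collects the calls made during one agent session."""
    calls: list = field(default_factory=list)

    def record(self, call: LLMCall) -> None:
        self.calls.append(call)

    def total_cost(self) -> float:
        """Sum prompt and completion costs over every recorded call."""
        total = 0.0
        for call in self.calls:
            price = PRICES_PER_1K[call.model]
            total += call.prompt_tokens / 1000 * price["prompt"]
            total += call.completion_tokens / 1000 * price["completion"]
        return total


session = Session()
session.record(LLMCall("example-model", prompt_tokens=2000, completion_tokens=1000))
print(f"Session cost: ${session.total_cost():.2f}")
```

A monitoring platform would typically collect these numbers automatically; the sketch only shows the arithmetic behind a per-session total.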

https://github.com/AgentOps-AI/agentops
#python #evaluation_framework #evaluation_metrics #llm_evaluation #llm_evaluation_framework #llm_evaluation_metrics

DeepEval is an open-source tool for testing and improving large language model (LLM) applications. It works much like Pytest does for regular software, but focuses on LLM outputs. It offers over 30 ready-to-use metrics (such as answer relevancy, faithfulness, and hallucination) to check whether your LLM is accurate, safe, and reliable. You can test your whole application or individual parts of it, and generate synthetic data for testing. DeepEval runs locally or in the cloud, letting you compare results, share reports, and keep improving your models, so you can build more trustworthy LLM apps with less effort[1][2][3].
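
The Pytest comparison can be illustrated with a toy example: a test function that scores an LLM answer with a metric and asserts that the score clears a threshold. This is not DeepEval's API. The `keyword_relevancy` function is a hypothetical stand-in for a real relevancy metric, and the 0.7 threshold is an arbitrary assumption.

```python
import re

# Small stop-word list for the toy metric (assumption, not exhaustive).
STOP_WORDS = {"the", "a", "an", "is", "of", "what", "to", "in"}


def keyword_relevancy(question: str, answer: str) -> float:
    """Toy metric: fraction of the question's content words found in the answer."""
    q_words = {
        w for w in re.findall(r"[a-z0-9]+", question.lower())
        if w not in STOP_WORDS
    }
    a_words = set(re.findall(r"[a-z0-9]+", answer.lower()))
    if not q_words:
        return 0.0
    return len(q_words & a_words) / len(q_words)


def test_answer_is_relevant():
    """Pytest-style check: fail if the answer scores below the threshold."""
    question = "What is the capital of France?"
    answer = "The capital of France is Paris."  # stands in for a model output
    assert keyword_relevancy(question, answer) >= 0.7
```

Running `pytest` on a file containing this function would report a pass or a failure just as for ordinary unit tests; a real evaluation framework would replace the keyword overlap with a far more robust metric.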

https://github.com/confident-ai/deepeval