MCP-Universe Tasks
Explore our comprehensive collection of benchmark tasks across 6 domains. Each task is designed to evaluate AI agents in real-world scenarios using actual MCP servers.
Domains: 6
MCP Servers: 11
Web Search
Advanced web search tasks requiring multi-step information retrieval, synthesis, and real-time data processing from various sources.
Browser Automation
Complex browser automation tasks involving real-time web interactions, request submissions, and dynamic content extraction.
Financial Analysis
Real-time financial data analysis, quantitative investing, market research, and investment calculations involving temporal dynamics and live market data.
Repository Management
Version control workflows, code repository management, and collaborative development tasks on platforms such as GitHub.
3D Design
Three-dimensional modeling and design tasks that use real Blender tools and must satisfy geometric constraints and design specifications.