MCP-Universe Tasks

Explore our comprehensive collection of benchmark tasks across 6 domains. Each task is designed to evaluate AI agents in real-world scenarios using actual MCP servers.

6 Domains
11 MCP Servers

Location Navigation

Real-world geospatial navigation tasks involving complex location queries, route planning, and geographic point calculations with actual map data.

Browser Automation

Complex browser automation tasks involving real-time web interactions, request submissions, and dynamic content extraction.

Financial Analysis

Real-time financial data analysis, quantitative investing, market research, and investment calculations involving temporal dynamics and live market data.

Repository Management

Version control workflows, code repository management, and collaborative development tasks on platforms such as GitHub.

3D Design

Three-dimensional modeling and design tasks using real Blender software tools with geometric constraints and design specifications.