The most powerful agent workflows rarely live in a single tool. They span systems, requiring the agent to pull a task from Linear, clone a repo from GitHub, execute code in a sandbox, post results to Slack, and file a detailed report in Google Docs. The sandbox acts as the isolated compute backbone, but its value multiplies when seamlessly orchestrated with every other system the agent needs to interact with.
This partnership between Arcade, the MCP runtime that handles authorization and tool execution for AI agents in production, and Daytona delivers that multi-system orchestration. The Daytona toolkit is available as an Arcade Optimized integration. That means it’s been through a rigorous engineering process: full integration test coverage, agent-tuned tool descriptions, and structured error handling that guarantee reliable, efficient execution across LLMs and agent frameworks. The result is fewer failed calls, fewer wasted tokens, and more predictable agent behavior.
46 Tools, Built for How Agents Think
The Daytona toolkit by Arcade ships with 46 tools that cover the full spectrum of the sandbox lifecycle.
Key capabilities include:
- Sandbox Management: Create, start, stop, archive, resize, delete, and configure sandboxes. Set policies like auto-stop and auto-delete, and use labels for organization.
- Code & Shell Execution: Run code directly in the sandbox interpreter or execute shell commands. Persistent sessions are supported for long-running processes like dev servers and multi-step builds.
- File System: Read, write, move, delete, search, and replace files. Create directories, get file metadata, and paginate large directory trees.
- Git Operations: Clone, branch, commit, push, pull, check status, and view logs, enabling full repository workflow support inside sandboxes.
- Snapshots: Create and manage sandbox templates for instantly reproducible environments.
- SSH & Networking: Generate SSH credentials, list active ports, and get preview URLs for web services running inside sandboxes.
The true breakthrough is not the tool count but the engineering design that makes these tools reliable inside complex agent workflows.
Designed for Agent Workflows
Building 46 tools is one challenge; making them work reliably inside long-running, multi-step autonomous agent workflows is the harder one. The toolkit includes several features aimed at that problem:
- Name-Based Sandbox Resolution: Agents can reference sandboxes naturally by their human-readable name, assigned at creation, instead of tracking shifting, complex IDs across a workflow.
- Fuzzy Matching with Suggestions: If an agent provides a sandbox name with a typo or truncation, the toolkit suggests the closest matches, allowing the agent to self-correct and continue without human intervention.
- Structured Error Recovery: Every error response is designed for LLM consumption: structured messages that clearly explain what went wrong and what the agent should try next. This allows agents to recover gracefully and keep workflows moving instead of hitting dead ends.
- Tool Descriptions Tuned for LLM Reasoning: Tool descriptions and parameter schemas follow the Agentic Tool Patterns methodology, purpose-built for how language models select and invoke tools. Clear descriptions lead to better tool selection, fewer hallucinated parameters, and more reliable multi-step execution.
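To make the name resolution, fuzzy matching, and structured error ideas above concrete, here is a minimal sketch in Python. It is not the toolkit's implementation: the `resolve_sandbox` function, the `{name: id}` mapping, and the response fields (`suggestions`, `next_step`) are all hypothetical, and the matching uses the standard library's `difflib` as a stand-in for whatever the toolkit actually does.

```python
from difflib import get_close_matches


def resolve_sandbox(name, sandboxes):
    """Map a human-readable sandbox name to its ID, or return a structured error.

    `sandboxes` is a hypothetical {name: id} mapping; the real toolkit's
    data model and response fields are not described in this post.
    """
    if name in sandboxes:
        return {"ok": True, "sandbox_id": sandboxes[name]}

    # difflib ranks candidates by similarity ratio; cutoff drops weak matches.
    suggestions = get_close_matches(name, list(sandboxes), n=3, cutoff=0.6)
    return {
        "ok": False,
        "error": f"No sandbox named '{name}'.",
        "suggestions": suggestions,
        # Tell the agent what to try next instead of leaving it at a dead end.
        "next_step": (
            f"Retry with one of: {', '.join(suggestions)}"
            if suggestions
            else "List sandboxes to see the available names."
        ),
    }


# Hypothetical names and IDs, used only for illustration.
known = {"ci-runner": "sbx-001", "docs-build": "sbx-002"}
result = resolve_sandbox("ci-runer", known)  # typo in the name
```

The point of the shape is that a failed lookup still carries enough information for the agent to retry on its own: the error names the problem, lists candidates, and states a next step.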
GitHub OAuth, Built In
Agent workflows involving sandboxes almost always require Git operations. The usual friction point is authentication: how does the agent securely get a valid GitHub token into the sandbox exactly when it’s needed?
The Daytona toolkit by Arcade solves this with just-in-time GitHub OAuth built directly into the Git tools. When an agent invokes a Git operation against a GitHub remote, Arcade handles the OAuth flow transparently. This means:
- No Personal Access Tokens (PATs) to manage
- No tokens to inject into environment variables
- No credential plumbing
The user authorizes once, and every subsequent Git operation across any sandbox just works. For non-GitHub remotes, explicit credentials remain supported as a fallback. This pattern is only possible because Arcade’s runtime manages both the tool execution and the auth layer, combining sandbox infrastructure with managed OAuth in a single, secure tool call.
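The credential decision described above (managed OAuth for GitHub remotes, explicit credentials as a fallback elsewhere) can be sketched roughly as follows. This is an illustration under stated assumptions, not Arcade's code: `pick_credentials`, the `get_oauth_token` callable, and the returned dictionary shape are hypothetical, and the sketch only handles HTTPS-style remote URLs.

```python
from urllib.parse import urlparse


def pick_credentials(remote_url, get_oauth_token, explicit=None):
    """Choose credentials for a Git remote at call time.

    `get_oauth_token` is a hypothetical callable standing in for the
    runtime's managed OAuth; `explicit` is an optional (username, password)
    pair supplied by the caller for non-GitHub remotes.
    """
    # Only URL-style remotes (https://...) are parsed here; SSH shorthand
    # such as git@host:org/repo.git yields no hostname and falls through.
    host = urlparse(remote_url).hostname or ""
    if host == "github.com" or host.endswith(".github.com"):
        # The token is requested only when the operation runs, so nothing
        # long-lived needs to sit in an environment variable.
        return {"method": "oauth", "token": get_oauth_token()}
    if explicit is not None:
        username, password = explicit
        return {"method": "explicit", "username": username, "password": password}
    return {"method": "none"}


# Hypothetical usage with a stub token provider.
github = pick_credentials("https://github.com/org/repo.git", lambda: "stub-token")
other = pick_credentials(
    "https://git.example.com/org/repo.git", lambda: "stub-token", explicit=("user", "secret")
)
```

Keeping the decision in one place is what lets the agent call a Git tool without knowing which authentication path applies.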
What You Can Build
Daytona sandboxes, combined with Arcade’s runtime and catalog of high-quality tools (GitHub, Linear, Slack, Google Docs, and many more), unlock fully autonomous development workflows:
- Intelligent CI: Analyze a Pull Request (PR) diff, run only the affected tests in parallel Daytona sandboxes, and post actionable failure explanations (not raw logs) back to the PR.
- Automated Regression Bisect: Given a failing test, automatically bisect commit history by spinning up a sandbox per revision in parallel. Identify the exact breaking commit and generate a human-readable root-cause explanation.
- TODO-to-PR: Systematically retire technical debt by scanning a repository for TODO comments, spinning up isolated sandboxes to implement each one, running tests, and opening focused pull requests.
- Continuous Refactoring: Detect complexity hotspots, apply refactorings in isolated sandboxes, verify tests pass, and open small PRs on a recurring schedule, making refactoring a continuous background process rather than an episodic project.
- Safe Agent Development: Give agents full autonomy inside disposable sandboxes—cloning repos, installing dependencies, and running arbitrary code—with zero risk to production. Arcade’s OAuth layer ensures no long-lived tokens are exposed in environment variables.
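As a rough illustration of the Automated Regression Bisect idea above, the sketch below checks every revision concurrently and reports the first one that fails. Note that this is a parallel scan rather than a classic binary-search bisect, chosen because the workflow describes one sandbox per revision. The `test_passes` callable is hypothetical; in a real workflow it would create a sandbox, check out the commit, run the test, and return the outcome.

```python
from concurrent.futures import ThreadPoolExecutor


def find_breaking_commit(commits, test_passes, max_workers=4):
    """Return the first commit (ordered oldest to newest) where the test fails.

    `test_passes(commit)` is a hypothetical callable that would run the test
    in a fresh sandbox at that revision and return True or False.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One check per revision, run concurrently; map preserves input order.
        results = list(pool.map(test_passes, commits))
    for commit, passed in zip(commits, results):
        if not passed:
            return commit
    return None  # the test passed at every revision


# Hypothetical history in which the regression lands at "c3".
history = ["a1", "b2", "c3", "d4"]
broken = {"c3", "d4"}
culprit = find_breaking_commit(history, lambda c: c not in broken)
```

For long histories, a binary search over the same `test_passes` callable would need fewer sandbox runs at the cost of less parallelism.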
Getting Started
The Daytona toolkit is available now as an Arcade Optimized integration. Sign up for Arcade to start building with Daytona tools in your AI agents.
Resources