How it works
Every major LLM provider supports tool calling (also called function calling). You define a tool called execute_python that accepts a code parameter. When the LLM decides it needs to compute something, it returns a tool call instead of a text response. Your code executes that tool call inside a BoxLite CodeBox and feeds the output back to the LLM, which then formulates the final answer.
The pattern is the same regardless of provider:
- Define the tool schema (what “execute code” means)
- Send messages to the LLM with the tool available
- When the LLM returns a tool call, run the code in BoxLite
- Send the result back to the LLM
- Repeat until the LLM responds with text
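The steps above can be sketched provider-agnostically. In this hedged, stdlib-only sketch, `call_model` stands in for the provider SDK and `execute` stands in for running code in the BoxLite CodeBox; the message shapes are simplified placeholders, not any provider's wire format:

```python
# Tool schema: what "execute code" means to the model (JSON Schema).
EXECUTE_PYTHON = {
    "name": "execute_python",
    "description": "Run Python code and return its output.",
    "parameters": {
        "type": "object",
        "properties": {"code": {"type": "string"}},
        "required": ["code"],
    },
}

def tool_loop(call_model, execute, question):
    """Generic tool loop. call_model(messages) returns either
    ('tool_call', code) or ('text', answer); execute(code) returns output."""
    messages = [{"role": "user", "content": question}]
    while True:
        kind, payload = call_model(messages)
        if kind == "text":                       # plain text: the final answer
            return payload
        messages.append({"role": "assistant", "tool_call": payload})
        output = execute(payload)                # run the code in the sandbox
        messages.append({"role": "tool", "content": output})

# Demo with a scripted model: first turn requests code, second turn answers.
def scripted_model(messages):
    if messages[-1]["role"] == "user":
        return ("tool_call", "print(2 + 2)")
    return ("text", f"The result is {messages[-1]['content'].strip()}.")

print(tool_loop(scripted_model, lambda code: "4\n", "What is 2 + 2?"))
# → The result is 4.
```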
Prerequisites
Install the SDK for the provider you are using: OpenAI, Anthropic, or the Vercel AI SDK.
OpenAI
openai_sandbox.py
Anthropic
Anthropic uses a different tool schema format (input_schema instead of parameters) and a different response structure (stop_reason instead of finish_reason, content blocks instead of tool_calls).
anthropic_sandbox.py
Vercel AI SDK
The Vercel AI SDK provides a unified tool() helper that handles schema validation and execution in one place. The maxSteps parameter replaces the manual while loop — the SDK automatically re-calls the model when a tool result is returned.
vercel-ai-sandbox.ts
Multiple questions
The CodeBox persists across calls, so installed packages and files on disk carry over between tool invocations within the same conversation. Note that in-memory state (variables, functions) does not persist — each run() is a separate Python process.
multi_question.py
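Because each run() is a fresh Python process sharing the same on-disk workspace, the persistence rules can be illustrated with nothing but the standard library — this sketch substitutes subprocess for BoxLite:

```python
import os
import subprocess
import sys
import tempfile

def run(code, cwd):
    """Mimic CodeBox.run(): each call is a fresh interpreter sharing a disk."""
    out = subprocess.run([sys.executable, "-c", code], cwd=cwd,
                         capture_output=True, text=True)
    return out.stdout + out.stderr

with tempfile.TemporaryDirectory() as box:
    run("open('data.txt', 'w').write('42')", box)       # files persist...
    print(run("print(open('data.txt').read())", box))   # → 42
    run("x = 1", box)
    print("NameError" in run("print(x)", box))          # → True: variables do not
```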

