Code & Files¶
Tools that interact with the local filesystem and runtime environment. Use these when the agent needs to do work rather than just talk about it.
| Tool | Tool ID | Purpose |
|---|---|---|
| `BashTool` | `bash` | Run bounded shell commands under `project_root` |
| `FileReadTool` | `read_file` | Read project files with optional line ranges |
| `EditFileTool` | `edit_file` | Apply exact string replacement patches to existing files |
| `FileWriteTool` | `write_file` | Create or overwrite project files |
| `GlobSearchTool` | `glob_files` | Find files by glob pattern under `project_root` |
| `GrepSearchTool` | `grep_files` | Search file contents with ripgrep or a Python fallback |
| `CodeExecutionTool` | `run_code` | Execute Python or shell code in a local subprocess |
| `WorkspaceFilesTool` | `workspace_files` | Read, write, list, and inspect files |
| `MemoryTool` | `memory` | Store and retrieve persistent memory facts |
| `ArtifactBuilderTool` | `build_artifact` | Create named artifacts (markdown, JSON, code files) |
The built-in project tools use project_root="/tmp" by default. Override this on Agent or DeepAgent when you want them scoped to a repo checkout instead.
See the dedicated prompt pages for the exact shipped instructions:
run_code¶
Class: CodeExecutionTool
Module: shipit_agent.tools.code_execution
Tool ID: run_code
Executes Python or shell code in a local subprocess workspace. Captures stdout, stderr, and exit code. Times out after a configurable wall-clock limit.
When to use¶
- Math and data analysis the LLM shouldn't do in its head (`17 * 23` is fine; matrix algebra isn't)
- File processing: parse a CSV, transform JSON, run a regex
- Verifying claims by running the actual code
- Building small one-off scripts the agent then executes
Schema¶
{
  "name": "run_code",
  "parameters": {
    "type": "object",
    "properties": {
      "language": { "type": "string", "enum": ["python", "bash"] },
      "code": { "type": "string", "description": "The code to execute" },
      "timeout": { "type": "number", "description": "Wall-clock timeout in seconds (default 30)" }
    },
    "required": ["code"]
  }
}
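To make the schema concrete, here is a minimal sketch that checks a tool-call payload by hand. `validate_run_code_args` is a hypothetical helper written for this page, not part of shipit_agent, and whether the library validates arguments this way is not stated here.

```python
# Hypothetical helper: checks arguments against the run_code schema shown above.
ALLOWED_LANGUAGES = {"python", "bash"}


def validate_run_code_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload is valid."""
    problems = []
    # "code" is the only required property.
    if not isinstance(args.get("code"), str):
        problems.append("'code' is required and must be a string")
    # "language" is optional, but must be one of the enum values when present.
    if "language" in args and args["language"] not in ALLOWED_LANGUAGES:
        problems.append("'language' must be 'python' or 'bash'")
    # "timeout" is optional and must be numeric (bool is excluded on purpose).
    if "timeout" in args:
        timeout = args["timeout"]
        if isinstance(timeout, bool) or not isinstance(timeout, (int, float)):
            problems.append("'timeout' must be a number of seconds")
    return problems


print(validate_run_code_args({"language": "python", "code": "print(17 * 23)"}))  # []
print(validate_run_code_args({"language": "ruby"}))  # two problems reported
```

A check like this is mainly useful in tests or when replaying logged tool calls.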
Configuration¶
from shipit_agent import CodeExecutionTool

tool = CodeExecutionTool(
    workspace_root=".shipit_workspace/code_execution",  # where files live
    timeout=30.0,                                       # default per-call timeout
    python_executable="python3",                        # which Python to use
    allow_shell=True,                                   # set False to disable bash
)
Example¶
from shipit_agent import Agent, CodeExecutionTool
from shipit_agent.llms import OpenAIChatLLM

agent = Agent(
    llm=OpenAIChatLLM(model="gpt-4o-mini"),
    tools=[CodeExecutionTool()],
)
result = agent.run(
    "Calculate the standard deviation of [12, 17, 23, 31, 42, 58]. "
    "Use the code interpreter to verify."
)
Output structure¶
ToolOutput.text is the captured stdout. ToolOutput.metadata contains:
| Field | Type | Description |
|---|---|---|
| `language` | `str` | `"python"` or `"bash"` |
| `exit_code` | `int` | Subprocess exit code |
| `stdout` | `str` | Captured standard output |
| `stderr` | `str` | Captured standard error |
| `duration_seconds` | `float` | Wall-clock execution time |
| `timed_out` | `bool` | `True` if killed by timeout |
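The metadata fields above map naturally onto a plain `subprocess.run` call. The following is a minimal standard-library sketch of that idea, not the `CodeExecutionTool` implementation: `run_python` and `_as_text` are hypothetical names, and the `-1` exit code reported on timeout is an assumption.

```python
import subprocess
import sys
import time


def _as_text(value) -> str:
    """TimeoutExpired may carry None or bytes; normalise to str."""
    if value is None:
        return ""
    return value.decode(errors="replace") if isinstance(value, bytes) else value


def run_python(code: str, timeout: float = 30.0) -> dict:
    """Run a Python snippet in a child process and report the fields listed above."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        exit_code, stdout, stderr, timed_out = proc.returncode, proc.stdout, proc.stderr, False
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before re-raising; -1 is an assumed placeholder.
        exit_code, stdout, stderr, timed_out = -1, _as_text(exc.stdout), _as_text(exc.stderr), True
    return {
        "language": "python",
        "exit_code": exit_code,
        "stdout": stdout,
        "stderr": stderr,
        "duration_seconds": time.monotonic() - started,
        "timed_out": timed_out,
    }


result = run_python("print(17 * 23)")
print(result["exit_code"], result["stdout"].strip())  # 0 391
```

Checking `timed_out` before `exit_code` avoids mistaking a killed process for an ordinary failure.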
Security notes¶
⚠️ run_code runs untrusted code in a subprocess. It's not a security sandbox. The subprocess inherits your environment variables, can access the filesystem under workspace_root, and can make network requests.
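The environment-variable point is easy to verify with the standard library. This sketch shows that a child process sees the parent's variables by default, and that passing an explicit `env` hides them. `DEMO_API_KEY` is a made-up variable for the demonstration, and whether `CodeExecutionTool` exposes an option for a custom environment is not covered here.

```python
import os
import subprocess
import sys

# Hypothetical secret, set only for this demonstration.
os.environ["DEMO_API_KEY"] = "s3cret"

probe = "import os; print(os.environ.get('DEMO_API_KEY', 'missing'))"

# Default behaviour: the child inherits the parent's environment.
inherited = subprocess.run(
    [sys.executable, "-c", probe], capture_output=True, text=True
).stdout.strip()

# Passing env= replaces the environment, so the secret is no longer visible.
# PATH (and SYSTEMROOT on Windows) are kept so the interpreter can still start.
minimal_env = {k: os.environ[k] for k in ("PATH", "SYSTEMROOT") if k in os.environ}
scrubbed = subprocess.run(
    [sys.executable, "-c", probe], capture_output=True, text=True, env=minimal_env
).stdout.strip()

print(inherited)  # s3cret
print(scrubbed)   # missing
```

Scrubbing the environment narrows what leaked code can read, but it does not restrict filesystem or network access.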
For production deployments where the LLM is exposed to untrusted prompts:
- Run shipit-agent inside a Docker container with no host filesystem mounts
- Restrict the workspace to a tmpfs volume
- Drop network access from the container
- Set `allow_shell=False` if you don't need bash
- Consider running each agent invocation in a fresh container
For local dev and trusted internal use, the default config is fine.
workspace_files¶
Class: WorkspaceFilesTool
Module: shipit_agent.tools.workspace_files
Tool ID: workspace_files
Read, write, list, and inspect files in a scoped workspace directory. Supports text and binary modes, append-or-overwrite semantics, and recursive listing.
When to use¶
- The agent needs to stash intermediate results between tool calls
- A tool produced output that another tool needs to read later
- You want to persist artifacts across agent runs
- Multi-step workflows where data flows through files
Schema¶
{
  "name": "workspace_files",
  "parameters": {
    "type": "object",
    "properties": {
      "action": { "type": "string", "enum": ["read", "write", "append", "list", "delete", "exists"] },
      "path": { "type": "string", "description": "Relative path within the workspace" },
      "content": { "type": "string", "description": "Content to write (for write/append actions)" }
    },
    "required": ["action", "path"]
  }
}
Configuration¶
from shipit_agent import WorkspaceFilesTool

tool = WorkspaceFilesTool(
    root_dir=".shipit_workspace",   # workspace root; all paths are resolved relative to this
    max_file_size_bytes=1_000_000,  # cap on read/write sizes
)
Example¶
from shipit_agent import Agent, WorkspaceFilesTool, CodeExecutionTool
from shipit_agent.llms import OpenAIChatLLM

agent = Agent(
    llm=OpenAIChatLLM(model="gpt-4o-mini"),
    tools=[WorkspaceFilesTool(), CodeExecutionTool()],
)
result = agent.run(
    "Generate a CSV of the first 100 prime numbers, save it to primes.csv, "
    "then read it back and tell me the sum."
)
# The agent will:
# 1. run_code → generate primes
# 2. workspace_files write → save to primes.csv
# 3. workspace_files read → load it back
# 4. run_code → compute the sum
Path safety¶
The tool rejects paths outside root_dir — attempts at ../../../etc/passwd get an error. All paths are resolved with Path.resolve() and checked against the workspace root.
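The resolve-then-check approach described above can be sketched in a few lines. `resolve_in_workspace` is a hypothetical function illustrating the technique, not the tool's actual code, and the error type it raises is an assumption.

```python
from pathlib import Path


def resolve_in_workspace(root_dir: str, relative_path: str) -> Path:
    """Resolve relative_path under root_dir, rejecting anything that escapes it."""
    root = Path(root_dir).resolve()
    # Joining an absolute path discards root, so that case is caught below as well.
    candidate = (root / relative_path).resolve()
    # is_relative_to (Python 3.9+) is False once ".." or a symlink leaves root.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {relative_path}")
    return candidate


print(resolve_in_workspace(".shipit_workspace", "primes.csv").name)  # primes.csv

try:
    resolve_in_workspace(".shipit_workspace", "../../../etc/passwd")
except ValueError as exc:
    print(exc)  # path escapes workspace: ../../../etc/passwd
```

Resolving before comparing matters: a plain string-prefix check on the unresolved path would let `..` segments through.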
memory¶
Class: MemoryTool
Module: shipit_agent.tools.memory
Tool ID: memory
Stores and retrieves structured memory facts that persist across turns within a session and (optionally) across sessions when paired with FileMemoryStore.
When to use¶
- The user told the agent something earlier (their name, preferences, project context) and you want it remembered
- The agent learned a fact from a tool call that should be available in future runs
- You're building a long-running assistant that needs persistent state
Schema¶
{
  "name": "memory",
  "parameters": {
    "type": "object",
    "properties": {
      "action": { "type": "string", "enum": ["store", "retrieve", "list", "search"] },
      "fact": { "type": "string", "description": "The fact to store (for store action)" },
      "category": { "type": "string", "description": "Category tag for the fact" },
      "query": { "type": "string", "description": "Search query (for search action)" }
    },
    "required": ["action"]
  }
}
Configuration¶
The tool reads/writes to context.state["memory_store"] which the runtime populates from agent.memory_store. Configure at the Agent level:
from shipit_agent import Agent, FileMemoryStore, MemoryTool
from shipit_agent.llms import OpenAIChatLLM

agent = Agent(
    llm=OpenAIChatLLM(model="gpt-4o-mini"),
    tools=[MemoryTool()],
    memory_store=FileMemoryStore(root=".shipit_memory"),  # persistent across runs
)
Example¶
# Turn 1
agent.run("My name is Alice and I prefer markdown over plain text.")
# Agent stores: name="Alice", format_preference="markdown"
# Turn 2 (even after restart)
agent.run("Send me a summary of the latest news.")
# Agent recalls Alice's preference and returns markdown
Notes¶
- The runtime auto-stores tool results as memory facts after every run, so even without explicit `memory` calls, your tool outputs become recallable context for future turns
- The `MemoryTool.search` action does substring matching by default. For true semantic search, swap in a custom `MemoryStore` backed by a vector DB
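To illustrate what substring matching does and does not find, here is a small stand-in store. `InMemoryFacts` is a hypothetical class for this example and does not reflect the shipit_agent `MemoryStore` interface; the case-insensitive comparison is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryFacts:
    """Hypothetical stand-in for a memory store; not the shipit_agent API."""

    facts: list = field(default_factory=list)

    def store(self, fact: str, category: str = "general") -> None:
        self.facts.append({"fact": fact, "category": category})

    def search(self, query: str) -> list:
        # Case-insensitive substring match: finds literal text, not meaning.
        needle = query.lower()
        return [f["fact"] for f in self.facts if needle in f["fact"].lower()]


memory = InMemoryFacts()
memory.store("User's name is Alice", category="profile")
memory.store("User prefers markdown over plain text", category="preferences")

print(memory.search("markdown"))       # ['User prefers markdown over plain text']
print(memory.search("output format"))  # [] (related in meaning, no literal match)
```

The second query is the case a vector-backed store would be expected to handle, since the stored fact never contains the words "output format".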
build_artifact¶
Class: ArtifactBuilderTool
Module: shipit_agent.tools.artifact_builder
Tool ID: build_artifact
Creates a named artifact — a markdown report, JSON blob, code file, or any other deliverable — and saves it to the workspace with structured metadata. The runtime tracks artifacts on AgentResult.artifacts for downstream consumers.
When to use¶
- The agent's final output is a file (a report, a generated codebase, a data export)
- You want the artifact tracked separately from the conversation history
- The user wants to download the deliverable, not just read it inline
Schema¶
{
  "name": "build_artifact",
  "parameters": {
    "type": "object",
    "properties": {
      "name": { "type": "string", "description": "Artifact name (without extension)" },
      "format": { "type": "string", "enum": ["markdown", "json", "code", "html", "text"] },
      "content": { "type": "string", "description": "The artifact body" },
      "description": { "type": "string", "description": "What this artifact is" }
    },
    "required": ["name", "format", "content"]
  }
}
Configuration¶
from shipit_agent import ArtifactBuilderTool

tool = ArtifactBuilderTool(
    workspace_root=".shipit_workspace/artifacts",
)
Example¶
from shipit_agent import Agent, ArtifactBuilderTool
from shipit_agent.llms import OpenAIChatLLM

agent = Agent(
    llm=OpenAIChatLLM(model="gpt-4o-mini"),
    tools=[ArtifactBuilderTool()],
)
result = agent.run(
    "Write a 500-word markdown report on Python type hints and save it as "
    "an artifact named 'python-types-report'."
)
# The artifact is now on result.artifacts
for artifact in result.artifacts:
    print(f"{artifact.name}.{artifact.format} → {artifact.path}")
    print(f"  size: {len(artifact.content)} bytes")
    print(f"  description: {artifact.description}")
Output¶
ToolOutput.metadata contains:
| Field | Type | Description |
|---|---|---|
| `name` | `str` | Artifact name |
| `format` | `str` | One of `markdown`, `json`, `code`, `html`, `text` |
| `path` | `str` | Filesystem path where it was saved |
| `size_bytes` | `int` | Content size |
| `description` | `str` | Optional description |
The artifact is also added to RuntimeState.artifacts and surfaces on AgentResult.artifacts for the caller to enumerate.
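As a rough picture of how such metadata could be produced, the sketch below writes content to disk and returns a dictionary with the same field names. `save_artifact` and the `EXTENSIONS` mapping are hypothetical (in particular, mapping `code` to `.py` is an assumption), and none of this is taken from the `ArtifactBuilderTool` source.

```python
import tempfile
from pathlib import Path

# Assumed mapping from the schema's format values to file extensions.
EXTENSIONS = {"markdown": ".md", "json": ".json", "code": ".py", "html": ".html", "text": ".txt"}


def save_artifact(workspace_root: str, name: str, fmt: str, content: str, description: str = "") -> dict:
    """Write content to disk and return metadata shaped like the table above."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt}")
    root = Path(workspace_root)
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{name}{EXTENSIONS[fmt]}"
    data = content.encode("utf-8")
    path.write_bytes(data)
    return {
        "name": name,
        "format": fmt,
        "path": str(path),
        "size_bytes": len(data),  # encoded size; can exceed len(content) for non-ASCII text
        "description": description,
    }


with tempfile.TemporaryDirectory() as tmp:
    meta = save_artifact(tmp, "python-types-report", "markdown", "# Type hints\n", "Demo report")
    print(meta["format"], meta["size_bytes"])  # markdown 13
```

Measuring the encoded bytes rather than the string length keeps `size_bytes` accurate when the content contains multi-byte characters.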
Common patterns¶
Build → execute → verify pipeline¶
Useful for code generation tasks: the agent writes code, runs it, and verifies the output before claiming the task is done.
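The loop can be pictured with the standard library alone. This is a minimal sketch in which `build_execute_verify` is a hypothetical function; which tools the agent chains for each step is not specified here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_execute_verify(source: str, expected_stdout: str) -> bool:
    """Write source to a file, run it, and compare stdout with the expected text."""
    with tempfile.TemporaryDirectory() as workspace:
        # Build: persist the generated code.
        script = Path(workspace) / "generated.py"
        script.write_text(source, encoding="utf-8")
        # Execute: run it in a child process with a timeout.
        proc = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=30,
        )
        # Verify: require a clean exit and the expected output.
        return proc.returncode == 0 and proc.stdout.strip() == expected_stdout


code = "print(sum(n * n for n in range(1, 4)))"
print(build_execute_verify(code, "14"))  # True
```

The point of the verify step is that the task is reported as done only after the observed output matches, not merely after the code has been written.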
Stash → process → report pipeline¶
web_search + open_url (gather raw data)
↓
workspace_files write (stash to disk)
↓
run_code (process the data)
↓
build_artifact (final report)
Useful for data analysis tasks where intermediate results are too big to keep in conversation context.
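The same flow, reduced to plain file operations, looks roughly like this. It is an illustrative sketch only: the file names are made up, the sample values are borrowed from the `run_code` example, and no shipit_agent tools are called.

```python
import json
import statistics
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workspace:
    root = Path(workspace)

    # Stash: raw data goes to disk instead of staying in the conversation.
    raw = [12, 17, 23, 31, 42, 58]
    (root / "raw.json").write_text(json.dumps(raw), encoding="utf-8")

    # Process: a later step reads the file back and computes on it.
    values = json.loads((root / "raw.json").read_text(encoding="utf-8"))
    summary = {"count": len(values), "mean": statistics.mean(values)}

    # Report: the final deliverable is written as its own file.
    report = f"# Summary\n\n- count: {summary['count']}\n- mean: {summary['mean']}\n"
    (root / "report.md").write_text(report, encoding="utf-8")

print(report)
```

Only the short report needs to return to the conversation; the raw data stays on disk.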