Building Yet Another Coding Agent — Part 1
At my current job, we build a full-stack data analytics platform. A big chunk of the work is integrating external APIs. Authentication, pagination, rate limits, retries: most of it is boilerplate. We've built frameworks to handle that. But the core fetch logic for each integration still has to be written by hand.
We get a Postman collection, some docs, some config. An engineer reads through it and translates it to code. Every time.
I wanted to see if a coding agent could handle that. Not the fancy multi-agent kind, just the basics. Understand the input, generate code, run it, fix it.
messages = []
while True:
    user_input = get_input()
    messages.append({"role": "user", "content": user_input})
    response = llm.complete(messages, tools)
    print(response)

The LLM doesn't write files or run code. It just decides what should happen. Your code does the actual work. The LLM talks, you execute, feed the result back, repeat.
The interesting part is the tool use. The model returns a tool call: create_file, run_file, whatever, with arguments. You execute it, send the output back, and the model reacts.
Tools
Four tools are enough to start: read, create, modify, run.
import os
import subprocess
import sys

def read_file(filename: str) -> str:
    """
    Reads the content of a file. Use this to inspect existing code or output.
    """
    try:
        with open(filename, 'r') as f:
            return f"Content of {filename}:\n{f.read()}"
    except FileNotFoundError:
        return f"Error: {filename} not found."
    except Exception as e:
        return f"Error reading file: {e}"

def create_file(filename: str, content: str) -> str:
    """
    Creates a new file with the given content. Use this to write new Python scripts.
    """
    try:
        if os.path.dirname(filename):
            os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, 'w') as f:
            f.write(content)
        return f"Created {filename}."
    except Exception as e:
        return f"Error creating file: {e}"

def modify_file(filename: str, old_content: str, new_content: str) -> str:
    """
    Replaces a specific block of text in a file. Prefer this for targeted edits.
    old_content must match exactly.
    """
    try:
        with open(filename, 'r') as f:
            current = f.read()
        if old_content not in current:
            return f"Error: Could not find the text to replace in {filename}."
        updated = current.replace(old_content, new_content, 1)
        with open(filename, 'w') as f:
            f.write(updated)
        return f"Modified {filename}."
    except FileNotFoundError:
        return f"Error: {filename} not found."
    except Exception as e:
        return f"Error modifying file: {e}"

def run_file(filename: str) -> str:
    """
    Executes a Python file and returns stdout and stderr.
    Use this after writing or modifying code to verify it works.
    """
    try:
        result = subprocess.run(
            [sys.executable, filename],
            capture_output=True,
            text=True,
            timeout=30
        )
        output = []
        if result.stdout:
            output.append(f"stdout:\n{result.stdout}")
        if result.stderr:
            output.append(f"stderr:\n{result.stderr}")
        output.append(f"Exit code: {result.returncode}")
        return "\n".join(output)
    except subprocess.TimeoutExpired:
        return "Error: Script timed out after 30 seconds."
    except Exception as e:
        return f"Error running file: {e}"

One thing worth noting on modify_file: I made it search-and-replace instead of append. Append sounds simpler, but the model quickly produces broken code when it can only add to the end. Search-and-replace lets it fix exactly what's wrong.
The docstrings matter. The model reads them to decide when and how to call each tool. Vague descriptions lead to bad decisions: calling run_file before writing anything, rewriting whole files when one line needs changing.
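For instance, compare a vague description with one that steers the model (hypothetical wording, not a claim about any particular model's behavior):

```python
# Vague: the model has no signal for when to reach for this tool.
vague = {"name": "run_file", "description": "Runs a file."}

# Specific: states when to use it and what comes back.
specific = {
    "name": "run_file",
    "description": ("Executes a Python file and returns stdout, stderr, and "
                    "the exit code. Use this after writing or editing code "
                    "to verify it works."),
}
```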
Tool schema
You can't pass Python functions directly to the LLM. You describe each tool in a schema, and the SDK uses that to tell the model what's available.
tools = [
    {
        "name": "read_file",
        "description": "Reads the content of a file. Use this to inspect existing code or output.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string", "description": "Path to the file."}
            },
            "required": ["filename"]
        }
    },
    {
        "name": "create_file",
        "description": "Creates a new file with the given content. Use this to write new Python scripts.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "content": {"type": "string", "description": "Full file content."}
            },
            "required": ["filename", "content"]
        }
    },
    {
        "name": "modify_file",
        "description": "Replaces a specific block of text in a file. Prefer this for targeted edits.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "old_content": {"type": "string", "description": "The exact text to replace."},
                "new_content": {"type": "string", "description": "The replacement text."}
            },
            "required": ["filename", "old_content", "new_content"]
        }
    },
    {
        "name": "run_file",
        "description": "Executes a Python file. Use this after writing or editing to verify the code works.",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"}
            },
            "required": ["filename"]
        }
    }
]

Wiring it up
The loop needs conversation history. Without it, the model forgets what code it wrote two turns ago. Every message goes into `messages`, including tool results.
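Concretely, after one tool round-trip the history looks something like this (illustrative IDs and content, not real API output):

```python
# Shape of the history after one tool round-trip. The assistant's tool_use
# block and our tool_result must stay paired by id, in adjacent messages.
messages = [
    {"role": "user", "content": "Write fetch_posts.py"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "create_file",
         "input": {"filename": "fetch_posts.py", "content": "..."}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01",
         "content": "Created fetch_posts.py."},
    ]},
]
```

Note the tool result goes back as a user message. From the model's perspective, tool output is just another thing the outside world told it.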
import anthropic
import json

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a coding agent helping write Python data integration scripts.
When asked to write code:
1. Create the file using create_file.
2. Run it using run_file.
3. If there are errors, use modify_file to fix them and run again.
4. Tell the user what you did and show relevant output.
Prefer targeted edits over full rewrites.
"""

tool_map = {
    "read_file": read_file,
    "create_file": create_file,
    "modify_file": modify_file,
    "run_file": run_file,
}

def run_agent():
    messages = []
    print("Agent ready. Ctrl+C to exit.\n")
    while True:
        user_input = input("You: ").strip()
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        # inner loop: keep going until the model stops calling tools
        while True:
            response = client.messages.create(
                model="claude-opus-4-5",
                max_tokens=4096,
                system=SYSTEM_PROMPT,
                tools=tools,
                messages=messages
            )
            for block in response.content:
                if hasattr(block, 'text'):
                    print(f"\nAgent: {block.text}")
            if response.stop_reason == "end_turn":
                messages.append({"role": "assistant", "content": response.content})
                break
            if response.stop_reason == "tool_use":
                messages.append({"role": "assistant", "content": response.content})
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        print(f"\n[{block.name}({json.dumps(block.input)})]")
                        fn = tool_map.get(block.name)
                        result = fn(**block.input) if fn else f"Unknown tool: {block.name}"
                        print(f"[→ {result[:300]}{'...' if len(result) > 300 else ''}]")
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })
                messages.append({"role": "user", "content": tool_results})

if __name__ == "__main__":
    run_agent()

You: Write a script that fetches posts from JSONPlaceholder and prints each title.
[create_file({"filename": "fetch_posts.py", "content": "import requests\n..."})]
[→ Created fetch_posts.py.]
[run_file({"filename": "fetch_posts.py"})]
[→ stderr:
ModuleNotFoundError: No module named 'requests'
Exit code: 1]
[modify_file({"filename": "fetch_posts.py", "old_content": "import requests", "new_content": "import urllib.request\nimport json"})]
[→ Modified fetch_posts.py.]
[run_file({"filename": "fetch_posts.py"})]
[→ stdout:
sunt aut facere repellat provident occaecati...
qui est esse
...
Exit code: 0]
Agent: Done. Hit a missing `requests` module, switched to urllib from the standard library. Script fetches all 100 posts and prints each title.

Our agent wrote the code, ran it, saw the error, fixed it, ran it again. No manual intervention.
What's missing
This is a skeleton. A few gaps to keep in mind:
Sandboxing. `run_file` currently runs with full access to your machine. For anything beyond local experiments, you'd want Docker or at minimum a restricted subprocess environment.
History truncation. A long session will eventually overflow the context window. You'll need a sliding window, summarization, or a turn cap.
Cost. Each tool round-trip is a separate API call. For multi-step tasks these add up. Worth tracking early.
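For history truncation, a minimal sketch of the sliding-window option, assuming the message shapes used in the loop above (string content for plain user turns, lists for tool traffic); `max_messages` is an arbitrary knob, not a tuned value:

```python
def truncate_history(messages: list, max_messages: int = 40) -> list:
    """Keep the first user message (the task) plus the most recent turns."""
    if len(messages) <= max_messages:
        return messages
    tail = messages[-(max_messages - 1):]
    # Don't start the window mid tool round-trip: keeping a tool_result
    # whose tool_use was dropped (or vice versa) makes the API reject the
    # request. Plain user turns have string content in this loop, so cut
    # the window at one of those.
    while tail and not (tail[0].get("role") == "user"
                        and isinstance(tail[0].get("content"), str)):
        tail = tail[1:]
    return messages[:1] + tail
```

Summarization preserves more information but costs an extra model call; the window is the cheapest thing that keeps the agent running.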
Part 2
Next part is the actual use case: reading a Postman collection and connector config, understanding the auth scheme and pagination pattern, generating a fetch script, and writing the output to Parquet. We'll also add sandboxed execution and a data preview step so you can verify what came back before committing to anything.
The loop here carries forward unchanged. That's the point of starting simple.