
November 29, 2025 8 min read AI Automation

Build Local Long-Running AI Agents That Don't Lose Context

Discover how to overcome the frustrating "context window problem" where AI agents forget their progress on complex tasks. This implementation using LangChain and Ollama creates reliable checkpoints so your agents can resume work exactly where they left off.


The Frustrating Context Window Problem

Every developer using AI agents hits the same wall eventually. Your agent starts strong - adding features, writing tests, making commits. Then suddenly it starts hallucinating, rewriting files, and introducing bugs. This isn't random failure - it's the context window limitation in action.

As Anthropic's research shows, even as models improve, maintaining consistent progress across multiple context windows remains an unsolved challenge. When your agent restarts, it begins with fresh memory, losing all context of previous work.

Key insight: Agents must work in discrete sessions where each restart means starting from scratch. Without checkpoints, complex tasks become impossible to complete reliably.

The core symptoms are unmistakable. Your agent might:

Forget which features it already implemented
Rewrite working code with broken implementations
Lose track of completed tests and validations
Repeat work already marked as complete

Anthropic's Two-Agent Solution

Anthropic's breakthrough was recognizing that single-agent architectures fundamentally can't solve this problem. Their solution splits responsibilities between two specialized agents:

1. The Initializer Agent

Acts as the architect creating:

Complete feature lists with implementation steps
Structured project state tracking
Initial project scaffolding

2. The Coding Agent

Works incrementally using:

Feature lists as progress trackers
Git commits as versioned checkpoints
Test results as validation markers

Epiphany moment: By separating planning from execution, the system creates natural breakpoints where work can safely pause and resume without losing context.
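
The initializer's role can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the video: the function name `initialize_project` and the file name `features.json` are assumptions, and the plan is hard-coded where a real run would take it from the initializer model's output.

```python
import json
import tempfile
from pathlib import Path


def initialize_project(project_dir: Path, planned_features: list[dict]) -> Path:
    """Write the initial checkpoint file; every feature starts as not passing."""
    project_dir.mkdir(parents=True, exist_ok=True)
    features = [
        {
            "name": item["name"],
            "description": item["description"],
            "steps": list(item["steps"]),
            "passing": False,  # nothing is implemented yet
        }
        for item in planned_features
    ]
    path = project_dir / "features.json"  # hypothetical file name
    path.write_text(json.dumps(features, indent=2), encoding="utf-8")
    return path


# Hard-coded stand-in for the plan an initializer model would produce.
plan = [
    {
        "name": "factorial",
        "description": "Implement factorial function",
        "steps": ["Create factorial.py", "Write tests"],
    },
]

with tempfile.TemporaryDirectory() as tmp:
    checkpoint = initialize_project(Path(tmp) / "demo", plan)
    saved = json.loads(checkpoint.read_text(encoding="utf-8"))
```

The coding agent never needs to see how the plan was produced. It only needs the file the initializer leaves behind.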

Local Implementation with LangChain

The beauty of this architecture is its model-agnostic design. We've implemented it locally using:

LangChain 1.1 for agent orchestration
Ollama running quantized 3B parameter models
Pydantic for structured output validation

This stack delivers several advantages:

Complete local operation - no network round trips or API costs
Structured outputs prevent error accumulation
Checkpoint files are human-readable JSON

Implementation tip: Always run agents in isolated directories with restricted permissions. Our demo uses simple Python functions, but the same architecture scales to complex projects.
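
To show why structured output matters, here is a dependency-free sketch of the validation step. The article's stack uses Pydantic for this; the version below swaps in standard-library dataclasses so it runs anywhere, and a Pydantic model with the same fields would play the same role. The names `Feature` and `parse_feature` are assumptions, and the field names follow the feature entry used elsewhere in this article.

```python
import json
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    description: str
    steps: list[str]
    passing: bool = False


def parse_feature(raw: str) -> Feature:
    """Parse model output into a Feature, raising ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, kind in (("name", str), ("description", str), ("steps", list)):
        if not isinstance(data.get(key), kind):
            raise ValueError(f"field {key!r} must be of type {kind.__name__}")
    if not all(isinstance(step, str) for step in data["steps"]):
        raise ValueError("every step must be a string")
    passing = data.get("passing", False)
    if not isinstance(passing, bool):
        raise ValueError("field 'passing' must be a boolean")
    return Feature(data["name"], data["description"], data["steps"], passing)


good = parse_feature(
    '{"name": "factorial", "description": "Implement factorial function", '
    '"steps": ["Create factorial.py", "Write tests"], "passing": false}'
)

# Malformed output (missing description, steps is not a list) is rejected
# instead of being written into the checkpoint file.
try:
    parse_feature('{"name": "factorial", "steps": "Create factorial.py"}')
    rejected = False
except ValueError:
    rejected = True
```

Rejecting bad output at this boundary is what keeps one malformed response from corrupting every later session.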

How Checkpoints Actually Work

The checkpoint system relies on three key artifacts maintained in the agent environment:

1. Features List JSON

{
  "name": "factorial",
  "description": "Implement factorial function",
  "steps": ["Create factorial.py", "Write tests"],
  "passing": false
}

2. Git History

Last 5 commits provide versioned restore points

3. Code Files

Implementation files with passing tests
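
Reading the git artifact back could look like the sketch below. It assumes the agent shells out to `git log -5 --oneline`; the helper names are hypothetical, and the sample log text is invented purely to exercise the parser.

```python
import subprocess
from pathlib import Path


def recent_commits_command(limit: int = 5) -> list[str]:
    """Build the git command that lists the last `limit` commits, one per line."""
    return ["git", "log", f"-{limit}", "--oneline"]


def parse_oneline_log(output: str) -> list[tuple[str, str]]:
    """Split each '<hash> <subject>' line into a (hash, subject) pair."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        commit_hash, _, subject = line.partition(" ")
        commits.append((commit_hash, subject))
    return commits


def recent_commits(repo_dir: Path, limit: int = 5) -> list[tuple[str, str]]:
    """Run git inside repo_dir (must be a git repository) and parse the result."""
    result = subprocess.run(
        recent_commits_command(limit),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_oneline_log(result.stdout)


# Invented sample output, used here so the parser can be tried without a repo.
sample = "a1b2c3d Add factorial tests\n9f8e7d6 Create factorial.py\n"
parsed = parse_oneline_log(sample)
```

A short list of commit subjects is usually enough for the agent to see what the previous session finished last.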

When the coding agent resumes work:

It checks the features list for incomplete items
Reviews git history for last working state
Verifies existing implementations with tests
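
The verification step deserves its own sketch, because a `passing` flag written by an earlier session is only a claim until the tests run again. In the hedged example below, `reverify` is a hypothetical helper and the `check` callable stands in for actually running a feature's test suite.

```python
from typing import Callable


def reverify(features: list[dict], check: Callable[[dict], bool]) -> list[str]:
    """Re-run checks for features marked passing; demote any that now fail."""
    regressed = []
    for feature in features:
        if feature["passing"] and not check(feature):
            feature["passing"] = False  # put it back on the to-do list
            regressed.append(feature["name"])
    return regressed


features = [
    {"name": "factorial", "passing": True},
    {"name": "fibonacci", "passing": True},
]

# Stand-in check: pretend the fibonacci tests fail in this session.
regressed = reverify(features, lambda feature: feature["name"] != "fibonacci")
todo = [feature["name"] for feature in features if not feature["passing"]]
```

Demoting a regressed feature means the normal "pick the next incomplete item" logic repairs it without any special case.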

The Coding Agent in Action

Let's walk through a complete cycle, shown at 8:22 in the video:

Agent selects next incomplete feature from JSON

Checks git history for related commits

Implements code following feature steps

Runs validation tests

If tests pass, marks feature complete and commits

Key advantage: The agent doesn't need the full conversation history - just these structured artifacts provide enough context to resume work reliably.
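
The cycle above can be sketched as a single function that handles one feature per session. This is an illustration under stated assumptions rather than the video's implementation: `run_session` is a hypothetical name, the `implement` and `run_tests` callables stand in for the model call and the test runner, and the fibonacci steps are made up for the demo.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_session(
    features_path: Path,
    implement: Callable[[dict], None],
    run_tests: Callable[[dict], bool],
) -> Optional[str]:
    """Work on the next incomplete feature; return its name, or None if done."""
    features = json.loads(features_path.read_text(encoding="utf-8"))
    feature = next((f for f in features if not f["passing"]), None)
    if feature is None:
        return None
    implement(feature)
    if run_tests(feature):
        feature["passing"] = True
        features_path.write_text(json.dumps(features, indent=2), encoding="utf-8")
        # A real agent would also create a git commit at this point.
    return feature["name"]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "features.json"
    path.write_text(
        json.dumps(
            [
                {"name": "factorial", "steps": ["Create factorial.py", "Write tests"], "passing": False},
                {"name": "fibonacci", "steps": ["Create fibonacci.py", "Write tests"], "passing": False},
            ]
        ),
        encoding="utf-8",
    )
    worked_on = []
    # Each call stands for a fresh session with no memory of the previous one.
    for _ in range(3):
        worked_on.append(
            run_session(path, implement=lambda f: None, run_tests=lambda f: True)
        )
    final_state = json.loads(path.read_text(encoding="utf-8"))
```

Nothing is passed between the three calls except the file on disk, which is the point: the checkpoint carries the state, not the conversation.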

Critical Safety Considerations

While powerful, this approach introduces serious risks if implemented carelessly:

⚠️ Shell Access

Our demo agents have full shell access - never do this in production without:

Command whitelisting
Filesystem restrictions
Resource limits
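
A command whitelist can start as small as the sketch below. The set of permitted programs and the name `is_allowed` are assumptions chosen for illustration, and the check is deliberately conservative: it rejects any command line containing shell metacharacters, even inside quotes.

```python
import shlex

ALLOWED_COMMANDS = {"git", "python", "pytest"}  # hypothetical allowlist
SHELL_METACHARACTERS = set(";|&`$><")


def is_allowed(command_line: str) -> bool:
    """Return True only for a single, plain invocation of a permitted program."""
    # Reject anything that could chain, pipe, or redirect to another command.
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:  # e.g. an unbalanced quote
        return False
    if not tokens:
        return False
    return tokens[0] in ALLOWED_COMMANDS


checks = {
    "git status": is_allowed("git status"),
    "rm -rf /": is_allowed("rm -rf /"),
    "git status; rm -rf /": is_allowed("git status; rm -rf /"),
    "": is_allowed(""),
}
```

An approved command should then be executed as a token list without a shell. An allowlist on its own is not a sandbox, since an allowed interpreter such as `python` can still run arbitrary code, so it belongs alongside the filesystem and resource limits listed above.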

✅ Safe Implementation

Production systems should:

Run in Docker containers
Restrict to project directories
Log all actions

Remember: An agent with write access can modify or delete files just as easily as it can create.
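
Restricting an agent to its project directory can be enforced in the tool layer before any file is touched. The following is a minimal sketch with a hypothetical helper name, `resolve_inside`; it relies on `Path.is_relative_to`, which requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def resolve_inside(project_dir: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside project_dir."""
    root = project_dir.resolve()
    # Resolving first collapses ".." segments and follows symlinks.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"{requested!r} is outside the project directory")
    return target


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    allowed = resolve_inside(project, "src/factorial.py")
    blocked = []
    for attempt in ("../outside.txt", "/etc/passwd"):
        try:
            resolve_inside(project, attempt)
        except PermissionError:
            blocked.append(attempt)
```

Every file tool the agent can call would route its path argument through a check like this, and a container still provides the outer boundary if the check is bypassed.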

Real Implementation Results

The video demonstrates the system implementing:

Factorial function with unit tests
Fibonacci sequence generator
Validation test suites

Result: 100% of features completed, with passing implementations and passing test suites.

Beyond the demo: Anthropic reports this architecture successfully scales to web applications with dozens of features across multiple context windows.

Watch the Full Tutorial

See the complete implementation from 3:15 where the initializer agent creates the feature list through to 12:40 where the coding agent completes the Fibonacci implementation.

Key Takeaways

This architecture solves three fundamental problems in agentic workflows:

Context loss between sessions
Progress tracking across interruptions
Verification of completed work

In summary: By splitting planning and execution while maintaining structured checkpoints, we can create reliable long-running agents that overcome context window limitations.

Frequently Asked Questions

Common questions about long-running AI agents

What is the main challenge with long-running AI agents?

The core challenge is context window limitations where agents lose track of previous work when their session restarts. Without checkpoints, each new session begins with fresh memory, making it impossible to resume complex tasks reliably.

This manifests as agents forgetting completed work, rewriting files unnecessarily, or introducing errors that didn't exist in previous iterations.

Agents work in discrete sessions
No memory carries between runs
Complex tasks require continuity

How does Anthropic's solution solve the context window problem?

Anthropic splits the problem into two agents: an initializer that creates feature lists and checkpoints, and a coding agent that uses these checkpoints to resume work. The system maintains progress through JSON files tracking completed features and git commits.

This architecture means:

Initializer sets up the roadmap
Coding agent follows the roadmap
Checkpoints prevent context loss

What technologies are used in this local implementation?

This implementation uses LangChain for agent orchestration, Ollama as the local LLM provider (running quantized 3B parameter models), and Pydantic for structured output validation. The entire system runs locally without cloud dependencies.

Key components:

LangChain 1.1 - agent framework
Ollama - local model execution
Pydantic - validation and structure

How does the checkpoint system actually work?

The system maintains three key artifacts: a features list JSON file tracking implementation status, git commit history for version control, and code files containing the actual implementations. The coding agent checks these artifacts to determine where to resume work.

When the agent restarts:

Checks features list for incomplete items
Reviews git history for last working state
Verifies existing implementations with tests

What safety precautions are needed when running AI agents locally?

Critical precautions include running in a sandboxed environment, restricting file system access, and implementing command whitelisting. The demonstration shows agents with shell access which should never be used on production systems without proper safeguards.

Always:

Run in containers
Restrict to project directories
Log all actions

Can this approach scale to complex projects?

While demonstrated with simple Python functions, Anthropic reports success with larger web applications. Scaling requires more sophisticated feature breakdowns and additional verification steps, but the core checkpoint architecture remains valid.

The principles apply to:

Web apps with dozens of features
Multi-file projects
Teams of agents working together

What are the limitations of this approach?

The main limitations are dependency on well-structured feature definitions, potential error accumulation across checkpoints, and the need for human verification at scale. The system works best for modular tasks with clear completion criteria.

Real-world challenges include:

Defining features clearly
Maintaining checkpoint integrity
Human oversight at scale

How can GrowwStacks help implement this for your business?

GrowwStacks specializes in implementing reliable AI agent systems for business automation. We can design custom checkpoint architectures, integrate with your existing tools, and deploy secure sandboxed environments for agent operation.

Our services include:

Custom agent system design
Safe implementation
Ongoing maintenance

Book Free Consultation →

Ready to Build Reliable Long-Running Agents?

Don't let context window limits make your agents forget their progress. Let GrowwStacks implement Anthropic's checkpoint system tailored to your specific needs.
