I manage technology at Homeville Group — co-lending platforms, housing finance compliance systems, NBFC operations. The work is dense: RBI regulatory frameworks, co-lending architectures, sprint automation, escrow reconciliation, bureau reporting transitions. The kind of domain where context is everything and forgetting is expensive.
I work with AI agents daily — primarily Claude and Gemini — and for a long time I hit the same wall everyone hits: every conversation started from zero. I'd re-explain the same architecture, re-paste the same compliance context, re-establish the same mental models. Hundreds of tokens, every session, just to get back to where we left off.
So I built something to fix it: a Git-backed brain, which I first wrote about in an earlier post.
The Problem: AI Agents Have No Long-Term Memory
Here's the uncomfortable truth about working with LLMs in a real organisational context:
- Each session is stateless. Claude doesn't remember what we discussed last Tuesday.
- Context windows are finite. You can't paste your entire company's history into every prompt.
- Decisions decay. Six months later, nobody remembers why we made that architecture call — not the human, and definitely not the AI.
- Multi-agent setups fragment knowledge. When Claude and Gemini both work on your codebase, they can't learn from each other unless you deliberately create a shared channel.
The usual workarounds — pasting notes, maintaining a CLAUDE.md, hoping the model "remembers" via in-context retrieval — are brittle. They require manual effort, they break at scale, and they don't survive agent handoffs.
What I needed was a shared external memory that any agent could read and write, that persisted across sessions, and that maintained a literal history of thought.
The Architecture: Two-Layer Memory
The solution I landed on uses two repositories as distinct memory layers, each serving a different purpose.
┌─────────────────────────────────────────────────────────┐
│                    AI AGENT SESSION                     │
│             (Claude / Gemini / Claude Code)             │
└────────────────────────┬────────────────────────────────┘
                         │ reads context from
          ┌──────────────┴──────────────┐
          │                             │
          ▼                             ▼
 ┌─────────────────┐         ┌──────────────────────┐
 │ DNYANKOSH       │         │ MANTHAN              │
 │ (GitLab, pvt)   │         │ (GitHub, pvt)        │
 │                 │         │                      │
 │ Organisational  │         │ Active execution     │
 │ memory. Stable  │         │ context. Updated     │
 │ foundational    │         │ every session.       │
 │ knowledge,      │         │ Sprint state,        │
 │ policies, arch  │         │ decisions,           │
 │ precedents.     │         │ working memory.      │
 └─────────────────┘         └──────────────────────┘
Layer 1: Dnyankosh (Sanskrit: repository of knowledge)
What it is: A self-hosted GitLab repository — the organisational long-term memory. It contains stable, foundational knowledge: regulatory frameworks we operate under, architectural precedents, product principles, team onboarding context, domain glossaries.
Who writes it: Primarily humans. These are deliberate, reviewed documents. Changes here are infrequent.
Who reads it: Every AI agent, at session start, for domain grounding. It's exposed to Claude via GitLab MCP (Project ID: 268), so agents can query it directly without copy-pasting.
The analogy: Dnyankosh is the company wiki crossed with a law library. You don't update it every day. But when you need to know why the co-lending architecture works a certain way, or what regulatory constraint drives a product decision, it's there.
Layer 2: Manthan (Sanskrit: churning)
What it is: A GitHub repository — the active execution context. It's updated continuously, every session, by both humans and AI agents. The name comes from the mythological churning of the ocean — the idea of collaborative work producing something valuable from effort.
Who writes it: Both humans and AI agents. Claude commits session logs, ADRs, context updates, and automation tools directly to this repo.
Who reads it: Every agent, at session start, for active state. The entry point is always context/working-memory-YYYY-MM.md.
The structure:
manthan/
├── decisions/       # Architecture Decision Records (ADRs)
├── context/         # Active sprint state, ops context, working memory
├── prompts/         # Reusable agent instruction templates
├── harnesses/       # Automation scripts (Python, production-ready)
└── observability/   # Agent session logs and telemetry
    └── sessions/    # JSON log per session, every session
How Agents Use It: The Session Protocol
Every Claude session follows this sequence:
- Fetch root of Manthan: get the directory tree, orient.
- Read context/working-memory-YYYY-MM.md: this is the highest-signal document. It captures current sprint state, open decisions, active workstreams, and recent decisions.
- Check INDEX.md: full table of contents for quick lookups.
- Read domain-specific context: if the session is about co-lending, read context/co-lending-operations.md. If it's about CIC reporting, read context/cic_reporting_transition_context.md. And so on.
- Query Dnyankosh: for foundational constraints or architectural precedents before proposing solutions.
- Do the work.
- Commit session log to observability/sessions/: a structured JSON recording what was done, duration, tokens, outcome, files touched.
This protocol means the agent starts each session already knowing what we've been working on, what decisions have been made, and what constraints apply. Zero re-briefing time.
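The read steps of the protocol can be sketched as a small helper that works out which files to load and in what order. This is a minimal illustration, not the actual Manthan tooling: the function names and the topic keys in `DOMAIN_CONTEXT` are hypothetical, and it assumes a local checkout of the repo rather than an MCP connection. Only the file paths come from the protocol described above.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hypothetical topic keys; the two paths are the ones named in the protocol.
DOMAIN_CONTEXT = {
    "co-lending": "context/co-lending-operations.md",
    "cic-reporting": "context/cic_reporting_transition_context.md",
}


def working_memory_path(today: date) -> str:
    """Return the monthly working-memory file for the given date."""
    return f"context/working-memory-{today:%Y-%m}.md"


def reading_order(today: date, topic: Optional[str] = None) -> list:
    """List the Manthan files to read, in protocol order, before any work starts."""
    paths = [working_memory_path(today), "INDEX.md"]
    if topic in DOMAIN_CONTEXT:
        paths.append(DOMAIN_CONTEXT[topic])
    return paths


def load_context(repo_root: Path, paths: list) -> dict:
    """Read each listed file that exists in a local checkout; skip missing ones."""
    loaded = {}
    for rel in paths:
        file = repo_root / rel
        if file.is_file():
            loaded[rel] = file.read_text(encoding="utf-8")
    return loaded


if __name__ == "__main__":
    for path in reading_order(date(2026, 6, 8), "cic-reporting"):
        print(path)
```

Keeping the order in one function means every agent harness reads working memory first, whatever the topic.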
The Observability Layer
This was the piece I didn't originally plan, but it turned out to be one of the most valuable.
Every session produces a structured JSON log committed to observability/sessions/:
{
  "session_id": "session-20260608-143022-claude",
  "agent": "claude-sonnet-4-6",
  "date": "2026-06-08",
  "duration_minutes": 45,
  "task": "CIC reporting transition context update",
  "outcome": "completed",
  "files_modified": [
    "context/cic_reporting_transition_context.md"
  ],
  "tokens_estimated": 12000,
  "decisions_made": ["Blended rate to be reported to CICs for co-lending loans"],
  "open_items": ["July 1 UCRF deadline: state codes update"]
}
The result: a literal history of what every agent did, in every session, for as long as the logs have been kept. You can audit exactly what Claude touched on a given day. You can see where a decision came from. You can reconstruct the reasoning behind any context document by reading the session logs around it.
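A log with the fields shown in the example could be assembled along these lines. This is a sketch under assumptions: `build_session_log` is a hypothetical helper, and the file naming (one `<session_id>.json` per session) is a guess; the post only states that logs are committed under observability/sessions/.

```python
import json
from datetime import datetime


def build_session_log(agent_label, model, started, duration_minutes, task,
                      outcome, files_modified, tokens_estimated,
                      decisions_made=(), open_items=()):
    """Return (repo-relative path, record) for one session log."""
    session_id = f"session-{started:%Y%m%d-%H%M%S}-{agent_label}"
    record = {
        "session_id": session_id,
        "agent": model,
        "date": f"{started:%Y-%m-%d}",
        "duration_minutes": duration_minutes,
        "task": task,
        "outcome": outcome,
        "files_modified": list(files_modified),
        "tokens_estimated": tokens_estimated,
        "decisions_made": list(decisions_made),
        "open_items": list(open_items),
    }
    # Assumption: one file per session, named after the session id.
    return f"observability/sessions/{session_id}.json", record


path, record = build_session_log(
    "claude", "claude-sonnet-4-6", datetime(2026, 6, 8, 14, 30, 22), 45,
    "CIC reporting transition context update", "completed",
    ["context/cic_reporting_transition_context.md"], 12000,
)
print(path)
print(json.dumps(record, indent=2))
```

Deriving both the session id and the date from a single start timestamp keeps the two fields from drifting apart.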
This also enables cross-agent collaboration. When Gemini picks up a task, it can read Claude's recent session logs and understand what's already been done and what's still open — without any human having to do the handoff.
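One way an incoming agent (or a harness acting for it) might gather what is still open is sketched below. The function names are hypothetical and a local checkout is assumed. The record fields are the ones from the example log, and ISO dates are compared as plain strings, which works because YYYY-MM-DD sorts chronologically.

```python
import json
from pathlib import Path


def load_logs(sessions_dir):
    """Parse every *.json session log found in a local sessions folder."""
    return [json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(Path(sessions_dir).glob("*.json"))]


def open_items_since(records, since, skip_agent_prefix=None):
    """Collect open items from logs dated on or after `since` (YYYY-MM-DD).

    `skip_agent_prefix` lets an agent ignore its own sessions, e.g. "gemini".
    """
    items = []
    for record in records:
        if record.get("date", "") < since:
            continue
        agent = record.get("agent", "")
        if skip_agent_prefix and agent.startswith(skip_agent_prefix):
            continue
        for item in record.get("open_items", []):
            items.append({
                "item": item,
                "agent": agent,
                "session_id": record.get("session_id"),
            })
    return items
```

A Gemini session could call `open_items_since(load_logs("observability/sessions"), "2026-06-01", skip_agent_prefix="gemini")` to see only what other agents left unfinished.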
The Auto-Sync: Keeping Memory Fresh Without Manual Effort
One of the hardest problems with any knowledge base is keeping it current. The solution here is a weekly GitHub Actions workflow that auto-syncs Manthan from four sources:
- OpenProject — closed work packages this week (sprint progress)
- GitLab — merged MRs this week (code changes)
- Granola — meeting notes from the week
- Obsidian — notes modified this week
Every Sunday at 23:00 UTC, the workflow runs, pulls data from all four sources, generates a context/weekly-sync-YYYY-WWW.md, and commits it. The repo is always current. Zero manual effort after the 5-minute setup.
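The file-generation step of such a sync might look like the sketch below. Two things here are assumptions rather than details from the workflow itself: the filename pattern is read as ISO year plus ISO week (for example 2026-W23), and the Markdown layout of the report is invented for illustration. Fetching from the four sources is left out; the function only takes already-collected entries.

```python
from datetime import date

SOURCES = ("OpenProject", "GitLab", "Granola", "Obsidian")


def weekly_sync_path(run_date: date) -> str:
    """Build the report path from the ISO year and week of the run date."""
    iso = run_date.isocalendar()
    return f"context/weekly-sync-{iso[0]}-W{iso[1]:02d}.md"


def render_weekly_sync(run_date: date, entries_by_source: dict) -> str:
    """Render one Markdown section per source (hypothetical layout)."""
    iso = run_date.isocalendar()
    lines = [f"# Weekly sync {iso[0]}-W{iso[1]:02d}", ""]
    for name in SOURCES:
        lines.append(f"## {name}")
        entries = entries_by_source.get(name, [])
        if entries:
            lines.extend(f"- {entry}" for entry in entries)
        else:
            lines.append("- (nothing this week)")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    run = date(2026, 6, 7)  # a Sunday
    print(weekly_sync_path(run))
    print(render_weekly_sync(run, {"GitLab": ["Merged MR: escrow reconciliation fix"]}))
```

In GitHub Actions, a Sunday 23:00 UTC schedule corresponds to the cron expression `0 23 * * 0`, since scheduled workflows are evaluated in UTC.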
The MCP Layer: Agents Can Actually Read (and Write) These Repos
The whole system depends on AI agents having direct, live access to these repositories — not via copy-pasting, but via MCP (Model Context Protocol) connections.
For Claude in the web interface:
- The GitLab self-hosted instance is connected via MCP, giving direct access to Dnyankosh
- GitHub is connected via MCP, giving read/write access to Manthan
- Google Calendar, Gmail, Granola, Google Drive, and Microsoft 365 are also connected
For Claude Code CLI:
- GitHub MCP runs as a Docker container authenticated with a Personal Access Token
- Gives Claude Code the same read/write access to Manthan during coding sessions
The result: Claude can fetch a context document, update it, and commit the change — all in one session, without any human intermediary.
Design Principles That Made It Work
After building this and running it for several months, a few principles emerged as critical:
1. Agents need context, not instructions.
The single biggest insight. Give agents rich context documents and they figure out what to do. Long system prompts with rules are brittle; a well-maintained working memory document is resilient.
2. Dated decisions > decisions in chat.
Every ADR has a date and a "why." Six months later, you can reconstruct the reasoning. Chat history is ephemeral. Git history is durable.
3. Working memory beats weekly syncs.
The working-memory-YYYY-MM.md document — updated continuously — is the highest-signal entry point. Weekly sync reports are useful but no substitute for a live, curated context document.
4. Immutability for decisions; mutability for context.
ADRs are never modified after they're written (decisions are historical facts). Context documents are updated freely (they reflect current reality). The distinction matters for auditability.
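The immutability rule lends itself to a mechanical check. The sketch below is one possible guard, not something the post describes as existing: it takes lines in the format produced by `git diff --name-status` and flags any change set that modifies, deletes, or renames a file under decisions/. The function name and the ADR filenames in the example are hypothetical.

```python
def adr_violations(name_status_lines, adr_dir="decisions/"):
    """Return existing ADR paths that a change set alters.

    Each input line follows `git diff --name-status` output: a status code
    (A, M, D, R<score>, ...) followed by tab-separated path(s).
    Adding a new ADR is allowed; touching an existing one is not.
    """
    blocked = {"M", "D", "R", "T"}  # modified, deleted, renamed, type-changed
    violations = []
    for line in name_status_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue
        status, path = parts[0], parts[1]
        if path.startswith(adr_dir) and status[:1] in blocked:
            violations.append(path)
    return violations


if __name__ == "__main__":
    changes = [
        "A\tdecisions/0007-example-new-adr.md",
        "M\tdecisions/0003-example-old-adr.md",
        "M\tcontext/working-memory-2026-06.md",
    ]
    print(adr_violations(changes))
```

Run in CI or a pre-commit hook, a non-empty result would block the commit, while context documents stay freely editable.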
5. Two MCP namespaces can coexist — know which one has private repo access.
This caused a surprising amount of debugging. The mcp__plugin_ecc_github__* namespace only has public repo access; mcp__github__* has private repo access. One 404 is all it takes to make you read the docs properly.
What This Looks Like In Practice
A real session, verbatim context:
"Before responding, Claude fetches context/working-memory-2026-06.md, reads the CIC reporting transition section, queries Dnyankosh for the relevant RBI Credit Information Reporting Directions, cross-references the open gap items list, and comes back with a compliance checklist citing specific paragraphs, without me having to re-explain what UCRF is, what our current state is, or what the July 1 deadline means."
That's the experience this system enables. Agents that are genuinely up to speed. Not because they have better models (though that helps), but because they have better context.
What's Next
A few things on the roadmap:
- Dnyankosh Claude plugin — packaging Dnyankosh as a first-class Claude project knowledge source, so the organisational context loads automatically in every session without explicit fetches.
- BHN Integration MCP — a 33-tool MCP server (TypeScript, Streamable HTTP) giving agents direct access to the lending management system, not just documentation about it.
- Agent autonomy framework — today, agents update context documents but human review is required for ADRs. The next step is defining a protocol for agents to make low-risk architectural decisions autonomously and record them to Manthan without blocking on human review.
- Multi-org support — the Manthan architecture is general enough to work beyond Homeville. Designing it to be templatable for other organisations is an interesting open problem.
The Repo
Manthan is at github.com/kidakaka/manthan. It's a private repo, but the README is a detailed spec of the architecture — worth reading if you want to build something similar.
The system took about a week to design and two weeks to fully instrument. The ongoing maintenance burden is roughly two hours a month. The value — agents that remember, context that accumulates, decisions that are auditable — is difficult to overstate.
If you're working with AI agents at any scale beyond toy projects, you need an external memory layer. This is how I built mine.