Open Source · TypeScript · Zero Dependencies

Lifecycle-aware memory for AI.
Extract once, recall always.

Vitamem models conversations as sessions with a lifecycle. When a session rests, it extracts the facts that matter, deduplicates them, and makes them available for every future conversation — automatically.

Five lines of code. Zero memory infrastructure to build. Works with any AI that talks to users across sessions.

$ npm install vitamem

Four steps. Zero memory code.

Vitamem manages the full memory lifecycle behind the scenes.

1
CHAT
User talks to your AI. You write zero memory code — just call chatWithUser().
2
REST
Conversation goes quiet. Vitamem notices automatically via sweepThreads().
3
REMEMBER
Key facts are extracted, embedded, and deduplicated — one batch, not fifty calls.
4
RECALL
Next session, relevant memories appear in context. Pinned facts always included.
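The REST step above can be sketched as a simple quiet-thread check — a toy version of what a sweep like sweepThreads() might do internally. The Thread shape and the 30-minute cooling window here are illustrative assumptions, not Vitamem's actual types or defaults:

```typescript
// Hypothetical sketch: decide which active threads have gone quiet.
// COOLING_WINDOW_MS is an assumed value, not Vitamem's default.
type Thread = { id: string; lastMessageAt: number; state: 'active' | 'cooling' };

const COOLING_WINDOW_MS = 30 * 60 * 1000; // 30 minutes (assumption)

function isQuiet(thread: Thread, now: number): boolean {
  return now - thread.lastMessageAt >= COOLING_WINDOW_MS;
}

function sweep(threads: Thread[], now: number): Thread[] {
  // Return the active threads that should move to cooling.
  return threads.filter((t) => t.state === 'active' && isQuiet(t, now));
}

const now = Date.now();
const threads: Thread[] = [
  { id: 'a', lastMessageAt: now - 45 * 60 * 1000, state: 'active' }, // quiet
  { id: 'b', lastMessageAt: now - 5 * 60 * 1000, state: 'active' },  // still live
];
console.log(sweep(threads, now).map((t) => t.id)); // → ['a']
```

In the real library you never write this check yourself — calling sweepThreads() periodically is enough; the sketch only shows the shape of the decision being made for you.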

Conversations have a lifecycle. Vitamem tracks it.

Every thread moves through four states — mirroring how real conversations actually work.

💬 Active — session live
Cooling — between sessions
🧠 Dormant — memories saved
📦 Closed — archived
Active — session is live
The user is actively checking in. Messages are stored in full fidelity. No embeddings yet — computing them now would be wasteful while the conversation is still evolving. The thread stays active as long as messages keep arriving.
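The state machine behind these four states can be sketched in a few lines. The transition order and the InvalidTransitionError name come from this page; the exact transition table (for instance, whether a new message revives a cooling thread) is an assumption:

```typescript
// Toy sketch of the lifecycle state machine, not Vitamem's implementation.
type State = 'active' | 'cooling' | 'dormant' | 'closed';

const ALLOWED: Record<State, State[]> = {
  active: ['cooling'],
  cooling: ['dormant', 'active'], // assumption: a new message can revive a cooling thread
  dormant: ['closed'],
  closed: [],
};

class InvalidTransitionError extends Error {
  constructor(from: State, to: State) {
    super(`Invalid transition: ${from} -> ${to}`);
  }
}

function transition(from: State, to: State): State {
  if (!ALLOWED[from].includes(to)) throw new InvalidTransitionError(from, to);
  return to;
}

transition('active', 'cooling');   // ok
// transition('active', 'closed'); // would throw InvalidTransitionError
```

Enforcing the table centrally is what makes "invalid transitions throw" a guarantee rather than a convention.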

Why Vitamem

These are facts about the code, not marketing claims.

236 passing tests — 96.6% statement coverage across all modules. Run npm test and see for yourself.
0 production dependencies — no hidden supply chain risk. You bring your own LLM and storage; Vitamem wires them together.
0.92 dedup threshold — cosine similarity ≥ 0.92 = duplicate, skipped. Also deduplicates within the same batch. Configurable.
4 enforced states — Active → Cooling → Dormant → Closed. Invalid transitions throw InvalidTransitionError.
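The dedup rule is easy to picture in code. This is a toy sketch of threshold-based deduplication, not Vitamem's implementation — only the 0.92 cosine threshold and the within-batch behavior come from the stats above; the vectors are made-up toy values:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const THRESHOLD = 0.92;

function dedupe(batch: number[][], existing: number[][]): number[][] {
  const kept: number[][] = [];
  for (const vec of batch) {
    // Compare against stored memories AND facts already kept in this
    // batch — matching the "also deduplicates within the same batch" note.
    const pool = [...existing, ...kept];
    if (pool.every((e) => cosine(vec, e) < THRESHOLD)) kept.push(vec);
  }
  return kept;
}

const stored = [[1, 0, 0]];
const incoming = [[0.99, 0.1, 0], [0, 1, 0], [0, 0.99, 0.05]];
console.log(dedupe(incoming, stored).length); // → 1 (only [0, 1, 0] survives)
```

The first incoming vector is a near-duplicate of a stored fact and the third is a near-duplicate of the second, so both are skipped.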

The real cost of AI memory is context window waste, not embeddings

The real cost of memory isn't embedding — it's the tokens you send to the LLM on every chat turn. LLM input tokens cost $2.50–15 / 1M — that's 125–750× more expensive than embedding ($0.02 / 1M).

Naive: stuff all memories — 2,000+ tokens injected into every LLM call; the entire memory store, growing with each session.
Vitamem: retrieve relevant — ~300 tokens per turn; only the memories that matter, and the count stays flat as the store grows.
Stuffing over 50 chats: ~100K tokens → $0.25–1.50
Retrieval over 50 chats: ~15K tokens → $0.04–0.23
Plus, storage embeddings are batched: 50 calls (naive) vs. 6 calls (Vitamem) — ~8× fewer embedding calls, one per extracted fact at the dormant transition, not per message.
As memory grows from 20 to 200 facts, stuffing cost grows 10×. Retrieval stays constant — only the most relevant facts are fetched, regardless of store size.
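The arithmetic behind these figures is straightforward to verify — 50 chats at ~2,000 stuffed tokens vs. ~300 retrieved tokens, priced at $2.50–15 per 1M input tokens:

```typescript
// Back-of-envelope cost check for the numbers quoted above.
function costUSD(tokens: number, pricePerMillionUSD: number): number {
  return (tokens * pricePerMillionUSD) / 1_000_000;
}

const chats = 50;
const stuffedTokens = chats * 2_000;   // 100K tokens total
const retrievedTokens = chats * 300;   // 15K tokens total

console.log(costUSD(stuffedTokens, 2.5), costUSD(stuffedTokens, 15));     // → 0.25 1.5
console.log(costUSD(retrievedTokens, 2.5), costUSD(retrievedTokens, 15)); // → 0.0375 0.225
```

That reproduces the $0.25–1.50 vs. $0.04–0.23 ranges (rounded), a roughly 6–7× saving that widens as the memory store grows.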

Any AI that needs to remember users

Health companions, coaching assistants, tutoring systems — anywhere your AI needs persistent, cross-session memory.

🩺
Health Companions
Track symptoms, medications, and health goals across sessions. Built-in health profile rules and auto-pinning included.
🎯
Coaching Assistants
Remember goals, progress, and setbacks over time. Deduplication keeps the coaching record clean as context evolves.
📚
Tutoring Systems
Know what students understand, where they struggle, and what they've mastered — without per-message processing costs.
🎧
Support Agents
Recall customer context, issue history, and preferences. Every support session starts with full context, not a blank slate.

Up and running in minutes

Bring your own LLM adapter and storage. Vitamem handles the lifecycle and memory.

TYPESCRIPT
import { createVitamem } from 'vitamem';

// 1. Initialize with a provider shortcut
const mem = await createVitamem({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
  storage: 'ephemeral',
  autoRetrieve: true,
});

// 2. Start a conversation session
const thread = await mem.createThread({ userId: 'user-123' });
const { reply } = await mem.chat({
  threadId: thread.id,
  message: "I prefer dark mode, use TypeScript, and deploy on Vercel.",
});

// 3. Session rests → extract facts, embed once, deduplicate, save
await mem.triggerDormantTransition(thread.id);

// 4. Next session — relevant memories appear automatically
const newThread = await mem.createThread({ userId: 'user-123' });
const { reply: reply2 } = await mem.chat({
  threadId: newThread.id,
  message: "What tools do I use?",
});
// Vitamem auto-retrieves: "Prefers TypeScript", "Deploys on Vercel", ...
Full Quickstart → · Health Companion Guide · View Examples
Have questions? Check the Frequently Asked Questions — 55+ answers on behavior, cost, retrieval, and integration.
Note: Vitamem is a developer library for AI memory management, not a medical device. If you build health-related applications, you are responsible for compliance (HIPAA, GDPR, etc.) and safety disclosures. Full disclaimer →

Browse docs from this repository

Comprehensive documentation covering everything from getting started to API reference.