Open Source · TypeScript · Zero Dependencies

Lifecycle-aware memory for AI.
Extract once, recall always.

Vitamem models conversations as sessions with a lifecycle. When a session rests, it extracts the facts that matter, deduplicates them, and makes them available for every future conversation — automatically.

Five lines of code. Zero memory infrastructure to build. Works with any AI that talks to users across sessions.

$ npm install vitamem

Four steps. Zero memory code.

Vitamem manages the full memory lifecycle behind the scenes.

1
CHAT
User talks to your AI. You write zero memory code — just call chatWithUser().
2
REST
Conversation goes quiet. Vitamem notices automatically via sweepThreads().
3
REMEMBER
Key facts are extracted, embedded, and deduplicated — one batch, not fifty calls.
4
RECALL
Next session, relevant memories appear in context. Pinned facts always included.
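The REST step above can be sketched as a simple quiet-thread check — a toy version of what a sweep like sweepThreads() might do internally. The Thread shape and the 30-minute cooling window here are illustrative assumptions, not Vitamem's actual types or defaults:

```typescript
// Hypothetical sketch: decide which active threads have gone quiet.
// COOLING_WINDOW_MS is an assumed value, not Vitamem's default.
type Thread = { id: string; lastMessageAt: number; state: 'active' | 'cooling' };

const COOLING_WINDOW_MS = 30 * 60 * 1000; // 30 minutes (assumption)

function isQuiet(thread: Thread, now: number): boolean {
  return now - thread.lastMessageAt >= COOLING_WINDOW_MS;
}

function sweep(threads: Thread[], now: number): Thread[] {
  // Return the active threads that should move to cooling.
  return threads.filter((t) => t.state === 'active' && isQuiet(t, now));
}

const now = Date.now();
const threads: Thread[] = [
  { id: 'a', lastMessageAt: now - 45 * 60 * 1000, state: 'active' }, // quiet
  { id: 'b', lastMessageAt: now - 5 * 60 * 1000, state: 'active' },  // still live
];
console.log(sweep(threads, now).map((t) => t.id)); // → ['a']
```

In the real library you never write this check yourself — calling sweepThreads() periodically is enough; the sketch only shows the shape of the decision being made for you.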

Conversations have a lifecycle. Vitamem tracks it.

Every thread moves through four states — mirroring how real conversations actually work.

💬 Active — session live
Cooling — between sessions
🧠 Dormant — memories saved
📦 Closed — archived
Active — session is live
The user is actively checking in. Messages are stored in full fidelity. No embeddings yet — computing them now would be wasteful while the conversation is still evolving. The thread stays active as long as messages keep arriving.
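The state machine behind these four states can be sketched in a few lines. The transition order and the InvalidTransitionError name come from this page; the exact transition table (for instance, whether a new message revives a cooling thread) is an assumption:

```typescript
// Toy sketch of the lifecycle state machine, not Vitamem's implementation.
type State = 'active' | 'cooling' | 'dormant' | 'closed';

const ALLOWED: Record<State, State[]> = {
  active: ['cooling'],
  cooling: ['dormant', 'active'], // assumption: a new message can revive a cooling thread
  dormant: ['closed'],
  closed: [],
};

class InvalidTransitionError extends Error {
  constructor(from: State, to: State) {
    super(`Invalid transition: ${from} -> ${to}`);
  }
}

function transition(from: State, to: State): State {
  if (!ALLOWED[from].includes(to)) throw new InvalidTransitionError(from, to);
  return to;
}

transition('active', 'cooling');   // ok
// transition('active', 'closed'); // would throw InvalidTransitionError
```

Enforcing the table centrally is what makes "invalid transitions throw" a guarantee rather than a convention.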

Why Vitamem

These are facts about the code, not marketing claims.

236 passing tests — 96.6% statement coverage across all modules. Run npm test and see for yourself.
0 production dependencies — no hidden supply chain risk. You bring your own LLM and storage; Vitamem wires them together.
0.92 dedup threshold — cosine similarity ≥ 0.92 = duplicate, skipped. Also deduplicates within the same batch. Configurable.
4 enforced states — Active → Cooling → Dormant → Closed. Invalid transitions throw InvalidTransitionError.
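The dedup rule is easy to picture in code. This is a toy sketch of threshold-based deduplication, not Vitamem's implementation — only the 0.92 cosine threshold and the within-batch behavior come from the stats above; the vectors are made-up toy values:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const THRESHOLD = 0.92;

function dedupe(batch: number[][], existing: number[][]): number[][] {
  const kept: number[][] = [];
  for (const vec of batch) {
    // Compare against stored memories AND facts already kept in this
    // batch — matching the "also deduplicates within the same batch" note.
    const pool = [...existing, ...kept];
    if (pool.every((e) => cosine(vec, e) < THRESHOLD)) kept.push(vec);
  }
  return kept;
}

const stored = [[1, 0, 0]];
const incoming = [[0.99, 0.1, 0], [0, 1, 0], [0, 0.99, 0.05]];
console.log(dedupe(incoming, stored).length); // → 1 (only [0, 1, 0] survives)
```

The first incoming vector is a near-duplicate of a stored fact and the third is a near-duplicate of the second, so both are skipped.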

The real cost of AI memory is context window waste, not embeddings

The real cost of memory isn't embedding — it's the tokens you send to the LLM on every chat turn. LLM input tokens cost $2.50–15 / 1M — that's 125–750× more expensive than embedding ($0.02 / 1M).

Naive: stuff all memories — 2,000+ tokens injected into every LLM call; the entire memory store, growing with each session.
Vitamem: retrieve relevant — ~300 tokens per turn; only the memories that matter, and the count stays flat as the store grows.
Stuffing over 50 chats: ~100K tokens → $0.25–1.50
Retrieval over 50 chats: ~15K tokens → $0.04–0.23
Plus, storage embeddings are batched: 50 calls (naive) vs. 6 calls (Vitamem) — ~8× fewer embedding calls, one per extracted fact at the dormant transition, not per message.
As memory grows from 20 to 200 facts, stuffing cost grows 10×. Retrieval stays constant — only the most relevant facts are fetched, regardless of store size.
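The arithmetic behind these figures is straightforward to verify — 50 chats at ~2,000 stuffed tokens vs. ~300 retrieved tokens, priced at $2.50–15 per 1M input tokens:

```typescript
// Back-of-envelope cost check for the numbers quoted above.
function costUSD(tokens: number, pricePerMillionUSD: number): number {
  return (tokens * pricePerMillionUSD) / 1_000_000;
}

const chats = 50;
const stuffedTokens = chats * 2_000;   // 100K tokens total
const retrievedTokens = chats * 300;   // 15K tokens total

console.log(costUSD(stuffedTokens, 2.5), costUSD(stuffedTokens, 15));     // → 0.25 1.5
console.log(costUSD(retrievedTokens, 2.5), costUSD(retrievedTokens, 15)); // → 0.0375 0.225
```

That reproduces the $0.25–1.50 vs. $0.04–0.23 ranges (rounded), a roughly 6–7× saving that widens as the memory store grows.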

Any AI that needs to remember users

Health companions, coaching assistants, tutoring systems — anywhere your AI needs persistent, cross-session memory.

🩺
Health Companions
Track symptoms, medications, and health goals across sessions. Built-in health profile rules and auto-pinning included.
🎯
Coaching Assistants
Remember goals, progress, and setbacks over time. Deduplication keeps the coaching record clean as context evolves.
📚
Tutoring Systems
Know what students understand, where they struggle, and what they've mastered — without per-message processing costs.
🎧
Support Agents
Recall customer context, issue history, and preferences. Every support session starts with full context, not a blank slate.

Up and running in minutes

Bring your own LLM adapter and storage. Vitamem handles the lifecycle and memory.

TYPESCRIPT
import { createVitamem } from 'vitamem';

// 1. Initialize with a provider shortcut
const mem = await createVitamem({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
  storage: 'ephemeral',
  autoRetrieve: true,
});

// 2. Start a conversation session
const thread = await mem.createThread({ userId: 'user-123' });
const { reply } = await mem.chat({
  threadId: thread.id,
  message: "I prefer dark mode, use TypeScript, and deploy on Vercel.",
});

// 3. Session rests → extract facts, embed once, deduplicate, save
await mem.triggerDormantTransition(thread.id);

// 4. Next session — relevant memories appear automatically
const newThread = await mem.createThread({ userId: 'user-123' });
const { reply: reply2 } = await mem.chat({
  threadId: newThread.id,
  message: "What tools do I use?",
});
// Vitamem auto-retrieves: "Prefers TypeScript", "Deploys on Vercel", ...
Full Quickstart → · Health Companion Guide · View Examples
Have questions? Check the Frequently Asked Questions — 55+ answers on behavior, cost, retrieval, and integration.
Note: Vitamem is a developer library for AI memory management, not a medical device. If you build health-related applications, you are responsible for compliance (HIPAA, GDPR, etc.) and safety disclosures. Full disclaimer →

Browse docs from this repository

Comprehensive documentation covering everything from getting started to API reference.