
createVitamem

The main factory function. Takes a single config object and returns a Vitamem instance.

```ts
async function createVitamem(config: VitamemConfig): Promise<Vitamem>;
```

Provide either a `provider` string shortcut or an `llm` adapter instance.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | `"openai" \| "anthropic" \| "ollama"` | | Creates a built-in adapter (requires `apiKey` for cloud providers) |
| `apiKey` | `string` | | API key for the provider |
| `model` | `string` | Provider default | Chat model override |
| `extractionModel` | `string` | Same as `model` | Model for memory extraction. Use a cheaper/faster model for extraction while keeping a more capable chat model. |
| `embeddingModel` | `string` | Provider default | Embedding model override |
| `baseUrl` | `string` | Provider default | API base URL (for proxies, self-hosted) |
| `llm` | `LLMAdapter` | | Custom adapter instance (overrides `provider`) |

Provider defaults:

| Provider | Chat Model | Embedding Model |
| --- | --- | --- |
| `openai` | `gpt-5.4-mini` | `text-embedding-3-small` |
| `anthropic` | `claude-sonnet-4-20250514` | `text-embedding-3-small` (via OpenAI) |
| `ollama` | `llama3.2` | `nomic-embed-text` |

Provide either a string shortcut or a `StorageAdapter` instance.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `storage` | `"ephemeral" \| "supabase" \| StorageAdapter` | required | Storage backend |
| `supabaseUrl` | `string` | | Required when `storage: "supabase"` |
| `supabaseKey` | `string` | | Required when `storage: "supabase"` |
Lifecycle and behavior options:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `preset` | `PresetName` | | Named timeout preset (`"daily-checkin"`, `"weekly-therapy"`, `"on-demand"`, `"long-term"`). Explicit timeout values override preset values. |
| `coolingTimeoutMs` | `number` | `21600000` (6h) | Inactivity before active → cooling in `sweepThreads()` |
| `dormantTimeoutMs` | `number` | Same as `coolingTimeoutMs` | Time in cooling before cooling → dormant in `sweepThreads()` |
| `closedTimeoutMs` | `number` | `2592000000` (30d) | Time in dormant before auto-close in `sweepThreads()` |
| `embeddingConcurrency` | `number` | `5` | Max concurrent embedding API calls |
| `autoRetrieve` | `boolean` | `false` | Inject relevant memories into every `chat()` call |
| `structuredExtractionRules` | `StructuredExtractionRule[]` | | Rules for classifying extracted facts into structured profile fields. Use `HEALTH_STRUCTURED_RULES` for health domains. |
Retrieval options:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `onRetrieve` | `(memories, query) => MemoryMatch[]` | | Hook to filter or reorder memories after the retrieval pipeline runs |
| `minScore` | `number` | `0` | Minimum cosine similarity score for retrieved memories (`0` = no filtering) |
| `recencyWeight` | `number` | `0` | Blend factor (0–1) between cosine similarity and recency. `0` = pure cosine, `1` = pure recency. |
| `recencyMaxAgeMs` | `number` | `7776000000` (90d) | Normalization window for recency scoring |
| `diversityWeight` | `number` | `0` | MMR diversity weight (0–1). `0` = standard top-K; higher values promote diversity. |
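The recency blend can be pictured as a linear interpolation between the cosine score and an age-based score. This is an illustrative sketch, not vitamem's internal code; the exact normalization the library uses may differ:

```ts
// Illustrative recency blending: (1 - w) * cosine + w * recency,
// where recency decays linearly from 1 (now) to 0 (recencyMaxAgeMs old).
function blendScore(
  cosine: number,
  ageMs: number,
  recencyWeight: number,
  recencyMaxAgeMs: number = 7_776_000_000, // 90-day default
): number {
  const recency = Math.max(0, 1 - ageMs / recencyMaxAgeMs);
  return (1 - recencyWeight) * cosine + recencyWeight * recency;
}
```

With `recencyWeight: 0` the score is the raw cosine similarity; with `1`, only the memory's age matters.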
Extraction and memory options:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `extractionPrompt` | `string` | | Top-level extraction prompt override. Forwarded to the adapter when using the `provider` shortcut; overrides the adapter's default health-focused prompt. |
| `memoryContextFormatter` | `(memories: MemoryMatch[], query: string) => string` | | Custom formatter for auto-retrieve memory injection. Replaces the default bullet-point format. |
| `deduplicationThreshold` | `number` | `0.92` | Cosine similarity threshold for exact duplicate detection. Facts with similarity above this are discarded. |
| `supersedeThreshold` | `number` | `0.75` | Cosine similarity threshold for memory supersede. Facts with similarity between this and `deduplicationThreshold` update the existing memory in place (e.g., A1C 7.4% → 6.8%). |
| `autoPinRules` | `AutoPinRule[]` | | Rules that automatically pin critical memories during extraction. Use the built-in `HEALTH_AUTO_PIN_RULES` for health domains. |
| `forgetting` | `ForgettingConfig` | `undefined` | Enable active forgetting with a decay model. If not set, decay is disabled. |
| `forgetting.forgettingHalfLifeMs` | `number` | `15552000000` (180d) | Time in ms until an unretrieved memory's relevance halves |
| `forgetting.minRetrievalScore` | `number` | `0.1` | Score threshold below which memories become archival candidates |
| `enableReflection` | `boolean` | `false` | Enable a second LLM call to validate and enrich extracted facts |
| `reflectionPrompt` | `string` | built-in | Custom prompt override for the reflection LLM call |
| `prioritySignaling` | `boolean` | `true` | Prepend priority markers (`[CRITICAL]`, `[IMPORTANT]`, `[INFO]`) to each memory line based on source and pinned status |
| `chronologicalRetrieval` | `boolean` | `true` | Sort retrieved memories by `createdAt` and group by month/year with date headers |
| `cacheableContext` | `boolean` | `false` | Split memory context into a stable prefix (profile + pinned) and a dynamic suffix (retrieved) for LLM caching |
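The two similarity thresholds partition each incoming fact into one of three bands. A sketch of that decision, using the default thresholds (the `classifyBySimilarity` helper is hypothetical, not part of the vitamem API, and the exact boundary handling at the threshold values is an assumption):

```ts
type FactDisposition = "duplicate" | "supersede" | "new";

// Illustrative threshold bands, using the documented defaults:
//   similarity > 0.92          -> discard as an exact duplicate
//   0.75 < similarity <= 0.92  -> supersede the existing memory in place
//   similarity <= 0.75         -> store as a new memory
function classifyBySimilarity(
  similarity: number,
  deduplicationThreshold = 0.92,
  supersedeThreshold = 0.75,
): FactDisposition {
  if (similarity > deduplicationThreshold) return "duplicate";
  if (similarity > supersedeThreshold) return "supersede";
  return "new";
}
```

This is why an updated lab value ("A1C is now 6.8%") lands in the middle band: similar enough to the stored fact to refer to the same thing, but not similar enough to be a verbatim repeat.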
```ts
const vm = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
  autoRetrieve: true,
  memoryContextFormatter: (memories, query) =>
    `Known facts about this user:\n${memories.map(m => m.content).join("\n")}`,
});
```
```ts
import { createVitamem, HEALTH_AUTO_PIN_RULES } from "vitamem";

const vm = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
  autoPinRules: HEALTH_AUTO_PIN_RULES,
});

// "Allergic to penicillin" → automatically pinned
// "Blood type A+" → automatically pinned
```
```ts
// Minimal — string shortcuts
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
});
```

```ts
// Local models
const mem = await createVitamem({
  provider: "ollama",
  storage: "ephemeral",
});
```

```ts
// Custom adapter + Supabase
const mem = await createVitamem({
  llm: myCustomAdapter,
  storage: "supabase",
  supabaseUrl: process.env.SUPABASE_URL!,
  supabaseKey: process.env.SUPABASE_KEY!,
});
```

```ts
// Full config
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o",
  storage: "supabase",
  supabaseUrl: process.env.SUPABASE_URL!,
  supabaseKey: process.env.SUPABASE_KEY!,
  coolingTimeoutMs: 6 * 60 * 60 * 1000,
  closedTimeoutMs: 30 * 24 * 60 * 60 * 1000,
  embeddingConcurrency: 10,
  autoRetrieve: true,
});
```

Returns a `Vitamem` instance with the following methods.


createThread()

Creates a new conversation thread in the active state.

```ts
const thread = await mem.createThread({ userId: "user-123" });
// thread.state === 'active'
```

chat()

Sends a message in a thread and returns the AI reply. Automatically reactivates cooling threads.

```ts
const { reply, thread, memories } = await mem.chat({
  threadId: thread.id,
  message: "I take metformin daily for my diabetes.",
  systemPrompt: "You are a health companion.", // optional
});
```
| Option | Type | Description |
| --- | --- | --- |
| `threadId` | `string` | The thread to send the message in |
| `message` | `string` | The user's message |
| `systemPrompt` | `string?` | Optional system prompt prepended to context |

Returns: `{ reply: string, thread: Thread, memories?: MemoryMatch[], previousThreadId?: string, redirected?: boolean }`

When autoRetrieve is enabled, memories contains the memories that were injected into context. If the thread was dormant or closed, a new thread is created and redirected is true with previousThreadId set.
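Callers that cache a thread ID should account for redirects. A self-contained sketch of that bookkeeping, assuming only the return shape documented above (the `ChatResult` interface and `threadIds` map here are illustrative, not part of vitamem):

```ts
// Minimal shape of the chat() return fields relevant to redirect handling.
interface ChatResult {
  reply: string;
  thread: { id: string };
  previousThreadId?: string;
  redirected?: boolean;
}

// userId -> currently cached thread ID (illustrative client-side cache)
const threadIds = new Map<string, string>();

// Cache the resolved thread ID and report whether a redirect happened,
// so the caller can e.g. refresh any UI tied to the old thread.
function trackThread(userId: string, result: ChatResult): boolean {
  threadIds.set(userId, result.thread.id);
  return result.redirected === true;
}
```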


chatStream()

Streaming variant of chat(). Returns an AsyncGenerator that yields response tokens as they are generated. The full reply is saved to storage after the stream completes.

```ts
const { stream, thread, memories } = await mem.chatStream({
  threadId: thread.id,
  message: "What medications am I taking?",
  systemPrompt: "You are a health companion.", // optional
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
```
| Option | Type | Description |
| --- | --- | --- |
| `threadId` | `string` | The thread to send the message in |
| `message` | `string` | The user's message |
| `systemPrompt` | `string?` | Optional system prompt prepended to context |

Returns: `Promise<{ stream, thread, memories?, previousThreadId?, redirected? }>`

| Field | Type | Description |
| --- | --- | --- |
| `stream` | `AsyncGenerator<string>` | Yields response tokens one by one |
| `thread` | `Thread` | The resolved thread (may be a new thread if redirected) |
| `memories` | `MemoryMatch[]?` | Memories injected into context (when `autoRetrieve` is enabled) |
| `previousThreadId` | `string?` | Original thread ID if redirected from dormant/closed |
| `redirected` | `boolean?` | `true` if a new thread was created due to dormant/closed state |

chatWithUser()

Convenience method that resolves or creates a thread for the user, then calls chat().

```ts
const { reply, thread } = await mem.chatWithUser({
  userId: "user-123",
  message: "How has my blood pressure been?",
});
```

chatWithUserStream()

Streaming variant of chatWithUser(). Resolves or creates a thread, then calls chatStream().

```ts
const { stream, thread } = await mem.chatWithUserStream({
  userId: "user-123",
  message: "How has my blood pressure been?",
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
```

Returns: Same shape as chatStream().


retrieve()

Searches a user's stored memories using semantic similarity.

```ts
const memories = await mem.retrieve({
  userId: "user-123",
  query: "medications and health conditions",
  limit: 5,
});
// Returns MemoryMatch[] sorted by similarity score descending
```

pinMemory()

Pins a memory so it is always included in retrieval results (score 1.0) and exempt from active forgetting decay.

```ts
await mem.pinMemory("memory-abc");
```

unpinMemory()

Removes the pinned status from a memory.

```ts
await mem.unpinMemory("memory-abc");
```

getOrCreateThread()

Returns the user's latest active or cooling thread, or creates a new one if none exists.

```ts
const thread = await mem.getOrCreateThread("user-123");
// thread.state === 'active' or 'cooling'
```

getThread()

Returns the thread object, or null if not found.

```ts
const thread = await mem.getThread("thread-abc");
console.log(thread?.state); // 'active' | 'cooling' | 'dormant' | 'closed'
```

triggerDormantTransition()

Transitions a thread to dormant (via cooling if currently active) and runs the embedding pipeline (extract, reflect, classify, embed, deduplicate, save).

```ts
const stats = await mem.triggerDormantTransition(thread.id);
console.log(`Saved: ${stats.memoriesSaved}, Superseded: ${stats.memoriesSuperseded}, Deduped: ${stats.memoriesDeduped}`);
```

Returns: `Promise<{ memoriesSaved, memoriesDeduped, memoriesSuperseded, totalExtracted, profileFieldsUpdated }>`


closeThread()

Archives a thread. Only valid from the dormant state. Memories remain searchable.

```ts
await mem.closeThread(thread.id);
// Thread is now 'closed'
```

sweepThreads()

Checks all threads and applies lifecycle transitions based on the configured timeouts. Call this on a schedule (e.g., setInterval, cron).

```ts
// Check every minute
setInterval(() => mem.sweepThreads(), 60_000);
```

Transitions applied:

  • active → cooling if no messages for coolingTimeoutMs
  • cooling → dormant if cooling for dormantTimeoutMs (runs embedding pipeline)
  • dormant → closed if dormant for closedTimeoutMs
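Each check amounts to comparing elapsed time against the configured threshold. An illustrative sketch of the first transition (not vitamem's internal implementation; the `lastMessageAt` parameter name is an assumption):

```ts
// Illustrative: decide whether an active thread should cool, given the
// timestamp of its last message and the configured inactivity timeout.
function isCoolingDue(
  lastMessageAt: number, // epoch ms of the thread's last message
  now: number,
  coolingTimeoutMs: number = 21_600_000, // 6h default
): boolean {
  return now - lastMessageAt >= coolingTimeoutMs;
}
```

The cooling → dormant and dormant → closed checks follow the same pattern against `dormantTimeoutMs` and `closedTimeoutMs`.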

deleteMemory()

Deletes a single memory by ID.

```ts
await mem.deleteMemory("memory-abc");
```

deleteUserData()

Deletes all memories for a user. Use for GDPR right-to-erasure requests.

```ts
await mem.deleteUserData("user-123");
```

getProfile()

Returns the user's structured profile, or null if profile storage is not supported or no profile exists.

```ts
const profile = await mem.getProfile("user-123");
if (profile) {
  console.log(profile.allergies); // ['penicillin']
  console.log(profile.vitals);    // { a1c: { value: 6.8, unit: '%' } }
}
```

Returns: UserProfile | null


updateProfile()

Updates the user's structured profile with merge semantics. Creates the profile if it doesn't exist. No-op if the storage adapter does not support profiles.

```ts
await mem.updateProfile("user-123", {
  conditions: ["Type 2 diabetes"],
  allergies: ["penicillin"],
});
```
| Option | Type | Description |
| --- | --- | --- |
| `userId` | `string` | The user whose profile to update |
| `updates` | `Partial<Omit<UserProfile, "userId">>` | Fields to merge into the profile |

In addition to the Vitamem facade, the library exports several utility functions for advanced use cases.

applyDecay()

Apply time-based decay scoring to memory matches. Returns re-scored and re-sorted memories with adjusted relevance. Pinned memories are exempt from decay.

```ts
import { applyDecay } from "vitamem";

const scored = applyDecay(memoryMatches, {
  forgettingHalfLifeMs: 15552000000, // 180 days
});
```
| Param | Type | Description |
| --- | --- | --- |
| `results` | `MemoryMatch[]` | Memory matches to apply decay to |
| `config` | `{ forgettingHalfLifeMs?: number }` | Decay configuration |

Returns: MemoryMatch[] — Re-scored and re-sorted by decayed score.
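Half-life decay means an unretrieved memory's relevance halves every `forgettingHalfLifeMs`. A sketch of that curve, assuming a standard exponential half-life model (vitamem's exact formula may differ):

```ts
// Exponential half-life decay: factor = 0.5 ^ (age / halfLife).
// A memory untouched for one half-life keeps 50% of its score,
// for two half-lives 25%, and so on.
function decayFactor(ageMs: number, halfLifeMs: number): number {
  return Math.pow(0.5, ageMs / halfLifeMs);
}
```

Under this model, at the default 180-day half-life a memory last retrieved a year ago would keep roughly 0.5^(365/180) ≈ 25% of its score.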


shouldArchive()

Check if a memory should be archived based on its decay score falling below `minRetrievalScore`. Pinned memories always return false.

```ts
import { shouldArchive } from "vitamem";

const archive = shouldArchive(memory, {
  minRetrievalScore: 0.1,
  forgettingHalfLifeMs: 15552000000,
});
```
| Param | Type | Description |
| --- | --- | --- |
| `memory` | `Memory` | The memory to evaluate |
| `config` | `{ minRetrievalScore?: number; forgettingHalfLifeMs?: number }` | Archive thresholds |

Returns: `boolean` — `true` if the memory's decay score falls below the threshold.


reflectOnExtraction(extractedFacts, existingMemories, originalMessages, llm, customPrompt?)


Run a second LLM call to validate extracted facts against existing memories and conversation. Checks accuracy, detects conflicts, catches missed facts, and enriches vague facts. If reflection fails (e.g., invalid JSON), returns original facts unchanged so the pipeline is never broken.

```ts
import { reflectOnExtraction } from "vitamem";

const result = await reflectOnExtraction(
  extractedFacts,
  existingMemories,
  conversationMessages,
  llmAdapter,
);
```
| Param | Type | Description |
| --- | --- | --- |
| `extractedFacts` | `ExtractedFact[]` | Facts from the extraction step |
| `existingMemories` | `Array<{ content: string; source: string }>` | User's current stored memories |
| `originalMessages` | `Array<{ role: string; content: string }>` | The original conversation |
| `llm` | `{ chat: (...) => Promise<string> }` | LLM adapter with a chat method |
| `customPrompt` | `string?` | Optional custom system prompt |

Returns: Promise<ReflectionResult>


applyReflectionResult()

Convert a ReflectionResult into a flat `ExtractedFact[]` for the pipeline. Filters out facts with action `"remove"` and merges in missed facts.

```ts
import { applyReflectionResult } from "vitamem";

const finalFacts = applyReflectionResult(reflectionResult);
```
| Param | Type | Description |
| --- | --- | --- |
| `result` | `ReflectionResult` | The result from `reflectOnExtraction()` |

Returns: ExtractedFact[] — Cleaned, enriched facts ready for the embedding pipeline.


classifyStructuredFacts()

Classify extracted facts into structured profile fields using pattern-matching rules.

```ts
import { classifyStructuredFacts, HEALTH_STRUCTURED_RULES } from "vitamem";

const structuredFacts = classifyStructuredFacts(extractedFacts, HEALTH_STRUCTURED_RULES);
```

applyStructuredFacts()

Apply classified structured facts to a user profile with set/add/remove semantics.

```ts
import { applyStructuredFacts } from "vitamem";

const updatedProfile = applyStructuredFacts(currentProfile, structuredFacts);
```

| Export | Description |
| --- | --- |
| `PRESETS` | Built-in preset configurations |
| `canTransition(from, to)` | Check if a thread state transition is valid |
| `transition(thread, to)` | Apply a state transition to a thread |
| `shouldCool(thread, timeoutMs)` | Check if a thread should transition to cooling |
| `shouldGoDormant(thread, timeoutMs)` | Check if a thread should transition to dormant |
| `reactivate(thread)` | Reactivate a cooling thread back to active |
| `extractMemories(messages, llm, sessionDate?)` | Extract memories from messages using an LLM |
| `cosineSimilarity(a, b)` | Compute cosine similarity between two vectors |
| `isDuplicate(a, b, threshold)` | Check if two embeddings are duplicates |
| `deduplicateFacts(facts, existing)` | Deduplicate extracted facts against existing memories |
| `validateExtraction(memories)` | Validate extracted memory structure |
| `runEmbeddingPipeline(...)` | Run the full embedding pipeline (extract, embed, dedup, save) |
| `applyRecencyWeighting(results, weight, maxAge)` | Apply recency-based score weighting |
| `applyMMR(candidates, weight, limit)` | Apply Maximal Marginal Relevance for diverse retrieval |
| `EphemeralAdapter` | In-memory storage adapter |
| `SupabaseAdapter` | Supabase-backed storage adapter |
| `createOpenAIAdapter(opts)` | Factory for OpenAI LLM adapter |
| `createAnthropicAdapter(opts)` | Factory for Anthropic LLM adapter |
| `createOllamaAdapter(opts)` | Factory for Ollama LLM adapter |
| `HEALTH_AUTO_PIN_RULES` | Built-in auto-pin rules for health domains |
| `HEALTH_STRUCTURED_RULES` | Built-in structured extraction rules for health domains |
| `createEmptyProfile(userId)` | Create an empty UserProfile with default values |
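Several of these helpers are plain numeric utilities. For example, cosine similarity between two embedding vectors can be sketched as follows (an illustrative implementation, not vitamem's source; the exported `cosineSimilarity(a, b)` may differ in details such as zero-vector handling):

```ts
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors.
// Embedding similarity scores are typically in [0, 1] in practice.
function cosineSim(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```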