# OpenAI

Install the Vitamem package and the OpenAI peer dependency:

```sh
npm install vitamem openai
```

The fastest way to get started with OpenAI:

```ts
import { createVitamem } from "vitamem";

const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
});
```

This uses the default models: `gpt-5.4-mini` for chat and extraction, and `text-embedding-3-small` for embeddings.

For full control, use createOpenAIAdapter directly:

```ts
import { createOpenAIAdapter, createVitamem } from "vitamem";

const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  chatModel: "gpt-5.4-mini",
  embeddingModel: "text-embedding-3-small",
});

const mem = await createVitamem({
  llm,
  storage: "ephemeral",
});
```
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `apiKey` | `string` | required | Your OpenAI API key. |
| `chatModel` | `string` | `"gpt-5.4-mini"` | Model used for chat completions and memory extraction. |
| `embeddingModel` | `string` | `"text-embedding-3-small"` | Model used for text embeddings. |
| `baseUrl` | `string` | `undefined` | Override the API base URL (see below). |
| `apiMode` | `"completions" \| "responses"` | `"completions"` | Which OpenAI API shape to use (see API Mode). |
| `extraChatOptions` | `object` | `undefined` | Provider-specific options spread into every chat/extraction call (see Pass-through Options). |
| `extraEmbeddingOptions` | `object` | `undefined` | Provider-specific options spread into every embedding call. |
| `extractionPrompt` | `string` | Built-in prompt | Custom prompt for memory extraction. Must include a `{conversation}` placeholder. |

## API Mode

Vitamem supports two OpenAI API modes:

- `completions` (default) — Uses `/v1/chat/completions`. Compatible with OpenAI, DashScope, Azure OpenAI, Groq, Together AI, and other OpenAI-compatible providers.
- `responses` — Uses OpenAI’s newer Responses API (`/v1/responses`) with extended features.

```ts
const adapter = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  apiMode: "responses", // Use Responses API
});
```

Or via environment variable:

```sh
OPENAI_API_MODE=responses
```
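The two modes differ in which endpoint the adapter calls. As an illustrative sketch (the `endpointFor` helper below is hypothetical, not part of the Vitamem API; the real adapter's routing may differ):

```typescript
// Hypothetical mapping from apiMode to the endpoint path each mode targets,
// as described in the list above.
type ApiMode = "completions" | "responses";

function endpointFor(mode: ApiMode): string {
  return mode === "responses" ? "/v1/responses" : "/v1/chat/completions";
}
```

This is also why `completions` is the safer default for OpenAI-compatible third-party providers: most of them implement `/v1/chat/completions` but not `/v1/responses`.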

## Pass-through Options

Pass any provider-specific options directly to the underlying SDK call using `extraChatOptions` and `extraEmbeddingOptions`. These are spread into the SDK request, so you can use any option your provider supports without waiting for explicit Vitamem support.

```ts
const adapter = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  extraChatOptions: {
    temperature: 0.7,
    max_tokens: 1024,
  },
  extraEmbeddingOptions: {
    dimensions: 512,
  },
});
```

Or via environment variables (JSON format):

```sh
OPENAI_EXTRA_CHAT_OPTIONS={"temperature": 0.7, "max_tokens": 1024}
OPENAI_EXTRA_EMBEDDING_OPTIONS={"dimensions": 512}
```
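A minimal sketch of how such JSON-valued environment variables could be parsed into an options object (`parseJsonEnv` is a hypothetical helper for illustration; the real adapter's error handling may differ, e.g. it may throw on invalid JSON rather than ignore it):

```typescript
// Hypothetical sketch: parse a JSON-valued environment variable into an
// options object. Missing or invalid values fall back to undefined here.
function parseJsonEnv(value: string | undefined): Record<string, unknown> | undefined {
  if (!value) return undefined;
  try {
    return JSON.parse(value) as Record<string, unknown>;
  } catch {
    return undefined; // invalid JSON is ignored in this sketch
  }
}

// e.g. parseJsonEnv(process.env.OPENAI_EXTRA_CHAT_OPTIONS)
const extraChatOptions = parseJsonEnv('{"temperature": 0.7, "max_tokens": 1024}');
```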

Any model available through the OpenAI chat completions API can be used as the `chatModel`:

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  chatModel: "gpt-5.4-mini",
});
```

The embedding model determines the vector dimensions stored in your database. If you change the embedding model after memories have been saved, existing embeddings will not be compatible with new ones.

| Model | Dimensions | Notes |
| --- | --- | --- |
| `text-embedding-3-small` | 1536 | Default. Good quality at low cost. |
| `text-embedding-3-large` | 3072 | Higher accuracy, double the storage. |
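The storage difference is easy to estimate, assuming each dimension is stored as a 4-byte float32 (actual overhead depends on your storage backend):

```typescript
// Back-of-the-envelope embedding storage estimate, assuming 4 bytes per
// dimension (float32) and ignoring index/metadata overhead.
function embeddingBytes(dimensions: number, memoryCount: number): number {
  return dimensions * 4 * memoryCount;
}

// For 10,000 stored memories:
const small = embeddingBytes(1536, 10_000); // 61,440,000 bytes ≈ 61 MB
const large = embeddingBytes(3072, 10_000); // 122,880,000 bytes ≈ 123 MB
```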

## Custom Base URLs

The `baseUrl` option lets you point the adapter at any API that implements the OpenAI chat completions and embeddings endpoints. This is useful for proxies, self-hosted models, or enterprise gateways.

DashScope provides an OpenAI-compatible endpoint that supports both chat completions (Qwen models) and embeddings.

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.DASHSCOPE_API_KEY!,
  chatModel: "qwen3.5-flash",
  embeddingModel: "text-embedding-v4",
  baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1",
});
```

Or configure entirely via environment variables:

```sh
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-dashscope-key
OPENAI_CHAT_MODEL=qwen3.5-flash
OPENAI_EMBEDDING_MODEL=text-embedding-v4
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
OPENAI_API_MODE=completions
```

Recommended DashScope models:

| Purpose | Model | Notes |
| --- | --- | --- |
| Chat / Extraction | `qwen3.5-flash` | Fast, cost-effective. |
| Chat / Extraction | `qwen-plus` | Higher accuracy. |
| Embeddings | `text-embedding-v4` | Use via the compatible-mode URL. |
Azure OpenAI:

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  baseUrl: "https://your-resource.openai.azure.com/openai/deployments/your-deployment",
  chatModel: "gpt-4o",
});
```

Groq:

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  chatModel: "llama-3.3-70b-versatile",
  baseUrl: "https://api.groq.com/openai/v1",
});
```

LM Studio:

```ts
const llm = createOpenAIAdapter({
  apiKey: "lm-studio", // LM Studio does not require a real key
  baseUrl: "http://localhost:1234/v1",
  chatModel: "your-loaded-model",
  embeddingModel: "your-embedding-model",
});
```

vLLM:

```ts
const llm = createOpenAIAdapter({
  apiKey: "vllm",
  baseUrl: "http://localhost:8000/v1",
  chatModel: "meta-llama/Llama-3-8b-chat-hf",
});
```

The OpenAI adapter supports streaming in both the completions and responses API modes. Use `chatStream()` or `chatWithUserStream()` on the Vitamem instance to receive tokens as they are generated:

```ts
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
});

const { stream } = await mem.chatStream({
  threadId: thread.id,
  message: "What should I know about my medications?",
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
```

See Streaming Output for the full guide, including SSE integration and fallback behavior.

The extraction prompt controls how the LLM identifies facts worth remembering from a conversation. The default prompt is tuned for health companions, focusing on conditions, medications, lifestyle, and goals.

To customize it, pass an `extractionPrompt` string. It must include the `{conversation}` placeholder, which Vitamem replaces with the formatted message history.

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  extractionPrompt: `Extract health-related facts from this conversation.
Focus on: diagnoses, medications, vitals, and care preferences.
Conversation:
{conversation}
Return a JSON array only (no markdown, no explanation):
[{ "content": "brief factual statement", "source": "confirmed" | "inferred" }]`,
});
```
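Because a prompt without the placeholder would silently drop the conversation, it can be worth validating before constructing the adapter. A minimal sketch (`validateExtractionPrompt` and `renderPrompt` are illustrative helpers, not part of the Vitamem API; the substitution mirrors the described behavior of replacing `{conversation}` with the formatted history):

```typescript
// Hypothetical guard: fail fast if the required placeholder is missing.
function validateExtractionPrompt(prompt: string): string {
  if (!prompt.includes("{conversation}")) {
    throw new Error("extractionPrompt must contain a {conversation} placeholder");
  }
  return prompt;
}

// Substitution as described in the docs: the placeholder is replaced with
// the formatted message history.
function renderPrompt(prompt: string, conversation: string): string {
  return validateExtractionPrompt(prompt).replace("{conversation}", conversation);
}
```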

When using the string shortcut via `createVitamem`, you can override models with the `model` and `embeddingModel` fields:

```ts
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o",
  embeddingModel: "text-embedding-3-large",
  storage: "ephemeral",
});
```

The OpenAI adapter lazy-loads the `openai` SDK at runtime. It is listed as an optional peer dependency of Vitamem:

```json
{
  "peerDependencies": {
    "openai": ">=4.0.0"
  }
}
```

If the `openai` package is not installed and you attempt to use the OpenAI adapter, you will get a module resolution error at runtime.
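The general lazy-load pattern can be sketched as follows. This is a hypothetical illustration (`loadOptionalPeer` is not a Vitamem export, and the real adapter's error message may differ): the SDK is only resolved on first use, and a missing module is turned into an actionable error.

```typescript
// Hypothetical sketch of the optional-peer-dependency pattern: defer loading
// until first use, and rewrite a resolution failure into a clear message.
type Loader = () => unknown;

function loadOptionalPeer(name: string, load: Loader): unknown {
  try {
    return load();
  } catch {
    throw new Error(
      `Optional peer dependency "${name}" is not installed. ` +
        `Run \`npm install ${name}\` to use this adapter.`,
    );
  }
}

// In a real adapter this might look like:
//   loadOptionalPeer("openai", () => require("openai"));
```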