# OpenAI
## Installation

Install the Vitamem package and the OpenAI peer dependency:
```bash
npm install vitamem openai
```

## Quick Setup

The fastest way to get started with OpenAI:
```ts
import { createVitamem } from "vitamem";

const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
});
```

This uses the default models: `gpt-5.4-mini` for chat and extraction, `text-embedding-3-small` for embeddings.
## Adapter Factory

For full control, use `createOpenAIAdapter` directly:
```ts
import { createOpenAIAdapter, createVitamem } from "vitamem";

const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  chatModel: "gpt-5.4-mini",
  embeddingModel: "text-embedding-3-small",
});

const mem = await createVitamem({ llm, storage: "ephemeral" });
```

## Options
Section titled “Options”| Option | Type | Default | Description |
|---|---|---|---|
apiKey | string | required | Your OpenAI API key. |
chatModel | string | "gpt-5.4-mini" | Model used for chat completions and memory extraction. |
embeddingModel | string | "text-embedding-3-small" | Model used for text embeddings. |
baseUrl | string | undefined | Override the API base URL (see below). |
apiMode | "completions" | "responses" | "completions" | Which OpenAI API shape to use (see API Mode). |
extraChatOptions | object | undefined | Provider-specific options spread into every chat/extraction call (see Pass-through Options). |
extraEmbeddingOptions | object | undefined | Provider-specific options spread into every embedding call. |
extractionPrompt | string | Built-in prompt | Custom prompt for memory extraction. Must include a {conversation} placeholder. |
## API Mode

Vitamem supports two OpenAI API modes:

- `completions` (default) — uses `/v1/chat/completions`. Compatible with OpenAI, DashScope, Azure OpenAI, Groq, Together AI, and other OpenAI-compatible providers.
- `responses` — uses OpenAI's newer Responses API (`/v1/responses`) with extended features.
```ts
const adapter = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  apiMode: "responses", // Use the Responses API
});
```

Or via environment variable:

```bash
OPENAI_API_MODE=responses
```

## Pass-through Options
Pass any provider-specific options directly to the underlying SDK call using `extraChatOptions` and `extraEmbeddingOptions`. These are spread into the SDK request, so you can use any option your provider supports without waiting for explicit Vitamem support.
```ts
const adapter = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  extraChatOptions: {
    temperature: 0.7,
    max_tokens: 1024,
  },
  extraEmbeddingOptions: {
    dimensions: 512,
  },
});
```

Or via environment variables (JSON format):

```bash
OPENAI_EXTRA_CHAT_OPTIONS={"temperature": 0.7, "max_tokens": 1024}
OPENAI_EXTRA_EMBEDDING_OPTIONS={"dimensions": 512}
```
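Since these env vars hold raw JSON, a parse failure at startup is easy to hit (shell quoting, trailing commas). Conceptually the merge works like the sketch below; `parseExtraOptions` is an illustrative helper, not Vitamem's actual internal code, and here it silently falls back to `{}` on bad input:

```ts
// Parse a JSON options blob from an environment variable, falling back to {}.
// parseExtraOptions is an illustrative helper, not part of Vitamem's API.
function parseExtraOptions(raw: string | undefined): Record<string, unknown> {
  if (!raw) return {};
  try {
    const parsed = JSON.parse(raw);
    // Only accept a plain object; arrays and primitives are ignored.
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : {};
  } catch {
    return {};
  }
}

// The parsed options are then spread into every chat request:
const extraChat = parseExtraOptions(process.env.OPENAI_EXTRA_CHAT_OPTIONS);
const request = { model: "gpt-5.4-mini", messages: [], ...extraChat };
console.log(request);
```

Because the spread happens last, an env-provided `temperature` or `max_tokens` overrides nothing Vitamem needs, but a malformed key is passed straight through to the provider, which may reject the request.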
## Choosing Models

### Chat Models

Any model available through the OpenAI chat completions API works.
```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  chatModel: "gpt-5.4-mini",
});
```

### Embedding Models

The embedding model determines the vector dimensions stored in your database. If you change the embedding model after memories have been saved, existing embeddings will not be compatible with new ones.
| Model | Dimensions | Notes |
|---|---|---|
| `text-embedding-3-small` | 1536 | Default. Good quality at low cost. |
| `text-embedding-3-large` | 3072 | Higher accuracy, double the storage. |
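The incompatibility is concrete: cosine similarity is undefined across mismatched dimensions, so a 1536-d stored vector can never be scored against a new 3072-d query vector. A guard like the following (an illustrative sketch, not part of Vitamem) fails loudly instead of returning garbage after a model switch:

```ts
// Cosine similarity with an explicit dimension check. A stored 1536-d vector
// (text-embedding-3-small) cannot be compared against a 3072-d vector
// (text-embedding-3-large).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(
      `Embedding dimension mismatch: ${a.length} vs ${b.length}. ` +
        "Did the embedding model change after memories were saved?"
    );
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

If you do switch models, the safe path is to re-embed all stored memories with the new model rather than mixing vector spaces.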
## Using with OpenAI-Compatible APIs

The `baseUrl` option lets you point the adapter at any API that implements the OpenAI chat completions and embeddings endpoints. This is useful for proxies, self-hosted models, or enterprise gateways.
### DashScope (Alibaba Cloud)

DashScope provides an OpenAI-compatible endpoint that supports both chat completions (Qwen models) and embeddings.
```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.DASHSCOPE_API_KEY!,
  chatModel: "qwen3.5-flash",
  embeddingModel: "text-embedding-v4",
  baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1",
});
```

Or configure entirely via environment variables:

```bash
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-dashscope-key
OPENAI_CHAT_MODEL=qwen3.5-flash
OPENAI_EMBEDDING_MODEL=text-embedding-v4
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
OPENAI_API_MODE=completions
```

Recommended DashScope models:
| Purpose | Model | Notes |
|---|---|---|
| Chat / Extraction | `qwen3.5-flash` | Fast, cost-effective. |
| Chat / Extraction | `qwen-plus` | Higher accuracy. |
| Embeddings | `text-embedding-v4` | Use via the compatible-mode URL. |
### Azure OpenAI

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  baseUrl: "https://your-resource.openai.azure.com/openai/deployments/your-deployment",
  chatModel: "gpt-4o",
});
```

### Groq

```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  chatModel: "llama-3.3-70b-versatile",
  baseUrl: "https://api.groq.com/openai/v1",
});
```
### LM Studio

```ts
const llm = createOpenAIAdapter({
  apiKey: "lm-studio", // LM Studio does not require a real key
  baseUrl: "http://localhost:1234/v1",
  chatModel: "your-loaded-model",
  embeddingModel: "your-embedding-model",
});
```

### vLLM

```ts
const llm = createOpenAIAdapter({
  apiKey: "vllm",
  baseUrl: "http://localhost:8000/v1",
  chatModel: "meta-llama/Llama-3-8b-chat-hf",
});
```

## Streaming

The OpenAI adapter supports streaming in both `completions` and `responses` API modes. Use `chatStream()` or `chatWithUserStream()` on the Vitamem instance to receive tokens as they are generated:
```ts
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  storage: "ephemeral",
});

const { stream } = await mem.chatStream({
  threadId: thread.id,
  message: "What should I know about my medications?",
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
```

See Streaming Output for the full guide, including SSE integration and fallback behavior.
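Because the stream is a plain async iterable of string chunks, it bridges naturally to Server-Sent Events or any other incremental sink. A self-contained sketch (the `fakeStream` generator below stands in for the stream returned by `chatStream()`; the `consume` helper is illustrative):

```ts
// Stand-in for the adapter's token stream: any AsyncIterable<string> works here.
async function* fakeStream(): AsyncIterable<string> {
  for (const token of ["Take ", "them ", "with ", "food."]) {
    yield token;
  }
}

// Accumulate the full reply while forwarding each chunk as it arrives.
async function consume(
  stream: AsyncIterable<string>,
  onChunk: (c: string) => void
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    full += chunk;
    // For SSE this would be e.g. res.write(`data: ${JSON.stringify(chunk)}\n\n`)
    onChunk(chunk);
  }
  return full;
}

consume(fakeStream(), (c) => process.stdout.write(c)).then((reply) => {
  console.log(`\nfull reply length: ${reply.length}`);
});
```

Keeping the accumulated `full` string around is useful when you also need the complete reply after streaming, e.g. for logging or persistence.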
## Custom Extraction Prompt

The extraction prompt controls how the LLM identifies facts worth remembering from a conversation. The default prompt is tuned for health companions, focusing on conditions, medications, lifestyle, and goals.

To customize it, pass an `extractionPrompt` string. It must include the `{conversation}` placeholder, which Vitamem replaces with the formatted message history.
```ts
const llm = createOpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  extractionPrompt: `Extract health-related facts from this conversation.
Focus on: diagnoses, medications, vitals, and care preferences.

Conversation:
{conversation}

Return a JSON array only (no markdown, no explanation):
[{ "content": "brief factual statement", "source": "confirmed" | "inferred" }]`,
});
```
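Even with a "JSON array only" instruction, models occasionally wrap output in markdown fences. If you post-process extraction results yourself, a defensive parse along these lines helps (this is an illustrative sketch, not Vitamem's internal parser):

```ts
interface ExtractedFact {
  content: string;
  source: "confirmed" | "inferred";
}

// Strip an optional ```json ... ``` wrapper, then parse and validate the shape
// requested by the prompt above, dropping entries that don't match it.
function parseExtraction(raw: string): ExtractedFact[] {
  const stripped = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "");
  const parsed = JSON.parse(stripped);
  if (!Array.isArray(parsed)) throw new Error("Expected a JSON array of facts");
  return parsed.filter(
    (f): f is ExtractedFact =>
      typeof f?.content === "string" &&
      (f?.source === "confirmed" || f?.source === "inferred")
  );
}
```

Filtering out malformed entries, rather than rejecting the whole batch, keeps a single bad item from discarding an otherwise valid extraction.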
## Config Shortcut

When using the string shortcut via `createVitamem`, you can override models with the `model` and `embeddingModel` fields:
```ts
const mem = await createVitamem({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o",
  embeddingModel: "text-embedding-3-large",
  storage: "ephemeral",
});
```

## Peer Dependency

The OpenAI adapter lazy-loads the `openai` SDK at runtime. It is listed as an optional peer dependency of Vitamem:
```json
{
  "peerDependencies": {
    "openai": ">=4.0.0"
  }
}
```

If the `openai` package is not installed and you attempt to use the OpenAI adapter, you will get a module resolution error at runtime.
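To fail fast with a clearer message than a raw module-resolution error, you can probe for the package during startup. This sketch uses Node's `createRequire`; the `isInstalled` helper is illustrative, not part of Vitamem:

```ts
import { createRequire } from "node:module";

// Resolve relative to the current working directory so node_modules is
// searched the same way a normal import would search it.
const nodeRequire = createRequire(process.cwd() + "/noop.js");

// Illustrative helper: true if `pkg` can be resolved at runtime.
function isInstalled(pkg: string): boolean {
  try {
    nodeRequire.resolve(pkg);
    return true;
  } catch {
    return false;
  }
}

if (!isInstalled("openai")) {
  console.warn('The OpenAI adapter needs the "openai" package: npm install openai');
}
```

Running a check like this once at boot surfaces the missing peer dependency before the first chat request, instead of deep inside a request handler.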
## Next Steps

- Anthropic Provider — use Claude for chat with OpenAI embeddings
- Ollama Provider — run models locally with zero config
- Custom LLM Adapter — implement the interface for any provider