pgvector · AES-256-GCM · Gemini Flash · Public Beta

Give your AI a memory layer.

Persistent memory infrastructure for AI applications. Extract structured facts from every conversation, store them encrypted, and surface the right context for any LLM in milliseconds.

Start building free · View API docs

No credit card · 1,000 free memories · OpenAI / Claude / Gemini / Llama

memory_sdk.py
# Extract structured memory from any conversation text
import requests

response = requests.post(
    "https://ai-memory-sdk.onrender.com/api/v1/memory/ingest",
    headers={"Authorization": "Bearer sk_live_..."},
    json={
        "tenant_id": "acme-app",
        "user_id": "usr_8821",
        "text": "I prefer dark mode and hate long onboarding flows"
    }
)

# Returns structured SPO triples with confidence scores
# {"memories":[{"subject":"user","predicate":"prefers","object":"dark mode","confidence":0.94}]}
OpenAI GPT-4o
Anthropic Claude
Google Gemini Flash
Meta Llama 3
Mistral Large
Cohere Command R
pgvector HNSW
AES-256-GCM
Argon2id
Redis
JWT Auth
GDPR Compliant
Live Playground

See extraction in real time.

Type any user preference. Watch the SDK extract structured Subject-Predicate-Object triples live.

User Input
Extracted Memories
Extracted memories will appear here...
Capabilities

Production-grade by design.

Every feature built for real workloads, real compliance, and real scale. Not a wrapper — a complete memory layer.

01
Structured Memory Extraction

Converts raw conversation into Subject-Predicate-Object triples via Gemini Flash. Hallucinations, conditionals, and speculative statements filtered before storage. Only verified facts persist, each scored by confidence and importance.

SPO triples · hallucination filter · deduplication · confidence scoring
02
Semantic Retrieval

pgvector cosine similarity with HNSW indexing. Ranked by relevance, recency, confidence, and importance.

pgvector · HNSW · weighted ranking
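The actual weighting is internal to the SDK, but the idea can be sketched as a scoring function that blends cosine similarity with recency decay, confidence, and importance. All weights and the half-life below are illustrative assumptions, not the SDK's real values:

```python
import time

def rank_score(similarity: float, confidence: float, importance: float,
               stored_at: float, w_sim: float = 0.6, w_rec: float = 0.2,
               w_conf: float = 0.1, w_imp: float = 0.1,
               half_life_days: float = 30.0) -> float:
    """Illustrative ranking: weights and half-life are assumptions, not SDK internals."""
    age_days = (time.time() - stored_at) / 86400.0
    recency = 0.5 ** (age_days / half_life_days)  # exponential recency decay
    return (w_sim * similarity + w_rec * recency
            + w_conf * confidence + w_imp * importance)
```

With equal inputs, a memory stored today outranks the same memory stored two months ago, which is the "recency" half of the weighted ranking described above.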
03
Zero-Trust Security

AES-256-GCM encryption on every memory. Argon2id key derivation. JWT auth with row-level tenant isolation on every query.

AES-256-GCM · Argon2id · JWT
04
Multi-Tenant Isolation

Strict tenant_id partitioning at the database row level. B2B-grade separation enforced on every single query.

row-level security · B2B-ready
05
GDPR Ready

Export, delete, and forget endpoints built in. Seven-year audit trail. Right to erasure in a single API call.

GDPR · audit trail · right to erasure
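The page does not show the erasure route itself, so the path and field names below are illustrative assumptions; only the base URL and the Bearer auth scheme come from the ingest example above. A sketch of the "single API call" erasure flow:

```python
import requests

BASE = "https://ai-memory-sdk.onrender.com/api/v1/memory"

def erasure_request(tenant_id: str, user_id: str) -> tuple:
    """Build a hypothetical right-to-erasure call (route and fields are illustrative)."""
    return ("DELETE", f"{BASE}/user", {"tenant_id": tenant_id, "user_id": user_id})

def send(api_key: str, method: str, url: str, payload: dict) -> dict:
    """Execute the prepared request with Bearer auth, as in the ingest example."""
    resp = requests.request(method, url,
                            headers={"Authorization": f"Bearer {api_key}"},
                            json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()
```

Splitting request construction from transport keeps the erasure call auditable: the (method, url, payload) triple can be logged to the seven-year audit trail before it is sent.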
Architecture

Four steps.
One endpoint.

Deterministic. Sub-millisecond retrieval. Zero configuration required.

01
Ingest
Send conversation text

POST any conversation to the ingest endpoint with your API key. No preprocessing required.

02
Extract
Gemini extracts structured facts

SPO triples identified and scored. Hallucinations and conditionals removed before storage.

03
Store
Encrypted and indexed

AES-256-GCM encrypted, pgvector embedded, HNSW indexed for sub-millisecond retrieval.

04
Retrieve
Context ready for injection

Ranked memories returned ready to inject into your next LLM system prompt.

EXTRACTED MEMORIES — usr_8821 processing
user · prefers · dark mode
user · dislikes · long onboarding
user · uses · React + TypeScript
user · works at · Acme Corp
Average confidence score: 91.2%
System Design

Built for production AI systems.

Deterministic memory infrastructure designed for reliability, scale, and complete control at every layer.

Application
API and Integration
REST API and SDK Integration
Multi-tenant Isolation
Stateless Containers
Memory Engine
Extraction and Storage
Deterministic Extraction
Version-Controlled Updates
Conflict-Safe Writes
Data Layer
PostgreSQL and pgvector
Indexed Vector Storage
Deterministic Retrieval
TTL and Cleanup Policies
Security
Zero-Trust by Default
JWT Authentication
Argon2id Key Derivation
AES-256-GCM Encryption
<1ms
Retrieval Latency
99.5%
Uptime SLA
256-bit
AES Encryption
Any LLM
No Lock-in
Comparison

Infrastructure versus improvisation.

Most AI apps treat memory as an afterthought. We built the infrastructure so you never have to think about it again.

AI Memory SDK
Purpose-built infrastructure
Persistent memory that survives every session boundary
Structured SPO extraction, not raw text accumulation
Complete tenant isolation enforced at the row level
Semantic retrieval weighted by recency and confidence
AES-256-GCM encryption on every record, always
GDPR erasure, export, and full audit trail included
Compatible with any LLM provider, zero lock-in
Typical AI App
Stateless by default
State resets completely on every new session
Full conversation history stuffed into system prompts
No isolation, user data can bleed across tenants
Context window exhausted as usage grows
No encryption or compliance controls in place
No deletion guarantees, significant legal exposure
Tightly coupled to a single LLM provider
Integration

Three steps to persistent memory.

From zero to production memory in under ten minutes. No infrastructure setup required.

Step 01
Get your API key

Sign up and generate your production API key from the dashboard. Free tier includes 1,000 memories, no credit card required.

dashboard → API Keys → Generate
Step 02
Ingest conversation text

POST conversation text to the ingest endpoint. Structured facts are extracted and stored automatically with zero configuration.

POST /api/v1/memory/ingest
Step 03
Retrieve and inject context

Query relevant memories and inject ranked context directly into your LLM system prompt. Works with every model and provider.

POST /api/v1/memory/retrieve
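Putting the step into code: a minimal sketch that assumes the retrieve endpoint accepts the same tenant_id and user_id fields as ingest, plus a free-text query. The query and limit fields, and the shape of the response, are assumptions; only the endpoint path and auth scheme come from this page:

```python
import requests

RETRIEVE_URL = "https://ai-memory-sdk.onrender.com/api/v1/memory/retrieve"

def retrieve_memories(api_key: str, tenant_id: str, user_id: str,
                      query: str, limit: int = 5) -> list:
    """Fetch ranked memories; request/response fields beyond the docs are assumptions."""
    resp = requests.post(
        RETRIEVE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"tenant_id": tenant_id, "user_id": user_id,
              "query": query, "limit": limit},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["memories"]

def to_system_prompt(memories: list) -> str:
    """Render SPO triples as plain lines ready for any LLM's system prompt."""
    lines = [f"- {m['subject']} {m['predicate']} {m['object']}" for m in memories]
    return "Known facts about this user:\n" + "\n".join(lines)
```

Because the memories arrive as structured triples rather than raw text, the rendered block stays compact no matter how long the user's history grows.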
FAQ

Common questions.

What is AI Memory SDK?
AI Memory SDK is persistent memory infrastructure for AI applications. Most chatbots are stateless and forget everything between sessions. This SDK extracts structured facts from conversations and stores them for long-term recall across any number of future sessions.
How does the extraction work?
The SDK converts raw conversation text into Subject-Predicate-Object triples using Gemini Flash. Hallucinations, conditional statements, and speculative content are filtered out before storage. Only verified, high-confidence facts are saved, each with a confidence score and importance weight.
Which LLMs does it support?
Every LLM. OpenAI GPT-4o, Anthropic Claude, Google Gemini, Meta Llama, Mistral, Cohere, and any model you self-host. The SDK returns ranked memory strings ready to paste directly into any system prompt format. No vendor lock-in by design.
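To make "no vendor lock-in" concrete: the same rendered memory string can ride in a chat-style system message or be prepended to a plain completion prompt. A sketch, where MEMORY_BLOCK holds the example triple from this page:

```python
MEMORY_BLOCK = "Known facts about this user:\n- user prefers dark mode"

def chat_style(user_msg: str) -> list:
    # Chat-completions format (OpenAI-style role/content messages):
    # the memory block rides in the system role
    return [
        {"role": "system", "content": MEMORY_BLOCK},
        {"role": "user", "content": user_msg},
    ]

def plain_prompt(user_msg: str) -> str:
    # Single-string prompt for completion-style or self-hosted models
    return f"{MEMORY_BLOCK}\n\nUser: {user_msg}\nAssistant:"
```

Nothing in the memory payload is provider-specific, which is what keeps the switch between GPT-4o, Claude, Gemini, or a local Llama a one-line change.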
How is user data kept secure?
Every memory is encrypted at rest with AES-256-GCM using Argon2id-derived keys. Authentication uses JWT tokens with row-level tenant isolation. Data from one tenant cannot be accessed by another under any circumstances. Full GDPR compliance including right to erasure is built in from day one.
What is included in the free tier?
The free tier includes 1,000 stored memories, full API access, and all security features with no time limit. No credit card required. Pro plans start at $9 per month with unlimited memories, priority support, and advanced analytics.
Get started today

Give your AI
a memory.

Join developers building smarter, more personal AI experiences.

Free for your first 1,000 memories · Pro from $9 per month · No contracts