pgvector · AES-256-GCM · Gemini Flash · Public Beta

Give your AI a memory layer.

Persistent memory infrastructure for AI applications. Extract structured facts from every conversation, store them encrypted, and surface the right context for any LLM in milliseconds.

Start building free · View API docs

No credit card · 1,000 free memories · OpenAI / Claude / Gemini / Llama

memory_sdk.py
# Extract structured memory from any conversation text
import requests

response = requests.post(
    "https://ai-memory-sdk.onrender.com/api/v1/memory/ingest",
    headers={"Authorization": "Bearer sk_live_..."},
    json={
        "tenant_id": "acme-app",
        "user_id": "usr_8821",
        "text": "I prefer dark mode and hate long onboarding flows"
    }
)

# Returns structured SPO triples with confidence scores
# {"memories":[{"subject":"user","predicate":"prefers","object":"dark mode","confidence":0.94}]}
OpenAI GPT-4o
Anthropic Claude
Google Gemini Flash
Meta Llama 3
Mistral Large
Cohere Command R
pgvector HNSW
AES-256-GCM
Argon2id
Redis
JWT Auth
GDPR Compliant
Live Playground

See extraction in real time.

Type any user preference. Watch the SDK extract structured Subject-Predicate-Object triples live.

User Input
Extracted Memories
Extracted memories will appear here...
Capabilities

Production-grade by design.

Every feature built for real workloads, real compliance, and real scale. Not a wrapper — a complete memory layer.

01
Structured Memory Extraction

Converts raw conversation into Subject-Predicate-Object triples via Gemini Flash. Hallucinations, conditionals, and speculative statements filtered before storage. Only verified facts persist, each scored by confidence and importance.

SPO triples · hallucination filter · deduplication · confidence scoring
02
Semantic Retrieval

pgvector cosine similarity with HNSW indexing. Ranked by relevance, recency, confidence, and importance.

pgvector · HNSW · weighted ranking
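The actual weighting is internal to the SDK, but the idea can be sketched as a scoring function that blends cosine similarity with recency decay, confidence, and importance. All weights and the half-life below are illustrative assumptions, not the SDK's real values:

```python
import time

def rank_score(similarity: float, confidence: float, importance: float,
               stored_at: float, w_sim: float = 0.6, w_rec: float = 0.2,
               w_conf: float = 0.1, w_imp: float = 0.1,
               half_life_days: float = 30.0) -> float:
    """Illustrative ranking: weights and half-life are assumptions, not SDK internals."""
    age_days = (time.time() - stored_at) / 86400.0
    recency = 0.5 ** (age_days / half_life_days)  # exponential recency decay
    return (w_sim * similarity + w_rec * recency
            + w_conf * confidence + w_imp * importance)
```

With equal inputs, a memory stored today outranks the same memory stored two months ago, which is the "recency" half of the weighted ranking described above.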
03
Zero-Trust Security

AES-256-GCM encryption on every memory. Argon2id key derivation. JWT auth with row-level tenant isolation on every query.

AES-256-GCM · Argon2id · JWT
04
Multi-Tenant Isolation

Strict tenant_id partitioning at the database row level. B2B-grade separation enforced on every single query.

row-level security · B2B-ready
05
GDPR Ready

Export, delete, and forget endpoints built in. Seven-year audit trail. Right to erasure in a single API call.

GDPR · audit trail · right to erasure
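The page does not show the erasure route itself, so the path and field names below are illustrative assumptions; only the base URL and the Bearer auth scheme come from the ingest example above. A sketch of the "single API call" erasure flow:

```python
import requests

BASE = "https://ai-memory-sdk.onrender.com/api/v1/memory"

def erasure_request(tenant_id: str, user_id: str) -> tuple:
    """Build a hypothetical right-to-erasure call (route and fields are illustrative)."""
    return ("DELETE", f"{BASE}/user", {"tenant_id": tenant_id, "user_id": user_id})

def send(api_key: str, method: str, url: str, payload: dict) -> dict:
    """Execute the prepared request with Bearer auth, as in the ingest example."""
    resp = requests.request(method, url,
                            headers={"Authorization": f"Bearer {api_key}"},
                            json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()
```

Splitting request construction from transport keeps the erasure call auditable: the (method, url, payload) triple can be logged to the seven-year audit trail before it is sent.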
Architecture

Four steps.
One endpoint.

Deterministic. Sub-millisecond retrieval. Zero configuration required.

01
Ingest
Send conversation text

POST any conversation to the ingest endpoint with your API key. No preprocessing required.

02
Extract
Gemini extracts structured facts

SPO triples identified and scored. Hallucinations and conditionals removed before storage.

03
Store
Encrypted and indexed

AES-256-GCM encrypted, pgvector embedded, HNSW indexed for sub-millisecond retrieval.

04
Retrieve
Context ready for injection

Ranked memories returned ready to inject into your next LLM system prompt.

EXTRACTED MEMORIES — usr_8821 processing
user · prefers · dark mode
user · dislikes · long onboarding
user · uses · React + TypeScript
user · works at · Acme Corp
Average confidence score: 91.2%
System Design

Built for production AI systems.

Deterministic memory infrastructure designed for reliability, scale, and complete control at every layer.

Application
API and Integration
REST API and SDK Integration
Multi-tenant Isolation
Stateless Containers
Memory Engine
Extraction and Storage
Deterministic Extraction
Version-Controlled Updates
Conflict-Safe Writes
Data Layer
PostgreSQL and pgvector
Indexed Vector Storage
Deterministic Retrieval
TTL and Cleanup Policies
Security
Zero-Trust by Default
JWT Authentication
Argon2id Key Derivation
AES-256-GCM Encryption
<1ms
Retrieval Latency
99.5%
Uptime SLA
256-bit
AES Encryption
Any LLM
No Lock-in
Comparison

Infrastructure versus improvisation.

Most AI apps treat memory as an afterthought. We built the infrastructure so you never have to think about it again.

AI Memory SDK
Purpose-built infrastructure
Persistent memory that survives every session boundary
Structured SPO extraction, not raw text accumulation
Complete tenant isolation enforced at the row level
Semantic retrieval weighted by recency and confidence
AES-256-GCM encryption on every record, always
GDPR erasure, export, and full audit trail included
Compatible with any LLM provider, zero lock-in
Typical AI App
Stateless by default
State resets completely on every new session
Full conversation history stuffed into system prompts
No isolation, user data can bleed across tenants
Context window exhausted as usage grows
No encryption or compliance controls in place
No deletion guarantees, significant legal exposure
Tightly coupled to a single LLM provider
Integration

Three steps to persistent memory.

From zero to production memory in under ten minutes. No infrastructure setup required.

Step 01
Get your API key

Sign up and generate your production API key from the dashboard. Free tier includes 1,000 memories, no credit card required.

dashboard → API Keys → Generate
Step 02
Ingest conversation text

POST conversation text to the ingest endpoint. Structured facts are extracted and stored automatically with zero configuration.

POST /api/v1/memory/ingest
Step 03
Retrieve and inject context

Query relevant memories and inject ranked context directly into your LLM system prompt. Works with every model and provider.

POST /api/v1/memory/retrieve
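Putting the step into code: a minimal sketch that assumes the retrieve endpoint accepts the same tenant_id and user_id fields as ingest, plus a free-text query. The query and limit fields, and the shape of the response, are assumptions; only the endpoint path and auth scheme come from this page:

```python
import requests

RETRIEVE_URL = "https://ai-memory-sdk.onrender.com/api/v1/memory/retrieve"

def retrieve_memories(api_key: str, tenant_id: str, user_id: str,
                      query: str, limit: int = 5) -> list:
    """Fetch ranked memories; request/response fields beyond the docs are assumptions."""
    resp = requests.post(
        RETRIEVE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"tenant_id": tenant_id, "user_id": user_id,
              "query": query, "limit": limit},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["memories"]

def to_system_prompt(memories: list) -> str:
    """Render SPO triples as plain lines ready for any LLM's system prompt."""
    lines = [f"- {m['subject']} {m['predicate']} {m['object']}" for m in memories]
    return "Known facts about this user:\n" + "\n".join(lines)
```

Because the memories arrive as structured triples rather than raw text, the rendered block stays compact no matter how long the user's history grows.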
FAQ

Common questions.

What is AI Memory SDK?
AI Memory SDK is persistent memory infrastructure for AI applications. Most chatbots are stateless and forget everything between sessions. This SDK extracts structured facts from conversations and stores them for long-term recall across any number of future sessions.
How does the extraction work?
The SDK converts raw conversation text into Subject-Predicate-Object triples using Gemini Flash. Hallucinations, conditional statements, and speculative content are filtered out before storage. Only verified, high-confidence facts are saved, each with a confidence score and importance weight.
Which LLMs does it support?
Every LLM. OpenAI GPT-4o, Anthropic Claude, Google Gemini, Meta Llama, Mistral, Cohere, and any model you self-host. The SDK returns ranked memory strings ready to paste directly into any system prompt format. No vendor lock-in by design.
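To make "no vendor lock-in" concrete: the same rendered memory string can ride in a chat-style system message or be prepended to a plain completion prompt. A sketch, where MEMORY_BLOCK holds the example triple from this page:

```python
MEMORY_BLOCK = "Known facts about this user:\n- user prefers dark mode"

def chat_style(user_msg: str) -> list:
    # Chat-completions format (OpenAI-style role/content messages):
    # the memory block rides in the system role
    return [
        {"role": "system", "content": MEMORY_BLOCK},
        {"role": "user", "content": user_msg},
    ]

def plain_prompt(user_msg: str) -> str:
    # Single-string prompt for completion-style or self-hosted models
    return f"{MEMORY_BLOCK}\n\nUser: {user_msg}\nAssistant:"
```

Nothing in the memory payload is provider-specific, which is what keeps the switch between GPT-4o, Claude, Gemini, or a local Llama a one-line change.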
How is user data kept secure?
Every memory is encrypted at rest with AES-256-GCM using Argon2id-derived keys. Authentication uses JWT tokens with row-level tenant isolation. Data from one tenant cannot be accessed by another under any circumstances. Full GDPR compliance including right to erasure is built in from day one.
What is included in the free tier?
The free tier includes 1,000 stored memories, full API access, and all security features with no time limit. No credit card required. Pro plans start at $9 per month with unlimited memories, priority support, and advanced analytics.
Get started today

Give your AI
a memory.

Join developers building smarter, more personal AI experiences.

Free for your first 1,000 memories · Pro from $9 per month · No contracts