v1.0 — MemLayer

Your AI Agent Forgets Every User.
Every Single Session. That's Costing You.

MemLayer is the only agent memory API built with hybrid scoring — combining semantic search, recency decay, and importance weighting so your agent retrieves the RIGHT memory, not just any memory. One API key. No vector DB to manage. No infrastructure to maintain.

See How It Works →

Built on OpenAI embeddings + pgvector. Works with LangGraph, AutoGen, CrewAI, and any custom agent.

quickstart.py
from memlayer import MemlayerClient
client = MemlayerClient(api_key="memlayer_live_xxx")
memories = client.recall("what does this user prefer?", user_id="u1")

You've Tried Everything. Nothing Sticks.

Every workaround developers use for agent memory has a breaking point. You've probably hit at least two.

I'm stuffing chat history into the system prompt

You'll hit the token limit. Your costs will explode. At 10,000 users, this is $3,000/month in wasted tokens.

I built my own vector DB solution

Now you maintain it. It gets noisy. Old preferences outrank fresh ones. Your agent gives wrong answers.

My agent keeps asking the same questions

"What's your name?" "Where are you based?" — again. Users give up. Churn goes up.

User updated their info but my agent still uses the old data

Two conflicting memories. Agent picks the wrong one. No way to fix it without wiping everything.

These aren't edge cases. They're what happens to every AI agent developer at scale. MemLayer was built specifically to solve them.

Why MemLayer Is Different From Every Other Memory Solution

vs stuffing prompts

  • Token limits hit at ~20 messages
  • No semantic search
  • Old messages = wasted tokens
  • Costs scale with conversation length
MemLayer
  • Stores unlimited memories
  • Returns only what's relevant
  • Same cost at 1 user or 100,000

vs mem0 / LangMem

  • Black box — you can't see scores
  • No BYOD — your data, their servers
  • Framework-specific
  • Opinionated about what to store
MemLayer
  • score_detail on every result
  • BYOD — your Supabase, your data
  • Works with any LLM or framework
  • You decide what gets stored

vs building yourself

  • 2–3 days minimum to build
  • Ongoing maintenance forever
  • No duplicate detection
  • No hybrid scoring out of the box
MemLayer
  • Running in 5 minutes
  • We maintain it
  • Duplicate detection built in
  • Hybrid scoring built in

The Memory Layer Your Agent Is Missing

Not just storage. Intelligent retrieval. MemLayer doesn't just save text — it knows which memory to surface, when to surface it, and when to let it fade.

🧠

Retrieval That Actually Makes Sense

Other memory APIs return the most similar vector. MemLayer returns the most USEFUL memory. Hybrid scoring combines semantic relevance (70%), recency decay (20%), and importance weighting (10%). A preference from yesterday beats an identical one from 6 months ago. Every time.
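To make the 70/20/10 split concrete, here is a minimal sketch of how such a blend could be computed. The weights come from the paragraph above; the exponential decay, the 30-day half-life, and the function names are assumptions for illustration, not MemLayer's actual implementation.

```python
# Weights quoted in the text: semantic 70%, recency 20%, importance 10%.
W_SEMANTIC, W_RECENCY, W_IMPORTANCE = 0.7, 0.2, 0.1

# Assumption: exponential decay with a 30-day half-life.
# The real decay curve is not documented on this page.
HALF_LIFE_DAYS = 30.0


def recency_score(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Return 1.0 for a brand-new memory, halving every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(cosine: float, age_days: float, importance: float) -> float:
    """Blend similarity, recency, and importance using the 70/20/10 split."""
    return (
        W_SEMANTIC * cosine
        + W_RECENCY * recency_score(age_days)
        + W_IMPORTANCE * importance
    )


# Two memories with identical similarity and importance, different ages.
fresh = hybrid_score(cosine=0.9, age_days=1, importance=0.5)
stale = hybrid_score(cosine=0.9, age_days=180, importance=0.5)
print(f"yesterday: {fresh:.3f}  six months ago: {stale:.3f}")
```

With equal similarity and importance, only the recency term differs, so the newer memory always scores higher under this sketch.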

🔍

See Exactly Why Every Memory Ranked

Every search result includes score_detail — a breakdown of cosine similarity, recency score, and importance score. No black box. Debug in seconds, not hours.

🏢

Your Data Stays Where You Want It

Free and Pro: we handle everything on our infrastructure. Enterprise: BYOD (bring your own database). Connect your own Supabase instance and your data never leaves your servers. GDPR and HIPAA use cases handled out of the box.

🔌

Works With Everything You're Already Using

LangGraph. AutoGen. CrewAI. Custom agents. Raw REST API. Python SDK. If it can make an HTTP request, it works with MemLayer. No framework lock-in. Ever.

Three Calls. Your Agent Never Forgets Again.

01

remember()

When your agent learns something, store it. MemLayer embeds it, deduplicates it, and makes it retrievable forever.

POST /memories
{
"content": "User prefers concise responses",
"user_id": "user_123",
"agent_id": "support_bot"
}
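If you are not using the SDK, the same call can be sketched with the Python standard library. The base URL is the one from the quickstart; the bearer-token header and the helper name are assumptions, so check the API reference for the real auth scheme.

```python
import json
import urllib.request

BASE_URL = "https://api.memlayer.online"  # base_url shown in the quickstart


def build_remember_request(
    api_key: str, content: str, user_id: str, agent_id: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /memories request."""
    payload = {"content": content, "user_id": user_id, "agent_id": agent_id}
    return urllib.request.Request(
        url=f"{BASE_URL}/memories",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_remember_request(
    "memlayer_live_xxx", "User prefers concise responses", "user_123", "support_bot"
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```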
02

context()

Session starts. Before the user says a word, load what your agent already knows. Inject into the system prompt. Agent is already personalized.

GET /memories/context
?user_id=user_123
&agent_id=support_bot
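Injecting the loaded context into a system prompt might look like the sketch below. It assumes the context call returns a list of objects with a "content" field; the response shape, the helper name, and the prompt wording are all hypothetical.

```python
def build_system_prompt(base_prompt: str, memories: list[dict]) -> str:
    """Append stored memories to a system prompt as a bulleted block."""
    if not memories:
        # Nothing known yet: leave the prompt untouched.
        return base_prompt
    lines = [f"- {m['content']}" for m in memories]
    return base_prompt + "\n\nKnown about this user:\n" + "\n".join(lines)


# Hypothetical context payload for user_123.
loaded = [
    {"content": "User prefers concise responses"},
    {"content": "User is based in Berlin"},
]
prompt = build_system_prompt("You are a helpful support bot.", loaded)
print(prompt)
```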
03

recall()

User asks something. Search semantically. Hybrid scoring returns the most relevant, most recent, most important memory. Not just the most similar vector.

GET /memories/search
?query=what+does+this+user+prefer
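Query strings need URL encoding, which is where the plus signs above come from. A small sketch using the standard library follows; passing user_id and agent_id to the search endpoint is an assumption based on the SDK's recall() arguments.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.memlayer.online"  # base_url shown in the quickstart


def build_search_url(query: str, user_id: str, agent_id: str) -> str:
    """Return a GET /memories/search URL with an encoded query string."""
    # Assumption: the endpoint accepts user_id and agent_id as query params.
    params = {"query": query, "user_id": user_id, "agent_id": agent_id}
    return f"{BASE_URL}/memories/search?{urlencode(params)}"


url = build_search_url("what does this user prefer", "user_123", "support_bot")
print(url)
```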

That's it. Three methods. Your agent has persistent, intelligent memory across every session.

Up and Running in 5 Minutes

quickstart.py
# Install first: pip install memlayer
from memlayer import MemlayerClient

client = MemlayerClient(
    api_key="memlayer_live_xxx",
    base_url="https://api.memlayer.online",
)

# Store what your agent learns
client.remember(
    "User prefers concise bullet points",
    user_id="user_123",
    agent_id="support_bot",
    memory_type="semantic",
    importance=0.8,
)

# Load context at session start
context = client.context(
    user_id="user_123",
    agent_id="support_bot",
)

# Search when user asks something
memories = client.recall(
    query="what are this user's preferences?",
    user_id="user_123",
    agent_id="support_bot",
)

Eight Endpoints. The Entire Memory Layer for Your Agent.

No bloat. No configuration. Every endpoint does exactly one thing and does it perfectly.

POST     /memories           Store a memory
GET      /memories/search    Semantic search
GET      /memories/context   Load session context
GET      /memories           List memories
PATCH    /memories/{id}      Update a memory
DELETE   /memories/{id}      Delete one memory
DELETE   /memories           Wipe all memories

Most memory APIs require you to configure pipelines, choose embedding models, and tune retrieval parameters. MemLayer ships with production defaults — OpenAI embeddings, hybrid scoring, duplicate detection — all on by default.
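For a sense of what duplicate detection involves, here is one common approach: compare the new memory's embedding against stored ones and flag anything above a similarity threshold. This is a generic sketch, not MemLayer's actual method; the 0.95 threshold and the toy 3-dimensional vectors are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(
    new_vec: list[float], stored_vecs: list[list[float]], threshold: float = 0.95
) -> bool:
    """True if any stored embedding is at least `threshold` similar."""
    return any(cosine_similarity(new_vec, v) >= threshold for v in stored_vecs)


# Toy embeddings; real ones would have 1536 dimensions.
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(is_duplicate([0.99, 0.05, 0.0], stored))  # near-copy of the first vector
print(is_duplicate([0.6, 0.6, 0.5], stored))    # clearly different
```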

Your Time Is Worth More Than This Problem

We've already solved it. Here's what you'd spend building the same thing from scratch:

Build it yourself

  • Set up pgvector: 2 hrs · $200
  • Write embedding pipeline: 1 hr · $100
  • Implement hybrid scoring: 1 day · $800
  • Build duplicate detection: 3 hrs · $300
  • Handle TTL expiry: 2 hrs · $200
  • Multi-tenant isolation: 1 day · $800
  • Maintain it forever: ongoing · $???

Total ~3 days · ~$2,400 + ongoing maintenance
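The total can be checked with a few lines of arithmetic. The $100 hourly rate is implied by the line items (2 hrs for $200); the 8-hour working day is an assumption that makes "1 day" equal $800.

```python
HOURS_PER_DAY = 8   # assumption: one working day is 8 hours
HOURLY_RATE = 100   # implied by the line items (2 hrs = $200)
PRO_MONTHLY = 19    # MemLayer Pro price in dollars

task_hours = {
    "Set up pgvector": 2,
    "Write embedding pipeline": 1,
    "Implement hybrid scoring": 1 * HOURS_PER_DAY,
    "Build duplicate detection": 3,
    "Handle TTL expiry": 2,
    "Multi-tenant isolation": 1 * HOURS_PER_DAY,
}

total_hours = sum(task_hours.values())
total_cost = total_hours * HOURLY_RATE
months_of_pro = total_cost // PRO_MONTHLY  # whole months the build cost would cover

print(f"{total_hours} hrs = {total_hours // HOURS_PER_DAY} days, ${total_cost:,}")
print(f"Equivalent to {months_of_pro} months of Pro, before any maintenance")
```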

Use MemLayer Pro

  • Already done
  • Already done
  • Already done
  • Already done
  • Already done
  • Already done
  • Already done

5 minutes + $19/month

And that's before you hit the edge cases. Duplicate memories. Stale facts outranking fresh ones. Counts drifting out of sync after deletions. We've already solved all of it.

Start Free. Scale When You're Ready.

Every plan includes hybrid scoring, duplicate detection, TTL expiry, and semantic search. No features locked behind paywalls. Just higher limits.

Free

$0/mo

Perfect for building and testing your agent. No credit card. No time limit. No catch.

  • 500 memories
  • 100 req/day
  • Hosted only
  • Community support
MOST POPULAR

Pro

$19/mo

For agents in production serving real users. 50,000 memories handles hundreds of users with thousands of interactions each.

  • 50,000 memories
  • 10,000 req/day
  • Hosted
  • Email support

Enterprise

$99+/mo

For companies with data sovereignty requirements. BYOD means your data never touches our servers. GDPR. HIPAA. SOC2-ready architecture.

  • Unlimited memories
  • Unlimited requests
  • Hosted + BYOD
  • Priority support · SLA · GDPR/HIPAA

Not sure which plan? Start free. You'll know when you need Pro — your users will tell you by coming back.

The Problem Is Real. We've Seen the Receipts.

These aren't made-up pain points. They're what developers post on Reddit at 2am when their agent breaks in production.

"Memory is becoming the real bottleneck for AI agents. If code used to be the bottleneck, memory might be the new one."

Posted on r/AI_Agents · 111 shares · 8 months ago

"Nobody talks about what AI memory looks like after six months in production. Old preferences keep winning retrieval, sarcastic comments get stored as literal truth."

Posted on r/aiagents · 22 comments · 1 day ago

MemLayer was built because we saw these posts and knew the infrastructure to fix them already existed. We just built the API layer on top of it.

Questions Developers Actually Ask

How is MemLayer different from mem0 or LangMem?

mem0 and LangMem make decisions about what to store — they extract memories from conversations automatically. MemLayer doesn't. You decide what gets stored. MemLayer's job is to store it reliably and retrieve the right thing at the right time. We're infrastructure, not intelligence. That boundary matters. Also: MemLayer shows you score_detail on every search result. No other memory API does that.

Do I need to set up a vector database?

No. MemLayer handles pgvector, embeddings, indexing, and retrieval. You call an API. We do the rest.

Why does retrieval use hybrid scoring instead of pure vector similarity?

Pure cosine similarity returns the most similar vector — not the most useful memory. A preference from 6 months ago can outscore an identical one from yesterday using pure similarity. Recency decay fixes that. Importance weighting lets you pin critical facts. Together they return what your agent actually needs.

Can I see why a memory ranked where it did?

Yes. Every search result includes score_detail — cosine, recency, importance, and final score. Most memory APIs are black boxes. MemLayer isn't.
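A quick way to use that breakdown while debugging is sketched below. The key names ("cosine", "recency", "importance", "final") mirror the four fields listed above but are assumed, and the sample results are made up for illustration.

```python
def explain_ranking(results: list[dict]) -> list[str]:
    """Sort results by final score and describe how each score was built."""
    ranked = sorted(results, key=lambda r: r["score_detail"]["final"], reverse=True)
    lines = []
    for r in ranked:
        d = r["score_detail"]
        lines.append(
            f"{d['final']:.3f}  {r['content']}  "
            f"(cosine={d['cosine']:.2f}, recency={d['recency']:.2f}, "
            f"importance={d['importance']:.2f})"
        )
    return lines


# Hypothetical search results; field names are assumptions.
sample = [
    {"content": "Prefers long explanations",
     "score_detail": {"cosine": 0.92, "recency": 0.02, "importance": 0.5, "final": 0.698}},
    {"content": "Prefers concise bullet points",
     "score_detail": {"cosine": 0.90, "recency": 0.98, "importance": 0.5, "final": 0.876}},
]
for line in explain_ranking(sample):
    print(line)
```

In this made-up example the older memory has the higher cosine similarity, yet the newer one ranks first because of its recency score.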

What happens to my data?

Free and Pro plans: stored on our Supabase instance, isolated by your tenant ID. No other tenant can access your data. Enterprise BYOD: your Supabase instance, we run the engine. Your data never leaves your servers.

Which embedding model does MemLayer use?

OpenAI text-embedding-3-small — 1536 dimensions, 99.9% uptime SLA, production-grade reliability. Not a free tier model with no uptime guarantee.

Does it work with LangGraph / AutoGen / CrewAI?

Yes. Any framework that can make an HTTP request works with MemLayer. We have a Python SDK (pip install memlayer-py) and a full REST API. Framework agnostic by design.

What if I hit my memory limit?

You get a clear 402 error: "Memory limit reached — upgrade to store more." No silent failures. No data loss. Upgrade and the new limit applies immediately.
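In client code, that 402 can be surfaced as its own exception so your agent can react to it separately from other failures. This is a minimal sketch: the exception class and helper are hypothetical, and only the 402 status and its message come from the answer above.

```python
class MemoryLimitReached(Exception):
    """Raised when the API reports that the plan's memory limit is hit."""


def check_store_response(status_code: int, message: str = "") -> None:
    """Raise a specific error for HTTP 402, a generic one for other failures."""
    if status_code == 402:
        raise MemoryLimitReached(message or "Memory limit reached")
    if status_code >= 400:
        raise RuntimeError(f"Request failed with HTTP {status_code}: {message}")


try:
    check_store_response(402, "Memory limit reached — upgrade to store more.")
except MemoryLimitReached as exc:
    # Existing memories are untouched; only the new write was rejected.
    print(f"Skipping store: {exc}")
```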

Your Competitors' Agents Are Already Remembering.

Every session your agent starts from zero is a session where a user feels like a stranger. Persistent memory is no longer a nice-to-have. It's the baseline expectation for any AI product that wants users to come back.

Free forever on the starter plan. No credit card. Your first memory stored in under 5 minutes.