MemLayer is the only agent memory API built with hybrid scoring — combining semantic search, recency decay, and importance weighting so your agent retrieves the RIGHT memory, not just any memory. One API key. No vector DB to manage. No infrastructure to maintain.
Built on OpenAI embeddings + pgvector. Works with LangGraph, AutoGen, CrewAI, and any custom agent.
from memlayer import MemlayerClient

client = MemlayerClient(api_key="memlayer_live_xxx")
memories = client.recall("what does this user prefer?", user_id="u1")

Every workaround developers use for agent memory has a breaking point. You've probably hit at least two.
I'm stuffing chat history into the system prompt
You'll hit the token limit. Your costs will explode. At 10,000 users, this is $3,000/month in wasted tokens.
I built my own vector DB solution
Now you maintain it. It gets noisy. Old preferences outrank fresh ones. Your agent gives wrong answers.
My agent keeps asking the same questions
"What's your name?" "Where are you based?" — again. Users give up. Churn goes up.
User updated their info but my agent still uses the old data
Two conflicting memories. Agent picks the wrong one. No way to fix it without wiping everything.
These aren't edge cases. They're what happens to every AI agent developer at scale. MemLayer was built specifically to solve them.
Not just storage. Intelligent retrieval. MemLayer doesn't just save text — it knows which memory to surface, when to surface it, and when to let it fade.
Other memory APIs return the most similar vector. MemLayer returns the most USEFUL memory. Hybrid scoring combines semantic relevance (70%), recency decay (20%), and importance weighting (10%). A preference from yesterday beats an identical one from 6 months ago. Every time.
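As a rough illustration of how a 70/20/10 blend like this can behave, here is a minimal Python sketch. Only the three weights come from the text above. The exponential decay, the 30-day half-life, and every function name are assumptions made for the example, not MemLayer's actual implementation.

```python
import math
from datetime import datetime, timedelta, timezone

# Weights quoted in the text above.
W_SEMANTIC, W_RECENCY, W_IMPORTANCE = 0.7, 0.2, 0.1

# Assumption: exponential decay with a 30-day half-life (not a documented value).
HALF_LIFE_DAYS = 30.0


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recency_score(created_at, now):
    """1.0 for a brand-new memory, halving every HALF_LIFE_DAYS."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def hybrid_score(query_vec, memory_vec, created_at, importance, now):
    """Weighted sum of semantic relevance, recency, and importance."""
    return (
        W_SEMANTIC * cosine_similarity(query_vec, memory_vec)
        + W_RECENCY * recency_score(created_at, now)
        + W_IMPORTANCE * importance
    )


# Two memories with identical embeddings and importance, differing only in age.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
query = [0.2, 0.9, 0.1]
same_vec = [0.2, 0.9, 0.1]
fresh = hybrid_score(query, same_vec, now - timedelta(days=1), 0.5, now)
stale = hybrid_score(query, same_vec, now - timedelta(days=180), 0.5, now)
print(f"fresh={fresh:.3f} stale={stale:.3f}")
```

With identical similarity and importance, the only term that differs is recency, so the day-old memory ranks above the six-month-old one under these assumptions.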
Every search result includes score_detail — a breakdown of cosine similarity, recency score, and importance score. No black box. Debug in seconds, not hours.
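One way to use a breakdown like this when debugging is to weight each component and see which one drove the ranking. The sketch below assumes score_detail arrives as a dict with the keys "cosine", "recency", and "importance". Those key names and the sample values are illustrative guesses, not the documented response shape.

```python
# Weights from the hybrid scoring description on this page.
WEIGHTS = {"cosine": 0.7, "recency": 0.2, "importance": 0.1}


def explain(score_detail):
    """Summarize how much each component contributed to the final score.

    Assumes `score_detail` is a dict keyed like WEIGHTS (hypothetical shape).
    """
    contributions = {name: WEIGHTS[name] * score_detail[name] for name in WEIGHTS}
    dominant = max(contributions, key=contributions.get)
    total = sum(contributions.values())
    parts = ", ".join(f"{name}={value:.3f}" for name, value in contributions.items())
    return f"final={total:.3f} ({parts}); dominant={dominant}"


# Made-up values purely for illustration.
sample = {"cosine": 0.82, "recency": 0.40, "importance": 0.9}
print(explain(sample))
```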
Free and Pro: we handle everything on our infrastructure. Enterprise: BYOD — connect your own Supabase instance. Your data never leaves your servers. GDPR and HIPAA use cases handled out of the box.
LangGraph. AutoGen. CrewAI. Custom agents. Raw REST API. Python SDK. If it can make an HTTP request, it works with MemLayer. No framework lock-in. Ever.
When your agent learns something, store it. MemLayer embeds it, deduplicates it, and makes it retrievable forever.
POST /memories
{
  "content": "User prefers concise responses",
  "user_id": "user_123",
  "agent_id": "support_bot"
}

Session starts. Before the user says a word, load what your agent already knows. Inject into the system prompt. Agent is already personalized.

GET /memories/context?user_id=user_123&agent_id=support_bot

User asks something. Search semantically. Hybrid scoring returns the most relevant, most recent, most important memory. Not just the most similar vector.

GET /memories/search?query=what+does+this+user+prefer

That's it. Three methods. Your agent has persistent, intelligent memory across every session.
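If you call the REST API directly, the query strings above need proper URL encoding. The following sketch builds the two GET URLs with Python's standard library. The base URL is the one used in the SDK example on this page. Passing user_id and agent_id to the search endpoint is an assumption based on the SDK's recall() arguments, and authentication headers are omitted because their format is not shown here.

```python
from urllib.parse import urlencode

# Base URL as it appears in the SDK example on this page.
BASE_URL = "https://api.memlayer.online"


def context_url(user_id, agent_id):
    """URL for loading a user's context at session start."""
    params = {"user_id": user_id, "agent_id": agent_id}
    return f"{BASE_URL}/memories/context?{urlencode(params)}"


def search_url(query, user_id=None, agent_id=None):
    """URL for a semantic search.

    Assumption: user_id and agent_id are optional filters on this endpoint.
    """
    params = {"query": query}
    if user_id is not None:
        params["user_id"] = user_id
    if agent_id is not None:
        params["agent_id"] = agent_id
    # urlencode escapes spaces as '+', matching the example request above.
    return f"{BASE_URL}/memories/search?{urlencode(params)}"


print(context_url("user_123", "support_bot"))
print(search_url("what does this user prefer"))
```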
pip install memlayer
from memlayer import MemlayerClient

client = MemlayerClient(
    api_key="memlayer_live_xxx",
    base_url="https://api.memlayer.online",
)

# Store what your agent learns
client.remember(
    "User prefers concise bullet points",
    user_id="user_123",
    agent_id="support_bot",
    memory_type="semantic",
    importance=0.8,
)

# Load context at session start
context = client.context(
    user_id="user_123",
    agent_id="support_bot",
)

# Search when user asks something
memories = client.recall(
    query="what are this user's preferences?",
    user_id="user_123",
    agent_id="support_bot",
)

No bloat. No configuration. Every endpoint does exactly one thing and does it perfectly.
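The session-start step is where loaded memories get injected into the system prompt. Here is a minimal sketch of that injection. It assumes the context call yields a list of dicts with a "content" key, which is a guess at the response shape, and build_system_prompt is a hypothetical helper rather than part of the SDK.

```python
def build_system_prompt(base_prompt, memories):
    """Append stored memories to a base system prompt as a bullet list.

    Assumes each memory is a dict with a "content" key (hypothetical shape).
    """
    if not memories:
        return base_prompt
    lines = [f"- {memory['content']}" for memory in memories]
    return base_prompt + "\n\nKnown about this user:\n" + "\n".join(lines)


# Stand-in for whatever the context call returns.
loaded = [
    {"content": "User prefers concise bullet points"},
    {"content": "User prefers concise responses"},
]
prompt = build_system_prompt("You are a support assistant.", loaded)
print(prompt)
```

Keeping this as a plain function makes it easy to cap how many memories go into the prompt, which matters if token cost is a concern.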
Most memory APIs require you to configure pipelines, choose embedding models, and tune retrieval parameters. MemLayer ships with production defaults — OpenAI embeddings, hybrid scoring, duplicate detection — all on by default.
We've already solved it. Here's what you'd spend building the same thing from scratch:
Build it yourself: ~3 days total · ~$2,400 + ongoing maintenance
With MemLayer: 5 minutes + $19/month
And that's before you hit the edge cases. Duplicate memories. Stale facts outranking fresh ones. Counts drifting out of sync after deletions. We've already solved all of it.
Every plan includes hybrid scoring, duplicate detection, TTL expiry, and semantic search. No features locked behind paywalls. Just higher limits.
Free: Perfect for building and testing your agent. No credit card. No time limit. No catch.
Pro: For agents in production serving real users. 50,000 memories handles hundreds of users with thousands of interactions each.
Enterprise: For companies with data sovereignty requirements. BYOD means your data never touches our servers. GDPR. HIPAA. SOC2-ready architecture.
Not sure which plan? Start free. You'll know when you need Pro — your users will tell you by coming back.
These aren't made-up pain points. They're what developers post on Reddit at 2am when their agent breaks in production.
"Memory is becoming the real bottleneck for AI agents. If code used to be the bottleneck, memory might be the new one."
"Nobody talks about what AI memory looks like after six months in production. Old preferences keep winning retrieval, sarcastic comments get stored as literal truth."
MemLayer was built because we saw these posts and knew the infrastructure to fix them already existed. We just built the API layer on top of it.
mem0 and LangMem make decisions about what to store — they extract memories from conversations automatically. MemLayer doesn't. You decide what gets stored. MemLayer's job is to store it reliably and retrieve the right thing at the right time. We're infrastructure, not intelligence. That boundary matters. Also: MemLayer shows you score_detail on every search result. No other memory API does that.
No. MemLayer handles pgvector, embeddings, indexing, and retrieval. You call an API. We do the rest.
Pure cosine similarity returns the most similar vector — not the most useful memory. A preference from 6 months ago can outscore an identical one from yesterday using pure similarity. Recency decay fixes that. Importance weighting lets you pin critical facts. Together they return what your agent actually needs.
Yes. Every search result includes score_detail — cosine, recency, importance, and final score. Most memory APIs are black boxes. MemLayer isn't.
Free and Pro plans: stored on our Supabase instance, isolated by your tenant ID. No other tenant can access your data. Enterprise BYOD: your Supabase instance, we run the engine. Your data never leaves your servers.
OpenAI text-embedding-3-small — 1536 dimensions, 99.9% uptime SLA, production-grade reliability. Not a free tier model with no uptime guarantee.
Yes. Any framework that can make an HTTP request works with MemLayer. We have a Python SDK (pip install memlayer-py) and a full REST API. Framework agnostic by design.
You get a clear 402 error: "Memory limit reached — upgrade to store more." No silent failures. No data loss. Upgrade and the new limit applies immediately.
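On the client side, it can help to turn that 402 into its own exception so your agent can prompt for an upgrade instead of treating it as a generic failure. This is a hedged sketch: only the 402 status and its meaning come from the text above, while the exception class and helper function are hypothetical names invented for the example.

```python
class MemoryLimitReached(Exception):
    """Raised when a store request is rejected because the plan limit is hit."""


def check_store_response(status_code, message=""):
    """Interpret the HTTP status of a store request.

    Returns True on success, raises MemoryLimitReached on 402,
    and RuntimeError on any other 4xx/5xx status.
    """
    if status_code == 402:
        raise MemoryLimitReached(message or "Memory limit reached")
    if status_code >= 400:
        raise RuntimeError(f"store failed with HTTP {status_code}: {message}")
    return True


try:
    check_store_response(402, "Memory limit reached, upgrade to store more.")
except MemoryLimitReached as exc:
    print(f"Upgrade needed: {exc}")
```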
Every session your agent starts from zero is a session where a user feels like a stranger. Persistent memory is no longer a nice-to-have. It's the baseline expectation for any AI product that wants users to come back.
Free forever on the starter plan. No credit card. Your first memory stored in under 5 minutes.