We're a YC W23 company building AI agents for engineering labs. Our customers run similar analyses repeatedly, yet the agent treated every session like a blank slate.
We looked at Mem0, Letta/MemGPT, and similar memory solutions. They all solve a different problem: storing facts from conversations — "user prefers Python," "user is vegetarian." That's key-value memory with semantic search. Useful, but not what we needed.
What we needed was something that learns user patterns implicitly from behavior over time. When a customer corrects a threshold from 85% to 80% three sessions in a row, the agent should just know that next time. When a team always re-runs with stricter filters, the system should pick up on that pattern.

So we built an internal API around a simple idea: user corrections are the highest-signal data. Instead of ingesting chat messages and hoping an LLM extracts something, we capture structured events — what the agent produced, what the user changed, what they accepted. A background job periodically runs an LLM pass to extract patterns and builds a confidence-weighted preference profile per user/team/org.
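To make that concrete, here's a minimal sketch of the event capture and the confidence update; every name is hypothetical and the real scoring logic is more involved than this stand-in:

```python
# Minimal sketch of structured correction events (all names hypothetical).
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class CorrectionEvent:
    """One observed interaction: what the agent produced vs. what the user did."""
    user_id: str
    team_id: str
    field: str              # e.g. "similarity_threshold"
    agent_value: Any        # what the agent proposed (e.g. 0.85)
    user_value: Any         # what the user changed it to (e.g. 0.80)
    accepted: bool          # True if the user kept the agent's output as-is
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class Preference:
    """A learned preference with a confidence score in [0, 1]."""
    field: str
    value: Any
    confidence: float
    evidence_count: int

def update_confidence(pref: Preference, event: CorrectionEvent) -> Preference:
    """Naive update: repeated identical corrections raise confidence,
    contradicting evidence decays it. A stand-in for the real scoring."""
    if event.user_value == pref.value:
        # Same correction seen again: move confidence toward 1.0.
        pref.confidence = min(1.0, pref.confidence + (1 - pref.confidence) * 0.3)
    else:
        # Contradicting evidence: decay, and flip the value once it's weak.
        pref.confidence *= 0.5
        if pref.confidence < 0.2:
            pref.value = event.user_value
            pref.confidence = 0.3
    pref.evidence_count += 1
    return pref
```

The point of the structured schema is that the background LLM pass only has to generalize across clean (agent_value, user_value) pairs instead of mining raw chat logs.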
Before each session, the agent fetches the profile, so it gets smarter over time (see the sketch after the list below). The gap as I see it:
Mem0 = memory storage + retrieval. Doesn't learn patterns.
Letta = self-editing agent memory. Closer, but no implicit learning from behavior.
Missing = a preference learning layer that watches how users interact with agents and builds an evolving model. Like a rec engine for agent personalization.
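And the pre-session step, reusing the Preference type from the sketch above; fetch_profile is a stand-in for our internal API call, not a real library:

```python
# Hypothetical pre-session step: fold high-confidence preferences into the
# agent's system prompt. Only preferences above min_conf are injected.
def build_preference_context(profile: list[Preference], min_conf: float = 0.7) -> str:
    lines = [
        f"- {p.field}: prefer {p.value!r} "
        f"(confidence {p.confidence:.2f}, {p.evidence_count} observations)"
        for p in profile
        if p.confidence >= min_conf
    ]
    if not lines:
        return ""
    return "Known user preferences, apply unless told otherwise:\n" + "\n".join(lines)

# Usage before each session (fetch_profile is assumed, not shown):
# profile = fetch_profile(user_id="u_123", team_id="t_456")
# system_prompt = base_prompt + "\n\n" + build_preference_context(profile)
```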
I built this for our domain but the approach is domain-agnostic. Curious if others are hitting the same wall with their agents. Happy to share the architecture, prompts, and confidence scoring approach in detail.