Integrations¶
Semvec keeps framework adapters in user space — their dependencies are heavy and opinionated, and the API they consume is small enough to wire up in a few lines. This page shows the recipes for the common integrations.
External dependencies¶
The integration examples on this page each pull in a third-party library that semvec does not bundle. Install per integration:
| Integration | Install command |
|---|---|
| LangChain | pip install langchain-core langchain-openai |
| deepagents | pip install deepagents |
| Postgres / pgvector | pip install "psycopg[binary]" pgvector |
| Neo4j | pip install neo4j |
If a code block below imports a library not listed here, install it explicitly — semvec will not pull it as a transitive dependency.
LangChain¶
Expose Semvec memory as a LangChain retriever by wrapping SemvecState.memory.get_relevant_memories:
```python
# Note: import Document from langchain_core — the table above only installs
# langchain-core/langchain-openai, so `langchain.schema` is not available.
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

from semvec import SemvecState


class SemvecRetriever(BaseRetriever):
    def __init__(self, state: SemvecState, embedder, top_k: int = 5):
        super().__init__()
        self._state = state
        self._embedder = embedder
        self._top_k = top_k

    def _get_relevant_documents(self, query: str, *, run_manager=None):
        vec = self._embedder.get_embedding(query)
        hits = self._state.memory.get_relevant_memories(vec, top_k=self._top_k)
        return [
            Document(page_content=m.text, metadata={"importance": m.importance})
            for m in hits
        ]
```
For tools, wrap each method you want the agent to call:
```python
from langchain_core.tools import tool


@tool
def semvec_record_note(text: str) -> str:
    """Fold a note into the semantic state."""
    # `state` and `embedder` come from your application setup.
    state.update(embedder.get_embedding(text), text)
    return "OK"
```
DeepAgents¶
DeepAgents middleware runs on every step to mutate context. Wrap SemvecStateSerializer to inject compressed context before each step and absorb the assistant reply afterwards:
```python
from deepagents import AgentMiddleware

from semvec import SemvecState
from semvec.token_reduction import SemvecStateSerializer


class SemvecMiddleware(AgentMiddleware):
    def __init__(self, state: SemvecState, embedder):
        self._state = state
        self._embedder = embedder
        self._serializer = SemvecStateSerializer()

    def before_step(self, context):
        # Inject compressed semantic context into the system prompt.
        query_vec = self._embedder.get_embedding(context.last_message)
        context.system += "\n\n" + self._serializer.serialize(
            self._state, query_embedding=query_vec
        )
        return context

    def after_step(self, context):
        # Absorb the completed exchange back into the state.
        text = context.last_message + "\n" + context.assistant_reply
        self._state.update(self._embedder.get_embedding(text), text[:500])
```
PostgreSQL persistence¶
Store full SemvecState.to_dict() snapshots in a JSONB column. Integrity checking is built-in.
```sql
CREATE TABLE semvec_states (
    session_id TEXT PRIMARY KEY,
    state      JSONB NOT NULL,
    updated_at TIMESTAMPTZ DEFAULT now()
);
```
```python
import json

import psycopg

from semvec import SemvecState


def save(conn, session_id: str, state: SemvecState):
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO semvec_states(session_id, state) VALUES (%s, %s)"
            " ON CONFLICT (session_id) DO UPDATE SET state = EXCLUDED.state,"
            " updated_at = now()",
            (session_id, json.dumps(state.to_dict())),
        )
    conn.commit()


def load(conn, session_id: str) -> SemvecState:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT state FROM semvec_states WHERE session_id = %s",
            (session_id,),
        )
        row = cur.fetchone()
        if row is None:
            raise KeyError(session_id)
        # psycopg returns jsonb as a parsed dict;
        # from_dict raises StateCorruptionError on checksum mismatch.
        return SemvecState.from_dict(row[0])
```
The from_dict call verifies the embedded checksum — tampered rows surface as StateCorruptionError rather than silently corrupt reads.
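The exact checksum scheme is internal to semvec, but the shape of the check can be sketched in plain Python. In this illustrative sketch, the `payload`/`checksum` field names, the sha256 digest, and the `StateCorruptionError` stand-in are assumptions for demonstration, not semvec's actual snapshot format:

```python
import hashlib
import json


class StateCorruptionError(ValueError):
    """Illustrative stand-in for semvec's corruption error."""


def checksum(payload: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so the digest is
    # independent of key order.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def verify(snapshot: dict) -> dict:
    # Recompute the digest and compare with the stored one.
    if checksum(snapshot["payload"]) != snapshot["checksum"]:
        raise StateCorruptionError("checksum mismatch")
    return snapshot["payload"]


payload = {"memories": ["note"], "version": 1}
snapshot = {"payload": payload, "checksum": checksum(payload)}
verify(snapshot)  # intact row round-trips

snapshot["payload"]["memories"].append("tampered")
try:
    verify(snapshot)
except StateCorruptionError:
    print("tampered row detected")
```

The point of verifying at load time rather than at write time is that corruption introduced anywhere in the storage path (a manual UPDATE, a bad migration) is caught before the state is trusted.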
Neo4j property graph¶
Use LiteralCache entities as graph nodes. Typical schema:

- Nodes: (:CodeEntity {kind, name, file, signature, semantic_hash}), (:Session {id, created_at}), (:CodePointer {file, signature, importance})
- Relationships: (session)-[:TOUCHED]->(entity), (entity)-[:CALLS]->(entity)
```python
from neo4j import GraphDatabase

from semvec import LiteralCache  # re-exported from semvec._core (top-level import works)

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "..."))


def sync_cache(cache: LiteralCache, session_id: str):
    # driver.session() yields a Session; each run() is auto-committed.
    with driver.session() as session:
        for entity in cache.all_entities():  # see the LiteralCache API
            session.run(
                "MERGE (c:CodeEntity {semantic_hash: $hash}) "
                "SET c.name = $name, c.kind = $kind, c.file = $file "
                "MERGE (s:Session {id: $session_id}) "
                "MERGE (s)-[:TOUCHED]->(c)",
                hash=entity.semantic_hash,
                name=entity.name,
                kind=entity.kind.value,
                file=entity.file_path,
                session_id=session_id,
            )
```
Mem0 head-to-head on LOCOMO¶
The [mem0] extra pulls in the Mem0 SDK so you can run side-by-side
LOCOMO comparisons against semvec without maintaining two separate harnesses.
The LOCOMO bench format (benchmarks/data/locomo10.json) is plain JSON
and easy to feed into a Mem0 runner of your own. The semvec runner
(benchmarks/run_locomo.py) reports per-QA pred/gold/f1 in the
same shape you can write from a Mem0 loop, so both result files plug
into the official LOCOMO eval logic.
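For reference, the per-QA f1 in those result files is the standard token-overlap F1 used by LOCOMO-style evals. A minimal sketch — whitespace tokenization and lowercasing are simplifying assumptions here, and the `qa_id` row shape is illustrative (the official eval also normalizes punctuation and articles):

```python
from collections import Counter


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


# One result row in the shape both runners can emit.
row = {
    "qa_id": "conv1-q3",
    "pred": "in May 2023",
    "gold": "May 2023",
    "f1": token_f1("in May 2023", "May 2023"),
}
print(row["f1"])  # 0.8
```

As long as the Mem0 loop writes rows with the same pred/gold/f1 keys, both result files can be aggregated and judged identically.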
For the judge prompt (cross-paper comparability) see
benchmarks/run_locomo_judge.py — it re-scores any existing result
file with the same LLM-as-Judge prompt the Mem0 paper uses.
See Benchmarks overview and Semvec vs. mem0 for the published head-to-head numbers.