Integrations¶
Semvec keeps framework adapters in user space — their dependencies are heavy and opinionated, and the API they consume is small enough to wire up in a few lines. This page shows the recipes for the common integrations.
External dependencies¶
The integration examples on this page each pull in a third-party library that semvec does not bundle. Install per integration:
| Integration | Install command |
|---|---|
| LangChain | pip install langchain-core langchain-openai |
| deepagents | pip install deepagents |
| Postgres / pgvector | pip install "psycopg[binary]" pgvector |
| Neo4j | pip install neo4j |
If a code block below imports a library not listed here, install it explicitly — semvec will not pull it as a transitive dependency.
LangChain¶
Expose Semvec memory as a LangChain retriever by wrapping SemvecState.memory.get_relevant_memories:
```python
# Note: import Document from langchain_core — the table above only installs
# langchain-core/langchain-openai, so `langchain.schema` is not available.
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

from semvec import SemvecState


class SemvecRetriever(BaseRetriever):
    def __init__(self, state: SemvecState, embedder, top_k: int = 5):
        super().__init__()
        self._state = state
        self._embedder = embedder
        self._top_k = top_k

    def _get_relevant_documents(self, query: str, *, run_manager=None):
        vec = self._embedder.get_embedding(query)
        hits = self._state.memory.get_relevant_memories(vec, top_k=self._top_k)
        return [
            Document(page_content=m.text, metadata={"importance": m.importance})
            for m in hits
        ]
```
For tools, wrap each method you want the agent to call:
```python
from langchain_core.tools import tool


@tool
def semvec_record_note(text: str) -> str:
    """Fold a note into the semantic state."""
    # `state` and `embedder` come from your application setup.
    state.update(embedder.get_embedding(text), text)
    return "OK"
```
DeepAgents¶
DeepAgents middleware runs on every step to mutate context. Wrap SemvecStateSerializer to inject compressed context before each step and absorb the assistant reply afterwards:
```python
from deepagents import AgentMiddleware

from semvec import SemvecState
from semvec.token_reduction import SemvecStateSerializer


class SemvecMiddleware(AgentMiddleware):
    def __init__(self, state: SemvecState, embedder):
        self._state = state
        self._embedder = embedder
        self._serializer = SemvecStateSerializer()

    def before_step(self, context):
        # Inject compressed semantic context into the system prompt.
        query_vec = self._embedder.get_embedding(context.last_message)
        context.system += "\n\n" + self._serializer.serialize(
            self._state, query_embedding=query_vec
        )
        return context

    def after_step(self, context):
        # Absorb the completed exchange back into the state.
        text = context.last_message + "\n" + context.assistant_reply
        self._state.update(self._embedder.get_embedding(text), text[:500])
```
PostgreSQL persistence¶
Store full SemvecState.to_dict() snapshots in a JSONB column. Integrity checking is built-in.
```sql
CREATE TABLE semvec_states (
    session_id TEXT PRIMARY KEY,
    state      JSONB NOT NULL,
    updated_at TIMESTAMPTZ DEFAULT now()
);
```
```python
import json

import psycopg

from semvec import SemvecState


def save(conn, session_id: str, state: SemvecState):
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO semvec_states(session_id, state) VALUES (%s, %s)"
            " ON CONFLICT (session_id) DO UPDATE SET state = EXCLUDED.state,"
            " updated_at = now()",
            (session_id, json.dumps(state.to_dict())),
        )
    conn.commit()


def load(conn, session_id: str) -> SemvecState:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT state FROM semvec_states WHERE session_id = %s",
            (session_id,),
        )
        row = cur.fetchone()
        if row is None:
            raise KeyError(session_id)
        # psycopg returns jsonb as a parsed dict;
        # from_dict raises StateCorruptionError on checksum mismatch.
        return SemvecState.from_dict(row[0])
```
The from_dict call verifies the embedded checksum — tampered rows surface as StateCorruptionError rather than silently corrupt reads.
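The exact checksum scheme is internal to semvec, but the shape of the check can be sketched in plain Python. In this illustrative sketch, the `payload`/`checksum` field names, the sha256 digest, and the `StateCorruptionError` stand-in are assumptions for demonstration, not semvec's actual snapshot format:

```python
import hashlib
import json


class StateCorruptionError(ValueError):
    """Illustrative stand-in for semvec's corruption error."""


def checksum(payload: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so the digest is
    # independent of key order.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def verify(snapshot: dict) -> dict:
    # Recompute the digest and compare with the stored one.
    if checksum(snapshot["payload"]) != snapshot["checksum"]:
        raise StateCorruptionError("checksum mismatch")
    return snapshot["payload"]


payload = {"memories": ["note"], "version": 1}
snapshot = {"payload": payload, "checksum": checksum(payload)}
verify(snapshot)  # intact row round-trips

snapshot["payload"]["memories"].append("tampered")
try:
    verify(snapshot)
except StateCorruptionError:
    print("tampered row detected")
```

The point of verifying at load time rather than at write time is that corruption introduced anywhere in the storage path (a manual UPDATE, a bad migration) is caught before the state is trusted.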
Neo4j property graph¶
Use LiteralCache entities as graph nodes. Typical schema:

- Nodes: (:CodeEntity {kind, name, file, signature, semantic_hash}), (:Session {id, created_at}), (:CodePointer {file, signature, importance})
- Relationships: (session)-[:TOUCHED]->(entity), (entity)-[:CALLS]->(entity)
```python
from neo4j import GraphDatabase

from semvec import LiteralCache  # re-exported from semvec._core (top-level import works)

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "..."))


def sync_cache(cache: LiteralCache, session_id: str):
    # driver.session() yields a Session; each run() is auto-committed.
    with driver.session() as session:
        for entity in cache.all_entities():  # see the LiteralCache API
            session.run(
                "MERGE (c:CodeEntity {semantic_hash: $hash}) "
                "SET c.name = $name, c.kind = $kind, c.file = $file "
                "MERGE (s:Session {id: $session_id}) "
                "MERGE (s)-[:TOUCHED]->(c)",
                hash=entity.semantic_hash,
                name=entity.name,
                kind=entity.kind.value,
                file=entity.file_path,
                session_id=session_id,
            )
```
Mem0 head-to-head on LOCOMO¶
The [mem0] extra pulls in the Mem0 SDK so you can run side-by-side
LOCOMO comparisons against semvec without maintaining two separate harnesses.
The LOCOMO bench format (benchmarks/data/locomo10.json) is plain JSON
and easy to feed into a Mem0 runner of your own. The semvec runner
(benchmarks/run_locomo.py) reports per-QA pred/gold/f1 in the
same shape you can write from a Mem0 loop, so both result files plug
into the official LOCOMO eval logic.
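For reference, the per-QA f1 in those result files is the standard token-overlap F1 used by LOCOMO-style evals. A minimal sketch — whitespace tokenization and lowercasing are simplifying assumptions here, and the `qa_id` row shape is illustrative (the official eval also normalizes punctuation and articles):

```python
from collections import Counter


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


# One result row in the shape both runners can emit.
row = {
    "qa_id": "conv1-q3",
    "pred": "in May 2023",
    "gold": "May 2023",
    "f1": token_f1("in May 2023", "May 2023"),
}
print(row["f1"])  # 0.8
```

As long as the Mem0 loop writes rows with the same pred/gold/f1 keys, both result files can be aggregated and judged identically.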
For the judge prompt (cross-paper comparability) see
benchmarks/run_locomo_judge.py — it re-scores any existing result
file with the same LLM-as-Judge prompt the Mem0 paper uses.
See Benchmarks overview and Semvec vs. mem0 for the published head-to-head numbers.