Cortex (overview)

semvec.cortex is the multi-agent coordination layer. Multiple agents — analyst, planner, critic, per-tenant agents, per-task agents — share an aggregated view, exchange checksummed state vectors, and vote on proposals through a five-level consensus engine. There are three usage paths; the right one depends on how many agents you run, where they live, and whether anyone outside your process needs to talk to them.

Pick a path

You want to…	Path	Pulls in
Coordinate 2–10 agents inside one Python process	`SemvecAgentNetwork` (in-process)	`[cortex]` (marker — primitives are always available)
Plug a custom persistent store under the cortex (e.g. async fetch from Postgres / Redis / pgvector)	`SemvecCortexService` with `pss_store=`	`[cortex]` plus your store
Expose multi-agent coordination across machines, services, or tenants — clusters, regions, observers, drift events, trust scores	REST API: `/v1/cluster/`, `/v1/region/`, `/v1/observer/`, `/v1/network/`	`[api]` (FastAPI, JWT)

The REST path covers a much larger surface than the in-process API — it gates everything behind Ed25519 JWT auth, tracks ownership per license subject, persists session / cluster / region metadata in SQLite or Postgres, and adds machinery (drift bus, trust scores, anomaly detection) that does not exist in the in-process API. Treat it as the production surface when more than one process is involved. → Detailed walk-through: REST-hosted Cortex guide.

Path 1 — `SemvecAgentNetwork` (in-process)

Lightweight container for several SemvecAgent objects, aggregated into a single SemvecCortexObserver. The right shape when an analyst, planner, and critic all live in the same process and just need to share state and exchange feedback.
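To make "aggregated" concrete: an aggregation strategy collapses N per-agent state vectors into one global vector. The sketch below is a plain-Python illustration of the two general techniques that the strategy names suggest: a weighted mean, and a softmax-weighted mean driven by similarity to a query vector. The function names, the `query` parameter, and the scaled dot-product scoring are assumptions made for illustration; they are not semvec's implementation.

```python
import math

def weighted_average(states, weights):
    """Weighted mean of equal-length vectors; weights are normalised by their sum."""
    total = sum(weights)
    dim = len(states[0])
    return [sum(w * s[i] for s, w in zip(states, weights)) / total for i in range(dim)]

def attention_aggregate(states, query):
    """Softmax-weighted mean, scoring each state by dot(state, query) / sqrt(dim)."""
    dim = len(query)
    scores = [sum(a * b for a, b in zip(s, query)) / math.sqrt(dim) for s in states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scores]
    return weighted_average(states, exps)

# Three toy agent states in 2 dimensions (hypothetical values)
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform = weighted_average(states, [1, 1, 1])
attended = attention_aggregate(states, query=[1.0, 0.0])
```

With equal weights every agent contributes the same amount; with attention, states that align with the query pull the global vector towards themselves.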

Snippet (not a complete program): network-managed agents need an embedder. For a runnable single-agent flow, see api-reference/cortex.md.

from semvec.cortex import SemvecAgentNetwork, AttentionAggregation

network = SemvecAgentNetwork(
    aggregation_strategy=AttentionAggregation(dimension=768),
    enable_feedback=True,
    feedback_strength=0.3,
    max_instances=10,
    dimension=768,
)
network.add_local_instance("analyst")
network.add_local_instance("planner")
network.add_local_instance("critic")

network.process_input("analyst", "quarterly revenue is up 23%")
network.process_input("planner", "we should redirect Q4 spend to retention")

state = network.get_network_state()
print(f"active agents: {state['active_instances']}/{state['total_instances']}")

# Pull per-agent feedback the next turn can blend into the embedding
feedback = network.get_feedback_for_agent("analyst")
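One way to picture what "blending feedback into the embedding" could look like is a linear interpolation towards the feedback vector, followed by renormalisation. This is a sketch under that assumption only: the `blend` helper is hypothetical, and the formula semvec actually applies for `feedback_strength` is not stated in this guide.

```python
import math

def blend(embedding, feedback, strength=0.3):
    """Interpolate towards the feedback vector, then L2-normalise the result."""
    mixed = [(1.0 - strength) * e + strength * f for e, f in zip(embedding, feedback)]
    norm = math.sqrt(sum(x * x for x in mixed))
    return [x / norm for x in mixed] if norm else mixed

# Hypothetical 2-d vectors: the agent's own embedding and the network feedback
blended = blend([1.0, 0.0], [0.0, 1.0], strength=0.3)
unchanged = blend([1.0, 0.0], [0.0, 1.0], strength=0.0)
```

A strength of 0 leaves the embedding untouched, and larger values move the next turn further towards the shared view.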

Aggregation strategies: WeightedAverageAggregation, AttentionAggregation. The ConsensusEngine adds proposal voting at five levels (SIMPLE_MAJORITY, QUALIFIED_MAJORITY, UNANIMOUS, WEIGHTED_VOTE, ADAPTIVE_THRESHOLD); quorum is measured against the registered voter pool, not just votes-cast-so-far. StateVectorPacket round-trips bit-exactly via serialize()/deserialize() and verify_integrity() confirms byte equality.
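The point about quorum is easiest to see with numbers. The following sketch, which is not the `ConsensusEngine` code, evaluates a proposal against the full registered voter pool for three of the five levels. The `Level` enum, the `decide` helper, and the two-thirds threshold for a qualified majority are assumptions for illustration; the weighted and adaptive levels are left out because their rules are not described here.

```python
from enum import Enum
from fractions import Fraction

class Level(Enum):
    SIMPLE_MAJORITY = Fraction(1, 2)     # strictly more than half of the pool
    QUALIFIED_MAJORITY = Fraction(2, 3)  # assumed threshold: at least two thirds
    UNANIMOUS = Fraction(1, 1)           # every registered voter says yes

def meets(level, yes, registered):
    share = Fraction(yes, registered)
    if level is Level.SIMPLE_MAJORITY:
        return share > level.value
    return share >= level.value

def decide(level, registered, yes, no):
    """Return 'accepted', 'rejected', or 'pending' relative to the registered pool."""
    remaining = registered - yes - no
    if meets(level, yes, registered):
        return "accepted"
    if not meets(level, yes + remaining, registered):
        return "rejected"  # even unanimous remaining votes cannot reach the threshold
    return "pending"

# 2 yes votes out of 2 cast, but 5 voters are registered: not yet a majority
early = decide(Level.SIMPLE_MAJORITY, registered=5, yes=2, no=0)
later = decide(Level.SIMPLE_MAJORITY, registered=5, yes=3, no=0)
blocked = decide(Level.UNANIMOUS, registered=3, yes=2, no=1)
```

Counting against the registered pool means an early burst of agreeing votes cannot finalise a proposal before enough of the pool has weighed in.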

→ API: semvec.cortex reference.

Path 2 — `SemvecCortexService` with a custom store

SemvecCortexService is the service-shaped facade — it accepts an async pss_store and aggregates whatever active states the store exposes. Use it when your agents are persisted somewhere other than process memory (Postgres, Redis, pgvector, your own session DB) and you want the cortex to reflect all active sessions, not just those registered locally.

from semvec import SemvecState
from semvec.cortex import SemvecCortexService

class PostgresStore:
    async def list_active_states(self):
        """Async iterable of (agent_id, SemvecState) tuples."""
        # fetch_active_sessions() stands in for your own async query helper
        async for row in fetch_active_sessions():
            yield row.agent_id, SemvecState.from_dict(row.snapshot)

svc = SemvecCortexService(
    pss_store=PostgresStore(),
    aggregation="attention",   # or "weighted"
    dimension=768,
)

# Run inside an async function (or any context with a running event loop)
result = await svc.update_global_state()
# result keys: global_state, global_coherence, network_resonance, active_instances

feedback = svc.get_feedback_for_agent("session_42")
# Pass into agent.process_input(text, global_feedback=feedback)

The service runs without a store too — when pss_store=None, it falls back to the in-memory cache populated via register_agent() + process_input(). Pick this path when your control plane is async and the cortex needs to see across process boundaries inside a single service.
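Before wiring a real database in, it can help to exercise the store contract on its own. The sketch below assumes only what the example above shows, namely that the store exposes an async `list_active_states()` yielding `(agent_id, state)` pairs. The `InMemoryStore` class and the plain-list "states" are hypothetical stand-ins; no semvec objects are involved.

```python
import asyncio

class InMemoryStore:
    """Hypothetical stand-in for a Postgres- or Redis-backed store."""

    def __init__(self, sessions):
        self._sessions = sessions  # {agent_id: (state_vector, is_active)}

    async def list_active_states(self):
        for agent_id, (vector, is_active) in self._sessions.items():
            await asyncio.sleep(0)  # yield control, as a real DB fetch would
            if is_active:
                yield agent_id, vector

async def collect(store):
    # Drain the async generator the same way a consumer of the store would
    return [pair async for pair in store.list_active_states()]

store = InMemoryStore({
    "session_41": ([0.0, 1.0], False),
    "session_42": ([1.0, 0.0], True),
})
active = asyncio.run(collect(store))
```

Only sessions the store reports as active are yielded, which is the set the cortex would then aggregate.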

→ API: SemvecCortexService reference.

Path 3 — REST API for multi-tenant Cortex

Switch to the REST surface when clusters span machines, when several teams each need their own region, or when you want drift events fanned out automatically and a global observer watching for anomalies. Every primitive in the in-process API has a REST counterpart, and several capabilities exist only at the REST layer:

In-process	REST	What's added at REST
`SemvecAgentNetwork`	`/v1/cluster/`	JWT-gated ownership, persistent membership, weighted-average or attention aggregation per cluster
`ConsensusEngine`	`/v1/region/`	Region groups multiple clusters; consensus realignment fires on aggregated drift events
(none)	`/v1/observer/`	Cross-cluster anomaly detection (`cross_cluster_convergence`, `systemic_drift`, `cluster_divergence`)
`StateVectorPacket`	`/v1/network/transfer`	Per-tenant user partitions, trust-score-weighted consensus, network-wide consensus proposals
(none)	`/v1/cluster/{id}/feedback`	One call blends the cluster aggregate back into all member sessions

Auth: Authorization: Bearer <jwt> or X-API-Key: <jwt>. Ownership is per license subject — the server never leaks resource existence across tenants (404 on owned-by-another vs 200 on owned-by-me).
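The "never leaks resource existence" rule boils down to giving the same answer for a resource that does not exist and one that belongs to another license subject. A minimal sketch of that lookup pattern, using a hypothetical in-memory registry and helper name rather than the server's actual code:

```python
# Hypothetical registry: resource id -> (owning license subject, body)
RESOURCES = {"cluster-a": ("tenant-1", {"members": 3})}

def get_resource(resource_id, subject):
    """Return (status, body); 'missing' and 'owned by someone else' look identical."""
    entry = RESOURCES.get(resource_id)
    if entry is None or entry[0] != subject:
        return 404, None
    return 200, entry[1]

own = get_resource("cluster-a", "tenant-1")
foreign = get_resource("cluster-a", "tenant-2")
missing = get_resource("cluster-z", "tenant-2")
```

Because the foreign and missing cases are indistinguishable, a caller cannot probe for other tenants' cluster ids.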

pip install "semvec[api]"
export SEMVEC_LICENSE_KEY="eyJhbGciOiJFZERTQSI..."
semvec serve --host 0.0.0.0 --port 8080

→ Full walk-through with curl + httpx examples for every endpoint group: REST-hosted Cortex guide.

Cross-frontend dedup (shared cluster session)

Every frontend in a cluster writes into one backing session (cluster_id == session_id). When frontend A stores a fact and frontend B later stores an overlapping one on the same cluster, B's response carries a dedup_signal flagging the overlap: the {is_update, max_sim, matched_id} triple (dedup-signal guide), computed across frontends instead of within a single session.

Make a fact visible to other frontends. A fact is only visible once it is written into the shared session:

# Frontend A stores a fact (lands in the shared session pool)
curl -X POST localhost:8080/v1/cluster/$CID/store \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"message": "what is the SLA?", "response": "the SLA is 99.95% uptime"}'

# Frontend B paraphrases it on the same cluster; dedup_signal flags the overlap
curl -X POST localhost:8080/v1/cluster/$CID/store \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"message": "uptime target?", "response": "we guarantee 99.95% availability"}'
# -> {"dedup_signal": {"is_update": true, "max_sim": 0.97, "matched_id": "019..."}, ...}

The matched_id is a durable handle to the matched fact: it stays the same identifier across the shared session's to_dict()/from_dict() and across a snapshot reload, so a frontend can store it and correlate later (dedup-signal guide).
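For frontends written in Python, the store call shown earlier in curl form can be assembled with the standard library alone. The sketch below only builds the request and does not send it. The `build_store_request` helper and the example cluster id are hypothetical, and the JSON `Content-Type` header is an assumption about what the server expects.

```python
import json
import urllib.request

def build_store_request(base_url, cluster_id, jwt, message, response, dedup_threshold=None):
    """Build (but do not send) a POST to the cluster store endpoint."""
    payload = {"message": message, "response": response}
    if dedup_threshold is not None:
        payload["dedup_threshold"] = dedup_threshold  # optional per-call override
    return urllib.request.Request(
        url=f"{base_url}/v1/cluster/{cluster_id}/store",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",  # assumed; the body is JSON
        },
        method="POST",
    )

req = build_store_request(
    "http://localhost:8080", "my-cluster", "<jwt>",
    "uptime target?", "we guarantee 99.95% availability",
)
# To send it: urllib.request.urlopen(req), then json-decode the response body
```

Keeping request construction separate from sending makes the payload easy to unit-test without a running server.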

A cluster run that carries a response (POST /v1/cluster/{id}/run with response=) writes into the same pool. A run without a response only reads.

Per-call threshold. Pass dedup_threshold (a cosine value in [-1, 1]) on the store call to override the config default (dedup_update_threshold, typically 0.85) for that one is_update decision. Storage is append-only regardless — the override only flips the informational signal:

curl -X POST localhost:8080/v1/cluster/$CID/store -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"message": "...", "response": "...", "dedup_threshold": 0.92}'

Pin a cluster against realignment. Create a cluster with {"drift_exempt": true} to shield its shared session from regional realignment — useful when the cluster is your durable dedup index. A pinned cluster cannot be added to a region (the request is refused with 409).

Caveats (read before relying on it)

Cross-FRONTEND, not cross-RAG. Only facts written through a Semvec frontend (cluster store / run-with-response) are visible. Batch jobs or direct-RAG writes that bypass Semvec never enter the shared session and are invisible to the dedup signal.
No contradiction detection. is_update == true means the embeddings look alike — not that the two facts agree. Detecting contradictions is out of scope; the signal is a similarity hint, nothing more.
Storage stays append-only. The signal never suppresses a write. Your frontend decides what to do (skip a re-store, merge into RAG, just log it).
Durability needs the gate. A cross-frontend index that survives a worker restart requires SEMVEC_STATE_PERSIST plus a backing store (production hardening → state persistence). Without it the shared session is in-memory and per-worker — two workers see two separate pools, and dedup only works within each.
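The behaviour described in this section (a cosine comparison against everything already in the shared pool, a per-call threshold, and a write that always happens) can be modelled in a few lines. The `SharedPool` class below is a hypothetical toy, not semvec's session: it skips embedding and works on raw vectors, and returning `matched_id` only when `is_update` is true is an assumption.

```python
import math
import uuid

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class SharedPool:
    def __init__(self, default_threshold=0.85):
        self.default_threshold = default_threshold
        self.facts = []  # append-only list of (fact_id, vector)

    def store(self, vector, dedup_threshold=None):
        threshold = self.default_threshold if dedup_threshold is None else dedup_threshold
        best_id, best_sim = None, None
        for fact_id, stored in self.facts:
            sim = cosine(vector, stored)
            if best_sim is None or sim > best_sim:
                best_id, best_sim = fact_id, sim
        self.facts.append((str(uuid.uuid4()), vector))  # the write is never suppressed
        is_update = best_sim is not None and best_sim >= threshold
        # Assumption: matched_id is only reported when the threshold is met
        return {"is_update": is_update, "max_sim": best_sim,
                "matched_id": best_id if is_update else None}

pool = SharedPool()
first = pool.store([1.0, 0.0])                            # frontend A, empty pool
strict = pool.store([0.9, 0.436], dedup_threshold=0.92)   # similarity about 0.90
loose = pool.store([0.9, -0.436])                         # same similarity, default 0.85
```

The two near-duplicates have the same similarity to the first fact; only the threshold decides which one is flagged, and all three writes land in the pool either way. What to do with a flagged write (skip, merge, log) remains the frontend's call.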

In-process equivalent

Cross-frontend dedup is not REST-only. The REST cluster shared session is just a managed wrapper over a single shared SemvecSession — you get the same behaviour in-process by holding one session and feeding every frontend's turns through it:

from semvec import SemvecSession, SemvecState, SemvecConfig

# ONE shared session is the "cluster" — every frontend writes into it.
cfg = SemvecConfig(dimension=768)
shared = SemvecSession(SemvecState(config=cfg), my_embedder, cfg)

# Frontend A stores a fact (mirrors POST /v1/cluster/{id}/store).
res_a = shared.store_qa("what is the SLA?", "the SLA is 99.95% uptime")

# Frontend B paraphrases it on the SAME shared session.
res_b = shared.store_qa("uptime target?", "we guarantee 99.95% availability")
print(res_b["dedup_signal"])
# {'is_update': True, 'max_sim': 0.97, 'matched_id': '019...'}

Read the signal. store_qa(...) returns the update metrics with a dedup_signal ({is_update, max_sim, matched_id}); a full turn via run_sync(...) / run(...) returns a TurnResult whose .dedup_signal carries the same triple (None when the turn didn't store). See the DedupSignal guide.
Per-call threshold. Override the config default for a single is_update decision on the store path: shared.store_qa(..., dedup_threshold=0.92) (also on update_state(...) and the *_async variants). It is not on run() — mirroring REST, only the store path takes the override.
Durability. Persist the shared session's underlying state with shared.state.to_bytes(compress=True) and reload via SemvecState.from_bytes(...) (rebuild the SemvecSession around the restored state); the matched_id survives the round-trip. See Persisting state in-process.

The mapping is direct: POST /v1/cluster/{id}/store ↔ store_qa, POST /v1/cluster/{id}/run (with a response) ↔ run_sync, and the cluster's backing session ↔ this one shared SemvecSession.

Common building blocks (every path)

Concept	What it does	Where it lives
`SemvecAgent`	Per-agent state with embedder + `process_input(text)`	API: SemvecAgent
`SemvecCortexObserver`	Aggregator turning N agent states into one global state	API: SemvecCortexObserver
Aggregation strategies	`WeightedAverageAggregation`, `AttentionAggregation`	API: aggregation
`ConsensusEngine` + `ConsensusLevel`	Proposal voting (5 levels), quorum-aware finalisation	API: ConsensusEngine
`StateVectorPacket` + `TransferType`	Inter-agent state transfer with checksummed integrity	API: StateVectorPacket
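As an illustration of what "checksummed integrity" buys you, here is a generic sketch of a bit-exact, checksum-protected round trip for a float vector. The layout (little-endian float64 payload followed by a SHA-256 digest) and the function names are assumptions chosen for the example; this is not the `StateVectorPacket` wire format.

```python
import hashlib
import struct

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes

def serialize(vector):
    """Pack float64 values little-endian and append a SHA-256 digest of the payload."""
    payload = struct.pack(f"<{len(vector)}d", *vector)
    return payload + hashlib.sha256(payload).digest()

def deserialize(blob):
    """Verify the digest, then unpack; raise ValueError if the bytes were altered."""
    payload, digest = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch")
    return list(struct.unpack(f"<{len(payload) // 8}d", payload))

packet = serialize([0.1, -2.5, 1e-300])
restored = deserialize(packet)
```

Because Python floats are already 64-bit doubles, packing and unpacking them this way reproduces every value exactly, and any flipped byte in transit is caught by the digest check.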

When to choose which

Two analysts on one developer's laptop → SemvecAgentNetwork in-process. Keep it simple.
One service hosting a cortex on top of an existing session store → SemvecCortexService with the store you already have.
Production deployment with several services / tenants / regions → REST API. Drift events, observer anomalies, ownership boundaries, trust scores, and cluster realignment only exist at the REST layer.

Where to next

REST-hosted Cortex guide — the deep dive on clusters, regions, observers, network endpoints.
semvec.cortex API reference — every class and method.
REST API reference — endpoint catalogue.
Coding (overview) — sister-guide for semvec.coding.