Core API (`semvec`)¶

The semantic-state engine. All symbols listed here are importable directly from semvec.

`SemvecState`¶

The persistent semantic state.

from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(dimension=384))

Constructor¶

Parameter	Type	Default	Description
`config`	`SemvecConfig | None`	`SemvecConfig()`	Configuration bundle (see below).

`update(input_embedding, text, *, meta=None) -> dict`¶

Fold a single (embedding, text) pair into the state. meta is an optional dict attached to the resulting memory entry.

Returns a metric dict:

Key	Type	Meaning
`similarity`	`float`	Cosine similarity between input and current state, pre-update.
`beta`	`float`	Adaptive blending coefficient for this turn.
`pattern_strength`	`float`	How strongly retrieved memories pulled the state.
`fsm`	`float`	Stability score in `[0, 1]` (high = converged, low = oscillating).
`phase`	`str`	Detected phase (`initialization` / `exploration` / `convergence` / `resonance` / `stability` / `instability`).
`norm`	`float`	L2 norm of the post-update state vector.
`topic_switch`	`float`	Signalled topic-switch magnitude (0 = none).
`novelty_score`	`float`	How semantically novel this input was.
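The metric dict can feed simple application-side decisions. The sketch below works on a plain dict with the keys from the table above, so it runs without the engine. The `summarize_turn` helper, its `fsm_gate` threshold, and the sample values are illustrative assumptions, not part of semvec.

```python
from typing import Any, Mapping


def summarize_turn(metrics: Mapping[str, Any], fsm_gate: float = 0.7) -> dict:
    """Reduce an update() metric dict to a few application-level flags.

    `fsm_gate` is a hypothetical application threshold, not a semvec default.
    Missing keys fall back to conservative values instead of raising.
    """
    fsm = float(metrics.get("fsm", 0.0))
    topic_switch = float(metrics.get("topic_switch", 0.0))
    return {
        "phase": metrics.get("phase", "unknown"),
        "stable": fsm > fsm_gate,              # high fsm = converged
        "topic_switched": topic_switch > 0.0,  # 0 means no switch signalled
    }


# Hand-written sample shaped like the table above (values are made up).
sample = {
    "similarity": 0.82,
    "beta": 0.12,
    "pattern_strength": 0.4,
    "fsm": 0.91,
    "phase": "convergence",
    "norm": 1.18,
    "topic_switch": 0.0,
    "novelty_score": 0.07,
}
summary = summarize_turn(sample)
```

In an application, the dict returned by `state.update(...)` would take the place of `sample`.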

Other methods¶

Method	Purpose
`add_anchor(embedding)`	Register a drift anchor — biases retrieval toward the anchor's domain.
`add_resonance_trigger(trigger)`	Register a pre-built `ResonanceTrigger(keyword=..., embedding=..., threshold=0.7, weight=1.0)` instance, for example one you constructed manually or round-tripped through a snapshot. Boosts memories on keyword or embedding match during retrieval.
`to_dict(*, include_memory_text=True, include_literal_cache_text=True, include_adaptive_params=True) -> dict`	Checksummed, JSON-safe full-state snapshot. The three keyword-only privacy toggles are independent; see "Snapshot redaction" below.
`from_dict(data) -> SemvecState`	Restore from a `to_dict()` snapshot. Raises `StateCorruptionError` on checksum mismatch. Tolerates pre-redaction snapshots that lack the optional sections.
`to_bytes(compress=True, *, include_memory_text=True, include_literal_cache_text=True, include_adaptive_params=True) -> bytes`	Compact binary checkpoint with magic header + corruption check. Same redaction kwargs as `to_dict()`. `compress=False` skips gzip for hot-path persistence. (provided by the Rust core; not surfaced in Python type stubs)
`from_bytes(blob) -> SemvecState`	Restore from a `to_bytes()` blob (auto-detects compressed vs uncompressed via the version byte). (provided by the Rust core; not surfaced in Python type stubs)
`set_retrieval_projection_weights(matrix)`	Inject a custom retrieval projection matrix (advanced; pins retrieval scoring across replays).
`get_retrieval_projection_weights() -> list[list[float]]`	Snapshot the current projection matrix. Useful for parity tests; treat the contents as opaque.
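One way to persist a state with `to_bytes()` / `from_bytes()` is a pair of small file helpers. This is a minimal sketch: `save_checkpoint` and `load_checkpoint` are hypothetical names, and the code assumes `from_bytes` can be called on the class (as the `-> SemvecState` return annotation suggests). Only the `to_bytes(compress=...)` and `from_bytes(blob)` calls come from the table above.

```python
import os
import tempfile
from pathlib import Path


def save_checkpoint(state, path: str) -> None:
    """Write state.to_bytes() to `path` via a temp file and an atomic rename."""
    blob = state.to_bytes(compress=True)
    target = Path(path)
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(blob)
        os.replace(tmp_name, target)
    except BaseException:
        # Do not leave a half-written temp file behind.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_checkpoint(state_cls, path: str):
    """Restore a state by handing the raw bytes to state_cls.from_bytes()."""
    return state_cls.from_bytes(Path(path).read_bytes())
```

Typical use would be `save_checkpoint(state, "state.bin")` followed later by `load_checkpoint(SemvecState, "state.bin")`. Because these two methods are not surfaced in the type stubs, a type checker may flag the calls even though they exist at runtime.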

Snapshot redaction¶

to_dict / to_bytes carry three independent privacy toggles for sharing snapshots outside trusted hands. All three default to True (back-compat) and can be combined freely:

Toggle	Default	When `False`, redacts
`include_memory_text`	`True`	The user-prose `text` on every memory entry — embedding, hash, scores stay so retrieval against the snapshot remains functional.
`include_literal_cache_text`	`True`	The verbatim text inside `LiteralCache` extracted facts (Decisions / Errors / Code structures). The structured fields and hashes stay.
`include_adaptive_params`	`True`	The `adaptive_params` block (`beta_max`, `decay_rate`, `reinforcement_threshold`, `diversity_penalty`, `learning_rate`) plus the internal `config` block (`beta_basis`, `beta_max_default`, `alpha_attention_scalar`, `state_norm_setpoint`). Strip when sharing snapshots outside trusted hands.

from_dict / from_bytes accept redacted snapshots — missing fields fall back to SemvecConfig defaults, and the integrity checksum is computed against those same defaults so the verification still succeeds.

# A snapshot you can hand to a third-party support engineer:
blob = state.to_bytes(
    include_memory_text=False,           # no user prose
    include_literal_cache_text=False,    # no verbatim facts
    include_adaptive_params=False,       # no internal tuning state
)

Aggregate diagnostics methods¶

SemvecState exposes a small set of on-demand diagnostics methods for dashboards and ops monitoring beyond what update() returns. Treat their return values as opaque indicators — useful for UI / monitoring / dispatch logic in your application, but not as a window into engine mechanics.

Method	Returns	Purpose
`calculate_fsm(...)`	`float` in `[0, 1]`	Overall stability score (higher = more stable). Useful for gating expensive actions on `> 0.7`.
`calculate_metrics(...)`	`dict[str, float]`	Internal diagnostics dict for ops dashboards.
`calculate_advanced_metrics(...)`	`dict[str, float]`	Extended diagnostics dict (super-set of `calculate_metrics`).

Keys and argument signatures are unstable

The exact keys in the diagnostics dicts, their interpretation, and the argument signatures of these methods are implementation details and may change between releases without notice.

Iterate the returned dict defensively (for k, v in d.items()) instead of hard-coding key names so your code stays forward-compatible.

The values are deterministic for a given (subject, dimension, input) tuple within a release; do not assume cross-release or cross-instance comparability. Outputs are licensing-bound — see licensing.
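The defensive-iteration advice above can be sketched as a small exporter that never names a key. `diagnostics_to_gauges` and the `semvec` prefix are hypothetical, and the keys in `fake_diag` are invented placeholders rather than real diagnostics keys.

```python
import math
from typing import Any, Mapping


def diagnostics_to_gauges(diag: Mapping[str, Any], prefix: str = "semvec") -> dict:
    """Turn a diagnostics dict into {metric_name: float} without naming any key."""
    gauges = {}
    for key, value in diag.items():
        # Keep only real numbers; bool is excluded even though it subclasses int.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Drop NaN / inf so dashboards never receive unusable points.
        if not math.isfinite(value):
            continue
        gauges[f"{prefix}.{key}"] = float(value)
    return gauges


# Placeholder dict: these keys are invented for the demo, not real semvec keys.
fake_diag = {"metric_a": 0.5, "metric_b": float("nan"), "label": "n/a", "count": 3}
gauges = diagnostics_to_gauges(fake_diag)
```

Because unknown or newly added keys simply pass through, a release that renames or adds diagnostics changes the exported metric names but does not break the exporter.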

Attributes¶

Attribute	Type	Notes
`semantic_state`	`np.ndarray`	The live state vector, shape `(dim,)`. Readable and writable.
`interaction_count`	`int`	Total `update()` calls since construction or reset. (runtime attribute, not in stub)
`timestamp`	`int`	Monotonic tick counter. (runtime attribute, not in stub)
`memory`	`MultiResolutionMemory`	Short-term / medium-term / long-term tiers. (runtime attribute, not in stub)
`phase_detector`	`PhaseDetector`	Automatic phase detector. (runtime attribute, not in stub)
`literal_cache`	`LiteralCache`	Verbatim structured-memory layer. (runtime attribute, not in stub)
`anchor_count`	`int`	Number of registered drift anchors. (runtime attribute, not in stub)
`anchor_score`	`float`	Mean cosine of state vs all registered anchors. (runtime attribute, not in stub)
`topic_switch_history`	`list[dict]`	Bounded list of detected switches. (runtime attribute, not in stub)
`phase_history`	`list[str]`	Phase transitions recorded since construction.

Additional methods¶

Further public methods on SemvecState, surfaced here for completeness. Argument lists reflect the installed wheel; treat return values as opaque indicators (see the diagnostics warning above).

Method	Purpose
`inject_memory(embedding, text, tier, importance=1.0, timestamp=0.0, access_count=0, meta=None, protection_score=0.0)`	Manually seed a memory at a specific tier; useful for warmstarts and migration.
`consolidate_long_term()`	Run a single consolidation pass over the long-term tier.
`get_all_memories_flat()`	Flat list view of every memory across all tiers.
`get_total_stored()`	Total number of memories currently held across all tiers.
`get_metrics()`	Snapshot of the last computed metric dict.
`get_phase()`	Current phase label.
`get_dynamic_top_k()`	Suggested retrieval top-k for the current state.
`query_similarities_vectorized(query_embedding)`	Batch similarity scoring against the live state.
`add_negative_attractor(error_vector, description=..., source=..., severity=1.0)`	Register a negative attractor to demote in retrieval.
`clear_negative_attractors()`	Remove all registered negative attractors.
`clear_resonance_triggers()`	Remove all registered resonance triggers.
`set_isolation_filter(filter_)`	Restrict retrieval to memories matching the supplied filter.
`release_quarantine_count()`	Number of memories released from quarantine since construction.
`update_batch(...)`	Batched variant of `update()` for bulk ingestion.

Diagnostic attributes¶

Rolling-history and counter attributes exposed for dashboards and ops monitoring. Treat their contents as opaque indicators — useful for UI / monitoring in your application, not as a window into engine mechanics. Lengths are bounded by SemvecConfig.history_length.

Attribute	Type	Notes
`similarity_history`	`list[float]`	Rolling history of per-turn similarity values.
`norm_history`	`list[float]`	Rolling history of post-update state-vector norms.
`beta_history`	`list[float]`	Rolling history of adaptive blending coefficients.
`fsm_history`	`list[float]`	Rolling history of stability scores.
`drift_threshold`	`float`	Current drift threshold used by the detector.
`negative_attractor_count`	`int`	Number of currently registered negative attractors.
`resonance_trigger_count`	`int`	Number of currently registered resonance triggers.
`realignment_remaining`	`int`	Remaining steps in any active realignment.
`operator_state_vector`	`np.ndarray`	Companion vector tracking operator-side state.
`config`	`SemvecConfig`	The config object the state was constructed with.

`SemvecConfig`¶

Immutable configuration dataclass passed into SemvecState(config=…). Every field is a keyword argument; everything has a default so SemvecConfig() is a valid call.

from semvec import SemvecConfig

cfg = SemvecConfig(dimension=384)

Fields¶

Out-of-range values raise ConfigurationError in the constructor.

Field descriptions state what each knob is for, not the underlying mechanism, which is out of scope for this reference.

Identity & embedder¶

Field	Type	Default	Purpose
`model_name`	`str`	`"all-MiniLM-L6-v2"`	Preferred embedder label (informational hint — the state never loads a model itself).
`dimension`	`int`	`384`	Embedding dimension. Must match your embedder's `get_dimension()`.
`device`	`str`	`"cpu"`	Device hint for the embedder (informational).
`debug`	`bool`	`False`	Enable verbose core logging.
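Since the state never loads a model itself, a dimension mismatch between config and embedder is easy to miss until the first `update()`. The sketch below checks it up front. `check_dimension` is a hypothetical helper, and the only interface it assumes is a `dimension` attribute on the config and a `get_dimension()` method on the embedder, as named in the table above.

```python
def check_dimension(config, embedder) -> int:
    """Return the shared dimension, or raise if config and embedder disagree."""
    expected = int(config.dimension)
    actual = int(embedder.get_dimension())
    if expected != actual:
        raise ValueError(
            f"SemvecConfig.dimension={expected} but embedder reports {actual}"
        )
    return expected
```

Calling this once before constructing `SemvecState` turns a late shape error into an early, readable one.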

Memory tiers & retention¶

Field	Type	Default	Purpose
`short_term_size`	`int`	`15`	Short-term memory capacity.
`medium_term_size`	`int`	`50`	Medium-term memory capacity.
`long_term_size`	`int`	`200`	Long-term memory capacity.
`use_selective_forgetting`	`bool`	`True`	When `True`, evict by score when a tier overflows; when `False`, evict in FIFO order.
`compression_ratio`	`float`	`0.3`	Text-compression ratio on promotion between tiers. Lower = shorter compressed output per turn, less retained nuance. Tune in `[0.1, 0.5]`.

Phase detector & rolling windows¶

Field	Type	Default	Purpose
`phase_detection_window`	`int`	`50`	Sliding-window size the phase detector consumes.
`context_window`	`int`	`20`	Recent-input window kept for novelty / topic-switch scoring.
`history_length`	`int`	`20`	Cap on rolling history arrays (`norm_history`, `similarity_history`, `beta_history`).

Topic-switch detector¶

Field	Type	Default	Purpose
`enable_topic_switch`	`bool`	`True`	Master switch for the topic-switch detector.
`topic_switch_threshold`	`float`	`0.3`	Sensitivity knob — raise to make the detector less twitchy on noisy domains, lower to make it fire sooner.
`topic_switch_window`	`int`	`5`	Number of consecutive turns the detector watches.
`auto_anchor_on_topic_switch`	`bool`	`False`	When `True`, snapshot the current `semantic_state` as a fresh anchor on every detected switch.
`max_auto_anchors`	`int`	`8`	Cap on anchors created via `auto_anchor_on_topic_switch`.

Retrieval boosts & tier weights¶

Field	Type	Default	Purpose
`anchor_retrieval_boost`	`float`	`0.6`	Score boost applied when registered anchors align with the candidate. Tune in `[0.1, 0.6]`.
`trigger_retrieval_boost`	`float`	`0.3`	Score boost applied when a `ResonanceTrigger` matches. Tune in `[0.1, 0.6]`.
`short_term_weight`	`float`	`1.0`	Tier weight for short-term memories during retrieval.
`medium_term_weight`	`float`	`0.95`	Tier weight for medium-term memories.
`long_term_weight`	`float`	`0.9`	Tier weight for long-term memories.
`negative_attractor_penalty`	`float`	`0.5`	Overall strength of `NegativeAttractor` demotion in retrieval (`[0, 1]`). (advanced; rarely changed — internal stability safeguard, leave at default unless a benchmark instructs otherwise)
`negative_attractor_threshold`	`float`	`0.3`	Cosine floor below which attractors are ignored.
`cluster_fallback_threshold`	`float`	`0.85`	Controls retrieval breadth for uncertain matches against the long-term tier. Higher values keep older domains reachable; lower values stay narrow.

Internal tuning constants¶

The following constants are used internally and exposed only for reproducibility of benchmark runs. They are not user-tunable — do not change them outside a benchmark harness. Names are listed without describing the underlying mechanism; see llms.txt for the disclosure policy.

Constant	Default
`beta_basis`	`0.05`
`beta_max_default`	`0.35`
`alpha_attention_scalar`	`0.3`
`state_norm_setpoint`	`1.2`

These four fields plus the adaptive_params block (beta_max, decay_rate, reinforcement_threshold, diversity_penalty, learning_rate) are stripped from snapshots when you pass include_adaptive_params=False to to_dict() / to_bytes() — use that toggle whenever you need to share a snapshot outside trusted hands.

`MultiResolutionMemory`¶

Three-tier episodic memory. Exposed via state.memory.

Method	Returns	Purpose
`get_relevant_memories(query_embedding, top_k=10, *, meta_filter=None)`	`list[MemoryUnit]`	Cosine-similarity retrieval across all tiers. Optional `meta_filter` dict restricts results to memories whose `meta` matches.
`short_term` / `medium_term` / `long_term`	`list[MemoryUnit]`	Direct tier access.
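The direct tier attributes make a quick occupancy readout straightforward. This is a sketch only: `tier_sizes` is a hypothetical helper that relies on nothing beyond the three list attributes in the table above.

```python
def tier_sizes(memory) -> dict:
    """Count the entries in each memory tier, plus a total across tiers."""
    sizes = {
        tier: len(getattr(memory, tier))
        for tier in ("short_term", "medium_term", "long_term")
    }
    sizes["total"] = sum(sizes.values())
    return sizes
```

Passing `state.memory` gives a dict suitable for logging next to the configured `short_term_size`, `medium_term_size`, and `long_term_size` capacities.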

`MemoryUnit`¶

Attribute	Type
`embedding`	`np.ndarray`
`text`	`str`
`importance`	`float`
`access_count`	`int`
`timestamp`	`float`
`semantic_hash`	`str` (8-hex-char content hash)
`protection_score`	`float` (retention bias for selective forgetting)

`PhaseDetector`¶

Exposed via state.phase_detector.

current_phase: str
phase_transitions: list[dict] — historical transitions with timestamps and metric snapshots.
update(metrics, timestamp) -> Optional[str] — returns the new phase if a transition occurred.
signal_topic_switch(magnitude) — bias the next rule-scoring pass toward exploration.

`LiteralCache`¶

Exact-text structured-memory layer. See Coding API for the full surface — recording, querying (query(text, max_results=10)), and build_handoff_context() for cross-session prompts.

`safe_cosine_similarity(a, b, eps=1e-8) -> float`¶

Cosine similarity that returns 0.0 on zero-norm vectors instead of NaN.
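To illustrate the documented behavior, here is a pure-Python sketch of a cosine similarity that returns 0.0 for zero-norm input. It is not the library's implementation. The name `safe_cosine_similarity_sketch` is hypothetical, and treating `eps` as a floor on each vector's norm is an assumption about how that parameter is used.

```python
import math
from typing import Sequence


def safe_cosine_similarity_sketch(
    a: Sequence[float], b: Sequence[float], eps: float = 1e-8
) -> float:
    """Cosine similarity that yields 0.0 when either vector is (near) zero."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: eps acts as a norm floor; below it, report "no similarity"
    # instead of dividing by (almost) zero.
    if norm_a < eps or norm_b < eps:
        return 0.0
    return dot / (norm_a * norm_b)
```

A plain `dot / (norm_a * norm_b)` would divide by zero here in pure Python, and produce NaN with NumPy arrays, which is the failure mode the safe variant avoids.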

Exceptions¶

from semvec import (
    SemvecError,                # base class
    ConfigurationError,
    EmbeddingError,
    StateCorruptionError,
    LicenseError,               # base for licensing issues
    LicenseExpiredError,
    RateLimitError,
)

All inherit from SemvecError. License-related exceptions inherit from LicenseError → SemvecError.

RateLimitError exceptions carry the standard Python args tuple; parse for retry hints if needed. (A dedicated retry_after attribute is planned for a future release.)
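Until a dedicated attribute exists, a retry hint has to be dug out of `args`. The layout of `args` is not specified here, so the sketch below makes a guess: it returns the first number it finds, either as a numeric argument or embedded in a string. `retry_hint_from_args` is a hypothetical helper, and the unit of the returned number is unknown.

```python
import re
from typing import Optional

_NUMBER = re.compile(r"(\d+(?:\.\d+)?)")


def retry_hint_from_args(exc: BaseException) -> Optional[float]:
    """Return the first number found in exc.args, or None if there is none.

    Assumption: a retry hint, if present, appears as a number in args.
    """
    for arg in exc.args:
        if isinstance(arg, bool):
            continue  # bool subclasses int; never treat it as a hint
        if isinstance(arg, (int, float)):
            return float(arg)
        if isinstance(arg, str):
            match = _NUMBER.search(arg)
            if match:
                return float(match.group(1))
    return None
```

In an `except RateLimitError as exc:` block, a `None` result should fall back to the application's own backoff policy rather than being treated as "retry immediately".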
