Core API (semvec)¶
The semantic-state engine. All symbols listed here are importable directly from semvec.
SemvecState¶
The persistent semantic state.
```python
from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(dimension=384))
```
Constructor¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| config | SemvecConfig \| None | SemvecConfig() | Configuration bundle (see below). |
update(input_embedding, text, *, meta=None) -> dict¶
Fold a single (embedding, text) pair into the state. meta is an optional dict attached to the resulting memory entry.
Returns a metric dict:
| Key | Type | Meaning |
|---|---|---|
| similarity | float | Cosine similarity between input and current state, pre-update. |
| beta | float | Adaptive blending coefficient for this turn. |
| pattern_strength | float | How strongly retrieved memories pulled the state. |
| fsm | float | Stability score in [0, 1] (high = converged, low = oscillating). |
| phase | str | Detected phase (initialization / exploration / convergence / resonance / stability / instability). |
| norm | float | L2 norm of the post-update state vector. |
| topic_switch | float | Signalled topic-switch magnitude (0 = none). |
| novelty_score | float | How semantically novel this input was. |
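
The metric dict can feed simple application-side gating. The following is a minimal sketch, assuming only the documented keys fsm, phase, and topic_switch. The helper name should_run_expensive_step, the STABLE_PHASES grouping, and the fsm_floor parameter are hypothetical and not part of semvec; the example dict is a hand-written stand-in for a real update() result.

```python
from typing import Any, Mapping

# Illustrative grouping of phase labels; semvec does not define this set.
STABLE_PHASES = {"convergence", "resonance", "stability"}


def should_run_expensive_step(metrics: Mapping[str, Any], fsm_floor: float = 0.7) -> bool:
    """Return True when the metrics suggest the state has settled.

    Missing keys fall back to conservative defaults so the helper
    keeps working if a key is absent.
    """
    fsm = float(metrics.get("fsm", 0.0))
    phase = str(metrics.get("phase", ""))
    switched = float(metrics.get("topic_switch", 0.0)) > 0.0
    return fsm > fsm_floor and phase in STABLE_PHASES and not switched


# Hand-written stand-in for the dict returned by state.update(...).
example_metrics = {
    "similarity": 0.82,
    "beta": 0.12,
    "pattern_strength": 0.4,
    "fsm": 0.91,
    "phase": "stability",
    "norm": 1.19,
    "topic_switch": 0.0,
    "novelty_score": 0.08,
}
```

In an application, the dict would come from state.update(embedding, text) rather than from a literal.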
Other methods¶
| Method | Purpose |
|---|---|
| add_anchor(embedding) | Register a drift anchor; biases retrieval toward the anchor's domain. |
| add_resonance_trigger(trigger) | Register a pre-built ResonanceTrigger(keyword=..., embedding=..., threshold=0.7, weight=1.0) instance (e.g. one you've round-tripped through a snapshot or constructed manually). Boosts memories on keyword or embedding match during retrieval. |
| to_dict(*, include_memory_text=True, include_literal_cache_text=True, include_adaptive_params=True) -> dict | Checksummed, JSON-safe full-state snapshot. The three keyword-only privacy toggles are independent; see "Snapshot redaction" below. |
| from_dict(data) -> SemvecState | Restore from a to_dict() snapshot. Raises StateCorruptionError on checksum mismatch. Tolerates pre-redaction snapshots that lack the optional sections. |
| to_bytes(compress=True, *, include_memory_text=True, include_literal_cache_text=True, include_adaptive_params=True) -> bytes | Compact binary checkpoint with magic header + corruption check. Same redaction kwargs as to_dict(). compress=False skips gzip for hot-path persistence. (Provided by the Rust core; not surfaced in Python type stubs.) |
| from_bytes(blob) -> SemvecState | Restore from a to_bytes() blob (auto-detects compressed vs uncompressed via the version byte). (Provided by the Rust core; not surfaced in Python type stubs.) |
| set_retrieval_projection_weights(matrix) | Inject a custom retrieval projection matrix (advanced; pins retrieval scoring across replays). |
| get_retrieval_projection_weights() -> list[list[float]] | Snapshot the current projection matrix. Useful for parity tests; treat the contents as opaque. |
Snapshot redaction¶
to_dict / to_bytes carry three independent privacy toggles for sharing snapshots outside trusted hands. All three default to True (back-compat) and can be combined freely:
| Toggle | Default | When False, redacts |
|---|---|---|
| include_memory_text | True | The user-prose text on every memory entry. Embedding, hash, and scores stay, so retrieval against the snapshot remains functional. |
| include_literal_cache_text | True | The verbatim text inside LiteralCache extracted facts (Decisions / Errors / Code structures). The structured fields and hashes stay. |
| include_adaptive_params | True | The adaptive_params block (beta_max, decay_rate, reinforcement_threshold, diversity_penalty, learning_rate) plus the internal config block (beta_basis, beta_max_default, alpha_attention_scalar, state_norm_setpoint). Strip when sharing snapshots outside trusted hands. |
from_dict / from_bytes accept redacted snapshots — missing fields fall
back to SemvecConfig defaults, and the integrity checksum is computed
against those same defaults so the verification still succeeds.
```python
# A snapshot you can hand to a third-party support engineer:
blob = state.to_bytes(
    include_memory_text=False,         # no user prose
    include_literal_cache_text=False,  # no verbatim facts
    include_adaptive_params=False,     # no internal tuning state
)
```
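
Because the three toggles are plain keyword arguments, an application can derive them from a single audience decision. A minimal sketch follows; the helper redaction_kwargs and its trusted parameter are hypothetical and not part of semvec. Only the three keyword names come from this reference.

```python
from typing import Dict


def redaction_kwargs(trusted: bool) -> Dict[str, bool]:
    """Build the keyword arguments for to_dict() / to_bytes().

    trusted=True keeps everything (the documented defaults);
    trusted=False redacts all three optional sections.
    """
    keep = bool(trusted)
    return {
        "include_memory_text": keep,
        "include_literal_cache_text": keep,
        "include_adaptive_params": keep,
    }


# Usage, assuming `state` is an existing SemvecState:
#   blob = state.to_bytes(**redaction_kwargs(trusted=False))
```

Since the toggles are independent, a finer-grained policy (for example, keeping memory text but stripping adaptive parameters) only needs a different dict.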
Aggregate diagnostics methods¶
SemvecState exposes a small set of on-demand diagnostics methods for
dashboards and ops monitoring beyond what update() returns. Treat
their return values as opaque indicators — useful for UI /
monitoring / dispatch logic in your application, but not as a
window into engine mechanics.
| Method | Returns | Purpose |
|---|---|---|
| calculate_fsm(...) | float in [0, 1] | Overall stability score (higher = more stable). Useful for gating expensive actions on > 0.7. |
| calculate_metrics(...) | dict[str, float] | Internal diagnostics dict for ops dashboards. |
| calculate_advanced_metrics(...) | dict[str, float] | Extended diagnostics dict (super-set of calculate_metrics). |
Keys and argument signatures are unstable
The exact keys in the diagnostics dicts, their interpretation, and the argument signatures of these methods are implementation details and may change between releases without notice.
Iterate the returned dict defensively (for k, v in d.items())
instead of hard-coding key names so your code stays forward-compatible.
The values are deterministic for a given (subject, dimension,
input) tuple within a release; do not assume cross-release or
cross-instance comparability. Outputs are licensing-bound — see
licensing.
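
The defensive-iteration advice can be sketched as a small adapter that forwards whatever numeric entries are present without naming any key. The function numeric_gauges and the "semvec." prefix are hypothetical; the input in the usage comment stands in for a diagnostics dict.

```python
from numbers import Real
from typing import Any, Dict, Mapping


def numeric_gauges(diagnostics: Mapping[str, Any], prefix: str = "semvec.") -> Dict[str, float]:
    """Copy every numeric entry into a flat gauge dict.

    No key name is hard-coded, so added, renamed, or removed keys
    do not break the caller. Non-numeric values and booleans are skipped.
    """
    gauges: Dict[str, float] = {}
    for key, value in diagnostics.items():
        if isinstance(value, Real) and not isinstance(value, bool):
            gauges[prefix + str(key)] = float(value)
    return gauges


# Usage, assuming `state` is an existing SemvecState and the call
# takes no required arguments in your installed release:
#   for name, value in numeric_gauges(state.calculate_metrics()).items():
#       print(name, value)
```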
Attributes¶
| Attribute | Type | Notes |
|---|---|---|
| semantic_state | np.ndarray | The live state vector, shape (dim,). Readable and writable. |
| interaction_count | int | Total update() calls since construction or reset. (runtime attribute, not in stub) |
| timestamp | int | Monotonic tick counter. (runtime attribute, not in stub) |
| memory | MultiResolutionMemory | Short-term / medium-term / long-term tiers. (runtime attribute, not in stub) |
| phase_detector | PhaseDetector | Automatic phase detector. (runtime attribute, not in stub) |
| literal_cache | LiteralCache | Verbatim structured-memory layer. (runtime attribute, not in stub) |
| anchor_count | int | Number of registered drift anchors. (runtime attribute, not in stub) |
| anchor_score | float | Mean cosine of state vs all registered anchors. (runtime attribute, not in stub) |
| topic_switch_history | list[dict] | Bounded list of detected switches. (runtime attribute, not in stub) |
| phase_history | list[str] | Phase transitions recorded since construction. |
Additional methods¶
Further public methods on SemvecState, surfaced here for completeness.
Argument lists reflect the installed wheel; treat return values as
opaque indicators (see the diagnostics warning above).
| Method | Purpose |
|---|---|
| inject_memory(embedding, text, tier, importance=1.0, timestamp=0.0, access_count=0, meta=None, protection_score=0.0) | Manually seed a memory at a specific tier; useful for warmstarts and migration. |
| consolidate_long_term() | Run a single consolidation pass over the long-term tier. |
| get_all_memories_flat() | Flat list view of every memory across all tiers. |
| get_total_stored() | Total number of memories currently held across all tiers. |
| get_metrics() | Snapshot of the last computed metric dict. |
| get_phase() | Current phase label. |
| get_dynamic_top_k() | Suggested retrieval top-k for the current state. |
| query_similarities_vectorized(query_embedding) | Batch similarity scoring against the live state. |
| add_negative_attractor(error_vector, description=..., source=..., severity=1.0) | Register a negative attractor to demote in retrieval. |
| clear_negative_attractors() | Remove all registered negative attractors. |
| clear_resonance_triggers() | Remove all registered resonance triggers. |
| set_isolation_filter(filter_) | Restrict retrieval to memories matching the supplied filter. |
| release_quarantine_count() | Number of memories released from quarantine since construction. |
| update_batch(...) | Batched variant of update() for bulk ingestion. |
Diagnostic attributes¶
Rolling-history and counter attributes exposed for dashboards and ops
monitoring. Treat their contents as opaque indicators — useful for
UI / monitoring in your application, not as a window into engine
mechanics. Lengths are bounded by SemvecConfig.history_length.
| Attribute | Type | Notes |
|---|---|---|
| similarity_history | list[float] | Rolling history of per-turn similarity values. |
| norm_history | list[float] | Rolling history of post-update state-vector norms. |
| beta_history | list[float] | Rolling history of adaptive blending coefficients. |
| fsm_history | list[float] | Rolling history of stability scores. |
| drift_threshold | float | Current drift threshold used by the detector. |
| negative_attractor_count | int | Number of currently registered negative attractors. |
| resonance_trigger_count | int | Number of currently registered resonance triggers. |
| realignment_remaining | int | Remaining steps in any active realignment. |
| operator_state_vector | np.ndarray | Companion vector tracking operator-side state. |
| config | SemvecConfig | The config object the state was constructed with. |
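
For a dashboard, the rolling histories are often easier to read as a smoothed value. A minimal sketch, assuming only that a history attribute is a list of floats as documented above; trailing_mean and its window parameter are hypothetical helper names, not part of semvec.

```python
from statistics import fmean
from typing import Optional, Sequence


def trailing_mean(history: Sequence[float], window: int = 5) -> Optional[float]:
    """Mean of the last `window` entries, or None for an empty history."""
    if window <= 0:
        raise ValueError("window must be positive")
    tail = list(history)[-window:]
    return fmean(tail) if tail else None


# Usage, assuming `state` is an existing SemvecState:
#   smoothed_similarity = trailing_mean(state.similarity_history, window=5)
```

Returning None for an empty history lets the caller distinguish "no data yet" from a genuine mean of 0.0.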
SemvecConfig¶
Immutable configuration dataclass passed into SemvecState(config=…). Every field is a keyword argument; everything has a default so SemvecConfig() is a valid call.
Fields¶
Out-of-range values raise ConfigurationError in the constructor.
Field descriptions state what each knob is for, not the underlying mechanism, which is out of scope for this reference.
Identity & embedder¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| model_name | str | "all-MiniLM-L6-v2" | Preferred embedder label (informational hint; the state never loads a model itself). |
| dimension | int | 384 | Embedding dimension. Must match your embedder's get_dimension(). |
| device | str | "cpu" | Device hint for the embedder (informational). |
| debug | bool | False | Enable verbose core logging. |
Memory tiers & retention¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| short_term_size | int | 15 | Short-term memory capacity. |
| medium_term_size | int | 50 | Medium-term memory capacity. |
| long_term_size | int | 200 | Long-term memory capacity. |
| use_selective_forgetting | bool | True | Score-based eviction when a tier overflows vs FIFO. |
| compression_ratio | float | 0.3 | Text-compression ratio on promotion between tiers. Lower = shorter compressed output per turn, less retained nuance. Tune in [0.1, 0.5]. |
Phase detector & rolling windows¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| phase_detection_window | int | 50 | Sliding-window size the phase detector consumes. |
| context_window | int | 20 | Recent-input window kept for novelty / topic-switch scoring. |
| history_length | int | 20 | Cap on rolling history arrays (norm_history, similarity_history, beta_history). |
Topic-switch detector¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| enable_topic_switch | bool | True | Master switch for the topic-switch detector. |
| topic_switch_threshold | float | 0.3 | Sensitivity knob: raise to make the detector less twitchy on noisy domains, lower to make it fire sooner. |
| topic_switch_window | int | 5 | Number of consecutive turns the detector watches. |
| auto_anchor_on_topic_switch | bool | False | When True, snapshot the current semantic_state as a fresh anchor on every detected switch. |
| max_auto_anchors | int | 8 | Cap on anchors created via auto_anchor_on_topic_switch. |
Retrieval boosts & tier weights¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| anchor_retrieval_boost | float | 0.6 | Score boost applied when registered anchors align with the candidate. Tune in [0.1, 0.6]. |
| trigger_retrieval_boost | float | 0.3 | Score boost applied when a ResonanceTrigger matches. Tune in [0.1, 0.6]. |
| short_term_weight | float | 1.0 | Tier weight for short-term memories during retrieval. |
| medium_term_weight | float | 0.95 | Tier weight for medium-term memories. |
| long_term_weight | float | 0.9 | Tier weight for long-term memories. |
| negative_attractor_penalty | float | 0.5 | Overall strength of NegativeAttractor demotion in retrieval ([0, 1]). (Advanced and rarely changed: an internal stability safeguard; leave at default unless a benchmark instructs otherwise.) |
| negative_attractor_threshold | float | 0.3 | Cosine floor below which attractors are ignored. |
| cluster_fallback_threshold | float | 0.85 | Controls retrieval breadth for uncertain matches against the long-term tier. Higher values keep older domains reachable; lower values stay narrow. |
Internal tuning constants¶
The following constants are used internally and exposed only for reproducibility of benchmark runs. They are not user-tunable — do not change them outside a benchmark harness. Names are listed without describing the underlying mechanism; see llms.txt for the disclosure policy.
| Constant | Default |
|---|---|
| beta_basis | 0.05 |
| beta_max_default | 0.35 |
| alpha_attention_scalar | 0.3 |
| state_norm_setpoint | 1.2 |
These four fields plus the adaptive_params block (beta_max,
decay_rate, reinforcement_threshold, diversity_penalty,
learning_rate) are stripped from snapshots when you pass
include_adaptive_params=False to to_dict() / to_bytes() — use that
toggle whenever you need to share a snapshot outside trusted hands.
MultiResolutionMemory¶
Three-tier episodic memory. Exposed via state.memory.
| Method | Returns | Purpose |
|---|---|---|
| get_relevant_memories(query_embedding, top_k=10, *, meta_filter=None) | list[MemoryUnit] | Cosine-similarity retrieval across all tiers. Optional meta_filter dict restricts results to memories whose meta matches. |
| short_term / medium_term / long_term | list[MemoryUnit] | Direct tier access. |
MemoryUnit¶
| Attribute | Type |
|---|---|
| embedding | np.ndarray |
| text | str |
| importance | float |
| access_count | int |
| timestamp | float |
| semantic_hash | str (8-hex-char content hash) |
| protection_score | float (retention bias for selective forgetting) |
PhaseDetector¶
Exposed via state.phase_detector.
- current_phase: str
- phase_transitions: list[dict] (historical transitions with timestamps and metric snapshots)
- update(metrics, timestamp) -> Optional[str] (returns the new phase if a transition occurred)
- signal_topic_switch(magnitude) (biases the next rule-scoring pass toward exploration)
LiteralCache¶
Exact-text structured-memory layer. See Coding API for the full surface — recording, querying (query(text, max_results=10)), and build_handoff_context() for cross-session prompts.
safe_cosine_similarity(a, b, eps=1e-8) -> float¶
Cosine similarity that returns 0.0 on zero-norm vectors instead of NaN.
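
The documented behaviour can be illustrated with a pure-Python sketch. This is not the library's implementation: the name safe_cosine_sketch is hypothetical, and treating eps as a threshold on each vector's norm is an assumption made for illustration.

```python
import math
from typing import Sequence


def safe_cosine_sketch(a: Sequence[float], b: Sequence[float], eps: float = 1e-8) -> float:
    """Cosine similarity that yields 0.0 instead of NaN for zero-norm input."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a norm below eps counts as "zero" and short-circuits to 0.0.
    if norm_a < eps or norm_b < eps:
        return 0.0
    return dot / (norm_a * norm_b)
```

A plain dot / (norm_a * norm_b) would divide by zero for an all-zero vector; the guard makes the result well-defined for freshly initialised or empty embeddings.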
Exceptions¶
```python
from semvec import (
    SemvecError,           # base class
    ConfigurationError,
    EmbeddingError,
    StateCorruptionError,
    LicenseError,          # base for licensing issues
    LicenseExpiredError,
    RateLimitError,
)
```
All inherit from SemvecError. License-related exceptions inherit from LicenseError → SemvecError.
RateLimitError exceptions carry the standard Python args tuple; parse for retry hints if needed. (A dedicated retry_after attribute is planned for a future release.)
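
Since the layout of args is not specified here, any parsing has to be best-effort. The sketch below is one hedged approach: it scans args for the first number and returns None when nothing is found. The helper retry_hint_seconds is hypothetical, and the assumption that a retry hint appears as a number (bare or embedded in a message string) is not confirmed by this reference.

```python
import re
from typing import Optional

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def retry_hint_seconds(exc: BaseException) -> Optional[float]:
    """Best-effort scan of exc.args for a numeric retry hint.

    Returns the first number found (as float), or None if there is none.
    The caller should apply its own default backoff when None is returned.
    """
    for arg in exc.args:
        if isinstance(arg, bool):
            continue  # bool is an int subclass; never treat it as a delay
        if isinstance(arg, (int, float)):
            return float(arg)
        if isinstance(arg, str):
            match = _NUMBER.search(arg)
            if match:
                return float(match.group())
    return None


# Usage, assuming RateLimitError is imported from semvec:
#   try:
#       state.update(embedding, text)
#   except RateLimitError as exc:
#       delay = retry_hint_seconds(exc) or 1.0
```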
See also¶
- Quickstart — 5-minute REST + library walk-through
- Correcting memories — user-guide page that contextualises this API
- Architecture — abstract component model