Core API (semvec)¶
The Rust-backed PSS state engine. All symbols listed here are importable directly from semvec.
SemvecState¶
The persistent-semantic-state machine.
```python
from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(dimension=384))
```
Constructor¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `config` | `SemvecConfig \| None` | `SemvecConfig()` | Configuration bundle (see below). |
| `_test_subject_override` | `str \| None` | `None` | Test seam: pins the license subject used when deriving the per-state salt for `calculate_*` (≥ 0.2.0a1). Production code does not pass this; the subject is read from `SEMVEC_LICENSE_KEY`. |
update(embedding, text) -> dict¶
Fold a single (embedding, text) pair into the state.
Returns a metric dict:
| Key | Type | Meaning |
|---|---|---|
| `similarity` | `float` | Cosine similarity between input and current state, pre-update. |
| `beta` | `float` | Adaptive blending coefficient for this turn. |
| `pattern_strength` | `float` | Projected-retrieval pattern strength. |
| `fsm` | `float` | Field Stability Metric. |
| `phase` | `str` | Detected phase (`initialization` / `exploration` / `convergence` / `resonance` / `stability` / `instability`). |
| `norm` | `float` | L2 norm of the post-update state vector. |
| `topic_switch` | `float` | Signalled topic-switch magnitude (if any). |
| `novelty_score` | `float` | How semantically novel this input was. |
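As an illustration of consuming the metric dict, the sketch below formats one turn as a log line. It relies only on the keys documented in the table above; the helper name `summarize_turn` and the sample values are hypothetical and are not engine output.

```python
from typing import Any, Dict

# Keys documented for the dict returned by update().
EXPECTED_KEYS = (
    "similarity", "beta", "pattern_strength", "fsm",
    "phase", "norm", "topic_switch", "novelty_score",
)


def summarize_turn(metrics: Dict[str, Any]) -> str:
    """Format one update() result as a single log line (hypothetical helper)."""
    missing = [key for key in EXPECTED_KEYS if key not in metrics]
    if missing:
        raise KeyError(f"metric dict is missing keys: {missing}")
    return (
        f"phase={metrics['phase']} "
        f"sim={metrics['similarity']:.3f} "
        f"beta={metrics['beta']:.3f} "
        f"fsm={metrics['fsm']:.3f} "
        f"novelty={metrics['novelty_score']:.3f}"
    )


# Hand-written sample values for illustration only.
sample = {
    "similarity": 0.42, "beta": 0.12, "pattern_strength": 0.3, "fsm": 0.8,
    "phase": "exploration", "norm": 1.0, "topic_switch": 0.0,
    "novelty_score": 0.58,
}
line = summarize_turn(sample)
```

With a live state, the same helper would be called as `summarize_turn(state.update(embedding, text))`.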
Other methods¶
| Method | Purpose |
|---|---|
| `to_dict() -> dict` | SHA-256-checksummed, JSON-safe full-state snapshot. The hidden license-subject salt used by `calculate_*` is not included. |
| `from_dict(data, _test_subject_override=None) -> SemvecState` | Restore from a `to_dict()` snapshot. Raises `StateCorruptionError` on checksum mismatch. As of 0.2.0a6 the salt is not persisted: `from_dict()` always mints a fresh `instance_seed`, so `calculate_*` outputs differ across the round-trip, while non-salt state (`semantic_state`, memories, phase/fsm/beta histories, etc.) round-trips byte-identical. |
| `set_retrieval_projection_weights(matrix)` | Inject `W_down` (shape `(dim, rank)`) for bit-exact parity with `pss`. Validates dimension and rank. |
| `get_retrieval_projection_weights() -> list[list[float]]` | Snapshot current `W_down`. |
| `get_retrieval_projection_w_up() -> list[list[float]]` | Snapshot current `W_up` (for debugging the learned projection). |
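The checksum behaviour of `to_dict()` / `from_dict()` can be pictured with a generic SHA-256-over-JSON sketch. This is not semvec's snapshot layout: the `payload` and `sha256` field names, the canonical JSON encoding, and both function names are assumptions made for illustration, and the sketch raises `ValueError` where the library raises `StateCorruptionError`.

```python
import hashlib
import json


def _digest(payload: dict) -> str:
    # Canonical encoding (sorted keys, fixed separators) so that equal
    # payloads always hash to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def add_checksum(payload: dict) -> dict:
    """Wrap a JSON-safe payload together with its SHA-256 digest."""
    return {"payload": payload, "sha256": _digest(payload)}


def verify_checksum(snapshot: dict) -> dict:
    """Return the payload, or raise if it no longer matches the digest."""
    if _digest(snapshot["payload"]) != snapshot["sha256"]:
        raise ValueError("checksum mismatch: snapshot was modified or corrupted")
    return snapshot["payload"]


snapshot = add_checksum({"interaction_count": 3, "norm_history": [1.0, 0.98]})
restored = verify_checksum(snapshot)
```

Editing any value inside `snapshot["payload"]` after the fact makes `verify_checksum` raise, which is the failure mode the table describes for `from_dict()`.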
State-bound metric methods (≥ 0.2.0a1)¶
The Field Stability Metric, the metrics aggregator, and the advanced metrics pipeline are exposed as methods on `SemvecState` instead of free functions. Each method salts the numeric inputs with a hidden value before delegating to the same Rust kernel. The salt is derived from `SEMVEC_LICENSE_KEY` (or `"anonymous"` if unset), the state's dimension, and a 16-byte per-instance random seed.
As of 0.2.0a6 the seed is never persisted: `to_dict()` does not export it, and `from_dict()` always mints a fresh one. A round-tripped state therefore produces different `calculate_*` outputs than the original, while non-salt state (`semantic_state`, memories, all `*_history` arrays) round-trips byte-identical.
The salt is non-linear: a Fisher-Yates permutation of the input order is layered on top of the linear mantissa-XOR. Two states with different salts read positions of `norm_history` in different orders, which prevents a surrogate trainer from recovering the unsalted kernel by averaging outputs across many fresh states.
| Method | Purpose | Notes |
|---|---|---|
| `calculate_fsm(norm_history, history_length=50, beta_history=None, similarity_history=None) -> float` | Field Stability Metric. `beta_history` is accepted for legacy compatibility and ignored. | Salted; output is deterministic given (subject, dimension, instance seed, inputs) but differs from the unsalted reference. |
| `calculate_metrics(norm_history, similarity_history, beta_history, memory, semantic_clusters, history_length=50, short_term_size=20) -> dict` | Aggregated turn metrics: `SR`, `ST`, `CF`, `SM`, `BE_mean`. | Salted across all three history vectors. |
| `calculate_advanced_metrics(state_metrics, semantic_clusters, phase_history, adaptive_params) -> dict` | Adds `OC`, `phase_stability`, `adaptive_health` to a base metrics dict. | Salts `state_metrics` and `adaptive_params` values; cluster sizes and phase strings are not perturbed. |
The deprecated free functions semvec._core.calculate_fsm, ...calculate_metrics, and ...calculate_advanced_metrics remain importable for byte-identical pss-port behaviour, with a DeprecationWarning on every call. They will be removed in 1.0.
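The permutation half of the salting scheme described above can be sketched in a few lines. Only the ingredients (license subject, dimension, 16-byte instance seed, Fisher-Yates) come from this page; hashing them with SHA-256, seeding Python's `random.Random`, and the names `derive_salt` and `salted_order` are assumptions for illustration. The mantissa-XOR layer is not shown, and the real kernel is not reproduced here.

```python
import hashlib
import os
import random
from typing import List


def derive_salt(subject: str, dimension: int, instance_seed: bytes) -> int:
    """Mix the three salt ingredients into one integer (assumed scheme)."""
    h = hashlib.sha256()
    h.update(subject.encode("utf-8"))
    h.update(dimension.to_bytes(4, "big"))
    h.update(instance_seed)
    return int.from_bytes(h.digest()[:8], "big")


def salted_order(n: int, salt: int) -> List[int]:
    """Fisher-Yates shuffle of range(n), driven by a salt-seeded PRNG."""
    rng = random.Random(salt)
    order = list(range(n))
    # Walk from the last slot down, swapping each slot with a uniformly
    # chosen slot at or before it.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)
        order[i], order[j] = order[j], order[i]
    return order


seed = os.urandom(16)  # fresh per instance, never persisted
salt = derive_salt("anonymous", 384, seed)
order = salted_order(50, salt)
```

Because the order depends on the instance seed, two fresh states visit the same history in different orders, which is the property the paragraph above relies on.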
Attributes¶
| Attribute | Type | Notes |
|---|---|---|
| `semantic_state` | `np.ndarray` | The live state vector, shape `(dim,)`. Readable and writable. |
| `interaction_count` | `int` | Total `update()` calls since construction or reset. |
| `alpha_hit_count` | `int` | Alpha-gate hits (legacy; see `pss.core` for semantics). |
| `timestamp` | `int` | Monotonic tick counter. |
| `memory` | `MultiResolutionMemory` | Short-term / medium-term / long-term tiers. |
| `phase_detector` | `PhaseDetector` | Hybrid Markov × rules detector. |
| `literal_cache` | `LiteralCache` | Verbatim code-entity tier. |
SemvecConfig¶
Immutable configuration dataclass passed into SemvecState(config=…). Every field is a keyword argument; everything has a default so SemvecConfig() is a valid call.
Fields¶
| Field | Type | Default | Purpose |
|---|---|---|---|
| `model_name` | `str` | `"all-MiniLM-L6-v2"` | Preferred embedder label (user-space hint only; the state never loads a model itself). |
| `dimension` | `int` | `384` | Embedding dimension. Must match your embedder's `get_dimension()`. |
| `device` | `str` | `"cpu"` | Device hint for the embedder (same as above: informational). |
| `debug` | `bool` | `False` | Enable verbose core logging. |
| `beta_basis` | `float` | `0.05` | Minimum β; also the value resonance triggers clamp β down to. |
| `beta_max_default` | `float` | `0.35` | Maximum adaptive β before phase-specific caps kick in. |
| `alpha_attention_scalar` | `float` | `0.3` | Weight of the pattern term in the PSS update equation. |
| `short_term_size` | `int` | `15` | Short-term memory capacity. |
| `medium_term_size` | `int` | `50` | Medium-term memory capacity. |
| `long_term_size` | `int` | `100` | Long-term memory capacity. |
| `compression_ratio` | `float` | `0.3` | Text-compression ratio on promotion. |
| `use_selective_forgetting` | `bool` | `True` | Score-based eviction (0.4 × importance + 0.35 × recency + 0.25 × access) vs FIFO. |
| `phase_detection_window` | `int` | `50` | Sliding-window size for `PhaseDetector`. |
| `context_window` | `int` | `20` | Count of recent input embeddings kept for entropy/topic-switch detection. |
| `history_length` | `int` | `20` | Size cap on `beta_history` / `similarity_history` / `norm_history`. |
| `enable_topic_switch` | `bool` | `True` | Enable the topic-switch detector (contributes to novelty amplification). |
| `topic_switch_threshold` | `float` | `0.3` | Similarity drop that triggers a topic switch. |
| `topic_switch_window` | `int` | `5` | Consecutive-similarity window used by the topic-switch detector. |
| `template_vectors` | `list[np.ndarray]` | `[]` | Feature delta 24: pre-initialise `semantic_state` from templates. Set via `POST /v1/session/create` or directly on the dataclass. |
| `policy_vectors` | `list[PolicyVector]` | `[]` | Feature delta 26: compliance-modulation vectors. |
Validation happens in the constructor — out-of-range values raise ConfigurationError.
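The eviction score quoted for `use_selective_forgetting` is a plain weighted sum, and a worked example may make the weights concrete. The sketch assumes each component is already normalised to [0, 1] and that the lowest-scoring entry is the one evicted; neither detail is stated on this page, and `Candidate`, `retention_score`, and `pick_eviction` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    importance: float  # assumed normalised to [0, 1]
    recency: float     # assumed normalised to [0, 1]
    access: float      # assumed normalised to [0, 1]


def retention_score(c: Candidate) -> float:
    """Weighted sum using the weights from the use_selective_forgetting row."""
    return 0.4 * c.importance + 0.35 * c.recency + 0.25 * c.access


def pick_eviction(candidates: List[Candidate]) -> Candidate:
    """Assumption: the entry with the lowest score is evicted first."""
    return min(candidates, key=retention_score)


pool = [
    Candidate("a", importance=0.9, recency=0.1, access=0.2),  # about 0.445
    Candidate("b", importance=0.2, recency=0.9, access=0.1),  # about 0.420
    Candidate("c", importance=0.5, recency=0.5, access=0.5),  # about 0.500
]
victim = pick_eviction(pool)
```

Here a recent but unimportant entry (`b`) scores slightly below an important but stale one (`a`), so under the stated assumption `b` goes first; FIFO would ignore all three signals.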
MultiResolutionMemory¶
Three-tier episodic memory. Exposed via state.memory.
| Method | Returns | Purpose |
|---|---|---|
| `get_relevant_memories(query, top_k=10)` | `list[MemoryUnit]` | Cosine-similarity retrieval across all tiers. |
| `short_term` / `medium_term` / `long_term` | `list[MemoryUnit]` | Direct tier access. |
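Cosine-similarity retrieval across tiers can be sketched without the library. The version below pools every tier into one candidate list and keeps the `top_k` highest-scoring entries; whether semvec weights tiers differently or updates access counts during retrieval is not documented here, so treat the function `top_k_across_tiers` and the tuple-based memory layout as illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

# Each stored memory is (text, embedding) in this sketch.
Tier = List[Tuple[str, Sequence[float]]]


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def top_k_across_tiers(
    query: Sequence[float], tiers: List[Tier], top_k: int = 10
) -> List[Tuple[float, str]]:
    """Score every memory in every tier and return the best top_k."""
    scored = [
        (_cosine(query, embedding), text)
        for tier in tiers
        for text, embedding in tier
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


short_term = [("a", [1.0, 0.0])]
medium_term = [("b", [0.0, 1.0])]
long_term = [("c", [1.0, 1.0])]
hits = top_k_across_tiers([1.0, 0.0], [short_term, medium_term, long_term], top_k=2)
```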
MemoryUnit¶
| Attribute | Type |
|---|---|
| `embedding` | `np.ndarray` |
| `text` | `str` |
| `importance` | `float` |
| `access_count` | `int` |
| `timestamp` | `float` |
| `semantic_hash` | `str` (8-hex-char MD5 prefix) |
| `protection_score` | `float` |
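The `semantic_hash` format (an 8-hex-character MD5 prefix) is easy to reproduce with the standard library. What semvec actually feeds into the hash is not stated on this page; the sketch assumes the UTF-8 bytes of the memory text, and `semantic_hash_sketch` is a hypothetical name.

```python
import hashlib


def semantic_hash_sketch(text: str) -> str:
    """First 8 hex characters of the MD5 digest of the UTF-8 text (assumed input)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:8]


tag = semantic_hash_sketch("refactor the parser module")
```

An 8-character prefix carries 32 bits, so it works as a short identifier rather than a collision-resistant key.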
PhaseDetector¶
Hybrid Markov × rules. Exposed via state.phase_detector.
- `current_phase: str`
- `phase_transitions: list[dict]`: historical transitions with timestamps and metric snapshots.
- `update(metrics, timestamp) -> Optional[str]`: returns the new phase if a transition occurred.
- `signal_topic_switch(magnitude)`: bias the next rule-scoring pass toward `exploration`.
LiteralCache¶
Exact-text code-entity cache used by the coding product. See Coding API for typical usage.
safe_cosine_similarity(a, b, eps=1e-8) -> float¶
Cosine similarity that returns 0.0 on zero-norm vectors instead of NaN.
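The zero-norm guard can be shown with a pure-Python reference. This is a sketch of the documented behaviour, not the library's implementation: treating any norm below `eps` as zero is an assumption about how `eps` is used, and the name `safe_cosine_sketch` is hypothetical.

```python
import math
from typing import Sequence


def safe_cosine_sketch(
    a: Sequence[float], b: Sequence[float], eps: float = 1e-8
) -> float:
    """Cosine similarity that yields 0.0 when either vector is (near) zero."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed use of eps: skip the division when a norm is effectively zero.
    if norm_a < eps or norm_b < eps:
        return 0.0
    return dot / (norm_a * norm_b)


zero_case = safe_cosine_sketch([0.0, 0.0], [1.0, 2.0])
parallel_case = safe_cosine_sketch([1.0, 0.0], [2.0, 0.0])
```

The guard matters on the first turns of a session, when the state vector may still be all zeros and an unguarded division would produce NaN or an error.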
Exceptions¶
```python
from semvec import (
    SemvecError,           # base class
    ConfigurationError,
    EmbeddingError,
    StateCorruptionError,
    LicenseError,          # base for licensing issues
    LicenseExpiredError,
    RateLimitError,        # carries retry_after + upgrade_url
)
```
All inherit from SemvecError. The license and rate-limit types are semvec additions; everything else is preserved from pss.
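One way to use the `retry_after` value carried by `RateLimitError` is a small retry wrapper. The sketch defines local stand-in classes so that it runs without the package; in real code the exceptions would be imported from `semvec`. The constructor signature, the unit of `retry_after` (seconds), the placeholder URL, and the helper `call_with_retry` are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SemvecError(Exception):
    """Local stand-in for semvec.SemvecError."""


class RateLimitError(SemvecError):
    """Local stand-in; the real class carries retry_after and upgrade_url."""

    def __init__(self, message: str, retry_after: float, upgrade_url: str):
        super().__init__(message)
        self.retry_after = retry_after  # assumed to be seconds
        self.upgrade_url = upgrade_url


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, waiting retry_after between rate-limited attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(exc.retry_after)
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky() -> str:
    # Fails twice with a rate limit, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError(
            "slow down", retry_after=0.5,
            upgrade_url="https://example.invalid/upgrade",  # placeholder
        )
    return "ok"


waits: list = []
result = call_with_retry(flaky, max_attempts=3, sleep=waits.append)
```

Passing `sleep=waits.append` records the waits instead of sleeping, which keeps the example fast; catching `SemvecError` around the call would handle every other documented exception type.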