Core API (semvec)

The Rust-backed PSS state engine. All symbols listed here are importable directly from semvec.

SemvecState

The persistent-semantic-state machine.

from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(dimension=384))

Constructor

Parameter Type Default Description
config SemvecConfig | None SemvecConfig() Configuration bundle (see below).
_test_subject_override str | None None Test seam: pins the license-subject used when deriving the per-state salt for calculate_* (≥ 0.2.0a1). Production code does not pass this; the subject is read from SEMVEC_LICENSE_KEY.

update(embedding, text) -> dict

Fold a single (embedding, text) pair into the state.

Returns a metric dict:

Key Type Meaning
similarity float Cosine similarity between input and current state, pre-update.
beta float Adaptive blending coefficient for this turn.
pattern_strength float Projected-retrieval pattern strength.
fsm float Field Stability Metric.
phase str Detected phase (initialization / exploration / convergence / resonance / stability / instability).
norm float L2 norm of the post-update state vector.
topic_switch float Signalled topic-switch magnitude (if any).
novelty_score float How semantically novel this input was.
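
As a rough illustration of consuming this dict, the sketch below checks that the documented keys are present and formats a one-line summary. The values in sample_metrics are invented for the example (they are not real engine output), summarize_turn is a hypothetical helper, and treating any topic_switch above zero as "a switch was signalled" is an assumption.

```python
# Hand-written stand-in for the dict returned by update(); values are made up.
sample_metrics = {
    "similarity": 0.82,
    "beta": 0.12,
    "pattern_strength": 0.40,
    "fsm": 0.91,
    "phase": "convergence",
    "norm": 1.0,
    "topic_switch": 0.0,
    "novelty_score": 0.18,
}

# The eight keys listed in the table above.
EXPECTED_KEYS = {
    "similarity", "beta", "pattern_strength", "fsm",
    "phase", "norm", "topic_switch", "novelty_score",
}


def summarize_turn(metrics: dict) -> str:
    """Hypothetical helper: validate the keys and build a short log line."""
    missing = EXPECTED_KEYS - metrics.keys()
    if missing:
        raise KeyError(f"metric dict is missing keys: {sorted(missing)}")
    # Assumption: a magnitude above zero means a topic switch was signalled.
    switched = "yes" if metrics["topic_switch"] > 0.0 else "no"
    return (
        f"phase={metrics['phase']} sim={metrics['similarity']:.2f} "
        f"beta={metrics['beta']:.2f} topic_switch={switched}"
    )


summary = summarize_turn(sample_metrics)
```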

Other methods

Method Purpose
to_dict() -> dict SHA-256-checksummed, JSON-safe full-state snapshot. The hidden license-subject salt used by calculate_* is not included.
from_dict(data, _test_subject_override=None) -> SemvecState Restore from a to_dict() snapshot. Raises StateCorruptionError on checksum mismatch. As of 0.2.0a6 the salt is not persisted: from_dict() always mints a fresh instance_seed, so calculate_* outputs differ across the round-trip, while non-salt state (semantic_state, memories, phase/fsm/beta histories, etc.) round-trips byte-identical.
set_retrieval_projection_weights(matrix) Inject W_down (shape (dim, rank)) for bit-exact parity with pss. Validates dimension and rank.
get_retrieval_projection_weights() -> list[list[float]] Snapshot current W_down.
get_retrieval_projection_w_up() -> list[list[float]] Snapshot current W_up (for debugging the learned projection).
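
The checksum behaviour of to_dict() / from_dict() can be pictured with a generic SHA-256-over-canonical-JSON pattern. This is only a sketch of the idea, not semvec's actual snapshot layout: make_snapshot, verify_snapshot, the "payload" / "checksum" keys, and the canonicalisation rules are all assumptions, and ValueError stands in for StateCorruptionError.

```python
import hashlib
import json


def _digest(payload: dict) -> str:
    # Canonical form (sorted keys, fixed separators) so the hash is stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_snapshot(payload: dict) -> dict:
    """Hypothetical: wrap a JSON-safe payload together with its checksum."""
    return {"payload": payload, "checksum": _digest(payload)}


def verify_snapshot(snapshot: dict) -> dict:
    """Hypothetical: return the payload, or raise if it was altered."""
    if _digest(snapshot["payload"]) != snapshot["checksum"]:
        # The real library raises StateCorruptionError here.
        raise ValueError("checksum mismatch")
    return snapshot["payload"]


snap = make_snapshot({"semantic_state": [0.1, 0.2, 0.3], "timestamp": 7})
# Simulate writing to disk and reading back.
restored = verify_snapshot(json.loads(json.dumps(snap)))
```

Editing any value in the payload after the snapshot is taken changes the digest, so verify_snapshot rejects it.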

State-bound metric methods (≥ 0.2.0a1)

The Field Stability Metric, the metrics aggregator, and the advanced metrics pipeline are exposed as methods on SemvecState instead of free functions. Each method salts the numeric inputs with a hidden value derived from SEMVEC_LICENSE_KEY (or "anonymous" if unset), the state's dimension, and a 16-byte per-instance random seed before delegating to the same Rust kernel. As of 0.2.0a6 the seed is never persisted: to_dict() does not export it, and from_dict() always mints a fresh seed. A round-tripped state therefore produces different calculate_* outputs than the original; non-salt state (semantic_state, memories, all *_history arrays) round-trips byte-identical.

The salt is non-linear: a Fisher-Yates permutation of the input order is layered on top of the linear mantissa-XOR. Two states with different salts read the positions of norm_history in different orders, which prevents a surrogate trainer from recovering the unsalted kernel by averaging outputs across many fresh states.

Method Signature Notes
calculate_fsm(norm_history, history_length=50, beta_history=None, similarity_history=None) -> float Field Stability Metric. beta_history is accepted for legacy compatibility and ignored. Salted; output is deterministic for a given instance (subject, dimension, per-instance seed, inputs) but differs from the unsalted reference.
calculate_metrics(norm_history, similarity_history, beta_history, memory, semantic_clusters, history_length=50, short_term_size=20) -> dict Aggregated turn metrics: SR, ST, CF, SM, BE_mean. Salted across all three history vectors.
calculate_advanced_metrics(state_metrics, semantic_clusters, phase_history, adaptive_params) -> dict Adds OC, phase_stability, adaptive_health to a base metrics dict. Salts state_metrics and adaptive_params values; cluster sizes and phase strings are not perturbed.
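
To make the permutation part of the salting more concrete, here is a minimal sketch of a seeded Fisher-Yates shuffle over the read order of a history array. It is not the Rust kernel: derive_seed and fisher_yates_order are hypothetical names, mixing the inputs with SHA-256 and drawing indices from random.Random are assumptions, and the mantissa-XOR step is left out entirely.

```python
import hashlib
import random


def derive_seed(subject: str, dimension: int, instance_seed: bytes) -> int:
    """Hypothetical: fold subject, dimension and per-instance seed into one int."""
    material = subject.encode("utf-8") + dimension.to_bytes(4, "big") + instance_seed
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")


def fisher_yates_order(n: int, seed: int) -> list:
    """Return a permutation of range(n) via a seeded Fisher-Yates shuffle."""
    rng = random.Random(seed)
    order = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        order[i], order[j] = order[j], order[i]
    return order


norm_history = [1.00, 1.01, 0.99, 1.02, 1.00]
# A fixed all-zero seed keeps the example reproducible; a real per-instance
# seed would be 16 random bytes (for example os.urandom(16)).
seed = derive_seed("anonymous", 384, bytes(16))
order = fisher_yates_order(len(norm_history), seed)
permuted = [norm_history[i] for i in order]
```

The same (subject, dimension, seed) triple always yields the same order, while a different per-instance seed will almost surely yield a different one, which is the property the paragraph above relies on.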

The deprecated free functions semvec._core.calculate_fsm, ...calculate_metrics, and ...calculate_advanced_metrics remain importable for byte-identical pss-port behaviour, with a DeprecationWarning on every call. They will be removed in 1.0.
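
Because the free functions warn on every call, callers may want to surface or escalate those warnings while migrating. The sketch below uses only the standard warnings module; legacy_calculate_fsm is a hypothetical stand-in that emits a DeprecationWarning, not the real semvec._core function.

```python
import warnings


def legacy_calculate_fsm(norm_history):
    """Hypothetical stand-in for a deprecated free function."""
    warnings.warn(
        "use SemvecState.calculate_fsm instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return 0.0  # placeholder value, not a real metric


# 1. Record deprecation warnings, e.g. to audit remaining call sites.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_calculate_fsm([1.0, 1.0])
deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]

# 2. Escalate them to errors, e.g. in CI, so new uses fail fast.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_calculate_fsm([1.0, 1.0])
        escalated = False
    except DeprecationWarning:
        escalated = True
```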

Attributes

Attribute Type Notes
semantic_state np.ndarray The live state vector, shape (dim,). Readable and writable.
interaction_count int Total update() calls since construction or reset.
alpha_hit_count int Alpha-gate hits (legacy; see pss.core for semantics).
timestamp int Monotonic tick counter.
memory MultiResolutionMemory Short-term / medium-term / long-term tiers.
phase_detector PhaseDetector Hybrid Markov × rules detector.
literal_cache LiteralCache Verbatim code-entity tier.

SemvecConfig

Immutable configuration dataclass passed into SemvecState(config=…). Every field is a keyword argument; everything has a default so SemvecConfig() is a valid call.

from semvec import SemvecConfig

cfg = SemvecConfig(dimension=384)

Fields

Field Type Default Purpose
model_name str "all-MiniLM-L6-v2" Preferred embedder label (user-space hint only — the state never loads a model itself).
dimension int 384 Embedding dimension. Must match your embedder's get_dimension().
device str "cpu" Device hint for the embedder (same as above: informational).
debug bool False Enable verbose core logging.
beta_basis float 0.05 Minimum β; also the value that resonance triggers clamp β down to.
beta_max_default float 0.35 Maximum adaptive β before phase-specific caps kick in.
alpha_attention_scalar float 0.3 Weight of the pattern term in the PSS update equation.
short_term_size int 15 Short-term memory capacity.
medium_term_size int 50 Medium-term memory capacity.
long_term_size int 100 Long-term memory capacity.
compression_ratio float 0.3 Text-compression ratio on promotion.
use_selective_forgetting bool True Score-based eviction (0.4 × importance + 0.35 × recency + 0.25 × access) vs FIFO.
phase_detection_window int 50 Sliding-window size for PhaseDetector.
context_window int 20 Count of recent input embeddings kept for entropy/topic-switch detection.
history_length int 20 Size cap on beta_history / similarity_history / norm_history.
enable_topic_switch bool True Enable the topic-switch detector (contributes to novelty amplification).
topic_switch_threshold float 0.3 Similarity drop that triggers a topic switch.
topic_switch_window int 5 Consecutive-similarity window used by the topic-switch detector.
template_vectors list[np.ndarray] [] Feature delta 24 — pre-initialise semantic_state from templates. Set via POST /v1/session/create or directly on the dataclass.
policy_vectors list[PolicyVector] [] Feature delta 26 — compliance-modulation vectors.

Validation happens in the constructor — out-of-range values raise ConfigurationError.
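
The eviction formula quoted for use_selective_forgetting can be worked through by hand. The sketch below applies the documented weights to three invented memory entries; that each component is normalised to [0, 1] and that the lowest-scoring entry is the one evicted are assumptions, and eviction_score is a hypothetical helper name.

```python
def eviction_score(importance: float, recency: float, access: float) -> float:
    """Weighted score using the weights from the table above."""
    return 0.4 * importance + 0.35 * recency + 0.25 * access


# (importance, recency, access) per entry; values are made up and
# assumed to be normalised to [0, 1].
units = {
    "a": (0.9, 0.2, 0.1),  # important but old and rarely read
    "b": (0.3, 0.9, 0.8),  # recent and frequently read
    "c": (0.1, 0.1, 0.0),  # weak on every axis
}

scores = {name: eviction_score(*parts) for name, parts in units.items()}

# Assumption: the entry with the lowest score is evicted first.
victim = min(scores, key=scores.get)
```

Here "a" scores about 0.455, "b" about 0.635, and "c" about 0.075, so "c" is dropped first. Under FIFO the oldest entry would go regardless of its importance.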

MultiResolutionMemory

Three-tier episodic memory. Exposed via state.memory.

Method Returns Purpose
get_relevant_memories(query, top_k=10) list[MemoryUnit] Cosine-similarity retrieval across all tiers.
short_term / medium_term / long_term list[MemoryUnit] Direct tier access.
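
Conceptually, retrieval across all tiers amounts to pooling the units and ranking them by cosine similarity to the query. The following pure-Python sketch shows that idea with plain dicts in place of MemoryUnit; top_k_across_tiers and cosine are hypothetical helpers, and the real implementation may weight tiers or scores differently.

```python
import math


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against zero-norm vectors instead of dividing by zero.
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def top_k_across_tiers(query, tiers, top_k=10):
    """Hypothetical: pool every tier, rank by similarity, keep the best top_k."""
    pooled = [unit for tier in tiers for unit in tier]
    ranked = sorted(pooled, key=lambda u: cosine(query, u["embedding"]), reverse=True)
    return ranked[:top_k]


# Dicts stand in for MemoryUnit; the 2-d embeddings are toy values.
short_term = [{"text": "x axis", "embedding": [1.0, 0.0]}]
medium_term = [{"text": "diagonal", "embedding": [1.0, 1.0]}]
long_term = [{"text": "y axis", "embedding": [0.0, 1.0]}]

hits = top_k_across_tiers([1.0, 0.1], [short_term, medium_term, long_term], top_k=2)
hit_texts = [h["text"] for h in hits]
```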

MemoryUnit

Attribute Type
embedding np.ndarray
text str
importance float
access_count int
timestamp float
semantic_hash str (8-hex-char MD5 prefix)
protection_score float
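
An 8-hex-character MD5 prefix like semantic_hash can be produced with hashlib as sketched below. Which input is actually hashed is not stated here; hashing the UTF-8 text is an assumption, and semantic_hash_sketch is a hypothetical helper.

```python
import hashlib


def semantic_hash_sketch(text: str) -> str:
    """Hypothetical: first 8 hex characters of the MD5 digest of the text."""
    # Assumption: the hash is taken over the UTF-8 encoded text.
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:8]


tag = semantic_hash_sketch("def parse(config): ...")
```

Eight hex characters carry only 32 bits, so such a prefix works as a compact identifier but collisions are possible across large collections.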

PhaseDetector

Hybrid Markov × rules. Exposed via state.phase_detector.

  • current_phase: str
  • phase_transitions: list[dict] — historical transitions with timestamps and metric snapshots.
  • update(metrics, timestamp) -> Optional[str] — returns the new phase if a transition occurred.
  • signal_topic_switch(magnitude) — bias the next rule-scoring pass toward exploration.

LiteralCache

Exact-text code-entity cache used by the coding product. See Coding API for typical usage.

safe_cosine_similarity(a, b, eps=1e-8) -> float

Cosine similarity that returns 0.0 on zero-norm vectors instead of NaN.
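
A minimal pure-Python sketch of the zero-norm guard follows. It illustrates the documented behaviour rather than the library's implementation: safe_cosine_sketch is a hypothetical name, and using eps as a threshold on each norm is an assumption about how the parameter is applied.

```python
import math


def safe_cosine_sketch(a, b, eps: float = 1e-8) -> float:
    """Cosine similarity that returns 0.0 when either vector has (near) zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: eps is a lower bound on each norm.
    if norm_a < eps or norm_b < eps:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


same = safe_cosine_sketch([1.0, 0.0], [2.0, 0.0])        # parallel vectors
orthogonal = safe_cosine_sketch([1.0, 0.0], [0.0, 3.0])  # perpendicular vectors
degenerate = safe_cosine_sketch([0.0, 0.0], [1.0, 0.0])  # zero-norm input
```

Without the guard, the zero-norm case divides zero by zero, which yields NaN with NumPy arrays and raises ZeroDivisionError with plain Python floats.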

Exceptions

from semvec import (
    SemvecError,           # base class
    ConfigurationError,
    EmbeddingError,
    StateCorruptionError,
    LicenseError,          # base for licensing issues
    LicenseExpiredError,
    RateLimitError,        # carries retry_after + upgrade_url
)

All inherit from SemvecError. The license and rate-limit types are semvec additions; everything else is preserved from pss.
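
One way to use this hierarchy is to catch RateLimitError specifically and back off for retry_after before retrying. The sketch below defines local stand-in classes so that it runs without semvec installed. The constructor signature, retry_after being in seconds, RateLimitError deriving directly from SemvecError, and the call_with_retry helper are all assumptions.

```python
import time


# Local stand-ins mirroring the documented hierarchy (not the real classes).
class SemvecError(Exception):
    pass


class LicenseError(SemvecError):
    pass


class LicenseExpiredError(LicenseError):
    pass


class RateLimitError(SemvecError):
    # Assumed constructor; the text above only says the attributes exist.
    def __init__(self, message, retry_after, upgrade_url):
        super().__init__(message)
        self.retry_after = retry_after  # assumed to be seconds
        self.upgrade_url = upgrade_url


def call_with_retry(fn, max_attempts=3, sleep=time.sleep):
    """Hypothetical helper: retry fn() after rate limits, re-raise on the last try."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)
```

Catching SemvecError as a final fallback would cover every type in the list, since all of them share that base.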