CRV Core

Foundational, zero-IO core for the CRV stack. This module defines:

Canonical grammar and normalization helpers (enums/EBNF).
Pydantic v2 models for payloads, decisions, context/persona/affect, and row schemas.
Table descriptors and a registry tied to the schema version.
Canonical JSON serialization and hashing helpers.
Light typed IDs/aliases, constants, and error classes.
Versioning metadata and compatibility helpers.

Downstream packages (crv.io, crv.world, crv.mind, crv.viz) depend on these contracts.

Naming and Normalization

Core depends only on the standard library and pydantic. It performs no file or network IO, with one exception: grammar.py reads the core.ebnf grammar file.
Naming policy:
Enum classes: PascalCase
Enum member names: UPPER_SNAKE (Python)
Serialized enum values: lower_snake
All field/column names: lower_snake
Normalization: free-form string inputs are normalized to canonical lower_snake via helpers in grammar.py: action_kind_from_value, exchange_kind_from_value, edge_kind_from_value, and normalize_visibility. is_lower_snake checks a value without changing it.
Tests enforce naming via ensure_all_enum_values_lower_snake([...]).

See: src/crv/core/grammar.py
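
The naming policy and normalization step above can be sketched in a few lines. This is an illustrative stand-in, not the code in grammar.py: the regular expression, the Visibility members, and the to_lower_snake helper are assumptions made for the example, and the real helpers may accept or reject different inputs.

```python
import re
from enum import Enum

# Assumed pattern: starts with a lowercase letter, single underscores between parts.
_LOWER_SNAKE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


class Visibility(str, Enum):
    # Hypothetical members; the real set is defined in grammar.py.
    PUBLIC = "public"
    GROUP = "group"
    PRIVATE = "private"


def is_lower_snake(value: str) -> bool:
    """Return True if value already follows the lower_snake convention."""
    return bool(_LOWER_SNAKE.match(value))


def to_lower_snake(raw: str) -> str:
    """Hypothetical helper: turn 'Group Only' or 'groupOnly' into 'group_only'."""
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", raw.strip())  # split camelCase
    text = re.sub(r"[^A-Za-z0-9]+", "_", text)  # collapse other separators
    return text.strip("_").lower()


def normalize_visibility(raw: str) -> str:
    """Map free-form input to a serialized Visibility value, or raise ValueError."""
    candidate = to_lower_snake(raw)
    try:
        return Visibility(candidate).value
    except ValueError:
        raise ValueError(f"unknown visibility: {raw!r}") from None
```

The sketch raises ValueError on unknown input; core would more plausibly raise its own GrammarError (see Errors), but that wiring is not shown here.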

Pydantic v2 Schemas

Models reside in src/crv/core/schema.py. Key groups:

Payloads:
Utterance: act/topic/stance/claims/style/audience.
Interpretation: event_type/targets/inferred/salience∈[0,1].
AppraisalVector: valence/arousal/certainty/novelty/goal_congruence∈[0,1].
GraphEdit: operation in { set_identity_edge_weight, adjust_identity_edge_weight, decay_identity_edges, remove_identity_edge } (canonical-only). edge_kind is normalized; use explicit fields:
- Token–token association: edge_kind="object_to_object", subject_id, object_id
- Positive trace: edge_kind="object_to_positive_valence", token_id
- Negative trace: edge_kind="object_to_negative_valence", token_id
- Optional slots: subject_id/object_id/related_agent_id/token_id; weights or decay_lambda as applicable.
RepresentationPatch: edits: List[GraphEdit]; energy_delta? (float).
Decisions:
ActionCandidate: action_type normalized via ActionKind; parameters; score; key.
DecisionHead: token_value_estimates; action_candidates; abstain; temperature.
Context/persona/affect:
ScenarioContext: visibility normalized; optional token_id, labels, channel_name; snapshots.
Persona: persona_id/label/traits.
AffectState: valence/arousal/stress defaults within [0,1].
Rows:
EventEnvelopeRow: envelope_kind in {"action","observation"}; status in {"pending","executed","rejected"}; visibility normalized.
MessageRow: visibility normalized; sender/channel/audience/speech_act/topic_label.
ExchangeRow: exchange_event_type normalized via ExchangeKind; optional side in {"buy","sell"}; quantity/price.
IdentityEdgeRow (Unified): edge_kind normalized via RepresentationEdgeKind; required-fields combination validator (see below).
ScenarioRow: observer perspective with snapshots; visibility normalized; includes context_hash.
DecisionRow: agent-level decisions (chosen_action/candidates/value estimates).
OracleCallRow: invocation metadata and hashes (persona/representation/context), timing, cache flags.

IdentityEdgeRow Combination Rules

Validator enforces required fields by edge_kind:

self_to_positive_valence: — (no additional slots; observer is self)
self_to_negative_valence: — (no additional slots; observer is self)
self_to_object: subject_id, token_id
self_to_agent: subject_id, object_id
agent_to_positive_valence: subject_id
agent_to_negative_valence: subject_id
agent_to_object: subject_id, token_id
agent_to_agent: subject_id, object_id
agent_pair_to_object: subject_id, related_agent_id, token_id
object_to_positive_valence: token_id
object_to_negative_valence: token_id
object_to_object: subject_id, object_id

This unifies identity edge logging into a single table identity_edges.
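
As a rough illustration, the combination rules can be expressed as a lookup table plus a presence check. The mapping below is transcribed from the list above; the function name missing_slots and the choice to check only for missing slots (not for unexpected extra ones) are assumptions of this sketch, and the real check lives in a pydantic validator on IdentityEdgeRow.

```python
from typing import Mapping, Optional

# Required slots per edge_kind, transcribed from the list above.
REQUIRED_SLOTS: dict[str, tuple[str, ...]] = {
    "self_to_positive_valence": (),
    "self_to_negative_valence": (),
    "self_to_object": ("subject_id", "token_id"),
    "self_to_agent": ("subject_id", "object_id"),
    "agent_to_positive_valence": ("subject_id",),
    "agent_to_negative_valence": ("subject_id",),
    "agent_to_object": ("subject_id", "token_id"),
    "agent_to_agent": ("subject_id", "object_id"),
    "agent_pair_to_object": ("subject_id", "related_agent_id", "token_id"),
    "object_to_positive_valence": ("token_id",),
    "object_to_negative_valence": ("token_id",),
    "object_to_object": ("subject_id", "object_id"),
}


def missing_slots(edge_kind: str, row: Mapping[str, Optional[str]]) -> list[str]:
    """Return the required slots that are absent or None for this edge_kind."""
    if edge_kind not in REQUIRED_SLOTS:
        raise ValueError(f"unknown edge_kind: {edge_kind!r}")
    return [slot for slot in REQUIRED_SLOTS[edge_kind] if row.get(slot) is None]
```

An empty result means the row satisfies the rule for its edge_kind; a validator built on this idea would raise when the list is non-empty.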

Quick examples

GraphEdit (canonical operations)

from crv.core.schema import GraphEdit, RepresentationPatch

# Token–token association
e1 = GraphEdit(
    operation="set_identity_edge_weight",
    edge_kind="object_to_object",
    subject_id="TokenA",
    object_id="TokenB",
    new_weight=0.75,
)

# Positive valence trace
e2 = GraphEdit(
    operation="adjust_identity_edge_weight",
    edge_kind="object_to_positive_valence",
    token_id="Alpha",
    delta_weight=0.1,
)

patch = RepresentationPatch(edits=[e1, e2])

IdentityEdgeRow (minimal valid payloads)

from crv.core.schema import IdentityEdgeRow

# object_to_positive_valence requires token_id
row1 = IdentityEdgeRow(
    tick=1,
    observer_agent_id="agent_1",
    edge_kind="object_to_positive_valence",
    token_id="Alpha",
    edge_weight=0.6,
)

# agent_to_object requires subject_id, token_id
row2 = IdentityEdgeRow(
    tick=1,
    observer_agent_id="agent_1",
    edge_kind="agent_to_object",
    subject_id="agent_2",
    token_id="Alpha",
    edge_weight=0.4,
)

Grammar and EBNF

grammar.py defines enums:

ActionKind, ChannelType, Visibility, PatchOp, RepresentationEdgeKind, TopologyEdgeKind (future), ExchangeKind, TableName.

Lower_snake EBNF and helpers:

EBNF_GRAMMAR (lower_snake authoritative terminals; see crv.core.grammar.EBNF_GRAMMAR)
Helpers: is_lower_snake, assert_lower_snake, normalize_visibility, canonical_action_key, etc.
Test utility: ensure_all_enum_values_lower_snake.

Design principles

One naming standard: Enum classes are PascalCase; enum member names are UPPER_SNAKE; serialized enum values (wire/EBNF/Parquet) are lower_snake; all field/column names are lower_snake.
Psychology-first: Core grammar does not encode legal “rights” taxonomies. Exchanges are generic and may publish a baseline_value that can feed valuation V(token) as B_token(t). Venue-specific mechanics live in payloads, not in core enums.
Representation vs. topology: RepresentationEdgeKind describes edges inside an agent’s identity/affect representation and is logged to identity_edges. TopologyEdgeKind (future) describes links in the world topology (e.g., is_neighbor, follows) and would live in a separate world_topology table.

Math-to-code mapping

Math symbol	Meaning (concept)	Code enum/value
s^+_{agent}	self positive anchor	self_to_positive_valence
s^-_{agent}	self negative anchor	self_to_negative_valence
s_{agent,token}	self→object attachment (endowment)	self_to_object
a_{agent,other_agent}	primitive self→other attitude	self_to_agent
u^+_{agent,other_agent}	positive feeling toward other agent	agent_to_positive_valence
u^-_{agent,other_agent}	negative feeling toward other agent	agent_to_negative_valence
b_{agent,other_agent,token}	other→object stance (as perceived by self)	agent_to_object
d_{agent,other_a,other_b}	other–other alliance/rivalry (as perceived)	agent_to_agent
q_{agent,other_a,other_b,token}	pair-on-object (perceived coalition on token)	agent_pair_to_object
r^+_{agent,token}	positive object trace	object_to_positive_valence
r^-_{agent,token}	negative object trace	object_to_negative_valence
c_{agent,token_a,token_b}	token–token association	object_to_object
U_agent(token)	representation readout driver	representation_score
V_agent(token)	bounded valuation	valuation_score
B_token(t)	exchange baseline (price/poll/trend)	baseline_value

Table Catalog

Descriptors live under src/crv/core/tables/ as frozen TableDescriptor instances; the tables package __init__.py registers them into a canonical registry. All tables include bucket (partitioning key; computed in IO as tick // TICK_BUCKET_SIZE) and version=SCHEMA_V.

exchange
Purpose: Generalized exchange events (trade/order/swap/gift/vote).
Key columns: tick, venue_id, token_id, exchange_event_type, side?, quantity?, price?, actor/counterparty?, baseline_value?, additional_payload (struct).
identity_edges (Unified representation edges)
Purpose: Snapshot/delta rows of edges inside an agent’s internal representation.
Key columns: tick, observer_agent_id, edge_kind, subject_id?, object_id?, related_agent_id?, token_id?, edge_weight, edge_sign?.
holdings
Purpose: Quantity snapshot of conserved resources per (tick, agent_id, token_id). Optional per ADR-003 when the domain models a conserved per-token resource.
Key columns: tick, agent_id, token_id, quantity.
holdings_valuation (TODO)
scenarios_seen
Purpose: Observer-centric scenario context snapshots used in valuation/decision.
Key columns: tick, observer_agent_id, token_id?, visibility_scope?, salient_agent_pairs (list[struct]), exchange_snapshot (struct), recent_affect_index?, salient_other_agent_id?, context_hash.
messages
Purpose: Communication events emitted by agents.
Key columns: tick, sender_agent_id, channel_name, visibility_scope, audience (struct), speech_act, topic_label, stance_label?, claims (struct), style (struct).
decisions
Purpose: Agent decision outputs per tick.
Key columns: tick, agent_id, chosen_action (struct), action_candidates (list[struct]), token_value_estimates (struct).
oracle_calls
Purpose: LLM/tooling calls with persona/context and cache metadata.
Key columns: tick, agent_id, engine, signature_id, persona_id, persona_hash, representation_hash, context_hash, value_json, latency_ms, cache_hit (i64), n_tool_calls (i64), tool_seq (struct).

APIs:

get_table(name: TableName) -> TableDescriptor
list_tables() -> list[TableDescriptor]
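
The registry pattern described above (frozen descriptors registered once, then looked up by name) might look roughly like this. The TableDescriptor fields, the register_table function, and the two-member TableName enum are hypothetical and kept small for illustration; only get_table and list_tables are named in this document.

```python
from dataclasses import dataclass
from enum import Enum


class TableName(str, Enum):
    # Subset of the catalog above, for illustration only.
    EXCHANGE = "exchange"
    IDENTITY_EDGES = "identity_edges"


@dataclass(frozen=True)
class TableDescriptor:
    # Hypothetical fields; the real descriptor may carry more metadata.
    name: TableName
    columns: tuple[str, ...]
    partitioning: tuple[str, ...] = ("bucket",)


_REGISTRY: dict[TableName, TableDescriptor] = {}


def register_table(desc: TableDescriptor) -> None:
    """Add a descriptor once; duplicate registration is treated as an error."""
    if desc.name in _REGISTRY:
        raise ValueError(f"table already registered: {desc.name.value}")
    _REGISTRY[desc.name] = desc


def get_table(name: TableName) -> TableDescriptor:
    return _REGISTRY[name]


def list_tables() -> list[TableDescriptor]:
    # Sorted by serialized name so the listing order is deterministic.
    return sorted(_REGISTRY.values(), key=lambda d: d.name.value)


register_table(TableDescriptor(TableName.IDENTITY_EDGES, ("tick", "edge_kind", "bucket")))
register_table(TableDescriptor(TableName.EXCHANGE, ("tick", "venue_id", "bucket")))
```

Freezing the dataclass means a descriptor cannot be mutated after registration, which keeps the registry consistent with the pinned schema version.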

Versioning and Schema Evolution

Canonical version: src/crv/core/versioning.py
SchemaVersion (frozen dataclass)
SCHEMA_V: current = (0, 1, "2025-09-20")
Helpers: is_compatible(ver), is_successor_of(candidate, current)

Policy:

Major: breaking changes; Minor: additive, non-breaking changes.
During a feature sprint, keep SCHEMA_V unchanged until the completed feature is approved.
When bumping:
Update SCHEMA_V.
Update descriptors/models/tests/docs in the same change.
Use is_successor_of to validate sequential bumps.
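
One possible reading of the bump policy, sketched with the standard library only. The field names (major, minor, date) and the exact rules inside is_compatible and is_successor_of are assumptions: here "compatible" means same major version, and "successor" means exactly one minor step or one major step with minor reset to 0. The real helpers in versioning.py may define these differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaVersion:
    major: int
    minor: int
    date: str  # ISO date string, e.g. "2025-09-20"


SCHEMA_V = SchemaVersion(0, 1, "2025-09-20")


def is_compatible(ver: SchemaVersion, current: SchemaVersion = SCHEMA_V) -> bool:
    # Assumption: minor bumps are additive, so only the major version must match.
    return ver.major == current.major


def is_successor_of(candidate: SchemaVersion, current: SchemaVersion) -> bool:
    # Assumption: a sequential bump is +1 minor, or +1 major with minor reset to 0.
    next_minor = candidate.major == current.major and candidate.minor == current.minor + 1
    next_major = candidate.major == current.major + 1 and candidate.minor == 0
    return next_minor or next_major
```

Under these assumptions, moving from (0, 1) to (0, 2) or to (1, 0) is a valid sequential bump, while skipping to (0, 3) is not.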

Hashing and Serde (Canonical JSON)

hashing.json_dumps_canonical(obj):
sort_keys=True, separators=(",", ":"), ensure_ascii=False
hashing.hash_row(row), hash_context(ctx_json), hash_state(agent_state): SHA-256 hex digest over canonical JSON.
serde.json_loads(s: str): thin wrapper around the stdlib json module. serde also re-exports json_dumps_canonical so that there is a single canonicalization policy.
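
A minimal sketch of the canonicalization and hashing described above, using the json.dumps arguments listed in this section. Encoding the canonical string as UTF-8 before hashing is an assumption of the sketch, and hash_row here stands in for the real helper rather than reproducing it.

```python
import hashlib
import json
from typing import Any


def json_dumps_canonical(obj: Any) -> str:
    """Serialize with sorted keys and no extra whitespace, keeping non-ASCII as-is."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def hash_row(row: dict[str, Any]) -> str:
    """SHA-256 hex digest over the canonical JSON form of a row."""
    payload = json_dumps_canonical(row).encode("utf-8")  # UTF-8 is an assumption
    return hashlib.sha256(payload).hexdigest()
```

Because keys are sorted before serialization, two dicts with the same content but different insertion order produce the same digest.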

IDs, Typing, and Constants

ids.py: RunId, AgentId, TokenId, VenueId, SignatureId; make_run_id(prefix="run") -> RunId yields <prefix>_[0-9a-f]{6} with lower_snake prefix enforcement.
typing.py: Tick, GroupId, RoomId, JsonDict.
constants.py (consumed by crv.io):
TICK_BUCKET_SIZE = 100
ROW_GROUP_SIZE = 128 * 1024
COMPRESSION = "zstd"
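
For illustration, a run ID of the documented shape <prefix>_[0-9a-f]{6} can be produced as follows. This is a sketch under assumptions: the use of secrets.token_hex as the randomness source, the lower_snake pattern, and raising ValueError on a bad prefix are choices made for this example, not confirmed details of ids.py.

```python
import re
import secrets
from typing import NewType

RunId = NewType("RunId", str)

# Assumed lower_snake pattern for the prefix check.
_LOWER_SNAKE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def make_run_id(prefix: str = "run") -> RunId:
    """Return '<prefix>_' followed by six lowercase hex characters."""
    if not _LOWER_SNAKE.match(prefix):
        raise ValueError(f"prefix must be lower_snake: {prefix!r}")
    # token_hex(3) yields 3 random bytes rendered as 6 hex characters.
    return RunId(f"{prefix}_{secrets.token_hex(3)}")
```

RunId is a NewType over str, so it costs nothing at runtime but lets a type checker distinguish run IDs from arbitrary strings.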

Errors

Domain-specific exceptions in errors.py:

GrammarError for grammar/naming violations (e.g., not lower_snake).
SchemaError for schema-level validation failures (ranges, cross-field constraints).
VersionMismatch for schema version incompatibilities.

Design Notes

Identity edges unified:
All representation edges persist to a single identity_edges table distinguished by edge_kind (RepresentationEdgeKind) with combination rules enforced by validators.
Downstream readers should filter by edge_kind to reconstruct specific edge families (self_to_object, agent_to_agent, etc.).
Concept-doc cross-ref: some design docs list separate o2o_edges and o2o_obj tables; in core these are represented within identity_edges via edge_kind ∈ {agent_to_agent,agent_pair_to_object}. Downstream may materialize split views if desired.
IO alignment:
All descriptors include bucket and partitioning=["bucket"]; IO layers compute/populate bucket from tick using TICK_BUCKET_SIZE.
Compression defaults to "zstd"; adjust only via spec update.
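
The bucket computation that IO layers perform is plain integer division of tick by TICK_BUCKET_SIZE. The sketch below redefines the constant locally so it runs on its own; the function names and the rejection of negative ticks are assumptions of the example.

```python
from typing import Any

TICK_BUCKET_SIZE = 100  # mirrors the value in constants.py


def bucket_for_tick(tick: int) -> int:
    """Partition key for a tick: ticks 0-99 map to 0, 100-199 to 1, and so on."""
    if tick < 0:
        raise ValueError("tick must be non-negative")  # assumption of this sketch
    return tick // TICK_BUCKET_SIZE


def with_bucket(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the row with the bucket column populated from tick."""
    return {**row, "bucket": bucket_for_tick(row["tick"])}
```

Computing bucket at write time keeps core free of IO concerns while still letting every descriptor declare partitioning=["bucket"].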

Tests

Core tests (see tests/core/):

Grammar naming: all enum .value are lower_snake.
IdentityEdgeRow combination matrix (positive/negative cases).
ExchangeRow normalization; visibility normalization for MessageRow/ScenarioRow.
Table descriptor contract (columns lower_snake; required/nullable; partitioning; version pinned).
Decision schemas (ActionCandidate normalization; DecisionHead defaults).
Hashing/serde stability (order-insensitive canonical dumps; hash equality).

How to run locally:

uv run ruff check .
uv run mypy --strict
uv run pytest -q

All core tests pass on CI (pytest).