Telemetry signals

TokenShift emits five typed JSON records over the wire. Every record is hybrid-encrypted client-side, validated against an explicit allowlist, and stored on PointFive’s side keyed by signal name and tenant. This page gives a one-paragraph orientation for each signal; the full field tables, fail-closed rules, and tier mapping live in Data contract.

#	Signal	Min tier	Fires when
1	`tool_invocation`	1	Claude makes a tool call.
2	`recovery_retrieved`	1	A developer pulls back the original output that was compressed away.
3	`session_summary`	1	A Claude session ends.
4	`compression_sample`	4	All three sampling gates pass (off by default). Carries a sampled, redacted before/after pair for rule tuning.
5	`client_state`	1	Periodically (≤ once / 24h per machine) — version + install metadata.
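
The Min tier column can be read as a simple emission gate. The sketch below is illustrative only: it assumes tiers are cumulative (a tenant at tier N may emit any signal whose minimum tier is at most N) and that unknown signal names are rejected, in keeping with the fail-closed rules mentioned above. The names `MIN_TIER` and `signal_allowed` are hypothetical and not part of TokenShift.

```python
# Minimum tier per signal, copied from the table above.
MIN_TIER = {
    "tool_invocation": 1,
    "recovery_retrieved": 1,
    "session_summary": 1,
    "compression_sample": 4,
    "client_state": 1,
}


def signal_allowed(signal: str, tenant_tier: int) -> bool:
    """Return True if a tenant at `tenant_tier` may emit `signal`.

    Assumption: tiers are cumulative. Unknown signals fail closed.
    """
    min_tier = MIN_TIER.get(signal)
    if min_tier is None:
        return False  # not on the list: never emit
    return tenant_tier >= min_tier
```

Note that for `compression_sample` the tier check is only one of three gates; passing it alone is not enough for a sample to be emitted.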

`tool_invocation`

The atomic record: one per Claude tool call, regardless of whether TokenShift compressed it, skipped it, fell back, or never matched a rule at all. Carries the tool kind, the model, the activity bucket, the project, a stable redactor-derived command_hash, the original token count, the outcome enum (unmatched / skipped / compressed / fallback / bypassed), and, at Tier 2+, the redacted command_shape and rewritten_command_shape. This is the foundation: every other TokenShift question (and most Claude-usage questions) derives from this signal. Full field table: Data contract §1.
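
As a rough illustration of the shape described above, the following sketch models the record and the Tier 2+ gating of the two shape fields. Only `command_hash`, `command_shape`, `rewritten_command_shape`, and the five outcome values come from this page; every other field name, the `to_wire` helper, and the choice to omit gated fields (rather than null them) are assumptions. The authoritative field table is Data contract §1.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    UNMATCHED = "unmatched"
    SKIPPED = "skipped"
    COMPRESSED = "compressed"
    FALLBACK = "fallback"
    BYPASSED = "bypassed"


@dataclass
class ToolInvocation:
    tool_kind: str          # hypothetical field name
    model: str              # hypothetical field name
    activity_bucket: str    # hypothetical field name
    project: str            # hypothetical field name
    command_hash: str
    original_tokens: int    # hypothetical field name
    outcome: Outcome
    command_shape: Optional[str] = None            # Tier 2+ only
    rewritten_command_shape: Optional[str] = None  # Tier 2+ only


def to_wire(record: ToolInvocation, tier: int) -> dict:
    """Build the payload dict, dropping Tier 2+ fields below Tier 2."""
    payload = asdict(record)
    payload["outcome"] = record.outcome.value
    if tier < 2:
        # Assumption: gated fields are omitted entirely, not sent as null.
        payload.pop("command_shape")
        payload.pop("rewritten_command_shape")
    return payload
```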

`recovery_retrieved`

Fires when a developer types the recovery command to pull back the original (uncompressed) output that TokenShift had compressed away. Strong signal that compression destroyed context the developer actually needed; per-rule rollups become a compression-quality score. Carries only the join key (recovery_id → tool_invocation), the recovering session, and an age_seconds bounded by the cache TTL — no content, no file paths. Full field table: Data contract §2.
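
A minimal sketch of how such a record might be assembled, assuming the client knows when the original output was cached. The TTL value, the `session_id` field name, and the decision to drop (rather than clamp) an out-of-range age are all assumptions made for illustration; only `recovery_id` and `age_seconds` are named on this page.

```python
from typing import Optional

CACHE_TTL_SECONDS = 3600  # hypothetical value; the real TTL is not stated here


def build_recovery_retrieved(
    recovery_id: str,
    session_id: str,
    cached_at: float,
    now: float,
) -> Optional[dict]:
    """Return a recovery_retrieved payload, or None if the age is invalid.

    The payload carries only the join key, the recovering session, and
    the bounded age: no content and no file paths.
    """
    age_seconds = int(now - cached_at)
    if age_seconds < 0 or age_seconds > CACHE_TTL_SECONDS:
        return None  # assumption: out-of-range ages fail closed
    return {
        "signal": "recovery_retrieved",
        "recovery_id": recovery_id,
        "session_id": session_id,
        "age_seconds": age_seconds,
    }
```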

`session_summary`

One best-effort record per Claude session, fired at session end. Carries the things you can’t reconstruct from tool_invocation alone — end_reason (clean / timeout / crash / interrupted), turn_count, context_tokens_at_end and context_tokens_max (how close to the model context limit the conversation got) — plus a small set of denormalized aggregates (total tokens in, total tokens saved, per-outcome counts, recovery count) for fast single-session dashboards. Reserved fields exist for future per-turn detail. Full field table: Data contract §3.
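
The denormalized aggregates are, in principle, derivable from the session's `tool_invocation` records. The sketch below shows one way that rollup could look. It is not the actual implementation: the input and output key names (`original_tokens`, `tokens_saved`, `total_tokens_in`, and so on) are hypothetical stand-ins for whatever Data contract §3 defines.

```python
from collections import Counter


def summarize_session(invocations: list[dict], recovery_count: int) -> dict:
    """Roll a session's tool_invocation records up into summary aggregates."""
    outcome_counts = Counter(inv["outcome"] for inv in invocations)
    return {
        "total_tokens_in": sum(inv["original_tokens"] for inv in invocations),
        # Assumption: records that were not compressed carry no savings.
        "total_tokens_saved": sum(inv.get("tokens_saved", 0) for inv in invocations),
        "outcome_counts": dict(outcome_counts),
        "recovery_count": recovery_count,
    }
```

Fields such as end_reason and context_tokens_max are deliberately absent from this rollup: as noted above, they cannot be reconstructed from `tool_invocation` alone.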

`compression_sample`

A redacted before/after pair captured for PointFive rule tuning. The only signal that carries actual output content (in redacted form) and the most tightly gated record in the contract: off by default, and three independent gates must all be true for any sample to leave the process — tenant has enabled Tier 4 in their manifest, the matched rule is on the per-rule sample allowlist, and a random draw lands inside the configured sampling percentage. Never surfaced on customer dashboards. Carries no user_id (defense in depth). Backend retention is a hard cap of 7 days. Full field table: Data contract §4.
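
The three gates can be sketched as a single predicate in which any failing gate blocks the sample. Function and parameter names here are hypothetical, and the random source is injectable purely so the behavior can be checked deterministically.

```python
import random
from typing import Callable


def should_sample(
    tier4_enabled: bool,
    rule_id: str,
    rule_allowlist: frozenset,
    sample_percent: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Return True only if all three independent gates pass."""
    # Gate 1: the tenant has enabled Tier 4 in their manifest.
    if not tier4_enabled:
        return False
    # Gate 2: the matched rule is on the per-rule sample allowlist.
    if rule_id not in rule_allowlist:
        return False
    # Gate 3: a random draw lands inside the configured percentage.
    # rng() is in [0, 1), so sample_percent=0 never samples.
    return rng() * 100.0 < sample_percent
```

With default arguments left at their most restrictive values (Tier 4 disabled, an empty allowlist, or a zero percentage), this predicate always returns False, which matches the off-by-default behavior described above.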

`client_state`

Periodic meta-telemetry about the binary itself — which version is running where, when it was installed, OS / arch / install method, whether the binary’s internal self-check (redactor regression corpus, allowlist invariants, manifest validity, outbox writability) passed. Rate-limited to at most once per 24 hours per client_id; new installs always emit on first session. Without a control plane this is the only way PointFive and customer admins see fleet hygiene — stale versions, dominant install methods, enrollment age, shadowed system manifests. No PII, no per-tool-call detail. Full field table: Data contract §5.
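
The rate limit reduces to a small check against the last emission time for a given client_id. The sketch below assumes the client persists that timestamp locally and represents "never emitted" as `None`; the function name and storage model are hypothetical.

```python
from typing import Optional

DAY_SECONDS = 24 * 60 * 60


def should_emit_client_state(last_emitted_at: Optional[float], now: float) -> bool:
    """Decide whether a client_state record may be emitted now.

    last_emitted_at is the time of the previous emission for this
    client_id in epoch seconds, or None for a new install.
    """
    if last_emitted_at is None:
        return True  # new installs always emit on first session
    return now - last_emitted_at >= DAY_SECONDS
```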

Where this lands

Each signal is stored separately on PointFive’s side, keyed by signal name and tenant. The transport that gets it there is described in Data contract → Transport.