angzarr_client.readiness¶
Readiness probes and health-status supervisor for runner servers.
Python port of client-rust/main/src/readiness.rs. Audit #68.
A runner exposes its readiness through grpc.health.v1.Health. While
any probe is failing, the per-kind service name reports NOT_SERVING;
once every probe is green, it flips to SERVING. Probes are evaluated
on a fixed cadence (default 30s, override via
ANGZARR_READINESS_PROBE_INTERVAL) with a per-probe timeout (default
2s, override via ANGZARR_READINESS_PROBE_TIMEOUT).
Aggregation is binary — all up is SERVING, anything else is
NOT_SERVING. The health server itself always responds, so liveness
(“the process answers”) and readiness (“it’s safe to send traffic”)
share one wire surface and are distinguished by the response status.
Async architecture (audit #68 design call): the runner runs on
grpc.aio.server so the supervisor lives as an asyncio task on the
same event loop, mirroring Rust’s tokio-async supervisor 1:1. Probes
have async check() -> bool signatures; per-probe timeouts are
enforced via asyncio.wait_for.
Attributes¶
Default cadence for re-evaluating output-domain probes (seconds). |
|
Default per-probe timeout (seconds). |
|
optional async-bus endpoint (Kafka / RabbitMQ / SQS / |
Classes¶
Single readiness probe — evaluated once per supervisor tick. |
|
Side of a |
|
One-shot transport probe — flipped |
|
Per-output-domain coordinator probe — attempts to open a connection |
|
Async-bus reachability probe (audit #74) — covers the path that |
Functions¶
|
Read supervisor cadence + per-probe timeout from env, falling |
|
Poll every probe on each tick, aggregate (all_ok → |
Module Contents¶
- angzarr_client.readiness.DEFAULT_PROBE_INTERVAL = 30.0¶
Default cadence for re-evaluating output-domain probes (seconds).
- angzarr_client.readiness.DEFAULT_PROBE_TIMEOUT = 2.0¶
Default per-probe timeout (seconds).
- angzarr_client.readiness.ENV_INTERVAL = 'ANGZARR_READINESS_PROBE_INTERVAL'¶
- angzarr_client.readiness.ENV_TIMEOUT = 'ANGZARR_READINESS_PROBE_TIMEOUT'¶
- angzarr_client.readiness.ENV_BUS_ENDPOINT = 'ANGZARR_BUS_ENDPOINT'¶
optional async-bus endpoint (Kafka / RabbitMQ / SQS / SNS / NATS / etc.). When set, a single
BusProbecovers reachability of the async path for every async-only saga / PM target. When unset, no bus probe is added — async-only targets are simply not part of readiness.- Type:
Audit #74
- angzarr_client.readiness.probe_config_from_env() tuple[float, float]¶
Read supervisor cadence + per-probe timeout from env, falling back to
DEFAULT_PROBE_INTERVAL/DEFAULT_PROBE_TIMEOUT.Same env-var contract as Rust. Bad values (non-numeric) silently fall back to defaults — matches Rust’s
.parse::<u64>().ok().
- class angzarr_client.readiness.Probe¶
Bases:
abc.ABCSingle readiness probe — evaluated once per supervisor tick.
- class angzarr_client.readiness.TransportSignal¶
Side of a
TransportProbeused by the runner to mark ‘bound and serving’.
- class angzarr_client.readiness.TransportProbe(signal: TransportSignal)¶
Bases:
ProbeOne-shot transport probe — flipped
Trueonce the listener has bound and the server is accepting traffic. From that point its result never changes.- classmethod new() tuple[TransportProbe, TransportSignal]¶
- class angzarr_client.readiness.OutputDomainProbe(domain: str, endpoint: _Endpoint)¶
Bases:
ProbePer-output-domain coordinator probe — attempts to open a connection to the downstream domain’s command-handler coordinator endpoint.
Audit #74: built only for sync output domains (declared via
@saga(sync=True)or@process_manager(sync_targets=[...])). Async-only targets ride the bus; seeBusProbe.- classmethod for_domain(domain: str) OutputDomainProbe¶
- class angzarr_client.readiness.BusProbe(endpoint: _Endpoint)¶
Bases:
ProbeAsync-bus reachability probe (audit #74) — covers the path that async-only saga / PM targets ride. The endpoint is operator-supplied via
ENV_BUS_ENDPOINTand points at whatever broker the deployment uses (Kafka, RabbitMQ, SQS/SNS, NATS, etc.).Connection-only — confirms the broker is reachable, not that publishes will succeed end-to-end. Same contract as
OutputDomainProbefor sync targets.
- async angzarr_client.readiness.run_supervisor(probes: list[Probe], health_servicer, service_names: list[str], interval: float, timeout: float) None¶
Poll every probe on each tick, aggregate (all_ok →
SERVING, elseNOT_SERVING), publish to every service name. Loops until cancelled.health_serviceris an asyncgrpc_health.v1.health.aio.HealthServicer— itsset(service, status)is awaitable.