A BYOC container that works locally may fail in production under concurrent sessions, GPU memory pressure, or ungraceful restarts. Use this checklist to verify container behaviour before registering on mainnet.
GPU memory profiling
Profile your container under the expected concurrent session count:
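# Monitor GPU memory during the load test (run this in a separate terminal)
watch -n 1 nvidia-smi
# Run multiple concurrent sessions against the local orchestrator
for i in $(seq 1 5); do
  curl -X POST http://localhost:8935/live-video-to-video \
    -H 'Content-Type: application/json' \
    -d '{"model_id":"my-model"}' &
done
wait  # block until all background requests have finished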
Measure peak VRAM usage per session and multiply by the expected concurrency. If that total exceeds your GPU’s VRAM, either reduce per-session memory (smaller batch size, lower resolution) or limit the Orchestrator’s maxSessions configuration.
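The arithmetic above can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and the `baseline_mib` term (memory held with zero sessions, such as model weights and the CUDA context) is an assumption added here, not something the checklist specifies.

```python
def max_safe_sessions(total_vram_mib: int, peak_per_session_mib: int,
                      baseline_mib: int = 0) -> int:
    """Return how many sessions fit in VRAM for a measured per-session peak.

    baseline_mib is an assumed fixed cost measured with zero sessions.
    """
    if peak_per_session_mib <= 0:
        raise ValueError("peak_per_session_mib must be positive")
    usable = total_vram_mib - baseline_mib
    return max(0, usable // peak_per_session_mib)


def fits(total_vram_mib: int, peak_per_session_mib: int, sessions: int,
         baseline_mib: int = 0) -> bool:
    """Check whether the expected concurrency stays within the GPU's VRAM."""
    needed = baseline_mib + peak_per_session_mib * sessions
    return needed <= total_vram_mib


# Example with made-up numbers: a 24 GiB GPU, 6000 MiB baseline,
# 3500 MiB measured peak per session.
print(max_safe_sessions(24576, 3500, baseline_mib=6000))
print(fits(24576, 3500, sessions=6, baseline_mib=6000))
```

If `fits` returns False for your expected concurrency, that is the case where reducing per-session memory or capping sessions applies.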
Graceful shutdown
The Orchestrator sends SIGTERM when stopping a container. Handle it:
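import asyncio
import signal

async def shutdown(server):
    # Close active sessions
    await server.close_all_sessions()
    # Flush any buffered output
    await server.flush()

def install_sigterm_handler(loop, server):
    # loop.add_signal_handler runs the callback inside the event loop, so it
    # is safe to schedule a task from it (signal.signal handlers are not).
    # Unix only. Call this once the loop and server exist.
    loop.add_signal_handler(
        signal.SIGTERM,
        lambda: loop.create_task(shutdown(server)),
    )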
A container that does not handle SIGTERM is killed after a timeout (default 10 seconds). Active sessions receive no graceful close and may produce incomplete output.
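Because the container is killed once the timeout expires, cleanup itself should be bounded. The following is a minimal sketch, assuming a `server` object with the `close_all_sessions()` and `flush()` coroutines used above. The function name and the 8-second budget are illustrative choices, picked only to leave some margin under a 10-second timeout.

```python
import asyncio


async def shutdown_with_deadline(server, deadline_s: float = 8.0) -> bool:
    """Run cleanup, but stop waiting before the kill timeout expires.

    Returns True if cleanup finished, False if the deadline was hit.
    """
    async def cleanup():
        await server.close_all_sessions()
        await server.flush()

    try:
        # wait_for cancels cleanup() if it runs past the deadline.
        await asyncio.wait_for(cleanup(), timeout=deadline_s)
        return True
    except asyncio.TimeoutError:
        return False
```

A False return is worth logging: it suggests some sessions did not close cleanly within the allotted time.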
Health check under load
The /health endpoint must return {"status": "ok"} even under full GPU load. If health checks fail, the Orchestrator stops advertising the capability and Gateways route elsewhere.
Common failure: the health check handler shares the GPU inference thread and blocks during heavy processing. Run health checks on a separate thread or async task.
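One way to keep the health check off the inference path is to serve it from its own thread. The sketch below uses only the Python standard library. The port, host, and function name are assumptions for illustration, and it presumes the inference code does not hold the interpreter lock for long stretches.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent health probes out of the logs


def start_health_server(host: str = "0.0.0.0",
                        port: int = 8000) -> ThreadingHTTPServer:
    """Serve /health from a daemon thread, independent of inference work."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_health_server()` once at container start-up, before loading the model, means the endpoint responds even while inference is busy.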
Monitoring
Expose Prometheus metrics from your container for the Orchestrator’s monitoring stack:
| Metric | Description |
|---|---|
| byoc_sessions_active | Current concurrent sessions |
| byoc_frame_latency_ms | Per-frame processing latency histogram |
| byoc_gpu_memory_bytes | Current GPU memory usage |
| byoc_errors_total | Processing errors by type |
The Orchestrator’s Prometheus scraper picks up metrics from containers on the same Docker network.
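To show what a scraper would read, here is a minimal sketch that renders the gauge and counter metrics from the table in the Prometheus text exposition format. It is an illustration, not the container's actual implementation: the function name and the `type` label are assumptions, label values are assumed to need no escaping, and the latency histogram is left out because its bucket layout is not specified here. In practice a Prometheus client library would normally generate this output.

```python
def render_metrics(sessions_active: int, gpu_memory_bytes: int,
                   errors_by_type: dict[str, int]) -> str:
    """Render a subset of the BYOC metrics as Prometheus exposition text."""
    lines = [
        "# HELP byoc_sessions_active Current concurrent sessions",
        "# TYPE byoc_sessions_active gauge",
        f"byoc_sessions_active {sessions_active}",
        "# HELP byoc_gpu_memory_bytes Current GPU memory usage",
        "# TYPE byoc_gpu_memory_bytes gauge",
        f"byoc_gpu_memory_bytes {gpu_memory_bytes}",
        "# HELP byoc_errors_total Processing errors by type",
        "# TYPE byoc_errors_total counter",
    ]
    # One sample per error type, sorted for stable output.
    for error_type, count in sorted(errors_by_type.items()):
        lines.append(f'byoc_errors_total{{type="{error_type}"}} {count}')
    return "\n".join(lines) + "\n"


print(render_metrics(2, 1024, {"oom": 1, "decode": 3}))
```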
The BYOC architecture covers the container interface. The production checklist covers Gateway-side production requirements.
Last modified on June 2, 2026