Telephony automation product with local STT, local LLM, and local TTS.
Ship PBX agents, batch transcription, and live speech APIs from one custom model.
regn.io turns the verified 20.5 ASR bundle into three saleable products: a PBX voice agent, a batch speech API, and a realtime speech API. The positioning is commercial. The runtime stays operator-grade.
Batch Speech API: a clean path for recorded calls, archives, voicemails, and job-based transcription.
Realtime Speech API: WebSocket ingress first, then WebRTC once the streaming gateway is fully productized.
Three products, one runtime story.
The stack is intentionally compact. Sell the PBX outcome, sell the batch inference surface, and extend the same model into live streaming instead of maintaining separate model families.
PBX Voice Agent
AI voice automation for 3CX, SIP, and PBX environments where telephony behavior matters as much as transcription.
- Best fit: reception, routing, callback capture, internal ops.
- Packaging: customer-hosted or operator-managed Linux runtime.
- Status: strongest shipped product path today.
Batch Speech API
Recorded-audio inference for WAV and archived calls with clear profile and device choices.
- Best fit: back-office processing, voicemail, archives, QA pipelines.
- Surface: file-based transcription and job-oriented API wrapping.
- Status: inference-ready, commercial wrapper next.
Realtime Speech API
Live transcription product around the same model with WebSocket ingress first and WebRTC next.
- Best fit: browser capture, operators, live assistants, streaming workflows.
- Posture: GPU-first for lower-latency multi-session work.
- Status: next build target.
Benchmark-backed capacity anchors.
The site should read like a product surface, but the numbers still matter. These are the warm throughput anchors used for deployment planning across the current stack.
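- RTX 5090, Fast (32,929 warm WPM): best throughput posture for premium batch and future live tiers.
- Intel Core Ultra 9 285K, INT8 Fast (8,980 warm WPM): best CPU-only production posture on the verified 285K machine.
- Single core, INT8 (1,935 warm WPM): per-core anchor when you need dense fleet planning instead of single-box estimates.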
| Device | Profile | Warm WPM | Best use |
|---|---|---|---|
| RTX 5090 | Fast | 32,929 | Premium throughput, future live tier, highest-volume batch. |
| RTX 5090 | Balanced | 25,913 | Conservative GPU default for strong production headroom. |
| Intel Core Ultra 9 285K | INT8 Fast | 8,980 | Best CPU-only production posture. |
| Intel Core Ultra 9 285K | FP32 Balanced | 6,712 | Simpler CPU deployments with fewer runtime choices. |
| Single core | INT8 | 1,935 | Per-core planning anchor for denser fleets. |
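For deployment planning, the anchors above can be turned into rough backlog and fleet estimates. The following is a minimal sketch, not part of the regn.io runtime: the function names are hypothetical, the 150 words-per-minute speaking rate is an assumed average for conversational audio, and the per-core estimate assumes throughput scales linearly with core count, which the table does not confirm.

```python
from math import ceil

# Warm words-per-minute anchors copied from the table above.
WARM_WPM = {
    ("RTX 5090", "Fast"): 32_929,
    ("RTX 5090", "Balanced"): 25_913,
    ("Intel Core Ultra 9 285K", "INT8 Fast"): 8_980,
    ("Intel Core Ultra 9 285K", "FP32 Balanced"): 6_712,
    ("Single core", "INT8"): 1_935,
}

# Assumption: average spoken words per minute in the recorded audio.
ASSUMED_SPEECH_WPM = 150


def backlog_minutes(audio_hours, device, profile, speech_wpm=ASSUMED_SPEECH_WPM):
    """Estimate wall-clock minutes to transcribe a backlog on one warm device."""
    words = audio_hours * 60 * speech_wpm
    return words / WARM_WPM[(device, profile)]


def cores_needed(target_wpm):
    """Estimate INT8 cores for a target throughput, assuming linear scaling."""
    return ceil(target_wpm / WARM_WPM[("Single core", "INT8")])


if __name__ == "__main__":
    minutes = backlog_minutes(100, "RTX 5090", "Fast")
    print(f"100 audio hours on RTX 5090 Fast: about {minutes:.1f} minutes")
    print(f"Cores for 10,000 WPM: {cores_needed(10_000)}")
```

Treat the results as upper-bound planning figures: they ignore cold starts, I/O, and queueing, and real multi-core scaling should be measured before sizing a fleet.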
Compact deployment ladder.
Use the smallest posture that still matches the workflow and the concurrency target.
- Standalone: customer-hosted PBX or internal batch processing with local control.
- Managed CPU INT8: efficient hosted transcription without GPU dependency.
- Managed GPU: premium throughput and live-session headroom.
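The "smallest posture that fits" rule can be sketched as a lookup over the benchmark anchors. This is an illustration only: the function name, the 150 words-per-minute speaking rate, and the 2x headroom factor are assumptions, and ordering postures by warm WPM is one plausible reading of "smallest". Throughput alone does not guarantee live-session latency, so a real selection would also need latency measurements.

```python
# Device/profile anchors from the benchmark table, ordered from lowest
# to highest warm WPM (treated here as smallest to largest posture).
POSTURES = [
    ("Intel Core Ultra 9 285K", "FP32 Balanced", 6_712),
    ("Intel Core Ultra 9 285K", "INT8 Fast", 8_980),
    ("RTX 5090", "Balanced", 25_913),
    ("RTX 5090", "Fast", 32_929),
]


def smallest_posture(sessions, speech_wpm=150, headroom=2.0):
    """Return the first (device, profile) whose warm WPM covers the load.

    sessions   -- concurrent audio streams to keep up with
    speech_wpm -- assumed spoken words per minute per stream
    headroom   -- assumed safety multiplier on the required throughput
    Returns None when no single-box posture in the table is large enough.
    """
    required = sessions * speech_wpm * headroom
    for device, profile, wpm in POSTURES:
        if wpm >= required:
            return device, profile
    return None


if __name__ == "__main__":
    for n in (20, 25, 50, 200):
        print(n, "sessions ->", smallest_posture(n))
```

A `None` result means the target exceeds every single-box anchor in the table, which is the point at which multi-node planning takes over.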
Pages aligned to buyer language.
Companies search for PBX AI agents, 3CX AI agents, speech APIs, and realtime transcription APIs.