Custom model / PBX automation / batch + live API surfaces

Ship PBX agents, batch transcription, and live speech APIs from one custom model.

regn.io turns the verified 20.5 ASR bundle into three saleable products: a PBX voice agent, a batch speech API, and a realtime speech API. The positioning is commercial. The runtime stays operator-grade.


3 products
On-prem or managed
CPU, INT8 CPU, and CUDA tiers
Real custom bundle path
api.regn.io wired

PBX Product 3CX / SIP

Telephony automation product that keeps STT, LLM, and TTS local.

Batch Product WAV / file API

Clean path for recorded calls, archives, voicemails, and job-based transcription.

Live Product WS first

WebSocket ingress first, then WebRTC once the streaming gateway is fully productized.

PBX runtime

20.5

Pinned to the real checkpoint bundle path rather than the legacy Whisper path.

Batch throughput

25,913

Balanced GPU WPM on the verified RTX 5090 run.

CPU anchor

6,712

Balanced FP32 WPM on the Intel Core Ultra 9 285K host for plain CPU deployments.

Product split

3 SKUs

PBX agent, batch API, and realtime speech API.

Three products, one runtime story.

The stack is intentionally compact. Sell the PBX outcome, sell the batch inference surface, and extend the same model into live streaming instead of maintaining separate model families.

Product 1

PBX Voice Agent

AI voice automation for 3CX, SIP, and PBX environments where telephony behavior matters as much as transcription.

Best fit: reception, routing, callback capture, internal ops.
Packaging: customer-hosted or operator-managed Linux runtime.
Status: strongest shipped product path today.

Product 2

Batch Speech API

Recorded-audio inference for WAV and archived calls with clear profile and device choices.

Best fit: back-office processing, voicemail, archives, QA pipelines.
Surface: file-based transcription and job-oriented API wrapping.
Status: inference-ready, commercial wrapper next.

Product 3

Realtime Speech API

Live transcription product around the same model with WebSocket ingress first and WebRTC next.

Best fit: browser capture, operators, live assistants, streaming workflows.
Posture: GPU-first for lower-latency multi-session work.
Status: next build target.

Benchmark-backed capacity anchors.

The product story comes first, but the numbers still matter. These are the warm throughput anchors, in words per minute (WPM), used for deployment planning across the current stack.

GPU fast 32,929 WPM

Best throughput posture for premium batch and future live tiers.

CPU INT8 fast 8,980 WPM

Best CPU-only production posture on the verified Intel Core Ultra 9 285K machine.

1 core INT8 1,935 WPM

Per-core anchor when you need dense fleet planning instead of single-box estimates.

Device	Profile	Warm WPM	Best use
RTX 5090	Fast	32,929	Premium throughput, future live tier, highest-volume batch.
RTX 5090	Balanced	25,913	Conservative GPU default for strong production headroom.
Intel Core Ultra 9 285K	INT8 Fast	8,980	Best CPU-only production posture.
Intel Core Ultra 9 285K	FP32 Balanced	6,712	Simpler CPU deployments with fewer runtime choices.
Single core	INT8	1,935	Per-core planning anchor for denser fleets.
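
As a rough sizing sketch, the warm anchors in the table above can be turned into a device count for a batch backlog. The speech rate of 150 words per minute and the 0.7 headroom factor are assumptions for illustration, not figures from the benchmark runs, and the function names are hypothetical.

```python
import math

# Warm throughput anchors copied from the table above (words per minute).
WARM_WPM = {
    ("RTX 5090", "Fast"): 32_929,
    ("RTX 5090", "Balanced"): 25_913,
    ("Intel Core Ultra 9 285K", "INT8 Fast"): 8_980,
    ("Intel Core Ultra 9 285K", "FP32 Balanced"): 6_712,
    ("Single core", "INT8"): 1_935,
}

# Assumption: typical conversational speech rate. Not a number from this page.
ASSUMED_SPEECH_WPM = 150


def realtime_factor(warm_wpm, speech_wpm=ASSUMED_SPEECH_WPM):
    """Minutes of audio transcribed per wall-clock minute at the given anchor."""
    return warm_wpm / speech_wpm


def units_needed(audio_minutes, window_minutes, warm_wpm,
                 speech_wpm=ASSUMED_SPEECH_WPM, headroom=0.7):
    """Devices (or cores) needed to clear a backlog inside a time window.

    headroom is the assumed fraction of warm throughput treated as usable,
    leaving slack for cold starts, I/O, and uneven load.
    """
    required_wpm = audio_minutes * speech_wpm / window_minutes
    usable_wpm = warm_wpm * headroom
    return max(1, math.ceil(required_wpm / usable_wpm))


if __name__ == "__main__":
    # Example: 10,000 minutes of recorded calls to clear within one hour.
    for (device, profile), wpm in WARM_WPM.items():
        n = units_needed(10_000, 60, wpm)
        print(f"{device} / {profile}: {n} unit(s), "
              f"~{realtime_factor(wpm):.0f}x realtime each")
```

These anchors are warm batch throughput figures, so the result is a planning estimate for recorded audio rather than a guarantee of concurrent live sessions.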

Deployment shapes

Compact deployment ladder.

Use the smallest posture that still matches the workflow and the concurrency target.

Standalone: customer-hosted PBX or internal batch processing with local control.
Managed CPU INT8: efficient hosted transcription without GPU dependency.
Managed GPU: premium throughput and live-session headroom.
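
The "smallest posture that fits" rule can be sketched as a short selection function. The mapping of each managed posture to a throughput anchor, the decision to treat only the GPU posture as live-capable, and all names below are assumptions made for illustration, not a published sizing policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    name: str
    warm_wpm: int        # assumed planning anchor, words per minute
    supports_live: bool  # assumption: live sessions are GPU-first


# Ordered from smallest to largest managed posture.
# Anchors assumed: CPU INT8 Fast (8,980) and GPU Balanced (25,913).
MANAGED_LADDER = [
    Posture("Managed CPU INT8", 8_980, supports_live=False),
    Posture("Managed GPU", 25_913, supports_live=True),
]


def pick_posture(required_wpm, needs_local_control=False, needs_live=False):
    """Return the smallest posture that matches the workflow and the target."""
    if needs_local_control:
        # Customer-hosted PBX or internal batch processing stays on-site.
        return "Standalone"
    for posture in MANAGED_LADDER:
        if needs_live and not posture.supports_live:
            continue
        if posture.warm_wpm >= required_wpm:
            return posture.name
    # Nothing on the ladder covers the target with a single node.
    return MANAGED_LADDER[-1].name + " (scale out)"


if __name__ == "__main__":
    print(pick_posture(5_000))                   # fits on CPU INT8
    print(pick_posture(5_000, needs_live=True))  # live work goes to GPU
    print(pick_posture(50_000))                  # exceeds one GPU node
```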

Search surfaces

Pages aligned to buyer language.

Companies search for PBX AI agents, 3CX AI agents, speech APIs, and realtime transcription APIs.

PBX AI Agent: Primary telephony product page.
3CX AI Agent: Integration-focused SEO page for 3CX buyers.
Speech-to-Text API: Batch transcription product page.
Realtime Speech API: Streaming product direction over WebSocket and WebRTC.

Choose the surface that matches the workflow: PBX outcome, raw file inference, or live streaming ingress.

PBX voice agent for telephony workflows
Batch API for WAV and recorded audio
Realtime API for live transcription sessions
