Senior Voice AI Engineer

Job Details

Job Description

Function description

Design and implement streaming pipelines: audio ingest, VAD/endpointing, STT, orchestration/LLM, streaming TTS.
Own turn-taking: barge-in, interruptions, endpointing tuning, silence handling, latency/accuracy tradeoffs.
Integrate telephony and app channels (SIP/WebRTC/CPaaS).
Implement reliability mechanisms: retries, backpressure, rate limiting, and fallbacks between open-source and vendor components.

In addition, AI Dev Engineers contribute by:

Working with the Data Scientists to define and develop the target solution with production constraints in mind. This makes it possible to select the correct run infrastructure and serving model (e.g. data ingestion scheme, API synchronicity, …) to address the business requirements (real-time responses, processing volumes, …).
Contributing to the automation of the different elements of the ML pipeline in order to integrate and deploy them in the production environment (e.g. building Docker/VM images, preparing unit, regression and integration tests, …).
Supporting Data Scientists in using the existing industrial solutions available to build and monitor AI services (i.e. the CI/CD tools).
Supporting IT Production with the parameterization of the target environment.
Ensuring that the model runs without errors, is retrained if needed (incl. automatically) and is monitored both from the IT and the business perspective.

Required Experience

At least 4 years of relevant experience

Technical Experience

Mandatory

Strong Python; one systems language (Go/Rust/C++) preferred.
Experience with streaming audio + WebSockets/gRPC.
Familiarity with Whisper/NeMo/wav2vec2 or similar ASR stacks and neural TTS stacks.
Practical telephony experience (PSTN/IVR) and/or WebRTC.
4+ years engineering, with 2+ years shipping streaming speech in production.
Real-time STT + TTS + turn-taking/barge-in experience.
Telephony: SIP/WebRTC or CPaaS integration, codecs (μ-law/A-law), and the constraints of 8 kHz audio.
Can run evaluations: WER by cohort, latency p95, conversation KPIs.
Comfortable integrating open-source STT/TTS and vendor APIs like Gradium.
Containerization / virtualisation
AI platforms & IDEs
CI/CD (GitLab CI)
Code, model & data versioning
Usage of package management tools and experience in dependency management
PostgreSQL

Preferred

Knowledge of speaker diarization and echo cancellation constraints.
Experience in regulated environments (banking/insurance/health).
Experience building semantic VAD or endpointing models.
Experience with integration across different technologies (distributed/mainframe) and infrastructure components.

Business Experience

Mandatory

Soft Skills

Communication skills - oral & written
Ability to deliver/Results driven
Attention to detail/rigor
Creativity & Innovation/Problem Solving
Proactively invests time in continuous learning and knowledge improvement.
Demonstrates awareness of efficiency and efficacy.
Thinks outside the box, beyond existing processes and frameworks.
Works with energy and empowerment to deliver great results and contribute substantially to the company’s success.
Is constructive by being open to change and to others’ opinions, ideas and feedback.