Working with Data Scientists to define and develop the target solution with production constraints in mind. This makes it possible to select the right runtime infrastructure and serving model (e.g. data ingestion scheme, synchronous vs. asynchronous API, …) to meet the business requirements (real-time responses, processing volumes, …).
Contributing to the automation of the different elements of the ML pipeline in order to integrate and deploy them in the production environment (e.g. building Docker/VM images, preparing unit, regression and integration tests, …)
Supporting Data Scientists in using the existing industrialized solutions available to build and monitor AI services (i.e. the CI/CD tools)
Supporting IT Production in configuring the target environment
Ensuring that the model runs without errors, is retrained when needed (including automatically) and is monitored from both the IT and the business perspective.
Strong Python + one systems language (Go/Rust/C++) preferred.
Experience with streaming audio + WebSockets/gRPC.
Familiarity with Whisper/NeMo/wav2vec2 or similar ASR stacks and neural TTS stacks.
Practical telephony experience (PSTN/IVR) and/or WebRTC.
4+ years engineering, with 2+ years shipping streaming speech in production.
Real-time STT + TTS + turn-taking/barge-in experience.
Telephony: SIP/WebRTC or CPaaS integration, codecs (μ-law/A-law), the realities of 8 kHz narrowband audio.
Can run evaluations: WER by cohort, latency p95, conversation KPIs.
Comfortable integrating open-source STT/TTS and vendor APIs like Gradium.
Containerization / virtualization
AI platforms & IDEs
CI/CD (GitLab CI)
Code, model & data versioning
Package management tools and dependency management
PostgreSQL