Romain Dillet
@romaindillet 2 hours ago
While you’d be hard-pressed to find any startup not brimming with confidence over the disruptive idea they’re chasing, it’s not often you come across a young company as calmly convinced it’s engineering the future as Dasha AI.
The team is building a platform for designing human-like voice interactions to automate business processes. Put simply, it’s using AI to make machine voices a whole lot less robotic.
“What we definitely know is this will definitely happen,” says CEO and co-founder Vladislav Chernyshov. “Sooner or later the conversational AI/voice AI will replace people everywhere where the technology will allow. And it’s better for us to be the first mover than the last in this field.”
“In 2018 in the U.S. alone there were 30 million people doing some kind of repetitive tasks over the phone. We can automate these jobs now or we are going to be able to automate them in two years,” he goes on. “If you multiply it with Europe and the massive call centers in India, Pakistan and the Philippines you will probably have something like close to 120 million people worldwide… and they are all subject for disruption, potentially.”
The New York-based startup has been operating in relative stealth up to now. But it’s breaking cover to talk to TechCrunch, announcing a $2 million seed round led by RTP Ventures and RTP Global, an early-stage investor that’s backed the likes of Datadog and RingCentral. RTP’s venture arm, also based in NY, writes on its website that it prefers engineer-founded companies that “solve big problems with technology.” “We like technology, not gimmicks,” the fund warns with added emphasis.
Dasha’s core tech right now includes what Chernyshov describes as “a human-level, voice-first conversation modelling engine”; a hybrid text-to-speech engine which he says enables it to model speech disfluencies (aka the ums and ahs, pitch changes etc. that characterize human chatter); plus “a fast and accurate” real-time voice activity detection algorithm which detects speech in less than 100 milliseconds, meaning the AI can turn-take and handle interruptions in the conversation flow. The platform can also detect a caller’s gender — a feature that can be useful for healthcare use cases, for example.
Another component Chernyshov flags is “an end-to-end pipeline for semi-supervised learning” — so it can retrain the models in real time “and fix mistakes as they go” — until Dasha hits the claimed “human-level” conversational capability for each business process niche. (To be clear, the AI cannot adapt its speech to an interlocutor in real time — as human speakers naturally shift their accents closer to bridge any dialect gap — but Chernyshov suggests it’s on the roadmap.)
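The semi-supervised loop Chernyshov describes can be pictured abstractly as pseudo-labeling: the model is retrained on its own high-confidence predictions, while low-confidence cases are set aside (say, for human review). The sketch below is a generic illustration of that pattern, not Dasha’s actual pipeline; all names and thresholds here are invented for the example.

```python
# Illustrative pseudo-labeling loop -- not Dasha's actual pipeline.
# The model absorbs its own confident predictions as new training data,
# while uncertain examples are deferred (e.g. to a human reviewer).

def retrain_loop(model, labeled, unlabeled, confidence=0.9, rounds=3):
    """`model` must expose fit(pairs) and predict(x) -> (label, score)."""
    for _ in range(rounds):
        model.fit(labeled)
        still_unlabeled = []
        for x in unlabeled:
            label, score = model.predict(x)
            if score >= confidence:
                labeled.append((x, label))   # trust the confident prediction
            else:
                still_unlabeled.append(x)    # defer this case
        unlabeled = still_unlabeled
    return model, unlabeled
```

Run continuously, a loop of this shape is how a system can “fix mistakes as it goes”: each pass folds newly confident cases into the training set and shrinks the pool the model can’t yet handle.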