“Adversarial Attacks on Modern Speech-to-Text”
Max Little, Language Log, January 30, 2018
For many commercial STT and associated user-centric applications this is mostly a curiosity. If I can order pizza and nearly always get it right in one take through Siri, I don't really see the problem here, even if it is obviously highly brittle. …
Nonetheless, I think this brittleness does have consequences. There will be critical uses for which this technology simply can't work. Specialised dictionaries may exist (e.g. clinical terminology) for which it may be almost impossible to obtain sufficient training data to make it useful. Poorly represented minority accents may cause it to fail. Stroke survivors and those with voice or speech impairments may be unable to use it. And there are attacks … in which a device is hacked remotely.