Implement Continuous Speech Recognition on Android
Say the word to trigger speech recognition inside your application
Voice recognition has gained a lot of traction over the past few years. When building an app where you feel speech recognition would boost your user experience, you can either:
- Integrate the SpeechRecognizer API.
- Leverage Google Assistant.
Implementing SpeechRecognizer in your Android application is straightforward. I’ll provide a detailed implementation later in the article.
However, we want continuous voice recognition. Unfortunately, the API doesn’t provide a mechanism to trigger voice recognition using a keyword. Popular voice assistants all rely on this pattern, whether it’s “Ok Google” for Google Assistant, “Hey Siri” for iOS, or “Alexa” for Amazon devices.
For that, the second option looks like it should fit our needs. Sadly, Google Assistant remains a closed API and doesn’t offer many possibilities. It provides App Actions, but you won’t achieve continuous voice recognition with them.
I was excited when I first came across VoiceInteractionService. It seemed to do what I wanted with the AlwaysOnHotwordDetector. Unfortunately, it’s tightly bound to Google Assistant by letting you integrate a custom assistant.
So far, we can use SpeechRecognizer, but we still need a user interaction to trigger speech recognition.
Activate Speech Recognition on Hot Keyword
We want to break this process into several steps:
- Activate speech recognition.
- Listen for the hot keyword.
- On keyword detected, listen to the user’s voice.
- On words caught, yield result.
- Deactivate speech recognition.
We leverage SpeechRecognizer by separating hot keyword detection from actual speech recognition.
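To make the separation concrete, here is a minimal sketch of the steps above as a small state machine in plain Java. It has no Android dependency, and HotwordGate, Phase, and onTranscript are hypothetical names chosen for illustration, not part of the SpeechRecognizer API. It assumes each recognition session ends with one text transcript and only decides what to do with that transcript.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of the Android SDK: it only tracks which
// phase of the flow we are in and decides what to do with each transcript.
final class HotwordGate {

    enum Phase { INACTIVE, AWAITING_HOTWORD, AWAITING_COMMAND }

    private final String hotword;
    private Phase phase = Phase.INACTIVE;

    HotwordGate(String hotword) {
        this.hotword = hotword.toLowerCase(Locale.ROOT);
    }

    // Step 1: activate speech recognition and start waiting for the hot keyword.
    void activate() {
        phase = Phase.AWAITING_HOTWORD;
    }

    // Step 5: deactivate; transcripts received afterwards are ignored.
    void deactivate() {
        phase = Phase.INACTIVE;
    }

    Phase phase() {
        return phase;
    }

    // Feed one recognized transcript. Returns the user's words only when
    // they follow a previously detected hot keyword.
    Optional<String> onTranscript(String transcript) {
        String text = transcript.trim();
        switch (phase) {
            case AWAITING_HOTWORD:
                // Steps 2-3: once the keyword is heard, switch to command mode.
                if (text.toLowerCase(Locale.ROOT).contains(hotword)) {
                    phase = Phase.AWAITING_COMMAND;
                }
                return Optional.empty();
            case AWAITING_COMMAND:
                // Step 4: yield the result (if any), then go back to keyword detection.
                phase = Phase.AWAITING_HOTWORD;
                return text.isEmpty() ? Optional.empty() : Optional.of(text);
            default:
                return Optional.empty();
        }
    }
}
```

In an Android app, the transcript would presumably come from the first entry of the list stored under SpeechRecognizer.RESULTS_RECOGNITION in the results bundle, and the recognizer would be restarted after each session to keep listening. Keeping this decision logic outside the listener is one possible design choice that makes it easy to unit test.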
Basic speech recognition setup
If you’re targeting SDK 23 or above, you must request the RECORD_AUDIO permission at runtime inside your application.