After the initial install of Home Assistant, I've been eager to get some basic voice recognition working. One of my early goals was for it to be "offline", meaning it doesn't use Amazon or Google.
Hardware
- Raspberry Pi 3 model B
- Running Raspbian Stretch
- A USB microphone
I was originally working with the HD-3000, but wasn't very happy with the recording quality. I'm still experimenting with the ReSpeaker, but it definitely seems better. In any case, configuration was pretty similar, and the same likely goes for any other USB microphone.
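Before touching any audio configuration, a quick way to confirm the Pi sees the USB devices at all is `lsusb` (part of `usbutils`, which Raspbian ships by default). No output shown here since it depends entirely on what's plugged in:

# Both the microphone and the USB speaker should appear on the bus
lsusb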
Basic ALSA Audio
First, we need to get audio working: both a microphone and a speaker.
Good, concise documentation that explains what’s going on with Raspberry Pi/Debian audio has eluded me thus far. Most of this is extracted from random forum posts, Stack Overflow, and a smattering of trial and error.
You can record from a microphone with `arecord`. Abridged `arecord --help` output:
Usage: arecord [OPTION]... [FILE]...
-l, --list-devices list all soundcards and digital audio devices
-L, --list-pcms list device names
-D, --device=NAME select PCM by name
-t, --file-type TYPE file type (voc, wav, raw or au)
-c, --channels=# channels
-f, --format=FORMAT sample format (case insensitive)
-r, --rate=# sample rate
-d, --duration=# interrupt after # seconds
-v, --verbose show PCM structure and setup (accumulative)
List the various devices with `arecord -l`:
**** List of CAPTURE Hardware Devices ****
card 1: Dummy [Dummy], device 0: Dummy PCM [Dummy PCM]
<SNIP>
card 2: ArrayUAC10 [ReSpeaker 4 Mic Array (UAC1.0)], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
Then, `arecord -L`:
null
Discard all samples (playback) or generate zero samples (capture)
default
<Bunch of CARD=Dummy>
sysdefault:CARD=ArrayUAC10
ReSpeaker 4 Mic Array (UAC1.0), USB Audio
Default Audio Device
<Bunch of CARD=ArrayUAC10 speakers/output>
dmix:CARD=ArrayUAC10,DEV=0
ReSpeaker 4 Mic Array (UAC1.0), USB Audio
Direct sample mixing device
dsnoop:CARD=ArrayUAC10,DEV=0
ReSpeaker 4 Mic Array (UAC1.0), USB Audio
Direct sample snooping device
hw:CARD=ArrayUAC10,DEV=0
ReSpeaker 4 Mic Array (UAC1.0), USB Audio
Direct hardware device without any conversions
plughw:CARD=ArrayUAC10,DEV=0
ReSpeaker 4 Mic Array (UAC1.0), USB Audio
Hardware device with all software conversions
To record using the ReSpeaker (card `ArrayUAC10`):
# `-d 3` records for 3 seconds (otherwise `Ctrl+c` to stop)
# `-D` sets the PCM device
arecord -d 3 -D hw:ArrayUAC10 tmp_file.wav
It may output:
Recording WAVE 'tmp_file.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono
arecord: set_params:1299: Sample format non available
Available formats:
- S16_LE
Like `arecord -L` says, `hw:` is "Direct hardware device without any conversions". We either need to record in a supported format, or use `plughw:` ("Hardware device with all software conversions"). Either of these works:
arecord -d 3 -D plughw:ArrayUAC10 tmp_file.wav
# `-f S16_LE` signed 16-bit little endian
# `-c 6` six channels
# `-r 16000` 16kHz
arecord -f S16_LE -c 6 -r 16000 -d 3 -D hw:ArrayUAC10 tmp_file.wav
You can get a list of supported parameters with `arecord --dump-hw-params -D hw:ArrayUAC10`:
HW Params of device "hw:ArrayUAC10":
--------------------
ACCESS: MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT: S16_LE
SUBFORMAT: STD
SAMPLE_BITS: 16
FRAME_BITS: 96
CHANNELS: 6
RATE: 16000
PERIOD_TIME: [1000 2730625]
PERIOD_SIZE: [16 43690]
PERIOD_BYTES: [192 524280]
PERIODS: [2 1024]
BUFFER_TIME: [2000 5461313)
BUFFER_SIZE: [32 87381]
BUFFER_BYTES: [384 1048572]
TICK_TIME: ALL
--------------------
In online resources you'll see values similar to `hw:2,0`, which means "card 2, device 0". Looking at the `arecord -l` output, it's the same as `hw:ArrayUAC10` since the ReSpeaker only has the one device.
You can play the recorded audio with `aplay`. Looking at the output from `aplay -L`, I can:
aplay -D plughw:SoundLink tmp_file.wav
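If there's no recording handy, `speaker-test` (also part of alsa-utils) can generate sound on the same playback device; `SoundLink` here is just the card name from my `aplay -L` output:

# `-c 2` two channels, `-t wav` plays voice samples instead of pink noise; Ctrl+C to stop
speaker-test -D plughw:SoundLink -c 2 -t wav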
If you're not hearing anything, also check `alsamixer` to make sure the output is not muted and the volume isn't 0.
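`alsamixer -c <card>` opens the mixer for a specific card, and `amixer` can do the same non-interactively. The control name varies per device (it might be `PCM`, `Speaker`, etc.), so list them first; `PCM` below is just a placeholder:

alsamixer -c 2                     # interactive mixer for card 2 (F6 switches cards)
amixer -c 2 scontrols              # list the mixer controls this card exposes
amixer -c 2 sset PCM 80% unmute    # then raise/unmute whichever control it actually has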
There are at least two configuration files that can affect the behaviour of `arecord`/`aplay`:
- `/etc/asound.conf`
- `~/.asoundrc`
For example, after changing the default sound card via Audio Device Settings, my `~/.asoundrc` contains:
pcm.!default {
    type hw
    card 2
}
ctl.!default {
    type hw
    card 2
}
If I check `aplay -l`, "card 2" is my Bose Revolve SoundLink USB speaker:
**** List of PLAYBACK Hardware Devices ****
card 0: ALSA [bcm2835 ALSA], device 0: bcm2835 ALSA [bcm2835 ALSA]
<SNIP>
card 0: ALSA [bcm2835 ALSA], device 1: bcm2835 IEC958/HDMI [bcm2835 IEC958/HDMI]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 0: ALSA [bcm2835 ALSA], device 2: bcm2835 IEC958/HDMI1 [bcm2835 IEC958/HDMI1]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: Dummy [Dummy], device 0: Dummy PCM [Dummy PCM]
<SNIP>
card 2: SoundLink [Bose Revolve SoundLink], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
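Card indices can shuffle between boots when multiple USB audio devices are attached, so it can be more robust to reference cards by name. Here's a minimal sketch of a hand-written `~/.asoundrc` (my own variation, not what the Audio Device Settings dialog generates) that splits default playback and capture between the two devices:

# Default playback on the Bose, default capture on the ReSpeaker
pcm.!default {
    type asym
    playback.pcm "plughw:SoundLink"
    capture.pcm "plughw:ArrayUAC10"
}
ctl.!default {
    type hw
    card SoundLink
}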
Voice Recognition with Snips
At first I was thrilled to find Snips:
- Completely offline (once you create and download the assistant)
- Good integration with Home Assistant
- Simple to install and configure
But after initial (successful) experimentation, there are a few largish problems:
- May be issues on Debian Buster
- Dubious support for devices other than Raspberry Pi
- Worst of all, post-acquisition they’re killing the “console” web-app you need to make assistants
Oops. Hopefully it will return in another form.
Install
Following the manual setup instructions:
sudo apt-get install -y dirmngr
sudo bash -c 'echo "deb https://raspbian.snips.ai/$(lsb_release -cs) stable main" > /etc/apt/sources.list.d/snips.list'
sudo apt-key adv --fetch-keys https://raspbian.snips.ai/531DD1A7B702B14D.pub
sudo apt-get update
sudo apt-get install -y snips-platform-voice
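To confirm the platform packages actually landed (not part of the official instructions, just a sanity check):

# List the installed Snips packages and their versions
dpkg -l 'snips-*'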
Create an Assistant
An "assistant" defines what voice commands Snips handles. You need to create one via the (soon-to-be-shut-down) Snips Console:
- Click Add an App
- Click + of apps of interest
- Click Add Apps button
- Wait for training to complete
- Click Deploy Assistant button
- Download and install manually
pc> scp ~/Downloads/assistant_proj_XYZ.zip pi@pi3.local:~
pc> ssh pi@pi3.local
sudo rm -rf /usr/share/snips/assistant/
sudo unzip ~/assistant_proj_1mE9N2ylKWa.zip -d /usr/share/snips/
sudo systemctl restart 'snips-*'
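If Snips doesn't behave after the restart, it's worth checking the zip unpacked where the services expect it; per the commands above, the assistant should end up directly under `/usr/share/snips/assistant/`:

# Should show the contents of the deployed assistant
ls /usr/share/snips/assistant/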
At this point Snips should be working. If triggered with the wake word (default is `hey snips`), it should send "intents" over MQTT.
Verification/Troubleshooting
Check all services are green and `active (running)`:
sudo systemctl status 'snips-*'
Initially, the Snips Audio Server was unable to start. Check output in syslog:
tail -f /var/log/syslog
It was unable to open the “default” audio capture device:
Dec 5 07:22:25 pi3 snips-audio-server[28216]: INFO:snips_audio_alsa::capture: Starting ALSA capture on device "default"
Dec 5 07:22:25 pi3 snips-audio-server[28216]: ERROR:snips_audio_server : an error occured in the audio pipeline: Error("snd_pcm_open", Sys(ENOENT))
Dec 5 07:22:25 pi3 snips-audio-server[28216]: -> caused by: ALSA function 'snd_pcm_open' failed with error 'ENOENT: No such file or directory'
We could set the "default" device. Or, `/etc/snips.toml` contains platform configuration where we can specify values from above:
[snips-audio-server]
alsa_capture = "plughw:ArrayUAC10"
alsa_playback = "plughw:SoundLink"
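After editing `/etc/snips.toml`, restart the audio server (or all the Snips services) and watch syslog again to confirm the capture device opens cleanly:

sudo systemctl restart snips-audio-server
tail -f /var/log/syslog | grep snips-audio-server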
snips-watch shows a lot of information:
sudo apt-get install -y snips-watch
snips-watch -vv
I added the weather app to my Snips assistant. So, if I say, “hey snips, what’s the weather?” snips-watch should output:
[15:00:52] [Hotword] detected on site default, for model hey_snips
[15:00:52] [Asr] was asked to stop listening on site default
[15:00:52] [Hotword] was asked to toggle itself 'off' on site default
[15:00:52] [Dialogue] session with id 'e39a4367-e167-467c-912a-e047f49bea7a' was started on site default
[15:00:52] [Asr] was asked to listen on site default
[15:00:54] [Asr] captured text "what 's the weather" in 2.0s with tokens: what[0.950], 's[0.950], the[1.000], weather[1.000]
[15:00:54] [Asr] was asked to stop listening on site default
[15:00:55] [Nlu] was asked to parse input "what 's the weather"
[15:00:55] [Nlu] detected intent searchWeatherForecast with confidence score 1.000 for input "what 's the weather"
[15:00:55] [Dialogue] New intent detected searchWeatherForecast with confidence 1.000
Instead of snips-watch, you can use any MQTT client:
sudo apt-get install -y mosquitto-clients
# Subscribe to all topics
mosquitto_sub -p 1883 -t "#"
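Subscribing to `#` also shows the raw audio frames the Snips audio server streams over MQTT, which gets noisy fast. To watch only recognized intents, narrow the topic (`-v` prints the topic alongside each payload):

mosquitto_sub -p 1883 -t "hermes/intent/#" -v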
Home Assistant and Snips
Now that we know Snips is working, we can integrate it with Home Assistant.
Snips uses MQTT by default, and Hass has optional MQTT integration. You can either:
- Have Hass use Snips' broker
  - The Hass documentation incorrectly says the Snips broker is running on port 9898. Currently the default is 1883, but consult `/etc/snips.toml` (or check the listening port directly, as in the sketch after this list).
- Have Snips use Hass' broker
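A quick way to confirm which port the broker bundled with Snips is actually listening on (a generic check, not something from the Snips docs):

# Expect mosquitto listening on 1883 (or whatever snips.toml says)
sudo ss -tlnp | grep 1883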
Since we did everything from scratch, Hass doesn't have a broker. So, we should point Hass at the instance that got installed with Snips. In `configuration.yaml`:
# Enable snips (VERY IMPORTANT)
snips:

# Setup MQTT
mqtt:
  broker: 127.0.0.1
  port: 1883
Restart Hass and from the UI pick ☰ > Developer Tools > MQTT > Listen to a Topic, enter `hermes/intent/#` (all Snips intents), then Start Listening.
Now say "hey snips, what's the weather" and you should see a message for the `searchWeatherForecast` intent pop up.
To test TTS, in Developer Tools > Services try the `snips.say` service with data `text: hello` and Call Service. You should be greeted by a robo-voice from the speaker.
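The same robo-voice can be triggered without Hass by publishing straight to the Hermes TTS topic. The topic and payload fields here (`text`, `siteId`) are my reading of the Hermes protocol rather than something verified against this exact Snips version, so treat it as a sketch:

mosquitto_pub -p 1883 -t "hermes/tts/say" -m '{"text": "hello from MQTT", "siteId": "default"}'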
Let's try a basic intent script triggered by the intent. In `configuration.yaml`:
intent_script:
  searchWeatherForecast:
    speech:
      text: 'Hello intent'
    action:
      - service: system_log.write
        data_template:
          message: 'Hello intent'
          level: warning
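Before restarting, it doesn't hurt to validate the configuration; assuming the `hass` executable is on the PATH of the Home Assistant user (and using its default config directory), something like:

# Reports YAML/config errors without starting Home Assistant
hass --script check_config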
Again, restart Hass and then say "hey snips, what's the weather". Now when Hass receives the intent, the TTS engine will say "hello intent" and output the same to Developer > Logs.
The End?
It's a total bummer that the future of Snips is uncertain, because it was perfect for voice-controlled home automation. But then, that's likely why it was acquired.