We're entering a golden era for small-batch artisanal hardware. Anyone with an idea and a modicum of talent can build hardware and get it shipped around the world at a reasonable price.
Enter "The ReSpeaker" - an open source alternative to Amazon's Echo. It promises ultimate hackability, speech recognition, and IoT control, wrapped in a cheap single-board design.
Here's their pitch:
"ReSpeaker is an open modular voice interface to hack things around you. Let you interact with your home appliances, your plant, your office, your internet-equipped devices or any other things in your daily life, all by your voice."
Seeed (@seeedstudio), August 12, 2016
I've been sent one for free - and I've only had a few days to play with it - here's what I found.
One of the most interesting things about ReSpeaker is how open their specs are. You can download the schematics freely from GitHub.
There's a brilliant range of hardware and open source software on board. Let's take a quick tour.
- The main chip is an AI7688H - it is a Linux (3.18.23) based WiFi SoC, and it runs OpenWRT 15.05!
- LPC11U35 coprocessor which controls the 12 RGB LEDs and the GPIO ports.
- Audio is handled by the WM8960 chip.
- A single microphone (more on that later).
- Mono speaker header - up to 1W.
- MicroUSB for power and/or data. Power can be wired directly onto the circuit if you know what you're doing.
- Standard TRRS headphone jack - although I wasn't able to get the microphone working with it.
- MicroSD card slot. There's only ~6MB of free space on the file system (yes, MegaBytes!) - although /tmp has ~60MB available (wiped on reboot).
- Three I/O buttons, which you can see on the reverse of the board:
- Plug into a Micro USB power supply. I'm not sure what the minimum power requirements are, so I played it safe and used my 2.4amp multi-USB charger.
- Connect to the WiFi network generated by the device.
- Set the root password (yup, you get root without having to faff around).
- Select the (2.4GHz) network you want to connect to.
You can, if you want to, directly configure OpenWRT, although it is a little tricky on a phone.
Only takes a couple of minutes and you now have a little Linux computer attached to your network. Go ahead and plug some headphones in, or connect it to your HiFi system.
You've two different ways to connect to the device.
If you've connected the ReSpeaker to your laptop via USB, it will present as a serial device which you can connect directly to.
screen /dev/ttyACM0 57600
If you're connected to the same WiFi network, ssh is open on port 22 as per usual. Use the root password that you set earlier and off you go!
This is why we're here, right? To be able to say "Play 'Flight of the Valkyries'" and have music suddenly appear.
For complex tasks, everything has to go to a recognition service in the cloud. Quite how you feel about an always on Linux box sending everything you say to Microsoft depends on your levels of paranoia.
My first attempt wasn't overly successful:
Tried to use bing's voice recognition service.
It didn't understand much of my English, but apparently 我的中文是很好！ pic.twitter.com/JYbv9jtddy
— Terence Eden ⏻ (@edent) August 13, 2016
Why was everything being recognised as Chinese? Ah! Because that's what the default language was set to! Quickly changing it to en-GB made things work slightly better.
There are a couple of core problems with this approach. Firstly, the little microphone in the middle of the board isn't of particularly high quality.
ReSpeaker has an optional add-on board - the ReSpeaker Microphone Array promises multiple microphones and improved sound detection.
Secondly, the general slowness of the board - and the round trip delay of Bing's processing - make for an unacceptable delay.
Finally, Microsoft's free tier offers just 5,000 transactions per month. Depending on how often you're planning on using it - that could be an annoying restriction.
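To get a feel for that limit, here's a rough back-of-the-envelope sketch. It assumes a 30-day month, one transaction per spoken command, and a device that's only used during 16 waking hours a day - all three are illustrative assumptions, not Microsoft's published accounting.

```python
# Rough budget for a 5,000-transactions-per-month free tier.
# Assumptions (for illustration only): a 30-day month, one transaction
# per spoken command, and 16 waking hours of use per day.
MONTHLY_QUOTA = 5000
DAYS_PER_MONTH = 30
WAKING_HOURS_PER_DAY = 16

per_day = MONTHLY_QUOTA // DAYS_PER_MONTH
per_waking_hour = per_day / WAKING_HOURS_PER_DAY

print("Commands per day: %d" % per_day)
print("Commands per waking hour: %.1f" % per_waking_hour)
```

That works out at roughly 166 commands a day. Plenty for occasional use, but a chatty household (or a buggy script stuck in a loop) could burn through it.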
Is that all we're limited to? No!
ReSpeaker have provided a way to recognise speech locally using PocketSphinx. You'll need an SD card and to git clone a bunch of stuff, but it is pretty easy to get set up. Here it is in action.
OK, Bing lets it down - but the local keyword recognition is pretty cool! That makes it possible to hack together simple "OK Google / Hey Siri / Yo Alexa" styles of home automation.
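As a minimal sketch of that idea, here's how recognised text could be mapped to actions. It deliberately doesn't call PocketSphinx itself - it assumes you already have the recognised phrase as a string. The wake word, the phrases, and the action names are all hypothetical examples, not anything shipped with the ReSpeaker.

```python
# Map locally recognised keywords to actions.
# The wake word, phrases, and action names are hypothetical examples.
ACTIONS = {
    "lights on": "lights_on",
    "lights off": "lights_off",
    "play music": "play_music",
}

WAKE_WORD = "respeaker"  # assumed wake word, chosen for illustration


def dispatch(transcript):
    """Return the action name for a recognised phrase, or None."""
    words = transcript.lower().strip()
    if not words.startswith(WAKE_WORD):
        return None  # ignore speech that isn't addressed to the device
    # Drop the wake word plus any separating spaces or commas.
    command = words[len(WAKE_WORD):].strip(" ,")
    return ACTIONS.get(command)


if __name__ == "__main__":
    for heard in ["ReSpeaker, lights on", "lights on", "respeaker make tea"]:
        print(heard, "->", dispatch(heard))
```

Keeping the phrase list tiny suits keyword spotting: the fewer things the recogniser has to tell apart, the fewer false triggers you're likely to get.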
There's a reasonably good tutorial on how to use the Arduino software on the ReSpeaker GitHub.
I've not used Arduino like this before, and it was relatively simple to set up. There's currently no documentation on how to handle the LEDs, buttons, and GPIO from Python.
Airplay / DLNA / Audio Quality
I wasn't able to use the device as an Airplay or DLNA receiver. None of the apps I tried could see it as a server, and port 5000 didn't appear to be open. This looks like something that needs to be installed separately. Personally, I'd have preferred to see Bluetooth rather than DLNA, but I'm strange like that!
Audio quality was average. With nothing playing, there was a hiss on the line. No ground hum - just gentle white noise.
Make it speak!
By default, the venerable espeak is included.
From the command line, you can run something like:
espeak -ven+m2 -s150 "I've just picked up a fault in the AE35 unit. Shall I proceed?"
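If you'd rather trigger speech from a script, the same call can be wrapped in Python. This is only a sketch: the function names are mine, and it assumes espeak is on the PATH, as it is on the ReSpeaker.

```python
import subprocess


def espeak_args(text, voice="en+m2", speed=150):
    """Build the argument list for the espeak command line used above."""
    return ["espeak", "-v" + voice, "-s" + str(speed), text]


def say(text):
    """Speak `text` aloud; assumes the espeak binary is installed."""
    # Passing a list (no shell) means quotes and apostrophes need no escaping.
    subprocess.call(espeak_args(text))
```

Calling say("Shall I proceed?") should then behave like the command above, without having to worry about shell quoting.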
espeak's voice quality isn't brilliant. I used gTTS, which is a simple Python interface to Google's Text to Speech API. Very handy for creating stock phrases to use repeatedly.
pip install gTTS
gtts-cli.py "I'm sorry Dave, I can't do that." -l 'en-uk' -o sorry.mp3
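For stock phrases, it makes sense to generate each MP3 once and reuse it. Here's a hedged sketch of that using the gTTS Python class rather than the command line tool. The helper names and the "phrases" directory are my own inventions, and I've used the plain "en" language code as an assumption - the accepted language codes differ between gTTS versions.

```python
import hashlib
import os


def phrase_path(phrase, directory="phrases"):
    """Return a stable MP3 filename for a phrase, based on its hash."""
    digest = hashlib.sha1(phrase.encode("utf-8")).hexdigest()[:12]
    return os.path.join(directory, digest + ".mp3")


def ensure_phrase(phrase, directory="phrases", lang="en"):
    """Generate the MP3 once; later calls reuse the cached file."""
    path = phrase_path(phrase, directory)
    if not os.path.exists(path):
        # Imported lazily: only needed (along with network access)
        # when the phrase hasn't been generated before.
        from gtts import gTTS
        if not os.path.isdir(directory):
            os.makedirs(directory)
        gTTS(text=phrase, lang=lang).save(path)
    return path
```

Caching like this means the slow round trip to Google only happens the first time a phrase is needed. After that, it's just a file on the SD card.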
Some highlights that I found.
- The device presents as 2341:0036 Arduino SA - which is an Arduino Leonardo. Basically, an ATmega32U4.
- git 2.3.5 is installed.
- Python 2.7.9 is installed, along with curl 7.40.
- node is only v0.12.7.
Basically a pretty comprehensive set of utilities.
Ports are open on 22 (SSH), 53 (DNS), 80 (webserver), and 445 (SAMBA file sharing).
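If you want to check this on your own unit, a few lines of Python will do it. This is a minimal sketch which only tests TCP, so it won't tell you anything about UDP services. The IP address in the usage note is a hypothetical placeholder.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` accepting TCP connections on `host`."""
    found = []
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            # connect_ex returns 0 on success instead of raising.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
        except socket.error:
            pass  # e.g. host not found; treat the port as closed
        finally:
            sock.close()
    return found
```

Something like open_ports("192.168.1.50", [22, 53, 80, 445, 5000]) - substituting your ReSpeaker's real address - would show which of those ports answer.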
This is modular hardware, with loads of expansion ports. You can get an idea of what people are able to make on the ReSpeaker site.
The ReSpeaker Kickstarter is now live! Early birds get the ReSpeaker for US$39 - which is a pretty compelling price.
The usual Kickstarter caveats apply - but Seeed Studio have released several hardware products so they have a good understanding of what it takes to manufacture and ship hardware.
This is a fun little unit - but it is not without its flaws.
- Audio in is only via the central microphone - the quality is acceptable, but it is a shame that there's no audio-in from the TRRS headphone jack.
- Bing isn't very good at voice recognition - local recognition is OK though.
- Only 2.4GHz WiFi. Small and cheap devices like this don't tend to have 5GHz WiFi. They don't need the speed, but with the 2.4GHz band getting more crowded in urban environments, it's a pity it wasn't included.
- The CPU is, understandably, a bit slow. Running pip install tweepy took a few minutes. Obviously you're not going to be doing much intensive computation on this thing, but it makes setting things up a little tiresome.
- No digital out or surround sound support. Given that the Raspberry Pi Zero has HDMI out for audio, it's a little disappointing to see the ReSpeaker stuck with stereo.
- Lack of high-fidelity playback. As above - if you're happy listening to MP3s via a 3.5mm jack, this is great. If you want it to play back your lossless 96kHz FLACs, you're out of luck.
Really, the challenger to this unit is something like the Raspberry Pi. You need the £30 Pi 3 model B in order to get a Pi with WiFi - the other models require USB dongles. The Pi family don't have microphones - so that's another add-on. The ReSpeaker is a slim, circular package with built in RGB LEDs and buttons - whereas the Pi is fairly chunky and devoid of extras.
My friend Sam Machin has got the Amazon Echo service running on a US$10 CHIP computer and also running on a Raspberry Pi - I think the ReSpeaker fits in nicely to this class of product.
Overall, the ReSpeaker works well, looks good, has great hackability, and is at a pretty good price point.
As I said, I received the ReSpeaker for free. If you'd like me to review your technology product, please get in touch.