Amazon Alexa and Solar Panels
I can now query my solar panels via my Alexa Amazon Dot Echo thingie (why so many names?).
I flatter myself as a reasonably competent techie and programmer, but fuck me, AWS Lambdas and Alexa skills are a right pile of shite! Sorry if that sounds a bit harsh, but they're a pain in the arse to get anything done with.
I wanted something simple. When I say "Solar Panels", call this API, then say this phrase. That's the kind of thing which should take 5 minutes in something like IFTTT. Instead, it took around two hours of following out-of-date official tutorials, and whinging on Twitter, before I got my basic service up and running.
A quick bit of preparatory searching on Alta Vista 2.0 and I'd got incredibly frustrated.
I ended up following this "easy" 30-step guide to develop a basic skill.
It's not so bad, but it does reveal Amazon's contempt for developers. Several of the steps contained errors, and the whole thing involves multiple logins, random clicks, and a bunch of copying and pasting. Dull and complex.
A frustrating and ultimately unsatisfying experience. I ended up using StackOverflow to correct errors in my code because the documentation was so woefully lacking.
The Code
The Python is convoluted, but manageable. When it hears the trigger phrase it opens a JSON API, extracts a result, then speaks it. It's mostly scaffolding. This is based on the example code. I've removed the comments.
Python 3
from __future__ import print_function

import json

import requests  # third-party HTTP library


# --------------- Helpers that build all of the responses ----------------------
def build_speechlet_response(title, output, reprompt_text, should_end_session):
    # Build the speech, companion-app card and reprompt parts of the response.
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': "SessionSpeechlet - " + title,
            'content': "SessionSpeechlet - " + output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session
    }


def build_response(session_attributes, speechlet_response):
    # Wrap the speechlet in the envelope that Alexa expects back.
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }


# --------------- Functions that control the skill's behavior ------------------
def get_welcome_response():
    # Call the solar panel API and pull the current output out of the JSON.
    API_url = 'https://example.com/'
    response = requests.get(url=API_url)
    data = json.loads(response.text)
    watts = data['Body']['Data']['PAC']['Values']['Result']

    session_attributes = {}
    card_title = "Welcome"
    speech_output = "Your Solar Panels are generating " + str(watts) + " watts right now."
    reprompt_text = ""
    should_end_session = True
    return build_response(session_attributes, build_speechlet_response(
        card_title, speech_output, reprompt_text, should_end_session))


def handle_session_end_request():
    card_title = "Session Ended"
    speech_output = "May your day be sunny and bright! "
    should_end_session = True
    return build_response({}, build_speechlet_response(
        card_title, speech_output, None, should_end_session))


# --------------- Events ------------------
def on_session_started(session_started_request, session):
    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_welcome_response()


def on_intent(intent_request, session):
    print("on_intent requestId=" + intent_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    intent_name = intent_request['intent']['name']
    # Stop/cancel ends the session; any other intent reads out the panel output.
    if intent_name in ("AMAZON.CancelIntent", "AMAZON.StopIntent"):
        return handle_session_end_request()
    return get_welcome_response()


def on_session_ended(session_ended_request, session):
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])


# --------------- Main handler ------------------
def lambda_handler(event, context):
    print("event.session.application.applicationId=" +
          event['session']['application']['applicationId'])

    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])

    # Route the request to the matching event function.
    if event['request']['type'] == "LaunchRequest":
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == "IntentRequest":
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == "SessionEndedRequest":
        return on_session_ended(event['request'], event['session'])
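The only part of all that which does any real work is digging one number out of the JSON and turning it into a sentence. A minimal sketch of just that step, which can be run locally without Lambda or a network connection, might look like the following. The helper names `extract_watts` and `build_speech`, the fallback sentence, and the sample payload value are all hypothetical; only the key path and the spoken phrase come from the code above.

```python
def extract_watts(payload):
    """Walk the nested key path from the skill code; return None if any key is missing."""
    node = payload
    for key in ("Body", "Data", "PAC", "Values", "Result"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def build_speech(watts):
    """Turn the reading into the phrase Alexa speaks, with a fallback (an assumption)."""
    if watts is None:
        return "Sorry, I couldn't read your solar panels right now."
    return "Your Solar Panels are generating " + str(watts) + " watts right now."


# Made-up payload in the shape the skill code expects; 1234 is an invented value.
sample_payload = {"Body": {"Data": {"PAC": {"Values": {"Result": 1234}}}}}

print(build_speech(extract_watts(sample_payload)))
print(build_speech(extract_watts({"Body": {}})))
```

Checking for missing keys means a changed or failed API response produces a spoken apology rather than a KeyError inside the Lambda.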
This is not AI
I kinda thought that Amazon would hear "solar panels" and work out the rest of the query using fancy neural network magic. Nothing could be further from the truth. The developer has to manually code every single possible permutation of the phrase that they expect to hear.
This isn't AI. Voice interfaces are the command line. But you don't get tab-to-complete.
Amazon allow you to test your code by typing rather than speaking. I spent a frustrating 10 minutes trying to work out why my example code didn't work. Want to know why? I was typing "favourite" rather than the American spelling. Big Data my shiny metal arse.
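Since every permutation has to be spelled out by hand, one way to take some of the tedium out is to generate the list. This is only a sketch: it assumes the sample utterances are supplied as one line per phrase, each starting with the intent name, and the intent name `FavouriteColourIntent` and the word lists are invented for illustration.

```python
from itertools import product


def expand_utterances(intent_name, parts):
    """Return one '<intent name> <phrase>' line per combination of the alternatives."""
    lines = []
    for combo in product(*parts):
        # Skip empty alternatives so optional words don't leave double spaces.
        phrase = " ".join(word for word in combo if word)
        lines.append(intent_name + " " + phrase)
    return lines


# Each inner list holds the alternatives for one position in the phrase,
# including both the British and American spellings.
parts = [
    ["what is", "tell me"],
    ["my", "the"],
    ["favourite", "favorite"],
    ["colour", "color"],
]

utterances = expand_utterances("FavouriteColourIntent", parts)
for line in utterances:
    print(line)
```

Four positions with two alternatives each already gives 16 lines to paste in, which shows how quickly "every possible permutation" gets out of hand.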
Why not IFTTT?
So, there is an official If-This-Then-That channel for Alexa.
But like most IFTTT services, it isn't well supported by the company. It works with a few blessed services, but you can't bring in your own APIs, nor define your own responses.
It is barely tested and has all sorts of weird restrictions.
Here's a tip, gang, if your service can't cope with upper-case characters that means it isn't ready to release to the public.
The founder of SaySpring recommended their easy-to-use product.
Sadly, it's only available to US customers.
Driving me dotty
I reluctantly got a Dot because I thought it would be a nifty way to control my new Internet connected light switches.
Most Alexa skills require you to have the sort of lifestyle where you are regularly desperate to know what the weather is like at a specific airport. Or have a life which is intimately tied to the range of Amazon-only services.
Taking a look through what developers have released, it's an obvious conclusion that most developers have better things to do than spend time battling with Amazon's inadequate developer experience.
Oh, and there's the requisite "fart apps" and other high quality services.
The future may be voice interfaces - but Amazon aren't leading the way there.
Nelson says:
I did the same thing with my solar panels. It took forever to figure it all out. I just wanted it to read me back a value from an API call, that's it.
What I really wanted to do was ask it, what is the solar production so far for today, what is the current output right now, what has the output been for this month, etc.
I hacked a hello world example to hard-code a response for the first question, then gave up on being fancy and gave up trying to do the rest.
Sloan says:
Hey Terence, I wrote an MVC system for skills that might make things a bit easier. Doesn't solve a lot of the pain but it does remove some of the more basic stuff. Check it out at: https://www.npmjs.com/package/skillvc
I'd love to get your opinion on it.
Peter Nann says:
Hi Terence. I share your pain, but I put up with it.
I'd say this is the 'Agile' world we now live in. Ship MVP and iterate based on feedback. Yeah it sucks to be an early adopter, but what's the alternative? Yeah, Amazon could have polished it a lot more, but a lot of that polish might have been wasted if they eventually found out they were madly just polishing the dark, undesirable corners of the proverbial turd...
It will improve... Voice Assistants are going to be huge (as much as Siri/Apple tried to ruin it for us all...). Screens and keyboards are a horrible, crippled interface for a large number of short interactions, just not many people realise it yet...
I've been in Speech Tech for 23 years, and waiting for this era most of that time...
And P.S. - Yes, the term "AI" has been re-purposed (OK, abused!) in recent times. There is often not a lot of 'I' in the 'AI'. It ain't classic, historical 'academic AI', that's for sure. It seems to be widely used now with anything that vaguely mimics a human-to-human interaction. I guess we just gotta get with the program... Although that also means I developed a Voice AI back in 2001 - an OpenMenu system understanding spontaneous query utterances spoken to a bank's up-front IVR.
That's it, "Voice AI in 2001" is going on my CV...
Al says:
Have you considered https://mycroft.ai/ instead?