My own thinking about conversational AI changed recently after a coworker of mine pointed out that most directed, task-oriented conversations only require two or three pieces of carried context. That's a level of depth that doesn't require a massive technical breakthrough; it could plausibly be solved with tremendous effort and current technologies.
It’s true that a good PA brings context from much earlier, as well as “out of band” context, to the table, but a customer support service (e.g. Delta’s) really does not.
With this in mind, I think these conversational agents are going to be much more performant than expected in the fairly near future.
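To make the "two or three pieces of carried context" claim concrete, here's a minimal slot-filling sketch. All the names (intents, slots, the airline-style example) are hypothetical, not any real assistant's API; the point is just that a small dictionary threaded through turns covers a surprising amount of support-style dialog.

```python
# Minimal slot-filling dialog sketch: a directed conversation often needs
# only a handful of carried slots. All names here are made up.

def update_context(context, turn):
    """Merge the slots extracted from one user turn into the carried context."""
    context = dict(context)
    context.update({k: v for k, v in turn.items() if v is not None})
    return context

def respond(context):
    """Ask for whichever required slot is still missing, else fulfill."""
    required = ["intent", "item", "when"]
    missing = [slot for slot in required if slot not in context]
    if missing:
        return f"ask:{missing[0]}"
    return f"fulfill:{context['intent']}:{context['item']}:{context['when']}"

# Three turns, three carried slots -- enough for a Delta-style rebooking flow.
ctx = {}
ctx = update_context(ctx, {"intent": "rebook_flight"})
ctx = update_context(ctx, {"item": "DL1234"})
ctx = update_context(ctx, {"when": "tomorrow"})
```

A real system adds speech recognition and entity extraction in front of this, but the carried state itself really is this small for many customer-support flows.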
Some chatbots have been pretty passable at holding simple conversations for a while. Cleverbot's lineage goes back to the 1980s, and it seems to have been okay at conversations even then. I remember using it in the early 2000s and being fairly impressed. I don't even think it uses any ML currently, and it still gave me pretty decent results when I tried it again just now.
Alexa has more complex requirements (it's funny when Cleverbot gives an occasional nonsensical response, but not so funny when Alexa unexpectedly orders you a $2,000 lawn gnome - okay, still kind of funny), but I definitely agree with you. I don't think this is a challenge that requires anything close to AGI, unless you want to start having truly insightful and philosophical conversations with a computer.
The existing implementations still have a long way to go. Here’s the chat I just had with Cleverbot, the first thing it gave me after going to the site:
> Good afternoon. Do you know any good places to get dinner?
> Mot really.
> What about dessert?
> That's if you're a good bot.
> Not for you, for me.
> Okay..
Completely useless. I asked Alexa to find me the article on Hacker News from today that talks about chatbots; she said “Sorry, I don’t know that one.”
As a random, unscientific test it’s not very useful yet.
I remember reading that the user-facing Cleverbot is really just for training. For competitions, a more robust dataset is used for the conversation and produces better results. [Citation needed]
...dude. The point never had anything to do with being the modern “PA” you’re imagining. It’s a chatbot designed 30+ years ago. It was a neat (and, at the time, groundbreaking) program for having conversations. It’s not intended to go find information for you.
It sounds advanced because it operated in an incredibly restricted context. It's impressive in the way that self-driving cars in the 80s were impressive because they worked at all.
I'm interested in someone's POV on Amazon's skill-forward approach - relying on third parties to create skills that are actually good, and then helping them talk to each other - compared to Google Assistant's approach, which seems to be focused on first-party capabilities as the Actions on Google ecosystem is not so developed.
I think both have their merits, and it surprises me that each doesn’t do more of what the other is doing. First-party skills are a great way to make the platform more integrated and useful for the user, but third-party is a great way to expand the ecosystem into skills they’d never have thought of.
With their moves with Nest I don't think Google's play is to be a platform... But rather they want to own all of that stuff and tell you what to do with it. They want to be an agent and not a platform.
They do, but that move surprises me. You would think they’d want to be a platform so they could collect data they hadn’t even thought about collecting. They could still own the data even as a platform; they would just have to share little chunks of it with different third parties.
Isn't that what Yahoo was trying to do when Google replaced them? Historically, it seems like minimal platforms for content tend to win out over attempts to completely control the user experience.
I think Alexa's conversation style is just awful. I wish it would move toward a more explicit list of items -> descend -> new list of items approach.
The area where Alexa is most likely to show its warts is when you fall for the trap of trying to engage in a normal dialog with it.
Agreed. I also wouldn't mind a lot less verbosity when giving it a command rather than a query. "Turn off the lights" or "set a timer for 15 minutes" could probably be confirmed with an "Ok" or even just a pleasant chirp/beep sound. "Ok, turning lights off" or "You got it, timer set for 15 minutes" is unnecessary.
The chime feature can be enabled, and it's pretty tolerant on input too. These days I just say "Alexa, bedroom off" and Alexa does the right thing. Getting rid of the cue phrase would be nice, though. That is, if intonation, volume, or directionality of my voice could help it determine when I'm giving a command vs. making random conversation.
I’ve noticed that in a lot of cases my Alexa no longer requires me to name the skill when making a request, so it seems to be getting smarter about choosing skills automatically. This seems like a natural next step.
They might not have had to do that in the first place (not that it's necessarily a bad feature) if they hadn't shot themselves in the foot by making skill discoverability exceptionally difficult. The fact that the Alexa mobile app continues to get slower and slower, and was a slow and clunky piece of junk right out of the gate, certainly doesn't help.
Sometimes I worry that there are certain computational tasks which simultaneously require one human mind to encompass a large portion of their structure while that structure is too large to fit into a human mind, or too large to work with on a practical basis. You know, a "kernel of functionality" that can't be compartmentalized between multiple developers but still has to scale to a truly massive size to perform the desired task.
I worry strong AI is such a task. I also worry there's a barrier somewhere in conversational computing that will stop us from achieving much past a certain threshold before the framework becomes unworkable. I often run into painful reminders that our voice assistants are just Chinese rooms when it turns out they're missing basic functionality like "take me to the McDonald's near the Walmart".
Not a fan of Alexa at the moment, but I can totally picture it becoming absolutely massive and everywhere once 5G networks launch and more IoT devices flood the market. Amazon is pushing hard to make it more ubiquitous, and it just seems inevitable. Might be a good time to learn Alexa skill development now.
Alexa needs voice personalization like Siri, and a parental mode. My kids keep turning off my timers and lights. It’d be nice for Alexa to have a mode where it won’t do that unless it’s my voice.
My biggest gripe is that it needs to work better across multiple devices. If I ask it to list my timers, I want it to list my timers (hence voice personalization), irrespective of which room in the house I'm in. But currently timers and alarms are per Alexa device.
It also needs to get better at identifying the closest device, and letting you turn off alarms etc. from any device. It's incredibly annoying to have an alarm go off downstairs when I'm upstairs and have to be close enough to the right device to turn it off.
I get why, to an extent: while solving the distribution might be simple, it's a much harder contextual problem to figure out what a request means across multiple devices. E.g. if someone sets an alarm in one bedroom, it probably shouldn't go off anywhere else, and if you ask to turn off an alarm in one bedroom, it probably shouldn't cancel the one in the other. Voice personalization would at least make that less ambiguous, though they would still need some careful thinking.
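The scoping problem described above becomes clearer with a data model. Here's a hypothetical sketch (not Amazon's actual design) where timers are keyed by voice-identified user and room rather than by device, so any device can list or cancel them without clobbering someone else's alarms.

```python
# Hypothetical model: timers owned by a (user, room) pair at the home level,
# instead of living on individual devices. Names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Timer:
    owner: str      # voice-identified user who set it
    room: str       # room of the device it was set on
    seconds: int

@dataclass
class Home:
    timers: list = field(default_factory=list)

    def set_timer(self, owner, room, seconds):
        self.timers.append(Timer(owner, room, seconds))

    def list_timers(self, owner):
        # Any device in the home can answer "list my timers" for this user.
        return [t for t in self.timers if t.owner == owner]

    def cancel(self, owner, room=None):
        # Cancel only the asking user's timers, optionally scoped to one
        # room, so cancelling upstairs doesn't kill a downstairs alarm
        # belonging to someone else. Returns the number cancelled.
        keep = [t for t in self.timers
                if t.owner != owner or (room is not None and t.room != room)]
        cancelled = len(self.timers) - len(keep)
        self.timers = keep
        return cancelled
```

With ownership attached to the voice-identified user, "turn off my alarm" from any device is unambiguous, which is exactly why personalization makes the cross-device problem tractable.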