Apple and Siri - Two Sides of a Coin

23 October, 2011 05:58AM · 4 minute read

With the iPhone 4S Apple introduced Siri - your own personal digital assistant to help you with common day-to-day tasks through an interactive voice interface. I’ve been playing with Siri for a while now and think it’s safe to say that there are two distinct ways to interact with it. On the one side there’s the useful bit: ask Siri to make an appointment or a meeting, take a note, play a specific song or artist, set a reminder for when you get home or dictate brief text messages (not a comprehensive list). Then, on the other side of the coin, there’s the part that makes it a bit of a toy: ask Siri any ridiculous question you can think of and see what it says back to you.

On the serious side, Siri is quite excellent when compared to Apple’s Voice Control, which debuted with the iPhone 3GS running iOS 3. For example, asking Voice Control to “Play a song by Matchbox Twenty” would not work, but asking “Play a song by Matchbox Two-Zero” would find the right music just fine, whereas Siri gets both right, each time. The word recognition is superior and the pattern matching has obviously been tweaked to handle cases like my example (numbers and combinations of numbers). Siri also accepts “I want to listen to Matchbox Twenty”, where the previous Voice Control would draw a blank. The true leap forward, however, is the break from keyword-based instructional commands to flowing conversation with context. This is most clear when making a reminder, for example: Me: “Remind me to wake up” Siri: ‘When would you like me to remind you?’ Me: “6 AM” Siri: ‘Okay I can add this to your reminders. Shall I go ahead?’ Me: “Move it to 7 AM” Siri: ‘Here’s your reminder for tomorrow at 7AM. Shall I create it?’ Me: “OK” Siri: ‘Okay, I’ll remind you.’

Siri’s responses varied a little as I worked through the reminder example above - sometimes Siri responds with ‘…I can add this…’ and other times with ‘Here’s your reminder…’ - which tries to maintain the illusion that you’re not talking to a machine by switching things up. Of course Apple needn’t have cared; it would have been just as useful with the same responses every time. But Apple clearly wanted to take this another step and not only allow the user to speak a multitude of requests, but also have Siri respond in a multitude of ways. Inevitably, however, there are only so many ways in which Siri can respond. Given time and careful attention to what you say and how Siri responds, the keen observer at least will see clearly that it’s just a machine programmed by people, and that the “AI” label is perhaps stretching things. As an engineer, I look at Artificial Intelligence as neural networks combined with some fuzzy logic and machines that truly can learn. Siri certainly learns basic things, like who your partner or children or parents are (you tell Siri once, it remembers), but so far as deductive reasoning goes it’s not as intelligent as it might seem.

That said, it is fun and convenient to use. The fun part is clearly asking Siri all sorts of unusual questions. Below are some examples with Siri’s responses.

[Images: Siri Example 1 through Siri Example 8]

It becomes a pastime as you try more and more phrases, political and science-fiction references, anything that comes to mind - just to see what Siri will respond with. The responses you get have a bit of a slant to them - an edge of impatience even. The programmers were obviously keen to make it feel like a slightly annoyed personal assistant and they’ve generally succeeded if that was their goal. In a brief fit of frustration at one point I did nothing but spout expletives at Siri. Siri’s response: nothing. It simply turned itself off as if I’d hit the microphone button to cancel my input. I suppose this is their way of putting virtual fingers in Siri’s virtual ears.

The software is currently advertised by Apple as being in Beta, so there is no doubt they will add more features, more responses and perhaps more attitude. Even at the moment, it is perhaps the best voice interface to a digital device that I’ve used so far.