How Bixby Differs From Siri, Alexa, Google Assistant.



“Bixby, place an order for the wine bottle I am holding”.

Today, March 29th, 2017, Samsung unveiled the Galaxy S8, an upgrade to its previous Android phones. From a hardware perspective it is quite a good phone; however, history will remember the S8 for Bixby. Bixby is Samsung’s new Voice First system. In some ways it is related to Siri, Alexa and Google Assistant, and in many ways it has bypassed them, at least temporarily.

Samsung is fully committed to Bixby and has assigned a hardware button to activate the system. There will also be a wake word, “Bixby”, and a screen button. We are not seeing the results of Samsung’s Viv acquisition in the current version of Bixby. The system does not use the amazing “Dynamically evolving cognitive architecture system based on third-party developers” patent [3]. However, there is little doubt Bixby is on the path that Viv created.

I first wrote about Samsung’s Voice First ambitions when they acquired Viv [1] and when they acquired Harman [2]. Viv was created by the same team that built Siri for Apple almost a decade ago. Although we are not seeing even a small part of Viv [3], there are already some signs of how Viv is influencing Bixby.

I call these systems Voice First, not Voice Only. The reason is simple: we will ultimately have a Voice OS as the dominant mode of interaction between us and just about any device, machine or computer. Voice will lead the interaction. With Bixby we can see this already in use. Bixby is actually three fundamental systems:

  1. Bixby Voice: The Siri-like and Alexa-like Voice First system that will allow you to do things like set timers and ask for the time and weather. It will also go deep into the functions of an app (with the deliver API) and allow Bixby to do far more complex tasks.
  2. Bixby Vision: An augmented reality lens similar to Google Goggles [4] and Amazon Rekognition [5]. Bixby Vision uses the Lens Tool from Pinterest to power the results. This system will allow for translation of over 52 languages when you point the camera at foreign text. It will also scan QR codes, recognize business cards, and identify monuments, street locations, restaurants and products. The product recognition allows for online purchase of many items you show Bixby.
  3. Bixby Cards: This is similar to the Google Cards, Siri snippets and Alexa cards most people are familiar with. Currently most cards are static. Bixby Cards will be dynamically updated and can be interacted with on a deeper level, including queries via Voice. With a swipe right you will gain access to this system on the new S8.

Fundamentally Bixby offers these three unique properties:

Completeness: When an application becomes Bixby-enabled, Bixby will be able to support almost every task that the application is capable of performing using the conventional interface (i.e., touch commands). Most existing Voice First systems currently support only a few selected tasks for an application and therefore confuse users about what does and does not work by voice command. This completeness property of Bixby will simplify user education on the capability of the system, making it much more predictable.

Context Awareness: When using a Bixby-enabled application, users will be able to call upon Bixby at any time. It will understand the current context and state of the application and will allow users to continue their work in progress without interruption. Bixby will allow users to weave together various modes of interaction, including touch and voice, in any context of the application, whichever they feel is most comfortable and intuitive. Most Voice First systems completely dictate the interaction modality and, when switching among the modes, may either start the entire task over again, losing all the work in progress, or simply not understand the user’s intention.

Cognitive Tolerance: When the number of supported voice commands gets larger, most users are cognitively challenged to remember the exact form of the voice commands. Most Voice First systems require users to state the exact commands in a set of fixed forms. Bixby will understand commands with incomplete information and execute the commanded task to the best of its knowledge, and then will prompt users to provide more information and carry out the task piecemeal. This makes the interface much more natural and intuitive.

All three points are very important aspects of the next two generations of Voice First systems. The ability to understand the contextual state and where a user is in an app is extremely important. This deep linking and continuity will add a powerful layer to many app experiences. Cognitive tolerance is quite important, as generation one systems like Alexa and Siri require strict invocations to activate apps and services. Just these three points put Bixby a step or two ahead of just about everything in the market.
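To make the cognitive tolerance idea concrete, here is a minimal Python sketch of how a system might accept an incomplete command and then ask for whatever is still missing. This is not Bixby’s actual implementation or API; the `REQUIRED_SLOTS` table, the `Task` class and the `next_prompt` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical table: the pieces of information each task needs before it can run.
REQUIRED_SLOTS = {"set_timer": ["duration", "label"]}


@dataclass
class Task:
    intent: str
    slots: dict = field(default_factory=dict)

    def missing(self):
        """List the required slots the user has not supplied yet, in order."""
        return [s for s in REQUIRED_SLOTS[self.intent] if s not in self.slots]


def next_prompt(task):
    """Return a follow-up question, or None when the task has everything it needs."""
    missing = task.missing()
    return f"What {missing[0]} would you like?" if missing else None


# The user said only "set a timer for 10 minutes": the task is accepted as-is,
# and the system asks for the one detail it still lacks.
task = Task("set_timer", {"duration": "10 minutes"})
print(next_prompt(task))

# After the user answers, nothing is missing and the task can be executed.
task.slots["label"] = "pasta"
print(next_prompt(task))
```

The point of the design is that a partial command is never rejected; it becomes a partially filled task that is completed one question at a time.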

 


Bixby Voice Commerce

Bixby is designed to allow Bixby Vision and Bixby Voice to work together. For example, you can hold a bottle of wine that you just sampled at a fine restaurant, aim the S8 camera at the bottle and let the system identify it and present online purchase options. You would complete the order with a simple “Order it”.

Voice Commerce will be as important as, if not more important than, older web commerce and mobile commerce. Voice Commerce will be an overlay on top of all of these systems as a new modality, and Bixby is already lining up with this future, even working with Amazon for some fulfillment in the US market.

Bixby And Apple Siri + Workflow

Bixby was released hot on the heels of the announcement that Apple acquired Workflow. Workflow is an automation app for iOS that lets you create workflows. A workflow is made of a series of actions, executed in a single flow from top to bottom. You press the Play button at the top, and the actions will execute, one after the other. After an action is completed, the workflow automatically jumps to the next one until it reaches the end. Once it’s done, the output of the workflow will be displayed at the bottom of the chain of actions, or communicated visually or using Siri’s voice. The power of this process lies in the ability to arrange actions any way you want, building workflows that solve either basic or complex problems for you.
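The chain-of-actions model described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how the Workflow app is actually built; `run_workflow` and the sample actions are hypothetical, and the sketch assumes each action simply receives the previous action’s output.

```python
def run_workflow(actions, initial=None):
    """Run actions from top to bottom, feeding each one the previous output."""
    result = initial
    for action in actions:
        result = action(result)
    return result


# A toy workflow: produce some text, tidy it, then format it for display.
actions = [
    lambda _: "  bixby vs siri ",   # first action ignores its input and supplies text
    str.strip,                      # remove surrounding whitespace
    str.title,                      # capitalize each word
    lambda s: f"Title: {s}",        # final action builds the displayed result
]

print(run_workflow(actions))
```

Because the workflow is just an ordered list, rearranging, adding or removing actions changes what it does without touching the runner itself.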

With Workflow deeply integrated with iOS, Apple is on its way to building a world class Voice First platform. App builders can make deep connections into their apps and allow for a series of connections to other apps, not only those built into the OS by Apple but also those from other developers, allowing the AI and ML in the new Siri to interconnect very complex actions. The possibilities and the palette of options are truly unlimited with Workflow + Siri.

Bixby is using a similar approach with deep integrations into the OS. The difference at this point is quite profound. Bixby will take more of a developer-based API approach, whereas Apple will of course have APIs for developers to integrate Workflow by the summer WWDC event and will also grant users the power to build Recipes and Scripts that use Siri to interconnect apps, functions and events.

Workflow will become the glue that holds together a number of new Voice First domains for Siri. Bixby could also allow for scripting; however, Samsung may face some issues because it does not fully control Android. By early 2018 just about every Voice First system, including new ones, will have far more powerful Workflow-like features.

If Bixby takes the direct Viv approach, it could actually leapfrog Siri + Workflow with the new self-programming paradigm of the “Dynamically evolving cognitive architecture system” patent that drives Viv’s intent domains [3].

Bixby Is Not Just About Smartphones

Bixby is a new system and there is not currently a developer system, but there will be. There are also a significant number of limitations because it is a new system. This makes Bixby less powerful than Siri, Alexa, Cortana and Google Assistant at this point in some ways. This will change in a big way over the next year. Samsung’s vision for acquiring Viv technology goes far beyond smartphones. The company manufactures many products, from washing machines to X-ray systems. They are looking at removing complex menus and commands and moving to AI-assisted voice interactions as a shortcut to product features.

The Voice First revolution is in full force and Samsung is taking it very seriously. Samsung is saying they expect 40% of phone interactions to be via Bixby over time.  This mirrors what I have stated:

“By 2025 50% of all human computer interactions will be via Voice” [6]

Bixby and the Viv technology we will no doubt see in the future will move this far faster than most experts imagine.

 

_____

[1] VIV was just acquired by Samsung

[2] Samsung’s acquisition of Harman will have us talking to our cars soon

[3] What’s the difference between Siri and Viv

[4] Google Goggles

[5] Amazon Rekognition

[6] There is a revolution ahead and it has a voice

 
