Most of us in San Francisco have seen the robotic-arm-based Cafe-X barista. I was impressed by the precise movements of the robotic arm that served a cappuccino made by an espresso machine using fresh beans. The robot interfaced with humans via an app running on a kiosk or a phone. I was intrigued by how the robotic barista picked up a cup and placed it under an espresso machine, which poured espresso and milk into it. When the espresso machine finished making the drink, the robotic arm picked up the cup, gently jiggled it to mix the contents, and then put it aside for the customer. The robotic arm even thanked the customer with a wave when delivering the drink. What an amazing experience!
The Cafe-X robotic arm can work with multiple espresso machines and has the potential to replace human baristas.
Robotic arms are trained using tedious reward-based reinforcement learning. I briefly mentioned the need for a comprehensive set of training data, and for many iterations before succeeding, in my earlier post, Self-supervised learning gets us closer to autonomous learning.
A dog trainer who rewards a dog for positive behavior helps the dog learn which actions earned the reward. Similarly, a robotic arm learns how to grasp coffee cups and move through its environment without spilling the coffee, based on positive and negative rewards. This reward-based feedback reinforces which actions to perform and which to avoid.
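To make the reward idea concrete, here is a minimal sketch in Python. It is not how Cafe-X or any real arm is trained. It is an epsilon-greedy value update (a bandit-style simplification of reinforcement learning) over three made-up grip choices. The action names, the reward values, and the `train` function are all assumptions for illustration.

```python
import random

# Hypothetical reward signal: +1 for a clean grasp,
# -1 when the cup is dropped or crushed.
REWARDS = {"too_loose": -1.0, "firm": 1.0, "too_tight": -1.0}


def train(episodes=200, epsilon=0.2, learning_rate=0.1, seed=0):
    """Estimate the value of each grip from repeated rewarded trials."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random grip.
            action = rng.choice(list(values))
        else:
            # Exploit: use the grip with the highest estimated value so far.
            action = max(values, key=values.get)
        reward = REWARDS[action]
        # Nudge the estimate toward the reward that was just observed.
        values[action] += learning_rate * (reward - values[action])
    return values


values = train()
best_action = max(values, key=values.get)
print(best_action, values)
```

Grips that earn negative rewards end up with negative estimates, so the greedy choice settles on the grip that was rewarded. That is the "reinforce what to perform, avoid what to skip" loop in miniature. A real arm faces continuous motion, camera input, and delayed rewards, which is where the tedium comes from.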
There is more than just painstaking reinforcement learning involved here, but at a high level, the robotic barista is combining computer vision and robotics to bring a unique experience to customers.

The robotic barista is faster and cheaper than a human barista, but so are vending machines. In both speed and function, the current iteration of the robotic barista is therefore comparable to a vending machine. The robotic-arm-based barista is a cooler, more futuristic-looking novelty.
I started looking into autonomous robotic systems and came across robotic bartenders. Makr Shakr is a robotic bartender that makes cocktails you design on a tablet or phone.
Like the robotic barista, the robotic bartender requires a purpose-built environment to function in, i.e., a bar or cafe built for the robotic arm. Both are focused on replacing or augmenting baristas and bartenders by producing a large number of drinks in little time. Functionally, they aren’t any different from vending machines, other than the fact that they use fresher ingredients. Today, vending machines may even be faster than robotic arms.
Beyond Computer Vision and Robotics
A couple of days ago I saw this tweet from Rachel Thomas, co-founder of fast.ai and professor at the University of San Francisco, quoting UC Berkeley computer vision professor Jitendra Malik: “Vision, language, and robotics can no longer be separate silos. It is increasingly important to understand all three.” This made a lot of sense. Without Natural Language, robotic bartenders are simply vending machines.

Bringing NLP into the mix, instead of relying on a kiosk to communicate, will vastly improve interactions.
Recommendations from vague interactions
Unless robotic baristas or bartenders are able to converse in natural language and make recommendations from vague interactions, humans will continue to have a leg up.
Consider this interaction that feels human.
🤖 “How may I help you?”
👩🏽 “I am interested in a low-calorie and not-too-sweet cocktail.”
🤖 “Do you like gin or vodka?”
👩🏽 “How about white rum?”
🤖 “Any particular brand?”
👩🏽 “There is this Brazilian rum I like. It comes in a green and clear bottle. I forgot the name.”
🤖 “Yes, ‘Leblon’. We have that.”
👩🏽 “Yes, that’s the one. How about that with lime, soda, ice, and very little sugar?”
Of course, this futuristic interaction could also be accomplished by a smart vending machine. Once you have a machine that is capable of natural language conversation and can make recommendations from vague communications, having a robotic arm with a personality makes complete sense. Humanoid baristas and bartenders would be a natural and welcome evolution.
Yann LeCun (Facebook AI Research and New York University), in his Self-supervised learning presentation of Oct 5, 2018, suggested that although deep learning brings us “Useful but stupid chatbots”, we still don’t have “Agile and dexterous robots” and “Smart chatbots”.
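The brand lookup in the dialogue above can be sketched, very crudely, as keyword overlap between what the customer says and tags on each bottle. This is only an illustration: real natural language understanding is far richer than counting shared words. The inventory, the tags, and the `recommend` function are hypothetical, and only “Leblon” and its description come from the conversation.

```python
import re

# Hypothetical inventory. Only "Leblon" and its tags come from the dialogue;
# the other entries are made-up placeholders.
INVENTORY = {
    "Leblon": {"white", "rum", "brazilian", "green", "clear", "bottle"},
    "Hypothetical Gin": {"gin", "london", "dry", "blue", "bottle"},
    "Hypothetical Vodka": {"vodka", "polish", "clear", "bottle"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def recommend(utterance, inventory=INVENTORY):
    """Return the item whose tags overlap most with the utterance, or None."""
    words = tokenize(utterance)
    scores = {name: len(words & tags) for name, tags in inventory.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


request = (
    "There is this Brazilian rum I like. "
    "It comes in a green and clear bottle."
)
print(recommend(request))
```

The customer never names the brand, yet the description shares enough words with one bottle's tags to single it out. A kiosk menu cannot do even this much, which is the gap natural language is meant to close.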


Environment
Robotic baristas and bartenders require specialized environments to function, i.e., a bar or cafe purpose-built for the robotic arm. A truly convincing innovation would be a robotic arm that can work in the same environment where a real barista or bartender works.
Imagine how useful robotic baristas and bartenders would be if they could work at your favorite Starbucks or bar without requiring a specific environment and equipment.

Robots that can work where humans currently work are not so far-fetched. They do not have to be humanoid. For example, the robotic pilot can work in planes without autopilot and relies heavily on computer vision. To me, this is the Robotic Process Automation of Robotics, i.e., the ability to work with legacy systems.
Source: Robotic Copilot – Aurora Flight Sciences
So, to make robotic baristas and bartenders more human-like, we need to bring Natural Language Processing and recommendations to robots. With advancements in NLP, robots will be able to understand us from vague interactions. We should produce robots that can work in the same environments as human beings, and alongside them, without requiring specialized environments.
