RealTimeWeekly | Future of WebRTC (and Bots!)

WebRTC Argentina: The Future of WebRTC


01 Jul WebRTC Argentina: The Future of WebRTC

At WebRTC Argentina we covered a number of topics, including an introduction to WebRTC with coding demonstrations, demos of WebRTC use cases, and design/UX for WebRTC. We concluded the mini-conference with this talk, where we wanted to stretch people’s minds a little bit about WebRTC. WebRTC is not simply about 1-1 video chats or business conferencing tools. WebRTC can be one part of larger technological change in our society, and in this talk we give updates on WebRTC in the short term as well as one vision for how WebRTC and Artificial Intelligence may intersect in the future.

You can see the complete presentation below, and we’ve included the transcript below the video. The talk was given by Arin Sime and David Alfaro, co-founders of, with an additional appearance by Jorge de los Santos, a developer from our firm.

What is next for WebRTC?

Arin Sime (co-founder of, editor of RealTimeWeekly): We’ve talked about WebRTC and how it has all this great P2P aspect to it, but what’s coming next?

In the short term, as we talk about the technical changes, WebRTC is not quite a fully approved standard yet. Many things are still changing.

Short term updates to expect in WebRTC

Microsoft Edge support is coming. Everyone has speculated for a long time about WebRTC support in Safari and iOS native, and it seems that since Apple has posted jobs looking for WebRTC developers to join their core team … it’s leading everyone to believe that support in Safari is coming soon. Whatever soon is, who knows?

The VP9 codec that is coming out will reduce latency and improve the quality of calls.

They are also talking about adding in better error handling on startup, so that when the ICE candidates fail there is more information about why the P2P connection failed.

If you’re interested in more information about these changes to the standards as they are happening, then a great resource to check out is They do blogs and regular webinar series about the standard, it’s very good!

What to expect in the next wave of WebRTC applications

I have a technical background, but personally I’m more of a business person at this point so I’m always thinking about what sort of apps are we going to be building for our clients?

Certainly we see continued interest in 1-1 video chat applications. We expect more to be done in group calls and with media servers to handle larger calls. That is certainly a common request.

We also expect to see more broadcasting with WebRTC, such as the sports celebrity broadcasting tool that we showed you today. Interactive broadcasting is a very popular use case with WebRTC right now.

And then what else can you do with the Data Channel side of WebRTC? Internet of Things is very interesting to me, but there’s not a whole lot happening there yet with IoT and WebRTC.

When building your application, you also need to consider if you are going to build a WebRTC app from scratch, as Germán bravely did for you earlier today, or are you going to use one of the platforms like TokBox or Twilio or XirSys?

Those are all important considerations for you as you build your WebRTC application.

WebRTC is just a printer driver

There’s one really important thing that I want you to remember, and I liked this quote from Al Cook at the Twilio Signal conference a couple of weeks ago: “WebRTC is really just a printer driver.” It is really cool that we can do all this in the browser now, and it’s very interesting to us as developers, but ultimately it’s just a transport layer.

WebRTC is like a driver, and it doesn’t provide any value inherently by itself. It’s all in how you use it in your application and what additional value that provides to users. We shouldn’t just throw in video chat on everything we build just because we can! It has to add some intrinsic value to the customers.

So what we wanted to wrap up with today is something that’s not entirely WebRTC, but I think it is intriguing to consider how video chat can be combined with other technologies that are very trendy right now.

Does anyone want to guess what I’m going to say is an uber trendy topic right now? How about bots?

Bots versus WebRTC

Bots and Artificially Intelligent Bots have been a topic at recent Google and Facebook developer conferences. If you are a WebRTC developer, then you are thinking a lot about how you can add human interaction to your application with WebRTC. But we also have a lot of talk in our field right now about how to add more Bot interaction to our applications.

What’s going to happen? Are we setting up a cultural war between Bots and Humans? Maybe not, but let’s walk through a quick example and see how we could use them together.

Typical interaction with a doctor’s office

Imagine that you have a doctor’s appointment tomorrow. You need to remember when your appointment is, so you will probably call the doctor’s office to find out, and to ask what you need to bring with you. Will there be bloodwork during the appointment, or can you eat breakfast before you go to the office?

Bots and Humans together in Customer Contact scenarios

How could we do that with a bot and a human together? I might call or text (with Twilio’s SMS API) the doctor’s office.

The bot sees my request come in and answers the basic questions, but then it probably needs to hand me off to a human for more advanced questions, and that’s where WebRTC comes back in.

Example architecture of a WebRTC and Bots application

This is not the most elegant architecture diagram I’ve ever drawn, but look at all the technologies that might be involved.

On the patient side, we might contact the doctor’s office over the telephony network, send an SMS text message, or use a chat tool on the website.

We need to take that information and get it to a bot, which means we need some Speech to Text APIs. We need a tool for Natural Language Understanding (NLU) to understand what’s been said and parse it. So we potentially have some WebRTC going on the left-hand side, but then we’re going to take my voice and hand it off to the NLU and some artificial intelligence to figure out what I’m saying.

Then we give that to our actual application logic. There are many things going on here that are not our custom app, but APIs that we use instead. Then our app takes that info from the APIs, gets the data from the patient record, and returns that to the human via the bot.

If the bot needs to connect me to a real human, we can do that on the right via WebRTC or one of these other channels for that.
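The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the architecture from the diagram: the names `transcribe`, `understand`, and `handle_request`, the intent labels, and the in-memory patient record are all hypothetical stand-ins for the Speech to Text, NLU, and patient record services, which in a real system would be external APIs.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for the patient record system.
PATIENT_RECORDS = {
    "patient-1": {"appointment": "tomorrow at 9:00", "fasting_required": True},
}


@dataclass
class Reply:
    text: str
    handoff_to_human: bool = False


def transcribe(channel_input: str) -> str:
    """Placeholder for a Speech to Text API; this sketch assumes text input."""
    return channel_input.strip().lower()


def understand(utterance: str) -> str:
    """Placeholder for an NLU service: maps an utterance to an intent label."""
    words = set(re.findall(r"[a-z']+", utterance))
    if "appointment" in words and "when" in words:
        return "appointment_time"
    if words & {"eat", "breakfast", "bloodwork"}:
        return "fasting"
    return "unknown"


def handle_request(patient_id: str, channel_input: str) -> Reply:
    """Application logic: answer from the patient record or hand off."""
    record = PATIENT_RECORDS[patient_id]
    intent = understand(transcribe(channel_input))
    if intent == "appointment_time":
        return Reply(f"Your appointment is {record['appointment']}.")
    if intent == "fasting":
        if record["fasting_required"]:
            return Reply("Bloodwork is planned, so please do not eat beforehand.")
        return Reply("No bloodwork is planned, so you can eat breakfast.")
    # Anything the bot cannot answer goes to a person, for example over a WebRTC call.
    return Reply("Let me connect you to a person.", handoff_to_human=True)
```

The caller would check `handoff_to_human` on the reply and, when it is set, open whichever human channel is available (a WebRTC call, a phone call, or a chat session).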

Tools you might use in a Bots and WebRTC architecture

There are many tools we can use to do this. On the phone or SMS side, we have Twilio, Voxbone, Telestax, or others. For Natural Language Understanding and processing, we might use IBM Watson. You have probably heard of that AI system from the game show Jeopardy, but it’s now an API you can use. There are also other tools like Clarify and Voice Base that can help you parse out and understand what someone is saying.

And then on the WebRTC side, we’ve got TokBox, Twilio, XirSys and others.

In Chrome, we can do speech recognition. Then we could send that to the Translation API. WebRTC is capturing my voice, takes my Spanish and sends it out for a … mediocre … translation to English. The translation is not perfect, but with more work on the API it’s possible to do better.

This particular demo of taking my voice and translating it is from a website This site is a great resource for WebRTC tutorials written by Muaz Khan, and it’s a great place to go anytime you want to see an example of how to do something with WebRTC.

One last demo is from IBM Watson, where it shows a pizza bot taking my order. It’s doing the Natural Language Understanding part. So for example, the bot asks me “What size do you want?” and I say “the biggest you have!” The NLU knows to map that to the largest size of pizza, an Extra Large.
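To make the idea concrete, here is a deliberately simple sketch of that mapping step. It is not how IBM Watson works internally; the size list, the `SUPERLATIVES` table, and the `resolve_size` function are hypothetical names invented for this illustration, and a real NLU service would handle far more phrasing than a lookup table can.

```python
from typing import Optional

# Menu sizes ordered from smallest to largest (assumed for this example).
SIZES = ["Small", "Medium", "Large", "Extra Large"]

# Relative expressions mapped to a position in the ordered size list.
SUPERLATIVES = {"biggest": -1, "largest": -1, "smallest": 0, "tiniest": 0}


def resolve_size(utterance: str) -> Optional[str]:
    """Map a free-form answer to one of the menu sizes, or None if unclear."""
    text = utterance.lower()
    # Check longer names first so "extra large" is not read as plain "large".
    for size in sorted(SIZES, key=len, reverse=True):
        if size.lower() in text:
            return size
    # Otherwise look for a relative expression such as "the biggest you have".
    for word, index in SUPERLATIVES.items():
        if word in text:
            return SIZES[index]
    return None
```

When `resolve_size` returns `None`, the bot would ask the question again or hand the conversation to a person.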

Context is important in WebRTC applications

All of this is cool, but we really want to build a contextual bot that understands the context of what we’re asking. David Alfaro will explain the next part of our demos.

David Alfaro (co-founder of): We have an internal product that we have been working on for only a short time so far. Our application is Uni, which has two faces: the face to the patient and the face to the doctor.

What we are doing here is giving information to the bot, which is a health assistant. We are taking the English phrase and using SyntaxNet from Google, which they released publicly. On top of that they released Parsey McParseface, which has the ability to really understand the meaning of the phrase in English.

We take that phrase and build a tree structure that captures the actual meaning of the phrase in English.
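As a rough picture of what such a tree can look like, here is a small hand-built example. It is not SyntaxNet output and does not call SyntaxNet; the `Node` class, the sample sentence, and the relation labels (`nsubj`, `dobj`, `det`) are assumptions chosen to resemble a typical dependency parse.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    word: str
    relation: str  # grammatical role relative to the parent node
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one word per line."""
    lines = ["  " * depth + f"{node.word} ({node.relation})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def find_relation(node: Node, relation: str) -> Optional[str]:
    """Depth-first search for the first word with the given relation."""
    if node.relation == relation:
        return node.word
    for child in node.children:
        found = find_relation(child, relation)
        if found is not None:
            return found
    return None


# Hand-written tree for the sentence "I have a headache".
tree = Node("have", "root", [
    Node("I", "nsubj"),
    Node("headache", "dobj", [Node("a", "det")]),
])
```

With the sentence in this form, application code can ask structural questions, such as `find_relation(tree, "dobj")` to pull out what the patient says they have, instead of scanning raw text.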

What you have seen here is that we put in a delay to make it appear that the bot is thinking, which it doesn’t really need of course, but that creates engagement with the user.

I want to emphasize something here, the reason that WebRTC is so successful right now is that it enables real time capabilities for everybody.

The real topic here is not necessarily WebRTC, but real time. So if you can build something that solves real time urgencies, then you are solving real problems.

Now let’s look at the doctor view.

The doctor here learns about the appointment with the new patient, and he’s going to interact with them.

There are 2 or 3 artificial intelligence engines running here. The first one converts speech to text. The second takes that text and converts it into a tree structure that gives meaning to the sentence.

The other part that is very important is that, now that I understand what the patient is telling me, I have to go to the other artificial intelligence and ask: given these phrases, what is the approximate diagnosis for the patient?

Of course what we are doing here is not giving that responsibility to the bot, but we are approximating something that the doctor can then use and refine later, in order to give the final prognosis.

Jorge de los Santos: [In a previous job] We used software to diagnose the most common illnesses based on some inputs, and so WebRTC provides a new way that we can implement assistants like this to connect patients with the practitioners.

Just to give you some perspective on how important it is to move forward with this type of technology: there is an estimate from the World Health Organization that we have around 59 million medical practitioners in the world (not just doctors, but nurses and others too).

To understand how small this number is, remember that you have 7 billion people in the world, and 80% of these medical practitioners are established in the most developed countries.
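Taking the two worldwide figures quoted above at face value, the average works out as follows. This is plain arithmetic on the numbers from the talk, and the variable names are only for illustration.

```python
world_population = 7_000_000_000   # "7 billion people in the world"
practitioners = 59_000_000         # WHO estimate quoted in the talk

# Average number of people per medical practitioner, worldwide.
people_per_practitioner = world_population / practitioners
print(round(people_per_practitioner))  # prints 119
```

That is a global average across all kinds of practitioners; because most of them are concentrated in the most developed countries, the ratio elsewhere is far worse.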

Here in Argentina, we have 46 patients for each doctor. In parts of Africa you have 50,000 patients for every doctor.

This sort of technology allows doctors from anywhere to meet with patients. The WHO needed six months to send the first team to Africa when the Ebola epidemic started. Some patients had no doctor within 500 km, but they all had smartphones.

If they had had access to this technology and a neural network behind it, then the response would have been faster and the epidemic would have spread more slowly.

David: Now Arin is going to show you a WebRTC integration with a chat bot, which we haven’t yet had time to do in the Uni+ application. But we’ve had conversations with potential clients in Africa and around the world interested in this sort of WebRTC technology for remote telehealth, so this really is important.

Arin: So to demonstrate that last step of the process, I took the Botkit library, which is a library for building bots that respond to specific messages. It doesn’t have artificial intelligence behind it at this point, but it has keyword matching, so I can interact with the bot and have it tell me more about my appointment tomorrow.

Once I’ve stumped the bot, I ask it to connect me to a human, and that keyword tells it to call a doctor for me. So the Bot has initiated the WebRTC call for me using Twilio Video. That’s the final step of the Bots and WebRTC process that we wanted to show you today.
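The keyword-matching and hand-off behavior described here can be sketched without any bot framework. The code below is not the demo code and uses neither Botkit nor the Twilio API: the `KeywordBot` class, its `hears` and `receive` methods, and the `calls` list that stands in for starting a video call are all hypothetical.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str], str]


class KeywordBot:
    """Replies to a message with the first handler whose keyword matches."""

    def __init__(self) -> None:
        self._handlers: List[Tuple[Tuple[str, ...], Handler]] = []

    def hears(self, keywords: List[str], handler: Handler) -> None:
        # Register a handler for any message containing one of the keywords.
        self._handlers.append((tuple(k.lower() for k in keywords), handler))

    def receive(self, user: str, message: str) -> str:
        text = message.lower()
        for keywords, handler in self._handlers:
            if any(keyword in text for keyword in keywords):
                return handler(user)
        return "Sorry, I don't know that. Say 'human' to talk to a person."


# Stand-in for the video service: records who asked to be connected.
calls: List[str] = []


def connect_to_doctor(user: str) -> str:
    calls.append(user)  # a real app would create the video call here
    return "Connecting you to a doctor now."


bot = KeywordBot()
bot.hears(["appointment"], lambda user: "Your appointment is tomorrow.")
bot.hears(["human", "doctor"], connect_to_doctor)
```

Sending "connect me to a human" to `bot.receive` triggers `connect_to_doctor`, which is the point where an application would ask its video platform to set up the call and invite both participants.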

The code to do that is pretty simple. Basically with the Twilio API you are creating Conversations and inviting people to them. This is based on sample code from Twilio for how to integrate Botkit, as well as other examples to show how to do text chat and Twilio Video.

Twilio IP messaging

Botkit + Twilio IP messaging

Twilio Video demo

What will you build with WebRTC?

With that, I just want to wrap up and say that I hope you got some inspiration today at WebRTC Argentina for what you can build with WebRTC. There are many fields that WebRTC applies to. Thanks to all those who spoke and thanks to our sponsors (Thanks Twilio!), and thanks to all of you for coming!

Sponsors of WebRTC Argentina

If you’re interested in more about the applications shown in these demos, please contact the team at