The views and opinions expressed in this blog are purely my own and do not reflect the official positions of my current or past employers.

This past weekend, I came across a news article reporting that millions of students in India are stuck at home with no access to the internet or online education. In fact, over half the world’s population still does not have an internet connection. While this digital divide has manifested itself in the U.S., especially with regard to children’s education during the Coronavirus pandemic lockdown, the problem is far worse in Asia and Africa, where fewer than 1 in 5 people are connected to the internet.

As a result of lockdowns being implemented globally, many adults and children are excluded from online education and telehealth.

[Figure: Screenshot of a WhatsApp video call]

Even the 1 in 5 statistic mentioned above understates the problem. For those who do have access to the internet, the price is at a premium and the bandwidth is limited. For instance, when I talk to my parents in India, they frequently run out of their allocated 4 GB well before the allowance period ends, after which the bandwidth gets throttled. Stalled frames, choppy audio, painful delays, eventual disconnections, and subsequent retries are a normal occurrence, but this is still arguably much better than a normal telephone conversation because I get to “see” them.

I understand Andrew Stuart's point that video calling is better for people's mental health¹. But video calls typically require 2 Mbps up and 2 Mbps down, as in the case of Zoom, a privilege that is out of reach for many.

To address this problem, I propose a new approach based on one insight: if we are willing to give up some realism in the rendering of faces and screens, then there is a whole new world of face and screen representations that can be derived for ultra-low bandwidth while keeping an acceptable quality of experience.

A person holding a whiteboard with the color image on left and a black-and-white photo on the right.

This article explores such representations and methods, which reduce the needed bandwidth from the usual 2 Mbps to as low as 1.5 Kbps, allowing video to be encoded along with telephone audio with minimal degradation in audio quality. The proposed solution can be implemented primarily in software, with no change to the underlying infrastructure. This would in turn be cheaper, and would open up access to people who are currently marginalized because they cannot afford it.
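To put these figures side by side, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: decimal units (1 GB = 8 × 10⁹ bits), both upload and download counted against the 4 GB allowance, and a symmetric 1.5 Kbps stream in each direction. The function and variable names are illustrative.

```python
# Back-of-the-envelope arithmetic using the figures quoted in this article.
# Assumptions (not from the article): decimal units, and both directions
# of the call count against the data allowance.

DATA_CAP_GB = 4              # data allowance mentioned above
VIDEO_CALL_BPS = 2_000_000   # 2 Mbps each way (the Zoom figure quoted above)
PROPOSED_BPS = 1_500         # 1.5 Kbps target discussed in this article


def call_hours(cap_gb: float, bps_each_way: float) -> float:
    """Hours of two-way calling before the data allowance is used up."""
    cap_bits = cap_gb * 8e9          # gigabytes -> bits
    total_bps = 2 * bps_each_way     # upload + download
    return cap_bits / total_bps / 3600


conventional_hours = call_hours(DATA_CAP_GB, VIDEO_CALL_BPS)
proposed_hours = call_hours(DATA_CAP_GB, PROPOSED_BPS)
reduction_factor = VIDEO_CALL_BPS / PROPOSED_BPS

print(f"Conventional video call: {conventional_hours:.1f} hours per allowance")
print(f"At 1.5 Kbps:             {proposed_hours:.0f} hours per allowance")
print(f"Bandwidth reduction:     about {reduction_factor:.0f}x")
```

Under these assumptions, 4 GB lasts a little over 2 hours at 2 Mbps each way, versus a few thousand hours at 1.5 Kbps, a reduction of roughly three orders of magnitude.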

Cellular Telephone vs Internet

You can skip this section if you are already convinced that cellular telephone calls are more reliable, faster, and better than VoIP calls.

In my experience, a cellular telephone call is often more reliable than a video call in several respects: delay, dropped calls, ease of use, consistency of call quality, and long-distance options.

Video-call providers such as Google Meet, Discord, GoToMeeting, Amazon Chime, and potentially Facebook use a protocol called WebRTC that enables real-time communication between two or more parties. WebRTC in turn relies on an internet protocol called User Datagram Protocol (UDP), which offers speedy delivery of packets but provides weak guarantees: some packets either do not arrive or arrive out of order, resulting in lost frames and jitter (a variation in the latency of a packet flow between peers)². Newer machine learning techniques, such as WaveNetEq³ in Google Duo, improve the audio quality; however, variations in video quality remain unsolved.
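To make loss, reordering, and jitter concrete, here is a minimal sketch that summarizes a handful of received packets. It uses the smoothed interarrival-jitter estimator defined for RTP in RFC 3550, which is one common way to measure jitter and not necessarily what any particular provider uses. The `Packet` class, the `summarize` function, and the sample timings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # sender-assigned sequence number
    sent_ms: float  # time the sender emitted the packet
    recv_ms: float  # time the receiver got it


def summarize(arrivals: list[Packet], expected_count: int) -> dict:
    """Count lost and out-of-order packets and estimate jitter.

    `arrivals` must be in arrival order. Jitter follows the RFC 3550
    running estimate: J += (|D| - J) / 16, where D is the change in
    transit time between consecutive arrivals.
    """
    jitter = 0.0
    out_of_order = 0
    highest_seq = -1
    prev = None
    for p in arrivals:
        if p.seq < highest_seq:
            out_of_order += 1  # arrived after a later-numbered packet
        highest_seq = max(highest_seq, p.seq)
        if prev is not None:
            d = (p.recv_ms - prev.recv_ms) - (p.sent_ms - prev.sent_ms)
            jitter += (abs(d) - jitter) / 16
        prev = p
    lost = expected_count - len({p.seq for p in arrivals})
    return {"lost": lost, "out_of_order": out_of_order, "jitter_ms": jitter}


# Five packets sent 20 ms apart; packet 2 never arrives and
# packet 3 shows up after packet 4 (made-up timings).
arrivals = [
    Packet(seq=0, sent_ms=0, recv_ms=50),
    Packet(seq=1, sent_ms=20, recv_ms=72),
    Packet(seq=4, sent_ms=80, recv_ms=131),
    Packet(seq=3, sent_ms=60, recv_ms=140),
]
stats = summarize(arrivals, expected_count=5)
print(stats)
```

A real receiver would also run a jitter buffer that holds packets briefly so late ones can be slotted back into order; that trades extra delay for smoother playback, which is part of why video calls feel laggy on poor links.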

On the other hand, while cellular telephone calls are also increasingly packet-based, engineers have made packet-switched cellphone networks increasingly efficient by enforcing stringent quality standards and managing bandwidth cleverly. Another reason cellular calls work well is that the bandwidth needed for audio is much lower than that needed for video. Further, the encoding standards have been developed over many decades and rely on the human brain to fill in gaps. I will eventually argue that many of the goals of internet video calling, such as two-way video calls, multi-party video broadcasts and live events, screen sharing, and A/B testing of Quality of Experience (QoE)⁷, are also achievable over cellular telephone calls. However, in the rest of this article, we shall focus only on two-way video calls to set the stage.
