Recently, Computools released a multi-target Video Chat Solutions Kit that easily adapts to the needs of different industries. This ready-made “out-of-the-box” solution is suitable both for internal use and for expanding the functionality of products and services.
Video Chat Solutions Kit is a pre-designed platform for video conferencing and face-to-face meetings in specific chat rooms. High scalability of the solution allows it to be used for educational institutions and online schools, healthcare institutions (for example, telemedicine), dating applications and messengers, support and consulting services, property agencies, law firms, and others.
Purposes and Advantages of the Video Chat Solution
Because of the quarantine caused by the pandemic, demand for moving daily communication from in-person to online is high worldwide. The requests the company received from clients during presales fall into several groups:
1. Party streaming – one main publisher who transmits video and audio, many publishers who temporarily (10-25 sec) send video, many subscribers who get video and audio. It could be used instead of real night clubs.
2. Auctions – streaming is not mandatory, but some applications add it so that users feel as if they are at a real auction, watching auction house employees describe the item and entertain the audience. The setup is one publisher who transmits video and audio and many subscribers who receive them.
3. A full combination of Zoom and TeamViewer.
Also, according to Lifesize statistics, video chats bring significant advantages to businesses.
The Computools’ Solution Kit is intended to acquaint you with the company’s capabilities in providing video/audio streaming services. The out-of-the-box solution is not a copy-paste base for lots of similar products; it is a customizable and scalable tool that reduces estimation hours for streaming features and can be adapted to your business needs.
Here is some general information about the solution, starting with definitions.
Some essential features of the Video Chat solution:
1. Privacy: private/public. The application displays a list of public rooms, and any user can join one unless it has an additional password or similar restriction. Private rooms are not shown in the list; you need a link to join them. The customized solution can include both public and private rooms or only one type, depending on your needs.
2. Security: additional security measures can be added to rooms. For example, if the owner wants only certain people to join their room, they can fill in the “Allowed for” field or set a password, or a user can “knock” on a room and wait for approval.
3. Conference – a type of room where all members are publishers and can publish video/audio. It can be used for internal meetings or negotiations.
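The room rules above can be sketched as a small data model. This is only an illustration under stated assumptions: the names `Room`, `Access`, `list_public_rooms`, and `check_access` are hypothetical and not taken from the Solutions Kit, and the order in which the checks are applied is one possible choice.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Access(Enum):
    GRANTED = "granted"
    DENIED = "denied"
    PENDING_APPROVAL = "pending_approval"  # the user "knocked" and waits for the owner


@dataclass
class Room:
    name: str
    owner: str
    is_private: bool = False        # private rooms are not listed; joined via link
    password: Optional[str] = None  # optional extra protection (plain text only for the sketch)
    allowed_users: Set[str] = field(default_factory=set)  # the "Allowed for" field
    knock_enabled: bool = False     # users outside the list may ask for approval


def list_public_rooms(rooms: List[Room]) -> List[str]:
    """Only public rooms appear in the room list."""
    return [room.name for room in rooms if not room.is_private]


def check_access(room: Room, user: str, password: Optional[str] = None) -> Access:
    """Decide whether a user may join a room (assumed rule order)."""
    if user == room.owner:
        return Access.GRANTED
    if room.allowed_users and user not in room.allowed_users:
        # Not on the allow list: knocking is the only remaining way in.
        return Access.PENDING_APPROVAL if room.knock_enabled else Access.DENIED
    if room.password is not None and password != room.password:
        return Access.DENIED
    return Access.GRANTED
```

For example, a private room with `allowed_users={"bob"}` and `knock_enabled=True` grants access to `bob` and returns `Access.PENDING_APPROVAL` for anyone else.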
So, how exactly can Computools’ multi-purpose video chat system benefit your business? The solution consists of two parts: a video chat and a conference. Video chat allows users to collaborate and share content through video. It can be integrated into any kind of product (platform or application) where such a function is required. The Solutions Kit lets you split user groups into separate chat rooms, each with its own video chat. Conference, as the name implies, is a convenient tool for holding online conferences: it allows you to share ideas and run group classes or other activities and services for a large number of listeners at the same time.
Implementing video chat helps to improve customer service, reduce costs for online support, increase sales and brand loyalty, strengthen teamwork, and build stronger relationships with partners. Using a ready-made, scalable, and easily customizable solution delivers all these competitive advantages several times faster than building a video chat from scratch. The experience of Computools shows that applying the Solutions Kit reduces custom video chat development time by 80% and, as a result, cuts operating costs by 40%. Moreover, advanced collaborative capabilities allow you to reach a wider audience, improve your interaction with employees, customers, and partners, and scale your business.
Path From the Idea to a Solid Solution Kit
The main goal was to develop a cross-platform solution for video calls and conferences with chatting capabilities.
The following key features were defined:
1) authorization/registration of users;
2) creating conferences, with the ability to password-protect rooms or specify which users have the right to join;
3) editing conferences.
The solution should let a user join a conference if they have the required permissions. After joining the room, users should see other participants and be able to chat with them.
System requirements have also been identified:
1) the solution should be cross-platform – users should be able to communicate from iOS and Android phones as well as from a browser;
2) the solution must withstand a load of up to 10 simultaneous user sessions;
3) a server with no more than 1 vCPU and 1 GB of RAM should be enough to host at least one room.
Also, the solution should work in the browser without requiring the installation of any extension or other additional software.
It was decided to build the application itself on the Electron framework, which made it possible to develop for many platforms at once and to reuse code when moving from platform to platform, giving an acceptable balance between development speed and application speed.
After conducting thorough research on the specified parameters, the team identified several possible solutions and ways to achieve the goal: RTP, HLS, Flash, WebRTC.
RTP and its version for streaming – RTSP:
Browsers generally do not support RTSP, so the stream would have to be converted for the browser by an intermediate server. It would be possible to use RTMP, which works well with Flash, but that conflicts with the requirement of no additional plugins. HTML5 wrappers can also be used to create a player, but then another problem arises: how to obtain an RTSP stream producer? RTSP and its derivatives work well with Flash players (RTMFP, RTMP). Besides, this approach has much longer connection establishment delays than, for example, WebRTC.
HLS has the highest latency among the alternatives. Since it uses only standard HTTP transactions, the stream can traverse firewalls and proxies that allow HTTP traffic, unlike UDP-based protocols such as RTP, and can be delivered to consumers via existing CDNs. Many browsers support this protocol, but its latency is much higher than that of WebRTC.
There are many solutions for organizing video conferencing with Flash, and Flash extensions exist for all browsers. But this option is gradually losing popularity because Flash, along with its ease of implementation, brought a huge security hole to the world of browsers. The need to install a Flash extension, the insecurity of Flash content itself, and the fact that Chrome will soon cease to fully support Flash all count against this technology.
Unlike the previous technologies, WebRTC has several advantages:
1) it is supported by all major browsers without installing any extensions;
2) it provides low latency compared to other protocols;
3) it can be used even without media servers, providing p2p communication.
The main problem with this technology is that firewalls and other systems (such as NAT) can prevent a peer from obtaining the IP address of the other side, and without it, communication cannot be established. STUN/TURN servers are used to solve this problem.
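As a rough illustration of how STUN/TURN servers are handed to a WebRTC client, a backend might build the standard `iceServers` configuration and send it to the browser as JSON. A STUN server lets a client discover its public address, while a TURN server relays media when no direct path can be found. The function name `build_ice_config` and the `example.org` server URLs and credentials in the usage note are placeholders, not part of the Solutions Kit.

```python
import json
from typing import List, Optional


def build_ice_config(
    stun_urls: List[str],
    turn_url: Optional[str] = None,
    turn_username: Optional[str] = None,
    turn_credential: Optional[str] = None,
) -> dict:
    """Build a dict shaped like the browser's RTCConfiguration dictionary."""
    ice_servers = [{"urls": stun_urls}]
    if turn_url is not None:
        # TURN relays media, so it normally requires credentials.
        ice_servers.append(
            {
                "urls": [turn_url],
                "username": turn_username,
                "credential": turn_credential,
            }
        )
    return {"iceServers": ice_servers}


def ice_config_json(config: dict) -> str:
    """Serialize the configuration so it can be sent to the client."""
    return json.dumps(config)
```

On the client, the decoded object could be passed to `new RTCPeerConnection(config)`. A call such as `build_ice_config(["stun:stun.example.org:3478"], "turn:turn.example.org:3478", "demo", "secret")` yields one STUN entry and one TURN entry.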
Custom protocols would make it possible to obtain a solution with any required characteristics, but this approach is very time-consuming and costly.
After conducting a comparative analysis of the capabilities of technologies, the choice of WebRTC became obvious. The next step was to choose the proper architecture.
WebRTC solutions offer the following types of architectures:
– MESH – each participant sends an audio/video stream to every other participant separately and receives one from each of them. If, for example, three people are communicating, each of them sends audio/video streams to the other two. The system thus carries 3 * 2 = 6 audio/video connections, i.e., 3 * 2 * 2 = 12 audio and video streams in total.
– MCU – each participant sends their audio/video streams to the server, from where they reach everyone else, but in the form of one stream. The MCU merges the streams from many participants into one output stream and distributes it to the rest of the participants. This type of architecture is also well suited to integrating audio/video effects and transformations, since they need to be applied to only one output stream.
The difference from MESH is that the server additionally takes on the work of merging audio/video content, which is resource-intensive. On the other hand, each person does not receive separate streams from each participant but one stream with the audio/video content of all participants. For example, returning to the previous hypothetical situation with three people communicating, each of them sends their audio and video streams to the server and receives only one stream, which already contains the audio/video of all other participants.
– SFU – tries to combine the advantages of MESH and MCU. Each user sends their own audio/video streams to the server and receives separate audio/video streams from the server for each other participant; the server simply forwards the streams it receives. Thus, if three people are communicating, each of them sends their own audio and video streams to the server and receives 2 participants * 2 streams (audio and video) = 4 streams from the rest of the participants. The difference from the MESH architecture is that the server forwards user streams, not the users themselves. The difference from the MCU is that the server does not merge user streams. This makes it possible to manage video streams individually (for example, to mute individual participants or turn off the display of some participants) and also reduces the load on the server, which does not need to merge streams.
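The trade-offs between the three architectures can be made concrete with a small calculation. The sketch below assumes that every participant publishes exactly one audio and one video stream and that no participant sends a stream to themselves; the names `Load` and `stream_load` are hypothetical.

```python
from dataclasses import dataclass

STREAMS_PER_PARTICIPANT = 2  # one audio stream + one video stream


@dataclass(frozen=True)
class Load:
    sent_per_client: int      # streams each participant uploads
    received_per_client: int  # streams each participant downloads
    server_mixes: bool        # whether the server must merge streams


def stream_load(architecture: str, participants: int) -> Load:
    """Per-participant stream counts for MESH, MCU, and SFU."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if architecture == "mesh":
        # Every participant exchanges streams directly with every other one.
        per_client = STREAMS_PER_PARTICIPANT * others
        return Load(per_client, per_client, server_mixes=False)
    if architecture == "mcu":
        # One upload, one merged download; the server does the mixing.
        return Load(STREAMS_PER_PARTICIPANT, STREAMS_PER_PARTICIPANT, server_mixes=True)
    if architecture == "sfu":
        # One upload; the server forwards everyone else's streams unmerged.
        return Load(STREAMS_PER_PARTICIPANT, STREAMS_PER_PARTICIPANT * others, server_mixes=False)
    raise ValueError(f"unknown architecture: {architecture}")
```

With three participants, `stream_load("mesh", 3)` gives 4 streams sent and 4 received per client, `stream_load("mcu", 3)` gives 2 and 2 with server-side mixing, and `stream_load("sfu", 3)` gives 2 sent and 4 received. The upload cost of MESH grows with the number of participants, while SFU keeps it constant without requiring the server to merge anything.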
The latter type of architecture came closest to the stated objectives and was chosen for the project, as it allows providing the necessary functionality of the Video Chat Solution Kit. The next task that the team faced was choosing a server to work with WebRTC.
To make the right choice, the following servers were analyzed: the Ion media server (written in Go), the OpenVidu media server, and the Janus WebRTC server.
The project team chose Janus for several reasons:
1. It works much faster than the other solutions. Janus provides an SFU architecture, so no server resources are spent on merging streams.
2. It makes it easy to configure individual streams on the client side, thanks to the SFU architecture. It also supports simulcasting, which lets clients decide which streams to receive in good quality and which in lower quality. This enables customers to flexibly customize the system to suit their needs.
3. It has a large number of plugins that provide many more features than just WebRTC video calls, which makes it much easier to add features to the system in the future.
4. It is easy to configure, deploy, and build. Janus provides good documentation on exactly how to build the server, and configuring plugins through separate config files makes it very easy to adjust various aspects of the system.
Janus supports a large number of transports for sending JSON messages to it. For the Solution Kit, the team chose WebSocket because it works in real time and delivers all the required information and events from Janus quickly, unlike other transports. Besides, it integrates easily with the various frameworks used to implement the front-end part of the solution.
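To give a feel for the JSON messages exchanged with Janus, the sketch below builds three typical requests: creating a session, attaching to the VideoRoom plugin, and asking that plugin to create a room. The field names follow the public Janus API and VideoRoom plugin documentation but should be verified against the Janus version in use. The helper function names and the IDs in the usage note are hypothetical, and no network connection is made here.

```python
import json
import uuid
from typing import Optional


def new_transaction() -> str:
    """Janus matches responses to requests by a client-chosen transaction string."""
    return uuid.uuid4().hex


def create_session() -> dict:
    return {"janus": "create", "transaction": new_transaction()}


def attach_videoroom(session_id: int) -> dict:
    return {
        "janus": "attach",
        "session_id": session_id,
        "plugin": "janus.plugin.videoroom",
        "transaction": new_transaction(),
    }


def create_room(
    session_id: int,
    handle_id: int,
    room: int,
    publishers: int,
    pin: Optional[str] = None,
    is_private: bool = False,
) -> dict:
    """Wrap a VideoRoom "create" request in a Janus "message" envelope."""
    body = {
        "request": "create",
        "room": room,
        "publishers": publishers,  # maximum number of concurrent publishers
        "is_private": is_private,  # private rooms are hidden from room lists
    }
    if pin is not None:
        body["pin"] = pin          # password required to join the room
    return {
        "janus": "message",
        "session_id": session_id,
        "handle_id": handle_id,
        "transaction": new_transaction(),
        "body": body,
    }
```

Each dict would be serialized with `json.dumps` and sent as a text frame over the WebSocket connection. In practice, `session_id` and `handle_id` come from Janus's replies to the `create` and `attach` requests, for example `create_room(session_id, handle_id, room=1234, publishers=10, pin="s3cret")`.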
To implement the video chat solution, the following stack was chosen:
1) Python + Django + Django channels;
2) React Native.
The application architecture consists of three parts:
1. Django backend that accepts messages from the client, connects to the Janus server using the Janus Python SDK, and handles management logic, permissions, and chat for users.
2. Janus Python SDK – a module that communicates with Janus: it generates messages for it, receives its responses, and passes them to Django.
3. The frontend part of the application, which generates requests from clients, displays users’ video content, and establishes WebRTC connections.
The Django backend acts as a proxy between the client and the Janus server. It controls the permissions of all users, keeps user data synchronized with the Janus server, and provides a chat subsystem for service users. Django also provides an admin panel that makes it easy to manage all users of the system, the created rooms, and so on.
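The proxy role described here can be illustrated without any framework. In the sketch below, `RoomProxy`, `PermissionDenied`, and the `send_to_media_server` callable are hypothetical names; the callable stands in for whatever the Janus Python SDK exposes, since its API is not shown in this article. The point is only that the permission check happens before anything reaches the media server.

```python
from typing import Callable, Dict, Set


class PermissionDenied(Exception):
    """Raised when a user tries to act on a room they may not access."""


class RoomProxy:
    """Sits between clients and the media server and enforces permissions."""

    def __init__(self, send_to_media_server: Callable[[dict], dict]) -> None:
        self._send = send_to_media_server
        self._members: Dict[int, Set[str]] = {}  # room id -> permitted users

    def grant(self, room_id: int, user: str) -> None:
        """Record that a user is allowed to use a room."""
        self._members.setdefault(room_id, set()).add(user)

    def handle(self, user: str, request: dict) -> dict:
        """Forward a client request only if the user has access to the room."""
        room_id = request["room"]
        if user not in self._members.get(room_id, set()):
            raise PermissionDenied(f"{user} may not access room {room_id}")
        return self._send(request)
```

In a Django Channels setup, a method like `handle` would typically be called from a WebSocket consumer, with the permitted users loaded from the database rather than kept in memory.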
During the development process, the solution was repeatedly tested and verified, which ensured the stable operation of the finished product. All the tasks set within the project were achieved: the Video Chat Solution Kit has broad capabilities and can be easily adapted and customized for specific needs.
If you need more information on how the implementation of Video Chat Solutions Kit can enhance your business, feel free to contact Computools’ expert by email firstname.lastname@example.org or using the form below.
Computools is an IT Consulting and Software Solutions Development company that helps businesses innovate faster by building digital solutions or bringing tech products to market sooner. Discover our collaborative approach and industry expertise that spans finance, retail, healthcare, consumer services and more.