
How do you enlarge an image when using a telepresence robot?

Not all telepresence robots on the market today have a feature that allows you to enlarge the remote scene to see it in detail. BotEyes does.

Three ways to zoom video

There are three ways to do this: optical zooming, and digital zooming, which may be either (i) embedded in the camera microprocessor or (ii) executed programmatically in the tablet's main processor.


Optical magnification is realized by a system of lenses whose spacing is changed mechanically by micromotors. It is the most effective method, since the enlarged image is sampled at the full resolution of the image sensor's matrix.

Before considering the principle of digital magnification, recall that the main limitation on the resolution of video transmitted over the Internet is the bandwidth of the communication channel. In addition, the principle of operation of a telepresence robot rules out using a buffer to compensate for sudden delays in the channel, since video frames must stay synchronized with the robot's movement, i.e. the video must be transmitted in real time. If there is a delay, controlling the robot becomes difficult or impossible. Therefore WebRTC, which is used in all telepresence robots, automatically reduces the size of the transmitted video, adjusting it to the actual bandwidth of the channel and taking its delays and failures into account. In practice, video is not transmitted at the maximum resolution, since this is rarely necessary but increases the cost.

Therefore, if, for example, the sensor matrix has a size of 8 MP and the bandwidth of the channel only allows video with a resolution of 1 MP, the frame is downscaled to 1 MP for transmission to prevent large delays.

Under these conditions, digital zooming can consist of cutting the required 1-MP fragment out of the full-resolution frame (which retains the maximum resolution of the matrix) instead of downscaling the whole frame from 8 MP to 1 MP, and only then transmitting it through the communication channel. No downscaling is required, since the cut fragment already has the 1-MP size allowed by the channel.
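For instance, an 8-MP matrix corresponds roughly to a 3264×2448 frame, and a 1-MP stream to roughly 1024×960; cutting a centered 1024×960 window out of the 3264×2448 frame therefore gives about 3× magnification while preserving the full sensor detail inside that window (these figures are illustrative, not the actual BotEyes camera parameters).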

The described procedure can be performed in two ways: by setting the constraint parameter { zoom: 1 }, or by using the JavaScript Canvas 2D API for HTML5.

The first method assumes that the manufacturer of the tablet (or computer) camera implements zooming in firmware (we will call it native zooming), directly in the camera processor, and provides the operating system with an interface for controlling the "zoom" parameter. Unfortunately, not all tablets perform scaling in the same way: some enlarge the image while simultaneously increasing the camera resolution, others zoom without changing the resolution. In the second case, zooming can only increase the number of pixels by interpolation, without adding any information to the video: the image becomes larger, but its clarity decreases and it looks more blurry.
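As an illustration, a native zoom request through the browser's standard media constraints might look like the sketch below. It assumes the camera and browser actually expose the "zoom" capability; the factor of 2 and the function name are our own examples, not the actual BotEyes code.

// A minimal sketch of native zooming via the "zoom" constraint.
async function applyNativeZoom(zoomFactor = 2) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [track] = stream.getVideoTracks();
  // Clamp the requested factor to the range reported by the camera firmware.
  const { zoom } = track.getCapabilities();
  const value = Math.min(Math.max(zoomFactor, zoom.min), zoom.max);
  await track.applyConstraints({ advanced: [{ zoom: value }] });
  return stream;
}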

The second method is implemented not in the camera but in software running on the tablet. Such software can be written, for example, in Java, JavaScript, or Objective-C. The zooming algorithm consists of two stages: increasing the camera resolution by a specified factor and then cutting out the desired fragment so that the transmitted image keeps the same size in pixels. As a result, the volume of the video stream stays the same, and it can be transmitted without extra delay through a communication channel with the same bandwidth as before zooming.
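A software-zoom pipeline of this kind might look like the following sketch. It assumes the camera preview is already playing in a video element; the element ids, the zoom factor and the 30 fps capture rate are illustrative assumptions, not the actual BotEyes values.

// A minimal sketch of software zooming with the Canvas 2D API.
const video = document.getElementById('cameraPreview');
const canvas = document.getElementById('zoomCanvas');
const ctx = canvas.getContext('2d');
const zoom = 2; // illustrative zoom factor

function drawZoomedFrame() {
  // Cut a centered fragment that is 1/zoom of the full frame...
  const sw = video.videoWidth / zoom;
  const sh = video.videoHeight / zoom;
  const sx = (video.videoWidth - sw) / 2;
  const sy = (video.videoHeight - sh) / 2;
  // ...and draw it at the original output size, so the transmitted
  // resolution (and hence the bitrate) stays the same as before zooming.
  ctx.drawImage(video, sx, sy, sw, sh, 0, 0, canvas.width, canvas.height);
  requestAnimationFrame(drawZoomedFrame);
}
requestAnimationFrame(drawZoomedFrame);

// The cropped picture can then be sent to the remote operator over WebRTC.
const zoomedStream = canvas.captureStream(30);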

Zooming methods implemented in the BotEyes telepresence robot

Native scaling is not supported by all tablets, and even when a tablet supports it, the browser may not. For example, native scaling on the Samsung Galaxy Tab S7+ tablet is supported in the Google Chrome browser but not in Microsoft Edge.
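One way to detect whether native scaling is available on a given tablet/browser pair is to inspect the track capabilities, as in this sketch (the function name is ours, not part of any API):

// A minimal sketch of feature detection: report 'native' only when the
// browser/camera pair exposes the "zoom" capability, otherwise fall back
// to software (Canvas-based) zooming.
function pickZoomMethod(videoTrack) {
  const caps = typeof videoTrack.getCapabilities === 'function'
    ? videoTrack.getCapabilities()
    : {};
  return 'zoom' in caps ? 'native' : 'software';
}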

For these reasons, both types of scaling are implemented in the BotEyes telepresence robot: native and software (in JavaScript). Depending on the tablet and browser used, the user can choose the scaling method that works best.

Unfortunately, the software method requires keeping a large image in the tablet's memory, and when the maximum resolution is selected this sometimes (quite rarely) leads to the well-known "Aw, Snap!" error in the Google Chrome and Microsoft Edge browsers. We have not observed this error in the Samsung Internet Browser, at least not in our experiments.

To reduce the load on the processor, we made it possible to lower the video frame rate after zooming in. This does not affect the convenience of using the robot, since zooming is most often used to examine stationary objects (a blackboard, a sheet of paper on a table, equipment parts). The frame rate can be raised again if the user's tablet has a large amount of RAM.
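Lowering the frame rate can be expressed, for example, as a frameRate constraint on the video track; the 5 fps value below is an illustrative assumption, not the rate BotEyes actually uses.

// A minimal sketch of reducing the frame rate of an existing video track.
async function reduceFrameRate(stream, fps = 5) {
  const [track] = stream.getVideoTracks();
  await track.applyConstraints({ frameRate: { max: fps } });
}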

Since the zooming factor depends on the camera resolution, and the rear camera of most tablets has almost twice the resolution of the front one, we made it possible to switch cameras quickly in the robot. To get the maximum magnification, use the rear camera.
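Switching to the higher-resolution rear camera can be requested with the standard facingMode constraint, as in this sketch; stopping the old tracks first is needed on many tablets that cannot keep both cameras open at once.

// A minimal sketch of switching between the front ('user') and
// rear ('environment') cameras.
async function switchCamera(currentStream, useRear) {
  currentStream.getTracks().forEach(track => track.stop());
  return navigator.mediaDevices.getUserMedia({
    video: { facingMode: useRear ? 'environment' : 'user' }
  });
}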

Another problem associated with programmatic magnification is autofocus. With the software zooming method, the focus criterion is computed over the entire field of the matrix, while only part of it is enlarged. It may therefore happen that the focus is not set on the part of the image that we have enlarged, and that part turns out to be blurry.

In general, the longer the camera stays stationary, the clearer the image becomes. There are two reasons: WebRTC automatically raises the video bitrate as the communication channel allows, and the tablet has time to adjust its focus.

In the video below you can see how the BotEyes zooming works in practice: