Jan. 26, 2023 - In recent years we have observed a major shift that has upended many assumptions. The advent of smartphones and of video streaming on social networks (YouTube, TikTok, WhatsApp, WeChat, etc.) has caused the number of encoders to grow to the same order of magnitude as the number of decoders, with as many new sources of amateur or professional content. The development of the IoT (Internet of Things) and of M-to-M (Machine to Machine) communication will further multiply the number of cameras, each requiring an on-board encoder, while needing very few decoders.
One consequence is that it is no longer possible to neglect the impact that the algorithms of the next video compression standards, starting with H.267, will have on the encoder. There are at least three impacts to consider: cost, consumption and feasibility.
Video represents a dominant share of internet energy consumption (> 80%) and continues to grow rapidly (> 25%/year). The Internet as a whole accounts for more than 1% of global greenhouse gas emissions.
Video is a very heavy medium that can only be transported and stored after compression. Interoperability between equipment connected to the network is guaranteed by international (ISO) standardization of the decompression algorithms. The main ones are the MPEG standards, the first of which, MPEG-1, was published in 1992. Over 30 years, successive generations of standards have followed: MPEG-2, AVC/H.264, HEVC/H.265 and finally VVC/H.266 in 2020. Each generation has gained almost a factor of 2 in compression, at the cost of increasing the complexity of the decoder by a factor of about 2 and that of the encoder by a factor of 8 to 10.
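To make the compounding concrete, here is a back-of-the-envelope sketch of how those per-generation factors accumulate. The factors are the article's figures; applying a uniform 8x encoder step per generation is an illustrative assumption, not measured data:

```python
# The article's per-generation factors: each MPEG generation roughly
# halves the bitrate (2x compression), doubles decoder complexity,
# and multiplies encoder complexity by ~8-10 (8x assumed here).
GENERATIONS = ["MPEG-1", "MPEG-2", "AVC/H.264", "HEVC/H.265", "VVC/H.266"]

def cumulative_factors(n_generations, enc_step=8):
    """Return (compression, decoder, encoder) factors relative to MPEG-1."""
    steps = n_generations - 1
    return (2 ** steps, 2 ** steps, enc_step ** steps)

for i, name in enumerate(GENERATIONS, start=1):
    comp, dec, enc = cumulative_factors(i)
    print(f"{name:10s} compression x{comp:3d}  decoder x{dec:3d}  encoder x{enc:5d}")
```

Under these assumptions, a VVC/H.266 encoder would be on the order of 4,000x more complex than an MPEG-1 encoder, while the decoder has grown only 16x, which is the asymmetry the rest of the article builds on.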
In the past, MPEG standards have always favored decoders and neglected encoders, for the good reason that there were clearly more decoders than encoders (a few hundred million set-top boxes versus a few tens of thousands of professional encoders). It was therefore commonly accepted that the cost of the encoder should in no way influence the choice of compression algorithms; conversely, all the effort had to be concentrated, rightly, on optimizing the cost of the decoder.
How the algorithms of the next video compression standards impact the cost, consumption and feasibility of encoders
The silicon cost of an encoder dominates that of a decoder (at least 3x, SRAM included, for an integrated HEVC/AVC 4Kp60 multi-codec). Given that the number of installed encoders will exceed the number of decoders, it would henceforth be preferable to optimize the cost of the encoder, contrary to past practice.
The consumption of an encoder is much higher than that of a decoder (5x to 12x depending on the standard). Total consumption, however, depends on how long the encoder is used, which is itself largely a function of the application. In a VoD (Video on Demand) application, a film is encoded once and played many times, so the decoders dominate. In a teleworking videoconferencing application (Zoom, Teams, etc.), each user is likely to use their encoder as much as their decoder. In a video surveillance, IoT or M-to-M application, the encoders dominate (hundreds or thousands of cameras for a few display screens). Overall, encoders would have to be used 5x to 12x less than decoders for their total consumption to fall below that of the decoders. In the absence of more precise figures we cannot yet draw conclusions, but the orders of magnitude, and above all the tendency to multiply the number of encoders, should alert us. It is a subject that deserves our full attention and that future standards should take into account.
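The break-even reasoning above can be sketched in a few lines. This is a minimal model assuming the article's 5x-12x power ratio; the function, its parameters and the example figures are illustrative, not measured data:

```python
def encoder_share_of_energy(power_ratio, n_enc, n_dec, t_enc, t_dec):
    """Fraction of total codec energy consumed by the encoders.

    power_ratio -- encoder power / decoder power (article: 5x to 12x)
    n_enc, n_dec -- number of encoders and decoders in the system
    t_enc, t_dec -- average hours of use per device
    """
    e_enc = power_ratio * n_enc * t_enc  # decoder power normalized to 1
    e_dec = 1.0 * n_dec * t_dec
    return e_enc / (e_enc + e_dec)

# VoD: one encode, a thousand plays -> decoders dominate
print(encoder_share_of_energy(8, 1, 1000, 1, 1))   # well under 1%
# Videoconferencing: symmetric counts and usage -> encoders dominate
print(encoder_share_of_energy(8, 1, 1, 1, 1))      # ~89%
# Break-even: with equal counts, encoders must run 8x less (for an 8x ratio)
print(encoder_share_of_energy(8, 1, 1, 1, 8))      # exactly 50%
```

The last line shows the article's break-even condition directly: with an 8x power ratio and equal device counts, encoder and decoder energy are equal only when the encoder runs 8x fewer hours.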
Feasibility has also become an issue since VVC/H.266. Recall that compression standards standardize only the way to decode a stream, with a standardized syntax, and leave the field open to encoder designers as long as their encoder generates a stream decodable by a standard decoder. The encoder therefore remains in the competitive domain. The complexity of successive standards has continued to grow by a factor of 8 to 10 with each generation. Yet, thanks to many proprietary tricks resulting from years of research, encoder designers were able to approach, or even exceed, the visual quality of the reference model while drastically reducing the complexity of the algorithms that search for the best candidate in the space of predictors.
Thus, up to HEVC/H.265, it was possible to implement in state-of-the-art technology an encoder reaching the level of quality promised by the standard. With VVC/H.266, for the first time, it has become impossible, even in the best technology, to implement an encoder exceeding a 30% PSNR gain, significantly below what the reference model achieves. Quadrupling the effort (and therefore the silicon area or the number of CPU cores) produces an additional gain of only 2%: designers are hitting a wall of exponential complexity, and performance grows only asymptotically. It is therefore essential that future standards take into account the feasibility of a good encoder; otherwise there will be no real-time encoder capable of achieving the performance promised by the new standard and justifying its very existence.
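One way to read those two figures (~30% gain at a reference effort, +2% per quadrupling of effort) is a logarithmic gain-versus-effort curve. The model itself is an assumption fitted to just those two data points, purely to illustrate the wall the article describes:

```python
import math

# Assumed model: gain grows by a fixed increment per quadrupling of effort.
BASE_GAIN = 30.0        # % gain at reference effort (article's figure)
GAIN_PER_QUAD = 2.0     # % gained per 4x effort (article's figure)

def gain_pct(effort_multiple):
    """Estimated coding gain (%) at a multiple of the reference effort."""
    return BASE_GAIN + GAIN_PER_QUAD * math.log(effort_multiple, 4)

def effort_for_gain(target_pct):
    """Effort multiple needed to reach a target gain under this model."""
    return 4 ** ((target_pct - BASE_GAIN) / GAIN_PER_QUAD)

print(gain_pct(4))            # 32.0 -> quadrupling yields only +2%
print(effort_for_gain(36.0))  # 64.0 -> +6% would need 64x the silicon/CPU
```

Under this reading, closing even a 6-point gap to the reference model would require 64x the silicon area or CPU cores, which is why the article speaks of an exponential wall.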
New uses of video technology requiring low latency create new challenges
It should be noted that some of these new uses make it possible to reduce other more energy-intensive human activities.
In conclusion, the next video compression standard (H.267) faces new challenges: technological feasibility (it is already impossible to implement a real-time encoder reaching the full performance of the VVC/H.266 standard), the very low latency required by new uses and, most importantly, environmental sustainability, with consumption for a given service increasing from standard to standard despite technological advances.
by Philippe Wetzel, CEO and Founder of VITEC