
Software Infrastructure for Embedded Video Processor Cores

Date: 11-07-2022

With new-age technologies such as the Internet of Things, machine learning and artificial intelligence, companies are re-imagining and creating intelligent multimedia applications by blending physical reality and digital information in innovative ways. Multimedia solutions involve audio/video codecs, image/audio/video processing, edge/cloud applications and, in a few cases, AR/VR. This blog will discuss the software infrastructure of embedded video processor cores in any multimedia solution.

 

The video processor is a hardened RTL-based IP block that can be used on leading FPGA boards. With this embedded core, users can natively support video conferencing, video streaming, and ML-based image recognition and facial recognition applications with low latency and high resource efficiency. However, software-level issues related to OS support, H.264/265 processing, driver development, etc., may arise before deploying the video processor.

 

Let's start with an overview of video processors and see how these issues can be addressed for semiconductor companies so that end users can reap the benefits of their products.

 

Embedded Video Processor Cores

 

The video processor is a multi-component solution consisting of the video processing engine, a DDR4 block, and a synchronization block. Together, these components support H.264/H.265 encoding and decoding at resolutions up to 4K UHD (3840x2160p60) and, for the highest speed grade of this FPGA device family, up to 4096x2160p60. Supported levels and profiles include up to the L5.1 Advanced Layer for HEVC and L5.2 for AVC. All three blocks are RTL-based embedded IP deployed in the programmable logic fabric of the target FPGA device family and optimized ("hardened") for maximum resource efficiency and performance.

 

The video processor engine can encode and decode up to 32 video streams simultaneously. This is achieved by splitting the 2160p60 bandwidth across the expected channels, supporting up to 32 video streams at 480p30 resolution. H.264 decoding supports bitstreams up to 960 Mb/s on the L5.2 2160p60 High 4:2:2 profile (CAVLC), and H.265 decoding supports bitstreams up to 533 Mb/s on the L5.1 2160p60 Main 4:2:2 10b Intra profile (CABAC).
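As a back-of-the-envelope sanity check (a hypothetical sketch, assuming 480p means 720x480; the real channel split depends on the core's internal scheduling, not just raw pixel rate), the aggregate pixel rate of 32 such sub-streams stays within the single 2160p60 budget:

```python
# Back-of-the-envelope check: 32 x 480p30 streams vs. one 2160p60 budget.

def pixel_rate(width: int, height: int, fps: int) -> int:
    """Luma samples per second for one stream."""
    return width * height * fps

budget = pixel_rate(3840, 2160, 60)        # one 2160p60 stream
aggregate = 32 * pixel_rate(720, 480, 30)  # 32 simultaneous 480p30 streams

print(budget, aggregate, aggregate <= budget)
```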

 

The video processor engine also has significant built-in versatility. Rate control options include CBR, VBR and constant QP. Higher resolutions than 2160p60 are supported at lower frame rates. The engine can handle 8b and 10b colour depths as well as 4:0:0, 4:2:0 and 4:2:2 YCbCr chroma formats.
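To illustrate how the chroma format and bit depth drive raw frame size (a hypothetical lower-bound sketch assuming tightly packed samples; real buffer layouts typically pad 10b samples), consider a 3840x2160 frame:

```python
# Raw frame size for the supported chroma formats and bit depths.
# CHROMA_FACTOR: extra samples per luma sample contributed by the two
# chroma planes (0 for monochrome, 0.5 for 4:2:0, 1.0 for 4:2:2).
CHROMA_FACTOR = {"4:0:0": 0.0, "4:2:0": 0.5, "4:2:2": 1.0}

def frame_bytes(width: int, height: int, chroma: str, bit_depth: int) -> int:
    """Lower-bound raw frame size in bytes (tightly packed samples)."""
    samples = width * height * (1 + CHROMA_FACTOR[chroma])
    return int(samples * bit_depth // 8)

for fmt in ("4:0:0", "4:2:0", "4:2:2"):
    print(fmt, frame_bytes(3840, 2160, fmt, 8), frame_bytes(3840, 2160, fmt, 10))
```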

 

The microarchitecture consists of separate encoder and decoder sections, each managed by an embedded 32b synthesizable MCU that the host APU controls through a single 32b AXI4-Lite slave interface. Each MCU has L1 instruction and data caches supported by a dedicated 32b AXI-4 master. Data transfer to and from system memory is performed through 4-channel 128b AXI-4 master interfaces, separate for the encoder and decoder. There is also an embedded AXI performance monitor that directly measures bus transactions and latency, eliminating the need for software overhead beyond the firmware running on each MCU.
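The role of the performance monitor can be pictured with a toy model (hypothetical, not the core's actual register interface): counters accumulate as transactions complete, so the data path itself carries no software instrumentation.

```python
class AxiPerfMonitor:
    """Toy model of a bus performance monitor: it only accumulates
    counters as transactions complete, then reports averages."""
    def __init__(self):
        self.transactions = 0
        self.total_latency_cycles = 0
        self.total_bytes = 0

    def record(self, latency_cycles: int, beats: int, bus_bytes: int = 16):
        # 128b AXI data bus -> 16 bytes per beat
        self.transactions += 1
        self.total_latency_cycles += latency_cycles
        self.total_bytes += beats * bus_bytes

    def avg_latency(self) -> float:
        return self.total_latency_cycles / self.transactions

mon = AxiPerfMonitor()
mon.record(latency_cycles=40, beats=16)   # one 256-byte burst
mon.record(latency_cycles=60, beats=16)
print(mon.avg_latency(), mon.total_bytes)
```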

 

The DDR4 block is a combined memory controller and PHY. The controller portion optimizes R/W transactions to SDRAM, while the PHY performs SerDes and clock management tasks. Additional support modules provide initialization and calibration of the system memory. Five AXI ports and a 64b SODIMM port provide up to 2667 MT/s performance.

 

A third block synchronizes data transactions between the video processor engine encoder and DMA. It can buffer up to 256 AXI transactions and ensure low latency performance.
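The synchronization block's buffering behaviour resembles a bounded FIFO; a minimal sketch (an assumption of simple FIFO semantics with the stated capacity of 256, not the block's actual protocol) illustrates the back-pressure it implies:

```python
from collections import deque

class SyncBuffer:
    """Minimal FIFO model of the encoder/DMA synchronization buffer:
    holds at most 256 outstanding AXI transactions."""
    CAPACITY = 256

    def __init__(self):
        self._fifo = deque()

    def push(self, txn) -> bool:
        if len(self._fifo) >= self.CAPACITY:
            return False          # back-pressure: producer must stall
        self._fifo.append(txn)
        return True

    def pop(self):
        return self._fifo.popleft() if self._fifo else None

buf = SyncBuffer()
accepted = sum(buf.push(i) for i in range(300))
print(accepted)  # only 256 transactions fit before back-pressure
```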

 

The company's Integrated Development Environment (IDE) determines the number of video processor cores required for a given application, and the buffer configuration for encoding or decoding, based on the number of bitstreams, the selected codec and the required profile. Through the toolchain, users can select AVC or HEVC codecs, I/B/P frame encoding, resolution and level, frames per second, colour format and depth, memory usage, and compression/decompression operations. The IDE also provides estimates of bandwidth requirements and power consumption.
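The kind of bandwidth estimate such a tool produces can be approximated with simple arithmetic (a hypothetical first-order model of raw, uncompressed sample traffic; the real tool also accounts for reference-frame accesses and compression):

```python
# First-order raw video bandwidth estimate from stream parameters.
CHROMA_FACTOR = {"4:0:0": 0.0, "4:2:0": 0.5, "4:2:2": 1.0}

def raw_bandwidth_mbps(width, height, fps, chroma, bit_depth, streams=1):
    """Raw video bandwidth in Mb/s, counting tightly packed samples."""
    samples_per_frame = width * height * (1 + CHROMA_FACTOR[chroma])
    bits_per_second = samples_per_frame * bit_depth * fps * streams
    return bits_per_second / 1e6

# One 2160p60 4:2:0 8b stream vs. 32 simultaneous 480p30 streams
print(raw_bandwidth_mbps(3840, 2160, 60, "4:2:0", 8))
print(raw_bandwidth_mbps(720, 480, 30, "4:2:0", 8, streams=32))
```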

 

Embedded Software Support

 

Embedded software development support for the video processor hardware can be divided into the following categories.

 

        Video codec verification and functional testing

        Linux support, including kernel development, driver development and application support

        Tool and framework development

        Reference design development and deployment

        Use of and contribution to open-source organizations as needed

 

Validation of the AVC and HEVC codecs on the video processor is extensive. It must be performed at 3840x2160p60 performance levels for encoding and decoding, in both bare-metal and Linux-supported environments. Low-latency performance must also be validated from prototyping through full production.

 

Linux work focuses on the multimedia frameworks and layers needed to customize the kernel and drivers. This includes the v4l2 subsystem, the DRM framework, and drivers for the synchronization block to ensure low-latency performance.

 

The codec and Linux projects lend themselves to developing various reference designs on behalf of customers: edge designs for encoding and decoding, from low-latency video conferencing to 32-channel video streaming, region-of-interest-based encoding and ML face detection. All can be accomplished using carefully chosen open-source tools, frameworks and functionality. A summary of these tools follows.

 

  GStreamer: An open-source, multi-OS multimedia component library whose components can be assembled into pipelines, following an object-oriented design approach and a plug-in architecture, for multimedia playback, editing, recording and streaming. It supports the rapid construction of multimedia applications and is available under the GNU LGPL license. GStreamer ships with various useful tools, including gst-launch (for building and running GStreamer pipelines) and a basic tracing subsystem.

 

  StreamEye: An open-source tool that provides data and graphical displays for in-depth analysis of video streams.

 

  GstShark: Available as an open-source project from RidgeRun, this tool provides benchmarking and tracing capabilities for analyzing and debugging GStreamer multimedia application builds.

 

  FFmpeg and FFprobe: Both are part of the FFmpeg open-source project. They are hardware-independent, multi-OS tools for multimedia software developers. FFmpeg allows users to convert multimedia files between multiple formats, change sample rates, and scale video. FFprobe is a basic tool for multimedia stream analysis.

 

  OpenMAX: Available through the Khronos Group, this API and signal processing library allows developers to make multimedia stacks portable across hardware platforms.

 

  Yocto: A Linux Foundation open source collaboration that creates tools (including SDKs and BSPs) and support functions to develop Linux custom implementations for embedded and IoT applications. The community and its Linux version are hardware-independent.

 

  Libdrm: A set of open-source low-level libraries for DRM support. The Direct Rendering Manager is a Linux kernel subsystem that manages GPU-based video hardware on behalf of user programs. It arbitrates program requests via command queues and manages hardware subsystem resources, especially memory. The libdrm libraries include functions to support GPUs from Intel, AMD, and Nvidia, along with tools such as modetest for testing DRM display drivers.

 

  Media-ctl: A widely available open-source tool for configuring media controller pipelines in the Linux v4l2 layer.

 

  PYUV Player: Another widely used open-source tool that allows users to play uncompressed video streams.

 

  Audacity: A free, multi-OS audio editor.

 

The above tools and frameworks help in designing efficient, high-quality multimedia solutions for video processing, streaming and conferencing.
