Vision-Based System Design Part 6 – Efficient Sensor Fusion in Sophisticated Embedded Vision Systems


Giles Peckham, Regional Marketing Director at Xilinx
Adam Taylor CEng FIET, Embedded Systems Consultant.

This series of articles has considered several aspects of developing embedded vision systems, including sensor selection, interfacing, and development of the signal chain comprising vision-signal processing algorithms.

In a sophisticated embedded vision application, such as an automotive Advanced Driver Assistance System (ADAS), some functionality may depend on combining the results from two or more sensors. This is sensor fusion, which enables the system to acquire information that cannot be provided by one sensor alone.

In the context of vision-based systems, sensor fusion is usually done in real time, to enable immediate decision making. The alternative is offline sensor fusion, where the sensor data is extracted and fused, and decisions are made, at a later point in time. Moreover, in an application such as ADAS, sensor fusion may involve combining several channels of data from sensors of the same type (homogeneous fusion), or could demand fusion of data from different types of sensors (heterogeneous fusion). An object-detection and distance-monitoring application provides a good example for comparing these homogeneous and heterogeneous approaches.

A system relying on a single forward-looking vision sensor could detect and identify objects, but at least one more vision sensor is needed to calculate distances to detected objects using a parallax algorithm.
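As a minimal sketch of the parallax calculation, the function below assumes two rectified, parallel pinhole cameras, in which case distance is focal length times baseline divided by disparity. The function name and the numeric values are hypothetical and chosen only for illustration; a real system would first have to match features between the two images to obtain the disparity.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Return the distance in metres to a point seen by both cameras.

    Assumes rectified, parallel cameras: Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero means the point is at infinity")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 1000-pixel focal length, cameras mounted 0.5 m apart,
# and an object whose image shifts by 25 pixels between the two sensors.
distance_m = depth_from_disparity(1000.0, 0.5, 25.0)
print(distance_m)  # 20.0
```

Because distance is inversely proportional to disparity, range resolution degrades for distant objects, which is one reason a wider baseline or a ranging sensor may be preferred.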

Alternatively, combined object detection, recognition, and range-finding can be enabled by fusing vision-sensor data with RADAR or LIDAR. Other application examples involving fusion of differing images include X-Ray, MRI and CT for medical applications, or visible and infrared images in security systems.

Processing Demands
Crunching data from multiple vision sensors requires considerable processing power. If colour image sensors are used, pre-fusion processing such as colour-filter interpolation, colour-space conversion, resampling and image correction is required. The sensor-fusion algorithm itself must then be performed. In an ADAS, the simplest approach to locating objects requires subsequent background subtraction, thresholding and contour detection, whereas some systems may use an even more processing-intensive HoG/SVM classifier. Moreover, demands for higher frame rate or larger image size further increase the computation required to pre-process the image and extract the information.
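The simplest object-location approach mentioned above can be sketched in a few lines of software. The example below is illustrative only: it works on small greyscale frames held as nested lists, the function names are hypothetical, and a bounding box is used as a stand-in for full contour detection. A production pipeline would perform these steps per pixel in the programmable logic.

```python
def foreground_mask(frame, background, threshold):
    """Background subtraction followed by thresholding on 2-D lists of grey levels."""
    return [
        [1 if abs(pixel - ref) > threshold else 0 for pixel, ref in zip(row, ref_row)]
        for row, ref_row in zip(frame, background)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the foreground pixels, or None if empty."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Synthetic 5x6 scene: a uniform background with a bright 3x3 object added.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 4):
    for c in range(2, 5):
        frame[r][c] = 200

mask = foreground_mask(frame, background, threshold=50)
box = bounding_box(mask)
print(box)  # (1, 2, 3, 4)
```

Every pixel is visited at least once per stage, so the work scales directly with image size and frame rate, which is the scaling the paragraph above describes.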

Of course, this is only half of the story: in a homogeneous system, the same image-processing pipeline must be implemented for the second sensor. Similarly, a heterogeneous system must configure, drive, receive and extract the information from the accompanying channel or channels.

Benefit of All Programmable Architectures
Within embedded vision systems it is common to use All Programmable FPGAs or All Programmable SoCs to implement the image-processing pipeline. If these make sense for traditional embedded vision applications, then they really stand out for embedded vision fusion applications.
An embedded vision application typically uses a processor for supervision, control and communication. In an All Programmable SoC, this is a hard core with many supporting peripherals and interface standards. If an All Programmable FPGA is used, the processor can be a soft core with customised peripheral and interface support. Taking advantage of other features of these embedded processors, such as SPI or I2C interfaces, allows additional sensors such as accelerometers, magnetometers, gyroscopes and even GPS to be connected. This enables the software to quickly and easily obtain required information from a host of different sensor types, and provides for a scalable architecture.

While the image-processing pipeline required to extract information from the image sensor can be implemented easily in programmable logic fabric, this fabric can also implement pipelines for other heterogeneous sensors such as RADAR and LIDAR, or multiple pipeline instances in the case of a homogeneous system.

The tight coupling between the processor memory and programmable logic in All Programmable Zynq®-7000 or All Programmable Zynq® UltraScale+™ MPSoCs ensures the application software can easily access the resultant datasets for further processing and decision making. Because the separate sensor chains are implemented in logic they operate in parallel, which is beneficial when synchronisation is required, such as with stereoscopic vision. Moreover, implementation can be accelerated by using High Level Synthesis (HLS) to develop the algorithms directly for implementation within the programmable logic fabric.

Example Architecture
Both homogeneous and heterogeneous approaches can be demonstrated in an All Programmable SoC. While the sensor types differ between the two applications, the end objective of both architectures is to place two data sets within the processing system (PS) DDR memory, while maximising the performance provided by the programmable logic fabric.

Considering the homogeneous approach first, the resulting implementation is a stereoscopic vision system in which each channel uses a CMOS imaging sensor. A major advantage is that only one image-processing chain needs to be developed: the same design can be instantiated twice within the programmable logic fabric for both image sensors. This enables a significant saving in development costs, even though the algorithms for calculating parallax require intensive processing.
One of the most important requirements in such a system is to synchronise the two image-processing chains. When implementing the chains in parallel within the programmable logic fabric, this requirement can be met by applying the same clock to each chain, subject to appropriate constraints.

The architecture of the homogeneous approach shows the two image-processing chains, which are based predominantly upon available IP blocks. Image data is captured using a bespoke sensor-interface IP module and converted from parallel format into streaming AXI. This allows for an easily extensible image-processing chain, the results from which can be transferred into the PS DDR using the high-performance AXI interconnect combined with video DMA.

A heterogeneous implementation using differing sensor types could combine the image-sensor object-detection architecture described earlier with RADAR to measure distance. There are two options for implementing the RADAR: a pulsed (Doppler) approach or a continuous-wave approach. The best option depends on the requirements of the final application; however, both follow a similar architecture.

The RADAR implementation can be considered in two parts: signal generation including a high-speed digital-to-analogue converter to produce a continuous-wave or pulsed signal, and signal reception using a high-speed analogue-to-digital converter to capture the received continuous-wave or pulsed signal. When it comes to signal processing, both approaches will utilise FFT-based analysis implemented with the programmable logic fabric; the resultant data sets can be transferred to the PS DDR using DMA.
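To illustrate the FFT-based analysis, the sketch below assumes a frequency-modulated continuous-wave (FMCW) RADAR with a linear chirp, which is one possible continuous-wave scheme and is not specified in the text above. Under that assumption, the range is R = c * f_beat * T_chirp / (2 * B), where f_beat is the strongest frequency in the mixed-down received signal. All function names and parameter values are hypothetical, and a direct DFT stands in for the FFT core that would run in the programmable logic.

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def dft_magnitudes(samples):
    """Magnitudes of the first N/2 bins of a direct DFT (stand-in for an FFT core)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def range_from_beat(samples, sample_rate_hz, bandwidth_hz, chirp_time_s):
    """Estimate target range from the strongest beat-frequency bin (FMCW assumption)."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
    beat_hz = peak_bin * sample_rate_hz / len(samples)
    return SPEED_OF_LIGHT * beat_hz * chirp_time_s / (2.0 * bandwidth_hz)


# Hypothetical setup: 64 samples at 64 kHz containing an 8 kHz beat tone,
# produced by a 150 MHz chirp lasting 1 ms.
n = 64
samples = [math.cos(2 * math.pi * 8 * i / n) for i in range(n)]
estimated_range_m = range_from_beat(samples, 64_000.0, 150e6, 1e-3)
print(round(estimated_range_m, 2))  # roughly 8 m
```

The range resolution of this estimate is set by the bin spacing, so a longer capture or a wider chirp bandwidth gives finer distance steps at the cost of more computation.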

For either implementation, the fusion algorithm for both datasets is performed within the PS using software. It is worth noting that fusion algorithms can place heavy demands on processing bandwidth. One option for achieving higher performance is to use the SDSoC™ design environment.
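As one hedged illustration of what a software fusion step might look like in the heterogeneous case, the sketch below attaches a RADAR range to each vision detection by matching on bearing. This nearest-bearing association is an assumption made for the example, not the method used in any particular ADAS; the function name, data layout and gating threshold are all hypothetical.

```python
def fuse_detections(vision_objects, radar_targets, max_bearing_gap_deg=2.0):
    """Attach the nearest-in-bearing RADAR range to each vision detection.

    vision_objects: list of (label, bearing_deg) from the image-processing chain.
    radar_targets:  list of (bearing_deg, range_m) from the RADAR chain.
    Returns a list of (label, bearing_deg, range_m or None).
    """
    fused = []
    for label, bearing_deg in vision_objects:
        best = min(radar_targets, key=lambda t: abs(t[0] - bearing_deg), default=None)
        if best is not None and abs(best[0] - bearing_deg) <= max_bearing_gap_deg:
            fused.append((label, bearing_deg, best[1]))
        else:
            fused.append((label, bearing_deg, None))  # no RADAR return close enough
    return fused


# Hypothetical datasets, as they might be read back from PS DDR memory.
vision_objects = [("car", -5.0), ("pedestrian", 12.0), ("sign", 30.0)]
radar_targets = [(-4.6, 42.0), (12.3, 18.5)]

fused = fuse_detections(vision_objects, radar_targets)
for entry in fused:
    print(entry)
```

Each vision detection is compared against every RADAR target, so the cost grows with the product of the two list sizes, which hints at why fusion can become a candidate for acceleration.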

SDSoC enables software functions to be transferred seamlessly between the processor and the programmable logic of a SoC, using Vivado® HLS and a connectivity framework. Both are transparent to the software developer. The use of HLS to develop the processing chains of both homogeneous and heterogeneous implementations can be extended further: a custom SDSoC platform can be created for the chosen implementation, and SDSoC can then harness uncommitted logic resources to further accelerate the overall embedded vision system.

As markets for embedded vision systems continue to grow, and as image-detection and other types of sensors become more readily available and affordable, system designers need increasingly fast and efficient sensor fusion methodologies. All Programmable FPGAs and SoCs can simplify implementation and synchronisation of multi-channel processing chains, while System and High-Level Synthesis tools help ensure increased system performance and on-time design completion.
