Vision-Based System Design Part 1 – Sensor Selection

Aaron Behman, Director of Strategic Marketing, Embedded Vision, Xilinx, Inc.

Adam Taylor CEng FIET, Embedded Systems Consultant.

Visual sensing is widely used in tasks such as industrial inspection and security surveillance, and opportunities continue to expand in applications such as drones, robotics and augmented reality.

An embedded vision system encompasses the entire signal chain from the image sensor to the system output, which may comprise processed or unprocessed images or information extracted from the images. The system architect must be familiar with concepts and techniques associated with the image sensor and the image-processing subsystem.

Choosing the Right Sensor

CMOS image sensors are the most widely used type in systems being developed today. Charge-Coupled Devices (CCDs), on the other hand, can deliver superior performance for high-end or specialist equipment.

When selecting either type of sensor, the first step is to determine the resolution required in terms of pixels per line and number of lines. A scientific astronomy application, for example, may require a high-resolution two-dimensional sensor, while an industrial inspection system may be best served by a line-scan approach.

Line-scan devices comprise one or a few lines of pixels in the X direction. Typical applications include inspection or Optical Character Recognition (OCR), where the camera or the target is moved to capture the image in the Y direction.

Time Delay Integration (TDI) line-scan sensors are also available, which have multiple lines of pixels in the X direction. As the target moves, the accumulated pixel value is passed from one line to the next, thereby increasing effective sensitivity to allow faster scan speed or greater low-light performance. The line transfer must be synchronized with the movement of the target to prevent smearing and image defects. Line rates can be very high, as there are only a few lines to read out.
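
The synchronization requirement can be put into numbers: the image of the target should advance by exactly one pixel row per line transfer. The following is a minimal sketch under a simple magnification model. The function name, parameters and example values are illustrative assumptions, not taken from any sensor datasheet.

```python
def required_line_rate_hz(target_speed_m_s, magnification, pixel_pitch_m):
    """Line rate at which the image moves one pixel row per line transfer.

    Assumes a constant target speed and a fixed optical magnification
    (image size / object size). Both are simplifying assumptions.
    """
    if pixel_pitch_m <= 0:
        raise ValueError("pixel pitch must be positive")
    # Speed of the projected image across the sensor surface.
    image_speed_m_s = target_speed_m_s * magnification
    return image_speed_m_s / pixel_pitch_m


if __name__ == "__main__":
    # Hypothetical conveyor: 1 m/s, 0.1x magnification, 10 um pixel pitch.
    rate = required_line_rate_hz(1.0, 0.1, 10e-6)
    print(f"Required line rate: {rate:.0f} lines/s")
```

If the actual line rate drifts away from this value, charge is accumulated from neighbouring parts of the target, which is the smearing described above.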

In a two-dimensional array containing a large number of lines, the size of the array is one factor that determines the maximum frame rate. Some sensors allow pixels to be read in parallel for enhanced performance. 2D sensors can also perform windowing or Region of Interest analysis, to read out pixels only from a particular area of the image. This can boost effective performance in applications where information of interest lies within a small area of the captured image, such as automotive Advanced Driver Assistance Systems (ADAS), or surveillance or scientific equipment.
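
As a rough illustration of how array size, parallel readout and windowing interact, the sketch below estimates an upper bound on frame rate from the pixel readout rate. It deliberately ignores blanking, exposure and interface overheads, and the function name and figures are assumptions chosen for the example.

```python
def max_frame_rate_hz(rows, cols, pixel_rate_hz, parallel_outputs=1):
    """Upper bound on frame rate when readout time dominates.

    pixel_rate_hz is the readout rate of one output channel;
    parallel_outputs models sensors that read several pixels at once.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("window dimensions must be positive")
    pixels_per_frame = rows * cols
    return pixel_rate_hz * parallel_outputs / pixels_per_frame


if __name__ == "__main__":
    full = max_frame_rate_hz(1000, 1000, 50e6)   # full array
    roi = max_frame_rate_hz(250, 250, 50e6)      # region of interest
    print(f"Full frame: {full:.0f} fps, ROI: {roi:.0f} fps")
```

On many real sensors a window mainly saves time by skipping rows, so the gain from reducing the window width may be smaller than this simple model suggests.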

Having determined the format of the imager and the resolution required, pixel pitch is the next important consideration. The pitch defines the size of the pixel, and hence the area available to collect charge created by incident photons. Pixels with a smaller pitch collect less charge in a given time period, and thus may require longer integration times to capture an image. This can impair performance in low-light conditions or with fast-moving objects.

Image-capture speed and performance under the expected lighting conditions are also governed by the sensor technology and the choice of either front-illuminated or back-illuminated type. The key parameter here is Quantum Efficiency (QE), which expresses the number of electrons produced relative to the number of photons striking the sensor. Typically, the QE of the chosen sensor should be as high as possible across the spectrum of interest. Absorption, reflection and transmission are the key factors that influence the QE of a sensor.
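
The relationship between pixel pitch, QE and integration time can be sketched as follows. This is a simplified model that assumes a square pixel with 100% fill factor and a uniform photon flux. The names and numbers are illustrative assumptions only.

```python
def collected_electrons(photon_flux_per_m2_s, pixel_pitch_m,
                        quantum_efficiency, integration_time_s):
    """Electrons collected by one pixel during the integration time."""
    if not 0.0 <= quantum_efficiency <= 1.0:
        raise ValueError("QE must lie between 0 and 1")
    pixel_area_m2 = pixel_pitch_m ** 2  # assumes 100% fill factor
    photons = photon_flux_per_m2_s * pixel_area_m2 * integration_time_s
    return photons * quantum_efficiency


def integration_time_for(target_electrons, photon_flux_per_m2_s,
                         pixel_pitch_m, quantum_efficiency):
    """Integration time needed to collect a target number of electrons."""
    electrons_per_second = collected_electrons(
        photon_flux_per_m2_s, pixel_pitch_m, quantum_efficiency, 1.0)
    return target_electrons / electrons_per_second


if __name__ == "__main__":
    flux = 1e16  # photons per square metre per second (hypothetical)
    for pitch in (5e-6, 2.5e-6):
        t = integration_time_for(1000, flux, pitch, 0.5)
        print(f"pitch {pitch * 1e6:.1f} um -> {t * 1e3:.1f} ms")
```

In this model, halving the pitch quarters the collecting area and so quadruples the integration time needed for the same signal, while a higher QE shortens it proportionally.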

In a front-illuminated sensor, photons strike the front face, but circuit features such as metallic lines or polysilicon gates can shield some pixels, resulting in lower QE. Back-illuminated sensors are an alternative type that are specially back-thinned to receive photons on the back face, thereby avoiding such obstructions. These devices typically deliver superior QE.

Understanding Sensor Noise

The next consideration is the noise allowable within the image sensor. There are three main noise sources:

  • Device Noise is temporal in nature and includes shot noise as well as noise introduced by the output amplifiers and reset circuits.
  • Fixed Pattern Noise (FPN) is spatial in nature and is related to the different responses of pixels when subject to the same illumination intensity. Among techniques to compensate for FPN, one of the most popular is correlated double sampling of the output signal.
  • Dark current is caused by thermally generated charge within the image sensor and is present even when there is no illumination. The impact of dark signal upon the final image quality is less significant at higher frame rates, because integration times are shorter. It is also temperature related, and so may be reduced by cooling the sensor using a device such as a Peltier element.
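
The temporal contributions listed above can be combined into a rough per-pixel noise budget. The sketch below uses a common simplified model in which shot noise and dark-current noise follow Poisson statistics and independent sources add in quadrature. Fixed Pattern Noise is left out, and all names and values are illustrative assumptions.

```python
import math


def snr_db(signal_e, read_noise_e, dark_current_e_per_s, integration_time_s):
    """Signal-to-noise ratio in dB for a single pixel.

    signal_e: photo-generated electrons collected during integration.
    read_noise_e: RMS noise from amplifier and reset circuits, in electrons.
    dark_current_e_per_s: dark-signal generation rate in electrons/second.
    """
    if signal_e <= 0:
        raise ValueError("signal must be positive")
    dark_e = dark_current_e_per_s * integration_time_s
    # Poisson sources have variance equal to their mean.
    noise_e = math.sqrt(signal_e + dark_e + read_noise_e ** 2)
    return 20.0 * math.log10(signal_e / noise_e)


if __name__ == "__main__":
    for t in (1.0, 0.01):
        print(f"t = {t:5.2f} s -> {snr_db(1000, 5, 100, t):.1f} dB")
```

For the same collected signal, the shorter integration time yields a higher SNR in this model because less dark signal accumulates.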

Understanding the noise model helps determine the Signal to Noise Ratio (SNR) that can be achieved. Next, the required dynamic range can be determined. Dynamic range quantifies the ability of the sensor to capture images containing both highly illuminated areas and dark areas. It is usually expressed in dB or as the ratio of the full-well capacity of the pixel (the number of electrons the pixel can hold before saturating) to the sensor readout noise. Dynamic range is often determined by performing a Photon Transfer Curve test, which plots noise against signal level up to the full-well capacity. If the device has a digital output, the dynamic range may also be limited by the number of bits in the output.
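
As a worked illustration, dynamic range in dB is commonly computed as 20·log10 of the ratio of full-well capacity to readout noise. The sketch below also compares that figure with the range an N-bit digital output can represent. The function names and example values are assumptions for illustration.

```python
import math


def dynamic_range_db(full_well_e, read_noise_e):
    """Sensor dynamic range: full-well capacity over readout noise, in dB."""
    if full_well_e <= 0 or read_noise_e <= 0:
        raise ValueError("inputs must be positive")
    return 20.0 * math.log10(full_well_e / read_noise_e)


def output_range_db(bits):
    """Largest ratio an N-bit linear output can represent, in dB."""
    return 20.0 * math.log10(2 ** bits)


def effective_dynamic_range_db(full_well_e, read_noise_e, bits):
    """The smaller of the sensor limit and the digital output limit."""
    return min(dynamic_range_db(full_well_e, read_noise_e),
               output_range_db(bits))


if __name__ == "__main__":
    # Hypothetical pixel: 20,000 e- full well, 10 e- readout noise.
    print(f"Sensor: {dynamic_range_db(20000, 10):.1f} dB")
    print(f"10-bit output: {output_range_db(10):.1f} dB")
    print(f"Effective: {effective_dynamic_range_db(20000, 10, 10):.1f} dB")
```

In this example the 10-bit output, at roughly 60 dB, is the limiting factor rather than the pixel itself.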

Remaining Criteria

The I/O standard adopted for data, command and control connections is also important, and can influence the effective frame rate. For instance, LVCMOS is not suitable for high frame rates, but is acceptable for a simple monitoring camera. Dedicated high-speed serialised LVDS links are typically used where high frame rates, resolution and bits per pixel are required.
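
A quick estimate of the raw data rate helps show why the interface matters. The sketch below multiplies resolution, frame rate and bits per pixel, then works out how many serial lanes would be needed at a given lane rate. Protocol overhead is ignored, and the 600 Mbit/s lane rate is a hypothetical figure, not a property of any particular standard or device.

```python
import math


def required_data_rate_bps(pixels_per_line, lines, frame_rate_hz,
                           bits_per_pixel):
    """Raw pixel data rate in bits per second (no protocol overhead)."""
    return pixels_per_line * lines * frame_rate_hz * bits_per_pixel


def lanes_needed(data_rate_bps, lane_rate_bps):
    """Number of serial lanes needed to carry the given data rate."""
    if lane_rate_bps <= 0:
        raise ValueError("lane rate must be positive")
    return math.ceil(data_rate_bps / lane_rate_bps)


if __name__ == "__main__":
    rate = required_data_rate_bps(1920, 1080, 60, 12)
    print(f"Raw data rate: {rate / 1e9:.2f} Gbit/s")
    print(f"Lanes at 600 Mbit/s each: {lanes_needed(rate, 600e6)}")
```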

In addition, sensors may be of colour or monochrome type, and the requirements of the application will determine the selection. A colour sensor uses a Bayer-pattern filter array on top of the pixels, alternating red and green filters on one line and blue and green on the next. The bias towards green reflects the fact that the human eye is more sensitive to green wavelengths. The true colour of each pixel is determined by post processing using results from surrounding pixels. This can reduce image resolution by up to about 20%.
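
The filter layout and the idea of using surrounding pixels can be illustrated with a small sketch. It assumes the widely used RGGB ordering (red, green on even rows; green, blue on odd rows) and estimates a missing green value by simple neighbour averaging. This is a basic illustrative interpolation, not the method used by any specific sensor or processing pipeline.

```python
def bayer_channel(row, col):
    """Colour filter over a pixel, assuming an RGGB Bayer layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def channel_fractions(rows, cols):
    """Fraction of pixels covered by each filter colour."""
    counts = {"R": 0, "G": 0, "B": 0}
    for r in range(rows):
        for c in range(cols):
            counts[bayer_channel(r, c)] += 1
    total = rows * cols
    return {name: n / total for name, n in counts.items()}


def green_at(raw, row, col):
    """Estimate green at a red or blue site from its in-bounds neighbours.

    In an RGGB layout the pixels directly above, below, left and right
    of a red or blue site are all green.
    """
    candidates = ((row - 1, col), (row + 1, col),
                  (row, col - 1), (row, col + 1))
    values = [raw[r][c] for r, c in candidates
              if 0 <= r < len(raw) and 0 <= c < len(raw[0])]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(channel_fractions(4, 4))
    raw = [[10, 20, 10, 20],
           [30, 40, 30, 40],
           [10, 20, 10, 20],
           [30, 40, 30, 40]]
    print(green_at(raw, 2, 2))  # green estimate at a red site
```

Half of the pixels carry a green filter in this layout, which is the bias towards green mentioned above.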

In a monochrome sensor each pixel receives all of the photons, as there is no Bayer pattern on top of the image array. This results in increased image sensitivity, and allows for simpler readout of the image as the de-mosaicking required for colour reconstruction is not needed.

If the selection procedure points to a CMOS image sensor, bear in mind that these are in reality complex, dedicated SoC devices that present designers with additional choices and design considerations. The integration time, for instance, must usually be configured by writing to a register via the command interface. In addition, various shutter modes are often available, such as global shutter mode, which enhances capture of fast-moving targets at the expense of relatively high noise, or rolling shutter mode, which reduces noise but also limits high-speed capability.
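
Many CMOS sensors express integration time as a whole number of line periods, so the value written to the register has to be derived from the desired time. The sketch below shows that conversion, together with the readout skew between the first and last row in a rolling shutter. The line period, register width and function names are hypothetical. The sensor datasheet defines the actual register format.

```python
def integration_lines(integration_time_s, line_period_s, max_lines=65535):
    """Convert an integration time to a register value in line periods.

    Assumes the sensor counts exposure in whole lines and accepts values
    from 1 to max_lines (a hypothetical 16-bit register by default).
    """
    if line_period_s <= 0:
        raise ValueError("line period must be positive")
    lines = round(integration_time_s / line_period_s)
    return max(1, min(lines, max_lines))


def rolling_shutter_skew_s(rows, line_period_s):
    """Delay between the start of exposure of the first and last rows."""
    return (rows - 1) * line_period_s


if __name__ == "__main__":
    line_period = 20e-6  # hypothetical 20 us per line
    print(f"Register value for 1 ms: {integration_lines(1e-3, line_period)}")
    skew = rolling_shutter_skew_s(1080, line_period)
    print(f"Rolling shutter skew over 1080 rows: {skew * 1e3:.2f} ms")
```

The skew figure gives a feel for why rolling shutter limits high-speed capture: a target that moves appreciably within that interval appears distorted.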

This article has explored several key aspects of the image sensor as the first stage in the complex signal chain at the heart of a modern vision-based system. The next article in this series will look at the post-sensor signal-processing requirements and potential solutions.

For more information, please visit: https://www.xilinx.com/products/design-tools/embedded-vision-zone.html
