Elphel camera under the hood: from Verilog to PHP
by Andrey N. Filippov (Jan. 13, 2009)







Background

When I started Elphel in 2001 I was inspired by the effectiveness of FOSS that I had witnessed among my colleagues (I did not know anything about GNU/Linux before that). I bought into the idea of developing only the new things and reusing whatever others had already done before me - the same "Standing on the Shoulders of Giants" metaphor already proven in science, probably the most powerful instrument of the human mind. I immediately wanted to apply this approach, which had already gained momentum in the software world, to the hardware devices I had been developing through most of my career. I also considered it a good business opportunity, because as a user of camera systems myself I had many times encountered the same problem: off-the-shelf products are often very close to your requirements, but do not match them exactly, and it is usually difficult, if not impossible, to make that last step and modify or improve an otherwise good product.

And yes - it did work, even better than I expected: in just about three months there was the first generation model 303 camera running GNU/Linux, borrowing functionality from the operating system itself, from other free software included in the software distribution (the Axis Communications SDK), and from other applications that I ported to this CPU architecture (arch/cris) myself. Of course some code still had to be written, but only the code specific to the brand new hardware that nobody but me had ever had before. And this newborn camera already had more functionality than I could have achieved in a year of developing a ground-up in-house system.

Being a hardware manufacturer, we have an advantage over pure software projects in that we can make a profit selling tangible goods, but there is another side to that: our developer community is smaller, because an active developer has to have the actual hardware, which is more expensive to "duplicate". We are also busy developing new products and providing support to existing customers. This, and the fact that the freedom (and so the flexibility and the possibility to customize) of our designs is itself of value to our customers, allows us to keep all our hardware designs free as in software (they are provided under either the GNU GPL or GNU FDL licenses). Early in 2004 we started a project at SourceForge, and it has since been Elphel's software repository, both for distribution to our users and for team collaboration; all Elphel software and FPGA code is released under GNU GPL v3.0, and the same code we use internally is available for download and in the CVS repository for any GPL-compliant use.

From the very beginning we tried to make developer-friendly, customizable products and experimented with different technologies - you can find their descriptions in the previous LinuxDevices publications listed at the end of this article. Most of these experiments proved the viability of the technologies used, and the camera project grew incrementally, accommodating new sensors and other electronic components and adding more features to the FPGA and the software code. In parallel we gained experience in designing network cameras that combine the power of free software running on a general purpose CPU with the performance and flexibility of "hardware" processing implemented in a reconfigurable FPGA using GPL-ed code written in Verilog HDL. As the project grew, I felt a stronger and stronger temptation to redesign the core software that runs in the camera and put the ideas we had collected while working on the cameras into its foundation. Last year we finally did it - we rewrote the key parts of the FPGA code and the camera software to make them what I believe they should be. Below I describe the current state of the camera project and how the current software release (8.0) differs from the previous ones. It is the Elphel camera that is described here, but I hope this article may help other camera developers who are now at the same early project stage as I was seven years ago to use our experience and skip some of the intermediate stages we went through.

Camera hardware architecture

Starting with the first camera (model 303) developed at Elphel in 2001, and especially after we adopted a Xilinx reconfigurable FPGA a year later, the overall camera architecture has remained stable. It was designed to be flexible enough to accommodate different sensors and extension boards; the main system board is essentially a universal computer running GNU/Linux with the addition of the FPGA and a dedicated video buffer memory. The FPGA is used for image processing and compression (and does it much faster than the CPU); it is also used for interfacing sensors and daughter boards, simplifying support of devices that were not yet available when the system boards were designed and manufactured. The block diagram of a typical Elphel camera includes the camera processor board (10353), a CMOS sensor front end (10338), and an optional I/O extension board (10369) with adapter boards (103691, 103692). Other configurations may include CCD sensor front ends, mechanical shutter control, and additional FPGA processing/multiplexer boards. Either the universal I/O extension board or just smaller sync boards can be used to lock (synchronize) multiple cameras for stereo or panoramic applications.

Elphel model 353 camera: hardware and FPGA layers


The diagram shows the camera's main hardware components, I/O ports and internal connections between the boards. It also includes details of the major processing modules implemented in the FPGA - the intersection of the hardware and software domains.

Code layers in the camera

Software that runs in the camera spans multiple layers of the overall system hierarchy and combines code developed at Elphel with other free software. We use the KDevelop IDE to navigate through all this code, compile and build drivers and applications, launch FPGA code simulation and generate the camera flash image ready to be installed. Only the "silicon compilation" - generation of the FPGA configuration file from the source code - is handled by separate software (non-free, but free for download): Xilinx WebPack.
  • Verilog HDL code in the FPGA performs the most computationally intensive tasks in the camera; it handles all of the image processing/compression operations, providing a compressed bitstream ready to be sent over the network or recorded on storage media. The compiled FPGA images are conveniently loaded by the CPU through its GPIO pins connected to the JTAG port of the FPGA - that can be done an unlimited number of times, so for example it is easy to add 'printf' equivalents to the code while troubleshooting. Unlike the software code run by the CPU, the Verilog code statements are compiled to different physical parts of the chip and are executed in parallel, so there is virtually no performance penalty when adding more code as long as there are some FPGA resources left. The FPGA code directly interfaces with the sensor front ends (usually through the I2C bus for sensor commands and a parallel data input for images) and the optional extension boards, and it includes a multichannel external DDR SDRAM controller for the dedicated video buffer memory. It shares the system bus with the CPU and communicates with the CPU software in PIO mode (to read/write control registers and write table data) as well as in DMA mode for transferring compressed frames to the system memory. The current (8.0) software supports scheduling of the internal register writes as well as of the sensor I2C commands so that they are activated for the specified frame.

  • Kernel drivers make up the next software layer in the camera (and the lowest one of the code that runs on the CPU). These drivers supplement the standard ones (i.e. network, IDE, USB) and provide a software interface to the FPGA modules (i.e. gamma correction, histograms, color converter, compressor) and to external devices (like the image sensor) connected to the FPGA. As the main camera function is to stream or record video that can run at a high frame rate, the related drivers are designed to relax the real-time requirements on the application layer software. Such drivers rely on the interrupts generated by the FPGA code for each frame transferred from the sensor and/or compressed and stored in the system memory. The software makes use of the "hardware" command queues (the multiframe I2C sequencer and the multiframe register write sequencer) implemented in the FPGA, as well as of a large output buffer for the compressed video. Such command/data buffering facilitates camera control and video data handling by the CPU software and its synchronization with the stream processing in the FPGA, making applications tolerant to pauses caused by I/O events (i.e. waiting for the mass storage device) or just long calculations.

  • Application layer consists of standard programs like the Busybox collection and web, ftp, telnet and ssh servers. There are also camera-specific applications, such as:
    • imgsrv - a fast image server that avoids extra memory copying and serves JPEG images (including multi-part ones) directly from the circular video buffer in response to HTTP GET requests. It also implements buffer navigation (the buffer holds several seconds of full frame rate video) and Exif image metadata browsing (including GPS and compass data if attached).
    • camogm - an application for recording video (and audio, if a USB audio adapter is available) to any of the attached mass storage devices, such as Compact Flash cards or a SATA (or 1.8" ZIF) HDD. It uses the system circular video buffer and makes sure there are no frame drops, i.e. when closing the previous file and opening the next one in a sequence. The program runs as a daemon accepting commands over a named pipe and can be integrated into web applications such as camogmgui.
    • str - a unicast/multicast RTSP video server capable of streaming full sensor frame video (plain RTP has a limit of 2040 pixels for image width or height); the stream can be rendered with media players such as MPlayer or VLC (provided they use the current live555 library and include the recent improvements that overcome earlier restrictions on the frame size). The streamer can work in a "nice" mode when used as a viewfinder while video is recorded with camogm, skipping frames to prevent recorded frame drops if the CPU resources are insufficient. It is designed to tolerate short frame size changes, making it possible to reprogram the sensor and acquire a full resolution snapshot while simultaneously streaming lower resolution (but higher frame rate) video.
    • autoexposure - a daemon running in the camera that uses histograms calculated by the FPGA code inside a specified sub-window of the image frame to automatically adjust sensor exposure time and white balance. It has multiple programmable parameters to achieve flexible regulation suitable for a particular application; most of these parameters can be adjusted through the camvc AJAX web GUI.

  • Web applications and scripts. Being network devices, Elphel cameras have always relied on a web interface for operating them, as described in "AJAX, LAMP, and liveDVD for a Linux-based camera", but it was only after the last system upgrade to the model 353 (the full cameras timeline is available on Elphel Wiki), which included a faster CPU (ETRAX FS) and 64MB of system memory, that we decided to use a scripting language for the top layer of the camera software. For that purpose we replaced Boa (the web server provided in the Axis SDK) with Lighttpd, known to work nicely with FastCGI, which we needed to run PHP efficiently. This technology allows several copies of PHP to be kept running, ready to serve HTTP requests, so the interpreter (>2MB in the camera) does not need to be restarted for each new one. As we primarily target our products at users who are likely to want to customize them for their specific applications, we looked for a scripting language that is easy to use, efficient and already familiar to many developers - that is why we decided in favor of PHP (Tiobe Index).

    PHP is very well documented on the php.net web site, where the official reference is supplemented by multiple user-provided real-life examples; it is definitely possible to start writing working code (including code running in the cameras) the first day you try it. Execution efficiency is achieved by the abundance of dedicated functions that handle many of the programmer's tasks in a single call - both included with PHP itself and provided separately as extensions. At Elphel we made use of the same approach by providing camera-specific extension functions that allow easy access to the low-level and hardware functionality without the penalty of being too slow to be useful for an embedded application. When starting to write the PHP extension to run in the camera I used a nice online tutorial by Sara Golemon, Extension Writing Part I: Introduction to PHP and Zend (there is also a book, "Extending and Embedding PHP", written by the same author).

    As the camera project is a work in progress, and both the hardware (FPGA Verilog code) and the drivers may change after some PHP programs have already been created, we paid particular attention to synchronization between the different layers of the software. Parameter definitions in the driver code are exported as PHP constants, and their symbolic names are available throughout the extension functions. In addition to the individual names of the 32-bit parameters defined in the driver header file, the extension handles composite names that include an optional offset in the register file (i.e. hardware-specific CMOS sensor registers) and bit-field selection.

    There are multiple scripts installed in the camera that are accessible through the lighttpd web server (with FastCGI), including those designed to be part of AJAX applications; others just provide access to the camera hardware (like the I2C bus to the clock/calendar, identification EEPROM or thermometer). Many of the scripts are intended for development - from creating control interfaces to fine-tuning driver parameters, monitoring internal data structures and profiling the interrupt service routine. And of course it is easy to create and upload (or is it download in this context?) a custom script to the camera - either to the camera RAM-disk (tmpfs) to safely try it out, or to the camera flash memory to stay longer and survive power cycles.

    In addition to being used as a back end for web applications, PHP is also used in CLI mode, powering multiple hardware initialization scripts, such as those for reading I2C devices, discovery of the FPGA add-on boards using their JTAG port, and preparation of the Exif templates for the images/video that match the camera hardware capabilities and optional peripherals (compass and/or GPS modules). In this capacity such scripts partially replace regular shell scripts. I believe PHP code is easier to read, understand (and, of course, modify) than shell scripts for many people who do not deal with shell scripts professionally and regularly. I know that professionals, on the other hand, can often write shell scripts that are smaller and more efficient, but for me it usually takes too much time reading manuals and googling for examples to write just a few lines of nice shell code.
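
    As a small illustration of this kind of camera-side PHP, the sketch below fetches the current JPEG from imgsrv and stores it on the camera RAM-disk. It is only a sketch: the imgsrv port and URL path (as well as the PHP binary location) are assumptions used for illustration, not a documented interface.

        #!/usr/local/bin/php -q
        <?php
        // Sketch: grab the latest frame from imgsrv and save it to tmpfs.
        // The port (8081) and the path (/img) are assumptions for illustration.

        $url  = 'http://127.0.0.1:8081/img';   // imgsrv running on the camera itself
        $file = '/var/tmp/snapshot.jpg';       // RAM-disk, so the flash memory is not worn out

        $image = file_get_contents($url);      // imgsrv serves the frame straight from the circular buffer
        if ($image === false) {
            fwrite(STDERR, "Could not read $url\n");
            exit(1);
        }
        file_put_contents($file, $image);
        printf("Saved %d bytes to %s\n", strlen($image), $file);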

New features of the camera software

The software running inside Elphel cameras (including the FPGA code) has survived multiple upgrades of all of the major hardware components: CPU, FPGA, memories and Ethernet PHY. It grew incrementally, adding support for new hardware (i.e. more sensors as they became available) and additional processing modes, but the basic operation did not change. Last year (2008) we implemented several changes to the code running in the cameras. The first step (it actually started a year earlier) was to separate reading of the circular video buffer (used when serving images and video over the network and when recording to mass storage devices) from controlling the acquisition of video frames into that buffer, which is subject to hardware dependent conditions and frame latencies. That was the first step towards supporting pipelined operation of the camera modules, and several applications were created that made use of the new operation of the video buffer. The next stage involved a major redesign of the frame control/acquisition part of the camera operation which, together with the additional color processing modes, made up the core of the new 8.0 software that is now used with the cameras.

  • Camera control tailored to the pipeline operation

    The driver in the earlier software supported two sets of acquisition parameters: imageParamsW[ ] - parameter values requested by applications (validated and modified by the driver to the nearest value supported by the hardware), and imageParamsR[ ] - the array of parameters currently used by the driver. When an application (usually ccam.cgi - a CGI program called from the web server) wanted to change some acquisition parameter(s), it changed elements of the imageParamsW[ ] array and then called a driver function to program the sensor and/or compressor. Most parameters had several frames of latency (from the moment a parameter was changed until the influenced image was registered in the system memory), so the programming function had to use interrupts and the application had to be able to wait for the requested image to become ready.

    Such a model worked well enough when everything was programmed once and then new images (or frames in a video stream) were simply acquired with the same settings. It was possible to change some parameters on the fly (i.e. exposure or analog gains) that did not disturb the following pipeline of the camera data processing:

    sensor registers setup -> sensor video frame out -> FPGA image preprocessing -> video frame buffer -> FPGA compressor -> DMA -> system memory -> frame output

    Most other parameters - even a vertical pan of the sensor window of interest (WOI) - often caused the sensor to output a corrupted frame, and the pipeline operation was interrupted as well. The earlier software could not handle the propagation of related changes through the pipeline, as it was only aware of the "requested" and "actual" value pair for each parameter. A real-life change of a parameter (image WOI height or similar) involved more stages: a sensor register change, an FPGA pre-processor height change, re-programming of the video buffer write and read controllers, a change of the compressor number of tiles, and a change of the output JPEG image height (in the header) - all happening at different times, with different frame latencies. The software learned how to handle some specific cases, but often the only way to change some of the acquisition parameters was to shut everything down, reset the FPGA memory controller and compressor, reprogram them and re-start the pipeline operation. That caused multi-frame interruptions; the output video buffer also had to be reset (as the FPGA source could have been reset after sending out an unknown part of a frame), and that affected all the applications that relied on that buffering.

    The 8.0 software redesign was from the very beginning targeted at supporting pipelined operation of the camera hardware and FPGA modules, in order to eliminate the need for acquisition restarts and to minimize the number of lost frames. Instead of having just two values per parameter, this software maintains individual parameter values for each frame - they are set by applications in advance, so the driver has time to resolve possible parameter dependencies and to compensate for the latencies in all of the pipeline stages. In the actual implementation there are just eight parameter values maintained by the driver - six frames in the future, the current frame and the previous one; a subset of the parameters is copied to a deeper buffer so that the data is preserved longer after the particular frame acquisition.

    Applications write acquisition parameters for the specified frame through the driver write() calls, and such calls are atomic, making sure that all the parameters written are applied to the same frame. Many such parameter modifications involve scheduling actions that are evaluated by the driver when serving frame interrupts. The driver uses parameter action tables for such scheduling; these tables are compiled into the driver but are still available for editing at run time through the web interface. This feature is convenient during software development and when modifying the code to accommodate new hardware. Each action has a related latency - the number of frames between the moment the related commands are sent to the sensor (or applied to the FPGA modules) and the moment the first frame influenced by this action appears in the output circular video buffer. The action latency tables (they are different for free running and externally synchronized sensors) are provided similarly to the action tables - both as compiled-in constants and as run-time editable parameters available through the web interface.

    There are 32 different actions implemented in the software (so a single 32-bit word can be used to specify all the actions needed for a frame). Each of them includes a sensor-agnostic function that deals only with the FPGA and the software common to every sensor front end. Such a function is optionally followed by a sensor-specific one that is dynamically linked during sensor identification, so only the sensor-specific functions need to be added for each new sensor front end.

    Scheduled actions are processed by the driver during the once-per-frame interrupts, at least "latency" (as defined above) frames before the target frame (the frame that is to appear at the output with the new parameters applied). An additional programmable parameter specifies how many frames ahead of the last possible frame the driver is allowed to execute the actions. Doing this ahead of time makes the driver tolerant to missed interrupts (which could happen at very high FPS with a small WOI and many parameter changes applied to the same frame), as it relies on the hardware command queues. Each action involves validation of the input parameters and their modification if they are not compatible with the hardware capabilities or other imposed restrictions; in that case the modified parameters can also trigger more actions. Parameter validation is normally followed by the calculation of the hardware register values that need to be sent to the sensor and FPGA modules, and finally these values are written to one of the hardware command queues.

    The command queues are implemented in the FPGA (the multiframe I2C sequencer and the multiframe register write sequencer in the block diagram). Each of the two sequencers can store up to 64 commands in each of its seven individual queues - for the current and the next six frames - and then output them (to the sensor and to the internal FPGA modules, respectively) right after the beginning of the scheduled frame in the sensor (commands for the current frame are sent out immediately).

    Read access to the driver parameters does not involve any actions as write access does, so it is implemented as a memory mapped array accessible from applications, including PHP. The PHP process opens the related files and maps the data arrays during initialization, and as it is configured to work in FastCGI mode, many HTTP requests are served without re-initialization of the driver data access. Parameter symbolic names defined in the driver header file are exported to the PHP extension as PHP constants, and the extension functions are able to process array arguments with key names automatically derived from the driver parameter names. The PHP extension functions use the driver write() function to set the camera acquisition parameters and direct mmap() access to read them. The driver also maintains an array of global parameters that are not linked to particular frames and do not cause any actions when written to. Those parameters are accessible with mmap() for both direct reading and writing; other extension functions use similar access to other driver structures, such as the gamma tables used for pixel data conversion and the image histograms calculated in the FPGA.
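
    To give a feel for how this looks from a script, here is a sketch of setting the exposure for a future frame and reading back the value actually used. The function names (elphel_get_frame, elphel_set_P_value, elphel_get_P_value) and the constant ELPHEL_P_EXPOSURE are illustrative assumptions standing in for the real extension API; the scheme they illustrate is the one described above.

        <?php
        // Sketch only: the function and constant names below are hypothetical.
        $ahead = 3;                              // program 3 frames into the future
        $frame = elphel_get_frame() + $ahead;    // absolute number of the target frame

        // The write is atomic and scheduled: the driver resolves dependencies and
        // latencies so that the new exposure takes effect exactly at $frame.
        elphel_set_P_value(ELPHEL_P_EXPOSURE, 10000, $frame);   // exposure value (units assumed)

        // Reads go through the mmap-ed parameter pages, so they are cheap and can
        // be repeated per HTTP request without re-opening the driver.
        $actual = elphel_get_P_value(ELPHEL_P_EXPOSURE, $frame);
        printf("Frame %d will use exposure = %d\n", $frame, $actual);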

  • Color processing in Elphel cameras

    This section describes several modes of handling color information used in Elphel cameras. These modes offer additional options for balancing between image quality and compression ratio, and between real-time in-camera color processing compatible with standard viewers and post-processing on the host computer that results in higher quality of the images recorded with regular sensors that have color mosaic filters.

    Most color image sensors (including those used in Elphel cameras) rely on Bayer pattern filters, so each pixel can only detect a single color of the three (RGB), as shown below:


    [Illustration, three panels: the actual image (the data that reaches the sensor); the same image as it gets to the sensor pixels through the Bayer color filters; and the same data with the visible color removed - each pixel is monochrome anyway, as it can detect only a single color]

    Image registration by the sensor through the Bayer color mosaic filters


    It may seem that a lot of information is lost (2/3, if only one of the three color components is left in each pixel), but modern algorithms can restore the missing color components with good precision. High quality algorithms use multi-pass calculations and need information from pixels far from the one where the colors are to be interpolated. Such processing is a challenge to implement in the FPGA, as the data transfer between the FPGA and the external buffer memory (the on-chip resources are not enough to hold a high resolution video frame) can become a natural bottleneck. Elphel cameras use much simpler processing: the data for JPEG compression comes from the buffer memory formatted as 20x20 overlapping tiles, so that each tile corresponds to one 16x16 pixel macroblock (MCU) for JPEG compression as YCbCr 4:2:0. This mode means that for each 16x16 pixels there will be four 8x8 pixel blocks representing intensity (Y) and two additional 8x8 blocks encoding color. Color is represented as Cb (difference between blue and green) and Cr (difference between red and green), each having half the spatial resolution in both the vertical and horizontal directions compared to the original pixels (and the Y component).



    [Illustration: sensor Bayer data (left) and the six 8x8 pixel blocks ready for compression (right) - four intensity blocks (Y0-Y3) and two differential color components (Cb and Cr)]

    Color processing to convert Bayer pixel data into color YCbCr 4:2:0 ready for JPEG or Ogg Theora compression


    Additional pixels in the tiles (two rows from each of the four sides) are provided by the memory controller to interpolate pixels near the edges of the 16x16 macroblock; that would allow 5x5 pixel areas to be used to calculate each macroblock pixel's missing color components, but the current code uses only a simple 3x3 bilinear interpolation, disregarding the outermost pixels. This code combines the Bayer pattern interpolation and the RGB->YCbCr conversion in a single step: the resulting values of Y, Cb and Cr are calculated directly from the input Bayer data, as illustrated by the image above.

    If you click on that image (and the similar images below) to enlarge it, hovering the mouse pointer over a pixel block will show the coordinates, the values of the pixels, and how they are calculated. You may also try the PHP script used to generate those images and apply it to other samples (including your own ones).
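
    That script is not reproduced here, but a simplified sketch of the conversion it visualizes is shown below: a 3x3 bilinear interpolation of the Bayer mosaic combined with an RGB->YCbCr conversion. The Bayer phase (G at the top-left corner, R to its right, B below it) and the use of the standard JPEG conversion coefficients are assumptions for illustration (the FPGA uses its own fixed-point arithmetic); border pixels are skipped for brevity.

        <?php
        // Sketch: bilinear demosaic of a Bayer mosaic (G R / B G phase assumed)
        // combined with a standard RGB -> YCbCr conversion, interior pixels only.
        function bayer_to_ycbcr(array $bayer, $w, $h) {
            $out = array();
            for ($y = 1; $y < $h - 1; $y++) {
                for ($x = 1; $x < $w - 1; $x++) {
                    $c     = $bayer[$y * $w + $x];
                    $hor   = ($bayer[$y * $w + $x - 1] + $bayer[$y * $w + $x + 1]) / 2;
                    $ver   = ($bayer[($y - 1) * $w + $x] + $bayer[($y + 1) * $w + $x]) / 2;
                    $cross = ($hor + $ver) / 2;     // average of the 4 nearest neighbors
                    $diag  = ($bayer[($y - 1) * $w + $x - 1] + $bayer[($y - 1) * $w + $x + 1]
                            + $bayer[($y + 1) * $w + $x - 1] + $bayer[($y + 1) * $w + $x + 1]) / 4;
                    if ((($y & 1) == 0) && (($x & 1) == 0)) {      // G on a G/R row
                        $r = $hor;  $g = $c;     $b = $ver;
                    } elseif (($y & 1) == 0) {                     // R pixel
                        $r = $c;    $g = $cross; $b = $diag;
                    } elseif (($x & 1) == 0) {                     // B pixel
                        $r = $diag; $g = $cross; $b = $c;
                    } else {                                       // G on a B/G row
                        $r = $ver;  $g = $c;     $b = $hor;
                    }
                    // Standard (full-range JPEG) RGB -> YCbCr
                    $out[$y * $w + $x] = array(
                        'Y'  =>  0.299   * $r + 0.587   * $g + 0.114   * $b,
                        'Cb' => -0.16874 * $r - 0.33126 * $g + 0.5     * $b + 128,
                        'Cr' =>  0.5     * $r - 0.41869 * $g - 0.08131 * $b + 128,
                    );
                }
            }
            return $out;
        }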

    Such simple color processing works well enough for many video monitoring applications, but it produces unacceptable color artifacts when the object contains sharp gradients - which is the case during document scanning. There are also video applications that do not require real-time availability but need the best image quality the hardware can provide (i.e. filming). In both cases it would be nice to have the raw pixel data, but that requires a higher bandwidth for data transmission (in our case limited to the 100Mbps of the available Ethernet connection) or recording.

    We first needed to provide raw Bayer data while maintaining an acceptable frame rate on our previous model 323 camera, where the network speed was additionally limited by the on-chip MAC of the earlier Axis ETRAX 100LX processor - there were two network connectors and two of the 10313 boards operating in parallel to increase the overall data rate. It was then that I tried to modify the already implemented JPEG compression chain, reusing the developed code to make it suitable for Bayer data compression. Of course it was possible to just treat the Bayer pixels as if they were data from a monochrome sensor (the rightmost picture in the "Image registration by the sensor through the Bayer color mosaic filters" illustration above), but that would produce very inefficient JPEG compression. Such compression (and many similar ones) is designed so that the high spatial frequency components are sacrificed first, as they carry the least of the perceivable information. But if the sensor is actually a Bayer color one, not a monochrome one, then there are a lot of high frequency components even if there are no sharp gradients in the image. For a colored object there will be a repetitive pattern at the maximal frequency - odd and even pixels will have different values.



    [Illustration: sensor Bayer data (left); re-arrangement of the pixels that groups each color component together (center); six 8x8 pixel blocks ready for compression (right) - four intensity blocks (Y0-Y3) and two dummy differential color components (Cb and Cr) needed for compatibility with standard JPEG decoders]

    Preparation of the sensor raw Bayer data for efficient JPEG compression (JP4 mode)
    How many bits are really needed in the image pixels?

    Modern CMOS image sensors provide high resolution digital output; mainstream ones had 10 bits and now many have 12 bits. When CCD sensors are used, the same (or higher) resolution is provided by a separate ADC or an integrated CCD signal processor. That resolution is significantly higher than the one used in popular image and video formats - most are limited to just 8 bits per pixel (or per color channel, depending on the image format). To catch up with the sensor and provide a higher dynamic range, many cameras (and their users) switch to formats that support a higher number of bits per pixel; some of them use uncompressed raw data with the full (sensor) dynamic range. Such formats use significantly more space on the storage devices and more bandwidth for transmission.

    Do these formats always preserve more of the information registered by the sensors? If the sensor has a 12 bit digital output, does that mean that with an 8-bit JPEG the four least significant bits are just wasted, and that a raw format is required to preserve them?

    In most cases the answer is "no". As sensor technology advances the pixels get smaller and smaller, approaching the natural limit of the light wavelength - it is now common for them to be less than 2x2 microns (i.e. in the sensors used in mobile phone cameras). One of the consequences of small pixels is a reduced Full Well Capacity (FWC) - the maximal number of electrons that each pixel can accommodate without spilling them out. Why is that important? Because of the shot noise - variations of the number of electrons (there is always an integer number of them; there can be no ½ electron). That noise is caused by the quantum nature of the electric charge itself; there is no way to eliminate or reduce it for a particular pixel measurement. This noise is proportional to the square root of the total number of electrons in a pixel, so it is highest when the pixel is almost full.



    Pixel output uncertainty caused by the shot noise

    If we try to keep track of a pixel value under ideal conditions, with the same illumination and the same camera settings, the measured pixel value will still change from frame to frame. The picture above shows that for a hypothetical sensor with an FWC of just 100; in real sensors the relative uncertainty is smaller, but it can still significantly reduce the amount of information that we can receive from the sensor. Our measurements show that a typical Micron/Aptina 5 Megapixel sensor, the MT9P031 with 2.2x2.2 micron pixels, has an FWC of ≈8500 electrons. That means that when the pixel is almost full, its value will fluctuate as 8500±92, or by more than ±1%. Such a fluctuation corresponds to 44 counts of the 12 bit sensor ADC.
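
    These figures follow directly from the square root relationship; a few lines of PHP reproduce them:

        <?php
        // Quick check of the numbers quoted above for a 12-bit sensor with FWC ≈ 8500 e-.
        $fwc        = 8500;
        $adc_counts = pow(2, 12);            // 4096
        $shot_noise = sqrt($fwc);            // ≈92 electrons near saturation
        $e_per_cnt  = $fwc / $adc_counts;    // ≈2.08 electrons per ADC count
        printf("shot noise: +/-%.0f e- (%.1f%%), or %.0f ADC counts\n",
               $shot_noise, 100 * $shot_noise / $fwc, $shot_noise / $e_per_cnt);
        // prints approximately: shot noise: +/-92 e- (1.1%), or 44 ADC counts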

    So is there any real need for such a high resolution ADC when some 5-6 LSBs don't carry any real information? Yes, it is still needed, because modern sensors have really low readout noise, and in the darks a single ADC count is meaningful - the square root of zero is zero (actually, even with no light at all there are some thermal electrons that get into the pixels). With ADC counts carrying a different information payload for small and large signals, it is possible to recode the pixel output, equalizing the information value of the output counts.



    Non-linear conversion of the sensor output

    This non-linear conversion assigns incremental numbers to each pixel level that can be distinguished from the previous one in a single measurement, so for a small signal (in the darks) each next ADC output value gets the next number, while for large signals one number covers more than 40 ADC counts. Such a conversion significantly reduces the number of output values (and so the number of bits required to encode them) without sacrificing much of the pixel values. The form below calculates the effective number of bits for different sensor parameters and a given ratio of the encoder step to the noise value at that output level.

    [Interactive form: inputs - ADC resolution (bits), sensor full well capacity (e-), sensor readout noise (e-), step-to-noise ratio; outputs - number of distinct levels, effective number of bits]
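
    The interactive form is not reproduced here, but a PHP sketch of this kind of calculation is shown below. It assumes that each encoded level spans the chosen step-to-noise ratio times the combined readout plus shot noise at that level, and that the step is never finer than one ADC count; the exact formula behind the original form may differ.

        <?php
        // Sketch: walk from zero signal up to the full well, making each encoded
        // level span step_ratio times the noise at that level, where
        // noise = sqrt(readout_noise^2 + signal) (readout noise plus shot noise).
        function effective_bits($adc_bits, $fwc, $read_noise, $step_ratio) {
            $e_per_count = $fwc / pow(2, $adc_bits);  // electrons per ADC count (ADC assumed to span the FWC)
            $signal = 0.0;                            // electrons
            $levels = 1;                              // a level for zero signal
            while ($signal < $fwc) {
                $noise = sqrt($read_noise * $read_noise + $signal);
                $step  = max($step_ratio * $noise, $e_per_count);  // not finer than 1 ADC count
                $signal += $step;
                $levels++;
            }
            return array('levels' => $levels, 'bits' => log($levels, 2));
        }

        // Example with the numbers quoted in the text: 12-bit ADC, FWC ≈ 8500 e-, ~4 e- readout noise
        $r = effective_bits(12, 8500, 4, 1.0);
        printf("distinct levels: %d, effective number of bits: %.1f\n", $r['levels'], $r['bits']);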

    Optimal encoding and gamma correction

    Luckily, nothing special has to be done to utilize the non-linear encoding that is optimal for maintaining a constant noise to output count ratio, as described above. All cameras incorporate some kind of gamma correction in the signal path. Historically it was needed to compensate for the non-linear transfer function of the electron guns used in CRT monitors (television receivers): cameras had to apply a non-linear function so that the two functions applied in series (camera + display) produced an image perceived to have a contrast close to that of the original. Most CRTs have now been replaced by LCD or other displays that do not have any electron guns, but that correction is still in use. It is not just for backward compatibility - gamma correction (also called gamma compression on the camera side) does a nice job of transferring a higher dynamic range signal even when the signal itself is converted into digital format. Different standards use slightly different values of gamma - usually in the range of 0.45-0.55 on the camera side. And compression with gamma=0.5 is exactly the same square root function shown above, optimal for encoding in the presence of shot noise. With this kind of gamma encoding a full well capacity of several hundred thousand electrons would be needed to have 12 bits of meaningful data per pixel in the image file. Such high FWC values are available only in CCD image sensors with very large pixels.
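
    As a small illustration (the camera's actual gamma tables are maintained by the driver and have their own format), a gamma=0.5 lookup table for a 12-bit sensor can be built in a few lines:

        <?php
        // Sketch: gamma compression with gamma = 0.5 as a lookup table mapping
        // 12-bit linear sensor values to 8-bit output, out = 255 * sqrt(in / 4095).
        // Each output step then roughly tracks the shot noise of the corresponding input level.
        $gamma_table = array();
        for ($i = 0; $i <= 4095; $i++) {
            $gamma_table[$i] = (int) round(255.0 * sqrt($i / 4095.0));
        }

        // Example: the ±44 ADC count shot-noise band near saturation (see above)
        // collapses into just a couple of output codes.
        printf("in=4051 -> %d, in=4095 -> %d\n", $gamma_table[4051], $gamma_table[4095]);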

    Measuring the sensor full well capacity

    Earlier I wrote that the Aptina sensors we use have an FWC of ≈8500e-. Such data was not provided to us by the manufacturer and it is not included in the (at least openly available) documentation - we measured it ourselves. The FWC value is important because it influences the camera performance, so by measuring the camera performance it is possible to calculate the FWC. We did it with our cameras; the same method can be applied to most other cameras too. To make such a measurement you need to be able to control the ISO setting (gain) of the camera and acquire a long series of completely out of focus images (with the lens removed if possible; if not, with a completely open iris and a uniform target). ISO (gain) should be set to minimum and the exposure adjusted so that the area you will analyze has a pixel level close to maximal. Use natural lighting or DC-powered lamps/LEDs to minimize flicker caused by AC power. If the sensor is a color one, use only the green pixels (with incandescent lamps - the red ones): they will have the minimal gain. Then measure the differences between the same pairs of pixels in multiple frames (we used 100 pairs in 100 frames) and find the root mean square of the variation of the difference in each pair, then divide it by the square root of 2 to compensate for the fact that you are using differences in pairs, not the pixel values themselves (pairs make this method more tolerant to fluctuations of the total light intensity and to some uncontrolled parameter changes in the camera). If it is possible to turn off the gamma correction (as it was in our case), the ratio of the pixel value to the measured root mean square of the variation will be equal to the square root of the number of electrons; if the gamma correction cannot be controlled, you may assume it is around 0.5.
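
    A PHP sketch of that calculation is shown below. It assumes the gamma correction is off (so the pixel values are linear ADC counts) and that the pixel pairs have already been extracted from the frames; the data layout is an assumption for illustration.

        <?php
        // Sketch of the FWC estimate: $frames[$f] = array(array(p1, p2), ...) holds
        // the values of the same pixel pairs in every frame of the series.
        function estimate_fwc(array $frames) {
            $num_frames = count($frames);
            $num_pairs  = count($frames[0]);
            $sigmas = array(); $means = array();
            for ($p = 0; $p < $num_pairs; $p++) {
                $diffs = array(); $vals = array();
                foreach ($frames as $pairs) {
                    $diffs[] = $pairs[$p][0] - $pairs[$p][1];
                    $vals[]  = ($pairs[$p][0] + $pairs[$p][1]) / 2;
                }
                // RMS of the frame-to-frame variation of this pair's difference
                $mean_diff = array_sum($diffs) / $num_frames;
                $var = 0.0;
                foreach ($diffs as $d) { $var += ($d - $mean_diff) * ($d - $mean_diff); }
                $sigmas[] = sqrt($var / $num_frames) / sqrt(2);   // back to single-pixel noise
                $means[]  = array_sum($vals) / $num_frames;
            }
            $sigma = array_sum($sigmas) / $num_pairs;
            $mean  = array_sum($means)  / $num_pairs;
            // mean/sigma = sqrt(Ne) near saturation, so Ne approximates the FWC
            return pow($mean / $sigma, 2);
        }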


    The working solution to increase the compression efficiency in this case was rather simple - it was to rearrange the pixels in each 16x16 macroblock so that the pixels of each color component get into a separate 8x8 pixel block. The pixel numbers in the second (re-arranged) image show their location in the original macroblock - they are consecutive hex numbers there (you need to open the enlarged image to see the numbers). Such a modification made efficient compression of the raw Bayer pixel data possible with the otherwise standard JPEG encoding. In that original implementation we even preserved the dummy color components: as they consist of all zero values the compression is very efficient and the file size does not increase much, and that makes it possible to open the file with unmodified JPEG decoders - the processing is still required, but the picture does not "fall apart". Here is a sample video made by Sebastian Pichelhofer that demonstrates the application of the JP4 mode to video recording.
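
    The re-arrangement itself is trivial; a sketch of it in PHP is shown below (the FPGA does the equivalent on the fly while feeding the compressor, and the ordering of the four blocks is an assumption for illustration).

        <?php
        // Sketch of the JP4 pixel re-arrangement: a 16x16 Bayer macroblock is split
        // into four 8x8 blocks, one per Bayer position, so each block contains
        // pixels of a single color component.
        function jp4_rearrange(array $mb /* 256 Bayer pixels of a 16x16 macroblock, row-major */) {
            // blocks[0]: even row / even col, blocks[1]: even row / odd col,
            // blocks[2]: odd row / even col,  blocks[3]: odd row / odd col
            $blocks = array(array(), array(), array(), array());
            for ($y = 0; $y < 16; $y++) {
                for ($x = 0; $x < 16; $x++) {
                    $b = (($y & 1) << 1) | ($x & 1);
                    $blocks[$b][($y >> 1) * 8 + ($x >> 1)] = $mb[$y * 16 + $x];
                }
            }
            return $blocks;   // four 8x8 blocks, each ready for the standard JPEG 8x8 DCT path
        }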



    [Illustration: sensor Bayer data (left); re-arrangement of the pixels that groups each color component together (center); four 8x8 pixel blocks (Y0-Y3) ready for compression (right) - the dummy blocks are removed, as the differential encoding requires modification of libjpeg anyway]

    Modified JP4 mode (JP4D): primary green color (G) is encoded as absolute data (Y1), three other color components use difference from G: Y0=R-G, Y1=G, Y2=G2-G, Y3=B-G


    The original JP4 mode, while compatible with unmodified libjpeg on the host computer, still had some shortcomings that were targeted by the later modifications (a sketch of the resulting differential encoding follows this list):
    • The major one was the compressor frame rate penalty for the two dummy color blocks - the FPGA still needed time to process those zeros. When running at 160MHz and using two clock cycles per pixel it could compress 80MPix/sec, but processing of the dummy blocks reduced this speed to just 80*(4/6)~=53MPix/sec. That was enough when handling earlier sensors (having up to 48MHz pixel clock), but newer Micron (now Aptina) sensors use a 96MHz pixel clock with an average pixel rate (reduced from the pixel clock frequency because of the required horizontal blanking) of about 75MPix/sec. When compressing just the pixel data and omitting the dummy blocks the compressor runs at 80MPix/sec, faster than the 75MPix/sec the sensor can provide, so the compressor is no longer the limiting FPS factor and the camera can run close to 15fps at 5MPix and above 30fps at 1920x1088.
    • Another improvement of the compression ratio, without any additional degradation of the image quality, is referencing the DC component of each block to the previous block of the same color (in JPEG the DC components - the average values of the 8x8 blocks - are encoded differently from the rest of the components: they use the difference from the DC of the previous block); the earlier JP4 implementation, bound by compatibility with regular JPEGs, had to treat them all as different blocks of the same Y (intensity) component.
    • The next compression improvement exploits the correlation between adjacent pixel values of different colors by replacing all but one color component, G (the first of the two green pixels in the RG/GB Bayer pattern), with differences from G: R-G, G, B-G and G2-G (G2 is the second green in the Bayer pattern). This modification also provides additional flexibility in fine tuning quality vs. image size by using a separate quantization table for the differential data, similar to the separate quantization tables for the color components in standard JPEG.
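
    Continuing the re-arrangement sketch above, the differential (JP4D) step could look like this - again an illustration, not the FPGA code, and the mapping of Bayer positions to the R, G, G2 and B blocks is an assumption:

        <?php
        // Sketch of the JP4D differential step: G is kept as absolute data and the
        // other three blocks are replaced with differences from G taken at the same
        // 8x8 position, exploiting the correlation between adjacent colors.
        function jp4d_encode(array $r, array $g, array $g2, array $b) {
            $out = array('Y0' => array(), 'Y1' => $g, 'Y2' => array(), 'Y3' => array());
            for ($i = 0; $i < 64; $i++) {
                $out['Y0'][$i] = $r[$i]  - $g[$i];   // R  - G
                $out['Y2'][$i] = $g2[$i] - $g[$i];   // G2 - G
                $out['Y3'][$i] = $b[$i]  - $g[$i];   // B  - G
            }
            return $out;   // Y0=R-G, Y1=G, Y2=G2-G, Y3=B-G, as in the JP4D caption above
        }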


  • Increasing effective dynamic range of the image sensor

    Most CMOS image sensors have all the analog circuitry and the analog-to-digital converters (ADC) on-chip. This is the beauty of this technology compared to the older CCD one - it is possible to place the photo detectors and the rest of the circuitry on the same chip, while a CCD requires separate support components. But that comes at a price - you have to use whatever circuitry is implemented. The performance of modern sensors is well balanced; the sensor we use (Micron/Aptina MT9P001/MT9P031) provides 5 megapixels with a 12 bit output. Such resolution is enough for most applications, but the pixels are capable of a little more - this is why this sensor (like most other similar ones) has an additional programmable gain amplifier between the pixel array and the ADC. When measuring the noise performance of the sensor we found that the pixels have a full well capacity of around 8,500 electrons with a dark noise (at an analog gain of 1.0) of around 4-5 electrons, so the effective number of bits (ENoB) is ~11, with approximately two electrons per ADC count. When setting the analog gain to the maximal 15.75 the dark noise increased to ~10 ADC counts, but this time each electron in the pixel resulted in ~8 ADC counts. If both analog gain settings were combined, that would result in close to 13 bits of pixel dynamic range, while a single gain setting provides only about 11 bits at the optimal gain of 1.0, so the pixels are 3-4 times better than what is possible to get in a single frame without changing gains.



    [Illustration: sensor Bayer data (left); re-arrangement of the pixels that groups each color component together (center); four 8x8 pixel blocks (Y0-Y3) ready for compression (right)]

    Differential mode adjusted for the HDR applications (JP4DH): both primary green color (G) and the second (high gain) one (G2) are encoded as absolute data (Y1,Y2), two other color components use difference from G: Y0=R-G, Y1=G, Y2=G2, Y3=B-G


    It is still possible to increase the overall intraframe dynamic range by using different gain settings for the two green pixels available in each 2x2 Bayer cell. If the first green, the red and the blue are set to a gain of 1.0 (or close to it, as some gain adjustment is needed for color balancing) and the second green is set to a high gain (to achieve a dark noise value close to a single electron), then it is possible to increase the sensitivity in the deep shadows for that second green pixel. In bright parts of the scene this green will be saturated, and post-processing should rely on the data from the three other pixels of a 2x2 cell to interpolate the image; in the shadows that high gain green pixel will be the only one of the cell providing a signal above the noise floor. There are special image sensors developed by Kodak that have white pixels in the Bayer pattern (for the same purpose of increasing sensitivity and dynamic range), but the trick with different gain settings for the two green pixels is applicable to most of the CMOS sensors available.
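
    A post-processing sketch of how the two greens might be combined is shown below; the function, the gain handling and the saturation threshold are illustrative assumptions, not a prescribed algorithm.

        <?php
        // Sketch: combine the two green pixels of a 2x2 Bayer cell recorded with
        // different analog gains (as in the JP4DH data). $g is the gain-1.0 green,
        // $g2 the high-gain green, $gain_ratio the ratio of the two gains and $sat
        // the saturation level of the ADC output.
        function combine_greens($g, $g2, $gain_ratio, $sat = 4095) {
            if ($g2 < 0.9 * $sat) {
                // The high-gain green is not saturated: use it, scaled back to the
                // gain-1.0 scale; in the deep shadows it is well above the noise floor.
                return $g2 / $gain_ratio;
            }
            return $g;   // high-gain green clipped: fall back to the normal-gain green
        }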

Conclusion

The latest Elphel firmware upgrade (8.0) is the first major redesign of the camera software; it is based on our experience of developing a line of high performance network cameras that use open hardware to run free software code. Some of the experimental features had already been tested in earlier code releases, and here they all come together. This code redesign was performed just before migrating to a new processor architecture; it was targeted both at reducing the amount of changes needed during the scheduled hardware upgrade and at simplifying simultaneous support of the current and the next camera models by unifying their codebases.

The new firmware has the following immediate advantages:
  • Improved system stability, achieved by removing duplicate code and eliminating the processing of multiple individual cases when dealing with sensor and FPGA latencies
  • Simplified further development, based on cleaner code for the camera core features and on new tools for monitoring and modifying internal parameters and for profiling
  • Reduced entry barrier for new developers, provided by the project integration with the KDevelop IDE; the same integration simplifies project maintenance by the development team
  • Reduced number of wasted frames (eliminated completely in some cases) and enabled HDR applications with alternating exposure durations, by adding support for the pipelined operation of the camera modules
  • Precisely synchronous image acquisition for stereo and panoramic applications
  • Integrated additional FPGA functionality, including:
    • Enhanced JP4 modes that allow 15 fps at the full sensor resolution of 2592x1936 pixels
    • Command queues for the sensor and FPGA internal modules
    • "Focus helper" module in the compressor that can be used to evaluate sharpness of the lens focus
    • Lens/sensor vignetting correction


About the Author -- Andrey N. Filippov has over 25 years of experience in embedded systems design. Since graduating from the Moscow Institute for Physics and Technology in 1978, he worked at the General Physics Institute (Moscow, Russia) in the areas of high-speed, high-resolution mixed signal design, application of PLDs and FPGAs, and microprocessor-based embedded system hardware and software design. Andrey holds a PhD in Physics from the Moscow Institute for Physics and Technology. In 1995 Andrey moved to the United States and, after working for the Cordin Company (Salt Lake City, Utah) for six years, in 2001 he started Elphel, Inc., dedicated to doing business in the emerging field of open systems based on free (GNU/GPL) software and open hardware. This photo of the author was made using a Model 303 High Speed Gated Intensified Camera.






Related Stories