What’s the Holy Grail Image Sensor?

How could a 5-ounce bird possibly carry a 1-pound coconut?

Here’s the thing most people don’t get: in very low light there will always be noise, even if a sensor records perfectly. That’s because photon arrival is random. At lower light levels, that randomness becomes a visible problem. It’s also why, all else equal, you want a bigger sensor rather than a smaller one: the larger area collects more photons, which minimizes the impact of the randomness.

So, a “perfect” image sensor might not produce “perfect” results. If you agree with that premise—which is based upon math and physics that are immutable—then we can proceed.
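The randomness described above is what engineers call photon shot noise, and it follows Poisson statistics: collect N photons on average and the noise is √N, so the signal-to-noise ratio is also √N. A quick sketch of the math (the photon counts are illustrative, not measurements from any real sensor):

```python
import math

def shot_noise_snr(mean_photons):
    """SNR of a Poisson-limited signal: N / sqrt(N) = sqrt(N)."""
    return math.sqrt(mean_photons)

# A photosite with 4x the area collects ~4x the photons for the same
# scene and exposure, which doubles the SNR -- the "bigger sensor wins"
# argument in a nutshell.
small = shot_noise_snr(100)   # 10.0
large = shot_noise_snr(400)   # 20.0
print(small, large)
```

Note that even a sensor that counted every photon perfectly would still show this noise; it comes from the light itself, not the electronics.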

What would a perfect image sensor look like? Well, we’d have a number of changes from our current state-of-the-art sensors:

  1. All photosites would record full color information. In other words, Bayer would be a thing of the past, as it records only one color at each photosite, and depending upon who you ask, that produces about 66% of the resolution we could have. Unfortunately, the only full color method we have right now is the Foveon (now owned by Sigma) approach, which uses the penetration of light wavelengths through silicon to collect full color information. Doing so introduces noise into two of the three layers. So not Holy Grail. The closest I’ve seen to what might become a full color sensor without increasing noise tendencies above Bayer is a patent from Nikon, which uses clever positioning of photo diodes and charge wells coupled with incredibly tiny mirrors to get the light to them. Unfortunately, I don’t know that current fabs could make such a sensor. Binning is not the answer, for those of you who will write suggesting it. I’ll leave it as a thought problem for you to solve as to why it isn’t.
  2. All photons aimed at the sensor would be converted to charge. State of the art here has improved over time: 30% efficiency, then 40%, then 50%, and today we have many 60%-efficient sensors. Microlenses and BSI have helped quite a bit, but even today not all the photons headed toward a sensor are received and converted to charge (generally you need light to arrive no more than 30° off perpendicular; less is better). Some never make it to the photo diode because of things like tunnel walls (which keep photons/electrons from leaking into adjacent photosites), and photo diodes aren’t perfect recorders. That said, this is an area where we’ve seen improvements and probably will continue to see them. Unfortunately, the low-hanging fruit has already been picked, so the best case might be a half-stop improvement long term, with smaller gains short term.
  3. Color crosstalk reduced to zero. In most current designs, there's some crosstalk (pollution) of one color into an adjacent color. Unfortunately, the use of microlenses and BSI designs means that we still have crosstalk issues. As I've mentioned in the past, the construct at the sensor is actually an optical design. Light first enters a UV/IR filter, which might also have antialiasing properties, then passes through another glass/air transition before hitting the microlenses, which redirect some of the light. The light then goes through a Bayer layer before getting to the entrance to the photo diode. All these things happening just above the sensor produce some "spillage," or light that gets to where we don't want it. Light coming in perpendicular to the sensor doesn't tend to drift or produce crosstalk. Light coming in from an angle does.
  4. Photosites wouldn’t saturate. Only so much charge can be stored at each photosite, and charge means electrons, and electrons get feisty when they’re bunched together and especially when exposed to heat, which makes them prone to migration. So just making larger charge storage isn’t the answer, and there are physical limits on how big the charge storage area can be for a photosite. What’s needed is a way of detecting when a photosite has saturated, triggering a counter, clearing the storage, then starting a new count. That’s proven to be a more difficult problem to solve than originally anticipated, as counting and clearing takes time, and clearing really fast can leave "noise" behind. We’ll eventually get some variation on this point, but my question would be whether that really will do as much as you think. Yes, solving this problem would mean a camera could essentially capture HDR data in real time, but we don’t have an output (display) system that can use all the data we currently record. In essence, you’d take all that extra data and move/compress the tonalities in some way to create a final image. Yes, we want this of our ideal sensor, but it’s a situational ability that isn’t always used to the max.
  5. All data would move off the sensor instantly. Right now we have limits to how fast we can move data off the photosites to the on-sensor analog-to-digital converter, and how fast the resulting digital numbers can be moved off the sensor to the camera’s electronics. Speed can generate electronic noise, which is something we want to avoid. The sensor makers are slowly solving this problem. Each generation of sensor has new technologies in it to deal with this problem, but we’re still a fair way from a true global shutter with no downside. 
  6. Materials used would be pure and consistent. As Mae West once said, “I used to be Snow White, but I drifted.” Every image sensor drifts somewhat from purity, photosite to photosite. Fixed pattern noise is the result of silicon impurities and other inconsistencies. Among the Nikon sensors, the D5 sensor was remarkably free of fixed pattern noise (at least my sample and a few others I tested were). I’m not sure how Nikon achieved that, but that’s the goal: no wafer material or fab deficiencies that change response across the area receiving light.
  7. Perfect counting, perfect bit-count. The whole 12-bit, 14-bit, 16-bit controversy derives from this problem. A charge is in the sensor, and that needs to be converted into a digital number. And we want that number instantaneously (see #5). There’s an amplifier in the loop, too (gain), and we want it to be perfect. What we’re really talking about here is data integrity. Did X charge become X digital number, or is it X plus or minus a bit (rounded, truncated, miscalculated, doesn’t fit in our bit container right, etc.)? Right now, a state-of-the-art full frame sensor tends to produce somewhere between 12 and 14 bits in terms of the level of useful data that can be discriminated. “Between” is not what you want to hear. This is an area that will eventually be improved, but likely not high on anyone’s priority list until some of the other things I list above have been changed, as counting imperfect data perfectly doesn’t help us much.
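The “half stop best case” in point #2 is easy to sanity-check: the gap between today’s roughly 60% efficiency and a perfect 100% works out to log2(1.0/0.6) ≈ 0.74 stops, so realistic gains short of perfection land near half a stop. A quick check (the 60% figure is the rough state of the art cited above; the 85% is just an assumed plausible endpoint):

```python
import math

def stops_gained(current_qe, future_qe):
    """Exposure stops gained by raising quantum efficiency (QE)."""
    return math.log2(future_qe / current_qe)

print(round(stops_gained(0.60, 1.00), 2))  # ~0.74 stops: the absolute ceiling
print(round(stops_gained(0.60, 0.85), 2))  # ~0.5 stops: a plausible long-term gain
```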
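The count-and-clear idea in point #4 can be sketched as a toy model: whenever the charge well saturates, tick a counter, reset the well, and keep collecting, then reconstruct the true exposure afterward. This is a hypothetical illustration of the concept only, not any manufacturer’s actual circuit, and it ignores the timing and reset-noise problems noted above:

```python
FULL_WELL = 1000  # electrons a well can hold before saturating (illustrative)

def expose_with_overflow_counter(incoming_electrons):
    """Simulate a photosite that clears itself each time it saturates."""
    overflows, well = 0, 0
    for _ in range(incoming_electrons):
        well += 1
        if well == FULL_WELL:
            overflows += 1   # the counter ticks...
            well = 0         # ...and the well is cleared so counting continues

    return overflows, well

# Reconstructed signal = overflows * FULL_WELL + residual charge
overflows, residual = expose_with_overflow_counter(3750)
print(overflows * FULL_WELL + residual)  # 3750 -- well beyond the 1000 e- well
```

The reconstruction shows why this amounts to real-time HDR capture: the effective well depth becomes as large as the counter allows, rather than being capped by the physical charge storage.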
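One common way to frame point #7’s “between 12 and 14 bits” is engineering dynamic range: the number of distinguishable levels is roughly the full-well capacity divided by the read noise, and log2 of that ratio gives the usable bit depth. A sketch with illustrative numbers (not any specific sensor’s specs):

```python
import math

def usable_bits(full_well_e, read_noise_e):
    """Usable bit depth: log2(full well / read noise), both in electrons."""
    return math.log2(full_well_e / read_noise_e)

# e.g. a 50,000 e- full well with 5 e- of read noise
print(round(usable_bits(50_000, 5), 1))  # ~13.3 bits: "between 12 and 14"
```

This is why recording 16-bit raw files from such a sensor mostly stores noise in the extra bits: the ADC’s container can be bigger than the data it holds.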

The ultimate problem with image sensors for cameras, though, isn’t any of the above, it’s production volume. At one time, photographic cameras (compacts, DSLRs) dominated image sensor use, and were driving the sensor R&D. Today, however, smartphones, autos, and security are all far bigger markets for image sensors, so the problems they need solved will get solved first. Then that tech will trickle up to the large dedicated camera sensors. #4 is important to small sensors, as they have smaller photosites, so would likely happen before some of the others. #5 is important for autos and security, so it, too, would tend to happen first.

text and images © 2020 Thom Hogan
portions Copyright 1999-2019 Thom Hogan -- All Rights Reserved
Follow us on Twitter: @bythom, hashtags #bythom, #dslrbodies