The Coming Cameras

Updated

I like to do a "thought piece" every now and then premised on what I did through most of my career in Silicon Valley: look five to ten years out and try to understand what developing or new technologies could be used to solve current user problems.

One problem with doing that at the moment in the camera market is this: the camera makers aren't today where, five years ago, they should have perceived they'd need to be. (Yes, that's a complex sentence. Read it again and make sure you understand it.) In other words, they're already behind where they should/could be, so looking further forward may not be quite as useful as it ought to be.

For example, you're probably well aware that inside your cameras are a bunch of semiconductors, including the image sensor. One of the predictable things about semiconductors has been—though it's getting more problematic at the forward edge of progress—the reduction of what's called process size. As a placeholder, think of process size as the smallest possible transistor. The smaller the transistor, the more of those you can pack on a chip and the closer you can put them together (which provides quicker communication between them). Those two things mean more computational power at faster speeds (though heat dissipation can become an issue as you miniaturize). 
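A quick back-of-envelope way to see why process size matters: transistor density scales roughly with the inverse square of process size (a simplification that ignores interconnect, cell libraries, and other real-world effects):

```python
# Rough illustration: halving the process size roughly quadruples how
# many transistors fit in the same area. This is a simplification;
# real density gains depend on much more than the headline node number.

def density_gain(old_nm: float, new_nm: float) -> float:
    """Approximate factor by which transistor count per unit area
    grows when shrinking from old_nm to new_nm."""
    return (old_nm / new_nm) ** 2

# Shrinking from a 45nm process to 7nm:
print(round(density_gain(45, 7)))   # roughly 41x more transistors per area
```

Run the same calculation for a 200nm-class sensor process versus 7nm and the gap is more than 800x, which is the scale of the disparity this article is about.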

So, here's a question: what's the process size for your image sensor? Or your imaging chip?

Apple is currently using a 7nm process for its latest CPU (the A12X Bionic). Indeed, the history of Apple's iOS CPUs is an illustration of process size reduction: 45nm, 32nm, 28nm, 20nm, 14nm, 16nm, 10nm, 7nm. That's why the newest iPhones and iPads keep getting faster, more capable, and able to do more things.

The problem with image sensors is that the photons-to-electrons part of the sensor (the photo diode) doesn't really benefit from process reduction, so there hasn't been as great a push to change it. But the ride-along electronics on CMOS sensors absolutely do benefit. A smaller process size allows you to do more with the stored charge that image capture creates, and to do it faster.

So again, what's the process size for your image sensor?

Would you believe probably something in the 200nm+ range? That's huge compared to today's state of the art.

Update: An engineer or two pointed out that image sensors still work in the analog realm—what, you thought they were digital?—and that going below 180nm becomes an issue. I should have caught that. One reason Sony may have gone to stacked sensors has to do with this: if you can make the light-gathering/ADC side of the sensor on a larger process and hook it fairly directly to something done on a smaller process, you can get some of the benefits. Nevertheless:

From what I can gather, even the BIONZ, DIGIC, and EXPEED type of chips lag behind the current semiconductor state of the art. I can't get official confirmation, but I believe the latest EXPEED6 chip, for instance, is made on a 28nm process, while Nikon's SoC supplier, Socionext, currently offers 16/12nm as its smallest process.

Why am I starting here? Because silicon is one of the easiest things to predict. Apple and Nikon both use ARM-licensed cores, but Nikon is using older, larger-process Cortex cores, while Apple has moved forward to its own version of ARM's latest core technology and produces it in smaller-process fabs.

The trend that intersects with this is the use of computational and AI algorithms on image data. The smaller-process, more sophisticated main chip at the heart of Apple's iOS devices can simply do more than the best the camera companies can manage when it comes to changing pixel data, or analyzing pixel data for hints on how to tune the camera's performance.

Moreover, Apple seems to have taken one of my original design philosophies on the QuickCam to heart: "the smallest number of components to get image sensor data into the CPU." There's almost nothing between the image sensor and A12X Bionic chip other than a data pipe. In our cameras, there's a bit more going on, and on designs such as Sony's stacked sensor chips, that can get quite complex and more expensive to make.

Where am I going with this? 

The future is going to see much more computational and AI logic in our cameras. No doubt about that. This was clear back at the turn of the century, but it was the smartphones that really got serious about this first, unfortunately for the dedicated camera makers. Heck, it was clear when I managed the team that put out the QuickCam in 1994, because the whole idea behind that product was to use your computer's computational power behind an image sensor.

This is a long-winded way of saying that camera makers have some catching up to do. Okay, not some. A lot. The silicon capabilities are there to let them do it, but when we talk about SoC (system on chip) entities like BIONZ, DIGIC, and EXPEED, we find ourselves caught up in the real problem: the camera industry is contracting rapidly. 

The reason smartphones are eating the camera makers' lunch has to do with a lot of things, but one of them is volume. About 1.5 billion cell phones were sold in 2018. Compare that to the 19.4 million units that CIPA says shipped in the same period (that understates total camera sales a bit, as a few non-Japanese companies also ship cameras). That's almost two orders of magnitude difference. Simply put, the smartphone companies can afford to spend more on R&D to keep their silicon up to the state of the art because they have so many more units over which to spread the cost.
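That volume arithmetic works out like this (using the figures quoted above):

```python
# The 2018 unit-volume gap, using the shipment figures from the text.
smartphones = 1_500_000_000   # ~1.5 billion cell phones sold
cameras     = 19_400_000      # CIPA-reported dedicated camera shipments

ratio = smartphones / cameras
print(f"{ratio:.0f}x")        # ~77x: just shy of two orders of magnitude
```

Every dollar of silicon R&D spread over 77 times as many units is, roughly, 77 times cheaper per unit sold.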

Thus, one prediction that's easy to make is that dedicated cameras will continue to get better at adding computational and AI features in the future, but they won't catch up—let alone pass—the big smartphone vendors in the next five years. To do so would take a leap of innovation that is highly unlikely. 

Even for Sony, who recently decided that their smartphone and dedicated camera groups needed to work together, the volume problem is still a real issue. Sony's Xperia phones are not exactly big sellers (<2% of the market). So while combining efforts of their two groups does give them more volume to spread costs over, it's not quite as big a boost as it at first seems.

You wonder why there's so much emphasis on full frame these days? It's because the camera makers are looking not just for profit; they're trying to stay in a lane they're pretty sure smartphones can't play in. "Good enough" is owned by the smartphone cameras now. That really only leaves "Exceptional and Unique" as the playground in which the camera makers can retain a foothold.

That's the reason why you see Nikon only making compact cameras with huge focal length range lenses or waterproofing. And the latter is now becoming a smartphone trait, so short of adding a really long focal length zoom to the waterproof camera...

The problem, of course, is that by defining narrower and narrower niches—full frame, superzoom, rugged/waterproof—you also limit your market size. Pushing up-scale to higher priced products also limits your market size.

So my first prediction is that we'll see a slow move towards more and more computational and AI attributes in our cameras. It will be slow not because the technology to do it is slow in coming, but because the cost of deploying it is being borne across fewer and fewer units. Canon and Sony have a bit of an advantage here in that their scale of business is bigger than Nikon's and can better support additional R&D costs. But still, everyone is cautious because no one knows just how far the camera industry will contract. That's a bit of a chicken-and-egg problem, though: if you move too cautiously, you actually make the industry contract more. (I'll come back to that in a bit.)

Meanwhile, there's another thing that's semiconductor-related that smartphones have gotten right and the cameras haven't: communications. 

Let's just admit the obvious: for the vast majority of people taking photographs, those images are now shared electronically. Smartphones embrace that in so many ways I'm not sure I can count them all. Dedicated cameras? Not so much, as I've pointed out many times.

The irony is that Nikon got into the photos-in-the-cloud business early with what's now known as Nikon Image Space (formerly myPicturetown, which dates all the way back to 2008!). Here's an easy way to see that Nikon doesn't understand what they're doing in cloud photo storage: exactly why aren't the Nikon Image Ambassadors using and promoting Nikon Image Space (NIS) to store, manage, and share their photos? Oops. NIS is a separate app from SnapBridge and doesn't allow sharing directly from it. Oops. Can Lightroom push my images to NIS? Oops. (The oops go on and on, but you should get my point with just three examples.)

Update: If you want to see an even bigger Oops, just check out the message Nikon sent to NIS users in May.

So here's the thing: if camera makers want people to keep using cameras, they need to fully embrace the way that people are using images and enable that. But for the most part, they aren't. Yet the technology is available that would let them do that.

At some point during the continued contraction, someone in Tokyo is going to bang their head against the wall, say "Doh", and start trying to do what users actually want and need. And guess what? While that might not generate the kinds of growth in the digital camera market we saw in the first decade of this century, a camera that functions well in today's image sharing world will sell better than one that doesn't. 

I put that last part in bold because every time I write about the fact that the communications side of dedicated cameras is terrible and needs to be fixed, I get a lot of pushback. Things like "that won't save the camera market." That's not what I'm saying at all. I'm saying that camera makers are getting sub-optimal sales because they've ignored a common and highly requested (and now necessary) user need. We can argue about whether camera sales would continue to go down (perhaps by a smaller percentage), stabilize, or start to grow a bit if the camera makers put the right technology, done the right way, into their cameras, but failing to do so will simply make them fail faster.

The thing is, the Japanese consumer electronics companies are fighting against Silicon Valley. In Silicon Valley, almost the opposite problem happens: Silicon Valley will pursue solving customer problems first and foremost, almost without regard to cost or profit. Get the customer first, then worry about the business finances. Worse still for the Japanese companies, Silicon Valley stole the whole notion of sharing images electronically from Japan, where it was technically done first with some early cell phones (but not particularly well commercialized or followed up on).

What I find ironic is that in this world where everyone talks about the Internet of Things (IoT), dedicated digital cameras are some of the worst connected digital devices on the planet. Not only do they not "plug and play" into the Internet easily, making them too complicated for consumers, but their performance in doing so is woefully behind. 

We're about to go 5G in cellular and Wi-Fi 6 in local wireless. Both will be the primary things you hear about in wireless communications in the next few years. Cameras aren't even close to the abilities we expect from those new technologies. Wi-Fi 5 (the current 802.11ac) has a theoretical minimum link rate of 433Mbps. Divide by 9 (8 bits per byte plus some overhead): 48MB/s. A 48MB file should transfer in a second. Does it? Not on any Wi-Fi 5 capable camera in my gear closet (and there are several). Why? Because of the way the cameras are designed.
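Here's that back-of-envelope math as a quick sketch, using the 433Mbps single-stream figure quoted above:

```python
# Back-of-envelope transfer time for one image over Wi-Fi 5, using the
# theoretical 433Mbps single-stream 802.11ac link rate. Real-world
# throughput is lower still, which only strengthens the point.

link_mbps = 433                  # theoretical single-stream link rate
effective_MBps = link_mbps / 9   # 8 bits per byte plus protocol overhead

file_MB = 48                     # e.g. one large raw file
seconds = file_MB / effective_MBps
print(f"{seconds:.1f} s")        # ~1.0 s in theory; real cameras take far longer
```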

So one thing that's going to have to change soon is the internal structure of how cameras are designed. In essence, cameras treat "communications" as an interruptible, low-priority background process. In a world where images are shared, and immediately shared at that, communications needs to be a primary process with some guarantee of delivery speed. The video camera makers have figured this out: I can stream in real time from my video camera. The still camera makers are laggards.
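To make the design difference concrete, here's a minimal, hypothetical sketch of upload as a first-class concurrent process: each frame is handed to an always-running upload worker the moment it's captured, rather than waiting for the camera to go idle. Every name here is my own invention, not any camera maker's actual firmware API.

```python
# Hypothetical sketch: communications as a primary process. Capture
# never blocks on the network, and uploads start immediately instead
# of running as a deferred, interruptible background task.
import queue
import threading

upload_queue: "queue.Queue[str]" = queue.Queue()
uploaded: list[str] = []

def upload_worker() -> None:
    """Dedicated worker that drains the queue as images arrive."""
    while True:
        image = upload_queue.get()
        uploaded.append(image)     # stand-in for the real network send
        upload_queue.task_done()

threading.Thread(target=upload_worker, daemon=True).start()

def capture(name: str) -> None:
    # Capture and enqueue in one step: shooting and sending overlap.
    upload_queue.put(name)

for shot in ["DSC_0001", "DSC_0002", "DSC_0003"]:
    capture(shot)

upload_queue.join()                # wait until every pending upload is delivered
```

The point of the sketch is priority, not the specific mechanism: delivery gets its own guaranteed execution context instead of scraps of idle time.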

Meanwhile, the latest trend in tech is the proliferation of artificial intelligence (AI) software, though I'm not at all sure I'd characterize everything called AI as actual artificial intelligence. Just as graphics got its own dedicated chip (the GPU), AI is now getting its own dedicated chip (Google calls theirs a TPU; I'll call it an IPU, for Intelligence Processor Unit).

We've already seen camera makers deploy two aspects of this. For example, the Nikon D5 has a chip dedicated to autofocus. Olympus and Sony are referring to the new autofocus algorithms they're using as AI. 

But true AI as it's being explored in the labs now is not task-specific. The goal is to apply the same "learning" and "processing" techniques to any problem that needs solving. We have lots of problems in cameras: saturated signals, noise, distortion, astigmatism, stray light, subject recognition, camera movement, depth cues; the list goes on and on.

What you really want to build in the near future is a set of electronics that can do computational (CPU), graphics (GPU), and intelligence (IPU) tasks. Apple already has that in their latest iOS processors (a nascent AI engine being the latest processing core to be added). The net result of having a fast, deep, wide range of "processing" capabilities available in a single chip is that you reduce hardware costs while enabling the software guys to come along and do interesting things with all that facility. 

Finally, there's one other thing I believe will (should) happen: camera makers have to recognize that "tagging" (metadata) isn't something just for their own internal use (e.g. EXIF Maker Tags), but that in the coming world of imaging users need to tag their photos in quite a few ways. Copyright is an obvious one. We have some cameras that follow the IPTC guidelines on this, mostly because the camera companies' big press clients basically insisted or they'd stop buying product. (Irony note: apparently the Japanese camera makers haven't noticed that consumers stopped buying their product. Maybe we need to form a consumer lobbying organization ;~).

When most people take a photo now, they'd actually want it fully tagged, and tagged automatically if at all possible. When, where, who, what, photographer, who should be able to see it, and more.

  • When: GPS, cell tower, or Wi-Fi provided data to be accurate. Automatic time zone detection.
  • Where: GPS, but with automatic coordinate-to-placename insertion.
  • Who: names of any identifiable people, pets, things, etc. With full ability to train those, and to make those private or public.
  • What: Intelligent categorization, which depends upon the Where and Who fields. For instance, a human identified in front of a well-known museum might be tagged "visiting the Louvre."
  • Photographer: could even go so far as to have fingerprint detection on the shutter release!
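As a sketch of how such a tag record might look at capture time (the field names are my own invention for illustration, not an existing metadata standard):

```python
# Illustrative record a camera could fill in automatically at capture
# time, covering the tag categories listed above. Names and structure
# are hypothetical, not any standard's schema.
from dataclasses import dataclass, field

@dataclass
class CaptureTags:
    when: str                                  # ISO timestamp, time zone auto-detected
    where: str                                 # place name resolved from GPS coordinates
    who: list = field(default_factory=list)    # recognized people, pets, things
    what: str = ""                             # inferred activity from Where + Who
    photographer: str = ""                     # e.g. fingerprint on the shutter release
    visibility: str = "private"                # who should be able to see it

tags = CaptureTags(
    when="2019-05-01T10:32:00+02:00",
    where="Louvre Museum, Paris",
    who=["Alice"],
    what="visiting the Louvre",
    photographer="Thom",
)
```

Note that "what" is derived from the other fields, which is exactly the kind of inference the IPU-style silicon discussed earlier would handle on-camera.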

I could go on, but I think you get the idea. Because photos are now living in cloud space and shared via the Internet, it is highly desirable to make sure that the photographer can control how that image might be found by others, and that's going to involve deep and wide metadata that's being stored with the image data from the moment the photo was taken.

All the things I write about in this article are possible in the very near term (five years). Indeed, I'd argue they're required and inevitable. There's some probability that a few of them will work their way into our cameras soon. The questions are how much, and how fast?

The biggest problem I see is that the camera companies are hesitant to fully fund all the R&D that would be necessary to get these things—and others I haven't mentioned in this article—done sooner rather than later. That's a self-fulfilling prophecy, as I've pointed out over and over: by not getting at the front of the technology wave in recent years, cameras have now fallen behind. The potential buying public may not consciously understand that, but they've figured it out subconsciously. They know that their smartphone is doing things their camera can't do, so why do they need a new camera?

One of the things that surprised me coming out of my MBA program into a wild, fast-growing startup in the early days of Silicon Valley was this: all the problems we studied in those Harvard case studies came up. Every last one of them! What they didn't teach at the Kelley School of Business was this: the solution to the problem is always different from just making the numbers, procedures, or dependencies work right. The problem was always people. People who insisted on ignoring the right answer.

To a large degree I feel that's the problem in Tokyo right now. I'm pretty sure there are plenty of engineers at the camera companies who understand everything I just wrote and want to give it to you in products. They're being held up in many cases by finance departments and upper management that are reluctant to take risks.

What I know from my decades of experience in Silicon Valley is this:

  • Sometimes when you take risk, you fail.
  • If you don't take risk, you fail.

Understanding and coming to grips with those two statements is essential to a technology career. It's essential to any company that purports to be a technology company. 

So, in the end what I'll be looking for in the coming five years is not whether or not we get IPUs or 5G or full tagging. What I'll be looking at with the camera companies is who's seeing the wants/needs correctly and taking the risks necessary to fulfill those. 

Update: a final note: implicit in much of what I wrote is that the camera companies suck at software. Those of us who were appalled at how bad the original version of Windows was back in the '80s are now reconsidering how good we actually had it ;~).

text and images © 2019 Thom Hogan
portions Copyright 1999-2018 Thom Hogan-- All Rights Reserved
Follow us on Twitter: @bythom, hashtags #bythom, #dslrbodies