One of the larger themes I’ve been exploring in industry conversations with tech leaders and executives is how giving computers the ability to see and hear us will define a new era of computing. All the advancements in machine learning over the past few years have laid a new foundation for computers to learn on their own. A computer’s ability to learn depends entirely on the information available to it. That data has largely been textual, but we are on the cusp of computers being able to learn from all of their sensors as well. This includes the microphone, camera sensor, GPS, accelerometer, and a host of other sensors available to integrate into our computers. I do believe, however, the camera sensor will be the one that leads to the most obvious customer value from machine learning and AI.
The camera sensor will be one of the unique areas for machine learning in consumer applications because it will only see what consumers allow it to see. This is particularly true of the smartphone camera, which sits dormant until the consumer opens an application that activates it. Given that the smartphone is the most personal computer people own, its camera sensor could yield the most consumer value when it comes to machine learning and AI that benefits the owner. With that in mind, it is worth understanding what consumers use their camera sensor for most often.
We at Creative Strategies recently did a US-based study among consumers to understand the role the smartphone plays in photography. When we scoped out this study, our goal was to look broadly at smartphone photography, and I hadn’t anticipated a particular finding that got me thinking more about this machine learning theme. Below is a graph from our study showing the things US consumers take pictures of most often.
In general, you will find the position of most of these results fascinating. Particularly how low selfies rank overall on the list. Selfies are the one use case we most often observe people doing in public, so if you are like me, you assumed selfies would rank closer to the top of this list than the bottom. Granted, what I’m showing you are the top-line results from our study. If I were to break this out by demographics of age and gender, it would look a bit different. For example, selfies, both of the individual alone and with someone else, rank third and fourth on this list for those 18-25 years old.
The thing that caught my eye the most was the second most photographed thing on the list: “information I need to remember.” If you examine your own usage, you probably realize you take pictures of lists, things at a store you want to remember, or homework or meeting notes on a whiteboard, etc. We may not realize just how often we do this because it isn’t the most exciting thing to photograph, but it highlights how often we use our smartphone camera as a way to augment our memory.
This, to me, seems one of the more obvious opportunities for machine learning. What if our smartphone could recognize that what we have photographed is meant to help us remember something and take action on our behalf to make that happen? If we take a picture of a homework list or meeting notes on a whiteboard, the camera could take that information and automatically create contextual notes or even itemize a to-do list for us when relevant. The clear challenge for machine learning will be to truly understand the context of why we took the photo and what we want to remember. This is where a host of other data could come into play: my location, what is on my calendar, the context of the meeting I’m in, and so on.
The other challenge standing in front of machine learning here is the personalized nature of the use case I’m describing. Machine learning does best with large communal data sets. This is why teaching a computer to recognize a dog or a tree is easy, since it can train on all available images of dogs and trees. The bigger challenge is to teach a computer what is my dog versus any dog. This is where the localized, on-device nature of these algorithms will come into play and an area where I think Apple’s approach will yield fruit. Apple seems much more interested in training your iOS/OS X computer to learn about its owner specifically than in the more general machine learning Google is doing. Of the two companies, Apple is actually taking on the more difficult machine learning task. Making machine learning personal is new, but also exceptionally difficult.
While this is one area to watch with regard to machine learning and the camera sensor, the other will be the ways the computer can make your photos better. This is easily the most interesting thing Google is doing with the Pixel 2/XL and its camera technology. I’ll be doing a deeper dive on the Google Pixel 2 camera in the coming weeks, but from my few weeks taking a range of photos with it, it is fascinating to see how Google’s machine learning around imaging comes into play to eliminate noise in low light, balance colors, apply post-processing effects like blur or shading, and more.
I point this out because in our smartphone study we asked consumers which features they are most interested in for the camera or camera software, and ranking near the top of the list was smarter technology to help them take better pictures. Bringing machine learning into the photography process will help people take better pictures and add value in gathering and collecting information. As I said earlier, machine learning around the image sensor is going to be a key battleground for operating systems and apps alike, and it will be the easiest place to see machine learning manifest itself for consumer benefit.