Computer vision has been frustrating and inspiring the tech industry since the 1960s. Human vision is immensely complex and replicating that on an artificial level presents a daunting challenge.
Rising to that challenge comes with an equally big payoff, though. Computer vision has potential that goes far beyond scientific curiosity. It’s stretching the boundaries of what’s possible for medicine, transportation, manufacturing, and entertainment. Take a look at how computer vision could change the playing field for enterprise.
The Power of Computer Vision
To start, “computer vision” is an umbrella term. It covers a variety of technologies aimed at training computers to process images and video like humans do. The goal is for computers to be able to recognize subjects within a picture and make statements about how they relate to each other.
Shown an image of a beach, for example, a computer could do more than note the location of specific colored pixels. It could produce observations like:
- “There is a beach scene.”
- “This is a woman, specifically this woman from a linked database.”
- “The car is moving in this direction.”
- “The buildings are this far apart.”
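To make the idea concrete, here is a minimal Python sketch of turning raw detections into a statement about how two subjects relate. The `Detection` type, the labels, and the pixel coordinates are all hypothetical. A real system would get them from a trained detection model rather than hand-written values.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A hypothetical detector output: a label plus a bounding box in pixels."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center_x(self) -> float:
        # Horizontal midpoint of the bounding box.
        return (self.x_min + self.x_max) / 2


def describe_relation(a: Detection, b: Detection) -> str:
    """Describe where `a` sits relative to `b` along the horizontal axis."""
    if a.center_x < b.center_x:
        side = "to the left of"
    elif a.center_x > b.center_x:
        side = "to the right of"
    else:
        side = "horizontally aligned with"
    return f"The {a.label} is {side} the {b.label}."


# Made-up detections for illustration only.
woman = Detection("woman", 100, 200, 180, 420)
car = Detection("car", 400, 260, 700, 430)

statement = describe_relation(woman, car)
print(statement)
```

The sketch only compares horizontal positions. Richer statements, such as distance or direction of travel, would need depth information or multiple video frames.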
Why is this so important when humans could do the same work? For one thing, there’s more data being produced than humans have the time or interest to process at scale. 1.2 trillion photos were uploaded to the internet last year, and the number is only growing. There are valuable business insights in those images. Without computer vision, those insights can be lost or found too late to be effective.
Sometimes it’s dangerous or impractical for a human to be present. For instance:
- Space and deep ocean exploration
- Biohazardous sites
- Sensitive manufacturing processes
Computer vision allows machines to act in those situations without waiting for human guidance. This is especially useful for autonomous devices and exploration, where communications suffer from extended lag times.
Human vision has some shortcomings, too. People have an easier time recognizing complex images, while computers handle simple ones better. As a result, a computer could process simple images faster and more accurately than a human.
Barriers to Advancement
Making computer vision work requires both technical savvy and a strong grasp of mathematics. It’s an interdisciplinary process that needs a high-level team. Talent like that isn’t cheap, which limits the number of teams that can afford to devote time to innovation.
It’s important to recognize that vision is far more complex than it seems. Humans can handle a variety of conditions that stupefy computers. What are the biggest challenges?
Images that are partly obscured, like when a person is standing behind a car, can confuse an algorithm that tries to identify the top half of a person as an independent object.
Computers can have trouble distinguishing if an item is far away or just small.
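The ambiguity follows from basic projection geometry. Under the idealized pinhole camera model, an object's height in the image is proportional to its real height divided by its distance, so many size and distance pairs produce the same picture. The Python sketch below illustrates this; the focal length and the object dimensions are made-up values chosen for illustration.

```python
def projected_height_px(real_height_m: float, distance_m: float,
                        focal_length_px: float = 1000.0) -> float:
    """Image height of an object under the pinhole camera model.

    focal_length_px is an assumed value, not taken from any real camera.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * real_height_m / distance_m


# A full-size car far away and a toy car up close (hypothetical numbers).
real_car = projected_height_px(real_height_m=1.5, distance_m=30.0)
toy_car = projected_height_px(real_height_m=0.125, distance_m=2.5)

# Both objects occupy the same number of pixels in the image.
print(real_car, toy_car)
```

From a single image the two cases look identical in size, which is why systems often lean on extra cues such as stereo cameras, motion between frames, or known object sizes.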
Dense or texturally complicated backgrounds can be mistaken for additional items. That slows down the analysis and can even produce false positives. The internet is full of “accidental face recognitions” where computers tag kneecaps or tree knots as people.
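One common way to reduce false positives, offered here as an illustration rather than something this article prescribes, is to have the detector report a confidence score and discard anything below a cutoff. The sketch below assumes a hypothetical detector that returns dictionaries with a `confidence` field; the scores and the 0.8 threshold are invented.

```python
def filter_detections(detections, min_confidence=0.8):
    """Keep only detections whose confidence meets the cutoff."""
    return [d for d in detections if d["confidence"] >= min_confidence]


# Hypothetical detector output; the scores are invented for illustration.
candidates = [
    {"label": "face", "source": "person", "confidence": 0.97},
    {"label": "face", "source": "kneecap", "confidence": 0.41},
    {"label": "face", "source": "tree knot", "confidence": 0.36},
]

kept = filter_detections(candidates)
print([d["source"] for d in kept])
```

The cutoff is a trade-off: raising it removes more false positives but risks discarding genuine detections, so it is usually tuned against labeled examples.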
There’s no single “pattern” that defines most items. Humans comprehend a huge amount of variety in color, shape, size, and material, but that’s a difficult concept for computers. Dogs and cars are good examples: each comes in a wide range of shapes, sizes, and colors, yet a person recognizes them at a glance.
Computer Vision in Action
Despite the challenges, computer vision has matured into something with real enterprise value. It’s being used in ways most people probably haven’t even considered. Some of the most exciting applications include:
- Optical Character Recognition (OCR): Reading handwritten and scanned documents (such as PDFs) and converting them into machine-readable text
- Face and Object Detection and Recognition: Identifying, sorting and classifying images, including correlating those images with examples in a linked database
- Special Effects: Matching and lining up effects to real world footage
- Sports: Action recognition and quality assessment
- Smart Cars: Navigating live environments in real time
- Games: Assessing user input (like drawings or photos)
- Mobile Apps: Giving information based on images (such as identifying items or translating signs using the device camera)
- Robotics: Processing surroundings and distinguishing between items when performing or triggering tasks
Almost all the big players are investing in one or more of these applications. Google, Facebook, HP, and Microsoft all have computer vision programs running right now, and they’re seeing impressive results.
It’s smart to be cautious about any individual application until it’s proven its worth, but it’s just as smart to keep an eye on promising technologies and be ready to capture the first-mover advantage.
Computer vision is one of the easiest tech terms to define, but it has been one of the most difficult skills to teach computers. The progress made so far has opened a whole new world of possibilities for enterprise. What lies ahead is sure to be amazing.