AI AND VIDEO ANALYTICS BLOG
Video Surveillance & Physical Security Industry Viewpoints
May 22nd, 2019
Author: Tom Edlund

How Deep Learning Has Revolutionized Video Analytics

The video analytics industry has undergone rapid transformation since its inception in the early 2000s. Early video analytics products were primarily designed as alerting solutions: by triggering calls to action, these applications attempted to eliminate the need for active human video monitoring. However, these computer vision-based solutions did not fully remove the need for human involvement in video surveillance and oversight. Furthermore, they tended to produce false positives and to return inaccurate matches for video search criteria.

Other solutions – such as BriefCam – sought to maintain and maximize human involvement in the video surveillance process, by accelerating video review for users and making it easier to understand whole scenes captured by video. These interactive solutions streamlined users’ comprehension of entire video scenes, thereby enabling operators to overcome video-based alerting limitations and quickly identify critical information in captured video.

The industry took a major leap forward in the mid-2000s by leveraging Deep Learning, which uses Deep Neural Networks (DNNs) to train computer systems in a way loosely modeled on how humans are taught and learn. This post traces the development of Deep Learning-based video analysis and how the industry has evolved as a result.

A Brief History of Deep Neural Networks

Deep Learning has its roots in research from the 1940s, and the field gained some momentum in the 1980s and 1990s. DNNs power Deep Learning, a branch of Machine Learning, which in turn is the subset of Artificial Intelligence (AI) concerned with training machines to learn from data. Deep Learning enables technologies to continually increase their sophistication and drives additional AI applications.

To effectively power Deep Learning, the Deep Neural Networks behind AI-backed technologies must be trained by exposure to tagged data. Based on the descriptive tags, the DNNs learn to detect, identify and classify data. Successful training relies on access to large quantities of information, which is a challenge in and of itself. Tagging the different objects in imagery or video footage is an ongoing struggle as well, especially because the accuracy of a Machine Learning model depends on the amount of data used to train it. For instance, in video content analytics, extracting all women in a video scene requires first exposing a DNN to large quantities of annotated images of women (and of objects that are not women), so that the technology can effectively detect and classify women in future video footage.
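To make the idea of learning from tagged data concrete, here is a minimal sketch in Python. It is not a Deep Neural Network: it trains a single artificial neuron (logistic regression) on a handful of hand-made examples, which is only the smallest building block of the networks described above. The feature values, tags, function names and training settings are all assumptions chosen for illustration.

```python
import math
import random


def train_classifier(examples, epochs=200, lr=0.5, seed=0):
    """Fit one logistic neuron to (features, tag) pairs, where tag is 1 or 0."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    for _ in range(epochs):
        for features, tag in examples:
            z = bias + sum(w * x for w, x in zip(weights, features))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - tag  # gradient of the log loss with respect to z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(model, features):
    """Return 1 if the neuron's score is positive, otherwise 0."""
    weights, bias = model
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if z > 0 else 0


# Hypothetical tagged data: two made-up features per object.
# Tag 1 marks the class we want to find, tag 0 marks everything else.
tagged = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.95], 1),
    ([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.15], 0),
]

model = train_classifier(tagged)
print(predict(model, [0.85, 0.85]), predict(model, [0.15, 0.15]))
```

A production system would learn its features directly from pixels and would need far more annotated examples than six, which is exactly why tagging is the bottleneck described above.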

Increased Computing for Powering DNNs

It took a few decades until DNNs were applied to video analytics because Central Processing Units (CPUs) were too slow to train DNNs effectively, a task that requires a great deal of processing power. In the mid-2000s, with the development of cluster and cloud computing, greater data storage, and more powerful Graphics Processing Units (GPUs), it became much easier to leverage DNNs for analyzing images and videos. Increased coverage and cost-efficient processing allow systems to continuously process more video and aggregate metadata over time, making the data more accessible and actionable and offering deeper insight from previously underutilized video.

Modern video content analytics solutions use GPUs and Deep Learning to break live or archived video into structured data with rich metadata. Beyond alerting functionality, Deep Learning-driven solutions make it possible to uncover quantifiable data and trends from video metadata, and to derive actionable insights for business intelligence as well as data-driven safety, security, and operational decision making.
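As a rough illustration of what "structured data with rich metadata" can look like, the Python sketch below represents each detected object as a small record and aggregates the records into hourly counts. The record fields, camera names and values are hypothetical and do not reflect any particular product's schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One object found in the video (hypothetical schema)."""
    camera_id: str
    timestamp_s: float  # seconds since the start of the recording
    object_class: str
    color: str


def hourly_counts(detections, object_class):
    """Count detections of one class per hour of footage."""
    counts = Counter()
    for d in detections:
        if d.object_class == object_class:
            counts[int(d.timestamp_s // 3600)] += 1
    return dict(sorted(counts.items()))


detections = [
    Detection("cam-1", 120.0, "person", "red"),
    Detection("cam-1", 3700.0, "person", "blue"),
    Detection("cam-1", 3900.0, "vehicle", "white"),
    Detection("cam-2", 4000.0, "person", "red"),
]

print(hourly_counts(detections, "person"))  # {0: 1, 1: 2}
```

Once video is reduced to records like these, trend questions such as "how many people passed through per hour" become simple aggregations rather than hours of manual review.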

Supported by Deep Learning and AI, advanced video analytics enable object extraction, recognition, classification, and indexing, making video searchable, actionable and quantifiable. This, in turn, enables video system operators to review hours of video in minutes and rapidly identify people and objects of interest, extracting maximum value from video for safety, security and productivity.
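The indexing step can be sketched in Python as a simple inverted index over object metadata, assuming the objects have already been extracted and classified. The field names ("time", "kind", "color") and the sample records are assumptions for illustration only.

```python
from collections import defaultdict


def build_index(records):
    """Map each (field, value) pair to the positions of matching records."""
    index = defaultdict(set)
    for pos, record in enumerate(records):
        for field, value in record.items():
            if field != "time":  # time is used for ordering, not lookup
                index[(field, value)].add(pos)
    return index


def search(records, index, **criteria):
    """Return the records matching every criterion, ordered by time."""
    if not criteria:
        return []
    candidate_sets = [index.get((f, v), set()) for f, v in criteria.items()]
    hits = set.intersection(*candidate_sets)
    return sorted((records[i] for i in hits), key=lambda r: r["time"])


# Hypothetical metadata for objects extracted from one video.
records = [
    {"time": 412.0, "kind": "person", "color": "red"},
    {"time": 95.5, "kind": "vehicle", "color": "red"},
    {"time": 30.0, "kind": "person", "color": "red"},
    {"time": 250.0, "kind": "person", "color": "blue"},
]

index = build_index(records)
for hit in search(records, index, kind="person", color="red"):
    print(hit["time"], hit["kind"], hit["color"])
```

Because each lookup is a set intersection rather than a scan of the footage, an operator's query for "people in red" returns only the relevant moments, which is the mechanism behind reviewing hours of video in minutes.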

Schedule a personalized platform demonstration to learn more about how BriefCam’s fusion of Deep Learning-based analytics and unique Video Synopsis capabilities enables leading law enforcement agencies and major enterprises across the globe to maximize their investments in video surveillance.