Incident Detection in a Multi-Camera Environment for Visual Surveillance Applications
To extend the capabilities of existing visual surveillance systems, we are developing a 3D framework dedicated to incident detection based on a multi-camera setup. Our goal in this project is two-fold:
- Processing the output of several cameras in order to handle occlusions among people and their environment and provide us with more robust people detection and tracking strategies;
- Capturing the motion of a person or a group of people in order to make the interpretation of abnormal behaviors much easier.
We aim to design a system that combines the video streams from several cameras with overlapping views to generate a 3D representation of the scene under surveillance, with the help of planimetric information when available.
Based on this representation, we will detect individuals and groups of individuals in the scene, represent their relative positions in 3D space, and analyze their behaviors and interactions.
Our project has been progressing so far through the following steps:
- Multi-people detection on a single time frame using a probabilistic occupancy map (POM)
- Multi-people tracking using dynamic programming
- Detection-by-classification from multiple views
- Anomaly detection using behavioral maps
Multi-People Detection on a Single Time Frame using a Probabilistic Occupancy Map
From the original video streams, a segmentation algorithm generates streams of binary images by estimating which parts of each picture differ from an estimated background picture.
From those binary streams, our algorithm iteratively estimates for
each location in the room the probability for an individual to be
present. In a nutshell, the algorithm optimizes those probabilities so
that average images computed from those probabilities match the image
provided by the segmentation algorithm. The convergence process can be
displayed by showing successively the computed average image for the
current estimates and the original images.
Finally, the detection on the complete sequence can be achieved by
running this algorithm on each individual frame and keeping the local
maxima as likely to be locations of individuals.
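The final step, keeping the local maxima of an occupancy map, can be sketched as follows. This is a minimal illustration and not the released code: the function name `local_maxima`, the list-of-lists grid layout, the 8-cell neighbourhood, and the `threshold` value are all assumptions made for the example.

```python
def local_maxima(occupancy, threshold=0.5):
    """Return (row, col) cells whose probability is at least `threshold`
    and strictly greater than that of every neighbouring cell.

    `occupancy` is a list of rows of probabilities (hypothetical layout).
    """
    rows, cols = len(occupancy), len(occupancy[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            p = occupancy[r][c]
            if p < threshold:
                continue
            # Collect the up-to-8 neighbours, clipping at the grid border.
            neighbours = [
                occupancy[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(p > q for q in neighbours):
                peaks.append((r, c))
    return peaks


grid = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.7],
    [0.0, 0.0, 0.0, 0.6],
]
detections = local_maxima(grid)
```

In this toy grid the cell with probability 0.6 is above the threshold but is discarded because its neighbour (0.7) is higher.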
Multi-People Tracking using Dynamic Programming
We add tracking capability to our framework by applying dynamic programming to the occupancy probability maps computed by our detection algorithm. We design an HMM that takes into account the occupancy probabilities, an appearance model of the people, and a simple isotropic pedestrian motion model. Such a model allows us to use the Viterbi algorithm to retrieve the most probable trajectories over a batch of frames.
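To illustrate the idea for a single person, here is a simplified Viterbi sketch over a sequence of occupancy grids. It is not the authors' implementation: the appearance model is omitted, the isotropic motion model is approximated by an unnormalized uniform distribution over cells within `max_step` of the previous cell, and the names `viterbi_track`, `max_step`, and `eps` are hypothetical.

```python
import math


def viterbi_track(occupancy_seq, max_step=1, eps=1e-9):
    """Most probable cell sequence for one person.

    occupancy_seq[t][r][c] is the occupancy probability at frame t.
    Moves are allowed only to cells at most `max_step` away in each axis;
    all allowed moves are treated as equally likely, so the (constant)
    transition term is dropped from the score.
    """
    rows, cols = len(occupancy_seq[0]), len(occupancy_seq[0][0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]

    def log_emit(t, cell):
        return math.log(occupancy_seq[t][cell[0]][cell[1]] + eps)

    def predecessors(cell):
        r, c = cell
        return [
            (rr, cc)
            for rr in range(max(0, r - max_step), min(rows, r + max_step + 1))
            for cc in range(max(0, c - max_step), min(cols, c + max_step + 1))
        ]

    score = {cell: log_emit(0, cell) for cell in cells}
    back = []  # back[t - 1][cell] = best predecessor of `cell` at frame t
    for t in range(1, len(occupancy_seq)):
        new_score, pointers = {}, {}
        for cell in cells:
            best_prev = max(predecessors(cell), key=lambda p: score[p])
            new_score[cell] = score[best_prev] + log_emit(t, cell)
            pointers[cell] = best_prev
        score = new_score
        back.append(pointers)

    # Backtrack from the best final cell.
    cell = max(score, key=score.get)
    path = [cell]
    for pointers in reversed(back):
        cell = pointers[cell]
        path.append(cell)
    path.reverse()
    return path


frames = [
    [[0.9, 0.1, 0.1, 0.1, 0.1]],
    [[0.1, 0.8, 0.1, 0.1, 0.95]],  # the 0.95 peak is not reachable in one step
    [[0.1, 0.1, 0.9, 0.1, 0.1]],
]
trajectory = viterbi_track(frames)
```

The isolated high value in the second frame is ignored because no plausible motion connects it to the strong detections in the neighbouring frames, which is the benefit of optimizing over a batch of frames rather than frame by frame.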
To deal with the complexity of optimizing multiple trajectories simultaneously, we treat them as independent and optimize them one at a time. We carefully choose the order in which trajectories are computed, based on a reliability score, to avoid confusing one trajectory with another.
As shown in the videos below, our algorithm is capable of following up to 6 people in a small room for several minutes without any tracking error. The results also show that the choice of detection box size does not have a strong influence on the accuracy of our algorithm.
Detection-by-Classification from Multiple Views
To avoid background subtraction, which can be very sensitive to image quality and lacks discriminative power, we perform people detection directly in the image plane.

Figure: Overview of our detection-by-classification method.
We train a decision tree to correctly classify windows containing a pedestrian.
As illustrated in the figure above, we then apply the classifier in each camera view independently at every possible position of the ground plane. We thus obtain as many score maps as there are cameras, which we then merge using our 3D knowledge of the scene together with a model of the classifier response. We finally obtain an occupancy map similar to the one derived with our people detection algorithm, but without requiring background subtraction.
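The merging rule is not detailed on this page. As a rough illustration only, the sketch below fuses per-camera score maps under two assumptions that may differ from the actual method: each score is already a calibrated probability that the ground-plane cell is occupied, and the cameras are conditionally independent given occupancy (a naive Bayes combination in log-odds). The function name `fuse_score_maps`, the `prior` parameter, and the use of `None` for cells a camera cannot see are hypothetical.

```python
import math


def fuse_score_maps(score_maps, prior=0.01, eps=1e-6):
    """Fuse per-camera ground-plane score maps into one occupancy map.

    score_maps[k][r][c] is camera k's probability that cell (r, c) is
    occupied, or None when the cell is outside that camera's view.
    """
    def logit(p):
        p = min(max(p, eps), 1.0 - eps)  # keep the log finite
        return math.log(p / (1.0 - p))

    rows, cols = len(score_maps[0]), len(score_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = logit(prior)
            for cam in score_maps:
                s = cam[r][c]
                if s is not None:
                    # Each camera contributes its evidence relative to the prior.
                    total += logit(s) - logit(prior)
            fused[r][c] = 1.0 / (1.0 + math.exp(-total))
    return fused


cam_a = [[0.8, 0.5], [0.8, 0.2]]
cam_b = [[0.8, 0.5], [None, 0.2]]
occupancy = fuse_score_maps([cam_a, cam_b], prior=0.5)
```

With a neutral prior of 0.5, two agreeing cameras reinforce each other (0.8 and 0.8 give about 0.94), while a cell seen by a single camera simply keeps that camera's score.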
Anomaly Detection using Behavioral Maps
We extend our framework in order to detect abnormal behaviors using behavioral maps. The key idea is to represent standard movements in a scene with a set of behavioral maps. A behavioral map encodes, for every position of a top view, the probability of movement on the ground as well as the probability of switching to another behavioral map. The possibility for a tracked person to switch between maps allows us to model complex situations, with crossings or intersections for example.
To learn the behavioral maps corresponding to a given situation, we process video streams of the scene with our people detection algorithm and obtain a set of probabilistic occupancy maps. We then perform Expectation Maximization on this data to generate a number of behavioral maps. The ideal number of maps for a situation is assessed by cross-validation. The figure below shows a set of behavioral maps extracted from a test scenario, which is depicted in the left-most image. The two right-most images represent the probability of staying in the same map, with dark colors representing high probability.
Figure panels, left to right: scenario, 1st movement map, 2nd movement map, 1st transition map, 2nd transition map.
Once computed, the behavioral maps are used in two different ways. First, we can use this knowledge of how people move through the scene to improve our people tracking algorithm: we replace the simple isotropic motion model with the more elaborate motion model given by the behavioral maps. This was shown to improve tracking accuracy by reducing the number of mixed-up trajectories in ambiguous situations.
Second, we can evaluate the likelihood of the trajectories retrieved by tracking with the help of the behavioral maps. This way, we can detect behaviors that clearly differ from those observed during training. The videos below show an example of atypical motion detection. They correspond to the scenario illustrated by the figure above.
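One plausible way to score a trajectory against a set of behavioral maps is the scaled forward algorithm, with the active map as the hidden state. The sketch below is an assumption-laden illustration, not the method's actual code: the dictionary layout of `move` and `switch`, the convention that a person first switches map and then moves, the uniform prior over the initial map, the `eps` floor for unseen events, and the `threshold` value are all hypothetical.

```python
import math


def trajectory_log_likelihood(path, move, switch, eps=1e-6):
    """Average per-step log-likelihood of `path` under behavioral maps.

    path           : list of (row, col) positions on the top view
    move[m][pos]   : dict mapping a displacement (dr, dc) to its
                     probability under map m at position pos
    switch[m][pos] : dict mapping a map index m2 to the probability of
                     going from map m to map m2 at position pos
    Events missing from the dictionaries get the small probability `eps`.
    """
    n_maps = len(move)
    alpha = [1.0 / n_maps] * n_maps  # uniform prior over the initial map
    log_lik = 0.0
    for prev, cur in zip(path, path[1:]):
        step = (cur[0] - prev[0], cur[1] - prev[1])
        new_alpha = []
        for m2 in range(n_maps):
            # Probability of being in map m2 after the switch...
            reach = sum(
                alpha[m] * switch[m].get(prev, {}).get(m2, eps)
                for m in range(n_maps)
            )
            # ...times the probability of this displacement under m2.
            new_alpha.append(reach * move[m2].get(prev, {}).get(step, eps))
        norm = sum(new_alpha)
        log_lik += math.log(norm)
        alpha = [a / norm for a in new_alpha]
    return log_lik / max(1, len(path) - 1)


def is_atypical(path, move, switch, threshold=-3.0):
    """Flag a trajectory whose average log-likelihood is below `threshold`."""
    return trajectory_log_likelihood(path, move, switch) < threshold


# One map in which people almost always walk to the right along a corridor.
move = [{(0, c): {(0, 1): 0.9, (1, 0): 0.1} for c in range(3)}]
switch = [{(0, c): {0: 1.0} for c in range(3)}]

usual = [(0, 0), (0, 1), (0, 2)]
against_flow = [(0, 2), (0, 1), (0, 0)]
usual_score = trajectory_log_likelihood(usual, move, switch)
flagged = is_atypical(against_flow, move, switch)
```

Averaging over the number of steps keeps the score comparable between short and long trajectories; in practice the threshold would have to be chosen from the scores of trajectories seen during training.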
Source Code
The source code that we wrote for the people detection part of this project has been released under a GPL license. You can download it from the Software page of our web site.
Data Sets
Some of the multi-camera video sequences that we acquired for this project are available for download on the Data part of our web site.
Contact
J. Berclaz [jerome.berclaz@epfl.ch],
F. Fleuret [francois.fleuret@idiap.ch]