Center for Research in Computer Vision

Statistical Inference of Motion in the Invisible


This work focuses on the unexplored problem of inferring the motion of objects that are invisible to all cameras in a multiple-camera setup. Given object trajectories within the fields of view (FOVs) of disjoint cameras, we introduce constraints on the behavior of objects as they travel through the unobservable areas that lie in between. These constraints include vehicle following (the trajectories of vehicles adjacent to each other at entry and exit are time-shifted relative to each other), collision avoidance (no two trajectories pass through the same location at the same time), and temporal smoothness (the allowable movements of vehicles are restricted by physical limits). The constraints are embedded in a generalized, global cost function for the entire scene that incorporates the influences of all objects. A bounded minimization of this cost using an interior point algorithm then yields trajectory representations of objects that define their exact dynamics and behavior while invisible. Finally, a statistical representation of motion in the entire scene is estimated to obtain a probabilistic distribution representing individual behaviors, such as turns, constant-velocity motion, deceleration to a stop, and acceleration from rest, for evaluation and visualization. Experiments are reported on real-world videos from multiple disjoint cameras in the NGSIM data set, and qualitative as well as quantitative analysis confirms the validity of our approach.
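As a rough illustration of how such constraints can be folded into one scene-wide cost, the following Python sketch sums a temporal smoothness term and a pairwise collision term over all trajectories. It is not the paper's formulation: the function names, the weights `w_smooth` and `w_collide`, the `min_gap` distance, and the assumption that all trajectories are sampled at the same time steps are hypothetical choices made for this example. The vehicle-following term and the interior point minimization are omitted.

```python
import math

def smoothness_cost(traj):
    """Sum of squared second differences (discrete acceleration) of a 2-D path."""
    cost = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(traj, traj[1:], traj[2:]):
        ax = x2 - 2 * x1 + x0
        ay = y2 - 2 * y1 + y0
        cost += ax * ax + ay * ay
    return cost

def collision_cost(traj_a, traj_b, min_gap=1.0):
    """Penalize time steps at which two vehicles are closer than min_gap."""
    cost = 0.0
    for (xa, ya), (xb, yb) in zip(traj_a, traj_b):
        d = math.hypot(xa - xb, ya - yb)
        if d < min_gap:
            cost += (min_gap - d) ** 2
    return cost

def scene_cost(trajs, w_smooth=1.0, w_collide=10.0, min_gap=1.0):
    """Global cost: smoothness of every trajectory plus all pairwise collisions."""
    total = 0.0
    for i, traj in enumerate(trajs):
        total += w_smooth * smoothness_cost(traj)
        for other in trajs[i + 1:]:
            total += w_collide * collision_cost(traj, other, min_gap)
    return total

# Two constant-velocity vehicles that cross the same point at the same time.
a = [(float(t), 0.0) for t in range(5)]
b = [(2.0, float(t) - 2.0) for t in range(5)]
# The same second vehicle, delayed so that it crosses after the first has passed.
c = [(2.0, float(t) - 4.0) for t in range(5)]

print(scene_cost([a, b]))  # positive: the collision is penalized
print(scene_cost([a, c]))  # zero: smooth and collision-free
```

A minimizer working on a cost of this shape would trade off smoothness against collisions for all vehicles jointly, which is the reason for defining the cost over the whole scene rather than per vehicle.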

In the above figure, the first image depicts the input to our method: correspondences across multiple disjoint cameras. In this case there are five cameras; the FOVs of the cameras are shown in different colors, whereas the invisible region is shown in black. Given this input, we reconstruct individual trajectories using the constraints introduced in this paper. Next, the reconstructed trajectories are used to infer the expected behavior at each location in the scene, shown as thick colored regions, where the direction of motion is encoded by the HSV color wheel. We also infer different behaviors, such as stopping and turning, from the reconstructed trajectories.


Data Set

We ran our experiments on two datasets from NGSIM (see [26] for details). The first invisible region was from Lankershim, 8:30am to 8:45am, located at the intersection of Lankershim/Universal Hollywood Dr. (LA), with a total of 1211 vehicles passing through the region. The second invisible region was from Peachtree, 4:00pm to 4:15pm, located at the intersection of Peachtree/10th Street NE (Atlanta), with 657 vehicles passing through the region. Both were typical four-legged intersections with three possible paths for a vehicle entering a particular leg, resulting in 12 paths in total. The following figure shows the trajectories that were output for both datasets.


First, we present some qualitative results. In the figure below, the black trajectory corresponds to the vehicle under consideration, while proximal vehicles with which it could possibly collide are shown in colors. In (a) and (c), the trajectories are drawn assuming constant velocity for each vehicle. In (a), the vehicle collides with one of the other vehicles, whereas in (c), the vehicle under consideration collides with six different vehicles. The locations of the collisions are shown with red spheres, which are partially hidden by other vehicles. Notice the change in shape in (b) and (d) after inferring motion for all trajectories, with the outcome that none of the trajectories collides with the black trajectory. Both the vehicle-following and smoothness constraints are also visibly in effect in both examples.
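The constant-velocity baseline and the collision check described above can be sketched as follows. This is only an illustration under stated assumptions: trajectories are stored as dictionaries from integer frame number to 2-D position, a collision is declared when two vehicles are closer than a hypothetical `min_gap`, and the function names are invented for this example.

```python
import math

def constant_velocity(entry, exit_, t_in, t_out):
    """Linearly interpolate positions for frames t_in..t_out (requires t_out > t_in)."""
    (x0, y0), (x1, y1) = entry, exit_
    n = t_out - t_in
    return {
        t_in + k: (x0 + (x1 - x0) * k / n, y0 + (y1 - y0) * k / n)
        for k in range(n + 1)
    }

def collisions(traj_a, traj_b, min_gap=1.0):
    """Frames present in both trajectories at which the vehicles are closer than min_gap."""
    shared = traj_a.keys() & traj_b.keys()
    return sorted(t for t in shared if math.dist(traj_a[t], traj_b[t]) < min_gap)

# A vehicle travelling east and one travelling north through the same point.
east = constant_velocity((0.0, 0.0), (10.0, 0.0), 0, 10)
north = constant_velocity((5.0, -5.0), (5.0, 5.0), 0, 10)
# The same northbound path, entered four frames later.
north_late = constant_velocity((5.0, -5.0), (5.0, 5.0), 4, 14)

print(collisions(east, north))       # frames at which the two vehicles overlap
print(collisions(east, north_late))  # empty when the crossing times differ
```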

In the next figure, each row is an example of trajectory reconstruction. The vehicle under consideration is shown with squares: yellow depicts constant velocity, red is the proposed method, and green marks the ground truth. The rest of the vehicles are shown in black. In the first row, reconstruction with constant velocity causes collisions at t = 381 and 521, and in the second row, between t = 1200 and 1500. In contrast, the proposed method and the ground truth allow the vehicles to pass without any collision.

Finally, the following figure shows the error profile for our method (yellow) vs. constant velocity (black) for both datasets. Our method has a smaller error magnitude and thus provides more accurate inference. Panel (c) shows ROC curves for our method (solid) vs. constant velocity (dashed) for Lankershim (red) and Peachtree (green): the x-axis is the distance threshold in feet, and the y-axis gives the percentage of points that lie within that threshold distance of the ground truth.
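The threshold curve described above can be computed as in the following sketch, which reports, for each distance threshold, the fraction of reconstructed points lying within that distance of the corresponding ground-truth point. The function name and the sample values are hypothetical and are not taken from the experiments.

```python
import math

def fraction_within(predicted, truth, thresholds):
    """For each threshold, the fraction of points within that distance of the ground truth."""
    dists = [math.dist(p, g) for p, g in zip(predicted, truth)]
    return [sum(d <= th for d in dists) / len(dists) for th in thresholds]

# Made-up points with errors of 0, 5 and 10 units.
predicted = [(0.0, 0.0), (3.0, 4.0), (0.0, 10.0)]
truth = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]

curve = fraction_within(predicted, truth, thresholds=[1.0, 5.0, 20.0])
print(curve)  # rises toward 1.0 as the threshold grows
```

Multiplying each value by 100 gives the percentage plotted on the y-axis.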

Statistical Representation of Motion

In the following figure, each row is the Mixture of Gaussians representation for a particular path using constant velocity, the proposed method, and the ground truth. The patterns in the second and third columns are similar and capture acceleration, deceleration, start, and stop behaviors, whereas in the first column all Gaussians have the same variance due to constant velocity.
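The equal-variance effect of constant velocity can be seen in a much simplified setting. The sketch below does not fit a Mixture of Gaussians; it only computes the mean and variance of 1-D positions along a path within consecutive time windows, treating each window as one Gaussian. The function name, the window length, and the sample positions are assumptions made for this illustration.

```python
from statistics import mean, pvariance

def window_gaussians(positions, window):
    """Return (mean, variance) of the positions in each consecutive, non-overlapping window."""
    starts = range(0, len(positions) - window + 1, window)
    return [
        (mean(positions[i:i + window]), pvariance(positions[i:i + window]))
        for i in starts
    ]

# Constant velocity: one unit of distance per frame.
constant = [float(t) for t in range(12)]
# Deceleration to a stop: the vehicle slows down and then stands still.
braking = [0.0, 4.0, 7.0, 9.0, 10.0, 10.0, 10.0, 10.0]

print(window_gaussians(constant, 4))  # the same variance in every window
print(window_gaussians(braking, 4))   # variance shrinks to zero once stopped
```

With equal time windows, constant velocity covers the same distance in each window, so every per-window Gaussian has the same spread, while stopping or accelerating changes the spread from window to window.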

Scene Structure & Status Inference

Given the inferred motion and behavior of objects in the invisible regions, we propose to estimate some key aspects of the scene structure and status, both to show the importance and usefulness of our framework and to allow evaluation. The following figure shows the stopping times (probability of green signal) for each of the eight possible legs. In each graph, the x-axis is time and the y-axis is the probability from our method (blue) and the ground truth (black), which are closely aligned in time.

The probability maps for stopping positions inferred for both datasets are shown in the following figure. The maps are consistent with reality, where vehicles stop and queue before the signal.

Related Publication

Haroon Idrees, Imran Saleemi, and Mubarak Shah, Statistical Inference of Motion in the Invisible, 12th European Conference on Computer Vision (ECCV), Florence, Italy, October 7-13, 2012. [Video of Presentation]
