Segmentation Dataset

Summary

The dataset consists of images obtained from a front-facing camera attached to a car. The car was driven around the cities of Hyderabad and Bangalore and their outskirts. Most images are 1080p, but some are 720p or other resolutions. The dataset is divided into train, val and test splits as follows:

Type     Images     Drive Sequences
Full     10,003     182
Train    6,993      120
Val      981        22
Test     2,029      40
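
As a quick sanity check on the table above, the following sketch confirms that the three splits add up to the full dataset and reports each split's share. The counts are copied from the table; the dictionary and function names are illustrative only and are not part of any released tooling.

```python
# Image and drive-sequence counts copied from the table above.
SPLITS = {
    "train": {"images": 6993, "sequences": 120},
    "val": {"images": 981, "sequences": 22},
    "test": {"images": 2029, "sequences": 40},
}
FULL = {"images": 10003, "sequences": 182}


def split_shares(splits, full, key="images"):
    """Return each split's fraction of the full dataset for the given key."""
    total = sum(counts[key] for counts in splits.values())
    if total != full[key]:
        raise ValueError(f"splits sum to {total}, expected {full[key]}")
    return {name: counts[key] / full[key] for name, counts in splits.items()}


if __name__ == "__main__":
    for name, share in split_shares(SPLITS, FULL).items():
        print(f"{name}: {share:.1%} of images")
```

By image count the split works out to roughly 70% train, 10% val and 20% test.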

Label Hierarchy and Statistics

Our dataset annotations include distinctive labels such as billboard, auto-rickshaw and animal. We also focus on identifying probable safe driving areas beside the road.

The labels for the dataset are organized as a four-level hierarchy. Unique integer identifiers are given for each of these levels. The histogram below gives:

  1. Pixel counts for each label on the y axis.
  2. The four-level label hierarchy and the label ids for the intermediate levels (level 2, level 3).
  3. The color coding used for each label in the prediction and ground truth masks.

[Figure: Annotation Hierarchy]
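
The sketch below illustrates one way a four-level hierarchy with per-level integer identifiers could be represented, remapped to a coarser level, and summarized as per-label pixel counts. The label names come from the text above, but every identifier value, every grouping, and the names `Label`, `remap_to_level` and `pixel_counts` are hypothetical placeholders; the real values are those given in the figure and the dataset specification. Masks are plain lists of rows to keep the example dependency-free; real masks would normally be image arrays.

```python
from collections import Counter
from typing import NamedTuple


class Label(NamedTuple):
    name: str
    level4_id: int  # finest level
    level3_id: int
    level2_id: int
    level1_id: int  # coarsest level


# Label names come from the text; every id and grouping is a placeholder.
LABELS = [
    Label("road", 0, 0, 0, 0),
    Label("auto-rickshaw", 1, 1, 1, 1),
    Label("animal", 2, 2, 1, 1),
    Label("billboard", 3, 3, 2, 2),
]


def remap_to_level(mask, level):
    """Convert a mask of level-4 ids (list of rows) to ids at another level."""
    if level not in (1, 2, 3, 4):
        raise ValueError("level must be 1, 2, 3 or 4")
    field = f"level{level}_id"
    lookup = {label.level4_id: getattr(label, field) for label in LABELS}
    return [[lookup[pixel] for pixel in row] for row in mask]


def pixel_counts(mask):
    """Count pixels per id, the quantity a per-label histogram plots."""
    return Counter(pixel for row in mask for pixel in row)


if __name__ == "__main__":
    mask = [[0, 0, 1], [2, 3, 3]]  # toy 2x3 mask of level-4 ids
    print(pixel_counts(mask))
    print(remap_to_level(mask, level=2))
```

Keeping one lookup table per level means a model can be trained or evaluated at a coarser granularity without re-annotating anything.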

Examples

The figure below shows examples from the validation set. The columns are, in order: the input images, the predictions of a baseline Cityscapes-pretrained model, the predictions of the same baseline trained on this dataset, and the ground truths.

As can be seen, models trained on our dataset clearly distinguish muddy drivable areas beside the road from the road itself. Our dataset has labels such as billboards and curbs/medians in the middle of the road. Our image frames also come from unstructured driving settings, where the road is muddy, lane discipline is often not followed, and there are many vehicles on the road.

[Figure: Examples]


Dataset specifications & Evaluation Code

More information about the dataset and the evaluation code is available here.
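
The released evaluation code is the authoritative reference. Purely as an illustration, the sketch below computes mean intersection-over-union (mIoU), a metric commonly used for semantic segmentation; that this dataset is scored with mIoU, the `ignore_id` value of 255, and the function name `mean_iou` are all assumptions made here, not details taken from the evaluation code.

```python
def mean_iou(prediction, ground_truth, num_classes, ignore_id=255):
    """Mean intersection-over-union over classes present in either mask.

    Both masks are lists of rows of integer label ids. Pixels whose ground
    truth equals ignore_id (an assumed convention) are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, gt_row in zip(prediction, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if gt == ignore_id:
                continue
            if pred == gt:
                intersection[gt] += 1
                union[gt] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[gt] += 1
                if 0 <= pred < num_classes:
                    union[pred] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [[0, 1], [1, 1]]
    gt = [[0, 0], [1, 255]]  # last pixel is ignored
    print(mean_iou(pred, gt, num_classes=2))
```

The per-pixel loop is written for readability; on full-resolution masks an array-based confusion matrix would be far faster.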




Detection Dataset

Summary

The dataset consists of images obtained from a front-facing camera attached to a car. The car was driven around the cities of Hyderabad and Bangalore and their outskirts. Most images are 1080p, but some are 720p or other resolutions. The dataset is divided into train, val and test splits as follows:

Type     Images
Full     46,588
Train    31,569
Val      10,225
Test     4,794

Label Statistics

Below is the histogram of the pixel distribution:

  1. Pixel counts for each label on the y axis.
  2. Label names on the x axis.

[Figure: Label Hierarchy]

Example Images

Some images (left) followed by their detection output (right).

[Figure: four example images, each followed by its detection output]