Pretagging

Lightly Docker supports the use of pretrained models to tag the dataset, a process we call pretagging. For now, we offer a pretrained object detection model optimized for autonomous driving.

Using a pretrained model does not remove the need for high-quality human annotations. However, the model predictions give a rough idea of the underlying distribution within the dataset.

The model is capable of detecting the following core classes:

  • bicycle

  • bus

  • car

  • motorcycle

  • person

  • train

  • truck

How It Works

Our pretagging model is based on a FasterRCNN model with a ResNet-50 backbone. The model has been trained on a dataset consisting of ~100k images.

The results of pretagging are visualized in the report. We report the object distribution both before and after the selection process.

The following image shows an example of such a histogram for the input data before filtering.

Histogram plot of the pretagging model for the input data (full dataset). The plot shows the distribution of the detected classes. It also shows the average number of objects per image.

For every Docker run with pretagging enabled, we also dump all model predictions into a JSON file with the following format:

// boxes have format x1, y1, x2, y2
[
    {
        "filename": "0000000095.png",
        "boxes": [
            [
                0.869,
                0.153,
                0.885,
                0.197
            ],
            [
                0.231,
                0.175,
                0.291,
                0.202
            ]
        ],
        "labels": [
            "person",
            "car"
        ],
        "scores": [
            0.9845203757286072,
            0.9323102831840515
        ]
    },
    ...
]
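As a rough illustration of how this file could be consumed, the following Python sketch counts detections per class and computes the average number of objects per image, similar in spirit to the histogram in the report. The function names, the `min_score` threshold, and the file path passed to `load_predictions` are assumptions made for this example; they are not part of Lightly Docker.

```python
import json
from collections import Counter


def load_predictions(path):
    """Load a predictions file in the format shown above.

    The path is hypothetical; use the location of the file in your output.
    """
    with open(path) as f:
        return json.load(f)


def summarize_predictions(predictions, min_score=0.0):
    """Return (objects per class, average objects per image).

    Detections with a score below min_score are ignored. The threshold is
    an assumption of this sketch, not a Lightly Docker setting.
    """
    class_counts = Counter()
    for entry in predictions:
        for label, score in zip(entry["labels"], entry["scores"]):
            if score >= min_score:
                class_counts[label] += 1
    total = sum(class_counts.values())
    average = total / len(predictions) if predictions else 0.0
    return class_counts, average


# Single entry copied from the example format above.
example = [
    {
        "filename": "0000000095.png",
        "boxes": [[0.869, 0.153, 0.885, 0.197], [0.231, 0.175, 0.291, 0.202]],
        "labels": ["person", "car"],
        "scores": [0.9845203757286072, 0.9323102831840515],
    }
]

counts, avg_objects = summarize_predictions(example)
print(dict(counts), avg_objects)  # {'person': 1, 'car': 1} 2.0
```

Passing a higher `min_score` (for example 0.95) keeps only the most confident detections, which can help when judging how reliable the class distribution is.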

Usage

Pretagging can be activated by passing the following argument to your docker run command: pretagging=True

  • pretagging=True enables the use of the pretagging model

  • pretagging_debug=True adds a few images to the report for debugging, showing each image with its bounding box predictions.

    Note

    The debug mode for pretagging is currently not available for video datasets.

The final docker run command to enable pretagging as well as pretagging_debug should look like this:

docker run --gpus all --rm -it \
   -v OTHER_INPUT_DIR:/home/input_dir:ro \
   -v SHARED_DIR:/home/shared_dir \
   -v OUTPUT_DIR:/home/output_dir \
   lightly/sampling:latest \
   token=MYAWESOMETOKEN \
   pretagging=True \
   pretagging_debug=True
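If you launch runs from a script, the same command can be assembled programmatically. The sketch below is only one possible way to do this: `build_pretagging_command` is a hypothetical helper, and the directory and token values are the placeholders from the command above.

```python
import shlex


def build_pretagging_command(input_dir, shared_dir, output_dir, token, debug=False):
    """Assemble the docker run command shown above as an argument list.

    Hypothetical helper for illustration; it only mirrors the flags
    documented in this section.
    """
    cmd = [
        "docker", "run", "--gpus", "all", "--rm", "-it",
        "-v", f"{input_dir}:/home/input_dir:ro",
        "-v", f"{shared_dir}:/home/shared_dir",
        "-v", f"{output_dir}:/home/output_dir",
        "lightly/sampling:latest",
        f"token={token}",
        "pretagging=True",
    ]
    if debug:
        # Adds debugging images to the report (not available for videos).
        cmd.append("pretagging_debug=True")
    return cmd


cmd = build_pretagging_command(
    "OTHER_INPUT_DIR", "SHARED_DIR", "OUTPUT_DIR", "MYAWESOMETOKEN", debug=True
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. When the script runs without an attached terminal, the `-it` flags may need to be dropped.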

The following shows an example of what the debugging images in the report look like:

The plot shows the bounding boxes detected by the pretagging model overlaid on the image. Use the debug feature to check whether pretagging works properly on your dataset.

Pretagging for Selection

You can also use pretagging to guide the data selection process. This can be helpful if, for example, you only care about images that contain at least one person.

To set up such a selection mechanism, you need to create a config file.

For example, to select only images with at least one person, we can create a min_requirements.json file:

{
    "person": 1
}

Move this file to the shared directory to make it accessible to the Docker container. Finally, run the container with pretagging=True and pretagging_config=min_requirements.json.
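The config file can also be written from Python, and the "at least N objects per class" rule can be checked against a list of predicted labels. This is a minimal sketch under assumptions: `write_min_requirements` and `meets_requirements` are hypothetical helpers, and the check reflects our reading of the minimum-requirements idea, not the selection logic used inside the container.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def write_min_requirements(shared_dir, requirements, name="min_requirements.json"):
    """Write the requirements dict as JSON into the shared directory."""
    path = Path(shared_dir) / name
    path.write_text(json.dumps(requirements, indent=4))
    return path


def meets_requirements(labels, requirements):
    """Return True if every required class occurs at least the required number of times.

    Assumed interpretation of the config for illustration only.
    """
    counts = Counter(labels)
    return all(counts[cls] >= minimum for cls, minimum in requirements.items())


requirements = {"person": 1}

# A temporary directory stands in for SHARED_DIR in this example.
with tempfile.TemporaryDirectory() as shared_dir:
    config_path = write_min_requirements(shared_dir, requirements)
    print(config_path.name, json.loads(config_path.read_text()))

print(meets_requirements(["person", "car"], requirements))  # True
print(meets_requirements(["car", "truck"], requirements))   # False
```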