Transfer Learning Using Detectron2 and Comma10k

When developing new computer vision solutions, we typically rely on transfer learning: we take a pre-trained model and transfer it to a new task. Because transfer learning requires significantly fewer labeled examples, it is a great way to kick off new projects. In this example, we show how to use Lightly to select the most relevant images for labeling and for transfer learning of your model. We do this by uploading predictions to Lightly and using them in the data selection process.

This is an introductory tutorial on how to upload predictions to be used by Lightly. Here, you'll apply the concepts in Work with Predictions in a more concrete example.

You will learn the following:

  • How to compute object detection predictions with Detectron2.
  • How to store the predictions in the format Lightly expects, including the tasks.json and schema.json files.
  • How to upload the dataset and predictions to a cloud bucket so that they are accessible to Lightly.

Prerequisites

In order to upload predictions to a Lightly datasource, you will need the following things:

  • A cloud bucket to which you can upload your dataset. This tutorial uses an AWS S3 bucket.
  • OpenCV to read and preprocess the images. You can install it with the following command:
pip install opencv-python
  • The detectron2 framework installed on your machine. Here you can find the detectron2 installation documentation; a minimal install sketch is also shown right after this list.
  • The comma10k dataset. The dataset consists of 10,000 images for autonomous driving and is available on GitHub. You can download it with git (~24 GB):
git clone https://github.com/commaai/comma10k
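
If detectron2 is not installed yet, the command below is a minimal sketch based on detectron2's "install from source" instructions; the correct build depends on your Python, PyTorch, and CUDA versions, so consult the installation documentation linked above if it fails.

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'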

To make sure everything is installed correctly, run the following package imports:

# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common packages
import numpy as np
import os, json, cv2 
import tqdm
import matplotlib.pyplot as plt
from pathlib import Path

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
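
If the imports succeed, you can optionally print the installed versions and check whether a GPU is available. This snippet assumes that PyTorch was installed alongside detectron2:

import torch

# print library versions and check for GPU support
print("detectron2:", detectron2.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())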

Compute the Predictions

First, you'll have to compute the predictions you want to upload. In this tutorial, you will use detectron2, but you can replace it with a machine learning model of your choice. Computing the predictions consists of the following three steps:

  • Create a Detectron2 model.
  • Run the model.
  • Add tasks and schema.

Create a Detectron2 Model

The first step is to set up the object detection model. Use a Faster R-CNN with a ResNet-50 backbone, pre-trained on MS COCO.

cfg = get_cfg()
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
# cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
# cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
# cfg.MODEL.DEVICE = 'cpu' # If you are debugging on a cpu device
predictor = DefaultPredictor(cfg)
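
Before processing the whole dataset, you can optionally run the predictor on a single image and visualize the detections. The snippet below is a quick sanity check that uses one of the comma10k images and the COCO metadata shipped with detectron2:

# load one image from the dataset and run the model on it
im = cv2.imread("comma10k/imgs/0000_0085e9e41513078a_2018-08-19--13-26-08_11_864.png")
out = predictor(im)

# draw the predicted boxes (the Visualizer expects an RGB image)
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.5)
vis = v.draw_instance_predictions(out["instances"].to("cpu"))
plt.imshow(vis.get_image())
plt.axis("off")
plt.show()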

Run the Model

Now, collect all the filenames in the dataset and compute the predictions following the predictions folder structure. For each image, you will store an image_name.json file in the .lightly/predictions/object_detection_comma10k/ folder.

input_dir = "comma10k/imgs/"
output_dir = ".lightly/predictions/object_detection_comma10k/"

# create directories
Path(output_dir + "/comma10k/imgs/").mkdir(exist_ok=True, parents=True)

filenames = os.listdir(input_dir)
pbar = tqdm.tqdm(filenames, miniters=500, mininterval=60, maxinterval=120)
for fname in pbar:
  # load the image and run the detectron2 model
  fname_full = os.path.join(input_dir, fname)
  im = cv2.imread(fname_full)
  out = predictor(im)

  # convert the prediction into the Lightly format
  pred_box  = out["instances"].pred_boxes.tensor.cpu().numpy()
  pred_score = out["instances"].scores.cpu().numpy()
  pred_class = out["instances"].pred_classes.cpu().numpy().tolist()
  prediction = {
    "file_name": fname_full,
    "predictions": [],
  }
  for category_id, bbox, score in zip(pred_class, pred_box, pred_score):
    pred = {
      "category_id": category_id,
      "bbox": [float(bbox[0]),float(bbox[1]),float(bbox[2]),float(bbox[3])],
      "score": float(score)
    }
    prediction["predictions"].append(pred) 

  # create the prediction file for the image
  path_to_prediction = Path(output_dir) / Path("comma10k/imgs/" + fname).with_suffix('.json')
  with open(path_to_prediction, 'w') as f:
      json.dump(prediction, f)

The resulting folder structure should look like this:

.lightly/predictions/
└── object_detection_comma10k/
    └── comma10k/
        └── imgs/
            ├── 0000_0085e9e41513078a_2018-08-19--13-26-08_11_864.json
            ├── ...
            └── 0999_e8e95b54ed6116a6_2018-10-22--11-26-21_3_339.json

Add Tasks and Schema

Next, you will have to add a tasks.json and a schema.json file. Without them, Lightly will not be able to find the predictions. Head to Prediction Format if you need a refresher on these concepts.

The following script will create the required files for you.

task_name = "object_detection_comma10k"
tasks_path = ".lightly/predictions/tasks.json"

# create tasks.json
with open(tasks_path, "w") as f:
    json.dump([task_name], f)

# create schema.json
# list of classes from coco
coco_classes =[
        "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", 
        "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
        "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella",
        "handbag", "tie", "suitcase", "frisbee", "skis","snowboard", "sports ball", "kite", "baseball bat",
        "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", 
        "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", 
        "donut", "cake", "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", 
        "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", 
        "vase", "scissors", "teddy bear", "hair drier", "toothbrush"
]

schema = {
    "task_type": "object-detection",
    "categories": []
}
for i, name in enumerate(coco_classes):
    cat = {
        "id": i, 
        "name": name, 
        "supercategory": "none"
    }
    schema['categories'].append(cat)

schema_path = os.path.join(output_dir, 'schema.json')
with open(schema_path, 'w') as f:
    json.dump(schema, f)

After running the script, your directory should look like this:

.lightly/predictions/
├── tasks.json
└── object_detection_comma10k/
    ├── schema.json
    └── comma10k/
        └── imgs/
            ├── 0000_0085e9e41513078a_2018-08-19--13-26-08_11_864.json
            ├── ...
            └── 0999_e8e95b54ed6116a6_2018-10-22--11-26-21_3_339.json
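
To verify that the prediction files are well-formed, you can load one of them and map its category ids back to the class names from the schema. This is an optional sanity check and not part of the required workflow:

# load one prediction file and print human-readable labels
sample_file = next(Path(output_dir).glob("comma10k/imgs/*.json"))
with open(sample_file) as f:
    sample = json.load(f)

print(sample["file_name"])
for pred in sample["predictions"][:5]:
    print(coco_classes[pred["category_id"]], round(pred["score"], 2), pred["bbox"])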

Upload Dataset and Predictions

To use the predictions for active learning, for example with the Object Level workflow or Class-Balancing a Dataset Using Predictions From Detectron2, the dataset and predictions must be accessible to Lightly. You therefore need to upload the data to a cloud provider of your choice. This tutorial uses AWS S3, but Lightly also supports Google Cloud Storage and Azure.

Set Up Your S3 Bucket

If you haven't done so already, follow the instructions here to set up an input and a Lightly datasource on S3.
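
As a rough sketch, configuring the dataset and the two datasources with the Lightly Python client could look as follows. Method and argument names may differ slightly between lightly versions, and the token, region, role ARN, and external id below are placeholders, so treat the linked instructions as the source of truth:

from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client.models import DatasourcePurpose

client = ApiWorkflowClient(token="YOUR_LIGHTLY_TOKEN")
client.create_dataset("comma10k")

# input datasource: where the raw images live
client.set_s3_delegated_access_config(
    resource_path="s3://yourInputBucket/comma10k_input/",
    region="eu-central-1",
    role_arn="YOUR_ROLE_ARN",
    external_id="YOUR_EXTERNAL_ID",
    purpose=DatasourcePurpose.INPUT,
)

# Lightly datasource: where the .lightly folder with the predictions lives
client.set_s3_delegated_access_config(
    resource_path="s3://yourLightlyBucket/comma10k_lightly/",
    region="eu-central-1",
    role_arn="YOUR_ROLE_ARN",
    external_id="YOUR_EXTERNAL_ID",
    purpose=DatasourcePurpose.LIGHTLY,
)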

Install AWS CLI

We suggest using the AWS CLI to upload your dataset and predictions because it is faster at uploading large numbers of images. You can find instructions on installing the CLI for your system here. Verify that the installation worked with the following commands:

which aws
aws --version

Upload Dataset and Predictions

Now you can copy the content of your dataset to your cloud bucket with the aws s3 cp command. We will copy both the comma10k folder and the .lightly folder.

aws s3 cp comma10k s3://yourInputBucket/comma10k_input/ --recursive
aws s3 cp .lightly s3://yourLightlyBucket/comma10k_lightly/ --recursive
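
Depending on your connection, uploading roughly 24 GB of images can take a while. Once the commands finish, you can optionally list a few of the uploaded files to confirm that the folder structure arrived intact:

aws s3 ls s3://yourInputBucket/comma10k_input/ --recursive | head
aws s3 ls s3://yourLightlyBucket/comma10k_lightly/ --recursive | head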

Example of Folder Structure in AWS

Here is the final structure of your files on AWS S3:

s3://yourInputBucket/comma10k_input/
└── comma10k/
    └── imgs/
        ├── 0000_0085e9e41513078a_2018-08-19--13-26-08_11_864.png
        ├── ...
        └── 0999_e8e95b54ed6116a6_2018-10-22--11-26-21_3_339.png

s3://yourLightlyBucket/comma10k_lightly/
└── .lightly/predictions/
    ├── tasks.json
    └── object_detection_comma10k/
        ├── schema.json
        └── comma10k/
            └── imgs/
                ├── 0000_0085e9e41513078a_2018-08-19--13-26-08_11_864.json
                ├── ...
                └── 0999_e8e95b54ed6116a6_2018-10-22--11-26-21_3_339.json

Now you have your dataset and predictions ready to be used by the Lightly Worker!
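
As a preview of how the uploaded predictions are consumed, a Lightly Worker run scheduled with the Python client can reference the prediction task in its selection configuration, for example to balance classes as in the Class-Balancing tutorial mentioned above. The snippet below is only a sketch: it assumes a configured dataset, a registered and running Lightly Worker, and placeholder token and dataset id, and the exact configuration format may vary with your lightly version:

from lightly.api import ApiWorkflowClient

client = ApiWorkflowClient(token="YOUR_LIGHTLY_TOKEN", dataset_id="YOUR_DATASET_ID")

# sketch: select 1000 images while balancing the "car", "truck", and "bus" predictions
client.schedule_compute_worker_run(
    worker_config={},
    selection_config={
        "n_samples": 1000,
        "strategies": [
            {
                "input": {
                    "type": "PREDICTIONS",
                    "task": "object_detection_comma10k",
                    "name": "CLASS_DISTRIBUTION",
                },
                "strategy": {
                    "type": "BALANCE",
                    "target": {"car": 0.34, "truck": 0.33, "bus": 0.33},
                },
            }
        ],
    },
)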

Source Code

You can download the complete source code here.