Predict & Autolabel

LightlyTrain provides a simple interface for batch prediction on a full dataset. You can use this feature to generate predictions for your unlabeled images with a pretrained model checkpoint; these predictions can then serve, for example, as pseudo-labels for further training. This lets you improve model performance by leveraging all of your unlabeled images.

Benchmark Results

Semantic Segmentation with EoMT

The following table compares the performance of the DINOv3 EoMT models on the ADE20k validation set with and without using pseudo masks from the SUN397 dataset during fine-tuning. See the semantic segmentation docs for more details on how to train such models.

The pseudo masks were generated in the following way:

  • We first fine-tuned a ViT-H+ model on the ADE20k dataset, which reaches 0.595 validation mIoU.

  • We then used this checkpoint to create pseudo masks for the SUN397 dataset (~100k images).

  • We subsequently fine-tuned the smaller models using these pseudo masks.

The validation results are listed in the table below, where you can notice significant improvements when using the auto-labeled data:

| Implementation | Model Name | Autolabel | Val mIoU | # Params (M) | Input Size | Checkpoint Name |
|----------------|------------|-----------|----------|--------------|------------|-----------------|
| LightlyTrain | dinov3/vits16-eomt | No  | 0.466 | 21.6 | 518×518 | |
| LightlyTrain | dinov3/vits16-eomt | Yes | 0.533 | 21.6 | 518×518 | dinov3/vits16-eomt-ade20k |
| LightlyTrain | dinov3/vitb16-eomt | No  | 0.544 | 85.7 | 518×518 | |
| LightlyTrain | dinov3/vitb16-eomt | Yes | 0.573 | 85.7 | 518×518 | dinov3/vitb16-eomt-ade20k |

We also released the model checkpoints fine-tuned on the auto-labeled SUN397 dataset, listed in the table above. You can use these checkpoints by specifying the checkpoint name in the model argument of the predict_semantic_segmentation function. See the Predict Semantic Segmentation Masks section below for more details.

Predict Model Checkpoint

Predict Semantic Segmentation Masks

You can use the predict_semantic_segmentation function to generate semantic segmentation masks for a dataset using a pretrained model checkpoint. An example looks like this:

import lightly_train

if __name__ == "__main__":
    lightly_train.predict_semantic_segmentation(
        out="out/my_experiment",
        data="my_data_dir",
        model="dinov3/vits16-eomt-ade20k", # use a pretrained checkpoint name
    )

or if you want to use a local checkpoint file:

import lightly_train

if __name__ == "__main__":
    lightly_train.predict_semantic_segmentation(
        out="out/my_experiment",
        data="my_data_dir",
        model="path/to/my/checkpoint_file.pt", # use a local checkpoint file
    )

This will create predicted semantic segmentation masks for all images in the my_data_dir folder and save them to the out/my_experiment folder.

Out

All predicted masks will be saved in the out folder. The subdirectory structure mirrors that of the input data folder; if data is a list of image files, the masks are saved directly in the out folder.

Each predicted mask will have the same filename as the corresponding input image. The following mask formats are supported:

  • png
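To illustrate the mapping described above, here is a small sketch with a hypothetical helper (not part of the LightlyTrain API) showing how an input image path translates to its predicted mask path:

```python
from pathlib import Path

def mask_path_for(image_path: Path, data_root: Path, out_root: Path) -> Path:
    """Mirror the input folder structure under the output folder and
    swap the image extension for .png (illustrative helper only)."""
    relative = image_path.relative_to(data_root)
    return (out_root / relative).with_suffix(".png")

mask = mask_path_for(
    Path("my_data_dir/scenes/kitchen/img_001.jpg"),
    Path("my_data_dir"),
    Path("out/my_experiment"),
)
print(mask)  # on POSIX: out/my_experiment/scenes/kitchen/img_001.png
```

Note that the mask keeps the input image's relative path and filename; only the extension changes to the png mask format.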

Data

The data parameter expects a folder containing images, or a list of (possibly mixed) folders and image files. Any folder is traversed recursively, and all image files within it are collected (including those in nested subdirectories).

The following image formats are supported:

  • jpg

  • jpeg

  • png

  • ppm

  • bmp

  • pgm

  • tif

  • tiff

  • webp
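The traversal described above can be sketched with the standard library. This is an illustrative re-implementation under the stated assumptions, not the library's actual code:

```python
from pathlib import Path
import tempfile

SUPPORTED = {".jpg", ".jpeg", ".png", ".ppm", ".bmp",
             ".pgm", ".tif", ".tiff", ".webp"}

def find_images(data) -> list:
    """Accept a folder, an image file, or a mixed list of both,
    and recurse into folders (illustrative sketch)."""
    entries = data if isinstance(data, list) else [data]
    images = []
    for entry in entries:
        entry = Path(entry)
        if entry.is_dir():
            images.extend(
                p for p in sorted(entry.rglob("*"))
                if p.suffix.lower() in SUPPORTED
            )
        elif entry.suffix.lower() in SUPPORTED:
            images.append(entry)
    return images

# Demo on a throwaway folder with a nested subdirectory.
root = Path(tempfile.mkdtemp())
(root / "nested").mkdir()
(root / "a.jpg").touch()
(root / "nested" / "b.PNG").touch()
(root / "notes.txt").touch()  # not an image, ignored
found = [p.name for p in find_images(root)]
print(found)  # → ['a.jpg', 'b.PNG']
```

Extension matching is case-insensitive in this sketch, which is why the uppercase b.PNG in the nested subdirectory is still picked up.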

Model

The path to a model checkpoint. This can be:

  • a path to an exported checkpoint file (in .pt format), or

  • a checkpoint name that points to a model pretrained by LightlyTrain.
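The two forms are easy to tell apart. As a rough illustration (a simple heuristic sketched here, not necessarily the library's own resolution logic):

```python
from pathlib import Path

def looks_like_checkpoint_file(model: str) -> bool:
    """Illustrative heuristic: exported checkpoint files end in .pt,
    while pretrained checkpoint names look like
    'dinov3/vits16-eomt-ade20k' and carry no file extension."""
    return Path(model).suffix == ".pt"

print(looks_like_checkpoint_file("path/to/my/checkpoint_file.pt"))  # → True
print(looks_like_checkpoint_file("dinov3/vits16-eomt-ade20k"))      # → False
```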

Supported Checkpoint Names

The following checkpoint names are supported for semantic segmentation:

  • dinov3/vits16-eomt-ade20k

  • dinov3/vits16-eomt-coco

  • dinov3/vits16-eomt-cityscapes

  • dinov3/vitb16-eomt-ade20k

  • dinov3/vitb16-eomt-coco

  • dinov3/vitb16-eomt-cityscapes

  • dinov3/vitl16-eomt-coco

  • dinov3/vitl16-eomt-cityscapes