Train

The train command is a simple interface to pretrain a large number of models using different self-supervised learning (SSL) methods. An example looks like this:

Python:

import lightly_train

lightly_train.train(
    out="out/my_experiment",
    data="my_data_dir",
    model="torchvision/resnet50",
    method="dino",
    epochs=100,
    batch_size=128,
)

Command line:

lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" method="dino" epochs=100 batch_size=128

This pretrains a TorchVision ResNet-50 model on the images in my_data_dir with the DINO self-supervised learning method. All training logs and checkpoints are saved to the output directory at out/my_experiment.

Tip

See lightly_train.train() for a complete list of available arguments.

Out

The out argument specifies the output directory where all training logs and checkpoints are saved. It looks like this after training:

out/my_experiment
├── checkpoints
│   ├── epoch=99-step=123.ckpt                          # Intermediate checkpoint
│   └── last.ckpt                                       # Last checkpoint
├── events.out.tfevents.1721899772.host.1839736.0       # TensorBoard logs
├── metrics.jsonl                                       # Training metrics
└── train.log                                           # Training logs

The final model checkpoint is saved to out/my_experiment/checkpoints/last.ckpt.
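The weights in this checkpoint can be pulled out for downstream use. The snippet below is a minimal sketch, assuming a Lightning-style checkpoint layout in which the weights live under a "state_dict" key and may carry a wrapper prefix such as "model."; both names are assumptions about the file layout, not guaranteed by LightlyTrain.

```python
# Minimal sketch for reading weights out of a Lightning-style checkpoint.
# Assumes the checkpoint dict keeps model weights under "state_dict" and
# that keys may carry a wrapper prefix (e.g. "model.") -- both are
# assumptions about the file layout, not guaranteed API.
def extract_state_dict(checkpoint: dict, prefix: str = "") -> dict:
    """Return model weights, optionally stripping a key prefix."""
    state_dict = checkpoint["state_dict"]
    if prefix:
        state_dict = {
            key[len(prefix):]: value
            for key, value in state_dict.items()
            if key.startswith(prefix)
        }
    return state_dict

# Usage (requires torch):
# checkpoint = torch.load("out/my_experiment/checkpoints/last.ckpt", map_location="cpu")
# weights = extract_state_dict(checkpoint, prefix="model.")
```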

Tip

Create a new output directory for each experiment to keep training logs and checkpoints organized.

Data

The data directory data="my_data_dir" can have any structure, including nested subdirectories. LightlyTrain finds all images in the directory recursively.

The following image formats are supported:

  • jpg

  • jpeg

  • png

  • ppm

  • bmp

  • pgm

  • tif

  • tiff

  • webp
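Recursive discovery by file extension can be sketched with the standard library. The snippet below mirrors the format list above and is only an illustration of the behavior, not LightlyTrain's actual implementation (in particular, case-insensitive matching is an assumption):

```python
from pathlib import Path

# Supported extensions from the list above (matched case-insensitively here;
# the exact matching rules are an assumption, not LightlyTrain's code).
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp",
}

def find_images(data_dir: str) -> list:
    """Return all image files under data_dir, including nested subdirectories."""
    return sorted(
        path
        for path in Path(data_dir).rglob("*")
        if path.suffix.lower() in IMAGE_EXTENSIONS
    )
```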

Model

See Models for a list of all supported models.

Method

See Methods for a list of all supported methods.

Loggers

Logging is configured with the loggers argument. The following loggers are supported:

  • jsonl: Logs training metrics to a .jsonl file (enabled by default)

  • tensorboard: Logs training metrics to TensorBoard (enabled by default, requires TensorBoard to be installed)

  • wandb: Logs training metrics to Weights & Biases (disabled by default, requires Weights & Biases to be installed)

JSONL

The JSONL logger is enabled by default and logs training metrics to a .jsonl file at out/my_experiment/metrics.jsonl.

Disable the JSONL logger with:

Python:

loggers={"jsonl": None}

Command line:

loggers.jsonl=null
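Each line of the .jsonl file is a standalone JSON object, so the metrics can be read with the standard library. A minimal sketch (the field names inside each record depend on the training run and are not assumed here):

```python
import json
from pathlib import Path

def read_metrics(path: str) -> list:
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    records = []
    for line in Path(path).read_text().splitlines():
        if line.strip():  # skip blank lines
            records.append(json.loads(line))
    return records
```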

TensorBoard

Important

TensorBoard must be installed with pip install "lightly-train[tensorboard]".

TensorBoard logs are automatically saved to the output directory. Run TensorBoard in a new terminal to visualize the training progress:

tensorboard --logdir out/my_experiment

Disable the TensorBoard logger with:

Python:

loggers={"tensorboard": None}

Command line:

loggers.tensorboard=null

Weights & Biases

Important

Weights & Biases must be installed with pip install "lightly-train[wandb]".

The Weights & Biases logger can be configured with the following arguments:

Python:

import lightly_train

lightly_train.train(
    out="out/my_experiment",
    data="my_data_dir",
    model="torchvision/resnet50",
    method="dino",
    loggers={
        "wandb": {
            "project": "my_project",
            "name": "my_experiment",
            "log_model": False,              # Set to True to upload model checkpoints
        },
    },
)

Command line:

lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" method="dino" loggers.wandb.project="my_project" loggers.wandb.name="my_experiment" loggers.wandb.log_model=False

More configuration options are available through the Weights & Biases environment variables. See the Weights & Biases documentation for more information.
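For example, the standard Weights & Biases environment variables can be exported before launching training. The two variables below are illustrative; see the Weights & Biases documentation for the full list:

```shell
# Common Weights & Biases environment variables (illustrative values).
export WANDB_API_KEY="your-api-key"  # authenticate without interactive login
export WANDB_MODE="offline"          # log locally; sync later with `wandb sync`
```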

Disable the Weights & Biases logger with:

Python:

loggers={"wandb": None}

Command line:

loggers.wandb=null

Advanced Options

Input Image Resolution

The input image resolution can be set with the transform_args argument. By default, a resolution of 224x224 pixels is used. A custom resolution can be set like this:

Python:

transform_args={"image_size": (448, 448)}  # (height, width)

Command line:

transform_args.image_size="[448,448]"  # (height, width)

Warning

Not all models support all image sizes.

Performance Optimizations

For performance optimizations, e.g. using accelerators, multi-GPU, multi-node, and half precision training, see the performance page.