Train¶
The train command is a simple interface for pretraining models with different self-supervised learning (SSL) methods. An example command looks like this:
Python:

import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",
        data="my_data_dir",
        model="torchvision/resnet50",
        method="distillation",
        epochs=100,
        batch_size=128,
    )

Command line:

lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" method="distillation" epochs=100 batch_size=128
This will pretrain a ResNet-50 model from TorchVision using images from my_data_dir
and the DINOv2 distillation pretraining method. All training logs, model exports, and
checkpoints are saved to the output directory at out/my_experiment.
Tip
See lightly_train.train() for a complete list of available arguments.
Out¶
The out argument specifies the output directory where all training logs, model exports,
and checkpoints are saved. It looks like this after training:
out/my_experiment
├── checkpoints
│   ├── epoch=99-step=123.ckpt                          # Intermediate checkpoint
│   └── last.ckpt                                       # Last checkpoint
├── events.out.tfevents.1721899772.host.1839736.0       # TensorBoard logs
├── exported_models
│   └── exported_last.pt                                # Final model exported
├── metrics.jsonl                                       # Training metrics
└── train.log                                           # Training logs
The final model checkpoint is saved to out/my_experiment/checkpoints/last.ckpt. The
file out/my_experiment/exported_models/exported_last.pt contains the final model,
exported in the default format (package_default) of the library the model comes from
(see export format for more details).
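Exported weights can typically be reloaded as a PyTorch state dict. The snippet below is a minimal sketch of that pattern using a toy module and a local file path for illustration; it is not the actual ResNet-50 export:

```python
import torch
from torch import nn

# Toy stand-in for the pretrained backbone; in practice this would be the
# same architecture that was pretrained, e.g. torchvision's resnet50().
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# In practice the file would be
# "out/my_experiment/exported_models/exported_last.pt".
torch.save(model.state_dict(), "exported_last.pt")

# Load the exported weights back into a freshly constructed model.
restored = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
restored.load_state_dict(torch.load("exported_last.pt"))
```

The key point is that loading requires constructing the same architecture first; the file stores only the weights.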
Tip
Create a new output directory for each experiment to keep training logs, model exports, and checkpoints organized.
Data¶
The data directory data="my_data_dir" can have any structure, including nested
subdirectories. LightlyTrain finds all images in the directory recursively.
The following image formats are supported:
jpg
jpeg
png
ppm
bmp
pgm
tif
tiff
webp
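The recursive discovery described above can be sketched with the standard library alone; the extension set mirrors the supported formats listed here (the helper name and layout are illustrative, not LightlyTrain's internals):

```python
from pathlib import Path

# Extensions matching the supported image formats listed above.
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".ppm", ".bmp",
    ".pgm", ".tif", ".tiff", ".webp",
}

def find_images(data_dir: str) -> list[Path]:
    """Recursively collect all image files under data_dir."""
    return sorted(
        p for p in Path(data_dir).rglob("*")
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Nested subdirectories are traversed as well, and matching is case-insensitive on the file extension.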
Model¶
See Models for a list of all supported models.
Method¶
See Methods for a list of all supported methods.
Loggers¶
Logging is configured with the loggers argument. The following loggers are
supported:
jsonl: Logs training metrics to a .jsonl file (enabled by default)
tensorboard: Logs training metrics to TensorBoard (enabled by default, requires TensorBoard to be installed)
wandb: Logs training metrics to Weights & Biases (disabled by default, requires Weights & Biases to be installed)
JSONL¶
The JSONL logger is enabled by default and logs training metrics to a .jsonl file
at out/my_experiment/metrics.jsonl.
Disable the JSONL logger with:
Python:

loggers={"jsonl": None}

Command line:

loggers.jsonl=null
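Since each line in metrics.jsonl is a standalone JSON object, the file can be parsed with the standard library. A minimal sketch (the metric keys shown in the usage note are hypothetical; inspect your own file for the actual names):

```python
import json
from pathlib import Path

def read_metrics(path: str) -> list[dict]:
    """Parse a JSON-lines metrics file into a list of records."""
    records = []
    for line in Path(path).read_text().splitlines():
        if line.strip():  # skip blank lines
            records.append(json.loads(line))
    return records
```

For example, `read_metrics("out/my_experiment/metrics.jsonl")` returns one dict per logged step, which can then be fed into pandas or plotted directly.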
TensorBoard¶
TensorBoard logs are automatically saved to the output directory. Run TensorBoard in a new terminal to visualize the training progress:
tensorboard --logdir out/my_experiment
Disable the TensorBoard logger with:
Python:

loggers={"tensorboard": None}

Command line:

loggers.tensorboard=null
Weights & Biases¶
Important
Weights & Biases must be installed with pip install "lightly-train[wandb]".
The Weights & Biases logger can be configured with the following arguments:
Python:

import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",
        data="my_data_dir",
        model="torchvision/resnet50",
        loggers={
            "wandb": {
                "project": "my_project",
                "name": "my_experiment",
                "log_model": False,              # Set to True to upload model checkpoints
            },
        },
    )

Command line:

lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" loggers.wandb.project="my_project" loggers.wandb.name="my_experiment" loggers.wandb.log_model=False
More configuration options are available through Weights & Biases environment variables. See the Weights & Biases documentation for more information.
Disable the Weights & Biases logger with:
Python:

loggers={"wandb": None}

Command line:

loggers.wandb=null
Advanced Options¶
Input Image Resolution¶
The input image resolution can be set with the transform_args argument. By default, a resolution of 224x224 pixels is used. A custom resolution can be set like this:
Python:

transform_args={"image_size": (448, 448)}  # (height, width)

Command line:

transform_args.image_size="[448,448]"  # (height, width)
Warning
Not all models support all image sizes.
Performance Optimizations¶
For performance optimizations, e.g. using accelerators, multi-GPU, multi-node, and half precision training, see the performance page.