# Train
The train command is a simple interface to pretrain a large number of models using different SSL methods. An example command looks like this:
Python:

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",        # Output directory.
        data="my_data_dir",             # Directory with images.
        model="torchvision/resnet50",   # Model to train.
        method="distillation",          # Pretraining method.
        epochs=100,                     # Number of epochs to train.
        batch_size=128,                 # Batch size.
    )
```

Command line:

```bash
lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" method="distillation" epochs=100 batch_size=128
```
Important
The default pretraining method `distillation` is recommended, as it consistently outperforms other methods in extensive experiments. Batch sizes between 128 and 1536 strike a good balance between training speed and performance. Moreover, long training runs, such as 2,000 epochs on COCO, significantly improve results.
This will pretrain a ResNet-50 model from TorchVision using images from `my_data_dir` and the DINOv2 distillation pretraining method. All training logs, model exports, and checkpoints are saved to the output directory at `out/my_experiment`.
Tip
See `lightly_train.train()` for a complete list of available arguments.
## Out
The `out` argument specifies the output directory where all training logs, model exports, and checkpoints are saved. It looks like this after training:
```
out/my_experiment
├── checkpoints
│   ├── epoch=99-step=123.ckpt                      # Intermediate checkpoint
│   └── last.ckpt                                   # Last checkpoint
├── events.out.tfevents.1721899772.host.1839736.0   # TensorBoard logs
├── exported_models
│   └── exported_last.pt                            # Final model exported
├── metrics.jsonl                                   # Training metrics
└── train.log                                       # Training logs
```
The final model checkpoint is saved to `out/my_experiment/checkpoints/last.ckpt`. The file `out/my_experiment/exported_models/exported_last.pt` contains the final model, exported in the default format (`package_default`) of the used library (see export format for more details).
Tip
Create a new output directory for each experiment to keep training logs, model exports, and checkpoints organized.
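One simple way to follow this tip is to derive a fresh, timestamped output directory for each run. A minimal sketch using only the Python standard library (the naming scheme is our own suggestion, not a LightlyTrain convention):

```python
from datetime import datetime
from pathlib import Path


def new_experiment_dir(root: str = "out") -> Path:
    """Create a unique, timestamped output directory for one training run."""
    # Example name: out/2024-05-01_13-37-00_my_experiment
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    out_dir = Path(root) / f"{stamp}_my_experiment"
    out_dir.mkdir(parents=True, exist_ok=False)  # Fail loudly if it already exists.
    return out_dir


out = new_experiment_dir()
# Pass `out` to lightly_train.train(out=out, ...) so each run keeps its own
# checkpoints, exports, and logs.
```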
## Data
The data directory `data="my_data_dir"` can have any structure, including nested subdirectories. LightlyTrain finds all images in the directory recursively.
The following image formats are supported:
- jpg
- jpeg
- png
- ppm
- bmp
- pgm
- tif
- tiff
- webp
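The recursive discovery described above can be reproduced with the standard library, for example to sanity-check how many images a directory will contribute before starting a long run (the helper name is ours; LightlyTrain performs this scan internally):

```python
from pathlib import Path

# Suffixes matching the supported formats listed above.
IMAGE_SUFFIXES = {
    ".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp",
}


def find_images(data_dir: str) -> list[Path]:
    """Recursively collect all image files under data_dir."""
    return sorted(
        p for p in Path(data_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Running `len(find_images("my_data_dir"))` before training is a cheap way to catch an empty or mistyped data path.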
## Model
See the Models page for a detailed list of all supported libraries, and their respective docs pages for all supported models.
## Method
See Methods for a list of all supported methods.
## Loggers
Logging is configured with the `loggers` argument. The following loggers are supported:

- `jsonl`: Logs training metrics to a .jsonl file (enabled by default)
- `tensorboard`: Logs training metrics to TensorBoard (enabled by default, requires TensorBoard to be installed)
- `wandb`: Logs training metrics to Weights & Biases (disabled by default, requires Weights & Biases to be installed)
### JSONL
The JSONL logger is enabled by default and logs training metrics to a .jsonl file at `out/my_experiment/metrics.jsonl`.

Disable the JSONL logger with:

Python:

```python
loggers={"jsonl": None}
```

Command line:

```bash
loggers.jsonl=null
```
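Because each line of a .jsonl file is a standalone JSON object, the metrics file can be inspected with the standard library alone. A minimal sketch (the metric keys in the usage note below are illustrative; the actual keys depend on the method and model):

```python
import json
from pathlib import Path


def read_metrics(path: str) -> list[dict]:
    """Parse a JSONL metrics file into a list of dicts, one per logged step."""
    records = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:  # Skip blank lines.
            records.append(json.loads(line))
    return records
```

For example, `read_metrics("out/my_experiment/metrics.jsonl")[-1]` returns the most recently logged record.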
### TensorBoard
TensorBoard logs are automatically saved to the output directory. Run TensorBoard in a new terminal to visualize the training progress:

```bash
tensorboard --logdir out/my_experiment
```

Disable the TensorBoard logger with:

Python:

```python
loggers={"tensorboard": None}
```

Command line:

```bash
loggers.tensorboard=null
```
### Weights & Biases
Important
Weights & Biases must be installed with `pip install "lightly-train[wandb]"`.
The Weights & Biases logger can be configured with the following arguments:
Python:

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",
        data="my_data_dir",
        model="torchvision/resnet50",
        loggers={
            "wandb": {
                "project": "my_project",
                "name": "my_experiment",
                "log_model": False,  # Set to True to upload model checkpoints.
            },
        },
    )
```

Command line:

```bash
lightly-train train out="out/my_experiment" data="my_data_dir" model="torchvision/resnet50" loggers.wandb.project="my_project" loggers.wandb.name="my_experiment" loggers.wandb.log_model=False
```
More configuration options are available through the Weights & Biases environment variables. See the Weights & Biases documentation for more information.
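For instance, these environment variables can be set in Python before calling `lightly_train.train()`. The sketch below uses `WANDB_MODE` and `WANDB_DIR`, both documented Weights & Biases variables; `WANDB_MODE=offline` logs locally and defers uploads until a later `wandb sync`:

```python
import os

# Configure Weights & Biases through its environment variables before training.
os.environ["WANDB_MODE"] = "offline"   # Log locally; sync to the server later.
os.environ["WANDB_DIR"] = "out/wandb"  # Where W&B stores its local files.

# A subsequent lightly_train.train(..., loggers={"wandb": {...}}) call
# picks these settings up automatically.
```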
Disable the Weights & Biases logger with:

Python:

```python
loggers={"wandb": None}
```

Command line:

```bash
loggers.wandb=null
```
## Advanced Options
### Input Image Resolution
The input image resolution can be set with the `transform_args` argument. By default, a resolution of 224x224 pixels is used. A custom resolution can be set like this:

Python:

```python
transform_args={"image_size": (448, 448)}  # (height, width)
```

Command line:

```bash
transform_args.image_size="[448,448]" # (height, width)
```
Warning
Not all models support all image sizes.
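For example, ViT-style backbones typically require the height and width to be multiples of their patch size. A small helper to catch this before training starts (the default patch size of 14 matches DINOv2-style ViTs; treat it as an assumption and check your model's documentation):

```python
def check_image_size(image_size: tuple[int, int], patch_size: int = 14) -> None:
    """Raise early if a ViT-style model cannot handle the requested resolution."""
    height, width = image_size
    if height % patch_size or width % patch_size:
        raise ValueError(
            f"Image size {image_size} is not divisible by patch size {patch_size}."
        )


check_image_size((448, 448))  # 448 = 32 * 14, so this passes.
```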
### Performance Optimizations
For performance optimizations, e.g. using accelerators, multi-GPU, multi-node, and half precision training, see the performance page.