lightly_train
Documentation for the lightly_train module.
- lightly_train.embed(out: str | Path, data: str | Path, checkpoint: str | Path, format: str | EmbeddingFormat, image_size: int | tuple[int, int] = (224, 224), batch_size: int = 128, num_workers: int | Literal['auto'] = 'auto', accelerator: str | Accelerator = 'auto', overwrite: bool = False) → None
Embed images from a model checkpoint.
See the documentation for more information: https://docs.lightly.ai/train/stable/embed.html
- Args:
- out:
Filepath where the embeddings will be saved. For example “embeddings.csv”.
- data:
Directory containing the images to embed.
- checkpoint:
Path to the LightlyTrain checkpoint file used for embedding. The location of the checkpoint depends on the train command. If training was run with out="out/my_experiment", then the last LightlyTrain checkpoint is saved to out/my_experiment/checkpoints/last.ckpt.
- format:
Format of the embeddings. Supported formats are [‘csv’, ‘lightly_csv’, ‘torch’]. ‘torch’ is the recommended and most efficient format. Torch embeddings can be loaded with torch.load(out, weights_only=True). Choose ‘lightly_csv’ if you want to use the embeddings as custom embeddings with the Lightly Worker.
- image_size:
Size to which the images are resized before embedding. If a single integer is provided, the image is resized to a square with the given side length. If a (height, width) tuple is provided, the image is resized to the given height and width. Note that not all models support all image sizes.
- batch_size:
Number of images per batch.
- num_workers:
Number of workers for the dataloader. ‘auto’ automatically sets the number of workers based on the available CPU cores.
- accelerator:
Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘tpu’, ‘ipu’, ‘hpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.
- overwrite:
Overwrite the output file if it already exists.
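As a minimal usage sketch (the file paths and the ‘torch’ format choice below are example values, not requirements):
```python
import torch

import lightly_train

# Embed all images in "my_images/" with a previously trained checkpoint.
lightly_train.embed(
    out="embeddings.pt",
    data="my_images",
    checkpoint="out/my_experiment/checkpoints/last.ckpt",
    format="torch",
)

# 'torch' embeddings can be loaded back as documented above.
embeddings = torch.load("embeddings.pt", weights_only=True)
```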
- lightly_train.export(out: str | Path, checkpoint: str | Path, part: ModelPart | str, format: ModelFormat | str, overwrite: bool = False) → None
Export a model from a checkpoint.
See the documentation for more information: https://docs.lightly.ai/train/stable/export.html
- Args:
- out:
Path where the exported model will be saved.
- checkpoint:
Path to the LightlyTrain checkpoint file to export the model from. The location of the checkpoint depends on the train command. If training was run with out="out/my_experiment", then the last LightlyTrain checkpoint is saved to out/my_experiment/checkpoints/last.ckpt.
- part:
Part of the model to export. Valid options are ‘model’ and ‘embedding_model’. ‘model’ is the recommended option and exports the model that was passed as the model argument to the train function. ‘embedding_model’ exports the embedding model, which includes the model passed with the model argument to the train function and an extra embedding layer if the embed_dim argument was set during training. This is useful if you want to use the exported model for embedding images.
- format:
Format to save the model in. Valid options are ‘torch_model’ and ‘torch_state_dict’. ‘torch_state_dict’ is the recommended option and ensures compatibility across LightlyTrain versions. It exports the model’s state dict, which can be loaded with model.load_state_dict(torch.load(out, weights_only=True)). ‘torch_model’ exports the model as a torch module, which can be loaded with model = torch.load(out). This requires that the same LightlyTrain version is installed when the model is exported and when it is loaded again.
- overwrite:
Overwrite the output file if it already exists.
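A hedged sketch of the recommended state-dict flow, assuming training used model="torchvision/resnet18" (the paths and model choice are illustrative):
```python
import torch
from torchvision.models import get_model

import lightly_train

# Export the trained model's state dict from the last checkpoint.
lightly_train.export(
    out="my_model.pt",
    checkpoint="out/my_experiment/checkpoints/last.ckpt",
    part="model",
    format="torch_state_dict",
)

# Reload the weights into a freshly constructed model of the same
# architecture as the one used for training.
model = get_model("resnet18")
model.load_state_dict(torch.load("my_model.pt", weights_only=True))
```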
- lightly_train.list_methods() → list[str]
Lists all available self-supervised learning methods.
See the documentation for more information: https://docs.lightly.ai/train/stable/methods/
- lightly_train.list_models() → list[str]
Lists all models in <package_name>/<model_name> format.
See the documentation for more information: https://docs.lightly.ai/train/stable/models/
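Both helpers are plain introspection calls, for example:
```python
import lightly_train

# Discover the available self-supervised methods and models.
print(lightly_train.list_methods())  # e.g. includes 'simclr'
print(lightly_train.list_models())   # entries like 'torchvision/<model_name>'
```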
- lightly_train.train(out: str | Path, data: str | Path | Dataset, model: str | Module, method: str = 'simclr', method_args: dict[str, Any] | None = None, embed_dim: int | None = None, epochs: int = 100, batch_size: int = 128, num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume: bool = False, checkpoint: str | Path | None = None, overwrite: bool = False, accelerator: str | Accelerator = 'auto', strategy: str | Strategy = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = '32-true', seed: int = 0, loggers: dict[str, dict[str, Any] | None] | None = None, callbacks: dict[str, dict[str, Any] | None] | None = None, optim_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, trainer_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None) → None
Train a self-supervised model.
See the documentation for more information: https://docs.lightly.ai/train/stable/train.html
The training process can be monitored with TensorBoard (requires pip install lightly-train[tensorboard]):
`tensorboard --logdir out`
After training, the model checkpoint is saved to out/checkpoints/last.ckpt and can be exported to different formats using the lightly_train.export command. A usage sketch follows the argument list below.
- Args:
- out:
Output directory to save logs, checkpoints, and other artifacts.
- data:
Path to a directory containing images or a PyTorch Dataset.
- model:
Model name or instance to use for training.
- method:
Self-supervised learning method name.
- method_args:
Arguments for the self-supervised learning method. The available arguments depend on the method parameter.
- embed_dim:
Embedding dimension. Set this if you want to train an embedding model with a specific dimension. If None, the output dimension of model is used.
- epochs:
Number of training epochs.
- batch_size:
Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.
- num_workers:
Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.
- devices:
Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.
- num_nodes:
Number of nodes for distributed training.
- checkpoint:
Checkpoint to load the model weights from. The checkpoint must be a file created by a previous training run. Only the model weights are loaded; all other training state (e.g. optimizer state, current epoch) is not restored.
- resume:
Resume training from the last checkpoint.
- overwrite:
Overwrite the output directory if it already exists. Warning: this might overwrite existing files in the directory!
- accelerator:
Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘tpu’, ‘ipu’, ‘hpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.
- strategy:
Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.
- precision:
Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.
- seed:
Random seed for reproducibility.
- loggers:
Loggers for training. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: loggers={"tensorboard": None}. To configure a logger, pass the respective arguments: loggers={"wandb": {"project": "my_project"}}.
- callbacks:
Callbacks for training. Either None or a dictionary of callback names to either None or a dictionary of callback arguments. None uses the default callbacks. To disable a callback, set it to None: callbacks={"model_checkpoint": None}. To configure a callback, pass the respective arguments: callbacks={"model_checkpoint": {"every_n_epochs": 5}}.
- optim_args:
Arguments for the AdamW optimizer. Available arguments are: optim_args={"lr": float, "betas": (float, float), "weight_decay": float}.
- transform_args:
Arguments for the image transform. The available arguments depend on the method parameter. The following arguments are always available:
transform_args={
    "image_size": (int, int),
    "random_resize": {"min_scale": float, "max_scale": float},
    "random_flip": {"horizontal_prob": float, "vertical_prob": float},
    "random_gray_scale": float,
    "normalize": {"mean": (float, float, float), "std": (float, float, float)},
}
- loader_args:
Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
- trainer_args:
Arguments for the PyTorch Lightning Trainer. Should only be used in special cases as default values are automatically set. For details, see: https://lightning.ai/docs/pytorch/stable/common/trainer.html
- model_args:
Arguments for the model. The available arguments depend on the model parameter. For example, if model='torchvision/<model_name>', the arguments are passed to torchvision.models.get_model(model_name, **model_args).
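Putting it together, a minimal training sketch; the output directory, data directory, model choice, and hyperparameters below are illustrative assumptions, not required values:
```python
import lightly_train

# The __main__ guard is good practice because the dataloader may spawn
# worker processes.
if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",       # logs and checkpoints are written here
        data="my_images",              # directory containing training images
        model="torchvision/resnet18",  # any entry from list_models()
        method="simclr",               # any entry from list_methods()
        epochs=100,
        batch_size=128,
    )
```
After this run, out/my_experiment/checkpoints/last.ckpt can be passed as the checkpoint argument of lightly_train.embed or lightly_train.export as shown above.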