lightly_train

Documentation of the public API of the lightly_train package.

Functions

lightly_train.embed(*, out: str | Path, data: str | Path | Sequence[str | Path], checkpoint: str | Path, format: str | EmbeddingFormat = 'torch', image_size: int | tuple[int, int] = (224, 224), batch_size: int = 128, num_workers: int | Literal['auto'] = 'auto', accelerator: str | Accelerator = 'auto', overwrite: bool = False, precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = '32-true') None

Embed images from a model checkpoint.

See the documentation for more information: https://docs.lightly.ai/train/stable/embed.html

Args:
out:

Filepath where the embeddings will be saved. For example “embeddings.csv”.

data:

Directory containing the images to embed or a sequence of image directories and files.

checkpoint:

Path to the LightlyTrain checkpoint file used for embedding. The location of the checkpoint depends on the train command. If training was run with out="out/my_experiment", then the last LightlyTrain checkpoint is saved to out/my_experiment/checkpoints/last.ckpt.

format:

Format of the embeddings. Supported formats are [‘csv’, ‘lightly_csv’, ‘torch’]. ‘torch’ is the recommended and most efficient format. Torch embeddings can be loaded with torch.load(out, weights_only=True). Choose ‘lightly_csv’ if you want to use the embeddings as custom embeddings with the Lightly Worker.

image_size:

Size to which the images are resized before embedding. If a single integer is provided, the image is resized to a square with the given side length. If a (height, width) tuple is provided, the image is resized to the given height and width. Note that not all models support all image sizes.

batch_size:

Number of images per batch.

num_workers:

Number of workers for the dataloader. ‘auto’ automatically sets the number of workers based on the available CPU cores.

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘tpu’, ‘ipu’, ‘hpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

overwrite:

Overwrite the output file if it already exists.

precision:

Embedding precision. Select ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’/’16-mixed’ for mixed precision.
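
A minimal usage sketch (all paths are placeholders; loading the result mirrors the note on the ‘torch’ format above):

import torch

import lightly_train

# Embed all images in a directory with a checkpoint from a previous pretrain run.
lightly_train.embed(
    out="embeddings.pt",                                   # placeholder output file
    data="my_image_dir",                                   # placeholder image directory
    checkpoint="out/my_experiment/checkpoints/last.ckpt",
    format="torch",
)

# Load the saved embeddings as documented for the 'torch' format.
embeddings = torch.load("embeddings.pt", weights_only=True)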

lightly_train.export(*, out: str | Path, checkpoint: str | Path, part: str | ModelPart = 'model', format: str | ModelFormat = 'package_default', overwrite: bool = False) None

Export a model from a checkpoint.

See the documentation for more information: https://docs.lightly.ai/train/stable/pretrain_distill/export.html

Args:
out:

Path where the exported model will be saved.

checkpoint:

Path to the LightlyTrain checkpoint file to export the model from. The location of the checkpoint depends on the train command. If training was run with out="out/my_experiment", then the last LightlyTrain checkpoint is saved to out/my_experiment/checkpoints/last.ckpt.

part:

Part of the model to export. Valid options are ‘model’ and ‘embedding_model’. ‘model’ is the default option and exports the model that was passed as model argument to the train function. ‘embedding_model’ exports the embedding model. This includes the model passed with the model argument in the train function and an extra embedding layer if the embed_dim argument was set during training. This is useful if you want to use the exported model for embedding images.

format:

Format to save the model in. Valid options are [‘package_default’, ‘torch_model’, ‘torch_state_dict’]. ‘package_default’ is the default option and exports the model in the default format of the package that was used for training. This ensures compatibility with the package and is the most flexible option. ‘torch_state_dict’ exports the model’s state dict which can be loaded with model.load_state_dict(torch.load(out, weights_only=True)). ‘torch_model’ exports the model as a torch module which can be loaded with model = torch.load(out). This requires that the same LightlyTrain version is installed when the model is exported and when it is loaded again.

overwrite:

Overwrite the output file if it already exists.
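
A short sketch of exporting a state dict and reloading it (paths are placeholders; the reload assumes the run was trained with model="torchvision/resnet18"):

import torch
import torchvision

import lightly_train

lightly_train.export(
    out="model.pt",                                        # placeholder output file
    checkpoint="out/my_experiment/checkpoints/last.ckpt",
    part="model",
    format="torch_state_dict",
)

# Reload the weights into the same architecture that was pretrained.
model = torchvision.models.get_model("resnet18")
model.load_state_dict(torch.load("model.pt", weights_only=True))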

lightly_train.export_onnx(*, out: str | Path, checkpoint: str | Path, batch_size: int = 1, height: int | None = None, width: int | None = None, precision: Literal['32-true', '16-true'] = '32-true', simplify: bool = True, verify: bool = True, overwrite: bool = False, format_args: dict[str, Any] | None = None) None

Export a model as ONNX from a checkpoint.

Args:
out:

Path where the exported model will be saved.

checkpoint:

Path to the LightlyTrain checkpoint file to export the model from.

batch_size:

Batch size of the input tensor.

height:

Height of the input tensor.

width:

Width of the input tensor.

precision:

“32-true” for float32 precision or “16-true” for float16 precision. Choosing “16-true” can lead to less memory consumption and faster inference times on GPUs but might lead to slightly more inaccuracies. Default is “32-true”.

simplify:

Simplify the ONNX model with onnxslim after the export. Default is True.

verify:

Check the exported model for errors. We recommend enabling this.

overwrite:

Overwrite the output file if it already exists.

format_args:

Arguments that are passed to torch.onnx.export. Only use this if you know what you are doing.
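
A minimal sketch (paths and the input resolution are placeholders):

import lightly_train

lightly_train.export_onnx(
    out="model.onnx",                                      # placeholder output file
    checkpoint="out/my_experiment/checkpoints/last.ckpt",
    batch_size=1,
    height=224,   # example resolution; not all models support all sizes
    width=224,
    precision="32-true",
    simplify=True,
    verify=True,
)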

lightly_train.list_methods() list[str]

Lists all available self-supervised learning methods.

See the documentation for more information: https://docs.lightly.ai/train/stable/pretrain_distill/methods/index.html

lightly_train.list_models() list[str]

Lists all models in <package_name>/<model_name> format.

See the documentation for more information: https://docs.lightly.ai/train/stable/pretrain_distill/models/

lightly_train.load_model(model: str | Path, device: Literal['cpu', 'cuda', 'mps'] | device | None = None) TaskModel

Load a model either from an exported model file (in .pt format), from a checkpoint file (in .ckpt format), or by downloading it from the Lightly model repository.

The function first checks whether model points to a valid file. If it does not, and model is a str, the name is matched against the models in the Lightly model repository and the model is downloaded. Downloaded models are cached under the location specified by the environment variable LIGHTLY_TRAIN_MODEL_CACHE_DIR.

Args:
model:

Either a path to the exported model/checkpoint file or the name of a model in the Lightly model repository.

device:

Device to load the model on. If None, the model will be loaded onto a GPU (“cuda” or “mps”) if available, and otherwise fall back to CPU.

Returns:

The loaded model.
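
A short sketch of the two loading paths (the file path is a placeholder and the repository model name is only illustrative):

import lightly_train

# Load from a local exported model or checkpoint file.
model = lightly_train.load_model("out/my_experiment/exported_models/exported_last.pt")

# Or load by name from the Lightly model repository; downloads are cached under
# LIGHTLY_TRAIN_MODEL_CACHE_DIR.
model = lightly_train.load_model("dinov3/vits16-eomt-coco", device="cuda")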

lightly_train.pretrain(*, out: str | Path, data: str | Path | Sequence[str | Path], model: str | Module | ModelWrapper | Any, method: str = 'distillation', method_args: dict[str, Any] | None = None, embed_dim: int | None = None, epochs: int | Literal['auto'] = 'auto', batch_size: int = 128, num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, overwrite: bool = False, accelerator: str | Accelerator = 'auto', strategy: str | Strategy = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16', 'auto'] = 'auto', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int = 0, loggers: dict[str, dict[str, Any] | None] | None = None, callbacks: dict[str, dict[str, Any] | None] | None = None, optim: str = 'auto', optim_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, trainer_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, resume: bool | None = None) None

Pretrain a self-supervised model.

See the documentation for more information: https://docs.lightly.ai/train/stable/pretrain_distill.html

The pretraining process can be monitored with TensorBoard:

tensorboard --logdir out

After pretraining, the model is exported in the library default format to out/exported_models/exported_last.pt. It can be exported to different formats using the lightly_train.export command.

Args:
out:

Output directory to save logs, checkpoints, and other artifacts.

data:

Path to a directory containing images or a sequence of image directories and files.

model:

Model name or instance to use for pretraining / distillation.

method:

Method name for pretraining / distillation.

method_args:

Arguments for the pretraining / distillation method. The available arguments depend on the method parameter.

embed_dim:

Embedding dimension. Set this if you want to pretrain an embedding model with a specific dimension. If None, the output dimension of model is used.

epochs:

Number of training epochs. Set to “auto” to automatically determine the number of epochs based on the dataset size and batch size.

batch_size:

Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.

num_workers:

Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.

devices:

Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.

num_nodes:

Number of nodes for distributed training.

checkpoint:

Use this parameter to further pretrain a model from a previous run. The checkpoint must be a path to a checkpoint file created by a previous training run, for example “out/my_experiment/checkpoints/last.ckpt”. This will only load the model weights from the previous run. All other training state (e.g. optimizer state, epochs) from the previous run is not loaded. Instead, a new run is started with the model weights from the checkpoint.

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter instead. See https://docs.lightly.ai/train/stable/pretrain_distill/index.html#resume-training for more information.

resume_interrupted:

Set this to True if you want to resume training from an interrupted or crashed training run. This will pick up exactly where the training left off, including the optimizer state and the current epoch.

  • You must use the same out directory as the interrupted run.

  • You must NOT change any training parameters (e.g., learning rate, batch size, data, etc.).

  • This is intended for continuing the same run without modification.

If you want to further pretrain a model or change the training parameters, use the checkpoint parameter instead. See https://docs.lightly.ai/train/stable/pretrain_distill/index.html#resume-training for more information.

overwrite:

Overwrite the output directory if it already exists. Warning, this might overwrite existing files in the directory!

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘tpu’, ‘ipu’, ‘hpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

strategy:

Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.

precision:

Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.

float32_matmul_precision:

Precision for float32 matrix multiplication. Can be one of [‘auto’, ‘highest’, ‘high’, ‘medium’]. See https://docs.pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision for more information.

seed:

Random seed for reproducibility.

loggers:

Loggers for training. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: loggers={"tensorboard": None}. To configure a logger, pass the respective arguments: loggers={"wandb": {"project": "my_project"}}.

callbacks:

Callbacks for training. Either None or a dictionary of callback names to either None or a dictionary of callback arguments. None uses the default callbacks. To disable a callback, set it to None: callbacks={"model_checkpoint": None}. To configure a callback, pass the respective arguments: callbacks={"model_checkpoint": {"every_n_epochs": 5}}.

optim:

Optimizer name. Must be one of [‘auto’, ‘adamw’, ‘sgd’]. ‘auto’ automatically selects the optimizer based on the method.

optim_args:

Optimizer arguments. Available arguments depend on the optimizer.

AdamW:

optim_args={"lr": float, "betas": (float, float), "weight_decay": float}

SGD:

optim_args={"lr": float, "momentum": float, "weight_decay": float}

transform_args:

Arguments for the image transform. The available arguments depend on the method parameter. The following arguments are always available:

transform_args={
    "image_size": (int, int),
    "random_resize": {
        "min_scale": float,
        "max_scale": float,
    },
    "random_flip": {
        "horizonal_prob": float,
        "vertical_prob": float,
    },
    "random_rotation": {
        "prob": float,
        "degrees": int,
    },
    "random_gray_scale": float,
    "normalize": {
        "mean": (float, float, float),
        "std": (float, float, float),
    }
}
loader_args:

Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

trainer_args:

Arguments for the PyTorch Lightning Trainer. Should only be used in special cases as default values are automatically set. For details, see: https://lightning.ai/docs/pytorch/stable/common/trainer.html

model_args:

Arguments for the model. The available arguments depend on the model parameter. For example, if model='torchvision/<model_name>', the arguments are passed to torchvision.models.get_model(model_name, **model_args).

resume:

Deprecated. Use resume_interrupted instead.
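
A minimal pretraining sketch using only parameters documented above (paths and the model name are placeholders):

import lightly_train

lightly_train.pretrain(
    out="out/my_experiment",              # output directory for logs and checkpoints
    data="my_image_dir",                  # directory with unlabeled images
    model="torchvision/resnet50",         # any name from lightly_train.list_models()
    method="distillation",
    epochs="auto",
    batch_size=128,
    transform_args={"image_size": (224, 224)},
    loggers={"wandb": {"project": "my_project"}},  # optional; defaults are used otherwise
)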

lightly_train.train(*, out: str | Path, data: str | Path | Sequence[str | Path], model: str | Module | ModelWrapper | Any, method: str = 'distillation', method_args: dict[str, Any] | None = None, embed_dim: int | None = None, epochs: int | Literal['auto'] = 'auto', batch_size: int = 128, num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, overwrite: bool = False, accelerator: str | Accelerator = 'auto', strategy: str | Strategy = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16', 'auto'] = 'auto', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int = 0, loggers: dict[str, dict[str, Any] | None] | None = None, callbacks: dict[str, dict[str, Any] | None] | None = None, optim: str = 'auto', optim_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, trainer_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, resume: bool | None = None) None

Deprecated. Use pretrain() instead.

lightly_train.train_instance_segmentation(*, out: str | Path, data: dict[str, Any] | str, model: str, steps: int | Literal['auto'] = 'auto', batch_size: int | Literal['auto'] = 'auto', num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, reuse_class_head: bool = False, overwrite: bool = False, accelerator: str = 'auto', strategy: str = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = 'bf16-mixed', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int | None = 0, logger_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, save_checkpoint_args: dict[str, Any] | None = None) None

Train an instance segmentation model.

See the documentation for more information: https://docs.lightly.ai/train/stable/instance_segmentation.html

The training process can be monitored with TensorBoard:

tensorboard --logdir out

After training, the last model checkpoint is saved in the out directory to: out/checkpoints/last.ckpt and also exported to out/exported_models/exported_last.pt.

Args:
out:

The output directory where the model checkpoints and logs are saved.

data:

The dataset configuration or path to a YAML file with the configuration. See the documentation for more information: https://docs.lightly.ai/train/stable/instance_segmentation.html#data

model:

The model to train. For example, “dinov2/vits14-eomt”, “dinov3/vits16-eomt-coco”, or a path to a local model checkpoint.

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter.

steps:

The number of training steps.

batch_size:

Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.

num_workers:

Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.

devices:

Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.

num_nodes:

Number of nodes for distributed training.

checkpoint:

Use this parameter to further fine-tune a model from a previous fine-tuned checkpoint. The checkpoint must be a path to a checkpoint file, for example “checkpoints/model.ckpt”. This will only load the model weights from the previous run. All other training state (e.g. optimizer state, epochs) from the previous run is not loaded.

This option is equivalent to setting model="<path_to_checkpoint>".

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter instead.

reuse_class_head:

Deprecated. Now the model will reuse the classification head by default only when the num_classes in the data config matches that in the checkpoint. Otherwise, the classification head will be re-initialized.

resume_interrupted:

Set this to True if you want to resume training from an interrupted or crashed training run. This will pick up exactly where the training left off, including the optimizer state and the current step.

  • You must use the same out directory as the interrupted run.

  • You must NOT change any training parameters (e.g., learning rate, batch size, data, etc.).

  • This is intended for continuing the same run without modification.

overwrite:

Overwrite the output directory if it already exists. Warning, this might overwrite existing files in the directory!

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

strategy:

Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.

precision:

Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.

float32_matmul_precision:

Precision for float32 matrix multiplication. Can be one of [‘auto’, ‘highest’, ‘high’, ‘medium’]. See https://docs.pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision for more information.

seed:

Random seed for reproducibility.

logger_args:

Logger arguments. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: logger_args={"tensorboard": None}. To configure a logger, pass the respective arguments: logger_args={"mlflow": {"experiment_name": "my_experiment", ...}}. See https://docs.lightly.ai/train/stable/instance_segmentation.html#logging for more information.

model_args:

Model training arguments. Either None or a dictionary of model arguments.

transform_args:

Transform arguments. Either None or a dictionary of transform arguments. The image size and normalization parameters can be set with transform_args={"image_size": (height, width), "normalize": {"mean": (r, g, b), "std": (r, g, b)}}

loader_args:

Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

save_checkpoint_args:

Arguments to configure the saving of checkpoints. The checkpoint frequency can be set with save_checkpoint_args={"save_every_num_steps": 100}.
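
A minimal sketch (the data config path, model name, and image size are placeholders; see the data documentation linked above for the config format):

import lightly_train

lightly_train.train_instance_segmentation(
    out="out/my_instance_seg",            # output directory
    data="data.yaml",                     # dataset configuration file
    model="dinov3/vits16-eomt-coco",      # example model name from the docstring above
    steps="auto",
    batch_size="auto",
    transform_args={"image_size": (640, 640)},
    save_checkpoint_args={"save_every_num_steps": 100},
)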

lightly_train.train_object_detection(*, out: str | Path, data: dict[str, Any] | str, model: str, steps: int | Literal['auto'] = 'auto', batch_size: int | Literal['auto'] = 'auto', num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, reuse_class_head: bool = False, overwrite: bool = False, accelerator: str = 'auto', strategy: str = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = 'bf16-mixed', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int | None = 0, logger_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, save_checkpoint_args: dict[str, Any] | None = None) None

Train an object detection model.

See the documentation for more information: https://docs.lightly.ai/train/stable/object_detection.html

The training process can be monitored with TensorBoard:

tensorboard --logdir out

After training, the last model checkpoint is saved in the out directory to: out/checkpoints/last.ckpt and also exported to out/exported_models/exported_last.pt.

Args:
out:

The output directory where the model checkpoints and logs are saved.

data:

The dataset configuration or path to a YAML file with the configuration. See the documentation for more information: https://docs.lightly.ai/train/stable/object_detection.html#data

model:

The model to train. For example, “dinov3/convnext-tiny-ltdetr-coco”, “dinov2/vits14-ltdetr”, or a path to a local model checkpoint.

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter.

steps:

The number of training steps.

batch_size:

Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.

num_workers:

Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.

devices:

Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.

num_nodes:

Number of nodes for distributed training.

checkpoint:

Use this parameter to further fine-tune a model from a previous fine-tuned checkpoint. The checkpoint must be a path to a checkpoint file, for example “checkpoints/model.ckpt”. This will only load the model weights from the previous run. All other training state (e.g. optimizer state, epochs) from the previous run is not loaded.

This option is equivalent to setting model="<path_to_checkpoint>".

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter instead.

reuse_class_head:

Deprecated. Now the model will reuse the classification head by default only when the num_classes in the data config matches that in the checkpoint. Otherwise, the classification head will be re-initialized.

resume_interrupted:

Set this to True if you want to resume training from an interrupted or crashed training run. This will pick up exactly where the training left off, including the optimizer state and the current step.

  • You must use the same out directory as the interrupted run.

  • You must NOT change any training parameters (e.g., learning rate, batch size, data, etc.).

  • This is intended for continuing the same run without modification.

overwrite:

Overwrite the output directory if it already exists. Warning, this might overwrite existing files in the directory!

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

strategy:

Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.

precision:

Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.

float32_matmul_precision:

Precision for float32 matrix multiplication. Can be one of [‘auto’, ‘highest’, ‘high’, ‘medium’]. See https://docs.pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision for more information.

seed:

Random seed for reproducibility.

logger_args:

Logger arguments. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: logger_args={"tensorboard": None}. To configure a logger, pass the respective arguments: logger_args={"mlflow": {"experiment_name": "my_experiment", ...}}. See https://docs.lightly.ai/train/stable/semantic_segmentation.html#logging for more information.

model_args:

Model training arguments. Either None or a dictionary of model arguments.

transform_args:

Transform arguments. Either None or a dictionary of transform arguments. The image size and normalization parameters can be set with transform_args={"image_size": (height, width), "normalize": {"mean": (r, g, b), "std": (r, g, b)}}

loader_args:

Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

save_checkpoint_args:

Arguments to configure the saving of checkpoints. The checkpoint frequency can be set with save_checkpoint_args={"save_every_num_steps": 100}.
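
A sketch contrasting the checkpoint and resume_interrupted parameters described above (all paths are placeholders):

import lightly_train

# Start a new fine-tuning run from the weights of a previous run.
lightly_train.train_object_detection(
    out="out/my_detector_v2",
    data="data.yaml",
    model="dinov3/convnext-tiny-ltdetr-coco",
    checkpoint="out/my_detector/checkpoints/last.ckpt",   # loads weights only
)

# Continue an interrupted run instead: same out directory, same parameters.
lightly_train.train_object_detection(
    out="out/my_detector",
    data="data.yaml",
    model="dinov3/convnext-tiny-ltdetr-coco",
    resume_interrupted=True,
)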

lightly_train.train_panoptic_segmentation(*, out: str | Path, data: dict[str, Any], model: str, steps: int | Literal['auto'] = 'auto', batch_size: int | Literal['auto'] = 'auto', num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, reuse_class_head: bool = False, overwrite: bool = False, accelerator: str = 'auto', strategy: str = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = 'bf16-mixed', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int | None = 0, logger_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, save_checkpoint_args: dict[str, Any] | None = None) None

Train a panoptic segmentation model.

See the documentation for more information: https://docs.lightly.ai/train/stable/panoptic_segmentation.html

The training process can be monitored with TensorBoard:

tensorboard --logdir out

After training, the last model checkpoint is saved in the out directory to: out/checkpoints/last.ckpt and also exported to out/exported_models/exported_last.pt.

Args:
out:

The output directory where the model checkpoints and logs are saved.

data:

The dataset configuration or path to a YAML file with the configuration. See the documentation for more information: https://docs.lightly.ai/train/stable/panoptic_segmentation.html#data

model:

The model to train. For example “dinov3/vits16-eomt-coco” or a path to a local model checkpoint.

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter.

steps:

The number of training steps.

batch_size:

Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.

num_workers:

Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.

devices:

Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.

num_nodes:

Number of nodes for distributed training.

checkpoint:

Use this parameter to further fine-tune a model from a previous fine-tuned checkpoint. The checkpoint must be a path to a checkpoint file, for example “checkpoints/model.ckpt”. This will only load the model weights from the previous run. All other training state (e.g. optimizer state, epochs) from the previous run is not loaded.

This option is equivalent to setting model="<path_to_checkpoint>".

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter instead.

reuse_class_head:

Set this to True if you want to keep the class head from the provided checkpoint. The default behavior removes the class head before loading so that a new head can be initialized for the current task.

resume_interrupted:

Set this to True if you want to resume training from an interrupted or crashed training run. This will pick up exactly where the training left off, including the optimizer state and the current step.

  • You must use the same out directory as the interrupted run.

  • You must NOT change any training parameters (e.g., learning rate, batch size, data, etc.).

  • This is intended for continuing the same run without modification.

overwrite:

Overwrite the output directory if it already exists. Warning, this might overwrite existing files in the directory!

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

strategy:

Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.

precision:

Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.

float32_matmul_precision:

Precision for float32 matrix multiplication. Can be one of [‘auto’, ‘highest’, ‘high’, ‘medium’]. See https://docs.pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision for more information.

seed:

Random seed for reproducibility.

logger_args:

Logger arguments. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: logger_args={"tensorboard": None}. To configure a logger, pass the respective arguments: logger_args={"mlflow": {"experiment_name": "my_experiment", ...}}. See https://docs.lightly.ai/train/stable/panoptic_segmentation.html#logging for more information.

model_args:

Model training arguments. Either None or a dictionary of model arguments.

transform_args:

Transform arguments. Either None or a dictionary of transform arguments. The image size and normalization parameters can be set with transform_args={"image_size": (height, width), "normalize": {"mean": (r, g, b), "std": (r, g, b)}}

loader_args:

Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

save_checkpoint_args:

Arguments to configure the saving of checkpoints. The checkpoint frequency can be set with save_checkpoint_args={"save_every_num_steps": 100}.

lightly_train.train_semantic_segmentation(*, out: str | Path, data: dict[str, Any], model: str, steps: int | Literal['auto'] = 'auto', batch_size: int | Literal['auto'] = 'auto', num_workers: int | Literal['auto'] = 'auto', devices: int | str | list[int] = 'auto', num_nodes: int = 1, resume_interrupted: bool = False, checkpoint: str | Path | None = None, reuse_class_head: bool = False, overwrite: bool = False, accelerator: str = 'auto', strategy: str = 'auto', precision: Literal[64, 32, 16, 'transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true', '64', '32', '16', 'bf16'] = 'bf16-mixed', float32_matmul_precision: Literal['auto', 'highest', 'high', 'medium'] = 'auto', seed: int | None = 0, logger_args: dict[str, Any] | None = None, model_args: dict[str, Any] | None = None, transform_args: dict[str, Any] | None = None, loader_args: dict[str, Any] | None = None, save_checkpoint_args: dict[str, Any] | None = None) None

Train a semantic segmentation model.

See the documentation for more information: https://docs.lightly.ai/train/stable/semantic_segmentation.html

The training process can be monitored with TensorBoard:

tensorboard --logdir out

After training, the last model checkpoint is saved in the out directory to: out/checkpoints/last.ckpt and also exported to out/exported_models/exported_last.pt.

Args:
out:

The output directory where the model checkpoints and logs are saved.

data:

The dataset configuration or path to a YAML file with the configuration. See the documentation for more information: https://docs.lightly.ai/train/stable/semantic_segmentation.html#data

model:

The model to train. For example, “dinov2/vits14-eomt”, “dinov3/vits16-eomt-coco”, or a path to a local model checkpoint.

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter.

steps:

The number of training steps.

batch_size:

Global batch size. The batch size per device/GPU is inferred from this value and the number of devices and nodes.

num_workers:

Number of workers for the dataloader per device/GPU. ‘auto’ automatically sets the number of workers based on the available CPU cores.

devices:

Number of devices/GPUs for training. ‘auto’ automatically selects all available devices. The device type is determined by the accelerator parameter.

num_nodes:

Number of nodes for distributed training.

checkpoint:

Use this parameter to further fine-tune a model from a previous fine-tuned checkpoint. The checkpoint must be a path to a checkpoint file, for example “checkpoints/model.ckpt”. This will only load the model weights from the previous run. All other training state (e.g. optimizer state, epochs) from the previous run is not loaded.

This option is equivalent to setting model="<path_to_checkpoint>".

If you want to resume training from an interrupted or crashed run, use the resume_interrupted parameter instead.

reuse_class_head:

Deprecated. Now the model will reuse the classification head by default only when the num_classes in the data config matches that in the checkpoint. Otherwise, the classification head will be re-initialized.

resume_interrupted:

Set this to True if you want to resume training from an interrupted or crashed training run. This will pick up exactly where the training left off, including the optimizer state and the current step.

  • You must use the same out directory as the interrupted run.

  • You must NOT change any training parameters (e.g., learning rate, batch size, data, etc.).

  • This is intended for continuing the same run without modification.

overwrite:

Overwrite the output directory if it already exists. Warning, this might overwrite existing files in the directory!

accelerator:

Hardware accelerator. Can be one of [‘cpu’, ‘gpu’, ‘mps’, ‘auto’]. ‘auto’ will automatically select the best accelerator available.

strategy:

Training strategy. For example ‘ddp’ or ‘auto’. ‘auto’ automatically selects the best strategy available.

precision:

Training precision. Select ‘16-mixed’ for mixed 16-bit precision, ‘32-true’ for full 32-bit precision, or ‘bf16-mixed’ for mixed bfloat16 precision.

float32_matmul_precision:

Precision for float32 matrix multiplication. Can be one of [‘auto’, ‘highest’, ‘high’, ‘medium’]. See https://docs.pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision for more information.

seed:

Random seed for reproducibility.

logger_args:

Logger arguments. Either None or a dictionary of logger names to either None or a dictionary of logger arguments. None uses the default loggers. To disable a logger, set it to None: logger_args={"tensorboard": None}. To configure a logger, pass the respective arguments: logger_args={"mlflow": {"experiment_name": "my_experiment", ...}}. See https://docs.lightly.ai/train/stable/semantic_segmentation.html#logging for more information.

model_args:

Model training arguments. Either None or a dictionary of model arguments.

transform_args:

Transform arguments. Either None or a dictionary of transform arguments. The image size and normalization parameters can be set with transform_args={"image_size": (height, width), "normalize": {"mean": (r, g, b), "std": (r, g, b)}}

loader_args:

Arguments for the PyTorch DataLoader. Should only be used in special cases as default values are automatically set. Prefer to use the batch_size and num_workers arguments instead. For details, see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

save_checkpoint_args:

Arguments to configure the saving of checkpoints. The checkpoint frequency can be set with save_checkpoint_args={"save_every_num_steps": 100}.
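
A minimal sketch, including loading the exported model afterwards (paths are placeholders; loading the exported .pt file with load_model follows the export location described above):

import lightly_train

lightly_train.train_semantic_segmentation(
    out="out/my_semantic_seg",
    data="data.yaml",                      # dataset configuration file
    model="dinov2/vits14-eomt",
)

# The last checkpoint is also exported to out/exported_models/exported_last.pt.
model = lightly_train.load_model("out/my_semantic_seg/exported_models/exported_last.pt")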

Models

class lightly_train._task_models.dinov3_eomt_instance_segmentation.task_model.DINOv3EoMTInstanceSegmentation
export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', batch_size: int = 1, height: int | None = None, width: int | None = None, opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (batch_size, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph uses dynamic batch size for both inputs and produces three outputs: labels, masks, and scores.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

batch_size:

Batch size for the ONNX input.

height:

Height of the ONNX input. If None, will be taken from self.image_size.

width:

Width of the ONNX input. If None, will be taken from self.image_size.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

Returns:

None. Writes the ONNX model to out.

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine with a dynamic batch dimension in the range [min_batchsize, opt_batchsize, max_batchsize]. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.

Raises:

FileNotFoundError: If the ONNX file does not exist.

RuntimeError: If the ONNX model cannot be parsed or engine building fails.

ValueError: If batch size constraints are invalid or H/W are dynamic.
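
A hedged sketch of building a TensorRT engine from a loaded task model (requires a separate TensorRT installation; the path is a placeholder and loading the exported model via load_model is an assumption):

import lightly_train

model = lightly_train.load_model("out/my_instance_seg/exported_models/exported_last.pt")

# Exports ONNX first (saved next to the engine by default), then builds the engine.
model.export_tensorrt(
    "model.engine",
    precision="fp16",
    min_batchsize=1,
    opt_batchsize=4,
    max_batchsize=8,
)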

predict(image: str | Path | Image | Tensor, threshold: float = 0.8) dict[str, Tensor]

Returns the predicted mask for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

threshold:

The confidence threshold for the predicted masks. Only masks with a confidence score above this threshold are returned.

Returns:

A {“labels”: Tensor, “masks”: Tensor, “scores”: Tensor} dict. Labels is a tensor of shape (Q,) containing the predicted class for each query. Masks is a tensor of shape (Q, H, W) containing the predicted mask for each query. Scores is a tensor of shape (Q,) containing the confidence score for each query.
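
A short prediction sketch (the image path is a placeholder; loading the exported model via load_model is an assumption):

import lightly_train

model = lightly_train.load_model("out/my_instance_seg/exported_models/exported_last.pt")
prediction = model.predict("image.jpg", threshold=0.8)

labels = prediction["labels"]   # shape (Q,): predicted class per query
masks = prediction["masks"]     # shape (Q, H, W): predicted mask per query
scores = prediction["scores"]   # shape (Q,): confidence per query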

class lightly_train._task_models.dinov2_ltdetr_object_detection.task_model.DINOv2LTDETRObjectDetection
predict(image: str | Path | Image | Tensor, threshold: float = 0.6) dict[str, Tensor]

Returns predictions for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

predict_sahi(image: str | Path | Image | Tensor, threshold: float = 0.6, overlap: float = 0.2, nms_iou_threshold: float = 0.3, global_local_iou_threshold: float = 0.1) dict[str, Tensor]

Run Slicing Aided Hyper Inference (SAHI) inference on the input image.

The image is first converted to a tensor, then:

  • Tiled into overlapping crops of size self.image_size.

  • A resized full-image version is added as a “global” tile.

  • All tiles (global + local) are passed through the model in parallel.

  • Predictions are filtered by score and merged using NMS and a global/local consistency heuristic. NMS is only applied to tile predictions. The heuristic discards tile predictions that heavily overlap with global predictions.

Args:
image:

Input image. Can be a path, a PIL image, or a tensor of shape (C, H, W).

threshold:

Score threshold for filtering low-confidence predictions.

overlap:

Fractional overlap between tiles in [0, 1). 0.0 means no overlap.

nms_iou_threshold:

IoU threshold used for non-maximum suppression when merging predictions from the tiles and the global image. A lower nms_iou_threshold value yields fewer predictions.

global_local_iou_threshold:

Minimum IoU required to consider a tile prediction as matching a global prediction when combining them. A lower global_local_iou_threshold yields fewer predictions.

Returns:
dict[str, Tensor]: A dictionary with:
  • “labels”: Tensor of shape (N,) with predicted class indices.

  • “bboxes”: Tensor of shape (N, 4) with bounding boxes in (x_min, y_min, x_max, y_max) in the coordinates of the original image.

  • “scores”: Tensor of shape (N,) with confidence scores for each prediction.
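
A short SAHI sketch for large images with small objects (the image path is a placeholder; loading the exported model via load_model is an assumption):

import lightly_train

model = lightly_train.load_model("out/my_detector/exported_models/exported_last.pt")

prediction = model.predict_sahi(
    "large_image.jpg",
    threshold=0.6,
    overlap=0.2,
    nms_iou_threshold=0.3,
    global_local_iou_threshold=0.1,
)
bboxes = prediction["bboxes"]   # shape (N, 4): boxes in original image coordinates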

class lightly_train._task_models.dinov3_ltdetr_object_detection.task_model.DINOv3LTDETRObjectDetection
export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None, num_channels: int | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (1, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph uses dynamic batch size for both inputs and produces three outputs: labels, boxes, and scores.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

num_channels:

Number of input channels. If None, will be inferred.

Returns:

None. Writes the ONNX model to out.

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation (https://docs.nvidia.com/deeplearning/tensorrt/latest/installing-tensorrt/installing.html) for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine with a dynamic batch dimension in the range [min_batchsize, opt_batchsize, max_batchsize]. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.

Raises:

FileNotFoundError: If the ONNX file does not exist.

RuntimeError: If the ONNX model cannot be parsed or engine building fails.

ValueError: If batch size constraints are invalid or H/W are dynamic.

predict(image: str | Path | Image | Tensor, threshold: float = 0.6) dict[str, Tensor]

Returns predictions for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

predict_sahi(image: str | Path | Image | Tensor, threshold: float = 0.6, overlap: float = 0.2, nms_iou_threshold: float = 0.3, global_local_iou_threshold: float = 0.1) dict[str, Tensor]

Run Slicing Aided Hyper Inference (SAHI) inference on the input image.

The image is first converted to a tensor, then:

  • Tiled into overlapping crops of size self.image_size.

  • A resized full-image version is added as a “global” tile.

  • All tiles (global + local) are passed through the model in parallel.

  • Predictions are filtered by score and merged using NMS and a global/local consistency heuristic. NMS is only applied to tile predictions. The heuristic discards tile predictions that heavily overlap with global predictions.

Args:
image:

Input image. Can be a path, a PIL image, or a tensor of shape (C, H, W).

threshold:

Score threshold for filtering low-confidence predictions.

overlap:

Fractional overlap between tiles in [0, 1). 0.0 means no overlap.

nms_iou_threshold:

IoU threshold used for non-maximum suppression when merging predictions from the tiles and the global image. A lower nms_iou_threshold value yields fewer predictions.

global_local_iou_threshold:

Minimum IoU required to consider a tile prediction as matching a global prediction when combining them. A lower global_local_iou_threshold yields fewer predictions.

Returns:
dict[str, Tensor]: A dictionary with:
  • “labels”: Tensor of shape (N,) with predicted class indices.

  • “bboxes”: Tensor of shape (N, 4) with bounding boxes in (x_min, y_min, x_max, y_max) in the coordinates of the original image.

  • “scores”: Tensor of shape (N,) with confidence scores for each prediction.

class lightly_train._task_models.picodet_object_detection.task_model.PicoDetObjectDetection

PicoDet-S object detection model.

PicoDet is a lightweight anchor-free object detector designed for mobile and edge deployment. It uses an Enhanced ShuffleNet backbone, CSP-PAN neck, and GFL-style detection head.

export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None, num_channels: int | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (1, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph outputs labels, boxes, and scores in the resized input image space.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

num_channels:

Number of input channels. If None, will be inferred.

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation (https://docs.nvidia.com/deeplearning/tensorrt/latest/installing-tensorrt/installing.html) for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine with a dynamic batch dimension in the range [min_batchsize, opt_batchsize, max_batchsize]. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.

Raises:

FileNotFoundError: If the ONNX file does not exist.

RuntimeError: If the ONNX model cannot be parsed or engine building fails.

ValueError: If batch size constraints are invalid or H/W are dynamic.

predict(image: str | Path | Image | Tensor, threshold: float = 0.6) dict[str, Tensor]

Run inference on a single image.

Args:
image:

Input image as a path, PIL image, or tensor of shape (C, H, W).

threshold:

Score threshold for detections.

Returns:

Dictionary with:

  • labels: Tensor of shape (N,) with class indices.

  • bboxes: Tensor of shape (N, 4) with boxes in xyxy format.

  • scores: Tensor of shape (N,) with confidence scores.

class lightly_train._task_models.dinov3_eomt_panoptic_segmentation.task_model.DINOv3EoMTPanopticSegmentation
export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', batch_size: int = 1, height: int | None = None, width: int | None = None, threshold: float = 0.8, mask_threshold: float = 0.5, mask_overlap_threshold: float = 0.8, opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (batch_size, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph uses dynamic batch size for input images. The output masks, segment_ids, and scores have dynamic shapes depending on the number of detected segments.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

batch_size:

Batch size for the ONNX input. Only batch size 1 is supported.

height:

Height of the ONNX input. If None, will be taken from self.image_size.

width:

Width of the ONNX input. If None, will be taken from self.image_size.

threshold:

Confidence threshold to keep predicted masks. Will be folded into the ONNX graph as a constant.

mask_threshold:

Threshold to convert predicted mask logits to binary masks. Will be folded into the ONNX graph as a constant.

mask_overlap_threshold:

Overlap area threshold for the predicted masks. Used to filter out or merge disconnected mask regions for every instance. Will be folded into the ONNX graph as a constant.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

Returns:

None. Writes the ONNX model to out.

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine with a dynamic batch dimension in the range [min_batchsize, opt_batchsize, max_batchsize]. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.
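
A minimal sketch of building a TensorRT engine from the same model. It assumes model is an already loaded task model and that tensorrt is installed; the paths are placeholders:

model.export_tensorrt(
    out="out/panoptic_model.engine",              # placeholder path
    precision="fp16",
    # Forwarded to export_onnx for the intermediate ONNX export.
    onnx_args={"simplify": True, "verify": True},
    # The panoptic ONNX export only supports batch size 1.
    min_batchsize=1,
    opt_batchsize=1,
    max_batchsize=1,
    verbose=False,
)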

predict(image: str | Path | Image | Tensor, threshold: float = 0.8, mask_threshold: float = 0.5, mask_overlap_threshold: float = 0.8) dict[str, Tensor]

Returns the predicted mask for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

threshold:

The confidence threshold to keep predicted masks.

mask_threshold:

The threshold to convert predicted mask logits to binary masks.

mask_overlap_threshold:

The overlap area threshold for the predicted masks. Used to filter out or merge disconnected mask regions for every instance.

Returns:

A {“masks”: Tensor, “segment_ids”: Tensor, “scores”: Tensor} dict. masks is a tensor of shape (H, W, 2) where the last dimension has two channels:

- Channel 0: class label per pixel
- Channel 1: segment id per pixel

Segment ids are in [-1, num_unique_segment_ids - 1]. There can be multiple segments with the same id if they belong to the same stuff class. Id -1 indicates pixels without an assigned segment. scores is a tensor of shape (num_segments,) containing the confidence score for each segment.
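
For illustration, a minimal sketch of decoding this dictionary. It assumes model is an already loaded panoptic task model, that segment_ids lists one id per detected segment (matching scores), and that image.jpg is a placeholder path:

prediction = model.predict("image.jpg", threshold=0.8)

masks = prediction["masks"]        # (H, W, 2)
class_per_pixel = masks[..., 0]    # channel 0: class label per pixel
segment_per_pixel = masks[..., 1]  # channel 1: segment id per pixel (-1 = unassigned)
scores = prediction["scores"]      # (num_segments,) confidence per segment

# Report the pixel area covered by each detected segment.
for segment_id, score in zip(prediction["segment_ids"].tolist(), scores.tolist()):
    area = int((segment_per_pixel == segment_id).sum())
    print(f"segment {segment_id}: score={score:.2f}, area={area} px")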

class lightly_train._task_models.dinov2_eomt_semantic_segmentation.task_model.DINOv2EoMTSemanticSegmentation
export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', batch_size: int = 1, height: int | None = None, width: int | None = None, opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (batch_size, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph uses dynamic batch size for both inputs and produces two outputs: masks and logits.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

batch_size:

Batch size for the ONNX input.

height:

Height of the ONNX input. If None, will be taken from self.image_size.

width:

Width of the ONNX input. If None, will be taken from self.image_size.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

Returns:

None. Writes the ONNX model to out.
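
As a sketch, running the exported file with ONNX Runtime. The input name “images” and the output names “masks” and “logits” follow the description above but should be verified on the exported model; the image size and path are placeholders:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("out/semantic_model.onnx")  # placeholder path

# Placeholder batch; replace with a preprocessed image batch of shape
# (B, C, H, W) matching the export height/width (float32 for an fp32 export).
images = np.random.rand(1, 3, 518, 518).astype(np.float32)

masks, logits = session.run(["masks", "logits"], {"images": images})
print(masks.shape, logits.shape)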

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine whose batch dimension is dynamic, with minimum, optimum, and maximum values given by min_batchsize, opt_batchsize, and max_batchsize. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.

Raises:

FileNotFoundError: If the ONNX file does not exist.

RuntimeError: If the ONNX model cannot be parsed or engine building fails.

ValueError: If batch size constraints are invalid or H/W are dynamic.

predict(image: str | Path | Image | Tensor) Tensor

Returns the predicted mask for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

Returns:

The predicted mask as a tensor of shape (H, W). The values represent the class IDs as defined in the classes argument of your dataset. These classes are also stored in the classes attribute of the model. The model always assigns every pixel to one of the known classes, even when your dataset contains ignored classes defined by the ignore_classes argument.
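
For illustration, a minimal inference sketch. It assumes model is an already loaded instance of this task model and image.jpg is a placeholder path:

mask = model.predict("image.jpg")  # tensor of shape (H, W) with class IDs

print(mask.shape)     # (H, W)
print(mask.unique())  # class IDs present in the prediction
print(model.classes)  # mapping between class IDs and your dataset's classes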

class lightly_train._task_models.dinov3_eomt_semantic_segmentation.task_model.DINOv3EoMTSemanticSegmentation
export_onnx(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', batch_size: int = 1, height: int | None = None, width: int | None = None, opset_version: int | None = None, simplify: bool = True, verify: bool = True, format_args: dict[str, Any] | None = None) None

Exports the model to ONNX for inference.

The export uses a dummy input of shape (batch_size, C, H, W) where C is inferred from the first model parameter and (H, W) come from self.image_size. The ONNX graph uses dynamic batch size for both inputs and produces two outputs: masks and logits.

Optionally simplifies the exported model in-place using onnxslim and verifies numerical closeness against a float32 CPU reference via ONNX Runtime.

Args:
out:

Path where the ONNX model will be written.

precision:

Precision for the ONNX model. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

batch_size:

Batch size for the ONNX input.

height:

Height of the ONNX input. If None, will be taken from self.image_size.

width:

Width of the ONNX input. If None, will be taken from self.image_size.

opset_version:

ONNX opset version to target. If None, PyTorch’s default opset is used.

simplify:

If True, run onnxslim to simplify and overwrite the exported model.

verify:

If True, validate the ONNX file and compare outputs to a float32 CPU reference forward pass.

format_args:

Optional extra keyword arguments forwarded to torch.onnx.export.

Returns:

None. Writes the ONNX model to out.

export_tensorrt(out: str | Path, *, precision: Literal['auto', 'fp32', 'fp16'] = 'auto', onnx_args: dict[str, Any] | None = None, max_batchsize: int = 1, opt_batchsize: int = 1, min_batchsize: int = 1, verbose: bool = False) None

Build a TensorRT engine from an ONNX model.

Note

TensorRT is not part of LightlyTrain’s dependencies and must be installed separately. Installation depends on your OS, Python version, GPU, and NVIDIA driver/CUDA setup. See the TensorRT documentation for more details. On CUDA 12.x systems you can often install the Python package via pip install tensorrt-cu12.

This loads the ONNX file, parses it with TensorRT, infers the static input shape (C, H, W) from the “images” input, and creates an engine whose batch dimension is dynamic, with minimum, optimum, and maximum values given by min_batchsize, opt_batchsize, and max_batchsize. Spatial dimensions must be static in the ONNX model (dynamic H/W are not yet supported).

The engine is serialized and written to out.

Args:
out:

Path where the TensorRT engine will be saved.

precision:

Precision for ONNX export and TensorRT engine building. Either “auto”, “fp32”, or “fp16”. “auto” uses the model’s current precision.

onnx_args:

Optional arguments to pass to export_onnx when exporting the ONNX model prior to building the TensorRT engine. If None, default arguments are used and the ONNX file is saved alongside the TensorRT engine with the same name but .onnx extension.

max_batchsize:

Maximum supported batch size.

opt_batchsize:

Batch size TensorRT optimizes for.

min_batchsize:

Minimum supported batch size.

verbose:

Enable verbose TensorRT logging.

Raises:

FileNotFoundError: If the ONNX file does not exist.

RuntimeError: If the ONNX model cannot be parsed or engine building fails.

ValueError: If batch size constraints are invalid or H/W are dynamic.

predict(image: str | Path | Image | Tensor) Tensor

Returns the predicted mask for the given image.

Args:
image:

The input image as a path, URL, PIL image, or tensor. Tensors must have shape (C, H, W).

Returns:

The predicted mask as a tensor of shape (H, W). The values represent the class IDs as defined in the classes argument of your dataset. These classes are also stored in the classes attribute of the model. The model always assigns every pixel to one of the known classes, even when your dataset contains ignored classes defined by the ignore_classes argument.