(faq)=
# FAQ
## General
```{dropdown} What is LightlyTrain?
LightlyTrain is a production-ready framework for training computer vision models on unlabeled
data. This allows you to get started with model training immediately, as you don't have
to wait for your data to be labeled, and it significantly reduces the amount of labeled
data needed to reach high model performance. This decreases model deployment time and
costs, and allows you to use your unlabeled data to its full potential.
Training a model with LightlyTrain only requires a single line of code. It is available
as a Python package or Docker container and runs fully on-premise.
```
```{dropdown} Who is LightlyTrain for?
LightlyTrain is designed for engineers and teams who want to use their unlabeled data to its
full potential. It is ideal if any of the following applies to you:
- You want to speed up model development cycles
- You have limited labeled data but abundant unlabeled data
- You have slow and expensive labeling processes
- You want to build your own foundation model
- You work with domain-specific datasets (video analytics, robotics, medical, agriculture, etc.)
- You cannot use public pretrained models
- No pretrained models are available for your specific architecture
- You want to leverage the latest research in self-supervised learning and distillation
```
````{dropdown} How do I install LightlyTrain?
You can install LightlyTrain using pip:
```bash
pip install lightly-train
```
LightlyTrain is also available as a Docker container.
For full details see the {ref}`installation page`.
````
```{dropdown} What are the system requirements?
LightlyTrain requires:
- Linux (CUDA & CPU) or macOS (CPU only). Windows support is coming soon!
- Python 3.8 or higher
- PyTorch 2.1 or higher
- CUDA-compatible GPU(s) for fast training
- Sufficient storage for your dataset and model
LightlyTrain scales from single GPU setups to multi-GPU and multi-node configurations
automatically, depending on your available hardware.
We recommend the following minimal setup for training:
- NVIDIA GPU with 12GB+ memory
- 16GB+ RAM
- 50GB+ disk space for datasets and models
```
```{dropdown} Is LightlyTrain free to use?
LightlyTrain offers flexible licensing options to suit your specific needs:
- **AGPL-3.0 License**: Perfect for open-source projects, academic research, and community contributions.
Share your innovations with the world while benefiting from community improvements.
- **Commercial License**: Ideal for businesses and organizations that need proprietary development freedom.
Enjoy all the benefits of LightlyTrain while keeping your code and models private.
We're committed to supporting both open-source and commercial users.
Please [contact us](https://www.lightly.ai/contact) to discuss the best licensing option for your project!
```
```{dropdown} What's the difference between LightlyTrain and LightlySSL?
- **LightlyTrain**: A production-ready framework designed for engineers and teams who want to
pretrain models with minimal code and configuration. It integrates directly with many popular
model training libraries.
- **LightlySSL**: A research-oriented package targeting self-supervised learning (SSL) researchers
who want to modify and develop new SSL methods. It is not designed for production use and requires
expertise in writing model training code.
```
```{dropdown} What's the difference between LightlyTrain and other open-source self-supervised learning implementations?
LightlyTrain offers several advantages over other self-supervised learning (SSL) implementations:
- **User-friendly**: You don't need to be an SSL expert - focus on training your model instead
of implementation details. Existing SSL frameworks require deep model training knowledge and
focus on research instead of industry applications.
- **Works with various model architectures**: Popular SSL frameworks are usually limited to few
model architectures like ResNet or ViT. LightlyTrain directly integrates with different
libraries such as Torchvision, Ultralytics, etc. and allows pretraining of their models out
of the box.
- **Handles complexity**: LightlyTrain manages many complexities around model training and SSL,
such as scaling from single GPU to multi-GPU training and optimizing hyperparameters.
- **Seamless workflow**: LightlyTrain automatically pretrains the correct layers and exports
models in the right format for fine-tuning, reducing risks when moving from pretraining to
fine-tuning.
- **DINOv2 distillation**: Lightly has developed a unique distillation method that allows
you to train smaller models with the knowledge of larger DINOv2 models without the need for
large compute resources.
```
## Capabilities & Use Cases
```{dropdown} Which tasks does LightlyTrain support?
LightlyTrain supports pretraining for various downstream computer vision tasks such as:
- Image classification
- Object detection
- Semantic segmentation
- Instance segmentation
- Pose estimation
- Depth estimation
- Image embedding
- Image retrieval
The framework is designed to create general-purpose visual representations that can be fine-tuned
for any visual task, especially when you have limited labeled data or domain-specific requirements.
```
```{dropdown} Which models does LightlyTrain support?
LightlyTrain supports the latest model architectures in computer vision such as:
- YOLO (YOLOv5, YOLOv6, YOLOv8, YOLOv11, YOLOv12)
- RF-DETR
- RT-DETR
- Vision Transformers
- ResNets
- Many others
LightlyTrain supports models from popular model training libraries:
- Torchvision
- TIMM
- Ultralytics
- RT-DETR
- RF-DETR
- YOLOv12
- SuperGradients
- Custom Models
Check the [Models](#models) documentation for details on specific models and their
supported configurations.
```
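As an illustrative sketch of how such a pretraining run is started (the output directory, data directory, and model identifier below are placeholders; check the [Models](#models) documentation for the identifiers your installation supports):

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",       # placeholder: directory for checkpoints and logs
        data="my_data_dir",            # placeholder: directory containing unlabeled images
        model="torchvision/resnet50",  # placeholder: "<library>/<architecture>" identifier
    )
```

The same call works for the other supported libraries by swapping the model identifier.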
```{dropdown} Can I train an embedding model with LightlyTrain?
Yes, LightlyTrain is the perfect choice to train image embedding models. Its algorithms
are optimized to learn strong image representations for tasks such as image visualization,
clustering, and retrieval. With LightlyTrain you can train your own image embedding model
on your unlabeled data. LightlyTrain also comes with an embedding function that lets you
create image embeddings of your data.
```
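For illustration, generating embeddings with a pretrained checkpoint might look like the following sketch (all paths are placeholders; see the API reference for the exact arguments):

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.embed(
        out="my_embeddings.pth",  # placeholder: file the embeddings are written to
        data="my_data_dir",       # placeholder: directory of images to embed
        # placeholder: checkpoint produced by a previous lightly_train.train run
        checkpoint="out/my_experiment/checkpoints/last.ckpt",
    )
```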
```{dropdown} Which data types does LightlyTrain support?
LightlyTrain currently supports training on images and video frames. It works with standard image
formats such as JPG, PNG, etc. The framework handles image loading, preprocessing, and transformation
automatically.
See [the documentation](#train-data) for all the supported data formats.
```
```{dropdown} Which datasets and domains does LightlyTrain support?
LightlyTrain is designed to work with any RGB image or video dataset and domain. It works especially
well for domains such as:
- Video Analytics
- Robotics
- ADAS
- Agriculture
- Medical Imaging
- Visual Inspection
As LightlyTrain doesn't need labeled data, it is particularly valuable in domains
where labeled data is rare or expensive to obtain.
```
```{dropdown} How much data do I need?
LightlyTrain supports use cases from thousands to millions of images. We recommend
a minimum of several thousand unlabeled images for training with LightlyTrain and 100+ labeled images
for the fine-tuning afterwards. The larger the difference in dataset size between the
unlabeled and labeled data, the larger the benefit of LightlyTrain. For best
results we recommend at least 5x more unlabeled than labeled data. However, in most cases
2x more unlabeled than labeled data already yields strong improvements.
An example use-case looks like this:
- 100'000 unlabeled images
- 10'000 labeled training images
- 1'000 labeled validation images
The model is pretrained on the 100'000 unlabeled images with LightlyTrain.
The pretrained model is then fine-tuned on the 10'000 labeled images with your favorite
fine-tuning library and evaluated on the 1'000 validation images.
You can include the 10'000 labeled images in the unlabeled images for
pretraining (it will make your model better). But do not include the validation
images in the unlabeled data as this will lead to data leakage.
```
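Following the example numbers above, a leakage-safe split can be sketched as below. The helper and file names are hypothetical and not part of LightlyTrain, which only needs a directory of images; the point is that the validation images never enter the pretraining set:

```python
import random

def make_splits(image_paths, n_labeled_train, n_labeled_val, seed=0):
    """Hypothetical helper: split a dataset so validation images never
    enter the pretraining set."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    val = paths[:n_labeled_val]  # held out strictly for evaluation
    labeled_train = paths[n_labeled_val:n_labeled_val + n_labeled_train]
    # Pretraining may reuse the labeled training images,
    # but never the validation images.
    pretrain = paths[n_labeled_val:]
    return pretrain, labeled_train, val
```

With 111'000 images, `make_splits(paths, 10_000, 1_000)` reproduces the example above: 110'000 pretraining images (including the 10'000 labeled training images) and 1'000 validation images kept out of pretraining.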
```{dropdown} Can I train on labeled images?
Yes, but LightlyTrain will ignore the labels. In fact, we recommend adding the
training split of your labeled dataset to the dataset used for pretraining with
LightlyTrain. This will make your model better. However, do not include the
validation images in the dataset used for pretraining with LightlyTrain, as this
leads to data leakage when you later evaluate the model on those images.
The unlabeled dataset must always be treated like a training split and any images
in the unlabeled set must not be used for evaluation.
```
```{dropdown} Is there a limit on the number of images I can use?
No, LightlyTrain does not impose any limits on the number of images you can use for training.
The framework is designed to scale to millions of images and you can use as many images as
your hardware can handle. In fact, the training method in LightlyTrain performs better the
more images you use as long as they are similar to the images you want to use the model on.
```
```{dropdown} Can I pretrain a custom model on my own dataset?
Yes, you can! LightlyTrain supports any model implemented in PyTorch. See the documentation
on [custom models](#custom-models) for how to pretrain your model.
There are no restrictions on the dataset you use, except that it must contain images stored in a directory.
See [the documentation](#train-data) for all the supported images formats and the dataset structure.
```
```{dropdown} How can I fine-tune a model?
The focus of LightlyTrain is on pretraining models on unlabeled data. It doesn't support
fine-tuning. Instead, LightlyTrain integrates with popular fine-tuning libraries and
supports pretraining models from those libraries out of the box. At the end of the
pretraining, LightlyTrain exports the model in the correct format for the fine-tuning
library which allows you to directly fine-tune the model without having to worry about
converting and copying model weights.
See the documentation for a list of all [supported libraries](#models-supported-libraries).
```
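As a sketch of this hand-off with Ultralytics as the fine-tuning library (the exported-model path and dataset file below are placeholders; check your output directory for the actual exported file):

```python
from ultralytics import YOLO

if __name__ == "__main__":
    # Load the model exported by LightlyTrain at the end of pretraining
    # (placeholder path; inspect your output directory for the real one).
    model = YOLO("out/my_experiment/exported_models/exported_last.pt")
    # Fine-tune on a labeled dataset as usual with Ultralytics.
    model.train(data="my_dataset.yaml", epochs=100)
```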
```{dropdown} How can I improve the performance of my model?
If your model performance is not satisfactory after fine-tuning consider the following
options:
- Check that the fine-tuning runs smoothly. Make sure there are no large spikes in the
fine-tuning loss curves as those can negate the effect of the LightlyTrain pretraining.
Pretrained models might need a lower fine-tuning learning rate than randomly initialized models.
- Check that the pretraining runs smoothly. Make sure there are no large spikes in the
pretraining loss curves. If you observe unstable training then lower the learning rate.
- Increase the number of epochs for pretraining with LightlyTrain. Pretraining benefits
a lot from long training schedules and doesn't suffer as much from overfitting as
fine-tuning. Large datasets (>100'000 images) benefit from pretraining up to 3000 epochs.
Smaller datasets (<100'000 images) benefit from even longer pretraining for up to 10'000 epochs.
- Increase the number of unlabeled images in your pretraining dataset. The more unlabeled
images you have the better. If you have more labeled than unlabeled data then pretraining
is unlikely to help.
- If pretraining and fine-tuning are stable and you don't observe improvements then
increase the learning rate during pretraining. This is especially recommended for smaller models
with roughly 10 million or fewer parameters.
```
```{dropdown} Do I have to tune hyperparameters?
No, LightlyTrain is designed to work well with the default hyperparameters for most use cases.
If you want to customize specific aspects of the training, LightlyTrain provides configuration
options through its API. However, most users can achieve good results without any manual
hyperparameter tuning.
You might want to adjust the following parameters based on your hardware and training budget:
- `batch_size`: We recommend a batch size between 128-1536.
- `epochs`: We recommend training for 100-3000 epochs for large datasets (>100'000 images).
For small datasets (<100'000 images) we recommend training for up to 10'000 epochs.
- `num_devices`: The number of GPUs to use for training. LightlyTrain automatically uses all available
GPUs.
- `precision`: We recommend training with "bf16-mixed" precision for faster training speed.
Other parameters should only be adjusted for special use cases:
- `learning_rate`: Change only if you want to further optimize model performance. Increase learning
rate for small models.
- `transform_args`: Change only if you need to customize the data augmentation pipeline in case
you have a special dataset.
```
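Putting these recommendations together, a training call might look like the following sketch (the paths and model identifier are placeholders):

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",       # placeholder output directory
        data="my_data_dir",            # placeholder data directory
        model="torchvision/resnet50",  # placeholder model identifier
        batch_size=256,                # 128-1536 recommended
        epochs=3000,                   # long schedules benefit pretraining
        precision="bf16-mixed",        # faster on GPUs with bfloat16 support
    )
```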
```{dropdown} Why should I use LightlyTrain instead of other already pretrained models?
Pretrained models are often trained on curated, object-centric datasets that don't reflect the reality of
real-world applications. In practice, these models can struggle when faced with raw, unlabeled data from
production environments. Fortunately, LightlyTrain offers a scalable way to learn tailored representations
directly from your own unlabeled data. Off-the-shelf models can still be a good starting point, and LightlyTrain
fully supports leveraging existing pretrained weights. The benefit is that you can adapt representations
to your domain without requiring any labels. Whether you start from scratch or fine-tune existing weights
is entirely up to you.
LightlyTrain is most beneficial in these scenarios:
- **Different data domains**: When working with data that has a very different distribution from
the data existing pretrained models are trained on. For example, models pretrained on COCO or
ImageNet often don't transfer well to medical images or industrial data. In these cases,
LightlyTrain can help you train a model that is better suited for your specific domain.
- **Policies or license restrictions**: Models trained on popular datasets such as ImageNet have
oftentimes unclear licensing policies, making it difficult to use them in production. If you are
restricted by policies or licenses, LightlyTrain allows you to train your own model without
relying on existing pretrained models.
- **Limited labeled data**: When you have access to a lot of unlabeled data but limited labeled
data for your specific task.
- **Custom architectures**: When using custom model architectures where no pretrained checkpoints
are available.
For domains similar to COCO or natural images, supervised pretrained models like YOLO or Vision Transformers
are very strong, and you may not see significant benefits from SSL pretraining.
```
## Technical Background
```{dropdown} What is pretraining?
In the context of LightlyTrain, pretraining is the process of training a model on an unlabeled dataset to
learn semantic representations of the images in the dataset. A pretrained model cannot classify or detect
objects yet, but it already has a very good internal representation of the dataset and requires only a few
labeled example images to become a strong model for a specific task.
A pretrained model serves as a much better starting point for fine-tuning than
randomly initialized weights, typically leading to:
- Higher accuracy
- Faster convergence
- Better generalization
- Less labeled data needed for downstream tasks
```
```{dropdown} What is fine-tuning?
Fine-tuning is the process of taking a pretrained model and further training it on a specific
task with labeled data. The pretrained model already has learned useful visual features, and
fine-tuning adapts these features to a specific task such as classification, object detection,
or segmentation.
With LightlyTrain, you first pretrain your model on unlabeled data, then fine-tune it on a smaller
labeled dataset for your specific task.
```
```{dropdown} What is the difference between pretraining and fine-tuning?
Pretraining and fine-tuning are two distinct phases in the model training process:
**Pretraining:**
- Uses unlabeled data
- Usually performed on larger datasets
- Uses self-supervised learning
- Resulting model is general-purpose
**Fine-tuning:**
- Uses labeled data for a specific task
- Can be performed on smaller datasets
- Uses supervised learning
- Resulting model is task specific (classification, object detection, etc.)
Pretraining happens before fine-tuning and is essential for learning useful
representations from unlabeled data. LightlyTrain focuses on the pretraining phase,
while fine-tuning is typically done using standard supervised learning frameworks
appropriate for your downstream task.
```
```{dropdown} Which pretraining methods are supported?
LightlyTrain supports different methods such as:
- DINOv2 distillation
- DINO
- SimCLR
See [the documentation on methods](#methods) for more information.
```
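For illustration, the pretraining method can be selected when starting a run (a sketch with placeholder paths and model identifier; see the methods documentation for the available values):

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",       # placeholder output directory
        data="my_data_dir",            # placeholder data directory
        model="torchvision/resnet50",  # placeholder model identifier
        method="distillation",         # or e.g. "dino", "simclr"
    )
```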
```{dropdown} Will you add more pretraining methods?
Yes, we will add more pretraining methods in the future. That being said, there are hundreds of
different pretraining methods and we will not add all of them. Instead, the goal of LightlyTrain
is to focus on the best pretraining methods for industry relevant tasks.
```
```{dropdown} What is self-supervised learning (SSL)?
Self-supervised learning (SSL) is a machine learning approach where models learn useful
representations from unlabeled data without requiring manual annotations. Instead of being
trained on explicit labels, SSL methods create "proxy tasks" where the model learns by solving
objectives derived from the data itself.
In computer vision, SSL typically involves training models to match different views of the same
image, predict masked regions, or understand relationships between different parts of images.
This allows models to develop rich, general-purpose representations that can be fine-tuned for
downstream tasks with much less labeled data.
```
```{dropdown} What is distillation?
In the context of LightlyTrain, distillation refers to knowledge distillation where a teacher
model guides the training of your student model.
During distillation, the student model learns to produce similar feature representations as the
teacher model for the same input images. This allows smaller models to benefit from the knowledge
of larger, more powerful models like vision transformers pretrained with DINOv2, without requiring
the same computational resources for training.
```
```{dropdown} What is DINOv2?
DINOv2 is a self-supervised learning method developed by Facebook AI Research (FAIR) that produces
state-of-the-art visual representations. It builds upon the original DINO (Self-Distillation with
No Labels) method and uses a Vision Transformer (ViT) architecture.
DINOv2 models are pretrained on a diverse set of image data and have shown remarkable performance
across various visual tasks, from classification to dense prediction tasks. The resulting
representations exhibit strong semantic understanding and work well for zero-shot and few-shot
learning scenarios.
In LightlyTrain, DINOv2-pretrained ViT models serve as teacher models for the distillation method,
transferring their knowledge to your chosen backbone architecture.
```
## Deployment
```{dropdown} Does LightlyTrain run on the cloud?
Lightly does not offer a service to train models with LightlyTrain in the cloud.
However, you can run the LightlyTrain Python package on your own cloud instances.
```
```{dropdown} Does LightlyTrain need an internet connection?
LightlyTrain does not require an internet connection to run. However, it may need to download
model weights on the first run. All subsequent runs will use the cached weights. The model weights
can also be downloaded beforehand for deployments without an internet connection.
LightlyTrain does not send any telemetry data and does not require authentication over the internet.
```
```{dropdown} How many GPUs do I need?
LightlyTrain works well with a single GPU. Multi-GPU and multi-node training is supported
for faster training speeds. See our [documentation](#performance) for more information.
```
```{dropdown} Can I train on multiple GPUs and nodes?
Yes, LightlyTrain supports multi-GPU and multi-node training. LightlyTrain automatically trains
on multiple GPUs if available. By default it uses only a single node. You can control the
number of GPUs and nodes with:
- `num_devices`: The number of GPUs to use for training.
- `num_nodes`: The number of nodes to use for training.
See our [documentation](#performance) for more information.
```
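A minimal multi-GPU, multi-node sketch using the parameters above (paths and model identifier are placeholders):

```python
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",       # placeholder output directory
        data="my_data_dir",            # placeholder data directory
        model="torchvision/resnet50",  # placeholder model identifier
        num_devices=4,                 # GPUs per node
        num_nodes=2,                   # total number of nodes
    )
```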
## Support & Resources
```{dropdown} How can I get a license?
Please [contact us](https://www.lightly.ai/contact) to discuss the best licensing option for your project!
```
```{dropdown} How can I get support?
If you need help with LightlyTrain, you have several options:
1. **Documentation**: Check the [official documentation](https://docs.lightly.ai/train)
for guides and reference material.
2. **Discord Community**: Join our [Discord server](https://discord.gg/xvNJW94) to ask questions
and interact with other users.
3. **Email Support**: Contact us at [sales@lightly.ai](mailto:sales@lightly.ai) for licensing questions.
4. **GitHub Issues**: Report bugs or request features on [GitHub](https://github.com/lightly-ai/lightly-train/issues).
For commercial users with a license, additional support options are available.
```
```{dropdown} How can I report a bug?
Report bugs on our issue tracker on [GitHub](https://github.com/lightly-ai/lightly-train/issues).
```
```{dropdown} How can I request a feature?
Request features on our issue tracker on [GitHub](https://github.com/lightly-ai/lightly-train/issues).
```
```{dropdown} How can I contribute to LightlyTrain?
We welcome contributions to LightlyTrain!
To contribute, please head to our [GitHub repo](https://github.com/lightly-ai/lightly-train)
and join our [Discord](https://discord.gg/xvNJW94) for any questions.
```