.. _dino:

DINO
====

Example implementation of the DINO architecture.

Reference:
    `Emerging Properties in Self-Supervised Vision Transformers, 2021 `_


.. tabs::

    .. tab:: PyTorch

        This example can be run from the command line with::

            python lightly/examples/pytorch/dino.py

        .. literalinclude:: ../../../examples/pytorch/dino.py

    .. tab:: Lightning

        This example can be run from the command line with::

            python lightly/examples/pytorch_lightning/dino.py

        .. literalinclude:: ../../../examples/pytorch_lightning/dino.py

    .. tab:: Lightning Distributed

        This example runs on multiple GPUs using Distributed Data Parallel (DDP)
        training with PyTorch Lightning. At least one GPU must be available on
        the system. The example can be run from the command line with::

            python lightly/examples/pytorch_lightning_distributed/dino.py

        The model differs from the non-distributed implementation in the
        following ways:

        - Distributed Data Parallel is enabled
        - Synchronized Batch Norm is used in place of standard Batch Norm
        - Distributed Sampling is used in the dataloader

        Note that Synchronized Batch Norm is optional and the model can also be
        trained without it. Without Synchronized Batch Norm, the batch norm
        statistics on each GPU are computed only from the features on that
        specific GPU.

        Distributed Sampling makes sure that each distributed process sees only
        a subset of the data.

        .. literalinclude:: ../../../examples/pytorch_lightning_distributed/dino.py
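
The Distributed Sampling behaviour described for the Lightning Distributed example can be illustrated with a small, framework-free sketch. It is not the code used in the example file: ``shard_indices`` is a hypothetical helper written for this page, and it only loosely follows the strided index split that distributed samplers typically perform. It assumes every process uses the same seed, so all ranks agree on the shuffled order.

```python
import math
import random


def shard_indices(dataset_size, num_replicas, rank, seed=0, shuffle=True):
    """Return the dataset indices one process (rank) would iterate over.

    Hypothetical helper for illustration only, not part of any library.
    """
    indices = list(range(dataset_size))
    if shuffle:
        # Same seed on every rank, so all processes build the same permutation.
        random.Random(seed).shuffle(indices)
    # Pad by wrapping around so the total divides evenly across replicas.
    per_replica = math.ceil(dataset_size / num_replicas)
    total = per_replica * num_replicas
    padded = [indices[i % dataset_size] for i in range(total)]
    # Strided slice: rank r takes positions r, r + num_replicas, ...
    return padded[rank:total:num_replicas]


# Ten samples split across four processes: each rank gets three indices.
shards = [shard_indices(10, num_replicas=4, rank=r) for r in range(4)]
for rank, shard in enumerate(shards):
    print(f"rank {rank}: {shard}")
```

Because ten samples do not divide evenly across four processes, two indices are repeated as padding in this sketch, so the shards are equal in size and together cover every sample at least once.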