lightly.models¶
The lightly.models package provides model implementations.
The package contains an implementation of the commonly used ResNet and adaptations of the architecture which make self-supervised learning simpler.
The package also hosts the Lightly model zoo, a list of downloadable ResNet checkpoints.
.resnet¶
Custom ResNet Implementation
Note that the architecture we present here differs from the one used in torchvision. We replace the first 7x7 convolution with a 3x3 convolution to make the model faster and better suited to smaller input image resolutions.
Furthermore, we introduce a resnet9 variant for extra small models. These can run, for example, on a microcontroller with 100 kBytes of storage.
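The effect of the modified stem can be sketched in plain PyTorch. This is an illustrative sketch, not the library code; the layer names are made up for the example.

```python
import torch
import torch.nn as nn

# Torchvision-style ResNets open with a 7x7, stride-2 convolution; the
# variant described here uses a 3x3, stride-1 convolution instead, which
# preserves spatial resolution on small inputs.
stem_7x7 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
stem_3x3 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)

x = torch.randn(1, 3, 32, 32)  # a CIFAR-sized input image
print(stem_7x7(x).shape)  # the 7x7 stem halves the 32x32 resolution
print(stem_3x3(x).shape)  # the 3x3 stem keeps the full resolution
```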

class
lightly.models.resnet.
BasicBlock
(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)¶ Implementation of the ResNet Basic Block.
 Attributes:
 in_planes:
Number of input channels.
 planes:
Number of channels.
 stride:
Stride of the first convolutional layer.

forward
(x: torch.Tensor)¶ Forward pass through basic ResNet block.
 Args:
 x:
Tensor of shape bsz x channels x W x H
 Returns:
Tensor of shape bsz x channels x W x H

class
lightly.models.resnet.
Bottleneck
(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)¶ Implementation of the ResNet Bottleneck Block.
 Attributes:
 in_planes:
Number of input channels.
 planes:
Number of channels.
 stride:
Stride of the first convolutional layer.

forward
(x)¶ Forward pass through bottleneck ResNet block.
 Args:
 x:
Tensor of shape bsz x channels x W x H
 Returns:
Tensor of shape bsz x channels x W x H

class
lightly.models.resnet.
ResNet
(block: torch.nn.modules.module.Module = <class 'lightly.models.resnet.BasicBlock'>, layers: List[int] = [2, 2, 2, 2], num_classes: int = 10, width: float = 1.0, num_splits: int = 0)¶ ResNet implementation.
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: Deep Residual Learning for Image Recognition. arXiv:1512.03385
 Attributes:
 block:
ResNet building block type.
 layers:
List of blocks per layer.
 num_classes:
Number of classes in final softmax layer.
 width:
Multiplier for ResNet width.

forward
(x: torch.Tensor)¶ Forward pass through ResNet.
 Args:
 x:
Tensor of shape bsz x channels x W x H
 Returns:
Output tensor of shape bsz x num_classes

lightly.models.resnet.
ResNetGenerator
(name: str = 'resnet18', width: float = 1, num_classes: int = 10, num_splits: int = 0)¶ Builds and returns the specified ResNet.
 Args:
 name:
ResNet version from resnet{9, 18, 34, 50, 101, 152}.
 width:
ResNet width.
 num_classes:
Output dim of the last layer.
 num_splits:
Number of splits to use for SplitBatchNorm (for the MoCo model). Increase this number to simulate multi-GPU behavior. E.g. num_splits=8 simulates an 8-GPU cluster. num_splits=0 uses normal PyTorch BatchNorm.
 Returns:
ResNet as nn.Module.
 Examples:
>>> # binary classifier with ResNet34
>>> from lightly.models import ResNetGenerator
>>> resnet = ResNetGenerator('resnet34', num_classes=2)
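The effect of num_splits can be illustrated with a minimal functional sketch of split batch normalization. This is a conceptual illustration, not the library's SplitBatchNorm, which additionally tracks running statistics and learnable affine parameters.

```python
import torch

def split_batch_norm(x: torch.Tensor, num_splits: int, eps: float = 1e-5) -> torch.Tensor:
    """Normalize each of num_splits chunks of the batch with its own
    statistics, simulating BatchNorm running on num_splits separate GPUs.
    Conceptual sketch only."""
    out = []
    for chunk in x.chunk(num_splits, dim=0):
        # per-channel statistics over the batch and spatial dimensions
        mean = chunk.mean(dim=(0, 2, 3), keepdim=True)
        var = chunk.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
        out.append((chunk - mean) / (var + eps).sqrt())
    return torch.cat(out, dim=0)

x = torch.randn(8, 4, 16, 16)
y = split_batch_norm(x, num_splits=2)  # statistics computed per group of 4 samples
```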
.barlowtwins¶
Barlow Twins ResNet-based Model [0]
[0] Zbontar, J. et al., 2021. Barlow Twins… https://arxiv.org/abs/2103.03230

class
lightly.models.barlowtwins.
BarlowTwins
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 2048, proj_hidden_dim: int = 8192, out_dim: int = 8192, num_mlp_layers: int = 3)¶ Implementation of the BarlowTwins [0] network.
Recommended loss:
lightly.loss.barlow_twins_loss.BarlowTwinsLoss
Default params are the ones explained in the original paper [0].
[0] Zbontar, J. et al., 2021. Barlow Twins… https://arxiv.org/abs/2103.03230
 Attributes:
 backbone:
Backbone model to extract features from images. ResNet50 in original paper [0].
 num_ftrs:
Dimension of the embedding (before the projection head).
 proj_hidden_dim:
Dimension of the hidden layer of the projection head. This should be the same size as num_ftrs.
 out_dim:
Dimension of the output (after the projection head).

forward
(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)¶ Forward pass through BarlowTwins.
Extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded. Unlike SimSiam, Barlow Twins only implements a projection head.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 return_features:
Whether or not to return the intermediate features backbone(x).
 Returns:
The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.
 Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
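The recommended BarlowTwinsLoss can be illustrated with a minimal sketch of the cross-correlation objective from the paper. This is an illustration, not the library implementation; the lambda default follows the value reported in the paper.

```python
import torch

def barlow_twins_loss_sketch(z0: torch.Tensor, z1: torch.Tensor, lambd: float = 5e-3) -> torch.Tensor:
    """Conceptual sketch of the Barlow Twins objective: push the
    cross-correlation matrix of the two batches of projections towards
    the identity matrix."""
    n, _ = z0.shape
    # standardize every embedding dimension across the batch
    z0 = (z0 - z0.mean(dim=0)) / z0.std(dim=0)
    z1 = (z1 - z1.mean(dim=0)) / z1.std(dim=0)
    c = (z0.T @ z1) / n  # D x D cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()               # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy term
    return on_diag + lambd * off_diag

z0 = torch.randn(64, 32)
loss = barlow_twins_loss_sketch(z0, z0)  # identical views give a near-zero loss
```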
.simclr¶
SimCLR Model

class
lightly.models.simclr.
SimCLR
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 32, out_dim: int = 128)¶ Implementation of the SimCLR [0] architecture.
Recommended loss:
lightly.loss.ntx_ent_loss.NTXentLoss
[0] SimCLR, 2020, https://arxiv.org/abs/2002.05709
 Attributes:
 backbone:
Backbone model to extract features from images.
 num_ftrs:
Dimension of the embedding (before the projection head).
 out_dim:
Dimension of the output (after the projection head).

forward
(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)¶ Embeds and projects the input images.
Extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 return_features:
Whether or not to return the intermediate features backbone(x).
 Returns:
The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.
 Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
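The recommended NTXentLoss can be illustrated with a minimal sketch of the NT-Xent objective. This is an illustration, not the library implementation; the temperature value is an arbitrary choice for the example.

```python
import torch
import torch.nn.functional as F

def ntxent_loss_sketch(z0: torch.Tensor, z1: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Conceptual sketch of the NT-Xent objective: the two views of each
    sample are positives, every other sample in the batch is a negative."""
    n = z0.shape[0]
    z = F.normalize(torch.cat([z0, z1]), dim=1)  # 2N x D unit-length embeddings
    sim = z @ z.T / temperature                  # scaled pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))            # a view cannot match itself
    # the positive for view i is the other view of the same sample
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z0, z1 = torch.randn(8, 16), torch.randn(8, 16)
loss = ntxent_loss_sketch(z0, z1)
```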
.moco¶
MoCo Model

class
lightly.models.moco.
MoCo
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 32, out_dim: int = 128, m: float = 0.999, batch_shuffle: bool = False)¶ Implementation of the MoCo (Momentum Contrast)[0] architecture.
Recommended loss:
lightly.loss.ntx_ent_loss.NTXentLoss
with a memory bank.
[0] MoCo, 2020, https://arxiv.org/abs/1911.05722
 Attributes:
 backbone:
Backbone model to extract features from images.
 num_ftrs:
Dimension of the embedding (before the projection head).
 out_dim:
Dimension of the output (after the projection head).
 m:
Momentum for the momentum update of the key encoder.

forward
(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)¶ Embeds and projects the input image.
Performs the momentum update, extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 return_features:
Whether or not to return the intermediate features backbone(x).
 Returns:
The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.
 Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
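The momentum update performed in the forward pass amounts to an exponential moving average of the query encoder's parameters. A minimal sketch with a hypothetical toy linear encoder (not the library code):

```python
import copy
import torch
import torch.nn as nn

# Toy encoders: the key encoder starts as a copy of the query encoder.
query_encoder = nn.Linear(4, 4, bias=False)
key_encoder = copy.deepcopy(query_encoder)
old_key = key_encoder.weight.detach().clone()

# Pretend one optimizer step changed the query encoder.
with torch.no_grad():
    query_encoder.weight.add_(0.1)

# Momentum update: the key encoder trails the query encoder as an
# exponential moving average, controlled by m (the default is 0.999).
m = 0.9
with torch.no_grad():
    for q, k in zip(query_encoder.parameters(), key_encoder.parameters()):
        k.mul_(m).add_((1.0 - m) * q)
```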
.simsiam¶
SimSiam Model

class
lightly.models.simsiam.
SimSiam
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 2048, proj_hidden_dim: int = 2048, pred_hidden_dim: int = 512, out_dim: int = 2048, num_mlp_layers: int = 3)¶ Implementation of the SimSiam [0] network.
Recommended loss:
lightly.loss.sym_neg_cos_sim_loss.SymNegCosineSimilarityLoss
[0] SimSiam, 2020, https://arxiv.org/abs/2011.10566
 Attributes:
 backbone:
Backbone model to extract features from images.
 num_ftrs:
Dimension of the embedding (before the projection head).
 proj_hidden_dim:
Dimension of the hidden layer of the projection head. This should be the same size as num_ftrs.
 pred_hidden_dim:
Dimension of the hidden layer of the prediction head. This should be num_ftrs / 4.
 out_dim:
Dimension of the output (after the projection head).

forward
(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)¶ Forward pass through SimSiam.
Extracts features with the backbone and applies the projection head and prediction head to the output space. If both x0 and x1 are not None, both will be passed through the backbone, projection, and prediction head. If x1 is None, only x0 will be forwarded.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 return_features:
Whether or not to return the intermediate features backbone(x).
 Returns:
The output prediction and projection of x0 and (if x1 is not None) the output prediction and projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.
 Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
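The recommended SymNegCosineSimilarityLoss can be illustrated with a minimal sketch of the symmetrized negative cosine similarity. This is an illustration, not the library implementation; the tensors stand in for projections and predictions a real model would produce.

```python
import torch
import torch.nn.functional as F

def neg_cos_sim(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Negative cosine similarity between a prediction p and a *detached*
    projection z. The stop-gradient on z is the key SimSiam ingredient
    that prevents representational collapse."""
    return -F.cosine_similarity(p, z.detach(), dim=1).mean()

z0, p0 = torch.randn(8, 32), torch.randn(8, 32, requires_grad=True)
z1, p1 = torch.randn(8, 32), torch.randn(8, 32, requires_grad=True)

# symmetrized over the two views
loss = 0.5 * (neg_cos_sim(p0, z1) + neg_cos_sim(p1, z0))
```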
.byol¶
BYOL Model

class
lightly.models.byol.
BYOL
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 2048, hidden_dim: int = 4096, out_dim: int = 256, m: float = 0.9)¶ Implementation of the BYOL architecture.
 Attributes:
 backbone:
Backbone model to extract features from images.
 num_ftrs:
Dimension of the embedding (before the projection mlp).
 hidden_dim:
Dimension of the hidden layer in the projection and prediction mlp.
 out_dim:
Dimension of the output (after the projection/prediction mlp).
 m:
Momentum for the momentum update of encoder.

forward
(x0: torch.Tensor, x1: torch.Tensor, return_features: bool = False)¶ Symmetrizes the forward pass (see _forward).
Performs two forward passes: once with x0 passed through the encoder and x1 through the momentum encoder, and once the other way around.
Note that this model currently requires two inputs for the forward pass (x0 and x1) which correspond to the two augmentations. Furthermore, the return_features argument does not work yet.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 Returns:
A tuple out0, out1, where out0 and out1 are tuples containing the predictions and projections of x0 and x1: out0 = (z0, p0) and out1 = (z1, p1).
 Examples:
>>> # initialize the model and the loss function
>>> model = BYOL()
>>> criterion = SymNegCosineSimilarityLoss()
>>>
>>> # forward pass for two batches of transformed images x0 and x1
>>> out0, out1 = model(x0, x1)
>>> loss = criterion(out0, out1)
.nnclr¶
NNCLR Model

class
lightly.models.nnclr.
NNCLR
(backbone: torch.nn.modules.module.Module, num_ftrs: int = 512, proj_hidden_dim: int = 2048, pred_hidden_dim: int = 4096, out_dim: int = 256, num_mlp_layers: int = 3)¶ Implementation of the NNCLR [0] architecture.
Recommended loss:
lightly.loss.ntx_ent_loss.NTXentLoss
Recommended module: lightly.models.modules.nn_memory_bank.NNMemoryBankModule
[0] NNCLR, 2021, https://arxiv.org/abs/2104.14548
 Attributes:
 backbone:
Backbone model to extract features from images.
 num_ftrs:
Dimension of the embedding (before the projection head).
 proj_hidden_dim:
Dimension of the hidden layer of the projection head.
 pred_hidden_dim:
Dimension of the hidden layer of the prediction head.
 out_dim:
Dimension of the output (after the projection head).
 num_mlp_layers:
Number of linear layers for MLP.
 Examples:
>>> model = NNCLR(backbone)
>>> criterion = NTXentLoss(temperature=0.1)
>>>
>>> nn_replacer = NNMemoryBankModule(size=2 ** 16)
>>>
>>> # forward pass
>>> (z0, p0), (z1, p1) = model(x0, x1)
>>> z0 = nn_replacer(z0.detach(), update=False)
>>> z1 = nn_replacer(z1.detach(), update=True)
>>>
>>> loss = 0.5 * (criterion(z0, p1) + criterion(z1, p0))

forward
(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)¶ Embeds and projects the input images.
Extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded.
 Args:
 x0:
Tensor of shape bsz x channels x W x H.
 x1:
Tensor of shape bsz x channels x W x H.
 return_features:
Whether or not to return the intermediate features backbone(x).
 Returns:
The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.
 Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
.zoo¶
Lightly Model Zoo

lightly.models.zoo.
checkpoints
()¶ Returns the Lightly model zoo as a list of checkpoints.
 Checkpoints:
 ResNet9:
SimCLR with width = 0.0625 and num_ftrs = 16
 ResNet9:
SimCLR with width = 0.125 and num_ftrs = 16
 ResNet18:
SimCLR with width = 1.0 and num_ftrs = 16
 ResNet18:
SimCLR with width = 1.0 and num_ftrs = 32
 ResNet34:
SimCLR with width = 1.0 and num_ftrs = 16
 ResNet34:
SimCLR with width = 1.0 and num_ftrs = 32
 Returns:
A list of available checkpoints as URLs.
The lightly.models.modules package provides reusable modules.
This package contains reusable modules such as the NNMemoryBankModule, which can be combined with any lightly model.
.nn_memory_bank¶
Nearest Neighbour Memory Bank Module

class
lightly.models.modules.nn_memory_bank.
NNMemoryBankModule
(size: int = 65536)¶ Nearest Neighbour Memory Bank implementation
This class implements a nearest neighbour memory bank as described in the NNCLR paper [0]. During the forward pass we return the nearest neighbour from the memory bank.
[0] NNCLR, 2021, https://arxiv.org/abs/2104.14548
 Attributes:
 size:
Number of keys the memory bank can store. If set to 0, memory bank is not used.
 Examples:
>>> model = NNCLR(backbone)
>>> criterion = NTXentLoss(temperature=0.1)
>>>
>>> nn_replacer = NNMemoryBankModule(size=2 ** 16)
>>>
>>> # forward pass
>>> (z0, p0), (z1, p1) = model(x0, x1)
>>> z0 = nn_replacer(z0.detach(), update=False)
>>> z1 = nn_replacer(z1.detach(), update=True)
>>>
>>> loss = 0.5 * (criterion(z0, p1) + criterion(z1, p0))

forward
(output: torch.Tensor, update: bool = False)¶ Returns the nearest neighbour of the output tensor from the memory bank.
 Args:
 output:
The tensor for which to return the nearest neighbour.
 update:
If True, update the memory bank by adding output to it.
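The nearest-neighbour lookup performed in the forward pass can be sketched as a cosine-similarity argmax over the bank. This is a conceptual sketch, not the library implementation; it ignores the update logic and assumes an already filled bank.

```python
import torch
import torch.nn.functional as F

def nearest_neighbour_sketch(output: torch.Tensor, bank: torch.Tensor) -> torch.Tensor:
    """Conceptual sketch: replace every embedding in output with its
    cosine-similarity nearest neighbour from the memory bank, as used
    by NNCLR."""
    sim = F.normalize(output, dim=1) @ F.normalize(bank, dim=1).T  # B x size
    idx = sim.argmax(dim=1)  # index of the closest key per embedding
    return bank[idx]

bank = torch.randn(128, 16)                   # a filled bank of 128 keys
query = bank[:4] + 0.01 * torch.randn(4, 16)  # slightly perturbed bank entries
nn_out = nearest_neighbour_sketch(query, bank)
```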