lightly.models

The lightly.models package provides model implementations.

Note that the high-level building blocks will be deprecated with lightly version 1.3.0. Instead, use low-level building blocks to build the models yourself.

Example implementations for all models can be found here: Model Examples

The package contains an implementation of the commonly used ResNet and adaptations of the architecture that make self-supervised learning simpler.

The package also hosts the Lightly model zoo - a list of downloadable ResNet checkpoints.

.resnet

Custom ResNet Implementation

Note that the architecture we present here differs from the one used in torchvision. We replace the first 7x7 convolution with a 3x3 convolution to make the model faster and better suited to smaller input image resolutions.

Furthermore, we introduce a resnet-9 variant for extra small models. These can run, for example, on a microcontroller with 100 kBytes of storage.

class lightly.models.resnet.BasicBlock(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)

Implementation of the ResNet Basic Block.

in_planes

Number of input channels.

planes

Number of channels.

stride

Stride of the first convolutional layer.

forward(x: torch.Tensor)

Forward pass through basic ResNet block.

Parameters

x – Tensor of shape bsz x channels x W x H

Returns

Tensor of shape bsz x channels x W x H
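The skip connection at the heart of the basic block can be sketched as follows. This is a simplified stand-in for `lightly.models.resnet.BasicBlock` (the class and parameter names below are illustrative, not Lightly's exact implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlockSketch(nn.Module):
    """Simplified residual basic block: two 3x3 convs plus a skip connection."""

    def __init__(self, in_planes: int, planes: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        # project the input when the shape changes so the addition is valid
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # residual addition
        return F.relu(out)

block = BasicBlockSketch(64, 64)
y = block(torch.randn(4, 64, 32, 32))
print(tuple(y.shape))  # spatial size is preserved for stride=1
```

With `stride=1` and matching channel counts the output shape equals the input shape, which is what allows blocks to be chained freely within a layer.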

class lightly.models.resnet.Bottleneck(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)

Implementation of the ResNet Bottleneck Block.

in_planes

Number of input channels.

planes

Number of channels.

stride

Stride of the first convolutional layer.

forward(x)

Forward pass through bottleneck ResNet block.

Parameters

x – Tensor of shape bsz x channels x W x H

Returns

Tensor of shape bsz x channels x W x H

class lightly.models.resnet.ResNet(block: torch.nn.modules.module.Module = <class 'lightly.models.resnet.BasicBlock'>, layers: List[int] = [2, 2, 2, 2], num_classes: int = 10, width: float = 1.0, num_splits: int = 0)

ResNet implementation.

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: Deep Residual Learning for Image Recognition. arXiv:1512.03385

block

ResNet building block type.

layers

List of blocks per layer.

num_classes

Number of classes in final softmax layer.

width

Multiplier for ResNet width.

forward(x: torch.Tensor)

Forward pass through ResNet.

Parameters

x – Tensor of shape bsz x channels x W x H

Returns

Output tensor of shape bsz x num_classes

lightly.models.resnet.ResNetGenerator(name: str = 'resnet-18', width: float = 1, num_classes: int = 10, num_splits: int = 0)

Builds and returns the specified ResNet.

Parameters
  • name – ResNet version from resnet-{9, 18, 34, 50, 101, 152}.

  • width – ResNet width.

  • num_classes – Output dim of the last layer.

  • num_splits – Number of splits to use for SplitBatchNorm (for the MoCo model). Increase this number to simulate multi-GPU behavior. E.g. num_splits=8 simulates an 8-GPU cluster. num_splits=0 uses normal PyTorch BatchNorm.

Returns

ResNet as nn.Module.

Examples

>>> # binary classifier with ResNet-34
>>> from lightly.models import ResNetGenerator
>>> resnet = ResNetGenerator('resnet-34', num_classes=2)
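The effect of num_splits can be illustrated without any GPUs: batch norm statistics are computed per split rather than over the full batch, just as separate devices would compute them independently. A minimal pure-Python sketch of the idea (not Lightly's SplitBatchNorm implementation):

```python
from statistics import fmean, pvariance

def split_norm_stats(batch, num_splits):
    """Compute (mean, variance) per split, mimicking how SplitBatchNorm
    normalizes each split independently, as separate GPUs would."""
    if num_splits == 0:  # 0 means: behave like normal BatchNorm
        return [(fmean(batch), pvariance(batch))]
    size = len(batch) // num_splits
    splits = [batch[i * size:(i + 1) * size] for i in range(num_splits)]
    return [(fmean(s), pvariance(s)) for s in splits]

batch = [1.0, 2.0, 3.0, 4.0]
print(split_norm_stats(batch, 0))  # one statistic over the whole batch -> [(2.5, 1.25)]
print(split_norm_stats(batch, 2))  # one statistic per simulated GPU -> [(1.5, 0.25), (3.5, 0.25)]
```

Smaller splits see noisier statistics, which is exactly the multi-GPU behavior MoCo-style training wants to reproduce on a single device.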

.zoo

Lightly Model Zoo

lightly.models.zoo.checkpoints()

Returns the Lightly model zoo as a list of checkpoints.

Checkpoints:
ResNet-9:

SimCLR with width = 0.0625 and num_ftrs = 16

ResNet-9:

SimCLR with width = 0.125 and num_ftrs = 16

ResNet-18:

SimCLR with width = 1.0 and num_ftrs = 16

ResNet-18:

SimCLR with width = 1.0 and num_ftrs = 32

ResNet-34:

SimCLR with width = 1.0 and num_ftrs = 16

ResNet-34:

SimCLR with width = 1.0 and num_ftrs = 32

Returns

A list of available checkpoints as URLs.

The lightly.models.modules package provides reusable modules.

This package contains reusable modules such as the NNMemoryBankModule which can be combined with any lightly model.

.nn_memory_bank

Nearest Neighbour Memory Bank Module

class lightly.models.modules.nn_memory_bank.NNMemoryBankModule(size: int = 65536)

Nearest Neighbour Memory Bank implementation.

This class implements a nearest neighbour memory bank as described in the NNCLR paper[0]. During the forward pass we return the nearest neighbour from the memory bank.

[0] NNCLR, 2021, https://arxiv.org/abs/2104.14548

size

Number of keys the memory bank can store. If set to 0, memory bank is not used.

Examples

>>> model = NNCLR(backbone)
>>> criterion = NTXentLoss(temperature=0.1)
>>>
>>> nn_replacer = NNMemoryBankModule(size=2 ** 16)
>>>
>>> # forward pass
>>> (z0, p0), (z1, p1) = model(x0, x1)
>>> z0 = nn_replacer(z0.detach(), update=False)
>>> z1 = nn_replacer(z1.detach(), update=True)
>>>
>>> loss = 0.5 * (criterion(z0, p1) + criterion(z1, p0))
forward(output: torch.Tensor, update: bool = False)

Returns the nearest neighbour of the output tensor from the memory bank.

Parameters
  • output – The torch tensor for which you want the nearest neighbour.

  • update – If True, updates the memory bank by adding output to it.
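The lookup the memory bank performs can be sketched in plain Python: for each query, return the stored vector with the highest cosine similarity. This illustrates the idea only; the module itself operates on batched torch tensors:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_neighbour(query, bank):
    """Return the bank entry most similar to the query (cosine similarity)."""
    return max(bank, key=lambda entry: cosine(query, entry))

bank = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(nearest_neighbour([0.9, 0.1], bank))  # closest direction wins -> [1.0, 0.0]
```

In NNCLR this replacement decouples the loss from the exact augmented view: the positive pair is formed with a semantically similar sample from the bank rather than the embedding itself.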

.heads

Projection and Prediction Heads for Self-supervised Learning

class lightly.models.modules.heads.BYOLProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for BYOL.

“This MLP consists in a linear layer with output size 4096 followed by batch normalization, rectified linear units (ReLU), and a final linear layer with output dimension 256.” [0]

[0]: BYOL, 2020, https://arxiv.org/abs/2006.07733

class lightly.models.modules.heads.BarlowTwinsProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for Barlow Twins.

“The projector network has three linear layers, each with 8192 output units. The first two layers of the projector are followed by a batch normalization layer and rectified linear units.” [0]

[0]: Barlow Twins, 2021, https://arxiv.org/abs/2103.03230

class lightly.models.modules.heads.MoCoProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for MoCo.

“(…) we replace the fc head in MoCo with a 2-layer MLP head (hidden layer 2048-d, with ReLU)” [0]

[0]: MoCo, 2020, https://arxiv.org/abs/1911.05722

class lightly.models.modules.heads.NNCLRPredictionHead(input_dim: int, hidden_dim: int, output_dim: int)

Prediction head used for NNCLR.

“The architecture of the prediction MLP g is 2 fully-connected layers of size [4096,d]. The hidden layer of the prediction MLP is followed by batch-norm and ReLU. The last layer has no batch-norm or activation.” [0]

[0]: NNCLR, 2021, https://arxiv.org/abs/2104.14548

class lightly.models.modules.heads.NNCLRProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for NNCLR.

“The architecture of the projection MLP is 3 fully connected layers of sizes [2048,2048,d] where d is the embedding size used to apply the loss. We use d = 256 in the experiments unless otherwise stated. All fully-connected layers are followed by batch-normalization [36]. All the batch-norm layers except the last layer are followed by ReLU activation.” [0]

[0]: NNCLR, 2021, https://arxiv.org/abs/2104.14548

class lightly.models.modules.heads.ProjectionHead(blocks: List[Tuple[int, int, torch.nn.modules.module.Module, torch.nn.modules.module.Module]])

Base class for all projection and prediction heads.

Parameters

blocks – List of tuples, each denoting one block of the projection head MLP. Each tuple reads (in_features, out_features, batch_norm_layer, non_linearity_layer).

Examples

>>> # the following projection head has two blocks
>>> # the first block uses batch norm and a ReLU non-linearity
>>> # the second block is a simple linear layer
>>> projection_head = ProjectionHead([
>>>     (256, 256, nn.BatchNorm1d(256), nn.ReLU()),
>>>     (256, 128, None, None)
>>> ])
forward(x: torch.Tensor)

Computes one forward pass through the projection head.

Parameters

x – Input of shape bsz x num_ftrs.

class lightly.models.modules.heads.SimCLRProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for SimCLR.

“We use a MLP with one hidden layer to obtain zi = g(h) = W_2 * σ(W_1 * h) where σ is a ReLU non-linearity.” [0]

[0] SimCLR, 2020, https://arxiv.org/abs/2002.05709
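The quoted formulation z = g(h) = W_2 * σ(W_1 * h) translates directly into a two-layer MLP. A minimal sketch in PyTorch, with illustrative dimensions (not the class defaults):

```python
import torch
import torch.nn as nn

# z = W_2 * relu(W_1 * h): one hidden layer, ReLU non-linearity
projection_head = nn.Sequential(
    nn.Linear(512, 512),  # W_1, hidden layer
    nn.ReLU(),            # sigma
    nn.Linear(512, 128),  # W_2, output layer
)

h = torch.randn(8, 512)  # backbone representations, bsz x num_ftrs
z = projection_head(h)   # projected embeddings for the contrastive loss
print(tuple(z.shape))
```

The projection head is only used during pretraining; for downstream tasks the representations h from the backbone are typically used directly.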

class lightly.models.modules.heads.SimSiamPredictionHead(input_dim: int, hidden_dim: int, output_dim: int)

Prediction head used for SimSiam.

“The prediction MLP (h) has BN applied to its hidden fc layers. Its output fc does not have BN (…) or ReLU. This MLP has 2 layers.” [0]

[0]: SimSiam, 2020, https://arxiv.org/abs/2011.10566

class lightly.models.modules.heads.SimSiamProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for SimSiam.

“The projection MLP (in f) has BN applied to each fully-connected (fc) layer, including its output fc. Its output fc has no ReLU. The hidden fc is 2048-d. This MLP has 3 layers.” [0]

[0]: SimSiam, 2020, https://arxiv.org/abs/2011.10566

class lightly.models.modules.heads.SwaVProjectionHead(input_dim: int, hidden_dim: int, output_dim: int)

Projection head used for SwaV.

[0]: SwAV, 2020, https://arxiv.org/abs/2006.09882

class lightly.models.modules.heads.SwaVPrototypes(input_dim: int, n_prototypes: int)

Prototypes used for SwaV.

Each output feature is assigned to a prototype. SwaV solves the swapped prediction problem, where the features of one augmentation are used to predict the prototype assignments of the other augmentation.

Examples

>>> # use features with 128 dimensions and 512 prototypes
>>> prototypes = SwaVPrototypes(128, 512)
>>>
>>> # pass batch through backbone and projection head.
>>> features = model(x)
>>> features = nn.functional.normalize(features, dim=1, p=2)
>>>
>>> # logits has shape bsz x 512
>>> logits = prototypes(features)