lightly.models

The lightly.models package provides model implementations.

The package contains an implementation of the commonly used ResNet and adaptations of the architecture which make self-supervised learning simpler.

The package also hosts the Lightly model zoo - a list of downloadable ResNet checkpoints.

.resnet

Custom ResNet Implementation

Note that the architecture we present here differs from the one used in torchvision. We replace the first 7x7 convolution with a 3x3 convolution to make the model faster and better suited to smaller input image resolutions.

Furthermore, we introduce a resnet-9 variant for extra small models. These can run, for example, on a microcontroller with 100 kBytes of storage.

class lightly.models.resnet.BasicBlock(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)

Implementation of the ResNet Basic Block.

Attributes:
    in_planes:
        Number of input channels.
    planes:
        Number of channels.
    stride:
        Stride of the first convolutional layer.

forward(x: torch.Tensor)

Forward pass through basic ResNet block.

Args:
    x:
        Tensor of shape bsz x channels x W x H.

Returns:
    Tensor of shape bsz x channels x W x H.

class lightly.models.resnet.Bottleneck(in_planes: int, planes: int, stride: int = 1, num_splits: int = 0)

Implementation of the ResNet Bottleneck Block.

Attributes:
    in_planes:
        Number of input channels.
    planes:
        Number of channels.
    stride:
        Stride of the first convolutional layer.

forward(x)

Forward pass through bottleneck ResNet block.

Args:
    x:
        Tensor of shape bsz x channels x W x H.

Returns:
    Tensor of shape bsz x channels x W x H.

class lightly.models.resnet.ResNet(block: torch.nn.modules.module.Module = <class 'lightly.models.resnet.BasicBlock'>, layers: List[int] = [2, 2, 2, 2], num_classes: int = 10, width: float = 1.0, num_splits: int = 0)

ResNet implementation.

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun Deep Residual Learning for Image Recognition. arXiv:1512.03385

Attributes:
    block:
        ResNet building block type.
    layers:
        List of blocks per layer.
    num_classes:
        Number of classes in the final softmax layer.
    width:
        Multiplier for ResNet width.

forward(x: torch.Tensor)

Forward pass through ResNet.

Args:
    x:
        Tensor of shape bsz x channels x W x H.

Returns:
    Output tensor of shape bsz x num_classes.

lightly.models.resnet.ResNetGenerator(name: str = 'resnet-18', width: float = 1, num_classes: int = 10, num_splits: int = 0)

Builds and returns the specified ResNet.

Args:
    name:
        ResNet version from resnet-{9, 18, 34, 50, 101, 152}.
    width:
        ResNet width.
    num_classes:
        Output dimension of the last layer.
    num_splits:
        Number of splits to use for SplitBatchNorm (for the MoCo model). Increase this number to simulate multi-GPU behavior; e.g., num_splits=8 simulates an 8-GPU cluster. num_splits=0 uses standard PyTorch BatchNorm.

Returns:
    ResNet as nn.Module.

Examples:
>>> # binary classifier with ResNet-34
>>> from lightly.models import ResNetGenerator
>>> resnet = ResNetGenerator('resnet-34', num_classes=2)
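The num_splits behavior can be illustrated without the library: SplitBatchNorm normalizes each split of the batch with its own statistics, mimicking the per-GPU BatchNorm statistics of a multi-GPU run. A minimal stdlib sketch of the per-split statistics (illustrative only; the function name split_batch_stats is ours, not part of lightly):

```python
from statistics import fmean, pvariance

def split_batch_stats(batch, num_splits):
    """Mean/variance per split, as if each split lived on its own GPU.

    num_splits=0 mirrors standard BatchNorm: one statistic over the whole batch.
    """
    if num_splits == 0:
        return [(fmean(batch), pvariance(batch))]
    split_size = len(batch) // num_splits
    return [
        (fmean(batch[i * split_size:(i + 1) * split_size]),
         pvariance(batch[i * split_size:(i + 1) * split_size]))
        for i in range(num_splits)
    ]

batch = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(split_batch_stats(batch, 0))  # one (mean, var) pair over the full batch
print(split_batch_stats(batch, 2))  # one pair per simulated GPU
```

With num_splits=2 each half of the batch gets its own, noisier statistics, which is exactly why small-GPU MoCo training benefits from simulating the split.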

.barlowtwins

Barlow Twins ResNet-based model [0]. [0] Zbontar, J. et al., 2021. Barlow Twins… https://arxiv.org/abs/2103.03230

class lightly.models.barlowtwins.BarlowTwins(backbone: torch.nn.modules.module.Module, num_ftrs: int = 2048, proj_hidden_dim: int = 8192, out_dim: int = 8192, num_mlp_layers: int = 3)

Implementation of the BarlowTwins[0] network.

Recommended loss: lightly.loss.barlow_twins_loss.BarlowTwinsLoss

Default params are the ones explained in the original paper [0]. [0] Zbontar, J. et al., 2021. Barlow Twins… https://arxiv.org/abs/2103.03230

Attributes:
    backbone:
        Backbone model to extract features from images. ResNet-50 in the original paper [0].
    num_ftrs:
        Dimension of the embedding (before the projection head).
    proj_hidden_dim:
        Dimension of the hidden layer of the projection head. This should be the same size as num_ftrs.
    out_dim:
        Dimension of the output (after the projection head).

forward(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)

Forward pass through BarlowTwins.

Extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded. Unlike SimSiam, Barlow Twins uses only a projection head and no prediction head.

Args:
    x0:
        Tensor of shape bsz x channels x W x H.
    x1:
        Tensor of shape bsz x channels x W x H.
    return_features:
        Whether or not to return the intermediate features backbone(x).

Returns:
    The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.

Examples:
>>> # single input, single output
>>> out = model(x)
>>>
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
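The recommended BarlowTwinsLoss is built around the cross-correlation matrix between the two output projections: training pushes its diagonal toward 1 and its off-diagonal entries toward 0. A stdlib sketch of that matrix for batch-standardized embeddings (the idea from the paper, not lightly's implementation; all names below are ours):

```python
from statistics import fmean, pstdev

def _standardize(col):
    """Zero-mean, unit-variance version of one embedding dimension over the batch."""
    mu, sigma = fmean(col), pstdev(col)
    return [(v - mu) / sigma for v in col]

def cross_correlation(z0, z1):
    """C[i][j] = batch mean of z0[:, i] * z1[:, j] with standardized columns."""
    dim = len(z0[0])
    cols0 = [_standardize([row[i] for row in z0]) for i in range(dim)]
    cols1 = [_standardize([row[j] for row in z1]) for j in range(dim)]
    return [[fmean(a * b for a, b in zip(cols0[i], cols1[j])) for j in range(dim)]
            for i in range(dim)]

def barlow_twins_objective(c, lambd=0.005):
    """On-diagonal terms pulled toward 1, off-diagonal terms pulled toward 0."""
    on = sum((c[i][i] - 1.0) ** 2 for i in range(len(c)))
    off = sum(c[i][j] ** 2 for i in range(len(c)) for j in range(len(c)) if i != j)
    return on + lambd * off
```

When the two projections are identical, the diagonal of the matrix is exactly 1 and only the redundancy (off-diagonal) term of the objective remains.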

.simclr

SimCLR Model

class lightly.models.simclr.SimCLR(backbone: torch.nn.modules.module.Module, num_ftrs: int = 32, out_dim: int = 128)

Implementation of the SimCLR[0] architecture.

Recommended loss: lightly.loss.ntx_ent_loss.NTXentLoss

[0] SimCLR, 2020, https://arxiv.org/abs/2002.05709

Attributes:
    backbone:
        Backbone model to extract features from images.
    num_ftrs:
        Dimension of the embedding (before the projection head).
    out_dim:
        Dimension of the output (after the projection head).

forward(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)

Embeds and projects the input images.

Extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded.

Args:
    x0:
        Tensor of shape bsz x channels x W x H.
    x1:
        Tensor of shape bsz x channels x W x H.
    return_features:
        Whether or not to return the intermediate features backbone(x).

Returns:
    The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.

Examples:
>>> # single input, single output
>>> out = model(x) 
>>> 
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
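The recommended NTXentLoss treats out0[i] and out1[i] as a positive pair and every other projection in the batch as a negative. A stdlib sketch of the per-sample NT-Xent term on cosine similarities (the idea behind the loss, not lightly's implementation; the names are ours):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))

def nt_xent_term(anchor, positive, negatives, temperature=0.5):
    """-log softmax of the positive similarity against all similarities."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    denom = pos + sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / denom)

# an aligned positive pair costs less than a misaligned one
easy = nt_xent_term([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = nt_xent_term([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The temperature rescales the similarities before the softmax; lightly's NTXentLoss additionally draws negatives from the rest of the batch (or a memory bank) automatically.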

.moco

MoCo Model

class lightly.models.moco.MoCo(backbone: torch.nn.modules.module.Module, num_ftrs: int = 32, out_dim: int = 128, m: float = 0.999, batch_shuffle: bool = False)

Implementation of the MoCo (Momentum Contrast)[0] architecture.

Recommended loss: lightly.loss.ntx_ent_loss.NTXentLoss with a memory bank.

[0] MoCo, 2020, https://arxiv.org/abs/1911.05722

Attributes:
    backbone:
        Backbone model to extract features from images.
    num_ftrs:
        Dimension of the embedding (before the projection head).
    out_dim:
        Dimension of the output (after the projection head).
    m:
        Momentum for the momentum update of the key-encoder.

forward(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)

Embeds and projects the input image.

Performs the momentum update, extracts features with the backbone and applies the projection head to the output space. If both x0 and x1 are not None, both will be passed through the backbone and projection head. If x1 is None, only x0 will be forwarded.

Args:
    x0:
        Tensor of shape bsz x channels x W x H.
    x1:
        Tensor of shape bsz x channels x W x H.
    return_features:
        Whether or not to return the intermediate features backbone(x).

Returns:
    The output projection of x0 and (if x1 is not None) the output projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.

Examples:
>>> # single input, single output
>>> out = model(x) 
>>> 
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
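The momentum parameter m defines the exponential moving average by which the key encoder tracks the query encoder: every key weight becomes m * key + (1 - m) * query before the forward pass. A stdlib sketch of that update rule (the rule from the MoCo paper; the function name is ours, not lightly's internals):

```python
def momentum_update(key_params, query_params, m=0.999):
    """EMA update: the key encoder slowly follows the query encoder."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

key = [0.0, 0.0]
query = [1.0, -1.0]
key = momentum_update(key, query)  # key drifts 0.1% toward the query weights
```

With the default m=0.999 the key encoder changes very slowly, which keeps the representations in the memory bank consistent across training steps.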

.simsiam

SimSiam Model

class lightly.models.simsiam.SimSiam(backbone: torch.nn.modules.module.Module, num_ftrs: int = 2048, proj_hidden_dim: int = 2048, pred_hidden_dim: int = 512, out_dim: int = 2048, num_mlp_layers: int = 3)

Implementation of the SimSiam[0] network.

Recommended loss: lightly.loss.sym_neg_cos_sim_loss.SymNegCosineSimilarityLoss

[0] SimSiam, 2020, https://arxiv.org/abs/2011.10566

Attributes:
    backbone:
        Backbone model to extract features from images.
    num_ftrs:
        Dimension of the embedding (before the projection head).
    proj_hidden_dim:
        Dimension of the hidden layer of the projection head. This should be the same size as num_ftrs.
    pred_hidden_dim:
        Dimension of the hidden layer of the prediction head. This should be num_ftrs / 4.
    out_dim:
        Dimension of the output (after the projection head).

forward(x0: torch.Tensor, x1: torch.Tensor = None, return_features: bool = False)

Forward pass through SimSiam.

Extracts features with the backbone and applies the projection head and prediction head to the output space. If both x0 and x1 are not None, both will be passed through the backbone, projection, and prediction head. If x1 is None, only x0 will be forwarded.

Args:
    x0:
        Tensor of shape bsz x channels x W x H.
    x1:
        Tensor of shape bsz x channels x W x H.
    return_features:
        Whether or not to return the intermediate features backbone(x).

Returns:
    The output prediction and projection of x0 and (if x1 is not None) the output prediction and projection of x1. If return_features is True, the output for each x is a tuple (out, f) where f are the features before the projection head.

Examples:
>>> # single input, single output
>>> out = model(x) 
>>> 
>>> # single input with return_features=True
>>> out, f = model(x, return_features=True)
>>>
>>> # two inputs, two outputs
>>> out0, out1 = model(x0, x1)
>>>
>>> # two inputs, two outputs with return_features=True
>>> (out0, f0), (out1, f1) = model(x0, x1, return_features=True)
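The recommended SymNegCosineSimilarityLoss compares each prediction with the projection of the other view, L = -(cos(p0, z1) + cos(p1, z0)) / 2, with a stop-gradient on the projections. A stdlib sketch of that symmetric term (the formula from the SimSiam paper; the stop-gradient is a no-op here since nothing is differentiated, and the names are ours):

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))

def sym_neg_cosine(p0, z1, p1, z0):
    """Symmetrized negative cosine similarity; -1.0 means perfect alignment."""
    return -0.5 * (cos_sim(p0, z1) + cos_sim(p1, z0))
```

The loss reaches its minimum of -1.0 when each prediction points in the same direction as the other view's projection.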

.zoo

Lightly Model Zoo

lightly.models.zoo.checkpoints()

Returns the Lightly model zoo as a list of checkpoints.

Checkpoints:
    ResNet-9:
        SimCLR with width = 0.0625 and num_ftrs = 16
    ResNet-9:
        SimCLR with width = 0.125 and num_ftrs = 16
    ResNet-18:
        SimCLR with width = 1.0 and num_ftrs = 16
    ResNet-18:
        SimCLR with width = 1.0 and num_ftrs = 32
    ResNet-34:
        SimCLR with width = 1.0 and num_ftrs = 16
    ResNet-34:
        SimCLR with width = 1.0 and num_ftrs = 32

Returns:
    A list of available checkpoints as URLs.