lightly.loss¶
The lightly.loss package provides loss functions for self-supervised learning.
lightly.loss.ntx_ent_loss¶

class lightly.loss.ntx_ent_loss.NTXentLoss(temperature: float = 0.5, memory_bank_size: int = 0)¶

Implementation of the Contrastive Cross Entropy Loss.

This implementation follows the SimCLR[0] paper. If you enable the memory bank by setting memory_bank_size > 0, the loss behaves like the one described in the MoCo[1] paper.

[0] SimCLR, 2020, https://arxiv.org/abs/2002.05709
[1] MoCo, 2020, https://arxiv.org/abs/1911.05722
 Attributes:
 temperature:
Scale logits by the inverse of the temperature.
 memory_bank_size:
Number of negative samples to store in the memory bank. Use 0 for SimCLR. For MoCo we typically use numbers like 4096 or 65536.
 Raises:
ValueError: If abs(temperature) < 1e-8, to prevent division by zero.
Examples:
>>> # initialize loss function without memory bank
>>> loss_fn = NTXentLoss(memory_bank_size=0)
>>>
>>> # generate two random transforms of images
>>> t0 = transforms(images)
>>> t1 = transforms(images)
>>>
>>> # feed through SimCLR or MoCo model
>>> batch = torch.cat((t0, t1), dim=0)
>>> output = model(batch)
>>>
>>> # calculate loss
>>> loss = loss_fn(output)

forward(out0: torch.Tensor, out1: torch.Tensor)¶

Forward pass through the Contrastive Cross Entropy Loss.

If used with a memory bank, the samples from the memory bank are used as negative examples. Otherwise, within-batch samples are used as negative samples.
 Args:
 out0:
Output projections of the first set of transformed images. Shape: (batch_size, embedding_size)
 out1:
Output projections of the second set of transformed images. Shape: (batch_size, embedding_size)
 Returns:
Contrastive Cross Entropy Loss value.
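The within-batch (SimCLR-style) case can be sketched as follows. This is a minimal illustration of the NT-Xent computation, not lightly's implementation; the helper name ntx_ent_loss is hypothetical.

```python
import torch
import torch.nn.functional as F

def ntx_ent_loss(out0: torch.Tensor, out1: torch.Tensor,
                 temperature: float = 0.5) -> torch.Tensor:
    """Sketch of NT-Xent with within-batch negatives (no memory bank)."""
    # normalize embeddings onto the unit hypersphere
    z0 = F.normalize(out0, dim=1)
    z1 = F.normalize(out1, dim=1)
    batch_size = z0.shape[0]
    # pairwise similarities between all 2N embeddings, scaled by 1/temperature
    z = torch.cat([z0, z1], dim=0)           # (2N, d)
    sim = z @ z.t() / temperature            # (2N, 2N)
    # mask out self-similarity so a sample is never its own negative
    sim.fill_diagonal_(float('-inf'))
    # the positive for sample i is its other view at index (i + N) mod 2N
    targets = (torch.arange(2 * batch_size) + batch_size) % (2 * batch_size)
    return F.cross_entropy(sim, targets)
```

The loss is symmetric in the two views, since each of the 2N embeddings contributes one cross-entropy term against its positive.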
lightly.loss.sym_neg_cos_sim_loss¶

class lightly.loss.sym_neg_cos_sim_loss.SymNegCosineSimilarityLoss¶

Implementation of the Symmetrized Loss used in the SimSiam[0] paper.

[0] SimSiam, 2020, https://arxiv.org/abs/2011.10566
Examples:
>>> # initialize loss function
>>> loss_fn = SymNegCosineSimilarityLoss()
>>>
>>> # generate two random transforms of images
>>> t0 = transforms(images)
>>> t1 = transforms(images)
>>>
>>> # feed through SimSiam model
>>> out0, out1 = model(t0, t1)
>>>
>>> # calculate loss
>>> loss = loss_fn(out0, out1)

forward(out0: torch.Tensor, out1: torch.Tensor)¶

Forward pass through the Symmetrized Loss.
 Args:
 out0:
Output projections of the first set of transformed images. Expects the tuple to be of the form (z0, p0), where z0 is the output of the backbone and projection mlp, and p0 is the output of the prediction head.
 out1:
Output projections of the second set of transformed images. Expects the tuple to be of the form (z1, p1), where z1 is the output of the backbone and projection mlp, and p1 is the output of the prediction head.
 Returns:
Symmetrized negative cosine similarity loss value.
 Raises:
ValueError: If the shape of the output is not a multiple of batch_size.
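The symmetrized loss described above can be sketched directly from the (z, p) tuples: negative cosine similarity with a stop-gradient on the projections, averaged over both directions. This is a minimal sketch, not lightly's implementation, and the helper name sym_neg_cos_sim_loss is hypothetical.

```python
import torch
import torch.nn.functional as F

def sym_neg_cos_sim_loss(out0, out1):
    """Sketch of the symmetrized SimSiam loss.

    Each input is a tuple (z, p): projection output z and prediction
    head output p. Gradients are stopped on z, as in the SimSiam paper.
    """
    z0, p0 = out0
    z1, p1 = out1
    # negative cosine similarity with stop-gradient on the projections
    loss0 = -F.cosine_similarity(p0, z1.detach(), dim=1).mean()
    loss1 = -F.cosine_similarity(p1, z0.detach(), dim=1).mean()
    return 0.5 * (loss0 + loss1)
```

When predictions and projections align perfectly, the loss reaches its minimum of -1.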

lightly.loss.memory_bank¶

class lightly.loss.memory_bank.MemoryBankModule(size: int = 65536)¶

Memory bank implementation.

This is a parent class to all loss functions implemented by the lightly Python package. This way, any loss can be used with a memory bank if desired.
 Attributes:
 size:
Number of keys the memory bank can store. If set to 0, memory bank is not used.
 Examples:
>>> class MyLossFunction(MemoryBankModule):
>>>
>>>     def __init__(self, memory_bank_size: int = 2 ** 16):
>>>         super(MyLossFunction, self).__init__(memory_bank_size)
>>>
>>>     def forward(self, output: torch.Tensor,
>>>                 labels: torch.Tensor = None):
>>>         output, negatives = super(
>>>             MyLossFunction, self).forward(output)
>>>
>>>         if negatives is not None:
>>>             ...  # evaluate loss with negative samples
>>>         else:
>>>             ...  # evaluate loss without negative samples

forward(output: torch.Tensor, labels: torch.Tensor = None, update: bool = False)¶

Query memory bank for additional negative samples.
 Args:
 output:
The output of the model.
 labels:
Should always be None, will be ignored.
 Returns:
The output if the memory bank is of size 0, otherwise the output and the entries from the memory bank.
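The query-then-enqueue behavior of a memory bank can be sketched as a ring buffer: return the stored negatives, then overwrite the oldest entries with the current batch. This is a minimal sketch under that assumption, not lightly's implementation; the class name RingMemoryBank is hypothetical.

```python
import torch

class RingMemoryBank:
    """Sketch of a ring-buffer memory bank of negative samples."""

    def __init__(self, size: int, dim: int):
        # start from random entries; real banks fill up from real batches
        self.bank = torch.randn(size, dim)
        self.ptr = 0

    def query_and_update(self, output: torch.Tensor) -> torch.Tensor:
        """Return the current negatives, then enqueue the new batch.

        Assumes the batch is no larger than the bank.
        """
        negatives = self.bank.clone()
        n = output.shape[0]
        # positions to overwrite, wrapping around the end of the bank
        idx = (self.ptr + torch.arange(n)) % self.bank.shape[0]
        self.bank[idx] = output.detach()
        self.ptr = (self.ptr + n) % self.bank.shape[0]
        return negatives
```

Returning the bank before the update ensures a batch is never compared against its own entries.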
lightly.loss.barlow_twins_loss¶

class lightly.loss.barlow_twins_loss.BarlowTwinsLoss(lambda_param=0.005)¶

Implementation of the Barlow Twins Loss from the Barlow Twins[0] paper. This code specifically implements Algorithm 1 from [0].

[0] Zbontar, J. et al., 2021, Barlow Twins… https://arxiv.org/abs/2103.03230
Examples:
>>> # initialize loss function
>>> loss_fn = BarlowTwinsLoss()
>>>
>>> # generate two random transforms of images
>>> t0 = transforms(images)
>>> t1 = transforms(images)
>>>
>>> # feed through the model
>>> out0, out1 = model(t0, t1)
>>>
>>> # calculate loss
>>> loss = loss_fn(out0, out1)

forward(z_a: torch.Tensor, z_b: torch.Tensor)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
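Algorithm 1 of the Barlow Twins paper reduces to three steps: standardize both embeddings along the batch dimension, form their cross-correlation matrix, and penalize deviations of the diagonal from 1 and of the off-diagonal from 0. A minimal sketch under those assumptions (not lightly's implementation; the function name barlow_twins_loss is hypothetical):

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                      lambda_param: float = 0.005) -> torch.Tensor:
    """Sketch of Algorithm 1 from the Barlow Twins paper."""
    n, d = z_a.shape
    # standardize each embedding dimension across the batch
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    # cross-correlation matrix between the two views, shape (d, d)
    c = (z_a.t() @ z_b) / n
    # invariance term: push the diagonal toward 1
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    # redundancy-reduction term: push the off-diagonal toward 0
    off_diag = (c - torch.diag_embed(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambda_param * off_diag
```

The lambda_param weight trades off the invariance term against the redundancy-reduction term.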

lightly.loss.hypersphere_loss¶

class lightly.loss.hypersphere_loss.HypersphereLoss(t=1.0, lam=1.0, alpha=2.0)¶

Implementation of the loss described in 'Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere' [0].

[0] Tongzhou Wang et al., 2020, … https://arxiv.org/abs/2005.10242
In order for this loss to function as advertised, an l2-normalization onto the hypersphere is required. This loss function applies the normalization internally in the loss layer. However, it is recommended to apply the same normalization in your architecture as well, since the normalization is also intended to be applied during inference. There may be merit in leaving it out of the inference pathway, but this use has not been tested.

Moreover, it is recommended that the layers preceding this loss function are either a linear layer without activation, a batch-normalization layer, or both. The directly upstream architecture has a large influence on the ability of this loss to achieve its stated aim of promoting uniformity on the hypersphere. If, by contrast, the last layer feeding the embedding is a ReLU or similar nonlinearity, the embeddings will never get very close to uniformity on the hypersphere, but will remain confined to the subspace of positive activations. Similar architectural considerations are relevant to most contrastive loss functions, but we call it out here explicitly.
Examples:
>>> # initialize loss function
>>> loss_fn = HypersphereLoss()
>>>
>>> # generate two random transforms of images
>>> t0 = transforms(images)
>>> t1 = transforms(images)
>>>
>>> # feed through the model
>>> out0, out1 = model(t0, t1)
>>>
>>> # calculate loss
>>> loss = loss_fn(out0, out1)

forward
(z_a: torch.Tensor, z_b: torch.Tensor) → torch.Tensor¶  Args:
x : torch.Tensor, [b, d], float y : torch.Tensor, [b, d], float
 Returns:
 torch.Tensor, [], float
scalar loss value
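The alignment-and-uniformity objective of Wang & Isola combines two terms: positive pairs should be close (alignment), and the normalized embeddings should spread over the sphere (uniformity, a log-mean Gaussian potential over pairwise distances). A minimal sketch under those assumptions, not lightly's implementation; the function name hypersphere_loss and the exact weighting are illustrative only:

```python
import torch
import torch.nn.functional as F

def hypersphere_loss(x: torch.Tensor, y: torch.Tensor,
                     t: float = 1.0, lam: float = 1.0,
                     alpha: float = 2.0) -> torch.Tensor:
    """Sketch of the alignment + uniformity loss."""
    # project both batches onto the unit hypersphere (l2-normalization)
    x = F.normalize(x, dim=1)
    y = F.normalize(y, dim=1)
    # alignment: mean distance between positive pairs, raised to alpha
    align = (x - y).norm(dim=1).pow(alpha).mean()

    def uniform(z: torch.Tensor) -> torch.Tensor:
        # log of the mean Gaussian potential over all pairwise distances
        sq_dists = torch.pdist(z, p=2).pow(2)
        return torch.log(torch.exp(-t * sq_dists).mean())

    # uniformity is averaged over both views and weighted by lam
    return align + lam * (uniform(x) + uniform(y)) / 2
```

The uniformity term is minimized when embeddings are spread as far apart as possible on the sphere, which is why a positive-only activation upstream (as warned above) prevents it from reaching its optimum.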

lightly.loss.regularizer.co2¶

class lightly.loss.regularizer.co2.CO2Regularizer(alpha: float = 1, t_consistency: float = 0.05, memory_bank_size: int = 0)¶

Implementation of the CO2 regularizer [0] for self-supervised learning.

[0] CO2, 2021, https://arxiv.org/abs/2010.02217
 Attributes:
 alpha:
Weight of the regularization term.
 t_consistency:
Temperature used during softmax calculations.
 memory_bank_size:
Number of negative samples to store in the memory bank. Use 0 to use the second batch for negative samples.
 Examples:
>>> # initialize loss function for MoCo
>>> loss_fn = NTXentLoss(memory_bank_size=4096)
>>>
>>> # initialize CO2 regularizer
>>> co2 = CO2Regularizer(alpha=1.0, memory_bank_size=4096)
>>>
>>> # generate two random transforms of images
>>> t0 = transforms(images)
>>> t1 = transforms(images)
>>>
>>> # feed through the MoCo model
>>> out0, out1 = model(t0, t1)
>>>
>>> # calculate loss and apply regularizer
>>> loss = loss_fn(out0, out1) + co2(out0, out1)

forward(out0: torch.Tensor, out1: torch.Tensor)¶

Computes the CO2 regularization term for two model outputs.
 Args:
 out0:
Output projections of the first set of transformed images.
 out1:
Output projections of the second set of transformed images.
 Returns:
The regularization term multiplied by the weight factor alpha.
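The CO2 idea is a consistency term: the similarity distribution of one view over the negatives should match that of the other view, measured by a symmetric KL divergence at temperature t_consistency. A minimal sketch for the within-batch case (memory_bank_size=0); this is not lightly's implementation, and the function name co2_regularizer is hypothetical:

```python
import torch
import torch.nn.functional as F

def co2_regularizer(out0: torch.Tensor, out1: torch.Tensor,
                    alpha: float = 1.0,
                    t_consistency: float = 0.05) -> torch.Tensor:
    """Sketch of the CO2 consistency term with within-batch negatives."""
    z0 = F.normalize(out0, dim=1)
    z1 = F.normalize(out1, dim=1)
    # each view's similarity distribution over the other view's samples
    logits0 = z0 @ z1.t() / t_consistency
    logits1 = z1 @ z0.t() / t_consistency
    log_p = F.log_softmax(logits0, dim=1)
    log_q = F.log_softmax(logits1, dim=1)
    # symmetric KL divergence between the two distributions
    div = 0.5 * (F.kl_div(log_p, log_q, log_target=True, reduction='batchmean')
                 + F.kl_div(log_q, log_p, log_target=True, reduction='batchmean'))
    return alpha * div
```

When both views produce identical embeddings, the two distributions coincide and the regularization term vanishes.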