lightly.utils
The lightly.utils package provides global utility methods.
The io module contains utilities for saving and loading embeddings in a format that the Lightly library understands. The embeddings_2d module transforms embeddings to a two-dimensional space for easier visualization.
.io
I/O operations to save and load embeddings.
lightly.utils.io.load_embeddings(path: str)
Loads embeddings from a csv file in a Lightly-compatible format.
- Args:
- path:
Path to the csv file.
- Returns:
The embeddings as a numpy array, labels as a list of integers, and filenames as a list of strings in the order they were saved.
The embeddings will always be of the Float32 datatype.
- Examples:
>>> import lightly.utils.io as io
>>> embeddings, labels, filenames = io.load_embeddings(
...     'path/to/my/embeddings.csv')
lightly.utils.io.load_embeddings_as_dict(path: str, embedding_name: str = 'default', return_all: bool = False)
Loads embeddings from a csv file and stores them in a dictionary for transfer.
Loads embeddings into a dictionary which can be serialized and sent to the Lightly servers. Always specify embedding_name, because the Lightly web-app does not allow two embeddings with the same name.
- Args:
- path:
Path to the csv file.
- embedding_name:
Name of the embedding for the platform.
- return_all:
If True, also return the embeddings, labels, and filenames.
- Returns:
A dictionary containing the embedding information. If return_all is True, the embeddings, labels, and filenames are returned as well (see load_embeddings).
- Examples:
>>> import lightly.utils.io as io
>>> embedding_dict = io.load_embeddings_as_dict(
...     'path/to/my/embeddings.csv',
...     embedding_name='MyEmbeddings')
>>>
>>> result = io.load_embeddings_as_dict(
...     'path/to/my/embeddings.csv',
...     embedding_name='MyEmbeddings',
...     return_all=True)
>>> embedding_dict, embeddings, labels, filenames = result
lightly.utils.io.save_embeddings(path: str, embeddings: numpy.ndarray, labels: List[int], filenames: List[str])
Saves embeddings in a csv file in a Lightly-compatible format.
Creates a csv file at the location specified by path and saves embeddings, labels, and filenames.
- Args:
- path:
Path to the csv file.
- embeddings:
Embeddings of the images as a numpy array (n x d).
- labels:
List of integer labels.
- filenames:
List of filenames.
- Raises:
ValueError if embeddings, labels, and filenames have different lengths.
- Examples:
>>> import lightly.utils.io as io
>>> io.save_embeddings(
...     'path/to/my/embeddings.csv',
...     embeddings,
...     labels,
...     filenames)
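The length check described under Raises can be illustrated with a small standalone sketch. This is not the library's implementation: save_embeddings_sketch is a hypothetical helper, and the column layout (filenames, one column per embedding dimension, labels) is an assumption made for illustration, not a statement about the exact Lightly csv format.

```python
import csv
from typing import List, Sequence


def save_embeddings_sketch(
    path: str,
    embeddings: Sequence[Sequence[float]],
    labels: List[int],
    filenames: List[str],
) -> None:
    """Hypothetical stand-in for save_embeddings; the csv layout is assumed."""
    n = len(embeddings)
    # Refuse to write anything if the three inputs do not line up row by row.
    if len(labels) != n or len(filenames) != n:
        raise ValueError(
            f"Length mismatch: {n} embeddings, {len(labels)} labels, "
            f"{len(filenames)} filenames."
        )
    dim = len(embeddings[0]) if n else 0
    # Assumed header: filename column, d embedding columns, label column.
    header = ["filenames"] + [f"embedding_{i}" for i in range(dim)] + ["labels"]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for name, row, label in zip(filenames, embeddings, labels):
            writer.writerow([name, *row, label])
```

Validating before opening the file means a mismatch never leaves a partially written csv behind.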
.embeddings_2d
Transform embeddings to two-dimensional space for visualization.
class lightly.utils.embeddings_2d.PCA(n_components: int = 2, eps: float = 1e-10)
Hand-written PCA implementation that avoids a dependency on sklearn.
- Attributes:
- n_components:
Number of principal components to keep.
- eps:
Epsilon for numerical stability.
fit(X: numpy.ndarray)
Fits PCA to data in X.
- Args:
- X:
Datapoints stored in numpy array of size n x d.
- Returns:
The fitted PCA object, which can be used to transform datapoints.
transform(X: numpy.ndarray)
Uses PCA to transform data in X.
- Args:
- X:
Datapoints stored in numpy array of size n x d.
- Returns:
Numpy array of n x p datapoints where p <= d.
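To make the fit/transform contract concrete, here is a minimal NumPy sketch of a PCA with the same interface. It is an illustration under stated assumptions, not the code behind lightly.utils.embeddings_2d.PCA: the class name PCASketch and its attributes mean_ and components_ are hypothetical, it uses an eigendecomposition of the covariance matrix, and it omits eps because the text above does not say where that term is applied.

```python
import numpy as np


class PCASketch:
    """Illustrative PCA via eigendecomposition of the covariance matrix."""

    def __init__(self, n_components: int = 2):
        self.n_components = n_components
        self.mean_ = None        # per-feature mean, shape (d,)
        self.components_ = None  # projection matrix, shape (d, p)

    def fit(self, X: np.ndarray) -> "PCASketch":
        X = np.asarray(X, dtype=np.float64)
        self.mean_ = X.mean(axis=0)
        # d x d covariance matrix of the centered data.
        cov = np.cov(X - self.mean_, rowvar=False)
        # eigh is meant for symmetric matrices; eigenvalues come back ascending.
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]
        # Keep the eigenvectors with the largest eigenvalues as columns.
        self.components_ = eigvecs[:, order[: self.n_components]]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        X = np.asarray(X, dtype=np.float64)
        # Center with the mean learned in fit, then project: (n, d) @ (d, p).
        return (X - self.mean_) @ self.components_
```

Because fit returns the object itself, calls can be chained, for example PCASketch(n_components=2).fit(X).transform(X), which yields an n x 2 array for any X with at least two columns.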
lightly.utils.embeddings_2d.fit_pca(embeddings: numpy.ndarray, n_components: int = 2, fraction: float = None)
Fits PCA to a randomly selected subset of embeddings.
For large datasets, it can be infeasible to perform PCA on the whole dataset. This method can fit a PCA on a fraction of the embeddings in order to save computational resources.
- Args:
- embeddings:
Datapoints stored in numpy array of size n x d.
- n_components:
Number of principal components to keep.
- fraction:
Fraction of the dataset to fit PCA on.
- Returns:
A transformer which can be used to transform embeddings to lower dimensions.
- Raises:
ValueError if fraction < 0 or fraction > 1.
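The subsampling step can be sketched as follows. This is a hedged illustration rather than the library's code: select_fit_subset and its seed parameter are hypothetical, and returning all rows when fraction is None is an assumption based on the default value in the signature. Only the ValueError condition is taken from the description above.

```python
import numpy as np
from typing import Optional


def select_fit_subset(
    embeddings: np.ndarray,
    fraction: Optional[float] = None,
    seed: Optional[int] = None,
) -> np.ndarray:
    """Hypothetical helper: pick a random subset of rows to fit PCA on."""
    if fraction is None:
        # Assumption: no fraction given means fitting on every row.
        return embeddings
    if fraction < 0.0 or fraction > 1.0:
        raise ValueError(f"fraction must be in [0, 1] but was {fraction}.")
    n = embeddings.shape[0]
    n_keep = int(n * fraction)
    rng = np.random.default_rng(seed)
    # A random permutation gives n_keep distinct row indices.
    idx = rng.permutation(n)[:n_keep]
    return embeddings[idx]
```

The returned subset would then be passed to a PCA fit, while the transform is still applied to the full set of embeddings afterwards.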