Configuring Image Augmentations

Pretraining relies heavily on image augmentations such as:

Random Cropping and Resizing: Crops random parts of images and resizes them to fixed resolutions.
Random Horizontal and Vertical Flipping: Mirrors images across horizontal or vertical axes.
Random Rotation: Rotates images by random angles.
Color Jittering: Randomly modifies brightness, contrast, saturation, and hue.
Random Grayscaling: Converts images to grayscale with a certain probability.
Gaussian Blurring: Applies a Gaussian blur filter with a random \(\sigma\), smoothing the image.
Random Solarization: Inverts pixel values above a random threshold.
Normalization: Scales pixel values using predefined mean and standard deviation.

While the default settings in LightlyTrain should work well for most use cases, some downstream tasks and image domains may benefit from overriding the defaults and adjusting the applied augmentations. This can be done as follows:

Python

For the Python API, use a dictionary structure to override any augmentation settings and pass it to the lightly_train.train function through the transform_args argument. Many augmentations can also be turned off completely by setting them to None, as demonstrated in this example with the color_jitter augmentation.

import lightly_train

my_transform_args = {
    "random_resize": {
        "min_scale": 0.1
    },
    "image_size": (128, 128),
    "color_jitter": None,  # None turns this augmentation off
}

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",            # Output directory
        data="my_data_dir",                 # Directory with images
        model="torchvision/resnet18",       # Model to train
        transform_args=my_transform_args,   # Overrides of default augmentation parameters
    )

Command Line

There are two ways to configure the augmentations on the command line:

Use dotted notation
Pass all arguments as a single JSON structure

Important

Make sure that any values that you pass through the command line are JSON-compatible. This means:

Strings inside JSON structures must use double quotes (wrap the whole structure in single quotes).
Tuples do not exist in JSON; use bracketed notation instead (like a Python list).
JSON’s equivalent of Python’s None is null, which you have to use in order to selectively turn off an augmentation.
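If the overrides already exist as a Python dictionary, the standard library's json module can produce a JSON-compatible string that follows the rules above. The following is a minimal sketch, not part of LightlyTrain itself; the variable name transform_args_json is chosen here for illustration only.

```python
import json

# Same overrides as in the Python API example.
my_transform_args = {
    "random_resize": {"min_scale": 0.1},
    "image_size": (128, 128),
    "color_jitter": None,
}

# json.dumps emits double-quoted strings, turns tuples into
# bracketed lists, and turns None into null.
transform_args_json = json.dumps(my_transform_args)

# Wrap the whole structure in single quotes for the shell.
print(f"transform_args='{transform_args_json}'")
```

The printed line can be pasted into a command that passes all arguments as a single JSON structure.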

An example using dotted notation (with bracketed notation for the image size) would be:

lightly-train train \
    out="out/my_experiment" \
    overwrite=True \
    data="my_data_dir" \
    model="torchvision/resnet18" \
    transform_args.image_size="[128,128]" \
    transform_args.random_resize.min_scale=0.1 \
    transform_args.color_jitter=null
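To see how a nested dictionary maps onto dotted arguments like the ones above, the following sketch flattens one into key.path=value strings. The helper to_dotted_args is hypothetical and not part of LightlyTrain. It assumes every leaf value can be JSON-encoded and only covers the value types used in this section (numbers, tuples, and None).

```python
import json
import shlex


def to_dotted_args(prefix, value):
    """Flatten a nested dict into 'key.path=value' strings (hypothetical helper)."""
    if isinstance(value, dict):
        args = []
        for key, child in value.items():
            args.extend(to_dotted_args(f"{prefix}.{key}", child))
        return args
    # Leaves are JSON-encoded without spaces:
    # tuples become [a,b] and None becomes null.
    return [f"{prefix}={json.dumps(value, separators=(',', ':'))}"]


my_transform_args = {
    "random_resize": {"min_scale": 0.1},
    "image_size": (128, 128),
    "color_jitter": None,
}

dotted_args = to_dotted_args("transform_args", my_transform_args)

# shlex.quote adds shell quoting where needed (e.g. around brackets).
print(" ".join(shlex.quote(arg) for arg in dotted_args))
```

Each nesting level of the dictionary becomes one dot-separated segment of the argument name.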

An example using a single JSON structure looks as follows:

lightly-train train \
    out="out/my_experiment" \
    data="my_data_dir" \
    model="torchvision/resnet18" \
    transform_args='{"image_size": [128, 128], "random_resize": {"min_scale": 0.1}, "color_jitter": null}'

The next sections cover the arguments available across all methods, as well as the arguments unique to specific methods.