(method-transform-args)=

# Configuring Image Augmentations

Pretraining relies strongly on image augmentations such as:

- **Random Cropping and Resizing**: Crops random parts of images and resizes them to fixed resolutions.
- **Random Horizontal and Vertical Flipping**: Mirrors images across horizontal or vertical axes.
- **Random Rotation**: Rotates images by random angles.
- **Color Jittering**: Randomly modifies brightness, contrast, saturation, and hue.
- **Random Grayscaling**: Converts images to grayscale with a certain probability.
- **Gaussian Blurring**: Applies a Gaussian blur filter with a random {math}`\sigma`, smoothing the image.
- **Random Solarization**: Inverts pixel values above a random threshold.
- **Normalization**: Scales pixel values using predefined mean and standard deviation.

While the default settings in LightlyTrain should work well for most use cases, for some downstream tasks and image domains it might be beneficial to override the defaults and adjust the applied augmentations. This can be done as follows:

````{tab} Python
For the Python API, use a dictionary to override any augmentation settings and pass it to the `lightly_train.train` function through the `transform_args` argument. Many augmentations can also be turned off individually by setting them to `None`, as this example demonstrates with the `color_jitter` augmentation.
```python
import lightly_train
my_transform_args = {
    "random_resize": {
        "min_scale": 0.1
    },
    "image_size": (128, 128),
    "color_jitter": None,
}
if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",            # Output directory
        data="my_data_dir",                 # Directory with images
        model="torchvision/resnet18",       # Model to train
        transform_args=my_transform_args,   # Overrides of default augmentation parameters
    )
```
````

````{tab} Command Line
There are two ways to configure the augmentations on the command line:
1. Use dotted notation
2. Pass all arguments as a single JSON structure

```{important}
Make sure that any values that you pass through the command line are JSON-compatible. This means:
 - Strings inside JSON structures must use double quotes (wrap the whole structure in single quotes).
 - Tuples do not exist in JSON; use bracketed notation instead (like a Python list).
 - The JSON equivalent of Python's `None` is `null`, which you have to use to turn off an individual augmentation.
```

An example using the dotted notation:
```bash
lightly-train train \
    out="out/my_experiment" \
    overwrite=True \
    data="my_data_dir" \
    model="torchvision/resnet18" \
    transform_args.image_size="[128,128]" \
    transform_args.random_resize.min_scale=0.1 \
    transform_args.color_jitter=null
```

And an example of using a single JSON structure would look as follows:

```bash
lightly-train train \
    out="out/my_experiment" \
    data="my_data_dir" model="torchvision/resnet18" \
    transform_args='{"image_size": [128, 128], "random_resize": {"min_scale": 0.1}, "color_jitter": null}'
```
````
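The JSON rules for command-line values can be checked with Python's standard `json` module, which turns tuples into bracketed lists and `None` into `null`. The following is a minimal sketch, not part of LightlyTrain: the `flatten` helper is hypothetical and only illustrates how a Python `transform_args` dictionary maps onto the two command-line styles, with `shlex.quote` adding single quotes where the shell needs them.

```python
import json
import shlex

# Same overrides as in the Python example above.
transform_args = {
    "image_size": (128, 128),
    "random_resize": {"min_scale": 0.1},
    "color_jitter": None,
}


def flatten(prefix, value):
    """Yield (dotted_key, leaf_value) pairs for a nested dictionary (hypothetical helper)."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(f"{prefix}.{key}", child)
    else:
        yield prefix, value


# Option 1: one dotted argument per leaf value, each value JSON-encoded.
dotted_args = [
    f"{key}={shlex.quote(json.dumps(value))}"
    for key, value in flatten("transform_args", transform_args)
]

# Option 2: the whole dictionary as a single JSON structure in single quotes.
json_arg = "transform_args=" + shlex.quote(json.dumps(transform_args))

for arg in dotted_args:
    print(arg)
print(json_arg)
```

The printed arguments can be appended to a `lightly-train train` command; the tuple appears as `[128, 128]` and the disabled augmentation as `null`.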

The next sections will cover which arguments are available across all methods, and also the arguments unique to specific methods.

```{seealso}
Interested in the default augmentation settings for each method? Check the method pages:
 - {ref}`methods-dino`
 - {ref}`methods-distillation`
 - {ref}`methods-simclr`
```

## Arguments available for all methods

The following arguments are available for all methods: {ref}`methods-distillation`, {ref}`methods-dino`, and {ref}`methods-simclr`.

### Random Cropping and Resizing

Can be disabled by setting to `None`.

```python skip_ruff
"random_resize": {
    "min_scale": float,
    "max_scale": float,
}
```

### Image Size

Cannot be disabled, required for all transforms.

```python skip_ruff
"image_size": tuple[int, int]  # height, width
```

### Random Horizontal and Vertical Flipping

Can be disabled by setting to `None`.

```python skip_ruff
"random_flip": {
    "horizontal_prob": float, # probability of applying horizontal flip
    "vertical_prob": float,   # probability of applying vertical flip
}
```

### Random Rotation

Can be disabled by setting to `None`.

```python skip_ruff
"random_rotation": {
    "prob": float,   # probability of applying rotation
    "degrees": int,  # maximum rotation angle in degrees
}
```

### Color Jittering

Can be disabled by setting to `None`.

```python skip_ruff
"color_jitter": {
    "prob": float,       # probability of applying color jitter
    "strength": float,   # multiplier for all parameters below
    "brightness": float, # how much to jitter brightness (non-negative)
    "contrast": float,   # how much to jitter contrast (non-negative)
    "saturation": float, # how much to jitter saturation (non-negative)
    "hue": float,        # how much to jitter hue (non-negative)
}
```
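Since `strength` is described as a multiplier for the four jitter parameters, the effective jitter amounts can be worked out by hand. The sketch below assumes a plain product of `strength` and each parameter; the `effective_jitter` helper is hypothetical and the numbers are illustrative, not LightlyTrain defaults.

```python
def effective_jitter(color_jitter):
    """Scale each jitter parameter by `strength` (assumed to be a plain multiplier)."""
    strength = color_jitter["strength"]
    names = ("brightness", "contrast", "saturation", "hue")
    return {name: strength * color_jitter[name] for name in names}


# Illustrative values only, not the library defaults.
color_jitter = {
    "prob": 0.8,
    "strength": 0.5,
    "brightness": 0.8,
    "contrast": 0.8,
    "saturation": 0.4,
    "hue": 0.2,
}

effective = effective_jitter(color_jitter)
print(effective)
```

Under this assumption, halving `strength` halves all four jitter amounts at once, which is convenient when the overall augmentation is too strong for a given image domain.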

### Random Grayscaling

Can be disabled by setting to `None`.

```python skip_ruff
"random_gray_scale": float  # probability of converting to grayscale
```

### Gaussian Blurring

Can be disabled by setting to `None`.

```python skip_ruff
"gaussian_blur": {
    "prob": float,                        # probability of applying blur
    "sigmas": tuple[float, float],        # range of sigma values
    "blur_limit": int | tuple[int, int],  # range of kernel size, either [0, high] or [low, high]
}
```

### Random Solarization

Can be disabled by setting to `None`.

```python skip_ruff
"solarize": {
    "prob": float,      # probability of applying solarization
    "threshold": float  # threshold value in range [0, 1]
}
```
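To make the effect of `threshold` concrete, here is a minimal sketch of solarization on a handful of pixel values. It assumes pixel values scaled to `[0, 1]` and that only values strictly above the threshold are inverted; the `solarize` function is illustrative and not the library's implementation.

```python
def solarize(pixels, threshold):
    """Invert every pixel value above `threshold`; pixels are assumed to lie in [0, 1]."""
    return [1.0 - p if p > threshold else p for p in pixels]


pixels = [0.1, 0.5, 0.75, 1.0]
solarized = solarize(pixels, threshold=0.5)
print(solarized)  # [0.1, 0.5, 0.25, 0.0]
```

A lower threshold inverts more of the image, and a threshold of `1.0` leaves it unchanged in this sketch.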

### Normalization

Cannot be disabled, required for all transforms.

```python skip_ruff
"normalize": {
    "mean": tuple[float, float, float],  # means of the three channels
    "std": tuple[float, float, float]    # standard deviations of the three channels
}
```
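Normalization subtracts the per-channel mean and divides by the per-channel standard deviation. The sketch below shows this for a single RGB pixel; it assumes pixel values are already scaled to `[0, 1]`, and the `mean` and `std` values are placeholders rather than LightlyTrain defaults.

```python
def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each of the three channels."""
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, std))


# Placeholder statistics, chosen only to keep the arithmetic easy to follow.
normalize = {
    "mean": (0.5, 0.5, 0.5),
    "std": (0.25, 0.25, 0.25),
}

pixel = (0.5, 0.75, 0.0)
normalized = normalize_pixel(pixel, normalize["mean"], normalize["std"])
print(normalized)  # (0.0, 1.0, -2.0)
```

When overriding `normalize`, the statistics would typically be computed from the training images of the target domain so that each channel ends up roughly zero-centered.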

## Arguments unique to methods

Distillation and SimCLR have no transform configuration options beyond the arguments available for all methods, which are listed above.

### DINO

DINO uses a multi-crop strategy with two full-resolution "global" views (which have slightly different augmentation parameters) and optional additional lower-resolution "local" views (default: 6 views).

In addition to the arguments available for all methods, the following DINO-specific arguments are available. Note that `local_view` itself can be disabled by setting it to `None`. Additionally, some augmentations within these structures can be disabled by setting them to `None`:

```python skip_ruff
"global_view_1": {                     # modifications for second global view (cannot be disabled)
    "gaussian_blur": {                 # can be disabled by setting to None
        "prob": float,
        "sigmas": tuple[float, float],
        "blur_limit": int | tuple[int, int]
    },
    "solarize": {                      # can be disabled by setting to None
        "prob": float,
        "threshold": float
    }
},
"local_view": {                        # configuration for local views (can be disabled by setting to None)
    "num_views": int,                  # number of local views to generate
    "view_size": tuple[int, int],      # size of local views
    "random_resize": {                 # can be disabled by setting to None
        "min_scale": float,
        "max_scale": float
    },
    "gaussian_blur": {                 # can be disabled by setting to None
        "prob": float,
        "sigmas": tuple[float, float],
        "blur_limit": int | tuple[int, int]
    }
}
```
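As a sketch of how these options combine, the dictionary below reduces the number of local views and disables solarization for the second global view. The concrete values are illustrative, and the `total_views` helper is hypothetical; it assumes that an omitted `num_views` falls back to the default of 6 local views mentioned above.

```python
# Illustrative DINO overrides; values are examples, not recommended settings.
dino_transform_args = {
    "global_view_1": {
        "solarize": None,  # disable solarization for the second global view
    },
    "local_view": {
        "num_views": 4,
        "view_size": (96, 96),
        "random_resize": {"min_scale": 0.05, "max_scale": 0.3},
    },
}


def total_views(transform_args, default_local_views=6):
    """Count views per image: two global views plus the local views (hypothetical helper)."""
    local_view = transform_args.get("local_view", {})
    if local_view is None:  # local views disabled entirely
        return 2
    return 2 + local_view.get("num_views", default_local_views)


print(total_views(dino_transform_args))   # 6
print(total_views({"local_view": None}))  # 2
```

Such a dictionary would be passed through `transform_args` in the same way as the general example at the top of this page. Fewer or smaller local views reduce the compute per training step, at the cost of less multi-crop signal.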