Configuration Options
Run Configuration
The following configuration options are available when scheduling a run:
from lightly.api import ApiWorkflowClient
# Create the Lightly client to connect to the API.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")
client.schedule_compute_worker_run(
worker_config={
# Enable training of the self-supervised model.
"enable_training": False,
"training": {
# Prediction task name. For example "lightly_pretagging" or "my_predictions".
# Set to train the embedding model on object crops. Must be an object detection or
# keypoint detection task.
"task_name": "",
},
# Provide a checkpoint for the self-supervised model.
"checkpoint": "https://my-checkpoint-read-url",
# Enable pretagging. This detects objects in the input images/videos and makes
# them available for selection. See https://docs.lightly.ai/docs/lightly-pretagging for details.
"pretagging": False,
# Path to a file containing filenames to run the Lightly Worker on a subset of the
# files in the input datasource. See https://docs.lightly.ai/docs/relevant-filenames
# on how to specify relevant filenames.
"relevant_filenames_file": "",
# Sequence length for sequence selection on videos.
# See https://docs.lightly.ai/docs/sequence-selection for details.
"selected_sequence_length": 1,
# Datasource settings. See https://docs.lightly.ai/docs/input-files-and-folder-struture
# for details on how to configure a datasource.
"datasource": {
# If False, only new samples in the datasource that were not yet processed
# by an earlier Lightly Worker run are processed. Set to True to reprocess
# all samples in the datasource.
"process_all": False,
# Set to False to disable uploading the selected samples to the Lightly
# Platform. This will keep your dataset unchanged and can be useful for
# dry-runs of the Lightly Worker.
"enable_datapool_update": True,
# Bypass the verification of read/write access to the datasource.
"bypass_verify": False,
},
# Image format for selected video frames that are uploaded to the bucket.
"output_image_format": "png",
# Number of data loading processes. If -1, then one process per CPU core
# is created. Set to 0 to load data in the main process. Set to a low number
# to reduce memory usage at the cost of slower processing.
"num_processes": -1,
# Number of data loading threads. If -1, then two threads per CPU core
# are created. The number of threads is always at least one.
"num_threads": -1,
# Path to a file containing custom embeddings for the images in your input datasource.
# The file must be stored in the .lightly/embeddings/ directory in your Lightly
# datasource. The path in the config must be relative to the .lightly/embeddings
# directory. See https://docs.lightly.ai/docs/custom-embeddings for details.
"embeddings": "",
},
# Selection settings. See https://docs.lightly.ai/docs/selection for details.
selection_config={
# Absolute number of samples to select. When using a datapool, n_samples additional
# samples will be added to the datapool independent of the datapool size.
"n_samples": None,
# Number of samples to select relative to the number of input samples. If set to
# 0.1 then 10% of the input samples are selected. When using a datapool, proportion_samples
# of the new input samples will be added to the datapool independent of the datapool size.
"proportion_samples": None,
# List of selection strategy configurations.
"strategies": [
{
# See https://docs.lightly.ai/docs/selection#selection-input on how to
# set the input configuration.
"input": {
# Input type. For example "EMBEDDINGS".
"type": None,
# Prediction task name. For example "lightly_pretagging" or "my_predictions".
# Only used if input type is "EMBEDDINGS", "PREDICTIONS" or "SCORES".
"task": None,
# Active learning score name. For example "uncertainty_entropy".
# Only used if input type is "SCORES".
"score": None,
# Metadata key. For example "lightly.sharpness" or "weather.temperature".
# Only used if input type is "METADATA".
"key": None,
# Must be set to "CLASS_DISTRIBUTION" if input type is "PREDICTIONS".
# Otherwise unused.
"name": None,
# Dataset id from which similarity search query embeddings are loaded.
# Only used if input type is "EMBEDDINGS".
"dataset_id": None,
# Tag name from which similarity search query embeddings are loaded.
# Only used if input type is "EMBEDDINGS".
"tag_name": None,
},
# See https://docs.lightly.ai/docs/selection#selection-strategy on how
# to set the strategy configuration.
"strategy": {
# Strategy type. For example "DIVERSITY".
"type": None,
# Minimum distance between chosen samples. For example 0.1.
# Only used if strategy type is "DIVERSITY". Value should be between
# 0 and 2. Increasing the distance results in fewer selected samples.
"stopping_condition_minimum_distance": None,
# Selection threshold. For example 20. Only used if strategy type is
# "THRESHOLD".
"threshold": None,
# Threshold operation. For example "BIGGER_EQUAL". Only used if
# strategy type is "THRESHOLD".
"operation": None,
# Balancing target. Must be dictionary from target name to target
# ratio. For example {"Ambulance": 0.4, "Bus": 0.6}. Only used if
# strategy type is "BALANCE".
"target": None,
# Strength of this strategy relative to other strategies. Value must
# be in [-1e9, 1e9].
"strength": 1.0,
},
},
],
},
lightly_config={
# Dataloader Settings.
"loader": {
# The number of processes and number of threads to use for data loading.
# Deprecated: use "num_processes" and "num_threads" instead.
"num_workers": -1,
# Batch size used by the Lightly Worker. Reduce to lower memory usage.
# We recommend not reducing the batch size if training is enabled.
"batch_size": 16,
# Whether to reshuffle data after each epoch.
"shuffle": True,
},
# Trainer Settings.
"trainer": {
# Number of GPUs to use for training. Set to 0 to use CPU instead.
# Using more than one GPU is not yet supported.
"gpus": 1,
# Number of training epochs.
"max_epochs": 100,
# Floating point precision. Set to 16 for faster processing with half-precision.
"precision": 32,
},
# Model Settings.
"model": {
# Name of the model, currently supports popular variants:
# resnet-18, resnet-34, resnet-50, resnet-101, resnet-152.
"name": "resnet-18",
# Dimensionality of output on which self-supervised loss is calculated.
"out_dim": 128,
# Dimensionality of feature vectors (embedding size).
"num_ftrs": 32,
# Width of the resnet.
"width": 1,
},
# Training Loss Settings.
"criterion": {
# Temperature by which logits are divided in self-supervised loss.
"temperature": 0.5,
},
# Training Optimizer Settings.
"optimizer": {
# Learning rate of the optimizer.
"lr": 1.0,
# L2 penalty.
"weight_decay": 0.00001,
},
# Training Augmentation Settings.
"collate": {
# Size of the input images in pixels.
"input_size": 64,
# Probability that color jitter is applied.
"cj_prob": 0.8,
# How much to jitter brightness.
"cj_bright": 0.7,
# How much to jitter contrast.
"cj_contrast": 0.7,
# How much to jitter saturation.
"cj_sat": 0.7,
# How much to jitter hue.
"cj_hue": 0.2,
# Minimum size of random crop relative to input_size.
"min_scale": 0.15,
# Probability that image is converted to grayscale.
"random_gray_scale": 0.2,
# Probability that gaussian blur is applied.
"gaussian_blur": 0.5,
# Kernel size of gaussian blur relative to input_size.
"kernel_size": 0.1,
# Probability that vertical flip is applied.
"vf_prob": 0.0,
# Probability that horizontal flip is applied.
"hf_prob": 0.5,
# Probability that random rotation is applied.
"rr_prob": 0.0,
# Range of degrees to select from for random rotation.
# If rr_degrees is None, images are rotated by 90 degrees.
# If rr_degrees is a [min, max] list, images are rotated
# by a random angle in [min, max]. All rotations are counter-clockwise.
"rr_degrees": None,
},
"checkpoint_callback": {
# If True, the checkpoint from the last epoch is saved.
"save_last": True,
},
# Random Seed.
"seed": 1,
}
)
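Most runs only set a few of the selection options listed above. The following is a minimal sketch of a selection configuration that combines a DIVERSITY strategy on EMBEDDINGS with a THRESHOLD strategy on the "lightly.sharpness" metadata key. The helper functions (diversity_strategy, metadata_threshold_strategy, check_strategy) are hypothetical and not part of the lightly package; only the dictionary keys and string values are taken from the reference above. The sketch also assumes that keys left at their defaults can be omitted, and it checks the two value ranges documented in the comments.

```python
# Hypothetical helpers for illustration; they are NOT part of the lightly package.
# Only the dictionary keys and the string values come from the reference above.


def check_strategy(entry):
    """Raise ValueError if a strategy entry violates a documented value range."""
    strategy = entry["strategy"]
    strength = strategy.get("strength", 1.0)
    if not -1e9 <= strength <= 1e9:
        raise ValueError(f"strength must be in [-1e9, 1e9], got {strength}")
    distance = strategy.get("stopping_condition_minimum_distance")
    if distance is not None and not 0 <= distance <= 2:
        raise ValueError(f"minimum distance should be between 0 and 2, got {distance}")
    return entry


def diversity_strategy(min_distance=None, strength=1.0):
    """Build a DIVERSITY strategy that uses embeddings as input."""
    strategy = {"type": "DIVERSITY", "strength": strength}
    if min_distance is not None:
        strategy["stopping_condition_minimum_distance"] = min_distance
    return check_strategy({"input": {"type": "EMBEDDINGS"}, "strategy": strategy})


def metadata_threshold_strategy(key, threshold, operation="BIGGER_EQUAL"):
    """Build a THRESHOLD strategy on a metadata key such as "lightly.sharpness"."""
    return check_strategy(
        {
            "input": {"type": "METADATA", "key": key},
            "strategy": {
                "type": "THRESHOLD",
                "threshold": threshold,
                "operation": operation,
            },
        }
    )


# Select 10% of the input samples (assumption: unused keys may be omitted).
selection_config = {
    "proportion_samples": 0.1,
    "strategies": [
        diversity_strategy(min_distance=0.1),
        metadata_threshold_strategy("lightly.sharpness", 20),
    ],
}
```

The resulting dictionary can be passed as selection_config=selection_config to client.schedule_compute_worker_run. Building entries through small functions keeps each strategy readable and catches out-of-range values before a run is scheduled.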
Lightly Worker Start Configuration
The following configuration options can be passed when starting the Lightly Worker Docker image:
docker run --shm-size="1024m" --gpus all --rm -it \
-e LIGHTLY_TOKEN={MY_LIGHTLY_TOKEN} \
-e LIGHTLY_WORKER_ID={MY_WORKER_ID} \
lightly/worker:latest \
worker.force_start=True \
sanity_check=False
worker_id
: See Install Lightly on how to get a worker id. Alternatively, the worker id can be provided as the LIGHTLY_WORKER_ID environment variable.
worker.force_start
: If True, the worker notifies that it is online even if another worker with the same worker_id is already online. This can be useful if the other worker is actually offline but was not able to shut down properly. If False, the new worker will not start if another worker with the same id already exists.
sanity_check
: Set to True to verify the installation of the Lightly Worker. The worker shuts down once the installation is verified. See Sanity Check for more information.
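When the worker is started from a script, the start command can be assembled programmatically. The following is a minimal sketch that rebuilds the docker command shown above as an argument list, here with sanity_check enabled. The function worker_start_command and its parameters are hypothetical; the flags, environment variables, image name, and options are copied from the command above.

```python
import shlex


def worker_start_command(token, worker_id, force_start=True, sanity_check=False,
                         shm_size="1024m"):
    """Return the docker command for starting the Lightly Worker as a list.

    Hypothetical helper: it only mirrors the flags and options of the
    docker command shown in this section.
    """
    return [
        "docker", "run", f"--shm-size={shm_size}", "--gpus", "all", "--rm", "-it",
        "-e", f"LIGHTLY_TOKEN={token}",
        "-e", f"LIGHTLY_WORKER_ID={worker_id}",
        "lightly/worker:latest",
        f"worker.force_start={force_start}",
        f"sanity_check={sanity_check}",
    ]


# Placeholders as in the command above; replace them with real values.
command = worker_start_command("MY_LIGHTLY_TOKEN", "MY_WORKER_ID", sanity_check=True)

# Print a shell-quoted version for inspection (shlex.join requires Python 3.8+).
print(shlex.join(command))
```

The list form can be handed to subprocess.run(command) without shell quoting concerns. With sanity_check=True the worker is expected to shut down once the installation is verified, as described above.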