Assign Scheduled Runs to Specific Workers

When several Lightly Workers process several scheduled runs, it is often useful to assign scheduled runs to specific workers. Lightly supports this through labels:

  • Each worker can have a set of labels, e.g. ["gpu-A100", "gpu", "bobs_worker", "team_worker"]. Multiple workers can have some labels in common (e.g. multiple gpu workers), but also have labels that only they have.
  • When scheduling a run, you can specify a set of labels the worker picking up the run must have. E.g. specify ["gpu-A100"] to let the run be picked up by any worker with the label gpu-A100.

Specifying Worker Labels

The labels of a Lightly Worker must be assigned when registering it and cannot be changed later. Follow the usual registration steps and add the labels argument when calling client.register_compute_worker():

# execute the following code once to get a worker_id
from lightly.api import ApiWorkflowClient

client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN") # replace this with your token
# THE FOLLOWING LINE IS DIFFERENT
worker_id = client.register_compute_worker(name="worker-with-labels", labels=["gpu-A100", "gpu", "bobs_worker", "team_worker"])
# THE LAST LINE WAS DIFFERENT
print(worker_id)

Next, start the Lightly Worker on your machine with the newly created worker_id, as usual.
Lightly Workers do not have any default labels.

Specifying Labels when Scheduling a Run

Follow the usual steps for scheduling a run, with one change: also specify the runs_on argument when calling client.schedule_compute_worker_run():

from lightly.api import ApiWorkflowClient

# Create a client with your token and configure it to use your dataset ID.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")

# Configure and schedule a run.
scheduled_run_id = client.schedule_compute_worker_run(
    # THE FOLLOWING LINE IS DIFFERENT
    runs_on=["gpu-A100"],
    # THE LAST LINE WAS DIFFERENT
    worker_config={...},
    selection_config={...},
)
print(scheduled_run_id)

This scheduled run will be picked up only by workers that have the label "gpu-A100" among their labels.

Label Matching

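A worker only picks up runs whose runs_on labels are a subset of its own labels. For example, a worker with the labels ["gpu-A100", "gpu"] picks up a run with runs_on=["gpu"], but not a run with runs_on=["gpu", "team_worker"].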

For legacy reasons, workers without labels will pick up all runs, including those that have runs_on specified.
For the same reason, runs without runs_on specified will be picked up by all workers, including those that have labels.
We therefore recommend switching fully to labels, for all your workers and all your scheduled runs.
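
The matching rules above can be sketched in a few lines of plain Python. This is only an illustration of the rules as described on this page, not the actual Lightly implementation; the function name worker_picks_up_run and its parameters are hypothetical and not part of the Lightly API.

```python
from typing import Iterable


def worker_picks_up_run(worker_labels: Iterable[str], runs_on: Iterable[str]) -> bool:
    """Hypothetical helper: decide whether a worker would pick up a run.

    Mirrors the rules described in this section; it is not Lightly's code.
    """
    worker = set(worker_labels)
    required = set(runs_on)

    # Legacy rule: a worker without labels picks up every run.
    if not worker:
        return True

    # Subset rule: every runs_on label must be among the worker's labels.
    # An empty runs_on is a subset of any label set, so runs without
    # runs_on are picked up by all workers.
    return required.issubset(worker)


# A labeled worker and a run that asks for one of its labels.
print(worker_picks_up_run(["gpu-A100", "gpu"], ["gpu-A100"]))
# A labeled worker that lacks the requested label.
print(worker_picks_up_run(["gpu"], ["gpu-A100"]))
# Legacy cases: unlabeled worker, and run without runs_on.
print(worker_picks_up_run([], ["gpu-A100"]))
print(worker_picks_up_run(["gpu"], []))
```

The two legacy cases show why mixing labeled and unlabeled workers or runs can send a run to an unintended worker: an unlabeled worker accepts a run even when runs_on is set.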