Quick Start

This quick start shows how to run LightlyOne locally to select the 50 most diverse images from a folder of images on your disk.

For a more in-depth guide, see our step-by-step getting started.

Step 1: Install the Lightly Python Client Package

pip3 install lightly

Step 2: Download the LightlyOne Worker

docker pull lightly/worker:latest
docker run --shm-size="1024m" --rm -it lightly/worker:latest sanity_check=True

If these commands fail, follow our Docker installation guide.

You can continue with step 3 while the worker image is downloading, as the two steps are independent of each other.
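Before pulling the worker image, it can help to confirm that Docker is installed and its daemon is reachable. The following is a minimal sketch using only the Python standard library; the helper name docker_available is hypothetical and not part of the lightly package.

```python
import shutil
import subprocess


def docker_available() -> bool:
    """Return True if a `docker` executable is on PATH and `docker version` succeeds.

    Hypothetical helper for illustration; not part of the lightly package.
    """
    if shutil.which("docker") is None:
        return False
    try:
        # `docker version` exits non-zero when the daemon cannot be reached.
        result = subprocess.run(
            ["docker", "version"], capture_output=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_available():
        print("Docker looks ready.")
    else:
        print("Docker is missing or not running; see the Docker installation guide.")
```

If the check fails, the docker pull and docker run commands above will most likely fail as well.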

🚧

Mac with Apple Silicon

If you use a Mac with an Apple silicon chip, make sure to enable Rosetta emulation in Docker Desktop for fast processing. To enable it, go to Docker Desktop > Settings > General > Use Rosetta for x86_64/amd64 emulation on Apple Silicon. This requires Docker Desktop 4.25 or later.
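If you are unsure whether this note applies to your machine, a rough check from Python might look like the sketch below. It rests on the assumption that a natively running Python on Apple silicon reports the machine type "arm64"; a Python interpreter that itself runs under Rosetta may report "x86_64" instead. The function name is hypothetical.

```python
import platform


def is_apple_silicon_mac() -> bool:
    """Best-effort check for macOS on an Apple silicon chip.

    Assumption: the Python interpreter runs natively. Under Rosetta,
    platform.machine() may return "x86_64" and this returns False.
    """
    return platform.system() == "Darwin" and platform.machine() == "arm64"


if __name__ == "__main__":
    if is_apple_silicon_mac():
        print("Apple silicon detected: enable Rosetta emulation in Docker Desktop.")
    else:
        print("No Apple silicon Mac detected by this check.")
```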

Step 3: Schedule a Selection Run

  1. Create a Python script, e.g. schedule_selection_run.py, and copy the following code into it.
  2. Change two variables in it:
    1. Set LIGHTLY_TOKEN to your token: sign up for free and retrieve it from the preferences page in the LightlyOne Platform.
    2. Set DATASET_PATH to the directory containing your images.
      • If you already have such a dataset on your local disk, locate the path to it.
      • Otherwise, use the clothing_dataset:
        git clone https://github.com/lightly-ai/dataset_clothing_images.git clothing_dataset
  3. Run the script, e.g. with python3 schedule_selection_run.py.

from os import linesep
from pathlib import Path
from datetime import datetime
from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client import DatasetType, DatasourcePurpose

###### CHANGE THESE 2 VARIABLES
LIGHTLY_TOKEN = "CHANGE_ME_TO_YOUR_TOKEN"  # Copy from https://app.lightly.ai/preferences
DATASET_PATH = Path("CHANGE_ME")  # e.g., Path("/path/to/images") or Path("clothing_dataset")
######

assert DATASET_PATH.exists(), f"Dataset path {DATASET_PATH} does not exist."

# Create the LightlyOne client to connect to the API.
client = ApiWorkflowClient(token=LIGHTLY_TOKEN)

# Create the dataset on the LightlyOne Platform.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/set-up-your-first-dataset
client.create_dataset(
    dataset_name=f"first_dataset__{datetime.now().strftime('%Y_%m_%d__%H_%M_%S')}",
    dataset_type=DatasetType.IMAGES,
)

# Configure the datasources.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/set-up-your-first-dataset
client.set_local_config(purpose=DatasourcePurpose.INPUT)
client.set_local_config(purpose=DatasourcePurpose.LIGHTLY)

# Schedule a run on the dataset to select 50 diverse samples.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/run-your-first-selection
scheduled_run_id = client.schedule_compute_worker_run(
    worker_config={"shutdown_when_job_finished": True},
    selection_config={
        "n_samples": 50,
        "strategies": [
            {"input": {"type": "EMBEDDINGS"}, "strategy": {"type": "DIVERSITY"}}
        ],
    },
)

# Print the commands to run in step 4 (Docker Run) and step 5 (Lightly Serve).
print(
    f"{linesep}Docker Run command: {linesep}"
    f"\033[7m"
    f"docker run --shm-size='1024m' --rm -it \\{linesep}"
    f"\t-v '{DATASET_PATH.absolute()}':/input_mount:ro \\{linesep}"
    f"\t-v '{Path('lightly').absolute()}':/lightly_mount \\{linesep}"
    f"\t-e LIGHTLY_TOKEN={LIGHTLY_TOKEN} \\{linesep}"
    f"\tlightly/worker:latest{linesep}"
    f"\033[0m"
)
print(
    f"{linesep}Lightly Serve command:{linesep}"
    f"\033[7m"
    f"lightly-serve \\{linesep}"
    f"\tinput_mount='{DATASET_PATH.absolute()}' \\{linesep}"
    f"\tlightly_mount='{Path('lightly').absolute()}'{linesep}"
    f"\033[0m"
)
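The script only asserts that DATASET_PATH exists. As an optional extra check before scheduling a run, you could count the image files under that path. The sketch below is illustrative: the helper name count_images and the extension list are assumptions, not the set of formats the LightlyOne Worker actually supports.

```python
from pathlib import Path

# Assumption: an illustrative list of common image extensions,
# not the worker's official list of supported formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp"}


def count_images(dataset_path: Path, extensions=IMAGE_EXTENSIONS) -> int:
    """Count files below dataset_path (recursively) with an image extension."""
    if not dataset_path.is_dir():
        return 0
    return sum(
        1
        for file in dataset_path.rglob("*")
        if file.is_file() and file.suffix.lower() in extensions
    )


if __name__ == "__main__":
    path = Path("clothing_dataset")  # same placeholder as DATASET_PATH above
    print(f"Found {count_images(path)} image files in {path}")
```

Since the run above requests 50 samples, a count well below 50 suggests that DATASET_PATH points at the wrong directory.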

Step 4: Process the Run With the LightlyOne Worker

Run the Docker Run command printed by the Python script from step 3.

The worker will take a while to process your dataset.

👍

Congratulations! You successfully ran your first selection with the LightlyOne Worker!

Step 5: Explore the Selected Dataset

Next, serve the images from your local disk to your local browser by running the Lightly Serve command, which was also printed by the Python script from step 3.

If your images are on a different machine than your web browser (i.e., DATASET_PATH in the script above is not on the computer you are reading this on), you also need to forward a port. See the docs on port forwarding.
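After setting up port forwarding, you may want to verify from the browser machine that the forwarded port accepts connections. The sketch below uses only the standard library. The helper name port_is_open is hypothetical, and the port number 3456 is an assumption; use whichever port your Lightly Serve command and forwarding setup actually use.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    SERVE_PORT = 3456  # assumption: replace with the port you forwarded
    if port_is_open("localhost", SERVE_PORT):
        print(f"Port {SERVE_PORT} is reachable.")
    else:
        print(f"Nothing is listening on port {SERVE_PORT}; check the forwarding.")
```

This only confirms that something accepts connections on the port, not that it is Lightly Serve.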

👍

Awesome! You are now able to view and explore the dataset interactively on the LightlyOne Platform.

Next Steps

LightlyOne is a powerful tool for automated data curation. To better understand what the LightlyOne Worker can do and how to set up pipelines at your enterprise, see the following guides:

  • To better understand the commands used in this quick start, see Getting Started.
  • To change the selection configuration, see Selection.
  • To use data stored in cloud storage (AWS S3, Google Cloud Storage, Azure), see Cloud Storage.
  • To understand how LightlyOne ensures total PII compliance and that no sensitive data leaves your premises, see Security.
  • To get into more advanced features, work through one of the tutorials in the Tutorials section or go directly to the Advanced section.