Quick Start
This quick start shows how to use the LightlyOne solution locally to select the 50 most diverse images from a set of locally saved images.
For a more in-depth guide, see our step-by-step getting started.
Step 1: Install Lightly Python Client Package
pip3 install lightly
Step 2: Download the LightlyOne Worker
docker pull lightly/worker:latest
docker run --shm-size="1024m" --rm -it lightly/worker:latest sanity_check=True
If these commands fail, follow our docker installation guide.
While the worker is downloading, you can continue with step 3, since the two steps are independent of each other.
Mac with Apple Silicon
If you use a Mac with an Apple silicon chip, make sure to enable Rosetta emulation in Docker Desktop for fast processing. To enable it, go to Docker Desktop > Settings > General > Use Rosetta for x86_64/amd64 emulation on Apple Silicon. This requires Docker Desktop 4.25 or later.
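If you are unsure whether this note applies to your machine, a small check like the one below can tell you. It is only a sketch: the helper name is hypothetical, and it assumes a natively running Python interpreter, which reports the machine type `arm64` on Apple silicon. A Python that itself runs under Rosetta reports `x86_64` instead, so a negative result is not conclusive.

```python
import platform


def is_apple_silicon_mac(system: str, machine: str) -> bool:
    """Return True for the macOS + arm64 combination (hypothetical helper)."""
    return system == "Darwin" and machine == "arm64"


if is_apple_silicon_mac(platform.system(), platform.machine()):
    # Reminder only; this script cannot inspect or change Docker Desktop settings.
    print("Apple silicon detected: enable Rosetta emulation in Docker Desktop.")
```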
Step 3: Schedule a Selection Run
- Create a Python script named, e.g., schedule_selection_run.py and copy the following code into it.
- Change two variables in it:
  - LIGHTLY_TOKEN: Sign up for free and retrieve your token from the preferences page in the LightlyOne Platform.
  - Set the DATASET_PATH to your images.
    - If you already have such a dataset on your local disk, locate the path to it.
    - Otherwise, use the clothing_dataset:
      git clone https://github.com/lightly-ai/dataset_clothing_images.git clothing_dataset
- Then run the script, e.g. with python3 schedule_selection_run.py.
from os import linesep
from pathlib import Path
from datetime import datetime
from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client import DatasetType, DatasourcePurpose
###### CHANGE THESE 2 VARIABLES
LIGHTLY_TOKEN = "CHANGE_ME_TO_YOUR_TOKEN" # Copy from https://app.lightly.ai/preferences
DATASET_PATH = Path("CHANGE_ME") # e.g., Path("/path/to/images") or Path("clothing_dataset")
######
assert DATASET_PATH.exists(), f"Dataset path {DATASET_PATH} does not exist."
# Create the LightlyOne client to connect to the API.
client = ApiWorkflowClient(token=LIGHTLY_TOKEN)
# Create the dataset on the LightlyOne Platform.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/set-up-your-first-dataset
client.create_dataset(
    dataset_name=f"first_dataset__{datetime.now().strftime('%Y_%m_%d__%H_%M_%S')}",
    dataset_type=DatasetType.IMAGES,
)
# Configure the datasources.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/set-up-your-first-dataset
client.set_local_config(purpose=DatasourcePurpose.INPUT)
client.set_local_config(purpose=DatasourcePurpose.LIGHTLY)
# Schedule a run on the dataset to select 50 diverse samples.
# See our guide for more details and options:
# https://docs.lightly.ai/docs/run-your-first-selection
scheduled_run_id = client.schedule_compute_worker_run(
    worker_config={"shutdown_when_job_finished": True},
    selection_config={
        "n_samples": 50,
        "strategies": [
            {"input": {"type": "EMBEDDINGS"}, "strategy": {"type": "DIVERSITY"}}
        ],
    },
)
# Print the next commands
print(
    f"{linesep}Docker Run command: {linesep}"
    f"\033[7m"
    f"docker run --shm-size='1024m' --rm -it \\{linesep}"
    f"\t-v '{DATASET_PATH.absolute()}':/input_mount:ro \\{linesep}"
    f"\t-v '{Path('lightly').absolute()}':/lightly_mount \\{linesep}"
    f"\t-e LIGHTLY_TOKEN={LIGHTLY_TOKEN} \\{linesep}"
    f"\tlightly/worker:latest{linesep}"
    f"\033[0m"
)
print(
    f"{linesep}Lightly Serve command:{linesep}"
    f"\033[7m"
    f"lightly-serve \\{linesep}"
    f"\tinput_mount='{DATASET_PATH.absolute()}' \\{linesep}"
    f"\tlightly_mount='{Path('lightly').absolute()}'{linesep}"
    f"\033[0m"
)
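Before scheduling a run, it can help to check the two values you changed in the script. The sketch below is not part of the lightly package; the helper names are hypothetical, and the set of file suffixes counted as images is an assumption that may not match the formats the worker actually supports. It flags a token that is still the placeholder and a dataset with fewer images than the 50 samples requested.

```python
from pathlib import Path
from typing import List

# Assumption: suffixes treated as images here; the worker's supported formats may differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
PLACEHOLDER_TOKEN = "CHANGE_ME_TO_YOUR_TOKEN"  # placeholder value from the script above


def count_images(dataset_path: Path) -> int:
    """Count files below dataset_path (recursively) with an image-like suffix."""
    return sum(
        1
        for p in dataset_path.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def preflight_problems(dataset_path: Path, token: str, n_samples: int = 50) -> List[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    if not token or token == PLACEHOLDER_TOKEN:
        problems.append("LIGHTLY_TOKEN is empty or still the placeholder value.")
    if not dataset_path.is_dir():
        problems.append(f"Dataset path {dataset_path} is not a directory.")
        return problems  # counting images makes no sense without a directory
    n_images = count_images(dataset_path)
    if n_images < n_samples:
        problems.append(f"Found {n_images} images, fewer than n_samples={n_samples}.")
    return problems
```

In the script, you could call `preflight_problems(DATASET_PATH, LIGHTLY_TOKEN)` right after the two variables are set and print any returned messages before creating the dataset.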
Step 4: Process the Run With the LightlyOne Worker
Run the Docker Run command printed by the Python script from step 3.
The worker will take a while to process your dataset.
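If you would rather launch the worker from Python than copy the printed command, the same arguments can be assembled as a list. This is a sketch that mirrors the command printed in step 3; the function name `build_worker_command` is hypothetical and not part of the lightly package.

```python
from pathlib import Path
from typing import List


def build_worker_command(dataset_path: Path, lightly_path: Path, token: str) -> List[str]:
    """Build the docker run arguments printed by the step 3 script as a list."""
    return [
        "docker", "run", "--shm-size=1024m", "--rm", "-it",
        # Mount the images read-only, as in the printed command.
        "-v", f"{dataset_path.absolute()}:/input_mount:ro",
        # Mount the local output directory used by the worker.
        "-v", f"{lightly_path.absolute()}:/lightly_mount",
        "-e", f"LIGHTLY_TOKEN={token}",
        "lightly/worker:latest",
    ]
```

Passing the list to `subprocess.run(command, check=True)` from a terminal session avoids shell quoting issues with paths that contain spaces. Note that the `-it` flags expect an interactive terminal.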
Congratulations! You successfully ran your first selection with the LightlyOne Worker!
Step 5: Explore the Selected Dataset
Next, serve the images from your local disk to your local browser by running the Lightly Serve command, which the Python script from step 3 also printed.
If your images are on a different machine than your web browser (i.e., DATASET_PATH in the script above is not on the computer you are reading this on), you also need to forward a port. See the docs on port forwarding.
Awesome! You are now able to view and explore the dataset interactively on the LightlyOne Platform.
Next Steps
LightlyOne is a powerful tool for automated data curation. To better understand what the LightlyOne Worker can do and how to set up pipelines at your enterprise, see the following guides:
- To understand the commands used in this quick start better, see Getting Started
- For changing the selection configuration, see Selection
- For using data stored in cloud storage (AWS S3, Google Cloud Storage, Azure), see Cloud Storage
- To understand how LightlyOne ensures PII compliance and that no sensitive data leaves your premises, see Security.
- For more advanced features, either work through one of the tutorials in the Tutorials section or go directly to the Advanced section.