The Lightly Worker has an integrated reporting component that provides plots, statistics, and other information collected during the various processing steps to facilitate sustainability and reproducibility in machine learning. For example, it reports on the corruptness check, the embedding process, and the selection process.

Lightly puts the essential information into an automatically generated PDF report to make it easier for you to understand and discuss the dataset. You can download it for all completed worker runs from the runs page in the Lightly Platform.

Download the report from the Lightly Platform.

For easy access to all values, the report is also available as a report_v2.json file. Both report.pdf and report_v2.json can be downloaded with the Lightly Python client:

from lightly.api import ApiWorkflowClient

# Create the Lightly client to connect to the API.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")

scheduled_run_id = client.schedule_compute_worker_run(...)

# Get the scheduled run given its id.
run = client.get_compute_worker_run_from_scheduled_run(scheduled_run_id=scheduled_run_id)
# Download the report as pdf and json files.
client.download_compute_worker_run_report_pdf(run=run, output_path="my_run/artifacts/report.pdf")
client.download_compute_worker_run_report_v2_json(run=run, output_path="my_run/artifacts/report_v2.json")

# Alternatively, get all runs for a given dataset_id.
runs = client.get_compute_worker_runs(dataset_id=client.dataset_id)
run = runs[-1]  # Get the latest run.

# Download the artifacts as before.
client.download_compute_worker_run_report_pdf(run=run, output_path="my_run/artifacts/report.pdf")
client.download_compute_worker_run_report_v2_json(run=run, output_path="my_run/artifacts/report_v2.json")

The report contains information about the number of available, corrupt, duplicate, and selected images. It also includes information on selected and discarded samples and different diagrams showing the difference between them with respect to specific properties.
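Because report_v2.json is plain JSON, you can inspect it with the Python standard library alone. The following is a minimal sketch, assuming the file has already been downloaded to the path used above. It makes no assumption about the report's schema and only lists the top-level keys with the type of each value. The helper name `summarize_report` is hypothetical and not part of the Lightly client.

```python
import json
from pathlib import Path


def summarize_report(path):
    """Return {top-level key: type name of its value} for a JSON report file.

    Hypothetical helper for illustration; it does not rely on any
    particular key being present in the report.
    """
    with Path(path).open(encoding="utf-8") as f:
        report = json.load(f)
    if not isinstance(report, dict):
        raise ValueError(
            f"expected a JSON object at the top level, got {type(report).__name__}"
        )
    return {key: type(value).__name__ for key, value in report.items()}


if __name__ == "__main__":
    # Path assumed from the download example above.
    report_path = Path("my_run/artifacts/report_v2.json")
    if report_path.exists():
        for key, type_name in summarize_report(report_path).items():
            print(f"{key}: {type_name}")
```

Printing the structure first is a cheap way to find the sections you care about before writing code that depends on specific keys.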

🚧 Deprecation of report.json (version 1)

The old report.json file will soon be deprecated in favor of report_v2.json. report_v2.json contains all the information from report.json, adds many metrics, and uses a new structure.
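If you are migrating code that reads the old report.json, it can help to list every value's location in the new file. The following is a rough sketch that walks any nested JSON document and yields a dotted path for each leaf value. It assumes nothing about the actual keys in report_v2.json, and the names `iter_key_paths` and `load_key_paths` are hypothetical helpers, not Lightly APIs.

```python
import json


def iter_key_paths(node, prefix=""):
    """Yield a dotted path for every leaf value in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from iter_key_paths(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_key_paths(value, f"{prefix}[{index}]")
    else:
        # Leaf value (number, string, bool, or null): report where it lives.
        yield prefix


def load_key_paths(path):
    """Load a JSON file and return the sorted dotted paths of its leaf values."""
    with open(path, encoding="utf-8") as f:
        return sorted(iter_key_paths(json.load(f)))
```

Calling `load_key_paths("my_run/artifacts/report_v2.json")` on a downloaded report gives a flat list you can search for the metric names your existing code expects.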

Examples

You can find example report.pdf files from Lightly Worker runs on a video dataset here: