Other Outputs

The Lightly Worker produces various files which can be used for debugging or further processing the selected samples. To access the generated files, you must mount a local volume to the docker container when starting the Lightly Worker.

When you replace {OUTPUT_DIR} with the path to your output directory, remember to remove the curly brackets { } as well.

docker run --shm-size="1024m" --gpus all --rm -it \
    -e LIGHTLY_TOKEN={MY_LIGHTLY_TOKEN} \
    -v {OUTPUT_DIR}:/output_dir \
    lightly/worker:latest \
    worker.worker_id={MY_WORKER_ID}

🚧

Host system first!

Docker volume and port mappings always list the host side first, followed by the container side. For example, -v /my_system/outputs:/outputs mounts /my_system/outputs from your system to /outputs in the docker container. See the official docker docs for more information.
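As a minimal sketch of the host-first rule, the snippet below prepares a host directory and builds the matching volume mapping. The path lightly_outputs and the variable name VOLUME_MAPPING are assumptions chosen for illustration; only /output_dir comes from the command above.

```shell
# Hypothetical host location for the outputs; replace with your own path.
OUTPUT_DIR="$PWD/lightly_outputs"

# Create the directory up front so that it exists before the container starts.
mkdir -p "$OUTPUT_DIR"

# Host path first, container path second.
VOLUME_MAPPING="$OUTPUT_DIR:/output_dir"

# Print the flag as it would appear in the docker run command.
echo "-v $VOLUME_MAPPING"
```

Using an absolute host path, as $PWD provides here, avoids any ambiguity about which directory is mounted.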

The output directory is structured in the following way:

  • config/: A directory containing copies of the configuration files and overwrites.

  • data/: A directory containing the data files produced during the run.

    • embeddings.csv contains the computed embeddings for all input samples used in selection (including datapool samples, excluding corrupt or duplicate samples).
    • selected_embeddings_including_datapool.csv contains the embeddings of all selected samples (including preselected datapool samples).
    • If selected_sequence_length > 1, there will be a sequence_information.json file with information about the selected sequences (filenames, video frame timestamps, etc.). Head to Sequence Selection for more details on sequence selection.
  • filenames/: A directory containing lists of filenames: the corrupt images, the removed images, the selected images, and the images removed because they have an exact duplicate in the dataset.

  • plots/: A directory containing the plots which were produced for the report.

  • log.txt: The log file of the run. See Debugging for more information.

  • lightly_epoch_X.ckpt: Checkpoint with the trained model weights. See also Train a Self-Supervised Model.

  • report.pdf: The report described in Report.
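After a run, it can be handy to verify which of the outputs listed above actually ended up in the mounted directory. The following is a small sketch, not part of the Lightly Worker itself; the function name check_outputs is hypothetical. The checkpoint is left out because its filename depends on the epoch number.

```shell
# Report which of the documented top-level outputs exist in an output directory.
check_outputs() {
    out_dir="$1"
    for entry in config data filenames plots log.txt report.pdf; do
        if [ -e "$out_dir/$entry" ]; then
            echo "found:   $entry"
        else
            echo "missing: $entry"
        fi
    done
}

# Example call (replace the path with your own output directory):
# check_outputs /my_system/outputs
```

Not every run necessarily produces every file, so a "missing" entry is a hint on where to look rather than an error by itself.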

Below you will find a typical output folder structure.

{OUTPUT_DIR}/
├── config
│   ├── config.yaml
│   ├── hydra.yaml
│   └── overrides.yaml
├── data
│   ├── al_score_embeddings.csv
│   ├── bounding_boxes.json
│   ├── bounding_boxes_examples
│   ├── embeddings.csv
│   ├── normalized_embeddings.csv
│   ├── sampled
│   ├── selected_embeddings.csv
│   └── sequence_information.json
├── filenames
│   ├── corrupt_filenames.txt
│   ├── duplicate_filenames.txt
│   ├── removed_filenames.txt
│   └── sampled_filenames_excluding_datapool.txt
├── lightly_epoch_X.ckpt
├── plots
│   ├── distance_distr_after.png
│   ├── distance_distr_before.png
│   ├── filter_decision_0.png
│   ├── filter_decision_11.png
│   ├── filter_decision_22.png
│   ├── filter_decision_33.png
│   ├── filter_decision_44.png
│   ├── filter_decision_55.png
│   ├── pretagging_histogram_after.png
│   ├── pretagging_histogram_before.png
│   ├── scatter_pca.png
│   ├── scatter_pca_no_overlay.png
│   ├── scatter_umap_k_15.png
│   ├── scatter_umap_k_15_no_overlay.png
│   ├── scatter_umap_k_5.png
│   ├── scatter_umap_k_50.png
│   ├── scatter_umap_k_50_no_overlay.png
│   └── scatter_umap_k_5_no_overlay.png
├── report.json
└── report.pdf
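To get a quick overview of how many images were selected, removed, or flagged as corrupt, you can count the entries in the lists under filenames/. This sketch assumes that each .txt file holds one filename per line, which is not stated above; the function name count_filenames is hypothetical.

```shell
# Print the number of non-empty lines in every .txt list of a directory.
# Assumption: each list contains one filename per line.
count_filenames() {
    filenames_dir="$1"
    for list in "$filenames_dir"/*.txt; do
        # Skip the unexpanded pattern if the directory has no .txt files.
        [ -f "$list" ] || continue
        # grep -c exits non-zero when it counts 0 lines, hence the "|| true".
        n="$(grep -c . "$list" || true)"
        echo "$(basename "$list"): $n"
    done
}

# Example call (replace the path with your own output directory):
# count_filenames /my_system/outputs/filenames
```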