Other Outputs

The Lightly Worker produces various files that can be used for debugging or for further processing of the selected samples. To access the generated files, you must mount a local volume into the Docker container when starting the Lightly Worker.

Replace MY_PATH_TO_OUTPUT_DIRECTORY with the absolute path to the local directory where you want the output to be written.

docker run --shm-size="1024m" --gpus all --rm -it \
	-e LIGHTLY_TOKEN="MY_LIGHTLY_TOKEN" \
	-e LIGHTLY_WORKER_ID="MY_WORKER_ID" \
	-v "MY_PATH_TO_OUTPUT_DIRECTORY":/output_dir \
	lightly/worker:latest
  • data/: This directory contains the data files produced during the run, such as the computed embeddings.

    • embeddings.csv contains the computed embeddings for all input samples used in selection (including datapool samples, excluding corrupt or duplicate samples).
    • selected_embeddings_including_datapool.csv contains the embeddings of all selected samples (including preselected datapool samples).
    • If selected_sequence_length > 1, there will also be a sequence_information.json file with information about the selected sequences (filenames, video frame timestamps, etc.). See Sequence Selection for more details.
  • filenames/: This directory contains lists of filenames: the corrupt images, the removed images, the selected images, and the images that were removed because they have an exact duplicate in the dataset.

  • plots/: A directory containing the plots which were produced for the report.

  • log.txt: The logs of the run. See Debugging for more information.

  • lightly_epoch_X.ckpt: Checkpoint with the trained model weights. See also Train a Self-Supervised Model.

  • report.pdf: The report described in Report.
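
For further processing, the embeddings in data/embeddings.csv can be read with the Python standard library. The following is a minimal sketch, not an official loader. It assumes a header row with a "filenames" column and one "embedding_<i>" column per dimension; this page does not specify that layout, so check the header of your own file first. The function name load_embeddings is hypothetical.

```python
import csv
from pathlib import Path


def load_embeddings(csv_path):
    """Return {filename: [float, ...]} read from an embeddings CSV.

    Assumed layout (not confirmed by this page): a header row with a
    "filenames" column and one "embedding_<i>" column per dimension.
    Any other columns are ignored.
    """
    embeddings = {}
    with Path(csv_path).open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        # Keep the embedding columns in numeric order (embedding_2 before embedding_10).
        dims = sorted(
            (name for name in fieldnames if name.startswith("embedding_")),
            key=lambda name: int(name.rsplit("_", 1)[1]),
        )
        for row in reader:
            embeddings[row["filenames"]] = [float(row[d]) for d in dims]
    return embeddings
```

With the volume mounted as in the command above, a call such as load_embeddings("MY_PATH_TO_OUTPUT_DIRECTORY/data/embeddings.csv") would return one vector per input sample, provided the file follows the assumed layout.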

Below you will find a typical output folder structure.

MY_PATH_TO_OUTPUT_DIRECTORY/
├── data
│   ├── al_score_embeddings.csv
│   ├── bounding_boxes.json
│   ├── bounding_boxes_examples
│   ├── embeddings.csv
│   ├── normalized_embeddings.csv
│   ├── sampled
│   ├── selected_embeddings.csv
│   └── sequence_information.json
├── filenames
│   ├── corrupt_filenames.txt
│   ├── duplicate_filenames.txt
│   ├── removed_filenames.txt
│   └── sampled_filenames_excluding_datapool.txt
├── lightly_epoch_X.ckpt
├── plots
│   ├── distance_distr_after.png
│   ├── distance_distr_before.png
│   ├── filter_decision_0.png
│   ├── filter_decision_11.png
│   ├── filter_decision_22.png
│   ├── filter_decision_33.png
│   ├── filter_decision_44.png
│   ├── filter_decision_55.png
│   ├── pretagging_histogram_after.png
│   ├── pretagging_histogram_before.png
│   ├── scatter_pca.png
│   ├── scatter_pca_no_overlay.png
│   ├── scatter_umap_k_15.png
│   ├── scatter_umap_k_15_no_overlay.png
│   ├── scatter_umap_k_5.png
│   ├── scatter_umap_k_50.png
│   ├── scatter_umap_k_50_no_overlay.png
│   └── scatter_umap_k_5_no_overlay.png
├── report.json
└── report.pdf
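
As a quick sanity check after a run, the lists in filenames/ can be summarized with a few lines of Python. This is a sketch under the assumption that each .txt file holds one filename per line, which this page does not state explicitly. The function name count_filenames is hypothetical.

```python
from pathlib import Path


def count_filenames(output_dir):
    """Return {list_name: number_of_entries} for every .txt file in filenames/.

    Assumes one filename per line; blank lines are ignored.
    """
    counts = {}
    for txt in sorted((Path(output_dir) / "filenames").glob("*.txt")):
        lines = txt.read_text().splitlines()
        counts[txt.stem] = sum(1 for line in lines if line.strip())
    return counts
```

For the folder structure shown above, the returned dictionary would have keys such as corrupt_filenames and duplicate_filenames, which makes it easy to see how many samples were dropped and why.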