Sequence Selection

Instead of selecting single video frames, LightlyOne can also select sequences of frames. This is controlled by the selected_sequence_length parameter, which is set in the worker_config. If its value is 1 (the default), the LightlyOne Worker selects single frames. If it is larger than 1, each video is split into sequences of that length and the frame representations are aggregated into one representation per sequence. The selection then operates on these sequence representations.

📘

Sequence selection only works with Videos as Input.

Sequence selection consists of the following steps:

  1. Each input video is split into sequences of length selected_sequence_length.
  2. Next, the embeddings of all frames in a sequence are aggregated (averaged).
  3. The selection is performed at the sequence level.
  4. The frames of the selected sequences are uploaded to the Lightly datasource.
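
The first two steps can be illustrated with a small, self-contained sketch. It is not the LightlyOne Worker implementation: the helper names split_into_sequences and mean_embedding are hypothetical, the embeddings are toy values, and dropping trailing frames that do not fill a complete sequence is an assumption made only for this example.

```python
from typing import List


def split_into_sequences(frames: List[List[float]], sequence_length: int) -> List[List[List[float]]]:
    """Split a list of per-frame embeddings into consecutive, non-overlapping sequences.

    Assumption for this sketch: trailing frames that do not fill a
    complete sequence are dropped.
    """
    last_start = len(frames) - sequence_length
    return [
        frames[start : start + sequence_length]
        for start in range(0, last_start + 1, sequence_length)
    ]


def mean_embedding(embeddings: List[List[float]]) -> List[float]:
    """Aggregate frame embeddings into one sequence embedding by averaging each dimension."""
    count = len(embeddings)
    dimension = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / count for d in range(dimension)]


# Toy example: 7 frames with 2-dimensional embeddings and sequences of 3 frames.
frame_embeddings = [[float(i), float(2 * i)] for i in range(7)]
sequences = split_into_sequences(frame_embeddings, sequence_length=3)
sequence_embeddings = [mean_embedding(sequence) for sequence in sequences]
print(sequence_embeddings)
```

In this sketch the selection strategy would then run on sequence_embeddings, so every frame in a sequence is either selected or discarded together.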

The following code snippet shows how to use sequence selection:

from lightly.api import ApiWorkflowClient

# Create the LightlyOne client to connect to the API.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")

scheduled_run_id = client.schedule_compute_worker_run(
    worker_config={
        "selected_sequence_length": 10, # Split videos into sequences of 10 frames each
    },
    selection_config={
        "n_samples": 50,
        "strategies": [
            {
                "input": {
                    "type": "EMBEDDINGS"
                },
                "strategy": {
                    "type": "DIVERSITY"
                }
            }
        ]
    },
)

📘

Number of Selected Sequences

n_samples specifies the total number of selected frames and must therefore be a multiple of selected_sequence_length. The number of selected sequences is calculated as n_samples / selected_sequence_length. For example, setting n_samples = 50 and selected_sequence_length = 10 results in 50 / 10 = 5 selected sequences, each containing 10 frames.

If proportion_samples is used instead of n_samples, the number of selected frames is num_frames_in_dataset * proportion_samples, rounded down to the nearest multiple of selected_sequence_length. For example, a dataset with 110 frames, proportion_samples = 0.5, and selected_sequence_length = 10 yields floor(110 * 0.5 / 10) = 5 selected sequences, each containing 10 frames.
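
The arithmetic above can be checked with a short sketch. The function num_selected_sequences is a hypothetical helper written for this example and is not part of the lightly package. It simply mirrors the two formulas stated here and may differ from the Worker in floating-point edge cases.

```python
import math
from typing import Optional


def num_selected_sequences(
    selected_sequence_length: int,
    n_samples: Optional[int] = None,
    proportion_samples: Optional[float] = None,
    num_frames_in_dataset: Optional[int] = None,
) -> int:
    """Return the number of selected sequences for a given configuration (illustrative only)."""
    if n_samples is not None:
        # n_samples counts frames, so it must be a multiple of the sequence length.
        if n_samples % selected_sequence_length != 0:
            raise ValueError(
                "n_samples must be a multiple of selected_sequence_length"
            )
        return n_samples // selected_sequence_length
    if proportion_samples is None or num_frames_in_dataset is None:
        raise ValueError(
            "Provide n_samples, or proportion_samples with num_frames_in_dataset"
        )
    # Round down to whole sequences.
    return math.floor(
        num_frames_in_dataset * proportion_samples / selected_sequence_length
    )


print(num_selected_sequences(10, n_samples=50))
print(num_selected_sequences(10, proportion_samples=0.5, num_frames_in_dataset=110))
```

Both calls correspond to the two examples in this note and give 5 sequences of 10 frames each.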

📘

Selection Strategies

Sequence selection is limited to the following selection strategies:

📘

Datapool

Sequence selection is compatible with the datapool feature. The only limitation is that selected_sequence_length must be set to the same value for all runs.

Crop Sequences from Videos

Sometimes it is useful to get the selected sequences as video clips rather than as extracted video frames. LightlyOne does not yet support extracting video clips directly. Instead, it creates a sequence_information.json file that stores the exact timestamps of the selected sequences from each video. Using this file, the selected sequences can be cropped from the original videos.

Obtaining sequence_information.json

The file can be downloaded manually on the LightlyOne Platform. Navigate to "My Worker Runs" and click on the "Detail" icon next to your run. The download link is in the "Artifacts" section.

Alternatively, the file can be downloaded programmatically using the download_compute_worker_run_sequence_information function:

from lightly.api import ApiWorkflowClient

# Create the LightlyOne client to connect to the API.
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN", dataset_id="MY_DATASET_ID")

# Schedule a run
scheduled_run_id = client.schedule_compute_worker_run(
    # Your run configuration
    ...
)

# Wait for completion
for run_info in client.compute_worker_run_info_generator(
    scheduled_run_id=scheduled_run_id
):
    # Monitor the run here
    pass

# Download sequence information
run = client.get_compute_worker_run_from_scheduled_run(
    scheduled_run_id=scheduled_run_id
)
client.download_compute_worker_run_sequence_information(
    run=run, output_path="sequence_information.json"
)

📘

Download sequence information from multiple runs

The following example shows a more complex case: downloading sequence information from multiple runs on a single dataset (e.g. when using a datapool to build up a dataset).

import json
from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client import DockerRunArtifactType

# Create the LightlyOne client to connect to the API
client = ApiWorkflowClient(token="MY_LIGHTLY_TOKEN")

# Get all runs sorted from old to new
runs = client.get_compute_worker_runs(dataset_id="MY_DATASET_ID")

sequence_infos = []
for run in runs:
    print(f"Processing {run.id}")

    # Skip if the run did not produce sequence information
    has_sequence_info = any(
        artifact.type == DockerRunArtifactType.SEQUENCE_INFORMATION
        for artifact in run.artifacts
    )
    if not has_sequence_info:
        continue

    # Download sequence_information.json, load the contents and add it to sequence_infos
    output_path = f"artifacts/sequence_information_{run.id}.json"
    client.download_compute_worker_run_sequence_information(
        run=run, output_path=output_path
    )
    with open(output_path) as f:
        sequence_info = json.load(f)
        sequence_infos.extend(sequence_info)

# All sequence information is now loaded in sequence_infos

Sequence Information

The sequence_information.json file contains a list of dictionaries with each dictionary containing the information of a selected sequence:

[
    {
        "video_name": "video1.mp4",
        "frame_names": [
            "video1-40-mp4.png",
            "video1-41-mp4.png",
            "video1-42-mp4.png",
            ...
        ],
        "frame_timestamps_pts": [
            359726680,
            368719847,
            377713014,
            ...
        ],
        "frame_timestamps_sec": [
            4.886145,
            5.008298625,
            5.13045225,
            ...
        ],
        "frame_indices": [
            40,
            41,
            42,
            ...
        ]
    },
    {
        "video_name": "video1.mp4",
        "frame_names": [
            "video1-100-mp4.png",
            "video1-101-mp4.png",
            "video1-102-mp4.png",
            ...
        ],
        "frame_timestamps_pts": [
            422678849,
            431672016,
            440665183,
            ...
        ],
        "frame_timestamps_sec": [
            6.095856060606061,
            6.217773181818182,
            6.339690303030303,
            ...
        ],
        "frame_indices": [
            100,
            101,
            102,
            ...
        ]
    },
    ...
]

Each sequence dictionary has the following fields:

  • video_name is the original video filename from which the sequence was created.
  • frame_names lists the filenames of the selected frames in the sequence.
  • frame_timestamps_pts lists the presentation timestamps of the selected frames in the sequence.
  • frame_timestamps_sec lists the timestamps in seconds since the beginning of the video for the selected frames in the sequence.
  • frame_indices lists the zero-based indices of the selected frames in the sequence, counted from the beginning of the video.
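
As a rough illustration of how these fields can be read, the following sketch summarizes each sequence by its video, frame range, and duration. The helper summarize_sequences is hypothetical, and the inline data is a shortened three-frame copy of the first example entry. In practice the list would come from json.load on the downloaded file.

```python
from typing import Any, Dict, List


def summarize_sequences(sequence_infos: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one summary per sequence from the fields of sequence_information.json."""
    summaries = []
    for sequence in sequence_infos:
        indices = sequence["frame_indices"]
        seconds = sequence["frame_timestamps_sec"]
        summaries.append(
            {
                "video_name": sequence["video_name"],
                "first_frame_index": indices[0],
                "last_frame_index": indices[-1],
                "num_frames": len(indices),
                # Time between the first and the last frame of the sequence.
                "duration_sec": seconds[-1] - seconds[0],
            }
        )
    return summaries


# Shortened sample entry (three frames only) for illustration.
sequence_infos = [
    {
        "video_name": "video1.mp4",
        "frame_names": ["video1-40-mp4.png", "video1-41-mp4.png", "video1-42-mp4.png"],
        "frame_timestamps_pts": [359726680, 368719847, 377713014],
        "frame_timestamps_sec": [4.886145, 5.008298625, 5.13045225],
        "frame_indices": [40, 41, 42],
    }
]

for summary in summarize_sequences(sequence_infos):
    print(summary)
```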

Crop the Sequences

Using the timestamps stored in the sequence_information.json file, the selected video sequences can be cropped from the original videos. Make sure that FFmpeg is available on your system for cropping the videos.

To crop a sequence, the first and last timestamp from the frame_timestamps_pts list and the video_name stored in the sequence_information.json file are required. The cropping can be done with the following command using an FFmpeg trim filter:

ffmpeg -i {VIDEO_NAME} -copyts -filter "trim=start_pts={FIRST_TIMESTAMP_PTS}:end_pts={LAST_TIMESTAMP_PTS + 1}" {SEQUENCE_NAME}

# example using an mp4 video
ffmpeg -i video.mp4 -copyts -filter "trim=start_pts=359726680:end_pts=377713015" sequence_1.mp4

🚧

Make sure that end_pts is set to LAST_TIMESTAMP_PTS + 1; otherwise, the last frame of the sequence will not be included in the cropped video.
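
When many sequences need to be cropped, the FFmpeg command can be assembled from each entry of sequence_information.json. The following is a minimal sketch, not an official LightlyOne utility: build_trim_command and crop_sequence are hypothetical helpers, the videos directory and output filename are placeholders, and the sample entry is shortened to three frames. The command uses the same FFmpeg arguments as the trim command in this section and adds 1 to the last presentation timestamp.

```python
import subprocess
from pathlib import Path
from typing import Any, Dict, List


def build_trim_command(sequence: Dict[str, Any], video_dir: str, output_path: str) -> List[str]:
    """Build the FFmpeg trim command for one sequence entry."""
    first_pts = sequence["frame_timestamps_pts"][0]
    last_pts = sequence["frame_timestamps_pts"][-1]
    video_path = Path(video_dir) / sequence["video_name"]
    return [
        "ffmpeg",
        "-i",
        str(video_path),
        "-copyts",
        "-filter",
        # end_pts is the last timestamp + 1 so that the last frame is kept.
        f"trim=start_pts={first_pts}:end_pts={last_pts + 1}",
        str(output_path),
    ]


def crop_sequence(sequence: Dict[str, Any], video_dir: str, output_path: str) -> None:
    """Run FFmpeg for one sequence. Requires FFmpeg to be installed."""
    subprocess.run(build_trim_command(sequence, video_dir, output_path), check=True)


# Shortened sample entry (three frames only) for illustration.
sequence = {
    "video_name": "video1.mp4",
    "frame_timestamps_pts": [359726680, 368719847, 377713014],
}
command = build_trim_command(sequence, video_dir="videos", output_path="sequence_1.mp4")
print(" ".join(command))
```

Passing the command as a list avoids shell quoting issues with the filter string. Call crop_sequence only once the printed command looks correct for your files.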

Sequences can also be cropped using the first and last timestamps from the frame_timestamps_sec list. However, depending on the video and sequence, the last frame of the sequence may not be included in the cropped video. We recommend using frame_timestamps_pts whenever possible. The following command crops a sequence using frame_timestamps_sec:

ffmpeg -i {VIDEO_NAME} -copyts -filter "trim=start={FIRST_TIMESTAMP_SEC}:end={LAST_TIMESTAMP_SEC}" {SEQUENCE_NAME}

# example using an mp4 video
ffmpeg -i video.mp4 -copyts -filter "trim=start=4.886145:end=5.985527625" sequence_1.mp4