In this guide, we will use the "diversity" strategy. It selects frames that are visually different from those observed before, which is useful for collecting novel scenarios on your edge device.

We showcase it on the same images we used in 03 Similarity Search. This time, we expect only one image to be selected from each location.

Project Setup

Code for this tutorial is provided in the examples/04_diversity_selection directory. Before starting this tutorial, copy the model file to examples/lightly_model.tar and verify that your project layout is as follows:

lightly_edge_sdk_cpp
├── ...
└── examples
    ├── ...
    ├── 04_diversity_selection
    │   ├── CMakeLists.txt
    │   ├── images
    │   │   ├── london1.jpg
    │   │   ├── london2.jpg
    │   │   ├── matterhorn1.jpg
    │   │   └── matterhorn2.jpg
    │   ├── main.cpp
    │   └── stb_image.h
    └── lightly_model.tar

Build and Run a Complete Example

The content of the main.cpp file is shown below. We will first run the example and explain it right after.

// main.cpp
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"
#include "lightly_edge_sdk.h"
using namespace lightly_edge_sdk;
 
// Loads an image using stb_image and returns it as a lightly_edge_sdk::Frame struct.
Frame load_image(std::string image_path) {
  std::cout << "Loading image: " << image_path << std::endl;
  int width, height, channels;
  unsigned char *data = stbi_load(image_path.c_str(), &width, &height, &channels, 0);
  if (data == nullptr) {
    throw std::runtime_error("Failed to load image.");
  }
  // Create a Frame struct.
  return Frame(width, height, data);
}
 
int main() {
  // Initialize the LightlyEdge SDK.
  std::cout << "Initializing LightlyEdge..." << std::endl << std::endl;
  lightly_edge_sdk::LightlyEdgeConfig config = lightly_edge_sdk::default_config();
  LightlyEdge lightly_edge =
      LightlyEdge::new_from_tar("../lightly_model.tar", config);
 
  // Register a diversity strategy.
  float threshold = 0.1;
  lightly_edge.register_diversity_strategy(threshold);
 
  // Iterate over the images.
  std::vector<std::string> image_paths = {
    "images/matterhorn1.jpg",
    "images/matterhorn2.jpg",
    "images/london1.jpg",
    "images/london2.jpg",
  };
  for (const auto &path : image_paths) {
    // Load the image.
    Frame frame = load_image(path);
 
    // Embed the image and check if it should be selected.
    std::vector<float> image_embedding = lightly_edge.embed_frame(frame);
    SelectInfo select_info = lightly_edge.should_select(image_embedding, {});
 
    // Add to the embedding database if the image is selected.
    DiversitySelectInfo diversity_select_info = select_info.diversity[0];
    if (diversity_select_info.should_select) {
      lightly_edge.insert_into_embedding_database(image_embedding);
    }
 
    // Print the information about the image.
    std::cout << "  should_select: " << diversity_select_info.should_select << std::endl;
    std::cout << "  min_distance: " << diversity_select_info.min_distance << std::endl << std::endl;
 
    // The image is no longer needed, free the memory.
    stbi_image_free(frame.rgbImageData_);
  }
 
  std::cout << "Program successfully finished." << std::endl;
  return 0;
}

Build and run:

# Enter the project folder.
cd 04_diversity_selection
 
# Configure CMake. This will create a `build` subfolder.
cmake -B build
 
# Build using configuration from the `build` subfolder.
cmake --build build
 
# Run (Linux variant)
./build/main
# Or run (Windows variant)
.\build\[build_type]\main.exe
# where [build_type] is either Release or Debug.

The output should be similar to the following. The distances might differ slightly depending on your machine architecture:

Initializing LightlyEdge...
 
Loading image: images/matterhorn1.jpg
  should_select: 1
  min_distance: 1
 
Loading image: images/matterhorn2.jpg
  should_select: 0
  min_distance: 0.0369358
 
Loading image: images/london1.jpg
  should_select: 1
  min_distance: 0.24345
 
Loading image: images/london2.jpg
  should_select: 0
  min_distance: 0.084776
 
Program successfully finished.

We see that only the first Matterhorn image and the first London image are selected, which is what we expected. We explain the rest of the output below.

Note: In this example we use the model version lightly_model_14.tar. You might need to adjust the thresholds in this tutorial if your model version differs.

Diversity Selection

There are several steps needed to set up diversity selection. Note that in a real application, exceptions should be handled in a try-catch block. For clarity, we annotate the return types.
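As a minimal sketch of the try-catch handling mentioned above: the helper `run_guarded` below is hypothetical and not part of the SDK. It assumes that errors surface as exceptions derived from `std::exception` (as the `std::runtime_error` thrown by `load_image` does). Which exception types the SDK itself throws is not covered in this guide, so a catch-all is kept as a fallback.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical wrapper (not an SDK function): runs `body` and converts any
// exception into a non-zero exit code instead of terminating the program.
int run_guarded(const std::function<void()> &body) {
  assert(static_cast<bool>(body));  // The callable must not be empty.
  try {
    body();
    return 0;
  } catch (const std::exception &e) {
    // Covers std::runtime_error thrown by load_image.
    std::cerr << "Error: " << e.what() << std::endl;
    return 1;
  } catch (...) {
    // Fallback for exception types not derived from std::exception.
    std::cerr << "Unknown error." << std::endl;
    return 2;
  }
}
```

In the tutorial program, the body of main could be moved into a lambda and the result of `run_guarded` returned as the process exit code.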

Image loading is identical to the previous 03 Similarity Search guide. What differs is how LightlyEdge is set up and how each image is processed. Let's start with the initialization code at the beginning of the main function:

// Initialize the LightlyEdge SDK.
std::cout << "Initializing LightlyEdge..." << std::endl << std::endl;
lightly_edge_sdk::LightlyEdgeConfig config = lightly_edge_sdk::default_config();
LightlyEdge lightly_edge =
    LightlyEdge::new_from_tar("../lightly_model.tar", config);
 
// Register a diversity strategy.
float threshold = 0.1;
lightly_edge.register_diversity_strategy(threshold);

LightlyEdge is first initialized from a TAR archive. Recall that we can register any number of selection strategies with LightlyEdge. We register a single diversity strategy with lightly_edge_sdk::LightlyEdge::register_diversity_strategy.

The function accepts a single argument, min_distance, which is the threshold variable in the code above. An observed image is selected only if it is farther than min_distance from every embedding stored in the embedding database. The distances are based on cosine similarity and range from 0 (closest) to 1 (furthest).
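To make the selection rule concrete, here is a self-contained sketch of it. This is not the SDK's implementation: it assumes the distance is computed as 1 minus the cosine similarity, and the names `cosine_distance`, `min_distance_to` and `is_diverse` are hypothetical. How the SDK maps distances into the range 0 to 1 is not specified in this guide, so the sketch simply caps the result at 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Embedding = std::vector<float>;

// Cosine distance, assumed here to be 1 - cosine similarity.
// Both vectors must be non-zero and of equal length.
float cosine_distance(const Embedding &a, const Embedding &b) {
  assert(a.size() == b.size());
  float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  assert(norm_a > 0.0f && norm_b > 0.0f);
  return 1.0f - dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Distance from `candidate` to its nearest neighbor in `database`.
// Starts at 1, the largest distance in the guide's range, so an empty
// database yields 1.
float min_distance_to(const std::vector<Embedding> &database,
                      const Embedding &candidate) {
  float best = 1.0f;
  for (const Embedding &stored : database) {
    best = std::min(best, cosine_distance(candidate, stored));
  }
  return best;
}

// The diversity rule: select only if strictly farther than `min_distance`
// from every stored embedding.
bool is_diverse(const std::vector<Embedding> &database,
                const Embedding &candidate, float min_distance) {
  return min_distance_to(database, candidate) > min_distance;
}
```

With an empty database every candidate counts as diverse. After {1, 0} is stored, a nearly parallel vector such as {0.99, 0.05} is rejected at a threshold of 0.1, while the orthogonal vector {0, 1} is accepted.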

Next, images are processed in the loop in the main function:

for (const auto &path : image_paths) {
  // Load the image.
  Frame frame = load_image(path);
 
  // Embed the image and check if it should be selected.
  std::vector<float> image_embedding = lightly_edge.embed_frame(frame);
  SelectInfo select_info = lightly_edge.should_select(image_embedding, {});
 
  // Add to the embedding database if the image is selected.
  DiversitySelectInfo diversity_select_info = select_info.diversity[0];
  if (diversity_select_info.should_select) {
    lightly_edge.insert_into_embedding_database(image_embedding);
  }
 
  // Print the information about the image.
  std::cout << "  should_select: " << diversity_select_info.should_select << std::endl;
  std::cout << "  min_distance: " << diversity_select_info.min_distance << std::endl << std::endl;
 
  // The image is no longer needed, free the memory.
  stbi_image_free(frame.rgbImageData_);
}

The code first embeds the image and calls lightly_edge_sdk::LightlyEdge::should_select, which returns lightly_edge_sdk::SelectInfo. This structure holds, for each registered strategy, the decision on whether the frame should be selected.

We load the decision for the single strategy we registered into diversity_select_info. By default, LightlyEdge does not remember the images it observes. We have to insert the embeddings into the embedding database manually by calling lightly_edge_sdk::LightlyEdge::insert_into_embedding_database.

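Finally, we print whether the image was selected and its distance to the closest embedding in the database. For the first image the database is still empty, and the output shows min_distance: 1, the largest possible distance, so the image is selected.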
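The should_select values in the example output follow from comparing each printed min_distance with the threshold of 0.1. The sketch below replays that comparison on the four printed values. The function `decide` and the constant `kExampleDistances` are hypothetical names introduced for illustration and are not part of the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies the diversity rule to already computed distances: a frame is
// selected only if its min_distance is strictly greater than the threshold.
std::vector<bool> decide(const std::vector<float> &min_distances,
                         float threshold) {
  assert(threshold >= 0.0f);
  std::vector<bool> selected;
  selected.reserve(min_distances.size());
  for (float d : min_distances) {
    selected.push_back(d > threshold);
  }
  return selected;
}

// min_distance values copied from the example output above, in order:
// matterhorn1, matterhorn2, london1, london2.
const std::vector<float> kExampleDistances = {1.0f, 0.0369358f, 0.24345f,
                                              0.084776f};
```

Calling `decide(kExampleDistances, 0.1f)` gives select, skip, select, skip, matching the should_select column. Replaying recorded distances like this is only reliable for the threshold they were recorded with, because a different threshold can change which embeddings end up in the database and therefore the later distances.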

How To Choose Diversity Parameters

Consecutive images in a video stream are often very similar, so the min_distance parameter should be set to a small value, in the range 0.005 to 0.1. Adjust the value based on the desired level of diversity in the selected images. A higher value selects more diverse images, leading to a lower selection rate.

With the lightly_model_14.tar model, a value in the range 0.008 to 0.015 leads to a selection rate of about 0.5 fps. This figure is approximate. Lower values should be used if there is little variation in the input video, e.g. a car moving slowly during rush hour or driving in a rural area at night.
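One way to tune the value is to record embeddings from a representative video and replay them offline with several candidate thresholds. The sketch below estimates the resulting selection rate. It is an illustration under stated assumptions, not SDK code: it assumes the distance is 1 minus the cosine similarity, and `cosine_distance`, `count_selected` and `selection_rate_fps` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Embedding = std::vector<float>;

// Cosine distance, assumed here to be 1 - cosine similarity.
float cosine_distance(const Embedding &a, const Embedding &b) {
  assert(a.size() == b.size());
  float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  assert(norm_a > 0.0f && norm_b > 0.0f);
  return 1.0f - dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Replays the stream: an embedding is selected, and stored, only if it is
// strictly farther than `min_distance` from every embedding stored so far.
std::size_t count_selected(const std::vector<Embedding> &stream,
                           float min_distance) {
  std::vector<Embedding> database;
  for (const Embedding &e : stream) {
    bool diverse = true;
    for (const Embedding &stored : database) {
      if (cosine_distance(e, stored) <= min_distance) {
        diverse = false;
        break;
      }
    }
    if (diverse) {
      database.push_back(e);
    }
  }
  return database.size();
}

// Selected frames per second of input video recorded at `input_fps`.
double selection_rate_fps(const std::vector<Embedding> &stream,
                          float min_distance, double input_fps) {
  assert(!stream.empty() && input_fps > 0.0);
  const double duration_s = static_cast<double>(stream.size()) / input_fps;
  return static_cast<double>(count_selected(stream, min_distance)) / duration_s;
}
```

Running `selection_rate_fps` for a handful of thresholds on the same recording shows how quickly the rate drops as min_distance grows, which helps pick a value that matches the storage or bandwidth budget of the device.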

Next Steps

Next, we introduce Adaptive Diversity, which allows you to select diverse images without specifying thresholds.
