02 Similarity Search#
We will build our first real application. We will pass to LightlyEdge the four images shown below. Using the similarity search feature, we will identify images that contain mountains.
[Four example images: two photos of the Matterhorn (matterhorn1.jpg, matterhorn2.jpg) and two of London (london1.jpg, london2.jpg).]
Project Setup#
1. Verify you have Python version 3.8 or higher installed:
python --version
2. Create a new folder getting_started:
mkdir getting_started
3. Copy the embedding model to getting_started/lightly_model.tar:
cp lightly_model.tar getting_started
4. Inside the getting_started folder, create a new folder 02_similarity_search with an images subfolder:
mkdir -p getting_started/02_similarity_search/images
5. Right-click and download the four images above into 02_similarity_search/images.
After completing this guide, the folder structure will be as follows:
getting_started
├── 02_similarity_search
│   ├── images/
│   └── main.py
└── lightly_model.tar
Note
In this example we use the model version lightly_model_14.tar. You might need to adjust the thresholds in this tutorial if your model version differs.
Run a Complete Example#
Create 02_similarity_search/main.py and copy the contents below into it. We will run the example first and explain it right after.
from lightly_edge_sdk import LightlyEdge
from PIL import Image

# Initialize the LightlyEdge SDK.
print("Initializing LightlyEdge...\n")
lightly_edge = LightlyEdge(path="../lightly_model.tar")

# Register a similarity strategy for the text "mountains".
threshold = 0.5
text_embeddings = lightly_edge.embed_texts(texts=["mountains"])
lightly_edge.register_similarity_strategy(
    query_embedding=text_embeddings[0], max_distance=threshold
)

# Iterate over the images.
image_paths = [
    "images/matterhorn1.jpg",
    "images/matterhorn2.jpg",
    "images/london1.jpg",
    "images/london2.jpg",
]
for image_path in image_paths:
    # Embed the image and check if it should be selected.
    with Image.open(image_path) as frame:
        print(f"Loading image: {image_path}")
        image_embedding = lightly_edge.embed_frame(frame=frame)
        select_info = lightly_edge.should_select(embedding=image_embedding)
        similarity_select_info = select_info.similarity[0]

        # Print whether the image is selected.
        print(f"Should select: {similarity_select_info.should_select}")
        print(f"Distance: {similarity_select_info.distance:.2f}\n")

print("Program successfully finished.")
Run it:
# Enter the project folder.
cd getting_started/02_similarity_search
# Run the Python script
python main.py
The output should be similar to the following; the exact distances might differ slightly depending on your machine architecture:
Initializing LightlyEdge...
Loading image: images/matterhorn1.jpg
Should select: True
Distance: 0.46
Loading image: images/matterhorn2.jpg
Should select: True
Distance: 0.46
Loading image: images/london1.jpg
Should select: False
Distance: 0.53
Loading image: images/london2.jpg
Should select: False
Distance: 0.53
Program successfully finished.
That’s what we wanted to see! The model separates the images of the Matterhorn from those of London and selects the first two.
Feel free to experiment with different text queries on the line with lightly_edge.embed_texts; try e.g. “city” or “bus”.
Similarity Search#
Let’s explain what is going on in the code. There are several steps needed to set up similarity search.
This is the initialization code:
# Initialize the LightlyEdge SDK.
print("Initializing LightlyEdge...\n")
lightly_edge = LightlyEdge(path="../lightly_model.tar")

# Register a similarity strategy for the text "mountains".
threshold = 0.5
text_embeddings = lightly_edge.embed_texts(texts=["mountains"])
lightly_edge.register_similarity_strategy(
    query_embedding=text_embeddings[0], max_distance=threshold
)
LightlyEdge is first initialized from a TAR archive. Then we get an embedding for our text query “mountains” by calling lightly_edge_sdk.LightlyEdge.embed_texts. The function accepts a list of strings and returns a list of embeddings. Since we passed a single query, we take the only returned embedding at index 0.
Multiple selection strategies can be registered on lightly_edge_sdk.LightlyEdge. Each strategy independently decides whether a frame should be selected.
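For example, a single frame can be checked against several queries at once by registering one strategy per query. The sketch below builds on the setup above; it assumes that the entries of select_info.similarity follow the registration order of the strategies (consistent with the index-0 access used in this tutorial), and the query texts are chosen purely for illustration:
queries = ["mountains", "city"]
text_embeddings = lightly_edge.embed_texts(texts=queries)
for text_embedding in text_embeddings:
    lightly_edge.register_similarity_strategy(
        query_embedding=text_embedding, max_distance=0.5
    )

with Image.open("images/matterhorn1.jpg") as frame:
    image_embedding = lightly_edge.embed_frame(frame=frame)
select_info = lightly_edge.should_select(embedding=image_embedding)

# Assumption: similarity entries follow the strategy registration order.
for query, info in zip(queries, select_info.similarity):
    print(f"{query}: selected={info.should_select}, distance={info.distance:.2f}")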
In our example we call lightly_edge_sdk.LightlyEdge.register_similarity_strategy to register a single similarity strategy. It has two arguments: query_embedding and max_distance. A frame is selected if it is closer than max_distance to the query in the embedding space. The distances are based on cosine similarity and range from 0 (closest) to 1 (furthest).
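For intuition, a distance in this range can be derived from cosine similarity as in the NumPy sketch below. This is purely illustrative: the mapping (1 - similarity) / 2 is an assumption, and the SDK's internal formula may differ.
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity lies in [-1, 1] for arbitrary vectors.
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Map the similarity to a distance in [0, 1]. This mapping is an
    # assumption for illustration, not necessarily the SDK's formula.
    return (1.0 - similarity) / 2.0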
Next, images are processed in a for loop. We load the image with the Python Imaging Library (Pillow).
for image_path in image_paths:
    # Embed the image and check if it should be selected.
    with Image.open(image_path) as frame:
        print(f"Loading image: {image_path}")
        image_embedding = lightly_edge.embed_frame(frame=frame)
        select_info = lightly_edge.should_select(embedding=image_embedding)
        similarity_select_info = select_info.similarity[0]

        # Print whether the image is selected.
        print(f"Should select: {similarity_select_info.should_select}")
        print(f"Distance: {similarity_select_info.distance:.2f}\n")
The code embeds the image and calls lightly_edge_sdk.LightlyEdge.should_select, which returns a lightly_edge_sdk.SelectInfo. The object contains, for each registered strategy, the decision whether the frame should be selected. We print the decision and the distance to the query.
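In a real application you would typically act on the decision rather than only print it. A minimal sketch, reusing the variables from the example above, that collects the paths of all selected images:
selected_paths = []
for image_path in image_paths:
    with Image.open(image_path) as frame:
        image_embedding = lightly_edge.embed_frame(frame=frame)
    select_info = lightly_edge.should_select(embedding=image_embedding)
    # Index 0 refers to the single similarity strategy registered above.
    if select_info.similarity[0].should_select:
        selected_paths.append(image_path)

print(f"Selected {len(selected_paths)} images: {selected_paths}")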
How To Choose Similarity Parameters#
For best results, the query texts and max distance threshold should be carefully selected based on your data.
The max_distance parameter controls the tradeoff between precision and recall. It ranges from 0 to 1. A higher value selects more images, but might include images that are not relevant. A lower value selects fewer images which match the query most closely, but might miss some relevant images.
A suitable value for max_distance can be found by collecting a small set of positive and negative examples, logging the distances to the query, and choosing a value that separates the two sets well.
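The sketch below illustrates this procedure. The labeled file names are hypothetical, and registering the strategy with max_distance=1.0 is an illustrative trick so that a distance is reported for every image; it assumes a freshly initialized LightlyEdge with no other strategies registered.
# Log distances for hand-labeled examples to pick a threshold.
lightly_edge.register_similarity_strategy(
    query_embedding=text_embeddings[0], max_distance=1.0
)

# Hypothetical labeled examples: True means the image matches the query.
examples = {
    "images/positive_example.jpg": True,
    "images/negative_example.jpg": False,
}
for path, is_positive in examples.items():
    with Image.open(path) as frame:
        embedding = lightly_edge.embed_frame(frame=frame)
    distance = lightly_edge.should_select(embedding=embedding).similarity[0].distance
    label = "positive" if is_positive else "negative"
    print(f"{label}: {path} distance={distance:.3f}")

# Pick max_distance between the largest positive distance and the
# smallest negative distance.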
A good starting point is to set max_distance to 0.475 and adjust it in 0.005 increments. Suitable threshold values usually lie in the range 0.45 to 0.55.
The text queries should be chosen based on the use case. Queries can be specified in natural language, and the exact formulation of the query influences the results. We recommend trying different queries and thresholds to find the best match for your use case. The model tends to be more accurate for queries that are unambiguous and refer to simple rather than complex concepts.
Image Search#
The example above showcases searching images with a text query. Using the same interface, it is possible to search for images similar to a known image. The only difference is registering a similarity strategy with an image embedding instead of a text embedding:
with Image.open(query_image_path) as frame:
    query_embedding = lightly_edge.embed_frame(frame=frame)

lightly_edge.register_similarity_strategy(
    query_embedding=query_embedding, max_distance=threshold
)
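The rest of the pipeline is unchanged: embed each candidate image with embed_frame, call should_select, and read the decision from select_info.similarity as before.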
Next Steps#
Next we will set up LightlyEdge to perform Diversity Selection.