Object Detection with Ultralytics’ YOLO¶
This tutorial demonstrates how to pre-train a YOLO model using lightly-train
and then fine-tune it for object detection using the ultralytics
framework. We will perform both steps on the PASCAL VOC dataset.
Warning
Using Ultralytics models might require a commercial Ultralytics license. See the Ultralytics website for more information.
Install Dependencies¶
Install the required packages:
lightly-train for pretraining, with support for ultralytics’ YOLO models
supervision to visualize some of the annotated pictures
pip install "lightly-train[ultralytics]" "supervision==0.25.1"
Download the Dataset¶
We can download the dataset directly using Ultralytics’ API with the check_det_dataset
function:
from ultralytics.data.utils import check_det_dataset
dataset = check_det_dataset("VOC.yaml")
Ultralytics always downloads your datasets to a fixed location, which you can fetch via their settings
module:
from ultralytics import settings
print(settings["datasets_dir"])
Inside that directory (referred to below as <DATASET-DIR>), the VOC dataset has the following structure:
tree -d <DATASET-DIR>/VOC -I VOCdevkit
> datasets/VOC
> ├── images
> │   ├── test2007
> │   ├── train2007
> │   ├── train2012
> │   ├── val2007
> │   └── val2012
> └── labels
>     ├── test2007
>     ├── train2007
>     ├── train2012
>     ├── val2007
>     └── val2012
Note
Labels are not required for self-supervised pre-training. We will use the labels only for finetuning.
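To confirm that the download matches the layout above, a small standard-library sketch can count the images in each split and check how many have a matching label file. The function name summarize_splits, the set of image suffixes, and the datasets/VOC path are illustrative assumptions, not part of lightly-train or Ultralytics. The pairing rule it relies on (labels/<split>/<stem>.txt next to images/<split>/<stem>.jpg) is the usual YOLO dataset convention.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your copy uses other formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_splits(voc_root):
    """Map each split name to (number of images, number of images with a label file)."""
    voc_root = Path(voc_root)
    summary = {}
    for split_dir in sorted((voc_root / "images").iterdir()):
        if not split_dir.is_dir():
            continue
        images = [p for p in split_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
        labels_dir = voc_root / "labels" / split_dir.name
        # A label file shares the image's stem and has a .txt extension.
        labelled = sum((labels_dir / f"{p.stem}.txt").is_file() for p in images)
        summary[split_dir.name] = (len(images), labelled)
    return summary


if __name__ == "__main__":
    # Hypothetical location; use the directory printed from settings["datasets_dir"].
    root = Path("datasets/VOC")
    if (root / "images").is_dir():
        for split, (n_images, n_labelled) in summarize_splits(root).items():
            print(f"{split}: {n_images} images, {n_labelled} with labels")
```

Only the image count matters for pre-training. An image without a label file is not necessarily an error, since YOLO-style datasets may leave images without objects unlabelled.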
Inspect a few Images¶
Let’s use supervision
and look at a few of the annotated samples to get a feel for what the data looks like:
import random
import matplotlib.pyplot as plt
import supervision as sv
import yaml
from ultralytics import settings
from ultralytics.data.utils import check_det_dataset
dataset = check_det_dataset("VOC.yaml")
detections = sv.DetectionDataset.from_yolo(
    data_yaml_path=dataset["yaml_file"],
    # Single quotes inside the f-string keep this valid on Python versions before 3.12.
    images_directory_path=f"{settings['datasets_dir']}/VOC/images/train2012",
    annotations_directory_path=f"{settings['datasets_dir']}/VOC/labels/train2012",
)
with open(dataset["yaml_file"], "r") as f:
    data = yaml.safe_load(f)
names = data["names"]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
fig, ax = plt.subplots(2, 2, figsize=(10, 10))
ax = ax.flatten()
# randrange excludes the upper bound, so the index is always valid.
detections = [detections[random.randrange(len(detections))] for _ in range(4)]
for i, (path, image, annotation) in enumerate(detections):
    annotated_image = box_annotator.annotate(scene=image, detections=annotation)
    annotated_image = label_annotator.annotate(
        scene=annotated_image,
        detections=annotation,
        labels=[names[elem] for elem in annotation.class_id],
    )
    # Reverse the channel order (BGR to RGB) for matplotlib.
    ax[i].imshow(annotated_image[..., ::-1])
    ax[i].axis("off")
fig.tight_layout()
fig.show()
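Beyond looking at a handful of images, it can help to see how often each class occurs. The sketch below reads the YOLO label files directly, where each line has the form "class_id x_center y_center width height" with normalized coordinates, and tallies the class ids. The function name count_classes and the example path are illustrative assumptions.

```python
from collections import Counter
from pathlib import Path


def count_classes(labels_dir):
    """Count how many boxes of each class id appear in a directory of YOLO label files."""
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if len(fields) < 5:
                # Skip blank or malformed lines.
                continue
            # The first field is the class id; the remaining four describe the box.
            counts[int(float(fields[0]))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location; adjust to <DATASET-DIR>/VOC/labels/train2012.
    labels_dir = Path("datasets/VOC/labels/train2012")
    if labels_dir.is_dir():
        for class_id, n in count_classes(labels_dir).most_common():
            print(class_id, n)
```

The class ids can be turned into readable names with the names mapping loaded from the dataset YAML file in the snippet above.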
Pre-train and Fine-tune¶
We will use lightly-train
to pre-train a YOLO11 model using self-supervised learning.
The following scripts or CLI commands will:
Initialize a YOLO11s model with random weights.
Pre-train the YOLO11s model on the training images of PASCAL VOC using distillation pretraining.
Export the pre-trained YOLO11s model.
Fine-tune the pre-trained model on the PASCAL VOC dataset using labels, and evaluate its performance.
# pretrain_yolo.py
import lightly_train
from ultralytics import settings
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
data_path = f"{settings['datasets_dir']}/VOC/images/train2012"
if __name__ == "__main__":
    # Pre-train with lightly-train.
    lightly_train.train(
        out="out/my_experiment",  # Output directory.
        model="ultralytics/yolo11s.yaml",  # Pass the YOLO model.
        data=data_path,  # Path to a directory with training images.
        epochs=100,  # Adjust epochs for faster training.
        batch_size=64,  # Adjust batch size based on hardware.
    )
# finetune_yolo.py
from ultralytics import YOLO
if __name__ == "__main__":
    # Load the exported model.
    model = YOLO("out/my_experiment/exported_models/exported_last.pt")
    # Fine-tune with ultralytics.
    model.train(data="VOC.yaml", epochs=100)
lightly-train train out="out/my_experiment" data="<DATASET-DIR>/VOC/images/train2012" model="ultralytics/yolo11s.yaml" epochs=100 batch_size=64
yolo detect train model="out/my_experiment/exported_models/exported_last.pt" data="VOC.yaml" epochs=100
Congratulations! You have successfully pre-trained a model using lightly-train and fine-tuned it for object detection using ultralytics.
For more advanced options, explore the Python API and Ultralytics documentation.
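To evaluate the fine-tuned model separately, one option is to load the weights written during fine-tuning and run Ultralytics' validation. The sketch below is an illustration under assumptions: pick_weights and evaluate are hypothetical helper names, and runs/detect/train is the default Ultralytics output directory for a first detection run, which will differ if you have trained before or set a custom project or name.

```python
from pathlib import Path


def pick_weights(run_dir):
    """Return best.pt from a run directory if present, else last.pt, else None."""
    weights_dir = Path(run_dir) / "weights"
    for name in ("best.pt", "last.pt"):
        candidate = weights_dir / name
        if candidate.is_file():
            return candidate
    return None


def evaluate(weights, data="VOC.yaml"):
    """Validate a fine-tuned model and return its headline detection metrics."""
    # Imported lazily so pick_weights can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(str(weights))
    metrics = model.val(data=data)
    return {"mAP50": metrics.box.map50, "mAP50-95": metrics.box.map}


if __name__ == "__main__":
    # Assumed default run directory; adjust to the one printed at the end of training.
    weights = pick_weights("runs/detect/train")
    if weights is None:
        print("No weights found; check the run directory.")
    else:
        print(evaluate(weights))
```

Running the same evaluation on a model fine-tuned from random weights gives a baseline against which the effect of pre-training can be judged.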
Next Steps¶
Go beyond distillation pretraining and experiment with other self-supervised learning methods in lightly-train, such as DINO or SimCLR.
Try various YOLO models (YOLOv5, YOLOv6, YOLOv8).
Use the pre-trained model for other tasks, like image embeddings.
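As a starting point for the first two suggestions, the sketch below wraps the pre-training call so that each combination of model and method writes to its own output directory. It is a sketch under assumptions rather than a reference implementation: experiment_dir and pretrain are hypothetical helpers, the method argument and the value "dino" should be checked against the lightly-train documentation for your installed version, and "ultralytics/yolov8s.yaml" simply follows the naming pattern of the model used earlier in this tutorial.

```python
from pathlib import Path


def experiment_dir(model, method, root="out"):
    """Build a distinct output directory name, e.g. out/yolov8s_dino."""
    model_name = Path(model).stem  # "ultralytics/yolov8s.yaml" -> "yolov8s"
    return (Path(root) / f"{model_name}_{method}").as_posix()


def pretrain(data, model="ultralytics/yolov8s.yaml", method="dino", epochs=100, batch_size=64):
    """Pre-train one model/method combination and return its output directory."""
    # Imported lazily so experiment_dir can be used without lightly-train installed.
    import lightly_train

    out = experiment_dir(model, method)
    lightly_train.train(
        out=out,
        data=data,
        model=model,
        method=method,  # Assumed parameter name for selecting the pre-training method.
        epochs=epochs,
        batch_size=batch_size,
    )
    return out


# Example call (not executed here):
# pretrain("<DATASET-DIR>/VOC/images/train2012", method="simclr")
```

Keeping one directory per run makes it straightforward to fine-tune each exported model with the same settings and compare the results.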