Model Evaluation

Evaluation interface for image datasets.

EvaluationResult

Bases: BaseModel

Summary of the inputs used by an evaluation run.

Returned by every task method on ImageDatasetEvaluate. The field set is shared across tasks (object detection, classification, segmentation).

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| sample_count | int | Number of samples included in the evaluation. |
| gt_annotation_count | int | Number of ground truth annotations used. |
| pred_annotation_count | int | Number of prediction annotations used. |

from_evaluation_data classmethod

from_evaluation_data(data: EvaluationData) -> EvaluationResult

Build a result from the prepared evaluation data.
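As a rough illustration of how the three counts could be derived, here is a minimal sketch using plain dataclasses. The fields of `EvaluationData` are not documented on this page, so `EvaluationDataSketch` and its attributes are hypothetical, and `EvaluationResultSketch` is only a stand-in for the real `BaseModel` subclass.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationDataSketch:
    """Hypothetical stand-in for EvaluationData; the real fields are not documented here."""

    sample_ids: list[str]
    gt_annotations: list[dict]
    pred_annotations: list[dict]


@dataclass(frozen=True)
class EvaluationResultSketch:
    """Stand-in mirroring the three documented attributes of EvaluationResult."""

    sample_count: int
    gt_annotation_count: int
    pred_annotation_count: int

    @classmethod
    def from_evaluation_data(cls, data: EvaluationDataSketch) -> EvaluationResultSketch:
        # Assumption: each count is simply the size of the corresponding input.
        return cls(
            sample_count=len(data.sample_ids),
            gt_annotation_count=len(data.gt_annotations),
            pred_annotation_count=len(data.pred_annotations),
        )


data = EvaluationDataSketch(
    sample_ids=["img_0", "img_1"],
    gt_annotations=[{"sample": "img_0"}, {"sample": "img_1"}, {"sample": "img_1"}],
    pred_annotations=[{"sample": "img_0"}, {"sample": "img_1"}],
)
result = EvaluationResultSketch.from_evaluation_data(data)
print(result)
```

Comparing `gt_annotation_count` with `pred_annotation_count` is a quick sanity check that both annotation sources were picked up before looking at any metric.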

ObjectDetectionEvaluationConfig

Bases: BaseModel

Configuration for object-detection evaluation runs.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| iou_threshold | float | IoU threshold used by object-detection evaluators. Stored in the run config for reproducibility. |
| classwise | bool | If True, match predictions and ground truths only within the same annotation class. If False, match globally across all annotation classes. |
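To make the two options concrete, the sketch below computes IoU for axis-aligned boxes and runs a simple greedy one-to-one matching. This page does not describe the matching algorithm the evaluators actually use, so `Box`, `iou`, `match`, the greedy strategy, and the `>=` comparison against the threshold are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box with a class label."""

    label: str
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match(
    gts: list[Box], preds: list[Box], iou_threshold: float, classwise: bool
) -> list[tuple[int, int]]:
    """Greedy one-to-one matching; returns (gt_index, pred_index) pairs."""
    candidates = []
    for gi, gt in enumerate(gts):
        for pi, pred in enumerate(preds):
            # classwise=True only allows pairs that share a class label.
            if classwise and gt.label != pred.label:
                continue
            score = iou(gt, pred)
            if score >= iou_threshold:  # assumed inclusive threshold
                candidates.append((score, gi, pi))
    candidates.sort(reverse=True)  # best-overlapping pairs first
    used_gt: set[int] = set()
    used_pred: set[int] = set()
    pairs = []
    for _, gi, pi in candidates:
        if gi not in used_gt and pi not in used_pred:
            used_gt.add(gi)
            used_pred.add(pi)
            pairs.append((gi, pi))
    return pairs


gts = [Box("cat", 0, 0, 10, 10)]
preds = [Box("dog", 0, 0, 10, 8)]  # IoU with the ground truth is 80 / 100 = 0.8

classwise_pairs = match(gts, preds, iou_threshold=0.5, classwise=True)
global_pairs = match(gts, preds, iou_threshold=0.5, classwise=False)
print(classwise_pairs, global_pairs)
```

In this toy case the boxes overlap well but carry different labels, so the pair is rejected with `classwise=True` and accepted with `classwise=False`.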

ClassificationEvaluationConfig

Bases: BaseModel

Configuration for classification evaluation runs.

Currently has no fields. Placeholder for future task-specific options.

SemanticSegmentationEvaluationConfig

Bases: BaseModel

Configuration for semantic-segmentation evaluation runs.

Currently has no fields. Placeholder for future task-specific options.

ImageDatasetEvaluate

ImageDatasetEvaluate(session: Session, collection_id: UUID, samples: Iterable[ImageSample])

Task-specific evaluation entry points for image datasets.

This facade groups evaluation methods by task (e.g. object detection) and keeps evaluation-specific logic separate from ImageDataset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| session | Session | Database session used by resolver calls. | required |
| collection_id | UUID | ID of the collection being evaluated. | required |
| samples | Iterable[ImageSample] | Samples selected for evaluation. | required |

classification

classification(
    name: str,
    gt_annotation_source: str,
    pred_annotation_source: str,
    config: ClassificationEvaluationConfig | None = None,
) -> EvaluationResult

Create a classification evaluation run and persist per-image metrics.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Display name of the evaluation run. | required |
| gt_annotation_source | str | Name of the annotation source containing ground truth annotations. | required |
| pred_annotation_source | str | Name of the annotation source containing predictions. | required |
| config | ClassificationEvaluationConfig \| None | Optional classification evaluation config. If omitted, defaults are used. | None |

Returns:

| Type | Description |
| --- | --- |
| EvaluationResult | Summary of the samples and annotations used by the evaluation. |
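This page does not list which per-image metrics a classification run stores. As one plausible example only, the sketch below computes a per-image correctness flag and the accuracy derived from it; `per_image_correct` and the sample data are hypothetical.

```python
from __future__ import annotations


def per_image_correct(gt: dict[str, str], pred: dict[str, str]) -> dict[str, float]:
    """Return 1.0 for each image whose predicted label equals the ground truth, else 0.0."""
    # Images without a prediction count as incorrect (an assumption of this sketch).
    return {sample_id: float(pred.get(sample_id) == label) for sample_id, label in gt.items()}


gt_labels = {"img_0": "cat", "img_1": "dog", "img_2": "cat"}
pred_labels = {"img_0": "cat", "img_1": "cat", "img_2": "cat"}

correct = per_image_correct(gt_labels, pred_labels)
accuracy = sum(correct.values()) / len(correct)
print(correct, accuracy)
```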

object_detection

object_detection(
    name: str,
    gt_annotation_source: str,
    pred_annotation_source: str,
    config: ObjectDetectionEvaluationConfig | None = None,
) -> EvaluationResult

Create an object-detection evaluation run and persist per-image metrics.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Display name of the evaluation run. | required |
| gt_annotation_source | str | Name of the annotation source containing ground truth annotations. | required |
| pred_annotation_source | str | Name of the annotation source containing predictions. | required |
| config | ObjectDetectionEvaluationConfig \| None | Optional object-detection evaluation config. If omitted, defaults are used. | None |

Returns:

| Type | Description |
| --- | --- |
| EvaluationResult | Summary of the samples and annotations used by the evaluation. |
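The call shape is the same for all three task methods. The sketch below imitates it with stand-in classes so it can run without a database session. `FakeImageDatasetEvaluate` and `FakeObjectDetectionConfig` are hypothetical, the default values shown are assumptions (this page does not state the real defaults), and the real method returns an `EvaluationResult` rather than a dict.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeObjectDetectionConfig:
    """Stand-in for ObjectDetectionEvaluationConfig (the real class is a BaseModel)."""

    iou_threshold: float = 0.5  # assumed default; not stated in the reference
    classwise: bool = True  # assumed default; not stated in the reference


@dataclass
class FakeImageDatasetEvaluate:
    """Stand-in facade that only records which runs were requested."""

    runs: dict[str, dict] = field(default_factory=dict)

    def object_detection(
        self,
        name: str,
        gt_annotation_source: str,
        pred_annotation_source: str,
        config: FakeObjectDetectionConfig | None = None,
    ) -> dict:
        # Mirror the documented behaviour: fall back to defaults when config is omitted.
        resolved = config if config is not None else FakeObjectDetectionConfig()
        run = {
            "gt_annotation_source": gt_annotation_source,
            "pred_annotation_source": pred_annotation_source,
            "config": resolved,
        }
        self.runs[name] = run
        return run


evaluator = FakeImageDatasetEvaluate()

# Run with default settings.
default_run = evaluator.object_detection(
    name="baseline",
    gt_annotation_source="ground_truth",
    pred_annotation_source="model_v1",
)

# Run with an explicit config: stricter overlap, class-agnostic matching.
strict_run = evaluator.object_detection(
    name="strict",
    gt_annotation_source="ground_truth",
    pred_annotation_source="model_v1",
    config=FakeObjectDetectionConfig(iou_threshold=0.75, classwise=False),
)
print(default_run["config"], strict_run["config"])
```

Giving each run a distinct `name` keeps runs with different configs distinguishable when they are compared later.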

semantic_segmentation

semantic_segmentation(
    name: str,
    gt_annotation_source: str,
    pred_annotation_source: str,
    config: SemanticSegmentationEvaluationConfig | None = None,
) -> EvaluationResult

Create a semantic segmentation evaluation run and persist per-image metrics.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Display name of the evaluation run. | required |
| gt_annotation_source | str | Name of the annotation source containing ground truth labels. | required |
| pred_annotation_source | str | Name of the annotation source containing predictions. | required |
| config | SemanticSegmentationEvaluationConfig \| None | Optional semantic segmentation evaluation config. If omitted, defaults are used. | None |

Returns:

| Type | Description |
| --- | --- |
| EvaluationResult | Summary of the samples and annotations used by the evaluation. |
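The stored per-image metrics for segmentation are not specified on this page. Per-class IoU over label masks is a common choice, so the sketch below shows that computation for a single image using nested lists. `per_class_iou` and the toy masks are hypothetical and do not reflect the library's implementation.

```python
from __future__ import annotations


def per_class_iou(
    gt_mask: list[list[int]], pred_mask: list[list[int]], class_ids: list[int]
) -> dict[int, float]:
    """Per-class IoU for one image; classes absent from both masks are skipped."""
    ious: dict[int, float] = {}
    for class_id in class_ids:
        inter = 0
        union = 0
        for gt_row, pred_row in zip(gt_mask, pred_mask):
            for gt_px, pred_px in zip(gt_row, pred_row):
                in_gt = gt_px == class_id
                in_pred = pred_px == class_id
                inter += in_gt and in_pred  # pixel belongs to the class in both masks
                union += in_gt or in_pred  # pixel belongs to the class in either mask
        if union:
            ious[class_id] = inter / union
    return ious


# 2x3 label masks with classes 0 (background) and 1 (object).
gt_mask = [[0, 0, 1], [0, 1, 1]]
pred_mask = [[0, 1, 1], [0, 1, 1]]

ious = per_class_iou(gt_mask, pred_mask, class_ids=[0, 1])
mean_iou = sum(ious.values()) / len(ious)
print(ious, mean_iou)
```

Here class 0 has 2 shared pixels out of 3 in the union and class 1 has 3 out of 4, so the per-class values are 2/3 and 3/4.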