lightly.active_learning

.agents

class lightly.active_learning.agents.agent.ActiveLearningAgent(api_workflow_client: lightly.api.api_workflow_client.ApiWorkflowClient, query_tag_name: str = None, preselected_tag_name: str = None)

A basic class providing an active learning policy.

Attributes:
api_workflow_client:

The client used to connect to the API.

preselected_tag_id:

The id of the tag containing the already labeled samples. Default: None, meaning there are no labeled samples yet.

query_tag_id:

The id of the tag defining where to sample from. Default: None, which resolves to the initial_tag.

labeled_set:

The filenames of the samples in the labeled set (List[str]).

unlabeled_set:

The filenames of the samples in the unlabeled set (List[str]).

query(sampler_config: lightly.active_learning.config.sampler_config.SamplerConfig, al_scorer: lightly.active_learning.scorers.scorer.Scorer = None) → List[str]

Performs an active learning query.

As part of the query, self.labeled_set and self.unlabeled_set are updated and can be used for the next step.

Args:
sampler_config:

The sampling configuration.

al_scorer:

An instance of a class inheriting from Scorer, e.g. a ClassificationScorer.

Returns:

The filenames of the samples in the new labeled_set.
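The bookkeeping behind a query can be sketched as follows. This is an illustrative model of how the queried filenames move from the unlabeled set to the labeled set, not the library's internal code:

```python
from typing import List, Tuple

def apply_query(labeled_set: List[str],
                unlabeled_set: List[str],
                queried: List[str]) -> Tuple[List[str], List[str]]:
    """Move the queried filenames from the unlabeled to the labeled set."""
    queried_set = set(queried)
    # queried samples join the labeled set ...
    new_labeled = labeled_set + [f for f in unlabeled_set if f in queried_set]
    # ... and are removed from the unlabeled set
    new_unlabeled = [f for f in unlabeled_set if f not in queried_set]
    return new_labeled, new_unlabeled

# after a query that selected 'img1.jpg':
labeled, unlabeled = apply_query(
    [], ['img0.jpg', 'img1.jpg', 'img2.jpg'], ['img1.jpg']
)
```

After a real `agent.query(...)` call, the same updated sets are available as `agent.labeled_set` and `agent.unlabeled_set`.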

.config

class lightly.active_learning.config.sampler_config.SamplerConfig(method: lightly.openapi_generated.swagger_client.models.sampling_method.SamplingMethod = 'CORESET', n_samples: int = 32, min_distance: float = -1, name: str = None)

Configuration class for a sampler.

Attributes:
method:

The method to use for sampling, e.g. CORESET.

n_samples:

The maximum number of samples to be chosen by the sampler including the samples in the preselected tag. One of the stopping conditions.

min_distance:

The minimum distance between samples in the chosen set; one of the stopping conditions.

name:

The name of this sampling, defaults to a name consisting of all other attributes and the datetime. A new tag will be created in the web-app under this name.
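The CORESET method is commonly implemented as greedy farthest-point sampling. The sketch below illustrates how n_samples and min_distance act as the two stopping conditions; it is an illustration of the configuration semantics, not lightly's implementation:

```python
import numpy as np

def greedy_coreset(embeddings: np.ndarray,
                   preselected: list,
                   n_samples: int,
                   min_distance: float = -1.0) -> list:
    """Greedy farthest-point selection over the rows of `embeddings`.

    Stops when `n_samples` total samples (including the preselected
    ones) are chosen, or when the farthest remaining point is closer
    than `min_distance` to the already chosen set.
    """
    chosen = list(preselected)
    # distance of each point to its nearest already-chosen point
    dists = np.full(len(embeddings), np.inf)
    for c in chosen:
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[c], axis=1))
    while len(chosen) < n_samples:
        idx = int(np.argmax(dists))
        # stopping condition: every remaining point is within
        # min_distance of the chosen set (or already chosen)
        if dists[idx] < min_distance or idx in chosen:
            break
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[idx], axis=1))
    return chosen
```

With the default `min_distance=-1`, only `n_samples` limits the selection; a positive `min_distance` can stop it earlier once the chosen set covers the data densely enough.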

.scorers

class lightly.active_learning.scorers.classification.ScorerClassification(model_output: numpy.ndarray)

Class to compute active learning scores from the model_output of a classification task.

Attributes:
model_output:

Predictions of shape N x C where N is the number of unlabeled samples and C is the number of classes in the classification task. Must be normalized such that the sum over each row is 1. The order of the predictions must be the one specified by ActiveLearningAgent.unlabeled_set.

Examples:
>>> # example with three unlabeled samples
>>> al_agent.unlabeled_set
>>> > ['img0.jpg', 'img1.jpg', 'img2.jpg']
>>> predictions = np.array(
>>>     [
>>>          [0.1, 0.9], # predictions for img0.jpg
>>>          [0.3, 0.7], # predictions for img1.jpg
>>>          [0.8, 0.2], # predictions for img2.jpg
>>>     ] 
>>> )
>>> np.sum(predictions, axis=1)
>>> > array([1., 1., 1.])
>>> scorer = ScorerClassification(predictions)
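The exact formula the scorer applies is not shown here; one common classification uncertainty score is least-confidence, sketched below as an illustration of what such a scorer computes (the library's formula may differ):

```python
import numpy as np

def least_confidence_scores(predictions: np.ndarray) -> np.ndarray:
    """Score each sample by 1 minus its highest class probability.

    Higher scores mean the model is less certain, making those
    samples better candidates for labeling.
    """
    return 1.0 - predictions.max(axis=1)

predictions = np.array([
    [0.1, 0.9],    # confident -> low score
    [0.3, 0.7],
    [0.55, 0.45],  # uncertain -> high score
])
scores = least_confidence_scores(predictions)
```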
class lightly.active_learning.scorers.detection.ScorerObjectDetection(model_output: List[lightly.active_learning.utils.object_detection_output.ObjectDetectionOutput], config: Dict = None)

Class to compute active learning scores from the model_output of an object detection task.

Currently supports the following scorers:

object-frequency:

This scorer uses model predictions to favor images containing many objects. Use it if you want scenes with many objects, as is common in computer vision tasks such as perception for autonomous driving.

prediction-margin:

This scorer uses the margin between 1.0 and the highest confidence prediction. Use this scorer to select images where the model is uncertain.

Attributes:
model_output:

List of model outputs in an object detection setting.

config:

A dictionary containing additional parameters for the scorers.

frequency_penalty (float):

Used by the object-frequency scorer. If multiple objects of the same class appear in the same sample, each additional object is weighted by the penalty: with a penalty of 1.0 every object counts fully, with 0.5 the first object of a class counts fully and the second only 50%. Lowering this value results in a more balanced weighting of the classes; 0.0 is the maximum penalty. (default: 0.25)

min_score (float):

Used by the object-frequency scorer. Specifies the minimum score per sample; all scores are scaled to the [min_score, 1.0] range. Lowering this value makes the sampler focus more on samples with many objects. (default: 0.9)
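Based on the attribute descriptions above, the object-frequency score can be sketched as follows. This illustrates the documented frequency_penalty and min_score behavior under the assumption that the k-th object of a class contributes penalty^(k-1); it is not the library's exact code:

```python
import numpy as np
from collections import Counter

def object_frequency_scores(labels_per_image,
                            frequency_penalty: float = 0.25,
                            min_score: float = 0.9) -> np.ndarray:
    """Weighted object counts per image, scaled to [min_score, 1.0].

    The k-th object of the same class in an image contributes
    frequency_penalty ** (k - 1) to that image's count.
    """
    counts = []
    for labels in labels_per_image:
        total = 0.0
        for _cls, k in Counter(labels).items():
            # first object counts fully, repeats are penalized
            total += sum(frequency_penalty ** i for i in range(k))
        counts.append(total)
    counts = np.asarray(counts)
    max_count = counts.max() if counts.max() > 0 else 1.0
    # scale into the [min_score, 1.0] range
    return min_score + (1.0 - min_score) * counts / max_count
```

With `frequency_penalty=0.5` and `min_score=0.9`, an image with two cars scores higher than one with a single car, but lower than one with a car and a person, matching the balancing behavior described above.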

Examples:
>>> # typical model output
>>> predictions = [{
>>>     'boxes': [[0.1, 0.2, 0.3, 0.4]],
>>>     'object_probabilities': [0.1024],
>>>     'class_probabilities': [[0.5, 0.41, 0.09]]
>>> }]
>>>
>>> # generate detection outputs
>>> model_output = []
>>> for prediction in predictions:
>>>     # convert each box to a BoundingBox object
>>>     boxes = []
>>>     for box in prediction['boxes']:
>>>         x0, x1 = box[0], box[2]
>>>         y0, y1 = box[1], box[3]
>>>         boxes.append(BoundingBox(x0, y0, x1, y1))
>>>     # create detection outputs
>>>     output = ObjectDetectionOutput(
>>>         boxes,
>>>         prediction['object_probabilities'],
>>>         prediction['class_probabilities']
>>>     )
>>>     model_output.append(output)
>>>
>>> # create scorer from output
>>> scorer = ScorerObjectDetection(model_output)