fiftyone.brain
Module contents
The brains behind FiftyOne: a powerful package for dataset curation, analysis, and visualization.
See https://github.com/voxel51/fiftyone for more information.
Functions
compute_hardness – Adds a hardness field to each sample scoring the difficulty that the specified label field observed in classifying the sample.
compute_mistakenness – Computes the mistakenness of the labels in the specified label_field, scoring the chance that the labels are incorrect.
compute_uniqueness – Adds a uniqueness field to each sample scoring how unique it is with respect to the rest of the samples.
fiftyone.brain.compute_hardness(samples, label_field, hardness_field='hardness')
Adds a hardness field to each sample scoring the difficulty that the specified label field observed in classifying the sample.
Hardness is computed from the model's prediction output (its logits) and summarizes how uncertain the model was about the sample. Because it is a quantitative score, it can be used to detect things like hard samples, annotation errors during noisy training, and more.
Parameters
- samples – a fiftyone.core.collections.SampleCollection
- label_field – the fiftyone.core.labels.Classification or fiftyone.core.labels.Classifications field to use from each sample
- hardness_field ("hardness") – the field name to use to store the hardness value for each sample
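To make the idea of an uncertainty score derived from logits concrete, here is a minimal sketch that uses the entropy of the softmax distribution. This is an assumption chosen for illustration only: the description above does not say which formula compute_hardness uses, and the helper names softmax and entropy_uncertainty are hypothetical.

```python
import math


def softmax(logits):
    """Converts raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_uncertainty(logits):
    """Illustrative uncertainty score: entropy of the softmax distribution.

    NOTE: this is an assumed stand-in, not necessarily the measure that
    fiftyone.brain.compute_hardness computes.
    """
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A confident prediction scores low; an undecided one scores high
confident = entropy_uncertainty([8.0, 0.0, 0.0])
uncertain = entropy_uncertainty([1.0, 1.0, 1.0])
```

Under this sketch, a sample whose logits are spread evenly across classes receives the highest score, which matches the intuition that the model found it hard.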
fiftyone.brain.compute_mistakenness(samples, pred_field, label_field='ground_truth', mistakenness_field='mistakenness', missing_field='possible_missing', spurious_field='possible_spurious', use_logits=True)
Computes the mistakenness of the labels in the specified label_field, scoring the chance that the labels are incorrect.
Mistakenness is computed based on the predictions in the pred_field, through its logits or confidence. This measure can be used to detect things like annotation errors and unusually hard samples.
This method supports both classifications and detections.
For classifications, a mistakenness_field field is populated on each sample that quantifies the likelihood that the label in the label_field of that sample is incorrect.
For detections, the mistakenness of each detection in label_field is computed, using fiftyone.utils.evaluation.evaluate_detections() to locate corresponding detections in pred_field. Three types of mistakes are identified:
- (Mistakes) Detections with a match in pred_field are assigned a mistakenness value in their mistakenness_field, which captures the likelihood that the detection in label_field is a mistake. Such mistakes may be due to either the class label or localization of the detection
- (Missing) Detections in pred_field with no matches in label_field but which are likely to be correct are added to label_field and given a value of True in their missing_field attribute
- (Spurious) Detections in label_field with no matches in pred_field but which are likely to be incorrect are given a value of True in their spurious_field attribute
These per-detection data are then aggregated at the sample level as follows:
- (Mistakes) The mistakenness_field of each sample is populated with the maximum mistakenness of the detections in label_field
- (Missing) The missing_field of each sample is populated with the number of detections that were deemed missing and thus added to label_field
- (Spurious) The spurious_field of each sample is populated with the number of detections in label_field that were deemed spurious
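The sample-level aggregation described above can be sketched in plain Python. This is only an illustration under stated assumptions: detections are modeled as plain dicts rather than fiftyone.core.labels.Detection objects, the helper aggregate_sample is hypothetical, and returning None when no detection carries a mistakenness value is a guess, not documented behavior.

```python
def aggregate_sample(
    detections,
    mistakenness_field="mistakenness",
    missing_field="possible_missing",
    spurious_field="possible_spurious",
):
    """Aggregates per-detection values into sample-level values.

    `detections` is a list of dicts standing in for the detections in
    label_field (an assumption made for this sketch).
    """
    scores = [d[mistakenness_field] for d in detections if mistakenness_field in d]
    return {
        # Maximum mistakenness over the detections that have a score
        mistakenness_field: max(scores) if scores else None,
        # Number of detections flagged as missing (and added to label_field)
        missing_field: sum(1 for d in detections if d.get(missing_field) is True),
        # Number of detections flagged as spurious
        spurious_field: sum(1 for d in detections if d.get(spurious_field) is True),
    }


detections = [
    {"label": "cat", "mistakenness": 0.12},
    {"label": "dog", "mistakenness": 0.81},
    {"label": "bird", "possible_missing": True},
    {"label": "car", "possible_spurious": True},
]
result = aggregate_sample(detections)
```

Here the sample would receive a mistakenness of 0.81 (the maximum over its matched detections), one possible missing detection, and one possible spurious detection.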
Parameters
- samples – a fiftyone.core.collections.SampleCollection
- pred_field – the name of the predicted label field to use from each sample. Can be of type fiftyone.core.labels.Classification, fiftyone.core.labels.Classifications, or fiftyone.core.labels.Detections
- label_field ("ground_truth") – the name of the “ground truth” label field that you want to test for mistakes with respect to the predictions in pred_field. Must have the same type as pred_field
- mistakenness_field ("mistakenness") – the field name to use to store the mistakenness value for each sample
- missing_field ("possible_missing") – the field in which to store per-sample counts of potential missing detections. Only applicable for fiftyone.core.labels.Detections labels
- spurious_field ("possible_spurious") – the field in which to store per-sample counts of potential spurious detections. Only applicable for fiftyone.core.labels.Detections labels
- use_logits (True) – whether to use logits (True) or confidence (False) to compute mistakenness. Logits typically yield better results when they are available
fiftyone.brain.compute_uniqueness(samples, uniqueness_field='uniqueness', roi_field=None)
Adds a uniqueness field to each sample scoring how unique it is with respect to the rest of the samples.
This function only uses the pixel data and can therefore process labeled or unlabeled samples.
Parameters
- samples – a fiftyone.core.collections.SampleCollection
- uniqueness_field ("uniqueness") – the field name to use to store the uniqueness value for each sample
- roi_field (None) – an optional fiftyone.core.labels.Detection, fiftyone.core.labels.Detections, fiftyone.core.labels.Polyline, or fiftyone.core.labels.Polylines field defining a region of interest within each image to use to compute uniqueness
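A typical use of compute_uniqueness might look like the following sketch. The call uses the signature documented above; the wrapper rank_by_uniqueness is hypothetical, and it assumes that the sample collection provides a sort_by(field, reverse=True) method for ordering samples by a field.

```python
def rank_by_uniqueness(samples, uniqueness_field="uniqueness"):
    """Scores each sample's uniqueness and returns the most unique first.

    `samples` is expected to be a fiftyone.core.collections.SampleCollection.
    """
    # Imported lazily so this sketch can be loaded without FiftyOne installed
    import fiftyone.brain as fob

    # Populates `uniqueness_field` on each sample (signature as documented above)
    fob.compute_uniqueness(samples, uniqueness_field=uniqueness_field)

    # Assumption: the collection supports sort_by(field, reverse=True)
    return samples.sort_by(uniqueness_field, reverse=True)
```

Because uniqueness relies only on pixel data, a helper like this could be applied to unlabeled samples, for example to decide which images to annotate first.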