Masks

When applicable, sources contains masks, a list of dictionaries that connects mask data, a special type of source data, to the ClearML Enterprise platform.

Masks are used in deep learning for semantic segmentation.

Each mask corresponds to raw data in which the objects to be detected are marked with colors. The colors are RGB values, and each one represents an object that is labeled for segmentation.

In frames used for semantic segmentation, two kinds of information are kept separate: the metadata connecting the mask files or images to the ClearML Enterprise platform, and the RGB values and labels used for segmentation. They are contained in two different dictionaries of a SingleFrame:

  • masks (plural) is in sources and contains the URI of the mask file or image (in addition to other keys and values).

  • mask (singular) is in the rois array of a Frame.

    Each rois dictionary contains:

    • RGB values and labels of a mask (in addition to other keys and values)

    • Metadata and data for the labeled area of an image

See Example 1, which shows masks in sources, mask in rois, and the key-value pairs used to relate a mask to its source in a frame.

Masks Structure

The table below describes the keys and values of the masks dictionary (in the sources section of a Frame).

Key             Value Description

id              Type: integer.
                  • The ID is used to relate this mask data source to the mask dictionary containing the label and RGB value for the mask.
                  • See the mask key in rois.

content_type    Type: string.
                  • Type of mask data. For example, image/png or video/mp4.

timestamp       Type: integer.
                  • For images from a video, indicates the absolute position of the frame from the source (video). For example, video from a camera on a car, at 30 frames per second, would have a timestamp of 0 for the first frame, and 33 for the second frame.
                  • For still images, set this to 0.

uri             Type: string.
                  • URI of the mask file or image.
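
The timestamp and the four keys above can be sketched in a few lines of Python. This is only an illustration: it assumes timestamps are whole milliseconds truncated toward zero (which is consistent with the 0 and 33 values in the table), and the helper names frame_timestamp and make_mask_source are hypothetical, not part of the ClearML SDK.

```python
def frame_timestamp(frame_index: int, fps: float) -> int:
    """Position of a frame in whole milliseconds (assumed unit, truncated)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return int(frame_index * 1000 / fps)


def make_mask_source(mask_id, content_type: str, uri: str, timestamp: int = 0) -> dict:
    """Build one entry of the masks list from the four keys in the table.

    mask_id is left untyped on purpose: the table lists an integer,
    while the examples on this page use strings.
    """
    return {
        "id": mask_id,
        "content_type": content_type,
        "uri": uri,
        "timestamp": timestamp,  # 0 for still images
    }


# Second frame of a 30 frames-per-second video
second_frame = make_mask_source(
    "seg",
    "video/mp4",
    "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
    timestamp=frame_timestamp(1, 30),
)
```

For a still image, leaving timestamp at its default of 0 follows the rule in the table.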

Examples

Example 1

This example demonstrates an original image, its masks, and its frame containing the sources and ROI metadata.

Example 1: View the frame

This frame contains the masks list of dictionaries in sources, and the rois array, as well as several top-level key-value pairs.

{
  "timestamp": 1234567889,
  "context_id": "car_1",
  "meta": {
    "velocity": "60"
  },
  "sources": [
    {
      "id": "front",
      "content_type": "video/mp4",
      "width": 800,
      "height": 600,
      "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
      "timestamp": 1234567889,
      "meta": {
        "angle": 45,
        "fov": 129
      },
      "masks": [
        {
          "id": "seg",
          "content_type": "video/mp4",
          "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
          "timestamp": 123456789
        },
        {
          "id": "seg_instance",
          "content_type": "video/mp4",
          "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
          "timestamp": 123456789
        }
      ]
    }
  ],
  "rois": [
    {
      "sources": ["front"],
      "label": ["seg"],
      "mask": {
        "id": "car",
        "value": [210, 210, 120]
      }
    },
    {
      "sources": ["front"],
      "label": ["seg"],
      "mask": {
        "id": "person",
        "value": [147, 44, 209]
      }
    },
    {
      "sources": ["front"],
      "label": ["seg"],
      "mask": {
        "id": "road",
        "value": [197, 135, 146]
      }
    },
    {
      "sources": ["front"],
      "label": ["seg"],
      "mask": {
        "id": "street",
        "value": [135, 198, 145]
      }
    },
    {
      "sources": ["front"],
      "label": ["seg"],
      "mask": {
        "id": "building",
        "value": [72, 191, 65]
      }
    }
  ]
}

  • In sources:
    • The source ID is front.
    • In the masks list, the source contains mask sources with the IDs seg and seg_instance.
  • In rois:
    • Each ROI source is front, relating the ROI to its original source image.
    • Each ROI has a label of seg, indicating segmentation.
    • Each mask has an id (car, person, road, street, and building) and a unique RGB value (color-coding).
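
The color-coding described above can be read back into labels with a small lookup. The following is a minimal sketch in plain Python, assuming the mask image has already been loaded as rows of RGB triples; build_color_lookup, decode_mask, and the 2x2 pixel grid are hypothetical and only illustrate the idea.

```python
def build_color_lookup(rois):
    """Map each mask RGB value (as a tuple) to its mask id."""
    lookup = {}
    for roi in rois:
        mask = roi.get("mask")
        if mask is None:
            continue  # this ROI carries no mask entry
        lookup[tuple(mask["value"])] = mask["id"]
    return lookup


def decode_mask(pixels, lookup, unknown="unlabeled"):
    """Translate rows of RGB pixels into rows of mask ids."""
    return [[lookup.get(tuple(px), unknown) for px in row] for row in pixels]


# Two of the ROIs from the frame in Example 1
rois = [
    {"sources": ["front"], "label": ["seg"],
     "mask": {"id": "car", "value": [210, 210, 120]}},
    {"sources": ["front"], "label": ["seg"],
     "mask": {"id": "road", "value": [197, 135, 146]}},
]
lookup = build_color_lookup(rois)

# A hypothetical 2x2 mask image, given as rows of RGB triples
pixels = [
    [[210, 210, 120], [197, 135, 146]],
    [[197, 135, 146], [0, 0, 0]],
]
decoded = decode_mask(pixels, lookup)
print(decoded)  # [['car', 'road'], ['road', 'unlabeled']]
```

Keying the lookup on tuples makes the RGB values hashable, which is why each unique color can identify exactly one mask id.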
Example image and masks

Original Image: [image]

Mask image: [image]


Example 2

This example shows two masks for video from a camera. The masks label cars and the road.

Example 2: View the frame

{
  "sources": [
    {
      "id": "front",
      "content_type": "video/mp4",
      "width": 800,
      "height": 600,
      "uri": "https://s3.amazonaws.com/my_cars/car_1/front.mp4",
      "timestamp": 1234567889,
      "meta": {
        "angle": 45,
        "fov": 129
      },
      "masks": [
        {
          "id": "car",
          "content_type": "video/mp4",
          "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4",
          "timestamp": 123456789
        },
        {
          "id": "road",
          "content_type": "video/mp4",
          "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4",
          "timestamp": 123456789
        }
      ]
    }
  ],
  "rois": [
    {
      "sources": ["front"],
      "label": ["right_lane"],
      "mask": {
        "id": "car",
        "value": [210, 210, 120]
      }
    },
    {
      "sources": ["front"],
      "label": ["right_lane"],
      "mask": {
        "id": "road",
        "value": [197, 135, 146]
      }
    }
  ]
}

  • In sources:
    • The source ID is front.
    • The source contains mask sources with IDs of car and road.
  • In rois:
    • Each ROI source is front, relating the ROI to its original source image.
    • Each ROI has a label of right_lane, indicating the ROI object.
    • Each mask has an id (car, road) and a unique RGB value (color-coding).
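
In this example the id of each mask source matches the id of a mask in rois, so the two can be joined. The sketch below shows one way to do that on a plain dictionary shaped like the frame above; link_masks_to_rois is a hypothetical helper, not an SDK function, and it assumes that matching is done on the id key within the same source.

```python
def link_masks_to_rois(frame):
    """Join mask sources to ROI masks that share an id within a source."""
    linked = {}
    for source in frame.get("sources", []):
        mask_sources = {m["id"]: m for m in source.get("masks", [])}
        for roi in frame.get("rois", []):
            mask = roi.get("mask")
            if mask is None or source["id"] not in roi.get("sources", []):
                continue  # ROI has no mask or belongs to another source
            mask_source = mask_sources.get(mask["id"])
            if mask_source is None:
                continue  # no mask source with a matching id
            linked[mask["id"]] = {
                "source": source["id"],
                "uri": mask_source["uri"],
                "value": mask["value"],
                "labels": roi.get("label", []),
            }
    return linked


# Trimmed-down copy of the frame in Example 2
frame = {
    "sources": [{
        "id": "front",
        "masks": [
            {"id": "car",
             "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_seg.mp4"},
            {"id": "road",
             "uri": "https://s3.amazonaws.com/seg_masks/car_1/front_instance_seg.mp4"},
        ],
    }],
    "rois": [
        {"sources": ["front"], "label": ["right_lane"],
         "mask": {"id": "car", "value": [210, 210, 120]}},
        {"sources": ["front"], "label": ["right_lane"],
         "mask": {"id": "road", "value": [197, 135, 146]}},
    ],
}
linked = link_masks_to_rois(frame)
```

Each entry in linked then holds the mask file URI together with the RGB value and labels that describe it.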

Usage

Adding Mask Annotations

To add a mask annotation to a frame, use the SingleFrame.add_annotation method. This method is generally used to add ROI annotations, but it can also be used to add frame-specific mask labels. Pass the mask value as a list of RGB values in the mask_rgb parameter, and a list of labels in the labels parameter.

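from allegroai import SingleFrame

# Create a frame from a local image, with a remote preview image
frame = SingleFrame(
    source='/home/user/woof_meow.jpg',
    preview_uri='https://storage.googleapis.com/kaggle-competitions/kaggle/3362/media/woof_meow.jpg',
)

# Add a frame-level mask label: the RGB value [0, 0, 0] is labeled 'cat'
frame.add_annotation(mask_rgb=[0, 0, 0], labels=['cat'])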
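
Because mask_rgb is a plain list, a malformed value is easy to pass by mistake. The following is a minimal sketch of a check that could run before calling add_annotation, assuming each channel must be an integer from 0 to 255; validate_mask_rgb is a hypothetical helper and not part of the SDK.

```python
def validate_mask_rgb(value):
    """Return value as a list of three ints in 0..255, or raise ValueError."""
    channels = list(value)
    if len(channels) != 3:
        raise ValueError(f"expected 3 channels, got {len(channels)}")
    for channel in channels:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(channel, bool) or not isinstance(channel, int):
            raise ValueError(f"channel {channel!r} is not an integer")
        if not 0 <= channel <= 255:
            raise ValueError(f"channel {channel} is outside 0..255")
    return channels


mask_rgb = validate_mask_rgb((210, 210, 120))  # tuples are accepted as input
```

The validated list can then be passed as the mask_rgb argument.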