DataView

class dataview.DataView(name=None, iteration_order='sequential', iteration_infinite=False, maximum_number_of_frames=None, auto_connect_with_task=True, queries=None, **kwargs)#

A DataView object creates a single view over a mixture of dataset versions.

For example: balancing filtering rules (for every answer from one filter, give me two answers from another). Additionally, augmentation instructions are sent together with the data iterator, to be executed in-flight. This provides not only flexibility and reproducibility, but also enables multiple training jobs with zero code interference, for large-scale hyperparameter optimization.

Create a new DataView.

  • Parameters

    • name (str ) – The name of the current DataView. Important when multiple DataViews are used, for example when one DataView is named ‘Test’ and another ‘Validation’.

    • iteration_order (IterationOrder ) – The order in which frames are iterated on with this DataView.

    • iteration_infinite (bool ) – If True, the DataView may return any frame more than once, up until maximum_number_of_frames is reached. If the total number of unique frames is less than maximum_number_of_frames, no duplicate frames will be returned.

      info

      Duplicate frames would still vary in augmentations, if used, due to the random nature of augmentation operation and parameter selection.

    • maximum_number_of_frames (int ) – The maximum number of frames to be returned when iterating on this DataView.

    • auto_connect_with_task (bool ) – If True, the DataView is automatically connected with the main Task context. Default: True. Optional: readonly, which disables DataView changes from the UI. This means that if the DataView is changed in the UI, the change has no effect on the code!

    • queries (list [ FilterRule ] ) – Initial queries (filter rules) for this DataView.

info

DataView access is lazy: the DataView's validity is checked only when an iterator is requested.


RoiQuery#

class RoiQuery(label=None, count_range=None, conf_range=None, must_not=None)

A single query on the dataview.

Method generated by attrs for class DataView.RoiQuery.

  • Return type

    None


label#

label

The label of the ROI. Only ROIs with this label are matched.

Possible values are:

  • A single string - The ROI must have a label equal to this string.

  • A sequence of strings - The ROI must have all the labels in the sequence.

  • A white-space separated list of labels as a single string - The ROI must have all the labels in the list.

  • A Lucene query - The ROI must have labels that match the query.

Examples:

  • None or ‘*’ or ‘’: ROIs with any labels are matched.

  • ‘cat’: selects only frames with ROIs whose labels contain the word ‘cat’.

  • ‘cat AND dog’: selects only frames with ROIs whose labels contain both ‘cat’ and ‘dog’.

  • Type

    list[str] or str or None
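
The non-Lucene label forms above can be illustrated with a small pure-Python sketch. This is not the SDK's implementation: roi_matches is a hypothetical helper, ROI labels are assumed to be plain lists of strings, and Lucene queries such as ‘cat AND dog’ are deliberately not handled.

```python
def roi_matches(query_labels, roi_labels):
    """Return True if an ROI carries every label named in the query.

    Hypothetical helper: covers None / '' / '*', a single string,
    a whitespace-separated string and a sequence of strings.
    Lucene query syntax is out of scope for this sketch.
    """
    if query_labels is None or query_labels in ("", "*"):
        return True  # ROIs with any labels are matched
    if isinstance(query_labels, str):
        query_labels = query_labels.split()  # 'black cat' -> ['black', 'cat']
    return set(query_labels).issubset(roi_labels)


print(roi_matches(["black", "cat"], ["black", "occluded", "cat"]))
print(roi_matches("black cat", ["black", "dog"]))
```

The subset test mirrors the "must have all the labels" rule: extra labels on the ROI do not prevent a match, but a missing one does.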


count_range#

count_range

A count constraint over the query.

If provided, limits query results to frames where the number of ROIs matching the query is within the given range.

  • Type

    tuple(int, int)


conf_range#

conf_range

A confidence filter over the query.

If provided, frames match the query only if they contain ROIs that both match RoiQuery.label and whose confidence is within the given range.

  • Type

    tuple(float, float)
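
As a rough sketch of how label, conf_range and count_range could combine for a single frame, consider the following. Several things here are assumptions rather than documented behavior: the frame_matches helper, the dictionary layout of an ROI ('labels' and 'confidence' keys), a single-label query, and inclusive range bounds.

```python
def frame_matches(rois, label, count_range=None, conf_range=None):
    """Decide whether a frame satisfies a single ROI query (illustrative only).

    rois: list of dicts with 'labels' (list of str) and 'confidence' (float).
    Range bounds are treated as inclusive, which is an assumption.
    """
    matched = 0
    for roi in rois:
        if label not in roi["labels"]:
            continue  # label filter
        if conf_range is not None:
            low, high = conf_range
            if not (low <= roi["confidence"] <= high):
                continue  # confidence filter
        matched += 1
    if count_range is None:
        return matched > 0  # at least one matching ROI
    min_count, max_count = count_range
    return min_count <= matched <= max_count


rois = [
    {"labels": ["cat"], "confidence": 0.9},
    {"labels": ["cat"], "confidence": 0.4},
    {"labels": ["dog"], "confidence": 0.8},
]
print(frame_matches(rois, "cat", count_range=(2, 3)))
print(frame_matches(rois, "cat", count_range=(2, 3), conf_range=(0.5, 1.0)))
```

In the second call only one 'cat' ROI survives the confidence filter, so the count constraint of two to three matches is no longer met.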


must_not#

must_not

If True, negates the entire ROI selection rule: frames that do not match the query will be returned.

  • Type

    bool


is_none#

is_none()


name#

property name

  • Return type

    str

  • Returns

    The DataView name.


DataView.get#

classmethod get(dataview_id=None, dataview_name=None)

Get a previously defined dataview from the server.

  • Parameters

    • dataview_id (str ) – The ID of the dataview.

    • dataview_name (str ) – The name of the dataview.

  • Returns

    A new DataView object for the requested dataview.

  • Return type

    DataView

info

dataview_id and dataview_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.


clone#

clone()

Clone this dataview into a new one.

The new dataview is an SDK-only clone of this dataview; it has no representation in the backend.

  • Returns

    The clone DataView object.

  • Return type

    DataView


add_query#

add_query(dataset_id=None, dataset_name=None, version_id=None, version_name=None, weight=1.0, roi_query=None, roi_count_range=None, roi_conf_range=None, roi_query_must_not=None, frame_query=None, source_query=None, query_object=None)

Add a new query to the dataview.

info

DataView access is lazy: the dataview is actually created only when an iterator is requested.

  • Parameters

    • dataset_id (str ) – The ID of the dataset used as input for this query.

    • dataset_name (str ) – The name of the dataset used as input for this query.

info

dataset_id and dataset_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

  • Parameters

    • version_id (str ) – The ID of the version used as input for this query.

    • version_name (str ) – The name of the version used as input for this query.

      warning

      Version names are not unique. The query is applied to a single version: the last updated version will be selected!

info

version_id and version_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

info

If a dataset version is not specified, the last updated version in the dataset will be selected.

  • Parameters

    • weight (float ) – Weight of the rule, measured relative to other queries. For example, two queries with the same weight will cause the dataview to have the exact same number of frames from each query, regardless of the number of frames in the input version. This is crucial for removing inherent bias in datasets.

    • roi_query (Union [ list [ str ] , str ] ) – A list of labels, or a string representing a single label or a Lucene query. A frame is returned by the query only if it contains an ROI with ALL the labels. If roi_query==DataView.EmptyRois, only frames with no ROIs at all are selected.

      info

      All the ROIs in the frame are returned even if only a single one matches the query.

      An ROI matches the query if it has all the labels in the query.

      Examples:

      • ["black", "cat"] will match ["black", "occluded", "cat"] but will not match ["black", "dog"] or ["cat"].

      • DataView.EmptyRois will only match frames with zero ROIs.

      • "car" will return all the frames with the "car" label.

      • Lucene query: "person OR car" will return all the frames with the "car" or "person" label.

    • frame_query (str ) – Lucene query on the frame (can be mixed with roi query). Note that this is a direct Lucene query: escape any special Lucene character with a backslash (\). Example: ‘(src:directory_name) AND (meta.key:my_value) AND (width:>50)’ will match any frame whose src field contains ‘directory_name’, whose width is over 50px, and whose meta.key contains my_value.

      info

      A query such as 'meta.cat_type:"white cat"' will match frames in which the meta.cat_type field contains "white", "cat" or both. In order to match frames in which meta.cat_type=="white cat", use 'meta.cat_type.keyword:"white cat"'.

    • source_query (str ) – Lucene query on the frame sources (can be mixed with any other query). Note that this is a direct Lucene query: escape any special Lucene character with a backslash (\). Example: ‘(sources.uri:directory_name) AND (sources.preview.uri:https*)’ returns frames whose source URI contains directory_name and whose preview link has an https prefix.

    • roi_count_range (tuple ( int , int ) ) – (min, max) occurrences of the matched item.

    • roi_conf_range (tuple ( float , float ) ) – (min, max) confidence of matched annotation must be in this range.

    • roi_query_must_not (bool ) – If True, negates the ROI query, i.e. returns only frames that do not match the ROI query terms. Example: roi_query=["black", "cat"] with roi_query_must_not=True will match ["black", "occluded", "dog"] but will not match ["black", "partial", "cat"].

    • query_object (Query ) – An instance of dataview.Query storing all the information on a specific rule. If passing query_object, all other fields should be None, as they will be ignored.
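
The effect of weight can be sketched without the SDK. The snippet below is only an illustration of the balancing idea, under the assumption that each query contributes a share of the output proportional to its weight and that a small pool is cycled to fill its share. balanced_sample and the frame-id strings are hypothetical, and the real backend interleaves frames rather than concatenating them.

```python
from itertools import cycle, islice


def balanced_sample(pools, weights, total_frames):
    """Draw total_frames ids so that each pool's share follows its weight.

    pools: one list of frame ids per query (must be non-empty).
    weights: relative weight per query.
    Small pools are repeated (cycled) to fill their share.
    """
    total_weight = sum(weights)
    sample = []
    for pool, weight in zip(pools, weights):
        share = round(total_frames * weight / total_weight)
        sample.extend(islice(cycle(pool), share))
    return sample


large = [f"a{i}" for i in range(100)]  # query with 100 unique frames
small = ["b0", "b1"]                   # query with 2 unique frames
sample = balanced_sample([large, small], weights=[1.0, 1.0], total_frames=200)
print(sum(f.startswith("a") for f in sample), sum(f.startswith("b") for f in sample))
```

With equal weights both queries contribute the same number of frames even though one pool is fifty times larger, which is the bias-removal behavior described for weight.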


add_multi_query#

add_multi_query(dataset_id=None, dataset_name=None, version_id=None, version_name=None, weight=1.0, roi_queries=None, frame_query=None, source_query=None)

Add a new query with multiple label queries to the dataview.

info

DataView access is lazy: the dataview is actually created only when an iterator is requested.

  • Parameters

    • dataset_id (str ) – The ID of the dataset used as input for this query.

    • dataset_name (str ) – The name of the dataset used as input for this query.

info

dataset_id and dataset_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

  • Parameters

    • version_id (str ) – The ID of the version used as input for this query.

    • version_name (str ) – The name of the version used as input for this query.

      warning

      Version names are not unique. The query is applied to a single version: the last updated version will be selected!

info

version_id and version_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

info

If a dataset version is not specified, the last updated version in the dataset will be selected.

  • Parameters

    • weight (float ) – Weight of the rule, measured relative to other queries. For example, two queries with the same weight will cause the dataview to have the exact same number of frames from each query, regardless of the number of frames in the input version. This is crucial for removing inherent bias in datasets.

    • roi_queries (list [ RoiQuery or dict ] ) – A list of RoiQuery objects or dictionaries with ‘label’, ‘count_range’ and ‘conf_range’ keys. Each item in the list is a query on the frame’s ROIs. Only frames that match all queries are returned by this dataview. If roi_queries==DataView.EmptyRois, only frames with no ROIs at all are selected.

    • frame_query (str ) – Lucene query on the frame (can be mixed with roi query). Example: ‘(src:directory_name) AND (meta.key:my_value) AND (width:>50)’ will match any frame whose src field contains ‘directory_name’, whose width is over 50px, and whose meta.key contains my_value.

      info

      A query such as 'meta.cat_type:"white cat"' will match frames in which the meta.cat_type field contains "white", "cat" or both. In order to match frames in which meta.cat_type=="white cat", use 'meta.cat_type.keyword:"white cat"'.

    • source_query (str ) – Lucene query on the frame sources (can be mixed with any other query). Note that this is a direct Lucene query: escape any special Lucene character with a backslash (\). Example: ‘(sources.uri:directory_name) AND (sources.preview.uri:https*)’ returns frames whose source URI contains directory_name and whose preview link has an https prefix.
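
Because frame_query and source_query are passed through as direct Lucene queries, values that contain special characters need escaping before they are embedded in a query string. A minimal sketch follows, assuming the backend follows the classic Lucene query parser's set of special characters. escape_lucene is a hypothetical helper, not part of the SDK, and it is meant for a single field value rather than a whole query.

```python
import re

# Characters with special meaning in the classic Lucene query syntax
# (assumed to be what the backend expects).
_LUCENE_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&/])')


def escape_lucene(value):
    """Prefix every special Lucene character in value with a backslash."""
    return _LUCENE_SPECIAL.sub(r"\\\1", value)


directory = "raw/batch (v1)"  # hypothetical directory name
frame_query = f"src:{escape_lucene(directory)}"
print(frame_query)
```

Escaping only the value keeps the field separator (the colon after src) and any operators such as AND intact.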


add_queries#

add_queries(queries)

Add a list of queries to the dataview.

info

DataView access is lazy: the dataview is actually created only when an iterator is requested.

  • Parameters

    queries (Query ) – A list of Query objects (or a single Query object) representing the new queries to add.

  • Return type

    None


get_queries#

get_queries()

Get this dataview’s queries.

Usage example: Create a new DataView based on the returned queries.

dataview = DataView(queries=source_dataview.get_queries())
  • Returns

    A list of this dataview’s queries.

  • Return type

    list[Query]


get_versions#

get_versions()

Get a list of this dataview’s dataset versions.

The dataview’s versions are the versions specified in all the queries, and the versions registered as inputs to the dataview.

  • Return type

    List[DatasetVersion]

  • Returns

    A list of allegroai.DatasetVersion objects, one for each version of the dataview.


get_datasets#

get_datasets()

Get a list of this dataview’s dataset versions. Equivalent to get_versions().

The dataview’s versions are the versions specified in all the queries, and the versions registered as inputs to the dataview.

  • Return type

    List[DatasetVersion]

  • Returns

    A list of allegroai.DatasetVersion objects, one for each version of the dataview.


add_mapping_rule#

add_mapping_rule(from_labels, to_label, dataset_id=None, dataset_name=None, version_id=None, version_name=None)

Add new mapping to the dataview.

Mapping automatically converts label names to canonical names in the ROIs returned for frames while iterating over the dataview. This is used to make sure that different naming in different datasets will not produce two different classes for the same object.

Example: If one dataset has ROIs with the label ‘pedestrian’ and another has ROIs with the label ‘person’, we can use both in a single dataview to create a person detector by adding a mapping from ‘pedestrian’ to ‘person’.

info

If this DataView was not created from an existing dataview in the server, this function triggers the creation of such a dataview.

info

Label mapping is performed after the frame is matched against the dataview’s queries. For that reason, the queries must be defined according to the dataset’s original labels.

  • Parameters

    • dataset_id (str ) – The ID of the dataset to apply the mapping rule to.

    • dataset_name (str ) – The name of the dataset to apply the mapping rule to.

info

dataset_id and dataset_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

  • Parameters

    • version_id (str ) – The ID of the version to apply the mapping rule to.

    • version_name (str ) – The name of the version to apply the mapping rule to.

info

version_id and version_name are mutually exclusive. Setting both to non-None values will raise a UsageError exception.

info

If a dataset version is not specified, the last updated version in the dataset will be selected.

  • Parameters

    • from_labels (str or list [ str ] ) – Label or a list of labels to map to to_label. The ROI must match all of the labels for the mapping to take place.

    • to_label (str ) – The label that from_labels is mapped to.

  • Return type

    None
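
The mapping semantics can be pictured with a short stand-alone sketch. It assumes, beyond what is stated above, that the matched from_labels are replaced by to_label while any other labels on the ROI are kept. apply_mapping is a hypothetical helper and not the SDK's code.

```python
def apply_mapping(roi_labels, from_labels, to_label):
    """Return the ROI's labels after applying one mapping rule.

    The rule fires only if the ROI has ALL of from_labels; the matched
    labels are then replaced by to_label (replacement is an assumption).
    """
    if isinstance(from_labels, str):
        from_labels = [from_labels]
    if not set(from_labels).issubset(roi_labels):
        return list(roi_labels)  # rule does not apply, labels unchanged
    remaining = [label for label in roi_labels if label not in from_labels]
    return remaining + [to_label]


print(apply_mapping(["pedestrian"], "pedestrian", "person"))
print(apply_mapping(["car"], "pedestrian", "person"))
```

Since mapping is applied after query matching, a query in this scenario would still need to ask for ‘pedestrian’ on the dataset that uses that label.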


set_labels#

set_labels(*label_dicts, **labels)

Set dataview label enumeration.

Label enumeration maps label strings to integers, for later use within the network.

  • Parameters

    • label_dicts (Mapping [ str: int ] ) – Mapping from a label string to its integer representation in the network. e.g. {‘cat’: 0, ‘dog’: 1, ‘hound’: 1}. Several mappings may be passed, and they will be merged together. In case of conflict, the latter one will be the deciding one.

    • labels (int ) – Keyword-argument interface for passing the mapping, e.g. set_labels(cat=0, person=1). May be used in conjunction with label_dicts. In case of conflict, the keyword arguments are preferred.

  • Return type

    None
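
The merge order described for label_dicts and labels can be mimicked in plain Python. merge_label_enums below is a hypothetical stand-in that only reproduces the documented precedence (later dictionaries win, keyword arguments win over dictionaries); it does not touch a DataView.

```python
def merge_label_enums(*label_dicts, **labels):
    """Merge label enumerations with the documented precedence rules."""
    merged = {}
    for mapping in label_dicts:
        merged.update(mapping)  # a later mapping overrides an earlier one
    merged.update(labels)       # keyword arguments are preferred on conflict
    return merged


enum = merge_label_enums({"cat": 0, "dog": 1}, {"hound": 1, "dog": 2}, person=3)
print(enum)  # {'cat': 0, 'dog': 2, 'hound': 1, 'person': 3}
```

Several labels may share one integer (as with ‘dog’ and ‘hound’ in the documented example), which is how distinct label strings collapse into a single network class.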


add_augmentation_affine#

add_augmentation_affine(operations=('bypass', 'scale', 'rotate', 'shear', 'reflect-horiz', 'reflect-vert'), strength=1.0)

Add affine based augmentation instruction to the dataview.

Augmentations are selected by the backend, where randomness is generated and can be reproduced (both in operation selection and in additional parameters).

The actual execution of the augmentation is performed by the worker, while getting the data from the frame using ImageFrame.

  • Parameters

    • operations (Sequence [ Augmentation.Affine ] ) – A sequence of affine operations. One will be picked randomly and uniformly.

    • strength (float ) – Augmentation operation strength. This is how we scale (multiply) the random [0-1] parameters passed to an augmentation action.

  • Returns

    True if augmentation was added to the dataview.

  • Return type

    bool
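
To make the selection rule concrete, here is a small simulation of "pick one operation uniformly, then scale a random [0-1] parameter by strength". It is a sketch only: the real selection happens in the backend, pick_augmentations is a hypothetical name, and using Python's random module with a fixed seed is an assumption chosen to show reproducibility.

```python
import random


def pick_augmentations(operations, count, strength=1.0, seed=1337):
    """Simulate augmentation selection for `count` frames.

    Each pick chooses one operation uniformly at random and draws a
    parameter in [0, 1) that is multiplied by `strength`.
    """
    rng = random.Random(seed)  # fixed seed -> reproducible sequence
    picks = []
    for _ in range(count):
        operation = rng.choice(operations)
        parameter = rng.random() * strength
        picks.append((operation, parameter))
    return picks


ops = ("bypass", "scale", "rotate", "shear", "reflect-horiz", "reflect-vert")
for operation, parameter in pick_augmentations(ops, count=3, strength=0.5):
    print(operation, round(parameter, 3))
```

Running the function twice with the same seed yields the same sequence, which is the property that lets separate training jobs see identical augmentations.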


add_augmentation_pixel#

add_augmentation_pixel(operations=('bypass', 'blur', 'noise', 'recolor'), strength=1.0)

Add pixel based augmentation instruction to the dataview.

Augmentations are selected by the backend, where randomness is generated and can be reproduced (both in operation selection and in additional parameters).

The actual execution of the augmentation is performed by the worker, while getting the data from the frame using ImageFrame.

  • Parameters

    • operations (Sequence [ Augmentation.Pixel ] ) – A sequence of pixel operations. One will be picked randomly and uniformly.

    • strength (float ) – Augmentation operation strength. This is how we scale (multiply) the random [0-1] parameters passed to an augmentation action.

  • Returns

    True if augmentation was added to the dataview.

  • Return type

    bool


add_augmentation_custom#

add_augmentation_custom(operations, strength=1.0, arguments=None)

Add custom augmentation instruction to the dataview.

Augmentations are selected by the backend, where randomness is generated and can be reproduced (both in operation selection and in additional parameters).

The actual execution of the augmentation is performed by the worker, while getting the data from the frame using ImageFrame.

  • Parameters

    • operations (Sequence [ str ] ) – A sequence of custom operation names. Each has to be registered with ImageFrame.register_custom_augmentation. One will be picked randomly and uniformly.

    • strength (float ) – Augmentation operation strength. This is how we scale (multiply) the random [0-1] parameters passed to an augmentation action.

    • arguments (Mapping ( str: Mapping ( str: Any ) ) ) – Arguments for custom operations. For each operation in operations there may be an entry in arguments with a key equal to the operation name. The value of the entry is a mapping from name to any JSON-able value. The entry of the selected operation is returned as-is from the server, i.e. it will not be randomly chosen. This mapping is later accessible from the augmentation code when it is executed.

      from allegroai import ImageFrame, DataView
      from my_augmentation_library import MyCustomAugmentation, MyCoolAugmentation

      # Register each custom operation under the name used in `operations`
      ImageFrame.register_custom_augmentation('custom_aug', MyCustomAugmentation)
      ImageFrame.register_custom_augmentation('cool_aug', MyCoolAugmentation)

      # One entry per operation name; values must be JSON-able
      aug_arguments = {
          'custom_aug': {
              'color': 'blue',
              'count': 42,
          },
          'cool_aug': {
              'vector': [1, 2, 5],
              'verify': True,
          },
      }

      dv = DataView()
      dv.add_augmentation_custom(
          operations=('custom_aug', 'cool_aug'),
          arguments=aug_arguments,
      )
  • Returns

    True if augmentation was added to the dataview.

  • Return type

    bool


set_iteration_parameters#

set_iteration_parameters(order=None, infinite=None, maximum_number_of_frames=None, random_seed=None)

Set dataview general iteration parameters.

  • Parameters

    • order (IterationOrder ) – The order in which frames are iterated on with this DataView.

      None means not-applicable (ignored)

    • infinite (bool ) – If True, the dataview returns frames infinitely (with duplicates).

      None means not-applicable (ignored)

      info

      Duplicate frames would still vary in augmentations, if used, due to the random nature of augmentation operation and parameter selection.

    • maximum_number_of_frames (int or None ) – Limit the total number of frames the dataview returns. Note: zero or negative values mean an ‘unlimited’ number of frames.

      None means not-applicable (ignored)

    • random_seed (int or None ) – Random seed for any randomness needed (e.g. order, augmentation, etc.). Default random seed is fixed for easy reproducibility.

      None means not-applicable (ignored)

  • Returns

    True if at least one of the specified dataview iteration parameters was set

  • Return type

    bool


get_iteration_parameters#

get_iteration_parameters()

Get dataview iteration parameters (includes general and video iteration parameters)

  • Returns

    A dictionary specifying iteration parameters

  • Return type

    dict


set_video_parameters#

set_video_parameters(minimum_time_between_consecutive_frames=0, sequence_minimum_time=0)

Set dataview video specific iteration parameters.

These settings are only relevant to video content (frames sharing the same source uri with different timestamps).

warning

This method resets any unset argument to its default value.

  • Parameters

    • minimum_time_between_consecutive_frames (int ) – When the frames contain a positive timestamp (i.e. video), make sure two consecutive frames are at least minimum_time_between_consecutive_frames milliseconds apart.

    • sequence_minimum_time (int ) – When the frame contains a positive timestamp (i.e. video), expand the selected frame (based on the filters) with enough frames so that we end up with a sequence of at least sequence_minimum_time length (in timebase/milliseconds).

  • Returns

    True if at least one of the specified dataview iteration parameters was set

  • Return type

    bool
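
The effect of minimum_time_between_consecutive_frames can be approximated with a greedy filter over timestamps. This is an illustrative sketch only: enforce_min_gap is a hypothetical helper, timestamps are assumed to be milliseconds from a single video source, and the backend's actual selection strategy is not documented here.

```python
def enforce_min_gap(timestamps_ms, min_gap_ms):
    """Keep timestamps so that consecutive kept frames are >= min_gap_ms apart."""
    kept = []
    for ts in sorted(timestamps_ms):
        # Keep the first frame, then only frames far enough from the last kept one
        if not kept or ts - kept[-1] >= min_gap_ms:
            kept.append(ts)
    return kept


print(enforce_min_gap([0, 33, 66, 100, 133, 200], min_gap_ms=100))  # [0, 100, 200]
```

For roughly 30 fps video, a 100 ms gap keeps about every third frame, which reduces near-duplicate frames in a training batch.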


set_random_seed#

set_random_seed(random_seed)

Set the random seed for this dataview.

The random seed is always fixed so the entire run is reproducible. The default random seed is fixed to 1337.

  • Parameters

    random_seed (int ) – Random seed for any randomness needed (e.g. order, augmentation, etc.).

  • Return type

    None


get_random_seed#

get_random_seed()

Get the random seed for this dataview.

The random seed is always fixed so the entire run is reproducible. The default random seed is fixed to 1337.

  • Returns

    Random seed for any randomness needed (e.g. order, augmentation, etc.).

  • Return type

    int


get_iterator#

get_iterator(query_cache_size=None, query_queue_depth=None, allow_repetition=False, **kwargs)

Get an iterator for this DataView.

The iterator will yield frames from the dataview according to its queries (see add_query and add_multi_query), and its iteration parameters (see set_iteration_parameters and set_video_parameters).

The yielded frames are only the frame’s metadata. If DataView is image-based, you can wrap the SingleFrame/FrameGroup with ImageFrame for builtin augmentation support.

info

Every function call will create a new iterator for the DataView!

info

Iterator length will return the expected number of frames based on the specific queries.

If the DataView infinite flag is set, len(iterator) will return 2^32.

If maximum_number_of_frames was set, len(iterator) will return maximum_number_of_frames.

If allow_repetition is True, a limit on the maximum number of returned frames will be calculated automatically. This limit can be thought of as an entire epoch, as it guarantees that we cover all unique frames in each rule (that said, in some rules we might have repetition as part of the rule ratio balance). len(iterator) will return this synthetic epoch limit, and the iterator will raise StopIteration when reaching this limit (as expected of an iterator).

  • Parameters

    • query_cache_size (int ) – The requested number of metadata frames in every API call to the server. A large value is slower per request, but faster on average for frames returned by the iterator.

    • query_queue_depth (int ) – Number of API request results to store in the return queue. The maximum total number of metadata frames stored in the return queue is query_cache_size * query_queue_depth.

    • allow_repetition (bool ) – If True, the length of the returned iterator is limited to cover all the unique frames from all the different queries of the DataView. For example, given two queries, one with 100 unique frames and the other with 2 unique frames, allow_repetition sets the DataView maximum_number_of_frames to 200 frames and unsets the infinite flag of the DataView. As a result, len(iterator) returns 200 and StopIteration is raised after 200 frames (as expected).

    • kwargs (int ) –

  • Returns

    An iterator over the DataView’s frames.

  • Return type

    Generator((SingleFrame, FrameGroup))
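
The two pieces of arithmetic mentioned above can be written out explicitly. The sketch below reproduces only the documented example (100 and 2 unique frames giving 200) under the assumption that all queries have equal weight. How the real limit accounts for unequal weights is not stated here, and both function names are hypothetical.

```python
def synthetic_epoch_limit(unique_frames_per_query):
    """Epoch length that lets every query cover its unique frames.

    Assumes equal query weights: each query contributes as many frames
    as the largest query has unique frames.
    """
    return max(unique_frames_per_query) * len(unique_frames_per_query)


def max_queued_frames(query_cache_size, query_queue_depth):
    """Upper bound on metadata frames held in the return queue."""
    return query_cache_size * query_queue_depth


print(synthetic_epoch_limit([100, 2]))  # 200, as in the documented example
print(max_queued_frames(query_cache_size=100, query_queue_depth=5))  # 500
```

The second function is a reminder that raising either query_cache_size or query_queue_depth increases how much frame metadata is buffered on the client.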


to_list#

to_list(allow_repetition=False, auto_synthetic_epoch_limit=None)

Get a list of frames for this DataView.

The returned list will hold frames from the DataView according to its queries (see add_query and add_multi_query), and its iteration parameters (see set_iteration_parameters and set_video_parameters).

The yielded frames are only the frame’s metadata (SingleFrame/FrameGroup).

info

Every function call will create a new list of frames for the DataView!

  • Parameters

    • allow_repetition (bool ) – If True, the length of the returned iterator is limited to cover all the unique frames from all the different queries of the DataView. For example, given two queries, one with 100 unique frames and the other with 2 unique frames, allow_repetition sets the DataView maximum_number_of_frames to 200 frames and unsets the infinite flag of the DataView. As a result, len(iterator) returns 200 and StopIteration is raised after 200 frames (as expected).

    • auto_synthetic_epoch_limit (Optional[int]) – deprecated, use allow_repetition instead

  • Returns

    A list of SingleFrame, FrameGroup generated from the DataView’s query.

  • Return type

    list((SingleFrame, FrameGroup))


split_to_lists#

split_to_lists(ratio, allow_repetition=False, seed=42, frame_id_fn=None)

Partition the frames represented by the DataView into non-overlapping lists (e.g. train/validation/test) according to the split weights.

  • It is guaranteed that there is no overlap between the lists.

  • It is guaranteed that frame partitioning is based solely on the weights and the seed, and is independent of the DataView query. This means partitions are consistent.

The resulting frames are uniformly split based on the identifier returned by frame_id_fn(frame), combined with the seed number.

The yielded frames are only the frame’s metadata (SingleFrame/FrameGroup).

info

Every function call will create a new list of frames for the DataView!

  • Parameters

    • ratio (list ( int ) ) – List of weights (integers) for the partitions. The split is based on the ratio of each weight to the total sum of weights. For example, a train/val/test split of 60% / 20% / 20% is achieved with ratio=[3, 1, 1] and will result in three partitions: the first will have 3/5 of the DataView’s frames, the second 1/5, and the third the last 1/5 of the frames in the DataView.

    • allow_repetition (bool ) – If True, the length of the returned iterator is limited to cover all the unique frames from all the different queries of the DataView. For example, given two queries, one with 100 unique frames and the other with 2 unique frames, allow_repetition sets the DataView maximum_number_of_frames to 200 frames and unsets the infinite flag of the DataView. As a result, len(iterator) returns 200 and StopIteration is raised after 200 frames (as expected).

    • seed (int ) – Random number to add to the frame identifier, controlling the randomness in the selection criteria of the frame partitioning

    • frame_id_fn (lambda ) – User-provided frame identifier function used for consistent partitioning of the DataView frames. Default is (lambda frame: frame.id)

  • Returns

    A list of partitions, where each partition is a list of SingleFrame/FrameGroup generated from the DataView’s query.

  • Return type

    list(list((SingleFrame, FrameGroup)))
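
One common way to obtain partitions that depend only on the frame identifier, the weights and the seed is to hash the identifier together with the seed. The sketch below shows that technique, and whether the SDK uses this exact scheme is an assumption. split_by_hash is a hypothetical helper, SHA-256 is an arbitrary choice of hash, and frame ids are plain strings here.

```python
import hashlib


def split_by_hash(frame_ids, ratio, seed=42):
    """Assign each frame id to a partition using a seeded hash.

    ratio: integer weights, e.g. [3, 1, 1] for a 60/20/20 split.
    The assignment of a frame never depends on which other frames exist.
    """
    total = sum(ratio)
    partitions = [[] for _ in ratio]
    for frame_id in frame_ids:
        digest = hashlib.sha256(f"{seed}:{frame_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % total  # uniform bucket in [0, total)
        upper = 0
        for index, weight in enumerate(ratio):
            upper += weight
            if bucket < upper:
                partitions[index].append(frame_id)
                break
    return partitions


frame_ids = [f"frame-{i}" for i in range(1000)]
train, val, test = split_by_hash(frame_ids, ratio=[3, 1, 1])
print(len(train), len(val), len(test))
```

Because each frame is placed independently, narrowing the DataView queries later removes frames from the partitions but never moves a frame from, say, train to test.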


to_json#

to_json(json_file, allow_repetition=False, auto_synthetic_epoch_limit=None)

Store DataView data to a JSON file. The stored DataView frames contain only the metadata.

  • Parameters

    • json_file (Union[str, BytesIO]) – Path (str) or BytesIO object, to store the DataView data to (in JSON format).

    • allow_repetition (bool ) – If True, the length of the returned iterator is limited to cover all the unique frames from all the different queries of the DataView. For example, given two queries, one with 100 unique frames and the other with 2 unique frames, allow_repetition sets the DataView maximum_number_of_frames to 200 frames and unsets the infinite flag of the DataView. As a result, len(iterator) returns 200 and StopIteration is raised after 200 frames (as expected).

    • auto_synthetic_epoch_limit (Optional[int]) – deprecated, use allow_repetition instead

  • Return type

    bool

  • Returns

    True if successful


from_json_to_list#

from_json_to_list(json_file)

Load DataView data from a json file. The returned frames are only the frame’s metadata and of type SingleFrame/FrameGroup.

  • Parameters

    json_file (Union[str, BytesIO]) – Path (str) or BytesIO object, to load the DataView data from (in JSON format).

  • Returns

    A list of SingleFrame, FrameGroup generated from JSON file DataView data.

  • Return type

    list((SingleFrame, FrameGroup))


prefetch_files#

prefetch_files(num_workers=None, wait=False, query_cache_size=None)

Prefetch the DataView’s files (files/URIs pointed to by frames in the DataView), e.g. by calling SingleFrame.get_local_source(), .get_local_preview() and .get_local_mask() on all the frames in the DataView. Pre-fetching is done in the background, and the function call returns immediately.

  • Parameters

    • num_workers (Optional[int]) – Number of workers. If None (default), the number of workers is set to the CPU count.

    • wait (bool) – If True, return only after all files were prefetched to local storage.

    • query_cache_size (Optional[int]) – The requested number of metadata frames in every API call to the server.

  • Return type

    bool

  • Returns

    True if pre-fetching started


get_mapping_rules#

get_mapping_rules()

Get current dataview mappings

Note: If the dataview was not created from an existing dataview, this function triggers the creation of the dataview.

  • Return type

    List[None]

  • Returns

    list of mapping objects


get_labels#

get_labels()

Get dataview label enumeration (i.e. dictionary of label string to id)

Example: {‘person’: 1, ‘background’: 0, ‘pedestrian’: 1}

Notice: every function call will create a new iterator for the dataview!

  • Return type

    Dict[str, int]

  • Returns

    dictionary of string to integer


store#

store(project_name=None, name=None, description=None, tags=None)

Stores the current dataview in the system for future use

  • Parameters

    • project_name (Optional[str]) – project name (string)

    • name (Optional[str]) – dataview name (string)

    • description (Optional[str]) – dataview description (string)

    • tags (Optional[List[str]]) – list tags (strings) for this dataview

  • Return type

    None


connect#

connect(task)

Connect current dataview with a specific task

When running in debug mode (i.e. locally), the task is updated with the dataview object. When running remotely (i.e. from a daemon), the dataview is updated from the task. Notice! When running remotely, the dataview is ignored and loaded from the task object, regardless of the code.

  • Parameters

    task (Task ) – Task object

  • Return type

    None

Raises a ValueError exception if the DataView object is already connected to another task.


get_id#

get_id()

Return the dataview ID in the system (for future use).

  • Return type

    str

  • Returns

    dataview id (string)


has_id#

has_id()

  • Return type

    bool

  • Returns

    True iff the data view is stored under an ID in the server.


get_count#

get_count()

Returns the total number of unique frames returned by this dataview, as well as the number of unique frames returned from each of this dataview’s queries (rules)

  • Return type

    Tuple[int, Sequence[int]]


get_sources#

get_sources()

Returns a list of source URI links from all the frames in the current DataView.

  • Returns

    A list of URI strings.

  • Return type

    Sequence[str]