Plums-Microlib Data-Model

The Plums data-model module implements the data-model described in Data-Model.

User documentation

The actual data-model composition is exposed here.

It consists of 2 categories of classes:

  • Container classes: These are mainly descriptor classes: they only aggregate instances of other classes in a semantically meaningful way and have no functional purpose of their own.

  • Type classes: They encode actual type information and a functional part which controls the way they are constructed and manipulated.

Container classes

class plums.commons.data.data.DataPoint(tiles, annotation, id=None, **properties)[source]

Bases: plums.commons.data.mixin.PropertyContainer, plums.commons.data.mixin.IdentifiedMixIn

Data model class which aggregates a Tile and an Annotation, as well as additional properties.

Parameters
  • tiles (OrderedDict[Tile]) – The data-point’s tiles as an ordered mapping (See TileCollection).

  • annotation (Annotation) – The data-point’s annotation.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the DataPoint.

id

The instance uuid.

Type

str

tiles

The stored data-point’s tiles as an ordered mapping.

Type

TileCollection

annotation

The stored data-point’s annotation.

Type

Annotation

properties

Properties provided as kwargs in the constructor.

Type

dict

property tile

The first tile in the tiles collection.

Type

Tile

class plums.commons.data.data.Annotation(record_collection, mask_collection=None, id=None, **properties)[source]

Bases: plums.commons.data.mixin.PropertyContainer, plums.commons.data.mixin.IdentifiedMixIn

Data model class which aggregates a RecordCollection and a MaskCollection, as well as additional properties.

Parameters
  • record_collection (RecordCollection) – The annotation’s record collection.

  • mask_collection (MaskCollection) – The annotation’s mask collection.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the Annotation.

id

The instance uuid.

Type

str

record_collection

The stored annotation’s record collection.

Type

RecordCollection

mask_collection

The stored annotation’s mask collection.

Type

MaskCollection

properties

Properties provided as kwargs in the constructor.

Type

dict

class plums.commons.data.tile.TileCollection(*tiles, **named_tiles)[source]

Bases: collections.OrderedDict

An ordered dictionary-like collection of Tile.

It is effectively a subclass of OrderedDict with a friendlier constructor.

Named Tile can be added either as item tuples (in an ordered-dictionary fashion) or as keyword arguments. Note that ordered keyword arguments were introduced in Python 3.6; using them in Python 3.5 would result in a random ordering. To avoid hard-to-track mistakes, a ValueError is raised if one attempts to do so.

Anonymous Tile can be added as positional arguments, in which case a tile_<n> name is generated to fit in the dictionary, where n is the current Tile index position.

Parameters
  • *tiles (Tile, Tuple[str, Tile]) – Either a Tile, in which case a default name is used, or a tuple (name, Tile) where the name will be used as the tile’s key.

  • **named_tiles (Tile) – A Tile to add to the collection, as the keyword’s entry.

Raises
  • ValueError – If attempting to use keyword arguments on Python <= 3.5.

  • TypeError – If any provided tile is not a Tile-like object.

property iloc

Access stored tiles through their insert positions.
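The constructor behaviour described above can be sketched without Plums as a plain OrderedDict subclass. This is an illustration only: NamedCollection is a hypothetical stand-in rather than the library's implementation, plain strings stand in for Tile instances, type checking is omitted, and numbering anonymous entries by their positional index is an assumption.

```python
from collections import OrderedDict


class NamedCollection(OrderedDict):
    """Hypothetical stand-in for an ordered, name-keyed collection."""

    def __init__(self, *items, **named_items):
        super().__init__()
        for index, item in enumerate(items):
            # A (name, value) tuple carries its own key
            if isinstance(item, tuple) and len(item) == 2 and isinstance(item[0], str):
                name, value = item
            else:
                # Anonymous entries get a generated tile_<n> key (assumed scheme)
                name, value = 'tile_{}'.format(index), item
            self[name] = value
        # Keyword arguments keep their call order on Python >= 3.6
        for name, value in named_items.items():
            self[name] = value

    @property
    def iloc(self):
        """Positional access to the stored values, in insertion order."""
        return list(self.values())


collection = NamedCollection('first', ('named', 'second'), extra='third')
print(list(collection))
print(collection.iloc[1])
```

The keys come out in insertion order, so positional and named access refer to the same entries.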

class plums.commons.data.record.RecordCollection(*records, id=None, taxonomy=None)[source]

Bases: plums.commons.data.base.GeoInterfaced

Data model class which aggregates multiple Record together.

It also implements list accessors and append() to easily edit and access the RecordCollection.

Parameters
  • *records (Record) – Record instances to aggregate.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • taxonomy (Taxonomy) – Optional. Default to None. A Taxonomy describing the range of possible values one may expect as labels in the enclosed Record. If not provided, a new implicit “flat” Taxonomy will be constructed on the go.

id

The instance uuid.

Type

str

records

Stored Record instances.

Type

list

property taxonomy

The range of possible label values in the enclosed Record and their relationships.

Warning

The setter iterates over all enclosed Record to check that the proposed Taxonomy is compatible with the RecordCollection. This might be a slow operation.

Raises

ValueError – If trying to set a Taxonomy incompatible with the enclosed records.

New in version 0.2.0.

Type

Taxonomy

get(max_depth=None)[source]

Get Record and cap their labels to a maximum depth.

See also

The Record get_labels() method, which handles the lifting.

Parameters

max_depth (int, dict) –

Optional. Default to None.

  • If an integer is provided, Label fetched through the attached taxonomy will be capped to the provided maximum tree depth.

  • If a dictionary is provided, it must map taxonomy true-roots to a given integer max_depth. Missing true-root will be interpreted as non-capped.

Returns

The Record labels as a tuple of Label.

Return type

(Label, )

Raises

New in version 0.2.0.

append(record)[source]

Append a Record to the stored records list.

Parameters

record (Record) – A Record to append to the collection.

Raises

ValueError – If trying to append a Record incompatible with the enclosed taxonomy.
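The combination of list accessors and a validating append() can be sketched as follows. This is a minimal illustration under stated assumptions: LabelledCollection is hypothetical, records are plain dictionaries with a 'labels' key, and a set of allowed labels stands in for a flat Taxonomy. It is not how Plums implements the check.

```python
class LabelledCollection(object):
    """Hypothetical stand-in: a list-like collection validated against allowed labels."""

    def __init__(self, *records, allowed_labels=None):
        self._records = []
        # With no explicit set, any label is accepted
        self._allowed = None if allowed_labels is None else set(allowed_labels)
        for record in records:
            self.append(record)

    def __len__(self):
        return len(self._records)

    def __getitem__(self, index):
        return self._records[index]

    def append(self, record):
        labels = set(record['labels'])
        # Reject records carrying labels outside the allowed set
        if self._allowed is not None and not labels <= self._allowed:
            raise ValueError('Incompatible labels: {}'.format(sorted(labels - self._allowed)))
        self._records.append(record)


collection = LabelledCollection({'labels': ['car']}, allowed_labels={'car', 'truck'})
collection.append({'labels': ['truck']})
try:
    collection.append({'labels': ['boat']})
except ValueError as error:
    print(error)
```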

to_geojson(style='GeoPaaS')[source]

Implement the object conversion into a valid GeoJSON mapping.

Parameters

style (str) – Either ‘GeoPaaS’ or ‘export-service’. Control the GeoJSON representation properties format.

Returns

The GeoJSON representation of the RecordCollection.

Return type

dict

class plums.commons.data.mask.MaskCollection(*masks)[source]

Bases: object

Data model class which aggregates multiple Mask together.

It also implements handy index and name access to the stored Mask instances.

Examples

>>> rm = RasterMask(image, 'raster-data')
>>> vm = VectorMask([[[0, 0], [0, 1], [1, 1], [0, 0]]], 'vector-data')
>>> mc = MaskCollection(rm, vm)
>>> mc[0] == rm
True
>>> mc[1] == vm
True
>>> mc['vector-data'] == vm
True
>>> mc['raster-data'] == rm
True

Parameters

*masks (Mask) – Mask instances to aggregate.

masks

Stored Mask instances.

Type

tuple

Type classes

Tile classes

class plums.commons.data.tile.Tile(array_data)[source]

Bases: plums.commons.data.base.ArrayInterfaced

Utility class which wraps an ArrayInterfaced and forwards its __array_interface__.

It is not intended to be instantiated as such but rather subclassed (like TileWrapper) or used to check whether a particular instance validates as a Tile.

Because it registers PIL.Image.Image as a virtual subclass, only subclasses of Tile (such as TileWrapper) and Pillow Image instances are considered valid Tile.

Parameters

array_data (ArrayInterfaced) – An ArrayInterfaced instance to wrap.

class plums.commons.data.tile.TileWrapper(array_data, filename=None, **properties)[source]

Bases: plums.commons.data.mixin.PropertyContainer, plums.commons.data.tile.Tile

A wrapper around a Numpy ndarray which forwards its __array_interface__.

It accepts any instance which has an __array_interface__ and a shape property as a valid ndarray (see _Array).

The properties it exposes mimic some Pillow Image properties, which makes this class useful to wrap images opened with other libraries (like OpenCV) and use them wherever one would expect a Pillow Image with useful metadata.

Transformation into an actual Pillow Image can be done with:

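import numpy as np
import PIL.Image


def as_pillow_image(tile_wrapper):
    # Build a Pillow image from the wrapped array, then carry over the metadata
    pillow_image = PIL.Image.fromarray(np.asarray(tile_wrapper))
    pillow_image.filename = tile_wrapper.filename
    pillow_image.info.update(tile_wrapper.info)

    return pillow_image

Parameters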
  • array_data (ndarray) – The image data

  • filename (str) – Optional. Default to None. The filename from where the image was read (if any).

  • **properties (Any) – Additional properties to store alongside the image.

filename

The image filename or None.

Type

str

properties

Properties provided as kwargs in the constructor.

Type

dict

property size

The stored image size as a (width, height) tuple.

Type

tuple

property width

The stored image width.

Type

float

property height

The stored image height.

Type

float

property info

Additional properties stored alongside the image.

Type

dict

property data

The stored image data.

Type

ndarray

Record classes

class plums.commons.data.record.Record(coordinates, labels, confidence=None, id=None, taxonomy=None, **properties)[source]

Bases: plums.commons.data.mixin.PropertyContainer, plums.commons.data.base.GeoInterfaced

Data model class which represents a Record.

It implements the __geo_interface__ and represents itself as a GeoJSON Feature.

Parameters
  • coordinates (list, tuple) – A GeoJSON-valid coordinate sequence describing the Record shape.

  • labels (list, tuple) – The Record labels as a sequence.

  • confidence (float) – Optional. Default to None. A Record confidence score.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the Record.

id

The instance uuid.

Type

str

taxonomy

If not None, a Taxonomy through which labels will be looked up.

Important

The optionally attached taxonomy is only used to fetch labels and any discrepancies between the taxonomy and the stored labels will be silently swallowed unless the Record and its Taxonomy are part of a RecordCollection. This is because a Record by itself is assumed to be context-agnostic and the ability to attach to a taxonomy is an implementation convenience but not a part of the data-model.

New in version 0.2.0.

Type

Taxonomy

coordinates

A GeoJSON-valid coordinate sequence describing the Record shape.

Type

list, tuple

confidence

The Record confidence score.

Type

float

properties

Additional properties stored alongside the Record.

Warning

The properties attribute does not correspond to the GeoJSON representation properties, which also include the labels and the confidence score.

Type

dict

property labels

The Record labels as a tuple of Label.

If a Taxonomy is attached to the Record, the Label returned are fetched through the attached Taxonomy.

Changed in version 0.2.0.

Type

(Label, )

property type

Either “Point” or “Polygon”, it is computed according to the coordinates structure.

Type

str
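One way to derive a geometry type from the coordinates structure is to look at the nesting depth: in GeoJSON, a Point is a flat position and a Polygon is a list of linear rings, each a list of positions. The helper below, infer_geometry_type, is a hypothetical sketch of that idea and not the rule Plums actually applies.

```python
def infer_geometry_type(coordinates):
    """Guess a GeoJSON geometry type from coordinate nesting (hypothetical helper)."""
    depth = 0
    element = coordinates
    # Walk down the first element of each level to measure the nesting depth
    while isinstance(element, (list, tuple)):
        if not element:
            raise ValueError('Empty coordinate sequence')
        depth += 1
        element = element[0]
    if depth == 1:
        return 'Point'
    if depth == 3:
        return 'Polygon'
    raise ValueError('Unsupported coordinate nesting depth: {}'.format(depth))


print(infer_geometry_type([2.35, 48.85]))
print(infer_geometry_type([[[0, 0], [0, 1], [1, 1], [0, 0]]]))
```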

get_labels(max_depth=None)[source]

Get the record labels and optionally cap them to a maximum depth.

Hint

How max_depth is interpreted depends on its type:

  • If an integer, the whole Taxonomy is taken into account and 0 corresponds to __root__ whereas 1 corresponds to the Taxonomy true-roots.

  • If a dictionary, the corresponding true-root sub-trees are taken into account and 0 corresponds to the true-root whereas 1 is the first level underneath it.

Parameters

max_depth (int, dict) –

Optional. Default to None.

  • If an integer is provided, Label fetched through the attached taxonomy will be capped to the provided maximum tree depth.

  • If a dictionary is provided, it must map taxonomy true-roots to a given integer max_depth. Missing true-root will be interpreted as non-capped.

Returns

The Record labels as a tuple of Label.

Return type

(Label, )

Raises

ValueError – If max_depth is provided although taxonomy is None.

New in version 0.2.0.
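The integer case of max_depth can be illustrated with a toy taxonomy stored as a child-to-parent mapping. This is a sketch of one plausible reading of the depth convention above (0 is __root__, 1 is a true-root). The PARENTS mapping and both helpers are hypothetical, and the dictionary form of max_depth is not covered.

```python
# Hypothetical taxonomy as a child -> parent mapping; '__root__' is the
# implicit root and 'vehicle' is a true-root.
PARENTS = {
    'vehicle': '__root__',
    'car': 'vehicle',
    'truck': 'vehicle',
    'pickup': 'truck',
}


def ancestry(label):
    """Return the path from '__root__' down to label, inclusive."""
    path = [label]
    while path[-1] != '__root__':
        path.append(PARENTS[path[-1]])
    return path[::-1]


def cap_label(label, max_depth=None):
    """Replace label by its ancestor at max_depth if it sits deeper in the tree."""
    path = ancestry(label)
    # The label's own depth is its index in the path
    if max_depth is None or max_depth >= len(path) - 1:
        return label
    return path[max_depth]


print(cap_label('pickup', 1))
print(cap_label('pickup', 2))
print(cap_label('car', 5))
```

Labels that already sit at or above the requested depth are returned unchanged, so capping never moves a label further down the tree.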

to_geojson(style='GeoPaaS')[source]

Implement the object conversion into a valid GeoJSON mapping.

Parameters

style (str) – Either ‘GeoPaaS’ or ‘export-service’. Control the GeoJSON representation properties format.

Returns

The GeoJSON representation of the Record.

Return type

dict
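For readers unfamiliar with the GeoJSON Feature layout, the sketch below builds such a mapping by hand for a polygon record. The function record_feature and the 'labels' and 'confidence' property keys are assumptions made for illustration. The sketch does not reproduce the 'GeoPaaS' or 'export-service' property formats.

```python
import json


def record_feature(coordinates, labels, confidence=None, **properties):
    """Build a GeoJSON Feature mapping for a polygon record (hypothetical layout)."""
    feature_properties = dict(properties)
    # Assumed key names: labels and confidence travel with the other properties
    feature_properties['labels'] = list(labels)
    feature_properties['confidence'] = confidence
    return {
        'type': 'Feature',
        'geometry': {'type': 'Polygon', 'coordinates': coordinates},
        'properties': feature_properties,
    }


feature = record_feature([[[0, 0], [0, 1], [1, 1], [0, 0]]], ('car',), 0.9, source='demo')
print(json.dumps(feature, sort_keys=True))
```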

Mask classes

class plums.commons.data.mask.VectorMask(coordinates, name, id=None, **properties)[source]

Bases: plums.commons.data.base.GeoInterfaced, plums.commons.data.mask.Mask

Data model class which represents a VectorMask.

It implements the __geo_interface__ and represents itself as a GeoJSON Feature.

Parameters
  • coordinates (list, tuple) – A GeoJSON-valid coordinate sequence describing the VectorMask shape.

  • name (str) – The VectorMask name.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the VectorMask.

name

The Mask name.

Type

str

id

The instance uuid.

Type

str

coordinates

A GeoJSON-valid coordinate sequence describing the VectorMask shape.

Type

list, tuple

properties

Properties provided as kwargs in the constructor.

Type

dict

to_geojson()[source]

Implement the object conversion into a valid GeoJSON mapping.

Returns

The GeoJSON representation of the VectorMask.

Return type

dict

class plums.commons.data.mask.RasterMask(data, name, id=None, **properties)[source]

Bases: plums.commons.data.mask.Mask

Data model class which represents a RasterMask.

It forwards the stored array’s __array_interface__ and exposes useful properties in a similar fashion to TileWrapper.

Parameters
  • data (ndarray) – The RasterMask raster data.

  • name (str) – The RasterMask name.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the RasterMask.

name

The Mask name.

Type

str

id

The instance uuid.

Type

str

properties

Properties provided as kwargs in the constructor.

Type

dict

property size

The stored mask size as a (width, height) tuple.

Type

tuple

property width

The stored mask width.

Type

float

property height

The stored mask height.

Type

float

property data

The stored mask data.

Type

ndarray

Developer documentation

Internal classes used for interface checking and semantic typing, along with base classes for the data-model implementation.

class plums.commons.data.mixin.PropertyContainer(*args, **properties)[source]

Bases: plums.commons.data.mixin.SlottedDict

Utility class which swallows every keyword argument provided and exposes them as attributes.

Parameters

**properties (Any) – Properties which are stored in properties and exposed as attributes.

property properties

Properties provided as kwargs in the constructor.

Type

dict
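The idea of storing arbitrary keyword arguments and exposing them as attributes can be sketched in a few lines. KeywordBag is a hypothetical stand-in that ignores the slots machinery entirely, and the real class may well be implemented differently.

```python
class KeywordBag(object):
    """Hypothetical stand-in: store keyword arguments and expose them as attributes."""

    def __init__(self, **properties):
        self._properties = dict(properties)

    @property
    def properties(self):
        return self._properties

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self.__dict__['_properties'][name]
        except KeyError:
            raise AttributeError(name)


bag = KeywordBag(sensor='S2', cloud_cover=0.1)
print(bag.sensor)
print(bag.properties)
```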

class plums.commons.data.base.ArrayInterfaced[source]

Bases: object

Abstract base class which checks for the __array_interface__ property.

Implement a subclass hook to check for the presence of the __array_interface__ property and mark as a virtual subclass of ArrayInterfaced classes which implement the interface.
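The subclass-hook mechanism itself is standard Python: an abstract base class can override __subclasshook__ so that isinstance() succeeds for any class defining a given attribute. The sketch below shows the general technique with a hypothetical HasArrayInterface class. It is not the Plums source.

```python
from abc import ABCMeta


class HasArrayInterface(metaclass=ABCMeta):
    """Hypothetical ABC: any class defining __array_interface__ qualifies."""

    @classmethod
    def __subclasshook__(cls, subclass):
        if cls is HasArrayInterface:
            # Look for the attribute anywhere in the candidate's MRO
            if any('__array_interface__' in vars(base) for base in subclass.__mro__):
                return True
        # Defer to the regular inheritance and registration checks
        return NotImplemented


class Mock(object):
    @property
    def __array_interface__(self):
        return None


print(isinstance(Mock(), HasArrayInterface))
print(isinstance(object(), HasArrayInterface))
```

The hook only checks that the attribute exists, which is why a class with an invalid __array_interface__ still passes the test.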

Examples

>>> class MockArrayInterfaced(object):
...     @property
...     def __array_interface__(self):
...         # This passes the inheritance test although this
...         # __array_interface__ implementation is invalid
...         return None
>>> isinstance(MockArrayInterfaced(), ArrayInterfaced)
True

class plums.commons.data.base.GeoInterfaced(*args, **kwargs)[source]

Bases: plums.commons.data.mixin.IdentifiedMixIn

Abstract class which checks for the __geo_interface__ property and provides a __geo_interface__.

Implement a subclass hook to check for the presence of the __geo_interface__ property and mark as a virtual subclass of GeoInterfaced classes which implement the interface.

Parameters

id (str) – Optional. Default to a random UUID4. An id to store along the instance.

Examples

>>> class MockGeoInterfaced(object):
...     @property
...     def __geo_interface__(self):
...         # This actually is a valid GeoJSON mapping
...         return {'type': 'FeatureCollection', 'features': []}
>>> isinstance(MockGeoInterfaced(), GeoInterfaced)
True

id

The instance uuid.

Type

str

property is_valid

Return True if the __geo_interface__ returns a valid GeoJSON object.

Type

bool

abstract to_geojson()[source]

Abstract method which implements the object conversion into a valid GeoJSON mapping.

Subclasses must override this method in order to be instantiable.

Returns

The GeoJSON representation of the GeoInterfaced.

Return type

dict

class plums.commons.data.Mask(name, id=None, **properties)[source]

Bases: plums.commons.data.mixin.PropertyContainer, plums.commons.data.mixin.IdentifiedMixIn

Utility class which implements a generic template of a Mask.

It is not intended to be instantiated as such but rather subclassed (like VectorMask or RasterMask) or used for type-testing.

Parameters
  • name (str) – The Mask name.

  • id (str) – Optional. Default to a random UUID4. An id to store along the instance.

  • **properties (Any) – Additional properties to store alongside the Mask.

name

The Mask name.

Type

str

id

The instance uuid.

Type

str

properties

Properties provided as kwargs in the constructor.

Type

dict

Implementation helpers

class plums.commons.data.mixin.SlottedDictMeta(name, bases, namespace, **kwargs)[source]

Bases: abc.ABCMeta

Add the attribute __all_slots__ to a class.

__all_slots__ is a set that contains all unique slots of a class, including the ones that are inherited from parents.
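Collecting the slots of a class and all its parents can be done by walking the method resolution order at class-creation time. The metaclass below, AllSlotsMeta, is a hypothetical sketch of that approach under the assumption that __all_slots__ is computed once when the class is built.

```python
from abc import ABCMeta


class AllSlotsMeta(ABCMeta):
    """Hypothetical metaclass collecting slots across the whole MRO."""

    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        all_slots = set()
        for klass in cls.__mro__:
            slots = vars(klass).get('__slots__', ())
            # A bare string declares a single slot
            if isinstance(slots, str):
                slots = (slots,)
            all_slots.update(slots)
        cls.__all_slots__ = all_slots
        return cls


class Base(metaclass=AllSlotsMeta):
    __slots__ = ('id',)


class Child(Base):
    __slots__ = ('name',)


print(sorted(Child.__all_slots__))
```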

class plums.commons.data.mixin.SlottedDict[source]

Bases: object

Base class which enables both __slots__ and __dict__ for mix-in classes.

class plums.commons.data.mixin.IdentifiedMixIn(*args, **kwargs)[source]

Bases: plums.commons.data.mixin.SlottedDict

Mix-in class which adds a unique identifier and the ability to provide it manually in the constructor.

Parameters

id (str) – Optional. Default to a random UUID4. An id to store along the instance.

id

The instance uuid.

Type

str
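The default-id behaviour amounts to falling back on uuid.uuid4() when no id is given. A minimal sketch, using a hypothetical Identified class and assuming the identifier is stored as a string:

```python
import uuid


class Identified(object):
    """Hypothetical stand-in: keep a caller-supplied id or generate a UUID4."""

    def __init__(self, id=None):
        # Fall back on a random UUID4 string when no id is supplied
        self.id = str(uuid.uuid4()) if id is None else id


explicit = Identified(id='tile-0001')
generated = Identified()
print(explicit.id)
print(uuid.UUID(generated.id).version)
```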

class plums.commons.data.base._Array[source]

Bases: object

Abstract base class which checks for both the __array_interface__ and the shape properties.

Implement a subclass hook to check for the presence of the __array_interface__ and the shape properties and mark as a virtual subclass of _Array classes which implement the interface.

Examples

>>> class MockArray(object):
...     @property
...     def __array_interface__(self):
...         # This passes the inheritance test although this
...         # __array_interface__ implementation is invalid
...         return None
...
...     @property
...     def shape(self):
...         # This passes the inheritance test although this
...         # shape implementation is invalid
...         return None
>>> isinstance(MockArray(), _Array)
True