Quickstart

The Plums data model describes a common descriptor format, with Python representations for commonly used objects.

Let’s look at a basic example in which we construct a simple DataPoint from an empty Tile and a few Record instances that reference a Taxonomy:

import numpy as np
from plums.commons.data import DataPoint, Annotation, TileWrapper, RecordCollection, Record
from plums.commons.data.taxonomy import Taxonomy, Label

# Make a taxonomy
breakfast = Label('breakfast item')
eggs = Label('eggs', parent=breakfast)
fried = Label('fried', parent=eggs)
scrambled = Label('scrambled', parent=eggs)
meat = Label('meat', parent=breakfast)
bacon = Label('bacon', parent=meat)
spam = Label('spam', parent=meat)

taxonomy = Taxonomy(breakfast)

# Make a record collection
first = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('eggs', ))
second = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('bacon', ))
third = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('scrambled', ))
fourth = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('meat', ))
fifth = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('eggs', ))
last = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('spam', ),
              spam=('spam', 'spam'),
              beautiful='spam')

record_collection = RecordCollection(first, second, third, fourth, fifth, last)
record_collection.taxonomy = taxonomy

# Make a tile
tile = TileWrapper(np.zeros((100, 100, 3)))

# Make an annotation
annotation = Annotation(record_collection)

# Make a data point
data_point = DataPoint(tile, annotation)

Code breakdown

Make a taxonomy

The first step is to construct a Taxonomy using the Taxonomy API:

from plums.commons.data.taxonomy import Taxonomy, Label

breakfast = Label('breakfast item')
eggs = Label('eggs', parent=breakfast)
fried = Label('fried', parent=eggs)
scrambled = Label('scrambled', parent=eggs)
meat = Label('meat', parent=breakfast)
bacon = Label('bacon', parent=meat)
spam = Label('spam', parent=meat)

taxonomy = Taxonomy(breakfast)

Let’s break that down quickly.

At the base of a Taxonomy is a set of Label instances on which we declare hierarchical relationships, here through the parent keyword argument of the Label constructor, as in:

eggs = Label('eggs', parent=breakfast)

Once linked, these Label instances implicitly define a label tree, so we can define a Taxonomy, which is a special kind of tree, with the last line:

taxonomy = Taxonomy(breakfast)

We can print it to check that it was correctly created:

>>> print(taxonomy)
╰── breakfast item
    ├── eggs
    │   ├── fried
    │   ╰── scrambled
    ╰── meat
        ├── bacon
        ╰── spam
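
To make the idea of parent links implicitly defining a tree more concrete, here is a minimal sketch in plain Python. SketchLabel and render are hypothetical stand-ins written for this illustration, not the plums classes, and the rendering only imitates the printed output shown above:

```python
class SketchLabel:
    """Hypothetical stand-in for a label; NOT the plums Label class."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            # Registering with the parent is what implicitly builds the tree.
            parent.children.append(self)


def render(labels, prefix=""):
    """Return the lines of a box-drawing view of the given sibling labels."""
    lines = []
    for index, label in enumerate(labels):
        is_last = index == len(labels) - 1
        lines.append(prefix + ("╰── " if is_last else "├── ") + label.name)
        # Children are indented; a vertical bar is kept while siblings remain.
        lines.extend(render(label.children, prefix + ("    " if is_last else "│   ")))
    return lines


breakfast = SketchLabel('breakfast item')
eggs = SketchLabel('eggs', parent=breakfast)
fried = SketchLabel('fried', parent=eggs)
scrambled = SketchLabel('scrambled', parent=eggs)
meat = SketchLabel('meat', parent=breakfast)
bacon = SketchLabel('bacon', parent=meat)
spam = SketchLabel('spam', parent=meat)

tree_lines = render([breakfast])
print("\n".join(tree_lines))
```

No tree object is ever built explicitly here: passing parent to each label is enough to recover the whole hierarchy from the root.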

Make a record collection

The next step is to create several Record instances and store them in a RecordCollection:

from plums.commons.data import RecordCollection, Record

first = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('eggs', ))
second = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('bacon', ))
third = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('scrambled', ))
fourth = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('meat', ))
fifth = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('eggs', ))
last = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('spam', ),
              spam=('spam', 'spam'),
              beautiful='spam')

record_collection = RecordCollection(first, second, third, fourth, fifth, last)

Let’s break that down.

We begin by creating a few Record instances that share the same basic coordinates and carry one label each, for example:

first = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('eggs', ))
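
The nested coordinate list reads like a GeoJSON-style polygon: a list of rings, each ring a closed list of [x, y] points. That reading is an assumption made for this sketch, and the helpers ring_is_closed and ring_area are hypothetical names rather than part of plums. Under that assumption, the example coordinates describe a unit square:

```python
def ring_is_closed(ring):
    """A ring is closed when it has at least 4 points and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_area(ring):
    """Unsigned area of a closed ring, computed with the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Same coordinates as in the records above: one polygon with a single ring.
coordinates = [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]
exterior = coordinates[0]

print(ring_is_closed(exterior))  # True
print(ring_area(exterior))       # 1.0
```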

Note that a Record may carry arbitrary properties, which are accessible as attributes. For example, we added properties to the last record:

last = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('spam', ),
              spam=('spam', 'spam'),
              beautiful='spam')
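
One common way to expose keyword arguments as attributes is sketched below. SketchRecord is a hypothetical stand-in written for this illustration; it is not the plums Record class and says nothing about how plums implements the feature:

```python
class SketchRecord:
    """Hypothetical stand-in; NOT the plums Record class."""

    def __init__(self, coordinates, labels, **properties):
        self.coordinates = coordinates
        self.labels = tuple(labels)
        # Extra keyword arguments are kept together in one dictionary.
        self.properties = dict(properties)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so the regular
        # attributes set in __init__ are never shadowed.
        try:
            return self.__dict__['properties'][name]
        except KeyError:
            raise AttributeError(name) from None


last = SketchRecord([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('spam', ),
                    spam=('spam', 'spam'),
                    beautiful='spam')

print(last.spam)       # ('spam', 'spam')
print(last.beautiful)  # spam
```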

The RecordCollection is what we call a container class: it gives us a compact way to hold a set of Record instances, with convenient access and extension mechanisms.

Moreover, a RecordCollection is context-aware, in the sense that we can attach a Taxonomy to it and make use of the relationships it defines.

So far, the RecordCollection has constructed a flat, non-informative Taxonomy from its records:

>>> print(record_collection.taxonomy)
├── eggs
├── bacon
├── scrambled
├── meat
╰── spam
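
The flat listing above is consistent with collecting each distinct label once, in order of first appearance. The sketch below reproduces that ordering in plain Python; how plums actually derives the default taxonomy is not stated here, so flat_label_names is a hypothetical helper and the approach is an assumption:

```python
# Label tuples of the six records, in insertion order.
record_labels = [('eggs', ), ('bacon', ), ('scrambled', ),
                 ('meat', ), ('eggs', ), ('spam', )]


def flat_label_names(label_tuples):
    """Distinct label names, keeping the order in which they first appear."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(name for labels in label_tuples for name in labels))


flat = flat_label_names(record_labels)
print(flat)  # ['eggs', 'bacon', 'scrambled', 'meat', 'spam']
```

Such a flat list carries no parent-child information, which is why it is of little use until a real taxonomy is attached.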

To go further than that, let’s attach the taxonomy from earlier:

record_collection.taxonomy = taxonomy

This lets us retrieve Record instances with labels resolved “up to a certain depth”, for example:

>>> print(record_collection[1])
bacon
>>> print(record_collection.get(max_depth=2)[1])
meat
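
The depth behaviour can be pictured as walking up the tree until the label is shallow enough. The sketch below is plain Python under two assumptions: the root label 'breakfast item' sits at depth 1 (chosen because it matches the output above), and the helpers depth and clip_to_depth are hypothetical names that are not part of plums:

```python
# Parent of each label in the breakfast taxonomy (None marks the root label).
PARENTS = {
    'breakfast item': None,
    'eggs': 'breakfast item',
    'fried': 'eggs',
    'scrambled': 'eggs',
    'meat': 'breakfast item',
    'bacon': 'meat',
    'spam': 'meat',
}


def depth(name):
    """Depth of a label, counting the root label as depth 1 (assumption)."""
    count = 1
    while PARENTS[name] is not None:
        name = PARENTS[name]
        count += 1
    return count


def clip_to_depth(name, max_depth):
    """Replace a label by its closest ancestor that is at most max_depth deep."""
    if max_depth < 1:
        raise ValueError('max_depth must be at least 1')
    while depth(name) > max_depth:
        name = PARENTS[name]
    return name


print(clip_to_depth('bacon', 2))      # meat
print(clip_to_depth('eggs', 2))       # eggs
print(clip_to_depth('scrambled', 1))  # breakfast item
```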

Or enforcing a set of known Label instances, with automatic validation on every modification:

>>> invalid_record = Record([[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]], ('sausages', ))
>>> record_collection.append(invalid_record)
Traceback (most recent call last):
  File "plums/commons/data/taxonomy/__init__.py", line 183, in validate
    if len(set(labels) & viewkeys(self._label.descendants)) != len(labels):
ValueError: Invalid label tuple: {Label(name=sausages)} are not part of the taxonomy.
>>> record_collection[1] = invalid_record
Traceback (most recent call last):
  File "plums/commons/data/taxonomy/__init__.py", line 183, in validate
    if len(set(labels) & viewkeys(self._label.descendants)) != len(labels):
ValueError: Invalid label tuple: {Label(name=sausages)} are not part of the taxonomy.
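
The pattern at work, validating on every mutation so the container can never hold an unknown label, can be sketched in plain Python. SketchCollection and validate are hypothetical stand-ins for this illustration and do not reflect the plums implementation; only the use of ValueError is taken from the tracebacks above:

```python
KNOWN_LABELS = frozenset(['breakfast item', 'eggs', 'fried', 'scrambled',
                          'meat', 'bacon', 'spam'])


def validate(labels):
    """Raise ValueError if any label is missing from the known set."""
    unknown = set(labels) - KNOWN_LABELS
    if unknown:
        raise ValueError('Invalid label tuple: {} are not part of the '
                         'taxonomy.'.format(sorted(unknown)))


class SketchCollection:
    """Hypothetical stand-in that validates labels on every modification."""

    def __init__(self, *label_tuples):
        self._items = []
        for labels in label_tuples:
            self.append(labels)

    def append(self, labels):
        validate(labels)  # Checked before the container is changed.
        self._items.append(tuple(labels))

    def __setitem__(self, index, labels):
        validate(labels)  # Item assignment goes through the same check.
        self._items[index] = tuple(labels)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


collection = SketchCollection(('eggs', ), ('bacon', ))

try:
    collection.append(('sausages', ))
    rejected = False
except ValueError as error:
    rejected = True
    print(error)
```

Because the check runs before the underlying list is touched, a rejected record leaves the collection exactly as it was.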

Make a tile, an annotation and a data point

From here on, making a DataPoint is rather straightforward, as it mainly involves container classes.

We have to build an Annotation from our RecordCollection:

from plums.commons.data import Annotation

annotation = Annotation(record_collection)

Then we build a dummy empty Tile with numpy and the TileWrapper utility class:

import numpy as np
from plums.commons.data import TileWrapper

tile = TileWrapper(np.zeros((100, 100, 3)))

A DataPoint is a container class for a (Tile, Annotation) pair, and it is straightforward to construct:

from plums.commons.data import DataPoint

data_point = DataPoint(tile, annotation)