Before we can build machine learning and computer vision applications that understand the data they look at, we need to establish exactly what is to be understood and recognized. An ontology, sometimes known as a taxonomy or labeling protocol, does exactly that. Ontologies establish the set of concepts, their relationships, and their representations in our data, and they are the layer of communication that allows us to semantically program our machine learning applications.
Our DICOM customers might be more familiar with the term 'labeling protocol', which is equivalent to an ontology.
Creating an ontology
The tutorial below goes through the basics of creating ontologies.
Intro to ontology structure
Ontologies are hierarchical structures that not only capture the top-level concepts and categories present in your data, but also allow nested attributes for fine-grained differentiation or detailed annotation. At the topmost level, ontologies are composed of classes (sometimes known as categories), the first level of concepts you wish to represent. Ontology classes can be either objects or classifications. Hierarchical attributes are added via nested classifications.
Objects are used in object detection and segmentation, when you want to annotate not only the category but also the location. Classifications (sometimes known as frame-level classifications on Encord) annotate whether or not something is present in a given frame, but don't require localization data. Nested classifications can be placed under objects, classifications, and even other nested classifications to create deeply nested structures capable of modeling complex and even conditional attribute relationships.
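To make the structure above concrete, here is a minimal sketch of an ontology modeled with plain Python dataclasses. It is not the Encord SDK: every class, field, and function name below (`Ontology`, `OntologyObject`, `Classification`, `NestedClassification`, `shape`, `nesting_depth`) is a hypothetical stand-in chosen for illustration, and the example classes ("dog", "weather") are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NestedClassification:
    """Hypothetical attribute that can itself carry further nested attributes."""
    name: str
    options: List[str] = field(default_factory=list)
    children: List["NestedClassification"] = field(default_factory=list)


@dataclass
class OntologyObject:
    """Hypothetical object class: a category plus a location (shape)."""
    name: str
    shape: str  # assumed label for the localization type, e.g. "bounding_box"
    attributes: List[NestedClassification] = field(default_factory=list)


@dataclass
class Classification:
    """Hypothetical frame-level classification: no localization data."""
    name: str
    options: List[str] = field(default_factory=list)
    attributes: List[NestedClassification] = field(default_factory=list)


@dataclass
class Ontology:
    """Top level: classes are either objects or classifications."""
    objects: List[OntologyObject] = field(default_factory=list)
    classifications: List[Classification] = field(default_factory=list)


def nesting_depth(attr: NestedClassification) -> int:
    """Count how many levels of nested classifications sit at or below attr."""
    return 1 + max((nesting_depth(c) for c in attr.children), default=0)


# A nested attribute with one further level beneath it.
injured = NestedClassification(
    name="injured",
    options=["yes", "no"],
    children=[NestedClassification(name="injury location", options=["leg", "ear"])],
)
breed = NestedClassification(name="breed", options=["German Shepherd", "Border Collie"])

dog = OntologyObject(name="dog", shape="bounding_box", attributes=[breed, injured])
weather = Classification(name="weather", options=["sunny", "rainy"])
ontology = Ontology(objects=[dog], classifications=[weather])
```

In this sketch the only structural difference between an object and a classification is the `shape` field, which mirrors the distinction drawn above: objects carry location, classifications do not.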
Considerations when designing ontologies
Creating an ontology is an important prerequisite for building effective machine learning applications. Keep the following in mind when designing your ontologies:
- The Problem Domain: ontologies should be exhaustive, so be sure there is a class or representation for every important concept you want to address. Also consider the level at which to separate concepts. For example, an application focused on recognizing various animals might have top-level classes of "cat" and "dog," whereas one focused on differentiating dog breeds might more appropriately feature "German Shepherd" and "Border Collie" as top-level classes.
- The Team: be sure to frame the classes and attributes in terms that are communicable across your entire team of annotators, reviewers, project managers, algorithm developers, and other stakeholders.
- The Workflow: annotation can be a difficult and time-consuming process. It's important to represent classes and their nested classifications appropriately, but an ontology that also lets objects and scenes be labeled both correctly and quickly leads to a more efficient labeling process.
The diagram below illustrates how the different entities in Encord relate to each other.
- Projects bring together ontologies and datasets.
- A project can have multiple datasets attached to it, but only one ontology.
- One ontology can be attached to multiple projects.
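The relationships listed above can be sketched as a small data model. This is an illustrative assumption written with plain Python dataclasses, not Encord's actual API; the names `Project`, `Dataset`, `Ontology`, and all titles are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ontology:
    title: str


@dataclass
class Dataset:
    title: str


@dataclass
class Project:
    title: str
    ontology: Ontology  # a single field: only one ontology per project
    datasets: List[Dataset] = field(default_factory=list)  # multiple datasets allowed


animals = Ontology("Animals")

# One project with two datasets attached.
detection = Project(
    "Animal detection",
    ontology=animals,
    datasets=[Dataset("Street footage"), Dataset("Park footage")],
)

# A second project reusing the same ontology.
review = Project(
    "Animal review",
    ontology=animals,
    datasets=[Dataset("Shelter footage")],
)
```

Modeling `ontology` as a single field and `datasets` as a list encodes the cardinalities directly: a project cannot hold a second ontology, while the same `Ontology` instance can be referenced from any number of projects.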