Tabular data Projects work a little differently from typical Projects in Encord. For each row, Annotators and Reviewers select answers from the options available in designated annotation columns. You can designate more than one column for selection.
Modify the following script example to create your Ontology.
Items of Interest
Notes
READ_ONLY_COLUMNS
Specifies the columns you want your Annotators and Reviewers to see in the Label Editor.
Column numbering starts at 0, so the first column in your CSV is column 0.
Omit the columns in your CSV you do not want your Annotators and Reviewers to see.
ANNOTATION_COLUMNS
Specifies the columns your Annotators and Reviewers use to label data. Your Annotators and Reviewers select answers from a drop-down in these columns.
Specify the options available to Annotators and Reviewers using the files in MAPPING_FIELD_OPTION_PATHS. These are single-column files with one option on each row.
ONTOLOGY_NAME
Specifies the name for your Ontology.
OBJECT_NAME
Specifies the name of the text region for each row in your CSV file. The script applies a label to each row in your CSV file using this text region.
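As a minimal sketch of how the zero-based indices in READ_ONLY_COLUMNS and ANNOTATION_COLUMNS resolve to column names, the snippet below applies the same pandas indexing the script uses to a small in-memory CSV. The sample column names and row are hypothetical stand-ins, not the contents of the real video_game_annotation files.

```python
import io

import pandas as pd

# Hypothetical stand-in for a task CSV; these column names are assumptions.
SAMPLE_CSV = io.StringIO(
    "title,year,publisher,genre,platform\n"
    "Example Game,2020,Example Studio,,\n"
)

READ_ONLY_COLUMNS = [0, 1, 2]   # shown to Annotators and Reviewers
ANNOTATION_COLUMNS = [3, 4]     # answered from a drop-down

csv_df = pd.read_csv(SAMPLE_CSV)

# Positional indices are translated to header names.
readonly_names = csv_df.columns[READ_ONLY_COLUMNS].tolist()
annotation_names = csv_df.columns[ANNOTATION_COLUMNS].tolist()

print(readonly_names)    # ['title', 'year', 'publisher']
print(annotation_names)  # ['genre', 'platform']
```

The names resolved for ANNOTATION_COLUMNS are the keys the script looks up in MAPPING_FIELD_OPTION_PATHS, so each of them needs a matching entry there.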
tabular_create_ontology script
import pandas as pd
from encord.objects import OntologyStructure, Shape, TextAttribute
from encord.objects.attributes import RadioAttribute
from encord.user_client import EncordUserClient

# --- Configuration ---
ENCORD_SSH_KEY = "/Users/chris-encord/ssh-private-key.txt"  # Replace with the file path to your SSH private key
TASK_CSV_PATH = "/file/path/to/video_game_annotation_1.csv"  # Replace with the file path to any of the video_game_annotation_X.csv files
READ_ONLY_COLUMNS = [0, 1, 2]
ANNOTATION_COLUMNS = [3, 4]

# Replace these paths with actual mapping column name > options file
MAPPING_FIELD_OPTION_PATHS = {
    "genre": "/file/path/to/genre-options.csv",
    "platform": "/file/path/to/platform-options.csv",
}

ONTOLOGY_NAME = "E2E - Tabular Data - Ontology"
OBJECT_NAME = "Game Row"


def parse_csv():
    # Translate the configured column indices into column names.
    csv_df = pd.read_csv(TASK_CSV_PATH)
    readonly_columns = csv_df.columns[READ_ONLY_COLUMNS].tolist()
    mapping_columns = csv_df.columns[ANNOTATION_COLUMNS].tolist()
    return mapping_columns, readonly_columns


def create_ontology(text_attribute_names, radio_option_names):
    ontology_structure = OntologyStructure()
    text_object = ontology_structure.add_object(name=OBJECT_NAME, shape=Shape.TEXT)

    # Read-only columns become text attributes on the text object.
    for attribute in text_attribute_names:
        text_object.add_attribute(TextAttribute, attribute)

    # Annotation columns become required radio attributes with one option per row of the options file.
    for column_name in radio_option_names:
        options_path = MAPPING_FIELD_OPTION_PATHS.get(column_name)
        if options_path is None:
            raise ValueError(f"No options file defined for column '{column_name}'")
        options = pd.read_csv(options_path).iloc[:, 0].dropna().astype(str).tolist()
        radio_attribute = text_object.add_attribute(RadioAttribute, column_name, required=True)
        for option in options:
            radio_attribute.add_option(option)

    user_client = EncordUserClient.create_with_ssh_private_key(
        ssh_private_key_path=ENCORD_SSH_KEY,
        domain="https://api.encord.com",
    )
    return user_client.create_ontology(ONTOLOGY_NAME, structure=ontology_structure)


if __name__ == "__main__":
    mapping_columns, readonly_columns = parse_csv()
    ontology = create_ontology(readonly_columns, mapping_columns)
    print(f"Created ontology {ontology.title}, id: {ontology.ontology_hash}")
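The script reads each options file with pd.read_csv, which by default treats the first line of the file as a header. The sketch below isolates that expression in a hypothetical helper, load_options, and runs it on two in-memory files with made-up option values, to show what happens with and without a header line. It is an illustration of pandas behavior, not a statement about how your options files must be formatted.

```python
import io

import pandas as pd


def load_options(source):
    """Hypothetical helper wrapping the expression the script uses to read options."""
    return pd.read_csv(source).iloc[:, 0].dropna().astype(str).tolist()


# Made-up option values, for illustration only.
with_header = io.StringIO("genre\nAction\nPuzzle\nStrategy\n")
without_header = io.StringIO("Action\nPuzzle\nStrategy\n")

options_with_header = load_options(with_header)
options_without_header = load_options(without_header)

print(options_with_header)     # ['Action', 'Puzzle', 'Strategy']
print(options_without_header)  # ['Puzzle', 'Strategy'] (first row consumed as the header)
```

If your options files have no header line, one possible adjustment is to pass header=None to pd.read_csv so the first option is kept.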