Delivered
- Definition of an ontology and namespace describing the EMO BON data model
- Development of sampling-event (meta)data validation (using the Pydantic framework) and quality-control procedures based on explicit rules and GitHub Actions (see the validation sketch after this list)
- Procedures and software to perform semantic uplift of EMO BON data to RDF triples using the data model (see the rdflib sketch after this list)
- RO-crate specification and generation software for the metaGOflow data products (see the RO-crate packaging sketch after this list)
- Jupyter notebooks with data visualisations of taxonomic and genetic results, such as alpha and beta diversities, a workflow for biosynthetic gene cluster identification using software on the Galaxy back-end via its API, and others (see the diversity sketch after this list)
- PyPI-distributed marine-omics-methods package of reusable methods implemented for the Jupyter notebooks
- Metadata RDF triple generation
- DCAT catalogue templating of descriptions of EMO BON data assets, i.e. the IDDAS ingestion framework for the metadata triple store
- Storage solution for metaGOflow (MGF) data products: an S3 object store on the testbed infrastructure
- IDDAS ingestion framework for the metadata triple database and the MGF data product RO-crates
- RO-crate viewer to browse and download specific files from the sample and MGF RO-crates
- Incorporation of our RO-crates and triple store into the Uniform Data Access Layer (UDAL) code, providing named queries
- Initial VRE construction on the BlueCloud platform
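
The sketches below illustrate a few of the delivered components. First, sampling-event (meta)data validation with Pydantic (v2 API): a minimal model with assumed field names and rules, not the actual EMO BON logsheet schema, of the kind a GitHub Actions workflow can run over every new logsheet row.

```python
"""Minimal sketch of sampling-event metadata validation with Pydantic.
Field names, ranges and rules are illustrative assumptions, not the
actual EMO BON logsheet schema."""
from datetime import date
from typing import Optional

from pydantic import BaseModel, Field, field_validator


class SamplingEvent(BaseModel):
    """One row of a sampling-event logsheet (hypothetical fields)."""
    source_mat_id: str = Field(..., min_length=1)   # sample identifier
    collection_date: date                           # ISO 8601 date
    latitude: float = Field(..., ge=-90, le=90)     # decimal degrees
    longitude: float = Field(..., ge=-180, le=180)  # decimal degrees
    depth_m: Optional[float] = Field(None, ge=0)    # sampling depth in metres
    environment: str                                # habitat descriptor

    @field_validator("environment")
    @classmethod
    def environment_not_blank(cls, value: str) -> str:
        if not value.strip():
            raise ValueError("environment must not be blank")
        return value


if __name__ == "__main__":
    # A GitHub Actions job could validate every new record and fail the
    # workflow when ValidationError is raised.
    record = SamplingEvent(
        source_mat_id="EMOBON_XYZ_Wa_1",
        collection_date="2023-06-15",
        latitude=43.5,
        longitude=16.4,
        depth_m=2.0,
        environment="coastal surface water",
    )
    print(record.model_dump_json(indent=2))
```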
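Second, semantic uplift of a validated record to RDF triples, sketched with rdflib. The emobon namespace URI and property names are placeholders, not the published EMO BON ontology terms.

```python
"""Sketch of semantic uplift of a sampling-event record to RDF triples
with rdflib. Namespace and property names are placeholders."""
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

EMOBON = Namespace("https://example.org/emobon/")  # placeholder namespace


def uplift_event(event: dict) -> Graph:
    """Map one sampling-event record (as a dict) to RDF triples."""
    g = Graph()
    g.bind("emobon", EMOBON)
    subject = EMOBON[event["source_mat_id"]]
    g.add((subject, RDF.type, EMOBON.SamplingEvent))
    g.add((subject, EMOBON.collectionDate,
           Literal(event["collection_date"], datatype=XSD.date)))
    g.add((subject, EMOBON.latitude,
           Literal(event["latitude"], datatype=XSD.decimal)))
    g.add((subject, EMOBON.longitude,
           Literal(event["longitude"], datatype=XSD.decimal)))
    return g


if __name__ == "__main__":
    graph = uplift_event({
        "source_mat_id": "EMOBON_XYZ_Wa_1",
        "collection_date": "2023-06-15",
        "latitude": 43.5,
        "longitude": 16.4,
    })
    print(graph.serialize(format="turtle"))
```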
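Third, RO-crate packaging of metaGOflow data products, sketched here with the ro-crate-py library. File names and descriptive properties are illustrative; the project's own generation software and crate profile may differ.

```python
"""Sketch of packaging a metaGOflow output as an RO-Crate with ro-crate-py.
File names and properties are illustrative placeholders."""
from pathlib import Path

from rocrate.rocrate import ROCrate

# Create a small illustrative payload file so the example runs end to end.
Path("results").mkdir(exist_ok=True)
Path("results/taxonomy-summary.tsv").write_text("taxon\tcount\nBacteria\t123\n")

crate = ROCrate()
crate.root_dataset["name"] = "metaGOflow data products for sample EMOBON_XYZ_Wa_1"
crate.root_dataset["description"] = "Taxonomic and functional annotation outputs."

# Attach a data product with basic descriptive metadata.
crate.add_file(
    "results/taxonomy-summary.tsv",  # hypothetical output file
    properties={
        "name": "Taxonomic summary",
        "encodingFormat": "text/tab-separated-values",
    },
)

# Serialise ro-crate-metadata.json plus the payload files to a directory
# (crate.write() in recent ro-crate-py releases).
crate.write("EMOBON_XYZ_Wa_1-mgf-crate")
```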
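Finally, the kind of alpha-diversity calculation the notebooks perform. The notebooks themselves call the marine-omics-methods package; this standalone Shannon-index sketch with pandas/numpy and made-up counts only illustrates the computation.

```python
"""Standalone sketch of an alpha-diversity (Shannon index) calculation.
Taxon counts are made-up illustrative data."""
import numpy as np
import pandas as pd


def shannon_index(counts: pd.Series) -> float:
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over non-zero proportions."""
    proportions = counts / counts.sum()
    proportions = proportions[proportions > 0]
    return float(-(proportions * np.log(proportions)).sum())


if __name__ == "__main__":
    # Abundance table: rows are samples, columns are taxa (illustrative).
    abundances = pd.DataFrame(
        {"Taxon_A": [120, 5], "Taxon_B": [30, 80], "Taxon_C": [0, 40]},
        index=["EMOBON_XYZ_Wa_1", "EMOBON_XYZ_Wa_2"],
    )
    print(abundances.apply(shannon_index, axis=1))
```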
To be delivered
- SPARQL endpoint construction for the RDF triples of sample metadata and metaGOflow data products (see the query sketch after this list)
- Documentation and landing pages for the VRE on the BlueCloud2026 infrastructure
- Build FAIR metadata records for all our datasets and the triple store, and have them assessed with tools such as F-UJI (see the assessment sketch after this list)
- Ensure provenance is captured from our Jupyter notebooks in the VRE
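
A minimal sketch of how the planned SPARQL endpoint could be queried with SPARQLWrapper; the endpoint URL and the emobon: vocabulary are placeholders for the eventual IDDAS endpoint and ontology terms.

```python
"""Sketch of querying the planned SPARQL endpoint with SPARQLWrapper.
Endpoint URL and vocabulary are placeholders."""
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/emobon/sparql"  # placeholder endpoint URL

QUERY = """
PREFIX emobon: <https://example.org/emobon/>
SELECT ?event ?date
WHERE {
  ?event a emobon:SamplingEvent ;
         emobon:collectionDate ?date .
}
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for binding in results["results"]["bindings"]:
    print(binding["event"]["value"], binding["date"]["value"])
```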
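A sketch of FAIR assessment through the F-UJI REST API, assuming a locally running F-UJI instance with the Docker image defaults (port and credentials); the dataset identifier is a placeholder and the response structure may vary between F-UJI versions.

```python
"""Sketch of assessing a dataset record with a local F-UJI service.
Identifier, port and credentials are assumptions based on the F-UJI
Docker defaults; adjust for your deployment."""
import requests

FUJI_URL = "http://localhost:1071/fuji/api/v1/evaluate"  # local F-UJI instance

payload = {
    # Persistent identifier or landing-page URL of the record to assess.
    "object_identifier": "https://example.org/emobon/dataset/EMOBON_XYZ_Wa_1",
    "use_datacite": True,
}

response = requests.post(
    FUJI_URL,
    json=payload,
    auth=("marvel", "wonderwoman"),  # default credentials of the Docker image
    timeout=300,
)
response.raise_for_status()
report = response.json()

# Print the aggregated FAIR scores; exact keys depend on the F-UJI version.
print(report.get("summary"))
```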