Docs | BioBox Platform

Format Internal Data For Uploading

How to use our Python data adapters to format internal data for upload.

Last updated 11 months ago

To make the data ingestion process as simple as possible, we have created several Python data adapters that convert your processed data files into JSON files compatible with the BioBox platform.

Each adapter will:

  • Handle the column mapping so that the experimental metadata required for each sequencing modality is available.

  • Produce two JSON files (objects and edges) that can be uploaded to the platform.

Create a copy of the adapter file, import your files, and execute the code. Once the objects and edges files have been created, follow the instructions on the "Upload Internal Data" page to upload them to the platform.
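Before uploading, it can help to confirm that both output files are valid JSON and are not empty. The sketch below is only a sanity check and is not part of the BioBox adapters. The helper name `summarize_json` and the file names in the usage comment are hypothetical, and the internal layout of the objects and edges files is not assumed.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and return (top-level type name, number of entries).

    Raises json.JSONDecodeError if the file is not valid JSON.
    """
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    # Lists and dicts have a meaningful length; any other value counts as one entry.
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


# Usage (hypothetical file names; use whatever your adapter run produced):
#   for name in ("objects.json", "edges.json"):
#       kind, size = summarize_json(name)
#       print(f"{name}: top-level {kind} with {size} entries")
```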

Single Cell RNAseq Adapter

The required input for this adapter is an h5ad file. This adapter will enable you to upload Single Cell RNAseq observations.

Single Cell ATACseq Adapter

The required input for this adapter is an h5ad file. This adapter will enable you to upload Single Cell ATACseq observations.

Example Single Cell Data Schema (the output shown is from the Single Cell RNAseq adapter, as its name and key fields indicate)

scrna.list_schema()

{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-18 01:33:36.226002'},
 'name': 'SingleCellRNASeq Datapack - 2024-06-18 01:33:36.226002',
 'key': 'scrna:2024-06-18 01:33:36.226002',
 'description': 'SingleCellRNASeq Datapack created through Python SDK',
 'dependencies': ['Ensembl'],
 'concepts': {'Experiment': {'label': 'Experiment',
   'dbLabel': 'Experiment',
   'definition': 'Experiment of the sample tissue'},
  'SingleCellExperiment': {'label': 'SingleCellExperiment',
   'dbLabel': 'SingleCellExperiment',
   'definition': 'Single Cell Experiment of the sample tissue',
   'sco': 'Experiment'},
  'SingleCellRNAseqExperiment': {'label': 'SingleCellRNAseqExperiment',
   'dbLabel': 'SingleCellRNAseqExperiment',
   'definition': 'Single Cell RNAseq Experiment of the sample tissue',
   'sco': 'SingleCellExperiment'},
  'Sample': {'label': 'Sample',
   'dbLabel': 'Sample',
   'definition': 'Sample organism from which tissue was taken to be analyzed'},
  'CellBarcode': {'label': 'Cell Barcode',
   'dbLabel': 'CellBarcode',
   'definition': 'Individual cell from scRNA experiment, identified by barcode'}},
 'relationships': {'contains cell': {'from': 'SingleCellExperiment',
   'to': 'CellBarcode'},
  'expresses': {'from': 'CellBarcode', 'to': 'Gene'},
  'has experiment': {'from': 'Sample', 'to': 'Experiment'},
  'has cell type': {'from': 'CellBarcode', 'to': 'CellType'}}}
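In the schema above, some relationships point at concepts that the data pack does not declare itself, such as 'Gene' and 'CellType'. These presumably come from other data packages (for example the 'Ensembl' dependency), although that is an assumption. The following sketch, using a hypothetical helper `external_endpoints` that is not part of the BioBox SDK, lists such endpoints so you can check that the packages providing them are installed. Concept definitions are abbreviated to empty dicts here.

```python
def external_endpoints(schema):
    """Return relationship endpoints that are not declared under schema['concepts']."""
    declared = set(schema.get("concepts", {}))
    missing = set()
    for rel in schema.get("relationships", {}).values():
        for endpoint in (rel["from"], rel["to"]):
            if endpoint not in declared:
                missing.add(endpoint)
    return sorted(missing)


# Concepts and relationships copied from the example schema (definitions omitted).
schema = {
    "dependencies": ["Ensembl"],
    "concepts": {
        "Experiment": {},
        "SingleCellExperiment": {},
        "SingleCellRNAseqExperiment": {},
        "Sample": {},
        "CellBarcode": {},
    },
    "relationships": {
        "contains cell": {"from": "SingleCellExperiment", "to": "CellBarcode"},
        "expresses": {"from": "CellBarcode", "to": "Gene"},
        "has experiment": {"from": "Sample", "to": "Experiment"},
        "has cell type": {"from": "CellBarcode", "to": "CellType"},
    },
}

print(external_endpoints(schema))  # ['CellType', 'Gene']
```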

ChIPseq Adapter

The required input for this adapter is a BED file. This adapter will enable you to upload ChIPseq observations.

Example ChIPseq Data Schema
{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-24 14:47:37.050712'},
 'name': 'My ChIPSeq',
 'key': 'My ChIPSeq',
 'description': '',
 'concepts': {'NarrowPeak': {'label': 'NarrowPeak',
   'dbLabel': 'NarrowPeak',
   'definition': ''},
  'ChIPseq': {'label': 'ChIPseq', 'dbLabel': 'ChIPseq', 'definition': ''}},
 'relationships': {'has narrow peak': {'from': 'ChIPseq', 'to': 'NarrowPeak'},
  'peak start on': {'from': 'NarrowPeak', 'to': 'GenomicInterval'},
  'peak end on': {'from': 'NarrowPeak', 'to': 'GenomicInterval'},
  'assay target on': {'from': 'ChIPseq', 'to': 'Protein'},
  'has chipseq': {'from': 'Sample', 'to': 'ChIPseq'}}}
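The 'peak start on' and 'peak end on' relationships attach each peak to genomic intervals, which the Genome adapter's schema describes as 1kbp sections. As a rough illustration of what that mapping could look like, the sketch below reads the first three columns of a BED line and computes the interval index for each end of the peak. The indexing scheme (integer division by 1000, counting from 0) and the helper name `peak_intervals` are assumptions, not the adapter's documented behavior. BED coordinates are 0-based and half-open, so the last covered base is `end - 1`.

```python
INTERVAL_SIZE = 1000  # 1kbp, as stated in the Genome adapter's schema


def peak_intervals(bed_line, interval_size=INTERVAL_SIZE):
    """Return (chrom, start_interval_index, end_interval_index) for one BED line."""
    chrom, start, end = bed_line.rstrip("\n").split("\t")[:3]
    start, end = int(start), int(end)
    if end <= start:
        raise ValueError(f"empty or inverted interval: {start}-{end}")
    # BED is half-open, so the last base covered by the peak is end - 1.
    return chrom, start // interval_size, (end - 1) // interval_size


print(peak_intervals("chr1\t1500\t3000\tpeak_1"))  # ('chr1', 1, 2)
```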

Genome Adapter

The required input for this adapter is a gzipped GTF (Gene Transfer Format) file. This adapter will enable you to upload a custom reference genome.
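To check that a file is a readable gzipped GTF before passing it to the adapter, a minimal standard-library sketch such as the one below may be useful. It is independent of the BioBox adapter, and `iter_gtf` is a hypothetical helper name. It relies only on the standard GTF layout of nine tab-separated columns with 1-based, inclusive coordinates.

```python
import gzip


def iter_gtf(path):
    """Yield one dict per feature line of a gzipped GTF file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            # Skip header/comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 9:
                raise ValueError(f"line {line_no}: expected 9 columns, got {len(fields)}")
            seqname, source, feature, start, end, score, strand, frame, attributes = fields
            yield {
                "seqname": seqname,
                "feature": feature,
                "start": int(start),  # 1-based, inclusive
                "end": int(end),      # 1-based, inclusive
                "strand": strand,
                "attributes": attributes,
            }


# Usage (hypothetical path):
#   n_genes = sum(1 for rec in iter_gtf("my_genome.gtf.gz") if rec["feature"] == "gene")
```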

Example Data Schema
{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-24 17:28:23.528499'},
 'name': 'Genome Datapack - homo sapiens 9606 (2024-06-24 17:28:23.528499)',
 'key': 'genome:9606:2024-06-24 17:28:23.528499',
 'description': 'Genome Datapack created through Python SDK',
 'concepts': {'Gene': {'label': 'Gene',
   'dbLabel': 'Gene',
   'definition': 'Gene encompassing all biotypes'},
  'Transcript': {'label': 'Transcript',
   'dbLabel': 'Transcript',
   'definition': 'Transcripts derived from gene'},
  'Protein': {'label': 'Protein',
   'dbLabel': 'Protein',
   'definition': 'Protein derived from gene'},
  'Genome': {'label': 'Genome',
   'dbLabel': 'Genome',
   'definition': 'Genome encompassing this data pack'},
  'GenomicInterval': {'label': 'Genomic Interval',
   'dbLabel': 'GenomicInterval',
   'definition': "Genomic Interval splitting the genome's chromosomal regions into sections of 1kbp"}},
 'relationships': {'genome contains interval': {'from': 'Genome',
   'to': 'GenomicInterval'},
  'next': {'from': 'GenomicInterval', 'to': 'GenomicInterval'},
  'transcribed to': {'from': 'Gene', 'to': 'Transcript'},
  'has translation': {'from': 'Transcript', 'to': 'Protein'}}}
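The 'GenomicInterval' concept and the 'next' relationship suggest that each chromosome is cut into consecutive 1kbp sections linked in order. The sketch below shows one way such a chain could be built. The coordinate convention (0-based, half-open, with a shorter final interval) and the function name `interval_chain` are assumptions made for illustration, not a description of how the Genome adapter does it.

```python
def interval_chain(chrom_length, size=1000):
    """Split [0, chrom_length) into consecutive intervals and link neighbors.

    Returns (intervals, next_edges), where each interval is a (start, end)
    tuple and each edge is an (interval, following_interval) pair.
    """
    if chrom_length < 0 or size <= 0:
        raise ValueError("chrom_length must be >= 0 and size must be > 0")
    intervals = [
        (start, min(start + size, chrom_length))
        for start in range(0, chrom_length, size)
    ]
    # Pair each interval with the one that follows it on the chromosome.
    next_edges = list(zip(intervals, intervals[1:]))
    return intervals, next_edges


intervals, next_edges = interval_chain(2500)
print(intervals)        # [(0, 1000), (1000, 2000), (2000, 2500)]
print(len(next_edges))  # 2
```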

Adapter notebooks (Google Colab)

  • BioBox scRNAseq Adapter
  • BioBox Single Cell ATACseq adapter
  • BioBox ChIPseq Adapter
  • BioBox Genome Adapter