# Format Internal Data For Uploading

To simplify data ingestion, we provide several Python data adapters that convert your processed data files into JSON files compatible with the BioBox platform.

Each adapter will:

* Handle the column mapping, so that the experimental metadata required for each sequencing modality is available.
* Produce two JSON files (objects and edges) that can be uploaded to the platform.

Create a copy of the adapter file, import your files, and execute the code. Once the objects and edges files have been created, [follow the instructions here](/guide/how-to/configure-your-knowledge-graph-schema/upload-internal-data.md) to upload them to the platform.
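To make the two adapter responsibilities concrete, here is a minimal sketch of column mapping followed by building object and edge records. It is not the adapters' actual code: the column names, the `COLUMN_MAP` dictionary, the helper functions, and the record layout (`label`, `id`, `from`, `to`) are all assumptions chosen for illustration. The exact JSON structure BioBox expects is produced by the adapters themselves.

```python
import csv
import io
import json

# Hypothetical mapping from the column names in your file to the
# metadata field names used below (both sides are assumptions).
COLUMN_MAP = {"experiment": "experiment_id", "cell": "barcode"}


def rename_columns(row, column_map):
    """Return a copy of `row` with keys renamed according to `column_map`."""
    missing = [col for col in column_map if col not in row]
    if missing:
        raise KeyError(f"missing required columns: {missing}")
    return {column_map.get(key, key): value for key, value in row.items()}


def build_objects_and_edges(rows):
    """Build illustrative object and edge records (layout is hypothetical)."""
    objects, edges, seen_experiments = [], [], set()
    for row in rows:
        experiment, barcode = row["experiment_id"], row["barcode"]
        if experiment not in seen_experiments:
            seen_experiments.add(experiment)
            objects.append({"label": "SingleCellExperiment", "id": experiment})
        objects.append({"label": "CellBarcode", "id": barcode})
        edges.append({"label": "contains cell", "from": experiment, "to": barcode})
    return objects, edges


def write_json(path, records):
    """Write a list of records to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)


# Made-up input standing in for a processed data file.
SAMPLE_CSV = "experiment,cell\nexp1,AAAC\nexp1,AAAG\n"
rows = [rename_columns(row, COLUMN_MAP)
        for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
objects, edges = build_objects_and_edges(rows)

# Uncomment to write the two files to the working directory:
# write_json("objects.json", objects)
# write_json("edges.json", edges)
```

Failing early on a missing column, as `rename_columns` does here, is one way to surface metadata problems before any file is written.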

### Single Cell RNAseq Adapter

The required input for this adapter is an h5ad file. This adapter will enable you to upload Single Cell RNAseq observations.

{% embed url="<https://colab.research.google.com/drive/1qGfdK3Zfu8gchVEnPtO-dufZQwGnY7zd?usp=sharing>" %}
BioBox scRNAseq Adapter
{% endembed %}

### Single Cell ATACseq Adapter

The required input for this adapter is an h5ad file. This adapter will enable you to upload Single Cell ATACseq observations.

{% embed url="<https://colab.research.google.com/drive/1qGfdK3Zfu8gchVEnPtO-dufZQwGnY7zd?usp=sharing>" %}
BioBox Single Cell ATACseq adapter
{% endembed %}

<details>

<summary>Example ATACseq Data Schema</summary>

Output of `scrna.list_schema()`:

{% code overflow="wrap" %}

```python
{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-18 01:33:36.226002'},
 'name': 'SingleCellRNASeq Datapack - 2024-06-18 01:33:36.226002',
 'key': 'scrna:2024-06-18 01:33:36.226002',
 'description': 'SingleCellRNASeq Datapack created through Python SDK',
 'dependencies': ['Ensembl'],
 'concepts': {'Experiment': {'label': 'Experiment',
   'dbLabel': 'Experiment',
   'definition': 'Experiment of the sample tissue'},
  'SingleCellExperiment': {'label': 'SingleCellExperiment',
   'dbLabel': 'SingleCellExperiment',
   'definition': 'Single Cell Experiment of the sample tissue',
   'sco': 'Experiment'},
  'SingleCellRNAseqExperiment': {'label': 'SingleCellRNAseqExperiment',
   'dbLabel': 'SingleCellRNAseqExperiment',
   'definition': 'Single Cell RNAseq Experiment of the sample tissue',
   'sco': 'SingleCellExperiment'},
  'Sample': {'label': 'Sample',
   'dbLabel': 'Sample',
   'definition': 'Sample organism from which tissue was taken to be analyzed'},
  'CellBarcode': {'label': 'Cell Barcode',
   'dbLabel': 'CellBarcode',
   'definition': 'Individual cell from scRNA experiment, identified by barcode'}},
 'relationships': {'contains cell': {'from': 'SingleCellExperiment',
   'to': 'CellBarcode'},
  'expresses': {'from': 'CellBarcode', 'to': 'Gene'},
  'has experiment': {'from': 'Sample', 'to': 'Experiment'},
  'has cell type': {'from': 'CellBarcode', 'to': 'CellType'}}}
```

{% endcode %}

</details>

### ChIPseq Adapter

The required input for this adapter is a BED file. This adapter will enable you to upload ChIPseq observations.

{% embed url="<https://colab.research.google.com/drive/1On_jYqeamNXC5rJIS5GTtlb_mkUa4zWk?usp=sharing>" %}
BioBox ChIPseq Adapter
{% endembed %}
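Before running the adapter, it can help to confirm that your BED file parses cleanly. The sketch below is a standalone check, not part of the adapter. It assumes tab-separated lines in the ENCODE narrowPeak layout (the first six standard BED columns, followed by signal value, p-value, q-value, and peak offset). The field names and helper functions are illustrative.

```python
# Column names for the narrowPeak layout; the snake_case names are illustrative.
NARROWPEAK_FIELDS = ["chrom", "start", "end", "name", "score", "strand",
                     "signal_value", "p_value", "q_value", "peak"]


def parse_bed_line(line):
    """Parse one tab-separated BED line into a dict of the columns present."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}")
    # zip stops at the shorter sequence, so plain 3-column BED also works.
    record = dict(zip(NARROWPEAK_FIELDS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    if record["start"] > record["end"]:
        raise ValueError(f"start is greater than end: {line!r}")
    return record


def read_bed(lines):
    """Yield parsed records, skipping blank, comment, and header lines."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        yield parse_bed_line(line)


# Made-up example lines; with a real file, pass an open file handle instead.
example_lines = [
    "track name=example",
    "chr1\t1000\t1500\tpeak_1\t900\t.\t12.5\t8.1\t6.3\t250",
    "chr2\t200\t900",
]
peaks = list(read_bed(example_lines))
```

Only `start` and `end` are converted to integers here; the remaining columns are left as strings so that placeholder values such as `.` pass through unchanged.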

<details>

<summary>Example ChIPseq Data Schema</summary>

{% code overflow="wrap" %}

```python
{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-24 14:47:37.050712'},
 'name': 'My ChIPSeq',
 'key': 'My ChIPSeq',
 'description': '',
 'concepts': {'NarrowPeak': {'label': 'NarrowPeak',
   'dbLabel': 'NarrowPeak',
   'definition': ''},
  'ChIPseq': {'label': 'ChIPseq', 'dbLabel': 'ChIPseq', 'definition': ''}},
 'relationships': {'has narrow peak': {'from': 'ChIPseq', 'to': 'NarrowPeak'},
  'peak start on': {'from': 'NarrowPeak', 'to': 'GenomicInterval'},
  'peak end on': {'from': 'NarrowPeak', 'to': 'GenomicInterval'},
  'assay target on': {'from': 'ChIPseq', 'to': 'Protein'},
  'has chipseq': {'from': 'Sample', 'to': 'ChIPseq'}}}
```

{% endcode %}

</details>
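The ChIPseq schema above uses `GenomicInterval`, `Protein`, and `Sample` as relationship endpoints without defining them under `concepts`. A short helper can list such references for any schema dictionary. This is a sketch: the `external_concepts` function is hypothetical, and the reading that these concepts must already exist elsewhere in your knowledge graph (for example, from a genome datapack) is an assumption rather than documented behavior.

```python
def external_concepts(schema):
    """Return concept names used in relationships but not defined in the schema."""
    defined = set(schema.get("concepts", {}))
    referenced = set()
    for relationship in schema.get("relationships", {}).values():
        referenced.update((relationship["from"], relationship["to"]))
    return sorted(referenced - defined)


# Abbreviated copy of the example ChIPseq schema shown above.
chipseq_schema = {
    "concepts": {
        "NarrowPeak": {"label": "NarrowPeak", "dbLabel": "NarrowPeak", "definition": ""},
        "ChIPseq": {"label": "ChIPseq", "dbLabel": "ChIPseq", "definition": ""},
    },
    "relationships": {
        "has narrow peak": {"from": "ChIPseq", "to": "NarrowPeak"},
        "peak start on": {"from": "NarrowPeak", "to": "GenomicInterval"},
        "peak end on": {"from": "NarrowPeak", "to": "GenomicInterval"},
        "assay target on": {"from": "ChIPseq", "to": "Protein"},
        "has chipseq": {"from": "Sample", "to": "ChIPseq"},
    },
}

print(external_concepts(chipseq_schema))
# ['GenomicInterval', 'Protein', 'Sample']
```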

### Genome Adapter

The required input for this adapter is a gzipped GTF (Gene Transfer Format) file. This adapter will enable you to upload a custom reference genome.

{% embed url="<https://colab.research.google.com/drive/19j7hPHl5XBUj8mCTldonRDywYMTULU2P?usp=sharing>" %}
BioBox Genome Adapter
{% endembed %}
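To check that a gzipped GTF file is readable before handing it to the adapter, a standalone sketch like the following may help. It relies only on the standard nine-column, tab-separated GTF layout and Python's `gzip` module. The helper names are illustrative, the sample record is made up, and the attribute parser is simplified (it does not handle semicolons inside quoted values).

```python
import gzip
import io

GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]


def parse_attributes(text):
    """Parse the attribute column (`key "value";` pairs) into a dict."""
    attributes = {}
    for part in text.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def iter_gtf(handle):
    """Yield one dict per feature line, skipping comments and blank lines."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != len(GTF_COLUMNS):
            raise ValueError(f"expected 9 columns, got {len(columns)}")
        record = dict(zip(GTF_COLUMNS, columns))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        record["attributes"] = parse_attributes(record["attributes"])
        yield record


def read_gtf_gz(path):
    """Yield records from a gzipped GTF file on disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from iter_gtf(handle)


# Made-up sample, compressed in memory so the sketch runs without a file.
sample = ('#!example header line\n'
          'chr1\tdemo\tgene\t100\t900\t.\t+\t.\t'
          'gene_id "GENE0001"; gene_name "demo_gene";\n')
compressed = gzip.compress(sample.encode("utf-8"))
with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as handle:
    records = list(iter_gtf(handle))
```

With a real file, `list(read_gtf_gz("your_genome.gtf.gz"))` would produce the same record structure, where the file name is a placeholder.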

<details>

<summary>Example Genome Data Schema</summary>

{% code overflow="wrap" %}

```python
{'_meta': {'version': '0.0.1', 'date_updated': '2024-06-24 17:28:23.528499'},
 'name': 'Genome Datapack - homo sapiens 9606 (2024-06-24 17:28:23.528499)',
 'key': 'genome:9606:2024-06-24 17:28:23.528499',
 'description': 'Genome Datapack created through Python SDK',
 'concepts': {'Gene': {'label': 'Gene',
   'dbLabel': 'Gene',
   'definition': 'Gene encompassing all biotypes'},
  'Transcript': {'label': 'Transcript',
   'dbLabel': 'Transcript',
   'definition': 'Transcripts derived from gene'},
  'Protein': {'label': 'Protein',
   'dbLabel': 'Protein',
   'definition': 'Protein derived from gene'},
  'Genome': {'label': 'Genome',
   'dbLabel': 'Genome',
   'definition': 'Genome encompassing this data pack'},
  'GenomicInterval': {'label': 'Genomic Interval',
   'dbLabel': 'GenomicInterval',
   'definition': "Genomic Interval splitting the genome's chromosomal regions into sections of 1kbp"}},
 'relationships': {'genome contains interval': {'from': 'Genome',
   'to': 'GenomicInterval'},
  'next': {'from': 'GenomicInterval', 'to': 'GenomicInterval'},
  'transcribed to': {'from': 'Gene', 'to': 'Transcript'},
  'has translation': {'from': 'Transcript', 'to': 'Protein'}}}
```

{% endcode %}

</details>
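The genome schema describes `GenomicInterval` as 1 kbp sections of each chromosome, and the ChIPseq schema links each peak to intervals through `peak start on` and `peak end on`. The sketch below shows one way a coordinate could be assigned to such a section. The binning convention (0-based, half-open intervals starting at position 0) and both function names are assumptions; the convention actually used by the platform is not stated on this page.

```python
INTERVAL_SIZE = 1000  # 1 kbp, as stated in the GenomicInterval definition


def interval_for_position(position):
    """Return the (start, end) of the half-open 1 kbp bin containing a 0-based position."""
    if position < 0:
        raise ValueError("position must be non-negative")
    index = position // INTERVAL_SIZE
    return index * INTERVAL_SIZE, (index + 1) * INTERVAL_SIZE


def intervals_for_peak(start, end):
    """Return the bins containing the first and last base of a BED-style [start, end) peak."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    # The last base covered by a half-open interval is end - 1.
    return interval_for_position(start), interval_for_position(end - 1)


print(intervals_for_peak(1000, 1500))  # ((1000, 2000), (1000, 2000))
print(intervals_for_peak(950, 2000))   # ((0, 1000), (1000, 2000))
```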

