
Preparing for your scRNA Onboarding

How to prepare for your BioBox Single Cell RNAseq Onboarding

We are very excited to be supporting your scientific research journey. To prepare for your onboarding session, please have the following items ready:

  • An Excel sheet of the biological replicates and their metadata
  • An Excel sheet of the technical replicates and their metadata
  • Raw FASTQ files
  • Google Chrome browser

You must be able to screen-share from the computer that holds the data you will be working with during the onboarding. Please note that we support 3' and 5' single cell sequencing.

Models (Biological Replicates)

A model is the overarching category that will connect all of your lab's data. A model can be anything from a cell line to a patient, mouse, or organoid. Model metadata refers to any model information that you would like to keep track of, e.g. a cell line mutation, a treatment applied to the cell line, or a patient's date of birth. This information can be appended to your heatmap in a legend.

If you are working with a cell line model, each time you create a new cell line record in your library, this is the information about the cell line you will have to provide. This information is completely customizable and can be updated at any time after the onboarding.

Each record that you create within a model is a biological replicate. For example, if you are working with a cell line model, each record within the cell line model would be a unique cell line that your lab works with. Please attend the onboarding with an Excel sheet containing all of the model metadata as column headers and all of the biological replicates as rows.

In the example above, the model is Patient. The metadata that will be tracked include Mutation and Treatment. Each row corresponds to a biological replicate, that is, an individual sample that DNA or RNA was extracted from for the sequencing experiment. Have this Excel sheet ready so that this information can be copied and pasted into your library.
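If the metadata lives in a script or database rather than a spreadsheet, a sheet with the layout described above can be generated programmatically. The following is a minimal sketch, assuming the Patient model with Mutation and Treatment columns from the example; the helper name `build_sheet` and all row values are hypothetical placeholders, and the tab-separated output is simply a format that pastes cleanly into most spreadsheets, not a BioBox requirement.

```python
import csv
import io


def build_sheet(columns, rows):
    """Return tab-separated text: metadata as column headers, replicates as rows."""
    # Every biological replicate must provide a value for every metadata column.
    for index, row in enumerate(rows, start=1):
        missing = [column for column in columns if column not in row]
        if missing:
            raise ValueError(f"row {index} is missing columns: {missing}")

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical placeholder values; replace them with your lab's own records.
columns = ["Patient", "Mutation", "Treatment"]
rows = [
    {"Patient": "Patient-01", "Mutation": "mutation-A", "Treatment": "treatment-X"},
    {"Patient": "Patient-02", "Mutation": "mutation-B", "Treatment": "none"},
]

print(build_sheet(columns, rows))
```

Each dictionary is one biological replicate, so adding a replicate means adding a row, and adding a metadata field means adding a column name and a value in every row.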

Experiments

Experiments refer to the specific sequencing technique that was applied to one of your biological replicates. Experiments are categorized according to Transcriptomic, Genomic, and Epigenomic sequencing techniques. The metadata associated with your experiment includes:

  • Experiment Name: user provided
  • Species: Homo sapiens or Mus musculus
  • Derived from: the biological replicate (cell line, patient, etc.) that the experiment is associated with
  • Run Type: paired-end or single-end sequencing
  • The number of technical replicates
  • Protocol: 3' Tag or full length. We currently only support 3' Tag.

To differentiate the files and downstream insights associated with each experiment, it is mandatory that experiments have distinct names. The nomenclature used to name your experiments is completely up to you and your lab.

Why do I need to provide an experiment name and why do they have to be unique? 

It is important that you provide unique names for all of your experiments because one biological replicate may be associated with multiple experiments. For example, Patient-01 N could be associated with total RNA, WGS, and scRNA experiments.
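Before the onboarding, it can help to confirm that a planned list of experiment names contains no duplicates. The sketch below is one way to do that, assuming names are compared exactly as typed (whether BioBox treats names case-sensitively is not stated here). The `Experiment` class, the `duplicate_names` helper, and the example names are all hypothetical, and only two of the metadata fields are modeled to keep the example short.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str          # user-provided experiment name, must be unique
    derived_from: str  # the biological replicate the experiment belongs to


def duplicate_names(experiments):
    """Return the sorted experiment names that occur more than once."""
    counts = Counter(experiment.name for experiment in experiments)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical names: one biological replicate with three experiments.
planned = [
    Experiment("Patient-01-N_totalRNA", "Patient-01 N"),
    Experiment("Patient-01-N_WGS", "Patient-01 N"),
    Experiment("Patient-01-N_scRNA", "Patient-01 N"),
]

duplicates = duplicate_names(planned)
if duplicates:
    print("Rename these experiments:", duplicates)
else:
    print("All experiment names are unique.")
```

Including the sequencing technique in the name, as in this example, is one simple way to keep names distinct when several experiments share a biological replicate.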

Data Files 

To use our Guided Analysis, you must begin with raw FASTQ files. Please ensure that your FASTQ files are downloaded to your computer prior to the onboarding meeting. The requirements for the raw FASTQ files are as follows:

  • All read lengths within each file of a pair (read1 and read2 independently) have to be equal. For example, all read lengths of R1 should be 26, or all 28, or all 30, but you can't have mixtures.
  • Read1 and read2 must be fully paired files. Singletons in either file are not accepted and must be kept in a separate file that is not uploaded with the experiment.
  • File names have to adhere to the Cell Ranger naming convention:
    • [any alphanumeric name without special characters aside from - ]_S[some unique number for each experiment; using 1 is easiest]_L[lane number distinguishing each run]_R[1 or 2 depending on pair]_001.fastq.gz
    • L can be used to denote the technical replicate
    • R can be used to denote read 1 or read 2 if working with paired read sequencing
    • Format: NAME_S#_L##_R#_00#.fastq.gz
    • An example name would be: pbmc_S1_L001_R1_001.fastq.gz
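The naming rule above can be checked with a regular expression before uploading. This is a minimal sketch rather than Cell Ranger's own validation: the pattern assumes a three-digit lane number (as in the example name) and a fixed `_001` suffix, and the names `FASTQ_NAME` and `parse_fastq_name` are hypothetical.

```python
import re

# Pattern built from the convention described above (assumptions noted inline).
FASTQ_NAME = re.compile(
    r"(?P<name>[A-Za-z0-9-]+)"  # alphanumeric name; "-" is the only special character
    r"_S(?P<sample>\d+)"        # sample number, e.g. S1
    r"_L(?P<lane>\d{3})"        # lane number, assumed to be three digits, e.g. L001
    r"_R(?P<read>[12])"         # R1 or R2
    r"_001\.fastq\.gz"          # assumed fixed suffix
)


def parse_fastq_name(filename):
    """Return the parts of a valid file name, or None if it does not match."""
    match = FASTQ_NAME.fullmatch(filename)
    return match.groupdict() if match else None


print(parse_fastq_name("pbmc_S1_L001_R1_001.fastq.gz"))
print(parse_fastq_name("pbmc_sample_S1_L001_R1_001.fastq.gz"))
```

The first call returns the parsed name, sample, lane, and read fields. The second returns `None` because the sample name contains an underscore, which the convention does not allow.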

An example of files using the Cell Ranger nomenclature and their associated technical replicates can be seen below.

[Screenshot: example files named using the Cell Ranger nomenclature, with their associated technical replicates]

To learn more about Cell Ranger and the required FASTQ naming conventions, read this article.
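The read-length and pairing requirements listed under Data Files can also be checked locally before the onboarding. The sketch below makes two simplifying assumptions: each FASTQ record occupies exactly four lines, and equal record counts in R1 and R2 are used as a proxy for full pairing (read identifiers are not compared). The function names are hypothetical.

```python
import gzip


def read_lengths(path):
    """Yield the sequence length of every record in a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The second line of each four-line record holds the sequence.
            if line_number % 4 == 1:
                yield len(line.rstrip("\r\n"))


def summarize(path):
    """Return (number of reads, set of distinct read lengths) for one file."""
    lengths = set()
    count = 0
    for length in read_lengths(path):
        lengths.add(length)
        count += 1
    return count, lengths


def check_pair(r1_path, r2_path):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    r1_count, r1_lengths = summarize(r1_path)
    r2_count, r2_lengths = summarize(r2_path)
    if len(r1_lengths) > 1:
        problems.append(f"R1 has mixed read lengths: {sorted(r1_lengths)}")
    if len(r2_lengths) > 1:
        problems.append(f"R2 has mixed read lengths: {sorted(r2_lengths)}")
    if r1_count != r2_count:
        problems.append(f"read counts differ: R1={r1_count}, R2={r2_count}")
    return problems
```

For example, `check_pair("pbmc_S1_L001_R1_001.fastq.gz", "pbmc_S1_L001_R2_001.fastq.gz")` returns an empty list when each file has a single read length and both files contain the same number of reads. R1 and R2 are allowed to have different lengths from each other, since the requirement applies to each file independently.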