Configuring and Launching Pipelines from the Pipeline Design Studio

How to customize and launch pipelines on the BioBox Platform

 

Accessing your pipelines 


To build a new pipeline or access an existing pipeline, first create a project. Create a new "Analysis" within your project and select "Pipelines". You will have the option to launch an existing pipeline or build a new one in the pipeline design studio. If you are processing a large number of files, you can use one of our prebuilt pipelines. You can learn how to use our prebuilt pipelines here.

 

Configuring Pipelines in the Pipeline Design Studio


If you choose to create a new pipeline, give your pipeline a name and select the reference genome that will be applied to all alignment workflows used in your pipeline. This will open the pipeline design studio, where you can drag and drop workflows in a code-free environment. In the left-hand panel you will see all of the available workflows and input file types. Clicking the info icon next to a workflow displays a description of what the workflow does, its use cases, the inputs it requires, and the outputs it produces.

  1. Select the input file types that you would like to build your pipeline with. Each input corresponds to 1 file that you will add to your pipeline. If you are building a pipeline using paired-end reads, you will have 2 FASTQ input files per sample; if you are using single-end reads, you will have 1 FASTQ input file per sample.
  2. Select the workflow you would like to connect the input to. The workflow will appear in the centre of the design studio.
  3. Connect your inputs to the workflow by clicking the connection icon on the input file and dragging the connection to the workflow. The outputs from each workflow can be connected to subsequent workflows, e.g. Sample Genecount outputs from STAR can be connected to DESeq2.
  4. Once you have configured your pipeline, give your inputs names. Click on the info icon next to an input file to update its name. The names provide context when you add files from your library to the pipeline. They can specify which files are experimental vs control, or any other metadata that is relevant to your experiment.
Build these connections for all of the samples you would like to process in your pipeline.

For example, to build an RNAseq pipeline using STAR + DESeq2 for 6 paired-end samples (3 Reference and 3 Experimental), you would have the following:

  • 12 FASTQ inputs (2 for each sample)
  • 6 STAR workflows (each STAR workflow is connected to 2 FASTQ inputs)
  • 1 DESeq2 workflow (1 Sample gene counts file from each STAR workflow is connected to DESeq2)
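
The counts in the example above can be checked with a short sketch. This is only an illustration of how the inputs and workflows connect; it does not use any BioBox API, and the function name, node labels, and file names are assumptions made up for this example.

```python
def build_rnaseq_pipeline(samples):
    """Return (nodes, edges) for a STAR + DESeq2 layout over paired-end samples.

    Nodes are (name, kind) tuples; edges are (source, destination) tuples.
    All names here are hypothetical placeholders, not BioBox identifiers.
    """
    nodes = [("DESeq2", "workflow")]
    edges = []
    for name in samples:
        star = f"STAR:{name}"
        nodes.append((star, "workflow"))
        # Paired-end reads: two FASTQ inputs feed each STAR workflow.
        for mate in ("R1", "R2"):
            fastq = f"{name}_{mate}.fastq"
            nodes.append((fastq, "input"))
            edges.append((fastq, star))
        # One gene counts output per STAR workflow feeds the single DESeq2 workflow.
        edges.append((star, "DESeq2"))
    return nodes, edges


# 3 Reference and 3 Experimental samples, named so the condition is visible.
samples = [f"Reference_{i}" for i in range(1, 4)]
samples += [f"Experimental_{i}" for i in range(1, 4)]

nodes, edges = build_rnaseq_pipeline(samples)

fastq_inputs = sum(1 for _, kind in nodes if kind == "input")
star_workflows = sum(1 for name, _ in nodes if name.startswith("STAR:"))
deseq2_inputs = sum(1 for _, dst in edges if dst == "DESeq2")

print(fastq_inputs, star_workflows, deseq2_inputs)  # 12 6 6
```

Naming the inputs after the sample and condition, as in this sketch, mirrors step 4: it makes it easier to match library files to the right input when launching.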

The pipeline design studio can also be used to build other pipelines for genomics, quality control, and read processing using the available workflows. If you are processing a large number of bulk RNAseq samples, you may alternatively use our prebuilt pipeline that includes STAR + DESeq2. You can learn how to use our prebuilt pipelines here.

Editing Pipelines

Select the 3 dots located on the right side of the workflow to access any advanced parameters available for the workflow. To delete a connection, left-click on the connection and an option to delete it will appear. Once you have finished configuring your pipeline, save it. You can edit your pipeline at any time and share it with other members of your organization.


 

Launching your Pipeline


When launching your pipeline, you will be prompted to select files from your library to be processed in the pipeline. Ensure that you have selected the appropriate model you would like to work with. For example, if you have cell line, patient, and mouse models in your library, select the model whose data you would like to access.

Select the "Experiments" button next to the biological replicates you would like to work with. This will open a dialog displaying all of the experiments and files associated with each biological replicate. 

If you have numerous records, you can use the filter within the table to subset them, e.g. to display only patients with a specific mutation. Once you have identified the files, add them to your pipeline and launch.
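
Conceptually, the table filter narrows the records to those matching a condition, and the files attached to the remaining records are what you add to the pipeline. The sketch below illustrates that logic only; the record fields, values, and the `select_files` helper are hypothetical and are not part of BioBox.

```python
# Hypothetical library records; field names and values are made up for illustration.
records = [
    {"replicate": "Patient_01", "model": "patient", "mutation": "KRAS",
     "files": ["P01_R1.fastq", "P01_R2.fastq"]},
    {"replicate": "Patient_02", "model": "patient", "mutation": "TP53",
     "files": ["P02_R1.fastq", "P02_R2.fastq"]},
    {"replicate": "Mouse_01", "model": "mouse", "mutation": "KRAS",
     "files": ["M01_R1.fastq", "M01_R2.fastq"]},
]


def select_files(records, model, mutation):
    """Return the files of every record matching both the model and the mutation."""
    return [
        file_name
        for record in records
        if record["model"] == model and record["mutation"] == mutation
        for file_name in record["files"]
    ]


selected = select_files(records, model="patient", mutation="KRAS")
print(selected)  # ['P01_R1.fastq', 'P01_R2.fastq']
```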

Pipeline results can be accessed from the "Results" tab within your Analysis. All results and workflow file outputs will be available on the results page. All files are saved to your BioBox Library and do not need to be exported. For an in-depth walkthrough of the pipeline results page, read more here.