July 30, 2019

New generalized standard workflows on FASTGenomics

FASTGenomics has added two new standard workflows that can be applied to a large collection of new datasets. The aim is to provide reusable workflows that let you take a quick first look at your data and compare different datasets quantitatively. To analyze the data further, you can still customize your own workflows and dig deeper.

The first new standard workflow is based on the Seurat guided clustering tutorial and is written as an R Markdown script. With R Markdown, we generate an interactive results notebook that lets you browse the results easily while keeping the underlying code accessible. Our partners at the Schultze lab at LIMES in Bonn, Germany, have enhanced the workflow with many additional features. We hope you find it as helpful as we did.

The second standard workflow is based on the recent publication “Current best practices in single‐cell RNA‐seq analysis: a tutorial” by Luecken & Theis (2019) and its supplementary single-cell tutorial on GitHub. The tutorial includes a case study analysis of intestinal epithelium data from Haber et al. (2017), which we modified slightly to generalize its preprocessing and clustering to other datasets. The new “Generalized best practices preprocessing workflow” is now available on FASTGenomics, and you are welcome to try it out with your own dataset.

In both workflows we perform the typical first steps of a scRNA-seq data analysis. We start with general preprocessing: cell and gene quality control, normalization, batch correction, selection of highly variable genes, visualization, and cell cycle scoring. These steps explore the overall structure of the data and filter out low-quality cells and genes so that downstream analyses start from clean input. By clustering the data and, in the case of Luecken & Theis (2019), applying PAGA, we also take the first steps toward downstream scRNA-seq analysis.
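
To make the first of these steps more concrete, here is a minimal sketch of cell and gene quality control followed by normalization on a toy count matrix. It is not the code of either workflow, which build on Seurat and on the Luecken & Theis tutorial respectively. The matrix, the cell and gene names, the thresholds (MIN_COUNTS_PER_CELL, MIN_CELLS_PER_GENE) and the TARGET_SUM scaling factor are all made-up assumptions chosen only for illustration, and it uses plain Python rather than a single-cell library.

```python
import math

# Toy count matrix (made-up numbers): rows are cells, columns are genes.
cells = ["cell_a", "cell_b", "cell_c", "cell_d"]
genes = ["g1", "g2", "g3", "g4"]
counts = [
    [10, 0, 5, 0],   # cell_a
    [0, 0, 1, 0],    # cell_b: almost no counts, treated as low quality
    [4, 0, 6, 10],   # cell_c
    [8, 0, 0, 12],   # cell_d
]

# Hypothetical thresholds; real workflows choose these per dataset.
MIN_COUNTS_PER_CELL = 5
MIN_CELLS_PER_GENE = 2
TARGET_SUM = 10.0


def filter_cells(counts, cells, min_counts):
    """Cell QC: keep cells whose total count reaches min_counts."""
    kept = [(name, row) for name, row in zip(cells, counts)
            if sum(row) >= min_counts]
    return [name for name, _ in kept], [row for _, row in kept]


def filter_genes(counts, genes, min_cells):
    """Gene QC: keep genes detected (count > 0) in at least min_cells cells."""
    keep = [j for j in range(len(genes))
            if sum(1 for row in counts if row[j] > 0) >= min_cells]
    return [genes[j] for j in keep], [[row[j] for j in keep] for row in counts]


def normalize_log1p(counts, target_sum):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    normalized = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # A cell with no remaining counts stays all zeros.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([math.log1p(v * target_sum / total) for v in row])
    return normalized


kept_cells, filtered = filter_cells(counts, cells, MIN_COUNTS_PER_CELL)
kept_genes, filtered = filter_genes(filtered, genes, MIN_CELLS_PER_GENE)
normalized = normalize_log1p(filtered, TARGET_SUM)

print("cells kept:", kept_cells)
print("genes kept:", kept_genes)
```

In this toy example the near-empty cell and the undetected gene are removed before normalization, which is the order the preprocessing steps above follow. Batch correction, highly variable gene selection, cell cycle scoring and clustering are left out of the sketch.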