In this video we will show you how to create your own analyses. As a newly registered user, you do not yet have permission to create an analysis. To add your own analysis to the platform, you need the analyzer role. To request it, please contact us by email and we will grant you the corresponding rights.
Let us now go to the analyses section. Here, you can find a list of publicly available analyses, most of which have already been calculated. What we refer to as an analysis is a sequence of calculation tasks, namely a workflow, that is applied to a certain dataset. According to our naming convention, the analysis title consists of the workflow and the dataset it is applied to.
On this page, for example, you can see several analyses based on the exploratory single cell RNA-seq workflow and the pseudotime analysis workflow. The calculation tasks of each analysis are listed in the details section. Sometimes the workflows consist of several small tasks, but there are also workflows that do not have this subdivision, for example workflows that are derived from Jupyter notebooks.
Now let’s assume you have the rights to create an analysis, like our test user here. You can then click on this plus sign and a new window pops up, in which you can define your analysis. Note that if you would like to use your own dataset and workflow, these have to be uploaded to FASTGenomics before creating the analysis. The information you need to fill in is the title, the abstract, the description, the workflow, the calculation parameters, and the dataset. The title should ideally name the workflow and the dataset it is applied to. The abstract is the text that appears in the table that lists the analyses and should therefore capture the essence of the analysis. The description can be a little more detailed. It is shown in the details pane on the analysis page.
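To make the form fields easier to keep in mind, here is a minimal Python sketch of them as a data structure. It is only an illustration: the class name `AnalysisDraft`, the field names, and the "workflow on dataset" title pattern are assumptions made for this example and are not part of FASTGenomics itself.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisDraft:
    """Hypothetical container mirroring the fields of the analysis form."""

    workflow: str      # name of the selected workflow
    dataset: str       # name of the selected dataset
    abstract: str      # short text shown in the table of analyses
    description: str = ""  # longer text shown in the details pane
    parameters: dict = field(default_factory=dict)  # empty: use workflow defaults
    title: str = ""    # derived from workflow and dataset if left empty

    def __post_init__(self) -> None:
        # Assumed title pattern: name the workflow and the dataset it is applied to.
        if not self.title:
            self.title = f"{self.workflow} on {self.dataset}"


if __name__ == "__main__":
    draft = AnalysisDraft(
        workflow="Standard single cell RNA-seq analysis",
        dataset="10x PBMC",
        abstract="Quality control, normalization, and clustering of PBMCs.",
    )
    print(draft.title)
```

In this sketch, leaving the title empty derives it from the workflow and the dataset, which follows the naming convention described earlier.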
Now you need to choose an existing workflow for your analysis. After you click on “Select workflow”, a window pops up that shows the available workflows together with a short abstract and a description of what calculation tasks are involved. There is also a version number attached, so that you are aware of updates and changes to the workflows. For starters, you may choose a standard workflow to explore the overall structure of the data and to produce optimal downstream analysis results.
Two workflows that involve the typical first steps of a single cell RNA-seq data analysis are the “Generalized preprocessing workflow” by Lücken and Theis (2019) and the “Standard single cell RNA-seq analysis” as applied by the LIMES institute. The latter is based on the Seurat guided clustering tutorial. Both workflows include quality control, filtering, normalization, selection of highly variable genes, and clustering. Some workflows allow setting certain parameters such as filter thresholds, the number of PCA components, or whether or not to save the analysis output as an AnnData object. These parameters are usually passed in JSON format. Every workflow has default parameters, so we do not need to enter anything here.
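As an illustration of what such a parameter object might look like, here is a minimal Python sketch. The parameter names (`min_genes`, `n_pca_components`, `save_anndata`) and their default values are made up for this example and are not taken from any actual workflow on the platform; each workflow defines its own parameters. The sketch shows how overrides given as JSON could be merged with a workflow's defaults, which is why leaving the field empty is valid.

```python
import json

# Hypothetical defaults; real workflows define their own names and values.
DEFAULT_PARAMETERS = {
    "min_genes": 200,        # filter threshold: minimum genes detected per cell
    "n_pca_components": 50,  # number of PCA components
    "save_anndata": True,    # whether to save the output as an AnnData object
}


def resolve_parameters(user_json: str = "") -> dict:
    """Merge user-supplied JSON overrides into the default parameters."""
    # An empty or whitespace-only field means "no overrides".
    overrides = json.loads(user_json) if user_json.strip() else {}
    if not isinstance(overrides, dict):
        raise ValueError("parameters must be a JSON object")
    unknown = set(overrides) - set(DEFAULT_PARAMETERS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Later keys win, so overrides replace the matching defaults.
    return {**DEFAULT_PARAMETERS, **overrides}


if __name__ == "__main__":
    # Leaving the field empty keeps every default.
    print(resolve_parameters(""))
    # Overriding a single value leaves the others untouched.
    print(resolve_parameters('{"n_pca_components": 30}'))
```

Rejecting unknown keys is a design choice of this sketch: it catches typos in parameter names early instead of silently ignoring them.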
Finally, let’s choose a dataset. Again, a window pops up and lets you choose the dataset that you’re interested in. Here, we choose the 10x PBMC dataset. Click on apply, and then on “create and start calculation”. Now we are set. We have to wait for the cloud resources to be allocated and for the calculation to finish. This is a good time to grab a coffee.
When the calculation is finished, you will see that a “View result” button has appeared. Now you can have a look at your analysis. More details about the analysis output will be covered in another tutorial.
That concludes our tutorial on how to create an analysis. If you would like to know more, visit us at fastgenomics.org, where you can find the latest news and other tutorials to get you started. You can always get in touch with us via email or Twitter, or join our Slack community.