October 15, 2019

Single-Cell Data Formats in FASTGenomics

For researchers in single-cell RNA sequencing, new datasets are always exciting to work with. However, analyzing datasets from public sources can be challenging because they come in a variety of formats. Sometimes you download a set of files in an uncommon, non-standard format, or data that can only be read with one specific software package. Converting between formats often loses information or requires installing additional tools. Have you been there? We feel you. Our vision at FASTGenomics is to take care of software installation and data loading so that you can focus completely on what is important: your research.

In this article, we explain how to use the readers FASTGenomics provides out of the box, and how to implement your own reading routine if you are using a file format FASTGenomics does not yet support.

Which single-cell data formats are supported by FASTGenomics?

The key to data loading in FASTGenomics is our pair of reader modules, fgread-r and fgread-py. They implement reading of various standardized single-cell file formats in R and Python, respectively. All you need to do is tell FASTGenomics the file format of your dataset, and your data will be read automatically by our fgread module. Currently, FASTGenomics supports the following file formats:

| Name | Extension | Description |
| --- | --- | --- |
| 10x hdf5 | .hdf5 | 10x HDF5 Feature Barcode Matrix Format |
| Seurat object | .rds | Only supported in R, not in Python |
| AnnData object | .h5ad | For Seurat we provide a beta version, since loading h5ad does not work reliably in Seurat v3 |
| Loom | .loom | For Seurat we provide a beta version, since loading loom is not supported in Seurat v3 |
| tab-separated text | .tsv | Tab-separated dense matrices with cell/gene identifiers in the first row/column, respectively |
| comma-separated text | .csv | Comma-separated dense matrices with cell/gene identifiers in the first row/column, respectively |
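
To make the dense text layout concrete, here is a minimal sketch in Python using only the standard library. It is not how fgread reads these files; it only illustrates one reading of the layout described in the table, namely that the first row holds cell identifiers and the first column holds gene identifiers, with a top-left corner entry that is ignored. The function name `read_dense_matrix` and the toy gene and cell names are hypothetical.

```python
import csv
import io


def read_dense_matrix(text, delimiter="\t"):
    """Parse a dense genes-by-cells matrix from delimited text.

    Assumed layout (an interpretation of the table, not fgread's code):
    - first row: a corner entry followed by cell identifiers
    - each later row: a gene identifier followed by one value per cell
    """
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    header, body = rows[0], rows[1:]
    cells = header[1:]  # skip the corner entry

    genes, counts = [], []
    for row in body:
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]!r} has {len(row) - 1} values, expected {len(cells)}")
        genes.append(row[0])
        counts.append([float(value) for value in row[1:]])
    return genes, cells, counts


# Toy example with made-up values: two genes measured in two cells.
example = "gene\tcell_1\tcell_2\nGAPDH\t5\t0\nACTB\t3\t7\n"
genes, cells, counts = read_dense_matrix(example)
print(genes, cells, counts)
```

For a comma-separated file, the same sketch would be called with `delimiter=","`.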

 

If your dataset uses another format, please specify “Other” in the dataset detail information. Feel free to contact us or join the discussion on our Slack channel if you feel an important single-cell data format is missing!

How do I specify the file format of my dataset?

Upon uploading, FASTGenomics will ask you to specify the file format of your dataset. You can also change this information on the detail page of the dataset. This information can only be modified by the uploader of the dataset.

Figure: Data upload dialog. Upon uploading a new dataset, FASTGenomics will ask you to set the data format.

Figure: File format attribute. If you are the uploader of the dataset, you can change the file format on the dataset detail page.

FASTGenomics can read your data out of the box only if the file format is specified on the dataset detail page.
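
If you are preparing several files for upload, it can help to work out beforehand which format to declare for each. The following is a small sketch, assuming the extensions listed in the table of supported formats; the helper `suggest_format` and the dictionary `FORMAT_BY_EXTENSION` are hypothetical and are not part of FASTGenomics or fgread. Anything with an unlisted extension falls back to "Other".

```python
from pathlib import Path

# Extension-to-format mapping taken from the table of supported formats.
FORMAT_BY_EXTENSION = {
    ".hdf5": "10x hdf5",
    ".rds": "Seurat object",  # per the table: only supported in R
    ".h5ad": "AnnData object",
    ".loom": "Loom",
    ".tsv": "tab-separated text",
    ".csv": "comma-separated text",
}


def suggest_format(filename):
    """Return the format name to declare for a file, or "Other" if unlisted."""
    suffix = Path(filename).suffix.lower()
    return FORMAT_BY_EXTENSION.get(suffix, "Other")


for name in ["pbmc.h5ad", "counts.TSV", "matrix.mtx"]:
    print(name, "->", suggest_format(name))
```

Note that only the final extension is inspected, so a compressed file such as `counts.tsv.gz` would be reported as "Other" by this sketch.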

How can I use FASTGenomics’ reader modules to load a dataset?

If your dataset uses one of FASTGenomics’ supported file formats, you can load the data using our reader modules. Simply call fgread::read_datasets() in R or fgread.read_datasets() in Python. Please note that to read AnnData and Loom in R, you need to use the option experimental_readers.

Figure: Code showing how to read datasets using fgread in Python.

Figure: Code showing how to read datasets using fgread in R.

If you are using R, the data will be loaded as a list of Seurat objects; in Python, you will get a list of AnnData objects. For details on the internal workings of our readers fgread-r and fgread-py, please visit our GitHub page.
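
As a minimal Python sketch of the call described above: it assumes that `fgread` is importable (as it would be inside a FASTGenomics analysis) and that `read_datasets()` returns a list of AnnData objects, as stated in this article. The helpers `load_datasets` and `summarize` are hypothetical names used only for illustration; `n_obs` and `n_vars` are the AnnData attributes for the number of cells and genes.

```python
try:
    import fgread  # assumed to be available inside a FASTGenomics analysis
except ImportError:
    fgread = None


def load_datasets():
    """Load all datasets attached to the analysis via fgread."""
    if fgread is None:
        raise RuntimeError("fgread is not installed in this environment")
    # Per this article, this returns a list of AnnData objects in Python.
    return fgread.read_datasets()


def summarize(datasets):
    """Return (number of cells, number of genes) for each loaded dataset."""
    return [(adata.n_obs, adata.n_vars) for adata in datasets]


if fgread is not None:
    for n_cells, n_genes in summarize(load_datasets()):
        print(f"{n_cells} cells x {n_genes} genes")
```

Checking the dimensions of each object right after loading is a quick way to confirm that the declared file format matched the actual data.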

Where can I get help?

If you have problems with data loading, please contact us via email or join our Slack channel. We will be happy to help!