Introduction to scRNA-seq Data Analysis

Duration: 30 Minutes

Lecture Material:


Roadmap for typical single-cell RNA sequencing data analysis (Jovic et al. 2022).

Quick Summary


Single-cell data analysis offers insight into the biological processes within individual cells, revealing patterns that remain masked when measurements are averaged across many cells in bulk studies. Analysing an individual cell's genomic or transcriptomic information presents both opportunities and challenges.

The Importance of Pre-Processing

Every robust analytical workflow begins with data pre-processing. Here are the critical stages:

  • Pre-Alignment Quality Control: This stage examines the quality of raw sequence reads using tools like FastQC.

  • Adapter Trimming & Filtering: Tools such as Cutadapt remove adapter sequences and low-quality bases, so that the remaining reads carry fewer technical artefacts.

  • Reference-Based Alignment: Processed reads are then aligned to a reference genome or transcriptome using tools such as Cell Ranger, Alevin, or STARsolo, which also quantify expression per cell to produce a count matrix.

  • Count Matrix QC: A crucial step where the count matrix undergoes rigorous quality checks, including doublet detection, imputation, and filtering techniques.
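
The filtering part of count matrix QC can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended pipeline: the gene names, count values, and the three thresholds are all made up for the example, the "MT-" prefix is assumed to mark mitochondrial genes (the human gene-symbol convention), and doublet detection and imputation are not covered.

```python
# Toy count matrix: one row of counts per cell, one column per gene.
# All values are invented for illustration.
genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH", "CD3E"]
counts = {
    "cell_A": [5, 3, 120, 80, 40],   # healthy-looking cell
    "cell_B": [90, 70, 20, 15, 5],   # high mitochondrial fraction
    "cell_C": [0, 0, 2, 1, 0],       # almost empty droplet
}

# Hypothetical thresholds; real cut-offs are dataset-specific.
MIN_TOTAL_COUNTS = 50
MIN_GENES_DETECTED = 3
MAX_MITO_FRACTION = 0.2


def qc_metrics(row, gene_names):
    """Return total counts, number of detected genes, and mitochondrial fraction."""
    total = sum(row)
    detected = sum(1 for c in row if c > 0)
    mito = sum(c for c, g in zip(row, gene_names) if g.startswith("MT-"))
    mito_fraction = mito / total if total > 0 else 0.0
    return total, detected, mito_fraction


def passes_qc(row, gene_names):
    """Keep a cell only if it clears all three thresholds."""
    total, detected, mito_fraction = qc_metrics(row, gene_names)
    return (
        total >= MIN_TOTAL_COUNTS
        and detected >= MIN_GENES_DETECTED
        and mito_fraction <= MAX_MITO_FRACTION
    )


kept_cells = [cell for cell, row in counts.items() if passes_qc(row, genes)]
print(kept_cells)
```

In this toy data only cell_A is kept: cell_B fails on mitochondrial fraction and cell_C on total counts.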

Unveiling Cell-Types through General Analysis

After pre-processing, the data pass through several steps:

  • Normalization & Standardization: This step reduces technical biases such as differences in sequencing depth between cells, so that the remaining differences in gene expression are more likely to be biological. Methods range from Counts Per Million to Quantile Normalization.

  • Feature Selection: From the myriad of genes detected, a subset showing informative variation is chosen using techniques like Variance Thresholding or dispersion-based methods.

  • Dimension Reduction: Transforming the high-dimensional gene expression data into a lower-dimensional space aids visualization and analysis. Methods such as PCA, t-SNE, and UMAP are commonly used.

  • Clustering: Cells are grouped based on the similarity of their gene expression profiles using techniques like K-means clustering or density-based methods.

  • Cell-Type Inference: Individual cells are categorized into specific cell types or subtypes based on their expression profiles, helping decode tissue heterogeneity.
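
To make the first steps above concrete, the following sketch chains Counts Per Million normalization, variance-based feature selection, and K-means clustering on a toy matrix. It is a simplified illustration under stated assumptions: the counts and gene names are invented, the K-means seeding (first k cells) is a shortcut chosen for reproducibility, and the dimension reduction step is skipped to keep the example dependency-free. Real analyses would normally rely on a dedicated toolkit rather than hand-written code like this.

```python
import math
import statistics

# Toy count matrix (made-up values): rows are cells, columns are genes.
# Cells 2 and 3 repeat the profiles of cells 0 and 1 at twice the depth.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
counts = [
    [90, 5, 50, 5],
    [5, 90, 50, 5],
    [180, 10, 100, 10],
    [10, 180, 100, 10],
]


def log_cpm(row):
    """Counts per million followed by log1p, removing sequencing-depth effects."""
    total = sum(row)
    return [math.log1p(c / total * 1e6) for c in row]


def top_variable_genes(matrix, n_top):
    """Indices of the n_top columns with the highest variance across cells."""
    variances = [statistics.pvariance(col) for col in zip(*matrix)]
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_top])


def kmeans(points, k, n_iter=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels


normalized = [log_cpm(row) for row in counts]
selected = top_variable_genes(normalized, n_top=2)
reduced = [[row[j] for j in selected] for row in normalized]
labels = kmeans(reduced, k=2)
print([genes[j] for j in selected], labels)
```

After normalization, cells that differ only in sequencing depth have near-identical profiles, so they fall into the same cluster; the two genes with constant relative expression are dropped by the variance filter.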

Exploratory Analysis

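  • Differential Expression & Abundance: Differential expression identifies genes that exhibit significant expression differences between groups of cells; differential abundance identifies cell types or states whose proportions differ between conditions.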

  • Trajectory Analysis: This offers a temporal view, ordering cells along an inferred path (often called pseudotime) according to their transition through various states.

  • Cell-Cell Communication: Here, multi-cellular coordination and tissue functions are inferred, shedding light on how cells interact.
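
As a rough illustration of the first item above, the sketch below computes a log2 fold change and Welch's t statistic for one gene between two clusters, then compares cell-type fractions between two conditions. Everything here is assumed for the example: the expression values (taken to be normalized but not log-transformed), the cluster and condition names, and the pseudocount of 1. A real analysis would also need p-values, multiple-testing correction across genes, and replicate-aware statistics.

```python
import math
import statistics
from collections import Counter

# Made-up normalized expression of a single gene in two clusters of cells.
cluster_1 = [8.0, 9.0, 10.0, 9.0]
cluster_2 = [2.0, 3.0, 2.0, 1.0]


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    mean_a = statistics.fmean(group_a) + pseudocount
    mean_b = statistics.fmean(group_b) + pseudocount
    return math.log2(mean_a / mean_b)


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se_a = statistics.variance(group_a) / len(group_a)
    se_b = statistics.variance(group_b) / len(group_b)
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / math.sqrt(se_a + se_b)


def cell_type_fractions(cell_labels):
    """Fraction of cells assigned to each cell type within one condition."""
    tally = Counter(cell_labels)
    total = sum(tally.values())
    return {cell_type: n / total for cell_type, n in tally.items()}


# Differential expression for the toy gene.
lfc = log2_fold_change(cluster_1, cluster_2)
t_stat = welch_t(cluster_1, cluster_2)
print(f"log2FC={lfc:.2f}, t={t_stat:.2f}")

# Differential abundance: hypothetical cell-type labels in two conditions.
control = ["T cell"] * 6 + ["B cell"] * 4
treated = ["T cell"] * 3 + ["B cell"] * 7
control_frac = cell_type_fractions(control)
treated_frac = cell_type_fractions(treated)
shift = {ct: treated_frac[ct] - control_frac[ct] for ct in control_frac}
print(shift)
```

The fold change and t statistic summarize how strongly one gene separates the two clusters, while the shift in fractions shows which cell types become more or less common between conditions.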

References

  1. Jovic D, Liang X, Zeng H et al. Single-cell RNA sequencing technologies and applications: A brief overview. Clinical and Translational Medicine 2022;12. DOI: 10.1002/ctm2.694.

  2. Orchestrating Single-Cell Analysis with Bioconductor

  3. Single-cell best practices

  4. Awesome Single Cell

  5. Analysis of single-cell RNA-seq data