Baseline Models for Single-Cell RNA-seq Dimensionality Reduction | HackerNoon

News Room · Published 21 May 2025

Table of Links

Abstract and 1. Introduction

2. Background

2.1 Amortized Stochastic Variational Bayesian GPLVM

2.2 Encoding Domain Knowledge through Kernels

3. Our Model and 3.1 Pre-Processing and Likelihood

3.2 Encoder

4. Results and Discussion and 4.1 Each Component is Crucial to Modified Model Performance

4.2 Modified Model Achieves Significant Improvements over Standard Bayesian GPLVM and is Comparable to scVI

4.3 Consistency of Latent Space with Biological Factors

5. Conclusion, Acknowledgement, and References

A. Baseline Models

B. Experiment Details

C. Latent Space Metrics

D. Detailed Metrics

A BASELINE MODELS

A.1 SCVI

Proposed by Lopez et al. (2018), single-cell variational inference (scVI) is a variational autoencoder tuned for single-cell data that has been shown to match state-of-the-art methods in a variety of downstream tasks, including clustering and differential expression (Lopez et al., 2018; Luecken et al., 2022). Furthermore, its neural network structure makes the model scalable to large datasets. An overview of the model is presented in Figure 5.

Figure 5: Overview of the scVI architecture, adapted from Lopez et al. (2018).

We highlight several key components of the model that target phenomena commonly seen in single-cell data: (1) count data, (2) batch effect, and (3) library size normalization.

Count Data. As scRNA-seq raw count data are discrete, scVI adopts discrete likelihoods, such as the negative binomial, for its models. This allows the model to learn a latent space directly from the raw expression data without any conventional pre-processing pipelines. Note that the original paper uses a zero-inflated negative binomial likelihood for the main model to account for dropouts, where gene expression in a cell goes undetected due to technical artifacts (Lopez et al., 2018; Luecken & Theis, 2019).
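To make the likelihood concrete, the following is a minimal sketch of evaluating a negative binomial log-likelihood on raw counts, parameterized by a mean and an inverse dispersion. The function name and the (mu, r) parameterization are illustrative, not scVI's internal API.

```python
# Hedged sketch: negative binomial log-likelihood for raw counts,
# parameterized by mean `mu` and inverse dispersion `r`.
import numpy as np
from scipy.stats import nbinom

def nb_log_likelihood(counts, mu, r):
    """Log-likelihood of counts under NB with mean mu, inverse dispersion r."""
    # scipy's nbinom takes (n, p); convert: n = r, p = r / (r + mu),
    # which gives E[x] = n(1-p)/p = mu as intended.
    p = r / (r + mu)
    return nbinom.logpmf(counts, r, p).sum()

counts = np.array([0, 3, 7, 1])       # raw gene counts for one cell
mu = np.array([0.5, 2.0, 6.0, 1.5])   # decoder-predicted means
print(nb_log_likelihood(counts, mu, r=2.0))
```

The zero-inflated variant used in the original paper would mix this density with a point mass at zero; the conversion above only covers the plain negative binomial case.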

Accounting for Batch Effects. scVI also models effects from different sampling batches by incorporating each cell's batch ID in both the encoding and decoding portions of the VAE. While batch information is supplied as input to the neural network encoder and decoder, it is unclear how exactly the batch effects are modelled.
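The input-level mechanism can be sketched as one-hot encoding the batch ID and concatenating it to the expression vector. This illustrates only the conditioning at the input; as noted above, how the networks internally use this signal is not spelled out, and the function below is purely illustrative.

```python
# Hedged sketch: feeding batch information to an encoder by
# concatenating a one-hot batch indicator to the expression vector.
import numpy as np

def encoder_input(expression, batch_id, n_batches):
    one_hot = np.zeros(n_batches)
    one_hot[batch_id] = 1.0
    return np.concatenate([expression, one_hot])

x = np.log1p(np.array([0.0, 3.0, 7.0]))          # one cell's counts
print(encoder_input(x, batch_id=1, n_batches=2))  # length 3 + 2 = 5
```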

Library Size Normalization. The third component scVI accounts for is the difference in total gene expression count per cell, or library size. In the raw count data, each cell has a different total gene count, which may affect comparisons between cells and impact downstream analysis (Hie et al., 2020). As this difference in library size, or sequencing depth, may be a result of technical noise, scVI models a scaling factor ℓ as a stand-in for library size. This latent variable is modelled as log-normal, as done in Zappia et al. (2017), with its mean and variance learned by the neural network encoder from the raw counts and batch information. To avoid conflating the effects of the scaling factor with biological effects in the data, a softmax is applied to the output of the decoder before it is multiplied by the scaling factor to determine the negative binomial likelihood mean.
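The softmax-then-scale construction can be sketched as follows: the softmax turns the decoder output into gene proportions that sum to one, so the scaling factor ℓ alone carries the overall magnitude. Names here are illustrative.

```python
# Hedged sketch of the mean construction: softmax normalizes the
# decoder output into proportions, which are scaled by `ell`.
import numpy as np

def nb_mean(decoder_output, ell):
    # subtract the max for numerical stability before exponentiating
    z = np.exp(decoder_output - decoder_output.max())
    proportions = z / z.sum()        # sums to 1, so `ell` sets the scale
    return ell * proportions

mu = nb_mean(np.array([0.1, 2.0, -1.0]), ell=1000.0)
print(mu.sum())  # the per-gene means sum to the library-size factor
```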

The corresponding loss term for each data point xₙ is the negative evidence lower bound,

−E_{q(zₙ, ℓₙ | xₙ)}[log p(xₙ | zₙ, ℓₙ)] + KL(q(zₙ | xₙ) ∥ p(zₙ)) + KL(q(ℓₙ | xₙ) ∥ p(ℓₙ)),

where the parameters to be optimized are the weights of the neural network encoders and decoders as well as the inverse dispersion factor r of the negative binomial likelihood. Because the loss decomposes into per-datapoint terms, the model can be trained with mini-batching (Hoffman et al., 2013).
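The per-datapoint decomposition is what enables stochastic training: a rescaled mini-batch mean gives an unbiased estimate of the full-data loss. A minimal sketch, with a stand-in per-cell loss in place of the actual ELBO term:

```python
# Hedged sketch of mini-batch loss estimation (Hoffman et al., 2013):
# rescale the mini-batch mean so it estimates the full-data sum.
import numpy as np

def minibatch_loss(per_cell_loss, data, batch_idx, n_total):
    return n_total * np.mean([per_cell_loss(data[i]) for i in batch_idx])

data = np.arange(10, dtype=float)
loss = lambda x: (x - 4.5) ** 2       # stand-in per-cell loss
print(minibatch_loss(loss, data, batch_idx=[0, 9], n_total=len(data)))
```

Averaged over random mini-batches, this estimate matches the sum of per-cell losses over the whole dataset, which is why the decomposability of the objective matters.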

While scVI has been shown to perform well in a variety of downstream tasks (Lopez et al., 2018; Luecken et al., 2022), its complex architecture (as seen in Figure 5) and opaque incorporation of known nuisance variables like batch effects make the model and its inferences difficult to interpret.

A.2 LDVAE

In response to this lack of interpretability in the original scVI, Svensson et al. (2020) proposed a linear version of scVI, LDVAE, where the neural network decoder is replaced with a linear mapping. In particular, the generative model of LDVAE is defined as follows:

zₙ ∼ N(0, I), xₙ ∼ NegativeBinomial(ℓₙ · softmax(W zₙ), r),

where W represents the linear mapping. Note that the mapping from latent space to data space is not completely linear, as a nonlinearity is introduced by the softmax function. Moreover, Svensson et al. explored applying a BatchNorm layer to the linearly decoded parameters and found that it matched or improved model performance in reconstruction error and in learning the latent space on a mouse embryo development dataset (Svensson et al., 2020; Cao et al., 2019). This BatchNorm layer is thus adopted in the LDVAE model, which further obscures a straightforward interpretation of the mapping defined by the decoder.
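The linear decoding step can be sketched as below: a single matrix W maps the latent code to gene space, after which the softmax and library scaling proceed as in scVI. The BatchNorm layer discussed above is omitted for clarity, and all names are illustrative.

```python
# Hedged sketch of LDVAE's generative mapping: a linear layer W from
# latent z to gene space, followed by softmax and library scaling.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_latent = 5, 2
W = rng.normal(size=(n_genes, n_latent))  # rows act as gene loadings

def ldvae_decode(z, ell):
    logits = W @ z                      # the linear map; the softmax
    p = np.exp(logits - logits.max())   # below is the only nonlinearity
    return ell * p / p.sum()            # negative binomial mean

mu = ldvae_decode(rng.normal(size=n_latent), ell=500.0)
print(mu.sum())  # proportions scaled by the library-size factor
```

Because each row of W directly weights the latent dimensions for one gene, the loadings can be inspected much like those of a factor model, which is the source of LDVAE's improved interpretability.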

Thus, while the LDVAE model allows for a more interpretable mapping from the latent space to the data space when compared to scVI, the use of a library size surrogate and an unclear incorporation of batch information through neural networks make both models less interpretable.

Authors:

(1) Sarah Zhao, Department of Statistics, Stanford University, ([email protected]);

(2) Aditya Ravuri, Department of Computer Science, University of Cambridge ([email protected]);

(3) Vidhi Lalchand, Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard ([email protected]);

(4) Neil D. Lawrence, Department of Computer Science, University of Cambridge ([email protected]).
