SMS scnews item created by Shila Ghazanfar at Mon 20 Nov 2017 0904
Type: Seminar
Distribution: World
Expiry: 21 Nov 2017
Calendar1: 20 Nov 2017 1300-1400
CalLoc1: CPC Seminar Room Level 5
Auth: sheilag@10.17.88.210 (sgha9047) in SMS-WASM

Statistical Bioinformatics Seminar: Mason -- Modelling transcriptional variability in single cell RNA-seq data during human embryogenesis captures changes in the regulation of critical developmental genes

The aim of the statistical bioinformatics seminar is to provide a forum for 
people working within the broad area of computation and statistics and their 
application to various aspects of biology to present their work and showcase 
their ongoing projects. It is intended to foster the exchange of ideas and 
build potential collaborations across multiple disciplines.

Monday November 20, 2017 (PLEASE NOTE: Special location - Level 5 Large 
Meeting Room, Usual time: 1pm - 2pm)

Speaker: Elizabeth Mason (The University of Melbourne)

Title: Modelling transcriptional variability in single cell RNA-seq data 
during human embryogenesis captures changes in the regulation of critical 
developmental genes

Abstract: Human development is a temporally and spatially ordered series of 
events that occur with remarkable precision; the same DNA blueprint gives 
rise to more than 250 sharply defined cell phenotypes. At the functional 
phenotype level, embryogenesis appears predictable because we observe the 
average behaviour of many individual cells, even as the number of cells, 
the range of phenotypes and transcriptional complexity increase during 
the course of development. It is only when we evaluate single molecules and 
transcripts that the stochastic nature of gene expression is revealed, for 
example in single cell RNA-seq (scRNA-seq) experiments. Current methods reduce scRNA-seq 
data to a well-defined trajectory based on the abundance of key regulators of 
phenotype, and differential abundance between cells in a given phenotype is 
used to identify sub-populations. Here we present an alternative approach: 
measuring transcriptional variability at the gene level to infer the level 
of regulation imposed on each gene, which reflects an intrinsic property of 
development that is often overlooked. While linear models have been a 
successful framework to characterize differences in abundance between 
phenotypes on average, they do not account for stochastic differences 
captured by scRNA-seq experiments. Accurately determining abundance and 
variability is further complicated by the sparseness of non-zero expression 
values. To address these challenges and evaluate gene expression during 
human pre-implantation embryogenesis, we applied a statistical mixture 
model to scRNA-seq data. Fitting the model on a gene-by-gene basis allowed 
us to evaluate shifts in the proportion of cells expressing a given gene (λ), 
and also the mean (μ) and standard deviation (σ) of expression. From here, a 
correlation-based analysis evaluated whether abundance (μ) and variability (σ) 
capture different aspects of transcriptional regulation. While each metric 
largely identified the same genes, the number and nature of relationships 
between them differed. Indeed, genes sharing correlated patterns of variability 
during development were enriched for motifs associated with developmental 
transcription factors (e.g. HIC2, PPARG, E2F4 and ZNF692). Variability was 
more effective than abundance at specifically detecting regulatory 
relationships during development, and with less redundancy. Our approach 
provides a gene-centric platform to evaluate population-based parameters 
of gene expression, while preserving the complexity of scRNA-seq data.
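
The gene-by-gene fit described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the mixture model, so this sketch assumes a simple two-component form: a point mass for cells that do not express the gene, plus a normal distribution on log2 expression for cells that do. The function name fit_gene and the detection_threshold parameter are hypothetical and are not taken from the speaker's method.

```python
import numpy as np

def fit_gene(expr, detection_threshold=0.0):
    """Estimate (lam, mu, sigma) for one gene across a set of cells.

    Assumed model (an illustration, not the speaker's implementation):
    a cell expresses the gene with probability lam; if it does, its
    log2 expression is drawn from a normal with mean mu and sd sigma.
    """
    expr = np.asarray(expr, dtype=float)
    if expr.size == 0:
        raise ValueError("expr must contain at least one cell")

    # Cells above the threshold are treated as expressing the gene.
    expressed = expr[expr > detection_threshold]
    lam = expressed.size / expr.size

    # Mean and sd are undefined with fewer than two expressing cells.
    if expressed.size < 2:
        return lam, float("nan"), float("nan")

    log_expr = np.log2(expressed)
    mu = float(log_expr.mean())
    sigma = float(log_expr.std(ddof=1))  # sample standard deviation
    return lam, mu, sigma

# Hypothetical usage: compare one gene between two developmental stages.
stage_a = [0, 0, 2, 8, 4, 0]
stage_b = [3, 5, 4, 6, 0, 4]
for name, cells in (("stage A", stage_a), ("stage B", stage_b)):
    lam, mu, sigma = fit_gene(cells)
    print(name, round(lam, 2), round(mu, 2), round(sigma, 2))
```

Under these assumptions, a shift in λ between stages reflects a change in how many cells express the gene, while μ and σ describe abundance and variability among expressing cells only.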

About the speaker: Lizzi began her career in human genomics as a laboratory 
manager and laboratory technician with Professor Greg Gibson (Centre for 
Integrative Genomics, Georgia Institute of Technology). She conducted two investigations 
in Australia which identified maternal influences on development of the neonate 
immune system, and uncovered population structure of the leukocyte transcriptome. 
Together with scientists at Emory University, Greg and Lizzi initiated the CIG’s 
involvement in the WHOLE (Wellness and Health Omics Linked to the Environment) 
study of Predictive Health Genomics in Atlanta (USA), which is currently in 
its 6th year. Lizzi has recently completed a PhD in systems biology of human 
stem cells at the Australian Institute for Bioengineering and Nanotechnology 
at the University of Queensland. Her PhD project formed an international 
collaboration with Professor Christine Wells (University of Melbourne AUS), 
stem cell biologists Professors Martin Pera (Jackson Laboratory USA) and 
Ernst Wolvetang (University of Queensland AUS), biostatistician Assistant 
Professor Jessica Mar (Albert Einstein College of Medicine, USA) and 
computational biologist Professor John Quackenbush (Harvard University, USA). 
Her primary focus is evaluating whether molecular variability in stem cell 
populations provides an important, but until now hidden, predictor of cellular 
behaviour and phenotype. Phenotypic heterogeneity in clonally derived cell 
populations is ubiquitous, and biologically relevant information is often 
masked by using population-averaging techniques rather than individual 
cell-based measurements. She has developed new network approaches which incorporate 
gene expression variance, with the goal of identifying genetic elements which 
stabilize a cell phenotype or push a cell to transition between phenotypes. 
During her PhD, Lizzi was invited to present her work in departmental 
seminars at the Harvard Stem Cell Institute, the Lieber Brain Institute at 
Johns Hopkins University, and the Black Family Stem Cell Institute at Mount Sinai 
Hospital, New York. She was also one of 12 international scientists who were 
invited to participate in the Radcliffe Exploratory Workshop for Variation at 
Harvard University in 2011. She is currently based with Professor Christine 
Wells in the Centre for Stem Cell Systems at the University of Melbourne, 
where she is working on applied statistical methods to evaluate molecular 
variability in single cell RNA-seq data.