
Explore the Pinello Lab

Research Summary

The focus of the Pinello laboratory is to use innovative computational approaches and cutting-edge experimental assays to systematically analyze sources of genetic and epigenetic variation and gene expression variability that underlie human traits and diseases. The lab uses machine learning, data mining and high-performance computing technologies, for instance parallel computing and cloud-oriented architectures, to solve computationally challenging and Big Data problems associated with next generation sequencing data analysis. Our mission is to use computational strategies to further our understanding of disease etiology and to provide a foundation for the development of new drugs and more targeted treatments.

Research Projects

Epigenetic variability in cellular identity and gene regulation

We study the relationships among epigenetic regulators, chromatin structure and DNA sequence, and how these factors influence gene expression patterns. We recently developed an integrative computational pipeline called HAYSTACK (1). HAYSTACK is a software tool for studying epigenetic variability, cross-cell-type plasticity of chromatin states and transcription factor motifs, and it provides mechanistic insights into chromatin structure, cellular identity and gene regulation. By integrating sequence information, histone modification and gene expression data measured across multiple cell lines, it is possible to identify the most epigenetically variable regions of the genome, to find cell-type specific regulators, and to predict cell-type specific chromatin patterns that are important in normal development and differentiation or potentially involved in diseases such as cancer.

Computational methods for genome editing

We embraced the revolution in functional genomics made possible by novel genome editing approaches such as CRISPR/Cas9, base editing and prime editing by developing computational tools for the design (2) and quantification (3) of CRISPR edits, and for the analysis of coding and non-coding tiling screens for functional genomics (4).

We have developed CRISPResso2, software for the quantification of genome editing events that is now the de facto standard for the genome editing community. In collaboration with the groups of Daniel Bauer and Stuart Orkin, we applied our computational strategies to aid the development of several CRISPR screens for dissecting enhancer functionality in the blood system (4).

We have recently proposed a protocol that describes in detail both the computational and benchtop implementation of arrayed and/or pooled CRISPR genome editing experiments and serves as a key resource for labs interested in adopting CRISPR genome editing (5).

Exploring single cell gene expression variation in development and cancer

Cancer often starts from mutations occurring in a single cell that result in a heterogeneous cell population. Although traditional gene expression assays have provided important insights into the transcriptional programs of cancer cells, they often measure a combined signal from a mixed population of cells and hence do not provide adequate information regarding subpopulations of malignant cells. Emerging single cell assays now offer exciting opportunities to isolate and study individual cells in heterogeneous cancer tissues, allowing us to investigate how genes transform one subpopulation into another. Characterizing stochastic variation at the single cell level is crucial to understanding how healthy cells use variation to modulate their gene expression programs, and how these patterns of variation are disrupted in cancer cells. We are developing tools to characterize cellular types and states at single cell resolution using single cell transcriptomic or epigenomic data. For example, we recently released STREAM (6) (Single-cell Trajectories Reconstruction, Exploration And Mapping), an interactive computational pipeline for reconstructing complex cellular developmental trajectories from sc-qPCR, scRNA-seq or scATAC-seq data. This method can be used for disentangling complex cellular types and states in development, cancer, differentiation or perturbation studies.


Selected Publications

Pinello L*†, Farouni R*, Yuan GC†. Haystack: systematic analysis of the variation of epigenetic states and cell-type specific regulatory elements. Bioinformatics. 2018; 34(11):1930-1933.

Cancellieri S, Canver MC, Bombieri N, Giugno R†, Pinello L†. CRISPRitz: rapid, high-throughput, and variant-aware in silico off-target site identification for CRISPR genome editing. Bioinformatics. 2019 Nov 25. pii: btz867.

Clement K, Rees H, Canver MC, Gehrke JM, Farouni R, Hsu JY, Cole MA, Liu DR, Joung JK, Bauer DE†, Pinello L†. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat Biotechnol. 2019 Mar;37(3):224-226.

Hsu JY, Fulco CP, Cole MA, Canver MC, Pellin D, Sher F, Farouni R, Clement K, Guo JA, Biasco L, Orkin SH, Engreitz JM, Lander ES, Joung JK, Bauer DE†, Pinello L†. CRISPR-SURF: discovering regulatory elements by deconvolution of CRISPR tiling screen data. Nat Methods. 2018 Dec;15(12):992-993.

Canver MC*, Haeussler M*, Bauer DE, Orkin SH, Sanjana NE, Shalem O, Yuan GC, Zhang F, Concordet JP, Pinello L. Integrated design, execution, and analysis of arrayed and pooled CRISPR genome-editing experiments. Nat Protocols. 2018 May;13(5):946-986.

Chen H, Albergante L, Hsu JY, Lareau CA, Lo Bosco G, Guan J, Zhou S, Gorban AN, Bauer DE, Aryee MJ, Langenau DM, Zinovyev A, Buenrostro JD, Yuan GC†, Pinello L†. Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM. Nat Commun. 2019 Apr 23;10(1):1903.

*Co-first authors
†Co-corresponding authors

Research Image - STREAM on transcriptomic data from the mouse hematopoietic system

A: Dimensionality reduction and reconstructed hierarchical structure composed of curves approximating the inferred trajectories. Single cells are represented as circles and colored according to the FACS sorting labels. B: Flat tree representation at single cell resolution; branches are represented as straight lines (cells are represented as in A). The lengths of the branches and the distances between cells and their assigned branches are proportional to the original representation in the 3D space.

Our Researchers

Luca Pinello, PhD

Principal Investigator

Group Members

  • Huidong Chen, PhD
  • Kendell Clement, PhD (shared with Keith Joung Lab)
  • Nafiz Hamid, PhD (shared with Keith Joung Lab)
  • Jonathan Hsu* (shared with Keith Joung Lab)
  • Qin Qian, PhD (shared with Dave Langenau Lab) – Research Fellow
  • Micheal Vinyard* (shared with Gad Getz Lab)
  • Justine Shih (shared with Keith Joung Lab)
  • Nikolaos Trasanidis†
  • Qian Zhang (shared with Daniel Bauer Lab)
  • Jiecong Lin**

*PhD student
**Visiting PhD student
†Visiting Research Fellow