In summer 2020, we introduced intMEMOIR, a serine integrase-based recording system that allows in situ readout, and demonstrated its ability to reconstruct lineage relationships in cultured stem cells and flies. The system employs an array of independent three-state genetic memory elements that can recombine stochastically and irreversibly, allowing up to 59,049 distinct digital states.
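The state count is simple combinatorics: n independent three-state elements can take 3^n joint configurations, and 59,049 equals 3^10. The short sketch below checks that arithmetic. The array size of ten elements is inferred here from the stated total rather than quoted from the text, and the function names are illustrative only.

```python
def num_states(n_elements: int, states_per_element: int = 3) -> int:
    """Number of joint configurations of independent memory elements."""
    if n_elements < 0 or states_per_element < 1:
        raise ValueError("need n_elements >= 0 and states_per_element >= 1")
    return states_per_element ** n_elements


def elements_needed(total_states: int, states_per_element: int = 3) -> int:
    """Smallest array size whose configuration count reaches total_states."""
    if total_states < 1 or states_per_element < 2:
        raise ValueError("need total_states >= 1 and states_per_element >= 2")
    n = 0
    while states_per_element ** n < total_states:
        n += 1
    return n


if __name__ == "__main__":
    # Assumption: ten elements, inferred from 59,049 = 3**10.
    print(num_states(10))
    print(elements_needed(59_049))
```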
intMEMOIR sequences and dataset are available here: https://data.caltech.edu/records/1303
A description of how to use our code for the reconstruction of lineages is available here: https://agranado.github.io/intMEMOIR.html
To catalyze the development of new methods to perform lineage reconstruction, we organized the Allen Institute Lineage Reconstruction DREAM Challenge, which ran from October 2019 through February 2020. DREAM challenges are a platform for crowdsourcing collaborative competitions where a rigorous evaluation of each submitted solution allows for objective comparison and assessment of their performance. The value of DREAM resides not only in the acceleration of research through the participation of many teams in solving a problem, but just as importantly, in the diversification of approaches used in emerging areas of biology and in the quality and reproducibility of each provided solution.
GitHub repositories for examples of the best-performing challenge methods are linked below.
Sub-challenge 1 winners: Cassiopeia (Yosef Lab)
Cassiopeia is a software suite for processing data from single cell lineage tracing experiments. This suite comes equipped with three main modules: Target Site Sequencing Pipeline, Phylogeny Reconstruction, and Benchmarking.
https://github.com/YosefLab/Cassiopeia
Sub-challenges 2 and 3 winners: Distance based Cell LinEAge Reconstruction (DCLEAR) (Il-Youp Kwak and Wuming Gong)
R/DCLEAR is an R package for Distance based Cell LinEAge Reconstruction (DCLEAR).
https://github.com/ikwak2/DCLEAR
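As a rough illustration of what "distance based" reconstruction means in general, the sketch below computes Hamming distances between recorded barcodes and picks the most similar pair of cells, which a distance-based method would typically join first. This is not DCLEAR's or Cassiopeia's actual algorithm or API. The barcodes, cell names, and function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_pair(barcodes: dict[str, str]) -> tuple[str, str]:
    """Return the two cell names whose barcodes are most similar."""
    return min(
        combinations(sorted(barcodes), 2),
        key=lambda pair: hamming(barcodes[pair[0]], barcodes[pair[1]]),
    )


# Hypothetical three-state barcodes (0 = unedited, 1 and 2 = edited states).
cells = {
    "cell_a": "0120010200",
    "cell_b": "0120010210",
    "cell_c": "2100210001",
}

if __name__ == "__main__":
    print(closest_pair(cells))
```

Repeating this step on merged groups yields a tree. Real methods refine the distance itself, for example by weighting edits according to how informative they are.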
Additional information about the Allen Discovery Center for Lineage Tracing DREAM Challenges is available here: http://dreamchallenges.org/project/allen-institute-cell-lineage-reconstruction-dream-challenge/
In our 2020 Nature Biotechnology publication, we introduce Zombie, a system for image-based readout of short (20-base-pair) DNA barcodes. In this system, phage RNA polymerases transcribe engineered barcodes in fixed cells, and the resulting RNA is subsequently detected by fluorescent in situ hybridization. Using competing match and mismatch probes, Zombie can accurately discriminate single-nucleotide differences in the barcodes.
Raw and analyzed data, as well as code to recreate the publication results, are available here: https://data.caltech.edu/records/1303
Optimized lineage recorder: scGESTALTv2
We have recently described an optimized lineage recorder, scGESTALTv2, with improved lineage barcode recovery from single cells. Details of the method and associated results can be found in our preprint:
The zebrafish brain lineage trees (15 dpf) from these experiments can be explored here: https://scgestalt.mckennalab.org/
URD reconstructed developmental trajectories of zebrafish embryogenesis
URD is an R package designed for reconstructing transcriptional trajectories underlying specification or differentiation processes in the form of a branching tree, using single cell RNA-sequencing data.
URD is hosted on GitHub (https://github.com/farrellja/URD). Detailed installation instructions and tutorials are located in the repository.
Accessing zebrafish brain scRNA-seq and scGESTALT data
The juvenile zebrafish brain dataset (23-25 dpf) from Raj et al. Nature Biotechnology 2018 can be downloaded from NCBI GEO under accession number GSE105010. This Series contains the following data:
1. Raw BAM files and processed counts matrices from inDrops scRNA-seq experiments analyzing transcriptome only.
2. Raw BAM files and processed tables containing CRISPR-Cas9 barcode editing for inDrops scRNA-seq experiments analyzing transcriptome + lineage (scGESTALT).
3. Processed R object (58,492 cells) generated for the above two experiments using Seurat that can be used to explore the data.
4. Raw BAM files and processed tables containing CRISPR-Cas9 barcode editing for experiments analyzing genomic DNA from zebrafish embryos (GESTALT).
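For scripted access, the GEO record page and the series download directory can be derived from the accession number. The sketch below builds both URLs without contacting the network. It assumes GEO's usual directory convention, in which the last three digits of the accession are replaced by "nnn" in the parent folder name. The helper names are illustrative, and the files present in the directory should be checked on the GEO record itself.

```python
def geo_series_page_url(accession: str) -> str:
    """URL of the GEO record page for a series accession."""
    return f"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={accession}"


def geo_series_ftp_url(accession: str) -> str:
    """URL of the series directory on the GEO download server.

    Assumes the 'GSEnnn' stub convention for parent folders.
    """
    accession = accession.strip().upper()
    digits = accession[3:]
    if not (accession.startswith("GSE") and digits.isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    # Drop the last three digits and append 'nnn' to get the parent folder.
    stub = "GSE" + (digits[:-3] if len(digits) > 3 else "") + "nnn"
    return f"https://ftp.ncbi.nlm.nih.gov/geo/series/{stub}/{accession}/"


if __name__ == "__main__":
    print(geo_series_page_url("GSE105010"))
    print(geo_series_ftp_url("GSE105010"))
```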
In addition, the zebrafish brain lineage trees from these experiments can be explored here: https://krishna.gs.washington.edu/content/members/
Designing and analyzing scGESTALT experiments
We have written a protocol paper that details all the reagents and steps involved in using barcode editing via CRISPR-Cas9 for recording lineages in zebrafish, using both genomic DNA and scRNA-seq to sequence recorded lineages, and parsing the sequenced datasets to extract all editing events.
The pipeline for analyzing edited lineage barcodes is available here: https://github.com/aaronmck/SC_GESTALT
Seattle Organismal Molecular Atlases
Single cell genomics can in principle reveal the molecular definition of every cell type in an entire animal. The Shendure, Trapnell and collaborating labs have begun generating "whole organism" single cell datasets for mouse, worm, fly and other animals, including profiles of gene expression (sci-RNA-seq) and chromatin accessibility (sci-ATAC-seq).
Data, analysis, and tutorials are available here: http://atlas.gs.washington.edu/hub/
Giotto Viewer is a web-based visualization package for spatial transcriptomic data. It displays spatial gene expression data interactively, letting users explore the data in ways that would be difficult with a static visualization.
Giotto Viewer details, demos, and tutorials are available here: http://spatial.rc.fas.harvard.edu/giotto-viewer/