Phylogeny, data integration, data visualization, software
Introduction
Phylogenetic trees are widely used in several biological fields, including comparative genomics, epidemiology, and microbiome research. Integrating and visualizing phylogenetic trees with multidimensional associated data sets helps to identify patterns and generate new hypotheses. For example, a recent study constructed a phylogenetic tree of SARS-CoV-2 and integrated the state of initial diagnosis of Australian SARS-CoV-2 genomes with the country of origin of the GISAID genomes to investigate the origins and transmission pathways of the COVID-19 strains in Australia (Rockett et al.).
The development of high-throughput experimental technologies has expanded the scale of phylogenetic trees and associated data sets. For instance, a microbiome study may collect hundreds of samples and reconstruct a phylogenetic tree representing the evolutionary relationships of a microbial community composed of thousands of species. Associated data sets, such as the species abundance in each sample and the number or status of target genes for each species, can be incorporated and visualized on a phylogenetic tree to reveal new insights into factors that influence microbial community dynamics (Morgan et al.). However, integrating and visualizing multidimensional data with phylogenetic trees is still not an easy task.
Over the past decade, several packages and web tools have been developed to integrate external data into phylogenetic trees, such as iTOL (Letunic and Bork 2019) and Evolview (Subramanian et al.). However, these tools were developed mainly for specific fields and are difficult to apply to other research domains. We previously proposed two general methods for mapping and visualizing associated data on phylogeny, which were implemented in ggtree (Yu et al.). The geom_facet function provided in ggtree (Yu et al. 2017) employs a modular design to separate tree visualization, data integration, and graph alignment (Yu et al.). It allows us to visualize multiple associated data sets in different panels and serves as a general tool, since there is no prerequisite for the input data type (Yu et al.). With the increasing type and scale of biological data, it is a new challenge to visualize richly layered phylogenetic data in the circular layout, which can display more data in a given space. However, the geom_facet function does not work with a circular layout. To fully extend ggtree to support the visualization of multisource phylogenetic data in the era of big data, especially for the circular layout, we developed the ggtreeExtra package.
The ggtreeExtra package allows users to progressively represent taxon-specific features on external panels of a phylogenetic tree and helps them to explore and compare heterogeneous data sets in an evolutionary context. The ggtreeExtra package has been released within the Bioconductor project (Gentleman et al.).
The ggtreeExtra package implements a layer function, geom_fruit, which is a universal function that aligns graphic layers to a phylogenetic tree (fig.). It internally reorders the associated data based on the structure of the phylogenetic tree and visualizes the data using a specified geometric layer function with user-provided aesthetic mappings and nonvariable settings. The resulting graphic layer is displayed side by side with the tree (i.e., on the right-hand side for the rectangular layout or on an external ring for the circular layout; fig. 1A; supplementary table S1, Supplementary Material online). Different data graph layers can be added to a tree progressively. For example, geom_fruit is able to display a heatmap and a bar plot on the outer rings of an annotated phylogenetic tree to compare microbial abundance across different human body sites (supplementary fig.). These two layers were automatically aligned to the circular phylogenetic tree and were displayed on different external rings. The number of external rings is not strictly limited, and the user is free to visualize several associated data sets using different geometric layers on different external rings. Each data set is visualized on an independent ring layer, and multiple ring layers are stacked on a circular phylogenetic tree, which makes the ggtreeExtra package particularly useful for layering different data sets to create highly informative tree graphics.
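The two ideas behind geom_fruit described in this section, reordering associated data to follow the tree and stacking each data set on its own external ring, can be illustrated with a small language-neutral sketch. The Python below is not the ggtreeExtra implementation (the package itself is written in R). The names `align_to_tips`, `stack_rings`, and `Ring` are hypothetical, and treating the gap and width of each ring as fractions of the tree radius is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """One external ring layer (hypothetical structure, for illustration only)."""
    name: str
    inner: float   # inner radius of the ring
    outer: float   # outer radius of the ring
    values: list   # one value per tip, in tree tip order


def align_to_tips(tip_order, records, default=None):
    """Reorder {tip_label: value} records so they follow the tree's tip order.

    Tips with no record receive `default`, so every ring has one slot per tip.
    """
    return [records.get(tip, default) for tip in tip_order]


def stack_rings(tip_order, datasets, tree_radius=1.0, offset=0.05, width=0.2):
    """Place each data set on its own ring outside the tree.

    `offset` (gap before a ring) and `width` (ring thickness) are taken
    as fractions of `tree_radius`; this convention is an assumption.
    """
    rings = []
    inner = tree_radius
    for name, records in datasets.items():
        inner += offset * tree_radius          # leave a gap before the ring
        outer = inner + width * tree_radius    # ring thickness
        rings.append(Ring(name, inner, outer, align_to_tips(tip_order, records)))
        inner = outer                          # the next ring starts here
    return rings


# Toy inputs: tip order as it would come from the tree, plus two data sets.
tip_order = ["t3", "t1", "t2"]
datasets = {
    "abundance": {"t1": 0.4, "t2": 0.1, "t3": 0.5},
    "gene_count": {"t2": 7, "t3": 2},  # t1 has no record
}

rings = stack_rings(tip_order, datasets)
for ring in rings:
    print(ring.name, round(ring.inner, 2), round(ring.outer, 2), ring.values)
```

Because every ring is aligned to the same tip order, further data sets can be appended to `datasets` without touching the earlier rings, which mirrors the progressive layering described in the text.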