Preprint/Phylogenetic reconciliation

From Wikiversity

PLOS Topic Pages
PLOS Computational Biology • PLOS Genetics • PLOS ONE

This is a PLOS Topic Page draft

Public peer review comments are posted here
All content on this page is being developed under a CC BY 4.0 license


Authors
Hugo Menet, Vincent Daubin, Eric Tannier
About the Authors 

Hugo Menet
AFFILIATION: Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622 Villeurbanne, France
0000-0002-6809-3878

Vincent Daubin
AFFILIATION: Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622 Villeurbanne, France
0000-0001-8269-9430

Eric Tannier
AFFILIATION: Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622 Villeurbanne, France; INRIA Grenoble Rhône-Alpes, F-38334 Montbonnot, France
0000-0002-3681-7536


Abstract

In phylogenetics, reconciliation is an approach to connect the history of two or more coevolving biological entities. The general idea of reconciliation is that a phylogenetic tree representing the evolution of an entity (e.g. homologous genes, symbionts...) can be drawn within another phylogenetic tree representing an encompassing entity (respectively, species, hosts) to reveal their interdependence and the evolutionary events that have marked their shared history. The development of reconciliation approaches started in the 1980s, mainly to depict the coevolution of a gene and a genome, and of a host and a symbiont, which can be mutualist, commensalist or parasitic. It has also been used for example to detect horizontal gene transfer, or understand the dynamics of genome evolution.

Phylogenetic reconciliation can account for a diversity of evolutionary trajectories of what makes life's history, intertwined with each other at all scales that can be considered, from molecules to populations or cultures. A recent avatar of the importance of interactions between levels of organization is the holobiont concept, where a macro-organism is seen as a complex partnership of diverse species. Modeling the evolution of such complex entities is one of the challenging and exciting directions of current research on reconciliation.


Phylogenetic trees as matryoshka dolls

Phylogenies have been used for representing the diversification of life at many levels of organization: macro-organisms,[1] their cells throughout development,[2] micro-organisms through marker genes,[3] chromosomes,[4] proteins,[5] protein domains,[6] and can also be helpful to understand the evolution of human culture elements such as languages [7] or folktales.[8] At each of these levels, phylogenetic trees describe different stories made of specific diversification events, which may or may not be shared among levels. Yet because they are structurally nested or functionally dependent, the evolution at a particular level is bound to others.

Phylogenetic reconciliation is the identification of the links between levels through the comparison of at least two associated trees. Originally developed for two trees, reconciliations for more than two levels have recently been constructed (see § Explicit modeling of three or more levels). As such, reconciliation provides evolutionary scenarios that reveal conflict and cooperation among evolving entities. These links may be unintuitive: for instance, genes present in the same genome may show uncorrelated evolutionary histories, while some genes present in the genome of a symbiont may show a strong coevolution signal with the host phylogeny. Hence, reconciliation can be a useful tool to understand the constraints and evolutionary strategies underlying the assemblage that makes a holobiont.

Because all levels essentially deal with the same object, a phylogenetic tree, the same models of reconciliation can be transposed, with slight modifications, to any pair of connected levels:[9] an "inner", "lower", or "associate" entity (gene, symbiont species, population...) evolves inside an "upper", or "host" one (respectively species, host, geographical area...) (see figure 2). The upper and lower entities are partially bound to the same history, leading to similarities in their phylogenetic trees, but the associations can change over time, become more or less strict, or switch to other partners (see figure 1).

Two-level reconciliation methods have been reviewed several times, generally focusing on a particular pair of levels, e.g. gene/species or host/symbiont.[10][11][12][13][14][15][16]


A phylogenetic reconciliation between an upper (blue) and a lower (red) tree, with the most often used evolutionary events (S, D, T, L) and their names in the phylogeography, host/symbiont and gene/species frameworks. For instance, the S event is called allopatric speciation when reconciling geographical areas and species, cospeciation between host and symbiont, and speciation for gene and species, but it always corresponds to the same co-diversification pattern.


Intertwined phylogenies at multiple levels and the use of reconciliation


Table 1. Pairs of biological levels compared with phylogenetic reconciliations

| Upper level | Lower level | Main references or reviews | Main software |
|---|---|---|---|
| Geography | Species | [17][18][19] | Diva,[17] Lagrange[19] |
| Host species | Symbiont species | [14][15] | Jane,[20] eMPRess,[21] Eucalypt[22] |
| Species | Gene | [10][11][12][13] | Ranger-DTL,[23] Notung,[24] Mowgli,[25] Angst,[26] ecceTera,[27] ALE,[28] Treerecs,[29] RecPhyloXML[30] |
| Gene | Gene domain | [31][32][33] | |


Construction and limits of the Duplication Transfer Loss model

Models and methods used today in phylogeny (see figure 3) are the result of several decades of research, marked by a progressive increase in complexity, driven by the nature of the data and the quest for biological realism on one side, and by the limits and progress of mathematical and algorithmic methods on the other.

Pre-reconciliation models: characters on trees.

Character methods can be used when there is no tree available for one of the levels, but only values for a character at the leaves of a phylogenetic tree for the other level. A model defines the events of character value change, their rate, probabilities or costs. For instance the character can be the presence of a host on a symbiont tree,[34] the geographical region on a species tree,[35] the number of genes on a genome tree,[36] or nucleotides in a sequence.[37] Such methods thus aim at reconstructing ancestral characters at internal nodes of the tree.[38]

Although these methods have produced results on genome evolution, the utility of a second tree appears with very simple examples. If a symbiont has recently acquired the ability to spread in a group of species and is thus present in most of them, character methods will wrongly indicate that the common ancestor of the hosts already had the symbiont. In contrast, a comparison of the symbiont and host trees would show discrepancies revealing horizontal transfers.

The origins of reconciliation: the Duplication Loss model and the Lowest Common Ancestor mapping.

Duplication and loss were first invoked to explain the presence of multiple copies of a gene in a genome or its absence in certain species.[5] With those two events it is possible to reconcile any two trees,[39] i.e. to map the nodes and branches of the lower tree to those of the upper tree, or equivalently to give a list of evolutionary events explaining the discrepancies between the upper tree and the lower tree. A most parsimonious Duplication and Loss (DL) reconciliation is computed through the Lowest Common Ancestor (LCA) mapping: proceeding from the leaves to the root, each internal node is mapped to the lowest common ancestor of the mapping of its two children.
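The LCA mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any cited software: the nested-tuple tree encoding, the function names (index_tree, lca, lca_reconcile) and the leaf_species dictionary are assumptions made for this example. It also assumes the usual criterion that a lower node is a duplication when it is mapped to the same upper node as one of its children; losses are not counted here.

```python
def index_tree(tree):
    """Record the parent and depth of every node of a nested-tuple tree."""
    parent, depth = {tree: None}, {tree: 0}
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):  # internal node: visit its children
            for child in node:
                parent[child] = node
                depth[child] = depth[node] + 1
                stack.append(child)
    return parent, depth


def lca(a, b, parent, depth):
    """Lowest common ancestor of nodes a and b in the upper tree."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a


def lca_reconcile(lower, upper, leaf_species):
    """Map the lower tree into the upper tree, leaves first (hypothetical helper).

    Returns the upper node the lower root maps to, and the list of lower
    nodes inferred as duplications.
    """
    parent, depth = index_tree(upper)
    duplications = []

    def visit(node):
        if not isinstance(node, tuple):          # leaf: known association
            return leaf_species[node]
        left, right = (visit(child) for child in node)
        mapped = lca(left, right, parent, depth)
        if mapped == left or mapped == right:    # same mapping as a child
            duplications.append(node)
        return mapped

    return visit(lower), duplications


# Example: a gene family with two copies in species A and B, one in C.
species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
genes = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
root_mapping, dups = lca_reconcile(gene_tree, species_tree, genes)
print(root_mapping, dups)
```

In this toy example the only node flagged as a duplication is the parent of the two (A, B) gene subtrees, since it is mapped to the same upper node as its children.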

A Markovian model for reconciliation.

The LCA mapping in the DL model follows a parsimony principle: no event should be invoked if it is not necessary. However the use of this principle is debated[37] and it is commonly admitted that it is more accurate in molecular evolution to fit a probabilistic model as a random walk, which does not necessarily produce parsimonious scenarios. A birth and death Markovian model is such a model that can generate a lower tree "inside" a fixed upper one from root to leaves.[40] Statistical inference provides a framework to find most likely scenarios, and in that case, a maximum likelihood reconciliation of two trees is also a parsimonious one. In addition, it is possible with such a framework to sample scenarios, or integrate over several possible scenarios in order to test different hypotheses, for example to explore the space of lower trees. Moreover probabilistic models can be integrated in larger models as probabilities simply multiply when assuming independence, for instance combining sequence evolution and DL reconciliation.[41]
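To give an intuition of the birth and death process, the following sketch simulates the number of copies of a gene family along a single branch of the upper tree, with duplications as births and losses as deaths. It is a simplified illustration under stated assumptions: the function name simulate_copies, its parameters and the constant per-copy rates are hypothetical, and a full model would also generate the lower tree topology and continue through speciations.

```python
import random


def simulate_copies(n0, dup_rate, loss_rate, length, rng):
    """Simulate a linear birth-death process on one branch (hypothetical helper).

    n0: number of gene copies at the start of the branch
    dup_rate, loss_rate: per-copy rates of duplication and loss
                         (their sum is assumed to be positive)
    length: duration of the branch
    Returns the number of copies at the end of the branch.
    """
    t, n = 0.0, n0
    while n > 0:
        # Waiting time to the next event among the n current copies.
        t += rng.expovariate(n * (dup_rate + loss_rate))
        if t > length:
            break
        # The event is a duplication with probability dup / (dup + loss).
        if rng.random() < dup_rate / (dup_rate + loss_rate):
            n += 1
        else:
            n -= 1
    return n


rng = random.Random(42)
samples = [simulate_copies(1, 0.3, 0.2, 1.0, rng) for _ in range(1000)]
print(sum(samples) / len(samples))  # empirical mean copy number
```

Sampling many branches in this way gives an empirical distribution of copy numbers, which is the kind of quantity a probabilistic reconciliation model assigns probabilities to.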

Introducing horizontal transfer (figure 3A).

Host switch, i.e. inheritance of a symbiont from a kin lineage, is a crucial event in the evolution of parasitic or symbiotic relationships between species. This horizontal transfer also models migration events in biogeography. It became of interest for the reconciliation of gene and species trees when it appeared that many discrepancies could not simply be explained by duplication and loss, and that horizontal gene transfer (HGT) was a major evolutionary process in micro-organism evolution. This switching, or horizontal transfer, pattern can also model admixture or introgression.[42] It is considered in character methods, without information from the symbiont phylogeny.[34][43] On top of the DL model, horizontal transfer enables new, very different reconciliation scenarios (figure 3A).

Illustration of reconciliation events, inputs, outputs, and computational difficulties. This table is intended to serve as illustration to 2-Level reconciliation section and can be read along it. Inputs are on the left of entries, output on the right. Upper trees are drawn in blue, lower trees in red.

Necessity to weight evolutionary events (figure 3B).

The LCA reconciliation method yields a unique solution, which has been shown to be optimal for the problem of minimizing the weighted number of events, whatever the relative weights of duplication and loss.[44] In contrast, with Duplication, horizontal Transfer and Loss (DTL), there can be several equally parsimonious reconciliations. For instance, a succession of duplications and losses can be replaced by a single transfer (figure 3B). One of the first ideas to define a computational problem and approach a resolution was, in a host/symbiont framework, to maximize the number of co-speciations with a heuristic algorithm.[45] Another solution is to give relative costs to the events and find a scenario that minimizes the sum of the costs of its events.[46] In probabilistic frameworks, the equivalent task consists in assigning rates or probabilities to events and searching for maximum likelihood scenarios, or sampling scenarios according to their likelihood. All these problems are solved with a dynamic programming approach.

The simple yet powerful dynamic programming approach.

This dynamic programming method consists in traversing the two trees in a postorder. Proceeding from the leaves and then going up in the two trees, for each couple of internal nodes (one for each tree), the cost of a most parsimonious DTL reconciliation is computed.[46]

In a parsimony framework, the cost c(U,l) of reconciling a lower subtree rooted at l with an upper subtree rooted at U is initialized for the leaves according to their matching:

   c(U,l) = 0 if l ∈ U, and c(U,l) = ∞ otherwise

Then, inductively, denoting l',l" the children of l, U',U" the children of U, and cS,cD,cT,cL the costs associated with speciation, duplication, horizontal transfer and loss, respectively (with cS often fixed to 0),

   c(U,l) = min(
       cS + min( c(U',l') + c(U",l"), c(U",l') + c(U',l"), c(U',l) + cL, c(U",l) + cL ),
       cD + c(U,l') + c(U,l"),
       cT + min( minV(c(V,l')) + c(U,l"), minV(c(V,l")) + c(U,l') )
   )


The costs minV(c(V,l')) and minV(c(V,l")), because they do not depend on U, can be computed once for all U, hence achieving quadratic complexity to compute c for all couples of U and l. The cost of losses only appears in association with other events because in parsimony, a loss can always be associated with the preceding event in the tree.
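The recurrence above can be sketched as follows, for undated binary trees. This is a minimal illustration under several assumptions that are not in the text: trees are encoded as nested tuples, the names postorder and dtl_cost and the default costs are hypothetical, a lower leaf placed on an internal upper node is handled with the speciation-plus-loss terms only, and the transfer term takes the minimum over all upper nodes V exactly as written in the formula (no exclusion of ancestors and no time feasibility check).

```python
import math


def postorder(tree):
    """Yield the nodes of a nested-tuple binary tree, children before parents."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from postorder(child)
    yield tree


def dtl_cost(upper, lower, leaf_species, cD=2.0, cT=3.0, cL=1.0, cS=0.0):
    """Minimum DTL cost of reconciling `lower` with `upper` (hypothetical helper)."""
    c = {}     # c[(U, l)]: cost of reconciling the subtree at l with U
    best = {}  # best[l]: min over all upper nodes V of c[(V, l)]
    upper_nodes = list(postorder(upper))
    for l in postorder(lower):
        for U in upper_nodes:
            options = []
            if isinstance(U, tuple):
                U1, U2 = U
                # Speciation followed by a loss in one of the two children.
                options.append(cS + cL + min(c[(U1, l)], c[(U2, l)]))
            if isinstance(l, tuple):
                l1, l2 = l
                if isinstance(U, tuple):
                    # Co-speciation: the children of l go to the children of U.
                    options.append(cS + min(c[(U1, l1)] + c[(U2, l2)],
                                            c[(U2, l1)] + c[(U1, l2)]))
                # Duplication: both children of l stay in U.
                options.append(cD + c[(U, l1)] + c[(U, l2)])
                # Transfer: one child stays in U, the other goes to the best V.
                options.append(cT + min(best[l1] + c[(U, l2)],
                                        best[l2] + c[(U, l1)]))
            elif not isinstance(U, tuple):
                # Both are leaves: cost 0 if they match, infinite otherwise.
                options.append(0.0 if leaf_species[l] == U else math.inf)
            c[(U, l)] = min(options)
        # Computed once per l, independently of U.
        best[l] = min(c[(U, l)] for U in upper_nodes)
    return best[lower]


species_tree = (("A", "B"), "C")
genes = {"a": "A", "b": "B", "c": "C"}
print(dtl_cost(species_tree, (("a", "b"), "c"), genes))  # congruent trees
print(dtl_cost(species_tree, (("a", "c"), "b"), genes))  # incongruent trees
```

The two nested loops visit every pair (U, l) once, and each pair is processed in constant time thanks to the precomputed best[l], which reflects the quadratic complexity mentioned above. Recovering a scenario would require storing which option achieved each minimum and backtracking from the root.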

The induction behind the use of dynamic programming is based on always progressing in the trees toward the roots. However, some combinations of events that can happen consecutively can make this induction ill-defined. One such combination consists of a transfer followed immediately by a loss in the donor lineage (TL). Restricting the use of this TL event [47] repairs the induction. With unlimited use, it is necessary to add other known methods for solving systems of equations, such as fixed point methods,[28] or numerical solving of differential equations.[19] In 2016, only two out of seven of the most commonly used parsimony reconciliation programs handled TL events,[27] although considering them can drastically change the result of a reconciliation.[12]

Unlike LCA mapping, DTL reconciliation typically yields several scenarios of minimal cost, in some cases an exponential number. The strength of the dynamic programming approach is that it makes it possible to compute a minimum cost of coevolution of the input upper and lower trees in quadratic time,[23] and to get a most parsimonious scenario through backtracking. It can also be transposed to a probabilistic framework to compute the likelihood of coevolution and get a most likely reconciliation, replacing costs by rates, minimums by sums and sums by products.[48] Moreover, the approach is suitable, through multiple backtracks, for enumerating all parsimonious solutions or for sampling scenarios, optimal and sub-optimal, according to their likelihood.

Estimation of event costs and rates (figure 3B).

Dynamic programming per se is only a partial solution and does not solve several problems raised by reconciliation. Defining a most parsimonious DTL reconciliation requires giving costs to the different kinds of events (D, T and L). Different cost assignments can yield different reconciliation scenarios (figure 3B), so there is a need for a way to choose those costs. There is a diversity of approaches to do so. CoRe-PA [49] recursively explores the space of cost vectors, searching for a good match with the event frequencies in reconciliations. ALE [48] uses the same idea in a probabilistic framework to estimate the event rates by maximum likelihood. Alternatively, COALA [50] is a preprocessing step using approximate Bayesian computation with sequential Monte Carlo: simulation and statistical rejection or acceptance of parameters, with successive refinement.

In the parsimony framework it is also possible to divide the space of possible event costs into regions of costs that lead to the same Pareto optimal solution.[51] Pareto optimal reconciliations are such that no other reconciliation has a strictly lower cost for one type of event (duplication, transfer or loss) and a lower or equal cost for the others.
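The definition of Pareto optimality can be illustrated on event counts. In this small sketch, each reconciliation is summarized by a hypothetical vector (number of duplications, number of transfers, number of losses); the function name pareto_front and the example vectors are assumptions made for illustration and do not come from any cited software.

```python
def pareto_front(event_counts):
    """Keep the (D, T, L) count vectors that no other vector dominates."""

    def dominates(a, b):
        # a dominates b if it is no worse for every event type
        # and strictly better for at least one.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [v for v in event_counts
            if not any(dominates(w, v) for w in event_counts)]


# Hypothetical candidate reconciliations, as (D, T, L) counts.
candidates = [(1, 0, 3), (0, 1, 0), (1, 1, 3), (2, 0, 0)]
print(pareto_front(candidates))
```

Here (1, 1, 3) is discarded because (1, 0, 3) uses one fewer transfer and no more of anything else, while the three remaining vectors are each optimal for some choice of event costs or trade off one event type against another.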

It is also possible to rely on external considerations in order to choose the event costs. For example the software Angst [26] chooses the costs that minimize the variation of genome size, in number of genes, between parent and children species.

The problem of time feasibility (figure 3C).

The dynamic programming method works for dated (internal nodes are totally ordered) or undated upper trees. However, with undated trees there is a time feasibility issue. Indeed, a horizontal transfer implies that the donor and the receiver are contemporaneous, which imposes a time constraint on the tree. As a consequence, two horizontal transfers may be incompatible because they imply contradicting time constraints (figure 3C). Dynamic programming cannot easily check for such incompatibilities. If the upper tree is undated, finding a time-feasible most parsimonious reconciliation is NP-hard.[52][53][54] It is fixed parameter tractable, which means that there are algorithms whose running time is polynomial in the size of the trees and exponential only in the number of transfers in the output scenarios.[53] Some solutions rely on integer linear programming [55] or branch and bound exploration.[9] If the upper tree is dated, there is no incompatibility issue because horizontal transfers can be constrained to never go backward in time. Finding a coherent optimal reconciliation is then solved in polynomial time.[53] Most software taking undated trees as input does not check temporal feasibility, except Jane,[20] which explores the space of total orders via a genetic algorithm, and, in a post-process, Notung [56] and Eucalypt,[22] which search the set of optimal solutions for time-consistent ones. Other methods work as supplementary layers to reconciliations, correcting reconciliations [57] or returning a subset of feasible transfers,[58] which can be used to date a species tree.[58][59]
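The incompatibility between transfers can be illustrated by a simple check on the upper tree. The sketch below is a simplified necessary condition, not the algorithm of any cited program. It assumes that a transfer between the branch ending at node `donor` and the branch ending at node `recipient` requires the two branches to overlap in time, i.e. the parent of each node must predate the other node. These constraints are added to the parent-before-child constraints of the tree, and a set of transfers is rejected if the resulting constraints are cyclic. The function name time_feasible and the input encoding are hypothetical, and constraints induced by the order of events along the lower tree are ignored.

```python
from graphlib import CycleError, TopologicalSorter


def time_feasible(parent, transfers):
    """Return a node order compatible with all constraints, or None.

    parent: dict mapping each upper node to its parent (None for the root)
    transfers: list of (donor, recipient) pairs of upper nodes, each node
               standing for the branch that ends at it
    """
    # predecessors[n] is the set of nodes that must be older than n.
    predecessors = {node: set() for node in parent}
    for node, par in parent.items():
        if par is not None:
            predecessors[node].add(par)
    for donor, recipient in transfers:
        # The two branches must overlap in time.
        if parent[donor] is not None:
            predecessors[recipient].add(parent[donor])
        if parent[recipient] is not None:
            predecessors[donor].add(parent[recipient])
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# Upper tree ((A,B)X,(C,D)Y)R, given by its parent relation.
parent = {"R": None, "X": "R", "Y": "R",
          "A": "X", "B": "X", "C": "Y", "D": "Y"}
print(time_feasible(parent, [("A", "Y")]))              # an order exists
print(time_feasible(parent, [("A", "Y"), ("C", "X")]))  # contradictory
```

In the second call, the transfer from A to the branch above Y requires X to be older than Y, while the transfer from C to the branch above X requires Y to be older than X, so no dating of the upper tree satisfies both.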

Expanding phylogenies: Transfers from the dead (figure 3D).

In phylogenetics in general, it is important to keep in mind that the species, extant and ancestral, which are represented in any phylogeny are only a sparse sample of the species that currently exist or have existed. This is why one can safely assert that all transfers that can be detected using phylogenetic methods have originated in lineages that are, strictly speaking, absent from the studied phylogeny (figure 3D).[60] Accounting for extinct or unsampled biodiversity in phylogenetic studies can give a better understanding of these processes.[61] Originally, DTL reconciliation methods did not recognize this phenomenon and only allowed for transfers between contemporaneous branches of the tree, hence ignoring most plausible solutions. However, methods working on undated upper trees can be seen as implicitly handling the unknown diversity by allowing transfers "to the future" from the point of view of one phylogeny, that is, the donor is more ancient than the recipient. A transfer to the future can be translated into a speciation to an unknown species, followed by a transfer from that unknown species. ALE [60] in its dated version explicitly takes the unknown diversity into account by adding a Moran process of speciation/extinction of species to the dated birth/death model of gene evolution.

The specificity of biogeography: a tree like structure for the "evolution" of areas (figure 3E).

In biogeography, some applications of reconciliation approaches consider as an upper tree an area cladogram with defined ancestral nodes. For instance, the root can be Pangea and the nodes contemporary continents. Sometimes internal nodes are not ancestral areas but the unions of the areas of their children, to account for the possibility of species evolving along the lower tree to inhabit one or several areas. In this case, the evolutionary events are migration, where one species colonizes a new area, and allopatric speciation, or vicariance, equivalent to co-speciation in host/symbiont comparisons (figure 3E). This structure is not always a tree (if the unions AB and BC of leaves A, B, C exist, a child can have several parents) and is not associated with time (it is possible for a species to go from A to AB by migration, as well as from AB to A by extinction). Nevertheless, reconciliation methods, with events and dynamic programming, can infer evolutionary scenarios between this upper geographical structure and the lower species tree. Diva [17] and Lagrange [18][19] are two reconciliation models constructing such a tree-like structure and then applying reconciliation, the first with a parsimony principle, the second in a probabilistic framework.

Graphical output

With two trees and multiple evolutionary events linking them to represent, visualizing reconciled trees is challenging but necessary to make reconciliation studies more accessible. Some reconciliation software annotates the evolutionary events on the lower trees,[56] while others [20][21][22][49] and specific packages, in DL [62] or DTL,[63] trace the lower tree embedded in the upper one. One difficulty in this regard is the variety of output formats of the different reconciliation programs. Recently, however, a common standard, RecPhyloXML,[30] has been established and endorsed by part of the community, with an available viewer.

Using and expanding reconciliation

Exploring the space of reconciliations (figure 3F)

Multiple DTL reconciliation scenarios can have equal cost or close probabilities (figure 3E). Dynamic programming makes it possible to sample reconciliations, uniformly among optimal ones or according to their likelihood. It is also possible to enumerate them in time proportional to the number of solutions,[22] a number which can quickly become intractable, even for optimal ones only (figure 3F). Finding and presenting structure among the multitude of possible reconciliations has been at the center of recent methodological developments, especially for methods aimed at hosts and symbionts. Several works have focused on representing a set of reconciliations in a compact way. This can be achieved by giving support values to specific events based on all optimal (or suboptimal) reconciliations,[64] or with the use of a consensus reconciled tree.[65][66] In a DL model it is possible to define a median reconciliation, based on shared events, and to compute it in polynomial time.[67] eMPRess [21] can group similar reconciliations through clustering,[68] with all pairwise distances between reconciliations computable in polynomial time (independently of the number of most parsimonious reconciliations).[69] With the same aim, Capybara [70] defines equivalence classes among reconciliations, efficiently computing representatives for all classes, and outputs with linear delay a given number of reconciliations (first optimal ones, then suboptimal ones). The space of most parsimonious reconciliations can be expanded or reduced by increasing or decreasing the distance allowed for horizontal transfers,[22] which is easily done by dynamic programming.
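The idea of event support values can be sketched as a simple frequency count over a set of reconciliations. The encoding below is an assumption made for illustration: each reconciliation is a set of hypothetical (lower node, upper node, event type) triples, and the function name event_support is not taken from any cited tool, whose actual definitions of support may differ.

```python
from collections import Counter


def event_support(reconciliations):
    """Fraction of the given reconciliations in which each event appears."""
    counts = Counter(event
                     for reconciliation in reconciliations
                     for event in set(reconciliation))
    total = len(reconciliations)
    return {event: count / total for event, count in counts.items()}


# Two hypothetical equally parsimonious scenarios for the same trees.
scenarios = [
    {("g1", "AB", "S"), ("g2", "A", "D")},
    {("g1", "AB", "S"), ("g2", "B", "T")},
]
for event, support in sorted(event_support(scenarios).items()):
    print(event, support)
```

Events shared by all scenarios get a support of 1, while events that depend on an arbitrary choice among optimal solutions get a lower value, which helps separate robust inferences from fragile ones.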

Inferring phylogenetic trees with reconciliation

Reconciliation and input uncertainty

Reconciliation works with two fixed trees, a lower and an upper, both assumed correct and rooted. However, those trees are not first-hand data. The most frequently used data for phylogenetics consist of aligned nucleotide or protein sequences. Extracting DNA, sequencing, assembling and annotating genomes, recognizing homology relationships among genes and producing multiple alignments for phylogenetic reconstruction are all complex processes where errors can ultimately affect the reconstructed tree.[71] Any topology or rooting error can be misinterpreted and cause systematic bias. For instance, in DL reconciliations, errors on the lower tree bias the reconciliation toward more duplication events closer to the root and more losses closer to the leaves.[72]

On the other hand, reconciliation, as a macro evolutionary model, can work as a supplementary layer to the micro evolutionary model of sequence evolution, resolving polytomies (nodes with more than two children) or rooting trees, or be intertwined with it through integrative models in order to get better phylogenies.

Most of the works in this direction focus on gene/species reconciliations, nevertheless some first steps have been made in host/symbiont, such as considering unrooted symbiont trees [73] or dealing with polytomies in Jane.[20]

Exploring the space of lower trees with reconciliation (figure 3G,H,I).

Reconciliation can easily take unrooted lower trees as input (figure 3G), which is a frequently used feature because trees inferred from molecular data are typically unrooted. It is possible to test all possible roots, or to use a thoughtful triple traversal of the unrooted tree, which achieves the same result without additional time complexity.[47] In a duplication-loss model, the roots minimizing the costs are found close to one another, forming a "plateau",[74] a property which does not generalize to DTL.[73][65]

Reconciliation can also take as input non-binary trees (figure 3H), that is, trees with internal nodes having more than two children. Such trees can be obtained for example by contracting branches with low statistical support. Inferring a binary tree from a non-binary tree according to reconciliation scores is solved in DL with efficient methods.[56][24][75][29] In DTL, the problem is NP-hard.[76] Heuristics [77] and exact fixed parameter tractable algorithms [76][78] are possible resolutions.

Another way to handle uncertainty in lower trees is to take as input a sample of alternative lower trees instead of a single one. For example, in the paper that gave reconciliation its name [39] it was proposed to consider all most likely lower trees, and to choose among them the best one according to their DL costs, a principle also used by TreeFix-DTL.[79] The sample of lower trees can also reflect their likelihood according to the aligned sequences (figure 3I), as obtained from Bayesian Markov chain Monte Carlo methods as implemented for example in Phylobayes.[80] AngST,[26] ALE [28] and EcceTERA [81] use "amalgamation", an extension of the DTL dynamic programming that is able to efficiently traverse a set of alternative lower trees instead of a single tree.

A local search in the space of lower trees guided by a joint likelihood, on the one hand from multiple sequence alignments and on the other hand from reconciliation with the upper tree, is achieved in Phyldog with a DL model [82] and in GeneRax with DTL.[83] In a DL model with sequence evolution and relaxed molecular clock the lower tree space can be explored with an MCMC.[84] MowgliNNI [25] can modify the input gene tree at poorly supported nodes to increase DTL score.

Finally, integrative models, mixing sequence evolution and reconciliation, can compute a joint likelihood via dynamic programming (for both reconciliation and gene sequence evolution),[28] and can include a molecular clock to estimate branch lengths, in a DL model [40] or with a relaxed molecular clock.[84] These models have been applied in gene/species frameworks, but not yet in host/symbiont or biogeography.

Inferring upper trees using reconciliation (figure 3J).

Inferring an upper tree from a set of lower trees is a long-standing question related to the supertree problem.[85] It is particularly interesting in the case of gene/species reconciliation, where many (typically thousands of) gene trees are available from complete genome sequences. Supertree methods attempt to assemble a species tree based on sets of trees which may differ in terms of contemporary species sets and topology, but usually without consideration for the biological process explaining these differences. However, some supertree approaches are statistically consistent for the reconstruction of the species tree if the gene trees are simulated under a DL model. This means that if the number of input lower trees generated from the true upper tree via the DL model grows toward infinity, and provided there is no additional error, the output upper tree converges almost surely to the true one. This has been shown in the case of a quartet distance,[86] and with a generalised Robinson Foulds multicopy distance [87] with better running time but assuming gene trees do not contain bipartitions contradicting the species tree, which seems rare under a DL model.

However, reconciliation can also be used for the inference of the upper tree. It is a computationally hard problem: already resolving polytomies in a non-binary upper tree with a binary lower one, minimizing a DL reconciliation score, is NP-hard.[88] In particular, reconstructing the species tree giving the best DL cost for several gene trees is NP-hard and 2-approximable [89] (figure 3J).

ODTL [48] takes gene trees as input and searches for a maximum likelihood species tree according to a DTL model, with a hill-climbing search. The approach produces a species tree with internal nodes ordered in time, ensuring time compatibility for the scenarios of transfer among lower trees (see § The problem of time feasibility).

Addressing a more general problem, Phyldog [82] searches for the maximum likelihood species tree, gene trees and DL parameters from multiple family alignments via multiple rounds of local search. It thus performs the exploration of both upper and lower trees at the same time. MixTreEM [90] presents a faster solution.

Limits of the two-level DTL model

A limit to dynamic programming: non independent evolution of children lineages (figure 3K).

The dynamic programming framework, like usual birth and death models, works under the hypothesis of independent evolution of children lineages in the lower tree. However, this hypothesis does not hold if the model is complemented with several other documented evolutionary events, such as horizontal transfer with replacement of a homologous gene in the recipient lineage, or gene conversion. Horizontal transfer with replacement is usually modeled by a rearrangement of the upper tree, called Subtree Prune and Regraft (SPR) (figure 3K left). Reconciling under SPR is NP-hard, even in dated trees, and fixed parameter tractable regarding the output size.[91][92]

Another way to model and infer replacing horizontal transfers is through maximum agreement forest, where branches are cut in the lower and upper trees in order to get two identical (or statistically indistinguishable [93]) upper and lower forests. The problem is NP-hard,[94] but several approximations have been proposed.[95] Replacing transfers can be considered on top of the DL model.[96] In the same vein gene conversion can be seen as a "replacing duplication" (figure 3K right). In this latter case, a polynomial algorithm which does not use dynamic programming and is an extension of the LCA method, can find all optimal solutions including gene conversions.[92]

Integrating population levels: failure to diverge and Incomplete Lineage Sorting (figure 3L,M).

In host/symbiont frameworks, a single symbiont species is sometimes associated with several host species. This means that while a speciation or diversification has been observed in the host, the populations are indistinguishable in the symbiont. This is handled for example by additional polytomies in the symbiont tree, possibly leading to intractable inference problems, because polytomies need to be resolved. It is also modeled by an additional evolutionary event, "failure to diverge" (Jane,[20] Amocoala [97]) (figure 3L). Failure to diverge can be a way to allow "free" host switch in a population, a flow of symbionts between closely related hosts. Following that vision, host switch allowed only for close hosts is considered in Eucalypt.[22] This idea of horizontal flow between close populations can also be applied to gene/species frameworks, with a definition of species based on a gradient of gene flow between populations.[98]

Failure to diverge is one way of introducing population dynamics in reconciliation, a framework mainly adapted to the multi-species level, where populations are supposed to be well differentiated. There are other population phenomena that limit this framework, one of them being deep coalescence of lineages, leading to Incomplete Lineage Sorting (ILS), which is not handled by the DTL model.[24][11] The multispecies coalescent is a classical model of allele evolution along a species tree, with birth of alleles and sorting of alleles at speciations, that takes into account population sizes and naturally encompasses ILS.[99][100][101][102][103] In a reconciliation context, several attempts have been made to account for ILS without the complex integration of a population model. For example, ILS can be seen as a possible evolutionary pattern for the gene tree (figure 3M). In that case children lineages are not independent of one another, leading to intractability results. ILS alone can be handled with LCA, but ILS + DL reconciliation is NP-hard, even without transfers.[104]

Notung [24] handles ILS by collapsing short branches of the species tree into polytomies and allowing ILS as a free diversification of gene trees on those polytomies. EcceTERA [105] bounds the maximum size of the connected parts of the species tree where ILS can happen, proposing an algorithm that is fixed-parameter tractable in that parameter.
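The idea of collapsing short species-tree branches into polytomies can be sketched as follows. This is a minimal illustration and not Notung's implementation: the dictionary encoding of the tree, the `threshold` parameter and the function name are all assumptions made for the example. Internal branches shorter than the threshold are contracted, and their length is added to the branches below so that root-to-leaf distances are preserved.

```python
def collapse_short_branches(node, threshold):
    """Return a copy of the tree with short internal branches contracted.

    A node is a dict {"name": str, "length": float, "children": [nodes]}.
    Leaf branches are never contracted.
    """
    new_children = []
    for child in node["children"]:
        child = collapse_short_branches(child, threshold)
        if child["children"] and child["length"] < threshold:
            # Contract the short internal branch: graft its children here,
            # which creates a polytomy at the current node.
            for grandchild in child["children"]:
                grafted = dict(grandchild)
                grafted["length"] += child["length"]
                new_children.append(grafted)
        else:
            new_children.append(child)
    return {
        "name": node["name"],
        "length": node["length"],
        "children": new_children,
    }
```

For instance, if a root has a child with branch length 0.5 leading to leaves A and B, and a second child C, then calling the function with a threshold of 1.0 returns a root with the three children A, B and C. A reconciliation method can then treat any gene tree topology on this trifurcation as compatible with ILS.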

ILS and DL can be considered on an upper network instead of a tree. This models introgression in particular, with the possibility of estimating model parameters.[106]

More integrative reconciliation models accounting for ILS have been proposed, such as DLCoal, which includes both DL and the multispecies coalescent.[107] It is a probabilistic model with a parsimony translation,[108] which proposes two sequential LCA-type heuristics that rely on an intermediate locus tree between the gene tree and the species tree. However, outside of the gene/species reconciliation framework, ILS seems, for no particular reason, never to have been considered in host/symbiont reconciliation or in biogeography.

Documenting dependence between multiple scales of biological organization

A striking aspect of reconciliation is that a common methodology handles different levels of organization: it is used for comparing domain and protein trees, gene and species trees, host and symbiont trees, population and geographic trees. However, now that scientists tend to consider that multi-scale models of biological functioning bring a novel and game-changing view of organisms and their environment,[109] the question is how to use reconciliation to bring phylogenetics into this holobiont era (figure 2).

Coevolution of entities at different scales of evolution is at the basis of the holobiont idea: macro-organisms, micro-organisms and their genes all have a different history bound to a common functioning in a single ecosystem. Biological systems such as the entanglement of hosts, symbionts and their genes imply functional and evolutionary dependencies between more than two levels.

Examples of multi-level systems with complex evolutionary inter-dependencies

Genes coevolving beyond genome boundaries

The holobiont concept stresses the possibility for genes from different genomes to cooperate and coevolve. For instance, certain genes in a symbiont genome may provide a function to its host, like the production of a vital compound absent from available feeding sources. An iconic example is that of blood-feeding or sap-feeding insects, which often depend on one or several bacterial symbionts to thrive on a resource that is abundant in sugar, but lacks essential amino acids or vitamins.[110] Another example is the association of Fabaceae with nitrogen-fixing bacteria. The compound beneficial to the host is typically produced by a set of genes encoded in the symbiont genome, which, throughout evolution, may be transferred to other symbionts, and/or in and out of the host genome. Reconciliation methods have the potential to reveal evolutionary links between portions of genomes from different species. A search for coevolving genes beyond the boundaries of the genomes in which they are encoded would highlight the basis for the association of organisms in the holobiont.

Horizontal gene transfer routes depend on multiple levels

In insect systems with intracellular mutualistic symbionts, multiple occurrences of horizontal gene transfer have been identified, whether from host to symbiont, symbiont to host or symbiont to symbiont.[111]

Transfers of endosymbiont genes involved in nutrition pathways beneficial to the insect host have been shown to occur preferentially if the donor and recipient lineages share the same host.[112][113][114] This is also the case in insects with bacterial symbionts providing defensive proteins [115] or in obligate leaf nodule bacterial symbionts associated with plants.[116] In the human host, gene transfers have been shown to occur preferentially among symbionts hosted in the same organs.[117]

A review on horizontal gene transfers in host/symbiont systems [118] stresses the importance of supporting HGTs with multiple lines of evidence. Notably, it argues that transfers should be considered better supported when they involve symbionts sharing a habitat, a geographical area, or the same host. One should however keep in mind that most of the diversity of hosts and symbionts is unknown and that transfers may have occurred in unsampled closely related species, hosts or symbionts.

The idea that gene transfer in symbionts is constrained by the host can also be used to investigate host history. For instance, based on phylogeographical studies, it is now accepted that the bacterium Helicobacter pylori has been associated with human populations since the origins of the human species.[119][120] Analysis of the genomes of Helicobacter pylori in Europe suggests that they result from recombination between African and Asian Helicobacter pylori. This strongly implies early contacts between the corresponding human populations.

Similarly, an analysis of HGTs in coronaviruses from different mammalian species using reconciliation methods has revealed frequent contact between virus lineages, which can be interpreted as frequent host switches.[121]

Cultural evolution

The evolution of elements of human culture, for instance languages and folktales, in association with human population genetics, has been studied using concepts from phylogenetics. Although reconciliation has never been used in this framework, some of these studies encompass multiple levels of organization, each represented by a tree or the evolution of a character, with a focus on the coevolution of these levels.

Language trees can be compared with population trees in order to reveal vertically transmitted folktales, via a character model on the language tree.[122] Variants in each folktale family, languages, genetic diversity, populations and geography can be compared two by two, to link folktale diversification with languages on one side and with geography on the other.[123] Just as symbionts sharing a host are more prone to HGT in genetics, linguistic barriers can hinder the transmission of folktales or language elements.[124]

Investigating three-level systems using two-level reconciliation

Multi-level reconciliation is not as developed as two-level reconciliation. One way to approach the evolutionary dependencies between more than two levels of organization is to use the available standard two-level methods to gain a first insight into the complexity of a biological system.

Figure 4: Models and challenges to reconcile more than two phylogenetic trees.

Multi-gene events: implicit consideration of an intermediate level (figure 4A,B,C)

At the gene/species tree level, one typically deals with many different gene trees. In this case, the hypothesis that different gene families evolve independently is made implicitly. However, this need not be the case. For instance, duplication, transfer and loss can occur for segments of a genome spanning an arbitrary number of contiguous genes. It is possible to consider such multi-gene events using an intermediate guide for lower trees inside the upper one. For instance, one can compute the joint likelihood of multiple gene tree reconciliations with a dated species tree under a model with duplication, loss and whole genome duplication [125] (figure 4A). Similarly, the DL framework can be enriched with duplication and loss of chromosome segments instead of single genes (figure 4B). However, DL reconciliation becomes intractable with that new possibility.[126]

The link between two consecutive genes can also be modeled as an evolving character, subject to gain, loss, origination, breakage, duplication and transfer.[127] The evolution of this link appears as an additional level to species and gene trees, partly constrained by the gene/species tree reconciliation, partly evolving on its own, according to genome organization. It thus models synteny, or proximity between genes. At another scale, it can also model the co-occurrence of two domains in the same protein.

The detection of "highways of transfers", the preferential acquisition of groups of genes from a specific donor, is another example of non-independence of gene histories.[128] It has also led to methodological developments such as reconciliations using phylogenetic networks, seen as trees augmented with transfer edges, which can be used to constrain transfers in a DTL model.[129] Networks can also be used to model introgression and incomplete lineage sorting [130][131][42] (figure 4C).

Detecting coevolution in multiple pairs of levels (figure 4D)

A central question for understanding the evolution of a holobiont is which levels coevolve with each other, for instance among host species, host genes, symbionts and symbiont genes. The multiple inter-dependencies between all levels of evolution can be approached through multiple pairwise comparisons of two evolving entities.

Reconciling host and symbiont on one side, and geography and symbiont on the other, can also help to identify patterns of diversification of host and symbiont that reflect coevolution on one side, and patterns that can be explained by a common geographical diversification on the other [132][133][134][135] (figure 4D). Similarly, a study used reconciliation methods to differentiate the effect of diet evolution and phylogenetic inertia on the composition of mammalian gut microbiomes. By reconstructing ancestral diets and microbiome composition onto a mammalian phylogeny, the study revealed that both effects contribute but at different time scales.[136]

Explicit modeling of three or more levels

In a model of a multi-level system such as host/symbiont/genes, horizontal gene transfers should be more likely between two symbionts of the same host. This is invisible to a two-level gene tree/species tree or host/symbiont reconciliation: in some cases, looking at any combination of two levels can miss an evolutionary scenario that only becomes the most likely one when the information from the three trees is considered together (figure 5).

Figure 5: A higher level of organization can structure the phylogenetic reconciliation of two lower levels.

To overcome the limitations of standard two-level reconciliations for systems involving inter-dependencies at multiple levels, a methodological effort has been made in the last decade to construct and use multi-level models. This requires the identification of at least one "intermediate" level between the upper and the lower one.

Pre-reconciliation: characters onto reconciled trees (figure 4E,F)

A first step towards integrated three-level models is to consider phylogenetic trees at two levels and another level represented only by characters at the leaves of one of the trees (figure 4E). For instance, a reconciliation of host and symbiont phylogenies can be informed by geographic data.[137] Ancestral geographic locations of host and symbiont species obtained through a character inference method can then be used to constrain the host/symbiont reconciliation: ancestral hosts and symbionts can only be associated if they belong to the same geographical location (figure 4F).
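The geographic constraint described above amounts to filtering the candidate host/symbiont associations before the reconciliation is computed. The following Python sketch shows one possible form of this filter, assuming that a character inference step has already assigned a set of areas to every node of both trees; the dictionaries, node names and function name are hypothetical and do not come from the cited method.

```python
def allowed_associations(host_areas, symbiont_areas):
    """Return the (symbiont node, host node) pairs sharing an area.

    Both arguments map a node name to the set of geographic areas
    inferred for that node (leaf or ancestor).
    """
    return {
        (symbiont, host)
        for symbiont, s_areas in symbiont_areas.items()
        for host, h_areas in host_areas.items()
        if s_areas & h_areas  # non-empty intersection: co-occurrence
    }
```

A reconciliation algorithm would then only consider mappings of a symbiont node to a host node that appear in the returned set. For example, a symbiont ancestor inferred in `{"Africa", "Asia"}` could be mapped to a host ancestor inferred in `{"Africa"}`, but not to one inferred in `{"Europe"}`.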

At another scale, evolution at the sub-gene level can be approached with a character method.[31] Here, parts of genes (e.g. the sequences coding for protein domains) are reconciled with a species tree according to a DL model, and the genes they belong to are recorded as characters of these parts. Ancestral genes are then reconstructed a posteriori via merges and splits of gene parts.

Two-level reconciliations informed by a third level (figure 4G,H)

As pointed out by several studies (see § Horizontal gene transfer routes depend on multiple levels), an upper level can inform a reconciliation between an intermediate and a lower one, notably for horizontal transfers. Three-level models can take these assumptions into account to guide reconciliations between intermediate and lower trees with the knowledge of an upper tree. The model can, for example, give higher likelihoods to reconciliation scenarios where horizontal gene transfers happen between entities sharing the same habitat. This was achieved for the first time with DTL gene/species reconciliations nested with a DTL reconciliation between gene domains and genes.[32] Different costs for inter and intra transfers are applied depending on whether or not transfers happen between genes of the same genome (figure 4G,H sequential).
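The distinction between intra and inter transfers can be sketched as a cost function that looks up the upper-level entity containing the donor and the recipient. The sketch below uses symbionts and hosts for concreteness; the cost values, the `host_of` dictionary and the function names are illustrative assumptions and are not taken from [32].

```python
def transfer_cost(donor, recipient, host_of, intra_cost=1.0, inter_cost=3.0):
    """Cost of one transfer, cheaper when both ends share a host.

    host_of maps each intermediate-level entity (e.g. a symbiont)
    to the upper-level entity that contains it (e.g. its host).
    """
    if host_of[donor] == host_of[recipient]:
        return intra_cost
    return inter_cost


def scenario_cost(transfers, host_of, **costs):
    """Total transfer cost of a scenario given as (donor, recipient) pairs."""
    return sum(transfer_cost(d, r, host_of, **costs) for d, r in transfers)
```

With uniform costs, two scenarios that each involve a single transfer are indistinguishable. With the host-aware costs above, the scenario whose transfer stays within one host becomes cheaper, which is how the upper level can change the preferred reconciliation.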

Note that this model explicitly considers three levels and three trees, but does not yet define a real three-level reconciliation with an associated likelihood or score.[32] It relies on a sequential operation, where the second reconciliation is informed by the result of the first one.

The reconciliation problem in multi-scale models (figure 4J)

The next step is to define the score of a reconciliation consisting of three nested trees and to compute, given the three trees, three-level reconciliations according to their score. This has been achieved with a species/gene/domain system, where genes evolve within the species tree under a DL model and domains evolve within the gene/species system under a DTL model, forbidding domain transfers between genes of two different species (figure 4G).[33] Inference involves candidate scenarios with joint scores (figure 4H joint). Computing the minimum score scenario is NP-hard, but dynamic programming or integer linear programming can offer heuristics.[33][138] A variant of the problem in which multiple domains are considered [139] and a simulation framework [140] are also available.
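The difference between the sequential approach and a joint score can be written schematically. The notation below is an assumption introduced for illustration: \alpha denotes a gene/species DL reconciliation, \beta a domain/gene DTL reconciliation that depends on \alpha, and c_{DL} and c_{DTL} their respective costs. The additive form of the joint score is a simplifying assumption of this sketch, not a statement of the exact objective used in [33].

```latex
\begin{align}
\text{sequential:}\quad
  & \hat{\alpha} = \arg\min_{\alpha} c_{\mathrm{DL}}(\alpha),
  \qquad
  \hat{\beta} = \arg\min_{\beta} c_{\mathrm{DTL}}(\beta \mid \hat{\alpha}) \\
\text{joint:}\quad
  & (\alpha^{*}, \beta^{*}) = \arg\min_{\alpha,\,\beta}
  \bigl[\, c_{\mathrm{DL}}(\alpha) + c_{\mathrm{DTL}}(\beta \mid \alpha) \,\bigr]
\end{align}
```

Since the sequential pair (\hat{\alpha}, \hat{\beta}) is one of the candidates of the joint minimization, the joint optimum can never have a higher total cost than the sequential solution. It may be strictly lower when a slightly more costly gene/species reconciliation allows a much cheaper domain/gene reconciliation.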

Exploring the space of intermediate trees (figure 4I)

Just as two-level reconciliation can be used to improve lower or upper phylogenies, or to help construct them from aligned sequences, joint reconciliation models can be used in the same manner. In this vein, a model coupling gene/species DL, domain/gene DL and gene sequence evolution in a Bayesian framework improves the reconstruction of gene trees [141] (figure 4I).


Conclusion

Reconciliation is now mature as a methodological research subject. A network of researchers and labs working together is emerging, with active research, a good diversity of available software, and cooperative initiatives like RecPhyloXML, a common standard for the output of reconciliations.[30] In the future, methodological advances that sustain the development of new models will certainly play an important part in what studies based on reconciliation can achieve. Notably, new approaches may depart from the dynamic programming solution for DTL, which progresses along a rather narrow road: almost every new constraint or event added on top of it yields intractability results.

In this article we progressed from two to three embedded trees, and there is potentially an infinity of interacting and coevolving levels to study (see four-level examples in [122][123][114][116]). Current quantitative methods cannot yet handle such complexity. Methods that compare hypotheses and assess them in a statistically grounded framework are still to be developed and generalized to help the understanding of multi-scale evolving systems, including protein domains, genes, protein complexes, micro and macro organisms, and their ecology.

We showed that there have been multiple first steps in the models and methods for embedding three trees with lower/intermediate and intermediate/upper reconciliations. Further methodological efforts could propose new approaches to joint optimization with horizontal transfers at each level, and moreover offer a probabilistic framework.

Three-level reconciliations have only been applied to domain/gene/species combinations, while they could handle the classical holobiont combination gene/symbiont/host. Models could allow the identification of the coevolving entities inside an ecosystem or a holobiont: for example, the parts of a symbiont tree that follow its hosts, while other parts escape the hosts but follow geography; or, at another level, the parts of gene trees that evolve with symbiont genomes and the parts that evolve with hosts, indicating at which level they are selected.

Optional info for Wikipedia

Front page sentence (optional)

  • Did you know that the evolution of life can be depicted as the imbrication of a variety of phylogenetic trees representing the history of its elements, from molecules to interacting organisms?
  • Did you know that evolution at many scales, nucleotides, genes, genomes, organisms, holobionts, ecosystems, can be studied together thanks to a common method, phylogenetic reconciliation?

References

  1. Bagowski, C. P.; Bruins, W.; Te Velthuis, A. J. (2010). "The nature of protein domain evolution: Shaping the interaction network". Current Genomics 11 (5): 368–376. doi:10.2174/138920210791616725. PMID 21286315. PMC 2945003. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2945003/. 
  2. Nair, N. U.; Lin, Y.; Manasovska, A.; Antic, J.; Grnarova, P.; Sahu, A. D.; Bucher, P.; Moret, B. M. (2014). "Study of cell differentiation by phylogenetic analysis using histone modification data". BMC Bioinformatics 15: 269. doi:10.1186/1471-2105-15-269. PMID 25104072. PMC 4138389. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4138389/. 
  3. Woese, C. R.; Kandler, O.; Wheelis, M. L. (1990). "Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya". Proceedings of the National Academy of Sciences of the United States of America 87 (12): 4576–4579. doi:10.1073/pnas.87.12.4576. PMID 2112744. PMC 54159. //www.ncbi.nlm.nih.gov/pmc/articles/PMC54159/. 
  4. Dobzhansky, T.; Sturtevant, A. H. (1938). "Inversions in the Chromosomes of Drosophila Pseudoobscura". Genetics 23 (1): 28–64. doi:10.1093/genetics/23.1.28. PMID 17246876. PMC 1209001. //www.ncbi.nlm.nih.gov/pmc/articles/PMC1209001/. 
  5. 5.0 5.1 Zuckerkandl, E.; Pauling, L. (1965). "Molecules as documents of evolutionary history". Journal of Theoretical Biology 8 (2): 357–366. doi:10.1016/0022-5193(65)90083-4. PMID 5876245. 
  6. Bagowski, C. P.; Bruins, W.; Te Velthuis, A. J. (2010). "The nature of protein domain evolution: Shaping the interaction network". Current Genomics 11 (5): 368–376. doi:10.2174/138920210791616725. PMID 21286315. PMC 2945003. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2945003/. 
  7. Gray, R. D.; Bryant, D.; Greenhill, S. J. (2010). "On the shape and fabric of human history". Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 365 (1559): 3923–3933. doi:10.1098/rstb.2010.0162. PMID 21041216. PMC 2981918. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2981918/. 
  8. Tehrani, J. J. (2013). "The phylogeny of Little Red Riding Hood". PLOS ONE 8 (11): e78871. doi:10.1371/journal.pone.0078871. PMID 24236061. PMC 3827309. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3827309/. 
  9. 9.0 9.1 Wieseke N and Bernt M and Middendorf M (2013) Unifying Parsimonious Tree Reconciliation. arXiv
  10. 10.0 10.1 Boussau B, and Scornavacca C (2020) Reconciling Gene trees with Species Trees. No commercial publisher | Authors open access book: 3.2:1--3.2:23
  11. 11.0 11.1 11.2 Szöllősi, G. J.; Tannier, E.; Daubin, V.; Boussau, B. (2015). "The inference of gene trees with species trees". Systematic Biology 64 (1): e42-62. doi:10.1093/sysbio/syu048. PMID 25070970. PMC 4265139. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4265139/. 
  12. 12.0 12.1 12.2 Doyon, J. P.; Ranwez, V.; Daubin, V.; Berry, V. (2011). "Models, algorithms and programs for phylogeny reconciliation". Briefings in Bioinformatics 12 (5): 392–400. doi:10.1093/bib/bbr045. PMID 21949266. 
  13. 13.0 13.1 Nakhleh, L. (2013). "Computational approaches to species phylogeny inference and gene tree reconciliation". Trends in Ecology & Evolution 28 (12): 719–728. doi:10.1016/j.tree.2013.09.004. PMID 24094331. PMC 3855310. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3855310/. 
  14. 14.0 14.1 Charleston, M. A.; Perkins, S. L. (2006). "Traversing the tangle: Algorithms and applications for cophylogenetic studies". Journal of Biomedical Informatics 39 (1): 62–71. doi:10.1016/j.jbi.2005.08.006. PMID 16226921. 
  15. 15.0 15.1 Charleston M and Libeskind-Hadas R (2014) Event-Based Cophylogenetic Comparative Analysis. In: Garamszegi L. (eds) Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43550-2_20
  16. Martínez-Aquino, A. (2016). "Phylogenetic framework for coevolutionary studies: A compass for exploring jungles of tangled trees". Current Zoology 62 (4): 393–403. doi:10.1093/cz/zow018. PMID 29491928. PMC 5804275. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5804275/. 
  17. 17.0 17.1 17.2 Ronquist F (1997) Phylogenetic approaches in coevolution and biogeography. Zoologica Scripta 26:313--322
  18. 18.0 18.1 Ree, R. H.; Moore, B. R.; Webb, C. O.; Donoghue, M. J. (2005). "A likelihood framework for inferring the evolution of geographic range on phylogenetic trees". Evolution; International Journal of Organic Evolution 59 (11): 2299–2311. doi:10.1111/j.0014-3820.2005.tb00940.x. PMID 16396171. 
  19. 19.0 19.1 19.2 19.3 Ree, R. H.; Smith, S. A. (2008). "Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis". Systematic Biology 57 (1): 4–14. doi:10.1080/10635150701883881. PMID 18253896. 
  20. 20.0 20.1 20.2 20.3 20.4 Conow, C.; Fielder, D.; Ovadia, Y.; Libeskind-Hadas, R. (2010). "Jane: A new tool for the cophylogeny reconstruction problem". Algorithms for Molecular Biology : Amb 5: 16. doi:10.1186/1748-7188-5-16. PMID 20181081. PMC 2830923. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2830923/. 
  21. 21.0 21.1 21.2 Santichaivekin, S.; Yang, Q.; Liu, J.; Mawhorter, R.; Jiang, J.; Wesley, T.; Wu, Y. C.; Libeskind-Hadas, R. (2021). "EMPRess: A systematic cophylogeny reconciliation tool". Bioinformatics (Oxford, England) 37 (16): 2481–2482. doi:10.1093/bioinformatics/btaa978. PMID 33216126. 
  22. 22.0 22.1 22.2 22.3 22.4 22.5 Donati, B.; Baudet, C.; Sinaimeri, B.; Crescenzi, P.; Sagot, M. F. (2015). "EUCALYPT: Efficient tree reconciliation enumerator". Algorithms for Molecular Biology : Amb 10 (1): 3. doi:10.1186/s13015-014-0031-3. PMID 25648467. PMC 4310143. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4310143/. 
  23. 23.0 23.1 Bansal, M. S.; Alm, E. J.; Kellis, M. (2012). "Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss". Bioinformatics (Oxford, England) 28 (12): i283-91. doi:10.1093/bioinformatics/bts225. PMID 22689773. PMC 3371857. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3371857/. 
  24. 24.0 24.1 24.2 24.3 Stolzer, M.; Lai, H.; Xu, M.; Sathaye, D.; Vernot, B.; Durand, D. (2012). "Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees". Bioinformatics (Oxford, England) 28 (18): i409–i415. doi:10.1093/bioinformatics/bts386. PMID 22962460. PMC 3436813. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3436813/. 
  25. 25.0 25.1 Nguyen, T. H.; Ranwez, V.; Pointet, S.; Chifolleau, A. M.; Doyon, J. P.; Berry, V. (2013). "Reconciliation and local gene tree rearrangement can be of mutual profit". Algorithms for Molecular Biology : Amb 8 (1): 12. doi:10.1186/1748-7188-8-12. PMID 23566548. PMC 3871789. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3871789/. 
  26. 26.0 26.1 26.2 David, L. A.; Alm, E. J. (2011). "Rapid evolutionary innovation during an Archaean genetic expansion". Nature 469 (7328): 93–96. doi:10.1038/nature09649. PMID 21170026. 
  27. 27.0 27.1 Jacox, E.; Chauve, C.; Szöllősi, G. J.; Ponty, Y.; Scornavacca, C. (2016). "EcceTERA: Comprehensive gene tree-species tree reconciliation using parsimony". Bioinformatics (Oxford, England) 32 (13): 2056–2058. doi:10.1093/bioinformatics/btw105. PMID 27153713. 
  28. 28.0 28.1 28.2 28.3 Szöllősi, G. J.; Rosikiewicz, W.; Boussau, B.; Tannier, E.; Daubin, V. (2013). "Efficient exploration of the space of reconciled gene trees". Systematic Biology 62 (6): 901–912. doi:10.1093/sysbio/syt054. PMID 23925510. PMC 3797637. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3797637/. 
  29. 29.0 29.1 Comte, N.; Morel, B.; Hasić, D.; Guéguen, L.; Boussau, B.; Daubin, V.; Penel, S.; Scornavacca, C. et al. (2020). "Treerecs: An integrated phylogenetic tool, from sequences to reconciliations". Bioinformatics (Oxford, England) 36 (18): 4822–4824. doi:10.1093/bioinformatics/btaa615. PMID 33085745. 
  30. 30.0 30.1 30.2 Duchemin, W.; Gence, G.; Arigon Chifolleau, A. M.; Arvestad, L.; Bansal, M. S.; Berry, V.; Boussau, B.; Chevenet, F. et al. (2018). "RecPhyloXML: A format for reconciled gene trees". Bioinformatics (Oxford, England) 34 (21): 3646–3652. doi:10.1093/bioinformatics/bty389. PMID 29762653. PMC 6198865. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6198865/. 
  31. 31.0 31.1 Wu, Y. C.; Rasmussen, M. D.; Kellis, M. (2012). "Evolution at the subgene level: Domain rearrangements in the Drosophila phylogeny". Molecular Biology and Evolution 29 (2): 689–705. doi:10.1093/molbev/msr222. PMID 21900599. PMC 3258039. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3258039/. 
  32. 32.0 32.1 32.2 Stolzer, M.; Siewert, K.; Lai, H.; Xu, M.; Durand, D. (2015). "Event inference in multidomain families with phylogenetic reconciliation". BMC Bioinformatics 16 Suppl 14: S8. doi:10.1186/1471-2105-16-S14-S8. PMID 26451642. PMC 4610023. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4610023/. 
  33. 33.0 33.1 33.2 Li, L.; Bansal, M. S. (2019). "An Integrated Reconciliation Framework for Domain, Gene, and Species Level Evolution". IEEE/ACM Transactions on Computational Biology and Bioinformatics 16 (1): 63–76. doi:10.1109/TCBB.2018.2846253. PMID 29994126. 
  34. 34.0 34.1 Brooks D (1981) Hennig's Parasitological Method: A Proposed Solution. Systematic Zoology, 30(3), 229-249. doi:10.2307/2413247
  35. Wiley E (1988) Parsimony Analysis and Vicariance Biogeography. Systematic Zoology, 37(3), 271-290. doi:10.2307/2992373
  36. Csurös, M.; Miklós, I. (2009). "Streamlining and large ancestral genomes in Archaea inferred with a phylogenetic birth-and-death model". Molecular Biology and Evolution 26 (9): 2087–2095. doi:10.1093/molbev/msp123. PMID 19570746. PMC 2726834. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2726834/. 
  37. 37.0 37.1 Felsenstein J (2004) Inferring Phylogenies. Oxford University Press
  38. Groussin M and Daubin V and Gouy M and Tannier E (2016) Ancestral Reconstruction: Theory and Practice. Encyclopedia of Evolutionary Biology, Academic Press, 70-77. https://doi.org/10.1016/B978-0-12-800049-6.00166-9
  39. 39.0 39.1 Goodman M, Czelusniak J, Moore G, Romero-Herrera A & Matsuda G (1979) Fitting the Gene Lineage into its Species Lineage, a Parsimony Strategy Illustrated by Cladograms Constructed from Globin Sequences. Systematic Zoology, 28(2), 132-163. doi:10.2307/2412519
  40. 40.0 40.1 Arvestad, L.; Berglund, A. C.; Lagergren, J.; Sennblad, B. (2003). "Bayesian gene/Species tree reconciliation and orthology analysis using MCMC". Bioinformatics (Oxford, England) 19 Suppl 1: i7-15. doi:10.1093/bioinformatics/btg1000. PMID 12855432. 
  41. Arvestad L, Berglund A-C, Lagergren J and Sennblad B (2004) Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution. In Proceedings of the eighth annual international conference on Research in computational molecular biology (RECOMB '04). Association for Computing Machinery, New York, NY, USA, 326–335. https://doi.org/10.1145/974614.974657
  42. 42.0 42.1 Yu, Y.; Dong, J.; Liu, K. J.; Nakhleh, L. (2014). "Maximum likelihood inference of reticulate evolutionary histories". Proceedings of the National Academy of Sciences of the United States of America 111 (46): 16448–16453. doi:10.1073/pnas.1407950111. PMID 25368173. PMC 4246314. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4246314/. 
  43. Csurös, M. (2010). "Count: Evolutionary analysis of phylogenetic profiles with parsimony and likelihood". Bioinformatics (Oxford, England) 26 (15): 1910–1912. doi:10.1093/bioinformatics/btq315. PMID 20551134. 
  44. Chauve C and El-Mabrouk N (2009) New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees. RECOMB 5541:46-58
  45. Page R (1994) Parallel phylogenies: reconstructing the history of host-parasite assemblages. Cladistics 10: 2: 155–173.
  46. 46.0 46.1 Charleston, M. A. (1998). "Jungles: A new solution to the host/Parasite phylogeny reconciliation problem". Mathematical Biosciences 149 (2): 191–223. doi:10.1016/s0025-5564(97)10012-8. PMID 9621683. 
  47. 47.0 47.1 Doyon J, Scornavacca C, Gorbunov K, Szöllősi G, Ranwez V et al. (2010) An Efficient Algorithm for Gene/Species Trees Parsimonious Reconciliation with Losses, Duplications and Transfers. RECOMB-CG
  48. 48.0 48.1 48.2 Szöllősi, G. J.; Boussau, B.; Abby, S. S.; Tannier, E.; Daubin, V. (2012). "Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations". Proceedings of the National Academy of Sciences of the United States of America 109 (43): 17513–17518. doi:10.1073/pnas.1202997109. PMID 23043116. PMC 3491530. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3491530/. 
  49. 49.0 49.1 Merkle, D.; Middendorf, M.; Wieseke, N. (2010). "A parameter-adaptive dynamic programming approach for inferring cophylogenies". BMC Bioinformatics 11 Suppl 1: S60. doi:10.1186/1471-2105-11-S1-S60. PMID 20122236. PMC 3009534. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3009534/. 
  50. Baudet, C.; Donati, B.; Sinaimeri, B.; Crescenzi, P.; Gautier, C.; Matias, C.; Sagot, M. F. (2015). "Cophylogeny reconstruction via an approximate Bayesian computation". Systematic Biology 64 (3): 416–431. doi:10.1093/sysbio/syu129. PMID 25540454. PMC 4395844. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4395844/. 
  51. Libeskind-Hadas, R.; Wu, Y. C.; Bansal, M. S.; Kellis, M. (2014). "Pareto-optimal phylogenetic tree reconciliation". Bioinformatics (Oxford, England) 30 (12): i87-95. doi:10.1093/bioinformatics/btu289. PMID 24932009. PMC 4058917. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4058917/. 
  52. Hallett M and Lagergren J (2001) Efficient algorithms for lateral gene transfer problems. RECOMB '01, Proceedings of the fifth annual international conference on Computational biology: 149--156
  53. 53.0 53.1 53.2 Tofigh, A.; Hallett, M.; Lagergren, J. (2011). "Simultaneous identification of duplications and lateral gene transfers". IEEE/ACM Transactions on Computational Biology and Bioinformatics 8 (2): 517–535. doi:10.1109/TCBB.2010.14. PMID 21233529. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-10969. 
  54. Ovadia, Y.; Fielder, D.; Conow, C.; Libeskind-Hadas, R. (2011). "The cophylogeny reconstruction problem is NP-complete". Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 18 (1): 59–65. doi:10.1089/cmb.2009.0240. PMID 20715926. 
  55. Wieseke, N.; Hartmann, T.; Bernt, M.; Middendorf, M. (2015). "Cophylogenetic Reconciliation with ILP". IEEE/ACM Transactions on Computational Biology and Bioinformatics 12 (6): 1227–1235. doi:10.1109/TCBB.2015.2430336. PMID 26671795. 
  56. 56.0 56.1 56.2 Durand, D.; Halldórsson, B. V.; Vernot, B. (2006). "A hybrid micro-macroevolutionary approach to gene tree reconstruction". Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 13 (2): 320–335. doi:10.1089/cmb.2006.13.320. PMID 16597243. 
  57. Ma, W.; Smirnov, D.; Forman, J.; Schweickart, A.; Slocum, C.; Srinivasan, S.; Libeskind-Hadas, R. (2018). "DTL-RNB: Algorithms and Tools for Summarizing the Space of DTL Reconciliations". IEEE/ACM Transactions on Computational Biology and Bioinformatics 15 (2): 411–421. doi:10.1109/TCBB.2016.2537319. PMID 26955051. 
  58. 58.0 58.1 Chauve C, Rafiey A, Davín A, Scornavacca C, Veber P et al. (2017) MaxTiC: Fast ranking of a phylogenetic tree by Maximum Time Consistency with lateral gene transfers. bioRxiv
  59. Davín, A. A.; Tannier, E.; Williams, T. A.; Boussau, B.; Daubin, V.; Szöllősi, G. J. (2018). "Gene transfers can date the tree of life". Nature Ecology & Evolution 2 (5): 904–909. doi:10.1038/s41559-018-0525-3. PMID 29610471. PMC 5912509. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5912509/. 
  60. 60.0 60.1 Szöllősi, G. J.; Tannier, E.; Lartillot, N.; Daubin, V. (2013). "Lateral gene transfer from the dead". Systematic Biology 62 (3): 386–397. doi:10.1093/sysbio/syt003. PMID 23355531. PMC 3622898. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3622898/. 
  61. Davín, A. A.; Tricou, T.; Tannier, E.; De Vienne, D. M.; Szöllősi, G. J. (2020). "Zombi: A phylogenetic simulator of trees, genomes and sequences that accounts for dead linages". Bioinformatics (Oxford, England) 36 (4): 1286–1288. doi:10.1093/bioinformatics/btz710. PMID 31566657. PMC 7031779. //www.ncbi.nlm.nih.gov/pmc/articles/PMC7031779/. 
  62. Sennblad, B.; Schreil, E.; Berglund Sonnhammer, A. C.; Lagergren, J.; Arvestad, L. (2007). "Primetv: A viewer for reconciled trees". BMC Bioinformatics 8: 148. doi:10.1186/1471-2105-8-148. PMID 17484781. PMC 1891116. //www.ncbi.nlm.nih.gov/pmc/articles/PMC1891116/. 
  63. Chevenet, F.; Doyon, J. P.; Scornavacca, C.; Jacox, E.; Jousselin, E.; Berry, V. (2016). "SylvX: A viewer for phylogenetic tree reconciliations". Bioinformatics (Oxford, England) 32 (4): 608–610. doi:10.1093/bioinformatics/btv625. PMID 26515823. 
  64. Nguyen, T. H.; Ranwez, V.; Berry, V.; Scornavacca, C. (2013). "Support measures to estimate the reliability of evolutionary events predicted by reconciliation methods". PLOS ONE 8 (10): e73667. doi:10.1371/journal.pone.0073667. PMID 24124449. PMC 3790797. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3790797/. 
  65. 65.0 65.1 Kundu, S.; Bansal, M. S. (2018). "On the impact of uncertain gene tree rooting on duplication-transfer-loss reconciliation". BMC Bioinformatics 19 (Suppl 9): 290. doi:10.1186/s12859-018-2269-0. PMID 30367593. PMC 6101088. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6101088/. 
  66. Ma, W.; Smirnov, D.; Forman, J.; Schweickart, A.; Slocum, C.; Srinivasan, S.; Libeskind-Hadas, R. (2018). "DTL-RNB: Algorithms and Tools for Summarizing the Space of DTL Reconciliations". IEEE/ACM Transactions on Computational Biology and Bioinformatics 15 (2): 411–421. doi:10.1109/TCBB.2016.2537319. PMID 26955051. 
  67. Huber K, Moulton V, Sagot M-F, Sinaimeri B (2018) Geometric medians in reconciliation spaces of phylogenetic trees. Information Processing Letters 136: 96–101
  68. Mawhorter, R.; Libeskind-Hadas, R. (2019). "Hierarchical clustering of maximum parsimony reconciliations". BMC Bioinformatics 20 (1): 612. doi:10.1186/s12859-019-3223-5. PMID 31775628. PMC 6882150. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6882150/. 
  69. Santichaivekin, S.; Mawhorter, R.; Libeskind-Hadas, R. (2019). "An efficient exact algorithm for computing all pairwise distances between reconciliations in the duplication-transfer-loss model". BMC Bioinformatics 20 (Suppl 20): 636. doi:10.1186/s12859-019-3203-9. PMID 31842734. PMC 6915856. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6915856/. 
  70. Wang, Y.; Mary, A.; Sagot, M. F.; Sinaimeri, B. (2020). "Capybara: Equivalence ClAss enumeration of coPhylogenY event-BAsed ReconciliAtions". Bioinformatics (Oxford, England) 36 (14): 4197–4199. doi:10.1093/bioinformatics/btaa498. PMID 32556075. 
  71. Boussau, B.; Daubin, V. (2010). "Genomes as documents of evolutionary history". Trends in Ecology & Evolution 25 (4): 224–232. doi:10.1016/j.tree.2009.09.007. PMID 19880211. 
  72. Hahn, M. W. (2007). "Bias in phylogenetic tree reconciliation methods: Implications for vertebrate genome evolution". Genome Biology 8 (7): R141. doi:10.1186/gb-2007-8-7-r141. PMID 17634151. PMC 2323230. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2323230/. 
  73. 73.0 73.1 Urbini, L.; Sinaimeri, B.; Matias, C.; Sagot, M. F. (2019). "Exploring the Robustness of the Parsimonious Reconciliation Method in Host-Symbiont Cophylogeny". IEEE/ACM Transactions on Computational Biology and Bioinformatics 16 (3): 738–748. doi:10.1109/TCBB.2018.2838667. PMID 29993554. https://hal.inria.fr/hal-01842451/file/Robustness_Cophylogeny.pdf. 
  74. Górecki, P.; Eulenstein, O.; Tiuryn, J. (2013). "Unrooted tree reconciliation: A unified approach". IEEE/ACM Transactions on Computational Biology and Bioinformatics 10 (2): 522–536. doi:10.1109/TCBB.2013.22. PMID 23929875. 
  75. Lafond M, Noutahi E, El-Mabrouk N (2016) Efficient Non-Binary Gene Tree Resolution with Weighted Reconciliation Cost. 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016) 14:1–14:12
  76. 76.0 76.1 Kordi, M.; Bansal, M. S. (2017). "On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees". IEEE/ACM Transactions on Computational Biology and Bioinformatics 14 (3): 587–599. doi:10.1109/TCBB.2015.2511761. PMID 28055898. 
  77. Lai H, Stolzer M and Durand D (2017) Fast Heuristics for Resolving Weakly Supported Branches Using Duplication, Transfers, and Losses. In: Meidanis J., Nakhleh L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_16
  78. Jacox, E.; Weller, M.; Tannier, E.; Scornavacca, C. (2017). "Resolution and reconciliation of non-binary gene trees with transfers, duplications and losses". Bioinformatics (Oxford, England) 33 (7): 980–987. doi:10.1093/bioinformatics/btw778. PMID 28073758. 
  79. Bansal, M. S.; Wu, Y. C.; Alm, E. J.; Kellis, M. (2015). "Improved gene tree error correction in the presence of horizontal gene transfer". Bioinformatics (Oxford, England) 31 (8): 1211–1218. doi:10.1093/bioinformatics/btu806. PMID 25481006. PMC 4393519. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4393519/. 
  80. Lartillot, N.; Philippe, H. (2004). "A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process". Molecular Biology and Evolution 21 (6): 1095–1109. doi:10.1093/molbev/msh112. PMID 15014145. 
  81. Scornavacca, C.; Jacox, E.; Szöllősi, G. J. (2015). "Joint amalgamation of most parsimonious reconciled gene trees". Bioinformatics (Oxford, England) 31 (6): 841–848. doi:10.1093/bioinformatics/btu728. PMID 25380957. PMC 4380024. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4380024/. 
  82. 82.0 82.1 Boussau, B.; Szöllősi, G. J.; Duret, L.; Gouy, M.; Tannier, E.; Daubin, V. (2013). "Genome-scale coestimation of species and gene trees". Genome Research 23 (2): 323–330. doi:10.1101/gr.141978.112. PMID 23132911. PMC 3561873. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3561873/. 
  83. Morel, B.; Kozlov, A. M.; Stamatakis, A.; Szöllősi, G. J. (2020). "GeneRax: A Tool for Species-Tree-Aware Maximum Likelihood-Based Gene Family Tree Inference under Gene Duplication, Transfer, and Loss". Molecular Biology and Evolution 37 (9): 2763–2774. doi:10.1093/molbev/msaa141. PMID 32502238. PMC 8312565. //www.ncbi.nlm.nih.gov/pmc/articles/PMC8312565/. 
  84. 84.0 84.1 Akerborg, O.; Sennblad, B.; Arvestad, L.; Lagergren, J. (2009). "Simultaneous Bayesian gene tree reconstruction and reconciliation analysis". Proceedings of the National Academy of Sciences of the United States of America 106 (14): 5714–5719. doi:10.1073/pnas.0806251106. PMID 19299507. PMC 2667006. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2667006/. 
  85. Warnow T (2018) Supertree Construction: Opportunities and Challenges. arXiv
  86. Legried, B.; Molloy, E. K.; Warnow, T.; Roch, S. (2021). "Polynomial-Time Statistical Estimation of Species Trees Under Gene Duplication and Loss". Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 28 (5): 452–468. doi:10.1089/cmb.2020.0424. PMID 33325781. 
  87. Molloy, E. K.; Warnow, T. (2020). "FastMulRFS: Fast and accurate species tree estimation under generic gene duplication and loss models". Bioinformatics (Oxford, England) 36 (Suppl_1): i57–i65. doi:10.1093/bioinformatics/btaa444. PMID 32657396. PMC 7355287. //www.ncbi.nlm.nih.gov/pmc/articles/PMC7355287/. 
  88. Zheng Y, Wu T, Zhang L (2012) Reconciliation of Gene and Species Trees With Polytomies. arXiv:1201.3995
  89. Ma B, Li M, Zhang L (2000) From Gene Trees to Species Trees. SIAM Journal on Computing: 729–752
  90. Ullah, I.; Parviainen, P.; Lagergren, J. (2015). "Species Tree Inference Using a Mixture Model". Molecular Biology and Evolution 32 (9): 2469–2482. doi:10.1093/molbev/msv115. PMID 25963975. 
  91. Bordewich M, Semple C (2005) On the Computational Complexity of the Rooted Subtree Prune and Regraft Distance. Annals of Combinatorics 8: 409–423
  92. 92.0 92.1 Hasić, D.; Tannier, E. (2019). "Gene tree species tree reconciliation with gene conversion". Journal of Mathematical Biology 78 (6): 1981–2014. doi:10.1007/s00285-019-01331-w. PMID 30767052. 
  93. Abby, S. S.; Tannier, E.; Gouy, M.; Daubin, V. (2010). "Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests". BMC Bioinformatics 11: 324. doi:10.1186/1471-2105-11-324. PMID 20550700. PMC 2905365. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2905365/. 
  94. Hein J, Jiang T, Wang L, Zhang K (1996) On the complexity of comparing evolutionary trees. Discrete Applied Mathematics 71: 153–169
  95. Rodrigues E, Sagot M-F, Wakabayashi Y (2007) The maximum agreement forest problem. Theoretical Computer Science: 91–110
  96. Kordi M (2019) Inferring Microbial Gene Family Evolution Using Duplication-Transfer-Loss Reconciliation: Algorithms and Complexity. Doctoral Dissertations. 2101.
  97. Urbini L (2017) Models and algorithms to study the common evolutionary history of hosts and symbionts. Doctoral thesis, Université de Lyon
  98. Marin, J.; Achaz, G.; Crombach, A.; Lambert, A. (2020). "The genomic view of diversification". Journal of Evolutionary Biology 33 (10): 1387–1404. doi:10.1111/jeb.13677. PMID 32654283. 
  99. Rannala, B.; Yang, Z. (2003). "Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci". Genetics 164 (4): 1645–1656. doi:10.1093/genetics/164.4.1645. PMID 12930768. PMC 1462670. //www.ncbi.nlm.nih.gov/pmc/articles/PMC1462670/. 
  100. Degnan, J. H.; Salter, L. A. (2005). "Gene tree distributions under the coalescent process". Evolution; International Journal of Organic Evolution 59 (1): 24–37. doi:10.1111/j.0014-3820.2005.tb00891.x. PMID 15792224. 
  101. Maddison, W. P.; Knowles, L. L. (2006). "Inferring phylogeny despite incomplete lineage sorting". Systematic Biology 55 (1): 21–30. doi:10.1080/10635150500354928. PMID 16507521. 
  102. Liu, L.; Pearl, D. K. (2007). "Species trees from gene trees: Reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions". Systematic Biology 56 (3): 504–514. doi:10.1080/10635150701429982. PMID 17562474. 
  103. Rannala B, Edwards S, Leaché A, Yang Z (2020) Phylogenetics in the Genomic Era, 3.3:1–3.3:21
  104. Bork, D.; Cheng, R.; Wang, J.; Sung, J.; Libeskind-Hadas, R. (2017). "On the computational complexity of the maximum parsimony reconciliation problem in the duplication-loss-coalescence model". Algorithms for Molecular Biology : Amb 12: 6. doi:10.1186/s13015-017-0098-8. PMID 28316640. PMC 5349084. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5349084/. 
  105. Chan, Y. B.; Ranwez, V.; Scornavacca, C. (2017). "Inferring incomplete lineage sorting, duplications, transfers and losses with reconciliations". Journal of Theoretical Biology 432: 1–13. doi:10.1016/j.jtbi.2017.08.008. PMID 28801222. https://hal.archives-ouvertes.fr/hal-02154862/file/Scornavacca_20.pdf. 
  106. Du P, Ogilvie H A, Nakhleh L (2019) Unifying Gene Duplication, Loss, and Coalescence on Phylogenetic Networks. Bioinformatics Research and Applications. ISBRA 2019. Lecture Notes in Computer Science, vol 11490. Springer, Cham.
  107. Rasmussen, M. D.; Kellis, M. (2012). "Unified modeling of gene duplication, loss, and coalescence using a locus tree". Genome Research 22 (4): 755–765. doi:10.1101/gr.123901.111. PMID 22271778. PMC 3317157. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3317157/. 
  108. Wu, Y. C.; Rasmussen, M. D.; Bansal, M. S.; Kellis, M. (2014). "Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees". Genome Research 24 (3): 475–486. doi:10.1101/gr.161968.113. PMID 24310000. PMC 3941112. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3941112/. 
  109. Theis, K. R.; Dheilly, N. M.; Klassen, J. L.; Brucker, R. M.; Baines, J. F.; Bosch, T. C.; Cryan, J. F.; Gilbert, S. F. et al. (2016). "Getting the Hologenome Concept Right: An Eco-Evolutionary Framework for Hosts and Their Microbiomes". mSystems 1 (2). doi:10.1128/mSystems.00028-16. PMID 27822520. PMC 5069740. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5069740/. 
  110. Moran, N. A.; McCutcheon, J. P.; Nakabachi, A. (2008). "Genomics and evolution of heritable bacterial symbionts". Annual Review of Genetics 42: 165–190. doi:10.1146/annurev.genet.41.110306.130119. PMID 18983256. 
  111. López-Madrigal, S.; Gil, R. (2017). "Et tu, Brute? Not Even Intracellular Mutualistic Symbionts Escape Horizontal Gene Transfer". Genes 8 (10): 247. doi:10.3390/genes8100247. PMID 28961177. PMC 5664097. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5664097/. 
  112. Penz, T.; Schmitz-Esser, S.; Kelly, S. E.; Cass, B. N.; Müller, A.; Woyke, T.; Malfatti, S. A.; Hunter, M. S. et al. (2012). "Comparative genomics suggests an independent origin of cytoplasmic incompatibility in Cardinium hertigii". PLOS Genetics 8 (10): e1003012. doi:10.1371/journal.pgen.1003012. PMID 23133394. PMC 3486910. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3486910/. 
  113. Nikoh, N.; Hosokawa, T.; Moriyama, M.; Oshima, K.; Hattori, M.; Fukatsu, T. (2014). "Evolutionary origin of insect-Wolbachia nutritional mutualism". Proceedings of the National Academy of Sciences of the United States of America 111 (28): 10257–10262. doi:10.1073/pnas.1409284111. PMID 24982177. PMC 4104916. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4104916/. 
  114. 114.0 114.1 Manzano-Marín, A.; Coeur d'Acier, A.; Clamens, A. L.; Orvain, C.; Cruaud, C.; Barbe, V.; Jousselin, E. (2020). "Serial horizontal transfer of vitamin-biosynthetic genes enables the establishment of new nutritional symbionts in aphids' di-symbiotic systems". The ISME Journal 14 (1): 259–273. doi:10.1038/s41396-019-0533-6. PMID 31624345. PMC 6908640. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6908640/. 
  115. Nakabachi, A.; Ueoka, R.; Oshima, K.; Teta, R.; Mangoni, A.; Gurgui, M.; Oldham, N. J.; Van Echten-Deckert, G. et al. (2013). "Defensive bacteriome symbiont with a drastically reduced genome". Current Biology : Cb 23 (15): 1478–1484. doi:10.1016/j.cub.2013.06.027. PMID 23850282. 
  116. 116.0 116.1 Pinto-Carbó, M.; Sieber, S.; Dessein, S.; Wicker, T.; Verstraete, B.; Gademann, K.; Eberl, L.; Carlier, A. (2016). "Evidence of horizontal gene transfer between obligate leaf nodule symbionts". The ISME Journal 10 (9): 2092–2105. doi:10.1038/ismej.2016.27. PMID 26978165. PMC 4989318. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4989318/. 
  117. Jeong, H.; Arif, B.; Caetano-Anollés, G.; Kim, K. M.; Nasir, A. (2019). "Horizontal gene transfer in human-associated microorganisms inferred by phylogenetic reconstruction and reconciliation". Scientific Reports 9 (1): 5953. doi:10.1038/s41598-019-42227-5. PMID 30976019. PMC 6459891. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6459891/. 
  118. Wijayawardena, B. K.; Minchella, D. J.; Dewoody, J. A. (2013). "Hosts, parasites, and horizontal gene transfer". Trends in Parasitology 29 (7): 329–338. doi:10.1016/j.pt.2013.05.001. PMID 23759418. 
  119. Moodley, Y.; Linz, B.; Bond, R. P.; Nieuwoudt, M.; Soodyall, H.; Schlebusch, C. M.; Bernhöft, S.; Hale, J. et al. (2012). "Age of the association between Helicobacter pylori and man". PLOS Pathogens 8 (5): e1002693. doi:10.1371/journal.ppat.1002693. PMID 22589724. PMC 3349757. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3349757/. 
  120. Achtman, M. (2016). "How old are bacterial pathogens?". Proceedings. Biological Sciences 283 (1836). doi:10.1098/rspb.2016.0990. PMID 27534956. PMC 5013766. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5013766/. 
  121. Fu Y, Pistolozzi M, Yang X and Lin Z (2020) bioRxiv 2020.08.11.232520; doi: https://doi.org/10.1101/2020.08.11.232520
  122. 122.0 122.1 Da Silva, S. G.; Tehrani, J. J. (2016). "Comparative phylogenetic analyses uncover the ancient roots of Indo-European folktales". Royal Society Open Science 3 (1): 150645. doi:10.1098/rsos.150645. PMID 26909191. PMC 4736946. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4736946/. 
  123. 123.0 123.1 Ross, R. M.; Greenhill, S. J.; Atkinson, Q. D. (2013). "Population structure and cultural geography of a folktale in Europe". Proceedings. Biological Sciences 280 (1756). doi:10.1098/rspb.2012.3065. PMID 23390109. PMC 3574383. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3574383/. 
  124. Bortolini, E.; Pagani, L.; Crema, E. R.; Sarno, S.; Barbieri, C.; Boattini, A.; Sazzini, M.; Da Silva, S. G. et al. (2017). "Inferring patterns of folktale diffusion using genomic data". Proceedings of the National Academy of Sciences of the United States of America 114 (34): 9140–9145. doi:10.1073/pnas.1614395114. PMID 28784786. PMC 5576778. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5576778/. 
  125. Zwaenepoel, A.; Van De Peer, Y. (2019). "Inference of Ancient Whole-Genome Duplications and the Evolution of Gene Duplication and Loss Rates". Molecular Biology and Evolution 36 (7): 1384–1404. doi:10.1093/molbev/msz088. PMID 31004147. 
  126. Dondi, R.; Lafond, M.; Scornavacca, C. (2019). "Reconciling multiple genes trees via segmental duplications and losses". Algorithms for Molecular Biology : Amb 14: 7. doi:10.1186/s13015-019-0139-6. PMID 30930955. PMC 6425616. //www.ncbi.nlm.nih.gov/pmc/articles/PMC6425616/. 
  127. Duchemin, W.; Anselmetti, Y.; Patterson, M.; Ponty, Y.; Bérard, S.; Chauve, C.; Scornavacca, C.; Daubin, V. et al. (2017). "DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies". Genome Biology and Evolution 9 (5): 1312–1319. doi:10.1093/gbe/evx069. PMID 28402423. PMC 5441342. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5441342/. 
  128. Bansal, M. S.; Banay, G.; Gogarten, J. P.; Shamir, R. (2011). "Detecting highways of horizontal gene transfer". Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 18 (9): 1087–1114. doi:10.1089/cmb.2011.0066. PMID 21899418. 
  129. Scornavacca, C.; Mayol, J. C. P.; Cardona, G. (2017). "Fast algorithm for the reconciliation of gene trees and LGT networks". Journal of Theoretical Biology 418: 129–137. doi:10.1016/j.jtbi.2017.01.024. PMID 28111320. https://hal.archives-ouvertes.fr/hal-02154890/file/Scornavacca_27.pdf. 
  130. Yu, Y.; Ristic, N.; Nakhleh, L. (2013). "Fast algorithms and heuristics for phylogenomics under ILS and hybridization". BMC Bioinformatics 14 Suppl 15: S6. doi:10.1186/1471-2105-14-S15-S6. PMID 24564257. PMC 3852049. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3852049/. 
  131. Yu, Y.; Barnett, R. M.; Nakhleh, L. (2013). "Parsimonious inference of hybridization in the presence of incomplete lineage sorting". Systematic Biology 62 (5): 738–751. doi:10.1093/sysbio/syt037. PMID 23736104. PMC 3739885. //www.ncbi.nlm.nih.gov/pmc/articles/PMC3739885/. 
  132. Nieberding, C. M.; Durette-Desset, M. C.; Vanderpoorten, A.; Casanova, J. C.; Ribas, A.; Deffontaine, V.; Feliu, C.; Morand, S. et al. (2008). "Geography and host biogeography matter for understanding the phylogeography of a parasite". Molecular Phylogenetics and Evolution 47 (2): 538–554. doi:10.1016/j.ympev.2008.01.028. PMID 18346916. 
  133. Martínez-Aquino, A.; Ceccarelli, F. S.; Eguiarte, L. E.; Vázquez-Domínguez, E.; De León, G. P. (2014). "Do the historical biogeography and evolutionary history of the digenean Margotrema spp. across central Mexico mirror those of their freshwater fish hosts (Goodeinae)?". PLOS ONE 9 (7): e101700. doi:10.1371/journal.pone.0101700. PMID 24999998. PMC 4084993. //www.ncbi.nlm.nih.gov/pmc/articles/PMC4084993/. 
  134. Weckstein, J. D. (2004). "Biogeography explains cophylogenetic patterns in toucan chewing lice". Systematic Biology 53 (1): 154–164. doi:10.1080/10635150490265085. PMID 14965910. 
  135. Fountain, E. D.; Pauli, J. N.; Mendoza, J. E.; Carlson, J.; Peery, M. Z. (2017). "Cophylogenetics and biogeography reveal a coevolved relationship between sloths and their symbiont algae". Molecular Phylogenetics and Evolution 110: 73–80. doi:10.1016/j.ympev.2017.03.003. PMID 28288943. 
  136. Groussin, M.; Mazel, F.; Sanders, J. G.; Smillie, C. S.; Lavergne, S.; Thuiller, W.; Alm, E. J. (2017). "Unraveling the processes shaping mammalian gut microbiomes over evolutionary time". Nature Communications 8: 14319. doi:10.1038/ncomms14319. PMID 28230052. PMC 5331214. //www.ncbi.nlm.nih.gov/pmc/articles/PMC5331214/. 
  137. Berry, V.; Chevenet, F.; Doyon, J. P.; Jousselin, E. (2018). "A geography-aware reconciliation method to investigate diversification patterns in host/parasite interactions". Molecular Ecology Resources 18 (5): 1173–1184. doi:10.1111/1755-0998.12897. PMID 29697894. 
  138. Li L, Bansal M (2018) An Integer Linear Programming Solution for the Domain-Gene-Species Reconciliation Problem. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '18). Association for Computing Machinery, New York, NY, USA, 386–397. https://doi.org/10.1145/3233547.3233603
  139. Li L, Bansal M (2019) Simultaneous Multi-Domain-Multi-Gene Reconciliation Under the Domain-Gene-Species Reconciliation Model. Bioinformatics Research and Applications. ISBRA 2019. Lecture Notes in Computer Science, vol 11490. Springer, Cham. https://doi.org/10.1007/978-3-030-20242-2_7
  140. Kundu, S.; Bansal, M. S. (2019). "SaGePhy: An improved phylogenetic simulation framework for gene and subgene evolution". Bioinformatics (Oxford, England) 35 (18): 3496–3498. doi:10.1093/bioinformatics/btz081. PMID 30715213. 
  141. Muhammad S, Sennblad B, Lagergren J (2018) bioRxiv 336453; doi: https://doi.org/10.1101/336453