PANDA.jl Phylogenetic ANalyses of DiversificAtion

Implements macroevolutionary analyses on phylogenetic trees.

Installation

You can install the package by typing

julia> using Pkg
julia> Pkg.add("PANDA")

PANDA uses R functions and packages for plotting. If you want to be able to use the plotting functions, the R language needs to be installed on your computer. You will also need a few R packages to be installed, including : ape, coda, RColorBrewer, fields. You can install them from a R session by typing

> install.packages("ape", "coda", "RColorBrewer", "fields")

You will then be able to load PANDA to Julia by typing

julia> using PANDA

ClaDS

The ClaDS module implements the inference of ClaDS parameters on a phylogeny using data augmentation. A step by step presentation of how to perform the inference is available in the manual

PANDA.ClaDS.load_treeMethod
load_tree(file::String)

Loads a tree from a file. Supported formats are '.nex' and '.tre'.

Arguments

  • file::String:path to the file.
source
PANDA.ClaDS.infer_ClaDSFunction
infer_ClaDS(tree::Tree, n_reccord::Int64; ...)

Infer ClaDS parameters on a tree

Arguments

  • tree::Tree: the phylogeny on which to perform the inference.
  • n_reccord::Int64: number of iterations between two computations of the gelman statistics. Default to 1_000.

Keyword arguments

  • former_run::CladsOutput: the result of a run of infer_ClaDS on the tree. The new mcmc chains will be added to these ones.
  • f::Float64: The sampling probability. It can be a Float (in which case the whole clade has the same sampling probability), or a vector length n_tips(tree)(in which case sampling probabilities are subclades specific andmust be given in the same order as the tip tip_labels(tree))Default to1.`.
  • goal_gelman::Float64: The gelman parameter value below which the run is stoped. Default to 1.05.
  • end_it::Int64: Maximum number of MCMC iterations. If the number of iterations reaches this value, the run stop even if the gelman statistic is still above goal_gelamn. Default to Inf.
  • thin::Int64: The thinning parameter. Default to 1..
  • burn::Float64: The proportion of the mcmc that will be discarded befor computing the gelman statistics and point estiamtes for the parameters. Default to 0.25.
  • n_trees::Int64: Number of samples from the posterior distribution of complete phylogenies to be outputed. Default to 10.
  • ltt_steps::Int64: Number of time points at which the rate through time and diversity through time should be computed. Default to 50.
  • print_state::Int64: If > 0, the state of the chains is printed every print_stateiteration. Default to 0.
  • prior_ε::String : The prior to be used for ε. Default to "uniform" (a uniform prior in [0,1000]), but a "lognormal" prior can be defined as an alternative. It can also be set to a flat prior on [0,Inf], as used in the paper, with "unifromInf". Finally, setting it to "ClaDS0" forces the extinction at 0.
  • logε0::Float64: If prior_ε = "lognormal", mean of the ε prior on the log scale.
  • sdε::Float64: If prior_ε = "lognormal", standard deviation of the ε prior on the log scale.
source
PANDA.ClaDS.plot_ClaDSMethod
plot_ClaDS(tree::Tree ; ln=true, show_labels = false, options="")

Plots a tree with branches colored according to their rates

Arguments

  • tree::Tree: the phylogeny to be plotted.

Keyword arguments

  • ln::Bool: Should rates be plotted on a log scale? Default to true.
  • show_labels::Bool: Should tip labels be printed? Default to false.
  • options::String: Additonal options for the ploting function.
plot_ClaDS(tree::Tree, rates::Array{Number,1} ; ln=true, show_labels = false, options="")

Additional arguments

  • rates::Array{Number,1}: the speciation rates.
source
PANDA.ClaDS.sim_ClaDS2_ntipsMethod
sim_ClaDS2_ntips(n::Int64,σ::Float64,α::Float64,ε::Float64,λ0::Float64 ;prune_extinct = true, sed = 0.001,max_time = 5, max_simulation_try = 100)

Simulate a tree from the ClaDS model conditionned on the number of tips at present

Arguments

  • n::Int64: the goal number of tips at present.
  • σ::Float64: the stochasticity parameter.
  • α::Float64: the trend parameter.
  • ε::Float64: turnover rate (extinction / speciation).
  • λ0::Float64: initial speciation rate.

Keyword arguments

  • prune_extinct::Bool: Should extinct lineages be removed from the output tree? Default to false.
source
PANDA.ClaDS.sample_tipsMethod
sample_tips(tree::Tree, f::Float64 ; root_age = true)

Sample tips from a tree. Each tip is kept with probability f. The fuction returns a tupple (phylo, rates), where phylo is a Tree object containing the sampled tree, and rates a vector of Floats with its tip rates.

Arguments

  • tree::Tree: the phylogeny.
  • f::Float64: the sampling probability.

Keyword arguments

  • root_age::Bool: If true, the sampling is constrained so that the tree keeps the same root age : at least one tip is sampled in each of the two tree subtrees.
source
PANDA.ClaDS.TreeType

A phylogeny object for the module ClaDS. It is represented as a branch with a given length and optional attributes, and its daughter trees.

  • offsprings::Array{Tree,1}: the two daughter trees.
  • branch_length::Float64: the length of the branch
  • attributes::Array{T,1} where {T<:Number}: soem attributes of the branch. Used to store the speciation rates.
  • n_nodes::Int64: the number of nodes in the tree (internal nodes + tips)
  • extant::Bool: does the tree have any extant species?
  • label::String: if the tree is a tip (ie n_nodes == 1), contains the name of the corresponding species.
source
PANDA.ClaDS.CladsOutputType

A structure containig the informations about the resulot of a ClaDS run. Contains the following fields :

  • tree::Tree: the phylogeny on which the analysis was performed.
  • chains::Array{Array{Array{Float64,1},1},1} : the mcmc chains
  • rtt_chains::Array{Array{Array{Float64,1},1},1} : the mcmc chains with the rate through time
  • σ_map::Float64 : the σ parameter estimate
  • α_map::Float64 : the α parameter estimate
  • ε_map::Float64 : the ε parameter estimate
  • λ0_map::Float64 : the initial speciation rate estimate
  • λi_map::Array{Float64,1} : the estimates of the branh specific speciation rates
  • λtip_map::Array{Float64,1} : the estimates of the tip speciation rates
  • DTT_mean::Array{Float64,1} : the diversity through time estimates
  • RTT_map::Array{Float64,1} : the rate through time estimates
  • time_points::Array{Float64,1} : the times at which DTT_mean and RTT_map are computed
  • enhanced_trees::Array{Tree,1} : a sample of the complete tree distribution
  • gelm::Tuple{Int64,Float64} : the gelman parameter
  • current_state : other variables, by infer_ClaDS used to continue the run
source
PANDA.ClaDS.plot_CladsOutputMethod
plot_CladsOutput(co::CladsOutput ; method = "tree", ...)

Plots various aspects of the output of ClaDS

Arguments

  • co::CladsOutput : A CladsOutput object, the output of a ClaDS run.

Keyword arguments

  • method::String : A String indicating what aspect of the output should be plotted, see details.
source
PANDA.ClaDS.save_ClaDS_in_RMethod
save_ClaDS_in_R(co::CladsOutput, path::String ; maxit = Inf, ...)

sss Save the output of a ClaDS run as a Rdata file.

Arguments

  • co::CladsOutput: the result of a run of infer_ClaDS.
  • path::String: the path the file should be saved to.

.Rdata file

The .Rdata file contains an object called CladsOutput that is a list with the following fields :

  • tree: the phylogeny, saved as a phylo object
  • chains: the MCMC chains
  • rtt_chains: the MCMCs with the rate through time information
  • sig_map: the MAP estimate for the σ parameter
  • al_map: the MAP estimate for the α parameter
  • eps_map: the MAP estimate for the ε parameter
  • lambda0_map: the MAP estimate for the λ0 parameter
  • lambdai_map: the MAP estimate for the branch-specific speciation rates
  • lambdatip_map: the MAP estimate for the tip speciation rates
  • DTT_mean: the point estimate for the diversity through time
  • RTT_map : the rate through time estimates
  • time_points : the times at which DTT_mean and RTT_map are computed
  • enhanced_trees : a sample of the complete tree distribution, as a list of phylo objects
  • gelm : the gelman parameter
source