# PANDA.jl Phylogenetic ANalyses of DiversificAtion

Implements macroevolutionary analyses on phylogenetic trees.

## Installation

You can install the package by typing

julia> using Pkg
julia> Pkg.add("PANDA")

PANDA uses R functions and packages for plotting. If you want to be able to use the plotting functions, the R language needs to be installed on your computer. You will also need a few R packages to be installed, including : ape, coda, RColorBrewer, fields. You can install them from a R session by typing

> install.packages("ape", "coda", "RColorBrewer", "fields")

You will then be able to load PANDA to Julia by typing

julia> using PANDA

The ClaDS module implements the inference of ClaDS parameters on a phylogeny using data augmentation. A step by step presentation of how to perform the inference is available in the manual

PANDA.ClaDS.load_treeMethod
load_tree(file::String)

Loads a tree from a file. Supported formats are '.nex' and '.tre'.

Arguments

• file::String:path to the file.
source
PANDA.ClaDS.infer_ClaDSFunction
infer_ClaDS(tree::Tree, n_reccord::Int64; ...)

Infer ClaDS parameters on a tree

Arguments

• tree::Tree: the phylogeny on which to perform the inference.
• n_reccord::Int64: number of iterations between two computations of the gelman statistics. Default to 1_000.

Keyword arguments

• former_run::CladsOutput: the result of a run of infer_ClaDS on the tree. The new mcmc chains will be added to these ones.
• f::Float64: The sampling probability. It can be a Float (in which case the whole clade has the same sampling probability), or a vector length n_tips(tree)(in which case sampling probabilities are subclades specific andmust be given in the same order as the tip tip_labels(tree))Default to1..
• goal_gelman::Float64: The gelman parameter value below which the run is stoped. Default to 1.05.
• end_it::Int64: Maximum number of MCMC iterations. If the number of iterations reaches this value, the run stop even if the gelman statistic is still above goal_gelamn. Default to Inf.
• thin::Int64: The thinning parameter. Default to 1..
• burn::Float64: The proportion of the mcmc that will be discarded befor computing the gelman statistics and point estiamtes for the parameters. Default to 0.25.
• n_trees::Int64: Number of samples from the posterior distribution of complete phylogenies to be outputed. Default to 10.
• ltt_steps::Int64: Number of time points at which the rate through time and diversity through time should be computed. Default to 50.
• print_state::Int64: If > 0, the state of the chains is printed every print_stateiteration. Default to 0.
• prior_ε::String : The prior to be used for ε. Default to "uniform" (a uniform prior in [0,1000]), but a "lognormal" prior can be defined as an alternative. It can also be set to a flat prior on [0,Inf], as used in the paper, with "unifromInf". Finally, setting it to "ClaDS0" forces the extinction at 0.
• logε0::Float64: If prior_ε = "lognormal", mean of the ε prior on the log scale.
• sdε::Float64: If prior_ε = "lognormal", standard deviation of the ε prior on the log scale.
source
PANDA.ClaDS.plot_ClaDSMethod
plot_ClaDS(tree::Tree ; ln=true, show_labels = false, options="")

Plots a tree with branches colored according to their rates

Arguments

• tree::Tree: the phylogeny to be plotted.

Keyword arguments

• ln::Bool: Should rates be plotted on a log scale? Default to true.
• show_labels::Bool: Should tip labels be printed? Default to false.
• options::String: Additonal options for the ploting function.
plot_ClaDS(tree::Tree, rates::Array{Number,1} ; ln=true, show_labels = false, options="")

• rates::Array{Number,1}: the speciation rates.
source
PANDA.ClaDS.sim_ClaDS2_ntipsMethod
sim_ClaDS2_ntips(n::Int64,σ::Float64,α::Float64,ε::Float64,λ0::Float64 ;prune_extinct = true, sed = 0.001,max_time = 5, max_simulation_try = 100)

Simulate a tree from the ClaDS model conditionned on the number of tips at present

Arguments

• n::Int64: the goal number of tips at present.
• σ::Float64: the stochasticity parameter.
• α::Float64: the trend parameter.
• ε::Float64: turnover rate (extinction / speciation).
• λ0::Float64: initial speciation rate.

Keyword arguments

• prune_extinct::Bool: Should extinct lineages be removed from the output tree? Default to false.
source
PANDA.ClaDS.sample_tipsMethod
sample_tips(tree::Tree, f::Float64 ; root_age = true)

Sample tips from a tree. Each tip is kept with probability f. The fuction returns a tupple (phylo, rates), where phylo is a Tree object containing the sampled tree, and rates a vector of Floats with its tip rates.

Arguments

• tree::Tree: the phylogeny.
• f::Float64: the sampling probability.

Keyword arguments

• root_age::Bool: If true, the sampling is constrained so that the tree keeps the same root age : at least one tip is sampled in each of the two tree subtrees.
source
PANDA.ClaDS.TreeType

A phylogeny object for the module ClaDS. It is represented as a branch with a given length and optional attributes, and its daughter trees.

• offsprings::Array{Tree,1}: the two daughter trees.
• branch_length::Float64: the length of the branch
• attributes::Array{T,1} where {T<:Number}: soem attributes of the branch. Used to store the speciation rates.
• n_nodes::Int64: the number of nodes in the tree (internal nodes + tips)
• extant::Bool: does the tree have any extant species?
• label::String: if the tree is a tip (ie n_nodes == 1), contains the name of the corresponding species.
source
PANDA.ClaDS.CladsOutputType

A structure containig the informations about the resulot of a ClaDS run. Contains the following fields :

• tree::Tree: the phylogeny on which the analysis was performed.
• chains::Array{Array{Array{Float64,1},1},1} : the mcmc chains
• rtt_chains::Array{Array{Array{Float64,1},1},1} : the mcmc chains with the rate through time
• σ_map::Float64 : the σ parameter estimate
• α_map::Float64 : the α parameter estimate
• ε_map::Float64 : the ε parameter estimate
• λ0_map::Float64 : the initial speciation rate estimate
• λi_map::Array{Float64,1} : the estimates of the branh specific speciation rates
• λtip_map::Array{Float64,1} : the estimates of the tip speciation rates
• DTT_mean::Array{Float64,1} : the diversity through time estimates
• RTT_map::Array{Float64,1} : the rate through time estimates
• time_points::Array{Float64,1} : the times at which DTT_mean and RTT_map are computed
• enhanced_trees::Array{Tree,1} : a sample of the complete tree distribution
• gelm::Tuple{Int64,Float64} : the gelman parameter
• current_state : other variables, by infer_ClaDS used to continue the run
source
PANDA.ClaDS.plot_CladsOutputMethod
plot_CladsOutput(co::CladsOutput ; method = "tree", ...)

Plots various aspects of the output of ClaDS

Arguments

• co::CladsOutput : A CladsOutput object, the output of a ClaDS run.

Keyword arguments

• method::String : A String indicating what aspect of the output should be plotted, see details.
source
PANDA.ClaDS.save_ClaDS_in_RMethod
save_ClaDS_in_R(co::CladsOutput, path::String ; maxit = Inf, ...)

sss Save the output of a ClaDS run as a Rdata file.

Arguments

• co::CladsOutput: the result of a run of infer_ClaDS.
• path::String: the path the file should be saved to.

.Rdata file

The .Rdata file contains an object called CladsOutput that is a list with the following fields :

• tree: the phylogeny, saved as a phylo object
• chains: the MCMC chains
• rtt_chains: the MCMCs with the rate through time information
• sig_map: the MAP estimate for the σ parameter
• al_map: the MAP estimate for the α parameter
• eps_map: the MAP estimate for the ε parameter
• lambda0_map: the MAP estimate for the λ0 parameter
• lambdai_map: the MAP estimate for the branch-specific speciation rates
• lambdatip_map: the MAP estimate for the tip speciation rates
• DTT_mean: the point estimate for the diversity through time
• RTT_map : the rate through time estimates
• time_points : the times at which DTT_mean and RTT_map are computed
• enhanced_trees : a sample of the complete tree distribution, as a list of phylo objects
• gelm` : the gelman parameter
source