PANDA.jl Phylogenetic ANalyses of DiversificAtion
Implements macroevolutionary analyses on phylogenetic trees.
Installation
You can install the package by typing
julia> using Pkg
julia> Pkg.add("PANDA")
PANDA uses R functions and packages for plotting. To use the plotting functions, R must be installed on your computer, along with a few R packages, including ape, coda, RColorBrewer and fields. You can install them from an R session by typing
> install.packages(c("ape", "coda", "RColorBrewer", "fields"))
You will then be able to load PANDA in Julia by typing
julia> using PANDA
ClaDS
The ClaDS module implements the inference of ClaDS parameters on a phylogeny using data augmentation. A step-by-step presentation of how to perform the inference is available in the manual.
PANDA.ClaDS.load_tree — Method
load_tree(file::String)
Loads a tree from a file. Supported formats are '.nex' and '.tre'.
Arguments
- file::String: path to the file.
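As a minimal sketch of how loading might look in practice, the snippet below reads a tree and reports its size using n_tips and tip_labels from this module. The file name is hypothetical, and the functions are imported by their qualified names so that nothing is assumed about which names PANDA exports.

```julia
using PANDA
using PANDA.ClaDS: load_tree, n_tips, tip_labels

# Hypothetical path: replace with your own '.nex' or '.tre' file.
tree_file = "my_phylogeny.tre"

if isfile(tree_file)
    tree = load_tree(tree_file)
    println("Number of tips: ", n_tips(tree))
    println("Tip labels: ", tip_labels(tree))
else
    println("File not found: ", tree_file)
end
```

Checking for the file first keeps the error message readable when the path is wrong.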
PANDA.ClaDS.infer_ClaDS — Function
infer_ClaDS(tree::Tree, n_reccord::Int64; ...)
Infer ClaDS parameters on a tree.
Arguments
- tree::Tree: the phylogeny on which to perform the inference.
- n_reccord::Int64: number of iterations between two computations of the Gelman statistics. Defaults to 1_000.
Keyword arguments
- former_run::CladsOutput: the result of a previous run of infer_ClaDS on the tree. The new MCMC chains will be added to these.
- f::Float64: the sampling probability. It can be a Float (in which case the whole clade has the same sampling probability) or a vector of length n_tips(tree) (in which case sampling probabilities are subclade-specific and must be given in the same order as tip_labels(tree)). Defaults to 1.
- goal_gelman::Float64: the Gelman statistic value below which the run is stopped. Defaults to 1.05.
- end_it::Int64: maximum number of MCMC iterations. If the number of iterations reaches this value, the run stops even if the Gelman statistic is still above goal_gelman. Defaults to Inf.
- thin::Int64: the thinning parameter. Defaults to 1.
- burn::Float64: the proportion of the MCMC that will be discarded before computing the Gelman statistics and point estimates for the parameters. Defaults to 0.25.
- n_trees::Int64: number of samples from the posterior distribution of complete phylogenies to be output. Defaults to 10.
- ltt_steps::Int64: number of time points at which the rate through time and diversity through time should be computed. Defaults to 50.
- print_state::Int64: if > 0, the state of the chains is printed every print_state iterations. Defaults to 0.
- prior_ε::String: the prior to be used for ε. Defaults to "uniform" (a uniform prior on [0,1000]), but a "lognormal" prior can be used as an alternative. It can also be set to a flat prior on [0,Inf], as used in the paper, with "unifromInf". Finally, setting it to "ClaDS0" forces the extinction to 0.
- logε0::Float64: if prior_ε = "lognormal", mean of the ε prior on the log scale.
- sdε::Float64: if prior_ε = "lognormal", standard deviation of the ε prior on the log scale.
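The following is a hedged sketch of a typical call, using only the arguments and keywords documented above. The file name and all numeric values are illustrative assumptions, not recommendations. The second call shows how former_run might be used to extend an earlier run; passing the same f again is a cautious assumption, as this page does not say whether it is required.

```julia
using PANDA
using PANDA.ClaDS: load_tree, infer_ClaDS

tree_file = "my_phylogeny.tre"   # hypothetical path

if isfile(tree_file)
    tree = load_tree(tree_file)

    # First run: check convergence every 1_000 iterations, stop once the
    # Gelman statistic drops below 1.05 or after 50_000 iterations.
    output = infer_ClaDS(tree, 1_000;
                         f = 0.9,             # illustrative sampling probability
                         goal_gelman = 1.05,
                         end_it = 50_000,
                         burn = 0.25,
                         print_state = 1_000)

    # Extend the same analysis, adding new chains to the previous ones.
    output = infer_ClaDS(tree, 1_000; former_run = output, f = 0.9)

    println("σ estimate: ", output.σ_map)
end
```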
PANDA.ClaDS.plot_ClaDS — Method
plot_ClaDS(tree::Tree ; ln=true, show_labels = false, options="")
Plots a tree with branches colored according to their rates.
Arguments
- tree::Tree: the phylogeny to be plotted.
Keyword arguments
- ln::Bool: should rates be plotted on a log scale? Defaults to true.
- show_labels::Bool: should tip labels be printed? Defaults to false.
- options::String: additional options for the plotting function.
plot_ClaDS(tree::Tree, rates::Array{Number,1} ; ln=true, show_labels = false, options="")
Additional arguments
- rates::Array{Number,1}: the speciation rates.
PANDA.ClaDS.sim_ClaDS2_ntips — Method
sim_ClaDS2_ntips(n::Int64,σ::Float64,α::Float64,ε::Float64,λ0::Float64 ;prune_extinct = true, sed = 0.001,max_time = 5, max_simulation_try = 100)
Simulate a tree from the ClaDS model conditioned on the number of tips at present.
Arguments
- n::Int64: the goal number of tips at present.
- σ::Float64: the stochasticity parameter.
- α::Float64: the trend parameter.
- ε::Float64: turnover rate (extinction / speciation).
- λ0::Float64: initial speciation rate.
Keyword arguments
- prune_extinct::Bool: should extinct lineages be removed from the output tree? Defaults to true.
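A short sketch of a simulation call follows. The parameter values are arbitrary illustrations, and the code assumes the function returns a Tree object, which this page does not state explicitly.

```julia
using PANDA
using PANDA.ClaDS: sim_ClaDS2_ntips, n_tips

n  = 20     # goal number of tips at present
σ  = 0.1    # stochasticity parameter (illustrative value)
α  = 0.9    # trend parameter (illustrative value)
ε  = 0.5    # turnover rate (illustrative value)
λ0 = 0.1    # initial speciation rate (illustrative value)

# Assumption: the return value is a Tree, so n_tips can be applied to it.
tree = sim_ClaDS2_ntips(n, σ, α, ε, λ0; prune_extinct = true)
println("Simulated tips: ", n_tips(tree))
```

If R and the required R packages are installed, plot_ClaDS(tree) could then be used to display the simulated tree.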
PANDA.ClaDS.sample_tips — Method
sample_tips(tree::Tree, f::Float64 ; root_age = true)
Sample tips from a tree. Each tip is kept with probability f. The function returns a tuple (phylo, rates), where phylo is a Tree object containing the sampled tree, and rates is a vector of Floats with its tip rates.
Arguments
- tree::Tree: the phylogeny.
- f::Float64: the sampling probability.
Keyword arguments
- root_age::Bool: if true, the sampling is constrained so that the tree keeps the same root age: at least one tip is sampled in each of the two subtrees descending from the root.
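To illustrate the returned tuple, the sketch below subsamples a simulated tree. The simulation parameters are arbitrary, and the code assumes sim_ClaDS2_ntips returns a Tree.

```julia
using PANDA
using PANDA.ClaDS: sim_ClaDS2_ntips, sample_tips, n_tips

# Illustrative parameter values (n, σ, α, ε, λ0).
full_tree = sim_ClaDS2_ntips(50, 0.1, 0.9, 0.5, 0.1)

# Keep each tip with probability 0.5 while preserving the root age.
phylo, rates = sample_tips(full_tree, 0.5; root_age = true)

println("Tips before sampling: ", n_tips(full_tree))
println("Tips after sampling:  ", n_tips(phylo))
println("Number of tip rates:  ", length(rates))
```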
PANDA.ClaDS.Tree — Type
A phylogeny object for the module ClaDS. It is represented as a branch with a given length and optional attributes, and its daughter trees.
- offsprings::Array{Tree,1}: the two daughter trees.
- branch_length::Float64: the length of the branch
- attributes::Array{T,1} where {T<:Number}: some attributes of the branch. Used to store the speciation rates.
- n_nodes::Int64: the number of nodes in the tree (internal nodes + tips)
- extant::Bool: does the tree have any extant species?
- label::String: if the tree is a tip (i.e. n_nodes == 1), contains the name of the corresponding species.
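To make the recursive representation concrete, here is a self-contained sketch of a structure with the same fields. SketchTree, make_tip, make_node, count_tips and collect_labels are hypothetical names invented for this illustration. They are not PANDA's implementation and do not require the package.

```julia
# Hypothetical stand-in for PANDA.ClaDS.Tree, for illustration only.
struct SketchTree
    offsprings::Vector{SketchTree}   # daughter trees (empty for a tip)
    branch_length::Float64
    attributes::Vector{Float64}      # here: one speciation rate per branch
    n_nodes::Int                     # internal nodes + tips
    extant::Bool
    label::String
end

# A tip has no daughters and counts as a single node.
make_tip(label, len, rate) = SketchTree(SketchTree[], len, [rate], 1, true, label)

# An internal branch counts itself plus all nodes of its two daughters.
function make_node(left::SketchTree, right::SketchTree, len, rate)
    SketchTree([left, right], len, [rate],
               1 + left.n_nodes + right.n_nodes,
               left.extant || right.extant, "")
end

# Count tips by recursing through the daughter trees.
count_tips(t::SketchTree) = isempty(t.offsprings) ? 1 : sum(count_tips, t.offsprings)

# Collect tip labels in left-to-right order.
function collect_labels(t::SketchTree)
    isempty(t.offsprings) ? [t.label] : reduce(vcat, map(collect_labels, t.offsprings))
end

cherry = make_node(make_tip("A", 1.0, 0.1), make_tip("B", 1.0, 0.12), 0.5, 0.1)
tree = make_node(cherry, make_tip("C", 1.5, 0.09), 0.0, 0.1)

println(count_tips(tree))       # 3
println(collect_labels(tree))
println(tree.n_nodes)           # 5
```

Storing n_nodes on each branch means the size of any subtree can be read without traversing it, at the cost of computing it once at construction.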
PANDA.ClaDS.n_tips — Method
n_tips(tree::Tree)
Get the number of tips in a tree.
Arguments
- tree::Tree: a Tree object.
PANDA.ClaDS.tip_labels — Method
tip_labels(tree::Tree)
Extract the tip labels of a tree.
Arguments
- tree::Tree: a Tree object.
PANDA.ClaDS.CladsOutput — Type
A structure containing information about the result of a ClaDS run. Contains the following fields:
- tree::Tree: the phylogeny on which the analysis was performed.
- chains::Array{Array{Array{Float64,1},1},1}: the mcmc chains
- rtt_chains::Array{Array{Array{Float64,1},1},1}: the mcmc chains with the rate through time
- σ_map::Float64: the σ parameter estimate
- α_map::Float64: the α parameter estimate
- ε_map::Float64: the ε parameter estimate
- λ0_map::Float64: the initial speciation rate estimate
- λi_map::Array{Float64,1}: the estimates of the branch-specific speciation rates
- λtip_map::Array{Float64,1}: the estimates of the tip speciation rates
- DTT_mean::Array{Float64,1}: the diversity through time estimates
- RTT_map::Array{Float64,1}: the rate through time estimates
- time_points::Array{Float64,1}: the times at which DTT_mean and RTT_map are computed
- enhanced_trees::Array{Tree,1}: a sample of the complete tree distribution
- gelm::Tuple{Int64,Float64}: the gelman parameter
- current_state: other variables, used by infer_ClaDS to continue the run
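The fields listed above can be read with ordinary dot access. The sketch below defines a hypothetical helper, summarize_clads, that prints the point estimates. It is exercised on a NamedTuple of made-up values standing in for a real CladsOutput, so it runs without the package; the numbers are placeholders, not results.

```julia
# Hypothetical helper: print the point estimates stored in a ClaDS result.
# `co` is expected to be a CladsOutput; only documented field names are used.
function summarize_clads(co)
    println("σ = ", co.σ_map, ", α = ", co.α_map,
            ", ε = ", co.ε_map, ", λ0 = ", co.λ0_map)
    println("mean tip rate = ", sum(co.λtip_map) / length(co.λtip_map))
    return (co.σ_map, co.α_map, co.ε_map, co.λ0_map)
end

# Stand-in with made-up values; a real call would pass the output of infer_ClaDS.
mock = (σ_map = 0.2, α_map = 0.9, ε_map = 0.1, λ0_map = 0.05,
        λtip_map = [0.04, 0.06])

estimates = summarize_clads(mock)
```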
PANDA.ClaDS.plot_CladsOutput — Method
plot_CladsOutput(co::CladsOutput ; method = "tree", ...)
Plots various aspects of the output of ClaDS.
Arguments
- co::CladsOutput: a CladsOutput object, the output of a ClaDS run.
Keyword arguments
- method::String: a String indicating what aspect of the output should be plotted, see details.
PANDA.ClaDS.save_ClaDS_in_R — Method
save_ClaDS_in_R(co::CladsOutput, path::String ; maxit = Inf, ...)
Save the output of a ClaDS run as an Rdata file.
Arguments
- co::CladsOutput: the result of a run of infer_ClaDS.
- path::String: the path the file should be saved to.
.Rdata file
The .Rdata file contains an object called CladsOutput that is a list with the following fields:
- tree: the phylogeny, saved as a phylo object
- chains: the MCMC chains
- rtt_chains: the MCMCs with the rate through time information
- sig_map: the MAP estimate for the σ parameter
- al_map: the MAP estimate for the α parameter
- eps_map: the MAP estimate for the ε parameter
- lambda0_map: the MAP estimate for the λ0 parameter
- lambdai_map: the MAP estimate for the branch-specific speciation rates
- lambdatip_map: the MAP estimate for the tip speciation rates
- DTT_mean: the point estimate for the diversity through time
- RTT_map: the rate through time estimates
- time_points: the times at which DTT_mean and RTT_map are computed
- enhanced_trees: a sample of the complete tree distribution, as a list of phylo objects
- gelm: the Gelman parameter