Merry Christmas!

It’s Christmas time, and I decided to enter the mood by making a Christmas tree representing my current interests. Back in 2015 I constructed a Christmas tree based on the shape of an arbitrary cladogram and then added color and shapes in order to represent a Christmas tree. Below I describe how I did it long with the final output. Then I describe the process for constructing another Christmas tree, this time using a random walk stochastic process and some automatic additions to the plot of several realizations of such process. As with the cladistic Christmas tree, I present below the output along with the script for constructing it. These two graphics represent my shift from a hard-core parsimony guy to a fanatic of all quantitative things.

Cladistic, manual version

First thing is to create a taxon-only matrix in mesquite, and then to manually create the relationships and also to transpose the daughter branches for each node. The output would be something like this:

[written Wed Dec 23 14:54:24 BRST 2015 by Mesquite  version 3.04 (build 725) at ware/]

	TITLE Taxa;
		taxon_1 taxon_2 taxon_3 taxon_4 taxon_5 taxon_6 taxon_7 taxon_8 taxon_9 taxon_10 taxon_11 taxon_12 taxon_13 taxon_14 taxon_15 taxon_16 taxon_17 taxon_18 taxon_19 taxon_20


	Title Trees;
	LINK Taxa = Taxa;
[0] 		1 taxon_1,
[1] 		2 taxon_2,
[2] 		3 taxon_3,
[3] 		4 taxon_4,
[4] 		5 taxon_5,
[5] 		6 taxon_6,
[6] 		7 taxon_7,
[7] 		8 taxon_8,
[8] 		9 taxon_9,
[9] 		10 taxon_10,
[10] 		11 taxon_11,
[11] 		12 taxon_12,
[12] 		13 taxon_13,
[13] 		14 taxon_14,
[14] 		15 taxon_15,
[15] 		16 taxon_16,
[16] 		17 taxon_17,
[17] 		18 taxon_18,
[18] 		19 taxon_19,
[19] 		20 taxon_20;
	TREE tree1 = (20,(18,((1,((3,((15,((4,((5,((7,((11,((9,10),12)),8)),6)),13)),14)),16)),2)),17)),19)[% ] [% ] [%  setBetweenBits = selected ];


Then I manually edited it in figtree, mostly making the branches thicker, and fixing the fonts, resulting in the following file:

begin taxa;
	dimensions ntax=20;

begin trees;
	tree tree_1 = [&R] (taxon_20:1.0,(taxon_18:1.0,((taxon_1:1.0,((taxon_3:1.0,((taxon_15:1.0,((taxon_4:1.0,((taxon_5:1.0,((taxon_7:1.0,((taxon_11:1.0,((taxon_9:1.0,taxon_10:1.0):1.0,taxon_12:1.0):1.0):1.0,taxon_8:1.0):1.0):1.0,taxon_6:1.0):1.0):1.0,taxon_13:1.0):1.0):1.0,taxon_14:1.0):1.0):1.0,taxon_16:1.0):1.0):1.0,taxon_2:1.0):1.0):1.0,taxon_17:1.0):1.0):1.0,taxon_19:1.0);

begin figtree;
	set appearance.backgroundColorAttribute="Default";
	set appearance.backgroundColour=#ffffff;
	set appearance.branchColorAttribute="User selection";
	set appearance.branchColorGradient=false;
	set appearance.branchLineWidth=7.0;
	set appearance.branchMinLineWidth=0.0;
	set appearance.branchWidthAttribute="Fixed";
	set appearance.foregroundColour=#000000;
	set appearance.hilightingGradient=false;
	set appearance.selectionColour=#2d3680;
	set branchLabels.colorAttribute="User selection";
	set branchLabels.displayAttribute="Branch times";
	set branchLabels.fontName="Abyssinica SIL";
	set branchLabels.fontSize=8;
	set branchLabels.fontStyle=0;
	set branchLabels.isShown=false;
	set branchLabels.significantDigits=4;
	set layout.expansion=642;
	set layout.layoutType="RECTILINEAR";
	set layout.zoom=0;
	set legend.attribute=null;
	set legend.fontSize=10.0;
	set legend.isShown=false;
	set legend.significantDigits=4;
	set nodeBars.barWidth=4.0;
	set nodeBars.displayAttribute=null;
	set nodeBars.isShown=false;
	set nodeLabels.colorAttribute="User selection";
	set nodeLabels.displayAttribute="Node ages";
	set nodeLabels.fontName="Abyssinica SIL";
	set nodeLabels.fontSize=8;
	set nodeLabels.fontStyle=0;
	set nodeLabels.isShown=false;
	set nodeLabels.significantDigits=4;
	set nodeShape.colourAttribute=null;
	set nodeShape.isShown=false;
	set nodeShape.minSize=10.0;
	set nodeShape.scaleType=Width;
	set nodeShape.shapeType=Circle;
	set nodeShape.size=4.0;
	set nodeShape.sizeAttribute=null;
	set polarLayout.alignTipLabels=false;
	set polarLayout.angularRange=0;
	set polarLayout.rootAngle=0;
	set polarLayout.rootLength=100;
	set polarLayout.showRoot=true;
	set radialLayout.spread=0.0;
	set rectilinearLayout.alignTipLabels=false;
	set rectilinearLayout.curvature=0;
	set rectilinearLayout.rootLength=100;
	set scale.offsetAge=0.0;
	set scale.rootAge=1.0;
	set scale.scaleFactor=1.0;
	set scale.scaleRoot=false;
	set scaleAxis.automaticScale=true;
	set scaleAxis.fontSize=8.0;
	set scaleAxis.isShown=false;
	set scaleAxis.lineWidth=1.0;
	set scaleAxis.majorTicks=1.0;
	set scaleAxis.origin=0.0;
	set scaleAxis.reverseAxis=false;
	set scaleAxis.showGrid=true;
	set scaleBar.automaticScale=true;
	set scaleBar.fontSize=10.0;
	set scaleBar.isShown=true;
	set scaleBar.lineWidth=1.0;
	set scaleBar.scaleRange=0.0;
	set tipLabels.colorAttribute="User selection";
	set tipLabels.displayAttribute="Names";
	set tipLabels.fontName="Abyssinica SIL";
	set tipLabels.fontSize=16;
	set tipLabels.fontStyle=2;
	set tipLabels.isShown=true;
	set tipLabels.significantDigits=4;
	set trees.order=false;
	set trees.orderType="increasing";
	set trees.rooting=false;
	set trees.rootingType="User Selection";
	set trees.transform=false;
	set trees.transformType="cladogram";

We can save the file above in svg vectorial format so that then we can do a bit of edition in Inkscape I ended up with this:


The process is fairly manual, but it pleased both my contacts in social networks and myself, and it was funny to tweak things so that an evolutionary construct based on graph theory could mimic a Christmas tree. It is noteworthy that all was carried out with open source programs, so you can try it yourself without need to pay for the programs 🙂

Stochastic, automated Christmas tree

Stochastic processes are models that describe phenomena incorporating uncertainty across time, that is, where the final result and the trajectory of a sequence of events can not be determined, however, they can be characterized so that we can build expectations about them. A big deal of real-world phenomena can be described using stochastic processes, from time series analysis (including financial applications, medical imaging, and world-scale phenomena such as global warming) to modeling of evolution (e.g., Brownian motion, the birth-death process).

One of these is often called random walk, and says that for a given variable, the values it takes through time are allowed to show variation (or to show uncertainty) between certain admissible values, so that the change in the value from a step 1 to a step 2 can not be predicted exactly and can change if we repeat the realization of the events. Mathematically we then say that, for a time step t and its previous step t - 1 in variable x, the value of x at time step t would depend on the previous value t - 1 plus or minus an uncertain quantity that applies to the current step w_t:

x_{t} = x_{t - 1} + w_{t}

Oftentimes, we use the normal distribution for describing w_t centered at \mu = 0 and with variance \sigma^2 taking smaller or larger values depending on whether we want the process to vary a lot or just a little; the larger the variance, the larger the likely values that w_t can sum or rest from the previous value.

w_t = N(\mu,\sigma^2)

If we repeat a lot of times such stochastic processes we can consider a whole bunch of possible trajectories, that once plotted together, can produce the shape of a pine with tip on the origin.

Using unicode characters we can plot a point with a colored star to the plot area with the points function, and choose colors with finer control with the function RGB that combines values of the colors red, green, and blue, along with a transparency value alpha. We can use this source for picking a color visually and then look for the adequate amounts of red, green, and blue.

# set seed for repeatable results
# initialize the plot with the tip star
plot(x = 0, y = 0, xlim = c(-60, 60), ylim = c(-500, 0),
     xlab = NA, ylab = NA, pch = -9733, cex = 3,
     col = rgb(red = 255/255, green = 191/255, blue = 0/255))
# set initial values
x <- 0
y <- x
# design a data frame for saving the values from each realization
allPoints <- data.frame(x, y, stringsAsFactors = FALSE)
# iterate over realizations and numbers of steps for producing
# 1000 different trajectories, each with 500 steps
for (j in 1:1000) {
    x <- 0
    y <- x
    for (i in 1:500) {
        # x to the left of the arrow is x_t and the one to the right is x_{t-1}
        # plus or minus an amount drawn at random from a normal(0,0.7)
        x <- x + rnorm(1, sd = 0.7)
        # append the value to the series
        y <- c(y, x)
    # plot the line with the values saved before
    lines(x = y, y = -seq_along(y),
          col = rgb(green = 128/255,
                    red = 22/255,
                    blue = 20/255,
                    alpha = 0.3))
    # save the data to the data frame
    allPoints <- rbind(allPoints, data.frame(x = y, y = -seq_along(y)))
# plot the star again
points(x = 0, y = 0, pch = -9733, cex = 3,
       col = rgb(red = 255/255, green = 191/255, blue = 0/255))
# red balls
# red balls, y axis
redY <- seq(from = 0, to = 500, by = 80)[-1]
# sequence for calculating the values of each x component for a given y in red balls
for (i in redY) {
    redX <- c(seq(from = min(allPoints[allPoints$y == -i, "x"]),
                  to = max(allPoints[allPoints$y == -i, "x"]),
                  by = 20),
              max(allPoints[allPoints$y == -i, "x"]))
    points(x = redX, y = rep(x = -i, times = length(redX)),
           pch = 21,  bg = "red", cex = 2)
# yellow balls
# yellow balls, y axis
yellowY <- (seq(from = 0, to = 500, by = 80) + 40)[-length(seq(from = 0,
                                                               to = 500,
                                                               by = 80) + 40)]
# sequence for calculating the values of each x component for a given y in red balls
for (i in yellowY) {
    yellowX <- c(seq(from = min(allPoints[allPoints$y == -i, "x"]),
                     to = max(allPoints[allPoints$y == -i, "x"]),
                     by = 20), max(allPoints[allPoints$y == -i, "x"]))
    points(x = yellowX, y = rep(x = -i, times = length(yellowX)),
           pch = 21, bg = "yellow", cex = 2)


A couple of tricks were needed in order to produce the orientation of trajectories and position of red and yellow balls: explicit transposition of variables where the values of x were flipped to its negative -x and the y as the independent variable and x as the dependent one. Also, the coordinate points for the balls were automated and therefore they may look weird to the right, since I did not found a way to automatically calculate coordinates and variable distance among points so that the initial and final ones were always on the largest or smallest point. It might be possible but I just got tired of thinking about an alternative, in the end, a script that builds the whole thing seems fair enough.

That’s all for now, happy holidays to everybody!


Author: gaballench

Ictiólogo con interés en informática y música, simpatizante de iniciativas open source.

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s