Speeding “Bayesian Power Analysis t-test” up with Snowfall

This is a direct (though minor) answer to Daniel’s blogpost Power Analysis for default Bayesian t-tests, which I found very interesting, as I have been trying to get my head around Bayesian statistics for quite a while now. However, one thing that bugs me, is the time needed for the simulation. On my machine it took around 22 minutes. Depending on the task, 22 minutes for a signle test can be way too long (especially if the tests are done in a business environment where many tests are needed – yesterday) and a simple t-test might be more appealing only because it takes a shorter computing time. Here is my solution to speed-up the code using snowfall‘s load-balancing parallel structures to reduce the time to 8.5 minutes.

The initial code (which you can find in the original post) uses a for loop, snowfall, however, uses a function that we need to export, which is simply called simFun. The function takes the sample size n, the true effect size D, as well as the effect size of the alternative hypothesis rscaleBF as arguments (note, the names are equivalent to Daniel’s, as well as the values of the variables (i.e., n = 50 etc.)).

simFun <- function(n, D, rscaleBF){
  library(BayesFactor)
  x <- rnorm(n = n, mean = 0, sd = 1)
  y <- rnorm(n = n, mean = D, sd = 1)
  
  return(exp((ttestBF(x, y, rscale = rscaleBF))@bayesFactor$bf))
}

To load, initiate, and finally execute the snowfall-code, we can use the following:

# load the library
library(snowfall)
n <- 50; D <- 0.0; nSim <- 100000; rscaleBF <- sqrt(2)/2; threshold <- 3

# for time-keeping
t0 <- Sys.time()

# initiate a parallel cluster with 4 cpus
sfInit(parallel = T, cpus = 4)

# export the function to the clusters
sfExport("simFun", "n", "D", "rscaleBF")

# execute the code
bf <- sfClusterApplyLB(1:nSim, function(i) simFun(n = n, D = D, rscaleBF = rscaleBF))

# stop the clusters
sfStop()

# print the time it took for the calculation
Sys.time() - t0
# Time difference of 8.490639 mins

# and finally the result
sum(bf < (1/threshold))/nSim
#[1]  0.68578

With Daniel’s code I got supportH0 = 0.6904, the snowfall-code almost reproduces the same number 0.68578, but reduces the time substantially.

I hope you found this post interesting/helpful, should you have any questions, you are more then welcome to ask them here, otherwise send me an e-mail.

Lastly, thanks to Daniel, who was able to find some bugs in an earlier version of the code.

Note, the earlier version stated that the code was improved by 60x (without giving a result), the numbers are now corrected.

Advertisements

3 thoughts on “Speeding “Bayesian Power Analysis t-test” up with Snowfall

  1. Thank you for the code example. Since a long time I try to use parallel packages on R to speed up my computational time. However, I am really unsure if this really works and I do not know what I am doing wrong.

    I tried your code with a “new” i7-4790CPU @3.60 GHz, 64bit system and 8 CPUs (16GB ram). However, with 4 CPUs it took 8.316368 mins and using 8 CPUs it took 5.159408 mins. Still better than 22min, but I thought I would easily beat your 1.24 min. example, but I didn´t…Why? 🙂

    please tell me that your hardware is muuuuch better than mine (whatever that means). Do you have any idea, why my machine is so slow?

    Oh…also, the object “bf” was not created. So in the end, there was not even a result.

    cheers

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s