I recently had the opportunity to listen to some great minds in the area of high-frequency data and trading. While I won’t go into the details of what was said, I want to illustrate two points that were brought up: the importance of proper out-of-sample testing and of properly lagged variables in potential trading algorithms or arbitrage models. To a certain degree, this topic generalizes to all forecasts.
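As a rough illustration of both points, the following base-R sketch simulates a return series, lags the predictor by one period, and evaluates a linear model on data it was not fitted on. The data-generating process, the variable names (`signal`, `ret`, `signal_lag1`), and the 70/30 split are assumptions made for this example only.

```r
set.seed(42)

n <- 500
signal <- rnorm(n)

# Assumption: today's return depends on yesterday's signal plus noise.
ret <- c(NA, 0.3 * signal[-n]) + rnorm(n, sd = 1)

# Lag the predictor so the model only sees information known before the return.
dat <- data.frame(ret = ret, signal_lag1 = c(NA, signal[-n]))
dat <- dat[-1, ]  # the first row has no lagged value

# Chronological split: fit on the first 70%, evaluate on the remainder.
split_at <- floor(0.7 * nrow(dat))
train <- dat[seq_len(split_at), ]
test  <- dat[(split_at + 1):nrow(dat), ]

fit  <- lm(ret ~ signal_lag1, data = train)
pred <- predict(fit, newdata = test)

# Compare the in-sample error with the error on unseen data.
in_sample_rmse  <- sqrt(mean(residuals(fit)^2))
out_sample_rmse <- sqrt(mean((test$ret - pred)^2))
```

Regressing `ret` on the unlagged `signal` would use information that is not yet available at trading time, and judging the model by `in_sample_rmse` alone would overstate how well it forecasts.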

# Author: David Zimmermann

# A Gentle Introduction to Finance using R: Efficient Frontier and CAPM – Part 1

The following entry explains a basic principle of finance, the so-called efficient frontier, and thus serves as a gentle introduction to one area of finance, “portfolio theory”, using R. A second part will then concentrate on the Capital Asset Pricing Model (CAPM) and its assumptions, implications and drawbacks.
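To make the idea concrete, here is a minimal base-R sketch of the frontier for two assets. The expected returns, volatilities, and correlation are hypothetical numbers chosen for illustration, and short selling is ruled out.

```r
# Hypothetical inputs: expected returns, volatilities and correlation.
mu    <- c(a = 0.05, b = 0.10)
sigma <- c(a = 0.10, b = 0.20)
rho   <- 0.2

# Covariance matrix of the two assets.
cov_mat <- diag(sigma) %*% matrix(c(1, rho, rho, 1), 2) %*% diag(sigma)

# Weight of asset a from 0 to 1; asset b receives the remainder.
w_a     <- seq(0, 1, by = 0.01)
weights <- cbind(w_a, 1 - w_a)

# Expected return and standard deviation of every portfolio on the grid.
port_return <- as.numeric(weights %*% mu)
port_sd     <- sqrt(rowSums((weights %*% cov_mat) * weights))

# Minimum-variance portfolio; the efficient frontier consists of the
# portfolios whose return is at least as high as this one's.
i_min     <- which.min(port_sd)
efficient <- port_return >= port_return[i_min]
```

Plotting `port_sd[efficient]` against `port_return[efficient]` traces the efficient part of the curve. With these inputs the minimum-variance portfolio has a lower standard deviation than either asset on its own, which is the diversification effect.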

# Speeding “Bayesian Power Analysis t-test” up with Snowfall

This is a direct (though minor) answer to Daniel’s blogpost Power Analysis for default Bayesian t-tests, which I found very interesting, as I have been trying to get my head around Bayesian statistics for quite a while now. However, one thing that bugs me is the time needed for the simulation. On my machine it took around 22 minutes. Depending on the task, 22 minutes for a single test can be way too long (especially if the tests are done in a business environment where many tests are needed – yesterday), and a simple t-test might be more appealing only because it takes less computing time. Here is my solution to speed up the code using *snowfall*’s load-balancing parallel structures to reduce the time to 8.5 minutes.
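The general pattern looks roughly like the sketch below. It is not the code from the post: a classical t-test stands in for the Bayesian one, and the helper `sim_power`, the sample sizes, and the two CPUs are assumptions for illustration. If *snowfall* is not installed, the sketch falls back to a sequential `lapply`.

```r
# Simulated power of a classical two-sample t-test (a stand-in for the
# slower Bayesian simulation).
sim_power <- function(n, effect = 0.5, reps = 200) {
  p <- replicate(reps, t.test(rnorm(n, mean = effect), rnorm(n))$p.value)
  mean(p < 0.05)
}

sample_sizes <- c(20, 40, 80, 160)

if (suppressWarnings(require(snowfall, quietly = TRUE))) {
  sfInit(parallel = TRUE, cpus = 2)                # start two workers
  # Load balancing: a worker receives the next task as soon as it is idle.
  power <- sfClusterApplyLB(sample_sizes, sim_power)
  sfStop()                                         # shut the workers down
} else {
  power <- lapply(sample_sizes, sim_power)         # sequential fallback
}

power <- unlist(power)
```

Load balancing pays off when the individual tasks take different amounts of time, as they do here, where larger sample sizes take longer to simulate.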

# Simulating backtests of stock returns using Monte-Carlo and snowfall in parallel

You could say that the following post is an answer/comment/addition to Quintuitive, though I would rather consider it a small introduction to parallel computing with *snowfall*, using the thoughts of Quintuitive as an example.

# Getting that X with the Glog function and Lambert’s W

Facing a simple, yet frustrating formula like this

and the task to solve it for x left me googling around for hours until I found salvation in Wolfram Alpha, Wikipedia, and a nice blog post with R-syntax to solve a similar equation.

# Getting started with PostgreSQL in R

When dealing with large datasets that potentially exceed the memory of your machine, it is nice to have an alternative, such as your own server with an SQL/PostgreSQL database on it, where you can query the data in smaller, digestible chunks. For example, I was recently facing a financial dataset of 5 GB. Although 5 GB fits into my RAM, the data uses a lot of resources. One solution is to use an SQL-based database, where I can query the data in smaller chunks, leaving resources for the computation.

# Agent Based Modelling with data.table OR how to model urban migration with R

## Introduction

Recently I found a good introduction to the Schelling-Segregation Model and to Agent Based Modelling (ABM) for Python (Binpress Article by Adil). The model follows an ABM approach to simulate how urban segregation can be explained.

# Using rvest and dplyr to look at aviation incidents

For a project, I recently faced the issue of getting a database of all aviation incidents. As I really wanted to try Hadley’s new *rvest* package, I thought I would give it a try and share the code with you.