Dimension Reduction

Having a large collection of data is valuable because it allows inferences and dependencies to be gleaned. The problem is that using every variable in such a collection can lead to overfitting and to the curse of dimensionality. In this post I go over a basic approach to dimension reduction and some reasons why it is so important.

[Read More]

Handling Missing Data

Missing data is a problem that plagues all manner of science, and there are a number of ways in which it can be dealt with. In this post I introduce the general ideas behind missing data and demonstrate a few methods for handling missingness.

[Read More]