Using hexbin to better visualize a dense two-dimensional plot
November 9, 2007
Anyone working with high-dimensional data will have tried to plot two variables against each other at some point. The problem is that with a large dataset, even a small proportion of noisy data translates into a large number of points, which can obscure the pattern you are trying to see. Here is an example:
x <- runif(100000)
y <- -10 + 20*x + rnorm(100000, sd=2)
y[1:20000] <- rnorm(20000, sd=10) # 20% noise
plot(x, y, pch="O")
which produces the following
which I don’t think is very informative, as the very strong linear trend is now obscured by the 20% of the data that is noise. A better way to visualize such a plot is to use the hexbin package.
library( hexbin)
plot( hexbin(x, y, xbins=50) )
The hexbin package divides the plotting region into a grid of hexagonal cells, counts the number of points falling in each cell, and uses varying shades of grey to represent that count. You can install the hexbin package in R using the following commands:
source("http://bioconductor.org/biocLite.R")
biocLite("hexbin")
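To make the idea of counting points per cell more concrete, here is a rough base R sketch that bins the same simulated data. It uses a rectangular grid rather than hexagons, so it is only an illustration of the idea and not how hexbin itself works. The seed, the variable names (nbins, xcell, ycell, counts) and the grid resolution are assumptions made for this example.

```r
# Illustration only: count points per cell on a rectangular grid in base R.
# hexbin uses hexagonal cells; this sketch is not its algorithm.
set.seed(1)                        # assumed seed, for reproducibility
n <- 100000
x <- runif(n)
y <- -10 + 20*x + rnorm(n, sd=2)
y[1:20000] <- rnorm(20000, sd=10)  # 20% noise, as in the example above

nbins <- 50                        # assumed grid resolution
xcell <- cut(x, breaks=nbins)      # assign each x to one of nbins equal-width intervals
ycell <- cut(y, breaks=nbins)      # same for y
counts <- table(xcell, ycell)      # nbins x nbins table of point counts

# Draw the counts as a greyscale image: darker cells hold more points
image(seq(min(x), max(x), length.out=nbins),
      seq(min(y), max(y), length.out=nbins),
      matrix(as.vector(counts), nrow=nbins),
      col=grey(seq(1, 0, length.out=64)),
      xlab="x", ylab="y")
```

Because the noisy points are spread thinly over many cells while the points on the line pile up in a few, the linear trend should stand out as a dark band, much as it does in the hexbin plot.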
Murder Princess review
November 3, 2007
What a wonderful little gem! OK, the title is a bit dodgy but do not be put off by that.
This comprises only six half-hour OAV episodes, but it is very fast-paced and there is more depth in the characters and story than in many big-budget blockbusters out there.
Be warned that it is very addictive though and will probably leave you wanting more!
Truth in Numbers: The Wikipedia Story
November 3, 2007
“Imagine a world where everyone had access to the sum of all human knowledge… that’s what we are doing with Wikipedia”
I have been a long-time supporter and fan of Wikipedia and am excited to know that a documentary is being made about it. Truth in Numbers: The Wikipedia Story is expected to be released to international audiences in 2008. In the meantime, here is the trailer.
Superman Theme Song Parody
November 2, 2007
This guy is so hilarious!


