Reasons to use R

I use a statistical software package called R. I would highly recommend that anyone who needs to manipulate, analyse and visualise data check the software out.

Here are some reasons to use R over other statistical software:

  1. It is completely free to download and use without restrictions.
  2. It is also open source, which means you can read the source code of all functions and modify it yourself.
  3. It has good documentation for most functions. There are a number of well-written introductory R books, tutorials, reference cards, reference manuals, vignettes and newsletters.
  4. Sensible default values and error messages are available for most functions. This is because the majority of the functions and packages are written by experts in the field who are aware of the common pitfalls that a naive user might fall into.
  5. A large number of packages cover many areas of application. A lot of statistical procedures are already available, so you do not need to waste time reinventing the wheel every time. Plus, the many different ways to store, manipulate, analyse and visualise your data, though a bit overwhelming to new users at first, are intended to make you think about your data.
  6. An active, responsive and vibrant R community, as seen in the helpful R mailing lists. But please do check the posting guide before posting in order to elicit informative responses.
  7. The ability to work in an interactive, integrated development environment, which means you are able to debug errors and visualise results on the fly without having to first compile and then execute your code.
  8. Cross-platform compatibility means that you are able to develop code in one environment and then run it on many different platforms with minimal changes (if any). I usually develop and test code on either Windows or Mac with small datasets before running it on bigger datasets or bigger simulations on large UNIX or Linux clusters.
  9. Various benchmarks show that R is just as fast as or faster than many statistical software packages for various tasks. Here is an example of such a benchmark test, which is outdated but still indicative. The personal experience of many people who have used multiple statistical software packages suggests that R code is much easier to understand (but this depends on how one writes it).
  10. One can use Sweave to automate reports that need to be compiled on a regular or frequent basis.

BioConductor contains a collection of many R packages that specifically deal with the analysis of genomic data (e.g. SAGE, SNP, sequence, microarrays, array CGH, proteomics, biological annotations and ontologies). It is the first choice of tools for many statisticians and leading experts in the field. Thus this is where the software for many newly proposed methods first becomes available.


One response to “Reasons to use R”

  1. Thought I’d drop some info onto this post: at our genetics lab in Dresden, Germany, we do a lot of work with arrayCGH analysis on tumor and other genetic cases. We’ve taken an open-source LAMP tool called “arrayCGHbase”, fixed a ton of bugs and added a bunch of features (basically made it stable and usable), and have extended its use of several R packages, including


    The upshot is that if you want a single tool that gives you access to all these packages (and more as we add them), we think our tool is pretty nice to use. We’re currently scaling it up to handle Agilent 244k chips, and that’s going pretty well so far.

    By the way, it’s completely free and we like to support other labs using it. We include comprehensive but easy-to-follow instructions for getting the underlying tools set up (Apache, PHP, MySQL), and we’re glad to work online with you to get things started. We run on both Mac and Windows.

    Did I mention it’s completely free?

    Drop us a line at acgh_base "at" and we will get you started…

