How to use Cluster and Treeview

One of my colleagues was interested in visualizing CpG data using the heatmap approaches adopted by researchers in gene expression microarray studies. I pointed him to Cluster and Treeview, among the earliest free standalone programs for heatmap visualization in gene expression studies, developed in Mike Eisen’s group. You can apply the same approach to any normally distributed variable, not just CpG or gene expression data.

Here is a quick tutorial on using the software.

1. Download and install Cluster and Treeview from http://rana.lbl.gov/EisenSoftware.htm. For Cluster, you will need to download the zip file and decompress it to find the SETUP.EXE file.

2. Format your input dataset. Here is a simple example. The file needs to be tab-separated, with missing values coded as empty cells (see the File Format Help in the Cluster software for further information).

UID NAME GWEIGHT asthma1 asthma2 asthma3 healthy1 healthy2 healthy3 healthy4
EWEIGHT 1 1 1 1 1 1 1
CPG1 AAA 1 0.23 -1.79 -1.29 -1.56 -0.27 -0.38
CPG2 BBB 1 0.41 -0.89 -1.06 -1.6 -1.84 -1.6
CPG3 CCC 1 0.61 -0.07 -1.29 -1.29 -2 -1.84 -2.25
CPG4 DDD 1 0.16 -0.15 -0.76 -1.25 -1.89 -1.74 -1.6
CPG5 EEE 1 0.03 1.39 -0.84 -1.64 -2.84 -2.47 -2.4
CPG6 FFF 1 -0.18 -0.18 -0.62 -1.32 -1.69 -1.43 -1.7

3. Launch Cluster (try START -> Cluster -> Cluster). Press ‘Load file’. Check that the number of rows and columns matches the number of CpG islands and subjects.

4. (optional) In the Cluster software, you can “filter” out potentially uninteresting CpG islands by some criteria (e.g. missingness, variance) if you wish.

5. (optional) If your input file is already arranged with asthmatics followed by non-asthmatics, then you should untick the “cluster arrays” option in the “Hierarchical Clustering” tab so that the columns keep that order.

6. Press the ‘Average Linkage Clustering’ button (or complete or single linkage) at the bottom of “Hierarchical Clustering” tab. This should produce 3 files (including cdt, gtr).

7. Start Treeview (try START -> EisenSoftware -> Treeview). Load the cdt file to see the plot. Click on the dendrogram or the CpG islands to navigate and zoom.

8. (optional) You might want to change the X, Y pixel sizes (Settings -> Options) to get a bigger picture.
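
If your data already lives in a script, the input file from step 2 can be generated rather than typed by hand. The following is a minimal Python sketch under a few assumptions: the function names `write_cluster_input` and `table_dimensions` are made up for illustration, every row gets a GWEIGHT of 1, and the NAME and GWEIGHT cells of the EWEIGHT row are left empty, which is my reading of the File Format Help. It writes missing values (`None`) as empty cells and then reads the table back to report the row and column counts you should see in step 3.

```python
import csv
import io


def write_cluster_input(handle, samples, rows):
    """Write a tab-separated Cluster input table to an open text handle.

    samples: list of column (subject) names.
    rows: list of (uid, name, values); a value of None means "missing".
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["UID", "NAME", "GWEIGHT"] + samples)
    # Assumption: the EWEIGHT row leaves its NAME and GWEIGHT cells empty.
    writer.writerow(["EWEIGHT", "", ""] + [1] * len(samples))
    for uid, name, values in rows:
        if len(values) != len(samples):
            raise ValueError(f"{uid}: expected {len(samples)} values, got {len(values)}")
        # Missing measurements become empty cells, not "NA" or 0.
        cells = ["" if v is None else v for v in values]
        writer.writerow([uid, name, 1] + cells)


def table_dimensions(handle):
    """Return (number of data rows, number of sample columns)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    n_samples = len(header) - 3  # skip UID, NAME, GWEIGHT
    n_rows = sum(1 for row in reader if row and row[0] != "EWEIGHT")
    return n_rows, n_samples


samples = ["asthma1", "asthma2", "asthma3",
           "healthy1", "healthy2", "healthy3", "healthy4"]
rows = [
    ("CPG3", "CCC", [0.61, -0.07, -1.29, -1.29, -2, -1.84, -2.25]),
    ("CPG4", "DDD", [0.16, -0.15, -0.76, -1.25, -1.89, -1.74, -1.6]),
]

buffer = io.StringIO()
write_cluster_input(buffer, samples, rows)
buffer.seek(0)
print(table_dimensions(buffer))  # (2, 7)
```

To write a real file instead of an in-memory buffer, pass a handle from `open("input.txt", "w", newline="")` (the file name is just a placeholder).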

It appears that these programs are no longer being actively developed, but that is fine since they do a limited amount of analysis extremely well.

Alternative options:

  1. You can use R, or any other major statistical software, to generate similar plots (but they are not zoomable and require command-line programming).
  2. I have heard good stuff about the dChip software but I have not tried it myself.
  3. There are also a couple of free webtools where you can upload your data to generate these plots. For example [1], [2]
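
For the scripting route in option 1, here is a rough idea of what average linkage clustering does, sketched in plain Python rather than R. Treat it as an illustration under stated assumptions, not as the computation Cluster itself performs: the distance is taken to be 1 minus the Pearson correlation, the function names are my own, and missing values are not handled. The example profiles are the four complete rows from the sample file above.

```python
from math import sqrt
from statistics import mean


def pearson_distance(x, y):
    """1 - Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (that would divide by zero).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1 - cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping an item name to its list of values.
    Returns the merges in order, each as (cluster_a, cluster_b, distance),
    where a cluster is a tuple of item names.
    """
    names = list(profiles)
    # Distances between the original items, computed once.
    dist = {
        frozenset((a, b)): pearson_distance(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        d, i, j = min(
            (linkage(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


profiles = {
    "CPG3": [0.61, -0.07, -1.29, -1.29, -2, -1.84, -2.25],
    "CPG4": [0.16, -0.15, -0.76, -1.25, -1.89, -1.74, -1.6],
    "CPG5": [0.03, 1.39, -0.84, -1.64, -2.84, -2.47, -2.4],
    "CPG6": [-0.18, -0.18, -0.62, -1.32, -1.69, -1.43, -1.7],
}
for left, right, distance in average_linkage(profiles):
    print(left, right, round(distance, 3))
```

The order of the merges is what the dendrogram draws: the earliest merges join the most similar CpG islands, and the merge distance sets the height of each branch.
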