How to use Cluster and Treeview
May 6, 2009
One of my colleagues was interested in visualizing some data based on CpG data using heatmap approaches adopted by researchers in gene expression microarray. I pointed to him to Cluster and Treeview, one of the the earliest free standalone softwares developed for heatmap visualization in gene expression studies and developed in Mike Eisen’s group. You can apply the same approach to any normally distributed variable instead of CpG or gene expression data.
Here is a quick tutorial on using the data.
1. Download and install Cluster and Treeview from http://rana.lbl.gov/EisenSoftware.htm. For Cluster, you will need to download the zip file and decompress it to find the SETUP.EXE file.
2. Format your input dataset. Here is a simple example. The file needs to be tab-separated with missing values coded as empty cells. (see the File Format Help on the Cluster software for further info).
| UID | NAME | GWEIGHT | asthma1 | asthma2 | asthma3 | healthy1 | healthy2 | healthy3 | healthy4 |
| EWEIGHT | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ||
| CPG1 | AAA | 1 | 0.23 | -1.79 | -1.29 | -1.56 | -0.27 | -0.38 | |
| CPG2 | BBB | 1 | 0.41 | -0.89 | -1.06 | -1.6 | -1.84 | -1.6 | |
| CPG3 | CCC | 1 | 0.61 | -0.07 | -1.29 | -1.29 | -2 | -1.84 | -2.25 |
| CPG4 | DDD | 1 | 0.16 | -0.15 | -0.76 | -1.25 | -1.89 | -1.74 | -1.6 |
| CPG5 | EEE | 1 | 0.03 | 1.39 | -0.84 | -1.64 | -2.84 | -2.47 | -2.4 |
| CPG6 | FFF | 1 | -0.18 | -0.18 | -0.62 | -1.32 | -1.69 | -1.43 | -1.7 |
3. Launch Cluster (try START -> Cluster -> Cluster). Press ‘Load file’. Check the number of rows and columns matches the number of CpG islands and subjects.
4. (optional) In the Cluster software, you can “filter” out potentially uninteresting CpG islands by some criteria (e.g. missingness, variance) if you wish.
5. (optional) If your input file is arranged by asthmatics followed by non-asthmatics, then you should untick the cluster arrays in the “heirarchical clustering” tab.
6. Press the ‘Average Linkage Clustering’ button (or complete or single linkage) at the bottom of “Hierarchical Clustering” tab. This should produce 3 files (including cdt, gtr).
7. Start Treeview (try START -> EisenSoftare -> Treeview). Load the cdt file to see the plot. Click on the dendrograms, CpG islands to navigate and zoom etc.
8. (optional) You might want to change the X, Y pixel sizes (Settings -> Options) to get a bigger picture.
I appears that these softwares are no longer being actively developed anymore but that is fine since they do a limited amount of analysis extremely well.
Alternatives options:
- You can use R to generate similar plots (but not zoomable and requires command line programming) or any other main statistical software
- I have heard good stuff about the dChip software but I have not tried it myself.
- There are also a couple of free webtools where you can upload your data to generate these plots. For example [1], [2]