Thursday, October 29, 2009

Don't log your data twice

I am sitting here with some RNA expression microarray data from an Illumina Beadarray experiment. In general the data consist of a long [23k genes] list of spot intensities for each array in the experiment. If you want to compare any two of these lists in R using Bioconductor and the beadarray package you can run a small piece of code like this:

plotMAXY(exprs(BSData),
arrays = 1:2,
labels = targets$sampleID[1:2])

[plotMAXY is a plotting function; exprs extracts expressions from the BeadSummaryData (BSData); arrays tells the function which arrays to compare and labels tells it what to call them]

Then you get a graph like this:

Figure 1: On the lower left you see an XY-plot, where the intensities of all genes on array C1 are plotted against the intensities on array C2. The dashed lines show a two-fold change in expression between arrays. To the upper right is the MA-plot (M is the log2 ratio of the intensities on the two arrays, plotted on the y-axis, and A is the average log2 intensity of a gene, plotted on the x-axis).
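If you want to see what goes into the MA-plot, here is a minimal base-R sketch. The intensities are made-up numbers for four imaginary genes, and this only illustrates the definitions of M and A; it is not the code plotMAXY runs internally.

```r
# Hypothetical raw intensities for four genes on two arrays (made-up numbers)
c1 <- c(100, 200, 400, 800)
c2 <- c(100, 400, 400, 200)

# M: log2 ratio of the two arrays (y-axis of the MA-plot)
M <- log2(c1 / c2)

# A: average log2 intensity of each gene (x-axis of the MA-plot)
A <- (log2(c1) + log2(c2)) / 2

# A two-fold change in either direction corresponds to |M| >= 1
two_fold <- abs(M) >= 1

data.frame(c1, c2, M, A, two_fold)
```

Genes two and four differ by a factor of two or more between the arrays, so they are the ones that would fall on or outside the dashed lines.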

What should strike you is the fact that one of the arrays appears to have a generally higher intensity than the other and that the difference changes with intensity, giving a banana-shaped curve. Some very smart people have shown that most of this is just an artifact of the method and that the average intensity of all genes in comparable samples should be the same and not vary with the expression. So what you do is normalise the data of all arrays to the same quantile distribution.

BSData.log2.quantile = normaliseIllumina(BSData,
method = "quantile",
transform = "log2")

[BSData.log2.quantile is the output variable; normaliseIllumina is the function that normalises data from Illumina arrays using the method given by method and performs the transformation given by transform; BSData is the BeadSummaryData]
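To get a feel for what quantile normalisation does, here is a small base-R sketch of the idea: every array is forced to share the same distribution of values, while each gene keeps its rank within its own array. The function name quantile_normalise and the numbers are my own inventions for illustration, and this is not necessarily how normaliseIllumina implements it (ties, for one, are handled naively here).

```r
# Minimal quantile normalisation of a genes x arrays matrix.
# Illustration only: ties are broken by order of appearance.
quantile_normalise <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank of each gene within its array
  sorted <- apply(x, 2, sort)                         # each array sorted ascending
  target <- rowMeans(sorted)                          # mean value at each rank across arrays
  out <- apply(ranks, 2, function(r) target[r])       # give each gene the target value of its rank
  dimnames(out) <- dimnames(x)
  out
}

# Hypothetical log2 intensities for five genes on two arrays
x <- cbind(C1 = c(5, 2, 3, 4, 9),
           C2 = c(4, 1, 4.5, 2, 6))

xn <- quantile_normalise(x)
xn
```

After this, sort(xn[, "C1"]) and sort(xn[, "C2"]) are identical, which is exactly the "same quantile distribution" property.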

The result should be a graph such as:

Figure 2: plotMAXY result after normalisation, when all is well.

You can see that the average intensities of the two arrays are now equal, so that the cloud lies around the unity line in the XY-plot and the 0-line in the MA-plot. There is no longer any banana shape to the curve. But there are a number of spots outside the two-fold lines, showing differentially expressed genes.

What I got when I plotted the normalised data this time was this plot:

Figure 3: Example of a plotMAXY result that shouldn't exist.

It's like, D..N. It shouldn't look like that. S..T! It's impossible that there should be such a curve. The XY-plot shows one sample to be impossibly much stronger than the other, but only at high intensities. What the F..K are the fold change lines doing at that kind of angle anyway? The MA curve at least looks like it should, but shows no differentially expressed genes at all. B....Y V......E! Even technical replicates aren't that similar.

When I can hear myself above the beating of my heart and can see the screen again through the torrent of cold sweat pouring into my eyes, I look more closely at the XY-plot and see that the scales on the x- and y-axes are different. Actually, no intensities are higher than 4.

What has happened is that the plotMAXY function log2-transforms the data by default before plotting (that's why the 2-fold line is called 1), and I had already done the log2-transformation when I quantile-normalised the data. What I should have done is add the "log = FALSE" argument to the plotMAXY call, like this:

plotMAXY(exprs(BSData.log2.quantile),
arrays = 1:2,
labels = targets$sampleID[1:2],
log = FALSE)

[log = FALSE tells the function plotMAXY not to log2 transform the expression data from BSData.log2.quantile]
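A few lines of base R show why logging twice flattens everything. The intensities are hypothetical, and the looks_logged helper is just a rough rule of thumb of my own (it assumes raw intensities on a 16-bit scale), not something beadarray provides.

```r
# Hypothetical raw intensities with a two-fold difference
raw <- c(a = 16384, b = 32768)

once  <- log2(raw)   # 14 and 15: the two-fold change is a difference of 1
twice <- log2(once)  # about 3.81 and 3.91: the difference shrinks to about 0.1

diff(once)
diff(twice)

# Rough sanity check (my own assumption, not a beadarray feature):
# raw 16-bit intensities go up to 65535, so log2 data cannot exceed 16.
looks_logged <- function(x) max(x, na.rm = TRUE) <= 16

looks_logged(raw)   # FALSE: still on the raw scale
looks_logged(once)  # TRUE: already log2-transformed
```

This also explains the axis scale in Figure 3: log2(log2(65535)) is almost exactly 4, so nothing that has been logged twice can be higher than that.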

Now I get a graph more like Figure 2 and can continue with my analysis without further risk of apoplexy. Or maybe not: data analysis is a forest fraught with formidable foes and fearsome formulas.

I leave you now to venture again into the jungle of statistical tricks.

Wish me luck,


Thursday, October 15, 2009

12th Symposium of vascular Neuroeffector Mechanisms

I just got a heads-up on an interesting symposium in connection with WorldPharma 2010. It is the 12th Symposium of vascular Neuroeffector Mechanisms, put together by Pernille Hansen in Odense. You can find the website at, it will give you a perfect reason for staying another day in Denmark at the very best time of year.

Now I just have to scrape together an abstract. I'm sure I saw some data here somewhere.



Monday, October 05, 2009

Telomeres and Kidneys

Elizabeth H. Blackburn, Carol W. Greider and Jack W. Szostak were announced as the 2009 Nobel laureates in Physiology or Medicine "for the discovery of how chromosomes are protected by telomeres and the enzyme telomerase".

There are 169 articles matching "kidney" AND "telomere" on Medline. So what is this Nobel Prize thing about for the nephrocentric community (except for a wide open field)?

Well, experimentally it is important to realize that rodents don't show telomere attrition. Their cells still stop dividing and go into senescence, actually using some of the same proteins, but their telomeres do not get shorter (Famulski & Halloran 2005).

In humans, on the other hand, there are some interesting findings. Telomere length in circulating leucocytes is correlated with kidney function in patients with chronic heart failure (CHF) (Wong et al 2009). That just says that greater biological age makes the kidneys more vulnerable in CHF or, more narrowly, that greater biological age of the immune system puts the kidneys at risk. So, what about the telomeres in the kidneys?

They do shorten with age, and interestingly they shorten more in the cortex than in the medulla (Melk et al 2000). I guess this can be tied to the higher metabolic activity in the cortex, and to the lower oxygen tension in the medulla. Decreased oxygen tension has been shown to protect against cellular senescence, although in fibroblasts and not in the kidney (Betts et al 2008). The mechanism that is mostly bandied about is the role of reactive oxygen species, which have been shown to play a role in cyclosporin A-induced senescence in renal tubular cells (Jennings et al 2007), as well as in angiotensin II-induced senescence in vascular smooth muscle (Herbert et al 2007). Finally, telomere length has been shown to predict graft survival in transplant patients (Koppelstaetter et al 2008).

Well, that's pretty much it for this post. In short, rodents have telomeres that don't get shorter with ageing. In humans they do, and they can be used to predict outcomes. Reactive oxygen species are the most important pathway for senescence and can probably be used to predict telomere outcomes in humans from experimental models.