lumi **

Lumi is an ongoing project for Illumina microarray data analysis, including the design of a software package in R/Bioconductor and the discussion of microarray experiment design.

** an open source project at the Bioinformatics Core of the Robert H. Lurie Comprehensive Cancer Center, Northwestern University.


Batch download of the DMAPs

How does it work?

The utility retrieves the DMAP files automatically over the Internet from a file server at Illumina.

What is the benefit?

Avoid manually loading CDs to the drive and copying then one-by-one.

How long does it take?

About 30 seconds to minutes per chip.

What is the DMAP file?

DMAP is a decode file. It is unique for each array on each chip. Without the decode file, we can not figure out which bead corresponds to which gene.

What is the catch?

Illumina says they will keep the file at least 30 days after the chip is shipped.

Where can I download the utility?

How does the CD look like?


Tricks of Experiment Design

Before we start discussing the optimal experiment design, we should first review some basics of the Illumina expression array platform. There are six arrays on the Illumina WG-6 human genome chip (Figure 1). This is very different from the Affymetrix design or the home-made glass-slide arrays! A special term “array-of-arrays” was coined to describe this unique arrangement. We use two different words, “arrays” and “chips”, to indicate this difference.

Array: one of the six arrays on the chip.
Chip: the slide that contains six arrays.

Obviously, the six arrays on the same chip will have more consistent hybridizing conditions than arrays across slides. Thus, it is called a block, in terms of statistical experiment design.

To make a more solid example, let us consider a hypothetical experiment (Figure 2). We want to find the difference between treatment A and treatment B over a time course (t1-- early response; t2-- immediate response; t3-- late response). Suppose we have three biological replicates at each time point to address the biological variations. It will end up with 3 (time points) *2 (treatments) *3 (replicates) = 18 (arrays). Remember there are 6 arrays on each Illumina chip. So, we need 3 chips.

How do we put these 18 arrays on the three chips? Let us compare the following two experiment designs (Figure 3):

Schema A: block design
Schema B: confounded design

We prefer schema A over schema B. Why? Here are two statistical terms frequently used in guiding experiment designs.

BLOCKING -- “Blocking is the arranging of experimental units in groups (blocks) which are similar to one another. For example, an experiment is designed to test a new drug on patients. There are two levels of the treatment, drug, and placebo, administered to male and female patients in a double blind trial. The sex of the patient is a blocking factor accounting for treatment variablility between males and females. This reduces sources of variability and thus leads to greater precision.” -- Wikipedia

CONFOUNDING—“Confounding variable is a "hidden" variable in a statistical or research model that affects the variables in question but is not known or acknowledged, and thus (potentially) distorts the resulting data. For example, ice cream consumption and murder rates are highly correlated. Now, does ice cream incite murder or does murder increase the demand for ice cream? Neither: they are joint effects of a common cause or lurking variable, namely, hot weather. Another look at the sample shows that it failed to account for the time of year, including the fact that both rates rise in the summertime.” -- Wikipedia