VennMaster: Area-proportional Euler diagrams for functional GO analysis of microarrays


Supplementary Information

For details on intersecting convex polygons, polygon area calculation, the definition of the cost function, particle swarm optimization, and the software implementation, see the PDF "Supplementary Information 1".

Details of the simulation results are given below.

Software

The approach was implemented as a platform-independent Java application requiring JRE 5.0, which is freely available at java.sun.com.

Quick Guide

  1. Install JRE 5.0 (1.5.0) if it is not available on your system.
  2. Download VennMaster-0.37.3.zip and unpack the file into a directory.
  3. Start VennMaster.
     Hint: It is not recommended to start VennMaster by clicking directly on the venn.jar file, since the memory assignment of one MB is too low for most Euler arrangements.

More details can be found in the VennMaster software documentation.

Cost functional

The proposed error function evaluates the quality of the graphical Euler arrangement by putting different weights on the (contradicting) constraints. Since an arrangement may become disconnected, an additional error term (the pressure term) is proposed that puts more weight on compact solutions. To show that the introduced term leads in many cases to better convergence (a lower original error term E), we computed Euler diagrams for 10 artificial random data sets and one previously published gene expression data set containing genes differentially expressed between a specialized mesenchymal cell type (stellate cells) and normal skin fibroblasts (filter settings: minimum total: 40; maximum total: 140; max p-value: 0.05). The data are available as a zip file, data.zip.

For different delta values (delta = weight of the pressure term), the cost functional E is shown for the two optimization strategies. The number of optimization steps until the stop criterion is met (maximum number of constant steps = 50, maximum number of steps = 500) is shown on the right side. The following diagrams show the min/max (light color), interquartile range (dark color), and median value (black line) over n=20 runs for each of 20 different delta parameter settings (x-axis) ranging from 0 to 2000.
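The stop criterion mentioned above can be illustrated with a minimal sketch. It assumes that a "constant step" is a step in which the best cost found so far does not improve; the source does not define the term further. The function name run_until_stop, its parameters, and the simple accept-if-better search are hypothetical stand-ins and are neither the evolutionary strategy nor the particle swarm optimizer used in VennMaster.

```python
import random


def run_until_stop(cost, propose, x0, max_constant_steps=50, max_steps=500, rng=None):
    """Minimize `cost` until one of the two stop conditions is met.

    Assumption: a step counts as "constant" when it does not improve the
    best cost found so far. The search itself is a plain accept-if-better
    loop, used only to show how the two limits interact.
    """
    rng = rng or random.Random(0)
    best_x, best_cost = x0, cost(x0)
    constant_steps = 0  # consecutive steps without improvement
    steps = 0           # total number of optimization steps

    while steps < max_steps and constant_steps < max_constant_steps:
        candidate = propose(best_x, rng)
        candidate_cost = cost(candidate)
        steps += 1
        if candidate_cost < best_cost:
            best_x, best_cost = candidate, candidate_cost
            constant_steps = 0
        else:
            constant_steps += 1

    return best_x, best_cost, steps


if __name__ == "__main__":
    # Toy one-dimensional problem (not VennMaster's cost functional).
    x, c, n = run_until_stop(
        cost=lambda v: (v - 3.0) ** 2,
        propose=lambda v, rng: v + rng.uniform(-1.0, 1.0),
        x0=0.0,
    )
    print(f"best x = {x:.3f}, cost = {c:.5f}, steps = {n}")
```

The returned step count corresponds to the quantity plotted on the right side of the diagrams: a run that stalls early ends after few steps, while a run that keeps improving is cut off at the maximum number of steps.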

Evolutionary strategy with self-adapting mutation rates

Particle swarm optimization (PSO)

Results

Stellate cell data set

To evaluate both strategies, we compared the 400 cost values and the 400 numbers of optimization steps (20 runs × 20 delta settings) of the evolutionary strategy to those of the swarm optimizer with an unpaired one-sided Wilcoxon rank-sum test. For both quantities the p-value was below 2.2e-16. Furthermore, it can be seen that a moderate delta value of 400 leads to more stable solutions.
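As a rough illustration of such a comparison, the following sketch implements an unpaired one-sided Wilcoxon rank-sum test (equivalent to the Mann-Whitney U test) in plain Python. The function name rank_sum_test_less is hypothetical, and the p-value uses a normal approximation with tie correction and no continuity correction, which is an assumption; the source does not state which implementation was used. The sample values in the usage part are made up and are not the study's data.

```python
import math


def rank_sum_test_less(x, y):
    """One-sided rank-sum test of H1: values in x tend to be smaller than in y.

    Returns (U, p), where U counts the pairs (x_i, y_j) with x_i > y_j
    (ties count one half) and p comes from a normal approximation with
    tie correction (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering the group of each value (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 share ranks i+1..j; give each the average rank.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:
        return u, 1.0  # all values identical: no evidence either way
    z = (u - mean_u) / math.sqrt(var_u)
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z
    return u, p


if __name__ == "__main__":
    # Made-up cost values for two optimizers (illustration only).
    swarm_costs = [0.8, 1.1, 0.9, 1.0, 0.7]
    evo_costs = [1.6, 1.9, 1.4, 2.2, 1.8]
    u, p = rank_sum_test_less(swarm_costs, evo_costs)
    print(f"U = {u}, one-sided p = {p:.4f}")
```

For samples as large as the 400 values per strategy compared here, the normal approximation is usually considered adequate; for very small samples an exact test would be preferable.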

Random data sets

The results for the 10 random data sets can be downloaded as pdf files:
Evolutionary Optimization: random-evo.pdf
Particle Swarm Optimization: random-swarm.pdf

For the 10 random data sets, the cost functional and the number of iterations were significantly lower for the swarm optimizer (p=4.567e-10 and p<2.2e-16, respectively).
Hans A. Kestler
last modified: 2008-02-19