VennMaster: Area-proportional Euler diagrams for functional GO analysis of microarrays
Supplementary Information
For details on intersecting convex polygons, area calculation of polygons, the definition of the cost function, particle swarm optimization, and the software implementation, see the PDF "Supplementary Information 1".
Details of the simulation results are given below.
Software
The approach was implemented as a platform-independent Java application
requiring JRE 5.0, which is freely available at
java.sun.com.
Quick Guide
Install JRE 5.0 (1.5.0) if it is not available on your system.
Windows: Double click on venn.bat in the installation
directory.
Unix/Linux: Call venn.sh from the command line in the installation
directory (ensure that your java executable is in the search path).
Hint: Starting VennMaster by clicking directly on the venn.jar file
is not recommended, since the memory assignment of one
MB is too low for most Euler arrangements.
The proposed error function evaluates the goodness of the graphical Euler arrangement
by putting different weights on the (contradicting) constraints.
Since an arrangement may become unconnected, an additional error term is proposed
that puts more weight on compact solutions.
To show that the introduced term leads in many cases to better
convergence (a lower original error term E), we computed Euler diagrams
for 10 artificial random data sets and one previously published
gene expression data set containing genes differentially expressed between a
specialized mesenchymal cell type (stellate cells)
and normal skin fibroblasts (filter settings: minimum total: 40; maximum total: 140; max p-value: 0.05).
The data are available as a zip file, data.zip.
For different delta values (delta = weight of the pressure term), the cost functional
E is shown for the two optimization strategies.
The number of optimization steps until the stop criterion is met (maximum number of
constant steps = 50, maximum number of steps = 500) is shown on
the right side.
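The stop criterion can be illustrated with a minimal Python sketch. It assumes that "constant steps" means consecutive steps without an improvement of the best cost. The names run_until_stopped and step are hypothetical and are not taken from the VennMaster source code.

```python
def run_until_stopped(step, initial_cost, max_constant_steps=50, max_steps=500):
    """Run `step` until the cost stalls or the step budget is used up.

    `step` is a hypothetical callable that performs one optimization step
    and returns the cost reached by that step.
    Returns (best_cost, steps_taken).
    """
    best_cost = initial_cost
    constant_steps = 0  # consecutive steps without improvement (assumed meaning)
    steps_taken = 0
    while steps_taken < max_steps and constant_steps < max_constant_steps:
        cost = step()
        steps_taken += 1
        if cost < best_cost:
            best_cost = cost
            constant_steps = 0  # an improvement resets the stall counter
        else:
            constant_steps += 1
    return best_cost, steps_taken
```

With these defaults, a run that never improves stops after 50 steps, and a run that improves at every step stops after 500 steps.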
The following diagrams show min/max (light color), interquartile range
(dark color), and median value (black line) over n=20 runs for each of 20 different
delta parameter settings (x-axis) ranging from 0 to 2000.
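The summary drawn for each delta setting could be computed along the following lines. This is only a sketch: the function name summarize_runs is hypothetical, and the quartile convention ("inclusive" interpolation) is an assumption, since the text does not state which one was used for the diagrams.

```python
import statistics


def summarize_runs(values):
    """Return min, quartiles, median and max of the runs for one delta setting."""
    # Quartile convention is an assumption; other conventions give slightly
    # different q1/q3 values for small samples.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
    }
```

Calling summarize_runs once per delta setting, with the 20 cost values (or step counts) of that setting, yields the five values needed for one x-axis position.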
Evolutionary strategy with self-adapting mutation rates
Particle swarm optimization (PSO)
Results
Stellate cell data set
To evaluate both strategies, we compared the 400 cost values and the 400 numbers of
optimization steps of the evolutionary strategy with those of the swarm optimizer using
an unpaired one-sided Wilcoxon rank-sum test.
For both quantities the p-value was below 2.2e-16.
Furthermore, it can be seen that a moderate delta value of 400
leads to more stable solutions.
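For readers who want to repeat such a comparison on their own runs, the following Python sketch implements an unpaired one-sided rank-sum test. It uses the normal approximation with tie and continuity corrections, which is an assumption; the statistics software used for the reported p-values may compute them differently. The function name rank_sum_test_less is hypothetical, and the direction tested (first sample tends to be smaller) has to be chosen by the caller.

```python
import math


def rank_sum_test_less(x, y):
    """One-sided Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Alternative hypothesis: values in x tend to be smaller than values in y.
    Returns (U statistic of x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1  # pooled[i:j] is one group of tied values
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence either way
    z = (u - mean_u + 0.5) / math.sqrt(var_u)  # continuity correction
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z
    return u, p
```

To test whether one optimizer reaches lower costs than the other, pass its cost values as x and the other optimizer's cost values as y. For small samples an exact test would be preferable to this approximation.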
Random data sets
The results for the 10 random data sets can be downloaded as PDF
files.
For the 10 random data sets, the cost functional and the number of iterations
were significantly lower for the swarms (p=4.567e-10 and p<2.2e-16, respectively).
Hans A. Kestler