Matrix CGH Analyzer

Analyzing Matrix/Array Comparative Genomic Hybridization Data

Requirements

Optional packages (without these packages only reduced functionality is available, e.g. no external R access or no karyogram view):

Download

MatrixCGH-0-52.exe Matrix CGH Analyzer (Version 0.52)

Old Release

MatrixCGH-0-51a.exe Matrix CGH Analyzer (Version 0.51)

Installation Instructions

There are two ways to install Matrix CGH Analyzer:
  1. Fully automatic installation, which includes all required packages
  2. Manual installation of the required packages

Automatic Installation

Download the current installer MatrixCGH-0-52.exe, e.g. to your desktop, start it with a double-click (make sure that you have administrator privileges), and follow the instructions of the setup program. Always use the default settings.

Manual Installation

  1. Log in as a user with administrative privileges.
  2. Install R 2.1.1 by downloading and executing the installer rw2011.exe.
  3. Install the R (D)COM server 1.35 by downloading and executing the installer RSrv135.exe.
  4. Install the Sun Java Runtime Environment (JRE) 1.5 or newer (offline installation recommended).
  5. Install a Perl interpreter such as ActivePerl.
  6. Download the installation program MatrixCGH-0-52.exe, start it with a double-click, and follow the instructions of the setup program. Always use the default settings.
  7. Optional: Log out and log in as a normal user.
  8. Start the Matrix CGH application from the Windows Start menu (Programs/University of Ulm/Matrix CGH Analyzer xxx).
  9. Depending on the security settings of Excel, you may have to confirm a dialog window asking whether macros may be executed for the given file. Hint: The security settings of Excel must allow the execution of macros.

License

Matrix CGH Analyzer and its source code (which will be available soon) are licensed according to:

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

Source code

The source code of Matrix CGH Analyzer will be available upon publication.

MCGH Analyzer 0.52

Features

Documentation

Quick Start

The following steps have to be carried out after the installation of Matrix CGH Analyzer:
  1. Start Matrix CGH Analyzer from the Windows Start menu.
  2. Import MCGH files via the Excel menu MCGH/Import MCGH File(s).
  3. Select processing options in the menu MCGH/Options.

MCGH File Format

Matrix CGH Analyzer is able to import MCGH files. These files have the following format:
[Header]
Option=Value
...
[ColumnName]
"Name" "F635 Median" "F635 Mean" "B635 Median" "B635 Mean" "F532 Median" "F532 Mean" "B532 Median" "B532 Mean" "Flags" "CloneID" "Band" "Accession" "Status" "Start Basepair" "End Basepair" "Control Clone" "Reject Clone"
[Data]
Data entries
...
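
The section layout above can be read with a few lines of code. The following Python sketch is only an illustration, not part of Matrix CGH Analyzer. It assumes that column names and data entries are separated by whitespace (spaces or tabs) and that values containing spaces are enclosed in double quotes, as in the [ColumnName] line above. The header option Experiment and the clone names are made up for the example.

```python
import shlex


def parse_mcgh(text):
    """Parse MCGH text into (header dict, column names, list of row dicts)."""
    header, columns, rows = {}, [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line such as [Header] switches the current section.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Header":
            key, _, value = line.partition("=")
            header[key.strip()] = value.strip()
        elif section == "ColumnName":
            # Assumption: quoted names separated by whitespace.
            columns = shlex.split(line)
        elif section == "Data":
            # Assumption: data entries use the same separation as the names.
            rows.append(dict(zip(columns, shlex.split(line))))
    return header, columns, rows


# Hypothetical miniature file with only four of the columns.
sample = '''[Header]
Experiment=demo
[ColumnName]
"Name" "F635 Median" "F532 Median" "Flags"
[Data]
"clone A" 1200 1100 0
"clone B" 800 1600 0
'''

header, columns, rows = parse_mcgh(sample)
print(header)
print(rows[0])
```

All values are kept as strings because the format description does not state the data types. Note that shlex.split raises a ValueError on unbalanced quotes, so real files with apostrophes in clone names would need a different splitting rule.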

File conversion with the mcghconverter tool

The installation directory of the MCGH Analyzer contains a directory named mcghconverter. It holds several Perl scripts that convert various file formats into the MCGH format recognized by the MCGH Analyzer software.
This package contains several scripts for combining aberration data with mapping information. The root folder contains scripts for special conversion procedures; these scripts can be used on the command line. If you want to use them with drag and drop, you can find the corresponding scripts in the folder dragndropscripts. The main script is mcghconverter.pl. It produces output files valid for the MCGH Analyzer software, and you can specify the input file names. If no options are given to the script, default values are assumed.
Options:
-i interactive mode: data that is not given will be asked for (the default is to use default values)
-h display help
-imt input mapping type, see this document for supported file types
-ims mapping data source (e.g. file name)
-imo options for the specific file type (differs from file type to file type)
-idt input aberration data type, see this document for supported file types
-ids data source (e.g. file name)
-ido options for the specific file type (differs from file type to file type)
-os file for output
-oo options for output
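
The options above can be assembled into a command line programmatically. The following Python sketch only builds the argument list and does not run the converter. It assumes that the script is started with perl from inside the mcghconverter directory and that the order of the options does not matter; neither is stated above. The file names, including the .mcgh extension, are made up, and the type arguments are left out by default because the supported type names are not listed here.

```python
import shlex


def build_converter_command(mapping_source, data_source, output_file,
                            mapping_type=None, data_type=None,
                            script="mcghconverter.pl"):
    """Return the argument list for a non-interactive mcghconverter.pl call."""
    cmd = ["perl", script]
    if mapping_type is not None:
        cmd += ["-imt", mapping_type]   # input mapping type
    cmd += ["-ims", mapping_source]     # mapping data source
    if data_type is not None:
        cmd += ["-idt", data_type]      # input aberration data type
    cmd += ["-ids", data_source]        # aberration data source
    cmd += ["-os", output_file]         # output file
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_converter_command("mapping.txt", "aberrations.txt", "result.mcgh")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point to real files.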
Most of the options serve to combine the correct columns, so the easiest way of merging files is to rename the headers so that they fit the output format.
The output format consists of these columns:
"ID","Name","F635 Median","F635 Mean","B635 Median","B635 Mean", "F532 Median","F532 Mean","B532 Median","B532 Mean","Flags", "CloneID","Band","Accession","Status","Start Basepair","End Basepair"
Supported aberration data files
Supported mapping files
Supported output files

R Integration

For the steps normalization, breakpoint detection, and consensus region finding, R scripts may be used instead of the built-in default operators. A descriptor file makes the parameters of an R script available in the graphical user interface, so that the user needs no R programming skills. Wrapper scripts for the two copy number packages GLAD and DNAcopy are already provided. With such scripts (they just have to be copied into a predefined directory), new algorithms can be integrated directly into the Excel add-in. Creating new scripts and descriptor files requires only minimal programming skills. A wrapper R script for the detection of copy numbers using the (hypothetical) copynumber package could look like this:
# load the R library
library(copynumber)

# get data from the Excel add-in
Chrom <- DATA[,1] # the chromosome number  
Loc <- DATA[,2]   # base-pair location
LogRatio <- DATA[,5] # the ratios

# compute copy numbers and return them to Excel
DATA[,6] <- copy.number(Chrom,Loc,LogRatio)
assuming that the package copynumber provides a function copy.number() accepting the three vector parameters chromosome number, base-pair location, and log ratio. The variable DATA is a matrix which is filled by the Excel add-in with the columns chromosome number, start base pair, stop base pair, a rejected flag, log ratio, and copy number (the return value). Scripts for normalization and consensus region detection can be implemented similarly. For further details see the appendix.

The Options Dialog

Background Intensity Correction
Spot Quality Control
Visual Options
Global Options
Control Clone Selection
Selects the clones used by the control clone normalization procedure of Matrix CGH Analyzer. Only non-rejected clones will be selected as control clones.
Field Selection
The fields which are shown in the Excel table can be selected in this dialog.
Diagram Options
Normalization
Breakpoint Detection
Consensus Region Finding

Analysis Tools

Ideogram Browser

Pattern Recognition Toolbox

The Pattern Recognition Toolbox includes standard evaluation algorithms for clustering and classification. It accesses the R core via the R (D)COM Server 1.35.
Select a region in Excel and choose the desired method from the toolbar. For classification the last column or row (depending on the setting normal versus transposed) must contain a class label.
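
The class label convention can be made concrete with a small example. The following Python sketch is an illustration only and does not reproduce the toolbox code. It assumes that in the normal setting every row is one sample with the class label in the last column, and that in the transposed setting every column is one sample with the class label in the last row; the text above does not state which setting corresponds to which layout. The values and labels are made up.

```python
def split_features_and_labels(region, transposed=False):
    """Split a rectangular selection into feature vectors and class labels."""
    if transposed:
        # Samples are stored in columns: turn them into rows first.
        region = [list(column) for column in zip(*region)]
    features = [row[:-1] for row in region]  # everything except the label
    labels = [row[-1] for row in region]     # last entry of each sample
    return features, labels


# The same two samples in both layouts (hypothetical values).
normal_region = [[0.1, -0.4, "tumor"],
                 [0.0, 0.3, "control"]]
transposed_region = [[0.1, 0.0],
                     [-0.4, 0.3],
                     ["tumor", "control"]]

print(split_features_and_labels(normal_region))
print(split_features_and_labels(transposed_region, transposed=True))
```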

Appendix

[A] R Integration - Wrapper Scripts

For normalization, copy number detection, and consensus region finding, R scripts can be copied into the following directory structure:
/Matrix CGH Analyzer
  /RScripts
    /Normalize
    /Breakpoint
    /Consensus
The R scripts must have the extension .R. Each descriptor file must have the same name as the corresponding R script, but with the extension .Rdsc. The .Rdsc file contains entries of the form
parameter_name default_value
or
parameter_name lower_bound upper_bound default_value
which are then directly shown in the Excel add-in.
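
The two entry forms above can be checked before a script is copied into the RScripts directory. The following Python sketch is an illustration only, not the parser used by the Excel add-in. It assumes that the fields of an entry are separated by whitespace and that values may be enclosed in double quotes. The parameter names alpha and min_width and the script name copynumber.R are made up.

```python
import shlex
from pathlib import Path


def descriptor_for(script_path):
    """Return the descriptor path that belongs to an R script."""
    return Path(script_path).with_suffix(".Rdsc")


def parse_rdsc(text):
    """Parse descriptor entries into a dict keyed by parameter name."""
    params = {}
    for raw in text.splitlines():
        fields = shlex.split(raw)
        if not fields:
            continue  # skip blank lines
        name, values = fields[0], fields[1:]
        if len(values) == 1:
            # parameter_name default_value
            params[name] = {"default": values[0]}
        elif len(values) == 3:
            # parameter_name lower_bound upper_bound default_value
            lower, upper, default = values
            params[name] = {"lower": lower, "upper": upper, "default": default}
        else:
            raise ValueError(f"unexpected descriptor entry: {raw!r}")
    return params


# Hypothetical descriptor content.
sample = "alpha 0.01\nmin_width 2 10 3\n"
params = parse_rdsc(sample)
print(descriptor_for("RScripts/Breakpoint/copynumber.R"))
print(params)
```

Values are kept as strings because the entry forms do not say which data types are allowed.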

i. Normalization Scripts

The normalization script may access the data flow before or after the replica processing. To this end, the descriptor file must contain a line
DataType "normal"
(that is with replica processing) or
DataType "extended"
(without replica processing) such that all single spots are in the data structure. Depending on the presence of a switched experiment the following four cases may arise. The output parameters which can be returned from the scripts are marked in bold.
  1. Extended/Switched experiment present:
    NORMAL             SWITCHED           COMBINED
    "db_chromosome"    "db_chromosome"    "db_chromosome"
    "db_start_bp"      "db_start_bp"      "db_start_bp"
    "db_end_bp"        "db_end_bp"        "db_end_bp"
    "reject"           "reject"           "reject"
    "control"          "control"          "control"
    "cy5_median"       "cy5_median"       "CMean_Ratio"
    "cy5_mean"         "cy5_mean"         "CNorm_Ratio"
    "cy5_median_back"  "cy5_median_back"
    "cy5_mean_back"    "cy5_mean_back"
    "cy3_median"       "cy3_median"
    "cy3_mean"         "cy3_mean"
    "cy3_median_back"  "cy3_median_back"
    "cy3_mean_back"    "cy3_mean_back"
    "flags"            "flags"
    "CNorm_Ratio"      "CNorm_Ratio"      "CNorm_Ratio"
  2. Extended/No switched experiment present:
    NORMAL
    "db_chromosome"
    "db_start_bp"
    "db_end_bp"
    "reject"
    "control"
    "cy5_median"
    "cy5_mean"
    "cy5_median_back"
    "cy5_mean_back"
    "cy3_median"
    "cy3_mean"
    "cy3_median_back"
    "cy3_mean_back"
    "flags"
    "CNorm_Ratio"
  3. Normal/Switched experiment
    NORMAL             SWITCHED           COMBINED
    "db_chromosome"    "db_chromosome"    "db_chromosome"
    "db_start_bp"      "db_start_bp"      "db_start_bp"
    "db_end_bp"        "db_end_bp"        "db_end_bp"
    "reject"           "reject"           "reject"
    "control"          "control"          "control"
    "CMean_Ratio"      "CMean_Ratio"      "CMean_Ratio"
    "CNorm_Ratio"      "CNorm_Ratio"      "CNorm_Ratio"
  4. Normal/No switched experiment
    NORMAL
    "db_chromosome"
    "db_start_bp"
    "db_end_bp"
    "reject"
    "control"
    "CMean_Ratio"
    "CNorm_Ratio"

ii. Breakpoint (Copy Number) Detection Scripts

DATA
"db_chromosome"
"db_start_bp"
"db_end_bp"
"reject"
"CNorm_Ratio"
"copy_number"

iii. Consensus Region Scripts

DATA               CONSENSUS
"db_chromosome"    "db_chromosome"
"db_start_bp"      "db_start_bp"
"db_end_bp"        "db_end_bp"
"reject_0"         "reject"
"copy_number_0"    "copy_number"
"reject_1"
"copy_number_1"
...
...
"reject_n"
"copy_number_n"

Hans A. Kestler
Last modified: 2007-03-19