The pipeline to run MAD (Mosaic Alteration Detector)


MAD (Mosaic Alteration Detector) is an R package, built on the R-GADA package, that detects mosaic events from SNP data. Its main strength is that it can run the whole process over a large set of samples at once.

To run the analysis, each sample must have its own file containing the following information:

  • Name (of the SNP)
  • Chromosome (where the SNP is placed)
  • Position (of the SNP in the chromosome)
  • The value for the log2 ratio (of the SNP)
  • The value of the B allele frequency (of the SNP)
  • And the given genotype (of the SNP)

All this information must be stored in a tab-separated file:

[carleshf@coruscan analysis/starwars/rawData]$ head padme
Name    Chr     Position        Log.R.Ratio     GType   B.Allele.Freq
S-4DTYN 3       75333236        0.320067        AB      0.503067484662577
S-4DTYJ 2       137804803       -0.372684       BB      0.93558282208589
S-4DTYD 1       79235573        -0.224208       AA      0.00920245398773001
...     ...     ...             ...             ...     ...
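As a quick sanity check before running MAD, a sample file can be read in base R to confirm that the expected columns are present. The following is only a minimal sketch: it writes the three example rows shown above to a temporary file (the real files live in rawData), and the variable names are illustrative, not part of MAD. It also shows where the column indices passed to the set-up step later on come from.

```r
# Write the example rows shown above to a temporary tab-separated file.
sample_file <- tempfile("padme")
writeLines(c(
  "Name\tChr\tPosition\tLog.R.Ratio\tGType\tB.Allele.Freq",
  "S-4DTYN\t3\t75333236\t0.320067\tAB\t0.503067484662577",
  "S-4DTYJ\t2\t137804803\t-0.372684\tBB\t0.93558282208589",
  "S-4DTYD\t1\t79235573\t-0.224208\tAA\t0.00920245398773001"
), sample_file)

# read.delim() assumes a header line and tab-separated fields.
snps <- read.delim(sample_file, stringsAsFactors = FALSE)

# Check that the six expected columns are present.
expected <- c("Name", "Chr", "Position", "Log.R.Ratio", "GType", "B.Allele.Freq")
missing_cols <- setdiff(expected, names(snps))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Positions of the log2 ratio, BAF and genotype columns in this file.
col_index <- setNames(
  match(c("Log.R.Ratio", "B.Allele.Freq", "GType"), names(snps)),
  c("log2ratio", "BAFcol", "GenoCol")
)
print(col_index)
```

For the file layout above, the log2 ratio is the fourth column, the B allele frequency the sixth and the genotype the fifth.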

Moreover, MAD requires all the sample files to be placed in a folder called rawData. For example:

  +-- starwars
        +-- rawData
              +-- padme
              +-- anakin
              +-- qui_gon
              +-- palpatine
              +-- watto
  +-- startrek
        +-- rawData
              +-- james_kirk
              +-- spock
              +-- uhura
              +-- scott
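The layout above can be checked from R before starting the analysis. The sketch below builds a throwaway copy of the starwars folder in a temporary directory and lists its samples. Note that `list_samples` is a hypothetical helper written for this illustration; it is not a function of the MAD package.

```r
# Build a throwaway project folder that mirrors the layout above.
project <- file.path(tempdir(), "starwars")
raw_dir <- file.path(project, "rawData")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
invisible(file.create(
  file.path(raw_dir, c("padme", "anakin", "qui_gon", "palpatine", "watto"))
))

# Hypothetical helper (not part of MAD): list the sample files in rawData,
# failing early if the folder is missing.
list_samples <- function(project_dir) {
  raw <- file.path(project_dir, "rawData")
  if (!dir.exists(raw)) {
    stop("No rawData folder found in ", project_dir)
  }
  list.files(raw)
}

samples <- list_samples(project)
print(samples)
```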

With this structure in place, and all the samples inside the rawData folder, we set the working directory of the R session to the parent folder, in this case starwars.

R> path <- getwd()
R> path
[1] "/home/carleshf/.../analysis/starwars"


From there we can start the mosaicism analysis, which is performed in three steps:

  1. The set-up step.
  2. The segmentation procedure step.
  3. The backward elimination step.


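library( mad )

# 1. Set-up: the column indices refer to the sample files in rawData
object <- setupParGADA.B.deviation(
    folder    = path, 
    NumCols   = 6, 
    log2ratio = 4, 
    BAFcol    = 6, 
    GenoCol   = 5
)

# 2. Segmentation Procedure (function name as given in the MAD vignette)
parSBL(
    object, 
    estim.sigma2 = TRUE, 
    aAlpha       = 0.8
)

# 3. Backward Elimination (function name as given in the MAD vignette)
parBE.B.deviation(
    object, 
    T         = 8, 
    MinSegLen = 100
)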


At this point the analysis is finished. All the candidate events have been detected and stored in the variable called object. MAD allows us to export all this information as a table:

exportSegments2File(
    object, 
    file   = "mosaic_events.txt"
)

The content of this file follows:

[carleshf@coruscan analysis/starwars]$ head mosaic_events.txt
IniProbe  EndProbe  LenProbe  qqBdev  chr  LRR    (s.e.)  Bdev   %HetWT  %Hom  State  sample
66690197  71078462  183       0.95    3    -0.64  0.21    0.188  0       8.2   2      palpatine
17309881  21421319  127       0.38    22   -0.24  0.34    0.032  4.7     51.2  2      watto
143559    15049329  495       0.89    18   -0.64  0.24    0.214  0       10.3  2      anakin

We can see that MAD gives us a lot of information. Perhaps the most important fields are the region where the mosaic event was detected (from IniProbe to EndProbe), the chromosome containing the event (chr), the classification of the event (State) and the sample carrying the event (sample).

The number in the State column encodes the type of detected abnormality as follows:

  1. UPD
  2. Deletion
  3. Duplication
  4. Trisomy
  5. LOH


For more information, see the following:

  • The web page of the package – link
  • The vignette of the package – link
  • The paper where the method (package) was used – link
