DEFMC : Instructions


How to run

DEFMC can be executed as a command-line application.

Usage

>defmc [options] input_file_name

[OPTIONS]

  • -l integer : lookahead depth (default 0)

  • -t : parallel mode

  • -n string : output file name (default : o_result.txt)

  • -f string : char for header (default : NULL). Do not put a space between ’-f’ and the string.

  • -s string : char for separation between categories (default : ,)

  • -h : help

Examples

>defmc inputfile.txt
>defmc -l 10 inputfile.txt
>defmc -l 5 -t inputfile.txt
>defmc -t -n resultfile.txt inputfile.txt
>defmc -t -n resultfile.txt -f# -s / inputfile.txt
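
The invocations above can also be assembled from a script. The following is a minimal sketch, not part of DEFMC itself: the function name build_defmc_command and its parameter names are hypothetical, and it assumes the defmc executable is on your PATH. It uses only the options listed above and attaches the -f value directly to the flag, as that option requires.

```python
from typing import List, Optional


def build_defmc_command(input_file: str,
                        lookahead: int = 0,
                        parallel: bool = False,
                        output_file: Optional[str] = None,
                        header_char: Optional[str] = None,
                        separator: Optional[str] = None,
                        executable: str = "defmc") -> List[str]:
    """Build an argument list for DEFMC (hypothetical helper, not shipped with DEFMC)."""
    cmd = [executable]
    if lookahead:
        # -l integer : lookahead depth (0 is the default, so it is omitted)
        cmd += ["-l", str(lookahead)]
    if parallel:
        # -t : parallel mode
        cmd.append("-t")
    if output_file is not None:
        # -n string : output file name
        cmd += ["-n", output_file]
    if header_char is not None:
        # -f string : no space is allowed between -f and its value
        cmd.append("-f" + header_char)
    if separator is not None:
        # -s string : separator between categories
        cmd += ["-s", separator]
    # The input file name always comes last.
    cmd.append(input_file)
    return cmd


if __name__ == "__main__":
    # Mirrors the last example above; only prints the command.
    print(" ".join(build_defmc_command("inputfile.txt", parallel=True,
                                       output_file="resultfile.txt",
                                       header_char="#", separator="/")))
```

To actually run the tool, the returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library.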



How to compile

DEFMC has been developed for Linux systems.

  1. Update your C++ compiler

  2. Get DEFMC source code and decompress it

  3. Go to DEFMC directory

  4. Run make


Input matrix file

Your input file must be a data matrix in which each sample is written on its own line and the variables are separated by tabs (\t).

Examples

Ex. 1

“x1_2”   “y1_4”
“x1_5”   “y1_1”
“x1_2”   “y1_2”
“x1_1”   “y1_3”
“x1_2”   “y1_4”
“x1_2”   “y1_1”
“x1_3”   “y1_2”
“x1_4”   “y1_5”
“x1_5”   “y1_3”
“x1_5”   “y1_5”
...

Ex. 2

8  11 11
15 11 12
12 15 14
10 9  12
11 10 9 
9  11 12
8  14 11
9  8 10
7  3  9 
11 11 10
...

Ex. 3

A A A G
C C C C
G G G T
T T T T
C A T T
A A T A
C G G G
T C T A
A G T C
G T C G
...
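
If your data lives in a script rather than in a file, an input matrix in the layout described above can be written with a few lines of Python. This is a minimal sketch under stated assumptions: the helper name write_defmc_input is hypothetical, values are written as plain text without added quotation marks, and every sample is required to have the same number of variables.

```python
from typing import Iterable, Sequence


def write_defmc_input(path: str, rows: Iterable[Sequence[object]]) -> None:
    """Write one sample per line, variables separated by tabs (hypothetical helper)."""
    rows = [list(row) for row in rows]
    if not rows:
        raise ValueError("the data matrix is empty")
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        # Every sample must provide a value for every variable.
        if len(row) != width:
            raise ValueError(f"sample {number} has {len(row)} values, expected {width}")
    with open(path, "w", newline="") as handle:
        for row in rows:
            handle.write("\t".join(str(value) for value in row) + "\n")


if __name__ == "__main__":
    # First three samples of Ex. 3 above.
    write_defmc_input("inputfile.txt", [
        ["A", "A", "A", "G"],
        ["C", "C", "C", "C"],
        ["G", "G", "G", "T"],
    ])
```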

Output file

In the output file, each line describes one partition: the categories of each variable, the number of samples counted, the partition size, the probability mass, and the probability density of the partition.

Ex. 1

Var_0 Var_1 Counted samples Partition size Prob. mass Prob. density
“x2_2”, “x2_1”, “x2_4”, “x2_5”, “x2_3” “y1_1”, “y1_2”, “y1_3”, “y1_4”, “y1_5” 0 25 0 0
“x2_2”, “x2_1”, “x2_4”, “x2_5”, “x2_3” “y2_5”, “y2_2”, “y2_4”, “y2_1”, “y2_3” 300 25 0.3 0.012
“x1_5”, “x1_1”, “x1_2”, “x1_3”, “x1_4” “y2_1”, “y2_2”, “y2_3”, “y2_4”, “y2_5” 0 25 0 0
“x1_5”, “x1_1”, “x1_2”, “x1_3”, “x1_4” “y1_4”, “y1_2”, “y1_3”, “y1_1”, “y1_5” 700 25 0.7 0.028
The input data matrix is divided into four partitions.
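
The numeric columns of Ex. 1 appear to be related in a simple way. The sketch below is an inference from the example rows, not a description of DEFMC's internals: it assumes that the partition size is the product of the number of categories per variable, that the probability mass is the counted samples divided by the total number of samples (1000 here, the sum of the Counted samples column), and that the probability density is the mass divided by the partition size. The function name partition_stats is hypothetical.

```python
from math import prod
from typing import Sequence, Tuple


def partition_stats(category_counts: Sequence[int],
                    counted_samples: int,
                    total_samples: int) -> Tuple[int, float, float]:
    """Return (partition size, probability mass, probability density).

    The formulas are inferred from the example output, not taken from DEFMC.
    """
    size = prod(category_counts)            # e.g. 5 categories x 5 categories = 25
    mass = counted_samples / total_samples  # share of all samples in this partition
    density = mass / size                   # mass spread evenly over the partition
    return size, mass, density


if __name__ == "__main__":
    # Second row of Ex. 1: 5 x 5 categories, 300 of 1000 samples.
    print(partition_stats([5, 5], 300, 1000))
```

For the second row of Ex. 1 this gives a size of 25, a mass of 0.3 and a density of 0.012, in agreement with the listed values.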

Ex. 2

Var_0 Var_1 Var_2 Counted samples Partition size Prob. mass Prob. density
14, 15, 16, 3, 4, 5, 6 10, 11, 16, 17, 4, 5, 6, 13, 14, 15, 3, 9 18, 17, 3, 4, 5, 16 0 504 0 0
18, 17, 7 10, 11, 16, 17, 4, 5, 6 18, 17, 3, 4, 5, 16 0 126 0 0
17, 18 13, 14, 15, 3, 9 17, 18, 4, 3, 5 0 50 0 0
7 13, 14, 15 17, 18, 4, 3, 5 0 15 0 0
7 3 17, 18, 4, 5 0 4 0 0
7 3 3 1 1 0.001 0.001
7 9 17, 18, 3, 4 0 4 0 0
7 9 5 1 1 0.001 0.001
18, 17, 7 13, 14, 15, 3, 9 16 3 15 0.003 0.0002
15, 16, 17, 18, 3, 4, 5, 6, 7 8, 7, 12 18, 3, 17 0 81 0 0
...

Ex. 3

Var_0 Var_1 Var_2 Counted samples Partition size Prob. mass Prob. density
N A, C, G, K, L, T A, G, K, T 0 24 0 0
N A, C, G C 0 3 0 0
N K, L, T C 3 3 0.00297619 0.000992064
G, A, T, C A, C, G, K, T K 0 20 0 0
A, C, G L K 0 3 0 0
T L K 1 1 0.000992064 0.000992064
G, A, T, C K, L A, G, C, T 0 32 0 0
G, A, T, C C G 22 4 0.0218254 0.00545635
G, A C A 21 2 0.0208333 0.0104167
C, T C A 44 2 0.0436508 0.0218254
...

Lookahead option

Using the lookahead option can greatly shorten computing time, but the results will be less precise.