23-2 Algorithm Project_2
Implement an algorithm for multiple sequence alignment of k DNA sequences using dynamic programming.
- four-letter alphabet {Adenine (A), Thymine (T), Guanine (G), Cytosine (C)}
- measure the similarity of genetic sequences by the number of exactly matching letters
- align the k sequences; gaps may be inserted into any of the sequences
- write a C program that finds the best sequence alignment of the k DNA sequences in the input file named ‘hw2_input.txt’ and prints the alignment result to the output file named ‘hw2_output.txt’
- first part of the input file: the number (k) of DNA sequences to be aligned
- second part : the k DNA sequences to be aligned (each sequence appears on a separate line of text)
- Each part is separated from the next part by a character $
- 2 ≤ k ≤ 5, and 1 ≤ n ≤ 120 where n is the maximum length of each DNA sequence
- the output file contains the sequence alignment result, with marks indicating matched letters
- In the last line, mark “*” under each column that contains the same letter in all sequences
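The dynamic-programming idea can be sketched for the simplest case, k = 2. This is only an illustration under assumptions the assignment does not fix: a matching column scores 1, while mismatches and gaps score 0, so the table simply maximizes the number of exactly matching columns. The function name `align_pair` and the buffer sizes are hypothetical, and the sketch does not cover k > 2.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 120  /* maximum sequence length, from the stated constraints */

/* Aligns a and b so that the number of exactly matching columns is maximal.
 * Assumed scoring (not fixed by the assignment): match = 1, mismatch = 0, gap = 0.
 * out_a and out_b must each hold at least 2 * MAX_N + 1 chars.
 * Returns the number of matching columns. */
int align_pair(const char *a, const char *b, char *out_a, char *out_b)
{
    static int dp[MAX_N + 1][MAX_N + 1];
    int n = (int)strlen(a), m = (int)strlen(b);
    assert(n <= MAX_N && m <= MAX_N);

    /* dp[i][j] = best match count for the prefixes a[0..i) and b[0..j) */
    for (int i = 0; i <= n; i++) dp[i][0] = 0;
    for (int j = 0; j <= m; j++) dp[0][j] = 0;
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            int best = dp[i - 1][j - 1] + (a[i - 1] == b[j - 1]);
            if (dp[i - 1][j] > best) best = dp[i - 1][j];   /* gap in b */
            if (dp[i][j - 1] > best) best = dp[i][j - 1];   /* gap in a */
            dp[i][j] = best;
        }
    }

    /* Trace back from (n, m); the columns come out in reverse order. */
    char ra[2 * MAX_N + 1], rb[2 * MAX_N + 1];
    int len = 0, i = n, j = m;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 &&
            dp[i][j] == dp[i - 1][j - 1] + (a[i - 1] == b[j - 1])) {
            ra[len] = a[--i]; rb[len] = b[--j];
        } else if (i > 0 && dp[i][j] == dp[i - 1][j]) {
            ra[len] = a[--i]; rb[len] = '-';
        } else {
            ra[len] = '-';    rb[len] = b[--j];
        }
        len++;
    }

    /* Reverse into the output buffers. */
    for (int t = 0; t < len; t++) {
        out_a[t] = ra[len - 1 - t];
        out_b[t] = rb[len - 1 - t];
    }
    out_a[len] = out_b[len] = '\0';
    return dp[n][m];
}
```

The table has (n + 1) × (m + 1) cells, so this pairwise case runs in O(nm) time. A direct generalization to k sequences would need one table dimension per sequence, which is worth keeping in mind for the upper limits k = 5 and n = 120.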
[Input file: hw2_input.txt]
3
$
ATTGCCATT
ATGGCCATT
ATCCAAT
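
A possible way to read this input layout is sketched below. It assumes the three parts are whitespace-separated tokens, exactly as in the sample (k, then `$`, then one sequence per line); the function name `read_sequences` and the constants `MAX_K` and `MAX_N` are illustrative, not part of the assignment.

```c
#include <stdio.h>
#include <string.h>

#define MAX_K 5    /* upper bound on k, from the stated constraints */
#define MAX_N 120  /* upper bound on sequence length */

/* Reads k, the '$' separator, and then k sequences from fp.
 * Returns k on success, or -1 if the input does not match the
 * assumed layout or violates the stated limits. */
int read_sequences(FILE *fp, char seqs[MAX_K][MAX_N + 1])
{
    int k;
    char sep[2];

    if (fscanf(fp, "%d", &k) != 1 || k < 2 || k > MAX_K) return -1;
    if (fscanf(fp, "%1s", sep) != 1 || sep[0] != '$') return -1;

    for (int i = 0; i < k; i++) {
        /* The width 120 must be kept in sync with MAX_N. */
        if (fscanf(fp, "%120s", seqs[i]) != 1) return -1;
        /* Accept only the four-letter alphabet. */
        if (strspn(seqs[i], "ATGC") != strlen(seqs[i])) return -1;
    }
    return k;
}
```

A caller would open `hw2_input.txt` with `fopen`, pass the stream to this function, and treat a return value of -1 as malformed input.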
[Output file: hw2_output.txt]
ATTGCCA-TT
ATGGCCA-TT
AT--CCAAT-
**  *** *
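
The marker line can be derived from the aligned rows column by column. The sketch below assumes all rows already have equal length and that a column containing a gap never counts as identical; the name `mark_columns` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* rows:  k aligned sequences of equal length ('-' marks a gap).
 * marks: output buffer with room for strlen(rows[0]) + 1 chars.
 * Writes '*' under every column whose letter is the same in all rows
 * and ' ' elsewhere. Returns the number of '*' columns. */
int mark_columns(const char *rows[], int k, char *marks)
{
    assert(k >= 1);
    size_t len = strlen(rows[0]);
    int count = 0;

    for (size_t c = 0; c < len; c++) {
        /* A gap in the first row already rules the column out. */
        int same = rows[0][c] != '-';
        for (int r = 1; r < k && same; r++)
            same = rows[r][c] == rows[0][c];
        marks[c] = same ? '*' : ' ';
        count += same;
    }
    marks[len] = '\0';
    return count;
}
```

For the three aligned rows of the sample output, this marks columns 1, 2, 5, 6, 7, and 9 and returns 6. The marker string it produces keeps a trailing space for the last, unmarked column.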
- the number of columns with identical letters across all sequences, as returned by your submitted program
- the actual running time
- a well-written document explaining your source code and the performance analysis of your algorithm
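
One way to obtain a running-time figure for the report is the standard `clock()` function, sketched below. Note that `clock()` reports processor time rather than wall-clock time, and the assignment does not say which of the two is meant. The helper names `elapsed_seconds` and `time_call` are hypothetical, and `fn` stands in for whatever routine performs the alignment.

```c
#include <time.h>

/* Converts two clock() readings into elapsed CPU seconds. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Runs fn once and returns the CPU time it consumed, in seconds.
 * fn is a placeholder for the alignment routine being measured. */
double time_call(void (*fn)(void))
{
    clock_t start = clock();
    fn();
    return elapsed_seconds(start, clock());
}
```

For short inputs a single run may fall below the clock resolution, so repeating the call several times and averaging can give a steadier number.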