DNA to Protein

the algorithm is simple - slice the input into 3-letter triplets (each letter is T, C, A or G, so there are 4 x 4 x 4 = 64 possible triplets), then translate each triplet to an amino acid (one of 20). for example, TTT has a value of 1 and points to the 1st entry in the table of amino acids, which has a value of F (the one-letter code for phenylalanine), while GGG has a value of 64 and points to the last entry, G, which means glycine.
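
here's a minimal sketch of that idea in python. it is not the actual project code: the names BASES, AMINO_ACIDS, codon_number and translate are made up for illustration, and the 64-letter table is the standard genetic code written out in T, C, A, G order (i'm assuming that is the same table the project uses).

```python
# Bases in the order used by the numbering scheme (an assumption: T, C, A, G).
BASES = "TCAG"

# Standard genetic code, one letter per codon, in TCAG order.
# Entry 1 is TTT (F), entry 64 is GGG (G); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_number(codon):
    """Return the 1-based number of a triplet (TTT -> 1, GGG -> 64)."""
    # BASES.index raises ValueError for any letter other than T, C, A, G.
    first, second, third = (BASES.index(base) for base in codon.upper())
    return 16 * first + 4 * second + third + 1


def translate(dna):
    """Slice dna into triplets and look each one up in the table."""
    usable = len(dna) - len(dna) % 3  # ignore 1 or 2 leftover letters
    return "".join(
        AMINO_ACIDS[codon_number(dna[i:i + 3]) - 1]
        for i in range(0, usable, 3)
    )


print(translate("TTTGGG"))  # prints FG
```

the table is just a 64-character string, so the triplet's number minus 1 is the index of its amino acid.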

here's the project: DNA_to_protein

Can you explain this? I don't know much about biology

this is called the "codon wheel", and it's what i based my notation on. if you start from the center and read outwards to the 12 o'clock position, you will see that TTT (in red) translates to F (see below for legend). if you go in a clockwise direction, the next triplet is TTC, which also translates to F. after that, the triplet is TTA, which translates to L, and after that, the triplet is TTG, which also translates to L. this is the numbering scheme i followed, such that TTT, TTC, TTA, TTG are numbers 1, 2, 3, 4, etc., and the last one (just before getting back to the 12 o'clock position), GGG, is number 64.
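
to make the numbering concrete, here's a small python sketch that lists the triplets in that clockwise order. it assumes the wheel is read with the first base in T, C, A, G order, then the second, then the third (the third letter changes fastest). CODONS and NUMBER are hypothetical names, not something from the project.

```python
from itertools import product

# Base order on the wheel, read clockwise (an assumption: T, C, A, G).
BASES = "TCAG"

# product(..., repeat=3) varies the last letter fastest, which matches
# the order TTT, TTC, TTA, TTG, TCT, ... described above.
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Map each triplet to its 1-based number.
NUMBER = {codon: n for n, codon in enumerate(CODONS, start=1)}

print(CODONS[:4])     # prints ['TTT', 'TTC', 'TTA', 'TTG']
print(NUMBER["GGG"])  # prints 64
```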

TCAG stands for the bases - Thymine, Cytosine, Adenine, Guanine

and here's the article that goes with the above.

(just remember that i used the same notation for the bases - namely TCAG - while other documents and tables use another notation, as in the next article).

you should read this article to give you a visual view of the process. it's very fascinating.

(just remember that where the article points to U, i had used T in my program).
i don't know any biology - i was just fascinated by the translation process.

U is uracil, which doesn't exist in dna - rna uses it in place of thymine (T)
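
so if you copy a sequence from a table or article that is written with U, you would swap every U for T before feeding it to a program that expects T, C, A, G. a tiny python sketch (rna_to_dna_letters is a made-up helper name, not part of the project):

```python
def rna_to_dna_letters(sequence):
    """Rewrite a U-based sequence in the T-based notation."""
    # Only the letters change: every U becomes T, everything else is kept.
    return sequence.upper().replace("U", "T")


print(rna_to_dna_letters("AUGUUUUAA"))  # prints ATGTTTTAA
```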


i just learned that protein-coding dna sequences generally start with ATG (translated to M) and end in TAA, TAG or TGA (each of which translates to a *, which means stop), at which point protein synthesis stops.
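
a rough python sketch of that start/stop rule: look for the first ATG, then translate triplet by triplet until a stop shows up. this is a simplification under my own assumptions (it only looks at the first ATG and ignores everything else real biology does), and translate_from_start, BASES and AMINO_ACIDS are hypothetical names, not the project's.

```python
# Bases in the order used by the numbering scheme (an assumption: T, C, A, G).
BASES = "TCAG"

# Standard genetic code in TCAG order; '*' marks the stops TAA, TAG, TGA.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate_from_start(dna):
    """Translate from the first ATG up to (not including) the first stop."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start triplet found, nothing to translate
    protein = []
    # Step through complete triplets only, beginning at the ATG.
    for i in range(start, len(dna) - 2, 3):
        # BASES.index raises ValueError for letters other than T, C, A, G.
        first, second, third = (BASES.index(base) for base in dna[i:i + 3])
        amino_acid = AMINO_ACIDS[16 * first + 4 * second + third]
        if amino_acid == "*":
            break  # stop triplet: synthesis ends here
        protein.append(amino_acid)
    return "".join(protein)


print(translate_from_start("CCATGTTTGGGTAAGGG"))  # prints MFG
```

in the example, the two letters before ATG are skipped, and the GGG after TAA is never translated.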