People speculate there may have been a wider variety of DNA coding schemes in the past, but natural selection, plus perhaps some trade-off between reaction energetics and complexity, settled on the current system.
There was probably a simpler two-nucleotide encoding before the current three-nucleotide one. About half of the amino acids only use the first two nucleotides and ignore the third.
That seems unlikely, because shifting your recognition domain count from 2 to 3 means you basically lose all the evolved information from before and have to rely on chance "correct encodings" everywhere.
The idea is that the initial tRNA was not specific enough: it only cared about the first two letters of each codon and ignored the third. So for example Proline was determined by the first two letters CC? and was associated with the four codons CCU, CCC, CCA and CCG. That is actually still the current mapping.
Other blocks of four codons were split for some reason. We can imagine that originally Isoleucine was determined by AU?, so initially AUU, AUC, AUA and AUG all encoded Isoleucine, but now only the first three encode Isoleucine and the last one encodes Methionine instead.
Later the wildcard letter was made important too, with an almost backward-compatible code, so in most cases it still doesn't matter, but in a few cases it does.
[Note: the official letter for the wildcard is "N" rather than "?"]
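To make that concrete, here is a rough Python sketch. The "ancestral" two-letter table is purely hypothetical, invented for illustration; only the modern assignments (CCN -> Pro, AUU/AUC/AUA -> Ile, AUG -> Met) are real standard-code values.

    # Hypothetical coarse code that keys only on the first two codon letters.
    ancestral_two_letter = {"CC": "Pro", "AU": "Ile"}

    # Actual standard-code assignments for the same two codon boxes.
    modern = {
        "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",  # whole CCN box -> Pro
        "AUU": "Ile", "AUC": "Ile", "AUA": "Ile", "AUG": "Met",  # AUN box split: AUG -> Met
    }

    for codon in ["CCU", "CCG", "AUU", "AUG"]:
        old = ancestral_two_letter[codon[:2]]   # ignores the third ("N") position
        new = modern[codon]
        print(codon, old, "->", new,
              "(third letter matters)" if old != new else "(backward compatible)")

Only AUG comes out different, which is the "almost backward compatible" part: the old two-letter reading still gives the right answer for most codons.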
That's a great explanation! To add a cool point, the wobble position is frequently modified by highly specific enzymes to make it matter more. It's like some random protein mutated to do this modification and all of a sudden the organism got more RAM, increasing its fitness.
You start with a two-letter code, then something evolves that puts an (initially) rare third letter at a few locations on the tape. All the old "gear" that reads the two-letter code can still read most of the tape.
I find that hard to imagine. The three-nucleotide spacing of codons is structurally important in translation from mRNA to protein via the ribosome, and it is difficult to picture a ribosome that could switch arbitrarily between codons of two and three nucleotides, unless I am misunderstanding your comment.
A translational reading frame consists of non-overlapping codons of three nucleotides. If one nucleotide is skipped, the entire downstream message is thus garbled. So how would the translational machinery operate if each codon arbitrarily consisted of two or three nucleotides?
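For what it's worth, the frameshift point is easy to see with a toy example (the sequence below is made up for illustration):

    def codons(mrna):
        # Split an mRNA string into non-overlapping 3-nucleotide codons.
        return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

    mrna = "AUGCCUAUUGAACGUUAA"       # made-up message
    shifted = mrna[:4] + mrna[5:]     # skip a single nucleotide after the first codon

    print(codons(mrna))     # ['AUG', 'CCU', 'AUU', 'GAA', 'CGU', 'UAA']
    print(codons(shifted))  # ['AUG', 'CUA', 'UUG', 'AAC', 'GUU']  -- every downstream codon changes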
I see what you are getting at, but I think what the other comments are saying is that for the third nucleotide position of a given codon, it does not matter which nucleotide it is. The amino acid to be used would only depend on the first two nucleotides, while the third nucleotide can be any of A, U, C, or G.
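You can check the "about half" figure against the standard genetic code (NCBI translation table 1), written here as the usual 64-letter string with the third position varying fastest; a quick Python sketch:

    bases = "UCAG"
    aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    code = {a + b + c: aas[16 * i + 4 * j + k]
            for i, a in enumerate(bases)
            for j, b in enumerate(bases)
            for k, c in enumerate(bases)}

    # A "box" is the four codons sharing the same first two letters (e.g. CCN).
    # Count the boxes where the third position makes no difference at all.
    fourfold = [p + q + "N" for p in bases for q in bases
                if len({code[p + q + n] for n in bases}) == 1]

    print(len(fourfold), "of 16 boxes ignore the third position:", fourfold)
    # 8 of 16: UCN (Ser), CUN (Leu), CCN (Pro), CGN (Arg),
    #          ACN (Thr), GUN (Val), GCN (Ala), GGN (Gly)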
>> About half of the amino acids only use the first two nucleotides and ignore the third.
I've often thought some of that redundancy in the code could be a feature. Important (more sensitive) sequences could evolve toward an encoding that is more robust against mutations, while less important things could have a more brittle encoding. This seems hard to prove though.
It also allows a particular triplet to have more neighbors, meaning you can go from one amino acid to more options without going through intermediates.
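Roughly what that looks like in the standard code (NCBI table 1): an amino acid with many synonymous codons, like Leucine with 6, can reach more other amino acids by a single point mutation than one with a single codon, like Methionine. A quick sketch:

    bases = "UCAG"
    aas = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    code = {a + b + c: aas[16 * i + 4 * j + k]
            for i, a in enumerate(bases)
            for j, b in enumerate(bases)
            for k, c in enumerate(bases)}

    def reachable(aa):
        # Amino acids (and stop, "*") reachable by one nucleotide substitution
        # from any codon that encodes aa.
        out = set()
        for codon, translated in code.items():
            if translated != aa:
                continue
            for pos in range(3):
                for b in bases:
                    if b != codon[pos]:
                        out.add(code[codon[:pos] + b + codon[pos + 1:]])
        return out - {aa}

    print(sorted(reachable("L")))  # 11 options: ['*', 'F', 'H', 'I', 'M', 'P', 'Q', 'R', 'S', 'V', 'W']
    print(sorted(reachable("M")))  # 6 options:  ['I', 'K', 'L', 'R', 'T', 'V']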