Codeword design and information encoding in DNA ensembles


Encoding of information in DNA, RNA, and other biomolecules is an important area of research in fields such as DNA computing, bioinformatics, and, conceivably, microbiology and genetics. This survey focuses on two fundamental problems for massively parallel processing with DNA molecules: the codeword design problem and the problem of representing abiotic information. The first problem requires designing libraries of DNA sequences so that specific duplexes form during annealing, while undesirable hybridizations are prevented from occurring in the course of a computation in the tube. The second involves a search for efficient and cost-effective methods of representing non-biological information in DNA sequences for storage and retrieval of large amounts of data (terabyte and petabyte scales). Two approaches are treated, namely the thermodynamic and the combinatoric-computational. Both experimental and theoretical results are described. A reference list of major works in the area is given. Finally, some open problems deemed important for their possible impact on the representation and processing of abiotic information are discussed. © 2004 Kluwer Academic Publishers.
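As a rough illustration of the combinatoric flavor of the codeword design problem, the sketch below checks a candidate library against a simple distance constraint. It is not taken from the survey: Hamming distance is used here as an assumed, simplified stand-in for a real hybridization model, and the names `reverse_complement`, `hamming`, and `is_admissible`, as well as the threshold `min_distance`, are hypothetical.

```python
from __future__ import annotations

# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair perfectly with seq."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strands differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_admissible(library: list[str], min_distance: int) -> bool:
    """Assumed constraint (illustrative only): every pair of distinct words,
    and every word against the reverse complement of every word (itself
    included), must differ in at least min_distance positions."""
    for i, u in enumerate(library):
        for j, v in enumerate(library):
            # Distinct codewords should not be too similar to each other.
            if i < j and hamming(u, v) < min_distance:
                return False
            # No codeword should resemble the complement of any codeword,
            # since that would allow an unwanted duplex to form.
            if hamming(u, reverse_complement(v)) < min_distance:
                return False
    return True


if __name__ == "__main__":
    print(is_admissible(["AAAAAAAA", "CCCCAAAA"], 4))
    print(is_admissible(["ACGT"], 1))  # ACGT equals its own reverse complement
```

A thermodynamic treatment would replace the distance test with an estimate of duplex stability; the overall structure of the pairwise check would stay the same under that assumption.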

Publication Title

Natural Computing