Viral gene compression: Complexity and verification. (English)
Domaratzki, Michael (ed.) et al., Implementation and application of automata. 9th international conference, CIAA 2004, Kingston, Canada, July 22‒24, 2004. Revised selected papers. Berlin: Springer (ISBN 3-540-24318-6/pbk). Lecture Notes in Computer Science 3317, 102-112 (2005).
Summary: The smallest known biological organisms are, by far, the viruses. One of the unique adaptations that many viruses have acquired is the compression of the genes in their genomes. In this paper we study a formalized model of gene compression in viruses. Specifically, we define a set of constraints that describe viral gene compression strategies and investigate the properties of these constraints from the point of view of genomes as languages. We pay special attention to the finite case (representing real viral genomes) and describe a metric for measuring the level of compression in a real viral genome. An efficient algorithm for establishing this metric is given, along with applications to real genomes, including automated classification of viruses and prediction of horizontal gene transfer between host and virus.