Prediction of protein secondary structure at high accuracy using a combination of many neural networks. (English)
Guerra, Concettina (ed.) et al., Mathematical methods for protein structure analysis and design. C.I.M.E. summer school, Martina Franca, Italy, July 9‒15, 2000. Advanced lectures. Berlin: Springer (ISBN 3-540-40104-0/pbk). Lecture Notes in Computer Science 2666. Lecture Notes in Bioinformatics, 117-122 (2003).
Summary: A protein secondary structure prediction protocol involving up to 800 neural network predictions has been developed by SBI-AT. An overall accuracy of $80\%$ is obtained for the three-state (helix, strand, coil) DSSP categories. Input to the primary-layer neural networks includes sequence profiles, relative residue position, relative chain length, and amino-acid composition. Secondary structure predictions are made for three consecutive residues simultaneously, a technique which we describe as ‘output expansion’ and which boosts the performance of the second-layer structure-to-structure networks. Independent network predictions arise from 10-fold cross-validated training and testing on 1032 protein sequences at both the primary and second network layers. Network output activities are converted to probabilities. Finally, 800 different predictions are combined using a novel balloting procedure.
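
The final two steps of the summary (activities converted to probabilities, then many predictions combined) can be illustrated with a minimal sketch. The summary does not specify either the conversion or the balloting procedure, so the sketch below swaps in assumptions: activities are normalised to sum to one, and the networks are combined by plain averaging of per-residue probabilities. The names `to_probabilities` and `combine_predictions` are hypothetical, and the numbers are made-up toy values.

```python
# Illustrative sketch only: simple normalisation and averaging stand in for
# the conversion and balloting steps, which the summary does not describe.
STATES = ("H", "E", "C")  # helix, strand, coil


def to_probabilities(activities):
    """Scale one network's three non-negative output activities to sum to 1."""
    total = sum(activities)
    if total <= 0:
        # No signal at all: fall back to a uniform distribution.
        return [1.0 / len(activities)] * len(activities)
    return [a / total for a in activities]


def combine_predictions(per_network_activities):
    """Average per-residue probabilities over all networks.

    per_network_activities[n][i] is the (H, E, C) activity triple that
    network n outputs for residue i. Returns one state letter per residue.
    """
    n_networks = len(per_network_activities)
    n_residues = len(per_network_activities[0])
    predicted = []
    for i in range(n_residues):
        mean = [0.0] * len(STATES)
        for network in per_network_activities:
            probs = to_probabilities(network[i])
            for k, p in enumerate(probs):
                mean[k] += p / n_networks
        # Pick the state with the highest averaged probability.
        predicted.append(STATES[mean.index(max(mean))])
    return "".join(predicted)


# Toy input: two networks, three residues (values invented for illustration).
net_a = [(0.9, 0.1, 0.2), (0.2, 0.7, 0.3), (0.1, 0.2, 0.6)]
net_b = [(0.6, 0.3, 0.3), (0.4, 0.5, 0.1), (0.3, 0.3, 0.2)]
print(combine_predictions([net_a, net_b]))  # prints HEC
```

In the protocol described above the same combination step would run over 800 predictions rather than two; averaging is used here only because it is the simplest way to combine probabilities.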