Ensemble multiple sequence alignment via advising

Dan DeBlasio, John Kececioglu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

The multiple sequence alignments computed by an aligner for different settings of its parameters, as well as the alignments computed by different aligners using their default settings, can differ markedly in accuracy. Parameter advising is the task of choosing a parameter setting for an aligner to maximize the accuracy of the resulting alignment. We extend parameter advising to aligner advising, which in contrast chooses among a set of aligners to maximize accuracy. In the context of aligner advising, default advising selects from a set of aligners that are using their default settings, while general advising selects both the aligner and its parameter setting. In this paper, we apply aligner advising for the first time, to create a true ensemble aligner. Through cross-validation experiments on benchmark protein sequence alignments, we show that parameter advising boosts an aligner's accuracy beyond its default setting for virtually all of the standard aligners currently used in practice. Moreover, aligner advising with a collection of aligners further improves upon parameter advising with any single aligner, though surprisingly the performance of default advising on testing data is actually superior to general advising due to less overfitting to training data. The new ensemble aligner that results from aligner advising is significantly more accurate than the best single default aligner, especially on hard-to-align sequences. This demonstrates how to construct a more accurate ensemble aligner out of a collection of individual aligners.
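
As a rough illustration of the selection step the abstract describes, the sketch below runs each candidate (an aligner paired with a parameter setting), scores the resulting alignment with an accuracy estimator, and keeps the highest-scoring one. It is a minimal sketch, not the authors' implementation: the names `advise`, `pad_right`, `pad_left`, and `toy_estimator` are hypothetical, the two "aligners" are trivial padding functions, and the estimator (fraction of gap-free, fully conserved columns) is a stand-in for whatever estimator a real advisor would use. Under default advising, each aligner would appear once with its default setting; under general advising, the candidate list would also include alternative parameter settings.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Alignment = List[str]                      # equal-length rows, '-' marks a gap
Params = Dict[str, float]                  # hypothetical parameter setting
Aligner = Callable[[Sequence[str], Params], Alignment]
Candidate = Tuple[str, Aligner, Params]    # (label, aligner, parameter setting)


def pad_right(sequences: Sequence[str], params: Params) -> Alignment:
    """Toy 'aligner' (assumption): left-justify and pad with gaps on the right."""
    width = max(len(s) for s in sequences)
    return [s.ljust(width, "-") for s in sequences]


def pad_left(sequences: Sequence[str], params: Params) -> Alignment:
    """Toy 'aligner' (assumption): right-justify and pad with gaps on the left."""
    width = max(len(s) for s in sequences)
    return [s.rjust(width, "-") for s in sequences]


def toy_estimator(alignment: Alignment) -> float:
    """Stand-in accuracy estimator: fraction of gap-free, identical columns."""
    columns = list(zip(*alignment))
    if not columns:
        return 0.0
    conserved = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return conserved / len(columns)


def advise(
    sequences: Sequence[str],
    candidates: Sequence[Candidate],
    estimator: Callable[[Alignment], float],
) -> Optional[Tuple[float, str, Params, Alignment]]:
    """Return (estimated accuracy, label, params, alignment) of the best candidate."""
    best: Optional[Tuple[float, str, Params, Alignment]] = None
    for label, aligner, params in candidates:
        alignment = aligner(sequences, params)
        score = estimator(alignment)
        if best is None or score > best[0]:
            best = (score, label, params, alignment)
    return best


# Default advising over two toy aligners, each with an empty (default) setting.
candidates: List[Candidate] = [
    ("pad_right", pad_right, {}),
    ("pad_left", pad_left, {}),
]
result = advise(["ACGT", "CGT", "ACGT"], candidates, toy_estimator)
```

Here `result` holds the candidate whose alignment the estimator ranks highest. Because the advisor only ever sees estimated accuracy, the quality of the ensemble depends on how well the estimator tracks true accuracy.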

Original language: English (US)
Title of host publication: BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 452-461
Number of pages: 10
ISBN (Electronic): 9781450338530
State: Published - Sep 9 2015
Event: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015 - Atlanta, United States
Duration: Sep 9 2015 to Sep 12 2015

Publication series

Name: BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Other

Other: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015
Country/Territory: United States
City: Atlanta
Period: 9/9/15 to 9/12/15

Keywords

  • Accuracy estimation
  • Aligner advising
  • Ensemble methods
  • Multiple sequence alignment
  • Parameter advising

ASJC Scopus subject areas

  • Software
  • Health Informatics
  • Computer Science Applications
  • Biomedical Engineering
