Inferring the hosts of coronavirus using dual statistical models based on nucleotide composition
Many coronaviruses are capable of interspecies transmission. Some of them have caused worldwide panic as emerging human pathogens in recent years, e.g., severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). To assess their threat to humans, we inferred the potential hosts of coronaviruses using a dual-model approach based on nineteen parameters computed from the spike genes of coronaviruses. Both the support vector machine (SVM) model and the Mahalanobis distance (MD) discriminant model achieved high accuracy in leave-one-out cross-validation on training data consisting of 730 representative coronaviruses (99.86% and 98.08%, respectively). Predictions for 47 additional coronaviruses agreed with the conclusions or speculations of other researchers. Our approach is implemented as a web server that can be accessed at http://bioinfo.ihb.ac.cn/seq2hosts.
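As a rough illustration of what "parameters computed from nucleotide composition" can look like, the sketch below computes mononucleotide and dinucleotide frequencies for a DNA sequence. It is a minimal sketch under stated assumptions: the abstract does not list the nineteen parameters, so the twenty features here are illustrative only, and the function name `composition_features` and the input string are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def composition_features(seq):
    """Return mono- and dinucleotide frequencies of a DNA sequence as a dict.

    Illustrative feature set only; not the nineteen parameters of the paper.
    """
    seq = seq.upper().replace("U", "T")
    # Mononucleotide counts, ignoring ambiguous symbols such as N.
    mono = Counter(b for b in seq if b in BASES)
    # Overlapping dinucleotide counts, restricted to unambiguous pairs.
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    # Guard against division by zero for empty or fully ambiguous input.
    mono_total = sum(mono.values()) or 1
    di_total = sum(di.values()) or 1
    features = {b: mono[b] / mono_total for b in BASES}
    for pair in product(BASES, repeat=2):
        key = "".join(pair)
        features[key] = di[key] / di_total
    return features


# Arbitrary toy string used only to exercise the function.
feats = composition_features("ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTC")
for name, value in feats.items():
    print(f"{name}\t{value:.3f}")
```

Feature vectors of this kind could then be passed to a classifier such as an SVM or a Mahalanobis distance discriminant; that step is not shown here.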
keywords
severe acute (1373)
syndrome coronavirus (1074)
spike gene (68)
respiratory syndrome (2004)
acute respiratory (1734)
author
Tang, Qin
Song, Yulong
Shi, Mijuan
Cheng, Yingyin
Zhang, Wanting
Xia, Xiao Qin
year
2015
journal
Scientific Reports
issn
20452322
volume
5
number
page
citedbycount
4