Abstract:
Phylogenetic trees are essential for understanding evolution and are usually constructed through multiple sequence alignment, which carries a heavy computational burden and requires sophisticated parameter tuning. Recently, alignment-free methods based on k-mer profiles or common substrings have provided alternative ways to construct phylogenetic trees. However, most of these methods ignore global similarities between sequences or specific valuable features, e.g., patterns that are frequent across the whole dataset. To improve on them, we propose an alignment-free algorithm based on sequential pattern mining, in which each sequence is converted into a binary representation of the sequential patterns found among the sequences. The phylogenetic tree is then constructed by clustering a distance matrix calculated from the pattern vectors. To increase accuracy for highly divergent sequences, we weight patterns and filter out redundant sub-patterns. Results on both simulated and real data demonstrate that our method outperforms other alignment-free methods, especially for large sequence sets with low similarity.
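The pipeline described in the abstract (mine frequent patterns, encode each sequence as a binary pattern vector, compute a distance matrix, cluster it into a tree) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the function names, the fixed pattern length, the gap constraint, the support threshold, the Jaccard distance, and the UPGMA (average-linkage) clustering step. The pattern weighting and redundancy filtering mentioned in the abstract are omitted.

```python
from itertools import combinations, product


def contains(seq, pattern, max_gap=1):
    """True if `pattern` occurs in `seq` in order, skipping at most
    `max_gap` characters between consecutive pattern symbols."""
    def match(pos, idx):
        # `pos` is where pattern[idx - 1] matched; look for pattern[idx] next.
        if idx == len(pattern):
            return True
        for nxt in range(pos + 1, min(len(seq), pos + 2 + max_gap)):
            if seq[nxt] == pattern[idx] and match(nxt, idx + 1):
                return True
        return False
    return any(seq[i] == pattern[0] and match(i, 1) for i in range(len(seq)))


def frequent_patterns(seqs, length=3, min_support=0.5, max_gap=1, alphabet="ACGT"):
    """Brute-force mining: keep every candidate pattern found in at least
    `min_support` of the sequences (hypothetical threshold)."""
    patterns = []
    for symbols in product(alphabet, repeat=length):
        p = "".join(symbols)
        support = sum(contains(s, p, max_gap) for s in seqs) / len(seqs)
        if support >= min_support:
            patterns.append(p)
    return patterns


def pattern_vectors(seqs, patterns, max_gap=1):
    """Binary presence/absence vector of each pattern for each sequence."""
    return [[int(contains(s, p, max_gap)) for p in patterns] for s in seqs]


def jaccard_distance(u, v):
    """Assumed distance; the abstract does not name the measure used."""
    both = sum(a & b for a, b in zip(u, v))
    either = sum(a | b for a, b in zip(u, v))
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(vectors):
    n = len(vectors)
    return [[jaccard_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def upgma(dist, labels):
    """Average-linkage clustering; returns the tree as nested tuples."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (tree, size)
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average distance to the merged cluster.
            d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        del d[(a, b)]
        clusters[next_id] = ((ta, tb), na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    seqs = ["ACGTACGT", "ACGTACGA", "TTGGCCAA", "TTGGCCAT"]  # toy data
    pats = frequent_patterns(seqs)
    tree = upgma(distance_matrix(pattern_vectors(seqs, pats)), ["s1", "s2", "s3", "s4"])
    print(tree)
```

The brute-force enumeration over all candidate patterns is only practical for short patterns and small alphabets; a dedicated sequential pattern mining algorithm would replace `frequent_patterns` on real data.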
Source: Genes
ISSN: 2073-4425
Year: 2019
Issue: 2
Volume: 10
3.759 (JCR@2019)
4.096 (JCR@2020)
ESI Discipline: MOLECULAR BIOLOGY & GENETICS;
ESI HC Threshold:156
JCR Journal Grade:4
CAS Journal Grade:3
Cited Count:
WoS CC Cited Count: 4
SCOPUS Cited Count: 6
ESI Highly Cited Papers on the List: 0