dc.contributor.author |
Winck, Ana Trindade |
|
dc.contributor.author |
Machado, Karina dos Santos |
|
dc.contributor.author |
Souza, Osmar Norberto de |
|
dc.contributor.author |
Ruiz, Duncan Dubugras Alcoba |
|
dc.date.accessioned |
2015-05-15T18:09:45Z |
|
dc.date.available |
2015-05-15T18:09:45Z |
|
dc.date.issued |
2013 |
|
dc.identifier.citation |
WINCK, Ana Trindade et al. Context-based preprocessing of molecular docking data. BMC Genomics, v. 14, suppl. 6, p. 1-9, 2013. Available at: <http://www.biomedcentral.com/1471-2164/14/S6/S6#abs>. Accessed: 13 May 2015. |
pt_BR |
dc.identifier.uri |
http://repositorio.furg.br/handle/1/4858 |
|
dc.description.abstract |
Background: Data preprocessing is a major step in data mining. In data preprocessing, several known techniques
can be applied, or new ones developed, to improve data quality such that the mining results become more
accurate and intelligible. Bioinformatics is one area with a high demand for generation of comprehensive models
from large datasets. In this article, we propose a context-based data preprocessing approach to mine data from
molecular docking simulation results. The test cases used a fully flexible receptor (FFR) model of the Mycobacterium
tuberculosis InhA enzyme (FFR_InhA) and four different ligands.
Results: We generated an initial set of attributes as well as their respective instances. To improve this initial set, we
applied two selection strategies. The first was based on our context-based approach while the second used the
CFS (Correlation-based Feature Selection) machine learning algorithm. Additionally, we produced an extra dataset
containing features selected by combining our context strategy and the CFS algorithm. To demonstrate the
effectiveness of the proposed method, we evaluated its performance based on various predictive (RMSE, MAE,
Correlation, and Nodes) and context (Precision, Recall and FScore) measures.
Conclusions: Statistical analysis of the results shows that the proposed context-based data preprocessing approach
significantly improves predictive and context measures and outperforms the CFS algorithm. Context-based data
preprocessing improves mining results by producing superior interpretable models, which makes it well-suited for
practical applications in molecular docking simulations using FFR models. |
pt_BR |
dc.language.iso |
eng |
pt_BR |
dc.rights |
open access |
pt_BR |
dc.title |
Context-based preprocessing of molecular docking data |
pt_BR |
dc.type |
article |
pt_BR |
dc.identifier.doi |
10.1186/1471-2164-14-S6-S6 |
pt_BR |