Han, L.X., Liew, C.S., van Hemert, J. and Atkinson, M. (2011) A generic parallel processing model for facilitating data mining and integration. Parallel Computing, 37 (3). pp. 157-171. ISSN 0167-8191.
Full text not available from this repository.

Abstract
To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be held in memory, buffered via disks, or carried as inter-computer data-flows. This makes it possible to build arbitrary DAGs that combine pipelining with both data and task parallelism, providing room for performance enhancement. We have applied this approach to a real DMI case in the life sciences and implemented a prototype. To demonstrate the feasibility of the modelled DMI task and to assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental results show that a linear speedup is achieved as the number of distributed computing nodes increases in this case study.
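The abstract describes composing Processing Elements into a DAG linked by data streams, with stages executing as a pipeline. The following minimal sketch illustrates that general idea using plain Python threads and in-memory queues as streams; the ProcessingElement class, the tokenize/to_upper stages, and the toy workload are illustrative assumptions, not the paper's prototype or the OGSA-DAI API.

```python
# Hypothetical sketch of pipeline streaming: each PE runs in its own thread,
# reads items from an input stream (queue), transforms them, and writes
# results to an output stream. Chaining PEs gives a linear pipeline; a full
# DAG would fan streams out to several downstream consumers.
import queue
import threading

SENTINEL = object()  # marks the end of a stream


class ProcessingElement(threading.Thread):
    """A PE connected to one input stream and one output stream."""

    def __init__(self, transform, in_stream, out_stream):
        super().__init__()
        self.transform = transform
        self.in_stream = in_stream
        self.out_stream = out_stream

    def run(self):
        while True:
            item = self.in_stream.get()
            if item is SENTINEL:
                # Propagate end-of-stream downstream and stop.
                self.out_stream.put(SENTINEL)
                break
            for result in self.transform(item):
                self.out_stream.put(result)


def tokenize(line):
    yield from line.split()


def to_upper(token):
    yield token.upper()


if __name__ == "__main__":
    source, middle, sink = queue.Queue(), queue.Queue(), queue.Queue()

    # Two PEs linked by a stream: lines -> tokens -> upper-cased tokens.
    pe1 = ProcessingElement(tokenize, source, middle)
    pe2 = ProcessingElement(to_upper, middle, sink)
    pe1.start()
    pe2.start()

    for line in ["data mining and integration", "parallel pipeline streaming"]:
        source.put(line)
    source.put(SENTINEL)

    while (item := sink.get()) is not SENTINEL:
        print(item)

    pe1.join()
    pe2.join()
```

Because the two stages run concurrently and communicate only through streams, later items can be tokenized while earlier ones are still being transformed, which is the pipelining the model relies on; swapping the in-memory queues for disk-buffered or network-backed streams would not change the PE code.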
| Item Type: | Article |
| --- | --- |
| Funders: | UNSPECIFIED |
| Uncontrolled Keywords: | Pipeline streaming; Parallelism; Data mining and data integration (DMI); Workflow; Life sciences; OGSA-DAI |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Depositing User: | Zanaria Saupi Udin |
| Date Deposited: | 26 Aug 2011 07:45 |
| Last Modified: | 26 Dec 2014 02:22 |
| URI: | http://eprints.um.edu.my/id/eprint/2071 |