Task Scheduling Strategy for Heterogeneous Spark Clusters
Is part of
Artificial Intelligence in China, 2020, Vol.572, p.131-138
Place / Publisher
Singapore: Springer Singapore Pte. Limited
Year of publication
2020
Source
Alma/SFX Local Collection
Descriptions/Notes
As a primary data processing and computing framework, Spark supports in-memory computing, interactive computing, and querying over very large data sets. It also provides data mining, machine learning, stream computing, and other services. However, its default strategy of allocating resources as if all processors were homogeneous does not adapt to a heterogeneous cluster environment, because it lacks load-based task scheduling. We therefore propose a dynamic load scheduling algorithm for heterogeneous Spark clusters that regularly collects load information from each cluster node. The algorithm sharply reduces the load allocated to nodes that are already heavily loaded and allocates more tasks to idle nodes, thereby speeding up job allocation in Spark. The experimental results show that the proposed algorithm can dramatically improve computation efficiency by dynamically balancing load among the nodes of a heterogeneous cluster.
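The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of load-based task allocation, not the authors' method. All names here (NodeLoad, load_score, allocate_tasks), the equal CPU/memory weights, the 0.8 overload threshold, and the per-task load estimate are assumptions made for illustration; none of them come from the paper or from the Spark API.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    """One periodically collected load sample for a cluster node (hypothetical)."""
    name: str
    cpu_util: float         # CPU utilization in [0, 1]
    mem_util: float         # memory utilization in [0, 1]
    capacity: float = 1.0   # relative compute power; assumed, models heterogeneity


def load_score(node, cpu_weight=0.5, mem_weight=0.5):
    """Combine CPU and memory utilization into one score (weights are assumed)."""
    return cpu_weight * node.cpu_util + mem_weight * node.mem_util


def allocate_tasks(nodes, num_tasks, overload_threshold=0.8, per_task_load=0.05):
    """Greedily assign each task to the node with the lowest projected load.

    Nodes whose projected score is at or above the threshold are skipped,
    unless every node is overloaded, in which case the least loaded one is used.
    """
    projected = {n.name: load_score(n) for n in nodes}
    capacity = {n.name: n.capacity for n in nodes}
    assignment = {n.name: 0 for n in nodes}

    for _ in range(num_tasks):
        candidates = [name for name, s in projected.items() if s < overload_threshold]
        if not candidates:
            candidates = list(projected)
        target = min(candidates, key=lambda name: projected[name])
        assignment[target] += 1
        # A more powerful node absorbs a task with a smaller load increase.
        projected[target] += per_task_load / capacity[target]

    return assignment
```

Calling allocate_tasks with a fresh list of NodeLoad samples at each collection interval would give the "dynamic" behaviour described above: heavily loaded nodes receive few or no new tasks, while idle nodes receive more. A real scheduler would also have to consider data locality, which this sketch ignores.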