On Power-Peak-Aware Scheduling for Large-Scale Shared Clusters
Is part of
IEEE transactions on big data, 2020-06, Vol.6 (2), p.412-426
Place / Publisher
Piscataway: IEEE
Year of publication
2020
Source
IEEE Xplore
Descriptions/Notes
Recent studies have reported that big data analytics clusters, such as Hadoop, can create substantial power peaks, bringing instability and inflexibility issues to the power grid. Substantial power peaks also lead to high penalty charges from electric utility companies, accounting for more than 30 percent of the electricity bill for a cluster operator according to empirical studies. To this end, in this paper, we present a framework that schedules computing jobs in large-scale data analytics clusters to mitigate power peaks. The scheduling model captures important properties of modern distributed data analytics clusters, including bundled resource provisioning and job-to-task decomposition with distributed processing. The scheduling problem is formulated as a nonlinear integer program. Its solution is derived by decomposing it into two classes of sub-problems and solving each class with an exact and efficient solution method. As a direct application, we detail the implementation of our proposed scheduling framework on a Hadoop cluster, and demonstrate its efficacy by extensive trace-driven simulations based on the CloudSim simulator.
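Illustrative note: the abstract does not give the details of the nonlinear integer program or its decomposition, so the following is only a minimal sketch of the general idea of peak-aware scheduling, not the authors' method. It swaps in a simple greedy heuristic: each job is placed at the start slot that keeps the resulting power peak lowest. The function name `schedule_peak_aware`, the job tuple layout (duration, power, release, deadline), and the slotted time model are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical job description: (duration, power, release, deadline),
# all times in whole slots, duration >= 1, power in arbitrary units.
Job = Tuple[int, float, int, int]


def schedule_peak_aware(jobs: List[Job], horizon: int) -> Tuple[List[int], float]:
    """Greedy heuristic (not the paper's exact method): place jobs one by one,
    largest energy first, at the feasible start slot that minimizes the
    power peak over the slots the job would occupy."""
    load = [0.0] * horizon  # aggregate power draw per time slot
    starts: List[Optional[int]] = [None] * len(jobs)

    # Place the most energy-hungry jobs first; they are the hardest to fit.
    order = sorted(range(len(jobs)),
                   key=lambda i: jobs[i][0] * jobs[i][1],
                   reverse=True)

    for i in order:
        duration, power, release, deadline = jobs[i]
        latest = min(deadline, horizon) - duration
        if latest < release:
            raise ValueError(f"job {i} cannot finish before its deadline")

        # Choose the start slot with the lowest resulting local peak,
        # breaking ties in favor of the earliest slot.
        best = min(range(release, latest + 1),
                   key=lambda s: (max(load[s:s + duration]) + power, s))

        for t in range(best, best + duration):
            load[t] += power
        starts[i] = best

    return [s for s in starts if s is not None], max(load)


# Two identical jobs that may run anywhere within four slots.
jobs = [(2, 3.0, 0, 4), (2, 3.0, 0, 4)]
starts, peak = schedule_peak_aware(jobs, horizon=4)
print(starts, peak)  # [0, 2] 3.0
```

Starting both jobs at slot 0 would produce a peak of 6.0; staggering them halves it. A greedy pass like this gives no optimality guarantee, which is one reason an exact formulation such as the one described in the abstract is of interest.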