Dynamic Resource Provisioning for Sustainable Cloud Computing Systems in the Presence of Correlated Failures

2020 
The interdependence of computing resources in cloud computing systems (CCS) makes them prone to correlated failures, which significantly affect service reliability and energy efficiency. Addressing these two metrics jointly while accounting for correlated failures has remained an open question and is the focus of this work. This paper proposes mechanisms that jointly improve reliability and energy efficiency under correlated failures in CCS. To model failure correlation, statistical cluster analysis techniques are applied to real failure traces. Mathematical models are then built to calculate the reliability and energy consumption of failure-prone CCS, and these models are used to design fault-tolerant, energy-aware resource provisioning mechanisms and policies. To further reduce energy consumption, a correlated-failure-aware VM consolidation policy is also proposed. A simulation-based study of the proposed resource management policies and fault tolerance mechanisms is conducted using real failure traces and a Bag-of-Tasks workload. The results demonstrate that, by exploiting failure correlation, the proposed resource management policies reduce the occurrence of failures affecting tasks by approximately 34% and increase the energy efficiency of the system by approximately 20%, compared with environments where failures are handled independently.
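The abstract does not say which cluster analysis technique is applied to the failure traces. As a purely illustrative stand-in, the sketch below groups failure events that occur close together in time and flags groups spanning more than one node as candidate correlated failures. The `FailureEvent` type, the node names, the trace values, and the 5-second window are all assumptions made for this example, not details from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureEvent:
    node: str    # hypothetical node identifier
    time: float  # seconds since the start of the trace


def group_failures(events, window):
    """Group events whose gap from the previous event is at most `window` seconds."""
    groups = []
    for ev in sorted(events, key=lambda e: e.time):
        if groups and ev.time - groups[-1][-1].time <= window:
            groups[-1].append(ev)   # close enough in time: same group
        else:
            groups.append([ev])     # gap too large: start a new group
    return groups


def correlated_groups(groups):
    """Keep only groups that involve more than one distinct node."""
    return [g for g in groups if len({e.node for e in g}) > 1]


# Made-up trace for illustration only.
trace = [
    FailureEvent("n1", 10.0),
    FailureEvent("n2", 12.0),
    FailureEvent("n3", 15.0),
    FailureEvent("n1", 500.0),
    FailureEvent("n4", 900.0),
    FailureEvent("n5", 903.0),
]

groups = group_failures(trace, window=5.0)
correlated = correlated_groups(groups)
for g in correlated:
    print(sorted(e.node for e in g))
```

A gap-based grouping like this is far simpler than a statistical cluster analysis, but it conveys the idea: nodes that tend to fail together could be treated as one failure domain when placing or consolidating VMs, rather than as independent units.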