My personal interest lies in scheduling and resource allocation in IaaS clouds. The effectiveness of a new scheduling algorithm often becomes visible only over a long period of time, with heavy load on the system. When working with production traces spanning multiple months, empirical evaluation in real time becomes infeasible. The academic community has picked up on this issue and produced a large variety of simulators that allow schedulers to be evaluated faster than real time. For a taxonomy of evaluation methods for large-scale systems, I highly recommend the 2009 survey by Gustedt et al.
Looking specifically at the simulation approach, system evaluation is typically performed from a specific perspective, either that of the application or that of the infrastructure provider, and delivers results tailored accordingly. A subset of these simulators is presented below. Another, complementary summary of existing work by Oujani can be found online as well.
CloudSim. One of the primary frameworks used for simulating clouds in academic research. It is the brainchild of the developers of GridSim and, being highly customizable, has been used in a number of studies. Extensions to CloudSim include CloudAnalyst and NetworkCloudSim, which add, among other things, a GUI and facilities for simulating geo-distributed applications.
GreenCloud. Built on NS2, its primary focus lies on exploring the impact of network layouts on cloud performance and energy consumption.
iCanCloud. Focuses on predicting application performance, energy consumption and cost across different hardware platforms and resource allocation schemes.
MDCSim. A commercial entrant in the area, relying on detailed models of individual hardware components to produce predictions about a cloud's performance at scale. The original publication targets 3-tier web applications instead of generic IaaS cloud infrastructures.
DCSim. Simulates IaaS clouds with a specific focus on dynamic power- and SLA-optimization via VM migration. Its authors use tiered scale-out workloads and evaluate the advantage of VM migration and replication strategies over static provisioning.
GDCSim. Primarily concerned with the thermal aspects of power management in data centers, which it models by integrating existing modeling tools. Specifically investigates the interaction of workload intensity and resource management policies with the heat dissipation and fluid dynamics of different physical data center layouts.
PICS. A recent entrant in the cloud simulation field, with a focus on accurate reproduction of job execution times and cost on public clouds from traces.
EMUSim. Uses emulation of Bag-of-Task applications to extract performance properties and simulate their behavior at larger scale more accurately. An evaluation step ensures that emulation and simulation agree at observable scales.
These simulators are typically based on discrete-event simulation, building compound models from smaller sub-models. This approach makes them highly customizable, but makes calibrating and validating them against real-world measurements significantly harder. Notably, while application-perspective simulators are published with results validating their accuracy against measurements taken from real-world execution, this step appears to be missing for most infrastructure simulators. They are thus mostly applicable to exploratory research and design studies rather than exact performance prediction.
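To make the discrete-event idea concrete, here is a minimal sketch in Python. It is not taken from any of the simulators listed above. The `simulate` function, the single-host admission model and the `(arrival, duration, cores)` request tuples are all assumptions made for illustration. The point it shows is that the simulated clock jumps from one event to the next, so a trace covering months costs only as much as the number of events it contains.

```python
import heapq
from itertools import count


def simulate(requests, host_cores):
    """Replay VM requests against one host (hypothetical toy model).

    requests: iterable of (arrival_time, duration, cores) tuples.
    Returns (accepted, rejected, time_of_last_event).
    """
    tie = count()  # tie-breaker so events at equal times keep insertion order
    events = []
    for arrival, duration, cores in requests:
        heapq.heappush(events, (arrival, next(tie), "arrive", cores, duration))

    free = host_cores
    accepted = rejected = 0
    clock = 0.0

    while events:
        # Jump straight to the next event instead of ticking in real time.
        clock, _, kind, cores, duration = heapq.heappop(events)
        if kind == "arrive":
            if cores <= free:
                free -= cores
                accepted += 1
                # Schedule the matching departure as a future event.
                heapq.heappush(
                    events, (clock + duration, next(tie), "depart", cores, 0.0)
                )
            else:
                rejected += 1  # no queueing in this sketch
        else:  # "depart": the VM releases its cores
            free += cores

    return accepted, rejected, clock


if __name__ == "__main__":
    trace = [(0.0, 10.0, 4), (1.0, 5.0, 4), (2.0, 1.0, 4), (7.0, 2.0, 4)]
    print(simulate(trace, host_cores=8))
```

A real simulator would replace the admission check with the scheduler under test and compose further sub-models (network, power, thermal) as additional event types, which is exactly where the calibration problem described above comes from.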