Hedge Computing

Figure 1: Mapping the Diversification Strategies of Major Cloud Providers

Note: Technically, this article is not about hedging but about diversification as a strategy for reducing risk in cloud computing.

[SEATTLE 19 NOVEMBER 2017] Separation of physical infrastructure for computing – i.e. data centers – can be combined with cloud systems to reduce the risk of data loss and unavailability. Because the largest cloud providers operate at a global scale that is large relative to the efficient scale of a single cloud data center, it is quite possible to achieve significant risk reduction without sacrificing efficiency.

Even setting aside the flexibility and cost advantages, diversification and its associated benefits, namely higher availability and fewer problems with disaster recovery, make systems with distributed capacity very attractive.

A straightforward lesson from finance

It is well established, both theoretically and empirically, that diversification is a feasible and efficient method to reduce financial risk. The beauty of diversification in financial markets is that risk can be managed without reducing the expected return. Two assets with uncorrelated risks and the same expected return can be combined in a portfolio that has substantially less variance than either asset held alone, while the expected return is unchanged.
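As a worked illustration of this claim, consider a minimal two-asset case. The symbols below are introduced here for illustration and are assumptions, not taken from the article: both assets have expected return \mu and variance \sigma^2, their correlation is \rho, and w is the portfolio weight on the first asset.

```latex
\begin{align*}
\mathbb{E}[R_p] &= w\,\mu + (1-w)\,\mu = \mu \\
\operatorname{Var}(R_p) &= w^2\sigma^2 + (1-w)^2\sigma^2 + 2\,w(1-w)\,\rho\,\sigma^2 \\
\text{with } w = \tfrac{1}{2},\ \rho = 0:\quad
\operatorname{Var}(R_p) &= \tfrac{1}{4}\sigma^2 + \tfrac{1}{4}\sigma^2 = \tfrac{1}{2}\sigma^2
\end{align*}
```

Under these assumptions the expected return stays at \mu while the variance is halved, so the standard deviation falls to \sigma/\sqrt{2}, roughly 71 percent of the single-asset level. With \rho = 1 the variance would remain \sigma^2, which is why low correlation matters.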

Multigrid’s Strategy: Separation and Scale

Diversification in computing is not as trivial as in finance. First, it is challenging to distribute assets while maintaining cost efficiency at the subsystem level. Second, risk reduction requires low correlation between the risks associated with each subsystem. Third, a negative event or failure in one subsystem should not spill over to another subsystem. Fourth, the process of developing the distributed system, and the resulting system itself, should not be so complex that it introduces other operational and technical risks.
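The second condition, low correlation, can be made concrete with a small sketch. It assumes a deliberately simple common-cause model that is not from the article: with probability `p_common` a shared event takes out every site at once, and otherwise each site fails independently with probability `p_site`. The function name and parameters are hypothetical.

```python
def prob_all_down(p_site: float, n_sites: int, p_common: float = 0.0) -> float:
    """Probability that every site is unavailable at the same time.

    Assumed model (illustrative only): with probability p_common a shared
    event disables all sites; otherwise each site fails independently
    with probability p_site.
    """
    if not (0.0 <= p_site <= 1.0 and 0.0 <= p_common <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    # Either the common cause strikes, or all sites fail independently.
    return p_common + (1.0 - p_common) * p_site ** n_sites


if __name__ == "__main__":
    # Two sites, each with a 1% independent failure probability.
    for p_common in (0.0, 0.001, 0.01):
        print(p_common, prob_all_down(0.01, 2, p_common))
```

In this toy model even a small common-cause probability dominates the result, because the independent term shrinks geometrically with the number of sites while the shared term does not shrink at all.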

The challenges in the site selection process, the design of the data center and the installation of physical infrastructure help explain why the full potential for diversification is far from exploited. Additional gains from distributed computing can thus be realized in the future.

To achieve diversified risk and cost efficiency, one must combine separation and scale. Fortunately, as cloud computing is a commodity with large and growing demand, it is feasible to separate physical capacity at or above the minimum efficient scale of any subsystem.

Diversification is supported by separation at four levels to allow for subsystems of appropriate size on each level (row – room – building – campus – region):

  1. In the data room, workloads can be distributed on separate physical servers in different rows of cabinets to mitigate negative consequences of hardware failure.
  2. In the data center, ICT equipment can be distributed in different rooms to contain the consequences of intrusion or fire.
  3. On the data center campus, separate physical infrastructure and technical systems for uninterruptible power supply, cooling and fire suppression can be distributed in different data center buildings to limit the negative consequences of failure in any given subsystem.
  4. In the computing region, data centers can be geographically separated with different power grid connections to maintain capacity even if there are severe external events in proximity of one campus (wildfire, power outage, traffic jam, chemical accident, terrorist attack, etc.).
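The nesting of these levels can be sketched as a small data structure. The example below is a hypothetical illustration in Python, not a description of any provider's tooling: `Location`, `LEVELS` and `deepest_shared_level` are invented names, and the location labels are placeholders.

```python
from typing import NamedTuple, Optional

# Nesting order, from the widest unit to the narrowest.
LEVELS = ("region", "campus", "building", "room", "row")


class Location(NamedTuple):
    region: str
    campus: str
    building: str
    room: str
    row: str


def deepest_shared_level(a: Location, b: Location) -> Optional[str]:
    """Return the narrowest unit that both locations have in common.

    None means the two locations are in different regions.
    """
    shared = None
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break  # Everything below the first difference is separate.
        shared = level
    return shared


if __name__ == "__main__":
    a = Location("region-a", "campus-1", "building-1", "room-1", "row-1")
    b = Location("region-a", "campus-1", "building-2", "room-1", "row-1")
    print(deepest_shared_level(a, b))
```

If the narrowest shared unit is the campus, as in the usage example, the two workloads sit in separate buildings: a failure confined to one building affects only one of them, while a campus-wide event can affect both.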

With global scale, a cloud provider can match two computing regions in a pair, allowing computing capacity to be separated across different jurisdictions and separate energy markets to mitigate problems with political unrest, adverse regulation or significant long-term changes in input prices.

Figure 2: A Multi-Level Diversification Strategy

Understanding cloud regions

All of the major cloud providers use strategies for physical infrastructure and data centers that support the diversification cloud computing allows. Computing and storage are kept in domains that are distributed over physically separated servers and switches in different racks and/or data rooms. Workloads are distributed over physically separated data centers with independent power supply and cooling systems (referred to as “availability zones”). Copies of data for disaster recovery are kept in separate jurisdictions and geographies (referred to as “regions”).
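The placement rules described above can be expressed as a simple check. This is a minimal sketch under stated assumptions: the `Replica` record, the `"active"` and `"backup"` roles, and the two rules themselves are hypothetical simplifications, and the code does not call any cloud provider's API.

```python
from collections import namedtuple

# Hypothetical record: where a copy of a workload or its data lives.
Replica = namedtuple("Replica", ["region", "zone", "role"])


def check_placement(replicas):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    active = [r for r in replicas if r.role == "active"]
    backups = [r for r in replicas if r.role == "backup"]

    # Rule 1: active workloads span at least two availability zones.
    zones = {(r.region, r.zone) for r in active}
    if len(zones) < 2:
        problems.append("active replicas should span at least two availability zones")

    # Rule 2: at least one disaster recovery copy is kept in another region.
    active_regions = {r.region for r in active}
    if not any(b.region not in active_regions for b in backups):
        problems.append("no backup copy outside the active region(s)")

    return problems


if __name__ == "__main__":
    placement = [
        Replica("region-a", "zone-1", "active"),
        Replica("region-a", "zone-2", "active"),
        Replica("region-b", "zone-1", "backup"),
    ]
    print(check_placement(placement))
```

A real deployment would have more rules, for example about racks and data rooms within a zone, but the structure of the check stays the same.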

Figure 3: Cloud providers, cloud regions and availability zones

The benefits of diversification could easily lead to the false and foregone conclusion that a fully distributed model with maximum physical and geographical separation is desirable. This is, however, not the case. There are fundamental reasons to choose the best location in order to benefit from locational advantages. In addition, increasing returns to scale are an important motive for concentration.

Within a region there are both centrifugal and centripetal forces. On the one hand, availability zones are pushed apart to reduce the correlated risk associated with two separate locations in relative proximity. On the other hand, availability zones are pulled together to speed up communications and reduce latency. A further motive for proximity is the reduced cost of traffic between data centers.
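The latency side of this trade-off can be estimated with a back-of-the-envelope calculation. The sketch below assumes that light travels through optical fiber at roughly two thirds of its speed in vacuum, about 200 km per millisecond; the function name and the `route_factor` parameter (how much longer the cable route is than the straight-line distance) are hypothetical.

```python
# Approximate propagation speed in optical fiber: about 200,000 km/s.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float, route_factor: float = 1.0) -> float:
    """Lower bound on round-trip propagation delay between two sites.

    Ignores switching, queuing and protocol overhead, so real latency
    is higher than the value returned here.
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if route_factor < 1.0:
        raise ValueError("route_factor must be at least 1.0")
    return 2.0 * distance_km * route_factor / FIBER_KM_PER_MS


if __name__ == "__main__":
    for km in (10, 50, 100, 1000):
        print(km, "km:", round_trip_ms(km), "ms")
```

Under this assumption, two sites 100 km apart incur at least 1 ms of round-trip delay from propagation alone, which illustrates why zones within a region cannot be pushed arbitrarily far apart.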

A final word on the benefits of cloud computing

Cloud computing is making several important and positive contributions. Direct benefits to end users include the flexibility of scaling and the comfort of an external party taking responsibility for installation and operation of IT systems. Indirect benefits include cost savings stemming from scale and higher utilization of physical assets. Moreover, R&D in combination with systematic site selection and sophisticated sourcing yields productivity gains and environmental benefits.

The flexibility and the cost saving associated with cloud computing are related to virtualization and size. Virtualization gives greater flexibility and higher utilization of physical assets while size motivates stronger effort, larger production units (economies of scale) and pooling of needs over time, geographies and industries (economies of scope).

It can be argued that the benefits above can be realized if physical systems are appropriately sized and located in suitable geographies. The challenge and the upside with diversification, however, deserve careful analysis and further development.

Mattias Ganslandt, CEO of Multigrid Data Centers

Sources

  1. Microsoft Azure, Overview of Availability Zones in Azure, https://docs.microsoft.com/en-us/azure/availability-zones/az-overview
  2. Google Cloud Platform, Geography and Regions, https://cloud.google.com/docs/geography-and-regions
  3. What Is Amazon EC2?, Regions and Availability Zones, http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
  4. Oracle Cloud Terminology, https://docs.oracle.com/en/cloud/get-started/subscriptions-cloud/csgsg/oracle-cloud-terminology.html
  5. New Regional Services for IBM Cloud Object Storage, https://www.ibm.com/blogs/bluemix/2017/06/announcing-ibm-cloud-object-storage-regional-us-south-dallas-new-low-cost-low-latency-s3-compatible-regional-services/