Part II:
Computing in the Cloud
While scientists were first attracted to the cloud by the ability to store and
share data, it was the introduction of cheap on-demand computing that created a
paradigm shift. In this second part of the book, we follow the pattern established
in the preceding one: we first introduce principles and then show how you can use
both cloud portals and Python SDKs to compute on various cloud platforms.
Computing in the cloud has gone through a fascinating evolution. It started
with virtualization, an old computing technology first invented in the context of
mainframe computers and later adopted within data centers as a means of allowing
customers to create environments and services that are uniquely tailored to their
needs. Virtual machines can be started and stopped easily, and the customer is
charged only for the time that the machine instance is running. In chapter 5, we
describe how to create and manage virtual machines on cloud platforms.
A second stage of the evolution of computing in the cloud was the introduction
of containers as a means of encapsulating software. Container technologies allow
researchers to share applications that can be deployed rapidly on any
cloud and then run with a single command. In chapter 6, we show you how to
create and deploy containers based on a technology called Docker.
Scale has always been a critical cloud capability and a major requirement of
scientists. By “scale” we mean the ability of computation to be spread over multiple
cloud servers to exploit parallelism in the application. In chapter 7, we consider
four types of parallel application execution:
• SPMD clusters in the cloud, for traditional HPC-style computing.
• Many task or high throughput parallel computation, characterized by a
large bag of tasks with few or no dependencies and that thus can be executed
in parallel.