Chapter 5
Using and Managing Virtual
Machines
“We all live every day in virtual environments, defined by our ideas.”
—Michael Crichton
The introduction by Amazon of its Elastic Compute Cloud (EC2) service in 2006
marked the true beginning of cloud computing. EC2 is based on virtualization
technology, which allows one server to run independent, isolated operating systems
(OSs) for multiple users simultaneously. Since then Microsoft, Google, and many
others have introduced virtual machine (VM) services based on this technology.
In this chapter, we first provide a brief introduction to virtualization technology,
and then proceed to describe how to create and manage VMs in the cloud. We
start with creating a VM on EC2 and show how to attach an external disk. We
then describe Microsoft’s solution, Azure, and show how to create VM instances
via both the Azure portal and a Python API.
The open source community has also been active in this space. Around 2008
three projects released cloud software stacks: Eucalyptus from the University of
California Santa Barbara [212], OpenNebula from the Complutense University of
Madrid [202], and Nimbus from the University of Chicago [191]. Later NASA,
in collaboration with Rackspace, released OpenStack, which is widely supported.
We describe in this chapter one OpenStack-based system called Jetstream, a
facility funded by the U.S. National Science Foundation, and show how to create
OpenStack VMs on Jetstream. We provide additional information on Eucalyptus
and OpenStack in chapters 12 and 13, respectively.
5.1 Historical Roots
Any modern computer has a set of basic resources: CPU data registers, memory
addressing mechanisms, and I/O and network interfaces. The programs that control
the computer are just sequences of binary codes corresponding to instructions that
manipulate these resources, for example to ADD the contents of one register to
the contents of another.
There are also important instructions for performing context switches, in which
the computer stops executing one program and starts executing another. These
state management instructions plus the I/O instructions are termed privileged.
Such instructions are usually directly executed only by the OS, because you do
not want users to be able to access state associated with other computations.
The OS has the ability to allow user programs (encapsulated as processes) to
run the unprivileged instructions. But as soon as the user program attempts to
access an I/O operation or other privileged instruction, the OS traps the instruction,
inspects the request, and, if the request proves to be acceptable, runs a program
that executes a safe version of the operation. This process of providing a version
of the instruction that looks real but is actually handled in software is called
virtualization. Other types of virtualization, such as virtual memory, are handled
directly by the hardware with guidance from the OS.
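The trap-and-emulate cycle described above can be illustrated with a toy model. This is only a sketch: the instruction names (LOAD, ADD, IO_WRITE, SWITCH_CONTEXT), the ToyOS class, and the device names are invented for illustration and do not correspond to any real instruction set or OS interface.

```python
# Toy model of trap-and-emulate. All names are hypothetical illustrations.
PRIVILEGED = {"IO_WRITE", "SWITCH_CONTEXT"}


class ToyOS:
    """Stands in for the OS: inspects trapped instructions and emulates safe ones."""

    def __init__(self, allowed_devices):
        self.allowed_devices = set(allowed_devices)
        self.log = []

    def handle_trap(self, op, args):
        if op == "IO_WRITE":
            device, data = args
            if device not in self.allowed_devices:
                self.log.append(("denied", op, device))
                return False
            # The "safe version" of the operation, carried out in software.
            self.log.append(("emulated", op, device, data))
            return True
        # Any other privileged request is refused in this toy model.
        self.log.append(("denied", op))
        return False


def run(program, toy_os):
    """Run a user program: unprivileged ops execute directly, privileged ops trap."""
    registers = {"r0": 0, "r1": 0}
    for op, *args in program:
        if op in PRIVILEGED:
            toy_os.handle_trap(op, args)
        elif op == "LOAD":
            reg, value = args
            registers[reg] = value
        elif op == "ADD":
            dst, src = args
            registers[dst] += registers[src]
        else:
            raise ValueError(f"unknown instruction {op}")
    return registers


program = [
    ("LOAD", "r0", 2),
    ("LOAD", "r1", 3),
    ("ADD", "r0", "r1"),                 # unprivileged: runs directly
    ("IO_WRITE", "console", "hello"),    # privileged: trapped, then emulated
    ("IO_WRITE", "disk0", "secret"),     # privileged: trapped, then denied
]
toy_os = ToyOS(allowed_devices=["console"])
final_registers = run(program, toy_os)
```

A hypervisor applies the same idea one level down: the trapped requests come from whole guest OSs rather than from user processes.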
In the late 1960s and early 1970s, IBM and others created many variations on
virtualization and eventually demonstrated that they could virtualize an entire
computer [104]. What resulted was the concept of a hypervisor: a program that
manages the virtualization of the hardware on behalf of multiple distinct OSs.
Each such OS instance runs on its own complete VM that the hypervisor ensures is
completely isolated from all other instances running on the same computer. Here
is an easy way to think about it. The OS allows multiple user processes to run
simultaneously by sharing the resources among them; the hypervisor below the
OS allows multiple OSs to share the real physical hardware and run concurrently.
Many hypervisors are available today, such as Citrix Xen, Microsoft Hyper-V,
and VMware ESXi. We refer to the guest OSs running on the hypervisors as
VMs. Some hypervisors run on top of the host machine OS as a process, such as
VirtualBox and KVM, but for our purposes the distinction is minor.
While this technical background is good for the soul, it is not essential to
learning how to create and manage VMs in the cloud. In the remainder of this
chapter we dig into the mechanics of getting science done with VMs. We assume
that you are familiar with Linux and focus in our examples on creating Linux VMs.
This choice does not imply that Windows is not available. In fact, all three public
clouds we talk about in this book allow you to create VMs running Windows just
as easily as Linux. So if you need a Windows VM for some of your work, rest
assured that almost everything that we present works for Windows, too. We try
to point out the occasional exceptions to this rule.
Each of the three public clouds and the NSF Jetstream cloud has a web portal
that guides you through the steps needed to create and manage VM instances. If
you have never used a cloud, you are well advised to start there. We introduce
selected interfaces and describe how to get started with each.
5.2 Amazon’s Elastic Compute Cloud
We describe first how to create VM instances on Amazon’s Elastic Compute Cloud
service and then how to attach storage to our VMs.
5.2.1 Creating VM Instances
We start on the Amazon portal at aws.amazon.com, where we can log in or create
an account. Figure 5.1 shows what we see when we log in. We are interested in
VMs, so we click on EC2 or Launch a Virtual Machine. This brings us to another
series of views, with instructions on how to launch a basic “Amazon Linux” instance.
series of views, with instructions on how to launch a basic “Amazon Linux” instance.
We can then specify our desired host service Instance Type, which determines
the number of cores that our VM is to use, the required memory size, and network
performance. Literally dozens of choices exist, ordered from small to large, and
priced accordingly (more on this below).
One important step during the launch process involves providing a key pair:
the cryptographic keys that you use to access your running instance. If this is
your first experience with EC2, you may be asked to create a key pair early in the
process. You should do so. Give it a name and remember it. You then download
the private key file to a secure place on your laptop where you can access it again.
The corresponding public key is stored with Amazon. Just before you launch your
instance, it asks you which key pair you want to use. After you select it, the public
key is loaded into the instance. The other important choices involve storage options
and security groups. We return to those later. Once you launch the instance you
can monitor its status, as shown in figure 5.2, where you see two
stopped instances and one newly launched instance. The Status Checks column shows
that the new instance is still initializing. After a few moments, its status changes
to a green check mark to indicate that the instance is ready to use.
Figure 5.1: First view of the Amazon portal.
Figure 5.2: Portal instance view.
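The key pair step described above can also be scripted. The following is a minimal sketch, assuming the Boto3 SDK and configured AWS credentials; the helper name save_key_pair and the key name in the usage comment are hypothetical. It relies on Boto3's create_key_pair call, whose result carries the private key in its key_material attribute, and writes that key to a file that only the owner can read.

```python
import os


def save_key_pair(ec2, key_name, directory):
    """Create an EC2 key pair and store its private key with owner-only access.

    `ec2` is expected to be a Boto3 EC2 resource, e.g.
    boto3.resource('ec2', 'us-west-2'). Amazon keeps the public key; the
    private key is returned only once, at creation time.
    """
    key_pair = ec2.create_key_pair(KeyName=key_name)
    path = os.path.join(directory, key_name + ".pem")
    # Create the file with mode 0o600 so it is never readable by other users.
    # O_EXCL makes the call fail rather than overwrite an existing key file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(key_pair.key_material)
    return path


# Usage sketch (contacts AWS, so it is left as a comment):
#   import boto3
#   ec2 = boto3.resource('ec2', 'us-west-2')
#   save_key_pair(ec2, 'my-key', os.path.expanduser('~/.ssh'))
```

Restricting the file permissions matters in practice, because OpenSSH refuses to use a private key file that other users can read.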
To connect to your instance, you need to use a secure shell (SSH) client. On
Windows the tool to use is called PuTTY. You need a companion tool called
PuTTYgen to convert the downloaded private key into one that can be consumed by
PuTTY. When you launch PuTTY, you connect as ec2-user@IPAddress, where IPAddress
is the IP address you can find in the Portal Instance View. The PuTTY SSH settings
include an Auth panel that allows you to select your converted private key. On a Mac
or Linux machine, you can go directly to the shell and execute:
ssh -i path-to-your-private-key.pem ec2-user@ipaddress-of-instance
The following listing uses the Python Boto3 SDK to create an Amazon EC2
VM instance. It creates an ec2 resource, which requires your aws_access_key_id
and aws_secret_access_key, unless you have these stored in your .aws directory.
It then uses the create_instances function to request creation of the instance.
import boto3
ec2 = boto3.resource('ec2', 'us-west-2')
ec2.create_instances(ImageId='ami-7172b611', InstanceType='t2.micro',
                     MinCount=1, MaxCount=1)
The ImageId argument specifies the VM image that is to be started, and the
MinCount and MaxCount arguments specify the minimum and maximum number of
instances to launch. (Here both are 1, so we request exactly one instance. Setting
MaxCount to 5 would instead request five instances while accepting as few as one if
that is all that is available.) Other optional arguments can be used to configure
the instances, for example the instance type: Do you want a small virtual computer,
with limited memory and computing power, or a big one with many cores and lots of
storage? (As we discuss below, you pay more for the latter.) Having created the
instance(s), we define and call a show_instance function that uses instances.filter
to obtain and display a list of running instances. The last line shows the result.
# A function that lists instances with a specified status
def show_instance(status):
    instances = ec2.instances.filter(
        Filters=[{'Name': 'instance-state-name', 'Values': [status]}])
    for instance in instances:
        print(instance.id, instance.instance_type,
              instance.image_id, instance.public_ip_address)

show_instance('running')
('i-0a184b56b0ebdba98', 't2.micro', 'ami-7172b611', '146.137.70.71')
Notebook 7 provides more examples, showing, for example, how to suspend
and terminate instances, check instance status, and attach a virtual disk volume.
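As a rough illustration of the suspend and terminate operations mentioned above, the following sketch uses the stop, terminate, wait_until_stopped, wait_until_terminated, and reload methods of the Boto3 Instance resource. The helper name change_state is hypothetical and is not taken from the notebook; the ec2 argument is assumed to be a Boto3 EC2 resource like the one created earlier.

```python
def change_state(ec2, instance_id, action):
    """Stop or terminate an instance and block until the change completes.

    Returns the instance state name reported afterward, e.g. 'stopped'.
    """
    instance = ec2.Instance(instance_id)
    if action == "stop":
        instance.stop()                  # suspend; attached EBS volumes are kept
        instance.wait_until_stopped()
    elif action == "terminate":
        instance.terminate()             # permanent; the instance cannot be restarted
        instance.wait_until_terminated()
    else:
        raise ValueError("action must be 'stop' or 'terminate'")
    instance.reload()                    # refresh the locally cached attributes
    return instance.state["Name"]


# Usage sketch (contacts AWS, so it is left as a comment):
#   import boto3
#   ec2 = boto3.resource('ec2', 'us-west-2')
#   change_state(ec2, 'i-0a184b56b0ebdba98', 'stop')
```

The wait calls poll the service until the requested state is reached, so the returned state name reflects the completed transition rather than an intermediate state such as stopping.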
5.2.2 Attaching Storage
We now discuss the three types of storage that, as noted in chapter 2, can be
attached to a VM: instance storage, Elastic Block Store, and Elastic File System.
Instance storage is what comes with each VM instance. It is easy to access, but
when you destroy a VM, all data saved in its instance storage goes away.
We allocate Elastic Block Store (EBS) storage independent of a VM and
then attach it to a running VM. EBS volumes persist and thus are good for
databases and other data collections that we want to keep beyond the life of a
VM. Furthermore, they can be encrypted and thus used to hold sensitive data. To
create an EBS volume, go to the volumes tab of the EC2 Management console and
click Create Volume. The dialog box in figure 5.3 allows you to specify volume
size (20 GB here), encryption state, snapshot ID, and availability zone.
Figure 5.3: Creating an EBS volume.
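The same volume can be requested programmatically. The following is a minimal sketch, assuming a Boto3 EC2 resource and configured credentials; the helper name create_data_volume is hypothetical. It uses Boto3's create_volume call together with the client's volume_available waiter, so that the function returns only once the volume can be attached.

```python
def create_data_volume(ec2, zone, size_gb, encrypted=False):
    """Create an EBS volume and wait until it is available for attachment.

    `ec2` is expected to be a Boto3 EC2 resource. `zone` must be the
    availability zone of the instance the volume will later be attached to.
    """
    volume = ec2.create_volume(
        AvailabilityZone=zone,
        Size=size_gb,          # size in GiB
        Encrypted=encrypted,
    )
    # Block until the volume leaves the 'creating' state.
    waiter = ec2.meta.client.get_waiter("volume_available")
    waiter.wait(VolumeIds=[volume.id])
    volume.reload()            # refresh cached attributes such as state
    return volume


# Usage sketch (contacts AWS, so it is left as a comment):
#   import boto3
#   ec2 = boto3.resource('ec2', 'us-west-2')
#   vol = create_data_volume(ec2, 'us-west-2b', 20)
#   print(vol.id, vol.size, vol.state)
```

The zone is passed explicitly because an EBS volume can be attached only to an instance in the same availability zone.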
We selected the us-west-2b availability zone because we want to attach the
volume to the instance created earlier. The easiest way to make the attachment is
through the Actions tab in the Volume management console. However, you can do
much of the volume attaching and mounting in Python. First let us look at the
list of our current volumes. The following is a transcript of an IPython session.
In [3]: vols = ec2.volumes.filter(Filters=[])
In [4]: for vol in vols:
            print(vol.id, vol.size, vol.state)
('vol-032807a231219af70', 8, 'in-use')
('vol-0bdd0584d0833e691', 20, 'available')
('vol-07ce6f03c1a13d5a7', 100, 'in-use')
('vol-0ce3df91d4d2e07e0', 8, 'in-use')
('vol-0fc1ff8711cd0eac4', 8, 'in-use')
We see that the 20 GB volume that we created in the portal session above is
available, so let’s attach it to our instance, as follows.
In [5]: vol = ec2.Volume('vol-0bdd0584d0833e691')
In [6]: vol.attach_to_instance(InstanceId='i-0a184b56b0ebdba98',
                               Device='/dev/xvdh')
{u'AttachTime': datetime.datetime(2016, 9, 23, 18, 15, 49, 308000,
                                  tzinfo=tzutc()),
 u'Device': '