Part IV
Building Your Own Cloud
[Figure: Roadmap of the book's five parts (Parts I through V): Managing data in the cloud; Computing in the cloud; The cloud as platform; Building your own cloud (what you need to know, using Eucalyptus, using OpenStack); Security and other topics.]
We cover two topics in this fourth part of our book: how to build your own private
cloud on your own hardware in your own institution, and how to build your own
software as a service (SaaS) systems that run on public clouds.
A private cloud, as we explained in section 1.2 on page 3, is a cloud infrastructure
operated for a single organization. A private cloud is generally taken to be a
compute cluster that supports APIs similar to those provided by the Amazon
EC2 and S3 services described in parts I and II. That is, it supports on-demand
provisioning of virtual machine instances and storage buckets. Building a private
cloud is thus a matter of deploying and running software that provides such APIs.
Numerous software stacks have been developed for this purpose, of which the
following are among the most frequently used.
OpenStack (openstack.org) is an open source project that curates software
for building private and public clouds. Its separate, individually accessible
services can be deployed in various combinations, making it possible to
customize an OpenStack deployment to meet specific private cloud demands.
OpenNebula [202] (opennebula.org) is an open source cloud project that
simplifies the process of deploying a private cloud. Because each data center
is architected differently, by many architects, possibly over many years,
developing a portable private cloud platform that can install in any data
center configuration is difficult. OpenNebula addresses this challenge by
providing a simple set of services that easily integrate with existing data
center hardware, software, and administrative policies.
Eucalyptus [212] is an open source project for building private and hybrid
clouds that are API-compatible with Amazon AWS. It was designed to enable
the creation of private clouds with a consistent API and functionality,
regardless of how they are deployed; its API compatibility means that applications
can be migrated without change between Eucalyptus and Amazon. Thus
Eucalyptus is architected not as a set of separately developed services, but
as an end-to-end integrated service ensemble.
Apache CloudStack (cloudstack.apache.org) is an open source software
project that bridges between the customizability and site-specific deployment
characteristics of OpenStack and OpenNebula and the scale, reliability, and
API portability of Eucalyptus. It supports its own API and also provides
limited support for older versions of the AWS API.
Microsoft’s Azure Stack [32] is proprietary (i.e., non-open source) software
that can be deployed within a data center, primarily to enable hybrid cloud
operation with the Azure public cloud. It supports basic cloud functionality
using the Azure APIs and includes extensive support for hybrid operation.
VMware Cloud Foundation [50] offers a suite of proprietary virtualization
technologies from which it is possible to build a private cloud. The Cloud
Foundation product provides installation and deployment support for these
technologies.
We describe two of these private cloud software stacks here: Eucalyptus, which
has the dual merits of being particularly easy to deploy and of implementing
Amazon APIs, in chapter 12; and the more complex but also more configurable,
and perhaps for that reason more popular, OpenStack in chapter 13.
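To make the idea of API compatibility concrete, here is a minimal sketch of how one piece of client code might address either Amazon or a private cloud that exposes EC2- and S3-style APIs. It assumes the third-party boto3 SDK, whose `boto3.client` call accepts `region_name` and `endpoint_url` arguments; the `CloudEndpoint` class, the helper functions, and the `example.org` URLs and `campus-1` region are hypothetical names introduced only for illustration, not part of any of the stacks above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CloudEndpoint:
    """Where a given cloud exposes its EC2- and S3-style APIs."""
    name: str
    region: str
    ec2_url: Optional[str] = None  # None: let the SDK use the public AWS endpoint
    s3_url: Optional[str] = None


def client_kwargs(cloud: CloudEndpoint, service: str) -> dict:
    """Build keyword arguments for boto3.client() for 'ec2' or 's3'."""
    urls = {"ec2": cloud.ec2_url, "s3": cloud.s3_url}
    if service not in urls:
        raise ValueError(f"unsupported service: {service}")
    kwargs = {"region_name": cloud.region}
    if urls[service] is not None:
        # A private cloud is selected purely by pointing at its endpoint.
        kwargs["endpoint_url"] = urls[service]
    return kwargs


def make_client(cloud: CloudEndpoint, service: str):
    """Create an SDK client; credentials come from the usual boto3 sources."""
    import boto3  # third-party AWS SDK, imported lazily so the sketch loads without it
    return boto3.client(service, **client_kwargs(cloud, service))


def list_bucket_names(cloud: CloudEndpoint) -> list:
    """The same call is issued whether the target is Amazon or a private cloud."""
    s3 = make_client(cloud, "s3")
    return [bucket["Name"] for bucket in s3.list_buckets()["Buckets"]]


# Hypothetical targets: the public cloud and an assumed on-premises deployment.
AWS = CloudEndpoint(name="aws", region="us-east-1")
PRIVATE = CloudEndpoint(
    name="campus",
    region="campus-1",
    ec2_url="https://ec2.cloud.example.org",
    s3_url="https://s3.cloud.example.org",
)
```

Under these assumptions, `list_bucket_names(AWS)` and `list_bucket_names(PRIVATE)` run the same application code, and only the endpoint description differs. How completely a given private cloud stack implements the Amazon API surface varies, so a real migration may still need testing.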
In chapter 14, we turn to the second topic of part IV, building your own software
as a service. We explain that SaaS is both a technology and a business model [136].
As a technology, it features a single version of software, operated by a SaaS provider,
that is consumed by many customers over the network. As a business model, it
features lightweight pay-for-use or subscription-based compensation mechanisms
that both minimize friction for consumers and enable SaaS providers to scale
delivery with usage.
Together, these two concepts have proven remarkably successful in enterprise
and consumer software, allowing previously expensive capabilities to be delivered to
many more people at dramatically lower prices. We discuss the implications of SaaS
for scientific software, review selected projects that deliver scientific software over
the network, and present some examples of where SaaS proper is being applied in
scientific settings. Space does not permit a comprehensive, step-by-step treatment
of how to build a SaaS system, but we hope that the material here will whet your
appetite for building your own software services.
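As a small illustration of the two SaaS concepts described above, the following toy sketch pairs a single shared software version serving many customers with pay-for-use metering. Everything in it is an assumption made for illustration: the `WordCountService` class, the tenant names, and the price of 2 cents per request are hypothetical, and a real service would sit behind a network API with authentication and persistent usage records.

```python
class WordCountService:
    """One shared deployment (a single software version) used by many tenants."""

    VERSION = "1.0"  # every tenant is served by the same version

    def __init__(self, price_cents_per_request: int):
        self.price_cents_per_request = price_cents_per_request
        self._requests = {}  # tenant id -> metered request count

    def handle(self, tenant: str, text: str) -> int:
        """Serve one request and record it against the calling tenant."""
        self._requests[tenant] = self._requests.get(tenant, 0) + 1
        return len(text.split())

    def invoice_cents(self, tenant: str) -> int:
        """Pay-for-use charge: metered requests times the unit price."""
        return self._requests.get(tenant, 0) * self.price_cents_per_request


# Hypothetical price and tenants; amounts are integer cents to avoid rounding.
service = WordCountService(price_cents_per_request=2)
service.handle("lab-a", "clouds for science")
service.handle("lab-a", "software as a service")
service.handle("lab-b", "hello")
```

Because charges follow metered usage, a tenant that makes no requests owes nothing, which is one way to read the "minimize friction for consumers" point; a subscription model would replace `invoice_cents` with a flat periodic fee.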