Part I: Managing Data in the Cloud
Data storage was the first manifestation of the cloud. Amazon’s S3 public data
storage service was launched in 2006. In 2008, Dropbox was introduced as a cloud
service that removed the need to share files by passing around USB flash drives.
That same year, Microsoft introduced its SkyDrive cloud storage service, which
was later integrated with Live Mesh, a service that allowed synchronization
across multiple machines, and was rebranded as OneDrive in 2014 following a
lawsuit over the use of the word “Sky.” Google introduced Google Drive in 2012.
These services all demonstrate the utility of a cloud service that gives you
access to your data anywhere, at any time, and on any device. However, they
represent only one of the data storage models that are important for cloud
computing. In this first part of the book we explore the following models:
• File system storage is the well-known model of organizing data into folders
and directories. In the cloud, file storage is usually accessed by attaching a
virtual disk to a virtual machine.
• Blob storage, where Blob is shorthand for Binary Large Object, provides
a flat object model for data. It is extremely scalable, in ways that are
challenging for file systems.
• Databases provide highly structured data collections. We consider three
primary types of database in this book:
1. Relational databases, which have a formal algebra of composition that
can be invoked by the structured query language, SQL.
2. Tables and NoSQL databases, which are more easily distributed over
multiple machines.