Chapter 7. Scaling Deployments
your authority. This is your microservice, so you are entitled to do so, but you
may prefer to pass only a limited authority: for example, grant access only to the
task queue but not the database.
The public cloud providers have a better solution: role-based security. What
this means is that you can create special secure entities, called roles, that authorize
individuals, applications, or services to access various cloud resources on your behalf.
As we illustrate below, you can add a reference to a role in a container’s deployment
metadata so that each time the container is instantiated, the role is applied. (We
discuss role-based access control in more detail in section 15.2 on page 319.)
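To make the idea of attaching a role through deployment metadata concrete, here is a minimal sketch. It assumes an Amazon ECS-style task definition, in which the `taskRoleArn` field names the role that containers in the task assume. The account number, role name, and image name are hypothetical placeholders, and other resource managers use different metadata formats.

```python
import json


def task_definition(family, image, role_arn):
    """Build ECS-style deployment metadata that attaches a role to a container."""
    return {
        "family": family,
        # The role applied each time the container is instantiated.
        "taskRoleArn": role_arn,
        "containerDefinitions": [
            {"name": family, "image": image, "memory": 256, "essential": True}
        ],
    }


# Hypothetical role that grants access to the task queue but not the database.
definition = task_definition(
    "classifier",
    "example/classifier:latest",
    "arn:aws:iam::123456789012:role/queue-only-role",
)
print(json.dumps(definition, indent=2))
```

Because the role is named in the metadata rather than embedded in the container image, the same image can be deployed with a different, narrower or broader, role without being rebuilt.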
7.6.3 A Simple Microservices Example
As in previous chapters, we present a single example and show how different
resource managers can be used to implement that example. Several types of
scientific applications can benefit from a microservice architecture. One common
characteristic is that they must run continuously and respond to external events.
In chapter 9, we describe a detailed example of such an application: the analysis
of events from online instruments and experiments. In this chapter, we consider
the following simpler example.
Scientific document classification. When scientists send technical papers to
scientific journals, the abstracts of these papers often make their way onto the
Internet as a stream of news items, to which one can subscribe via RSS feeds. A
major source of high-quality streamed science data is arXiv (arxiv.org), a collection
of more than one million open-access documents. Other sources include the Public
Library of Science (PLOS one), Science, and Nature, as well as online news sources.
We have downloaded a small collection of records from arXiv, each containing a paper
title, an abstract, and, for some, a scientific topic as determined by a curator. In the
following sections we describe how to build a simple online science document classifier.
Our goal is to build a system that pulls document abstracts from the various feeds
and then uses one set of microservices to classify those abstracts into the major topics
of physics, biology, math, finance, and computer science, and a second set to classify
them into subtopic areas, as illustrated in figure 7.10 on the next page.
The initial version that we describe here is more modest. In the first phase of this
system, initial document classification, we feed the documents from a Jupyter notebook
into a cloud-based message queue. The classifier microservices pull documents from
the queue, perform the analysis, and push results into a NoSQL table, as shown in
figure 7.11. This is now a simple many-task system.
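The flow just described (documents pushed to a queue, workers pulling and classifying them, results written to a table) can be sketched locally before any cloud service is involved. The following is only an illustration under stated assumptions: an in-process `queue.Queue` stands in for the cloud message queue, a dictionary stands in for the NoSQL table, threads stand in for the microservice instances, and the keyword-matching `classify` function with its `TOPIC_KEYWORDS` lists is a hypothetical placeholder for a real trained classifier.

```python
import queue
import re
import threading

# Hypothetical keyword lists; a real classifier would be trained on the arXiv records.
TOPIC_KEYWORDS = {
    "physics": {"quantum", "particle", "galaxy"},
    "biology": {"protein", "genome", "cell"},
    "math": {"theorem", "manifold", "proof"},
    "finance": {"market", "portfolio", "pricing"},
    "computer science": {"algorithm", "network", "compiler"},
}


def classify(abstract):
    """Return the topic whose keywords overlap the abstract most, or 'unknown'."""
    words = set(re.findall(r"[a-z]+", abstract.lower()))
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def worker(task_queue, table, lock):
    """Pull documents from the queue, classify them, and store the results."""
    while True:
        doc = task_queue.get()
        if doc is None:  # sentinel: no more work for this worker
            break
        topic = classify(doc["abstract"])
        with lock:
            table[doc["id"]] = {"title": doc["title"], "topic": topic}


def run_pipeline(docs, num_workers=3):
    """Feed docs through the queue to a pool of workers; return the result table."""
    task_queue = queue.Queue()  # stand-in for the cloud message queue
    table = {}                  # stand-in for the NoSQL table
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(task_queue, table, lock))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for doc in docs:
        task_queue.put(doc)
    for _ in workers:
        task_queue.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return table


if __name__ == "__main__":
    sample = [
        {"id": "d1", "title": "T1", "abstract": "We prove a theorem about a smooth manifold."},
        {"id": "d2", "title": "T2", "abstract": "Protein folding in the cell."},
    ]
    for doc_id, row in sorted(run_pipeline(sample).items()):
        print(doc_id, row["topic"])
```

Because the workers share nothing except the queue and the table, adding workers increases throughput without changing the code, which is the property that makes the many-task pattern easy to scale.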