Other Globus services provide other capabilities. Globus Publication, for
example, provides user-configurable, cloud-hosted data publication pipelines that
can be used to automate the workflows used to make data accessible to others.
These workflows typically include steps such as providing and collecting metadata,
moving data to long-term storage, assigning persistent identifiers (e.g., a Digital
Object Identifier or DOI [218]), and verifying data correctness [89]. Globus Data
Search can be used to search for data on endpoints to which a user has access. See
the Globus documentation at docs.globus.org for information on these services.
Data delivery at the Advanced Photon Source. The Advanced Photon Source
(APS) at Argonne National Laboratory is typical of many experimental
facilities worldwide in that it serves large numbers (thousands) of researchers every
year, most of whom visit just for a few days to collect data and then return to their
home institution. In the past, data produced during an experiment was invariably
carried back on physical media. However, as data sizes have grown and experiments
have become more collaborative, that approach has become less effective. Data
transfer via network is preferred; the challenge is to integrate data transfer into the
experimental workflow of the facility in a way that is fully automated, secure, reliable,
and scalable to thousands of users and datasets.
Francesco De Carlo uses Globus APIs to do just that at the APS. His DMagic
system [107] implements a variant of the program in figure 11.9 that integrates with
APS administrative and facility systems to deliver data to experimental users. When
an experiment is approved at the APS, a set of associated researchers is registered
in the APS administrative database as approved participants. DMagic leverages this
information as follows. Before the experiment begins, it creates a shared endpoint on
a large storage system maintained by Argonne's computing facility. DMagic then
retrieves from the APS scheduling system the list of approved users for the experiment,
and adds permissions for those users to the shared endpoint. It then monitors the
experiment data directory at the APS facility and copies new files automatically to
that shared endpoint, from which they can be retrieved by any approved user.
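The steps just described (create a shared endpoint, grant access to the approved users, then look for new files to copy) can be outlined in a few lines of Python. The following is a minimal sketch under stated assumptions, not DMagic's actual code. It assumes that `tc` behaves like a `globus_sdk.TransferClient`, whose `create_shared_endpoint` and `add_endpoint_acl_rule` methods are used here. The function names `share_rule`, `prepare_share`, and `new_files` are hypothetical, as are all endpoint and user identifiers. The lookup of approved users in the APS scheduling system and the submission of the transfer itself are left out.

```python
from pathlib import Path


def share_rule(identity_id, path="/"):
    """Build a read-only access rule document for one approved user."""
    return {
        "DATA_TYPE": "access",
        "principal_type": "identity",
        "principal": identity_id,  # Globus identity ID of the user
        "path": path,
        "permissions": "r",
    }


def prepare_share(tc, host_endpoint, host_path, name, user_ids):
    """Create a shared endpoint and grant read access to each approved user.

    `tc` is assumed to behave like globus_sdk.TransferClient.
    """
    doc = {
        "DATA_TYPE": "shared_endpoint",
        "host_endpoint": host_endpoint,
        "host_path": host_path,
        "display_name": name,
    }
    shared_id = tc.create_shared_endpoint(doc)["id"]
    for uid in user_ids:
        tc.add_endpoint_acl_rule(shared_id, share_rule(uid))
    return shared_id


def new_files(data_dir, already_copied):
    """Return files under data_dir whose paths are not in already_copied."""
    found = sorted(p for p in Path(data_dir).rglob("*") if p.is_file())
    return [p for p in found if str(p) not in already_copied]
```

A monitoring loop would call `new_files` periodically, submit a transfer of the returned paths to the shared endpoint, and add them to the set of copied paths once the transfer succeeds.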
11.4 Building a Remotely Accessible Service
Say you want to build a service that can be invoked remotely via a REST API
call. Building and invoking a service in this way is straightforward in principle:
many libraries exist for defining, implementing, and using REST APIs. Security
is perhaps the one major source of complexity, and here Globus Auth can help.
The basic issue is that when a remote user makes a request to the service, the
service author needs to be able to determine who is making the request and what
rights the requestor is passing with the request. For example, the service may