Chapter 18

Afterword: A Discovery Cloud

“It would appear that we have reached the limits of what it is possible to achieve with computer technology, although one should be careful with such statements, as they tend to sound pretty silly in five years.”

—John von Neumann

We hope that the preceding pages have given you some concrete ideas about how you can use the cloud in your research. Perhaps the cloud, for you, will simply be a place to store your research data securely and cheaply, or to perform computations that you could not easily run before. Or perhaps you are inspired to embrace the power of the cloud to transform how you run your laboratory, conduct your research, and interact with your community. No matter how you approach the use of these technologies, we are confident that you will find the experience both rewarding and fun.

We cannot resist this last opportunity to prognosticate. The pioneering cybernetician and organizational theorist Stafford Beer wrote in 1972 [70]:

The question which asks how to use the computer in the enterprise [is] the wrong question. A better formulation is to ask how the enterprise should be run given that computers exist. The best version of all is the question asking what, given computers, the enterprise now is.

We, in turn, are fascinated by the following variant of Beer’s best question:

Given cloud, and all that its scalable and cost-effective automation and outsourcing entail, what the scientific enterprise now is.

We propose the following likely outcomes.

Industrialization of data production via large-scale automated experiments and observations, already occurring in astronomy [241], functional genomics, and materials science [159], will expand to many more domains. The resulting data glut will in turn drive industrialization of data analysis, by which we mean large-scale computational platforms that automate quality control, analysis, inference, and other steps. These developments will greatly reduce the costs of hypothesis generation and testing. They will also improve reproducibility because experimental configurations and data processing steps will be captured precisely.

Meanwhile, the digital encoding of large quantities of scientific knowledge from such experiments and other sources (e.g., the scientific literature) will enable the creation of a universal knowledge base supporting both rapid access and automated inference. It will become routine to ask questions via a scientific search engine, to be notified of potential inconsistencies across existing knowledge, and to vote on the next set of experiments to be performed by industrial-scale facilities. Other experiments will be performed by quasi-independent robot scientists [171, 262] that apply inference and experiment design methods to guide their choice of the next experiment.

These steps towards economies of scale may sound dehumanizing, but experience suggests that, if implemented in the right way, they can unleash a flood of creativity. If the universal knowledge base is treated as a globally accessible public good, then the scientific playing field becomes more level. A high school student in Angola, India, or New Zealand will be able to search for new drugs for rare diseases or new materials sourced from local materials. They will use powerful tools accessible from this discovery cloud [124] to collect new data, analyze extant and new data, test hypotheses, and contribute to knowledge.

This discovery cloud will also empower the bench scientist. Amazon’s Echo and Alexa services can keep our calendar, order a pizza, and summon a car service, and Alexa can invoke an AWS Lambda function to start an experimental analysis in the cloud. All of these actions can be driven by voice commands. As machine learning continues to progress, future scientists will benefit from a cloud-based research assistant that not only monitors experiments but also performs background research, such as scanning the literature for related work and checking our mathematical derivations. Such a system will respond to vocal instructions while also reading (and writing) our computational notebooks.

Some of these developments may be some way out, but the technology is evolving fast. As Roy Amara observed, “[w]e tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.”
