Access control at the project level.
Apache Flink dataflow programming model.
Assignments for Udacity deep learning class with TensorFlow.
AWS Case Study: Animoto.
AWS Identity and Access Management best practices.
Azure Batch Shipyard recipes.
Azure Data Lake Store Python SDK.
Azure: Deploy a slurm cluster.
Bare metal on OpenStack: Ironic.
CentOS 7 / RHEL 7 Open ports.
Cloudbridge documentation.
Containers on OpenStack: Magnum.
Deep learning AMI Amazon Linux version.
Euca2ools overview.
Eucalyptus EDGE network configuration.
Eucalyptus installation guide.
Eucalyptus network configuration requirements.
Eucalyptus: Plan services placement.
Eucalyptus: Planning networking modes.
Galaxy on Jetstream.
Get started: Create Apache Spark cluster on HDInsight Linux and run interactive
queries using Spark SQL.
Globus endpoint activation.
Google Cloud Dataflow: Complete Examples.
Google Cloud Datalab Quickstart.
IBM Analytics Stream Computing.
The Kubernetes project.
Layers library reference.
Linux RAID.
Machine Learning Library (MLlib) guide.
Making secure requests to Amazon Web Services.
Microsoft Azure Event Hubs.
Microsoft Azure Stack.
NCBI BLAST on Windows Azure.
Ocean Observatories Initiative.
The Open Compute Project.
OpenStack documentation: CPU topologies.
OpenStack in production: Hints and tips from the CERN OpenStack cloud team.
OpenStack Newton release notes.
OpenStack: Operators mailing list.
OpenStack: Scientific working group.
Predict with pre-trained models.
Rados object storage utility.
Riak cloud storage.
Sample applications built using Amazon Machine Learning.
Spark SQL, DataFrames and Datasets guide.
The Red Hat Package Manager.
Theano deep learning library.
Transferring RDA data with Globus.
TripleO online documentation.
VMware Cloud Foundation.
Welcome to Bridges.
What is IAM?
Setup Linux Network Bridges on CentOS for Nova Networking, Nov 2015.
OpenStack user survey, Oct 2016.
Using AWS in the context of New Zealand privacy considerations. Technical report,
Oct. 2016.
G. Agha. An overview of actor languages. SIGPLAN Notices, 21(10):58–67, June
T. Akidau. The world beyond batch: Streaming 102, Jan 2016.
T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma,
R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle. The dataflow
model: A practical approach to balancing correctness, latency, and cost in massive-
scale, unbounded, out-of-order data processing. Proceedings of the VLDB Endow-
T. Akidau and F. Perry. Dataflow/Beam and Spark: A Programming Model
Comparison, Feb 2016.
A. Aliper, S. Plis, A. Artemov, A. Ulloa, P. Mamoshina, and A. Zhavoronkov. Deep
learning applications for predicting pharmacological properties of drugs and drug
repurposing using transcriptomic data. Molecular Pharmaceutics, 2016.
W. Allcock, J. Bresnahan, R. Kettimuthu, M. Link, C. Dumitrescu, I. Raicu, and
I. Foster. The Globus striped GridFTP framework and server. In ACM/IEEE
Conference on Supercomputing, page 54, 2005.
B. Allen, J. Bresnahan, L. Childers, I. Foster, G. Kandaswamy, R. Kettimuthu,
J. Kordas, M. Link, S. Martin, K. Pickett, and S. Tuecke. Software as a service for
data scientists. Communications of the ACM, 55(2):81–88, Feb. 2012.
S. Anthony. How big is the Cloud? ExtremeTech, May 2012.
M. Armbrust, R. S. Xin, C. Lian, Y. Huai, D. Liu, J. K. Bradley, X. Meng, T. Kaftan,
M. J. Franklin, A. Ghodsi, et al. Spark SQL: Relational data processing in Spark. In
ACM SIGMOD International Conference on Management of Data, pages 1383–1394,
P. Bailis and K. Kingsbury. The network is reliable. Queue, 12(7):20, 2014.
R. Barga, J. Goldstein, M. Ali, and M. Hong. Consistent streaming through time:
A vision for event stream processing. In Conference on Innovative Data Systems
P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer,
I. Pratt, and A. Warfield. Xen and the art of virtualization. ACM SIGOPS Operating
Systems Review, 37(5):164–177, 2003.
W. Barnett, V. Welch, A. Walsh, and C. A. Stewart. A roadmap for using NSF
cyberinfrastructure with InCommon, 2011.
L. A. Barroso, J. Clidaras, and U. Hölzle. The datacenter as a computer: An
introduction to the design of warehouse-scale machines. Synthesis Lectures on
Computer Architecture, 8(3):1–154, 2013.
S. Beer. Brain of the Firm. Penguin Press, 1972.
D. Bernstein. Containers and cloud: From LXC to Docker to Kubernetes. IEEE
Cloud Computing, 1(3):81–84, 2014.
P. Bernstein, S. Berkov, J. Thelin, and S. Burkhardt. Orleans - Virtual Actors.
K. Bhuvaneshwar, D. Sulakhe, R. Gauba, A. Rodriguez, R. Madduri, U. Dave,
L. Lacinski, I. Foster, Y. Gusev, and S. Madhavan. A case study for cloud based high
throughput analysis of NGS data using the Globus Genomics system. Computational
and Structural Biotechnology Journal, 13:64–74, 2015.
M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D.
Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, and J. Zhao. End to end
learning for self-driving cars. arXiv preprint arXiv:1604.07316, 2016.
F. Bonomi, R. Milito, J. Zhu, and S. Addepalli. Fog computing and its role in the
internet of things. In MCC Workshop on Mobile Cloud Computing, pages 13–16.
ACM, 2012.
D. E. Boyle, D. C. Yates, and E. M. Yeatman. Urban sensor data streams: London
2013. IEEE Internet Computing, 17(6):12–20, 2013.
T. Bray. One Amazon year, December 2015.
E. Brewer. CAP twelve years later: How the “rules” have changed. Computer,
45(2):23–29, 2012.
E. Brewer. Kubernetes and the path to cloud native. In 6th ACM Symposium on
Cloud Computing, pages 167–167. ACM, 2015.
J. Bryce. Embracing datacenter diversity. In OpenStack Austin. 2016.
Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst. HaLoop: Efficient iterative data
processing on large clusters. Proceedings of the VLDB Endowment, 3(1-2):285–296,
S. Bugiel, S. Nürnberger, T. Pöppelmann, A.-R. Sadeghi, and T. Schneider. Ama-
zonIA: When elasticity snaps back. In 18th ACM conference on Computer and
Communications Security, pages 389–400. ACM, 2011.
J. Cantarella, C. Shonkwiler, and E. Uehara. A fast direct sampling algorithm for
equilateral closed polygons, Jan 2017.
C. Catlett, T. Malik, B. Goldstein, J. Giuffrida, Y. Shao, A. Panella, D. Eder, E. v.
Zanten, R. Mitchum, S. Thaler, and I. Foster. Plenario: An open data discovery
and exploration platform for urban science. Bulletin of the IEEE Computer Society
Technical Committee on Data Engineering, pages 27–42, 2014.
A. Caulfield, E. Chung, A. Putnam, H. Angepat, J. Fowers, M. Haselman, S. Heil,
M. Humphrey, P. Kaur, J.-Y. Kim, D. Lo, T. Massengill, K. Ovtcharov, M. Pa-
pamichael, L. Woods, S. Lanka, D. Chiou, and D. Burger. A cloud-scale acceleration
architecture. In 49th Annual IEEE/ACM International Symposium on Microarchi-
M. Cezar. Setting up NTP (Network Time Protocol) Server in RHEL/CentOS 7.
Tecmint, Mar 2015.
K. M. Chandy, O. Etzion, and R. von Ammon. Event processing. Dagstuhl Seminar
Proceedings 10201, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany,
K. Chard, S. Caton, O. F. Rana, and K. Bubendorfer. Social cloud: Cloud computing
in social networks. IEEE CLOUD, 10:99–106, 2010.
K. Chard, J. Pruyne, B. Blaiszik, R. Ananthakrishnan, S. Tuecke, and I. Foster.
Globus data publication as a service: Lowering barriers to reproducible science. In
11th IEEE International Conference on eScience, pages 401–410, 2015.
R. Chard, K. Chard, K. Bubendorfer, L. Lacinski, R. Madduri, and I. Foster. Cost-
aware cloud provisioning. In IEEE 11th International Conference on e-Science,
pages 136–144, 2015.
R. Chard, R. Madduri, N. Karonis, K. Chard, K. Duffin, C. Ordonez, T. Uram,
J. Fleischauer, I. Foster, M. Papka, and J. Winans. Scalable pCT image
reconstruction delivered as a cloud service. IEEE Transactions on Cloud Computing, 2015.
T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and
Z. Zhang. Mxnet: A flexible and efficient machine learning library for heterogeneous
distributed systems. CoRR, abs/1512.01274, 2015.
Y. Chen, V. Paxson, and R. H. Katz. What’s new about cloud computing security.
University of California, Berkeley Report No. UCB/EECS-2010-5, January 2010.
T. Cheng and J. Wang. Application of a Dynamic Recurrent Neural Network in
Spatio-Temporal Forecasting, pages 173–186. Springer Berlin Heidelberg, Berlin,
Heidelberg, 2007.
K. Cho, B. Van Merriënboer, D. Bahdanau, and Y. Bengio. On the properties
of neural machine translation: Encoder-decoder approaches. arXiv preprint
J. Clark. 5 numbers that illustrate the mind bending size of Amazon’s cloud.
Bloomberg Global Tech, Nov 2014.
Cloud Computing Security Working Group. NIST Cloud Computing Security
Reference Architecture. Special Publication 500-299, National Institute of
Standards and Technology, 2013.
D. T. Cohen, G. W. Hatchard, and S. G. Wilson. Population trends in incorporated
places: 2000 to 2013. Technical Report P25-1142, US Census, Mar 2015.
A. Conesa, P. Madrigal, S. Tarazona, D. Gomez-Cabrero, A. Cervera, A. McPherson,
M. W. Szcześniak, D. J. Gaffney, L. L. Elo, X. Zhang, et al. A survey of best
practices for RNA-seq data analysis. Genome Biology, 17(1):13, 2016.
F. J. Corbató and V. Vyssotsky. Introduction and overview of the Multics system.
IEEE Annals of the History of Computing, 14(2):12–13, 1992.
J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat,
A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li,
A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito,
M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google’s globally
distributed database. ACM Transactions on Computer Systems, 31(3):8, 2013.
T. Cowles, J. Delaney, J. Orcutt, and R. Weller. The Ocean Observatories Initiative:
Sustained ocean observing across a range of spatial scales. Marine Technology
Society Journal, 44(6):54–64, 2010.
D. R. Cox. The regression analysis of binary sequences. Journal of the Royal
Statistical Society. Series B (Methodological), pages 215–242, 1958.
R. J. Creasy. The origin of the VM/370 time-sharing system. IBM Journal of
Research and Development, 25(5):483–490, 1981.
J. Czyzyk, M. P. Mesnier, and J. J. Moré. The NEOS server. IEEE Computational
Science and Engineering, 5(3):68–75, 1998.
E. Dart, L. Rotman, B. Tierney, M. Hester, and J. Zurawski. The Science DMZ: A
network design pattern for data-intensive science. Scientific Programming,
22(2):173–185, 2014.
F. De Carlo. DMagic data management system.
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters.
Communications of the ACM, 51(1):107–113, 2008.
E. Deelman, K. Vahi, M. Rynge, G. Juve, R. Mayani, and R. F. da Silva. Pegasus
in the cloud: Science automation through workflow technologies. IEEE Internet
P. Dhingra, K. Tolle, and D. Gannon. Using cloud-based analytics to save lives.
Cloud Computing in Ocean and Atmospheric Sciences, page 221, 2016.
S. Dieleman. My solution for the Galaxy Zoo challenge, Apr 2014.
C. Docan, M. Parashar, and S. Klasky. DataSpaces: An interaction and coordination
framework for coupled simulation workflows. Cluster Computing, 15(2):163–181,
A. Dubey and D. Wagle. Delivering software as a service. The McKinsey Quarterly,
6(2007):2007, 2007.
D. Eadline. Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data
Computing in the Apache Hadoop 2 Ecosystem. Addison-Wesley, 2016.
G. Eisenhauer, M. Wolf, H. Abbasi, and K. Schwan. Event-based systems:
Opportunities and challenges at exascale. In 3rd ACM International Conference on
Distributed Event-Based Systems, 2009.
S. Ekanayake, S. Kamburugamuve, and G. Fox. SPIDAL: high performance data
analytics with Java and MPI on large multicore HPC clusters. In Spring Simulation
J. Elliott, D. Kelly, J. Chryssanthacopoulos, M. Glotter, K. Jhunjhnuwala, N. Best,
M. Wilde, and I. Foster. The parallel system for integrating impact models and
sectors (pSIMS). Environmental Modelling & Software, 62:509–516, 2014.
O. Etzioni. Deep learning isn’t a dangerous magic genie. It’s just math. Wired,
June 2016.
B. Familiar. Microservices, IoT and Azure: Leveraging DevOps and Microservice
Architecture to deliver SaaS Solutions. Apress, 2015.
M. R. Ferré. Cloud native applications (for dummies), 2014.
R. T. Fielding. Architectural styles and the design of network-based software archi-
tectures. PhD thesis, University of California, Irvine, 2000.
J. Fischer, S. Tuecke, I. Foster, and C. A. Stewart. Jetstream: A distributed cloud
infrastructure for underresourced higher education communities. In 1st Workshop on
The Science of Cyberinfrastructure: Research, Experience, Applications and Models,
pages 53–61. ACM, 2015.
I. Foster. Globus Online: Accelerating and democratizing science through
cloud-based services. IEEE Internet Computing, 15(3):70–73, May 2011.
I. Foster, K. Chard, and S. Tuecke. The discovery cloud: Accelerating and
democratizing research on a global scale. In IEEE International Conference on Cloud
Engineering, pages 68–77. IEEE, 2016.
I. Foster, R. Ghani, R. S. Jarmin, F. Kreuter, and J. I. Lane, editors. Big Data and
Social Science: A Practical Guide to Methods and Tools. Taylor & Francis Group,
2016. See also
I. Foster and C. Kesselman. The history of the grid. In High Performance Computing:
From Grids and Clouds to Exascale, pages 3–30. IOS Press, 2011.
I. Foster, Y. Zhao, I. Raicu, and S. Lu. Cloud computing and grid computing
360-degree compared. In Grid Computing Environments Workshop, pages 1–10.
IEEE, 2008.
A. Fox, D. A. Patterson, and S. Joseph. Engineering Software as a Service: An
Agile Approach using Cloud Computing. Strawberry Canyon LLC, 2013.
G. Fox and D. Gannon. Using clouds for technical computing, 2013.
G. Fox, S. Jha, and L. Ramakrishnan. Streaming and Steering Applications:
Requirements and Infrastructure.
G. C. Fox, R. D. Williams, and G. C. Messina. Parallel Computing Works! Morgan
Kaufmann, 2014.
B. H. Frank. AWS wants to dominate beyond the public cloud with Lambda updates.
PC World, Dec. 2016.
D. Gannon. Performance Analysis of a Cloud Microservice-based ML Classifier, Oct
D. Gannon. CNTK revisited. A new deep learning toolkit release from Microsoft, Nov
D. Gannon, D. Fay, D. Green, K. Takeda, and W. Yi. Science in the cloud: Lessons
from three years of research projects on Microsoft Azure. In 5th International
Workshop on Scientific Cloud Computing, pages 1–8. ACM, 2014.
Gartner Research. Software as a Service (SaaS).
K. Gee and W. Hunt. Enhancing stormwater management benefits of rainwater
harvesting via innovative technologies. Journal of Environmental Engineering,
142(8):04016039, 2016.
L. George. HBase: The Definitive Guide: Random Access to Your Planet-Size Data.
O’Reilly Media, Inc., 2011.
A. V. Gerbessiotis and L. G. Valiant. Direct bulk-synchronous parallel algorithms.
Journal of Parallel and Distributed Computing, 22(2):251–267, 1994.
S. Goasguen. Enjoy Kubernetes with Python.
J. Goecks, A. Nekrutenko, J. Taylor, and T. G. Team. Galaxy: A comprehensive
approach for supporting accessible, reproducible, and transparent computational
research in the life sciences. Genome Biol, 11(8):R86, 2010.
J. Gong, P. Yue, and H. Zhou. Geoprocessing in the Microsoft cloud comput-
ing platform–Azure. In Joint Symposium of ISPRS Technical Commission IV &
I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.
A. Graves, A. Mohamed, and G. Hinton. Speech recognition with deep recurrent
neural networks. In IEEE International Conference on Acoustics, Speech, and Signal
Processing, pages 6645–6649. IEEE, 2013.
A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. The cost of a cloud: Research
problems in data center networks. ACM SIGCOMM Computer Communication
K. Gremban. Get started with access management in the Azure portal.
W. Gropp, E. Lusk, and R. Thakur. Using MPI-2: Advanced Features of the Message
Passing Interface. MIT Press, 1999.
J. Han, M. Kamber, and J. Pei. Data Mining: Concepts and Techniques. Morgan
Kaufmann, 2011.
D. Hardt. OAuth 2.0 authorization framework specification, 2012.
J. A. Hartigan and M. A. Wong. Algorithm AS 136: A k-means clustering algorithm.
Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100–108,
K. Hashizume, D. G. Rosado, E. Fernández-Medina, and E. B. Fernandez. An
analysis of security issues for cloud computing. Journal of Internet Services and
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition.
T. Hey, S. Tansley, and K. Tolle. The Fourth Paradigm: Data-Intensive Scientific
B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. H. Katz,
S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing
in the data center. In USENIX Symposium on Networked Systems Design and
B. Holzman. Fermilab HEPCloud: An elastic computing facility for High Energy
Physics. In International Conference on Computing in High Energy Physics.2016.
A. Howard. Running MPI applications in Amazon EC2, May 2015.
W. Huang, A. Ganjali, B. H. Kim, S. Oh, and D. Lie. The state of public
infrastructure-as-a-service cloud security. ACM Computing Surveys, 47(4):68, 2015.
T. Hunt. Introducing “Have I been pwned?” aggregating accounts across
website breaches, Dec 2013.
A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia,
D. Gunter, D. Skinner, G. Ceder, et al. The materials project: A materials genome
approach to accelerating materials innovation. APL Materials, 1(1):011002, 2013.
S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wan-
derer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat. B4: Experience
with a globally-deployed software defined WAN. ACM SIGCOMM Computer
Communication Review, 43(4):3–14, 2013.
Y. Jia and E. Shelhamer. Caffe.
B. Johnson. Cloud computing is a trap, warns GNU founder Richard Stallman.
Guardian Newspaper, Sep 2008.
B. Jones. Towards the European open science cloud, 2015.
N. Jouppi. Google supercharges machine learning tasks with TPU custom chip, 2016.
S. Kamburugamuve and G. Fox. Survey of distributed stream processing, Feb 2016.
S. Kamburugamuve, P. Wickramasinghe, S. Ekanayake, and G. Fox. Anatomy of
machine learning algorithm implementations in MPI, Spark, and Flink, Jan 2017.
N. T. Karonis, K. L. Duffin, C. E. Ordoñez, B. Erdelyi, T. D. Uram, E. C. Olson,
G. Coutrakon, and M. E. Papka. Distributed and hardware accelerated computing
for clinical medical imaging using proton computed tomography (pCT). Journal of
Parallel and Distributed Computing, 73(12):1605–1612, 2013.
A. Karpathy. The unreasonable effectiveness of recurrent neural networks, Feb 2015.
M. Kassner. A look at Amazon’s world class data center ecosystem,
Dec 2014.
S. Kemp. Password-less logins with OpenSSH, 2005.
R. D. King, J. Rowland, S. G. Oliver, M. Young, W. Aubrey, E. Byrne, M. Liakata,
M. Markham, P. Pir, L. N. Soldatova, A. Sparkes, K. E. Whelan, and A. Clare. The
automation of science. Science, 324(5923):85–89, 2009.
G. Klimeck, M. McLennan, S. P. Brophy, G. B. Adams III, and M. S. Lundstrom. Advancing education and research in nanotechnology. Computing in
Science & Engineering, 10(5):17–23, 2008.
S. Kulkarni, N. Bhagat, M. Fu, V. Kedigehalli, C. Kellogg, S. Mittal, J. M. Patel,
K. Ramasamy, and S. Taneja. Twitter Heron: Stream processing at scale. In ACM
SIGMOD International Conference on Management of Data, pages 239–250, 2015.
H. S. Kuyuk, R. M. Allen, H. Brown, M. Hellweg, I. Henson, and D. Neuhauser. De-
signing a network-based earthquake early warning algorithm for California: ElarmS-2.
Bulletin of the Seismological Society of America, 2013.
M. Lamanna. The LHC computing grid project at CERN. Nuclear Instruments
and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors
and Associated Equipment, 534(1):1–6, 2004.
K. A. Lawrence, M. Zentner, N. Wilkins-Diehr, J. A. Wernert, M. Pierce, S. Marru,
and S. Michael. Science gateways today and tomorrow: Positive perspectives of
nearly 5000 members of the research community. Concurrency and Computation:
Practice and Experience, 27(16):4252–4268, 2015.
J. Layton. A container for HPC.
J. A. Le, H. El-Askary, M. Allali, and D. Struppa. Application of recurrent neural
networks for drought projections in California. Atmospheric Research, 187, 2017.
H. Lee. Simple Azure.
P. D. Lena, K. Nagata, and P. F. Baldi. Deep spatio-temporal architectures
and learning for protein structure prediction. In Advances in Neural Information
Processing Systems 25, pages 512–520. Curran Associates, Inc., 2012.
Y. Li. Introduction to Docker secrets management.
D. Lifka, I. Foster, S. Mehringer, M. Parashar, P. Redfern, C. Stewart, and S. Tuecke.
XSEDE cloud survey report, 2013.
I. Liu and B. Ramakrishnan. Bach in 2014: Music composition with recurrent neural
network. CoRR, abs/1412.3191, 2014.
Q. Liu, J. Logan, Y. Tian, H. Abbasi, N. Podhorszki, J. Y. Choi, S. Klasky, R. Tchoua,
J. Lofstead, R. Oldfield, M. Parashar, N. Samatova, K. Schwan, A. Shoshani, M. Wolf,
K. Wu, and W. Yu. Hello ADIOS: The challenges and lessons of developing leadership
class I/O frameworks. Concurrency and Computation: Practice and Experience,
26(7):1453–1473, 2014.
Y. Liu, A. Padmanabhan, and S. Wang. CyberGIS Gateway for enabling data-rich
geospatial research and education. Concurrency and Computation: Practice and
R. Madduri, K. Chard, R. Chard, L. Lacinski, A. Rodriguez, D. Sulakhe, D. Kelly,
U. Dave, and I. Foster. The Globus Galaxies platform: Delivering science gateways
as a service. Concurrency and Computation: Practice and Experience,
27(16):4344–4360, 2015.
R. Madduri, D. Sulakhe, L. Lacinski, B. Liu, A. Rodriguez, K. Chard, U. J. Dave, and
I. T. Foster. Experiences building Globus Genomics: A next-generation sequencing
analysis service using Galaxy, Globus, and Amazon Web Services. Concurrency and
Computation: Practice and Experience, 26(13):2266–2279, 2014.
P. K. Mantha, A. Luckow, and S. Jha. Pilot-MapReduce: An extensible and flexible
MapReduce implementation for distributed data. In 3rd International Workshop on
MapReduce and Its Applications, pages 17–24. ACM, 2012.
J. Margolis. Amazon Echo’s role in deep space exploration. Financial Times, Jan
N. Marz and J. Warren. Big Data Principles and Best Practices of Scalable
Realtime Data Systems. Manning, 2015.
A. Matsunaga, J. Fortes, K. Keahey, and M. Tsugawa. Sky computing. IEEE
Internet Computing, 13:43–51, 2009.
K. Matthias and S. P. Kane. Docker: Up and Running. O’Reilly, 2016.
W. McKinney. Python for Data Analysis: Data Wrangling with Pandas, NumPy,
and IPython. O’Reilly Media, 2015.
N. Mehrotra, L. Franks, P. McKay, R. McAllister, and J. Gao. Get started:
Create Apache Spark cluster in Azure HDInsight and run interactive queries using
Spark SQL.
N. Mehrotra, R. McMurray, L. Franks, and J. Gao. Machine learning: Predictive
analysis on food inspection data using MLlib with Apache Spark cluster
on HDInsight Linux.
P. Mehrotra, J. Djomehri, S. Heistand, R. Hood, H. Jin, A. Lazano, S. Saini, and
R. Biswas. Performance evaluation of Amazon EC2 for NASA HPC applications.
In 3rd Workshop on Scientific Cloud Computing, pages 41–50. ACM, 2012.
P. Mell and T. Grance. The NIST definition of cloud computing. Special Publication
800-145, National Institute of Standards and Technology, 2011.
X. Meng, J. Bradley, B. Yavuz, E. Sparks, S. Venkataraman, D. Liu, J. Freeman,
D. Tsai, M. Amde, S. Owen, D. Xin, R. Xin, M. J. Franklin, R. Zadeh, M. Zaharia,
and A. Talwalkar. MLlib: Machine learning in Apache Spark. Journal of Machine
Learning Research, 17(34):1–7, 2016.
F. Meyer, D. Paarmann, M. D’Souza, R. Olson, E. M. Glass, M. Kubal, T. Paczian,
A. Rodriguez, R. Stevens, A. Wilke, et al. The metagenomics RAST server–A public
resource for the automatic phylogenetic and functional analysis of metagenomes.
BMC Bioinformatics, 9(1):386, 2008.
Microsoft Research Connections. MSR Courseware.
M. A. Miller, W. Pfeiffer, and T. Schwartz. Creating the CIPRES science gateway
for inference of large phylogenetic trees. In Gateway Computing Environments
D. Milojičić, I. M. Llorente, and R. S. Montero. OpenNebula: A cloud management
tool. IEEE Internet Computing, 15(2):11–14, 2011.
N. M. Mohamed, H. Lin, and W.-C. Feng. Accelerating data-intensive genome
analysis in the cloud. In 5th International Conference on Bioinformatics and
Computational Biology. 2013.
T. P. Morgan. A rare peek at the massive scale of AWS. EnterpriseTech,
Nov 2014.
A. Morin, J. Urban, P. D. Adams, I. Foster, A. Sali, D. Baker, and P. Sliz. Shining
light into black boxes. Science, 336(6078):159–160, 2012.
A. Mouat. Docker security: Using containers safely in production.
A. C. Muller and S. Guido. Introduction to Machine Learning with Python: A Guide
for Data Scientists. O’Reilly Publishing, 2017.
N. Nakata, J. P. Chang, J. F. Lawrence, and P. Boué. Body wave extraction and
tomography at Long Beach, California, with ambient-noise interferometry. Journal of
Geophysical Research: Solid Earth, 120(2):1159–1173, 2015.
F. Nelli. Python Data Analytics: Data Analysis and Science using Pandas, Matplotlib
and the Python Programming Language. Apress, 2015.
M. A. Nielsen. Neural Networks and Deep Learning. Determination Press, 2015.
B. Nikolic. Data processing for the Square Kilometre Array telescope.
D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and
D. Zagorodnov. The Eucalyptus open-source cloud-computing system. In 9th
IEEE/ACM International Symposium on Cluster Computing and the Grid, pages
124–131, 2009.
C. Olah. Understanding LSTM networks, Aug 2015.
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig latin: A not-so-
foreign language for data processing. In ACM SIGMOD International Conference
on Management of Data, pages 1099–1110, 2008.
R. Orihuela and D. Bass. Help wanted: Black belts in data, Jun 2015.
K. Ovtcharov, O. Ruwase, J.-Y. Kim, J. Fowers, K. Strauss, and E. Chung. Toward
accelerating deep learning at scale using specialized hardware in the datacenter. In
27th HotChips Symposium on High-Performance Chips. IEEE, August 2015.
D. F. Parkhill. The Challenge of the Computer Utility. Addison-Wesley Educational
Publishers, 1966.
N. Paskin. Digital object identifier (DOI) system. Encyclopedia of Library and
Information Sciences, 3:1586–1592, 2010.
F. Pérez and B. Granger. The state of Jupyter, Jan 2017.
D. A. Phillips, C. Puskas, Santillan, L. M., Wang, R. W. King, W. M. Szeliga,
T. Melbourne, M. Murray, M. Floyd, and T. A. Herring. Plate Boundary Observatory
and related networks: GPS data analysis methods and geodetic products. Reviews
of Geophysics, 54:759–808, 2016.
I. Raicu, I. Foster, and Y. Zhao. Many-task computing for grids and supercomputers.
In IEEE Workshop on Many-Task Computing on Grids and Supercomputers,2008.
K. Ram. Git can facilitate greater reproducibility and increased transparency in
science. Source Code for Biology and Medicine, 8(1):7, 2013.
L. Ramakrishnan, P. T. Zbiegel, S. Campbell, R. Bradshaw, R. S. Canon, S. Coghlan,
I. Sakrejda, N. Desai, T. Declerck, and A. Liu. Magellan: Experiences from a science
cloud. In 2nd International Workshop on Scientific Cloud Computing, pages 49–58.
ACM, 2011.
S. Raschka. Python Machine Learning. Packt Publishing, 2016.
K. Reitz. Requests: HTTP for humans.
J. Richer. OAuth 2.0 token introspection. RFC 7662, IETF, 2015.
M. Rosenblum and T. Garfinkel. Virtual machine monitors: Current technology
and future trends. Computer, 38(5):39–47, 2005.
M. Russinovich. Report from Open Networking Summit: Achieving hyper-scale
with software defined networking.
S. Ryza, U. Laserson, S. Owen, and J. Wills. Advanced Analytics with Spark:
Patterns for Learning from Data at Scale. O’Reilly Media, 2015.
N. Sakimura, J. Bradley, M. Jones, B. d. Medeiros, and C. Mortimore. OpenID
Connect Core 1.0 incorporating errata set 1, 2014.
D. Sanderson. Programming Google App Engine with Python: Build and Run
Scalable Python Apps on Google’s Infrastructure. O’Reilly Press, 2015.
M. Satyanarayanan. The emergence of edge computing. Computer, 50(1):30–39,
C. Severance. Python for informatics: Exploring information, 2013.
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driess-
che, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Diele-
man, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach,
K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep
neural networks and tree search. Nature, 529(7587):484–489, 2016.
F. Simorjay. Shared responsibilities for cloud computing. Technical re-
port, Microsoft, Mar 2016.
A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving,
G. Desai, B. Felderman, P. Germano, et al. Jupiter rising: A decade of Clos
topologies and centralized control in Google’s datacenter network. ACM SIGCOMM
Computer Communication Review, 45(4):183–197, 2015.
L. Smarr and C. E. Catlett. Metacomputing. Communications of the ACM,
35(6):44–53, 1992.
R. M. Stallman. Who does that server really serve? Boston Review, 35(2), 2010.
R. Stevens, P. Woodward, T. DeFanti, and C. Catlett. From the I-WAY to the
National Technology Grid. Communications of the ACM, 40(11):50–60, 1997.
C. Strasser. Git/GitHub: A primer for researchers, 2014.
A. Szalay and J. Gray. The world-wide telescope. Science, 293(5537):2037–2040,
T. Tetrick. Best practices for securing access to your Azure virtual machines, 2014.
D. Thain, T. Tannenbaum, and M. Livny. Distributed computing in practice:
The Condor experience. Concurrency and Computation: Practice and Experience,
17(2-4):323–356, 2005.
B. Tierney, J. Metzger, J. Boote, E. Boyd, A. Brown, R. Carlson, M. Zekauskas,
J. Zurawski, M. Swany, and M. Grigoriev. perfSONAR: Instantiating a global network
measurement framework. In SOSP Workshop on Real Overlays and Distributed
J. Towns, T. Cockerill, M. Dahan, I. Foster, K. Gaither, A. Grimshaw, V. Hazlewood,
S. Lathrop, D. Lifka, G. D. Peterson, R. Roskies, J. R. Scott, and N. Wilkins-Diehr.
XSEDE: Accelerating scientific discovery. Computing in Science & Engineering,
16(5):62–74, 2014.
R. Tudoran, A. Costan, G. Antoniu, and H. Soncu. TomusBlobs: Towards
communication-efficient storage for MapReduce applications in Azure. In 12th
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages
427–434, 2012.
S. Tuecke, R. Ananthakrishnan, K. Chard, M. Lidman, B. McCollam, and I. Foster.
Globus Auth: A research identity and access management platform. In 12th IEEE
International Conference on e-Science, 2016.
T. Tugend. UCLA to be first station in nationwide computer network, July 1969.
J. Turnbull. The Docker Book: Containerization is the New Virtualization. Kindle,
A. Vahdat. A look inside Google’s data center networks, 2015.
J. van Vliet and F. Paganelli. Programming AWS EC2. O’Reilly Press, 2011.
T. C. Vance, N. Merati, C. Yang, and M. Yuan. Cloud Computing in Ocean and
Atmospheric Sciences. Elsevier, 2016.
J. VanderPlas. Python Data Science Handbook: Essential Tools for Working with
J. Varia. Tips for securing your EC2 instance.
N. Vijayakumar and B. Plale. Performance evaluation of rate-based join window
sizing for asynchronous data streams. In 13th IEEE International Symposium on
High Performance Distributed Computing, pages 260–261, 2004.
W. Vogels. MXNet - Deep learning framework of choice at AWS, Nov 2016.
M. M. Waldrop. The Dream Machine: JCR Licklider and the Revolution that Made
Computing Personal. Viking Penguin, 2001.
T. White. Hadoop: The Definitive Guide. O’Reilly Media, Inc., 2012.
M. Wilde, M. Hategan, J. M. Wozniak, B. Clifford, D. S. Katz, and I. Foster. Swift:
A language for distributed parallel scripting. Parallel Computing, 37(9):633–652,
J. Wilkening, A. Wilke, N. Desai, and F. Meyer. Using clouds for metagenomics: A
case study. In IEEE International Conference on Cluster Computing, pages 1–6,
N. Wilkins-Diehr, D. Gannon, G. Klimeck, S. Oster, and S. Pamidighantam.
TeraGrid science gateways and their impact on science. Computer, 41(11), 2008.
K. Williams, E. Bilsland, A. Sparkes, W. Aubrey, M. Young, L. N. Soldatova,
K. De Grave, J. Ramon, M. de Clare, W. Sirawaraporn, S. G. Oliver, and R. D. King.
Cheaper faster drug development validated by the repositioning of drugs against
neglected tropical diseases. Journal of the Royal Society Interface, 12(104):20141289,
A. Wittig and M. Wittig. Amazon Web Services in Action. Manning Press, 2015.
D. Xue, P. V. Balachandran, J. Hogden, J. Theiler, D. Xue, and T. Lookman.
Accelerated search for materials with targeted properties by adaptive design. Nature
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark:
Cluster computing with working sets. In 2nd USENIX Workshop on Hot Topics in
Cloud Computing, 2010.
Y. Zheng, X. Chen, Q. Jin, Y. Chen, X. Qu, X. Liu, E. Chang, W.-Y. Ma, Y. Rui,
and W. Sun. A cloud-based knowledge discovery system for monitoring fine-grained
air quality. Technical Report MSR-TR-2014–40, Microsoft Research, 2014.