Nature of Cloud Computing

Some Motivation at Amazon.com

  • Massive IT infrastructure supports the Amazon store and company
  • They wanted to sell their shopping application as a service to companies like Target that didn't want to run their own store. This required the software developers to have lots of flexible infrastructure (servers) to run on.
  • They found that a team building a service (with software) could spend 70% of its time setting up the 'back end'
  • They called all the infrastructure needed to run a massive dot-com "muck" and saw this as a secondary supporting role to application development. What they wanted in days actually took months.

Eureka moment for Amazon: we could sell it

  • Amazon automated their IT department so teams could order and provision the servers they needed on demand, going beyond just virtualization ("everything was an API")
  • They got really good at running very large data centers for many customers as cheaply as possible and on-demand for Amazon.com and other stores and services.
  • They realized that their innovations would help any IT organization and especially internet start-ups like themselves, and that they could sell it.
  • Their customers were other IT departments
    Blog Post from 2006: "We Build Muck, So You Don’t Have To"

NIST definition of cloud

Government offices interested in purchasing cloud computing needed a definition to distinguish it from other kinds of computing, hence the NIST definition of the essential characteristics of cloud computing:

  • On-demand self-service.
  • Measured service: pay for what you get.
  • Broad network access: accessible from the internet
  • Rapid elasticity: no limits from a customer perspective. The term "elastic" was popularized by AWS (as in Elastic Compute Cloud)
  • Resource pooling: single resources serve many customers.

What is Cloud Computing? Cloud concepts vs Cloud Providers

Benefits of Cloud Computing for Research

  • Customized Computing: can create customized resources only when you need them
  • Elastic/On-demand: can run ad-hoc computations on those on-demand resources
  • Instant service:
  • Reproducible: a computation can be re-run as needed, meaning cloud resources can be easily re-created to re-run your computations.
  • Cost effective: unlike commercial applications, more users do not mean more revenue. Budgets are fixed and the pay-as-you-go model requires vigilance to avoid overspending.
  • Others?

Restatement of goals of this Cloud Computing Fellowship:

  • Learn which types of computing resources are beneficial to your research
  • Learn how to use Cloud to create those resources
  • Use the services packaged by cloud companies to discover new resources

Using workflow and computational thinking

  • Karl Popper stated that "non-reproducible single occurrences are of no significance to science" (K. Popper, "The Logic of Scientific Discovery", English translation from Routledge, London, 1992, p. 66), and this is a significant issue for research based on computing.

To enhance reproducibility in your own work, consider documenting all the steps needed to create the environment that runs your computation. For many on-premise academic systems (e.g. the MSU HPCC), we depend upon the system administrators to create that environment, although we may still install and configure the software we need to run our code.

Workflow thinking can apply to the scientific domain itself (e.g. "Principles for data analysis workflows" https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008770 ) and to the provisioning of the cloud computing environment. That is, we may use one workflow system to create all the cloud resources we need, and then a different workflow system that runs on those resources. For example, we may create an HPC system on Azure using templates and then launch the Slurm scheduler on that HPC system to run our jobs. (Note: the complexity of running your own HPC system is beyond the scope of this fellowship and is used as an example only.)

A major advantage to using workflows or code for provisioning your cloud computing components is that you can turn them off and delete them when you are done, and restart when needed.

Our first uses of cloud will use forms to create resources, but we encourage you to automate where possible.

About Cloud Security

Security and risk management are important issues even for researchers whose data are open:
- If your computer is a server, your responsibility just increased 100X: these are prime targets. Consider each component of a server to be a point of vulnerability.
- Finding a readable list of security recommendations for cloud computing is a challenge for all the reasons outlined above. Our textbook has a nice chapter outlining cloud security.
- We will cover methods to reduce security risks, but it's important to consider the risk of hacking from the beginning.

Attackers may use the services you create to launch attacks on other services, leaving you liable.

  • The "Shared responsibility" model for cloud computing takes a model of computing components and shows how much of each component's security the user is responsible for.

Microsoft Model of Shared Responsibility for Cloud Computing

We will come back to this model as we gain deeper understanding of research computing on the cloud.

HPCC vs Cloud

The HPCC is amazingly effective at running all kinds of systems at very little cost, if any, to MSU researchers, but not all systems are the best fit.

Many systems not designed for HPC can be adjusted to run in that environment. However, just as many workflows are difficult to port from HPC to cloud, some cloud workflows are difficult to run on HPC (but never say never), especially Windows-based software.

Acknowledging bias in access to cloud computing across research cultures

It's widely recognized that AI is frequently biased. For example, Azure Voice recognition did not work for a female researcher who developed voice-controlled surgery, so

However, I believe there is also inherent bias in the user interfaces, design, and definitions in the engineering of technology across many axes of diversity (gender, culture, background, training, creativity, etc.). System Engineering is its own discipline and cloud computing is arcane, so our goal is to reduce conceptual barriers to using this technology while you work with us.

About Cloud Costs

  • Cost management is a major hurdle for adopting cloud computing, so we will talk about costs extensively
  • (Almost) everything you do in Azure has a cost
  • Costs often accrue over time, whether the resource is in use or not
  • Deleting resources you are not using is a great way to reduce cost
  • We want to encourage you to experiment! Using a very powerful machine for an hour may cost only $0.50
  • Just be aware that creating something and leaving it on will deplete your budget
  • Solution: "Budget Alerts"
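To make the point about accruing costs concrete, here is a rough sketch in Python. The $0.50/hour rate comes from the bullet above; the 30-day month, the $400 budget, and the alert thresholds are assumptions chosen for illustration, and `budget_alerts` is a toy function, not the Azure feature itself.

```python
def accrued_cost(hourly_rate, hours_provisioned):
    """Cost of a resource billed by the hour for as long as it exists."""
    return hourly_rate * hours_provisioned


def budget_alerts(spent, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return the budget fractions that current spending has reached."""
    return [t for t in thresholds if spent >= t * budget]


HOURLY_RATE = 0.50         # dollars per hour, from the example above
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month
BUDGET = 400.0             # assumption: a hypothetical monthly budget in dollars

one_hour = accrued_cost(HOURLY_RATE, 1)
left_on = accrued_cost(HOURLY_RATE, HOURS_PER_MONTH)

print(f"one hour: ${one_hour:.2f}")
print(f"left on for a month: ${left_on:.2f}")
print("thresholds reached:", budget_alerts(left_on, BUDGET))
```

Under these assumptions, the same machine that costs $0.50 for an hour of experimenting costs $360 if it is left running for a month, which would pass the 50% and 80% marks of the hypothetical budget.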

Case Study: Computation of a machine learning model based on gene networks for inferring gene association (https://www.geneplexus.net): a single (virtual) machine powerful enough to run the ML so that users would not have to wait too long would cost $650/month. However, if the computational power is provisioned only when needed, the cost is 5 cents/job.
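The two prices in the case study imply a break-even point, which can be worked out with a short Python sketch. The prices are the ones quoted above; the monthly job counts in the loop are hypothetical, and the comparison ignores any other charges (storage, networking) that a real deployment might have.

```python
# Prices in cents so the arithmetic stays exact.
ALWAYS_ON_CENTS_PER_MONTH = 65_000  # $650/month, from the case study
ON_DEMAND_CENTS_PER_JOB = 5         # 5 cents/job, from the case study


def on_demand_cents(jobs_per_month):
    """Monthly cost in cents when compute is provisioned per job."""
    return jobs_per_month * ON_DEMAND_CENTS_PER_JOB


# Jobs per month at which both options cost the same.
break_even_jobs = ALWAYS_ON_CENTS_PER_MONTH // ON_DEMAND_CENTS_PER_JOB


def cheaper_option(jobs_per_month):
    """Name the cheaper option for a given monthly job count."""
    cost = on_demand_cents(jobs_per_month)
    if cost < ALWAYS_ON_CENTS_PER_MONTH:
        return "on-demand"
    if cost > ALWAYS_ON_CENTS_PER_MONTH:
        return "always-on"
    return "equal"


print(f"break-even: {break_even_jobs} jobs/month")
for jobs in (100, 1_000, 20_000):  # hypothetical monthly job counts
    dollars = on_demand_cents(jobs) / 100
    print(f"{jobs:>6} jobs: on-demand ${dollars:,.2f} -> {cheaper_option(jobs)}")
```

The always-on machine only becomes the cheaper choice above 13,000 jobs per month; below that, provisioning on demand wins.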

Value Proposition of Cloud Computing

  • Costs are more than just dollars for services. Consider [Total Cost] = ( $ + Time + Risk )
  • [Total Time] = ( development time + wait time + compute time )
  • Security risks are rarely insignificant, so factor them into cost
  • In the service-level spectrum, the higher-level "platform" services may have higher monetary costs but often reduce time and risk
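One possible way to put the three terms on a common scale is sketched below in Python. Everything beyond the two bracketed formulas is an assumption made for illustration: time is converted to dollars with an assumed hourly value, risk is treated as an expected loss (probability times impact), and all the numbers for the two options are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    dollars: float            # monetary cost of the services
    development_hours: float
    wait_hours: float
    compute_hours: float
    risk_probability: float   # assumption: chance of a security incident
    risk_impact: float        # assumption: dollar cost if an incident happens

    def total_time(self):
        """[Total Time] = development time + wait time + compute time."""
        return self.development_hours + self.wait_hours + self.compute_hours

    def total_cost(self, hourly_value):
        """[Total Cost] = $ + Time + Risk, with time and risk priced in dollars."""
        time_cost = self.total_time() * hourly_value
        risk_cost = self.risk_probability * self.risk_impact
        return self.dollars + time_cost + risk_cost


# Hypothetical numbers: a self-managed VM versus a higher-level platform service.
vm = Option("self-managed VM", 100.0, 40.0, 2.0, 10.0, 0.05, 10_000.0)
platform = Option("platform service", 300.0, 8.0, 1.0, 10.0, 0.01, 10_000.0)

HOURLY_VALUE = 50.0  # assumption: what an hour of researcher time is worth

for option in (vm, platform):
    print(f"{option.name}: {option.total_time():.0f} hours, "
          f"total cost ${option.total_cost(HOURLY_VALUE):,.2f}")
```

With these made-up inputs the platform service has the higher bill but the lower total cost, which is the pattern described in the last bullet. Different assumptions could easily reverse the result.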