Connecting to a remote HPC system
Overview
Teaching: 5 min
Exercises: 0 minQuestions
How do I log in to a remote HPC system?
Objectives
Access Open OnDemand.
Connect to a remote HPC system.
Secure Connections
The first step in using a cluster is to establish a connection from our laptop to the cluster. When we are sitting at a computer (or standing, or holding it in our hands or on our wrists), we have come to expect a visual display with icons, widgets, and perhaps some windows or applications: a graphical user interface, or GUI. Since computer clusters are remote resources that we connect to over slow or intermittent interfaces (WiFi and VPNs especially), it is more practical to use a command-line interface, or CLI, to send commands as plain-text. If a command returns output, it is printed as plain text as well. The commands we run today will not open a window to show graphical results.
If you have ever opened the Windows Command Prompt or macOS Terminal, you have seen a CLI. If you have already taken The Carpentries’ courses on the UNIX Shell or Version Control, you have used the CLI on your local machine extensively. The only leap to be made here is to open a CLI on a remote machine, while taking some precautions so that other folks on the network can’t see (or change) the commands you’re running or the results the remote machine sends back. We will use the Secure SHell protocol (or SSH) to open an encrypted network connection between two machines, allowing you to send & receive text and data without having to worry about prying eyes.
SSH clients are usually command-line tools, where you provide the remote
machine address as the only required argument. If your username on the remote
system differs from what you use locally, you must provide that as well. If
your SSH client has a graphical front-end, such as PuTTY or MobaXterm, you will
set these arguments before clicking “connect.” From the terminal, you’ll write
something like ssh userName@hostname
, where the argument is just like an
email address: the “@” symbol is used to separate the personal ID from the
address of the remote machine.
Connecting with OnDemand
Instead of using a shell application installed on your computer, you can shortcut much of this process using a shell in your browser that directly connects to the cluster.
To do so, login to the cluster’s Open OnDemand interface at this link: https://ondemand.hpcc.msu.edu. If you are prompted, choose Michigan State University as the Identity Provider, and login using your site credentials.
Here, you will find many different interactive applications and ways access the cluster. It is possible that these may be the only ways you work with the cluster in the future! But for now, we will make sure that you can get started using the command line since it is traditionally the most common and flexible access method.
To open a command line connected to the cluster from OnDemand, choose any entry from the Development Nodes tab. For the remainder of this lesson, we will use dev-amd20.
Looking Around Your Remote Home
Very often, many users are tempted to think of a high-performance computing
installation as one giant, magical machine. Sometimes, people will assume that
the computer they’ve logged onto is the entire computing cluster. So what’s
really happening? What computer have we logged on to? The name of the current
computer we are logged onto can be checked with the hostname
command. (You
may also notice that the current hostname is also part of our prompt!)
[netid@dev-amd20 ~]$ hostname
dev-amd20
So, we’re definitely on the remote machine. Next, let’s find out where we are
by running pwd
to print the working directory.
[netid@dev-amd20 ~]$ pwd
/mnt/home/netid
Great, we know where we are! Let’s see what’s in our current directory:
[netid@dev-amd20 ~]$ ls
id_ed25519.pub
The system administrators may have configured your home directory with some helpful files, folders, and links (shortcuts) to space reserved for you on other filesystems. If they did not, your home directory may appear empty. To double-check, include hidden files in your directory listing:
[netid@dev-amd20 ~]$ ls -a
. .bashrc id_ed25519.pub
.. .ssh
In the first column, .
is a reference to the current directory and ..
a
reference to its parent (/mnt/home
). You may or may not see
the other files, or files like them: .bashrc
is a shell configuration file,
which you can edit with your preferences; and .ssh
is a directory storing SSH
keys and a record of authorized connections.
Key Points
An HPC system is a set of networked machines.
HPC systems typically provide login nodes and a set of worker nodes.
The resources found on independent (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
Files saved on one node are available on all nodes.