Running interactive jobs on Discovery

Okay, let us say we want to use a node interactively.

If you want to take advantage of Discovery while also using graphical software (such as GaussView), you need to start by logging in with X11 forwarding (ssh -X).

You can find more details about loading X11 here.

Remember how we do not want to do anything resource-intensive on the login nodes (discovery2 and discovery4)? Doing so would cause everyone difficulties, and ruffle some jimmies. Requesting an interactive job and logging in to a compute node is the solution.

First we need to request a compute node to use.

$ salloc -N 1 --exclusive -p ht-10g
salloc: Granted job allocation 649056

This requests exclusive use of a free node on partition ht-10g. (It makes sense to check the queues before choosing a partition; for example, sinfo | grep idle lists the idle nodes in each partition. One of the ser-par-10g- partitions is usually a good bet.)
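If you would rather get a per-partition count than read the grep output by eye, the same check can be sketched in Python. This is only a sketch: it assumes sinfo's default column layout (PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST), and the sample text, including the partition name example-part and all the node counts, is made up for illustration rather than taken from Discovery.

```python
from collections import Counter

# Hypothetical sample of `sinfo` output in Slurm's default column layout.
# Partition names and node counts are illustrative, not real Discovery data.
SAMPLE_SINFO = """\
PARTITION     AVAIL  TIMELIMIT  NODES  STATE NODELIST
ht-10g           up 1-00:00:00      3   idle compute-0-[004-006]
ht-10g           up 1-00:00:00      5  alloc compute-0-[007-011]
example-part     up 1-00:00:00     12   idle compute-1-[000-011]
"""


def idle_nodes_by_partition(sinfo_text):
    """Sum the NODES column for every row whose STATE is exactly 'idle'."""
    counts = Counter()
    for line in sinfo_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed rows
        partition, _avail, _limit, nodes, state = fields[:5]
        if state == "idle":
            # sinfo marks the default partition with a trailing '*'
            counts[partition.rstrip("*")] += int(nodes)
    return dict(counts)


print(idle_nodes_by_partition(SAMPLE_SINFO))
```

On the login node you could feed it real output instead of the sample, for example from subprocess.check_output(["sinfo"], universal_newlines=True).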

To find out which node you were allocated by Slurm:

$ squeue -u <username>

where <username> is your username, e.g. husky.id. The output will look like this:

         JOBID PARTITION     NAME        USER ST       TIME  NODES NODELIST(REASON)
        649056    ht-10g     bash  <username>  R       0:08      1 compute-0-006

This tells us our job 649056 is running on node compute-0-006. Then log in to your shiny new compute node! (If you did not log in with -X, exit, then log in again.)

$ ssh -X compute-0-006

Sometimes it may prompt you for a password. Just press Ctrl-C and try ssh -X <compute-node> again. Note that if you didn't use -X when sshing in to the Discovery login node from your computer, you shouldn't use -X when sshing from the login node to the compute node.
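If you want to pick the node name out of the squeue output in a script rather than by eye, a minimal sketch follows. It assumes the default squeue column layout shown earlier and a job name without spaces; node_for_job is a hypothetical helper name, not part of Slurm.

```python
# The squeue output from the example above, pasted in as sample text.
SAMPLE_SQUEUE = """\
         JOBID PARTITION     NAME        USER ST       TIME  NODES NODELIST(REASON)
        649056    ht-10g     bash  <username>  R       0:08      1 compute-0-006
"""


def node_for_job(squeue_text, job_id):
    """Return the NODELIST(REASON) field of a running job, or None if not found."""
    for line in squeue_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Column 0 is JOBID, column 4 is ST (R = running), column 7 is the node list.
        if len(fields) >= 8 and fields[0] == str(job_id) and fields[4] == "R":
            return fields[7]
    return None


print(node_for_job(SAMPLE_SQUEUE, 649056))
```

A job that is still pending has a reason such as (Resources) in the last column instead of a node name, which is why the sketch only accepts rows in state R.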

Now you're in an interactive session, so let's revisit matplotlib!

[<user_name>@compute-0-006 ~]$

Using matplotlib

We managed to install matplotlib before in the python packages tutorial, but can we use it?

First load ipython, then try from matplotlib import pyplot as plt, as we are used to doing.

$ ipython
Python 3.5.2 (default, Sep 19 2016, 11:10:34)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from matplotlib import pyplot as plt

This will give a long error message, with the clue # If this fails your Python may not be configured for Tk. Googling this will eventually lead you to try the following, but I'll save you some time. You have to exit() your ipython session and start a new one ($ ipython) because you have to tweak the matplotlib configuration before importing pyplot for the first time:

In [1]: import matplotlib

In [2]: matplotlib.use('Agg')

In [3]: from matplotlib import pyplot as plt

OK so far: no errors if you did the python packages tutorial! What we did is change the 'back end' that matplotlib uses, because Discovery is not set up for the default Tk-based back end but does have 'Agg', which draws figures to image files instead of a window. You'll have to do this in any scripts that use matplotlib. Now let's try to make a plot.

In [4]: plt.plot(range(5))
Out[4]: [<matplotlib.lines.Line2D at 0x7fc098b02a90>]

OK, but how do we see it?

In [5]: plt.show()

Don't see anything? Because 'Agg' is a non-interactive back end, plt.show() has no window to open. With an interactive back end, an X11 window system installed locally, and X11 forwarding (ssh -X), a local window would appear. (This is where you would find out that you need to use -Y instead of -X, but practice makes perfect!) But when running a script on a remote computer it's usually more helpful to just save the figure to a file and retrieve it later:

In [6]: plt.savefig("my_figure.pdf")

In [7]: exit()

Now look for the file:

$ ls *.pdf
my_figure.pdf

OK, it's there! You can retrieve it using some SCP or SFTP commands, or preferably a client with a nice GUI (my favorite is currently Forklift, but there are many good free ones to choose from; I used Cyberduck for a while, and FileZilla is commonly mentioned).
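Since the back end has to be chosen in every script that uses matplotlib, here is a minimal sketch of the interactive steps above collected into one script. The function name save_line_plot is made up for illustration; the important part is that matplotlib.use("Agg") runs before pyplot is imported.

```python
import os

import matplotlib

matplotlib.use("Agg")  # must happen before pyplot is imported
from matplotlib import pyplot as plt


def save_line_plot(path="my_figure.pdf"):
    """Plot range(5), as in the session above, and write it to `path`."""
    fig, ax = plt.subplots()
    ax.plot(range(5))
    fig.savefig(path)  # the file type is chosen from the extension
    plt.close(fig)     # free the figure's memory once it is saved
    return os.path.abspath(path)


if __name__ == "__main__":
    print("saved", save_line_plot())
```

Saved as a file, this can be run non-interactively on a compute node, and the resulting PDF copied back in the same way as before.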

Once you are done with your interactive session, log out of the compute node by typing exit and then release the allocation using scancel <jobID>, e.g.:

[r.west@compute-0-006 ~]$ exit
logout
Connection to compute-0-006 closed.
[r.west@discovery2 ~]$ scancel 649056
salloc: Job allocation 649056 has been revoked.
Hangup
[r.west@discovery2 ~]$

The quick way?

Not sure why this isn't in the instructions provided by research computing for how to use Discovery, but if at the login node you just type:

$ srun --pty ipython

it will allocate you some resources, wait until they're available (you may experience a pause), then run ipython in "pseudo terminal mode". All your tab completion, syntax highlighting, terminal wizardry etc. seem to work naturally. We can check which compute node we're running on like this:

$ srun --pty ipython
Python 3.5.2 (default, Sep 19 2016, 11:10:34)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import subprocess

In [2]: subprocess.check_output('hostname')
Out[2]: b'compute-0-013\n'

We used the Python function subprocess.check_output to run the Linux program hostname and capture the result, which is compute-0-013 in this case (it comes back as bytes, hence the b'...' and the trailing \n). But when I exit() from IPython I am back on the login node. You could add other options to the srun command to specify the partition, etc. if you want.
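If you want the node name as a plain string rather than bytes, one possible sketch is below. The helper name compute_node_name is hypothetical, and the fallback to socket.gethostname() is an addition for machines where the hostname program is missing.

```python
import socket
import subprocess


def compute_node_name():
    """Return the name of the machine this code is running on, as a str."""
    try:
        raw = subprocess.check_output(["hostname"])  # bytes, ends with b"\n"
    except (OSError, subprocess.CalledProcessError):
        # No usable `hostname` program: ask the standard library instead.
        return socket.gethostname()
    return raw.decode().strip()


print(compute_node_name())
```

Run inside an srun --pty ipython session this should print the compute node's name; run on the login node it prints the login node's name instead.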


Next: 7. Submitting an array of jobs.