Running stuff on Westgrid
Logging in & minimal set-up
Once you have a Westgrid account you should be able to log in like so:
ssh username@bugaboo.westgrid.ca
It will prompt you for your password. For convenience you can define the following alias in ~/.bashrc:
alias sshwestgrid='ssh -X username@bugaboo.westgrid.ca'
To copy stuff to Westgrid use scp. Type on your local machine
scp path/to/somefile username@bugaboo.westgrid.ca:path/where/to/place/file/in/home/dir
But have a look at sshfs below for an easy method to mount the Westgrid home folder on your local machine.
More info to be found here.
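The scp command above copies files to Westgrid; getting results back works the same way with source and destination swapped. A minimal sketch, assuming your results live under your Westgrid home directory (the function name `fetch_from_westgrid` and the variable `WESTGRID_USER` are made up for illustration):

```shell
# Hypothetical helper: copy a file or directory from the Westgrid home
# directory into the current local directory.
# WESTGRID_USER is an illustrative variable; set it to your own username.
WESTGRID_USER="${WESTGRID_USER:-username}"

fetch_from_westgrid() {
    # $1: path relative to the remote home directory
    # -r makes scp copy directories recursively as well as single files
    scp -r "$WESTGRID_USER@bugaboo.westgrid.ca:$1" .
}

# Usage (on your local machine):
#   fetch_from_westgrid path/to/results
```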
Running computations on Westgrid
A note on Matlab: we at SFU can only run Matlab on SFU machines, i.e. on bugaboo.
Interactive
Interactive sessions on Westgrid are only supposed to be used for debugging and such. They are quite limited in RAM and CPU time. To run a Matlab session without the GUI, type:
matlab -nodesktop -nosplash
or for python
python
Batch jobs
Westgrid documentation here.
The way to use Westgrid for proper computing is to submit batch jobs. They land in a queue and are processed once they reach the front of it. How quickly a job moves through the queue depends on things like job size, run time, how many jobs you already have running, etc.
The three most important commands are:
submit a job
qsub jobfile.pbs
where the job is defined in the *.pbs file (see below).
show the place and status of your jobs in the queue
showq -u username
It sometimes takes a couple of minutes for the job to show up.
delete a job (jobid is displayed with showq)
qdel jobid
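The three commands above can be wrapped in small helpers so that the job id does not have to be copied by hand. This is only a sketch: it assumes that qsub prints the identifier of the new job on standard output, and the function names `submit_job` and `cancel_job` as well as the `.jobid` file are made up for illustration.

```shell
# Hypothetical helper: submit a job file and remember its job id.
# Assumes qsub prints the new job's identifier on standard output.
submit_job() {
    local jobfile="$1"
    local jobid
    jobid=$(qsub "$jobfile") || return 1
    # keep the id next to the job file so it can be used for qdel later
    echo "$jobid" > "${jobfile%.pbs}.jobid"
    echo "Submitted $jobfile as $jobid"
}

# Hypothetical helper: delete the job that submit_job recorded.
cancel_job() {
    local jobfile="$1"
    qdel "$(cat "${jobfile%.pbs}.jobid")"
}

# Usage (on Westgrid):
#   submit_job jobfile.pbs
#   showq -u username
#   cancel_job jobfile.pbs
```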
- the *.pbs file
The *.pbs file tells the queue-system how to run a specific job and gives requests/estimates for the resources needed. It may look like this (note, it's basically a bash script)
#!/bin/bash
#PBS -S /bin/bash
#
# .pbs file for running on westgrid:
# see https://www.westgrid.ca/support/quickstart/new_users#running
#
# comments starting with #PBS are instructions for the queue-system
#
# use one processor:
#PBS -l procs=1
# request 2000MB of ram:
#PBS -l pmem=2000mb
# estimated run time 3:10 hours:
#PBS -l walltime=03:10:00

# cd to directory in which this *.pbs file is located
cd $PBS_O_WORKDIR
echo "Current working directory is `pwd`"
echo "Running on hosts:"
cat $PBS_NODEFILE
echo "Starting run at: `date`"
matlab -nodisplay -r your_matlab_script -logfile your_log_file
echo "Job finished at: `date`"
This will run the Matlab script "your_matlab_script.m"; note that the .m extension is left off and that no arguments can be passed to the script. The file "your_log_file" will capture everything written to standard output.
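Since a script takes no arguments, one possible workaround is to set a variable before the script name, because the -r option takes a Matlab statement rather than just a file name. The sketch below writes one job file per parameter value. The helper name `make_pbs`, the variable name `param` and the `.log` naming are assumptions for illustration, and the trailing `exit` is added so that Matlab quits once the script has finished.

```shell
# Hypothetical helper: write a .pbs file that sets the Matlab variable
# "param" and then runs the given script.
# Arguments: job name, Matlab script name (without .m), value for "param".
make_pbs() {
    local name="$1" script="$2" value="$3"
    # unquoted heredoc: ${...} is filled in now, \$... is kept for run time
    cat > "${name}.pbs" <<EOF
#!/bin/bash
#PBS -S /bin/bash
#PBS -l procs=1
#PBS -l pmem=2000mb
#PBS -l walltime=03:10:00
cd \$PBS_O_WORKDIR
echo "Starting run at: \$(date)"
matlab -nodisplay -r "param=${value}; ${script}; exit" -logfile ${name}.log
echo "Job finished at: \$(date)"
EOF
}

# Usage: one job per parameter value, e.g.
#   for v in 1 2 3; do
#       make_pbs "run_$v" your_matlab_script "$v"
#       qsub "run_$v.pbs"
#   done
```

Inside your_matlab_script.m the value is then available as the ordinary workspace variable `param`.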
Example
Find an example of a (maybe) working matlab computation here.
On local machine
scp it to westgrid
scp westgrid_matlab_example.zip username@bugaboo.westgrid.ca:westgrid_matlab_example.zip
On Westgrid:
unzip it
unzip westgrid_matlab_example.zip
cd to geeks and submit job
cd geeks
qsub test_pbs.pbs
check on it
showq -u username
follow log file
tail -f test_log
Advanced computational setup
Customise your environment with modules, e.g. execute
module load python
Available software differs from machine to machine: http://www.westgrid.ca/support/software
But again, note that we at SFU can only run Matlab on SFU machines, i.e. on bugaboo.
More setup
The following stuff I use to make my westgrid-life easier.
Email forwarding
To get emails westgrid sends to you about your jobs and such, run in your home directory
echo "youremail@sfu.ca" > .forward
bash shell setup
Keyless SSH
SSH can be set up such that you don't need to type in your password when you log in, mount via sshfs (see below) or use version control connecting to remote servers. These websites give a how-to (do use a passphrase to protect your ssh-keys! This passphrase should be different from your passwords):
- http://linuxreference.wordpress.com/2011/10/11/keyless-ssh-using-ssh-keygen-and-ssh-copy-id/
- more details: https://wiki.archlinux.org/index.php/SSH_Keys
(Note: You want to set this up on your local machine!)
Now the potentially fiddly part is to get it set up such that you only need to enter the above passphrase once. If you are running Ubuntu this should work out of the box and you'll be prompted for the passphrase the first time you ssh. On other systems this may not be so; I put the line
eval `/usr/bin/ssh-agent`
in my ~/.xinitrc which does it. Your mileage may vary.
Run
ssh-add
and it will prompt you for the passphrase and unlock the keys for the current session.
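If no agent is started for you and you have no ~/.xinitrc to put the line in, something along the following lines in ~/.bashrc may do instead. It is only a sketch: it relies on ssh-add exiting with status 2 when it cannot contact an agent, and the function name `ensure_ssh_agent` is made up for illustration.

```shell
# Start an ssh-agent only if the current shell cannot reach one.
# "ssh-add -l" exits with status 2 when no agent can be contacted
# (status 0 or 1 means an agent is there, with or without keys).
ensure_ssh_agent() {
    ssh-add -l >/dev/null 2>&1
    if [ $? -eq 2 ]; then
        # ssh-agent -s prints shell commands that export SSH_AUTH_SOCK etc.
        eval "$(ssh-agent -s)" >/dev/null
    fi
}

# Usage: call ensure_ssh_agent from ~/.bashrc, then run ssh-add once.
```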
sshfs
sshfs is a way to mount directories of a remote machine, easily and securely. On the remote all that is needed is ssh; on the local machine sshfs is needed. To mount your Westgrid home, run the following on your local machine. Make a directory to be used as mount point:
mkdir -p ~/mnt/westgrid
and then mount (replace username with yours):
sshfs username@bugaboo.westgrid.ca:/home/username $HOME/mnt/westgrid -o compression=yes -o reconnect -o idmap=user -o gid=100 -o workaround=rename -o follow_symlinks
Not all of the options are strictly needed, but they enhance the experience. To unmount run
fusermount -u -z $HOME/mnt/westgrid/
For convenience I add the two aliases to my ~/.bashrc
alias mountwestgrid='sshfs username@bugaboo.westgrid.ca:/home/username $HOME/mnt/westgrid -o compression=yes -o reconnect -o idmap=user -o gid=100 -o workaround=rename -o follow_symlinks'
alias umountwestgrid='fusermount -u -z $HOME/mnt/westgrid/'
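A shell function can do a little more than an alias, for instance create the mount point and skip the mount if it is already in place. A possible sketch, assuming the `mountpoint` utility is available on your local machine; the variables `WESTGRID_USER` and `WESTGRID_MNT` are made up for illustration:

```shell
# Hypothetical replacement for the mountwestgrid alias: mount only if needed.
# WESTGRID_USER and WESTGRID_MNT are illustrative; set them to your values.
WESTGRID_USER="${WESTGRID_USER:-username}"
WESTGRID_MNT="${WESTGRID_MNT:-$HOME/mnt/westgrid}"

mountwestgrid() {
    mkdir -p "$WESTGRID_MNT"
    # mountpoint -q succeeds silently if the directory is already a mount point
    if mountpoint -q "$WESTGRID_MNT"; then
        echo "already mounted: $WESTGRID_MNT"
    else
        sshfs "$WESTGRID_USER@bugaboo.westgrid.ca:/home/$WESTGRID_USER" "$WESTGRID_MNT" \
            -o compression=yes -o reconnect -o idmap=user -o gid=100 \
            -o workaround=rename -o follow_symlinks
    fi
}
```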
Version control
Git, Mercurial and Subversion are available on bugaboo. To have access to your version control server you can again set up keyless SSH, but this time on Westgrid.
GNU screen
GNU screen is magic and makes remote work so much easier. Copy ./docs/.screenrc to your Westgrid home folder; in there is also a setup to make ssh-agent work. For this, copy ./docs/screen-ssh-agent to your ~/bin folder.
Now type
screen
and hit control+space then ? for help.
Now running ssh-add will make the unlocked ssh-keys available in all screen sub-shells.
Locally installing software
If some software you need, e.g. your model code, is not installed, then you need to install it yourself. See https://www.westgrid.ca/support/programming. You may first have to activate the right compilers with the
module load xxx
command.
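For software that ships with the usual configure script, installing into your home directory could look roughly like the sketch below. The prefix `$HOME/local` and the function name `install_local` are assumptions for illustration; packages with a different build system need different steps.

```shell
# Sketch: build a configure-based package into a private prefix.
# PREFIX is an illustrative choice; any directory you can write to works.
PREFIX="$HOME/local"

install_local() {
    # run this inside the unpacked source directory of the package
    ./configure --prefix="$PREFIX" && make && make install
}

# Make locally installed programs and libraries visible to the shell
# (put these two lines into ~/.bashrc to make them permanent)
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib:${LD_LIBRARY_PATH:-}"
```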