Gurobi Installation and Tests on an HPC system

Gurobi is an optimisation solver which describes itself as follows, a description that goes some way to explaining its increasing popularity:

The Gurobi Optimizer is a state-of-the-art solver for mathematical programming. The solvers in the Gurobi Optimizer were designed from the ground up to exploit modern architectures and multi-core processors, using the most advanced implementations of the latest algorithms.

The following outlines the installation procedure on a Linux cluster, various licensing conundrums, and a sample job using Slurm.

Installation

In our case we acquired a floating academic license. A form will need to be filled out, scanned, and sent back to Gurobi. This is all sub-optimal and, like any partially proprietary software, it's damaged goods. But it's certainly not as bad as it could be; small mercies.

Having received a license file and downloaded the software, one can install. The following is an EasyBuild script (Gurobi-8.1.1.eb); the download itself is basically a tarball which contains binaries, documentation, examples, etc.


name = 'Gurobi'
version = '8.1.1'
easyblock = 'Tarball'
homepage = 'http://www.gurobi.com'
description = """The Gurobi Optimizer is a state-of-the-art solver for mathematical programming. The solvers in the Gurobi Optimizer were designed from the ground up to exploit modern architectures and multi-core processors, using the most advanced implementations of the latest algorithms."""
toolchain = {'name': 'dummy', 'version': 'dummy'}
# registration is required
# source_urls = ['http://www.gurobi.com/downloads/user/gurobi-optimizer']
sources = ['%(namelower)s%(version)s_linux64.tar.gz']
moduleclass = 'math'
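
With the tarball placed where EasyBuild can find it, installation is then a one-liner. A minimal sketch, assuming a default sourcepath layout (the paths here are illustrative):


# copy the registered download to where EasyBuild looks for sources
cp gurobi8.1.1_linux64.tar.gz /usr/local/easybuild/sources/g/Gurobi/
# build and install the module
eb Gurobi-8.1.1.eb --robot
# confirm the module is now visible
module avail Gurobi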

Licensing

As is often the case with proprietary software, the greatest pain for sysadmins will be dealing with the license. Even when this takes less time than the software installation itself, at least with the software you know that the work is necessary. With licenses it's unnecessary work, and I keep a sharp eye on the number of expected seconds I have left in my life. For anyone else reading this, hopefully I've saved a few for you.

Like most sensible HPC systems, the management node is not directly accessible to the outside world. A license file will need to be created from the output of the grbprobe command, with the relevant material added to a gurobi.lic file. The following is an example:


# DO NOT EDIT THIS FILE except as noted
#
# License ID XXXXXX
TYPE=TOKEN
VERSION=8
TOKENSERVER=spartan-build.hpc.unimelb.edu.au
HOSTNAME=spartan-build.hpc.unimelb.edu.au
HOSTID=XXXXXXXX
SOCKETS=2
EXPIRATION=2020-04-18
USELIMIT=4096
DISTRIBUTED=100
SPECIAL=2
KEY=XXXXXXXX
CKEY=XXXXXXXX
# Uncomment and edit the following lines as desired:
PORT=XXXXX
# PASSWORD=YourPrivatePassword
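
The HOSTNAME, HOSTID, and SOCKETS values above are taken from the output of grbprobe on the token server; the session below is indicative rather than verbatim:


(vSpartan) [root@spartan-build gurobi]# grbprobe
PLATFORM=linux64
HOSTNAME=spartan-build.hpc.unimelb.edu.au
HOSTID=XXXXXXXX
SOCKETS=2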

According to the documentation, Gurobi strongly prefers the license to be installed in an /opt/gurobi directory, but on an HPC system it is doubtful that this is mounted across the compute nodes. Thus the license path will have to be exported in the GRB_LICENSE_FILE environment variable at run time. In the meantime, the token server can be started:


(vSpartan) [root@spartan-build gurobi]# module load Gurobi
(vSpartan) [root@spartan-build gurobi]# grb_ts
..
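
Compute and login nodes that talk to this token server do not need the full server file above; a minimal client-side gurobi.lic that simply points at the token server should be enough, assuming the port matches the server configuration:


TOKENSERVER=spartan-build.hpc.unimelb.edu.au
PORT=XXXXX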

Smoke Test

With the token server running, it should be possible to run a Gurobi task. Note however that in an HPC environment where the management node is private and running the token server, and the login node is public, you may encounter a subnet error.


[lev@spartan-login1 ~]$ module load Gurobi
[lev@spartan-login1 ~]$ export GRB_LICENSE_FILE=/usr/local/easybuild/software/Gurobi/gurobi.lic
[lev@spartan-login1 ~]$ gurobi_cl
ERROR 10009: Server must be on the same subnet

Thus in these situations it is best to run on a compute node after launching an interactive job.


[lev@spartan-login1 ~]$ sinteractive
..
[lev@spartan-rc110 ~]$ module load Gurobi
[lev@spartan-rc110 ~]$ export GRB_LICENSE_FILE=/usr/local/easybuild/software/Gurobi/gurobi.lic
[lev@spartan-rc110 ~]$ gurobi_cl
Usage: gurobi_cl [--command]* [param=value]* filename
Type 'gurobi_cl --help' for more information.
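
Reaching the usage message confirms that the client can talk to the token server, but a more convincing check is to solve one of the example models bundled with the installation. The invocation below assumes the module sets GUROBI_HOME to the installation directory:


[lev@spartan-rc110 ~]$ gurobi_cl $GUROBI_HOME/examples/data/misc07.mps
..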

Slurm Job Script

A number of example models are provided with the application, and the misc07.mps file is a good candidate for a speed test. Note that this task is pleasingly parallel and runs much faster as a multicore job than as a single-core one; it is worth comparing the sample job with a single CPU against eight or more, as the results below illustrate. Here is a sample Slurm script, gurobi.slurm:


#!/bin/bash
#SBATCH -p cloud
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
module load Gurobi/8.1.1
export GRB_LICENSE_FILE=/usr/local/easybuild/software/Gurobi/gurobi.lic
time gurobi_cl misc07.mps

The following are some sample results:

cpus-per-task=1

real 0m9.255s
user 0m8.220s
sys 0m1.009s

cpus-per-task=8

real 0m2.558s
user 0m14.593s
sys 0m3.778s
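
When varying the core count it is also worth pinning Gurobi's own thread count to the Slurm allocation, rather than leaving the solver to detect the hardware for itself. Threads is a standard Gurobi parameter and SLURM_CPUS_PER_TASK is set inside the job, so the last line of the script can become:


time gurobi_cl Threads=${SLURM_CPUS_PER_TASK} misc07.mps

Submission is then the usual sbatch gurobi.slurm, editing --cpus-per-task between runs.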

Restarting the License Server

One other issue that's too easy to overlook: if the management node is restarted for any reason (e.g., a planned outage for system upgrades), the Gurobi license server will have to be restarted as well. One method is to write a short script and add it to the list of services that are required on boot. It may be necessary to source a profile in order to make the modules system available in a non-interactive shell, e.g.:


#!/bin/bash
# Make the modules system available in a non-interactive shell
. /etc/profile.d/z00_lmod.sh
. /etc/profile.d/z01_spartan.sh
# Load Gurobi and start the token server
module load Gurobi/8.1.1
grb_ts
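
On a systemd-based management node, one way to add this script to the services required on boot is a small unit file; the script location, unit name, and service type here are illustrative only:


# register the restart script (saved, say, as /usr/local/sbin/gurobi-ts.sh)
# as a boot-time service; grb_ts detaches into the background, hence forking
cat > /etc/systemd/system/gurobi-ts.service <<'EOF'
[Unit]
Description=Gurobi token server
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/sbin/gurobi-ts.sh

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable gurobi-ts.service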