After the recent reinstallations of the machines, torque and maui need to be installed again. Since we’ve changed the setup a bit, I think it’s now ok to take the default installation location (/usr/local). The commands to configure and install torque are:

cd /system/software/linux/torque-2.0.0
./configure --with-rcp=scp
make
make install (as root)

This puts the user commands in /usr/local/bin, the daemons in /usr/local/sbin, and the needed libraries in /usr/local/lib. The spool directory is /var/spool/torque.

Maui is done with:

cd /system/software/linux/maui-3.2.6p19
./configure
make
make install (as root)

Maui’s home is /usr/local/maui.

Startup scripts are provided in /system/software/linux/torque-2.3.0/contrib/init.d. We only need pbs_mom and pbs_server because we’ll be using maui for the scheduler. They need to be edited with the correct values.
PBS_DAEMON=/usr/local/sbin/pbs_server
PBS_HOME=/var/spool/torque
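
The two edits above can also be scripted. This is only a sketch: set_var is a hypothetical helper of my own, not something shipped with torque, and it assumes each variable sits on its own line in the form NAME=value, as shown above.

```shell
# set_var FILE NAME VALUE
# Rewrite the line "NAME=..." in FILE so that it reads "NAME=VALUE".
# Hypothetical helper, not part of torque. VALUE must not contain "|" or "&",
# because both are special in the sed expression below.
set_var() {
    file=$1; name=$2; value=$3
    tmp=$(mktemp) || return 1
    # "|" is the sed delimiter because the values are paths containing "/".
    if sed "s|^${name}=.*|${name}=${value}|" "$file" > "$tmp"; then
        # Overwrite in place (rather than mv) so the script keeps its mode bits.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Example, run from the directory holding the copied init scripts:
#   set_var pbs_server PBS_DAEMON /usr/local/sbin/pbs_server
#   set_var pbs_server PBS_HOME   /var/spool/torque
```

Lines that do not start with the given name are left untouched, so running it twice is harmless.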

Copy pbs_mom and pbs_server to /etc/rc.d/init.d, then start the server with /etc/rc.d/init.d/pbs_server start. Once it is running, the queues can be created with qmgr.

[root@cpserver init.d]# qmgr
Max open servers: 4
Qmgr: p s
#
# Set server attributes.
#
set server acl_hosts = cpserver
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
Qmgr: c q cp1
Qmgr: s q cp1 queue_type=Execution
Qmgr: s q cp1 from_route_only=True
Qmgr: s q cp1 resources_max.cput=240:00:00
Qmgr: s q cp1 resources_min.cput=00:00:01
Qmgr: s q cp1 enabled=True
Qmgr: s q cp1 started=True
Qmgr: c q cp
Qmgr: s q cp queue_type=Route
Qmgr: s q cp route_destinations=cp1
Qmgr: s q cp route_held_jobs=True
Qmgr: s q cp route_waiting_jobs=True
Qmgr: s q cp enabled=True
Qmgr: s q cp started=True
Qmgr: s s scheduling=True
Qmgr: s s acl_host_enable=True
Qmgr: s s acl_hosts=*.uchicago.edu
Qmgr: s s default_queue=cp
Qmgr: s s query_other_jobs=True
Qmgr: s s resources_default.nodect=1
Qmgr: s s resources_default.nodes=1
Qmgr: s s resources_max.walltime=96:00:00
Qmgr: s s submit_hosts = cpserver
Qmgr: c n cpserver np=2
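
Typing all of that at the prompt is error-prone, so the queue part of the session could be kept in a script instead. A minimal sketch, assuming qmgr is fed its directives on standard input; queue_directives is a made-up function name, and the values are the ones from the session above written with the full keywords (create, set) instead of the c and s abbreviations.

```shell
# queue_directives EXEC_QUEUE ROUTE_QUEUE
# Print the qmgr directives for an execution queue and the routing queue
# that feeds it. Hypothetical helper; the limits mirror the session above.
queue_directives() {
    exec_q=$1; route_q=$2
    cat <<EOF
create queue $exec_q
set queue $exec_q queue_type = Execution
set queue $exec_q from_route_only = True
set queue $exec_q resources_max.cput = 240:00:00
set queue $exec_q resources_min.cput = 00:00:01
set queue $exec_q enabled = True
set queue $exec_q started = True
create queue $route_q
set queue $route_q queue_type = Route
set queue $route_q route_destinations = $exec_q
set queue $route_q route_held_jobs = True
set queue $route_q route_waiting_jobs = True
set queue $route_q enabled = True
set queue $route_q started = True
set server default_queue = $route_q
EOF
}

# Review the output first, then feed it to qmgr as root:
#   queue_directives cp1 cp | qmgr
```

The server attributes and the node creation are left out here on purpose; they could be added the same way.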

Maui’s startup script is provided in /system/software/linux/maui-3.2.6p19/contrib/service-scripts/redhat.maui.d. Edit this file:
MAUI_PREFIX=/usr/local/maui
Also change the user that maui should run as. We don’t have a maui user, so I used my own username at first. That turned out to be a big problem, so it has to run as root.

Then copy the file to /etc/rc.d/init.d/maui.

Now run chkconfig --add for each of pbs_mom, pbs_server and maui. Restart them all and submit a test job.
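
As a sketch, the registration and restart can be done in one loop. The run and register_services names are my own, and I am assuming that all three init scripts accept a restart argument. With DRY_RUN=1 (the default in this sketch) the commands are only printed, so the list can be checked before anything is changed.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# register_services NAME...
# Register each init script with chkconfig and restart it, in the order given.
register_services() {
    for svc in "$@"; do
        run chkconfig --add "$svc"
        run /etc/rc.d/init.d/"$svc" restart
    done
}

# Example (as root, after checking the dry-run output):
#   DRY_RUN=0
#   register_services pbs_mom pbs_server maui
```

For the test job, something as small as echo "sleep 30" | qsub followed by qstat should be enough to see whether the job leaves the queued state.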

The job was accepted into the queue, but never executed. Oops, I forgot to edit maui.cfg: add ADMIN1 and ADMIN3, and change the RMCFG line:

ADMIN1   root maryh
ADMIN3   ALL

#RMCFG[CPS1] TYPE=PBS@RMNMHOST@
RMCFG[base] TYPE=PBS
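
Since forgetting this step produces no error at all, only jobs that sit in the queue, a quick check of maui.cfg before starting maui may be worth having. This is a sketch with a hypothetical function name; the path in the example assumes maui.cfg lives in Maui's home directory.

```shell
# check_maui_cfg FILE
# Report which of the settings described above are missing from FILE.
# Returns 0 if everything is present, 1 otherwise. Hypothetical helper.
check_maui_cfg() {
    cfg=$1
    status=0
    grep -q '^ADMIN1[[:space:]]' "$cfg" ||
        { echo "missing ADMIN1"; status=1; }
    grep -q '^ADMIN3[[:space:]]' "$cfg" ||
        { echo "missing ADMIN3"; status=1; }
    # An uncommented RMCFG[...] line must end in plain TYPE=PBS, without the
    # @RMNMHOST@ placeholder that the shipped file carries.
    grep -q '^RMCFG\[[^]]*][[:space:]]*TYPE=PBS[[:space:]]*$' "$cfg" ||
        { echo "RMCFG is not plain TYPE=PBS"; status=1; }
    return $status
}

# Example:
#   check_maui_cfg /usr/local/maui/maui.cfg
```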

The test job now works, so we can move on to the compute node.

The compute node doesn’t need maui, only torque. So simply run make install on the compute node.

In /var/spool/torque, check that server_name has the proper name. Copy the pbs_mom startup script from the server to this node. Start it up. Back on the server, create a new node in qmgr.

c n cpcompute np=8

Create /var/spool/torque/mom_priv/config on the compute node:

$usecp cpserver.uchicago.edu
$ideal_load 8.0
$max_load 10.0
$restricted *.uchicago.edu

This node has eight cores, so the ideal_load is eight.
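
For nodes with a different core count, the two load lines can be generated rather than typed. A small sketch: load_lines is a made-up name, ideal_load is set to the core count as above, and max_load is passed in by hand because I have no rule for it beyond the 10.0 chosen here.

```shell
# load_lines CORES MAX_LOAD
# Print the $ideal_load and $max_load lines for mom_priv/config.
# Hypothetical helper; the single quotes keep the "$" signs literal.
load_lines() {
    cores=$1; max_load=$2
    printf '$ideal_load %s.0\n$max_load %s\n' "$cores" "$max_load"
}

# Example for this node:
#   load_lines 8 10.0
# On Linux the core count can be read with:
#   getconf _NPROCESSORS_ONLN
```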

Finally, back on the server, go into qmgr and add the compute host as another submit host:

qmgr
s s submit_hosts += cpcompute