MAXPS throttling in the STF queue

To ensure that no single user can monopolize the HPC club resources, we’ve introduced a throttling mechanism on the STF queue. This post will tell you why we did this, how it works, and how you can make it work for you.

The problem

Up until now, we’ve all been able to run massive jobs for long periods of time. In theory, there was nothing to stop us from taking up all of the STF nodes for a month-long job, effectively halting the research programs of all of the other HPC club members. If we all worked in the same lab, this might not be such a big problem. In that case, we could beg or bribe the node hog to cancel the offending job. But the HPC club has a lot of users, distributed throughout campus. We need an easier and more uniformly applied way to limit each other’s wait times.

The solution

Effective immediately, all jobs in the STF queue will be subject to a maximum processor-second (MAXPS) limitation. Here’s how that works.

MAXPS limits the number of outstanding processor-seconds a credential may have allocated at any given time. For example, if a user has a 4-processor job that will complete in 1 hour and a 2-processor job that will complete in 6 hours, they have 4 * 1 * 3600 + 2 * 6 * 3600 = 16 * 3600 = 57,600 outstanding processor-seconds. The outstanding processor-second usage of each credential is updated each scheduling iteration, decreasing as jobs approach their completion times.
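To make the bookkeeping concrete, here’s a shell one-liner (just a sketch; the scheduler does this accounting for you) that computes the outstanding processor-seconds for the example above:

    # a 4-processor job with 1 hour remaining plus a 2-processor job with 6 hours remaining
    echo $(( 4 * 1 * 3600 + 2 * 6 * 3600 ))   # prints 57600, i.e. 16 * 3600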

The STF queue will have a MAXPS limit of 30,000,000 processor-seconds.

How will this affect you?

In practice, the MAXPS limitation will keep you from running long jobs that occupy a lot of processors. If you’re running on only a few nodes, you’ll be able to run longer jobs. The maximum walltime for a job is simply the MAXPS limit divided by its core count: 30,000,000 / (cores * 3600) hours. Here’s a table of common job sizes (in number of cores) and the maximum time allowed under the new MAXPS limitation:

Number of Cores   Max Hours Allowed   Max Days Allowed
             16              520.83              21.70
             32              260.42              10.85
             64              130.21               5.43
             96               86.81               3.62
            128               65.10               2.71
            160               52.08               2.17
            256               32.55               1.36
            512               16.28               0.68
           1024                8.14               0.34
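
If you’d like to verify these numbers, or compute the limit for a core count that isn’t listed, here’s a small shell sketch that derives the table directly from the 30,000,000 processor-second limit:

    # maximum walltime per core count under a 30,000,000 processor-second MAXPS limit
    for cores in 16 32 64 96 128 160 256 512 1024; do
        awk -v c=$cores 'BEGIN { h = 30000000 / (c * 3600); printf "%4d cores: %7.2f hours (%5.2f days)\n", c, h, h / 24 }'
    done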

How to make this work for you

The first step in working with the MAXPS limitation is to properly estimate the walltime required for your jobs. Estimate how long your job will need to complete and request only that amount of time (or a little more for margin) in your qsub script. If you’re in the habit of writing huge walltimes in your qsub script because you don’t want to figure out the true walltime required, the MAXPS limit will punish you.
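For example, here’s a sketch of a qsub script with a realistic walltime request (the job name, node counts, and executable are placeholders; use your own job’s history as a guide):

    #!/bin/bash
    ## 8 nodes x 16 cores = 128 cores for 12 hours:
    ## 128 * 12 * 3600 = 5,529,600 processor-seconds, well under the 30,000,000 limit
    #PBS -N my_job
    #PBS -l nodes=8:ppn=16
    #PBS -l walltime=12:00:00
    cd $PBS_O_WORKDIR
    ./my_solver input.dat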

If your job needs more time than the MAXPS limitation allows, it’s time to explore the checkpoint-restart capabilities of your application (or add them yourself to your own in-house code). Breaking your job into smaller chunks will help you get your work done without monopolizing the HPC club resources.
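One common pattern is a job script that runs one MAXPS-sized chunk, checkpoints, and resubmits itself until the work is done. Here’s a sketch of that idea (run_solver, checkpoint.dat, and done.flag are hypothetical stand-ins for your application’s own checkpoint-restart machinery):

    #!/bin/bash
    ## one chunk: 4 nodes x 16 cores = 64 cores for 48 hours = 11,059,200 processor-seconds
    #PBS -N chunked_job
    #PBS -l nodes=4:ppn=16
    #PBS -l walltime=48:00:00
    cd $PBS_O_WORKDIR
    # run_solver resumes from checkpoint.dat if present, writes a new checkpoint
    # before the walltime expires, and touches done.flag when all work is finished
    ./run_solver --checkpoint checkpoint.dat
    # queue up the next chunk if there is more work to do
    if [ ! -f done.flag ]; then
        qsub chunked_job.sh
    fi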

If you absentmindedly submit a single job that exceeds the MAXPS limit, it will never become eligible to run; it will just sit in the queue forever.
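A quick pre-submission sanity check (a shell sketch; plug in your own core count and walltime) can save you from this:

    # will a 256-core, 48-hour job ever start under MAXPS?
    CORES=256
    HOURS=48
    if [ $(( CORES * HOURS * 3600 )) -gt 30000000 ]; then
        echo "this job exceeds the 30,000,000 processor-second MAXPS limit and will never start"
    fi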

Lastly, if you’re installing and compiling your own software and you find the MAXPS limitation onerous, please try using Hyak’s build queue.

Thanks for reading. We think the new MAXPS limit will help us share the HPC club resources more equitably. We’ll get more research done and also learn how to be good Hyak users. As a bonus, the limitation will teach us how to better reason about the execution time of our programs and how to use or implement checkpoint/restart capabilities in our programs.