Distributed ML @ W&M

Checking the status of your jobs

Last updated 2 years ago

So you've launched a job, yay! There are two main ways that you can now check on the status of your jobs.

qstat

Running qstat from your terminal prints a table with statistics about every job on the sub-cluster you are logged into, whether it is still waiting in the queue or already running.

A much more helpful command, however, is qstat -u [USERNAME], which limits the table to the jobs launched by that user.
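If you would rather check on your jobs from a Python script than from the terminal, a minimal sketch could look like the following. The helper names build_qstat_command and my_jobs_table are hypothetical (they are not part of any cluster tooling), and the sketch assumes qstat is on your PATH and that your local account name matches your HPC username.

```python
import getpass
import subprocess


def build_qstat_command(username=None):
    """Return the argument list for `qstat -u <username>`."""
    # Assumption: when no name is given, the account running this script
    # is the same as your HPC username.
    user = username or getpass.getuser()
    return ["qstat", "-u", user]


def my_jobs_table(username=None):
    """Run qstat for one user and return its raw text output."""
    # check=True raises CalledProcessError if qstat exits with an error.
    result = subprocess.run(
        build_qstat_command(username),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling print(my_jobs_table()) on a login node should print the same table you would see from running the command by hand.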

What does this table tell us?

Job ID: The unique ID given to our job

Username: The user who launched the job

NDS: Number of nodes reserved

TSK: Total number of processors reserved

Req'd Time: Amount of walltime requested

Elap Time: How long the job has been running. If the job is still in the queue waiting to be launched, this field will look like -----------
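To make the columns above concrete, here is a rough sketch of turning one row of the table into named fields. It assumes the 11-column, whitespace-separated layout typical of Torque/PBS qstat -u output; the queue, jobname, sess_id, reqd_memory, and state columns are assumptions that are not described on this page, and the sample row is made up for illustration. Check the header your cluster prints before relying on this.

```python
# Column order assumed from typical Torque/PBS `qstat -u` output. Only
# Job ID, Username, NDS, TSK, Req'd Time and Elap Time are described above.
COLUMNS = [
    "job_id", "username", "queue", "jobname", "sess_id",
    "nds", "tsk", "reqd_memory", "reqd_time", "state", "elap_time",
]


def parse_job_row(line):
    """Split one job row into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def has_started(job):
    """A job still waiting in the queue shows only dashes as elapsed time."""
    return job["elap_time"].strip("-") != ""


# Hypothetical row, made up for illustration (not real cluster output).
sample = "1234567.hpc jdoe batch train 4321 2 16 -- 01:00:00 R 00:12:34"
job = parse_job_row(sample)
print(job["job_id"], job["nds"], job["tsk"], has_started(job))
```

All values are kept as strings because fields such as the elapsed time can be dashes instead of numbers while a job is queued.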

qsu

qsu is a shortcut for qstat -u [USERNAME]. Running it gives you exactly the same printout.