From ccfa1bbec108471718db3a44cabd0a27c3b3f1e0 Mon Sep 17 00:00:00 2001
From: Andreas Dilger
Date: Sun, 14 Jul 2019 17:25:52 -0600
Subject: [PATCH] LUDOC-11 jobid: cover newer JobID functionality

Describe the complex JobID format-string functionality added in 2.12,
the per-node JobID added in 2.8, and the per-session JobID added in 2.13.

Signed-off-by: Andreas Dilger
Change-Id: I863cac0d63482fd0b3ab4e10d4ba3a6b7f3ebbe5
Reviewed-on: https://review.whamcloud.com/35503
Tested-by: jenkins
Reviewed-by: Joseph Gmitter
---
 LustreMonitoring.xml | 63 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 7 deletions(-)

diff --git a/LustreMonitoring.xml b/LustreMonitoring.xml
index 89af577..610c9b0 100644
--- a/LustreMonitoring.xml
+++ b/LustreMonitoring.xml
@@ -601,7 +601,9 @@ Lustre Jobstats
  job. Job schedulers known to be able to work with jobstats include:
  SLURM, SGE, LSF, Loadleveler, PBS and Maui/MOAB. Since jobstats is
  implemented in a scheduler-agnostic manner, it is
- likely that it will be able to work with other schedulers also.
+ likely that it will be able to work with other schedulers as well, and
+ in environments that do not use a job scheduler, by storing custom format
+ strings in the jobid_name.
  <indexterm><primary>monitoring</primary><secondary>jobstats</secondary></indexterm>
  How Jobstats Works
@@ -612,18 +614,65 @@ Lustre Jobstats
  ID.
  A Lustre setting on the client, jobid_var,
- specifies which variable to use. Any environment variable can be
- specified. For example, SLURM sets the
+ specifies which environment variable holds the JobID for that process.
+ Any environment variable can be specified. For example, SLURM sets the
  SLURM_JOB_ID environment variable with the unique job ID
  on each client when the job is first launched on a node, and the
  SLURM_JOB_ID will be inherited by all child
  processes started below that process.
- Lustre can also be configured to generate a synthetic JobID from
- the user's process name and User ID, by setting
- jobid_var to a special value,
- procname_uid.
+ Lustre can be configured to generate a synthetic JobID from
+ the client's process name and numeric UID, by setting
+ jobid_var=procname_uid. This will generate a
+ uniform JobID when running the same binary across multiple client
+ nodes, but cannot distinguish whether the binary is part of a single
+ distributed process or multiple independent processes.
+
+ In Lustre 2.8 and later it is possible to set
+ jobid_var=nodelocal and then also set
+ jobid_name=name, which
+ all processes on that client node will use. This
+ is useful if only a single job is run on a client at one time, but if
+ multiple jobs are run on a client concurrently, the per-session JobID
+ should be used.
+
+
+ In Lustre 2.12 and later, it is possible to
+ specify more complex JobID values for jobid_name
+ by using a string that contains format codes that are evaluated for
+ each process, in order to generate a site- or node-specific JobID string.
+
+
+ %e print executable name
+
+
+ %g print group ID number
+
+
+ %h print hostname
+
+
+ %j print JobID from process environment
+ variable named by the jobid_var parameter
+
+
+
+ %p print numeric process ID
+
+
+ %u print user ID number
+
+
+
+ In Lustre 2.13 and later, it is possible to
+ set a per-session JobID by setting the
+ jobid_this_session parameter. This will be
+ inherited by all processes that are started in this login session,
+ but there can be a different JobID for each login session.
+
+
  The setting of jobid_var need not be the same on all clients. For
  example, one could use SLURM_JOB_ID on all clients managed by SLURM, and
--
1.8.3.1
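
Illustration of the format codes documented in this patch: the following is a
minimal Python sketch, not the Lustre client implementation, showing how a
jobid_name string containing %e, %g, %h, %j, %p and %u could expand for one
process. The function name expand_jobid_name, its keyword arguments, and the
sample values are hypothetical. Leaving unrecognized codes untouched is an
assumption made here for simplicity; the real client may treat them
differently, and may also limit the length of the resulting JobID.

```python
import re


def expand_jobid_name(template, *, exe, gid, host, env_jobid, pid, uid):
    """Expand the jobid_name format codes listed in the patch.

    Hypothetical helper for illustration only; it models the documented
    meaning of each code, not Lustre's actual in-kernel behavior.
    """
    values = {
        "e": exe,          # %e: executable name
        "g": str(gid),     # %g: group ID number
        "h": host,         # %h: hostname
        "j": env_jobid,    # %j: JobID from the variable named by jobid_var
        "p": str(pid),     # %p: numeric process ID
        "u": str(uid),     # %u: user ID number
    }

    def substitute(match):
        # Assumption: codes not in the table above are left unchanged.
        return values.get(match.group(1), match.group(0))

    return re.sub(r"%(.)", substitute, template)


# Made-up values for a single process on a single client node.
example = expand_jobid_name(
    "%j-%h-%e.%u",
    exe="dd", gid=100, host="client01", env_jobid="1234", pid=4321, uid=500,
)
print(example)
```

With these sample values the template "%e.%u" yields a JobID of the same
process-name-and-UID shape that the patch describes for
jobid_var=procname_uid, while adding %h makes the string node-specific.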