LU-18610 obdclass: add job expired flag 16/57616/7
author Shaun Tancheff <shaun.tancheff@hpe.com>
Wed, 5 Feb 2025 16:56:47 +0000 (23:56 +0700)
committer Oleg Drokin <green@whamcloud.com>
Fri, 28 Feb 2025 08:12:37 +0000 (08:12 +0000)
commit b1a07c17b620e7ed6437d35b477f3b34c21f9200
tree 5773635bf069d48a1061ba4402f8f60813a133a2
parent cab92c00bff91147ba54afad0def8bdfe22c3319
LU-18610 obdclass: add job expired flag

In lprocfs_job_cleanup() expired jobs are de-referenced before
being removed from the LRU, to defer holding a spinlock.
This opens a race in which a job can be put multiple times
when only a single put on expiry is expected. To close the race,
use a bit flag to mark jobs that are in the process of being
expired and removed, so that the extra de-reference is skipped.
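
The idea can be illustrated with a minimal userspace sketch. It is
not the code from this patch: struct job_entry, job_init(), job_put()
and job_expire() are hypothetical names, and C11 atomics stand in for
the kernel primitives (kernel code would more likely use something
like test_and_set_bit() on a flags word). Only the first caller that
sets the "expired" flag performs the expiry put; later callers see
the flag already set and skip it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a jobstat entry; not the Lustre struct. */
struct job_entry {
	atomic_int  refcount;	/* references held on the entry */
	atomic_flag expired;	/* set once expiry has claimed the entry */
	bool        freed;	/* set when the last reference is dropped */
};

static void job_init(struct job_entry *job)
{
	atomic_init(&job->refcount, 1);	/* the single reference dropped on expiry */
	atomic_flag_clear(&job->expired);
	job->freed = false;
}

/* Drop one reference; mark the entry freed when the count hits zero. */
static void job_put(struct job_entry *job)
{
	int old = atomic_fetch_sub(&job->refcount, 1);

	assert(old > 0);	/* a double put would trip this */
	if (old == 1)
		job->freed = true;
}

/*
 * Expire a job. Returns true if this caller performed the expiry put,
 * false if another caller had already claimed the entry.
 */
static bool job_expire(struct job_entry *job)
{
	/* test-and-set: only the first caller observes the flag clear */
	if (atomic_flag_test_and_set(&job->expired))
		return false;

	job_put(job);
	return true;
}
```

Calling job_expire() twice on the same entry drops the reference only
once; without the flag, the second call would underflow the count.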

HPE-bug-id: LUS-12670
Test-Parameters: testlist=sanity env=ONLY=205,ONLY_REPEAT=100
Fixes: cad59b9b72 ("LU-18351 obdclass: jobstat scaling")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia7dc91cac313919827cc13db971ffb3debe318c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
lustre/obdclass/lprocfs_jobstats.c