Whamcloud - gitweb
Patrick Farrell [Thu, 5 Dec 2024 04:30:12 +0000 (23:30 -0500)]
LU-17814 utils: implement real pfind
This patch does the last step of integrating the actual
find code with pfind.
This doesn't mean we're done - it doesn't have error
propagation and is not enabled by default - but we are
close.
The next patch will finalize and enable testing.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I34020357037bdcf400ffe7f3b4dc14ea5e5a23c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 5 Dec 2024 04:08:38 +0000 (23:08 -0500)]
LU-17814 utils: Add deep copy of find_param
Need to copy find_param for each work unit.
Technically not all fields need to be copied, but a bunch
do and it's much easier to copy the whole thing than work
out precisely which fields need to be copied.
Plus a hand-picked list of fields would be fragile to future changes;
copying everything should be more robust.
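The reason for copying the whole structure can be illustrated with a
small Python sketch. The FindParam class and its fields are hypothetical
stand-ins, not the real struct find_param; the point is only that a
shallow copy still shares nested state between work units, while a deep
copy does not.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FindParam:
    # Hypothetical stand-in for find_param; field names are illustrative only.
    pattern: str = "*"
    maxdepth: int = -1
    ost_indices: list = field(default_factory=list)  # nested, mutable state


def make_work_unit_param(template: FindParam) -> FindParam:
    """Return a private copy so a worker can mutate it freely."""
    return copy.deepcopy(template)


template = FindParam(pattern="*.c", ost_indices=[0, 1])
shallow = copy.copy(template)            # shares the ost_indices list
deep = make_work_unit_param(template)    # owns its own ost_indices list

# A worker mutating nested state through the shallow copy leaks into the
# template, while the deep copy is unaffected.
shallow.ost_indices.append(2)
```

Copying everything costs a little more per work unit, but it keeps
working when a new nested field is added later.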
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7fbf909b3fc88ca4a4300abc7e4ccea776fff629
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:30:39 +0000 (00:30 -0400)]
LU-15358 tests: fix sanity-flr.sh test 0b syntax
local=cnt is clearly a typo and should be just local
Change-Id: I055c65eca5fe356dd5d180b4c8bf238c9f27c179
Fixes: 0c710a46cfb4 ("LU-11022 lfs: remove mirror by pool name")
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59446
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Alex Zhuravlev [Tue, 27 May 2025 17:35:52 +0000 (20:35 +0300)]
LU-19065 osc: remove extra linefeed from debug
OSC_DUMP_GRANT() has its own trailing linefeed, so the callers
shouldn't pass an extra \n in their messages.
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifc5f01c3d79dcbd2619c1bcba9305c635006e8d9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59458
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 27 May 2025 08:28:15 +0000 (11:28 +0300)]
LU-19064 tests: sanity/851 to use correct host
sanity/851 should use correct hostname to run well on a local setup.
Test-Parameters: trivial env=ONLY=851,ONLY_REPEAT=10 testlist=sanity
Test-Parameters: trivial env=ONLY=851,ONLY_REPEAT=10 testlist=sanity
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id8c62cc8fa7c5e57cef70e549652d30db94a0740
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59454
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:18:22 +0000 (00:18 -0400)]
LU-15358 tests: sanity-quota wait_reintegration wrong quotes
Looks like these quotes need to be escaped otherwise they
just unquote the variable and grep might get confused.
Test-Parameters: trivial
Fixes: c2db06180b29 ("LU-2183 quota: quota tests for DNE")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I3b129e3924da4cbe4d6baa6e8c958881a799de26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59445
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bruno Faccini [Thu, 5 Jun 2025 14:27:51 +0000 (16:27 +0200)]
LU-19091 ptlrpc: protect internal access to obd->obd_svc_stats
The PM-QoS patch from LU-18446, which uses the OBD svc stats to
evaluate how long low CPU latency should be kept, introduced a
new, internal way to access obd->obd_svc_stats. This access now
requires protection against concurrent use beyond simply
removing the external tunables in /sys or /debug.
Fixes: 54a64ea818 ("LU-18446 ptlrpc: lower CPUs latency during client I/O")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I45a5f65216fa2bf0821776ff3141fa8e2a33f10e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59593
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Sat, 31 May 2025 06:06:53 +0000 (02:06 -0400)]
LU-17000 obdclass: Fix mem leak in lcfg_setparam_client
If the call to llapi_param_get_paths() fails, tmp_path
is left unfreed.
Test-Parameters: trivial
CoverityID: 457066 ("Resource leak")
Fixes: 10a04e32 (LU-16724 ptlrpc: refactor page pools patch 3)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib8962675fcb06a4d6b1539340f4a005dd65b7e02
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Cyril Bordage [Fri, 30 May 2025 16:07:08 +0000 (18:07 +0200)]
LU-18897 o2iblnd: NULL pointer dereference
When the network is flapping, we could get an
RDMA_CM_EVENT_UNREACHABLE event before conn is created, so we should
check the value first.
Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I8d9777370b927c28ee438687de596e498d64bb07
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Fri, 30 May 2025 15:45:41 +0000 (18:45 +0300)]
LU-19076 ptlrpc: resend can hit original req
The client may need to resend a request if the reply buffer can
not fit the reply (LOVEA has just changed, for example).
In some environments (e.g. server and client share the same node),
a resent RPC can find the original RPC on the export's list, and the
server just drops the resent RPC, thinking it's a duplicate.
This way the client gets no reply for the resent RPC and times
out.
If this problem happens during a layout refresh, where the client
holds the layout lock while requesting LOVEA with MDS_GETXATTR, then
the server can evict the client.
The patch removes the RPC from the export's list just before sending
a reply, as the RPC has already been processed, and for non-idempotent
requests reconstruction should take place.
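The ordering problem can be sketched in a few lines of Python. This is a
toy model under stated assumptions, not the ptlrpc code: ToyExport, its
in_flight set and the on_reply callback are hypothetical, and the resend
is assumed to carry the same XID as the original request.

```python
class ToyExport:
    """Toy model of an export tracking the requests it is handling."""

    def __init__(self, remove_before_reply):
        self.remove_before_reply = remove_before_reply
        self.in_flight = set()   # XIDs of requests currently on the list
        self.log = []

    def handle(self, xid, on_reply=None):
        if xid in self.in_flight:
            # Looks like a duplicate of a request still being handled.
            self.log.append(("dropped", xid))
            return
        self.in_flight.add(xid)
        self.log.append(("processed", xid))
        if self.remove_before_reply:
            # Patched ordering: leave the list before the reply goes out.
            self.in_flight.discard(xid)
        if on_reply is not None:
            on_reply()           # the client may react before we return
        self.in_flight.discard(xid)


def run(remove_before_reply):
    exp = ToyExport(remove_before_reply)
    # The reply does not fit, so the client resends the same XID at once.
    exp.handle(7, on_reply=lambda: exp.handle(7))
    return exp.log
```

With the old ordering the resend is dropped and the client would wait
for a reply that never comes; with the new ordering the resend is
accepted (a real server would reconstruct the reply for a
non-idempotent request rather than execute it again).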
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I48437ad018b9b43b9fff4157203906fd84b6cfd3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Thu, 29 May 2025 18:36:16 +0000 (18:36 +0000)]
LU-19072 lnet: don't crash if ni_status is NULL
When reading LNet tunables, ni_status can be NULL. This
triggers an LASSERT() rather than gracefully handling it.
Instead, don't crash. Remove the LASSERT().
lnet_ni_get_status_locked() already handles a NULL ni_status.
While it's questionable whether ni_status == NULL should be
LNET_NI_STATUS_UP or LNET_NI_STATUS_DOWN, it definitely
should not crash.
Also, use lnet_ni_get_status() instead of
lnet_ni_get_status_locked().
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1d8ba9b5f6478d2a915ac6c7f33c22d1742c43d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Tue, 27 May 2025 15:39:28 +0000 (21:09 +0530)]
LU-17000 llite: Handle not NUL terminated buffer in ll_statahead_info
Match ll_statahead_info:sai_fname(target) array
length with llapi_lu_ladvise2:lla_buf(source).
Test-Parameters: trivial
CoverityID: 400216 ("Buffer not null terminated")
Fixes: 1288681b (LU-14361 statahead: add statahead advise IOCTL)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id898ab4b49d54bd734831c09e3de725533e7c249
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59456
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Timothy Day [Sun, 25 May 2025 05:37:19 +0000 (01:37 -0400)]
LU-17848 osd: dt_tunables_fini() and friends return void
The function dt_tunables_fini() can't really fail. Make
it return void. Make the various osd_procfs_fini()
implementations also return void.
Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5edc7ed43fad69d6ebdd734d8e9fdc69cdcf0915
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sun, 25 May 2025 02:47:12 +0000 (22:47 -0400)]
LU-18813 osd-wbcfs: use common rwsem for osd_object
Use a common read/write semaphore for all osd_object
attributes.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I16678e57596365ce25d978e2b5a524fc4c21bf26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59417
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 24 May 2025 03:56:38 +0000 (21:56 -0600)]
LU-19053 build: allow specifying "make rpms" build dir
Currently "make rpms" will create a temporary directory with mktemp
to hold the intermediate build products, and this ends up in /tmp.
This can cause issues if /tmp is not large enough for the full build.
Allow specifying "BUILDDIR=DIR" to redirect the intermediate build
products into the specified directory. This allows you to run:
BUILDDIR=/var/tmp make rpms
or
BUILDDIR=/var/tmp make debs
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I12f2e7444f0fc7f09f41d64b8e4dd4a429797a37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Li Dongyang [Fri, 23 May 2025 10:05:47 +0000 (20:05 +1000)]
LU-14712 ldiskfs: keep EXT4_BG_TRIMMED flag in memory
Keep the EXT4_BG_TRIMMED flag in memory for the trimmed block groups
so that a filesystem without the track_trim superblock bit (e.g. an
existing filesystem created with an earlier version of e2fsprogs) can
still skip trimmed block groups during fstrim as long as it's mounted.
For persistent trimmed block group tracking, track_trim should be
turned on with tune2fs.
Change-Id: I19df047c717d3b20310fcba7fa682b6dfab9d5e4
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Boyko [Fri, 16 May 2025 12:38:12 +0000 (14:38 +0200)]
LU-19015 llog: logic for skipping a zeroed record
For ENOSPC errors during dt_write() and thread races, the changelog
could have a sparse file with zeros inside. The current processing
logic skips the records up to the next chunk.
The patch adds the ability to skip only zeros in the buffer and start
from a valid record.
The fix also changes llog_test 8 so that it uses a non-zero byte for
corruption.
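The idea of skipping only the zeroed part of a buffer can be sketched as
follows. This is a simplified model, not the llog code: the record
layout (a 4-byte little-endian total length followed by the payload),
the 8-byte alignment and the helper names are assumptions made for the
illustration.

```python
import struct

REC_ALIGN = 8  # assumed record alignment for this sketch


def skip_zeros(buf: bytes, offset: int = 0) -> int:
    """Return the offset of the first aligned word that is not all zeros,
    or len(buf) if the rest of the buffer is zeroed."""
    pos = offset
    while pos + REC_ALIGN <= len(buf):
        if buf[pos:pos + REC_ALIGN] != b"\0" * REC_ALIGN:
            return pos
        pos += REC_ALIGN
    return len(buf)


def iter_records(buf: bytes):
    """Yield (offset, payload) for each record, stepping over zeroed holes."""
    pos = 0
    while True:
        pos = skip_zeros(buf, pos)
        if pos + 4 > len(buf):
            return
        (rec_len,) = struct.unpack_from("<I", buf, pos)
        if rec_len < REC_ALIGN or rec_len % REC_ALIGN or pos + rec_len > len(buf):
            return  # corrupt or truncated record: this sketch just stops
        yield pos, buf[pos + 4:pos + rec_len]
        pos += rec_len


def make_record(payload: bytes) -> bytes:
    """Build a record padded up to REC_ALIGN (helper for the example)."""
    total = 4 + len(payload)
    total += -total % REC_ALIGN
    return struct.pack("<I", total) + payload.ljust(total - 4, b"\0")


# Two valid records separated by a 16-byte zeroed hole.
buf = make_record(b"abcd") + b"\0" * 16 + make_record(b"wxyz")
records = list(iter_records(buf))
```

Here the record after the hole is still found, instead of everything up
to the next chunk being discarded.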
Fixes: cb1290768df9 ("LU-18218 mdd: changelog specific write function")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I7263764ba6a89f226995b8967631eaa6d5bdd4dd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 13 May 2025 21:37:17 +0000 (21:37 +0000)]
LU-19013 lnet: fix wording for GDS configure check
The wording on the GDS/CUDA configuration options is
incorrect. If the user does not specify external headers,
Lustre will fall back to the embedded headers rather
than disabling GDS.
Fix the wording on the configure options, improve the
macro name, and reorganize the header such that the correct
definitions are under the ifdef.
Fixes: c65eabc2b113 ("LU-15189 build: add GDS configure options")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I645bc1c0c4bf26bdb9841c849b6cf8eebdc0bdee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59215
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Timothy Day [Wed, 16 Apr 2025 16:26:50 +0000 (16:26 +0000)]
LU-18162 mgc: convert Management Client to LU device
Convert MGC to use LU device init/fini rather
than the legacy OBD API.
Also, use ldo_process_config rather than the legacy
o_process_config.
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic33aeb0d1effabc25b538d946611c3a0b189150e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58826
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Giardi Sylwyn [Mon, 24 Mar 2025 13:52:02 +0000 (14:52 +0100)]
LU-16767 mdt: Allow jobID fields widths
Modify the function jobid_interpret_string to allow the admin to
specify the widths of the fields printed by
lctl get_param mdt.*.job_stats.
By specifying the parameter jobid_name, the admin can truncate the
fields.
For example, the format "%3e.%u.%6h" will print in job_stats
the first 3 characters of the executable name, a dot, the whole uid,
a dot, and the first 6 characters of the hostname.
If no digit is passed before the letter, it will print the whole field.
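The truncation rule described above can be sketched in Python. This is
an illustration of the format semantics only, not the kernel
jobid_interpret_string implementation: interpret_jobid and the fields
dictionary are hypothetical, and only the e, u and h codes named in this
message are covered.

```python
import re

# "%", an optional decimal width, then a one-letter field code.
_SPEC = re.compile(r"%(\d*)([A-Za-z])")


def interpret_jobid(fmt: str, fields: dict) -> str:
    """Expand %<width><code> specifiers, truncating to width when given."""

    def expand(match):
        width, code = match.group(1), match.group(2)
        if code not in fields:
            return match.group(0)  # unknown code: left untouched in this sketch
        value = str(fields[code])
        return value[:int(width)] if width else value

    return _SPEC.sub(expand, fmt)


fields = {"e": "mpirun", "u": 1000, "h": "client-node-17"}
result = interpret_jobid("%3e.%u.%6h", fields)   # "mpi.1000.client"
```

Without a width, as in "%e", the whole field is printed.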
Signed-off-by: Giardi Sylwyn <sylwyn.giardi@cea.fr>
Change-Id: Ifd94b354cef07a7fff5e70c94c313a7e4617e2f8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58822
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Tue, 15 Apr 2025 05:21:08 +0000 (05:21 +0000)]
LU-18162 kunit: convert llog unit test to LU device
Convert OBD test to use LU device init/fini rather
than the legacy OBD API.
Test-Parameters: trivial testlist=sanity env=ONLY=60a,ONLY_REPEAT=25
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iab7e3109ac061be826b0d7695fcc69e0dee2346d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 3 Apr 2025 22:16:46 +0000 (18:16 -0400)]
LU-18824 utils: Fix lfs migrate with --overstripe-count
The --overstripe-count (-C) option was not being properly
honored during file migration. When using lfs migrate with
this option, the overstriping flag was set but the
LLAPI_LAYOUT_OVERSTRIPING pattern was not applied to the
destination file.
This was because in lfs_setstripe_internal(), the code only
set lsa.lsa_pattern = LLAPI_LAYOUT_OVERSTRIPING when not in
migrate mode.
Fix this by always setting the pattern when the overstriped
flag is true, regardless of whether we're in migrate mode or
not.
Added a test case (27X) to verify that lfs migrate properly
applies overstriping.
NB: This fix and test were generated and tested by the VS
Code Augment Agent after being given the LU URL and some
prompting.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I734b9d4e3c699e335c9d810bba2e2d2a1c301ed6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58672
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Sat, 29 Mar 2025 23:18:54 +0000 (19:18 -0400)]
LU-18687 doc: move man *.1 pages to Documentation/man1
Consolidate all of the man pages into the top
level Documentation directory.
Move all of the Lustre man pages (from section 1) to Documentation/.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ied472c7612996cfd04670f5b2803bfb48d2bf74a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58587
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Thu, 27 Mar 2025 01:42:38 +0000 (08:42 +0700)]
LU-18852 build: Compatibility updates for kernel v6.14
Linux commit v6.13-rc1-1-g6fba89813ccf
lsm: ensure the correct LSM context releaser
struct lsm_context is now upstream, provide an lsmcontext
mapping for Ubuntu
Linux v6.13-rc1-7-g5be1fa8abd7b
Pass parent directory inode and expected name to ->d_revalidate()
Adjust d_revalidate() to handle the extra arguments.
Use FMODE_NONOTIFY now that __FMODE_NONOTIFY macro is dropped.
Test-Parameters: trivial
HPE-bug-id: LUS-12797
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4ea10d171ab83e6cadb7d03580e9a2748c0d60b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Wed, 14 May 2025 00:25:08 +0000 (17:25 -0700)]
LU-18668 kernel: new kernel [RHEL 9.6 5.14.0-570.16.1.el9_6]
This patch makes changes to support new RHEL 9.6 release
for Lustre client.
Test-Parameters: trivial env=SANITY_EXCEPT="17p" \
mdtcount=4 mdscount=2 clientdistro=el9.6 testlist=sanity
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-3
Change-Id: Idf8c96ee9389978d9497da73b05c5ed400c429d4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Tue, 13 May 2025 00:20:16 +0000 (17:20 -0700)]
LU-18970 kernel: update RHEL 8.10 [4.18.0-553.51.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.51.1.el8_10.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3
Change-Id: I210fcf4be1bf39a0cb6fc64dcdfa898bb98f87ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59201
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 13 May 2025 00:14:30 +0000 (17:14 -0700)]
LU-18969 kernel: update RHEL 9.5 [5.14.0-503.40.1.el9_5]
Update RHEL 9.5 kernel to 5.14.0-503.40.1.el9_5.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.4 testlist=sanity
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.4 serverdistro=el9.5 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3
Change-Id: I62b270ad85126e6022eaf04ddbd32898fb4dc320
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59200
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Zarochentsev [Wed, 28 May 2025 17:29:26 +0000 (17:29 +0000)]
LU-19070 dne: dir migrate allowed only for root
The current implementation of lfs migrate -m
relies on setxattr(, "trusted.lmv", ) which is
allowed only for users with the CAP_SYS_ADMIN capability.
Adding the same check to ll_migrate() will prevent
incomplete migrations from a non-root user.
Add error reporting to cb_migrate_mdt_fini().
Fixes: 0a83d948f3 ("LU-4684 migrate: shrink dir layout after migration")
Fixes: 2dae2b8ffb ("LU-8777 mdt: add parameter to disable remote/striped dir")
HPE-bug-id: LUS-12895
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I58d417b64e2b634d76e4ad38685deb21d9ce8a86
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Mon, 26 May 2025 00:30:24 +0000 (20:30 -0400)]
LU-19008 hsm: add locking for coordinator thread stop
There is no locking around thread stop, so mdt_coordinator() and
mdt_hsm_cdt_stop() can race, which can lead to a use-after-free
during unmount. Add locking to avoid this.
Fixes: 4512347d6c ("LU-16356 hsm: add running ref to the coordinator")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I996a79fcbca3b1c6f6a0f5ee5d9f052f31eda61f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59425
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 22 May 2025 02:58:06 +0000 (20:58 -0600)]
LU-5969 lnet: use LGPL-2.1+ for SPDX headers
The change from explicit LGPL license text to SPDX headers
introduced a number of incorrect license identifiers, because
the "or (at your option) any later version" text was missed.
Convert remaining library license blocks over to SPDX LGPL-2.1+.
Reorder copyright and file description to be consistent.
Remove filenames explicitly listed in the header block.
Test-Parameters: trivial
Fixes: e6aefbfaa6 ("LU-6142 libcfs: SPDX for libcfs module")
Fixes: 56a9ba02ae ("LU-6142 libcfs: SPDX for libcfs headers")
Fixes: c9a7728476 ("LU-6142 lnet: SPDX for lnet/include/ and misc files")
Fixes: 9e3fd9ce8f ("LU-6142 lnet: SPDX for lnet/util/lnetconfig/")
Fixes: 0f39311369 ("LU-6142 lnet: SPDX for lnet/utils/")
Fixes: 14e981db6c ("LU-6142 misc: SPDX for Lustre headers")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic2e9f70f82211ce5231c12d431ca63dc163ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59367
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Cory Spitz <cory.spitz@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Tue, 13 May 2025 15:59:27 +0000 (18:59 +0300)]
LU-18986 mgc: client part of new registration protocol
Use new target registration protocol in MGC.
It uses the inline buffer mtn_inline_list[] if the NIDs fit into
it, or prepares a bulk transfer for a large list of NIDs.
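The inline-or-bulk decision can be sketched as below. Everything
concrete here is an assumption for illustration: the 64-byte
INLINE_BUF_SIZE, the encoding of NIDs as NUL-terminated strings and the
function names are hypothetical, not the real MGC values or wire format.

```python
INLINE_BUF_SIZE = 64  # hypothetical capacity in bytes, not the real size


def pack_nids(nids):
    """Encode NIDs as NUL-terminated ASCII strings, back to back."""
    return b"".join(nid.encode("ascii") + b"\0" for nid in nids)


def choose_transport(nids, inline_size=INLINE_BUF_SIZE):
    """Return ("inline", payload) if the NIDs fit, else ("bulk", payload)."""
    payload = pack_nids(nids)
    if len(payload) <= inline_size:
        return "inline", payload
    return "bulk", payload


few = ["192.168.1.10@tcp"]
many = ["10.0.0.%d@o2ib" % i for i in range(20)]
few_mode, few_payload = choose_transport(few)
many_mode, _ = choose_transport(many)
```

A single NID fits inline in this model, while twenty NIDs exceed the
assumed buffer and go through the bulk path.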
Test-Parameters: testlist=runtests mdsversion=EXA6.3.2
Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifc0fd24d7eb26dd092c3e9cce895980b26f0524d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59212
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Marc Vef [Tue, 13 May 2025 11:27:31 +0000 (13:27 +0200)]
LU-18756 sec: add resource id check to oss and mds
This patch includes the resource id check into the relevant code paths
on the oss and mds side. It is therefore included for the following
operations.
On the MDT-side:
- open
- create (file and directory)
- unlink (file and directory)
- setattr
- setxattr
- getxattr
- rename
- link
On the OST-side and on the MDT-side for Data on MDT (DoM) files:
- write
- read
- truncate
- fallocate
Some caveats:
The resource id check is not included for MDS_GETATTR RPCs due to
functional and usability concerns. Specifically for the latter, the
"struct stat" would no longer be filled resulting in "?" when running
"ls -l", which can be misunderstood.
Also, if the check is only enabled on the OST-side, writes are only
denied for "sync"/"fsync"-type operations on a file as the check is at
the server-side. If the check is enabled on the MDT-side, write-access
is denied before the OST_WRITE RPC is sent, i.e., immediately
returning the access denied error code. If a file is still in the page
cache before the check is enabled, a client can still read the local
copy of the file, which is expected.
Sanity-sec test 75a was added to exercise the ID check for the above
cases in several disciplines, further testing that access to
neighboring nodemap offset ranges works as expected.
Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I040ddb1b934707baa84b492337139f45b856692e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Marc Vef [Tue, 13 May 2025 11:13:50 +0000 (13:13 +0200)]
LU-18756 sec: add generic nodemap resource id check
This patch represents the first patch in the series to check the OST
object and MDT inodes UID/GID against the nodemap offset range. This
patch adds the corresponding functions on the OST, MDT, and nodemap
sides for the resource ID check. A resource is defined as an MDT inode
or OST object. This patch does not yet connect the functions to the
relevant codepaths. The patch further adds the new "lctl set_param"
configurables, which are (for now) disabled by default:
- "lctl set_param mdt.*.enable_resource_id_check={0,1}" toggling the
check on the MDT side.
- "lctl set_param obdfilter.*.enable_resource_id_check={0,1}" toggling
the check on the OST side.
These configurables work individually but should be toggled together.
The ID check relies on the "nodemap_map_id()" functionality to
guarantee compatibility with the nodemap mapping functionality, e.g.,
covering both offset and mapping cases, among others. The ID check
therefore functions as follows:
If "nodemap_map_id()" returns the squashed value for both UID and GID
for a given client export, "fs_uid", and "fs_gid" stored on the MDT
inode and OST object, access is not permitted to the resource. It
does not rely on any IDs given by the client. The corresponding
permission bits or ACLs are not taken into consideration and are
only relevant later if access was permitted elsewhere.
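The decision rule above can be sketched with a toy model. ToyNodemap is
a hypothetical stand-in that only models an offset range. It is not
"nodemap_map_id()", and the squash value 65534 is an arbitrary choice
for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNodemap:
    """Toy nodemap: filesystem IDs in [offset, offset + limit) belong to it."""
    offset: int
    limit: int          # keep limit <= squash so no mapped ID equals squash
    squash_uid: int = 65534
    squash_gid: int = 65534

    def map_fs_to_client(self, fs_id: int, squash: int) -> int:
        # IDs outside this nodemap's offset range map to the squash value.
        if self.offset <= fs_id < self.offset + self.limit:
            return fs_id - self.offset
        return squash


def resource_id_check(nm: ToyNodemap, fs_uid: int, fs_gid: int) -> bool:
    """Return True if access may proceed to the normal permission checks.

    Access is refused only when BOTH the UID and the GID stored on the
    resource map to the squashed value for this client's nodemap.
    """
    uid_squashed = nm.map_fs_to_client(fs_uid, nm.squash_uid) == nm.squash_uid
    gid_squashed = nm.map_fs_to_client(fs_gid, nm.squash_gid) == nm.squash_gid
    return not (uid_squashed and gid_squashed)


nm = ToyNodemap(offset=100000, limit=1000)
own_file = resource_id_check(nm, 100500, 100500)       # inside the range
neighbor_file = resource_id_check(nm, 200500, 200500)  # another range
mixed_file = resource_id_check(nm, 100500, 200500)     # only GID squashed
```

Note that passing this check does not grant access by itself. As stated
above, permission bits and ACLs are still evaluated afterwards.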
Test-Parameters: trivial
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I818c511cd37251843bcfa6b873ef8bdc05176980
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 8 May 2025 19:07:37 +0000 (22:07 +0300)]
LU-18986 mgs: server part of new registration protocol
Rework mgs_target_reg() to handle the new protocol along with
the old one for older targets.
It handles the old protocol with NIDs either in mti_nids or
in mti_nidlist[], and the new protocol with NIDs in
mtn_inline_list[] or bulk.
All NIDs are put in mti_nidlist[] as a result of request
processing, which eliminates the need for extra changes in
the further code path.
Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I41dd487c37136e24328914e33c9ce056be013aae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 8 May 2025 09:40:54 +0000 (12:40 +0300)]
LU-18986 mgs: new target registration protocol
This patch adds a new target registration request format with
enhanced NID list handling. The idea is to avoid overloading
mgs_target_info with extra flags and fields for the NID list
description and to keep such information in a new structure.
The NID list is always an array of strings and can be sent
in various manners: inline buffer, bulk, compressed,
appended, etc.
It also helps to resolve compatibility issues.
Patch includes:
- new wire structure mgs_target_nidlist
- new possible RPC format with mgs_target_nidlist buffer
- new connect flag OBD_CONNECT_MGS_NIDLIST to replace
obsoleted OBD_CONNECT_REQPORTAL removed in commit
1.6.0-159-gd2d56f38da ("make HEAD from b_post_cmd3")
- corresponding swabber and wirecheck
Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I441de467a530137f76712273b9a5f814fdb562c1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59205
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Mon, 27 Jan 2025 16:44:25 +0000 (17:44 +0100)]
LU-17410 sec: per-nodemap capabilities mask
Add a per-nodemap capabilities mask, used in preference to the global
enable_cap_mask parameter if it is set.
The new nodemap property is named enable_cap_mask, and can be set
thanks to the new lctl command 'nodemap_set_cap'. It is possible to
specify capabilities in hex or with symbolic names, with '+' and '-'
prefixes to respectively add or remove corresponding capabilities.
We support defining 2 types of capabilities, either a "set" so that it
is possible to add capabilities, or a "mask" to reduce capabilities of
the client.
This per-nodemap capabilities mask is available on any nodemap
including the default nodemap.
A dynamic child nodemap is allowed to define only a subset of the
capabilities set on the parent, unless the child_raise_privileges
property has the 'caps' privilege.
sanity-sec test_51 is enhanced to exercise this new nodemap property.
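The '+' and '-' prefix handling described above can be illustrated with a small C sketch. Everything here is an assumption made for illustration: cap_apply_token(), the cap_names[] table, and the symbolic spellings are hypothetical and do not reflect the parser behind 'nodemap_set_cap'. The bit positions follow the Linux capability numbering for CAP_CHOWN, CAP_DAC_OVERRIDE, and CAP_FOWNER.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name table; bits follow Linux capability numbers. */
struct cap_name {
	const char *name;
	unsigned int bit;
};

static const struct cap_name cap_names[] = {
	{ "cap_chown", 0 },
	{ "cap_dac_override", 1 },
	{ "cap_fowner", 3 },
};

/*
 * Apply one token such as "+cap_chown" or "-cap_fowner" to *mask.
 * Returns 0 on success, -1 for a missing prefix or an unknown name.
 */
static int cap_apply_token(uint64_t *mask, const char *token)
{
	size_t i;

	assert(mask != NULL && token != NULL);
	if (token[0] != '+' && token[0] != '-')
		return -1;

	for (i = 0; i < sizeof(cap_names) / sizeof(cap_names[0]); i++) {
		if (strcmp(token + 1, cap_names[i].name) != 0)
			continue;
		if (token[0] == '+')
			*mask |= UINT64_C(1) << cap_names[i].bit;
		else
			*mask &= ~(UINT64_C(1) << cap_names[i].bit);
		return 0;
	}
	return -1;
}
```

The mask is left untouched when a token is rejected, so a caller could report the error without having applied a partial update.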
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1ed91c721d869d0596af9c2d7e07a2c411f2b7c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Boyko [Fri, 13 Dec 2024 12:57:17 +0000 (13:57 +0100)]
LU-18556 hsm: optimize llog record modification
This commit introduces a new llog modification mechanism for HSM
operations to address inefficiencies caused by prior reliance on
catalog processing. The new approach directly modifies the llog record,
eliminating the need for catalog-based processing and reducing
latency.
Key changes include:
* Replacing the hsm_action_item (HAI) with a full in-memory llog
record representation, increasing memory usage by ~80 bytes per
record but removing the need for a dedicated llog cookie hash
table.
* Unifying the coordinator's read/store logic for HAI data into a
single in-memory item shared by mdt_hsm_agent_send() and
mdt_hsm_add_hsr(). This reduces memory allocation steps: only one
cdt_agent_req allocation is now required during llog read
operations, eliminating subsequent allocations/copies.
Performance results on VMs 2 MDTs/2 OSTs/2 Clients no-op copytool:
Test 1 (1M archive requests): 572s -> 187s (~3 times faster)
Test 2 (1M archive + 1M queued): 558s -> 392s (~1.4 times faster)
HPE-bug-id: LUS-12583
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4b6e697bc3b1f0cf2c76f5433b49affbc933c653
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vitaliy Kuznetsov [Wed, 11 Jun 2025 16:10:39 +0000 (18:10 +0200)]
LU-14772 tests: Add conf-sanity-framework.sh
This patch creates a new file conf-sanity-framework.sh
The functions from conf-sanity.sh will be moved into
this file, and will also be used in other tests with
the conf-* prefix.
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I6e0c53d4e15fa01c341be7a67fcf386c4fb5f0ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Marc Vef [Sun, 25 May 2025 19:02:50 +0000 (21:02 +0200)]
LU-19050 utils: Support long nid lists when getting fs info
When "get_root_path_slow()" is called through various user commands,
e.g., "lfs setquota", the internal "root_cache" is filled with mount
point information. The cache's "nid" field allowed 256 characters
which resulted in a buffer overflow for long nidlists that are set
during mount.
This patch removes this limitation and further removes the "nid" field
from the "root_cache" since it is only needed in the "lfs check"
command. Therefore, the nid list no longer needs to be processed and
put into the cache in the numerous other llapi_* functions where the
nid list is never accessed.
Further, string copy handling was insufficient, allowing the overflow
in the first place, and was updated accordingly for all fields.
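As an illustration of the kind of bounded string copy that avoids such an overflow, here is a minimal C sketch. The helper name copy_field() and its error codes are assumptions made for this example and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Copy src into dst (capacity dst_size) and always NUL-terminate.
 * Returns 0 on success or -E2BIG if src did not fit and was truncated.
 */
static int copy_field(char *dst, size_t dst_size, const char *src)
{
	int len;

	assert(dst != NULL && src != NULL && dst_size > 0);

	/* snprintf() never writes more than dst_size bytes. */
	len = snprintf(dst, dst_size, "%s", src);
	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= dst_size)
		return -E2BIG;
	return 0;
}
```

Returning an error on truncation, instead of silently shortening the value, lets the caller decide whether a partial mount point or NID string is acceptable.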
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I3d9c30795fba14618368b7b9e1769fe0b07d3fc7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Sat, 7 Jun 2025 23:10:47 +0000 (19:10 -0400)]
New tag 2.16.56
Change-Id: Iabf1977eeb273e629a3ea4c6ba75a3eadaa8be2a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Nathaniel Clark [Thu, 22 May 2025 11:53:20 +0000 (07:53 -0400)]
LU-19039 lnetconfig: Fix error string in cyaml output
String output in yaml only needs to be quoted when it begins with '@',
''', '"', or '- ', or when it contains ':'.
This corrects the most common error output for `lnetctl ping` errors
to be correct yaml and also cleans up all other error strings output.
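The quoting rule stated above can be written as a short predicate. This is a sketch of the rule as worded in this message only: yaml_str_needs_quotes() is a hypothetical name, not a function from lnetconfig or cyaml, and full YAML has further cases that this does not cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if s begins with '@', '\'', '"' or "- ", or contains ':'. */
static bool yaml_str_needs_quotes(const char *s)
{
	assert(s != NULL);

	if (s[0] == '@' || s[0] == '\'' || s[0] == '"')
		return true;
	if (strncmp(s, "- ", 2) == 0)
		return true;
	return strchr(s, ':') != NULL;
}
```

Under this rule an error string such as "failed to ping 10.0.0.1@tcp: Input/output error" is quoted because of the ':', while "no route" is emitted as-is.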
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I9a8436280b34f82cf78152e488b68c0581cc2a7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:08:26 +0000 (00:08 -0400)]
LU-15358 tests: Escape quote symbols in sanityn test 26b
Shellcheck highlights that those quotes are actually unquoting
the variables. And looking at prior code we really try to ensure
you can tell which one is which even when some of them are empty
or have spaces.
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I2cdd0dcc1bce59b397f928cffeb790c74d8dc311
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 17 Apr 2025 10:17:32 +0000 (13:17 +0300)]
LU-16818 tests: ignore more opcodes in replay-single/65a
Ignore a few more opcodes which can interfere with testing:
MDS_STATFS, OST_STATFS, OST_DISCONNECT and OST_PRECREATE.
Test-Parameters: env=ONLY=65a,ONLY_REPEAT=100 testlist=replay-single
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib730b540b9075e0ed871bc11f3bdfb4cfd4634a1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58838
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sun, 25 May 2025 00:17:54 +0000 (18:17 -0600)]
LU-18276 tests: add debugging to sanity-pfl/16b
Add extra debugging messages to sanity-pfl.sh test_16b to help find
what is causing this test to fail with ENOSPC intermittently.
Reduce size of overstriped PFL file layout slightly, so that two
such components can fit within the xattr size limit, which may or
may not be the cause of the ENOSPC failures.
Print a message in llapi_layout_file_open() if ENOSPC is hit, so
that we can determine the xattr size, in case it is too large.
Move layout conversion before file open() to avoid contacting
MDS needlessly if the layout is bad.
Test-Parameters: trivial testlist=sanity-pfl env=ONLY=16b,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf347e231147041dda07277227e80f0b6f2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Aurelien Degremont [Fri, 23 May 2025 14:00:23 +0000 (16:00 +0200)]
LU-19051 config: silent spurious messages while checking mpitests
When detecting mpicc configuration, do not print warnings
or error messages in the middle of configure output.
Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: If536aa1d04f0d641a7b2a721869261c85907e084
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Fri, 23 May 2025 03:32:26 +0000 (21:32 -0600)]
LU-19046 mgc: mgc_fs_setup() should wait interruptibly
When a target mounts, it fetches a copy of its config log from the
MGS to store in the local filesystem. However, the MGC can currently
only fetch the config log for one target filesystem at a time.
This should be improved in a separate patch.
If the MGS is inaccessible, or there is a problem during setup, the
server will wait for it while holding cl_mgc_mutex. Other targets on
the same server will be unable to mount, and block on cl_mgc_mutex,
possibly dumping a stack trace like:
INFO: task mount.lustre:93138 blocked for more than 90 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" to disable this
task:mount.lustre state:D stack:0 pid:93138 ppid:93135
Call Trace:
__schedule+0x2d1/0x870
schedule+0x55/0xf0
schedule_preempt_disabled+0xa/0x10
__mutex_lock.isra.11+0x349/0x420
mgc_fs_setup.isra.12+0x65/0x7a0 [mgc]
mgc_set_info_async+0x99f/0xb30 [mgc]
server_start_targets+0x452/0x2c30 [obdclass]
server_fill_super+0x94e/0x10a0 [obdclass]
lustre_fill_super+0x388/0x3d0 [lustre]
mount_nodev+0x49/0xa0
legacy_get_tree+0x27/0x50
vfs_get_tree+0x25/0xc0
do_mount+0x2e9/0x950
ksys_mount+0xbe/0xe0
Use wait_event_interruptible() in mgc_fs_setup() so the server's mount
thread can be interrupted and killed. This does not fix the reason
for the server to be blocked, but it does allow it to be killed.
Rename mgc_fs_cleanup() to mgc_fs_clear() so it is not confused with
actually cleaning up the MGC.
Avoid printing an error if the sptlrpc log is not available. This is
common for most filesystems, and is not an error.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bafa5dae0eadecb112efaf61f8bcf7ea8c4c296
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Fri, 23 May 2025 01:16:04 +0000 (08:16 +0700)]
LU-17242 libcfs: use sched_show_task() for thread dumping
Use sched_show_task() for thread dumping, since it should be
available on all kernels that Lustre supports. On some kernels,
libcfs_debug_dumpstack() is unable to show the thread stack.
Replacing this function avoids that issue.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I421560b0d4223fd3503f4a3697a7615dd43bad8f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Thu, 22 May 2025 06:31:54 +0000 (23:31 -0700)]
LU-19040 kernel: update SLES15 SP6 [6.4.0-150600.23.50.1]
Update SLES15 SP6 kernel to 6.4.0-150600.23.50.1 for Lustre client.
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testlist=sanity
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-1
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-2
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-3
Change-Id: Ie2d530f0edb28326bbcbd1326f40e3e7db845c21
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 05:09:47 +0000 (01:09 -0400)]
LU-18813 osd-wbcfs: refactor osd_device_alloc
osd_device_alloc() has improper error handling.
Refactor the function such that we properly
clean up if __osd_device_init() fails.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia03eb805ef3fdc75c8490e09c66b99e6541d13fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 04:57:57 +0000 (00:57 -0400)]
LU-18813 osd-wbcfs: remove f_op llseek checks
MemFS will always have llseek defined, so we
can remove the checks in the OSD.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I77f7abcef686c9c654b7bee04b3f88bb89a87756
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 04:53:02 +0000 (00:53 -0400)]
LU-18813 osd: fix dcb_func LASSERT
Each OSD was incorrectly asserting that the
address of the function pointer was not NULL,
instead of the function pointer itself.
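The difference between the two checks is easy to miss, so here is a small userspace C sketch of it. struct cb_sketch and both helper names are hypothetical; the real code uses LASSERT() on the dcb_func member of the commit callback structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*commit_cb_t)(void *data);

/* Hypothetical stand-in for a commit callback descriptor. */
struct cb_sketch {
	commit_cb_t cb_func;
	void *cb_data;
};

/* Buggy form: tests the address of the member, which is never NULL. */
static bool check_member_address(const struct cb_sketch *cb)
{
	const void *addr = &cb->cb_func;

	return addr != NULL;
}

/* Fixed form: tests the function pointer stored in the member. */
static bool check_function_pointer(const struct cb_sketch *cb)
{
	assert(cb != NULL);
	return cb->cb_func != NULL;
}
```

For a descriptor whose callback was never set, the first check still passes while the second one fails, which is the case an assertion is supposed to catch.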
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie5682a9d80219743ecb86d8d463cbabcdbf77b64
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59304
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Fri, 4 Apr 2025 01:58:13 +0000 (04:58 +0300)]
LU-19030 quota: lfs quota all respects nodemap
Command lfs quota all should print only IDs from the appropriate
nodemap range. The patch also maps FS quota IDs to client IDs
according to nodemap before returning in a quota all iterator buffer.
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8820e18957805c0dceacc4674713875b024a8e99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59297
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Fri, 16 May 2025 05:50:43 +0000 (01:50 -0400)]
LU-18813 contrib: add an example config.site
The variable CONFIG_SITE can be used to specify
config files to the Autoconf generated configure
script. This is a useful alternative to long
configure command lines.
Add an example config.site file used for compiling
Lustre server (osd-wbcfs) and client for use in
ktest.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6597b860629643ced7191d7a250a86ede2576993
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Marc Vef [Wed, 21 May 2025 11:35:33 +0000 (13:35 +0200)]
LU-19021 ptlrpc: Add obd info to nodemap exports output
When clients connect to MDTs/OSTs, a new export is generated on the
server-side during obd_connect_*() with the client's UUID. For each
target, a separate export is created which is then added to the
nodemap's "nm_member_list", if applicable.
Currently, "lctl get_param nodemap.NM_NAME.exports" prints the UUID
and NID information for each entry in the "nm_member_list". Because
the obd device is not listed, duplicate entries appear to be shown for
each client, which can be confusing for the administrator.
This patch extends the nodemap.NM_NAME.exports output by also showing
the obd the client is connected to, e.g., MDT0000, MDT0001, etc, such
that the shown entries no longer appear as duplicates.
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I681480f9258e57c522acc148f4096a8f40c71eab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59248
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Tue, 13 May 2025 14:53:38 +0000 (17:53 +0300)]
LU-19011 utils: lfs quota -a -u --busage has delimiter
lfs quota all should insert a delimiter between the name and certain
parameters (busage, bhardlimit, bsoftlimit, ...).
Fixes:
7c02893e12 ("LU-18079 utils: argument parse opts for lfs quota")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Icace5752c4a169858792748c5f4b41e336d18cac
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59209
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jihyeon Gim [Sun, 4 May 2025 14:40:53 +0000 (23:40 +0900)]
LU-18971 build: Update ZFS version to 2.3.2
Update ZFS version to 2.3.2. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.3.2
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testlist=sanity
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3
Change-Id: Id2a9780cd3b10e81e0136c0a7dde0cb317b52834
Signed-off-by: Jihyeon Gim <potatogim@gluesys.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59065
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Frederick Dilger [Thu, 1 May 2025 00:53:21 +0000 (18:53 -0600)]
LU-7105 tests: remove deprecated test_28 from sanityn
sanityn.sh test_28 was deprecated in 2022-06 but not removed.
Test-Parameters: testlist=sanityn
Fixes:
51c491dac6 ("LU-10994 test: remove netdisk from obdfilter-survey")
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I524adf575170ae9e78dc1eb5e0e1596ee7252dfe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59052
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sun, 27 Apr 2025 16:55:24 +0000 (12:55 -0400)]
LU-17848 osd: fix deref in ldiskfs osd_health_check()
The implementations of osd_health_check() in ldiskfs
incorrectly check for a NULL mount after already
dereferencing it. Add a check for a NULL mount in
osd_sb() and check for a NULL sb in osd_health_check().
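The pattern of checking before dereferencing can be sketched in userspace C as follows. All struct and function names here are hypothetical stand-ins for the osd-ldiskfs types, and the return values (-ENODEV, 1 for read-only, 0 for healthy) are assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sb_sketch {
	int s_readonly;
};

struct mnt_sketch {
	struct sb_sketch *mnt_sb;
};

struct dev_sketch {
	struct mnt_sketch *od_mnt;
};

/* Return the superblock, or NULL when the device has no mount. */
static struct sb_sketch *dev_sb(const struct dev_sketch *dev)
{
	assert(dev != NULL);
	return dev->od_mnt != NULL ? dev->od_mnt->mnt_sb : NULL;
}

/* Check sb for NULL before it is dereferenced, not afterwards. */
static int dev_health_check(const struct dev_sketch *dev)
{
	struct sb_sketch *sb = dev_sb(dev);

	if (sb == NULL)
		return -ENODEV;
	return sb->s_readonly ? 1 : 0;
}
```

Putting the NULL mount check inside the accessor means every caller gets the same protection, and the health check only needs to test the returned pointer.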
CoverityID: 397885 ("Dereference before null check")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id1ce015eb420fe067be375bf0019f305e3e2718c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58989
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Timothy Day [Sat, 29 Mar 2025 23:37:30 +0000 (19:37 -0400)]
LU-18687 doc: move man *.7 pages to Documentation/man7
Consolidate all of the man pages into the top
level Documentation directory.
Move all of the Lustre section 7 man pages to Documentation/man7.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I9c5a5e36028739b4872e469721acb8d32b61cce1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58590
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Ellis Wilson <elliswilson@microsoft.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Tue, 25 Feb 2025 10:38:34 +0000 (11:38 +0100)]
LU-18116 tests: replay-single test_201 timeout
19s is not enough for some systems to finish the MDT1 umount.
Increase it to 20s + OSTCOUNT seconds.
HPE-bug-id: LUS-12689
Test-Parameters: testlist=replay-single
Fixes:
ffedcbae21f7 ("LU-17809 osp: make disconnect asynchronous")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I900ce107ceb664530bc2165685ba7b88cbd46807
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58203
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Fri, 23 May 2025 00:06:32 +0000 (07:06 +0700)]
LU-19045 build: memfs_write_end can be passed a folio
Linux v6.11-rc1-51-ga225800f322a
fs: Convert aops->write_end to take a folio
Linux v6.11-rc1-52-g1da86618bdce
fs: Convert aops->write_begin to take a folio
Add 'struct folio' for page vs folio signature change.
Fixes:
25813cf8ba ("LU-18813 osd-wbcfs: MemFS-based OSD with writeback support")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1cf87ac70c52652530e4fbc853c5160dc5822ec9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59393
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 28 Sep 2023 00:09:03 +0000 (20:09 -0400)]
LU-10026 ptlrpc: verify large allocations are aligned
If we were ever to do an allocation with kmalloc(), we could
get non-page-aligned memory.
We haven't seen any problem yet, but we are trying to be ready for
this in advance. The arguments from https://lwn.net/Articles/787740/
also look strong.
Let's add an assertion to illuminate this dangerous behaviour.
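A page-alignment check of this kind could look roughly like the sketch below. The helper names, the fixed 4096-byte page size, and the rule that only buffers of at least one page count as "large" are all assumptions for illustration; the patch itself uses kernel assertions and the kernel's own page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* True if ptr sits exactly on a page boundary. */
static bool is_page_aligned(const void *ptr)
{
	return ((uintptr_t)ptr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Only buffers of at least one page are required to be aligned. */
static bool large_alloc_ok(const void *ptr, size_t size)
{
	if (size < SKETCH_PAGE_SIZE)
		return true;
	return is_page_aligned(ptr);
}
```

A caller would wrap large_alloc_ok() in an assertion right after the allocation, so a misaligned buffer is reported at its source.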
EX-bug-id: EX-8245
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Id20898065b516d363d9dc280e71be1b5cfb6f4a7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Patrick Farrell [Thu, 5 Dec 2024 03:51:52 +0000 (22:51 -0500)]
LU-17814 utils: Add work unit management
Add creating and removing the root work unit.
Still no actual find, but closer.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id56e43042b6e6ea776fae20b53837c9eced5098e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57293
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Wed, 4 Dec 2024 23:49:23 +0000 (18:49 -0500)]
LU-17814 utils: implement thread pool
Implement thread pool for parallel find - still uses
regular find code to do the actual find.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If5ebd3b52b93fc54dd4ab86bd8dfe06c4dcc0c11
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57292
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Keguang Xu [Tue, 3 Dec 2024 03:41:32 +0000 (11:41 +0800)]
LU-17055 mdt: add fallocate(FALLOC_FL_ZERO_RANGE) for indirect
This patch implements fallocate(zero) operation on DoM (Data
on Metadata) components where files are mapped in indirect mode.
This functionality is not natively supported in the current
ldiskfs implementation.
We mimic the write procedure ourselves, using brw to write zero content
over the user-specified [start, end) range.
- In mdt_io.c::mdt_fallocate_hdl() we first try the default path
mdt_object_falloc(), which for this (ZERO & IND) case returns
(-EOPNOTSUPP & DT_FALLOC_ERR_NEED_ZERO) as a signal. We then
switch to mdt_object_falloc_zero(), where we iteratively call
dt_bufs_get() for brw preparation and invoke mdt_commitrw_write()
to execute the real write.
Signed-off-by: Keguang Xu <squalfof@gmail.com>
Change-Id: Ic7a9cb46eb94bdddff47795aaa2e4a4276dbe237
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexey Lyashkov [Wed, 20 Nov 2024 08:46:40 +0000 (11:46 +0300)]
LU-18461 lod: let's remove home coded lu_buf code.
lti_ea_store and lti_ea_store_size are the same as lu_buf,
let's drop the code duplication.
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I3a61382e53a848c94654bc4c55b7ac0c97758dbb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57188
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Alexey Lyashkov [Wed, 20 Nov 2024 08:33:12 +0000 (11:33 +0300)]
LU-18461 lod: convert a lti_ea_store to the lu_buf
lti_ea_store + lti_ea_store_size is the same as lu_buf,
let's reuse the code.
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I61334247288736286937eeb0bb3afee0638c28bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexey Lyashkov [Tue, 19 Nov 2024 15:59:59 +0000 (18:59 +0300)]
LU-18461 lod: fix layout conversion.
Let's set the lcm_id after conversion.
Change-Id: I58ce52362c97bcd5f747a05ef994b2de4e69f93c
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 19 May 2025 08:17:21 +0000 (02:17 -0600)]
LU-18446 ptlrpc: fix cpu_latency_work() completion time
Consider the cpu_latency_work() completion when it is equal to the
scheduled jiffies counter, rather than only afterward.
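The off-by-one between "after" and "at or after" a deadline can be shown with userspace copies of wraparound-safe tick comparisons, written in the spirit of the kernel's time_after() and time_after_eq(). Whether the patch uses those exact macros is an assumption; the sketch_ names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True only if tick a is strictly later than tick b (wrap-safe). */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* True if tick a is later than or equal to tick b (wrap-safe). */
static bool sketch_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}
```

With a deadline of tick 1000, the strict comparison does not treat tick 1000 itself as complete, while the inclusive comparison does, which matches the behaviour described in this message.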
Fixes:
54a64ea818 ("LU-18446 ptlrpc: lower CPUs latency during client I/O")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I985d29b493bda02aa69ad54d9aae581a05fad685
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bruno Faccini <bfaccini@nvidia.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Mon, 28 Apr 2025 15:41:31 +0000 (11:41 -0400)]
LU-13814 osc: reduce queue use in __osc_dio_submit
This patch removes another queue use in __osc_dio_submit.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I6f15d4110291cdca3d3c894916ed3c894f2ad9ce
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52161
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Mon, 28 Apr 2025 15:39:28 +0000 (11:39 -0400)]
LU-13814 osc: begin converting queue_dio_pages
This patch begins the lengthy process of converting
osc_queue_dio_pages to use the page array rather than the
lists. This will be a lengthy process because this ties in
to the OSC extent and BRW code.
Test-Parameters: testlist=sanity-sec env=ONLY=52,59a,59b
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I788faa0748a88045d838fb530107938a639407d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52160
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Patrick Farrell [Tue, 29 Aug 2023 15:37:33 +0000 (11:37 -0400)]
LU-13814 osc: convert osc_dio_submit to array
This is a trivial first step, converting from list to the
array of cl_pages, which is not a very useful step but
does have to be done as we proceed here.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib208bc2ed2ad95602a5fde79a4b39652ff73d9bf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52159
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Patrick Farrell [Mon, 28 Apr 2025 16:12:28 +0000 (12:12 -0400)]
LU-13814 osc: cleanup osc_completion
Removing oap_cmd usage in osc_completion makes it easier to
remove osc page usage for DIO.
There's also no need for the osc_ap_completion wrapper, so
remove it.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie72eecded73b0444a51b0273bf8913bfaee001bd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52141
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Sun, 25 Feb 2024 18:06:38 +0000 (13:06 -0500)]
LU-13814 osc: add osc_queue_dio_pages
For dio, we need to replace osc_queue_sync_pages with a
specialized function. This adds that function and starts
specializing it to DIO only.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1eb996578cf2fc4d758d959ffc7f9b48225374ce
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52140
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Vladimir Saveliev [Wed, 4 Jun 2025 16:18:14 +0000 (19:18 +0300)]
LU-19059 lov: fix lov_stripe_set
Prevent overflow on multiplication of 32 bit integers.
Test to illustrate the issue is added to hit overflow in both branches
of lov_stripe_set().
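A minimal sketch of this class of bug, assuming a hypothetical helper that multiplies two 32-bit values; it is not the actual lov_stripe_set() code, which is not shown here. The product of two 32-bit operands is computed in 32 bits and wraps before it is widened, unless one operand is cast first.

```c
#include <stdint.h>

/*
 * Hypothetical buggy form: the multiply is done in 32 bits, so the
 * result wraps modulo 2^32 before it is converted to 64 bits.
 */
static uint64_t stripe_set_size_wrapping(uint32_t stripe_size,
					 uint32_t stripe_count)
{
	return stripe_size * stripe_count;
}

/* Hypothetical fixed form: widen one operand so the multiply is 64-bit. */
static uint64_t stripe_set_size(uint32_t stripe_size, uint32_t stripe_count)
{
	return (uint64_t)stripe_size * stripe_count;
}
```

For example, a 1 GiB stripe size with 8 stripes yields 0 in the wrapping form and 8 GiB in the widened form.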
Fixes: 6eee4ea5b6 ("LU-6174 lov: use standard Linux 64 divison macros")
HPE-bug-id: LUS-12782
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: If5a5aaf8de98c79b3df908e0d052461a3c8d0989
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59436
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Wed, 7 May 2025 21:59:26 +0000 (17:59 -0400)]
LU-18959 ofd: remove sync_on_lock_cancel tunable
Now that enough time has passed, this tunable is no longer needed.
Revert "LU-12967 ofd: restore sync_on_lock_cancel tunable"
This reverts commit 7df7347b7b188e7168e094304fd6d2d985f7f274,
except for the sanity.sh portion.
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I871ea7ba90bb2f193c967cdd184bbf392916b525
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sylvain Goudeau [Thu, 15 May 2025 08:51:58 +0000 (10:51 +0200)]
LU-19010 lnet: Define BXI3LND network type
Define the BXI3LND network type. This reserves the network type number
for future implementation and allows creation of bxi3 peers and
adding routes to bxi3 peers.
Test-Parameters: trivial
Signed-off-by: Sylvain Goudeau <sylvain.goudeau@eviden.com>
Change-Id: I0b15da1c3be7873fecebba926c60214663b8a091
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59256
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Wed, 14 May 2025 16:35:13 +0000 (19:35 +0300)]
LU-19016 target: reset o_grant if no grant is returned
tgt_grant_dealloc() should reset o_grant (which goes back
to the client) if the server doesn't want to return grants.
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity-quota,sanity-quota
Fixes: df2b5d99ad ("LU-17933 target: do not break grants on RPC failure")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I33851c1e023534f6b5cf3c5596a23b0835a3fe01
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59234
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Mon, 12 May 2025 22:04:31 +0000 (22:04 +0000)]
LU-19007 doc: document client mount retry option
Document the client mount retry option.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ib618a5e909b66136bb37ed82783af500cd6b40ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59198
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Gian-Carlo DeFazio [Thu, 8 May 2025 17:12:02 +0000 (10:12 -0700)]
LU-18988 kfilnd: add return value after LBUGs
To get rid of compiler warnings of the kind
"control reaches end of non-void function",
add return statements after LBUGs.
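The pattern can be sketched as follows. FATAL_BUG() is a hypothetical stand-in for LBUG(), and the enum and function are illustrative only. Because the compiler is not told that the fatal helper never returns, it needs an explicit return on that path.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for LBUG(): report the location and abort. */
#define FATAL_BUG() fatal_bug(__FILE__, __LINE__)

static void fatal_bug(const char *file, int line)
{
	fprintf(stderr, "BUG at %s:%d\n", file, line);
	abort();
}

enum msg_type { MSG_GET, MSG_PUT };

static const char *msg_type_name(enum msg_type type)
{
	switch (type) {
	case MSG_GET:
		return "get";
	case MSG_PUT:
		return "put";
	default:
		FATAL_BUG();
		/*
		 * Never reached at run time, but without it the compiler
		 * warns "control reaches end of non-void function".
		 */
		return NULL;
	}
}
```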
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: Ie8596c7527fa68f88e807ca290157b4a4dc891cf
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59157
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Aurelien Degremont [Mon, 5 May 2025 14:36:16 +0000 (16:36 +0200)]
LU-18976 gss: detect dns error in lsvcgssd
Use the proper NI_NAMEREQD flag with getnameinfo() in lsvcgssd when
resolving IP addresses, so that it returns an error if DNS resolution fails.
That way, the error is propagated and Lustre knows that a DNS failure
happened and can report it better. Otherwise, it fails later and reports a
very unclear message.
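A minimal sketch of a strict reverse lookup, assuming IPv4 only and a hypothetical helper name; this is not the lsvcgssd code. Without NI_NAMEREQD, getnameinfo() falls back to the numeric address string when no name is found, so the caller cannot tell that resolution failed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: reverse-resolve an IPv4 address string into host[].
 * Returns 0 on success or a getnameinfo() EAI_* code on failure
 * (EAI_NONAME if ip is not a valid IPv4 address or has no name).
 */
static int reverse_lookup_strict(const char *ip, char *host, size_t hostlen)
{
	struct sockaddr_in sin;
	int rc;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return EAI_NONAME;

	/* NI_NAMEREQD: fail instead of returning the numeric address. */
	rc = getnameinfo((struct sockaddr *)&sin, sizeof(sin),
			 host, hostlen, NULL, 0, NI_NAMEREQD);
	if (rc != 0 && hostlen > 0)
		host[0] = '\0';
	return rc;
}
```

A caller can pass a nonzero return value to gai_strerror() to log a clear DNS error at the point of failure.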
Test-Parameters: kerberos=true testlist=sanity-krb5
Change-Id: Iaa9e718c056ea742d8695048e43bdeeb3205f0dd
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bruno Faccini <bfaccini@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Frederick Dilger [Wed, 30 Apr 2025 20:40:07 +0000 (14:40 -0600)]
LU-18964 tests: performance-sanity test_2 fixed large
test_2 in performance-sanity previously had mdsrate-create-large,
meaning that the files themselves were large. However, for performance
testing this doesn't make much sense, as the result is largely dependent
on the system IO. Instead, large now means a large number of files (1M),
which was the original goal of these tests.
The SLOW/zfs check was also moved from test_1 to test_2, as it had
been left behind when the tests were renumbered.
Fixes: 01d16dadab ("LU-14697 tests: change performance-sanity to use mdtest")
Test-Parameters: trivial testlist=performance-sanity env=ONLY=1+2
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I602b76c050d29b5d401971123caa73e5d4844342
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59047
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Wed, 30 Apr 2025 18:21:45 +0000 (21:21 +0300)]
LU-18936 tests: sanity-quota_95 interop fix
Skip sanity-quota 95 if MDS version is less than
2_16_52-39-gb162043239 to avoid MDS that is missing
root idmap quota offset.
Fixes: b162043239 ("LU-18109 nodemap: fix idmap offset for root")
Test-Parameters: testlist=sanity-quota env=ONLY=95 serverversion=2.16
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I713827eb7eb500689a75e956817d2a6eac556656
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Keguang Xu [Sun, 27 Apr 2025 07:32:00 +0000 (15:32 +0800)]
LU-13527 utils: allow OST FID lookup via lfs fid2path
Support reverse name lookup of OST FIDs, in addition
to MDT FIDs, via "lfs fid2path":
- use fld_client_lookup() to tell whether it is an OST FID
or an MDT FID
- for an OST FID, communicate with the OST that contains
the underlying object with the given FID, retrieve its parent
FID, i.e., the MDT FID, then continue with the MDT FID logic
- for an MDT FID, the previous logic is kept
NB: The ost_index obtained from fld_client_lookup() is passed
down to the lower layer via u.gf_root_fid, which was initially
treated as a non-functional field in the OST context.
Signed-off-by: Keguang Xu <squalfof@gmail.com>
Change-Id: Icdf5cd7bb4693d5fef0b48f83464ca80aab81d1d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58988
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Sergey Cheremencev [Tue, 22 Apr 2025 10:51:15 +0000 (13:51 +0300)]
LU-18944 target: shrink grant depending on free space
Shrink the grant at targets when it uses more than 25% of the remaining free space.
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I7f2fe97a05e04cce0ac9db82abe4c4bd20f194a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Sebastien Buisson [Tue, 22 Apr 2025 13:08:17 +0000 (15:08 +0200)]
LU-18109 nodemap: map back root with offset
When a nodemap is defined with admin=1 and an offset, root should be
mapped back to id 0, independently from the map_mode property.
sanity-sec test_77 is added to exercise this.
Fixes: e3051ad0f1c8 ("LU-18109 utils: adding nodemap offset capability")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5b220dc90104ceab5c874dfbee41d3e85aefeb5b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58895
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Rajeev Mishra [Tue, 16 May 2023 21:51:09 +0000 (21:51 +0000)]
LU-18935 libcfs: cfs_num_min_max() returns wrong max value
The cfs_num_min_max() function is called through the nf_min_max function
pointer. It returns the minimum and maximum values for a given
range. However, if a network type uses cfs_num_min_max()
to find the min and max of the range, the start and end values of the
range come out the same.
This occurs because cfs_num_min_max() was returning the same value
for both the max and the min.
Here is an example specific to the "kfi" type, which uses
cfs_num_min_max() to find the minimum and maximum values.
As a result of the bug, the start and end values of
the range are the same.
Add a range for the kfi type:
lctl nodemap_add_range --name alps --range [0-131071]@kfi
Show the range that was added:
lctl get_param nodemap.alps.ranges
nodemap.alps.ranges=
[
{ id: 1, start_nid: 0@kfi, end_nid: 0@kfi }
]
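The intended behaviour can be sketched as below. The struct and function are hypothetical stand-ins, not the libcfs implementation, and the faulty code itself is not shown in this message. The point is only that the maximum must be taken from the upper bound of each range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one parsed numeric range such as [0-131071]. */
struct num_range {
	uint32_t lo;
	uint32_t hi;
};

/* Report the smallest lower bound and largest upper bound of all ranges. */
static void num_min_max(const struct num_range *ranges, size_t count,
			uint32_t *min, uint32_t *max)
{
	uint32_t min_seen = UINT32_MAX;
	uint32_t max_seen = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (ranges[i].lo < min_seen)
			min_seen = ranges[i].lo;
		/* Use the upper bound here, not the lower one. */
		if (ranges[i].hi > max_seen)
			max_seen = ranges[i].hi;
	}
	if (min)
		*min = min_seen;
	if (max)
		*max = max_seen;
}
```

With the range [0-131071], this reports a start of 0 and an end of 131071, rather than 0 for both.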
Signed-off-by: Rajeev Mishra <rajeevm@hpe.com>
Test-Parameters: testlist=sanity-sec.sh
Change-Id: I368f8563648c1819a89e76fa8974818b5f8f6111
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58856
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Sebastien Buisson [Thu, 17 Apr 2025 14:45:55 +0000 (16:45 +0200)]
LU-18933 gss: allow expired ctx from server for LDLM callback
When a client receives an ldlm callback request to release a lock,
the server might use an expired reverse context to send this request.
The server might not have any other choice, as it only has a reverse
context and cannot refresh it explicitly. And if the client refuses to
use the associated gss context, it fails to reply to the ldlm callback
request and gets evicted.
In order to avoid this eviction, the client needs to accept the
expired gss context associated with the server request. The client
cannot refresh its own context immediately, as it would not match the
reverse context used by the server. But the client context is going to
be refreshed right after that, along with the subsequent ldlm cancel
request.
sanity-krb5 test_201 is added to exercise this use case.
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9bc82731cb7013e07cc09295cb488f52a0034ea9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58840
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Feng Lei [Thu, 20 Feb 2025 02:21:44 +0000 (10:21 +0800)]
LU-17679 tests: enable sanity/851
Print start message in monitor_lustrefs just before it is ready
to read events.
In test script, wait for the start message to confirm that
monitor_lustrefs process is scheduled and executed.
In test script, assign monitor_lustrefs process a higher priority.
Test-Parameters: trivial
Test-Parameters: testlist=sanity env="ONLY=851,ONLY_REPEAT=1000"
Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Ibc5dd2450e7a8b795dba355118ea4a95f4cd52af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58141
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Marc Vef [Wed, 12 Feb 2025 16:16:38 +0000 (17:16 +0100)]
LU-18715 utils: Extend lctl nodemap_info with property values
The command "lctl nodemap_info" only presents information about which
property names exist for a given nodemap. It does not show the values
for each property, and the command's usefulness is therefore limited.
Further, nodemap property values are generally retrieved via the "lctl
get_param" interface which is inconsistent with how nodemap properties
are set, i.e., by using the "lctl nodemap_*" interface rather than the
"lctl set_param" interface.
This patch extends the "lctl nodemap_info" command to improve
usability, presenting in-depth information, including the global
nodemap state (active/inactive), property availability and
descriptions, all defined nodemaps, and the values for all nodemap
properties.
Extended options:
lctl nodemap_info [-l/--list] [-n/--name NODEMAP_NAME]
[-p/--property PROPERTY_NAME]
List the values of all properties for all nodemaps:
"lctl nodemap_info"
List the value of the specific "ranges" property for the "remote"
nodemap: "lctl nodemap_info --name remote --property ranges"
"--name" and "--property" can be used individually to show the
properties of all nodemaps or all properties of a specific nodemap.
Note the previous positional arguments [all,list,<nodemap>] are
retained for backward-compatibility.
Added sanity-sec 25a to test these modifications.
Test-Parameters: trivial testlist=sanity-sec env=ONLY="25 25a"
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: Ic09710233de490d8eb2f50a74e2b7e4765ca4f3d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Wed, 29 Jan 2025 03:31:53 +0000 (06:31 +0300)]
LU-18682 tests: ost-pools/23b to call wait_delete_completed
Otherwise the next test (test_24) may find a few OSTs almost full
and not use them for new objects, leading to a failure:
ost-pools test_24: @@@@@@ FAIL: Stripe count 4 not on
/mnt/lustre/d24.ost-pools/dir1/f24.ost-pools0:3
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0136dc352603b8f9ccd8926b3105d99ce0df63e2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57917
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Thu, 24 Apr 2025 00:37:59 +0000 (08:37 +0800)]
LU-18612 quota: update usage in glimpse AST
In the quota ID glimpse AST, update the quota usage in case
the qunit has been changed on the QMT.
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I1acae2731b11d70de21b47658fcea2e987167e04
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Qian Yingjin [Sat, 28 Dec 2024 16:10:16 +0000 (00:10 +0800)]
LU-18608 pcc: fix INTEGER_OVERFLOW in pcc_file_read_iter()
Fix the possible INTEGER_OVERFLOW issue reported by Coverity.
/lustre/llite/pcc.c: 2643 in pcc_file_read_iter()
2641 iocb->ki_filp = file;
2642 pcc_io_fini(inode, PIT_READ, result, cached);
CID 454276: Insecure data handling (INTEGER_OVERFLOW)
"result", which might have overflowed,
is returned from the function.
2643 RETURN(result);
Test-Parameters: trivial testlist=sanity-pcc
CoverityID: 454276 ("Insecure data handling")
Fixes: ce98bfe5f72 ("LU-10499 pcc: add readonly mode for PCC")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib856b7598441c06e0fcfe2e7f1eb4eef4d3d82b7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57611
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 19 Nov 2024 15:30:25 +0000 (18:30 +0300)]
LU-18460 lod: avoid double llog initialization
lod should not try to initialize llogs if they have already been
initialized. This may happen if, for example, the deactivation of a
specific MDT has been lost and we activate an already active MDT.
The result would be an assertion in lod_sub_prep_llog():
LustreError: 8141:0:(lod_sub_object.c:991:lod_sub_prep_llog())
ASSERTION( !ctxt->loc_handle ) failed:
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I17ed1a19ac143d1dd80e5c711c08311c49eda89e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57071
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Etienne AUJAMES [Fri, 30 Aug 2024 17:46:11 +0000 (19:46 +0200)]
LU-17920 mgs: handle compound permanent parameters
Special parameters like "nrs_tbf_rule" or "pcc" can "append" several
values.
e.g: add several TBF rules for ost_io
# lctl set_param ost.OSS.ost_io.nrs_policies="tbf jobid"
# lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start login
jobid={*login*} rate=5000"
# lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start rbh
jobid={*rbh*} rate=100000"
Multiple permanent values are not supported for these parameters, since
it is only possible to set a single value today.
The current implementation stores only the last value and removes
the old records.
This patch allows "pcc", "nrs_tbf_rule" and future "wbc" to store
several values for the same parameter: it appends new llog records
without deleting the old ones.
If the MGS finds an existing record matching the new value, it will
not append the new one.
To remove only one entry from the configuration, the patch modifies
the behavior of "lctl set_param -Pd" and "lctl conf_param -d" to match
the parameter name and the value, if set. Those commands return
-ENOENT if the entries are not found.
e.g: remove the TBF rule "rbh" from the configuration
# lctl set_param -Pd ost.OSS.ost_io.nrs_tbf_rule="start rbh
jobid={*rbh*} rate=100000"
e.g: remove all TBF rules for ost_io service
# lctl set_param -Pd ost.OSS.ost_io.nrs_tbf_rule
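The append and delete semantics described above can be sketched with an in-memory list. All names here are hypothetical, and a fixed array stands in for the llog; this is not the MGS code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 16

/* Hypothetical in-memory stand-in for the records of one configuration. */
struct param_cfg {
	const char *name[MAX_RECORDS];
	const char *value[MAX_RECORDS];
	size_t count;
};

/* Append name=value unless an identical record already exists. */
static int param_append(struct param_cfg *cfg, const char *name,
			const char *value)
{
	size_t i;

	for (i = 0; i < cfg->count; i++)
		if (strcmp(cfg->name[i], name) == 0 &&
		    strcmp(cfg->value[i], value) == 0)
			return 0;	/* duplicate: nothing to add */
	if (cfg->count == MAX_RECORDS)
		return -ENOSPC;
	cfg->name[cfg->count] = name;
	cfg->value[cfg->count] = value;
	cfg->count++;
	return 0;
}

/* Delete records matching name, and value too when value is not NULL. */
static int param_delete(struct param_cfg *cfg, const char *name,
			const char *value)
{
	size_t i, kept = 0;

	for (i = 0; i < cfg->count; i++) {
		int match = strcmp(cfg->name[i], name) == 0 &&
			    (value == NULL ||
			     strcmp(cfg->value[i], value) == 0);

		if (match)
			continue;
		cfg->name[kept] = cfg->name[i];
		cfg->value[kept] = cfg->value[i];
		kept++;
	}
	if (kept == cfg->count)
		return -ENOENT;	/* nothing matched */
	cfg->count = kept;
	return 0;
}
```

Passing a value removes a single rule, while passing NULL removes every record for that parameter name, mirroring the two "set_param -Pd" examples.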
Add regression test conf-sanity 123aj.
Fix conf-sanity 123ag to use "lctl set_param -P -d" with the value set
in configuration (not "ANY").
Fix sanity-sec 27a to use "lctl set_param -P -d" only if the parameter
has been set before.
Test-Parameters: testlist=conf-sanity env=ONLY=123
Test-Parameters: testlist=conf-sanity env=ONLY=123
Test-Parameters: testlist=conf-sanity env=ONLY=123aj,ONLY_REPEAT=20
Test-Parameters: testlist=sanity env=ONLY=400b
Test-Parameters: testlist=sanity env=ONLY=401db,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I9d3ec3d8d9004218138468739d4f7c5ea8a3eadd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Mon, 10 Jun 2024 08:15:41 +0000 (04:15 -0400)]
LU-17000 llite: Handle not NUL terminated buffer
In pcc_expr_time_parse(), 'buf' may not have a NUL
terminator if the source string's length is equal
to the buffer size. This patch handles that case.
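A minimal sketch of a bounded copy that always terminates the destination, using a hypothetical helper; the actual change in pcc_expr_time_parse() is not shown in this message. A plain strncpy() with the full buffer size leaves the buffer unterminated when the source length equals that size.

```c
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (dstsize bytes) and always
 * NUL-terminate. Returns 0 on success, or -1 if src had to be truncated
 * or dstsize is 0.
 */
static int copy_terminated(char *dst, size_t dstsize, const char *src)
{
	size_t len;

	if (dstsize == 0)
		return -1;

	len = strlen(src);
	if (len >= dstsize) {
		/* Source does not fit: keep room for the terminator. */
		memcpy(dst, src, dstsize - 1);
		dst[dstsize - 1] = '\0';
		return -1;
	}
	memcpy(dst, src, len + 1);	/* includes the terminator */
	return 0;
}
```

Returning an error on truncation lets the caller reject an oversized input instead of parsing a shortened string.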
Test-Parameters: trivial testlist=sanity-pcc
CoverityID: 426259 ("Buffer not null terminated")
Fixes: 3835f4d3 (LU-13881 pcc: comparator support for PCC rules)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ifc144d73c75b8eef25a994630c600b9c1922aa3b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55377
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Li Dongyang [Wed, 20 Mar 2024 23:09:34 +0000 (10:09 +1100)]
LU-17658 fid: check on disk sequence before allocating to osp
If we lose the commit that updates seq_srv on ofd/ost, the available
super-sequence range is not updated, and the sequence server of ofd
could assign the same sequence again to a different osp,
creating filesystem corruption.
To address this, a new dt_device_operations->dt_last_seq_get()
is added to iterate the current known sequence dirs under /O
and return the latest one. Before using the super-sequence range
read from seq_srv we use the new interface to double check and
update the current range or get a new range if necessary.
Change-Id: I49a11bb3b5e476e55c5835b05392c9567aeeb4ce
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54474
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Mon, 12 May 2025 07:21:13 +0000 (03:21 -0400)]
LU-19003 ptlrpc: prevent freed memory access in ptlrpcd_init()
When ptlrpcd_cpts is used, an array is allocated with parsed
arguments, and then freed before the last user is done with it.
Move freeing to the end of ptlrpcd_init() to fix this.
Test-Parameters: trivial
Fixes: 2686b25c30 ("LU-6325 ptlrpc: make ptlrpcd threads cpt-aware")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I84847f00ca7df6a9cc56962a09bfe41c1435223e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59185
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Wed, 14 May 2025 18:24:50 +0000 (14:24 -0400)]
LU-19017 osd-ldiskfs: fix lprocfs stats print
It was mentioning zfs accidentally.
Test-Parameters: trivial
Fixes: a20476ca2286 ("LU-11850 obd: support the rest of "stats" with Netlink")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I51334ca03dcb69c59a20e62325d2442b5ea5bac8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59235
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Alexander Zarochentsev [Tue, 6 May 2025 15:27:48 +0000 (15:27 +0000)]
LU-18989 osd: no tx restart in fallocate
A transaction restart in osd_fallocate_preallocate()
while holding an object lock may lead to a deadlock
with another thread which takes the lock first and then
starts a transaction.
crash> bt 44982
#0 __schedule at ffffffffa2331fe8
#1 schedule at ffffffffa233241a
#2 wait_transaction_locked at ffffffffc106f08a
#3 add_transaction_credits at ffffffffc106f67a
#4 start_this_handle at ffffffffc106fa10
#5 jbd2__journal_restart at ffffffffc107015e
#6 osd_fallocate_preallocate.constprop.0 at ffffffffc23e3b1b
#7 osd_fallocate at ffffffffc23e40cb
#8 mdt_object_fallocate at ffffffffc21418f9
#9 mdt_fallocate_hdl at ffffffffc2144125
and
crash> bt
PID: 47838 TASK: ffff9f66f5288000 CPU: 9 COMMAND: "mdt00_020"
#0 __schedule at ffffffffa2331fe8
#1 schedule at ffffffffa233241a
#2 rwsem_down_write_slowpath at ffffffffa2334a9b
#3 osd_write_lock at ffffffffc23b6b4d
#4 mdd_xattr_del at ffffffffc208cd88
#5 mdt_reint_setxattr at ffffffffc21224a9
#6 mdt_reint_rec at ffffffffc211ec69
#7 mdt_reint_internal at ffffffffc20f0a64
#8 mdt_reint at ffffffffc20fc6b9
#9 tgt_handle_request0 at ffffffffc1dea9e7
HPE-bug-id: LUS-12819
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I8baa3ba1505c7a55e96841524368a66d447d18c5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>