Whamcloud - gitweb
fs/lustre-release.git
15 hours ago LU-19140 utils: sep in yaml_fill_scalar_data can be '\0' 65/59965/2
James Simmons [Fri, 27 Jun 2025 19:53:24 +0000 (15:53 -0400)]
LU-19140 utils: sep in yaml_fill_scalar_data can be '\0'

Testing lnetctl import exposed a corner-case bug. In
yaml_fill_scalar_data() the character at sep can be set to '\0',
so the later strchr() search for the newline fails because the
string appears to end at sep. Instead, search for the newline
after the ':' has been restored at sep and the whitespace has
been skipped.
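
The failure mode can be reproduced outside liblnetconfig. The helper
below is a hypothetical sketch, not the patched function: it assumes
sep points at the position of the ':' that was temporarily overwritten
with '\0', restores it, skips the whitespace, and only then searches
for the newline.

```c
#include <string.h>

/*
 * Hypothetical helper (not the liblnetconfig code): 'sep' points at the
 * ':' position, which the caller may have overwritten with '\0' to
 * terminate the key.  Searching from a '\0' finds nothing, so restore
 * the separator and skip the whitespace before looking for the newline.
 */
static char *find_line_end_sketch(char *sep)
{
	char *value;

	*sep = ':';
	value = sep + 1;
	while (*value == ' ' || *value == '\t')
		value++;

	/* Returns NULL when the value is not newline-terminated. */
	return strchr(value, '\n');
}
```

Calling strchr(sep, '\n') while *sep is still '\0' returns NULL, since
the string that starts at sep is empty.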

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8f64231185a ("LU-9680 utils: fix nested attribute handling in liblnetconfig")
Change-Id: Ibcf03616777feca58599d816265947f6de27c5b8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59965
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 hours ago LU-6142 misc: replace license boilerplate with SPDX 24/59924/2
Xose Vazquez Perez [Mon, 23 Jun 2025 21:15:35 +0000 (23:15 +0200)]
LU-6142 misc: replace license boilerplate with SPDX

Just that.

Test-Parameters: trivial
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Change-Id: I445ee5625424a8f49671cc9a093aa3121bdcaaa3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59924
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 hours ago LU-19106 lod: check QoS data in pool before down_write 52/59852/5
Emoly Liu [Fri, 20 Jun 2025 00:14:14 +0000 (08:14 +0800)]
LU-19106 lod: check QoS data in pool before down_write

Just as ltd_qos_is_usable() does, define pool_qos_is_usable()
to check whether the QoS data in the pool is up-to-date and
balanced before the expensive QoS write lock is taken.
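
The check-before-lock pattern can be sketched as follows. The struct,
its field names, and both functions are hypothetical stand-ins rather
than the lod code, and a counter takes the place of down_write() so
that the sketch stays self-contained.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-pool QoS state; field names are assumed. */
struct pool_qos_sketch {
	bool up_to_date;		/* QoS data has been recalculated */
	bool balanced;			/* targets are evenly used */
	unsigned int write_locks_taken;	/* counts simulated down_write() */
};

/* Cheap test that needs no write lock. */
static bool pool_qos_is_usable_sketch(const struct pool_qos_sketch *qos)
{
	return qos->up_to_date && qos->balanced;
}

/* Returns true only when the expensive locked path had to run. */
static bool pool_qos_refresh_sketch(struct pool_qos_sketch *qos)
{
	if (pool_qos_is_usable_sketch(qos))
		return false;		/* fast path: skip the write lock */

	qos->write_locks_taken++;	/* stands in for down_write() */
	qos->up_to_date = true;		/* recalculation would happen here */
	/* up_write() would follow here */
	return true;
}
```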

Fixes: e642e75cde02 ("LU-13363 lod: do object allocation in OST pool")
Test-Parameters: ostcount=8 testlist=conf-sanity env=ONLY=133
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I9d17108b649ba5689f02d5f5eee098d030db3d5b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59852
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 hours ago LU-19113 llite: cfs_delete_from_page_cache() keep page locked 29/59829/4
Bruno Faccini [Wed, 18 Jun 2025 16:46:46 +0000 (18:46 +0200)]
LU-19113 llite: cfs_delete_from_page_cache() keep page locked

As in other places where generic_error_remove_folio() is
called, in both Lustre and the kernel, the page should not
be unlocked before calling it in cfs_delete_from_page_cache().

Unlocking the page also allowed a race where page->mapping
could become NULL.

Taking an extra reference is also unnecessary now that the page
is no longer unlocked.

Fixes: 738e69d4b9 ("LU-16292 llite: delete_from_page_cache not exported")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: If39575f4339afe460b3b1c955201e8f9cdfeb871
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59829
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 hours ago LU-14810 lnet: Avoid multiple PUSH to same peer 15/59815/3
Chris Horn [Tue, 17 Jun 2025 19:22:48 +0000 (13:22 -0600)]
LU-14810 lnet: Avoid multiple PUSH to same peer

It is possible to send multiple PUSHes to the same peer when the
LNET_PEER_FORCE_PUSH bit is set in the peer state. A partial solution
was added in https://review.whamcloud.com/55559/ where we modified
lnet_peer_needs_push() to check for the PUSH_SENT flag. However, we
missed that the main loop in lnet_peer_discovery() will check for the
LNET_PEER_FORCE_PUSH bit prior to calling lnet_peer_needs_push().
Update lnet_peer_discovery() to remove the problematic check for
LNET_PEER_FORCE_PUSH.

Also refactor the checks for sending a ping into a new function,
lnet_peer_needs_ping().
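
The intent of routing every decision through one predicate can be
sketched as below. This is a simplified illustration: the bit values
are made up, the SKETCH_ names are hypothetical, and the real
lnet_peer_needs_push() considers more state than is shown here.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real LNet peer state bits differ. */
#define SKETCH_PEER_FORCE_PUSH	(1u << 0)
#define SKETCH_PEER_PUSH_SENT	(1u << 1)

/* A push is needed only if one is requested and none is in flight. */
static bool peer_needs_push_sketch(unsigned int state)
{
	if (state & SKETCH_PEER_PUSH_SENT)
		return false;

	return (state & SKETCH_PEER_FORCE_PUSH) != 0;
}
```

A caller that tests the force bit on its own before calling the
predicate would bypass the PUSH_SENT check, which is the kind of
duplicate send the commit describes.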

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=212,ONLY_REPEAT=100
Fixes: 72726a3118 ("LU-14810 lnet: Do not issue multiple PUSHes")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie25089a07ac1d0fcc0e6c56ec69337d22371cc32
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 hours ago LU-19075 o2iblnd: reduce memory usage 88/59488/3
Alexey Lyashkov [Fri, 30 May 2025 09:32:44 +0000 (12:32 +0300)]
LU-19075 o2iblnd: reduce memory usage

ib_recv_wr and ib_sge are not needed after ib_post_recv() has
finished, so remove them.

Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Id866c1bbaeafa41103ef7caa8a1254c53c9e3c3d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59488
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 hours ago LU-18765 tests: sanity-quota_91 interop check 83/58283/3
Sergey Cheremencev [Mon, 3 Mar 2025 15:31:21 +0000 (18:31 +0300)]
LU-18765 tests: sanity-quota_91 interop check

Run sanity-quota_91 only when the MDS version is greater than
or equal to v2_16_50-52-g1f9689d0f9.

Fixes: 1f9689d0f9 ("LU-17770 quota: don't panic in qmt_map_lge_idx")
Test-Parameters: trivial testlist=sanity-quota serverversion=2.16.1
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ie37f05e8c7b13bb9444991f89c72d20eab1cecba
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 hours ago LU-19124 tests: except sanity/27P on ubuntu2204 16/59916/3
Andreas Dilger [Tue, 24 Jun 2025 14:48:20 +0000 (08:48 -0600)]
LU-19124 tests: except sanity/27P on ubuntu2204

This test is failing 100% on Ubuntu kernel 5.15.0-142
but was passing on kernel 5.15.0-94.

Disable until issue is fixed.

Test-Parameters: trivial testlist=sanity env=ONLY=27,HONOR_EXCEPT=y clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0152cdcf51aba26c6fa6896ed87b5ade1ed69542
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59916
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
15 hours ago LU-19014 memcg: fix client hang in balance_dirty_pages() 23/59223/11
Qian Yingjin [Sat, 24 May 2025 08:30:48 +0000 (16:30 +0800)]
LU-19014 memcg: fix client hang in balance_dirty_pages()

At least two nodes perform append writes to a shared file in
Lustre with memcg enabled.
The client randomly hangs in balance_dirty_pages() with the
following call trace:
[<0>] balance_dirty_pages+0x2ee/0xd10
[<0>] balance_dirty_pages_ratelimited_flags+0x27a/0x380
[<0>] generic_perform_write+0x150/0x210
[<0>] vvp_io_write_start+0x516/0xc00 [lustre]
[<0>] cl_io_start+0x5a/0x110 [obdclass]
[<0>] cl_io_loop+0x97/0x1f0 [obdclass]
[<0>] ll_file_io_generic+0x4d2/0xe50 [lustre]
[<0>] do_file_write_iter+0x3e9/0x5d0 [lustre]
[<0>] vfs_write+0x2cb/0x410
[<0>] ksys_write+0x5f/0xe0
[<0>] do_syscall_64+0x5c/0xf0

After analyzing the core dump of the hung system, we found that the
bdi_writeback data structure (wb) corresponding to the memcg has
pending dirty pages (in state WB_registered | WB_has_dirty_io), but
cannot write out the dirty pages and loops in the
balance_dirty_pages() function.

This is a bug in the Lustre memcg code. The OSC/MDC layer stops
flushing dirty pages once it finds that there are no unstable
pages.
However, there may still be dirty pages queued in the cache. In
this case, the client should still write back the dirty pages.
The wb stat accounting is then updated and the write process
can continue instead of looping endlessly.

Moreover, there is a problem in the current Lustre CLIO engine.
When the system or a certain memcg is under memory pressure, the
client just queues the dirty page in the page cache or in the
current active extent (OES_ACTIVE osc_extent) when
vvp_io_write_commit()/cl_io_commit_async() is called in
->write_end(). The queued pages cannot be written back even when
the kernel is trying to flush dirty pages in writeback via
->ll_writepages().
The client loops in the following call sequence:

loop:
->write_begin()
->write_end()
->balance_dirty_pages()
  ->Launch file writeback in background but cannot flush any
    dirty pages.
  ->The current process is paused a certain time (i.e. 200ms) as
    the corresponding @wb is dirty exceeded.
-> GOTO loop:

Write progress is very slow: the client alternates between writing
a page and pausing for a period of time.

We fix this hang in ->ll_write_end(). When it detects that the
corresponding @wb is dirty exceeded, the client submits the dirty
pages to the OSC writeback cache. The state of the current extent
changes from OES_ACTIVE to OES_CACHE, and extents in this state can
be written back. Moreover, we mark the current extent as urgent so
that it can be flushed much more quickly.

Fixes: 8aa231a99 ("LU-16713 llite: writeback/commit pages under memory pressure")
Signed-off-by: Yingjin Qian <qian@ddn.com>
Change-Id: Iecee60484f1b65fad6f4c9eac7bd4d2c53f38b8d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59223
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-16974 utils: make bandwidth options consistent 11/59411/4
Andreas Dilger [Sat, 24 May 2025 04:51:39 +0000 (22:51 -0600)]
LU-16974 utils: make bandwidth options consistent

The "lfs mirror extend|resync" and "lfs migrate" commands all have
the ability to limit IO bandwidth. Some used "--bandwidth" and
others used "--bandwidth-limit". Make them all the same.

They accept the longer "--bandwidth-limit" for compatibility, but
only "--bandwidth" will be documented for brevity and ease of use.
The getopt_long() options can be abbreviated to any unique prefix,
so both do not need to be specified.
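
The prefix matching can be seen in a small standalone sketch. The
function name and the short option letter 'W' are assumptions made for
illustration and are not taken from lfs.c; only the long spelling is
declared, and getopt_long() still accepts "--bandwidth".

```c
#include <getopt.h>
#include <stdlib.h>

/*
 * Sketch: only "bandwidth-limit" is declared, yet "--bandwidth" (or any
 * other unambiguous prefix) is accepted by getopt_long().
 * Returns the parsed value, or -1 if the option is absent or invalid.
 */
static long parse_bandwidth_sketch(int argc, char *const argv[])
{
	static const struct option long_opts[] = {
		{ "bandwidth-limit", required_argument, NULL, 'W' },
		{ NULL, 0, NULL, 0 }
	};
	long bw = -1;
	int c;

	optind = 1;	/* restart scanning so the helper can be reused */
	opterr = 0;	/* keep getopt quiet; errors are reported via -1 */
	while ((c = getopt_long(argc, argv, "W:", long_opts, NULL)) != -1) {
		if (c != 'W')
			return -1;
		bw = strtol(optarg, NULL, 10);
	}

	return bw;
}
```

An abbreviation stops working only if a second long option starting
with the same prefix is added, at which point getopt_long() reports
the abbreviation as ambiguous.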

Update the usage messages and man pages to reflect these options.
Minor nearby code style fixes to the man pages as well.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie8702bedba9fbfa5b0ea473853a7b5480e61abb5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59411
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-9633 ptlrpc: Add kernel doc style for ptlrpc (14) 59/59759/3
Arshad Hussain [Fri, 13 Jun 2025 16:24:25 +0000 (21:54 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (14)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments which are not meant to be kernel-doc comments.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0bd4e00cd20e8057743564bcb1677fa26ffd0294
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59759
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-9633 ptlrpc: Add kernel doc style for ptlrpc (13) 58/59758/2
Arshad Hussain [Fri, 13 Jun 2025 15:53:04 +0000 (21:23 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (13)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments which are not meant to be kernel-doc comments.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: If6e81cc9dbfc8f063597884da19db0f5f2438c18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59758
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-9633 ptlrpc: Add kernel doc style for ptlrpc (12) 57/59757/2
Arshad Hussain [Fri, 13 Jun 2025 14:45:11 +0000 (20:15 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (12)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments which are not meant to be kernel-doc comments.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I78aa662b13a99870f59d84a05c773f5beb6e22e3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59757
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-9633 ptlrpc: Add kernel doc style for ptlrpc (11) 56/59756/3
Arshad Hussain [Mon, 2 Jun 2025 17:00:21 +0000 (22:30 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (11)

This patch converts existing functional comments
to kernel-doc style comments and removes '/**' from
comments which are not meant to be kernel-doc comments.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I757552dc766b50acfbb35838be3d12406de90009
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59756
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 days ago LU-16974 utils: fix 'lfs mirror resync --stats' printing 63/59263/7
Andreas Dilger [Sat, 17 May 2025 09:43:57 +0000 (03:43 -0600)]
LU-16974 utils: fix 'lfs mirror resync --stats' printing

If "lfs mirror resync --stats" is run without --bandwidth-limit,
then no stats are printed until the file resync is 100% finished.

Fix llapi_mirror_resync_many_params() to print the stats even if
no bandwidth throttle is being applied.

Fix write estimate calculation for a file with multiple components.
Print stats with requested granularity, not rounded to next second.

Update the code in lfs.c::migrate_copy_data() to use shared stats
printing and bandwidth throttling with llapi_mirror_resync_many().

Replace direct calls to fprintf() with llapi_err*().
Fix header ordering, remove duplicate headers found.

Test-Parameters: trivial testlist=sanity-flr
Fixes: be131d125a ("LU-16974 utils: lfs mirror resync to show progress")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I68df25501f3cac2647318cff2eb86062f9300c1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59263
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 days ago LU-18798 mdd: Exclude quotes from user.job xattr. 43/58943/24
Aryan Gupta [Thu, 24 Apr 2025 09:10:03 +0000 (03:10 -0600)]
LU-18798 mdd: Exclude quotes from user.job xattr.

Exclude quotes around the jobid string when saving it into the
user.job xattr. Adjusted the mdd_buf_get_const() call to use
jobid + 1 as the starting point and jobid_len - 2 as the length
when quotes are detected.
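
The pointer and length adjustment can be sketched on its own. The
helper name below is hypothetical and the code is not the mdd
implementation; it only shows the jobid + 1 and jobid_len - 2
arithmetic, guarded so that it applies only when both quotes are
present.

```c
#include <stddef.h>

/*
 * Sketch: if 'jobid' is wrapped in double quotes, point *start past the
 * opening quote and shorten *len by two so both quotes are excluded.
 * Otherwise the buffer is passed through unchanged.
 */
static void jobid_strip_quotes_sketch(const char *jobid, size_t jobid_len,
				      const char **start, size_t *len)
{
	*start = jobid;
	*len = jobid_len;

	if (jobid_len >= 2 && jobid[0] == '"' &&
	    jobid[jobid_len - 1] == '"') {
		*start = jobid + 1;
		*len = jobid_len - 2;
	}
}
```

The jobid_len >= 2 guard keeps a lone quote character from producing a
length that wraps around.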

Test-Parameters: trivial
Signed-off-by: Aryan Gupta <argupta@ddn.com>
Change-Id: I4dcf7012572de71a0bb64f3376a9f2d8544c0684
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58943
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-18242 utils: allow 'lfs df -m/-o' to specify one MDT/OST 29/58729/8
Chakshu Kansal [Fri, 13 Jun 2025 09:22:15 +0000 (14:52 +0530)]
LU-18242 utils: allow 'lfs df -m/-o' to specify one MDT/OST

Extend the 'lfs df' command to allow specifying a single MDT
or OST index with the -m and -o options. This allows users to
easily check the space usage of a specific target without
having to use grep.

For example:
  lfs df -m0 /mnt/lustre  # Show only MDT0
  lfs df -o1 /mnt/lustre  # Show only OST1

This is useful in test scripts and for end users who want to
check the space usage of a specific target.

Signed-off-by: Chakshu Kansal <ckansal@ddn.com>
Change-Id: I74538b6d8c95c41a7d9030f94103a7c0fd8756db
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58729
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-11850 obd: support target_obd using Netlink 06/58506/13
James Simmons [Sun, 8 Jun 2025 00:45:44 +0000 (20:45 -0400)]
LU-11850 obd: support target_obd using Netlink

Because "target_obd" is a debugfs file, normal users can't
access its contents. This breaks standard tools for non-root
users. Implement the same functionality using Netlink.

We don't implement it for lod since it is a dt_device, and we
don't have a way to find such a device as we do for obd_devices.
The lod layer is server side, so only root should have access.

Change-Id: I8fce80f6460b4b3f46106bc24d9494ae94e4fd4b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
5 days ago LU-19110 lnet: add interop for net_delay commands 53/59853/4
Timothy Day [Thu, 19 Jun 2025 15:06:18 +0000 (15:06 +0000)]
LU-19110 lnet: add interop for net_delay commands

Older Lustre versions do not have the net_delay subcommands,
which causes interop testing failures. Add interop checks
to mitigate this.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Fixes: 6d9cfdeda926 ("LU-18114 tests: fix the version checks")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1650dc5a3d851120e3e7b5edad68998110e83183
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59853
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
5 days ago LU-17242 libcfs: deduplicate macros with ENUM2STR 46/58346/3
Timothy Day [Sat, 8 Mar 2025 19:03:18 +0000 (14:03 -0500)]
LU-17242 libcfs: deduplicate macros with ENUM2STR

Add ENUM2STR and replace the various redefinitions
of this macro.
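
A macro of this kind is typically used inside a switch to return the
enumerator name as a string. The definition below is an assumed form
for illustration and may differ from the one added to libcfs; the enum
and function are hypothetical.

```c
/* Assumed definition: expand to a switch case that returns the name. */
#define ENUM2STR(x) case x: return #x

/* Hypothetical enum used only to exercise the macro. */
enum sketch_state {
	SKETCH_IDLE,
	SKETCH_BUSY,
};

static const char *sketch_state2str(enum sketch_state state)
{
	switch (state) {
	ENUM2STR(SKETCH_IDLE);
	ENUM2STR(SKETCH_BUSY);
	default:
		return "unknown";
	}
}
```

Because enumerators are not macros, the # operator stringifies the
identifier itself, so the returned text always matches the name used
in the source.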

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ibbc536e59d24af4d930e0ecc772a869398fb9da3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 days ago LU-18613 lbuild: fix RPMBUILD in lbuild 62/57662/4
Jian Yu [Mon, 6 Jan 2025 22:16:15 +0000 (14:16 -0800)]
LU-18613 lbuild: fix RPMBUILD in lbuild

lbuild should return an error when rpmbuild is not found
in check_options(). The check for rpm is not needed because
rpm is not used for building RPM packages.

Test-Parameters: trivial

Change-Id: I6572110c6362a941f130d65d6734bdebbc6acc82
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57662
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 days ago LU-14772 tests: Change init ENV in conf-sanity.sh 00/59800/3
Vitaliy Kuznetsov [Tue, 17 Jun 2025 14:37:44 +0000 (16:37 +0200)]
LU-14772 tests: Change init ENV in conf-sanity.sh

This patch moves the initialization of init_test_env() so
that it is executed before the conf-sanity-framework.sh
library is sourced.

Fixes: 4a9fc7cebaad24cadf3213906fda193d8c681226 ("LU-14772 tests: Add conf-sanity-framework.sh")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I8b21bb56cdefe8d31d3e4e5653fcfcbb32c5c5e7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59800
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 days ago LU-19050 utils: fix get_root_path WANT condition 79/59779/3
Marc Vef [Mon, 16 Jun 2025 14:16:12 +0000 (16:16 +0200)]
LU-19050 utils: fix get_root_path WANT condition

This patch fixes the WANT condition in get_root_path(), modified in
the previous patch, as get_root_path_fast() should not be called when
either WANT_INDEX or WANT_NID is set.
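
The corrected condition amounts to a mask test. In the sketch below the
flag values and the helper name are hypothetical, and only the shape of
the check is meant to carry over: the fast path is allowed when neither
the index nor the NID is requested.

```c
#include <stdbool.h>

/* Illustrative values only; the real WANT_* flags may differ. */
#define SKETCH_WANT_PATH	0x1
#define SKETCH_WANT_INDEX	0x4
#define SKETCH_WANT_NID		0x8

/* The fast path is usable only if neither index nor NID is wanted. */
static bool can_use_fast_path_sketch(int want)
{
	return !(want & (SKETCH_WANT_INDEX | SKETCH_WANT_NID));
}
```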

Fixes: fff7cd33bbb4 ("LU-19050 utils: Support long nid lists when getting fs info")
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: Ib801f9f2cc19eaeb1e3a5932391ccf7dc53b9f5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59779
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 weeks ago LU-19098 hsm: don't print progname twice with lhsmtool 80/59680/2
Aurelien Degremont [Tue, 28 Jan 2025 14:08:07 +0000 (15:08 +0100)]
LU-19098 hsm: don't print progname twice with lhsmtool

Since Lustre 2.11, llapi_error() prefixes each log message with
the short name of the current program. There is no longer any
need for the lhsmtool_posix log functions (CT_xxxx) to do the
same.

Remove the duplicate prog name and cmd_name.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I4ebdf01cd00d1544678cbad066e2c3a79ecfda38
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59680
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 weeks ago LU-19062 llapi: add layout pattern string functions 30/59530/9
Andreas Dilger [Tue, 27 May 2025 04:02:22 +0000 (22:02 -0600)]
LU-19062 llapi: add layout pattern string functions

Add llapi_lov_pattern_string() to print arbitrary pattern flags to
a string, rather than layout2name(), which can only print specific
hard-coded combinations of patterns.

Add llapi_lov_string_pattern() to convert layout pattern names to
flags.

Add enum lov_pattern that holds LOV_PATTERN constants, and use it.

Add description of patterns to lfs-getstripe.1.
Restore "-L, --layout" argument listing to lfs-setstripe.1.

Fixes: b6deb420a8 ("LU-17370 utils: simplify lfs help text")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie21c7c75c685f3a15ac23e83562a12a3ea2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 weeks ago LU-19079 nodemap: reserve cmds for gss identification 14/59514/7
Sebastien Buisson [Mon, 2 Jun 2025 08:41:59 +0000 (10:41 +0200)]
LU-19079 nodemap: reserve cmds for gss identification

Declare 2 new values in enum lcfg_command_type:
LCFG_NODEMAP_GSS_IDENTIFY = 0x00ce065
LCFG_NODEMAP_LOOKUP_SHA = 0x00ce066

LCFG_NODEMAP_GSS_IDENTIFY is for a new nodemap property that would be
named gss_identification. And LCFG_NODEMAP_LOOKUP_SHA is to be able to
lookup a nodemap from the sha256 of its name.

Declare a new value in enum nm_flag2_bits:
NM_FL2_GSS_IDENTIFY = 0x8

This is to store on disk the value of the future gss_identification
property.

Reserve sanity-sec test_79 for testing the gss identification feature.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2e2648f2eeb0956d7cb0793865b3344d1e8ed5a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59514
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 weeks ago LU-18924 hsm: fix the crash cause by huge max_requests 93/58793/16
Emoly Liu [Mon, 26 May 2025 08:31:41 +0000 (16:31 +0800)]
LU-18924 hsm: fix the crash cause by huge max_requests

Some variables in mdt_coordinator() should be of unsigned type to
avoid a memory allocation crash caused by a huge value of the
mdt.*.hsm.max_requests parameter.

To catch such a failure earlier, the sum of MDT max_requests is
limited to 1/8 of total memory.
If it is bigger than this limit, it is recalculated from the
limit, and a warning message with memory information and
the limit is printed.
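
A clamp of this kind can be sketched with unsigned arithmetic. The
function name and the bytes_per_request parameter are assumptions made
for illustration; the commit message does not say how memory is
attributed to each request.

```c
#include <stdint.h>

/*
 * Sketch: cap 'requested' so that the memory attributed to the requests
 * stays within 1/8 of total memory.  All values are unsigned so a huge
 * setting cannot turn into a negative allocation size.
 */
static uint64_t clamp_max_requests_sketch(uint64_t requested,
					  uint64_t total_mem_bytes,
					  uint64_t bytes_per_request)
{
	uint64_t limit;

	if (bytes_per_request == 0)
		return requested;	/* nothing to scale against */

	limit = (total_mem_bytes / 8) / bytes_per_request;

	return requested > limit ? limit : requested;
}
```

Dividing first, instead of multiplying requested by the per-request
size, avoids an overflow in the comparison itself.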

Also, sanity-hsm.sh test_40 is modified to verify this patch and
stack_trap is added to test_50 and test_100 to restore the default
max_requests value correctly.

Test-Parameters: testlist=sanity-hsm env=ONLY=40,ONLY_REPEAT=20
Test-Parameters: testlist=sanity-hsm
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I3f6f9722c2af34a4632dc1620ad191774b8ed403
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58793
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks ago LU-19049 lutf: Debian 13: swig 4.3, python 3.13.3 01/59401/3
Shaun Tancheff [Sat, 24 May 2025 00:56:10 +0000 (07:56 +0700)]
LU-19049 lutf: Debian 13: swig 4.3, python 3.13.3

Update the config/ac_python_devel.m4 to handle distutils
being removed.

Handle swig 4.3 api change, SWIG_Python_AppendOutput() takes a
3rd argument 'is_null'

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I35968b6928f682f5a70731e316db5e171fad00be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks ago LU-19035 kernel: update RHEL 8.10 [4.18.0-553.53.1.el8_10] 41/59341/3
Jian Yu [Wed, 21 May 2025 07:25:03 +0000 (00:25 -0700)]
LU-19035 kernel: update RHEL 8.10 [4.18.0-553.53.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.53.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: Ide3ddc9dd8716e24cfb5bbbcba75237ac58041ba
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59341
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks ago LU-19029 kernel: update RHEL 8.10 [4.18.0-553.52.1.el8_10] 93/59293/3
Jian Yu [Mon, 19 May 2025 18:51:33 +0000 (11:51 -0700)]
LU-19029 kernel: update RHEL 8.10 [4.18.0-553.52.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.52.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: I0d5a2872050a92e1bf8e8b9438ab2bd1a4a21636
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59293
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18668 kernel: update RHEL 9.6 [5.14.0-570.17.1.el9_6] 92/59292/4
Jian Yu [Mon, 19 May 2025 18:44:18 +0000 (11:44 -0700)]
LU-18668 kernel: update RHEL 9.6 [5.14.0-570.17.1.el9_6]

Update RHEL 9.6 kernel to 5.14.0-570.17.1.el9_6 for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="17p" \
  mdtcount=4 mdscount=2 clientdistro=el9.6 testlist=sanity
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-3

Change-Id: Iac6973ac636953c1e64a60433ae72c0f692a24ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59292
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 weeks agoLU-18162 mdc: convert Metadata Client to LU device 09/58809/7
Timothy Day [Tue, 15 Apr 2025 17:38:09 +0000 (17:38 +0000)]
LU-18162 mdc: convert Metadata Client to LU device

Convert MDC to use LU device init/fini rather
than the legacy OBD API.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic48accf1104d2845707b44519c32c5ced56ae8ef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18813 osd-wbcfs: make dt_last_seq_get() optional 02/59502/2
Timothy Day [Sun, 1 Jun 2025 20:39:04 +0000 (16:39 -0400)]
LU-18813 osd-wbcfs: make dt_last_seq_get() optional

Allow osd-wbcfs to leave this unimplemented for
now. Once we track object mapping internally,
we can implement this.

Fixes: 373b76b345b5 ("LU-17658 fid: check on disk sequence before allocating to osp")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iea4401db0c7656fd56c43f6e0d296cabce89e864
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18687 doc: move man *.5 pages to Documentation/man5 89/58589/5
Timothy Day [Sat, 29 Mar 2025 23:29:40 +0000 (19:29 -0400)]
LU-18687 doc: move man *.5 pages to Documentation/man5

Consolidate all of the man pages into the top
level Documentation directory.

Move all of the Lustre man pages (from section 5) to Documentation/.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idac247218406378398fdbf7f84e779ed87c42eef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58589
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-16518 lst: fix switch-case unannotated fall-through (2) 45/58345/4
Timothy Day [Sat, 8 Mar 2025 18:33:28 +0000 (13:33 -0500)]
LU-16518 lst: fix switch-case unannotated fall-through (2)

Fix more unannotated fall-through errors reported by Clang,
by moving the default case to the end of the switch-case
statement.
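
The reordering described above can be sketched as follows. This is only an illustration under assumptions: the enum, its values, and demo_op_name() are hypothetical and are not taken from the lst sources. It shows why placing the default label last leaves nothing for it to fall into.

```c
/* Hypothetical opcodes, for illustration only (not from lst). */
enum demo_op {
	DEMO_OP_PING = 1,
	DEMO_OP_STAT = 2,
};

/*
 * Before (Clang warns under -Wimplicit-fallthrough):
 *
 *	switch (op) {
 *	default:
 *		name = "unknown";	<- falls into the next label
 *	case DEMO_OP_PING:
 *		...
 *
 * After: 'default' is the last label, so no case follows it and
 * there is nothing to fall through into.
 */
const char *demo_op_name(int op)
{
	switch (op) {
	case DEMO_OP_PING:
		return "ping";
	case DEMO_OP_STAT:
		return "stat";
	default:
		return "unknown";
	}
}
```

Calling demo_op_name(DEMO_OP_STAT) yields "stat", and any unlisted value yields "unknown".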

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I12fb3bf709f2edd6ef03f58c122255319dc91049
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18602 osc: don't block async enqueues 99/57599/13
Alex Zhuravlev [Fri, 25 Apr 2025 02:49:03 +0000 (05:49 +0300)]
LU-18602 osc: don't block async enqueues

Don't block async enqueue RPCs that are trying to resend; otherwise the
client can get into a deadlock at umount, waiting for an RPC that pins
object-inodes:

CPU: 0 PID: 9863 Comm: umount
Call Trace:
 dump_stack+0x6e/0xa0
 lbug_with_loc.cold.4+0x5/0x63 [libcfs]
 lov_delete_composite+0x45b/0x680 [lov]
 lov_object_delete+0xc1/0x260 [lov]
 lu_object_free.isra.5+0x76/0x190 [obdclass]
 cl_inode_fini+0xeb/0x250 [lustre]
 ll_clear_inode+0x269/0x620 [lustre]
 ll_delete_inode+0x3b/0x140 [lustre]
 evict+0xbc/0x180
 dispose_list+0x38/0x60
 evict_inodes+0x12e/0x170
 generic_shutdown_super+0x2d/0xf0
 kill_anon_super+0x9/0x20
 deactivate_locked_super+0x24/0x60
 cleanup_mnt+0x36/0x70

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id1c6c71414b7387b9b0b191b65e34cf65388b57f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57599
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-19099 llite: Add vfstrace debug prints for ll_fallocate 18/59718/3
Oleg Drokin [Wed, 11 Jun 2025 22:49:10 +0000 (18:49 -0400)]
LU-19099 llite: Add vfstrace debug prints for ll_fallocate

Currently ll_fallocate() does not have any vfstrace prints, but
it's important for debugging related issues.

Fixes: 48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Test-Parameters: trivial
Change-Id: I165a7208a59b055756db416063c6695cc7f2d8e4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59718
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18973 mgc: account failovers from all sources 76/59076/9
Mikhail Pershin [Sat, 3 May 2025 14:52:19 +0000 (17:52 +0300)]
LU-18973 mgc: account failovers from all sources

Once set up initially, MGC does not update import failovers
from other mounts. That causes problems with MGC on MGS:
it is always set up with only the @lo interface, so if MGS
fails over to another node, all targets/clients on the primary
node are unable to find MGS, because MGC has only the @lo peer.

The patch reworks the lustre_start_mgc() code to account for all
failover peers from each user of that MGC. It adds new
failover NIDs even if the MGC already exists.

The patch also reorganizes the way peers are identified.
It uses peer UUID as 'Primary NID' string instead of
naming it as 'MGC<PrimaryNID>_##', so the same NIDs don't
produce new mappings and don't pollute the import with
duplicated connections.

That makes LCFG_DEL_UUID obsolete as well, because
lustre_stop_mgc() was its last user.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Icea5b74a16972e8a5f2737257086074630e652a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59076
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18942 obdclass: rework limits for zfs 18/58918/5
Alexey Lyashkov [Wed, 23 Apr 2025 08:23:53 +0000 (11:23 +0300)]
LU-18942 obdclass: rework limits for zfs

The ZFS ARC is not controlled by Linux memory management, so its size
should not be accounted for in any of the caches, not just the lu_object
cache. Reduce the number of objects freed in a single batch to avoid high
CPU usage in the ARC prune threads and increased latency in providing
free space.

Test-Parameters: trivial
HPE-bug-id: LUS-12814, LUS-12813
Fixes: 79b4ae9139c ("LU-1305 osd: osd_handler")
Fixes: 0123baecc4e ("LU-5164 osd: Limit lu_object cache")
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I5342149b185c61c56087d970f26eb4f197a597ef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58918
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19103 mgs: check mti nid format is old 42/59742/2
Mikhail Pershin [Fri, 13 Jun 2025 15:05:43 +0000 (18:05 +0300)]
LU-19103 mgs: check mti nid format is old

Check mti NID list format passed by MGC in addition to
connection flags as real request can be prepared before
flags are negotiated

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic669be26439bc1ef2f5713c0520af8c1c54fc981
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59742
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17814 utils: implement real pfind 95/57295/14
Patrick Farrell [Thu, 5 Dec 2024 04:30:12 +0000 (23:30 -0500)]
LU-17814 utils: implement real pfind

This patch does the last step of integrating the actual
find code with pfind.

This doesn't mean we're done - it doesn't have error
propagation and is not enabled by default - but we are
close.

The next patch will finalize and enable testing.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I34020357037bdcf400ffe7f3b4dc14ea5e5a23c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17814 utils: Add deep copy of find_param 94/57294/15
Patrick Farrell [Thu, 5 Dec 2024 04:08:38 +0000 (23:08 -0500)]
LU-17814 utils: Add deep copy of find_param

Need to copy find_param for each work unit.

Technically not all fields need to be copied, but a bunch
do and it's much easier to copy the whole thing than work
out precisely which fields need to be copied.

Plus, picking individual fields would be fragile to future changes;
copying everything should be more robust.
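
A minimal sketch of the whole-structure deep copy described above. The struct and its fields are hypothetical stand-ins and do not reflect the real find_param layout. The idea is to copy the whole struct by value first, then replace each pointer with a private duplicate so that work units never share heap memory.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parameter block; NOT the real struct find_param. */
struct demo_param {
	char	*pattern;	/* heap string */
	int	*ids;		/* heap array of id_count entries */
	size_t	 id_count;
	int	 depth;		/* plain value, copied by assignment */
};

void demo_param_free(struct demo_param *p)
{
	if (p == NULL)
		return;
	free(p->pattern);
	free(p->ids);
	free(p);
}

/* Return an independent copy of src, or NULL on allocation failure. */
struct demo_param *demo_param_dup(const struct demo_param *src)
{
	struct demo_param *dst = malloc(sizeof(*dst));

	if (dst == NULL)
		return NULL;

	*dst = *src;		/* copies every scalar field at once */
	dst->pattern = NULL;	/* pointers are re-created below */
	dst->ids = NULL;

	if (src->pattern != NULL) {
		size_t len = strlen(src->pattern) + 1;

		dst->pattern = malloc(len);
		if (dst->pattern == NULL)
			goto fail;
		memcpy(dst->pattern, src->pattern, len);
	}

	if (src->ids != NULL && src->id_count > 0) {
		size_t bytes = src->id_count * sizeof(*dst->ids);

		dst->ids = malloc(bytes);
		if (dst->ids == NULL)
			goto fail;
		memcpy(dst->ids, src->ids, bytes);
	}

	return dst;

fail:
	demo_param_free(dst);
	return NULL;
}
```

Because the struct assignment picks up every scalar field, a field added later is copied automatically; only new pointer fields need explicit handling.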

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7fbf909b3fc88ca4a4300abc7e4ccea776fff629
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-15358 tests: fix sanity-flr.sh test 0b syntax 46/59446/3
Oleg Drokin [Tue, 27 May 2025 04:30:39 +0000 (00:30 -0400)]
LU-15358 tests: fix sanity-flr.sh test 0b syntax

local=cnt is clearly a typo and should be just local.

Change-Id: I055c65eca5fe356dd5d180b4c8bf238c9f27c179
Fixes: 0c710a46cfb4 ("LU-11022 lfs: remove mirror by pool name")
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59446
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
3 weeks agoLU-19065 osc: remove extra linefeed from debug 58/59458/3
Alex Zhuravlev [Tue, 27 May 2025 17:35:52 +0000 (20:35 +0300)]
LU-19065 osc: remove extra linefeed from debug

OSC_DUMP_GRANT() has its own trailing linefeed, so the callers
shouldn't pass an extra \n in their messages.

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifc5f01c3d79dcbd2619c1bcba9305c635006e8d9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59458
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-19064 tests: sanity/851 to use correct host 54/59454/3
Alex Zhuravlev [Tue, 27 May 2025 08:28:15 +0000 (11:28 +0300)]
LU-19064 tests: sanity/851 to use correct host

sanity/851 should use correct hostname to run well on a local setup.

Test-Parameters: trivial env=ONLY=851,ONLY_REPEAT=10 testlist=sanity
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id8c62cc8fa7c5e57cef70e549652d30db94a0740
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59454
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-15358 tests: sanity-quota wait_reintegration wrong quotes 45/59445/4
Oleg Drokin [Tue, 27 May 2025 04:18:22 +0000 (00:18 -0400)]
LU-15358 tests: sanity-quota wait_reintegration wrong quotes

Looks like these quotes need to be escaped otherwise they
just unquote the variable and grep might get confused.

Test-Parameters: trivial
Fixes: c2db06180b29 ("LU-2183 quota: quota tests for DNE")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I3b129e3924da4cbe4d6baa6e8c958881a799de26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59445
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-19091 ptlrpc: protect internal access to obd->obd_svc_stats 93/59593/4
Bruno Faccini [Thu, 5 Jun 2025 14:27:51 +0000 (16:27 +0200)]
LU-19091 ptlrpc: protect internal access to obd->obd_svc_stats

The PM-QoS patch from LU-18446, where OBD svc stats are used to
evaluate the best time period for keeping CPU latency low, has
introduced a new, internal way to access obd->obd_svc_stats.
That access now requires concurrent-access protection beyond
simply removing the external tunables in /sys or /debug.

Fixes: 54a64ea818 ("LU-18446 ptlrpc: lower CPUs latency during client I/O")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I45a5f65216fa2bf0821776ff3141fa8e2a33f10e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59593
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17000 obdclass: Fix mem leak in lcfg_setparam_client 99/59499/2
Arshad Hussain [Sat, 31 May 2025 06:06:53 +0000 (02:06 -0400)]
LU-17000 obdclass: Fix mem leak in lcfg_setparam_client

If the call to llapi_param_get_paths() fails, tmp_path
is left unfreed.
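
The shape of such a leak, and one common way to close it, can be sketched as below. All names here are hypothetical: demo_get_paths() merely stands in for a lookup that can fail, and the allocation counter exists only so the sketch can show that nothing stays allocated. It is not the actual lcfg_setparam_client() code.

```c
#include <stdlib.h>

/* Counts live allocations so the sketch can show nothing is leaked. */
int demo_live_allocs;

void *demo_alloc(size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		demo_live_allocs++;
	return p;
}

void demo_free(void *p)
{
	if (p != NULL) {
		demo_live_allocs--;
		free(p);
	}
}

/* Hypothetical stand-in for a path lookup: fails on an empty pattern. */
int demo_get_paths(const char *pattern)
{
	return pattern[0] == '\0' ? -1 : 0;
}

int demo_setparam(const char *pattern)
{
	char *tmp_path = demo_alloc(256);
	int rc;

	if (tmp_path == NULL)
		return -1;

	rc = demo_get_paths(pattern);
	if (rc < 0)
		goto out;	/* an early 'return rc' here would leak tmp_path */

	/* ... tmp_path would be used here ... */
out:
	demo_free(tmp_path);	/* single exit path frees the buffer */
	return rc;
}
```

Routing both the success and the failure case through one cleanup label keeps the free in a single place.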

Test-Parameters: trivial
CoverityID: 457066 ("Resource leak")
Fixes: 10a04e32 (LU-16724 ptlrpc: refactor page pools patch 3)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib8962675fcb06a4d6b1539340f4a005dd65b7e02
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18897 o2iblnd: NULL pointer dereference 98/59498/3
Cyril Bordage [Fri, 30 May 2025 16:07:08 +0000 (18:07 +0200)]
LU-18897 o2iblnd: NULL pointer dereference

When the network is flapping, we could get an
RDMA_CM_EVENT_UNREACHABLE event before conn is created, so we should
check the value first.

Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I8d9777370b927c28ee438687de596e498d64bb07
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19076 ptlrpc: resend can hit original req 97/59497/12
Alex Zhuravlev [Fri, 30 May 2025 15:45:41 +0000 (18:45 +0300)]
LU-19076 ptlrpc: resend can hit original req

The client may need to resend a request if the reply buffer cannot
fit the reply (LOVEA has just changed, for example).
In some environments (e.g. server and client share the same node),
a resend RPC can find the original RPC on the export's list, and the
server just drops the resend RPC, thinking it's a duplicate.
This way the client gets no reply for the resend RPC and times
out.

If this problem happens during layout refresh, where the client
holds the layout lock while requesting LOVEA with MDS_GETXATTR, then
the server can evict the client.

The patch removes the RPC from the export's list just before sending a
reply, as the RPC has already been processed, and for a non-idempotent
request reconstruction should take place.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I48437ad018b9b43b9fff4157203906fd84b6cfd3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19072 lnet: don't crash if ni_status is NULL 82/59482/2
Timothy Day [Thu, 29 May 2025 18:36:16 +0000 (18:36 +0000)]
LU-19072 lnet: don't crash if ni_status is NULL

When reading LNet tunables, ni_status can be NULL. This
triggers an LASSERT() rather than gracefully handling it.
Instead, don't crash. Remove the LASSERT().

lnet_ni_get_status_locked() already handles a NULL ni_status.
While it's questionable whether ni_status == NULL should be
LNET_NI_STATUS_UP or LNET_NI_STATUS_DOWN, it definitely
should not crash.

Also, use lnet_ni_get_status() instead of
lnet_ni_get_status_locked().

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1d8ba9b5f6478d2a915ac6c7f33c22d1742c43d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17000 llite: Handle not NUL terminated buffer in ll_statahead_info 56/59456/3
Arshad Hussain [Tue, 27 May 2025 15:39:28 +0000 (21:09 +0530)]
LU-17000 llite: Handle not NUL terminated buffer in ll_statahead_info

Match ll_statahead_info:sai_fname(target) array
length with llapi_lu_ladvise2:lla_buf(source).
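
The class of problem Coverity reports here can be illustrated with a bounded copy that always terminates the destination. This is a generic sketch under assumptions: demo_copy_name() and the buffer sizes in the test are hypothetical, and it does not show how the patch itself sizes sai_fname.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most src_size bytes from a source buffer that may lack a
 * terminating NUL into dst, always leaving dst NUL-terminated.
 * The name is truncated if dst is too small to hold it.
 */
void demo_copy_name(char *dst, size_t dst_size,
		    const char *src, size_t src_size)
{
	size_t n = 0;

	if (dst_size == 0)
		return;

	/* Stop at the source end, at a NUL, or one byte short of dst. */
	while (n < src_size && n < dst_size - 1 && src[n] != '\0')
		n++;

	memcpy(dst, src, n);
	dst[n] = '\0';
}
```

If the destination array is at least one byte longer than the source array, no truncation can occur, which is one reason to keep the two lengths in step.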

Test-Parameters: trivial
CoverityID: 400216 ("Buffer not null terminated")
Fixes: 1288681b (LU-14361 statahead: add statahead advise IOCTL)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id898ab4b49d54bd734831c09e3de725533e7c249
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59456
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-17848 osd: dt_tunables_fini() and friends return void 18/59418/3
Timothy Day [Sun, 25 May 2025 05:37:19 +0000 (01:37 -0400)]
LU-17848 osd: dt_tunables_fini() and friends return void

The function dt_tunables_fini() can't really fail. Make
it return void. Make the various osd_procfs_fini()
implementations also return void.

Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5edc7ed43fad69d6ebdd734d8e9fdc69cdcf0915
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18813 osd-wbcfs: use common rwsem for osd_object 17/59417/2
Timothy Day [Sun, 25 May 2025 02:47:12 +0000 (22:47 -0400)]
LU-18813 osd-wbcfs: use common rwsem for osd_object

Use a common read/write semaphore for all osd_object
attributes.

Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I16678e57596365ce25d978e2b5a524fc4c21bf26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59417
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19053 build: allow specifying "make rpms" build dir 10/59410/2
Andreas Dilger [Sat, 24 May 2025 03:56:38 +0000 (21:56 -0600)]
LU-19053 build: allow specifying "make rpms" build dir

Currently "make rpms" will create a temporary directory with mktemp
to hold the intermediate build products, and this ends up in /tmp.
This can cause issues if /tmp is not large enough for the full build.

Allow specifying "BUILDDIR=DIR" to redirect the intermediate build
products into the specified directory. This allows you to run:

    BUILDDIR=/var/tmp make rpms

or

    BUILDDIR=/var/tmp make debs

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I12f2e7444f0fc7f09f41d64b8e4dd4a429797a37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-14712 ldiskfs: keep EXT4_BG_TRIMMED flag in memory 12/59312/7
Li Dongyang [Fri, 23 May 2025 10:05:47 +0000 (20:05 +1000)]
LU-14712 ldiskfs: keep EXT4_BG_TRIMMED flag in memory

Keep the EXT4_BG_TRIMMED flag in memory for the trimmed block groups
so that a filesystem without the track_trim superblock bit (e.g. an
existing filesystem created with an earlier version of e2fsprogs) can
still skip trimmed block groups during fstrim as long as it's mounted.

For persistent trimmed block group tracking we should turn on track_trim
with tune2fs.

Change-Id: I19df047c717d3b20310fcba7fa682b6dfab9d5e4
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19015 llog: logic for skipping a zeroed record 67/59267/6
Alexander Boyko [Fri, 16 May 2025 12:38:12 +0000 (14:38 +0200)]
LU-19015 llog: logic for skipping a zeroed record

For ENOSPC errors during dt_write() and thread races, the changelog
could end up as a sparse file with zeros inside. The current processing
logic skips records for the next chunk.
The patch adds the ability to skip only the zeros in the buffer and
start from a valid record.
The fix also changes llog_test 8 so that it uses a non-zero byte for
corruption.
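
The idea of skipping only the zeroed region, rather than the rest of the chunk, can be sketched like this. It is a deliberately simplified, byte-granular illustration: demo_skip_zeros() is a hypothetical helper, and the real llog code works on record headers and its own alignment rules, which are not modeled here.

```c
#include <stddef.h>

/*
 * Return the offset of the first non-zero byte at or after 'off',
 * or 'len' if the remainder of the buffer is all zeros.
 */
size_t demo_skip_zeros(const unsigned char *buf, size_t len, size_t off)
{
	while (off < len && buf[off] == 0)
		off++;
	return off;
}
```

A caller would resume record parsing at the returned offset and treat a return value equal to len as "nothing more in this buffer".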

Fixes: cb1290768df9 ("LU-18218 mdd: changelog specific write function")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I7263764ba6a89f226995b8967631eaa6d5bdd4dd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19013 lnet: fix wording for GDS configure check 15/59215/4
Timothy Day [Tue, 13 May 2025 21:37:17 +0000 (21:37 +0000)]
LU-19013 lnet: fix wording for GDS configure check

The wording on the GDS/CUDA configuration options is
incorrect. If the user does not specify external headers,
Lustre will fall back to the embedded headers rather
than disabling GDS.

Fix the wording on the configure options, improve the
macro name, and reorganize the header such that the correct
definitions are under the ifdef.

Fixes: c65eabc2b113 ("LU-15189 build: add GDS configure options")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I645bc1c0c4bf26bdb9841c849b6cf8eebdc0bdee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59215
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18162 mgc: convert Management Client to LU device 26/58826/5
Timothy Day [Wed, 16 Apr 2025 16:26:50 +0000 (16:26 +0000)]
LU-18162 mgc: convert Management Client to LU device

Convert MGC to use LU device init/fini rather
than the legacy OBD API.

Also, use ldo_process_config rather than the legacy
o_process_config.

Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic33aeb0d1effabc25b538d946611c3a0b189150e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58826
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-16767 mdt: Allow jobID fields widths 22/58822/11
Giardi Sylwyn [Mon, 24 Mar 2025 13:52:02 +0000 (14:52 +0100)]
LU-16767 mdt: Allow jobID fields widths

Modify the function jobid_interpret_string to allow the admin to
specify the widths of the fields printed by
lctl get_param mdt.*.job_stats.
By specifying the parameter jobid_name, the admin can truncate the
fields.
For example, the format "%3e.%u.%6h" will print in job_stats
the first 3 characters of the executable name, a dot, the whole uid,
another dot, and the first 6 characters of the hostname.
If no digit is passed before the letter, it will print the whole field.
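
The width handling described above can be sketched in user space as follows. This is an illustration only: demo_jobid_format() is a hypothetical function, it handles just the %e, %u and %h codes named in the example, and it applies the width to every field; it is not the kernel's jobid_interpret_string().

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand fmt into out. "%<N>x" keeps the first N characters of field x;
 * "%x" keeps the whole field. Returns 0 on success, -1 on an unknown
 * code or if out is too small.
 */
int demo_jobid_format(const char *fmt, const char *exe, unsigned int uid,
		      const char *host, char *out, size_t out_size)
{
	size_t pos = 0;

	if (out_size == 0)
		return -1;
	out[0] = '\0';

	while (*fmt != '\0' && pos < out_size - 1) {
		char field[64];
		int width = -1;		/* -1: no width given */
		int n;

		if (*fmt != '%') {	/* literal character */
			out[pos++] = *fmt++;
			out[pos] = '\0';
			continue;
		}
		fmt++;			/* skip '%' */

		if (isdigit((unsigned char)*fmt)) {
			width = 0;
			while (isdigit((unsigned char)*fmt)) {
				if (width <= 1000)	/* avoid int overflow */
					width = width * 10 + (*fmt - '0');
				fmt++;
			}
		}

		switch (*fmt) {
		case 'e':
			snprintf(field, sizeof(field), "%s", exe);
			break;
		case 'u':
			snprintf(field, sizeof(field), "%u", uid);
			break;
		case 'h':
			snprintf(field, sizeof(field), "%s", host);
			break;
		default:
			return -1;	/* unknown or missing code */
		}
		fmt++;

		if (width < 0)
			width = (int)strlen(field);

		/* "%.*s" prints at most 'width' characters of the field. */
		n = snprintf(out + pos, out_size - pos, "%.*s", width, field);
		if (n < 0 || (size_t)n >= out_size - pos)
			return -1;
		pos += (size_t)n;
	}

	return *fmt == '\0' ? 0 : -1;
}
```

With the format "%3e.%u.%6h", an executable named dd_test, uid 500 and host client01.example, the result is "dd_.500.client".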

Signed-off-by: Giardi Sylwyn <sylwyn.giardi@cea.fr>
Change-Id: Ifd94b354cef07a7fff5e70c94c313a7e4617e2f8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58822
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-18162 kunit: convert llog unit test to LU device 03/58803/3
Timothy Day [Tue, 15 Apr 2025 05:21:08 +0000 (05:21 +0000)]
LU-18162 kunit: convert llog unit test to LU device

Convert OBD test to use LU device init/fini rather
than the legacy OBD API.

Test-Parameters: trivial testlist=sanity env=ONLY=60a,ONLY_REPEAT=25
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iab7e3109ac061be826b0d7695fcc69e0dee2346d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18824 utils: Fix lfs migrate with --overstripe-count 72/58672/5
Patrick Farrell [Thu, 3 Apr 2025 22:16:46 +0000 (18:16 -0400)]
LU-18824 utils: Fix lfs migrate with --overstripe-count

The --overstripe-count (-C) option was not being properly
honored during file migration. When using lfs migrate with
this option, the overstriping flag was set but the
LLAPI_LAYOUT_OVERSTRIPING pattern was not applied to the
destination file.

This was because in lfs_setstripe_internal(), the code only
set lsa.lsa_pattern = LLAPI_LAYOUT_OVERSTRIPING when not in
migrate mode.

Fix this by always setting the pattern when the overstriped
flag is true, regardless of whether we're in migrate mode or
not.
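
The change in the condition can be sketched as below. The constants and function names are hypothetical stand-ins; the real code sets lsa.lsa_pattern inside lfs_setstripe_internal(), which is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical pattern values, for illustration only. */
#define DEMO_PATTERN_DEFAULT		0
#define DEMO_PATTERN_OVERSTRIPING	1

/* Shape of the bug: the pattern is skipped whenever migrating. */
int demo_pattern_before(bool overstriped, bool migrate_mode)
{
	if (overstriped && !migrate_mode)
		return DEMO_PATTERN_OVERSTRIPING;
	return DEMO_PATTERN_DEFAULT;
}

/* Shape of the fix: the overstriped flag alone decides the pattern. */
int demo_pattern_after(bool overstriped, bool migrate_mode)
{
	(void)migrate_mode;	/* no longer affects the pattern */

	if (overstriped)
		return DEMO_PATTERN_OVERSTRIPING;
	return DEMO_PATTERN_DEFAULT;
}
```

The two functions differ only for the overstriped-and-migrating case, which is the case the commit message describes.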

Added a test case (27X) to verify that lfs migrate properly
applies overstriping.

NB: This fix and test were generated and tested by the VS
Code Augment Agent after being given the LU URL and some
prompting.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I734b9d4e3c699e335c9d810bba2e2d2a1c301ed6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58672
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-18687 doc: move man *.1 pages to Documentation/man1 87/58587/5
Timothy Day [Sat, 29 Mar 2025 23:18:54 +0000 (19:18 -0400)]
LU-18687 doc: move man *.1 pages to Documentation/man1

Consolidate all of the man pages into the top
level Documentation directory.

Move all of the Lustre man pages (from section 1) to Documentation/.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ied472c7612996cfd04670f5b2803bfb48d2bf74a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58587
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18852 build: Compatibility updates for kernel v6.14 51/58551/2
Shaun Tancheff [Thu, 27 Mar 2025 01:42:38 +0000 (08:42 +0700)]
LU-18852 build: Compatibility updates for kernel v6.14

Linux commit v6.13-rc1-1-g6fba89813ccf
  lsm: ensure the correct LSM context releaser

struct lsm_context is now upstream, provide an lsmcontext
mapping for Ubuntu

Linux v6.13-rc1-7-g5be1fa8abd7b
  Pass parent directory inode and expected name to ->d_revalidate()

Adjust d_revalidate() to handle the extra arguments.

Use FMODE_NONOTIFY now that __FMODE_NONOTIFY macro is dropped.

Test-Parameters: trivial
HPE-bug-id: LUS-12797
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4ea10d171ab83e6cadb7d03580e9a2748c0d60b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18668 kernel: new kernel [RHEL 9.6 5.14.0-570.16.1.el9_6] 76/57876/9
Jian Yu [Wed, 14 May 2025 00:25:08 +0000 (17:25 -0700)]
LU-18668 kernel: new kernel [RHEL 9.6 5.14.0-570.16.1.el9_6]

This patch makes changes to support new RHEL 9.6 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="17p" \
  mdtcount=4 mdscount=2 clientdistro=el9.6 testlist=sanity
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-3

Change-Id: Idf8c96ee9389978d9497da73b05c5ed400c429d4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-18970 kernel: update RHEL 8.10 [4.18.0-553.51.1.el8_10] 01/59201/3
Jian Yu [Tue, 13 May 2025 00:20:16 +0000 (17:20 -0700)]
LU-18970 kernel: update RHEL 8.10 [4.18.0-553.51.1.el8_10]

Update RHEL 8.10 kernel to 4.18.0-553.51.1.el8_10.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: I210fcf4be1bf39a0cb6fc64dcdfa898bb98f87ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59201
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18969 kernel: update RHEL 9.5 [5.14.0-503.40.1.el9_5] 00/59200/3
Jian Yu [Tue, 13 May 2025 00:14:30 +0000 (17:14 -0700)]
LU-18969 kernel: update RHEL 9.5 [5.14.0-503.40.1.el9_5]

Update RHEL 9.5 kernel to 5.14.0-503.40.1.el9_5.

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.4 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.4 serverdistro=el9.5 testlist=sanity

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2

Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: I62b270ad85126e6022eaf04ddbd32898fb4dc320
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59200
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-19070 dne: dir migrate allowed only for root 74/59474/5
Alexander Zarochentsev [Wed, 28 May 2025 17:29:26 +0000 (17:29 +0000)]
LU-19070 dne: dir migrate allowed only for root

Current implementation of lfs migrate -m
relies on setxattr(, "trusted.lmv", ) which is
allowed only for users with CAP_SYS_ADMIN capability.
Adding the same check to ll_migrate() will prevent
incomplete migrations from a non-root user.
Add error reporting to cb_migrate_mdt_fini().
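
The permission rule above can be illustrated in user space. This is only a
sketch, not the ll_migrate() code: migrate_permitted() and the
SKETCH_CAP_SYS_ADMIN macro are hypothetical names, and the caller is assumed
to already hold the effective capability set as a bitmask. The only fact
taken from Linux is that CAP_SYS_ADMIN is capability number 21.

```c
#include <errno.h>
#include <stdint.h>

/* Capability number of CAP_SYS_ADMIN as defined in <linux/capability.h>. */
#define SKETCH_CAP_SYS_ADMIN 21

/*
 * Hypothetical helper: return 0 if a caller with the given effective
 * capability bitmask may start a directory migration, -EPERM otherwise.
 * Refusing early avoids starting a migration that would fail later at
 * the setxattr("trusted.lmv") step.
 */
int migrate_permitted(uint64_t effective_caps)
{
	if (!(effective_caps & (UINT64_C(1) << SKETCH_CAP_SYS_ADMIN)))
		return -EPERM;
	return 0;
}
```

A caller would test the return value before doing any work, so that a
non-root user gets an error up front instead of a half-migrated directory.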

Fixes: 0a83d948f3 ("LU-4684 migrate: shrink dir layout after migration")
Fixes: 2dae2b8ffb ("LU-8777 mdt: add parameter to disable remote/striped dir")
HPE-bug-id: LUS-12895
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I58d417b64e2b634d76e4ad38685deb21d9ce8a86
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19008 hsm: add locking for coordinator thread stop 25/59425/3
Patrick Farrell [Mon, 26 May 2025 00:30:24 +0000 (20:30 -0400)]
LU-19008 hsm: add locking for coordinator thread stop

There is no locking around thread stop, so mdt_coordinator() and
mdt_hsm_cdt_stop() can race, leading to a use-after-free
during unmount.  Add locking to avoid this.
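
The idea of serializing the stop path can be sketched in user space with a
pthread mutex. This is an illustration under assumptions, not the MDT code:
struct coordinator, the state names and the cdt_*() helpers below are all
hypothetical. The point is that every state transition happens under one
lock, so teardown runs exactly once no matter which path gets there first.

```c
#include <errno.h>
#include <pthread.h>

enum cdt_state { CDT_STOPPED, CDT_RUNNING, CDT_STOPPING };

struct coordinator {
	pthread_mutex_t lock;           /* serializes all state transitions */
	enum cdt_state  state;
	int             teardown_count; /* times resources were released */
};

void cdt_init(struct coordinator *cdt)
{
	pthread_mutex_init(&cdt->lock, NULL);
	cdt->state = CDT_RUNNING;
	cdt->teardown_count = 0;
}

/* Request a stop. Only the first caller wins; later ones get -EALREADY. */
int cdt_request_stop(struct coordinator *cdt)
{
	int rc = 0;

	pthread_mutex_lock(&cdt->lock);
	if (cdt->state != CDT_RUNNING)
		rc = -EALREADY;
	else
		cdt->state = CDT_STOPPING;
	pthread_mutex_unlock(&cdt->lock);
	return rc;
}

/* Exit path: release resources once, even if called from two places. */
void cdt_finish_stop(struct coordinator *cdt)
{
	pthread_mutex_lock(&cdt->lock);
	if (cdt->state != CDT_STOPPED) {
		cdt->teardown_count++;
		cdt->state = CDT_STOPPED;
	}
	pthread_mutex_unlock(&cdt->lock);
}
```

Without the lock, two callers could both observe the running state and both
release the same resources, which is the kind of race described above.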

Fixes: 4512347d6c ("LU-16356 hsm: add running ref to the coordinator")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I996a79fcbca3b1c6f6a0f5ee5d9f052f31eda61f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59425
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-5969 lnet: use LGPL-2.1+ for SPDX headers 67/59367/4
Andreas Dilger [Thu, 22 May 2025 02:58:06 +0000 (20:58 -0600)]
LU-5969 lnet: use LGPL-2.1+ for SPDX headers

The change from explicit LGPL license text to SPDX headers
introduced a number of incorrect license identifiers, because
the "or (at your option) any later version" text was missed.

Convert remaining library license blocks over to SPDX LGPL-2.1+.
Reorder copyright and file description to be consistent.
Remove filenames explicitly listed in the header block.

Test-Parameters: trivial
Fixes: e6aefbfaa6 ("LU-6142 libcfs: SPDX for libcfs module")
Fixes: 56a9ba02ae ("LU-6142 libcfs: SPDX for libcfs headers")
Fixes: c9a7728476 ("LU-6142 lnet: SPDX for lnet/include/ and misc files")
Fixes: 9e3fd9ce8f ("LU-6142 lnet: SPDX for lnet/util/lnetconfig/")
Fixes: 0f39311369 ("LU-6142 lnet: SPDX for lnet/utils/")
Fixes: 14e981db6c ("LU-6142 misc: SPDX for Lustre headers")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic2e9f70f82211ce5231c12d431ca63dc163ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59367
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Cory Spitz <cory.spitz@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 weeks agoLU-18986 mgc: client part of new registration protocol 12/59212/8
Mikhail Pershin [Tue, 13 May 2025 15:59:27 +0000 (18:59 +0300)]
LU-18986 mgc: client part of new registration protocol

Use new target registration protocol in MGC.
It uses the inline buffer mtn_inline_list[] if the NIDs fit into
it, or prepares a bulk transfer for a large list of NIDs.
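
The inline-or-bulk decision can be sketched as follows. The names
choose_nid_transfer(), NID_XFER_INLINE, NID_XFER_BULK and the capacity
SKETCH_INLINE_BYTES are assumptions made for illustration; the real size of
mtn_inline_list[] and the MGC code path are not given in this message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical inline capacity in bytes; the real size is not stated. */
#define SKETCH_INLINE_BYTES 256

enum nid_transfer { NID_XFER_INLINE, NID_XFER_BULK };

/*
 * Decide how to send a list of NID strings: inline if all strings,
 * including their terminating NUL bytes, fit into the inline buffer,
 * bulk otherwise.
 */
enum nid_transfer choose_nid_transfer(const char *const *nids, size_t count)
{
	size_t need = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		need += strlen(nids[i]) + 1;
		if (need > SKETCH_INLINE_BYTES)
			return NID_XFER_BULK;
	}
	return NID_XFER_INLINE;
}
```

Keeping the common short-list case inline avoids setting up a bulk transfer
for targets that only have a handful of NIDs.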

Test-Parameters: testlist=runtests mdsversion=EXA6.3.2
Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifc0fd24d7eb26dd092c3e9cce895980b26f0524d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59212
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 weeks agoLU-18756 sec: add resource id check to oss and mds 08/59208/6
Marc Vef [Tue, 13 May 2025 11:27:31 +0000 (13:27 +0200)]
LU-18756 sec: add resource id check to oss and mds

This patch includes the resource id check into the relevant code paths
on the oss and mds side. It is therefore included for the following
operations.

On the MDT-side:
- open
- create (file and directory)
- unlink (file and directory)
- setattr
- setxattr
- getxattr
- rename
- link

On the OST-side and on the MDT-side for Data on MDT (DoM) files:
- write
- read
- truncate
- fallocate

Some caveats:
The resource id check is not included for MDS_GETATTR RPCs due to
functional and usability concerns. Specifically for the latter, the
"struct stat" would no longer be filled resulting in "?" when running
"ls -l", which can be misunderstood.

Also, if the check is only enabled on the OST-side, writes are only
denied for "sync"/"fsync"-type operations on a file as the check is at
the server-side. If the check is enabled on the MDT-side, write-access
is denied before the OST_WRITE RPC is sent, i.e., immediately
returning the access denied error code. If a file is still in the page
cache before the check is enabled, a client can still read the local
copy of the file, which is expected.

Sanity-sec test 75a was added to exercise the ID check for the above
cases in several disciplines, and to further test that access to
neighboring nodemap offset ranges works as expected.

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I040ddb1b934707baa84b492337139f45b856692e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 weeks agoLU-18756 sec: add generic nodemap resource id check 07/59207/5
Marc Vef [Tue, 13 May 2025 11:13:50 +0000 (13:13 +0200)]
LU-18756 sec: add generic nodemap resource id check

This patch represents the first patch in the series to check the OST
object and MDT inodes UID/GID against the nodemap offset range. This
patch adds the corresponding functions on the OST, MDT, and nodemap
sides for the resource ID check. A resource is defined as an MDT inode
or OST object. This patch does not yet connect the functions to the
relevant codepaths. The patch further adds the new "lctl set_param"
configurables, which are (for now) disabled by default:

- "lctl set_param mdt.*.enable_resource_id_check={0,1}" toggling the
  check on the MDT side.
- "lctl set_param obdfilter.*.enable_resource_id_check={0,1}" toggling
  the check on the OST side.

These configurables work individually but should be toggled together.

The ID check relies on the "nodemap_map_id()" functionality to
guarantee compatibility with the nodemap mapping functionality, e.g.,
covering both offset and mapping cases, among others. The ID check
therefore functions as follows:

If "nodemap_map_id()" returns the squashed value for both UID and GID
for a given client export, "fs_uid", and "fs_gid" stored on the MDT
inode and OST object, access is not permitted to the resource. It
does not rely on any IDs given by the client. The corresponding
permission bits or ACLs are not taken into consideration and are
only relevant later if access was permitted elsewhere.
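
The rule above can be sketched with a simplified model. Everything here is
an assumption for illustration: struct sketch_nodemap reduces a nodemap to a
single offset range plus squash IDs, and sketch_map_id() only stands in for
the far more general nodemap_map_id(). The sketch also ignores the corner
case where an in-range ID happens to map onto the squash value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical nodemap: one offset range plus squash IDs. */
struct sketch_nodemap {
	uint32_t offset_start; /* first file system ID owned by the nodemap */
	uint32_t offset_limit; /* number of IDs in the range */
	uint32_t squash_uid;
	uint32_t squash_gid;
};

/* Map a file system ID to a client ID, or to 'squash' if out of range. */
uint32_t sketch_map_id(const struct sketch_nodemap *nm, uint32_t fs_id,
		       uint32_t squash)
{
	if (fs_id >= nm->offset_start &&
	    fs_id - nm->offset_start < nm->offset_limit)
		return fs_id - nm->offset_start;
	return squash;
}

/* Access is refused only when both the UID and the GID map to squash. */
bool sketch_resource_id_ok(const struct sketch_nodemap *nm,
			   uint32_t fs_uid, uint32_t fs_gid)
{
	bool uid_squashed =
		sketch_map_id(nm, fs_uid, nm->squash_uid) == nm->squash_uid;
	bool gid_squashed =
		sketch_map_id(nm, fs_gid, nm->squash_gid) == nm->squash_gid;

	return !(uid_squashed && gid_squashed);
}
```

Note that the inputs are the IDs stored on the resource, not anything sent
by the client, which matches the statement that the check does not rely on
client-provided IDs.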

Test-Parameters: trivial
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I818c511cd37251843bcfa6b873ef8bdc05176980
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18986 mgs: server part of new registration protocol 06/59206/7
Mikhail Pershin [Thu, 8 May 2025 19:07:37 +0000 (22:07 +0300)]
LU-18986 mgs: server part of new registration protocol

Rework mgs_target_reg() to handle the new protocol along with
the old one for older targets.

It handles the old protocol with NIDs either in mti_nids or
in mti_nidlist[], and the new protocol with NIDs in
mtn_inline_list[] or bulk.

All NIDs are put in mti_nidlist[] as a result of request
processing, which eliminates the need for extra changes in
the further code path.

Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I41dd487c37136e24328914e33c9ce056be013aae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18986 mgs: new target registration protocol 05/59205/4
Mikhail Pershin [Thu, 8 May 2025 09:40:54 +0000 (12:40 +0300)]
LU-18986 mgs: new target registration protocol

Patch adds a new target registration request format with
enhanced NID list handling. The idea is not to overload
mgs_target_info with extra flags and fields for the NID list
description, but to keep such information in a new structure.

The NID list is always an array of strings and can be sent
in various manners: inline buffer, bulk, compressed,
appended, etc.

It helps also to resolve compatibility issues.

Patch includes:
- new wire structure mgs_target_nidlist
- new possible RPC format with mgs_target_nidlist buffer
- new connect flag OBD_CONNECT_MGS_NIDLIST to replace
  obsoleted OBD_CONNECT_REQPORTAL removed in commit
  1.6.0-159-gd2d56f38da ("make HEAD from b_post_cmd3")
- corresponding swabber and wirecheck

Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I441de467a530137f76712273b9a5f814fdb562c1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59205
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-17410 sec: per-nodemap capabilities mask 38/57938/16
Sebastien Buisson [Mon, 27 Jan 2025 16:44:25 +0000 (17:44 +0100)]
LU-17410 sec: per-nodemap capabilities mask

Add a per-nodemap capabilities mask, used in preference to the global
enable_cap_mask parameter if it is set.
The new nodemap property is named enable_cap_mask, and can be set
thanks to the new lctl command 'nodemap_set_cap'. It is possible to
specify capabilities in hex or with symbolic names, with '+' and '-'
prefixes to respectively add or remove corresponding capabilities.
We support defining 2 types of capabilities, either a "set" so that it
is possible to add capabilities, or a "mask" to reduce capabilities of
the client.
This per-nodemap capabilities mask is available on any nodemap
including the default nodemap.

A dynamic child nodemap is allowed to define only a subset of the
capabilities set on the parent, unless the child_raise_privileges
property has the 'caps' privilege.
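
The subset rule for child nodemaps and the '+'/'-' edits can be sketched
with plain bit operations. This is a model under assumptions, not the
nodemap code: child_caps_allowed() and apply_cap_change() are hypothetical
helpers, capabilities are assumed to fit in a 64-bit mask, and the
may_raise_caps flag stands in for the 'caps' privilege in
child_raise_privileges.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A child may only use capabilities that the parent also has, unless it
 * is explicitly allowed to raise them.
 */
bool child_caps_allowed(uint64_t parent_caps, uint64_t child_caps,
			bool may_raise_caps)
{
	if (may_raise_caps)
		return true;
	/* Every bit set in the child must also be set in the parent. */
	return (child_caps & ~parent_caps) == 0;
}

/*
 * Apply one '+' (add) or '-' (remove) change for capability number
 * cap_nr. Unknown prefixes and out-of-range numbers leave caps unchanged.
 */
uint64_t apply_cap_change(uint64_t caps, char prefix, unsigned int cap_nr)
{
	if (cap_nr >= 64)
		return caps;
	if (prefix == '+')
		return caps | (UINT64_C(1) << cap_nr);
	if (prefix == '-')
		return caps & ~(UINT64_C(1) << cap_nr);
	return caps;
}
```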

sanity-sec test_51 is enhanced to exercise this new nodemap property.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1ed91c721d869d0596af9c2d7e07a2c411f2b7c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-18556 hsm: optimize llog record modification 28/57428/18
Alexander Boyko [Fri, 13 Dec 2024 12:57:17 +0000 (13:57 +0100)]
LU-18556 hsm: optimize llog record modification

This commit introduces a new llog modification mechanism for HSM
operations to address inefficiencies caused by prior reliance on
catalog processing. The new approach directly modifies the llog record,
eliminating the need for catalog-based processing and reducing
latency.

Key changes include:
* Replacing the hsm_action_item (HAI) with a full in-memory llog
 record representation, increasing memory usage by ~80 bytes per
 record but removing the need for a dedicated llog cookie hash
 table.
* Unifying the coordinator's read/store logic for HAI data into a
 single in-memory item shared by mdt_hsm_agent_send() and
 mdt_hsm_add_hsr(). This reduces memory allocation steps: only one
 cdt_agent_req allocation is now required during llog read
 operations, eliminating subsequent allocations/copies.

Performance results on VMs 2 MDTs/2 OSTs/2 Clients no-op copytool:
Test 1 (1M archive requests): 572s -> 187s (~3 times faster)
Test 2 (1M archive + 1M queued): 558s -> 392s (~1.4 times faster)

HPE-bug-id: LUS-12583
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4b6e697bc3b1f0cf2c76f5433b49affbc933c653
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-14772 tests: Add conf-sanity-framework.sh 70/57370/16
Vitaliy Kuznetsov [Wed, 11 Jun 2025 16:10:39 +0000 (18:10 +0200)]
LU-14772 tests: Add conf-sanity-framework.sh

This patch creates a new file conf-sanity-framework.sh
The functions from conf-sanity.sh will be moved into
this file, and will also be used in other tests with
the conf-* prefix.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I6e0c53d4e15fa01c341be7a67fcf386c4fb5f0ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-19050 utils: Support long nid lists when getting fs info 21/59421/5
Marc Vef [Sun, 25 May 2025 19:02:50 +0000 (21:02 +0200)]
LU-19050 utils: Support long nid lists when getting fs info

When "get_root_path_slow()" is called through various user commands,
e.g., "lfs setquota", the internal "root_cache" is filled with mount
point information. The cache's "nid" field allowed 256 characters
which resulted in a buffer overflow for long nidlists that are set
during mount.

This patch removes this limitation and further removes the "nid" field
from the "root_cache" since it is only needed in the "lfs check"
command.  Therefore, the nid list no longer needs to be processed and
put into the cache in the numerous other llapi_* functions where the
nid list is never accessed.

Further, string copy handling was insufficient, allowing the overflow
in the first place, and was updated accordingly for all fields.

Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I3d9c30795fba14618368b7b9e1769fe0b07d3fc7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoNew tag 2.16.56 2.16.56 v2_16_56
Oleg Drokin [Sat, 7 Jun 2025 23:10:47 +0000 (19:10 -0400)]
New tag 2.16.56

Change-Id: Iabf1977eeb273e629a3ea4c6ba75a3eadaa8be2a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-19039 lnetconfig: Fix error string in cyaml output 73/59373/4
Nathaniel Clark [Thu, 22 May 2025 11:53:20 +0000 (07:53 -0400)]
LU-19039 lnetconfig: Fix error string in cyaml output

String output in yaml only needs to be quoted when it begins with '@',
''', '"', or '- ', or when it contains ':'.

This corrects the most common error output for `lnetctl ping` errors
to be correct yaml and also cleans up all other error strings output.
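
The quoting rule stated above can be written as a small predicate. This is a
sketch of the rule as worded in this message, not the liblnetconfig code;
yaml_needs_quotes() is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a string must be quoted in YAML output according to
 * the rule above: it begins with '@', a single quote, a double quote
 * or "- ", or it contains ':' anywhere.
 */
bool yaml_needs_quotes(const char *s)
{
	if (s[0] == '@' || s[0] == '\'' || s[0] == '"')
		return true;
	if (strncmp(s, "- ", 2) == 0)
		return true;
	return strchr(s, ':') != NULL;
}
```

With this rule an error string that contains a colon is quoted, while a
plain message without any of these characters is emitted as is.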

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I9a8436280b34f82cf78152e488b68c0581cc2a7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-15358 tests: Escape quote symbols in sanityn test 26b 44/59444/2
Oleg Drokin [Tue, 27 May 2025 04:08:26 +0000 (00:08 -0400)]
LU-15358 tests: Escape quote symbols in sanityn test 26b

Shellcheck highlights that those quotes are actually unquoting
the variables. And looking at prior code we really try to ensure
you can tell which one is which even when some of them are empty
or have spaces.

Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I2cdd0dcc1bce59b397f928cffeb790c74d8dc311
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks agoLU-16818 tests: ignore more opcodes in replay-single/65a 38/58838/7
Alex Zhuravlev [Thu, 17 Apr 2025 10:17:32 +0000 (13:17 +0300)]
LU-16818 tests: ignore more opcodes in replay-single/65a

Ignore a few more opcodes which can interfere with testing:
MDS_STATFS, OST_STATFS, OST_DISCONNECT and OST_PRECREATE

Test-Parameters: env=ONLY=65a,ONLY_REPEAT=100 testlist=replay-single
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib730b540b9075e0ed871bc11f3bdfb4cfd4634a1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58838
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18276 tests: add debugging to sanity-pfl/16b 16/59416/2
Andreas Dilger [Sun, 25 May 2025 00:17:54 +0000 (18:17 -0600)]
LU-18276 tests: add debugging to sanity-pfl/16b

Add extra debugging messages to sanity-pfl.sh test_16b to help find
what is causing this test to fail with ENOSPC intermittently.

Reduce size of overstriped PFL file layout slightly, so that two
such components can fit within the xattr size limit, which may or
may not be the cause of the ENOSPC failures.

Print a message in llapi_layout_file_open() if ENOSPC is hit, so
that we can determine the xattr size, in case it is too large.
Move layout conversion before file open() to avoid contacting
MDS needlessly if the layout is bad.

Test-Parameters: trivial testlist=sanity-pfl env=ONLY=16b,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf347e231147041dda07277227e80f0b6f2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-19051 config: silent spurious messages while checking mpitests 03/59403/2
Aurelien Degremont [Fri, 23 May 2025 14:00:23 +0000 (16:00 +0200)]
LU-19051 config: silent spurious messages while checking mpitests

When detecting mpicc configuration, do not print warnings
or error messages in the middle of configure output.

Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: If536aa1d04f0d641a7b2a721869261c85907e084
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-19046 mgc: mgc_fs_setup() should wait interruptibly 96/59396/2
Andreas Dilger [Fri, 23 May 2025 03:32:26 +0000 (21:32 -0600)]
LU-19046 mgc: mgc_fs_setup() should wait interruptibly

When a target mounts, it fetches a copy of its config log from the
MGS to store in the local filesystem. However, the MGC can currently
only fetch the config log for one target filesystem at a time.
This should be improved in a separate patch.

If the MGS is inaccessible, or there is a problem during setup, the
server will wait for it while holding cl_mgc_mutex.  Other targets on
the same server will be unable to mount, and block on cl_mgc_mutex,
possibly dumping a stack trace like:

    INFO: task mount.lustre:93138 blocked for more than 90 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" to disable this
    task:mount.lustre    state:D stack:0     pid:93138 ppid:93135
    Call Trace:
    __schedule+0x2d1/0x870
    schedule+0x55/0xf0
    schedule_preempt_disabled+0xa/0x10
    __mutex_lock.isra.11+0x349/0x420
    mgc_fs_setup.isra.12+0x65/0x7a0 [mgc]
    mgc_set_info_async+0x99f/0xb30 [mgc]
    server_start_targets+0x452/0x2c30 [obdclass]
    server_fill_super+0x94e/0x10a0 [obdclass]
    lustre_fill_super+0x388/0x3d0 [lustre]
    mount_nodev+0x49/0xa0
    legacy_get_tree+0x27/0x50
    vfs_get_tree+0x25/0xc0
    do_mount+0x2e9/0x950
    ksys_mount+0xbe/0xe0

Use wait_event_interruptible() in mgc_fs_setup() so the server's mount
thread can be interrupted and killed.  This does not fix the reason
for the server to be blocked, but it does allow it to be killed.

Rename mgc_fs_cleanup() to mgc_fs_clear() so it is not confused with
actually cleaning up the MGC.

Avoid printing an error if the sptlrpc log is not available.  This is
common for most filesystems, and is not an error.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bafa5dae0eadecb112efaf61f8bcf7ea8c4c296
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17242 libcfs: use sched_show_task() for thread dumping 94/59394/2
Shaun Tancheff [Fri, 23 May 2025 01:16:04 +0000 (08:16 +0700)]
LU-17242 libcfs: use sched_show_task() for thread dumping

Use sched_show_task() for thread dumping, since it should be
available on all kernels that Lustre supports. On some kernels,
libcfs_debug_dumpstack() is unable to show the thread stack.
Replacing this function avoids that issue.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I421560b0d4223fd3503f4a3697a7615dd43bad8f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-19040 kernel: update SLES15 SP6 [6.4.0-150600.23.50.1] 68/59368/2
Jian Yu [Thu, 22 May 2025 06:31:54 +0000 (23:31 -0700)]
LU-19040 kernel: update SLES15 SP6 [6.4.0-150600.23.50.1]

Update SLES15 SP6 kernel to 6.4.0-150600.23.50.1 for Lustre client.

Test-Parameters: trivial mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testlist=sanity

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-1

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-2

Test-Parameters: optional mdtcount=4 mdscount=2 \
  clientdistro=sles15sp6 testgroup=full-dne-part-3

Change-Id: Ie2d530f0edb28326bbcbd1326f40e3e7db845c21
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18813 osd-wbcfs: refactor osd_device_alloc 06/59306/3
Timothy Day [Tue, 20 May 2025 05:09:47 +0000 (01:09 -0400)]
LU-18813 osd-wbcfs: refactor osd_device_alloc

osd_device_alloc() has improper error handling.
Refactor the function such that we properly
clean up if __osd_device_init() fails.
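
The usual C pattern for this kind of cleanup is a chain of goto labels that
unwinds in reverse order of allocation. The sketch below shows the pattern
only; struct sketch_device and the sketch_device_*() functions are
hypothetical and do not reflect the actual osd-wbcfs structures.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_device {
	char *name_buf;
	int   initialized;
};

/* Hypothetical init step that can be made to fail on request. */
int sketch_device_init(struct sketch_device *dev, int should_fail)
{
	if (should_fail)
		return -EIO;
	dev->initialized = 1;
	return 0;
}

/*
 * Allocate and initialize a device. On any failure, free everything
 * allocated so far, store the error in *rc and return NULL.
 */
struct sketch_device *sketch_device_alloc(int should_fail, int *rc)
{
	struct sketch_device *dev;

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL) {
		*rc = -ENOMEM;
		return NULL;
	}

	dev->name_buf = malloc(32);
	if (dev->name_buf == NULL) {
		*rc = -ENOMEM;
		goto out_free_dev;
	}

	*rc = sketch_device_init(dev, should_fail);
	if (*rc != 0)
		goto out_free_buf;

	return dev;

out_free_buf:
	free(dev->name_buf);
out_free_dev:
	free(dev);
	return NULL;
}

/* Release a device returned by sketch_device_alloc(). */
void sketch_device_free(struct sketch_device *dev)
{
	free(dev->name_buf);
	free(dev);
}
```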

Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia03eb805ef3fdc75c8490e09c66b99e6541d13fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18813 osd-wbcfs: remove f_op llseek checks 05/59305/2
Timothy Day [Tue, 20 May 2025 04:57:57 +0000 (00:57 -0400)]
LU-18813 osd-wbcfs: remove f_op llseek checks

MemFS will always have llseek defined, so we
can remove the checks in the OSD.

Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I77f7abcef686c9c654b7bee04b3f88bb89a87756
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18813 osd: fix dcb_func LASSERT 04/59304/2
Timothy Day [Tue, 20 May 2025 04:53:02 +0000 (00:53 -0400)]
LU-18813 osd: fix dcb_func LASSERT

Each OSD was incorrectly asserting that the
address of the function pointer was not NULL,
instead of the function pointer itself.
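
The difference between the two checks can be shown with a small stand-alone
example. Only the member name dcb_func is taken from the subject line; the
struct layout, the typedef and the helper names are assumptions. A compiler
may warn that the first comparison is always true, which is exactly the
problem being illustrated.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*commit_cb_t)(void *data);

struct sketch_commit_cb {
	commit_cb_t dcb_func; /* callback to run at commit time */
	void       *dcb_data;
};

/* Example callback that does nothing. */
void sketch_noop(void *data)
{
	(void)data;
}

/*
 * Ineffective check: this is the address of a struct member, which is
 * never NULL for a valid cb, so the test always passes.
 */
bool check_address_of_member(const struct sketch_commit_cb *cb)
{
	return &cb->dcb_func != NULL;
}

/* Intended check: the stored function pointer itself must be set. */
bool check_function_pointer(const struct sketch_commit_cb *cb)
{
	return cb->dcb_func != NULL;
}
```

With an unset callback, the first helper still reports success while the
second correctly reports failure.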

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie5682a9d80219743ecb86d8d463cbabcdbf77b64
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59304
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-19030 quota: lfs quota all respects nodemap 97/59297/5
Sergey Cheremencev [Fri, 4 Apr 2025 01:58:13 +0000 (04:58 +0300)]
LU-19030 quota: lfs quota all respects nodemap

Command lfs quota all should print only IDs from the appropriate
nodemap range. The patch also maps FS quota IDs to client IDs
according to nodemap before returning in a quota all iterator buffer.

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8820e18957805c0dceacc4674713875b024a8e99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59297
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18813 contrib: add an example config.site 65/59265/3
Timothy Day [Fri, 16 May 2025 05:50:43 +0000 (01:50 -0400)]
LU-18813 contrib: add an example config.site

The variable CONFIG_SITE can be used to specify
config files to the Autoconf generated configure
script. This is a useful alternative to long
configure command lines.

Add an example config.site file used for compiling
Lustre server (osd-wbcfs) and client for use in
ktest.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6597b860629643ced7191d7a250a86ede2576993
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-19021 ptlrpc: Add obd info to nodemap exports output 48/59248/4
Marc Vef [Wed, 21 May 2025 11:35:33 +0000 (13:35 +0200)]
LU-19021 ptlrpc: Add obd info to nodemap exports output

When clients connect to MDTs/OSTs, a new export is generated on the
server-side during obd_connect_*() with the client's UUID. For each
target, a separate export is created which is then added to the
nodemap's "nm_member_list", if applicable.

Currently, "lctl get_param nodemap.NM_NAME.exports" prints the UUID
and NID information for each entry in the "nm_member_list". Because
the obd device is not listed, duplicate entries appear to be shown for
each client, which can be confusing for the administrator.

This patch extends the nodemap.NM_NAME.exports output by also showing
the obd the client is connected to, e.g., MDT0000, MDT0001, etc., such
that the shown entries no longer appear as duplicates.
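
To show why adding the obd name removes the apparent duplicates, here
is a minimal sketch. The demo_export structure, the "dev" key and the
output layout are assumptions for illustration only; they are not
taken from the patch or from the actual exports format.

```c
#include <stdio.h>

/* Hypothetical export record; the real Lustre structures differ. */
struct demo_export {
	const char *nid;
	const char *uuid;
	const char *obd_name; /* target the client is connected to */
};

/*
 * Format one entry into buf. Returns the snprintf() result, i.e. the
 * length the full string needs, excluding the terminating NUL.
 */
static int demo_format_export(char *buf, size_t len,
			      const struct demo_export *exp)
{
	return snprintf(buf, len, "{ nid: %s, uuid: %s, dev: %s }",
			exp->nid, exp->uuid, exp->obd_name);
}
```

Two exports from the same client (same NID and UUID) to MDT0000 and
MDT0001 format to different lines once the obd name is included,
whereas without it they would be identical.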

Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I681480f9258e57c522acc148f4096a8f40c71eab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59248
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-19011 utils: lfs quota -a -u --busage has delimiter 09/59209/4
Sergey Cheremencev [Tue, 13 May 2025 14:53:38 +0000 (17:53 +0300)]
LU-19011 utils: lfs quota -a -u --busage has delimiter

lfs quota all should insert a delimiter between the name and certain
parameters (busage, bhardlimit, bsoftlimit ...).

Fixes: 7c02893e12 ("LU-18079 utils: argument parse opts for lfs quota")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Icace5752c4a169858792748c5f4b41e336d18cac
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59209
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frederick Dilger <fdilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-18971 build: Update ZFS version to 2.3.2 65/59065/17
Jihyeon Gim [Sun, 4 May 2025 14:40:53 +0000 (23:40 +0900)]
LU-18971 build: Update ZFS version to 2.3.2

Update ZFS version to 2.3.2. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.3.2

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el8.10 serverdistro=el8.10 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testlist=sanity

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2

Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3

Change-Id: Id2a9780cd3b10e81e0136c0a7dde0cb317b52834
Signed-off-by: Jihyeon Gim <potatogim@gluesys.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59065
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-7105 tests: remove deprecated test_28 from sanityn 52/59052/5
Frederick Dilger [Thu, 1 May 2025 00:53:21 +0000 (18:53 -0600)]
LU-7105 tests: remove deprecated test_28 from sanityn

sanityn.sh test_28 was deprecated in 2022-06 but not removed.

Test-Parameters: testlist=sanityn
Fixes: 51c491dac6 ("LU-10994 test: remove netdisk from obdfilter-survey")
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I524adf575170ae9e78dc1eb5e0e1596ee7252dfe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59052
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-17848 osd: fix deref in ldiskfs osd_health_check() 89/58989/3
Timothy Day [Sun, 27 Apr 2025 16:55:24 +0000 (12:55 -0400)]
LU-17848 osd: fix deref in ldiskfs osd_health_check()

The implementations of osd_health_check() in ldiskfs
incorrectly check for a NULL mount after already
dereferencing it. Add a check for a NULL mount in
osd_sb() and check for a NULL sb in osd_health_check().
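
The general shape of such a fix can be sketched as follows. The
demo_* types and functions are hypothetical stand-ins, not the ldiskfs
OSD code, and the -ENODEV return value is an assumption chosen for
illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct demo_super_block { int s_readonly; };
struct demo_mount { struct demo_super_block *mnt_sb; };
struct demo_osd_device { struct demo_mount *od_mnt; };

/* Return the superblock, or NULL when the device has no mount yet. */
static struct demo_super_block *
demo_osd_sb(const struct demo_osd_device *osd)
{
	if (osd == NULL || osd->od_mnt == NULL)
		return NULL;
	return osd->od_mnt->mnt_sb;
}

/* Test the superblock for NULL before any dereference of it. */
static int demo_health_check(const struct demo_osd_device *osd)
{
	struct demo_super_block *sb = demo_osd_sb(osd);

	if (sb == NULL)
		return -ENODEV; /* assumed error code */
	return sb->s_readonly ? -EROFS : 0;
}
```

Moving the NULL test into the accessor means every caller gets a NULL
superblock instead of a crash, and the health check only has to test
the one pointer it actually uses.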

CoverityID: 397885 ("Dereference before null check")

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Id1ce015eb420fe067be375bf0019f305e3e2718c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58989
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-18687 doc: move man *.7 pages to Documentation/man7 90/58590/5
Timothy Day [Sat, 29 Mar 2025 23:37:30 +0000 (19:37 -0400)]
LU-18687 doc: move man *.7 pages to Documentation/man7

Consolidate all of the man pages into the top
level Documentation directory.

Move all of the Lustre man pages from section 7 to Documentation/man7.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I9c5a5e36028739b4872e469721acb8d32b61cce1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58590
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Ellis Wilson <elliswilson@microsoft.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks agoLU-18116 tests: replay-single test_201 timeout 03/58203/4
Alexander Boyko [Tue, 25 Feb 2025 10:38:34 +0000 (11:38 +0100)]
LU-18116 tests: replay-single test_201 timeout

19s is not enough for some systems to finish the MDT1 umount.
Increase it to 20s + OSTCOUNT seconds.

HPE-bug-id: LUS-12689
Test-Parameters: testlist=replay-single
Fixes: ffedcbae21f7 ("LU-17809 osp: make disconnect asynchronous")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I900ce107ceb664530bc2165685ba7b88cbd46807
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58203
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>