Whamcloud - gitweb
Sergey Cheremencev [Thu, 22 May 2025 01:57:02 +0000 (04:57 +0300)]
LU-19058 llite: ll_statfs_project for projid 0
df should take project quota limits into account for the
squashed project ID. Otherwise it shows the system-wide
size.
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ieae5d8503829e2de859b60ed259bc0ee4d1274ca
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59435
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Mon, 16 Jun 2025 16:33:45 +0000 (16:33 +0000)]
LU-18687 compat: move wait/wait_bit to lustre_compat
Migrate the backported waiting code to
lustre_compat.
Eventually, all of the Lustre/LNet compatibility code
will live in lustre_compat - maintaining a clear
separation from the functional code in Lustre and LNet.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I9ffaabf7d4665abb002f11599f993e776e7a38b6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Mikhail Pershin [Fri, 14 Mar 2025 14:11:32 +0000 (17:11 +0300)]
LU-18815 mgc: don't fail/lbug on many NIDs
Keep the server starting on a node with more than 32 NIDs,
allowing the first 32 NIDs per target.
Take the '-o network' mount option into account so that other
networks are not used as server import peers.
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If4c997be3480eba8b75888a070fb5a721b71b894
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58502
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexey Lyashkov [Wed, 20 Nov 2024 08:56:48 +0000 (11:56 +0300)]
LU-18461 osd: pass lu_attr to declare_xattr_set
Pass the lu_attr to dt_declare_xattr_set() and do_declare_xattr_set()
OSD API methods for later use by file join to pass the object offset.
This patch only adds the "attr" parameter but does not use it,
so there is no functional change to the code.
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ie43bb37cbd93f7eaabab2b4ef1f1fc0e6d7e7567
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57189
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Cyril Bordage [Fri, 6 Jun 2025 16:23:13 +0000 (18:23 +0200)]
LU-19119 misc: add umd_ prefix for fields in lnet_md
All structure fields should have a prefix to make the code easier
to study. This was not the case for struct lnet_md.
Some cosmetic changes have also been made.
Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I725be4f5ebac3537e64197a648ea4261a710c37b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Thu, 3 Apr 2025 07:26:16 +0000 (14:26 +0700)]
LU-18254 dkms: optional support for weak-modules
Conditionally support weak modules by setting:
LUSTRE_DKMS_NO_WEAK_MODULES=no
in /etc/sysconfig/dkms-lustre before installing the dkms
package.
Duplicate the o2ib and kfi detection from lustre-dkms_pre-build.sh
into dkms.mkconf and dkms.conf.in for rpm and deb packaging
Test-Parameters: trivial clientextra_install_params="--dkms" \
serverextra_install_params="--dkms" testgroup=full-dkms
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I09502ad4a2e4930694725cf00b847ecacc4ca043
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56452
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Simmons [Fri, 27 Jun 2025 19:53:24 +0000 (15:53 -0400)]
LU-19140 utils: sep in yaml_fill_scalar_data can be '\0'
While testing lnetctl import, a corner-case bug was exposed. In
yaml_fill_scalar_data() the variable sep can be set to '\0', so a
later strchr() to find the newline fails because it thinks it is
already at the end of the string. Instead, search for the newline
after sep has had ':' restored at its start and the whitespace
has been skipped.
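The ordering problem described above can be sketched in plain C. This is not the liblnetconfig code: both function names and the "key: value" input are hypothetical, and the sketch assumes that the separator is temporarily overwritten with '\0' in place. Only the order of "restore ':'" versus "search for '\n'" is meant to mirror the description.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration only, not the liblnetconfig implementation.
 * Searches for '\n' while the separator is still '\0': strchr() hits the
 * terminator immediately and never reaches the newline.
 */
const char *newline_before_restore(char *line)
{
	char *sep = strchr(line, ':');
	const char *nl;

	if (sep == NULL)
		return NULL;
	*sep = '\0';			/* terminate the key in place */
	nl = strchr(sep, '\n');		/* *sep is '\0', so this is NULL */
	*sep = ':';			/* restore the separator */
	return nl;
}

/* Restores ':' and skips blanks first, then searches for the newline. */
const char *newline_after_restore(char *line)
{
	char *sep = strchr(line, ':');

	if (sep == NULL)
		return NULL;
	*sep = '\0';			/* terminate the key in place */
	*sep = ':';			/* restore the separator */
	sep++;
	while (*sep == ' ' || *sep == '\t')
		sep++;			/* skip blanks, but not the newline */
	return strchr(sep, '\n');
}
```

Both helpers leave the input buffer unchanged on return; only the second one can find the end of the line.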
Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8f64231185a ("LU-9680 utils: fix nested attribute handling in liblnetconfig")
Change-Id: Ibcf03616777feca58599d816265947f6de27c5b8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59965
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Xose Vazquez Perez [Mon, 23 Jun 2025 21:15:35 +0000 (23:15 +0200)]
LU-6142 misc: replace license boilerplate with SPDX
Just that.
Test-Parameters: trivial
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Change-Id: I445ee5625424a8f49671cc9a093aa3121bdcaaa3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59924
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Fri, 20 Jun 2025 00:14:14 +0000 (08:14 +0800)]
LU-19106 lod: check QoS data in pool before down_write
Just like ltd_qos_is_usable() does, define pool_qos_is_usable()
to check whether the QoS data in the pool is up-to-date and balanced
before the expensive QoS write lock is taken.
Fixes: e642e75cde02 ("LU-13363 lod: do object allocation in OST pool")
Test-Parameters: ostcount=8 testlist=conf-sanity env=ONLY=133
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I9d17108b649ba5689f02d5f5eee098d030db3d5b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59852
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bruno Faccini [Wed, 18 Jun 2025 16:46:46 +0000 (18:46 +0200)]
LU-19113 llite: cfs_delete_from_page_cache() keep page locked
As in the other places where generic_error_remove_folio() is
called, in both Lustre and the kernel, the page should not be
unlocked before calling it in cfs_delete_from_page_cache().
Unlocking also allowed a race where page->mapping may become
NULL.
Taking an extra reference is also unnecessary now that the page
is no longer unlocked.
Fixes: 738e69d4b9 ("LU-16292 llite: delete_from_page_cache not exported")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: If39575f4339afe460b3b1c955201e8f9cdfeb871
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59829
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Tue, 17 Jun 2025 19:22:48 +0000 (13:22 -0600)]
LU-14810 lnet: Avoid multiple PUSH to same peer
It is possible to send multiple PUSHes to the same peer when the
LNET_PEER_FORCE_PUSH bit is set in the peer state. A partial solution
was added in https://review.whamcloud.com/55559/ where we modified
lnet_peer_needs_push() to check for the PUSH_SENT flag. However, we
missed that the main loop in lnet_peer_discovery() will check for the
LNET_PEER_FORCE_PUSH bit prior to calling lnet_peer_needs_push().
Update lnet_peer_discovery() to remove the problematic check for
LNET_PEER_FORCE_PUSH.
Also refactor the checks for sending a ping into a new function,
lnet_peer_needs_ping().
Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=212,ONLY_REPEAT=100
Fixes: 72726a3118 ("LU-14810 lnet: Do not issue multiple PUSHes")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie25089a07ac1d0fcc0e6c56ec69337d22371cc32
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexey Lyashkov [Fri, 30 May 2025 09:32:44 +0000 (12:32 +0300)]
LU-19075 o2iblnd: reduce memory usage
ib_recv_wr / ib_sge are not needed after ib_post_recv() has finished.
Let's remove them.
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Id866c1bbaeafa41103ef7caa8a1254c53c9e3c3d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59488
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Mon, 3 Mar 2025 15:31:21 +0000 (18:31 +0300)]
LU-18765 tests: sanity-quota_91 interop check
Run sanity-quota_91 only when the MDS version is greater than
or equal to v2_16_50-52-g1f9689d0f9.
Fixes: 1f9689d0f9 ("LU-17770 quota: don't panic in qmt_map_lge_idx")
Test-Parameters: trivial testlist=sanity-quota serverversion=2.16.1
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ie37f05e8c7b13bb9444991f89c72d20eab1cecba
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Tue, 24 Jun 2025 14:48:20 +0000 (08:48 -0600)]
LU-19124 tests: except sanity/27P on ubuntu2204
This test is failing 100% on Ubuntu kernel 5.15.0-142
but was passing on kernel 5.15.0-94.
Disable until issue is fixed.
Test-Parameters: trivial testlist=sanity env=ONLY=27,HONOR_EXCEPT=y clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0152cdcf51aba26c6fa6896ed87b5ade1ed69542
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59916
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Qian Yingjin [Sat, 24 May 2025 08:30:48 +0000 (16:30 +0800)]
LU-19014 memcg: fix client hang in balance_dirty_pages()
At least two nodes perform append writes to a shared file in Lustre
with memcg enabled.
The client randomly hung in balance_dirty_pages() with the
following call trace:
[<0>] balance_dirty_pages+0x2ee/0xd10
[<0>] balance_dirty_pages_ratelimited_flags+0x27a/0x380
[<0>] generic_perform_write+0x150/0x210
[<0>] vvp_io_write_start+0x516/0xc00 [lustre]
[<0>] cl_io_start+0x5a/0x110 [obdclass]
[<0>] cl_io_loop+0x97/0x1f0 [obdclass]
[<0>] ll_file_io_generic+0x4d2/0xe50 [lustre]
[<0>] do_file_write_iter+0x3e9/0x5d0 [lustre]
[<0>] vfs_write+0x2cb/0x410
[<0>] ksys_write+0x5f/0xe0
[<0>] do_syscall_64+0x5c/0xf0
After analyzing the core dump of the hung system, we found that the
bdi_writeback data structure (wb) corresponding to the memcg has
pending dirty pages (in state WB_registered | WB_has_dirty_io), but
cannot write out the dirty pages and loops in the
balance_dirty_pages() function.
This is a bug in the Lustre memcg code. The OSC/MDC layer stops
flushing dirty pages once it finds that there are no unstable
pages.
However, there may still be dirty pages queued in the cache. In
this case, the client should still write back the dirty pages.
The wb stat accounting will then be updated and the write process
can continue instead of looping endlessly.
Moreover, there are some problems in the current Lustre CLIO engine.
When the system or a certain memcg is under memory pressure, the
client just queues the dirty page in the page cache or in the current
active extent (OES_ACTIVE osc_extent) when vvp_io_write_commit()/
cl_io_commit_async() is called in ->write_end(). The queued pages
cannot be written back even when the kernel is trying to flush dirty
pages in writeback via ->ll_writepages().
The client is looping in the following call sequences:
loop:
    ->write_begin()
    ->write_end()
    ->balance_dirty_pages()
        ->Launch file writeback in background but cannot flush any
          dirty pages.
        ->The current process is paused a certain time (i.e. 200ms) as
          the corresponding @wb is dirty exceeded.
    ->GOTO loop
The write progress is very slow: it alternates between writing a
page and sleeping/pausing for a period of time.
We fix this hang in ->ll_write_end(). When it detects that the
corresponding @wb is dirty exceeded, the client submits the dirty
pages into the OSC writeback cache. The state of the current extent
changes from OES_ACTIVE to OES_CACHE, and extents in this state can
be written back. Moreover, we mark the current extent as urgent so
that it can be flushed much more quickly.
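The state change described above can be sketched as a tiny state machine. This is a simplified stand-in, not the OSC code: struct extent_sketch, the two-value enum, and release_extent_if_dirty_exceeded() are hypothetical names, and the real osc_extent has more states and fields than shown here.

```c
#include <stdbool.h>

/* Simplified stand-ins; the real osc_extent has more states and fields. */
enum extent_state { OES_ACTIVE, OES_CACHE };

struct extent_sketch {
	enum extent_state state;
	bool urgent;
};

/*
 * Hypothetical write_end hook: when the wb is over its dirty limit,
 * release the active extent to the cache state so that writeback can
 * pick it up, and mark it urgent. Returns true if it was released.
 */
bool release_extent_if_dirty_exceeded(struct extent_sketch *ext,
				      bool wb_dirty_exceeded)
{
	if (!wb_dirty_exceeded || ext->state != OES_ACTIVE)
		return false;
	ext->state = OES_CACHE;	/* extents in this state can be written back */
	ext->urgent = true;	/* ask for a quicker flush */
	return true;
}
```

Without memory pressure the extent stays active, so the sketch does not change the normal write path.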
Fixes: 8aa231a99 ("LU-16713 llite: writeback/commit pages under memory pressure")
Signed-off-by: Yingjin Qian <qian@ddn.com>
Change-Id: Iecee60484f1b65fad6f4c9eac7bd4d2c53f38b8d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59223
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 24 May 2025 04:51:39 +0000 (22:51 -0600)]
LU-16974 utils: make bandwidth options consistent
The "lfs mirror extend|resync" and "lfs migrate" commands all have
the ability to limit IO bandwidth. Some used "--bandwidth" and
others used "--bandwidth-limit". Make them all the same.
They accept the longer "--bandwidth-limit" for compatibility, but
only "--bandwidth" will be documented for brevity and ease of use.
The getopt_long() options can be abbreviated to any unique prefix,
so both do not need to be specified.
Update the usage messages and man pages to reflect these options.
Minor nearby code style fixes to the man pages as well.
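The unique-prefix behaviour of getopt_long() mentioned above can be sketched as follows. This is not the lfs option table: parse_bandwidth(), the short option 'W', and the plain integer argument are assumptions made for illustration, and resetting optind to 0 relies on glibc/musl behaviour.

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical parser: only the long name "bandwidth-limit" is listed,
 * yet any unique prefix of it (such as "--bandwidth") is accepted.
 * Returns the parsed value, or -1 if the option is absent or unknown.
 */
long parse_bandwidth(int argc, char *argv[])
{
	static const struct option opts[] = {
		{ "bandwidth-limit", required_argument, NULL, 'W' },
		{ NULL, 0, NULL, 0 }
	};
	long bw = -1;
	int c;

	optind = 0;	/* glibc/musl: reinitialise getopt between calls */
	opterr = 0;	/* keep the sketch quiet on unknown options */
	while ((c = getopt_long(argc, argv, "W:", opts, NULL)) != -1) {
		if (c != 'W')
			return -1;
		bw = strtol(optarg, NULL, 10);
	}
	return bw;
}
```

With this table, "--bandwidth-limit=10", "--bandwidth 20" and "--band=30" should all reach the same case, which is why a second table entry for the shorter name is not required.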
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie8702bedba9fbfa5b0ea473853a7b5480e61abb5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59411
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Fri, 13 Jun 2025 16:24:25 +0000 (21:54 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (14)
This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0bd4e00cd20e8057743564bcb1677fa26ffd0294
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59759
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Fri, 13 Jun 2025 15:53:04 +0000 (21:23 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (13)
This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: If6e81cc9dbfc8f063597884da19db0f5f2438c18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59758
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Fri, 13 Jun 2025 14:45:11 +0000 (20:15 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (12)
This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I78aa662b13a99870f59d84a05c773f5beb6e22e3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59757
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Mon, 2 Jun 2025 17:00:21 +0000 (22:30 +0530)]
LU-9633 ptlrpc: Add kernel doc style for ptlrpc (11)
This patch converts existing functional comments
to kernel doc style comments and removes '/**' for
comments which are not meant to be kernel-doc comments
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I757552dc766b50acfbb35838be3d12406de90009
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59756
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 17 May 2025 09:43:57 +0000 (03:43 -0600)]
LU-16974 utils: fix 'lfs mirror resync --stats' printing
If "lfs mirror resync --stats" is run without --bandwidth-limit,
then no stats are printed until the file resync is 100% finished.
Fix llapi_mirror_resync_many_params() to print the stats even if
no bandwidth throttle is being applied.
Fix write estimate calculation for a file with multiple components.
Print stats with requested granularity, not rounded to next second.
Update the code in lfs.c::migrate_copy_data() to use shared stats
printing and bandwidth throttling with llapi_mirror_resync_many().
Replace direct calls to fprintf() with llapi_err*().
Fix header ordering, remove duplicate headers found.
Test-Parameters: trivial testlist=sanity-flr
Fixes: be131d125a ("LU-16974 utils: lfs mirror resync to show progress")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I68df25501f3cac2647318cff2eb86062f9300c1e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59263
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Aryan Gupta [Thu, 24 Apr 2025 09:10:03 +0000 (03:10 -0600)]
LU-18798 mdd: Exclude quotes from user.job xattr.
Exclude quotes around the jobid string when saving it into the
user.job xattr. Adjusted the mdd_buf_get_const() call to use
jobid + 1 as the starting point and jobid_len - 2 as the length
when quotes are detected.
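The "jobid + 1" and "jobid_len - 2" adjustment can be sketched as below. This is not the mdd code: struct span and jobid_without_quotes() are hypothetical, and the sketch assumes the quotes in question are double quotes at both ends of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer/length pair standing in for a const buffer. */
struct span {
	const char *ptr;
	size_t len;
};

/*
 * If the jobid is wrapped in double quotes, start one byte later and
 * shorten the length by two; otherwise return the string unchanged.
 */
struct span jobid_without_quotes(const char *jobid)
{
	struct span s;

	assert(jobid != NULL);
	s.ptr = jobid;
	s.len = strlen(jobid);
	if (s.len >= 2 && jobid[0] == '"' && jobid[s.len - 1] == '"') {
		s.ptr = jobid + 1;	/* skip the opening quote */
		s.len -= 2;		/* drop both quotes from the length */
	}
	return s;
}
```

The length check keeps a lone quote character from producing a negative (wrapped) length.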
Test-Parameters: trivial
Signed-off-by: Aryan Gupta <argupta@ddn.com>
Change-Id: I4dcf7012572de71a0bb64f3376a9f2d8544c0684
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58943
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chakshu Kansal [Fri, 13 Jun 2025 09:22:15 +0000 (14:52 +0530)]
LU-18242 utils: allow 'lfs df -m/-o' to specify one MDT/OST
Extend the 'lfs df' command to allow specifying a single MDT
or OST index with the -m and -o options. This allows users to
easily check the space usage of a specific target without
having to use grep.
For example:
lfs df -m0 /mnt/lustre # Show only MDT0
lfs df -o1 /mnt/lustre # Show only OST1
This is useful in test scripts and for end users who want to
check the space usage of a specific target.
Signed-off-by: Chakshu Kansal <ckansal@ddn.com>
Change-Id: I74538b6d8c95c41a7d9030f94103a7c0fd8756db
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58729
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Sun, 8 Jun 2025 00:45:44 +0000 (20:45 -0400)]
LU-11850 obd: support target_obd using Netlink
Because "target_obd" is a debugfs file, normal users can't
access its contents. This breaks standard tools for non-root users.
Implement the same functionality using Netlink.
We don't implement it for lod since it is a dt_device, and we
don't have a way to look up such a device as we do for obd_devices.
The lod layer is server side, so only root should have access.
Change-Id: I8fce80f6460b4b3f46106bc24d9494ae94e4fd4b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Timothy Day [Thu, 19 Jun 2025 15:06:18 +0000 (15:06 +0000)]
LU-19110 lnet: add interop for net_delay commands
Older Lustre versions do not have the net_delay subcommands,
which causes interop testing failures. Add interop checks
to mitigate this.
Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Test-Parameters: testlist=sanity-lnet serverversion=2.16
Fixes: 6d9cfdeda926 ("LU-18114 tests: fix the version checks")
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1650dc5a3d851120e3e7b5edad68998110e83183
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59853
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Timothy Day [Sat, 8 Mar 2025 19:03:18 +0000 (14:03 -0500)]
LU-17242 libcfs: deduplicate macros with ENUM2STR
Add ENUM2STR and replace the various redefinitions
of this macro.
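A macro of this kind is commonly written as a switch case that returns the stringified enumerator. The definition below is an assumption about its shape, not a copy of the libcfs header, and enum demo_state and demo_state2str() are hypothetical.

```c
#include <stddef.h>

/* Assumed shape of the macro: one switch case per enumerator. */
#define ENUM2STR(x) case x: return #x

/* Hypothetical enum used only for this illustration. */
enum demo_state { DEMO_IDLE, DEMO_ACTIVE, DEMO_CLOSED };

/* Maps an enumerator to its own name, or "UNKNOWN" for other values. */
const char *demo_state2str(enum demo_state st)
{
	switch (st) {
	ENUM2STR(DEMO_IDLE);
	ENUM2STR(DEMO_ACTIVE);
	ENUM2STR(DEMO_CLOSED);
	}
	return "UNKNOWN";
}
```

Defining the macro once avoids each file carrying its own copy with a slightly different spelling.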
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ibbc536e59d24af4d930e0ecc772a869398fb9da3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58346
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Mon, 6 Jan 2025 22:16:15 +0000 (14:16 -0800)]
LU-18613 lbuild: fix RPMBUILD in lbuild
lbuild should return an error when rpmbuild is not found
in check_options(). The check for rpm is not needed because
it's not used for building RPM packages.
Test-Parameters: trivial
Change-Id: I6572110c6362a941f130d65d6734bdebbc6acc82
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57662
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 17 Jun 2025 14:37:44 +0000 (16:37 +0200)]
LU-14772 tests: Change init ENV in conf-sanity.sh
This patch moves the initialization of init_test_env() so
that it is executed before the conf-sanity-framework.sh
library is sourced.
Fixes: 4a9fc7cebaad24cadf3213906fda193d8c681226 ("LU-14772 tests: Add conf-sanity-framework.sh")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I8b21bb56cdefe8d31d3e4e5653fcfcbb32c5c5e7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59800
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Marc Vef [Mon, 16 Jun 2025 14:16:12 +0000 (16:16 +0200)]
LU-19050 utils: fix get_root_path WANT condition
This patch fixes the WANT condition in get_root_path(), modified in
the previous patch, as get_root_path_fast() should not be called when
either WANT_INDEX or WANT_NID is set.
Fixes: fff7cd33bbb4 ("LU-19050 utils: Support long nid lists when getting fs info")
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: Ib801f9f2cc19eaeb1e3a5932391ccf7dc53b9f5e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59779
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Aurelien Degremont [Tue, 28 Jan 2025 14:08:07 +0000 (15:08 +0100)]
LU-19098 hsm: don't print progname twice with lhsmtool
Since Lustre 2.11, llapi_error() prefixes each log message with
the short name of the current program. There is no longer any
need for the lhsmtool_posix log functions (CT_xxxx) to do the
same.
Remove the duplicate prog name and cmd_name.
Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: I4ebdf01cd00d1544678cbad066e2c3a79ecfda38
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59680
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Tue, 27 May 2025 04:02:22 +0000 (22:02 -0600)]
LU-19062 llapi: add layout pattern string functions
Add llapi_lov_pattern_string() to print arbitrary pattern flags to
a string rather than layout2name() which only can print specific
hard-coded combinations of patterns.
Add llapi_lov_string_pattern() to convert layout pattern names to
flags.
Add enum lov_pattern that holds LOV_PATTERN constants, and use it.
Add description of patterns to lfs-getstripe.1.
Restore "-L, --layout" argument listing to lfs-setstripe.1.
Fixes: b6deb420a8 ("LU-17370 utils: simplify lfs help text")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie21c7c75c685f3a15ac23e83562a12a3ea2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Zhenyu Xu <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Mon, 2 Jun 2025 08:41:59 +0000 (10:41 +0200)]
LU-19079 nodemap: reserve cmds for gss identification
Declare 2 new values in enum lcfg_command_type:
LCFG_NODEMAP_GSS_IDENTIFY = 0x00ce065
LCFG_NODEMAP_LOOKUP_SHA = 0x00ce066
LCFG_NODEMAP_GSS_IDENTIFY is for a new nodemap property that would be
named gss_identification. And LCFG_NODEMAP_LOOKUP_SHA is to be able to
lookup a nodemap from the sha256 of its name.
Declare a new value in enum nm_flag2_bits:
NM_FL2_GSS_IDENTIFY = 0x8
This is to store on disk the value of the future gss_identification
property.
Reserve sanity-sec test_79 for testing the gss identification feature.
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2e2648f2eeb0956d7cb0793865b3344d1e8ed5a0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59514
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Emoly Liu [Mon, 26 May 2025 08:31:41 +0000 (16:31 +0800)]
LU-18924 hsm: fix the crash caused by huge max_requests
Some variables in function mdt_coordinator() should be unsigned
type to avoid memory allocation crash caused by huge parameter
mdt.*.hsm.max_requests.
To avoid such a failure earlier, the sum of MDT max_requests is
limited to 1/8 of total memory.
If it is bigger than this limit, it will be recalculated by this
limit and a useful warning message with memory information and
the limit will be printed.
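The 1/8-of-memory cap can be sketched with unsigned arithmetic. This is not the mdt_coordinator() code: clamp_max_requests() and the bytes_per_request parameter are hypothetical, since the message does not say how a request count is converted into a memory size.

```c
#include <stdint.h>

/*
 * Hypothetical helper: limit the requested count so that the memory
 * assumed to be needed for all requests stays within 1/8 of total RAM.
 * Unsigned 64-bit types keep a huge input from turning negative.
 */
uint64_t clamp_max_requests(uint64_t requested, uint64_t total_ram_bytes,
			    uint64_t bytes_per_request)
{
	uint64_t limit;

	if (bytes_per_request == 0)
		return requested;	/* nothing to scale by in this sketch */
	limit = (total_ram_bytes / 8) / bytes_per_request;
	return requested > limit ? limit : requested;
}
```

A caller could compare the returned value with the requested one and print the warning mentioned above when they differ.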
Also, sanity-hsm.sh test_40 is modified to verify this patch and
stack_trap is added to test_50 and test_100 to restore the default
max_requests value correctly.
Test-Parameters: testlist=sanity-hsm env=ONLY=40,ONLY_REPEAT=20
Test-Parameters: testlist=sanity-hsm
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I3f6f9722c2af34a4632dc1620ad191774b8ed403
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58793
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Sat, 24 May 2025 00:56:10 +0000 (07:56 +0700)]
LU-19049 lutf: Debian 13: swig 4.3, python 3.13.3
Update the config/ac_python_devel.m4 to handle distutils
being removed.
Handle swig 4.3 api change, SWIG_Python_AppendOutput() takes a
3rd argument 'is_null'
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I35968b6928f682f5a70731e316db5e171fad00be
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Wed, 21 May 2025 07:25:03 +0000 (00:25 -0700)]
LU-19035 kernel: update RHEL 8.10 [4.18.0-553.53.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.53.1.el8_10.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3
Change-Id: Ide3ddc9dd8716e24cfb5bbbcba75237ac58041ba
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59341
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Mon, 19 May 2025 18:51:33 +0000 (11:51 -0700)]
LU-19029 kernel: update RHEL 8.10 [4.18.0-553.52.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.52.1.el8_10.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3
Change-Id: I0d5a2872050a92e1bf8e8b9438ab2bd1a4a21636
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59293
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Mon, 19 May 2025 18:44:18 +0000 (11:44 -0700)]
LU-18668 kernel: update RHEL 9.6 [5.14.0-570.17.1.el9_6]
Update RHEL 9.6 kernel to 5.14.0-570.17.1.el9_6 for Lustre client.
Test-Parameters: trivial env=SANITY_EXCEPT="17p" \
mdtcount=4 mdscount=2 clientdistro=el9.6 testlist=sanity
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-3
Change-Id: Iac6973ac636953c1e64a60433ae72c0f692a24ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59292
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Timothy Day [Tue, 15 Apr 2025 17:38:09 +0000 (17:38 +0000)]
LU-18162 mdc: convert Metadata Client to LU device
Convert MDC to use LU device init/fini rather
than the legacy OBD API.
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic48accf1104d2845707b44519c32c5ced56ae8ef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sun, 1 Jun 2025 20:39:04 +0000 (16:39 -0400)]
LU-18813 osd-wbcfs: make dt_last_seq_get() optional
Allow osd-wbcfs to leave this unimplemented for
now. Once we track object mapping internally,
we can implement this.
Fixes:
373b76b345b5 ("LU-17658 fid: check on disk sequence before allocating to osp")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iea4401db0c7656fd56c43f6e0d296cabce89e864
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sat, 29 Mar 2025 23:29:40 +0000 (19:29 -0400)]
LU-18687 doc: move man *.5 pages to Documentation/man5
Consolidate all of the man pages into the top
level Documentation directory.
Move all of the Lustre man pages (from 5) to Documentation/.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Idac247218406378398fdbf7f84e779ed87c42eef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58589
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sat, 8 Mar 2025 18:33:28 +0000 (13:33 -0500)]
LU-16518 lst: fix switch-case unannotated fall-through (2)
Fix more unannotated fall-through errors reported by Clang,
by moving the default case to the end of the switch-case
statement.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I12fb3bf709f2edd6ef03f58c122255319dc91049
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Fri, 25 Apr 2025 02:49:03 +0000 (05:49 +0300)]
LU-18602 osc: don't block async enqueues
Don't block async enqueue RPCs that are trying to resend, otherwise the
client can deadlock at umount while waiting for an RPC that pins
object-inodes:
CPU: 0 PID: 9863 Comm: umount
Call Trace:
dump_stack+0x6e/0xa0
lbug_with_loc.cold.4+0x5/0x63 [libcfs]
lov_delete_composite+0x45b/0x680 [lov]
lov_object_delete+0xc1/0x260 [lov]
lu_object_free.isra.5+0x76/0x190 [obdclass]
cl_inode_fini+0xeb/0x250 [lustre]
ll_clear_inode+0x269/0x620 [lustre]
ll_delete_inode+0x3b/0x140 [lustre]
evict+0xbc/0x180
dispose_list+0x38/0x60
evict_inodes+0x12e/0x170
generic_shutdown_super+0x2d/0xf0
kill_anon_super+0x9/0x20
deactivate_locked_super+0x24/0x60
cleanup_mnt+0x36/0x70
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id1c6c71414b7387b9b0b191b65e34cf65388b57f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57599
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Oleg Drokin [Wed, 11 Jun 2025 22:49:10 +0000 (18:49 -0400)]
LU-19099 llite: Add vfstrace debug prints for ll_fallocate
Currently ll_fallocate() does not have any vfstrace prints, but
it's important for debugging related issues.
Fixes:
48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Test-Parameters: trivial
Change-Id: I165a7208a59b055756db416063c6695cc7f2d8e4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59718
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sat, 3 May 2025 14:52:19 +0000 (17:52 +0300)]
LU-18973 mgc: account failovers from all sources
Once set up initially, MGC does not update import failovers
from other mounts. That causes problems with MGC on MGS:
it is always set up with only the @lo interface, so if MGS
fails over to another node, all targets/clients on the primary
node are unable to find MGS, because MGC has only the @lo peer.
The patch reworks the lustre_start_mgc() code to account for all
failover peers from each user of that MGC. It adds new
failover NIDs even if the MGC already exists.
The patch also reorganizes how peers are identified.
It uses the peer UUID as the 'Primary NID' string instead of
naming it 'MGC<PrimaryNID>_##', so the same NIDs don't
produce new mappings and don't pollute the import with
duplicated connections.
That makes LCFG_DEL_UUID obsolete as well, because
lustre_stop_mgc() was its last user.
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Icea5b74a16972e8a5f2737257086074630e652a8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59076
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexey Lyashkov [Wed, 23 Apr 2025 08:23:53 +0000 (11:23 +0300)]
LU-18942 obdclass: rework limits for zfs
The ZFS ARC is not controlled by Linux memory management, so its size
should be left out of the accounting for all caches, not just the
lu_object cache. Reduce the number of objects freed in a single batch to
avoid high CPU usage in ARC prune threads and increased latency in
providing free space.
Test-Parameters: trivial
HPE-bug-id: LUS-12814, LUS-12813
Fixes:
79b4ae9139c ("LU-1305 osd: osd_handler")
Fixes:
0123baecc4e ("LU-5164 osd: Limit lu_object cache")
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I5342149b185c61c56087d970f26eb4f197a597ef
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58918
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Fri, 13 Jun 2025 15:05:43 +0000 (18:05 +0300)]
LU-19103 mgs: check mti nid format is old
Check mti NID list format passed by MGC in addition to
connection flags as real request can be prepared before
flags are negotiated
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic669be26439bc1ef2f5713c0520af8c1c54fc981
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59742
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 5 Dec 2024 04:30:12 +0000 (23:30 -0500)]
LU-17814 utils: implement real pfind
This patch does the last step of integrating the actual
find code with pfind.
This doesn't mean we're done - it doesn't have error
propagation and is not enabled by default - but we are
close.
The next patch will finalize and enable testing.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I34020357037bdcf400ffe7f3b4dc14ea5e5a23c7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 5 Dec 2024 04:08:38 +0000 (23:08 -0500)]
LU-17814 utils: Add deep copy of find_param
Need to copy find_param for each work unit.
Technically not all fields need to be copied, but a bunch
do and it's much easier to copy the whole thing than work
out precisely which fields need to be copied.
Plus, picking individual fields is fragile to future changes; copying
everything should be more robust.
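A minimal sketch of the idea, in Python rather than the C used by the lfs
utilities. The FindParam fields and the make_work_unit() helper are
hypothetical and only stand in for the real find_param; the point is that
each work unit owns a full copy, so changes made by one unit cannot leak
into the shared original.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FindParam:
    # Hypothetical fields, for illustration only.
    max_depth: int = -1
    patterns: list = field(default_factory=list)


def make_work_unit(param, path):
    """Give each work unit its own deep copy of the search parameters."""
    return {"path": path, "param": copy.deepcopy(param)}


base = FindParam(max_depth=3, patterns=["*.c"])
unit = make_work_unit(base, "/mnt/lustre/dir0")

# Mutating the unit's copy leaves the original untouched.
unit["param"].patterns.append("*.h")
print(base.patterns)           # ['*.c']
print(unit["param"].patterns)  # ['*.c', '*.h']
```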
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7fbf909b3fc88ca4a4300abc7e4ccea776fff629
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:30:39 +0000 (00:30 -0400)]
LU-15358 tests: fix sanity-flr.sh test 0b syntax
local=cnt is clearly a typo and should be just local
Change-Id: I055c65eca5fe356dd5d180b4c8bf238c9f27c179
Fixes:
0c710a46cfb4 ("LU-11022 lfs: remove mirror by pool name")
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59446
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Alex Zhuravlev [Tue, 27 May 2025 17:35:52 +0000 (20:35 +0300)]
LU-19065 osc: remove extra linefeed from debug
OSC_DUMP_GRANT() has its own trailing linefeed, so the callers
shouldn't pass an extra \n in their messages.
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifc5f01c3d79dcbd2619c1bcba9305c635006e8d9
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59458
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 27 May 2025 08:28:15 +0000 (11:28 +0300)]
LU-19064 tests: sanity/851 to use correct host
sanity/851 should use correct hostname to run well on a local setup.
Test-Parameters: trivial env=ONLY=851,ONLY_REPEAT=10 testlist=sanity
Test-Parameters: trivial env=ONLY=851,ONLY_REPEAT=10 testlist=sanity
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id8c62cc8fa7c5e57cef70e549652d30db94a0740
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59454
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:18:22 +0000 (00:18 -0400)]
LU-15358 tests: sanity-quota wait_reintegration wrong quotes
Looks like these quotes need to be escaped otherwise they
just unquote the variable and grep might get confused.
Test-Parameters: trivial
Fixes:
c2db06180b29 ("LU-2183 quota: quota tests for DNE")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I3b129e3924da4cbe4d6baa6e8c958881a799de26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59445
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bruno Faccini [Thu, 5 Jun 2025 14:27:51 +0000 (16:27 +0200)]
LU-19091 ptlrpc: protect internal access to obd->obd_svc_stats
The PM-QoS patch from LU-18446, which uses OBD svc stats to
evaluate the best time period to keep CPU latency low, has
introduced a new, internal way to access obd->obd_svc_stats.
This now requires protection against concurrent access beyond
simply removing the external tunables in /sys or /debug.
Fixes:
54a64ea818 ("LU-18446 ptlrpc: lower CPUs latency during client I/O")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I45a5f65216fa2bf0821776ff3141fa8e2a33f10e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59593
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Sat, 31 May 2025 06:06:53 +0000 (02:06 -0400)]
LU-17000 obdclass: Fix mem leak in lcfg_setparam_client
If the call to llapi_param_get_paths() fails, tmp_path
is left unfreed.
Test-Parameters: trivial
CoverityID: 457066 ("Resource leak")
Fixes:
10a04e32 (LU-16724 ptlrpc: refactor page pools patch 3)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ib8962675fcb06a4d6b1539340f4a005dd65b7e02
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Cyril Bordage [Fri, 30 May 2025 16:07:08 +0000 (18:07 +0200)]
LU-18897 o2iblnd: NULL pointer dereference
When the network is flapping, we could get an
RDMA_CM_EVENT_UNREACHABLE event before conn is created, so we should
check the value first.
Test-Parameters: trivial
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I8d9777370b927c28ee438687de596e498d64bb07
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Fri, 30 May 2025 15:45:41 +0000 (18:45 +0300)]
LU-19076 ptlrpc: resend can hit original req
The client may need to resend a request if the reply buffer can
not fit the reply (LOVEA has just changed, for example).
In some environments (e.g. server and client share the same node),
a resend RPC can find the original RPC on the export's list and the
server just drops the resend RPC thinking it's a duplicate.
This way the client gets no reply for the resend RPC and times
out.
If this problem happens during layout refresh, where the client
holds the layout lock while requesting LOVEA with MDS_GETXATTR, then
the server can evict the client.
The patch removes the RPC from the export's list just before sending a
reply, as the RPC has already been processed and, for non-idempotent
requests, reconstruction should take place.
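The ordering problem can be illustrated with a toy model. This is a sketch
under stated assumptions, not the ptlrpc code: the Export class, its
in_progress set and the integer request id are all hypothetical. It only
shows why removing the request from the list before the reply goes out
lets a prompt resend through.

```python
class Export:
    """Toy model of a server export tracking requests being handled."""

    def __init__(self, remove_before_reply):
        self.remove_before_reply = remove_before_reply
        self.in_progress = set()  # hypothetical stand-in for the export's list

    def receive(self, req_id):
        if req_id in self.in_progress:
            return "dropped"  # looks like a duplicate of a request in progress
        self.in_progress.add(req_id)
        return "accepted"

    def send_reply(self, req_id):
        # With the fix, the request leaves the list before the reply is sent.
        if self.remove_before_reply:
            self.in_progress.discard(req_id)

    def finish(self, req_id):
        # Without the fix, the request only leaves the list here.
        self.in_progress.discard(req_id)


def resend_outcome(remove_before_reply):
    export = Export(remove_before_reply)
    export.receive(7)
    export.send_reply(7)
    # The resend arrives before the server has called finish() on the original.
    return export.receive(7)


print(resend_outcome(False))  # dropped
print(resend_outcome(True))   # accepted
```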
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I48437ad018b9b43b9fff4157203906fd84b6cfd3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Thu, 29 May 2025 18:36:16 +0000 (18:36 +0000)]
LU-19072 lnet: don't crash if ni_status is NULL
When reading LNet tunables, ni_status can be NULL. This
triggers an LASSERT() rather than gracefully handling it.
Instead, don't crash. Remove the LASSERT().
lnet_ni_get_status_locked() already handles a NULL ni_status.
While it's questionable whether ni_status == NULL should be
LNET_NI_STATUS_UP or LNET_NI_STATUS_DOWN, it definitely
should not crash.
Also, use lnet_ni_get_status() instead of
lnet_ni_get_status_locked().
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I1d8ba9b5f6478d2a915ac6c7f33c22d1742c43d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Max Wang <wamax@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Arshad Hussain [Tue, 27 May 2025 15:39:28 +0000 (21:09 +0530)]
LU-17000 llite: Handle not NUL terminated buffer in ll_statahead_info
Match ll_statahead_info:sai_fname(target) array
length with llapi_lu_ladvise2:lla_buf(source).
Test-Parameters: trivial
CoverityID: 400216 ("Buffer not null terminated")
Fixes:
1288681b (LU-14361 statahead: add statahead advise IOCTL)
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Id898ab4b49d54bd734831c09e3de725533e7c249
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59456
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Timothy Day [Sun, 25 May 2025 05:37:19 +0000 (01:37 -0400)]
LU-17848 osd: dt_tunables_fini() and friends return void
The function dt_tunables_fini() can't really fail. Make
it return void. Make the various osd_procfs_fini()
implementations also return void.
Test-Parameters: trivial
Test-Parameters: trivial fstype=zfs
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5edc7ed43fad69d6ebdd734d8e9fdc69cdcf0915
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Sun, 25 May 2025 02:47:12 +0000 (22:47 -0400)]
LU-18813 osd-wbcfs: use common rwsem for osd_object
Use a common read/write semaphore for all osd_object
attributes.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I16678e57596365ce25d978e2b5a524fc4c21bf26
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59417
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 24 May 2025 03:56:38 +0000 (21:56 -0600)]
LU-19053 build: allow specifying "make rpms" build dir
Currently "make rpms" will create a temporary directory with mktemp
to hold the intermediate build products, and this ends up in /tmp.
This can cause issues if /tmp is not large enough for the full build.
Allow specifying "BUILDDIR=DIR" to redirect the intermediate build
products into the specified directory. This allows you to run:
BUILDDIR=/var/tmp make rpms
or
BUILDDIR=/var/tmp make debs
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I12f2e7444f0fc7f09f41d64b8e4dd4a429797a37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Li Dongyang [Fri, 23 May 2025 10:05:47 +0000 (20:05 +1000)]
LU-14712 ldiskfs: keep EXT4_BG_TRIMMED flag in memory
Keep the EXT4_BG_TRIMMED flag in memory for the trimmed block groups
so that a filesystem without the track_trim superblock bit (e.g. an
existing filesystem created with an earlier version of e2fsprogs) can
still skip trimmed block groups during fstrim as long as it's mounted.
For persistent trimmed block group tracking, turn on track_trim
with tune2fs.
Change-Id: I19df047c717d3b20310fcba7fa682b6dfab9d5e4
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Boyko [Fri, 16 May 2025 12:38:12 +0000 (14:38 +0200)]
LU-19015 llog: logic for skipping a zeroed record
For ENOSPC errors during dt_write() and thread races, the changelog
could have a sparse file with zeros inside. The current processing
logic skips records until the next chunk.
The patch adds the ability to skip only the zeros in the buffer and start
from a valid record.
The fix also changes llog_test 8 so that it uses a non-zero byte for
corruption.
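As a rough illustration of skipping only the zeroed region, here is a
Python sketch. The next_record_offset() helper is hypothetical, and the
real llog record layout, alignment and header validation are not modelled;
the sketch only finds where non-zero data resumes inside a chunk instead
of abandoning the rest of the chunk.

```python
def next_record_offset(buf, start=0):
    """Return the offset of the first non-zero byte at or after start.

    Returns None when the rest of the buffer is all zeros.
    """
    for off in range(start, len(buf)):
        if buf[off] != 0:
            return off
    return None


# 16 zeroed bytes (a hole), then some non-zero data, then padding.
chunk = bytes(16) + b"\x40\x00\x00\x00" + bytes(4)
print(next_record_offset(chunk))     # 16
print(next_record_offset(bytes(8)))  # None
```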
Fixes:
cb1290768df9 ("LU-18218 mdd: changelog specific write function")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I7263764ba6a89f226995b8967631eaa6d5bdd4dd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 13 May 2025 21:37:17 +0000 (21:37 +0000)]
LU-19013 lnet: fix wording for GDS configure check
The wording on the GDS/CUDA configuration options is
incorrect. If the user does not specify external headers,
Lustre will fall back to the embedded headers rather
than disabling GDS.
Fix the wording on the configure options, improve the
macro name, and reorganize the header such that the correct
definitions are under the ifdef.
Fixes:
c65eabc2b113 ("LU-15189 build: add GDS configure options")
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I645bc1c0c4bf26bdb9841c849b6cf8eebdc0bdee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59215
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Timothy Day [Wed, 16 Apr 2025 16:26:50 +0000 (16:26 +0000)]
LU-18162 mgc: convert Management Client to LU device
Convert MGC to use LU device init/fini rather
than the legacy OBD API.
Also, use ldo_process_config rather than the legacy
o_process_config.
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ic33aeb0d1effabc25b538d946611c3a0b189150e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58826
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Giardi Sylwyn [Mon, 24 Mar 2025 13:52:02 +0000 (14:52 +0100)]
LU-16767 mdt: Allow jobID fields widths
Modify the function jobid_interpret_string to allow the admin to
specify the widths of the fields printed by
lctl get_param mdt.*.job_stats.
By specifying the parameter jobid_name, the admin can truncate the
fields.
For example, the format "%3e.%u.%6h" will print in job_stats
the first 3 characters of the executable name, a dot, the whole uid, and
the first 6 characters of the hostname.
If no digit is passed before the letter, it will print the whole field.
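The width rule can be sketched in a few lines of Python. This is not
jobid_interpret_string itself, which is kernel C; expand_jobid() and the
sample field values are hypothetical, and only the %e, %u and %h codes
from the example above are assumed.

```python
import re


def expand_jobid(fmt, fields):
    """Expand %<width><letter> codes, truncating to <width> characters."""

    def repl(match):
        width, code = match.group(1), match.group(2)
        if code not in fields:
            return match.group(0)  # leave unknown codes untouched
        value = str(fields[code])
        return value[:int(width)] if width else value

    return re.sub(r"%(\d*)([a-zA-Z])", repl, fmt)


# Hypothetical values for executable name, uid and hostname.
fields = {"e": "rsync", "u": 1000, "h": "client-node-17"}
print(expand_jobid("%3e.%u.%6h", fields))  # rsy.1000.client
print(expand_jobid("%e.%u", fields))       # rsync.1000
```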
Signed-off-by: Giardi Sylwyn <sylwyn.giardi@cea.fr>
Change-Id: Ifd94b354cef07a7fff5e70c94c313a7e4617e2f8
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58822
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Tue, 15 Apr 2025 05:21:08 +0000 (05:21 +0000)]
LU-18162 kunit: convert llog unit test to LU device
Convert OBD test to use LU device init/fini rather
than the legacy OBD API.
Test-Parameters: trivial testlist=sanity env=ONLY=60a,ONLY_REPEAT=25
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iab7e3109ac061be826b0d7695fcc69e0dee2346d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Thu, 3 Apr 2025 22:16:46 +0000 (18:16 -0400)]
LU-18824 utils: Fix lfs migrate with --overstripe-count
The --overstripe-count (-C) option was not being properly
honored during file migration. When using lfs migrate with
this option, the overstriping flag was set but the
LLAPI_LAYOUT_OVERSTRIPING pattern was not applied to the
destination file.
This was because in lfs_setstripe_internal(), the code only
set lsa.lsa_pattern = LLAPI_LAYOUT_OVERSTRIPING when not in
migrate mode.
Fix this by always setting the pattern when the overstriped
flag is true, regardless of whether we're in migrate mode or
not.
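The change in the condition can be sketched as follows. This is a Python
illustration rather than the C in lfs_setstripe_internal(); the two helper
functions and the string constants are hypothetical stand-ins for the
lsa.lsa_pattern assignment described above.

```python
LAYOUT_DEFAULT = "default"            # stand-in for the non-overstriped case
LAYOUT_OVERSTRIPING = "overstriping"  # stand-in for LLAPI_LAYOUT_OVERSTRIPING


def pattern_before_fix(overstriped, migrate_mode):
    """Old behaviour: the pattern was only set outside migrate mode."""
    if overstriped and not migrate_mode:
        return LAYOUT_OVERSTRIPING
    return LAYOUT_DEFAULT


def pattern_after_fix(overstriped, migrate_mode):
    """New behaviour: the pattern follows the overstriped flag alone."""
    if overstriped:
        return LAYOUT_OVERSTRIPING
    return LAYOUT_DEFAULT


# lfs migrate -C: overstriped is set and we are in migrate mode.
print(pattern_before_fix(True, True))  # default
print(pattern_after_fix(True, True))   # overstriping
```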
Added a test case (27X) to verify that lfs migrate properly
applies overstriping.
NB: This fix and test were generated and tested by the VS
Code Augment Agent after being given the LU URL and some
prompting.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I734b9d4e3c699e335c9d810bba2e2d2a1c301ed6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58672
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Timothy Day [Sat, 29 Mar 2025 23:18:54 +0000 (19:18 -0400)]
LU-18687 doc: move man *.1 pages to Documentation/man1
Consolidate all of the man pages into the top
level Documentation directory.
Move all of the Lustre man pages (from 1) to Documentation/.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ied472c7612996cfd04670f5b2803bfb48d2bf74a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58587
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Thu, 27 Mar 2025 01:42:38 +0000 (08:42 +0700)]
LU-18852 build: Compatibility updates for kernel v6.14
Linux commit v6.13-rc1-1-g6fba89813ccf
lsm: ensure the correct LSM context releaser
struct lsm_context is now upstream, provide an lsmcontext
mapping for Ubuntu
Linux v6.13-rc1-7-g5be1fa8abd7b
Pass parent directory inode and expected name to ->d_revalidate()
Adjust d_revalidate() to handle the extra arguments.
Use FMODE_NONOTIFY now that __FMODE_NONOTIFY macro is dropped.
Test-Parameters: trivial
HPE-bug-id: LUS-12797
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4ea10d171ab83e6cadb7d03580e9a2748c0d60b0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Wed, 14 May 2025 00:25:08 +0000 (17:25 -0700)]
LU-18668 kernel: new kernel [RHEL 9.6 5.14.0-570.16.1.el9_6]
This patch makes changes to support new RHEL 9.6 release
for Lustre client.
Test-Parameters: trivial env=SANITY_EXCEPT="17p" \
mdtcount=4 mdscount=2 clientdistro=el9.6 testlist=sanity
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.6 testgroup=full-part-3
Change-Id: Idf8c96ee9389978d9497da73b05c5ed400c429d4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Tue, 13 May 2025 00:20:16 +0000 (17:20 -0700)]
LU-18970 kernel: update RHEL 8.10 [4.18.0-553.51.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.51.1.el8_10.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.10 testgroup=full-dne-zfs-part-3
Change-Id: I210fcf4be1bf39a0cb6fc64dcdfa898bb98f87ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59201
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 13 May 2025 00:14:30 +0000 (17:14 -0700)]
LU-18969 kernel: update RHEL 9.5 [5.14.0-503.40.1.el9_5]
Update RHEL 9.5 kernel to 5.14.0-503.40.1.el9_5.
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.4 testlist=sanity
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.4 serverdistro=el9.5 testlist=sanity
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-1
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-2
Test-Parameters: optional fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-part-3
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el9.5 serverdistro=el9.5 testgroup=full-dne-zfs-part-3
Change-Id: I62b270ad85126e6022eaf04ddbd32898fb4dc320
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59200
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Zarochentsev [Wed, 28 May 2025 17:29:26 +0000 (17:29 +0000)]
LU-19070 dne: dir migrate allowed only for root
The current implementation of lfs migrate -m
relies on setxattr(, "trusted.lmv", ), which is
allowed only for users with the CAP_SYS_ADMIN capability.
Adding the same check to ll_migrate() prevents
incomplete migrations by a non-root user.
Add error reporting to cb_migrate_mdt_fini().
Fixes: 0a83d948f3 ("LU-4684 migrate: shrink dir layout after migration")
Fixes: 2dae2b8ffb ("LU-8777 mdt: add parameter to disable remote/striped dir")
HPE-bug-id: LUS-12895
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I58d417b64e2b634d76e4ad38685deb21d9ce8a86
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Mon, 26 May 2025 00:30:24 +0000 (20:30 -0400)]
LU-19008 hsm: add locking for coordinator thread stop
There is no locking around thread stop, which can lead to a race
between mdt_coordinator() and mdt_hsm_cdt_stop() and to a
use-after-free during unmount. Add locking to avoid this.
Fixes: 4512347d6c ("LU-16356 hsm: add running ref to the coordinator")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I996a79fcbca3b1c6f6a0f5ee5d9f052f31eda61f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59425
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 22 May 2025 02:58:06 +0000 (20:58 -0600)]
LU-5969 lnet: use LGPL-2.1+ for SPDX headers
The change from explicit LGPL license text to SPDX headers
introduced a number of incorrect license identifiers, because
the "or (at your option) any later version" text was missed.
Convert remaining library license blocks over to SPDX LGPL-2.1+.
Reorder copyright and file description to be consistent.
Remove filenames explicitly listed in the header block.
Test-Parameters: trivial
Fixes: e6aefbfaa6 ("LU-6142 libcfs: SPDX for libcfs module")
Fixes: 56a9ba02ae ("LU-6142 libcfs: SPDX for libcfs headers")
Fixes: c9a7728476 ("LU-6142 lnet: SPDX for lnet/include/ and misc files")
Fixes: 9e3fd9ce8f ("LU-6142 lnet: SPDX for lnet/util/lnetconfig/")
Fixes: 0f39311369 ("LU-6142 lnet: SPDX for lnet/utils/")
Fixes: 14e981db6c ("LU-6142 misc: SPDX for Lustre headers")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic2e9f70f82211ce5231c12d431ca63dc163ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59367
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Cory Spitz <cory.spitz@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Tue, 13 May 2025 15:59:27 +0000 (18:59 +0300)]
LU-18986 mgc: client part of new registration protocol
Use the new target registration protocol in MGC.
It uses the inline buffer mtn_inline_list[] if the NIDs fit into
it, or prepares a bulk transfer for a large list of NIDs.
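The inline-versus-bulk choice described above can be pictured with a small sketch. This is an illustration only, not the MGC code: the names RegistrationRequest, build_request and INLINE_CAPACITY are hypothetical, and the real capacity of mtn_inline_list[] is not stated in this commit message.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical capacity; the real size of mtn_inline_list[] is not given here.
INLINE_CAPACITY = 4


@dataclass
class RegistrationRequest:
    """Illustrative request: NIDs travel either inline or via bulk."""
    inline_nids: List[str] = field(default_factory=list)
    bulk_nids: List[str] = field(default_factory=list)

    @property
    def uses_bulk(self) -> bool:
        return bool(self.bulk_nids)


def build_request(nids, inline_capacity=INLINE_CAPACITY):
    """Put the NIDs inline if they fit, otherwise prepare them for bulk."""
    nids = list(nids)
    if len(nids) <= inline_capacity:
        return RegistrationRequest(inline_nids=nids)
    return RegistrationRequest(bulk_nids=nids)


small = build_request(["10.0.0.1@tcp", "10.0.0.2@tcp"])
large = build_request([f"10.0.0.{i}@tcp" for i in range(1, 41)])
print(small.uses_bulk, large.uses_bulk)
```

The point of the sketch is only that the sender decides per request, so short NID lists avoid the cost of setting up a bulk transfer.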
Test-Parameters: testlist=runtests mdsversion=EXA6.3.2
Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifc0fd24d7eb26dd092c3e9cce895980b26f0524d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59212
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Marc Vef [Tue, 13 May 2025 11:27:31 +0000 (13:27 +0200)]
LU-18756 sec: add resource id check to oss and mds
This patch includes the resource id check into the relevant code paths
on the oss and mds side. It is therefore included for the following
operations.
On the MDT-side:
- open
- create (file and directory)
- unlink (file and directory)
- setattr
- setxattr
- getxattr
- rename
- link
On the OST-side and on the MDT-side for Data on MDT (DoM) files:
- write
- read
- truncate
- fallocate
Some caveats:
The resource id check is not included for MDS_GETATTR RPCs due to
functional and usability concerns. Specifically for the latter, the
"struct stat" would no longer be filled resulting in "?" when running
"ls -l", which can be misunderstood.
Also, if the check is only enabled on the OST-side, writes are only
denied for "sync"/"fsync"-type operations on a file as the check is at
the server-side. If the check is enabled on the MDT-side, write-access
is denied before the OST_WRITE RPC is sent, i.e., immediately
returning the access denied error code. If a file is still in the page
cache before the check is enabled, a client can still read the local
copy of the file, which is expected.
Sanity-sec test 75a was added to exercise the ID check for the above
cases in several disciplines, and further tests that access to
neighboring nodemap offset ranges works as expected.
Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I040ddb1b934707baa84b492337139f45b856692e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Marc Vef [Tue, 13 May 2025 11:13:50 +0000 (13:13 +0200)]
LU-18756 sec: add generic nodemap resource id check
This patch represents the first patch in the series to check the OST
object and MDT inodes UID/GID against the nodemap offset range. This
patch adds the corresponding functions on the OST, MDT, and nodemap
sides for the resource ID check. A resource is defined as an MDT inode
or OST object. This patch does not yet connect the functions to the
relevant codepaths. The patch further adds the new "lctl set_param"
configurables, which are (for now) disabled by default:
- "lctl set_param mdt.*.enable_resource_id_check={0,1}" toggling the
check on the MDT side.
- "lctl set_param obdfilter.*.enable_resource_id_check={0,1}" toggling
the check on the OST side.
These configurables work individually but should be toggled together.
The ID check relies on the "nodemap_map_id()" functionality to
guarantee compatibility with the nodemap mapping functionality, e.g.,
covering both offset and mapping cases, among others. The ID check
therefore functions as follows:
If "nodemap_map_id()" returns the squashed value for both UID and GID
for a given client export, "fs_uid", and "fs_gid" stored on the MDT
inode and OST object, access is not permitted to the resource. It
does not rely on any IDs given by the client. The corresponding
permission bits or ACLs are not taken into consideration and are
only relevant later if access was permitted elsewhere.
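The decision rule above can be sketched in a few lines. This is a simplified model, not the Lustre implementation: map_fs_id_to_client() is a hypothetical stand-in for "nodemap_map_id()" that only models an offset range, and the squash value 65534 and the offset/limit numbers are assumptions chosen for illustration.

```python
# Assumed squash value for illustration; the real one is a nodemap setting.
SQUASH_ID = 65534


def map_fs_id_to_client(fs_id, offset, limit, squash_id=SQUASH_ID):
    """Hypothetical stand-in for nodemap_map_id(), offset case only.

    IDs inside [offset, offset + limit) map back to the client view;
    anything else is squashed.
    """
    if offset <= fs_id < offset + limit:
        return fs_id - offset
    return squash_id


def resource_access_allowed(fs_uid, fs_gid, offset, limit, squash_id=SQUASH_ID):
    """Deny access only if BOTH the stored UID and GID map to the squash ID."""
    uid = map_fs_id_to_client(fs_uid, offset, limit, squash_id)
    gid = map_fs_id_to_client(fs_gid, offset, limit, squash_id)
    return not (uid == squash_id and gid == squash_id)


# Nodemap owning the hypothetical range [100000, 101000).
print(resource_access_allowed(100500, 100500, 100000, 1000))  # inside the range
print(resource_access_allowed(200500, 200500, 100000, 1000))  # neighboring range
```

Note that only the IDs stored on the MDT inode or OST object enter the decision; nothing supplied by the client is consulted, which matches the description above.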
Test-Parameters: trivial
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I818c511cd37251843bcfa6b873ef8bdc05176980
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 8 May 2025 19:07:37 +0000 (22:07 +0300)]
LU-18986 mgs: server part of new registration protocol
Rework mgs_target_reg() to handle the new protocol along with
the old one for older targets.
It handles the old protocol with NIDs either in mti_nids or
in mti_nidlist[], and the new protocol with NIDs in
mtn_inline_list[] or bulk.
All NIDs are put in mti_nidlist[] as a result of request
processing, which eliminates the need for extra changes in
the further code path.
Test-Parameters: testlist=runtests ossversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I41dd487c37136e24328914e33c9ce056be013aae
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 8 May 2025 09:40:54 +0000 (12:40 +0300)]
LU-18986 mgs: new target registration protocol
This patch adds a new target registration request format with
enhanced NID list handling. The idea is not to overload
mgs_target_info with extra flags and fields for the NID list
description, but to keep such information in a new structure.
The NID list is always an array of strings and can be sent
in various manners: inline buffer, bulk, compressed,
appended, etc.
It also helps to resolve compatibility issues.
Patch includes:
- new wire structure mgs_target_nidlist
- new possible RPC format with mgs_target_nidlist buffer
- new connect flag OBD_CONNECT_MGS_NIDLIST to replace
obsoleted OBD_CONNECT_REQPORTAL removed in commit
1.6.0-159-gd2d56f38da ("make HEAD from b_post_cmd3")
- corresponding swabber and wirecheck
Test-Parameters: testlist=runtests clientversion=EXA6.3.2
Test-Parameters: testlist=runtests serverversion=EXA6.3.2
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I441de467a530137f76712273b9a5f814fdb562c1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59205
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Mon, 27 Jan 2025 16:44:25 +0000 (17:44 +0100)]
LU-17410 sec: per-nodemap capabilities mask
Add a per-nodemap capabilities mask, used in preference to the global
enable_cap_mask parameter if it is set.
The new nodemap property is named enable_cap_mask, and can be set
thanks to the new lctl command 'nodemap_set_cap'. It is possible to
specify capabilities in hex or with symbolic names, with '+' and '-'
prefixes to respectively add or remove corresponding capabilities.
We support defining 2 types of capabilities, either a "set" so that it
is possible to add capabilities, or a "mask" to reduce capabilities of
the client.
This per-nodemap capabilities mask is available on any nodemap
including the default nodemap.
A dynamic child nodemap is allowed to define only a subset of the
capabilities set on the parent, unless the child_raise_privileges
property has the 'caps' privilege.
sanity-sec test_51 is enhanced to exercise this new nodemap property.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1ed91c721d869d0596af9c2d7e07a2c411f2b7c2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57938
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Marc Vef <mvef@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Boyko [Fri, 13 Dec 2024 12:57:17 +0000 (13:57 +0100)]
LU-18556 hsm: optimize llog record modification
This commit introduces a new llog modification mechanism for HSM
operations to address inefficiencies caused by prior reliance on
catalog processing. The new approach directly modifies the llog record,
eliminating the need for catalog-based processing and reducing
latency.
Key changes include:
* Replacing the hsm_action_item (HAI) with a full in-memory llog
record representation, increasing memory usage by ~80 bytes per
record but removing the need for a dedicated llog cookie hash
table.
* Unifying the coordinator's read/store logic for HAI data into a
single in-memory item shared by mdt_hsm_agent_send() and
mdt_hsm_add_hsr(). This reduces memory allocation steps: only one
cdt_agent_req allocation is now required during llog read
operations, eliminating subsequent allocations/copies.
Performance results on VMs (2 MDTs, 2 OSTs, 2 clients, no-op copytool):
Test 1 (1M archive requests): 572s -> 187s (~3 times faster)
Test 2 (1M archive + 1M queued): 558s -> 392s (~1.4 times faster)
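As a quick check of the quoted ratios, the speedup is simply the old time divided by the new time. The helper name speedup() below is only for illustration; the timings are the ones listed above.

```python
def speedup(before_s, after_s):
    """Return how many times faster the new run is than the old one."""
    return before_s / after_s


test1 = speedup(572, 187)  # 1M archive requests
test2 = speedup(558, 392)  # 1M archive + 1M queued

print(f"Test 1: {test1:.2f}x, Test 2: {test2:.2f}x")
```

Both values round to the "~3 times" and "~1.4 times" figures given in the results.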
HPE-bug-id: LUS-12583
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I4b6e697bc3b1f0cf2c76f5433b49affbc933c653
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vitaliy Kuznetsov [Wed, 11 Jun 2025 16:10:39 +0000 (18:10 +0200)]
LU-14772 tests: Add conf-sanity-framework.sh
This patch creates a new file, conf-sanity-framework.sh.
The functions from conf-sanity.sh will be moved into
this file, and will also be used in other tests with
the conf-* prefix.
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I6e0c53d4e15fa01c341be7a67fcf386c4fb5f0ed
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Marc Vef [Sun, 25 May 2025 19:02:50 +0000 (21:02 +0200)]
LU-19050 utils: Support long nid lists when getting fs info
When "get_root_path_slow()" is called through various user commands,
e.g., "lfs setquota", the internal "root_cache" is filled with mount
point information. The cache's "nid" field allowed 256 characters
which resulted in a buffer overflow for long nidlists that are set
during mount.
This patch removes this limitation and further removes the "nid" field
from the "root_cache" since it is only needed in the "lfs check"
command. Therefore, the nid list no longer needs to be processed and
put into the cache in the numerous other llapi_* functions where the
nid list is never accessed.
Further, string copy handling was insufficient, allowing the overflow
in the first place, and was updated accordingly for all fields.
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I3d9c30795fba14618368b7b9e1769fe0b07d3fc7
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Sat, 7 Jun 2025 23:10:47 +0000 (19:10 -0400)]
New tag 2.16.56
Change-Id: Iabf1977eeb273e629a3ea4c6ba75a3eadaa8be2a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Nathaniel Clark [Thu, 22 May 2025 11:53:20 +0000 (07:53 -0400)]
LU-19039 lnetconfig: Fix error string in cyaml output
String output in yaml only needs to be quoted when it begins with '@',
''', '"', or '- ', or when it contains ':'.
This corrects the most common error output for `lnetctl ping` errors
to be correct yaml and also cleans up all other error strings output.
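The quoting rule stated above can be written down as a small predicate. This sketch follows only the rule as worded in this commit message; it is not the lnetconfig code and does not claim to cover every case in the YAML specification. The function names and the double-quote escaping style are assumptions.

```python
def needs_yaml_quotes(value: str) -> bool:
    """Apply the rule from the commit message: quote if the string begins
    with '@', a single quote, a double quote, or '- ', or contains ':'."""
    return value.startswith(("@", "'", '"', "- ")) or ":" in value


def yaml_scalar(value: str) -> str:
    """Return the value as-is, or double-quoted with backslash escapes."""
    if not needs_yaml_quotes(value):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


for msg in ["failed to ping", "errno: -1", "- unexpected reply"]:
    print("errno_str:", yaml_scalar(msg))
```

Leaving plain strings unquoted keeps the common error output short while still producing output a YAML parser accepts.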
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I9a8436280b34f82cf78152e488b68c0581cc2a7d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Tue, 27 May 2025 04:08:26 +0000 (00:08 -0400)]
LU-15358 tests: Escape quote symbols in sanityn test 26b
Shellcheck highlights that those quotes are actually unquoting
the variables. And looking at prior code we really try to ensure
you can tell which one is which even when some of them are empty
or have spaces.
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I2cdd0dcc1bce59b397f928cffeb790c74d8dc311
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 17 Apr 2025 10:17:32 +0000 (13:17 +0300)]
LU-16818 tests: ignore more opcodes in replay-single/65a
Ignore a few more opcodes which can interfere with testing:
MDS_STATFS, OST_STATFS, OST_DISCONNECT and OST_PRECREATE
Test-Parameters: env=ONLY=65a,ONLY_REPEAT=100 testlist=replay-single
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib730b540b9075e0ed871bc11f3bdfb4cfd4634a1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58838
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sun, 25 May 2025 00:17:54 +0000 (18:17 -0600)]
LU-18276 tests: add debugging to sanity-pfl/16b
Add extra debugging messages to sanity-pfl.sh test_16b to help find
what is causing this test to fail with ENOSPC intermittently.
Reduce size of overstriped PFL file layout slightly, so that two
such components can fit within the xattr size limit, which may or
may not be the cause of the ENOSPC failures.
Print a message in llapi_layout_file_open() if ENOSPC is hit, so
that we can determine the xattr size, in case it is too large.
Move layout conversion before file open() to avoid contacting
MDS needlessly if the layout is bad.
Test-Parameters: trivial testlist=sanity-pfl env=ONLY=16b,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf347e231147041dda07277227e80f0b6f2540e5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Aurelien Degremont [Fri, 23 May 2025 14:00:23 +0000 (16:00 +0200)]
LU-19051 config: silent spurious messages while checking mpitests
When detecting mpicc configuration, do not print warnings
or error messages in the middle of configure output.
Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: If536aa1d04f0d641a7b2a721869261c85907e084
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Fri, 23 May 2025 03:32:26 +0000 (21:32 -0600)]
LU-19046 mgc: mgc_fs_setup() should wait interruptibly
When a target mounts, it fetches a copy of its config log from the
MGS to store in the local filesystem. However, the MGC can currently
only fetch the config log for one target filesystem at a time.
This should be improved in a separate patch.
If the MGS is inaccessible, or there is a problem during setup, the
server will wait for it while holding cl_mgc_mutex. Other targets on
the same server will be unable to mount, and block on cl_mgc_mutex,
possibly dumping a stack trace like:
INFO: task mount.lustre:93138 blocked for more than 90 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" to disable this
task:mount.lustre state:D stack:0 pid:93138 ppid:93135
Call Trace:
__schedule+0x2d1/0x870
schedule+0x55/0xf0
schedule_preempt_disabled+0xa/0x10
__mutex_lock.isra.11+0x349/0x420
mgc_fs_setup.isra.12+0x65/0x7a0 [mgc]
mgc_set_info_async+0x99f/0xb30 [mgc]
server_start_targets+0x452/0x2c30 [obdclass]
server_fill_super+0x94e/0x10a0 [obdclass]
lustre_fill_super+0x388/0x3d0 [lustre]
mount_nodev+0x49/0xa0
legacy_get_tree+0x27/0x50
vfs_get_tree+0x25/0xc0
do_mount+0x2e9/0x950
ksys_mount+0xbe/0xe0
Use wait_event_interruptible() in mgc_fs_setup() so the server's mount
thread can be interrupted and killed. This does not fix the reason
for the server to be blocked, but it does allow it to be killed.
Rename mgc_fs_cleanup() to mgc_fs_clear() so it is not confused with
actually cleaning up the MGC.
Avoid printing an error if the sptlrpc log is not available. This is
common for most filesystems, and is not an error.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bafa5dae0eadecb112efaf61f8bcf7ea8c4c296
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Fri, 23 May 2025 01:16:04 +0000 (08:16 +0700)]
LU-17242 libcfs: use sched_show_task() for thread dumping
Use sched_show_task() for thread dumping, since it should be
available on all kernels that Lustre supports. On some kernels,
libcfs_debug_dumpstack() is unable to show the thread stack.
Replacing this function avoids that issue.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I421560b0d4223fd3503f4a3697a7615dd43bad8f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Thu, 22 May 2025 06:31:54 +0000 (23:31 -0700)]
LU-19040 kernel: update SLES15 SP6 [6.4.0-150600.23.50.1]
Update SLES15 SP6 kernel to 6.4.0-150600.23.50.1 for Lustre client.
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testlist=sanity
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-1
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-2
Test-Parameters: optional mdtcount=4 mdscount=2 \
clientdistro=sles15sp6 testgroup=full-dne-part-3
Change-Id: Ie2d530f0edb28326bbcbd1326f40e3e7db845c21
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <adeiter@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 05:09:47 +0000 (01:09 -0400)]
LU-18813 osd-wbcfs: refactor osd_device_alloc
osd_device_alloc() has improper error handling.
Refactor the function such that we properly
clean up if __osd_device_init() fails.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ia03eb805ef3fdc75c8490e09c66b99e6541d13fd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 04:57:57 +0000 (00:57 -0400)]
LU-18813 osd-wbcfs: remove f_op llseek checks
MemFS will always have llseek defined, so we
can remove the checks in the OSD.
Test-Parameters: trivial
Test-Parameters: testlist=sanity fstype=wbcfs mdscount=4 mdtcount=1 osscount=4 ostcount=1
Test-Parameters: testlist=sanity fstype=wbcfs combinedmdsmgs=false standalonemgs=true mdscount=1 mdtcount=1 osscount=4 ostcount=1
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I77f7abcef686c9c654b7bee04b3f88bb89a87756
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Tue, 20 May 2025 04:53:02 +0000 (00:53 -0400)]
LU-18813 osd: fix dcb_func LASSERT
Each OSD was incorrectly asserting that the
address of the function pointer was not NULL,
instead of the function pointer itself.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Ie5682a9d80219743ecb86d8d463cbabcdbf77b64
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59304
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Fri, 4 Apr 2025 01:58:13 +0000 (04:58 +0300)]
LU-19030 quota: lfs quota all respects nodemap
Command lfs quota all should print only IDs from the appropriate
nodemap range. The patch also maps FS quota IDs to client IDs
according to nodemap before returning in a quota all iterator buffer.
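A rough model of the behavior described above: filter the filesystem-side quota IDs to the nodemap's range, then translate them to the IDs the client sees. This sketch assumes a pure offset-type nodemap with hypothetical offset and limit values; the real mapping also handles explicit ID maps and is done on the server, not in Python.

```python
def quota_ids_for_client(fs_ids, offset, limit):
    """Keep only FS IDs inside [offset, offset + limit) and map them
    to client IDs by subtracting the offset (offset-type nodemap assumed)."""
    return [fs_id - offset
            for fs_id in sorted(fs_ids)
            if offset <= fs_id < offset + limit]


# FS quota IDs as stored on disk; only two belong to this nodemap's range.
stored_ids = [5, 100000, 100042, 250000]
print(quota_ids_for_client(stored_ids, offset=100000, limit=1000))
```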
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I8820e18957805c0dceacc4674713875b024a8e99
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59297
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Timothy Day [Fri, 16 May 2025 05:50:43 +0000 (01:50 -0400)]
LU-18813 contrib: add an example config.site
The variable CONFIG_SITE can be used to specify
config files to the Autoconf generated configure
script. This is a useful alternative to long
configure command lines.
Add an example config.site file used for compiling
Lustre server (osd-wbcfs) and client for use in
ktest.
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6597b860629643ced7191d7a250a86ede2576993
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Lijing Chen <lijinc@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Marc Vef [Wed, 21 May 2025 11:35:33 +0000 (13:35 +0200)]
LU-19021 ptlrpc: Add obd info to nodemap exports output
When clients connect to MDTs/OSTs, a new export is generated on the
server-side during obd_connect_*() with the client's UUID. For each
target, a separate export is created which is then added to the
nodemap's "nm_member_list", if applicable.
Currently, "lctl get_param nodemap.NM_NAME.exports" prints the UUID
and NID information for each entry in the "nm_member_list". Because
the obd device is not listed, duplicate entries appear to be shown for
each client, which can be confusing for the administrator.
This patch extends the nodemap.NM_NAME.exports output by also showing
the obd the client is connected to, e.g., MDT0000, MDT0001, etc, such
that the shown entries no longer appear as duplicates.
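Why the entries looked duplicated can be shown with a toy model of the member list. The tuple layout, UUID, NID and output format below are made up for illustration and do not reproduce the actual "lctl get_param nodemap.NM_NAME.exports" output.

```python
from collections import Counter

# One export per (client, target) pair; values are illustrative only.
members = [
    ("client-uuid-a", "192.168.1.10@tcp", "testfs-MDT0000"),
    ("client-uuid-a", "192.168.1.10@tcp", "testfs-MDT0001"),
    ("client-uuid-a", "192.168.1.10@tcp", "testfs-OST0000"),
]


def format_without_obd(entries):
    """Old-style lines: UUID and NID only."""
    return [f"{uuid} {nid}" for uuid, nid, _obd in entries]


def format_with_obd(entries):
    """New-style lines: UUID, NID and the obd the export belongs to."""
    return [f"{uuid} {nid} {obd}" for uuid, nid, obd in entries]


def duplicated(lines):
    """Return the lines that occur more than once."""
    return sorted(line for line, count in Counter(lines).items() if count > 1)


print(duplicated(format_without_obd(members)))
print(duplicated(format_with_obd(members)))
```

With the obd name included, each line identifies one export, so a single client connected to several targets no longer looks like a repeated entry.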
Signed-off-by: Marc Vef <mvef@whamcloud.com>
Change-Id: I681480f9258e57c522acc148f4096a8f40c71eab
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59248
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>