Whamcloud - gitweb
Sergey Cheremencev [Mon, 27 May 2024 22:49:24 +0000 (01:49 +0300)]
LU-12706 tests: sanity-quota 4a sync timeout fix
Don't sync all OSTs in a system - this might take
too much time. Instead, set striping only on OST0000
and sync only MDTs and OST0000. This fix is against
the following failure:
FAIL: Passed grace time 20, 1566910527, 1566910563
Lustre-change: https://review.whamcloud.com/55216
Lustre-commit: 9e7b239bbd26b601127073bb0c6789cb9def7073
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I525e6c73c6d14a126a2bde7d92bc28f11f3c78c8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55470
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Mon, 24 Jun 2024 14:08:03 +0000 (16:08 +0200)]
EX-9970 lipe: Fix "NaN" value in files size statistic
This patch fixes the issue where the file size is less than
a kilobyte, in which case the total size was incremented by 0
because it was rounded down to kilobytes.
Now, the summation of the total values occurs without rounding
and is done using floating-point arithmetic. Rounding is
applied only at the very end during calculations.
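The rounding effect described above can be reproduced with a small sketch. The function names and sample sizes are hypothetical and are not taken from the lipe sources; the sketch only assumes that sizes arrive in bytes and are reported in kilobytes.

```python
def total_kb_rounded_per_file(sizes_bytes):
    """Old behaviour as described: truncate each size to whole KiB, then sum."""
    return sum(size // 1024 for size in sizes_bytes)


def total_kb_rounded_at_end(sizes_bytes):
    """New behaviour as described: sum in floating point, round once at the end."""
    total = 0.0
    for size in sizes_bytes:
        total += size / 1024
    return round(total)


# Hypothetical sample: four files, each smaller than one kilobyte.
sizes = [512, 512, 512, 512]
old_total = total_kb_rounded_per_file(sizes)
new_total = total_kb_rounded_at_end(sizes)
```

With per-file truncation every small file contributes nothing, so the total stays at zero; any ratio later computed against that total is then undefined, which is one plausible way a "NaN" could end up in a report.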
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ic9e7d703ec68f61553b566bc65de9014398969e1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Thu, 27 Jun 2024 02:16:51 +0000 (19:16 -0700)]
EX-10036 tests: delay verification to finish set flag
- clean files in the pools before the test starts, so that it
focuses only on the set of test files
- reset the changelog before lamigo starts, to keep lamigo from
doing normal sync work so that it focuses only on scan work
- wait a bit until the set flag commands finish before
verifying them
Test-Parameters: trivial testlist=hot-pools
Test-Parameters: trivial testlist=hot-pools env=ONLY="60 82",ONLY_REPEAT=10
Test-Parameters: trivial testlist=hot-pools env=ONLY="60 82",ONLY_REPEAT=10
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I70c5252564cfe6eeebb0cfc4f8383f263a233a71
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55546
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 20 Jun 2024 06:27:34 +0000 (00:27 -0600)]
LU-17906 tests: skip conf-sanity/153a in interop
conf-sanity.sh test_153a mounts the OST nodes slowly when running
against an older server, which causes this test to fail regularly.
Skip it during interop testing.
Test-Parameters: trivial testlist=conf-sanity env=ONLY=153a serverversion=EXA6
Fixes: 57ac09024f ("LU-17379 mgc: try MGS nodes faster")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5e728abba54de570a4393d06d71dd7f46e2d4b94
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55480
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Aurelien Degremont [Tue, 11 Jun 2024 08:33:11 +0000 (10:33 +0200)]
LU-17929 ptlrpc: ptlrpc_request_alloc_pack() always returns an error code
The current code always assumed that a NULL return from this
function meant an ENOMEM error, but this is not always
true, for example when using GSS or when
reconnecting from an IDLE state.
Also, instead of having every caller convert NULL to
ENOMEM, do that directly in the function when
appropriate.
Make ptlrpc_request_alloc_pack() return -errno in case
of error instead of a NULL pointer.
Thanks to this change, the error code is propagated up,
which helps error reporting and debugging.
Took the opportunity to simplify the related error paths
for 2 HSM functions.
Also changed param.status to a signed type, as it can
store -errno.
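The loss of information with a NULL return can be modelled outside the kernel. The sketch below is a toy Python model, not the ptlrpc code: the function names are hypothetical, `None` stands in for NULL, and EACCES is an arbitrary example of a non-ENOMEM failure.

```python
import errno


def alloc_pack_returning_none(failure=None):
    """Old convention: every failure collapses to None (NULL)."""
    if failure is not None:
        return None
    return {"status": 0}


def alloc_pack_returning_errno(failure=None):
    """New convention: a failure is reported as a negative errno value."""
    if failure is not None:
        return -failure
    return {"status": 0}


def caller_old(failure=None):
    req = alloc_pack_returning_none(failure)
    if req is None:
        # The caller can only guess; the real cause is lost.
        return -errno.ENOMEM
    return 0


def caller_new(failure=None):
    req = alloc_pack_returning_errno(failure)
    if isinstance(req, int):
        # The real cause is propagated unchanged.
        return req
    return 0


old_rc = caller_old(errno.EACCES)
new_rc = caller_new(errno.EACCES)
```

In the old model the caller reports ENOMEM whatever actually went wrong, while the new model hands the original error code up the call chain.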
Lustre-change: https://review.whamcloud.com/55391
Lustre-commit: fc00b7e3d1bbb6ad390c5d69a73353cb7b61960a
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Change-Id: Id2b873d5f0c5cb89db070f6db00269545e6c85e8
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55531
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Sat, 20 Apr 2024 22:02:54 +0000 (18:02 -0400)]
LU-17809 osp: make disconnect asynchronous
An MDT can have many OSP devices. During umount, the disconnect
requests can hit cascading timeouts, which can lead to an
unpredictably long umount time.
This patch adds the ability to disconnect OSP devices in parallel.
During LCFG_PRECLEANUP, osp_disconnect() sends the disconnect requests,
and osp_shutdown() waits for them. So the cascading timeouts are reduced
to a single request wait.
Don't drop the obd_force flag from upper layers.
Add replay-single test 201, which simulates delays of OSP disconnects.
Without parallel disconnect, these delays add up to a high cumulative umount time.
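The difference between cascading and parallel waits comes down to simple arithmetic. This is a minimal sketch with hypothetical names and a made-up timeout value; it models only the worst-case wait, not the actual OSP code.

```python
def serialized_wait(timeouts):
    """Each disconnect is waited for in turn, so the timeouts add up."""
    return sum(timeouts)


def parallel_wait(timeouts):
    """All disconnects are sent first, so the wait is bounded by the slowest one."""
    return max(timeouts, default=0)


# Hypothetical: 8 OSP devices, each disconnect timing out after 30 seconds.
osp_timeouts = [30] * 8
worst_case_serialized = serialized_wait(osp_timeouts)
worst_case_parallel = parallel_wait(osp_timeouts)
```

With these made-up numbers the serialized wait grows linearly with the number of OSP devices, while the parallel wait stays at a single timeout.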
Lustre-change: https://review.whamcloud.com/54995
Lustre-commit: ffedcbae21f7aefe5c2258a94b36fe286f46182c
HPE-bug-id: LUS-12251
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id788b22c494147bdc7f0d36968629e7b7f660e01
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Thu, 11 Jan 2024 10:51:11 +0000 (16:21 +0530)]
LU-16741 ptlrpc: ptlrpc: rename ptlrpc_req_finished
First in a series of patches that rename ptlrpc_req_finished
to ptlrpc_req_put.
Change it as part of a general refactor of the ptlrpc
request put/freeing code.
Lustre-change: https://review.whamcloud.com/53648
Lustre-commit: 1c25cb7a3e3db17a65a8560915a0b79ada05a351
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3f897b74debe383c4efb25c9a0becc1c27faa3d9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <patrick.farrell@oracle.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Wed, 3 Feb 2021 11:04:52 +0000 (06:04 -0500)]
LU-14397 ptlrpc: idle import vs lock enqueue race
There is a window between ptlrpc_check_import_is_idle()
and setting LUSTRE_IMP_CONNECTING for lock enqueue.
The lock gets granted on the OST and is returned to the client.
The server's lock is destroyed on OST_DISCONNECT.
Perform the import counters check together with setting LUSTRE_IMP_CONNECTING.
A regression test, test_812c, was added to sanity.
Lustre-change: https://review.whamcloud.com/41403
Lustre-commit: e6af3c529021976e6df5b5e729d6a6197d27fe11
HPE-bug-id: LUS-8705
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I85da18b29ca58f811ecde8ce72ba24373388947e
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55494
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bruno Faccini [Fri, 7 Jun 2024 09:22:44 +0000 (11:22 +0200)]
LU-17911 lustre: fix faked flexible arrays in getinfo_fid2path
Faked (0-length) flexible arrays need some rework to comply with
the new coding rules and stay Fortify feature compliant (see the document
at https://people.kernel.org/kees/bounded-flexible-arrays-in-c).
With this particular getinfo_fid2path struct content, we ended up
with generated code that crashed on every call of the
lmv_fid2path routine, at the first strlen() call.
Lustre-change: https://review.whamcloud.com/55354
Lustre-commit: 3eb2f75836e822cdf411823330c36ba13f2339ee
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: Id6f594779ca0ae86f0c2842535abccbf4df688d3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55533
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Fri, 17 May 2024 09:40:23 +0000 (05:40 -0400)]
LU-17897 lfsck: don't assert on orphan existence
lfsck_namespace_create_orphan_dir() is called in several cases,
and the orphan may already exist in some of them, so change the assertion to a check.
Lustre-change: https://review.whamcloud.com/55302
Lustre-commit: 192c395d01062f2e1178ec8ce437f5eea42011c1
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I28563aa60d0f345616fd30cd0899495e7c1ef8f0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Wed, 29 May 2024 20:41:54 +0000 (16:41 -0400)]
LU-17872 ldlm: switch to read_positive in reclaim_full
Checking reclaim full for every lock request is expensive;
it requires taking a global spinlock and can completely
clog the MDS CPU on larger systems.
If we switch to read_positive rather than sum_positive for
our counter read, we avoid this spinlock at the cost of
being off by as much as NR_CPU*32.
Since the counter is for hundreds of thousands to millions
of items and just triggers memory reclaim, this level of
error is completely fine.
This resolves the contention issue, on an OCI system with
384 cores, here's our mdtest comparison:
Operation            | Without Patch | With Patch  | %Change
---------------------|---------------|-------------|--------
Directory creation   |     69481.994 |   64373.060 |     -7%
Directory stat       |     87942.757 |  274670.454 |    212%
Directory rename     |     78127.922 |   92592.239 |     19%
Directory removal    |     69901.490 |   89560.415 |     28%
File creation        |     62789.774 |  107294.450 |     71%
File stat            |     88039.061 |  480469.711 |    446%
File read            |     82192.370 |  151117.380 |     84%
File removal         |    146690.828 |  127589.655 |    -13%
Tree creation        |        46.549 |      56.992 |     22%
Tree removal         |        51.531 |      53.967 |      5%
Note the *446%* improvement in stat and the 70-80% in
file creation and read.
Note this issue is likely much worse on systems with higher
core counts since the cost of summing the counter scales
with the number of CPUs. This may be why this has not been
seen before.
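The trade-off between the two counter reads can be illustrated with a simplified model of a batched per-CPU counter. This is a Python sketch for illustration only, not the kernel implementation: the class and its attribute names are hypothetical, locking is omitted, and the batch of 32 is taken from the NR_CPU*32 bound quoted above.

```python
class PerCpuCounter:
    """Toy model of a batched per-CPU counter (no locking, single-threaded)."""

    def __init__(self, nr_cpus, batch=32):
        self.count = 0               # shared value, cheap to read
        self.local = [0] * nr_cpus   # per-CPU deltas not yet folded in
        self.batch = batch

    def add(self, cpu, amount):
        self.local[cpu] += amount
        # Fold the local delta into the shared value once it reaches the batch.
        if abs(self.local[cpu]) >= self.batch:
            self.count += self.local[cpu]
            self.local[cpu] = 0

    def read_positive(self):
        """Approximate read: looks at the shared value only."""
        return max(self.count, 0)

    def sum_positive(self):
        """Exact read: walks every per-CPU slot (needs a lock in a real kernel)."""
        return max(self.count + sum(self.local), 0)


NR_CPUS = 4
counter = PerCpuCounter(NR_CPUS)
for cpu in range(NR_CPUS):
    for _ in range(31):
        counter.add(cpu, 1)

approximate = counter.read_positive()
exact = counter.sum_positive()
```

In this worst-case setup every CPU holds a delta just below the batch size, so the approximate read lags the exact sum by almost NR_CPUS * batch, which is the error bound the commit message accepts in exchange for not touching every CPU's slot.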
Lustre-change: https://review.whamcloud.com/55141
Lustre-commit: 0c16987b2233c32d775f0e3e6f6503c4b7825e02
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I01a39abf5e6f0829156b413b1f44001e2c504be2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: wangdi <di.d.wang@oracle.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55479
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Wed, 5 Jun 2024 12:35:02 +0000 (14:35 +0200)]
LU-17907 enc: enc flag should not remove other flags
When updating inode flags, the lli_flags must be taken into account
so that they do not get lost. So provide helper functions for callers
of ll_update_inode_flags(), as an overlay to ll_inode_to_ext_flags().
And on server side, the mdd layer must fetch the existing flags when
setting LUSTRE_ENCRYPT_FL attr flag.
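The flag loss described above is the classic difference between assigning a bit mask and OR-ing into it. A minimal sketch follows; the bit values are placeholders and not the real Lustre constants, and the helper names are hypothetical.

```python
# Placeholder bit values for illustration only.
ENCRYPT_FL = 0x1
IMMUTABLE_FL = 0x2
APPEND_FL = 0x4


def set_encrypt_overwriting(existing_flags):
    """Buggy pattern: the new value replaces whatever was already set."""
    return ENCRYPT_FL


def set_encrypt_preserving(existing_flags):
    """Fixed pattern: fetch the existing flags first, then add the new bit."""
    return existing_flags | ENCRYPT_FL


existing = IMMUTABLE_FL | APPEND_FL
lost = set_encrypt_overwriting(existing)
kept = set_encrypt_preserving(existing)
```

Only the second variant leaves the previously set bits intact, which is why the existing flags have to be fetched before the encryption flag is applied.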
Lustre-change: https://review.whamcloud.com/55317
Lustre-commit: cfde96fb2e77c33a8ab9afede79cdf093ee8ff9e
Fixes: 40d91eafe2 ("LU-12275 sec: atomicity of encryption context getting/setting")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I150f2d87cef112beab81d1d2030133671d4b7361
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55318
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 11 Jun 2024 10:40:26 +0000 (12:40 +0200)]
LU-17930 gss: node principal expectations
When a credentials cache exists for Kerberos, lgss_keyring looks into
it to find a valid entry. The cache's principal must match the
expected role for the GSS request being processed:
- LGSS_ROOT_CRED_MDT: expect "lustre_mds" principal;
- LGSS_ROOT_CRED_OST: expect "lustre_oss" principal;
- LGSS_ROOT_CRED_ROOT: expect "lustre_root" or "host" principal.
And there is the special case of the GSS request on the MGC, for which
by convention all 3 roles are applied at the same time.
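The role-to-principal expectations listed above can be written as a lookup table. The sketch below is one reading of the rules, not the lgss_keyring code: the bit values are placeholders, `principal_matches` is a hypothetical helper, and it assumes the MGC case accepts any principal allowed by one of the three roles.

```python
# Placeholder bit values; only the flag names come from the commit message.
LGSS_ROOT_CRED_ROOT = 0x01
LGSS_ROOT_CRED_MDT = 0x02
LGSS_ROOT_CRED_OST = 0x04

EXPECTED_PRINCIPALS = {
    LGSS_ROOT_CRED_MDT: {"lustre_mds"},
    LGSS_ROOT_CRED_OST: {"lustre_oss"},
    LGSS_ROOT_CRED_ROOT: {"lustre_root", "host"},
}


def principal_matches(cred_flags, service_name):
    """Return True if the cache principal's service name fits a requested role."""
    return any(
        cred_flags & role and service_name in names
        for role, names in EXPECTED_PRINCIPALS.items()
    )


# MGC special case: all three roles are applied at the same time.
MGC_FLAGS = LGSS_ROOT_CRED_ROOT | LGSS_ROOT_CRED_MDT | LGSS_ROOT_CRED_OST
```

For example, a request carrying only LGSS_ROOT_CRED_MDT accepts a "lustre_mds" principal but rejects "lustre_oss", while the MGC flags accept any of the four names.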
Lustre-change: https://review.whamcloud.com/55392
Lustre-commit: d83be78be789f1d0b04301cd088fb30deeed9b0a
Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4c46b03bb012c5f56bd26efdfaa6dab5fc7de31a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Thu, 13 Jun 2024 09:19:04 +0000 (11:19 +0200)]
LU-17940 gss: get rid of root key sooner
The root key associated with a GSS context (gck_key) is used to pass
information between kernel and userspace during GSS context
negotiation.
Once the GSS context for root is up-to-date, the key is never used
again, although it has a permanent validity. And when the context
expires, the key is directly revoked and replaced with a new one to
serve the negotiation of a new root context.
So to avoid issues with keys staying in the root's kernel keyring and
being accidentally revoked, just get rid of the key associated with a
root context as soon as the negotiation process has finished.
Lustre-change: https://review.whamcloud.com/55406
Lustre-commit: bffafaa5273109cea0e3b2a15d7a0b7ae965daa8
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-selinux-ssk-part-1
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4be773723b9046ed451684bd141d5ef2bc584bfb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Wed, 28 Jul 2021 18:02:19 +0000 (14:02 -0400)]
LU-14711 osc: Do not attempt sending empty pages
Do not crash when trying to send a lock-prolonging empty read
to an old server, if the server does not support short reads.
Otherwise the client crashes when accessing the NULL page.
Lustre-change: https://review.whamcloud.com/44654
Lustre-commit: 1a409a3e6a74685970ee779ebe32917bf51eaf3a
Test-Parameters: trivial
Fixes: 564070343ac4 ("LU-14711 osc: Notify server if cache discard takes a long time")
Change-Id: Icae7bf3ef16c45d33894b3c5fbac15b1a98c39d9
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55593
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Wed, 5 Jun 2024 15:33:09 +0000 (23:33 +0800)]
EX-9802 test: refine sanity-compr.sh/test_0a
Only change and verify compression types negotiation between client
and OST0000.
Test-Parameters: trivial testlist=sanity-compr env=ONLY=0a
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I307f3826973f48791b7fbf4cdfa121401b059b0e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55356
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 19 Jun 2024 05:42:07 +0000 (23:42 -0600)]
RM-620 build: New tag 2.14.0-ddn154
New tag 2.14.0-ddn154
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6025a7760be70b58c1e529b3e0804b86d411155a
Andreas Dilger [Wed, 19 Jun 2024 05:41:36 +0000 (23:41 -0600)]
RM-620 build: New tag lipe-2.54
New tag lipe-2.54
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I802f9e061a4c4fe9a484eae1679ff7d20229e3a5
Alex Zhuravlev [Thu, 30 May 2024 06:12:17 +0000 (09:12 +0300)]
EX-9858 osd: add @read param to dt_write_prep()
The purpose is to skip full-page reading when the OSD is asked for
pages to be written later. In CSDC we need all the data, as the
pages can be compressed and we may need to decompress them
first.
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I2ea9888a1dab419275dbee97e9acc661101b5262
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55250
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Aurelien Degremont [Wed, 6 Mar 2024 14:04:41 +0000 (15:04 +0100)]
LU-17566 mdt: improve new_init_ucred() for refactoring
In order to merge new_init_ucred() and old_init_ucred()
code eventually, move new_init_ucred() code around
for it to look even closer to old_init_ucred().
- Fill generic ucred fields at the beginning (similar to
what old_init_ucred() is doing).
- Move code for the bottom part to be closer to
old_init_ucred_common().
This code path is not used on most of lustre deployments,
so I'm enabling kerberos testing to ensure some tests
will go through this code path.
Lustre-change: https://review.whamcloud.com/55025
Lustre-commit: 2752ac20422b681649e7a1c9ab0b6cf0f93d9e27
Test-Parameters: kerberos=true testlist=sanity-krb5
Change-Id: I113fca6a104c1db66d9e0defd6fd91e378d7208c
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sat, 18 May 2024 19:43:05 +0000 (22:43 +0300)]
EX-9183 llog: debug for llog cancel problems
- add debug messages for update_log_dir remote access
errors
- remove former extended debug for -ENOTDIR error
Fixes: 00548f792a ("EX-3860 llog: extended debug for -ENOTDIR error")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifb87489cc76f10814eb12eba0a4ae997293fa11d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55389
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 18 Jun 2024 20:09:42 +0000 (13:09 -0700)]
LU-17641 kernel: update RHEL 9.3 [5.14.0-362.24.1.el9_3]
Update RHEL 9.3 kernel to 5.14.0-362.24.1.el9_3 for Lustre client.
Lustre-change: https://review.whamcloud.com/54820
Lustre-commit: 5ba9f847baa63f3a9d8108cded1c755c1d5fd47a
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el9.3 testlist=sanity
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.3 testgroup=full-part-3
Change-Id: Ifafb3fbbfdfcd82506daed44d3601a0d4357331e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55138
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Artem Blagodarenko [Wed, 12 Jun 2024 13:54:18 +0000 (14:54 +0100)]
EX-9889 csdc: Fix the next available algorithm selection
Currently, if the chosen algorithm is not available, the
next preferred one is chosen, but the error code is not cleared,
so __alloc_compr() returns this stale code. Data is then
written uncompressed even though it could be compressed with
the next appropriate algorithm.
This patch clears the error code. A test is
provided.
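The stale error code can be reproduced with a small fallback loop. This is a sketch under stated assumptions, not the __alloc_compr() code: `pick_algorithm`, the algorithm names and the EOPNOTSUPP error are all hypothetical, and the `clear_error` switch exists only to show both behaviours side by side.

```python
import errno


def pick_algorithm(preferred, available, clear_error):
    """Walk the preference list and return (rc, chosen_algorithm)."""
    rc = 0
    chosen = None
    for algo in preferred:
        if algo not in available:
            rc = -errno.EOPNOTSUPP   # remember why this candidate was skipped
            continue
        chosen = algo
        if clear_error:
            rc = 0                   # the fix: a successful fallback resets rc
        break
    return rc, chosen


preferred = ["lz4", "gzip"]   # hypothetical preference order
available = {"gzip"}          # the first choice is missing

buggy = pick_algorithm(preferred, available, clear_error=False)
fixed = pick_algorithm(preferred, available, clear_error=True)
```

In the buggy variant a usable algorithm is found but the caller still sees an error and would fall back to writing uncompressed data; in the fixed variant the same fallback returns success.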
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1015,ONLY_REPEAT=10
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Fixes: 67794c381440 ("EX-7601 obd: move type switching to alloc_compr callers")
Change-Id: I59f65058a0fe9b108de3d4ba7cf5950f18e32204
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Tue, 4 Jun 2024 12:25:35 +0000 (15:25 +0300)]
LU-10026 ldiskfs: mballoc to preserve preallocation's start
.. used in dense preallocation. Otherwise it's possible to lose
preallocated space when the corresponding bitmap is dropped
from the cache; ldiskfs will then print error messages
about a block counter mismatch.
Lustre-change: https://review.whamcloud.com/55467
Lustre-commit: TBD (from 1b2ee8b4cbf28dcd14f26bf1dc90caee9dc073c1)
Fixes: 686dee707f ("LU-10026 osd-ldiskfs: use preallocation for dense writes")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I93177510af959e849dba7a9c35d81bc27809a31b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55303
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Wed, 31 Jan 2024 05:16:12 +0000 (08:16 +0300)]
LU-17486 ldiskfs: fix race in ext4_destroy_inode
ext4_i_callback() can race with the access to i_reserved_data_blocks
in ext4_destroy_inode() when used with preemption-enabled kernel.
Lustre-change: https://review.whamcloud.com/53868
Lustre-commit: 4b51f1df05c4219cd8f910ac8ad58e8de946bb56
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I69c6bcfbb24e6c07d28ebcd2bdd9d9e6f06ec8d1
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Tue, 16 Apr 2024 07:54:22 +0000 (00:54 -0700)]
EX-8257 lipe: Lamigo: set nocompr flag when replicate
- use the nocompr flag in lfs mirror extend when the replicated
source file has this flag set
- collect nocompr and prefer flags in a single bit mask.
This bit mask is set in striping_is_in_sync_*(). Then this bit
mask is used to generate specific lfs command
- fixed EX-9565: turn on/off ALR accounting only for
lfs mirror/resync commands
- for cleaner code, replace amigo_resync_type with
lfs_cmd_type to cover a wider range of lfs commands:
extend/resync/setflag/check agent
- move check agent code into callback function
- add hot-pools tests to verify nocompr and prefer flags are set
when lamigo mirrors or makes initial rescan
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I8e0cec04f5263761ddd201c7c03ec390e050eef4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54813
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Mon, 17 Jun 2024 17:40:28 +0000 (10:40 -0700)]
LU-17402 kernel: RHEL 8.10 server support
This patch makes changes to support RHEL 8.10 release
with kernel 4.18.0-553.5.1.el8_10 for Lustre server.
Lustre-change: https://review.whamcloud.com/54800
Lustre-commit: c8ea06678d622adc67ba41864859228212e43776
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.9 serverdistro=el8.10 testlist=sanity
Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.10 serverdistro=el8.10 \
testgroup=full-part-3
Change-Id: Ib8e2c60d1defdefeedda10d8aeef5563c70a1100
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55326
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Bobi Jam [Mon, 17 Jun 2024 17:24:29 +0000 (10:24 -0700)]
LU-17941 ofd: do not copy over filter_fid structure
When a bigger filter_fid has been written on disk by a newer server,
a downgraded Lustre would read more data, but we need to store less to
fit the smaller filter_fid structure.
Lustre-change: https://review.whamcloud.com/55408
Lustre-commit: TBD (from 587b178e9330fcdf58bdd3c093e72594a08a611a)
Fixes: 28c366cee6d ("LU-17218 ofd: improve filter_fid upgrade compatibility")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idb5c8fffe4af22f35b64aa93e7efce7f9dd206d6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Artem Blagodarenko [Wed, 12 Jun 2024 13:54:18 +0000 (14:54 +0100)]
EX-9889 csdc: migrate to another compression level
Add a test that runs lfs migrate with a changed
compression level.
Test-Parameters: trivial env=ONLY=1030 testlist=sanity-compr
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ia67316a247fed99f6ad002c8b67053a3c2fa0cff
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55407
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 18 Jun 2024 15:26:10 +0000 (17:26 +0200)]
EX-9392 sec: use IDENTITY_UPCALL_INTERNAL
Use defined IDENTITY_UPCALL_INTERNAL instead of raw string "INTERNAL".
Test-Parameters: trivial
Fixes: b5e421625b ("EX-9392 sec: use dedicated INTERNAL upcall cache")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0bcae8de4ad585a644c9352e523d15b0eb387226
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55460
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 16 May 2024 09:58:24 +0000 (11:58 +0200)]
LU-17852 gss: do not use expired reverse gss contexts
On server side, a reverse context matches a gss context established
on client side. These reverse contexts have an expiration time, and are
replaced with fresh ones when they expire.
So get rid of expired reverse contexts when we find them in the
gsk_clist. And when we look up a context, do not continue using
the current one if it is expired.
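The lookup rule above (drop expired entries, never hand out an expired context) can be sketched in a few lines. This is an illustrative Python model only: `ReverseContext` and `lookup_context` are hypothetical names, a plain list stands in for gsk_clist, and times are plain numbers.

```python
from dataclasses import dataclass


@dataclass
class ReverseContext:
    handle: int
    expiry: float   # absolute expiration time, in seconds


def lookup_context(clist, now):
    """Purge expired entries from clist and return the first live one, or None."""
    live = [ctx for ctx in clist if ctx.expiry > now]
    clist[:] = live   # expired contexts are removed as soon as they are found
    return live[0] if live else None


contexts = [
    ReverseContext(handle=1, expiry=100.0),   # already expired at now=150
    ReverseContext(handle=2, expiry=200.0),   # still valid
]
found = lookup_context(contexts, now=150.0)
```

After the call only the live context remains in the list, and a lookup made after every context has expired returns nothing instead of a stale entry.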
Add sanity-krb5 test_200 to check the expired reverse contexts.
Lustre-change: https://review.whamcloud.com/55127
Lustre-commit: TBD (from 29a26d4e74ceda192e63d49f130ef233dc3b3b55)
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I11f2d8ab298073f9d5bedff187b67f2ca289ae47
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55230
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bruno Faccini [Thu, 30 May 2024 16:39:37 +0000 (18:39 +0200)]
LU-17887 obd: do not update obd_memory from RCU
OBD_FREE_PRE() should not be run from an RCU
callback as the obdclass module may have been
unloaded during the RCU grace period.
Lustre-change: https://review.whamcloud.com/55263
Lustre-commit: 8b6719f1b32e134babe9f47e2dc79c66a3735b49
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I6f663b2aed2e60c15f2a1b9755b2c4050bd91ce2
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Mon, 17 Jun 2024 10:56:56 +0000 (18:56 +0800)]
LU-17464 lod: use OBD_ALLOC_LARGE for ldo_comp_entries
The lod_object::ldo_comp_entries is allocated/freed with the _LARGE macros
so that it can be large enough to use vmalloc instead of kmalloc
for memory allocation. Some places use OBD_ALLOC without
_LARGE to re-allocate the memory, which violates this assumption.
Lustre-change: https://review.whamcloud.com//55449
Lustre-commit: TBD (from 1c1ae0a5b642b2dd749c466dec5f91930db24601)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie356ae875329af07c893586fa4b1485dbd17afe6
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bruno Faccini [Mon, 3 Jun 2024 14:47:51 +0000 (16:47 +0200)]
LU-17900 llite: handle AT_GETATTR_NOSEC flag if present
Starting with v6.7-rc1-1-g8a924db2d7b5, a new AT_GETATTR_NOSEC
flag can additionally be passed by vfs_getattr_nosec() to the
underlying FS getattr() interface routine.
So it must be handled/masked in ll_vfs_getattr() to avoid
passing it back to vfs_getattr(), as is already done by
ecryptfs/overlayfs, so that no warning/stack is displayed anymore.
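Masking the flag is a single bitwise operation. The sketch below shows the idea in Python rather than kernel C; the helper name is hypothetical, and the numeric value used for AT_GETATTR_NOSEC is an assumption that should be checked against the fcntl.h of the target kernel.

```python
AT_SYMLINK_NOFOLLOW = 0x100      # value from the Linux UAPI headers
AT_GETATTR_NOSEC = 0x80000000    # assumed value, verify against the kernel headers


def flags_for_vfs_getattr(query_flags):
    """Strip the internal-only flag before calling back into vfs_getattr()."""
    return query_flags & ~AT_GETATTR_NOSEC


incoming = AT_SYMLINK_NOFOLLOW | AT_GETATTR_NOSEC
forwarded = flags_for_vfs_getattr(incoming)
```

Only the internal flag is cleared; every other query flag is forwarded unchanged.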
Lustre-change: https://review.whamcloud.com/55296
Lustre-commit: b3c5473ce74a6600aaa6938de81d91f099b1f1bf
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Change-Id: I1d041913a6fc3ab9158fd611cb7d14dd1b7f694b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Wed, 5 Jun 2024 16:43:51 +0000 (19:43 +0300)]
EX-9895 osc: preserve compressed pages for OST_WRITE replay
It's incorrect to release compressed pages right after the reply,
as we may resend them during OST_WRITE replay.
Test-Parameters: env=ONLY=1081 testlist=sanity-compr
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3edc16d6556ddd60735d2f14fe879fc0f45231d7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55323
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 12 Jun 2024 09:08:34 +0000 (03:08 -0600)]
RM-620 build: New tag 2.14.0-ddn153
New tag 2.14.0-ddn153
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iba22b568bc85df707e29a80ca71b713a67c91b59
Andreas Dilger [Wed, 12 Jun 2024 09:08:15 +0000 (03:08 -0600)]
RM-620 build: New tag lipe-2.53
New tag lipe-2.53
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7761005d9f91455007931047b370b186b9dba0e4
Vitaliy Kuznetsov [Mon, 10 Jun 2024 10:21:21 +0000 (12:21 +0200)]
EX-7729 ofd: Add counters for compressed data
This patch is the second of two patches that add counters
to track server-side data compression statistics.
From added counters:
1. Size of compressed/uncompressed chunks read by
client to compressed files, in chunk/bytes;
Example:
obdfilter.lustre-OST0000.stats_compr=
snapshot_time 1715876501.602669648 secs.nsecs
start_time 1715873670.819396144 secs.nsecs
elapsed_time 2830.783273504 secs.nsecs
compressed_objects 2 samples [reqs]
uncompressed_objects 1 samples [reqs]
compr_bytes_read 885 samples [bytes] 557 126974 52174403
compr_bytes_read_raw 893 samples [bytes] 16384 131104 115885042
incompr_bytes_read 8 samples [bytes] 16384 16384 131072
compr_chunks_read 885 samples [reqs]
incompr_chunks_read 8 samples [reqs]
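A minimal Python sketch of how such a stats dump could be post-processed. It assumes the usual Lustre stats layout of "name count samples [unit] min max sum"; the helper names parse_stats() and mean_size() are hypothetical and not part of this patch.

```python
import re

# A few lines copied from the example output above.
SAMPLE = """\
compressed_objects 2 samples [reqs]
compr_bytes_read 885 samples [bytes] 557 126974 52174403
compr_bytes_read_raw 893 samples [bytes] 16384 131104 115885042
incompr_bytes_read 8 samples [bytes] 16384 16384 131072
compr_chunks_read 885 samples [reqs]
"""

# Assumed layout: name, sample count, "samples", [unit], optional min/max/sum.
LINE_RE = re.compile(
    r"^(?P<name>\w+)\s+(?P<count>\d+)\s+samples\s+\[(?P<unit>\w+)\]"
    r"(?:\s+(?P<min>\d+)\s+(?P<max>\d+)\s+(?P<sum>\d+))?\s*$"
)


def parse_stats(text):
    """Return {counter_name: {...}} for every line matching the layout."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # timestamps and the parameter name line are skipped
        entry = {"count": int(m["count"]), "unit": m["unit"]}
        if m["sum"] is not None:
            entry.update(min=int(m["min"]), max=int(m["max"]), sum=int(m["sum"]))
        stats[m["name"]] = entry
    return stats


def mean_size(entry):
    """Average value per sample, or None for counters without a sum."""
    if "sum" not in entry or entry["count"] == 0:
        return None
    return entry["sum"] / entry["count"]
```

For example, mean_size(parse_stats(SAMPLE)["compr_bytes_read"]) gives the average number of bytes per compressed read in the sample above.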
Test-Parameters: testlist=sanity-compr env=ONLY=1001,ONLY_REPEAT=300
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I0998b1facaffd3e84b8d9429c5674d8d9ad60020
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Thu, 30 May 2024 08:36:43 +0000 (10:36 +0200)]
EX-9328 lipe: Add new option for update LSOM attrs
This patch addresses an issue where a significant portion of
MDT index descriptors lack size information, causing most
files to be excluded from size reports.
This mechanism is enabled by default, but it can be
disabled via a new option.
This patch adds a new option for lipe_find3 and lipe_scan3 that
disables the update of lsom attributes when using
the '--collect-fsize-stats' option. A brief description of the
option has also been added to the help documentation, which
can be accessed with the '--help' option for each utility.
Example usage:
sudo lipe_scan3 /dev/sda8 --no-lsom-update \
--collect-fsize-stats=report.all
sudo lipe_find3 --no-lsom-update /dev/sda8 \
-collect-fsize-stats report.all
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I362166f6a4b8e8abdfc3bc9709cf7406e3d3cd8d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55140
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 11 Jun 2024 11:40:38 +0000 (13:40 +0200)]
EX-9609 lipe: Add sorting to reports by directory sizes
This patch does not change the display format of tables
in directory reports. It adds sorting to the directory
size report and to the TOP-X rating table.
Sorting is in descending order of size and is
based on the allocated size.
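The ordering described above can be sketched in Python as follows. The record fields ("path", "size", "allocated") and the function name are hypothetical; the real report structures in lipe may differ.

```python
from operator import itemgetter


def sort_dirs_by_allocated(dirs, top=None):
    """Sort directory records by allocated size, largest first.

    `top` optionally truncates the result, as a TOP-X table would.
    """
    ranked = sorted(dirs, key=itemgetter("allocated"), reverse=True)
    return ranked if top is None else ranked[:top]


# Hypothetical directory records for illustration only.
dirs = [
    {"path": "/proj/a", "size": 100, "allocated": 4096},
    {"path": "/proj/b", "size": 900000, "allocated": 1048576},
    {"path": "/proj/c", "size": 5000, "allocated": 8192},
]
top2 = sort_dirs_by_allocated(dirs, top=2)
```

Sorting on the allocated size rather than the apparent size keeps sparse directories from ranking above ones that actually consume more space.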
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I5d641660d7bf33cedb27612c04d1a7de43d78c75
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55395
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 11 Jun 2024 11:23:31 +0000 (13:23 +0200)]
EX-9121 lipe: Trivial improvements for dirs report
Minor changes that do not affect functionality.
Adds functionality to fill the table by directory size.
Changes the output for the rating table slightly, adding
nesting visualization, but without sorting.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I74c8978a2e8e4806f7f600db0a573ee162d5cd94
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 11 Jun 2024 09:31:15 +0000 (11:31 +0200)]
EX-9121 lipe: Trivial improvements for report merging
Minor changes that do not affect functionality.
Fixed memory leaks when merging reports, and excessive
memory allocation when generating JSON type reports when
there are a large number of ranges.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I75c580f53246301d6262c6f5c5db271d36b2ec75
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55393
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Chris Hunter [Thu, 6 Jun 2024 05:44:12 +0000 (01:44 -0400)]
LU-17899 gss: improved systemd unit file for SSK daemon
Add operation ordering to lsvcgss initscript/service unit
so it starts after systemd network services are running.
Lustre-change: https://review.whamcloud.com/55379
Lustre-commit: TBD (from
cc08ebd0fb8f370451408c57b86001323b4da4dc)
Signed-off-by: Chris Hunter <chunter@ddn.com>
Change-Id: Iad39d01aae16732ff646383814033d6efb34af5e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Jian Yu [Fri, 7 Jun 2024 17:44:11 +0000 (10:44 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.20.1.el9_4]
Update RHEL 9.4 kernel to 5.14.0-427.20.1.el9_4 for Lustre client.
Lustre-change: https://review.whamcloud.com/54712
Lustre-commit: TBD (from
527a21ce444b46034e45de185a3bd39727353abb)
Test-Parameters: trivial \
mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3
Change-Id: Ieee88a5a9f8e58f8445e126d21e45228e7b5ca64
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55367
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Frederick Dilger [Sat, 25 May 2024 23:23:20 +0000 (19:23 -0400)]
LU-17343 utils: added --path option for lctl list_param
Added 'lctl list_param [-p] PARAM' option that prints the
actual pathname(s) for PARAM instead of the parameter name(s).
This should allow users to "resolve" PARAM pathnames so that they
can be used directly, which avoids having to hard code them. Also
renamed "po_only_path" and "po_show_path" to be "po_only_name" and
"po_show_name" to avoid confusion with "po_only_pathname" for the new
option.
Lustre-change: https://review.whamcloud.com/55202
Lustre-commit:
e1a9d08351721d280faed51a2061e3e16f25a6b2
Signed-off-by: Frederick Dilger <fdilger@whamcloud.com>
Change-Id: I2259b930f3ac5cc46ac7a9a36218a44fa110157c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55331
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 28 Mar 2024 03:18:56 +0000 (21:18 -0600)]
LU-16500 utils: 'lfs migrate' should select new OSTs
When migrating a file using "lfs migrate FILE" without any arguments
to specify a new layout, this should migrate the file to the best
OSTs available at that time based on free space, instead of keeping
the file on the same OSTs (which is almost pointless otherwise).
Reset the starting OST index for all components of the copied file
layout so that this can happen properly. Previously, only the last
component had the OST index reset, which was only partly helpful.
Add llapi_layout_ost_index_reset() to handle this, since it seems
likely that tools using llapi_layout_from_fd() and friends to copy
an existing layout will want to do the same. Add the corresponding
man page and reference it from llapi_layout_get_from_fd().
Update sanity test_56xe to check that the starting OST index of each
component is not the same for all components. This check might not
catch a broken "lfs migrate" every time since even before this patch
the last component would be allocated on a random OST, but will still
fail about once every 1/$OST_COUNT runs. Conversely, with this patch
it passes hundreds of iterations without a false positive, though a
small chance exists that it will have a false positive on occasion.
Add a "make utils" target to simplify building only user utilities.
Lustre-change: https://review.whamcloud.com/54600
Lustre-commit:
2007ab4709acaef0397df15c9f4cf4387844ba9c
Test-Parameters: testlist=sanity env=ONLY=56xe,ONLY_REPEAT=100
Fixes:
0568f4ca25 ("LU-16500 utils: set default ost index for lfs migrate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie4c68d4b2ff09560a7a13ae464723745cf968d36
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Etienne AUJAMES [Tue, 12 Sep 2023 16:06:25 +0000 (18:06 +0200)]
LU-17110 llite: fix slab corruption with fm_extent_count=0
If userspace uses fiemap with .fm_extent_count=0, .fm_extents[0] is
not allocated. Writing on the first entry without checking the extent
count could lead to memory corruption (slab).
This patch also fixes the case when the OSC is disabled: FIEMAP_EXTENT_LAST
should be set on the extent (fe_flags) and not on the fiemap struct.
Add a regression test sanityn 71d to test fiemap with
fm_extent_count=0.
Add a regression test sanity-hsm 408 to test fiemap on released files.
Lustre-change: https://review.whamcloud.com/52352
Lustre-commit:
a81dc7d0e158894e905ab3d309f7b92864a94378
Fixes: 4097196 ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Test-Parameters: testlist=sanityn env=ONLY=71d,ONLY_REPEAT=20
Test-Parameters: testlist=sanity-hsm env=ONLY=408,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: Id63c6973540187e678020977f2d555dfcbf3c634
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55363
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andrew Perepechko [Mon, 16 Jan 2023 13:13:34 +0000 (08:13 -0500)]
LU-16480 lov: fiemap improperly handles fm_extent_count=0
FIEMAP calls with fm_extent_count=0 are supposed only to
return the number of extents.
lov_object_fiemap() attempts to initialize stripe_last
based on fiemap->fm_extents[0] which is not initialized
in userspace and not even allocated in kernelspace.
Eventually, the call exits with -EINVAL and "FIEMAP does
not init start entry" kernel log message.
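For reference, a minimal userspace sketch of a count-only FIEMAP call, written in Python for brevity. It assumes a Linux system and the standard 32-byte struct fiemap header; the helper names are hypothetical. With fm_extent_count=0 no extent array follows the header, and the kernel is expected to report only the number of extents in fm_mapped_extents.

```python
import fcntl
import struct

# _IOWR('f', 11, struct fiemap) on Linux.
FS_IOC_FIEMAP = 0xC020660B
# fm_start, fm_length, fm_flags, fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HDR = struct.Struct("=QQIIII")
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF


def build_count_request():
    """Header-only request covering the whole file, with fm_extent_count=0."""
    return bytearray(FIEMAP_HDR.pack(0, FIEMAP_MAX_OFFSET, 0, 0, 0, 0))


def mapped_extents(buf):
    """Read fm_mapped_extents back from a (possibly kernel-updated) header."""
    return FIEMAP_HDR.unpack_from(buf)[3]


def count_extents(path):
    """Ask the filesystem how many extents `path` has, without fetching them."""
    buf = build_count_request()
    with open(path, "rb") as f:
        # mutate_flag=True lets the kernel write the result back into buf.
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf, True)
    return mapped_extents(buf)
```

count_extents() raises OSError on filesystems that do not implement FIEMAP, so callers should be prepared to handle that.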
Lustre-change: https://review.whamcloud.com/49645
Lustre-commit:
829af7b029d8e4e391b93792bf5214611b0193bd
Fixes:
409719608c ("LU-11848 lov: FIEMAP support for PFL and FLR file")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: I65e706b5dd5c8a6db90a539c2602af839b4da823
HPE-bug-id: LUS-11443
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Mon, 3 Jun 2024 11:52:20 +0000 (13:52 +0200)]
LU-17899 gss: lsvcgss service fix
The lsvcgss service can fail to start if the daemon is invoked with
the '-k' option whereas no proper Kerberos configuration is in place
on the server. The daemon should ignore the '-k' option in such a case
and try to start the other provided modes if any (SSK, Null).
And in case the daemon is started with the '-s' option (SSK), it
spawns a temporary additional thread to compute the number of rounds
used for Miller-Rabin prime testing. So the lsvcgss_sysd script should
support that.
Lustre-change: https://review.whamcloud.com/55293
Lustre-commit:
f28a7a33a8254fc25c8cb348f87a0c133286393f
Fixes:
ac1ea2ef12 ("LU-17741 gss: fix lsvcgss service for systemd")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iba632bd0ea9696ccea52bff5982a4d4e490597a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55294
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Fri, 7 Jun 2024 09:04:27 +0000 (02:04 -0700)]
LU-17402 kernel: update RHEL 8.10 [4.18.0-553.5.1.el8_10]
Update RHEL 8.10 kernel to 4.18.0-553.5.1.el8_10.
Lustre-change: https://review.whamcloud.com/55350
Lustre-commit: TBD (from
66e63642f81f4d3059fa1969b9e510d172c374d0)
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-3
Change-Id: Iad6dc4f6294beeed1db44d8484b325a771bc1ad4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55353
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 9 Apr 2024 13:00:41 +0000 (15:00 +0200)]
LU-17718 obdclass: potential string overflow upcall_cache.c
Use strncpy() in upcall_cache_set_upcall() to quiet Coverity warning.
And reorganize the function so that the code flow is more linear in
the success case.
CoverityID: 424705: ("String overflow")
Lustre-change: https://review.whamcloud.com/54710
Lustre-commit:
7869bb320e735547410a7d3e31061b9044389c53
Fixes:
a462a119ec ("LU-17497 obdclass: check upcall incorrect values")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1aee77f78c92c6c571dfe358435a2733cc3ba9d9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Sun, 19 May 2024 00:38:33 +0000 (20:38 -0400)]
EX-9875 test: limit dir restripe overstripe count
Without the LU-15527 code, distributed transactions are slow. To
avoid test timeouts, limit the overstripe count and increase the timeout for
sanity test_300ud and test_300ue.
Test-Parameters: trivial
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Test-Parameters: mdtcount=4 testlist=sanity env=ONLY="300ud 300ue"
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I830ac27e446f3841147be4777ba06cdb8e1a7f59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55347
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 10 Jun 2024 03:11:30 +0000 (21:11 -0600)]
EX-9125 tests: exclude sanity-compr/1008 on Ubuntu
This subtest is failing consistently on Ubuntu. Disable until it
can be fixed.
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1008,HONOR_EXCEPT=y clientdistro=ubuntu2204
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie71462e7f033be914523ca96b22478a53b81b882
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55374
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 7 Jun 2024 07:34:52 +0000 (01:34 -0600)]
RM-620 build: New tag 2.14.0-ddn152
New tag 2.14.0-ddn152
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If85c2c311326aae73f27e15c7fa1358c393a509c
Jian Yu [Thu, 6 Jun 2024 07:34:00 +0000 (00:34 -0700)]
EX-9125 tests: change source dir for sanity-compr/1008
When the source directory contains almost exclusively small files,
sanity-compr test_1008 hits the
"failed estimates > 50% of total estimates" failure on SLES15.
The patch fixes the issue by changing the source dir to
contain large files.
Test-Parameters: trivial clientdistro=sles15sp5 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Test-Parameters: trivial clientdistro=el9.3 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Test-Parameters: trivial clientdistro=el8.8 \
testlist=sanity-compr env=ONLY="1008",ONLY_REPEAT=3
Change-Id: Iad661750cba7c9d2204f2306e73169deb012ddf4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55334
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mikhail Pershin [Thu, 6 Jun 2024 11:12:00 +0000 (14:12 +0300)]
LU-15644 llog: don't report warning in no error case
Fix the wrong check, which wrongly includes the valid rc == 0 case.
Fixes:
53d946a1222 (LU-15644 llog: don't replace llog error with -ENOTDIR)
Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id6e7b2cd42b4769765c67d418552a13f048ea050
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Wed, 29 May 2024 12:26:29 +0000 (14:26 +0200)]
EX-9121 lipe: Add statistics merging for directories
This patch adds the ability to merge statistics for directories.
This is the first of two patches and contains the basic collection of
information from json as well as basic output.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I253f7606b66921eddf52709931cc1c880e66a997
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55233
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 24 Nov 2023 09:26:10 +0000 (17:26 +0800)]
EX-8355 csdc: stop compressing incompressible file data
The reduced_ratio (original_size/compress_reduced_size) represents
the minimum fraction of pages that are compressed out of each chunk,
namely the compressed chunk needs to shrink by at least
1/reduced_ratio blocks for it to be "compressible".
Let the size compression_ratio be defined as
original_size/after_compression_size, so
reduced_ratio = compression_ratio / (compression_ratio - 1)
and we set its default value to 16, equivalent to 1.07 of compression
ratio (i.e. needs to shrink at least one 4KB block out of each 64KB
chunk).
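The relation above can be checked with a short worked example in Python. Since reduced_ratio = original_size / (original_size - after_compression_size), solving for the compression ratio gives compression_ratio = reduced_ratio / (reduced_ratio - 1). The function names below are illustrative only.

```python
def compression_ratio_from_reduced(reduced_ratio):
    """Minimum compression ratio implied by a given reduced_ratio."""
    if reduced_ratio <= 1:
        raise ValueError("reduced_ratio must be greater than 1")
    return reduced_ratio / (reduced_ratio - 1)


def reduced_ratio_from_compression(compression_ratio):
    """Inverse mapping: reduced_ratio for a given compression ratio."""
    if compression_ratio <= 1:
        raise ValueError("compression_ratio must be greater than 1")
    return compression_ratio / (compression_ratio - 1)


def min_saved_bytes(chunk_size, reduced_ratio):
    """Bytes a chunk must shrink by to count as compressible."""
    return chunk_size // reduced_ratio


default_ratio = compression_ratio_from_reduced(16)  # 16/15
saved = min_saved_bytes(64 * 1024, 16)              # one 4KB block of a 64KB chunk
```

With the default of 16, a 64KB chunk has to shrink by at least 4KB, which matches the 1.07 figure quoted above after rounding.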
After every compress_check_bytes of data compressed, the file's
compressibility is recalculated based on the average
compress_reduced and average compress_orig data sizes.
Stop compressing file data if it is deemed to be incompressible, and
after compress_skip_bytes of data have been written uncompressed, retry
the file compressibility check.
compress_reduced_ratio, compress_check_bytes, compress_skip_bytes
are tunable parameters:
osc.*.compress_reduced_ratio
osc.*.compress_check_bytes
osc.*.compress_skip_bytes
their default values are 16, 1M and 32M respectively.
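The check/skip cycle described above can be sketched as a small state machine. This is an illustration in Python under stated assumptions, not the csdc implementation: the class and method names are hypothetical, "1M" and "32M" are assumed to mean MiB, and the decision is made on window totals rather than on the exact averages the kernel code keeps.

```python
class CompressibilityTracker:
    """Toy model of the compress_check_bytes / compress_skip_bytes cycle."""

    def __init__(self, reduced_ratio=16, check_bytes=1 << 20, skip_bytes=32 << 20):
        self.reduced_ratio = reduced_ratio
        self.check_bytes = check_bytes
        self.skip_bytes = skip_bytes
        self.compressing = True
        self.orig = 0      # bytes handed to the compressor in this window
        self.reduced = 0   # bytes saved by compression in this window
        self.skipped = 0   # bytes written uncompressed while skipping

    def should_compress(self):
        return self.compressing

    def record_compressed(self, orig_bytes, compressed_bytes):
        self.orig += orig_bytes
        self.reduced += max(orig_bytes - compressed_bytes, 0)
        if self.orig >= self.check_bytes:
            # Compressible iff orig / reduced <= reduced_ratio.
            if self.reduced * self.reduced_ratio < self.orig:
                self.compressing = False
                self.skipped = 0
            self.orig = self.reduced = 0

    def record_uncompressed(self, nbytes):
        self.skipped += nbytes
        if self.skipped >= self.skip_bytes:
            # Enough data was written uncompressed: retry the check.
            self.compressing = True
            self.skipped = 0
```

A caller would consult should_compress() before each chunk and report the outcome through record_compressed() or record_uncompressed().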
Test-Parameters: testlist=sanity-compr
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4ce3d752c67f18ba7b100c72a2bb61a91258c6e8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andriy Skulysh [Wed, 3 Apr 2024 10:34:32 +0000 (13:34 +0300)]
LU-17871 ldlm: FLOCK ownlocks may be not set
Conflict checking loop should continue until ownlocks is set.
Ownlocks variable is essential for lock merges.
Lustre-change: https://review.whamcloud.com/55184
Lustre-commit:
ede8d928d6c47551371512c80dfa4f159260e7e2
Fixes:
b07a57027e (LU-15402 ldlm: speedup RD flock enqueue)
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Signed-off-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: Ied526581dd7d4f100c95f2fe582d117a87a8a584
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 6 Jun 2024 08:37:21 +0000 (02:37 -0600)]
RM-620 build: New tag 2.14.0-ddn151
New tag 2.14.0-ddn151
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I254278cbf7ac546de7ce6005d5e9a35cb0952556
Andreas Dilger [Thu, 6 Jun 2024 08:37:00 +0000 (02:37 -0600)]
RM-620 build: New tag lipe-2.52
New tag lipe-2.52
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5939742a929cc91e92e1608592d1d2801d1fef4a
Hongchao Zhang [Fri, 2 Feb 2024 05:58:59 +0000 (13:58 +0800)]
LU-14535 quota: get all quota info in LFS
This patch adds the "-a" option to LFS to get the quota info of
all quota IDs. It iterates over the quota settings saved in the global quota
setting files "quota_master/md-0x0" and "quota_master/dt-0x0"
from the QMT, and over the quota usage info saved in the acct quota
files in the backend FS (LDiskFS or ZFS) from the QSDs, then merges
the two kinds of quota info on the client and prints it in a similar
way to "lfs quota -u|-g|-p".
$lfs quota -a -u /mnt/lustre
Filesystem /mnt/lustre, Disk usr quotas
quota_id kbytes quota limit grace files quota limit grace
root 9684 0 0 - 1019 0 0 -
bin 4 0 102400 - 1 0 10240 -
daemon 4 0 102400 - 1 0 10240 -
adm 4 0 102400 - 1 0 10240 -
lp 4 0 102400 - 1 0 10240 -
sync 4 0 102400 - 1 0 10240 -
shutdown 4 0 102400 - 1 0 10240 -
halt 4 0 102400 - 1 0 10240 -
mail 4 0 102400 - 1 0 10240 -
$lfs quota -a -g /mnt/lustre
Filesystem /mnt/lustre, Disk grp quotas
quota_id kbytes quota limit grace files quota limit grace
root 9684 0 0 - 1019 0 0 -
bin 4 0 204800 - 1 0 20480 -
daemon 4 0 204800 - 1 0 20480 -
adm 4 0 204800 - 1 0 20480 -
lp 4 0 204800 - 1 0 20480 -
sync 4 0 204800 - 1 0 20480 -
shutdown 4 0 204800 - 1 0 20480 -
halt 4 0 204800 - 1 0 20480 -
mail 4 0 204800 - 1 0 20480 -
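The client-side merge described above can be pictured with the following Python sketch. The data shapes are assumptions made for illustration (limits keyed by quota ID from the QMT, one usage map per QSD target); the actual lfs code works on different structures.

```python
from collections import defaultdict


def merge_quota(limits, usage_per_target):
    """Combine per-ID limits with usage summed across all targets.

    limits:           {quota_id: (soft_kb, hard_kb)}, as iterated from the QMT
    usage_per_target: [{quota_id: used_kb}, ...], one map per QSD
    """
    totals = defaultdict(int)
    for usage in usage_per_target:
        for qid, kbytes in usage.items():
            totals[qid] += kbytes

    merged = {}
    # An ID may appear on only one side, so take the union of both.
    for qid in sorted(set(limits) | set(totals)):
        soft, hard = limits.get(qid, (0, 0))
        merged[qid] = {"kbytes": totals.get(qid, 0), "quota": soft, "limit": hard}
    return merged


# Hypothetical inputs: ID 0 has usage on two targets and no limits.
limits = {1: (0, 102400)}
usage = [{0: 9000, 1: 4}, {0: 684}]
report = merge_quota(limits, usage)
```

IDs that have usage but no explicit setting show zero limits, as the root rows do in the listings above.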
Lustre-change: https://review.whamcloud.com/42098
Lustre-commit:
3edc71803af3b4dc672313cd1ba395de724fbc59
Test-Parameters: testlist=sanity-quota env=SLOW=yes,ONLY=49,NUM_QIDS=20000
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I08feb928fbf34635ec9c5c341de993c718798dc9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/46328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 30 May 2024 01:20:36 +0000 (18:20 -0700)]
LU-17750 kernel: update SLES15 SP4 [5.14.21-150400.24.100.2]
Update SLES15 SP4 kernel to 5.14.21-150400.24.100.2 for Lustre client.
Lustre-change: https://review.whamcloud.com/54823
Lustre-commit: TBD (from
0406b98b5178074c86710262f33d9315d6306116)
Test-Parameters: trivial
Change-Id: I401e97f602e6c8c62fac73e3603eb0226745bba1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Artem Blagodarenko [Mon, 3 Jun 2024 10:57:33 +0000 (06:57 -0400)]
EX-9878 csdc: is_chunk_start should return header copy
In is_chunk_start()
*ret_header = header;
...
kunmap_atomic(header);
ret_header is used after is_chunk_start() returns, when the page is
already unmapped. is_chunk_start() should return a copy of the header
so that it can be used safely.
Test-Parameters: testlist=sanity-compr
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ib5e828d6b61e90dcd70c28589931a4490cf19c22
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55292
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Thu, 30 May 2024 12:58:57 +0000 (20:58 +0800)]
EX-9823 osc: clear oi_write_osclock in lock fini func
Move the clearing of osc_io::oi_write_osclock into osc_lock_fini(), as
it is set in osc_lock_init().
Compression IO could possibly expand the lock region, and
osc_lock_set_writer() could access an osc_io that is not accessed
in osc_io_iter_init(), so that osc_io_rw_iter_fini() misses clearing
the osc_io's oi_write_osclock.
This patch moves the clearing of oi_write_osclock into the lock fini function
to match its creation in osc_lock_init().
Test-Parameters: testlist=sanity-compr env=COMPR_EXTRA_LAYOUT="-E 1M -c 1 -E eof -c 4 -Z lz4:3"
Test-Parameters: testlist=sanity-compr env=COMPR_EXTRA_LAYOUT="-E 1M -c 1 -E eof -Z lz4:3"
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ied42f5befc1abd76aa10a7666eadb9a58e1f1783
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55261
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexandre Ioffe [Tue, 4 Jun 2024 23:25:19 +0000 (16:25 -0700)]
EX-9867 test: unlimit expected number of keepalive msgs
Sometimes test_165g may be internally delayed and
the number of keepalive messages from ofd_access_log_reader
may be unexpectedly large.
To fix this, remove the verified upper bound on the keepalive
message counter and make test_165g expect an unlimited number
of such messages.
Test-Parameters: trivial testlist=sanity env=ONLY="165g",ONLY_REPEAT=20
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I8afcfd3c3e52fda229ef81491259bdc600947bd3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Wed, 5 Jun 2024 13:50:41 +0000 (15:50 +0200)]
LU-17000 gss: update init_channel initialization
Only root needs write access to 'sptlrpc.gss.init_channel', so adjust
permissions accordingly when sysfs file is created.
Lustre-change: https://review.whamcloud.com/55322
Lustre-commit: TBD (from
44c147a3bdf8d44ef3e36c86018bacacec542341)
DDN-Bug-Id: EX-9705
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6539ade1a9d815664f6659a5c1ee25e7f1f7df0e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55320
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Alex Zhuravlev [Tue, 4 Jun 2024 17:59:09 +0000 (20:59 +0300)]
EX-9873 obdclass: reset bits after decompression
Uncompressed data can be smaller than a chunk/page, but still be
visible to userspace as part of a sparse file.
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4114b0704fb685013f4e03cf2d80ccde2cc8c87f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55308
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Sat, 1 Jun 2024 18:29:57 +0000 (11:29 -0700)]
LU-17773 lov: avoid partly outside array bounds build error
Avoid the "array subscript ‘struct lov_stripe_md_entry[0]’ is partly
outside array bounds of ‘struct lov_stripe_md_entry[0]’" build error.
Otherwise an lsme holder will be allocated for invalid lmm magic.
Lustre-change: https://review.whamcloud.com/54944
Lustre-commit: TBD (from
2859950cc91df34ddaf0a45f5f37fa13faf99a5d)
Fixes:
902fe290 ("LU-17261 lov: ignore broken components")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5a403a0d230d2129e372fd8a22f58901cd0c1b68
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Fri, 31 May 2024 08:54:10 +0000 (11:54 +0300)]
EX-9871 tests: skip sanity-compr 1007 and 1008
if needed tools are not installed
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0dc3d44c300708f3a25bfce06b81993cdd30c418
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Sat, 11 May 2024 01:28:05 +0000 (18:28 -0700)]
LU-17646 llapi: lustreapi: add FID in error messages
Use llapi_fd2fid() to print FID in llapi_lease_set() and
llapi_lease_check() error messages.
Lustre-change: https://review.whamcloud.com/55074
Lustre-commit:
8920e024cbc5d7db094f06e757e07c50524928e6
Test-Parameters: trivial
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Iac97ea721860652e304c674007ac7646d183e2fd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55237
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Thu, 30 May 2024 02:45:31 +0000 (19:45 -0700)]
EX-9280 lipe: extend periodic stats in lpurge
Added periodic stats to lpurge:
- Size and number of files that are not purged because they are:
  - stale
  - not mirrored
- Number of inodes, total and used
These stats are refreshed with each purge cycle. For example:
testfs-OST0000: INFO: used_kb: 179564 (3%) total_kb: 5496292
used_inodes: 301 (0%) total_inodes: 375360
testfs-OST0000: INFO: purged: 1 (20480KB 0%) failed_del: 0 (0KB 0%)
stale: 0 (0KB 0%) nomirror: 2 (178176KB 3%)
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ib404afe2b9d636bf1deaf8948411616971443932
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55248
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Thu, 23 May 2024 02:44:49 +0000 (22:44 -0400)]
LU-17866 pcc: zero ra_pages explictly for a file after PCC mmap
To support mmap under PCC, we do some special magic with mmap to
allow Lustre and PCC to share the page mapping.
The mapping host (@mapping->host) for the Lustre file is replaced
with the PCC copy for mmap. This may result in the wrong setting
of @ra_pages for the Lustre file handle with the backing store of
the PCC copy in the kernel:
->do_dentry_open()->file_ra_state_init():
file_ra_state_init(struct file_ra_state *ra,
struct address_space *mapping)
{
ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
ra->prev_pos = -1;
}
Setting the readahead pages for a file handle is the last step of the
open() call and is not under the control of the Lustre file
system.
Thus, to avoid setting @ra_pages wrongly, we explicitly set @ra_pages to
zero for the Lustre file handle in all read I/O paths.
When invalidating a PCC copy, we switch the mapping back
between Lustre and PCC. We also set mapping->a_ops back to
@ll_aops.
The readahead path in the PCC backend may enter ->readpage() in
Lustre. We then check whether the file handle is a Lustre file
handle. If not, it must come from the mmap readahead I/O path of the
PCC copy, and we return an error code directly in this case.
Change-Id: Id1e4a9e47bb484e97053759e1743fd2fce040149
Test-Parameters: clientdistro=el8.9 testlist=sanity-pcc env=ONLY=97,ONLY_REPEAT=10
Test-Parameters: clientdistro=el9.3 testlist=sanity-pcc env=ONLY=98,ONLY_REPEAT=10
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 12 Apr 2022 23:18:10 +0000 (17:18 -0600)]
LU-15720 dne: add crush2 hash type
The original "crush" hash type has a significant error with files
that have all-number suffixes, or suffixes that have non-alpha
characters in them. These files will all be placed on the same
MDT as the base filename, which causes MDT imbalance.
Add a "crush2" hash type that has more stringent checks for the
suffix, so that it doesn't consider all-digit suffixes, or files
that only have a '.' at the right offset, as temporary files.
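The stricter suffix check described above can be illustrated with a small userspace sketch. This is not the Lustre implementation: the function name looks_like_temp_name(), the 6-character suffix length, and the exact acceptance rule are assumptions made for illustration only. It shows the two rejections the message names: an all-digit suffix, and a name that merely has a '.' at the right offset followed by non-alphanumeric characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define TEMP_SUFFIX_LEN 6	/* assumed mkstemp-style ".XXXXXX" suffix */

/*
 * Hypothetical check: treat "base.XXXXXX" as a temporary file name only
 * if the suffix is fully alphanumeric and is not made of digits alone.
 */
static bool looks_like_temp_name(const char *name)
{
	size_t len, digits = 0, i;
	const char *suffix;

	assert(name != NULL);
	len = strlen(name);

	/* need at least one base character, the '.', and the suffix */
	if (len < TEMP_SUFFIX_LEN + 2)
		return false;
	if (name[len - TEMP_SUFFIX_LEN - 1] != '.')
		return false;

	suffix = name + len - TEMP_SUFFIX_LEN;
	for (i = 0; i < TEMP_SUFFIX_LEN; i++) {
		unsigned char c = (unsigned char)suffix[i];

		if (!isalnum(c))
			return false;	/* only a '.' at the right offset */
		if (isdigit(c))
			digits++;
	}

	/* an all-digit suffix (e.g. a date or sequence number) is rejected */
	return digits < TEMP_SUFFIX_LEN;
}
```

Under these assumptions a name such as "file.aB3xYz" is treated as temporary, while "data.123456" and "x.ab.cde" are hashed by their full name.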
Test that the "broken" all-digit or extra-'.' filenames are hashed
properly with "crush2". We also need to confirm that the old "crush"
hash has not changed (for name lookup compatibility) and still has
the original "bad hashing" bug that puts all files on the same MDT.
Fix handling of types beyond MDT_HASH_TYPE_CRUSH when creating dirs.
Fix debug layout printing of hash_type in more parts of the code.
Don't flood console if hash type is unrecognized in the future.
Lustre-change: https://review.whamcloud.com/47015
Lustre-commit:
1ac4b9598ad6e2f94c4c672b4733186364255c6a
Lustre-change: https://review.whamcloud.com/48713
Lustre-commit:
e17471792388e59f44040d48dd8138ec865663af
Fixes:
0a1cf8da8069 ("LU-11025 dne: introduce new directory hash type 'crush'")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1ce34b8f3af44432f55307ebc6906677c6179d1d
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 30 May 2024 17:04:27 +0000 (11:04 -0600)]
EX-9708 utils: lfs setstripe adds -E with -Z
When specifying a layout with "lfs setstripe -Z" it will ignore
this option if no PFL component is specified with "-E".
Instead, "lfs setstripe -Z" should automatically upgrade the file
layout to a PFL layout so the compression parameters are saved.
Test-Parameters: trivial testlist=sanity-compr
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I29cc373fabd352d6f8b6781c238806b75cce7057
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Timothy Day [Tue, 9 Jan 2024 17:17:10 +0000 (17:17 +0000)]
LU-17242 debug: use dump_stack() where possible
In some cases, libcfs_debug_dumpstack() can fail to output a
stack trace - either because the needed symbols are not exported
or those symbols can't be resolved at runtime. This seems to
occur more often with newer kernels. The message then appears only
as:
Lustre: ldlm_cb01_002: service thread pid 57876 was inactive for
40.494 seconds. The thread might be hung, or it might only be
slow and will resume later. Dumping the stack trace for
debugging purposes:
Pid: 57876, comm: ldlm_cb01_002 6.1.70 #1 SMP PREEMPT_DYNAMIC
Thu Jan 4 18:52:41 UTC 2024
Call Trace TBD:
with no stack trace (seen on CentOS 8.5 with ml 6.1.70).
For reference, the runtime symbol lookup was added and updated in:
b49ce7a ("LU-12400 libcfs: save_stack_trace_tsk if ARCH_STACKWALK")
58ac9d3 ("LU-14099 build: Fix for unconfigured arch_stackwalk")
First, add a message when the symbol can't be resolved correctly.
This makes it much easier to understand why the stack trace is
missing.
Second, replace libcfs_debug_dumpstack(NULL) with dump_stack().
When the task_struct is NULL, libcfs uses the current
task_struct. This replicates the functionality of dump_stack().
Using dump_stack() is more reliable, more in line with kernel
style, and not likely to be un-exported in the future.
Finally, in lustre/osc/osc_object.c the stack isn't dumped since
there is already an LBUG().
There only remains one user of libcfs_debug_dumpstack() which
uses a task_struct other than current. This can be cleaned up
in a future patch.
Lustre-change: https://review.whamcloud.com/53625
Lustre-commit:
ecac0c175d934fd5624c9ad8db8f45dbc33fb56c
Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I196c1da7e39b1a694c0cb67ecfaab58ab3e4662c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Alexander Zarochentsev [Mon, 29 Apr 2024 17:37:34 +0000 (17:37 +0000)]
LU-17851 ldiskfs: restart long fallocate tx
__ext4_journal_ensure_credits() may allow a long fs operation
like fallocate to run for too long if the initial credits
estimate is high enough.
The fix is to force a tx restart if the tx state is not T_RUNNING.
Lustre-change: https://review.whamcloud.com/55111
Lustre-commit:
f317b5c30e478fdecceea4bd07c85ff305e9d81d
HPE-bug-id: LUS-12311
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ib03d78739997caa6d13690b41ef7d01609a3623b
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55247
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaly Fertman [Tue, 13 Jul 2021 16:07:14 +0000 (19:07 +0300)]
LU-14847 ptlrpc: two replay lock threads
The two replay lock threads conflict with each other, which leads to:
ASSERTION( atomic_read(&imp->imp_replay_inflight) == 1 )
replay_lock_interpret() does ptlrpc_connect_import() on error, and one
thread will appear starting with connect reply interpret.
replay_lock_interpret() also wakes up ldlm_lock_replay_thread() which
does ptlrpc_import_recovery_state_machine().
It may happen that both threads will get to ldlm_replay_locks() on the
next round at the same time, both increment imp_replay_inflight and
the second one will assert.
The problem appeared in LU-13600 which added ldlm_lock_replay_thread()
with the ptlrpc_import_recovery_state_machine() call.
Lustre-change: https://review.whamcloud.com/44294
Lustre-commit:
d7d7eb50c8f5fd3fc5a7808fb112d233bdef34d7
HPE-bug-id: LUS-10147
Fixes:
3b613a442b ("LU-13600 ptlrpc: limit rate of lock replays")
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: Ia9aafb631e3ba5f850504cc58b4826acec2813bd
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55249
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 19 Dec 2023 08:24:07 +0000 (03:24 -0500)]
LU-9457 test: improve sanity 253
Improve sanity test_253: set high watermark to 50M, and fill OST with
fallocate.
Lustre-change: https://review.whamcloud.com/53548
Lustre-commit:
e934646f5ea87cd8a432db0e672c6ea48867ea47
Test-Parameters: trivial
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Test-Parameters: testlist=sanity env=EXCEPT=77c
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I85139d7fc0697d08c21bdb19432b40c8dab82ee9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Fri, 3 May 2024 00:27:04 +0000 (20:27 -0400)]
LU-15988 osp: don't print nid on -ESTALE
osp_send_update_req() should not access the import upon -ESTALE, because
this MDT may be unmounting.
Lustre-change: https://review.whamcloud.com/55049
Lustre-commit:
ae26dbc3387a17b763cbc901fa256d894a1f88fb
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibd869e4e8da4f90ffd608a36d866264d5d552d0e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 16 May 2024 19:57:42 +0000 (21:57 +0200)]
LU-15496 tests: fix sanity/398c to use proper OSC name
For ppc64le and aarch64 clients, the OSC import instance name does
not have "ffff" at the start, so use the proper device name for this
subtest.
Clean up the rest of test_398c to meet modern test code style.
Also add debugging to sanity/398c from #53462.
Lustre-change: https://review.whamcloud.com/55132
Lustre-commit:
b1b57bcadeeb5a87ac75387c4aa4ae084e1a27e0
Lustre-change: https://review.whamcloud.com/53462
Lustre-commit:
304ca31e2aa15c576e468a86e45d8817c8eca391
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8c72fa9b13eace009f39daf82454221eba6761b
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55313
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sat, 18 May 2024 19:43:05 +0000 (22:43 +0300)]
LU-15644 llog: don't replace llog error with -ENOTDIR
dt_try_as_dir() contains a check for object existence, whose
failure is ultimately reported as -ENOTDIR. In the llog case that
error propagates to the upper level and causes an error message on
the console. This is inappropriate both in error code and in
debug level.
The patch skips the object existence check in the llog case,
since it is excessive anyway.
The debug level is also reduced to avoid spamming the console
with messages in case of -ENOENT, -ESTALE or -EIO errors.
Lustre-change: https://review.whamcloud.com/55151
Lustre-commit:
bd9839f7dbdf59751e7cdc234602eb338c518104
Fixes:
1ebc9ed460 ("LU-15902 obdclass: dt_try_as_dir() check dir exists")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id404204566898a6ac2e258b7824491effc5fc92e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55152
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Thu, 30 May 2024 01:17:55 +0000 (18:17 -0700)]
LU-17883 kernel: update SLES15 SP5 [5.14.21-150500.55.65.1]
Update SLES15 SP5 kernel to 5.14.21-150500.55.65.1 for Lustre client.
Lustre-change: https://review.whamcloud.com/55227
Lustre-commit: TBD (from
1372c20c7d85c4d5c216c566647a883af1c5f16a)
Test-Parameters: trivial mdtcount=4 mdscount=2 \
clientdistro=sles15sp5 testlist=sanity
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-1
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-2
Test-Parameters: optional clientdistro=sles15sp5 testgroup=full-part-3
Change-Id: Ie0601c190e52d6192bf389338be51c77db03a9c2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 29 May 2024 00:40:19 +0000 (17:40 -0700)]
LU-17402 kernel: RHEL 8.10 client support
This patch makes changes to support RHEL 8.10 release
with kernel 4.18.0-553.el8_10 for Lustre client.
Lustre-change: https://review.whamcloud.com/54800
Lustre-commit: TBD (from
6748f47fac79e557ae21eb790b597be6449c926a)
Test-Parameters: trivial fstype=ldiskfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: trivial fstype=zfs mdtcount=4 mdscount=2 \
clientdistro=el8.10 serverdistro=el8.8 testlist=sanity
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-1
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-2
Test-Parameters: optional clientdistro=el8.10 testgroup=full-part-3
Change-Id: I0a9a262d13e0b0de3607da0982468fd8b5f6a7aa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 30 May 2024 23:00:40 +0000 (16:00 -0700)]
LU-17404 kernel: update RHEL 9.4 [5.14.0-427.18.1.el9_4]
Update RHEL 9.4 kernel to 5.14.0-427.18.1.el9_4 for Lustre client.
Lustre-change: https://review.whamcloud.com/55203
Lustre-commit: TBD (from
07a23833999207c336532bcf75aa9d5a954f1b07)
Test-Parameters: trivial \
mdtcount=4 mdscount=2 clientdistro=el9.4 testlist=sanity
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-1
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-2
Test-Parameters: optional clientdistro=el9.4 testgroup=full-part-3
Change-Id: If18027650ff953733f2e57727b71d2daa61d249c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Elena Gryaznova [Tue, 26 Apr 2022 13:37:27 +0000 (16:37 +0300)]
LU-15785 tests: do not detect versions for RPC_MODE mode
lustre_version_code() is called every time do_rpc_nodes()
is called. There is no need to detect versions in RPC_MODE.
Lustre-change: https://review.whamcloud.com/47144
Lustre-commit:
e3fcd81ae5f378ac62754a659c7adf0e0b656cf3
Fixes:
8fa23490bb ("LU-1538 tests: standardize test script init - sanity")
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10914
Change-Id: Ia7645de0a4eedfddf859c80e661ebcb2e45de140
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55272
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 30 May 2024 00:45:45 +0000 (18:45 -0600)]
RM-620 build: New tag 2.14.0-ddn150
New tag 2.14.0-ddn150
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7cac3d582c510f1e19316b97ccfe26dd239dce31
Andreas Dilger [Thu, 30 May 2024 00:45:22 +0000 (18:45 -0600)]
RM-620 build: New tag lipe-2.51
New tag lipe-2.51
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I814564f4535217c614ecc8bbda0ed842661ebf08
Etienne AUJAMES [Mon, 8 Jan 2024 15:06:08 +0000 (16:06 +0100)]
LU-17250 mgs: generate a new MDT configuration by copy
The configuration for a new MDT is generated by reading the client
configuration. The MGS filters the existing mdc/osc records, interprets
them, and then creates the corresponding osp/osc devices for the MDT.
The main idea of this patch is to first convert and copy the records
from the client configuration to create the new MDT, and then
copy the remaining record sections from an existing MDT.
This way the new MDT can inherit OST pools and parameters from the
existing one.
This avoids complex compatibility checks for IPv4/v6 NIDs because
add_uuid records are copied without the need to parse NIDs.
This also allows copying the "add failnid" section from the client.
This patch extends the use of the "add failnid" section to MDT
configurations.
Here are the steps to copy an existing MDT configuration:
1/ read the client configuration and generate osp MDT/OST records for
the new MDT
2/ find an existing MDT configuration
3/ copy and convert the remaining configuration records from the
existing MDT configuration (parameters and OST pools)
Add the regression test conf-sanity 137.
Lustre-change: https://review.whamcloud.com/53614
Lustre-commit:
d4682ff4cc44413810a68e572cf7f05d5b188bb4
Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity
Test-Parameters: mdtcount=4 fstype=zfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Test-Parameters: mdtcount=4 fstype=ldiskfs testlist=conf-sanity env=ONLY=137,ONLY_REPEAT=10
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I4a99085b8930a0dd8002bde87d4e8c575aaccba0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55101
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Patrick Farrell [Fri, 15 Dec 2023 20:48:53 +0000 (15:48 -0500)]
LU-13805 llite: Fix return for non-queued aio
If an AIO fails or is completed synchronously (even
partially), the VFS will handle calling the completion
callback to finish the AIO, and so Lustre needs to return
the number of bytes successfully completed to the VFS.
This fixes a bug where if an AIO was racing with buffered
I/O, the AIO would fall back to buffered I/O, causing it to
complete before returning to the VFS rather than being
queued. In this case, Lustre would return 0 to the VFS, and
the VFS would complete the AIO and report 0 bytes moved.
This fixes the logic for this.
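The return-value rule described above can be sketched in userspace C as follows. This is only an illustration of the decision, not the llite code: the helper aio_submit_result() and its parameters are hypothetical, and SKETCH_EIOCBQUEUED stands in for the kernel-internal -EIOCBQUEUED code that a file system returns when an AIO was really queued (an assumption drawn from general VFS behavior, not from this patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* kernel-internal "iocb queued" code, redefined here for the sketch */
#define SKETCH_EIOCBQUEUED 529

/*
 * Hypothetical helper: pick the value handed back to the VFS.
 * queued      - the AIO was queued and will complete asynchronously
 * bytes_done  - bytes completed synchronously (e.g. buffered fallback)
 * error       - 0 or a negative errno from the synchronous path
 */
static ssize_t aio_submit_result(bool queued, ssize_t bytes_done, int error)
{
	assert(bytes_done >= 0 && error <= 0);

	if (queued)
		return -SKETCH_EIOCBQUEUED;

	/* completed (even partially) before returning: report the bytes,
	 * not 0, so the VFS finishes the AIO with the right count */
	if (bytes_done > 0)
		return bytes_done;

	return error;
}
```

With this rule, an AIO that fell back to buffered I/O and moved 4096 bytes reports 4096 rather than 0.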
Lustre-Commit:
8a5bb81f774b9d41f1009b07010372fa9cd03a62
Lustre-Change: https://review.whamcloud.com/49915
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9306402201e2962bbff04a4264c37bd0f1eca7b7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53696
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 27 Apr 2024 02:48:15 +0000 (20:48 -0600)]
LU-17788 ptlrpc: restore watchdog revival message
Restore the "Service thread pid NNN completed after SSS.mmm
seconds. This likely indicates the system was overloaded"
message that was lost during ptlrpc watchdog restructuring.
Do not rate limit this message, so that it is possible to see
when all threads are restored, even if their corresponding
"Service thread pid NNN was inactive" message was throttled.
Update recovery-small test_10a to check for these messages,
so that they are not removed again in the future.
Lustre-change: https://review.whamcloud.com/54942
Lustre-commit:
20c09eff4d397e7158aa4408e0cb50b102cc61c0
Test-Parameters: testlist=recovery-small env=ONLY=10a
Fixes:
fc9de679a4 ("LU-9859 libcfs: add watchdog for ptlrpc service threads.")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c7e96fb7f73ca5562a6f5ad780a79ffc83ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Vitaliy Kuznetsov [Tue, 21 May 2024 19:05:16 +0000 (21:05 +0200)]
EX-9585 lipe: add lipe_find3 pool option
Add an option to print the OST pool for a file with the
"-printf" argument, both as the long option "%{pool}" and as the
short option "%Lp", which is compatible with "lfs find".
The long "%{pools}" option prints *all* pools in the layout.
Update the lipe-find3.1 man page and add test cases for both.
Test-Parameters: trivial testlist=sanity-lipe-find3,sanity-lipe-scan3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I18d2d3cc161c8aa92eb27c33b06214b6f53ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54785
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Vitaliy Kuznetsov [Wed, 29 May 2024 15:00:30 +0000 (17:00 +0200)]
EX-9121 lipe: Trivial improvements for report merging
Small changes that do not affect functionality, but allow
some functions to be reused in other parts of lipe3, for example in the
utility for merging different directory stats reports.
Test-Parameters: trivial testlist=sanity-lipe-scan3,sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: Ib7eeeccb651e7bcff4ddfc78c66a35793df7bd1d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Etienne AUJAMES [Thu, 26 Oct 2023 19:28:55 +0000 (21:28 +0200)]
LU-16566 sptlrpc: remove rq_sepol from ptlrpc_request
This patch removes rq_sepol from ptlrpc_request to reduce memory
consumption on the servers.
The rq_sepol field is 327 bytes long and allocated for each request, but
it is rarely used (it needs SELinux activated with the send_sepol
feature).
The patch stores the SELinux policy status string in a separate object.
The pointer is stored in ptlrpc_sec->ps_sepol and protected by RCU
(mostly read-only, since the SELinux policy should rarely change).
When the policy status needs to be packed in a request, we take a
reference to the current ps_sepol object and release it after the
packing. If the policy has changed in the meantime, the object used
will be freed afterward.
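The reference lifetime described above can be illustrated with a single-threaded userspace sketch. It deliberately omits RCU and atomic counters, so it is not the kernel implementation. All names here (struct sepol_status, current_sepol, sepol_get(), sepol_put(), sepol_update()) are hypothetical stand-ins for the object behind ptlrpc_sec->ps_sepol.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct sepol_status {
	int refs;		/* plain counter: the sketch is single-threaded */
	char text[64];		/* cached policy status string */
};

/* stands in for ptlrpc_sec->ps_sepol; holds one reference itself */
static struct sepol_status *current_sepol;

/* take a reference on the current object before packing a request */
static struct sepol_status *sepol_get(void)
{
	if (current_sepol != NULL)
		current_sepol->refs++;
	return current_sepol;
}

/* drop a reference; the last user frees the object */
static void sepol_put(struct sepol_status *s)
{
	if (s == NULL)
		return;
	assert(s->refs > 0);
	if (--s->refs == 0)
		free(s);
}

/* policy changed: publish a new object and drop the old one's reference */
static int sepol_update(const char *text)
{
	struct sepol_status *fresh = calloc(1, sizeof(*fresh));
	struct sepol_status *old = current_sepol;

	if (fresh == NULL)
		return -1;
	fresh->refs = 1;
	snprintf(fresh->text, sizeof(fresh->text), "%s", text);
	current_sepol = fresh;
	sepol_put(old);
	return 0;
}
```

A request that called sepol_get() before an update keeps a valid old object until its own sepol_put(), which is the "freed afterward" case in the description.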
A read operation is added to srpc_sepol parameter to return the
SELinux policy string cached in Lustre.
Lustre-change: https://review.whamcloud.com/52845
Lustre-commit:
3f70481c93dcabbb30267608a0054f4d7092e0db
Test-Parameters: testlist=sanity-selinux env=ONLY=21,ONLY_REPEAT=50
Test-Parameters: testlist=sanity-selinux env=ONLY=21,ONLY_REPEAT=50
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I80fb76c97885c4b2987eb7f91a9bfe6e0e6e6c70
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55211
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 31 Aug 2023 20:50:56 +0000 (14:50 -0600)]
LU-17000 ptlrpc: fix string overflow warnings
Fix potential string overflow warnings in sptlrpc_flavor2name()
calling strncat() with the full size of the target buffer
instead of the *remaining* space in the target buffer.
Fix potential string overflow warning in sepol_seq_write_old()
and sepol_seq_write() potentially copying an unterminated string
from userspace via strncpy() and not terminating it afterward.
Since the maximum incoming parameter size is known in advance,
is reasonably small (~342 bytes), and is only used temporarily,
reorganize the code to avoid two buffer allocations and copies.
Use memcpy() to copy the string since its length is known, and
always add a NUL terminator to the string afterward.
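Both fixes follow common C string-handling patterns, sketched below in userspace C. The helper names append_bounded() and copy_terminated() are hypothetical and are not the functions touched by the patch. They only show passing the *remaining* space to strncat(), and using memcpy() plus an explicit NUL when the source length is known.

```c
#include <assert.h>
#include <string.h>

/*
 * Append src to dst, where dst holds dst_size bytes in total.
 * strncat() takes the number of bytes it may append, not the size of
 * the buffer, so pass the remaining space minus one for the NUL.
 */
static void append_bounded(char *dst, size_t dst_size, const char *src)
{
	size_t used;

	assert(dst_size > 0);
	used = strlen(dst);
	if (used + 1 < dst_size)
		strncat(dst, src, dst_size - used - 1);
}

/*
 * Copy src_len bytes that may lack a terminator and always add one.
 * Returns -1 if the data plus the NUL does not fit in dst.
 */
static int copy_terminated(char *dst, size_t dst_size,
			   const char *src, size_t src_len)
{
	if (src_len >= dst_size)
		return -1;
	memcpy(dst, src, src_len);
	dst[src_len] = '\0';
	return 0;
}
```

For example, appending "-bulkp" to "krb5" in an 8-byte buffer is truncated to "krb5-bu" instead of writing past the end.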
Improvements to error messages and code style in these functions.
Addresses-Coverity: 199034 ("Out-of-bounds access")
Addresses-Coverity: 199063 ("Out-of-bounds access")
Addresses-Coverity: 199108 ("Out-of-bounds access")
Addresses-Coverity: 397374 ("String not null terminated")
Addresses-Coverity: 397394 ("String not null terminated")
Lustre-change: https://review.whamcloud.com/52210
Lustre-commit:
ff62700fa8ee717a71de13baec25f0d69640ae7c
Test-Parameters: trivial testlist=sanity-sec,sanity-selinux
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia810ce9f07b663a90049bb78af21c06f0e3ebbe5
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55210
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Hongchao Zhang [Sat, 20 Apr 2024 06:31:51 +0000 (14:31 +0800)]
LU-17873 test: ignore WIFSIGNALED if rc is 0
Ignore the result of the WIFSIGNALED check if the return status
of the "lctl test_create" thread is zero.
Lustre-change: https://review.whamcloud.com/55194
Lustre-commit: TBD (from
d1000ae89065a6868d0dbbd5c752ff06299d36c4)
Test-Parameters: trivial envdefinitions=SLOW=yes,DEBUG_SIZE=64 mdtcount=1 \
testlist=mds-survey,mds-survey,mds-survey,mds-survey,mds-survey,mds-survey
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ifc3727d48010c9f00f38baff9ff91b5cc3afce5c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55185
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 12 Apr 2024 01:18:28 +0000 (19:18 -0600)]
LU-16915 tests: improve distro type checking
Improve lustre_os_release() infrastructure to reduce redundant
code and make it easier to use.
Lustre-change: https://review.whamcloud.com/54790
Lustre-commit:
1ffbec13c0f745d0b9c6b91959b1afa52f99d63b
Test-Parameters: trivial
Fixes:
339b5e918f ("LU-16915 tests: except sanity-sec test_51")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id02223752df4eb3fd3b62b339e8c417eb33ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 12 Apr 2024 01:18:28 +0000 (19:18 -0600)]
LU-16915 tests: except sanity-sec test_51
Skip sanity-sec test_51 since it has started failing recently with
the move to el9.3 servers.
Add common lustre_os_release infrastructure to make such checking
easier in the future.
Lustre-change: https://review.whamcloud.com/54751
Lustre-commit:
b881bd1051451ed18610e0cc3c3cd56c8803cbc9
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id02223752df4eb3fd3b62b339e8c417eb3e86a12
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Rebanta Mitra [Tue, 28 May 2024 00:17:43 +0000 (17:17 -0700)]
LU-17877 lnet: export REGISTER_FUNC with EXPORT_SYMBOL_GPL
This patch exports REGISTER_FUNC and UNREGISTER_FUNC
with EXPORT_SYMBOL_GPL to load GPL-licensed modules.
Lustre-change: https://review.whamcloud.com/55217
Lustre-commit: TBD (from
b3bdf8ba7fb316905b76decb35bab8dc1947ed91)
Test-Parameters: trivial
Signed-off-by: Rebanta Mitra <rmitra@nvidia.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I3a0d4e2b27911af36e210692d28892590eb0371c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Shaun Tancheff [Wed, 15 May 2024 06:30:39 +0000 (23:30 -0700)]
LU-17816 llapi: ensure pool name is nul terminated
strncpy() usage is inconsistent about the size of the pool name
and sometimes forgets to ensure a nul byte is placed at the
end of the copy.
CoverityID: 397181 ("Buffer not null terminated (BUFFER_SIZE)")
Also clean up a case of checking that an unsigned value is >= 0
CoverityID: 397820 ("Unsigned compared against 0 (NO_EFFECT)")
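A minimal userspace sketch of both cleanups follows. The names POOL_NAME_MAX, struct pool_ref, pool_ref_set() and index_valid() are hypothetical and chosen for illustration. The sketch sizes the buffer consistently as "maximum name length plus one", always writes the terminator after strncpy(), and drops the always-true "unsigned >= 0" half of a range check.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define POOL_NAME_MAX 15	/* assumed limit, excluding the nul byte */

struct pool_ref {
	char name[POOL_NAME_MAX + 1];	/* always room for the nul byte */
};

/* Copy a pool name and guarantee termination; -1 if it is too long. */
static int pool_ref_set(struct pool_ref *ref, const char *pool)
{
	assert(ref != NULL && pool != NULL);

	if (strlen(pool) > POOL_NAME_MAX)
		return -1;
	/* strncpy() does not terminate when the source fills the limit */
	strncpy(ref->name, pool, sizeof(ref->name) - 1);
	ref->name[sizeof(ref->name) - 1] = '\0';
	return 0;
}

/* An unsigned index can never be negative: only the upper bound matters. */
static bool index_valid(unsigned int idx, unsigned int count)
{
	return idx < count;
}
```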
Lustre-change: https://review.whamcloud.com/55018
Lustre-commit:
64469274a4f3e202c76cf9a2757b8f36e8d0ee08
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Idec7adaf89c9dabc0275687c4a069fc8fa63e7a7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55119
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>