Whamcloud - gitweb
Artem Blagodarenko [Thu, 25 Jan 2024 23:01:05 +0000 (23:01 +0000)]
EX-8814 csdc: Update async_args after resend
It was decided to send an uncompressed request on redo.
osc_brw_prep_request() processes the uncompressed data and prepares
a new request, so some parts of the old request become outdated.
Update the old request with information from the new one.
Fixes: 8fb8d5b ("EX-8814 csdc: Revert "EX-8189 osc: do not compress resends"")
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Idb1c6ee9db64cb1f2ea1c1562b1c5aae443263e3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Arshad Hussain [Mon, 22 Jan 2024 10:33:02 +0000 (16:03 +0530)]
LU-17000 utils: In mydaemon() check after calling open()
This patch adds a check of the value returned by open() in
mydaemon() instead of using the value directly.
Lustre-change: https://review.whamcloud.com/53758
Lustre-commit:
0f67ab9b00c3949f257cd4e6081184858f245b4e
Test-Parameters: trivial kerberos=true testlist=sanity-krb5
CoverityID: 397666 ("Argument cannot be negative")
Fixes:
d2d56f38da0 ("make HEAD from b_post_cmd3")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Ic59414977029221e8618c5bb3320e95d39d9cded
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53911
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sun, 4 Feb 2024 00:44:13 +0000 (17:44 -0700)]
DDN-4656 osd-ldiskfs: hide alloc time in brw_stats
For EXA6.0/6.1 do not show the "block maps msec" stats in brw_stats
by default as this breaks collectd and lustrefs_collector parsing.
Base this check on the Linux kernel version, since those releases
were based on RHEL7.9 on the server, while EXA6.2/6.3 use RHEL8.
Add an "enable_brw_block_maps" parameter that can be used to
disable the display of this statistic (it is always collected).
Enable the "enable_stats_header" parameter automatically in the
same way, as this was added for EXA6.2 but should now be supported.
Test-Parameters: trivial
Fixes:
c1e43cf8e0 ("LU-15564 osd: add allocation time histogram")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib5e33bd98085aaf4a5a5d39283d5d334b93ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53903
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Andreas Dilger [Mon, 5 Feb 2024 22:27:20 +0000 (15:27 -0700)]
LU-17476 tests: wait for sanity/350 to clean up
Wait until sanity test_350 has finished deleting its files before
moving on to the next subtests, otherwise the background cleanup
can cause later test failures (in particular test_413a).
Test-Parameters: trivial testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Test-Parameters: testlist=sanity
Fixes:
d1509ff2ca ("LU-17476 lnet: prefer to use bits only to match ME")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9ff61013764f4e47916999eefab893e069bb217a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Vitaliy Kuznetsov [Tue, 12 Dec 2023 14:49:27 +0000 (15:49 +0100)]
EX-7717 lipe: Add simple compression ratio statistics
This patch adds a new table to display data
compression ratio in overall statistics.
The new table to display compression ratio (for regular files)
will have the following column values:
0. Compression ratio range;
1. Count of files in range;
2. Number of files in range as a percent of total
number of files;
3. Number of files in this range or smaller as
a % of total # of files;
4. Total compression size of files in range;
5. Total compression size of files in range as a % of
total compression size of files;
6. Total compression size of files in this range or
smaller as a % of total compression size of files;
7. Minimum value in range (ratio);
8. Maximum value in range (ratio).
The columns in the table are numbered from 0 to 8 for a better
understanding of the table without the need to name the
columns with long text.
This PR also changes some variable types to the "double" type
for correct calculation of values and to avoid duplication of
variables with the same semantic value.
The output of information in reports with the .out
extension has also been improved.
Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I242ddb9c4132a7fce81508dadacf8e2b01e3cead
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52372
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
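
The nine columns listed for EX-7717 above can be illustrated with a minimal sketch. This is not the lipe implementation: the function name, the (ratio, size) input pairs and the range boundaries are assumptions made for illustration only.

```python
from bisect import bisect_right


def compression_ratio_table(files, edges):
    """Build rows with columns 0..8 as listed in the commit message.

    files: iterable of (ratio, compressed_size) pairs (hypothetical input).
    edges: ascending ratio boundaries (hypothetical); a ratio equal to a
           boundary falls into the higher range.
    """
    files = list(files)
    buckets = [[] for _ in range(len(edges) + 1)]
    for ratio, size in files:
        buckets[bisect_right(edges, ratio)].append((ratio, size))

    total_count = len(files)
    total_size = sum(size for _, size in files)
    rows = []
    cum_count = 0
    cum_size = 0
    for i, bucket in enumerate(buckets):
        if not bucket:
            continue  # empty ranges produce no row in this sketch
        low = edges[i - 1] if i > 0 else 0.0
        high = edges[i] if i < len(edges) else float("inf")
        count = len(bucket)
        size = sum(s for _, s in bucket)
        cum_count += count
        cum_size += size
        ratios = [r for r, _ in bucket]
        rows.append((
            (low, high),                                           # 0: range
            count,                                                 # 1: files in range
            100.0 * count / total_count,                           # 2: % of files
            100.0 * cum_count / total_count,                       # 3: cumulative % of files
            size,                                                  # 4: size in range
            100.0 * size / total_size if total_size else 0.0,      # 5: % of size
            100.0 * cum_size / total_size if total_size else 0.0,  # 6: cumulative % of size
            min(ratios),                                           # 7: minimum ratio
            max(ratios),                                           # 8: maximum ratio
        ))
    return rows
```

Accumulating the counts and sizes as integers and dividing only when a percentage is emitted keeps the cumulative columns (3 and 6) ending at exactly 100 for the last non-empty range.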
Bobi Jam [Thu, 11 Jan 2024 09:44:18 +0000 (17:44 +0800)]
EX-8038 csdc: store compression info in FID EA
Store compression information in the OST object's FID EA, so that lfsck
can use it to recover the MDT-object layout EA from orphan OST-object(s).
Lustre 2.15 may embed PFID and layout stripe info in the LMA EA; this
patch clears them from the LMA EA and stores them, together with the
compression info, directly in the FID EA thereafter.
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iacac04601b73f85d9bc057b8dd34a5004248dac4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 4 Aug 2023 07:02:41 +0000 (15:02 +0800)]
EX-8038 csdc: expand filter_fid
Expand filter_fid to include compression information. For
compatibility reasons, if the file is uncompressed, still store
the old filter_fid with no compression info in the FID EA.
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I388500c03604749d05849aeed3c9141974540e4a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53663
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 2 Feb 2024 16:21:29 +0000 (09:21 -0700)]
RM-620 build: New tag 2.14.0-ddn133
New tag 2.14.0-ddn133
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I06785ba295c668f8aab7dcbf2504c64068592123
Andreas Dilger [Fri, 2 Feb 2024 16:21:16 +0000 (09:21 -0700)]
RM-620 build: New tag lipe-2.40
New tag lipe-2.40
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If7cd3a8cb392eee47bbc004ba881518f0e3fb991
Bobi Jam [Fri, 26 Jan 2024 10:14:36 +0000 (18:14 +0800)]
LU-17482 llite: short read could mess up next read offset
When a read reaches EOF, it may read data from stale pagecache, so
iocb->ki_pos must be restored so that the next read continues
from the correct offset.
Lustre-change: https://review.whamcloud.com/53827
Lustre-commit: TBD (from
4bec3a277c83932cfb5ba26e31336e1f4666460a)
Fixes:
4468f6c9d9 ("LU-16025 llite: adjust read count as file got truncated")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib8b62c41bf65f8efec82dda53fcfbdb68ad08b38
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Sat, 27 Jan 2024 20:17:34 +0000 (12:17 -0800)]
LU-17476 lnet: prefer to use bits only to match ME
In some cases, it has been observed that a reply will arrive
at the portal with the correct match bits, but is dropped by
lnet_parse_put(). This appears to happen with LNet Multi-Rail
peers, each having two separate NIDs.
If a reply arrives with matchbits available and matching, but
the NIDs don't match, confirm the match if the NIDs are found
to belong to the same peer. This will only happen in cases
where the reply would be dropped entirely, causing hundreds of
seconds of delay until the RPC is resent, so the extra overhead
of checking for a peer match before dropping the request is
only in the error path and minimal compared to the alternative.
Add CFS_FAIL_CHECK() for exercising the match NIDs code.
That is in a hot codepath, but CFS_FAIL_CHECK() is marked unlikely()
and this check is in the error case and _should_ only be hit when the
message would have been dropped anyway, so it seems unlikely to impact
performance in any meaningful way.
Lustre-change: https://review.whamcloud.com/53843
Lustre-commit: TBD (from
3360e892750d1bf4f2b7ceab60d9a637b3e649ad)
Test-Parameters: testlist=sanity-lnet env=ONLY=350,ONLY_REPEAT=10
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I10e1a2142539ddf5dabc26ce962cec1f2cfcf3db
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53846
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
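
The matching rule described for LU-17476 above can be sketched as follows. This is a simplified model, not the LNet code: real ME matching also involves ignore bits and wildcard NIDs, and the function name and the NID-to-peer table here are hypothetical.

```python
def me_matches(me_bits, me_nid, msg_bits, msg_nid, nid_to_peer):
    """Decide whether an incoming message matches a match entry (ME).

    nid_to_peer is a hypothetical lookup table mapping each known NID to
    the peer that owns it; a Multi-Rail peer owns several NIDs.
    """
    if me_bits != msg_bits:
        return False  # match bits must always agree
    if me_nid == msg_nid:
        return True   # normal path: bits and NID both match
    # Fallback path, reached only when the message would otherwise be
    # dropped: accept it if both NIDs belong to the same peer.
    peer = nid_to_peer.get(me_nid)
    return peer is not None and peer == nid_to_peer.get(msg_nid)


# Example data (made up): peerA is Multi-Rail with two NIDs.
PEERS = {
    "10.0.0.1@o2ib": "peerA",
    "10.0.0.2@o2ib": "peerA",
    "10.0.0.9@o2ib": "peerB",
}
```

The peer lookup sits behind the two cheap comparisons, which mirrors the point made in the commit message that the extra check costs something only in the error path.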
Alexander Zarochentsev [Sun, 28 May 2023 12:42:27 +0000 (08:42 -0400)]
LU-16873 osd: update OI_Scrub file with new magic
The fix for LUS-11542 detects the format change correctly
but does not write new oi scrub file magic, so new mount
triggers the "oi files counter reset" again and again.
Lustre-change: https://review.whamcloud.com/51226
Lustre-commit:
38b7c408212f60d684c9b114d90b4514e0044ffe
Fixes:
126275ba83 ("LU-16655 scrub: upgrade scrub_file from 2.12 format")
HPE-bug-id: LUS-11646
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ia13fcfaf0d8f2c4ee9331dd9fec0ff159d195186
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Artem Blagodarenko [Sun, 28 Jan 2024 20:24:31 +0000 (20:24 +0000)]
EX-8598 tests: use alternative data source for rewriting
Using the same file as input has a disadvantage: it is not
possible to tell whether the data was rewritten at all.
An alternative data source should be used instead.
Shift the source file data and use the result as the source.
To check the rewrite result, the same operation is performed
on a copy of the destination file stored outside the Lustre FS.
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1004
Change-Id: I6ef400520359bfe9156c3f47e757064863bdf4e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53088
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
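
The reasoning behind EX-8598 above can be shown with an in-memory sketch. It does not reproduce the sanity-compr test: the helper names, the rotation used as the "shift", and the offsets are assumptions, and plain byte strings stand in for the Lustre file and its outside copy.

```python
def shifted(data, shift):
    """Rotate data left by `shift` bytes (hypothetical way to derive new input)."""
    if not data:
        return data
    shift %= len(data)
    return data[shift:] + data[:shift]


def rewrite(dest, chunk, offset):
    """Return a copy of dest with chunk written at offset."""
    out = bytearray(dest)
    out[offset:offset + len(chunk)] = chunk
    return bytes(out)


ORIGINAL = bytes(range(256))
OFFSET, LENGTH = 100, 50

# Same file as input: the result equals the original, so a rewrite that
# silently did nothing cannot be told apart from a correct one.
same_source = rewrite(ORIGINAL, ORIGINAL[OFFSET:OFFSET + LENGTH], OFFSET)

# Shifted source: the rewritten range now differs from the original.
chunk = shifted(ORIGINAL, 7)[OFFSET:OFFSET + LENGTH]
on_lustre = rewrite(ORIGINAL, chunk, OFFSET)  # stands in for the Lustre file
reference = rewrite(ORIGINAL, chunk, OFFSET)  # stands in for the outside copy
```

With the shifted source, comparing the file against the reference copy fails if the rewrite was skipped, which is the case the same-file input could not detect.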
Patrick Farrell [Wed, 24 Jan 2024 16:02:32 +0000 (11:02 -0500)]
EX-8996 ofd: handle 'missing object' reads
When the read code (e.g. mdt_preprw_read) finds there is no
object, it will return a read with 0 pages, but not fail the
read. The assert for local and remote pages needs to
recognize this case.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idc6ff70f71abc100f750a63eca73a754a56f6435
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 29 Jan 2024 23:20:02 +0000 (16:20 -0700)]
EX-8450 tests: skip sanity-lipe-find3/306 on el7.9
The sanity-lipe-scan3 test_309 is failing consistently with el7.9
*clients*. Exclude it until fixed or we drop this client version.
Test-Parameters: trivial testlist=sanity-lipe-find3 clientdistro=el7.9 serverdistro=el7.9
Test-Parameters: trivial testlist=sanity-lipe-find3 clientdistro=el8.8 serverdistro=el7.9
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iceca83a3b85df95fe45482076170d77a6abc0947
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53853
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Andreas Dilger [Mon, 29 Jan 2024 21:54:03 +0000 (14:54 -0700)]
EX-7601 tests: skip sanity-compr tests in interop
Skip a number of subtests in sanity-compr that depend on fixes
landed to the code that were not available in older versions.
Test-Parameters: trivial testlist=sanity-compr serverversion=EXA6.3.0
Fixes:
3e1dd9d6ae ("LU-17468 lod: component add missed pattern info")
Fixes:
7731c7fc74 ("EX-7601 tests: unaligned read tests")
Fixes:
033dd0ba2c ("EX-7644 mmap: add mmap support for compression")
Fixes:
46708e4636 ("EX-7601 tests: tests for read-modify-write")
Fixes:
6c4c4d7599 ("EX-7601 tests: add multi-mount compression test")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I26cae5cf01cc32c9f3e4386cf7151a66ac3678ea
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53852
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 30 Jan 2024 02:26:24 +0000 (18:26 -0800)]
EX-7795 tests: add sanity-compr test for dir compression
This patch adds a sanity-compr test to validate that
we get directory space usage reduction with compression.
Change-Id: I16f3a3f1e413e4884b3973829df36500667271ce
Test-Parameters: trivial testlist=sanity-compr env=ONLY="1007 1008",ONLY_REPEAT=3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 5 Dec 2022 18:59:02 +0000 (11:59 -0700)]
LU-16367 utils: clean up ldiskfs feature handling
Update the default ldiskfs features used by mkfs.lustre:
- enable large_dir on OSTs as well as MDTs
- remove obsolete handling of "ext3" filesystems
- clean up handling of other features that have become a bit messy
Lustre-change: https://review.whamcloud.com/49316
Lustre-commit:
e6b6b7ee253cedd8aeb6bb48d6c54916368c4109
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id717c3ba939ccf9b2de34e868d4415e88429ef39
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lei Feng [Wed, 1 Mar 2023 00:16:03 +0000 (08:16 +0800)]
LU-16599 obdclass: job_stats can parse escaped jobid string
Writing a jobid to the job_stats proc entry asks Lustre to clear
the stats of that specific jobid. Since job_stats outputs an
escaped jobid string in some cases, it should be able to parse
an escaped jobid string when one is written to it.
Lustre-change: https://review.whamcloud.com/50160
Lustre-commit:
8f004bc53b1a488dad5a92a580f5f0c078e33654
Test-Parameters: trivial
Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: Idbc63dac6c3b35331317927107e634a3d638dd66
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
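
Parsing an escaped jobid, as described for LU-16599 above, can be sketched like this. The commit message does not state the escape format; the sketch assumes hex escapes of the form \xHH, and the function name is hypothetical.

```python
import re

# Assumption: escaped characters appear as a backslash, 'x', and two hex digits.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def unescape_jobid(text):
    """Turn each \\xHH sequence back into the character it encodes."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

A jobid without escapes passes through unchanged, so the same routine could be applied to every string written to the entry.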
Chris Horn [Tue, 5 Dec 2023 09:56:57 +0000 (03:56 -0600)]
LU-14810 lnet: Cancel discovery ping/push on shutdown
Discovery shutdown can race with ping and push events. In some cases
this can result in failing to unlink ping/push MDs on shutdown.
Protect against this by checking for PING/PUSH_FAILED state on peers
on the request queue.
Lustre-change: https://review.whamcloud.com/53356
Lustre-commit:
c3b9597742d5118a96f56129e7dd30d84468d2c8
Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=500,ONLY_REPEAT=50
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I84a1f5beb6508651bc62e1dd93271f9e72f5081c
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53848
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Fri, 26 Jan 2024 13:43:36 +0000 (21:43 +0800)]
LU-17471 osd: add symlink for brw_stats
Add a symlink at /proc/fs/lustre/osd-*/*/brw_stats to
/sys/kernel/debug/lustre/osd-*/*/brw_stats to fix
the compatibility issue with older utilities that are
still using the old proc entry.
Lustre-change: https://review.whamcloud.com/53829
Lustre-commit: TBD (from
5fad20603098c55c0080548a177023a36e640e84)
Fixes:
8a84c7f9c7 ("LU-14927 osd: share brw_stats code between OSD back ends")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Ie86b2b384e3b91f98ead00b6325ddeb020e47aa5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53858
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 29 Jan 2024 09:02:19 +0000 (02:02 -0700)]
RM-620 build: New tag 2.14.0-ddn132
New tag 2.14.0-ddn132
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I65b4833ced4c7c398110c49336138d5fb9947a31
Bobi Jam [Wed, 24 Jan 2024 06:04:35 +0000 (14:04 +0800)]
LU-17464 lod: set llc_ostlist to NULL after free
Default LOV striping may free a component entry's llc_ostlist when
needed, e.g. when expanding component entries. If the pointer is not
set to NULL afterwards, it could be allocated or freed twice later.
Lustre-change: https://review.whamcloud.com/53797
Lustre-commit: TBD (from
5e7440b488050166af15e744dc74b9dc4f0d3b96)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I25824cb61dd47ba284403039259593b88d25fa9d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53798
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vitaliy Kuznetsov [Tue, 23 Jan 2024 12:34:40 +0000 (13:34 +0100)]
EX-9007 lipe: Fix getting client mount path
This patch fixes an issue where size reports do not work
when the client is not mounted.
Test-Parameters: trivial testlist=sanity-lipe-scan3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I1e99fddf21960ecd14526c0d6baeb75c2a138dd8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53763
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Wed, 24 Jan 2024 15:35:37 +0000 (23:35 +0800)]
EX-9029 lfs: not iterate compr_type_table using ARRAY_SIZE
The EX-8311 patch modified compr_type_table to contain NULL fields in
the array, so code iterating over the array should not use ARRAY_SIZE,
but skip those elements with a NULL compression type name.
Fixes:
ec5814c9a7 ("EX-8311 csdc: allow specify 'fast'/'best' compression type")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I8e4988fd3a63c1cb66f75510d190c2ebc4f8f9be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53808
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Wed, 24 Jan 2024 17:08:33 +0000 (01:08 +0800)]
LU-17468 lod: component add missed pattern info
"lfs setstripe --component-add" missed setting the component pattern,
which causes some settings to be lost, such as overstriping and compression.
Lustre-change: https://review.whamcloud.com/53817
Lustre-commit: TBD (from
3849e3efdc58d535ee6858aafa22cfdc665ba2d7)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7ad746a550f1afea54a6f5b68823a79a85a44082
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53811
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 23 Jan 2024 13:29:11 +0000 (14:29 +0100)]
LU-16307 tests: fix sanity-sec test_31
In order to improve sanity-sec test_31 resiliency, reorganize the way
the new LNet '999' is handled, and make sure everything is correctly
cleaned up after the test.
Lustre-change: https://review.whamcloud.com/53818
Lustre-commit: TBD (from
f4a96799159fd662855542d471197ac4060d3295)
Test-Parameters: trivial testgroup=review-dne-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd657c7555e598d0ebc08387eac537b1c73e35bd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53779
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Wed, 9 Feb 2022 11:53:51 +0000 (14:53 +0300)]
LU-16339 quota: notify OSTs until lge_qunit_nu is set
There is a window when locks are not granted yet, but
lqe is set to qmt_reba_list to send updates to OSTs.
t1: lqe_init()->qmt_setup_lqe_gd->qmt_seed_glbe()
t1: lqe_init()->qmt_setup_lqe_gd->qmt_id_lock_notify()
t2: qmt_glimpse_lock() lustre-QMT0000: no granted locks to send glimpse
t1: ldlm_lock_enqueue()->ldlm_granted_list_add_lock() ...
If lge_qunit_nu was set to 1 in qmt_seed_glbe and appropriate qunit
is equal to the least_qunit, new qunit won't be sent to OSTs
and finally lqe_revoke will not be set causing endless -115 errors.
The fix calls qmt_id_lock_notify into qmt_dqacq0 for an lqe that has
set lge_qunit_nu or lge_edquot_nu.
Add test 85 to sanity-quota to check that a write
doesn't hang if the qunit initial value is equal to
the least_qunit due to a small block hard limit.
HPE-bug-id: LUS-10711
Change-Id: Icd1ac29beab87c0ebf00bcb20b25c33b771b74c1
Lustre-change: https://review.whamcloud.com/49228
Lustre-commit:
6c0b4329d046de283eeb254fca561be9386df68a
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53778
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexey Lyashkov [Wed, 12 Aug 2020 14:59:50 +0000 (17:59 +0300)]
LU-14008 o2iblnd: avoid memory copy for short msg
Modern cards allow sending kernel memory data without mapping
or copying it to the preallocated buffer.
This reduces LNet selftest CPU consumption by 3% for messages
smaller than 4k.
Lustre-change: https://review.whamcloud.com/40262
Lustre-commit:
bebd87cc6c9acc577a2fdde56e856075094f1291
Test-Parameters: trivial
HPE-bug-id: LUS-1796
Change-Id: I96c31be680c8ea7ac289a755df7f1d4c1c7f9aef
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53767
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexey Lyashkov [Wed, 12 Aug 2020 13:20:00 +0000 (16:20 +0300)]
LU-14008 o2iblnd: avoid static allocation for msg tx
Simplify tx msg handling: just push an LNet header
message onto the same list as the others.
Lustre-change: https://review.whamcloud.com/40261
Lustre-commit:
7d12b98d3f8294ca0911ca730aacd07a0f822298
Test-Parameters: trivial
Cray-bug-id: LUS-1796
Change-Id: I8e5d9b8a4579ff630d4a4fbc57b06a73a662e68c
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53766
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexey Lyashkov [Fri, 7 Aug 2020 11:26:25 +0000 (14:26 +0300)]
LU-14008 o2iblnd: cleanup
Simplify kiblnd_send by avoiding code duplication.
Pick up an idle tx first.
Lustre-change: https://review.whamcloud.com/40260
Lustre-commit:
3916b9d7226ebb21cf413dd7685afa693e243513
Test-Parameters: trivial
HPE-bug-id: LUS-1796
Change-Id: Iaf71a9a3aeb3047a086d4cc0a3cf4f1dbe8944b4
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53765
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 27 Jan 2024 20:08:33 +0000 (12:08 -0800)]
EX-8362 scripts: improve ll_compression_scan estimate
Improve ll_compression_scan script to give a better estimate of
actual compression ratios.
- add a '-d' debug option for verbose output during testing
- log and report incompressible small files < 4096
- log and report incompressible file count and size
- include small/incompressible/large files in compression estimate
- add a correction factor to calculations for safety margin
Change-Id: If561b0273e38e4821de228c81291859c7bb1a0d2
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1007,ONLY_REPEAT=10
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53824
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 23 Jan 2024 19:46:40 +0000 (11:46 -0800)]
EX-8362 scripts: improve ll_compression_scan functionality
Improve ll_compression_scan script functionality without
changing the compression estimates.
- add a version string to the output to allow tracking
- handle pathnames with spaces in them
- handle the lz4fast compression type
- allow running on MacOS for testing
Test-Parameters: trivial testlist=sanity-compr env=ONLY=1007
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0b8442a2590fdb9c718b1404cba1d73c26cff03c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53678
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Oleg Drokin [Fri, 19 Jan 2024 05:24:43 +0000 (00:24 -0500)]
LU-17446 ldlm: Do not wait for BL AST RPC completion on cancel
If we have sent an AST RPC to the client and while it's in flight
the client sent in the cancel, sometimes (esp. if AST or reply
to it are lost) even though the lock is already cancelled, whoever
is waiting on it is still stuck while trying to resend ASTs.
And in the end the client is not even evicted because the lock cancel
did come and all is fine, but it can add over a hundred seconds
to lock granting process in some non-ideal circumstances.
For simplicity we only treat Blocking ASTs like this, since we
can only have a single one of this kind.
This adds an additional pointer to struct ldlm_lock, but that is
already 560 bytes, so it does not really mean much.
Lustre-change: https://review.whamcloud.com/53739
Lustre-commit: TBD (from
d4b782c249377276dc9f6ddbf0fab34956d57af6)
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Id2231bc3bfc3e094faae2872fe09f3c330d441df
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53840
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 16 Jun 2022 05:03:45 +0000 (23:03 -0600)]
LU-15913 tests: add rename stress test via racer
Add a rename stress test using the racer framework. Use
mrename if found, to avoid stat and allow directory rename.
Sometimes create and rename files to/from subdirectories.
Run e2fsck after every run to confirm filesystem structure.
Allow tunable parameters via environment variables so they
can be set via Test-Parameters. Parameters can be set on
different nodes via variables CLIENT_LCTL_SETPARAM_PARAM,
MDS_LCTL_SETPARAM_PARAM, OSS_LCTL_SETPARAM_PARAM.
Lustre-change: https://review.whamcloud.com/47643
Lustre-commit: TBD (from
6c63c882741637a246012a81e41289fcf0e2dbbe)
Test-Parameters: trivial testlist=racer env=ONLY=2
Test-Parameters: testlist=racer env=ONLY=2 mdtcount=2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2ae034b864a5ccb8a59bf7028d22cd67c643f51f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53751
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Andreas Dilger [Fri, 19 Jan 2024 03:44:33 +0000 (20:44 -0700)]
LU-17426 tests: add crossdir parallel rename test
Add sanityn test_81d to test cross-dir (same-MDT) parallel rename
if the MDT supports this functionality.
Lustre-change: https://review.whamcloud.com/47643
Lustre-commit: TBD (from
fdd1f9df934efa070cd4aca4cf3db686261ef868)
Test-Parameters: trivial testlist=sanityn
Test-Parameters: testlist=sanityn serverversion=2.15
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10 mdtcount=2
Test-Parameters: testlist=sanityn env=ONLY=81d,ONLY_REPEAT=10 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic8717e6865a9c6c9698186f4fdf34c1f4f74083f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53748
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Fri, 19 Jan 2024 19:13:19 +0000 (12:13 -0700)]
LU-17426 mdt: relax same MDT file rename lock
Allow cross-directory rename of regular files (strictly, any
non-directory) on the same MDT without holding the BigFilesystemLock
(BFL), as file renames cannot change the directory hierarchy.
This should improve the performance for these rename operations, and
reduce contention between local MDT file renames in different parts of
the directory tree.
Add "mdt.*.enable_parallel_rename_crossdir" parameter to disable
cross-directory file renames if there is an issue with this change.
Lustre-change: https://review.whamcloud.com/53726
Lustre-commit: TBD (from
d466465f9add30faec256ce7f725e0f36d4e8a66)
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I511b392e46c46140cac6aa3ede02bfe793729f7f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53744
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Aurelien Degremont [Tue, 18 Jan 2022 13:55:01 +0000 (13:55 +0000)]
LU-930 ptlrpc: clarify AT error message
Clarify the error message related to passed deadline
for AT early replies. It was indicating that the system
was CPU bound which is most of the time wrong, as the issue
is rather communication failure delaying RPC traffic.
This could be confusing to people who will look for
CPU resource consumption when the network traffic is
more likely the cause.
Also avoid cryptic keywords that make sense only
to the feature developer and not to admins.
Lustre-change: https://review.whamcloud.com/49548
Lustre-commit:
9ce04000fba07706c73b8adb3605c959e5b62712
Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: Icdff8f4c6fb9905233f6b8ed1b961b2fd1127667
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53772
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Thomas Bertschinger [Thu, 13 Jul 2023 22:32:52 +0000 (18:32 -0400)]
LU-16766 obdclass: trim kernel thread names in jobids
When collecting jobstats on operations coming from kernel threads, it
is more useful and reduces the noisiness of the data if the names of
kernel threads are trimmed so that all "kworker/CPU:ID" threads are
collected under "kworker", all "ll_sa_PID" threads under ll_sa, etc.
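A minimal sketch of the trimming idea, assuming a simple rule that is not taken from the patch: drop everything from the first "/" onward, then drop a trailing "_<digits>" suffix. The function name trim_thread_name is hypothetical.

```python
import re


def trim_thread_name(comm: str) -> str:
    """Collapse per-CPU or per-PID kernel thread names into one jobid."""
    # "kworker/3:1" -> "kworker": keep only the part before the first "/".
    base = comm.split("/", 1)[0]
    # "ll_sa_12345" -> "ll_sa": strip a trailing "_<digits>" suffix.
    return re.sub(r"_\d+$", "", base)
```

Names without either pattern, such as ordinary process names, pass through unchanged under this rule.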
Lustre-change: https://review.whamcloud.com/51919
Lustre-commit:
8a9c503c002d08d0587894a748761e30c1b9a445
Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: Icd82a99c1153de0277ea5ed3f4b1d92535809921
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Zarochentsev [Tue, 28 Mar 2023 16:00:09 +0000 (19:00 +0300)]
LU-16655 scrub: upgrade scrub_file from 2.12 format
Scrub_file->sf_oi_count has different offsets in Lustre-2.10,
Lustre-2.12, and Lustre-2.15 due to unintended format changes.
Lustre-2.15 reads sf_oi_count from the offset of sf_success_count
and may initialize an incorrect number of OI files, and then not be
able to do FID lookups for existing filesystem objects.
Lustre-change: https://review.whamcloud.com/50455
Lustre-commit:
126275ba8339540e46f1c517decd3d69ad1cc42c
Fixes:
a114f6b8c5 ("LU-13344 servers: change request timeouts to s32")
Fixes:
4c2f028a95 ("LU-9019 osd-ldiskfs: migrate to 64 bit time")
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Id7c8bd555229405d604456c48447f01fd121aca9
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53839
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 23 Jan 2024 02:09:27 +0000 (19:09 -0700)]
RM-620 build: New tag 2.14.0-ddn131
New tag 2.14.0-ddn131
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I39de60e1b13b95f532b36068933f8335a16d7b8f
Andreas Dilger [Tue, 23 Jan 2024 02:09:05 +0000 (19:09 -0700)]
RM-620 build: New tag lipe-2.39
New tag lipe-2.39
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I91f46bb7da4d5fd6d06faac7c8c9975c69d7e8ce
Andreas Dilger [Thu, 18 Jan 2024 09:49:48 +0000 (02:49 -0700)]
LU-17441 mdc: use MDS_IO_PORTAL for rename
Some workloads like Apache Spark are very rename intensive. If there
are many concurrent renames that need the BFL lock (more than
the number of MDS_REQUEST_PORTAL service threads), they will block
these threads until each is able to get the rename lock, and prevent
other MDS_REINT RPCs from being processed.
Since the MDS_IO_PORTAL is often unused (only needed for DoM files),
and has existed since 2.11.0, it seems possible to move the rename
RPCs to be serviced by the MDS_IO_PORTAL threads to avoid contention
on the primary MDS service threads. Also, it will avoid blocking
normal file open, setattr, statfs, and other common operations if the
BFL lock is contended. Even with DoM files they may have read-on-open
handling and only DoM writes would be blocked by the uncommon rename.
Lustre-change: https://review.whamcloud.com/53725
Lustre-commit: TBD (from
b31c07cf18882b150d3e49ceee85a187e7a9b159)
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I623a27de1482778f3c9fc6bb5bbcf917611dc75b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53749
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Bobi Jam [Tue, 24 Oct 2023 14:02:55 +0000 (22:02 +0800)]
EX-8311 csdc: allow specifying 'fast'/'best' compression type
Use lctl set_param osc.*.compress_type_{fast|best}=<type>:<level>
to specify the compression_type:level for LL_COMPR_TYPE_FAST/
LL_COMPR_TYPE_BEST.
lctl get_param osc.*.compress_type_{fast|best} will list these
values.
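As an illustration of the <type>:<level> value format, the sketch below splits such a string into its two parts. The helper name parse_compress_spec is hypothetical, and the level 5 used in the test is an arbitrary example, not a documented default.

```python
from typing import Tuple


def parse_compress_spec(value: str) -> Tuple[str, int]:
    """Split a "<type>:<level>" string into (type, level)."""
    ctype, sep, level = value.partition(":")
    if not sep or not ctype:
        raise ValueError("expected <type>:<level>, got %r" % value)
    # int() raises ValueError if the level part is empty or not a number.
    return ctype, int(level)
```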
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ifeff7f25e30fc0884f0c770a3b6d0798937b3c35
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52814
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Cyril Bordage [Tue, 7 Dec 2021 22:14:43 +0000 (23:14 +0100)]
LU-15288 lnet: increase transaction timeout
In LU-13145, it was decided to increase default transaction timeout
(LNET_TRANSACTION_TIMEOUT_DEFAULT) to 150s. But, in the associated
patch, it was set to 50s. This modification will also modify
lnd_timeout (from 16 to 49).
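The two lnd_timeout values quoted above are consistent with deriving the LND timeout from the transaction timeout and the retry count. The sketch below assumes the relation lnd_timeout = (transaction_timeout - 1) / (retry_count + 1) with integer division and a retry count of 2. Both the relation and the retry count are assumptions that this message does not state.

```python
def lnd_timeout(transaction_timeout: int, retry_count: int = 2) -> int:
    """Assumed derivation of the LND timeout, in seconds."""
    # Integer division: the transaction timeout is shared between the
    # first attempt and each retry.
    return (transaction_timeout - 1) // (retry_count + 1)
```

Under these assumptions a 50s transaction timeout gives 49 // 3 = 16, and 150s gives 149 // 3 = 49, matching the change from 16 to 49.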
Lustre-change: https://review.whamcloud.com/45780
Lustre-commit:
18b4e28f18d55291f8a14a3bd9ee84b1a686a93e
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I13a8b5d14230bb6e8936cb3e18540f19dbc62985
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53747
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 19 Jan 2024 19:28:05 +0000 (19:28 +0000)]
LU-16913 revert "EX-7849 quota: extra debug messages"
This reverts commit
02242f6f1ba1867756ee5b91abd2207f646436cf.
Extra debugging is no longer needed.
Change-Id: I083b70a911ac85fb5a1054c8409146bb393e94b0
Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53746
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Mon, 22 Jan 2024 16:40:20 +0000 (16:40 +0000)]
LU-12031 tests: update interop version checks
Update the version check in sanity test_270j and
sanity-hsm test_1f to match actual landing version.
Change-Id: Ifd6d5dec50e3fcbb7ebe77ab41335a6c3994ef57
Test-Parameters: trivial
Fixes:
3bccd95ca2 ("LU-12031 mdt: explicit data version of DoM files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53762
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Vitaliy Kuznetsov [Thu, 18 Jan 2024 11:31:36 +0000 (12:31 +0100)]
EX-8042 lipe: Fix size calculation when using -blocks option
This patch fixes the size calculation in the "-blocks"
option when specifying the exact size value "n[bkMG]".
Test-Parameters: trivial testlist=sanity-lipe-find3
Signed-off-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Change-Id: I5dd0ce69cef20ab9a9632798f350cf4c9f96cf25
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53723
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Fri, 19 Jan 2024 17:09:14 +0000 (18:09 +0100)]
DDN-4623 obdclass: fix upcall_cache_get_entry
When an entry is found while holding the read lock, we need to
convert to a write lock and find again, to check that entry was
not modified/freed in between.
In this case, the variable indicating an entry was found must be
reset, because we might not find any valid entry after all.
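The pattern described above can be sketched as follows. This is a simplified model, not the upcall cache code: a plain mutex stands in for both the read and the write side of the rwlock, the class and method names are hypothetical, and the between_locks hook exists only to simulate another thread acting while no lock is held.

```python
import threading


class UpcallCacheSketch:
    def __init__(self):
        self._lock = threading.Lock()  # stands in for the rwlock
        self._entries = {}

    def add(self, key, value):
        with self._lock:
            self._entries[key] = value

    def remove(self, key):
        with self._lock:
            self._entries.pop(key, None)

    def get_entry(self, key, between_locks=None):
        # "Read lock" phase: cheap lookup.
        with self._lock:
            found = key in self._entries
        if not found:
            return None
        # No lock is held here, so the entry may be modified or freed.
        if between_locks is not None:
            between_locks(self)
        # "Write lock" phase: search again from scratch.
        with self._lock:
            found = False  # reset: the earlier result is no longer trusted
            entry = self._entries.get(key)
            if entry is not None:
                found = True
            return entry if found else None
```

Without the reset, a caller could act on the stale "found" result from the first phase even though the second search came up empty.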
Fixes:
127128bed3 ("LU-16498 obdclass: change uc_lock to rwlock")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I111af4562ac78eeb22102a8a28943e46e30b4019
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53743
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 18 Jan 2024 09:29:17 +0000 (02:29 -0700)]
RM-620 build: New tag 2.14.0-ddn130
New tag 2.14.0-ddn130
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iff6822b7d54b0cf9e1946bccf20069fa2ec51e3e
Alexandre Ioffe [Fri, 8 Dec 2023 22:17:37 +0000 (14:17 -0800)]
EX-1878 lipe: resync all stale files
Add --resync-all-stales option (default is on).
This option allows lamigo by default to resync
all files if any component of a file
is stale, regardless of pool or OST location.
If the option is off, lamigo works the old way.
Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ibc26a21fa99f75de87a8e0328b183d96b7548c1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 17 Jan 2024 01:10:58 +0000 (17:10 -0800)]
EX-8353 csdc: rename "cp_comp_*" to "cp_compr_*"
This patch renames "cp_comp_type", "cp_comp_level",
and "cp_chunk_log_bits" to use "compr" in the name
to be consistent with other variable names.
Test-Parameters: trivial
Change-Id: I428ff3a789b33da02832dee02f316b02d97137e2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52761
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 21 Nov 2023 20:25:34 +0000 (15:25 -0500)]
EX-8993 ofd: use niocount consistently
'niocount' refers to the number of remote niobufs, ie, the
number of separate IOs from the client. Except for a few
places, where it's used to refer to the number of pages in
the entire RPC. Eek.
Replace this usage with 'npages', making niocount usage
consistent.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I266087ad8dccadb54c054b0a11fb03dc9868a725
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Fri, 12 Jan 2024 09:03:15 +0000 (04:03 -0500)]
EX-8971 pcc: add lctl pcc abort command to abort attaches
This patch adds a new PCC command "lctl pcc abort [--wait|-w]
[--detach|-d] $LUSTRE_MNTPT $PCCROOT".
--wait|-w: wait until all in-flight attaches are aborted.
--detach|-d: detach the PCC copies when scanning the PCC backend.
It can be used to abort in-progress attaches for a given PCC
backend. It does not remove the PCC backend from a client.
Add sanity-pcc/test_109 to verify it.
Change-Id: Ib7152f7418aa1beb840919e98bf8de53c99b5c54
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53656
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexandre Ioffe [Fri, 5 Jan 2024 04:54:11 +0000 (20:54 -0800)]
LU-17370 utils: simplify lfs-mirror-extend help text
Add list of lfs setstripe command line options
to help text of lfs mirror extend.
Simplify syntax of lfs mirror extend help text.
Update corresponding lfs-mirror-extend man page.
In man pages, use left adjustment and disable hyphenation
('.ad l', '.nh') to prevent hyphenation of keywords.
Lustre-change: https://review.whamcloud.com/53719
Lustre-commit: TBD (from
2a71d159d4ac98a3252f12796b8688bfa4d6df50)
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: I6cffcdb9651062e169f53868827646b876a82cb5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53598
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Wed, 20 Dec 2023 19:51:55 +0000 (14:51 -0500)]
EX-8851 lustre: add uncompressed size to compression header
It's useful to have the uncompressed size of the data in the
compression header. Also, we have three checksum fields -
compressed, uncompressed, and header, but in practice,
checksumming the compressed data including the header is
enough to cover all of these.
This patch cleans up all of this at the same time.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie82e0dbe9c862ddc88999b109cea1f27577dbbff
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53520
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Mon, 23 Oct 2023 07:29:07 +0000 (15:29 +0800)]
LU-17218 ofd: improve filter_fid upgrade compatibility
filter_fid could be expanded in a later Lustre version. With an
upgrade followed by a downgrade, the filter_fid EA on disk
could have been expanded during the upgrade, and won't work after
the downgrade.
This patch improves this process by allocating a bigger buffer to
hold the expanded filter_fid EA and then trimming the unrecognized
fields off.
Lustre-change: https://review.whamcloud.com/52798
Lustre-commit: TBD (from
cffd0a099c30794a63268da008958f722882119b)
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4c99f1d9f3962d46ebf9e9b799988ff3dba4f919
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53662
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andrew Perepechko [Tue, 26 Dec 2023 17:02:12 +0000 (20:02 +0300)]
LU-16637 llite: tolerate fresh page cache pages after truncate
Truncate called by ll_layout_refresh() can race with a fast read
or tiny write, which can add an uninitialized non-uptodate page
into the page cache.
We want to avoid expensive locking for this rare case so if there
is any leftover in the cache after truncate, just check that
the pages are not uptodate, not dirty and do not have any
filesystem-specific information attached to them.
Lustre-change: https://review.whamcloud.com/53554
Lustre-commit: TBD (from
f4c8d44a7c2f0fbc2c74d1832ff63c5216c22c38)
Change-Id: I8cadc022a3d1822a585f32e1a765e59ad0ff434d
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11937
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53611
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexey Lyashkov [Fri, 12 Jan 2024 18:55:55 +0000 (13:55 -0500)]
LU-17364 llite: don't use stale page.
Using a stale page for write might confuse the read path,
which expects every IO page to have the PG_uptodate flag set,
and this caused a panic when removing the page from the IO.
Lustre-Change: https://review.whamcloud.com/53550
Lustre-Commit: TBD (from
f7b42523e669d3653ca7c442fe82afde618bbdd5)
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Test-Parameters: testlist=sanityn env=SLOW=yes,ONLY=16k,ONLY_REPEAT=10
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Ia01129ceaecf53d8d9f301c26cd2d65122f6a267
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53666
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Mon, 15 Jan 2024 08:57:53 +0000 (09:57 +0100)]
LU-16498 obdclass: fix write unlock for internal case
Holding a (write) lock is mandatory for put_entry(), so fix that in
refresh_entry_internal().
Fixes:
127128bed3 ("LU-16498 obdclass: change uc_lock to rwlock")
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If55182ca29f37f2a783fdb88ba46512944a61c47
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 13 Jan 2024 02:51:06 +0000 (19:51 -0700)]
RM-620 build: New tag 2.14.0-ddn129
New tag 2.14.0-ddn129
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9310e8bddd0fd14b5c8c1faa109bdab19454eca1
Alex Zhuravlev [Tue, 12 Dec 2023 08:57:53 +0000 (11:57 +0300)]
LU-17354 osp: don't reset sequence client
Do not reset the sequence client if sequence allocation returned an
error; instead try to get the sequence later upon reconnection.
Lustre-change: https://review.whamcloud.com/53406
Lustre-commit: TBD (from
5cce95b35c652564b084f0721d4775d0fd522aa7)
Fixes:
6c4c51e307 ("LU-1445 osp: Use FID to track precreate cache.")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie23b688e4f93651c4615d77a9686c44a150d3961
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53417
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mikhail Pershin [Wed, 13 Dec 2023 12:43:53 +0000 (15:43 +0300)]
LU-17365 lod: handle llog errors gracefully
Distinguish remote llog errors by their source and type
in LOD lod_sub_prep_llog() and unify the errors reported
by llog_osd_read_header() and llog_init_handle().
- Partial llog header or 0-size llog is to be
reinitialized, new header is created
- in llog_read_header(), errors from dt_attr_get() and read_header()
are printed and returned as -EIO to the caller
- llog with invalid llog header data is skipped and a new
one is created to be used instead. To indicate that,
llog_init_handle() returns the -EINVAL error code instead
lod_sub_recovery_thread() retry logic while bad llog
content will lead to immediate llog re-creation.
- lod_sub_init_llogs() tries to init all targets even
if some failed
- always recreate llogs after recovery abort no matter
if ctxt->loc_handle exists or not
Patch tries to cover known issues and types of error during
update log recovery and provides also better debug for
similar cases in future
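The split between the two error classes can be sketched as a small dispatch on the return code. The function name and the action strings are hypothetical; only the roles of -EINVAL (bad llog content) and -EIO (read or network failure) come from the description above.

```python
import errno


def llog_init_action(rc: int) -> str:
    """Map a llog init return code to the follow-up action (illustrative)."""
    if rc == 0:
        return "use"       # header is valid, use the llog as-is
    if rc == -errno.EINVAL:
        return "recreate"  # invalid header data: create a new llog now
    return "retry"         # e.g. -EIO: leave it to the recovery retry logic
```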
Lustre-change: https://review.whamcloud.com/53510
Lustre-commit:
e81805244476f1d3ffb5a2ecb0a85f54b936ce51
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2705e0dc245ed4365123ce47137193a9ed769673
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53630
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Dmitry Ivanov [Mon, 16 May 2022 18:15:19 +0000 (12:15 -0600)]
LU-10283 mdd: fix parent FID in changelog of striped directory
Changelog entry for the file operations such as create, rename,
link, unlink, mkdir referred to parent FID ("p=") as a shard's
FID in a striped directory. The same was true for the source's
parent FID ("sp="). This commit hides these Lustre internals from the
user, displaying the parent directory's FID instead, as expected.
An object might be in a remote MDT, in which case obtaining the parent
FID via the linkEA can be an expensive operation, so the parent FID is
cached in the mdd_object, so that the cost of the cross-MDT RPC is
amortized over the lifetime of the object.
Certain userspace tools might depend on the previous behavior of
displaying the shard's parent FID in the changelog records, so this
can be enabled by setting mdd.*.enable_shard_pfid=1, if this is
required for compatibility.
Lustre-change: https://review.whamcloud.com/51322
Lustre-commit:
3554923af9e3260235865d90949ecd2924bbbc0e
HPE-bug-id: LUS-10721
Signed-off-by: Dmitry Ivanov <dmitry.ivanov2@hpe.com>
Change-Id: Iae15b49f5852f36ba62ae1706d3a5f4ebf307bc4
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53475
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Sebastien Buisson [Thu, 14 Sep 2023 16:00:04 +0000 (18:00 +0200)]
LU-16498 obdclass: change uc_lock to rwlock
Change the upcall cache uc_lock to a read-write lock so that threads
can get the read lock to do concurrent lookups in the upcall cache,
and only grab the write lock in the rare case when a new entry is
added or old entries are expired. That reduces serialization between
server threads during normal operation, and avoids all of the threads
spinning for some time if the requested key (UID or gss context) is
not in the cache at all, before they sleep.
Lustre-change: https://review.whamcloud.com/52395
Lustre-commit: TBD (from
003615a0a6711334d95c42f3c41852e1cbc8e77b)
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I812400104fd2115d19386fb4a03bb3ce99c49383
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53622
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Mon, 18 Dec 2023 13:59:30 +0000 (14:59 +0100)]
LU-17374 gss: get rid of rsi cache entries after req handle
RPCSEC init requests are kept in the rsi cache. While this is useful
during request processing involving upcall/downcall with userspace,
rsi entries are never used again once RPCSEC init requests have been
handled completely.
And keeping entries in the rsi cache has some impact on authentication
speed. When a new RPCSEC init request is received, the first step is
to check if there is a valid matching entry in the cache. It is never
the case, except if an authentication request is replayed, but GSS
rejects that anyway.
So we spend time browsing a cache from which we expect no match. Even
if the upcall cache mechanism takes this lookup opportunity to remove
invalid or expired entries, it is even better to remove cache entries
as soon as we know they are done.
Lustre-change: https://review.whamcloud.com/53488
Lustre-commit:
7a56a689d4aa588bd003e35fdb93d87cf1e56d1d
Test-Parameters: kerberos=true testlist=sanity-krb5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia9946578c3d3149e6235d832df28214ae8984f1e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53610
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Fri, 15 Dec 2023 18:57:02 +0000 (21:57 +0300)]
EX-7849 quota: notify newest lqe in qmt_set_id_notify
It is possible that lqe_locate may call lqe_find inside
qmt_pool_lqes_lookup_spec and insert the 2nd lqe into
lqs_hash during processing the previous one. Do not add the
1st lqe to be processed by qmt_reba_thread in qmt_id_lock_notify,
as this lqe will be freed at the end of lqe_locate_find due
to the race with the 2nd one that already exists in lqs_hash.
This fix should potentially fix the following assertion:
(qmt_lock.c:950:qmt_id_lock_glimpse()) ASSERTION( lqe->lqe_gl ) failed:
(qmt_lock.c:950:qmt_id_lock_glimpse()) LBUG
Lustre-change: https://review.whamcloud.com/53637
Lustre-commit:
2832874970232fb5e1deedbf89b7a482518e6886
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Test-Parameters: trivial testlist=sanity-quota,racer
Fixes:
09f9fb3211 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I179edb06ec8c784636f566ffeba0035c6758a55b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Tue, 9 Jan 2024 07:09:10 +0000 (23:09 -0800)]
EX-7795 tests: add sanity-compr test for compression
This patch adds a sanity-compr test to validate that
we get space usage reduction with compression.
The test uses ll_compression_scan tool to calculate
the compressed size of the source file and compares
it with the size of the Lustre CSDC compressed file.
Test-Parameters: trivial env=ONLY=1007 testlist=sanity-compr
Change-Id: Icf763331205a3e937b794f90444f756fc59f9050
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 3 Jan 2024 22:23:20 +0000 (17:23 -0500)]
EX-7601 ldiskfs: fix detection of compressed extent
The code in ldiskfs_map_inode_pages which detects a
compressed extent checks the first lnb for that extent, but
it's possible for some lnbs and not others to be compressed
in a given extent, so we must check all of them. This
occurs when multiple writes have been combined into one
RPC.
If we don't detect compression correctly, we won't set the
file size correctly and we'll get data corruption.
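The difference between checking only the first lnb and checking all of them can be shown with a toy model. The Lnb class and both function names are hypothetical stand-ins and do not mirror the ldiskfs structures.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Lnb:
    compressed: bool


def extent_compressed_first_only(lnbs: Sequence[Lnb]) -> bool:
    # Old behaviour as described: look at the first lnb of the extent only.
    return bool(lnbs) and lnbs[0].compressed


def extent_compressed(lnbs: Sequence[Lnb]) -> bool:
    # Fixed behaviour as described: any compressed lnb marks the extent.
    return any(lnb.compressed for lnb in lnbs)
```

For an extent built from an uncompressed write followed by a compressed one, the first check misses the compression while the second detects it.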
Fixes:
b489a2a397 ("EX-7600 osc: save compressed object size")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I11d50bdc45c40d93bb1b829fcd930165b7626432
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53588
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 3 Jan 2024 21:39:31 +0000 (16:39 -0500)]
EX-7601 osc: add COMPR_GAP check to compress_request
Currently, compress_request will build the compression
buffer (calling merge_chunk()) for requests which are less
than the minimum compression gap. This is noticed in the
compression code when it checks if there's enough data to
attempt compression, but we can do a trivial check in
compress_request() to save that work.
Also fix a few minor style things.
This is not an important fix, but I discovered it while
investigating another issue and it's trivial to resolve.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ieb32e6297e10d229f23c58e2ef4d933ce3dda4f2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Thu, 7 Sep 2023 09:29:42 +0000 (17:29 +0800)]
EX-6269 csdc: tune allowed compression type on server
Use lctl get_param {mdt|obdfilter}.*.compress_types to list supported
compression types on server.
Use lctl set_param {mdt|obdfilter}.*.compress_types="+gzip-lzo" to
add gzip to and delete lzo from existing compression types on server.
Server will negotiate supported compression types with client in
ocd_compress_types and client import stats could show the negotiated
supported compression types in "lctl get_param {mdc|osc}.*.import"
connect_data section:
import:
    ....
    connect_data:
        ....
        compress_types: [ fast,best,gzip,lz4fast,lz4hc,lzo ]
The OST support for the connect flags is enabled in this patch, but
MDT enabling is pending unaligned compression support for DoM files.
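The "+gzip-lzo" value shown above adds one type and removes another in a single setting. A minimal sketch of applying such a string to a set of type names follows. The parsing rule (a sign followed by an alphanumeric name) and the function name are assumptions for illustration, not the server-side parser.

```python
import re
from typing import Iterable, Set


def apply_compress_types(current: Iterable[str], spec: str) -> Set[str]:
    """Apply a "+name-name..." edit string to a set of compression types."""
    result = set(current)
    for sign, name in re.findall(r"([+-])([A-Za-z0-9]+)", spec):
        if sign == "+":
            result.add(name)
        else:
            result.discard(name)  # removing an absent type is a no-op
    return result
```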
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I18943352e25ed9d5fe82442df9f00a7ef388f242
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52307
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 5 Jan 2024 23:26:05 +0000 (23:26 +0000)]
EX-6269 osc: handle different compression types
Allow the client to handle different compression types in a single
component. This shouldn't happen normally, but it may happen in
the future if there is dynamic compression algorithm selection for
"fast" or "best" types (e.g. compress based on available CPU and
network bandwidth or RPC backlog).
Change-Id: Ide2731c60a68584e7cbb474bee88a17e9a7b8fec
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53602
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Patrick Farrell [Fri, 5 Jan 2024 20:24:08 +0000 (15:24 -0500)]
EX-6269 osc: decompress with algorithm from server
Data may not be compressed with the compression type and
level from the layout, so we must use the compression type
and level from storage for decompression.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib4cdccf294ef631a25147413d7f5c1a847c9504e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 9 Jan 2024 23:20:05 +0000 (18:20 -0500)]
EX-6269 obd: add 'lvl' for best and fast
'best' and 'fast' compression types must also set a level,
because not all levels are supported by all algorithms.
Rather than trying to be clever, just use simple universally
supported values, except for lz4fast, where we special case
this, because otherwise '0' is the slowest setting (and
lz4fast is likely to remain our default fastest).
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7c29659d4f027af2e44285ae38e4c9d91e35509a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53600
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 9 Jan 2024 04:06:50 +0000 (21:06 -0700)]
EX-8739 tests: skip sanity-pcc tests on el9.3
Skip sanity-pcc test_6, test_7a/7b, test_23, test_35 on RHEL9.3
clients due to continuous failures with PCC-RW, which is unused.
Skip sanity-pcc test_102 due to el9.3 fio io_uring bug.
Test-Parameters: trivial testlist=sanity-pcc clientdistro=el9.3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I76cbd0342788fff8b0167c0656e941f96d73fc48
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53618
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Mon, 8 Jan 2024 06:26:02 +0000 (23:26 -0700)]
LU-12998 tests: fix conf-sanity/112a version check
Fix the version number in the conf-sanity test_112a version check,
which was skewed before the patch landed.
Fixes:
b2be94f559 ("LU-12998 mds: add no_create parameter to stop creates")
Test-Parameters: trivial testlist=conf-sanity env=ONLY=112 serverversion=EXA6
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I080de5421b918cf5e0d692740fb37b514a6c1014
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 6 Jan 2024 08:22:54 +0000 (01:22 -0700)]
RM-620 build: New tag 2.14.0-ddn128
New tag 2.14.0-ddn128
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie84bfc97c732043030769c183a7e8a879bb3e0f1
Andreas Dilger [Thu, 4 Jan 2024 00:07:34 +0000 (00:07 +0000)]
LU-17289 test: fix sanity/906 version check
Fix the version check in test_906 to include RHEL9.3.0.
Change-Id: I7e066cdd16946b541fee96281dd5a5c90daa7072
Fixes:
a6739c9c9a ("LU-17289 test: disable sanity/test_906 temporarily")
Test-Parameters: trivial testlist=sanity clientdistro=el9.3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lei Feng [Sun, 10 Dec 2023 08:45:38 +0000 (16:45 +0800)]
LU-17352 utils: lljobstat can read dumped stats files
Improve the lljobstat command to read dumped stats files.
Such a file is usually generated with the command:
lctl get_param *.*.job_stats > all_job_stats.txt
Multiple files can be specified with multiple --statsfile
options. For example:
lljobstat --statsfile=1.txt --statsfile=2.txt
Stats data from multiple files will be added up and
sorted. Then the top jobs will be listed.
Try to use CLoader to accelerate the YAML parsing.
Handle SIGINT and exit silently if lljobstat is in the loop
of reading system job_stats files periodically.
Fix a bug when the job_id is a pure number.
Lustre-change: https://review.whamcloud.com/53397
Lustre-commit:
ef2555d7af21bd35756805b13e6b458f56cecf54
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: Iee1ce69d2befb9d021e34effd4fc65a47297c1fb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53582
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
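To illustrate the LU-17352 description above, the "added up and sorted" step might look like the following minimal sketch. It is not the lljobstat source: `merge_job_stats`, `top_jobs`, and the simplified input shape (one `{job_id: {op: count}}` dict per already-parsed stats file) are assumptions made for illustration. Job IDs are normalized to strings so that a purely numeric job_id merges like any other.

```python
from collections import Counter, defaultdict


def merge_job_stats(parsed_files):
    """Add up per-operation counts for each job across several parsed files.

    parsed_files: iterable of {job_id: {op_name: count}} dicts. This is a
    hypothetical, simplified shape; real job_stats output has more fields.
    """
    totals = defaultdict(Counter)
    for stats in parsed_files:
        for job_id, ops in stats.items():
            # A YAML parser may load a purely numeric job_id as an int,
            # so normalize every key to str before merging.
            totals[str(job_id)].update(ops)
    return totals


def top_jobs(totals, count=5):
    """Return (job_id, total_ops) pairs, busiest job first."""
    ranked = sorted(
        ((job_id, sum(ops.values())) for job_id, ops in totals.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:count]


# Two small in-memory "files"; note the numeric job id in the first one.
file1 = {"cp.1000": {"write": 10, "read": 2}, 12345: {"open": 4}}
file2 = {"cp.1000": {"write": 5}, "12345": {"open": 1, "close": 1}}
merged = merge_job_stats([file1, file2])
print(top_jobs(merged, count=2))
```

Keeping the merge separate from the ranking means the same totals could be re-ranked by a different key without re-reading the files.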
Bobi Jam [Mon, 28 Aug 2023 13:08:34 +0000 (21:08 +0800)]
LU-17048 mdd: protect layout change in MDD layer
We need to detect changes to the LOD layout in between transaction
declaration and when the objects are locked during transaction
execution. Otherwise, if another thread has modified the layout
of an object used by the transaction then the declaration may
be incorrect.
This patch saves the objects' layout generation in the transaction
declaration phase and checks whether it has been changed by others in
the transaction execution phase. If it has, the transaction is retried
several times.
Lustre-change: https://review.whamcloud.com/52146
Lustre-commit:
d5ab62af24166529b84b4d7227b96d3a69989a95
Fixes:
b7bd4e3422 ("LU-14621 mdd: fix lock-tx order in mdd_xattr_merge()")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I25fe03c6e8fc4eebccc039e62dfc88db1179cb26
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
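The declare/check/retry pattern described in LU-17048 above can be sketched in a few lines. This is only an illustration of the general technique (an optimistic generation check), not the MDD code: `FakeObject`, `run_transaction`, the `between_phases` hook, and the retry limit of 3 are all hypothetical.

```python
import threading


class LayoutChanged(Exception):
    """Raised when the layout generation moved between declare and execute."""


class FakeObject:
    """Hypothetical stand-in for an object that carries a layout generation."""

    def __init__(self):
        self.layout_gen = 0
        self.lock = threading.Lock()


def run_transaction(obj, apply_change, max_retries=3, between_phases=None):
    """Declare, then execute; retry if the layout changed in between.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_retries + 1):
        # Declaration phase: remember the generation the declaration used.
        declared_gen = obj.layout_gen
        if between_phases is not None:
            between_phases(attempt)  # hook that simulates another thread
        try:
            with obj.lock:
                # Execution phase: the object is locked, so compare now.
                if obj.layout_gen != declared_gen:
                    raise LayoutChanged()
                apply_change(obj)
                return attempt
        except LayoutChanged:
            continue  # the declaration is stale; redo it from scratch
    raise RuntimeError("layout kept changing; giving up")


obj = FakeObject()


def racer(attempt):
    # Simulate a concurrent layout change during the first attempt only.
    if attempt == 1:
        obj.layout_gen += 1


attempts = run_transaction(obj, lambda o: None, between_phases=racer)
print(attempts)
```

Bounding the retries keeps a persistently changing layout from stalling the caller indefinitely.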
Patrick Farrell [Thu, 4 Jan 2024 18:56:29 +0000 (13:56 -0500)]
EX-7601 osc: debug fix in decompress_request
Debug message had an incorrect subtraction.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5daf360766ca77b98dc5af3d72c42ac38f5782bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Fri, 29 Dec 2023 20:10:58 +0000 (15:10 -0500)]
EX-7601 tests: add mmap write test
This improves the existing mmap test to test mmap writing
as well as mmap reading.
Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1003",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81840c7bbbefbc5c3bae6b270c2d94297a254d19
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53307
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 29 Dec 2023 20:10:36 +0000 (15:10 -0500)]
EX-7601 tests: add multi-mount compression test
This adds a multi-mount correctness test for compression.
This races IO from two mountpoints at varying sizes to
stress test compression.
Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY=1006
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If49cbd6d171068faa802835146f273d835b39bc3
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51842
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 29 Dec 2023 20:09:49 +0000 (15:09 -0500)]
EX-7601 tests: tests for read-modify-write
This patch adds tests for the read-modify-write case for
EX-7601. There are still some additional tests to be added
here, but this is a good start.
Test-Parameters: trivial
Test-Parameters: testlist=sanity-compr env=ONLY="1004 1005",ONLY_REPEAT=10
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5dd9e566b8274ece99283c8962e0d34225089cc0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53230
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Tue, 19 Dec 2023 04:19:44 +0000 (23:19 -0500)]
EX-7601 osc: add check to decompress_request
decompress_request should check whether there's room in
the RPC for the decompressed data. A lack of room can occur
if there's a bug or data corruption, and without the check
we would go past the end of the RPC during decompression.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib1bf19bf39701b72f0f5a61b2aaff2f2fdad1897
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
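The kind of check described in the decompress_request commit above can be illustrated with a small, generic sketch. It uses Python's zlib rather than the Lustre compression code, and `decompress_into` with its parameters is a hypothetical helper: it refuses to decompress when the destination buffer has no room, and also bounds the output in case the declared length is wrong.

```python
import zlib


def decompress_into(buf, offset, compressed, expected_len):
    """Decompress `compressed` into buf[offset:], refusing to overrun buf."""
    # Check up front that the decompressed data fits in the remaining space.
    if offset < 0 or expected_len <= 0 or offset + expected_len > len(buf):
        raise ValueError("no room in buffer for decompressed data")
    # Allow one spare byte of output so that data which expands beyond
    # expected_len (for example, corrupted input) is detected, not copied.
    data = zlib.decompressobj().decompress(compressed, expected_len + 1)
    if len(data) != expected_len:
        raise ValueError("decompressed size does not match expected length")
    buf[offset:offset + expected_len] = data
    return expected_len


payload = b"lustre" * 10            # 60 bytes of sample data
rpc_buf = bytearray(64)             # stand-in for the space in the RPC
written = decompress_into(rpc_buf, 4, zlib.compress(payload), len(payload))
print(written)
```

Both failure modes raise before anything is copied, so the destination buffer is never partially overwritten.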
Patrick Farrell [Wed, 13 Dec 2023 23:33:16 +0000 (18:33 -0500)]
EX-7601 ofd: add checks to io_lnb_to_tx_lnb
We should always be able to find the remote niobuf in the
local io range. If we can't, there's a bug, so assert on
this.
We should also never have page level overlapping remote IOs,
at least until we have unaligned DIO. (We can remove this
check when we combine the features.)
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I325d4a37d25c116e42621964e90b225b71fd8f1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53450
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
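A rough sketch of the two checks described in the io_lnb_to_tx_lnb commit above, under stated assumptions: remote buffers are modeled as `(offset, length)` tuples sorted by offset, the page size is taken to be 4096, and `check_remote_bufs` is a hypothetical name. The real code works on niobuf structures, not tuples.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def check_remote_bufs(remote, local_start, local_end):
    """Assert on the two invariants described in the commit message.

    remote: list of (offset, length) tuples sorted by offset.
    Each buffer must lie inside [local_start, local_end), and no two
    buffers may touch the same page.
    """
    last_page = -1
    for offset, length in remote:
        assert length > 0, "empty remote buffer"
        assert local_start <= offset and offset + length <= local_end, \
            "remote buffer not found in local IO range"
        first_page = offset // PAGE_SIZE
        assert first_page > last_page, "page-level overlap between remote IOs"
        last_page = (offset + length - 1) // PAGE_SIZE


# Two buffers on different pages inside the local range: passes silently.
check_remote_bufs([(0, 4096), (4096, 100)], 0, 8192)
```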
Patrick Farrell [Tue, 12 Dec 2023 17:35:25 +0000 (12:35 -0500)]
EX-7601 ofd: add past eof check for reads
The client does not normally generate reads past EOF, but
this can occur during some racing situations. We need to
check for that case and not attempt decompression, since
there's no data to decompress if we're reading past EOF.
This covers a failure which shows up occasionally in the
racing parts of the test suite, but it's challenging to
write an explicit test for this.
We also add handling for complete reads of the last chunk,
even if that chunk is partial, because we can send that to
the client for decompression.
This allows us to remove the slightly funky eof handling
in decompress_rnb, since we'll just not call that code in
this case now. Note we'll still call decompress_rnb, etc,
for writes if they start before EOF and finish after EOF
(and are unaligned). This is fine - this case should be
rare and if we hit it, we'll notice there's nothing to
decompress and proceed accordingly.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I50295f2803af611de5069d094c0a5d1b0a4a9c2d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 12 Dec 2023 17:29:22 +0000 (12:29 -0500)]
EX-7601 ofd: put decompress_read in read prep
ofd_decompress_read is called from ofd_write_prep for
writes, but from tgt_brw_read for reads. This makes the
code a little harder to follow and makes it difficult to
check read side decompression against EOF.
Instead, we move the decompression call to ofd_preprw_read.
This makes no change to the real operations here, but makes
for better code (and more similar code between read and
write).
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibefd0a48ad08e83725f2df64618db60ba61c5ce0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53427
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 12 Dec 2023 17:07:35 +0000 (12:07 -0500)]
EX-7601 ofd: same-ify preprw_read and preprw_write
preprw_read and preprw_write have some sections which are
functionally the same but which have diverged slightly.
(These can't easily be shared between the functions.)
This is a short patch to make them more similar before
adding eof checking to reads.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7bce912e99e61a4eec4060d6b49d4917894b44c4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53426
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 12 Dec 2023 16:37:48 +0000 (11:37 -0500)]
EX-7601 ofd: don't read for writes past eof
There's no data past EOF, so there's no need to do
read-modify-writes when the entire write is past the chunk
at EOF. So in that case, don't read up data and don't
attempt decompression.
There's no explicit test for this, but this shows up
immediately in the random-offset copy tests, because they
seek and write various sizes to offsets past current EOF.
We also need this functionality for reads, because in some
cases the client will do reads past EOF (this is unusual,
but can still happen sometimes). This is added in a
separate patch because it requires some code reorganization.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia2b598165d5645c5a44c3d58bea69c7e42f10e41
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
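The EOF rule in the commit above ("the entire write is past the chunk at EOF") comes down to a little chunk arithmetic. The sketch below covers only that EOF test, not the separate question of whether the write is chunk-aligned; `needs_read_for_write` and the 64 KiB chunk size are assumptions for illustration.

```python
def needs_read_for_write(write_start, eof, chunk_size):
    """Return True if a write starting at write_start may touch existing data.

    The chunk containing EOF ends at EOF rounded up to a chunk boundary.
    A write that starts at or beyond that boundary has nothing to read.
    """
    eof_chunk_end = -(-eof // chunk_size) * chunk_size  # ceiling to chunk
    return write_start < eof_chunk_end


CHUNK = 64 * 1024  # assumed chunk size
print(needs_read_for_write(CHUNK - 1, 100, CHUNK))   # inside the EOF chunk
print(needs_read_for_write(CHUNK, 100, CHUNK))       # entirely past it
```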
Patrick Farrell [Tue, 12 Dec 2023 04:35:24 +0000 (23:35 -0500)]
EX-7601 ofd: multiple reads in same chunk
When doing DIO or if we get unusual cache behavior on the
client, multiple reads can hit the same chunk.
This only shows up in racing tests, but it's important to
handle. We do this by making sure we start searching the
lnbs for decompression at the start of the last chunk we
decompressed.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I81fbbba79b16066e6d4519c66030cc58e03d2de7
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53419
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 2 Jan 2024 08:26:11 +0000 (01:26 -0700)]
RM-620 build: New tag 2.14.0-ddn127
New tag 2.14.0-ddn127
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I38dd7ae99d4594896d14224de68d6b42e83fde10
Patrick Farrell [Fri, 29 Dec 2023 19:49:55 +0000 (14:49 -0500)]
EX-7601 tgt: objcount in RPC must be 1
Much of the BRW write code assumes objcount is one, but
there is some provision for multiple objects.
Since the code will break if we send it multiple objects,
add errors to make sure anyone changing it will notice.
This isn't strictly compression related, but compression
adds even more code which assumes this, so this protection
will be useful.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idcbf33fd14d4b1bd179c9516bed07cca907008bc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 16 Dec 2023 22:40:16 +0000 (17:40 -0500)]
EX-8826 ofd: set compressed file size for fake writes
When using the fake writes fail_loc, file size setting is
done at the ofd layer, since the osd layer isn't used. So
we must also handle the compressed file size for this case.
This fixes sanity test 399a with compression.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icda612405908166d043e1e568d0d8bd9cd0c5156
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Mon, 11 Dec 2023 23:15:44 +0000 (18:15 -0500)]
EX-7601 ofd: minor debug improvements
A smattering of minor debug improvements across several
patches, placed at the end because they're all minor and
some of them would disturb early parts of the series.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2071911eb09f5c7fad28203db05396bb31ccda59
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Mon, 20 Nov 2023 00:40:27 +0000 (19:40 -0500)]
EX-7601 osd: add assert for prepare partial page
In the write prep code, we read up any partial pages (pages
which are not completely overwritten by the write) to
prepare them for write. But for compressed files, we will
have already done this to prepare for decompression.
Add an assert to make sure we catch if this is ever wrong.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1366b1f5b191a4d581448d692933d562198c3a1f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Sun, 5 Nov 2023 15:55:29 +0000 (10:55 -0500)]
EX-7601 ofd: create read mapping for read-modify-write
When we need to do a read-modify-write for unaligned writes
to a compressed file, it's important we read only the
portion of the file which is receiving unaligned IO.
This patch identifies these chunks in preprw_write and
creates a read lnb mapping from a subset of the pages for
write. These pages we read up are then decompressed.
Note one issue this patch does not address is reading of
data past EOF. If the final chunk is unaligned, we will
round the write to cover it. This results in extending the
file inappropriately, writing zeroes where they aren't
needed. The read side gives us the info to address this,
which we will do in a future patch.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iede43f12127cbb93e73c22a915192aa2f814a927
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52997
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
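As a sketch of the idea in the read-mapping commit above, the chunks that need a read before an unaligned write are the ones the write covers only partially, which can only be the first and the last chunk it touches. `chunks_to_read` and the byte-offset model are hypothetical; the actual patch works on lnb page mappings in preprw_write.

```python
def chunks_to_read(write_start, write_end, chunk_size):
    """Return start offsets of chunks only partly covered by the write.

    The write covers the byte range [write_start, write_end).
    """
    if write_end <= write_start:
        raise ValueError("empty or inverted write range")
    first = (write_start // chunk_size) * chunk_size
    last = ((write_end - 1) // chunk_size) * chunk_size
    chunks = []
    if write_start % chunk_size:
        chunks.append(first)   # write begins partway into its first chunk
    if write_end % chunk_size and last not in chunks:
        chunks.append(last)    # write ends partway into its last chunk
    return chunks


# A write that begins and ends mid-chunk needs both edge chunks read;
# the fully overwritten chunk in the middle is not read at all.
print(chunks_to_read(500, 2500, 1000))
```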
Patrick Farrell [Fri, 3 Nov 2023 20:29:51 +0000 (16:29 -0400)]
EX-7601 ofd: distinguish nr_write and nr_read
We will have two counts of pages in lnbs, so distinguish
between them.
Not actually used yet - will be calculated when the read
lnb mapping is created.
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I709b8fd299163d348a196184152bb0294fcb650b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Patrick Farrell [Fri, 3 Nov 2023 20:22:22 +0000 (16:22 -0400)]
EX-7601 ofd: add read lnb to ofd_preprw_write
The read phase of read-modify-write for compressed files
needs to read only a subset of the pages which will be
written, so it needs a separate set of lnb pointers for
tracking this subset.
This patch passes around the necessary argument but does
not set up or use the lnb yet.
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7ec7101e65e73a6c9e67cea3c58d8cace38e70e0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/52984
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Alexandre Ioffe [Thu, 21 Dec 2023 06:53:42 +0000 (22:53 -0800)]
LU-17370 utils: simplify lfs help text
Simplify help text for lfs getstripe and lfs setstripe.
Update corresponding man pages lfs-getstripe and lfs-setstripe.
In the man pages, use left-side adjustment ('.ad l') and disable
hyphenation ('.nh') to prevent hyphenation of keywords.
Lustre-change: https://review.whamcloud.com/53564
Lustre-commit: TBD (from
6c3dae58eddc2e3c7caf35599733b2e59ebeb657)
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial
Change-Id: Iae9d3534230ee7d325fbeffd78b5c12632a4a161
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53523
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>