Whamcloud - gitweb
fs/lustre-release.git
3 months ago LU-13683 lfs: return -ENOENT when invoked on non-existing file 09/40709/3 b2_12
Sebastien Piechurski [Tue, 16 Jun 2020 16:14:55 +0000 (18:14 +0200)]
LU-13683 lfs: return -ENOENT when invoked on non-existing file

Since the merge of LU-11510, running lfs migrate on a non-existing file
prints the error "lfs migrate: can't create composite layout from
file /some/path/to/file" but exits with code 0, potentially
leaving a calling script unaware of the error.

This patch fixes that by returning the errno value set by the call to
llapi_layout_get_by_path().
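
The error propagation described here can be sketched as follows. This is
a minimal illustration, not the actual lfs code: open_layout_source() and
migrate_check() are hypothetical names, and open(2) is used only as a
stand-in for a library call that sets errno on failure.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a library call that reports failure
 * through errno. Returns a file descriptor, or -1 with errno set.
 */
static int open_layout_source(const char *path)
{
	return open(path, O_RDONLY);
}

/*
 * Return 0 on success or a negative errno value on failure, so that
 * the caller can turn it into a non-zero exit code.
 */
static int migrate_check(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open_layout_source(path);
	if (fd < 0)
		return -errno;	/* e.g. -ENOENT for a missing file */

	close(fd);
	return 0;
}
```

A caller would typically map any negative return value to a non-zero
exit status, so that scripts can detect the failure.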

Lustre-change: https://review.whamcloud.com/38953
Lustre-commit: 52d7cb5913c1e653a89d3a4de5f39c0e596dd28c

Signed-off-by: Sebastien Piechurski <sebastien.piechurski@atos.net>
Change-Id: I910eae78445f6071ff4e741afd43d140f554ab22
Fixes: 8bedfa377fbd ("LU-11510 lfs: migrate a composite layout file correctly")
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
3 months ago LU-15938 llog: llog_reader to detect more corruptions 10/48310/2
Mikhail Pershin [Tue, 12 Jul 2022 06:40:38 +0000 (09:40 +0300)]
LU-15938 llog: llog_reader to detect more corruptions

Improve llog_reader to detect more corruptions and report
errors:
 - notify if the llog bitmap has bits set with no records in the llog
 - compare the header's record count with the number of records
   actually found
 - fix the number of records to output, preventing wrong output of
   a NOT SET record
 - list missing records in a gap if one is found
 - count all errors found and add the prefix 'error:' to the output
   for easier processing by third-party scripts
 - don't exit immediately on error, but continue if
   possible and output all valid data that was read
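
The count checks in the list above could look roughly like the sketch
below. It is a simplified illustration under assumptions: the function
names and the flat bitmap layout are hypothetical and do not mirror the
real llog header structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Count the bits set among the first nbits bits of a bitmap. */
static unsigned int bitmap_count(const uint32_t *bitmap, size_t nbits)
{
	unsigned int set = 0;
	size_t i;

	for (i = 0; i < nbits; i++)
		if (bitmap[i / 32] & (1U << (i % 32)))
			set++;
	return set;
}

/*
 * Compare the record count from the header with the bitmap and with
 * the number of records actually found. Every mismatch is reported
 * with an "error:" prefix and counted; checking continues afterwards.
 */
static int check_counts(unsigned int hdr_count, const uint32_t *bitmap,
			size_t nbits, unsigned int found)
{
	unsigned int in_bitmap;
	int errors = 0;

	assert(bitmap != NULL);
	in_bitmap = bitmap_count(bitmap, nbits);
	if (in_bitmap != hdr_count) {
		printf("error: bitmap has %u bits set, header says %u\n",
		       in_bitmap, hdr_count);
		errors++;
	}
	if (found != hdr_count) {
		printf("error: found %u records, header says %u\n",
		       found, hdr_count);
		errors++;
	}
	return errors;
}
```

Returning the number of errors instead of stopping at the first one
matches the stated goal of reporting as much as possible in one run.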

Lustre-change: https://review.whamcloud.com/47934
Lustre-commit: d914a5b7a49ac6b61c0191a0966d1f684a6957b6

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic47dc6bb6cbdd9db6f888a0b892254403a628912
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48310
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-6612 utils: strengthen llog_reader vs wrong format/header 09/48309/2
Bruno Faccini [Mon, 20 Jul 2015 14:30:11 +0000 (16:30 +0200)]
LU-6612 utils: strengthen llog_reader vs wrong format/header

The following snippet shows that llog_reader can be confused by
an invalid record count of 0 when parsing what is expected to be
an LLOG file header:
root# dd if=/dev/zero bs=4096 count=1 of=/tmp/zeroes
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000263962 s, 15.5 MB/s
root# llog_reader /tmp/zeroes
Memory Alloc for recs_buf error.
Could not pack buffer; rc=-12

Lustre-change: https://review.whamcloud.com/15654
Lustre-commit: 45291b8c06eebf33d3654db3a7d3cfc5836004a6

Test-Parameters: trivial testlist=sanity,sanity-hsm
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I12be79e6c6a5da384a5fd81878a76a7ea8aa5834
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48309
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-16000 utils: align updatelog parameters in llog_reader 08/48308/2
Etienne AUJAMES [Fri, 8 Jul 2022 10:36:10 +0000 (12:36 +0200)]
LU-16000 utils: align updatelog parameters in llog_reader

Parameters in update log records are aligned on 64 bits. llog_reader
does not align these parameters: if a parameter's size is not a multiple
of 8, the next parameter's size will be read incorrectly.
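
As a sketch of the alignment rule, the helpers below round a size up to
a multiple of 8 bytes and compute where each parameter starts. The names
round_up8() and param_offset() and the flat size array are assumptions
made for illustration, not the llog_reader implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to the next multiple of 8 bytes (64 bits). */
static size_t round_up8(size_t size)
{
	return (size + 7) & ~(size_t)7;
}

/*
 * Return the offset at which parameter number n starts, assuming
 * every preceding parameter is padded to an 8-byte boundary.
 */
static size_t param_offset(const uint16_t *sizes, size_t n)
{
	size_t off = 0;
	size_t i;

	assert(sizes != NULL || n == 0);
	for (i = 0; i < n; i++)
		off += round_up8(sizes[i]);
	return off;
}
```

For parameter sizes 5, 16 and 3, the parameters start at offsets 0, 8
and 24. Without the padding the second parameter would be read from
offset 5, which is the kind of misread described above.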

Lustre-change: https://review.whamcloud.com/47913
Lustre-commit: 6d74b759634355e7f6647ccaefef519a1ff208e2

Test-Parameters: trivial
Fixes: 9962d6f ("LU-14617 utils: llog_reader updatelog support")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I6871614ab4ea79d59c3c3b4644b377de395bad56
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48308
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-15584 utils: ppc64le __le64_to_cpu type mismatch 07/48307/2
Gian-Carlo DeFazio [Tue, 22 Feb 2022 20:23:30 +0000 (12:23 -0800)]
LU-15584 utils: ppc64le __le64_to_cpu type mismatch

Cast values returned by __le64_to_cpu to
long long unsigned int. This is to match print format
strings that use %llx. This mismatch was resulting in a
build failure for ppc64le.

Build log message:
llog_reader.c:921:42: error: format '%llx' expects
argument of type 'long long unsigned int', but
argument 3 has type 'long unsigned int'
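
The cast can be illustrated with a small userspace sketch. Here uint64_t
stands in for the 64-bit type returned by __le64_to_cpu(), and
format_hex64() is a hypothetical helper, not a function from
llog_reader.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value with %llx. Depending on the platform the
 * 64-bit type may be "unsigned long" or "unsigned long long", so the
 * explicit cast keeps the argument in step with the format string.
 */
static int format_hex64(char *buf, size_t len, uint64_t val)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%#llx", (unsigned long long)val);
}
```

The cast costs nothing at run time and lets the same source build
cleanly with -Werror=format on platforms that define the type either way.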

Lustre-change: https://review.whamcloud.com/46588
Lustre-commit: 131f559c5a241f15b8a91009968dfbd9a5dcddeb

Fixes: 80447caf980 LU-14926 utils: print unlink and setattr recs in llog_reader
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: I939b94626d2707b6ff644324c5c2798218331c4d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48307
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-14926 utils: print unlink and setattr recs in llog_reader 06/48306/2
Alexander Zarochentsev [Fri, 16 Jul 2021 19:16:29 +0000 (22:16 +0300)]
LU-14926 utils: print unlink and setattr recs in llog_reader

Enhance llog_reader to print unlink and setattr llog records
correctly.

Lustre-change: https://review.whamcloud.com/44591
Lustre-commit: 80447caf980699fd1b0118e8f7c11e48aead04ce

HPE-bug-id: LUS-10220
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I7b44f65c976459d143521185a807939524f67fa2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-14865 utils: llog_reader.c printf type mismatch 05/48305/2
Gian-Carlo DeFazio [Tue, 20 Jul 2021 00:30:36 +0000 (17:30 -0700)]
LU-14865 utils: llog_reader.c printf type mismatch

Add (unsigned long long) cast to results of
__le64_to_cpu so that it matches the formatting (%llu)
of the enclosing printf call.

Build log message:
"llog_reader.c:887:9: error: format '%llu' expects
argument of type 'long long unsigned int', but
argument 3 has type '__u64' [-Werror=format=]"

Lustre-change: https://review.whamcloud.com/44346
Lustre-commit: 14b8276e06d6f4e3bfe785df1165458555e406f3

Test-Parameters: trivial
Fixes: 9962d6f84db5 LU-14617 utils: llog_reader updatelog support
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: I9549e0a0bd21727dfcc42992b693bc39a779e1a1
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-6142 utils: Fix style issues for llog_reader.c 04/48304/2
Arshad Hussain [Fri, 22 May 2020 15:22:01 +0000 (20:52 +0530)]
LU-6142 utils: Fix style issues for llog_reader.c

This patch fixes issues reported by checkpatch
for file lustre/utils/llog_reader.c

Lustre-change: https://review.whamcloud.com/38706
Lustre-commit: 6e159dceb61829ff790d7245309ecf0cfdf70cb6

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I02af3385be5521ef5ed9063926e846059067b8ab
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48304
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-12420 utils: llog_reader handles uninitialized mountdata 03/48303/2
Li Xi [Tue, 11 Jun 2019 12:28:30 +0000 (20:28 +0800)]
LU-12420 utils: llog_reader handles uninitialized mountdata

When reading a mountdata file that has never been used, the "llog_reader
CONFIGS/mountdata" command crashes with the following output:

Header size : 500170753
Time : Wed Sep  4 00:57:37 6869
Number of records: 65534
Target uuid :
-----------------------
Segmentation fault

After applying this patch, llog_reader prints the following message
and quits in this situation:

Header size : 500170753
Time : Wed Sep  4 00:57:37 6869
Number of records: 65534
Target uuid :
-----------------------
uninitialized llog record at index 0

Lustre-change: https://review.whamcloud.com/35178
Lustre-commit: 46f53da979344c88ab985de7227a81240a8107bf

Change-Id: I25147f7fd09c6d59ff0049bdb20ac1979cf43ee4
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48303
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-12420 utils: llog_reader handles uninitialized llog properly 02/48302/2
Li Xi [Tue, 11 Jun 2019 08:26:22 +0000 (16:26 +0800)]
LU-12420 utils: llog_reader handles uninitialized llog properly

When reading an empty LLOG, llog_reader would fail because
the record count is zero. E.g. "llog_reader CONFIGS/nodemap" on
an MGS without nodemap configuration would fail with:

llog_reader: Error allocating -16 bytes for recs_buf: Cannot allocate memory (12)
llog_reader: Could not pack buffer.: Cannot allocate memory (12)

After applying this patch, llog_reader prints the following message
and quits if the LLOG is uninitialized:

uninitialized llog: zero record number
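
A check of this kind might be sketched as below. It is illustrative
only: alloc_recs() is a hypothetical helper, and the -EINVAL return code
and the allocation size are assumptions rather than what llog_reader
actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate an array of record pointers for a log whose header reports
 * hdr_count records. A count of zero is rejected before any size is
 * computed from it, since such a size would be meaningless.
 */
static int alloc_recs(uint32_t hdr_count, void ***recs_out)
{
	assert(recs_out != NULL);
	*recs_out = NULL;

	if (hdr_count == 0) {
		fprintf(stderr, "uninitialized llog: zero record number\n");
		return -EINVAL;
	}

	*recs_out = calloc(hdr_count, sizeof(void *));
	if (*recs_out == NULL)
		return -ENOMEM;
	return 0;
}
```

Validating the count first turns a confusing allocation failure into a
message that names the actual problem.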

Lustre-change: https://review.whamcloud.com/35177
Lustre-commit: 94a16a027536100a9d0a279e1f384076a7a9b513

Change-Id: I87246672e9fc992c99126134236c2e8d304df74b
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-13680 osd-ldiskfs: handle large allocations 78/49478/2
Andreas Dilger [Mon, 15 Jun 2020 19:46:07 +0000 (13:46 -0600)]
LU-13680 osd-ldiskfs: handle large allocations

Use OBD_ALLOC_PAGE_ARRAY_LARGE() for oti_dio_pages, as this allocation
can be as large as 512KB due to large PTLRPC_MAX_BRW_PAGES.

Lustre-change: https://review.whamcloud.com/38943
Lustre-commit: bbb14d40a4be6a9172b80ed3208f81be2f1d1b66

Test-Parameters: trivial
Fixes: 72372486a5e9 ("LU-11347 osd: do not use pagecache for I/O")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0a0557e42bb5db5612c78e6d9b87f366a23ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49478
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-14339 obdclass: add option %H for jobid 75/50375/2
Yang Sheng [Mon, 18 Jan 2021 17:46:05 +0000 (01:46 +0800)]
LU-14339 obdclass: add option %H for jobid

Add an option %H to avoid an overly long jobid in some cases.

Lustre-change: https://review.whamcloud.com/41262
Lustre-commit: cf72ee174bbf7e60301ddff211b0685dc6c7adab

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Iaf70da5de25fd321a21e6e6cd7f7d211dca1adf3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50375
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-16490 kernel: kernel update RHEL 7.9 [3.10.0-1160.81.1.el7] 04/49704/2
Jian Yu [Thu, 19 Jan 2023 19:42:49 +0000 (11:42 -0800)]
LU-16490 kernel: kernel update RHEL 7.9 [3.10.0-1160.81.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.81.1.el7.

Lustre-change: https://review.whamcloud.com/49684
Lustre-commit: TBD (from 0a6b9460584046c0344204ad5169efac4d791e59)

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I46f556f327d92fde17790e223187df5b1c33d2c1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49704
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-16486 kernel: update RHEL 8.7 [4.18.0-425.10.1.el8_7] 03/49703/3
Jian Yu [Tue, 2 Jan 2024 18:37:42 +0000 (10:37 -0800)]
LU-16486 kernel: update RHEL 8.7 [4.18.0-425.10.1.el8_7]

Update RHEL 8.7 kernel to 4.18.0-425.10.1.el8_7 for Lustre client.

Lustre-change: https://review.whamcloud.com/49683
Lustre-commit: b68542d7641422e63f3bee37cd2059d2e3d0442e

Test-Parameters: trivial clientdistro=el8.7 testlist=sanity

Change-Id: I5759d0cb06a1148689ed9b8c947cb6516ab3aca1
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49703
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-13133 tests: sanity-selinux test_21{a,b} sepol update 01/50401/3
Sebastien Buisson [Tue, 14 Jan 2020 11:51:55 +0000 (20:51 +0900)]
LU-13133 tests: sanity-selinux test_21{a,b} sepol update

We need to make sure the MDS receives updated sepol info from the MGS.
In the case of a combined MGT/MDT, directly setting fileset on the node
would mask the llog-based info retrieval mechanism, so always use
'lctl set_param -P' to set the sepol value.

Lustre-change: https://review.whamcloud.com/37224
Lustre-commit: e32ca6e12edbd885838f8a12c0edd455b9ad7105

Test-Parameters: trivial
Test-Parameters: clientselinux testlist=sanity-selinux
Test-Parameters: clientselinux testlist=sanity-selinux
Test-Parameters: clientselinux testlist=sanity-selinux
Test-Parameters: clientselinux testlist=sanity-selinux
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaf8ff13364b9ba5f5d8b733be0247d79e05a6b3d
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-13156 tests: wait for nodemap update in sanity-selinux 00/50400/3
Sebastien Buisson [Mon, 23 Mar 2020 15:23:57 +0000 (16:23 +0100)]
LU-13156 tests: wait for nodemap update in sanity-selinux

In sanity-selinux test_21a and test_21b, nodemaps are used to test
SELinux status checking (sepol).
We must wait for nodemap update on all MDS nodes before carrying out
tests.

Lustre-change: https://review.whamcloud.com/38034
Lustre-commit: f1761cbe6b1243edd7a69c68c401d7285f7f3b38

Test-Parameters: clientselinux mdscount=2 mdtcount=4 testlist=recovery-small,sanity-selinux env=ONLY="21 23",ONLY_REPEAT=80
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I363e2bec757efc199f7039f8af4bcb77e2a2a184
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50400
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 months ago LU-13617 tests: check client deadlock selinux 34/47034/3
Alexander Boyko [Mon, 1 Jun 2020 12:38:07 +0000 (08:38 -0400)]
LU-13617 tests: check client deadlock selinux

The patch adds test_20e to sanity-selinux. It checks for the client
deadlock and the resulting eviction by the MDS.

Lustre-change: https://review.whamcloud.com/38793
Lustre-commit: f519f22c8ba3a6de00af0bef77cae3b4b18acdab

Test-Parameters: testlist=sanity-selinux env=ONLY=20e,ONLY_REPEAT=20
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Cray-bug-id: LUS-8924
Change-Id: If7707fa14f7307fb3a3fb2228fbd1983b55cbe6b
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47034
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-13617 llite: don't hold inode_lock for security notify 25/47025/4
Alexander Boyko [Mon, 1 Jun 2020 12:32:11 +0000 (08:32 -0400)]
LU-13617 llite: don't hold inode_lock for security notify

With SELinux enabled, the client can deadlock, which leads to
client eviction from the MDS.
Thread 1                    Thread 2
do file open                do stat
inode_lock(parent dir)
                            got LDLM_PR(parent dir)
enqueue LDLM_CW(parent dir) waits on inode_lock to notify security
waits
timeout on enqueue
and client eviction because the client didn't cancel the LDLM_PR lock

security_inode_notifysecctx()->selinux_inode_notifysecctx()->
selinux_inode_setsecurity()
The call of selinux_inode_setsecurity doesn't need to hold
inode_lock.

Lustre-change: https://review.whamcloud.com/38792
Lustre-commit: f87359b51f61a4baa9bf62faebb6625d518d23b4

Fixes: 1d44980bcb ("LU-8956 llite: set sec ctx on client's inode at create time")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Cray-bug-id: LUS-8924
Change-Id: I4727da45590734bde57bee9d378b61c30b5d515a
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47025
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-15560 tests: remove the "libtool execute" commands 48/49148/3
Xing Huang [Mon, 14 Nov 2022 08:11:59 +0000 (16:11 +0800)]
LU-15560 tests: remove the "libtool execute" commands

Remove unnecessary libtool usage from test scripts to
make test_125 of conf-sanity pass Maloo tests.

Test-Parameters: testlist=conf-sanity env=ONLY=125 serverversion=2.15
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I5842e3dcbb77c37124a288f0e87173dc2fd5f02e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock)) 14/44214/5
Bobi Jam [Fri, 4 Jun 2021 03:58:29 +0000 (11:58 +0800)]
LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock))

When setting the layout in the layout lock, the lock could lose its
layout bits, in which case we try to fetch the layout lock again.

Lustre-change: https://review.whamcloud.com/44054
Lustre-commit: 1b166d6dd6a2f39dfe35b60be169b288665d0283

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I10f96e4cb03cfe228d3c1ea1500b1a8d8e4e5e54
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44214
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-16534 build: Prefer timer_delete[_sync] 65/53265/2
Shaun Tancheff [Tue, 28 Nov 2023 07:34:01 +0000 (23:34 -0800)]
LU-16534 build: Prefer timer_delete[_sync]

Linux commit v6.1-rc1-7-g9a5a30568697
  timers: Get rid of del_singleshot_timer_sync()
Linux commit v6.1-rc1-11-g9b13df3fb64e
  timers: Rename del_timer_sync() to timer_delete_sync()
Linux commit v6.1-rc1-12-gbb663f0f3c39
  timers: Rename del_timer() to timer_delete()

Prefer timer_delete_sync() to del_singleshot_timer_sync()
Prefer timer_delete_sync() to del_timer_sync()
Prefer timer_delete() to del_timer()

Provide del_timer and del_timer_sync when
timer_delete[_sync] is not available
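
One common way to support both old and new kernels is a compatibility
macro. The sketch below mocks the kernel types in userspace so that it
compiles on its own. HAVE_TIMER_DELETE_SYNC is a hypothetical
configure-time macro, and the mock struct timer_list and del_timer_sync()
only imitate the kernel API for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock of the kernel timer type, for illustration only. */
struct timer_list {
	int pending;
};

/* Mock of the older API name: deactivate the timer, report prior state. */
static int del_timer_sync(struct timer_list *t)
{
	int was_pending;

	assert(t != NULL);
	was_pending = t->pending;
	t->pending = 0;
	return was_pending;
}

/*
 * When the (hypothetical) configure check did not find the new name,
 * map it onto the old function so that callers can use the new name
 * everywhere.
 */
#ifndef HAVE_TIMER_DELETE_SYNC
#define timer_delete_sync(t) del_timer_sync(t)
#endif
```

With such a mapping in place, the rest of the code refers only to the
newer name, and the fallback can be dropped once old kernels are no
longer supported.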

Lustre-change: https://review.whamcloud.com/49922
Lustre-commit: 0ec89529ce14a1bb5af0c01ed86424a10e0e373c

Test-Parameters: trivial
HPE-bug-id: LUS-11470
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4c946c315a83482dd0bd69e5e89f0302a67bf81c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-16118 build: Workaround __write_overflow_field errors 64/53264/2
Shaun Tancheff [Tue, 28 Nov 2023 07:02:25 +0000 (23:02 -0800)]
LU-16118 build: Workaround __write_overflow_field errors

Linux commit v5.17-rc3-1-gf68f2ff91512
   fortify: Detect struct member overflows in memcpy() at compile-time

memcpy and memset of collections of struct members
will trigger:

error: call to '__write_overflow_field' declared with attribute
   warning: detected write beyond size of field (1st parameter);
   maybe use struct_group()?
   [-Werror] __write_overflow_field(p_size_field, size);
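
The general shape of a workaround is to make the destination of the copy
a single field that covers all the members being written. The userspace
sketch below uses a plain nested struct for that. The struct and function
names are hypothetical, and the kernel's struct_group() macro differs in
that it keeps the original member names accessible.

```c
#include <assert.h>
#include <string.h>

struct src_hdr {
	unsigned int a;
	unsigned int b;
};

struct rec {
	int id;
	/* Members that are copied together are grouped into one field. */
	struct {
		unsigned int a;
		unsigned int b;
	} pair;
	int tail;
};

/*
 * Copy both members at once. The memcpy() destination is the whole
 * group, so the copy size never exceeds the size of the field that
 * is passed in.
 */
static void rec_set_pair(struct rec *r, const struct src_hdr *s)
{
	assert(r != NULL && s != NULL);
	memcpy(&r->pair, s, sizeof(r->pair));
}
```

Copying each member with its own assignment would also avoid the
warning; grouping is mainly useful when many adjacent members are
copied or cleared together.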

Lustre-change: https://review.whamcloud.com/48364
Lustre-commit: a3a51806ef361f55421a1bc07f64c78730ae50d5

Test-Parameters: trivial
HPE-bug-id: LUS-11194
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Iacd1ab03d1b90ce62b5d7b65e1cd518a5f7981f2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-16291 build: make kobj_type constant 63/53263/2
Jian Yu [Tue, 28 Nov 2023 06:52:57 +0000 (22:52 -0800)]
LU-16291 build: make kobj_type constant

Kernel v5.16-rc2-28-gee6d3dd4ed48:
commit ee6d3dd4ed48ab24b74bab3c3977b8218518247d
driver core: make kobj_type constant.

This patch makes struct kobj_type constant to fix
the following build failure against kernel 5.16:

lustre/obdclass/obd_config.c: In function 'class_modify_config':
lustre/obdclass/obd_config.c:1639:13: error: assignment discards
'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
1639 |         typ = get_ktype(kobj);
     |             ^
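
The nature of the fix can be shown with a userspace mock. The types
below only imitate the kernel structures, and ktype_id() is a
hypothetical caller. The point is that the local variable receiving the
result of get_ktype() must itself be a pointer to const.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mocks of the kernel types, for illustration only. */
struct kobj_type {
	int id;
};

struct kobject {
	const struct kobj_type *ktype;
};

/* Imitates the newer prototype, which returns a pointer to const. */
static const struct kobj_type *get_ktype(const struct kobject *kobj)
{
	assert(kobj != NULL);
	return kobj->ktype;
}

static int ktype_id(const struct kobject *kobj)
{
	/* Declaring the local as pointer-to-const keeps the qualifier. */
	const struct kobj_type *typ = get_ktype(kobj);

	return typ ? typ->id : -1;
}
```

Declaring typ without const would reproduce the discarded-qualifiers
diagnostic quoted above when warnings are treated as errors.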

Lustre-change: https://review.whamcloud.com/49043
Lustre-commit: d1dbf26afd6676e02a2a00e635b9ad1fe14cf68e

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I19e0d1f4e3cf97f6871e038487cda9294ac1f67b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53263
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-14651 build: fix build for el7.9 kernels 20/54120/2
Andreas Dilger [Tue, 6 Feb 2024 17:35:27 +0000 (10:35 -0700)]
LU-14651 build: fix build for el7.9 kernels

Handle extra setattr_prepare() argument added in Linux 5.12 kernels
when building on older kernels.

Lustre-change: https://review.whamcloud.com/53503
Lustre-commit: 7815835d21a5c0b6dbc58d9bc9dd823d4952f86f

HPE-bug-id: LUS-12059
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: Ie7fd1c4d51b7a9b086cfca0db941321cbcce7057
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 months ago LU-15098 tests: sanity-sec 27a exec commands on right node 91/49891/3
Sebastien Buisson [Tue, 19 Oct 2021 15:59:33 +0000 (17:59 +0200)]
LU-15098 tests: sanity-sec 27a exec commands on right node

In nodemap_exercise_fileset called from sanity-sec test 27a,
make sure all commands are executed on first client, as we are
testing properties of nodemaps 'default' and 'c0'.
And make sure the 'default' nodemap has the admin and trusted properties
set to 1, as we are carrying out operations as root.

Lustre-commit: b45169276ce1ab09dae7a733859f89a6c92808e5
Lustre-change: https://review.whamcloud.com/45293

Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec clientcount=2 env=ONLY=27a
Fixes: 0daeebcbdc ("LU-14797 nodemap: map project id")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd9f391db60475721f3a3856b5e3bee1a18bbbca
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49891
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-14299 test: sleep to enable quota acquire again 18/47618/2
Hongchao Zhang [Fri, 29 Jan 2021 20:51:43 +0000 (04:51 +0800)]
LU-14299 test: sleep to enable quota acquire again

sanity-quota test_61 fails with incorrect quota exceeded
errors because quota acquire is disabled for 5 seconds
after the edquot flag is set. The test should introduce a
delay between the over-quota test and the normal one.

Lustre-change: https://review.whamcloud.com/41389
Lustre-commit: 430e3f01ef2dc83ed317cf2b97be8a2ad50d9f13

Test-Parameters: trivial fstype=zfs testlist=sanity-quota env=ONLY=61,ONLY_REPEAT=20
Fixes: 530881fe4ee20 ("LU-7816 quota: add default quota setting support")
Change-Id: I8040ba960f32cf01cb7cee3a77c06ad4bd732f0e
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47618
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-6006 tests: add sleep 1 after command in background 16/45916/2
Alex Zhuravlev [Tue, 28 Jan 2020 12:38:05 +0000 (15:38 +0300)]
LU-6006 tests: add sleep 1 after command in background

Otherwise a subsequent command may race with the one in the
background and fail:
 mkdir a & touch a/b

Test-Parameters: trivial env=ONLY="22-23",ONLY_REPEAT=50 testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 fstype=zfs testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 mdscount=2 mdtcount=4 testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 fstype=zfs mdscount=2 mdtcount=4 testlist=replay-dual

Lustre-change: https://review.whamcloud.com/37343
Lustre-commit: 2d57999401072a034650d00a37fb59ef9b3f53d0

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0574d315e596cd899f7c4ea20c70b4c3da99b9b4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45916
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-16222 kernel: RHEL 8.7 client support 02/49002/3
Jian Yu [Tue, 1 Nov 2022 00:58:07 +0000 (17:58 -0700)]
LU-16222 kernel: RHEL 8.7 client support

This patch makes changes to support RHEL 8.7 release
with kernel 4.18.0-423.el8 for Lustre client.

Lustre-change: https://review.whamcloud.com/48879
Lustre-commit: 293844d132b79a1d256ed4200d5dbd8bb790bfb4

Test-Parameters: trivial clientdistro=el8.7 testlist=sanity

Change-Id: Ie97ff67c9a5fbd46bc145ab559665dcbc630b4a0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49002
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-14838 tests: skip sanityn/32a if no truncate_lock 54/47554/2
Andreas Dilger [Tue, 7 Jun 2022 20:52:52 +0000 (14:52 -0600)]
LU-14838 tests: skip sanityn/32a if no truncate_lock

Servers no longer support truncate_lock as of 2.14.53.
Skip sanityn.sh test_32a if this feature is not available.

Test-Parameters: trivial testlist=sanityn env=ONLY=32a
Test-Parameters: clientversion=2.14 testlist=sanityn env=ONLY=32a
Fixes: 6335dba839 ("LU-14838 osc: Remove lockless truncate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibe37b59eff2b11a1b5e6ddd7a5c0ba6dae9993f5
Reviewed-on: https://review.whamcloud.com/47554
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-14644 vvp: wait for nrpages to be updated 48/46948/3
Vitaly Fertman [Tue, 27 Apr 2021 18:43:06 +0000 (21:43 +0300)]
LU-14644 vvp: wait for nrpages to be updated

truncate_inode_pages() says that a page may still be in the process
of deletion upon return. Wait for the other thread that is doing
__delete_from_page_cache() to get nrpages updated.

Lustre-change: https://review.whamcloud.com/43464
Lustre-commit: 7d5d004506650c3739898e70d72c9a86b8aeeb88

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I165b3d0866efaf2eb7e977520ebba4ee831874ab
HPE-bug-id: LUS-8842
Reviewed-on: https://review.whamcloud.com/46948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months ago LU-15137 socklnd: expect two control connections maximum 54/47254/2
Serguei Smirnov [Thu, 4 Nov 2021 18:35:43 +0000 (11:35 -0700)]
LU-15137 socklnd: expect two control connections maximum

As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.

Lustre-change: https://review.whamcloud.com/45461
Lustre-commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf

Test-Parameters: trivial testlist=sanity-lnet
Fixes: e8842e86 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47254
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15137 socklnd: decrement connection counters on close 53/47253/2
Serguei Smirnov [Sat, 30 Oct 2021 18:39:26 +0000 (11:39 -0700)]
LU-15137 socklnd: decrement connection counters on close

To gracefully handle potential race with delayed connection create,
decrement connection counters per type as connections are being
closed.

Lustre-change: https://review.whamcloud.com/45422
Lustre-commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f

Test-Parameters: trivial testlist=sanity-lnet
Fixes: e8842e86 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ieb3b44701e4999ea1fe63234162dd5878d65958a
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47253

20 months agoLU-12815 socklnd: add conns_per_peer parameter 52/47252/2
Serguei Smirnov [Thu, 4 Feb 2021 01:35:00 +0000 (20:35 -0500)]
LU-12815 socklnd: add conns_per_peer parameter

Introduce conns_per_peer ksocklnd module parameter.
In typed mode, this parameter shall control
the number of BULK_IN and BULK_OUT tcp connections,
while the number of CONTROL connections shall stay
at 1. In untyped mode, this parameter shall control
the number of untyped connections.
The default conns_per_peer is 1. Max is 127.
Performance scaling on 100GbE:

 conns_per_peer     speed
        1        1.7GiB/s
        2        3.3GiB/s
        4        6.4GiB/s
        8       11.5GiB/s
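
The connection rule and the scaling figures above can be restated as a small model. This is only a sketch: the function names are hypothetical and are not ksocklnd symbols, and the lower bound of 1 is an assumption (the message states only the default and the maximum).

```python
# Illustrative model only; names here are hypothetical, not ksocklnd symbols.

def expected_connections(conns_per_peer: int, typed: bool) -> dict:
    """Connection counts per peer, following the description above."""
    # Upper bound 127 is from the commit message; lower bound 1 is assumed.
    if not 1 <= conns_per_peer <= 127:
        raise ValueError("conns_per_peer must be in 1..127")
    if typed:
        # One CONTROL connection, plus conns_per_peer each of BULK_IN/BULK_OUT.
        return {"CONTROL": 1,
                "BULK_IN": conns_per_peer,
                "BULK_OUT": conns_per_peer}
    return {"UNTYPED": conns_per_peer}


# Throughput figures quoted in the commit message (100GbE), in GiB/s.
MEASURED = {1: 1.7, 2: 3.3, 4: 6.4, 8: 11.5}


def speedup(conns: int) -> float:
    """Throughput relative to a single connection."""
    return MEASURED[conns] / MEASURED[1]


for conns in sorted(MEASURED):
    print(f"conns_per_peer={conns}: {speedup(conns):.2f}x the single-connection rate")
```

By these figures, scaling is close to linear up to 4 connections and starts to flatten at 8.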

Lustre-change: https://review.whamcloud.com/41056
Lustre-commit: 71b2476e4ddb95aa42f4a0ea3f23b1826017bfa5

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I1f4ef22141882224e14e18c2526554dcfa69c871
Reviewed-on: https://review.whamcloud.com/41411
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47252
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-15645 obdclass: llog to handle gaps 11/47011/3
Alex Zhuravlev [Wed, 16 Mar 2022 09:10:38 +0000 (12:10 +0300)]
LU-15645 obdclass: llog to handle gaps

Due to old errors, an update llog can contain gaps in its index.
This shouldn't block llog processing and recovery. Actual
gaps in the transaction sequence should be caught by VBR.
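
A minimal sketch of gap-tolerant processing, assuming records arrive with ascending integer indices. It is a toy model, not the llog on-disk format, and both function names are hypothetical.

```python
# Toy model of record indices in a log; not the llog on-disk format.

def find_gaps(indices):
    """Return (first_missing, last_missing) ranges between consecutive indices."""
    gaps = []
    prev = None
    for idx in indices:
        if prev is not None and idx > prev + 1:
            gaps.append((prev + 1, idx - 1))
        prev = idx
    return gaps


def process_log(indices, handle):
    """Process every record, reporting gaps instead of aborting on them."""
    for idx in indices:
        handle(idx)
    return find_gaps(indices)
```

For example, `process_log([1, 2, 5], print)` handles all three records and returns the missing range rather than failing at index 5.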

Lustre-change: https://review.whamcloud.com/46837
Lustre-commit: TBD (from b3de0d57bd0f7cd2e918aa9d3f08be1c69697b80)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I11ec817e356f9658118c34706ef3a533e7faba83
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47011
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-13195 osp: osp_send_update_req() should check generation 10/47010/2
Alex Zhuravlev [Mon, 27 Sep 2021 13:28:50 +0000 (16:28 +0300)]
LU-13195 osp: osp_send_update_req() should check generation

and don't send requests that depend on the one that just failed

Lustre-change: https://review.whamcloud.com/45042
Lustre-commit: dff1e0d21c8c6bb20d63669252190795198bc49f

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I27a2b21130e33287168204ad829c0a53002b517e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47010
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-12577 llog: protect partial updates from readers 47/45847/2
Alex Zhuravlev [Sun, 9 May 2021 06:32:55 +0000 (09:32 +0300)]
LU-12577 llog: protect partial updates from readers

llog_osd_write_rec() adds a record in a few steps: the header is
updated first, then the record itself is appended. A per-loghandle
semaphore is used, but remote readers allocate a new separate
loghandle for every access (header reading, blocks), so the
readers can't use the loghandle's semaphore to avoid accessing partial
updates. Use object-based locking to serialize the writer
vs the readers.
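
The idea can be sketched as follows. This is a hypothetical model: a plain mutex stands in for whatever object-level lock the patch uses, and `ToyLog` is not a Lustre structure.

```python
import threading


class ToyLog:
    """Header and records guarded by one per-object lock (hypothetical model)."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the object-level lock
        self.header_count = 0
        self.records = []

    def write_rec(self, rec):
        # Two-step update: header first, then the record. Holding the object
        # lock keeps readers from seeing the state between the two steps.
        with self._lock:
            self.header_count += 1
            self.records.append(rec)

    def read_snapshot(self):
        # A reader takes the same object lock even though it would use
        # its own, separate handle.
        with self._lock:
            return self.header_count, list(self.records)


def run_demo(n_records=1000):
    """Write records in one thread while the caller keeps reading."""
    log = ToyLog()
    mismatches = 0
    writer = threading.Thread(
        target=lambda: [log.write_rec(i) for i in range(n_records)])
    writer.start()
    while writer.is_alive():
        count, recs = log.read_snapshot()
        if count != len(recs):
            mismatches += 1
    writer.join()
    return mismatches, log.read_snapshot()[0]
```

Because the lock belongs to the object rather than to a handle, it does not matter how many handles the readers allocate.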

Lustre-change: https://review.whamcloud.com/43589
Lustre-commit: ae1404feefc1572fdafed938a3fc18131d675678

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie4e4d4a1e9a6fcdea9fcca7d80b0da920e786424
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/45847
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: John Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
20 months agoLU-11861 obdclass: fix build with debug kernel 97/41897/3
Alexey Lyashkov [Tue, 15 Jan 2019 12:24:42 +0000 (15:24 +0300)]
LU-11861 obdclass: fix build with debug kernel

Move declaration before usage.

Lustre-change: https://review.whamcloud.com/34030
Lustre-commit: 2dc87bb143e998e25585673b5f0ba7e2f317475e

Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I9a6c451bb5454b1542f0b06041f6938702e20b36
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41897
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
20 months agoLU-12739 lnet: Don't queue msg when discovery has completed 90/48190/3
Chris Horn [Mon, 9 Sep 2019 17:54:08 +0000 (12:54 -0500)]
LU-12739 lnet: Don't queue msg when discovery has completed

In lnet_initiate_peer_discovery(), it is possible for the peer object
to change after the call to lnet_discover_peer_locked(), and it is
also possible for the peer to complete discovery between the first
call to lnet_peer_is_uptodate() and our placing the lnet_msg onto
the peer's lp_dc_pendq. After the call to lnet_discover_peer_locked()
check whether the, potentially new, peer object is up to date while
holding the lp_lock. If the peer is up to date, then we needn't
queue the message. Otherwise, we continue to hold the lock to place
the message on the peer's lp_dc_pendq.
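
The check-then-queue step described above can be sketched like this. `ToyPeer` and `queue_unless_uptodate` are hypothetical names for illustration; only `lp_lock` and `lp_dc_pendq` come from the commit message.

```python
import threading
from collections import deque


class ToyPeer:
    """Hypothetical stand-in for a peer object; not the LNet structure."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of lp_lock
        self.uptodate = False
        self.pendq = deque()           # plays the role of lp_dc_pendq


def queue_unless_uptodate(peer, msg):
    """Return True if msg was queued, False if the peer needs no discovery."""
    with peer.lock:
        # Check and enqueue under one lock hold, so discovery cannot
        # complete between the check and the enqueue.
        if peer.uptodate:
            return False
        peer.pendq.append(msg)
        return True
```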

Lustre-change: https://review.whamcloud.com/36139
Lustre-commit: 4ef62976448d6821df9aab3e720fd8d9d0bdefce

Test-Parameters: trivial testlist=sanity-lnet
Cray-bug-id: LUS-7596
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib3da7447588479bb35afcc3fe176b9120d915a89
Reviewed-on: https://review.whamcloud.com/48190
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoNew release 2.12.9 2.12.9 v2_12_9
Oleg Drokin [Fri, 17 Jun 2022 18:13:35 +0000 (14:13 -0400)]
New release 2.12.9

Change-Id: I099e525b0053ec5ecdd02b231be5bfa146ade633
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoNew RC 2.12.9-RC1 2.12.9-RC1 v2_12_9-RC1
Oleg Drokin [Thu, 2 Jun 2022 13:02:58 +0000 (09:02 -0400)]
New RC 2.12.9-RC1

Change-Id: I63b2b223a57d26da40427502b639fe51f4f6e9d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14181 tests: except sanity test_64e 64f with SHARED_KEY 99/40999/2
Sebastien Buisson [Thu, 10 Dec 2020 08:37:43 +0000 (09:37 +0100)]
LU-14181 tests: except sanity test_64e 64f with SHARED_KEY

Add sanity test_64e and test_64f to ALWAYS_EXCEPT when
SHARED_KEY is used.

Lustre-change: https://review.whamcloud.com/40865
Lustre-commit: aa3bdbc23bc86bae565e78b38946f4ac8fcbeacb

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaa9f5038a59f9ddc50dd9ac81ca81effd8bb9b1b
Reviewed-on: https://review.whamcloud.com/40999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-14658 tests: fix conf-sanity 122b test 73/46873/2
Alexander Boyko [Mon, 13 Dec 2021 20:00:48 +0000 (15:00 -0500)]
LU-14658 tests: fix conf-sanity 122b test

Sometimes the test 122b failed with:
dd: failed to open '/mnt/lustre/d122b.conf-sanity/f122b.conf-sanity':
Numerical result out of range

ZFS readonly simulation produces OS_STATFS_READONLY flag.
It leads to zero stripe_count at lod_get_stripe_count(), and
lod_qos_prep_create() returns -34(ERANGE).

The patch fixes it by creating the file before replay_barrier.

Lustre-change: https://review.whamcloud.com/46864
Lustre-commit: 853d5e4a25f393033b132659d24b7aad6916e3b8

Test-Parameters: trivial fstype=zfs env=ONLY=122b,ONLY_REPEAT=4 testlist=conf-sanity
Fixes: 747fed818be5 ("LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I7ec04ffe09d0038bcf99e1a571f14d2bfb6a5df5
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46873
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15854 tests: fix version check for sanity test_64 45/47345/3
Aurelien Degremont [Fri, 13 May 2022 12:42:38 +0000 (12:42 +0000)]
LU-15854 tests: fix version check for sanity test_64

Add missing or proper server version check for interop
testing for sanity test 64i and 64h.

Lustre-change: https://review.whamcloud.com/47343/
Lustre-commit: TBD (63832046a5c78a2425f1f07e2ec3f7beb9b0561e)

Test-Parameters: trivial testlist=sanity env=ONLY=64
Fixes: 38c78ac ("LU-9704 grant: ignore grant info on read resend")
Fixes: 4894683 ("LU-14124 target: set OBD_MD_FLGRANT in read's reply")
Change-Id: Iec21a407f467db3e9cb197d0a1436ea4e821bef2
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/47345
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15795 kernel: new kernel [RHEL 8.6 4.18.0-372.9.1.el8] 02/47302/3
Jian Yu [Wed, 18 May 2022 02:31:14 +0000 (19:31 -0700)]
LU-15795 kernel: new kernel [RHEL 8.6 4.18.0-372.9.1.el8]

This patch makes changes to support new RHEL 8.6 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.6

Change-Id: Id738259ed94104c3a3c7bb5c1b853cfabad49405
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15093 libcfs: Check if param_set_uint_minmax is provided 83/47383/2
Chris Horn [Wed, 18 May 2022 02:29:12 +0000 (19:29 -0700)]
LU-15093 libcfs: Check if param_set_uint_minmax is provided

Linux kernel v5.15 commit 2a14c9ae15a38148484a128b84bff7e9ffd90d68
moved param_set_uint_minmax to common code.

Lustre-change: https://review.whamcloud.com/45214
Lustre-commit: 3337e9fe920b260e34ff62c0840279ea6bff34ca

HPE-bug-id: LUS-10469
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd1d72ae531f0f6c7cd96cc28fbc07c8a8b70886
Reviewed-on: https://review.whamcloud.com/47383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-10235 mdt: mdt_create: check EEXIST without lock 74/41674/7
Dominique Martinet [Wed, 10 Jan 2018 13:08:06 +0000 (14:08 +0100)]
LU-10235 mdt: mdt_create: check EEXIST without lock

mkdir() currently gets a write lock on the parent even if the new
directory already exists.

This patch adds an initial lookup of the new directory without a DLM
lock so that other clients do not need to cancel their DLM lock if the
"new" directory already exists, but will continue as usual if directory
did not exist.

There is a small race window in which the child is created by another
client after our check and before the parent is locked, but this can be
detected later during index insert.
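
A rough model of the lookup-before-lock approach, assuming an in-memory directory. `ToyDir` is hypothetical and a mutex stands in for the DLM write lock on the parent.

```python
import errno
import threading


class ToyDir:
    """Hypothetical parent directory with a lock-acquisition counter."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the parent write lock
        self.entries = set()
        self.write_locks_taken = 0

    def mkdir(self, name):
        # Step 1: lockless lookup; an existing name fails fast.
        if name in self.entries:
            return -errno.EEXIST
        # Step 2: take the parent write lock only for a likely-new name.
        with self._lock:
            self.write_locks_taken += 1
            # Step 3: the insert re-checks, which closes the race where
            # another client created the name after step 1.
            if name in self.entries:
                return -errno.EEXIST
            self.entries.add(name)
            return 0
```

Repeated mkdir calls on an existing name never reach step 2, which is where the saving in lock traffic comes from.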

Performance change on two haswell 16-core VMs with ib, mean values of
mpirun -n 8 ./mdtest -D -i 8 -I 1000

test environment | directory creation | tree creation
local, no patch  | 1725/s             | 769/s
local, patch     | 1821/s             | 788/s
remote, no patch | 1729/s             | 772/s
remote, patch    | 1687/s             | 787/s

The differences are of the order of the noise here, with all mkdirs
being effective.

If directories exist, some simple stress on four nodes shows intended
improvements:
clush -w vm[0-3] 'seq 0 10000 |
    xargs -P 7 -I{} sh -c "(({}%3==0)) &&
        mkdir /mnt/lustre/testdir/foo 2>/dev/null ||
        stat /mnt/lustre/testdir > /dev/null"'

with patch: 10s
without patch: 19s
(the difference grows exponentially with number of clients and hangs
with over 60 clients without the patch; exact time was not re-measured
with patch)

Updated sanityn.sh 43a 45a to avoid race conditions.

Add sanityn.sh test_43j to verify above scenario.

Lustre-change: https://review.whamcloud.com/30880
Lustre-commit: 79acb9a9e7d3c3185a047f5b067382a814c0e9e5

Test-Parameters: envdefinitions=SLOW=yes testlist=replay-vbr,replay-vbr
Change-Id: I37fc9c8ffc7ab334c0645042beda5bef01284564
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13974 tests: update log corruption 64/46864/3
Alexander Boyko [Tue, 24 Nov 2020 09:05:36 +0000 (04:05 -0500)]
LU-13974 tests: update log corruption

The test case reproduces a missing object for a sub-transaction during
a set xattr operation.
The first setattr got -2 (ENOENT); the second had already started but had
not yet called llog_add. In this case the llog osp object is stale after
top_trans_start, so the declaration phase cannot refresh llogs. Then,
at llog_osd_write_rec, the osp object changes from stale to
valid (dt_attr_get), but the llog handle and llog header are invalid.
A new record would be added to the update log with the wrong index.
In that case processing of the update log fails with

fs1-MDT0001-osp-MDT0003: [0x2:0x400024d0:0x2] Invalid record: index
112926 but expected 112925
lod_sub_recovery_thread()) fs1-MDT0001-osp-MDT0003 get update log
failed: rc = -34
Recovery aborted, and clients are evicted.

Lustre-change: https://review.whamcloud.com/40743
Lustre-commit: 562837124ec7bffeba7edb4b4b899bc271833374

HPE-bug-id: LUS-9030
Test-Parameters: testlist=sanity  envdefinitions=ONLY="427"
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I6a47fed1bc01f4be62216d1d0787adc413df0cf5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13356 client: don't use OBD_CONNECT_MNE_SWAB 09/41309/2
Alexander Boyko [Wed, 11 Mar 2020 10:40:52 +0000 (06:40 -0400)]
LU-13356 client: don't use OBD_CONNECT_MNE_SWAB

OBD_CONNECT_MNE_SWAB is equal to OBD_CONNECT_MDS_MDS, and
it was used by the MGC client in the past for mne swabbing during interop.
Right now the MGS interprets it as OBD_CONNECT_MDS_MDS and exempts
these clients from eviction and lock cancelling after a timeout.

Lustre-change: https://review.whamcloud.com/37880
Lustre-commit: 3fe77a129e131014ff654bde616a62a1e243e322

Fixes: 1bdc4fd0594e ("LU-6307 obdclass: distinguish MGC/MDT connection properly")
Test-Parameters: testlist=runtests clientversion=2.12 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests serverversion=2.12 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests clientversion=2.10 clientdistro=el7.6 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests serverversion=2.10 serverdistro=el7.6 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-8484
Change-Id: I4f8ddeb1808cfaee7507e0efcdefa24040cfcbb6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41309
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
2 years agoLU-13195 osp: invalidate object on write error 63/46863/3
Alex Zhuravlev [Mon, 27 Apr 2020 07:24:33 +0000 (10:24 +0300)]
LU-13195 osp: invalidate object on write error

do this unconditionally, to avoid cases when the object is
on another request's invalidation list.

Lustre-change: https://review.whamcloud.com/38387
Lustre-commit: 9e1071b517578ed3752efb1412017c8f93cd333b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8ee0c484e695e88c0ea6fb13ac377fa689150780
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13974 llog: check stale osp object 62/46862/2
Alexander Boyko [Tue, 24 Nov 2020 05:34:11 +0000 (00:34 -0500)]
LU-13974 llog: check stale osp object

The logic of osp_attr_get has 2 paths:
1) return attributes from the cache for a healthy osp object
2) make an out update request and return attributes for a stale
osp object, after which the object loses its stale state.

When an out update request with llog writes fails, the osp object
becomes stale. But the llog handle stays inconsistent (bitmap, count,
last_index), and the next llog_add->llog_osd_write_rec does dt_attr_get,
gets attributes, makes the osp object valid, and uses the wrong llog
handle data. The result is an index jump in the llog file: recX, recX+2.
This causes an error during update log processing if a failover takes
place.
The fix adds a dt_object_stale function to check the osp_object.
llog_osd_write_rec checks it and returns ESTALE. llog_add then fails
with an ESTALE error and doesn't corrupt the update log.
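
The stale check can be sketched with a toy handle. All class and function names below are hypothetical; only the ESTALE behaviour is taken from the description.

```python
import errno


class ToyOspObject:
    """Hypothetical remote object that can be marked stale."""

    def __init__(self):
        self.stale = False


class ToyLogHandle:
    """Cached handle state that may be out of date (hypothetical model)."""

    def __init__(self, obj):
        self.obj = obj
        self.last_index = 0
        self.records = []


def write_rec(handle, payload):
    """Append a record; return its index or a negative errno."""
    # Refuse to append through a handle whose object went stale: the
    # cached last_index can no longer be trusted.
    if handle.obj.stale:
        return -errno.ESTALE
    handle.last_index += 1
    handle.records.append((handle.last_index, payload))
    return handle.last_index
```

Failing early leaves the log untouched, so no record is written with an index computed from outdated handle data.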

Lustre-change: https://review.whamcloud.com/40742
Lustre-commit: 82c6e42d6137f39a1f2394b7bc6e8d600eb36181

HPE-bug-id: LUS-9030
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Iadf53fd816e1c5bde0a19d4c537f0408796c864a
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46862
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14536 o2iblnd: don't try to reconnect if there's no listener 96/45896/2
Li Dongyang [Fri, 19 Mar 2021 10:21:58 +0000 (21:21 +1100)]
LU-14536 o2iblnd: don't try to reconnect if there's no listener

For each discovery we try to reconnect up to retry_count times,
which defaults to 5. While the MDT mount processes the conf log,
multiple discoveries are made for each OST.
If the OSTs are not up, the mount is delayed by long timeouts.

Lustre-change: https://review.whamcloud.com/42111
Lustre-commit: 67ba3ce23d32266eabd5f8c56fa78d65920455e8

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: If1d854216d2f26089c52d3fb501092b7f48a444d
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45896
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14536 o2iblnd: don't resend if there's no listener 95/45895/2
Li Dongyang [Fri, 19 Mar 2021 09:26:28 +0000 (20:26 +1100)]
LU-14536 o2iblnd: don't resend if there's no listener

If there's no listener at remote peer, we will
get IB_CM_REJ_INVALID_SERVICE_ID, currently we
will try to resend which makes the discovery longer
than necessary when connecting to a node which is
not up.
Use -EHOSTUNREACH instead of -ECONNREFUSED,
so we don't end up queued for resend.
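
As a sketch of the effect, assume a resend policy keyed on the error code. The policy table and function names are hypothetical; only the two errno values come from the message.

```python
import errno

# Hypothetical policy table: errors that cause the sender to queue a resend.
RESEND_ON = {errno.ECONNREFUSED}


def should_resend(err):
    """err is a negative errno, kernel style."""
    return -err in RESEND_ON


def reject_status_no_listener():
    # Per the commit message, a missing listener now maps to -EHOSTUNREACH
    # rather than -ECONNREFUSED.
    return -errno.EHOSTUNREACH
```

With this mapping, `should_resend(reject_status_no_listener())` is false, so a peer with no listener is not retried.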

Lustre-change: https://review.whamcloud.com/42109
Lustre-commit: 0ab06eb9d865a47ea3e09880a41a9e8f0a78b6a6

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ifaf14bc3ada2e2469669285917e366af669817e2
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-10931 lnet: handle unlink before send completes 98/45898/2
Amir Shehata [Mon, 8 Jul 2019 19:33:31 +0000 (12:33 -0700)]
LU-10931 lnet: handle unlink before send completes

If LNetMDUnlink() is called on an md with md->md_refcount > 0 then
the eq callback isn't called.
There is a scenario where the response times out before the send
completes. So we have a refcount on the MD. The Unlink callback gets
dropped on the floor. Send completes, but because we've already timed
out, the REPLY for the GET is dropped. Now we're left with a peer
that is in the following state:
LNET_PEER_MULTI_RAIL
LNET_PEER_DISCOVERING
LNET_PEER_PING_SENT
But no more events are coming to it, and the discovery never
completes.

This scenario can get RPCs stuck as well if the response times out
before the send completes.

The solution is to set the event status to -ETIMEDOUT to inform
the send event handler that it should not expect a reply.

Lustre-commit: d8fc5c23fe541e0ff6ce5bec6302957714c3f69f
Lustre-change: https://review.whamcloud.com/35444

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ica0e1a823d0d1200bb8cc42a6e058785da1d4fa4
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/45898
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15357 mdd: fix changelog context leak 32/45832/3
Mikhail Pershin [Sat, 11 Dec 2021 12:49:47 +0000 (15:49 +0300)]
LU-15357 mdd: fix changelog context leak

The mdd_changelog_clear() shouldn't skip llog_ctxt_put()
in case of error.
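
The shape of the fix is a release that runs on every exit path. A minimal sketch, with a hypothetical reference-counted context standing in for the llog context and try/finally standing in for the C cleanup path:

```python
class ToyCtxt:
    """Reference-counted context (hypothetical stand-in for an llog context)."""

    def __init__(self):
        self.refs = 0

    def get(self):
        self.refs += 1
        return self

    def put(self):
        self.refs -= 1


def changelog_clear(ctxt, do_clear):
    """Run do_clear() with a context reference held; always drop the reference."""
    ctx = ctxt.get()
    try:
        return do_clear()   # may return a negative errno
    finally:
        ctx.put()           # released on success and on error alike
```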

Lustre-change: https://review.whamcloud.com/45831
Lustre-commit: TBD (from c330a73e4cffb1fb642fadfa38001275251d1f14)

Fixes: 6b183927e1 (LU-14553 changelog: eliminate mdd_changelog_clear warning)
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I9c9aa3ce0d11e8f67470b450d007f2a1081644c6
Reviewed-on: https://review.whamcloud.com/45832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12483 tests: fix sanity test 60h running conditions 93/45993/2
Oleg Drokin [Thu, 6 Jan 2022 21:50:16 +0000 (14:50 -0700)]
LU-12483 tests: fix sanity test 60h running conditions

The test is supposed to run in DNE mode on 2.12.4 or above,
but the conditions are somehow reversed.

Lustre-change: https://review.whamcloud.com/35355
Lustre-commit: dfd64242755b2b993ad6fe177480fb391d6eb6bb

Fixes: 5b1ea58c21e ("LU-11907 dne: allow access to striped dir with broken layout")
Change-Id: I322941a6098b0dbfbabe2f5c70f40f8e81d1bbab
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
2 years agoLU-15009 ofd: continue precreate if LAST_ID is less on MDT 30/45930/2
Lai Siyao [Thu, 16 Sep 2021 21:49:33 +0000 (17:49 -0400)]
LU-15009 ofd: continue precreate if LAST_ID is less on MDT

It's possible that precreate succeeded on the OST, but the MDT didn't get
the reply and assumed failure. In this case the LAST_ID on the MDT is
smaller than that on the OST. Instead of reporting an error and stopping
precreate, it's better to move the precreate window forward.
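
A sketch of moving the window forward, using plain integers for object IDs. The function name and the inclusive-range return value are assumptions made for illustration, not the ofd code.

```python
def precreate_range(mdt_last_id, ost_last_id, count):
    """Return the (first, last) object IDs of the next precreate batch.

    Toy model: if the MDT's view lags the OST's (a reply was lost),
    continue from the OST's LAST_ID instead of failing.
    """
    if mdt_last_id < ost_last_id:
        start = ost_last_id   # move the window forward
    else:
        start = mdt_last_id   # views agree (other cases are out of scope here)
    return start + 1, start + count
```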

Lustre-change: https://review.whamcloud.com/44984
Lustre-commit: 1711e26ae861c28829870c2433caf7ee232909cf

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia6ca418ec0ea6797b7eccc1610879331307fad07
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14688 mdt: changelog purge deletes plain llog 90/43990/4
Alexander Boyko [Mon, 17 May 2021 13:29:01 +0000 (09:29 -0400)]
LU-14688 mdt: changelog purge deletes plain llog

With a massive cancel of records, changelog can delete a whole plain
llog file and skip cancelling the records one by one.
The patch also fixes the race between llog_destroy and llog_next_block.

Lustre-change: https://review.whamcloud.com/43719
Lustre-commit: d813c75df6798efbf3228347628c0d671ca7269c

HPE-bug-id: LUS-9950
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I47c2ed97945e979745255381f83b6a417d7ba8b1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14606 llog: hide ENOENT for cancelling record 72/43572/5
Alexander Boyko [Mon, 12 Apr 2021 12:19:47 +0000 (08:19 -0400)]
LU-14606 llog: hide ENOENT for cancelling record

Llog allows parallel record processing. A record can be cancelled
in the callback. If two threads process and cancel the same record,
one thread gets ENOENT.
The error was observed while purging changelog records. The patch
adds a reproducer, sanity test 160m.

This is a valid case, let's hide ENOENT error from a caller.
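
In outline, the caller-facing wrapper treats "already cancelled" as success. A toy sketch with hypothetical names, using a set of live record indices:

```python
import errno


def cancel_record(live_records, index):
    """Cancel one record; return 0 or a negative errno (toy model)."""
    if index not in live_records:
        return -errno.ENOENT
    live_records.remove(index)
    return 0


def cancel_record_tolerant(live_records, index):
    """Same as cancel_record, but hide ENOENT from the caller."""
    rc = cancel_record(live_records, index)
    # Another thread may have cancelled the same record first; that is fine.
    return 0 if rc == -errno.ENOENT else rc
```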

Lustre-change: https://review.whamcloud.com/43264
Lustre-commit: 0b60647c0382426e3b4105d82d04862d2e4831cb

HPE-bug-id: LUS-9826
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id00b959e6f329c2ad34966f8a17a52f71680f24c
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43572
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13636 obdclass: drop nlink if directory is removed 66/44466/2
Alex Zhuravlev [Fri, 5 Jun 2020 12:15:22 +0000 (15:15 +0300)]
LU-13636 obdclass: drop nlink if directory is removed

To make e2fsck happy.  Otherwise, all the features using
local directories (quota, nodemap, nid tables) can leave
orphaned objects as nlink doesn't drop to 0.

Lustre-change: https://review.whamcloud.com/38844
Lustre-commit: c6d5c6606a38e2b550a81591935b0091faba4a2e

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9e20a304d66c61f312168715e888757bc06b6ed0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/44466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-2233 tests: improve tests sanityn/40-47 91/44391/3
Alex Zhuravlev [Mon, 29 Apr 2019 08:21:13 +0000 (11:21 +0300)]
LU-2233 tests: improve tests sanityn/40-47

sanityn/40-46 usually take 800-900s, which is almost half
of the whole sanityn pass. 99.(9)% of the time the tests just
wait to ensure the specific order the operations execute in.

The patch changes cfs_fail_timeout_set() so that it can
interrupt waiting if fail_loc is set to 0; polling at
1/10s intervals is used.
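
The interruptible wait can be sketched as a polling loop. This is a user-space illustration with hypothetical names, not the libcfs implementation; the `sleep` parameter exists only so the loop can be exercised without real delays.

```python
import time


class FailState:
    """Hypothetical holder for a fail_loc-like value."""

    def __init__(self, fail_loc):
        self.fail_loc = fail_loc


def fail_timeout_wait(state, timeout_s, poll_s=0.1, sleep=time.sleep):
    """Wait up to timeout_s, returning early once fail_loc is reset to 0.

    Returns the number of polling intervals actually slept.
    """
    polls = 0
    max_polls = int(round(timeout_s / poll_s))
    while polls < max_polls and state.fail_loc != 0:
        sleep(poll_s)
        polls += 1
    return polls
```

A test that resets fail_loc once both operations have reached their blocking points pays only for the polls up to that moment instead of the full timeout.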

The tests themselves are modified to reset fail_loc. To be
able to do so, both operations (referred to as OP1 and OP2
in the tests) are run in the background. Once they are started and
the pdo_sched() helper has ensured that the MDS threads got to the
blocking points, we can interrupt OP1 and do the usual checks.

ONLY=40-47 sh sanityn.sh take: 1017s before and 78s after.

Lustre-change: https://review.whamcloud.com/4392
Lustre-commit: 743b85a32e24cff0c77dff739691043970a0901e

LU-12470 tests: increase pdirops timeout

There are pretty regular failures of the sanityn pdirops test_40-47.
Increase the timeout slightly to reduce the frequency of failures.

Lustre-change: https://review.whamcloud.com/37304
Lustre-commit: b35f50c96c608ba650a5b3cf29fa129e01025549

Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn
Test-Parameters: testlist=sanityn,sanityn,sanityn,sanityn,sanityn
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ib8aec2b4517a6f84402ccae66f6d5ceac6d73d85
Reviewed-on: https://review.whamcloud.com/44391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7] 07/45707/2
Jian Yu [Thu, 2 Dec 2021 08:43:38 +0000 (00:43 -0800)]
LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12661 tests: skip sanity 817 for kernel 4.12+ 63/39863/3
Andreas Dilger [Wed, 9 Sep 2020 00:29:06 +0000 (18:29 -0600)]
LU-12661 tests: skip sanity 817 for kernel 4.12+

Skip the NFS exec mode bug for kernels 4.12 and later, since this
is also being hit on SLES12/15 kernel 4.12.14+ and not just 4.14.

Lustre-change: https://review.whamcloud.com/39838
Lustre-commit: 3e2c28437404b0ccbd7bbfb8f77788678975b63d

Test-Parameters: trivial
Fixes: 4fed33473ca2 ("LU-12661 tests: skip sanity 817 if kernel >= 4.14")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibc4ffda72bd7827e250c4583c760505b8f3ebbe5
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39863
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12579 tests: allow some margin in runtests 79/45579/2
Andreas Dilger [Fri, 30 Aug 2019 23:23:50 +0000 (17:23 -0600)]
LU-12579 tests: allow some margin in runtests

Allow some margin in the space used by runtests for internal
log files for Lustre and the underlying filesystem.

Lustre-change: https://review.whamcloud.com/36011
Lustre-commit: c05656557353954b2a9799c4e702329db2d38851

Test-Parameters: trivial testlist=runtests,runtests,runtests
Test-Parameters: mdtcount=4 mdscount=2 testlist=runtests,runtests,runtests
Test-Parameters: fstype=zfs testlist=runtests,runtests,runtests
Test-Parameters: fstype=zfs mdtcount=4 mdscount=2 testlist=runtests,runtests

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I34b47a8436c5718be311698a3f6e6d7af7ea45ad
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45579
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12751 tests: add missing error() 78/45578/3
Alex Zhuravlev [Wed, 11 Sep 2019 14:32:21 +0000 (17:32 +0300)]
LU-12751 tests: add missing error()

nothing else I can say

Lustre-change: https://review.whamcloud.com/36159
Lustre-commit: 78f7b7709f9b45b5faae6e7c7b3093c246a08086

Test-Parameters: trivial

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I040771e57ec6f6c6bfbde5a21358c6747f4f20dc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
2 years agoLU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4] 13/45513/4
Jian Yu [Wed, 17 Nov 2021 20:43:25 +0000 (12:43 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45513
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12410 lnet: Add additional output to sanity-lnet.sh 88/44188/3
Chris Horn [Thu, 19 Sep 2019 19:01:05 +0000 (14:01 -0500)]
LU-12410 lnet: Add additional output to sanity-lnet.sh

Add wrappers around ip netns exec and lnetctl commands to generate
some additional test output. This makes it easier to see what each
test case is doing from the test script output, and aids in debugging
any problems.

Lustre-change: https://review.whamcloud.com/36242
Lustre-commit: 32528a689889989607a34b21efa583429bda1422

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I95b18cb3a090527548a8f9e65845eb4a18dea6d6
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44188
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoNew release 2.12.8 2.12.8 v2_12_8
Oleg Drokin [Thu, 18 Nov 2021 19:04:45 +0000 (14:04 -0500)]
New release 2.12.8

Change-Id: I33decc215454eb6bc85361dfd7d68a11db4113c4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler 68/45568/2
Lei Feng [Tue, 12 Oct 2021 06:33:22 +0000 (14:33 +0800)]
LU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler

It's not necessary to LASSERT() in nrs_polices proc handler.
CERROR() and returning error is good enough.

Lustre-change: https://review.whamcloud.com/45200
Lustre-commit: 9997f94d4b6ee335d2bf86f94bd43464d5b8f061

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I09f06dc4ab90e49b2df66a9b47a74678c64cdd2f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45568
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-9704 grant: ignore grant info on read resend 74/45474/2
Vladimir Saveliev [Wed, 3 Nov 2021 10:52:14 +0000 (13:52 +0300)]
LU-9704 grant: ignore grant info on read resend

The following scenario causes a message like "claims 28672 GRANT, real
grant 0" to appear:

 1. the client owns X grants and runs rpcs to shrink part of those.
 2. the server fails over, so the shrink rpc has to be resent.
 3. on client reconnect, server and client sync on the initial amount
 of grants for the client.
 4. the shrink rpc is resent. If server disk space is sufficient, the
 shrink does not happen and the client adds the amount of grants it
 was going to shrink to its new initial amount of grants. Now the
 client thinks that it owns more grants than it does from the server's
 point of view.
 5. the client consumes grants and sends rpcs to the server. The server
 avoids allocating new grants for the client if the current amount of
 grant is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
        if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
                RETURN(0);
 6. the client continues consuming grants, which eventually leads to
 complaints like "claims 28672 GRANT, real grant 0".

When read and set_info:shrink RPCs are resent, their grant info should
be ignored because it was reset on reconnect.
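
The rule in the last paragraph can be modelled on the client side as
below. This is only a sketch under stated assumptions: struct
client_grant, reply_apply_grant() and the resent flag are hypothetical
names chosen for illustration, not the Lustre structures or functions
touched by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client-side view of its grant; not a Lustre structure. */
struct client_grant {
	uint64_t avail;		/* grant the client believes it owns */
};

/*
 * Apply the grant carried in an RPC reply.
 * If the request was resent after a reconnect, the grant info it
 * carries predates the reconnect sync, so it is ignored.
 * Returns true if the reply's grant was added to cg->avail.
 */
static bool reply_apply_grant(struct client_grant *cg, uint64_t reply_grant,
			      bool resent)
{
	if (resent)
		return false;
	cg->avail += reply_grant;
	return true;
}
```

With this guard the client's view stays at the value agreed on
reconnect, so it cannot drift above what the server has recorded.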

Tests to illustrate the issue are added.

Lustre-change: https://review.whamcloud.com/45371
Lustre-commit: TBD

Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5] 28/45528/3
Jian Yu [Mon, 15 Nov 2021 19:12:16 +0000 (11:12 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.5

Lustre-change: https://review.whamcloud.com/45285
Lustre-commit: TBD (from a1b4ee323ad650d2fdff3754596771dd0c8df507)

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14128 lov: correctly set OST obj size 48/45448/4
Bobi Jam [Wed, 3 Nov 2021 18:19:09 +0000 (14:19 -0400)]
LU-14128 lov: correctly set OST obj size

When extending a PFL file to a size located at a stripe boundary
in a component, the truncate won't set the size of the OST object
in the prior stripe.

This patch records the prior stripe in
lov_layout_raid0::lo_trunc_stripeno, adds the stripe to the
truncate IO, and enqueues the lock covering it.
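
The boundary case can be illustrated with plain RAID0 arithmetic. This
is a minimal sketch, assuming offsets are relative to the start of the
component; stripe_of() and trunc_prior_stripe() are hypothetical
helpers, and the -1 return value is an illustrative convention rather
than how lo_trunc_stripeno is actually encoded.

```c
#include <stdint.h>

/* Stripe index that holds byte offset `off` in a RAID0 layout. */
static int stripe_of(uint64_t off, uint64_t stripe_size, int stripe_count)
{
	return (int)((off / stripe_size) % (uint64_t)stripe_count);
}

/*
 * If the new size falls exactly on a stripe boundary, the last byte of
 * the file lives in the stripe before the one that owns offset `size`.
 * Return that prior stripe index, or -1 if `size` is not on a boundary.
 */
static int trunc_prior_stripe(uint64_t size, uint64_t stripe_size,
			      int stripe_count)
{
	if (size == 0 || size % stripe_size != 0)
		return -1;
	return stripe_of(size - 1, stripe_size, stripe_count);
}
```

For a 1 MiB stripe size and 4 stripes, a new size of 2 MiB maps to
stripe 2, while the last byte of the file sits in stripe 1, which is
the object whose size would otherwise be left unset.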

Lustre-change: https://review.whamcloud.com/40581
Lustre-commit: 98015004516cad1173e2bac2a4695bdc56e4d9a4

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic5d8e3c16f950003736cd6dbd5af404613f818c7
Reviewed-on: https://review.whamcloud.com/45448
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14543 target: prevent overflowing of tgd->tgd_tot_granted 90/45490/2
Vladimir Saveliev [Fri, 19 Mar 2021 12:08:47 +0000 (15:08 +0300)]
LU-14543 target: prevent overflowing of tgd->tgd_tot_granted

If tgd->tgd_tot_granted < ted->ted_grant then there should not be:
   tgd->tgd_tot_granted -= ted->ted_grant;
which breaks tgd->tgd_tot_granted.
In case of obvious ted->ted_grant damage, recalculate
tgd->tgd_tot_granted using list of exports.

The same change is made for tgd->tgd_tot_dirty.

This patch also adds a sanity check for the
exp->exp_target_data.ted_grant increase in tgt_grant_alloc() to catch
grant counting corruption as soon as it happens.
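
The guarded subtraction and the fallback recalculation can be sketched
as follows. The names here (struct export_grant, total_recalc(),
total_drop_export()) are hypothetical stand-ins for the per-export
data and the export list; the sketch only shows the arithmetic, not
the locking or the actual Lustre code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-export record standing in for the export list. */
struct export_grant {
	uint64_t grant;
};

/* Sum the grant of every export; used to rebuild a damaged total. */
static uint64_t total_recalc(const struct export_grant *exps, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += exps[i].grant;
	return sum;
}

/*
 * Remove exps[idx] from the running total. If its grant is larger than
 * the total, an unsigned subtraction would wrap around, so the total is
 * rebuilt from the remaining exports instead.
 */
static uint64_t total_drop_export(uint64_t total, struct export_grant *exps,
				  size_t n, size_t idx)
{
	uint64_t g = exps[idx].grant;

	exps[idx].grant = 0;
	if (total < g)
		return total_recalc(exps, n);
	return total - g;
}
```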

Lustre-change: https://review.whamcloud.com/45474
Lustre-commit: bb5d81ea95502fb5709e176b561b70aa5280ee07

Fixes: af2d3ac30e ("LU-11939 tgt: Do not assert during grant cleanup")
Change-Id: I36ba7496f7b72b4881e98c06ec254a8eefd4c13f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-11939 tgt: Do not assert during grant cleanup 89/45489/3
Patrick Farrell [Fri, 8 Feb 2019 17:14:06 +0000 (12:14 -0500)]
LU-11939 tgt: Do not assert during grant cleanup

Client/server grant inconsistencies discovered during
cleanup are indicative of a bug, but any problems they
would cause have already occurred at this point.

So do not assert during this cleanup.

Lustre-change: https://review.whamcloud.com/34215
Lustre-commit: af2d3ac30eafead6b47c5db20d76433c091d89de

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic9b827b1005bc321a290505a368349699ddf2f38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15184 llite: properly detect SELinux disabled case 27/45527/3
Sebastien Buisson [Mon, 15 Nov 2021 19:06:31 +0000 (11:06 -0800)]
LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. RHEL 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
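
The two cases can be folded into one predicate. This is a sketch only:
selinux_looks_disabled() is a hypothetical helper, rc models the
return code of security_dentry_init_security(), and ctx_len models the
length of the security context it filled in.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Treat SELinux as disabled when the hook reports -EOPNOTSUPP, or when
 * it succeeds but hands back an empty security context.
 * Any other error is a real failure and is not treated as "disabled".
 */
static bool selinux_looks_disabled(int rc, size_t ctx_len)
{
	return rc == -EOPNOTSUPP || (rc == 0 && ctx_len == 0);
}
```

A caller would leave the security context name unset whenever this
predicate is true, which is the condition the rest of the code is said
to use for "SELinux is disabled".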

Lustre-change: https://review.whamcloud.com/45501
Lustre-commit: TBD (from 85779753abe0451e2b0b82dcf5d4a4d111b0bfb8)

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14413 test: test for overstriping for sanity 27M 54/44354/7
James Simmons [Wed, 28 Jul 2021 00:10:29 +0000 (20:10 -0400)]
LU-14413 test: test for overstriping for sanity 27M

The introduction of sanity 27M broke interop with 2.12 LTS since
over striping doesn't exist in that version. Adjust the test to
use over striping if the client supports it, otherwise just use
traditional striping.

Lustre-change: https://review.whamcloud.com/44340
Lustre-commit: 4e1f9c4bd1d96063a1fbb2dfaab41b15836167ab

Test-Parameters: trivial testlist=sanity env=ONLY=27M
Change-Id: I2d788a116cbb749a83d6cec36f97d06533b32421
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44340
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44354
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-14598 ofd: fix for IDIF sequence at ofd_preprw_write 41/43541/2
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write

During recovery, a write operation could create and load a sequence
if it arrives before the creation request from MDT0. ofd_preprw_write()
uses the wrong logic to pick the sequence for IDIF FIDs: if the oid
overflows 32 bits and spills into the IDIF sequence, the write request
loads the wrong ofd sequence, which is then used for other IO. The next
create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...

Test 122b reproduces the issue where the OST uses a wrong sequence for
the MDT0 IDIF. This error requires an object id greater than 32 bits
and a write request during recovery that is processed before the
create request from MDT0.
For a visible error on the console, the last object id should be
1<<32 + (OST_MAX_PRECREATE * 5). The error is
lustre-OST0000: Too many FIDs to precreate OST replaced or
    reformatted: LFSCK will clean up
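
How an object id larger than 32 bits spills into the sequence can be
sketched as below. This assumes an IDIF layout in which the sequence
is a base value OR-ed with the OST index shifted left by 16 and with
bits 32..47 of the object id; SKETCH_SEQ_IDIF and sketch_idif_seq()
are illustrative names, and both the constant and the layout should be
checked against the Lustre headers before relying on them.

```c
#include <stdint.h>

/* Assumed base of the IDIF sequence range (illustrative constant). */
#define SKETCH_SEQ_IDIF 0x100000000ULL

/*
 * Hypothetical helper: sequence for object id `id` on OST `ost_idx`.
 * The low 16 bits of the sequence carry bits 32..47 of the object id,
 * so two ids on the same OST can belong to different sequences.
 */
static uint64_t sketch_idif_seq(uint64_t id, uint32_t ost_idx)
{
	return SKETCH_SEQ_IDIF | ((uint64_t)ost_idx << 16) |
	       ((id >> 32) & 0xffff);
}
```

Under these assumptions, every id below 1<<32 on OST 0 maps to the
base sequence, while an id just above 1<<32 maps to the base sequence
plus one. Code that ignores the upper id bits therefore loads a
different sequence than the one MDT0 precreates in.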

Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40

HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43541
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14565 ofd: Do not rely on tgd_blockbit 55/43955/9
Arshad Hussain [Mon, 29 Mar 2021 05:22:11 +0000 (10:52 +0530)]
LU-14565 ofd: Do not rely on tgd_blockbit

tgd_blockbit holds the recordsize bits set during mkfs.
Once set, it does not change. However, 'zfs set'
can be used to change the OST blocksize. Instead
of using the cached value of 'tgd_blockbit', always
calculate the blocksize bits, which may have
changed.
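
Recomputing the bits from the current block size is cheap. The helper
below is a minimal sketch with a hypothetical name, blocksize_bits();
it assumes the block size is a power of two and is not the function
used by the patch.

```c
#include <stdint.h>

/*
 * Number of bits for a power-of-two block size, i.e. floor(log2(size)).
 * Called with the block size as it is now, not a value cached at mkfs.
 */
static unsigned int blocksize_bits(uint32_t blocksize)
{
	unsigned int bits = 0;

	while (blocksize > 1) {
		blocksize >>= 1;
		bits++;
	}
	return bits;
}
```

If the recordsize is changed from 128 KiB to 1 MiB with 'zfs set', the
result moves from 17 to 20, whereas a value cached at mkfs time would
still say 17.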

Test-case: sanity/104c added

Conflicts:
lustre/mdt/mdt_handler.c

Lustre-change: https://review.whamcloud.com/43154/
Lustre-commit: 8ee6e1c8825c4fabfd6c39db11081839ca53d454

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icc100cca0d5ae492c41d60f0bf97512450f796bc
Reviewed-on: https://review.whamcloud.com/43955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13054 ldiskfs: split htree_lock as separate patch 21/44121/3
Yang Sheng [Sun, 26 Apr 2020 11:59:16 +0000 (19:59 +0800)]
LU-13054 ldiskfs: split htree_lock as separate patch

The htree_lock part is identical in the different
distro versions of the pdirop patch. So move it out as
a separate patch to reduce maintenance effort.

Lustre-change: https://review.whamcloud.com/38372
Lustre-commit: 42880f9502ba57b7ee35559d7b07d2f1a3adec72

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I423cc957de37ccdb097c9893f69481ce947ac78c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13054 ldiskfs: htree_node wrongly granted 20/44120/3
Yang Sheng [Sun, 26 Apr 2020 11:56:40 +0000 (19:56 +0800)]
LU-13054 ldiskfs: htree_node wrongly granted

The thread was woken up accidentally, so we need to check
whether the lock was granted or not after wakeup.
Also fix the issue that major is always set to 0 because
hbit is initialized incorrectly. Performance should be
impacted, especially when operating in a big directory.

kernel BUG at lustre/ldiskfs/htree_lock.c:429!
 Call Trace:
 htree_node_release_all+0x5a/0x80 [ldiskfs]
 htree_unlock+0x22/0x70 [ldiskfs]
 osd_index_ea_delete+0x30e/0xb10 [osd_ldiskfs]
 lod_sub_delete+0x1c8/0x460 [lod]
 lod_delete+0x24/0x30 [lod]
 __mdd_index_delete_only+0x194/0x250 [mdd]
 __mdd_index_delete+0x46/0x290 [mdd]
 mdd_unlink+0x5f8/0xaa0 [mdd]
 mdo_unlink+0x46/0x48 [mdt]
 mdt_reint_unlink+0xbed/0x14b0 [mdt]

Lustre-change: https://review.whamcloud.com/38371
Lustre-commit: 4597a2b4fc33711f66eb1c21fc125d028bd3f2ec

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I5972961bc78b349214c6756642717d126f0c4b26
Reviewed-on: https://review.whamcloud.com/44120
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7] 54/45354/2
Jian Yu [Mon, 25 Oct 2021 18:47:37 +0000 (11:47 -0700)]
LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.45.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I11c307bfd6a6b353bc7b6fe40bb5d604bc9b3fdc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45354
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10) 55/45355/2
Arshad Hussain [Mon, 25 Oct 2021 18:51:50 +0000 (11:51 -0700)]
LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10)

ZFS: (2.0.0-1)
Lustre: 608cce73d51 LU-15007 tests: quota enable cmd fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes the two build failures shown below for
the above configuration

First
~~~~~
In file included from:
/root/zfs/zfs_git_lustre_build/zfs/include/sys/spa.h:39:0,
from libmount_utils_zfs.c:32:
/root/zfs/<path>/.../sys/zfs_context.h:110:27:
fatal error: sys/byteorder.h: No such file or directory
#include <sys/byteorder.h>

Second
~~~~~~
gcc -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libnvpair/.libs
-o mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzpool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
collect2: error: ld returned 1 exit status

Lustre-change: https://review.whamcloud.com/45016
Lustre-commit: 8931f7e4e5da39389a79eff11dc04bb468beb715

Change-Id: Iaf868391e414deb7ac8df43847250bbcd0115d5e
Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45355
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14124 target: set OBD_MD_FLGRANT in read's reply 71/45471/2
Vladimir Saveliev [Wed, 20 Oct 2021 10:32:11 +0000 (13:32 +0300)]
LU-14124 target: set OBD_MD_FLGRANT in read's reply

If tgt_grant_shrink() decides not to shrink grants, a client is
supposed to restore its cl_grant_avail in osc_update_grant(). In case
of a read, OBD_MD_FLGRANT is not set in the reply's body->oa.o_valid,
so osc_update_grant() misses the cl_grant_avail update. As a result,
the server keeps thinking that the client has a lot of grants while
the client thinks that it is badly short of grants. That may lead to
performance degradation.
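
The dependency on the flag can be sketched as below. All names are
hypothetical: SKETCH_MD_FLGRANT is an illustrative bit and not the
real OBD_MD_FLGRANT value, and struct sketch_reply only models the
o_valid and grant fields of a reply body.

```c
#include <stdint.h>

/* Illustrative flag bit; not the real OBD_MD_FLGRANT value. */
#define SKETCH_MD_FLGRANT (1ULL << 0)

struct sketch_reply {
	uint64_t valid;	/* which fields of the reply are meaningful */
	uint64_t grant;	/* grant handed back by the server */
};

/* Client side: add the reply's grant only if the flag marks it valid. */
static uint64_t sketch_update_grant(uint64_t avail,
				    const struct sketch_reply *r)
{
	if (r->valid & SKETCH_MD_FLGRANT)
		avail += r->grant;
	return avail;
}
```

A read reply that carries grant but leaves the flag clear is silently
skipped by this kind of check, which is why the server has to set the
flag for the client to get its grant back.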

A test to illustrate the issue is included.

Lustre-change: https://review.whamcloud.com/43375
Lustre-commit: 4894683342d77964daeded9fbc608fc46aa479ee

Test-Parameters: testlist=sanity
Change-Id: Ibe7ce0af5701226c8be3ae3f9ad57c354791fa0f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45471
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2] 64/45364/2
Jian Yu [Mon, 25 Oct 2021 23:40:08 +0000 (16:40 -0700)]
LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2]

Update SLES12 SP5 kernel to 4.12.14-122.91.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ia6620869fa84d72f8d22c4a8a039600037ddb2d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45364
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14696 llite: check read only mount for setquota 23/44923/3
Hongchao Zhang [Wed, 15 Sep 2021 11:44:23 +0000 (19:44 +0800)]
LU-14696 llite: check read only mount for setquota

Setting quota should fail if the mount is read-only.

Lustre-change: https://review.whamcloud.com/43765
Lustre-commit: 29e00cecc6019fbdb5bd98511970970ac5ef5318

Change-Id: I966ac71d0a4a72dcb998f09ffc0f99ae28498e27
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44923
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4] 51/44951/2
Jian Yu [Thu, 16 Sep 2021 00:53:27 +0000 (17:53 -0700)]
LU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.19.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Icedc6cf2a5678cfbce76c47507137c0ea41d0b06
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44951
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7] 76/44876/2
Jian Yu [Thu, 9 Sep 2021 00:38:05 +0000 (17:38 -0700)]
LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.42.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9 \
testlist=sanity

Change-Id: I377ea5d1e28c50b1087dfca7cb32f44afb9bf5f5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1] 63/44863/2
Jian Yu [Tue, 7 Sep 2021 19:56:49 +0000 (12:56 -0700)]
LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1]

Update SLES12 SP5 kernel to 4.12.14-122.83.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I2b35d129550b895324bb3e2e61910ad10e846f03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-11546 utils: enable large_dir for ldiskfs 81/36781/6
Li Dongyang [Wed, 23 Oct 2019 00:10:34 +0000 (11:10 +1100)]
LU-11546 utils: enable large_dir for ldiskfs

Format MDT with "large_dir" option by default,
to get over the 10M-entry limit for the directories.

Lustre-change: https://review.whamcloud.com/36555
Lustre-commit: cd1faa0124f21e12a5ecd83c709c13918264fc86

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie51e6ce28b5f00adc9958de24794a760d9b43b77
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36781
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12627 ofd: reset fti_attr in ofd_lvbo_update() 69/44269/5
Wang Shilong [Sat, 3 Aug 2019 06:27:22 +0000 (14:27 +0800)]
LU-12627 ofd: reset fti_attr in ofd_lvbo_update()

This patch tries to fix the following panic:

(ofd_internal.h:440:tsi2ofd_info()) ASSERTION( info->fti_attr.la_valid == 0 ) failed:
(ofd_internal.h:440:tsi2ofd_info()) LBUG
[ 5321.108598] Call Trace:
[ 5321.109347]  [<ffffffffc06fc8bc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 5321.111342]  [<ffffffffc06fc96c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5321.113026]  [<ffffffffc147631a>] ofd_preprw+0xcfa/0x1160 [ofd]
[ 5321.114643]  [<ffffffffc0bb934c>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[ 5321.116373]  [<ffffffffc0bbc50a>] tgt_request_handle+0x91a/0x15c0 [ptlrpc]
[ 5321.118230]  [<ffffffffc0b61636>] ptlrpc_server_handle_request+0x256/0xb00 [ptlrpc]
[ 5321.120318]  [<ffffffffc0b6516c>] ptlrpc_main+0xbac/0x1560 [ptlrpc]
[ 5321.122001]  [<ffffffff84cc1c31>] kthread+0xd1/0xe0
[ 5321.123023]  [<ffffffff85374c37>] ret_from_fork_nospec_end+0x0/0x39
[ 5321.124066]  [<ffffffffffffffff>] 0xffffffffffffffff

If this is a server lock, tgt_brw_lock() will finally call
ofd_lvbo_update() upon lock cancellation, which will use @fti_attr
and pollute its value:

|->ptlrpc_main
 |->lu_context_enter(le_ctx)
  |->tgt_brw_write
   |->tgt_brw_lock
    |->tgt_extent_lock
     |->ldlm_cli_enqueue_local
      |->ldlm_lock_enqueue
       |->ldlm_run_ast_work
        |->ptlrpc_check_set
          |->ldlm_cb_interpret
           |->ldlm_handle_ast_error
            |->ofd_lvbo_update
             |->ofd_attr_get polluted @info->fti_attr

  |->tgt_brw_write
   |->ofd_preprw
    |->tsi2ofd_info
      |->ASSERTION(info->fti_attr.la_valid == 0)

 |->lu_context_exit(le_ctx)--->memset @fti_attr

To fix this problem, reset fti_attr->la_valid before
ofd_lvbo_update() returns, just like ofd_lvbo_init() does.
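
The pattern is a callee that borrows per-thread scratch space and
must hand it back clean. The sketch below uses hypothetical types and
functions (struct sketch_attr, struct sketch_info,
sketch_lvbo_update(), sketch_info_is_clean()); only the field names
fti_attr and la_valid are taken from the description above.

```c
#include <stdint.h>

struct sketch_attr {
	uint64_t la_valid;	/* non-zero while the attr holds live data */
	uint64_t la_size;
};

struct sketch_info {
	struct sketch_attr fti_attr;	/* per-thread scratch area */
};

/* Models a callback that borrows the scratch attr while it runs. */
static uint64_t sketch_lvbo_update(struct sketch_info *info, uint64_t size)
{
	uint64_t result;

	info->fti_attr.la_valid = 1;	/* scratch area in use */
	info->fti_attr.la_size = size;
	result = info->fti_attr.la_size;
	info->fti_attr.la_valid = 0;	/* reset before returning */
	return result;
}

/* Models the later consumer that asserts the scratch area is clean. */
static int sketch_info_is_clean(const struct sketch_info *info)
{
	return info->fti_attr.la_valid == 0;
}
```

Without the reset line, the check in the consumer would fail as soon
as the callback had run earlier in the same thread context.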

Lustre-change: https://review.whamcloud.com/35685
Lustre-commit: 8ffbe6b82fac1d3e4d4391bcba74dc2ee1411a69

Change-Id: Ib6b448dd21603cfe0305d8425862a96ef3f7fee8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44269
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14876 out: don't connect to busy MDS-MDS export 62/44362/6
Mikhail Pershin [Wed, 21 Jul 2021 15:14:01 +0000 (18:14 +0300)]
LU-14876 out: don't connect to busy MDS-MDS export

The MDS-MDS connection is missing a check for busy requests upon
reconnect, so a resent request can be executed concurrently with the
original request.

- in ptlrpc_server_check_resend_in_progress() remove exception
  for bulk requests, they can be compared by XID nowadays.
  This prevents OUT requests vs resent execution as well.
- fix messages in target_handle_connect() to report correct
  information about connection details
- in out_handle() check for last_xid only once per OUT_UPDATE
- test 110m is added to recovery-small to reproduce the issue
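
Comparing by XID amounts to a lookup among the requests that are
still being handled. The helper below is a simplified sketch with a
hypothetical name and a flat array in place of the export's request
list; it is not the logic of ptlrpc_server_check_resend_in_progress().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: return true if a request with the same XID is
 * still being handled, in which case the resent copy must not be
 * executed concurrently with it.
 */
static bool sketch_resend_in_progress(const uint64_t *busy_xids, size_t n,
				      uint64_t xid)
{
	for (size_t i = 0; i < n; i++)
		if (busy_xids[i] == xid)
			return true;
	return false;
}
```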

Lustre-change: https://review.whamcloud.com/44390
Lustre-commit: 301d76a71176c186129231ddd1323bae21100165

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2ad183674d59a2cdeab0037bd8551c607b10ffeb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-11518 ldlm: cancel LRU improvement 07/41007/3
Vitaly Fertman [Wed, 16 Dec 2020 16:54:10 +0000 (11:54 -0500)]
LU-11518 ldlm: cancel LRU improvement

Add a @batch parameter to cancel LRU, which means that if at least 1
lock is cancelled, try to cancel at least a batch of locks. This
functionality will be used in later patches.

Limit the LRU cancel by 1 thread only, however, not for those which
have the @max limit given (ELC), as LRU may be left not cleaned up
in full.

Lustre-change: https://review.whamcloud.com/39561
Lustre-commit: 3d4b5dacb3053f39d79d59860a903a19e76b9318

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ide21c4a2b2209b8a721249466ea1e651c8532c8a
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157067
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41007
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
2 years agoLU-11768 test: make at_max to take effect 45/41345/2
Hongchao Zhang [Thu, 10 Oct 2019 20:22:25 +0000 (16:22 -0400)]
LU-11768 test: make at_max to take effect

In test_6 of sanity-quota, the "at_max" won't affect
the "at_current" if there is no RPC to be sent in that
import, so the following DQACQ request still has a
larger timeout value and triggers the watchdog.

Lustre-change: https://review.whamcloud.com/36431
Lustre-commit: 550af84a91505c85824ffad2990d31c8e8ab4dd9

Fixes: d8226b93 ("LU-11768 test: limit at_max to timeout in time")
Test-Parameters: trivial testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iccc969459647aa70da6f6ecb0d8d13a404bf8088
Reviewed-on: https://review.whamcloud.com/41345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13423 tests: cleanup_netns correctly set result 03/44203/3
Shaun Tancheff [Tue, 7 Apr 2020 23:05:06 +0000 (18:05 -0500)]
LU-13423 tests: cleanup_netns correctly set result

The existence test for 'test1pl' should not result in
cleanup_netns returning failure to the caller.

A slightly more terse if/else can be used to ensure the
caller is notified of failure only in the case of
test1pl not being deleted.

Lustre-change: https://review.whamcloud.com/38157
Lustre-commit: 410b655c71849e5a26251f7c187b19ed8f504bd7

Test-Parameters: trivial
HPE-bug-id: LUS-8713

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I85dee20ec0f0ccd0be17597431fcedda9469d9da
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44203
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-14204 tests: make sure we have a single import 98/40998/2
Sebastien Buisson [Wed, 9 Dec 2020 17:53:12 +0000 (18:53 +0100)]
LU-14204 tests: make sure we have a single import

In sanity, retrieve the exact name of the import being used on the
client, in order to properly get information such as lock_count
or lru_size.

Change-Id: I065b7da7990c7171d5baa24f3400c5f8ffc12fc9
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/40998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14098 obdclass: try to skip corrupted llog records 96/44396/2
Alex Zhuravlev [Mon, 26 Jul 2021 06:18:06 +0000 (09:18 +0300)]
LU-14098 obdclass: try to skip corrupted llog records

If an llog's header or record is found to be corrupted,
ignore the remaining records and try with the next one.

Lustre-commit: 910eb97c1b43a44a9da2ae14c3b83e28ca6342fc
Lustre-change: https://review.whamcloud.com/40754

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86a682a8874a2184e8891ff0ee8a68414d232a79
Reviewed-on: https://review.whamcloud.com/44396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14733 o2iblnd: Avoid double posting invalidate 17/44217/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:01 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Avoid double posting invalidate

When the kib_tx is provisioned during kiblnd_fmr_pool_map(), spare
WRs in the kib_fast_reg_descriptor are setup and the mapping of
pages is given to the mr.

kiblnd_post_tx_locked() then posts the spare WRs from the
kib_fast_reg_descriptor.

	if (rc == 0)
		return 0;

The code returns and the kib_fast_reg_descriptor still contains
the spare WRs. The next time the kib_tx is used, the
now obsolete WRs will be inadvertently posted. For rdmavt, the
obsolete invalidate will cause -EINVAL to be returned from
the post send.

Fix by adding a state variable frd_posted to the kib_fast_reg_descriptor.
The variable is set to false in kiblnd_fmr_pool_unmap().
kiblnd_post_tx_locked() is adjusted to avoid prepending the
kib_fast_reg_descriptor WRs when frd_posted is true.   After
the post succeeds, the frd_posted is set to true.
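
The flag logic described above can be sketched in isolation. This is only
an illustrative model, not the o2iblnd code: struct frd_sketch, frd_map(),
frd_unmap() and frd_post() are hypothetical names, the count of two spare
WRs is an assumption, and the post is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for kib_fast_reg_descriptor. */
struct frd_sketch {
	int  spare_wrs;   /* spare WRs prepared at map time (assumed: 2) */
	bool frd_posted;  /* true once the spare WRs have been posted */
};

/* Map: prepare the spare WRs for the next post. */
static void frd_map(struct frd_sketch *frd)
{
	frd->spare_wrs = 2;
}

/* Unmap: the descriptor may be reused, so allow its WRs to be posted again. */
static void frd_unmap(struct frd_sketch *frd)
{
	frd->frd_posted = false;
}

/*
 * Post: returns how many WRs would be handed to the send queue.
 * The spare WRs are prepended only if they have not been posted yet.
 */
static int frd_post(struct frd_sketch *frd, int tx_wrs)
{
	int posted = tx_wrs;

	assert(tx_wrs > 0);
	if (!frd->frd_posted)
		posted += frd->spare_wrs;

	/* Assumption: the post succeeded; a real post send can fail. */
	frd->frd_posted = true;
	return posted;
}
```

In this model a second frd_post() on the same mapping posts only the tx's
own WRs, which is the double posting the patch is meant to avoid.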

Lustre-change: https://review.whamcloud.com/44190
Lustre-commit: 5930576791e864529e6ef9b46f3e09cc4b635fc2

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: I426dd05e635392e75d1aa48808782a229e83ce5f
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44217
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7] 77/44377/2
Jian Yu [Thu, 22 Jul 2021 07:31:50 +0000 (00:31 -0700)]
LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.36.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ie2898b1df28c8b99ea4099e94baafe388c6aa626
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14733 o2iblnd: Move racy NULL assignment 16/44216/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:00 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Move racy NULL assignment

kiblnd_fmr_pool_unmap() can race map and subsequent processing
because of this flaw in unmap:

	if (frd) {
		frd->frd_valid = false;
		spin_lock(&fps->fps_lock);
		list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
		spin_unlock(&fps->fps_lock);
		fmr->fmr_frd = NULL;
	}

The fmr can be pulled off the list in kiblnd_fmr_pool_unmap() on
another CPU and fmr_frd could be in a state of flux and
potentially be seen incorrectly later on as the kib_tx is processed.

Fix by moving the fmr_frd assignment to before the fmr is added to the
list.
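
The reordering can be sketched as follows. This is a user-space model under
stated assumptions, not the actual patch: the *_sketch types and
pool_unmap() are hypothetical, a pthread mutex stands in for the fps_lock
spinlock, and the entry is pushed on the head of a singly linked list for
brevity where the original uses list_add_tail().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the o2iblnd structures. */
struct frd_sketch {
	bool frd_valid;
	struct frd_sketch *next;       /* link in the pool's free list */
};

struct pool_sketch {
	pthread_mutex_t lock;          /* stands in for fps_lock */
	struct frd_sketch *free_list;  /* stands in for fpo_pool_list */
};

struct fmr_sketch {
	struct frd_sketch *fmr_frd;
};

static void pool_unmap(struct pool_sketch *pool, struct fmr_sketch *fmr)
{
	struct frd_sketch *frd;

	assert(pool != NULL && fmr != NULL);
	frd = fmr->fmr_frd;
	if (frd == NULL)
		return;

	frd->frd_valid = false;
	/*
	 * Clear the pointer before publishing the entry: once it is on
	 * the list, another CPU may take it at any time.
	 */
	fmr->fmr_frd = NULL;

	pthread_mutex_lock(&pool->lock);
	frd->next = pool->free_list;
	pool->free_list = frd;
	pthread_mutex_unlock(&pool->lock);
}
```

The point of the ordering is that nothing touches fmr->fmr_frd after the
entry becomes visible to other threads through the list.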

Lustre-change: https://review.whamcloud.com/44189
Lustre-commit: 023113fb8946f3565529e7327fdcd90ab9db3ba3

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: Ibddf132a363ecfe9db3cc06287cec873c021d2fb
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13729 osd-ldiskfs: race access to iam_formats during setup 56/44356/2
Wang Shilong [Tue, 30 Jun 2020 01:12:48 +0000 (09:12 +0800)]
LU-13729 osd-ldiskfs: race access to iam_formats during setup

During OST mounting, two targets may reach iam_format_guess()
at the same time. If @initialized is 0, they both call
iam_lxx_format_init(), but the list operations inside are not
protected by any locking, which eventually causes list
corruption.

Fix this by registering the formats in module init. Since there
are only two formats, just remove the pointless list.
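
A minimal sketch of the idea, assuming a fixed table can replace the list.
The names fmt_sketch, guess_a(), guess_b() and format_guess() and the magic
values are hypothetical and do not come from the osd-ldiskfs code. The
sketch goes one step further than module-init registration by building the
table at compile time, so nothing is modified at mount time and no lock is
needed for lookups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format descriptor; the real iam_format differs. */
struct fmt_sketch {
	const char *name;
	int (*guess)(unsigned int magic);  /* nonzero if the format matches */
};

/* Made-up magic values, for illustration only. */
static int guess_a(unsigned int magic) { return magic == 0xA1; }
static int guess_b(unsigned int magic) { return magic == 0xB2; }

/* Fixed, read-only table: concurrent mounts only ever read it. */
static const struct fmt_sketch formats[] = {
	{ "format_a", guess_a },
	{ "format_b", guess_b },
};

/* Return the first format whose guess() accepts the magic, or NULL. */
static const struct fmt_sketch *format_guess(unsigned int magic)
{
	size_t i;

	for (i = 0; i < sizeof(formats) / sizeof(formats[0]); i++)
		if (formats[i].guess(magic))
			return &formats[i];
	return NULL;
}
```

With only two entries a linear scan is enough, and the lazy @initialized
check that allowed the race is no longer needed.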

Lustre-change: https://review.whamcloud.com/39213
Lustre-commit: 54d0f5de911af52e7f2a978c4b6cd158fed87dc5

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I6dd5a4d1297792b47fb4b94052465a7e0f9123aa
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/44356
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12836 osd-zfs: Catch all ZFS pool change events 29/43929/3
Tony Hutter [Fri, 12 Mar 2021 01:23:16 +0000 (17:23 -0800)]
LU-12836 osd-zfs: Catch all ZFS pool change events

This change adds the following symlinks:

  vdev_attach-lustre -> statechange-lustre.sh
  vdev_remove-lustre -> statechange-lustre.sh
  vdev_clear-lustre -> statechange-lustre.sh

This makes it so the statechange-lustre.sh script is also called on
all ZFS events that could change the pool state.

Lustre-change: https://review.whamcloud.com/43552
Lustre-commit: e11a47da71a2e2482e4c4cf582d663cd76a2ecab

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Change-Id: I18edc86749e8ab91bb45f21aafd3fd47e78cbaef
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>