Whamcloud - gitweb
Wang Shilong [Thu, 26 Sep 2019 13:21:13 +0000 (21:21 +0800)]
LU-12777 test: fix to pass facet to facet_fstype
Function facet_fstype() expects a facet name such as mgs1 or mds1
as its argument, but we wrongly passed $mds1, which causes the
following error:
line 1192: lustre-ost1/ost1_FSTYPE: bad substitution
As a result, we fail to detect that this is a ZFS-based OSD, the
pool reimport is skipped, and the mount fails.
Lustre-change: https://review.whamcloud.com/36298
Lustre-commit: 38c8fdfde3953f239bd3d86a91a3213737231ce5
Test-Parameters: trivial clientdistro=el8 testlist=conf-sanity \
fstype=zfs envdefinitions=ONLY=103
Test-Parameters: trivial clientdistro=el8 testlist=conf-sanity \
fstype=ldiskfs envdefinitions=ONLY=103
Change-Id: Id8fd5b9f17e666614e83e5c1a2399fde8b91b023
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Thu, 19 Sep 2019 10:26:31 +0000 (18:26 +0800)]
LU-12229 tests: fix "bad substitution" error
In newer bash versions, special characters are invalid in
indirect variable expansion ${!word}. For example,
# a=lustre,pool
# echo ${!a}
-bash: lustre,pool: bad substitution
To avoid the "bad substitution" error, the pool_new command is used
directly in test_1j and test_1k.
Lustre-change: https://review.whamcloud.com/36243
Lustre-commit: ac426d6f17b80ed36052f11b9780fa444cfa24aa
Test-Parameters:trivial clientdistro=el8 testlist=ost-pools
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Ifce4616cd7f314416fe5fa09f8fba846ae45bcef
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Wed, 19 Dec 2018 23:55:49 +0000 (15:55 -0800)]
LU-11816 lnet: setup health timeout defaults
Enable health feature by default.
Set the transaction timeout to a default of 10 seconds and the
retry count to 3 when health is enabled. When health
is disabled, set the default transaction timeout to 50.
When toggling between health enabled/disabled the defaults
will always kick in.
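A minimal sketch of the toggle behaviour described above, assuming a hypothetical tunables structure. The struct, field and function names are illustrative and are not LNet's actual symbols; only the values 10, 3 and 50 come from this message. The message gives no retry default for the disabled case, so the sketch leaves that field untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tunables; names are illustrative, not LNet's symbols. */
struct health_tunables {
	bool health_enabled;
	unsigned int transaction_timeout;	/* seconds */
	unsigned int retry_count;
};

enum {
	TIMEOUT_HEALTH_ON  = 10,	/* stated in the commit message */
	RETRY_HEALTH_ON    = 3,		/* stated in the commit message */
	TIMEOUT_HEALTH_OFF = 50,	/* stated in the commit message */
};

/* Toggling health always re-applies the defaults for the new mode. */
static void set_health(struct health_tunables *t, bool enable)
{
	assert(t != NULL);
	t->health_enabled = enable;
	if (enable) {
		t->transaction_timeout = TIMEOUT_HEALTH_ON;
		t->retry_count = RETRY_HEALTH_ON;
	} else {
		t->transaction_timeout = TIMEOUT_HEALTH_OFF;
		/* No retry default is given for this mode; left as is. */
	}
}
```

Calling set_health() after a manual change to the timeout discards that change, which is what "the defaults will always kick in" implies.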
Lustre-change: https://review.whamcloud.com/34252
Lustre-commit: 8632e94aeb7e62da07f342a9897d15dfd8251148
This is a new commit for the previously reverted commit
https://review.whamcloud.com/#/c/36031/
Change-Id: I359f9cc6c93b5f7d0b58df1abdd29ae0bffd4faf
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36382
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Mon, 27 May 2019 17:43:10 +0000 (10:43 -0700)]
LU-12344 lnet: handle remote health error
When a peer is dead set the health status to REMOTE_DROPPED
in order to handle health properly for the peer.
When dropping a routed message set REMOTE_ERROR. Routed messages
are dropped when the routing feature is turned off which could
be considered a configuration error if it happens in the middle
of traffic. Therefore, it's better to flag this issue at this
point without resending the message.
Lustre-change: https://review.whamcloud.com/34967
Lustre-commit: b45e3d96fc4d82ebf5b1bb3ef0b5a59e8ff86e75
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I131263215a68fc8607582643a47007ce4d04abbc
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36030
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Fri, 5 Oct 2018 00:18:20 +0000 (17:18 -0700)]
LU-11478 lnet: misleading discovery seqno.
There is a sequence number used when sending discovery messages. This
sequence number is intended to detect stale messages. However it
could be misleading if the peer reboots. In this case the peer's
sequence number will reset. The node will think that all information
being sent to it is stale, while in reality the peer might've
changed configuration.
There is no reliable way to know whether a peer rebooted, so we'll
always assume that the messages we're receiving are valid and
operate on a first-come, first-served basis.
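The difference between the two policies can be sketched in user space as follows. This is an illustration under assumptions, not the LNet discovery code: the peer record, its fields and both function names are hypothetical, and the exact comparison used by the old check is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-peer record; field names are illustrative. */
struct peer_model {
	uint32_t last_seqno;	/* highest discovery seqno seen so far */
	uint32_t config;	/* last configuration applied from the peer */
};

/* Behaviour described as misleading: treat a lower seqno as stale. */
static bool apply_if_newer(struct peer_model *p, uint32_t seqno,
			   uint32_t config)
{
	assert(p != NULL);
	if (seqno <= p->last_seqno)
		return false;	/* dropped, even if the peer only rebooted */
	p->last_seqno = seqno;
	p->config = config;
	return true;
}

/* Behaviour after the change: first come, first served. */
static bool apply_always(struct peer_model *p, uint32_t seqno,
			 uint32_t config)
{
	assert(p != NULL);
	p->last_seqno = seqno;
	p->config = config;
	return true;
}
```

With a peer that rebooted and restarted its sequence at 1, apply_if_newer() keeps the old configuration while apply_always() picks up the new one.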
Lustre-change: https://review.whamcloud.com/33304
Lustre-commit: 42d999ed8f6113724b1ac103b832d5b74b878d55
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I421a00e47bc93ee60fa37c648d6d9a726d9def9c
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Minh Diep [Mon, 30 Sep 2019 18:25:50 +0000 (11:25 -0700)]
LU-12825 build: change lbuild to support MOFED 4.7
* Remove 'alternate' name in MOFED tar
* use MLNX_LIBS to download rpms
Test-Parameters: trivial
Lustre-change: https://review.whamcloud.com/36333
Lustre-commit: 279c26466bff37dd25fe26e4bb56a16a9a797870
Change-Id: Ia5a4f51455be836a7df4fa6b3e9eccc17cffef2c
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Gorenko [Fri, 20 Sep 2019 13:34:48 +0000 (16:34 +0300)]
LU-12789 o2ib: fix configure checks
Fix configure checks for modern kernels / MOFED 4.7
1) sg_dma_address() and sg_dma_len() always have only one argument.
2) Make the configure checks execute in the proper environment
Lustre-change: https://review.whamcloud.com/36245
Lustre-commit: f44f657ee218303220f41182ced4fac290266b7f
Change-Id: I9910de888371776758376743ab4418778e1d85e4
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36331
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
NeilBrown [Sun, 4 Nov 2018 20:42:51 +0000 (15:42 -0500)]
LU-11617 mdc: fix possible deadlock in chlg_open()
Lockdep reports a possible deadlock between chlg_open() and
mdc_changelog_cdev_init()
mdc_changelog_cdev_init() takes chlg_registered_dev_lock and then
calls misc_register() which takes misc_mtx.
chlg_open() is called while misc_mtx is held, and tries to take
chlg_registered_dev_lock.
If these two functions race, a deadlock can occur as each thread will
hold one of the locks while trying to take the other.
chlg_open() does not need to take a lock. It only uses the
lock to stabilize a list while looking for the matching
chlg_registered_dev, and this can be found directly by examining
file->private_data.
So remove chlg_obd_get(), and use file->private_data to find the
obd_device.
Also ensure the device is fully initialized before calling
misc_register(). This means setting up some list linkage before the
call, and tearing it down if there is an error.
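The lock-free lookup can be illustrated with a user-space model. It assumes that file->private_data points at a misc device structure embedded in the registered changelog device, so the owner is recovered with container_of-style pointer arithmetic; that layout and every name ending in _model are assumptions made for this sketch, not the mdc code.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's container_of(), written for user space. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct misc_model { int minor; };
struct obd_model { const char *name; };

/* Hypothetical registered device embedding its misc device. */
struct chlg_dev_model {
	struct obd_model *obd;
	struct misc_model misc;
};

struct file_model { void *private_data; };

/* No list walk and no lock: the open file already identifies the device. */
static struct obd_model *chlg_open_find_obd(struct file_model *file)
{
	struct misc_model *misc;
	struct chlg_dev_model *dev;

	assert(file != NULL && file->private_data != NULL);
	misc = file->private_data;
	dev = container_of_model(misc, struct chlg_dev_model, misc);
	return dev->obd;
}
```

Because the open path no longer takes chlg_registered_dev_lock, the two locks are never held in opposite orders, which is the condition lockdep reported.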
Lustre-change: https://review.whamcloud.com/33572
Lustre-commit: 206b21741b07a10269bbcfdac28743591b64ab2f
Change-Id: Icffdebcee656ee6199297ba2a28ba57dcbc51ae1
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36230
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Gu Zheng [Wed, 18 Sep 2019 04:12:55 +0000 (12:12 +0800)]
LU-12705 utils: cleanup unnecessary typecasting
A number of variables in utils/lfs.c are typecast where no cast
is needed, so clean them up here.
Lustre-change: https://review.whamcloud.com/36224
Lustre-commit: d8135ad2fbe58a0fbe6984584816338542901c5c
Change-Id: I6c944f18137fd1ff1162d9b6567c9328dfa185eb
Test-Parameters: trivial
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36313
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Shaun Tancheff [Sun, 21 Jul 2019 07:42:43 +0000 (02:42 -0500)]
LU-12400 lnet: Infiniband sg_dma changes for linux 5.1
IB/core: Remove ib_sg_dma_address() and ib_sg_dma_len()
Linux-commit: a163afc88556e099271a7b423295bc5176fcecce
This simplification can be applied to mainline 3.15 and later;
however, the test should remain to support third-party IB drivers.
Lustre-change: https://review.whamcloud.com/35497
Lustre-commit: bbc2cf593b83f5f1822889ef5c910906aadbe735
Test-Parameters: trivial
Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I4824b3b737388a3fc0aec43b2d8e5d10f871ccdd
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36330
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Sat, 24 Aug 2019 17:20:23 +0000 (01:20 +0800)]
LU-12690 llite: error handling of ll_och_fill()
The return error of ll_och_fill() should be handled.
Lustre-change: https://review.whamcloud.com/35913
Lustre-commit: 4d6d58575d3d957aa3dbf38f83f749259b580bf2
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4e750001cb124104836fa24e39ec8ae203b51a83
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexey Zhuravlev [Fri, 13 Sep 2019 19:28:06 +0000 (22:28 +0300)]
LU-12570 mdt: request env for DT threads
As part of lock enqueue, an MDT thread can call ldlm_reclaim_full() to
cancel old unused LDLM locks, which scans all existing namespaces,
including OFD-originated ones (with extent locks). Thus the MDT ends up
calling into OFD code, which needs its own env marked with LCT_DT_THREAD.
Lustre-change: https://review.whamcloud.com/36179
Lustre-commit: 1f94d5eb2be4e921e909d8f18523dcab91bb6531
Test-Parameters: testlist=sanity,sanity,sanity,sanity envdefinitions=ONLY="134a",SHARED_KEY=true
Test-Parameters: testlist=sanity,sanity,sanity,sanity envdefinitions=ONLY="134a",SHARED_KEY=true
Test-Parameters: testlist=sanity,sanity,sanity,sanity envdefinitions=ONLY="134a",SHARED_KEY=true
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Change-Id: I231b88159978bc3ce7a3fa0f27e57eb32137c343
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Thu, 13 Jun 2019 19:04:54 +0000 (14:04 -0500)]
LU-12355 lnet: ib_fmr_pool_unmap returns void
Historically ib_fmr_pool_unmap only ever returned 0
Linux kernel 4.20 changed the return for ib_fmr_pool_unmap to void.
Linux-commit: 3eeeb7a59acddaa326b03efdf6dce61c120449a3
Lustre-change: https://review.whamcloud.com/35017
Lustre-commit: 46298ffe0b436a8cf1c60aa3d7bde7ae52c78d00
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I49d91a49c452dad5c7d9b153fdbc011f2f25743a
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36329
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Tue, 11 Jun 2019 12:29:49 +0000 (07:29 -0500)]
LU-12355 lnet: Adjust checks for ib_device_ops
RDMA/core: Introduce ib_device_ops
The ib_device_ops structure defines all the InfiniBand device
operations in one place
Linux-commit: 521ed0d92ab0db3edd17a5f4716b7f698f4fce61
Lustre-change: https://review.whamcloud.com/35016
Lustre-commit: 27572b0476b07b396174430940f184ed85088eeb
Test-Parameters: trivial
Change-Id: Ia2a617597c75ec819f485b93a1deb368d4b5e873
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Sat, 25 May 2019 16:55:47 +0000 (09:55 -0700)]
LU-12339 lnet: select LO interface for sending
In the following scenario
Lustre->LNetPrimaryNID with 0@lo
Discover is initiated on 0@lo
The peer is created with 0@lo and <addr>@<net>
The interface health of the peer's <addr>@<net> is decremented
LNetPut() to self
selection algorithm selects 0@lo to send to
This exposes an issue where we try and go through the peer credit
management algorithm, but because there are no credits associated with
0@lo we end up indefinitely queuing the message. ptlrpc will then get
stuck waiting for send completion on the message.
This was exposed via conf-sanity 32a
Lustre-change: https://review.whamcloud.com/34957
Lustre-commit: 69d1535ebdac139c6b19db2bca5f65663fe88467
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I98e9d3428b594a0d041d27d8e8d8de7596825edc
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Tue, 30 Apr 2019 21:01:48 +0000 (14:01 -0700)]
LU-12199 lnet: verify msg is committed for send/recv
Before performing a health check make sure the message
is committed for either send or receive. Otherwise we
can just finalize it.
Lustre-change: https://review.whamcloud.com/34797
Lustre-commit: fc6b321036f34c00d5b32b49c817dc0034fbad9e
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id7bd956f8e81e60a2d63059730973f851d4c7abe
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Wed, 20 Mar 2019 19:14:51 +0000 (12:14 -0700)]
LU-12080 lnet: clean mt_eqh properly
There is a scenario where you have a peer on your recovery queue
that's down. So you keep pinging it, but every ping times out
after 10 seconds. In the middle of these 10 seconds you perform a
shutdown. First you try to do the rsp_tracker_clean. It goes through
and calls MDUnlink on the MD related to that ping. But because the
message has a ref count on the MD, it doesn't go away. The MD gets
zombied and just waits for lnet_md_unlink() to be called in
lnet_finalize(). Then you hit clean_peer_ni_recovery. We see the peer
on the queue, we try to call Unlink on it, but when we lookup the
MD using lnet_handle2md() we can't find it. Afterwards we try to clean
up the EQ and it asserts. Even if we remove the assert we end up with
a resource leak since the EQ is not actually freed since we won't call
LNetEQFree() again.
The solution is to move EQ creation into LNetNIInit() and its
deletion into lnet_unprepare(). By this point all the remaining messages
would've been finalized and all references on the EQ are gone,
allowing us to clean it up properly.
Lustre-change: https://review.whamcloud.com/34477
Lustre-commit: 1065c8888e96fef9e98676bd3a71b46f7910b085
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I7fd6018ee2e57f82c649fc3658352e89a4309986
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Thu, 18 Apr 2019 03:49:18 +0000 (22:49 -0500)]
LU-12199 lnet: Ensure md is detached when msg is not committed
It's possible for lnet_is_health_check() to return "true" when the
message has not hit the network. In this situation the message is
freed without detaching the MD. As a result, requests do not receive
their unlink events and these requests are stuck forever.
A little cleanup is included here:
- The value of lnet_is_health_check() is only used in one place, so
we don't need to save the result of it in a variable.
- We don't need separate logic to detach the md when the send was
successful. We'll fall through to the finalizing code after
incrementing the health counters
Lustre-change: https://review.whamcloud.com/34885
Lustre-commit: b65f3a1767ae82c7f629320187b33eb8670da537
Cray-bug-id: LUS-7239
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6301d491090b862d016eed3aac8afd7be8685e57
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36038
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Thu, 2 May 2019 22:24:32 +0000 (17:24 -0500)]
LU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock
Protect the peer discovery queue from concurrent manipulation by
acquiring the lp_lock.
Lustre-change: https://review.whamcloud.com/34798
Lustre-commit: dd16a31bf4ae874a69cc7dc5fe1f3197993630ae
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: If43b877c1c7ea203f346a3d6ea846f00b8f9661f
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36037
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Tue, 30 Apr 2019 18:51:09 +0000 (11:51 -0700)]
LU-12254 lnet: correct discovery LNetEQFree()
The EQ needs to be freed after all the queues are cleaned to avoid
having unprocessed events on the event queue on free. Such events
prevent the memory from being freed.
Lustre-change: https://review.whamcloud.com/34796
Lustre-commit: a0879b5985b41f92dede96e7f27623eb72102b15
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie38ec25e09bf6d7cf2aadc30edd91d298897c51b
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36036
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Tue, 30 Apr 2019 05:57:21 +0000 (22:57 -0700)]
LU-12249 lnet: fix list corruption
In shutdown the resend queues are cleared and freed. The monitor
thread state is set to shutdown. It is possible to get lnet_finalize()
called after the queues are freed. The code checks for ln_state to see
if we're shutting down. But in this case we should really be checking
ln_mt_state. The monitor thread is the one that matters in this case,
because it's the one which allocates and frees the resend queues.
Lustre-change: https://review.whamcloud.com/34778
Lustre-commit: d799ac910cd6c980b40c81b76eaefb65b88904d0
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia077cec7a52ef5cd2e1b231437c6265ba9416b1b
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Sat, 27 Apr 2019 22:47:42 +0000 (15:47 -0700)]
LU-11297 lnet: invalidate recovery ping mdh
For cleanliness, ensure that the recovery ping mdh is invalidated when
a peer NI or a local NI is allocated.
Lustre-change: https://review.whamcloud.com/34771
Lustre-commit: d7b5f3114d51d5a9d1a34f5073e0bb2d0d63d302
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If06448b1602b3680831244923b6b982a555159ea
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Fri, 27 Sep 2019 18:09:51 +0000 (11:09 -0700)]
LU-12620 kernel: kernel update RHEL 8.0 [4.18.0-80.7.1.el8_0]
Update RHEL 8.0 kernel to 4.18.0-80.7.1.el8_0 for Lustre client.
Test-Parameters: trivial clientdistro=el8 \
envdefinitions=SANITY_EXCEPT="421a" \
testlist=sanity
Change-Id: I9a78ad00d1503cc90f5975e349fe96d452b1174f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35657
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Shilong [Mon, 23 Sep 2019 16:34:24 +0000 (09:34 -0700)]
LU-12755 ldiskfs: fix project quota upon unpatched kernel
The value of MAXQUOTAS is the number of quota types supported
by the kernel. With the project quota patch applied, MAXQUOTAS is
equal to EXT4_MAXQUOTAS. However, on an unpatched kernel, the
project quota type is not supported and MAXQUOTAS is one less
than EXT4_MAXQUOTAS.
In ldiskfs, we need to make sure that the loop in
ext4_quota_off_umount() limits its EXT4_MAXQUOTAS iteration
to the kernel MAXQUOTAS value. Otherwise, it tries to
dereference sb_dqopt(sb)->files[2], which is not an inode at all,
and causes the kernel to get stuck on a spinlock in ext4_quota_off()
during unmount, as follows:
Call Trace:
[<ffffffffb9d733c5>] queued_spin_lock_slowpath+0xb/0xf
[<ffffffffb9d81b30>] _raw_spin_lock+0x20/0x30
[<ffffffffb9865e2e>] igrab+0x1e/0x60
[<ffffffffc08a8c4b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
[<ffffffffc08abcdd>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
[<ffffffffb984b13d>] generic_shutdown_super+0x6d/0x100
[<ffffffffb984b5b7>] kill_block_super+0x27/0x70
[<ffffffffb984b91e>] deactivate_locked_super+0x4e/0x70
[<ffffffffb984c0a6>] deactivate_super+0x46/0x60
[<ffffffffb986abff>] cleanup_mnt+0x3f/0x80
[<ffffffffb986ac92>] __cleanup_mnt+0x12/0x20
[<ffffffffb96c1c0b>] task_work_run+0xbb/0xe0
[<ffffffffb962cc65>] do_notify_resume+0xa5/0xc0
[<ffffffffb9d8d23b>] int_signal+0x12/0x17
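The loop bound can be modelled in user space as below. This is a sketch under assumptions rather than the ldiskfs patch itself: the two constants stand in for the kernel MAXQUOTAS and EXT4_MAXQUOTAS on an unpatched kernel, and the table type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only: an unpatched kernel knows user and group
 * quota, while ldiskfs also knows project quota. */
#define MODEL_KERNEL_MAXQUOTAS 2
#define MODEL_EXT4_MAXQUOTAS   3

/* Stand-in for the per-type quota file table, sized by the kernel. */
struct quota_files_model {
	const void *files[MODEL_KERNEL_MAXQUOTAS];
};

/* Smaller of the two limits, so the loop never indexes past the table. */
static int quota_loop_limit(void)
{
	return MODEL_EXT4_MAXQUOTAS < MODEL_KERNEL_MAXQUOTAS ?
	       MODEL_EXT4_MAXQUOTAS : MODEL_KERNEL_MAXQUOTAS;
}

/* Visit every valid quota type and count those with a file attached. */
static int quota_off_all(const struct quota_files_model *q)
{
	int type, turned_off = 0;

	assert(q != NULL);
	for (type = 0; type < quota_loop_limit(); type++)
		if (q->files[type] != NULL)
			turned_off++;
	return turned_off;
}
```

Iterating up to MODEL_EXT4_MAXQUOTAS instead would read files[2], one slot past the end of the table, which corresponds to the bogus inode pointer in the trace above.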
This patch is back-ported from the following one:
Lustre-commit: 4b013aa4cdc12647cb1aa9c93bdd72d741b83af4
Lustre-change: https://review.whamcloud.com/36203
Test-Parameters: clientdistro=el7.7 serverdistro=el7.7
Change-Id: I18a4d97656e2f8478754943424c0fac927f843ca
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andrew Perepechko [Fri, 10 Aug 2018 13:18:48 +0000 (16:18 +0300)]
LU-11296 osc: speed up page cache cleanup during blocking ASTs
While we are cleaning a write lock, we don't need to check if
page cache pages under this lock are covered by another lock.
If a client needs to give up its lock, cleaning gigabytes of
page cache can take quite a long time.
Lustre-change: https://review.whamcloud.com/33090
Lustre-commit: b9ebb17277c78101018a0cf4a63f6beb93b9baf0
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Cray-bug-id: LUS-6352
Change-Id: I576130216ed4de4e352ea697bddb5ff83046443a
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35831
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 16 Jul 2019 19:26:43 +0000 (15:26 -0400)]
LU-12559 ptlrpc: Hold imp lock for idle reconnect
Idle reconnect sets import state to IMP_NEW, then releases
the import lock before calling ptlrpc_connect_import. This
creates a gap where an import in IMP_NEW state is exposed,
which can cause new requests to fail with EIO.
Hold the lock across the call so as not to expose imports
in this state.
Lustre-change: https://review.whamcloud.com/35530
Lustre-commit: e9472c54ac820c3a0db2318a6ef894c3971e6e0b
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9f8509d11c4d5a8917a313349534d98b964cd588
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Wed, 12 Dec 2018 08:49:00 +0000 (16:49 +0800)]
LU-11743 utils: allow lctl pool commands on separate MGS
The current lctl code checks for the presence of configured pools on
the client and MDS via /proc or /sys files. However, the MGS does
not parse the client/MDS configuration logs, so it does not create
the various files for the pools, which causes the pool commands to
fail verification.
Change lctl pool_new, pool_add, pool_remove and pool_destroy commands
to parse the configuration log directly when run on a standalone MGS
node. This also allows the pool commands to be run when only the MGS
is started.
Lustre-change: https://review.whamcloud.com/34110
Lustre-commit: 4a003a1f554602265630637080f65d9b4474f822
Test-Parameters: standalonemgs=true testlist=ost-pools.sh
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Ib6fdb367c919f7b726fbf551dcfa6015593ebbe5
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35804
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Thu, 15 Aug 2019 18:33:08 +0000 (22:33 +0400)]
LU-12612 osd: add lnb size down to osd
Pass the lnb size down to the OSD so that each OSD can check for
lnb array overflow. The patch isn't final: there will be a proper
implementation in osd-zfs and a new test.
Lustre-change: https://review.whamcloud.com/35801
Lustre-commit: 8033f80de3d0db87f7e965078ceee62033adb58d
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I43683c84e48006b4075f9a8b3e87cdfeae28c02b
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Sun, 21 Jul 2019 17:06:37 +0000 (13:06 -0400)]
LU-12569 o2iblnd: Make credits hiw connection aware
The IBLND_CREDITS_HIGHWATER mark check currently looks only
at the global peer credits tunable, ignoring the connection
specific queue depth when determining the threshold at
which to send a NOOP message to return credits.
This is incorrect because while connection queue depth
defaults to the same as peer credits, it can be less than
that global value for specific connections.
So we must check for this case when setting the threshold.
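A sketch of a connection-aware threshold, assuming the per-connection cap is the queue depth minus one. The function and parameter names are hypothetical and the "minus one" margin is an assumption of this example, not something stated in the message.

```c
#include <assert.h>

/*
 * Pick the high-water mark for returning credits with a NOOP.
 * peer_credits_hiw: the global tunable.
 * conn_queue_depth: the depth negotiated for this connection.
 */
static unsigned int credits_highwater(unsigned int peer_credits_hiw,
				      unsigned int conn_queue_depth)
{
	unsigned int conn_limit;

	assert(conn_queue_depth > 0);
	conn_limit = conn_queue_depth - 1;	/* assumed margin */
	return peer_credits_hiw < conn_limit ? peer_credits_hiw : conn_limit;
}
```

With a global mark of 64, a connection whose depth is 128 keeps 64, while a connection whose depth dropped to 8 gets 7, so the threshold can still be reached.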
Lustre-change: https://review.whamcloud.com/35578
Lustre-commit: 1b87e8f61781e48c31b4da647214d66addf2b90c
Test-Parameter: nettype=o2ib
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie028ae11cdbd0f75a38b265b7ab5830f92f08d90
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Thu, 5 Sep 2019 21:28:48 +0000 (14:28 -0700)]
LU-12385 lnet: update opa defaults
Testing reveals no significant performance improvements
when using peer_credits > 32. Adjusted the default
peer_credits, peer_credits_hiw and concurrent_sends
to take that into account.
This has the advantage of avoiding an issue observed
on multiple opa sites where the qp cannot be created because
of a large initial queue_depth. The queue depth is then
reduced gradually until the qp creation succeeds.
Lustre-change: https://review.whamcloud.com/36072
Lustre-commit: 7f199dbf0261b89afe0dc8185db4403ae0efdefa
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I6036ec1da7063e30b567446e5db89040f21bc701
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36252
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Fri, 6 Sep 2019 01:15:10 +0000 (18:15 -0700)]
LU-12621 o2iblnd: cache max_qp_wr
When creating the device the maximum number of work requests per qp
which can be allocated is already known. Cache that internally,
and when creating the qp make sure the qp's max_send_wr does not
exceed that max. If it does then cap max_send_wr to max_qp_wr.
Recalculate the connection's queue depth based on the max_qp_wr.
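The capping step can be sketched as follows. The helper names are hypothetical, and the way the queue depth is derived from the capped value (a fixed number of work requests per message) is an assumption made only to keep the example complete; the message does not give the actual formula.

```c
#include <assert.h>

/* Never ask for more send work requests than the device allows. */
static unsigned int cap_send_wr(unsigned int wanted_send_wr,
				unsigned int max_qp_wr)
{
	assert(max_qp_wr > 0);
	return wanted_send_wr > max_qp_wr ? max_qp_wr : wanted_send_wr;
}

/* Assumed relation: each in-flight message needs wrs_per_msg requests. */
static unsigned int queue_depth_for(unsigned int max_send_wr,
				    unsigned int wrs_per_msg)
{
	assert(wrs_per_msg > 0);
	return max_send_wr / wrs_per_msg;
}
```

For example, a request for 1000 work requests on a device that caches max_qp_wr = 400 is capped to 400, and with 50 work requests per message the recalculated depth is 8.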
Lustre-change: https://review.whamcloud.com/36073
Lustre-commit: 7ee319ed7f9dfa365a66b20b03f2141c54fb0293
Test-Parameter: nettype=o2ib
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I6d9a642d03633264f5f14445a051dd14515709c1
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36253
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Wed, 11 Sep 2019 18:26:47 +0000 (14:26 -0400)]
LU-11542 import: Fix missing spin_unlock()
A recent patch moved the spin_unlock() down into
each branch of an 'if', but missed the final 'else'.
Add the spin_unlock in the else.
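The pattern behind this fix can be shown with a small sketch. It does not reproduce the import code: a toy counting lock stands in for the spinlock, and sketch_handle_state() and its states are hypothetical.

```c
/* Toy lock that only counts acquisitions, standing in for a spinlock. */
struct sketch_lock {
	int held;
};

void sketch_lock_acquire(struct sketch_lock *l)
{
	l->held++;
}

void sketch_lock_release(struct sketch_lock *l)
{
	l->held--;
}

/*
 * Hypothetical handler: the unlock lives inside each branch, so every
 * branch, including the final else, must release the lock itself.
 */
int sketch_handle_state(struct sketch_lock *l, int state)
{
	int rc;

	sketch_lock_acquire(l);
	if (state == 1) {
		sketch_lock_release(l);
		rc = 10;	/* work done after dropping the lock */
	} else if (state == 2) {
		sketch_lock_release(l);
		rc = 20;
	} else {
		sketch_lock_release(l);	/* the release that is easy to miss */
		rc = 0;
	}
	return rc;
}
```

Whatever state is passed in, the lock count is back to zero when the function returns.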
Lustre-change: https://review.whamcloud.com/35999
Lustre-commit: 3dbdd38a6adcee63b6d89d4656e0099a0006f26c
Fixes: 29904135df67 ("LU-11542 import: fix race between imp_state & imp_invalid")
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I6ee399050aad0fe9df9c0e3ddf8ec0be8eae1641
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36251
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Yang Sheng [Mon, 15 Oct 2018 09:37:21 +0000 (17:37 +0800)]
LU-11542 import: fix race between imp_state & imp_invalid
We set the import to LUSTRE_IMP_DISCON and then deactivate it when
it is unreplayable. Another thread may set this import up between
those two operations, so we end up with an invalid import in
FULL state.
Lustre-change: https://review.whamcloud.com/33395
Lustre-commit: 29904135df671c624b1e542fdda94b221d76e667
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ib4cec0bcaf6f4b221ba260edb94749a4e523f5e6
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35796
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Alex Zhuravlev [Tue, 23 Jul 2019 13:53:22 +0000 (17:53 +0400)]
LU-12090 utils: lfs rmfid
A new RPC_REINT_RMFID has been introduced by this patch.
It is meant to be used with the corresponding llapi_rmfid()
to unlink a batch of MDS files by their FIDs. The caller
must have permission to modify the parent dir(s) and the objects
themselves.
Lustre-change: https://review.whamcloud.com/34449
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib22379033aca92692e0e219671ca0c2ec7893c24
Reviewed-on: https://review.whamcloud.com/35595
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Gu Zheng [Fri, 30 Aug 2019 07:27:30 +0000 (03:27 -0400)]
LU-12705 build: fix building fail against Power9 little endian
We use "%ll[dux]" as the input/output format for __u64 variables.
This may cause build errors on architectures that use "long"
for 64-bit types, for example Power9 little endian.
Add the necessary typecasts (long long/unsigned long long) so that
the build succeeds.
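A minimal sketch of the cast in question, using the standard uint64_t in place of __u64. The helper name sketch_format_u64() is made up for the example.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value portably. uint64_t may be "unsigned long" or
 * "unsigned long long" depending on the architecture, so cast to the
 * type that "%llu" expects.
 */
int sketch_format_u64(char *buf, size_t len, uint64_t val)
{
	return snprintf(buf, len, "%llu", (unsigned long long)val);
}
```

Without the cast, a compiler that defines the 64-bit type as "unsigned long" can warn about a format mismatch, which becomes a build failure when warnings are treated as errors.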
Lustre-change: https://review.whamcloud.com/36007
Lustre-commit: 4eddf36ac3607c66c172668b30eb5dcf921e3de4
Test-Parameters: trivial
Change-Id: I2e8569f4ac14f7d328a29d153ff57c7834cabc46
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36207
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Wed, 21 Aug 2019 14:03:11 +0000 (10:03 -0400)]
LU-11729 obdclass: align to T10 sector size when generating guard
Otherwise the client and server would come up with
different checksums when their page sizes differ.
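One way to picture the problem is the sketch below. It is an assumption-laden illustration, not the obdclass code: sketch_guard() is a toy checksum standing in for the real T10 guard function, and sketch_generate_guards() is a hypothetical helper that walks a buffer page by page but emits one guard per sector.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy 16-bit checksum; a stand-in for the real T10 guard function. */
static uint16_t sketch_guard(const unsigned char *p, size_t len)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint16_t)(sum * 31 + p[i]);
	return sum;
}

/*
 * Walk the buffer one page at a time, as a client or server would, but
 * emit one guard per sector. Returns the number of guards written.
 * Assumes page_size is a multiple of sector_size.
 */
size_t sketch_generate_guards(const unsigned char *buf, size_t len,
			      size_t page_size, size_t sector_size,
			      uint16_t *guards)
{
	size_t n = 0;

	for (size_t page = 0; page < len; page += page_size) {
		size_t page_end = page + page_size < len ?
				  page + page_size : len;

		for (size_t off = page; off < page_end; off += sector_size) {
			size_t chunk = off + sector_size < page_end ?
				       sector_size : page_end - off;

			guards[n++] = sketch_guard(buf + off, chunk);
		}
	}
	return n;
}
```

Because the guard boundaries follow the sector size rather than the page size, two peers with different page sizes produce the same sequence of guards for the same data in this sketch.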
Improve test_810 to verify all available checksum types.
Test-Parameters: trivial envdefinitions=ONLY=810 testlist=sanity,sanity,sanity
Test-Parameters: clientarch=aarch64 envdefinitions=ONLY=810 testlist=sanity,sanity
Test-Parameters: clientarch=ppc64 envdefinitions=ONLY=810 testlist=sanity,sanity
Lustre-change: https://review.whamcloud.com/34043
Lustre-commit: 98ceaf854bb4738305769c5cd1df556ee99aa859
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I24117aebb277d4ddcb7787b715587e33023ebbe5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36205
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Mon, 16 Sep 2019 19:15:03 +0000 (12:15 -0700)]
LU-11485 lod: disallow setting the last non-stale mirror as stale
"lfs setstripe" allows setting stale flag on the last
non-stale mirror of a file, which leaves the file with
no valid component to read, so reads return an I/O error.
This patch fixes the issue by disallowing that.
It also prevents "lfs mirror split" from destroying the
last non-stale mirror of a file.
This patch is back-ported from the following one:
Lustre-commit: 29be32a759f696006a539d3cff74ca55a281aa64
Lustre-change: https://review.whamcloud.com/36141
Change-Id: I6934cfe0190cd1ea83de1cf28ddf840b9f96193a
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Mon, 16 Sep 2019 19:07:20 +0000 (12:07 -0700)]
LU-11022 lfs: remove mirror by pool name
lfs mirror split --pool <poolname> <file>
This patch is back-ported from the following one:
Lustre-commit: 0c710a46cfb43366dc57ff6e83e414086b1d0e6c
Lustre-change: https://review.whamcloud.com/35329
Change-Id: I012e68729b94657236ba3fc530fc7b7485529ed2
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Mon, 9 Sep 2019 08:10:29 +0000 (16:10 +0800)]
LU-12602 mdt: more EA size check in mdt_getxattr_pack_reply()
While the RMF_EAVALS field size can be arbitrary length,
the RMF_EAVALS_LENS field definition specifies
the RMF_F_STRUCT_ARRAY flag, so the passed size must be a multiple
of sizeof(__u32) or the internal LBUG() will trigger.
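The size constraint described above amounts to a simple divisibility check. The sketch below uses uint32_t in place of __u32, and sketch_lens_size_ok() is a hypothetical helper, not a function from the MDT code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * A buffer treated as an array of 32-bit lengths must have a size that
 * is a whole number of elements. Returns 1 if the size is acceptable.
 */
int sketch_lens_size_ok(size_t size)
{
	return size % sizeof(uint32_t) == 0;
}
```

A caller would run such a check before packing the reply, so that a bad size is rejected with an error instead of reaching the internal assertion.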
Lustre-change: https://review.whamcloud.com/36103
Lustre-commit: 4d8bc239c2c30a47e8833cf23db6ccd39ff61705
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I767e1b1496298e9a66274fc324f9c34daaed4a09
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Simmons [Tue, 15 Jan 2019 16:25:54 +0000 (11:25 -0500)]
LU-8130 libcfs: port working hash from upstream
The hash_[32|64] functions in pre-4.6 kernels produce hashes
with poor distributions, which results in high collision rates.
Backport the upstream improvements for the pre-4.6 kernels Lustre
supports. Details can be read here:
https://lwn.net/Articles/687494
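For orientation, a multiplicative hash of the kind adopted upstream can be sketched as below. The constant and the shift are quoted from memory of the upstream include/linux/hash.h and should be verified against the kernel source; sketch_hash_32() is a stand-in name, not the libcfs symbol.

```c
#include <stdint.h>

/*
 * Golden-ratio constant as used by upstream hash_32() since 4.6
 * (quoted from memory; verify against include/linux/hash.h).
 */
#define SKETCH_GOLDEN_RATIO_32 0x61C88647u

/* Keep the top "bits" bits of the product; bits must be in 1..32. */
uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
	return (val * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}
```

Taking the high bits of the product is what spreads nearby inputs across buckets, which is the property the older functions lacked.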
Lustre-change: https://review.whamcloud.com/33789
Lustre-commit: 1658ae30a0e97e7f4018d8cba67e459078470d1a
Test-Parameters: trivial
Change-Id: Id2436ba8be2d3ed482c5386b79710f594d5b3e59
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Tue, 9 Apr 2019 12:58:20 +0000 (14:58 +0200)]
LU-12131 tests: only create lgssc.conf file if necessary
The lgssc.conf file is now packaged by Lustre and installed under
/etc/request-key.d/.
Unless run from the build tree, init_gss() must therefore no longer
create its own copy. Adjust the corresponding commands in init_gss()
and cleanup_sk() accordingly.
Lustre-change: https://review.whamcloud.com/34520
Lustre-commit: 66919f2b687f8b15679e6ff4e22a3f66f7d1c13a
Fixes: e299df1e9eea ("LU-7854 gss: install lgssc.conf under /etc/request-key.d")
Whamcloud-bug-id: ATM-1283
Test-Parameters: envdefinitions=SHARED_KEY=true testlist=sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9cc76fddb8a622d7c40d6348913df42ae063254a
Reviewed-on: https://review.whamcloud.com/35557
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Wed, 24 Jul 2019 13:24:01 +0000 (21:24 +0800)]
LU-12581 osc: prevent use after free
Clear aa_oa after it's been freed to prevent use after free.
Lustre-change: https://review.whamcloud.com/35601
Lustre-commit: 61c9f8797771c951ecd240981d7d97d5adc685e0
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf122aa53fe5b13c07337745e5a26763e8712be2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36210
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Nikitas Angelinas [Wed, 24 Jul 2019 09:43:53 +0000 (02:43 -0700)]
LU-11675 hsm: don't allow new HSM requests during CDT_INIT
When the HSM CDT is shut down and restarted, it resets cdt_last_cookie
using ktime_get_real_seconds() and examines the CDT llog for existing
requests, in order to set cdt_last_cookie to the highest known value,
so that newly-assigned cookies are unique. There is a window between
CDT_INIT and CDT_RUNNING during which new requests can arrive, and if
the CDT llog has not been fully examined, cookies can be reused. This
can cause the following two assertions to be triggered in
cdt_agent_record_hash_add():
LASSERT(carl0->carl_cat_idx == carl1->carl_cat_idx);
LASSERT(carl0->carl_rec_idx == carl1->carl_rec_idx);
Fix this by not allowing new HSM requests during CDT_INIT.
Also, cookie values are incremented on a separate line, which causes
one value to be skipped at CDT startup time. This is not an issue, but
there does not seem to be a need for it; fix this by post-incrementing
and assigning cookie values on the same line.
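Both parts of the change can be sketched together. The types and names below (struct sketch_cdt, sketch_new_request()) are hypothetical, and the -1 return value is a placeholder because the commit message does not state which error is returned during CDT_INIT.

```c
#include <stdint.h>

enum sketch_cdt_state { SKETCH_CDT_INIT, SKETCH_CDT_RUNNING };

/* Hypothetical, reduced coordinator state. */
struct sketch_cdt {
	enum sketch_cdt_state state;
	uint64_t last_cookie;	/* next cookie value to hand out */
};

/*
 * Assign a cookie to a new request. Returns 0 and stores the cookie on
 * success, or -1 while the coordinator is still initializing.
 */
int sketch_new_request(struct sketch_cdt *cdt, uint64_t *cookie)
{
	if (cdt->state == SKETCH_CDT_INIT)
		return -1;

	/* Post-increment and assign in one statement: no value is skipped. */
	*cookie = cdt->last_cookie++;
	return 0;
}
```

Rejecting requests until the state leaves SKETCH_CDT_INIT means no cookie is handed out before the highest known value has been established.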
Lustre-change: https://review.whamcloud.com/33671
Lustre-commit: 39862136c3cfee127c4b0a9604ff12f560af3124
Signed-off-by: Nikitas Angelinas <nangelinas@cray.com>
Cray-bug-id: LUS-6589
Test-Parameters: trivial testlist=sanity-hsm
Change-Id: I18a1c3e85de6c50a9bf1ce598e21d83d893ad0ca
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Thu, 29 Aug 2019 06:15:15 +0000 (14:15 +0800)]
LU-12613 ptlrpc: check buffer length in lustre_msg_string()
Check the buffer length in lustre_msg_string() to prevent invalid
access.
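The kind of check involved can be sketched as follows. This is not the ptlrpc implementation: sketch_msg_string() is a hypothetical helper, and treating max_len == 0 as "no limit" is an assumption of the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return buf as a C string only if it is safe to read: the buffer must
 * be non-empty, contain a NUL terminator, and (when max_len is nonzero)
 * the string including its terminator must fit in max_len bytes.
 */
const char *sketch_msg_string(const char *buf, size_t buflen, size_t max_len)
{
	const char *nul;

	if (buf == NULL || buflen == 0)
		return NULL;

	nul = memchr(buf, '\0', buflen);
	if (nul == NULL)
		return NULL;	/* unterminated: reading on would overrun */

	if (max_len != 0 && (size_t)(nul - buf) + 1 > max_len)
		return NULL;

	return buf;
}
```

The zero-length test comes first so that the terminator search never touches memory outside the buffer.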
Lustre-change: https://review.whamcloud.com/35932
Lustre-commit: 728c58d60faef288eb7d05d8809fa2b1a55ade89
Change-Id: I286000db16384938a594bd8d104e5f3d0fff585a
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Yunye Ry <yunye.ry@alibaba-inc.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36209
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Mon, 16 Sep 2019 18:56:47 +0000 (11:56 -0700)]
LU-10258 lfs: lfs mirror copy command
Add "lfs mirror copy" command to copy a mirror's content to other
mirror(s) of a mirrored file.
Usage:
lfs mirror copy {--read-mirror|-i <id0>}
{--write-mirror|-o <id1>[,<id2>,...]} <mirrored_file>
Options:
--read-mirror|-i <id0>
This option specifies the mirror, identified by id0, whose content
is read. id0 is the unique numerical identifier of a
mirror.
--write-mirror|-o <id1>[,<id2>,...]
This option specifies the mirror(s), identified by mirror ID, to be
written. Mirror IDs are separated by commas. If the mirror ID -1 is
used here, all mirrors other than the read mirror are written.
Note:
Beware that the written mirror(s) will be marked as non-stale.
After using this command, a file may therefore have non-stale
mirrors whose contents differ.
This patch is back-ported from the following one:
Lustre-commit: c6e7c0788d7cd766880d12eae6679782283dc479
Lustre-change: https://review.whamcloud.com/33220
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id138368cdb29ec14b7c03a5db3b2dd1e0db5ea37
Reviewed-on: https://review.whamcloud.com/36193
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 27 Jun 2019 10:08:17 +0000 (12:08 +0200)]
LU-12131 tests: fix test_802 for GSS
test_802 should not overwrite already existing client mount options
when trying to mount client as read-only.
Lustre-change: https://review.whamcloud.com/35335
Lustre-commit: a51d0653cf46fc898da01f86c26cc0f4f5beff5a
Test-Parameters: trivial
Test-Parameters: envdefinitions=ONLY=802 testlist=sanity
Test-Parameters: envdefinitions=SHARED_KEY=true,ONLY=802 testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8189c245870fb0caf48006db11621f0af48e1878
Reviewed-on: https://review.whamcloud.com/35535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Amir Shehata [Fri, 19 Apr 2019 00:12:49 +0000 (17:12 -0700)]
LU-12201 lnet: detach response tracker
We need to unlink the response tracker from MDs even if the
corresponding message failed to send.
Lustre-change: https://review.whamcloud.com/34770
Lustre-commit: 1bb91b966d15345b4c89245d51f6cb631b052779
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4f320274576790e3332f66f30aad5c2b3450b955
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36033
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Artem Blagodarenko [Fri, 9 Aug 2019 19:19:29 +0000 (22:19 +0300)]
LU-12650 lib: fix strings comparison during mount searching
get_root_path() returns the path to the "lustre" mount instead of
"lustre1" because the last character is not taken into account during
the comparison. This bug affects all users of get_root_path().
For example, fid2path uses get_root_path().
lfs path2fid /mnt/lustre2/foodir3
[0x200000401:0x1:0x0]
lfs fid2path lustre2 [0x200000401:0x1:0x0]
lfs fid2path: cannot find '[0x200000401:0x1:0x0]': No such file or
directory
umount /mnt/lustre
lfs fid2path lustre2 [0x200000401:0x1:0x0]
foodir3
This fix adds a string length comparison.
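The effect of the missing length check can be reproduced with the sketch below. The exact form of the original comparison is an assumption; both function names are hypothetical and only illustrate prefix matching versus exact matching.

```c
#include <stddef.h>
#include <string.h>

/* Prefix-only comparison: "lustre" wrongly matches a request for "lustre2". */
int sketch_fsname_match_prefix(const char *mounted, const char *wanted)
{
	return strncmp(mounted, wanted, strlen(mounted)) == 0;
}

/* Also require equal lengths, so that only an exact name matches. */
int sketch_fsname_match(const char *mounted, const char *wanted)
{
	return strlen(mounted) == strlen(wanted) &&
	       strncmp(mounted, wanted, strlen(mounted)) == 0;
}
```

With both /mnt/lustre and /mnt/lustre2 mounted, the prefix-only version accepts whichever of the two it encounters first, which matches the failure shown in the transcript above.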
Lustre-change: https://review.whamcloud.com/35755
Lustre-commit: 0817efd73f04bf59d1234887bc3971d2d067067e
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Cray-bug-id: LUS-7693
Change-Id: I3275d2182486d25389814f4c25b3f2a54ec29469
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36211
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Emoly Liu [Wed, 14 Aug 2019 07:52:58 +0000 (15:52 +0800)]
LU-12602 mdt: check EA size in mdt_getxattr_pack_reply()
Check the EA data size (non-positive or excessively large) to guard
against corruption.
Lustre-change: https://review.whamcloud.com/35768
Lustre-commit: 915135c37cbfa6851a5ec732afd20955eb020566
Change-Id: I8ccea214f8d7c0403a9df180acf487ee381b8d77
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35936
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Sat, 17 Aug 2019 05:43:36 +0000 (01:43 -0400)]
LU-12614 ldlm: ldlm_cancel_hpreq_check should check lock count
Make sure the number of locks we are going to cancel fits into
the supplied buffer first.
This is similar to LU-12603, just in a different place.
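A bounds check of this kind can be sketched as below. struct sketch_handle and sketch_cancel_count_ok() are hypothetical and do not reflect the ldlm wire format; the point is only the comparison between the claimed count and the buffer size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size lock handle as carried in a cancel request. */
struct sketch_handle {
	uint64_t cookie;
};

/*
 * Return 1 if "count" handles fit in a buffer of buflen bytes. Dividing
 * the buffer length avoids overflow in count * sizeof(handle).
 */
int sketch_cancel_count_ok(uint32_t count, size_t buflen)
{
	return count <= buflen / sizeof(struct sketch_handle);
}
```

A count taken from the request is checked this way before any handle is read, so a peer cannot make the server walk past the end of the buffer.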
Lustre-change: https://review.whamcloud.com/35807
Lustre-commit: 2b7af478bdbf5c6701e0e49aefe34597bdee3126
Change-Id: Ifa2aa976ce8613217c739ef609de54538c57b5e9
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yunye Ry <yunye.ry@alibaba-inc.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Wed, 31 Jul 2019 16:12:40 +0000 (18:12 +0200)]
LU-12604 mdt: check field size of sec context name
In request received from client, check that claimed size of
RMF_FILE_SECCTX_NAME field is consistent with expected content,
which is supposed to be an extended attribute name.
Lustre-change: https://review.whamcloud.com/35655
Lustre-commit: 384cd84489c9a7aa3145560002eb7a053cf4b2db
Test-Parameters: clientselinux testlist=sanity,recovery-small,sanity-selinux envdefinitions=SANITY_EXCEPT="271f"
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ice96f0e03f790b334fcdf64ae4becef2e39738f4
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Emoly Liu [Thu, 29 Aug 2019 02:55:13 +0000 (10:55 +0800)]
LU-12590 ptlrpc: check lm_bufcount and lm_buflen
Check lm_bufcount before it is used by lustre_msg_hdr_size_v2(), and
validate individual and total buffer lengths in
lustre_unpack_msg_v2() to prevent out-of-bounds reads.
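The order of the checks matters, as the sketch below tries to show. struct sketch_msg is a reduced stand-in for the message header, not the real wire layout, and the 8-byte rounding is an assumption of the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for a v2 message header: a buffer count followed
 * by one 32-bit length per buffer. Not the real wire layout.
 */
struct sketch_msg {
	uint32_t bufcount;
	uint32_t buflens[];
};

/* Round up to 8 bytes; the alignment is an assumption of this sketch. */
static size_t sketch_round8(size_t v)
{
	return (v + 7) & ~(size_t)7;
}

/*
 * Validate a received message of "len" bytes. Returns 0 if the header
 * and all buffers fit inside len, -1 otherwise.
 */
int sketch_unpack_check(const struct sketch_msg *m, size_t len)
{
	size_t hdr, total;

	if (len < sizeof(*m))
		return -1;	/* cannot even read bufcount */

	/* Check bufcount before using it to size the header. */
	if (m->bufcount > (len - sizeof(*m)) / sizeof(uint32_t))
		return -1;

	hdr = sketch_round8(sizeof(*m) + m->bufcount * sizeof(uint32_t));
	if (hdr > len)
		return -1;

	total = hdr;
	for (uint32_t i = 0; i < m->bufcount; i++) {
		size_t need;

		if (m->buflens[i] > len - total)
			return -1;	/* individual length too large */
		need = sketch_round8(m->buflens[i]);
		if (need > len - total)
			return -1;	/* padded buffer runs past the end */
		total += need;
	}
	return 0;
}
```

The count is bounded first, then each length, then the running total, so no field is trusted before it has been compared against the received size.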
Lustre-change: https://review.whamcloud.com/35783
Lustre-commit: 268edb13d769994c4841864034d72f0bd7b36e12
Change-Id: I4905e0665c7770443684cffe504935d27473d7c6
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Yunye Ry <yunye.ry@alibaba-inc.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36119
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Thu, 12 Sep 2019 07:08:25 +0000 (00:08 -0700)]
LU-12608 kernel: kernel update RHEL7.6 [3.10.0-957.27.2.el7]
Update RHEL7.6 kernel to 3.10.0-957.27.2.el7.
Test-Parameters: clientdistro=el7.6 serverdistro=el7.6
Change-Id: I8dd5e24746ccf11467c7a468edf7f9056d5705e3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Fri, 6 Sep 2019 07:24:35 +0000 (00:24 -0700)]
LU-12724 kernel: kernel update RHEL7.7 [3.10.0-1062.1.1.el7]
Update RHEL7.7 kernel to 3.10.0-1062.1.1.el7.
Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7
Change-Id: Iad40fb93b8a15d875b72749a05666a23e4755fcc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36075
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Sun, 17 Mar 2019 15:16:40 +0000 (08:16 -0700)]
LU-12080 lnet: recovery event handling broken
Don't increment health on unlink event.
If a SEND fails an unlink will follow so no need to do any
special processing on SEND event. If SEND succeeds then we
wait for the reply.
When queuing a message on the NI recovery queue only do so
if the MT thread is still running.
Lustre-change: https://review.whamcloud.com/34445
Lustre-commit: 5409e620e0256dc9b657f1c457541d7411b543cd
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4877caebcac5cdfc35a59a18a3e3451b1f23cb0d
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36028
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Thu, 12 Sep 2019 18:04:55 +0000 (18:04 +0000)]
Revert "LU-11816 lnet: setup health timeout defaults"
This is causing frequent assertion failures like below:
LNetError: 1701:0:(lib-move.c:3670:lnet_monitor_thr_stop()) ASSERTION( rc == 0 ) failed:
[ 378.662897] LNetError: 1701:0:(lib-move.c:3670:lnet_monitor_thr_stop()) LBUG
[ 378.665136] Pid: 1701, comm: rmmod 3.10.0-7.6-debug #1 SMP Fri Jul 12 02:40:17 EDT 2019
[ 378.667455] Call Trace:
[ 378.668302] [<ffffffffa01927dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 378.670463] [<ffffffffa019288c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 378.672398] [<ffffffffa021d036>] lnet_monitor_thr_stop+0xe6/0x120 [lnet]
[ 378.674727] [<ffffffffa01fde8a>] LNetNIFini+0x6a/0x110 [lnet]
[ 378.676532] [<ffffffffa0622b15>] ptlrpc_ni_fini+0x175/0x200 [ptlrpc]
[ 378.678598] [<ffffffffa0622e53>] ptlrpc_exit_portals+0x13/0x20 [ptlrpc]
[ 378.680850] [<ffffffffa06b59aa>] ptlrpc_exit+0x22/0x678 [ptlrpc]
[ 378.683338] [<ffffffff81108aab>] SyS_delete_module+0x19b/0x300
[ 378.684809] [<ffffffff817c8e15>] system_call_fastpath+0x1c/0x21
[ 378.686727] [<ffffffffffffffff>] 0xffffffffffffffff
[ 378.688144] Kernel panic - not syncing: LBUG
This reverts commit db81f3f293dbc0c9dba90ea1153f554b33fbb80b.
Change-Id: Id12f9d3ec4af3ab37158b3e6049d2ea971d86913
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36173
Oleg Drokin [Sat, 17 Aug 2019 05:36:07 +0000 (01:36 -0400)]
LU-12603 ldlm: Check cancel lock count for correctness
Make sure the number of locks we are going to cancel fits into
the supplied buffer first.
Lustre-change: https://review.whamcloud.com/35806
Lustre-commit: 7cc43aef98f6a759cbc5ae572123b44803c0ccd2
Change-Id: I93887133532bf7ee2be27114b1972aa64e06623c
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yunye Ry <yunye.ry@alibaba-inc.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36108
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 1 Aug 2019 20:55:58 +0000 (14:55 -0600)]
LU-12566 mdc: hold lock while walking changelog dev list
In mdc_changelog_cdev_finish() we need chlg_registered_dev_lock
while walking and changing entries on the chlog_registered_devs
and ced_obds lists in chlg_registered_dev_find_by_obd().
Move the call to chlg_registered_dev_find_by_obd() under the
mutex, and add assertions that the mutex is held wherever the
lists are walked or changed.
Lustre-change: https://review.whamcloud.com/35668
Lustre-commit: a260c530801db7f58efa93b774f06b0ce72649a3
Fixes: 1d40214d96dd ("LU-7659 mdc: expose changelog through char devices")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib62fdff87cde6a4bcfb9bea24a2ea72a933ebbe5
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35835
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Nunez [Wed, 22 May 2019 16:22:19 +0000 (10:22 -0600)]
LU-12045 tests: honor EXCEPT tests when using ONLY list
The Lustre test framework allows a user to specify a subset
of tests to run using the ONLY parameter or --only flag.
The test framework also allows the user to specify a list of
tests to skip using the EXCEPT or ALWAYS_EXCEPT parameters.
By default, if the ONLY parameter or --only flag is used,
the EXCEPT and ALWAYS_EXCEPT lists are ignored.
Add a flag to auster, -H, and an environment variable,
HONOR_EXCEPT, to skip the tests on the ALWAYS_EXCEPT,
EXCEPT and SLOW lists when using the ONLY/--only parameter.
Lustre-commit: e636a709bf5948cd944ca9a42d4b74f07557a2ac
Lustre-change: https://review.whamcloud.com/34938
Test-Parameters: trivial
Test-Parameters: envdefinitions=ONLY="40-43" testlist=sanity
Test-Parameters: envdefinitions=ONLY="40-43" austeroptions=-H testlist=sanity
Test-Parameters: envdefinitions=SLOW="no",ONLY="27" testlist=sanity
Test-Parameters: envdefinitions=SLOW="no",ONLY="27" austeroptions=-H testlist=sanity
Test-Parameters: envdefinitions=SLOW="yes",ONLY="27" testlist=sanity
Test-Parameters: envdefinitions=SLOW="yes",ONLY="27" austeroptions=-H testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I173e48e1d2dc3b404d148146639a13148bc48a3d
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35901
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andriy Skulysh [Thu, 6 Jun 2019 12:22:00 +0000 (15:22 +0300)]
LU-12017 ldlm: DoM truncate deadlock
setxattr takes inode lock and sends reint to MDS.
truncate takes MDS_INODELOCK_DOM lock and wants
to acquire inode lock.
The MDS locks are for different bits,
MDS_INODELOCK_UPDATE|MDS_INODELOCK_XATTR vs
MDS_INODELOCK_DOM, but they block each other if
some blocking lock was present earlier.
If IBITS waiting lock has no conflicts with any lock in the
granted queue or any lock ahead in the waiting queue then
it can be granted.
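The grant rule stated above can be sketched as follows. This is a deliberate simplification: two locks are treated as conflicting whenever their inodebits overlap, lock modes are ignored, and sketch_can_grant() is a hypothetical helper rather than an ldlm function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether waiting[idx] can be granted. Simplification: two locks
 * conflict whenever their inodebits overlap; lock modes are ignored.
 */
int sketch_can_grant(const uint64_t *granted, size_t ngranted,
		     const uint64_t *waiting, size_t idx)
{
	uint64_t bits = waiting[idx];

	for (size_t i = 0; i < ngranted; i++)
		if (granted[i] & bits)
			return 0;

	/* Only locks queued ahead of this one can hold it back. */
	for (size_t i = 0; i < idx; i++)
		if (waiting[i] & bits)
			return 0;

	return 1;
}
```

A waiting lock on one bit is therefore no longer held back merely because an unrelated lock on other bits sits ahead of it in the queue.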
Use separate waiting lists for each ibit to eliminate full
lr_waiting list scan.
Lustre-change: https://review.whamcloud.com/35057
Lustre-commit: 2250e072c37855d611aa64027945981fe2c8f4d7
Cray-bug-id: LUS-6970
Change-Id: I95b2ed0b1a0063b7ece5277a5ee06e2511d44e5f
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35937
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Mon, 3 Jun 2019 18:21:53 +0000 (12:21 -0600)]
LU-11285 mdt: improve IBITS lock definitions
Move MDS_INODELOCK_* flags into a named enum, and add the definitions
for the newer flags into wirecheck/wiretest to ensure consistency.
Rename MDS_INODELOCK_MAXSHIFT to MDS_INODELOCK_NUMBITS to hold current
number of lockbits, rather than one less than the number of lockbits,
since the only two places that use it expect it to be one larger than
it is. Fix uses of MDS_INODELOCK_NUMBITS to be number of locks. This
does not change the value of MDS_INODELOCK_FULL, which is used in the
protocol to exchange supported lock bits between client and server.
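The off-by-one being removed is easy to see in a small sketch. The numeric values below are hypothetical and chosen only to show the arithmetic; the names are stand-ins for the real MDS_INODELOCK_* definitions.

```c
/* Hypothetical values chosen only to show the arithmetic. */
enum {
	SKETCH_OLD_MAXSHIFT = 6,	/* highest bit index: one less than the count */
	SKETCH_NUMBITS      = 7,	/* number of lock bits */
};

/* Old form: callers had to add one before shifting. */
unsigned int sketch_full_old(void)
{
	return (1u << (SKETCH_OLD_MAXSHIFT + 1)) - 1;
}

/* New form: the count is used directly; the mask is unchanged. */
unsigned int sketch_full_new(void)
{
	return (1u << SKETCH_NUMBITS) - 1;
}
```

Both forms yield the same mask, which is why the rename does not affect the value exchanged between client and server.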
Lustre-change: https://review.whamcloud.com/35045
Lustre-commit: 3611352b699ce479779c0ff92ca558d9321e58a2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c2985bcc602b7182d5db2cf8d590923be2cab07
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Hongchao Zhang [Thu, 4 Jul 2019 13:39:24 +0000 (09:39 -0400)]
LU-11761 fld: let's caller to retry FLD_QUERY
In fld_client_rpc(), if the FLD_QUERY request between MDTs fails
with -EWOULDBLOCK because the connection is lost, return -EAGAIN
to notify the caller to retry.
This also reverts the patch https://review.whamcloud.com/12586/, which
was landed on b2_6_90_0-5-g6db07f0 to avoid returning -EAGAIN from
lod_object_init() and confusing lu_object_find_at() (which assumed the
object was dying when it encountered -EAGAIN). In the current Lustre
version, lu_object_find_at() just returns the found object and lets the
caller check whether it is dying.
Fixes: 6db07f095fba ("LU-5871 lod: Do not return EAGAIN in lod_object_init")
Lustre-change: https://review.whamcloud.com/34962
Lustre-commit: e3f6111dfd1c6f2266d0beef67e5a7514a6965d0
Change-Id: Ie83ebfdae2bd50c96a59a065f7f3c3dcfad04e42
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35661
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Sat, 6 Apr 2019 00:38:38 +0000 (17:38 -0700)]
LU-12163 lnet: fix cpt locking
In lnet_select_pathway() the call to lnet_handle_send_case_locked()
can result in sd_cpt being changed. If this function returns
REPEAT_SEND, we'll go back to the again label. It is possible at
this time to initiate discovery, which will unlock the cpt.
If the local cpt isn't updated we could potentially be manipulating
the wrong cpt resulting in some form of corruption or deadlock.
Lustre-change: https://review.whamcloud.com/34607
Lustre-commit: f6d63067e1ec00009b9da5cdb263fe14e7e503e1
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifd39b0d84f8cce859151f7cc900a082481dd7218
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Wed, 19 Dec 2018 23:55:49 +0000 (15:55 -0800)]
LU-11816 lnet: setup health timeout defaults
Enable health feature by default.
Set the transaction timeout to a default of 10 seconds and the
retry count to 3 when health is enabled. When health
is disabled, set the default transaction timeout to 50.
When toggling between health enabled/disabled, the defaults
always take effect.
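The defaulting behaviour can be sketched as below. struct sketch_lnet_cfg and sketch_apply_health_defaults() are hypothetical; the retry count for the disabled case is not stated in the commit message, so the sketch leaves it untouched.

```c
/* Hypothetical, reduced tunables (not the LNet structures). */
struct sketch_lnet_cfg {
	int health_enabled;
	unsigned int transaction_timeout;
	unsigned int retry_count;
};

/* Apply the defaults every time health is toggled. */
void sketch_apply_health_defaults(struct sketch_lnet_cfg *cfg, int enable)
{
	cfg->health_enabled = enable;
	if (enable) {
		cfg->transaction_timeout = 10;
		cfg->retry_count = 3;
	} else {
		/* Retry count for this case is not given in the source. */
		cfg->transaction_timeout = 50;
	}
}
```

Because the defaults are applied on every toggle, a value tuned while health was enabled does not survive a switch to disabled and back in this sketch.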
Lustre-change: https://review.whamcloud.com/34252
Lustre-commit: 8632e94aeb7e62da07f342a9897d15dfd8251148
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I153c2822898b44e33871ec827de7e61f153bb1db
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Olaf Faaland [Fri, 2 Aug 2019 16:38:50 +0000 (09:38 -0700)]
LU-12626 lnet: create existing net returns EEXIST
When "lnetctl net add" is called for an interface/net pair that
already exists, the error returned should be EEXIST, so the
user knows that the net is already configured.
Lustre-change: https://review.whamcloud.com/35681
Lustre-commit:
4aa71267cc0317e126843509f1c5b237f469414b
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Idab79ab288a11a2920793f27df235b4dfab497fe
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35953
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Fri, 28 Jun 2019 12:19:56 +0000 (20:19 +0800)]
LU-12485 obdclass: 0-nlink race in lu_object_find_at()
There is a race in lu_object_find_at: in the gap between
lu_object_alloc() and hash insertion, another thread may
have allocated another object for the same file and unlinked
it, so we may get an object with 0-nlink, which will trigger
assertion in osd_object_release().
To avoid this race, initialize the object after hash insertion.
This may cause an uninitialized object to be found in the cache;
if so, wait for the allocator to finish initializing the object.
To reproduce the race, introduce cfs_race_wait() and
cfs_race_wakeup(): cfs_race_wait() makes the calling thread
wait on the race, while cfs_race_wakeup() wakes up the waiting
thread. As with cfs_race(), CFS_FAIL_ONCE
should be set together with fail_loc.
Add sanityn test_84.
Lustre-change: https://review.whamcloud.com/35360
Lustre-commit:
2ff420913b9718ee8d80ae51fddc6e5df4a3148a
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I0869f254544256987b73f0ff92f75e4d1562e566
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35834
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Tue, 7 May 2019 15:55:04 +0000 (00:55 +0900)]
LU-12267 tests: update filter in acl for SElinux case
With SELinux enforcing on the client, sanity.sh test_103a fails because
the "ls -l" command appends an extra '.' to the permission string to
indicate that extra security attributes are set.
So update the filter to remove this trailing '.' from the output.
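As a rough illustration of the filtering described above, the following Python sketch strips the trailing '.' from the mode column of "ls -l" style output. The function name and the regular expression are assumptions made for this example; the actual change is in the sanity.sh shell code and may be implemented differently.

```python
import re

# Matches a 10-character mode string at the start of a line, followed by the
# '.' that ls appends when a file carries an SELinux security context.
_MODE_WITH_DOT = re.compile(r"^([bcdlps-][rwxsStT-]{9})\.", re.MULTILINE)


def strip_selinux_dot(ls_output: str) -> str:
    """Return ls output with the trailing '.' removed from each mode string."""
    return _MODE_WITH_DOT.sub(r"\1", ls_output)
```

Lines whose mode string has no trailing '.' pass through unchanged, so the same filter can be applied whether or not SELinux is enabled.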
Lustre-change: https://review.whamcloud.com/34818
Lustre-commit:
3f6294a482651802fb97175b6e8c6568a371352a
Test-Parameters: trivial testlist=sanity envdefinitions=ONLY=103a
Test-Parameters: clientselinux testlist=sanity envdefinitions=ONLY=103a
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie684a3fe02f0f2821c8059855165a0f9dd585b72
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35957
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Sergey Cheremencev [Fri, 28 Jun 2019 20:42:28 +0000 (23:42 +0300)]
LU-11760 ofd: limit num of objects to create in 1 transaction
Set flag th_sync when the number of objects to create per
sequence reaches OST_MAX_PRECREATE in one transaction.
It is needed to avoid gaps after OST failover.
See details in LU-11760.
Lustre-change: https://review.whamcloud.com/35373
Lustre-commit:
4485ee8be4cf224e2543f6344efc6e1cb295a0a7
Change-Id: Ie29de5a42e757b07561749982359c01df999e798
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35951
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Tue, 30 Jul 2019 18:10:32 +0000 (14:10 -0400)]
LU-12600 tgt: shortio size should be unsigned
The short_io_size value is accepting unsigned values from
req_capsule_get_size, and so needs to be unsigned as well.
If it's not, it's possible for the short_io_size memcopy to
act on an incorrect value and cause memory corruption.
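The hazard described above can be shown with a small Python model. It uses ctypes to reinterpret the same 32-bit pattern as signed and as unsigned; the helper names, the limit, and the bounds check are hypothetical and are not taken from the Lustre source.

```python
import ctypes


def as_signed32(raw: int) -> int:
    """Interpret a 32-bit pattern as a signed int."""
    return ctypes.c_int32(raw).value


def as_unsigned32(raw: int) -> int:
    """Interpret the same 32-bit pattern as an unsigned int."""
    return ctypes.c_uint32(raw).value


def passes_upper_bound(size: int, limit: int) -> bool:
    """A naive 'size <= limit' check, as might guard a copy."""
    return size <= limit


RAW = 0xFFFFFFF0   # a very large unsigned size
LIMIT = 4096       # hypothetical maximum short I/O size

# Signed: the value looks negative, so the naive check lets it through.
signed_ok = passes_upper_bound(as_signed32(RAW), LIMIT)
# Unsigned: the value is close to 4 GiB, so the same check rejects it.
unsigned_ok = passes_upper_bound(as_unsigned32(RAW), LIMIT)
```

If a negative size that slipped past such a check were later passed to a copy routine, it would be converted back to a huge unsigned length, which is the kind of corruption the commit message warns about.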
Lustre-change: https://review.whamcloud.com/35653
Lustre-commit:
4c3864cf97711d73b12905fea720570cf814d179
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I043e314cd43a7b40519f951a605fa5a36ff91dcf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35867
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Emoly Liu [Fri, 9 Aug 2019 07:29:30 +0000 (15:29 +0800)]
LU-12605 tgt: check client data size in target_handle_connect()
Check the client data size (negative or excessively large) to guard
against memcpy corruption.
Lustre-change: https://review.whamcloud.com/35711
Lustre-commit:
149f005a3199eee13fe6396671613a0f620ee0cc
Change-Id: Ided26dea0e2bbb79e607c626810834ca947497d4
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35935
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 11 Jun 2019 18:54:20 +0000 (14:54 -0400)]
LU-12394 llite: Fix extents_stats
Patch 32517 from LU-8066 changed:
(1 << LL_HIST_START << i)
To:
BIT(LL_HIST_START << i)
But these are not equivalent because this changes the order
of operations. The earlier one does the operations in this
order:
(1 << LL_HIST_START) << i
The new one is this order:
1 << (LL_HIST_START << i)
Which is quite different, as it's left shifting
LL_HIST_START directly, and LL_HIST_START is a number of
bits.
The goal is really just to start with BIT(LL_HIST_START)
and left shift by one (going from 4K, to 8K, etc) each
time, so just use:
BIT(LL_HIST_START + i)
The result of this was that all i/os over 8K were placed in
the 4K-8K stat bucket, because the loop exited early.
Also add mmap'ed reads & writes to extents_stats.
Add test for extents_stats.
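The difference between the three shift expressions discussed above can be checked with a short Python sketch. Python's shift operator associates left to right as C's does, so the grouping is the same. LL_HIST_START = 12 is an assumption here (it matches a first bucket boundary of 4K), and BIT() is re-implemented locally to mirror the kernel macro.

```python
LL_HIST_START = 12  # assumed: number of bits for the 4K bucket boundary


def BIT(n: int) -> int:
    """Local stand-in for the kernel's BIT() macro: one bit at position n."""
    return 1 << n


def original(i: int) -> int:
    # Shifts group left to right: (1 << LL_HIST_START) << i
    return 1 << LL_HIST_START << i


def broken(i: int) -> int:
    # BIT() wraps its whole argument: 1 << (LL_HIST_START << i)
    return BIT(LL_HIST_START << i)


def fixed(i: int) -> int:
    # Start at BIT(LL_HIST_START) and move up one bit per bucket.
    return BIT(LL_HIST_START + i)
```

Under these assumptions, original(1) and fixed(1) both give 8192, while broken(1) gives 1 << 24; the broken form only agrees with the others at i = 0.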
Fixes:
adb5aca3d673 ("LU-8066 llite: Move all remaining procfs entries
to debugfs")
Lustre-change: https://review.whamcloud.com/35075
Lustre-commit:
d31a4dad4e698c537dff3d018fd67f196b2b293f
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iab4dc097234d411601a18d501075df45791d1138
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35866
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Hongchao Zhang [Sat, 13 Jul 2019 12:07:07 +0000 (08:07 -0400)]
LU-12615 mdt: check mdt_object
When processing getattr, getxattr, swap_layouts and sync RPCs,
the mdt_object should be checked to verify there is a valid
RMF_MDT_BODY field and OBD_MD_FLID is set properly.
Lustre-change: https://review.whamcloud.com/35764
Lustre-commit:
e5e0bdb7a5c2d47ceaa2d1c190806d1be4999129
Change-Id: Ibb6aaa5ec5eb4b7284f4d5567a618a908d66920c
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35869
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Nunez [Fri, 31 May 2019 21:28:20 +0000 (15:28 -0600)]
LU-12267 tests: filter trailing '.' for SELinux
When SELinux is enforced, sanity test 420 fails due to
the "ls -n" command producing an extra '.' at the end of
the file/directory permissions to indicate extra security
attributes are set.
We need to filter out the trailing '.' in the 'ls -n'
output for testing to pass when SELinux is enabled.
Lustre-change: https://review.whamcloud.com/35026
Lustre-commit:
f000996069acc7d535b7574a9d9a4ab65e753ff0
Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Test-Parameters: clientselinux envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a2f199d2ef4a7b1b6a1b381041b384bb0077cc6
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 28 Jun 2019 15:32:29 +0000 (11:32 -0400)]
LU-11873 tests: Increase barrier freeze time
Barrier freeze times of 10 seconds or less are roughly the
same length as ZFS commit intervals, and because barriers
generate sync ops, they have to wait for those. This means
that a 10 second barrier will occasionally expire before
the commit has finished.
Switch to barriers of at least 20 seconds.
Lustre-change: https://review.whamcloud.com/35361
Lustre-commit:
96771280b330af07781326ff8811facd1ca39deb
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I50fc8315c791ed444ccf39755441fdbe3aa1db6c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Fri, 9 Aug 2019 19:43:45 +0000 (23:43 +0400)]
LU-12657 llite: forget cached ACLs properly
Lustre with linux-4.* fails ACL tests (e.g. sanity/103 and sanityn/25)
because ll_lock_cancel_bits() does not reset i_acl and i_default_acl
to their initial state. Use the kernel's forget_all_cached_acls() to do so.
Lustre-change: https://review.whamcloud.com/35756
Lustre-commit:
3df034f8f46b0d22829f7ac83cbf9871823c093c
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I468b775e13ba0f7279a6aa320983705f5e79187a
Reviewed-by: Neil Brown <neilb@suse.com>
Tested-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35870
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Sat, 17 Aug 2019 05:43:49 +0000 (22:43 -0700)]
LU-12457 kernel: RHEL 7.7 server support
This patch makes changes to support new RHEL 7.7 release
for Lustre server.
Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7
Change-Id: Ic56e087e6c89f1bbd1ab247c44b2e979828f34f9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35808
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Fri, 5 Jul 2019 16:50:11 +0000 (12:50 -0400)]
LU-12470 tests: clear MDT-MDT locks for pdo tests
Clearing client locks is not sufficient to avoid
spillover from one test to the next in the pdo tests; we
must also clear MDT-MDT locks or we can end up waiting for
one of those.
Lustre-change: https://review.whamcloud.com/35321
Lustre-commit:
43ed7101e10e395839f9406bead6a5ac4fb02997
Test-Parameters: trivial testlist=sanityn
Test-Parameters: fstype=zfs testlist=sanityn
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanityn
Test-Parameters: mdscount=2 mdtcount=4 fstype=zfs testlist=sanityn
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8b6a1a6e9a1268a5d87bcb216f54736118ae7ba0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Mon, 25 Jun 2018 13:28:25 +0000 (16:28 +0300)]
LU-11537 osp: avoid nested transaction
Don't create and start a new transaction in
osp_precreate_reserve (osp_sync_force)
because one has already been started in mdd_create.
The new transaction overwrites oti_declare_ops_cred,
resulting in an assertion in osd_trans_exec_op:
osd_trans_exec_op()) lustre-MDT0000: opcode 3: credits = 0, rollback = 4
osd_trans_exec_op()) ASSERTION( !ldiskfs_track_declares_assert ) failed:
...
#2 [ffff88008983f600] panic at ffffffff816a863f
#3 [ffff88008983f680] lbug_with_loc at ffffffffc0513854 [libcfs]
#4 [ffff88008983f6a0] osd_create at ffffffffc0dfac32 [osd_ldiskfs]
#5 [ffff88008983f718] lod_sub_create at ffffffffc101b585 [lod]
#6 [ffff88008983f7c0] lod_create at ffffffffc100d6e9 [lod]
#7 [ffff88008983f800] mdd_create_object_internal at ffffffffc0ed8888 [mdd]
#8 [ffff88008983f838] mdd_create_object at ffffffffc0ec3e05 [mdd]
#9 [ffff88008983f8b0] mdd_create at ffffffffc0ecc673 [mdd]
Lustre-change: https://es-gerrit.dev.cray.com/153461
Lustre-commit:
f9c20f472cb9f500a609bee1db68868cf2ac3c13
Change-Id: Ic2c4589a9a1f640c7a0aa989fc62d81ca08f917f
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Cray-bug-id: LUS-6098
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/33391
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Thu, 18 Jul 2019 19:52:11 +0000 (13:52 -0600)]
LU-12034 obdclass: disable server-only code on client
The lu_env_add(), lu_env_remove(), and lu_env_find() functions are
only used on the server. Conditionally remove them when doing a
client-only build.
Fixes:
fce8d80624fd ("LU-12034 obdclass: put all service env on list")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I15f4d4de583bb3f9d16adad3ea16f961853ebbe5
Reviewed-on: https://review.whamcloud.com/35566
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Thu, 6 Jun 2019 15:59:18 +0000 (10:59 -0500)]
LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning
Add test to ensure that l_tunedisk only tunes the max_sectors_kb
value of OST devices, and that it properly tunes any slave devices.
Test-parameters: trivial
Test-parameters: fstype=ldiskfs testlist=conf-sanity \
envdefinitions=ONLY=125
Lustre-change: https://review.whamcloud.com/35081
Lustre-commit:
ac8bbb3ddd646e4aa04b77cb1e7640b5865f2c04
Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I414526e71fd7ac2811d7c0e8a6afd80a50788258
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35372
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Mon, 12 Aug 2019 17:38:31 +0000 (10:38 -0700)]
LU-12660 kernel: kernel update SLES12 SP4 [4.12.14-95.29.1]
Update SLES12 SP4 kernel to 4.12.14-95.29.1 for Lustre client.
Test-Parameters: trivial clientdistro=sles12sp4 \
envdefinitions=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT=103a
Change-Id: I93c9a255bfa7f5048cd5acf9efe3af707e08e38c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35775
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Sun, 18 Aug 2019 06:09:19 +0000 (23:09 -0700)]
LU-10094 mdc: dir page ldp_hash_end mistakenly adjusted
On systems with PAGE_SIZE > 4k, mdc_adjust_dirpages() sets the dir page
end hash to a le64_to_cpu() value, but the field should stay little-endian.
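The endianness problem can be illustrated with a small Python model of a big-endian host, where a native store writes the most significant byte first. The functions below borrow the names of the kernel helpers but are re-implemented here purely for illustration, and the sample hash value is arbitrary.

```python
def le64_to_cpu(field: bytes) -> int:
    """Decode an 8-byte little-endian field into a CPU-order integer."""
    return int.from_bytes(field, "little")


def cpu_to_le64(value: int) -> bytes:
    """Encode a CPU-order integer as an 8-byte little-endian field."""
    return value.to_bytes(8, "little")


def native_store_big_endian(value: int) -> bytes:
    """What a big-endian host writes when it assigns a CPU-order integer."""
    return value.to_bytes(8, "big")


hash_end = 0x0123456789ABCDEF

# Buggy: a CPU-order value is stored into a field that readers treat as LE.
bad_field = native_store_big_endian(hash_end)
# Correct: the field keeps its little-endian encoding.
good_field = cpu_to_le64(hash_end)
```

A later reader that applies le64_to_cpu() recovers hash_end from good_field but a byte-swapped value from bad_field. On a little-endian host both stores produce the same bytes, which would explain why the problem only shows up on architectures such as ppc64.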
Fixes:
9d087dfd0fd ("LU-4516 mdc: missing lexxx_to_cpu in
mdc_read_entry")
This patch is back-ported from the following one:
Lustre-commit:
d8b19ae6617733df003a906aca1791791a5f0eff
Lustre-change: https://review.whamcloud.com/35517
Test-Parameters: clientarch=ppc64 envdefinitions=ONLY="18 22 32 48" \
testlist=sanity
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I89bb8b93f1fe5f7962f0b80d122ef9965cf15c63
Reviewed-on: https://review.whamcloud.com/35812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Patrick Farrell [Wed, 24 Jul 2019 19:50:23 +0000 (15:50 -0400)]
LU-12586 lov: Correct write_intent end for trunc
When instantiating a layout, the server interprets the
write intent from the client as the range [start, end), not
including the last byte.
This is correct for writes because the last byte given for
a write is actually 'endpos', the resulting file pointer
position, and so is not included.
However, truncate specifies a size, not an endpos, so
truncate is [start, size]. To make this work with the
[start, end) processing for write_intents, we have to add
1 to the size when sending a write intent.
Without this, a truncate operation to the first byte of a
new layout component fails silently because the component
is not instantiated.
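The off-by-one described above can be sketched as a half-open range overlap test in Python. The function names, the component boundaries, and the add_one switch are hypothetical and only model the reasoning in this message, not the lov code itself.

```python
def overlaps(comp_start, comp_end, intent_start, intent_end):
    """True if two half-open byte ranges [start, end) share at least one byte."""
    return intent_start < comp_end and comp_start < intent_end


def write_intent_for_write(pos, count):
    """A write's end is 'endpos', the resulting file position, so it is exclusive."""
    return (pos, pos + count)


def write_intent_for_truncate(start, size, add_one=True):
    """Truncate passes a size; adding 1 makes it fit [start, end) processing."""
    return (start, size + 1 if add_one else size)


MIB = 1 << 20
SECOND_COMPONENT = (MIB, 4 * MIB)  # hypothetical component covering [1 MiB, 4 MiB)
```

With these definitions, a truncate to exactly 1 MiB does not overlap SECOND_COMPONENT when add_one is False, so the component would not be instantiated; with the default add_one=True it does overlap. A write of 1 MiB starting at offset 0 correctly does not touch the second component.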
Lustre-change: https://review.whamcloud.com/35607
Lustre-commit:
c32c7401426d46b371fa993bba17265443fefa1b
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id2b07abe73455bf1f0ed841ad08c5f381a871315
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vladimir Saveliev [Mon, 9 Apr 2018 09:18:50 +0000 (12:18 +0300)]
LU-11634 tests: sanityn/test_77 improvements
sshd limits number of simultaneous unauthenticated connections via
MaxStartups configuration parameter. By default, 10 connections are
allowed. nrs_write_read() tries to run up to 32 do_nodes() in
parallel, causing sshd to drop some of the connections.
The fix is to have do_nodes() start the required number of dd-s in
parallel.
Minor changes that were probably intended during development:
- Test filenames include $HOSTNAME, apparently so that each client works
with its own file. Add the missing escaping backslashes so that $HOSTNAME
works as expected.
- Add the conv=notrunc parameter for dd-s that write to a Lustre file at
different seeks.
- Have reading dd-s read files that were created specifically for
that.
- Use /dev/null instead of /dev/zero to throw read data away.
Lustre-change: https://review.whamcloud.com/33607
Lustre-commit:
43ac1425ad9e8c5fc1e7deff579a443b5c9c7a58
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I496b0f6b50811351ac8e0e606cf5a20843fab5d4
Cray-bug-id: LUS-2493
Test-Parameters: testlist=sanityn envdefinitions=ONLY=77
Reviewed-on: https://review.whamcloud.com/33607
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/35735
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Thu, 11 Jul 2019 00:52:34 +0000 (20:52 -0400)]
LU-10756 ptlrpc: change IMPORT_SET_* macros into real functions
Make the IMPORT_SET_STATE_NOLOCK and IMPORT_SET_STATE macros into
normal functions. Since import_set_state_nolock() is basically a
wrapper around __import_set_state() we can merge both functions.
Lustre-change: https://review.whamcloud.com/35463
Lustre-commit:
cf78502e48d6dbbc0d6c113e573ba9c68c5c311e
Change-Id: Idaa6aeb81ff2282e2f83d758a267129e686bd794
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35795
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Patrick Farrell [Wed, 29 May 2019 15:02:19 +0000 (11:02 -0400)]
LU-12343 osc: Fix dom handling in weight_ast
The DOM bit can be cancelled at any time during calls to
weigh_ast, so:
1. We cannot assert that it is present
2. We cannot use it to identify the !LDLM_EXTENT case when
calling osc_lock_weight
Lustre-change: https://review.whamcloud.com/34966
Lustre-commit:
4f3ce87a06e6ed90373218d3aa1eb34a7675db65
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic3e7370580e35d8ae06b8330971959e0d55a4e81
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35858
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 14 Dec 2018 22:43:41 +0000 (15:43 -0700)]
LU-8130 libcfs: don't include rhashtable if unavailable
Don't include <linux/rhashtable.h> if it is not available.
Lustre-change: https://review.whamcloud.com/34020
Lustre-commit:
29d627f860bc1963f2103ea441577dbd18d71344
Fixes:
ac8d93c9f6f9 ("LU-8130 libcfs: support latest rhashtable API")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I80b2ee63fb2a438399359f8052a5063429254035
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35565
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Fri, 12 Jul 2019 13:23:29 +0000 (15:23 +0200)]
LU-12539 build: pass --with-o2ib when building deb packages
When building deb packages (make debs), the '--with-o2ib' option is
not passed to the ./configure call made by the packaging mechanism,
so Lustre deb packages are possibly built against the wrong OFED headers.
Lustre-change: https://review.whamcloud.com/35481
Lustre-commit:
8d7f2674337e4f22e200e08ca1ac001ec24b4496
Test-Parameters: trivial
Test-Parameters: trivial clientdistro=ubuntu1804
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9cd1db54e77b97f46c0e0bdfe35084f1a268b70b
Reviewed-on: https://review.whamcloud.com/35828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Thu, 8 Aug 2019 00:56:26 +0000 (17:56 -0700)]
LU-12457 kernel: new kernel [RHEL 7.7 3.10.0-1062.el7]
This patch makes changes to support new RHEL 7.7 release
for Lustre client.
Test-Parameters: trivial clientdistro=el7.7
Change-Id: I1fd68b56340c8674c9fae607e05faca04ba99a5a
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35725
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Thu, 8 Aug 2019 19:13:47 +0000 (12:13 -0700)]
LU-12589 llite: swab LOV EA data in ll_getxattr_lov()
On a PPC client, the LOV EA data returned by getfattr from an x86_64 server
was not swabbed to the host endianness. While running setfattr, the data was
swabbed in ll_lov_setstripe_ea_info(), which caused a magic mismatch in
ll_lov_user_md_size(), and then ll_setstripe_ea() returned -ERANGE.
This patch fixes the issue by swabbing LOV EA data in ll_getxattr_lov().
Test-Parameters: clientarch=ppc64 \
envdefinitions=ONLY="24D 102a" testlist=sanity
This patch is back-ported from the following one:
Lustre-commit:
f4a5957164bb981c93072bb0a28118bb7207a209
Lustre-change: https://review.whamcloud.com/35626
Change-Id: I8069df0c8f07c0bedba2e27db7c3a5553f11afb4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35736
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Thu, 18 Apr 2019 23:07:39 +0000 (19:07 -0400)]
LU-11771 ldlm: use hrtimer for recovery to fix timeout messages
Currently the functions target_handle_connect/reconnect report an
incorrect time remaining until the end of recovery:
fs1-OST0000: Recovery already passed deadline 71578:57.
If you do not want to wait more, please abort the recovery by force.
...
fs1-OST0000: Denying connection for new client ...
(1 recovered, 11 in progress, and 1 evicted) to recover in 71578:57
This is due to the assumption that the monotonic clock and
jiffies were initialized at the same time, which is not the
case. A comparison between ktime_get_seconds()
and jiffies converted to seconds is therefore invalid.
We solve this by replacing the recovery timer with an hrtimer-based
one. There are many benefits to using an hrtimer over jiffies, such as
better scaling, a better power profile, and better handling on tickless
systems. This also makes the code clearer by using just the real wall
clock in all cases.
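The effect of mixing the two time bases can be modelled in a few lines of Python. This is only a sketch: HZ = 1000 and the INITIAL_JIFFIES offset (jiffies starting five minutes before a 32-bit wrap) are assumptions about the kernel configuration, and the helper functions are hypothetical stand-ins rather than the Lustre recovery code.

```python
HZ = 1000  # assumed tick rate
# Assumed starting value of jiffies: 5 minutes before a 32-bit wrap.
INITIAL_JIFFIES = (-300 * HZ) & 0xFFFFFFFF


def jiffies_at(uptime_s: int) -> int:
    """Model of the jiffies counter after uptime_s seconds."""
    return INITIAL_JIFFIES + uptime_s * HZ


def monotonic_seconds_at(uptime_s: int) -> int:
    """Model of ktime_get_seconds(): seconds on the monotonic clock."""
    return uptime_s


def remaining_mixed(deadline_s: int, now_s: int) -> int:
    """Invalid: deadline taken from jiffies, 'now' from the monotonic clock."""
    return jiffies_at(deadline_s) // HZ - monotonic_seconds_at(now_s)


def remaining_same_clock(deadline_s: int, now_s: int) -> int:
    """Valid: both readings come from the same clock."""
    return monotonic_seconds_at(deadline_s) - monotonic_seconds_at(now_s)


def as_min_sec(seconds: int) -> str:
    """Format a duration as minutes:seconds, like the recovery messages."""
    return f"{seconds // 60}:{seconds % 60:02d}"
```

With a deadline 30 seconds away, remaining_same_clock(130, 100) gives 30, while remaining_mixed(130, 100) gives several million seconds, the same order of magnitude as the figure in the messages quoted above. The exact arithmetic in Lustre may differ.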
Change-Id: I9d7e7e92e67ee942bc1dc51fbb0af7d8f53e54e1
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34710
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-on: https://review.whamcloud.com/35276
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Tue, 18 Jun 2019 09:18:27 +0000 (13:18 +0400)]
LU-12516 mdd: support for volatile creation in .lustre
This is useful to enable striping manipulation by FIDs.
Lustre-change: https://review.whamcloud.com/35258
Lustre-commit:
9a0a864112550047ae7236c7a904dc7a9955880e
Change-Id: I4d5b1b13acdfef21ac46bf3557e9ab6d5ccc796b
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35620
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Wed, 10 Jul 2019 16:53:59 +0000 (10:53 -0600)]
LU-12501 utils: fix 'lfs df' printing loop
If the OS_STATE_NONROT flag is set for a device, the showdf() state
printing loop will spin endlessly because this bit is not printed,
so it is never cleared from the loop's state mask.
Declaring the obd_statfs_state_names[] array indexed by OS_STATE_*
flags also is problematic because the array will double in size as
new binary flags are added (already OS_STATE_NONROT results in an
array size of 0x200 = 512 entries). Instead, declare a struct that
is indexed linearly and stores the OS_STATE_* flag in a field,
along with the name and whether the flag indicates a problem state.
The flag printing loop can iterate over the array of flags instead
of the os_state bits, which clarifies the for-loop iteration and is
equally efficient.
This also allows printing informational flags with "lfs df -v" so
that OS_STATE_NONROT and similar flags can be visible to users.
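The table-driven approach described above can be sketched in Python as follows. The flag values, the one-letter names, and the function name are placeholders chosen for this example; the real definitions are in the Lustre headers and in lfs.

```python
from typing import NamedTuple

# Flag values and one-letter names are placeholders for illustration only.
OS_STATE_DEGRADED = 0x1
OS_STATE_READONLY = 0x2
OS_STATE_NONROT = 0x200


class StateName(NamedTuple):
    flag: int      # the OS_STATE_* bit this entry describes
    name: str      # text printed for the flag
    problem: bool  # True if the flag indicates a problem state


# Indexed linearly: one entry per flag, regardless of the flag's bit value.
STATE_NAMES = [
    StateName(OS_STATE_DEGRADED, "D", True),
    StateName(OS_STATE_READONLY, "R", True),
    StateName(OS_STATE_NONROT, "N", False),
]


def state_string(os_state: int, verbose: bool = False) -> str:
    """Build the state suffix by walking the table, not the state bits."""
    shown = []
    for entry in STATE_NAMES:
        if not os_state & entry.flag:
            continue
        # Informational flags are only shown in verbose mode.
        if entry.problem or verbose:
            shown.append(entry.name)
    return "".join(shown)
```

Because the loop is bounded by the table length, a set bit that is never printed (or that has no table entry at all) cannot keep the loop running, and the table grows by one entry per new flag rather than doubling.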
Fixes:
68635c3d9b3 ("LU-11963 osd: Add nonrotational flag to statfs")
Lustre-change: https://review.whamcloud.com/35456
Lustre-commit:
e4d92a8a08acbdca6634decd4deb9fe5678ad7ba
Change-Id: Ib62e949ca56d691c4699d5f2d9439c42643ebbe5
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35662
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Thu, 13 Jun 2019 20:51:44 +0000 (16:51 -0400)]
LU-11963 osd: Add nonrotational flag to statfs
It is potentially useful for the MDS and userspace to
know whether or not an OST is using non-rotational media.
Add a flag to obd_statfs that reflects this.
Users can override this parameter in proc.
ZFS does not currently make this information available to
Lustre, so default to rotational and allow users to
override.
Lustre-Change: https://review.whamcloud.com/34235
Lustre-Commit:
68635c3d9b3113621b93fd989f1a3f8f064385b9
LU-12396 utils: lfs should not output 'nul' char
If lfs prints a nul char, it breaks parsing of the output.
Fixes:
68635c3d9b31 ("LU-11963 osd: Add nonrotational flag to statfs")
Lustre-Change: https://review.whamcloud.com/35137
Lustre-Commit:
fd3958b61c5f1c7ed520f07553b999af5522d8e0
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iac2b54c5d8cc1eb79cdace764e93578c7b058661
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Gu Zheng [Thu, 11 Jul 2019 05:52:38 +0000 (13:52 +0800)]
LU-11672 ldlm: always cancel aged locks regardless enabling or disabling lru resize
Currently, cancelling aged locks is handled by the ldlm_pool_recalc routine,
and it only works when lru resize is enabled. If lru resize is disabled,
aged locks stay cached even after they reach
ns_max_age.
In principle, lru_max_age should behave the same whether lru resize is
enabled or disabled. lru_size acts as a hard limit on the
number of locks, whereas ns_max_age/lru_max_age is an elimination mechanism:
regardless of whether lru resize is enabled, once a lock reaches
lru_max_age it needs to be cancelled.
Fix this by changing the lru flags passed when invoking ldlm_cancel_lru
to do the real cancel work: if lru resize is enabled, set the flag to
LDLM_LRU_FLAG_LRUR, otherwise to LDLM_LRU_FLAG_AGED.
Lustre-change: https://review.whamcloud.com/35467
Lustre-commit:
e4c490bac7701435cb08ce444d9b23b8fd1dd839
Change-Id: Ic2df2550af87fd7209fdb31ca3730683d727a74d
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35660
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Tue, 11 Jun 2019 15:39:17 +0000 (11:39 -0400)]
LU-8066 tests: use lod / osp tunables on servers
Before the Lustre 2.4 OSD work, the lov and osc code was used on
both servers and clients. With the OSD layer work, new lod and osp
layers were created that are server specific. To avoid breakage,
symlinks were created from the lod / osp to the lov / osc
directories in the proc tree on the server side.
It has been a very long time since that change so we can now
safely start to unwind that handling. The first step taken here
is to migrate the maloo test from using lov / osc for the server
tunables to using lod / osp instead.
Lustre-commit:
c2f43d4c7a609def4292c5b9bee63c9a33cb4598
Reviewed-on: https://review.whamcloud.com/35185
Change-Id: I9dd562cd74d68aaa0226d5ab93042b52193604a1
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35349
Tested-by: jenkins <devops@whamcloud.com>
Hongchao Zhang [Fri, 29 Mar 2019 13:28:06 +0000 (09:28 -0400)]
LU-11678 quota: make overquota flag for old req
For an old request that carries the over quota flag, the flag
should still be marked at the OSC, because the old request could be
processed after a newer request at the OST. Marking it this way
keeps quota enforcement at the OST from being broken.
Lustre-change: https://review.whamcloud.com/34645
Lustre-commit:
c59cf862c3c06758c270564dd6e8948e167316b9
Test-Parameters: testlist=replay-single,replay-single,replay-single
Test-Parameters: testlist=replay-single,replay-single,replay-single
Change-Id: Ic34c438fe3f018c3b596b26ad6dc945547c8fada
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34916
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Mon, 29 Jul 2019 07:33:52 +0000 (00:33 -0700)]
LU-10100 llite: swab LOV EA user data
Many sub-tests failed with "Invalid argument" errors
on PPC clients because of an endianness issue.
This patch fixes the issue by adding a common function,
lustre_swab_lov_user_md(), to swab the LOV EA user data.
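As a rough illustration of what swabbing a structure involves, the sketch below byte-swaps each field of a small struct in place. The struct layout, its field names, and the demo_* functions are assumptions for the example. They do not reproduce the real struct lov_user_md or the body of lustre_swab_lov_user_md().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only (assumed); NOT the real struct lov_user_md. */
struct demo_lov_user_md {
	uint32_t magic;
	uint32_t pattern;
	uint32_t stripe_size;
	uint16_t stripe_count;
	uint16_t stripe_offset;
};

/* Reverse the byte order of a 16-bit value. */
static uint16_t demo_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Swap every multi-byte field in place, one field at a time. */
static void demo_swab_lov_user_md(struct demo_lov_user_md *lum)
{
	assert(lum != NULL);
	lum->magic = demo_swab32(lum->magic);
	lum->pattern = demo_swab32(lum->pattern);
	lum->stripe_size = demo_swab32(lum->stripe_size);
	lum->stripe_count = demo_swab16(lum->stripe_count);
	lum->stripe_offset = demo_swab16(lum->stripe_offset);
}
```

Swapping field by field, rather than reversing the whole buffer, keeps each field at its original offset. Applying the function twice restores the original values.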
Test-Parameters: clientarch=ppc64 \
envdefinitions=ONLY=27 testlist=sanity
This patch is back-ported from the following one:
Lustre-commit:
9d17996766e0fa93b1029d2422d45d087edde389
Lustre-change: https://review.whamcloud.com/35291
Change-Id: I46bab0788300cd79c4e66e1a4990c3e1f7192391
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35633
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andriy Skulysh [Wed, 27 Feb 2019 17:37:24 +0000 (19:37 +0200)]
LU-12095 ptlrpc: ocd_connect_flags are wrong during reconnect
Import connect flags are reset to original ones during
reconnect, so a request can be created with unsupported
features.
Use separate obd_connect_data to send connect request.
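One way to picture the idea is sketched below: the flags sent in a (re)connect request are kept apart from the flags already negotiated with the server, so preparing a reconnect does not disturb the latter. The structures and demo_* functions are simplified assumptions for illustration, not the actual ptlrpc code. Only the ocd_connect_flags field name is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins (assumed); the real structures are far larger. */
struct demo_connect_data {
	uint64_t ocd_connect_flags;
};

struct demo_import {
	struct demo_connect_data requested;  /* flags the client originally asked for */
	struct demo_connect_data negotiated; /* flags the server agreed to */
};

/* Build the connect request from a separate copy; negotiated flags stay untouched. */
static struct demo_connect_data demo_prepare_reconnect(const struct demo_import *imp)
{
	assert(imp != NULL);
	return imp->requested;
}

/* Record the server's reply: only mutually supported flags remain in effect. */
static void demo_finish_connect(struct demo_import *imp, uint64_t server_flags)
{
	imp->negotiated.ocd_connect_flags =
		imp->requested.ocd_connect_flags & server_flags;
}

/* Requests created during a reconnect consult only the negotiated flags. */
static int demo_feature_usable(const struct demo_import *imp, uint64_t feature)
{
	return (imp->negotiated.ocd_connect_flags & feature) != 0;
}
```

In this sketch a request built while a reconnect is in flight still sees the negotiated set, so it cannot pick up a feature the server never accepted.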
Lustre-change: https://review.whamcloud.com/34480
Lustre-commit:
1224084c6300d5b15ccb703dfe18209a0f1f12ab
Change-Id: I4cfc48bf7ef66c4f3832613e179030b0eb1d6fdf
Cray-bug-id: LUS-6397
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35635
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Minh Diep [Tue, 23 Jul 2019 00:07:02 +0000 (17:07 -0700)]
LU-12575 build: add ibutils2 for MOFED build
MOFED 4.6 includes ibutils2 instead of ibutils.
Remove the OFED RHEL5 patch, which is no longer needed.
Lustre-change: https://review.whamcloud.com/35590
Lustre-commit:
26444d84c74e693e47c9785423e63402d32acb4f
Change-Id: I46c51eb8a194ea86bd8c951944e5c1427d0f37d0
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35631
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>