Whamcloud - gitweb
fs/lustre-release.git
4 years ago LU-13005 lnet: discard LNetEQGet and LNetEQWait 40/36840/4
Mr NeilBrown [Tue, 7 Jan 2020 18:14:51 +0000 (13:14 -0500)]
LU-13005 lnet: discard LNetEQGet and LNetEQWait

These interfaces are never used and are not particularly useful,
so discard them.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iaf2bc9ec2638820c3e4334e40cf2cf6993237f7d
Reviewed-on: https://review.whamcloud.com/36840
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13004 ptlrpc: Allow BULK_BUF_KIOV to accept a kvec 24/36824/7
Mr NeilBrown [Sun, 5 Jan 2020 16:11:30 +0000 (11:11 -0500)]
LU-13004 ptlrpc: Allow BULK_BUF_KIOV to accept a kvec

Bulk descriptors of type PTLRPC_BULK_BUF_KIOV consist
of a list of page+offset+len entries.
If the calling code actually has a virtual-address+len, it
cannot currently use BULK_BUF_KIOV and must use BULK_BUF_KVEC.

However it is quite easy to convert virtual-address+len
to a list of page+offset+len.

So we can add a ->add_iov_frag interface for KIOV descriptors, and
then we will be able to use KIOV descriptors for everything.  The
caller must allocate a large enough descriptor, taking
into account the size of each expected kvec.
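
The conversion described above can be sketched in user space as follows.
This is only an illustration, not the code from the patch: the names
sketch_frag and sketch_split and the 4096-byte page size are assumptions
made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for the example */

/* Hypothetical fragment: a page number plus offset and length within it. */
struct sketch_frag {
	uintptr_t page;   /* page number: address / page size */
	unsigned int off; /* offset of the fragment inside the page */
	unsigned int len; /* bytes covered in this page */
};

/*
 * Split [addr, addr + len) into per-page fragments.
 * Returns the number of fragments written, or -1 if 'max' is too small,
 * which mirrors the need to size the descriptor for each expected kvec.
 */
int sketch_split(uintptr_t addr, size_t len, struct sketch_frag *frags,
		 int max)
{
	int n = 0;

	while (len > 0) {
		unsigned int off = (unsigned int)(addr % SKETCH_PAGE_SIZE);
		size_t chunk = SKETCH_PAGE_SIZE - off;

		if (chunk > len)
			chunk = len;
		if (n == max)
			return -1;
		frags[n].page = addr / SKETCH_PAGE_SIZE;
		frags[n].off = off;
		frags[n].len = (unsigned int)chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

A buffer of 5000 bytes starting 100 bytes into a page needs two fragments
here, so a caller would have to reserve at least two slots for it.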

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: If8bc5dc9f6e89a196bd72d3ac9b88c4ea5da83d1
Reviewed-on: https://review.whamcloud.com/36824
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12865 tests: fix sanity 160f to be more robust 68/36468/7
Andreas Dilger [Thu, 17 Oct 2019 07:19:26 +0000 (16:19 +0900)]
LU-12865 tests: fix sanity 160f to be more robust

The sanity test_160f test was failing intermittently because the first
Changelog user ("cl6") was being unregistered in some cases when it
set changelog_max_idle_time=10, but the test slept for 9s and then did
some operations that could be slow.  In rare cases the test runs too
long and the MDS evicts the "good" user along with the bad user:

   MDD0000: Force deregister of ChangeLog user cl7 idle more than 35s
   MDD0000: Force deregister of ChangeLog user cl6 idle more than 11s

Change the test sleep interval to be half of the max_idle limit so
that there is no risk of the "good" Changelog user being evicted.

Add some logging to the test so that it is easier to correlate test
script actions with events in the MDS debug log.

Fixes: 31fef6845e8b ("LU-10680 mdd: create gc thread when no current transaction")
Test-Parameters: trivial envdefinitions=ONLY=160 testlist=sanity,sanity
Test-Parameters: envdefinitions=ONLY=160 mdscount=2 testlist=sanity,sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0e4c9c271d98a2716f848e75676780b0383ebbe5
Reviewed-on: https://review.whamcloud.com/36468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-8130 lu_object: factor out extra per-bucket data 16/36216/8
NeilBrown [Thu, 12 Dec 2019 23:51:01 +0000 (18:51 -0500)]
LU-8130 lu_object: factor out extra per-bucket data

The hash tables managed by lu_object store some extra
information in each bucket in the hash table.  This prevents the use
of resizeable hash tables, so lu_site_init() goes to some trouble
to try to guess a good hash size.

There is no real need for the extra data to be closely associated with
hash buckets.  There is a small advantage as both the hash bucket and
the extra information can then be protected by the same lock, but as
these locks have low contention, that should rarely be noticed.

The extra data, such as an lru list and a wait_queue head, is updated
frequently and accessed rarely.  There could just be a single copy of this
data for the whole array, but on a many-cpu machine, that could become
a contention bottleneck.  So it makes sense to keep multiple shards and
combine them only when needed.  It does not make sense to have many
more copies than there are CPUs.

This patch takes the extra data out of the hash table buckets and
creates a separate array, which never has more entries than twice the
number of possible cpus.  As this extra data contains a
wait_queue_head, which contains a spinlock, that lock is used to
protect the other data (counter and lru list).

The code currently uses a very simple hash to choose a
hash-table bucket:

 (fid_seq(fid) + fid_oid(fid)) & (CFS_HASH_NBKT(hs) - 1)

There is no documented reason for this and I cannot see any value in
not using a general hash function. We can use hash_32() and hash_64()
on the fid value with a random seed created for each lu_site. The
hash_*() functions were picked over the jhash() functions since
they perform much better.
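
To make the difference concrete, here is a small user-space sketch of the
old bucket choice next to a seeded multiplicative hash.  It is not the
patch itself: sketch_hash_64() imitates the multiply-and-shift form of the
kernel's hash_64(), and the way the seed is mixed into the fid in
sketch_new_bucket() is an assumption made for illustration.

```c
#include <stdint.h>

/* Multiplicative constant of the kind used by the kernel's hash_64(). */
#define SKETCH_GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Old scheme: add the two fid fields and mask to the bucket count. */
unsigned int sketch_old_bucket(uint64_t seq, uint32_t oid, unsigned int nbkt)
{
	/* nbkt must be a power of two for the mask to work */
	return (unsigned int)((seq + oid) & (nbkt - 1));
}

/* Multiply, then keep the top 'bits' bits (1 <= bits <= 32). */
unsigned int sketch_hash_64(uint64_t val, unsigned int bits)
{
	return (unsigned int)((val * SKETCH_GOLDEN_RATIO_64) >> (64 - bits));
}

/* New scheme (assumed mixing): fold a per-site seed into the fid first. */
unsigned int sketch_new_bucket(uint64_t seq, uint32_t oid, uint32_t seed,
			       unsigned int bits)
{
	return sketch_hash_64(seq ^ ((uint64_t)oid << 32) ^ seed, bits);
}
```

With the old scheme, object ids that differ by a multiple of the bucket
count always land in the same bucket for a given sequence; the
multiplicative hash does not have that fixed pattern.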

The lock ordering requires that a hash-table lock cannot be taken
while an extra-data lock is held.  This means that in
lu_site_purge_objects() we must first remove objects from the lru
(with the extra information locked) and then remove each one from the
hash table.  To ensure the object is not found between these two
steps, the LU_OBJECT_HEARD_BANSHEE flag is set.

As the extra info is now separate from the hash buckets, we cannot
report statistics from both at the same time.  I think the lru
statistics are probably more useful than the hash-table statistics, so
I have preserved the former and discarded the latter.  When the
hashtable becomes resizeable, those statistics will be irrelevant.

As the lru and the hash table are now managed by different locks
we need to be careful to prevent htable_lookup() finding an
object that lu_site_purge_objects() is purging.
To help with this we introduce a new lu_object flag to say
that an object is being purged.  Once set, the object will
be quickly removed from the hash table, and is already
removed from the lru.

Change-Id: I2a7402a348377d3b17f76e8617216e5b7ff9b99a
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12756 lnet: Refactor lnet_compare_routes 21/36621/2
Chris Horn [Thu, 31 Oct 2019 02:26:14 +0000 (21:26 -0500)]
LU-12756 lnet: Refactor lnet_compare_routes

Restrict lnet_compare_routes() to only comparing the lnet_route
objects passed as arguments. This saves us from doing unnecessary
calls to lnet_find_best_lpni_on_net().

Rename lnet_compare_peers to lnet_compare_gw_lpnis to better
reflect what is done by this routine.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2d7b5dcc2aacb371b21908ceebf2dd6a349fa74c
Reviewed-on: https://review.whamcloud.com/36621
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12756 lnet: Remove unused vars in lnet_find_route_locked 20/36620/2
Chris Horn [Sun, 27 Oct 2019 19:35:05 +0000 (14:35 -0500)]
LU-12756 lnet: Remove unused vars in lnet_find_route_locked

The lp and lp_best variables are not needed in
lnet_find_route_locked().

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I61a7097ab66703a1af1346c7301b9efc7e4392c9
Reviewed-on: https://review.whamcloud.com/36620
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12756 lnet: Avoid extra lnet_remotenet lookup 36/36536/4
Chris Horn [Tue, 22 Oct 2019 00:46:14 +0000 (19:46 -0500)]
LU-12756 lnet: Avoid extra lnet_remotenet lookup

We can keep track of the lnet_remotenet object associated with the
"best" lnet_peer_net, and pass that lnet_remotenet directly to
lnet_find_route_locked().

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib9808ca885c698ba6c73c5243fbce8b3f499b790
Reviewed-on: https://review.whamcloud.com/36536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13115 mdt: handle mdt_pack_sectx_in_reply() errors 48/37148/4
Mikhail Pershin [Tue, 7 Jan 2020 09:49:20 +0000 (12:49 +0300)]
LU-13115 mdt: handle mdt_pack_sectx_in_reply() errors

mdt_pack_secctx_in_reply() contains a mo_xattr_get() call whose
-ENOENT error should be checked, exiting via the error path
if needed.

In a DNE environment an lu_object may lose its LOHA_EXISTS flag
during osp_xattr_get(), and that should be handled so that we don't
proceed with code paths for existent objects.

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=185a testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Test-Parameters: clientselinux mdtcount=4 testgroup=review-dne-selinux
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I55ad666f58dd3fae3ed097018aa23ed94818d246
Reviewed-on: https://review.whamcloud.com/37148
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 llite: fix possible race with module unload. 20/37020/4
Mr NeilBrown [Fri, 3 Jan 2020 00:50:55 +0000 (19:50 -0500)]
LU-9679 llite: fix possible race with module unload.

lustre_fill_super() calls client_fill_super() without holding a
reference to the module containing client_fill_super.  If that
module is unloaded at a bad time, this can crash.

To be able to get a reference to the module using
try_module_get(), we need a pointer to the module.

So replace
  lustre_register_client_fill_super() and
  lustre_register_kill_super_cb()
with a single
  lustre_register_super_ops()
which is also passed a module pointer.

Then use a spinlock to ensure the module pointer isn't removed
while try_module_get() is running, and use try_module_get() to
ensure we have a reference before calling client_fill_super().

Now that we take the reference to the module before calling
client_fill_super(), we don't need to take one inside
client_fill_super().

Linux-commit: d487fe31f49e78f3cdd826923bf0c340a839ffd8

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9474622f2a253d9882eae3f0578c50782dd11ad4
Reviewed-on: https://review.whamcloud.com/37020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago New tag 2.13.51 2.13.51 v2_13_51
Oleg Drokin [Sat, 11 Jan 2020 03:21:29 +0000 (22:21 -0500)]
New tag 2.13.51

Change-Id: I2ce973d9b599ed426b56d7892176205ba6822910

4 years ago LU-12923 lnet: Replace CLASSERT() with BUILD_BUG_ON() 13/37113/3
Arshad Hussain [Sun, 29 Dec 2019 12:46:19 +0000 (18:16 +0530)]
LU-12923 lnet: Replace CLASSERT() with BUILD_BUG_ON()

This patch replaces CLASSERT() with the kernel-defined
BUILD_BUG_ON().

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I94292ca4729c19e0651fad285943ae02584afc03
Reviewed-on: https://review.whamcloud.com/37113
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12923 lustre: Replace CLASSERT() with BUILD_BUG_ON() 11/37111/5
Arshad Hussain [Sun, 29 Dec 2019 08:34:31 +0000 (14:04 +0530)]
LU-12923 lustre: Replace CLASSERT() with BUILD_BUG_ON()

This patch replaces the remaining CLASSERT() uses with the kernel-defined
BUILD_BUG_ON().

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie23846f8d67cac1872bda9c7e20fe9bc888bf365
Reviewed-on: https://review.whamcloud.com/37111
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years ago LU-13090 utils: fix lfs_migrate -p for file with pool 67/37067/6
Andreas Dilger [Thu, 19 Dec 2019 11:51:41 +0000 (04:51 -0700)]
LU-13090 utils: fix lfs_migrate -p for file with pool

If "lfs_migrate -p <pool>" is run to migrate a file with an existing
pool, the given pool is overridden by the existing pool from the file
during migration.  Fix this to use the OST pool requested by the user.

Don't print a warning about deprecated -n option if --dry-run is used.

If a pool is specified, use it with "lfs df" to find OST free space.

Change temp filename to work better with new DNE "crush" hash.

Don't return an error if falling back to rsync and no links are found.

Add test for "lfs_migrate -p" and update man page and usage to match.
Clean up debug-level helpers in test-framework.sh.

Test-Parameters: trivial testlist=ost-pools
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ief69a620fc969aeff24ec0633a3314c3b83ebbe5
Reviewed-on: https://review.whamcloud.com/37067
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13088 ldlm: Fix sleeping function called in atomic 63/37063/2
Mr NeilBrown [Thu, 19 Dec 2019 05:55:35 +0000 (16:55 +1100)]
LU-13088 ldlm: Fix sleeping function called in atomic

target_recovery_overseer() can sleep while holding a spinlock, which
triggers a BUG warning.

It is easily fixed by dropping the spinlock before waiting.  In the
case where the task waits, no useful information that could be
protected by the spinlock is held, so nothing can be lost by dropping
it.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8bb3d02523b5dcfadac19f01ccb736d7b7f28239
Reviewed-on: https://review.whamcloud.com/37063
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13059 kernel: kernel update RHEL7.7 [3.10.0-1062.9.1.el7] 60/36960/3
Jian Yu [Mon, 9 Dec 2019 09:38:44 +0000 (01:38 -0800)]
LU-13059 kernel: kernel update RHEL7.7 [3.10.0-1062.9.1.el7]

Update RHEL7.7 kernel to 3.10.0-1062.9.1.el7.

Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7

Change-Id: I11fc7a2c382a5c234698bfb30a38a08ed29fef03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-930 doc: fix formatting errors in lfs_migrate.1 59/36959/2
Andreas Dilger [Mon, 9 Dec 2019 09:04:49 +0000 (02:04 -0700)]
LU-930 doc: fix formatting errors in lfs_migrate.1

Add missing .TP sections for the command-line options.
Remove duplicate EXAMPLES section and '--yes' from bad merge.

Test-Parameters: trivial
Fixes: 99d7a8ed43b ("LU-8207 scripts: add auto-stripe option to lfs_migrate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5de290e00c5fd718e53ac0fc801d44e1cf3ebbe5
Reviewed-on: https://review.whamcloud.com/36959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12637 kernel: new kernel [RHEL 8.1 4.18.0-147.3.1.el8_1] 46/36946/7
Jian Yu [Fri, 3 Jan 2020 07:28:20 +0000 (23:28 -0800)]
LU-12637 kernel: new kernel [RHEL 8.1 4.18.0-147.3.1.el8_1]

This patch makes changes to support new RHEL 8.1 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.1 \
envdefinitions=SANITY_EXCEPT="411" \
testlist=sanity

Change-Id: Ifcc0a15c3ad9afa99b670641f91b23c1a5c0668e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12678 lnet: use lnet_accept_magic, not le32_to_cpu. 57/36857/3
Mr NeilBrown [Wed, 6 Nov 2019 05:59:45 +0000 (16:59 +1100)]
LU-12678 lnet: use lnet_accept_magic, not le32_to_cpu.

This le32_to_cpu() looks wrong, as the argument is a CPU value, not
le32, and the value is being compared to something that might be
le32.  Previous code used lnet_accept_magic() for tests on 'magic',
so it seems to make sense to use lnet_accept_magic() here too.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I3f04bb087d4ae3d6785e77072b51132f9440bd32
Reviewed-on: https://review.whamcloud.com/36857
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12678 lnet: lnet_startup_lndnet: avoid use-after-free 55/36855/2
Mr NeilBrown [Wed, 6 Nov 2019 05:42:34 +0000 (16:42 +1100)]
LU-12678 lnet: lnet_startup_lndnet: avoid use-after-free

If lnet_startup_lndni() fails it will free 'ni' (via lnet_ni_free()).
So we mustn't dereference it in the LASSERT() in that case.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I01e35013e028a8f95f169e25aeb0c344b2310380
Reviewed-on: https://review.whamcloud.com/36855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12678 lnet: change list_for_each in ksocknal_debug_peerhash 36/36836/3
Mr NeilBrown [Mon, 18 Nov 2019 01:24:09 +0000 (12:24 +1100)]
LU-12678 lnet: change list_for_each in ksocknal_debug_peerhash

This list_for_each() loop searches for a particular entry,
then acts on it.  It currently acts after the loop by testing
if the variable is NULL.  When we convert to list_for_each_entry()
it won't be NULL.

Change the code so the acting happens inside the loop.
 list_for_each_entry() {
     if (this isn't it)
         continue;
     act on entry;
     goto done; // break out of 2 loops
 }

Note that indentation is deliberately left unchanged,
as the next patch will change the 2 loops to a single loop,
after which the current indents will be correct.
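
The reason the loop variable is not NULL after list_for_each_entry() can be
seen in a small user-space sketch.  The list type and the
sketch_for_each_entry() macro below are simplified stand-ins written for
this example (the real kernel macro infers the type itself), and
sketch_peer is a hypothetical structure, not the ksocklnd one.

```c
#include <stddef.h>

struct sketch_list { struct sketch_list *next, *prev; };

/* Recover the containing structure from a pointer to its list member. */
#define sketch_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified iteration macro; takes the entry type explicitly. */
#define sketch_for_each_entry(pos, head, type, member)              \
	for ((pos) = sketch_entry((head)->next, type, member);       \
	     &(pos)->member != (head);                                \
	     (pos) = sketch_entry((pos)->member.next, type, member))

struct sketch_peer {
	int id;
	int hits;
	struct sketch_list link;
};

void sketch_list_add_tail(struct sketch_list *item, struct sketch_list *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/*
 * Act on the matching entry inside the loop.  If nothing matches, 'peer'
 * ends up pointing at a bogus container of the list head, not at NULL,
 * so it must not be tested or used after the loop.
 */
int sketch_find_and_act(struct sketch_list *head, int id)
{
	struct sketch_peer *peer;

	sketch_for_each_entry(peer, head, struct sketch_peer, link) {
		if (peer->id != id)
			continue;
		peer->hits++; /* act on entry */
		return 1;
	}
	return 0;
}
```

Returning from inside the loop plays the role of the "goto done" in the
outline above.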

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idea32bf2ab4037650d6698d4f82f6b6764b4d1b2
Reviewed-on: https://review.whamcloud.com/36836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12678 lnet: discard struct ksock_peer 35/36835/4
Mr NeilBrown [Tue, 31 Dec 2019 18:10:47 +0000 (13:10 -0500)]
LU-12678 lnet: discard struct ksock_peer

struct ksock_peer is declared in a forward-ref, but
never defined or used.  Let's remove it, and change
some spaces to TABs while we are there.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I8a86a77a5cad606a374e60a5b8920be28308587d
Reviewed-on: https://review.whamcloud.com/36835
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12678 lnet: prepare to make lnet_lnd const. 30/36830/5
Mr NeilBrown [Mon, 30 Dec 2019 15:54:09 +0000 (10:54 -0500)]
LU-12678 lnet: prepare to make lnet_lnd const.

Preferred practice is for structs containing function
pointers to be 'const'.  Such structs are generally tempting
attack vectors, and making them const allows linux to place
them in read-only memory, thus reducing the attack surface.

'struct lnet_lnd' is mostly function pointers, but contains
one writable field - a list_head.

Rather than keeping registered lnds in a linked-list, we can place
them in an array indexed by type - type numbers are at most 15 so
this is not a burden.

With these changes, no part of an lnet_lnd is ever modified.
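
A minimal user-space sketch of the idea, under the assumption that a
registry only needs lookup by type: the sketch_lnd structure and the three
functions are hypothetical names for this example and do not reproduce the
real lnet_lnd layout or registration API.

```c
#include <stddef.h>

#define SKETCH_LND_MAX 15 /* highest type number, as stated above */

/* Hypothetical ops table: a type and function pointers, no list_head. */
struct sketch_lnd {
	unsigned int type;
	int (*startup)(void);
};

/* Only this array of pointers is writable; the structs can be const. */
static const struct sketch_lnd *sketch_lnds[SKETCH_LND_MAX + 1];

int sketch_register_lnd(const struct sketch_lnd *lnd)
{
	if (lnd->type > SKETCH_LND_MAX || sketch_lnds[lnd->type] != NULL)
		return -1; /* out of range or already registered */
	sketch_lnds[lnd->type] = lnd;
	return 0;
}

const struct sketch_lnd *sketch_find_lnd(unsigned int type)
{
	return type <= SKETCH_LND_MAX ? sketch_lnds[type] : NULL;
}

void sketch_unregister_lnd(const struct sketch_lnd *lnd)
{
	if (lnd->type <= SKETCH_LND_MAX && sketch_lnds[lnd->type] == lnd)
		sketch_lnds[lnd->type] = NULL;
}
```

Because registration no longer writes into the structure, a driver could
declare its table as "static const struct sketch_lnd" and pass its address.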

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I08c7df551109e05ca4a3cef866e8df737d1a1ad4
Reviewed-on: https://review.whamcloud.com/36830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-8130 ldlm: simplify ldlm_ns_hash_defs[] 20/36220/3
NeilBrown [Fri, 20 Dec 2019 14:55:07 +0000 (09:55 -0500)]
LU-8130 ldlm: simplify ldlm_ns_hash_defs[]

As the ldlm_ns_types are dense, we can use the type as
the index to the array, rather than searching through
the array for a match.
We can also discard nsd_hops as all hash tables now
use the same hops.
This makes the table smaller and the code simpler.

Change-Id: I2aebb9d533d676bed51a7422801545be4fbb7e1e
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
4 years ago LU-12542 handle: rename ops to owner 98/35798/10
NeilBrown [Fri, 20 Dec 2019 14:37:48 +0000 (09:37 -0500)]
LU-12542 handle: rename ops to owner

Now that portals_handle_ops contains only a char*,
it is functioning primarily to identify the owner of each handle.
So change the name to h_owner, and the type to const char*.

Note: this h_owner is now quite different from the similar h_owner
in the server code.  When server code is merged the
"med" pointer should be stored in the "mfd" and validated separately.

Change-Id: Ie2e9134ea22c4929683c84bf45c41b96b348d0a2
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35798
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9091 sysfs: use string helper like functions for sysfs 58/35658/22
James Simmons [Fri, 3 Jan 2020 02:47:07 +0000 (21:47 -0500)]
LU-9091 sysfs: use string helper like functions for sysfs

For a very long time the Linux kernel has supported the function
memparse() that allowed the passing in of memory sizes with the
suffix set of K, M, G, T, P, E. Lustre adopted this approach
with its proc / sysfs implementation. The difference being that
lustre expanded this functionality to allow sizes with a
fractional component, such as 1.5G for example. The code used to
parse for the numerical value is heavily tied into the debugfs
seq_file handling and stomps on the passed in buffer which you
can't do with sysfs files.

Similar functionality to what Lustre does today exists in newer
Linux kernels in the form of string helpers. Currently the
string helpers only convert a numerical value to human-readable
format. A new function, string_to_size(), was created that takes
a string and turns it into a numerical value. This enables the
use of string helper suffixes, e.g. MiB, kB etc., with the lustre
tunables, and we can now support base-10 units, e.g. MB, kB, as
well. String helper suffixes are already used for debugfs files,
so I expect this to be adopted over time, and the use of
string_to_size() should be encouraged for newer lustre sysfs files.

At the same time we want to preserve the original behavior of
using the suffix set of K, M, G, T, P, E. To do this we create
the function sysfs_memparse() that supports the new string helper
suffixes as well as the older set of suffixes. This new code is
also much simpler than the current code.
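
As a rough illustration of accepting both suffix families, here is a
user-space sketch.  It is not the sysfs_memparse() from this patch: the
name sketch_memparse, the exact suffix rules (a bare letter or "iB" means
powers of 1024, a letter followed by "B" means powers of 1000) and the
lack of overflow checking are all assumptions of the example.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse "<digits>[.<digits>][suffix]" into bytes without modifying 's'.
 * Returns 0 on success, -1 on malformed input.  Overflow is not checked.
 */
int sketch_memparse(const char *s, uint64_t *out)
{
	static const char units[] = "KMGTPE";
	uint64_t whole = 0, frac = 0, frac_div = 1, mult = 1, base = 1024;
	const char *u;
	int i;

	if (!isdigit((unsigned char)*s))
		return -1;
	while (isdigit((unsigned char)*s))
		whole = whole * 10 + (uint64_t)(*s++ - '0');
	if (*s == '.') {
		s++;
		/* keep at most six fractional digits */
		while (isdigit((unsigned char)*s)) {
			if (frac_div < 1000000) {
				frac = frac * 10 + (uint64_t)(*s - '0');
				frac_div *= 10;
			}
			s++;
		}
	}
	if (*s != '\0') {
		u = strchr(units, toupper((unsigned char)*s));
		if (u == NULL)
			return -1;
		s++;
		if (strcmp(s, "B") == 0)
			base = 1000; /* kB, MB, ...: powers of 1000 */
		else if (*s != '\0' && strcmp(s, "iB") != 0)
			return -1;   /* only "", "B" or "iB" may follow */
		for (i = 0; i <= (int)(u - units); i++)
			mult *= base;
	}
	/* frac * mult / frac_div, split up to avoid an overflowing product */
	*out = whole * mult + mult / frac_div * frac +
	       mult % frac_div * frac / frac_div;
	return 0;
}
```

Under these assumed rules "1.5G" yields 1610612736, "10MiB" yields
10485760 and "2kB" yields 2000.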

Change-Id: Ia437db44f2a987aa11ab4ff3e9df23e9aeba04d7
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/35658
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12477 libcfs: Remove obsolete config checks 42/35342/11
Patrick Farrell [Mon, 30 Dec 2019 13:24:49 +0000 (08:24 -0500)]
LU-12477 libcfs: Remove obsolete config checks

Remove a few config checks for kernel versions we no longer
support. Only 3.10+ kernels are now supported.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4f4177c512a37fb7a78bab69aa89aa7199ab30b4
Reviewed-on: https://review.whamcloud.com/35342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12756 lnet: Avoid comparing route to itself 35/36535/4
Chris Horn [Tue, 22 Oct 2019 00:35:42 +0000 (19:35 -0500)]
LU-12756 lnet: Avoid comparing route to itself

The first iteration of the route selection loop compares the first
route in the list with itself.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1a51b04b248dbaa9a47a7a69e2995c21e515fb2b
Reviewed-on: https://review.whamcloud.com/36535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12756 lnet: Refactor lnet_find_best_lpni_on_net 34/36534/4
Chris Horn [Mon, 21 Oct 2019 22:15:27 +0000 (17:15 -0500)]
LU-12756 lnet: Refactor lnet_find_best_lpni_on_net

Replace lnet_send_data argument.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ic346eaf6870f2a7c68c7f4c45d424f4f924370d9
Reviewed-on: https://review.whamcloud.com/36534
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13069 obdclass: don't skip records for wrapped catalog 96/36996/2
Alexander Boyko [Thu, 12 Dec 2019 07:59:41 +0000 (02:59 -0500)]
LU-13069 obdclass: don't skip records for wrapped catalog

osp_sync_thread() uses opd_sync_last_catalog_idx as the start point of
catalog processing. It is also used in llog_cat_process_cb() to skip
records from processing. When the catalog is wrapped, processing starts
from the second part of the catalog and then continues with the first
part. So the first part would be skipped in llog_cat_process_cb() based
on lpd_startcat.

osp_sync_thread() restarts the processing loop with
opd_sync_last_catalog_idx. For a wrapped catalog it increases the last
index, and llog_process_thread() increases it once more. This leads
to skipped records in the catalog, which would not be processed.
The patch fixes these issues.
It also adds sanity tests 135 and 136 as regression tests.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-8053,LUS-8236
Change-Id: Ic75af1bf4468b9ef2de32cbf6d834b6a81376e88
Reviewed-on: https://review.whamcloud.com/36996
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 lnet: always check return of try_module_get() 54/36854/6
Mr NeilBrown [Wed, 6 Nov 2019 04:45:27 +0000 (15:45 +1100)]
LU-9679 lnet: always check return of try_module_get()

try_module_get() can fail, so the return value should be checked.
If we *know* that we already hold a reference, __module_get()
should be used instead.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id526f9ae3829a50fe7df7069230804322cd4558e
Reviewed-on: https://review.whamcloud.com/36854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 obdclass: don't manage module refs in open/close. 21/37021/3
Mr NeilBrown [Sun, 15 Dec 2019 21:29:30 +0000 (08:29 +1100)]
LU-9679 obdclass: don't manage module refs in open/close.

Core Linux code for managing char-devs ensures that the relevant
module is held active while a char-dev is open - see cdev_get()
and cdev_put().
So there is no need for lustre/obd_class to manage the module
ref count as well.

As this is all that obd_class_open and obd_class_close do, those
functions can be removed.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I84b0dc81c830cefc2383f184d12beeb2cfa22404
Reviewed-on: https://review.whamcloud.com/37021
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 osp: use wait_event_idle_timeout() 88/35988/10
Mr NeilBrown [Wed, 28 Aug 2019 23:28:20 +0000 (09:28 +1000)]
LU-10467 osp: use wait_event_idle_timeout()

osp has 4 LWI_TIMEOUT() calls that pass an on_timeout
function.
In each case, the on_timeout function returns 1, so this
is equivalent to using wait_event_idle_timeout(), and
calling the function if the timeout happened.

One of the two functions passed does nothing except return 1, so it
can be ignored.
The other function, used only once, contains a CDEBUG message,
so we now call that when wait_event_idle_timeout() returns 0.

Change-Id: Ic153266e412d684c4aa6c7204ff5755d991d83c6
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35988
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 ptlrpc: convert waiting in sptlrpc_req_refresh_ctx() 87/35987/10
Mr NeilBrown [Mon, 30 Dec 2019 15:01:24 +0000 (10:01 -0500)]
LU-10467 ptlrpc: convert waiting in sptlrpc_req_refresh_ctx()

The l_wait_event call in sptlrpc_req_refresh_ctx() is somewhat complex
as it is passed both an on_timeout and on_signal handler, and
on_timeout doesn't return a constant value.

The net effect is to wait for the timeout with signals blocked.  Then,
if the condition still isn't true, run the on_timeout handler and if
that returns zero, wait again - indefinitely this time - and allow
some signals.  If a signal was received, call the on_signal handler.

This is fairly straightforward to write out in C, as shown in the
patch.

Change-Id: I7f9cfb8a8ff234bed4045ab21b53d018337cd615
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35987
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-10467 lustre: use wait_event_idle_timeout() as appropriate. 77/35977/18
Mr NeilBrown [Fri, 3 Jan 2020 00:30:32 +0000 (19:30 -0500)]
LU-10467 lustre: use wait_event_idle_timeout() as appropriate.

If l_wait_event() is passed an lwi initialised with
one of
   LWI_TIMEOUT_INTR( time, NULL, NULL, NULL)
   LWI_TIMEOUT_INTR( time, NULL, LWI_ON_SIGNAL_NOOP, NULL)
   LWI_TIMEOUT( time, NULL, NULL)
where time != 0, then it behaves much like
wait_event_idle_timeout().
All signals are blocked, and it waits either for the
condition to be true, or for the timeout (in jiffies).

Note that LWI_ON_SIGNAL_NOOP has no effect here.

l_wait_event() returns 0 when the condition is true, or -ETIMEDOUT
when the timeout occurs.  wait_event_idle_timeout() instead returns a
positive number when the condition is true, and 0 when the timeout
occurs.  So in the cases where return value is used, handling needs to
be adjusted accordingly.
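
The return-value adjustment described above can be sketched in user-space C.
The helper name wait_ret_to_errno() is hypothetical and does not come from the
patch; it only illustrates, under that assumption, how a
wait_event_idle_timeout()-style result could be mapped back to the
l_wait_event() convention where a caller still expects it.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of the patch): convert a
 * wait_event_idle_timeout()-style return value into the older
 * l_wait_event() convention.
 *
 *   remaining > 0  : the condition became true   -> 0
 *   remaining == 0 : the timeout expired         -> -ETIMEDOUT
 */
int wait_ret_to_errno(long remaining)
{
	return remaining > 0 ? 0 : -ETIMEDOUT;
}
```

A caller that previously tested for -ETIMEDOUT can instead test the new
return value against zero directly, which is likely the simpler choice in
most call sites.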

Note that in some cases where cfs_fail_val gives the time to wait for,
the current code re-tests the wait time against zero as cfs_fail_val
can change asynchronously.  This is because l_wait_event() behaves
quite differently if the timeout is zero.

The new code doesn't need to do that as wait_event_idle_timeout()
treats 0 just as a very short wait, which is exactly the correct
behavior here.

This patch also removes a comment which is no longer meaningful
(CAN_MATCH) and corrects a debug message which reported the wait time
as "seconds" rather than the correct "jiffies".

This patch doesn't change the timed wait in cl_sync_io_wait().
That is a bit more complicated, so it is left to a separate patch.

Change-Id: I632afc290935e321926f45b144d5367799a01381
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35977
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13087 target: init lcd last transno from reply data 60/37060/2
Mikhail Pershin [Thu, 5 Dec 2019 21:23:01 +0000 (00:23 +0300)]
LU-13087 target: init lcd last transno from reply data

Init lcd_last_transno value from reply data to keep it
valid so tgt_release_reply_data() will keep a slot with
the highest transno and on-disk data is not lost.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id31b3b250616fb6afd3d145c31b12af30ac86be8
Reviewed-on: https://review.whamcloud.com/37060
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13061 osp: check catlog FID after reading in 98/36998/4
Hongchao Zhang [Thu, 19 Dec 2019 02:52:29 +0000 (21:52 -0500)]
LU-13061 osp: check catlog FID after reading in

In osp_sync_llog_init, the catlog FID read from "CATALOGS"
should be checked for sanity.

Change-Id: I4342b21b7d5c6d408a9ab52a1e30815ae1d5f563
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36998
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13077 pfl: cleanup xattr checking 10/37010/7
Sebastien Buisson [Fri, 13 Dec 2019 16:39:08 +0000 (01:39 +0900)]
LU-13077 pfl: cleanup xattr checking

Cleanup xattr checking in mdd and lod layers for PFL.

Reported-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2841b615ee304785fbf316b829d8280eefc3878a
Reviewed-on: https://review.whamcloud.com/37010
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13043 quota: remove annoying message in osd_declare_inode_qid() 06/36906/5
Wang Shilong [Tue, 3 Dec 2019 06:32:22 +0000 (14:32 +0800)]
LU-13043 quota: remove annoying message in osd_declare_inode_qid()

The admin shouldn't be getting console error messages when a user goes
over quota (this would be happening continuously at some sites).

In some call paths, the "*flags" parameter may be NULL, so don't try to
access it in that case.

As a general cleanup, move the QUOTA_FL_* flags over to a named enum
"enum osd_quota_local_flags" so that it is easier to see what this field
actually holds, rather than a totally generic "int *flags" argument that
has to be hunted through the code.

Fixes: d30f9e6b6c5d ("LU-11425 quota: support quota for DoM")
Change-Id: Id5686ecdb8a943e48a2888067e321f83b8569188
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36906
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
4 years agoLU-12781 ptlrpc: fix inline reply buffer grow 32/36732/4
Mikhail Pershin [Mon, 11 Nov 2019 21:24:39 +0000 (00:24 +0300)]
LU-12781 ptlrpc: fix inline reply buffer grow

In req_capsule_server_grow() the reply buffer can be grown
without re-allocation if it is already large enough. Don't do
that, though, if rs->rs_repbuf is a wrapper, e.g. with security
enabled. In that case re-allocation is still needed.

Re-enable test 272a in sanity.sh with SHARED_KEY

Test-Parameters: mdscount=2 mdtcount=4 envdefinitions=SHARED_KEY=true testlist=sanity,sanity-pfl
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I0632b9513f877bea989b7a61a729e2db488dcfcc
Reviewed-on: https://review.whamcloud.com/36732
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12036 ofd: add "no_precreate" mount option 16/36716/9
Andreas Dilger [Fri, 8 Nov 2019 09:04:52 +0000 (02:04 -0700)]
LU-12036 ofd: add "no_precreate" mount option

Add a mount option to disallow object creation on the OST.  That
allows an OST to be mounted by the administrator without it being
immediately available for use by clients/applications.  This may
be useful if the OST needs to be added to a specific pool first,
or if it is being debugged or similar.

Mount option can be disabled with the obdfilter.*.no_precreate
tunable parameter.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icdb64a4bdd5a66b0e9e6d483e3113b97d53ebbe5
Reviewed-on: https://review.whamcloud.com/36716
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12941 lnet: Add peer level aliveness information 78/36678/5
Chris Horn [Wed, 11 Sep 2019 20:42:55 +0000 (15:42 -0500)]
LU-12941 lnet: Add peer level aliveness information

Keep track of the aliveness of a peer so that we can optimize for
situations where an LNet router hasn't responded to a ping. In
this situation we consider all routes down, and we needn't spend time
inspecting each route, or inspecting all of the router's local and
remote interfaces in order to determine the router's aliveness.

Cray-bug-id: LUS-7860
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ie63c1ef40de3ad818639bae6b040923898fd5b46
Reviewed-on: https://review.whamcloud.com/36678
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12898 utils: %llu mismatch with type __u64 on ppcle64 58/36558/7
Olaf Faaland [Tue, 22 Oct 2019 16:44:51 +0000 (09:44 -0700)]
LU-12898 utils: %llu mismatch with type __u64 on ppcle64

Fix build errors like this one on ppcle64:

BUILDSTDERR: libmount_utils_zfs.c: In function 'zfs_mkfs_opts':
BUILDSTDERR: libmount_utils_zfs.c:573:5: error: format '%llu' expects
argument of type 'long long unsigned int', but argument 4 has type
'__u64' [-Werror=format=]
BUILDSTDERR:      mop->mo_device_kb * 1024);

__u64 was treated as an unsigned long long which breaks the build on
ppc64le, where they are not the same size.

In printf cases, cast to unsigned long long to match the printf format
so the format is compatible with the type and it is guaranteed
not to lose any data.

In the case of sscanf(), replace the call with strtoull() to eliminate
the issue.
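
The two fixes described above can be sketched in user-space C. This is a
minimal illustration, not the code from the patch: uint64_t stands in for
__u64, and the function names format_kb_as_bytes() and parse_u64() are
hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Print a 64-bit value with "%llu". The explicit cast makes the
 * argument type match the format on every ABI, whatever the
 * underlying type of the 64-bit typedef is.
 */
int format_kb_as_bytes(char *buf, size_t len, uint64_t kb)
{
	return snprintf(buf, len, "%llu", (unsigned long long)(kb * 1024));
}

/*
 * Parse a decimal string with strtoull() rather than sscanf("%llu").
 * Returns 0 on success, -1 if nothing was parsed, trailing characters
 * remain, or the value is out of range.
 */
int parse_u64(const char *str, uint64_t *out)
{
	char *end;
	unsigned long long val;

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno != 0 || end == str || *end != '\0')
		return -1;

	*out = (uint64_t)val;
	return 0;
}
```

strtoull() returns its result by value, so no pointer to a 64-bit typedef
is ever paired with a format string, which is why it sidesteps the
mismatch entirely.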

Test-Parameters: trivial
Change-Id: I02fd82e0be4d756881c15aa9faedb9b40961661a
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/36558
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
4 years agoLU-8130 ldlm: add a counter to the per-namespace data 19/36219/5
NeilBrown [Tue, 17 Sep 2019 19:34:39 +0000 (15:34 -0400)]
LU-8130 ldlm: add a counter to the per-namespace data

When we change the resource hash to rhashtable we won't have
a per-bucket counter.  We could use the nelems global counter,
but ldlm_resource goes to some trouble to avoid having any
table-wide atomics, and hopefully rhashtable will grow the
ability to disable the global counter in the near future.
Having a counter we control makes it easier to manage the
back-reference to the namespace when there is anything in the
hash table.

So add a counter to the ldlm_ns_bucket.

Change-Id: Ic79e96f95d5cacfb5e7bb02350f5f4fafb207b44
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36219
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12904 gss: struct cache_detail readers changed to writers 80/36580/4
Shaun Tancheff [Tue, 17 Dec 2019 04:29:15 +0000 (22:29 -0600)]
LU-12904 gss: struct cache_detail readers changed to writers

Linux 5.3 changed struct cache_detail readers to writers
SUNRPC: Track writers of the 'channel' file to improve ...

kernel-commit: 64a38e840ce5940253208eaba40265c73decc4ee

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I7750303937cd6fc560e458efa79f25e521fefec7
Reviewed-on: https://review.whamcloud.com/36580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Delete unused nid parsing code 10/35310/11
Chris Horn [Fri, 20 Sep 2019 15:08:03 +0000 (10:08 -0500)]
LU-12410 lnet: Delete unused nid parsing code

Delete the nid parsing code from liblnetconfig that is no longer used.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6fbb3450756c7976836c3b6731d3ecd9f93cbf8d
Reviewed-on: https://review.whamcloud.com/35310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 tests: Add gni tests to sanity-lnet 06/35506/11
Chris Horn [Sun, 24 Nov 2019 18:02:16 +0000 (12:02 -0600)]
LU-12410 tests: Add gni tests to sanity-lnet

Add test-cases to validate handling of gni nids to sanity-lnet.sh

Also add some additional tests to validate error handling.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I7947e237e0d3e12e2e30752bca384cef2b66072c
Reviewed-on: https://review.whamcloud.com/35506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Convert lnetctl route add and del 08/35308/13
Chris Horn [Sun, 24 Nov 2019 18:01:33 +0000 (12:01 -0600)]
LU-12410 lnet: Convert lnetctl route add and del

Convert the lnetctl route add and delete commands to utilize the new
capabilities provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ifcaf67575ed1de40c9a3c92f40ec6dca7fd08d9e
Reviewed-on: https://review.whamcloud.com/35308
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Convert yaml peer configuration 07/35307/13
Chris Horn [Mon, 24 Jun 2019 03:56:47 +0000 (22:56 -0500)]
LU-12410 lnet: Convert yaml peer configuration

Convert the yaml peer config handlers to utilize the new capabilities
provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I89a53ded636877661a3600822ca49030c8841540
Reviewed-on: https://review.whamcloud.com/35307
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Convert lnetctl peer add and del 05/35305/13
Chris Horn [Mon, 24 Jun 2019 00:27:42 +0000 (19:27 -0500)]
LU-12410 lnet: Convert lnetctl peer add and del

Convert the lnetctl peer add and del commands to utilize the new
capabilities provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I50693a2af6fef2e1ef3b34fd02c7423625cb7665
Reviewed-on: https://review.whamcloud.com/35305
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Implement method to tokenize nidstrings 05/35505/11
Chris Horn [Sun, 14 Jul 2019 18:57:56 +0000 (13:57 -0500)]
LU-12410 lnet: Implement method to tokenize nidstrings

The CLI for various lnetctl operations allows the user to specify
multiple, comma separated nidstrings. Implement a common method
for tokenizing nidstrings that can be leveraged by the operations
that require it.
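
A minimal sketch of such a tokenizer, in user-space C. The function name
ex_tokenize() is hypothetical, and the rule that commas inside square
brackets do not split an entry is an assumption made for this example; the
method added by the patch may behave differently.

```c
#include <stddef.h>

/*
 * Split a comma-separated list in place, writing a NUL over each
 * separating comma. Commas inside [...] are kept, on the assumption
 * that a bracketed expression belongs to a single entry. Empty
 * entries are skipped. Returns the number of tokens stored in out[]
 * (at most max).
 */
size_t ex_tokenize(char *str, char **out, size_t max)
{
	size_t n = 0;
	int depth = 0;
	char *start = str;
	char *p;

	for (p = str; ; p++) {
		if (*p == '[')
			depth++;
		else if (*p == ']' && depth > 0)
			depth--;

		if (*p == '\0' || (*p == ',' && depth == 0)) {
			int last = (*p == '\0');

			*p = '\0';
			if (p > start && n < max)
				out[n++] = start;
			if (last)
				break;
			start = p + 1;
		}
	}
	return n;
}
```

The input buffer is modified, so callers would pass a writable copy of the
command-line argument rather than a string literal.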

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2f8ab6d5d9e7c3d5bde3a11b85bdf38fbf6fdf29
Reviewed-on: https://review.whamcloud.com/35505
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13070 mdd: try old format for orphan names during recovery 49/37049/3
Artem Blagodarenko [Tue, 17 Dec 2019 09:12:36 +0000 (12:12 +0300)]
LU-13070 mdd: try old format for orphan names during recovery

Fix the mdd_orphan_destroy() loop caused by a compatibility issue on upgrade to
2.11 or later. The format for names of orphans in the PENDING directory
was changed in Lustre 2.11. The old format names are not recognized by
mdd_orphan_destroy() in Lustre 2.11, but compatibility code added to
handle this was incomplete, leading to an endless loop. There's a check
for the old format name, used in mdd_orphan_delete(), but that check
was not included in mdd_orphan_destroy().

This patch adds the compatibility check to mdd_orphan_destroy().

Fixes: a02fd4573fe ("LU-7787 mdd: clean up orphan object handling")
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Cray-bug-id: LUS-8270
Change-Id: I9f42188dcb00f9d536996c14771de7df02502b40
Reviewed-on: https://review.whamcloud.com/37049
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
4 years agoLU-13042 tests: give more time in sanity-selinux test_21b 05/36905/3
Sebastien Buisson [Tue, 3 Dec 2019 02:06:48 +0000 (11:06 +0900)]
LU-13042 tests: give more time in sanity-selinux test_21b

In sanity-selinux test_21b, set sepol refresh time to 1000 seconds
instead of 10. This gives plenty of time for file/dir access tests,
and also cache drop, to complete. Then reset send_sepol to a smaller,
already expired value, to force sepol refresh.

Test-Parameters: trivial
Test-Parameters: clientselinux mdtcount=4 testlist=sanity-selinux envdefinitions=ONLY="21b"
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I57f72faad4bd55736a3240cdefdac2e5814eba79
Reviewed-on: https://review.whamcloud.com/36905
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12787 tests: skip project quota if it is disabled 97/36997/2
Alexander Boyko [Thu, 19 Sep 2019 13:22:57 +0000 (09:22 -0400)]
LU-12787 tests: skip project quota if it is disabled

quota_scan touches project quota in case of errors or logs.
When project quota is not supported, this leads to error:
    Unexpected quotactl error: Operation not supported
    ...
    Some errors happened when getting quota info. Some devices
    may be not working or deactivated. The data in "[]" is inaccurate.

The fix adds a check before touching project quota.

Cray-bug-id: LUS-7811
Test-Parameters: testlist=sanity-quota
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ia733b666d6937ea9e8e99ef856d2ae1246dc44d1
Reviewed-on: https://review.whamcloud.com/36997
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11607 tests: remove duplicate code lnet-selftest 65/36965/3
James Nunez [Mon, 9 Dec 2019 18:37:14 +0000 (11:37 -0700)]
LU-11607 tests: remove duplicate code lnet-selftest

lnet-selftest.sh and test-framework.sh both have a function
called is_mounted() that checks whether the file system is
mounted.  Since both functions do and return the same
thing, let's remove the is_mounted() function from
lnet-selftest.

Test-Parameters: trivial
Test-Parameters: fstype=zfs testlist=lnet-selftest,lnet-selftest
Test-Parameters: fstype=ldiskfs testlist=lnet-selftest,lnet-selftest
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I05ce84002cfa8ac96ac4f1e8169fb2233b66f378
Reviewed-on: https://review.whamcloud.com/36965
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12923 libcfs: Use BUILD_BUG_ON() for hash.c 02/36902/6
Arshad Hussain [Sun, 1 Dec 2019 01:43:57 +0000 (07:13 +0530)]
LU-12923 libcfs: Use BUILD_BUG_ON() for hash.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file libcfs/libcfs/hash.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie5dc744fc10b6e5f303fca93d342629e99a2403d
Reviewed-on: https://review.whamcloud.com/36902
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12991 lnet: fix rspt counter 95/36895/3
Alexey Lyashkov [Fri, 29 Nov 2019 10:42:59 +0000 (13:42 +0300)]
LU-12991 lnet: fix rspt counter

rsp entries must be freed via the lnet_rspt_free() function to avoid a
counter leak. Handle NULL allocation properly.

Test-parameters: trivial

Cray-bug-id: LUS-8189
Change-Id: I7630d375387593e28bfbe2c4a3ea3712a239f64f
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/36895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12923 utils: Use BUILD_BUG_ON() for wirehdr.c & wirecheck.c 11/36711/4
Arshad Hussain [Tue, 29 Oct 2019 12:55:04 +0000 (18:25 +0530)]
LU-12923 utils: Use BUILD_BUG_ON() for wirehdr.c & wirecheck.c

This patch replaces all CLASSERT() with user defined
BUILD_BUG_ON() for file lustre/utils/wirehdr.c and
lustre/utils/wirecheck.c
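
For readers unfamiliar with compile-time assertions, here is a minimal
user-space sketch of the idea. The macro EX_BUILD_BUG_ON() and the structure
ex_wire_hdr are hypothetical; the definition actually used by the patch is
not shown in this log and may differ.

```c
#include <stdint.h>

/*
 * Sketch of a compile-time assertion in the style of BUILD_BUG_ON().
 * A true condition produces a negative array size, which the compiler
 * rejects; a false condition produces char[1] and generates no code.
 */
#define EX_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical structure whose size must not change. */
struct ex_wire_hdr {
	uint32_t magic;
	uint32_t length;
};

int ex_check_layout(void)
{
	/* Compiles only if the structure is exactly 8 bytes. */
	EX_BUILD_BUG_ON(sizeof(struct ex_wire_hdr) != 8);
	return 0;
}
```

Because the check happens during compilation, a layout change is caught at
build time rather than when the tool is run.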

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: If46603158a9ad311762fe51839000c39a0b15307
Reviewed-on: https://review.whamcloud.com/36711
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11656 llite: fetch default layout for a directory 09/36609/11
Jian Yu [Tue, 19 Nov 2019 22:19:24 +0000 (14:19 -0800)]
LU-11656 llite: fetch default layout for a directory

For a directory that does not have trusted.lov xattr, the current
"lfs getstripe" will only print the stripe_count, stripe_size,
and stripe_index that are fetched from the /sys/fs/lustre/lov values.
It doesn't show the actual default layout that will be used when
new files are created in that directory.

This patch fixes the above issue in ll_dir_getstripe_default() by
fetching the layout from root FID after ll_dir_get_default_layout()
returns -ENODATA from a directory that does not have trusted.lov xattr.

Change-Id: Icbf1f8f4fa5e5b8788217fcb0cfd24a3b80a27d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36609
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-930 doc: add man pages to make file 98/36598/7
James Nunez [Mon, 28 Oct 2019 18:54:21 +0000 (12:54 -0600)]
LU-930 doc: add man pages to make file

There are several existing man pages that are not installed on the
Lustre client or server because they are not included in the make
file.  Add the following pages to the Makefile.am file:

lctl-pool_add
lctl-pool_new
l_getsepol
lfs-getname
lfs-getsom
lfs-mirror-copy
lfs-mirror-read
lfs-mirror-write
lfs-rmfid
llapi_get_lum_dir
llapi_get_lum_dir_fd
llapi_get_lum_file
llapi_get_lum_file_fd
llapi_layout_extension_size_get
llapi_layout_extension_size_set
llapi_rmfid
llapi_search_mdt
llapi_search_ost
llapi_search_tgt

Fixes: ca34df3815f7 (LU-930 doc: man pages for lctl pool_new, pool_add)
Fixes: 2a4821b836c8 (LU-12159 utils: improve lfs getname functionality)
Fixes: 697e8fe6f325 (LU-11473 doc: add lfs-getsom man page)
Fixes: c6e7c0788d7c (LU-10258 lfs: lfs mirror copy command)
Fixes: 1fd63fcb045c (LU-12090 utils: lfs rmfid)
Fixes: e82adfcbd00f (LU-930 doc: man page for l_getsepol)
Fixes: 11aa7f8704c4 (LU-11367 som: integrate LSOM with lfs find)
Fixes: fed241911f61 (LU-10070 lod: SEL: Add flag & setstripe support)
Fixes: 096db80e0810 (LU-11264 llapi: clean up llapi_search_tgt() code)

Test-Parameters: trivial

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I72c34693a9cf03d2e241d60903020a72339c75b1
Reviewed-on: https://review.whamcloud.com/36598
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12904 utils: zfs properly detect spa_multihost 82/36582/6
Shaun Tancheff [Tue, 10 Dec 2019 09:20:26 +0000 (03:20 -0600)]
LU-12904 utils: zfs properly detect spa_multihost

spa_multihost is used in a user space tool and the
compile test for spa_multihost should reflect that.

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ie85fffb80e84a2b65547e3d48dc0cff31c3325b4
Reviewed-on: https://review.whamcloud.com/36582
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12904 build: account_page_dirtied is not exported 75/36575/4
Shaun Tancheff [Fri, 25 Oct 2019 20:11:37 +0000 (15:11 -0500)]
LU-12904 build: account_page_dirtied is not exported

Linux 5.2 does not export account_page_dirtied
mm: remove the account_page_dirtied export

Use symbol_get() to access account_page_dirtied for Lustre

kernel-commit: ac1c3e49a9a734150b33297eeca5b43d92fd5be8

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I9cd432556e183d06784537b000a4bda657116d88
Reviewed-on: https://review.whamcloud.com/36575
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12861 libcfs: provide an scnprintf and start using it 53/36453/8
Shaun Tancheff [Tue, 29 Oct 2019 22:14:18 +0000 (17:14 -0500)]
LU-12861 libcfs: provide an scnprintf and start using it

snprintf() returns the number of chars that would be needed to hold
the complete result, which may be larger than the buffer size.

scnprintf() differs in that its return value is the number of chars
actually written (not including the terminating null).

Correct the few patterns where the return from snprintf() is used and
expected not to exceed the passed buffer size.
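
The difference can be sketched in user-space C. The function
example_scnprintf() below is a hypothetical stand-in built on vsnprintf(); it
is not the implementation added to libcfs by this patch.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Sketch of scnprintf() semantics: return the number of characters
 * actually stored in buf, not counting the terminating NUL, so the
 * result never exceeds size - 1.
 */
int example_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0) {
		/* Formatting error: report nothing written. */
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)len >= size)
		return (int)(size - 1);	/* output was truncated */
	return len;
}
```

With an 8-byte buffer and a 10-character string, snprintf() would report 10
while this sketch reports 7, so code that advances a pointer by the return
value stays inside the buffer.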

Test-Parameters: trivial
Cray-bug-id: LUS-7999
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ie42458be16e8c0ba1cb6d688fd418683f18de21e
Reviewed-on: https://review.whamcloud.com/36453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12634 llite: Use __xa_set_mark if it is available 73/36373/3
Shaun Tancheff [Fri, 4 Oct 2019 17:57:16 +0000 (12:57 -0500)]
LU-12634 llite: Use __xa_set_mark if it is available

Linux v4.19-rc5-248-g9b89a0355144
xarray: Add XArray marks

Test for and use __xa_set_mark() for marking page cache pages.

Move kernel compat wrappers into inline functions.
Co-locate the configure test macros for 4.20 in kernel version order.

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I659a0e3af0a648d50205f44f2649ba8b982bfa42
Reviewed-on: https://review.whamcloud.com/36373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
4 years agoLU-12678 lnet: use list_move where appropriate. 39/36339/2
NeilBrown [Tue, 1 Oct 2019 15:59:59 +0000 (11:59 -0400)]
LU-12678 lnet: use list_move where appropriate.

There are several places in lustre where "list_del" (or occasionally
"list_del_init") is followed by "list_add" or "list_add_tail" which
moves the object to a different list.
These can be combined into "list_move" or "list_move_tail".
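
The equivalence can be sketched with a small user-space list. The ex_list
type and its helpers are hypothetical stand-ins for the kernel's
struct list_head API and are only meant to show that a move is a delete
followed by an add.

```c
/* Minimal user-space stand-in for the kernel's struct list_head. */
struct ex_list {
	struct ex_list *next, *prev;
};

void ex_list_init(struct ex_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert item immediately after head. */
void ex_list_add(struct ex_list *item, struct ex_list *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

/* Unlink item from whatever list it is on. */
void ex_list_del(struct ex_list *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
}

/* One call replaces the ex_list_del() + ex_list_add() pair. */
void ex_list_move(struct ex_list *item, struct ex_list *head)
{
	ex_list_del(item);
	ex_list_add(item, head);
}

int ex_list_empty(const struct ex_list *head)
{
	return head->next == head;
}
```

Using the combined call states the intent (the object changes lists) in one
place and removes the chance of the two halves drifting apart.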

Test-Parameters: trivial testlist=sanity-lnet

Change-Id: I481de128ea40928186f78a0a0cc26e89b43f1645
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12542 handle: discard OBD_FREE_RCU 97/35797/9
NeilBrown [Tue, 1 Oct 2019 00:58:55 +0000 (20:58 -0400)]
LU-12542 handle: discard OBD_FREE_RCU

OBD_FREE_RCU and the hop_free call-back together form an overly
complex mechanism equivalent to kfree_rcu() or call_rcu(...).
Discard them and use the simpler approach.

This removes the only use for the field h_size, so discard
that too.

Change-Id: I3b4135565dab6a9aa5034f42ae3f9b66851cae31
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35797
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-8066 obdclass: don't copy ops structures in to new type. 87/35687/8
NeilBrown [Fri, 11 Oct 2019 01:26:06 +0000 (21:26 -0400)]
LU-8066 obdclass: don't copy ops structures in to new type.

The obd_ops and md_ops structures passed to class_register_type() are
read-only, and have a lifetime that exceeds the lifetime of the
obd_type - they are static in a module which unregisters the type before
being unloaded.

So there is no need to copy the ops, just store a pointer.

Also mark all the structures as read-only to confirm they don't get
written. This is best-practice for structures of function pointers.
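
A minimal sketch of the pattern, in user-space C. The names ex_ops, ex_type
and ex_register_type() are hypothetical and much smaller than the real
obd_ops, obd_type and class_register_type().

```c
/* Hypothetical, cut-down ops table; the real one has many more methods. */
struct ex_ops {
	int (*setup)(int arg);
};

struct ex_type {
	const struct ex_ops *ops;	/* pointer only, no private copy */
};

int ex_setup(int arg)
{
	return arg + 1;
}

/* Static and const: it outlives any ex_type that points at it. */
static const struct ex_ops ex_module_ops = {
	.setup = ex_setup,
};

const struct ex_ops *ex_default_ops(void)
{
	return &ex_module_ops;
}

/* Registration stores the pointer instead of copying the table. */
void ex_register_type(struct ex_type *type, const struct ex_ops *ops)
{
	type->ops = ops;
}
```

Declaring the table const lets the compiler reject accidental writes, and
storing a pointer avoids an allocation and copy for each registered type.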

Linux-commit: 2233f57f1b95b9a85a3129ddcc2860ddbc4c2a94

Signed-off-by: NeilBrown <neilb@suse.com>
Change-Id: Id0be1477925e0c878e3edb6a9d892f3c89a8b19b
Reviewed-on: https://review.whamcloud.com/35687
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12514 obdclass: remove vfsmount option from client_fill_super 27/35427/4
NeilBrown [Thu, 12 Dec 2019 14:58:32 +0000 (09:58 -0500)]
LU-12514 obdclass: remove vfsmount option from client_fill_super

This arg is always NULL and is never used.
So discard it from this and related functions.

Linux-commit: 7dc2155195586ec75f53d6dcd381f935ccc35d02

Change-Id: I00b16115edbff0de7605768121981b928585552c
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35427
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12514 obdclass: remove pointless struct lustre_mount_data2 26/35426/7
NeilBrown [Wed, 2 Oct 2019 15:26:10 +0000 (11:26 -0400)]
LU-12514 obdclass: remove pointless struct lustre_mount_data2

This is used to pass a void* and NULL to lustre_fill_super().
It is easier just to pass the void*. The "NULL" passed is
sometimes a "struct vfsmount". This pointer is passed to
ll_fill_super() which then passes it to client_common_fill_super()
that just ignores it.

Linux-commit: 998831a00192a38a9f1425b2fb2d6faf3e34e665

Change-Id: If5e229d80c08b7c16e89d11a03fc766584c24f7c
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35426
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-8130 obd: convert obd uuid hash to rhashtable 29/34429/8
NeilBrown [Thu, 12 Dec 2019 23:46:38 +0000 (18:46 -0500)]
LU-8130 obd: convert obd uuid hash to rhashtable

The rhashtable data type is a perfect fit for the
export uuid hash table, so use that instead of
cfs_hash (which will eventually be removed).

As rhashtable supports lookups and insertions in atomic
context, there is no need to drop a spinlock while
inserting a new entry, which simplifies code quite a bit.

Linux-commit: 4206c444e4a89dae9f67f08d9c29d58c37c960cd

Change-Id: Icadf64d572982409008a1ef4d23eb0fe1e3c8cd0
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/34429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-9679 all: prefer sizeof(*var) for ALLOC/FREE 61/36661/3
Mr NeilBrown [Mon, 4 Nov 2019 05:38:24 +0000 (16:38 +1100)]
LU-9679 all: prefer sizeof(*var) for ALLOC/FREE

The construct
   LIBCFS_ALLOC(var, sizeof(*var));
is more obviously correct than
   LIBCFS_ALLOC(var, sizeof(struct something));
and is preferred upstream (where it is actually kzalloc
or similar of course).

When it is that simple, and there is no multiplier for
the size,
   CFS_ALLOC_PTR(var);
is even better.

The same logic applies to OBD_ALLOC(), LIBCFS_FREE(),
and OBD_FREE().

So convert allocations and frees that use sizeof(struct..)
to use one of the simpler constructs.
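
A minimal userspace sketch of the preferred form, assuming calloc() in
place of the Lustre allocation macros; DEMO_ALLOC_PTR and struct
demo_item are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdlib.h>

struct demo_item {
	int  di_id;
	char di_name[16];
};

/* Hypothetical helper in the spirit of CFS_ALLOC_PTR(): a zeroed
 * allocation sized from the pointer itself, so the size stays
 * correct if the pointer's type is changed later. */
#define DEMO_ALLOC_PTR(ptr) ((ptr) = calloc(1, sizeof(*(ptr))))

static struct demo_item *demo_item_new(int id)
{
	struct demo_item *item;

	DEMO_ALLOC_PTR(item);
	if (item == NULL)
		return NULL;
	item->di_id = id;
	return item;
}
```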

In mgs_write_log_mdt0, uuid is better declared as a
"struct obd_uuid *", which is a struct that contains a 'char'
array.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8cd97c75241bbb87d15cc6b7c9ac2a7d6184d700
Reviewed-on: https://review.whamcloud.com/36661
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Define enum for lnetctl commands 04/35504/11
Chris Horn [Thu, 27 Jun 2019 03:51:57 +0000 (22:51 -0500)]
LU-12410 lnet: Define enum for lnetctl commands

The enum values can be used to facilitate code sharing amongst the
lnetctl routines.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1085a70a17aefa300f3bf949cf867b2712131a0f
Reviewed-on: https://review.whamcloud.com/35504
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 lnet: Implement DLC wrapper for cfs_parse_nidlist 04/35304/13
Chris Horn [Sat, 22 Jun 2019 17:03:09 +0000 (12:03 -0500)]
LU-12410 lnet: Implement DLC wrapper for cfs_parse_nidlist

Implement a simple wrapper around cfs_parse_nidlist to be used by DLC
commands. The wrapper serves to sanitize the nidstr, so that it is
suitable to be input to cfs_parse_nidlist. We do not want to allow an
asterisk character in this string because the resultant nidlist, when
expanded, would define too many nids for the various operations that
will utilize this functionality.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I59280317af4af05eca1c8c598eadf8871e28bcf1
Reviewed-on: https://review.whamcloud.com/35304
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 libcfs: Implement address range expansion 03/35303/13
Chris Horn [Sun, 23 Jun 2019 23:26:30 +0000 (18:26 -0500)]
LU-12410 libcfs: Implement address range expansion

Implements a new top-level API function for the nidstrings library.
cfs_expand_nidlist iterates over each nidrange on a nidlist and
expands the range to create the lnet_nid_t's defined by the nidrange.

The caller supplies the nidlist, an lnet_nid_t array pointer where the
lnet_nid_t's are stored, and the maximum number of LNet nids to
generate (i.e. the size of the lnet_nid_t array pointer).

cfs_expand_nidlist returns the number of lnet_nid_t's that were added
to the lnet_nid_t array.

If the provided nidlist defines more NIDs than the specified maximum
then the return value is -1.
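
The return-value contract described above can be sketched in userspace
as follows. This is not the cfs_expand_nidlist implementation: struct
demo_range and demo_expand_ranges are hypothetical, and a nidrange is
simplified to an inclusive range of 32-bit addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one nidrange: an inclusive
 * range of addresses. The real nidlist structures are richer. */
struct demo_range {
	uint32_t dr_lo;
	uint32_t dr_hi;
};

/* Expand every range into out[], writing at most max_ids entries.
 * Returns the number of entries written, or -1 if the ranges define
 * more than max_ids addresses. */
static int demo_expand_ranges(const struct demo_range *ranges,
			      size_t nranges, uint32_t *out, int max_ids)
{
	int count = 0;
	size_t i;

	for (i = 0; i < nranges; i++) {
		uint32_t id;

		assert(ranges[i].dr_lo <= ranges[i].dr_hi);
		/* Loop exits via break so dr_hi == UINT32_MAX cannot wrap. */
		for (id = ranges[i].dr_lo; ; id++) {
			if (count >= max_ids)
				return -1;
			out[count++] = id;
			if (id == ranges[i].dr_hi)
				break;
		}
	}
	return count;
}
```

Note that on a -1 return the output array may already hold a partial
expansion, so a caller of this sketch should treat its contents as
undefined in that case.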

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I3e02f1ec466a8bc90142944b62565ebc7ef82e88
Reviewed-on: https://review.whamcloud.com/35303
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12646 lnet: Prefer route specified by rtr_nid 37/35737/5
Chris Horn [Thu, 8 Aug 2019 01:33:13 +0000 (20:33 -0500)]
LU-12646 lnet: Prefer route specified by rtr_nid

Restore an optimization that was initially added under LU-11413. For
routed REPLY and ACK we should preferably use the same router from
which the GET/PUT was received.

Cray-bug-id: LUS-8008
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia3059cddf70e1c477d90acdd90c13b4c5a292f4f
Reviewed-on: https://review.whamcloud.com/35737
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12222 lnet: Check if we're sending to ourselves 78/35778/9
Chris Horn [Mon, 12 Aug 2019 22:40:55 +0000 (17:40 -0500)]
LU-12222 lnet: Check if we're sending to ourselves

It's desirable to avoid taking a send credit when sending messages to
ourselves. Check if dst_nid is one of our own, and use the lolnd for
the send accordingly.

There are two exceptions:
1. Recovery messages must be sent to the lnet_ni that is being
   recovered.
2. If a source NID is specified then we need to send via the
   associated NI.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I656c6b1ef18ccb9b18bca65839de7c487981ebdd
Reviewed-on: https://review.whamcloud.com/35778
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12410 tests: Additional test cases for lnetctl and DLC 86/35386/17
Chris Horn [Sun, 30 Jun 2019 15:41:52 +0000 (10:41 -0500)]
LU-12410 tests: Additional test cases for lnetctl and DLC

To facilitate the refactoring of a few lnetctl commands I wrote some
additional test cases for sanity-lnet.sh

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I9f3f74420f89b824ee25b6547c0baa815ccfd948
Reviewed-on: https://review.whamcloud.com/35386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13002 tests: change clean up in sanity-lnet 49/36849/5
James Nunez [Mon, 25 Nov 2019 14:25:50 +0000 (07:25 -0700)]
LU-13002 tests: change clean up in sanity-lnet

sanity-lnet should detect the state of the system before executing,
and restore that state when it has finished.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet,runtests
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I3e3c5465789389e840efab516b35234cd61be901
Reviewed-on: https://review.whamcloud.com/36849
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12965 obdclass: remove assertion for imp_refcount 43/36743/3
Li Dongyang [Wed, 13 Nov 2019 04:01:25 +0000 (15:01 +1100)]
LU-12965 obdclass: remove assertion for imp_refcount

After calling obd_zombie_import_add(), obd_import could
be freed by obd_zombie before we check imp_refcount with
LASSERT_ATOMIC_GE_LT. It's a use after free and could
crash the box.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I3d63acf2bff543924ca0e74a35d24c507d68f6aa
Reviewed-on: https://review.whamcloud.com/36743
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12951 lmv: fix to return correct MDT count 13/36713/3
Wang Shilong [Fri, 8 Nov 2019 04:05:32 +0000 (12:05 +0800)]
LU-12951 lmv: fix to return correct MDT count

@ltd_tgts_size could be larger than actual MDT count,
as we preallocate ltd_tgts and resize it if necessary.

Fix it to use @ld_tgt_count instead.

Change-Id: I1501fd965cc74223c7a77280aac64acdbbcf17f6
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36713
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12895 tests: add Debian dependency on selinux-utils package 82/36882/3
Sebastien Buisson [Wed, 27 Nov 2019 10:21:07 +0000 (10:21 +0000)]
LU-12895 tests: add Debian dependency on selinux-utils package

For Debian, add a dependency on selinux-utils to the lustre-tests
package. This is required in order to have 'getenforce' command
available on client nodes running auster test suite.

Fixes: 4ae1c96672df ("LU-12895 tests: stop running tests for SSK and SELinux")
Test-Parameters: trivial clientdistro=ubuntu1804 testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie9d7e8280f9c41c2a4878c9951ef8f07ac36c594
Reviewed-on: https://review.whamcloud.com/36882
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12678 lnet: discard LNetMEInsert 58/36858/2
Mr NeilBrown [Wed, 6 Nov 2019 06:13:00 +0000 (17:13 +1100)]
LU-12678 lnet: discard LNetMEInsert

This function is unused and has never been used.
It is not used by cray-dvs - the other user of LNet.

So discard it.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I35a012c390ae1a7ae9d601f12cf5da1b56d4eb6d
Reviewed-on: https://review.whamcloud.com/36858
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: discard ksnn_lock 34/36834/3
Mr NeilBrown [Mon, 18 Nov 2019 00:29:08 +0000 (11:29 +1100)]
LU-12678 lnet: discard ksnn_lock

This lock in 'struct ksock_net' is being taken in places where it
isn't needed, so it is worth cleaning up.

It isn't needed when checking if ksnn_npeers has reached
0 yet, as at that point in the code, the value can only
decrement to zero and then stay there.

It is only needed:
 - to ensure concurrent updates to ksnn_npeers don't race, and
 - to ensure that no more peers are added after the net is shutdown.

The first is best achieved using atomic_t.
The second is more easily achieved by replacing the ksnn_shutdown
flag with a large negative bias on ksnn_npeers, and using
atomic_inc_unless_negative().

So change ksnn_npeers to atomic_t and discard ksnn_lock
and ksnn_shutdown.
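
The biased-counter idea can be sketched in userspace with C11 atomics.
This is an illustration only: the demo_* names and the bias value are
assumptions, and demo_inc_unless_negative() merely imitates the
behaviour of the kernel's atomic_inc_unless_negative().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bias: large enough that the counter stays negative
 * after shutdown no matter how many peers still exist. */
#define DEMO_SHUTDOWN_BIAS (INT_MIN / 2)

/* Take a reference only while the counter is non-negative,
 * i.e. while the net has not been shut down. */
static bool demo_inc_unless_negative(atomic_int *v)
{
	int old = atomic_load(v);

	while (old >= 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Shut down by adding the bias. Existing peers are still counted,
 * so all peers are gone once the value equals the bias itself. */
static void demo_shutdown(atomic_int *npeers)
{
	atomic_fetch_add(npeers, DEMO_SHUTDOWN_BIAS);
}

static bool demo_all_peers_gone(atomic_int *npeers)
{
	return atomic_load(npeers) == DEMO_SHUTDOWN_BIAS;
}
```

A single atomic variable thus carries both the peer count and the
shutdown state, which is why no separate lock or flag is required.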

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I23dd07ef89c7abc14f5a5fef28468a62f7b2a35c
Reviewed-on: https://review.whamcloud.com/36834
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: change ksocknal_create_peer() to return pointer 33/36833/2
Mr NeilBrown [Sun, 17 Nov 2019 23:38:32 +0000 (10:38 +1100)]
LU-12678 lnet: change ksocknal_create_peer() to return pointer

ksocknal_create_peer() currently returns an error status, and if that
is 0, a pointer is stored in a by-reference argument.  The preferred
pattern in the kernel is to return the pointer, or the error code
encoded with ERR_PTR().
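
For readers unfamiliar with the pattern, here is a userspace
re-creation under stated assumptions: the demo_* helpers imitate the
kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and demo_create_peer() is a
hypothetical function, not ksocknal_create_peer() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Small negative errno values are encoded at the very top of the
 * address space, where no valid object can live. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_peer {
	int dp_id;
};

/* Returns the new peer, or an errno encoded with demo_err_ptr(). */
static struct demo_peer *demo_create_peer(int id)
{
	struct demo_peer *peer;

	if (id < 0)
		return demo_err_ptr(-EINVAL);

	peer = calloc(1, sizeof(*peer));
	if (peer == NULL)
		return demo_err_ptr(-ENOMEM);

	peer->dp_id = id;
	return peer;
}
```

The caller tests the single return value with demo_is_err() and no
longer needs a by-reference output argument.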

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie1458851e93ff56236fe7ac914e9fdfb0b079d0b
Reviewed-on: https://review.whamcloud.com/36833
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: discard lnd_refcount 29/36829/2
Mr NeilBrown [Sun, 24 Nov 2019 23:00:48 +0000 (10:00 +1100)]
LU-12678 lnet: discard lnd_refcount

The lnd_refcount in 'struct lnet_lnd' is never tested (except
in an ASSERT()), so it cannot be needed.  Let's remove it.

Each individual lnd keeps track of how many lnet_ni are
registered for that lnd e.g. ksocklnd has a counter in ksnd_nnets
and o2iblnd has a linked list in kib_devs.
They hold a reference on the module while there are registered
devices, and the lnd is only freed (and the lnd_refcount checked)
when the module is unloaded.  This confirms that lnd_refcount
adds no value.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0d0c04051bf01a1fa77d888b00fb0a7875b09ccd
Reviewed-on: https://review.whamcloud.com/36829
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12999 mgs: Cleanup string handling in name_create_mdt 17/36817/5
Shaun Tancheff [Mon, 2 Dec 2019 17:32:50 +0000 (11:32 -0600)]
LU-12999 mgs: Cleanup string handling in name_create_mdt

To satisfy gcc8 -Werror=format-overflow, sanity-check the mdt_idx
before calling snprintf.
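
A minimal sketch of the approach, not the actual name_create_mdt()
code: the function name, the index limit and the name format below are
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical upper bound on a valid MDT index. */
#define DEMO_MAX_MDT_IDX 0xffff

/* Range-check idx first so the formatted width is bounded.
 * Returns 0 on success, -EINVAL for a bad index, or -E2BIG if
 * the name does not fit in the buffer. */
static int demo_name_create_mdt(char *buf, size_t len,
				const char *fsname, unsigned int idx)
{
	int rc;

	if (idx > DEMO_MAX_MDT_IDX)
		return -EINVAL;

	rc = snprintf(buf, len, "%s-MDT%04x", fsname, idx);
	if (rc < 0 || (size_t)rc >= len)
		return -E2BIG;
	return 0;
}
```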

Cray-bug-id: LUS-8186
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I2c8764d3715290ee2bd8c96cdc98b532f50632c6
Reviewed-on: https://review.whamcloud.com/36817
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-8066 obd: map MDS lov/osc to lod/osp 91/36791/2
James Simmons [Mon, 18 Nov 2019 22:22:54 +0000 (17:22 -0500)]
LU-8066 obd: map MDS lov/osc to lod/osp

Before Lustre 2.4 the MDS mirrored the clients with its osc / lov
proc tree. After the OSD changes a new lod / osp proc tree was
created, but to maintain backward compatibility special symlinks
were created. We really don't need the symlinks any more.
Instead we can expand the modification of lustre_cfg done in the
function class_config_llog_handler() to include the cases of
LCFG_SET_PARAM and LCFG_PARAM. This means that tunables set on the
MDS using the old lov / osc format can be reformatted to use
lod / osp instead. This way, when the symlinks do get removed,
handling the old format will continue to work.

Change-Id: Id82f095501440981fd8e4a07be09f35adba447e5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36791
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
4 years agoLU-12973 doc: remove bad line from .gitignore 64/36764/2
Aurelien Degremont [Fri, 15 Nov 2019 09:19:47 +0000 (09:19 +0000)]
LU-12973 doc: remove bad line from .gitignore

lustre/doc/.gitignore ignores /*.3 and /*.7 man page files
because they used to be generated.

Commit 4943ae1 removed the generated part and replaced it with
traditional static pages, but forgot to remove the ignore rule.

Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: I1bcf30ffc46854936553f1d205222aa9f82274d4
Reviewed-on: https://review.whamcloud.com/36764
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
4 years agoLU-12967 tgt: clean up sync_on_cancel references 54/36754/3
Andreas Dilger [Thu, 14 Nov 2019 02:49:23 +0000 (19:49 -0700)]
LU-12967 tgt: clean up sync_on_cancel references

Clean up the use of "sync_on_cancel" in the code, since the tunable
parameter is named "sync_lock_cancel" and using the same name in
the code makes it easier to find the related parts.

Rename constants to be more consistent:
  NEVER_SYNC_ON_CANCEL    -> SYNC_LOCK_CANCEL_NEVER
  BLOCKING_SYNC_ON_CANCEL -> SYNC_LOCK_CANCEL_BLOCKING
  ALWAYS_SYNC_ON_CANCEL   -> SYNC_LOCK_CANCEL_ALWAYS

Initialize sync_lock_cancel_states[] with designated initializers
so that the state names always match the declared values.

Use ARRAY_SIZE() instead of needing NUM_SYNC_ON_CANCEL_STATES.
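
The two techniques can be sketched together as follows. The constant
names come from the renaming above; the numeric values, the state
strings and the demo_* identifiers are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Numeric values are assumed for this sketch. */
enum demo_sync_lock_cancel {
	SYNC_LOCK_CANCEL_NEVER    = 0,
	SYNC_LOCK_CANCEL_BLOCKING = 1,
	SYNC_LOCK_CANCEL_ALWAYS   = 2,
};

/* Designated initializers tie each name to its enum value, so
 * reordering the enum cannot silently mismatch the table. */
static const char *const demo_sync_lock_cancel_states[] = {
	[SYNC_LOCK_CANCEL_NEVER]    = "never",
	[SYNC_LOCK_CANCEL_BLOCKING] = "blocking",
	[SYNC_LOCK_CANCEL_ALWAYS]   = "always",
};

/* Look up a state by name; returns the enum value or -1 if unknown.
 * The loop bound comes from the array, not a separate constant. */
static int demo_sync_lock_cancel_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < DEMO_ARRAY_SIZE(demo_sync_lock_cancel_states); i++) {
		if (strcmp(name, demo_sync_lock_cancel_states[i]) == 0)
			return (int)i;
	}
	return -1;
}
```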

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If7c6015420a5c3266a13798fd8b96539323ebbe5
Reviewed-on: https://review.whamcloud.com/36754
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12967 ofd: restore sync_on_lock_cancel tunable 48/36748/4
Andreas Dilger [Thu, 14 Nov 2019 00:56:35 +0000 (17:56 -0700)]
LU-12967 ofd: restore sync_on_lock_cancel tunable

The "ofd.*.sync_on_lock_cancel" tunable was inadvertently replaced
during procfs->sysfs changes in 2.12 with "sync_lock_cancel".  Restore
the "sync_on_lock_cancel" tunable since it has existed since the 2.0
release and is definitely in use with several systems.

It isn't just a matter of restoring the old tunable name, since the
"mdt.*.sync_lock_cancel" name is also used since 2.8 and the code for
the two tunables was recently consolidated in the server target code.

Instead, keep the common "sync_lock_cancel" tunable name, add backward
compatibility for "sync_on_lock_cancel" for a number of releases, and
print a deprecation warning if the old name is used.

Fix up sanity.sh test_80 to check for both the old and new names,
but only if we actually need to change this tunable for ZFS, along
with minor test script style cleanups.

Fixes: 7059644e9ad3 ("LU-8066 ofd: migrate from proc to sysfs")
Change-Id: Iffe65f6268d94075c71b96d42fe60ef11ac39448
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36748
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
4 years agoLU-12923 lustre: Replace CLASSERT() with BUILD_BUG_ON() 25/36725/3
Arshad Hussain [Tue, 29 Oct 2019 17:45:30 +0000 (23:15 +0530)]
LU-12923 lustre: Replace CLASSERT() with BUILD_BUG_ON()

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON()
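
As a rough userspace approximation of what such a check does,
DEMO_BUILD_BUG_ON below is built on C11 _Static_assert. The kernel
macro is implemented differently, and struct demo_wire_hdr is a
hypothetical structure used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compilation fails when 'cond' is true. */
#define DEMO_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* Hypothetical wire structure whose layout must not change. */
struct demo_wire_hdr {
	uint32_t dw_magic;
	uint32_t dw_len;
	uint64_t dw_xid;
};

static inline size_t demo_wire_hdr_size(void)
{
	/* Both checks are evaluated at compile time only. */
	DEMO_BUILD_BUG_ON(sizeof(struct demo_wire_hdr) != 16);
	DEMO_BUILD_BUG_ON(offsetof(struct demo_wire_hdr, dw_xid) != 8);
	return sizeof(struct demo_wire_hdr);
}
```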

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ic21510ba4f1c99fa0ea6832d240d96ffc7622593
Reviewed-on: https://review.whamcloud.com/36725
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
4 years agoLU-12661 tests: skip sanity 817 if kernel version >= 4.14 12/36712/4
Li Dongyang [Fri, 8 Nov 2019 00:19:32 +0000 (11:19 +1100)]
LU-12661 tests: skip sanity 817 if kernel version >= 4.14

sanity test_817 is in the ALWAYS_EXCEPT list for aarch64.
However, it is failing because the test was run on kernel-alt,
which is 4.14.x; the failure is not related to the architecture.

On new kernels nfsd is not releasing the file after write,
it will fail with ETXTBSY regardless of whether the nfs export
is backed by a lustre mount or not.

Skip the test on new kernels for now.

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie18ceb961eee2313fca7d60a35159a7496075029
Reviewed-on: https://review.whamcloud.com/36712
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
4 years agoLU-12988 osd: do not use preallocation during mount 04/36704/5
Alex Zhuravlev [Thu, 14 Nov 2019 15:13:16 +0000 (18:13 +0300)]
LU-12988 osd: do not use preallocation during mount

as a cold mballoc cache can cause a very lengthy search.

Change-Id: I821b023d392336f0085a96e821dc22e92dbf23b7
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36704
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11490 tests: use variable for file system name 94/36694/4
James Nunez [Wed, 6 Nov 2019 21:01:34 +0000 (14:01 -0700)]
LU-11490 tests: use variable for file system name

There are several tests that have the file system name
hard coded to "lustre".  These tests will fail or some
calls will fail when these tests are run on systems
where the file system name is not "lustre".  These tests
should be changed to use $FSNAME.

Test-Parameters: trivial testlist=sanity,conf-sanity

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I22263d2ae5ad29806cb709f462ef21837916c939
Reviewed-on: https://review.whamcloud.com/36694
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
4 years agoLU-8066 lustre: drop ldebugfs_remove() 82/36682/7
Mr NeilBrown [Tue, 5 Nov 2019 23:55:52 +0000 (10:55 +1100)]
LU-8066 lustre: drop ldebugfs_remove()

ldebugfs_remove() is a wrapper around debugfs_remove_recursive()
which adds two features:
1/ the pointer is tested with IS_ERR_OR_NULL before making the call
2/ the pointer is cleared after the call.

The first is not needed since Linux 3.6
Commit a59d6293e537 ("debugfs: change parameter check in
                      debugfs_remove() functions")

and the "OR_NULL" part has never been needed.  In many cases a pointer
to a debugfs dentry is already never an error, or is NULLed as soon as
the error is noticed.  Only in two places is an error stored (fid_request
and fld_request), so we change those to never store the error.

The second is only needed for a few global variables.  In most other
cases the structure holding the pointer will be freed in the near
future, so clearing the pointer is pointless.  obd_debugfs_entry is
one case where I wasn't certain that NULLing the pointer was not
needed.

When the debugfs_remove_recursive() call is made just before module
exit, and the variable is local to the module, there is no point
clearing the variable.

As the extra functionality is barely needed, let's just use the
standard interface, with occasional checks and clears as needed.

Linux-commit b145d7865a7c ("staging: lustre: get rid of
                            ldebugfs_remove()")

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I68db147433273b70d6fe0957df10ed14e8e924bb
Reviewed-on: https://review.whamcloud.com/36682
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-4423 lustre: don't declare extern variables in C files. 50/36650/3
Mr NeilBrown [Sun, 3 Nov 2019 22:37:10 +0000 (09:37 +1100)]
LU-4423 lustre: don't declare extern variables in C files.

A previous patch already removes some 'extern's.  This patch
removes the remainder from kernel code.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If78d96edd389c602ac8ee3321492db5fedda7c69
Reviewed-on: https://review.whamcloud.com/36650
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12923 ptlrpc: Use BUILD_BUG_ON() for pack_generic.c 49/36649/4
Arshad Hussain [Tue, 29 Oct 2019 01:00:29 +0000 (06:30 +0530)]
LU-12923 ptlrpc: Use BUILD_BUG_ON() for pack_generic.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/ptlrpc/pack_generic.c

This patch also fixes a few space/tab issues reported
by checkpatch.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I1903f9faad8b1dec7c550a6b653dcb899aaa0b98
Reviewed-on: https://review.whamcloud.com/36649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-8585 llite: don't cache MDS_OPEN_LOCK for volatile files 41/36641/4
James Simmons [Wed, 6 Nov 2019 15:04:13 +0000 (10:04 -0500)]
LU-8585 llite: don't cache MDS_OPEN_LOCK for volatile files

The kernel's knfsd constantly opens and closes files for each
access, which can result in a continuous stream of open+close RPCs
being sent to the MDS. To avoid this, Lustre created a special
flag, ll_nfs_dentry, which enables caching of the MDS_OPEN_LOCK
on the client. The fhandles API also uses the same exportfs layer
as NFS, which indirectly ends up caching the MDS_OPEN_LOCK as well.
This is okay for normal files, but not for Lustre's special volatile
files that are used for HSM restore. It is expected that after the
last close of a Lustre volatile file it is no longer accessible.
To ensure this behavior is kept, don't cache MDS_OPEN_LOCK for
volatile files.

Change-Id: Ia5d78baf17279c6f268bc0bf443b428d5cbea440
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12894 sec: fix checksum for skpi 04/36604/10
Sebastien Buisson [Tue, 29 Oct 2019 09:32:22 +0000 (18:32 +0900)]
LU-12894 sec: fix checksum for skpi

Compute the checksum on the message before actually comparing
it to the hmac value.

Add test to exercise all SSK flavors.
Make sure zconf_mount does include skpath mount option if SSK or
Kerberos is in use.

Fixes: a21c13d4df ("LU-8602 gss: Properly port gss to newer crypto api.")
Test-Parameters: envdefinitions=SHARED_KEY=true testlist=sanity-sec
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=skn testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=ska testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=ski testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=skpi testlist=sanity,recovery-small
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7bcc3618c1824a0f0ca73219c7ac0ccc8405b946
Reviewed-on: https://review.whamcloud.com/36604
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12904 o2ib: ib_destroy_cq() returns void 78/36578/2
Shaun Tancheff [Fri, 25 Oct 2019 13:32:44 +0000 (08:32 -0500)]
LU-12904 o2ib: ib_destroy_cq() returns void

Kernel destroy CQ flows can't fail, and the returned value of
ib_destroy_cq() is of no interest in those flows.

kernel-commit: 890ac8d97e6722a9e4a66a0bd836d1b028d075fe

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I873bf76a33bd80d5e6de4d1b16a79ff5ea930f3a
Reviewed-on: https://review.whamcloud.com/36578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12969 test: reset quota limits for all test ID/users 56/36756/4
Wang Shilong [Thu, 14 Nov 2019 06:22:22 +0000 (14:22 +0800)]
LU-12969 test: reset quota limits for all test ID/users

It looks like the current sanity-quota.sh assumes that TSTID/TSTID2
map to quota_usr/quota_2usr. However, in a real testing environment
this might not be true.

In order to make sure we clean up everything properly, just reset
all IDs, users, and groups.

Test-parameters: trivial testlist=sanity-quota
Change-Id: I2faf1a6392ce2ee89e2e22ba0a6ec65efea4ade2
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36756
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>