git://git.whamcloud.com - fs/lustre-release.git/log

LU-12542 handle: remove locking from class_handle2object()

There is limited value in this locking and test on h_in.

If the lookup could have run in parallel with
class_handle_unhash_nolock() and seen "h_in == 0", then it could
equally well have run moments earlier and not seen it - no locking
would prevent that, so the caller must be prepared to have
an object returned which has already been unhashed by the time it
sees the object.

In other words, any interlock between unhash and lookup must be
provided at a higher level than where this code is trying
to handle it.

The locking *does* prevent the refcount from being incremented if the
object has already been removed from the list. As the final reference
is always dropped after that removal, it indirectly stops the refcount
from being incremented after the final reference is dropped.
This can be more directly achieved by using refcount_inc_not_zero().

So remove the locking, and replace it with refcount_inc_not_zero().
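
As a rough userspace illustration of the "increment unless zero" idea
(not the kernel's refcount_inc_not_zero() implementation), the sketch
below uses C11 atomics. The struct and function names are hypothetical
and exist only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a reference count; not the Lustre handle struct. */
struct sketch_obj {
	atomic_uint refcount;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns true on success, false if the final reference is already gone.
 */
bool sketch_get_not_zero(struct sketch_obj *o)
{
	unsigned int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the final one. */
bool sketch_put(struct sketch_obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

Once the count has reached zero, a lookup racing with the final put
fails cleanly instead of resurrecting the object, which is the property
the lock was providing indirectly.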

Change-Id: Id29cee173ed0c3b060ea92e21af6e420970cfa18
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35861
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12518 llite: Accept EBUSY for page unaligned read

When doing unaligned strided reads, it's possible for the
first and last page of a stride to be read by another
thread on the same node, resulting in EBUSY.

This could also happen for sequential reads, for example when
several MPI processes split one large file at offsets that are not
page aligned and each process reads its part sequentially.

We shouldn't stop readahead in these cases.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4e832c8859452d0b52f14b5e4fdb64a972bf40a3
Reviewed-on: https://review.whamcloud.com/35457
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12518 llite: proper names/types for offset/pages

Use loff_t for file offsets and pgoff_t for page index values
instead of unsigned long, so that it is possible to distinguish
what type of value is being used in the byte-granular readahead
code. Otherwise, it is difficult to determine what units "start"
or "end" in a given function are in.

Rename variables that reference page index values with an "_idx"
suffix to make this clear when reading the code. Similarly, use
"bytes" or "pages" for variable names instead of "count" or "len".

Fix stride_page_count() to properly use loff_t for the byte_count,
which might otherwise overflow for large strides.

Cast pgoff_t vars to loff_t before PAGE_SIZE shift to avoid overflow.
Use shift and mask with PAGE_SIZE and PAGE_MASK instead of mod/div.

Use proper 64-bit division functions for the loff_t types when
calculating stride, since they are not guaranteed to be within 4GB.
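
The overflow that the cast avoids can be sketched in userspace as
follows. The sketch_* types and macros are stand-ins chosen for this
example (a 32-bit page index and 4 KiB pages are assumptions), not the
kernel definitions.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((sketch_loff_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

typedef uint32_t sketch_pgoff_t;	/* stands in for a 32-bit unsigned long */
typedef int64_t  sketch_loff_t;		/* stands in for loff_t */

/* Buggy: the shift happens in 32 bits and wraps before it is widened. */
sketch_loff_t sketch_idx_to_bytes_narrow(sketch_pgoff_t idx)
{
	return (sketch_loff_t)(sketch_pgoff_t)(idx << SKETCH_PAGE_SHIFT);
}

/* Fixed: widen to 64 bits first, then shift. */
sketch_loff_t sketch_idx_to_bytes_wide(sketch_pgoff_t idx)
{
	return (sketch_loff_t)idx << SKETCH_PAGE_SHIFT;
}

/* Offset within a page via mask instead of modulo. */
sketch_loff_t sketch_offset_in_page(sketch_loff_t pos)
{
	return pos & ~SKETCH_PAGE_MASK;
}

/* Page index via shift instead of division (pos assumed non-negative). */
sketch_pgoff_t sketch_bytes_to_idx(sketch_loff_t pos)
{
	return (sketch_pgoff_t)(pos >> SKETCH_PAGE_SHIFT);
}
```

With a page index of 1 << 20 the narrow version wraps to 0, while the
widened version yields 4 GiB.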

Remove unused "remainder" argument from ras_align() function.

Fixes: 91d264551508 ("LU-12518 llite: support page unaligned stride readahead")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie1e18e0766bde2a72311e25536dbb562ce3ebbe5
Reviewed-on: https://review.whamcloud.com/37248
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13154 test: skip sanity-quota 66 if MDS version < 2.12.4

Since LU-12826 landed after this version, add version check to
make interop test pass.

Test-Parameters: trivial envdefinitions=ONLY=66 testlist=sanity-quota
Change-Id: I829f424b9bb103e18c06de6f797827f82e1874d1
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37276
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>

LU-13063 tests: stop running sanity test 411

sanity test 411 hits a kernel bug for RHEL 8.1. Since this
is an issue with the kernel and not Lustre, let's stop
running this test until the kernel is patched. Thus, we
need to add sanity test 411 to the ALWAYS_EXCEPT list.

Also change the ALWAYS_EXCEPT condition for test smoke for
lnet-selftest to be based on kernel version and not
architecture, so that the custom test for this patch can
pass.

Test-Parameters: trivial clientdistro=el8.1
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I60174dcd4776b53ac5b44be6c208d40e1f022445
Reviewed-on: https://review.whamcloud.com/37270
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>

LU-13152 llapi: llapi_layout_get_by_xattr groks DoM

llapi_layout_get_by_xattr() function must be updated to handle
lov component with LOV_PATTERN_MDT pattern.

Signed-off-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6553e66cd4f3b5acc65790da94555350c98fe179
Reviewed-on: https://review.whamcloud.com/37269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>

LU-13147 tests: Cleanup sanity-lnet on test failure

Trap EXIT so we can cleanup on test failure.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I702b214046a68af2b87536dab01879c356bff2a8
Reviewed-on: https://review.whamcloud.com/37258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13136 dom: check read-on-open buffer presents in reply

ll_dom_finish_open() uses req_capsule_has_field() wrongly:
it checks only the format, not whether the buffer is present in
the reply, which causes unneeded console errors about a missing
buffer later in req_capsule_server_get().

This patch replaces it with req_capsule_field_present() to check
whether the server packed that field in the reply, and to properly
skip responses from an old server.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia6114879c90e3e6b8c5020c4912e988cad90df30
Reviewed-on: https://review.whamcloud.com/37249
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>

LU-11644 ptlrpc: show target name in req_history

Currently the req_history tracing shows the "self" NID as the second
field. However, this is not very useful since there may be a number
of different targets on the same server, and since the logs are all
collected directly on the server we already know the local NID.

Instead of printing the "self" NID, store the target name as the
second field, if that is available, so that we can determine which
target the RPC was intended for. This makes it easier to debug
problems with bad clients and isolate traffic for a specific target.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4ce5b7c557c5b491bfe3bbc5ae80257f0a3ebbe5
Reviewed-on: https://review.whamcloud.com/37193
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13093 osd: fix osd_attr_set race

There is a race between tgt_brw_write->ofd_write_attr_set and
ofd_attr_set which can set wrong attributes.
ofd_write_attr_set() does its checks and declarations, then sleeps
on ofd_read_lock. Another thread executes ofd_attr_set() and sets
the initial uid/gid. After that the first thread wakes up and sets
a different uid/gid. But ofd_write_attr_set() should change the
attributes only when they are set for the first time.
This also leads to a bug in the credits check, because the uid was
changed between declaration and attr_set.

osd_trans_exec_check(ATTR_SET) is in the wrong place when xattr_set
is called. Also, xattr doesn't have osd_trans_exec_op.

lustre-OST0001: opcode 0: used 9, used now 9, reserved 1
create: 0/0/0, destroy: 0/0/0
attr_set: 1/1/9, xattr_set: 2/274/0
write: 0/0/0, punch: 0/0/0, quota 6/6/0
insert: 0/0/0, delete: 0/0/0
ref_add: 0/0/0, ref_del: 0/0/0
LBUG

Cray-bug-id: LUS-8133
Fixes: 9f79d4488 ("LU-10048 ofd: take local locks within transaction")
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Id36ff633b0d97fff345ec105e0aa1b14fccafce4
Reviewed-on: https://review.whamcloud.com/37117
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13101 llite: eviction during ll_open_cleanup()

On error, ll_open_cleanup() is called while the
intent lock remains pinned, so eviction can
happen while the close request waits for a mod rpc slot.

Release the intent lock before ll_open_cleanup().

Change-Id: Ia422351f3f54fc652078f742f2ead0bf278c9d17
Cray-bug-id: LUS-8055
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/37096
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13099 lmv: disable statahead for remote objects

Statahead for remote objects is supposed to be disabled by
LU-11681 lmv: disable remote file statahead.

However, due to a typo it is not, and statahead for remote objects is
accompanied by warnings like:
ll_set_inode()) Can not initialize inode .. without object type..
ll_prep_inode()) new_inode -fatal: rc -12

Fix the typo.

Test to illustrate the issue is added.

Fixes: 02b5a407081c ("LU-11681 lmv: disable remote file statahead")

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-8262
Change-Id: I8055b6373fb7b9777fa888dcb09384213822a59f
Reviewed-on: https://review.whamcloud.com/37089
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12991 lnet: lnet response entries leak

LNetPut is called with the ACK flag, but LNetMDUnlink is issued before
the ACK arrives. This can be due to a timeout or to an application call
(ldiskfs commit for difficult replies on MDT).
The MD is freed but the rsp is not detached, as the ACK doesn't hold a
reference to the MD between the request being sent and the ACK arriving.
The monitor thread detects this situation and moves the RSP entry to the
zombie list, where it is never freed because no msg is processed due to
the MD's absence.

Remove the response tracking when nobody wants the reply, i.e. when
LNetMDUnlink is called.

Test-parameters: trivial

Cray-bug-id: LUS-8188
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I90ad88cea41bb28b29f909c85b8273d41464ce81
Reviewed-on: https://review.whamcloud.com/36896
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13049 lnet: peer lookup handle shutdown

When LNet is shutting down, looking up peer_nis shouldn't assert
but return NULL. Callers handle a NULL return.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia658f527719a71b2d0bed144ae03582eff54fcf9
Reviewed-on: https://review.whamcloud.com/36925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12822 uapi: properly pack data structures

Linux UAPI headers use the gcc attribute __packed__ to ensure
that the data structures are the exact same size on all platforms.
This comes at the cost of potential misaligned accesses to these
data structures, which at best cost performance and at worst cause
a bus error on some platforms. To detect potential misaligned
accesses, gcc version 9 introduced a new compile flag, which is
now impacting Lustre builds.

Examining the build failures shows most of the problems are due to
packed data structures in the Lustre UAPI headers containing
unpacked data structure fields. Packing those missed structures
resolved many of the build issues. The second problem is that the
Lustre utilities tend to cast some of their UAPI data structures.
A good example is struct lov_user_md being cast to
struct lov_user_md_v3. To ensure this is properly handled with
packed data structures we need to use the __may_alias__ compiler
attribute. The one exception is struct statx, which is defined
outside of Lustre and is unpacked. This requires extra special
handling in userland code due to the issues described in this
comment.

Fixing this problem exposed an incorrect wiretest for
struct update_op.

The last problem addressed is the use of __swabXXp() on packed data
structure fields. Because of the potential alignment issues we
have to use the __swabXX() functions instead.
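
The difference between swapping through a pointer and swapping by value
can be sketched as below. The structure and helper names are
hypothetical, and sketch_swab32() is a plain userspace stand-in written
for this example rather than the kernel's __swab32().

```c
#include <stdint.h>

/* Hypothetical wire structure; 'value' sits at offset 1, so it is misaligned. */
struct sketch_wire {
	uint8_t  tag;
	uint32_t value;
} __attribute__((packed));

/* By-value byte swap, analogous in spirit to a __swabXX() helper. */
uint32_t sketch_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Swap the field without ever forming a uint32_t pointer to it.
 * The compiler knows the member is packed and emits a safe access,
 * whereas passing &w->value to a pointer-based helper would hand out
 * a pointer that may be misaligned.
 */
void sketch_wire_swab(struct sketch_wire *w)
{
	w->value = sketch_swab32(w->value);
}
```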

Change-Id: I149c55d3361e893bd890f9c5e9c77c15f81acc1b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36798
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-7791 ldlm: signal vs CP callback race

In case of interrupted wait for a CP AST
failed_lock_cleanup() sets LDLM_FL_LOCAL_ONLY, so
the client wouldn't cancel the lock on CP AST.

A lock isn't canceled on the server on reception

Cray-bug-id: LUS-2021
Change-Id: Id1e365b41f1fb8a0f9a32c0c929457b22ceba8ef
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/19898
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

Revert "LU-13120 build: Fix ZFS dependancies for osd-zfs-mount"

This reverts commit fb687e35402fa6755589657a67dbe30be09ba9c5.

All review-dne-zfs-part-[1234] sessions fail with the most recent
master landings, and this seems like the likely culprit.

Change-Id: Id0295d65a642e7c2ef6367dac72d89acfca8a6b4
Reviewed-on: https://review.whamcloud.com/37320
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>

LU-11276 ldlm: fix lock convert races

The blocking cb may be triggered in parallel, and the convert logic
of the DOM lock must be prepared for the cancel_bits to have already
been zeroed by the first executor.

As there may be several parallel blocking cb executors and several
conversion callers, each requesting different inode bits, set up
the following logic:
- the lock keeps the aggregated set of bits requested for cancelling
  by different parties, where 0 means the whole lock is to be
  cancelled, and where the CBPENDING flag means there is a canceling
  job pending;
- once completed, the cancel_bits are zeroed and the CBPENDING flag
  is dropped, meaning the next request will be a part of the next job;
- once a local lock is converted, its state is changed appropriately
  and no cleanup is left for the interpret time as the lock is ready
  for the next usage;
- as the lock is unlocked in a process of conversion and more bits
  may appear, check it and repeat appropriately;
- allow just 1 conversion executor to work at a time; the others wait,
  similar to ldlm_cli_cancel();
- there are others who may want to cancel unused locks (cancel_lru,
  cancel_resource_local), consider CANCELING as a request to cancel
  the full lock independently of the cancel_bits;

Some cleanups are done:
- move the cache drop logic to the CANCELING part of the blocking cb
  from the BLOCKING one;
- remove the convert RPC interpret, as the lock cleanups are already
  done in advance; the convert RPC is re-sendable and an error means
  there is a serious net problem;

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I901de34241704ed801152f071cb7f610fe6f4bfe
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13119 osd-ldiskfs: set f_cred for app armour

The function iterate_dir() interfaces with the security layer.
For some kernel versions on platforms that use app armour it
expects f_cred to be set. Currently osd-ldiskfs open codes the
creation of struct file so it is missing a cred. Fix this by
setting f_cred to the default current_cred().

Test-Parameters: testlist=sanity-lfsck serverdistro=sles12sp3

Change-Id: I38487e8ae99a0f70d6e430935b7d19523d414b4b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37184
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>

LU-13121 llite: fix deadlock in ll_update_lsm_md()

A deadlock may happen in the following scenario: a lookup process
calls ll_update_lsm_md(), finds that lli->lli_lsm_md is NULL, and
then calls down_write(&lli->lli_lsm_sem). But another lookup process
initializes lli->lli_lsm_md after this check and before the write
lock is taken, so the first lookup process calls
up_read(&lli->lli_lsm_sem) and returns. The write lock is therefore
never released, which causes subsequent lookups to deadlock.

Rearrange the code to simplify the locking:
1. take read lock.
2. if lsm was initialized and unchanged, release read lock and return.
3. otherwise release read lock and take write lock.
4. free current lsm and initialize with new lsm.
5. release write lock.
6. initialize stripes with read lock.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ifcc25a957983512db6f29105b5ca5b6ec914cb4b
Reviewed-on: https://review.whamcloud.com/37182
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13120 build: Fix ZFS dependancies for osd-zfs-mount

lustre-osd-zfs-mount depends on zfs
lustre-osd-zfs-mount depends on kmod-lustre-osd-zfs

SuSE packaging style prefers kmp package naming so prepare
for adopting a kmp named zfs package

Test-Parameters: trivial
Cray-bug-id: LUS-7077
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I510a46dd3d0e6d58a1e0db36226d412ee06016ec
Reviewed-on: https://review.whamcloud.com/37169
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13117 libcfs: fix to match right key in cfs_get_environ()

cfs_get_environ() uses memcmp() to match the environment variable
against the desired key, then accounts for the "=" when
calculating the length. But it fails to check that the next
character is actually an equals sign, so any key which is also
the prefix of some other variable name can match the wrong
variable.

Also add debug information for debugging similar issues
in the future.
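
A minimal sketch of the corrected match is shown below. The function
name is hypothetical, and it uses strncmp() rather than memcmp() so the
example never reads past the end of a short entry; the actual patch may
differ.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the value for 'key' if 'entry' has the form "key=value",
 * or NULL otherwise.
 */
const char *sketch_env_match(const char *entry, const char *key)
{
	size_t key_len = strlen(key);

	if (strncmp(entry, key, key_len) != 0)
		return NULL;
	/* Without this check, key "PATH" would also match "PATH_INFO=...". */
	if (entry[key_len] != '=')
		return NULL;
	return entry + key_len + 1;
}
```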

Test-Parameters: trivial
Change-Id: Ia2b4ccd1f10c89059cecc224d4e2ba8d1d75b825
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13098 ptlrpc: supress connection restored message

Suppress the message if the connection is restored on an idle connection.

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I506665d427f3e77477f53e2d3059bcb1daaf0318
Reviewed-on: https://review.whamcloud.com/37086
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13092 lbuild: include lbuild-{fc,rhel,sles} to SIGNATURE

These files should be included when calculating SIGNATURE since,
for example, the kernel extra tags could be bumped there.

Test-Parameters: trivial
Change-Id: I2c62ad765d3c6a1b9e99affe3be95a404d6140c5
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37076
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13053 tests: fix conf-sanity call to umount_ldiskfs

conf-sanity test 87 calls umount_ldiskfs(), but the function
in test-framework.sh is unmount_ldiskfs(). We need to
change the function call in test 87 to unmount_ldiskfs().

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I3e0818a229341c4fab8aee923cad2253b7dd634d
Reviewed-on: https://review.whamcloud.com/36949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: remove locking protection ln_testprotocompat

lnet_net_lock(LNET_LOCK_EX) is a heavy-weight lock that is not
necessary here. The bits in this field are only set rarely - via an
ioctl - and the pattern for reading and clearing them exactly
matches test_and_clear_bit(). So change the field to "unsigned
long" (so test_and_clear_bit() can be used), and use
test_and_clear_bit(), discarding all other locking.
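
The read-and-clear pattern can be illustrated in userspace with C11
atomics. This is only an analogue of the kernel's test_and_clear_bit();
the function name and the use of atomic_ulong are assumptions made for
the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically clear bit 'nr' in '*word' and report whether it was set. */
bool sketch_test_and_clear_bit(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	/* fetch_and returns the old value, so one atomic op does both jobs. */
	return (atomic_fetch_and(word, ~mask) & mask) != 0;
}
```

A reader that finds the bit set also consumes it in the same atomic
operation, so no separate lock is needed around the test and the clear.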

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie420fcb3d547d9ec04025b921d5b24bd8f2fcce3
Reviewed-on: https://review.whamcloud.com/36856
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: make "struct lnet_lnd" always "const".

Every place where "struct lnet_lnd" appears, "const" is
added in front. Now all those structs can be in read-only
memory which is generally more secure.

Linux-commit 07499855083e ("lnet: make "struct lnet_lnd"
always "const".")

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I54a73d5b12de8c6b9a98182577c3c30d05c00222
Reviewed-on: https://review.whamcloud.com/36832
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 socklnd: initialize the_ksocklnd at compile-time.

All other lnds initialize this struct at compile-time.
It is best for socklnd to do so too.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I3acd636f6f5ba783a2c60bf18ffc46c98e091c13
Reviewed-on: https://review.whamcloud.com/36831
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-11385 odbclass: Handle gracefully if nsproxy is NULL

Gracefully handle the case if current->nsproxy is NULL:
check for the condition and return an error, avoiding attempts
to dereference the pointer.

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia102d2bacdb0e54b0339985396447e6d25465c56
Reviewed-on: https://review.whamcloud.com/36802
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12968 mgs: Prevent reading past end of buffer

KASAN reported
  BUG: KASAN: slab-out-of-bounds in mgs_wlp_lcfg+0xb3/0x4a0 [mgs]
  Read of size 64 at addr ffff8880b8f9fe40 by task ll_mgs_0002/17603

On memory allocated here.
  mgs_write_log_target+0x2ae/0x910 [mgs]

In mgs_wlp_lcfg( ..., char *ptr), ptr is a string, so use strlcpy
instead of memcpy to avoid reading past the end of the buffer.
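
The difference can be sketched with a bounded copy that follows
strlcpy() semantics. The helper below is written out for this example
(not every libc provides strlcpy()) and its name is hypothetical; a
fixed-size memcpy() would instead read the full destination size from
the source even when the source string is shorter.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size-1 bytes of the NUL-terminated string 'src' into
 * 'dst' and always terminate. Returns strlen(src) so that callers can
 * detect truncation.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);	/* never reads past the end of src */
		dst[n] = '\0';
	}
	return len;
}
```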

Cray-bug-id: LUS-8137
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I539c0b4d878d26c44f64a4cd5746a8fba1bef2fa
Reviewed-on: https://review.whamcloud.com/36753
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-12904 ldiskfs: Add ldiskfs support for linux 5.4

Linux 5.4 ext4 has some changes from 5.0. This
fixes up the ldiskfs patches to apply against 5.4.

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I116226ec9297eead4dfd3403be748f732e67f54f
Reviewed-on: https://review.whamcloud.com/36583
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13141 ldiskfs: block alloc performance patch

Add block alloc performance patch to CentOS 7.7, 8.0 and
Ubuntu 19.04 5.0 kernel.

Cray-bug-id: LUS-8402
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ifeb78839e5dbe8731bbb5532906708b97d4d9d33
Reviewed-on: https://review.whamcloud.com/37250
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12791 kernel: kernel update RHEL 8.0 [4.18.0-80.11.2.el8_0]

Update RHEL 8.0 kernel to 4.18.0-80.11.2.el8_0.

Test-Parameters: trivial clientdistro=el8 \
testlist=sanity

Change-Id: I4081719fa9a8c83ea0e8bff46dc9d54774cabb56
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36527
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12214 selinux: Remove concatenating of selinux context

Remove concatenating of context for the temporary mount point
if selinux is enabled.
mount.zfs doesn't have that option, so revert it for consistency.
It can be added with -o option if needed.

Cray-bug-id: LUS-5992
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec,sanity-selinux

Change-Id: If471de13e201c5cdcb28631b90b2efa13d8f2b4f
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/36423
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12521 llapi: add separate fsname and instance API

The llapi_getname() function returns the combined fsname and client
instance as one string, which is fine when using the entire string,
but the output cannot be safely parsed into separate fsname and
instance strings in all cases.

Introduce new llapi_get_fsname() and llapi_get_instance() functions
that return only the fsname and instance strings, since the source
string returned from the kernel can be unambiguously separated before
it is returned in a combined string via llapi_getname().

Fix the lfs_getname() '-n' and '-i' options to use the new routines
rather than parsing the output from llapi_getname().

Add man pages for these functions.

Fixes: 2a4821b836c8 ("LU-12159 utils: improve lfs getname functionality")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf5846a0ae147a428f66ec8a1d0251e7e12540e5
Reviewed-on: https://review.whamcloud.com/35451
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12806 llapi: use name_to_handle_at in llapi_fd2fid

Reimplement llapi_fd2fid so as to use name_to_handle_at() rather than
using an ioctl() call.

This patch also updates llapi_fid2path, since a file descriptor
obtained from a call to open() with O_PATH is valid with
name_to_handle_at(), is more efficient, and also works for symlinks
out of the box.

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ic7e83c7fdf924363ed59a0681267d960e660db6d
Reviewed-on: https://review.whamcloud.com/36292
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12919 lnet: Fix source specified route selection

If lnet_send() is called with a specific src_nid, but
rtr_nid == LNET_NID_ANY and the message needs to be routed, then we
need to ensure that the lnet_peer_ni of our next hop is on the same
network as the lnet_ni associated with the src_nid. Otherwise we
may end up choosing an lnet_peer_ni that cannot be reached from
the specified source.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Idc5b808f90170a3480d523ba4726cc48c3387ddb
Reviewed-on: https://review.whamcloud.com/36622
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10467 ptlrpc: refactor waiting in ptlrpc_set_wait()

ptlrpc_set_wait can wait either with or without signals blocked.
After it has waited, it possibly checks if a signal is pending and if
so, marks the set as interrupted.

The code for the check examines lwi.lwi_allow_intr which was set
before the wait.  Converting this to use upstream wait primitives
will remove lwi, so we need another way to handle this.

The current test looks wrong.  It is
        if (rc == -ETIMEDOUT &&
            (!lwi.lwi_allow_intr || set->set_allow_intr) &&
            signal_pending(current)) {

but if lwi.lwi_allow_intr is true, then the wait will have allowed
signals and so the set will already have been interrupted if needed.
So the case where lwi.lwi_allow_intr is true and set->set_allow_intr
is also true, should be irrelevant.

i.e. the condition should just be
        if (rc == -ETIMEDOUT &&
            !lwi.lwi_allow_intr &&
            signal_pending(current)) {

which it was before
Commit afcf3026c6ad ("LU-6684 lfsck: stop lfsck even
                       if some servers offline")

Given this, if we move the l_wait_event() into each branch of the
'if', we can then move the extra condition and
ptlrpc_interrupted_set() call into the 'else' branch - the only place
the condition would fire, and simplify the condition to

        if (rc == -ETIMEDOUT &&
            signal_pending(current)) {

This will make the two waits separate, so they can be easily
converted.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I1e1819c697cea47607d5fc4a018c898236b33f4b
Reviewed-on: https://review.whamcloud.com/35981
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12470 tests: increase pdirops timeout

There are pretty regular failures of the sanityn pdirops test_40-47.
Increase the timeout slightly to reduce the frequency of failures.

Fixes: 743b85a32e24 ("LU-2233 tests: improve tests sanityn/40-47")
Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn
Test-Parameters: testlist=sanityn,sanityn,sanityn,sanityn,sanityn
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie1ea5704edab97b61f563135b4cc2491dc3ebbe5
Reviewed-on: https://review.whamcloud.com/37304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12460 llite: replace lli_trunc_sem

lli_trunc_sem can lead to a deadlock.

vvp_io_read_start takes lli_trunc_sem, and can take
mmap_sem in the direct i/o case, via
generic_file_read_iter->ll_direct_IO->get_user_pages_unlocked

vvp_io_fault_start is called with mmap_sem held (taken in
the kernel page fault code), and takes lli_trunc_sem.

These aren't necessarily the same mmap_sem, but can be if
you mmap a lustre file, then read into that mapped memory
from the file.

These are both 'down_read' calls on lli_trunc_sem so they
don't directly conflict, but if vvp_io_setattr_start() is
called to truncate the file between these, it does
'down_write' on lli_trunc_sem.  As semaphores are queued,
this down_write blocks subsequent reads.

This means if the page fault has taken the mmap_sem,
but not yet the lli_trunc_sem in vvp_io_fault_start,
it will wait behind the lli_trunc_sem down_write from
vvp_io_setattr_start.

At the same time, vvp_io_read_start is holding the
lli_trunc_sem and waiting for the mmap_sem, which will not
be released because vvp_io_fault_start cannot get the
lli_trunc_sem because the setattr 'down_write' operation is
queued in front of it.

Solve this by replacing with a hand-coded semaphore, using
atomic counters and wait_var_event().  This allows a
special down_read_nowait which ignores waiting down_write
operations.  This combined with waking up all waiters at
once guarantees that down_read_nowait can always 'join'
another down_read, guaranteeing our ability to take the
semaphore twice for read and avoiding the deadlock.
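
The counter scheme described above can be modelled in userspace as a
rough sketch.  Everything below is hypothetical: the struct, the
function names and the non-blocking "try" style are illustration
only, with C11 atomics standing in for the kernel atomics and an
immediate failure return standing in for a wait_var_event() sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not the structure used in llite. */
struct trunc_sem_model {
	atomic_int readers;	/* >0: read holders, -1: write holder, 0: free */
	atomic_int waiters;	/* writers queued for the semaphore */
};

static void model_init(struct trunc_sem_model *sem)
{
	assert(sem != NULL);
	atomic_init(&sem->readers, 0);
	atomic_init(&sem->waiters, 0);
}

/* Increment *v unless it is negative (i.e. held for write). */
static bool inc_unless_negative(atomic_int *v)
{
	int cur = atomic_load(v);

	while (cur >= 0) {
		/* on failure, cur is refreshed with the current value */
		if (atomic_compare_exchange_weak(v, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Ordinary reader: defers to queued writers. */
static bool model_try_down_read(struct trunc_sem_model *sem)
{
	if (atomic_load(&sem->waiters) > 0)
		return false;	/* kernel code would sleep here */
	return inc_unless_negative(&sem->readers);
}

/* "nowait" reader: ignores queued writers, so it can join a reader. */
static bool model_try_down_read_nowait(struct trunc_sem_model *sem)
{
	return inc_unless_negative(&sem->readers);
}

static void model_up_read(struct trunc_sem_model *sem)
{
	atomic_fetch_sub(&sem->readers, 1);
}

/* A writer first announces itself, then tries to claim the semaphore. */
static void model_write_enqueue(struct trunc_sem_model *sem)
{
	atomic_fetch_add(&sem->waiters, 1);
}

static bool model_write_try_acquire(struct trunc_sem_model *sem)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&sem->readers, &expected, -1))
		return false;	/* readers still hold the semaphore */
	atomic_fetch_sub(&sem->waiters, 1);
	return true;
}
```

In this model a queued writer turns away ordinary readers, but a
"nowait" reader can still join an existing read hold, which is the
property the deadlock fix relies on.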

I'd like there to be a better way to fix this, but I
haven't found it yet.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibd3abf4df1f1f6f45e440733a364999bd608b191
Reviewed-on: https://review.whamcloud.com/35271
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-8066 lfsck: use underscores in lfsck status files

Use underscores to separate words in the lfsck_layout
and oi_scrub files instead of hyphens, to match the use
in lfsck_namespace and consistency with other files.

Test-Parameters: trivial testlist=sanity-lfsck,sanity-scrub
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5306d82a4e6056a5098d1753f53cb4ee4e2540e5
Reviewed-on: https://review.whamcloud.com/36715
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12928 tests: start running recovery-small 136

recovery-small test 136 was not run due to repeated crashes
when run with SELinux and shared secret key enabled. The
crash was fixed and we need to start running this test
again meaning recovery-small test 136 needs to be removed
from the ALWAYS_EXCEPT list.

Test-Parameters: trivial
Test-Parameters: envdefinitions=SHARED_KEY=true clientselinux testlist=recovery-small
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: If7e2e7932c3916f0588f943404b59aa5653bbf17
Reviewed-on: https://review.whamcloud.com/37009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-10664 tests: fix MPI tests in dom-performance.sh

Run the MPI tests in dom-performance.sh as mpiuser
instead of root.

Test-Parameters: trivial mdssizegb=20 testlist=dom-performance
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ief3036434191a0bc153d5c8e380183d5e5067dc4
Reviewed-on: https://review.whamcloud.com/37044
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13039 quota: Ensure local buffer is null terminated

Found via KASAN, copy_from_user may not set null
terminator.
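
A minimal userspace sketch of the pattern, assuming a fixed-size
local buffer.  The helper name is hypothetical and memcpy() stands in
for copy_from_user(); neither adds a terminator on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size - 1 bytes and always terminate the result.
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t copy_and_terminate(char *dst, size_t dst_size,
				 const char *src, size_t count)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	if (count > dst_size - 1)
		count = dst_size - 1;	/* leave room for the terminator */
	memcpy(dst, src, count);
	dst[count] = '\0';		/* the copy itself never does this */
	return count;
}
```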

Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I20de911c2b2d50a1715a27e3edfe65442eaa2be6
Reviewed-on: https://review.whamcloud.com/36899
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13005 lnet: discard LNetEQGet and LNetEQWait

These interfaces are never used and are not particularly useful,
so discard them.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iaf2bc9ec2638820c3e4334e40cf2cf6993237f7d
Reviewed-on: https://review.whamcloud.com/36840
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13004 ptlrpc: Allow BULK_BUF_KIOV to accept a kvec

Bulk descriptors of type PTLRPC_BULK_BUF_KIOV are comprised
of a list of page+offset+len.
If the calling code actually has a virtual-address+len, it
cannot currently use BULK_BUF_KIOV and must use BULK_BUF_KVEC.

However it is quite easy to convert virtual-address+len
to a list of page+offset+len.

So we can add a ->add_iov_frag interface for KIOV descriptors, and
then we will be able to use KIOV descriptors for everything. The
caller must make sure to allocate a large enough descriptor, taking
into account the size of each expected kvec.
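
The conversion can be sketched in userspace as splitting an address
range at page boundaries.  The names and the fixed 4096-byte page
size are assumptions for illustration; real kernel code would also
have to look up the struct page behind each fragment, which this
sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical stand-in for one page+offset+len entry. */
struct frag_model {
	uintptr_t    page_base;	/* page-aligned address */
	unsigned int offset;	/* offset of the data within the page */
	unsigned int len;	/* bytes of data in this page */
};

/*
 * Split [addr, addr + len) into per-page fragments.
 * Returns the number of fragments, or -1 if frags[] is too small.
 */
static int split_into_frags(uintptr_t addr, size_t len,
			    struct frag_model *frags, int max_frags)
{
	int n = 0;

	assert(frags != NULL);
	while (len > 0) {
		unsigned int offset = (unsigned int)(addr & (MODEL_PAGE_SIZE - 1));
		size_t chunk = MODEL_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;
		if (n == max_frags)
			return -1;	/* descriptor was sized too small */
		frags[n].page_base = addr - offset;
		frags[n].offset = offset;
		frags[n].len = (unsigned int)chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

A buffer that is not page aligned can need one more fragment than
its length alone suggests, which is why the descriptor has to be
sized with each kvec in mind.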

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: If8bc5dc9f6e89a196bd72d3ac9b88c4ea5da83d1
Reviewed-on: https://review.whamcloud.com/36824
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12865 tests: fix sanity 160f to be more robust

The sanity test_160f test was failing intermittently because the first
Changelog user ("cl6") was being unregistered in some cases when it
set changelog_max_idle_time=10, but the test slept for 9s and then did
some operations that could be slow.  In rare cases the test runs too
long and the MDS evicts the "good" user along with the bad user:

   MDD0000: Force deregister of ChangeLog user cl7 idle more than 35s
   MDD0000: Force deregister of ChangeLog user cl6 idle more than 11s

Change the test sleep interval to be half of the max_idle limit so
that there is no risk of the "good" Changelog user being evicted.

Add some logging to the test so that it is easier to correlate test
script actions with events in the MDS debug log.

Fixes: 31fef6845e8b ("LU-10680 mdd: create gc thread when no current transaction")
Test-Parameters: trivial envdefinitions=ONLY=160 testlist=sanity,sanity
Test-Parameters: envdefinitions=ONLY=160 mdscount=2 testlist=sanity,sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0e4c9c271d98a2716f848e75676780b0383ebbe5
Reviewed-on: https://review.whamcloud.com/36468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-8130 lu_object: factor out extra per-bucket data

The hash tables managed by lu_object store some extra
information in each bucket in the hash table.  This prevents the use
of resizeable hash tables, so lu_site_init() goes to some trouble
to try to guess a good hash size.

There is no real need for the extra data to be closely associated with
hash buckets.  There is a small advantage as both the hash bucket and
the extra information can then be protected by the same lock, but as
these locks have low contention, that should rarely be noticed.

The extra data, such as an lru list and a wait_queue head, is updated
frequently and accessed rarely.  There could just be a single copy of
this data for the whole array, but on a many-cpu machine, that could
become a contention bottleneck.  So it makes sense to keep multiple
shards and combine them only when needed.  It does not make sense to
have many more copies than there are CPUs.

This patch takes the extra data out of the hash table buckets and
creates a separate array, which never has more entries than twice the
number of possible cpus.  As this extra data contains a
wait_queue_head, which contains a spinlock, that lock is used to
protect the other data (counter and lru list).

The code currently uses a very simple hash to choose a
hash-table bucket:

(fid_seq(fid) + fid_oid(fid)) & (CFS_HASH_NBKT(hs) - 1)

There is no documented reason for this and I cannot see any value in
not using a general hash function. We can use hash_32() and hash_64()
on the fid value with a random seed created for each lu_site. The
hash_*() functions were picked over the jhash() functions since
they perform much better.
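
As a rough userspace comparison, the sketch below puts the old
sum-and-mask bucket choice next to a seeded multiplicative hash.  The
*_model helpers imitate the multiplicative hashes in the Linux kernel
headers; how the fid fields and the seed are combined here is an
assumption for illustration, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32_MODEL 0x61C88647u
#define GOLDEN_RATIO_64_MODEL 0x61C8864680B583EBull

/* Multiplicative hashes: keep the top 'bits' bits of the product. */
static uint32_t hash_32_model(uint32_t val, unsigned int bits)
{
	assert(bits > 0 && bits <= 32);
	return (uint32_t)(val * GOLDEN_RATIO_32_MODEL) >> (32 - bits);
}

static uint32_t hash_64_model(uint64_t val, unsigned int bits)
{
	assert(bits > 0 && bits <= 32);
	return (uint32_t)((val * GOLDEN_RATIO_64_MODEL) >> (64 - bits));
}

/* Old scheme from the text: sum of seq and oid, masked to the table. */
static uint32_t old_bucket_model(uint64_t seq, uint32_t oid, uint32_t nbkt)
{
	return (uint32_t)((seq + oid) & (nbkt - 1));
}

/* Seeded scheme (hypothetical mixing); nbkt must be a power of two. */
static uint32_t new_bucket_model(uint64_t seq, uint32_t oid,
				 uint32_t seed, uint32_t nbkt)
{
	uint32_t h = hash_32_model(seed ^ oid, 32);

	h ^= hash_64_model(seq, 32);
	return h & (nbkt - 1);
}
```

With the old scheme any two fids whose seq and oid have the same sum
share a bucket, for example (seq=1, oid=2) and (seq=2, oid=1).  The
seeded hash has no such structural collision, and the per-site seed
makes the layout differ from one lu_site to the next.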

The lock ordering requires that a hash-table lock cannot be taken
while an extra-data lock is held.  This means that in
lu_site_purge_objects() we must first remove objects from the lru
(with the extra information locked) and then remove each one from the
hash table.  To ensure the object is not found between these two
steps, the LU_OBJECT_HEARD_BANSHEE flag is set.

As the extra info is now separate from the hash buckets, we cannot
report statistics from both at the same time.  I think the lru
statistics are probably more useful than the hash-table statistics, so
I have preserved the former and discarded the latter.  When the
hashtable becomes resizeable, those statistics will be irrelevant.

As the lru and the hash table are now managed by different locks
we need to be careful to prevent htable_lookup() finding an
object that lu_site_purge_objects() is purging.
To help with this we introduce a new lu_object flag to say
that an object is being purged.  Once set, the object will
be quickly removed from the hash table, and is already
removed from the lru.

Change-Id: I2a7402a348377d3b17f76e8617216e5b7ff9b99a
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12756 lnet: Refactor lnet_compare_routes

Restrict lnet_compare_routes() to only comparing the lnet_route
objects passed as arguments. This saves us from doing unnecessary
calls to lnet_find_best_lpni_on_net().

Rename lnet_compare_peers to lnet_compare_gw_lpnis to better
reflect what is done by this routine.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2d7b5dcc2aacb371b21908ceebf2dd6a349fa74c
Reviewed-on: https://review.whamcloud.com/36621
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12756 lnet: Remove unused vars in lnet_find_route_locked

The lp and lp_best variables are not needed in
lnet_find_route_locked().

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I61a7097ab66703a1af1346c7301b9efc7e4392c9
Reviewed-on: https://review.whamcloud.com/36620
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12756 lnet: Avoid extra lnet_remotenet lookup

We can keep track of the lnet_remotenet object associated with the
"best" lnet_peer_net, and pass that lnet_remotenet directly to
lnet_find_route_locked().

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib9808ca885c698ba6c73c5243fbce8b3f499b790
Reviewed-on: https://review.whamcloud.com/36536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13115 mdt: handle mdt_pack_sectx_in_reply() errors

mdt_pack_secctx_in_reply() calls mo_xattr_get(), which can
return -ENOENT.  That error should be checked and the function
should exit through the error path if needed.

In a DNE environment an lu_object may lose its LOHA_EXISTS flag
during osp_xattr_get().  That should be handled so that we don't
proceed with code paths meant for existing objects.

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=185a testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Test-Parameters: clientselinux mdtcount=4 testgroup=review-dne-selinux
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I55ad666f58dd3fae3ed097018aa23ed94818d246
Reviewed-on: https://review.whamcloud.com/37148
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9679 llite: fix possible race with module unload.

lustre_fill_super() calls client_fill_super() without holding a
reference to the module containing client_fill_super.  If that
module is unloaded at a bad time, this can crash.

To be able to get a reference to the module using
try_module_get(), we need a pointer to the module.

So replace
  lustre_register_client_fill_super() and
  lustre_register_kill_super_cb()
with a single
  lustre_register_super_ops()
which is also passed a module pointer.

Then use a spinlock to ensure the module pointer isn't removed
while try_module_get() is running, and use try_module_get() to
ensure we have a reference before calling client_fill_super().

Now that we take the reference to the module before calling
client_fill_super(), we don't need to take one inside
lustre_fill_super().

Linux-commit: d487fe31f49e78f3cdd826923bf0c340a839ffd8

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9474622f2a253d9882eae3f0578c50782dd11ad4
Reviewed-on: https://review.whamcloud.com/37020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

New tag 2.13.51

Change-Id: I2ce973d9b599ed426b56d7892176205ba6822910

LU-12923 lnet: Replace CLASSERT() with BUILD_BUG_ON()

This patch replaces CLASSERT() with kernel defined
BUILD_BUG_ON()

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I94292ca4729c19e0651fad285943ae02584afc03
Reviewed-on: https://review.whamcloud.com/37113
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12923 lustre: Replace CLASSERT() with BUILD_BUG_ON()

This patch replaces remaining CLASSERT() with kernel defined
BUILD_BUG_ON()

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie23846f8d67cac1872bda9c7e20fe9bc888bf365
Reviewed-on: https://review.whamcloud.com/37111
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-13090 utils: fix lfs_migrate -p for file with pool

If "lfs_migrate -p <pool>" is run to migrate a file with an existing
pool, the given pool is overridden by the existing pool from the file
during migration. Fix this to use the OST pool requested by the user.

Don't print a warning about deprecated -n option if --dry-run is used.

If a pool is specified, use it with "lfs df" to find OST free space.

Change temp filename to work better with new DNE "crush" hash.

Don't return an error if falling back to rsync and no links are found.

Add test for "lfs_migrate -p" and update man page and usage to match.
Clean up debug-level helpers in test-framework.sh.

Test-Parameters: trivial testlist=ost-pools
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ief69a620fc969aeff24ec0633a3314c3b83ebbe5
Reviewed-on: https://review.whamcloud.com/37067
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13088 ldlm: Fix sleeping function called in atomic

target_recovery_overseer() can sleep while holding a spinlock, which
triggers a BUG warning.

It is easily fixed by dropping the spinlock before waiting. In the
case where the task waits, no useful information that could be
protected by the spinlock is held, so nothing can be lost by dropping
it.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8bb3d02523b5dcfadac19f01ccb736d7b7f28239
Reviewed-on: https://review.whamcloud.com/37063
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13059 kernel: kernel update RHEL7.7 [3.10.0-1062.9.1.el7]

Update RHEL7.7 kernel to 3.10.0-1062.9.1.el7.

Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7

Change-Id: I11fc7a2c382a5c234698bfb30a38a08ed29fef03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-930 doc: fix formatting errors in lfs_migrate.1

Add missing .TP sections for the command-line options.
Remove duplicate EXAMPLES section and '--yes' from bad merge.

Test-Parameters: trivial
Fixes: 99d7a8ed43b ("LU-8207 scripts: add auto-stripe option to lfs_migrate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5de290e00c5fd718e53ac0fc801d44e1cf3ebbe5
Reviewed-on: https://review.whamcloud.com/36959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12637 kernel: new kernel [RHEL 8.1 4.18.0-147.3.1.el8_1]

This patch makes changes to support new RHEL 8.1 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.1 \
envdefinitions=SANITY_EXCEPT="411" \
testlist=sanity

Change-Id: Ifcc0a15c3ad9afa99b670641f91b23c1a5c0668e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: use lnet_accept_magic, not le32_to_cpu.

This le32_to_cpu() looks wrong, as the argument is a CPU value, not
le32, and the value is being compared to something that might be
le32. Previous code used lnet_accept_magic() for tests on 'magic',
so it seems to make sense to use lnet_accept_magic() here too.
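
A userspace sketch of a byte-order-tolerant magic test, assuming the
helper accepts the constant in either byte order.  The *_model names
are hypothetical stand-ins, not the LNet functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32_model(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/*
 * Accept 'magic' if it matches 'constant' as-is or byte-swapped, so the
 * test works whichever endianness the sender used.
 */
static bool accept_magic_model(uint32_t magic, uint32_t constant)
{
	return magic == constant || magic == swab32_model(constant);
}
```

Because both byte orders are checked explicitly, the caller never has
to decide whether the value in hand is already in CPU order.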

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I3f04bb087d4ae3d6785e77072b51132f9440bd32
Reviewed-on: https://review.whamcloud.com/36857
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: lnet_startup_lndnet: avoid use-after-free

If lnet_startup_lndni() fails it will free 'ni' (via lnet_ni_free()).
So we mustn't de-reference it in the LASSERT() in that case

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I01e35013e028a8f95f169e25aeb0c344b2310380
Reviewed-on: https://review.whamcloud.com/36855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: change list_for_each in ksocknal_debug_peerhash

This list_for_each() loop searches for a particular entry,
then acts on it.  It currently acts after the loop by testing
if the variable is NULL.  When we convert to list_for_each_entry()
it won't be NULL.

Change the code so the acting happens inside the loop.
list_for_each_entry() {
    if (this isn't it)
        continue;
    act on entry;
    goto done; // break out of 2 loops
}

Note that indenting is deliberately left unchanged,
as the next patch will change the 2 loops to a single loop,
after which the current indents will be correct.
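
The reason the cursor is not NULL after the loop can be seen in a
small userspace model of the list macros.  The *_model names are
hypothetical, and the macro takes an explicit type argument instead
of using typeof so that it stays within standard C.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry_model(pos, head, type, member)		\
	for (pos = container_of_model((head)->next, type, member);	\
	     &pos->member != (head);					\
	     pos = container_of_model(pos->member.next, type, member))

struct peer_model {
	int id;
	int hits;
	struct list_head link;
};

static void list_init_model(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail_model(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Act on the matching entry inside the loop; return 1 if one was found. */
static int touch_peer_model(struct list_head *head, int id)
{
	struct peer_model *p;

	list_for_each_entry_model(p, head, struct peer_model, link) {
		if (p->id != id)
			continue;
		p->hits++;	/* act on entry */
		return 1;
	}
	return 0;
}

/* After a full walk the cursor is computed from the head, not NULL. */
static int cursor_is_null_after_walk(struct list_head *head)
{
	struct peer_model *p;

	list_for_each_entry_model(p, head, struct peer_model, link) {
		/* nothing to do, just run to the end */
	}
	return p == NULL;
}
```

When the walk runs to completion the cursor is the container
address derived from the list head itself, so a NULL test after the
loop can never detect "not found".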

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idea32bf2ab4037650d6698d4f82f6b6764b4d1b2
Reviewed-on: https://review.whamcloud.com/36836
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: discard struct ksock_peer

struct ksock_peer is declared in a forward-ref, but
never defined or used. Let's remove it, and change
some spaces to TABs while we are there.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I8a86a77a5cad606a374e60a5b8920be28308587d
Reviewed-on: https://review.whamcloud.com/36835
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 lnet: prepare to make lnet_lnd const.

Preferred practice is for structs containing function
pointers to be 'const'. Such structs are generally tempting
attack vectors, and making them const allows linux to place
them in read-only memory, thus reducing the attack surface.

'struct lnet_lnd' is mostly function pointers, but contains
one writable field - a list_head.

Rather than keeping registered lnds in a linked-list, we can place
them in an array indexed by type - type numbers are at most 15 so
this is not a burden.

With these changes, no part of an lnet_lnd is ever modified.
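
A minimal sketch of the array-indexed registry, assuming type numbers
0..15.  The struct, its single callback and the function names are
hypothetical; the point is only that the registered object can be
const because the registry keeps pointers in its own array rather
than in a list_head embedded in the object.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LND_TYPES 16	/* assumed: type numbers 0..15 */

/* Hypothetical stand-in: function pointers only, no writable fields. */
struct lnd_model {
	unsigned int type;
	int (*startup)(void);
};

/* The registry is writable; the objects it points at are not. */
static const struct lnd_model *registered_model[MODEL_MAX_LND_TYPES];

static int register_lnd_model(const struct lnd_model *lnd)
{
	assert(lnd != NULL);
	if (lnd->type >= MODEL_MAX_LND_TYPES ||
	    registered_model[lnd->type] != NULL)
		return -1;	/* out of range or already registered */
	registered_model[lnd->type] = lnd;
	return 0;
}

static const struct lnd_model *find_lnd_model(unsigned int type)
{
	return type < MODEL_MAX_LND_TYPES ? registered_model[type] : NULL;
}

static int demo_startup(void)
{
	return 0;
}

/* Can live in read-only memory: nothing ever writes to it. */
static const struct lnd_model demo_lnd = {
	.type = 2,
	.startup = demo_startup,
};
```

Lookup becomes a bounds check and an array index instead of a list
walk.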

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I08c7df551109e05ca4a3cef866e8df737d1a1ad4
Reviewed-on: https://review.whamcloud.com/36830
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-8130 ldlm: simplify ldlm_ns_hash_defs[]

As the ldlm_ns_types are dense, we can use the type as
the index to the array, rather than searching through
the array for a match.
We can also discard nsd_hops as all hash tables now
use the same hops.
This makes the table smaller and the code simpler.

Change-Id: I2aebb9d533d676bed51a7422801545be4fbb7e1e
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>

LU-12542 handle: rename ops to owner

Now that portals_handle_ops contains only a char*,
it is functioning primarily to identify the owner of each handle.
So change the name to h_owner, and the type to const char*.

Note: this h_owner is now quite different from the similar h_owner
in the server code. When server code is merged the
"med" pointer should be stored in the "mfd" and validated separately.

Change-Id: Ie2e9134ea22c4929683c84bf45c41b96b348d0a2
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35798
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9091 sysfs: use string helper like functions for sysfs

For a very long time the Linux kernel has supported the function
memparse() that allowed the passing in of memory sizes with the
suffix set of K, M, G, T, P, E. Lustre adopted this approach
with its proc / sysfs implementation. The difference being that
lustre expanded this functionality to allow sizes with a
fractional component, such as 1.5G for example. The code used to
parse for the numerical value is heavily tied into the debugfs
seq_file handling and stomps on the passed in buffer which you
can't do with sysfs files.

Similar functionality to what Lustre does today exists in newer
linux kernels in the form of string helpers. Currently the
string helpers only convert a numerical value to human readable
format. A new function, string_to_size(), was created that takes
a string and turns it into a numerical value. This enables the
use of string helper suffixes, i.e. MiB, kB etc., with the lustre
tunables and we can now support base-10 units, i.e. MB, kB, as
well. Already string helper suffixes are used for debugfs files
so I expect this to be adopted over time so it should be
encouraged to use string_to_size() for newer lustre sysfs files.

At the same time we want to preserve the original behavior of
using the suffix set of K, M, G, T, P, E. To do this we create
the function sysfs_memparse() that supports the new string helper
suffixes as well as the older set of suffixes. This new code is
also much simpler than the current code.
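
The kind of parsing described here can be sketched in userspace as
follows.  This is not sysfs_memparse(): the function name, the suffix
table and the choice to treat the legacy single-letter suffixes as
powers of 1024 are assumptions for illustration, and overflow of very
large values is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

struct unit_model {
	const char *suffix;
	uint64_t mult;
};

static const struct unit_model units_model[] = {
	{ "", 1ull }, { "B", 1ull },
	/* legacy single letters, assumed binary */
	{ "K", 1ull << 10 }, { "M", 1ull << 20 },
	{ "G", 1ull << 30 }, { "T", 1ull << 40 },
	/* binary string-helper style suffixes */
	{ "KiB", 1ull << 10 }, { "MiB", 1ull << 20 },
	{ "GiB", 1ull << 30 }, { "TiB", 1ull << 40 },
	/* decimal string-helper style suffixes */
	{ "kB", 1000ull }, { "MB", 1000000ull },
	{ "GB", 1000000000ull }, { "TB", 1000000000000ull },
};

/* Parse "<digits>[.<digits>][suffix]" into bytes; 0 on success, -1 on error. */
static int parse_size_model(const char *s, uint64_t *out)
{
	uint64_t whole = 0, frac = 0, frac_div = 1;
	size_t i;

	assert(s != NULL && out != NULL);
	if (!isdigit((unsigned char)*s))
		return -1;
	while (isdigit((unsigned char)*s))
		whole = whole * 10 + (uint64_t)(*s++ - '0');
	if (*s == '.') {
		s++;
		while (isdigit((unsigned char)*s)) {
			/* keep six digits so frac * mult stays in range */
			if (frac_div < 1000000ull) {
				frac = frac * 10 + (uint64_t)(*s - '0');
				frac_div *= 10;
			}
			s++;
		}
	}
	for (i = 0; i < sizeof(units_model) / sizeof(units_model[0]); i++) {
		if (strcmp(s, units_model[i].suffix) == 0) {
			*out = whole * units_model[i].mult +
			       frac * units_model[i].mult / frac_div;
			return 0;
		}
	}
	return -1;	/* unknown suffix */
}
```

Working on a const string and returning the value through an output
parameter means the caller's buffer is never modified, which is the
constraint the text gives for sysfs files.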

Change-Id: Ia437db44f2a987aa11ab4ff3e9df23e9aeba04d7
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/35658
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12477 libcfs: Remove obsolete config checks

Remove a few config checks for kernel versions we no longer
support. Only 3.10+ kernels are now supported.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4f4177c512a37fb7a78bab69aa89aa7199ab30b4
Reviewed-on: https://review.whamcloud.com/35342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12756 lnet: Avoid comparing route to itself

The first iteration of the route selection loop compares the first
route in the list with itself.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1a51b04b248dbaa9a47a7a69e2995c21e515fb2b
Reviewed-on: https://review.whamcloud.com/36535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12756 lnet: Refactor lnet_find_best_lpni_on_net

Replace lnet_send_data argument.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ic346eaf6870f2a7c68c7f4c45d424f4f924370d9
Reviewed-on: https://review.whamcloud.com/36534
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13069 obdclass: don't skip records for wrapped catalog

osp_sync_thread() uses opd_sync_last_catalog_idx as the starting
point of catalog processing. It is also used in llog_cat_process_cb()
to skip records during processing. When the catalog is wrapped,
processing starts from the second part of the catalog and then moves
to the first part. So the first part would be skipped in
llog_cat_process_cb() based on lpd_startcat.

osp_sync_thread() restarts the processing loop with
opd_sync_last_catalog_idx. For a wrapped catalog it increases the
last index, and llog_process_thread() increases it once more. This
leads to skipped records in the catalog, which would not be processed.
The patch fixes these issues.
It also adds sanity tests 135 and 136 as regression tests.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-8053,LUS-8236
Change-Id: Ic75af1bf4468b9ef2de32cbf6d834b6a81376e88
Reviewed-on: https://review.whamcloud.com/36996
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9679 lnet: always check return of try_module_get()

try_module_get() can fail, so the return value should be checked.
If we *know* that we already hold a reference, __module_get()
should be used instead.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id526f9ae3829a50fe7df7069230804322cd4558e
Reviewed-on: https://review.whamcloud.com/36854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9679 obdclass: don't manage module refs in open/close.

Core Linux code for managing char-devs ensures that the relevant
module is held active while a char-dev is open - see cdev_get()
and cdev_put().
So there is no need for lustre/obd_class to manage the module
ref count as well.

As this is all that obd_class_open and obd_class_close do, those
functions can be removed.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I84b0dc81c830cefc2383f184d12beeb2cfa22404
Reviewed-on: https://review.whamcloud.com/37021
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10467 osp: use wait_event_idle_timeout()

osp has 4 LWI_TIMEOUT() calls that pass an on_timeout
function.
In each case, the on_timeout function returns 1, so this
is equivalent to using wait_event_idle_timeout(), and
calling the function if the timeout happened.

One of the two functions passed does nothing except return 1, so it
can be ignored.
The other function, used only once, contains a CDEBUG message,
so we now call that when wait_event_idle_timeout() returns 0.

Change-Id: Ic153266e412d684c4aa6c7204ff5755d991d83c6
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35988
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10467 ptlrpc: convert waiting in sptlrpc_req_refresh_ctx()

The l_wait_event call in sptlrpc_req_refresh_ctx() is somewhat complex
as it is passed both an on_timeout and on_signal handler, and
on_timeout doesn't return a constant value.

The net effect is to wait for the timeout with signals blocked. Then,
if the condition still isn't true, run the on_timeout handler and if
that returns zero, wait again - indefinitely this time - and allow
some signals. If a signal was received, call the on_signal handler.

This is fairly straightforward to write out in C, as shown in the
patch.

Change-Id: I7f9cfb8a8ff234bed4045ab21b53d018337cd615
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35987
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10467 lustre: use wait_event_idle_timeout() as appropriate.

If l_wait_event() is passed an lwi initialised with
one of
   LWI_TIMEOUT_INTR( time, NULL, NULL, NULL)
   LWI_TIMEOUT_INTR( time, NULL, LWI_ON_SIGNAL_NOOP, NULL)
   LWI_TIMEOUT( time, NULL, NULL)
where time != 0, then it behaves much like
wait_event_idle_timeout().
All signals are blocked, and it waits either for the
condition to be true, or for the timeout (in jiffies).

Note that LWI_ON_SIGNAL_NOOP has no effect here.

l_wait_event() returns 0 when the condition is true, or -ETIMEDOUT
when the timeout occurs.  wait_event_idle_timeout() instead returns a
positive number when the condition is true, and 0 when the timeout
occurs.  So in the cases where return value is used, handling needs to
be adjusted accordingly.
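
The return-value adjustment described above can be sketched in user space
as a small adapter. This is only an illustration: wait_rc_to_errno() is a
hypothetical name, not a function from the patch or from the kernel, and
it assumes the input is the result of a wait_event_idle_timeout()-style
call (0 on timeout, a positive remaining time otherwise).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: translate a wait_event_idle_timeout()-style result
 * (0 = timed out, > 0 = condition became true) into the l_wait_event()
 * convention (0 = condition true, -ETIMEDOUT = timed out).
 */
static int wait_rc_to_errno(long remaining)
{
	/* An uninterruptible timed wait is assumed never to go negative. */
	assert(remaining >= 0);

	return remaining == 0 ? -ETIMEDOUT : 0;
}
```

A caller that used to test for -ETIMEDOUT could keep that test by passing
the new result through an adapter like this; callers that ignore the
return value would need no change.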

Note that in some cases where cfs_fail_val gives the time to wait for,
the current code re-tests the wait time against zero as cfs_fail_val
can change asynchronously.  This is because l_wait_event() behaves
quite differently if the timeout is zero.

The new code doesn't need to do that as wait_event_idle_timeout()
treats 0 as just a very short wait, which is exactly the correct
behavior here.

This patch also removes a comment which is no longer meaningful
(CAN_MATCH) and corrects a debug message which reported the wait time
as "seconds" rather than the correct "jiffies".

This patch doesn't change the timed wait in cl_sync_io_wait().
That is a bit more complicated, so it is left to a separate patch.

Change-Id: I632afc290935e321926f45b144d5367799a01381
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35977
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13087 target: init lcd last transno from reply data

Init lcd_last_transno value from reply data to keep it
valid so tgt_release_reply_data() will keep a slot with
the highest transno and on-disk data is not lost.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id31b3b250616fb6afd3d145c31b12af30ac86be8
Reviewed-on: https://review.whamcloud.com/37060
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13061 osp: check catlog FID after reading in

In osp_sync_llog_init(), the catlog FID read from "CATALOGS"
should be checked for sanity before it is used.

Change-Id: I4342b21b7d5c6d408a9ab52a1e30815ae1d5f563
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36998
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13077 pfl: cleanup xattr checking

Cleanup xattr checking in mdd and lod layers for PFL.

Reported-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2841b615ee304785fbf316b829d8280eefc3878a
Reviewed-on: https://review.whamcloud.com/37010
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13043 quota: remove annoying message in osd_declare_inode_qid()

The admin shouldn't be getting console error messages when a user goes
over quota (this would be happening continuously at some sites).

In some call paths, the "*flags" parameter may be NULL; don't try to
access it in that case.

As a general cleanup, move the QUOTA_FL_* flags over to a named enum
"enum osd_quota_local_flags" so that it is easier to see what this field
actually holds, rather than a totally generic "int *flags" argument that
has to be hunted through the code.
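
A minimal user-space sketch of the two ideas above, a named enum instead
of a bare "int *flags" and a NULL check before the pointer is touched.
The enum name, the flag names and note_over_quota() are hypothetical and
chosen for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names; the real enum osd_quota_local_flags may differ. */
enum quota_local_flags_example {
	EX_QUOTA_FL_OVER_USR = 1 << 0,
	EX_QUOTA_FL_OVER_GRP = 1 << 1,
};

/* Record an over-quota condition only if the caller supplied a flags slot. */
static void note_over_quota(enum quota_local_flags_example *flags,
			    enum quota_local_flags_example which)
{
	assert(which != 0);

	/* Some callers pass NULL because they do not care about the flags. */
	if (flags != NULL)
		*flags |= which;
}
```

With the named type in the prototype, a reader can tell what the argument
holds without searching the callers.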

Fixes: d30f9e6b6c5d ("LU-11425 quota: support quota for DoM")
Change-Id: Id5686ecdb8a943e48a2888067e321f83b8569188
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36906
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>

LU-12781 ptlrpc: fix inline reply buffer grow

In req_capsule_server_grow(), the reply buffer can be grown without
re-allocation if it is already large enough. Don't do that, though, if
rs->rs_repbuf is a wrapper, e.g. with security enabled. In that case
re-allocation is still needed.

Re-enable test 272a in sanity.sh with SHARED_KEY

Test-Parameters: mdscount=2 mdtcount=4 envdefinitions=SHARED_KEY=true testlist=sanity,sanity-pfl
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I0632b9513f877bea989b7a61a729e2db488dcfcc
Reviewed-on: https://review.whamcloud.com/36732
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12036 ofd: add "no_precreate" mount option

Add a mount option to disallow object creation on the OST. That
allows an OST to be mounted by the administrator without it being
immediately available for use by clients/applications. This may
be useful if the OST needs to be added to a specific pool first,
or if it is being debugged or similar.

Mount option can be disabled with the obdfilter.*.no_precreate
tunable parameter.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icdb64a4bdd5a66b0e9e6d483e3113b97d53ebbe5
Reviewed-on: https://review.whamcloud.com/36716
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12941 lnet: Add peer level aliveness information

Keep track of the aliveness of a peer so that we can optimize for
situations where an LNet router hasn't responded to a ping. In
this situation we consider all routes down, and we needn't spend time
inspecting each route, or inspecting all of the router's local and
remote interfaces in order to determine the router's aliveness.

Cray-bug-id: LUS-7860
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ie63c1ef40de3ad818639bae6b040923898fd5b46
Reviewed-on: https://review.whamcloud.com/36678
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12898 utils: %llu mismatch with type __u64 on ppcle64

Fix build errors like this one on ppc64le:

BUILDSTDERR: libmount_utils_zfs.c: In function 'zfs_mkfs_opts':
BUILDSTDERR: libmount_utils_zfs.c:573:5: error: format '%llu' expects
argument of type 'long long unsigned int', but argument 4 has type
'__u64' [-Werror=format=]
BUILDSTDERR: mop->mo_device_kb * 1024);

__u64 was treated as an unsigned long long, which breaks the build on
ppc64le, where they are not the same type.

In printf cases, cast to unsigned long long to match the printf format
so the format is compatible with the type and it is guaranteed
not to lose any data.

In the case of sscanf(), replace the call with strtoull() to eliminate
the issue.
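
Both techniques can be sketched in portable user-space C. This is an
illustration under assumptions, not the code from the patch: the u64
typedef stands in for __u64, and format_kb_as_bytes() and parse_u64()
are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for __u64; its underlying C type differs between platforms. */
typedef uint64_t u64;

/* Print a 64-bit value by casting it to the type that "%llu" expects. */
static int format_kb_as_bytes(char *buf, size_t len, u64 device_kb)
{
	return snprintf(buf, len, "%llu",
			(unsigned long long)(device_kb * 1024));
}

/*
 * Parse a decimal 64-bit value with strtoull() instead of sscanf("%llu").
 * Returns 0 on success, -EINVAL on empty input, trailing characters or
 * overflow. Like strtoull() itself, it accepts leading whitespace and a
 * sign; a stricter parser would reject those first.
 */
static int parse_u64(const char *str, u64 *out)
{
	char *end;
	unsigned long long val;

	assert(str != NULL && out != NULL);

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno != 0 || end == str || *end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```

The cast is lossless here because unsigned long long is at least 64 bits
wide, and strtoull() returns that same type, so no format specifier has
to agree with the platform's definition of the 64-bit typedef.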

Test-Parameters: trivial
Change-Id: I02fd82e0be4d756881c15aa9faedb9b40961661a
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/36558
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>

LU-8130 ldlm: add a counter to the per-namespace data

When we change the resource hash to rhashtable we won't have
a per-bucket counter. We could use the nelems global counter,
but ldlm_resource goes to some trouble to avoid having any
table-wide atomics, and hopefully rhashtable will grow the
ability to disable the global counter in the near future.
Having a counter we control makes it easier to manage the
back-reference to the namespace when there is anything in the
hash table.

So add a counter to the ldlm_ns_bucket.

Change-Id: Ic79e96f95d5cacfb5e7bb02350f5f4fafb207b44
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36219
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12904 gss: struct cache_detail readers changed to writers

Linux 5.3 changed struct cache_detail readers to writers
SUNRPC: Track writers of the 'channel' file to improve ...

kernel-commit: 64a38e840ce5940253208eaba40265c73decc4ee

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I7750303937cd6fc560e458efa79f25e521fefec7
Reviewed-on: https://review.whamcloud.com/36580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 lnet: Delete unused nid parsing code

Delete the nid parsing code from liblnetconfig that is no longer used.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6fbb3450756c7976836c3b6731d3ecd9f93cbf8d
Reviewed-on: https://review.whamcloud.com/35310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 tests: Add gni tests to sanity-lnet

Add test-cases to validate handling of gni nids to sanity-lnet.sh

Also add some additional tests to validate error handling.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I7947e237e0d3e12e2e30752bca384cef2b66072c
Reviewed-on: https://review.whamcloud.com/35506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 lnet: Convert lnetctl route add and del

Convert the lnetctl route add and delete commands to utilize the new
capabilities provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ifcaf67575ed1de40c9a3c92f40ec6dca7fd08d9e
Reviewed-on: https://review.whamcloud.com/35308
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 lnet: Convert yaml peer configuration

Convert the yaml peer config handlers to utilize the new capabilities
provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I89a53ded636877661a3600822ca49030c8841540
Reviewed-on: https://review.whamcloud.com/35307
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 lnet: Convert lnetctl peer add and del

Convert the lnetctl peer add and del commands to utilize the new
capabilities provided by the nidstrings library.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I50693a2af6fef2e1ef3b34fd02c7423625cb7665
Reviewed-on: https://review.whamcloud.com/35305
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12410 lnet: Implement method to tokenize nidstrings

The CLI for various lnetctl operations allows the user to specify
multiple, comma separated nidstrings. Implement a common method
for tokenizing nidstrings that can be leveraged by the operations
that require it.
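
As a rough illustration of what such a tokenizer does, the following
user-space sketch splits a comma-separated list in place.
split_nidstrings() is a hypothetical name and its behaviour (skipping
empty tokens, failing when the output array is too small) is an
assumption; the method added to liblnetconfig may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a comma-separated list in place.
 * Stores up to max_tokens pointers in tokens[] and returns the number of
 * non-empty tokens, or -1 if the list holds more than max_tokens.
 * The input buffer is modified: each comma becomes a NUL byte.
 */
static int split_nidstrings(char *list, char **tokens, int max_tokens)
{
	char *cur = list;
	int count = 0;

	assert(list != NULL && tokens != NULL);

	while (cur != NULL) {
		char *comma = strchr(cur, ',');

		if (comma != NULL)
			*comma = '\0';

		/* Skip empty tokens such as the middle one in "a,,b". */
		if (*cur != '\0') {
			if (count == max_tokens)
				return -1;
			tokens[count++] = cur;
		}

		cur = comma != NULL ? comma + 1 : NULL;
	}

	return count;
}
```

Each returned pointer refers into the caller's buffer, so the caller must
pass a writable copy of the command-line argument and keep it alive while
the tokens are in use.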

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2f8ab6d5d9e7c3d5bde3a11b85bdf38fbf6fdf29
Reviewed-on: https://review.whamcloud.com/35505
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13070 mdd: try old format for orphan names during recovery

mdd_orphan_destroy() can loop endlessly because of a compatibility issue
after an upgrade to 2.11 or later. The format for names of orphans in the
PENDING directory
was changed in Lustre 2.11. The old format names are not recognized by
mdd_orphan_destroy() in Lustre 2.11, but compatibility code added to
handle this was incomplete, leading to an endless loop. There's a check
for the old format name, used in mdd_orphan_delete(), but that check
was not included in mdd_orphan_destroy().

This patch adds the compatibility check to mdd_orphan_destroy().

Fixes: a02fd4573fe ("LU-7787 mdd: clean up orphan object handling")
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Cray-bug-id: LUS-8270
Change-Id: I9f42188dcb00f9d536996c14771de7df02502b40
Reviewed-on: https://review.whamcloud.com/37049
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>

LU-13042 tests: give more time in sanity-selinux test_21b

In sanity-selinux test_21b, set sepol refresh time to 1000 seconds
instead of 10. This gives plenty of time for file/dir access tests,
and also cache drop, to complete. Then reset send_sepol to a smaller,
already expired value, to force sepol refresh.

Test-Parameters: trivial
Test-Parameters: clientselinux mdtcount=4 testlist=sanity-selinux envdefinitions=ONLY="21b"
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I57f72faad4bd55736a3240cdefdac2e5814eba79
Reviewed-on: https://review.whamcloud.com/36905
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12787 tests: skip project quota if it is disabled

quota_scan touches project quota in case of errors or logs.
When project quota is not supported, this leads to error:
    Unexpected quotactl error: Operation not supported
    ...
    Some errors happened when getting quota info. Some devices
    may be not working or deactivated. The data in "[]" is inaccurate.

The fix adds a check before touching project quota.

Cray-bug-id: LUS-7811
Test-Parameters: testlist=sanity-quota
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ia733b666d6937ea9e8e99ef856d2ae1246dc44d1
Reviewed-on: https://review.whamcloud.com/36997
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-11607 tests: remove duplicate code lnet-selftest

lnet-selftest.sh and test-framework.sh both have a function
called is_mounted() that checks if the file system is
mounted. Since both functions do and return the same
thing, let's remove the is_mounted() function from
lnet-selftest.

Test-Parameters: trivial
Test-Parameters: fstype=zfs testlist=lnet-selftest,lnet-selftest
Test-Parameters: fstype=ldiskfs testlist=lnet-selftest,lnet-selftest
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I05ce84002cfa8ac96ac4f1e8169fb2233b66f378
Reviewed-on: https://review.whamcloud.com/36965
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12923 libcfs: Use BUILD_BUG_ON() for hash.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file libcfs/libcfs/hash.c
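
A point worth keeping in mind for this kind of conversion, assuming
CLASSERT(cond) requires cond to hold while BUILD_BUG_ON(cond) breaks the
build when cond is true, is that each condition has to be negated. The
user-space sketch below mimics both with C11 static_assert; the EX_*
macros and EXAMPLE_BITS are hypothetical stand-ins, and the real macros
are implemented differently.

```c
#include <assert.h>

/* User-space stand-ins built on C11 static_assert. */
#define EX_CLASSERT(cond) \
	static_assert(cond, "CLASSERT failed: " #cond)
#define EX_BUILD_BUG_ON(cond) \
	static_assert(!(cond), "BUILD_BUG_ON hit: " #cond)

#define EXAMPLE_BITS 10	/* hypothetical constant */

static int example_bits(void)
{
	/* Old style: state what must be true. */
	EX_CLASSERT(EXAMPLE_BITS < 15);

	/* New style: state what must never be true. */
	EX_BUILD_BUG_ON(EXAMPLE_BITS >= 15);

	return EXAMPLE_BITS;
}
```

Both checks cost nothing at run time; changing EXAMPLE_BITS to 15 or more
would make either line stop the compilation.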

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie5dc744fc10b6e5f303fca93d342629e99a2403d
Reviewed-on: https://review.whamcloud.com/36902
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>