git://git.whamcloud.com - fs/lustre-release.git/log

LU-12857 tests: allow clients to be IDLE after recovery

If clients are not connected to an OST when it fails (connection
is IDLE), they do not need to be involved in recovery, so this
should not be considered an error when checking the client state.

Test-Parameters: trivial testlist=recovery-mds-scale env=SLOW=no
Test-Parameters: testlist=conf-sanity
Test-Parameters: testlist=replay-dual,replay-single
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6cfeb718acd233378ed1608f22061bc15c3ebbe5
Reviewed-on: https://review.whamcloud.com/45318
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15126 kernel: RHEL 8.5 server support

This patch makes changes to support RHEL 8.5 release
with kernel 4.18.0-348.2.1.el8_5 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Test-Parameters: trivial fstype=zfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Change-Id: Ie976d8fd3e6fcf8a564eff8a41ad0fd51b2c858c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15222 build: Update ZFS version to 2.0.6

Update ZFS version to 2.0.6. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.0.6

Change-Id: I2a7df45b79f402c3d3bce8b137edd11b5224b576
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45567
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="101j" \
clientdistro=el8.5

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>

LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45501
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15119 tgt: BUG at tgt_brw_read+0x16bf/0x1d80

struct tgt_thread_big_cache {
  local = {{
      lnb_file_offset = 0,
      lnb_page_offset = 0,
      lnb_len = 0,
      lnb_rc = 0,
      lnb_page = 0xffffddee74fae100,
so npages_read becomes 0

Change-Id: Ie2201c9fc6f0350b1c6dcb480cff52f44d5413db
HPE-bug-id: LUS-10510
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/45273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15110 quota: cosmetic changes in PQ

cosmetic changes in PQ:
- make tgt_pool_free and qmt_sarr_pool_free void
- remove outdated comment from qmt_pool_lqes_lookup
- replace tabs with spaces

HPE-bug-id: LUS-9547
Change-Id: If4918b647eed1d971d00c521d010d0c72d349207
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45258
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15079 quota: include qsd_thread_info into mgs thread context

mgs service thread envs do not get supplied with qsd_thread_info, which
may lead to the failure shown below:
(lu_object.h:1274:lu_env_info()) ASSERTION( info ) failed:
(lu_object.h:1274:lu_env_info()) LBUG
Pid: 146951, comm: ll_mgs_0003 3.10.0-957.1.3957.1.3.x4.3.25.x86_64 #1 SMP
Call Trace:
libcfs_call_trace+0x8e/0xf0 [libcfs]
lbug_with_loc+0x4c/0xa0 [libcfs]
qsd_refresh_usage+0x25e/0x2f0 [lquota]
qsd_op_adjust+0x2f1/0x730 [lquota]
osd_object_delete+0x2b2/0x360 [osd_ldiskfs]
lu_object_free.isra.32+0x68/0x170 [obdclass]
lu_site_purge_objects+0x2fe/0x530 [obdclass]
lu_object_find_at+0x371/0xa60 [obdclass]
dt_locate_at+0x1d/0xb0 [obdclass]
llog_osd_open+0x50e/0xf30 [obdclass]
llog_open+0x15a/0x3e0 [obdclass]
llog_origin_handle_open+0x334/0x720 [ptlrpc]
tgt_llog_open+0x33/0xe0 [ptlrpc]
mgs_llog_open+0x46/0x460 [mgs]
tgt_request_handle+0x96a/0x1680 [ptlrpc]

Supply msg service context with qsd_thread_info.

Change-Id: If8664b81e1f64df015dad46ba26c9c1d1e3f54bf
HPE-bug-id: LUS-10334
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15059 nrs: do not overwrite "cmd" in nrs_tbf_rule

"cmd" pointer inside ptlrpc_lprocfs_nrs_tbf_rule_seq_write() and
nrs_tbf_parse_cmd are static. This could cause a double kfree call
because "cmd" could be overwriten by another "nrs_tbf_rule" write
instance.

Let's try to remove the "static" definition.

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I8cd7d9dd0483778c82bbf8711c07e49255983f4b
Reviewed-on: https://review.whamcloud.com/45142
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>

LU-14699 mdd: proactive changelog garbage collection

Currently changelog starts garbage collection when user
exceeds maximum idle timeout, there is also limit by amount
of idle records but it is used only for old changelog users
which have no cur_time field, therefore it is not used at
all nowadays. Another problem is that garbage collection is
started only when changelog is almost full. That causes
often situations when changelog might have very old users
staying much longer than idle timeout and having idle
records above maximum limit consuming space for nothing.

Patch reworks changelog GC in the following way:
- GC starts when changelog is almost full (old way) or
  either idle time or idle records limits are exceeded or
  when (idle_time * idle_records) exceeds its limit as well.
  The latest limit is calculated as:
  (idle_time * idle_records) / 84600 > (1 << 32) which is a
  reasonable heuristic for deciding if a user is "too idle"
  in both cases when lots records being created quickly vs
  user is idle a very long time.
- to avoid the processing of changelog users each time GC is
  checking all conditions both least user record and time
  are tracked when changelog users are initialized or
  purged/canceled. Both values are stored as mdd_changelog
  fields mc_minrec and mc_mintime
- test 160g is changed to test the new approach when idle
  indexes are checked always along with idle time checks
- test 160s is added in sanity.sh to check heuristic approach
  with (idle_time * idle_records) value checking

Fixes: 3442db6faf68 ("LU-7340 mdd: changelogs garbage collection")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6028f3164212a2377a4fc45b60a826c64f859099
Reviewed-on: https://review.whamcloud.com/45068
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14957 mdd: prepare xattrs before migration

In directory migration, the xattrs should be prepared before starting
transaction, otherwise if remote MDT is down, which will cause local
MDT stuck as well.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I79279e7b0c051a7542a71066fffd4ad70f559368
Reviewed-on: https://review.whamcloud.com/44741
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14941 lnet: Fix source specified to routed destination

If a source NI is specified for a send then we should not modify the
destination NID that was passed to lnet_send().

HPE-bug-id: LUS-10301
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie47558d5bce97a0dca30ff7d072dcd39eb903324
Reviewed-on: https://review.whamcloud.com/44730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14940 lnet: Fix source specified send to different net

The destination NI is fixed for all source-specified sends. Thus, in
order for a source-specified send to be considered "local", i.e. a
send that does not require a route, the destination NID must be on
the same net as the specified source.

HPE-bug-id: LUS-10303
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I4847db1d393bbc36def65123f260b928ebbf944e
Reviewed-on: https://review.whamcloud.com/44728
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14939 lnet: Allow specifying a source NID for lnetctl ping

Add a new --source option for lnetctl ping command. This allows the
user to specify a local NI from which to send the ping. This also
ensures that the specified destination NID is also used. Otherwise,
pings to multi-rail peers may end up going to a different peer NI
based on the multi-rail selection algorithm. The ability to specify
a source NI, and thus fix the destination NI, is a great help in
troubleshooting communication issues between multi-rail peers.

Add test to exercise lnetctl ping --source option.

HPE-bug-id: LUS-10296
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I454217b30a92414de537880f076a11a693b1f0b3
Reviewed-on: https://review.whamcloud.com/44727
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14867 doc: a few words about an asterisk in lfs quota

Clarify the difference between an asterisk printed per OST
in verbouse lfs quota output and an asterisk near the whole
filesystem usage.

Change-Id: I778fe1f7b1f6f8d55c311d81bd2b311d82463390
Test-Parameters: trivial
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44357
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14437 gnilnd: use ktime_get_seconds() to get time

Use ktime_get_seconds() to directly get the time inatead of
getting a timespec and converting it.

Fixes: 4b0e495e3c ("LU-14080 gnilnd: updates for SUSE 15 SP2")
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I256855ceb9e038a9960fa76fe6e3bfe63fb16580
Reviewed-on: https://review.whamcloud.com/41679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14188 osd: remove not used iam_container lock

There is an rw_semaphore
struct iam_container {
    ...
        /*
         * read-write lock protecting index consistency.
         */
        struct rw_semaphore     ic_sem;   <<<<<<
        struct dynlock       ic_tree_lock;
        /* Protect ic_idle_bh */
        struct mutex         ic_idle_mutex;
     ...
};

There is initialization
2    234  lustre/osd-ldiskfs/osd_iam.c <<iam_container_init>>
             init_rwsem(&c->ic_sem);

There are wrappers
   3    622  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_lock>>
             down_write(&ic->ic_sem);
   4    627  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_unlock>>
             up_write(&ic->ic_sem);
   5    632  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_lock>>
             down_read(&ic->ic_sem);
   6    637  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_unlock>>
             up_read(&ic->ic_sem);

But this wrappers are not used. And, based on the git history, never been used.
Let's delete this useless code.

Change-Id: Ied1122f034e53fee08888e1091f700bda4507f00
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
HPE-bug-id: LUS-9545
Reviewed-on: https://review.whamcloud.com/40890
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14073 ldiskfs: update ldiskfs patches for Linux 5.9

Minor conflict changes only.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If0b58fc278f6721e44e26c6052509c31068f8e78
Reviewed-on: https://review.whamcloud.com/40397
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9325 obdclass: make niduuid for lustre_stop_mgc() static

The process to create a proper string for niduuid can be made
simpler and avoid a memory allocation.

Change-Id: I52cb01117e41cbcf2756477e91934a42d31fd157
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/33617
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12511 ldlm: free resource when ldlm_lock_create() fails.

ldlm_lock_create() gets a resource, but don't put it on
all failure paths. It should.

Change-Id: Ib49bcafdeac834c412adad9db135034d1ea06a04
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45585
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12511 llite: fix misuse of current->parent.

current->parent is used by ptrace to redirect some signal delivery
to the ptracer. It should only be used by 'ptrace' or 'signal' code.
All other users should use current->real_parent, which is the real
parent.

Change-Id: I3212fd9f3db9935b8144dba8af82eb2f387a5c45
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45580
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15083 ldiskfs: disable xattr credits check

Add ext4-xattr-disable-credits-check.patch to the
ldiskfs series from kernel 4.15+ except the
rhel8 ones. They already have this fix.

Change-Id: I1f9818c394377cdba6b11d2b24c26d2354f41ac0
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/45546
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9162 lod: option to set max stripe count per filesystem

Add an option to set max default stripe count when the stripe count
is set to -1.

Change-Id: I02634a02a6f6579750fe964662b7e644af1689d6
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/45532
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14781 osp: osp_object_free access NULL pointer

If a osp_object is created by multiple threads at the same time,
lu_object_find_at() could allocate an osp_object without object
initialization, before hash inserting of the object, it find another
object has been created and inserted by another thread, it will free
the unintialized osp_object, and osp_object_free() will access
an uninitialized list_head (opo_xattr_list).

This patch initialize osp_object::opo_xattr_list in its allocation
function.

Call trace:
            lu_object_free.isra.30+0xf2/0x170 [obdclass]
            lu_object_find_at+0x496/0x930 [obdclass]
            lod_initialize_objects+0x3e4/0xba0 [lod]
            lod_parse_striping+0x693/0xc20 [lod]
            lod_striping_load+0x2b2/0x660 [lod]
            lod_declare_destroy+0x12b/0x600 [lod]
            mdd_declare_finish_unlink+0x91/0x210 [mdd]
            mdd_unlink+0x48f/0xab0 [mdd]
            mdt_reint_unlink+0xc32/0x1550 [mdt]
            mdt_reint_rec+0x83/0x210 [mdt]
            mdt_reint_internal+0x6e1/0xb00 [mdt]
            mdt_reint+0x67/0x140 [mdt]
            tgt_request_handle+0xaee/0x15f0 [ptlrpc]
            ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
            ptlrpc_main+0xb34/0x1470 [ptlrpc]
            kthread+0xd1/0xe0

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib86aca5b41e94a1758f177655ea3a0f680335e0f
Reviewed-on: https://review.whamcloud.com/45442
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>

LU-15114 osp: changes queuing throttle

Prevent queue of sync changes from growing too much
by adding resends when queue size reaches
some (tunable) limit.

HPE-bug-id: LUS-10345
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I5efabb91d3700c58d9451f81c5fed9a22ae404fb
Reviewed-on: https://review.whamcloud.com/45265
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14491 ldiskfs: do not corrupt journal with bh change rh7.6

Currently, ldiskfs_xattr_delete_inode() zeroes xattr inode
references in cached buffers that haven't been prepared by
get_write_access().

When using journal checksums, it is possible that these buffers
are modified after the checksum is calculated but before the
buffer has been written to journal. Journal replay will fail
with a journal checksum error message if this transaction needs
to be replayed.

This is a port of:

Lustre-commit: d563f5140ffa183d0854cf7cd493ad884e314e3d
Lustre-change: https://review.whamcloud.com/41896

Test-Parameters: trivial
HPE-bug-id: LUS-9483
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I5ce1016737a16ddf74811c43ae74296d4f3e03b0
Reviewed-on: https://review.whamcloud.com/45164
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15068 ptlrpc: Do not unlink difficult reply until sent

If a difficult reply is queued in LNet, or the PUT for it is
otherwise delayed, then it is possible for the commit callback
to unlink the reply MD which will abort the send. This results in
client hitting "slow reply" timeout for the associated RPC and
an unnecessary reconnect (and possibly resend).

This patch replaces the rs_on_net flag with rs_sent and rs_unlinked.
These flags indicate whether the send event for the reply MD has
been generated, and whether the MD has been unlinked, respectively.

If rs_sent is set, but rs_unlinked has not been set, then ptlrpc_hr
is free to unlink the reply MD as a result of the commit callback.
The reply-ack will simply be dropped by the server.

If ptlrpc_hr is processing the reply because of commit callback, and
rs_sent has not been set, then ptlrpc_hr will not unlink the reply
MD. This means that the reply_out_callback must also be modified to
check for this case when the send event occurs. Otherwise, if the ACK
never arrives from the client, then the MD would never be unlinked.
Thus when the send event occurs, and rs_handled is set, the
reply_out_callback will schedule the reply for handling by ptlrpc_hr.

HPE-bug-id: LUS-10505
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib8f4853c7ab35d72624fce7ee3fba9e59a746e1f
Reviewed-on: https://review.whamcloud.com/45138
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15065 quota: fix BIO write performance drop

Before the patch qti_lqes_qunit_min used int to store qunit
value, while lqe_qunit type is _u64. lqe_qunit > 2G caused
an overflow in a local integer argument. For example, when
block hard limit was set to 500TB(i.e. lqe_qunit was about
64TB in a system with 2 OSTs), qti_lqes_qunit_min returned
0 instead of 64TB in a qmt_lvbo_fill. Thus new qunit was not
set on OSTs(qsd_set_qunit wasn't called). Without the qunit,
OST began to send release request after each acquire. For
example, to write 10MB at the OST were sent 2 acquire and
2 release reuests(as qunit was not set on OST). With the
fix, i.e. in a normal case, OST needs just one acquire
request. The issue caused performance drop in a bufferred
write up to 15%-20% if compare with a baseline without PQ
patches.

Note, the issue exists only when a hard limit is set to some
high value(>100GB). The exact hard limit value depends on OSTs
number in a system and on amount of used space, but let's think
that issue doesn't exist on a clean system with 2 OSTs and hard
block limit 100G(this case was checked).

Remove qmt_pool_hash - it is not used anywhere since
"LU-11023 quota: remove quota pool ID".

HPE-bug-id: LUS-10250
Change-Id: I2c4ce38f5b9395ed1f4868d4c8efc00751116b15
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45133
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15179 tests: add trap cleanup_quota_test

Add stack_trap cleanup_quota_test to the tests that
use setup_quota_test. If a test fails without calling
cleanup_quota_test, it may cause later tests to fail
due to used space > 0.

Remove ${tdir}_dom, if exists, in cleanup_quota_test.
sanity-quota_75 doesn't remove test_dom directory.

Test-Parameters: trivial testlist=sanity-quota
Fixes: a4fbe734("LU-14739 quota: nodemap squashed root cannot bypass quota")
Change-Id: Ife4fd499b427bee79f74a5e172d233fe6a83e240
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15025 quota: stale edquot after clearing limits

When hard and soft limit set to 0, lqe enforced flag is also
set to false. As qmt_adjust_qunit does not handle not enforced
lqes, edquot set to the pool continues to be true and a user
gets -EDQUOT even if all pool limits are cleared. This was ok
for global pool lqe as since it turned off, zero limits are
sent to OSTs causing OSTs to release all granted space and
avoid EDQUOT. Fix this for PQ - set edquot and qunit to zero,
since appropriate lqe becomes "not enforced".

HPE-bug-id: LUS-10146
Change-Id: I1e08929bae7e1b37b1e8cbbc44859a786b5fb090
Reviewed-on: https://es-gerrit.dev.cray.com/158915/
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15191 quota: set correct revoke_time

When we do qmt_adjust_qunit, there are several lqes
and lqe_revoke_time is set for some of them, it means
appropriate OSTs have been already notified with the
least qunit and there is no chance to free more space.
If a qunit of the current lqe becomes equal to the least
qunit, find an lqe with the minimum(earliest) revoke_time
and set this revoke_time to the current one.

This patch fixes the following case. For example, we have
8 OSTs and 4 MDTs(i.e. 12 slaves) and a pool with just one
OST. Global hard block limit for the user is 50M, and 10M
for this user in a pool. User's usage is 0. As global pool
has 12 slaves it's initial qunit value is 1M, i.e. equal to
the least qunit. At the same time initial qunit value for the
pool with one OST is 4M. When user begins to write, pool's
qunit is decreased to 1M, but lqe_revoke is not set - it
should be set only after sending new qunit to OSTs in
qmt_lvbo_update. However, it won't be send because appropriate
lge_qunit in lqe global array already has the same value.
This problem caused sanity-quota_72 to hang instead of fail
with EDQUOT in test_1_check_write.

HPE-bug-id: LUS-10516
Change-Id: I5878c1e719ae83a69ad5dbc3653717bb1b4de632
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45447
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10391 uapi: move out kernel only code.

Userland doesn't use the new IPv6 function in the UAPI nidstr.h.
So move this to the kernel header lib-types.h. Normally this
wouldn't matter but the python wrappers break with the LUTF
project. With the move to Netlink in the future for the user land
API we shouldn't need these functions anyways.

Test-parameters: trivial
Change-Id: I2ac6d102d4575f639573253c21723d62ca08abc2
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44844
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12347 llite: do not take mod rpc slot for getxattr

The following scenario may lead to client eviction:
clientA                clientB                  MDS
threadA1: write to file F1, get
and hold DoM MDC LDLM lock L1:
   ->cl_io_loop()
    ->cl_io_lock()
     :
     ->mdc_lock_granted()
      ->lock->l_writers++
      [hold ref until write done]

threadA2-A8: create files F2-F8:
   ->ll_file_open()
    ->mdc_enqueue_base()
     ->ldlm_cli_enqueue()
      ->ptlrpc_get_mod_rpc_slot()
      ->ptlrpc_queue_wait()
      [hold RPC slot until create done]

                                              OST(s) in recovery.
                                              MDS waiting on OST(s) to
                                              precreate new objects.

threadA1:
   -> cl_io_start()
    -> __generic_file_aio_write()
     -> file_remove_suid()
      -> ll_xattr_cache_refill()
       -> mdc_xattr_common()
        -> ptlrpc_get_mod_rpc_slot()
        [blocked waiting for RPC slot]

                       threadB1: write file F1,
       enqueue DoM MDC lock L1

                                              MDS sends blocking AST
                                              to clientA for lock L1

ldlm_threadA3: cannot cancel busy lock L1:
   -> ldlm_handle_bl_callback()
   ["Lock L1 referenced, will be cancelled later"]

                                              MDS evicts clientA for
                                              not cancelling lock L1

threadA1: never completes write:
  ->cl_io_end()
   ->cl_io_unlock()
    ->osc_lock_cancel()
     ->lock->l_writers--;

The fix is to add IT_GETXATTR to list of operations which do not
need mod rpc slot.

Tests to illustrate the issue is added.

wait_for_function(): total sleep time (wait) is to be equal to max
when 1 is returned.

Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
HPE-bug-id: LUS-7271
Change-Id: I1b80677df084bda141b9ac58a78b765bd0b14a41
Reviewed-on: https://review.whamcloud.com/44151
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14428 libcfs: separate daemon_list from cfs_trace_data

cfs_trace_data provides a fifo for trace messages.  To minimize
locking, there is a separate fifo for each CPU, and even for different
interrupt levels per-cpu.

When a page is remove from the fifo to br written to a file, the page
is added to a "daemon_list".  Trace message on the daemon_list have
already been logged to a file, but can be easily dumped to the console
when a bug occurs.

The daemon_list is always accessed from a single thread at a time, so
the per-CPU facilities for cfs_trace_data are not needed.  However
daemon_list is currently managed per-cpu as part of cfs_trace_data.

This patch moves the daemon_list of pages out to a separate structure
- a simple linked list, protected by cfs_tracefile_sem.

Rather than using a 'cfs_trace_page' to hold linkage information and
content size, we use page->lru for linkage and page->private for
the size of the content in each page.

This is a step towards replacing cfs_trace_data with the Linux
ring_buffer which provides similar functionality with even less
locking.

In the current code, if the daemon which writes trace data to a file
cannot keep up with load, excess pages are moved to the daemon_list
temporarily before being discarded.  With the patch, these page are
simply discarded immediately.
If the daemon thread cannot keep up, that is a configuration problem
and temporarily preserving a few pages is unlikely to really help.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie894f7751cadacb515215f18182163ea5d26e969
Reviewed-on: https://review.whamcloud.com/41493
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15107 mdt: Exclusive create isn't replayed

mdt_finish_open() fails on exclusive create check as
there isn't an infrormation that the file was created.

Set DISP_OPEN_CREATE for exclusive open replay,
as we know that the original request has succeeded.

Change-Id: Idc71db76b757670b91b3bb1718870a018a805ed2
HPE-bug-id: LUS-9358
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45254
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9680 net: Netlink improvements

With the expansion of the use of Netlink several issues have been
encountered. This patch fixes many of the issues. The issues are:

1) Fix idx handling in lnet_genl_parse_list() function. It needs
   to always been incremented. Some renaming suggestion for
   enum lnet_nl_scalar_attrs from Neil. New LN_SCALAR_ATTR_INT_VALUE
   to allow pushing integers as well as strings from userspace.

2) Create struct genl_filter_list which will be used to create
   a list of items to pass back to userland. This will be a common
   setup.

3) A normal user can't read /sys/debug/kernel/lustre which breaks
   lctl ***_params XXX since the first function called is
   llapi_param_get_paths(). Without the ability to read the
   debugfs tree glob() will fail. The solution is to use the
   kernel's glob function and just pass the requested string to
   the kernel.

4) For the external coordinator work you create a YAML parser
   that listens for kernel generated Netlink packets. This is
   a continuous stream vs an one time reply which we don't
   handle correctly. We move the handling of the completion
   of a Netlink packet series icompletely into the function
   yaml_netlink_msg_complete. In yaml_netlink_msg_parse() for
   the async case add "---" and in yaml_netlink_msg_complete()
   add "..." to define the beginning and end of a YAML document.

5) We have 3 types of setups. For kernel generated events it is
   possible to use just a YAML parser to listen for events. For
   the normal request -> reply setup we need both a YAML emitter
   and YAML parser. The last case is just sending commands to
   the kernel which only needs a YAML emitter. It is possible for
   that action to fail so we need to add handling for errors to
   the YAML emitter. We keep error handling for the YAML parser
   as well to handle the case of a stand along YAML parser listener.

6) Reworked the code that translates YAML to Netlink packets to
   send to the kernel so the both key and value pairs are sent
   seperately for the mapping case. This avoids dealing with
   complex string parsing in the kernel.

7) Error message handling was incorrect. struct nlmsgerr msg field
   is also the start of the nlattrs for the ext ack handling.

Test-Parameters: trivial
Change-Id: Ic8eee8fd0020b7a63565de6ef69f2c74bf4bdcd8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44358
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14713 llite: mend the trunc_sem_up_write()

The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
lock scenario:

t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
|- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
|- do_map()              |- ll_direct_IO_impl()
|- vvp_io_fault_start    |- ll_get_user_pages()

|- down_write(
|- down_read(mmap_sem)        trunc_sem)
|- down_read(trunc_sem)

t1 waits for read semaphore of trunc_sem which is hindered by t3,
since t3 is waiting for the write semaphore while t2 take its read
semaphore,and t2 is waiting for mmap_sem which has been taken by t1,
and a deadlock ensues.

commit e5914a61ac changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in page fault path, to make it ignore
that there is a down_write(trunc_sem) waiting, just takes the read
semaphore if no writer has taken the semaphore, and breaks the
deadlock.

But there is a delicacy in using wake_up_var(), wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly test for waiters on the
queue, and if it's called without explicit smp_mb() it's possible for
the waitqueue_active() to ge hoisted before the condition store such
that we'll observe an empty wait list and the waiter might not
observe the condition, and the waiter won't get woke up whereafter.

Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14971 test: align mirror_io resync implementation

Align the mirror_io resync implementation with
llapi_mirror_resync_many().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icf11c4c2302f36fc0f9682e0a310058081e1214f
Reviewed-on: https://review.whamcloud.com/45202
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15142 lctl: fixes for set_param -P and llog_print

- properly handle permanent param deletion
- don't print skipped parameters in llog_print output
- add --raw option to llog_print to output all entries
including markers

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id93a206a255dc885343efa293e1ee2672493e5e5
Reviewed-on: https://review.whamcloud.com/45332
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15197 llite: Do not count tiny write twice

We accidentally count bytes written with tiny write twice
in stats. Remove the extra count.

This also has the positive effect of improving tiny write
performance by about 4% by removing an extra call to the
stats code (the main cost is ktime_get()).

Before, 8 byte dd:
13.9 MiB/s
After:
14.3 MiB/s

Test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia11e7f16e3e3d0c4012f87cde817ad7b21128fa8
Reviewed-on: https://review.whamcloud.com/45476
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15152 tests: auster reports wrong testsuite status

auster always reports testsuites returned 0 even when there
are failures.

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=230,ONLY_REPEAT=20
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7a8101d6cfd854d8419edf55c18a72e211f5e5c8
Reviewed-on: https://review.whamcloud.com/45343
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14662 lnet: set eth routes needed for multi rail

When ksocklnd is initialized or new ethernet interfaces
are added via lnetctl, set the routing rules using a common
shell script ksocklnd-config. This ensures control over
source interface when sending traffic.

For example, for eth0 with ip 192.168.122.142/24:
   the output of "ip route show table eth0" should be
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.142

This step can be omitted by specifying
   options ksocklnd skip_mr_route_setup=1
in the conf file, or by using switch
   --skip-mr-route-setup
when adding NI with lnetctl. Note that the module parameter
takes priority over the lnetctl switch: if skip-mr-route-setup
is not specified when adding NI with lnetctl, the route still
won't get created if the conf file has skip_mr_route_setup=1.

The route also won't be created if any route already exists
for the given interface, assuming advanced users who manage
routing on their own will want to continue doing so.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia14e637bd29d4bbce5dd93daad9992336b2e6b15
Reviewed-on: https://review.whamcloud.com/44065
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14448 lod: verify LOV early in lod_get_default_striping

lod_get_default_striping() will get both default LOV and default LMV,
and parse them to struct lod_default_striping one by one, however the
LOV and LMV data are both stored in lod_thread_info.lti_ea_store, so
lod_verify_striping() should verify LOV upon getting LOV, otherwise
if both exists, it's LMV that's verified, which will return -EINVAL.

Fixes: 6a08df2d0effc7a ("LU-14448 lod: verify LOV before set/inherit")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9763d35bdbc74101fa8515d5096ec457a4cb3524
Reviewed-on: https://review.whamcloud.com/45370
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9704 grant: ignore grant info on read resend

The following scenario makes a message like "claims 28672 GRANT, real
grant 0" to appear:

1. client owns X grants and run rpcs to shrink part of those
2. server fails over so that the shrink rpc is to be resent.
3. on the clinet reconnect server and client sync on initial amount
of grants for the client.
4. shrink rpc is resend, if server disk space is enough, shrink does
not happen and the client adds amount of grants it was going to
shrink to its newly initial amount of grants. Now, client thinks that
it owns more grants than it does from server points of view.
5. the client consumes grants and sends rpcs to server. Server avoids
allocating new grants for the client if the current amount of grant
is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
RETURN(0);
6. client continues grants consuming which eventually leads to
complains like "claims 28672 GRANT, real grant 0".

In case of resent of read and set_info:shrink RPCs grant info should
be ignored as it was reset on reconnect.

Tests to illustrate the issue is added.

HPE-bug-id: LUS-7666
Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45371
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-930 doc: update lfs migrate usage and man page

Update the usage and man page for "lfs migrate -m", noting that
this command will recursively migrate an entire directory tree.

It is not currently possible to migrate files with DoM components
between MDTs, so provide an example of how to work around this.

Only print the command-line options for commands in the usage
message instead of the full usage, since it is otherwise much
too verbose to see the actual error message being printed. The
user should read the lfs-migrate.1 man page for full usage.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9e34a33fcc3f0e2b90bc499fb7b946c53e6111d1
Reviewed-on: https://review.whamcloud.com/43378
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15121 llite: skip request slot for lmv_revalidate_slaves()

Some syscalls need lmv_revalidate_slaves(). It requires
second lock enqueue and the it can be blocked by
lack of RPC slots.

Don't acquire rpc slot for second lock enqueue.

Change-Id: Ida23c648c2bd169c4d238543731796232aa490dc
HPE-bug-id: LUS-8416
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15141 quota: optimize capability check for root squash

On client side, checking for owner/group quota can be directly
bypassed if this is for root and there is no root squash.

Change-Id: If29eca428d8748df412a717615e4d0a4886ddd04
Fixes: a4fbe7341b ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/45322
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 ptlrpc: remove bogus LASSERT

In the error case, it isn't possible for rc to be both -ENOMEM and
0 at the same time, so remove the incorrect LASSERT(rc == 0) to
avoid crashing the system on an allocation failure.

Improve error messages to conform to code style.

Fixes: ceeeae4271fd ("LU-12678 lnet: me: discard struct lnet_handle_me")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I61ac5d735d7b2658dae76213a2d40cbfd2bb8bb9
Reviewed-on: https://review.whamcloud.com/45421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-11667 tests: Fix sanity test 317 for 64K PAGE_SIZE OST

When create a file, blocks are allocated with PAGE_SIZE aligned,
see function osd_ldiskfs_map_inode_pages(). E.g. for 64K PAGE_SIZE
Arm64 OST server, if create a file with size less than 64K, it
actually allocates 128 blocks each block 512 Bytes.

It needs to adjust the test for 64K PAGE_SIZE OST server.

Test-Parameters: trivial
Test-Parameters: clientarch=aarch64 fstype=ldiskfs testlist=sanity \
env=PTLDEBUG=-1,ONLY=317
Test-Parameters: fstype=ldiskfs testlist=sanity \
env=PTLDEBUG=-1,ONLY=317

Change-Id: Iada701f4f424093e847fc70aa843873b75fe6b06
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/45395
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15156 kernel: back port patch for rwsem issue

RHEL7 included a defeat in rwsem. It can cause a
thread hung on rwsem waiting infinity. Backport
commit: 5c1ec49b60cdb31e51010f8a647f3189b774bddf
to fix this issue.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ic5c469ce744ad5882c13163a9bfe14faef8fd446
Reviewed-on: https://review.whamcloud.com/45383
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15168 osd: use large allocation for idc cache

as in some cases (e.g. ofd precreate) the cache can grow to dozens
of kilobytes (sizeof(struct idc_map_cache)=40 * 1024).

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id9e0996a7a1d07065f4a50c1d5be5051e756559a
Reviewed-on: https://review.whamcloud.com/45382
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>

LU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1]

Update SLES15 SP3 kernel to 5.3.18-59.27.1 for Lustre client.

Test-Parameters: trivial

Change-Id: Ie3c369a8e93a75b4afbde55489bd3819bb39e1de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45349
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15138 lnet: Fail peer add for existing gw peer

If there's an existing peer entry for a peer that is being added
via CLI, and that existing peer was not created via the CLI, then
DLC will attempt to delete the existing peer before creating a new
one. The exit status of the peer deletion was not being checked.
This results in the ability to add duplicate peers for gateways,
because gateways cannot be deleted via lnetctl unless the routes
for that gateway have been removed first.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I9b7864a2bd21540336f72d96e180c89bd0aae8dc
Reviewed-on: https://review.whamcloud.com/45337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15086 ptlrpc: fix timeout after spurious wakeup

so that final timeout don't exceed requested one

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iff5e08c589cbbc3c483915002f3f9df7a6f2678a
Reviewed-on: https://review.whamcloud.com/45308
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15122 osd-ldiskfs: Fix ASSERTION( iobuf->dr_rw == 0 ) with 64KB PAGE_SIZE

During a writing, if there is a page can not be mapped to blocks
at once, it will cause "ASSERTION( iobuf->dr_rw == 0 )" crash
which leads by the overflow access of mapped blocks array.

This will happen on Arm platforms easily with 64KB PAGE_SIZE.
And will not happen on x86_64 platforms with 4KB PAGE_SIZE.
Because for 4KB block size, if PAGE_SIZE is 4KB, then i == 0
and blocks_left_page == 1. Which makes the inner loop each time
handle one block. Thus the outer loop condition "block_idx <
block_idx_end;" insures blocks[] array access not overflow.

Check the actual mapped count so that mapped blocks array
access will not overflow.

Fixes: 0271b17b80a8 ("LU-14134 osd-ldiskfs: reduce credits for new writing")
Change-Id: Icd46c04bea2d7930456840694d422758eebb4186
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/45288
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15102 lnet: Reset ni_ping_count only on receive

The lnet_ni:ni_ping_count is currently reset on every (healthy) tx.
We should only reset it when receiving a message over the NI. Taking
net_lock 0 on every tx results in a performance loss for certain
workloads.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8fdf2bc62a ("LU-13569 lnet: Recover local NI w/exponential backoff interval")
HPE-bug-id: LUS-10427
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I67ea3aa977cb5d67b04f7957120c29e9985c83e6
Reviewed-on: https://review.whamcloud.com/45235
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15092 o2iblnd: Fix logic for unaligned transfer

It's possible for there to be an offset for the first page of a
transfer. However, there are two bugs with this code in o2iblnd.

The first is that this use-case will require LNET_MAX_IOV + 1 local
RDMA fragments, but we do not specify the correct corresponding values
for the max page list to ib_alloc_fast_reg_page_list(),
ib_alloc_fast_reg_mr(), etc.

The second issue is that the logic in kiblnd_setup_rd_kiov() attempts
to obtain one more scatterlist entry than is actually needed. This
causes the transfer to fail with -EFAULT.

Test-Parameters: trivial
HPE-bug-id: LUS-10407
Fixes: d226464aca ("LU-8057 ko2iblnd: Replace sg++ with sg = sg_next(sg)")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifb843f11ae34a99b7d8f93d94966e3dfa1ce90e5
Reviewed-on: https://review.whamcloud.com/45216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15094 o2iblnd: map_on_demand not needed for frag interop

The map_on_demand tunable is not used for setting max frags so don't
require that it be set in order to negotiate max frags.

HPE-bug-id: LUS-10488
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie89f1f035f4b05244feffb848c14582a8c7cf0e6
Reviewed-on: https://review.whamcloud.com/45215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15049 quota: fix a panic with pool number > 16

Fix a panic that may occur when there are more than 16
pools in a system:
qti_pools_add()) ASSERTION( qti->qti_pools_num >= QMT_MAX_POOL_NUM ) failed: Forgot init? ffff91a5f9625800

HPE-bug-id: LUS-10116
Change-Id: I4f73b74d2fd3e85a51cf3c30e2eec29645f164be
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45105
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15048 quota: check that qti_lqes has been inited

qti_lqes_resotre_init/fini should check that qti_lqes
has been inited before address qti_lqes_count.

Fix helps against following panic:
qti_lqes_restore_fini()) ASSERTION( qmt_info(env)->qti_lqes_rstr ) failed:

HPE-bug-id: LUS-10239
Change-Id: Ic93d87535f615fe419b2c3a2453506c515837031
Reviewed-on: https://es-gerrit.dev.cray.com/159116
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45102
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14713 llite: tighten condition for fault not drop mmap_sem

As __lock_page_or_retry() indicates, filemap_fault() will return
VM_FAULT_RETRY without releasing mmap_sem iff flags contains
FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT.

So ll_fault0() should pass in FAULT_FLAG_ALLOW_RETRY |
FAULT_FLAG_RETRY_NOWAIT in ll_filemap_fault() so that when it
returns VM_FAULT_RETRY, we can pass on trying normal fault
under DLM lock as mmap_sem is still being held.

While in Linux 5.1 (6b4c9f4469819) FAULT_FLAG_RETRY_NOWAIT is enough
to not drop mmap_sem when failed to lock the page.

Fixes: 87865e4ae9 ("LU-13182 llite: Avoid eternel retry loops with MAP_POPULATE")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9420c587301722b597155558657577349a8141e4
Reviewed-on: https://review.whamcloud.com/44715
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14927 osd: share brw_stats code between OSD back ends.

Both the ldiskfs and ZFS OSD backend handle brw_stats. With the
stricter GPL requirement ZFS can no longer carry the brw_stats
code. So move the common code to lprocfs_status_server.c as
well as move brw_stats to debugfs as well.

Change-Id: I294e5df3557552266dd3a02d3bc9844c42c01f60
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44690
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14793 hsm: record index for further HSM action scanning

there is contention between HSM archive request and "hsm_cdtr"
kernel thread:
->mdt_hsm_request()
  ->mdt_hsm_add_actions()
    ->mdt_hsm_register_hal()
      ->mdt_agent_record_add()
        ->down_write(&cdt->cdt_llog_lock)
        ->llog_cat_add()
        ->up_write(&cdt->cdt_llog_lock)

->mdt_coordinator()
  ->cdt_llog_process()
    ->down_write(&cdt->cdt_llog_lock);
    ->llog_cat_process()
    ->up_write(&cdt->cdt_llog_lock);

HSM archive request and HSM cat llog scanning in the kernel daemon
"hsm_cdtr" are both contenting for write llog lock to add or
update the "hsm_actions" llog.

In the tesing, it uses max_requests = 1000000.
In the current implementation, it means kernel daemon thread
"hsm_cdtr" needs to scan nearly whole "hsm_actions" llog from the
beginning position with write llog lock held.
This will slow down the HSM archive requests which is contented
for write llog lock.

As llog is append-only, we record the latest handled position in
the llog, thus next scanning can start from the previous recorded
postion (llog index), does not need to start from the beginning.

Another way to mitigate this probelm is:
when the llog scanner found that there are other process
contended for the llog lock, it will stop the llog scanning and
release write llog lock properly for incoming HSM archive requests.

After applied this patch, with 200000 HSM actions in llog, the time
to queue 10000 HSM archive requests reduces from 10 seconds to 4
seconds.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2e92daf34844605ee648787daf859143335c68bf
Reviewed-on: https://review.whamcloud.com/44077
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14706 libcfs: ___wait_event() update for older kernels

A couple of wait issues fail to build correctly on older
3.0 kernels

Fixes: f6df31a163 ("LU-10467 lustre: add wait_event macros suitable for upstream")
Fixes: d05427a785 ("LU-10428 lnet: call event handlers without res_lock")
HPE-bug-id: LUS-9975
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I275fbf79c4472575d867075a3e3ebd3d6ec1cdfa
Reviewed-on: https://review.whamcloud.com/43785
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13358 libcfs: add timeout to cfs_race() to fix race

there is no guarantee for the branches in cfs_race() to be executed
in strict order, thus it's possible that the second branch (with
cfs_race_state=1) is executed before the first branch and then another
thread executing the first branch gets stuck.

this construction is used for testing only and as a
workaround it's enough to timeout.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie1cc0accedb3e1a198d4b17d1ab00ce298c560f2
Reviewed-on: https://review.whamcloud.com/43161
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15171 osd-ldiskfs: xattr_sem locking is missing for dquot_transfer

Kernel commit 7a9ca53ae (~v4.13) added the requirement for xattr_sem locking
when calling *dquot_transfer. As of now, in rare cases, it is possible that
we can modify inode xattrs and perform their consistency checks in parallel,
which can fail.

Change-Id: I041694e30ce6c8398864c0ad57671df0bffd2f52
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-10549
Reviewed-on: https://review.whamcloud.com/45424
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15143 osd-ldiskfs: osd_declare_write() underestimates credits

osd_declare_write() seems to underestimate journal credits for
the extent case. It does not properly account credits for
a new extent tree block. 3 credits (bitmap, gd, self) should be
accounted for a new data block and for a new extent tree block.

Change-Id: Iad463cac3a2a8c2b2a6b1a634e7502039bb1b7e5
HPE-bug-id: LUS-10514
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/45328
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10391 lnet: switch to large lnet_processid for matching

Change lnet_libhandle.me_match_id and lnet_match_info.mi_id to
struct lnet_processid, so they support large nids.

This requires changing
LNetMEAttach(), lnet_mt_match_head(), lnet_mt_of_attach(),
lnet_ptl_match_type(), lnet_match2mt()
to accept a pointer to lnet_processid rather than an
lnet_process_id.

Test-Parameters: trivial
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I6957b467bb9af84e20a4525db6351694f4d2a7af
Reviewed-on: https://review.whamcloud.com/43597
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10391 lnet: extend prefered nids in struct lnet_peer_ni

union lpni_pref in struct lnet_peer_ni how has 'struct lnet_nid'
rather than lnet_nid_t.

Also, lnet_peer_ni_set_no_mr_pref_nid() allows the pref nid to be NULL
and is a no-op in that case.

Rather then updating the user-space cfs_match_nid_net() in
libcfs/utils/nidstrings.c, remove it as it is unused.

Test-Parameters: trivial
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9a2453185aa5d708e6939dadc1e954c9dbd24efc
Reviewed-on: https://review.whamcloud.com/43596
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10391 lnet: change tp_nid to 16byte in lnet_test_peer.

This updates 'struct lnet_test_peer' to store a large address nid.

Test-Parameters: trivial
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id2f97a841bb0738503b0e87e7a9e2f8bebc4c3ec
Reviewed-on: https://review.whamcloud.com/43595
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15210 tests: fix sanity-lnet to handle duplicate IP

The same IP may be added on the same interface with different netmasks
specified.
sanity-lnet is not expecting this and fails. Fix it.

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I70c2df5f14c88362ea5af6f06410823c56535dee
Reviewed-on: https://review.whamcloud.com/45551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-13456 ldlm: fix reprocessing of locks with more bits

Reprocessing check queues should be extended
with just granted lock inodebits.

ldlm_reprocess_all() should be called on BL AST race.

Change-Id: Ifd232062068481c1c62fa2f2a14c7778d4402260
Fixes: 2250e072c3785 ("LU-12017 ldlm: DoM truncate deadlock")
HPE-bug-id: LUS-8224
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/38244
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15070 llite: update default LMV upon any change

max_inherit and max_inherit_rr was newly added, and they are missing
in lsm_md_eq(), therefore client may not update default LMV when
either of these two fields is changed.

Add sanityn 112.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iac71b530b3702105c4213715826b1782c6aba7ca
Reviewed-on: https://review.whamcloud.com/45237
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15070 mdt: revoke remote LOOKUP lock for default LMV

When setting default LMV, it will revoke LOOKUP lock, while if dir
is remote dir, its LOOKUP lock is on MDT where its parent is located.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9f079a0bcff530603725ce72cd89c14935ba913b
Reviewed-on: https://review.whamcloud.com/45236
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15151 tests: use facet check instead of node check

Change wait_update_cond() call to wait_update_facet_cond()
call in test_119.

Fixes: 98f107b53e4d ("LU-9699 osp: don't assert on OSP duplicating")
Test-Parameters: trivial testlist=conf-sanity env=ONLY=119
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10557
Change-Id: Ia1ff7921026212814ec71e0c3aa60f23fbd7278f
Reviewed-on: https://review.whamcloud.com/45369
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>

LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2]

Update SLES12 SP5 kernel to 4.12.14-122.91.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ia6620869fa84d72f8d22c4a8a039600037ddb2d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45358
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15149 lnet: Missing newline in lnet_add_route

CWARN string is missing a newline character.

Test-Parameters: trivial
Fixes: 3f2844dc93 ("LU-14945 lnet: don't use hops to determine the route state")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I06370c36e9d88b7e02e000bfb573297ff281aef1
Reviewed-on: https://review.whamcloud.com/45340
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15136 socklnd: default conns_per_peer to 0

Setting conns_per_peer to 0 triggers socklnd to choose the
(heuristically) optimal setting for the interface given its speed.
Make 0 the default for socklnd conns_per_peer.

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Fixes: c44afcfb72 ("LU-12815 socklnd: set conns_per_peer based on link speed")
Change-Id: Ie6e76eaee8693472384cce362b394b216142884e
Reviewed-on: https://review.whamcloud.com/45319
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15150 tests: sanity-lnet removes testsuite log on failure

cleanup_testsuite() needs to be more selective when removing files
created by sub-tests.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: aa739144551 ("LU-13569 tests: Check LNet Health recovery logic")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic17a68ff2aa552594a0f1ea470c39177abe985fc
Reviewed-on: https://review.whamcloud.com/45342
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15131 tests: check parameter correctly

sanity test_254() always skipped due to wrong
parameter path: MDT0 not initialized before get_param.

Fixes: 33aad7829de5 ("LU-10461 tests: call exit in the skip routine")
Test-Parameters: trivial testlist=sanity
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9969
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I02e38a27ca4656128e62339d63425f85386fa905
Reviewed-on: https://review.whamcloud.com/45300
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15061 tests: fix sanity-dom exit status

If any of sanity tests fails during "ONLY=sanity sh sanity-dom.sh"
execution -- the next "ONLY=sanityn sh sanity-dom.sh" is always
detected as failed even if all sanityn tests pass. This is
because of ${TMP}/sanity.log exists and is checked for
"FAIL" also, while ${TMP}/sanityn.log is to be proceeded only.

Fixes e2cb43c409b9 ("LU-13773 tests: subscript failure propagation")

Test-Parameters: trivial testlist=sanity-dom
HPE-bug-id: LUS-10480
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Change-Id: I764d1d6df08da13acfecd445e67ced1b455ddce8
Reviewed-on: https://review.whamcloud.com/45299
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15098 tests: sanity-sec 27a exec commands on right node

In nodemap_exercise_fileset called from sanity-sec test 27a,
make sure all commands are executed on first client, as we are
testing properties of nodemaps 'default' and 'c0'.
And make sure 'default' nodemap has admin and trusted properties
set to 1, as we are carrying operations as root.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec clientcount=2 env=ONLY=27a
Fixes: 0daeebcbdc ("LU-14797 nodemap: map project id")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd9f391db60475721f3a3856b5e3bee1a18bbbca
Reviewed-on: https://review.whamcloud.com/45293
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15103 tests: clean up busy cifs mount

Busy cifs mount point makes cleanup_cifs fail which
will infact fails lustre unmount as cifs mount point is
lustre mount.

Test-Parameters: trivial
Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-4146
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I4b7ec7e0a6a706198e328dad337648bf3cb2c3be
Reviewed-on: https://review.whamcloud.com/45238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.45.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I11c307bfd6a6b353bc7b6fe40bb5d604bc9b3fdc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45226
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13717 sec: missing defs in includes for client encryption

Add a few missing definitions in lustre_crypto.h that are required
in case Lustre client encryption is built against the in-kernel
fscrypt library.

Fixes: 028281ae19 ("LU-13717 sec: rework includes for client encryption")
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1965503554dcf660770d201444cfafd54aa84dce
Reviewed-on: https://review.whamcloud.com/45221
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15093 libcfs: Check if param_set_uint_minmax is provided

Linux kernel v5.15 commit 2a14c9ae15a38148484a128b84bff7e9ffd90d68
moved param_set_uint_minmax to common code.

HPE-bug-id: LUS-10469
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd1d72ae531f0f6c7cd96cc28fbc07c8a8b70886
Reviewed-on: https://review.whamcloud.com/45214
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14999 mdt: Deadlock on parent during resend

Parent-child lock order gets broken during resend as
there is child lock already but there isn't parent lock
and MDS tries to lock it again.

Don't lookup child on resend, extract fid from the lock instead.

Change-Id: I443648a8162e770c63fd087dd534c27e7c637c40
HPE-bug-id: LUS-9306
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/44885
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14929 gss: detect libkeyutils dependency

When building GSS support, gss_keyring requires libkeyutils.
So make sure this dependency is properly detected at configure time,
and include keyutils.h only when required.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9fa5750f4609250ecdc1c47f68b97bff9be13ace
Reviewed-on: https://review.whamcloud.com/44597
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14124 target: set OBD_MD_FLGRANT in read's reply

If tgt_grant_shrink() decides to not shrink grants - a client is
supposed to restore its cl_grant_avail in osc_update_grant(). In case
of read OBD_MD_FLGRANT is not set on reply's body->oa.o_valid, so
osc_update_grant() misses the cl_grant_avail update. As result server
keeps thinking that client has a lot of grants while a client thinks
that it is missing grants badly. That may lead to performance
degradation.

A test to illustrate the issue is included.

Test-Parameters: trivial testlist=sanity
Change-Id: Ibe7ce0af5701226c8be3ae3f9ad57c354791fa0f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
HPE-bug-id: LUS-9943
Reviewed-on: https://review.whamcloud.com/43375
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15146 mdt: mdt_lvb2reply crash fix

mdt_lvb2body may crash if res->lr_lvb_data is
not allocated, make it tolerant to not initialized
lvb_data pointer.

HPE-bug-id: LUS-9549
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ie31cbba9187f9b04b3d1f8d2abc59e0612a44b41
Reviewed-on: https://review.whamcloud.com/45334
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15047 gss: gss integrity check with multi-rail

With multi-rail, a primary NID is used as node identifier, but LNet
decides which NID is actually used for sending/receiving data, on a
per request basis.
For the integrity check mechanism implemented as part of GSS, the
primary NID must be used in order to compute HMAC with the correct
key, independently of the actual NID for the current request.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2bf3974d3aa0e8365a9413dca56c69ee3734c12b
Reviewed-on: https://review.whamcloud.com/45277
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler

It's not necessary to LASSERT() in nrs_polices proc handler.
CERROR() and returning error is good enough.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I09f06dc4ab90e49b2df66a9b47a74678c64cdd2f
Reviewed-on: https://review.whamcloud.com/45200
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15089 tests: allow enough time to create tcp connections

Allow enough time to create tcp connections before counting them
when testing socklnd conns_per_peer setting in sanity-lnet test_230

Test-Parameters: trivial testlist=sanity-lnet
Fixes: a5cbe7883db6 ("LU-12815 socklnd: allow dynamic setting of conns_per_peer")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia3e25de157da03d6129b108c1af9632a8faf8efd
Reviewed-on: https://review.whamcloud.com/45331
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15125 o2iblnd: wrong list used for kib_connd_waits

The ibc_list field of struct kib_conn is used for the kib_connd_waits
list, but kiblnd_connd() uses ibc_sched_list in the
list_first_entry_or_null macro.

Test-Parameters: trivial
Fixes: 34b57a6f8f ("LU-12678 lnet: use list_first_entry() in lnet/klnds subdirectory.")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0cc8a94636a5129956c53e48ae96b27ece5f0228
Reviewed-on: https://review.whamcloud.com/45316
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15118 ldlm: no free thread to process resend request

MDS grants lock request and sends a reply,
input request queue can be filled immediately
with conflicting lock enqueue request.
So there isn't any free thread to process resend of
first lock enqueue request if the client has failed
to receive the reply.

Process lock enqueue resends with existing lock on MDS
in high priority queue.

Change-Id: If7b94690100b44c774dc14231ed4902f701ed807
HPE-bug-id: LUS-9444
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/45272
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15106 ofd: quiet deprecated param warning

There are a number of obdfilter parameter files that report a
warning even when they are read, which is confusing for users
if there is a tool that is scraping all available parameters:

    # lctl get_param obdfilter.*.*
    ofd: 'obdfilter.*.read_cache_enabled' is deprecated,
         use 'osd-*.read_cache_enabled' instead
    ofd: 'obdfilter.*.readcache_max_filesize' is deprecated,
         use 'osd-*.readcache_max_filesize' instead
    ofd: 'obdfilter.*.sync_on_lock_cancel' is deprecated,
         use 'obdfilter.*.sync_lock_cancel' instead
    ofd: 'obdfilter.*.writethrough_cache_enabled' is deprecated,
         use 'osd-*.writethrough_cache_enabled' instead

It should only print a message if the parameters are actually written.
Also fix the messages to reference the correct parameter names.

Most of these parameter links were added in 2.4 with the addition of
osd-ldiskfs.  However, the deprecation warnings were only added in
2.12.53 and slated for removal in 2.15, but were not backported to
2.12 LTS, and there hasn't been an LTS release since then, so it is
better bump removal so the upcoming 2.15 LTS release includes them.

Fix the test scripts to only use the new parameter names, to avoid
spurious warning messages.  We don't test interop with 2.3 anymore.

Test-Parameters: trivial
Fixes: 7df7347b7b18 ("LU-12967 ofd: restore sync_on_lock_cancel tunable")
Fixes: 493cd8088388 ("LU-8066 osd: migrate from proc to sysfs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie548e5b6af5463959fb4774e31996097373ebbe5
Reviewed-on: https://review.whamcloud.com/45246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15104 tests: create client dirs with custom stripe count

Random ha_dir_stripe_count allows clients to create directories
with different stripe counts in one ha.sh test session.
Please use
DSTRIPECOUNT=3 DSTRIPECOUNTRAND=true
to create the directories with stripe counts 1,2 or 3.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10435
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I0092958a9bcf1991adc9f45ac1fbed9340a06c57
Reviewed-on: https://review.whamcloud.com/45240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14195 obdclass: change list_sort to use const pointers

Kernel 5.10.70 commit 4f0f586bf0c898233d8f316f471a21db2abd522d
defines the list_cmp_func_t type and changes the comparison
function types of all list_sort() callers to use const pointers
to avoid type mismatches.

Change-Id: I40d37ec0f0d485d0deebaa9dc3493f2865f76ec9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45219
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15081 vfs: set_nlink() is not race-safe

set_nlink() is not atomic wrt race with itself and
the following warning may be triggered by VFS:

WARNING: CPU: 5 PID: 195090 at fs/inode.c:241 __destroy_inode+0xdb/0xf0

It does not seem important what exact nlink value is the result
of the race. However, we need to protect the superblock remove
counter.

Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-9825
Change-Id: I67bc345b9a9e43fb88d919a83246759d11604b03
Reviewed-on: https://review.whamcloud.com/45191
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15045 utils: fix lfs_migrate for files with spaces

Fix the lfs_migrate script to properly quote "$OLDNAME" so that
it works for filenames with spaces and other characters in them.

Test-Parameters: trivial
Fixes: 8bedfa377fbd ("LU-11510 lfs: migrate a composite layout file correctly")
Fixes: 128137adfc53 ("LU-13090 utils: fix lfs_migrate -p for file with pool")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic00f41f3a91ad9dfa491ff57768a3da0c6300c1e
Reviewed-on: https://review.whamcloud.com/45173
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>