Whamcloud - gitweb
fs/lustre-release.git
5 years ago LU-8293 lnet: Add insserv header to lnet init script 35/20835/7
Chris Horn [Thu, 16 Jun 2016 17:57:37 +0000 (12:57 -0500)]
LU-8293 lnet: Add insserv header to lnet init script

The lnet init script does not contain header information as described
by the insserv man page. This header information is needed to ensure
the lnet init script is not run until openibd has been able to start.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6f778e827f88ce34199dff70be5d5089f0ba51b9
Reviewed-on: https://review.whamcloud.com/20835
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10932 libcfs: properly handle failure cases in SMP code 85/32085/2
James Simmons [Thu, 19 Apr 2018 17:06:53 +0000 (13:06 -0400)]
LU-10932 libcfs: properly handle failure cases in SMP code

While pushing the LU-8703 and LU-7734 work upstream some bugs were
pointed out by Dan Carpenter in the code. Due to the single err label
in cfs_cpt_table_alloc() and cfs_cpu_init() a few items were being
cleaned up that were never initialized. This can lead to crashes
and other problems. In those initialization functions introduce
individual labels to jump to, so only the things initialized get freed
on failure. Lastly in cfs_cpt_table_alloc() handle the failure
case instead of the passed case, which is the preferred style.
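
The per-step label pattern described above can be sketched in user-space C.
This is only an illustration under stated assumptions: struct demo_table,
demo_alloc() and the fault-injection counters are hypothetical names made up
for the sketch, not the libcfs code.

```c
#include <stdlib.h>

/* Hypothetical table with three separately allocated members. */
struct demo_table {
	int *cpus;
	int *nodes;
	int *parts;
};

/* Fault-injection hook: the Nth allocation (1-based) fails; 0 disables it. */
static int demo_fail_at;
static int demo_alloc_calls;
static int demo_live_allocs;

static void *demo_alloc(size_t size)
{
	void *p;

	if (++demo_alloc_calls == demo_fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live_allocs--;
		free(p);
	}
}

/* Each failure jumps to the label that frees only what already exists. */
struct demo_table *demo_table_alloc(size_t n)
{
	struct demo_table *tab = demo_alloc(sizeof(*tab));

	if (!tab)
		return NULL;
	tab->cpus = demo_alloc(n * sizeof(*tab->cpus));
	if (!tab->cpus)
		goto err_free_table;
	tab->nodes = demo_alloc(n * sizeof(*tab->nodes));
	if (!tab->nodes)
		goto err_free_cpus;
	tab->parts = demo_alloc(n * sizeof(*tab->parts));
	if (!tab->parts)
		goto err_free_nodes;
	return tab;

err_free_nodes:
	demo_free(tab->nodes);
err_free_cpus:
	demo_free(tab->cpus);
err_free_table:
	demo_free(tab);
	return NULL;
}

/* Release a fully constructed table. */
void demo_table_free(struct demo_table *tab)
{
	if (!tab)
		return;
	demo_free(tab->parts);
	demo_free(tab->nodes);
	demo_free(tab->cpus);
	demo_free(tab);
}
```

Setting demo_fail_at to 1 through 4 makes each allocation fail in turn; in
every case demo_live_allocs should return to zero, because no label touches
a member that was never allocated.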

Change-Id: I969f4e327888042a517bc321b68d55fc691b074e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32085
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-6349 mds: remove obsolete MDS_VTX_BYPASS flag 84/31984/3
Andreas Dilger [Fri, 13 Apr 2018 03:44:52 +0000 (21:44 -0600)]
LU-6349 mds: remove obsolete MDS_VTX_BYPASS flag

The MDS_VTX_BYPASS flag is only set and never checked.  This is true
since 2.3.53-66-g54fe979 "LU-2216 mdt: remove obsolete DNE code", but
it was already obsolete for a long time before that.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia9510146326e71a134fef85dd8febf7b752cab07
Reviewed-on: https://review.whamcloud.com/31984
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4684 ptlrpc: add dir migration connect flag 14/31914/6
Lai Siyao [Tue, 3 Apr 2018 02:55:18 +0000 (10:55 +0800)]
LU-4684 ptlrpc: add dir migration connect flag

Add dir migration connect flag to prevent collision with other
features. Though dir migration code exists, it will be reworked,
and the new RPC protocol won't be compatible with the current one.

Also handle the previously-added OBD_CONNECT2_FLR flag.

Test-Parameters: trivial
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ie3eaec8602664170b52e3778b76059a0899ee7b1
Reviewed-on: https://review.whamcloud.com/31914
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
5 years ago LU-9019 libcfs: remove the remaining cfs_time wrappers 68/31068/8
James Simmons [Thu, 12 Apr 2018 14:00:11 +0000 (10:00 -0400)]
LU-9019 libcfs: remove the remaining cfs_time wrappers

In the lustre code various small bits are left of the libcfs time
wrappers. Remove all the remaining wrappers except for the inline
function cfs_time_seconds(). For cfs_time_seconds() we have to
move to nsec_to_jiffies() since msecs_to_jiffies() has become an
inline function in jiffies.h which means HZ can be different on
the installation node versus what the target node is configured
for. Since nsec_to_jiffies() is a normal built in kernel function
we can avoid the LU-5443 issues. For cfs_duration_secs() we use
the internal kernel function jiffies_to_msec() since only the
newest kernels have jiffies_to_nsec() and that function is just
a wrapper around jiffies_to_msec(). All the rest of the libcfs
time abstractions are gone.
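
The arithmetic behind a seconds-to-ticks conversion routed through
nanoseconds can be sketched as follows. This is a user-space illustration,
not the kernel implementation: the demo_* names are hypothetical, and the
tick rate is passed as a parameter to stand in for the HZ value of the
kernel that actually runs the code, as opposed to a value compiled in on
the build node.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Convert nanoseconds to ticks for a tick rate chosen at run time.
 * Assumes hz divides one second evenly (true for 100, 250 and 1000). */
uint64_t demo_nsecs_to_ticks(uint64_t nsecs, unsigned int hz)
{
	return nsecs / (DEMO_NSEC_PER_SEC / hz);
}

/* Seconds-to-ticks helper built on the nanosecond conversion. */
uint64_t demo_seconds_to_ticks(uint64_t seconds, unsigned int hz)
{
	return demo_nsecs_to_ticks(seconds * DEMO_NSEC_PER_SEC, hz);
}
```

With these definitions two seconds map to 500 ticks at hz = 250 and to 2000
ticks at hz = 1000, which is the kind of mismatch that appears when the
build node and the target node disagree on HZ.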

Change-Id: I166973071304f1d55a15b1e21fcfbe434ff58199
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31068
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-9780 tests: Testing Round-Robin allocation 75/28075/5
Ashish Maurya [Fri, 16 Feb 2018 17:38:00 +0000 (20:38 +0300)]
LU-9780 tests: Testing Round-Robin allocation

Add a test for the fix in LU-977, which shows that
in the absence of protection for lqr_start_idx there is a
possibility of imbalance when allocating objects on OSTs
with the round-robin algorithm.

This test checks the protection of lqr_start_idx by using
a new reproducer, rr_alloc which uses MPI to create files in
parallel, and checking the even file distribution over OSTs.

The distribution check formula is adjusted to match the implementation,
i.e. factors such as exhaustion of pre-created objects and reseeding of
the counter 'lqr_start_count' (lod_qos.c) are also accounted for, so that
these factors do not skew the result.
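
A single-threaded sketch of the kind of evenness check described above,
assuming a simple "fullest minus emptiest" criterion. The demo_* names and
the slack parameter are hypothetical; the real test's formula and the
concurrent update of lqr_start_idx are not reproduced here.

```c
#include <stddef.h>

/* Place nfiles objects round-robin over n targets, continuing from
 * *start_idx. */
void demo_rr_alloc(unsigned int *counts, size_t n, size_t *start_idx,
		   unsigned int nfiles)
{
	for (unsigned int i = 0; i < nfiles; i++) {
		counts[*start_idx % n]++;
		*start_idx = (*start_idx + 1) % n;
	}
}

/* Return 1 when the fullest and emptiest targets differ by at most slack. */
int demo_is_balanced(const unsigned int *counts, size_t n, unsigned int slack)
{
	unsigned int min = counts[0], max = counts[0];

	for (size_t i = 1; i < n; i++) {
		if (counts[i] < min)
			min = counts[i];
		if (counts[i] > max)
			max = counts[i];
	}
	return max - min <= slack;
}
```

Ten files over three targets give counts of 4, 3 and 3, which passes with a
slack of 1. An unprotected start index updated by several writers at once
could repeat or skip targets and push the spread beyond such a bound.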

Test-Parameters: trivial osscount=3 clientcount=6 envdefinitions=ONLY=rr_alloc testlist=parallel-scale
Signed-off-by: Ashish Maurya <ashish.maurya@seagate.com>
Signed-off-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-2723
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vikram Jadhav <jadhav.vikram@seagate.com>
Change-Id: I55f798c6dc8e607f002365f4a22ccf59a454fe1d
Reviewed-on: https://review.whamcloud.com/28075
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-3665 tests: Cleanup echo client after obdfilter-survey 50/9350/25
Nathaniel Clark [Fri, 21 Feb 2014 17:53:36 +0000 (12:53 -0500)]
LU-3665 tests: Cleanup echo client after obdfilter-survey

Some failures of obdfilter-survey do not cause an error in
obdfilter-survey.sh. In some cases obdfilter-survey did not
clean up echo clients it had created, and that could hang
umount of OSTs.
Change test-framework.sh::cleanupall to remove echo clients before
trying to umount to prevent the echo clients from holding the OST or
MDS/MGS open forever.

Test-Parameters: trivial testlist=obdfilter-survey osscount=1 ostcount=2 mdscount=1 mdtcount=1
Test-Parameters: trivial testlist=obdfilter-survey mdtfilesystemtype=zfs ostfilesystemtype=zfs osscount=1 ostcount=2 mdscount=1 mdtcount=1
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I63ae59da84101c782aa9d5e7216cce3b3b1ff2fe
Reviewed-on: https://review.whamcloud.com/9350
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-6655 ptlrpc: skip delayed replay requests 05/23205/15
Hongchao Zhang [Mon, 23 Apr 2018 06:35:11 +0000 (14:35 +0800)]
LU-6655 ptlrpc: skip delayed replay requests

During recovery, there could be some delayed replay requests
after the final recovery completion ping request was handled,
and they should be skipped.

Change-Id: Ie0d5ff92c75f9d078b8ae28e899d4a821113194f
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/23205
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-1644 mgc: remove obsolete IR swabbing workaround 87/32087/3
Andreas Dilger [Tue, 17 Apr 2018 04:53:02 +0000 (22:53 -0600)]
LU-1644 mgc: remove obsolete IR swabbing workaround

The OBD_CONNECT_MNE_SWAB check was added to the MGC for compatibility
with servers in the 2.2.0-2.2.55 range (in 2012) with big-endian
clients.  2.2 was not an LTS release and is no longer being used.

Remove the checks on the client for OBD_CONNECT_MNE_SWAB being set,
and assume that the server does not have this bug.  This will allow
the removal of the rest of this workaround from the server code once
there are no more clients depending on the presence of this flag.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie128745e3af8e9d41454e095922e0f6bc8fcab07
Reviewed-on: https://review.whamcloud.com/32087
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10910 mdd: deny layout swap for DoM file 44/32044/5
Mikhail Pershin [Wed, 18 Apr 2018 07:17:10 +0000 (10:17 +0300)]
LU-10910 mdd: deny layout swap for DoM file

Layout swap is prohibited for DoM files until LU-10177
is implemented. The only exception is the new layout
having the same DoM component.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I8b9deaea7aab0d5694a4c5d9fe2f9f36d2cdd382
Reviewed-on: https://review.whamcloud.com/32044
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10909 utils: enable non-shared libmount_utils_ldiskfs.a 70/31970/5
Alex Zhuravlev [Thu, 12 Apr 2018 08:32:32 +0000 (11:32 +0300)]
LU-10909 utils: enable non-shared libmount_utils_ldiskfs.a

It depends on libcfs.a (due to cfs_abs_path()) so that library must
come after it in the list of libraries provided to the linker.

Test-Parameters: trivial
Change-Id: Id3026705ebac65b74d1ea48964c25aa6c4834552
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31970
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-930 doc: include status in "lfs df" man page 98/31998/3
Andreas Dilger [Fri, 13 Apr 2018 21:30:47 +0000 (15:30 -0600)]
LU-930 doc: include status in "lfs df" man page

Describe the "DRSI" target states in the lfs-df.1 man page.

Improve the overall description of how "lfs df [-i]" works, with
estimation of total/free inodes for ZFS, and show actual output.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I2ae4190c9ebd3467b5243df89938501d301cab07
Reviewed-on: https://review.whamcloud.com/31998
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Joseph Gmitter <joseph.gmitter@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10903 obdecho: use OBD_ALLOC_LARGE for lnb 64/31964/2
Andreas Dilger [Wed, 11 Apr 2018 18:13:34 +0000 (12:13 -0600)]
LU-10903 obdecho: use OBD_ALLOC_LARGE for lnb

When allocating the niobuf_local, if there are a large number of
(potential) fragments this allocation can be quite large.  Use
OBD_ALLOC_LARGE(lnb) and OBD_FREE_LARGE(lnb) to avoid allocation
errors and console noise.  This was causing sanity test_180c to
fail in a VM on occasion, and could also be a problem in real use.

Tidy up test_180[abc] to use stack_trap to handle echo device
cleanup rather than having a series of conditional checks.
Use "error" to report errors rather than error numbers.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib0e721f58fc0a62acca29b31fb4dfac5021cab07
Reviewed-on: https://review.whamcloud.com/31964
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10886 build: fix warnings during autoconf 04/31904/2
Andreas Dilger [Thu, 26 Jun 2014 23:53:52 +0000 (17:53 -0600)]
LU-10886 build: fix warnings during autoconf

Quiet warnings in configure checks to avoid confusion when debugging
autoconf/configure scripts, and to avoid potential errors during
detection.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I2916a120b57fb9fa529bf7050cf65511233ebbe5
Reviewed-on: https://review.whamcloud.com/31904
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-8200 test: improve sanityn.sh 33c 73/31673/4
Lai Siyao [Fri, 16 Mar 2018 14:36:11 +0000 (22:36 +0800)]
LU-8200 test: improve sanityn.sh 33c

If the transaction is committed before unlock, the lock won't be saved. So
trigger Sync-Lock-Cancel twice in sanityn.sh 33c; it's unlikely to
fail both times.

Test-Parameters: trivial clientcount=2 mdscount=2 mdtcount=4 \
osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs \
testlist=sanityn
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I5c99e5a8261df8c9f463aea7ed67df95baaf3e6c
Reviewed-on: https://review.whamcloud.com/31673
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10699 hsm: simplify mdt_hsm_{add,get}_actions() 85/31385/2
John L. Hammond [Thu, 22 Feb 2018 18:27:14 +0000 (12:27 -0600)]
LU-10699 hsm: simplify mdt_hsm_{add,get}_actions()

Split the mdt_hsm_get_actions() functionality of
hsm_find_compatible_cb() into a new llog callback
(hsm_find_action_cb()). Simplify hsm_find_compatible() since it only
needs to handle adding new requests. Simplify the MDS_HSM_ACTIONS
handling code.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie1e7eac85397d82ad9cc4790582bf8ba51713e70
Reviewed-on: https://review.whamcloud.com/31385
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-8066 osc: move suitable values from procfs to sysfs 62/30962/6
Oleg Drokin [Thu, 12 Apr 2018 02:19:22 +0000 (22:19 -0400)]
LU-8066 osc: move suitable values from procfs to sysfs

All single-value controls are moved from /proc/fs/lustre/osc/.../
to /sys/fs/lustre/osc/.../

Linux-commit : aab38b00ac19347bf982cf42c71aab14a9301dee

Change-Id: I69744804bc424ab171da8f649336a3ad450f0d05
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30962
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-2711 tests: conf-sanity clean up environment in test 03/30703/5
James Nunez [Wed, 3 Jan 2018 22:16:34 +0000 (15:16 -0700)]
LU-2711 tests: conf-sanity clean up environment in test

Several Lustre test suites execute code or set variables
between tests even if the preceding test is not executed.
In the case of conf-sanity, the code between tests 23a and
23b and between 43b and 44 is there to clean up the environment
from the previous test. Properly clean up the file system
configuration/environment at the end of test 23a and 43b.

Test-Parameters: trivial testlist=conf-sanity

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I250bb2067d0c966f866b404025dc6240b16f301e
Reviewed-on: https://review.whamcloud.com/30703
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5170 lfs: Standardize error messages in lfs_path2fid() 70/30670/3
Steve Guminski [Wed, 12 Jul 2017 19:07:27 +0000 (15:07 -0400)]
LU-5170 lfs: Standardize error messages in lfs_path2fid()

Error messages in lfs_path2fid() are updated to a standard format.
Messages are prefixed with the name of the utility and the command
that caused the error.  User-provided values are delimited with
single quotes.
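
A minimal sketch of a message builder that follows the convention described
above. The exact layout ("utility command: what 'value': reason") and the
name demo_format_error() are assumptions made for illustration; they are
not taken from the lfs source.

```c
#include <stdio.h>
#include <string.h>

/* Build "<utility> <command>: <what> '<value>': <reason>" into buf.
 * The layout is an assumption for this sketch; errnum is turned into
 * text with strerror(). Returns what snprintf() returns. */
int demo_format_error(char *buf, size_t len, const char *utility,
		      const char *command, const char *what,
		      const char *value, int errnum)
{
	return snprintf(buf, len, "%s %s: %s '%s': %s",
			utility, command, what, value, strerror(errnum));
}
```

Called with "lfs", "path2fid", "cannot get fid for", a user-supplied path
and an errno value, it yields a single line that starts with the utility
and command and quotes the path.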

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I32ab8650ddced9569837aa3d106ef1708c974bce
Reviewed-on: https://review.whamcloud.com/30670
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-5170 lfs: Standardize error messages in lfs_fid2path() 68/30668/3
Steve Guminski [Wed, 12 Jul 2017 19:03:57 +0000 (15:03 -0400)]
LU-5170 lfs: Standardize error messages in lfs_fid2path()

Error messages in lfs_fid2path() are updated to a standard format.
Messages are prefixed with the name of the utility and the command
that caused the error.  User-provided values are delimited with
single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I124f3e5bfad120abe701dce592da6005d53112c5
Reviewed-on: https://review.whamcloud.com/30668
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-5170 lfs: Standardize error messages in lfs_data_version() 67/30667/3
Steve Guminski [Wed, 12 Jul 2017 19:15:22 +0000 (15:15 -0400)]
LU-5170 lfs: Standardize error messages in lfs_data_version()

Error messages in lfs_data_version() are updated to a standard
format.  Messages are prefixed with the name of the utility and the
command that caused the error.  User-provided values are delimited
with single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Idb0939c4fa23fc409d965183e5fe5dddcab6da4f
Reviewed-on: https://review.whamcloud.com/30667
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-10825 lnet: add ip2nets syntax handling for peer 86/31786/4
Amir Shehata [Fri, 23 Mar 2018 17:04:19 +0000 (10:04 -0700)]
LU-10825 lnet: add ip2nets syntax handling for peer

Allow peers to be added using ip2nets syntax, from
both command line and YAML block.

Command line example:
lnetctl peer add --ip2nets 10.10.10.[3-6,9]@tcp
lnetctl peer del --ip2nets 10.10.10.[3-6,9]@tcp

YAML example:
peer:
    ip2nets:
      - nid: 30.10.10.[3-8]@tcp
      - nid: 40.10.10.[9-40]@tcp

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I904be8d496ad2be277c3d21dc7f72cbc7ed02b50
Reviewed-on: https://review.whamcloud.com/31786
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10825 libcfs: generate ip addresses 85/31785/4
Amir Shehata [Mon, 19 Mar 2018 19:37:57 +0000 (12:37 -0700)]
LU-10825 libcfs: generate ip addresses

Add infrastructure API cfs_ip_addr_range_gen() to generate a
maximum of 'count' IP addresses from an expression.
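
To illustrate what generating a bounded number of addresses from an
expression can look like, here is a user-space sketch that expands a
bracketed expression such as 10.10.10.[3-6,9] into dotted-quad strings.
It is not cfs_ip_addr_range_gen(): the demo_* names, the signature and the
accepted grammar (plain octets or bracketed lists of values and ranges,
without the @net suffix) are assumptions for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_OCTETS 4
#define DEMO_ADDR_LEN 16	/* "255.255.255.255" plus the terminator */

/* Parse one octet field, either "7" or "[3-6,9]", into a membership map.
 * Returns a pointer just past the field, or NULL on a syntax error. */
static const char *demo_parse_octet(const char *s, unsigned char set[256])
{
	int bracketed = (*s == '[');

	memset(set, 0, 256);
	if (bracketed)
		s++;
	for (;;) {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0 || lo > 255)
			return NULL;
		s = end;
		if (bracketed && *s == '-') {
			hi = strtol(s + 1, &end, 10);
			if (end == s + 1 || hi < lo || hi > 255)
				return NULL;
			s = end;
		}
		for (long v = lo; v <= hi; v++)
			set[v] = 1;
		if (!bracketed || *s != ',')
			break;
		s++;
	}
	if (bracketed && *s++ != ']')
		return NULL;
	return s;
}

/* Expand expr into at most max addresses, in ascending order.
 * Returns the number written, or -1 on a syntax error. */
int demo_expand_ip_expr(const char *expr, char out[][DEMO_ADDR_LEN], int max)
{
	unsigned char set[DEMO_OCTETS][256];
	const char *s = expr;
	int n = 0;

	for (int i = 0; i < DEMO_OCTETS; i++) {
		s = demo_parse_octet(s, set[i]);
		if (!s)
			return -1;
		if (i < DEMO_OCTETS - 1 && *s++ != '.')
			return -1;
	}
	if (*s != '\0')
		return -1;

	for (int a = 0; a < 256; a++) {
		if (!set[0][a])
			continue;
		for (int b = 0; b < 256; b++) {
			if (!set[1][b])
				continue;
			for (int c = 0; c < 256; c++) {
				if (!set[2][c])
					continue;
				for (int d = 0; d < 256; d++) {
					if (!set[3][d])
						continue;
					if (n == max)
						return n;
					snprintf(out[n++], DEMO_ADDR_LEN,
						 "%d.%d.%d.%d", a, b, c, d);
				}
			}
		}
	}
	return n;
}
```

The max argument plays the role of the 'count' cap: expansion stops as soon
as that many addresses have been written, so an overly broad expression
cannot produce an unbounded list.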

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I479dd7128eef404106a7863124a38c501150ba9e
Reviewed-on: https://review.whamcloud.com/31785
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-8066 ldlm: move all remaining files from procfs to debugfs 55/29255/7
Dmitry Eremin [Thu, 8 Mar 2018 14:44:11 +0000 (09:44 -0500)]
LU-8066 ldlm: move all remaining files from procfs to debugfs

Move all files except stats. It will be moved later, after the type
of the obddev->obd_proc_entry member is changed.

Linux-commit: 700815d47f9da0477229f009b6fa235f20da1e21

Clean up the helper functions used to implement "dump_granted_max" in
debugfs.

Change-Id: Id0a0b2c663295c7164c52f89d81525dbefbb992a
Signed-off-by: Dmitry Eremin <dmiter4ever@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29255
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-8066 mgc: migrate away from procfs 50/29250/16
James Simmons [Thu, 12 Apr 2018 02:22:32 +0000 (22:22 -0400)]
LU-8066 mgc: migrate away from procfs

Move mgc to using the new sysfs ping and conn_uuid. Move all
sysfs/debugfs handling to lprocfs_mgc.c and implement proper
error handling.

Change-Id: Iecbfac0afedaef9a2b6be33ce026d61008a33136
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29250
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
5 years ago LU-5170 lfs: Standardize llapi messages for lfs_find() 97/28997/2
Steve Guminski [Thu, 10 Aug 2017 13:37:00 +0000 (09:37 -0400)]
LU-5170 lfs: Standardize llapi messages for lfs_find()

Error messages in the llapi functions called by lfs_find()
are updated to a standard format.  Messages are prefixed with
the name of the utility and the command that caused the error.
User-provided values are delimited with single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Ia324887964b4a89cc5d4f92d3f3a7fa421e03dca
Reviewed-on: https://review.whamcloud.com/28997
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-5170 lfs: Standardize error messages in lfs_mv() 39/28239/3
Steve Guminski [Wed, 12 Jul 2017 13:56:44 +0000 (09:56 -0400)]
LU-5170 lfs: Standardize error messages in lfs_mv()

Error messages in lfs_mv() are updated to a standard format.
Messages are prefixed with the name of the utility and the command
that caused the error.  User-provided values are delimited with
single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Id3d276024a2dcd1ca9169123f9d9193a846d7c85
Reviewed-on: https://review.whamcloud.com/28239
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-5170 lfs: Standardize error messages in set_time() 34/28234/3
Steve Guminski [Wed, 12 Jul 2017 11:35:12 +0000 (07:35 -0400)]
LU-5170 lfs: Standardize error messages in set_time()

Error and warning messages in set_time() are updated to a standard
format.  Messages are prefixed with the name of the utility and the
command that caused the error.  User-provided values are delimited
with single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I4b5efa394b49f4a0af0c073ad707c2b8c6faf6b0
Reviewed-on: https://review.whamcloud.com/28234
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
5 years ago LU-10214 lnet: allow expressions for route config 11/30511/4
Amir Shehata [Thu, 12 Apr 2018 03:24:43 +0000 (20:24 -0700)]
LU-10214 lnet: allow expressions for route config

Support the ip2nets syntax for route gateway configuration. Only support
a maximum of 128 gateways per configuration command. This upper limit is
to prevent a large number of routes from being configured by mistake. This
feature is available from both command line and YAML interfaces.

Command line examples:
lnetctl route add --net tcp1 --gateway 12.3.4.[2-6]@tcp
lnetctl route del --net tcp1 --gateway 12.3.4.[2-6]@tcp

YAML examples:
route:
    - net: tcp1
      gateway: 12.3.4.[2-20]@tcp

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I0d8a0f94cce7140602a64f13f0401ef209f3ca57
Reviewed-on: https://review.whamcloud.com/30511
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-10926 llite: avoid panic in ll_set_acl() 45/32045/2
James Simmons [Wed, 18 Apr 2018 14:28:48 +0000 (10:28 -0400)]
LU-10926 llite: avoid panic in ll_set_acl()

While pushing the xattr work upstream Dan Carpenter noticed a bug
in ll_set_acl() that could panic a node. The problem is that in the
default case the code jumps to out:, which calls the function
forget_cached_acl(). If you call forget_cached_acl() with a type
not related to ACLs it will crash the node via a BUG(). To avoid
this, just test rc after the switch block and return, since neither
the ACL nor the xattr has been modified.
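
The control flow described above can be sketched in user-space C as follows.
The enum, demo_forget_cached() and demo_set_acl() are hypothetical stand-ins,
not the llite code; the counter exists only so the behaviour can be observed.

```c
#include <errno.h>

enum demo_acl_type { DEMO_ACL_ACCESS, DEMO_ACL_DEFAULT };

static int demo_forget_calls;

/* Stand-in for a cache-invalidation helper that must only ever be
 * called with a valid ACL type. */
static void demo_forget_cached(int type)
{
	(void)type;
	demo_forget_calls++;
}

int demo_set_acl(int type)
{
	int rc = 0;

	switch (type) {
	case DEMO_ACL_ACCESS:
	case DEMO_ACL_DEFAULT:
		break;
	default:
		rc = -EINVAL;
		break;
	}
	/* Nothing has been modified yet, so return before the cleanup
	 * path instead of jumping to it with an invalid type. */
	if (rc)
		return rc;

	/* ... the xattr update would happen here ... */

	demo_forget_cached(type);
	return rc;
}
```

An unknown type returns -EINVAL without ever reaching the invalidation
helper, while the two valid types fall through to it as before.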

Change-Id: Ib6f3e5e818635c56cc4abbf37e7c63e6897a72a6
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32045
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
5 years ago New tag 2.11.51 2.11.51 v2_11_51 v2_11_51_0
Oleg Drokin [Sat, 21 Apr 2018 03:38:47 +0000 (23:38 -0400)]
New tag 2.11.51

Change-Id: I734b61e40f2565433f2c84b33ad108e32e7ead82
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10901 build: Update ZFS/SPL to 0.7.8 49/31949/4
Nathaniel Clark [Wed, 11 Apr 2018 13:27:57 +0000 (09:27 -0400)]
LU-10901 build: Update ZFS/SPL to 0.7.8

This skips 0.7.7 due to a regression fixed in 0.7.8.

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.8
https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.7

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Icd6a21793a5f7502c88121def8d86d9aa48c6ae8
Reviewed-on: https://review.whamcloud.com/31949
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10560 libcfs: Fix inconsistent declaration for cfs_kernel_write() 48/31948/7
Mike Marciniszyn [Wed, 11 Apr 2018 12:20:05 +0000 (05:20 -0700)]
LU-10560 libcfs: Fix inconsistent declaration for cfs_kernel_write()

The header file buffer should be const void * to match the
implementation.

Test-Parameters: trivial

Change-Id: I3a09b84e4c3dbe39870be04ae98c13e9e6221a6d
Fixes: b9a32054600a8d63948cced361191aa6ae7ea8f2
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31948
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10869 build: package configuration files for Ubuntu / Debian 50/31850/6
James Simmons [Fri, 6 Apr 2018 16:06:35 +0000 (12:06 -0400)]
LU-10869 build: package configuration files for Ubuntu / Debian

For a long time Lustre never added /etc configuration files to its
Debian packages. It could get away with that, but now you can see it
fail conf-sanity test 76a. This test exercises lctl set_param -P, which
uses udev events to set the tunables for Lustre. In order for it
to work a default udev rule has to be added to 99-lustre.rules.
Besides the missing 99-lustre.rules, add in all the other files used
for configuration. Lastly, create conffile, which is the way Debian
handles potential stomping of configuration files. When installing
with apt-get install it will ask the person installing if they
want to override specific files.

Test-Parameters: clientdistro=ubuntu1604 trivial testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=76a testlist=conf-sanity

Change-Id: Ic0aaf2bba531ce23a3e23ef070a1501032ad1c9f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31850
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10805 lnet: switch to module_param 88/31788/3
Li Dongyang [Tue, 27 Mar 2018 05:55:51 +0000 (16:55 +1100)]
LU-10805 lnet: switch to module_param

Since the Linux 4.15 kernel the set/get function prototypes
for module_param_call have changed. This triggers a compile
error:
  CC [M]  lustre-release/lnet/lnet/api-ni.o
In file included from lustre-release/lnet/lnet/api-ni.c:38:0:
include/linux/moduleparam.h:232:22: error:
initialization from incompatible pointer type [-Werror]
  static const struct kernel_param_ops __param_ops_##name =
We can switch to module_param and cfs_kernel_param_arg_t since
they are already available in libcfs.

Linux-commit: b2f270e8747387335d80428c576118e7d87f69cc

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie6f0f3a31fd22904e03e8300f45f0a8684265abd
Reviewed-on: https://review.whamcloud.com/31788
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10841 ldlm: ASSERTION(lock->l_granted_mode!=lock->l_req_mode) 26/31726/4
Andriy Skulysh [Mon, 19 Feb 2018 09:02:36 +0000 (11:02 +0200)]
LU-10841 ldlm: ASSERTION(lock->l_granted_mode!=lock->l_req_mode)

Policy processors can unlock the resource to send a BL AST,
so the cached next list entry can become invalid.

Move sending BL ASTs to ldlm_reprocess_queue()
in case of LDLM_PROCESS_RECOVERY.

Cray-bug-id: LUS-5689
Change-Id: Ib9b757576461b2f74aaa916b4b62538a9abfa0dd
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31726
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10699 hsm: remove struct hsm_compat_data_cb 82/31382/2
John L. Hammond [Thu, 22 Feb 2018 17:19:16 +0000 (11:19 -0600)]
LU-10699 hsm: remove struct hsm_compat_data_cb

Remove uses of struct hsm_compat_data_cb from hsm_find_compatible()
and its llog callback.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I33addc2bfd983046865286bc2c3200ed0bfd1380
Reviewed-on: https://review.whamcloud.com/31382
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-5170 lfs: Standardize error messages in lfs_changelog() 65/30665/3
Steve Guminski [Wed, 12 Jul 2017 19:00:25 +0000 (15:00 -0400)]
LU-5170 lfs: Standardize error messages in lfs_changelog()

Error messages in lfs_changelog() are updated to a standard format.
Messages are prefixed with the name of the utility and the command
that caused the error.  User-provided values are delimited with
single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Ib9229d13a6fa4e76ee407de33f181465adc879b6
Reviewed-on: https://review.whamcloud.com/30665
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
6 years agoLU-10732 tests: sanity-lfsck to reset speed limit 88/31488/10
Fan Yong [Mon, 9 Apr 2018 09:32:36 +0000 (17:32 +0800)]
LU-10732 tests: sanity-lfsck to reset speed limit

- sanity-lfsck should reset speed limit on all servers.
- Change MDS/OST size to 300M as 100M is way too small for
  ZFS and mkfs time shouldn't be a problem for ldiskfs now
- Enable test_9a
- More accurate check for test_30.

Change-Id: I6129c1248a5bb51004d446b858c2ffba1354e74d
Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31488
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-9586 tests: remove replay-dual 15c from ALWAYS_EXCEPT 96/27396/9
James Nunez [Fri, 16 Feb 2018 17:11:01 +0000 (10:11 -0700)]
LU-9586 tests: remove replay-dual 15c from ALWAYS_EXCEPT

replay-dual test 15c "remove multiple OST orphans" was added
to the ALWAYS_EXCEPT list because of bugzilla 10124. Remove
replay-dual test 15c from ALWAYS_EXCEPT list.

Test-Parameters: trivial testlist=replay-dual
Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I88866e485fdb037575ee4a72ed38aaa4b465df7d
Reviewed-on: https://review.whamcloud.com/27396
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10889 ptlrpc: update req timeout if resending happened 10/31910/4
Alexander Boyko [Mon, 9 Apr 2018 11:48:09 +0000 (07:48 -0400)]
LU-10889 ptlrpc: update req timeout if resending happened

When the server drops duplicate request processing, the client and
the server have different deadlines for the same request. The server
operates with the first copy and the client operates with the second.

This patch adds request deadline updates if a duplicate request is
found.

A fix for LU-8420 changed the lock callback prolong calculation to use
the request deadline when the service estimate has changed since the
request was created. Using an outdated deadline may cause an
insufficient prolong timeout and subsequent client eviction.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I55725d396f50d864687248df46e7882290fc21ca
Cray-bug-id: MRP-3720 MRP-4289
Reviewed-on: https://review.whamcloud.com/31910
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10887 lfsck: offer shard's mode when re-create it 15/31915/3
Fan Yong [Mon, 9 Apr 2018 14:11:31 +0000 (22:11 +0800)]
LU-10887 lfsck: offer shard's mode when re-create it

The namespace will re-create the lost shard of a broken striped
directory if the "-C" option is specified. In that case, the @mode
parameter should be given properly.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I2441d8ca83c932a34ef1971d334f54ecd7343b27
Reviewed-on: https://review.whamcloud.com/31915
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10368 mdc: resend quotactl if needed 73/31773/2
Hongchao Zhang [Wed, 21 Mar 2018 15:21:08 +0000 (23:21 +0800)]
LU-10368 mdc: resend quotactl if needed

In mdc_quotactl, it is better to resend the quotactl request
if reconnection or failover is triggered during the process.

Change-Id: I64f96863a6f10026aa69cba3c59095966b58b98d
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31773
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10847 doc: fix broken ChangeLog 03/32003/2
Bob Glossman [Mon, 16 Apr 2018 00:26:07 +0000 (17:26 -0700)]
LU-10847 doc: fix broken ChangeLog

Recent kernel update landings changed lines in the wrong
section of the ChangeLog file.
This patch moves the changes to the correct locations.

Test-Parameters: trivial

Change-Id: I8ea3f35a8a8c6a6c7af3b23fa253cc882d0e61f3
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/32003
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10912 mdc: use large xattr buffers for old servers 90/31990/3
John L. Hammond [Fri, 13 Apr 2018 15:57:28 +0000 (10:57 -0500)]
LU-10912 mdc: use large xattr buffers for old servers

Pre 2.10.1 MDTs will crash when they receive a listxattr (MDS_GETXATTR
with OBD_MD_FLXATTRLS) RPC for an orphan or dead object. So for
clients connected to these older MDTs, try to avoid sending listxattr
RPCs by making the bulk getxattr (MDS_GETXATTR with OBD_MD_FLXATTRALL)
more likely to succeed and thereby reducing the chances of falling
back to listxattr.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ia96323c47c91a44495b73be2d95705298c7f7ac9
Reviewed-on: https://review.whamcloud.com/31990
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4939 tests: fix typo in sanity.sh 31/31931/3
Ben Evans [Tue, 10 Apr 2018 14:54:22 +0000 (10:54 -0400)]
LU-4939 tests: fix typo in sanity.sh

LU-4939 was merged with a typo at the top of sanity.sh.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Id9d407cb9ea2e3952f3ac02f5d675e381c66ade7
Reviewed-on: https://review.whamcloud.com/31931
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10834 tests: fix cleanup_77c() defect 18/31918/2
Elena Gryaznova [Mon, 9 Apr 2018 18:39:05 +0000 (21:39 +0300)]
LU-10834 tests: fix cleanup_77c() defect

Properly check that the string length is non-zero.
Without this fix, cleanup_77c() calls:
 "rm -f *" in case of empty $osc_file_prefix, and
 "do_facet ost1 rm -f *" in case of empty $ost_file_prefix

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5817
Test-Parameters: trivial testlist=sanity envdefinitions="ONLY=77c"
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Change-Id: Id3f3ec52c965b15900f7a649120b89267daa9810
Reviewed-on: https://review.whamcloud.com/31918
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10847 kernel update [SLES12 SP2 4.4.120-92.70] 64/31764/3
Bob Glossman [Fri, 23 Mar 2018 22:53:18 +0000 (15:53 -0700)]
LU-10847 kernel update [SLES12 SP2 4.4.120-92.70]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I24892ca2360370ede3ca7f909f2e2b1e2275870a
Reviewed-on: https://review.whamcloud.com/31764
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4423 obdclass: use workqueue for zombie management 07/31707/5
Dmitry Eremin [Wed, 21 Mar 2018 16:56:06 +0000 (19:56 +0300)]
LU-4423 obdclass: use workqueue for zombie management

obdclass currently maintains two lists of data structures (imports and
exports), and a kthread which will free anything on either list.  The
thread is woken whenever anything is added to either list.

This is exactly the sort of thing that workqueues exist for.

So discard the zombie kthread and the lists and locks, and create a
single workqueue.  Each obd_import and obd_export gets a work_struct to
attach to this workqueue.

This requires a small change to import_sec_validate_get() which was
testing if an obd_import was on the zombie list.  This cannot have ever
safely found it to be on the list (as it could be freed asynchronously)
so it must be dead code.

We could use system_wq instead of creating a dedicated zombie_wq, but as
we occasionally want to flush all pending work, it is a little nicer to
only have to wait for our own work items.

Change-Id: I3d0fafcc4d3896ff8760d5d36d8cfa187f86bc7d
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/31707
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-1187 idl: remove obsolete directory split flags 00/31700/2
Andreas Dilger [Tue, 20 Mar 2018 18:55:32 +0000 (12:55 -0600)]
LU-1187 idl: remove obsolete directory split flags

The directory split functionality from the old CMD (pre-DNE)
feature was never usable in production, and was removed before
the DNE 2.4 release.  Remove old flags relating to this feature.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I052ea57843ae67c18706a80221590924077bf46f
Reviewed-on: https://review.whamcloud.com/31700
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10157 lnet: make LNET_MAX_IOV dependent on page size 59/31559/5
James Simmons [Tue, 20 Mar 2018 20:44:56 +0000 (16:44 -0400)]
LU-10157 lnet: make LNET_MAX_IOV dependent on page size

By default LNet always uses 256 pages (LNET_MAX_IOV) and assumes
LNET_MAX_PAYLOAD is always one megabyte. This assumes pages are
always 4K in size, which is not the case, and it causes bulk I/O
errors on platforms like PowerPC or ARM, which tend to use 64K pages.
This is resolved in two steps. First, make LNET_MAX_PAYLOAD always
one megabyte, since this is what configure sets it to by default and
no one ever changes it. In theory it could be set as high as 16MB,
but that would cause I/O errors since the ptlrpc layer expects
packets to always be 1 megabyte in size. It would also be better to
make the maximum payload a per-network configuration setting rather
than a global one. Second, make LNET_MAX_IOV equal to
LNET_MAX_PAYLOAD divided by PAGE_SIZE. This way packets will always
be LNET_MAX_PAYLOAD in size, but the number of pages used,
LNET_MAX_IOV, will vary depending on the platform creating the
packets.
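
The arithmetic described above can be sketched in plain C. This is only
an illustration, not the Lustre code: SKETCH_MAX_PAYLOAD and
sketch_max_iov() are hypothetical names, and the page size is passed in
as a parameter instead of coming from the kernel's PAGE_SIZE.

```c
/* Assumed from the commit text: the payload is fixed at one megabyte. */
#define SKETCH_MAX_PAYLOAD (1024u * 1024u)

/*
 * Number of page-sized fragments needed to carry one full payload.
 * Hypothetical helper standing in for the LNET_MAX_IOV definition.
 */
static unsigned int sketch_max_iov(unsigned int page_size)
{
	return SKETCH_MAX_PAYLOAD / page_size;
}
```

With 4K pages this gives 256 fragments, matching the old fixed value;
with 64K pages it gives 16, so the payload stays at one megabyte.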

Change-Id: Ie1dcdb195e68b44e2fa2d9b24715216d8aca4c65
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31559
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ruth Klundt <rklundt@sandia.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10766 utils: Correct error handling in llapi_dir_create 10/31510/4
Oleg Drokin [Mon, 5 Mar 2018 06:29:16 +0000 (01:29 -0500)]
LU-10766 utils: Correct error handling in llapi_dir_create

Free dirpath if namepath allocation failed.
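
The error-handling pattern named above can be sketched as follows. This
is a minimal userspace illustration, not the llapi_dir_create() source:
every sketch_ name is hypothetical, and the counting allocator exists
only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int sketch_live;    /* copies allocated and not yet freed */
static int sketch_calls;   /* number of sketch_dup() calls so far */
static int sketch_fail_at; /* make the Nth call fail; 0 means never */

/* Hypothetical string duplicator with fault injection. */
static char *sketch_dup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy;

	if (++sketch_calls == sketch_fail_at)
		return NULL;
	copy = malloc(len);
	if (copy != NULL) {
		memcpy(copy, s, len);
		sketch_live++;
	}
	return copy;
}

static void sketch_free(char *p)
{
	if (p != NULL) {
		free(p);
		sketch_live--;
	}
}

/* Two allocations; the second failure must release the first. */
static int sketch_dir_create(const char *path)
{
	char *dirpath;
	char *namepath;
	int rc = 0;

	dirpath = sketch_dup(path);
	if (dirpath == NULL)
		return -ENOMEM;

	namepath = sketch_dup(path);
	if (namepath == NULL) {
		rc = -ENOMEM;
		goto out_dirpath;
	}

	/* The real directory creation work would go here. */
	sketch_free(namepath);
out_dirpath:
	sketch_free(dirpath);
	return rc;
}
```

Setting sketch_fail_at to 2 forces the second allocation to fail;
sketch_live returning to 0 shows that dirpath was released on that path.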

Change-Id: If5c15219a28767e5c89b25a44b2ddaa907c5d12d
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/31510
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-10877 lu: fix reference leak 70/31870/2
Alexey Lyashkov [Wed, 4 Apr 2018 09:28:11 +0000 (12:28 +0300)]
LU-10877 lu: fix reference leak

Fix the reference leak in dt_locate_at introduced in
f6d6a552398eb1e65857d9bf1afaaf98c8dc1a79.
Use a top level object in debug in lu_object_put to match
lu_object_get.

Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I6dbeeb4f70a2914b1606ad0c5586db431a6dcd2c
Reviewed-on: https://review.whamcloud.com/31870
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: Jenkins
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10898 tests: disable failing conf-sanity test 60/31960/2
Andreas Dilger [Wed, 11 Apr 2018 17:45:25 +0000 (11:45 -0600)]
LU-10898 tests: disable failing conf-sanity test

The test_32a and test_32d runs are failing continuously for
ZFS filesystems (module unload problem).  Disable the tests while
we investigate the cause and fix it.

Test-Parameters: trivial testlist=conf-sanity mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia7e1fc7969c41d1fd46d6e9b078093704a1bed83
Reviewed-on: https://review.whamcloud.com/31960
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-10833 tests: missing sed command 04/31704/2
Alex Zhuravlev [Wed, 21 Mar 2018 06:34:38 +0000 (09:34 +0300)]
LU-10833 tests: missing sed command

sed is complaining about a missing command:
       cmd=$(echo $cmd | sed '/-n//')

Change-Id: Id1f0ed65719b58c4a51e9829177dd51996124d25
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31704
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6632 mgs: dont remove EXCLUDE records on lctl replace_nids 21/14921/8
Vladimir Saveliev [Wed, 7 Mar 2018 12:01:49 +0000 (06:01 -0600)]
LU-6632 mgs: dont remove EXCLUDE records on lctl replace_nids

conf-sanity.sh:test_66 is modified to illustrate the problem:
  add EXCLUDE records to the config file. lctl replace_nids removes
  those records, which leads to a mounting problem.
Fix: remove records marked as SKIP instead of EXCLUDE ones.

Change-Id: Ica4b23a74870d8ebcb09b240313df4d4c33bbbde
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Cray-bug-id: MRP-2105
Cray-bug-id: MRP-2766
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Test-Parameters: trivial envdefinitions=ONLY=66 testlist=conf-sanity
Reviewed-on: https://review.whamcloud.com/14921
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10859 ldiskfs: fix deadlock with heavy memory preassure 06/31806/3
Wang Shilong [Wed, 28 Mar 2018 01:03:22 +0000 (09:03 +0800)]
LU-10859 ldiskfs: fix deadlock with heavy memory preassure

At one customer site, we hit the following deadlock:

Thread 1:
ofd_object_punch
 osd_punch
  ldiskfs_truncate
   ldiskfs_inode_attach_jinode
     ...
     do_try_to_free_pages
      lu_cache_shrink
       mutex_lock -->try to hold @lu_sites_guard

kswapd thread2:
kthread
 shrink_slab
  lu_cache_shrink
    mutex_lock ---->hold already.
     ...
     dqget
      ldiskfs_acquire_dquot
       jbd2__journal_start-->blocked to wait for more credits.

Thread3:
kthread
 kjournald2
  jbd2_journal_commit_transaction-->blocked to wait Thread2 finished,
 since Thread1 add a handle into transaction.

So the deadlock happens because Thread1 waits for Thread2, Thread2
waits for Thread3, and Thread3 waits for Thread1.

This problem still exists even though we have switched @lu_sites_guard
to a read/write lock, since we hold the write lock in lu_cache_shrink().

Fix the problem by making ldiskfs_inode_attach_jinode() use
GFP_NOFS.

Test-Parameters: testgroup=review-ldiskfs \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Change-Id: I0ab143fc0cdb8e1b0c490c2c25e8af483c491a81
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/31806
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10845 kernel: kernel update [SLES12 SP3 4.4.120-94.17] 62/31762/2
Bob Glossman [Fri, 23 Mar 2018 20:40:28 +0000 (13:40 -0700)]
LU-10845 kernel: kernel update [SLES12 SP3 4.4.120-94.17]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I45c1a65c5ef5373201d8b2a1c457bfa2b37df058
Reviewed-on: https://review.whamcloud.com/31762
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10560 mdt: remove extra headers from mdt_identity.c 46/31746/2
Li Dongyang [Fri, 23 Mar 2018 04:22:24 +0000 (15:22 +1100)]
LU-10560 mdt: remove extra headers from mdt_identity.c

This avoids a compile problem with 4.14 kernels:
  CC [M]  /root/lustre-release/lustre/mdt/mdt_identity.o
In file included from lustre/mdt/mdt_identity.c:49:0:
./arch/x86/include/asm/uaccess.h: In function ‘set_fs’:
  error: dereferencing pointer to incomplete type
  current->thread.addr_limit = fs;

We don't need to include <asm/uaccess.h>, also clean up
other no longer needed headers.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ibcfcf3a466f7e4428994047c9d11bb557d46f9ab
Reviewed-on: https://review.whamcloud.com/31746
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
6 years agoLU-10835 tests: unload dm-flakey module 39/31739/3
Jian Yu [Thu, 22 Mar 2018 21:47:32 +0000 (14:47 -0700)]
LU-10835 tests: unload dm-flakey module

In test-framework.sh, dm-flakey module is loaded in dm_flakey_supported()
but it's not unloaded anywhere. This patch unloads it in dm_cleanup_dev().

Test-Parameters: trivial

Change-Id: Ie40b8bc1a36a7a271cb333b81965fbe136268db1
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31739
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10826 ptlrpc: fix test_req_buffer_pressure behavior 90/31690/2
Bruno Faccini [Tue, 20 Mar 2018 10:06:47 +0000 (11:06 +0100)]
LU-10826 ptlrpc: fix test_req_buffer_pressure behavior

In the 2nd patch for LU-9372, which allows limiting the number of
rqbd-buffers, a wrong and unnecessary test was added to enhance the
test_req_buffer_pressure feature.
This patch fixes the issue by removing that test.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I8f1298fabbbc4e0d92078dcf49192ff4f0fbc907
Reviewed-on: https://review.whamcloud.com/31690
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8913 nodemap: fix nodemap range format '*@<net>' support 84/31684/2
Emoly Liu [Mon, 19 Mar 2018 12:42:06 +0000 (20:42 +0800)]
LU-8913 nodemap: fix nodemap range format '*@<net>' support

In cfs_ip_min_max(), (nidrange->nr_all == 1) means this nid range
is a full IP address range (*.*.*.*). In this case, we don't need
to compare it to any other nid range, and can set min_nid to 0.0.0.0
and max_nid to 255.255.255.255 directly.
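
The special case can be sketched in a few lines of C. This is a
simplified illustration under stated assumptions, not the
cfs_ip_min_max() source: struct sketch_nidrange, its fields, and
sketch_ip_min_max() are hypothetical, and addresses are plain 32-bit
host-order integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for a parsed nid range. */
struct sketch_nidrange {
	bool	 all; /* true for a '*' range covering every address */
	uint32_t lo;  /* lowest address when not a '*' range */
	uint32_t hi;  /* highest address when not a '*' range */
};

/* Report the minimum and maximum address covered by the range. */
static void sketch_ip_min_max(const struct sketch_nidrange *r,
			      uint32_t *min_ip, uint32_t *max_ip)
{
	if (r->all) {
		/* Full range: no comparison needed. */
		*min_ip = 0x00000000u; /* 0.0.0.0 */
		*max_ip = 0xffffffffu; /* 255.255.255.255 */
		return;
	}
	*min_ip = r->lo;
	*max_ip = r->hi;
}
```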

Also, test_10d is added to sanity-sec.sh to verify this patch and
some code cleanup is done for jt_nodemap_add/del_range().

Change-Id: I72c546b060f9e123204a566a3bd373b4f017502d
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/31684
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10566 test: fix nfs exports clean up 79/31679/5
Minh Diep [Fri, 16 Mar 2018 18:36:21 +0000 (11:36 -0700)]
LU-10566 test: fix nfs exports clean up

NFSv3 cleanup fails, causing subsequent tests to fail.

Test-Parameters: trivial testlist=parallel-scale-nfsv3, parallel-scale-nfsv4

Change-Id: Ic4eb02ea4e4f142753fe36cf110ea0fbec398822
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31679
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 osd: move ext4_tranfer_project to osd 47/31647/6
Yang Sheng [Wed, 14 Mar 2018 08:10:10 +0000 (16:10 +0800)]
LU-10565 osd: move ext4_tranfer_project to osd

Move ext4_tranfer_project from ldiskfs to osd, since
upstream has accepted the other projid patches
except this part.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I6e5acdf68ab9f7bc964d79f29132cee45e2fd3ac
Reviewed-on: https://review.whamcloud.com/31647
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 osd: unify interface for vfs 46/31646/6
Yang Sheng [Fri, 26 Jan 2018 17:26:26 +0000 (01:26 +0800)]
LU-10565 osd: unify interface for vfs

Some VFS changes were applied to other parts of the code but
not to OSD. Apply them to the OSD layer as well.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia3e907964d6321571f52e4c24a46a8ab64e4d056
Reviewed-on: https://review.whamcloud.com/31646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10710 tests: fix run_write_disjoint line continuation 45/31645/2
James Nunez [Wed, 14 Mar 2018 18:42:39 +0000 (12:42 -0600)]
LU-10710 tests: fix run_write_disjoint line continuation

There is a problem with creating a command in
run_write_disjoint() due to a line continuation followed
on the next line by tabs. We need to remove the closing
quotation mark before the line continuation and the opening
quotation mark on the following line.

Test-Parameters: trivial testlist=parallel-scale
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I32dc620dd5c3e3d305d0bf985a096e69c18404d1
Reviewed-on: https://review.whamcloud.com/31645
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes 44/31644/2
Yang Sheng [Wed, 14 Mar 2018 09:36:48 +0000 (17:36 +0800)]
LU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes

 - bi_error is replaced by bi_status in bio
 - pagevec_init takes one parameter
 - PAGE_CACHE_SHIFT has been removed

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia04124d6d636d132550a63e1f8144c26cab39f8e
Reviewed-on: https://review.whamcloud.com/31644
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10819 o2ib: use splice in kiblnd_peer_connect_failed() 43/31643/2
John L. Hammond [Wed, 14 Mar 2018 17:12:06 +0000 (12:12 -0500)]
LU-10819 o2ib: use splice in kiblnd_peer_connect_failed()

In kiblnd_peer_connect_failed() replace a backwards list_add() and
list_del() with list_splice_init().
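
For readers unfamiliar with the splice operation, here is a minimal
userspace sketch of what a splice-and-reinitialize does to a circular
doubly linked list. It is modeled on the semantics of the kernel's
list_splice_init() but is not the kernel implementation; all sketch_
names are hypothetical.

```c
/* Circular doubly linked list node; a head is just a node. */
struct sketch_list {
	struct sketch_list *next;
	struct sketch_list *prev;
};

static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

static int sketch_list_empty(const struct sketch_list *head)
{
	return head->next == head;
}

static void sketch_list_add_tail(struct sketch_list *node,
				 struct sketch_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Move every entry of src to the front of dst, leaving src empty. */
static void sketch_list_splice_init(struct sketch_list *src,
				    struct sketch_list *dst)
{
	struct sketch_list *first = src->next;
	struct sketch_list *last = src->prev;

	if (sketch_list_empty(src))
		return;

	first->prev = dst;
	last->next = dst->next;
	dst->next->prev = last;
	dst->next = first;
	sketch_list_init(src);
}
```

A single call both transfers the entries and leaves the source head in
a valid empty state, which is why it reads more clearly than adding a
new head into the list and then deleting the old one.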

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib00d5d911d1070b6c8b49f14a2c7fc3552da553c
Reviewed-on: https://review.whamcloud.com/31643
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4939 obdclass: llog_print params file 20/31620/8
Ben Evans [Fri, 9 Mar 2018 20:51:26 +0000 (15:51 -0500)]
LU-4939 obdclass: llog_print params file

Allow llog_print to handle the params file in yaml

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Icf286bca7a1466bf3c8d9084971e58d2e8b8a651
Test-Parameters: trivial testlist=sanity
Reviewed-on: https://review.whamcloud.com/31620
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
6 years agoLU-10785 llite: use xattr_handler name for ACLs 95/31595/5
John L. Hammond [Thu, 8 Mar 2018 21:27:28 +0000 (15:27 -0600)]
LU-10785 llite: use xattr_handler name for ACLs

If struct xattr_handler has a name member then use it (rather than
prefix) for the ACL xattrs. This avoids a bug where ACL operations
failed for some kernels.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I28f6c5dbe3cdc4155e93d388d2c413092e02c082
Reviewed-on: https://review.whamcloud.com/31595
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10787 llite: correct removexattr detection 94/31594/3
John L. Hammond [Thu, 8 Mar 2018 19:30:46 +0000 (13:30 -0600)]
LU-10787 llite: correct removexattr detection

In ll_xattr_set_common() detect the removexattr() case correctly by
testing for a NULL value as well as XATTR_REPLACE.
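
The check described above amounts to a two-part condition. The sketch
below is an illustration, not the ll_xattr_set_common() source:
sketch_is_removexattr() is a hypothetical helper, and the flag
constants are defined locally with values chosen to mirror Linux's
XATTR_CREATE and XATTR_REPLACE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the setxattr flags (assumed values). */
#define SKETCH_XATTR_CREATE  0x1
#define SKETCH_XATTR_REPLACE 0x2

/*
 * A removal is signalled by a NULL value together with the REPLACE
 * flag; either condition on its own is an ordinary set operation.
 */
static bool sketch_is_removexattr(const void *value, int flags)
{
	return value == NULL && flags == SKETCH_XATTR_REPLACE;
}
```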

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I29a29851ad4ac432e257b63088e2d7a7dfc39605
Reviewed-on: https://review.whamcloud.com/31594
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10788 llite: pass flags through __vfs_setxattr() 93/31593/3
John L. Hammond [Thu, 8 Mar 2018 19:23:34 +0000 (13:23 -0600)]
LU-10788 llite: pass flags through __vfs_setxattr()

In the compat definition of __vfs_setxattr() pass the flags we
received down to the handler. For consistency with upstream return
-EOPNOTSUPP if no handler could be found.
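
The two behaviors named above (forwarding the caller's flags and
returning -EOPNOTSUPP when no handler matches) can be sketched in
userspace C. This is not the compat __vfs_setxattr() definition: the
handler table, the prefix matching, and every sketch_ name are
hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler: a name prefix plus a set callback. */
struct sketch_handler {
	const char *prefix;
	int (*set)(const char *name, const void *value, size_t size,
		   int flags);
};

static int sketch_last_flags = -1; /* records what the handler saw */

static int sketch_user_set(const char *name, const void *value,
			   size_t size, int flags)
{
	(void)name;
	(void)value;
	(void)size;
	sketch_last_flags = flags;
	return 0;
}

static const struct sketch_handler sketch_handlers[] = {
	{ "user.", sketch_user_set },
};

/* Dispatch to the matching handler, passing flags through unchanged. */
static int sketch_setxattr(const char *name, const void *value,
			   size_t size, int flags)
{
	size_t count = sizeof(sketch_handlers) / sizeof(sketch_handlers[0]);
	size_t i;

	for (i = 0; i < count; i++) {
		const struct sketch_handler *h = &sketch_handlers[i];

		if (strncmp(name, h->prefix, strlen(h->prefix)) == 0)
			return h->set(name, value, size, flags);
	}
	return -EOPNOTSUPP; /* no handler claims this name */
}
```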

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I78b88d1521dd000e328f1add1a6159c70d16f5a7
Reviewed-on: https://review.whamcloud.com/31593
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10792 llite: remove unused parameters from md_{get,set}xattr() 92/31592/3
John L. Hammond [Thu, 8 Mar 2018 19:03:54 +0000 (13:03 -0600)]
LU-10792 llite: remove unused parameters from md_{get,set}xattr()

md_getxattr() and md_setxattr() each have several unused
parameters. Remove them and improve the naming of the remaining
parameters.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I578bdd5dab70745ba7f8fbb9f047fa9eb1f6ee9a
Reviewed-on: https://review.whamcloud.com/31592
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10541 llite: setxattr directly in ll_set_acl 88/31588/4
John L. Hammond [Thu, 8 Mar 2018 18:55:42 +0000 (12:55 -0600)]
LU-10541 llite: setxattr directly in ll_set_acl

Call md_setxattr() directly from ll_set_acl().

Test-Parameters: alwaysuploadlogs clientdistro=sles12sp3 testlist=parallel-scale-nfsv3
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie266ee4fe7a67338122a6a3effb545d3dbaee008
Reviewed-on: https://review.whamcloud.com/31588
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10779 llite: rename FSFILT_IOC_* to system flags 46/31546/3
Jinshan Xiong [Tue, 6 Mar 2018 16:54:11 +0000 (08:54 -0800)]
LU-10779 llite: rename FSFILT_IOC_* to system flags

Those definitions were probably created for compatibility. Now that
FS_IOC_* have existed in the kernel for a long time, we should use
them to avoid confusion.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: Id3b72233b619f1cf761ec5769e27b94af862cd22
Reviewed-on: https://review.whamcloud.com/31546
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10776 osc: Do not request more than 2GiB grant 33/31533/2
Patrick Farrell [Mon, 5 Mar 2018 16:24:32 +0000 (10:24 -0600)]
LU-10776 osc: Do not request more than 2GiB grant

The server enforces a grant limit of 2 GiB, which the
client must honor.  The existing client code combined with
16 MiB RPCs make it possible for the client to ask for
more than this limit.

Make this limit explicit, and also fix an overflow bug in
o_undirty calculation in osc_announce_cached.  (o_undirty
is a 32 bit value and 16 MiB*256 rpcs_in_flight = 4 GiB.
4 GiB + extra grant components overflows o_undirty.)
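
The overflow arithmetic in the parenthetical can be reproduced in a
small C sketch. This is an illustration only, not the
osc_announce_cached() code: the function names and SKETCH_GRANT_LIMIT
are hypothetical, and clamping to exactly 2 GiB is an assumption based
on the limit stated above.

```c
#include <stdint.h>

#define SKETCH_MIB         (1024ULL * 1024ULL)
#define SKETCH_GRANT_LIMIT (2048ULL * SKETCH_MIB) /* assumed: 2 GiB */

/* 32-bit product: 16 MiB * 256 is 2^32, which wraps to 0. */
static uint32_t sketch_undirty_naive(uint32_t rpc_bytes,
				     uint32_t rpcs_in_flight)
{
	return rpc_bytes * rpcs_in_flight;
}

/* Compute in 64 bits, then clamp before narrowing to 32 bits. */
static uint32_t sketch_undirty_clamped(uint32_t rpc_bytes,
				       uint32_t rpcs_in_flight,
				       uint32_t extra)
{
	uint64_t want = (uint64_t)rpc_bytes * rpcs_in_flight + extra;

	if (want > SKETCH_GRANT_LIMIT)
		want = SKETCH_GRANT_LIMIT;
	return (uint32_t)want;
}
```

Doing the multiplication in a 64-bit type keeps the intermediate value
exact, so the clamp sees the true request rather than a wrapped one.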

Cray-bug-id: LUS-5750
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ifcb8a9ea7529eae4cd209dc72223ed039c6f6a0d
Reviewed-on: https://review.whamcloud.com/31533
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-9444 tests: replace SINGLEMDS1 with SINGLEMDS 20/31420/4
James Nunez [Mon, 26 Feb 2018 17:59:50 +0000 (10:59 -0700)]
LU-9444 tests: replace SINGLEMDS1 with SINGLEMDS

In conf-sanity test 87, we use the global variable SINGLEMDS1
to get the version of the MDS. SINGLEMDS1 is not defined and
the test should use SINGLEMDS to check the version of the MDS.

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=87 mdsjob=lustre-b2_9 ossjob=lustre-b2_9 serverbuildno=22 testlist=conf-sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic8b1f32b87cc596fcc2e98d5b6095b6e4171bfd7
Reviewed-on: https://review.whamcloud.com/31420
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10629 lod: Clear OST pool with setstripe 64/31364/12
Ben Evans [Wed, 21 Feb 2018 18:17:58 +0000 (13:17 -0500)]
LU-10629 lod: Clear OST pool with setstripe

When setstripe -d is run on a directory, we should
clear the OST pool along with all the other settings.
Currently there is no way to clear an OST pool,
only to change it.

Signed-off-by: Ben Evans <bevans@cray.com>
Cray-bug-id: LUS-5696
Change-Id: I50426ce79ab153a715d29cc5d54b0ce70726da41
Reviewed-on: https://review.whamcloud.com/31364
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10649 llite: yield cpu after call to ll_agl_trigger 40/31240/4
Ann Koehler [Wed, 7 Jun 2017 19:28:03 +0000 (14:28 -0500)]
LU-10649 llite: yield cpu after call to ll_agl_trigger

The statahead and agl threads loop over all entries in the
directory without yielding the CPU. If the number of entries in
the directory is large enough then these threads may trigger
soft lockups. The fix is to add calls to cond_resched() after
calling ll_agl_trigger(), which gets the glimpse lock for a
file.

Change-Id: I4fbc72a3c6bc77f2ffd8e3fd0daf4c8906bb954a
Cray-bug-id: LUS-2584
Signed-off-by: Chris Horn <hornc@cray.com>
Reviewed-on: https://review.whamcloud.com/31240
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10643 ptlrpc: ptlrpc_register_bulk() LBUG on ENOMEM 28/31228/8
Andriy Skulysh [Tue, 19 Dec 2017 09:20:21 +0000 (11:20 +0200)]
LU-10643 ptlrpc: ptlrpc_register_bulk() LBUG on ENOMEM

Assertion fails on !desc->bd_registered during
retry after ENOMEM.

Drop bd_registered flag and exit via cleanup_bulk
to ensure that bulk is fully unregistered.

Cray-bug-id: MRP-4733
Change-Id: I51be5ec041ef903040bf8508156da8079511c9f7
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31228
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10598 obdclass: ignore IGIF formatted last_id 40/31140/2
Fan Yong [Fri, 2 Feb 2018 07:44:26 +0000 (15:44 +0800)]
LU-10598 obdclass: ignore IGIF formatted last_id

All FIDs with a sequence within [FID_SEQ_IGIF, FID_SEQ_IGIF_MAX]
are valid IGIFs regardless of what the f_oid is. So an IGIF with zero
f_oid is also a valid IGIF, not a last_id, and the last_id check logic
should ignore IGIF formatted FIDs.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I81dc7b237e91688b09f360e43899a1de2c44bf78
Reviewed-on: https://review.whamcloud.com/31140
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10560 libcfs: Use kernel_write when appropriate 54/31154/17
Mike Marciniszyn [Tue, 27 Feb 2018 15:25:59 +0000 (10:25 -0500)]
LU-10560 libcfs: Use kernel_write when appropriate

Changes in the upstream kernel might have removed
vfs_write() in favor of kernel_write().

Unfortunately, kernel_write() was initially exported
with an API that is not plug-compatible with vfs_write().

The rundown is:
- kernel_write new API
- vfs_write

Change-Id: I67f73786308561dc42b06d51c26bfb94021b7589
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31154
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10461 tests: call exit in the skip routine 64/30964/8
James Nunez [Sun, 21 Jan 2018 22:31:31 +0000 (15:31 -0700)]
LU-10461 tests: call exit in the skip routine

There are many reasons to not run, or skip, a test; the test
may require a certain number of servers or a certain Lustre version.
In these cases, the skip() or skip_env() routine is called. When we
call skip, the intention is to exit the routine early. Thus, call
'exit 0' at the end of the skip() routine.

Some calls to skip() are changed to skip_env() when a test is being
skipped due to the Lustre configuration or test environment.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I42fd9535c0a803f334dfc5685f451a6bdc85e84b
Reviewed-on: https://review.whamcloud.com/30964
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: ignore compound_id 49/30949/5
John L. Hammond [Mon, 18 Dec 2017 15:24:33 +0000 (09:24 -0600)]
LU-10383 hsm: ignore compound_id

Ignore request compound ids in the HSM coordinator. Compound ids
prevent batching of CDT to CT requests and degrade HSM
performance. Use CT/archive id compatibility when deciding which HSM
actions to put in a request.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I38513f3b75313eb78bfb9811ab4e40e3e2b904c7
Reviewed-on: https://review.whamcloud.com/30949
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: add action count to hsm scan data 35/31235/3
John L. Hammond [Thu, 8 Feb 2018 19:19:39 +0000 (13:19 -0600)]
LU-10383 hsm: add action count to hsm scan data

Add an 'hsm_action_count' member to struct hsm_scan_data to count the
total number of actions in all requests in the hsd. Add an 'hsd_'
prefix to all pre-existing members of struct hsm_scan_data.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iab784a0f281d697bc0db758f20ce500315b8194a
Reviewed-on: https://review.whamcloud.com/31235
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: remove struct hsm_thread_data 34/31234/3
John L. Hammond [Thu, 8 Feb 2018 17:31:37 +0000 (11:31 -0600)]
LU-10383 hsm: remove struct hsm_thread_data

Remove struct hsm_thread_data. Move allocation of the HSM scan data
requests array to mdt_coordinator().

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I12075dc000c312d2432c8e32787ed36560d1ae42
Reviewed-on: https://review.whamcloud.com/31234
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10802 nrs: mismatch problem for wildcard in jobid TBF 62/29162/7
Qian Yingjin [Fri, 22 Sep 2017 03:02:24 +0000 (11:02 +0800)]
LU-10802 nrs: mismatch problem for wildcard in jobid TBF

When the NRS JOBID rule
"start runas jobid={*.500} rate=10" is set and dd is run as user 500,
the RPC rate is not limited.
This patch fixes this wildcard mismatch problem in TBF JOBID.

Test-Parameters: trivial testlist=sanityn
Change-Id: I39a8e691c9dc8273ed9fce686eeef71be1ac3e43
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/29162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9043 test: remove conf-sanity test 24a ALWAYS_EXCEPT 40/28540/10
dilip krishnagiri [Tue, 24 Jan 2017 17:32:39 +0000 (10:32 -0700)]
LU-9043 test: remove conf-sanity test 24a ALWAYS_EXCEPT

conf-sanity test 24a was added to the ALWAYS_EXCEPT list
due to bugzilla 23573. The issue described in bugzilla
23573 was fixed and landed to master.

conf-sanity test 24a should be removed from the
ALWAYS_EXCEPT list.

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Id42b846d5fb34e8ebeb7fab63aeeafea40782321
Reviewed-on: https://review.whamcloud.com/28540
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9658 ptlrpc: Add QoS for uid and gid in NRS-TBF 08/27608/36
Teddy Chan [Fri, 9 Mar 2018 10:20:40 +0000 (18:20 +0800)]
LU-9658 ptlrpc: Add QoS for uid and gid in NRS-TBF

This patch adds a new QoS feature to the TBF policy which can
limit the rate based on uid or gid. The policy is able to
limit the rate on both the MDT and OSS side.

The commands for this feature look like this.
Start the tbf uid QoS on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf uid"
Limit the rate of ptlrpc requests of the uid 500:
    lctl set_param ost.OSS.*.nrs_tbf_rule="start tbf_name uid={500} rate=100"

Start the tbf gid QoS on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf gid"
Limit the rate of ptlrpc requests of the gid 500:
    lctl set_param ost.OSS.*.nrs_tbf_rule="start tbf_name gid={500} rate=100"

Or use the generic tbf rule to mix them on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf"
Limit the rate of ptlrpc requests of the uid 500 gid 500:
    lctl set_param ost.OSS.*.nrs_tbf_rule="start tbf_name uid={500}&gid={500} rate=100"

You can also use the following rules to control all requests
to the MDS.
Start the tbf uid QoS on MDS:
    lctl set_param mds.MDS.*.nrs_policies="tbf uid"
Limit the rate of ptlrpc requests of the uid 500:
    lctl set_param mds.MDS.*.nrs_tbf_rule="start tbf_name uid={500} rate=100"

Change-Id: I440ad087dd3dbacd8b5228717b0a1724ef47e3b4
Signed-off-by: Teddy Chan <teddy@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/27608
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9592 tests: remove sanity-quota tests from ALWAYS_EXCEPT 10/27410/6
dilip krishnagiri [Wed, 9 Aug 2017 17:38:12 +0000 (11:38 -0600)]
LU-9592 tests: remove sanity-quota tests from ALWAYS_EXCEPT

Remove sanity-quota tests
34 "Usage transfer for user & group & project"
35 "Usage is still accessible across reboot"
from ALWAYS_EXCEPT list.

Test-Parameters: trivial testlist=sanity-quota mdtfilesystemtype=zfs ostfilesystemtype=zfs

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: I41390699480d9f88b1019c459c142d36fea624fb
Reviewed-on: https://review.whamcloud.com/27410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10795 quota: fix wrong skipping of reintegration 07/31607/2
Wang Shilong [Fri, 9 Mar 2018 07:38:51 +0000 (15:38 +0800)]
LU-10795 quota: fix wrong skipping of reintegration

There are two problems addressed by this patch:
1) In qsd_prepare(), if @qqi_acct_failed is true,
that only means one type of quota failed; quota
handling should continue for the other types.
2) In qsd_config(), only trigger reintegration if
this type of quota is newly enabled. This fixes
the annoying messages seen when the admin runs:

$ lctl conf_param lustre.quota.mdt=ug

LustreError: 0-0: lustre-MDT0000: can't enable
quota enforcement since space accounting isn't
functional. Please run tunefs.lustre --quota on
an unmounted filesystem if not done already

Change-Id: I9bad618e7e8fa836902cac9f446714cd6c03f98a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31607
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6032 obdclass: new wrapper to convert NID to string 56/12956/14
Liang Zhen [Fri, 5 Dec 2014 14:06:52 +0000 (22:06 +0800)]
LU-6032 obdclass: new wrapper to convert NID to string

This patch includes a couple of changes:
- add new wrapper function obd_import_nid2str
- use obd_import_nid2str and obd_export_nid2str to replace all
  libcfs_nid2str conversions for NID of export/import connection

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I57d08e6ef902c6a34c705663de0ed73bb3dc76f2
Reviewed-on: https://review.whamcloud.com/12956
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6032 ldlm: don't disable softirq for exp_rpc_lock 57/12957/12
Liang Zhen [Fri, 5 Dec 2014 14:13:17 +0000 (22:13 +0800)]
LU-6032 ldlm: don't disable softirq for exp_rpc_lock

It is not necessary to call ldlm_lock_busy() in the context of the timer
callback; we can call it in the thread context of expired_lock_main.
With this change, we don't need to disable softirq for exp_rpc_lock.

Instead of moving busy locks to the end of the waiting list one
at a time in the context of the timer callback, move any locks
that may be expired onto the expired list.  If these locks are
still being used by RPCs being processed, then put them back
onto the end of the waiting list instead of evicting the client.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Ic3da0dd4e81b758c7448d9613ccd4786693e075d
Reviewed-on: https://review.whamcloud.com/12957
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8999 quota: fix issue of multiple call of seq start 21/31721/4
Hongchao Zhang [Wed, 21 Mar 2018 15:17:03 +0000 (23:17 +0800)]
LU-8999 quota: fix issue of multiple call of seq start

Multiple calls to lprocfs_quota_seq_start could change the block
order in the lower levels of the quota tree, which will cause
quota entries to be skipped.

This patch also fixes a problem in walk_tree_dqentry, where some
entries could be skipped because the "index" could be incremented
even if a valid quota entry had been found.

Change-Id: I44936c70d4060bd83db22aba0e3f665981cfa50a
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31721
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10698 obdclass: allow specifying complex jobids 91/31691/7
Andreas Dilger [Tue, 20 Mar 2018 09:45:36 +0000 (03:45 -0600)]
LU-10698 obdclass: allow specifying complex jobids

Allow specifying a format string for the jobid_name variable to create
a jobid for processes on the client.  The jobid_name is used when
jobid_var=nodelocal, if jobid_name contains "%j", or as a fallback if
getting the specified jobid_var from the environment fails.

The jobid_name string allows the following escape sequences:

    %e = executable name
    %g = group ID
    %h = hostname (system utsname)
    %j = jobid from jobid_var environment variable
    %p = process ID
    %u = user ID

Any unknown escape sequences are dropped. Other arbitrary characters
pass through unmodified, up to the maximum jobid string size of 32,
though whitespace within the jobid is not copied.

This allows, for example, specifying an arbitrary prefix, such as the
cluster name, in addition to the traditional "procname.uid" format,
to distinguish between jobs running on clients in different clusters:

    lctl set_param jobid_var=nodelocal jobid_name=cluster2.%e.%u
or
    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=cluster2.%j.%e

To use an environment-specified JobID, if available, but fall back to
a static string for all processes that do not have a valid JobID:

    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=unknown
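
The expansion rules above can be sketched in userspace as follows. This
is not the obdclass implementation: struct jobid_ctx_sketch,
jobid_append() and jobid_expand_sketch() are hypothetical names, and the
per-process values are passed in rather than read from the kernel.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define JOBID_SIZE_SKETCH 32 /* includes the trailing NUL */

/* Hypothetical per-process values used for the expansion. */
struct jobid_ctx_sketch {
	const char *exe;       /* %e */
	const char *host;      /* %h */
	const char *env_jobid; /* %j, may be NULL */
	unsigned int gid;      /* %g */
	unsigned int uid;      /* %u */
	int pid;               /* %p */
};

/* Append s to buf, skipping whitespace and never overflowing size. */
static void jobid_append(char *buf, size_t size, const char *s)
{
	size_t len = strlen(buf);

	for (; *s != '\0' && len + 1 < size; s++) {
		if (isspace((unsigned char)*s))
			continue;
		buf[len++] = *s;
	}
	buf[len] = '\0';
}

static void jobid_expand_sketch(const char *fmt,
				const struct jobid_ctx_sketch *c,
				char *buf, size_t size)
{
	char tmp[16];

	assert(fmt != NULL && c != NULL && buf != NULL && size > 0);
	buf[0] = '\0';
	for (; *fmt != '\0'; fmt++) {
		const char *piece = tmp;

		if (*fmt != '%') {
			tmp[0] = *fmt;
			tmp[1] = '\0';
		} else {
			if (fmt[1] == '\0')
				break; /* a trailing '%' is dropped */
			switch (*++fmt) {
			case 'e': piece = c->exe; break;
			case 'h': piece = c->host; break;
			case 'j': piece = c->env_jobid ? c->env_jobid : ""; break;
			case 'g': snprintf(tmp, sizeof(tmp), "%u", c->gid); break;
			case 'u': snprintf(tmp, sizeof(tmp), "%u", c->uid); break;
			case 'p': snprintf(tmp, sizeof(tmp), "%d", c->pid); break;
			default:  piece = ""; break; /* unknown escapes are dropped */
			}
		}
		jobid_append(buf, size, piece);
	}
}
```

For a process "dd" run by uid 500, the format "cluster2.%e.%u" would
expand to "cluster2.dd.500" under this sketch.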

Implementation notes:

The LUSTRE_JOBID_SIZE includes a trailing NUL, so don't use
"LUSTRE_JOBID_SIZE + 1" anywhere, as that is misleading.

Rename the "obd_jobid_node" variable to "obd_jobid_name" to match
the /proc "jobid_name" parameter name to avoid confusion.

Rename "struct jobid_to_pid_map" to "jobid_pid_map" since this is
not actually mapping from a jobid *to* a PID, but the reverse.
Save jobid length, and reorder fields to avoid holes in structure.

Consolidate PID->jobid cache handling in jobid_get_from_cache(),
which only does environment lookups and caches the results.
The fallback to using obd_jobid_name is handled by the caller.

Rename check_job_name() to jobid_name_is_valid(), since that makes
it clear to the reader a "true" return is a valid name.

In jobid_cache_init() there is no benefit for locking the jobid_hash
creation, since the spinlock is just initialized in this function,
so multiple callers of this function would already be broken.

Pass the buffer size from the callers (who know the buffer size) to
lustre_get_jobid() instead of assuming it is LUSTRE_JOBID_SIZE.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Iad350e87b446c7d2356718cf2e5f9563e63ebbe5
Reviewed-on: https://review.whamcloud.com/31691
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9273 tests: disable random I/O in replay-ost-single/5 71/31671/2
Alex Zhuravlev [Fri, 16 Mar 2018 11:28:57 +0000 (14:28 +0300)]
LU-9273 tests: disable random I/O in replay-ost-single/5

Disable random I/O in replay-ost-single/5 as it's very slow
on ZFS. This is due to grants: the client consumes them
far too quickly, at 1MB blocksize + ~0.5MB metadata overhead
for each random 4K written by iozone.

Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5

Change-Id: Ic49429b8c681fdc16e5f95f483d78198b6f4804c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31671
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
6 years agoLU-10264 mdc: fix possible NULL pointer dereference 21/31621/3
Andreas Dilger [Fri, 9 Mar 2018 23:18:53 +0000 (16:18 -0700)]
LU-10264 mdc: fix possible NULL pointer dereference

Fix two static analysis errors.

lustre/mdc/mdc_dev.c: in mdc_enqueue_send(), pointer 'matched' return
    from call to function 'ldlm_handle2lock' at line 704 may be NULL
    and will be dereferenced at line 705.
If client is evicted between ldlm_lock_match() and ldlm_handle2lock()
the lock pointer could be NULL.

lustre/lov/lov_dev.c:488 in lov_process_config, sscanf format
    specification '%d' expects type 'int' for 'd', but parameter 3
    has a different type '__u32'.
Converting to kstrtou32() requires changing the "index" variable type
from __u32 to u32, which is fine since it is only used internally.
Fix up the few functions that are also passing "__u32 index" and the
resulting checkpatch.pl warnings.
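
kstrtou32() is a kernel helper, so it cannot be shown running in
userspace. A loosely analogous strict parser (parse_u32_sketch() is a
hypothetical name, and it is stricter than the kernel helper in that it
accepts digits only) shows why a typed, range-checked conversion is
preferable to sscanf("%d") into an unsigned 32-bit variable:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 0 and stores the value, or -EINVAL / -ERANGE on failure. */
static int parse_u32_sketch(const char *s, uint32_t *out)
{
	char *end;
	unsigned long long v;

	assert(s != NULL && out != NULL);
	/* Reject empty strings, signs and leading whitespace up front. */
	if (*s < '0' || *s > '9')
		return -EINVAL;
	errno = 0;
	v = strtoull(s, &end, 10);
	if (*end != '\0')
		return -EINVAL; /* trailing garbage */
	if (errno == ERANGE || v > UINT32_MAX)
		return -ERANGE;
	*out = (uint32_t)v;
	return 0;
}
```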

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3cc80d66bbb537161a561f4f2ba7830ddebcab07
Reviewed-on: https://review.whamcloud.com/31621
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7420 echo: fix echo server to work with unified target 43/18443/11
Mikhail Pershin [Tue, 27 Mar 2018 11:00:47 +0000 (14:00 +0300)]
LU-7420 echo: fix echo server to work with unified target

After the introduction of the Unified Target, the echo server lost
its ability to serve incoming requests, i.e. to work like a fake OFD.
This patch restores that functionality, so the echo server is able to
process requests from the echo client via the network.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Change-Id: I0c0d347486463ce320c7c66a1f85f6979b9a3681
Reviewed-on: https://review.whamcloud.com/18443
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10752 build: fix rpm packaging issues for gss 57/31757/7
James Simmons [Thu, 29 Mar 2018 17:02:42 +0000 (13:02 -0400)]
LU-10752 build: fix rpm packaging issues for gss

Lustre can create rpms in two ways. One is with make rpm and the
other is using the actual source rpm that is provided. There are
several issues with how GSS is handled with rpm packaging.

The first problem is that you can ./configure --disable-gss, which
has never been handled. Secondly, if you do configure with disable-gss
it is still possible to have the option enable-gss-keyring set to
yes. The reason it was never seen before is due to everything
being treated with the keyring option. Now if the user sets
enable-gss to no then enable-gss-keyring will also be set to no
even if the user tries to set it to yes. This was done by properly
setting $enable_gss and $enable_gss_keyring in lustre-core.m4.
In the spec file, create the bcond gss to handle the gss-only case
and turn on gss if gss_keyring is true. Move lgssc.conf under
the with_gss_keyring bcond, which is only needed for server builds
alongside lsvcgss.

It is impossible to know if it can build due to the spec file not
properly handling build dependencies for GSS and not knowing if
the kernel is too new for GSS. So the user has to provide the
options --with gss and / or --with gss-keyring to rpmbuild. If
the user only provides gss-keyring option to rpmbuild make sure
it enables gss as well. That is handled in the spec file.

For the case of make rpms fix it up so if gss-keyring is enabled
then by default the core gss handling is enabled. Also handle the
long ignored enable-gss case.

Test-Parameters: trivial

Change-Id: Ieed9df98a27bd6e77504486762d6e60ddca5a916
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31757
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9551 utils: add l_tunedisk to fix disk tunings 64/31464/6
Nathaniel Clark [Wed, 28 Feb 2018 22:18:09 +0000 (17:18 -0500)]
LU-9551 utils: add l_tunedisk to fix disk tunings

This adds l_tunedisk utility to utilize osd_tune_lustre call for
mount_utils.h.  This can be called from udev.
This adds a udev rule to fix disk tunings.
This in some ways duplicates LU-9132, which sets this value at mount
time, but if a multipath component is removed then re-added, the
multipath's max_sectors_kb will not propagate to the newly added device,
and this will now cause an error for I/Os that would violate it.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I35330ebe75552d71b71212f9fae00cfdcc028ea1
Reviewed-on: https://review.whamcloud.com/31464
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>