Whamcloud - gitweb
fs/lustre-release.git
7 years ago LU-8491 quota: sleep while holding spinlock 23/21923/7
Niu Yawei [Mon, 15 Aug 2016 08:31:53 +0000 (04:31 -0400)]
LU-8491 quota: sleep while holding spinlock

Revise the memory allocation code in qmt_glimpse_lock() to avoid
sleeping while holding a spinlock.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I75d5751910906984c31454d4567f58d769af5d51
Reviewed-on: https://review.whamcloud.com/21923
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8364 ldiskfs: fixes for failover mode. 41/21141/7
Lokesh Nagappa Jaliminche [Fri, 25 Nov 2016 10:47:09 +0000 (16:17 +0530)]
LU-8364 ldiskfs: fixes for failover mode.

When ldiskfs runs in failover mode with a read-only disk,
it may lose part of the allocation updates and fail while
mounting the fs due to group descriptor checks before journal
replay.
Don't produce panics from on-disk checks in read-only mode.

Seagate-bug-id: MRP-797
Change-Id: I54bee3a0aeb9a15f5ee2a79f7a2a2a905f19af1a
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@seagate.com>
Reviewed-on: https://morpheus.xyus.xyratex.com:8443/gerrit/239
Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Tested-by: Alexander Lezhoev <Alexander_Lezhoev@xyratex.com>
Reviewed-by: Vitaly Fertman <Vitaly_Fertman@xyratex.com>
Reviewed-on: https://review.whamcloud.com/21141
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
7 years ago LU-8351 ptlrpc: allow blocking asts to be delayed 65/21065/6
Vladimir Saveliev [Wed, 29 Jun 2016 13:10:24 +0000 (16:10 +0300)]
LU-8351 ptlrpc: allow blocking asts to be delayed

ptlrpc_import_delay_req() refuses to delay blocking asts when the import
is not yet in LUSTRE_IMP_FULL. That leads to the client being evicted on
the assumption that it failed to respond.

Allow delays for blocking asts being resent.

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Seagate-bug-id: MRP-3500
Change-Id: I0e5cde9636afd48cc6cb565f586a59bc7ec01810
Reviewed-on: https://review.whamcloud.com/21065
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8272 ldlm: Use interval tree to update kms 79/20779/10
Patrick Farrell [Tue, 6 Dec 2016 18:38:33 +0000 (12:38 -0600)]
LU-8272 ldlm: Use interval tree to update kms

Currently, ldlm_extent_shift_kms does a linear search of
the list of granted locks on the resource when looking for
the lock to use to update the kms.

This can be avoided by using the interval trees which store
the extents of granted locks.  For PW/write locks, the lock
with the highest start must be the lock with the highest
end as well, so we can walk the interval tree in reverse to
almost immediately find the new 'highest end'.

Since the tree is sorted by 'start' and PR locks can
overlap, we cannot easily use the tree to find the PR lock
with the 'highest end'.  So we cannot optimize this case,
but many PR locks with different extents should be rare, so
this is OK (and it is no worse than what we do now).

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I9efa4733e691cb2299049ba917325b939be52069
Reviewed-on: https://review.whamcloud.com/20779
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
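
Note on LU-8272: the reasoning in that commit message can be sketched in
plain C. This is a minimal illustration, not the Lustre code: a sorted
array stands in for the interval tree, and struct extent, the function
names, and the "skip" index (the lock being dropped, presumably the
reason the message says "almost immediately") are all assumptions made
for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a granted lock's extent. */
struct extent {
	uint64_t start;
	uint64_t end;
};

/*
 * Linear scan: correct for any set of extents, including overlapping
 * ones (the PR case in the commit message). Pass skip == n to skip
 * nothing.
 */
uint64_t highest_end_linear(const struct extent *ext, size_t n, size_t skip)
{
	uint64_t max_end = 0;

	for (size_t i = 0; i < n; i++)
		if (i != skip && ext[i].end > max_end)
			max_end = ext[i].end;
	return max_end;
}

/*
 * Reverse walk: only valid when the extents are sorted by start and do
 * not overlap (the PW case). The first entry found from the end that is
 * not skipped then also has the highest end.
 */
uint64_t highest_end_reverse(const struct extent *ext, size_t n, size_t skip)
{
	for (size_t i = n; i-- > 0; )
		if (i != skip)
			return ext[i].end;
	return 0;
}
```

With disjoint extents [0,9], [10,19], [20,29] both functions agree. With
overlapping extents such as [0,100] and [10,19], only the linear scan
finds the true highest end, which is why the PR case is left alone.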
7 years ago LU-6838 llog: limit file size of plain logs 28/18028/12
Alex Zhuravlev [Mon, 18 Jan 2016 06:24:19 +0000 (09:24 +0300)]
LU-6838 llog: limit file size of plain logs

On small filesystems a plain log can grow dramatically, especially
given the large record sizes produced by DNE and extended chunksize.
I saw >50% of space consumed by a single llog file which was still
in use. This leads to test failures (sanityn, etc).
The patch introduces an additional limit on plain llog size, which
is calculated as <free space>/64 (128MB at most) at llog creation
time.

Change-Id: I0eab8177d4e416a32a6aab56d47e4142c81d13de
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/18028
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
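
Note on LU-6838: the size limit described in that commit message amounts
to a one-line formula. The sketch below is an illustration only; the
function and macro names are hypothetical, and it assumes that "128MB"
means 128 MiB and that free space is measured in bytes.

```c
#include <stdint.h>

/* Hypothetical names; the divisor and cap come from the commit message. */
#define PLAIN_LLOG_DIVISOR	64ULL
#define PLAIN_LLOG_MAX_BYTES	(128ULL << 20)	/* assumes 128MB == 128 MiB */

/* Limit for a new plain llog, computed once at creation time. */
uint64_t plain_llog_size_limit(uint64_t free_bytes)
{
	uint64_t limit = free_bytes / PLAIN_LLOG_DIVISOR;

	return limit > PLAIN_LLOG_MAX_BYTES ? PLAIN_LLOG_MAX_BYTES : limit;
}
```

Under these assumptions a filesystem with 6400 MiB free would get a
100 MiB limit, while one with 1 TiB free would hit the 128 MiB cap.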
7 years ago LU-8813 gss: allow svcgssd to start without "-k" 25/23925/4
Andreas Dilger [Wed, 23 Nov 2016 19:55:40 +0000 (12:55 -0700)]
LU-8813 gss: allow svcgssd to start without "-k"

Previous versions of svcgssd did not require the "-k" option when
running in Kerberos mode (the only mode available).  If none of
the -k, -s, or -z options are given for enabling security flavours
then assume "-k" for compatibility reasons.

This will generate a warning before 3.1 is released, at which point
it will turn into an error.

Make the use of -s an error if SSK is not available.

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I9b7389bbca56d6717f02b21f57da52adc4602971
Reviewed-on: https://review.whamcloud.com/23925
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
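
Note on LU-8813: the option handling described in that commit message
can be sketched as a small decision function. This is not the svcgssd
source; the OPT_* bit names, the function name, and the "warned" output
parameter are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical bits, one per command-line option named in the message. */
enum {
	OPT_K = 1 << 0,	/* -k */
	OPT_S = 1 << 1,	/* -s */
	OPT_Z = 1 << 2,	/* -z */
};

/*
 * Returns the effective option mask, or -1 when -s is requested but SSK
 * support is not available. When no flavour option was given, falls
 * back to -k and sets *warned so the caller can print a warning.
 */
int resolve_flavours(int requested, int ssk_available, int *warned)
{
	if (warned != NULL)
		*warned = 0;

	if ((requested & OPT_S) && !ssk_available)
		return -1;

	if ((requested & (OPT_K | OPT_S | OPT_Z)) == 0) {
		if (warned != NULL)
			*warned = 1;
		return OPT_K;
	}

	return requested;
}
```

Keeping the fallback in one place would make it straightforward to turn
the warning into an error later, as the commit message anticipates.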
7 years ago LU-8740 lfsck: hold lock when access trace file object 01/23301/2
Fan Yong [Sun, 31 Jul 2016 15:49:36 +0000 (23:49 +0800)]
LU-8740 lfsck: hold lock when access trace file object

There is a race condition between lfsck_in_notify() accessing the trace
file object and lfsck_namespace_load_sub_trace_files(), which may
re-create the trace file. Hold lfsck_sub_trace_obj::lsto_mutex and
check the validity of the trace file object to avoid trouble.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9715524dd7027f5fc8c7078c1a52d099e9e21132
Reviewed-on: https://review.whamcloud.com/23301
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8590 utils: remove duplicate code in lgss_sk 22/23722/8
Andreas Dilger [Fri, 11 Nov 2016 17:14:00 +0000 (10:14 -0700)]
LU-8590 utils: remove duplicate code in lgss_sk

Remove the code duplication between creating a new keyfile and
modifying an existing keyfile.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id8f8ec535eb7e076ce70cd765e5c3c86ae686236
Reviewed-on: https://review.whamcloud.com/23722
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago New tag 2.9.50 2.9.50 v2_9_50 v2_9_50_0
Oleg Drokin [Wed, 7 Dec 2016 23:36:12 +0000 (18:36 -0500)]
New tag 2.9.50

Starting on 2.10 release development cycle.

Change-Id: Ice84a19037602b4d9041f7d4dd67c21d0c4cd41e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago New Lustre release 2.9.0 b2_9 2.9.0 v2_9_0 v2_9_0_0
Oleg Drokin [Wed, 7 Dec 2016 23:33:07 +0000 (18:33 -0500)]
New Lustre release 2.9.0

Change-Id: I8a9dfb3eea3c42419a148e006c1b31a86cd8785b
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago New tag 2.9.0-RC1 2.9.0-RC1 v2_9_0_RC1
Oleg Drokin [Thu, 24 Nov 2016 04:43:00 +0000 (23:43 -0500)]
New tag 2.9.0-RC1

Change-Id: I9155a8ac38a4038a1958d8f17e0bcda1ec55f5f9
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8590 utils: fix minor issues in lgss_sk usage 91/23691/9
Andreas Dilger [Thu, 10 Nov 2016 05:21:00 +0000 (22:21 -0700)]
LU-8590 utils: fix minor issues in lgss_sk usage

Print warning message if secret keyfile has permissive access mode.
Improve error messages to start with either "error:" or "warning:".

Add "--key-bits" long option for "-k", and "--integrity" for "-i".
Don't print "Prime (p):" field with "-r" if no prime key is stored.
Improve usage message and man page for lgss_sk utility.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8c36f0f20a0144b351b51c2d25edad9c8bd0d050
Reviewed-on: http://review.whamcloud.com/23691
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8838 kernel: kernel update RHEL6.8 [2.6.32-642.11.1.el6] 58/23858/4
Bob Glossman [Tue, 15 Nov 2016 20:57:00 +0000 (12:57 -0800)]
LU-8838 kernel: kernel update RHEL6.8 [2.6.32-642.11.1.el6]

Update RHEL6.8 kernel to 2.6.32-642.11.1.el6

Test-Parameters: trivial clientdistro=el6.8 mdsdistro=el6.8 ossdistro=el6.8 \
  mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I37d33dfe09c7579d14692f9695e7042a2a1e2fb3
Reviewed-on: http://review.whamcloud.com/23858
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8861 doc: Update llapi_ladvise man page 01/23901/2
James Nunez [Tue, 22 Nov 2016 18:56:54 +0000 (11:56 -0700)]
LU-8861 doc: Update llapi_ladvise man page

Correct the input parameters for the llapi_ladvise routine
in the llapi_ladvise man page. The struct that llapi_ladvise
expects is llapi_lu_ladvise.

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I30d55813a02a1d2d8f23db44b1f118f2cf7b6803
Reviewed-on: http://review.whamcloud.com/23901
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8559 llite: fix ll_create_nd for non HAVE_IOP_ATOMIC_OPEN 58/23758/5
Jinshan Xiong [Mon, 14 Nov 2016 22:51:39 +0000 (14:51 -0800)]
LU-8559 llite: fix ll_create_nd for non HAVE_IOP_ATOMIC_OPEN

Invoke ll_new_node() with LUSTRE_OPC_CREATE for the non
HAVE_IOP_ATOMIC_OPEN case so that it can recognize volatile
file names.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ia38d353844dc4852dbaa308fe26f450108a009ea
Reviewed-on: http://review.whamcloud.com/23758
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8824 nodemap: load nodemap definitions first 49/23849/2
Kit Westneat [Wed, 16 Nov 2016 16:31:35 +0000 (11:31 -0500)]
LU-8824 nodemap: load nodemap definitions first

ZFS index files return keys in hash order instead of key numerical
order. This means that nodemap definitions could be returned after
the ID mapping and range definitions, causing the load code to break.
This change loads the config in two passes, ensuring that the nodemap
creation would occur first.

Test-Parameters: envdefinitions=SLOW=yes testlist=sanity-sec
Test-Parameters: envdefinitions=SLOW=yes mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity-sec
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I97dac2e8fb2e7f2e0a0a6bd07f743d3379178890
Reviewed-on: http://review.whamcloud.com/23849
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
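
Note on LU-8824: the two-pass idea in that commit message can be shown
with a toy loader. This is a sketch under stated assumptions, not the
nodemap code: the record types, struct layouts, and function names are
all hypothetical, and ID mappings are folded into a single "range"
record type for brevity.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NODEMAPS 16

/* Hypothetical record kinds: a nodemap definition, or data attached to one. */
enum rec_type { REC_NODEMAP, REC_RANGE };

struct rec {
	enum rec_type type;
	unsigned int nodemap_id;
};

struct config {
	int exists[MAX_NODEMAPS];	/* nodemap has been created */
	int ranges[MAX_NODEMAPS];	/* ranges attached to it */
};

/* Apply one record; fails if it refers to a nodemap not yet created. */
static int apply_rec(const struct rec *r, struct config *cfg)
{
	if (r->nodemap_id >= MAX_NODEMAPS)
		return -1;
	if (r->type == REC_NODEMAP) {
		cfg->exists[r->nodemap_id] = 1;
		return 0;
	}
	if (!cfg->exists[r->nodemap_id])
		return -1;
	cfg->ranges[r->nodemap_id]++;
	return 0;
}

/* Single pass: breaks when a range arrives before its nodemap. */
int load_one_pass(const struct rec *recs, size_t n, struct config *cfg)
{
	memset(cfg, 0, sizeof(*cfg));
	for (size_t i = 0; i < n; i++)
		if (apply_rec(&recs[i], cfg) < 0)
			return -1;
	return 0;
}

/* Two passes: create every nodemap first, then attach everything else. */
int load_two_pass(const struct rec *recs, size_t n, struct config *cfg)
{
	memset(cfg, 0, sizeof(*cfg));
	for (size_t i = 0; i < n; i++)
		if (recs[i].type == REC_NODEMAP && apply_rec(&recs[i], cfg) < 0)
			return -1;
	for (size_t i = 0; i < n; i++)
		if (recs[i].type != REC_NODEMAP && apply_rec(&recs[i], cfg) < 0)
			return -1;
	return 0;
}
```

The second pass costs one extra walk over the records, but it makes the
result independent of the order in which the index returns its keys.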
7 years ago LU-8805 tests: fix defect introduced by LU-8226 61/23861/2
Elena Gryaznova [Fri, 18 Nov 2016 19:29:34 +0000 (21:29 +0200)]
LU-8805 tests: fix defect introduced by LU-8226

Client loads (run_tar.sh, run_dd.sh, etc.) are
executed on remote nodes, so ps -C does not select them
on the main client.

Test-parameters: trivial testlist=recovery-mds-scale

Seagate-bug-id: MRP-4011
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Change-Id: Iaa066298f96b14af148007410f29f5c7b965ee2c
Reviewed-on: http://review.whamcloud.com/23861
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8738 tests: ladvise dontneed test write to single OST 67/23867/2
James Nunez [Fri, 18 Nov 2016 20:49:45 +0000 (13:49 -0700)]
LU-8738 tests: ladvise dontneed test write to single OST

sanity test 255b exercises the ladvise hint 'dontneed' by
checking total cache and cache used on a single OST. Limit
the file striping to a single OST for the file created for
this test.

Test-Parameters: trivial testlist=sanity,sanity,sanity
Test-Parameters: trivial testlist=sanity,sanity,sanity
Test-Parameters: trivial testlist=sanity,sanity,sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iee08576726fc56bc9e7aa961c22819265c31f69b
Reviewed-on: http://review.whamcloud.com/23867
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8774 lprocfs: not use MAX_STRING_SIZE in copy_from_user 62/23462/4
Jian Yu [Mon, 31 Oct 2016 06:26:45 +0000 (14:26 +0800)]
LU-8774 lprocfs: not use MAX_STRING_SIZE in copy_from_user

This patch removes the usage of MAX_STRING_SIZE from
copy_from_user() and just copies enough bytes to cover
count passed in.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I1ac2c779b5cd984f88bb85d4ae8d571f7931091f
Reviewed-on: http://review.whamcloud.com/23462
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
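
Note on LU-8774: the change described in that commit message is about
copying only as many bytes as the caller passed in, rather than a fixed
maximum. A userspace analogue is sketched below; memcpy() stands in for
copy_from_user(), and the function name and return convention are
assumptions for the example, not the lprocfs code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy exactly `count` bytes from src into dst and NUL-terminate.
 * Returns the number of bytes copied, or -1 if count plus the
 * terminator does not fit in dst. Reading a fixed maximum instead of
 * `count` could touch memory beyond what the caller supplied.
 */
long copy_bounded(char *dst, size_t dst_size, const char *src, size_t count)
{
	if (dst_size == 0 || count >= dst_size)
		return -1;

	memcpy(dst, src, count);
	dst[count] = '\0';
	return (long)count;
}
```

The size check runs before any byte is read, so an oversized count is
rejected without touching the source buffer.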
7 years ago LU-8418 libcfs: remove lnet upcall code 40/21440/6
Alexander Zarochentsev [Thu, 10 Sep 2015 06:26:43 +0000 (09:26 +0300)]
LU-8418 libcfs: remove lnet upcall code

Remove the lnet upcall infrastructure completely,
as nobody uses it anymore. The upcall causes a delay
before calling BUG() and might even cause a hang,
making crash dumps unreliable or leaving them with
outdated info.

Change-Id: I20af6874116542d16bcc9a9eb75c813a124e346d
Seagate-bug-id: MRP-2939
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-on: http://review.whamcloud.com/21440
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8655 tests: customize run_mdtest() 50/22850/4
Elena Gryaznova [Fri, 30 Sep 2016 17:57:58 +0000 (20:57 +0300)]
LU-8655 tests: customize run_mdtest()

Sometimes it is required to run mdtest with parameters
missing in run_mdtest() cmd.
Now these parameters can be specified by mdtest_custom_params.

Test-Parameters: trivial envdefinitions=ONLY=mdtestssf testlist=parallel-scale
Test-Parameters: trivial envdefinitions=ONLY=mdtestfpp testlist=parallel-scale
Seagate-bug-id: MRP-3376
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Change-Id: If07f07ebf11516195843e497c5f97bbdadeb531b
Reviewed-on: http://review.whamcloud.com/22850
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8824 nodemap: properly handle errors loading nodemap conf 78/23778/4
Kit Westneat [Wed, 16 Nov 2016 00:08:13 +0000 (19:08 -0500)]
LU-8824 nodemap: properly handle errors loading nodemap conf

Modifies mgc_process_recover_nodemap_log to properly handle errors
returned by nodemap_process_idx_pages. Previously, these errors were
ignored.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Icf0590eddf45d86a72623aeda863aee064993953
Reviewed-on: http://review.whamcloud.com/23778
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8129 tests: add version check to sanity.sh test_102n 87/23687/3
Emoly Liu [Fri, 18 Nov 2016 07:18:02 +0000 (15:18 +0800)]
LU-8129 tests: add version check to sanity.sh test_102n

We don't support LFSCK compatibility between Lustre-2.9 and
Lustre-2.6 any more, so this patch adds a version check to
sanity.sh test_102n to make the test interoperate with clients
that do not have the following change:
Lustre-commit: fd4ab6e6ae877c88e46c35c517349285aa6226d2
Lustre-change: http://review.whamcloud.com/20112

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ibf2c72e9b648df5666ed7a87c8372ea81b83a029
Reviewed-on: http://review.whamcloud.com/23687
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8816 utils: Check /etc/hostid instead of failing for ZFS 04/23804/4
Nathaniel Clark [Wed, 16 Nov 2016 16:52:57 +0000 (11:52 -0500)]
LU-8816 utils: Check /etc/hostid instead of failing for ZFS

ZFS doesn't check /etc/hostid until a pool is created or
imported, so check for its existence instead of just failing after
spl_hostid is checked.

Test-Parameters: trivial

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia00b1e357c629ad6a7a2b636a2fc149036d03546
Reviewed-on: http://review.whamcloud.com/23804
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8796 kernel: kernel upgrade RHEL7.3 [3.10.0-514.el7] 60/23560/3
Bob Glossman [Tue, 25 Oct 2016 23:18:24 +0000 (16:18 -0700)]
LU-8796 kernel: kernel upgrade RHEL7.3 [3.10.0-514.el7]

With this mod we switch our supported el7 version to RHEL 7.3

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I2a50889bd484b33abea721582a6adb2ec6a0b06b
Reviewed-on: http://review.whamcloud.com/23560
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8534 ldiskfs: Add patch series for RHEL7.3 13/22113/5
Christopher J. Morrone [Wed, 24 Aug 2016 17:22:00 +0000 (13:22 -0400)]
LU-8534 ldiskfs: Add patch series for RHEL7.3

Add the new ldiskfs patch series file ldiskfs-3.10-rhel7.3.series which
supports the RHEL7.3 kernel.  Three patch files needed contextual updates
to allow them to apply.

Note that the new RHEL7.3 kernel contains a backport of the
upstream linux kernel commit 923ae0ff9250430133b3310fe62c47538cf1cbc1,
which introduces DAX to ext4.  This adds the flag EXT4_MOUNT_DAX
with value 0x00200.  This conflicted with ext4-data-in-dirent.patch's
EXT4_MOUNT_DIRDATA flag value.  Therefore, for RHEL7.3 the value of the
EXT4_MOUNT_DIRDATA flag is changed to 0x00002.

The ext4-corrupted-inode-block-bitmaps-handling-patches.patch needed
updating for two problems:

In ext4_validate_block_bitmap(), the patch removes the
struct ext4_group_info *grp declaration.  The upstream kernel now
has the following at the beginning of the function:

        if (buffer_verified(bh) || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
                return;

The declaration/definition of grp is reintroduced to address that
use.

Change-Id: Ia1a2455c1f353b59202b48ce6cdaad801a7f42d2
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/22113
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
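
Note on LU-8534: the flag conflict described in that commit message can
be checked at compile time. The sketch below uses the two values quoted
in the message; EXT4_MOUNT_DIRDATA_OLD is a hypothetical name, and its
value is an assumption inferred from the statement that the old flag
conflicted with EXT4_MOUNT_DAX.

```c
/* Values quoted in the commit message. */
#define EXT4_MOUNT_DAX		0x00200
#define EXT4_MOUNT_DIRDATA	0x00002	/* value chosen for RHEL7.3 */

/* Assumption: the earlier value, inferred from the stated conflict. */
#define EXT4_MOUNT_DIRDATA_OLD	0x00200

/* Two mount flags conflict when they share at least one bit. */
int flags_overlap(unsigned long a, unsigned long b)
{
	return (a & b) != 0;
}

/* Fails the build if the new value ever collides with the DAX bit. */
_Static_assert((EXT4_MOUNT_DAX & EXT4_MOUNT_DIRDATA) == 0,
	       "EXT4_MOUNT_DIRDATA must not overlap EXT4_MOUNT_DAX");
```

A static assertion of this kind would surface a similar collision at
build time if a later kernel backport claimed another bit.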
7 years ago LU-8778 osd: osd_index_declare_ea_delete() reserve more credits 86/23486/3
Alex Zhuravlev [Sat, 29 Oct 2016 13:20:42 +0000 (16:20 +0300)]
LU-8778 osd: osd_index_declare_ea_delete() reserve more credits

When the ".." direntry is being removed, OSD may need to update
local representatives (agent inodes). Reserve additional
credits for these updates.

Change-Id: I3689239ac9e7859fbb4a7c6edc87aa3d59a6be7e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/23486
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8829 mgs: fix default secure RPC rule display 56/23756/2
John L. Hammond [Mon, 14 Nov 2016 18:45:08 +0000 (12:45 -0600)]
LU-8829 mgs: fix default secure RPC rule display

In seq_show_srpc_rules() ensure that the default Secure RPC rule is
displayed properly.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ifbb43012e92dfd22bc8caf028ddc9f1658cc5084
Reviewed-on: http://review.whamcloud.com/23756
Tested-by: Jenkins
Reviewed-by: Nathan Lavender <nblavend@iu.edu>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8305 tests: strengthen fileset cleanup in sanity-sec 93/23693/2
Sebastien Buisson [Thu, 10 Nov 2016 10:08:44 +0000 (11:08 +0100)]
LU-8305 tests: strengthen fileset cleanup in sanity-sec

Strengthen fileset cleanup on MGS side in test_27 of
sanity-sec.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I36a79f54d56225bd92c4672c87ebe396a2856035
Reviewed-on: http://review.whamcloud.com/23693
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Kit Westneat <kit.westneat@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8813 utils: l_getidentity compatibility 67/23667/4
Fan Yong [Fri, 12 Aug 2016 08:43:33 +0000 (16:43 +0800)]
LU-8813 utils: l_getidentity compatibility

Allow the new l_getidentity tool to parse an old perm.conf which may
contain the old 'rmtacl' and 'rmtown' settings. These configurations are
obsolete and are ignored rather than treated as errors.

This patch also introduces new '-d' option to debug l_getidentity.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Id0c54d0c24f551e93af80a0ab461870aa5037f84
Reviewed-on: http://review.whamcloud.com/23667
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8795 gss: Prevent callout truncation with non-root users 00/23600/2
Jeremy Filizetti [Sat, 5 Nov 2016 23:09:05 +0000 (19:09 -0400)]
LU-8795 gss: Prevent callout truncation with non-root users

The SK changes included an additional svc_type field in the callout
which was initialized to '0'.  Since the default value is
not changed prior to the callout for non-root users, this breaks those
Kerberos users.  SK is not affected because all users share the same
key context which is limited to the root user.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: I2c906714ee6ad6a0091ac922298aee7b63b9e856
Reviewed-on: http://review.whamcloud.com/23600
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8791 osd-zfs: hold oo_guard read lock for object write 50/23550/3
Jinshan Xiong [Thu, 3 Nov 2016 00:36:24 +0000 (17:36 -0700)]
LU-8791 osd-zfs: hold oo_guard read lock for object write

Hold the oo_guard read lock for object writes in order to avoid a
deadlock between changing the object block size and writing the
object at the same time.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Id1c3c7e66e74d4f61e2136311a0723b8da2da3bb
Reviewed-on: http://review.whamcloud.com/23550
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years ago LU-8763 ldlm: do not dump update recovery list 44/23444/2
Olaf Faaland [Thu, 27 Oct 2016 19:18:30 +0000 (12:18 -0700)]
LU-8763 ldlm: do not dump update recovery list

Do not dump the update recovery list when recovery is aborted
or when checking whether recovery is complete.  The output
is not useful and is high volume on production systems.

Change-Id: I7f3cd71165475570353cb264b5587749ec252855
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/23444
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8700 build: dkms do not install llite_lloop 28/23228/7
Bruno Faccini [Tue, 18 Oct 2016 11:57:16 +0000 (13:57 +0200)]
LU-8700 build: dkms do not install llite_lloop

Do not build/install llite_lloop.ko module in DKMS.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I61b3455ae35477a83193ef2afca5815135db21cd
Reviewed-on: http://review.whamcloud.com/23228
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-8634 quota: fix return code of intent quota lock 51/22751/3
Niu Yawei [Mon, 26 Sep 2016 03:19:11 +0000 (23:19 -0400)]
LU-8634 quota: fix return code of intent quota lock

Intent quota operation should return error code in lock_policy_res2
like other intent operations, otherwise, it'll be confused with the
error code returned by intent locking.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I99fd04f72eeb6d10380ebd30da928b5749e74443
Reviewed-on: http://review.whamcloud.com/22751
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago New tag 2.8.60 2.8.60 v2_8_60 v2_8_60_0
Oleg Drokin [Thu, 3 Nov 2016 04:20:49 +0000 (00:20 -0400)]
New tag 2.8.60

Change-Id: Id8cdb79c3377385f1ae53f18e0969c99e52be59a
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8246 ldlm: Do not grant a lock twice in a race 39/20839/7
Oleg Drokin [Thu, 16 Jun 2016 20:22:28 +0000 (16:22 -0400)]
LU-8246 ldlm: Do not grant a lock twice in a race

This leads to wrong ldlm pool accounting of granted locks.
Also handle the case of a destroyed lock.

Change-Id: Ied262d6688766e37f71304e6ee6659b48124e7ad
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/20839
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
8 years ago LU-8573 lnet: Revert LU-7650 patches 39/23439/3
Alex Zhuravlev [Thu, 27 Oct 2016 18:59:03 +0000 (21:59 +0300)]
LU-8573 lnet: Revert LU-7650 patches

These patches are causing LU-8573

Revert "LU-7650 o2iblnd: Put back work queue check previously removed"

This reverts commit bde1da1ec098450f40887587b0a46c9eb86a4f6c.

Revert "LU-7650 o2iblnd: handle mixed page size configurations."

This reverts commit 399a5ac1fc73343c69e0fd737032adf5329df1b2.

Change-Id: I4517fab64ac5b1023e615874f58b1bb4902e8c43
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/23439
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8707 build: fix lbuild-sles for kernel_module_package 66/23166/6
Minh Diep [Thu, 13 Oct 2016 21:10:33 +0000 (14:10 -0700)]
LU-8707 build: fix lbuild-sles for kernel_module_package

The kernel_module_package macro also checks for
/boot/symsets-$kver-$flavor.tag.gz.
In the case of lbuild, we need to point it to the lbuild
kernel-source directory.

Change-Id: I3cf9c1f43fe9ea543f67967773fc8715325a47e9
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: http://review.whamcloud.com/23166
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8731 utils: propagate errors in lfs df 86/23286/3
John L. Hammond [Thu, 20 Oct 2016 16:52:44 +0000 (11:52 -0500)]
LU-8731 utils: propagate errors in lfs df

Add llapi_obd_fstatfs() which does the same thing as
llapi_obd_statfs() but takes an open file descriptor instead of a
path. Refactor the handler for 'lfs df' to use llapi_obd_fstatfs(),
thereby avoiding opening the mount point for each target and making
the error conditions easier to understand. Propagate errors from
llapi_obd_fstatfs() as the exit status of 'lfs df'.

In conf-sanity.sh test_64() allow 'lfs df' to fail when a target is
offline.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iabfc92a65571b1a277de7fd42431f5b7e45ad440
Reviewed-on: http://review.whamcloud.com/23286
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
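
Note on LU-8731: the path-versus-descriptor split described in that
commit message is a common pattern. The sketch below illustrates it with
the POSIX fstatvfs() call rather than the Lustre llapi functions, whose
signatures are not given here; fs_stat_fd() and fs_stat_path() are
hypothetical names. It also shows open() being checked once, by its
return value, with no retry.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Descriptor-based variant: the caller keeps one fd open and reuses it. */
int fs_stat_fd(int fd, struct statvfs *out)
{
	return fstatvfs(fd, out) < 0 ? -errno : 0;
}

/* Path-based variant: a thin wrapper that opens, delegates, and closes. */
int fs_stat_path(const char *path, struct statvfs *out)
{
	int fd;
	int rc;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;	/* errno is only meaningful after a failure */

	rc = fs_stat_fd(fd, out);
	close(fd);
	return rc;
}
```

Returning a negative errno from both variants lets a command-line tool
pass the first failure straight through as its exit status.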
8 years ago LU-8723 llapi: correct open() handling in llapi_obd_statfs() 85/23285/3
John L. Hammond [Thu, 20 Oct 2016 14:47:00 +0000 (09:47 -0500)]
LU-8723 llapi: correct open() handling in llapi_obd_statfs()

In llapi_obd_statfs() remove a spurious errno test and retry after
open().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1667f1f0acf0e1f0d700049bc39a5e6c462b9df6
Reviewed-on: http://review.whamcloud.com/23285
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
8 years ago LU-8748 osd-zfs: set block size of echo object 23/23323/3
Niu Yawei [Mon, 24 Oct 2016 04:38:39 +0000 (00:38 -0400)]
LU-8748 osd-zfs: set block size of echo object

Set block size for zfs echo object.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I6efab645181ab3de6686bf82f4ecbf9ea3384b1b
Reviewed-on: http://review.whamcloud.com/23323
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-8311 doc: add NIDs examples to mkfs.lustre and mount.lustre 55/23355/4
Jian Yu [Tue, 25 Oct 2016 11:19:20 +0000 (19:19 +0800)]
LU-8311 doc: add NIDs examples to mkfs.lustre and mount.lustre

This patch adds examples of how NIDs can be specified on
mkfs.lustre and mount command lines.

Test-Parameters: trivial

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I93a8f843755582f94844504f604409dd43b617f9
Reviewed-on: http://review.whamcloud.com/23355
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Zhiqi Tao <zhiqi.tao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8775 osd: do not report special writes in brw stats 63/23363/3
Alex Zhuravlev [Tue, 25 Oct 2016 14:24:51 +0000 (17:24 +0300)]
LU-8775 osd: do not report special writes in brw stats

Do not report special writes (e.g. last_rcvd) made with copying
writes in brw_stats.

Change-Id: Id92ed13a8d241e6489731d51a546c3583a2156b8
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/23363
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Zhiqi Tao <zhiqi.tao@intel.com>
Tested-by: Zhiqi Tao <zhiqi.tao@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8755 kernel: kernel update [SLES12 SP1 3.12.62-60.64.8] 64/23364/2
Bob Glossman [Tue, 25 Oct 2016 01:22:44 +0000 (18:22 -0700)]
LU-8755 kernel: kernel update [SLES12 SP1 3.12.62-60.64.8]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12 testgroup=review-ldiskfs \
  mdsdistro=sles12 ossdistro=sles12 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I52ab5a1ecff81e470b965af536fd0f638a120546
Reviewed-on: http://review.whamcloud.com/23364
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8751 kernel: kernel update RHEL7.2 [3.10.0-327.36.3.el7] 35/23335/3
Bob Glossman [Mon, 24 Oct 2016 15:36:09 +0000 (08:36 -0700)]
LU-8751 kernel: kernel update RHEL7.2 [3.10.0-327.36.3.el7]

update RHEL7.2 kernel to 3.10.0-327.36.3.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I0c6c457a6e48c4166508572d91e7b98f9ed4ad86
Reviewed-on: http://review.whamcloud.com/23335
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8749 osd-ldiskfs: inherit S_ISGID correctly 29/23329/3
Lai Siyao [Mon, 24 Oct 2016 08:37:04 +0000 (16:37 +0800)]
LU-8749 osd-ldiskfs: inherit S_ISGID correctly

For a remote directory, S_ISGID is inherited on the agent, not where
the file resides, and the group is also inherited from the parent.

Update sanity 6g to test this.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I159b2687ad00fdc7c35f60a18668a015240a1953
Reviewed-on: http://review.whamcloud.com/23329
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8733 gnilnd: Remove read capability of cksum_test procfile 55/23255/5
Chris Horn [Wed, 19 Oct 2016 16:17:58 +0000 (11:17 -0500)]
LU-8733 gnilnd: Remove read capability of cksum_test procfile

When the old create proc interface was deprecated, cksum_test was
updated to use the new file operations table. Inadvertently, read
was left as a capability without actually defining a function
that the file would use when someone tried to read the file.
This causes a kernel crash when cksum_test is read, though it can
only be done by the root user.

The fix is to remove the .read op from the fops table for the
cksum_test proc entry

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I406b076f1b66b6d991694c69a9b748ed42c09f39
Reviewed-on: http://review.whamcloud.com/23255
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7206 mdd: stop orphan cleanup before finish FLD 29/23029/3
Di Wang [Sun, 2 Oct 2016 12:51:07 +0000 (08:51 -0400)]
LU-7206 mdd: stop orphan cleanup before finish FLD

Stop the orphan cleanup thread in the PRE_CLEANUP phase.
Orphan cleanup threads might need to look up the FLD
(__mdd_orphan_cleanup()->mdd_object_init()->
lod_object_alloc()->lod_fld_lookup()), so stop them
before FLD cleanup.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I8df9832c633017e2fca866579b497f8215054d31
Reviewed-on: http://review.whamcloud.com/23029
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8347 ldlm: granting conflicting locks 59/21059/5
Andriy Skulysh [Wed, 29 Jun 2016 11:07:23 +0000 (14:07 +0300)]
LU-8347 ldlm: granting conflicting locks

Postpone lock reprocess during lock replay stage.
Reprocess is needed during request replay stage
because local locks are still in use until
client ACK.

Change-Id: I250d22fee471db643f12a900fdfc51eacfa94aa2
Seagate-bug-id: MRP-3516
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/21059
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8633 llite: do not clear uptodate bit in page delete 27/22827/3
Jinshan Xiong [Thu, 29 Sep 2016 21:31:01 +0000 (14:31 -0700)]
LU-8633 llite: do not clear uptodate bit in page delete

Otherwise, if the race between page fault and truncate occurs, it
will cause the page fault routine to return an EIO error.

In filemap_fault() {
    page_not_uptodate:
...
        ClearPageError(page);
        error = mapping->a_ops->readpage(file, page);
        if (!error) {
                wait_on_page_locked(page);
                if (!PageUptodate(page))
                        error = -EIO;
        }
...
}

However, I tend to think this is a defect in kernel implementation,
because it assumes PageUptodate shouldn't be cleared but file read
routine doesn't make the same assumption.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ic4a919607a6121098e41eaf56b8ce3200f778ecf
Reviewed-on: http://review.whamcloud.com/22827
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Li Dongyang <dongyang.li@anu.edu.au>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8680 osc: soft lock - osc_makes_rpc() 26/23326/6
Bobi Jam [Mon, 24 Oct 2016 05:11:25 +0000 (13:11 +0800)]
LU-8680 osc: soft lock - osc_makes_rpc()

It is possible that an osc_extent contains more than 256 chunks, and
the IO engine won't add this extent in one RPC
(try_to_add_extent_for_io), so that osc_check_rpcs() runs into a loop
upon this extent and never breaks.

This patch changes osc_max_write_chunks() to make sure the value
can cover all possible osc_extent, so that all osc_extent will be
added into one RPC.

This patch also adds another field erd_max_extents in extent_rpc_data
to make sure not to add too many fragments in a single RPC.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Icf58a6bd04655bb9aa5589dd002e118c21ed932d
Reviewed-on: http://review.whamcloud.com/23326
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-930 doc: move DLC doc to Documentation dir 31/22931/3
Andreas Dilger [Tue, 4 Oct 2016 17:10:25 +0000 (11:10 -0600)]
LU-930 doc: move DLC doc to Documentation dir

Move DLC documentation to top-level Documentation/ directory instead
of in the lustre/doc directory where man pages are kept.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I4e5c01a9ee796e099dbfcfb73a315c8187931cf0
Reviewed-on: http://review.whamcloud.com/22931
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Joseph Gmitter <joseph.gmitter@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8590 gss: Move DH parameter generation out of upcall 22/23322/5
Jeremy Filizetti [Sun, 2 Oct 2016 19:40:24 +0000 (15:40 -0400)]
LU-8590 gss: Move DH parameter generation out of upcall

This change adds the Diffie-Hellman parameter generation to the
lgss_sk utility prior to key loading.  The parameters are now
persistent to prevent long DH parameter generation times which
can cause mount command and connection timeouts.

This is based on recommendations from Matt Wood at Intel's
security review.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: Iba840168da533662ed8ec78004be9e4dc5369c68
Reviewed-on: http://review.whamcloud.com/23322
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8186 llite: Typo in ll_rw_extents_stats_pp_seq_show 48/23248/4
Steve Guminski [Tue, 18 Oct 2016 19:38:13 +0000 (15:38 -0400)]
LU-8186 llite: Typo in ll_rw_extents_stats_pp_seq_show

Add a missing quote character to ll_rw_extents_stats_pp_seq_show. Also
correct leading whitespace to match coding guidelines.

This corrects the text displayed on clients in
/proc/fs/lustre/llite/.../extents_stats_per_process

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I3046836372182925ea0f3b0f5909ae7f8dc5efd1
Reviewed-on: http://review.whamcloud.com/23248
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8124 osd-zfs: fix statfs small blocksize inode estimate 23/20123/7
Andreas Dilger [Tue, 10 May 2016 17:33:22 +0000 (11:33 -0600)]
LU-8124 osd-zfs: fix statfs small blocksize inode estimate

When a small recordsize is specified for the MDT dataset (e.g. 4KB)
the current statfs estimate for the total number of dnodes available
is constrained to assume one dnode per 4KB block.  However, if the
ZFS sector size is 4KB (ashift=12) then the SA (xattr) spill block
will also be allocated in 4KB units and ditto'd, consuming 8.5KB per
dnode plus extra overhead (OI, directory ZAP, etc).  If lots of
directories are created, there will be up to 64KB of space consumed
per dnode.  This throws off the dnode estimations significantly.

Instead, do not constrain the statfs dnode calculation by the small
recordsize and use the actual average space per dnode when estimating
the total number of dnodes the filesystem can hold.
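
The size of the error can be checked with simple arithmetic. The
sketch below is not the osd-zfs code; dnodes_estimate() and the 1 GiB
figure are assumptions chosen for illustration, and only the per-dnode
sizes (4KB, 8.5KB, 64KB) come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Per-dnode space figures taken from the commit message above. */
#define DEMO_ASSUMED_BYTES_PER_DNODE  4096ULL   /* one dnode per 4KB block */
#define DEMO_ACTUAL_BYTES_PER_DNODE   8704ULL   /* 8.5KB with ditto'd spill block */
#define DEMO_WORST_BYTES_PER_DNODE    65536ULL  /* up to 64KB with many directories */

/*
 * Hypothetical helper: how many dnodes fit in FREE_BYTES if each one
 * consumes BYTES_PER_DNODE on average.
 */
static uint64_t dnodes_estimate(uint64_t free_bytes, uint64_t bytes_per_dnode)
{
    assert(bytes_per_dnode > 0);
    return free_bytes / bytes_per_dnode;
}
```

With 1 GiB free, the 4KB assumption predicts more than twice as many
dnodes as the 8.5KB figure allows, and sixteen times as many as the
64KB worst case.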

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5403a855a0bd3d9077ef0e661d2f262ffa2cab07
Reviewed-on: http://review.whamcloud.com/20123
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8250 mdd: move linkea prepare out of transaction. 96/23096/5
Di Wang [Tue, 11 Oct 2016 19:30:53 +0000 (15:30 -0400)]
LU-8250 mdd: move linkea prepare out of transaction.

Move linkea prepare out of transaction to avoid reading
linkea remotely inside the transaction.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I10f0979c0c496fdcc5349f36ac5cca123d42c8a5
Reviewed-on: http://review.whamcloud.com/23096
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8378 all: remove set but unused variables 21/23221/2
Yang Sheng [Tue, 18 Oct 2016 04:50:22 +0000 (12:50 +0800)]
LU-8378 all: remove set but unused variables

Remove set but unused variables as report list.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: If9120ec088a2dd0b65564330bc295c08a1e579b7
Reviewed-on: http://review.whamcloud.com/23221
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8189 osc: osc_match_base prototype differs from declaration 67/23167/2
Steve Guminski [Fri, 14 Oct 2016 15:25:36 +0000 (11:25 -0400)]
LU-8189 osc: osc_match_base prototype differs from declaration

The patch updates the prototype in osc_internal.h to match the
enums used in the declaration.

The osc_match_base declaration in lustre/osc/osc_request.c uses
enums for stricter checking on the type and mode parameters:

int osc_match_base(struct obd_export *exp,
   ...
-->    enum ldlm_type type,
   union ldlm_policy_data *policy,
-->    enum ldlm_mode mode,
   ... int unref)

The prototype in lustre/osc/osc_internal.h instead used unsigned ints:

int osc_match_base(struct obd_export *exp,
   ...
-->    __u32 type,
   union ldlm_policy_data *policy,
-->    __u32 mode,
   ... int unref);

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I7ccc6383e0e12bf4fe5b5c3bad822f3322aaa1ff
Reviewed-on: http://review.whamcloud.com/23167
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6303 osc: remove handling cl_avail_grant less than zero 55/23155/2
James Simmons [Thu, 13 Oct 2016 23:13:58 +0000 (19:13 -0400)]
LU-6303 osc: remove handling cl_avail_grant less than zero

Earlier cl_avail_grant was changed to an unsigned int. Julia
Lawall reported the following for the upstream client, which
affects the Intel branch as well:

drivers/staging/lustre/lustre/osc/osc_request.c:1045:5-24: WARNING: Unsigned
     expression compared with zero: cli -> cl_avail_grant < 0

Since cl_avail_grant can never be negative we can remove the
code handling the negative value case.
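
Why the removed branch could never run can be seen in a small
userspace sketch. The helper names below are hypothetical and are not
taken from osc_request.c. Subtracting past zero in unsigned arithmetic
wraps around to a large positive value, so a "less than zero" test on
the result can never be true; underflow has to be detected before the
subtraction.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for grant accounting: unsigned subtraction
 * wraps modulo UINT_MAX + 1 instead of going negative.
 */
static unsigned int demo_consume_grant(unsigned int avail, unsigned int used)
{
    return avail - used;
}

/* Underflow must be checked by comparing the operands beforehand. */
static bool demo_grant_would_underflow(unsigned int avail, unsigned int used)
{
    return used > avail;
}
```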

Change-Id: I10f7ac3aaab7ebf03a7f7ac0717b60134f09cddf
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/23155
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8697 llite: remove IS_ERR(master_inode) check 51/23151/2
James Simmons [Thu, 13 Oct 2016 22:32:51 +0000 (18:32 -0400)]
LU-8697 llite: remove IS_ERR(master_inode) check

The kernel function ilookup5_nowait() never returns
an error pointer, so we can remove the IS_ERR check in
the ll_md_blocking_ast() function.
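
For readers unfamiliar with the kernel's error-pointer convention,
here is a userspace sketch of it. demo_is_err() and demo_err_ptr() are
hypothetical re-implementations written for illustration, not the
kernel macros themselves. An error pointer encodes a negative errno in
the top 4095 values of the address space, so NULL is never an error
pointer, and a function that reports failure with NULL needs a NULL
check instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace model of the kernel's IS_ERR(): true only for encoded errnos. */
static bool demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Userspace model of ERR_PTR(): encode a negative errno as a pointer. */
static void *demo_err_ptr(int err)
{
    assert(err < 0 && err >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)err;
}
```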

Change-Id: I5e72a8f70857f178a2377e9db80b2e2139c56ec3
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/23151
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8705 tests: do not skip lnet-selftest for DNE 38/23138/2
Elena Gryaznova [Thu, 13 Oct 2016 18:11:44 +0000 (21:11 +0300)]
LU-8705 tests: do not skip lnet-selftest for DNE

Patch removes the skip added by LU-4181.

Test-Parameters: mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=lnet-selftest
Seagate-bug-id: MRP-3912
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Tested-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Change-Id: Id78ef3909896325d55569ea948f75906cf0b7c87
Reviewed-on: http://review.whamcloud.com/23138
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years agoLU-7593 target: take ted_lcd_lock after transaction started 29/23129/3
Fan Yong [Wed, 27 Jul 2016 14:55:00 +0000 (22:55 +0800)]
LU-7593 target: take ted_lcd_lock after transaction started

Otherwise thread1 may block during transaction start in
tgt_client_data_update() with 'ted_lcd_lock' held, while thread2,
which has already started its transaction handle, blocks on that
lock in tgt_txn_stop_cb(). That is a deadlock.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Id623d171e43beaa54ae4a9718fb4dc52c474df01
Reviewed-on: http://review.whamcloud.com/23129
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8692 kernel: kernel update RHEL7.2 [3.10.0-327.36.2.el7] 02/23102/3
Bob Glossman [Tue, 11 Oct 2016 13:39:16 +0000 (06:39 -0700)]
LU-8692 kernel: kernel update RHEL7.2 [3.10.0-327.36.2.el7]

Update RHEL7.2 kernel to 3.10.0-327.36.2.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I936a4d6f73913d64af76fdbaf51964d7ad2c7e8f
Reviewed-on: http://review.whamcloud.com/23102
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8695 target: use -1 as an offset to declare write 82/23082/3
Alex Zhuravlev [Tue, 11 Oct 2016 16:15:38 +0000 (19:15 +0300)]
LU-8695 target: use -1 as an offset to declare write

At the end of recovery or filesystem setup the number of clients
may increase significantly. This can lead to underestimated space
or credits reserved.

Change-Id: Id4f3755dc481f8a29a1a2a673c26d64d12f7dbf0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/23082
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8685 kernel: jbd2: fix incorrect unlock on j_list_lock 50/23050/2
Bruno Faccini [Mon, 10 Oct 2016 13:10:47 +0000 (15:10 +0200)]
LU-8685 kernel: jbd2: fix incorrect unlock on j_list_lock

This patch has been back-ported to avoid kernel Oopses/BUG()s
due to j_list_lock found unlocked when expected to be locked!

In jbd2_journal_get_create_access(),
when 'jh->b_transaction == transaction' (asserted by below)

  J_ASSERT_JH(jh, (jh->b_transaction == transaction || ...

'journal->j_list_lock' will be incorrectly unlocked, since
the lock is acquired only at the end of the if / else-if
statements (missing the else case).

This bug has been introduced by an earlier change named
"jbd2: minimize region locked by j_list_lock in
journal_get_create_access()".

Signed-off-by: Taesoo Kim <tsgatesv@gmail.com>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ifb8b038333e523caa1b274f53f49317182895de5
Reviewed-on: http://review.whamcloud.com/23050
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
8 years agoLU-8683 readahead: update ras window correctly 32/23032/3
Bobi Jam [Sat, 19 Dec 2015 02:10:29 +0000 (10:10 +0800)]
LU-8683 readahead: update ras window correctly

When a stride-RA hit misses, we only reset the normal sequential
read-ahead window, but do not reset the stride IO, to avoid the
overhead of re-detecting stride IO. However, when the normal RA window
is set so that it does not intersect with the stride-RA window, the
presumption no longer holds when we later try to increase the
stride-RA window length.

This patch resets the stride IO as well in this case.

Lustre-change: http://review.whamcloud.com/17343
Lustre-commit: 88ef5af0bed93c88984c226db755d07601aef60f

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Iba6e51f12ac4d00548cc99b7bd423502b754db13
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/23032
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
8 years agoLU-8631 quota: better error message for 'lfs quota' 21/23021/4
Niu Yawei [Sat, 8 Oct 2016 02:56:23 +0000 (22:56 -0400)]
LU-8631 quota: better error message for 'lfs quota'

'lfs quota' should return a useful error message when it is
issued on a non-Lustre fs.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9dd08982077132756c8684b44430251da53cbb90
Reviewed-on: http://review.whamcloud.com/23021
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8492 ptlrpc: Correctly calculate hrp->hrp_nthrs 06/19106/6
Amir Shehata [Wed, 23 Mar 2016 20:14:37 +0000 (13:14 -0700)]
LU-8492 ptlrpc: Correctly calculate hrp->hrp_nthrs

cpu_pattern can specify exactly 1 cpu in a partition:
"0[0]". That means CPT0 will have CPU 0. CPU 0 can have
hyperthreading enabled. This combination would result in

weight = cfs_cpu_ht_nsiblings(0);
hrp->hrp_nthrs = cfs_cpt_weight(ptlrpc_hr.hr_cpt_table, i);
hrp->hrp_nthrs /= weight;

evaluating to 0. Where
cfs_cpt_weight(ptlrpc_hr.hr_cpt_table, i) == 1
weight == 2

Therefore, if hrp_nthrs becomes zero, just set it to 1.
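
The arithmetic can be reproduced outside the kernel. The sketch below
uses a hypothetical helper, demo_hr_threads(), in place of the real
ptlrpc code, and takes the partition weight and sibling count as plain
arguments instead of calling cfs_cpt_weight() and
cfs_cpu_ht_nsiblings().

```c
#include <assert.h>

/*
 * Hypothetical helper modelling the calculation described above:
 * divide the partition weight by the number of hyperthread siblings,
 * and never let the result drop to zero.
 */
static unsigned int demo_hr_threads(unsigned int cpt_weight,
                                    unsigned int ht_siblings)
{
    unsigned int nthrs;

    assert(ht_siblings > 0);

    nthrs = cpt_weight / ht_siblings;   /* integer division: 1 / 2 == 0 */
    if (nthrs == 0)
        nthrs = 1;                      /* always keep at least one thread */

    return nthrs;
}
```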

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Id89d381436b2c61354d925420f2efce8d9a54864
Reviewed-on: http://review.whamcloud.com/19106
Tested-by: Jenkins
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8667 osp: validate FID before initializing precreate seq 34/22934/7
Di Wang [Sat, 1 Oct 2016 14:20:57 +0000 (10:20 -0400)]
LU-8667 osp: validate FID before initializing precreate seq

A few fixes for FID on OST...

Check if the OSP FID is valid before initializing
precreate sequence, in case the last_seq/oid file
is corrupted. Only MDT0 can use IDIF, and non-MDT0
can only use normal FID, which will also make sure
the following orphan cleanup will use the valid
sequence.

OFD will validate the object sequence to make sure
IDIF request will only operate on MDT0 group.

If MDT can not get new sequence from OST, it will
sleep 2 seconds before retry, to offer OSTs some
extra time to setup FID service with sequence
controller (MDT0).

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Id9bbfaf9170ff1a9240719eaee73ddcb4fd804e5
Reviewed-on: http://review.whamcloud.com/22934
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-4931 ladvise: add code for ladvise_hdr into wirecheck.c 40/21940/3
Gu Zheng [Tue, 16 Aug 2016 04:10:41 +0000 (12:10 +0800)]
LU-4931 ladvise: add code for ladvise_hdr into wirecheck.c

Add code into wirecheck.c to generate the ladvise_hdr checks
in wiretest.c.

Test-Parameters: trivial

Signed-off-by: Gu Zheng <gzheng@ddn.com>
Change-Id: Ic4488b2d6004d284a4fbf123ab7a0688da227212
Reviewed-on: http://review.whamcloud.com/21940
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8089 lwp: change lwp export only at first connect 26/13726/9
Mikhail Pershin [Thu, 20 Aug 2015 18:59:02 +0000 (21:59 +0300)]
LU-8089 lwp: change lwp export only at first connect

Fix lwp connection logic, change export only at first
connection.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I5b528c2f6d142b8cfdd8d84f3f540289e45d557f
Reviewed-on: http://review.whamcloud.com/13726
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8694 docs: ZFS hostid mkfs.lustre(8) man page update 18/23118/2
Nathaniel Clark [Tue, 11 Oct 2016 17:01:13 +0000 (13:01 -0400)]
LU-8694 docs: ZFS hostid mkfs.lustre(8) man page update

Add note about needing to reload spl module after creating /etc/hostid
file.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ie4c3697567ed2722742324fd70f382bb46a886d6
Reviewed-on: http://review.whamcloud.com/23118
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8713 utils: Try loading zfs.ko during zfs_init 10/23210/4
Nathaniel Clark [Mon, 17 Oct 2016 16:26:22 +0000 (12:26 -0400)]
LU-8713 utils: Try loading zfs.ko during zfs_init

Newer versions of zfs (0.6.5.8 and 0.7.0) do not autoload the zfs
module at boot, nor do they load it during libzfs_init(), so try
loading it during initialization.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I44765b0c2b7ea8b1c8c6d45a9107842b17623dbc
Reviewed-on: http://review.whamcloud.com/23210
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8574 osd-ldiskfs: fix FID-in-dirent properly 10/22310/5
Fan Yong [Thu, 14 Jul 2016 02:11:54 +0000 (10:11 +0800)]
LU-8574 osd-ldiskfs: fix FID-in-dirent properly

Sometimes a directory entry may be corrupted and contain a bad
ino# that points to an inode belonging to another object.
There are two cases:

1) If such an inode is unused, then the upper-layer namespace LFSCK
   may handle it as a dangling name entry (by guess), and as the
   LFSCK progresses, the original inode (orphan) may be found; the
   LFSCK will then fix the bad name entry to reference the orphan.

2) If such an inode is used by another object, then
   osd_dirent_check_repair() will find that the FID-in-dirent does
   not match the FID-in-LMA, and fix the FID-in-dirent. So the
   upper-layer namespace LFSCK will get the wrong FID (not the
   original FID-in-dirent). In that case, the upper-layer LFSCK will
   fix it as an unmatched pair, and the orphan cannot be recovered.

In fact, when injecting a failure stub for the LFSCK sanity test, we
hit issues similar to case 2). To resolve this, we can enhance the
osd_dirent_check_repair() logic as follows:
If the FID-in-dirent does not match the inode's FID-in-LMA, then
check the inode's linkEA. If it recognizes the dirent (parent dir's
FID + child name), then trust the inode and fix the FID-in-dirent
with the FID-in-LMA. Otherwise, the dirent may be corrupted; as the
LFSCK proceeds, the right orphan may be found and the entry fixed
later. The worst case is that the inconsistency is detected but left
in place, but it will not make a wrong fix.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I3516a6ad9cc1766453612c440df7db02bc2f09a4
Reviewed-on: http://review.whamcloud.com/22310
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6808 ptlrpc: no need to reassign mbits for replay 48/23048/6
Niu Yawei [Mon, 10 Oct 2016 11:08:54 +0000 (07:08 -0400)]
LU-6808 ptlrpc: no need to reassign mbits for replay

It's not necessary to reassign and re-adjust rq_mbits for a replay
request in ptlrpc_set_bulk_mbits(); they must all have already
been correctly assigned.

Such an unnecessary reassignment could make the first matchbit not
PTLRPC_BULK_OPS_MASK aligned, which will trigger the LASSERT in
ptlrpc_register_bulk():

- ptlrpc_set_bulk_mbits() is called when the request is first
  sent; rq_mbits is set to the xid, which is BULK_OPS aligned;

- ptlrpc_set_bulk_mbits() then adjusts the mbits for a
  multi-bulk RPC, so rq_mbits is no longer aligned; rq_xid
  is changed accordingly if the client is connecting to an old
  server, so rq_xid becomes unaligned too;

- The request is replayed and ptlrpc_set_bulk_mbits() reassigns
  rq_mbits from rq_xid, which is already unaligned, but
  ptlrpc_register_bulk() still treats this value as the
  first matchbits and LASSERTs that it is BULK_OPS aligned.
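
The sequence above can be modelled with a few lines of bit arithmetic.
Everything named DEMO_ or demo_ below is an assumption made for
illustration: the real constants and functions live in the Lustre
headers and may use different values, and the adjustment is reduced
here to "advance by the number of extra bulk operations".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define DEMO_BULK_OPS_BITS   4
#define DEMO_BULK_OPS_COUNT  (1ULL << DEMO_BULK_OPS_BITS)
#define DEMO_BULK_OPS_MASK   (~(DEMO_BULK_OPS_COUNT - 1))

/* A value is "BULK_OPS aligned" when its low BULK_OPS bits are zero. */
static bool demo_mbits_aligned(uint64_t mbits)
{
    return (mbits & ~DEMO_BULK_OPS_MASK) == 0;
}

/*
 * First send: start from the aligned xid, then advance by the number
 * of extra bulk operations a multi-bulk RPC needs.
 */
static uint64_t demo_adjust_mbits(uint64_t xid, unsigned int bulk_ops)
{
    assert(bulk_ops >= 1 && bulk_ops <= DEMO_BULK_OPS_COUNT);
    return xid + bulk_ops - 1;
}
```

In this model an aligned xid of 0x100 with three bulk operations
yields 0x102. If that value were later reused as the first matchbits
on replay, an alignment assertion on it would fail.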

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ib5d5a969702d3b621fb44643586cc19bf931c365
Reviewed-on: http://review.whamcloud.com/23048
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8450 nodemap: modify ldlm_revoke_export_locks 20/23120/3
Kit Westneat [Wed, 12 Oct 2016 22:16:14 +0000 (18:16 -0400)]
LU-8450 nodemap: modify ldlm_revoke_export_locks

Modify ldlm_revoke_export_locks to use cfs_hash_for_each_nolock
instead of cfs_hash_for_each_empty, to avoid looping multiple times
over the export hash.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I8ab475b8117493ed55961e94e747b4955141f003
Reviewed-on: http://review.whamcloud.com/23120
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8413 test: limit # of processes for sanity test_101f 39/23039/3
Bobi Jam [Mon, 10 Oct 2016 05:44:27 +0000 (13:44 +0800)]
LU-8413 test: limit # of processes for sanity test_101f

It tests that mmap readahead does not miss too many pages,
so we need to limit the number of processes of the iozone read.

Test-Parameters: trivial
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic5c753549234fe2bffcaa42f75208c2295f84ac0
Reviewed-on: http://review.whamcloud.com/23039
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8682 llite: protect from accessing NULL lli_clob 31/23031/3
Bobi Jam [Tue, 27 Oct 2015 13:17:41 +0000 (21:17 +0800)]
LU-8682 llite: protect from accessing NULL lli_clob

Need to check file's lli_clob object before calling
lov_read_and_clear_async_rc().

lustre-change: http://review.whamcloud.com/16954
lustre-commit: 4e416d205b0d04e291e0f279741af85fa73845d2

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: LDEV-231
Change-Id: I210b22d4e17dd4a407378aa987434d1940799f1f
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/23031
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years agoLU-8681 osd: ignore ENODATA during unlink agent parent 30/23030/2
Niu Yawei [Sun, 9 Oct 2016 15:00:42 +0000 (11:00 -0400)]
LU-8681 osd: ignore ENODATA during unlink agent parent

If the directory was created before 2.0, it has no LMA, so
osd_delete_from_remote_parent() returns ENODATA. Ignore ENODATA
in this case; otherwise the unlink fails.
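
The error handling above can be sketched as follows. The helper name
filter_unlink_rc() is hypothetical; it only illustrates treating
-ENODATA as success and passing every other result through.

```c
#include <errno.h>

/*
 * Hypothetical helper: map the result of removing the entry from the
 * remote (agent) parent. A directory created before 2.0 has no LMA,
 * so -ENODATA is expected and must not fail the unlink.
 */
static int filter_unlink_rc(int rc)
{
	if (rc == -ENODATA)
		return 0;
	return rc;
}
```

Only this one error code is swallowed, so genuine failures such as
-EIO still reach the caller.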

Lustre-commit: 860283cb433dff4246a5c255ed89325323ee8e7c
Lustre-change: http://review.whamcloud.com/16760

Signed-off-by: Wang Di <di.wang@intel.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: LDEV-64
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I728c081f324253080c5d2cb740b3a11b26f9d570
Reviewed-on: http://review.whamcloud.com/23030
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
8 years agoLU-8669 kernel: kernel update RHEL6.8 [2.6.32-642.6.1.el6] 60/22960/2
Bob Glossman [Tue, 4 Oct 2016 21:53:57 +0000 (14:53 -0700)]
LU-8669 kernel: kernel update RHEL6.8 [2.6.32-642.6.1.el6]

Update RHEL6.8 kernel to 2.6.32-642.6.1.el6

Test-Parameters: clientdistro=el6.8 mdsdistro=el6.8 ossdistro=el6.8 \
  mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I84e92c261aec3c438112504bdf51555214fcc845
Reviewed-on: http://review.whamcloud.com/22960
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8636 build: prevent src download at the same time 39/22939/7
Minh Diep [Tue, 4 Oct 2016 22:11:36 +0000 (15:11 -0700)]
LU-8636 build: prevent src download at the same time

If the lbuild kernelsrc/kernelrpm directory is shared
among the builders, we need to keep them from using
it while it's being downloaded

Test-Parameters: trivial

Change-Id: I240996c84ea541f5985b8d6ec73e3c6a56d2d805
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: http://review.whamcloud.com/22939
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8660 mgs: handle return code of server_make_name() 67/22867/3
James Simmons [Tue, 4 Oct 2016 13:44:45 +0000 (09:44 -0400)]
LU-8660 mgs: handle return code of server_make_name()

Make sure server_make_name() actually succeeded when
called in mkfs_lustre utility and mgs_set_index().

Change-Id: I218351f0f3dd98e1b928664f872c1702a419a7cc
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22867
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7501 utils: keep lfs arguments consistent 81/22581/7
Yang Sheng [Sun, 18 Sep 2016 11:33:17 +0000 (19:33 +0800)]
LU-7501 utils: keep lfs arguments consistent

lfs getstripe: use -m and deprecate -M, to match lfs setstripe

lfs setstripe: use --ost and deprecate --ost-list, to match lfs find

lfs find: add --mdt-index, to match lfs setstripe

lfs getstripe and lfs migrate: add --mdt, to match lfs find

lfs setdirstripe: add --mdt-count|-c, --mdt-index|-m, and --mdt-hash
    as aliases for the existing --count|-c, --index|-i, and
    --hash-type|-t options

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I8983b7438e7046657a087e97d9fbc467242c9dd8
Reviewed-on: http://review.whamcloud.com/22581
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8239 utils: llanalyze drops useful logs 41/20641/6
Artem Blagodarenko [Mon, 6 Jun 2016 14:55:42 +0000 (17:55 +0300)]
LU-8239 utils: llanalyze drops useful logs

llanalyze should not execute entering_rpc and leaving_rpc
if the rpctrace option is not set; otherwise the wrong
pid is set and useful logs are skipped.

This patch executes entering_rpc and leaving_rpc
only if rpctrace is set.

Test-Parameters: trivial
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: I4676b0ca80f8ed314716e9368558489b64825ffa
Reviewed-on: http://review.whamcloud.com/20641
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years agoLU-8239 utils: llanalyze logs parser fix 32/20632/9
Artem Blagodarenko [Mon, 6 Jun 2016 11:19:31 +0000 (14:19 +0300)]
LU-8239 utils: llanalyze logs parser fix

llanalyze appears to be broken:
- the thread PID is not parsed correctly;
- the start and finish messages of completion and blocking upcalls
are parsed incorrectly.

This patch fixes these log parsing problems.

Test-Parameters: trivial
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: I412120a2c7e877ba5374178a6a246984e2dcca08
Reviewed-on: http://review.whamcloud.com/20632
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years agoLU-8289 utils: add ll_decode_linkea tool 44/20444/15
Li Xi [Thu, 6 Oct 2016 23:13:47 +0000 (19:13 -0400)]
LU-8289 utils: add ll_decode_linkea tool

An MDT recovered by fsck might contain some files under the
lost+found directory of ldiskfs. To find the right path to move
them to, the trusted.link xattr can be used to extract the
parent FIDs.

This patch adds a new tool, ll_decode_linkea, to dump the parent
FIDs of a file.

Test-Parameters: trivial
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1da6ea78ab2f9e2db8fbdfdcfb8690c57e9eb2b0
Reviewed-on: http://review.whamcloud.com/20444
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6635 lfsck: block replacing the OST-object for test 46/18146/4
Fan Yong [Fri, 17 Jun 2016 03:20:39 +0000 (11:20 +0800)]
LU-6635 lfsck: block replacing the OST-object for test

For sanity-lfsck test_18e, sometimes the layout LFSCK has already
entered phase 2 and replaced the newly created OST-object with the
old orphan OST-object before the client write happens, which causes
the subsequent check to fail.

To resolve this, hold the layout LFSCK when it is about to replace
the newly created OST-object until the client write has happened.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia5be2cd2ea920fa02bb281ca5c59d8d252f3fd7b
Reviewed-on: http://review.whamcloud.com/18146
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8645 ptlrpc: update imp_known_replied_xid on resend-replay 76/22776/3
Niu Yawei [Wed, 28 Sep 2016 02:40:33 +0000 (22:40 -0400)]
LU-8645 ptlrpc: update imp_known_replied_xid on resend-replay

The imp_known_replied_xid should be updated when trying to resend
an already replied replay request, because the xid of this replay
request could be less than the current imp_known_replied_xid.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Iaafcb187efb24cc88dc96b30a63083fac83a9078
Reviewed-on: http://review.whamcloud.com/22776
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8569 lfsck: cleanup lfsck requests list before exit 23/22723/5
Fan Yong [Sat, 16 Jul 2016 21:42:03 +0000 (05:42 +0800)]
LU-8569 lfsck: cleanup lfsck requests list before exit

When the lfsck assistant thread hits a failure and exits during
the second-stage scanning, it does not clean up the request list
'lfsck_assistant_data::lad_req_list'. That may cause the lfsck main
engine to hit "LASSERT(list_empty(&lad->lad_req_list))" when it
handles the double scan.

This patch unifies the assistant thread exit process: before
cleaning up the list 'lfsck_assistant_data::lad_req_list', set the
thread as "stopping" to prevent more lfsck requests from being
added, then clean up the list.
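
The exit ordering described above can be sketched in user space as
follows. All names here are hypothetical, a plain singly linked list
stands in for lad_req_list, and the locking that the real code would
need is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified request and assistant state. */
struct req {
	struct req *next;
};

struct assistant {
	bool stopping;		/* set before the list is drained */
	struct req *head;	/* stand-in for lad_req_list */
};

/* Producers refuse to queue once the assistant is stopping. */
static bool add_req(struct assistant *a)
{
	struct req *r;

	if (a->stopping)
		return false;
	r = malloc(sizeof(*r));
	if (r == NULL)
		return false;
	r->next = a->head;
	a->head = r;
	return true;
}

/* Exit path: mark "stopping" first, then drain, so the list stays empty. */
static void assistant_exit(struct assistant *a)
{
	a->stopping = true;
	while (a->head != NULL) {
		struct req *r = a->head;

		a->head = r->next;
		free(r);
	}
}
```

Setting the flag before draining is what keeps a late producer from
re-populating the list after it has been emptied.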

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7facd20e2742c7ad5d4fbee1c975dacd8f6ea363
Reviewed-on: http://review.whamcloud.com/22723
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-3289 ssk: fix SK_IV_REV_START on 32-bit systems 89/23089/3
Andreas Dilger [Tue, 11 Oct 2016 17:43:07 +0000 (11:43 -0600)]
LU-3289 ssk: fix SK_IV_REV_START on 32-bit systems

Use a 64-bit constant for SK_IV_REV_START so that it doesn't
overflow on 32-bit systems.
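
A minimal sketch of the idea, assuming a start value that needs
more than 32 bits. The macro name and its value are hypothetical,
since the commit message does not show the real definition of
SK_IV_REV_START. On a typical 32-bit system "long" is 32 bits wide,
so such a value has to be written as a 64-bit constant.

```c
#include <stdint.h>

/*
 * Hypothetical example value. The ULL suffix makes the constant
 * "unsigned long long" (at least 64 bits) on every platform, so the
 * shift is evaluated in 64-bit arithmetic on 32-bit systems too.
 */
#define EXAMPLE_IV_REV_START	(1ULL << 63)

/* Return the start value as a fixed-width 64-bit integer. */
static uint64_t example_iv_start(void)
{
	return EXAMPLE_IV_REV_START;
}
```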

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie790f7ea438847b18e7b83689e3816ceffe2ddd7
Reviewed-on: http://review.whamcloud.com/23089
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-3289 gss: Fix issues with SK privacy and integrity mode 22/21922/8
Jeremy Filizetti [Tue, 9 Aug 2016 23:19:43 +0000 (19:19 -0400)]
LU-3289 gss: Fix issues with SK privacy and integrity mode

This patch has several fixes for skpi:

1. The original SK patches failed to account for out-of-order
handling of RPCs and bulk pages during encryption.  As a result
clients would be out of sync with the IV used for decryption.
This patch moves the encryption to a format similar to RFC3686
to handle these RPCs and bulk pages.

2. A header was added to the SK mode RPCs to allow versioning and
send the unencrypted IV used for an RPC.  The versioning will allow
for future protocol changes.

3. Several changes to fix or improve security of the implementation
based on a security review from Matthew Wood at Intel:
- Derive a unique key for integrity modes instead of using the
  shared secret key (ska, ski, and skpi modes).  This helps prevent
  replays.
- Use PBKDF2 instead of HMAC to derive keys for integrity and
  encryption.
- Have the server side pass a random value (like the client) and
  incorporate this value into the key binding information.
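
For reference, the counter block layout defined by RFC 3686 can be
sketched as below: a 4-byte nonce, an 8-byte per-message IV, and a
4-byte big-endian block counter that starts at 1. The function and
macro names are hypothetical, and this shows only the RFC layout;
the commit message says the SK format is "similar to" it and does
not give the exact Lustre layout.

```c
#include <stdint.h>
#include <string.h>

#define CTR_NONCE_LEN	4
#define CTR_IV_LEN	8
#define CTR_BLOCK_LEN	16

/* Build an RFC 3686 counter block: nonce | per-message IV | counter. */
static void build_ctr_block(uint8_t out[CTR_BLOCK_LEN],
			    const uint8_t nonce[CTR_NONCE_LEN],
			    const uint8_t iv[CTR_IV_LEN],
			    uint32_t block_counter)
{
	memcpy(out, nonce, CTR_NONCE_LEN);
	memcpy(out + CTR_NONCE_LEN, iv, CTR_IV_LEN);
	/* The block counter is stored big-endian in the last 4 bytes. */
	out[12] = (uint8_t)(block_counter >> 24);
	out[13] = (uint8_t)(block_counter >> 16);
	out[14] = (uint8_t)(block_counter >> 8);
	out[15] = (uint8_t)block_counter;
}
```

Because the IV travels with each message, a receiver can rebuild
the counter block for any RPC or bulk page regardless of the order
in which they arrive.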

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: I247187ecbd8cb23c602cec6a92eca938f135e564
Reviewed-on: http://review.whamcloud.com/21922
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8429 gnilnd: Option to not reconnect after conn timeout 59/21459/4
Chuck Fossen [Fri, 15 Apr 2016 13:42:27 +0000 (13:42 +0000)]
LU-8429 gnilnd: Option to not reconnect after conn timeout

When routers time out a client connection during a catastrophic
network disturbance like a cabinet EPO, there may still be traffic
from the file system that uses the router for the return path to
the client. That traffic triggers new connection attempts before
the network has quiesced. The attempts fail and must be put in
purgatory, since they could still connect in the future, and this
can consume the GART space with registrations.

To avoid this, add a module parameter, to_reconn_disable, which
when set changes the state of a peer that has timed out to
PEER_TIMED_OUT. This state acts just like PEER_DOWN, so no traffic
is attempted to a peer in this state.

When the network recovers, the client will form a new connection and
the state will change back to PEER_UP.
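
The state handling described above can be sketched as follows. The
struct and function names are hypothetical and the enum is a
simplified stand-in for the GNILND_PEER_* values; only the three
state names and the to_reconn_disable parameter come from the commit
message.

```c
/* Simplified peer states; names follow the commit message. */
enum peer_state {
	PEER_UP,
	PEER_DOWN,
	PEER_TIMED_OUT,
};

struct peer {
	enum peer_state state;
};

/* With to_reconn_disable set, a timed-out peer is parked. */
static void on_conn_timeout(struct peer *p, int to_reconn_disable)
{
	if (to_reconn_disable)
		p->state = PEER_TIMED_OUT;
}

/* PEER_TIMED_OUT is treated like PEER_DOWN: no traffic is attempted. */
static int may_send(const struct peer *p)
{
	return p->state == PEER_UP;
}

/* A new connection formed by the client returns the peer to PEER_UP. */
static void on_client_connect(struct peer *p)
{
	p->state = PEER_UP;
}
```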

Changed gnp_down to gnp_state and GNILND_RCA_NODE_* to GNILND_PEER_*.

To add this option to routers, update /etc/modprobe.conf.local with:
options kgnilnd to_reconn_disable=1

To dynamically add this parameter to a booted node:
echo 1 > /sys/module/kgnilnd/parameters/to_reconn_disable

Tested functionality with both timing out a connection and bringing
down nodes to check the proper states are entered.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I19cebab401208133d94e29c603eb340f77354684
Reviewed-on: http://review.whamcloud.com/21459
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chuck Fossen <chuckf@cray.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6387 tests: fix lp_utils build issues on Power8 41/22941/2
James Simmons [Tue, 4 Oct 2016 23:07:42 +0000 (19:07 -0400)]
LU-6387 tests: fix lp_utils build issues on Power8

With the latest RHEL7 image for Power8, some compile
issues have surfaced. Currently the issue is with the
begin() and end() inline functions in the lp_utils
code. The solution appears to be to not make the
functions inline.

Change-Id: I8f11ed488890407c117f13ebc741d7140702c647
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22941
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8575 lod: clear ost usable flag to avoid striping. 67/22367/3
Jadhav Vikram [Fri, 5 Feb 2016 04:07:07 +0000 (09:37 +0530)]
LU-8575 lod: clear ost usable flag to avoid striping.

Clear ost->ltd_qos.ltq_usable before checking the
OBD_FAIL_MDS_OSC_PRECREATE param and some other flags in
lod_alloc_qos() while finding all OSTs that are valid stripe
candidates. A call to lod_alloc_qos() made before this flag/param
is set marks ost->ltd_qos.ltq_usable as 1 for every valid OST, but
after the flag/param is set ltq_usable is not cleared. It needs to
be cleared so that the OST is not added to the stripe array.

Also fix a problem in test_27u where the log file is created in the
Lustre mount directory instead of /tmp. This makes sure the TLOG
file is not created on ost0.
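
A minimal sketch of the "clear first, then check" pattern, assuming
a simplified OST descriptor. The struct, its fields other than the
usable flag, and the function name are hypothetical and do not
reproduce lod_alloc_qos().

```c
#include <stddef.h>

/* Hypothetical, simplified OST descriptor. */
struct ost {
	int active;		/* OST is otherwise a valid candidate */
	int fail_injected;	/* stand-in for the fail-loc style checks */
	int usable;		/* stand-in for ltd_qos.ltq_usable */
};

/* Mark stripe candidates and return how many OSTs are usable. */
static size_t mark_usable(struct ost *osts, size_t n)
{
	size_t good = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Clear first so a value left by an earlier call cannot leak. */
		osts[i].usable = 0;
		if (!osts[i].active || osts[i].fail_injected)
			continue;
		osts[i].usable = 1;
		good++;
	}
	return good;
}
```

Without the reset at the top of the loop, an OST skipped by one of
the checks would keep the usable value set by a previous call.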

Seagate-bug-id: MRP-2847

Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Change-Id: I6dd16c44b331b09a0ffee530b3eb8508bde64294
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/9757
Tested-by: Jenkins
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: http://review.whamcloud.com/22367
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7903 ptlrpc: leaked rs on difficult reply 96/22696/4
Niu Yawei [Fri, 23 Sep 2016 04:06:25 +0000 (00:06 -0400)]
LU-7903 ptlrpc: leaked rs on difficult reply

reply_out_callback() should call ptlrpc_schedule_difficult_reply()
to finalize the rs if it is no longer on the uncommitted list;
otherwise, the rs and the export held by the rs could be leaked:

- target_send_reply() sends a difficult reply before the transaction
  is committed, and the reply is linked to scp_rep_active;

- the export gets disconnected by umount or for some other reason,
  server_disconnect_export() is called to complete all outstanding
  replies, which calls into ptlrpc_handle_rs() to dispose of
  the rs, so the rs is removed from the uncommitted list and
  LNetMDUnlink() is called to unlink the reply buffer and generate
  an unlink event;

- reply_out_callback() is called to process the above unlink event,
  and ptlrpc_schedule_difficult_reply() is supposed to be called to
  finally dispose of the rs. However, it could be skipped because of
  the following flawed code snippet:

  if (!rs->rs_no_ack ||
      rs->rs_transno <= rs->rs_export->exp_obd->obd_last_committed)
          ptlrpc_schedule_difficult_reply(rs);

  The intention of the above code is: if rs_no_ack is true (COS
  enabled) and the transaction is not committed, we should rely on
  the commit callback to release the rs. However, it overlooks the
  situation where the rs has already been removed from the
  uncommitted list by disconnecting the export.
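
The commit message does not show the corrected condition. One
plausible shape, sketched here with a hypothetical flattened struct
and a boolean standing in for "rs is still on the uncommitted list",
is to add a third clause that schedules the reply when the rs has
already been taken off that list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened view of the reply state fields involved. */
struct reply_state_stub {
	bool no_ack;			/* rs_no_ack: COS enabled */
	uint64_t transno;		/* rs_transno */
	uint64_t last_committed;	/* obd_last_committed */
	bool on_uncommitted_list;	/* assumption: list membership */
};

/* Decide whether the unlink event must schedule the difficult reply. */
static bool should_schedule(const struct reply_state_stub *rs)
{
	return !rs->no_ack ||
	       rs->transno <= rs->last_committed ||
	       !rs->on_uncommitted_list;
}
```

With the extra clause, an rs removed by an export disconnect is
still finalized even though its transaction never committed.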

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I76d1a9e08c94340520fb731e8671b7b9205e0eb1
Reviewed-on: http://review.whamcloud.com/22696
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8592 mdt: hold mdt_device::mdt_md_root until service stop 38/22438/10
Fan Yong [Mon, 18 Jul 2016 04:39:47 +0000 (12:39 +0800)]
LU-8592 mdt: hold mdt_device::mdt_md_root until service stop

Otherwise, if someone is still using the object, it may trigger the
object reference ASSERTION(atomic_read(&o->lo_header->loh_ref) > 0).

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ifd7adbce9a68da537f592c64117f4ecafb0a9ec4
Reviewed-on: http://review.whamcloud.com/22438
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8668 tests: print more information about dd failure 90/22990/2
Andreas Dilger [Thu, 6 Oct 2016 20:10:42 +0000 (14:10 -0600)]
LU-8668 tests: print more information about dd failure

Don't drop the dd output so that we can see why this is failing.

Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn
Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3364bca75817ee4d0128079744e4b01a3ac938a0
Reviewed-on: http://review.whamcloud.com/22990
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-3289 gss: Change the handling of keys for SK 26/22626/3
Jeremy Filizetti [Tue, 6 Sep 2016 01:49:33 +0000 (21:49 -0400)]
LU-3289 gss: Change the handling of keys for SK

Servers were automatically loading keys of the client type to allow
server-to-server communication to work by only passing a path
to the --skpath option of mount.lustre.  However, this has multiple
issues due to ordering with multiple keys and can be unpredictable.
Instead, keys that will be used for server-to-server communication
must be loaded manually or by a pre-mount script using lgss_sk
and specifying the client type.

In addition, clients should only load a single key with --skpath, so
a check is added to not allow directories on the client.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: I239753fa1a2bff19bed598e6d2a073e8567d1002
Reviewed-on: http://review.whamcloud.com/22626
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-8658 ptlrpc: Suppress error for flock requests 56/22856/2
Patrick Farrell [Fri, 30 Sep 2016 19:12:53 +0000 (14:12 -0500)]
LU-8658 ptlrpc: Suppress error for flock requests

-EAGAIN is a normal return when requesting POSIX flocks.
We can't recognize exactly that case here, but it's the
only case that should result in -EAGAIN on LDLM_ENQUEUE, so
don't print to console in that case.
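
The filtering rule above can be sketched as follows. The enum values
and the function name are hypothetical and are not the real Lustre
opcode numbers; the sketch only shows the decision, not where the
message is printed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical opcode values, for illustration only. */
enum example_opc {
	EX_LDLM_ENQUEUE = 1,
	EX_OTHER = 2,
};

/* Decide whether a failed request deserves a console message. */
static bool should_log_to_console(enum example_opc opc, int rc)
{
	if (rc == 0)
		return false;
	/* -EAGAIN on an enqueue is the normal flock "would block" case. */
	if (opc == EX_LDLM_ENQUEUE && rc == -EAGAIN)
		return false;
	return true;
}
```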

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Idfbaf671023ac2c3dc84ddd62d2e547427b1f50b
Reviewed-on: http://review.whamcloud.com/22856
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>