Whamcloud - gitweb
fs/lustre-release.git
2 years agoLU-9404 mdt: set HSM xattr only when needed 67/26867/2
John L. Hammond [Thu, 27 Apr 2017 15:39:11 +0000 (10:39 -0500)]
LU-9404 mdt: set HSM xattr only when needed

In mdt_hsm_add_hal() avoid setting the HSM xattr when the HSM
attributes have not changed.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I12570034127b9928e49ea329bf77b674aaa6ade8
Reviewed-on: https://review.whamcloud.com/26867
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
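
The LU-9404 change above amounts to a compare-before-write guard. The
following is a minimal sketch of that idea as a user-space Python model,
not the actual mdt_hsm_add_hal() code. HsmAttrs, XattrStore and
update_hsm_xattr are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HsmAttrs:
    """Hypothetical stand-in for the HSM attributes stored in the xattr."""
    flags: int
    archive_id: int


class XattrStore:
    """Toy xattr backend that counts writes so skipped writes are visible."""

    def __init__(self, attrs: HsmAttrs):
        self.attrs = attrs
        self.writes = 0

    def set(self, attrs: HsmAttrs) -> None:
        self.attrs = attrs
        self.writes += 1


def update_hsm_xattr(store: XattrStore, new: HsmAttrs) -> bool:
    """Write the xattr only if the attributes differ; return True if written."""
    if store.attrs == new:
        return False  # nothing changed, so skip the write
    store.set(new)
    return True
```

Calling update_hsm_xattr() twice with the same attributes performs at
most one write in this model.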
2 years agoLU-9403 mdt: prevent HSM leak on re-archive 66/26866/2
John L. Hammond [Thu, 27 Apr 2017 15:23:26 +0000 (10:23 -0500)]
LU-9403 mdt: prevent HSM leak on re-archive

In mdt_hsm_is_action_compat() if the file to be archived already
exists in some backend archive then ensure that the re-archive
uses the same backend archive.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ifc0ef03264a20557c31df7add9e34a1dc1f0c814
Reviewed-on: https://review.whamcloud.com/26866
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8364 ldiskfs: fixes for failover mode 54/26854/4
Yang Sheng [Thu, 27 Apr 2017 06:48:22 +0000 (14:48 +0800)]
LU-8364 ldiskfs: fixes for failover mode

Port the failover mode patches to other distros and
fix the failure path in the replay patch.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I51f5ca0b906a3cbd7554fabb8b447cda4096c781
Reviewed-on: https://review.whamcloud.com/26854
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9411 tests: skip llapi_layout_test 30, 31 for interop 53/26853/4
Andreas Dilger [Thu, 27 Apr 2017 04:07:41 +0000 (22:07 -0600)]
LU-9411 tests: skip llapi_layout_test 30, 31 for interop

test30 and test31 should be skipped when running against a pre-PFL
MDS, as the PFL layout type is not supported there.

Change the test29 skip version to 2.8.55 since the patch for
this test didn't actually land until that version.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib80bd8a70386b3e8881f8ca3d417a8be18acab07
Reviewed-on: https://review.whamcloud.com/26853
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9312 hsm: add a cookie indexed request hash 63/26763/3
John L. Hammond [Wed, 17 Jun 2015 22:42:36 +0000 (15:42 -0700)]
LU-9312 hsm: add a cookie indexed request hash

Replace linear scans of the HSM coordinator's cdt_requests list with
lookups into a cookie indexed hash (cdt_request_cookie_hash). Rename
cdt_requests to cdt_request_list. Remove the unused function
mdt_hsm_get_running().

Change-Id: I97309aeeb0e02a07e8ddac9f1667989c65f01b8b
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: https://review.whamcloud.com/26763
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
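
The idea behind the LU-9312 hash change above can be sketched as a list
plus an index keyed by cookie. This is a Python model under stated
assumptions, not the coordinator code: RequestTable and its methods are
hypothetical, and a dict stands in for cdt_request_cookie_hash.

```python
class RequestTable:
    """Keeps requests in arrival order and indexes them by cookie."""

    def __init__(self):
        self.request_list = []  # models cdt_request_list (ordered)
        self.by_cookie = {}     # models the cookie indexed hash

    def add(self, cookie, request):
        if cookie in self.by_cookie:
            raise KeyError(f"duplicate cookie {cookie!r}")
        self.request_list.append(request)
        self.by_cookie[cookie] = request

    def find(self, cookie):
        """Hash lookup instead of scanning request_list."""
        return self.by_cookie.get(cookie)

    def remove(self, cookie):
        request = self.by_cookie.pop(cookie, None)
        if request is not None:
            # A kernel linked list unlinks in O(1); a Python list does not,
            # so this line is only a modelling shortcut.
            self.request_list.remove(request)
        return request
```

The list is kept alongside the hash because callers that walk every
request still need an ordered traversal.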
2 years agoLU-9312 hsm: fix error handling around mdt_hsm_get_md_hsm() 41/26741/3
John L. Hammond [Wed, 19 Apr 2017 15:42:20 +0000 (10:42 -0500)]
LU-9312 hsm: fix error handling around mdt_hsm_get_md_hsm()

Correct several spurious NULL return checks from
mdt_hsm_get_md_hsm().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Icfe74e87183bc5356d4c7627088b402805dcc164
Reviewed-on: https://review.whamcloud.com/26741
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 lnet: selftest MR fix 23/26723/2
Amir Shehata [Mon, 30 Jan 2017 23:10:32 +0000 (15:10 -0800)]
LU-9119 lnet: selftest MR fix

selftest always responded to the primary NID of the peer rather than
to the source NID of the message, as it should.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I14a4b6ffc5882cb23298429d8a4bd0bcb0a8a5be
Reviewed-on: https://review.whamcloud.com/26723
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9394 osd: __osd_obj2dnode() to return negative errors 93/26893/3
Alex Zhuravlev [Fri, 28 Apr 2017 22:19:26 +0000 (01:19 +0300)]
LU-9394 osd: __osd_obj2dnode() to return negative errors

DMU/ZFS returns positive error values, while Lustre uses negative ones.

Change-Id: I3615fac1616d6647897c68ef94f298b356e508d1
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/26893
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
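
The sign convention behind LU-9394 above can be illustrated with a
one-line conversion. This is a sketch only: to_lustre_rc is a
hypothetical helper, and it assumes the DMU side returns 0 on success or
a positive errno on failure.

```python
import errno


def to_lustre_rc(dmu_rc: int) -> int:
    """Map a DMU-style result (0 or positive errno) to 0 or negative errno."""
    if dmu_rc < 0:
        raise ValueError("expected 0 or a positive errno value")
    return -dmu_rc


# Example: a positive ENOENT from the DMU becomes -ENOENT for Lustre callers.
assert to_lustre_rc(errno.ENOENT) == -errno.ENOENT
```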
2 years agoLU-9040 scrub: detect dead loop scanning 51/26751/10
Fan Yong [Tue, 2 May 2017 20:10:33 +0000 (04:10 +0800)]
LU-9040 scrub: detect dead loop scanning

It is found that the OI scrub may fall into dead loop scanning
for some unknown reason. This patch checks the scanning cursor
to make sure it will not scan the same object repeatedly.

It also fixes a logic error in 'noscrub' handling that may
cause the OI scrub to loop endlessly when it resumes from a
partial scan that was interrupted by a crash.

Test-Parameters: mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia1f63e8a2d675e9fa4567fa329905ac769b83a74
Reviewed-on: https://review.whamcloud.com/26751
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
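
The cursor check described for LU-9040 above can be sketched as a scan
loop that refuses to continue unless the cursor moves forward. This is a
Python model, not the OI scrub code. scan, next_position and
ScanLoopError are hypothetical names.

```python
class ScanLoopError(RuntimeError):
    """Raised when the scan cursor fails to advance."""


def scan(next_position, start, end):
    """Visit positions from start up to (not including) end.

    next_position(pos) returns the cursor to visit after pos. If it does
    not move strictly forward, the scan would revisit the same object
    forever, so raise instead of looping.
    """
    visited = []
    pos = start
    while pos < end:
        visited.append(pos)
        new_pos = next_position(pos)
        if new_pos <= pos:
            raise ScanLoopError(f"cursor stuck at {pos}")
        pos = new_pos
    return visited
```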
2 years agoLU-8998 llapi: add LLAPI_LAYOUT_COMP_USE_PREV 84/26484/4
Andreas Dilger [Mon, 10 Apr 2017 21:22:06 +0000 (15:22 -0600)]
LU-8998 llapi: add LLAPI_LAYOUT_COMP_USE_PREV

Add LLAPI_LAYOUT_COMP_USE_PREV to be able to iterate through
components in reverse order.

Add a test case to llapi_layout_test.c to exercise COMP_USE_LAST
and COMP_USE_PREV options.

Improve description of component ID in llapi_layout_comp_id_get.3
to indicate that the component ID does not imply ordering or other
semantics, and is just a numeric identifier for each component.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I21f78e575c2429ef927c8c2fc50bf150f59cab07
Reviewed-on: https://review.whamcloud.com/26484
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9201 test: avoid long sleeps in mount_facet() 21/26021/5
Andreas Dilger [Mon, 13 Mar 2017 20:53:12 +0000 (14:53 -0600)]
LU-9201 test: avoid long sleeps in mount_facet()

Reduce the long sleep during mount since this was fixed via
patch https://review.whamcloud.com/24845 for LU-7481.

This reduces one llmount.sh time from 62s to 37s (2 MDTs, 4 OSTs),
and removes about 800s from each conf-sanity run (201x 4s sleeps
due to "Commit the device label" in a recent test log).

Test-Parameters: trivial testlist=conf-sanity,conf-sanity,conf-sanity
Test-Parameters: trivial testlist=conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib51858a00b935c4f7e473cead117e7d59c3ebbe5
Reviewed-on: https://review.whamcloud.com/26021
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9205 tests: fix failures in CLIENTONLY mode 51/25951/9
Dmitry Eremin [Mon, 13 Mar 2017 13:15:20 +0000 (16:15 +0300)]
LU-9205 tests: fix failures in CLIENTONLY mode

Turn off sanity tests 27F 130 160e 255 311 313 399 407 in
CLIENTONLY mode because they require remote access to
MDS/OSS nodes.

Test-Parameters: trivial
Change-Id: Id1b79c614200c0d06c208a1c8f04ee13b10165ce
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25951
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years agoLU-8367 osp: orphan cleanup do not wait for reserved 26/25926/25
Alex Zhuravlev [Fri, 10 Mar 2017 11:37:25 +0000 (14:37 +0300)]
LU-8367 osp: orphan cleanup do not wait for reserved

A thread holding an object reserved on some OST may block
another thread trying to recover that OST. A set of threads
like these may lead to a livelock and cascading timeouts.

Change-Id: Ic14741759d30f9453b0fe28a91a878795a84ef39
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/25926
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9285 osp: revert patches LU-8367 and LU-8973 25/25925/8
Alex Zhuravlev [Fri, 10 Mar 2017 08:50:21 +0000 (11:50 +0300)]
LU-9285 osp: revert patches LU-8367 and LU-8973

another solution will be proposed.

Revert "LU-8972 osp: skip subsequent orphan cleanups"

This reverts commit 6f56f71b407a8c14db4c2accd37da5b4feecde1a.

Revert "LU-8367 osp: do not block orphan cleanup"

This reverts commit 2ce0d5b0640e3e440822080e407eee1ce1cafd75.

Change-Id: I4fb215d4dcdbe0edac0c25998b7deebf02a427c0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/25925
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9183 llite: handle removal the pos argument of generic_write_sync 26/25826/11
Dmitry Eremin [Fri, 3 Mar 2017 18:31:36 +0000 (21:31 +0300)]
LU-9183 llite: handle removal the pos argument of generic_write_sync

In commit e259221763a40403d5bb232209998e8c45804ab8 the pos argument
of generic_write_sync() was removed.

Change-Id: Iad76c517e372d7dc5e12670b5a0b8106005b71ff
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25826
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9183 llite: handle make the string hashes salt the hash 19/25819/14
Dmitry Eremin [Thu, 2 Mar 2017 19:57:15 +0000 (22:57 +0300)]
LU-9183 llite: handle make the string hashes salt the hash

In commit 8387ff2577eb9ed245df9a39947f66976c6bcd02 Linus Torvalds
made the string hashes salt the hash.

Hash users that don't have any particular initial salt can just use
the NULL pointer as a no-salt.

Change-Id: Id262d459370aa46c2e3d0e8b1e09dad74c717f03
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25819
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9125 test: Update setstripe options 75/25475/8
James Nunez [Wed, 15 Feb 2017 20:52:37 +0000 (13:52 -0700)]
LU-9125 test: Update setstripe options

Some flags for 'lfs setstripe' will be deprecated in
tag 2.9.59; '--count' is replaced by '--stripe-count' or '-c'.

replay-single test 68 will silently fail due to this change.
We need to check that an error is reported if 'lfs setstripe'
fails and change the deprecated parameters used in replay-single.
The check_default_stripe_attr() routine in sanity.sh also needs
to be updated with the new setstripe options.

Test-Parameters: trivial testlist=sanity,replay-single

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ie5809c9268684675585d17cd1c402ec3fb002239
Reviewed-on: https://review.whamcloud.com/25475
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
2 years agoLU-4423 ptlrpc: use 64-bit times for request times 77/24977/14
Arnd Bergmann [Mon, 1 May 2017 18:07:59 +0000 (14:07 -0400)]
LU-4423 ptlrpc: use 64-bit times for request times

All request timestamps and deadlines in lustre are recorded in time_t
and timeval units, which overflow in 2038 on 32-bit systems.

In this patch, I'm converting them to time64_t and timespec64,
respectively. Unfortunately, this makes a relatively large patch,
but I could not find an obvious way to split it up some more without
breaking atomicity of the change.

Also unfortunately, this introduces two instances of div_u64_rem()
in the request path, which can be slow on 32-bit architectures. This
can probably be avoided by a larger restructuring of the code, but
it is unlikely that lustre is used in performance critical setups
on 32-bit architectures, so it seems better to optimize for correctness
rather than speed here.

Linux-commit: 219e6de627243c8dbc701eaafe1c30c481d1f82c

Change-Id: Iff3c2bdb50bbb34d27edd4402838f915c16530f4
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24977
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
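
The 2038 limit mentioned in LU-4423 above follows from the largest value
of a signed 32-bit counter of seconds. The Python sketch below works out
that date and models a div_u64_rem()-style split with divmod(). The
nanosecond divisor is an assumption for illustration and is not taken
from the patch.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit time_t


def last_32bit_time() -> datetime:
    """Last UTC instant representable in a signed 32-bit time_t."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def split_ns(total_ns: int) -> tuple:
    """Split nanoseconds into (seconds, leftover ns): quotient and remainder."""
    return divmod(total_ns, 1_000_000_000)


print(last_32bit_time().isoformat())  # 2038-01-19T03:14:07+00:00
```

One second later a 32-bit counter wraps, while a 64-bit time64_t keeps
counting.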
2 years agoLU-4017 quota: cleanup to improve quota codes 77/26577/8
Wang Shilong [Thu, 13 Apr 2017 02:45:10 +0000 (22:45 -0400)]
LU-4017 quota: cleanup to improve quota codes

Add man page updates for project quota, and clean up
some style and other minor problems.

Change-Id: I3ee3e866dd0300a1b07e0f5319dfd695c0bafba0
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26577
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8376 ost: enhance end to end bulk cksum error report 60/23960/20
Bruno Faccini [Fri, 25 Nov 2016 14:57:20 +0000 (15:57 +0100)]
LU-8376 ost: enhance end to end bulk cksum error report

Some sites have experienced spurious checksum errors upon bulk
xfers where it is very difficult to determine the source of the
corruption.
With this patch, upon cksum error, full dump of all pages in a
bulk xfer is now possible (enabled via a /proc tunable) on both
Client and OSS sides, to allow easier root cause identification.

sanity.sh/test_77[b,d,f,g]() existing sub-tests results can already
be used to show the effects of this patch, by injecting bulk cksum
error/corruption using OBD_FAIL_[OSC,OST]_CHECKSUM_[SEND,RECEIVE]
fail codes.

sanity.sh/test_77c has been created to specifically test the new
dump on cksum error functionality.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I0d200bb6d5c41c55a66ac012fd9cbd8d702d2f3a
Reviewed-on: https://review.whamcloud.com/23960
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8650 mdt: enable REP-ACK for DNE 07/22807/4
Lai Siyao [Thu, 29 Sep 2016 12:48:47 +0000 (20:48 +0800)]
LU-8650 mdt: enable REP-ACK for DNE

LU-7903 reveals that REP-ACK is disabled in 2.8. This was
introduced by LU-3538 (http://review.whamcloud.com/12530),
which added support for DNE Commit-on-Sharing but disabled
REP-ACK. Because Commit-on-Sharing does not take effect for
local operations (operations that involve only one MDT)
either, single MDT recovery may fail.

To fix this, we need to enable REP-ACK, and also make sure
http://review.whamcloud.com/12530 works as designed:
1. save local locks upon unlock as before, but don't convert
   locks into COS mode.
2. reply_out_callback() wakes up ptlrpc_handle_rs(), if
   reply was committed, decref locks like before.
3. otherwise for uncommitted reply convert locks to COS mode,
   and later when it's committed, ptlrpc_commit_replies()
   wakes up ptlrpc_handle_rs() again, which will decref locks
   like before.

In short, local locks will be converted into COS mode upon
REP-ACK, and transaction commit will decref locks.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Id21681017573b50e071dd8b5a4d65489843781a1
Reviewed-on: https://review.whamcloud.com/22807
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8307 ldlm: cond_resched in ldlm_bl_thread_main 88/20888/2
Patrick Farrell [Mon, 20 Jun 2016 21:15:51 +0000 (16:15 -0500)]
LU-8307 ldlm: cond_resched in ldlm_bl_thread_main

When clearing all of the ldlm LRUs (as Cray does at the end of
a job), a ldlm_bl_work_item is generated for each namespace
and then they are placed on a list for the ldlm_bl threads to
iterate over.

If the number of namespaces greatly exceeds the number of
ldlm_bl threads, a given thread will iterate over many
namespaces without sleeping looking for work.  This can go
on for an extremely long time and result in an RCU stall.

This patch adds a cond_resched() between completing one
work item and looking for the next.  This is a fairly cheap
operation, as it will only schedule if there is an
interrupt waiting, and it will not be called too much -
Even the largest file systems have < 100 namespaces per
ldlm_bl_thread currently.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic8022faf641ad6ab02462ab376a4bfd510dca14c
Reviewed-on: https://review.whamcloud.com/20888
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9430 utils: fix logic errors and putchar in sk_name2hmac() 20/26920/2
Chris Hanna [Tue, 2 May 2017 18:11:13 +0000 (14:11 -0400)]
LU-9430 utils: fix logic errors and putchar in sk_name2hmac()

In the sk_name2hmac function in lgss_sk.c, there are a couple minor
errors: bad usage of strcmp(), use of putchar() instead of assigning
a lowercased value, and use of a logical OR instead of AND.

These errors would prevent proper creation of shared keys in certain
circumstances.

Signed-off-by: Chris Hanna <hannac@iu.edu>
Change-Id: I16462f15201626f194e1b452acf3a1e63dbf0ed7
Reviewed-on: https://review.whamcloud.com/26920
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-7062 ldlm: GPF in _ldlm_lock_debug() 39/16139/8
Andriy Skulysh [Mon, 31 Aug 2015 08:39:34 +0000 (11:39 +0300)]
LU-7062 ldlm: GPF in _ldlm_lock_debug()

Lock's resource can change on a client.
Take a resource reference under spinlock
to print debug information.

Change-Id: Id8acb801ea549bf3c1ce1bcf6349db31578579f3
Xyratex-bug-id: MRP-2760
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: https://review.whamcloud.com/16139
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoNew tag 2.9.57 2.9.57 v2_9_57 v2_9_57_0
Oleg Drokin [Mon, 8 May 2017 03:44:40 +0000 (23:44 -0400)]
New tag 2.9.57

Change-Id: Idc4dec64104cfb538501a8ee50f101f10fd69ff4
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4017 quota: extend to test project quota 11/26411/16
Wang Shilong [Wed, 5 Apr 2017 07:52:17 +0000 (03:52 -0400)]
LU-4017 quota: extend to test project quota

Extend sanity-quota.sh to test project quota.
Also extend the llog_test module to test the new format
@llog_setattr64_rec_v2. The code should be able to
handle both @llog_setattr64_rec_v2 and @llog_setattr64_rec
well.

Change-Id: I4f22c1e994da10ffed64c08749ae749740ed9b46
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26411
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 lnet: rename LNET_MAX_INTERFACES 93/26693/4
Olaf Weber [Fri, 27 Jan 2017 15:14:50 +0000 (16:14 +0100)]
LU-9119 lnet: rename LNET_MAX_INTERFACES

LNET_MAX_INTERFACES is the number of interfaces supported by
interface bonding in the ksocknal LND. It shows up in LNet
because a number of data structures are shared between LNDs.

Rename it to LNET_NUM_INTERFACES to reduce confusion about
what it does.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ibc1d85a379d6616eb1db2fcb54aaffc835ffa9f4
Reviewed-on: https://review.whamcloud.com/26693
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 lnet: loopback NID in lnet_select_pathway() 92/26692/4
Olaf Weber [Fri, 27 Jan 2017 15:14:34 +0000 (16:14 +0100)]
LU-9119 lnet: loopback NID in lnet_select_pathway()

In lnet_select_pathway() sending to the loopback NID is handled
as a special case, because there are no credits involved. (The
loopback NID doesn't use credits, and therefore does not have
any credits. If a message goes through the credit-managing code
it therefore ends up waiting indefinitely for credits to become
available.)

The check whether we're sending over the loopback NID must be
done after we've completed choosing the NI to send over. In its
present location it only handles the case where the loopback
NID was explicitly passed in as the source NID.

(Lustre does not exercise this code path during normal operation,
the bug was encountered while testing code for the peer discovery
feature.)

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ifa25abf508214ae363a2f1bb04ffeab1891a2564
Reviewed-on: https://review.whamcloud.com/26692
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 socklnd: propagate errors on send failure 91/26691/4
Olaf Weber [Fri, 27 Jan 2017 15:13:53 +0000 (16:13 +0100)]
LU-9119 socklnd: propagate errors on send failure

When an attempt to send a message fails, for example because no
connection could be established with the remote address, socklnd
drops the message. For a PUT or REPLY message with non-zero
payload, ksocknal_tx_done() calls lnet_finalize() with -EIO
as the error code. But for an ACK or GET message there is no
payload, and lnet_finalize() is called with 0 (no error) as the
error code. This leaves upper layers to rely on other means to
determine that sending the message did actually fail, and that
(for example) no REPLY will ever answer a failed GET.

Add an error code parameter to ksocknal_tx_done().

In ksocknal_txlist_done() change the 0/1 'error' indicator to be
an actual error code that is passed on to ksocknal_tx_done().
Update the callers of ksocknal_txlist_done() to pass in the error
code if they have encountered an error.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I66b897a31e537e70dcc2622ffdfcc6e96fa93193
Reviewed-on: https://review.whamcloud.com/26691
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
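
The socklnd problem described in the entry above can be modelled in a
few lines. This Python sketch rests on assumptions: it infers the old
result only from whether payload remained unsent, and the function names
and the unsent_payload parameter are hypothetical rather than taken from
ksocknal_tx_done().

```python
import errno


def finalize_rc_old(unsent_payload: int) -> int:
    """Old model: failure is inferred only from leftover payload bytes."""
    return 0 if unsent_payload == 0 else -errno.EIO


def finalize_rc_new(unsent_payload: int, error: int) -> int:
    """New model: an explicit error code from the caller takes precedence."""
    if error != 0:
        return error
    return finalize_rc_old(unsent_payload)
```

In this model a failed GET or ACK, which carries no payload, reports
success under the old rule and reports the caller's error code under the
new one.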
2 years agoLU-9119 lnet: fix race in lnet shutdown path 90/26690/4
Olaf Weber [Fri, 27 Jan 2017 15:13:29 +0000 (16:13 +0100)]
LU-9119 lnet: fix race in lnet shutdown path

The locking changes for the lnet_net_lock made for Multi-Rail
introduce a race in the LNet shutdown path. The code keeps two
states in the_lnet.ln_shutdown: 0 means LNet is either up and
running or shut down, while 1 means lnet is shutting down. In
lnet_select_pathway() if we need to restart and drop and relock
the lnet_net_lock we can find that LNet went from running to
stopped, and not be able to tell the difference.

Replace ln_shutdown with a three-state ln_state patterned on
ln_rc_state: states are LNET_STATE_SHUTDOWN, LNET_STATE_RUNNING,
and LNET_STATE_STOPPING. Most checks against ln_shutdown now test
ln_state against LNET_STATE_RUNNING. LNet moves to RUNNING state
in lnet_startup_lndnets().

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I7afcbeb793dfa4d0a361e421ae06a99b7d4db903
Reviewed-on: https://review.whamcloud.com/26690
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
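
The ambiguity that the ln_state change above removes can be shown with a
small model. This is a Python sketch, not LNet code: the enum mirrors the
three state names from the commit message, while old_ln_shutdown and
may_proceed are hypothetical helpers.

```python
from enum import Enum


class LnetState(Enum):
    SHUTDOWN = "LNET_STATE_SHUTDOWN"
    RUNNING = "LNET_STATE_RUNNING"
    STOPPING = "LNET_STATE_STOPPING"


def old_ln_shutdown(state: LnetState) -> int:
    """Old two-valued flag: 1 only while shutting down, 0 otherwise."""
    return 1 if state is LnetState.STOPPING else 0


def may_proceed(state: LnetState) -> bool:
    """New-style check: only a running LNet may continue."""
    return state is LnetState.RUNNING
```

With the old flag, RUNNING and SHUTDOWN both read as 0, so a thread that
dropped and retook the lock cannot tell them apart. Testing against
RUNNING distinguishes all three states.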
2 years agoLU-9183 llite: handle flags as argument for inode_operations->rename 27/25827/11
Dmitry Eremin [Fri, 3 Mar 2017 18:27:36 +0000 (21:27 +0300)]
LU-9183 llite: handle flags as argument for inode_operations->rename

In Linux kernel v3.14 the inode_operations->rename() needs flags in
arguments.

Change-Id: I5028357d1d459b83ff0b1df0abeaadf78c5d05da
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25827
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8879 tests: speed up copytool_cleanup() in sanity-hsm 25/24025/7
Quentin Bouget [Tue, 29 Nov 2016 15:39:26 +0000 (16:39 +0100)]
LU-8879 tests: speed up copytool_cleanup() in sanity-hsm

This patch implements the following improvements:

 - The coordinator now wakes up when hsm_control is set to 'shutdown'

 - The wait_copytools() function in sanity-hsm uses a polling
   mechanism to detect when all running copytools are killed.
   It used to sleep before the first check, even though that check
   would pass most of the time. This has been fixed.

 - wait_copytools() used to sleep for 2s between its checks. It now
   sleeps for 0.1s, 0.2s, 0.4s, 0.8s, 1.6s, 3.2s, 3.2s, 3.2s, ...
   until it times out.

Considering how often the wait_copytools() function is called in
sanity-hsm, this patch should represent a noticeable speed-up.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ia460df59a724caaa194565dd7af402c8c617f40e
Reviewed-on: https://review.whamcloud.com/24025
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jean-Baptiste Riaux <riaux.jb@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
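
The polling schedule described for wait_copytools() above (check first,
then sleep 0.1s, 0.2s, 0.4s, ... capped at 3.2s) can be sketched as
follows. The real function is a shell function in the test framework.
This is a Python model, and backoff_delays and wait_until are
hypothetical names.

```python
import time


def backoff_delays(first=0.1, cap=3.2):
    """Yield first, 2*first, 4*first, ... and then cap forever."""
    delay = first
    while True:
        yield delay
        delay = min(delay * 2, cap)


def wait_until(check, timeout, sleep=time.sleep):
    """Poll check() right away, then after each growing sleep.

    Returns True as soon as check() passes, or False once at least
    timeout seconds have been slept without success.
    """
    waited = 0.0
    for delay in backoff_delays():
        if check():
            return True
        if waited >= timeout:
            return False
        sleep(delay)
        waited += delay
```

Because the first check happens before any sleep, the common case where
the copytools are already gone costs no waiting at all.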
2 years agoLU-8882 osd: use bydnode methods to access DMU 35/24035/21
Alex Zhuravlev [Wed, 30 Nov 2016 20:07:54 +0000 (23:07 +0300)]
LU-8882 osd: use bydnode methods to access DMU

Newer ZFS allows accessing the DMU by dnode, which saves the
expensive dnode# to dnode_t mapping.

Change-Id: I469c2a72d18f170ebb96dd33c23bb6d8f037188a
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/24035
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8873 osd: use sa_handle_get_from_db() 04/24004/15
Alex Zhuravlev [Tue, 29 Nov 2016 16:48:59 +0000 (19:48 +0300)]
LU-8873 osd: use sa_handle_get_from_db()

Use sa_handle_get_from_db() instead of sa_handle_get() to
save an object->dnode lookup.

Change-Id: I2a23e36c3c98ecf4ec00ac590a32d2c14a867aa0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/24004
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8998 llapi: rename llapi_layout_comp_move -> *use 83/26483/3
Andreas Dilger [Thu, 30 Mar 2017 23:12:29 +0000 (17:12 -0600)]
LU-8998 llapi: rename llapi_layout_comp_move -> *use

Rename llapi_layout_comp_move() and llapi_layout_comp_move_at()
to llapi_layout_comp_use() and llapi_layout_comp_use_id(),
respectively.  This avoids confusion about what "move" and "at" in
the function names imply. The component itself is not actually
being moved, just a different layout component is being selected
for access or modification.  Using "_id" instead of "_at" also
makes it more clear what the difference is between these functions.

Rename LLAPI_LAYOUT_COMP_POS_{FIRST,NEXT,LAST} correspondingly to
LLAPI_LAYOUT_COMP_USE_{FIRST,NEXT,LAST} to match.

Split llapi_layout_comp_use_id.3 from llapi_layout_comp_use.3 since
they are mostly independent anyway.

Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I85926d4ec9774745bc49b0d178ed9b23ec2cab07
Reviewed-on: https://review.whamcloud.com/26483
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9306 kuc: initialize kkuc_groups at module init time 83/26883/4
John L. Hammond [Fri, 28 Apr 2017 15:12:01 +0000 (10:12 -0500)]
LU-9306 kuc: initialize kkuc_groups at module init time

Some kkuc functions use kkuc_groups[group].next == NULL to test for an
empty group list. This is incorrect if the group was previously added
to but is now empty. Remove all next == NULL tests and use module load
time initialization of the kkuc_groups array.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I0d6af7b8584f18d1dc03873993d6aac55b1677a9
Reviewed-on: https://review.whamcloud.com/26883
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
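
Why a next == NULL test misreports emptiness, as described for LU-9306
above, can be seen in a small model of a circular doubly linked list.
This Python sketch imitates the kernel list idiom and is not the kkuc
code. All names are hypothetical.

```python
class ListHead:
    """Node of a circular doubly linked list; None models zeroed memory."""

    def __init__(self):
        self.next = None
        self.prev = None

    def init(self):
        """Make this an empty list: the head points at itself."""
        self.next = self.prev = self


def list_add(new, head):
    """Insert new right after head."""
    new.next, new.prev = head.next, head
    head.next.prev = new
    head.next = new


def list_del(entry):
    """Unlink entry from its list."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = None


def list_empty(head):
    """Correct emptiness test for an initialized head."""
    return head.next is head


def null_test_is_empty(head):
    """The flawed test: only true for a head that was never initialized."""
    return head.next is None
```

After one add and one delete the head points back at itself, so
list_empty() is true while the NULL test says the list is not empty.
Initializing every head once at load time makes list_empty() valid in
all cases.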
2 years agoLU-9340 lov: readahead shouldn't exceed component boundary 61/26861/3
Jinshan Xiong [Mon, 17 Apr 2017 19:29:51 +0000 (12:29 -0700)]
LU-9340 lov: readahead shouldn't exceed component boundary

Otherwise, it will extend the readahead RPC to the next component
while the actual lock of that component is not checked.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ice743d45f9df5e6fdc83b07aa6af1b182b660c9a
Reviewed-on: https://review.whamcloud.com/26677
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
(cherry picked from commit e31e234c06ac798501cdb7ec92269af83157cb21)
Reviewed-on: https://review.whamcloud.com/26861
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4017 quota: add project id support to lfs find 64/26464/12
Wang Shilong [Sun, 9 Apr 2017 02:22:24 +0000 (10:22 +0800)]
LU-4017 quota: add project id support to lfs find

Project ID support is added to 'lfs find', which is
needed for sanity-quota.sh etc. to collect project ID information
if a test fails.

Change-Id: I60cfc4f81bb8779db0a33a5c9bae7255e9d0100c
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26464
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4017 quota: add project inherit attributes 63/26463/10
Wang Shilong [Fri, 7 Apr 2017 01:09:18 +0000 (09:09 +0800)]
LU-4017 quota: add project inherit attributes

Add the @LUSTRE_PROJINHERIT_FL inode flag, which means that newly
created objects inherit the parent's projid. It is disabled
by default unless set explicitly.

It keeps the same interface as Ext4/XFS, so you can use
the following directly:

chattr  +P <directory>

The agent inode for DNE should ignore the project ID. The patch also
removes some unnecessary lustre_set_wire_obdo() calls for
attr fetching/updates, which mixed la flags and obdo flags.

Change-Id: I573b71c5bd7b0089172025c30c6824562444d57d
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26463
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-3308 mdc: allow setting readdir RPC size parameter 88/26088/10
Andreas Dilger [Sun, 19 Mar 2017 06:45:55 +0000 (00:45 -0600)]
LU-3308 mdc: allow setting readdir RPC size parameter

Allow the mdc.*.max_pages_per_rpc tunable to set the MDS bulk
readdir RPC size, rather than always using the default 1MB RPC
size.  The tunable is set in the MDC, as it should be, rather
than in the llite superblock, which requires extra code just to
get it up from the MDC's connect_data only to send it back down.
The RPC size could be tuned independently if different types of
MDSes are used (e.g. local vs. remote).

Remove the md_op_data.op_max_pages and ll_sb_info.ll_md_brw_pages
fields that previously were used to pass the readdir size from
llite to mdc_read_page().  Reorder some 32-bit fields in md_op_data
to avoid struct holes.

Remove lprocfs_obd_rd_max_pages_per_rpc() as it is no longer used.
Move osc_obd_max_pages_per_rpc_seq_write() to obdclass along with
lprocfs_obd_max_pages_per_rpc_seq_show().

Remove debug messages from fid_flatten*() since this clutters up
the debug logs, and is redundant with other debug messages.

Fix test_24v's calculation for the number of RPCs being sent, so
that it will be correct even if the readdir RPC size is modified.
Register cleanup trap to avoid leaving lots of files behind.

Merge "slow" sanity test_24D into test_24v since they both need to
have many files in a single directory, which avoids duplication.

Change calc_osc_stats() and calc_llite_stats() into calc_stats()
to be used in test_24v and other tests.  Similarly, clear_osc_stats()
and clear_llite_stats() are consolidated into clear_stats().

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ibd814cea3b788129be8aca2866d1bb139b3ebbe5
Reviewed-on: https://review.whamcloud.com/26088
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9069 tests: improve output of sanity test_255a 07/24907/7
Andreas Dilger [Mon, 16 Jan 2017 21:24:10 +0000 (14:24 -0700)]
LU-9069 tests: improve output of sanity test_255a

Improve output of sanity.sh test_255a to contain more information.
Clean up the performance measurements and calculations to make the
test easier to read.  The random_read_iops() helper routine might
be useful for other tests in the future as well.

The test does not (yet) work for ZFS, so it will skip the checks on
ZFS OSTs until "ladvise -a dontneed" is implemented for osd-zfs.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia177d87e41266b058a6863bbf36108ad71ef9a00
Reviewed-on: https://review.whamcloud.com/24907
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9420 lnd: Remove a bad check which slipped in 91/26891/4
Doug Oucharek [Fri, 28 Apr 2017 21:44:26 +0000 (14:44 -0700)]
LU-9420 lnd: Remove a bad check which slipped in

When the patch for LU-5718 landed, a check for message size was landed
that should not have been.  This check was part of a patch in LU-7650
which was later pulled because it broke things. LU-5718 picked up this
code via its many rebases (it took forever to land LU-5718 which is the
core problem here).

This patch removes that message size check.

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I3d114ec16cfbfd994efd9aee55e28a09159597be
Reviewed-on: https://review.whamcloud.com/26891
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8587 osp: Remove unused osp_max_pages_per_rpc 72/24572/6
Oleg Drokin [Wed, 26 Apr 2017 06:04:45 +0000 (02:04 -0400)]
LU-8587 osp: Remove unused osp_max_pages_per_rpc

It was never added to be seen from proc and in fact it is
completely unused anyway, so just remove it and the
osp_max_pages_per_rpc_seq_show function.

Change-Id: I4d0c582e4d48daff29795e95a7c8fa1f24340766
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/24572
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9336 utils: prevent key clobber and clarify lgss_sk usage 38/26838/2
Chris Hanna [Wed, 26 Apr 2017 13:43:55 +0000 (09:43 -0400)]
LU-9336 utils: prevent key clobber and clarify lgss_sk usage

Prevent lgss_sk from overwriting key value when modifying attributes.
Altered usage text to match, and clarified that the input source is
the key value, not a source of randomization for key generation.

Signed-off-by: Chris Hanna <hannac@iu.edu>
Change-Id: I87b9d59b65f3172b0425115441eaa1456489daeb
Reviewed-on: https://review.whamcloud.com/26838
Tested-by: Jenkins
Reviewed-by: Kit Westneat <kit.westneat@gmail.com>
Reviewed-by: Nathan Lavender <nblavend@iu.edu>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8103 tests: Skip sanity test_404 for interop mode 27/26827/3
Saurabh Tandan [Tue, 25 Apr 2017 23:18:55 +0000 (16:18 -0700)]
LU-8103 tests: Skip sanity test_404 for interop mode

Skip sanity test_404 if the server version is older than
2.8.53 for interop mode.
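
The kind of version gate described above can be sketched as follows.
This is an illustration only: the helper names are hypothetical, and the
real test is a shell script that uses its own version helpers.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.8.53' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def should_skip(server_version, minimum="2.8.53"):
    """Return True when the server is older than the required minimum."""
    return parse_version(server_version) < parse_version(minimum)


# An older server is skipped; a server at or above the minimum is not.
skip_old = should_skip("2.8.0")
skip_new = should_skip("2.9.0")
```

Comparing tuples of integers avoids the pitfall of comparing version
strings lexically, where "2.10.0" would sort before "2.8.53".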

This needs to be done due to the patch
http://review.whamcloud.com/19637/ which was introduced in
ticket LU-6601.

Test-Parameters: trivial
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I96271380cb7c3c09877de13c31c98367f74aad22
Reviewed-on: https://review.whamcloud.com/26827
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 lnet: Normalize ioctl interface 89/26689/6
Amir Shehata [Mon, 17 Apr 2017 23:30:32 +0000 (16:30 -0700)]
LU-9119 lnet: Normalize ioctl interface

To avoid backwards compatibility issues between base MR
and Dynamic Discovery, standardize the ioctl interface by
bringing in changes to the interface required by
Dynamic Discovery now.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I014d74943b893ec24e3d42e1eb6824d755460c2b
Reviewed-on: https://review.whamcloud.com/26689
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9119 lnet: remove debug ioctl 88/26688/4
Olaf Weber [Fri, 27 Jan 2017 15:15:50 +0000 (16:15 +0100)]
LU-9119 lnet: remove debug ioctl

Remove a debug ioctl that was added to allow for debug
messages from user space. However, the code is currently
not being used.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ifd2bee73ef507bd07296af76dac1caf08ded9e64
Reviewed-on: https://review.whamcloud.com/26688
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9057 lnet: fix static analysis issues 87/26687/4
Amir Shehata [Fri, 3 Feb 2017 02:05:20 +0000 (18:05 -0800)]
LU-9057 lnet: fix static analysis issues

Fixed a set of issues found while running static analysis.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I22ddfdda86c979c7a300ab9df777efbdd5973ac5
Reviewed-on: https://review.whamcloud.com/26687
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9316 kernel: kernel update RHEL6.9 [2.6.32-696.1.1.el6] 87/26587/3
Bob Glossman [Tue, 11 Apr 2017 13:19:20 +0000 (06:19 -0700)]
LU-9316 kernel: kernel update RHEL6.9 [2.6.32-696.1.1.el6]

Update RHEL6.9 kernel to 2.6.32-696.1.1.el6

Test-Parameters: trivial clientdistro=el6.9 mdsdistro=el6.9 \
  ossdistro=el6.9 mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5955da3a8953175d36e68877de52f9f9f2fd659b
Reviewed-on: https://review.whamcloud.com/26587
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9296 ptlrpc: set rq_sent when send fails due to -ENOMEM 70/26470/3
John L. Hammond [Mon, 10 Apr 2017 14:17:20 +0000 (09:17 -0500)]
LU-9296 ptlrpc: set rq_sent when send fails due to -ENOMEM

In ptl_send_rpc() set rq_sent when we fail to send the RPC due
to insufficient memory, since this is what the upper layers expect.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ia19c5d44e2999a9b347ec1088a7d448e1b548136
Reviewed-on: https://review.whamcloud.com/26470
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9294 mgs: ignore stale barrier locks 55/26455/6
Fan Yong [Sat, 3 Dec 2016 00:08:25 +0000 (08:08 +0800)]
LU-9294 mgs: ignore stale barrier locks

Currently, when an MDT is unmounted or crashes, it may not notify the MGS
to clean up the related barrier lock and bitmap. A subsequent
barrier operation may then find some stale barrier locks. It is
unnecessary for mgs_barrier_glimpse_lock() to return failure
in such a case; the caller will handle that properly.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I341fd648da2eebbaab729145d1f06c420fce6455
Reviewed-on: https://review.whamcloud.com/26455
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9355 obd: remove obsolete OBD_FL_LOCAL_MASK 28/26728/5
Wang Shilong [Wed, 19 Apr 2017 03:02:57 +0000 (11:02 +0800)]
LU-9355 obd: remove obsolete OBD_FL_LOCAL_MASK

From Andreas Dilger:

The OBD_FL_LOCAL_MASK support has not been used since
commit e62f0a3c5b9 b=21980 cache ll_obdo_cache: Can't free all
objects which predates Jira. With the landing of patch
https://review.whamcloud.com/26463
LU-4017 quota: add project inherit attributes the handling of the "local"
flags is broken and since it is unused it is better to be removed entirely.

Change-Id: Iebd3de73f78f72851c5a664e72fa3d145729e1d6
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26728
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-9183 llite: handle removal the offset argument of direct_IO 22/25822/10
Dmitry Eremin [Fri, 3 Mar 2017 17:11:40 +0000 (20:11 +0300)]
LU-9183 llite: handle removal the offset argument of direct_IO

In commit c8b8e32d700fe943a935e435ae251364d016c497 the offset
argument for ->direct_IO() was removed.

Change-Id: I9f1cd5862dfcb40ad6b43b9da3072a852c550c49
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25822
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6401 uapi: fix up lustre_ostid.h and lustre_fid.h 69/24569/7
James Simmons [Sat, 8 Apr 2017 21:38:42 +0000 (17:38 -0400)]
LU-6401 uapi: fix up lustre_ostid.h and lustre_fid.h

Several inline functions in the header lustre_ostid.h
are using debug macros instead of returning proper errors.
Remove the debug macros and properly handle the returned
error codes. Place both UAPI headers lustre_fid.h and
lustre_ostid.h into the uapi directory.

Change-Id: Ic32afd05850b5bf02fb8de655cb1971eeb52a321
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24569
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6401 uapi: turn lustre_ioctl.h into a proper UAPI header 68/24568/16
James Simmons [Sat, 8 Apr 2017 19:44:32 +0000 (15:44 -0400)]
LU-6401 uapi: turn lustre_ioctl.h into a proper UAPI header

Remove all the complex inline functions. Move all the user land
specific functions into the userland library. Unwind the kernel
specific functions and move obd_ioctl_is_valid() into the kernel
source file linux-module.c.

Change-Id: I91e69a21231f3effd23b191b6df9b5515a2ccc64
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24568
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8851 nodemap: add uid/gid only flags to control mapping 53/23853/4
Kit Westneat [Fri, 18 Nov 2016 14:50:02 +0000 (09:50 -0500)]
LU-8851 nodemap: add uid/gid only flags to control mapping

This patch adds two flags to nodemaps which control whether or not
the nodemap should map UIDs, GIDs, or both. These flags can be
controlled via lctl as a new nodemap parameter map_mode, with values
both, uid_only, or gid_only.
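
A rough sketch of what the three map_mode values imply. The function,
its arguments, and the pass-through of unmapped IDs are assumptions made
for illustration; the real nodemap code handles unmapped IDs in its own
way.

```python
VALID_MODES = ("both", "uid_only", "gid_only")


def map_ids(uid, gid, uid_map, gid_map, map_mode="both"):
    """Apply the UID and/or GID maps depending on map_mode.

    IDs missing from a map are returned unchanged here, which is a
    simplification for this sketch.
    """
    if map_mode not in VALID_MODES:
        raise ValueError("unknown map_mode: %r" % (map_mode,))
    if map_mode in ("both", "uid_only"):
        uid = uid_map.get(uid, uid)
    if map_mode in ("both", "gid_only"):
        gid = gid_map.get(gid, gid)
    return uid, gid


uid_map = {1000: 5000}
gid_map = {100: 600}
# With uid_only, the GID is left as the client presented it.
mapped = map_ids(1000, 100, uid_map, gid_map, map_mode="uid_only")
```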

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I3efe6ff348d909c196a89273a0c9c046c56dbf1d
Reviewed-on: https://review.whamcloud.com/23853
Reviewed-by: Chris Hanna <hannac@iu.edu>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5834 build: obsolete lustre-client if installing server 62/20162/8
Andreas Dilger [Fri, 13 May 2016 01:53:54 +0000 (19:53 -0600)]
LU-5834 build: obsolete lustre-client if installing server

When installing the "lustre" (client+server) package, obsolete
older lustre-client package so that it can install without error.

Remove ancient Provides: and Obsoletes: lines from 1.2 days.

Fix remaining Provides: and Obsoletes: and package Group: lines
so they contain version numbers and names to make rpmlint happy.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ifdd1d7567ab03d0d1dfa599b592b6f28e09cab07
Reviewed-on: https://review.whamcloud.com/20162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-9311 tests: add sanity-pfl.sh test_13 00/26700/4
Emoly Liu [Mon, 24 Apr 2017 07:16:28 +0000 (15:16 +0800)]
LU-9311 tests: add sanity-pfl.sh test_13

This patch adds sanity-pfl.sh test_13 to verify the following fix:
https://review.whamcloud.com/#/c/26474/
test_13 uses 8 OSTs to make the LOVEA buffer bigger than the request
reply buffer, and writes data to the composite file to verify if the
layout write intent RPC resent by the client is reprocessed by
mdt_layout_change().

Test-parameters: trivial ostcount=8 testlist=sanity-pfl

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I7df6ff472c9cd879267912851d7288258db61d15
Reviewed-on: https://review.whamcloud.com/26700
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7214 lnet: Allow min stats to be reset in peers and nis 70/20470/8
Doug Oucharek [Thu, 26 May 2016 18:26:17 +0000 (11:26 -0700)]
LU-7214 lnet: Allow min stats to be reset in peers and nis

Allow writes to the peers and nis LNet procfs files to
reset the minimum stat columns.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: Ic7aa647a154a424b35be1598266887d27efb4ab3
Reviewed-on: https://review.whamcloud.com/20470
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9234 test: Skip test_70f if OSS version is older than 2.9.53 39/26739/3
Wei Liu [Wed, 19 Apr 2017 16:49:05 +0000 (09:49 -0700)]
LU-9234 test: Skip test_70f if OSS version is older than 2.9.53

Add version check to replay-single test_70f. Skip it if OSS is older
than 2.9.53

Test-Parameters: trivial testlist=replay-single

Change-Id: I31424129e28c35eb0ac2bab9d8f321c13c978672
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/26739
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
2 years agoLU-9359 pfl: instantiate enough component at mdd_create_data 06/26706/5
Bobi Jam [Tue, 18 Apr 2017 13:58:59 +0000 (21:58 +0800)]
LU-9359 pfl: instantiate enough component at mdd_create_data

mknod creates a file without a layout, then truncate triggers the MDT
to create OST objects. The current implementation only instantiates the
1st component, so when the truncate size lies in one of the other
components, the lvb size info is lost.

This patch makes the MDT create enough OST objects to cover the file's
size.
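
To illustrate "enough components to cover the file's size", here is a
small sketch that assumes half-open component extents [start, end) and a
hypothetical helper name. It only shows the selection rule, not how the
MDT actually allocates OST objects.

```python
EOF = float("inf")  # stands in for an extent that runs to end of file


def components_to_instantiate(size, components):
    """Return indices of the components needed to cover [0, size).

    The first component is always included, so a zero-size file still
    gets one instantiated component.
    """
    needed = []
    for index, (start, _end) in enumerate(components):
        if index == 0 or start < size:
            needed.append(index)
    return needed


MiB = 1 << 20
layout = [(0, 1 * MiB), (1 * MiB, 4 * MiB), (4 * MiB, EOF)]
# Truncating to 2 MiB needs the first two components, not just the first.
needed = components_to_instantiate(2 * MiB, layout)
```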

This patch fixes the misunderstanding of ost_pool::op->size: it
indicates the allocated buffer size, not the array count.

Another issue fixed is that lod_alloc_qos() now only fills in the ost
inused array when lod_qos_declare_object_on() succeeds.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ie66950e9b3d8cc009cca58f63936b275759211f1
Reviewed-on: https://review.whamcloud.com/26706
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-9346 lod: replay of PFL file open failure 30/26630/7
Bobi Jam [Fri, 14 Apr 2017 17:41:37 +0000 (01:41 +0800)]
LU-9346 lod: replay of PFL file open failure

During replay of a PFL file open, lod_qos_parse_config()->
lod_use_defined_striping() initializes the stripe LU-objects but keeps
the component's LCME_FL_INIT flag. Later, in lod_striping_create(),
these components are skipped when creating OST objects, which fails the
replay; the replay should create their OST objects.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic84374941df7a14b53e463f6117d5fbb9995c33d
Reviewed-on: https://review.whamcloud.com/26630
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9334 lfsck: reset trace file for upgrade case 16/26716/7
Fan Yong [Fri, 2 Dec 2016 20:44:27 +0000 (04:44 +0800)]
LU-9334 lfsck: reset trace file for upgrade case

With PFL introduced, the layout LFSCK on-disk trace file
format changes, which is incompatible with the non-PFL case.
Use a new layout LFSCK trace file magic for the PFL case, which
will reset the layout LFSCK on-disk trace files when
upgrading from the non-PFL case.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Id3b3fa4a89e5a408700653d8b759b76626c78912
Reviewed-on: https://review.whamcloud.com/26716
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9356 osd-ldiskfs: add blk_plug when do bio 97/26697/4
Qian Yingjin [Mon, 17 Apr 2017 08:05:30 +0000 (16:05 +0800)]
LU-9356 osd-ldiskfs: add blk_plug when do bio

During 16MB bulk RPC I/O evaluation on rhel7, due to the kernel
BIO_MAX_PAGES (256) limit, the 16MB I/O is divided into 16 1MB
I/Os that are submitted to the underlying block device one by one.
We found that the SFA disk driver got lots of 1MB I/Os.
To optimize performance, this patch introduces blk_plug into
osd-ldiskfs when doing bio: before submitting the I/Os it calls
blk_start_plug(), and after submitting all of the 16MB I/O it calls
blk_finish_plug(), so that the 16MB bulk I/O has more chance to be
merged in the block elevator scheduler layer.
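
The "16 1MB I/Os" figure follows from simple arithmetic, sketched here.
The 4096-byte page size is an assumption (typical for x86_64), and the
helper name is hypothetical.

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages
BIO_MAX_PAGES = 256    # kernel limit quoted in the message above


def bios_needed(io_bytes, page_size=PAGE_SIZE, bio_max_pages=BIO_MAX_PAGES):
    """Return how many bios are needed to carry io_bytes of data."""
    bio_bytes = page_size * bio_max_pages  # 1 MiB with the defaults
    return -(-io_bytes // bio_bytes)       # ceiling division


# A 16 MiB bulk I/O is split into 16 bios of 1 MiB each.
count = bios_needed(16 << 20)
```

Plugging does not change this count; it holds the bios back while they
are queued, so the block layer has a chance to merge adjacent ones
before they reach the driver.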

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If26db9f85baf97bc441cc4ad19d5c9f97bd3d7e5
Reviewed-on: https://review.whamcloud.com/26697
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9342 pfl: handle uninited component in user defined LOVEA 19/26619/8
Bobi Jam [Fri, 14 Apr 2017 08:06:34 +0000 (16:06 +0800)]
LU-9342 pfl: handle uninited component in user defined LOVEA

It is possible that a user defined LOVEA contains uninstantiated
components, as in replay of create, layout extend, or lfs swap. The
partial LOVEA will be passed into
lod_declare_xattr_set()->lod_declare_striped_object()->
lod_prepare_create()->lod_qos_parse_config()->
lod_use_defined_striping(),
so lod_use_defined_striping() also needs to take care of
uninstantiated component entries.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I45d46b118bcde66f4604b80e2da51a808f381219
Reviewed-on: https://review.whamcloud.com/26619
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-9344 test: hung with sendfile_grouplock test12() 46/26646/9
Bobi Jam [Sat, 15 Apr 2017 03:57:18 +0000 (11:57 +0800)]
LU-9344 test: hung with sendfile_grouplock test12()

This is a makeshift fix.

When we hold a group lock of a file, there should be no data written to
the file, since during the write IO, the file's layout could possibly
change, and the write IO will try to update its layout, which could
be blocked by itself.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ia24310d509e8a93c3c1d849c54eacf789f40bfb8
Reviewed-on: https://review.whamcloud.com/26646
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-9339 ldiskfs: fix regression of macro check 69/26769/3
Wang Shilong [Fri, 21 Apr 2017 02:31:22 +0000 (22:31 -0400)]
LU-9339 ldiskfs: fix regression of macro check

include/linux/quota.h undefines @PRJQUOTA and makes it an
enum, so the ldiskfs macro check here will be negative. Fix
it to use @HAVE_PROJECT_QUOTA, and also wrap ext4-export-*.patch
into ext4-projid-feature-support.patch.

Change-Id: I415e366423408413dfb5b99dee98f7a30513f83f
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/26769
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9276 kernel: kernel update [SLES12 SP1 3.12.69-60.64.35] 86/26286/6
Bob Glossman [Thu, 30 Mar 2017 16:27:47 +0000 (09:27 -0700)]
LU-9276 kernel: kernel update [SLES12 SP1 3.12.69-60.64.35]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12 testgroup=review-ldiskfs \
  mdsdistro=sles12 ossdistro=sles12 mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ied31b0a3601ac6e1db06da3e0167b875d1f2200a
Reviewed-on: https://review.whamcloud.com/26286
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9183 llite: handle removal of parent argument of ->d_compare() 18/25818/13
Dmitry Eremin [Thu, 2 Mar 2017 19:48:44 +0000 (22:48 +0300)]
LU-9183 llite: handle removal of parent argument of ->d_compare()

In commit 6fa67e707559303e086303aeecc9e8b91ef497d5 the parent
parameter for ->d_compare() was removed.

Change-Id: Ia241619c3ade13036154973a19f44a2083a9bc53
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25818
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9183 libcfs: handle get_user_pages() with gup_flags 17/25817/10
Dmitry Eremin [Thu, 2 Mar 2017 19:39:37 +0000 (22:39 +0300)]
LU-9183 libcfs: handle get_user_pages() with gup_flags

In commit c8fe4609827aedc9c4b45de80e7cdc8ccfa8541b the arguments
write/force for get_user_pages() were replaced with gup_flags.

Change-Id: I64b3e69ee736d1f3be48f2c4c8fe1af1dc14e857
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25817
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9152 mgc: Remove unnecessary checks for config_log_put() 54/25854/4
Steve Guminski [Mon, 6 Mar 2017 21:05:22 +0000 (16:05 -0500)]
LU-9152 mgc: Remove unnecessary checks for config_log_put()

Because config_log_put() now checks if its parameter is NULL, it
is unnecessary to perform the check prior to calling it.  This patch
removes the redundant checks.

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Id6b1fccebd5bc53a29bc364b9a3c47956649920a
Reviewed-on: https://review.whamcloud.com/25854
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years agoLU-9306 tests: more debug info for hsm test_24d 70/26770/7
Fan Yong [Sat, 3 Dec 2016 02:18:35 +0000 (10:18 +0800)]
LU-9306 tests: more debug info for hsm test_24d

More debug information for sanity-hsm test_24d.
Drop unnecessary condition check for mdt_hsm_cdt_stop().

Test-Parameters: testlist=sanity-hsm,sanity-hsm,sanity-hsm,sanity-hsm
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib4a43693d364e208468937d15e3c48a1b1afb17a
Reviewed-on: https://review.whamcloud.com/26770
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8424 osd-zfs: ZFS macro DN_MAX_BONUSLEN is deprecated 78/26078/3
Ned Bass [Sat, 18 Mar 2017 00:55:46 +0000 (17:55 -0700)]
LU-8424 osd-zfs: ZFS macro DN_MAX_BONUSLEN is deprecated

The ZFS macro DN_MAX_BONUSLEN was deprecated as of ZFS 0.7.0. Lustre
should instead use the compatibility wrappers such as
osd_dmu_object_alloc() and osd_zap_create_flags(). The reason for the
API change is that ZFS 0.7.0 adds support for variable length dnodes,
so the maximum bonus length should not be treated as a fixed
constant. The maximum bonus length may vary by dnode and by dataset,
and it should be derived accordingly.

This change:

- Adds an additional compatibility function osd_obj_bonuslen(obj) to
  obtain a maximum bonus length given an osd_object.

- Updates code that uses the deprecated macro to instead use
  appropriate compatibility interfaces.

- Removes the definition of DN_MAX_BONUSLEN that was added in commit
  49fc02fb738e9420ab10c5a7d41534c7a55b8ea0 to ensure that future
  builds using the deprecated macro will fail.

- Adds DN_MAX_BONUSLEN and DN_OLD_MAX_BONUSLEN to the list of
  deprecated interfaces in checkpatch.pl.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: I1fcc84e55b39ca49a88acb909b5e3294f3b46723
Reviewed-on: https://review.whamcloud.com/26078
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9323 kernel: kernel update RHEL7.3 [3.10.0-514.16.1.el7] 90/26590/2
Bob Glossman [Wed, 12 Apr 2017 14:49:38 +0000 (07:49 -0700)]
LU-9323 kernel: kernel update RHEL7.3 [3.10.0-514.16.1.el7]

update RHEL 7.3 kernel to 3.10.0-514.16.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia53f001b7b277a20349b23ffdf54e0b0dbc1d065
Reviewed-on: https://review.whamcloud.com/26590
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9327 test: handle nodes lacking numbers in names for hostlist_expand. 60/26560/3
James Simmons [Thu, 13 Apr 2017 19:54:23 +0000 (15:54 -0400)]
LU-9327 test: handle nodes lacking numbers in names for hostlist_expand.

Currently hostlist_expand() in test-framework assumes all node names will
contain numbers to sort them by. It is possible to have nodes that are
instead named alphabetically (nodea, nodeb, ...). Handle this additional
case.
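
As a rough illustration of the sorting problem described above (not the
actual test-framework code, which is a shell function), here is a minimal
Python sketch. The names host_sort_key() and sort_hosts() are hypothetical
and exist only for this example; the idea is to sort on a trailing number
when one is present and fall back to the plain name otherwise.

```python
import re


def host_sort_key(name):
    """Sort key for host names with or without a trailing number."""
    match = re.search(r"\d+$", name)
    if match:
        # Split "node12" into ("node", 12) so that node2 sorts before node10.
        return (name[:match.start()], int(match.group()))
    # No trailing digits ("nodea"): compare on the whole name.
    return (name, -1)


def sort_hosts(names):
    """Return the unique host names in a stable, human-friendly order."""
    return sorted(set(names), key=host_sort_key)


if __name__ == "__main__":
    print(sort_hosts(["node10", "node2", "nodeb", "nodea", "node2"]))
```

A key that assumed every name ends in digits would fail on "nodea"; the
fallback branch is what lets both naming styles coexist in one list.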

Change-Id: I014d7934a869192b3bc1908b4b0d559724d443c9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/26560
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9008 pfl: Read should not trigger layout write intent 99/26499/10
Jinshan Xiong [Tue, 11 Apr 2017 11:08:16 +0000 (19:08 +0800)]
LU-9008 pfl: Read should not trigger layout write intent

In lov_io_rw_iter_init(), only write operations, not reads, should
trigger layout write intent.

An append write has to make sure all uninitialized components
are instantiated.

Page mkwrite should also trigger write intent.

Fixed sanity-pfl test cases as well: in order to instantiate
a component by truncate, the truncated size must be within the
extent of the component.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I3c5257c5799ae0df35a572b9ee981f583260e8ec
Reviewed-on: https://review.whamcloud.com/26499
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9292 utils: handle partitioned MD arrays correctly 99/26399/2
Jadhav Vikram [Tue, 4 Apr 2017 04:08:11 +0000 (09:38 +0530)]
LU-9292 utils: handle partitioned MD arrays correctly

mount.lustre doesn't handle partitioned MD arrays correctly.
The function set_blockdev_tunables doesn't correctly parse
the newer names of partitioned MD devices (e.g. /dev/md0p2).
It is written to parse the older partitioned MD device names
(e.g. /dev/md_d2p3). This means that MD partitions like /dev/md0p2
do not receive the MD-level tunings that non-partitioned MD devices
get (specifically stripe_cache_size), leading to a large
performance hit. The fix is to handle both newer and older MD
device names.
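
To make the two naming schemes concrete, the following Python sketch maps a
partition name back to its parent array. It is only an illustration under
assumptions: the real change is in the C mount utilities, and the helper
name md_parent() and the regular expression are made up for this example.

```python
import re

# Matches a partition of an MD array in either naming scheme:
#   newer: md0p2    -> parent md0
#   older: md_d2p3  -> parent md_d2
_MD_PARTITION = re.compile(r"^(md_d\d+|md\d+)p\d+$")


def md_parent(devname):
    """Return the parent MD array for a partition, else the device itself."""
    base = devname.rsplit("/", 1)[-1]
    match = _MD_PARTITION.match(base)
    return match.group(1) if match else base


if __name__ == "__main__":
    for dev in ("/dev/md0p2", "/dev/md_d2p3", "/dev/md127", "/dev/sdb1"):
        print(dev, "->", md_parent(dev))
```

Tunables such as stripe_cache_size belong to the array rather than to the
partition, which is why resolving the parent name matters here.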

Seagate-bug-id: MRP-2608
Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Christopher Walker <chris.walker@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: Ia0abe63e725e3a70d2561960faa1bc48981f2fd0
Reviewed-on: https://review.whamcloud.com/26399
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9258 nodemap: group quota ID not properly mapped 09/26209/4
Kit Westneat [Mon, 27 Mar 2017 16:27:27 +0000 (12:27 -0400)]
LU-9258 nodemap: group quota ID not properly mapped

This patch fixes a typo in which group quota IDs were mapped as user
IDs for quota commands.

Test-Parameters: trivial
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I9458b4af04a3638fb8ccf3bd2a96b30305021514
Reviewed-on: https://review.whamcloud.com/26209
Tested-by: Jenkins
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9227 nrs: Rate change of a TBF rule loses control 85/26085/4
Qian Yingjin [Mon, 20 Mar 2017 02:33:54 +0000 (10:33 +0800)]
LU-9227 nrs: Rate change of a TBF rule loses control

In some test cases, e.g.:
start dd_0 {dd.0} 1000
dd if=/dev/zero of=/mnt/lustre/test bs=1M count=100
start dd_1000 {dd.1000} 1000
After running the above commands, changing the rate of dd_0 takes
no effect. The reason is that starting rule dd_1000 increases the
sequence, but the sequence number of the class with id "dd.0" does
not change accordingly, resulting in failure of the rate change.
This patch fixes this problem.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I9cdd469fd57dea692b86285cc26040a117b120ad
Reviewed-on: https://review.whamcloud.com/26085
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9153 llog: update llog print format to use FIDs 40/25640/9
Andreas Dilger [Mon, 27 Feb 2017 00:32:40 +0000 (17:32 -0700)]
LU-9153 llog: update llog print format to use FIDs

Print llog identifiers using FIDs instead of OSTID format, which has
been deprecated since 2.3.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id012c059e8151e2a78086579150f04b1b05cab07
Reviewed-on: https://review.whamcloud.com/25640
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9148 kernel: kernel update RHEL6.8 [2.6.32-642.15.1.el6] 33/25633/5
Bob Glossman [Wed, 22 Feb 2017 17:32:41 +0000 (09:32 -0800)]
LU-9148 kernel: kernel update RHEL6.8 [2.6.32-642.15.1.el6]

Update RHEL6.8 kernel to 2.6.32-642.15.1.el6

Test-Parameters: trivial clientdistro=el6.8 mdsdistro=el6.8 \
  ossdistro=el6.8 mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iaac7b99246b9e9d3204c3545aaf0b9d2fd8d3185
Reviewed-on: https://review.whamcloud.com/25633
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8457 pacemaker: Update healthLNET to 0.99.4 97/25297/5
Nathaniel Clark [Tue, 7 Feb 2017 13:55:06 +0000 (08:55 -0500)]
LU-8457 pacemaker: Update healthLNET to 0.99.4

Fix minor issue with LNet connectivity.
Fix license header.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I074b93e2e3ea29e608a6f1b46600556a1b255438
Reviewed-on: https://review.whamcloud.com/25297
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Malcolm Cowe <malcolm.j.cowe@intel.com>
2 years agoLU-8397 mgc: add comma-separated nids for primary MGS 08/21308/4
Vladimir Saveliev [Wed, 1 Feb 2017 15:29:56 +0000 (23:29 +0800)]
LU-8397 mgc: add comma-separated nids for primary MGS

In lustre_start_mgc(), if the primary MGS has multiple nids separated by
commas, then only the first nid is added, and the other nids are not added.
This patch fixes the above issue by adding the nids until the end of the
primary MGS nid list or hitting the first failover MGS nid.
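
A small Python sketch of the intended parsing follows. It assumes the usual
mount-string convention in which commas separate NIDs of the same MGS node
and a colon starts the next (failover) node; the function name
primary_mgs_nids() is hypothetical, and the real logic lives in the C
function lustre_start_mgc().

```python
def primary_mgs_nids(mgs_spec):
    """Return every NID of the primary MGS from an MGS NID specification.

    Assumed format: "nid1a,nid1b:nid2a,nid2b", where the part before the
    first colon lists the primary MGS NIDs.
    """
    primary = mgs_spec.split(":", 1)[0]
    # Keep all comma-separated NIDs, not just the first one.
    return [nid for nid in primary.split(",") if nid]


if __name__ == "__main__":
    spec = "10.0.0.1@tcp,10.0.1.1@o2ib:10.0.0.2@tcp"
    print(primary_mgs_nids(spec))
```

The bug described above corresponds to stopping after the first element of
this list instead of walking it to the end.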

Test-Parameters: combinedmdsmgs=false envdefinitions=ONLY=77 testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=77 testlist=conf-sanity

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Seagate-bug-id: MRP-2930
Change-Id: I3e84786bfc08767c75a133affb4f86325d789d6e
Reviewed-on: https://review.whamcloud.com/21308
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9354 utils: fix lfs.c compile for PPC 27/26727/2
Andreas Dilger [Wed, 19 Apr 2017 02:53:00 +0000 (20:53 -0600)]
LU-9354 utils: fix lfs.c compile for PPC

Fix the lfs_setstripe_args parameter to use "unsigned long long"
for lsa_comp_end so that when passed to llapi_parse_size() it is
the correct type.  PPC annoyingly uses "unsigned long" for __u64
while most other architectures use "unsigned long long".

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I27b724d5eab5f4669bde1f5fb44c75a1051ba7bd
Reviewed-on: https://review.whamcloud.com/26727
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9311 pfl: shouldn't reprocess done/no-op resent request 74/26474/5
Bobi Jam [Mon, 10 Apr 2017 17:50:04 +0000 (01:50 +0800)]
LU-9311 pfl: shouldn't reprocess done/no-op resent request

When the LOVEA buffer is bigger than the request reply buffer, the
client will resend the layout write intent RPC, and
mdt_layout_change() should not reprocess it. The second pass would
try to cancel the CR lock granted by the first pass, which the client
has not received yet because the reply buffer shortage forced it to
resend the RPC.

There is another layout change resend case: the client's job has
already been done by another client (see the -EALREADY case in
lod_declare_layout_change()), so it became an operation without a
transaction. We should not do the layout change here either, otherwise
mdt_layout_change() will try to cancel the granted server CR lock
whose remote counterpart is still held on the client, and a deadlock
ensues.

This patch also adjusts some debug messages and makes dump_lsm() dump
uninstantiated component stripe info.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I9b063ee54d57c233eca3250502a2707997892898
Reviewed-on: https://review.whamcloud.com/26474
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9324 test: get layout parameters 78/26578/9
Bobi Jam [Thu, 13 Apr 2017 03:36:58 +0000 (11:36 +0800)]
LU-9324 test: get layout parameters

Add test framework functions get_layout_param()/parse_layout_param()
to get plain/PFL file/dir layout parameters, so that the layout
parameters can be used later.

Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I4f43a5b4d23e353660114b7ff299a97afc405a07
Reviewed-on: https://review.whamcloud.com/26578
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9168 spec: New lustre-resource-agents rpm 29/26229/4
Nathaniel Clark [Mon, 27 Mar 2017 16:55:31 +0000 (12:55 -0400)]
LU-9168 spec: New lustre-resource-agents rpm

This rpm bundles up all resource agents for Lustre into a single RPM.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ifc4422f11e9d6b4c1e03ef85fc78e35d6fcb0316
Reviewed-on: https://review.whamcloud.com/26229
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Zhiqi Tao <zhiqi.tao@intel.com>
Reviewed-by: Gabriele Paciucci <gabriele.paciucci@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8840 osp: handle EA cache properly (2) 07/25207/4
Fan Yong [Sat, 8 Oct 2016 06:00:24 +0000 (14:00 +0800)]
LU-8840 osp: handle EA cache properly (2)

On success, dt_xattr_get() should return the EA size instead of
zero. If the EA does not exist, it should return -ENODATA.

Further code cleanup for the OSP EA cache to avoid potential reference
leaks, buffer overflows, and so on.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I85d7c7a2cafd50334f2ea0634f5e2b21c0b3908e
Reviewed-on: https://review.whamcloud.com/25207
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-2155 utils: Enable ldiskfs mmp on tunefs failover add 58/26758/2
Nathaniel Clark [Wed, 19 Apr 2017 14:54:51 +0000 (10:54 -0400)]
LU-2155 utils: Enable ldiskfs mmp on tunefs failover add

Enable Multi-Mount Protection in ldiskfs when adding a failover peer
via tunefs.  MMP is enabled in mkfs if failover is configured
initially, but it wasn't if failover was added later.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ie21ef5324de240afe0fd760cb4a9b3e1b4165064
Reviewed-on: https://review.whamcloud.com/26758
Tested-by: Jenkins
Reviewed-by: Zhiqi Tao <zhiqi.tao@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-8998 utils: allow "--component-*" to be abbreviated 99/26699/2
Andreas Dilger [Tue, 18 Apr 2017 07:35:54 +0000 (01:35 -0600)]
LU-8998 utils: allow "--component-*" to be abbreviated

Allow "--component-foo" arguments to be abbreviated to "--comp-foo"
to simplify userspace, especially for options like "--comp-flags"
that do not have a short option.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8c434dd68c042efefc2df93d0f2d7b9bc33ebbe5
Reviewed-on: https://review.whamcloud.com/26699
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9351 pfl: setstripe to existing file 55/26655/2
Niu Yawei [Mon, 17 Apr 2017 12:58:32 +0000 (08:58 -0400)]
LU-9351 pfl: setstripe to existing file

It's ok to setstripe to an existing file when the file has no
LOVEA created yet, so we should not use exclusive open for
'lfs setstripe'.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I2090ba2940b391f7853cc06ccdb9cd842ad6f984
Reviewed-on: https://review.whamcloud.com/26655
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
2 years agoLU-9335 pfl: calculate PFL file LOVEA correctly 97/26597/11
Bobi Jam [Thu, 13 Apr 2017 16:37:25 +0000 (00:37 +0800)]
LU-9335 pfl: calculate PFL file LOVEA correctly

A PFL file can contain uninstantiated components, which may still
keep the specified stripe count of -1.
lov_mds_md_size()/lov_user_md_size() should heed this case,
otherwise the computed LOVEA size could be erroneously large.
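
The size problem can be illustrated with a short Python sketch. The byte
sizes and the helper name lovea_size() are assumptions chosen for
illustration, and treating a count of -1 as zero stripes is one plausible
reading of "heed this case", not a statement of what the patch does.

```python
# Illustrative constants (assumed, not taken from the patch).
LOV_HEADER_BYTES = 32       # fixed part of the layout EA
LOV_OST_ENTRY_BYTES = 24    # one per-stripe object entry
STRIPE_COUNT_ALL = 0xFFFF   # -1 as stored in a 16-bit stripe count field


def lovea_size(stripe_count):
    """Bytes needed for a layout EA with the given stripe count."""
    if stripe_count == STRIPE_COUNT_ALL:
        # Uninstantiated component: no per-stripe entries exist yet.
        stripe_count = 0
    return LOV_HEADER_BYTES + stripe_count * LOV_OST_ENTRY_BYTES


if __name__ == "__main__":
    print(lovea_size(4))
    print(lovea_size(STRIPE_COUNT_ALL))
```

Without the special case, the 16-bit value 0xFFFF would be multiplied by the
per-stripe entry size, giving a buffer of roughly 1.5 MiB under these
assumed sizes for a component that has no objects at all.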

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ied4bf4531f0b0ac9fdefc9efef3c97ae5ae449f4
Reviewed-on: https://review.whamcloud.com/26597
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8922 lod: check master stripes properly 76/24776/5
Di Wang [Mon, 9 Jan 2017 15:07:36 +0000 (10:07 -0500)]
LU-8922 lod: check master stripes properly

When creating a striped directory, it should
first check if the stripe has been created
on the current MDT, otherwise it might create
duplicate stripes on the master MDT, especially
when one MDT is deactivated and the specified stripe
count is larger than the number of active MDTs.

Test-Parameters: envdefinitions=ONLY=50 testlist=conf-sanity,conf-sanity,conf-sanity
Test-Parameters: envdefinitions=ONLY=50 testlist=conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Id3bba2817c4c7c9584f9129e32555f0f676b3364
Reviewed-on: https://review.whamcloud.com/24776
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5718 o2iblnd: multiple sges for work request 51/12451/12
Liang Zhen [Tue, 28 Oct 2014 03:34:26 +0000 (11:34 +0800)]
LU-5718 o2iblnd: multiple sges for work request

In the current protocol, the LNet router cannot align buffers for RDMA,
so o2iblnd may run into the "too fragmented RDMA" issue while routing
non-page-aligned IO larger than 512K, because each page will
be split into two fragments by kiblnd_init_rdma().

With this patch, o2iblnd can have multiple sges for each work
request, and combine multiple remote fragments of the same page
into one work request to resolve the "too fragmented RDMA" issue.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Id57a74dc92801b012956ab785233aa87cac14263
Reviewed-on: https://review.whamcloud.com/12451
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9334 lfsck: object leak in lfsck_load_one_trace_file 03/26703/2
Fan Yong [Tue, 29 Nov 2016 13:18:08 +0000 (21:18 +0800)]
LU-9334 lfsck: object leak in lfsck_load_one_trace_file

In lfsck_load_one_trace_file(), if we successfully load or
create the object via local_index_find_or_create(), but the
subsequent do_index_try() fails, then we need to release
the object, otherwise it will be leaked.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ic3c7db9239e0d10a5cf6fc2254a4c414f4cd007f
Reviewed-on: https://review.whamcloud.com/26703
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-1032 build: fix typo in lustre-dkms.spec changelog 58/26358/2
Andreas Dilger [Tue, 4 Apr 2017 23:22:51 +0000 (17:22 -0600)]
LU-1032 build: fix typo in lustre-dkms.spec changelog

Fix typo in the lustre-dkms.spec.in ChangeLog which has an
invalid date, and causes a build warning.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I6029c93928f84ff59c8798d2f8115e872c4cab07
Reviewed-on: https://review.whamcloud.com/26358
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9384 tests: skip 2.9 filesystem images on 2.10 89/26789/2
Andreas Dilger [Sun, 23 Apr 2017 04:19:00 +0000 (22:19 -0600)]
LU-9384 tests: skip 2.9 filesystem images on 2.10

The 2.9 disk images upgraded to 2.10 are causing common test failures
due to small differences in space usage.  Disable testing these images
until a proper fix is available.

Test-Parameters: trivial envdefinitions=ONLY=32 testlist=conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I64d14ac6b079c50ee234ff886c9a80d9213ebbe5
Reviewed-on: https://review.whamcloud.com/26789
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-6283 ptlrpc: Implement NRS Delay Policy 01/14701/32
Chris Horn [Thu, 12 Mar 2015 23:25:14 +0000 (18:25 -0500)]
LU-6283 ptlrpc: Implement NRS Delay Policy

The NRS Delay policy seeks to perturb the timing of request processing
at the PtlRPC layer, with the goal of simulating high server load, and
finding and exposing timing related problems. When this policy is
active, upon arrival of a request the policy will calculate an offset,
within a defined, user-configurable range, from the request arrival
time, to determine a time after which the request should be handled.
The request is then stored using the cfs_binheap implementation,
which sorts the request according to the assigned start time.
Requests are removed from the binheap for handling once their start
time has been passed.

The behavior of the policy can be controlled via three proc files
which can be written to via lctl similar to other policies.

nrs_delay_min: Controls the minimum amount of time, in seconds, that a
request will be delayed by this policy. The default is 5 seconds.

nrs_delay_max: Controls the maximum amount of time, in seconds, that a
request will be delayed by this policy. The default is 300 seconds.

nrs_delay_pct: Controls the percentage of requests that will be delayed
by this policy. The default is 100. Note that when a request is not
selected for handling by the delay policy due to this variable, the
request will be handled by whatever fallback policy is defined
for that service. If no other fallback policy is defined then the
request will be handled by the FIFO policy.
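
The mechanism described above can be sketched in a few lines of Python.
This is only a model under assumptions: the class name DelayQueue and its
methods are hypothetical, Python's heapq stands in for cfs_binheap, and a
uniform random offset is assumed because the text does not say how the
offset is chosen.

```python
import heapq
import random


class DelayQueue:
    """Toy model of a delay policy: hold requests until a start time."""

    def __init__(self, delay_min=5, delay_max=300, delay_pct=100, rng=None):
        self.delay_min = delay_min    # seconds, like nrs_delay_min
        self.delay_max = delay_max    # seconds, like nrs_delay_max
        self.delay_pct = delay_pct    # percent, like nrs_delay_pct
        self.rng = rng or random.Random()
        self._heap = []               # entries: (start_time, seq, request)
        self._seq = 0                 # tie-breaker for equal start times

    def enqueue(self, request, arrival):
        """Return True if delayed here, False if left to a fallback policy."""
        if self.rng.random() * 100 >= self.delay_pct:
            return False
        start = arrival + self.rng.uniform(self.delay_min, self.delay_max)
        heapq.heappush(self._heap, (start, self._seq, request))
        self._seq += 1
        return True

    def pop_ready(self, now):
        """Remove and return requests whose start time has passed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


if __name__ == "__main__":
    queue = DelayQueue(delay_min=5, delay_max=10, rng=random.Random(0))
    queue.enqueue("req-a", arrival=0)
    queue.enqueue("req-b", arrival=0)
    print(queue.pop_ready(now=4))    # nothing can be ready before delay_min
    print(queue.pop_ready(now=10))   # everything is ready by delay_max
```

Keeping the requests in a heap keyed on start time means the next request
to release is always at the top, so checking for ready work is cheap no
matter how many requests are being held.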

Some examples:

lctl set_param *.*.*.nrs_delay_min=reg_delay_min:5, to set the regular
request minimum delay on all PtlRPC services to 5 seconds.

lctl set_param *.*.*.nrs_delay_min=hp_delay_min:2, to set the
high-priority request minimum delay on all PtlRPC services to 2
seconds.

lctl set_param *.*.ost_io.nrs_delay_min=8, to set both the regular and
high-priority request minimum delay of the ost_io service to 8
seconds.

lctl set_param *.*.*.nrs_delay_max=reg_delay_max:20, to set the
regular request maximum delay on all PtlRPC services to 20 seconds.

lctl set_param *.*.*.nrs_delay_max=hp_delay_max:10, to set the
high-priority request maximum delay on all PtlRPC services to 10
seconds.

lctl set_param *.*.ost_io.nrs_delay_max=35, to set both the regular
and high-priority request maximum delay of the ost_io service to 35
seconds.

lctl set_param *.*.*.nrs_delay_pct=reg_delay_pct:5, to delay 5
percent of regular requests on all PtlRPC services.

lctl set_param *.*.*.nrs_delay_pct=hp_delay_pct:2, to delay 2 percent
of high-priority requests on all PtlRPC services.

lctl set_param *.*.ost_io.nrs_delay_pct=8, to delay 8 percent of both
regular and high-priority requests of the ost_io service.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iab50a639900adf31893c7b1fe83658932fd59db1
Reviewed-on: https://review.whamcloud.com/14701
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9319 statahead: skip agl for the file in restoring 01/26501/4
Fan Yong [Mon, 7 Mar 2016 21:34:36 +0000 (05:34 +0800)]
LU-9319 statahead: skip agl for the file in restoring

In case of restore, the MDT has the right size and has already sent
it back without granting the layout lock, so the inode is up to date
and AGL (async glimpse lock) is useless.

Also, a glimpse needs the layout. During a running restore the MDT
holds the layout lock, so the glimpse (and with it statahead/AGL)
would block until the restore finishes.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaaf138a28671c3eccfb05b08ce8d7364423256a1
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: https://review.whamcloud.com/26501
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9332 tests: fix pool_list issue in conf-sanity test_82b 82/26582/3
Emoly Liu [Fri, 14 Apr 2017 06:18:49 +0000 (14:18 +0800)]
LU-9332 tests: fix pool_list issue in conf-sanity test_82b

In conf-sanity.sh test_82b, if we use separate MGS and MDS nodes,
the MDS needs more time to list the OSTs in the pool, so use
wait_update instead of the pool_list command.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I8f7281d44a12731d7ae2d1f8fa9bad163086cecc
Reviewed-on: https://review.whamcloud.com/26582
Tested-by: Jenkins
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>