Whamcloud - gitweb
fs/lustre-release.git
8 years ago LU-6245 libcfs: remove err.h 09/15909/4
James Simmons [Wed, 12 Aug 2015 14:25:09 +0000 (10:25 -0400)]
LU-6245 libcfs: remove err.h

With the cleanup of userland with libcfs we no longer
need the special error handling macros.

Change-Id: I5a7a2e1df3beef548b74703b45052ce85166f3aa
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/15909
Reviewed-by: frank zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6245 libcfs: remove unused cfs_timer_done 17/13917/3
James Simmons [Mon, 10 Aug 2015 14:36:22 +0000 (10:36 -0400)]
LU-6245 libcfs: remove unused cfs_timer_done

Remove the cfs_timer_done function in the libcfs
kernel module since it is not used anywhere.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ieb521b5b2ce4d3a66dfa0bafc3a866fc38fcd65f
Reviewed-on: http://review.whamcloud.com/13917
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-5108 osc: Performance tune for LRU 58/10458/18
Jinshan Xiong [Sat, 18 Jul 2015 13:10:09 +0000 (06:10 -0700)]
LU-5108 osc: Performance tune for LRU

Launch page LRU work early in osc_io_rw_iter_init();
Change the page LRU shrinking policy by OSC attributes;
Delete the contended lock osc_object::oo_seatbelt;
Other tiny changes for LRU management.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I688c29a99a469ef74f929a0689596170c665b2ee
Reviewed-on: http://review.whamcloud.com/10458
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-3031 ldlm: disconnect speedup 43/5843/31
Vitaly Fertman [Wed, 22 Jul 2015 14:52:03 +0000 (10:52 -0400)]
LU-3031 ldlm: disconnect speedup

Disconnect takes too long if there are many locks to cancel.
Besides the time spent on each lock cancel, there is a resched()
in cfs_hash_for_each_relax(), so disconnect or eviction may take
an unexpectedly long time.
- do not cancel locks on disconnect_export;
- export will be left in obd_unlinked_exports list pinned by live
  locks;
- new re-connects will create other non-conflicting exports;
- new locks will cancel obsolete locks on conflicts;
- once all the locks on the disconnected export are cancelled,
  the export will be destroyed on the last ref put;
- do not cancel in small portions, cancel all together in just 1
  dedicated thread - use server side blocking thread for that;
- cancel blocked locks first so that waiting locks could proceed;
- take care about blocked waiting locks, so that they would get
  cancelled quickly too;
- do not remove lock from waiting list on AST error before moving
  it to elt_expired_locks list, because it removes it from export
  list too; otherwise this blocked lock will not be cancelled
  immediately on failed export;
- cancel lock instead of just destroy for failed export, to make
  full cleanup, i.e. remove it from export list.

Also enforce the proper order of events on umount:
- disconnect export;
- cleanup namespace, to cancel all the locks before export barrier;
- exports barrier;
- lprocfs_free_per_client_stats (requires nid_exp_ref_count == 0);
- namespace_free_post is left in cleanup to ensure we will not get a
  segfault on an absent namespace.

Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Change-Id: Ia39b09ce967237ed5078c8a71e760b1e103c6f55
Xyratex-bug-id: MRP-395 MRP-1366 MRP-1366
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <Alexey_Lyashkov@xyratex.com>
Tested-by: Elena Gryaznova <Elena_Gryaznova@xyratex.com>
Reviewed-on: http://review.whamcloud.com/5843
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6802 recovery: don't skip open replay on reconnect 71/15871/2
Niu Yawei [Thu, 6 Aug 2015 08:14:40 +0000 (04:14 -0400)]
LU-6802 recovery: don't skip open replay on reconnect

If a reconnect happens during replay, continue the open replay
from the last failed replay rather than from the next one.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I164c40db143ca860ab59f60582942614d5fb7925
Reviewed-on: http://review.whamcloud.com/15871
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6929 libcfs: typo in cfs_hash_for_each_relax() 13/15813/3
Niu Yawei [Fri, 31 Jul 2015 02:06:37 +0000 (22:06 -0400)]
LU-6929 libcfs: typo in cfs_hash_for_each_relax()

In cfs_hash_for_each_relax(), rc != 0 means the caller wants to break
the iteration, so we should only continue the iteration when rc
is zero.
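
The rule described above can be sketched in plain C. This is only an
illustration of the "continue while rc is zero" convention, not the
libcfs code; for_each_until() and stop_at_target() are hypothetical
names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: return 0 to keep going, non-zero to stop. */
typedef int (*visit_fn)(int item, void *data);

/*
 * Walk items[0..count) and call fn on each one.  Iteration continues
 * only while the callback returns zero; the first non-zero rc ends the
 * walk.  Returns the number of items visited.
 */
static size_t for_each_until(const int *items, size_t count,
			     visit_fn fn, void *data)
{
	size_t visited = 0;

	assert(items != NULL || count == 0);
	assert(fn != NULL);

	for (size_t i = 0; i < count; i++) {
		int rc = fn(items[i], data);

		visited++;
		if (rc != 0)	/* caller asked to break the iteration */
			break;
	}
	return visited;
}

/* Example callback: stop as soon as the target value is seen. */
static int stop_at_target(int item, void *data)
{
	return item == *(int *)data;
}
```

Inverting the test (continuing only when rc is non-zero) would stop after
the first item for any callback that returns 0, which is the kind of
one-character mistake the commit subject calls a typo.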

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9666cd26ebe93627009bd03b7bbd341a65beaddf
Reviewed-on: http://review.whamcloud.com/15813
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6122 lnet: Allocate the correct number of rtr buffers 19/13519/11
Amir Shehata [Fri, 23 Jan 2015 23:27:22 +0000 (15:27 -0800)]
LU-6122 lnet: Allocate the correct number of rtr buffers

This patch ensures that the correct number of router buffers are
allocated.  It keeps a count that keeps track of the number of
buffers allocated.  Another count keeps the number of buffers
requested. The number of buffers allocated is set when creating
new buffers and reduced when buffers are freed.

The number of requested buffers is set when the buffers are
allocated and is checked when credits are returned to determine
whether the buffer should be freed or kept.

In lnet_rtrpool_adjust_bufs() grab lnet_net_lock() before using
rbp_nbuffers to ensure that it isn't changed by
lnet_return_rx_credits_locked() during the process of allocating
new buffers.  All other access to rbp_nbuffers is already
protected by lnet_net_lock().

This avoids the case where we allocate fewer than the desired
number of buffers.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I96627cc8ba3d3d70a0bf581b21ccd3c9b2de327f
Reviewed-on: http://review.whamcloud.com/13519
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6455 mdt: disable IMA support 28/14928/3
Hongchao Zhang [Thu, 23 Apr 2015 21:04:04 +0000 (05:04 +0800)]
LU-6455 mdt: disable IMA support

In IMA (Integrity Measurement Architecture), the two xattrs
"security.ima" and "security.evm" protect a file from being modified
accidentally or maliciously. These two xattrs are not compatible with
VBR, so disable IMA support for now as a workaround and enable
it when the conditions are ready.

Change-Id: Ie3e30dcb0d4d605a17d301c6cda14818af40d7b0
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/14928
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago New tag 2.7.58 2.7.58 v2_7_58 v2_7_58_0
Oleg Drokin [Mon, 17 Aug 2015 23:21:37 +0000 (19:21 -0400)]
New tag 2.7.58

Change-Id: Ia373f99ab02b0ed713e7cc762782fd7ef2d6405d

8 years ago LU-6846 osd: reset do_body_ops after creation 95/15595/4
wang di [Tue, 21 Jul 2015 08:26:31 +0000 (01:26 -0700)]
LU-6846 osd: reset do_body_ops after creation

Reset the do_body_ops from osd_body_ops_new to
osd_body_ops after object creation succeeds.
Otherwise, in the OUT handler, if creating the llog object
and writing records to it happen in the same
transaction (one update handling process), the object
will not have the right do_body_ops in the write update.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Id45f1a29572faadc381194f45a980eb58d193e93
Reviewed-on: http://review.whamcloud.com/15595
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6742 osd: remove legacy code 35/15335/7
Alex Zhuravlev [Thu, 18 Jun 2015 05:38:03 +0000 (08:38 +0300)]
LU-6742 osd: remove legacy code

osd_convert_root_to_new_seq() was introduced to convert
filesystems created with pre-production Orion code.
We don't need this anymore.

Change-Id: Ib5657fc7a905bebacd4074878cc76365e1e3d8d9
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/15335
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
8 years ago LU-7004 utils: defer conf_param deprecation 79/15979/2
Andreas Dilger [Thu, 13 Aug 2015 19:12:49 +0000 (13:12 -0600)]
LU-7004 utils: defer conf_param deprecation

Since there has been relatively little testing of "lctl set_param -P"
and the test scripts and documentation have not been converted from
using "lctl conf_param" yet, I'm deferring the conf_param deprecation
warning to 2.8.53 (essentially 2.9.0) to give time for this change.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I6e879aaf5f9c9e1434f52a7274c663096ceb35b5
Reviewed-on: http://review.whamcloud.com/15979
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
8 years ago LU-6905 osp: rename ost_server(conn)_uuid 25/15725/4
wang di [Fri, 24 Jul 2015 11:26:58 +0000 (04:26 -0700)]
LU-6905 osp: rename ost_server(conn)_uuid

For OSP to MDT, rename ost_server(conn)_uuid (under /proc)
to mdt_server(conn)_uuid.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Iae794b3fd9fd9c21e5b6f04f406ab5e504aab1eb
Reviewed-on: http://review.whamcloud.com/15725
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7005 tests: disable conf-sanity test_50i 80/15980/2
Andreas Dilger [Thu, 13 Aug 2015 19:24:03 +0000 (13:24 -0600)]
LU-7005 tests: disable conf-sanity test_50i

Skip new test for permanently disabling MDT because it is failing
about 50% of the time.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I39f71aa5fa880f95f024f74fbca6ef261a6e4dcd
Reviewed-on: http://review.whamcloud.com/15980

8 years ago LU-6924 ptlrpc: replay bulk request 93/15793/5
wang di [Tue, 28 Jul 2015 08:16:52 +0000 (01:16 -0700)]
LU-6924 ptlrpc: replay bulk request

Even though the server might already have received the bulk
replay request, the bulk transfer may have timed out, so
replay the bulk request, i.e. treat such a replay the
same as a replay request with no reply (see
ptlrpc_replay_interpret()).

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I1f71eacc3a68941c00f16c9628342c662e7fe181
Reviewed-on: http://review.whamcloud.com/15793
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6881 update: add lock to protect commit check 90/15690/3
wang di [Tue, 21 Jul 2015 13:16:21 +0000 (06:16 -0700)]
LU-6881 update: add lock to protect commit check

In sub_trans_commit_cb(), the commit check should
be protected by a lock; otherwise, in some racy
scenarios, "all committed" will never be true,
even though all sub transactions have been committed.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I8f43ca8083753ab6eef4f2be56ef77bb8640bb79
Reviewed-on: http://review.whamcloud.com/15690
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6586 mgs: deactive MDT permanently 47/14747/11
wang di [Thu, 7 May 2015 13:40:51 +0000 (06:40 -0700)]
LU-6586 mgs: deactive MDT permanently

Deactivate/activate MDT in the config log, so
the user can permanently deactivate one MDT.

Add active proc entry for MDC, and mark MDC to
be inactive once the MDC is deactivated.

Add conf-sanity.sh 50i to verify activate
and deactivate MDT.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ia8d18f48f12fc180c32da777bde0902a774ddf93
Reviewed-on: http://review.whamcloud.com/14747
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6770 osc: use global osc_rq_pool to reduce memory usage 22/15422/13
Li Xi [Mon, 13 Jul 2015 14:29:54 +0000 (22:29 +0800)]
LU-6770 osc: use global osc_rq_pool to reduce memory usage

The per-osc request pools consume a lot of memory if there are
hundreds of OSCs on one client. This will be a critical problem
if the client doesn't have sufficient memory for both OSCs and
applications.

This patch replaces per-osc request pools with a global pool
osc_rq_pool. The total memory usage is 5MB by default. And it
can be set by a module parameter of OSC:
"options osc osc_reqpool_mem_max=POOL_SIZE". The unit of POOL_SIZE
is MB. If cl_max_rpcs_in_flight is the same for all OSCs, the
memory usage of the OSC pool can be calculated as:
Min(POOL_SIZE * 1M,
    (cl_max_rpcs_in_flight + 2) * OSC number * OST_IO_MAXREQSIZE)
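
As a worked illustration of the formula above, the following sketch
computes the estimate in bytes. It assumes "1M" means 2^20 bytes, and it
takes the request size as a parameter because the value of
OST_IO_MAXREQSIZE is not given here; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate pool memory in bytes:
 *   min(pool_size_mb * 1 MiB,
 *       (max_rpcs_in_flight + 2) * osc_count * max_req_size)
 * max_req_size stands in for OST_IO_MAXREQSIZE and is supplied by the
 * caller (assumption: its real value is not stated in this log).
 */
static uint64_t osc_pool_mem_bytes(uint64_t pool_size_mb,
				   uint64_t max_rpcs_in_flight,
				   uint64_t osc_count,
				   uint64_t max_req_size)
{
	uint64_t cap;
	uint64_t need;

	assert(max_req_size > 0);

	cap = pool_size_mb * 1024 * 1024;
	need = (max_rpcs_in_flight + 2) * osc_count * max_req_size;

	return need < cap ? need : cap;
}
```

With an illustrative request size of 16384 bytes, 8 RPCs in flight and
100 OSCs, the second term is 10 * 100 * 16384 = 16384000 bytes, so the
default 5 MB cap (5242880 bytes) is the limit that applies.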

Also, this patch changes the allocation logic of OSC write requests.
Allocation from osc_rq_pool will only be tried after normal
allocation fails.

Signed-off-by: Wu Libin <lwu@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I1b0c522ade01dba11d860ab57f83af53619ce4ba
Reviewed-on: http://review.whamcloud.com/15422
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6765 obdecho: don't copy lu_site 57/15657/4
Olaf Faaland [Mon, 20 Jul 2015 23:57:06 +0000 (16:57 -0700)]
LU-6765 obdecho: don't copy lu_site

While creating an echo device, echo_device_alloc() copies the lu_site
from the MD stack. Such a copy results in an uninitialized mutex and
other potential issues.

Instead of copying the lu_site, use the lu_site directly by pointer.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs \
clientdistro=el6.6 ossdistro=el6.6 mdsdistro=el6.6 \
mdtcount=1 testlist=mds-survey

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
clientdistro=el6.6 ossdistro=el6.6 mdsdistro=el6.6 \
mdtcount=1 testlist=mds-survey

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I00bd1a211cf0d556e427ba2f281fbcb1940d41f3
Reviewed-on: http://review.whamcloud.com/15657
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6714 llog: fix the llog_cat_set_first_idx() 41/15841/4
Mikhail Pershin [Sat, 1 Aug 2015 16:10:58 +0000 (19:10 +0300)]
LU-6714 llog: fix the llog_cat_set_first_idx()

A bug introduced in the previous commit caused the
wrong catalog index to be set.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs clientdistro=el7 ossdistro=el6.6 mdsdistro=el6.6 mdtcount=1 testlist=performance-sanity
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Iaa0747c5f504109f3a265c7cdda91ebe7ba83b4f
Reviewed-on: http://review.whamcloud.com/15841
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6090 lnet: save errno between function calls 83/15783/5
Amir Shehata [Tue, 28 Jul 2015 22:55:05 +0000 (15:55 -0700)]
LU-6090 lnet: save errno between function calls

errno can be overwritten by functions like snprintf even when
they succeed, so save the errno when needed.
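
The pattern can be sketched as follows. This is a generic POSIX example
and not the code from the patch; open_and_report() is a hypothetical
helper written for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to open a file and, on failure, format an error message.  errno
 * is copied into a local right after the failing call, so that later
 * library calls (snprintf here) cannot clobber the value we return.
 */
static int open_and_report(const char *path, char *msg, size_t msg_len)
{
	int saved_errno;
	int fd;

	assert(path != NULL && msg != NULL && msg_len > 0);

	fd = open(path, O_RDONLY);
	if (fd >= 0) {
		close(fd);
		msg[0] = '\0';
		return 0;
	}

	saved_errno = errno;	/* capture before any other library call */
	snprintf(msg, msg_len, "cannot open %s: error %d", path, saved_errno);

	return -saved_errno;	/* not -errno, which may have changed */
}
```

Returning -errno after the snprintf() call would usually work, which is
why this class of bug tends to go unnoticed until a library call in
between happens to modify errno.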

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I9a47267851f7c102727f8653a3a3727047f33186
Reviewed-on: http://review.whamcloud.com/15783
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6339 lnet: correct lnet script Usage help 80/15780/2
Amir Shehata [Tue, 28 Jul 2015 20:34:08 +0000 (13:34 -0700)]
LU-6339 lnet: correct lnet script Usage help

Changed Usage message to use lnet rather than Lustre

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I9357fded89e9e33ccac63ee49620bd1c7e69f454
Reviewed-on: http://review.whamcloud.com/15780
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6891 test: fix grow_xattr() defect 72/15672/3
Elena Gryaznova [Tue, 21 Jul 2015 18:37:14 +0000 (21:37 +0300)]
LU-6891 test: fix grow_xattr() defect

The string needs to be quoted to avoid a syntax error on
a DNE setup.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Seagate-bug-id: MRP-2762
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: Ibdb383f2581fbe4bd9e92f9d89a98bfe14209f23
Reviewed-on: http://review.whamcloud.com/15672
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6791 build: Update ZFS/SPL version to 0.6.4.2 81/15481/4
Nathaniel Clark [Thu, 2 Jul 2015 21:05:43 +0000 (17:05 -0400)]
LU-6791 build: Update ZFS/SPL version to 0.6.4.2

Updates ZFS and SPL to the latest maintenance version.  Includes the
following:

Bug Fixes:
* Fix panic due to corrupt nvlist when running utilities
(zfsonlinux/zfs#3335)
* Fix hard lockup due to infinite loop in zfs_zget()
(zfsonlinux/zfs#3349)
* Fix panic on unmount due to iput taskq (zfsonlinux/zfs#3281)
* Improve metadata shrinker performance on pre-3.1 kernels
(zfsonlinux/zfs#3501)
* Linux 4.1 compat: use read_iter() / write_iter()
* Linux 3.12 compat: NUMA-aware per-superblock shrinker
* Fix spurious hung task watchdog stack traces (zfsonlinux/zfs#3402)
* Fix module loading in zfs import systemd service
(zfsonlinux/zfs#3440)
* Fix intermittent libzfs_init() failure to open /dev/zfs
(zfsonlinux/zfs#2556)

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I053087317ff9e5bedc1671bb46062e96bfe6f074
Reviewed-on: http://review.whamcloud.com/15481
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6450 mdd: Remove unused MAY_ constants 98/15398/4
Ben Evans [Thu, 25 Jun 2015 19:00:39 +0000 (12:00 -0700)]
LU-6450 mdd: Remove unused MAY_ constants

Remove unused MAY_ constants from lustre_idl.h
Remove dead code from mdd_permission()

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I722f55eac88c7143555e680f72971b96791b00ae
Reviewed-on: http://review.whamcloud.com/15398
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-5319 utils: update lr_reader to display additional data 62/14862/10
Gregoire Pichon [Mon, 28 Jul 2014 12:16:33 +0000 (14:16 +0200)]
LU-5319 utils: update lr_reader to display additional data

The lr_reader command currently displays only the server area
of the LAST_RCVD file of a Lustre ldiskfs target device.

This patch enhances the command to allow displaying:
- the per-client area of the LAST_RCVD file
  when --client or -c option is specified
- the reply data of the REPLY_DATA file
  when --reply or -r option is specified
It still supports only ldiskfs target devices.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: Ifc2b7f9591eef0da5e4a5dc00f37e58484d4ea97
Reviewed-on: http://review.whamcloud.com/14862
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-2261 test: Re-enable sanity test 156 for zfs 35/13535/4
James Simmons [Tue, 27 Jan 2015 18:44:22 +0000 (13:44 -0500)]
LU-2261 test: Re-enable sanity test 156 for zfs

With the landing of LU-4259, zfs now supports brw
stats like ldiskfs. Earlier, sanity test 156 was
disabled for zfs, but now it can be re-enabled so
we can actually verify that brw stats on zfs work.

Apply additional clean up to test 156: Spaces to tabs.

Change-Id: I0a0014bee4c666faadce96177c44953573d85ffd
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13535
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-5246 tests: create OST objects on correct MDT 37/12937/8
Andreas Dilger [Thu, 4 Dec 2014 19:35:00 +0000 (03:35 +0800)]
LU-5246 tests: create OST objects on correct MDT

The sanity.sh test_220 is checking that precreated objects on the MDT
can be used even when the OST has subsequently run out of inodes.
The test did not work correctly if the directory was striped over
multiple MDTs, since the number of precreated objects on other MDTs
may be fewer than the number of precreated objects on $SINGLEMDS, so
the test could run out of objects.

Change the test to only create files on $SINGLEMDS to avoid this.

Also fix the return code of lod_alloc_specific() to be -ENOSPC if
the stripe count is 1, since -EFBIG doesn't make sense in that case.
Fix other code comments for lod_alloc_specific() and lod_alloc_qos().

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I788706c3b212b426aa15424d02913aedcdb98b2a
Reviewed-on: http://review.whamcloud.com/12937
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6215 llite: handle new_sync_[read|write] removal 65/15165/5
James Simmons [Thu, 30 Jul 2015 16:04:41 +0000 (12:04 -0400)]
LU-6215 llite: handle new_sync_[read|write] removal

In newer Linux kernel versions the file read/write
API has been moving to struct iov_iter. To continue
supporting the old API, new wrapper functions,
new_sync_*, were created, which allowed the new API
to be used to support the old one. As of the Linux
4.1 kernel the new_sync_* functions are used only
internally, so they are no longer exposed to file
systems. A file system that wants to use the "old"
API needs to call vfs_[write|read]. Update Lustre
to handle this change. The upstream commit that
made new_sync_* internal was:

Linux commit: 5d5d568975307877e9195f5305f4240e506a2807

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: Ibf69d25af8a7d5cda00c5b1be4757ec369e8e814
Reviewed-on: http://review.whamcloud.com/15165
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6365 obd: Eliminate hash bucket scans in lu_cache_shrink 66/14066/3
Ann Koehler [Tue, 14 Jul 2015 21:52:26 +0000 (16:52 -0500)]
LU-6365 obd: Eliminate hash bucket scans in lu_cache_shrink

The lu_cache_shrink slab shrinker is too slow, accounting for > 90% of
the time spent in shrink_slab when allocating huge pages. Most of its
time is spent iterating over the buckets in each site's object hash
table to compute the number of freeable objects. This iteration is
eliminated by adding an lru length count to the lu_site struct. A
percpu counter is used to maintain the lru length, so that the
lu_site does not need to be locked when an object is accessed through
the hash table. A counter is updated whenever an object is added to
or deleted from any of the hash table buckets. The number of freeable
objects is the sum of the counter values across all cpus.

Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: I5c70dd8e8cc1fe024292076e118774d9299bf40b
Reviewed-on: http://review.whamcloud.com/14066
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-5554 ptlrpc: race at req processing 35/10735/7
Alexander.Boyko [Tue, 17 Jun 2014 08:22:55 +0000 (12:22 +0400)]
LU-5554 ptlrpc: race at req processing

The first commit, 56e5a6bb4fc0329092f85016b8664da0b6ffa8a8,
narrows the race window but does not remove it. Disable rq_resend
right after the MSG_REPLAY flag is set. The import lock protects two
threads from racing between set/clear of the MSG_REPLAY and rq_resend
flags.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Xyratex-bug-id: MRP-1888
Change-Id: I64d7c997976ff73956ebe97d7ea0775e0a54bc50
Reviewed-on: http://review.whamcloud.com/10735
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6896 llog: update log initialization needs to be sync 21/15721/3
wang di [Thu, 23 Jul 2015 11:24:30 +0000 (04:24 -0700)]
LU-6896 llog: update log initialization needs to be sync

During initialization, update llog catlist and cat file creation
needs to be synchronized, because the following update records
will be written to these files. If these files are missing during
update recovery, then update recovery will fail.

Because the update llog is initialized in a separate thread, it should
not set loc_handle until the catlog is written to catlist, to make sure
the update llog is fully initialized before the MDT can handle cross-MDT
operations. See lod_obd_get_info() for the check of the health of the
target.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Iccd09749bedde46c01ff89b5be4fa6f4327ec9b9
Reviewed-on: http://review.whamcloud.com/15721
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6972 llite: get rid of unused ll_super_blocks list 22/15922/2
Oleg Drokin [Sun, 9 Aug 2015 00:02:06 +0000 (20:02 -0400)]
LU-6972 llite: get rid of unused ll_super_blocks list

ll_super_blocks became unused quite a while ago with the switch
to the new CLIO code.
So this patch removes the list, the ll_sb_lock spinlock that guards it,
and the superblock info ll_list linkage.

Change-Id: If0af4f4c05f45feaedc54ca57f0d1106ad7cabe3
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/15922
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
8 years ago LU-6831 lmv: revalidate the dentry for striped dir 20/15720/2
wang di [Thu, 23 Jul 2015 09:24:47 +0000 (02:24 -0700)]
LU-6831 lmv: revalidate the dentry for striped dir

If there is a bad stripe during striped dir revalidation,
most likely due to the race between close(unlink) and
getattr, then revalidate the dentry instead of
returning an error, as for a normal directory.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I410940473d123796f75607d39153240b7d3f737b
Reviewed-on: http://review.whamcloud.com/15720
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6124 test: skip tests require remote server with nodsh set 06/13406/3
Elena Gryaznova [Thu, 16 Jul 2015 16:40:41 +0000 (19:40 +0300)]
LU-6124 test: skip tests require remote server with nodsh set

Patch fixes the following tests to be skipped for remote
servers with nodsh set:
- sanity 116a, 116b, 160c, 205, 220, 225a, 225b, 230c;
- sanityn 34, 40a, 40b, 40c, 40d, 40e.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-1720
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Change-Id: I5567e752f0d05c2447de595cb72c754e06e2fe83
Reviewed-on: http://review.whamcloud.com/13406
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-3163 tests: Fix test_61 from conf-sanity test 86/15486/4
Ashish Purkar [Fri, 3 Jul 2015 11:45:19 +0000 (17:15 +0530)]
LU-3163 tests: Fix test_61 from conf-sanity test

When large xattr is enabled and fstype is ldiskfs,
use tune2fs to enable/disable large_xattr.

Also use $TUNE2FS instead of tune2fs in all places.

Seagate-bug-id: MRP-1191
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Change-Id: I86605d980343a234cd6f4e2101cac409e017af62
Reviewed-on: http://review.whamcloud.com/15486
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6126 test: skip sanity test_187a, test_187b on old MDS 08/13408/3
Elena Gryaznova [Wed, 14 Jan 2015 23:21:21 +0000 (03:21 +0400)]
LU-6126 test: skip sanity test_187a, test_187b on old MDS

Don't run sanity test_187a, test_187b in interop mode for
MDS versions older than 2.3.0 (the per-file data_version
implementation was included in 2.3, LU-827).

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-1826
Reviewed-by: Vladimir Saveliev <vladimir_saveliev@xyratex.com>
Change-Id: I6d42b51c769be80486c84fe436f6f3f71616e1cb
Reviewed-on: http://review.whamcloud.com/13408
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6490 gss: handle struct key_type match replacement 04/15804/3
James Simmons [Mon, 3 Aug 2015 14:36:16 +0000 (10:36 -0400)]
LU-6490 gss: handle struct key_type match replacement

Starting with the 3.17 kernel, the match function
used by struct key_type was replaced by a new
function, match_preparse. This patch adds support
for this change.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I2db0860cf089b0ba8f70bad9155f48e52ec8c4b5
Reviewed-on: http://review.whamcloud.com/15804
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6096 ldiskfs: mark dir's inode dirty 81/15581/5
Alex Zhuravlev [Mon, 13 Jul 2015 12:04:08 +0000 (15:04 +0300)]
LU-6096 ldiskfs: mark dir's inode dirty

Mark the directory's inode dirty in ldiskfs_add_dot_dotdot(),
as ldiskfs_init_new_dir() doesn't do this internally.
Otherwise, the attributes set by ldiskfs_append() aren't
transferred to the buffer cache.

Change-Id: Ice93b75e291736f83c0d9b1047f7a627fef0141e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/15581
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6938 osd: fix error handling in osd_xattr_list() 28/15828/2
Alex Zhuravlev [Sat, 1 Aug 2015 09:28:23 +0000 (12:28 +0300)]
LU-6938 osd: fix error handling in osd_xattr_list()

In case of buffer overflow, osd_xattr_list() should exit
cleanly, releasing the semaphore and the cursor.

Change-Id: I750183cb083cb87b3d8adae8fb0f41b61e6689e5
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/15828
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
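
The LU-6938 fix above concerns an early-exit path that skipped cleanup.
The general pattern can be sketched in C as a single exit label that
every path, including the overflow path, goes through. This is only an
illustration, not the osd_xattr_list() code: struct res, list_names()
and the two flags are hypothetical stand-ins for the semaphore and the
cursor.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the semaphore and the cursor. */
struct res {
	int locked;
	int cursor_open;
};

/*
 * Copy n NUL-terminated names into buf (size bytes), back to back.
 * Returns the number of bytes used, or -ERANGE if buf is too small.
 * Every return path goes through "out", so the resources are always
 * released, including on overflow.
 */
int list_names(struct res *r, const char *const *names, int n,
	       char *buf, size_t size)
{
	size_t used = 0;
	int i, rc;

	r->locked = 1;		/* "take the semaphore" */
	r->cursor_open = 1;	/* "open the cursor" */

	for (i = 0; i < n; i++) {
		size_t len = strlen(names[i]) + 1;

		if (len > size - used) {
			rc = -ERANGE;	/* overflow: do not return here */
			goto out;
		}
		memcpy(buf + used, names[i], len);
		used += len;
	}
	rc = (int)used;
out:
	r->cursor_open = 0;	/* "release the cursor" */
	r->locked = 0;		/* "release the semaphore" */
	return rc;
}
```

A direct "return -ERANGE" inside the loop would reproduce the class of
bug described: the flags would stay set after the function returned.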
8 years agoLU-6542 lnet: handle cYAML out of memory error 78/15778/3
Amir Shehata [Tue, 28 Jul 2015 20:19:40 +0000 (13:19 -0700)]
LU-6542 lnet: handle cYAML out of memory error

In cYAML_build_error() a potential out-of-memory condition
is ignored and no error is built.
In this case, print a properly formatted YAML error message
indicating a fatal error.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Id9159044acafb38bd022f421a997fa2d2f1841e8
Reviewed-on: http://review.whamcloud.com/15778
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6906 mgc: IR log failure should not stop mount 28/15728/3
wang di [Sat, 25 Jul 2015 10:09:36 +0000 (03:09 -0700)]
LU-6906 mgc: IR log failure should not stop mount

If clients or other targets cannot get the IR config log
or lock, the mount should continue instead of failing,
because the timeout mechanism will handle the recovery anyway.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ie717cb363180907b510593ee4a6caaec6f07a5f3
Reviewed-on: http://review.whamcloud.com/15728
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6544 mkfs: Improve MDT inode size calculation 43/15643/9
Emoly Liu [Wed, 29 Jul 2015 08:15:45 +0000 (16:15 +0800)]
LU-6544 mkfs: Improve MDT inode size calculation

This patch reduces mkfs.lustre "--stripe-count-hint" limits when
calculating MDT inode size, so that there is a proper amount of
space reserved for different kinds of EAs and ACL in the inode.

This would allow files with N stripes to fit the lov, lma, and link
EAs into the inode rather than storing them in an external xattr
block, which reduces performance and significantly increases the
space used per file on the MDT.

Also, this patch adds conf-sanity.sh test_87 to verify this
calculation with different stripe counts.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Idfdd9a064f0ac07c383a3af79c61c1ff973fb3f7
Reviewed-on: http://review.whamcloud.com/15643
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6904 mdd: prepare linkea before declare 24/15724/6
wang di [Fri, 24 Jul 2015 11:08:59 +0000 (04:08 -0700)]
LU-6904 mdd: prepare linkea before declare

mdd_rename should prepare the linkea before the journal
starts; otherwise, it may send an RPC to another MDT while
holding the journal (see mdd_prepare_linkea). If the
other MDT is in recovery, this thread will be
blocked with the journal handle started, and it will
in turn block other journal threads.

For the same reason, mdd_rename_order, which sends an
RPC to the remote MDT (mdd_is_parent()), should also
be moved ahead of mdd_trans_start().

Because mdd_sobj cannot be NULL in mdd_rename, this
patch also cleans up this API a bit.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Id2426ef5d0d83c704403aa3f762596284576df6d
Reviewed-on: http://review.whamcloud.com/15724
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6874 out: missing noop in out_update_ops 92/15692/2
wang di [Tue, 21 Jul 2015 13:37:19 +0000 (06:37 -0700)]
LU-6874 out: missing noop in out_update_ops

NOOP is missing from out_update_ops. This NOOP
operation will only exist inside the update log, i.e.
out_handler should never handle a noop update, but
if we add a new update to the out_update_ops table, the
missing noop might cause problems.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I967ffefcd1499801fa1890ed8d97ce70190c1902
Reviewed-on: http://review.whamcloud.com/15692
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6490 build: enable gss build on sles12 54/15354/2
Bob Glossman [Fri, 19 Jun 2015 15:09:06 +0000 (08:09 -0700)]
LU-6490 build: enable gss build on sles12

Now that the build failures of gss on sles12 are fixed
change lbuild to stop disabling the gss build there.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Icf25c038c28b8e230550de0bbe5b0b2fbe0c0b78
Reviewed-on: http://review.whamcloud.com/15354
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6490 gss: 3.1x kernels adjustments for gssapi code 42/15342/9
Sebastien Buisson [Thu, 18 Jun 2015 16:15:47 +0000 (18:15 +0200)]
LU-6490 gss: 3.1x kernels adjustments for gssapi code

There are a number of changes in 3.1x kernels concerning the GSSAPI:
- libgssapi and libgssglue do not exist anymore, so call krb5
  primitives directly, and remove associated config checks;
- struct cred has no tgcred member anymore, so use cred directly;
- struct key_type instantiate and update function prototypes
  have changed;
- add new config checks on struct cred and struct key_type;
- u_int is BSD specific, so it is replaced with unsigned int.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I3b13c2afcb4b800bdcffb3b8713048f8e39f6866
Reviewed-on: http://review.whamcloud.com/15342
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6356 ptlrpc: add a 'enc_pool_max_memory_mb' module param 69/15069/4
Sebastien Buisson [Fri, 29 May 2015 12:22:55 +0000 (14:22 +0200)]
LU-6356 ptlrpc: add a 'enc_pool_max_memory_mb' module param

Create a new 'enc_pool_max_memory_mb' module param for ptlrpc.
It is used to set max memory used by encoding pools, in MB.
By default encoding pools use at most 1/8 of total physical
memory, but this appears to be too little under some circumstances.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: If431ba082eec4bc2c4f2131d4b6b5b7028a3a569
Reviewed-on: http://review.whamcloud.com/15069
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5888 utils: limit max_sectors_kb tunable setting 40/13240/12
Andreas Dilger [Wed, 1 Apr 2015 16:35:29 +0000 (12:35 -0400)]
LU-5888 utils: limit max_sectors_kb tunable setting

Properly limit the max_sectors_kb tunable if larger than 32MB, as
added in commit 73bd456e896 (http://review.whamcloud.com/12723).
It was using the old value in "buf" instead of the updated "newval".

Change write_file() to use a file descriptor directly instead of
streams, since it only writes a single value before closing again.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I43a5b4c9d0e45649e0666fd58634286de53ebbe5
Reviewed-on: http://review.whamcloud.com/13240
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
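
The second paragraph of LU-5888 above describes writing one value
through a file descriptor rather than a stdio stream. A minimal sketch
of that idea follows. It is not the patched write_file(): the name
write_value(), its return convention and the error mapping are
assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a single string value to an existing
 * file (for example a sysfs tunable) and close it again.
 * Returns 0 on success or a positive errno value on failure.
 */
int write_value(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t rc;
	int err = 0;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno;

	rc = write(fd, value, len);
	if (rc < 0)
		err = errno;
	else if ((size_t)rc != len)
		err = EIO;	/* short write: treat as an I/O error */

	if (close(fd) < 0 && err == 0)
		err = errno;

	return err;
}
```

With a single write() there is no stream buffer to flush, so a failure
is reported by the write itself rather than later at close time.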
8 years agoLU-6916 target: fix ted_lcd_lock init 70/15770/2
wang di [Sun, 26 Jul 2015 14:42:36 +0000 (07:42 -0700)]
LU-6916 target: fix ted_lcd_lock init

Move ted_lcd_lock init from tgt_client_new()
to tgt_client_alloc(); otherwise, if an error happens
in client_new_export(), the error handling path
will cause a panic like this:

<1>BUG: unable to handle kernel NULL pointer dereference at (null)
<1>IP: [<ffffffff812957f6>] __list_add+0x26/0xa0
<4> [<ffffffff8152adef>] __mutex_lock_slowpath+0xcf/0x180
<4> [<ffffffffa047e087>] ? cfs_hash_putref+0x2e7/0x480 [libcfs]
<4> [<ffffffff8152acfb>] mutex_lock+0x2b/0x50
<4> [<ffffffffa083db72>] tgt_client_free+0x62/0x610 [ptlrpc]
<4> [<ffffffffa0f59f84>] mdt_destroy_export+0x74/0x220 [mdt]
<4> [<ffffffffa05862c5>] class_new_export+0x315/0x9c0 [obdclass]
<4> [<ffffffffa058745e>] class_connect+0xae/0x250 [obdclass]
<4> [<ffffffffa0f57fd1>] mdt_obd_connect+0xb1/0x720 [mdt]
<4> [<ffffffffa07aaf78>] target_handle_connect+0xe58/0x2d30 [ptlrpc]
<4> [<ffffffff8128c1f0>] ? string+0x40/0x100
<4> [<ffffffffa084edf2>] tgt_request_handle+0x5b2/0x1230 [ptlrpc]
<4> [<ffffffffa07f7611>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
<4> [<ffffffffa07f67d0>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
<4> [<ffffffff8109abf6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: If09c019384f53a63107aeaa293ef93d6415f81b5
Reviewed-on: http://review.whamcloud.com/15770
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
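
The LU-6916 fix above moves a lock's initialisation to allocation time
so that the teardown path can always take it. A user-space sketch of
the same ordering rule, using a pthread mutex, might look like this.
The names client_alloc(), client_setup() and client_free() and the
struct layout are hypothetical and do not mirror the Lustre target
code.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-client data protected by a mutex. */
struct client_data {
	pthread_mutex_t lock;
	int slot;
};

/* Initialise the lock at allocation time, before anything can fail. */
struct client_data *client_alloc(void)
{
	struct client_data *cd = calloc(1, sizeof(*cd));

	if (cd == NULL)
		return NULL;
	if (pthread_mutex_init(&cd->lock, NULL) != 0) {
		free(cd);
		return NULL;
	}
	cd->slot = -1;
	return cd;
}

/* A later setup step that may fail; it no longer owns lock init. */
int client_setup(struct client_data *cd, int slot)
{
	if (slot < 0)
		return -1;
	pthread_mutex_lock(&cd->lock);
	cd->slot = slot;
	pthread_mutex_unlock(&cd->lock);
	return 0;
}

/* Teardown takes the lock, so it must be valid even if setup failed. */
void client_free(struct client_data *cd)
{
	pthread_mutex_lock(&cd->lock);
	cd->slot = -1;
	pthread_mutex_unlock(&cd->lock);
	pthread_mutex_destroy(&cd->lock);
	free(cd);
}
```

If the mutex were initialised inside client_setup() instead, a failure
before that point would leave client_free() locking an uninitialised
mutex, which is the user-space analogue of the panic in the trace.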
8 years agoLU-6827 osd: trans credit insufficient for EA object create 94/15694/5
Bobi Jam [Thu, 23 Jul 2015 07:16:08 +0000 (15:16 +0800)]
LU-6827 osd: trans credit insufficient for EA object create

An EA object consumes more credits than a regular object
(osd_mk_index vs. osd_mkreg), so this patch reserves more credits
for the OSD_OT_CREATE operation.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I40ec58ca467474ec4d96e5a1d24164fae03fc227
Reviewed-on: http://review.whamcloud.com/15694
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6768 osd: unmap reallocated blocks 93/15593/5
Alex Zhuravlev [Mon, 13 Jul 2015 20:36:10 +0000 (23:36 +0300)]
LU-6768 osd: unmap reallocated blocks

Call unmap_underlying_metadata() on every allocated
block. Otherwise, metadata blocks released in a previous
transaction can be written by the kernel, corrupting
user data. pblock != 0 should be a good sign of that.

Change-Id: I0d06611feb384a3f7ef5d8e5b34951822369ed0f
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/15593
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6427 osd: race between destroy and load LMV 46/14346/17
wang di [Wed, 1 Apr 2015 19:57:57 +0000 (12:57 -0700)]
LU-6427 osd: race between destroy and load LMV

Do not get the LMVEA if the object has been destroyed,
because that requires iterating the master object, which
is dangerous for ZFS OSD.

Add an oo_destroyed flag to avoid accessing a destroyed
directory.

Add sanityn.sh test 83 to verify that accessing a destroyed
striped directory does not cause a panic.

Signed-off-by: wang di <di.wang@intel.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1ef6bf197eac0917fd649868235c96af2f64207c
Reviewed-on: http://review.whamcloud.com/14346
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6502 lnet: remove unnecessary NULL check 79/15779/2
Amir Shehata [Tue, 28 Jul 2015 20:29:18 +0000 (13:29 -0700)]
LU-6502 lnet: remove unnecessary NULL check

In LNetCtl():IOC_LIBCFS_GET_NET there is a check for config == NULL.
This is not necessary, as it will never be NULL; that is ensured
before the call to LNetCtl.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ib4d1ca5ca4a342f134e3102bcaf167e60f978b5f
Reviewed-on: http://review.whamcloud.com/15779
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6833 build: gerrit_checkpatch.py improvements 60/15560/4
John L. Hammond [Fri, 10 Jul 2015 13:51:39 +0000 (08:51 -0500)]
LU-6833 build: gerrit_checkpatch.py improvements

In contrib/scripts/gerrit_checkpatch.py:
  Add LASSERT and LCONSOLE to the list of ignored types.
  Use a 60 second HTTP request timeout to avoid hangs.
  Catch any Exception around HTTP methods.
  Refactor some methods to facilitate unit testing.
  Add timestamps to log messages.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic8ddf815e58744af1cfa5a03986d869b790d91ae
Reviewed-on: http://review.whamcloud.com/15560
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5092 nodemap: add structure to hold nodemap config 54/14254/15
Kit Westneat [Thu, 11 Jun 2015 23:03:25 +0000 (19:03 -0400)]
LU-5092 nodemap: add structure to hold nodemap config

This patch moves global state variables into a configuration
structure so that new configurations can be more easily loaded and
swapped into the active role.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Ib0d51d56154d5e831b13f2935feab9bd73944bcc
Reviewed-on: http://review.whamcloud.com/14254
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6908 lfsck: initalize name before use 27/15727/4
Alex Zhuravlev [Sun, 26 Jul 2015 13:21:34 +0000 (16:21 +0300)]
LU-6908 lfsck: initalize name before use

lfsck_create_lpf() should initialize name before using
that to enqueue a hashed lock.

Change-Id: I32ade3231111e5ce687eff348a0454a98c34d101
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/15727
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6803 gss: add printf format checking to printerr() 51/15751/2
James Simmons [Mon, 27 Jul 2015 18:26:56 +0000 (14:26 -0400)]
LU-6803 gss: add printf format checking to printerr()

Add printf format checking to the gss error reporting function
printerr(). Fix up all errors exposed by this change.

Change-Id: I2b7a2d65cb3ee81b11eb6af45297dad2e6cbb796
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/15751
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
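
The printf format checking mentioned in LU-6803 above is commonly
enabled with the GCC/Clang format attribute. The sketch below shows the
mechanism on a hypothetical logger, log_msg(); it does not reproduce
the signature of printerr(), and the level threshold is an assumption.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * format(printf, 2, 3): argument 2 is the format string and the
 * variadic arguments start at position 3. With -Wall (or -Wformat)
 * the compiler then checks every call site against the format.
 */
int log_msg(int level, const char *fmt, ...)
	__attribute__((format(printf, 2, 3)));

/* Print to stderr when level <= 1; return the byte count written. */
int log_msg(int level, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (level > 1)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place, a call such as log_msg(0, "%s\n", 42)
draws a compile-time warning instead of misbehaving at run time.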
8 years agoLU-6857 tests: TF_FAIL is not initialised 17/15617/2
Elena Gryaznova [Thu, 16 Jul 2015 14:29:44 +0000 (17:29 +0300)]
LU-6857 tests: TF_FAIL is not initialised

TF_FAIL is initialised only in auster, which
leads to the t-f failure "touch: missing file operand"
when the suites are not run under auster control.
DEFAULT_SUITES is initialised only in acceptance-small,
which leads to a garbage summary message being printed.

The patch adds export TF_FAIL to init_test_env() and
skips the useless summary printing when the suites are
not run under acceptance-small control.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2116
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Change-Id: I047aae3b7c6f7d75e98e7862a7f00fd2d75d8b1f
Reviewed-on: http://review.whamcloud.com/15617
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: kgnilnd_check_rdma_cq race in error path. 39/15439/3
Chris Horn [Mon, 27 Jul 2015 13:59:57 +0000 (09:59 -0400)]
LU-6261 gnilnd: kgnilnd_check_rdma_cq race in error path.

It is possible for multiple threads to be processing a
transaction error that causes the connection to close.
If the close path stalls after changing the conn state to CLOSING,
the tx_done path will put a tx in purgatory since the conn state is
no longer ESTABLISHED. When the close path continues, it expects
conn->gnc_mdd_list to be empty and asserts if it isn't.
Change the error code for the tx_done path so that we don't put this
tx in purgatory, since we are closing the connection anyway.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I4458522f16508eb53d380f62320c65c7bf84657a
Reviewed-on: http://review.whamcloud.com/15439
Tested-by: Jenkins
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Use kgnilnd_vzalloc to allocate fma blocks. 38/15438/4
Chris Horn [Mon, 27 Jul 2015 20:03:42 +0000 (16:03 -0400)]
LU-6261 gnilnd: Use kgnilnd_vzalloc to allocate fma blocks.

In low memory situations we may not be able to allocate memory.
vmalloc tries forever to get memory.
Use kgnilnd_vzalloc which uses the GFP_NOFS flag for allocating
memory for fma blocks.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I18f8564424abfc9c63e675ac98ca61487b5a2b34
Reviewed-on: http://review.whamcloud.com/15438
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6460 llite: clear LLIF_FILE_RESTORING when done 09/14609/7
Bruno Faccini [Mon, 27 Apr 2015 09:08:34 +0000 (11:08 +0200)]
LU-6460 llite: clear LLIF_FILE_RESTORING when done

Clear LLIF_FILE_RESTORING once the restore is done, so that
glimpsing new attrs starts again.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I0f2f9532e06965273efa10e3f69e26d00676b8a6
Reviewed-on: http://review.whamcloud.com/14609
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6260 llite: add support for new iter functionality 28/15028/10
James Simmons [Tue, 21 Jul 2015 19:32:34 +0000 (15:32 -0400)]
LU-6260 llite: add support for new iter functionality

For 3.16+ kernels, struct file_operations gained new read
and write methods, read_iter and write_iter, which take a
struct iov_iter parameter since it contains all the iovec
data needed. This avoids having the file system manage
iovec data, such as traversing the iovec page list. Now we
can use the supplied iov_iter helper functions that do this
work for us. In later kernels only this back end is
supported. Backported from the upstream lustre client.

-----------------------------------------------------------------------
Linux-commit: b42b15fdad3ebb790250041d1517acebb9bd56d9

lustre: get rid of messing with iovecs

* switch to ->read_iter/->write_iter
* keep a pointer to iov_iter instead of iov/nr_segs
* do not modify iovecs; use iov_iter_truncate()/iov_iter_advance() and
  a new primitive - iov_iter_reexpand() (expand previously truncated
  iterator) istead.
* (racy) check for lustre VMAs intersecting with iovecs kept for now as
  for_each_iov() loop.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----------------------------------------------------------------------
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I71928beaa7b1c87f3e2c689e1dee96052eaef872
Reviewed-on: http://review.whamcloud.com/15028
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6341 llite: Use ll_dir_getstripe to get default LMVEA 90/13990/11
wang di [Tue, 14 Jul 2015 15:51:56 +0000 (08:51 -0700)]
LU-6341 llite: Use ll_dir_getstripe to get default LMVEA

Use ll_dir_getstripe to get the default stripeEA in ll_new_node(),
because ll_getxattr_common requires admin rights to retrieve the
default LMVEA (because of the trusted- prefix), which might cause
mkdir (from a normal user) to fail.

If the parent does not have a default stripeEA, the child should
always be on the same MDT for mkdir. Otherwise the MDT should return
-EREMOTE, and the client will then refresh the default stripe index
and recreate the object.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I0d8884b78fbc8b8d1930b1133150686e65d20c54
Reviewed-on: http://review.whamcloud.com/13990
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6052 utils: change "lfs mv" to "lfs migrate" 54/13754/14
Lai Siyao [Fri, 13 Feb 2015 08:54:15 +0000 (16:54 +0800)]
LU-6052 utils: change "lfs mv" to "lfs migrate"

"lfs mv" causes some confusion between mv and migration, change
"lfs mv" to "lfs migrate -m", and mark "lfs mv" deprecated.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I72ead14854eb6038ba13fddec81b867bc0542b46
Reviewed-on: http://review.whamcloud.com/13754
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
8 years agoLU-6836 tests: Skip sanity-quota test 4a for ZFS 79/15679/3
James Nunez [Wed, 22 Jul 2015 18:29:51 +0000 (12:29 -0600)]
LU-6836 tests: Skip sanity-quota test 4a for ZFS

sanity-quota test 4a is failing frequently in
review-zfs-part-1 test groups. We are temporarily disabling
this test until a fix is submitted.

Note that LU-6836 is thought to be caused by bad ZFS sync
performance. We should reenable test 4a when ZIL support
is ready (LU-4009).

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I07867458e4365c0b1d31654be08e0fef9092c9eb
Reviewed-on: http://review.whamcloud.com/15679
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6769 build: Test for kthread_worker support 28/15428/6
Chris Horn [Mon, 29 Jun 2015 18:15:23 +0000 (13:15 -0500)]
LU-6769 build: Test for kthread_worker support

The OFED compatibility layer will backport kthread_worker unless
CONFIG_COMPAT_IS_KTHREAD is defined. kthread_worker is available in
the SLES 11 SP3 kernel, but not RHEL 6.5 or RHEL 6.6. This mod adds a
test to check if kthread is available in the kernel and defines
CONFIG_COMPAT_IS_KTHREAD appropriately.

Later versions of MLNX_OFED (2.4, 3.0) require that we similarly
define HAVE_KTHREAD_WORK.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia89b73e4c8a3ad8549efdea70f0ebd96a1151dba
Reviewed-on: http://review.whamcloud.com/15428
Tested-by: Jenkins
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6828 lod: fix memory leak in lod_connect_to_osd 35/15635/3
Yang Sheng [Wed, 22 Jul 2015 01:38:04 +0000 (18:38 -0700)]
LU-6828 lod: fix memory leak in lod_connect_to_osd

When upgrading from 1.x, '$fsname-mdtlov' is replaced with
'$fsname-MDT0000-osd', but the buffer was allocated with the
former length.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I8c92e3cb4472527fa5ad5da7f0e4cb498eaccb6c
Reviewed-on: http://review.whamcloud.com/15635
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5597 build: Ensure MOFED Module symvers are used 98/15498/4
Nathaniel Clark [Thu, 2 Jul 2015 21:43:09 +0000 (17:43 -0400)]
LU-5597 build: Ensure MOFED Module symvers are used

Change the way extra symbols are included in the build.  Use
KBUILD_EXTRA_SYMBOLS to directly include Modules.symvers
files instead of building Modules.symvers in the buildroot
directory.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I4a6ee59f4e4eed9f878ad52993b8b17426f19d4a
Reviewed-on: http://review.whamcloud.com/15498
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6907 contrib: make gerrit_checkpatch.py less spammy 29/15729/3
Oleg Drokin [Mon, 27 Jul 2015 04:59:26 +0000 (00:59 -0400)]
LU-6907 contrib: make gerrit_checkpatch.py less spammy

Currently every checkpatch message is broadcast to everybody
on the patch and then to all watchers.
Use the REST API to indicate that on failure the message
should only be sent to the patch owner, and that on success
no emails should be sent at all.

Change-Id: I65b2017c1cc9558eed3707d81f936acac4af37f5
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/15729
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years agoLU-6803 gss: add printf format checking to __logmsg{,_gss}() 24/15524/2
John L. Hammond [Tue, 7 Jul 2015 15:08:18 +0000 (10:08 -0500)]
LU-6803 gss: add printf format checking to __logmsg{,_gss}()

Add printf format checking to the gss error reporting functions
__logmsg() and __logmsg_gss(). Fixup all errors so discovered.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie1526e0189f0b4e43d6bd3dcecb53574945f40d1
Reviewed-on: http://review.whamcloud.com/15524
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6713 lmv: lock necessary part of lmv_add_target 69/15269/5
wang di [Thu, 11 Jun 2015 15:01:07 +0000 (08:01 -0700)]
LU-6713 lmv: lock necessary part of lmv_add_target

Release lmv_init_mutex once the new target is added
to lmv_tgt_desc, so that lmv_obd_connect is not
serialized.

A new target should be allowed to be added to the fld client
lists, so that FLD can always choose the newly added target
for the FLD lookup request. Also remove some noisy
error messages in this process.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I1aa2e31108f2da672b41e2a5c1c544605328ceea
Reviewed-on: http://review.whamcloud.com/15269
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6446 ldiskfs: remove WARN_ON from ldiskfs_orphan_add{del} 90/14690/4
Yang Sheng [Wed, 6 May 2015 03:56:05 +0000 (11:56 +0800)]
LU-6446 ldiskfs: remove WARN_ON from ldiskfs_orphan_add{del}

The RHEL7.1 kernel checks in many places whether i_mutex is
locked. These checks are triggered when ldiskfs_truncate()
is invoked in the osd layer. A deadlock would occur if we
just took i_mutex around it, since Lustre uses the reverse
order: start journal, then lock i_mutex. Since Lustre has its
own ldlm lock, just remove these warnings in the ldiskfs patches.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I79520b0a1d013722a5a27e71318416608bc25285
Reviewed-on: http://review.whamcloud.com/14690
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6802 ptlrpc: reset replay cursor on reconnection 69/15669/2
Jinshan Xiong [Tue, 21 Jul 2015 15:25:56 +0000 (08:25 -0700)]
LU-6802 ptlrpc: reset replay cursor on reconnection

In this way, clients can make sure every single replayable ptlrpc
request will be replayed and its reply received.

Test-Parameters: envdefinitions=PTLDEBUG=+inode mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs clientdistro=el6.6 ossdistro=el6.6 mdsdistro=el6.6 mdscount=2 mdtcount=4 testlist=sanity,sanity,sanity,sanity,sanity,sanity
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I32daf787d141c84376879fb0fb3b3eb8424f91ad
Reviewed-on: http://review.whamcloud.com/15669
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoNew tag 2.7.57 2.7.57 v2_7_57 v2_7_57_0
Oleg Drokin [Mon, 27 Jul 2015 18:58:01 +0000 (14:58 -0400)]
New tag 2.7.57

Change-Id: I7802951fb411c6f40a2d7ee14ba0759c2036c227
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6872 lov: avoid infinite loop in lsm_alloc_plain() 44/15644/2
John L. Hammond [Mon, 20 Jul 2015 14:24:27 +0000 (09:24 -0500)]
LU-6872 lov: avoid infinite loop in lsm_alloc_plain()

In lsm_alloc_plain() use a signed loop index to avoid an infinite loop
in the error path.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I084bfadd8a6bf44bbcbe60624e31926ec6cdc04e
Reviewed-on: http://review.whamcloud.com/15644
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
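
The failure mode described in LU-6872 above can be illustrated with a minimal user-space sketch. This is not the lsm_alloc_plain() code; struct demo_layout, demo_alloc() and demo_free() are hypothetical names, and fail_at only simulates an allocation failure. The point is that the unwind loop counts down past zero, which only terminates when the index is signed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for a striped layout; not the real lov structure. */
struct demo_layout {
	unsigned int count;
	void **slots;
};

/*
 * Allocate `count` slots. `fail_at` simulates an allocation failure at that
 * index (pass a value >= count for no simulated failure).
 * Returns NULL after unwinding on failure.
 */
static struct demo_layout *demo_alloc(unsigned int count, unsigned int fail_at)
{
	struct demo_layout *l;
	int i; /* signed on purpose: the unwind loop needs i to reach -1 */

	assert(count > 0 && count <= INT_MAX);
	l = calloc(1, sizeof(*l));
	if (l == NULL)
		return NULL;
	l->slots = calloc(count, sizeof(*l->slots));
	if (l->slots == NULL)
		goto err_layout;
	l->count = count;

	for (i = 0; i < (int)count; i++) {
		if ((unsigned int)i != fail_at)
			l->slots[i] = malloc(16);
		if (l->slots[i] == NULL)
			goto err_slots;
	}
	return l;

err_slots:
	/* With an unsigned index, "--i >= 0" is always true and never ends. */
	while (--i >= 0)
		free(l->slots[i]);
	free(l->slots);
err_layout:
	free(l);
	return NULL;
}

/* Release a fully allocated layout. */
static void demo_free(struct demo_layout *l)
{
	unsigned int i;

	for (i = 0; i < l->count; i++)
		free(l->slots[i]);
	free(l->slots);
	free(l);
}
```

Calling demo_alloc(4, 0) exercises the worst case: the failure happens at index 0, so the unwind loop must stop immediately after decrementing to -1.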
8 years agoLU-6790 build: handle external Intel OFED stack 82/15582/2
James Simmons [Mon, 13 Jul 2015 14:01:37 +0000 (10:01 -0400)]
LU-6790 build: handle external Intel OFED stack

Besides the Mellanox stack, Intel also releases its own
OFED stack. Like the Mellanox stack, the Intel OFED
stack has a compatibility layer to support distribution
kernels. This layer stomps over the default kernel
headers, so, as with the Mellanox stack, compact-2.6.h
has to be placed before all other headers. The problem
didn't show up before with the Mellanox stacks because
the Intel stack additionally stomps on the PCI layer of
the kernel.

Change-Id: If565f25432cf87723b69e5d65e00a7d5d301dc4f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/15582
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6835 build: add -Wall to CFLAGS for test/ and utils/ 92/15392/4
John L. Hammond [Thu, 25 Jun 2015 06:03:26 +0000 (01:03 -0500)]
LU-6835 build: add -Wall to CFLAGS for test/ and utils/

Add -Wall to the CFLAGS used for lustre/{test,utils}/. Fixup missing
includes where needed.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I19d0739e0a9b5b079665d5d24d54c6dcaad93b0c
Reviewed-on: http://review.whamcloud.com/15392
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6501 libcfs: remove last entry of libcfs_netstrfns[] 24/15424/8
Frederic Saunier [Tue, 21 Jul 2015 19:04:51 +0000 (15:04 -0400)]
LU-6501 libcfs: remove last entry of libcfs_netstrfns[]

Currently NID string handling tests for the last entry,
which has .nf_type == (__u32) -1. If we ask for a
nonexistent LND we hit the last entry, which then calls
strlen() on a NULL pointer and causes an error. We can
avoid this problem by simply removing the last entry,
since it is not used for anything except as an
end-of-table marker.

Signed-off-by: Frederic Saunier <frederic.saunier@atos.net>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: Ia8e3bea0f01bb5fa78e88bfbca698b0aa0d148ea
Reviewed-on: http://review.whamcloud.com/15424
Tested-by: Jenkins
Reviewed-by: frank zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
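
A minimal sketch of the idea in LU-6501 above: bound the table walk by the array size instead of a sentinel entry whose name is NULL. The struct, table contents and function name below are hypothetical simplifications, and the type numbers are illustrative only; this is not the libcfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a libcfs_netstrfns[]-style table. */
struct demo_netstrfns {
	unsigned int nf_type;
	const char *nf_name;
};

static const struct demo_netstrfns demo_table[] = {
	{ .nf_type = 1, .nf_name = "lo" },
	{ .nf_type = 2, .nf_name = "tcp" },
	{ .nf_type = 5, .nf_name = "o2ib" },
	/* no terminator entry with a NULL name */
};

#define DEMO_TABLE_SIZE (sizeof(demo_table) / sizeof(demo_table[0]))

/* Return the entry whose name matches, or NULL for an unknown LND. */
static const struct demo_netstrfns *demo_name2netstrfns(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < DEMO_TABLE_SIZE; i++) {
		/* every entry has a non-NULL name, so strcmp() is safe here */
		if (strcmp(demo_table[i].nf_name, name) == 0)
			return &demo_table[i];
	}
	return NULL;
}
```

Because the loop never reaches an entry with a NULL name, a lookup for an unknown LND simply returns NULL rather than dereferencing the marker.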
8 years agoLU-6261 gnilnd: Clean up cfs abstractions from gnilnd 37/15437/2
Chris Horn [Mon, 29 Jun 2015 19:35:38 +0000 (14:35 -0500)]
LU-6261 gnilnd: Clean up cfs abstractions from gnilnd

Running the script contrib/scripts/libcfs_cleanup.sed gets us ready
for cfs changes.

find lnet/klnds/gnilnd -name "*.[ch]" -exec \
sed -i -f contrib/scripts/libcfs_cleanup.sed {} \;

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ie7aa1616d33d9f60870f4b8c2a946ce66d3c348c
Reviewed-on: http://review.whamcloud.com/15437
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Hold shared MDD for gnilnd. 36/15436/2
Chris Horn [Mon, 29 Jun 2015 19:31:42 +0000 (14:31 -0500)]
LU-6261 gnilnd: Hold shared MDD for gnilnd.

Creating and destroying shared MDDs can cause OS noise. This affects
benchmarks that are sensitive to OS noise like fwq.
Allocate and register some memory so that we always have a shared MDD
available.
Moved kgnilnd_check_kgni_version() into gnilnd.c to simplify
gemini/aries headers.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ibc1f4b9a44035c6fb25e88d30552136486c260d6
Reviewed-on: http://review.whamcloud.com/15436
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Fix LIBCFS_ALLOC_POST incompatibility. 35/15435/2
Chris Horn [Mon, 29 Jun 2015 18:54:34 +0000 (13:54 -0500)]
LU-6261 gnilnd: Fix LIBCFS_ALLOC_POST incompatibility.

master has changed LIBCFS_ALLOC to use vzalloc so LIBCFS_ALLOC_POST
does not contain a memset anymore.
kgnilnd_vmalloc() used LIBCFS_ALLOC_POST to zero memory allocated but
master removed the memset in favor of using vzalloc.
Use the __GFP_ZERO flag in (now named) kgnilnd_vzalloc to zero memory
allocated.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia014c18f13e4e263eccb51d49d375fc4b3bc8b61
Reviewed-on: http://review.whamcloud.com/15435
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Use trylock for conn mutex. 34/15434/2
Chris Horn [Mon, 29 Jun 2015 18:50:24 +0000 (13:50 -0500)]
LU-6261 gnilnd: Use trylock for conn mutex.

When converting to the thread-safe implementation, I missed that the
conn mutex must be taken with a trylock because we may be holding the
kgn_peer_conn lock.
Change conn mutex to use a trylock in kgnilnd_sendmsg_trylock().
Add the module parameter thread_safe.
Disable thread safe on gemini.
In kgnilnd_create_conn(), use the NOFS flag when vmallocing the
tx_ref_table to avoid possible hangs under an OOM condition.
Clean up info message for down event.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1896c12b421ae35f65d1816bbe3eb5599b664498
Reviewed-on: http://review.whamcloud.com/15434
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
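
The trylock pattern mentioned in the LU-6261 commit above can be sketched in user space. This is only an illustration under stated assumptions: a C11 atomic_flag stands in for the kernel mutex, and struct demo_conn and the demo_* functions are hypothetical names, not kgnilnd code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-connection lock; not the kgnilnd mutex. */
struct demo_conn {
	atomic_flag busy;
	int sent;
};

/* Try to take the lock without blocking; returns true on success. */
static bool demo_conn_trylock(struct demo_conn *c)
{
	return !atomic_flag_test_and_set_explicit(&c->busy,
						  memory_order_acquire);
}

static void demo_conn_unlock(struct demo_conn *c)
{
	atomic_flag_clear_explicit(&c->busy, memory_order_release);
}

/*
 * Send only if the connection lock is free. The caller may already hold
 * another lock, so sleeping here is not an option; report "busy" instead
 * and let the caller retry later.
 */
static bool demo_sendmsg_trylock(struct demo_conn *c)
{
	assert(c != NULL);
	if (!demo_conn_trylock(c))
		return false;
	c->sent++;
	demo_conn_unlock(c);
	return true;
}
```

The design choice is that a contended lock becomes a retryable condition for the caller rather than a blocking wait while another lock is held.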
8 years agoLU-6261 gnilnd: Add pkey module parameter 33/15433/2
Chris Horn [Mon, 29 Jun 2015 18:46:54 +0000 (13:46 -0500)]
LU-6261 gnilnd: Add pkey module parameter

lustre:19561 changed the pkey to the value reserved in gni_pub.h and
the max_immediate size. These need to be the same on both the service
and compute nodes for gnilnd to establish a connection.
Add the module parameter pkey so that a node running a dev version of
software will be able to connect to a node running an older version
with appropriate changes to modprobe.conf.
With this change, add the following parameters to
IMAGE_PATH/compute/etc/modprobe.conf:

options kgnilnd max_immediate=2048
options kgnilnd pkey=0xa3579

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I08ea2d66e745c37f41931a53e599bc572664dd9a
Reviewed-on: http://review.whamcloud.com/15433
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Add ability to bind scheduler threads to cpus. 32/15432/2
Chris Horn [Mon, 29 Jun 2015 18:38:14 +0000 (13:38 -0500)]
LU-6261 gnilnd: Add ability to bind scheduler threads to cpus.

Added the module parameter thread_affinity which when enabled, will
bind the kgnilnd_sd_xx threads to cpus 1 through xx.
thread_affinity defaults to disabled.
Testing shows that enabling thread affinity will allow us to get
better small message performance with more scheduler threads.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I5298c25cc21a1a856e4d84f1bca4f484449cd055
Reviewed-on: http://review.whamcloud.com/15432
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6679 ldlm: do not send blocking ast for group locks 19/15119/3
Li Dongyang [Wed, 3 Jun 2015 06:32:40 +0000 (16:32 +1000)]
LU-6679 ldlm: do not send blocking ast for group locks

Group locks are acquired and released manually on the client, so
it doesn't make sense to send a blocking AST to the client when
there is an incompatible lock enqueued.

Currently the client will set CBPENDING on the group lock when
it receives a blocking AST. Having the CBPENDING flag set
will make ldlm_lock_match fail, and there will be two
group locks granted on the same resource on the client.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Change-Id: I7af89e957528b3ed9771d86243ac8271084ee81f
Reviewed-on: http://review.whamcloud.com/15119
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6847 kernel: kernel update RHEL 6.6 [2.6.32-504.30.3.el6] 05/15605/3
Bob Glossman [Tue, 14 Jul 2015 17:27:09 +0000 (10:27 -0700)]
LU-6847 kernel: kernel update RHEL 6.6 [2.6.32-504.30.3.el6]

Update RHEL6.6 kernel to 2.6.32-504.30.3.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I4f60d43dea248ce57290579c021838d0f731c332
Reviewed-on: http://review.whamcloud.com/15605
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6186 tests: avoid errors in using >&- bash close syntax 57/13857/4
Bruno Faccini [Tue, 24 Feb 2015 16:59:26 +0000 (17:59 +0100)]
LU-6186 tests: avoid errors in using >&- bash close syntax

It seems that recent EL7 bash versions include changes in the
handling of the ">&-" redirection/close syntax, which cause a
"Bad file descriptor" error. Use >/dev/null to avoid this.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Iac3e54ef69f0c5c241b87a9e307e059974a4caf3
Reviewed-on: http://review.whamcloud.com/13857
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6822 nrs: remove obsolete assertion in nrs_orr_start() 40/15540/2
Emoly Liu [Thu, 9 Jul 2015 08:13:39 +0000 (16:13 +0800)]
LU-6822 nrs: remove obsolete assertion in nrs_orr_start()

kmem_cache_destroy() doesn't return any value, so we should remove
this obsolete assertion, otherwise LBUG will happen when orr policy
is enabled.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I240e7bd660a6960b1e29da3575388a966ce8dca9
Reviewed-on: http://review.whamcloud.com/15540
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6540 test: fix no %s specifier for the file 15/14815/3
Dmitry Eremin [Thu, 14 May 2015 18:40:27 +0000 (21:40 +0300)]
LU-6540 test: fix no %s specifier for the file

The %m prints strerror(errno), but there is no %s specifier for the file.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Id9a8f486521b94e98986140fa374c40e08a00dac
Reviewed-on: http://review.whamcloud.com/14815
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
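
As a small illustration of the LU-6540 fix above: %m is a glibc printf extension that expands to strerror(errno) and consumes no argument, so a file name passed to the call needs its own %s conversion. The function name and message text below are hypothetical; this sketch assumes a C library that supports %m.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an "open failed" message into buf. The path is printed by %s and
 * the error text by %m, which reads errno at the time of the call.
 */
static int demo_format_open_error(char *buf, size_t len, const char *path,
				  int err)
{
	assert(buf != NULL && path != NULL);
	errno = err; /* make sure %m sees the error we want to report */
	return snprintf(buf, len, "cannot open %s: %m", path);
}
```

Without the %s, the path argument would be silently ignored and the message would not say which file failed.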
8 years agoLU-6442 utils: "-G <value>" can be passed through mkfs options 00/14400/8
Artem Blagodarenko [Wed, 8 Apr 2015 12:52:53 +0000 (15:52 +0300)]
LU-6442 utils: "-G <value>" can be passed through mkfs options

The mkfs.lustre utility checks whether the parameters already contain
the "flex_bg" option (which allows block groups to be packed together
to create a larger virtual block group) and adds this option
if it doesn't exist. But the related parameter "-G" (number of block
groups that will be packed together to create a larger virtual
block group) can be added twice: once passed through mkfs options
and once as the default value. In this case the value passed by the
user has no effect and the default value is applied.

This patch adds an extra check: the default "-G" option is added
only if no "-G" option was passed through mkfs options.

Xyratex-bug-id: MRP-2046
Signed-off-by: Artem Blagodarenko <artem_blagodarenko@xyratex.com>
Reviewed-by: Sergey Cheremencev <sergey_cheremencev@xyratex.com>
Tested-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Change-Id: I14fb5f8d10fa369428efcbbcb4f638f388979818
Reviewed-on: http://review.whamcloud.com/14400
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
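
The extra check described in LU-6442 above might look roughly like the following sketch. It is not the mkfs.lustre code: demo_has_flag() and demo_add_default_groups() are hypothetical helpers, and the option string handling is deliberately simplified to space-separated tokens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `flag` starts a space-separated token somewhere in `opts`. */
static bool demo_has_flag(const char *opts, const char *flag)
{
	const char *p = opts;
	size_t n = strlen(flag);

	while ((p = strstr(p, flag)) != NULL) {
		if (p == opts || p[-1] == ' ')
			return true;
		p += n;
	}
	return false;
}

/* Append "-G <default_groups>" unless the caller already supplied -G. */
static void demo_add_default_groups(char *opts, size_t len,
				    unsigned int default_groups)
{
	size_t used = strlen(opts);

	assert(used < len);
	if (demo_has_flag(opts, "-G"))
		return; /* keep the user's value; do not add a second -G */
	snprintf(opts + used, len - used, "%s-G %u",
		 used ? " " : "", default_groups);
}
```

With the check in place, an option string that already carries "-G 32" is left untouched, while one without it gets the default appended.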
8 years agoLU-6198 test: fix large-lun test_1, test_2 defects 01/13601/3
Elena Gryaznova [Mon, 2 Feb 2015 23:46:35 +0000 (03:46 +0400)]
LU-6198 test: fix large-lun test_1, test_2 defects

setupall()->start ost, called at the end of large-lun.sh,
fails to start the ost devices, which causes test_1() and
test_2() to fail if run separately.

The reasons are:
1) osts data are destroyed by test_1()->llverdev
2) osts are reformatted by test_2, but mds config is not changed

The patch fixes the following:
- format Lustre at the end of test_1() and test_2()
- add the "REFORMAT" option check to avoid killing
filesystems that do not want a reformat.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-1984
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Change-Id: I7f1465d4777b7a5d5538e019226334dc090e3f50
Reviewed-on: http://review.whamcloud.com/13601
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Changes for small message rate improvement 31/15431/2
Chris Horn [Mon, 29 Jun 2015 18:31:16 +0000 (13:31 -0500)]
LU-6261 gnilnd: Changes for small message rate improvement

Change number of threads on Aries service nodes to 7 for better
performance of small messages.
Change the max_immediate value to 8k instead of 2k. FMA has better
performance up to 8k.
lnet passes an niov of 256 reflecting the size of the router buffer
number of pages in the buffer pool when receiving data from the ib
interface when the message size is over one page. Adjust the niov to
reflect the message size/offset for sends.
Use the cookie defined in gni_pub.h instead of value from gnilnd.h.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I84270653efcff56f06da7de7fb9674ae319800dd
Reviewed-on: http://review.whamcloud.com/15431
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6261 gnilnd: Thread-safe optimizations. 30/15430/3
Chris Horn [Fri, 10 Jul 2015 15:02:07 +0000 (11:02 -0400)]
LU-6261 gnilnd: Thread-safe optimizations.

Take advantage of improved gni threading.
Do not use the gnd_cq_mutex lock for kgni versions that support the
thread-safe gni api.
Check if the version is greater than code rev 0xb9 and lock smsg and
rdma on a per-connection basis instead of using the global cq lock.
Changed gnc_tx_seq and gnc_rx_seq to atomics.
Added gnc_smsg_mutex and gnc_rdma_mutex per conn to protect the lists
that the messages are placed on.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ic03f1877ab7b9632ca5517cd74f7e7fa25ba171b
Reviewed-on: http://review.whamcloud.com/15430
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Shimek <jshimek@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
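
The move to atomic sequence counters mentioned in the LU-6261 commit above can be sketched with C11 atomics. This is a user-space illustration only; struct demo_conn_seq and demo_next_tx_seq() are hypothetical names, and the kernel code would use its own atomic types.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-connection state holding only the send sequence. */
struct demo_conn_seq {
	atomic_uint tx_seq;
};

/*
 * Return the next send sequence number. The increment is a single atomic
 * read-modify-write, so concurrent callers get distinct values without
 * taking a global lock.
 */
static unsigned int demo_next_tx_seq(struct demo_conn_seq *c)
{
	assert(c != NULL);
	return atomic_fetch_add(&c->tx_seq, 1) + 1;
}
```

The lists that messages are queued on still need their own protection; the atomic counter only removes the need for a lock around the sequence number itself.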
8 years agoLU-6668 test: regression tests for NRS TBF policy 02/15102/5
Li Xi [Sat, 11 Jul 2015 14:37:17 +0000 (22:37 +0800)]
LU-6668 test: regression tests for NRS TBF policy

This patch adds fundamental regression tests for NRS TBF policy.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I37371845c2920f65d38df03bc42b40fd9dea5bb0
Reviewed-on: http://review.whamcloud.com/15102
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6661 test: add version check for tests 21/15521/3
Lai Siyao [Tue, 7 Jul 2015 13:37:29 +0000 (21:37 +0800)]
LU-6661 test: add version check for tests

Add a version check for sanity tests 27e and 162c so that interop
testing won't fail.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I45cc25d71e2e7b361455c9dcde1c9a3e1c797e9b
Reviewed-on: http://review.whamcloud.com/15521
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6683 osd: declare enough credits for generating LMA 61/15361/2
Fan Yong [Tue, 26 May 2015 05:33:37 +0000 (13:33 +0800)]
LU-6683 osd: declare enough credits for generating LMA

Usually, the LMA EA is set first after the object is created. But if the
system was upgraded from the 1.8 release or older, there is no LMA EA
stored in the object, and the LMA EA generated for the object may
not be the first EA in the object's inode.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I2bdbf5ba56db1ea08edf0a8a4d724df4ad97e071
Reviewed-on: http://review.whamcloud.com/15361
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6690 tests: start nfsserver service for SLES 49/15149/3
Jian Yu [Thu, 4 Jun 2015 23:32:56 +0000 (16:32 -0700)]
LU-6690 tests: start nfsserver service for SLES

The NFS server service name in SLES distro is "nfsserver"
instead of "nfs". This patch fixes setup-nfs.sh to start
nfsserver service for SLES.

Test-Parameters: alwaysuploadlogs \
envdefinitions=EXCEPT=compilebench \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
clientdistro=sles11sp3 ossdistro=sles11sp3 mdsdistro=sles11sp3 \
mdtcount=1 testlist=parallel-scale-nfsv3,parallel-scale-nfsv4

Test-Parameters: alwaysuploadlogs \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
clientdistro=el6.6 ossdistro=el6.6 mdsdistro=el6.6 \
mdtcount=1 testlist=parallel-scale-nfsv3,parallel-scale-nfsv4

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ifb200400c0fa072fee6f563431be0429e2c79890
Reviewed-on: http://review.whamcloud.com/15149
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6834 lod: fix idx_array overwritten 72/15572/2
wang di [Thu, 9 Jul 2015 03:01:46 +0000 (20:01 -0700)]
LU-6834 lod: fix idx_array overwritten

Compare index with stripe_count - 1, otherwise it will
overwrite idx_array[] in lod_prep_md_striped_create().

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ib116821fb9e9b800752d760c262ded725c55ce0e
Reviewed-on: http://review.whamcloud.com/15572
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6529 ldlm: reclaim granted locks defensively 31/14931/11
Niu Yawei [Thu, 21 May 2015 15:07:54 +0000 (11:07 -0400)]
LU-6529 ldlm: reclaim granted locks defensively

To avoid ldlm locks exhausting server memory, two global parameters,
ldlm_watermark_low and ldlm_watermark_high, are used for reclaiming
granted locks and rejecting incoming enqueue requests defensively.

ldlm_watermark_low: When the number of granted locks reaches this
threshold, the server starts to revoke locks gradually.

ldlm_watermark_high: When the number of granted locks reaches this
threshold, the server will return -EINPROGRESS to any incoming enqueue
request until the lock count has shrunk below the threshold again.

ldlm_watermark_low and ldlm_watermark_high are set to 20% and 30% of the
total memory by default. Each is tunable via a proc entry; when set
to 0, the feature is disabled.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I2fab39ac0ab6f269b7f1a40f3e08b8a51807cc69
Reviewed-on: http://review.whamcloud.com/14931
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
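
The two-watermark policy described in LU-6529 above can be sketched as a simple decision function. This is an illustration, not the ldlm code: the enum, the function names, and the choice to treat a watermark of 0 as disabling only that check are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical outcomes for an incoming enqueue request. */
enum demo_action {
	DEMO_GRANT,             /* below the low watermark: nothing to do */
	DEMO_GRANT_AND_RECLAIM, /* at or above low: start revoking locks */
	DEMO_REJECT,            /* at or above high: refuse the enqueue */
};

/*
 * Decide what to do given the current number of granted locks.
 * A watermark of 0 disables the corresponding check (assumption).
 */
static enum demo_action demo_check_watermarks(uint64_t granted, uint64_t low,
					      uint64_t high)
{
	assert(high == 0 || low <= high);
	if (high != 0 && granted >= high)
		return DEMO_REJECT;
	if (low != 0 && granted >= low)
		return DEMO_GRANT_AND_RECLAIM;
	return DEMO_GRANT;
}

/* Map the decision to an enqueue return code: 0 or -EINPROGRESS. */
static int demo_enqueue_rc(uint64_t granted, uint64_t low, uint64_t high)
{
	return demo_check_watermarks(granted, low, high) == DEMO_REJECT ?
	       -EINPROGRESS : 0;
}
```

Keeping the low watermark below the high one gives the server room to reclaim locks gradually before it has to start rejecting requests.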