Whamcloud - gitweb
fs/lustre-release.git
8 years ago LU-7461 lod: retry to get remote update log 22/17322/2
Di Wang [Sat, 21 Nov 2015 15:16:28 +0000 (07:16 -0800)]
LU-7461 lod: retry to get remote update log

If the remote MDT is also in recovery, retrieving update
logs in lod_sub_recovery_thread() might return -EAGAIN,
-EIO, or -EBUSY. Retry in this case until the recovery
is aborted or the local MDT is unmounted.
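
The retry policy described above could be sketched roughly as follows.
This is a minimal userspace illustration, not the code from the patch:
struct remote_sim, sim_fetch() and fetch_with_retry() are hypothetical
names, and the "stopping" flag stands in for "recovery aborted or local
MDT unmounted". A real implementation would also wait between attempts.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the remote MDT: it answers -EBUSY for the
 * first 'busy_rounds' attempts, then succeeds. */
struct remote_sim {
    int busy_rounds;
    int attempts;
    bool stopping;   /* models "recovery aborted or local MDT unmounted" */
};

static int sim_fetch(struct remote_sim *r)
{
    r->attempts++;
    return r->attempts <= r->busy_rounds ? -EBUSY : 0;
}

/* Only these errors are treated as "remote side still recovering". */
static bool is_transient(int rc)
{
    return rc == -EAGAIN || rc == -EIO || rc == -EBUSY;
}

/* Keep fetching while the failure is transient and nobody asked us to
 * stop. Any other result (success or a hard error) ends the loop. */
int fetch_with_retry(struct remote_sim *r)
{
    int rc;

    do {
        rc = sim_fetch(r);
    } while (is_transient(rc) && !r->stopping);

    return rc;
}
```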

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Iee945942bd01925cdcfe75c4e59dccbd63b34498
Reviewed-on: http://review.whamcloud.com/17322
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7416 osp: check rq_repmsg in osp_request_commit_cb 30/17130/2
Di Wang [Tue, 10 Nov 2015 10:22:20 +0000 (02:22 -0800)]
LU-7416 osp: check rq_repmsg in osp_request_commit_cb

Check whether rq_repmsg is NULL before retrieving the
last committed transno from the reply message.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ibf1e110e33df333934c65dfcf52870954e936180
Reviewed-on: http://review.whamcloud.com/17130
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7343 osd-ldiskfs: handle ldiskfs_append failure 48/17148/5
Fan Yong [Fri, 2 Oct 2015 05:21:21 +0000 (13:21 +0800)]
LU-7343 osd-ldiskfs: handle ldiskfs_append failure

In newer Linux kernels (linux-3.1x, x>=0), the ldiskfs exported
function ldiskfs_append() returns the error number via the return
value, instead of via the output parameter @err as it does on older
kernels (linux-2.6). In that case, the caller should not assume that
a non-NULL return value is a valid buffer head; it can stand for an
error number. So check that properly.
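
As an illustration of why a non-NULL return value is not necessarily a
valid pointer, here is a userspace re-creation of the kernel's
error-pointer idiom. ERR_PTR(), PTR_ERR() and IS_ERR() mirror the kernel
helpers of the same names; append_block() and use_append() are
hypothetical and are not the ldiskfs functions touched by this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace copies of the kernel helpers: the top 4095 addresses are
 * reserved to carry a negative errno inside a pointer. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct buffer_head {
    int b_blocknr;
};

/* Hypothetical helper following the newer convention: on failure it
 * returns an encoded errno instead of NULL. */
struct buffer_head *append_block(struct buffer_head *slot, int fail)
{
    if (fail)
        return ERR_PTR(-ENOSPC);
    slot->b_blocknr = 42;
    return slot;
}

/* Caller-side check: reject NULL and error pointers before touching
 * the buffer head. */
int use_append(struct buffer_head *slot, int fail, int *blocknr)
{
    struct buffer_head *bh = append_block(slot, fail);

    if (bh == NULL)
        return -EIO;
    if (IS_ERR(bh))
        return (int)PTR_ERR(bh);

    *blocknr = bh->b_blocknr;
    return 0;
}
```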

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4dca43bcfd31aafd999f54934a51d258071dab22
Reviewed-on: http://review.whamcloud.com/17148
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-7376 tests: sanity-hsm/59 should skip old servers. 29/17029/3
Alex Zhuravlev [Tue, 3 Nov 2015 12:05:52 +0000 (15:05 +0300)]
LU-7376 tests: sanity-hsm/59 should skip old servers.

There is no point in crashing vulnerable versions.

Change-Id: Iacafd10d2a3d04ba1bb9ca70d8e343809490a349
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17029
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7436 tests: skip conf-sanity/91 with old servers 22/17222/2
Alex Zhuravlev [Tue, 17 Nov 2015 06:39:25 +0000 (09:39 +0300)]
LU-7436 tests: skip conf-sanity/91 with old servers

due to missing functionality.

Change-Id: I2be72820413aee9a9d7082c2d8c6f308eeb6e141
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17222
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7415 kernel: kernel update RHEL 6.7 [2.6.32-573.8.1.el6] 19/17119/3
Bob Glossman [Tue, 10 Nov 2015 16:03:49 +0000 (08:03 -0800)]
LU-7415 kernel: kernel update RHEL 6.7 [2.6.32-573.8.1.el6]

Update RHEL6.7 kernel to 2.6.32-573.8.1.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I1b3c4b144b06e5e96f818e35c08f490e574ed798
Reviewed-on: http://review.whamcloud.com/17119
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7164 osc: osc_extent should hold refcount to osc_object 33/16433/3
Jinshan Xiong [Tue, 15 Sep 2015 19:19:10 +0000 (12:19 -0700)]
LU-7164 osc: osc_extent should hold refcount to osc_object

This avoids a race in which the osc_extent and osc_object are destroyed
at the same time, which causes a kernel crash.
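
The general idea of letting a dependent structure pin its parent with a
reference can be sketched as below. All names here (struct object,
struct extent, object_put() and so on) are hypothetical simplifications,
and the counter is a plain int for brevity; kernel code would use an
atomic reference count.

```c
#include <stdlib.h>

struct object {
    int refcount;
    int *destroyed;   /* set to 1 when the object is freed (demo only) */
};

struct extent {
    struct object *obj;   /* the extent holds one reference on obj */
};

struct object *object_new(int *destroyed)
{
    struct object *o = malloc(sizeof(*o));

    if (o == NULL)
        return NULL;
    o->refcount = 1;      /* reference owned by the creator */
    o->destroyed = destroyed;
    *destroyed = 0;
    return o;
}

/* Drop one reference; the object is freed only with the last one. */
void object_put(struct object *o)
{
    if (--o->refcount == 0) {
        *o->destroyed = 1;
        free(o);
    }
}

/* Creating an extent takes a reference, so the object cannot go away
 * while the extent still points at it. */
struct extent *extent_new(struct object *o)
{
    struct extent *e = malloc(sizeof(*e));

    if (e == NULL)
        return NULL;
    o->refcount++;
    e->obj = o;
    return e;
}

void extent_free(struct extent *e)
{
    object_put(e->obj);
    free(e);
}
```

With this arrangement the creator can drop its own reference first and
the object still outlives the extent.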

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I3e3237f0d1cff4bd992bef4e4c01355a1d5c8d9f
Reviewed-on: http://review.whamcloud.com/16433
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7384 lfsck: check transaction stop status 42/17042/3
Fan Yong [Fri, 18 Sep 2015 07:52:31 +0000 (15:52 +0800)]
LU-7384 lfsck: check transaction stop status

The LFSCK modification will be sent to the remote server when the
transaction stops. For the sync transaction case, we can check the
dt_trans_stop() result.

If lfsck_namespace_create_orphan_dir() failed, we may have missed
that before because the dt_trans_stop() result was ignored. That
may then cause the subsequent lfsck_namespace_insert_normal() to
fail at LASSERT(lu_object_exists(o) != 0);
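
A common way to avoid losing the stop status is to keep the first error
and let a failed stop turn an otherwise successful operation into a
failure. The sketch below shows that pattern with hypothetical names
(struct txn, txn_stop(), run_update()); it is not the LFSCK code itself.

```c
#include <errno.h>

/* Hypothetical transaction whose stop step may fail, e.g. because the
 * update could not be sent to the remote server. */
struct txn {
    int stop_rc;
};

int txn_stop(struct txn *t)
{
    return t->stop_rc;
}

/* 'body_rc' is the result of the work done inside the transaction.
 * The transaction is always stopped; the first error wins. */
int run_update(struct txn *t, int body_rc)
{
    int rc = body_rc;
    int rc1 = txn_stop(t);

    return rc != 0 ? rc : rc1;
}
```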

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If897b7bd479ecdb61e6435f3177211f865a4e303
Reviewed-on: http://review.whamcloud.com/17042
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-3536 lfsck: reuse parameter name for re-locating object 32/16932/7
Fan Yong [Wed, 23 Sep 2015 15:24:02 +0000 (23:24 +0800)]
LU-3536 lfsck: reuse parameter name for re-locating object

Usually, the LFSCK engine locates the object against the bottom
device (OSD), then makes the related check/repair directly. Sometimes,
such as in lfsck_namespace_repair_dirent(), we need to modify based
on the LOD device. In that case, the LFSCK re-locates the related
object with the same FID.

Originally, there were no special rules about the parameter names,
so it was unclear which one should be used. For example, the input
parameter is named "parent" and is against the OSD; we need to
re-locate the object based on the LOD, named "pobj". In the
subsequent logic, "pobj" should be used, but unfortunately
"parent" may be used by mistake. Such invalid usage is difficult
to find.

To avoid such trouble, we prefer to reuse the (input) parameter
name after re-locating the object, that is, rename "pobj" to "parent".

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5b6d7c5c10e1817ef2bade4931485228b26c511d
Reviewed-on: http://review.whamcloud.com/16932
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
8 years ago LU-3322 lnet: make connect parameters persistent 74/17074/5
Amir Shehata [Fri, 6 Nov 2015 20:41:01 +0000 (12:41 -0800)]
LU-3322 lnet: make connect parameters persistent

Store map-on-demand and peertx credits in the peer, since the peer
is persistent. Also make sure that, when assigning the parameters
received on the connection to the peer structure through create,
if another peer is added before the lock is grabbed, these
parameters are assigned to it as well.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie68f1ba1349d15b0a31eff9a2ca454df8e408ea9
Reviewed-on: http://review.whamcloud.com/17074
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7324 lnet: recv could access freed message 65/17065/4
Liang Zhen [Fri, 6 Nov 2015 14:23:05 +0000 (22:23 +0800)]
LU-7324 lnet: recv could access freed message

When lnet_parse_put calls lnet_ptl_match_md, this function can attach
the current message to the delayed list if there is no match. This
means the message can be taken over and freed by another thread that
is posting a new MD, so it is not safe for the caller of
lnet_parse_put to check this message again.

This patch fixes the issue by adding a local variable "ready_delay"
to store the corresponding status of lnet_msg, so lnet doesn't need to
check the message again if lnet_ptl_match_md returned MATCH_NONE for
it.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I0f8827103dd637648112e936ce6e685266e5ca40
Reviewed-on: http://review.whamcloud.com/17065
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7221 ldlm: do not take a reference on target if stopping 40/16940/5
Bruno Faccini [Mon, 26 Oct 2015 13:37:19 +0000 (14:37 +0100)]
LU-7221 ldlm: do not take a reference on target if stopping

One of the changes in the patch for LU-5569
(http://review.whamcloud.com/11750/,
commit 892078e3b566c04471e7dcf2c28e66f2f3584f93) is to take a
reference on the target even if it is stopping (being unmounted).
Upon connection attempts, this can lead to unwanted cleanup actions
on [obd_self_]export from class_decref(), finally causing an
LBUG in class_export_put() because the export's exp_refcount has
already reached 0.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: If960437934fb694d173a4fd1fbfb9e43d496fea6
Reviewed-on: http://review.whamcloud.com/16940
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7318 out: dynamic reply size 89/16889/16
Alex Zhuravlev [Tue, 20 Oct 2015 13:53:18 +0000 (16:53 +0300)]
LU-7318 out: dynamic reply size

Every update on the initiator side can declare how many bytes
it expects back. The OUT packing library puts these numbers on the
wire and prepares an appropriate buffer for the reply. The OUT
target then does a few checks to ensure individual replies fit
their buffers.
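
The scheme above can be illustrated with a small sketch: each update
records its expected reply size, the packer sums the sizes to size the
reply buffer, and the target checks each reply against its declared
slot. The names (struct update_req, req_add_update(), reply_fits()) and
the fixed MAX_UPDATES limit are assumptions made for the example, not
the OUT library's actual interface or wire format.

```c
#include <stddef.h>

#define MAX_UPDATES 8   /* arbitrary limit for the sketch */

struct update_req {
    size_t count;
    size_t reply_size[MAX_UPDATES];   /* bytes each update expects back */
};

/* Initiator side: declare one update and the reply size it expects.
 * Returns the update's index, or -1 if the request is full. */
int req_add_update(struct update_req *req, size_t expected_reply)
{
    if (req->count >= MAX_UPDATES)
        return -1;
    req->reply_size[req->count] = expected_reply;
    return (int)req->count++;
}

/* Packing side: total reply buffer needed for all declared updates. */
size_t req_reply_buffer_size(const struct update_req *req)
{
    size_t total = 0;

    for (size_t i = 0; i < req->count; i++)
        total += req->reply_size[i];
    return total;
}

/* Target side: does a reply of 'len' bytes fit the slot declared for
 * update 'idx'? */
int reply_fits(const struct update_req *req, size_t idx, size_t len)
{
    return idx < req->count && len <= req->reply_size[idx];
}
```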

Change-Id: I443b5c879bc321c33efb70af665ecd2b2f7baa18
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16889
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7077 target: avoid using possible error return NULL pointer 73/16473/5
Bob Glossman [Thu, 17 Sep 2015 19:05:02 +0000 (12:05 -0700)]
LU-7077 target: avoid using possible error return NULL pointer

Previous fix http://review.whamcloud.com/15576 added a call
to cfs_hash_getref().  Add an LASSERT() to ensure that a return
value of NULL, which can never happen here, is in fact never seen.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ic6132b5450534db0bb9b89c3dd6f55517450c42a
Reviewed-on: http://review.whamcloud.com/16473
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years ago LU-7174 build: make git ignore dkms generated file 36/17136/3
James Simmons [Thu, 12 Nov 2015 14:31:49 +0000 (09:31 -0500)]
LU-7174 build: make git ignore dkms generated file

While testing patches, the non-patch-related build
by-product dkms.mkconf shows up in git status. To avoid
adding it by accident, list these by-product files in the
proper .gitignore files.

Change-Id: I49a5411f8c1159a75d1cd28067dfbf3c2a677d6c
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17136
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7169 tests: check disk corruption during failover 64/16664/9
Fan Yong [Thu, 24 Sep 2015 09:04:41 +0000 (17:04 +0800)]
LU-7169 tests: check disk corruption during failover

It is a debug patch for conf-sanity test_84. It is suspected
that there is some disk corruption during the MDT0 failover.

Test-Parameters: mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7e20f26e1ecee483474ace44c8284b5776f3c602
Reviewed-on: http://review.whamcloud.com/16664
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years ago LU-7377 utils: Don't fail on plugin load for mount/tunefs 28/17128/2
Nathaniel Clark [Fri, 6 Nov 2015 18:48:34 +0000 (13:48 -0500)]
LU-7377 utils: Don't fail on plugin load for mount/tunefs

While loading mount_utils_zfs, if the zfs modules aren't loaded but
the zfs plugin is present, it will return an error. This shouldn't
cause all module loading to fail.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Idad52745cdfa9d673ab9bd4afe38de4d51ae9a49
Reviewed-on: http://review.whamcloud.com/17128
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7414 target: do not share update and rdbuf 29/17129/3
Di Wang [Tue, 10 Nov 2015 10:12:00 +0000 (02:12 -0800)]
LU-7414 target: do not share update and rdbuf

Redefine lu_rdbuf structure to simplify the rdbuf
allocation in out_read().

Also move tti_u.rdbuf out of the tgt_thread_info
union; otherwise rdbuf and update will share
the same memory and cause corruption, see out_read().
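
The corruption risk comes from the fact that all members of a C union
occupy the same bytes. The following sketch contrasts a shared layout
with a split one; the struct names and 16-byte buffers are made up for
the illustration and do not reflect the real tgt_thread_info layout.

```c
#include <string.h>

struct update_buf { char data[16]; };
struct rd_buf     { char data[16]; };

/* Before: both buffers live in one union and overlap in memory. */
struct thread_info_shared {
    union {
        struct update_buf update;
        struct rd_buf rdbuf;
    } u;
};

/* After: rdbuf gets its own storage outside the union. */
struct thread_info_split {
    union {
        struct update_buf update;
    } u;
    struct rd_buf rdbuf;
};

/* Returns 1 if filling rdbuf clobbers update in the shared layout. */
int shared_layout_corrupts(void)
{
    struct thread_info_shared ti;

    memset(&ti, 0, sizeof(ti));
    memcpy(ti.u.update.data, "update", 7);
    memset(ti.u.rdbuf.data, 'R', sizeof(ti.u.rdbuf.data));
    return memcmp(ti.u.update.data, "update", 7) != 0;
}

/* Same sequence with the split layout: update stays intact. */
int split_layout_corrupts(void)
{
    struct thread_info_split ti;

    memset(&ti, 0, sizeof(ti));
    memcpy(ti.u.update.data, "update", 7);
    memset(ti.rdbuf.data, 'R', sizeof(ti.rdbuf.data));
    return memcmp(ti.u.update.data, "update", 7) != 0;
}
```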

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Idb0f5af1b00fd5fd15ebc8742aa60d9a43df0a8a
Reviewed-on: http://review.whamcloud.com/17129
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
8 years ago LU-7200 kernel: kernel update [SLES11 SP3 3.0.101-0.47.67] 17/16617/9
Bob Glossman [Wed, 23 Sep 2015 18:36:38 +0000 (11:36 -0700)]
LU-7200 kernel: kernel update [SLES11 SP3 3.0.101-0.47.67]

Update SLES11 SP3 kernel to 3.0.101-0.47.67

Test-Parameters: mdsdistro=sles11sp3 ossdistro=sles11sp3 \
  clientdistro=sles11sp3 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Icd7bac6bea6866f82e2e03f5dbbb1bda1a4ecacf
Reviewed-on: http://review.whamcloud.com/16617
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago New tag 2.7.63 2.7.63 v2_7_63 v2_7_63_0
Oleg Drokin [Mon, 16 Nov 2015 22:41:54 +0000 (17:41 -0500)]
New tag 2.7.63

Change-Id: I79f285380612f61679c1cf8e51446c9018e8225c
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6204 build: clean up kernel module metadata 87/16787/5
Andreas Dilger [Mon, 19 Oct 2015 15:24:22 +0000 (11:24 -0400)]
LU-6204 build: clean up kernel module metadata

Update static MODULE_VERSION() lines - this should be automated.

Improve MODULE_DESCRIPTION() descriptions.

Make the name of the module_init()/_exit() functions consistently
{module_name}_init and {module_name}_exit.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1c3fe5698c7f41d971a38225650597c913500c1e
Reviewed-on: http://review.whamcloud.com/16787
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7345 obdclass: annotate locks in __local_file_create 57/16957/4
Olaf Faaland [Tue, 27 Oct 2015 00:51:46 +0000 (17:51 -0700)]
LU-7345 obdclass: annotate locks in __local_file_create

dt_write_lock() is called for both the child and the parent dt_objects
when a directory is created.  This triggers a false positive in
lockdep when running with CONFIG_LOCKDEP=y, as the structure
containing the lock and the name of the lock is the same, and so it
appears to be a recursive lock attempt based on lock class.

This gives the two locks different subclasses so lockdep can
differentiate between them.

Also, osd-zfs osd_object_{read,write}_lock() functions currently
ignore the subclass (role) provided by the caller, calling down_read()
instead of down_read_nested() for example.

Make osd_zfs use the _nested variants so the role takes effect.

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iab79feadfbd7d1a5a06749ecb9f6888b55a78d73
Reviewed-on: http://review.whamcloud.com/16957
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7269 ptlrpc: remove ptlrpc_prep_req 65/16765/5
Ben Evans [Wed, 7 Oct 2015 17:30:52 +0000 (12:30 -0500)]
LU-7269 ptlrpc: remove ptlrpc_prep_req

Remove the unused functions ptlrpc_prep_req and ptlrpc_prep_req_pool.
Combine __ptlrpc_request_bufs_pack and ptlrpc_request_bufs_pack.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I4e4e64aa1f7fa4c85daf311906f6417a513dcddc
Reviewed-on: http://review.whamcloud.com/16765
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
8 years ago LU-7362 lnet: Remove LASSERTS from router checker 03/17003/2
Doug Oucharek [Fri, 30 Oct 2015 21:40:59 +0000 (14:40 -0700)]
LU-7362 lnet: Remove LASSERTS from router checker

In lnet_router_checker(), there are two LASSERTS.  Neither protects
us from anything, and one of them triggered for a customer, crashing
the system unnecessarily.  This patch removes them.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: If2732632e47103fb8fa63a263c4c5ef4a44142a3
Reviewed-on: http://review.whamcloud.com/17003
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Matt Ezell <ezellma@ornl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7296 ldlm: improve lock timeout messages 24/16824/2
John L. Hammond [Wed, 14 Oct 2015 16:33:35 +0000 (11:33 -0500)]
LU-7296 ldlm: improve lock timeout messages

In ldlm_expired_completion_wait() remove the useless LCONSOLE_WARN()
message and upgrade the LDLM_DEBUG() statement to LDLM_ERROR().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6293720bf8e038057a2c84a715359cdbb8cebe91
Reviewed-on: http://review.whamcloud.com/16824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7120 scrub: handle osd_scrub_post return value 68/16368/2
Fan Yong [Fri, 31 Jul 2015 16:00:20 +0000 (00:00 +0800)]
LU-7120 scrub: handle osd_scrub_post return value

This avoids missing some failure cases while writing the scrub
status to disk.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7f77bdba184f634b4f9dd748c3f1b97609b81960
Reviewed-on: http://review.whamcloud.com/16368
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6013 utils: don't initialize OSD code for client mount 19/13019/4
Andreas Dilger [Wed, 10 Dec 2014 10:53:49 +0000 (03:53 -0700)]
LU-6013 utils: don't initialize OSD code for client mount

Don't even try to initialize the server OSD handling code if this
is a client mountpoint.  That avoids potential problems if the OSD
code isn't working or available when it isn't needed on a client.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I104b70b9d27811d306879fc047a83f85ea3ebbe5
Reviewed-on: http://review.whamcloud.com/13019
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago Revert "LU-4865 zfs: grow block size by write pattern" 53/17053/4
Andreas Dilger [Thu, 5 Nov 2015 18:47:06 +0000 (18:47 +0000)]
Revert "LU-4865 zfs: grow block size by write pattern"

This reverts commit 3e4369135127b350dbc26a4a5dc94cfa46e394cf.

This has shown problems in testing and may be the cause of LU-7392.

Change-Id: I664f7f8c943d8a90f2d2a9845aea2636535d6b1e
Reviewed-on: http://review.whamcloud.com/17053
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7330 ldlm: fix race of starting bl threads 26/17026/2
Niu Yawei [Tue, 3 Nov 2015 06:59:32 +0000 (01:59 -0500)]
LU-7330 ldlm: fix race of starting bl threads

There is a race in the code that starts bl threads. When the race
happens, the thread count can exceed the maximum, and it can
also lead to duplicated thread names.

This patch fixes the race and cleans up the code a bit.
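
One generic way to close this kind of race is to reserve a thread slot
atomically before starting the thread, so that the limit check and the
increment cannot be interleaved. The sketch below uses C11 atomics and
hypothetical names (struct thread_pool, pool_reserve_slot()); it shows
the technique only and is not claimed to be how the patch fixes it.

```c
#include <stdatomic.h>

struct thread_pool {
    atomic_int nthreads;   /* threads started or being started */
    int max;               /* upper limit on threads */
};

/* Atomically claim the next slot. Returns the claimed index, which is
 * unique per caller and so can be used to build the thread name, or
 * -1 if the pool is already at its maximum. */
int pool_reserve_slot(struct thread_pool *p)
{
    int cur = atomic_load(&p->nthreads);

    while (cur < p->max) {
        /* On failure 'cur' is reloaded with the current value and the
         * limit is checked again before retrying. */
        if (atomic_compare_exchange_weak(&p->nthreads, &cur, cur + 1))
            return cur;
    }
    return -1;
}
```

Because the check and the increment happen in one compare-and-swap, two
racing callers can never both obtain the same index or push the count
past the maximum.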

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9c9be125d1d76890b8c52476684976dad3cb3d87
Reviewed-on: http://review.whamcloud.com/17026
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-2222 mdt: restore evict-by-nid functionality 67/16867/8
Alex Zhuravlev [Mon, 19 Oct 2015 10:18:44 +0000 (13:18 +0300)]
LU-2222 mdt: restore evict-by-nid functionality

Writing a NID or UUID to mdt.*.evict_tgt_nids will evict clients
with the specified NID or UUID from all the targets (OSTs and MDTs).

Change-Id: I66a60a6c81fbac1571f5685111df7b00a306be36
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16867
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7367 tests: fix fail_loc code in 110g 09/17009/2
Alex Zhuravlev [Sun, 1 Nov 2015 18:47:48 +0000 (21:47 +0300)]
LU-7367 tests: fix fail_loc code in 110g

Test 110g used the wrong code for OBD_FAIL_MIGRATE_NET_REP.

Change-Id: I5dc2e18fb99e35422a9ae227e2e16ba7d39600a3
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17009
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7366 build: resolve compile issues on Ubuntu 15.04 08/17008/2
James Simmons [Sun, 1 Nov 2015 16:47:29 +0000 (11:47 -0500)]
LU-7366 build: resolve compile issues on Ubuntu 15.04

Ubuntu 15.04 has been released, which uses a 4.2 kernel
and gcc 5.2. This gcc version fails to build lustre due
to missing headers in the GSS userland code and a
variable not being initialized in the llite layer.

Change-Id: I3615414ac039277a6ef6c6af1a541590b9d79566
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17008
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7353 utils: fix lctl usage messages 80/16980/3
Andreas Dilger [Wed, 28 Oct 2015 20:06:13 +0000 (14:06 -0600)]
LU-7353 utils: fix lctl usage messages

Fix the lctl usage message for sub-commands that need the LNet network
to be specified.  lctl::g_net_is_set() incorrectly recommended using
the "network" command to specify the LNet network, which is incorrect
when using lctl in command-line mode:

  # lctl peer_list
  You must run the 'network' command before 'peer_list'.
  # lctl network tcp0 peer_list
  # lctl --net tcp0 peer_list
  12345-192.168.20.1@tcp [1]192.168.40.147->mookie-gig:988 #15

Fix that to correctly recommend using the "--net" command when using
non-interactive mode, and to return an error if "network" is used in
non-interactive mode with extra arguments.

Replace mention of "portals" in help messages with "LNet".
Remove mention of obsolete elan, qsw, ra network types.

Improve the help message content for related subcommands.
Fix whitespace and command descriptions for related subcommands.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Idf9bf663b16012ebc9f38566ecba9859a54cab07
Reviewed-on: http://review.whamcloud.com/16980
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7336 ofd: cleanup proc when ofd_info_init fails 34/16934/3
Li Xi [Sat, 24 Oct 2015 05:36:00 +0000 (13:36 +0800)]
LU-7336 ofd: cleanup proc when ofd_info_init fails

In ofd_init0(), if ofd_info_init() fails, it should clean up
the proc entries.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I3ff278526f09ef7e36631712ce21a498a6644907
Reviewed-on: http://review.whamcloud.com/16934
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7012 osp: don't use OSP when import is deactivated 37/16937/3
Mikhail Pershin [Wed, 23 Sep 2015 19:09:27 +0000 (12:09 -0700)]
LU-7012 osp: don't use OSP when import is deactivated

Unset the opd_imp_connected flag upon an IMP_EVENT_INACTIVE event.
This stops any llog processing by that device until the import
is activated again.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ie219e536c216130f428ba933d11842511692c95b
Reviewed-on: http://review.whamcloud.com/16937
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7371 osd-ldiskfs: fix wrong read length over isize 20/17020/5
Li Xi [Mon, 2 Nov 2015 16:39:31 +0000 (00:39 +0800)]
LU-7371 osd-ldiskfs: fix wrong read length over isize

If the isize is 4095, a read length of 4096 will be
returned because of a wrong calculation of EOF. This patch fixes
the problem.
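
For reference, a read that crosses EOF should be trimmed to the bytes
that actually exist. The helper below is a minimal sketch of that
clamping with a hypothetical name (clamp_read_len()); the commit message
does not show the original faulty calculation, so only the expected
behaviour is illustrated.

```c
#include <stdint.h>

/* Number of bytes that can be read at offset 'pos' from a file of
 * size 'isize' when 'len' bytes are requested. */
uint64_t clamp_read_len(uint64_t isize, uint64_t pos, uint64_t len)
{
    if (pos >= isize)
        return 0;              /* starting at or past EOF */
    if (len > isize - pos)
        return isize - pos;    /* trim the part beyond EOF */
    return len;
}
```

With isize 4095, a 4096-byte read from offset 0 yields 4095 bytes
rather than 4096.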

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I73b18641f000a2d96067243c08c26e51d0d53244
Reviewed-on: http://review.whamcloud.com/17020
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7261 ldiskfs: clean up code style for large_xattr 78/16778/4
Andreas Dilger [Fri, 9 Oct 2015 06:37:03 +0000 (00:37 -0600)]
LU-7261 ldiskfs: clean up code style for large_xattr

Clean up the code style for the large_xattr patches to match the
upstream kernel style (use ! instead of == 0, and similar), and
the original style of the code before the earlier versions of the
patch were applied.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Icffe339d1b002a55984856829afde9e3eae98bd9
Reviewed-on: http://review.whamcloud.com/16778
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7379 kernel: kernel update RHEL7.1 [3.10.0-229.20.1.el7] 44/17044/3
Bob Glossman [Tue, 3 Nov 2015 22:09:32 +0000 (14:09 -0800)]
LU-7379 kernel: kernel update RHEL7.1 [3.10.0-229.20.1.el7]

Test-Parameters: mdsdistro=el7 ossdistro=el7 \
  clientdistro=el7 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Update RHEL 7.1 kernel to 3.10.0-229.20.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I690daa6493232353703f5392cfe0a979b824f3f1
Reviewed-on: http://review.whamcloud.com/17044
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7304 ldiskfs: fix bug when bigalloc is enabled 32/16832/3
Wang Shilong [Tue, 13 Oct 2015 00:28:29 +0000 (20:28 -0400)]
LU-7304 ldiskfs: fix bug when bigalloc is enabled

The following error is seen when the bigalloc feature is
enabled for ldiskfs on rhel7:

LDISKFS-fs error (device sdb):
ldiskfs_mb_check_ondisk_bitmap:3611: comm mkdir:
 on-disk bitmap for group 8corrupted: 0 blocks free in
 bitmap, 32768 - in gd

Fix this by using EXT4_CLUSTERS_PER_GROUP; otherwise,
we get the wrong value and the check fails, which
makes the FS become read-only.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I7f61918918e6f4e2f372929181b704b0648dcbca
Reviewed-on: http://review.whamcloud.com/16832
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7354 osd: avoid NULL pointer in osd_obj_update_entry 10/17010/3
Fan Yong [Sun, 13 Sep 2015 09:41:13 +0000 (17:41 +0800)]
LU-7354 osd: avoid NULL pointer in osd_obj_update_entry

In osd_obj_update_entry(), the variable @oi_fid may be NULL.
Check for that case before using it further, to avoid
accessing invalid memory.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibf47e949d69f0b9e5657a6dce2007fe4f6f1a9f6
Reviewed-on: http://review.whamcloud.com/17010
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7243 misc: update Intel copyright messages 2015 58/16758/3
Andreas Dilger [Tue, 6 Oct 2015 23:25:40 +0000 (17:25 -0600)]
LU-7243 misc: update Intel copyright messages 2015

Update copyright messages in files modified by Intel employees
in 2015 by non-trivial patches.  Exclude patches that are only
deleting code, renaming functions, or adding or removing whitespace.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I70fe6a346790e15d23606a3f380e7ef8fb8b84a0
Reviewed-on: http://review.whamcloud.com/16758
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7230 llite: clear dir stripe md in ll_iget 77/16677/15
Di Wang [Thu, 8 Oct 2015 07:51:16 +0000 (00:51 -0700)]
LU-7230 llite: clear dir stripe md in ll_iget

If ll_iget fails during inode initialization, especially
during striped directory lookup after a failed creation,
then it should clear the stripe MD before make_bad_inode(),
because make_bad_inode() will reset the i_mode, which
can cause ll_clear_inode() to skip freeing the stripe MD.

Remove the name entry from the directory once creation
has failed. Note: this will not roll back all of the local
operations, and LFSCK will take care of the orphan object.

Add sanity.sh 300p to verify the case.

Also enable lfs rm_entry for local objects, because
it is quite possible to create a corrupted local
striped directory, and we might need to use
"lfs rm_entry" to delete the corrupted striped dir.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I892c52117b83c8348aa0ceb888e73c84e79ffe46
Reviewed-on: http://review.whamcloud.com/16677
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years agoLU-6634 llog: destroy plain llog if init fails 27/16427/15
Alex Zhuravlev [Tue, 15 Sep 2015 09:36:35 +0000 (12:36 +0300)]
LU-6634 llog: destroy plain llog if init fails

llog_cat_add_rec() should destroy the plain llog
in the same transaction if its initialization
failed. Also, llog_osd_write_rec() should check that
the object still exists, as it's possible that
another thread failed to initialize and destroyed
the llog.

Change-Id: I7b823d34b32b5caaf0cc17b4cfe278a07a78ec15
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16427
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7277 lod: keep trying to get remote update log 86/16786/2
Di Wang [Thu, 8 Oct 2015 08:09:04 +0000 (01:09 -0700)]
LU-7277 lod: keep trying to get remote update log

Because the remote MDT might be in recovery at the same
time, keep trying to get the remote update log until
the recovery is aborted.
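
A minimal userspace sketch of the retry idea described above, not the
actual lod code. All names here (fetch_with_retry, demo_ctx and the
callbacks) are hypothetical, and the set of transient error codes is an
assumption borrowed from the later LU-7461 entry in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types; these are not part of the Lustre API. */
typedef int (*fetch_fn)(void *ctx);
typedef bool (*stop_fn)(void *ctx);

/* Errors assumed to mean "the remote side is still recovering". */
static bool is_transient(int rc)
{
	return rc == -EAGAIN || rc == -EIO || rc == -EBUSY;
}

/*
 * Call fetch() until it succeeds, fails with a non-transient error,
 * or should_stop() reports that recovery was aborted.
 */
static int fetch_with_retry(fetch_fn fetch, stop_fn should_stop, void *ctx)
{
	int rc;

	assert(fetch != NULL && should_stop != NULL);
	for (;;) {
		rc = fetch(ctx);
		if (!is_transient(rc) || should_stop(ctx))
			return rc;
		/* A real implementation would wait here before retrying. */
	}
}

/* Demo state: fail a fixed number of times, then succeed. */
struct demo_ctx {
	int failures_left;
	int calls;
	bool aborted;
};

static int demo_fetch(void *ctx)
{
	struct demo_ctx *c = ctx;

	c->calls++;
	if (c->failures_left > 0) {
		c->failures_left--;
		return -EAGAIN;
	}
	return 0;
}

static bool demo_stop(void *ctx)
{
	return ((struct demo_ctx *)ctx)->aborted;
}
```

With a demo_ctx that fails twice, fetch_with_retry() calls demo_fetch()
three times and returns 0. With aborted set, it gives up after the first
transient error and returns that error.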

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Id9543201ce543be730e73f9f51f3f7a0d10d3dfc
Reviewed-on: http://review.whamcloud.com/16786
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6992 test: wait device to be registered 81/16781/5
Hongchao Zhang [Thu, 22 Oct 2015 19:26:45 +0000 (03:26 +0800)]
LU-6992 test: wait device to be registered

In mount_facet, the device label can only be used after the device
has registered with the MGS and the label has been rewritten.

Change-Id: I9ed65631391f2be84e484e409fbfe59020d982be
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/16781
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6627 mdc: quiet console message for known -EINTR 11/14911/11
Andreas Dilger [Thu, 21 May 2015 17:58:03 +0000 (11:58 -0600)]
LU-6627 mdc: quiet console message for known -EINTR

If a user process is waiting for MDS recovery during close, but the
process is interrupted, the file is still closed but it prints a
message on the console.  Quiet the console message for -EINTR, since
this is expected behaviour.

Fix code style issues in this function.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3bc7284bb7014dbaeb9bfb27a4e98a8abb4a54b6
Reviewed-on: http://review.whamcloud.com/14911
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5187 ofd: Fix precreate console warning 67/10767/4
Nathaniel Clark [Fri, 20 Jun 2014 14:37:42 +0000 (10:37 -0400)]
LU-5187 ofd: Fix precreate console warning

Alter the console warning to be more explanatory.  Also add a debug
log message to save developer-centric info.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib1192562f75b1e2a2c8c9f4ee476da54ebeb1a9a
Reviewed-on: http://review.whamcloud.com/10767
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5472 tests: export UMOUNT to avoid SLES12 issue 45/16445/10
Yang Sheng [Fri, 23 Oct 2015 18:32:33 +0000 (02:32 +0800)]
LU-5472 tests: export UMOUNT to avoid SLES12 issue

The SLES12 umount command has an issue with the '-d' option:
it reports an error when an absolute pathname of the
mountpoint is given together with '-d'. So export
UMOUNT to avoid the problem. In fact, the loop device
is freed automatically, so '-d' need not be given
explicitly, but keep it for compatibility.
Also include the port of http://review.whamcloud.com/#/c/14799/
to master.
In SLES12, the umount command runs statfs() on the
filesystem, which causes unmounting a Lustre client
to hang when an OST is unavailable. This patch adds the "-f"
option to zconf_umount in conf-sanity.sh to avoid the
issue.

Test-Parameters: alwaysuploadlogs envdefinitions=ONLY=32 clientdistro=sles12 mdtcount=1 testlist=sanity
Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes clientdistro=sles12 mdtcount=1 testlist=conf-sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: If466c2101e0db52b5ec1f7273a846dc2497cfb84
Reviewed-on: http://review.whamcloud.com/16445
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-5770 osd: find bufsize in declare_xattr_set 12/12412/11
Alexander Zarochentsev [Fri, 24 Oct 2014 03:42:27 +0000 (07:42 +0400)]
LU-5770 osd: find bufsize in declare_xattr_set

mdd_link/mdd_unlink need to know the LINKEA buffer size before
the transaction starts, for correct estimation of tx credits.
If the xattr size is not known, make a call to osd_xattr_set()
to find the correct buffer size.

Xyratex-bug-id: MRP-2093
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Change-Id: I890e8182bd2d191a0322d25f0684f2a220873546
Reviewed-on: http://review.whamcloud.com/12412
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7039 llog: skip to next chunk for corrupt record 40/16740/3
Di Wang [Mon, 5 Oct 2015 03:34:41 +0000 (20:34 -0700)]
LU-7039 llog: skip to next chunk for corrupt record

Skip to next chunk if current record is corrupted.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I7729ca1b10646fa796a3f94aabe39d8d36cf613a
Reviewed-on: http://review.whamcloud.com/16740
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7325 ldiskfs: use correct types for inode num 13/16913/4
Alexander Zarochentsev [Thu, 22 Oct 2015 12:49:00 +0000 (15:49 +0300)]
LU-7325 ldiskfs: use correct types for inode num

Using a signed integer for inode numbers in the large EA code
results in incorrect inode numbers when they are cast to
unsigned long in the ext4_iget() call.
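
The effect can be illustrated with a small sketch. This is not the
ldiskfs code: the function names are hypothetical, and uint64_t stands
in for unsigned long under the assumption of a 64-bit Linux target.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a signed 32-bit value sign-extends it into the high bits. */
static uint64_t widen_signed(int32_t ino)
{
	return (uint64_t)ino;
}

/* Widening an unsigned 32-bit value zero-extends it, as intended. */
static uint64_t widen_unsigned(uint32_t ino)
{
	return (uint64_t)ino;
}

/* Inode 2147483649 (0x80000001) has the top bit of a 32-bit word set. */
static void demo_inode_widening(void)
{
	/* The same bit pattern, held in a signed variable, is negative. */
	int32_t as_signed = -2147483647;
	uint32_t as_unsigned = 2147483649u;

	assert(widen_unsigned(as_unsigned) == 0x80000001ULL);
	assert(widen_signed(as_signed) == 0xFFFFFFFF80000001ULL);
}
```

Any inode number above 2^31 - 1 is affected in this sketch, which is
why keeping the number in an unsigned type end to end avoids the
problem.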

Seagate-bug-id: MRP-3025
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Change-Id: I49e578a87c4d0f0274a9a42151675822f57c1c5f
Reviewed-on: http://review.whamcloud.com/16913
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6385 tests: sync I/O in obdfilter-survey 42/16942/2
Nathaniel Clark [Mon, 26 Oct 2015 16:14:49 +0000 (12:14 -0400)]
LU-6385 tests: sync I/O in obdfilter-survey

This ensures I/O is synced before each test.  The previous patch
http://review.whamcloud.com/14143 works only with the Lustre test
framework.  This will ensure I/O is always synced.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib19a9e4afd8ca83deceb78fe8fdb4d231da0bc40
Reviewed-on: http://review.whamcloud.com/16942
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6215 kernel: report 4.2.1 kernel support 69/16769/4
James Simmons [Wed, 28 Oct 2015 19:45:30 +0000 (15:45 -0400)]
LU-6215 kernel: report 4.2.1 kernel support

Add to the Lustre ChangeLog file support for 4.2.1 linux
kernels for clients and servers in the case of using the
latest ZFS backend.

Change-Id: If126c6c1dc8e46e0485f04090590759dc2eb1587
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/16769
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7040 test: skip sanity-hsm 12q for old MDSs 83/16683/2
John L. Hammond [Wed, 30 Sep 2015 14:52:48 +0000 (09:52 -0500)]
LU-7040 test: skip sanity-hsm 12q for old MDSs

Before 2.7.58 MDSs do not set OBD_MD_TSTATE in some of the cases that
they should. So skip sanity-hsm test_12q when running against these
servers.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I2ebe1b24e28bd4a0f5c03eace1c74678c7442c5f
Reviewed-on: http://review.whamcloud.com/16683
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years agoLU-1032 build: DKMS RPM for Lustre Client modules 47/12347/22
Bruno Faccini [Mon, 20 Oct 2014 14:00:01 +0000 (16:00 +0200)]
LU-1032 build: DKMS RPM for Lustre Client modules

Permit Lustre Client (only) modules DKMS RPM creation.

This patch is a follow on to the first set of patches for LU-1032
that only allowed for the creation of Lustre Server (zfs only)
modules DKMS RPM.

It also changes the original behavior by allowing dkms.conf to be
modified dynamically on-target. This particularly helps to change
the configure options and the list of modules to be built/installed,
for example to configure with gss and build the ptlrpc_gss.ko module
when krb5_devel is present, instead of making it a mandatory
required dependency.

Also implements feature of DKMS RPM creation from Makefile
(thanks to mjmac), now in 2 separate SRPM/RPM steps and for both
Client and Server versions.

Also use an auto-increment (Array[${#Array[@]}]=) operator in
dkms.conf modules declarations to help for future changes when
there will be a need to add/delete modules.

Change in lustre/utils Makefile has been required to allow
building of ptlrpc_gss module with --enable-gss and without the
need to specify --enable-utils which was causing an unexpected
zfs user-land dependency for DKMS Server build.

To satisfy lustre rpm requirement of a package providing
lustre-osd, provides has been added to DKMS Server RPM since
it does actually generate osd-zfs module.

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I278d50307a17fe49a06392351890946b7dd3557a
Reviewed-on: http://review.whamcloud.com/12347
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6889 kernel: new kernel [SLES11 SP4 3.0.101-65] 32/15832/19
Bob Glossman [Wed, 8 Jul 2015 17:35:47 +0000 (10:35 -0700)]
LU-6889 kernel: new kernel [SLES11 SP4 3.0.101-65]

add target and config files for SLES11 SP4

Test-Parameters: mdsdistro=sles11sp4 ossdistro=sles11sp4 \
  clientdistro=sles11sp4 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Test-Parameters: envdefinitions=ONLY=205 \
  testlist=sanity,sanity,sanity,sanity,sanity,sanity \
  clientdistro=sles11sp4 mdsdistro=sles11sp4 ossdistro=sles11sp4

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iba96fbbc834df76fbc1af019c5e67c4ca0282272
Reviewed-on: http://review.whamcloud.com/15832
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7299 utils: allow mkfs.lustre --index to specify in hex/dec 31/16831/6
Thomas Stibor [Thu, 15 Oct 2015 11:07:00 +0000 (13:07 +0200)]
LU-7299 utils: allow mkfs.lustre --index to specify in hex/dec

The mkfs.lustre --index argument should be able to handle hex
index values as well as decimal values, especially since the
OSTxxxx identifiers are printed in hexadecimal as well.
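
One common way to accept both forms in C is strtoul() with base 0. The
sketch below shows that approach; it is an assumption about how such
parsing could be done, not the code from this patch, and parse_index is
a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a decimal ("31") or hexadecimal ("0x1f") index.
 * Returns 0 and stores the value in *out, or -1 on bad input.
 * Note: with base 0, strtoul() also reads a leading "0" as octal.
 */
static int parse_index(const char *str, unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(str != NULL && out != NULL);

	/* strtoul() silently negates "-1"; reject signs explicitly. */
	if (strchr(str, '-') != NULL)
		return -1;

	errno = 0;
	val = strtoul(str, &end, 0);
	if (errno != 0 || end == str || *end != '\0')
		return -1;

	*out = val;
	return 0;
}
```

Here "0x1f" and "31" both yield 31, while "abc", "12x" and the empty
string are rejected.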

Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: I9f7564e3d674353fbebef18bde1598c01bb5bb2c
Reviewed-on: http://review.whamcloud.com/16831
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years agoLU-7186 lod: do not propagate size if stripeless 43/16743/6
Alex Zhuravlev [Wed, 7 Oct 2015 08:21:03 +0000 (11:21 +0300)]
LU-7186 lod: do not propagate size if stripeless

If a file has no stripes, but the size isn't zero,
then do not try to propagate this size to stripes.

Change-Id: I25401bcc41e3ea84a2b9158120f1e907af47fafa
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16743
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7086 tests: resolve /sbin symlink in test-framework.sh 06/16606/3
Emoly Liu [Wed, 23 Sep 2015 09:02:46 +0000 (17:02 +0800)]
LU-7086 tests: resolve /sbin symlink in test-framework.sh

In rhel7 and other new distros, /sbin is linked to /usr/sbin, so
the mount check in load_modules_local() is not matched anymore.
To avoid multiple bind mounts, this patch checks /sbin symlink
before using it directly.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I287ccd81ae4187a381a7f94dee30338d20dd6155
Reviewed-on: http://review.whamcloud.com/16606
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7109 lfsck: update OST-index in IDIF inside OSD 82/16282/3
Fan Yong [Mon, 27 Jul 2015 01:02:57 +0000 (09:02 +0800)]
LU-7109 lfsck: update OST-index in IDIF inside OSD

Old IDIF used "0" as the OST index, which may cause compatibility
issues when upgrading. There is a switch inside osd-ldiskfs that
controls whether to convert old IDIF to the new one. Once the real
OST index becomes part of the IDIF and is stored on disk (in the LMA
EA), the OST cannot be downgraded.

Because the conversion switch is inside osd-ldiskfs, the LFSCK should
not update the IDIF-in-LMA, instead, leave it to be handled by OSD to
avoid downgrading trouble.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9fee60ffabf732c0ab0734b1184e4e49638c9e88
Reviewed-on: http://review.whamcloud.com/16282
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6920 test: add some slack to jobstats expiry in test_205 53/16753/8
Bob Glossman [Wed, 7 Oct 2015 20:09:18 +0000 (13:09 -0700)]
LU-6920 test: add some slack to jobstats expiry in test_205

Add a tiny fudge factor to the time allowed for jobstats
to expire before checking on them.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iff2c70e2522b0da0e9c2080cadde49b96a9af60a
Reviewed-on: http://review.whamcloud.com/16753
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6443 tests: add debugging to mmp.sh test 42/14442/3
Andreas Dilger [Fri, 10 Apr 2015 22:05:56 +0000 (16:05 -0600)]
LU-6443 tests: add debugging to mmp.sh test

Print out information about MMP intervals to help debug problems
in the mmp.sh tests.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I6865f1c4a59c83f0dc9e46953b9d7ac7cc3ebbe5
Reviewed-on: http://review.whamcloud.com/14442
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7245 socklnd: Bind peers to a specific CPT 10/16710/4
James Simmons [Tue, 6 Oct 2015 02:54:27 +0000 (22:54 -0400)]
LU-7245 socklnd: Bind peers to a specific CPT

Currently the socklnd driver doesn't support
CPT affinity for its peers. Binding peers to
a specific CPT and memory allocated to the
NUMA node belonging to the CPT should give a
performance boost.

Change-Id: I1cc418dc4ba6269e346a4aa1454de79e580e1fba
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/16710
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7232 statahead: lock leaks if statahead file recreated 41/16841/3
Lai Siyao [Fri, 16 Oct 2015 03:30:59 +0000 (11:30 +0800)]
LU-7232 statahead: lock leaks if statahead file recreated

During statahead a file may be recreated. Though this is a rare case,
the current code will leak the lock; this patch releases the lock in
this case.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ifc1fee0fe5a0f377badc3fdf6dc2a6950a26cff6
Reviewed-on: http://review.whamcloud.com/16841
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6895 lfsck: conflict lu_dirent_attrs members 21/16821/4
Fan Yong [Sat, 29 Aug 2015 00:15:55 +0000 (08:15 +0800)]
LU-6895 lfsck: conflict lu_dirent_attrs members

LUDA_UNKNOWN is wrongly defined with the same value as
LUDA_UPGRADE.
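
A sketch of why two flags sharing a value is a problem, and of a
compile-time guard against it. The DEMO_* names and values are
hypothetical and are not the real lu_dirent_attrs definitions.

```c
#include <assert.h>

/* Hypothetical flag names and values, for illustration only. */
enum demo_dirent_attrs {
	DEMO_FID     = 0x0001,
	DEMO_UPGRADE = 0x1000,
	DEMO_UNKNOWN = 0x2000, /* must not reuse the DEMO_UPGRADE bit */
};

/* Compile-time guard: the build fails if the two flags share a bit. */
_Static_assert((DEMO_UPGRADE & DEMO_UNKNOWN) == 0,
	       "flag values must not overlap");

/* Test a single bit; only meaningful when each flag has its own bit. */
static int needs_upgrade(unsigned int attrs)
{
	return (attrs & DEMO_UPGRADE) != 0;
}
```

If DEMO_UNKNOWN were given the value 0x1000 as well, needs_upgrade()
would report true for entries that were only marked unknown, and the
static assertion would stop the build.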

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4067d7852166fab77df69736b3a160e9f67a6abc
Reviewed-on: http://review.whamcloud.com/16821
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7263 mdt: put mnew in mdt_reint_rename_internal() 51/16751/3
John L. Hammond [Wed, 7 Oct 2015 18:11:52 +0000 (13:11 -0500)]
LU-7263 mdt: put mnew in mdt_reint_rename_internal()

In mdt_reint_rename_internal() if the mnew object is remote then put
it before returning.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I7a380f2d54d6546a9009c062ead7c77b8c8a88ee
Reviewed-on: http://review.whamcloud.com/16751
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7091 mdd: refresh nlink after update linkea 36/16236/13
Di Wang [Wed, 7 Oct 2015 04:31:53 +0000 (21:31 -0700)]
LU-7091 mdd: refresh nlink after update linkea

It should refresh nlink after updating linkea, because
the update might decrease the nlink of the source object;
otherwise it might try to delete the stripeEA with the
wrong nlink.

And also before migration, it should check if all
of the parents of the migrating object are in the
target MDT, only checking if these parents parent
are remote is not enough.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Icfb4ffc666855f0c7f35f004ecc864c422610135
Reviewed-on: http://review.whamcloud.com/16236
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7039 tgt: Delete txn_callback correctly in tgt_init() 97/16797/3
Di Wang [Thu, 8 Oct 2015 23:15:29 +0000 (16:15 -0700)]
LU-7039 tgt: Delete txn_callback correctly in tgt_init()

txn_callback should only be deleted after initialization.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ifdfabe6439c1413d02782d0dfe7a14d6b82ed0df
Reviewed-on: http://review.whamcloud.com/16797
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoNew tag 2.7.62 2.7.62 v2_7_62 v2_7_62_0
Oleg Drokin [Mon, 26 Oct 2015 14:54:23 +0000 (10:54 -0400)]
New tag 2.7.62

Change-Id: Icf66ec576577b7054046d889335ee019e4853351
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7285 update: update next transno only if recovery succeeds 99/16799/3
Di Wang [Thu, 8 Oct 2015 23:58:35 +0000 (16:58 -0700)]
LU-7285 update: update next transno only if recovery succeeds

Update obd_next_recovery_transno only if update recovery
succeeds; otherwise, if a client sends a replay request with the
same transno, it will cause a panic in check_for_next_transno():

LustreError: 4529:0:(ldlm_lib.c:1826:check_for_next_transno())
ASSERTION( req_transno >= next_transno ) failed: req_transno:
1404455952555, next_transno: 1404455952556

LustreError: 4529:0:(ldlm_lib.c:1826:check_for_next_transno()) LBUG
Call Trace:
[<ffffffffa074c875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa074ce77>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0a8640c>] check_for_next_transno+0x68c/0x6d0 [ptlrpc]
[<ffffffffa089a6ed>] ? keys_fini+0x16d/0x240 [obdclass]
[<ffffffffa0a85d80>] ? check_for_next_transno+0x0/0x6d0 [ptlrpc]
[<ffffffffa0a82883>] target_recovery_overseer+0x93/0x320 [ptlrpc]
[<ffffffffa0a81000>] ? exp_req_replay_healthy+0x0/0x30 [ptlrpc]
[<ffffffffa0a89510>] target_recovery_thread+0x6d0/0x2380 [ptlrpc]
[<ffffffffa0a88e40>] ? target_recovery_thread+0x0/0x2380 [ptlrpc]
[<ffffffff8109e78e>] kthread+0x9e/0xc0

Add replay-single.sh 71a to verify double MDTs failover.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Id74768a851985a1cec53e6bce28a0bf00b3fc1c7
Reviewed-on: http://review.whamcloud.com/16799
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6271 osc: faulty assertion in osc_object_prune() 27/16727/5
Jinshan Xiong [Tue, 6 Oct 2015 00:45:36 +0000 (17:45 -0700)]
LU-6271 osc: faulty assertion in osc_object_prune()

There may be pages still being freed in the object's radix tree
at the time of osc_object_prune(), which makes the assertion
(osc->oo_npages == 0) fail. This is a safe
race.

This problem is introduced in change at:
Lustre-commit: e8b421531c166b91ab5c1f417570c544bcdd050c
Lustre-change: http://review.whamcloud.com/16456

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I7d4e59bccfb012b870a2e8fa7ab99774def57349
Reviewed-on: http://review.whamcloud.com/16727
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7195 jobstats: Allow setting static content for jobid_var 98/16598/7
Oleg Drokin [Mon, 12 Oct 2015 15:33:34 +0000 (11:33 -0400)]
LU-7195 jobstats: Allow setting static content for jobid_var

When enabling jobstats, a ten percent performance drop was observed
when running any job. This was due to the expense of the kernel
acquiring the process environment state. Create an alternative
way of setting jobid_var besides meddling directly in process
environment variables (which is also not possible on certain
platforms due to unexported symbols): create a jobid_name
proc file to represent this info (to be filled in by the job
scheduler epilogue). This is based on the upstream commit

Linux-commit : 76133e66b1417a73c0950d0716219d09ee21d595

except that it doesn't remove the process environment probing, to
allow backwards compatibility. This patch doesn't notify
admins that using the old jobstats proc method has a heavy cost.

Change-Id: If81733e549222a7ab31b24673f0e9b8401541130
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
CC: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: http://review.whamcloud.com/16598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
8 years agoLU-7261 ldiskfs: fix large_xattr overwrite 77/16777/4
Alexey Lyashkov [Fri, 9 Oct 2015 06:23:47 +0000 (00:23 -0600)]
LU-7261 ldiskfs: fix large_xattr overwrite

Handle the case where a large (external inode) xattr is being replaced
correctly.  The special case for this in ext4_set_xattr() was
incorrectly setting the offset of the xattr data within the inode
when it shouldn't have.

Add an e2fsck check of the large_xattr filesystem in conf-sanity
test_61 to verify this is working correctly.

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I27123d7985eff0538b6f64139cebc2f0f1806260
Reviewed-on: http://review.whamcloud.com/16777
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6921 test: failed to operate on TBF rules 05/16305/3
vinayakswami hariharmath [Tue, 8 Sep 2015 06:20:25 +0000 (11:50 +0530)]
LU-6921 test: failed to operate on TBF rules

Operate tbf rules on ost1 rather than ost0.
ost0 looks to be wrong target since OSTCOUNT starts from 1.

Signed-off-by: vinayakswami hariharmath <vinayakswami.hariharmath@seagate.com>
Change-Id: I57b4e1d09411638ba35b37473421c747958620cf
Reviewed-on: http://review.whamcloud.com/16305
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
8 years agoLU-6868 mdd: add changelog for migration 45/16645/7
wang di [Thu, 24 Sep 2015 07:35:33 +0000 (00:35 -0700)]
LU-6868 mdd: add changelog for migration

Add changelog for migration, so robinhood policy engine
can handle the migration command.

Add test_160d to verify the migration changelog

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Iaa33dee607fcd79285f59bd3131d70b7e5329622
Reviewed-on: http://review.whamcloud.com/16645
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-7295 osp: do not warn on uncommitted changes 17/16817/2
Alex Zhuravlev [Wed, 14 Oct 2015 10:07:53 +0000 (13:07 +0300)]
LU-7295 osp: do not warn on uncommitted changes

There is no need to warn about uncommitted changes at umount.

Change-Id: I7a5578d7ea044553fa8a9544e1ee6998468842b4
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16817
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6746 ptlrpc: Move IT_* definitions to lustre_idl.h 28/16228/7
Ben Evans [Mon, 12 Oct 2015 23:06:04 +0000 (19:06 -0400)]
LU-6746 ptlrpc: Move IT_* definitions to lustre_idl.h

Put IT_* definitions into an enum, as they're sent over the wire,
adjust calls, print statements, etc. to use the new enum.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Ie6ad700ac185459ace72ea67563864e43c548ec3
Reviewed-on: http://review.whamcloud.com/16228
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-6556 obdclass: re-allow catalog to wrap around 12/14912/28
Bruno Faccini [Thu, 21 May 2015 18:02:50 +0000 (20:02 +0200)]
LU-6556 obdclass: re-allow catalog to wrap around

Since the patch for LU-4528, a LLOG catalog is no longer allowed to
wrap around. This is a regression, and it can also cause catalog
corruption (growth beyond max-size/records) upon upgrading if the
catalog has already wrapped around.

This patch reintroduces catalog wrap around capability, and also
introduces a new test to extensively check it.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ife9a452199895ed9d9f43eb9fdeeac15322e272a
Reviewed-on: http://review.whamcloud.com/14912
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years agoLU-4341 test: skip failing sanity test 170 46/16146/5
Bob Glossman [Mon, 31 Aug 2015 19:22:32 +0000 (12:22 -0700)]
LU-4341 test: skip failing sanity test 170

Since sanity.sh test_170 always fails in sles11 testing,
add it to ALWAYS_EXCEPT when testing on sles11.
This can be removed when we have a real fix for the test.

Test-Parameters: mdsdistro=sles11sp3 ossdistro=sles11sp3 \
  clientdistro=sles11sp3 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testlist=sanity

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I76a2bfaad2bff8786ea832a4c9cabb11a71c11e4
Reviewed-on: http://review.whamcloud.com/16146
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years agoLU-7153 build: Update SPL/ZFS to 0.6.5.2 99/16399/9
Nathaniel Clark [Mon, 14 Sep 2015 16:51:48 +0000 (12:51 -0400)]
LU-7153 build: Update SPL/ZFS to 0.6.5.2

ZFS/SPL 0.6.5.2

Bug Fixes
* Init script fixes zfsonlinux/zfs#3816
* Fix uioskip crash when skip to end zfsonlinux/zfs#3806
  zfsonlinux/zfs#3850
* Userspace can trigger an assertion zfsonlinux/zfs#3792
* Fix quota userused underflow bug zfsonlinux/zfs#3789
* Fix performance regression from unwanted synchronous I/O
  zfsonlinux/zfs#3780
* Fix deadlock during ARC reclaim zfsonlinux/zfs#3808
  zfsonlinux/zfs#3834
* Fix deadlock with zfs receive and clamscan zfsonlinux/zfs#3719
* Allow NFS activity to defer snapshot unmounts zfsonlinux/zfs#3794
* Linux 4.3 compatibility zfsonlinux/zfs#3799
* Zed reload fixes zfsonlinux/zfs#3773
* Fix PAX Patch/Grsec SLAB_USERCOPY panic zfsonlinux/zfs#3796
* Always remove during dkms uninstall/update zfsonlinux/spl#476

ZFS/SPL 0.6.5.1

Bug Fixes

* Fix zvol corruption with TRIM/discard zfsonlinux/zfs#3798
* Fix NULL as mount(2) syscall data parameter zfsonlinux/zfs#3804
* Fix xattr=sa dataset property not honored zfsonlinux/zfs#3787

ZFS/SPL 0.6.5

Supported Kernels

* Compatible with 2.6.32 - 4.2 Linux kernels.

New Functionality

* Support for temporary mount options.
* Support for accessing the .zfs/snapshot over NFS.
* Support for estimating send stream size when source is a bookmark.
* Administrative commands are allowed to use reserved space improving
  robustness.
* New notify ZEDLETs support email and pushbullet notifications.
* New keyword 'slot' for vdev_id.conf to control what is used for the
  slot number.
* New zpool export -a option unmounts and exports all imported pools.
* New zpool iostat -y omits the first report with statistics since
  boot.
* New zdb can now open the root dataset.
* New zdb can print the numbers of ganged blocks.
* New zdb -ddddd can print details of block pointer objects.
* New zdb -b performance improved.
* New zstreamdump -d prints contents of blocks.

New Feature Flags

* large_blocks - This feature allows the record size on a dataset to
be set larger than 128KB. We currently support block sizes from 512
bytes to 16MB. The benefits of larger blocks, and thus larger IO, need
to be weighed against the cost of COWing a giant block to modify one
byte. Additionally, very large blocks can have an impact on I/O
latency, and also potentially on the memory allocator. Therefore, we
do not allow the record size to be set larger than zfs_max_recordsize
(default 1MB). Larger blocks can be created by changing this tunable;
pools with larger blocks can always be imported and used, regardless
of this setting.

* filesystem_limits - This feature enables filesystem and snapshot
limits. These limits can be used to control how many filesystems
and/or snapshots can be created at the point in the tree on which the
limits are set.
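
To make the large_blocks limits above concrete, here is a minimal sketch in C of a record size check. It is not the ZFS implementation: the function name `recordsize_is_valid` and its parameters are hypothetical. It assumes that record sizes are powers of two, that the limit without the feature is 128KB, and that the caller passes in the current value of zfs_max_recordsize.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_RECORDSIZE      (512ULL)               /* smallest supported block */
#define OLD_MAX_RECORDSIZE  (128ULL * 1024)        /* limit without large_blocks */
#define HARD_MAX_RECORDSIZE (16ULL * 1024 * 1024)  /* largest supported block */

/*
 * Hypothetical check: is `size` an acceptable record size?
 * `max_recordsize` stands in for the zfs_max_recordsize tunable.
 */
bool recordsize_is_valid(uint64_t size, bool large_blocks_enabled,
                         uint64_t max_recordsize)
{
    uint64_t limit = large_blocks_enabled ? max_recordsize
                                          : OLD_MAX_RECORDSIZE;

    /* The tunable can never raise the limit above the 16MB maximum. */
    if (limit > HARD_MAX_RECORDSIZE)
        limit = HARD_MAX_RECORDSIZE;

    /* Assumed rule: a record size is a power of two (one bit set). */
    if (size == 0 || (size & (size - 1)) != 0)
        return false;

    return size >= MIN_RECORDSIZE && size <= limit;
}
```

With the default tunable of 1MB, a 1MB record size passes and a 2MB one is rejected until the tunable is raised.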

Performance

* Improved zvol performance on all kernels (>50% higher throughput,
  >20% lower latency)
* Improved zil performance on Linux 2.6.39 and earlier kernels (10x
  lower latency)
* Improved allocation behavior on mostly full SSD/file pools (5% to
  10% improvement on 90% full pools)
* Improved performance when removing large files.
* Caching improvements (ARC):
  * Better cached read performance due to reduced lock contention.
  * Smarter heuristics for managing the total size of the cache and the
    distribution of data/metadata.
  * Faster release of cached buffers due to unexpected memory pressure.

Changes in Behavior

* Default reserved space was increased from 1.6% to 3.3% of total pool
capacity. This default percentage can be controlled through the new
spa_slop_shift module option; setting it to 6 restores the previous
percentage.

* Loading of the ZFS module stack is now handled by systemd or the
sysv init scripts. Invoking the zfs/zpool commands will not cause the
modules to be automatically loaded. The previous behavior can be
restored by setting the ZFS_MODULE_LOADING=yes environment variable
but this functionality will be removed in a future release.

* Unified SYSV and Gentoo OpenRC initialization scripts. The previous
functionality has been split into zfs-import, zfs-mount, zfs-share,
and zfs-zed scripts. This allows for independent control of the
services and is consistent with the unit files provided for a systemd
based system. Complete details of the functionality provided by the
updated scripts can be found here.

* Task queues are now dynamic and worker threads will be created and
destroyed as needed. This allows the system to automatically tune
itself to ensure the optimal number of threads are used for the active
workload which can result in a performance improvement.

* Task queue thread priorities were correctly aligned with the default
Linux file system thread priorities. This allows ZFS to compete fairly
with other active Linux file systems when the system is under heavy
load.

* When compression=on the default compression algorithm will be lz4 as
long as the feature is enabled. Otherwise the default remains lzjb.
Similarly lz4 is now the preferred method for compressing meta data
when available.

* The use of mkdir/rmdir/mv in the .zfs/snapshot directory has been
disabled by default both locally and via NFS clients. The
zfs_admin_snapshot module option can be used to re-enable this
functionality.

* LBA weighting is automatically disabled on files and SSDs ensuring
the entire device is used fairly.

* iostat accounting on zvols running on kernels older than Linux 3.19
is no longer supported.

* The known issues preventing swap on zvols for Linux 3.9 and newer
kernels have been resolved. However, deadlocks are still possible for
older kernels.
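
The reserved-space item in the list above can be illustrated with a little arithmetic. The sketch below assumes that the reserve is simply the pool capacity shifted right by spa_slop_shift and that the new default shift is 5; neither detail is stated in these notes, and the function name `reserved_bytes` is hypothetical. Under that assumption a shift of 5 reserves 1/32 of the pool (about 3.1%, in line with the roughly 3.3% quoted) and a shift of 6 reserves 1/64 (about 1.6%).

```c
#include <stdint.h>

/*
 * Assumed model, not taken from the ZFS source:
 * reserved bytes = pool capacity >> slop_shift.
 */
uint64_t reserved_bytes(uint64_t pool_bytes, unsigned int slop_shift)
{
    return pool_bytes >> slop_shift;
}
```

For a 1TiB pool this gives 32GiB with a shift of 5 and 16GiB with a shift of 6.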

Module Options

* Changed zfs_arc_c_min default from 4M to 32M to accommodate large
  blocks.
* Added metaslab_aliquot to control how many bytes are written to a
  top-level vdev before moving on to the next one. Increasing this may
  be helpful when using blocks larger than 1M.
* Added spa_slop_shift, see 'reserved space' comment in the 'Changes
  in Behavior' section.
* Added zfs_admin_snapshot, enable/disable the use of mkdir/rmdir/mv
  in .zfs/snapshot directory.
* Added zfs_arc_lotsfree_percent, throttle I/O when free system
  memory drops below this percentage.
* Added zfs_arc_num_sublists_per_state, used to allow more
  fine-grained locking.
* Added zfs_arc_p_min_shift, used to set a floor on arc_p.
* Added zfs_arc_sys_free, the target number of bytes the ARC should
  leave as free.
* Added zfs_dbgmsg_enable, used to enable the 'dbgmsg' kstat.
* Added zfs_dbgmsg_maxsize, sets the maximum size of the dbgmsg
  buffer.
* Added zfs_max_recordsize, used to control the maximum allowed
  record size.
* Added zfs_arc_meta_strategy, used to select the preferred ARC
  reclaim strategy.
* Removed metaslab_min_alloc_size, it was unused internally due to
  prior changes.
* Removed zfs_arc_memory_throttle_disable, replaced by
  zfs_arc_lotsfree_percent.
* Removed zvol_threads, zvols no longer require a dedicated task
  queue.
* See zfs-module-parameters(5) for complete details on available
  module options.

Bug Fixes

* Improved documentation with many updates, corrections, and
  additions.
* Improved sysv, systemd, initramfs, and dracut support.
* Improved block pointer validation before issuing IO.
* Improved scrub pause heuristics.
* Improved test coverage.
* Improved heuristics for automatic repair when zfs_recover=1 module
  option is set.
* Improved debugging infrastructure via 'dbgmsg' kstat.
* Improved zpool import performance.
* Fixed deadlocks in direct memory reclaim.
* Fixed deadlock on db_mtx and dn_holds.
* Fixed deadlock in dmu_objset_find_dp().
* Fixed deadlock during zfs rollback.
* Fixed kernel panic due to tsd_exit() in ZFS_EXIT.
* Fixed kernel panic when adding a duplicate dbuf to dn_dbufs.
* Fixed kernel panic due to security / ACL creation failure.
* Fixed kernel panic on unmount due to iput taskq.
* Fixed panic due to corrupt nvlist when running utilities.
* Fixed panic on unmount due to not waiting for all znodes to be
  released.
* Fixed panic with zfs clone from different source and target pools.
* Fixed NULL pointer dereference in dsl_prop_get_ds().
* Fixed NULL pointer dereference in dsl_prop_notify_all_cb().
* Fixed NULL pointer dereference in zfsdev_getminor().
* Fixed I/Os are now aggregated across ZIO priority classes.
* Fixed .zfs/snapshot auto-mounting for all supported kernels.
* Fixed 3-digit octal escapes by changing to 4-digit which
  disambiguates the output.
* Fixed hard lockup due to infinite loop in zfs_zget().
* Fixed misreported 'alloc' value for cache devices.
* Fixed spurious hung task watchdog stack traces.
* Fixed direct memory reclaim deadlocks.
* Fixed module loading in zfs import systemd service.
* Fixed intermittent libzfs_init() failure to open /dev/zfs.
* Fixed hot-disk sparing for disk vdevs.
* Fixed system spinning during ARC reclaim.
* Fixed formatting errors in zfs(8).
* Fixed zio pipeline stall by having callers invoke next stage.
* Fixed assertion failed in zrl_tryenter().
* Fixed memory leak in make_root_vdev().
* Fixed memory leak in zpool_in_use().
* Fixed memory leak in libzfs when doing rollback.
* Fixed hold leak in dmu_recv_end_check().
* Fixed refcount leak in bpobj_iterate_impl().
* Fixed misuse of input argument in traverse_visitbp().
* Fixed missing mutex_destroy() calls.
* Fixed integer overflows in dmu_read/dmu_write.
* Fixed verify() failure in zio_done().
* Fixed zio_checksum_error() to only include info for ECKSUM errors.
* Fixed -ESTALE to force lookup on missing NFS file handles.
* Fixed spurious failures from dsl_dataset_hold_obj().
* Fixed zfs compressratio when used with 4k sector size.
* Fixed spurious watchdog warnings in prefetch thread.
* Fixed unfair disk space allocation when vdevs are of unequal size.
* Fixed ashift accounting error writing to cache devices.
* Fixed zdb -d false positive warning when
  feature@large_blocks=disabled.
* Fixed zdb -h | -i seg fault.
* Fixed force-received full stream into a dataset if it has a
  snapshot.
* Fixed snapshot error handling.
* Fixed 'hangs' while deleting large files.
* Fixed lock contention (rrw_exit) while running a read only load.
* Fixed error message when creating a pool to include all problematic
  devices.
* Fixed Xen virtual block device detection, partitions are now
  created.
* Fixed missing E2BIG error handling in zfs_setprop_error().
* Fixed zpool import assertion in libzfs_import.c.
* Fixed zfs send -nv output to stderr.
* Fixed idle pool potentially running itself out of space.
* Fixed narrow race which allowed read(2) to access beyond fstat(2)'s
  reported end-of-file.
* Fixed support for VPATH builds.
* Fixed double counting of HDR_L2ONLY_SIZE in ARC.
* Fixed 'BUG: Bad page state' warning from kernel due to writeback
  flag.
* Fixed arc_available_memory() to check freemem.
* Fixed arc_memory_throttle() to check pageout.
* Fixed 'zpool create' warning when using zvols in debug builds.
* Fixed loop devices layered on ZFS with 4.1 kernels.
* Fixed zvol contribution to kernel entropy pool.
* Fixed handling of compression flags in arc header.
* Substantial changes to realign code base with illumos.
* Many additional bug fixes.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I87c012aec9ec581b10a417d699dafc7d415abf63
Reviewed-on: http://review.whamcloud.com/16399
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-6155 osd-zfs: dbuf_hold_impl() called without the lock 41/13541/16
Isaac Huang [Tue, 27 Jan 2015 21:03:32 +0000 (14:03 -0700)]
LU-6155 osd-zfs: dbuf_hold_impl() called without the lock

The osd-zfs osd_count_not_mapped() calls dbuf_hold_impl() without
the required lock. In addition, dbuf_hold_impl() is an internal
function and has the expensive side effect of reading the block
from disk which would convert a full-block write into a
read-modify-write.

Since space estimation with ZFS is complicated anyway, just use
the worst case as a rough estimate where a snapshot holds all current
blocks, i.e. no old space can be freed after the COW.

Skip test sanity-quota/23 on ZFS because overwrites on ZFS are not
guaranteed to be space neutral, and new worst-case assumptions will
always cause this test to fail.

Change-Id: Idf6f2ff80ff185ca8c0f38e1002ff90e457c3ca0
Signed-off-by: Isaac Huang <he.huang@intel.com>
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-on: http://review.whamcloud.com/13541
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6852 ldlm: Do not evict MDS-MDS connection 24/13224/45
wang di [Tue, 6 Oct 2015 18:36:43 +0000 (14:36 -0400)]
LU-6852 ldlm: Do not evict MDS-MDS connection

Do not put the MDT-MDT lock in the waiting lock list, so that
MDTs are not evicted because of lock timeouts between MDTs.
This helps update replay to finish eventually, so the DNE
filesystem will be in a consistent state after recovery.

If, for some reason, the filesystem hangs because of these
two changes, the administrator should step in, deactivate
the MDT manually, and run lfsck.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I83e7f8f55ee15730ed2d9826d08a398ddd72792a
Reviewed-on: http://review.whamcloud.com/13224
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6215 lnet: make o2iblnd buildable for 4.2.1 kernels 67/16767/3
James Simmons [Thu, 8 Oct 2015 14:33:06 +0000 (10:33 -0400)]
LU-6215 lnet: make o2iblnd buildable for 4.2.1 kernels

The commit f5c9753872cfa8ad47821be3fa924c74c4c8b0d
altered some macros for the ko2iblnd driver which wasn't
updated for the most recent kernels. A simple one-line change
restores this support.

Change-Id: Iedd5e36451bf84aae29058e40a89055f451bfeec
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/16767
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-2049 grant: delay grant releasing until commit 31/13531/14
Johann Lombardi [Mon, 26 Jan 2015 16:04:51 +0000 (17:04 +0100)]
LU-2049 grant: delay grant releasing until commit

Grant space acquired for a bulk write is released from the grant
accounting at the end of request processing. At that point, the
additional space consumed by the write request is believed to be
taken into account in any subsequent statfs call.
However, it does not seem to be the case with all backend
filesystems and more particularly ZFS which seems to provide
reliable space information only once the transaction associated
with the bulk write has committed. This creates a hole in the
grant space management where we can end up allocating more grant
space than is really available.

This patch postpones grant releasing until transaction commit time.
This is done by registering a commit callback in charge of this
operation.
The patch also removes the implicit use of info->fti_used and stores
the amount of grant space to be released in obdo::o_grant_used.
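
The deferred release described above can be sketched as a small stand-alone model in C. This is not the Lustre code: `struct grant_acct`, `struct txn`, `grant_release_on_commit()` and `txn_commit()` are hypothetical names, and the fixed-size callback array stands in for the real commit callback machinery. The field `gcc_grant_used` plays the role that the message assigns to obdo::o_grant_used.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CBS 8

/* Hypothetical grant accounting for one client. */
struct grant_acct {
    uint64_t ga_granted;        /* space currently counted as granted */
};

/* Hypothetical commit callback: remembers how much grant to release. */
struct grant_commit_cb {
    struct grant_acct *gcc_acct;
    uint64_t gcc_grant_used;    /* amount consumed by the bulk write */
};

/* Hypothetical transaction that runs its callbacks when it commits. */
struct txn {
    struct grant_commit_cb t_cbs[MAX_CBS];
    size_t t_nr_cbs;
};

/* End of request processing: register the release instead of doing it. */
int grant_release_on_commit(struct txn *t, struct grant_acct *acct,
                            uint64_t used)
{
    if (t->t_nr_cbs >= MAX_CBS)
        return -1;
    t->t_cbs[t->t_nr_cbs].gcc_acct = acct;
    t->t_cbs[t->t_nr_cbs].gcc_grant_used = used;
    t->t_nr_cbs++;
    return 0;
}

/* Transaction commit: only now is the grant dropped from the accounting. */
void txn_commit(struct txn *t)
{
    for (size_t i = 0; i < t->t_nr_cbs; i++) {
        struct grant_commit_cb *cb = &t->t_cbs[i];
        uint64_t used = cb->gcc_grant_used;

        /* Never release more than is currently accounted for. */
        if (used > cb->gcc_acct->ga_granted)
            used = cb->gcc_acct->ga_granted;
        cb->gcc_acct->ga_granted -= used;
    }
    t->t_nr_cbs = 0;
}
```

Between registration and commit the accounting still includes the write's grant, which is the window in which the backend's statfs output may not yet reflect the new data.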

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: Id99b8712ffc1e5f103df4835b698127619b8ba85
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/13531
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6204 misc: Add missing MODULE_VERSION for lustre 29/16729/4
James Simmons [Wed, 7 Oct 2015 14:46:27 +0000 (10:46 -0400)]
LU-6204 misc: Add missing MODULE_VERSION for lustre

Many of the lustre modules are missing a MODULE_VERSION.
Update the remaining MODULE_AUTHORS from Intel to OpenSFS.

Change-Id: Iae24d820c68c570c6e1399bbc7396060d21bdf41
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/16729
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7244 llite: Fix XATTR_NAME_EVM redefinition 07/16707/3
Dmitry Eremin [Sat, 12 Sep 2015 04:10:37 +0000 (23:10 -0500)]
LU-7244 llite: Fix XATTR_NAME_EVM redefinition

In Linux kernel version 3.2.x, XATTR_NAME_EVM is defined but
XATTR_NAME_IMA is not. So check them independently.
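
A minimal sketch of such independent checks follows. It is an illustration and not the actual patch: it is written as a stand-alone fragment, so it supplies its own fallback for XATTR_SECURITY_PREFIX where kernel code would get it from the xattr header.

```c
/* Stand-in for the kernel header, which may define none, one, or both names. */
#ifndef XATTR_SECURITY_PREFIX
#define XATTR_SECURITY_PREFIX "security."
#endif

/* Guard each name on its own rather than defining both under one #ifndef. */
#ifndef XATTR_NAME_EVM
#define XATTR_EVM_SUFFIX "evm"
#define XATTR_NAME_EVM XATTR_SECURITY_PREFIX XATTR_EVM_SUFFIX
#endif

#ifndef XATTR_NAME_IMA
#define XATTR_IMA_SUFFIX "ima"
#define XATTR_NAME_IMA XATTR_SECURITY_PREFIX XATTR_IMA_SUFFIX
#endif
```

With separate guards, a kernel that already provides XATTR_NAME_EVM keeps its own definition while the missing XATTR_NAME_IMA is still supplied.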

Change-Id: Ib98534d278ae4d5eaaa86538beb9bf683b9cf807
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/16707
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7122 utils: changelog_{de}register cleanup 41/16341/4
Henri Doreau [Tue, 28 Apr 2015 14:15:41 +0000 (16:15 +0200)]
LU-7122 utils: changelog_{de}register cleanup

Document the -n switch for "changelog_register" in the man page.
Apply coding style and remove unneeded code.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: I38431d371bc08f5068e5b7e3e62a7847dc64283d
Reviewed-on: http://review.whamcloud.com/16341
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6899 test: rename sanity test_162 to test_162a 10/15710/2
Elena Gryaznova [Fri, 24 Jul 2015 14:20:09 +0000 (17:20 +0300)]
LU-6899 test: rename sanity test_162 to test_162a

Make this test run separately from the others in this group.
- sanity test_162 is renamed to test_162a

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Seagate-bug-id: MRP-2496
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Change-Id: I945b5ab006722d230058ebf44538480e018964c9
Reviewed-on: http://review.whamcloud.com/15710
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-5733 lnet: Use lnet_is_route_alive for router aliveness 55/14055/2
Chris Horn [Thu, 12 Mar 2015 22:39:17 +0000 (17:39 -0500)]
LU-5733 lnet: Use lnet_is_route_alive for router aliveness

lctl show_route and lctl route_list will output router aliveness
information via lnet_get_route(). lnet_get_route() should use the
lnet_is_route_alive() function, introduced in e8a1124
http://review.whamcloud.com/7857, to determine route aliveness.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ie57aebeb6b4c80a3b89ed72fc6acbccbbd321be1
Reviewed-on: http://review.whamcloud.com/14055
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7184 lod: cleanup unused OSP devices on error 35/16635/3
John L. Hammond [Thu, 24 Sep 2015 20:46:31 +0000 (15:46 -0500)]
LU-7184 lod: cleanup unused OSP devices on error

In lod_add_device() if the OSP device to be added cannot be added then
call LCFG_CLEANUP in the cleanup path.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I01e0a5b0f541481a002cf60fcece05908ba3194f
Reviewed-on: http://review.whamcloud.com/16635
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6895 scrub: not trigger scrub if inode removed by race 39/16439/8
Fan Yong [Mon, 21 Sep 2015 09:10:48 +0000 (17:10 +0800)]
LU-6895 scrub: not trigger scrub if inode removed by race

When osd_consistency_check() runs, the target file may have just
been removed by someone else, so osd_oi_lookup() returns -ENOENT
to the caller. In that case, osd_consistency_check() should not
trigger OI scrub.

On the other hand, if someone unlinks the file while OI scrub is
adding the missing OI mapping to the OI file, OI scrub needs to
remove the newly added OI mapping.

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs \
ostfilesystemtype=ldiskfs clientdistro=el7 ossdistro=el7 \
mdsdistro=el7 mdtcount=1 \
testlist=sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4703fc7f99d7b0a0f769127b5cdba5a2b992250d
Reviewed-on: http://review.whamcloud.com/16439
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6895 lfsck: not destroy directory when fix FID-in-dirent 40/16440/9
Fan Yong [Fri, 14 Aug 2015 04:34:54 +0000 (12:34 +0800)]
LU-6895 lfsck: not destroy directory when fix FID-in-dirent

When repairing FID-in-dirent, lfsck may append the FID directly
after the name entry. If the space after the name entry is not
checked properly, it may overwrite the subsequent name entry and
corrupt the whole directory.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes,ENABLE_QUOTA=yes mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs clientdistro=el7 ossdistro=el7 mdsdistro=el7 mdtcount=1 testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia1afc643fdfac205a5ea7aa9c365e45b4da90868
Reviewed-on: http://review.whamcloud.com/16440
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6386 tgt: don't update client data with smaller transno 13/14113/6
Mikhail Pershin [Thu, 1 Oct 2015 18:34:27 +0000 (21:34 +0300)]
LU-6386 tgt: don't update client data with smaller transno

Fix tgt_last_rcvd_update() so that it does not update the transaction
number in the client slot with a smaller value.

The patch also removes outdated code for the ted_lcd == NULL case.
This is not possible now, because lcd is set to NULL only
upon export destroy. The check was needed in the past, when
lcd was set to NULL during export disconnect and some activity
was still possible on this export.
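
The transno rule for the client slot can be sketched as follows. This is a simplified model, not the tgt_last_rcvd_update() code: `struct client_slot` and `slot_update_transno()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-client slot data. */
struct client_slot {
    uint64_t cs_last_transno;   /* highest transno recorded so far */
};

/* Record `transno` in the slot; returns true if the slot was updated. */
bool slot_update_transno(struct client_slot *slot, uint64_t transno)
{
    /* An older (or equal) transno must not move the slot backwards. */
    if (transno <= slot->cs_last_transno)
        return false;

    slot->cs_last_transno = transno;
    return true;
}
```

Replies processed out of order then leave the slot holding the largest transno seen, whichever of them finishes last.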

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I34717ea91493785beadcf725d49c4c9265b63f7c
Reviewed-on: http://review.whamcloud.com/14113
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7045 osd: enough credits for single indirect block write 30/16330/8
Fan Yong [Fri, 7 Aug 2015 05:13:03 +0000 (13:13 +0800)]
LU-7045 osd: enough credits for single indirect block write

For the single indirect block case, if the i_data[LDISKFS_IND_BLOCK]
block is not allocated, osd_calc_bkmap_credits() should declare
three additional blocks for the subsequent write operation; otherwise,
it should reserve one additional block for that.
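
The rule for the single indirect block case can be written down as a tiny sketch. It is not the osd_calc_bkmap_credits() code: the function name is hypothetical, `SKETCH_IND_BLOCK` stands in for LDISKFS_IND_BLOCK with an assumed value of 12, and a zero block pointer is assumed to mean "not allocated".

```c
#include <stdint.h>

/* Assumed index of the single indirect block pointer in i_data[]. */
#define SKETCH_IND_BLOCK 12

/* Extra blocks to declare for a write through the single indirect block. */
int single_indirect_extra_credits(const uint32_t *i_data)
{
    /* A zero pointer is taken to mean the indirect block is unallocated. */
    if (i_data[SKETCH_IND_BLOCK] == 0)
        return 3;

    /* The indirect block already exists: one more block is enough. */
    return 1;
}
```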

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,CONF_SANITY_EXCEPT=45 mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs clientdistro=el7 ossdistro=el7 mdsdistro=el7 mdtcount=1 testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity,conf-sanity
Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,CONF_SANITY_EXCEPT=45 mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs clientdistro=el7 ossdistro=el7 mdsdistro=el7 mdscount=2 mdtcount=4 testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I76b50cef8df56b49dae7afe4d759a55599548479
Reviewed-on: http://review.whamcloud.com/16330
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-6842 clio: add cl_page LRU shrinker 30/15630/12
Bobi Jam [Fri, 17 Jul 2015 05:36:37 +0000 (13:36 +0800)]
LU-6842 clio: add cl_page LRU shrinker

Register cache shrinker to reclaim memory from cl_page LRU list.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Id22fd1f1f8554dc03ac7313a58abd8cd3472ece0
Reviewed-on: http://review.whamcloud.com/15630
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7005 tests: wait client imports fully recovered 83/15983/10
wang di [Wed, 12 Aug 2015 06:25:05 +0000 (23:25 -0700)]
LU-7005 tests: wait client imports fully recovered

In conf-sanity.sh 50i, it should wait for the client and all MDTs to
recover before creating files.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I637ebfb6c531708e194df4c03d8657361d1b40ee
Reviewed-on: http://review.whamcloud.com/15983
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7196 kernel: kernel update RHEL 6.7 [2.6.32-573.7.1.el6] 08/16608/4
Bob Glossman [Tue, 22 Sep 2015 16:13:48 +0000 (09:13 -0700)]
LU-7196 kernel: kernel update RHEL 6.7 [2.6.32-573.7.1.el6]

update RHEL 6.7 kernel to 2.6.32-573.7.1.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I1b90ac046582c052612219b8af1d172069bb01fd
Reviewed-on: http://review.whamcloud.com/16608
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-6886 mdd: declare changelog store for POSIX ACLs 60/15660/3
Li Dongyang [Tue, 21 Jul 2015 05:20:49 +0000 (15:20 +1000)]
LU-6886 mdd: declare changelog store for POSIX ACLs

mdd_xattr_del() records POSIX ACL ops in the changelog,
so we should declare them in mdd_declare_xattr_del().

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Change-Id: I9184c7906d0da715c12b833bab080c56a1a07285
Reviewed-on: http://review.whamcloud.com/15660
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
8 years ago LU-7074 mdd: validate the linkea before packing 35/16235/13
wang di [Wed, 2 Sep 2015 10:51:28 +0000 (03:51 -0700)]
LU-7074 mdd: validate the linkea before packing

During migration, validate the linkea entry before
packing the updates into the buffer and sending them to
the remote MDT.

Also retrieve the linkea before the transaction starts,
to avoid sending an RPC inside the transaction.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I36f235274d39560f6654fd76967e45400e8187ce
Reviewed-on: http://review.whamcloud.com/16235
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
8 years ago LU-7228 build: make lustre rpm also provide lustre-client 73/16673/2
Frank Zago [Tue, 29 Sep 2015 20:47:38 +0000 (15:47 -0500)]
LU-7228 build: make lustre rpm also provide lustre-client

An application packaged in an rpm has to depend on either lustre or
lustre-client. Since the lustre rpm also includes everything the
lustre-client rpm does, it should also provide lustre-client.

That way an application rpm only has to require lustre-client and not
juggle between a lustre or lustre-client dependency.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I46ae8a96b0fbc6153a288bf45896f7b4ed1dfddc
Reviewed-on: http://review.whamcloud.com/16673
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Thomas LEIBOVICI <thomas.leibovici@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>