Whamcloud - gitweb
fs/lustre-release.git
9 years ago LU-5866 build: add option to disable zfs build 76/12576/8
Wang Shilong [Mon, 10 Nov 2014 12:51:30 +0000 (20:51 +0800)]
LU-5866 build: add option to disable zfs build

Add the --disable-zfs option to disable building ZFS support for Lustre.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ie7c1c5d0417979f61f0294390377eaebc36fd320
Reviewed-on: http://review.whamcloud.com/12576
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5577 obdclass: change loop indexes to unsigned 87/12387/4
Dmitry Eremin [Mon, 13 Oct 2014 17:18:21 +0000 (21:18 +0400)]
LU-5577 obdclass: change loop indexes to unsigned

Clean up warnings about comparisons between signed and unsigned values.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I2d94940251f639942142d54a561225daa8cd8a74
Reviewed-on: http://review.whamcloud.com/12387
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5451 lod: improve weird FID handling 60/11560/9
John L. Hammond [Fri, 22 Aug 2014 15:21:25 +0000 (10:21 -0500)]
LU-5451 lod: improve weird FID handling

In lod_fld_lookup() the FID in question may have come from disk or
wire. Thus if fid_is_sane() returns false then return -EIO rather than
asserting.
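
The idea can be sketched in a few lines of C. This is only an illustration,
not the lod code: demo_fid, demo_fid_is_sane() and demo_fld_lookup() are
hypothetical stand-ins, and the sanity rule (a zero sequence is invalid) is
an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified FID; not the real struct lu_fid. */
struct demo_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* Assumed sanity rule for the sketch: a zero sequence is never valid. */
static int demo_fid_is_sane(const struct demo_fid *fid)
{
	return fid != NULL && fid->seq != 0;
}

/*
 * The FID may come from disk or the wire, so a bad value is an I/O
 * error to report to the caller, not a programming error to assert on.
 */
static int demo_fld_lookup(const struct demo_fid *fid, uint32_t *index)
{
	if (!demo_fid_is_sane(fid))
		return -EIO;

	/* Placeholder mapping from sequence to a target index. */
	*index = (uint32_t)(fid->seq % 4);
	return 0;
}
```

A caller would check the return value and leave its own state untouched
on -EIO.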

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6c7e3885a8b1aa81fcaa8891392a11e40a02fbce
Reviewed-on: http://review.whamcloud.com/11560
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5933 fiemap: set FIEMAP_EXTENT_LAST correctly 81/12781/2
Bobi Jam [Wed, 19 Nov 2014 05:47:58 +0000 (13:47 +0800)]
LU-5933 fiemap: set FIEMAP_EXTENT_LAST correctly

Once we have collected as many extents as the user requested, check one
extent further to decide whether we have reached the last extent of the file.
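
A minimal sketch of the "look one further" idea, assuming the extents are
already available in an array. demo_extent, demo_collect() and
DEMO_EXTENT_LAST are hypothetical names for this example; the real code
works on struct fiemap data and FIEMAP_EXTENT_LAST.

```c
#include <stddef.h>

#define DEMO_EXTENT_LAST 0x1u	/* stand-in for FIEMAP_EXTENT_LAST */

struct demo_extent {
	unsigned long long start;
	unsigned long long len;
	unsigned int flags;
};

/*
 * Copy at most "want" extents from src to dst. The last copied extent is
 * flagged only if no further extent exists beyond it.
 */
static size_t demo_collect(const struct demo_extent *src, size_t total,
			   struct demo_extent *dst, size_t want)
{
	size_t n = 0;

	while (n < want && n < total) {
		dst[n] = src[n];
		dst[n].flags &= ~DEMO_EXTENT_LAST;
		n++;
	}

	/* Look one further: is there anything after the last copied one? */
	if (n > 0 && n == total)
		dst[n - 1].flags |= DEMO_EXTENT_LAST;

	return n;
}
```

With this shape, a request for fewer extents than the file has never
reports the last flag prematurely.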

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic4c4710adf98552626d87d54c893ba9fa18ef7b8
Reviewed-on: http://review.whamcloud.com/12781
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5696 ptlrpc: missing wakeup for ptlrpc_check_set 58/12158/10
Liang Zhen [Wed, 1 Oct 2014 16:47:46 +0000 (00:47 +0800)]
LU-5696 ptlrpc: missing wakeup for ptlrpc_check_set

This patch changes a few things:

- There is no guarantee that request_out_callback will happen
  before reply_in_callback. If a request gets its reply and unlinks
  the reply buffer before request_out_callback is called, then the
  thread waiting on ptlrpc_request_set will miss the wakeup event.

  This may seriously impact the performance of some IO workloads or
  result in RPC timeouts.

- To make the code easier to understand, this patch changes the
  action bits "rq_req_unlink" and "rq_reply_unlink" to the
  status bits "rq_req_unlinked" and "rq_reply_unlinked".
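
The ordering problem described above can be illustrated with a small
sketch. It is not the ptlrpc implementation: demo_req and the demo_*
callbacks are hypothetical, and a counter stands in for waking the waiting
thread. The point is that each callback records a status bit and then
checks whether both are set, so the wakeup happens in either order.

```c
struct demo_req {
	unsigned int req_unlinked:1;	/* request buffer has been unlinked */
	unsigned int reply_unlinked:1;	/* reply buffer has been unlinked */
	int wakeups;			/* stands in for waking the waiter */
};

/* Wake the waiter only once both buffers are known to be unlinked. */
static void demo_wake_if_done(struct demo_req *req)
{
	if (req->req_unlinked && req->reply_unlinked)
		req->wakeups++;
}

static void demo_request_out(struct demo_req *req)
{
	req->req_unlinked = 1;
	demo_wake_if_done(req);
}

static void demo_reply_in(struct demo_req *req)
{
	req->reply_unlinked = 1;
	demo_wake_if_done(req);
}
```

If only the reply callback performed the check, a reply arriving first
would leave the waiter asleep, which is the missed wakeup described above.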

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Ie6043534af3c9b48a52da30210d327f3de83b866
Reviewed-on: http://review.whamcloud.com/12158
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5474 tests: sanity-hsm test_90 use local HSM_ARCHIVE 69/12069/11
James Nunez [Mon, 17 Nov 2014 18:29:40 +0000 (11:29 -0700)]
LU-5474 tests: sanity-hsm test_90 use local HSM_ARCHIVE

sanity-hsm test 90 suffers from frequent failures due to
slow archive speeds. If the existing archive is not
local, test 90 now uses a local disk archive to speed
the archive process.

sanity-hsm test 40 was modified to query the SINGLEAGT
node to check if the archive is a local disk for
HSM_ARCHIVE.

copytool_cleanup was modified to match copytool_setup;
remove the contents of $hsm_root and not $hsm_root itself.

Test 90 is removed from the exception list.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I0beee30b681d4b80f23d33cb42ff5b2944fc21d1
Reviewed-on: http://review.whamcloud.com/12069
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago Revert "LU-5275 lprocfs: remove last of non seq data structs and functions." 53/12953/3
Johann Lombardi [Fri, 5 Dec 2014 09:53:39 +0000 (09:53 +0000)]
Revert "LU-5275 lprocfs: remove last of non seq data structs and functions."

This reverts commit 0ad4f8a4227ed7dd93fec99d33c6bb25056473fc.
This patch has broken the el6.6 build:
include/linux/proc_fs.h:120: note: previous declaration of 'remove_proc_subtree' was here

Change-Id: I62d9b032448d9eea5d089b69382c2ff5064c5d3d
Reviewed-on: http://review.whamcloud.com/12953
Tested-by: Jenkins
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years ago LU-5581 ldlm: evict clients returning errors on ASTs 52/11752/5
Alexey Lyashkov [Sun, 2 Nov 2014 16:53:48 +0000 (11:53 -0500)]
LU-5581 ldlm: evict clients returning errors on ASTs

When a client returns an error other than EINVAL in reply to a blocking
AST, it is unsafe to cancel the lock on the server side only, because the
client may continue its I/O assuming it still owns the lock while the
real lock may already have been granted to another client.

The only valid error case is a client replying to the AST with EINVAL:
cancel the lock and return ERESTART. Evict the client in any other error
case.
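
The decision described above can be sketched as a small function. The
names demo_action and demo_handle_ast_reply() are hypothetical, and the
sketch assumes the reply status is a negative errno value; it leaves out
everything else the server does when handling AST replies.

```c
#include <errno.h>

enum demo_action {
	DEMO_KEEP,		/* no error, nothing to do */
	DEMO_CANCEL_LOCK,	/* cancel server-side lock, caller restarts */
	DEMO_EVICT_CLIENT	/* client state is unknown, evict it */
};

/* Map the status of a blocking AST reply to the action to take. */
static enum demo_action demo_handle_ast_reply(int rc)
{
	if (rc == 0)
		return DEMO_KEEP;

	/* EINVAL is the only error treated as valid in this sketch. */
	if (rc == -EINVAL)
		return DEMO_CANCEL_LOCK;

	/* Any other error: the client may still believe it owns the lock. */
	return DEMO_EVICT_CLIENT;
}
```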

Xyratex-bug-id: MRP-2041
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Change-Id: Ibce60ce3b2c24ba388155ac49cba8f20388893e7
Reviewed-on: http://review.whamcloud.com/11752
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4647 nodemap: fix problem with node reclassification 75/12575/2
Kit Westneat [Wed, 5 Nov 2014 14:23:10 +0000 (09:23 -0500)]
LU-4647 nodemap: fix problem with node reclassification

nodemap_add_member can't be used to move an already hashed member
to a new nodemap, so this patch copies the needed functionality to
nm_member_reclassif_cb. This also adds a mutex lock for reclassifying
so that there is only one nodemap reclassifying at a time.
Reclassifying takes a lock on a nodemap's nm_member_hash, so a
deadlock could arise if one nodemap is trying to add a member to
another nodemap, and that second nodemap is also reclassifying and
eventually tries to add a member to the first nodemap.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Icc93a8e6d8384afa90e45cc04f1422512974ce4a
Reviewed-on: http://review.whamcloud.com/12575
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5889 mdc: Proper accessing struct lov_user_md 83/12683/2
Yoshifumi Uemura [Wed, 12 Nov 2014 07:02:04 +0000 (16:02 +0900)]
LU-5889 mdc: Proper accessing struct lov_user_md

In mdc_setattr_pack(), access the members of struct lov_user_md in
little-endian byte order.
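
As an illustration of what reading a field in little-endian order means,
here is a portable userspace sketch. demo_le32_to_cpu(), demo_user_md and
demo_stripe_count() are hypothetical names; kernel code would normally use
the existing le32_to_cpu() helper instead of open-coding the conversion.

```c
#include <stdint.h>

/* Interpret a 32-bit value stored little-endian, whatever the host order. */
static uint32_t demo_le32_to_cpu(uint32_t le)
{
	const unsigned char *b = (const unsigned char *)&le;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical structure whose fields are stored little-endian. */
struct demo_user_md {
	uint32_t magic;
	uint32_t stripe_count;
};

/* Always convert before using a field, never read it raw. */
static uint32_t demo_stripe_count(const struct demo_user_md *md)
{
	return demo_le32_to_cpu(md->stripe_count);
}
```

On a little-endian host the conversion is a no-op, which is why a missing
conversion tends to go unnoticed until the code runs on a big-endian machine.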

Signed-off-by: Yoshifumi Uemura <kogexe@gmail.com>
Change-Id: I201f00f527242faa6e1a199d3e792e5cdfa48006
Reviewed-on: http://review.whamcloud.com/12683
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4975 ofd: Fix Doxygen warnings for ofd files 65/12665/2
Doug Oucharek [Tue, 11 Nov 2014 02:35:48 +0000 (18:35 -0800)]
LU-4975 ofd: Fix Doxygen warnings for ofd files

After patches 10417 and 10586, some Doxygen warnings were created
due to incorrect syntax and missing function parameters in header.
This patch fixes those warnings.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I6859d9cbe17f52562ae1e93ea4fa0afb08b3f547
Reviewed-on: http://review.whamcloud.com/12665
Tested-by: Jenkins
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years ago LU-1892 osp: Fix Doxygen warnings for osp_trans.c 59/12659/2
Doug Oucharek [Mon, 10 Nov 2014 20:22:14 +0000 (12:22 -0800)]
LU-1892 osp: Fix Doxygen warnings for osp_trans.c

After patch 10361, three Doxygen warnings were created due to
incorrect syntax. This patch fixes those.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: Iaaec90f545a8f55522b5f87111dea2b544592ea2
Reviewed-on: http://review.whamcloud.com/12659
Tested-by: Jenkins
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5813 lnet: Fix typo in route show command 57/12557/4
Amir Shehata [Tue, 4 Nov 2014 20:58:01 +0000 (12:58 -0800)]
LU-5813 lnet: Fix typo in route show command

Fix typo: s/vebose/verbose/

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I454dc319b670b176c96916ea6b1d44036f9f0199
Reviewed-on: http://review.whamcloud.com/12557
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5577 obdclass: lu_htable_order() return type to long 85/12385/2
Dmitry Eremin [Fri, 10 Oct 2014 19:35:16 +0000 (23:35 +0400)]
LU-5577 obdclass: lu_htable_order() return type to long

Change the return type to match its usage.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I4a2071f9ca51cc34f1fd7c73ccf7dac52a9ff0e9
Reviewed-on: http://review.whamcloud.com/12385
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5275 lprocfs: remove last of non seq data structs and functions. 98/12298/4
James Simmons [Mon, 3 Nov 2014 20:55:06 +0000 (15:55 -0500)]
LU-5275 lprocfs: remove last of non seq data structs and functions.

This patch removes the rest of the non-seq file data structs
and functions. We rename the current seq data structs and
functions to match what is in the upstream lustre client.
Some functions in newer kernels are absent in RHEL6.5 and
SLES11SP3 kernels but lustre has equivalent functions so
they have also been renamed to match what exists in newer
kernels.

Change-Id: Iec17cd214864fe7c004eae8972397be326cdfee4
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/12298
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-2675 obd: remove client_obd_lock_t 31/12231/4
John L. Hammond [Wed, 5 Nov 2014 01:10:43 +0000 (20:10 -0500)]
LU-2675 obd: remove client_obd_lock_t

Remove the definition of client_obd_lock_t and the functions
client_obd_list_{init,lock,unlock,done}(). Use spinlock_t for the
cl_{loi,lru}_list_lock members of struct client_obd and call
spin_{lock,unlock}() directly.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I3c4b9cf531b6d62c3481a40f4a1c448cf864beec
Reviewed-on: http://review.whamcloud.com/12231
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5536 target: allow FLD_READ request during recovery 99/12199/6
Mikhail Pershin [Mon, 6 Oct 2014 18:35:10 +0000 (22:35 +0400)]
LU-5536 target: allow FLD_READ request during recovery

FLD_READ opcode was introduced but not added to the
tgt_filter_recovery_request() as allowed during recovery.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I1d63a95202288e3d72b77037658e5ba0eec4103e
Reviewed-on: http://review.whamcloud.com/12199
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years ago LU-5702 ldlm: suppress error message for valid case 89/12189/3
Mikhail Pershin [Mon, 6 Oct 2014 11:01:53 +0000 (15:01 +0400)]
LU-5702 ldlm: suppress error message for valid case

The LVB object may not exist, and this is a valid case.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I673933d582856a212289e228f0ccfb156c88cfb1
Reviewed-on: http://review.whamcloud.com/12189
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years ago LU-5857 tests: check lctl return value in check_catastrophe() 40/12640/4
Jian Yu [Wed, 3 Dec 2014 02:26:10 +0000 (18:26 -0800)]
LU-5857 tests: check lctl return value in check_catastrophe()

This patch fixes check_catastrophe() to check the return value of
lctl command. The catastrophe value would be checked only if the
lctl command passed.

The patch also simplifies the function to check catastrophe value
on all of the test nodes without separating local and remote nodes.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I0ffdafe27b0829dde5a8ea136be76e35b5ea8f43
Reviewed-on: http://review.whamcloud.com/12640
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5586 llite: fix dup flags names 92/12892/2
Bob Glossman [Mon, 1 Dec 2014 19:03:26 +0000 (11:03 -0800)]
LU-5586 llite: fix dup flags names

The name 'xattr' is used for two different ll_flags bits.
Change the names to be distinct and different, reflecting
the names of the bits used in LL_SBI_xbitnamex #defines.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I538cbee8f5382e1a7c74f2dcd598025886225cc3
Reviewed-on: http://review.whamcloud.com/12892
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5951 clio: update timestamps after building rpc 65/12865/4
Niu Yawei [Mon, 1 Dec 2014 06:05:55 +0000 (01:05 -0500)]
LU-5951 clio: update timestamps after building rpc

The mtime/atime/ctime in the write RPC has to be updated after
the RPC is built (where the xid is generated); otherwise, it could
race with setattr and update the wrong timestamps on the OST side.

It seems this regression was introduced when landing the clio code.

Use ofd_write_lock() to protect fmd lookup/update in
ofd_punch_object(); otherwise, it could race with ofd_attr_set()
and ofd_commitrw().

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I16216038ea2bd064ef7f33857a1d4aba167ac5fb
Reviewed-on: http://review.whamcloud.com/12865
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years ago LU-5950 mgc: add nid iteration 29/12829/3
Alexander.Boyko [Mon, 24 Nov 2014 10:55:15 +0000 (13:55 +0300)]
LU-5950 mgc: add nid iteration

mgc_apply_recover_logs() uses only the first NID from an entry, which
can be a problem in a cluster where a single node has several network
addresses.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Change-Id: I6ec348761c2d51edd613cb388e37ef7776990424
Xyratex-bug-id: MRP-2255
Reviewed-on: http://review.whamcloud.com/12829
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years ago LU-5912 libcfs: use vfs api for fsync calls 31/12731/3
Bob Glossman [Fri, 14 Nov 2014 22:26:30 +0000 (14:26 -0800)]
LU-5912 libcfs: use vfs api for fsync calls

Use vfs_fsync_range() instead of direct use of filp->f_op->fsync()
routines.  Doing so will apply correct locking transparently without
needing to decide how to do it ourselves.
What we were doing was a long-term violation of the locking
protocols described in Documentation/filesystems/Locking in the Linux
source, but it was never noticed until new checking code went into the
RHEL 6.6 kernel.  The new check triggered a visible error in syslog.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I551215fc340637364fe04f6e3bae963cf983c953
Reviewed-on: http://review.whamcloud.com/12731
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years ago LU-5894 mds: allow 2.4/2.5 clients create remote dir 15/12715/2
Wang Di [Fri, 14 Nov 2014 07:16:09 +0000 (23:16 -0800)]
LU-5894 mds: allow 2.4/2.5 clients create remote dir

The MDS will only return ENOTSUPP if an old client (2.4/2.5) tries
to create a striped directory with stripe count > 1, so such a client
can still create a remote directory on the new MDS (>= 2.6).

Change-Id: I25c90ae793f91eed032949d26fd5e7fc41801e4f
Signed-off-by: Wang Di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/12715
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years ago LU-5808 llog: check name strictly to avoid invalid record 37/12437/2
Li Xi [Mon, 27 Oct 2014 13:54:25 +0000 (21:54 +0800)]
LU-5808 llog: check name strictly to avoid invalid record

Records for a file system could be written to the llog of another file
system by mistake if the name of the former is a prefix of the name of
the latter. This patch fixes the problem by checking the llog name
more strictly.
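
The prefix problem is easy to reproduce with plain string functions. The
sketch below is an illustration only: the function names are hypothetical,
and it assumes log names have the form "<fsname>-<suffix>", so the strict
variant additionally requires a '-' right after the file system name.

```c
#include <string.h>

/* Loose check: "lustre" also matches logs that belong to "lustre2". */
static int demo_name_matches_loose(const char *logname, const char *fsname)
{
	return strncmp(logname, fsname, strlen(fsname)) == 0;
}

/* Strict check: the file system name must be followed by the separator. */
static int demo_name_matches_strict(const char *logname, const char *fsname)
{
	size_t len = strlen(fsname);

	/*
	 * If the first len bytes match, logname is at least len bytes long,
	 * so reading logname[len] is safe (it may be the terminating NUL).
	 */
	return strncmp(logname, fsname, len) == 0 && logname[len] == '-';
}
```

For example, with fsname "lustre" the loose check accepts
"lustre2-MDT0000" while the strict check rejects it.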

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: If45c59b0226b71e8a95f9aa719eae8412c89a2f1
Reviewed-on: http://review.whamcloud.com/12437
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
9 years ago LU-5635 llog: prevent out-of-bound index 61/12161/3
Frank Zago [Wed, 1 Oct 2014 20:30:50 +0000 (15:30 -0500)]
LU-5635 llog: prevent out-of-bound index

llog_process_thread() can be called from llog_cat_process_cb with an
index already out of bound, leading to the following crash:

LustreError: 3773:0:(llog.c:310:llog_process_thread())
  ASSERTION(index <= last_index + 1 ) failed:
LustreError: 3773:0:(llog.c:310:llog_process_thread()) LBUG

 #0 [ffff8801144bf900] machine_kexec at ffffffff81038f3b
 #1 [ffff8801144bf960] crash_kexec at ffffffff810c5d82
 #2 [ffff8801144bfa30] panic at ffffffff8152798a
 #3 [ffff8801144bfab0] lbug_with_loc at ffffffffa02f8eeb [libcfs]
 #4 [ffff8801144bfad0] llog_process_thread at ffffffffa0413fff [obdclass]
 #5 [ffff8801144bfb80] llog_process_or_fork at ffffffffa041585f [obdclass]
 #6 [ffff8801144bfbd0] llog_cat_process_cb at ffffffffa0418612 [obdclass]
 #7 [ffff8801144bfc30] llog_process_thread at ffffffffa0413c22 [obdclass]
 #8 [ffff8801144bfce0] llog_process_or_fork at ffffffffa041585f [obdclass]
 #9 [ffff8801144bfd30] llog_cat_process_or_fork at ffffffffa0416b9d [obdclass]
    RIP: 00007f6de5e4f730  RSP: 00007fff9aa26d98  RFLAGS: 00000206
    RAX: 0000000000000000  RBX: ffffffff8100b072  RCX: 00007f6de5e4f730
    RDX: 0000000000008000  RSI: 00000000019c7000  RDI: 0000000000000003
    RBP: 00000000019c7000   R8: 00007f6de6103ee8   R9: 0000000000000001
    R10: 00007fff9aa26b20  R11: 0000000000000246  R12: ffffffffffff8000
    R13: 0000000000000003  R14: 0000000000008000  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

If index is too big, simply return success.

Change-Id: I81bbedbbe2bcef478c370ef40fc069447d39efbd
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12161
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5687 dt: propagate errors from failed declarations 30/12130/4
John L. Hammond [Tue, 30 Sep 2014 16:09:30 +0000 (11:09 -0500)]
LU-5687 dt: propagate errors from failed declarations

Check for and return errors from dt_declare_*() in several locations.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Id18b12d6c713e78e2f1cc782ff659d2c84cc60bb
Reviewed-on: http://review.whamcloud.com/12130
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-2675 lnet: remove ulnds 17/12117/3
John L. Hammond [Mon, 29 Sep 2014 18:33:24 +0000 (13:33 -0500)]
LU-2675 lnet: remove ulnds

Remove the unused userspace LND code (all of lnet/ulnds/) and
supporting autocrud.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I104d8b22afdde5027a2a0ef1a9ecc0423b67fae5
Reviewed-on: http://review.whamcloud.com/12117
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-2675 lmv: remove lmv_init_{lock,unlock}() 15/12115/3
John L. Hammond [Mon, 29 Sep 2014 18:12:52 +0000 (13:12 -0500)]
LU-2675 lmv: remove lmv_init_{lock,unlock}()

In struct lmv_obd rename the init_mutex member to
lmv_init_mutex. Remove the compat macros lmv_init_{lock,unlock}() and
use mutex_{lock,unlock}(&lmv->lmv_init_mutex) instead.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iae1f5d6b7fd1f96ba430d5e7af97c51ce3e042a8
Reviewed-on: http://review.whamcloud.com/12115
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-2675 md: remove unused code from md_object.h 13/12113/3
John L. Hammond [Mon, 29 Sep 2014 17:55:46 +0000 (12:55 -0500)]
LU-2675 md: remove unused code from md_object.h

Remove several unused functions, structures, and members from
md_object.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I33de0ba987bfde95172e9bfb77929b6b4dcd0aa8
Reviewed-on: http://review.whamcloud.com/12113
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5622 tests: check/wait for copytool death 22/11922/5
Bruno Faccini [Mon, 15 Sep 2014 15:37:31 +0000 (17:37 +0200)]
LU-5622 tests: check/wait for copytool death

It seems that copytool death/kill may take more time, so this
condition must be handled in the sanity-hsm copytool_cleanup()
function to avoid situations where the copytool is then not
restarted, but only signaled, in the next copytool_setup().

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ia72ed07f0219cf0aa2ef5b3805fb1f7faf4dab66
Reviewed-on: http://review.whamcloud.com/11922
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Tested-by: Jenkins
Reviewed-by: Robert Read <robert.read@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-3456 ptlrpc: quiet errors on initial connection 57/10057/4
Andreas Dilger [Tue, 22 Apr 2014 19:54:46 +0000 (13:54 -0600)]
LU-3456 ptlrpc: quiet errors on initial connection

It may be that a client or MDS is trying to connect to a target (OST
or peer MDT) before that target has finished setup.  Rather than
spamming the console logs during initial connection, only print a
console error message if there are repeated failures trying to
connect to the target, which may indicate an error on that node.
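
One simple way to express "only complain after repeated failures" is a
per-connection failure counter. This is a sketch under assumptions, not the
ptlrpc code: demo_conn, demo_should_log() and the threshold
DEMO_QUIET_ATTEMPTS are hypothetical.

```c
#define DEMO_QUIET_ATTEMPTS 3	/* hypothetical number of silent failures */

struct demo_conn {
	unsigned int failures;	/* consecutive failed connection attempts */
};

/*
 * Record the result of a connection attempt and report whether a console
 * message is warranted. A success resets the counter.
 */
static int demo_should_log(struct demo_conn *conn, int rc)
{
	if (rc == 0) {
		conn->failures = 0;
		return 0;
	}

	conn->failures++;
	return conn->failures > DEMO_QUIET_ATTEMPTS;
}
```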

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I98ec7b4c2109b700b53297038d3fede4773ebbe5
Reviewed-on: http://review.whamcloud.com/10057
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4820 osd: drop memcpy in zfs osd 60/9760/10
Alex Zhuravlev [Mon, 24 Mar 2014 15:30:19 +0000 (19:30 +0400)]
LU-4820 osd: drop memcpy in zfs osd

dmu_read() was called from osd_read_prep(), copying from
ARC bufs into the same ARC bufs. This seems to be a remnant
of the pre-zerocopy era.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I0f3657c360d8541d7c3c6e8e32eac78bc5702b42
Reviewed-on: http://review.whamcloud.com/9760
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5878 lfs: migrate file to its proper destination 01/12601/6
Frank Zago [Thu, 6 Nov 2014 17:08:30 +0000 (11:08 -0600)]
LU-5878 lfs: migrate file to its proper destination

llapi_file_open_param() is supposed to be returning the opened file
descriptor. However, when llapi_search_ost() is called, it returns 1,
which sets rc to 1, which in turn is confused for an error later, and
returned to the caller. So when the copy happens, the destination file
descriptor is 1 (stdout).
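
The class of bug described above, a leftover positive return code later
mistaken for a result, can be shown with a small sketch. None of these
functions are the liblustreapi ones: demo_search(), demo_open_buggy() and
demo_open_fixed() are hypothetical and only mimic the control flow.

```c
/* Hypothetical helper: 1 when found, 0 when not found, < 0 on error. */
static int demo_search(int found)
{
	return found ? 1 : 0;
}

/* Buggy shape: the positive "found" value leaks into the final result. */
static int demo_open_buggy(int fd)
{
	int rc = demo_search(1);	/* rc is now 1 */

	if (rc < 0)
		return rc;

	if (rc != 0)		/* stale rc wrongly treated as a failure */
		return rc;	/* the caller receives 1, i.e. stdout */

	return fd;
}

/* Fixed shape: reset rc once the helper's answer has been consumed. */
static int demo_open_fixed(int fd)
{
	int rc = demo_search(1);

	if (rc < 0)
		return rc;
	rc = 0;

	if (rc != 0)
		return rc;

	return fd;
}
```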

Fixed a typo in the function description, and formatted the parameter
descriptions.

Fixed a bad indentation.

There's no need to test lum before freeing it since at that point it is
not NULL (and free will test it anyway).

Change-Id: I16fe26480b880aa818b1bb706b22bfdd6833d69c
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12601
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years ago LU-5861 lnet: invoke lnetctl properly from startup script 61/12561/2
Amir Shehata [Tue, 4 Nov 2014 21:14:39 +0000 (13:14 -0800)]
LU-5861 lnet: invoke lnetctl properly from startup script

Use the correct lnetctl command syntax to load default config:
lnetctl import < lnet.conf

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I54dd0d34f75b91c1c6ceb9745d817cb43f82ef25
Reviewed-on: http://review.whamcloud.com/12561
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4119 ldlm: abort recovery by time_hard 78/9078/11
Sergey Cheremencev [Thu, 20 Nov 2014 16:58:43 +0000 (11:58 -0500)]
LU-4119 ldlm: abort recovery by time_hard

Set obd_abort_recovery to 1 when recovery time
reaches obd_recovery_time_hard.

Xyratex-bug-id: MRP-1365

Change-Id: Ida8f71cb63d5db9bf85bcdf2c152b4d9f71b8bca
Signed-off-by: Sergey Cheremencev <Sergey_Cheremencev@xyratex.com>
Reviewed-on: http://review.whamcloud.com/9078
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5893 kernel: kernel update [RHEL7 3.10.0-123.9.3.el7] 57/12657/4
Bob Glossman [Mon, 10 Nov 2014 19:20:18 +0000 (11:20 -0800)]
LU-5893 kernel: kernel update [RHEL7 3.10.0-123.9.3.el7]

Update RHEL7 kernel to 3.10.0-123.9.3.el7

Test-Parameters: clientdistro=el7
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ife164ff8bea44369bc33cae07cfbb59d5845e406
Reviewed-on: http://review.whamcloud.com/12657
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-1453 scrub: auto trigger OI scrub more flexible 38/12738/10
Fan Yong [Sat, 13 Sep 2014 20:22:41 +0000 (04:22 +0800)]
LU-1453 scrub: auto trigger OI scrub more flexible

Generally, scanning the whole device for a routine OI scrub check may
take a long time. If the whole system contains only a few bad
OI mappings, then it is not worth triggering a full-speed OI scrub
automatically when a bad OI mapping is auto-detected. Instead,
we can make the OI scrub fix only the bad OI mappings that were found,
and if enough bad OI mappings are found to exceed a given
threshold that can be adjusted via a proc interface, then the OI
scrub will run at full speed to scan the whole device.

Currently, we offer two kinds of thresholds for triggering OI scrub
to scan the whole device:

1) "the total OI mappings count" vs "the bad OI mappings count".
   If this ratio is lower than the given threshold, which can be set
   via the proc interface "full_scrub_ratio", then trigger urgent
   mode OI scrub.

2) "the speed at which bad OI mappings are found". If the speed exceeds
   the given threshold, which can be adjusted via the proc interface
   "full_scrub_speed", then trigger urgent mode OI scrub.
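
The two thresholds above can be sketched as a single predicate. This is an
illustration under assumptions, not the osd code: demo_need_full_scrub() is
a hypothetical name, the speed is taken as bad mappings per second, and a
threshold of zero is treated as "disabled" purely for the example.

```c
/*
 * Decide whether to switch to a full-speed scan.
 *
 * total, bad:     OI mappings seen so far and how many of them were bad
 * seconds:        time over which the bad mappings were found
 * ratio_thresh:   plays the role of full_scrub_ratio (0 = disabled here)
 * speed_thresh:   plays the role of full_scrub_speed (0 = disabled here)
 */
static int demo_need_full_scrub(unsigned long long total,
				unsigned long long bad,
				unsigned long long seconds,
				unsigned long long ratio_thresh,
				unsigned long long speed_thresh)
{
	if (bad == 0)
		return 0;

	/* 1) total:bad ratio lower than the threshold */
	if (ratio_thresh != 0 && total / bad < ratio_thresh)
		return 1;

	/* 2) bad mappings found per second above the threshold */
	if (speed_thresh != 0 && seconds != 0 && bad / seconds > speed_thresh)
		return 1;

	return 0;
}
```

For instance, 1000 bad mappings out of 1,000,000 give a ratio of 1000,
which is below a ratio threshold of 10000 and so triggers the full scan.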

Test-Parameters: mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
ostfilesystemtype=ldiskfs envdefinitions=ONLY=4 testlist=sanity-scrub
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibc4592fef1da11994ec30eb348d20576be5ae54b
Reviewed-on: http://review.whamcloud.com/12738
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
9 years ago LU-1452 scrub: OI scrub skips uninitialized groups 37/12737/5
Fan Yong [Thu, 11 Sep 2014 23:55:43 +0000 (07:55 +0800)]
LU-1452 scrub: OI scrub skips uninitialized groups

If the ldiskfs group descriptor is marked as LDISKFS_BG_INODE_UNINIT,
it means that the inodes in that group have never been initialized,
so the otable-based iterator can skip this group directly to speed up
the scanning.

If the iteration position reaches the unused inodes area in the
group descriptor (indicated by bg_itable_unused), then skip the
remaining inodes in this group to reduce the scanning time.
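
Both skips can be sketched by counting how many inodes an iterator still
has to visit. The structure and names below are hypothetical
simplifications: demo_group keeps only the two fields discussed above, and
DEMO_BG_INODE_UNINIT stands in for LDISKFS_BG_INODE_UNINIT.

```c
#define DEMO_BG_INODE_UNINIT 0x1u  /* stand-in for LDISKFS_BG_INODE_UNINIT */

struct demo_group {
	unsigned int flags;
	unsigned int itable_unused;  /* unused inodes at the end of the table */
};

/* Return how many inodes the iterator must visit across all groups. */
static unsigned long demo_inodes_to_scan(const struct demo_group *groups,
					 unsigned int ngroups,
					 unsigned int inodes_per_group)
{
	unsigned long total = 0;
	unsigned int i;

	for (i = 0; i < ngroups; i++) {
		/* Never-initialized group: skip it entirely. */
		if (groups[i].flags & DEMO_BG_INODE_UNINIT)
			continue;

		/* Defensive: an inconsistent count means nothing to scan. */
		if (groups[i].itable_unused >= inodes_per_group)
			continue;

		/* Stop at the start of the unused tail of the inode table. */
		total += inodes_per_group - groups[i].itable_unused;
	}

	return total;
}
```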

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie8a2eb1269d288865ce51d40e211e3db54d062af
Reviewed-on: http://review.whamcloud.com/12737
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
9 years ago LU-5867 lfsck: Enable --create_mdtobj flag 78/12578/5
James Nunez [Tue, 9 Sep 2014 18:53:42 +0000 (02:53 +0800)]
LU-5867 lfsck: Enable --create_mdtobj flag

Using the --create_mdtobj flag in 'lctl lfsck_start'
creates an error. "create_mdtobj" is added to the
option struct so it will be recognized as a valid option.

When displaying the results of LFSCK, "create_mdtobj" is
not listed as a parameter. "create_mdtobj" is added to
the lfsck_param_names array so it will be printed when
used.

Also, added LSV_CREATE_MDTOBJ to the lfsck_request
valid options/flags.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I1923bb9a71958b390b9abea248b328ac59c3caad
Reviewed-on: http://review.whamcloud.com/12578
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5963 nodemap: use proper hashing 81/12881/2
Alexey Lyashkov [Sat, 29 Nov 2014 08:55:22 +0000 (11:55 +0300)]
LU-5963 nodemap: use proper hashing

Don't hash an export pointer as a string.
Check for the case where an export is not deleted from the nodemap
hash.
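
To illustrate the first point: a string hash walks bytes until it hits
a NUL, which is meaningless for a pointer key, so the pointer value
itself should be hashed. The helper below is a generic sketch with a
hypothetical name, not the hash function the patch actually uses.

```c
#include <stdint.h>

/* Hash the pointer value itself, not the bytes it points at.
 * Uses Fibonacci hashing: multiply by 2^64 / golden ratio and keep
 * the high bits, then fold into the table with a power-of-two mask. */
static unsigned int hash_ptr_key(const void *key, unsigned int mask)
{
	uint64_t v = (uint64_t)(uintptr_t)key;

	v *= 0x9E3779B97F4A7C15ULL;
	return (unsigned int)(v >> 32) & mask;
}
```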

Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Change-Id: Id53281078f165ce984abebc74992bde30fcc9f31
Reviewed-on: http://review.whamcloud.com/12881
Tested-by: Jenkins
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Kit Westneat <kit.westneat@gmail.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5727 ldlm: revert the changes for lock canceling policy 33/12733/2
Jinshan Xiong [Sat, 15 Nov 2014 01:07:37 +0000 (17:07 -0800)]
LU-5727 ldlm: revert the changes for lock canceling policy

The changes to the LRU lock policy were introduced by commit bfae5a4e,
where I was trying to revise the policy to pick locks for canceling.

However, this caused two problems as mentioned in LU-5727. The first
problem is that a lock can be picked for canceling only if the
number of LRU locks is over the preset LRU number AND it's aged; the
second problem is that mdc_cancel_weight() tends to not cancel OPEN
locks, therefore open locks can be kept forever and eventually exhaust
memory on the MDT side.

The first problem is fixed by patch e8812867. This patch reverts
the rest of the changes related to the LRU policy revision.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ie1dbcd15dc6e739d01ddcae01d7e637688a1d4b2
Reviewed-on: http://review.whamcloud.com/12733
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5507 recovery: don't replay closed open 67/12667/4
Niu Yawei [Tue, 11 Nov 2014 05:54:34 +0000 (00:54 -0500)]
LU-5507 recovery: don't replay closed open

To avoid scanning the replay open list every time in
ptlrpc_free_committed(), the fix for LU-2613 (4322e0f9) changed
ptlrpc_free_committed() to skip the open list unless the
import generation has changed. That introduced a race which could
cause a closed open to be replayed:

1. The application calls ll_close_inode_openhandle() -> mdc_close()
   to close the file; rq_replay is cleared, but the open request is
   still on the imp_committed_list;

2. Before md_clear_open_replay_data() is called for the close,
   the client starts replay, and that closed open is replayed
   mistakenly;

3. The open replay interpret callback (mdc_replay_open) could race
   with mdc_clear_open_replay_data() at the end.

This patch fixes ptlrpc_free_committed() to make sure the
open list is scanned on recovery, to prevent the closed open request
from being replayed.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ia67fe5d8d501a69bafbbd7e44bd612abb9c254c6
Reviewed-on: http://review.whamcloud.com/12667
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-2833 tests: Unexempt sanity/48a for zfs 07/12607/3
Nathaniel Clark [Thu, 6 Nov 2014 20:51:37 +0000 (15:51 -0500)]
LU-2833 tests: Unexempt sanity/48a for zfs

With LU-2449 being landed this test no longer fails on ZFS.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ie82f25ac0152dee7972a8a210d8669b59798e9a7
Reviewed-on: http://review.whamcloud.com/12607
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoRevert "LU-3573 osd-zfs: Only advance zap cursor as needed" 87/12887/4
Andreas Dilger [Mon, 1 Dec 2014 09:07:00 +0000 (09:07 +0000)]
Revert "LU-3573 osd-zfs: Only advance zap cursor as needed"

This reverts commit 1da9b84b39ab36be9ba67a72ae175dde6521769b.

This patch introduced a far more serious regression in conf-sanity
test_32b LU-5924 and should be reverted until the problem is fixed.

Change-Id: I28f04a33d1c1bb4688d2ba9af6015a2737fb1d93
Reviewed-on: http://review.whamcloud.com/12887
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5079 tests: fix service_time in max_recovery_time() 24/12724/9
Jian Yu [Mon, 24 Nov 2014 22:32:55 +0000 (14:32 -0800)]
LU-5079 tests: fix service_time in max_recovery_time()

This patch fixes the calculation of service_time in
max_recovery_time() to use the new method in
check_and_start_recovery_timer() and new values of
CONNECTION_SWITCH_MAX and CONNECTION_SWITCH_INC.

The patch also fixes replay-dual sub-tests:
- to call wait_clients_import_state() instead of sleeping for
  an uncertain time in test_11()
- to add some margin into the recovery time comparison
  in test_20()

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,REPLAY_DUAL_EXCEPT=21 \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs \
ostfilesystemtype=ldiskfs mdtcount=1 \
testlist=replay-dual,replay-dual

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ife0fab28ed7b67ac61022f7e8a38957e3995b167
Reviewed-on: http://review.whamcloud.com/12724
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5650 mgc: check the import stat for lprocfs 27/12327/2
Hongchao Zhang [Tue, 9 Sep 2014 12:18:17 +0000 (20:18 +0800)]
LU-5650 mgc: check the import stat for lprocfs

In lprocfs_mgc_rd_ir_state(), the validity of the import state
should be checked before doing further work.

Change-Id: Ic582150a1cdbef331a929ce378d6e4f987a169fd
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/12327
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5888 utils: limit max_sectors_kb tunable setting 23/12723/2
Andreas Dilger [Fri, 14 Nov 2014 19:48:39 +0000 (12:48 -0700)]
LU-5888 utils: limit max_sectors_kb tunable setting

Limit the value set by mount.lustre set_blockdev_tunables() to a
reasonable 32MB instead of the maximum possible amount, since the
parsing of max_hw_sectors_kb might be bad, or it just returns a
value much larger than we need.
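
The cap amounts to a simple clamp. The sketch below is not the
mount.lustre code; the macro and function names are made up for the
example, and only the 32MB limit comes from the description above.

```c
/* 32MB expressed in KB, the cap named in the commit message. */
#define MAX_SECTORS_KB_CAP (32UL * 1024)

/* Pick the value to write to max_sectors_kb: the hardware maximum,
 * but never more than the cap, in case the parsed value is bogus or
 * simply far larger than needed. */
static unsigned long pick_max_sectors_kb(unsigned long max_hw_sectors_kb)
{
	if (max_hw_sectors_kb > MAX_SECTORS_KB_CAP)
		return MAX_SECTORS_KB_CAP;
	return max_hw_sectors_kb;
}
```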

Also quiet the printing of the max_sectors_kb tunable that was added
in commit 9813961151e (http://review.whamcloud.com/9865) so that it
only prints something when the value is actually changed, instead of
printing it for every tunable even if the value is the same.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I648c2d8484ae5cef59ab62421cd01bc0ed02fcd6
Reviewed-on: http://review.whamcloud.com/12723
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Blake Caldwell <blakec@ornl.gov>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5862 changelog: Proper record remapping 74/12574/4
Henri Doreau [Wed, 5 Nov 2014 14:01:52 +0000 (15:01 +0100)]
LU-5862 changelog: Proper record remapping

Fixed changelog_remap_rec() to correctly remap records emitted
with jobid_var=disabled, i.e. delivered by new servers but with
no jobid field.

Updated sanity test 205 accordingly.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: Ia151e9bfde2def8819913ee658bde6b71ef3ab18
Reviewed-on: http://review.whamcloud.com/12574
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Robert Read <robert.read@intel.com>
9 years agoLU-5848 debug: more debug log for dt_sync 73/12573/3
Fan Yong [Sat, 6 Sep 2014 04:39:46 +0000 (12:39 +0800)]
LU-5848 debug: more debug log for dt_sync

Add some D_CACHE logs at the entry/exit for osp_sync()/osd_sync().

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaa7fbfbbadb9312528b5092d64615b277de6b679
Reviewed-on: http://review.whamcloud.com/12573
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5641 tests: ensure user daemon is in group bin on mds 62/12762/2
Bob Glossman [Tue, 18 Nov 2014 01:33:57 +0000 (17:33 -0800)]
LU-5641 tests: ensure user daemon is in group bin on mds

The previous fix for this problem only fixed groups on client.
That worked as long as we were only testing with el7 client,
but was an incomplete solution for el7 client/servers.
Need to apply the same fix to mds too to keep things consistent.

Signed-off-by: Bob Gossman <bob.glossman@intel.com>
Change-Id: I411970c591a72b0393ed892f15da1f5d6340df8c
Reviewed-on: http://review.whamcloud.com/12762
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5892 lfsck: remove improper LASSERT in lfsck_needs_scan_dir 70/12670/2
Fan Yong [Sat, 6 Sep 2014 20:13:49 +0000 (04:13 +0800)]
LU-5892 lfsck: remove improper LASSERT in lfsck_needs_scan_dir

Inside lfsck_needs_scan_dir(), when the internal variable @fid
becomes the input @obj's parent FID, the internal variable @depth
may still be zero, so the original "LASSERT(depth > 0);" is improper
in that case. Remove it.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I64f10be682c51c6ac5cc1af3497eb569281fcd21
Reviewed-on: http://review.whamcloud.com/12670
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5832 utils: Fix buffer overflow in bound string copy 16/12516/8
Dmitry Eremin [Fri, 31 Oct 2014 10:45:26 +0000 (13:45 +0300)]
LU-5832 utils: Fix buffer overflow in bound string copy

The function 'strncpy' may incorrectly check buffer boundaries
and may overflow buffer 'info->name' of fixed size (256). Also
there is one similar error on line 1135.
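
For reference, one common way to make a bounded copy safe is to always
reserve room for the terminating NUL and report truncation. This is a
generic sketch with a hypothetical helper name; it does not claim to be
the change made in this patch.

```c
#include <string.h>

/* Copy src into dst (capacity dst_size bytes), always NUL-terminating.
 * Returns 0 on success, -1 if src did not fit and was truncated. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
	size_t len;

	if (dst_size == 0)
		return -1;

	len = strlen(src);
	if (len >= dst_size) {
		/* Keep as much as fits and terminate explicitly. */
		memcpy(dst, src, dst_size - 1);
		dst[dst_size - 1] = '\0';
		return -1;
	}

	/* Fits, including the terminating NUL. */
	memcpy(dst, src, len + 1);
	return 0;
}
```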

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I512ab6678fbf1d02bac2eb290fd13c22fca9dc2b
Reviewed-on: http://review.whamcloud.com/12516
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5568 lnet: fix kernel crash when network failed to start 12/12512/5
Amir Shehata [Fri, 31 Oct 2014 00:50:15 +0000 (17:50 -0700)]
LU-5568 lnet: fix kernel crash when network failed to start

When loading Lustre modules without proper network configuration,
it always hits the following kernel panic:
LNetError: 105-4: Error -100 starting up LNI tcp
LNetError: 2145:0:(api-ni.c:823:lnet_unprepare())
 ASSERTION( list_empty(&the_lnet.ln_nis) ) failed:
LNetError: 2145:0:(api-ni.c:823:lnet_unprepare()) LBUG
Pid: 2145, comm: modprobe
Call Trace:
[<ffffffffa044f853>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[<ffffffffa044fdf5>] lbug_with_loc+0x45/0xc0 [libcfs]
[<ffffffffa04f3267>] lnet_unprepare+0x297/0x340 [lnet]
[<ffffffffa04f3b5c>] LNetNIInit+0x25c/0x3e0 [lnet]
[<ffffffff81061bc6>] ? put_online_cpus+0x56/0x80
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa081310c>] ptlrpc_ni_init+0x2c/0x1a0 [ptlrpc]
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa0813291>] ptlrpc_init_portals+0x11/0xf0 [ptlrpc]
[<ffffffffa0983000>] ? init_module+0x0/0x1000 [ptlrpc]
[<ffffffffa09831c4>] init_module+0x1c4/0x1000 [ptlrpc]
[<ffffffff810020e2>] do_one_initcall+0xe2/0x190
[<ffffffff810ca7fb>] load_module+0x129b/0x1a90
[<ffffffff812da590>] ? ddebug_dyndbg_module_param_cb+0x0/0x60
[<ffffffff810c7133>] ? copy_module_from_fd.isra.43+0x53/0x150
[<ffffffff810cb1a6>] SyS_finit_module+0xa6/0xd0
[<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
...
This is because in lnet_startup_lndnis(), we may add list items to
@the_lnet.ln_nis and @the_lnet.ln_nis_cpt before it fails. But in
the lnet_startup_lndnis() failure path, it did not clean up the lists,
thus causing the assertion in lnet_unprepare().

Fix the assertion by cleaning up using lnet_shutdown_lndnis()
if the startup fails.
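
The shape of the fix, reduced to a toy model: the startup routine may
leave partial state behind, so the caller undoes it before unwinding.
All names below are hypothetical stand-ins for the LNet functions, and
the error value -100 simply mirrors the log above.

```c
/* Hypothetical miniature of "start several interfaces, undo on failure". */
struct net_state {
	int started;	/* number of interfaces currently on the list */
};

/* Start interfaces one by one; on failure, earlier ones stay listed. */
static int startup_all(struct net_state *ns, const int *ok, int n)
{
	for (int i = 0; i < n; i++) {
		if (!ok[i])
			return -100;
		ns->started++;
	}
	return 0;
}

/* Remove every interface from the list. */
static void shutdown_all(struct net_state *ns)
{
	ns->started = 0;
}

/* Caller: clean up partial state if the startup fails, so that a later
 * "list must be empty" assertion holds. */
static int net_init(struct net_state *ns, const int *ok, int n)
{
	int rc = startup_all(ns, ok, n);

	if (rc != 0)
		shutdown_all(ns);
	return rc;
}
```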

In a future enhancement the ni startup API will be modified to
clean up after itself in case of failure.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ia344fd7c0f24c87b654554dda9e57bf5525edc85
Reviewed-on: http://review.whamcloud.com/12512
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5731 osp: flush async updates for osp_sync 59/12359/2
Fan Yong [Thu, 21 Aug 2014 04:19:25 +0000 (12:19 +0800)]
LU-5731 osp: flush async updates for osp_sync

Current osp_sync() only considers the async requests that are
handled by the osp_sync_thread, but ignores the async updates
that are handled directly by the background ptlrpcd threads.
Usually, such async updates are for LFSCK remote repairing.
This patch will flush all of them when dt_sync() is called.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0e6d54120acbd8ab82cf776222277ae3b805812d
Reviewed-on: http://review.whamcloud.com/12359
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-4839 tests: Give copytool more time to start 82/12682/6
Nathaniel Clark [Wed, 12 Nov 2014 01:56:28 +0000 (20:56 -0500)]
LU-4839 tests: Give copytool more time to start

Copytool can take some time to start, and if the HSM archive directory
is on a busy NFS server, it can take a bit of time for the initial
opens to occur.  This allows those actions more time to complete which
should give this test a better chance of passing correctly.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs \
testlist=sanity-hsm,sanity-hsm,sanity-hsm,sanity-hsm

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes,ONLY=60 \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
mdtcount=4 testlist=sanity-hsm,sanity-hsm,sanity-hsm,sanity-hsm

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I28bc57b92c34b4eee07ba34a2d976f2c39dc70dc
Reviewed-on: http://review.whamcloud.com/12682
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Michael MacDonald <michael.macdonald@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5707 lfsck: store namespace LFSCK statistics info in new EA 21/12321/5
Fan Yong [Tue, 9 Sep 2014 03:23:04 +0000 (11:23 +0800)]
LU-5707 lfsck: store namespace LFSCK statistics info in new EA

In Lustre-2.6 and older releases, the namespace LFSCK statistics info
was stored as the XATTR_NAME_LFSCK_NAMESPACE EA, but in Lustre-2.7 the
namespace LFSCK will introduce more statistics information that will
cause the XATTR_NAME_LFSCK_NAMESPACE EA to be extended. If it still
used the old XATTR_NAME_LFSCK_NAMESPACE EA, then after a downgrade the
old LFSCK would get -ERANGE when loading the new trace file from disk,
and LFSCK could not be started.

To avoid such trouble, Lustre-2.7 will use a new EA to store the
namespace LFSCK statistics info: XATTR_NAME_LFSCK_NAMESPACE_V2,
and keep a dummy XATTR_NAME_LFSCK_NAMESPACE EA in the trace file
to be compatible with the old LFSCK.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I55b5adb962434013b00e3938a67b671010ecc206
Reviewed-on: http://review.whamcloud.com/12321
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
9 years agoLU-5740 build: add RHEL6.6 [2.6.32-504.el6] to build selections 09/12609/4
Bob Glossman [Tue, 28 Oct 2014 17:25:04 +0000 (10:25 -0700)]
LU-5740 build: add RHEL6.6 [2.6.32-504.el6] to build selections

Add support for building with RHEL6.6 kernel version 2.6.32-504.el6
while retaining the ability to build with older RHEL 6.5 kernels.
New ldiskfs patch series for el6.6 is included.

Test-Parameters: clientdistro=el6.6 mdsdistro=el6.6\
  ossdistro=el6.6 mdsfilesystemtype=ldiskfs\
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I780feefbbc179607762c0d2997fd608830f3db8b
Reviewed-on: http://review.whamcloud.com/12609
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5941 build: build dkms build at installed source tree 02/12802/2
Minh Diep [Thu, 20 Nov 2014 16:10:53 +0000 (08:10 -0800)]
LU-5941 build: build dkms build at installed source tree

Port from:
https://github.com/zfsonlinux/zfs/commit/46bf86a9635266dd399443f5bf5c5f8d0f280aa2

Signed-off-by: Minh Diep <minh.diep@intel.com>
Change-Id: If0c8543d955594b4f9dc305c35271a9cc94e1bbd
Reviewed-on: http://review.whamcloud.com/12802
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5941 dkms: make lustre-dkms require 2.2.0.3-28.git.7c3e7c5 01/12801/2
Minh Diep [Thu, 20 Nov 2014 16:05:25 +0000 (08:05 -0800)]
LU-5941 dkms: make lustre-dkms require 2.2.0.3-28.git.7c3e7c5

Due to a bug in dkms, we need to enforce the use of
dkms-2.2.0.3-28.git.7c3e7c5 version.

Signed-off-by: Minh Diep <minh.diep@intel.com>
Change-Id: I9ad8ccaa5106b221f41a50c520d8bdfef160c065
Reviewed-on: http://review.whamcloud.com/12801
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-2524 test: Code clean up for conf-sanity 30/10530/7
James Nunez [Fri, 30 May 2014 19:20:21 +0000 (13:20 -0600)]
LU-2524 test: Code clean up for conf-sanity

The patch modifying the tdir variable to a single directory
has landed; http://review.whamcloud.com/#/c/8123/. We can
now conduct miscellaneous cleanup including:

Remove the `-p` (parents) option from many calls to mkdir
Replace `lfs setstripe` with $SETSTRIPE
Replace `lfs getstripe` with $GETSTRIPE
Replace `lctl` with $LCTL
Added check for and call `error` and/or added error messages
for a variety of common functions.
Replace `…` with $(...)
Remove linefeed escape after |, ||, & and && operators.
Modify directory and file names to use $tdir and $tfile
Remove 'mkdir -p $MOUNT' when 'mount_client $MOUNT' is
called right before or after mkdir

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes testlist=conf-sanity

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I94bd51ce2d2f225736e12c4f9ac1a86a3d8a23d8
Reviewed-on: http://review.whamcloud.com/10530
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years agoLU-5814 llite: remove ll_objects_destroy() 18/12618/2
John L. Hammond [Fri, 7 Nov 2014 15:00:09 +0000 (09:00 -0600)]
LU-5814 llite: remove ll_objects_destroy()

Remove ll_objects_destroy(). This function is not needed for
interoperability with servers of version 2.4 or higher (after lustre
commit 5165cdd4).

Remove the then unused function lov_destroy() and its supporting
functions. Remove the lsm_destroy method of struct lsm_operations.

Remove the unused struct lov_stripe_md, MD export, and capa parameters
from obd_destroy() and its implementations.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: If8634b3d88a660d00891219c348622ec45361316
Reviewed-on: http://review.whamcloud.com/12618
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5418 echo: replace lov_stripe_md with lov_oinfo 47/12447/3
John L. Hammond [Wed, 29 Oct 2014 17:15:06 +0000 (12:15 -0500)]
LU-5418 echo: replace lov_stripe_md with lov_oinfo

In echo_client replace uses of struct lov_stripe_md with struct
lov_oinfo (since the instances of the former really only contained a
single instance of the latter). Remove the then unnecessary functions
echo_alloc_memmd(), echo_free_memmd(), osc_unpackmd(), and
obd_alloc_memmd(). Remove the struct lov_stripe_md * parameter from
obd_create(). Flatten osc_create() and osc_real_create() into a single
function.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I5fe276bcc56e1fa8138a4d3f20b9d5297cf74f3f
Reviewed-on: http://review.whamcloud.com/12447
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-3962 iokit: fix whitespace in scripts 56/10456/6
Andreas Dilger [Tue, 27 May 2014 17:53:01 +0000 (11:53 -0600)]
LU-3962 iokit: fix whitespace in scripts

Fix the whitespace in mds-survey and obdfilter-survey to use tabs
instead of 4-space indentation.  Fix coding style in several places.

Remove the use of a python script just to get the page size.  Instead,
use "getconf PAGE_SIZE" to do this.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
testlist=mds-survey,obdfilter-survey

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I921007043c360b45d45fc03a8237edea9a3ebbe5
Reviewed-on: http://review.whamcloud.com/10456
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5537 ptlrpc: Fix an rq_no_reply assertion failure 40/11740/3
Li Wei [Wed, 3 Sep 2014 09:02:22 +0000 (17:02 +0800)]
LU-5537 ptlrpc: Fix an rq_no_reply assertion failure

An OSS had an assertion failure:

  LustreError: 5366:0:(ldlm_lib.c:2689:target_bulk_io()) @@@ timeout
  on bulk GET after 0+0s  req@ffff88083a61b400
  x1476486691018500/t0(4300509964)
  o4->8dda3382-83f8-6445-5eea-828fd59e4a06@192.168.1.116@o2ib1:0/0
  lens 504/448 e 391470 to 0 dl 1408494729 ref 2 fl Complete:/4/0 rc
  0/0
  LustreError: 5432:0:(niobuf.c:550:ptlrpc_send_reply()) ASSERTION(
  req->rq_no_reply == 0 ) failed:
  Lustre: soaked-OST0000: Bulk IO write error with
  8dda3382-83f8-6445-5eea-828fd59e4a06 (at 192.168.1.116@o2ib1),
  client will retry: rc -110
  LustreError: 5432:0:(niobuf.c:550:ptlrpc_send_reply()) LBUG
  Pid: 5432, comm: ll_ost_io03_003

  Call Trace:
  [<ffffffffa0641895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
  [<ffffffffa0641e97>] lbug_with_loc+0x47/0xb0 [libcfs]
  [<ffffffffa09cda4c>] ptlrpc_send_reply+0x4ec/0x7f0 [ptlrpc]
  [<ffffffffa09d4aae>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
  [<ffffffffa09e4d75>] ptlrpc_at_check_timed+0xcd5/0x1370 [ptlrpc]
  [<ffffffffa09dc1e9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
  [<ffffffffa09e66f8>] ptlrpc_main+0x12e8/0x1990 [ptlrpc]
  [<ffffffff81069290>] ? pick_next_task_fair+0xd0/0x130
  [<ffffffff81529246>] ? schedule+0x176/0x3b0
  [<ffffffffa09e5410>] ? ptlrpc_main+0x0/0x1990 [ptlrpc]
  [<ffffffff8109abf6>] kthread+0x96/0xa0
  [<ffffffff8100c20a>] child_rip+0xa/0x20
  [<ffffffff8109ab60>] ? kthread+0x0/0xa0
  [<ffffffff8100c200>] ? child_rip+0x0/0x20

The thread in tgt_brw_write() had decided not to reply by setting
rq_no_reply, right before another thread tried to send an early reply
for the request.

Change-Id: I9096a098621a38610c0d0d2dff016c012fc4b7f2
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/11740
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years agoLU-20 kernel: increase BH_LRU_SIZE to 16 77/12577/2
Sebastien Buisson [Wed, 5 Nov 2014 15:34:14 +0000 (16:34 +0100)]
LU-20 kernel: increase BH_LRU_SIZE to 16

As the kernel community did not want a complicated way of
modifying BH_LRU_SIZE, it was proposed to directly set it
to 16. This has been accepted.
This patch is merged in the upstream kernel:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=86cf78d73de8c6bfa89804b91ee0ace71a459961

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I71fb455de9ec70ed90f86d402ae76ecfba1e1e61
Reviewed-on: http://review.whamcloud.com/12577
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
9 years agoLU-5729 osd: iput in case of error in osd_scrub_setup 25/12325/4
Sergey Cheremencev [Fri, 26 Sep 2014 13:00:56 +0000 (17:00 +0400)]
LU-5729 osd: iput in case of error in osd_scrub_setup

In case of ENOSPC from osd_scrub_file_store(), an iput is needed.
Otherwise there is a message in dmesg: "VFS: Busy inodes after
unmount of vdb. Self-destruct in 5 seconds. Have a nice day..."
Also added osd_oi_fini for the case of an error from
osd_initial_OI_scrub or osd_scrub_start.

Change-Id: Ibc6f487c9bd5b07f09cb3f7e3b5fc2bf1e329fb0
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Xyratex-bug-id: MRP-2109
Reviewed-on: http://review.whamcloud.com/12325
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5855 lfsck: misc fixes for zfs-based backend 52/12552/5
Fan Yong [Wed, 3 Sep 2014 16:25:33 +0000 (00:25 +0800)]
LU-5855 lfsck: misc fixes for zfs-based backend

It contains several fixes to make LFSCK work under DNE mode
for the zfs-based backend.

Test-Parameters: mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs mdscount=2 mdtcount=2 testlist=sanity-lfsck
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8e8758336d4ce67667f7e3586475ddd72db2d419
Reviewed-on: http://review.whamcloud.com/12552
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5833 lfsck: handle lfsck_open_dir() return-value properly 33/12533/3
Fan Yong [Tue, 2 Sep 2014 11:06:03 +0000 (19:06 +0800)]
LU-5833 lfsck: handle lfsck_open_dir() return-value properly

Inside lfsck_prep(), the value returned from lfsck_open_dir()
should be handled properly rather than returned to the caller directly.
For example, a positive number from lfsck_open_dir() means the end of
the current directory, but if that value is passed on to
lfsck_prep()'s caller, then the whole LFSCK first-stage scanning
will wrongly be regarded as done.

Test-Parameters: mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity-lfsck
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9e5c32b8594a65f1b605196373034ace6c9d1881
Reviewed-on: http://review.whamcloud.com/12533
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5817 clio: Do not allow group locks with gid 0 59/12459/4
Patrick Farrell [Mon, 10 Nov 2014 07:39:29 +0000 (01:39 -0600)]
LU-5817 clio: Do not allow group locks with gid 0

When a group lock with GID=0 is released (put_grouplock is
called), an assertion in cl_put_grouplock is hit.

We should not allow group lock requests with GID=0, instead
we should return -EINVAL.

Also fix random_group_id so it never returns gid==0.
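
Both parts can be sketched in a few lines. The helper names are
hypothetical, and the retry loop is just one way to keep a random
generator from returning zero; the patch may do it differently.

```c
#include <errno.h>
#include <stdlib.h>

/* Reject group lock requests with gid 0 instead of asserting later. */
static int check_group_lock_gid(unsigned long gid)
{
	return gid == 0 ? -EINVAL : 0;
}

/* Draw random group ids until a non-zero one comes up. */
static int random_nonzero_gid(void)
{
	int gid;

	do {
		gid = rand();
	} while (gid == 0);

	return gid;
}
```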

Change-Id: I56e58791742809da5353a4d8dfbf3b80a22f3814
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: http://review.whamcloud.com/12459
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
9 years agoLU-5732 hsm: sanity check for progress input 85/12285/4
Frank Zago [Sun, 12 Oct 2014 18:57:05 +0000 (13:57 -0500)]
LU-5732 hsm: sanity check for progress input

During an HSM archive or restore, the progress is reported by the
copytool, in userspace. That value may be bogus. For instance, this
will crash the MDS in interval_set():

he.offset = -1;
he.length = 10;
rc = llapi_hsm_action_progress(hcp, &he, length, 0);

So check that userspace is giving a sane progress extent value.
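
One plausible form of such a check, assuming the extent's offset and
length are unsigned 64-bit values, is to reject any extent whose end
would wrap around. The struct and function below are illustrative
stand-ins, not the MDS code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the progress extent sent by the copytool. */
struct progress_extent {
	uint64_t offset;
	uint64_t length;
};

/* Return -EINVAL if offset + length would overflow 64 bits, as it
 * does for offset = -1 (i.e. UINT64_MAX) and length = 10. */
static int check_progress_extent(const struct progress_extent *he)
{
	if (he->length > UINT64_MAX - he->offset)
		return -EINVAL;
	return 0;
}
```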

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I0eb3fa9a66400a4ff3cee2f256c08e1d84744111
Reviewed-on: http://review.whamcloud.com/12285
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-3270 statahead: small fixes and cleanup 67/9667/12
Lai Siyao [Fri, 14 Mar 2014 12:10:32 +0000 (20:10 +0800)]
LU-3270 statahead: small fixes and cleanup

small fixes:
* when 'unplug' is set for ll_statahead(), sa_put() shouldn't kill
  the entry found, because its inflight RPC may not finish yet.
* remove 'sai_generation', add 'lli_sa_generation' because the
  former one is not safe to access without lock.
* revalidate_statahead_dentry() may fail to wait for statahead
  entry to become ready, in this case it should not release this
  entry, because it may be used by inflight statahead RPC.

cleanups:
* rename ll_statahead_enter() to ll_statahead().
* move dentry 'lld_sa_generation' update to ll_statahead() to
  simplify code and logic.
* other small cleanups.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I65759c7dfcbe879b42f14152dbfe5949e3d37ea0
Reviewed-on: http://review.whamcloud.com/9667
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5473 tests: print space usage in sanity test_51b 85/12185/7
Andreas Dilger [Fri, 24 Oct 2014 07:23:14 +0000 (00:23 -0700)]
LU-5473 tests: print space usage in sanity test_51b

In sanity.sh test_51b print out the space usage before and after
the test so that the failure can be debugged.

Skip test_51b and test_51ba for ZFS when running regular review
tests, since there isn't a limit of 60000 subdirectories (ZFS
nlink is a 64-bit number), and they take a long time to run in
a VM (20 minutes combined).

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=51 \
mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs \
clientdistro=el6 ossdistro=el6 mdsdistro=el6 \
mdtcount=1 mdssizegb=2 ostcount=2 ostsizegb=8 \
testlist=sanity

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=51 \
mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs \
clientdistro=el6 ossdistro=el6 mdsdistro=el6 \
mdtcount=1 mdssizegb=2 ostcount=2 ostsizegb=8 nettypes=o2ib \
testlist=sanity

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I21b072fbcb05dea3fd7803bf3353de11ffbcab07
Reviewed-on: http://review.whamcloud.com/12185
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5778 lod: Fix lod_qos_statfs_update() 17/12617/2
Aurelien Degremont [Fri, 7 Nov 2014 13:10:03 +0000 (14:10 +0100)]
LU-5778 lod: Fix lod_qos_statfs_update()

When an OST is sick or inactive, lod cannot fetch its statfs
information. In lod_qos_statfs_update() this prevented lod from
getting information for the other OSTs, because the refresh stopped
at the first error.
This patch fixes this behaviour.
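
The general pattern can be sketched as follows. This is an illustration
only: refresh_all_targets(), refresh_fn and fail_on_second() are
hypothetical names, not the code in lod_qos_statfs_update():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-target refresh callback: returns 0 or a negative errno. */
typedef int (*refresh_fn)(size_t idx);

/*
 * Refresh every target, even when some of them fail. Returns the number
 * of targets refreshed successfully and stores the first error seen in
 * *first_rc (0 if there was none).
 */
static size_t refresh_all_targets(size_t count, refresh_fn refresh,
				  int *first_rc)
{
	size_t ok = 0;
	size_t i;

	assert(refresh != NULL && first_rc != NULL);
	*first_rc = 0;

	for (i = 0; i < count; i++) {
		int rc = refresh(i);

		if (rc != 0) {
			/* Remember the error but keep going. */
			if (*first_rc == 0)
				*first_rc = rc;
			continue;
		}
		ok++;
	}
	return ok;
}

/* Example callback: the target at index 1 is unreachable. */
static int fail_on_second(size_t idx)
{
	return idx == 1 ? -EIO : 0;
}
```

With three targets and fail_on_second() as the callback, the targets at
index 0 and 2 are still refreshed and the error from index 1 is reported.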

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: Id0217f228381ef7a41fdbfd99f5499dcc97ace0e
Reviewed-on: http://review.whamcloud.com/12617
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5865 lfsck: avoid NULL pointer 97/12597/2
Fan Yong [Thu, 4 Sep 2014 01:37:08 +0000 (09:37 +0800)]
LU-5865 lfsck: avoid NULL pointer

Do not pass NULL as the @lmv parameter of lfsck_record_lmv(), so
that the subsequent handling inside lfsck_record_lmv() does not
need to handle the "lmv == NULL" case.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5f308818edd5ded2c4ccc7d59fb0908791b8aae3
Reviewed-on: http://review.whamcloud.com/12597
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-181 ptlrpc: reorganize ptlrpc_request 06/8806/16
Liang Zhen [Sun, 5 Jan 2014 15:00:26 +0000 (23:00 +0800)]
LU-181 ptlrpc: reorganize ptlrpc_request

Some members of ptlrpc_request are used only on the client side, and
others only on the server side. This patch moves these members into
separate structures and puts those structures into a union.

This decreases the size of ptlrpc_request by about 300 bytes.
Besides saving memory, it can also reduce the memory footprint
while processing requests.

Another change in this patch is that osp no longer uses rq_exp_list,
because it is now a server-only member. Instead, osp uses
ptlrpc_req_async_args to store the commit_cb parameters.
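
The space saving from the union can be illustrated with a toy layout.
The structures and sizes below are hypothetical; the real
ptlrpc_request members differ:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field groups standing in for the real members. */
struct client_only { char cli_state[200]; };
struct server_only { char srv_state[300]; };

/* Flat layout: every request pays for both groups. */
struct request_flat {
	int common;
	struct client_only cli;
	struct server_only srv;
};

/* Union layout: a request is either client-side or server-side. */
struct request_union {
	int common;
	union {
		struct client_only cli;
		struct server_only srv;
	} u;
};

/* The union only needs room for the larger of the two groups. */
static_assert(sizeof(struct request_union) < sizeof(struct request_flat),
	      "union layout should be smaller than the flat layout");
```

The trade-off is that code must never touch the server-side members of a
client request or the reverse, which is why a user of a server-only
member has to move to a different field.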

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Id910ac225b8e9d33a0cae40b3124ce55f1a3fbc9
Reviewed-on: http://review.whamcloud.com/8806
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
9 years agoLU-5654 osp: Call obd_fid_fini() on osp_init0() error path 37/12037/2
Li Wei [Wed, 24 Sep 2014 09:09:48 +0000 (17:09 +0800)]
LU-5654 osp: Call obd_fid_fini() on osp_init0() error path

osp_init0() should call obd_fid_fini() on its error path to avoid
leaks.
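
A sketch of the usual cleanup pattern for such an error path. The
functions fid_init(), fid_fini(), later_step() and device_init() are
hypothetical stand-ins, not the osp code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical resource counter standing in for the FID client state. */
static int fid_refs;

static int fid_init(void)
{
	fid_refs++;
	return 0;
}

static void fid_fini(void)
{
	assert(fid_refs > 0);
	fid_refs--;
}

/* Hypothetical later step that may fail after fid_init() succeeded. */
static int later_step(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static int device_init(bool fail_late)
{
	int rc;

	rc = fid_init();
	if (rc != 0)
		return rc;

	rc = later_step(fail_late);
	if (rc != 0)
		goto out_fid;

	return 0;

out_fid:
	/* Undo fid_init() so the error path does not leak. */
	fid_fini();
	return rc;
}
```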

Change-Id: I1a679db172ae60c74049d2dd3e111c93cfcbeda2
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/12037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
9 years agoLU-5854 lnet: make YAML output/input consistent 56/12556/6
Amir Shehata [Tue, 4 Nov 2014 20:49:51 +0000 (12:49 -0800)]
LU-5854 lnet: make YAML output/input consistent

The YAML format used for configuring and showing networks was not
consistent. This patch makes both formats consistent.

EX:
net:
    - net: tcp
      nid: 192.168.206.130@tcp
      status: up
      interfaces:
          0: eth0
      tunables:
          peer_timeout: 180
          peer_credits: 8
          peer_buffer_credits: 0
          credits: 256
      CPT: [0,1]

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Id4314679930709ac43104f1ba544bb6d1ca8cb0a
Reviewed-on: http://review.whamcloud.com/12556
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5831 lfsck: extend lfsck_request::lr_pool_name 34/12534/3
Fan Yong [Mon, 1 Sep 2014 12:11:30 +0000 (20:11 +0800)]
LU-5831 lfsck: extend lfsck_request::lr_pool_name

Fix some issues found during Lustre source static analysis:

1) Extend lfsck_request::lr_pool_name size to match the
   lmv_mds_md_v1::lmv_pool_name.
2) Check lfsck->li_obj_dir inside lfsck_close_dir() before using.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I84443089135c5de1b9fa89eb76e5cd623412a01f
Reviewed-on: http://review.whamcloud.com/12534
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
9 years agoLU-4217 build: Remove ACL mount options 54/12154/3
James Nunez [Wed, 1 Oct 2014 14:06:08 +0000 (08:06 -0600)]
LU-4217 build: Remove ACL mount options

The "acl" mount option on the client has been
deprecated since Lustre 1.8. The "acl" mount
option code is now obsolete and is removed.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib765476a71ebb732d9ffda60b336530e0a758943
Reviewed-on: http://review.whamcloud.com/12154
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5006 mdd: don't call attr_set on object create 43/12243/4
Niu Yawei [Thu, 9 Oct 2014 08:11:31 +0000 (04:11 -0400)]
LU-5006 mdd: don't call attr_set on object create

The object attributes are already initialized in the OSD layer when
the object is created, so it is not necessary to initialize them
again in the MDD layer.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I6f4094d4384b2c153d4dad2666d64281c0450059
Reviewed-on: http://review.whamcloud.com/12243
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5848 lnet: fix inconsistent seq_no names 62/12562/2
Amir Shehata [Tue, 4 Nov 2014 21:29:06 +0000 (13:29 -0800)]
LU-5848 lnet: fix inconsistent seq_no names

When YAML output is printed, the literal "seqno" is used, but when
it is parsed, the literal "seq_no" is expected.
This patch makes it consistent.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Iabf5394e858007c7f6e87c7baf892887da88f8e3
Reviewed-on: http://review.whamcloud.com/12562
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years agoLU-4730 utils: lctl get_param, set_param cleanup 45/9545/10
Andreas Dilger [Mon, 3 Nov 2014 15:42:18 +0000 (10:42 -0500)]
LU-4730 utils: lctl get_param, set_param cleanup

Cleanup "lctl get_param" and "lctl set_param" code and error handling.
Deny "parameters" with embedded relative paths to avoid strangeness.
Return an error consistently if multiple parameters are set and one of
them fails, even when the last one did not fail.  Remove deprecated
full-path handling.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1004b5b4da4dc9b5825ef498758e248ed52f4141
Reviewed-on: http://review.whamcloud.com/9545
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Chao Wang <chao.ornl@gmail.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5329 mgs: Remove nidtbl swab code for 2.2 clients 10/12010/3
James Nunez [Mon, 22 Sep 2014 22:50:59 +0000 (16:50 -0600)]
LU-5329 mgs: Remove nidtbl swab code for 2.2 clients

Remove obsolete code that allows compatibility with
Lustre 2.2 clients.

Due to a bug, Lustre 2.2 clients always swab nidtbl
entries even if the server and client have the
same endianness. The fix was to have the servers do
the swabbing for the client.

Now, clients will do the swabbing.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I420ca986c0a68343be07272bb419cbdb1cebf148
Reviewed-on: http://review.whamcloud.com/12010
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
9 years agoLU-2675 libcfs: remove libcfs posix headers 87/11987/5
John L. Hammond [Mon, 3 Nov 2014 19:42:49 +0000 (14:42 -0500)]
LU-2675 libcfs: remove libcfs posix headers

Remove libcfs/include/libcfs/posix/. Include what was needed from
libcfs/posix/libcfs.h into libcfs/libcfs.h or in the appropriate .c
file.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ia3016c83f13554b617c5f4a6dcc86adf222d4e49
Reviewed-on: http://review.whamcloud.com/11987
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5551 lustre: fix messages with missing newlines 96/11996/5
John L. Hammond [Fri, 19 Sep 2014 15:09:51 +0000 (10:09 -0500)]
LU-5551 lustre: fix messages with missing newlines

Add missing newlines to four CERROR() messages. Restore the trailing
newline in the definition of OSC_DUMP_GRANT(). Remove an unnecessary
CDEBUG() from ldlm_pool_recalc().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I549de59dd9cd205e1a6d0fbcd70ccd1cbf5e389b
Reviewed-on: http://review.whamcloud.com/11996
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5871 lod: Do not return EAGAIN in lod_object_init 86/12586/4
Wang Di [Wed, 5 Nov 2014 18:46:59 +0000 (10:46 -0800)]
LU-5871 lod: Do not return EAGAIN in lod_object_init

Convert EAGAIN to EIO if fld_client_rpc() fails in
lod_object_init(). Otherwise the EAGAIN confuses
lu_object_find_at() and makes it wait for no reason; it
should only wait if the object is dying.
See the call chain lu_object_find_at()->lu_object_find_try()
->lu_object_alloc()->lod_object_init()->lod_fld_lookup()
->fld_client_rpc(). Even worse, the waitq has not been
initialized yet when the failure happens here.
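
The error translation itself is small. A sketch, assuming a caller that
treats -EAGAIN as "wait and retry"; object_init_lookup() and the example
lookups are hypothetical names, not the lod code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup step; returns 0 or a negative errno. */
typedef int (*lookup_fn)(void);

/*
 * Run the lookup during object initialization. -EAGAIN is translated to
 * -EIO so that a caller which interprets -EAGAIN as "the object is
 * dying, wait and retry" does not wait on a plain lookup failure.
 */
static int object_init_lookup(lookup_fn lookup)
{
	int rc;

	assert(lookup != NULL);

	rc = lookup();
	if (rc == -EAGAIN)
		rc = -EIO;
	return rc;
}

/* Example lookups used to exercise the translation. */
static int lookup_busy(void)
{
	return -EAGAIN;
}

static int lookup_ok(void)
{
	return 0;
}
```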

Change-Id: Ieae434b34c239efea86a4a471fb01e397336a31c
Signed-off-by: Wang Di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/12586
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-3573 osd-zfs: Only advance zap cursor as needed 82/12582/3
Nathaniel Clark [Wed, 5 Nov 2014 18:05:22 +0000 (13:05 -0500)]
LU-3573 osd-zfs: Only advance zap cursor as needed

Only advance the zap cursor when ozi_pos is not advanced; otherwise
a file could occasionally get "lost" because the zap_cursor would
advance over it before the retrieve happened.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Iad560e2ffb4cfe2c74a1cf9197be7c2537538822
Reviewed-on: http://review.whamcloud.com/12582
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5565 osd-ldiskfs: separate LASSERT() into two lines 98/12398/2
Andreas Dilger [Wed, 22 Oct 2014 18:35:31 +0000 (12:35 -0600)]
LU-5565 osd-ldiskfs: separate LASSERT() into two lines

Separate the compound assertions in osd-ldiskfs into two lines:

    LASSERT(dt_object_exists(dt) && !dt_object_remote(dt));
to
    LASSERT(dt_object_exists(dt));
    LASSERT(!dt_object_remote(dt));

so that it is possible to distinguish which of the two is being hit.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3ff4fc28bffe955ab051ece665faa4c8a6500c1e
Reviewed-on: http://review.whamcloud.com/12398
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
9 years agoLU-5626 ldiskfs: update non-htree dotdot in rename 85/12585/2
Bob Glossman [Wed, 5 Nov 2014 17:40:42 +0000 (09:40 -0800)]
LU-5626 ldiskfs: update non-htree dotdot in rename

This change duplicates for sles11sp3 the changes previously
committed only for el6.

In 2.4+, when renaming a directory, its old dotdot entry is
removed first, then the new dotdot entry is inserted, and
ldiskfs tries to append FID-in-dirent to the new entry.
But the space for dotdot entry may not be enough to hold
the new dotdot with FID-in-dirent, such as an MDT device
restored from file-level backup, or a device upgraded from 1.8.

In that case, for non-HTree directories, the ".." entry
will be written in the next available space in the directory
block.  This is invalid, as the ".." entry must be the
second entry in the block.

The same bug was fixed for HTree directories in LU-2638.
As Fan Yong said then: we do not want to introduce
complex logic to handle directory data moving, instead, in
such case, ignore the FID-in-dirent for the new dotdot entry,
and just insert the new dotdot entry.

There is one known flaw: This patch, like the one for
LU-2638, skips the entire data section rather than just
the FID.  This could cause trouble if something else ever
uses this section with ".." entries.

Test-Parameters: mdsdistro=sles11sp3 ossdistro=sles11sp3 \
 mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
 ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iaba11ac19ab7f802925af7a562ad7f739e6ed5c8
Reviewed-on: http://review.whamcloud.com/12585
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5830 scripts: use lustre_rmmod in lnet start/stop script 13/12513/4
Bruno Faccini [Fri, 31 Oct 2014 00:59:51 +0000 (01:59 +0100)]
LU-5830 scripts: use lustre_rmmod in lnet start/stop script

In the stop phase of lnet's start/stop script, use lustre_rmmod
instead of trying to unload a static list/sequence of modules.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ie8584b32e4d7cd21de0ed18954aa38124485964d
Reviewed-on: http://review.whamcloud.com/12513
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years agoNew tag 2.6.90 2.6.90 v2_6_90 v2_6_90_0
Oleg Drokin [Thu, 6 Nov 2014 18:50:28 +0000 (13:50 -0500)]
New tag 2.6.90

Change-Id: I9fcc98e0df6a44f5836c6a038fd59d8614200bd8

9 years agoLU-5863 utils: Handle the special case of ldd_svname for mgs 64/12564/3
James Simmons [Wed, 5 Nov 2014 00:43:44 +0000 (19:43 -0500)]
LU-5863 utils: Handle the special case of ldd_svname for mgs

Currently parse_ldd checks to see if ldd_svname is 8 or more
characters in length. This is true for all servers except
the MGS, which is always labeled as "MGS". This can prevent
the mounting of the MGT. The solution is to check whether we are
handling an MGT, which does not require the extra
special handling that other OSDs need.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I68582471e2b6ce47473a4fefb21e589c8c5b3730
Reviewed-on: http://review.whamcloud.com/12564
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5814 lov: remove unused {get,set}_info handlers 45/12445/4
John L. Hammond [Mon, 27 Oct 2014 22:26:04 +0000 (17:26 -0500)]
LU-5814 lov: remove unused {get,set}_info handlers

In LOV and OSC remove handlers for the obsolete get and set info keys:
KEY_CAPA_KEY, KEY_CONNECT_FLAG, KEY_EVICT_BY_NID, KEY_LAST_ID,
KEY_LOCK_TO_STRIPE, KEY_MDS_CONN, KEY_NEXT_ID.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iab1adaffc4ea860ea6ce2a2614b5ab6f6444e34b
Reviewed-on: http://review.whamcloud.com/12445
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years agoLU-5691 hsm: remove a request from the index if not present in the store 42/12142/2
Frank Zago [Tue, 30 Sep 2014 23:05:48 +0000 (18:05 -0500)]
LU-5691 hsm: remove a request from the index if not present in the store

When processing the list of requests that have aged out, if the
request cannot be found in the store, remove it from the index. If
that is not done, Lustre will try again to remove it, leading to an
endless cycle of cancellation.

This fixes the repetition of these messages:

  LustreError:
  2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state())
  tas01-MDT0000: Cannot find running request for cookie 0x54249515 on
  fid=[0x200000404:0x15caa:0x0]
  LustreError:
  2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state())
  Skipped 15979999 previous similar messages
  LustreError: 2028:0:(mdt_coordinator.c:339:mdt_coordinator_cb())
  tas01-MDT0000: Cannot cleanup timeouted request:
  [0x200000404:0x15caa:0x0] for cookie 0x54249515 action=CANCEL
  LustreError: 2028:0:(mdt_coordinator.c:339:mdt_coordinator_cb())
  Skipped 15979999 previous similar messages

Change-Id: Ie7a2a98be8cc97db9af7a64476c06fc7321544eb
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12142
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5853 build: fix el7 build regression 46/12546/2
Bob Glossman [Mon, 3 Nov 2014 22:43:27 +0000 (14:43 -0800)]
LU-5853 build: fix el7 build regression

Correct the build failure caused by the recent master landing of
"LU-4647 nodemap: add mapping functionality" by using PDE_DATA()
instead of PDE() in the new code.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5b43e485cf5ba25e8473ed5783848aca77b96048
Reviewed-on: http://review.whamcloud.com/12546
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5383 utils: fix array index out of bounds 24/12524/2
Dmitry Eremin [Fri, 31 Oct 2014 15:33:59 +0000 (18:33 +0300)]
LU-5383 utils: fix array index out of bounds

Possible attempt to access element -8..-1 of array 'ldd_svname'.
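
A sketch of the kind of length guard that avoids such an access,
assuming the code looks at the last 8 characters of the name (for
example "-MDT0000"). svname_suffix() and SVNAME_SUFFIX_LEN are
hypothetical names, not the actual fix:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SVNAME_SUFFIX_LEN 8	/* e.g. "-MDT0000" */

/*
 * Return a pointer to the last SVNAME_SUFFIX_LEN characters of svname,
 * or NULL if the name is too short (for example "MGS"). Checking the
 * length first avoids computing svname + len - 8 with len < 8.
 */
static const char *svname_suffix(const char *svname)
{
	size_t len;

	assert(svname != NULL);

	len = strlen(svname);
	if (len < SVNAME_SUFFIX_LEN)
		return NULL;
	return svname + len - SVNAME_SUFFIX_LEN;
}
```

A caller has to handle the NULL return for short names before looking at
the suffix.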

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ib4ec6a6d74ff6e805725d0ff4487868b7cbffa2f
Reviewed-on: http://review.whamcloud.com/12524
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5577 changelog: fix comparison between signed and unsigned 74/12474/2
Dmitry Eremin [Wed, 29 Oct 2014 12:36:25 +0000 (15:36 +0300)]
LU-5577 changelog: fix comparison between signed and unsigned

Change the type of changelog_*{namelen,size}() to size_t.
Fix the format specifiers for unsigned types.
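
As an illustration of the pattern, a length helper that returns size_t
and is printed with the matching %zu conversion. struct name_rec and
both functions are hypothetical; the real changelog record layout
differs:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding a NUL-terminated name. */
struct name_rec {
	const char *name;
};

/* Return the name length as size_t, the type strlen() returns. */
static size_t name_rec_namelen(const struct name_rec *rec)
{
	assert(rec != NULL && rec->name != NULL);
	return strlen(rec->name);
}

/* Format the length with %zu, the conversion that matches size_t. */
static int format_namelen(char *buf, size_t bufsz, const struct name_rec *rec)
{
	return snprintf(buf, bufsz, "namelen=%zu", name_rec_namelen(rec));
}
```

Keeping the return type unsigned also lets callers compare the result
with sizeof() values without a signed/unsigned comparison warning.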

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ie24c87242328d14ee608ad38b530a6e581db93b9
Reviewed-on: http://review.whamcloud.com/12474
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
9 years agoLU-5814 echo: remove userspace LSM handling 46/12446/4
John L. Hammond [Mon, 27 Oct 2014 22:53:04 +0000 (17:53 -0500)]
LU-5814 echo: remove userspace LSM handling

In lustre/obdecho/echo_client.c, remove handling of lov_stripe_md
passed from userspace (since userspace never passes it). Remove the
LOV specific code (ed_next_islov) from the echo client (since it
doesn't work).

Remove echo_get_stripe_off_id() and all calls to it since the stripe
count of the passed in lsm is always 0 and the function does nothing
in this case. Remove the then unused lsm parameters of
echo_client_page_debug_setup() and echo_client_page_debug_check().

In the OBD_IOC_GETATTR and OBD_IOC_SETATTR cases of
echo_client_iocontrol() do not set the oi_md member of struct obd_info
since only LOV OBD methods access it.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: If5d31ca3bf798d2e4f6c4f63c2012160e50f8cd7
Reviewed-on: http://review.whamcloud.com/12446
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5814 lov: remove LL_IOC_RECREATE_{FID,OBJ} 42/12442/4
John L. Hammond [Mon, 27 Oct 2014 21:13:21 +0000 (16:13 -0500)]
LU-5814 lov: remove LL_IOC_RECREATE_{FID,OBJ}

Remove the obsolete ioctls LL_IOC_RECREATE_FID and LL_IOC_RECREATE_OBJ
along with their handlers in llite. Remove the then unused OBD method
lov_create(). Remove OBD_FL_RECREATE_OBJS handling from osc_create().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib7183235d9eb761d2dfa2072dbeb8dd4d918e4ad
Reviewed-on: http://review.whamcloud.com/12442
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>