Whamcloud - gitweb
fs/lustre-release.git
10 years ago LU-4808 tests: sanity cleanup to work on Xeon Phi 86/10086/2
Dmitry Eremin [Mon, 24 Mar 2014 18:03:25 +0000 (22:03 +0400)]
LU-4808 tests: sanity cleanup to work on Xeon Phi

* Make test_74c robust across different platforms.
* Fix an issue with values greater than 2^32 (for example 9.36473e+09).
* Fix an integer comparison when a variable is empty.
* Check for the presence of tools (rsync and getfattr).
* Check for CLIENTONLY mode.
* Remove "-x" options from grep (it's not always supported).
* Coding style cleanup.
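
The two numeric fixes in the list above can be illustrated with a small sketch. It is written in Python rather than the shell used by the test scripts, and `to_int` is a hypothetical helper, not part of the test framework. It accepts values printed in scientific notation and treats an empty string as a default instead of failing the comparison.

```python
def to_int(text, default=0):
    """Convert a counter read from a tool's output to an integer.

    Hypothetical helper: handles scientific notation such as
    "9.36473e+09" and falls back to `default` for an empty value.
    """
    text = text.strip()
    if not text:
        return default
    return int(float(text))


value = to_int("9.36473e+09")
print(value > 2**32)   # the value does not have to fit in 32 bits
print(to_int("") < 1)  # an empty variable compares as the default
```

The same idea in shell would be to normalize the value once and to give empty variables an explicit default before any integer comparison.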

Lustre-commit: e217648d50da55366ef819e78bb8d7601a109077
Lustre-change: http://review.whamcloud.com/9766

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I8e553da96afd78e96c9c534898db7173aa268223
Reviewed-on: http://review.whamcloud.com/10086
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3333 ptlrpc: Protect request buffer changing
Oleg Drokin [Mon, 27 May 2013 18:06:09 +0000 (14:06 -0400)]
LU-3333 ptlrpc: Protect request buffer changing

The *_enlarge_reqbuf class of functions can change the request body
location for a request that is already in the replay list. A parallel
traverser of the list (after_reply -> ptlrpc_free_committed) might
therefore access freed and scrambled memory, triggering an assertion.

Since all such users can only reach this request under imp_lock, take
imp_lock in *_enlarge_reqbuf to protect against them.

Change-Id: I0fa690aabd8696a9f05b94c66e06e30eefb5c759
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/10074
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-2479 ldiskfs: do not check dir max size for regular files 43/10043/2
Emoly Liu [Thu, 17 Apr 2014 05:57:08 +0000 (13:57 +0800)]
LU-2479 ldiskfs: do not check dir max size for regular files

ldiskfs_append() is used not only to extend a directory but also to
maintain an iam container. In the latter case ldiskfs_append() should
not check for the max directory size. The iam container is
distinguished by being a regular file.

This is a backport of
Lustre-commit: 4b9e64aec7cfdf859f1f931e1ee44056db050bb9
Lustre-change: http://review.whamcloud.com/8137

Signed-off-by: Vladimir Saveliev <vladimir_saveliev@xyratex.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ibc3e96078e9af3eb854c6666f087b1049b68f409
Reviewed-on: http://review.whamcloud.com/10043
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4422 quota: fix s-q test_6 46/9946/2
Niu Yawei [Mon, 14 Apr 2014 04:06:11 +0000 (12:06 +0800)]
LU-4422 quota: fix s-q test_6

s-q test_6 should not fail based on whether the file is growing,
because most of the time the test is waiting for the quota slave
to reconnect to the quota master.

This patch is back-ported from the following one:
Lustre-commit: a1cf492e3632e2930b9593817d9397ba1b378fda
Lustre-change: http://review.whamcloud.com/9339

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I5c40e704266672af270e8306dff1fb867af1b8b5
Reviewed-on: http://review.whamcloud.com/9946
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4915 kernel: kernel update [SLES11 SP3 3.0.101-0.21] 82/9982/2
Bob Glossman [Wed, 16 Apr 2014 17:08:11 +0000 (10:08 -0700)]
LU-4915 kernel: kernel update [SLES11 SP3 3.0.101-0.21]

update target and config files for new version

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia2f2659e3f57d2fccacf4afe2c50efe7a656be7c
Reviewed-on: http://review.whamcloud.com/9982
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4381 lov: to not hold sub locks at initialization 94/9994/2
Jinshan Xiong [Thu, 6 Feb 2014 06:49:23 +0000 (22:49 -0800)]
LU-4381 lov: to not hold sub locks at initialization

Otherwise it can deadlock, because it essentially holds some sub
locks and then requests others in an arbitrary order.

Lustre-commit: c6ab1fcc056778b18f685ec591ce27907e887617
Lustre-change: http://review.whamcloud.com/9152

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I9bdfa2339c83396efa5d16763a5329d06e232ddd
Reviewed-on: http://review.whamcloud.com/9994
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4558 clio: Solve a race in cl_lock_put 87/9887/3
Jinshan Xiong [Thu, 3 Apr 2014 00:35:45 +0000 (17:35 -0700)]
LU-4558 clio: Solve a race in cl_lock_put

Checking the last reference and the state of cl_lock in
cl_lock_put() is not atomic. As a result, a lock that is still in
use can be freed if the process is preempted between
atomic_dec_and_test() and (lock->cll_state == CLS_FREEING).

This problem can be solved by having coh_locks hold a refcount. In
that case, once the lock refcount reaches zero, nobody else has
any chance to use the lock again.

Lustre-commit: ec1a5d4f1f5b52ee5031ddc11a862c82996541f7
Lustre-change: http://review.whamcloud.com/9881

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5e16396156e1c7b8b86f7aa74b7b4735bb774a0f
Reviewed-on: http://review.whamcloud.com/9887
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4591 lov: cancel ungranted sub lock 51/9851/2
Shuichi Ihara [Sat, 29 Mar 2014 05:29:20 +0000 (14:29 +0900)]
LU-4591 lov: cancel ungranted sub lock

When the top lock is canceled due to an error, we should cancel the
ungranted sub lock; otherwise the sub lock state is undefined.

backport from http://review.whamcloud.com/9524

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ifa68743cd5bf3a9d69258014d066124fd7fc87c9
Reviewed-on: http://review.whamcloud.com/9851
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4721 obdclass: handle local storage init/fini properly 72/9572/3
Fan Yong [Tue, 25 Feb 2014 18:42:11 +0000 (02:42 +0800)]
LU-4721 obdclass: handle local storage init/fini properly

1) In local_oid_storage_fini(), take the mutex on ls_device
   before decreasing the 'los' reference, so that others cannot
   obtain the mutex earlier and free the 'los' in a race.

2) When llog initializes the local storage for FID_SEQ_LLOG and
   FID_SEQ_LLOG_NAME, it should record the handlers that can be
   used to finalize them, to avoid releasing a handler that is
   still in use by others.

3) Do not forget llog_ctxt_put() if something goes wrong during
   llog_osd_setup().

4) Do not put the object in lastid_compat_check() until all
   uses of the object are done.

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes testlist=sanity-scrub,sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If7d9241147870e202b6a1082a398ee1c48f1a6d8
Reviewed-on: http://review.whamcloud.com/9572
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4596 lprocfs: mdt/*/exports/*/uuid is empty after remount
Andriy Skulysh [Thu, 6 Feb 2014 22:50:08 +0000 (00:50 +0200)]
LU-4596 lprocfs: mdt/*/exports/*/uuid is empty after remount

need to assign exp_nid_stats even if old stats are present

Change-Id: I9b0ac4c7ccee6cbd457827b20123bf40d91aaf33
Xyratex-bug-id: MRP-1641
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-on: http://review.whamcloud.com/9171
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4853 ldiskfs: fix race in ldiskfs_ext_new_extent_cb
Niu Yawei [Wed, 2 Apr 2014 12:44:00 +0000 (08:44 -0400)]
LU-4853 ldiskfs: fix race in ldiskfs_ext_new_extent_cb

In ldiskfs_ext_calc_credits_for_insert(), we should use the 'depth'
stored in the 'path' instead of the one from the inode, because the
extent tree could have been changed by the time
ldiskfs_ext_calc_credits_for_insert() is called (by
ldiskfs_ext_new_extent_cb()).

It was fixed in LU-2555, but the fix is missing from
sles11sp2/ext4-misc.patch

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9ca349ba52060f10fc980721317ba47e10572473
Reviewed-on: http://review.whamcloud.com/9868
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4252 osd: remove locking i_mutex in osd_punch 42/9742/2
yangsheng [Fri, 21 Mar 2014 03:53:30 +0000 (11:53 +0800)]
LU-4252 osd: remove locking i_mutex in osd_punch

This piece of code was intended to silence the kernel
WARN_ON message that has shown up since upstream 3.9, so
we can just remove it for now. We'll add an ldiskfs
patch to resolve this issue when 3.9 server support
arrives.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: I0d8a13829ebb1d18c672a09729be603c923bc79c
Reviewed-on: http://review.whamcloud.com/9742
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-992 doc: remove old kernels from lustre/ChangeLog 28/8628/4
Andreas Dilger [Thu, 19 Dec 2013 22:07:35 +0000 (15:07 -0700)]
LU-992 doc: remove old kernels from lustre/ChangeLog

Remove the old and obsolete kernels from the current list of
kernel versions that Lustre works with, since it is confusing
to list kernels that are no longer supported at all.

Update recommended e2fsprogs version to include "or newer".

Move ldiskfs README file to top-level directory.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia4859b3b2bf5a8aa9faabeb43d554130103ebbe5
Reviewed-on: http://review.whamcloud.com/8628
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4109 tests: raise out-of inode LOV EA detection threshold 52/9652/2
Bruno Faccini [Mon, 4 Nov 2013 14:28:39 +0000 (15:28 +0100)]
LU-4109 tests: raise out-of inode LOV EA detection threshold

Sanity sub-test 57b sometimes failed because the threshold (< 8K
of MDT space) it used to detect whether the LOV EA has been stored
out of inode is too low (vs. space also allocated for llog files,
ChangeLog, etc.). So raise the threshold to 16.

Lustre commit: f52da9f69bf08e82a3f4ddaf174e3a6318258f1f
Lustre change: http://review.whamcloud.com/8156

Change-Id: I4599d8b45828cee607937b387c1b46ae0df4f6b3
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Reviewed-on: http://review.whamcloud.com/9652
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4728 mdt: fix NULL deference of mdt_fid_lock 07/9707/2
Li Xi [Fri, 7 Mar 2014 04:32:12 +0000 (12:32 +0800)]
LU-4728 mdt: fix NULL deference of mdt_fid_lock

When enabling hsm_control, the mti_exp field of struct mdt_thread_info
could be NULL. ldlm_cli_enqueue_local will crash the kernel when
dereferencing it.

Lustre commit: 0eca7d92e4a84cc6ea2ca9975d2d3b4cef17a686
Lustre change: http://review.whamcloud.com/9543

Change-Id: Iba41a8e646a2fad7f009934d8bf1b67930a98f70
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: http://review.whamcloud.com/9707
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4545 hsm: Always report errors to coordinator. 08/9708/3
Henri Doreau [Wed, 19 Feb 2014 14:49:41 +0000 (15:49 +0100)]
LU-4545 hsm: Always report errors to coordinator.

Make sure feedback on processed items gets properly delivered to the
coordinator even if errors occur between action item delivery and
item processing initialization phase.

Lustre commit: 344a261080d462c29cf45b075deacf793c92e197
Lustre change: http://review.whamcloud.com/9310

Change-Id: I9b79f049728d011480dd86ef6baa23e39a124634
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-on: http://review.whamcloud.com/9708
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4770 tests: tests scripts simplification 35/9735/2
Dmitry Eremin [Mon, 17 Mar 2014 20:33:19 +0000 (00:33 +0400)]
LU-4770 tests: tests scripts simplification

1. Xeon Phi doesn't support "mount -t lustre"; only the
   "mount.lustre" utility can be used.
2. The "head" utility doesn't support the "-1" shortcut option.
   It should be specified as "-n 1".
3. The "sed" utility doesn't support the expression "s/[-.]/ /3".

Lustre-commit: 4d96e960d574387e8ac4a31249ef1d30c859480b
Lustre-change: http://review.whamcloud.com/9668

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I24804d7bde55ac3a70d99a582a1635ee9bfe62b8
Reviewed-on: http://review.whamcloud.com/9735
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago New tag 2.5.1-RC4 2.5.1 2.5.1-RC4 v2_5_1 v2_5_1_0 v2_5_1_0_RC4 v2_5_1_RC4
Oleg Drokin [Thu, 13 Mar 2014 12:52:39 +0000 (08:52 -0400)]
New tag 2.5.1-RC4

Change-Id: Ide0da772d7fd739fce878691e8820e19848534b9

10 years ago LU-4751 hsm: Fix sanity-hsm tests for non-mrsh $PDSH 93/9593/2
Michael MacDonald [Tue, 11 Mar 2014 04:37:14 +0000 (00:37 -0400)]
LU-4751 hsm: Fix sanity-hsm tests for non-mrsh $PDSH

The workaround for starting backgrounded copytool
monitors via pdsh -Rmrsh is not compatible with non-mrsh
values of $PDSH. This commit adds a branch such that
the workaround is only used when required by the test
cluster environment.

Lustre-commit: 0c261a1fd22a0c867b02baa275ae5023f27d6f5f
Lustre-change: http://review.whamcloud.com/#/c/9587/

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Change-Id: Iab7a94180313170a968c7acbe66919a532f08cfd
Reviewed-on: http://review.whamcloud.com/9593
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago New tag 2.5.1-RC3 2.5.1-RC3 v2_5_1_0_RC3 v2_5_1_RC3
Oleg Drokin [Tue, 11 Mar 2014 13:34:41 +0000 (09:34 -0400)]
New tag 2.5.1-RC3

Change-Id: I9e62744390dc3207a56fee8ad28535249af356ba

10 years ago LU-4209 utils: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict 92/9492/2
Andreas Dilger [Mon, 18 Nov 2013 09:47:26 +0000 (02:47 -0700)]
LU-4209 utils: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict

In kernel 3.11 O_TMPFILE was introduced, but the open flag value
conflicts with the O_LOV_DELAY_CREATE flag 020000000 added to fix
LU-812 in Lustre 2.4.  O_LOV_DELAY_CREATE allows applications
to defer file layout and object creation from open time (the default)
until it can instead be specified by the application using an ioctl.

Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
or define a Lustre-specific flag that isn't of use to most/any other
filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
are not meaningful for newly-created regular files and should be
ok since O_LOV_DELAY_CREATE is only meaningful for new files.

I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
ioctl() on the minimally-opened fd and is close to what is needed,
but that doesn't allow specifying the actual read or write mode for
the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
to be set after the file is opened.

For 2.5.1 and later, only check for the 020000000 flag in the kernel
for compatibility with applications compiled against 2.5.0 headers,
since this is needed for SLES11 SP2/SP3 clients on 3.0 kernels.

We will keep the 0100000000 flag in O_LOV_DELAY_CREATE for backward
compatibility until 3.13 is the oldest supported client kernel, but
drop the conflicting __O_TMPFILE value of 020000000 since that will
cause an error when running on 3.11+ kernels.  The 020000000 has only
been used in Lustre 2.4.0-2.4.2 and 2.5.0 and always in conjunction
with 0100000000, so any apps that used O_LOV_DELAY_CREATE directly
instead of calling llapi_file_create*() will still work until Linux
3.13 is used.
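
The bit arithmetic behind this change can be checked with a short sketch. It is written in Python for illustration only. The values 020000000 and 0100000000 are the ones quoted above; the O_NOCTTY and FASYNC values are the generic Linux ones and are an assumption here (a few architectures differ), and the constant names LEGACY_BIT, OLD_FLAG and NEW_FLAG are hypothetical.

```python
# Octal values quoted in the commit message.
TMPFILE_BIT = 0o20000000    # __O_TMPFILE, introduced in kernel 3.11
LEGACY_BIT = 0o100000000    # kept for backward compatibility

# Generic Linux values (assumption, not taken from the commit message).
O_NOCTTY = 0o400
FASYNC = 0o20000

# Hypothetical names for the flag before and after the change.
OLD_FLAG = LEGACY_BIT | 0o20000000
NEW_FLAG = LEGACY_BIT | O_NOCTTY | FASYNC


def conflicts_with_tmpfile(flags):
    """Return True if the flags would also request __O_TMPFILE."""
    return bool(flags & TMPFILE_BIT)


print(conflicts_with_tmpfile(OLD_FLAG))
print(conflicts_with_tmpfile(NEW_FLAG))
```

Because the new value reuses two existing flags that have no meaning for a newly created regular file, no fresh bit has to be reserved.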

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I565f3454616edc60c6acee01034aa5d7733ebbe5
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/9492
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4643 hsm: Make sanity-hsm test_60 more robust 78/9378/6
Michael MacDonald [Tue, 25 Feb 2014 00:56:36 +0000 (19:56 -0500)]
LU-4643 hsm: Make sanity-hsm test_60 more robust

The first version of this test was fragile and could fail
intermittently when test infrastructure was not capable of
providing 1MB/sec in lustre bandwidth. This commit changes the
test to validate that a progress update occurs within the expected
window, rather than testing for a specific amount of data copied
under ideal conditions.

Lustre-commit: 1c9fbda98bf2274d63517ad72ef81484081834b9
Lustre-change: http://review.whamcloud.com/#/c/9376/

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Change-Id: I335e4cd9a3671f60be5a9236f322705b68664c8c
Reviewed-on: http://review.whamcloud.com/9378
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4689 hsm: count NULL terminator in hai_first/hal_size 68/9468/2
Peng Tao [Thu, 27 Feb 2014 08:07:39 +0000 (16:07 +0800)]
LU-4689 hsm: count NULL terminator in hai_first/hal_size

If the fsname length is 8-byte aligned, hai_first fails to count the
terminating NUL, causing the hai to be attached directly after fsname,
and a later hai_first will return a different position for the first hai.
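
A minimal sketch of the arithmetic, in Python for illustration. Both `size_round8` and `hai_offset` are hypothetical helpers that model "round the name length up to a multiple of 8"; they are not the functions used in the Lustre source.

```python
def size_round8(n):
    """Round n up to the next multiple of 8."""
    return (n + 7) & ~7


def hai_offset(fsname, count_nul):
    """Offset of the first item, measured from the start of fsname."""
    length = len(fsname.encode())
    if count_nul:
        length += 1  # reserve room for the terminating NUL
    return size_round8(length)


# An 8-character name is the problematic case.
print(hai_offset("lustre01", count_nul=False))
print(hai_offset("lustre01", count_nul=True))
# Shorter names give the same offset either way.
print(hai_offset("lustre", count_nul=False), hai_offset("lustre", count_nul=True))
```

Without the extra byte, an 8-character name leaves no room for the terminator, which is why the offset has to be computed from the length plus one.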

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Change-Id: I5e9e1f48f99b4743b2d5b93397e06f6becabeb26
Reviewed-on: http://review.whamcloud.com/9431
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/9468

10 years ago LU-4724 hsm: Safe copytool event logging 68/9568/2
Michael MacDonald [Fri, 7 Mar 2014 16:18:10 +0000 (11:18 -0500)]
LU-4724 hsm: Safe copytool event logging

Protect against concurrent event log writes by multiple
threads within a copytool process. Fixes sanity-hsm test_71
failures.
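
The general technique, serializing writers with a lock, can be sketched as follows. This is a Python illustration, not the copytool code; the `EventLog` class and the line format are hypothetical.

```python
import io
import threading


class EventLog:
    """Serialize writes from several threads onto one stream."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def write_event(self, line):
        # Hold the lock for the whole write so lines never interleave.
        with self._lock:
            self._stream.write(line + "\n")
            self._stream.flush()


def worker(log, thread_id, count):
    for seq in range(count):
        log.write_event(f"thread={thread_id} seq={seq}")


buffer = io.StringIO()
log = EventLog(buffer)
threads = [threading.Thread(target=worker, args=(log, i, 100)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(buffer.getvalue().splitlines()))
```

Each event is written and flushed while the lock is held, so a reader of the log sees whole lines only.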

Lustre-commit: f28e754dcbf36e159386038bdebfaef50135715f
Lustre-change: http://review.whamcloud.com/#/c/9553/

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Change-Id: Ic83b162c60074bde7235bb9ae44f8f28797c1dfd
Reviewed-on: http://review.whamcloud.com/9568
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago New tag 2.5.1-RC2 2.5.1-RC2 v2_5_1_0_RC2 v2_5_1_RC2
Oleg Drokin [Thu, 6 Mar 2014 16:19:27 +0000 (11:19 -0500)]
New tag 2.5.1-RC2

Change-Id: Ie7c87fe160c2cdd0bc4ca225ea07dc4cbdb9b7e2

10 years ago LU-4703 mdd: do not skip xattr sanity check for all cases 13/9513/2
Li Dongyang [Tue, 4 Mar 2014 06:10:49 +0000 (17:10 +1100)]
LU-4703 mdd: do not skip xattr sanity check for all cases

The xattr sanity check should be done at all times.
Otherwise we risk letting a non-root user set an
access ACL on any file.

Lustre-commit: 6d7066989c06bdf6c86f5b8b61d542fda8e5739e
Lustre-change: http://review.whamcloud.com/9469

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia669a20139abc5f4b64688cccd4b07a5bcc6910b
Reviewed-on: http://review.whamcloud.com/9513
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4020 hsm: allow copytool event monitoring with JSON 12/9512/3
Michael MacDonald [Wed, 26 Feb 2014 21:18:32 +0000 (16:18 -0500)]
LU-4020 hsm: allow copytool event monitoring with JSON

Adds hooks into various llapi_hsm_* functions to emit JSON-formatted
event messages for consumption by a monitoring agent. The copytool
needs to be supplied with an optional --event-fifo argument to
enable this feature.

Incorporates the following work done by Bruno Faccini:
* Put all JSON routines in a separate file/lib to allow for
  LGPL licensing.
* Major code cleanup to follow best practices and address review
  concerns.
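
As an illustration of the idea, one JSON object per event written to a stream that a monitoring agent reads, here is a minimal Python sketch. The function names, the field names and the event name are hypothetical; the actual message format emitted by the llapi_hsm_* hooks is not described in this commit message.

```python
import io
import json


def format_event(event_type, timestamp, **fields):
    """Build one JSON-formatted event line (hypothetical field names)."""
    record = {"event_type": event_type, "event_time": timestamp}
    record.update(fields)
    return json.dumps(record, sort_keys=True)


def emit_event(fifo, event_type, timestamp, **fields):
    """Write one event per line so a reader can parse the stream line by line."""
    fifo.write(format_event(event_type, timestamp, **fields) + "\n")
    fifo.flush()


# A StringIO stands in for the FIFO opened by the copytool.
fifo = io.StringIO()
emit_event(fifo, "ARCHIVE_START", "2014-02-26 21:18:32", total_bytes=1024)
print(fifo.getvalue(), end="")
```

A monitoring agent would read the other end of the FIFO and decode each line independently.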

Lustre-commit: 45c9ef1dfff207086665c764c7d500c00aa03c7f
Lustre-change: http://review.whamcloud.com/7790

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Icfad0f2093f24fcfec22408d9819d438c6a5e7f1
Reviewed-on: http://review.whamcloud.com/9512
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4704 acl: fix permission problem of setfacl 14/9514/2
Li Xi [Tue, 4 Mar 2014 08:48:43 +0000 (16:48 +0800)]
LU-4704 acl: fix permission problem of setfacl

Setxattr does not check permissions when setting ACL xattrs. This
causes a security problem because any user can bypass
permission checking by changing ACL rules.

Lustre-commit: 42f504ecada81d7a2a8e2244f345e8dbf73fd157
Lustre-change: http://review.whamcloud.com/9473

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: If6bda8ec6b3d91f31bef812acb9f9f636ed56291
Reviewed-on: http://review.whamcloud.com/9514
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4083 mdt: Take lov_mutex in mdt_reint_unlink and _rename
James Nunez [Thu, 10 Oct 2013 20:32:39 +0000 (14:32 -0600)]
LU-4083 mdt: Take lov_mutex in mdt_reint_unlink and _rename

Take the mot_lov_mutex lock around mdo_unlink and mdo_rename
to synchronize around striping.

Test-Parameters: testlist=racer
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iaa00b81fd7cfb25ce5f3dcea3c2d6289d133134f
Reviewed-on: http://review.whamcloud.com/7919
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago New tag 2.5.1-RC1 2.5.1-RC1 v2_5_1_0_RC1 v2_5_1_RC1
Oleg Drokin [Tue, 4 Mar 2014 19:16:09 +0000 (14:16 -0500)]
New tag 2.5.1-RC1

Change-Id: I9a98cdff4049373d4931e234c032c7e577739319

10 years ago LU-4101 mdt: protect internal xattrs
John L. Hammond [Mon, 14 Oct 2013 17:34:13 +0000 (12:34 -0500)]
LU-4101 mdt: protect internal xattrs

In mdt_reint_setxattr() require CAP_SYS_ADMIN to modify a trusted
xattr and silently disallow modification of those trusted xattrs used
by Lustre internally.
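
The policy described above can be modelled as a small decision function. This Python sketch is illustrative only: the set of internal names and the order of the two checks are assumptions, since the commit message does not list the protected xattrs or the exact control flow.

```python
# Hypothetical list; the commit message does not name the protected xattrs.
INTERNAL_XATTRS = {"trusted.lov", "trusted.lma"}


def setxattr_decision(name, has_cap_sys_admin):
    """Return "apply", "deny" or "ignore" for a setxattr request."""
    if not name.startswith("trusted."):
        return "apply"   # outside the scope of this check
    if not has_cap_sys_admin:
        return "deny"    # trusted xattrs need CAP_SYS_ADMIN
    if name in INTERNAL_XATTRS:
        return "ignore"  # report success but change nothing
    return "apply"


print(setxattr_decision("user.comment", False))
print(setxattr_decision("trusted.custom", False))
print(setxattr_decision("trusted.lov", True))
```

"Silently" here means the caller sees no error, but the stored value is left untouched.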

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic616dca74a90da0aedb0ec2624618f91ac6fcaf4
Reviewed-on: http://review.whamcloud.com/7943
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4639 hsm: HSM requests not delivered 22/9422/5
James Nunez [Tue, 25 Feb 2014 22:34:46 +0000 (15:34 -0700)]
LU-4639 hsm: HSM requests not delivered

The total size of an HSM archive request may exceed the
desired (LNET) message size. When this happens, it can hang
the client and prevent the archive request from succeeding.

Before we know the total size of the hsm_action_items, we
need to limit the size of the request. Doing this limits
the number of items that can be sent in one archive request.
We’ve reduced the size allowed for the user archive request
to MDS_MAXREQSIZE/3.

sanity-hsm test 90 forms a list of files to archive. The
number of files in the list needed to be decreased to
match the limit described above.

Lustre-change: http://review.whamcloud.com/#/c/9393/
Lustre-commit: a23e0fe42f26cd54384058d927bcf42330174e7b

Change-Id: I84c36ba318a6ed424248a0567c33e824de3e8021
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-on: http://review.whamcloud.com/9422
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4470 build: wrong linux symbol file search 88/9288/3
Bob Glossman [Wed, 29 Jan 2014 20:00:42 +0000 (12:00 -0800)]
LU-4470 build: wrong linux symbol file search

Long standing build flaw just discovered.  The autoconf function
LB_CHECK_SYMBOL_EXPORT looks for the linux symbol table in the wrong place.
In most builds this doesn't matter as the wrong path being used exactly
matches the correct path.  In SLES builds it does matter a lot.
Failing to find the linux symbol table can lead to incorrect autoconf results.

Change-Id: I9068f2b9aace740b67d88598844c0b8fc1d95b49
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: http://review.whamcloud.com/9288
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4357 libcfs: restore __GFP_WAIT flag to memalloc calls 82/9382/2
Ann Koehler [Wed, 12 Feb 2014 17:14:00 +0000 (01:14 +0800)]
LU-4357 libcfs: restore __GFP_WAIT flag to memalloc calls

In 2.4, the flags passed to the memory allocation functions are
translated from CFS enumeration values to the kernel GFP
values by calling cfs_alloc_flags_to_gfp(). This function adds
__GFP_WAIT to all flags except CFS_ALLOC_ATOMIC. In 2.5, when
the cfs wrappers were dropped, cfs_alloc_flags_to_gfp() was
removed and the CFS_ALLOC_xxxx was simply replaced with __GFP_xxxx.
This means that most memory allocation calls are missing the
__GFP_WAIT flag. The result is that Lustre experiences more ENOMEM
errors, many of which the higher levels of Lustre do not handle
robustly.
Note that GFP_NOFS = __GFP_WAIT | __GFP_IO, so the patch replaces
__GFP_IO with GFP_NOFS.
The patch does not add __GFP_WAIT to GFP_IOFS. GFP_IOFS was not used
in 2.4, so it has never been used with __GFP_WAIT.
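
The flag relationship can be checked with a few lines of bit arithmetic. This is a Python sketch; the numeric bit values are illustrative assumptions (they vary between kernel versions), and only the relation GFP_NOFS = __GFP_WAIT | __GFP_IO comes from the text above. The definition of GFP_IOFS as __GFP_IO | __GFP_FS is also an assumption.

```python
# Illustrative bit values (assumption; check the target kernel's gfp.h).
GFP_WAIT_BIT = 0x10  # stands for __GFP_WAIT
GFP_IO_BIT = 0x40    # stands for __GFP_IO
GFP_FS_BIT = 0x80    # stands for __GFP_FS

GFP_NOFS = GFP_WAIT_BIT | GFP_IO_BIT  # relation stated in the commit message
GFP_IOFS = GFP_IO_BIT | GFP_FS_BIT    # assumed definition, left unchanged


def may_wait(flags):
    """Return True if the allocation is allowed to sleep and retry."""
    return bool(flags & GFP_WAIT_BIT)


print(may_wait(GFP_IO_BIT))  # bare __GFP_IO: cannot wait
print(may_wait(GFP_NOFS))    # after the patch: can wait
print(may_wait(GFP_IOFS))    # unchanged by the patch
```

An allocation that may not wait fails immediately under memory pressure, which matches the increase in ENOMEM errors described above.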

Lustre-commit: e6b3d5de62b43a8702aa2dbda4a0bc6b6b10b5c2
Lustre-change: http://review.whamcloud.com/9223

Signed-off-by: Ann Koehler <amk@cray.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I8f0341c34e595d437c06a4564a11e4c52bd4363a
Reviewed-on: http://review.whamcloud.com/9382
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4471 mdd: mdd_unlink: do trans_start after sanity check 79/9379/5
Patrick Farrell [Wed, 22 Jan 2014 19:04:26 +0000 (13:04 -0600)]
LU-4471 mdd: mdd_unlink: do trans_start after sanity check

Currently, mdd_trans_start is called before
mdd_unlink_sanity_check. This means a remote directory
which has files in it can be removed on MDT0 before the
sanity check on MDT1 finds the files and errors, which
orphans the files on MDT1. This patch moves the sanity
check before mdd_trans_create and mdd_trans_start.
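
The corrected ordering, validate first and only then open the transaction, can be modelled in a few lines. This is a Python sketch with hypothetical names; the journal list simply records which steps ran.

```python
import errno


def unlink_remote_dir(remote_entries, journal):
    """Model of the corrected ordering; all names are hypothetical."""
    # Sanity check first: nothing has been modified yet if it fails.
    if remote_entries:
        raise OSError(errno.ENOTEMPTY, "directory not empty")
    # Only now open the transaction and remove the name.
    journal.append("trans_start")
    journal.append("remove_name")
    journal.append("trans_stop")


journal = []
try:
    unlink_remote_dir(["file1"], journal)
except OSError as err:
    print("refused:", err.strerror, "journal:", journal)
```

With the check ahead of the transaction, a failed check leaves no partial change behind on either MDT.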

Lustre-commit: 754a8ac8bf4b907bb579fe441eed1ba3447d3e0f
Lustre-change: http://review.whamcloud.com/8827

Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iac3cb214f16cb8e342be6691013d1cf262775ff0
Reviewed-on: http://review.whamcloud.com/9379
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4597 clio: clear nowait flag agl lock re-enqueue 28/9328/2
Niu Yawei [Thu, 13 Feb 2014 07:07:14 +0000 (02:07 -0500)]
LU-4597 clio: clear nowait flag agl lock re-enqueue

The LDLM_FL_BLOCK_NOWAIT flag should be cleared when re-enqueuing
the agl lock as a normal glimpse; otherwise it won't get the size
back if there are conflicting locks on another client.

Lustre-change: http://review.whamcloud.com/9249
Lustre-commit: 85c352274b3435a41649e9a5089da83b54893d37

Change-Id: I421d033496b57fb3d24635587112eaab3ba2ea32
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-on: http://review.whamcloud.com/9328
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4382 ldiskfs: add quota credit for ldiskfs_delete_inode 34/9334/2
Bobi Jam [Sat, 8 Feb 2014 07:02:13 +0000 (15:02 +0800)]
LU-4382 ldiskfs: add quota credit for ldiskfs_delete_inode

In ldiskfs_delete_inode() we missed the journal credits that may be
needed for a journaled quota change; this patch adds them.

Lustre-commit: c6041da30a189e2bd2f2dd85b6032cb9b99a769d
Lustre-change: http://review.whamcloud.com/9187

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I29d5e2bfa471a08a01b4c316e447cf2fa3df6b82
Reviewed-on: http://review.whamcloud.com/9334
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4460 mount: fix lmd_parse() to handle comma-separated NIDs 29/9029/3
Jian Yu [Tue, 28 Jan 2014 13:12:53 +0000 (21:12 +0800)]
LU-4460 mount: fix lmd_parse() to handle comma-separated NIDs

This patch reverts commit 3917e62018878dfffac59ceed70f20b0419945d3,
which cannot handle the upgrade situation in which old mountdata
already contains comma-separated NIDs. The correct way to fix the
original issue is to parse comma-separated NIDs in lmd_parse().

The patch also updates disk2_4-ldiskfs.tar.bz2 so that the mountdata
of the OST contains comma-separated NIDs, in order to verify the patch
in the upgrade situation.
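
Splitting a comma-separated NID list is simple in itself; a minimal Python sketch follows. The helper name and the sample NIDs are hypothetical, and the real lmd_parse() operates on a larger mount option string that is not modelled here.

```python
def parse_nids(value):
    """Split a comma-separated NID list, ignoring empty entries."""
    return [nid.strip() for nid in value.split(",") if nid.strip()]


# Both a single NID and a comma-separated list must be accepted.
print(parse_nids("192.168.1.10@tcp"))
print(parse_nids("192.168.1.10@tcp,10.0.0.10@o2ib"))
```

Handling the list at parse time means old on-disk data does not have to be rewritten during an upgrade.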

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes testlist=conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I58d6f4f1b74c78fc2198652677571bfd4a57d785
Reviewed-on: http://review.whamcloud.com/9029
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4269 ldlm: Hold lock when clearing flag 46/9346/3
Li Xi [Wed, 8 Jan 2014 09:13:16 +0000 (17:13 +0800)]
LU-4269 ldlm: Hold lock when clearing flag

This patch moves the clearing of the lock's skip flag from the
lru-delete code to the lru-add code, so that the flag is never cleared
without resource lock protection.

Lustre-commit: 98c2e6b446166ba8f89e60a0d4f38683b920f506
Lustre-change: http://review.whamcloud.com/8772

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I6f449ba0e03b936435ffa8a1b158a4987c636862
Reviewed-on: http://review.whamcloud.com/9346
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
10 years ago LU-4098 lmv: kernel crash due to misconfigured MDT 47/9347/2
Dmitry Eremin [Mon, 14 Oct 2013 11:43:27 +0000 (15:43 +0400)]
LU-4098 lmv: kernel crash due to misconfigured MDT

There are a few places that access lmv->tgts[] without checking for
NULL. This usually happens when an MDT is configured starting from
index 1 instead of 0. For example:
  mkfs.lustre --reformat --mgs --mdt --index=1 /dev/sdd1
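
The missing check can be pictured with a small Python model of a sparse
target table. This is only a sketch: the list stands in for
lmv->tgts[], and the helper names and the target name are hypothetical.

```python
def find_target(tgts, index):
    """Return the target at `index`, or None if the slot is absent or empty."""
    if index < 0 or index >= len(tgts):
        return None
    return tgts[index]  # the slot itself may hold None


def active_target_names(tgts):
    """List target names while skipping empty (None) slots."""
    return [tgt["name"] for tgt in tgts if tgt is not None]
```

With an MDT configured at index 1 only, slot 0 stays empty, so every
caller has to test the result before dereferencing it.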

Lustre-commit: caa70178a413c84c400e875866cdbbd368df4758
Lustre-change: http://review.whamcloud.com/7941

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Ieccedcfe7a3a18cb649ef739a077338a8a1b8855
Reviewed-on: http://review.whamcloud.com/9347
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4343 tests: mkdir failing in sanity-hsm test 228 80/9280/3
James Nunez [Thu, 20 Feb 2014 05:51:40 +0000 (13:51 +0800)]
LU-4343 tests: mkdir failing in sanity-hsm test 228

sanity-hsm test 228 calls mkdir on $tdir. Currently, the tdir
variable holds a path that is two directories deep. This is changed
in LU-2524. Until LU-2524 lands, any call to mkdir with the tdir
variable needs the "-p" flag.
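
The same behavior can be reproduced outside the test framework. The
sketch below uses Python's os.mkdir and os.makedirs as stand-ins for
mkdir without and with "-p"; the helper names and the directory name
"d228/sub" are hypothetical.

```python
import os
import tempfile


def make_test_dir(base, tdir):
    """Create base/tdir, including missing parents, like `mkdir -p`."""
    path = os.path.join(base, tdir)
    os.makedirs(path, exist_ok=True)
    return path


def plain_mkdir_fails(base, tdir):
    """Return True if a plain mkdir (no -p) cannot create base/tdir."""
    try:
        os.mkdir(os.path.join(base, tdir))
    except FileNotFoundError:
        return True  # the parent directory does not exist yet
    return False
```

Calling make_test_dir twice on the same path is harmless, which matches
the way "-p" tolerates an already existing directory.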

Also added removal of two files that the test creates and a new
routine to create small files with dd using the sync flag.

Lustre-commit: 2f253abab679c21b41197379b23e36943e5995a7
Lustre-change: http://review.whamcloud.com/8542

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Idd4354d6012032563d41c10238f619251e885e65
Reviewed-on: http://review.whamcloud.com/9280
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4505 quota: race of edquot updating 15/9315/2
Niu Yawei [Wed, 22 Jan 2014 04:24:00 +0000 (23:24 -0500)]
LU-4505 quota: race of edquot updating

The slave edquot flag could be set mistakenly as follows:

- slave A acquires quota from the master; the master finds that the
  user is running out of quota and sets edquot in the reply;
- another slave deletes files and releases quota to the master; the
  master clears edquot and notifies all slaves by glimpse;
- the glimpse reaches slave A before the dqacq reply, so the
  edquot flag ends up set on slave A.

Given that edquot can't be fully trusted, it should only be
revalidated every 5 seconds on the sync acquire path.
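
One way to read the revalidation rule is sketched below in Python. The
class and method names are hypothetical, and the decision logic is an
interpretation of the sentence above, not the code of the patch: a
cached edquot hint is trusted for at most 5 seconds before a sync
acquire goes back to the master.

```python
REVALIDATE_INTERVAL = 5.0  # seconds, the interval named in the commit message


class SlaveQuotaEntry:
    """Hypothetical model of a slave's cached quota state for one ID."""

    def __init__(self):
        self.edquot = False            # cached "out of quota" hint
        self.last_revalidated = None   # time of the last check with the master

    def must_ask_master(self, now):
        """Return True if a sync acquire at time `now` should go to the master."""
        if not self.edquot:
            return True                # nothing cached: acquire normally
        if (self.last_revalidated is None
                or now - self.last_revalidated >= REVALIDATE_INTERVAL):
            self.last_revalidated = now
            return True                # the hint is stale: revalidate it
        return False                   # trust the hint until the interval expires
```

Under this reading, a stale edquot flag left behind by the race above
can delay writes on slave A for a bounded time only.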

Lustre-commit: 109fef5b053490549726f7b5abc9ba840d3a4ae0
Lustre-change: http://review.whamcloud.com/8954

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I553bd1bc3aa6df6c449341e56564073043afd3da
Reviewed-on: http://review.whamcloud.com/9315
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4577 lnet: Dropped messages are not accounted correctly 11/9311/3
Matt Ezell [Mon, 3 Feb 2014 18:19:48 +0000 (13:19 -0500)]
LU-4577 lnet: Dropped messages are not accounted correctly

LNET messages that are dropped are not accounted for correctly in
/proc/sys/lnet/stats. What I assume to be a simple typo is causing
drop_length to be double-counted and drop_count to never be
incremented.
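
The effect of such a typo can be shown with a small Python model. The
counter names follow the message above; the function names and the exact
form of the mistake are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    drop_count: int = 0
    drop_length: int = 0


def account_drop_buggy(counters, nob):
    """Assumed form of the typo: the length is added twice."""
    counters.drop_length += nob
    counters.drop_length += nob  # was presumably meant to bump drop_count


def account_drop_fixed(counters, nob):
    """Count one dropped message of `nob` bytes."""
    counters.drop_count += 1
    counters.drop_length += nob
```

After one dropped 100-byte message, the buggy variant reports a count of
0 and a length of 200, while the fixed variant reports 1 and 100.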

Lustre-change: http://review.whamcloud.com/9096
Lustre-commit: 3abb0bb5f82559f2f5349dca763cf6edc7f6754b

Change-Id: Ia8454221885a1d765a3f7fadbcf5582fcfe7cf09
Signed-off-by: Matt Ezell <ezellma@ornl.gov>
Reviewed-on: http://review.whamcloud.com/9311
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4620 kernel: kernel update [RHEL6.5 2.6.32-431.5.1.el6] 18/9318/2
Bob Glossman [Thu, 13 Feb 2014 01:08:18 +0000 (17:08 -0800)]
LU-4620 kernel: kernel update [RHEL6.5 2.6.32-431.5.1.el6]

update RHEL6.5 kernel to 2.6.32-431.5.1.el6

Lustre-commit: 6fa2299177e8749e63a88047aee49b5a9af6c3ef
Lustre-change: http://review.whamcloud.com/9253

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I54c6476c84650d7419d70fb89efc2680af6ecabe
Reviewed-on: http://review.whamcloud.com/9318
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-2687 test: add b2_4 zfs image for conf-sanity test_32a 24/8824/2
Wei Liu [Mon, 22 Jul 2013 22:07:08 +0000 (15:07 -0700)]
LU-2687 test: add b2_4 zfs image for conf-sanity test_32a

In order to ensure that we do not break ZFS upgrades
in the future, add 2.4.0 zfs filesystem test image for
conf-sanity.sh test_32a.

Test-Parameters: mdtfilesystemtype=zfs \
ostfilesystemtype=zfs mdsfilesystemtype=zfs \
envdefinitions=SLOW=yes testlist=conf-sanity

Change-Id: Iae560e05b428907409dc7069d30b601b52750cca
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/8824
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4484 lbuild: add support for fresh versions of MPSS 3.x.x
Dmitry Eremin [Tue, 14 Jan 2014 11:36:55 +0000 (15:36 +0400)]
LU-4484 lbuild: add support for fresh versions of MPSS 3.x.x

* Adapt the lbuild script for the new version of MPSS with x.x.x notation.
* Remove the dependency on the MPSS package to avoid renaming issues in
  the future. The package that was used for the dependency
  was renamed in MPSS.
* Use the new server with MPSS released packages for download.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ie4407ad00177ad6d22770230a4dc6bde967d91ef
Reviewed-on: http://review.whamcloud.com/8836
Tested-by: Jenkins
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3968 lbuild: Extend script with build for Xeon Phi card
Dmitry Eremin [Fri, 30 Aug 2013 18:29:50 +0000 (22:29 +0400)]
LU-3968 lbuild: Extend script with build for Xeon Phi card

Automatically download, compile and produce Lustre client RPMs
for Xeon Phi(TM) card if "--mpss-version" option is specified
for contrib/lbuild/lbuild script.

Also try to compile with Xeon Phi(TM) OFED if it's available.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ida07d764dc824c13f22ffb53d24e2c6f79ce3573
Reviewed-on: http://review.whamcloud.com/7066
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4613 tests: purge older request result in test_12o 95/9295/2
Bruno Faccini [Wed, 12 Feb 2014 09:52:07 +0000 (10:52 +0100)]
LU-4613 tests: purge older request result in test_12o

The sanity-hsm/test_12o sub-test, which was introduced as part
of LU-3834, submits 2 RESTORE requests for the same FID and thus
needs to purge the 1st result from the log before checking the 2nd.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ia2a0ead487b29a68c8a920bae2aa1d654eac4051
Reviewed-on: http://review.whamcloud.com/9295
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4442 test: add version check for replay-vbr.sh test_7g 90/9290/3
Emoly Liu [Tue, 18 Feb 2014 11:53:20 +0000 (19:53 +0800)]
LU-4442 test: add version check for replay-vbr.sh test_7g

In replay-vbr.sh test_7g.3, because mdt_object_exists() was added
in http://review.whamcloud.com/#/c/8371, client will not be evicted
without object version check.

The patch also fixes the wrong usage of wait_mds_ost_sync() in
replay-vbr.sh test_7_cycle(). The first parameter should be a timeout
in seconds, not a facet.

Test-Parameters: envdefinitions=SLOW=yes,ONLY=7 testlist=replay-vbr

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I4a960fc53451fc717370bc96f926f067bbb2946a
Reviewed-on: http://review.whamcloud.com/9290
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4386 osc: don't activate deactivated obd_import 83/9283/2
Hongchao Zhang [Thu, 5 Sep 2013 13:50:48 +0000 (21:50 +0800)]
LU-4386 osc: don't activate deactivated obd_import

In ptlrpc_activate_import(), obd_import->imp_deactive should
be checked to see whether the import is deactivated; otherwise
activating it will trigger an LBUG in ptlrpc_invalidate_import():

  ptlrpc_invalidate_import() ASSERTION(imp->imp_invalid) failed

Change-Id: I4c16f166c0c2cf60664119bf438dfd8606d71a2f
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/9283
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4293 mdd: Allow layout swap for IGIF FIDs 78/9278/2
Bruno Faccini [Mon, 6 Jan 2014 09:25:47 +0000 (10:25 +0100)]
LU-4293 mdd: Allow layout swap for IGIF FIDs

Patch to also allow layout swap for pre-2.x migrated
files (i.e., IGIF FID with linkEA).

Root user special case has also been added to lfs/migrate
command to map owner/group of original file to
volatile, in order to comply with other layout_swap rules.

Lustre-commit: bd5ba50502bec5786c9a2f05c29f7b99a35147fb
Lustre-change: http://review.whamcloud.com/8737

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iad6194c6050fa2ba066d2051871a10a60ddae995
Reviewed-on: http://review.whamcloud.com/9278
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4208 osd-zfs: hold pool config lock to register property 56/9256/2
Ned Bass [Mon, 4 Nov 2013 23:07:11 +0000 (15:07 -0800)]
LU-4208 osd-zfs: hold pool config lock to register property

- Hold the DSL pool configuration lock when calling
  dsl_prop_register().  Failure to do so will panic the node if
  assertions are enabled in ZFS.  This change requires a build of ZFS
  on Linux that exports symbols dsl_pool_config_enter and
  dsl_pool_config_exit, which was done in commit 40a806d [1], and will
  appear in ZFS release 0.6.3.

- Fix up variable declaration alignment in osd_mount().

- Add check for exported symbols in autoconf

[1] https://github.com/zfsonlinux/zfs/commit/40a806d

Lustre-change: http://review.whamcloud.com/8172
Lustre-commit: f8bc2f7fc03d3f86eef434cf644191e689ee57ec

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Ice673efb5501456d1a4f423ec08dfb4f571f8221
Reviewed-on: http://review.whamcloud.com/9256
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4429 llite: fix open lock matching in ll_md_blocking_ast() 60/9260/2
John L. Hammond [Fri, 3 Jan 2014 23:31:53 +0000 (17:31 -0600)]
LU-4429 llite: fix open lock matching in ll_md_blocking_ast()

In ll_md_blocking_ast() match open locks before all others, ensuring
that MDS_INODELOCK_OPEN is not cleared from bits by another open lock
with a different mode. Change the int flags parameter of
ll_md_real_close() to fmode_t fmode. Clean up various style issues in
both functions.

Lustre-commit: 2b23ad0d183141dc25377f2d37de6e6e36ba1169
Lustre-change: http://review.whamcloud.com/8718

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Ic44ac8ac8c07b71d4c929d7d359bee881c6b05b0
Reviewed-on: http://review.whamcloud.com/9260
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4287 kernel: kernel update RHEL6.5 [2.6.32-431.3.1.el6] 03/9103/5
yangsheng [Wed, 8 Jan 2014 16:03:17 +0000 (00:03 +0800)]
LU-4287 kernel: kernel update RHEL6.5 [2.6.32-431.3.1.el6]

Add RHEL6.5 support [2.6.32-431.3.1.el6]

ext4 in RHEL6.5's kernel version 2.6.32-431.3.1.el6 no longer contains
the required function ext4_ext_walk_space(). We start a new rhel6.5
ldiskfs patch series and reintroduce ext4_ext_walk_space() through a
new patch, copying ext4_ext_walk_space() from the older rhel6.4 kernel
2.6.32-358.23.2.el6.

Lustre-commit: efa8fa578d2f7eeeaea11522dd311dddaa715a03
Lustre-change: http://review.whamcloud.com/8549

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: I5cff1860c43d06a6399b43f92ef90283c4600c8e
Reviewed-on: http://review.whamcloud.com/9103
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4154 lfsck: skip old lfsck test in DNE mode 76/9176/2
Emoly Liu [Fri, 7 Feb 2014 10:47:37 +0000 (18:47 +0800)]
LU-4154 lfsck: skip old lfsck test in DNE mode

The old e2fsck/lfsck tool will not be allowed to run on a DNE
filesystem. This patch updates generate_db() to pass master MDS
parameters only, so that the old lfsck does not corrupt it or
delete all of the files on other MDTs.
This patch also fixes a typo in run_lfsck_remote().

This patch is back-ported from the following one:
Lustre-commit: b5f3d6db9200e369a68284a8ef85a1205e5905e1
Lustre-change: http://review.whamcloud.com/8206

Test-Parameters: alwaysuploadlogs mdtcount=4 testlist=lfsck
Test-Parameters: alwaysuploadlogs mdtcount=1 testlist=lfsck

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I5590b20c1c0003dbb1975a254093724de22497d4
Reviewed-on: http://review.whamcloud.com/9176
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-946 lprocfs: List open files in filesystem
Girish Shilamkar [Sun, 19 May 2013 08:27:00 +0000 (16:27 +0800)]
LU-946 lprocfs: List open files in filesystem

Added lprocfs file on MDT to list open files in per-export
directory for mdt.

Test-Parameters: testlist=sanity,sanityn
Signed-off-by: Girish Shilamkar <gshilamkar@ddn.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: If8f233d95dca4cd4c4044d85bd117a027dabd80e
Reviewed-on: http://review.whamcloud.com/6386
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Swapnil Pimpale <spimpale@ddn.com>
10 years ago LU-3528 mdt: check object exists for remote directory 13/9213/2
wang di [Tue, 17 Dec 2013 00:06:22 +0000 (16:06 -0800)]
LU-3528 mdt: check object exists for remote directory

Check whether the remote object exists before enqueue and
getattr to avoid LBUG.

Remove unnecessary remote object existence check in mdd_object_lock.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ia634a8c7b9cd2810515e854163c5fdd6bdf8716f
Reviewed-on: http://review.whamcloud.com/8371
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4430 mdt: check for MDS_FMODE_EXEC in mdt_mfd_open()
John L. Hammond [Fri, 3 Jan 2014 23:42:08 +0000 (17:42 -0600)]
LU-4430 mdt: check for MDS_FMODE_EXEC in mdt_mfd_open()

In the error path of mdt_mfd_open() check for MDS_FMODE_EXEC rather
than FMODE_EXEC in the open flags.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I04c53eb1af0fdeeb2c2b0c2f2ef1340b247921d8
Reviewed-on: http://review.whamcloud.com/8719
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4260 lod: free striping if striping initialization fails
wang di [Mon, 18 Nov 2013 08:18:09 +0000 (00:18 -0800)]
LU-4260 lod: free striping if striping initialization fails

Striping should be freed if initialization of the striping information
fails; otherwise a later object lookup will pick up this wrong lod
object and hit an LBUG:

ASSERTION( lc->ldo_stripenr == 0 ) failed:

[<ffffffffa0349895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0349e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0e3f78f>] lod_ah_init+0x57f/0x5c0 [lod]
[<ffffffffa0b73a83>] mdd_object_make_hint+0x83/0xa0 [mdd]
[<ffffffffa0b7feb2>] mdd_create_data+0x332/0x7d0 [mdd]
[<ffffffffa0d9cc2c>] mdt_finish_open+0x125c/0x18a0 [mdt]
[<ffffffffa0d984f8>] ? mdt_object_open_lock+0x1c8/0x510 [mdt]
[<ffffffffa0d9ee8d>] mdt_reint_open+0x115d/0x20c0 [mdt]

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I67b2bd0e013b860767d19eda986fdcff7e16c486
Reviewed-on: http://review.whamcloud.com/8324
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3772 ptlrpc: fix nrs cleanup
Niu Yawei [Thu, 14 Nov 2013 04:48:00 +0000 (23:48 -0500)]
LU-3772 ptlrpc: fix nrs cleanup

When service start fails due to a shortage of memory, the cleanup code
could operate on an uninitialized structure and eventually cause a crash.

This patch fixes nrs_svcpt_cleanup_locked() to perform cleanup only
on an nrs that has been properly initialized.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ieafa5b144133490b662f5a80a7b99311a9970de3
Reviewed-on: http://review.whamcloud.com/7410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3857 osd: cleanup procfs after osd_shutdown
wangdi [Sun, 8 Dec 2013 08:00:03 +0000 (00:00 -0800)]
LU-3857 osd: cleanup procfs after osd_shutdown

Since osd_procfs_fini will try to clean up all proc entries,
and osd_shutdown/qsd_fini will try to clean up its own procfs
entries, osd_procfs_fini should be called after qsd_fini;
otherwise the qsd entries will be destroyed twice, causing
a panic.

Call Trace:
 [<ffffffffa081cc45>] lprocfs_remove+0x25/0x40 [obdclass]
 [<ffffffffa0b23dd0>] qsd_fini+0x80/0x450 [lquota]
 [<ffffffffa0d2ec78>] osd_shutdown+0x38/0xe0 [osd_ldiskfs]
 [<ffffffffa0d36bf9>] osd_device_fini+0x129/0x190 [osd_ldiskfs]
 [<ffffffffa0834913>] class_cleanup+0x573/0xd30 [obdclass]
 [<ffffffffa081233c>] ? class_name2dev+0x7c/0xe0 [obdclass]
 [<ffffffffa083663a>] class_process_config+0x156a/0x1ad0 [obdclass]
 [<ffffffffa06be9b8>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa082f202>] ? lustre_cfg_new+0x312/0x6e0 [obdclass]
 [<ffffffffa0836d19>] class_manual_cleanup+0x179/0x6e0 [obdclass]
 [<ffffffffa06be9b8>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa0d378b4>] osd_obd_disconnect+0x174/0x1e0 [osd_ldiskfs]
 [<ffffffffa083926b>] lustre_put_lsi+0x1ab/0xeb0 [obdclass]
 [<ffffffffa08414d8>] lustre_common_put_super+0x5c8/0xbe0 [obdclass]
 [<ffffffffa087081d>] server_put_super+0x1bd/0xed0 [obdclass]
 [<ffffffffa0871bbb>] server_fill_super+0x68b/0x1630 [obdclass]
 [<ffffffffa0840bb8>] lustre_fill_super+0x1d8/0x530 [obdclass]
 [<ffffffffa08409e0>] ? lustre_fill_super+0x0/0x530 [obdclass]

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: If12cebf971583afeeaa031bd24f69bb0fe0cdf1a
Reviewed-on: http://review.whamcloud.com/8506
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-2818 mdt: Properly handle ENOMEM
Oleg Drokin [Tue, 21 Jan 2014 18:53:26 +0000 (13:53 -0500)]
LU-2818 mdt: Properly handle ENOMEM

When osd_keys_init fails in mdt_lvbo_fill, properly bail out with
error instead of asserting.

Change-Id: I832742ed49cc7740d8e709bc4b87e5d5aa100d39
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8947
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3618 ptlrpc: rq_commit_cb is called for twice
Liang Zhen [Sun, 12 Jan 2014 16:11:47 +0000 (00:11 +0800)]
LU-3618 ptlrpc: rq_commit_cb is called for twice

If a ptlrpc_request is already on imp::imp_replay_list, when it's
replayed and replied, after_reply() will call req::rq_commit_cb
for the request, then call it again in ptlrpc_free_committed.
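
One common guard against a double callback is sketched below in Python.
This is an illustration of the problem, not the fix applied by the
patch; the Request class, its committed flag, and run_commit_cb are
hypothetical names.

```python
class Request:
    """Hypothetical request that carries a commit callback."""

    def __init__(self, commit_cb):
        self.commit_cb = commit_cb
        self.committed = False  # set once the callback has run

    def run_commit_cb(self):
        """Invoke the commit callback at most once."""
        if self.committed:
            return
        self.committed = True
        self.commit_cb(self)
```

Both code paths named above could then call run_commit_cb() without the
callback firing a second time.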

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I796c3351ad896aa3e1d0c2147ca7f775b7c14bfc
Reviewed-on: http://review.whamcloud.com/8815
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4454 libcfs: warn if all HTs in a core are gone
Liang Zhen [Wed, 8 Jan 2014 06:51:17 +0000 (14:51 +0800)]
LU-4454 libcfs: warn if all HTs in a core are gone

The libcfs CPU partition code does not support CPU hotplug, but
plugging in a new CPU or enabling/disabling hyper-threading is safe.
Only unplugging a CPU is potentially risky, because it may break the
CPU affinity of Lustre threads.

Currently libcfs prints a warning for every CPU notification. This
patch changes that behavior to print a warning only when all HTs in
a CPU core are lost, which may have broken the affinity of Lustre
threads.
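
The warning condition can be expressed as a set test, sketched here in
Python. The function name and the CPU numbering are hypothetical; the
real code works on kernel cpumasks.

```python
def lost_cores(core_siblings, online_cpus):
    """Return the ids of cores that have no hyper-thread left online.

    core_siblings maps a core id to the set of CPU ids (HTs) in that core.
    """
    return sorted(core for core, hts in core_siblings.items()
                  if not (hts & online_cpus))
```

Losing one HT of a core leaves the result empty, so no warning would be
printed; the core shows up only once its last HT goes offline.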

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I62267b62871c129beeb1593c4f69e7b81a79999d
Reviewed-on: http://review.whamcloud.com/8770
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3601 Do not create layout in lease-open
Oleg Drokin [Tue, 29 Oct 2013 02:20:01 +0000 (22:20 -0400)]
LU-3601 Do not create layout in lease-open

Leases are not real opens, so it makes no sense to create layouts
when the lease is taken.

Change-Id: Ica2d6a348c360bd20bb7bd27061839df84dae84b
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/8084
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
10 years ago LU-4152 mdt: Don't enqueue two locks on the same resource
Oleg Drokin [Tue, 29 Oct 2013 02:15:03 +0000 (22:15 -0400)]
LU-4152 mdt: Don't enqueue two locks on the same resource

Due to the mechanics of ldlm internals, enqueueing two different ibits
locks on the same resource is deadlock-prone.
As such, change mdt_object_open_lock to release the open lock if it
becomes necessary to get an exclusive layout lock (to create objects).
It's OK to release the open lock right away, as it's never guaranteed
to be issued anyway.

Change-Id: Ib669e68323ea72c75a0a8bea289d8bea079309b0
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8083
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3834 mdt: handle swap_layouts failures during restore 12/9212/2
Bruno Faccini [Tue, 10 Dec 2013 09:55:59 +0000 (10:55 +0100)]
LU-3834 mdt: handle swap_layouts failures during restore

Currently nothing is done after a swap_layouts failure during restore.
This can leave the file in an incoherent state and thus
unavailable, because HS_RELEASED is clear but LOV_PATTERN_F_RELEASED
is still set.
This patch allows the original layout to be recovered by use of the
SWAP_LAYOUTS_MDS_HSM flag. Additionally, this requires the HSM xattr of
the data FID to be set.
It also adds layout-swap failure injection and a related test.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Id0e9a005362e4a3854b33f6ce1888197d20e7dbf
Reviewed-on: http://review.whamcloud.com/7631
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4336 quota: improper assert in osc_quota_chkdq()
Niu Yawei [Tue, 3 Dec 2013 01:57:40 +0000 (20:57 -0500)]
LU-4336 quota: improper assert in osc_quota_chkdq()

In osc_quota_chkdq(), we should never try to access oqi found
from hash, since it could have been freed by osc_quota_setdq().

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ia73cf89cb5bbd730fa6f0a00e44771f733b2baa6
Reviewed-on: http://review.whamcloud.com/8460
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
10 years ago LU-4253 osc: Don't flush active extents.
Ann Koehler [Thu, 14 Nov 2013 22:02:15 +0000 (16:02 -0600)]
LU-4253 osc: Don't flush active extents.

The extent is active so we need to abort and let the caller
re-dirty the page. If we continued on here, and we were the
one making the extent active, we could deadlock waiting for
the page writeback to clear but it won't because the extent
is active and won't be written out.

Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: Iba646d8185b12ab227fe0bbee1c6602ccdc32ad6
Reviewed-on: http://review.whamcloud.com/8278
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3373 misc: small changes for 3.10 server support 13/9013/6
yangsheng [Sat, 21 Sep 2013 17:24:35 +0000 (01:24 +0800)]
LU-3373 misc: small changes for 3.10 server support

--quota use struct kqid as parameter
--export ext4_dec/inc_count for nlink count
--ext4_find_entry & ext4_journal_start_sb changes
--iop->truncate removed
--other trivial changes to silence compiler warnings

Lustre-commit: 9bd7e40d2934cd0162eeff5388f054444a982ac9
Lustre-change: http://review.whamcloud.com/7794

Signed-off-by: yang sheng <yang.sheng@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia2670d925ecbbfcc1ed3abb1a15a8d91fa27bd32
Reviewed-on: http://review.whamcloud.com/9013
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4106 scrub: Trigger OI scrub properly 06/9006/3
Fan Yong [Mon, 27 Jan 2014 06:20:44 +0000 (14:20 +0800)]
LU-4106 scrub: Trigger OI scrub properly

There is the following race case between osd_fid_lookup() and object
unlink/destroy:

Both RPC service thread_1 and RPC service thread_2 try to find the
same obj_A at the same time. At the beginning, obj_A is not in the
cache. thread_1 is in osd_fid_lookup() and finds the OI mapping
for obj_A. But before thread_1 finds the related inode_A,
thread_2 moves faster, finds inode_A, and unlinks it.
So thread_1 fails to find inode_A. In that case, thread_1 checks
the OI again to see whether the related OI mapping is still there.
If there is no OI mapping, this is normal because someone has
unlinked the file in a race; otherwise, it may be caused by
file-level backup/restore, and thread_1 triggers OI scrub to
rebuild the OI files.

But we ignored a corner case: thread_1 may recheck the OI files
just after thread_2 has dropped inode_A's reference count to zero
but before it removes the related OI mapping from the OI file. Then
thread_1 is misled and triggers OI scrub unexpectedly.

More initial OI scrub for the /ROOT/.lustre directory to make sure
the necessary files/directories for mount are ready before used.

This patch also enhances the ls_locate()/dt_locate_at() interface
to allow the caller to pass some hints to low layer, such as flag
LOC_F_NEW for create, to help the low layer to handle efficiently
and properly.

This patch is back-ported from the following ones:
Lustre-commit: 8931d9070415e808e09bb4befd7cd38ef2431149
Lustre-change: http://review.whamcloud.com/8002
and
Lustre-commit: bab8a7dd5597014ee68e52bd39bde0ed40711777
Lustre-change: http://review.whamcloud.com/8101

Test-Parameters: mdtcount=4 testlist=sanity-scrub

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5259549340f97a2f9118ab1db081f2ab2cfd8933
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/9006
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4178 tests: disable HSM sanity subtests 34/9134/2
Bob Glossman [Thu, 31 Oct 2013 18:10:24 +0000 (11:10 -0700)]
LU-4178 tests: disable HSM sanity subtests

Turn off high failure rate subtests in sanity-hsm.
If fixes for these failures land, the tests may be turned on again later.

Lustre-commit: d85d724d9a3a503718f6df840be67e5f6f5af78c
Lustre-change: http://review.whamcloud.com/8122

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I2e3f3822492a3398ebcbd2ba4565455986515764
Reviewed-on: http://review.whamcloud.com/9134
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4554 lfsck: old single-OI MDT always scrubbed 39/9139/2
James Nunez [Wed, 5 Feb 2014 17:15:29 +0000 (10:15 -0700)]
LU-4554 lfsck: old single-OI MDT always scrubbed

Old ldiskfs MDTs that contain a single OI container named "oi.16"
trigger an automatic OI scrub on each restart. This is because
osd_oi_table_open() gets ENOENT opening "oi.16.0" and consequently
sets bit 0 in scrub_file::sf_oi_bitmap. This bit indicates the OI
container 0 needs to be recreated, and it triggers a scrub in
osd_fid_lookup() for lookups that fail with ENOENT. Fix this by
clearing the bit in osd_oi_init() after a successful open of
"oi.16".

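The bit handling described above can be modelled in a few lines of
Python. This is a sketch only: the function name open_oi, the
clear_on_legacy_open switch, and the use of a set of file names in place
of the on-disk lookup are all hypothetical.

```python
def open_oi(existing, clear_on_legacy_open=True):
    """Return the recreate bitmap after trying to open the OI containers.

    existing: set of OI container names present on the device.
    """
    bitmap = 0
    if "oi.16.0" not in existing:
        bitmap |= 1 << 0  # container 0 looks missing (ENOENT)
        if "oi.16" in existing and clear_on_legacy_open:
            bitmap &= ~(1 << 0)  # the single old container opened fine
    return bitmap
```

With only "oi.16" present, the old behavior leaves bit 0 set (value 1),
which is what kept triggering the scrub; with the clearing step the
bitmap is 0.
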
Lustre-change: http://review.whamcloud.com/#/c/9067
Lustre-commit: b4159f5d722bc43cff82b4c45336b01fd769e1db

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I3f19b15b51fce85bf791df76389f0b28951356c3
Reviewed-on: http://review.whamcloud.com/9139
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-4512 hsm: Fix lhsmtool_posix --report option 34/8934/5
Michael MacDonald [Mon, 20 Jan 2014 17:08:28 +0000 (12:08 -0500)]
LU-4512 hsm: Fix lhsmtool_posix --report option

The --report option is intended to allow an override of the
default copytool progress reporting interval, but it doesn't
work. This commit implements the intended functionality and
renames the option to "--update-progress", or "-u" for short.

Also fixes the progress display in hsm/active_requests to
reflect the change from percentage complete to bytes moved.

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Change-Id: Id6ead1b33868e3454f00053165944bc3900cabb4
Reviewed-on: http://review.whamcloud.com/8934
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4589 kernel: kernel update [SLES11 SP3 3.0.101-0.15] 50/9150/2
Bob Glossman [Wed, 5 Feb 2014 19:04:54 +0000 (11:04 -0800)]
LU-4589 kernel: kernel update [SLES11 SP3 3.0.101-0.15]

update target and config files for new kernel version

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I28496e7dc9322dcf7fec4493602042cc89db6fec
Reviewed-on: http://review.whamcloud.com/9150
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4263 osd-zfs: Avoid converting last ID FIDs to OST IDs
Li Wei [Fri, 30 Aug 2013 07:12:40 +0000 (15:12 +0800)]
LU-4263 osd-zfs: Avoid converting last ID FIDs to OST IDs

When obdfilter-survey first creates an object on a fresh ZFS OST, the
last ID object for FID_SEQ_ECHO has to be created first.
The last ID FID, [FID_SEQ_ECHO:0:0], can not be converted to an OST ID
because the resulting OST ID would be indistinguishable from an
FID_SEQ_OST_MDT0 OST ID and would confuse ostid_id().  This patch
checks for last ID FIDs before converting them to OST IDs in
osd_get_idx_for_ost_obj().

Change-Id: I96cdf85b4725e4882cecabaf90466c7f77a5e0a6
Intel-bug-id: FF-182
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/8301
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4293 utils: handle lfs migrate failure in lfs_migrate
Andreas Dilger [Wed, 18 Dec 2013 08:50:56 +0000 (01:50 -0700)]
LU-4293 utils: handle lfs migrate failure in lfs_migrate

If "lfs migrate" returns an error, possibly because it is refusing
to migrate an IGIF FID, fall back to using rsync to copy the file
and rename it.  Print a message in this case so the user knows it
is not a fatal error yet.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I114006afb93d8c8d78923a874f3b914200500c1e
Reviewed-on: http://review.whamcloud.com/8616
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3806 obdclass: add LCT_SERVER_SESSION for server session
wang di [Wed, 21 Aug 2013 07:04:43 +0000 (00:04 -0700)]
LU-3806 obdclass: add LCT_SERVER_SESSION for server session

Add LCT_SERVER_SESSION for server session, and separate the
server session flag from LCT_SESSION, to avoid allocating
client-stack session info for each server request when the
client and server are on the same node.

Signed-off-by: Wang Di <di.wang@intel.com>
Change-Id: I808c3f58cd7a03ebc166e51fe1e32ea34ae0e3e8
Reviewed-on: http://review.whamcloud.com/7412
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4194 ldlm: Make OBD_[ALLOC|FREE]_LARGE use consistent
Christopher J. Morrone [Fri, 15 Nov 2013 21:40:19 +0000 (13:40 -0800)]
LU-4194 ldlm: Make OBD_[ALLOC|FREE]_LARGE use consistent

struct ldlm_lock's l_lvb_data field is freed in ldlm_lock_put()
using OBD_FREE.  However, some other code paths can attach
a buffer to l_lvb_data that was allocated using OBD_ALLOC_LARGE.
This can lead to a kfree() of a vmalloc()ed buffer, which can
trigger a kernel Oops.

Change-Id: Ic75a67530862eeb4d065c14bbbac80939bff5731
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/8298
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4194 ldlm: set l_lvb_type coherent when layout is returned
Bruno Faccini [Thu, 14 Nov 2013 16:20:00 +0000 (17:20 +0100)]
LU-4194 ldlm: set l_lvb_type coherent when layout is returned

In case layout has been packed into server reply when not
requested, lock l_lvb_type must be set accordingly.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Iaf54c9ba27785e529f4f2fb967d2fad4fc1dfbcb
Reviewed-on: http://review.whamcloud.com/8270
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
10 years ago LU-4196 build: Build support for OFED-3.5 and SLES 11 82/8882/4
Chris Horn [Mon, 27 Jan 2014 02:45:16 +0000 (10:45 +0800)]
LU-4196 build: Build support for OFED-3.5 and SLES 11

CONFIG_COMPATE_SLES_11_SP* needed in EXTRA_LNET_INCLUDE to allow
building against OFED-3.5

Lustre-commit: 369e02e84f39565195e08f043ab0421d2d3bd185
Lustre-change: http://review.whamcloud.com/8140

Test-Parameters: clientdistro=sles11sp3 ossdistro=sles11sp3 \
mdsdistro=sles11sp3 nettypes=o2ib clientibstack=inkernel \
ossibstack=inkernel mdsibstack=inkernel testlist=sanity

Signed-off-by: Chris Horn <hornc@cray.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ib26c757044aff828c3bbbd3adfd5fb709cca9cf0
Reviewed-on: http://review.whamcloud.com/8882
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4124 build: make module installation directory flexible 15/8315/4
Stephen Champion [Fri, 18 Oct 2013 22:02:45 +0000 (15:02 -0700)]
LU-4124 build: make module installation directory flexible

Add --with-kmp-moddir option to configure.

Distributions vary in the installation directory for kernel modules.

The RHEL standard installation directory is
        /lib/modules/$(uname -r)/extra
while the SLES standard is
        /lib/modules/$(uname -r)/updates

Adding the --with-kmp-moddir option to configure allows users
to select the appropriate installation target.  With this change, it
is necessary to support both options in the test framework.

Signed-off-by: Stephen Champion <schamp@sgi.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/8065
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Iec3137e0e5039dd43622c2e285030a5339fa6fd3
Reviewed-on: http://review.whamcloud.com/8315
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
10 years ago LU-3679 lnet: reflect down routes in /proc/sys/lnet/routes 95/8195/2
Chris Horn [Wed, 23 Oct 2013 17:12:40 +0000 (12:12 -0500)]
LU-3679 lnet: reflect down routes in /proc/sys/lnet/routes

We consider routes "down" if the router is down or the router
NI for the target network is down. This should be reflected
in the output of /proc/sys/lnet/routes

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I82ee769d88aec92f1690ad9c095e32c9a9f9e282
Reviewed-on: http://review.whamcloud.com/7857
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8195
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
10 years ago LU-4231 llite: proper support of NFS anonymous dentries 98/8498/3
Dmitry Eremin [Wed, 20 Nov 2013 18:35:11 +0000 (22:35 +0400)]
LU-4231 llite: proper support of NFS anonymous dentries

NFS can ask to encode dentries that are not connected to the root.
The fix checks whether the parent is NULL and encodes the file handle
accordingly.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Idba91fd4bca4f26a37fd9bc76a340d2fbf557c9e
Reviewed-on: http://review.whamcloud.com/8347
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8498

10 years ago LU-4444 tests: Skip conf-sanity/69 on zfs 54/8854/2
Nathaniel Clark [Wed, 15 Jan 2014 10:59:14 +0000 (18:59 +0800)]
LU-4444 tests: Skip conf-sanity/69 on zfs

Because file creates happen slowly on ZFS and the number of files
required to run the test is 100K, this test cannot run in a
reasonable amount of time.

Also bail out of the test if createmany fails (possible if the MDS or
OST is too small); this prevents the test from just timing out.

This patch is back-ported from the following one:
Lustre-commit: eb38c458c868d5389e2641189218f22ad1272aef
Lustre-change: http://review.whamcloud.com/8841

Test-Parameters: envdefinitions=SLOW=yes testlist=conf-sanity
Test-Parameters: envdefinitions=SLOW=yes testlist=conf-sanity \
mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I6d9daad3239b576935190a121a2aa818441ec97b
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8854
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3189 tests: add version check code into sanity test 53 34/8834/3
Jian Yu [Tue, 14 Jan 2014 09:03:00 +0000 (17:03 +0800)]
LU-3189 tests: add version check code into sanity test 53

This patch adds Lustre version check code into sanity test
53 to make the test work with servers that do not have the
following patch:

Lustre-commit: 6c4c51e3079e6c257fbf86536e4739110c166e3b
Lustre-change: http://review.whamcloud.com/4789

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=53 \
ossjob=lustre-b2_3 mdsjob=lustre-b2_3 ossbuildno=41 mdsbuildno=41 \
testlist=sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ie6eaeee31780f4ea4077805f52efda279ff96670
Reviewed-on: http://review.whamcloud.com/8834
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4222 mdt: extra checking for getattr RPC.
wang di [Wed, 18 Dec 2013 08:01:45 +0000 (00:01 -0800)]
LU-4222 mdt: extra checking for getattr RPC.

Check whether getattr RPC can hold layout MD(RMF_MDT_MD),
in case the client sends some invalid RPC, which can
cause panic on MDT.

Client will retrieve cl_max_md_size/cl_default_md_size
from MDS during mount process, so it will initialize
cl_max_md_size/cl_default_md_size before sending getattr
to MDS.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I43bbe54c37360242bb7a3cd2aa8d90c2b9e0baf1
Reviewed-on: http://review.whamcloud.com/8599
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-4360 Fix use after free in ksocknal_send
Oleg Drokin [Sat, 28 Dec 2013 03:31:15 +0000 (22:31 -0500)]
LU-4360 Fix use after free in ksocknal_send

Call to ksocknal_launch_packet might schedule a callback that
might free the just sent message, and so subsequent access to it
via lntmsg->msg_vmflush goes to freed memory.

Instead we'll just remember if we are in the vmflush thread and
only restore if we happened to set mempressure flag.
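
The pattern described above can be sketched in userspace C. This is an
illustration only, not the ksocknal code: msg_sketch, mempressure,
launch_packet_sketch() and send_sketch() are hypothetical stand-ins, and
the launch routine frees the message unconditionally to model the worst
case. The point is that the decision to restore is kept in a local
variable taken before the call, so the message is never read afterwards.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the message; only the flag is modelled. */
struct msg_sketch {
	bool msg_vmflush;
};

/* Hypothetical stand-in for the thread's memory-pressure state. */
static bool mempressure;

/* Models a launch routine whose completion path may free the message. */
static int launch_packet_sketch(struct msg_sketch *msg)
{
	free(msg);
	return 0;
}

static int send_sketch(struct msg_sketch *msg)
{
	bool set_by_us = false;
	int rc;

	/* Read the flag while the message is still guaranteed valid. */
	if (msg->msg_vmflush && !mempressure) {
		mempressure = true;
		set_by_us = true;
	}

	rc = launch_packet_sketch(msg);

	/* msg may be freed here: rely on the local, never on msg. */
	if (set_by_us)
		mempressure = false;

	return rc;
}
```

If the flag was already set on entry, the sketch leaves it set on exit,
which matches "only restore if we happened to set" it.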

Change-Id: I2f0f8b27e26e11b37ad60fde4c98e86c39768349
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3680 ptlrpc: Fix assertion failure of null_alloc_rs()
Patrick Farrell [Fri, 22 Nov 2013 16:47:54 +0000 (10:47 -0600)]
LU-3680 ptlrpc: Fix assertion failure of null_alloc_rs()

lustre_get_emerg_rs() set the size of the reply buffer to zero
by mistake, which will cause LBUG in null_alloc_rs() when memory
pressure is high. This patch fixes this problem and adds a size
check to avoid the problem of insufficient buffer size.
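
A minimal userspace sketch of the two ideas, under stated assumptions:
reply_sketch, emerg_pool, EMERG_RS_SIZE and get_emerg_reply() are
hypothetical names, and the sketch assumes the size was lost because the
whole structure was cleared before reuse, which the message above does
not spell out. It is not the ptlrpc implementation.

```c
#include <stddef.h>
#include <string.h>

#define EMERG_RS_SIZE 256	/* hypothetical emergency buffer size */

/* Hypothetical stand-in for a preallocated emergency reply state. */
struct reply_sketch {
	size_t rs_size;				/* usable buffer size */
	unsigned char rs_buf[EMERG_RS_SIZE];
};

static struct reply_sketch emerg_pool;

/* Hand out the emergency reply, or NULL if it cannot hold `needed`. */
static struct reply_sketch *get_emerg_reply(size_t needed)
{
	struct reply_sketch *rs = &emerg_pool;

	/* Size check: refuse requests the emergency buffer cannot hold. */
	if (needed > sizeof(rs->rs_buf))
		return NULL;

	/* Clearing the structure also clears rs_size ... */
	memset(rs, 0, sizeof(*rs));
	/* ... so record the real buffer size again before returning. */
	rs->rs_size = sizeof(rs->rs_buf);

	return rs;
}
```

A caller that later asserts rs_size is at least the requested size would
then pass for any request the pool accepted.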

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I9fbd4f14e8e1263de2af564c4f2e420f5f2b43bc
Reviewed-on: http://review.whamcloud.com/8200
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4221 osd: add case LCFG_PARAM to osd_process_config 18/8618/8
Emoly Liu [Tue, 7 Jan 2014 14:56:21 +0000 (22:56 +0800)]
LU-4221 osd: add case LCFG_PARAM to osd_process_config

Some proc parameters were moved from ofd to osd module and only
their symlinks were kept in ofd for interoperability/compatibility.
To process this kind of config params passed by ofd, this patch is
to do the following fixes:
- add case LCFG_PARAM to osd_process_config() to process parameters
  with prefix both PARAM_OSD and PARAM_OST.
- since these parameters are not included in the static lprocfs var
  list, a pre-check is added for them so that an "unknown param" error
  message does not confuse users. If they match in this check, they
  will be passed to the osd directly.
- get rid of lprocfs_osd_init_vars() and use struct lprocfs_vars
  lprocfs_osd_{obd,module}_vars[] instead.
- improve the error messages in class_process_proc_param() and
  class_process_proc_seq_param() a little.
- add conf-sanity.sh test_28a to verify the patch and skip this test
  for ZFS OSTs since ZFS has no such kind of parameters.

This is a backport of commit b1491d26271f074dc6f99cde037403337c0b2151
in http://review.whamcloud.com/8238 .

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Michael MacDonald <mjmac@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I8b8d4244f90bd9e16acdccedd09da73fbb5e501b
Reviewed-on: http://review.whamcloud.com/8618
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Michael MacDonald <michael.macdonald@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4276 ldiskfs: enable read/write access by default 79/8779/2
Bob Glossman [Tue, 19 Nov 2013 22:51:09 +0000 (14:51 -0800)]
LU-4276 ldiskfs: enable read/write access by default

Add build time config option to allow read/write access by default.
While the new CONFIG_LDISKFS_FS_RW only matters in SLES11 builds,
it's easiest to just add the flag to all builds unconditionally.
It will be ignored in builds where it doesn't matter.

Lustre-commit: 14c94c20c3447584e81d720c2b2a17888716709e
Lustre-change: http://review.whamcloud.com/8335

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I4fcf0b2f884b1442db0aac5788bf62f07537c5d4
Reviewed-on: http://review.whamcloud.com/8779
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4030 tests: use free_fd() to allocate file descriptor 11/8811/2
Vladmir Saveliev [Sun, 12 Jan 2014 03:21:28 +0000 (11:21 +0800)]
LU-4030 tests: use free_fd() to allocate file descriptor

free_fd() lists /proc/self/fd to find the smallest unused file
descriptor.
sanity test_31n is changed to use free_fd() instead of the hardcoded
fd 173.
sanity test_236 is changed to use free_fd() instead of "{FD}<>",
which is not available in earlier versions of bash.

Since test_31n now uses the function free_fd to find an unused file
descriptor, it no longer depends on fd 173 being free.  This change
also removes that test on whether fd 173 is in use.

This patch is back-ported from the following ones:
Lustre-commit: 1f9235152b2f44c7bd64c5c021066f1984f341e6
Lustre-change: http://review.whamcloud.com/8181
and
Lustre-commit: 73e816e57167eb92425b6cf29fc570e56c88f6bd
Lustre-change: http://review.whamcloud.com/8622

Signed-off-by: Vladmir Saveliev <vladimir_saveliev@xyratex.com>
Change-Id: I0c9c04787d45dfe6ba5ed01adb0a8ee265c6b3c5
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8811
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
10 years ago LU-3939 tests: sanity-hsm/test_40 needs a local HSM_ARCHIVE 71/8771/2
Bruno Faccini [Wed, 8 Jan 2014 07:24:20 +0000 (15:24 +0800)]
LU-3939 tests: sanity-hsm/test_40 needs a local HSM_ARCHIVE

sanity-hsm/test_40 suffers frequent failures during auto-test due
to remote/NFS-mounted HSM_ARCHIVE causing the 400 archive requests
to take more than 100s to be drained from copytool requests queue.
This patch lets each sub-test pass a non-default hsm-root/HSM_ARCHIVE
dir to the copytool_setup function, and test_40 uses it.

This patch is back-ported from the following one:
Lustre-commit: 8484f1c51c701141237e98a1467c75364766f357
Lustre-change: http://review.whamcloud.com/7703

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I733b267991faa3b8c9415fea116d2086575333bb
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8771
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3920 tests: check MDS version before testing HSM feature 34/8734/4
Vladimir Saveliev [Mon, 6 Jan 2014 04:26:31 +0000 (12:26 +0800)]
LU-3920 tests: check MDS version before testing HSM feature

Sanity tests 56y and 229 fail when MDS does not have HSM
support. Check MDS version and skip the tests in that case.

This patch is back-ported from the following one:
Lustre-commit: b635ddd7f6ebe04681fae34da3b26e3b6b5301f0
Lustre-change: http://review.whamcloud.com/8121

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=229 \
ossjob=lustre-b2_4 mdsjob=lustre-b2_4 ossbuildno=70 mdsbuildno=70 \
testlist=sanity

Xyratex-bug-id: MRP-1417

Signed-off-by: Vladimir Saveliev <vladimir_saveliev@xyratex.com>
Change-Id: I6bf3bffd45ad8a2a7c72424447a4d486389c8e8d
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8734
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4299 kernel: kernel update [SLES11 SP3 3.0.101-0.8] 66/8766/2
Bob Glossman [Mon, 6 Jan 2014 23:13:37 +0000 (15:13 -0800)]
LU-4299 kernel: kernel update [SLES11 SP3 3.0.101-0.8]

update target and config files for new kernel version

Lustre-commit: a6bf2c1ee73a217df8e0b44fb0d5cea15a3bd874
Lustre-change: http://review.whamcloud.com/8762

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I1970bc9657286b57746e3f0a18ca9d22f134189e
Reviewed-on: http://review.whamcloud.com/8766
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4165 tests: skip sanity-lfsck test_2c for 2.4 or older 28/8728/2
Fan Yong [Sat, 26 Oct 2013 20:56:28 +0000 (04:56 +0800)]
LU-4165 tests: skip sanity-lfsck test_2c for 2.4 or older

It makes no sense to run sanity-lfsck test_2c against 2.4 or older.

Test-Parameters: mdsjob=lustre-b2_4 ossjob=lustre-b2_4 mdsbuildno=58 ossbuildno=58 testlist=sanity-lfsck
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I96792b4325a69f880e326dc8963cf3e6bd09bf87
Reviewed-on: http://review.whamcloud.com/8386
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8728
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4072 tests: Decrease load on MDT for ZFS in sanity/24v 69/8769/2
Nathaniel Clark [Wed, 8 Jan 2014 06:33:06 +0000 (14:33 +0800)]
LU-4072 tests: Decrease load on MDT for ZFS in sanity/24v

Due to performance of ZFS, reduce the number of file creates until
LU-2887/LU-4072 are resolved.

This patch is back-ported from the following one:
Lustre-commit: ee009f3b3e7bd467df3da3d0b53777db65790062
Lustre-change: http://review.whamcloud.com/7870

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I6271a7892c02885855b9e5b750438087e7875c5b
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8769
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4223 utils: fixing loop leaking in utils 23/8723/2
wang di [Sat, 4 Jan 2014 14:15:59 +0000 (22:15 +0800)]
LU-4223 utils: fixing loop leaking in utils

1. If a file was opened with popen(), close it with pclose()
instead of fclose(), so that the process created by popen() has
exited once pclose() returns and no longer holds the loop device
open.

2. Retry in loop_cleanup in case some process is still using the
loop device.

3. Wait for the loop device to be released before continuing
conf-sanity 32c.

4. Add losetup -a to list loop device information when the
test (conf-sanity 32) fails.
This patch is back-ported from the following one:
Lustre-commit: 98ac0fe3a45dde62759ecaa4c84e6250ac2067f8
Lustre-change: http://review.whamcloud.com/8409

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes \
mdscount=4 mdtcount=4 testlist=conf-sanity

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ic1ebc2a6b2ce4280c2123080171e203e99267b28
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/8723
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4270 test: fix sanity test_209 29/8729/3
Niu Yawei [Mon, 6 Jan 2014 04:44:58 +0000 (12:44 +0800)]
LU-4270 test: fix sanity test_209

Fix the connect_flags checking in test_209 of sanity.sh

This patch is back-ported from the following one:
Lustre-commit: b498499104af17da081f1c22b9c07951104846a3
Lustre-change: http://review.whamcloud.com/8326

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=209 \
ossjob=lustre-b2_4 mdsjob=lustre-b2_4 ossbuildno=70 mdsbuildno=70 \
testlist=sanity

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I6c34b9dbe6d3b7475d85588e7adb3acb762fab32
Reviewed-on: http://review.whamcloud.com/8729
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3971 hsm: Copytool code cleanup
Henri Doreau [Fri, 6 Sep 2013 12:24:09 +0000 (14:24 +0200)]
LU-3971 hsm: Copytool code cleanup

Minor refactoring of the bandwidth controlling code.
Deletion of a superfluous select() call on regular files.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: Iae550bb69c1524865b38a92d9b7674fce2f58258
Reviewed-on: http://review.whamcloud.com/7583
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>