Whamcloud - gitweb
fs/lustre-release.git
11 years ago LU-2188 tests: Fix assumptions in test 133d
Nathaniel Clark [Thu, 1 Nov 2012 17:51:13 +0000 (13:51 -0400)]
LU-2188 tests: Fix assumptions in test 133d

The test assumed that with 512 files in a directory, the inode sizes
would be different in the two test directories.  This is not the case
on ZFS, which caused the get_rename_size() function to return multiple
values.  This change adds an argument to specify which stat is pulled
from rename_stats, and doesn't rely on the sizes being different.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I568ec95bd7f0613caf96101055a392ea5762cd2d
Reviewed-on: http://review.whamcloud.com/4438
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2244 mds: remove remaining of old mds code
Alex Zhuravlev [Sun, 28 Oct 2012 17:17:42 +0000 (20:17 +0300)]
LU-2244 mds: remove remaining of old mds code

It is no longer used.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ie6e94d7a19a38ed57397ff48091597ea02f2ada1
Reviewed-on: http://review.whamcloud.com/4398
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: wangdi <di.wang@intel.com>
11 years ago LU-1832 ldlm: fix double list add
Peng Tao [Wed, 5 Sep 2012 07:51:04 +0000 (15:51 +0800)]
LU-1832 ldlm: fix double list add

Adding a list entry to itself causes a kernel warning when
CONFIG_DEBUG_LIST is enabled.
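
As a rough illustration (not the patched Lustre code), the userspace
sketch below mimics the kind of sanity check a debug list implementation
can perform: inserting an entry that is the head itself, or that is
already the first element, is rejected. The names list_node,
list_node_init and list_add_checked are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list node. */
struct list_node {
	struct list_node *prev;
	struct list_node *next;
};

/* An empty list (or an unlinked entry) points at itself. */
void list_node_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

/*
 * Insert `entry` right after `head`.
 * Returns false instead of corrupting the list when `entry` is the head
 * itself or is already the first element, i.e. a "double add".
 */
bool list_add_checked(struct list_node *entry, struct list_node *head)
{
	struct list_node *next;

	assert(entry != NULL && head != NULL);
	next = head->next;
	if (entry == head || entry == next)
		return false;

	entry->prev = head;
	entry->next = next;
	next->prev = entry;
	head->next = entry;
	return true;
}
```

A caller that gets false back knows it tried to link an entry twice and
can fix the call site rather than continue with a corrupted list.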

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Ibaf135c2c6ca6cc8ee4f0e6f270d738c6964fddb
Reviewed-on: http://review.whamcloud.com/3880
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Keith Mannthey <kemannthey@gmail.com>
11 years ago LU-2186 mdt: initialize pointer to lu_site
Alex Zhuravlev [Tue, 16 Oct 2012 19:20:25 +0000 (23:20 +0400)]
LU-2186 mdt: initialize pointer to lu_site

The pointer is used later to access the top device (which is the MDT)
and learn the number of current clients, in order to foresee how many
sequences will be needed.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I0f542dfbc45836180ec274dc605d3770b527e988
Reviewed-on: http://review.whamcloud.com/4280
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
11 years ago LU-2073 procfs: procfs symlinks are apparently never freed
yangsheng [Thu, 1 Nov 2012 15:33:05 +0000 (23:33 +0800)]
LU-2073 procfs: procfs symlinks are apparently never freed

We should never set proc_dir_entry->data to NULL.
Doing so causes a memory leak when the entry is a symlink.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I45c82fd206be738b5fdc4b2e612c3d87a708df67
Reviewed-on: http://review.whamcloud.com/4434
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
11 years ago LU-1538 tests: sanity.sh failed tests to clean up after themselves
Oleg Drokin [Sat, 3 Nov 2012 19:36:55 +0000 (15:36 -0400)]
LU-1538 tests: sanity.sh failed tests to clean up after themselves

Commit 467cf22b changed the behavior of the error() function to abort
the test right away. As a result, a lot of older tests were leaving
piles of files behind, causing subsequent tests to fail spuriously.
Also, tests like 32[ijkl] left mountpoints on Lustre, so the subsequent
test 65j was no longer able to umount Lustre and hung there indefinitely.

This patch adds cleanups in tests: 24v, 27m, 32[ijkl]

Additionally, tests 17m, 27m, and 59 were making unsafe assumptions about
how long it would take for objects to be deleted. Replaced the explicit
sleeps there with calls to wait_delete_completed.

test 110: fixed a typo with quotes, autogenerate long filenames

test 72a 80: removed unnecessary "true" call.

Change-Id: I1c1002bfad278b767e45301b56e74688690690ee
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4454
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years ago LU-1337 llite: kernel 3.2 make_request_fn returns void
Liu Xuezhao [Tue, 30 Oct 2012 08:59:11 +0000 (16:59 +0800)]
LU-1337 llite: kernel 3.2 make_request_fn returns void

In kernel 3.2, request_queue.make_request_fn is defined as a function
returning void (kernel commit 5a7bbad27a410350e64a2d7f5ec18fc73836c14f).
Add LC_HAVE_VOID_MAKE_REQUEST_FN/HAVE_VOID_MAKE_REQUEST_FN to check for this.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: I49a27873c1754addc9fef7c5f50cbf84592adf05
Reviewed-on: http://review.whamcloud.com/3576
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1527 clio: check if lock is freed in cl_lock_peek()
Andriy Skulysh [Thu, 4 Oct 2012 14:20:05 +0000 (17:20 +0300)]
LU-1527 clio: check if lock is freed in cl_lock_peek()

The lock may have been freed between cl_lock_lookup() and
cl_lock_mutex_get() so we should check lock state after grabbing
lock mutex.

Xyratex-bug-id: MRP-665
Change-Id: Id3562b3dd8bd052b74ad7840f08b904ca38a6746
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-on: http://review.whamcloud.com/3117
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2191 utils: tunefs.lustre failed to read ZFS partitions
Nathaniel Clark [Mon, 5 Nov 2012 21:07:55 +0000 (16:07 -0500)]
LU-2191 utils: tunefs.lustre failed to read ZFS partitions

ZFS shared libraries were not loaded prior to attempting to verify the
type of partition supplied on the command line, so it would never
recognize a ZFS partition.  The mount type also needs to be passed down
to osd_read_lld, rather than using whatever is set in the defaults.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Iad88da4ddd9cf5fcc75f8409933467d9237f58d3
Reviewed-on: http://review.whamcloud.com/4469
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years ago LU-2281 utils: Fix possible segfault in tunefs.lustre
Nathaniel Clark [Mon, 5 Nov 2012 21:22:33 +0000 (16:22 -0500)]
LU-2281 utils: Fix possible segfault in tunefs.lustre

ldiskfs_read_ldd() can segfault if fopen of mountdata fails, because
it always tries to fclose the file handle, which crashes if the handle
is NULL.
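
The pattern behind this fix can be sketched as follows. This is not the
code of ldiskfs_read_ldd(); ex_read_mountdata is a hypothetical helper
that only shows the guarded fclose on the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: read up to len - 1 bytes of `path` into `buf` and
 * NUL-terminate it. Returns 0 on success or a positive errno value.
 */
int ex_read_mountdata(const char *path, char *buf, size_t len)
{
	FILE *filep;
	size_t n;
	int rc = 0;

	assert(path != NULL && buf != NULL && len > 0);

	filep = fopen(path, "rb");
	if (filep == NULL) {
		rc = errno != 0 ? errno : EIO;
		goto out;
	}

	n = fread(buf, 1, len - 1, filep);
	if (ferror(filep))
		rc = EIO;
	buf[n] = '\0';
out:
	/* fclose(NULL) is undefined behavior, so only close what was opened. */
	if (filep != NULL)
		fclose(filep);
	return rc;
}
```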

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I553a7972b61ec01473bf834f98f8937bc7b11dbc
Reviewed-on: http://review.whamcloud.com/4470
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2275: obdclass: Proper error cleaup for class_newdev
Oleg Drokin [Mon, 5 Nov 2012 05:24:18 +0000 (00:24 -0500)]
LU-2275: obdclass: Proper error cleaup for class_newdev

class_newdev did not clean up properly when no more obd devices were
available, and leaked an obdtype reference and some memory
in that case.
This patch fixes the issue.
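
A simplified sketch of this kind of error-path cleanup, assuming a fixed
table of device slots. The names ex_type, ex_dev, ex_devs, EX_MAX_DEVS
and ex_newdev are hypothetical and do not reflect the real obdclass
structures.

```c
#include <assert.h>
#include <stdlib.h>

#define EX_MAX_DEVS 2		/* hypothetical device table size */

struct ex_type { int refcount; };
struct ex_dev { struct ex_type *type; };

static struct ex_dev *ex_devs[EX_MAX_DEVS];

/*
 * Allocate a device and register it in a free slot. On any failure the
 * type reference and the allocation are both released before returning.
 */
struct ex_dev *ex_newdev(struct ex_type *type)
{
	struct ex_dev *dev;
	int i;

	assert(type != NULL);
	type->refcount++;	/* take a reference on the type */

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL)
		goto out_put;

	for (i = 0; i < EX_MAX_DEVS; i++) {
		if (ex_devs[i] == NULL) {
			dev->type = type;
			ex_devs[i] = dev;
			return dev;
		}
	}

	/* No free slot: undo the allocation, then drop the reference. */
	free(dev);
out_put:
	type->refcount--;
	return NULL;
}
```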

Change-Id: I6b683f914f5cbcd21ef414fe470ccc88c39c4deb
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4460
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1531 mdt: Check non-normalised fid.
build [Thu, 11 Oct 2012 18:16:32 +0000 (13:16 -0500)]
LU-1531 mdt: Check non-normalised fid.

Apply fid checking in a manner similar to mdt_fid2path processing.

IGIF FIDs are checked to ensure correct behavior for upgraded
1.8 filesystems.

Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Change-Id: Iea7ebfda8a31915b9d4fe2959773c9312b087485
Reviewed-on: http://review.whamcloud.com/4255
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2020 sanity: test 140 should allow 40 consecutive symlink
Peng Tao [Mon, 24 Sep 2012 07:17:41 +0000 (15:17 +0800)]
LU-2020 sanity: test 140 should allow 40 consecutive symlink

For kernels > 3.5, testing recursive symlinks requires a real
recursive symlink.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I4f1b834a79cdf4edb1775da45200f6fd2a680709
Reviewed-on: http://review.whamcloud.com/4079
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
11 years ago LU-2170 osc: set osc_lock attribute only once
Jinshan [Fri, 19 Oct 2012 16:28:00 +0000 (12:28 -0400)]
LU-2170 osc: set osc_lock attribute only once

Set the osc_lock's attribute in the lock allocator. Otherwise, if this
lock is matched and enqueued by a glimpse thread, the osc_lock's
ols_glimpse will be set to true and the lock state will be messed up in
osc_lock_upcall().

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ib8492fa159a43dad11febe5a01f8c4ef72b8c4f3
Reviewed-on: http://review.whamcloud.com/4316
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2241 symlink: fix off-by-one error when reading symlinks
Nathaniel Clark [Wed, 31 Oct 2012 20:56:09 +0000 (16:56 -0400)]
LU-2241 symlink: fix off-by-one error when reading symlinks

This fixes an off-by-one error when reading symlinks of inode size.
The null character is not accounted for when checking buffer length
vs. inode data size.
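
The class of check involved can be sketched like this. The helper
ex_copy_link_target is hypothetical and is not the Lustre code; it only
shows that the buffer must hold the data plus the terminating null
character.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a link target of `datalen` bytes (without a
 * terminator) into `buf` and NUL-terminate it.
 */
int ex_copy_link_target(char *buf, size_t buflen,
			const char *data, size_t datalen)
{
	assert(buf != NULL && data != NULL);

	/*
	 * The buffer needs datalen + 1 bytes. Checking only
	 * `buflen < datalen` would be off by one, because a target of
	 * exactly buflen bytes leaves no room for the terminator.
	 */
	if (buflen <= datalen)
		return -ERANGE;

	memcpy(buf, data, datalen);
	buf[datalen] = '\0';
	return 0;
}
```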

Also add a regression test to sanity.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: If4464cac60d57012311226113ff38b9c28926958
Reviewed-on: http://review.whamcloud.com/4415
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1684 ldlm: move ldlm flags not sent through wire to upper 32bits
Vitaly Fertman [Mon, 22 Oct 2012 12:49:27 +0000 (16:49 +0400)]
LU-1684 ldlm: move ldlm flags not sent through wire to upper 32bits

There is no free bit left in the lower 32 bits for a new LDLM_FL_* flag
that needs to be sent over the wire. Move the locally used flags to the
upper 32 bits to free some bits.
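
A small sketch of the layout idea, assuming a 64-bit flags word whose
lower half is the only part sent to the peer. The EX_FL_* names and
values are hypothetical and are not the real LDLM_FL_* definitions.

```c
#include <assert.h>	/* static_assert */
#include <stdint.h>

/* Hypothetical flag layout; the values are not the real LDLM_FL_* ones. */
#define EX_FL_WIRE_MASK   0x00000000ffffffffULL /* bits sent to the peer */
#define EX_FL_BLOCK_WAIT  0x0000000000000001ULL /* example wire flag */
#define EX_FL_LOCAL_ONLY  0x0000000100000000ULL /* example local flag */

/* A local-only flag must not overlap the bits that go over the wire. */
static_assert((EX_FL_LOCAL_ONLY & EX_FL_WIRE_MASK) == 0,
	      "local flag overlaps wire bits");

/* Keep only the bits that are part of the wire protocol. */
uint32_t ex_flags_to_wire(uint64_t flags)
{
	return (uint32_t)(flags & EX_FL_WIRE_MASK);
}

/* Merge received wire bits with the locally kept upper bits. */
uint64_t ex_flags_from_wire(uint64_t local, uint32_t wire)
{
	return (local & ~EX_FL_WIRE_MASK) | (uint64_t)wire;
}
```

With this split, local flags never need a slot in the 32 bits that the
protocol carries.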

Change-Id: Iddaff0a75b19d7311800d2ac6c3fef1012b9ffd2
Reviewed-by: Alexander Zarochentsev <Alexander_Zarochentsev@xyratex.com>
Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Xyratex-Bug-ID: MRP-541
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-on: http://review.whamcloud.com/3494
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2235 lfsck: remove unnecessary warning message
Fan Yong [Sat, 3 Nov 2012 02:34:40 +0000 (10:34 +0800)]
LU-2235 lfsck: remove unnecessary warning message

Currently, the new online LFSCK does not work for the ZFS backend,
but this is not fatal and does not block mount processing.
So remove the unnecessary warning messages.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I106a6b5c978cde8695821776570c30605f03c400
Reviewed-on: http://review.whamcloud.com/4452
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
11 years ago LU-1199 build: Remove ancient "nonfree" module support
Christopher J. Morrone [Tue, 30 Oct 2012 02:54:49 +0000 (19:54 -0700)]
LU-1199 build: Remove ancient "nonfree" module support

Lustre doesn't have any "nonfree" kernel modules, and the code to
support "nonfree" was from 2005.  I think we can remove it now.

Change-Id: I790d170fabdc5cd6e4948f20ccca2a6bfdd1bc29
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4408
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1199 build: Remove duplicate LC_MODULE_LOADING
Christopher J. Morrone [Tue, 30 Oct 2012 02:43:07 +0000 (19:43 -0700)]
LU-1199 build: Remove duplicate LC_MODULE_LOADING

It appears that LC_MODULE_LOADING was accidentally declared twice
back-to-back in the same file.  This removes the first declaration
on the assumption that if my eye-balling of the code missed a
difference, the second one is the one we've been using anyway.

Change-Id: I04a9da80d6be7bef6e4fd35eca8f3e490a8a824f
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4407
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1729: lu_buf code cleaning
jcl [Thu, 9 Aug 2012 21:29:05 +0000 (23:29 +0200)]
LU-1729: lu_buf code cleaning

Fix DLUBUF define and use LU_BUF_NULL to clear a lu_buf

Change-Id: I742308616d9c39196e56bf4983523152d26e1245
Signed-off-by: jcl <jacques-charles.lafoucriere@cea.fr>
Reviewed-on: http://review.whamcloud.com/3589
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1714 lnet: Properly initialize sg_magic value
Prakash Surya [Fri, 17 Aug 2012 16:11:32 +0000 (09:11 -0700)]
LU-1714 lnet: Properly initialize sg_magic value

When the CONFIG_DEBUG_SG flag is enabled in the kernel, we must ensure
the sg_magic field is properly initialized. Otherwise, internal kernel
assertions will fail when trying to verify this field. As a result,
certain calls to sg_* functions had to be changed or inserted to ensure
the sg_init_table function would be called, initializing the magic
value. Also, we need to ensure this value isn't zeroed out in the
kiblnd_setup_rd_kiov function.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Change-Id: I5b6b265a4a8dd37408bb78decd79ed54e0f9251b
Reviewed-on: http://review.whamcloud.com/3709
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1840 ldlm: fix mutex leak in ldlm_resource_get
Peng Tao [Thu, 6 Sep 2012 03:08:51 +0000 (11:08 +0800)]
LU-1840 ldlm: fix mutex leak in ldlm_resource_get

The resource is created with lr_lvb_mutex locked. The mutex needs to
be dropped before returning.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Id81f792605d864b9d3236498f063d6c003d8cd77
Reviewed-on: http://review.whamcloud.com/3883
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1951 mdd: fix for error handler of mdd_rename
Liang Zhen [Fri, 5 Oct 2012 12:16:19 +0000 (20:16 +0800)]
LU-1951 mdd: fix for error handler of mdd_rename

If mdd_rename() fails to unlink the target file/dir, it tries to
revert everything, including inserting the target file/dir back into
the target directory, but it didn't restore the nlink count of the
target, which leaves a file/dir under the target directory with a
wrong nlink number.

This patch also fixes another issue: mdd_attr_check_set_internal()
didn't release mdd_write_lock() when jumping to the error handler.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I601f0569de87b71d032f86ed1082c27d5bf5adaf
Reviewed-on: http://review.whamcloud.com/4405
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1337 llite: provides ll_get_acl to ->i_op->get_acl
Liu Xuezhao [Tue, 30 Oct 2012 08:52:55 +0000 (16:52 +0800)]
LU-1337 llite: provides ll_get_acl to ->i_op->get_acl

Since kernel 3.1, generic_permission() no longer takes the check_acl
argument. ACL checking has moved into the VFS, and filesystems
need to provide a non-NULL ->i_op->get_acl to read an ACL
from disk.

This patch complements http://review.whamcloud.com/3397
(d018b087c962b8c66e8dc479fc66e964a2e5fd94), to fix the failure of test_25
of sanityn.sh.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: Ica96adac03c1792e2e8b668b959457a4ffec9a43
Reviewed-on: http://review.whamcloud.com/3885
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1337 llite: kernel 3.1 changes open_to_namei_flags
Liu Xuezhao [Tue, 30 Oct 2012 08:45:48 +0000 (16:45 +0800)]
LU-1337 llite: kernel 3.1 changes open_to_namei_flags

Kernel 3.1 changes the translation from open_flag to namei_flag,
(kernel commit 8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).

So after 3.1, the kernel's nameidata.intent.open.flags differs
from Lustre's lookup_intent.it_flags: the lower bits of Lustre's
it_flags equal FMODE_xxx, while the kernel doesn't translate the
lower bits of nameidata.intent.open.flags to FMODE_xxx.

This patch keeps the semantics of Lustre's it_flags and adds
ll_namei_to_lookup_intent_flag for translation.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: I408685040688bae574d04cf288abb6ca967607df
Reviewed-on: http://review.whamcloud.com/3583
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-930 utils: minor fixes to lfs_migrate.1 man page
Andreas Dilger [Thu, 1 Nov 2012 21:50:16 +0000 (15:50 -0600)]
LU-930 utils: minor fixes to lfs_migrate.1 man page

Fix the formatting of the lfs_migrate.1 man page SYNOPSIS section,
since ".Blfs_migrate" is not the same as ".B lfs_migrate", and hence
the synopsis was missing the actual name of the command it described.

Also fix some minor grammar issues.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie0d8e3cd6fbab0663562b6a99f124ead953ebbe5
Reviewed-on: http://review.whamcloud.com/4440
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1279 utils: Silence modprobe ptlrpc output in mount.lustre
Oleg Drokin [Fri, 2 Nov 2012 18:52:07 +0000 (14:52 -0400)]
LU-1279 utils: Silence modprobe ptlrpc output in mount.lustre

Patch d8d9b78a5c08eb1d938ab9e3bdaf7f756bfbb5ec introduced
this modprobe, but the order of redirects was reversed, which results
in printing spurious messages like "FATAL: Module ptlrpc not found."
when mounting Lustre from a local build dir.

Change-Id: I688d073ad3b0565f73c29a50c2b81383adfd7a48
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4449
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years ago LU-928 fid: add comments describing different FIDs
Andreas Dilger [Wed, 26 Sep 2012 09:29:26 +0000 (11:29 +0200)]
LU-928 fid: add comments describing different FIDs

Add comments to the code describing various FID types, from
http://wiki.lustre.org/index.php/Architecture_-_Interoperability_fids_zfs

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I325b48a0e85fb25ed8c3a3709e623978969d8d4a
Reviewed-on: http://review.whamcloud.com/4102
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Ned Bass <bass6@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1337 llite: kernel 3.1 kills inode->i_alloc_sem
Liu Xuezhao [Thu, 27 Sep 2012 06:20:25 +0000 (14:20 +0800)]
LU-1337 llite: kernel 3.1 kills inode->i_alloc_sem

Kernel 3.1 kills inode->i_alloc_sem, use i_dio_count and
inode_dio_wait/inode_dio_done instead.
(kernel commit bd5fe6c5eb9c548d7f07fe8f89a150bb6705e8e3).

Add HAVE_INODE_DIO_WAIT to differentiate it.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: Ife36e07a85c76153985a4a86ee1973262c4c0e27
Reviewed-on: http://review.whamcloud.com/3582
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years ago LU-2153 quota: several fixes for reintegration
Niu Yawei [Tue, 16 Oct 2012 02:48:03 +0000 (22:48 -0400)]
LU-2153 quota: several fixes for reintegration

- On the master side, never delete the id entry from the global/slave
  index, otherwise those deleted entries will not be transferred
  during reintegration; test_7a is improved for this change;
- When starting the reintegration thread, if there are any pending
  updates, abort and try to start reintegration later;
- Set rq_no_retry_einprogress for quota requests;
- When the master finds a quota acquire for a non-enforced ID, return
  -ESRCH to the slave instead of -EINPROGRESS;
- Check free inodes in test_2;

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I64037f6aff6be686250272eda53c027bf5ba47c2
Reviewed-on: http://review.whamcloud.com/4275
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years ago LU-1337 build: remove unnecessary includings of system.h
Liu Xuezhao [Tue, 30 Oct 2012 09:12:11 +0000 (17:12 +0800)]
LU-1337 build: remove unnecessary includings of system.h

<asm/system.h> is removed in kernel 3.4, and it is indeed not needed.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: Ic4d0a086656c5dfb05669aae40680b41e8ea00c7
Reviewed-on: http://review.whamcloud.com/3575
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1994 kernel: v3.5 defines INVALID_UID
Peng Tao [Wed, 22 Aug 2012 08:55:22 +0000 (16:55 +0800)]
LU-1994 kernel: v3.5 defines INVALID_UID

With kernel commit 7a4e7408, Lustre doesn't need to redefine
INVALID_UID.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I96b854cc51db735d8c985528c879fbeb5b049ab9
Reviewed-on: http://review.whamcloud.com/3755
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1337 osc: fix -Werror=unused-result
chas williams - CONTRACTOR [Tue, 14 Aug 2012 14:42:25 +0000 (10:42 -0400)]
LU-1337 osc: fix -Werror=unused-result

Newer Fedora kernels build using -Werror=unused-result.  It appears
that GOTO() isn't correctly assigning rc in this instance.  The
unused PTR_ERR() is generating a warning, which is upgraded to an error.

Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Change-Id: I66d730d4d0e20f0f1c7671dc00acefdf7ed1fbe9
Reviewed-on: http://review.whamcloud.com/3638
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-845 tests: automate large LUN testing
Wei3 Liu [Tue, 30 Oct 2012 21:42:56 +0000 (14:42 -0700)]
LU-845 tests: automate large LUN testing

a. run llverdev on the raw device to verify there is no driver issue
b. run llverfs on OST ldiskfs filesystem
c. use up free inodes on the OST with mdsrate
d. run llverfs on lustre filesystem

Change-Id: I021009647d2053fa53cff1067f8f2bc83d12ce45
Signed-off-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1700
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1279 utils: mount.lustre load ptlrpc module if necessary
Bobi Jam [Thu, 18 Oct 2012 10:10:09 +0000 (18:10 +0800)]
LU-1279 utils: mount.lustre load ptlrpc module if necessary

When the LNET modules have not been loaded, mounting multiple targets at
the same time could fail. Use mount.lustre to load the network modules
if necessary.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d7a4007cc5b233055a4a985237b01ff0874cf54
Reviewed-on: http://review.whamcloud.com/4292
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
11 years ago LU-1169 mgs: Fix race during new fsdb creation.
Andriy Skulysh [Thu, 18 Oct 2012 10:29:31 +0000 (13:29 +0300)]
LU-1169 mgs: Fix race during new fsdb creation.

Lock fsdb_mutex until the fsdb is loaded from llogs.
This fixes a race between loading data from the llog into the fsdb
and obtaining data from it.

Xyratex-bug-id: MRP-230
Signed-off-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Bruce Korb <Bruce_Korb@us.xyratex.com>
Change-Id: I8c29040a182f363e83e61e57d3e20756f40300ea
Reviewed-on: http://review.whamcloud.com/2251
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years ago LU-2226 osp: dump statfs data via lprocfs
Alex Zhuravlev [Thu, 25 Oct 2012 09:59:39 +0000 (13:59 +0400)]
LU-2226 osp: dump statfs data via lprocfs

Register another set of vars to be accessed with
data=dt device. Use the existing lprocfs_osd_rd_*() helpers.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ib2fed358866847d8abb0e818c1d40494c0642681
Reviewed-on: http://review.whamcloud.com/4390
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
11 years ago LU-2152 iam: it->load fix
Niu Yawei [Mon, 15 Oct 2012 03:42:01 +0000 (23:42 -0400)]
LU-2152 iam: it->load fix

Current iam it->load for lfix doesn't work properly because
iam_lfix_ilookup() isn't implemented at all.

This patch also added one more reintegration test for quota to
test the global index transfer in multiple bulks, and proc entry
for global index copy is added to verify the limits on slaves
easily.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ifb1dca0551b2aa4db3d37ff4ac6b3fcded34b7cc
Reviewed-on: http://review.whamcloud.com/4266
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years ago LU-2211 quota: cap how long a thread can wait for quota
Johann Lombardi [Fri, 19 Oct 2012 13:59:12 +0000 (15:59 +0200)]
LU-2211 quota: cap how long a thread can wait for quota

Change qsd_op_begin() path to wait for quota space for less than
obd_timeout / 2.
This patch also abandons the qsd_ops enum in favor of a more generic
qsd_adjust() implementation which will always do the same processing
even if adjustment is delayed because of a quota request in flight.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I5faf637c5330ca7f503c292e0e28edb84458ee89
Reviewed-on: http://review.whamcloud.com/4314
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years ago LU-921 llite: warning in case of discarding dirty pages
Hongchao Zhang [Tue, 23 Oct 2012 12:00:17 +0000 (20:00 +0800)]
LU-921 llite: warning in case of discarding dirty pages

When a client is evicted, dirty pages may get silently discarded.
The caller of a successful write(2) will not know that the data it
wrote has been discarded due to eviction before it could be flushed
to the OSS.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: Iecfbf096548ff08cdd6064d53ad8c688343fcddc
Reviewed-on: http://review.whamcloud.com/1908
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1822 llite: Remove deprecated truncate handler
michael.mckay [Tue, 4 Sep 2012 15:31:45 +0000 (11:31 -0400)]
LU-1822 llite:  Remove deprecated truncate handler

Remove the ll_truncate handler. This handler was only being used
to display a debug message about the truncated object. That line
was moved to a different location, and the handler removed.
This handler is an issue in kernels after 2.6.34 when running the
patchless client. In that version of the kernel, a kernel warning is
logged if it is called and the inode has a handler for truncate.
The truncate logic was updated some time ago to be more
consistent with the new sequence of events.

Xyratex-bug-id: MRP-597
Reviewed-by: Alexander Zarochentsev <Alexander_Zarochentsev@xyratex.com>
Reviewed-by: Iurii Golovach <iurii_golovach@xyratex.com>
Signed-off-by: Michael McKay <michael_mckay@xyratex.com>
Change-Id: I77b372a2825fd2bdc4b215ee20a979f03dc7d64b
Reviewed-on: http://review.whamcloud.com/3860
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Iurii Golovach <iurii.golovach@gmail.com>
11 years ago New tag 2.3.54 (tags: 2.3.54, v2_3_54, v2_3_54_0)
Oleg Drokin [Mon, 29 Oct 2012 06:47:01 +0000 (02:47 -0400)]
New tag 2.3.54

Change-Id: I0c6415d7924ee83c11a5e383915d06fca41ccf2a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1337 llite: ll_inode_permission should check RCU walk
Peng Tao [Tue, 18 Sep 2012 10:57:53 +0000 (18:57 +0800)]
LU-1337 llite: ll_inode_permission should check RCU walk

For >3.1 kernels, the RCU flag is folded into the mask field.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Icc6751493e7359646cb6bd84b3ac05de167e4d88
Reviewed-on: http://review.whamcloud.com/4039
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Liu Xuezhao <xuezhao.liu@emc.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-812 llite: 3.0+ kernel fsync should call write
Peng Tao [Tue, 25 Sep 2012 11:16:14 +0000 (19:16 +0800)]
LU-812 llite: 3.0+ kernel fsync should call write

Since 3.0, the kernel pushes i_mutex and fsync to the fs fsync
callback. So Lustre should check and do the same. Otherwise
there might be data corruption, and sanity 63b will fail.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I2f2f6792276eaf6783bffb813f3c3e5405be0450
Reviewed-on: http://review.whamcloud.com/4091
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1889 build: fix false 'uninitialized scalar variable' errs
Sebastien Buisson [Tue, 11 Sep 2012 14:43:33 +0000 (16:43 +0200)]
LU-1889 build: fix false 'uninitialized scalar variable' errs

Fix false 'uninitialized scalar variable' errors found by Coverity
version 6.0.3:
Uninitialized scalar variable (UNINIT)
Using uninitialized value, element or field when calling function.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I83a7dd3ae4a027bf0ebced572245bc4fff35e119
Reviewed-on: http://review.whamcloud.com/3939
Tested-by: Hudson
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1857 build: fix 'Unbounded source buffer' errors
Sebastien Buisson [Fri, 7 Sep 2012 13:59:51 +0000 (15:59 +0200)]
LU-1857 build: fix 'Unbounded source buffer' errors

Fix 'unbounded source buffer' defects found by Coverity version 6.0.3:
Unbounded source buffer (STRING_SIZE)
Passing string of unknown size to a function that expects
a string of a particular size.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I18e51f04e62241b5c5dad7ae963d8070d6954dd4
Reviewed-on: http://review.whamcloud.com/3904
Tested-by: Hudson
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1526 utils: Supply default MDT index
James Simmons [Thu, 18 Oct 2012 12:26:16 +0000 (08:26 -0400)]
LU-1526 utils: Supply default MDT index

To prepare for DNE, indexing has become a requirement
for MDTs, and with the latest Lustre you can't mount
an MDT that was not formatted with an index. While mount
has this requirement, mkfs.lustre has a bug that allows
you to format an MDS without an index and does not even
warn the user. At the same time, mkfs.lustre has to handle
the case where a user does not supply an index, since it
was not required in earlier Lustre releases. This patch
addresses this problem by supplying a default index of
zero for the MDT if no index is supplied to mkfs.lustre,
and warns the user that they must supply an index in the
future.
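
The defaulting behaviour described above can be sketched as follows. This is a Python model, not the mkfs.lustre source: `resolve_mdt_index` and the wording of the warning are hypothetical.

```python
def resolve_mdt_index(index=None):
    """Return (index, warning) for a hypothetical mkfs-style front end."""
    if index is None:
        # No index supplied: fall back to zero, but tell the user about it.
        return 0, "warning: no index given, defaulting to 0; supply one in future"
    if index < 0:
        raise ValueError("index must be non-negative")
    # An explicit index is used as-is and produces no warning.
    return index, None
```

Returning the warning instead of printing it keeps the policy separate from how the tool reports it.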

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I45932321885856d97b10630a0667e8338822b199
Reviewed-on: http://review.whamcloud.com/4293
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2019 llite: update i_flags in ll_iocontrol properly
Peng Tao [Thu, 20 Sep 2012 09:09:49 +0000 (17:09 +0800)]
LU-2019 llite: update i_flags in ll_iocontrol properly

When the client has an lsm, we still need to update the cached i_flags.
Otherwise i_flags goes out of sync.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I7fcb84da82129238f327885a0fc5827fcac90a8d
Reviewed-on: http://review.whamcloud.com/4078
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1756 kernel: clean up lustre_compat25.h
Peng Tao [Thu, 16 Aug 2012 07:59:21 +0000 (15:59 +0800)]
LU-1756 kernel: clean up lustre_compat25.h

1. unused functions:
   mapping_has_pages(), ll_call_writepage(), __set_page_ll_data()
   ll_invalidate_inode_pages(), __set_page_ll_data()
   CheckWriteback(), KIOBUF_GET_BLOCKS()
2. rename ll_vfs_create to vfs_create
3. remove kdev_t related macros
4. move cfs_cleanup_group_info() to lustre_common.h
5. remove kiobuf
6. move ll_inode_blksize() to lustre_common.h
7. drop LL_RENAME_DOES_D_MOVE

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Ic5e29e399e70ccd04cbe1448f3c6cfc3a258289b
Reviewed-on: http://review.whamcloud.com/3686
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2213 scrub: stop LFSCK before osd_shutdown
Fan Yong [Mon, 22 Oct 2012 17:12:05 +0000 (01:12 +0800)]
LU-2213 scrub: stop LFSCK before osd_shutdown

osd_shutdown cleans up all otable-based iteration, but the
upper-layer LFSCK depends on otable-based iteration.

So we need to stop LFSCK before osd_shutdown is called.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I97625d54766122314630aff0069d9e14d23b9840
Reviewed-on: http://review.whamcloud.com/4217
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-2224 osd-zfs: Fix osd_commit_async() locking
Brian Behlendorf [Thu, 25 Oct 2012 05:45:40 +0000 (22:45 -0700)]
LU-2224 osd-zfs: Fix osd_commit_async() locking

The ZFS osd_commit_async() function never properly acquires the
tx->tx_sync_lock() mutex to protect the tx_state_t.  However,
the mutex is correctly dropped so we just add the obviously
missing mutex_enter().
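
The shape of the bug can be shown with a small Python model. This is only an analogy using `threading.Lock`, not the osd-zfs code: `TxState`, `sync_requested`, and both function names are hypothetical.

```python
import threading


class TxState:
    """Stand-in for the state that the sync lock protects."""

    def __init__(self):
        self.sync_lock = threading.Lock()
        self.sync_requested = False


def commit_async_buggy(tx):
    tx.sync_requested = True   # state touched without holding the lock
    tx.sync_lock.release()     # unbalanced release: raises RuntimeError here


def commit_async_fixed(tx):
    tx.sync_lock.acquire()     # the previously missing acquire
    try:
        tx.sync_requested = True
    finally:
        tx.sync_lock.release() # now pairs with the acquire above
```

Python reports the unbalanced release immediately; a kernel mutex gives no such guarantee, which is why the missing acquire could go unnoticed.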

Change-Id: Iae426feaeb5885034515d6bf0ccb9509ed098bb0
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4383
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-2216 mdt: remove obsolete DNE code
wangdi [Sat, 27 Oct 2012 22:05:56 +0000 (15:05 -0700)]
LU-2216 mdt: remove obsolete DNE code

1. remove split checking and cross-ref code from DNE.
2. remove IAM code on ldiskfs and utils.
3. remove cmm directory.

Change-Id: I0c81d753462863706e8918393369dde94a45030c
Signed-off-by: Wang Di <di.wang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4353
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2179 osc: truncate partial page correctly
Jinshan Xiong [Sun, 21 Oct 2012 00:26:30 +0000 (17:26 -0700)]
LU-2179 osc: truncate partial page correctly

If a partial page is being truncated, the corresponding osc extent
should be held until the truncate has finished.

Add a debug patch for osc_extent_wait(), and don't wait for completion
of an RPC that has not even been sent in truncate.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I96a5ec1fdbb3133c735ebdfdd0330a45a2a8ab1a
Reviewed-on: http://review.whamcloud.com/4317
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2173 lod: QoS code to give up if no good OSP found
Alex Zhuravlev [Thu, 18 Oct 2012 18:38:58 +0000 (22:38 +0400)]
LU-2173 lod: QoS code to give up if no good OSP found

on any iteration. this code was removed by mistake in
commit 03b988a (LU-2093).

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ifa0d3a5ceeaaf84d3ec49e39bd2f337414a216ce
Reviewed-on: http://review.whamcloud.com/4300
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2219 ptlrpc: so_hpreq_handler is set twice for the ost_io svc
Nikitas Angelinas [Thu, 25 Oct 2012 09:04:20 +0000 (10:04 +0100)]
LU-2219 ptlrpc: so_hpreq_handler is set twice for the ost_io svc

ptlrpc_service_conf.psc_ops.so_hpreq_handler is set twice for
the ost_io service in ost_setup(); the second assignment
overwrites the first with NULL, so ost_io threads would never
handle RPCs as high-priority ones.

While we are at it, remove some superfluous assignments of
so_hpreq_handler to NULL for statically allocated
ptlrpc_service_conf structs when initializing other ptlrpc
services, and rename some relevant functions.
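
A minimal Python model of the double assignment, assuming a dictionary in place of the ptlrpc_service_conf struct. The handler body, `classify`, and the two setup functions are hypothetical.

```python
def hpreq_handler(req):
    """Hypothetical high-priority handler."""
    return "high-priority"


def setup_buggy():
    conf = {"so_hpreq_handler": hpreq_handler}
    conf["so_hpreq_handler"] = None   # second assignment silently wins
    return conf


def setup_fixed():
    # The field is assigned exactly once.
    return {"so_hpreq_handler": hpreq_handler}


def classify(conf, req):
    """Requests are only treated as high-priority when a handler is set."""
    handler = conf["so_hpreq_handler"]
    return handler(req) if handler else "normal"
```

With the buggy setup every request is classified as normal, which matches the symptom described in the commit message.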

Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Change-Id: Ia728a3d7f20511fcb58b259126b05055d5860455
Xyratex-bug-id: MRP-724
Reviewed-on: http://review.whamcloud.com/4368
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2214 lod: fix tricky iterator methods
Alex Zhuravlev [Mon, 22 Oct 2012 13:35:53 +0000 (17:35 +0400)]
LU-2214 lod: fix tricky iterator methods

instead of bypassing the LOD layer in the iterator methods,
give lod its own iterator structure, which keeps references
to the object and to the iterator of the layer below.

this also lets LOD have different iterators in different
objects, which is required for DNE.

to verify the approach, lfsck goes through LOD now.
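
The wrapping idea can be sketched in Python. This is a model of the pattern only: the class names, `lod_iterator_init`, and the dictionary used as an object are all hypothetical.

```python
class LowerLayerIterator:
    """Stand-in for the iterator provided by the layer below."""

    def __init__(self, keys):
        self._keys = list(keys)
        self._pos = 0

    def next(self):
        if self._pos >= len(self._keys):
            return None            # end of iteration
        key = self._keys[self._pos]
        self._pos += 1
        return key


class LodIterator:
    """Own iterator structure: remembers the object and the lower iterator."""

    def __init__(self, obj, lower_it):
        self.obj = obj
        self.lower_it = lower_it

    def next(self):
        # Delegate instead of handing the lower iterator to the caller.
        return self.lower_it.next()


def lod_iterator_init(obj):
    return LodIterator(obj, LowerLayerIterator(obj["entries"]))
```

Because each wrapper carries its own lower iterator, two objects can be iterated independently, and a different lower iterator type could be chosen per object.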

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I62935319a686f4b06b2cdf5ea4002a800c0c430d
Reviewed-on: http://review.whamcloud.com/4370
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1571 mdt: Do not update xid for open replay req
Wang Di [Sat, 15 Sep 2012 14:34:15 +0000 (07:34 -0700)]
LU-1571 mdt: Do not update xid for open replay req

Do not update last_xid for open replay req,
otherwise the following resend (after replay)
cannot be matched with the correct xid.

Remove unnecessary mti_transno zero check in
mdt_empty_transno.

Signed-off-by: wang di <di.wang@whamcloud.com>
Change-Id: I2a05f3ac05b301ae31641a1dc51f8c4eed96427d
Reviewed-on: http://review.whamcloud.com/3195
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-2174 test: improve error message
Niu Yawei [Mon, 15 Oct 2012 07:47:23 +0000 (03:47 -0400)]
LU-2174 test: improve error message

In sanity-quota.sh, if the test user/group does not exist, print an
error message telling the user to create them.

Check free space for test_0.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ie08250d665b305b140315f76391fd5161a6fbdd5
Reviewed-on: http://review.whamcloud.com/4268
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2167 ptlrpc: Fix use after free in ptlrpcd on termination
Oleg Drokin [Sat, 13 Oct 2012 16:51:35 +0000 (12:51 -0400)]
LU-2167 ptlrpc: Fix use after free in ptlrpcd on termination

Should not use pc after signalling completion of its use
since it will be freed later.

Change-Id: Id20e8d188fea77f23a52e9a374e7e5e84fe3ad4b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4264
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
11 years agoLU-1930 build: Back end file system fixes.
James Simmons [Tue, 16 Oct 2012 11:44:03 +0000 (07:44 -0400)]
LU-1930 build: Back end file system fixes.

Currently Lustre for ZFS requires the zfs development
rpm for its userland support to be installed on the
build machine so we can create the lustre zfs utilities.
This patch allows a user to build against the zfs/spl
source drops as well as the rpms.
A workaround is provided so we can point the lustre
build system to where a user can temporarily install the
zfs userland headers.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I8e5586ed22956a9dd4799826a442b8f5a895d872
Reviewed-on: http://review.whamcloud.com/3980
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1337 llite: kernel 3.1 renames lock-manager ops
Liu Xuezhao [Thu, 9 Aug 2012 02:37:39 +0000 (10:37 +0800)]
LU-1337 llite: kernel 3.1 renames lock-manager ops

Kernel 3.1 renames lock-manager ops (lock_manager_operations) from
fl_xxx to lm_xxx (commit 8fb47a4fbf858a164e973b8ea8ef5e83e61f2e50).

Add LC_LM_XXX_LOCK_MANAGER_OPS/HAVE_LM_XXX_LOCK_MANAGER_OPS to check.

Re-arrange several macro definitions in lustre-core.m4 to follow
kernel version order.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: Ic86ec9db2f8262ef7ab9f5f2fb51ca79591120a4
Reviewed-on: http://review.whamcloud.com/3579
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2145 target: use tgt_ prefix for target function
Mikhail Pershin [Mon, 15 Oct 2012 13:23:00 +0000 (17:23 +0400)]
LU-2145 target: use tgt_ prefix for target function

There are several prefixes in use: target_, lut_, tg_. Start using
the tgt_ prefix as the single one.

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I4e1f38b81a5311d56472162b8a1114a2aa252874
Reviewed-on: http://review.whamcloud.com/4273
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-1131 osd-ldiskfs: better journal credit tracking
Andreas Dilger [Wed, 3 Oct 2012 21:26:50 +0000 (15:26 -0600)]
LU-1131 osd-ldiskfs: better journal credit tracking

When running with a small MDT device during testing, it is possible to
overflow the reserved credit maximum for the journal.  Improve the
ldiskfs debugging for transaction credits so that it is possible to
better understand where the reserved credits are being used.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iea90b771a5e19190cc95cbf8f2f725bede500c1e
Reviewed-on: http://review.whamcloud.com/4282
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2154 osp: precreate logic to use last assigned id
Alex Zhuravlev [Wed, 17 Oct 2012 20:28:50 +0000 (00:28 +0400)]
LU-2154 osp: precreate logic to use last assigned id

instead of "next to assign". this change removes a few cases
where last-created-id could become less than next-to-assign,
resulting in invalid assertions.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: If41563ea2f227ea980e3017d9485cb9c7caccad5
Reviewed-on: http://review.whamcloud.com/4289
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
11 years agoLU-2138 osp: set expiration before RPC is sent
Alex Zhuravlev [Thu, 18 Oct 2012 12:30:42 +0000 (16:30 +0400)]
LU-2138 osp: set expiration before RPC is sent

osp_statfs_update() should set opd_statfs_fresh_till before
the request is sent. otherwise a race is possible where the
interpret function is called before osp_statfs_update()
sets opd_statfs_fresh_till to the "disable" value. the race can
result in suspended statfs updates, misguiding the object
allocation algorithm.
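
The ordering problem can be reproduced in a single-threaded Python model in which the reply handler runs as soon as the request is sent. Everything here is hypothetical apart from the ordering itself: the `Osp` class, the `DISABLED` marker, the `maxage` value, and the `send` callbacks.

```python
DISABLED = float("inf")   # hypothetical "no update is due" marker


class Osp:
    def __init__(self):
        self.fresh_till = 0.0


def interpret(osp, now, maxage=5.0):
    """Reply handler: schedules the next statfs update."""
    osp.fresh_till = now + maxage


def update_buggy(osp, now, send):
    send(lambda: interpret(osp, now))   # reply may be handled right here
    osp.fresh_till = DISABLED           # ...and is then overwritten


def update_fixed(osp, now, send):
    osp.fresh_till = DISABLED           # set the marker before sending
    send(lambda: interpret(osp, now))   # the reply handler has the last word


def send_immediately(callback):
    """Worst case for the race: the reply arrives before send() returns."""
    callback()
```

In the buggy ordering the marker is written last, so no further update is ever scheduled; in the fixed ordering the reply handler's value survives.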

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I2ff03a611267292d0cd6a465c1eb14023516234b
Reviewed-on: http://review.whamcloud.com/4294
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2145 target: move target code to the separate directory
Mikhail Pershin [Sun, 14 Oct 2012 11:58:13 +0000 (15:58 +0400)]
LU-2145 target: move target code to the separate directory

Create a target/ directory for unified target code and move
the existing code from ptlrpc/target.c there.

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: Id808ed3eb390dd051cbca0a3ef2bf02e5f5d722f
Reviewed-on: http://review.whamcloud.com/4258
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-1999 test: LW conn test should rely on debug logs
Johann Lombardi [Wed, 17 Oct 2012 21:39:50 +0000 (23:39 +0200)]
LU-1999 test: LW conn test should rely on debug logs

Recovery-small test 6 should rely on lustre debug logs instead of
dmesg since console messages are rate limited and might not be printed
as expected by the test.

The test should also be skipped if the server does not support
lightweight connections (i.e. is older than 2.3.50).

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I78994831506e632f1730eb6f80fe145c7fc2cf3e
Reviewed-on: http://review.whamcloud.com/4288
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1538 tests: fix test cases when OST is full
Andreas Dilger [Sat, 13 Oct 2012 21:04:44 +0000 (15:04 -0600)]
LU-1538 tests: fix test cases when OST is full

In sanity.sh test_101d() the test didn't check if "dd" failed to write
the full file size, and produced a confusing error about readahead
performance.

In sanityn.sh test_36() it also didn't check if "dd" failed to write
the full file size, and then multiop read was stuck in a loop of zero
length reads forever.  Fix both "dd" error checking, and multiop.
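
The read-side half of the fix amounts to treating a zero-length read as end of file. A Python sketch of such a loop follows; it is not the multiop source, and `read_exact` and its parameters are hypothetical.

```python
import io


def read_exact(stream, size, chunk=4096):
    """Read exactly `size` bytes, failing on EOF instead of spinning."""
    data = bytearray()
    while len(data) < size:
        buf = stream.read(min(chunk, size - len(data)))
        if not buf:
            # A zero-length read means EOF: the file is shorter than expected.
            raise EOFError(f"short file: got {len(data)} of {size} bytes")
        data += buf
    return bytes(data)
```

Without the EOF check, a file cut short by a full OST would keep the loop spinning on empty reads.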

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic4d5ec90d77b1a9302d3e8f128f292b3765611d7
Reviewed-on: http://review.whamcloud.com/4265
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2041 oi: keep oi mapping cache consistency
Fan Yong [Mon, 8 Oct 2012 06:33:29 +0000 (14:33 +0800)]
LU-2041 oi: keep oi mapping cache consistency

Sometimes the local ID in the per-RPC-thread OI mapping cache may be
changed while the FID of that cache entry has not been updated,
which causes the RPC thread to find an unexpected object with
the given FID. Fix this.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I90cc01601925ada08e0f021ba49e3310f10aed35
Reviewed-on: http://review.whamcloud.com/4208
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1347 ldlm: makes EXPORT_SYMBOL follows function body
Liu Xuezhao [Sun, 7 Oct 2012 11:40:36 +0000 (19:40 +0800)]
LU-1347 ldlm: makes EXPORT_SYMBOL follows function body

Make EXPORT_SYMBOL macros immediately follow the function body,
to follow normal Linux kernel coding style.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: I5644d908c652b2e34d81e923367af4b5728399e1
Reviewed-on: http://review.whamcloud.com/2837
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2012 tests: disable replay-dual test_14b
Andreas Dilger [Tue, 9 Oct 2012 22:41:25 +0000 (16:41 -0600)]
LU-2012 tests: disable replay-dual test_14b

Disable replay-dual.sh test_14b until OST gap handling is fixed.

The test is modified to make the pass/fail result more clear for when
it is re-enabled, since any blocks that are allocated on the OST while
the test is running will cause it to fail.  This could happen because
of updated config llog, OI updates, etc.  Allow some small margin of
space to be allocated on the OST before declaring failure.  To make
failed orphan handling totally clear, ensure the orphan object is
much larger than the margin of error.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib14ecdfa1f8fbd5807195acb60e5ba507f500c1e
Reviewed-on: http://review.whamcloud.com/4237
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2197 quota: don't send preacq if no usage and no activity
Johann Lombardi [Tue, 16 Oct 2012 16:50:57 +0000 (18:50 +0200)]
LU-2197 quota: don't send preacq if no usage and no activity

Slaves should not try to pre-acquire quota space when the ID
has no usage and there are no outstanding activities.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I5076910024deca2a1dc75189837dd2b6cdabe7bf
Reviewed-on: http://review.whamcloud.com/4279
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1770 ptlrpc: introducing OBD_CONNECT_FLOCK_OWNER flag
Iurii.Golovach [Tue, 16 Oct 2012 09:13:52 +0000 (12:13 +0300)]
LU-1770 ptlrpc: introducing OBD_CONNECT_FLOCK_OWNER flag

After the flock policy fix was applied to 1.8, users hit an issue
where 1.8 clients with the fixed flock policy were recognized
incorrectly by 2.x servers.
This flag identifies 1.8 clients with the fixed flock policy
so that 2.x servers can recognize the flock policy correctly.
Patches with the functional changes were attached for review in LU-1575.

Xyratex-bug-id: MRP-489
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Signed-off-by: Iurii Golovach <iurii_golovach@xyratex.com>
Change-Id: Id00b496ad3f556f99be5e9218497399f18a00357
Reviewed-on: http://review.whamcloud.com/3722
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2107 util: Enhance l_getidenty to leverage 'nscd' cacheing
Sergii Glushchenko [Wed, 10 Oct 2012 10:33:15 +0000 (13:33 +0300)]
LU-2107 util: Enhance l_getidenty to leverage 'nscd' cacheing

Replace getgrent() use with getgrouplist() in l_getidentity,
which would allow it to leverage the 'nscd' cache.

Xyratex-bug-id: MRP-643
Signed-off-by: Sergii Glushchenko <sergii_glushchenko@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Iurii Golovach <iurii_golovach@xyratex.com>
Change-Id: I593abaeefe02cdb2d4f8761124bdd48477a7a22a
Reviewed-on: http://review.whamcloud.com/4223
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Iurii Golovach <iurii.golovach@gmail.com>
11 years agoLU-2083 build: install git commit hooks automatically
Andreas Dilger [Wed, 3 Oct 2012 22:18:54 +0000 (16:18 -0600)]
LU-2083 build: install git commit hooks automatically

Install the Lustre Git commit hooks into .git/hooks/ by default when
autogen.sh is run, so that they are present when patches are being
committed.  This avoids the relatively common case where a new tree
is checked out by new or experienced developers and is missing the
commit hooks when patches are being submitted.

While the commit hooks are sure to be installed in any tree that
was built, this isn't a guarantee that the hooks will be installed in
every tree that has a commit, but it is very likely to be the case.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6a15420fb7a35b790c1e816c67e20a8004500c1e
Reviewed-on: http://review.whamcloud.com/4175
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2193 ofd: look up FID to destroy before locking
Andreas Dilger [Tue, 16 Oct 2012 05:15:17 +0000 (23:15 -0600)]
LU-2193 ofd: look up FID to destroy before locking

If the MDS is replaying object destroys after recovery, then it may
be trying to destroy non-existent objects.  This can provoke spurious
errors in lvbo_init() due to the inability to populate the lock LVB.
Rather than quiet the useful error message from lvbo_init(), instead
do the object lookup on the to-be-destroyed FID first.  If lookup
fails to find an object, skip the object locking entirely since it
isn't needed and would just flood the console after recovery.

During destroy RPCs from the MDS, the ELC buffer is always empty, so
short-circuit the initial lock cancellation attempt that is useless.
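
The lookup-before-lock order can be sketched as follows. This is a Python model only: `destroy_if_present`, the dictionary of objects, and the list that records locks are hypothetical stand-ins.

```python
def destroy_if_present(objects, fid, locks_taken):
    """Destroy `fid` if it exists; return True when something was destroyed."""
    if fid not in objects:
        # Lookup failed: skip locking entirely, there is nothing to protect
        # and no lock value block to populate.
        return False
    locks_taken.append(fid)    # stands in for taking the object lock
    del objects[fid]
    return True
```

A replayed destroy of an object that is already gone therefore returns quietly without touching the lock manager.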

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id6197f23773ea271e0cb0912b19585b3df500c1e
Reviewed-on: http://review.whamcloud.com/4276
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2150 ost: ost_brw_read() to ptlrpc_free_bulk_nopin()
Alex Zhuravlev [Thu, 11 Oct 2012 19:31:19 +0000 (23:31 +0400)]
LU-2150 ost: ost_brw_read() to ptlrpc_free_bulk_nopin()

since a643e38 (LU-2089) the OST does not pin pages involved in
BULKs: this is done to prevent get/put on pages which
were allocated as part of an order N (>1) allocation with a 0
refcounter. get/put on such a page leads to a warning from
the kernel. in the original patch one code path was not
fixed, so this patch completes the change.

also, to prevent confusion, the patch removes a couple of macros:
ptlrpc_free_bulk() and ptlrpc_prep_bulk_page(). so now the
caller should specify whether ptlrpc should reference pages
or not.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I8cf5f334e8f7edab0ad37678e1e8af18904a0be6
Reviewed-on: http://review.whamcloud.com/4256
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1303 lod: improvements and fixes
Alex Zhuravlev [Wed, 10 Oct 2012 09:52:41 +0000 (13:52 +0400)]
LU-1303 lod: improvements and fixes

- osp_statfs() returns -ENOTCONN if the corresponding OST is found
  not connected. this lets us remove a few additional checks in
  the allocation policy functions.

- struct obd_statfs gets new field: os_fprecreated
  LOD uses this to skip OSPs with no objects ready to use

- osp_statfs() returns number of already precreated objects
  in new os_fprecreated field

- OS_STATE_DEGRADED is ignored on the first 2 passes in RR policy

- lod_alloc_specific() to verify and skip OSPs already used in
  striping
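
How an allocator might use the first three items can be sketched in Python. This is a model, not the LOD code: the dictionaries and `usable_osps` are hypothetical, and only the -ENOTCONN return and the os_fprecreated field come from the list above.

```python
import errno


def osp_statfs(osp):
    """Model: fail with -ENOTCONN when the OST is not connected."""
    if not osp["connected"]:
        return -errno.ENOTCONN, None
    return 0, {"os_fprecreated": osp["precreated"]}


def usable_osps(osps):
    """Keep only OSPs that are connected and have precreated objects."""
    usable = []
    for osp in osps:
        rc, sfs = osp_statfs(osp)
        if rc != 0:
            continue                  # not connected: no separate check needed
        if sfs["os_fprecreated"] == 0:
            continue                  # nothing ready to use on this OSP
        usable.append(osp["name"])
    return usable
```

Folding the connection state into the statfs return code is what lets the allocation policy drop its own connectivity checks.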

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I86351bc1dcca7182bc5adf4eb3e03c054e33e95f
Reviewed-on: http://review.whamcloud.com/4242
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years agoLU-744 osc: add lru pages management - new RPC
Jinshan Xiong [Wed, 16 May 2012 03:11:37 +0000 (20:11 -0700)]
LU-744 osc: add lru pages management - new RPC

Add cache management at the OSC layer. This way we can control how much
memory can be used to cache lustre pages and avoid a complex solution
like the one we used in b1_8.

In this patch, admins can set how much memory will be used for caching
Lustre pages per file system. A self-adaptive algorithm is used to
balance that budget among OSCs.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I76c840aef5ca9a3a4619f06fcaee7de7f95b05f5
Reviewed-on: http://review.whamcloud.com/2514
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1717 ldlm: Fix rs_opc initialization
Li Wei [Mon, 15 Oct 2012 01:29:36 +0000 (09:29 +0800)]
LU-1717 ldlm: Fix rs_opc initialization

By the time target_send_reply() initializes rs_opc, rs_msg has not been
filled with a valid opc yet.  Following Oleg's suggestion on the Jira
ticket, this patch changes target_send_reply() to initialize rs_opc with
rq_reqmsg instead and silences a couple of related warnings that are of
only informative nature.

Change-Id: I4b96454e0bcf3dd0dc8f21b0de70a89ce37faacf
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4271
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-2097 quota: more ll_vfs_dq_init()
Niu Yawei [Mon, 15 Oct 2012 09:06:05 +0000 (05:06 -0400)]
LU-2097 quota: more ll_vfs_dq_init()

Call ll_vfs_dq_init() in several places to avoid the problem of
missing block accounting for existing inodes.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I1500aa1f75b6a6184d1b40877a69fabdf4fac130
Reviewed-on: http://review.whamcloud.com/4270
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1538 tests: use $TESTSUITE instead of $0
Andreas Dilger [Fri, 5 Oct 2012 03:49:17 +0000 (21:49 -0600)]
LU-1538 tests: use $TESTSUITE instead of $0

Use "$TESTSUITE" instead of "$0" in test script messages, since $0 is
a full pathname and clutters up the test logs.  Instead, $TESTSUITE is
only the test suite name, and is more compact.  Use it in all output.

This makes the output from complete() redundant, in that it prints
$TESTSUITE both as an argument and internally via the equals_msg()
function (via banner()), so remove the $TESTSUITE argument from all
callers of complete().

The equals_msg() function was only a thin wrapper around banner(), and
was only used by complete(), so remove it and call banner() directly.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6b3f256000317a17fdd2a361a38d4dfdda500c1e
Reviewed-on: http://review.whamcloud.com/4192
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2147 quota: several fixes to reintegration procedure
Johann Lombardi [Thu, 11 Oct 2012 13:55:58 +0000 (15:55 +0200)]
LU-2147 quota: several fixes to reintegration procedure

This patch gathers several fixes/improvements to the quota
reintegration procedure:
- do not set rq_no_resend & rq_no_delay for IT_QUOTA_CONN to have
  the reintegration thread waiting instead of stopping/starting
  the reint thread until the master is available
- add procfs tunable to force reintegration, it can be useful for
  testing, but also for fixing things at customer site when a bug
  was hit during reintegration.
- when transferring indexes, the per-page header isn't swabbed
- on index transfer, the hash value isn't sent any more (unlike on
  orion_quota) since we now use II_FL_NOHASH. As a consequence,
  qsd_reint_entries() shouldn't take the hash size into account
  when parsing a page containing key/record pairs.

This patch also:
- quiets many common messages which aren't real errors and shouldn't
  make it to the console
- fixes a bug in qmt_adjust_edquot() which does not check correctly
  whether the revoke timeout has elapsed
- changes test_6 to use a larger quota limit to prevent the edquot flag
  from being set by the QMT and causing the test to fail.
- removes temporary code from setup_quota() in t-f now that the
  new quota code is fully landed.
- re-enables ost-pool tests 23a & 23b

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I9af9a025faa1ef173810df647b93307e2139c6f9
Reviewed-on: http://review.whamcloud.com/4253
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1972 mdt: declare RPC handlers in a sane way
Andreas Dilger [Fri, 12 Oct 2012 02:02:48 +0000 (20:02 -0600)]
LU-1972 mdt: declare RPC handlers in a sane way

Declare the MDT RPC handlers in a way that they can be found when
searching for them, otherwise the code is completely opaque when
looking for the handler for an RPC (e.g. MDS_CLOSE or LDLM_ENQUEUE).

While it might make sense to have macros replace a lot of repetitive
code blocks, it doesn't make sense to chop up the RPC names so badly
that they can never be found through normal searching.  Rename the
RPC handler definition macros to have more meaningful names, and
remove unused macros and special cases where not strictly necessary.

Rename a few OBD_FAIL_LDLM_ error injection hooks instead of making
the macros more complex for a small number of use cases.  These names
are only used internally, even if the values are used in the tests.
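
As an illustration of the searchability problem described above, here is a
minimal sketch. Every name in it (the DEMO_ opcodes, DEF_OPAQUE,
DEF_SEARCHABLE, demo_dispatch) is hypothetical and is not taken from the MDT
code. It only shows why a macro that pastes an opcode together from
fragments hides the full name from a text search, while a macro that takes
the full name keeps it visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes and handlers, for illustration only. */
enum demo_opc { DEMO_MDS_CLOSE = 1, DEMO_LDLM_ENQUEUE = 2 };

int demo_mds_close(void)    { return 10; }
int demo_ldlm_enqueue(void) { return 20; }

struct demo_handler {
	enum demo_opc	 dh_opc;
	const char	*dh_name;
	int		(*dh_act)(void);
};

/* Opaque style: the opcode is pasted together from two fragments, so the
 * text "DEMO_MDS_CLOSE" never appears where the handler is declared. */
#define DEF_OPAQUE(prefix, opc, fn) \
	{ prefix ## _ ## opc, #prefix "_" #opc, fn }

/* Searchable style: the full opcode name is spelled out at the use site. */
#define DEF_SEARCHABLE(opc, fn) { opc, #opc, fn }

const struct demo_handler demo_opaque[] = {
	DEF_OPAQUE(DEMO_MDS, CLOSE, demo_mds_close),
	DEF_OPAQUE(DEMO_LDLM, ENQUEUE, demo_ldlm_enqueue),
};

const struct demo_handler demo_searchable[] = {
	DEF_SEARCHABLE(DEMO_MDS_CLOSE, demo_mds_close),
	DEF_SEARCHABLE(DEMO_LDLM_ENQUEUE, demo_ldlm_enqueue),
};

/* Look up the handler for an opcode and run it; -1 if none is found. */
int demo_dispatch(const struct demo_handler *tbl, size_t n, enum demo_opc opc)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].dh_opc == opc)
			return tbl[i].dh_act();
	return -1;
}
```

Both tables expand to the same entries, but only the second one can be found
by searching the source for DEMO_MDS_CLOSE.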

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8d4dc0709faeae4458c3563864268a00f8500c1e
Reviewed-on: http://review.whamcloud.com/4260
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2123 tests: sanity 57 and 129 can't detect fstype
James Simmons [Tue, 9 Oct 2012 14:36:12 +0000 (10:36 -0400)]
LU-2123 tests: sanity 57 and 129 can't detect fstype

In sanity tests 57a, 57b and 129 the function facet_type_fstype
is used to determine the file system backend. This function
does not actually exist, so these tests never ran for the
ldiskfs case. The fix is to use the proper function, which
is facet_fstype.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ia9766a3a2200c6ce2d48ff0265eb73a4d71c06e7
Reviewed-on: http://review.whamcloud.com/4230
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
11 years agoLU-2093 lod: fall back to RR allocation when QoS fails
Alex Zhuravlev [Wed, 10 Oct 2012 08:32:38 +0000 (12:32 +0400)]
LU-2093 lod: fall back to RR allocation when QoS fails

lod_alloc_qos() checks whether there are enough OSPs to satisfy the
request by querying OSP state with dt_statfs(), then tries to reserve
objects on some of them. During the reservation the state of an OSP
can change (due to a broken connection, for example), so the QoS code
might find fewer ready OSPs than required. This is a valid situation
and LOD should fall back to RR allocation.

sanity test 116a is added to verify this: dt_statfs() still reports
that the OSPs are good, but no object can actually be created on the
OSP with index 1.
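
The fallback described above can be sketched roughly as follows. This is not
the LOD code: struct demo_osp, the demo_alloc_*() functions, the use of
-EAGAIN as the "try RR instead" signal, and the choice to let the RR pass
accept fewer targets than requested are all assumptions made for the
illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-target state: what statfs reported earlier versus
 * whether an object can actually be reserved right now. */
struct demo_osp {
	int statfs_ok;
	int can_reserve;
};

/* QoS pass: insists on exactly 'want' targets.  Returns the number of
 * targets chosen, or -EAGAIN if the targets changed state under us. */
int demo_alloc_qos(const struct demo_osp *t, size_t n, size_t want,
		   size_t *out)
{
	size_t ready = 0, got = 0;

	for (size_t i = 0; i < n; i++)
		ready += t[i].statfs_ok ? 1 : 0;
	if (ready < want)
		return -EAGAIN;

	for (size_t i = 0; i < n && got < want; i++)
		if (t[i].statfs_ok && t[i].can_reserve)
			out[got++] = i;

	/* Fewer usable targets than statfs promised: a valid situation. */
	return got == want ? (int)got : -EAGAIN;
}

/* Round-robin pass: walks targets from 'start' and takes what it can. */
int demo_alloc_rr(const struct demo_osp *t, size_t n, size_t want,
		  size_t start, size_t *out)
{
	size_t got = 0;

	for (size_t k = 0; k < n && got < want; k++) {
		size_t i = (start + k) % n;

		if (t[i].can_reserve)
			out[got++] = i;
	}
	return got > 0 ? (int)got : -ENOSPC;
}

/* Try QoS first and fall back to round-robin when it gives up. */
int demo_alloc(const struct demo_osp *t, size_t n, size_t want, size_t *out)
{
	int rc = demo_alloc_qos(t, n, want, out);

	if (rc == -EAGAIN)
		rc = demo_alloc_rr(t, n, want, 0, out);
	return rc;
}
```

With three targets that all look healthy to statfs but where index 1 cannot
reserve an object, the QoS pass gives up and the round-robin pass still
returns targets 0 and 2.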

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Iae6916f998070960eb47c71f2bc1e48adb2ac080
Reviewed-on: http://review.whamcloud.com/4241
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1595 build: allow longer component in summary
Andreas Dilger [Wed, 3 Oct 2012 21:50:08 +0000 (15:50 -0600)]
LU-1595 build: allow longer component in summary

Allow a longer "component:" field in the commit summary message, up to
11 characters, to accommodate osd-ldiskfs, which exceeded the previous
9-character limit.
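
A minimal sketch of such a length check is shown below. The summary layout
("TICKET component: text"), the function name demo_component_ok() and the
constant DEMO_COMPONENT_MAX are assumptions for illustration; the commit
does not show how the actual check is implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit, taken from the 11 characters mentioned above. */
#define DEMO_COMPONENT_MAX 11

/* Return true if the text between the first space and the following
 * colon (the assumed "component" field) is 1..DEMO_COMPONENT_MAX long. */
bool demo_component_ok(const char *summary)
{
	const char *space = strchr(summary, ' ');
	const char *colon;
	size_t len;

	if (space == NULL)
		return false;
	colon = strchr(space + 1, ':');
	if (colon == NULL)
		return false;

	len = (size_t)(colon - (space + 1));
	return len > 0 && len <= DEMO_COMPONENT_MAX;
}
```

Under these assumptions, "osd-ldiskfs" (11 characters) passes, while it
would have been rejected with a limit of 9.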

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id0ed056e9b9efef07b26efeb9e2f4f1e8d500c1e
Reviewed-on: http://review.whamcloud.com/4174
Tested-by: Hudson
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2163 lprocfs: fix jobstats initialization race
Andreas Dilger [Fri, 12 Oct 2012 21:23:13 +0000 (15:23 -0600)]
LU-2163 lprocfs: fix jobstats initialization race

If two threads are racing to add the same jobid into the job stats
list in lprocfs_job_stats_log(), one thread will lose the race from
cfs_hash_findadd_unique() and enter the "if (job != job2)" case.  It
could fail LASSERT(!cfs_list_empty(&job->js_list)) depending on whether
the other thread in "else" added "job2" to the list first or not.

Simply locking the check for cfs_list_empty(&job->js_list) is not
sufficient to fix the race.  There would need to be locking over the
whole cfs_hash_findadd_unique() and cfs_list_add() calls, but since
ojs_lock is global for the whole OST this may have performance costs.

Instead, just remove the LASSERT() entirely, since it provides no
value, and the "losing" thread can happily use the job_stat struct
immediately since it was fully initialized in job_alloc().

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iecb17e2dc80621fd388295998df5708bcaabcab0
Reviewed-on: http://review.whamcloud.com/4263
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-744 llite: reimplement ll_get_fsname()
Jinshan Xiong [Fri, 17 Aug 2012 00:39:33 +0000 (17:39 -0700)]
LU-744 llite: reimplement ll_get_fsname()

ll_get_fsname() used to allocate a piece of memory to store the fsname.
This is unnecessary and error-prone because it requires the caller
to free that piece of memory.
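
One common way to avoid that kind of allocation is to have the caller pass
in the buffer. The sketch below shows the pattern only; demo_get_fsname(),
its parameters, and the "fsname-suffix" input format are hypothetical and
are not the signature or behaviour of the reimplemented ll_get_fsname().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the part of 'profile' before its last '-' into the caller's
 * buffer, truncating if needed.  Nothing is allocated, so there is
 * nothing for the caller to free.  Returns 'buf' for convenience. */
char *demo_get_fsname(const char *profile, char *buf, size_t buflen)
{
	const char *dash;
	size_t len;

	assert(profile != NULL && buf != NULL && buflen > 0);

	dash = strrchr(profile, '-');
	len = dash != NULL ? (size_t)(dash - profile) : strlen(profile);
	if (len >= buflen)
		len = buflen - 1;

	memcpy(buf, profile, len);
	buf[len] = '\0';
	return buf;
}
```

A caller would typically declare a small array on its own stack, for example
`char name[16];`, and pass `name` and `sizeof(name)`.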

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ibc077b46728a1358e51a345ec13c966fc947c428
Reviewed-on: http://review.whamcloud.com/3704
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2037 tests: Wait for devices to initialize on test setup
michael.mckay [Thu, 27 Sep 2012 14:28:03 +0000 (10:28 -0400)]
LU-2037 tests: Wait for devices to initialize on test setup

Fix an issue where we do not wait for a device to
initialize before getting the label. This label then does
not correspond to an actual device.
A check is now done on the label name to see whether it
contains 'ffff', which signifies that the device has not
finished initializing yet. In this case we wait and retry
after 1, 3, 5 and 10 seconds, or until the command succeeds.
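
The retry schedule described above can be sketched like this. The callback
types, demo_wait_for_label(), the two test doubles and the sample label
strings are all made up for the illustration; the sleep function is passed
in so that the sketch can be exercised without really waiting.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callbacks: one reads the label, one waits. */
typedef int (*demo_label_fn)(char *buf, size_t len, void *arg);
typedef void (*demo_sleep_fn)(unsigned int seconds, void *arg);

/* Read the label; while it still contains "ffff", wait 1, 3, 5 and then
 * 10 seconds between attempts.  Returns 0 once a usable label is read,
 * or -1 if the label is still not ready after the last wait. */
int demo_wait_for_label(demo_label_fn get, demo_sleep_fn nap, void *arg,
			char *buf, size_t len)
{
	static const unsigned int delays[] = { 1, 3, 5, 10 };
	const size_t ndelays = sizeof(delays) / sizeof(delays[0]);

	for (size_t i = 0; ; i++) {
		if (get(buf, len, arg) == 0 && strstr(buf, "ffff") == NULL)
			return 0;
		if (i == ndelays)
			return -1;
		nap(delays[i], arg);
	}
}

/* Test doubles: the fake device becomes ready after 'ready_after' reads,
 * and the fake sleep only records how long it was asked to wait. */
struct demo_fake {
	int		calls;
	int		ready_after;
	unsigned int	slept;
};

int demo_fake_label(char *buf, size_t len, void *arg)
{
	struct demo_fake *f = arg;

	snprintf(buf, len, "%s", f->calls++ < f->ready_after ?
		 "lustre-OSTffff" : "lustre-OST0000");
	return 0;
}

void demo_fake_sleep(unsigned int seconds, void *arg)
{
	((struct demo_fake *)arg)->slept += seconds;
}
```

In real use the sleep callback would wrap something like sleep(3) and the
label callback would run the actual label query.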

Xyratex-bug-id: MRP-546
Reviewed-by: Iurii Golovach <Iurii_Golovach@xyratex.com>
Reviewed-by: Kyrylo Shatskyy <Kyrylo_Shatskyy@xyratex.com>
Signed-off-by: Michael McKay <michael_mckay@xyratex.com>
Change-Id: I01e045227a4b3b6e007dcc9685238a5425cdffe8
Reviewed-on: http://review.whamcloud.com/4111
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-823 compat: Don't use cfs_ functions on kernel structures
Christopher J. Morrone [Fri, 4 Nov 2011 22:59:11 +0000 (15:59 -0700)]
LU-823 compat: Don't use cfs_ functions on kernel structures

It is not really correct to use the generic "cfs_" prefixed
locking functions on the Linux kernel's data structures.  Revert
back to using the Linux locking function.

Change-Id: I64619c4b4f4963634b3d1e43c1b1519598e65e8d
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4176
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
11 years agoLU-2142 scrub: reset completed scrub position if retrigger
Fan Yong [Thu, 11 Oct 2012 09:42:29 +0000 (17:42 +0800)]
LU-2142 scrub: reset completed scrub position if retrigger

If a previous OI scrub has completed and the user wants to run
the OI scrub again, reset the OI scrub so that it rescans
the device from the beginning.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I5b8e9ee51ccbf95ed131b963389c4ecfb92b9035
Reviewed-on: http://review.whamcloud.com/4250
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2144 utils: reset 'optind' to avoid segmentation fault
Fan Yong [Thu, 11 Oct 2012 09:12:12 +0000 (17:12 +0800)]
LU-2144 utils: reset 'optind' to avoid segmentation fault

Sometimes the lfsck_{start,stop} commands may be called several
times within the same lctl shell. In that case, the getopt_long()
call inside lfsck_start/lfsck_stop may get confused and cause a
segmentation fault. So forcibly reset the external variable
'optind' to avoid the segmentation fault.
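
The general pattern can be sketched as follows. demo_parse_number() and its
-n/--number option are hypothetical and unrelated to the real lfsck
options. The commit message does not say which value 'optind' is reset to;
this sketch uses the traditional value 1, while glibc documents 0 as the
way to force a full reinitialization.

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sub-command parser that may be called many times within
 * one process.  Returns the value given to -n/--number, or -1 if the
 * option is absent or an unknown option is seen. */
int demo_parse_number(int argc, char *const argv[])
{
	static const struct option long_opts[] = {
		{ "number", required_argument, NULL, 'n' },
		{ NULL, 0, NULL, 0 },
	};
	int number = -1;
	int opt;

	/* getopt_long() keeps its scan position in the global 'optind'.
	 * Without this reset, a second call would resume where the first
	 * one stopped and could index past the end of the new argv. */
	optind = 1;

	while ((opt = getopt_long(argc, argv, "n:", long_opts, NULL)) != -1) {
		if (opt != 'n')
			return -1;
		number = atoi(optarg);
	}
	return number;
}
```

Calling the function repeatedly with different argument vectors then parses
each vector from its start.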

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I35b635eac44854ae4b17ae00bed778320dbe9d9e
Reviewed-on: http://review.whamcloud.com/4248
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2140 test: add fake nid with proper nettype
Niu Yawei [Thu, 11 Oct 2012 06:23:09 +0000 (02:23 -0400)]
LU-2140 test: add fake nid with proper nettype

The fake nid should be added with proper nettype.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I490f80328d8210e8eca9ccd8484fa4d7717c7429
Reviewed-on: http://review.whamcloud.com/4247
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1981 doc: reorganize osd API documentation
Johann Lombardi [Tue, 9 Oct 2012 22:32:16 +0000 (00:32 +0200)]
LU-1981 doc: reorganize osd API documentation

Move osd-api.txt to lustre/doc and reorganize the document.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I76fe29e36a325c0687c84389b9ed98fbbeb3b85c
Reviewed-on: http://review.whamcloud.com/4236
Tested-by: Hudson
Reviewed-by: Ian Colle <Ian.Colle@intel.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2059 tests: skip local config tests only on ZFS
Andreas Dilger [Fri, 5 Oct 2012 03:04:34 +0000 (21:04 -0600)]
LU-2059 tests: skip local config tests only on ZFS

Only skip conf-sanity.sh (5d, 19b, 21b, 27a) and insanity.sh (2, 4)
when the backing OST filesystem type is ZFS, not for ldiskfs.  The
support for locally-cached config llogs is not implemented for ZFS
yet, so ZFS OSTs cannot be started without the MGS.

The ZFS local config support is in progress and these tests will be
re-enabled as part of the landing.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8c49483401a7132ce09b93aa5d93610c4d500c1e
Reviewed-on: http://review.whamcloud.com/4234
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1458 test: enable lustre_rsync debug log dump
Bobi Jam [Mon, 27 Aug 2012 16:41:33 +0000 (00:41 +0800)]
LU-1458 test: enable lustre_rsync debug log dump

* Make lustre_rsync dump its debug log to help debugging.
* Add debug messages in lr_move().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I0f26322b3a4677bcb1b09d09e0e7c0ea1b4dbe3d
Reviewed-on: http://review.whamcloud.com/3795
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2088 utils: drop obsolete debug symbols
Andreas Dilger [Thu, 4 Oct 2012 20:08:49 +0000 (14:08 -0600)]
LU-2088 utils: drop obsolete debug symbols

Remove obsolete modules from the list of debugging symbols:
llite, smfs, fsfilt_ext3, fsfilt_reiserfs, fsfilt_smfs,
mds_ext3, cobd, cmobd

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I47b60934fd0ceb58b6bae77306b60d28d2300c1e
Reviewed-on: http://review.whamcloud.com/4188
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1757 brw: added OBD short io connect flag
Alexander.Boyko [Wed, 10 Oct 2012 08:23:08 +0000 (12:23 +0400)]
LU-1757 brw: added OBD short io connect flag

Add the flag now to prevent collisions with any future flags needed in
features written against this branch.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: I567020c24ab64c4faeed159ef8c6814e74f73503
Reviewed-on: http://review.whamcloud.com/3891
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1095 ptlrpc: improve ptlrpc debug message consistency
Ned Bass [Mon, 6 Aug 2012 18:49:27 +0000 (11:49 -0700)]
LU-1095 ptlrpc: improve ptlrpc debug message consistency

Enforce the following conventions for better consistency
in a few ptlrpc/target.c debug messages.

- Print each message on a single line for better grep results.

- Provide a distinctive message for different functions to
  reduce appearance of redundancy.

- Print device name at the start, otherwise on systems with many
  targets it isn't easy to tell which one was involved.

- Print rc at the end.
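
A message following these conventions could be built as in the sketch
below. demo_format_msg() and its parameters are hypothetical, and the exact
"device: text: rc = N" layout is an assumption; only the device-first,
rc-last and single-line rules come from the list above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format "<device>: <what>: rc = <rc>" into buf on a single line.
 * Returns what snprintf() returns (the untruncated length). */
int demo_format_msg(char *buf, size_t len, const char *dev,
		    const char *what, int rc)
{
	int n = snprintf(buf, len, "%s: %s: rc = %d", dev, what, rc);

	if (n < 0 || len == 0)
		return n;

	/* Keep the message on one line so that grep shows all of it. */
	for (char *p = buf; (p = strchr(p, '\n')) != NULL; )
		*p = ' ';
	return n;
}
```

Putting the device name first makes it possible to filter a log by target,
and the fixed "rc = " suffix gives a stable pattern to search for.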

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: Ibd203367dde4d95d32671217271420c57a8dc0ad
Reviewed-on: http://review.whamcloud.com/3547
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>