Whamcloud - gitweb
fs/lustre-release.git
chas williams - CONTRACTOR [Tue, 14 Aug 2012 14:42:25 +0000 (10:42 -0400)]
LU-1337 osc: fix -Werror=unused-result

Newer Fedora kernels build using -Werror=unused-result.  It appears
that GOTO() isn't correctly assigning rc in this instance.  The
unused PTR_ERR() result generates a warning, which is upgraded to an
error.

Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Change-Id: I66d730d4d0e20f0f1c7671dc00acefdf7ed1fbe9
Reviewed-on: http://review.whamcloud.com/3638
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
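
As a rough illustration of the LU-1337 osc fix above, the sketch below shows how a function marked warn_unused_result interacts with -Werror=unused-result. The names sketch_err_ptr(), sketch_ptr_err() and sketch_lookup_status() are hypothetical user-space stand-ins for ERR_PTR()/PTR_ERR() and a caller; the actual osc change is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for ERR_PTR(): encode a negative errno in a pointer. */
static inline void *sketch_err_ptr(long err)
{
	assert(err < 0 && err >= -4095);
	return (void *)(intptr_t)err;
}

/*
 * Hypothetical stand-in for PTR_ERR(). Because of warn_unused_result,
 * calling this and discarding the value raises -Wunused-result, which
 * -Werror=unused-result turns into a build failure.
 */
__attribute__((warn_unused_result))
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Assign the result explicitly so the value is visibly consumed. */
long sketch_lookup_status(const void *result)
{
	long rc;

	rc = sketch_ptr_err(result);
	return rc;
}
```

Assigning to rc in a plain statement, rather than relying on a macro argument to perform the assignment, is one way to keep the compiler satisfied; whether the real patch took exactly this form is an assumption.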
Wei3 Liu [Tue, 30 Oct 2012 21:42:56 +0000 (14:42 -0700)]
LU-845 tests: automate large LUN testing

a. run llverdev on the raw device to verify there is no driver issue
b. run llverfs on OST ldiskfs filesystem
c. use up free inodes on the OST with mdsrate
d. run llverfs on lustre filesystem

Change-Id: I021009647d2053fa53cff1067f8f2bc83d12ce45
Signed-off-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1700
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Thu, 18 Oct 2012 10:10:09 +0000 (18:10 +0800)]
LU-1279 utils: mount.lustre load ptlrpc module if necessary

When the LNET modules have not been loaded, mounting multiple targets
at the same time could fail. Use mount.lustre to load the network
modules if necessary.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d7a4007cc5b233055a4a985237b01ff0874cf54
Reviewed-on: http://review.whamcloud.com/4292
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Andriy Skulysh [Thu, 18 Oct 2012 10:29:31 +0000 (13:29 +0300)]
LU-1169 mgs: Fix race during new fsdb creation.

Lock fsdb_mutex until the fsdb is loaded from llogs.
This fixes a race between loading data from the llog into the fsdb
and obtaining data from it.

Xyratex-bug-id: MRP-230
Signed-off-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Bruce Korb <Bruce_Korb@us.xyratex.com>
Change-Id: I8c29040a182f363e83e61e57d3e20756f40300ea
Reviewed-on: http://review.whamcloud.com/2251
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 25 Oct 2012 09:59:39 +0000 (13:59 +0400)]
LU-2226 osp: dump statfs data via lprocfs

Register another set of variables to be accessed with
data=dt device. Use the existing lprocfs_osd_rd_*() helpers.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ib2fed358866847d8abb0e818c1d40494c0642681
Reviewed-on: http://review.whamcloud.com/4390
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Niu Yawei [Mon, 15 Oct 2012 03:42:01 +0000 (23:42 -0400)]
LU-2152 iam: it->load fix

Current iam it->load for lfix doesn't work properly because
iam_lfix_ilookup() isn't implemented at all.

This patch also adds one more reintegration test for quota to
test the global index transfer in multiple bulks, and adds a proc
entry for the global index copy so that the limits on slaves can be
verified easily.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ifb1dca0551b2aa4db3d37ff4ac6b3fcded34b7cc
Reviewed-on: http://review.whamcloud.com/4266
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Johann Lombardi [Fri, 19 Oct 2012 13:59:12 +0000 (15:59 +0200)]
LU-2211 quota: cap how long a thread can wait for quota

Change qsd_op_begin() path to wait for quota space for less than
obd_timeout / 2.
This patch also abandons the qsd_ops enum in favor of a more generic
qsd_adjust() implementation which will always do the same processing
even if adjustment is delayed because of a quota request in flight.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I5faf637c5330ca7f503c292e0e28edb84458ee89
Reviewed-on: http://review.whamcloud.com/4314
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Hongchao Zhang [Tue, 23 Oct 2012 12:00:17 +0000 (20:00 +0800)]
LU-921 llite: warning in case of discarding dirty pages

When a client is evicted, dirty pages may get silently discarded, and
the caller of a successful write(2) will not know that the data it
wrote has been discarded due to eviction before it could be flushed
to the OSS.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: Iecfbf096548ff08cdd6064d53ad8c688343fcddc
Reviewed-on: http://review.whamcloud.com/1908
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
michael.mckay [Tue, 4 Sep 2012 15:31:45 +0000 (11:31 -0400)]
LU-1822 llite:  Remove deprecated truncate handler

Remove the ll_truncate handler. This handler was only being used
to display a debug message about the truncated object. That line
was moved to a different location, and the handler was removed.
This handler is an issue in kernels after 2.6.34 when running the
patchless client. In those kernels, the kernel will log a
warning if it is called and the inode has a handler for truncate.
The truncate logic was updated some time ago to be more
consistent with the new sequence of events.

Xyratex-bug-id: MRP-597
Reviewed-by: Alexander Zarochentsev <Alexander_Zarochentsev@xyratex.com>
Reviewed-by: Iurii Golovach <iurii_golovach@xyratex.com>
Signed-off-by: Michael McKay <michael_mckay@xyratex.com>
Change-Id: I77b372a2825fd2bdc4b215ee20a979f03dc7d64b
Reviewed-on: http://review.whamcloud.com/3860
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Iurii Golovach <iurii.golovach@gmail.com>
Refs: 2.3.54 v2_3_54 v2_3_54_0
Oleg Drokin [Mon, 29 Oct 2012 06:47:01 +0000 (02:47 -0400)]
New tag 2.3.54

Change-Id: I0c6415d7924ee83c11a5e383915d06fca41ccf2a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Peng Tao [Tue, 18 Sep 2012 10:57:53 +0000 (18:57 +0800)]
LU-1337 llite: ll_inode_permission should check RCU walk

For >3.1 kernels, the RCU flag is folded into the mask field.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Icc6751493e7359646cb6bd84b3ac05de167e4d88
Reviewed-on: http://review.whamcloud.com/4039
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Liu Xuezhao <xuezhao.liu@emc.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
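
The RCU-walk check described in the ll_inode_permission change above can be sketched in user space as follows. This is only an illustration under stated assumptions: SKETCH_MAY_READ, SKETCH_MAY_NOT_BLOCK and sketch_permission() are hypothetical names standing in for the kernel's mask bits and for a permission handler, and returning -ECHILD is assumed to be how a handler asks the VFS to retry in blocking mode.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mask bits, standing in for the kernel's permission flags. */
#define SKETCH_MAY_READ       0x04
#define SKETCH_MAY_NOT_BLOCK  0x80

/* Permission handler sketch: bail out early when called in RCU-walk mode. */
int sketch_permission(int mask)
{
	/* This sketch only models the two bits defined above. */
	assert((mask & ~(SKETCH_MAY_READ | SKETCH_MAY_NOT_BLOCK)) == 0);

	/* The "RCU walk" flag now travels inside mask, not in a separate field. */
	if (mask & SKETCH_MAY_NOT_BLOCK)
		return -ECHILD;

	/* Work that may sleep (for example, revalidation) would happen here. */
	return 0;
}
```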
Peng Tao [Tue, 25 Sep 2012 11:16:14 +0000 (19:16 +0800)]
LU-812 llite: 3.0+ kernel fsync should call write

Since 3.0, the kernel pushes i_mutex and fsync down to the
filesystem's fsync callback, so Lustre should check for this and do
the same. Otherwise there might be data corruption, and sanity test
63b will fail.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I2f2f6792276eaf6783bffb813f3c3e5405be0450
Reviewed-on: http://review.whamcloud.com/4091
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Sebastien Buisson [Tue, 11 Sep 2012 14:43:33 +0000 (16:43 +0200)]
LU-1889 build: fix false 'uninitialized scalar variable' errs

Fix false 'uninitialized scalar variable' errors found by Coverity
version 6.0.3:
Uninitialized scalar variable (UNINIT)
Using uninitialized value, element or field when calling function.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I83a7dd3ae4a027bf0ebced572245bc4fff35e119
Reviewed-on: http://review.whamcloud.com/3939
Tested-by: Hudson
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Fri, 7 Sep 2012 13:59:51 +0000 (15:59 +0200)]
LU-1857 build: fix 'Unbounded source buffer' errors

Fix 'unbounded source buffer' defects found by Coverity version 6.0.3:
Unbounded source buffer (STRING_SIZE)
Passing string of unknown size to a function that expects
a string of a particular size.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I18e51f04e62241b5c5dad7ae963d8070d6954dd4
Reviewed-on: http://review.whamcloud.com/3904
Tested-by: Hudson
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
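
For readers unfamiliar with the STRING_SIZE defect class named in the LU-1857 change above, a minimal sketch of a bounded copy follows. The helper sketch_copy_name() is hypothetical and is not taken from the patch; it simply checks the source length against the destination size before copying, instead of passing a string of unknown size to an unbounded copy such as strcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, or -1 (leaving dst untouched) if src is too long.
 */
int sketch_copy_name(char *dst, size_t dst_size, const char *src)
{
	size_t len;

	assert(dst != NULL && src != NULL && dst_size > 0);

	len = strlen(src);
	if (len >= dst_size)
		return -1;

	memcpy(dst, src, len + 1);
	return 0;
}
```

Rejecting an oversized source, rather than silently truncating it, is a design choice of this sketch; the actual fixes may have used a different bounded routine.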
James Simmons [Thu, 18 Oct 2012 12:26:16 +0000 (08:26 -0400)]
LU-1526 utils: Supply default MDT index

In preparation for DNE, an index has become a requirement
for MDTs, and with the latest Lustre you can't mount
an MDT that was not formatted with an index. While mount
has this requirement, mkfs.lustre has a bug that allows
you to format an MDS without an index and does not even
warn the user. At the same time, mkfs.lustre has to handle
the case where a user does not supply an index, since it
was not required in earlier Lustre releases. This patch
addresses the problem by supplying a default index of
zero for the MDT if no index is supplied to mkfs.lustre,
and warns the user that they must supply an index in the
future.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I45932321885856d97b10630a0667e8338822b199
Reviewed-on: http://review.whamcloud.com/4293
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Peng Tao [Thu, 20 Sep 2012 09:09:49 +0000 (17:09 +0800)]
LU-2019 llite: update i_flags in ll_iocontrol properly

When the client has an lsm, we still need to update the cached
i_flags. Otherwise i_flags is out of sync.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I7fcb84da82129238f327885a0fc5827fcac90a8d
Reviewed-on: http://review.whamcloud.com/4078
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Peng Tao [Thu, 16 Aug 2012 07:59:21 +0000 (15:59 +0800)]
LU-1756 kernel: clean up lustre_compat25.h

1. unused functions:
   mapping_has_pages(), ll_call_writepage(), __set_page_ll_data(),
   ll_invalidate_inode_pages(), CheckWriteback(), KIOBUF_GET_BLOCKS()
2. rename ll_vfs_create to vfs_create
3. remove kdev_t related macros
4. move cfs_cleanup_group_info() to lustre_common.h
5. remove kiobuf
6. move ll_inode_blksize() to lustre_common.h
7. drop LL_RENAME_DOES_D_MOVE

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: Ic5e29e399e70ccd04cbe1448f3c6cfc3a258289b
Reviewed-on: http://review.whamcloud.com/3686
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Fan Yong [Mon, 22 Oct 2012 17:12:05 +0000 (01:12 +0800)]
LU-2213 scrub: stop LFSCK before osd_shutdown

The osd_shutdown will clean up all the otable-based iteration,
but the upper-layer LFSCK depends on the otable-based iteration.

So we need to stop the LFSCK before osd_shutdown is called.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I97625d54766122314630aff0069d9e14d23b9840
Reviewed-on: http://review.whamcloud.com/4217
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Brian Behlendorf [Thu, 25 Oct 2012 05:45:40 +0000 (22:45 -0700)]
LU-2224 osd-zfs: Fix osd_commit_async() locking

The ZFS osd_commit_async() function never properly acquires the
tx->tx_sync_lock() mutex to protect the tx_state_t.  However,
the mutex is correctly dropped so we just add the obviously
missing mutex_enter().

Change-Id: Iae426feaeb5885034515d6bf0ccb9509ed098bb0
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4383
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
wangdi [Sat, 27 Oct 2012 22:05:56 +0000 (15:05 -0700)]
LU-2216 mdt: remove obsolete DNE code

1. remove split checking and cross-ref code from DNE.
2. remove IAM code on ldiskfs and utils.
3. remove cmm directory.

Change-Id: I0c81d753462863706e8918393369dde94a45030c
Signed-off-by: Wang Di <di.wang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4353
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jinshan Xiong [Sun, 21 Oct 2012 00:26:30 +0000 (17:26 -0700)]
LU-2179 osc: truncate partial page correctly

If a partial page is being truncated, the corresponding osc extent
should be held until the truncate has finished.

Debug patch for osc_extent_wait(); also, don't wait for completion
of an RPC if it has not even been sent in truncate.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I96a5ec1fdbb3133c735ebdfdd0330a45a2a8ab1a
Reviewed-on: http://review.whamcloud.com/4317
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Thu, 18 Oct 2012 18:38:58 +0000 (22:38 +0400)]
LU-2173 lod: QoS code to give up if no good OSP found

on any iteration. This code was removed by mistake in
commit 03b988a (LU-2093).

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ifa0d3a5ceeaaf84d3ec49e39bd2f337414a216ce
Reviewed-on: http://review.whamcloud.com/4300
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Nikitas Angelinas [Thu, 25 Oct 2012 09:04:20 +0000 (10:04 +0100)]
LU-2219 ptlrpc: so_hpreq_handler is set twice for the ost_io svc

ptlrpc_service_conf.psc_ops.so_hpreq_handler is set twice for
the ost_io service in ost_setup(); the second assignment
overwrites the first with NULL, so ost_io threads would never
handle RPCs as high-priority ones.

While we are at it, remove some superfluous assignments of
so_hpreq_handler to NULL for statically allocated
ptlrpc_service_conf structs when initializing other ptlrpc
services, and rename some relevant functions.

Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Change-Id: Ia728a3d7f20511fcb58b259126b05055d5860455
Xyratex-bug-id: MRP-724
Reviewed-on: http://review.whamcloud.com/4368
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
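
The double assignment described in the LU-2219 change above can be sketched as follows. The field names psc_ops and so_hpreq_handler come from the commit message; the struct layouts, sketch_hpreq_handler() and the two setup functions are hypothetical simplifications, not the real ost_setup() code.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sketch_hpreq_handler_t)(void *req);

/* Simplified, hypothetical versions of the service configuration structs. */
struct sketch_svc_ops {
	sketch_hpreq_handler_t so_hpreq_handler;
};

struct sketch_svc_conf {
	struct sketch_svc_ops psc_ops;
};

/* Placeholder high-priority request handler. */
static inline int sketch_hpreq_handler(void *req)
{
	(void)req;
	return 0;
}

/* Buggy pattern: the second assignment silently clobbers the first. */
void sketch_setup_buggy(struct sketch_svc_conf *conf)
{
	assert(conf != NULL);
	conf->psc_ops.so_hpreq_handler = sketch_hpreq_handler;
	/* ... other fields would be filled in here ... */
	conf->psc_ops.so_hpreq_handler = NULL;
}

/* Fixed pattern: the handler is assigned exactly once. */
void sketch_setup_fixed(struct sketch_svc_conf *conf)
{
	assert(conf != NULL);
	conf->psc_ops.so_hpreq_handler = sketch_hpreq_handler;
}
```

With the buggy setup the service ends up with no high-priority handler at all, which matches the symptom the commit describes.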
Alex Zhuravlev [Mon, 22 Oct 2012 13:35:53 +0000 (17:35 +0400)]
LU-2214 lod: fix tricky iterator methods

Instead of bypassing the LOD layer in the iterator methods,
give LOD its own iterator structure, which keeps references
to the object and to the iterator of the layer below.

This also lets LOD have different iterators in different
objects, which is required for DNE.

To verify the approach, lfsck goes through LOD now.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I62935319a686f4b06b2cdf5ea4002a800c0c430d
Reviewed-on: http://review.whamcloud.com/4370
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Di [Sat, 15 Sep 2012 14:34:15 +0000 (07:34 -0700)]
LU-1571 mdt: Do not update xid for open replay req

Do not update last_xid for open replay req,
otherwise the following resend (after replay)
cannot be matched with the correct xid.

Remove unnecessary mti_transno zero check in
mdt_empty_transno.

Signed-off-by: wang di <di.wang@whamcloud.com>
Change-Id: I2a05f3ac05b301ae31641a1dc51f8c4eed96427d
Reviewed-on: http://review.whamcloud.com/3195
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Niu Yawei [Mon, 15 Oct 2012 07:47:23 +0000 (03:47 -0400)]
LU-2174 test: improve error message

In sanity-quota.sh, if the test user/group does not exist, print an
error message telling the user to create them.

Check free space for test_0.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ie08250d665b305b140315f76391fd5161a6fbdd5
Reviewed-on: http://review.whamcloud.com/4268
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Sat, 13 Oct 2012 16:51:35 +0000 (12:51 -0400)]
LU-2167 ptlrpc: Fix use after free in ptlrpcd on termination

Do not use pc after signalling completion of its use,
since it will be freed afterwards.

Change-Id: Id20e8d188fea77f23a52e9a374e7e5e84fe3ad4b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4264
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
James Simmons [Tue, 16 Oct 2012 11:44:03 +0000 (07:44 -0400)]
LU-1930 build: Back end file system fixes.

Currently Lustre for ZFS requires the zfs development
rpm for its userland support to be installed on the
build machine so we can create the lustre zfs utilities.
This patch allows a user to build against the zfs/spl
source drops as well as the rpms.
A workaround is provided so we can point the lustre
build system to where a user can temporarily install the
zfs userland headers.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I8e5586ed22956a9dd4799826a442b8f5a895d872
Reviewed-on: http://review.whamcloud.com/3980
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Liu Xuezhao [Thu, 9 Aug 2012 02:37:39 +0000 (10:37 +0800)]
LU-1337 llite: kernel 3.1 renames lock-manager ops

Kernel 3.1 renames lock-manager ops(lock_manager_operations) from
fl_xxx to lm_xxx (commit 8fb47a4fbf858a164e973b8ea8ef5e83e61f2e50).

Add LC_LM_XXX_LOCK_MANAGER_OPS/HAVE_LM_XXX_LOCK_MANAGER_OPS to check.

Re-arrange several macro definitions in lustre-core.m4 to follow
kernel version order.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: Ic86ec9db2f8262ef7ab9f5f2fb51ca79591120a4
Reviewed-on: http://review.whamcloud.com/3579
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Mon, 15 Oct 2012 13:23:00 +0000 (17:23 +0400)]
LU-2145 target: use tgt_ prefix for target function

There are several prefixes in use: target_, lut_, tg_. Start using
tgt_ as the single prefix.

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I4e1f38b81a5311d56472162b8a1114a2aa252874
Reviewed-on: http://review.whamcloud.com/4273
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Andreas Dilger [Wed, 3 Oct 2012 21:26:50 +0000 (15:26 -0600)]
LU-1131 osd-ldiskfs: better journal credit tracking

When running with a small MDT device during testing, it is possible to
overflow the reserved credit maximum for the journal.  Improve the
ldiskfs debugging for transaction credits so that it is possible to
better understand where the reserved credits are being used.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iea90b771a5e19190cc95cbf8f2f725bede500c1e
Reviewed-on: http://review.whamcloud.com/4282
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Wed, 17 Oct 2012 20:28:50 +0000 (00:28 +0400)]
LU-2154 osp: precreate logic to use last assigned id

instead of "next to assign". This change removes a few cases
where last-created-id could become less than next-to-assign,
resulting in invalid assertions.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: If41563ea2f227ea980e3017d9485cb9c7caccad5
Reviewed-on: http://review.whamcloud.com/4289
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Alex Zhuravlev [Thu, 18 Oct 2012 12:30:42 +0000 (16:30 +0400)]
LU-2138 osp: set expiration before RPC is sent

osp_statfs_update() should set opd_statfs_fresh_till before
the request is sent. Otherwise a race is possible where the
interpret function is called before osp_statfs_update()
sets opd_statfs_fresh_till to the "disable" value. The race can
result in suspended statfs updates, misguiding the object
allocation algorithm.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I2ff03a611267292d0cd6a465c1eb14023516234b
Reviewed-on: http://review.whamcloud.com/4294
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
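
The ordering problem in the LU-2138 change above can be modelled deterministically with a small single-threaded sketch. Everything here is hypothetical: struct sketch_osp, SKETCH_DISABLED and the helper functions only model opd_statfs_fresh_till, the "disable" value, and a send whose reply is interpreted before the sender continues (the worst case of the race).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DISABLED (-1L)	/* hypothetical "disable" marker */

struct sketch_osp {
	long fresh_till;	/* models opd_statfs_fresh_till */
};

/* Models the interpret callback: the reply arrived, schedule the next update. */
static inline void sketch_interpret(struct sketch_osp *dev, long now, long maxage)
{
	dev->fresh_till = now + maxage;
}

/* Models sending the request; worst case, the reply is handled immediately. */
static inline void sketch_send(struct sketch_osp *dev, long now, long maxage)
{
	sketch_interpret(dev, now, maxage);
}

/* Racy order: the "disable" value overwrites what interpret just stored. */
void sketch_update_racy(struct sketch_osp *dev, long now, long maxage)
{
	assert(dev != NULL);
	sketch_send(dev, now, maxage);
	dev->fresh_till = SKETCH_DISABLED;
}

/* Fixed order: set the "disable" value first, then send. */
void sketch_update_fixed(struct sketch_osp *dev, long now, long maxage)
{
	assert(dev != NULL);
	dev->fresh_till = SKETCH_DISABLED;
	sketch_send(dev, now, maxage);
}
```

In the racy order the device is left with the "disable" value even though a reply was processed, so no further update would be scheduled; in the fixed order the value written by the interpret step survives.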
Mikhail Pershin [Sun, 14 Oct 2012 11:58:13 +0000 (15:58 +0400)]
LU-2145 target: move target code to the separate directory

Create a target/ directory for unified target code and move
the existing code from ptlrpc/target.c there.

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: Id808ed3eb390dd051cbca0a3ef2bf02e5f5d722f
Reviewed-on: http://review.whamcloud.com/4258
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Johann Lombardi [Wed, 17 Oct 2012 21:39:50 +0000 (23:39 +0200)]
LU-1999 test: LW conn test should rely on debug logs

Recovery-small test 6 should rely on lustre debug logs instead of
dmesg since console messages are rate limited and might not be printed
as expected by the test.

The test should also be skipped if the server does not support
lightweight connections (i.e. is older than 2.3.50).

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I78994831506e632f1730eb6f80fe145c7fc2cf3e
Reviewed-on: http://review.whamcloud.com/4288
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 13 Oct 2012 21:04:44 +0000 (15:04 -0600)]
LU-1538 tests: fix test cases when OST is full

In sanity.sh test_101d() the test didn't check if "dd" failed to write
the full file size, and produced a confusing error about readahead
performance.

In sanityn.sh test_36() it also didn't check if "dd" failed to write
the full file size, and then multiop read was stuck in a loop of zero
length reads forever.  Fix both "dd" error checking, and multiop.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic4d5ec90d77b1a9302d3e8f128f292b3765611d7
Reviewed-on: http://review.whamcloud.com/4265
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Fan Yong [Mon, 8 Oct 2012 06:33:29 +0000 (14:33 +0800)]
LU-2041 oi: keep oi mapping cache consistency

Sometimes the local ID of the per-RPC-thread OI mapping cache may be
changed while the FID of that OI mapping cache has not been updated,
which causes the RPC thread to find an unexpected object for
the given FID. This should be fixed.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I90cc01601925ada08e0f021ba49e3310f10aed35
Reviewed-on: http://review.whamcloud.com/4208
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Liu Xuezhao [Sun, 7 Oct 2012 11:40:36 +0000 (19:40 +0800)]
LU-1347 ldlm: makes EXPORT_SYMBOL follows function body

Makes EXPORT_SYMBOL macros immediately follow the function body,
to follow normal Linux kernel coding style.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Change-Id: I5644d908c652b2e34d81e923367af4b5728399e1
Reviewed-on: http://review.whamcloud.com/2837
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Tue, 9 Oct 2012 22:41:25 +0000 (16:41 -0600)]
LU-2012 tests: disable replay-dual test_14b

Disable replay-dual.sh test_14b until OST gap handling is fixed.

The test is modified to make the pass/fail result more clear for when
it is re-enabled, since any blocks that are allocated on the OST while
the test is running will cause it to fail.  This could happen because
of updated config llog, OI updates, etc.  Allow some small margin of
space to be allocated on the OST before declaring failure.  To make
failed orphan handling totally clear, ensure the orphan object is
much larger than the margin of error.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib14ecdfa1f8fbd5807195acb60e5ba507f500c1e
Reviewed-on: http://review.whamcloud.com/4237
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Johann Lombardi [Tue, 16 Oct 2012 16:50:57 +0000 (18:50 +0200)]
LU-2197 quota: don't send preacq if no usage and no activity

Slaves should not try to pre-acquire quota space when the ID
has no usage and there are no outstanding activities.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I5076910024deca2a1dc75189837dd2b6cdabe7bf
Reviewed-on: http://review.whamcloud.com/4279
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Iurii.Golovach [Tue, 16 Oct 2012 09:13:52 +0000 (12:13 +0300)]
LU-1770 ptlrpc: introducing OBD_CONNECT_FLOCK_OWNER flag

After the flock policy fix was applied to 1.8, users ran into an issue
where 1.8 clients with the fixed flock policy were recognized
incorrectly by 2.x servers.
This flag is intended to identify 1.8 clients with the fixed flock
policy so that 2.x servers can recognize the flock policy correctly.
Patches with functionality changes were attached for review in LU-1575.

Xyratex-bug-id: MRP-489
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Signed-off-by: Iurii Golovach <iurii_golovach@xyratex.com>
Change-Id: Id00b496ad3f556f99be5e9218497399f18a00357
Reviewed-on: http://review.whamcloud.com/3722
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sergii Glushchenko [Wed, 10 Oct 2012 10:33:15 +0000 (13:33 +0300)]
LU-2107 util: Enhance l_getidentity to leverage 'nscd' caching

Replace getgrent() use with getgrouplist() in l_getidentity,
which would allow it to leverage the 'nscd' cache.

Xyratex-bug-id: MRP-643
Signed-off-by: Sergii Glushchenko <sergii_glushchenko@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Iurii Golovach <iurii_golovach@xyratex.com>
Change-Id: I593abaeefe02cdb2d4f8761124bdd48477a7a22a
Reviewed-on: http://review.whamcloud.com/4223
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Iurii Golovach <iurii.golovach@gmail.com>
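
To illustrate the getgrouplist() call mentioned in the LU-2107 change above, here is a minimal sketch of fetching a user's group list with it. The wrapper sketch_get_groups() is hypothetical and is not the l_getidentity code; it assumes the glibc/Linux signature of getgrouplist(), which reports the required array size through its last argument when the buffer is too small.

```c
#define _DEFAULT_SOURCE		/* exposes getgrouplist() on glibc */
#include <assert.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Store a malloc()ed array of the groups of `user` (including `primary`)
 * in *out and return their count, or return -1 on failure.
 */
int sketch_get_groups(const char *user, gid_t primary, gid_t **out)
{
	int ngroups = 16;	/* initial guess; grown once if too small */
	gid_t *groups;

	assert(user != NULL && out != NULL);

	groups = malloc((size_t)ngroups * sizeof(*groups));
	if (groups == NULL)
		return -1;

	if (getgrouplist(user, primary, groups, &ngroups) == -1) {
		/* ngroups now holds the number of entries actually needed. */
		gid_t *bigger = realloc(groups, (size_t)ngroups * sizeof(*groups));

		if (bigger == NULL) {
			free(groups);
			return -1;
		}
		groups = bigger;
		if (getgrouplist(user, primary, groups, &ngroups) == -1) {
			free(groups);
			return -1;
		}
	}

	*out = groups;
	return ngroups;
}
```

The caller owns the returned array and should free() it. A single lookup per user like this goes through the normal name-service path, which is presumably what lets a cache such as nscd help, as the commit message suggests.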
Andreas Dilger [Wed, 3 Oct 2012 22:18:54 +0000 (16:18 -0600)]
LU-2083 build: install git commit hooks automatically

Install the Lustre Git commit hooks into .git/hooks/ by default when
autogen.sh is run, so that they are present when patches are being
committed.  This avoids the relatively common case where a new tree
is checked out by new or experienced developers and is missing the
commit hooks when patches are being submitted.

While the commit hooks are sure to be installed in any tree that was
built, this doesn't guarantee that the hooks will be installed in
every tree that has a commit, but it is very likely to be the case.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6a15420fb7a35b790c1e816c67e20a8004500c1e
Reviewed-on: http://review.whamcloud.com/4175
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2193 ofd: look up FID to destroy before locking
Andreas Dilger [Tue, 16 Oct 2012 05:15:17 +0000 (23:15 -0600)]
LU-2193 ofd: look up FID to destroy before locking

If the MDS is replaying object destroys after recovery, then it may
be trying to destroy non-existent objects.  This can provoke spurious
errors in lvbo_init() due to the inability to populate the lock LVB.
Rather than quiet the useful error message from lvbo_init(), instead
do the object lookup on the to-be-destroyed FID first.  If lookup
fails to find an object, skip the object locking entirely since it
isn't needed and would just flood the console after recovery.

During destroy RPCs from the MDS, the ELC buffer is always empty, so
short-circuit the initial lock cancellation attempt that is useless.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id6197f23773ea271e0cb0912b19585b3df500c1e
Reviewed-on: http://review.whamcloud.com/4276
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2150 ost: ost_brw_read() to ptlrpc_free_bulk_nopin()
Alex Zhuravlev [Thu, 11 Oct 2012 19:31:19 +0000 (23:31 +0400)]
LU-2150 ost: ost_brw_read() to ptlrpc_free_bulk_nopin()

Since a643e38 (LU-2089) the OST does not pin pages involved in
BULKs. This is done to prevent get/put on pages which were
allocated as part of an order N (>1) allocation with a refcount
of 0; get/put on such a page leads to a warning from the
kernel. In the original patch one code path was not fixed, so
this patch completes the change.

Also, to prevent confusion, the patch removes a couple of macros:
ptlrpc_free_bulk() and ptlrpc_prep_bulk_page(). The caller must
now specify whether ptlrpc should reference the pages or not.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I8cf5f334e8f7edab0ad37678e1e8af18904a0be6
Reviewed-on: http://review.whamcloud.com/4256
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1303 lod: improvements and fixes
Alex Zhuravlev [Wed, 10 Oct 2012 09:52:41 +0000 (13:52 +0400)]
LU-1303 lod: improvements and fixes

- osp_statfs() returns -ENOTCONN if the corresponding OST is found
  not connected. This lets us remove a few additional checks in
  the allocation policy functions.

- struct obd_statfs gets new field: os_fprecreated
  LOD uses this to skip OSPs with no objects ready to use

- osp_statfs() returns the number of already precreated objects
  in new os_fprecreated field

- OS_STATE_DEGRADED is ignored on the first 2 passes in RR policy

- lod_alloc_specific() to verify and skip OSPs already used in
  striping

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I86351bc1dcca7182bc5adf4eb3e03c054e33e95f
Reviewed-on: http://review.whamcloud.com/4242
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years agoLU-744 osc: add lru pages management - new RPC
Jinshan Xiong [Wed, 16 May 2012 03:11:37 +0000 (20:11 -0700)]
LU-744 osc: add lru pages management - new RPC

Add cache management at the OSC layer. This way we can control how
much memory is used to cache Lustre pages and avoid a complex
solution like the one we used in b1_8.

With this patch, admins can set how much memory will be used for caching
Lustre pages per file system. A self-adaptive algorithm is used to
balance this budget among the OSCs.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I76c840aef5ca9a3a4619f06fcaee7de7f95b05f5
Reviewed-on: http://review.whamcloud.com/2514
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1717 ldlm: Fix rs_opc initialization
Li Wei [Mon, 15 Oct 2012 01:29:36 +0000 (09:29 +0800)]
LU-1717 ldlm: Fix rs_opc initialization

By the time target_send_reply() initializes rs_opc, rs_msg has not been
filled with a valid opc yet.  Following Oleg's suggestion on the Jira
ticket, this patch changes target_send_reply() to initialize rs_opc with
rq_reqmsg instead and silences a couple of related warnings that are of
only informative nature.

Change-Id: I4b96454e0bcf3dd0dc8f21b0de70a89ce37faacf
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4271
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-2097 quota: more ll_vfs_dq_init()
Niu Yawei [Mon, 15 Oct 2012 09:06:05 +0000 (05:06 -0400)]
LU-2097 quota: more ll_vfs_dq_init()

Call ll_vfs_dq_init() in several more places to avoid missing
block accounting for inodes that already exist.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I1500aa1f75b6a6184d1b40877a69fabdf4fac130
Reviewed-on: http://review.whamcloud.com/4270
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1538 tests: use $TESTSUITE instead of $0
Andreas Dilger [Fri, 5 Oct 2012 03:49:17 +0000 (21:49 -0600)]
LU-1538 tests: use $TESTSUITE instead of $0

Use "$TESTSUITE" instead of "$0" in test script messages, since $0 is
a full pathname and clutters up the test logs.  Instead, $TESTSUITE is
only the test suite name, and is more compact.  Use it in all output.

This makes the output from complete() redundant, in that it prints
$TESTSUITE both as an argument and internally via the equals_msg()
function (via banner()), so remove the $TESTSUITE argument from all
callers of complete().

The equals_msg() function was only a thin wrapper around banner(), and
was only used by complete(), so remove it and call banner() directly.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6b3f256000317a17fdd2a361a38d4dfdda500c1e
Reviewed-on: http://review.whamcloud.com/4192
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2147 quota: several fixes to reintegration procedure
Johann Lombardi [Thu, 11 Oct 2012 13:55:58 +0000 (15:55 +0200)]
LU-2147 quota: several fixes to reintegration procedure

This patch gathers several fixes/improvements to the quota
reintegration procedure:
- do not set rq_no_resend & rq_no_delay for IT_QUOTA_CONN to have
  the reintegration thread waiting instead of stopping/starting
  the reint thread until the master is available
- add procfs tunable to force reintegration, it can be useful for
  testing, but also for fixing things at customer site when a bug
  was hit during reintegration.
- when transferring indexes, the per-page header isn't swabbed
- on index transfer, the hash value isn't sent any more (unlike on
  orion_quota) since we now use II_FL_NOHASH. As a consequence,
  qsd_reint_entries() shouldn't take the hash size into account
  when parsing a page containing key/record pairs.

This patch also:
- quiets many common messages which aren't real errors and shouldn't
  make it to the console
- fixes a bug in qmt_adjust_edquot() which does not check correctly
  whether the revoke timeout has elapsed
- changes test_6 to use a larger quota limit, so that the QMT does
  not set the edquot flag and cause the test to fail.
- removes temporary code from setup_quota() in t-f now that the
  new quota code is fully landed.
- re-enables ost-pool tests 23a & 23b

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I9af9a025faa1ef173810df647b93307e2139c6f9
Reviewed-on: http://review.whamcloud.com/4253
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1972 mdt: declare RPC handlers in a sane way
Andreas Dilger [Fri, 12 Oct 2012 02:02:48 +0000 (20:02 -0600)]
LU-1972 mdt: declare RPC handlers in a sane way

Declare the MDT RPC handlers in a way that they can be found when
searching for them, otherwise the code is completely opaque when
looking for the handler for an RPC (e.g. MDS_CLOSE or LDLM_ENQUEUE).

While it might make sense to have macros replace a lot of repetitive
code blocks, it doesn't make sense to chop up the RPC names so badly
that they can never be found through normal searching.  Rename the
RPC handler definition macros to have more meaningful names, and
remove unused macros and special cases where not strictly necessary.

Rename a few OBD_FAIL_LDLM_ error injection hooks instead of making
the macros more complex for a small number of use cases.  These names
are only used internally, even if the values are used in the tests.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8d4dc0709faeae4458c3563864268a00f8500c1e
Reviewed-on: http://review.whamcloud.com/4260
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2123 tests: sanity 57 and 129 can't detect fstype
James Simmons [Tue, 9 Oct 2012 14:36:12 +0000 (10:36 -0400)]
LU-2123 tests: sanity 57 and 129 can't detect fstype

In sanity tests 57a, 57b and 129 the function facet_type_fstype
is used to determine the file system backend. This function
does not actually exist, so these tests never run for the
ldiskfs case. The fix is to use the proper function, which
is facet_fstype.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ia9766a3a2200c6ce2d48ff0265eb73a4d71c06e7
Reviewed-on: http://review.whamcloud.com/4230
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
11 years agoLU-2093 lod: fall back to RR allocation when QoS fails
Alex Zhuravlev [Wed, 10 Oct 2012 08:32:38 +0000 (12:32 +0400)]
LU-2093 lod: fall back to RR allocation when QoS fails

lod_alloc_qos() checks whether there are enough OSPs to satisfy the
request by checking OSP state with dt_statfs(), then tries to reserve
objects on some of them. During the reservation the state of an OSP
can change (due to a broken connection, for example), so the QoS code
might find fewer ready OSPs than required. This is a valid situation
and LOD should fall back to RR allocation.

sanity/116a is added to verify this: dt_statfs() still reports that
the OSPs are good, but no object can actually be created on the OSP
with index 1.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Iae6916f998070960eb47c71f2bc1e48adb2ac080
Reviewed-on: http://review.whamcloud.com/4241
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1595 build: allow longer component in summary
Andreas Dilger [Wed, 3 Oct 2012 21:50:08 +0000 (15:50 -0600)]
LU-1595 build: allow longer component in summary

Allow a longer "component:" field in the commit summary message, up to
11 characters, for osd-ldiskfs, which exceeded the old 9-character limit.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id0ed056e9b9efef07b26efeb9e2f4f1e8d500c1e
Reviewed-on: http://review.whamcloud.com/4174
Tested-by: Hudson
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2163 lprocfs: fix jobstats initialization race
Andreas Dilger [Fri, 12 Oct 2012 21:23:13 +0000 (15:23 -0600)]
LU-2163 lprocfs: fix jobstats initialization race

If two threads are racing to add the same jobid into the job stats
list in lprocfs_job_stats_log(), one thread will lose the race from
cfs_hash_findadd_unique() and enter the "if (job != job2)" case.  It
could fail LASSERT(!cfs_list_empty(&job->js_list)) depending on whether
the other thread in "else" added "job2" to the list first or not.

Simply locking the check for cfs_list_empty(&job->js_list) is not
sufficient to fix the race.  There would need to be locking over the
whole cfs_hash_findadd_unique() and cfs_list_add() calls, but since
ojs_lock is global for the whole OST this may have performance costs.

Instead, just remove the LASSERT() entirely, since it provides no
value, and the "losing" thread can happily use the job_stat struct
immediately since it was fully initialized in job_alloc().

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iecb17e2dc80621fd388295998df5708bcaabcab0
Reviewed-on: http://review.whamcloud.com/4263
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-744 llite: reimplement ll_get_fsname()
Jinshan Xiong [Fri, 17 Aug 2012 00:39:33 +0000 (17:39 -0700)]
LU-744 llite: reimplement ll_get_fsname()

ll_get_fsname() used to allocate memory to store the fsname. This is
unnecessary and error-prone because it requires the caller to free
that memory.
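
One way to avoid the allocation is to have the caller pass in the
buffer. The sketch below is illustrative only: get_fsname() and its
signature are hypothetical, and it assumes the name is the part of a
profile string such as "lustre-client" before the last '-'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy the file system name into a caller-supplied
 * buffer. Returns 0 on success or -1 if buf is too small. Nothing is
 * allocated, so there is nothing for the caller to free.
 */
int get_fsname(const char *profile, char *buf, size_t buflen)
{
    const char *dash;
    size_t len;

    assert(profile != NULL && buf != NULL);

    dash = strrchr(profile, '-');
    len = dash ? (size_t)(dash - profile) : strlen(profile);
    if (len >= buflen)
        return -1;
    memcpy(buf, profile, len);
    buf[len] = '\0';
    return 0;
}
```

A caller would typically use a small on-stack array, so an early
return on an error path cannot leak the name.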

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ibc077b46728a1358e51a345ec13c966fc947c428
Reviewed-on: http://review.whamcloud.com/3704
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2037 tests: Wait for devices to initialize on test setup
michael.mckay [Thu, 27 Sep 2012 14:28:03 +0000 (10:28 -0400)]
LU-2037 tests: Wait for devices to initialize on test setup

Fix an issue where we do not wait for a device to
initialize before getting the label. This label then does
not correspond to an actual device.
The label names are now checked for 'ffff', which signifies that
the device has not finished initializing yet. In this case
we wait and retry after 1, 3, 5 and 10 seconds, or until
the command succeeds.

Xyratex-bug-id: MRP-546
Reviewed-by: Iurii Golovach <Iurii_Golovach@xyratex.com>
Reviewed-by: Kyrylo Shatskyy <Kyrylo_Shatskyy@xyratex.com>
Signed-off-by: Michael McKay <michael_mckay@xyratex.com>
Change-Id: I01e045227a4b3b6e007dcc9685238a5425cdffe8
Reviewed-on: http://review.whamcloud.com/4111
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-823 compat: Don't use cfs_ functions on kernel structures
Christopher J. Morrone [Fri, 4 Nov 2011 22:59:11 +0000 (15:59 -0700)]
LU-823 compat: Don't use cfs_ functions on kernel structures

It is not really correct to use the generic "cfs_" prefixed
locking functions on the Linux kernel's data structures.  Revert
back to using the Linux locking function.

Change-Id: I64619c4b4f4963634b3d1e43c1b1519598e65e8d
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/4176
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
11 years agoLU-2142 scrub: reset completed scrub position if retrigger
Fan Yong [Thu, 11 Oct 2012 09:42:29 +0000 (17:42 +0800)]
LU-2142 scrub: reset completed scrub position if retrigger

If a former OI scrub has completed and the user wants to run the
OI scrub again, reset the OI scrub so that it rescans the device
from the beginning.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I5b8e9ee51ccbf95ed131b963389c4ecfb92b9035
Reviewed-on: http://review.whamcloud.com/4250
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2144 utils: reset 'optind' to avoid segmentation fault
Fan Yong [Thu, 11 Oct 2012 09:12:12 +0000 (17:12 +0800)]
LU-2144 utils: reset 'optind' to avoid segmentation fault

Sometimes the lfsck_{start,stop} commands may be called several
times in the same lctl shell. In that case getopt_long(), which
is called inside lfsck_start/lfsck_stop, may get confused and
cause a segmentation fault. So forcibly reset the external
variable 'optind' to avoid the segmentation fault.
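
A minimal sketch of the reset, using a hypothetical handler
count_verbose() rather than the lfsck code. It sets optind to 1, the
POSIX initial value; the glibc manual page describes setting it to 0
when a full reinitialization is needed.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical command handler: counts how often -v/--verbose is given. */
int count_verbose(int argc, char **argv)
{
    static const struct option long_opts[] = {
        { "verbose", no_argument, NULL, 'v' },
        { NULL, 0, NULL, 0 }
    };
    int count = 0;
    int c;

    assert(argc >= 1 && argv != NULL);

    /* Restart the scan. Without this, a second call in the same
     * process would begin where the previous scan stopped. */
    optind = 1;
    while ((c = getopt_long(argc, argv, "v", long_opts, NULL)) != -1) {
        if (c == 'v')
            count++;
    }
    return count;
}
```

Because getopt state is global to the process, every command handler
that may run more than once in an interactive shell needs the same
reset before its first getopt_long() call.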

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I35b635eac44854ae4b17ae00bed778320dbe9d9e
Reviewed-on: http://review.whamcloud.com/4248
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-2140 test: add fake nid with proper nettype
Niu Yawei [Thu, 11 Oct 2012 06:23:09 +0000 (02:23 -0400)]
LU-2140 test: add fake nid with proper nettype

The fake nid should be added with proper nettype.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I490f80328d8210e8eca9ccd8484fa4d7717c7429
Reviewed-on: http://review.whamcloud.com/4247
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1981 doc: reorganize osd API documentation
Johann Lombardi [Tue, 9 Oct 2012 22:32:16 +0000 (00:32 +0200)]
LU-1981 doc: reorganize osd API documentation

Move osd-api.txt to lustre/doc and reorganize the document.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I76fe29e36a325c0687c84389b9ed98fbbeb3b85c
Reviewed-on: http://review.whamcloud.com/4236
Tested-by: Hudson
Reviewed-by: Ian Colle <Ian.Colle@intel.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2059 tests: skip local config tests only on ZFS
Andreas Dilger [Fri, 5 Oct 2012 03:04:34 +0000 (21:04 -0600)]
LU-2059 tests: skip local config tests only on ZFS

Only skip conf-sanity.sh (5d, 19b, 21b, 27a) and insanity.sh (2, 4)
when the backing OST filesystem type is ZFS, not for ldiskfs.  The
support for locally-cached config llogs is not implemented for ZFS
yet, so ZFS OSTs cannot be started without the MGS yet.

The ZFS local config support is in progress and these tests will be
re-enabled as part of the landing.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8c49483401a7132ce09b93aa5d93610c4d500c1e
Reviewed-on: http://review.whamcloud.com/4234
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1458 test: enable lustre_rsync debug log dump
Bobi Jam [Mon, 27 Aug 2012 16:41:33 +0000 (00:41 +0800)]
LU-1458 test: enable lustre_rsync debug log dump

* Make lustre_rsync dump its debug log to help debugging.
* Add debug messages in lr_move().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I0f26322b3a4677bcb1b09d09e0e7c0ea1b4dbe3d
Reviewed-on: http://review.whamcloud.com/3795
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2088 utils: drop obsolete debug symbols
Andreas Dilger [Thu, 4 Oct 2012 20:08:49 +0000 (14:08 -0600)]
LU-2088 utils: drop obsolete debug symbols

Remove obsolete modules from the list of debugging symbols:
llite, smfs, fsfilt_ext3, fsfilt_reiserfs, fsfilt_smfs,
mds_ext3, cobd, cmobd

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I47b60934fd0ceb58b6bae77306b60d28d2300c1e
Reviewed-on: http://review.whamcloud.com/4188
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1757 brw: added OBD short io connect flag
Alexander.Boyko [Wed, 10 Oct 2012 08:23:08 +0000 (12:23 +0400)]
LU-1757 brw: added OBD short io connect flag

Add the OBD short I/O connect flag now, to prevent collisions with
any future flags needed in features written against this branch.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: I567020c24ab64c4faeed159ef8c6814e74f73503
Reviewed-on: http://review.whamcloud.com/3891
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1095 ptlrpc: improve ptlrpc debug message consistency
Ned Bass [Mon, 6 Aug 2012 18:49:27 +0000 (11:49 -0700)]
LU-1095 ptlrpc: improve ptlrpc debug message consistency

Enforce the following conventions for better consistency
in a few ptlrpc/target.c debug messages.

- Print each message on a single line for better grep results.

- Provide a distinctive message for different functions to
  reduce appearance of redundancy.

- Print device name at the start, otherwise on systems with many
  targets it isn't easy to tell which one was involved.

- Print rc at the end.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: Ibd203367dde4d95d32671217271420c57a8dc0ad
Reviewed-on: http://review.whamcloud.com/3547
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2111 echo_client: Wrong error code returns for MDS echo
James Simmons [Mon, 8 Oct 2012 18:45:39 +0000 (14:45 -0400)]
LU-2111 echo_client: Wrong error code returns for MDS echo

If the MD echo client fails to resolve a path, an error is
returned by PTR_ERR, but the rc variable is never updated
with this value, so the application thinks the call
succeeded. This patch properly sets the returned rc
variable so that the testing application knows a failure
has occurred.
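
The pattern can be sketched in user space as follows. ERR_PTR(),
PTR_ERR() and IS_ERR() are reimplemented here as stand-ins for the
kernel helpers, and resolve_path() and echo_lookup() are hypothetical
names, not the echo client functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: fails with -ENOENT for anything but "/known". */
const char *resolve_path(const char *path)
{
    if (strcmp(path, "/known") != 0)
        return ERR_PTR(-ENOENT);
    return path;
}

/* Propagate the encoded error through rc instead of discarding it. */
int echo_lookup(const char *path)
{
    int rc = 0;
    const char *obj;

    assert(path != NULL);

    obj = resolve_path(path);
    if (IS_ERR(obj)) {
        rc = (int)PTR_ERR(obj);   /* the assignment a buggy path omits */
        goto out;
    }
    /* ... use obj ... */
out:
    return rc;
}
```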

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If7d2e9fbb28bcb239f7cc5021efebdcf0784ea14
Reviewed-on: http://review.whamcloud.com/4225
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1666 obdclass: reduce lock contention on coh_page_guard
Jinshan Xiong [Mon, 13 Aug 2012 23:57:49 +0000 (16:57 -0700)]
LU-1666 obdclass: reduce lock contention on coh_page_guard

Define a per-page spinlock to get and put a cl_page instead of
grabbing per-object lock coh_page_guard.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Iecf5e6840ab1a28edcf2c4bcde6a72c2f9b5bdae
Reviewed-on: http://review.whamcloud.com/3627
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2119 osd: missing osd_shutdown()
Niu Yawei [Wed, 10 Oct 2012 02:40:03 +0000 (22:40 -0400)]
LU-2119 osd: missing osd_shutdown()

For the zfs osd, osd_shutdown() should be called in osd_device_fini()
just as the ldiskfs osd does, to make sure that everything is
cleaned up even if osd_process_config(CLEANUP) has no chance to
be called (when OFD/MDT wasn't started).

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I54f08ac657e01ffb7a367278810016b585b3c0da
Reviewed-on: http://review.whamcloud.com/4239
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-107 scripts: rename heartbeat resource agent
Ned Bass [Tue, 9 Oct 2012 20:10:18 +0000 (13:10 -0700)]
LU-107 scripts: rename heartbeat resource agent

Add a .ha_v2 extension to the Lustre heartbeat resource agent script
to resolve conflicts with the lustre init script on case-insensitive
filesystems.  The extension also provides a clue as to what the
script's purpose is.  Add a compatibility symlink with the plain name
during package installation.   If we later add v3-style scripts we can
control which one gets linked to via a configure option.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: I0e37a390fbfb3f00c1c1e666a7cdbe5d37fa885b
Reviewed-on: http://review.whamcloud.com/4233
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1842 quota: remove leftovers from old code
Johann Lombardi [Mon, 8 Oct 2012 12:34:47 +0000 (14:34 +0200)]
LU-1842 quota: remove leftovers from old code

- remove all references to qunit_data which is the old format for
  slave-master RPCs
- gather all quota-related data structure definitions in one single
  place in lustre_idl.h
- rename lquota.h to lustre_quota.h since this is the standard naming
  scheme for all lustre components.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: Ifbadb639517c52741c210668f28452ea55bf6a43
Reviewed-on: http://review.whamcloud.com/4221
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2131 tests: sanity using ost0 instead of ost1
James Simmons [Tue, 9 Oct 2012 17:20:25 +0000 (13:20 -0400)]
LU-2131 tests: sanity using ost0 instead of ost1

Sanity tests 81a and 81b and the jobstat function test against
ost0, which technically doesn't exist. The test framework
numbers the OSTs starting with ost1. This patch makes the tests
use the proper ost1.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ibb7e647b8d6a736ca930e9294aacf584c9c49880
Reviewed-on: http://review.whamcloud.com/4232
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1842 oi: handle failure cases for osd_oi_fini
Fan Yong [Mon, 8 Oct 2012 05:44:19 +0000 (13:44 +0800)]
LU-1842 oi: handle failure cases for osd_oi_fini

Sometimes osd_oi_fini() may be called without osd_oi_init() having
been called first, or after osd_oi_init() has failed. In such cases
osd_device::od_oi_table may be invalid, so just skip the related
OI cleanup.
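
A minimal sketch of such a guard, assuming the table pointer stays
NULL until initialization succeeds. The types and function names are
hypothetical stand-ins, not the real OSD structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real OSD structures. */
struct oi_table {
    int    count;
    void **entries;
};

struct osd_dev {
    struct oi_table *oi_table;   /* NULL until oi_init() succeeds */
};

int oi_init(struct osd_dev *dev, int count)
{
    struct oi_table *t;

    assert(dev != NULL && count > 0);

    t = calloc(1, sizeof(*t));
    if (t == NULL)
        return -1;
    t->entries = calloc((size_t)count, sizeof(*t->entries));
    if (t->entries == NULL) {
        free(t);
        return -1;               /* dev->oi_table stays NULL */
    }
    t->count = count;
    dev->oi_table = t;
    return 0;
}

void oi_fini(struct osd_dev *dev)
{
    int i;

    /* init never ran, or it failed: nothing to clean up */
    if (dev->oi_table == NULL)
        return;

    for (i = 0; i < dev->oi_table->count; i++)
        free(dev->oi_table->entries[i]);
    free(dev->oi_table->entries);
    free(dev->oi_table);
    dev->oi_table = NULL;        /* makes a repeated fini harmless */
}
```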

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Iffa008b29bb8763fcdc4d12c1e4cae93026a25b3
Reviewed-on: http://review.whamcloud.com/4219
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-2099 osd: set dr_elapsed before dr_numreqs
Alex Zhuravlev [Mon, 8 Oct 2012 06:36:39 +0000 (10:36 +0400)]
LU-2099 osd: set dr_elapsed before dr_numreqs

Set dr_elapsed before dr_numreqs so that the service thread always
sees dr_elapsed and dr_elapsed_valid initialized when the bio is
done. Also, the patch adds a bit of debugging to help with the
original bug in case this patch doesn't fix it.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I7c41af9a2be5be2f37190fb053d7449653fc7a99
Reviewed-on: http://review.whamcloud.com/4216
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-2102 llog: wrong handle was used for changelog processing
Mikhail Pershin [Tue, 9 Oct 2012 06:40:43 +0000 (10:40 +0400)]
LU-2102 llog: wrong handle was used for changelog processing

- Wrong llog handle was passed to changelog_user_init_cb() for
  processing

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I3656829e283232ff8c17829e18df68b320e26f2f
Reviewed-on: http://review.whamcloud.com/4229
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1943 tests: Refresh conf-sanity 32[ab]
Alex Zhuravlev [Sat, 6 Oct 2012 17:01:46 +0000 (21:01 +0400)]
LU-1943 tests: Refresh conf-sanity 32[ab]

Existing conf-sanity 32[ab] does not run on multi-node clusters or
network types other than TCP.  This patch rewrites the tests to
start the targets on the MDS and mount the upgraded file system on
the primary client.  This scheme works both on a single-node
development environment and a typical Autotest cluster.

The disk image tarball format has been updated to include more
"metadata".  For example, the 1.8.7-wc1 and 2.1.1 tarballs added
by this patch include the kernel versions, architectures, Lustre
versions, and commit SHAs (currently only Lustre version) used to
create the disk images.  In addition, the script used to create
the tarballs is recorded as a test in test_32newtarball().

A couple of new data verifications are added to make sure the name
space, the file data, and most of the file attributes are consistent
across upgrades.

Signed-off-by: Li Wei <liwei@whamcloud.com>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I3b0d38d67e86e0e2ac24d9aae4406b055b63ce61
Reviewed-on: http://review.whamcloud.com/4213
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1606 api: Rename liblustreapi.h -> lustreapi.h
Christopher J. Morrone [Wed, 18 Jul 2012 00:22:18 +0000 (17:22 -0700)]
LU-1606 api: Rename liblustreapi.h -> lustreapi.h

The header file "liblustreapi.h" is unfortunately named,
implying that it is only part of the lustre-in-user-space-client
rather than the more general lustre api for users that it is.

This patch renames it to simply "lustreapi.h" to avoid that
confusion.  This also helps to move towards making it
easier to use the lustre API for users.  They can simply

  #include <lustre/lustreapi.h>

and then compile with "-llustreapi".

For backwards compatibility, liblustreapi.h becomes a stub
that includes lustreapi.h, and can be removed at some date
in the future.

Since LU-113, LASSERT should be removed from userspace.
In this change, lustre_user.h has had LASSERTs removed.

All code examples in the man pages have been tested.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Change-Id: Id1e709d36a855ad2a8eff206fa9f6bbe87182a29
Reviewed-on: http://review.whamcloud.com/3427
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years ago LU-2110 mount: do not start osp twice
Alex Zhuravlev [Mon, 8 Oct 2012 19:31:20 +0000 (23:31 +0400)]
LU-2110 mount: do not start osp twice

A failover NID can be added under the same marker as the
main device setup block, so do not call lustre_osp_setup()
on a subsequent LCFG_ADD_UUID command in the config.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ic4858812e942f73dd264dca5e33b27d51509f670
Reviewed-on: http://review.whamcloud.com/4227
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years ago LU-1842 test: remove code from s-q to use OFD
Johann Lombardi [Mon, 8 Oct 2012 13:19:59 +0000 (15:19 +0200)]
LU-1842 test: remove code from s-q to use OFD

Now that obdfilter is gone and OFD is the default, we don't need to
restart lustre with USE_OFD any more.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: Ifa3647bd815866aa29bd88a9b3c6be16c6c66fa5
Reviewed-on: http://review.whamcloud.com/4222
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2097 quota: missing ll_vfs_dq_init()
Niu Yawei [Mon, 8 Oct 2012 09:55:36 +0000 (05:55 -0400)]
LU-2097 quota: missing ll_vfs_dq_init()

ll_vfs_dq_init() should be called before operating on the system
objects (like llog, oi, last_rcvd, etc.); otherwise, block accounting
will be missed if the object already exists.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I922c9ca8caebf52ae07fd130c41286dad68c1b8f
Reviewed-on: http://review.whamcloud.com/4220
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1691 kernel: kernel update [SLES11 SP2 3.0.34-0.7.9]
James Simmons [Mon, 8 Oct 2012 13:34:22 +0000 (09:34 -0400)]
LU-1691 kernel: kernel update [SLES11 SP2 3.0.34-0.7.9]

Add SLES11 SP2 client support to 3.0.34-0.7.9.

Allow 2.6 (SP1) and 3.0 (SP2) clients to be built for SLES11.
A standard lbuild run builds for the kernel version that the builder
is using; this can be overridden by specifying the target
directly as an lbuild parameter.

This change explicitly does not make changes to support a SLES11 SP2
server.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Chris Gearing <chris.gearing@intel.com>
Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: Iece4c0286a3f0dcd28fe96e03a8aec9bda065ed5
Reviewed-on: http://review.whamcloud.com/3734
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1199 build: Change various %defines to %globals
Christopher J. Morrone [Thu, 8 Mar 2012 02:43:38 +0000 (18:43 -0800)]
LU-1199 build: Change various %defines to %globals

As a general rule, "%define" should not be used inside
of %{ } blocks.  %define is locally scoped.  While it
appears to work with constructs like this:

   %{!?foo: %define foo bar}

that is only because of an rpm quirk that fails to
free non-global scope variables immediately.  Later use
of parameterized macros in the file can trigger cleanup
of the local variables, and "foo" will be once again
undefined.

The solution is to use %global like so:

   %{!?foo: %global foo bar}

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: Ie18b0b86324334330b726bf69249d97e47e9350e
Reviewed-on: http://review.whamcloud.com/3420
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1898 test: new tests 8[a,b,c,d,e] need 2.3 or later
Bob Glossman [Thu, 13 Sep 2012 21:34:02 +0000 (14:34 -0700)]
LU-1898 test: new tests 8[a,b,c,d,e] need 2.3 or later

Check the Lustre version to make sure the tests do not run
on an unsupported version.
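
A version gate of this kind can be sketched in C as below. This is only
an illustration: the real tests are shell scripts, and version_code()
and test_supported() are hypothetical helpers written for this sketch,
not functions taken from the Lustre tree. The idea is to pack
major.minor.patch into one integer so that versions compare numerically.

```c
#include <assert.h>

/* Pack major.minor.patch into one comparable integer (hypothetical helper). */
static unsigned int version_code(unsigned int major, unsigned int minor,
                                 unsigned int patch)
{
    /* Each of minor and patch must fit in 8 bits for the packing to be ordered. */
    assert(minor < 256 && patch < 256);
    return (major << 16) | (minor << 8) | patch;
}

/* Return 1 if a test that needs `required` may run against `actual`. */
static int test_supported(unsigned int actual, unsigned int required)
{
    return actual >= required;
}
```

With these helpers, a test that needs 2.3 would be skipped when
test_supported(version_code(2, 1, 1), version_code(2, 3, 0)) returns 0.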

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5b284af44a6132070b8e89659b7981844a74555e
Reviewed-on: http://review.whamcloud.com/3986
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago New tag 2.3.53 (tags: 2.3.53, v2_3_53, v2_3_53_0)
Oleg Drokin [Mon, 8 Oct 2012 07:35:23 +0000 (03:35 -0400)]
New tag 2.3.53

Change-Id: Id9d4ed992ee7995f121f37dce05c8ae40c807779
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-753 obd: remove obsolete commit callback code
Andreas Dilger [Tue, 25 Oct 2011 21:56:45 +0000 (15:56 -0600)]
LU-753 obd: remove obsolete commit callback code

Remove old obd_transno_commit_cb() function, which was replaced
by lut_cb_last_committed() in the new LU stack implementation.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I789687c63761c2532c02f5a1b827d8625b770c1c
Reviewed-on: http://review.whamcloud.com/4211
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years ago LU-1943 filter: remove obdfilter from tree
Mikhail Pershin [Sat, 6 Oct 2012 16:02:17 +0000 (20:02 +0400)]
LU-1943 filter: remove obdfilter from tree

Remove obdfilter and keep only OFD

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: If72fe59238325cc63f804f84804b180156c94ea9
Reviewed-on: http://review.whamcloud.com/4209
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years ago LU-1175 tests: use ofd by default instead of obdfilter
Alex Zhuravlev [Wed, 3 Oct 2012 19:30:58 +0000 (23:30 +0400)]
LU-1175 tests: use ofd by default instead of obdfilter

Set USE_OFD=yes by default in order to start testing the ofd module in
preference to the obdfilter module.  Force LOAD_MODULES_REMOTE to be
set to avoid obdfilter being loaded by modprobe remotely on mount.

There was a bug in comma_list() appending a space to the list of
nodes, which caused do_rpc_nodes() to be unhappy when a nearly-empty
list of nodes was being passed by callers.  Rework this function, and
clean up other callers to just pass the nodes and let do_rpc_nodes()
detect this case and skip any remote execution.
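
The list-building problem described above can be illustrated with a
small C sketch. The actual comma_list() is a shell function in the test
framework; join_nodes() below is a hypothetical stand-in that shows the
intended property: names are separated by commas only, with no trailing
space or separator, so an empty input produces an empty string that the
caller can detect and skip.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Join node names with commas into out (capacity outlen).
 * Empty names and repeated names are skipped, and nothing is appended
 * after the last name. Returns 0 on success, -1 if out is too small.
 */
static int join_nodes(const char *const *nodes, size_t count,
                      char *out, size_t outlen)
{
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        size_t len, j;
        int seen = 0;

        if (nodes[i] == NULL || nodes[i][0] == '\0')
            continue;
        /* Skip a name that already appeared earlier in the array. */
        for (j = 0; j < i && !seen; j++)
            seen = nodes[j] != NULL && strcmp(nodes[j], nodes[i]) == 0;
        if (seen)
            continue;
        len = strlen(nodes[i]);
        /* Need room for an optional comma, the name, and the terminator. */
        if (used + (used > 0) + len + 1 > outlen)
            return -1;
        if (used > 0)
            out[used++] = ',';
        memcpy(out + used, nodes[i], len + 1);
        used += len;
    }
    return 0;
}
```

A caller in the role of do_rpc_nodes() would check for out[0] == '\0'
and skip remote execution in that case.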

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I896c9c9b7216c418b635e893202810333d8ef871
Reviewed-on: http://review.whamcloud.com/4171
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years ago LU-2089 ofd: do not pin pages provided by osd
Alex Zhuravlev [Fri, 5 Oct 2012 10:25:44 +0000 (14:25 +0400)]
LU-2089 ofd: do not pin pages provided by osd

Depending on the implementation, some pages can be allocated
with order > 0, and the kernel increases the reference count on the
first page only.  In this case ptlrpc_free_bulk() calling
cfs_unpin() will try to release such pages, leading to
warnings and other bad things in the kernel.

Instead, let ofd/ost rely on dbo_bufs_get/dbo_bufs_put,
as they know the details of the pages provided.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I3591e21eef9557d6004d29e63986c7bd5987802b
Reviewed-on: http://review.whamcloud.com/4198
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1842 quota: tunefs.lustre --quota
Niu Yawei [Sat, 6 Oct 2012 09:31:30 +0000 (05:31 -0400)]
LU-1842 quota: tunefs.lustre --quota

Add a "--quota" option to tunefs.lustre to convert the old quota
disk format into the new format.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ie1951c9183b940c41956e863f76f5e357a1b2bd8
Reviewed-on: http://review.whamcloud.com/4205
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1943 tests: async IO is enough for sanity/80
Alex Zhuravlev [Tue, 10 Apr 2012 13:06:22 +0000 (17:06 +0400)]
LU-1943 tests: async IO is enough for sanity/80

Port of ORI-620 to the master

It can be hard to write 1MB in a second for some
backends (like ZFS), so relax the current synchronous
semantics, as this doesn't affect the logic.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: Ie87815de4a2ae51fc166287d74d43278101fe76b
Reviewed-on: http://review.whamcloud.com/4212
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1718 client: Restore NFS export for Lustre on 3.X kernels
James Simmons [Mon, 27 Aug 2012 18:55:07 +0000 (14:55 -0400)]
LU-1718 client: Restore NFS export for Lustre on 3.X kernels

In Linux 3.0+ kernels, struct file_system_type replaced the
get_sb function with a new function called mount, which
differs in that the vfsmount data is no longer passed in.
The vfsmount data was used by the llite layer in the NFS export
function get_name to find the filp that was then used
with the ll_readdir method.  The solution follows the approach
of btrfs and gfs2: refactor some of the llite methods to
implement a directory scan independent of filp, which can be
shared with the NFS export functionality.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I72730476b120cec1ede6e03c774c9e470a1a5a70
Reviewed-on: http://review.whamcloud.com/3624
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years ago LU-1962 util: Add --lazy flag to lfs df
Andriy Skulysh [Wed, 19 Sep 2012 13:37:22 +0000 (16:37 +0300)]
LU-1962 util: Add --lazy flag to lfs df

Add a --lazy flag to lfs df.
It skips unavailable OSTs and reports
"Resource temporarily unavailable" for them.

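The lazy behaviour described above can be pictured with the following C
sketch. It is not the lfs implementation: struct target_result and
summarize_lazy() are hypothetical names used only here, and the sketch
assumes each target already carries either 0 or a negative errno such
as -EAGAIN. Unreachable targets are reported with their error string
and left out of the total instead of being waited for.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-target result; not a structure from the Lustre tree. */
struct target_result {
    const char *name;             /* for example "OST0000" */
    int rc;                       /* 0 on success, negative errno otherwise */
    unsigned long long kb_avail;  /* free space, valid only when rc == 0 */
};

/*
 * Add up the free space of reachable targets and print the error
 * string for the others. Returns the number of targets skipped.
 */
static int summarize_lazy(const struct target_result *res, int count,
                          unsigned long long *total_avail)
{
    int skipped = 0;

    assert(res != NULL && total_avail != NULL);
    *total_avail = 0;
    for (int i = 0; i < count; i++) {
        if (res[i].rc < 0) {
            /* Report the target rather than blocking on it. */
            printf("%s: %s\n", res[i].name, strerror(-res[i].rc));
            skipped++;
            continue;
        }
        *total_avail += res[i].kb_avail;
    }
    return skipped;
}
```

On Linux with glibc, strerror(EAGAIN) yields the "Resource temporarily
unavailable" text quoted in the commit message.
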
Xyratex-bug-id: MRP-583
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Change-Id: Idb6b728b52c5fa1590d07201700a3fad0ef7cc78
Reviewed-on: http://review.whamcloud.com/4007
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1963 ptlrpc: Incorrect accounting for srv_hpreq_ratio
Andriy Skulysh [Mon, 27 Aug 2012 09:23:53 +0000 (12:23 +0300)]
LU-1963 ptlrpc: Incorrect accounting for srv_hpreq_ratio

Allow one thread to always handle HP requests.
Without this patch, the one thread dedicated to serving HP requests
will do nothing if all normal requests are in progress for
a long time while there are HP requests in the input queue.
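
One way to picture the intended reservation is the C sketch below. It
is a simplified model, not the ptlrpc code: struct svc_state and both
functions are hypothetical, and the sketch ignores srv_hpreq_ratio
entirely. It only shows the rule that normal requests must leave one
thread free, while a queued HP request may use any idle thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; not the actual ptlrpc service structure. */
struct svc_state {
    int threads_running;  /* service threads currently started */
    int reqs_active;      /* requests being handled right now */
    int hp_queued;        /* high-priority requests waiting */
};

/* A normal request may start only if one thread stays free for HP work. */
static int may_start_normal(const struct svc_state *s)
{
    assert(s != NULL);
    return s->reqs_active < s->threads_running - 1;
}

/* A queued HP request may start whenever any thread is idle. */
static int may_start_hp(const struct svc_state *s)
{
    assert(s != NULL);
    return s->hp_queued > 0 && s->reqs_active < s->threads_running;
}
```

With four threads and three active requests, this model refuses another
normal request but still lets a waiting HP request run on the last
thread.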

Xyratex-bug-id: MRP-661
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Nathan Rutman <nathan.rutman@xyratex.com>
Change-Id: I35f5329ca019a5d2e2dcee9d7a13eaa74e85233e
Reviewed-on: http://review.whamcloud.com/4008
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years ago LU-1833 util: don't update mtab if linked to /proc
Peng Tao [Wed, 5 Sep 2012 10:36:05 +0000 (18:36 +0800)]
LU-1833 util: don't update mtab if linked to /proc

Some distros may link /etc/mtab to /proc/mounts or
/proc/self/mounts.  In that case, we don't need to
update mtab.  Otherwise we falsely alert the user with
"
mount.lustre: addmntent: Invalid argument:
"

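The check described above could look roughly like the C sketch below.
This is an assumption about one possible approach, not the code from
the patch: is_proc_path() and resolves_into_proc() are hypothetical
helpers, and the sketch assumes procfs is mounted at /proc. It resolves
the whole symlink chain with realpath() and then tests the prefix.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if an already-resolved path is /proc or lies below it. */
static int is_proc_path(const char *resolved)
{
    assert(resolved != NULL);
    if (strncmp(resolved, "/proc", 5) != 0)
        return 0;
    /* Reject look-alikes such as /procfs by checking the next character. */
    return resolved[5] == '\0' || resolved[5] == '/';
}

/*
 * Return 1 if path (for example /etc/mtab) resolves into /proc,
 * and 0 if it does not or cannot be resolved.
 */
static int resolves_into_proc(const char *path)
{
    char *resolved;
    int rc;

    assert(path != NULL);
    resolved = realpath(path, NULL);  /* follows every symlink in the chain */
    if (resolved == NULL)
        return 0;
    rc = is_proc_path(resolved);
    free(resolved);
    return rc;
}
```

A mount helper using this sketch would skip the mtab update when
resolves_into_proc("/etc/mtab") returns 1.
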
Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I2e3e213e4ee3bc177865c4ca7435a7eecd274b46
Reviewed-on: http://review.whamcloud.com/3881
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-2093 osd: maintain O/<group>/LAST_ID properly
Alex Zhuravlev [Sat, 6 Oct 2012 16:14:25 +0000 (20:14 +0400)]
LU-2093 osd: maintain O/<group>/LAST_ID properly

Unfortunately this is not easy to do from ofd,
so we have to do it in osd.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I389f278ac53da2a65ba8ed4a6e93cb9622eedae2
Reviewed-on: http://review.whamcloud.com/4210
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1607 tests: allow MODOPTS_LIBCFS module params
Andreas Dilger [Sat, 15 Sep 2012 02:53:18 +0000 (20:53 -0600)]
LU-1607 tests: allow MODOPTS_LIBCFS module params

The changes made for forcing non-default CPT partition counts in
commit 95d67b28076f7938a6c962a5256e9b581a439f71 now prevent
overriding libcfs module options using MODOPTS_LIBCFS, because
there is always the cpu_npartitions parameter specified at module
loading time that overrides the environment variable.

Add the cpu_npartitions=2 argument to the environment
variable instead of making it a parameter.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I94bf500164dc23e81080bd2675646b5fa7500c1e
Reviewed-on: http://review.whamcloud.com/3999
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1943 test: Support --with-ldiskfsprogs in test framework
Brian Behlendorf [Thu, 29 Sep 2011 19:27:24 +0000 (12:27 -0700)]
LU-1943 test: Support --with-ldiskfsprogs in test framework

ORI-347 port to the master

When Lustre is built with the --with-ldiskfsprogs option, the
Lustre utilities will expect to find the e2fsprogs utilities
repackaged as ldiskfsprogs on the system.  This is done to
avoid having to replace the distribution-provided e2fsprogs
package.

This change extends the test framework to check for the
ldiskfsprogs utilities to avoid having to explicitly set
the shell variables when --with-ldiskfsprogs is used.
If the tools are not available, it will fall back to using
the e2fsprogs utilities.
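
The fallback described above follows a common "prefer one tool, else
use another" pattern, sketched here in C. The test framework itself is
shell, and pick_tool() is a hypothetical helper; the sketch also
assumes the caller passes full paths rather than searching PATH.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Return preferred if it names an executable file, otherwise fallback.
 * The returned pointer is one of the two arguments; nothing is copied.
 */
static const char *pick_tool(const char *preferred, const char *fallback)
{
    assert(fallback != NULL);
    if (preferred != NULL && access(preferred, X_OK) == 0)
        return preferred;
    return fallback;
}
```

In the situation described by the commit, preferred would be the path
of an ldiskfsprogs utility and fallback the matching e2fsprogs utility.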

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: Ia094d9b005f7e455beca1c5d6d6ad03bdc03671f
Reviewed-on: http://review.whamcloud.com/4194
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago LU-1943 tests: Allocate sequences on MOUNT2 in replay_barrier()
Li Wei [Tue, 10 Jan 2012 08:03:21 +0000 (16:03 +0800)]
LU-1943 tests: Allocate sequences on MOUNT2 in replay_barrier()

Port of ORI-448 to the master

In order to avoid sequence file updates after a target is made
read-only, replay_barrier() creates a file in "MOUNT" on every
client node. This is not enough because the "MOUNT2" clients may
also trigger sequence file updates. This patch makes sure the trick
is applied to "MOUNT2" clients as well.

Change-Id: I832689a2d2ca205d927bc5d0a15ab14fceb3bf80
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/4196
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>