Whamcloud - gitweb
fs/lustre-release.git
11 years agoLU-2237 osd: skip OI ops for local objects
Fan Yong [Sat, 27 Oct 2012 07:45:27 +0000 (00:45 -0700)]
LU-2237 osd: skip OI ops for local objects

We should not add a FID mapping to the OI file for a local object.
Otherwise, if the local object is lost in a system crash, an OI lookup
returns the nonexistent local object, which prevents a new local
object from being created and in turn prevents the server from
mounting.

This issue has been fixed in the OI scrub project, which will be
backported to lustre-2.1 soon. This is a temporary patch to keep the
current lustre-2.1 usable.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Ibe2861818aaef3842605fb7d4e24cc02dad22104
Reviewed-on: http://review.whamcloud.com/4395
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-1779 tests: fix run_one_logged() to log SKIP status
Yu Jian [Fri, 7 Sep 2012 02:58:35 +0000 (10:58 +0800)]
LU-1779 tests: fix run_one_logged() to log SKIP status

In the current test framework, only the tests in the
$ALWAYS_EXCEPT list are logged with SKIP status; all other skipped
tests are logged with PASS status.

This patch fixes the above issue by setting the SKIP status in
pass() and logging the status in run_one_logged().

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I7d4e66982a04d8759e887d88d0e406da719c03bf
Reviewed-on: http://review.whamcloud.com/3899
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
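
Illustration for LU-1779 above: a minimal Python model of the status logic the message describes. The real framework is written in bash; only the names run_one_logged and ALWAYS_EXCEPT come from the commit message. The SkipTest exception, the log list, and the example list contents are assumptions made for this sketch.

```python
# Hypothetical model of the LU-1779 behaviour; not the actual bash code.
ALWAYS_EXCEPT = {"42"}  # assumed example contents


class SkipTest(Exception):
    """Raised by a test body that decides to skip itself."""


def run_one_logged(name, body, log):
    """Run one test and record PASS, SKIP or FAIL in log."""
    if name in ALWAYS_EXCEPT:
        status = "SKIP"
    else:
        try:
            body()
            status = "PASS"
        except SkipTest:
            # Before the fix, this case was logged as PASS.
            status = "SKIP"
        except Exception:
            status = "FAIL"
    log.append((name, status))
    return status
```

The point of the sketch is that a test skipping itself is recorded the same way as one excluded up front.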
11 years agoLU-718 mdt: allow mds_num_threads module parameter
Andreas Dilger [Mon, 27 Aug 2012 21:19:40 +0000 (15:19 -0600)]
LU-718 mdt: allow mds_num_threads module parameter

In 1.8.x the MDS service threads tunable is called "mds_num_threads",
but in 2.0/2.1 it was renamed to mdt_num_threads with no provision
for compatibility.  In 2.3.x it was renamed back to mds_num_threads
(commit bd8835cc2dde6e86701650ccf90423ecd8fb042e) since the threads
are a property of the MDS, not of the MDT(s) on the system.

Add a compatible module option "mds_num_threads" for 2.1 so that it
is possible to upgrade 1.8->2.1->2.4 without problems.  No message
is printed in 2.1.x about deprecation, since it isn't possible to use
only one or the other in 2.1, and a warning will be printed in 2.3+
once the system is upgraded.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3cb095812488b9459e4a3e878757d40410accab0
Reviewed-on: http://review.whamcloud.com/3804
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
11 years agoLU-748 test: shorten the runtime of sanity subtest_220
Hongchao Zhang [Mon, 12 Dec 2011 18:28:08 +0000 (02:28 +0800)]
LU-748 test: shorten the runtime of sanity subtest_220

In sanity.sh, test_220 tries to exhaust all of the inodes on the OSTs
in order to verify that, when an OST returns -ENOSPC to an inode
precreate request but still has free blocks, the MDS continues to use
the inodes already precreated on the OSTs.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: Icaad07311125f362f0efb26da76534c7dca27b6a
Reviewed-on: http://review.whamcloud.com/1676
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-1762 tests: get correct MMP update and check intervals
Yu Jian [Tue, 21 Aug 2012 04:27:09 +0000 (12:27 +0800)]
LU-1762 tests: get correct MMP update and check intervals

This patch fixes the get_mmp_update_interval() and
get_mmp_check_interval() in mmp.sh to get the correct
MMP update and check intervals from both the old and
new outputs of debugfs.

The patch also improves test_8() to increase the running
time of e2fsck to allow mount operation to be started
before e2fsck operation stops.

Test-Parameters: testlist=mmp
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I1b73df9c7e8aea7f9a3967c278b6e82546d26dbf
Reviewed-on: http://review.whamcloud.com/3733
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
11 years agoTag 2.1.3rc3 2.1.3 2.1.3-RC3 v2_1_3 v2_1_3_0 v2_1_3_0_RC3 v2_1_3_RC3
Oleg Drokin [Sat, 25 Aug 2012 04:32:36 +0000 (00:32 -0400)]
Tag 2.1.3rc3

Change-Id: Id3770004ac710228de969d8b1090628b689730e9

11 years agoLU-1540 osd: add NUL terminator for long symlink
Bobi Jam [Thu, 23 Aug 2012 14:56:59 +0000 (22:56 +0800)]
LU-1540 osd: add NUL terminator for long symlink

Add NUL terminator for long symlink to ldiskfs inode on-disk data.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id7ce7829ec9b4c8eb72cf257df046a5288a5eb7b
Reviewed-on: http://review.whamcloud.com/3765
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years ago2.1.3rc2 2.1.3-RC2 v2_1_3_0_RC2 v2_1_3_RC2
Oleg Drokin [Sat, 18 Aug 2012 19:00:36 +0000 (15:00 -0400)]
2.1.3rc2

Change-Id: I1e028505e38120a5270f9fabdec4ce7f91fc454f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1282 lprocfs: Use present cpu numbers to save memory
Bobi Jam [Thu, 5 Apr 2012 06:25:40 +0000 (14:25 +0800)]
LU-1282 lprocfs: Use present cpu numbers to save memory

Port of combined patch from master branch:
Commit 65dc702123f91c4fb2ae25604f98e195fcc15544
Commit 8c831cb8a05f0d6f63b88e9b2dfb85ba4eca217a
Commit 560efa06be97651252caff4ba9bc2c014cf62ff9

* lprocfs stats data should be allocated by the number of present cpus
  instead of by the number of possible cpus, which wastes a lot of memory.
* When new cpus are hot-plugged in, allocate the necessary percpu array
  elements on demand.
* Add a LPROCFS_STATS_FLAG_IRQ_SAFE flag; when a non-percpu stat has
  this flag, lprocfs_stats_lock() should disable irq.
* The OSS minimum thread number is also better decided by the online
  cpu number.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7e80f27123d2e0a6352dc8d01e6ca70b9f137220
Reviewed-on: http://review.whamcloud.com/3607
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
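
Illustration for LU-1282 above: a small Python model of allocating per-CPU counters only for present CPUs and adding slots on demand when a hot-plugged CPU first reports. The class and method names are hypothetical; the real change is in the C lprocfs code and also involves locking, which is left out here.

```python
class PerCpuStats:
    """Per-CPU counters sized by present CPUs, grown on demand."""

    def __init__(self, present_cpus):
        # Allocate slots only for CPUs that are present right now,
        # not for every CPU the system could possibly have.
        self._slots = {cpu: 0 for cpu in range(present_cpus)}

    def add(self, cpu, amount=1):
        # A hot-plugged CPU gets its slot the first time it is used.
        self._slots[cpu] = self._slots.get(cpu, 0) + amount

    def total(self):
        return sum(self._slots.values())

    def allocated(self):
        return len(self._slots)
```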
11 years agoLU-1689 tests: fix mount during e2fsck test
Yu Jian [Wed, 15 Aug 2012 01:40:59 +0000 (09:40 +0800)]
LU-1689 tests: fix mount during e2fsck test

The current mmp test 8 (mount during e2fsck) has two timing issues:
1) the mount operation may start before e2fsck
2) the e2fsck operation may stop before mount

This patch fixes the above issues by providing enough time for the
e2fsck operation to be started before the mount operation, and by
setting the superblock free_blocks_count field to 0 to force e2fsck
to check the Lustre server target device, which provides enough time
for the mount operation to be started during the e2fsck operation.

Test-Parameters: testlist=mmp
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I5be2b84f063a0db386a8d9d48db53c00ebd77864
Reviewed-on: http://review.whamcloud.com/3643
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1625 test: reduce test duration for nfs mode
Keith Mannthey [Mon, 13 Aug 2012 19:55:55 +0000 (12:55 -0700)]
LU-1625 test: reduce test duration for nfs mode

There isn't much value in running long-duration tests in
nfs mode.  Based on original work by Minh Diep.

Test-Parameters: testgroup=full
Signed-off-by: Keith Mannthey <keith@whamcloud.com>
Change-Id: I635d388e4dba5192199602b29ccaae843e9a1346
Reviewed-on: http://review.whamcloud.com/3596
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-969 debug: reduce stack usage
Oleg Drokin [Mon, 13 Aug 2012 17:37:56 +0000 (13:37 -0400)]
LU-969 debug: reduce stack usage

1, libcfs_debug_vmsg2 to accept libcfs_debug_msg_data structure
   to replace SUBSYSTEM, __FILE__, __FUNCTION__, __LINE__ and
   cdls on the stack

2, CDEBUG, DEBUG_CAPA use static libcfs_debug_msg_data

3, remove the local variable in RETURN/GOTO/__CHECK_STACK

4, reduce stack in recovery thread by moving lu_env,
   ptlrpc_thread to heap.

Updated patch to include all 2.3 fixes. (lu1436 and 1408)

Change-Id: I42437a35d428546beadf656602ede9c12c8bd2fd
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/3623
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1484 lprocfs: refine LC_PROCFS_USERS check
Bobi Jam [Tue, 24 Jul 2012 08:40:31 +0000 (16:40 +0800)]
LU-1484 lprocfs: refine LC_PROCFS_USERS check

In some RHEL patched 2.6.18 kernels, the pde_users member is added to
a separate struct proc_dir_entry_aux instead of to struct
proc_dir_entry, where it lives in the later 2.6.23 kernel version.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icee65893b2fbf4d0c3b3e957cb038be99aaf6eb8
Reviewed-on: http://review.whamcloud.com/3471
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years ago2.1.3rc1 2.1.3-RC1 v2_1_3_0_RC1 v2_1_3_RC1
Oleg Drokin [Tue, 7 Aug 2012 22:25:29 +0000 (18:25 -0400)]
2.1.3rc1

Change-Id: Id5c32ebee7287d137b28a81349d5df4584b15e15
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1477 kernel: Kernel update [RHEL6.3 2.6.32-279.2.1.el6]
yangsheng [Thu, 26 Jul 2012 16:52:18 +0000 (00:52 +0800)]
LU-1477 kernel: Kernel update [RHEL6.3 2.6.32-279.2.1.el6]

Add support for RHEL6.3 kernel 2.6.32-279.2.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I1db26247beff1af667d23858a65e0b8d85888485
Reviewed-on: http://review.whamcloud.com/3467
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1216 LBUG: ASSERTION(lli->lli_sai == NULL) failed using robinhood tool
Bob Glossman [Wed, 1 Aug 2012 16:47:10 +0000 (09:47 -0700)]
LU-1216 LBUG: ASSERTION(lli->lli_sai == NULL) failed using robinhood tool

Since statahead is still buggy, this small fix turns it off by default.
This is a workaround and should go away with a future proper fix.

Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia5781a976586cb7105cdeae4d772ce76ea56f0b6
Reviewed-on: http://review.whamcloud.com/3512
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1703 llite: Set page dirty before calling sync io
Jinshan Xiong [Fri, 3 Aug 2012 04:04:12 +0000 (21:04 -0700)]
LU-1703 llite: Set page dirty before calling sync io

This problem was introduced by commit 9053bd5f, where the master patch
was used for b2_1 directly. Unfortunately, the page must be dirty
before vvp_page_sync_io() is called to write it.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ia804040d7a53973f72e13f769a94d52c847fa7f7
Reviewed-on: http://review.whamcloud.com/3521
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1626 lov: fix lov request set finish check race
Bobi Jam [Mon, 16 Jul 2012 11:27:09 +0000 (19:27 +0800)]
LU-1626 lov: fix lov request set finish check race

When several lov_request callbacks are called and one of them is for
the last lov_request in the set, the lov_finished_set() check returns
true for all of them, while the follow-up action is supposed to be
called only once for the set. In this case the assumption is broken
and the lov request set's refcount is wrong.

This patch also fixes another glitch: in qos_remedy_create(), when an
OST pool is used, the ost_idx value is not initialized correctly.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id3ff1777b2146630b2d693e046038fcc6f465309
Reviewed-on: http://review.whamcloud.com/3402
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
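
Illustration for LU-1626 above: one way to guarantee that a "set finished" action runs exactly once is to test the completion condition and claim the finish under the same lock. This Python sketch shows that general idea only; the class, its fields, and claim_finish() are hypothetical and are not the lov_request_set code.

```python
import threading


class RequestSet:
    """Toy request set whose finish action can be claimed only once."""

    def __init__(self, count):
        self._lock = threading.Lock()
        self._count = count
        self._completed = 0
        self._finish_claimed = False

    def complete_one(self):
        with self._lock:
            self._completed += 1

    def claim_finish(self):
        """Return True for exactly one caller once every request is done."""
        with self._lock:
            if self._completed == self._count and not self._finish_claimed:
                self._finish_claimed = True
                return True
            return False
```

Without the claimed flag, every callback that checks after the last completion would see "finished" and run the follow-up action again.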
11 years agoLU-1585 lnet: Fix an incorrect timestamp calculation in lst.c
Doug Oucharek [Thu, 26 Jul 2012 00:11:04 +0000 (17:11 -0700)]
LU-1585 lnet: Fix an incorrect timestamp calculation in lst.c

The operation in routine lst_timeval_diff() (in lst.c) has
a bug.  It uses tv_sec where it should be using tv_usec.

Signed-off-by: Doug Oucharek <doug@whamcloud.com>
Change-Id: I8886428253b62562840aa37842e33b63a29be56b
Reviewed-on: http://review.whamcloud.com/3472
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Isaac Huang <iclaymore@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
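
Illustration for LU-1585 above: a generic timeval subtraction in Python, with each value modelled as a (tv_sec, tv_usec) pair. It is a sketch of the standard technique, not the lst.c source; the function name and tuple representation are assumptions. The microsecond field must be computed from tv_usec, with a borrow from the seconds when it goes negative.

```python
def timeval_diff(later, earlier):
    """Return later - earlier as a normalised (sec, usec) pair."""
    sec = later[0] - earlier[0]
    usec = later[1] - earlier[1]  # microseconds come from tv_usec, not tv_sec
    if usec < 0:
        # Borrow one second so that 0 <= usec < 1_000_000.
        sec -= 1
        usec += 1_000_000
    return sec, usec
```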
11 years agoLU-1432 ptlrpc: LBUG in lprocfs_free_client_stats()
Lai Siyao [Fri, 29 Jun 2012 09:15:41 +0000 (17:15 +0800)]
LU-1432 ptlrpc: LBUG in lprocfs_free_client_stats()

* serialize connect and target obd cleanup to avoid connect
  accessing a nonexistent data structure.
* connect export refcounting cleanup.

Signed-off-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com>
Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: I0a9e8a58ecdc1212565a478f4a758755a1b95f99
Reviewed-on: http://review.whamcloud.com/3244
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1194 llog: fix for not sync llcd at thread stop
Alexander.Boyko [Tue, 15 May 2012 08:55:40 +0000 (12:55 +0400)]
LU-1194 llog: fix for not sync llcd at thread stop

If llog_obd_repl_cancel() happens between llog_sync() and
class_import_put() in filter_llog_finish(), llog_recov_thread_stop()
throws an LBUG. This patch fixes the issue by adding new flags to
llog_ctxt.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Xyratex-bug-id: MRP-456
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I896519ed11abd301a889f658f96950ec15e76f97
Reviewed-on: http://review.whamcloud.com/3480
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1249 debug: Auto correct improper debug buffer size setting
Bobi Jam [Mon, 9 Apr 2012 05:03:51 +0000 (13:03 +0800)]
LU-1249 debug: Auto correct improper debug buffer size setting

Use the minimum required value when the debug buffer size setting
value is too small, and use the maximum acceptable value when it is
too large.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I89def7762f2ec9da3a25d28f7ffa9aede390eb85
Reviewed-on: http://review.whamcloud.com/2489
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
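
Illustration for LU-1249 above: the auto-correction amounts to clamping the requested size into an allowed range. A minimal Python sketch follows; the function name is hypothetical and the limits are passed in as parameters, since the commit message does not state the actual minimum and maximum.

```python
def clamp_debug_mb(requested_mb, min_mb, max_mb):
    """Return requested_mb forced into the range [min_mb, max_mb]."""
    return max(min_mb, min(requested_mb, max_mb))
```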
11 years agoLU-958 tests: debug_mb set incorrectly for smp or vm
Denis Kondratenko [Tue, 27 Mar 2012 07:47:51 +0000 (10:47 +0300)]
LU-958 tests: debug_mb set incorrectly for smp or vm

For cpus with multiple cores, or for some VMs, the
number of possible CPUs in the system could
be greater than the number of cpus reported by getconf.
Added a check for the maximum debug buffer size.
Added a check that "possible" exists; if not, use the old method.

Xyratex-bug-id: MRP-219 incorrect settings for debug_mb
Signed-off-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Change-Id: I1ea367d1b956ae1009c4a501e0f02b6c9209a2f7
Reviewed-on: http://review.whamcloud.com/2377
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
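
Illustration for LU-958 above: a Python sketch of counting possible CPUs with a fallback when the "possible" file is missing. It assumes the file meant is the Linux sysfs cpulist /sys/devices/system/cpu/possible (for example "0-7" or "0-3,8-11"), and it uses os.cpu_count() as a stand-in for the getconf-based old method. The real test code is shell; the function names here are hypothetical.

```python
import os


def count_cpu_list(spec):
    """Count CPUs in a kernel cpulist string such as '0-3,8-11'."""
    total = 0
    for part in spec.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single CPU like '5' has no upper bound, so reuse lo.
        total += int(hi or lo) - int(lo) + 1
    return total


def possible_cpus(path="/sys/devices/system/cpu/possible"):
    """Read the possible-CPU count, falling back if the file is absent."""
    try:
        with open(path) as f:
            return count_cpu_list(f.read())
    except OSError:
        # Old method stand-in (assumption): the count of usable CPUs.
        return os.cpu_count() or 1
```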
11 years agoLU-958 tests: debug_mb set incorrectly for smp or vm
Denis Kondratenko [Fri, 10 Feb 2012 10:54:45 +0000 (12:54 +0200)]
LU-958 tests: debug_mb set incorrectly for smp or vm

For cpus with multiple cores, or for some VMs, the
number of possible CPUs in the system could
be greater than the number of cpus reported by getconf.
Added a check for the maximum debug buffer size.

Xyratex-bug-id: MRP-219 incorrect settings for debug_mb

Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Reviewed-by: Alexey Lyashko <Alexey_Lyashko@xyratex.com>
Signed-off-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Change-Id: I7001af7b1c88d5be056734d7d73a0263cca01627
Reviewed-on: http://review.whamcloud.com/1912
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1511 kernel: kernel update [RHEL5.8 2.6.18-308.11.1.el5]
yangsheng [Tue, 31 Jul 2012 10:47:38 +0000 (18:47 +0800)]
LU-1511 kernel: kernel update [RHEL5.8 2.6.18-308.11.1.el5]

Update RHEL5.8 kernel to 2.6.18-308.11.1.el5.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: Ibe1a40d45394e9b7ae5fdfdfeaa37f9d3653f022
Reviewed-on: http://review.whamcloud.com/3498
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-575 MRP-133 lfs find speedup
Vitaly Fertman [Wed, 25 May 2011 21:07:09 +0000 (01:07 +0400)]
LU-575 MRP-133 lfs find speedup

lfs find should send getattr on mds only if needed;
lfs find should not break on matched obd but check other parameters as well;
lfs find time compare fixes;

Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Colin Faber <colin.faber@xyratex.com>
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
LU-575 MRP-260 fix quota tests

a fix for quota size units which conflicted with lfind size units

Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Id0484955602d6504b622006207ba7be4f183529f
Reviewed-on: http://review.whamcloud.com/2644
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1158 general: added nanosecond OBD connect flag
Isami Romanowski [Wed, 11 Jul 2012 20:44:14 +0000 (15:44 -0500)]
LU-1158 general: added nanosecond OBD connect flag

To prevent collisions with any future flags needed in features written
against this branch.

Signed-off-by: Isami Romanowski <isami@whamcloud.com>
Change-Id: I8637cc081c4c8f3f1b7d3e2065dbf9ea45d09bfa
Reviewed-on: http://review.whamcloud.com/3378
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
11 years agoLU-1442 llite: cleanup if a page failed to add into cache
Jinshan Xiong [Mon, 23 Jul 2012 14:09:51 +0000 (22:09 +0800)]
LU-1442 llite: cleanup if a page failed to add into cache

In lustre, we assume that a dirty page must be queued in the osc cache
for writing. However, in vvp_io_commit_write(), if a page fails to be
added to the cache, the page dirty flag isn't cleared, so the page
will never be added to the cache again.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I1c132c6f1d4f5845682e51850eb895b292fc5f0d
Reviewed-on: http://review.whamcloud.com/3447
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
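
Illustration for LU-1442 above: a Python sketch of the cleanup pattern the message implies, namely undoing the dirty flag when queueing the page fails. The Page class, the commit_write() helper, and the use of OSError are assumptions for this sketch; the actual fix is in the C function vvp_io_commit_write() and may differ in detail.

```python
class Page:
    def __init__(self):
        self.dirty = False


def commit_write(page, add_to_cache):
    """Mark the page dirty and queue it; undo the flag on failure."""
    page.dirty = True
    try:
        add_to_cache(page)
    except OSError:
        # Leave the page clean so a later write can queue it again.
        page.dirty = False
        raise
```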
11 years agoLU-1059 clio: to not try to discard freeing pages
Jinshan Xiong [Tue, 31 Jan 2012 21:30:21 +0000 (13:30 -0800)]
LU-1059 clio: to not try to discard freeing pages

This is a bug introduced by LU-948. We should check whether we own
the page successfully before trying to discard it.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I30631be98f1fcc1b98abe727c8c6984b918bfffd
Reviewed-on: http://review.whamcloud.com/2073
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-948 clio: add a callback to cl_page_gang_lookup()
Jinshan Xiong [Thu, 12 Jan 2012 00:03:41 +0000 (16:03 -0800)]
LU-948 clio: add a callback to cl_page_gang_lookup()

Add a callback to cl_page_gang_lookup() so that it will be easier to
fix this issue and be helpful for new IO engine.

If a read lock is being canceled, we used to grab page lock and then
check if they are covered by another lock, otherwise they will be
discarded. This is unnecessary because we can do this w/o grabbing
page lock.

With the above fix, when a read-ahead page is in IO during recovery,
and one of covering locks is being canceled by early cancel for
recovery, it will detect that this page is being covered by another
one, and then this page will be skipped w/o trying to grab page lock.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I22a3ea0790f5c0e01c12c29208b6d60c38058f12
Reviewed-on: http://review.whamcloud.com/1955
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1576 llite: correct page usage count
Bobi Jam [Mon, 2 Jul 2012 08:56:07 +0000 (16:56 +0800)]
LU-1576 llite: correct page usage count

If kernel has add_to_page_cache_lru(), the ll_pagevec_add() is defined
as an empty function, while page_cache_get(page) only makes sense if
ll_pagevec_add() is defined.

This patch moves page_cache_get into ll_pagevec_add() macro
definition.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iad98aacff43beec3e7a64fd1a778f549250aa5b8
Reviewed-on: http://review.whamcloud.com/3255
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1620 lnet: Make asym router failure parameters tunable
Joseph Herring [Wed, 12 May 2010 23:49:47 +0000 (16:49 -0700)]
LU-1620 lnet: Make asym router failure parameters tunable

Make the asymmetric router failure parameters tunable.

Change-Id: Ie36f79d01c35d4c11c4532187abdeb9473ea60b4
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/3371
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1493 quota: extra release caused by race
Niu Yawei [Mon, 11 Jun 2012 10:55:32 +0000 (03:55 -0700)]
LU-1493 quota: extra release caused by race

There is a race between check_cur_qunit() and
dqacq_completion(): check_cur_qunit() reads the hardlimit
and calculates how much quota needs to be acquired/released
based on it; however, the hardlimit can be
changed by dqacq_completion() at any time. That
could result in an extra quota acquire/release when there
is an inflight dqacq.

In general, such an extra dqacq doesn't cause a fatal error
unless an extra release is going to release more than
'hardlimit' quota.

To minimize the code changes (anyway, it'll be totally
rewritten in the new quota design), we just do one more
check here to avoid the extra release which could bring
fatal error. A better solution could be calculating the
qd_count here and removing the lqs_blk/ino_rec stuff.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0ad5ff0f32e39f32872c201ad1d545fbd9d1a57d
Reviewed-on: http://review.whamcloud.com/3074
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-1471 tests: check rpcidmapd service in setup-nfs.sh
Yu Jian [Wed, 18 Jul 2012 12:35:30 +0000 (20:35 +0800)]
LU-1471 tests: check rpcidmapd service in setup-nfs.sh

The rpcidmapd system service is not in SLES11 distro, which
caused "service: no such service rpcidmapd" error while running
setup-nfs.sh. This patch fixes the above issue by checking the
service before restarting or stopping it.

Test-Parameters: testlist=parallel-scale-nfsv3,parallel-scale-nfsv4

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I87b0d496c0214329fa185a935a3e049a5dd2a1f4
Reviewed-on: http://review.whamcloud.com/3431
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1418 osc: remove DEADLOCK error messages
Alexander.Boyko [Thu, 17 May 2012 12:48:09 +0000 (16:48 +0400)]
LU-1418 osc: remove DEADLOCK error messages

Deadlock is impossible in the current code, and the check
is left over from some previous version. It can be removed.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Xyratex-bug-id: MRP-497
Change-Id: Ifbd4270739894c946553952d86ff931c4c707791
Reviewed-on: http://review.whamcloud.com/2825
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1095 debug: fix missing CDEBUG() newline
Andreas Dilger [Tue, 19 Jun 2012 04:31:24 +0000 (22:31 -0600)]
LU-1095 debug: fix missing CDEBUG() newline

The console message cleanup 389fde827be2ee6fb4ee08e955d773a2a16e70c6
lost a newline in one of the messages, which itself generates a
warning on the console.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie5eed2977e496e082d5b8e62bfc39e0df93fcab0
Reviewed-on: http://review.whamcloud.com/3134
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1129 obdfilter: handle race condition of recreating objects
Yu Jian [Tue, 3 Jul 2012 14:22:48 +0000 (22:22 +0800)]
LU-1129 obdfilter: handle race condition of recreating objects

During OST recovery, a race can happen while handling replayed
OST_WRITE request during the MDS->OST orphan recovery period to
recreate missing objects, which can trigger ASSERTION(diff >= 0)
failure.

This patch handles the above issue by adding obd->obd_recovering
into the assertion to check whether the OST is in recovery or not.
If it's in recovery and diff < 0, then no assertion failure occurs,
the object has been recreated. If the OST is not in recovery and
diff < 0, then the assertion failure occurs.

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: Id62970067b3507e832fb65b3ff623e6e67f3becc
Reviewed-on: http://review.whamcloud.com/3264
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
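
Illustration for LU-1129 above: the relaxed assertion described in the message reduces to a single predicate. In this Python sketch the recovering argument stands for obd->obd_recovering; the function name is hypothetical.

```python
def diff_is_acceptable(diff, recovering):
    """Mirror the relaxed check: diff >= 0, or the OST is in recovery."""
    return diff >= 0 or recovering
```

A negative diff is tolerated only during recovery, where it means the object has already been recreated.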
11 years agoLU-1378 fid: Add console info for super seq allocation
wangdi [Wed, 9 May 2012 05:18:20 +0000 (22:18 -0700)]
LU-1378 fid: Add console info for super seq allocation

Add console information for super sequence allocation. Because
one super sequence includes 1 billion sequences, allocation rarely
happens in reality, so it will not flood the console with
messages.

Signed-off-by: Di Wang <di.wang@whamcloud.com>
Change-Id: I5154ee2a03006680b6a08d588287bf3941149457
Reviewed-on: http://review.whamcloud.com/2701
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1428 ldlm: fix a race in ldlm_lock_destroy_internal
Liang Zhen [Tue, 5 Jun 2012 08:34:34 +0000 (16:34 +0800)]
LU-1428 ldlm: fix a race in ldlm_lock_destroy_internal

ldlm_lock::l_exp_hash should be protected by the internal lock of
cfs_hash, but ldlm_lock_destroy_internal() called
cfs_hlist_unhashed(lock::l_exp_hash) w/o holding the cfs_hash lock.
If someone calls ldlm_lock_cancel on a lock while
export::exp_lock_hash is being rehashed (thread context of
cfs_workitem), there is a tiny window between deleting this lock
from bucket[A] and re-adding it to bucket[B] of l_exp_hash, and
cfs_hlist_unhashed(lock::l_exp_hash) returns 1 in this window.
We then destroy the lock but leave it on l_exp_hash forever, because
lock::l_destroyed has been set to 1 and ldlm_lock_destroy_internal()
can't remove the lock from l_exp_hash no matter how many times it is
called in ldlm_cancel_locks_for_export_cb().

This patch also added some debug information to
ldlm_cancel_locks_for_export_cb in case this patch can't fix this
problem.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ia0932658b3f085a55535e36bee4fb833e74fa242
Reviewed-on: http://review.whamcloud.com/3028
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-857 security: Lustre client tolerates enforced SELinux.
Aurelien Degremont [Mon, 14 Nov 2011 15:25:57 +0000 (16:25 +0100)]
LU-857 security: Lustre client tolerates enforced SELinux.

Fix a bug which prevents Lustre clients from accessing directories
when SELinux is enforced, on RHEL 6.
This patch does not add real SELinux support to Lustre, but it allows
SELinux to be activated for all other local filesystems without Lustre
misbehaving.

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: Ia6692c96a8439eb9239cb55ce32a1c54958241d1
Reviewed-on: http://review.whamcloud.com/1703
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1423 mdt: 16K pagesize clients error during ls
yangsheng [Sat, 2 Jun 2012 18:14:36 +0000 (02:14 +0800)]
LU-1423 mdt: 16K pagesize clients error during ls

A 1.8.x client needs the entire page to be returned by readdir,
even if the page is only partially filled.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I762e63dafc511537e3f9e47782dc328a0d7c69de
Reviewed-on: http://review.whamcloud.com/3014
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1164 o2iblnd: param to tune number of kib scheduler threads
Sebastien Buisson [Wed, 6 Jun 2012 14:23:52 +0000 (16:23 +0200)]
LU-1164 o2iblnd: param to tune number of kib scheduler threads

This patch gives the ability to control the number of kib scheduler
threads launched by the ko2iblnd module, via a kernel module option.

Indeed, we can have some situations where the default threads number
is not appropriate. The default value is to create as many threads as
the number of CPU cores.

On certain platforms, we have noticed a performance penalty
when running too many kib scheduler threads.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: I2f7baa2f9bc9350502d1a0738c8e72777b57fa57
Reviewed-on: http://review.whamcloud.com/3047
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
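
Illustration for LU-1164 above: a Python sketch of how a thread-count option with a per-core default typically resolves. The parameter name nscheds and the convention that 0 means "use the default" are assumptions; the commit message only says that the default is one thread per CPU core and that a module option can override it.

```python
import os


def kib_scheduler_threads(nscheds=0, ncpus=None):
    """Return the number of scheduler threads to start."""
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    # 0 (assumed convention) keeps the default of one thread per core.
    return nscheds if nscheds > 0 else ncpus
```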
11 years agoLU-1522 recovery: rework LU-1166 patch in different way
Mikhail Pershin [Sun, 17 Jun 2012 11:08:36 +0000 (15:08 +0400)]
LU-1522 recovery: rework LU-1166 patch in different way

Dropping recovery counters upon the last export put caused the LU-1522
issue. Move class_export_recovery_cleanup() back into
class_export_disconnect() and use the exp_failed flag to avoid a race
between target_handle_connect() and class_disconnect_stale_exports().

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I78c19a8d49786877d2de27c82bf40ebec494f044
Reviewed-on: http://review.whamcloud.com/3145
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1203 mdt: recognize old rootsquash and nosquash_nid params
Yu Jian [Fri, 29 Jun 2012 03:40:52 +0000 (11:40 +0800)]
LU-1203 mdt: recognize old rootsquash and nosquash_nid params

Change mdt_process_config() to make it capable of recognizing
old "mdt.rootsquash" and "mdt.nosquash_nid" parameters.

The new parameters are "mdt.root_squash" and "mdt.nosquash_nids".
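
The aliasing described above can be pictured as a simple name mapping.
The sketch below is a shell illustration only: the actual change is in
mdt_process_config() in C, and the helper name normalize_squash_param
is hypothetical.

```shell
# Hypothetical helper: map an old squash parameter name to its new name.
# Names other than the two old ones quoted above pass through unchanged.
normalize_squash_param() {
    case "$1" in
        mdt.rootsquash)   echo "mdt.root_squash" ;;
        mdt.nosquash_nid) echo "mdt.nosquash_nids" ;;
        *)                echo "$1" ;;
    esac
}

normalize_squash_param mdt.rootsquash
normalize_squash_param mdt.nosquash_nids
```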

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I258609ea6f63f14f2e16ec141de1455dd8137b9c
Reviewed-on: http://review.whamcloud.com/3237
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years ago2.1.2 RC3 2.1.2 2.1.2-RC3 v2_1_2 v2_1_2_0 v2_1_2_0_RC3 v2_1_2_RC3
Oleg Drokin [Fri, 8 Jun 2012 19:23:44 +0000 (15:23 -0400)]
2.1.2 RC3

Update version to 2.1.2

Change-Id: I64345a2327371b00daa3b9d7148526809f972ac9
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1467 ost: ASSERTION(lock->l_req_mode == lock->l_granted_mode)
yangsheng [Wed, 6 Jun 2012 08:15:47 +0000 (16:15 +0800)]
LU-1467 ost: ASSERTION(lock->l_req_mode == lock->l_granted_mode)

The lock may be cancelled while ost_prolong_lock_one() is being
invoked, so just return in this case.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: Ica6ad9199e4b210145e99d2420925803b18a7edd
Reviewed-on: http://review.whamcloud.com/3042
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1166 recovery: don't leak a connected client counter.
Bob Glossman [Tue, 22 May 2012 18:20:26 +0000 (11:20 -0700)]
LU-1166 recovery: don't leak a connected client counter.

A target_handle_connect vs client eviction race may leak a
connected client counter, and some evicted clients will be counted twice.

Xyratex-bug: MRP-451

additional changes to complete pieces left out of previous commit

Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I4218fcc8f5eacc8ddd61fe5d6d22ec4d5eace00a
Reviewed-on: http://review.whamcloud.com/2874
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoUpdate release date for new RC 2.1.2-RC2 v2_1_2_RC2
Oleg Drokin [Sun, 27 May 2012 16:31:18 +0000 (12:31 -0400)]
Update release date for new RC

Change-Id: I3d91ec3d3ef40f69844c09c1cab5c9a24ebbd642
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-549 llite: Improve statfs performance if selinux is disabled
Yevheniy Demchenko [Tue, 10 Apr 2012 20:01:14 +0000 (22:01 +0200)]
LU-549 llite: Improve statfs performance if selinux is disabled

Even if selinux is disabled, the client still tries to get selinux
attributes from the MDS. As xattrs are not yet cached, this significantly
slows down xattr-heavy operations like ls -l. This patch forces
-EOPNOTSUPP to be returned on the client side if selinux is disabled.
It speeds up ls -l by 25% for the cold-cache case and 50% for the
hot-cache case.

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Signed-off-by: Keith Mannthey <keith@whamcloud.com>
Change-Id: I0c24bd8559818b0fae29a082790b392095f91ab5
Reviewed-on: http://review.whamcloud.com/2904
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1398 build: Module.symvers dependencies
Minh Diep [Thu, 24 May 2012 20:06:58 +0000 (13:06 -0700)]
LU-1398 build: Module.symvers dependencies

Ensure a Module.symvers file is generated with the correct
symbols for the configured lustre backend filesystems. This
is accomplished by adding a generic module-symvers rule which
depends on a filesystem specific version of the rule.  When a
filesystem is not configured the result is an empty rule.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I5333e226f69ca75b6a959cc1ed673d640da22b23
Reviewed-on: http://review.whamcloud.com/2898
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-969 Revert stack usage reduction patch
Oleg Drokin [Fri, 25 May 2012 04:06:59 +0000 (00:06 -0400)]
LU-969 Revert stack usage reduction patch

This patch introduced quite a few problems in the end.
Broke return values on 32bit systems (LU-1436)
Local io performance regression (LU-1408)

Revert "LU-969 debug: reduce stack usage"

This reverts commit b9cbe3616b6e0b44c7835b1aec65befb85f848f9.

Change-Id: I9966d9490e5016ef95d3ca088796ae187af318d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1134 test: can not assume lustre setup before nfs test 2.1.2-RC1 v2_1_2_RC1
Minh Diep [Thu, 24 May 2012 08:32:11 +0000 (16:32 +0800)]
LU-1134 test: can not assume lustre setup before nfs test

During autotest, lustre can be unmounted. parallel-scale-nfs
test should not assume that lustre is mounted and skip the setup.
This patch also includes the fix for LU-1213.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I81fd5a428f8367f68928716b5635bf94bcc7590c
Reviewed-on: http://review.whamcloud.com/2565
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-532 llite: trusted. xattr is invisible to non-root
Bob Glossman [Thu, 17 May 2012 18:41:59 +0000 (11:41 -0700)]
LU-532 llite: trusted. xattr is invisible to non-root

Filter out all invalid xattrs in listxattr.
This includes trusted. xattrs that can cause
unnecessary "EPERM" in subsequent getxattr operations.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ic32fe262772370cd837bef878c9bfd9eefc0ec3c
Reviewed-on: http://review.whamcloud.com/2490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-629 ptlrpc: fix _debug_req to print opc/status
Andreas Dilger [Wed, 24 Aug 2011 21:17:53 +0000 (15:17 -0600)]
LU-629 ptlrpc: fix _debug_req to print opc/status

The 2.x _debug_req() function was changed in bug 16359/commit 5467a86021
to avoid problems with accessing unswabbed message buffers. Unfortunately,
this broke the printing of many/most _debug_req() messages, because it
didn't check whether swabbing was actually needed in the first place.

Also, in ptlrpc_expire_one_request() some extra debugging information was
added in bug 21636/commit 368689640 but never removed, making this common
message overly verbose.

Fix _debug_req() so that it prints opcode/flags/status, unless the
ptlrpc_body _needs_ to be swabbed, but isn't.  Also print out more
useful identifiers for the nodes (the obd_name and NID instead of
the connection UUID).  This removes some of the added verbosity from
ptlrpc_expire_one_request(), and most of the rest was already being
printed out (deadline, current, etc).

Change-Id: I88a78486becd19f5b38f5578e5cc30e649564908
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1286
Tested-by: Hudson
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Ported-by: Keith Mannthey <keith@whamclound.com>
Reviewed-on: http://review.whamcloud.com/2875

12 years agoLU-1424 kernel: Kernel update [RHEL6.2 2.6.32-220.17.1.el6]
yangsheng [Mon, 21 May 2012 15:43:50 +0000 (23:43 +0800)]
LU-1424 kernel: Kernel update [RHEL6.2 2.6.32-220.17.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.17.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I01b238bd6d4ca52eeb8a36bc404f2557e5aa653b
Reviewed-on: http://review.whamcloud.com/2850
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-992 ldiskfs: fix typo for rhel5 ldiskfs patches
yangsheng [Wed, 11 Apr 2012 05:27:48 +0000 (13:27 +0800)]
LU-992 ldiskfs: fix typo for rhel5 ldiskfs patches

A typo was introduced a long time ago. Fix it even though rhel5
support will be deprecated.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I10564cd8dee7d62e05616869044dab0930a5638a
Reviewed-on: http://review.whamcloud.com/2506
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1274 osc: Do not grab mutex of cl_lock for glimpse
Jinshan Xiong [Fri, 30 Mar 2012 19:57:34 +0000 (12:57 -0700)]
LU-1274 osc: Do not grab mutex of cl_lock for glimpse

Otherwise this will cause client eviction if that lock is being
flushed and OST happens to be slow to finish the IO.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I4d7d9e8c275653d4e3f50f81dc416142d4905377
Reviewed-on: http://review.whamcloud.com/2808
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-425 tests: fix the issue of using "grep -w"
Yu Jian [Wed, 16 May 2012 08:19:00 +0000 (16:19 +0800)]
LU-425 tests: fix the issue of using "grep -w"

This patch fixes the following issue while using "grep -w"
to do exact match:

$ echo /mnt/nbp0-2 | grep -w /mnt/nbp0
/mnt/nbp0-2

Per the description of "-w" option:
-w, --word-regexp
Select only those lines containing matches that form whole words.
The test is that the matching substring must either be at the
beginning of the line, or preceded by a non-word constituent
character. Similarly, it must be either at the end of the line
or followed by a non-word constituent character. Word-constituent
characters are letters, digits, and the underscore.

So, the hyphen "-" character is a non-word constituent character
and "grep -w" does not do exact match on strings which contain it.
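
One way to get an exact match is to compare whole lines as fixed
strings. This is a minimal sketch and not necessarily the approach the
patch takes; the helper name is_exact_match is hypothetical.

```shell
# Hypothetical helper: succeeds only if $1 equals $2 exactly.
# -F treats the pattern as a fixed string, -x requires a whole-line
# match, and -q suppresses output so only the exit status is used.
is_exact_match() {
    printf '%s\n' "$1" | grep -qxF -- "$2"
}

if is_exact_match /mnt/nbp0-2 /mnt/nbp0; then
    echo "match"
else
    echo "no match"
fi
```

With the inputs from the example above this prints "no match", whereas
"grep -w" reports a match.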

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I61e611aad78748ad1e6362c7df3e0792e2766016
Reviewed-on: http://review.whamcloud.com/2801
Tested-by: Hudson
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Report remaining recovery time consistently
Christopher J. Morrone [Mon, 27 Feb 2012 00:20:47 +0000 (16:20 -0800)]
LU-1095 debug: Report remaining recovery time consistently

Consistency is good: always report the remaining recovery time
in the mm:ss format.  This patch gets the last 3 remaining
instances where it is simply reported as a total number of seconds.
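
As an illustration of the mm:ss conversion, here is a small shell
sketch. The helper name format_mmss and the exact field widths are
assumptions; the patch itself changes C format strings.

```shell
# Hypothetical helper: print a number of seconds as mm:ss.
format_mmss() {
    secs=$1
    # Integer division gives the minutes, the remainder gives the seconds.
    printf '%d:%02d\n' $((secs / 60)) $((secs % 60))
}

format_mmss 125   # prints 2:05
```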

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: If5599d8c24b1cd862ab89670553fcd24672cadbc
Reviewed-on: http://review.whamcloud.com/2204
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
(cherry picked from commit e8c6d2e9647b2dc95edddac5e902168816e7f57b)
Reviewed-on: http://review.whamcloud.com/2834

12 years agoLU-1095 debug: Common client/server message standardization
Christopher J. Morrone [Mon, 27 Feb 2012 00:16:51 +0000 (16:16 -0800)]
LU-1095 debug: Common client/server message standardization

Enhance and standardize several common messages.  In particular,
when a peer is involved ensure the peer's NID is in the message, and
on the server include the obd name in the message.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: Iaea477e7dab240866a10c1863886d21d674e293d
Reviewed-on: http://review.whamcloud.com/2200
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Ported-by: Keith Mannthey <keith@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2833

12 years agoLU-1280 ldiskfs: remove EXT_ASSERT from ext3_ext_new_extent_cb()
Yu Jian [Thu, 17 May 2012 14:30:03 +0000 (22:30 +0800)]
LU-1280 ldiskfs: remove EXT_ASSERT from ext3_ext_new_extent_cb()

The EXT_ASSERT() in ext3_ext_new_extent_cb() is invalid since
new locking is introduced in ext4_ext_walk_space().

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I8de3ad4004c304a45be14347df50bf066d8f4caa
Reviewed-on: http://review.whamcloud.com/2827
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1366 utils: disable ldiskfs extents feature for MDT
Bobi Jam [Tue, 15 May 2012 16:03:03 +0000 (00:03 +0800)]
LU-1366 utils: disable ldiskfs extents feature for MDT

Explicitly disable "extents" for the MDT filesystem if it's based on
ext4, since it provides no benefit for the MDT.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I284c6c207fb8cc79537bebd60b6ab8d836fd4ed9
Reviewed-on: http://review.whamcloud.com/2798
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1205 tests: sanityn test_18 sometimes takes long time to run
Jinshan Xiong [Fri, 13 Apr 2012 23:15:51 +0000 (16:15 -0700)]
LU-1205 tests: sanityn test_18 sometimes takes long time to run

This is a live-lock problem where two processes are writing to the
same mmaped file via two nodes. To write a mmap region, both processes
will do:

  acquire cl_lock -> read page -> release cl_lock -> install page.

During the above steps, the page can be truncated after the lock is
released and then immediately cancelled by the other process, so
kernel has to do page fault again and never complete.

Lustre can't handle this case well so this test case is disabled.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I0cbb00a1ca68715a0b97ce369a18c53fa8de19cb
Reviewed-on: http://review.whamcloud.com/2723
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1205 tests: cleanup code style in mmap_sanity.c
Andreas Dilger [Mon, 12 Mar 2012 20:43:45 +0000 (14:43 -0600)]
LU-1205 tests: cleanup code style in mmap_sanity.c

Cleanup numerous code style issues in the mmap_sanity.c test:
- whitespace at end of line
- spaces around operators
- indentation
- line wrapping at 80 columns

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: If47eeeb1dec2705b9aa4e70cba3c1bc9241546a7
Reviewed-on: http://review.whamcloud.com/2722
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1205 tests: add timestamps to sanityn 18 mmap
Andreas Dilger [Mon, 12 Mar 2012 20:23:07 +0000 (14:23 -0600)]
LU-1205 tests: add timestamps to sanityn 18 mmap

The sanityn.sh test_18 mmap_sanity.c test sometimes takes over
an hour to run, and sometimes only seconds.  Add timestamps to
the subtest results so that it is possible to debug where that
time is being spent.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I641566c9a0b204095ad0c2e3bee852a0e8fd6881
Reviewed-on: http://review.whamcloud.com/2721
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-980 llog: cleanup return value in llog_client_create
Hongchao Zhang [Thu, 12 Jan 2012 13:29:00 +0000 (21:29 +0800)]
LU-980 llog: cleanup return value in llog_client_create

In llog_client_create(), the newly allocated llog_handle is
returned through the parameter res, but res is not cleaned up
if the following operations fail and the corresponding
llog_handle has already been freed.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib8c40c53b071fff7de3550a39f009915cb8511a7
Reviewed-on: http://review.whamcloud.com/2806
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1102 crypto: correctly check crypto_alloc_blkcipher returns
Bobi Jam [Wed, 9 May 2012 19:22:58 +0000 (03:22 +0800)]
LU-1102 crypto: correctly check crypto_alloc_blkcipher returns

ll_crypto_alloc_blkcipher() can return an error value as well as a
NULL pointer, so its return value should be checked carefully.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I181b236406e2649580a04940886f849ad6071078
Reviewed-on: http://review.whamcloud.com/2703
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]
James Simmons [Wed, 9 May 2012 15:02:39 +0000 (11:02 -0400)]
LU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]

Update RHEL5.8 kernel to 2.6.18-308.4.1.el5.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I1025558c97b1d6887d52020857a997cdc495d865
Reviewed-on: http://review.whamcloud.com/2684
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1345 tests: sanity test 215 non integer handling fix
James Simmons [Fri, 4 May 2012 11:40:49 +0000 (07:40 -0400)]
LU-1345 tests: sanity test 215 non integer handling fix

Sanity test 215 tests the format of various /proc/sys/lnet/* files.
Some of those files hold integer values, but there can be times when
no valid number is available, so NA is reported. This patch
handles those cases.
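
A check that tolerates NA could look like the sketch below. It is an
assumption about the shape of the fix, not the test's actual code: the
helper name is_int_or_na is hypothetical, and only non-negative
integers are accepted here.

```shell
# Hypothetical check: accept a non-negative integer or the literal NA.
is_int_or_na() {
    case "$1" in
        NA)          return 0 ;;   # no valid number available
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;   # digits only
    esac
}

for value in 42 NA 4x2; do
    if is_int_or_na "$value"; then
        echo "$value: ok"
    else
        echo "$value: bad format"
    fi
done
```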

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If03a57e378e98ddd689c0e555fc8c9dc87d39138
Reviewed-on: http://review.whamcloud.com/2603
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-554 lnet: add gnilnd awareness to LNet
James Simmons [Wed, 9 May 2012 14:33:24 +0000 (10:33 -0400)]
LU-554 lnet: add gnilnd awareness to LNet

This allows servers on any network to talk to gnilnd routers.
This is the 2.1 version of the Oracle 23884 attachment 31892.

Change-Id: I96777551b0caa50021ebb32755caaa01623ea97d
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/2449
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1308 Additional multihomed nid config fix
Oleg Drokin [Wed, 25 Apr 2012 19:28:22 +0000 (15:28 -0400)]
LU-1308 Additional multihomed nid config fix

Need to put the new nid addition at the last slot available,
not next after the last.

Change-Id: Icf9d898fba4c6e9c05f085b855a33282ea0d4b47
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2599
Reviewed-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-1308 Properly add multihomed nids to peer table
Oleg Drokin [Tue, 17 Apr 2012 06:31:10 +0000 (02:31 -0400)]
LU-1308 Properly add multihomed nids to peer table

class_add_uuid had a copy&paste error where it was checking against
wrong entry for nid tables and as such had trouble finding multihomed
nid configurations.

Change-Id: I2d73bdde9cf7b0bf882b14b473b4491873e64c25
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2561
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
12 years agoLU-630 lnet: only router checks peer health
Lai Siyao [Mon, 5 Dec 2011 07:28:39 +0000 (15:28 +0800)]
LU-630 lnet: only router checks peer health

The peer health code is designed for routers, so a non-router (~rtr)
node always assumes peers to be alive.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib794feace322112988a5b727ed40fb38f8f57370
Reviewed-on: http://review.whamcloud.com/2646
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Send common recovery messages to D_HA
Christopher J. Morrone [Sun, 26 Feb 2012 23:05:14 +0000 (15:05 -0800)]
LU-1095 debug: Send common recovery messages to D_HA

These messages are always present at recovery time, and are not
understandable by a sysadmin.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I907b0ac49541b20699914dc4f8c5e0db3fb6bec9
Reviewed-on: http://review.whamcloud.com/2198
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Improve recovery console messages
Christopher J. Morrone [Sat, 3 Mar 2012 01:41:45 +0000 (17:41 -0800)]
LU-1095 debug: Improve recovery console messages

Quiet and/or improve a few recovery messages.

A sysadmin will not understand this:

  2012-03-02 16:27:19 Lustre: 5211:0:(ldlm_lib.c:2072:
  target_queue_recovery_request()) Next recovery transno: 410629539,
  current: 410629539, replaying

Messages like this are too verbose for the console:

  2012-03-02 16:27:59 LustreError: 5286:0:
  (genops.c:1270:class_disconnect_stale_exports())
  lc3-OST0004: disconnect stale client
  47808f4f-9f36-e8eb-f363-14b1abe4ac57@<unknown>

and can be left to this simpler message:

  2012-03-02 16:27:59 Lustre: lc3-OST0005: disconnecting 0 stale
  clients

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I457602c3440ba10475e4ddca7c4e58ef8669922c
Reviewed-on: http://review.whamcloud.com/2249
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Liu Xuezhao <xuezhao.liu@emc.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: CWARN to CDEBUG for mds_notify() event
Brian Behlendorf [Fri, 19 Feb 2010 19:53:55 +0000 (11:53 -0800)]
LU-1095 debug: CWARN to CDEBUG for mds_notify() event

Both of these warnings represent correct behavior the administrator
does not need to know about, or more importantly do anything about.
As such I am moving both of these warnings to CDEBUG(D_CONFIG).

  Lustre: 8099:0:(mds_lov.c:1167:mds_notify()) MDS lc1-MDT0000:
  add target lc1-OST0023_UUID

  Lustre: lc1-MDT0000: in recovery, not resetting orphans on
  lc1-OST0007_UUID

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I66a98d87e3d5de7205420c74db4f6d9bcaaf31a7
Reviewed-on: http://review.whamcloud.com/2202
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Improve messages for fake requests
Christopher J. Morrone [Mon, 27 Feb 2012 00:19:21 +0000 (16:19 -0800)]
LU-1095 debug: Improve messages for fake requests

Update the console filter to correctly handle fake requests and
squelch the lov_update_create_set() message for the
-ETIMEDOUT/-ENOTCONN case.

 LustreError: 7872:0:(lov_request.c:693:lov_update_create_set()) error
 creating fid 0x104c5e0b sub-object on OST idx 53/2: rc = -107

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I5f37f585566b053d515665fcddbcc8a3e653d89a
Reviewed-on: http://review.whamcloud.com/2203
Tested-by: Hudson
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Standardize, suppress mount/umount messages
Christopher J. Morrone [Mon, 27 Feb 2012 00:06:29 +0000 (16:06 -0800)]
LU-1095 debug: Standardize, suppress mount/umount messages

Standardize mount/umount console message to include profile name,
and optionally suppress them with the 'quiet' mount option.  We
have been using private namespaces for testing and mounting then
umounting the FS as needed for each job.  In this context these
messages end up causing a lot of syslog noise.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I7514f6016c337a358e5e31146644810dff292d02
Reviewed-on: http://review.whamcloud.com/2199
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 mgs: remove message from console
Christopher J. Morrone [Fri, 10 Feb 2012 23:24:06 +0000 (15:24 -0800)]
LU-1095 mgs: remove message from console

There is no good reason for a sysadmin to see this message
on the console.  Most of the time this will be a fluke
due to the vagaries of lnet networks (server decides
client is disconnected, but client doesn't know that yet,
messages arriving out of order, etc.).

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I0c18734f82a9c89a5e940ce4e2c602614e89ce26
Reviewed-on: http://review.whamcloud.com/2133
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-969 debug: reduce stack usage
Hongchao Zhang [Mon, 12 Mar 2012 08:11:47 +0000 (16:11 +0800)]
LU-969 debug: reduce stack usage

1, libcfs_debug_vmsg2 to accept libcfs_debug_msg_data structure
   to replace SUBSYSTEM, __FILE__, __FUNCTION__, __LINE__ and
   cdls on the stack

2, CDEBUG, DEBUG_CAPA use static libcfs_debug_msg_data

3, remove the local variable in RETURN/GOTO/__CHECK_STACK

4, reduce stack in recovery thread by moving lu_env,
   ptlrpc_thread to heap.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I75fe53027f56e27255b5f558e8fd57c7db833648
Reviewed-on: http://review.whamcloud.com/2668
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1361 build: enable kabi on rhel6
Minh Diep [Thu, 3 May 2012 23:00:46 +0000 (16:00 -0700)]
LU-1361 build: enable kabi on rhel6

Turn on USE_KABI=true to build with kabi on rhel6

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ie028ced17baf5a4540c59b8b63fb279a146718a6
Reviewed-on: http://review.whamcloud.com/2642
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1312 kernel: crash at boot time in isci driver
yangsheng [Wed, 2 May 2012 13:29:01 +0000 (21:29 +0800)]
LU-1312 kernel: crash at boot time in isci driver

Restore SG_ALL to its default value to avoid a crash in the isci driver.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I9c0e358d5cbc41af2c4c9549e837bc54f50820ad
Reviewed-on: http://review.whamcloud.com/2626
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-577 tests: FAIL replay-single test_70b rundbench load
James Simmons [Wed, 18 Apr 2012 14:13:54 +0000 (10:13 -0400)]
LU-577 tests: FAIL replay-single test_70b rundbench load

Test 70b for replay-single assumes that lustre is mounted on
/mnt/lustre, which is not the case for us. This patch passes
the proper MOUNT. The test also was not using the standard
DIR/tdir setup, which left generated data files not being
cleaned up. Increased the sleep period to match dbench's
warm up period. This gives dbench a chance to start up when
using many clients. Set the pdsh FANOUT environment variable
because by default pdsh launches in blocks of 32 nodes. This
way pdsh will launch all node jobs at the same time.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I5fd4160fc684c19990caf60b51ef62d18ff98249
Reviewed-on: http://review.whamcloud.com/2538
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1014 mountconf: MGS should process parameter config
Lai Siyao [Thu, 23 Feb 2012 08:23:25 +0000 (16:23 +0800)]
LU-1014 mountconf: MGS should process parameter config

MGS doesn't have an llog config of its own, but it should process the
<profile>-params config, which holds global parameters for the whole
system.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I7c3a236fa0c24581494ba0e2a3ab40271a2e8c8f
Reviewed-on: http://review.whamcloud.com/2667
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1247 obdfilter: fix invalid check of precreate objects
Alexander.Boyko [Wed, 21 Mar 2012 17:47:53 +0000 (21:47 +0400)]
LU-1247 obdfilter: fix invalid check of precreate objects

The MDT precreates objects when its object count is less than
oscc->oscc_grow_count / 2. oscc->oscc_grow_count can be equal
to OST_MAX_PRECREATE, so on the MDT (last_id - next_id) is less than
(OST_MAX_PRECREATE * 3 / 2). This patch fixes the wrong condition in
filter_handle_precreate() when a delete orphans request happens.
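
The bound can be checked with a little arithmetic. In the sketch below
the value 20000 for OST_MAX_PRECREATE is an assumption chosen for
illustration, and the variable names are hypothetical.

```shell
OST_MAX_PRECREATE=20000                 # assumed value, for illustration
grow_count=$OST_MAX_PRECREATE           # worst case named in the text

# Largest object count that still triggers a precreate (< grow_count / 2).
remaining_max=$((grow_count / 2 - 1))

# After precreating grow_count more objects, this many are outstanding.
after=$((remaining_max + grow_count))
bound=$((OST_MAX_PRECREATE * 3 / 2))

echo "$after < $bound"                  # prints 29999 < 30000
```

So a valid (last_id - next_id) difference can exceed OST_MAX_PRECREATE
while still staying below OST_MAX_PRECREATE * 3 / 2.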

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Xyratex-bug-id: MRP-440

Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1e555c5480709d2acd4c3810a464b70767a6549f
Reviewed-on: http://review.whamcloud.com/2666
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <alexander_boyko@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-989 ldlm: Fix client's import destruction
Andriy Skulysh [Fri, 13 Jan 2012 14:08:57 +0000 (16:08 +0200)]
LU-989 ldlm: Fix client's import destruction

Move the client's import destruction from the disconnect phase to the
cleanup phase. The patch allows connect to be used after disconnect.

Xyratex-bug-id: MRP-288
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I0a63f66205ac5931ead0acea492f3e480669e237
Reviewed-on: http://review.whamcloud.com/2664
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1166 recovery: don't leak a connected client counter.
Alexey Lyashkov [Mon, 5 Mar 2012 16:17:19 +0000 (20:17 +0400)]
LU-1166 recovery: don't leak a connected client counter.

A race between target_handle_connect and client eviction may leak a
connected client counter, and some evicted clients will be counted twice.

Xyratex-bug: MRP-451

Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I13f8168baf904e214605514e4ddfc6f16ab077c9
Reviewed-on: http://review.whamcloud.com/2665
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-663 kernel: Some arch do not have NUMA features anymore
Gregoire Pichon [Wed, 7 Sep 2011 14:55:04 +0000 (16:55 +0200)]
LU-663 kernel: Some arch do not have NUMA features anymore

On some architectures, especially x86_64, cpu_to_node() is no longer
defined as a macro and node_to_cpumask() is no longer exported by the
kernel.

The cpu_to_node() routine is defined either as a macro, as an inline
routine using another exported symbol, or as an exported symbol.
In any case, the kernel has provided this routine since at least
version 2.6.12.

The node_to_cpumask() routine has been replaced by cpumask_of_node()
for x86 architectures since kernel version 2.6.30.

The set_cpus_allowed() routine is not defined if
CONFIG_CPUMASK_OFFSTACK=y since kernel version 2.6.32.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I169a3e1f54816e0a29b265b1d2773f99dbf4eaff
Reviewed-on: http://review.whamcloud.com/2620
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-447 lnet: add lctl --net XXX push
James Simmons [Fri, 30 Mar 2012 12:50:09 +0000 (08:50 -0400)]
LU-447 lnet: add lctl --net XXX push

lctl --net XXX push is used to clear out purgatory connections
arbitrarily. We use this with lctl --net XXX disconnect for regression
testing. This does not nuke the peer, so it shouldn't yield lnd_query
failures like del_peer does.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia7b033750134020022df676f451d91b20e4f5db4
Reviewed-on: http://review.whamcloud.com/2645
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-646 port bz23485 (clarification of lustre fsync behavior)
Lai Siyao [Tue, 30 Aug 2011 03:44:31 +0000 (20:44 -0700)]
LU-646 port bz23485 (clarification of lustre fsync behavior)

Add directory fsync operation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I339e0505da2de7dbe2de7f3d5f513df8332fe956
Reviewed-on: http://review.whamcloud.com/2643
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]
yangsheng [Fri, 4 May 2012 16:14:44 +0000 (00:14 +0800)]
LU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.13.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I927e544c990bebf51c38911962c24cf48e70cba7
Reviewed-on: http://review.whamcloud.com/2652
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()
Yu Jian [Thu, 3 May 2012 11:50:15 +0000 (19:50 +0800)]
LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()

The LASSERTF() in ext3_ext_new_extent_cb() was added for debugging
purposes, to confirm that the race really happened, but it was never
removed from the original patch in
http://review.whamcloud.com/1618 .

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I12482e7092320d7b80190c8a84014708bf67c75e
Reviewed-on: http://review.whamcloud.com/2639
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1319 mdt: increment MDT getattr stats
yangsheng [Wed, 2 May 2012 14:44:42 +0000 (22:44 +0800)]
LU-1319 mdt: increment MDT getattr stats

Move increment of MDT getattr stat from mdt_getattr() to
mdt_getattr_internal() so we don't miss other call paths
that may service getattr requests.
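
The idea can be sketched in a few lines. This is a simplified model
rather than the MDT code: getattr_stat, getattr(), getattr_internal()
and getattr_by_lookup() are hypothetical stand-ins chosen for the
example.

```c
#include <assert.h>

/* Hypothetical counter standing in for the MDT getattr statistic. */
unsigned long getattr_stat;

/* Shared worker: counting here covers every caller. */
int getattr_internal(int object_id)
{
	assert(object_id >= 0);
	getattr_stat++;
	return 0;
}

/* Direct getattr entry point; it no longer counts on its own. */
int getattr(int object_id)
{
	return getattr_internal(object_id);
}

/* Another hypothetical path that also services getattr requests. */
int getattr_by_lookup(int parent_id, int child_id)
{
	(void)parent_id;
	return getattr_internal(child_id);
}
```

Because the increment lives in the shared worker, a request arriving
through either entry point is counted exactly once.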

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I25293fe98567b1250ecc2f9645295c1522345295
Reviewed-on: http://review.whamcloud.com/2637
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1350 debug: lower debug message level
Bobi Jam [Thu, 26 Apr 2012 17:18:44 +0000 (01:18 +0800)]
LU-1350 debug: lower debug message level

A race between file info read and unlink is normal, so lower the debug
message level; otherwise a lot of unnecessary unmasked messages are
generated when mdt_object_find() cannot find the deleted objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: If4ec54fbd341bbdd16dbe0efc779be57e9640220
Reviewed-on: http://review.whamcloud.com/2608
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1017 handle -EAGAIN properly in lu_object_find_try()
Niu Yawei [Tue, 31 Jan 2012 06:06:47 +0000 (22:06 -0800)]
LU-1017 handle -EAGAIN properly in lu_object_find_try()

htable_lookup() can return -EAGAIN for a dying object; we should
handle it properly in lu_object_find_try().
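
A minimal sketch of the retry pattern, assuming a lookup that reports
-EAGAIN while it still sees a dying object. The names fake_table,
fake_lookup() and find_with_retry() are hypothetical and only model the
behaviour; they are not the lu_object code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical table: the first few lookups still see a dying object. */
struct fake_table {
	int dying_lookups_left;	/* lookups that will return -EAGAIN */
	int value;		/* payload returned once lookup succeeds */
};

/* Stand-in for a hash lookup that refuses to hand out a dying object. */
int fake_lookup(struct fake_table *t, int *out)
{
	if (t->dying_lookups_left > 0) {
		t->dying_lookups_left--;
		return -EAGAIN;
	}
	*out = t->value;
	return 0;
}

/* Caller side: treat -EAGAIN as "try again", pass other results through. */
int find_with_retry(struct fake_table *t, int max_tries, int *out)
{
	int rc = -EAGAIN;
	int i;

	assert(t != NULL && out != NULL);
	for (i = 0; i < max_tries && rc == -EAGAIN; i++)
		rc = fake_lookup(t, out);
	return rc;
}
```

The bounded loop is a simplification for the example; a real caller
would typically wait for the dying object to be freed before retrying.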

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1fa53a95f96f5a5c0d12158521d733fbd852b590
Reviewed-on: http://review.whamcloud.com/2629
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1217 osc: to not check a cl_lock's state w/o protection
Jinshan Xiong [Mon, 26 Mar 2012 19:17:17 +0000 (12:17 -0700)]
LU-1217 osc: to not check a cl_lock's state w/o protection

osc_page_putref_lock() used to check the cl_lock's refcount and the
corresponding osc_lock's ols_hold without any protection. This is
racy because another process can change the lock state and make the
assertion false.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I65fe1fa7fc55e8642fea6789784d7bb92a45d56f
Reviewed-on: http://review.whamcloud.com/2604
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-685 obdclass: lu_object reclamation is inefficient
Lai Siyao [Thu, 15 Sep 2011 06:45:13 +0000 (23:45 -0700)]
LU-685 obdclass: lu_object reclamation is inefficient

Put only unreferenced lu_objects on the LRU list to speed up object
reclamation.
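
The approach can be illustrated with a small reference-counted LRU. This
is a sketch under stated assumptions, not the obdclass implementation:
struct lobj, struct lru, obj_get(), obj_put() and lru_reclaim() are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct lobj {
	int refs;
	struct lobj *prev, *next;	/* both NULL while off the LRU */
};

struct lru {
	struct lobj head;		/* sentinel of a circular list */
	int count;
};

void lru_init(struct lru *l)
{
	l->head.refs = 0;
	l->head.prev = l->head.next = &l->head;
	l->count = 0;
}

static void lru_del(struct lru *l, struct lobj *o)
{
	o->prev->next = o->next;
	o->next->prev = o->prev;
	o->prev = o->next = NULL;
	l->count--;
}

/* Take a reference; a previously idle object leaves the LRU. */
void obj_get(struct lru *l, struct lobj *o)
{
	if (o->refs++ == 0 && o->next != NULL)
		lru_del(l, o);
}

/* Drop a reference; the object joins the LRU tail only once idle. */
void obj_put(struct lru *l, struct lobj *o)
{
	assert(o->refs > 0);
	if (--o->refs == 0) {
		o->prev = l->head.prev;
		o->next = &l->head;
		l->head.prev->next = o;
		l->head.prev = o;
		l->count++;
	}
}

/* Reclaim up to nr objects; every list entry is known to be idle. */
int lru_reclaim(struct lru *l, int nr)
{
	int freed = 0;

	while (freed < nr && l->head.next != &l->head) {
		lru_del(l, l->head.next);
		freed++;
	}
	return freed;
}
```

Since busy objects never sit on the list, reclamation does not have to
walk past them to find something it can free.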

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ibde905bfe7ec5ec0b66f31a6070081cf3dc331cd
Reviewed-on: http://review.whamcloud.com/2628
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-1084 ptlrpc: Change CWARNs to CDEBUGs
Christopher J. Morrone [Sat, 11 Feb 2012 01:34:32 +0000 (17:34 -0800)]
LU-1084 ptlrpc: Change CWARNs to CDEBUGs

These messages should not appear on the console.  A sysadmin
will have no idea what to make of most of them.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia8d7e033bcd14d7c8ea5b1b27f849ef81eb9ad4a
Reviewed-on: http://review.whamcloud.com/2621
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-459 quiet too noisy console messages at mount
Andreas Dilger [Mon, 22 Aug 2011 22:56:45 +0000 (16:56 -0600)]
LU-459 quiet too noisy console messages at mount

Quiet a number of extra debug messages printed to the console after a
remount or recovery.  They provide no value and just add to the general
confusion of reading Lustre debug messages.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Id83fa6c5538cf34f3af4503c1e16540a8de6e74e
Reviewed-on: http://review.whamcloud.com/2619
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years ago LU-814 test: automate NFS over lustre testing
Minh Diep [Thu, 5 Jan 2012 16:55:48 +0000 (08:55 -0800)]
LU-814 test: automate NFS over lustre testing

Provide NFS setup within the auster framework.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Icfd61bf6772807a344576b92b5268a83a7b79e4b
Reviewed-on: http://review.whamcloud.com/1664
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>