Whamcloud - gitweb
fs/lustre-release.git
12 years agoLU-1205 tests: cleanup code style in mmap_sanity.c
Andreas Dilger [Mon, 12 Mar 2012 20:43:45 +0000 (14:43 -0600)]
LU-1205 tests: cleanup code style in mmap_sanity.c

Cleanup numerous code style issues in the mmap_sanity.c test:
- whitespace at end of line
- spaces around operators
- indentation
- line wrapping at 80 columns

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: If47eeeb1dec2705b9aa4e70cba3c1bc9241546a7
Reviewed-on: http://review.whamcloud.com/2722
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1205 tests: add timestamps to sanityn 18 mmap
Andreas Dilger [Mon, 12 Mar 2012 20:23:07 +0000 (14:23 -0600)]
LU-1205 tests: add timestamps to sanityn 18 mmap

The sanityn.sh test_18 mmap_sanity.c test sometimes takes over
an hour to run, and sometimes only seconds.  Add timestamps to
the subtest results so that it is possible to debug where that
time is being spent.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I641566c9a0b204095ad0c2e3bee852a0e8fd6881
Reviewed-on: http://review.whamcloud.com/2721
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-980 llog: cleanup return value in llog_client_create
Hongchao Zhang [Thu, 12 Jan 2012 13:29:00 +0000 (21:29 +0800)]
LU-980 llog: cleanup return value in llog_client_create

in llog_client_create, the newly allocated llog_handle is
return by parameter res, but it doesn't be cleaned up
if the following operations failed and the corresponding
llog_handle is already freed.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib8c40c53b071fff7de3550a39f009915cb8511a7
Reviewed-on: http://review.whamcloud.com/2806
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1102 crypto: correctly check crypto_alloc_blkcipher returns
Bobi Jam [Wed, 9 May 2012 19:22:58 +0000 (03:22 +0800)]
LU-1102 crypto: correctly check crypto_alloc_blkcipher returns

ll_crypto_alloc_blkcipher() returns error value as well as possible
NULL pointer, should check its return value carefully.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I181b236406e2649580a04940886f849ad6071078
Reviewed-on: http://review.whamcloud.com/2703
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]
James Simmons [Wed, 9 May 2012 15:02:39 +0000 (11:02 -0400)]
LU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]

Update RHEL5.8 kernel to 2.6.18-308.4.1.el5.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I1025558c97b1d6887d52020857a997cdc495d865
Reviewed-on: http://review.whamcloud.com/2684
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1345 tests: sanity test 215 non integer handling fix
James Simmons [Fri, 4 May 2012 11:40:49 +0000 (07:40 -0400)]
LU-1345 tests: sanity test 215 non integer handling fix

Sanity test 215 test the format of various /proc/sys/lnet/* files.
Some of those files are integer values but their can be times when
no valid number is available so a NA is reported. This patch
handles those cases.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If03a57e378e98ddd689c0e555fc8c9dc87d39138
Reviewed-on: http://review.whamcloud.com/2603
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-554 lnet: add gnilnd awareness to LNet
James Simmons [Wed, 9 May 2012 14:33:24 +0000 (10:33 -0400)]
LU-554 lnet: add gnilnd awareness to LNet

This allows servers on any network to talk to gnilnd routers.
This is 2.1 version of the Oracle 23884 attachment 31892.

Change-Id: I96777551b0caa50021ebb32755caaa01623ea97d
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/2449
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1308 Additional multihomed nid config fix
Oleg Drokin [Wed, 25 Apr 2012 19:28:22 +0000 (15:28 -0400)]
LU-1308 Additional multihomed nid config fix

Need to put the new nid addition at the last slot available,
not next after the last.

Change-Id: Icf9d898fba4c6e9c05f085b855a33282ea0d4b47
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2599
Reviewed-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-1308 Properly add multihomed nids to peer table
Oleg Drokin [Tue, 17 Apr 2012 06:31:10 +0000 (02:31 -0400)]
LU-1308 Properly add multihomed nids to peer table

class_add_uuid had a copy&paste error where it was checking against
wrong entry for nid tables and as such had trouble finding multihomed
nid configurations.

Change-Id: I2d73bdde9cf7b0bf882b14b473b4491873e64c25
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2561
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
12 years agoLU-630 lnet: only router checks peer health
Lai Siyao [Mon, 5 Dec 2011 07:28:39 +0000 (15:28 +0800)]
LU-630 lnet: only router checks peer health

The peer health code is designed for router, so a ~rtr node always
assumes peers to be alive.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib794feace322112988a5b727ed40fb38f8f57370
Reviewed-on: http://review.whamcloud.com/2646
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Send common recovery messages to D_HA
Christopher J. Morrone [Sun, 26 Feb 2012 23:05:14 +0000 (15:05 -0800)]
LU-1095 debug: Send common recovery messages to D_HA

These messages are always present at recovery time, and are not
understable by a sysadmin.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I907b0ac49541b20699914dc4f8c5e0db3fb6bec9
Reviewed-on: http://review.whamcloud.com/2198
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Improve recovery console messages
Christopher J. Morrone [Sat, 3 Mar 2012 01:41:45 +0000 (17:41 -0800)]
LU-1095 debug: Improve recovery console messages

Quiet and/or improve a few recovery messages.

A sysadmin will not understand this:

  2012-03-02 16:27:19 Lustre: 5211:0:(ldlm_lib.c:2072:
  target_queue_recovery_request()) Next recovery transno: 410629539,
  current: 410629539, replaying

Messages like this are too verbose for the console:

  2012-03-02 16:27:59 LustreError: 5286:0:
  (genops.c:1270:class_disconnect_stale_exports())
  lc3-OST0004: disconnect stale client
  47808f4f-9f36-e8eb-f363-14b1abe4ac57@<unknown>

and can be left to this simpler message:

  2012-03-02 16:27:59 Lustre: lc3-OST0005: disconnecting 0 stale
  clients

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I457602c3440ba10475e4ddca7c4e58ef8669922c
Reviewed-on: http://review.whamcloud.com/2249
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Liu Xuezhao <xuezhao.liu@emc.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: CWARN to CDEBUG for mds_notify() event
Brian Behlendorf [Fri, 19 Feb 2010 19:53:55 +0000 (11:53 -0800)]
LU-1095 debug: CWARN to CDEBUG for mds_notify() event

Both of these warnings represent correct behavior the administrator
does not need to know about, or more importantly do anything about.
As such I am moving both of these warnings to CDEBUG(D_CONFIG).

  Lustre: 8099:0:(mds_lov.c:1167:mds_notify()) MDS lc1-MDT0000:
  add target lc1-OST0023_UUID

  Lustre: lc1-MDT0000: in recovery, not resetting orphans on
  lc1-OST0007_UUID

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I66a98d87e3d5de7205420c74db4f6d9bcaaf31a7
Reviewed-on: http://review.whamcloud.com/2202
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Improve messages for fake requests
Christopher J. Morrone [Mon, 27 Feb 2012 00:19:21 +0000 (16:19 -0800)]
LU-1095 debug: Improve messages for fake requests

Update the console filter to correctly handle fake requests and
squelched the lov_update_create_set() message for the
-ETIMEDOUT/-ENOTCONN case.

 LustreError: 7872:0:(lov_request.c:693:lov_update_create_set()) error
 creating fid 0x104c5e0b sub-object on OST idx 53/2: rc = -107

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I5f37f585566b053d515665fcddbcc8a3e653d89a
Reviewed-on: http://review.whamcloud.com/2203
Tested-by: Hudson
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 debug: Standardize, suppress mount/umount messages
Christopher J. Morrone [Mon, 27 Feb 2012 00:06:29 +0000 (16:06 -0800)]
LU-1095 debug: Standardize, suppress mount/umount messages

Standardize mount/umount console message to include profile name,
and optionally suppress them with the 'quiet' mount option.  We
have been using private namespaces for testing and mounting then
umounting the FS as needed for each job.  In this context these
messages end up causing alot of syslog noise.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I7514f6016c337a358e5e31146644810dff292d02
Reviewed-on: http://review.whamcloud.com/2199
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1095 mgs: remove message from console
Christopher J. Morrone [Fri, 10 Feb 2012 23:24:06 +0000 (15:24 -0800)]
LU-1095 mgs: remove message from console

There is no good reason for a sysadmin to see this message
on the console.  Most of the time this will be a fluke
due to the vagarities of lnet networks (server decides
client is disconnected, but client doesn't know that yet,
messages arriving out of order, etc.).

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I0c18734f82a9c89a5e940ce4e2c602614e89ce26
Reviewed-on: http://review.whamcloud.com/2133
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-969 debug: reduce stack usage
Hongchao Zhang [Mon, 12 Mar 2012 08:11:47 +0000 (16:11 +0800)]
LU-969 debug: reduce stack usage

1, libcfs_debug_vmsg2 to accept libcfs_debug_msg_data struture
   to replace SUBSYSTEM, __FILE__, __FUNCTION__, __LINE__ and
   cdls on the stack

2, CDEBUG, DEBUG_CAPA use static libcfs_debug_msg_data

3, remove the local variable in RETURN/GOTO/__CHECK_STACK

4, reduce stack in recovery thread by moving lu_env,
   ptlrpc_thread to heap.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I75fe53027f56e27255b5f558e8fd57c7db833648
Reviewed-on: http://review.whamcloud.com/2668
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1361 build: enable kabi on rhel6
Minh Diep [Thu, 3 May 2012 23:00:46 +0000 (16:00 -0700)]
LU-1361 build: enable kabi on rhel6

Turn on USE_KABI=true to build with kabi on rhel6

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ie028ced17baf5a4540c59b8b63fb279a146718a6
Reviewed-on: http://review.whamcloud.com/2642
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1312 kernel: crash at boot time in isci driver
yangsheng [Wed, 2 May 2012 13:29:01 +0000 (21:29 +0800)]
LU-1312 kernel: crash at boot time in isci driver

Restore SG_ALL to default value to avoid crash isci.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I9c0e358d5cbc41af2c4c9549e837bc54f50820ad
Reviewed-on: http://review.whamcloud.com/2626
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-577 tests: FAIL replay-single test_70b rundbench load
James Simmons [Wed, 18 Apr 2012 14:13:54 +0000 (10:13 -0400)]
LU-577 tests: FAIL replay-single test_70b rundbench load

Test 70b for replay-single assumes that lustre is mounted on
/mnt/lustre which is not the case for us. This patch passes
the proper MOUNT. The test also was not using the standard
DIR/tdir setup which had generated data files not being
cleaned up. Increased the sleep period to match dbench's
warm up period. This gives dbench a change to start up when
using many clients. Set the pdsh FANOUT environment variable
because by default pdsh launches in blocks of 32 nodes. This
way pdsh will lauch all node jobs at the same time

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I5fd4160fc684c19990caf60b51ef62d18ff98249
Reviewed-on: http://review.whamcloud.com/2538
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1014 mountconf: MGS should process parameter config
Lai Siyao [Thu, 23 Feb 2012 08:23:25 +0000 (16:23 +0800)]
LU-1014 mountconf: MGS should process parameter config

MGS doesn't have llog config of its own, but it should process
<profile>-params config which is global parameters for the whole
system.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I7c3a236fa0c24581494ba0e2a3ab40271a2e8c8f
Reviewed-on: http://review.whamcloud.com/2667
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1247 obdfilter: fix invalid check of precrate objects
Alexander.Boyko [Wed, 21 Mar 2012 17:47:53 +0000 (21:47 +0400)]
LU-1247 obdfilter: fix invalid check of precrate objects

MDT precreate objects when it has objects count less than the
oscc->oscc_grow_count / 2. oscc->oscc_grow_count can be equal
to OST_MAX_PRECREATE, so MDT (last_id - next_id) is less than the
(OST_MAX_PRECREAT * 3 / 2). This patch fix the wrong condition at
filter_handle_precreate() when delete orphans request happend.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Xyratex-bug-id: MRP-440

Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1e555c5480709d2acd4c3810a464b70767a6549f
Reviewed-on: http://review.whamcloud.com/2666
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <alexander_boyko@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-989 ldlm: Fix client's import destruction
Andriy Skulysh [Fri, 13 Jan 2012 14:08:57 +0000 (16:08 +0200)]
LU-989 ldlm: Fix client's import destruction

Move client's import destruction from disconnect to cleanup phase
The patch allows to use connect after disconnect.

Xyratex-bug-id: MRP-288
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I0a63f66205ac5931ead0acea492f3e480669e237
Reviewed-on: http://review.whamcloud.com/2664
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1166 recovery: don't leak a connected client counter.
Alexey Lyashkov [Mon, 5 Mar 2012 16:17:19 +0000 (20:17 +0400)]
LU-1166 recovery: don't leak a connected client counter.

target_handle_connect vs client eviction race may leak a
connected client counter and some evicted clients will counted twice.

Xyratex-bug: MRP-451

Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I13f8168baf904e214605514e4ddfc6f16ab077c9
Reviewed-on: http://review.whamcloud.com/2665
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-663 kernel: Some arch do not have NUMA features anymore
Gregoire Pichon [Wed, 7 Sep 2011 14:55:04 +0000 (16:55 +0200)]
LU-663 kernel: Some arch do not have NUMA features anymore

Some architectures, especially x86_64, do not have cpu_to_node()
defined as a macro, and node_to_cpumask() exported by the kernel
anymore.

The cpu_to_node() routine is defined either as a macro, as an inline
routine using another exported symbol, or as an exported symbol.
Anyway, the kernel defines this service since at least version
2.6.12.

The node_to_cpumask() routine has been replaced by cpumask_of_node()
for x86 architectures since kernel version 2.6.30.

The set_cpus_allowed() routine is not defined if
CONFIG_CPUMASK_OFFSTACK=y since kernel version 2.6.32.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I169a3e1f54816e0a29b265b1d2773f99dbf4eaff
Reviewed-on: http://review.whamcloud.com/2620
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-447 lnet: add lctl --net XXX push
James Simmons [Fri, 30 Mar 2012 12:50:09 +0000 (08:50 -0400)]
LU-447 lnet: add lctl --net XXX push

Lctl --net XXX push is used to clear out purgatory conns arbitrarily.
We use this with lctl --net XXX disconnect for regression testing.
This does not nuke the peer, so it shouldn't yield lnd_query failures
like del_peer does.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia7b033750134020022df676f451d91b20e4f5db4
Reviewed-on: http://review.whamcloud.com/2645
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-646 port bz23485 (clarification of lustre fsync behavior)
Lai Siyao [Tue, 30 Aug 2011 03:44:31 +0000 (20:44 -0700)]
LU-646 port bz23485 (clarification of lustre fsync behavior)

Add directory fsync operation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I339e0505da2de7dbe2de7f3d5f513df8332fe956
Reviewed-on: http://review.whamcloud.com/2643
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]
yangsheng [Fri, 4 May 2012 16:14:44 +0000 (00:14 +0800)]
LU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.13.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I927e544c990bebf51c38911962c24cf48e70cba7
Reviewed-on: http://review.whamcloud.com/2652
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()
Yu Jian [Thu, 3 May 2012 11:50:15 +0000 (19:50 +0800)]
LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()

The LASSERTF() in ext3_ext_new_extent_cb() was injected for
debugging purpose to make sure the race really happened but
was forgotten to be removed from the original patch in
http://review.whamcloud.com/1618 .

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I12482e7092320d7b80190c8a84014708bf67c75e
Reviewed-on: http://review.whamcloud.com/2639
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1319 mdt: increment MDT getattr stats
yangsheng [Wed, 2 May 2012 14:44:42 +0000 (22:44 +0800)]
LU-1319 mdt: increment MDT getattr stats

Move increment of MDT getattr stat from mdt_getattr() to
mdt_getattr_internal() so we don't miss other call paths
that may service getattr requests.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I25293fe98567b1250ecc2f9645295c1522345295
Reviewed-on: http://review.whamcloud.com/2637
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1350 debug: lower debug message level
Bobi Jam [Thu, 26 Apr 2012 17:18:44 +0000 (01:18 +0800)]
LU-1350 debug: lower debug message level

File info read and unlink race is normal, we'd lower the debug message
level since a lot of unnecessary unmasked messages will be generated
if mdt_object_find() cannot find those deleted objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: If4ec54fbd341bbdd16dbe0efc779be57e9640220
Reviewed-on: http://review.whamcloud.com/2608
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1017 handle -EAGAIN properly in lu_object_find_try()
Niu Yawei [Tue, 31 Jan 2012 06:06:47 +0000 (22:06 -0800)]
LU-1017 handle -EAGAIN properly in lu_object_find_try()

htable_lookup() could return -EAGAIN for dying object, we should
handle it properly in lu_object_find_try().

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1fa53a95f96f5a5c0d12158521d733fbd852b590
Reviewed-on: http://review.whamcloud.com/2629
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1217 osc: to not check a cl_lock's state w/o protection
Jinshan Xiong [Mon, 26 Mar 2012 19:17:17 +0000 (12:17 -0700)]
LU-1217 osc: to not check a cl_lock's state w/o protection

osc_page_putref_lock() used to check cl_lock's refcount and
corresponding osc_lock's ols_hold without any protection, this
is racy because other process can change the lock state so as to
make the assertion be false.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I65fe1fa7fc55e8642fea6789784d7bb92a45d56f
Reviewed-on: http://review.whamcloud.com/2604
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-685 obdclass: lu_object reclamation is inefficient
Lai Siyao [Thu, 15 Sep 2011 06:45:13 +0000 (23:45 -0700)]
LU-685 obdclass: lu_object reclamation is inefficient

Put only non-referenced lu_object in lru list to speed up object
reclamation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ibde905bfe7ec5ec0b66f31a6070081cf3dc331cd
Reviewed-on: http://review.whamcloud.com/2628
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1084 ptlrpc: Change CWARNs to CDEBUGs
Christopher J. Morrone [Sat, 11 Feb 2012 01:34:32 +0000 (17:34 -0800)]
LU-1084 ptlrpc: Change CWARNs to CDEBUGs

These messages should not appear on the console.  A sysadmin
will have no idea what to make of most of them.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia8d7e033bcd14d7c8ea5b1b27f849ef81eb9ad4a
Reviewed-on: http://review.whamcloud.com/2621
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-459 quiet too noisy console messages at mount
Andreas Dilger [Mon, 22 Aug 2011 22:56:45 +0000 (16:56 -0600)]
LU-459 quiet too noisy console messages at mount

Quiet a number of extra debug messages printed to the console after a
remount or recovery.  They provide no value and just add to the general
confusion of reading Lustre debug messages.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Id83fa6c5538cf34f3af4503c1e16540a8de6e74e
Reviewed-on: http://review.whamcloud.com/2619
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-814 test: automate NFS over lustre testing
Minh Diep [Thu, 5 Jan 2012 16:55:48 +0000 (08:55 -0800)]
LU-814 test: automate NFS over lustre testing

Provide setup nfs within auster framework

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Icfd61bf6772807a344576b92b5268a83a7b79e4b
Reviewed-on: http://review.whamcloud.com/1664
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-780 test: improve parallel-scale to support hyperion run
Minh Diep [Fri, 4 Nov 2011 23:12:57 +0000 (16:12 -0700)]
LU-780 test: improve parallel-scale to support hyperion run

We need to add support for srun/slurm, and a few tests
from hyperion-sanity script that has been used for hyperion
testing

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I7f1baa0c99980ad9001436911d23f1030aa7d0fe
Reviewed-on: http://review.whamcloud.com/1615
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1137 ldlm: fix for the flock handling for 1.8 clients
Alexey Lyashkov [Fri, 24 Feb 2012 10:47:37 +0000 (02:47 -0800)]
LU-1137 ldlm: fix for the flock handling for 1.8 clients

The current fix intended to fix the issue with incorrect flock
owner field filling. This issue observed when 1.8 clients
(and with lesser version) doesn't fill the owner field correctly.
With this patch this filling integrated on the 2.x server side.

Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Iurii Golovach <iurii_golovach@xyratex.com>
Xyratex-bug-id: MRP-413
Change-Id: I88ba40eb9cb74d07b90862801669028c5dc94e08
Reviewed-on: http://review.whamcloud.com/2193
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years ago2.1.2-rc0 2.1.2-RC0 v2_1_2_0_RC0
Oleg Drokin [Mon, 23 Apr 2012 18:44:03 +0000 (14:44 -0400)]
2.1.2-rc0

Change-Id: I97a7c0367c6db67282593b2c4e5c246b519e1d8f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1282 lprocfs: Add a module param to disable percpu stats
Bobi Jam [Thu, 12 Apr 2012 00:48:42 +0000 (08:48 +0800)]
LU-1282 lprocfs: Add a module param to disable percpu stats

Add an obdclass module option to choose to use a single lprocfs stats
structure rather than percpu data.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I45d5a05029197e629d4f7d161a5e4e5d01a93bf5
Reviewed-on: http://review.whamcloud.com/2515
Tested-by: Hudson
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-734 tests: add sub-tests into recovery-*-scale tests
Yu Jian [Wed, 11 Apr 2012 08:23:34 +0000 (16:23 +0800)]
LU-734 tests: add sub-tests into recovery-*-scale tests

This patch adds sub-tests into the recovery-*-scale tests
so that test results and logs could be gathered properly
and uploaded to Maloo.

The patch also does some cleanup works on the test scripts
and moves some common functions into test-framework.sh.

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: Ife174285d182ad5a2d4823767ca59df5a10b4aa4
Reviewed-on: http://review.whamcloud.com/2509
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-690 test: wait_osc_import_state() fixes
Yu Jian [Tue, 10 Apr 2012 04:40:44 +0000 (12:40 +0800)]
LU-690 test: wait_osc_import_state() fixes

-- increase maxtime to wait the timeout of 1st request;
   take into account at_min value;
-- cleanup wait_osc_import_state() to use _wait_import_state();

Oracle-bug-id: 24498

Signed-off-by: Elena Gryaznova <elena.gryaznova@oracle.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I0a3525a02e5ce6ca81082d177df0c5d7d68bea26
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@oracle.com>
Reviewed-on: http://review.whamcloud.com/2496
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-1092 ptlrpc: take export refcount during connect
Lai Siyao [Mon, 19 Mar 2012 08:41:54 +0000 (16:41 +0800)]
LU-1092 ptlrpc: take export refcount during connect

In the process of (re)connect,  a refcount of export should be taken,
otherwise disconnect of this export may be called, and it will put
the last refcount of this export and make access to this export
invalid.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Iaf27e842ed516b8968c90bfce396609e39f52c85
Reviewed-on: http://review.whamcloud.com/2345
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1245 lprocfs: use correct cpu number
Bobi Jam [Wed, 18 Apr 2012 01:14:57 +0000 (09:14 +0800)]
LU-1245 lprocfs: use correct cpu number

Take care of correct cpu number in lprocfs_stats_collector().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib7bad581e216daa48bb6dc903b1720b44ddba9c0
Reviewed-on: http://review.whamcloud.com/2579
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1320 llite: fix a race between readpage and releasepage
Jinshan Xiong [Wed, 18 Apr 2012 04:40:24 +0000 (21:40 -0700)]
LU-1320 llite: fix a race between readpage and releasepage

This is a race between page stealing and readpage. If a just read
page is stolen, readpage will find the page is not uptodate, this
makes it panic so -EIO is returned to the reading application.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: Ib16d12d3bc3cc8c0545aa27f0836e4fd89c3a809
Reviewed-on: http://review.whamcloud.com/2564
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1241 kernel: Kernel update [RHEL6.2 2.6.32-220.7.1.el6]
yangsheng [Wed, 28 Mar 2012 06:28:30 +0000 (14:28 +0800)]
LU-1241 kernel: Kernel update [RHEL6.2 2.6.32-220.7.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.7.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I695f74e160b0b836c663c34cf185bcbab7b6c16c
Reviewed-on: http://review.whamcloud.com/2393
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-974 protocol: change OBD_CONNECT_GRANT_PARAM
Andreas Dilger [Tue, 13 Mar 2012 20:53:41 +0000 (14:53 -0600)]
LU-974 protocol: change OBD_CONNECT_GRANT_PARAM

Change the OBD_CONNECT_GRANT_PARAM flag value to avoid conflict
with the OBD_CONNECT_UMASK flag from LU-974.  While that patch is
not yet landed to our release tree, it is in use in production at
some customers.  While the risk of conflict is currently low, it
is easier to change the GRANT_PARAM value since only in use on the
orion branch, and isn't even handled by the client there yet.

Add (hopefully) clear comments for OBD_CONNECT and obd_connect_data
to ensure that they are not modified in some incompatible way across
branches.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I503892c3b595c0272b0941fa58a16a496321cab0
Reviewed-on: http://review.whamcloud.com/2299
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
12 years agoLU-573: conf-sanity test_22 failed with 41
Jinshan Xiong [Sat, 6 Aug 2011 02:32:22 +0000 (19:32 -0700)]
LU-573: conf-sanity test_22 failed with 41

Make sure recovery on OST is finished before trying to create file

Change-Id: I4a36685a5cd9c55de729906bff50c29b1108c931
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1192
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-570: Add function to find connect uuid by nid
Jinshan Xiong [Wed, 10 Aug 2011 19:30:20 +0000 (12:30 -0700)]
LU-570: Add function to find connect uuid by nid

In this patch, two functions are added:
- class_find_uuid(), find conn uuid by peer nid
- client_import_find_conn(), find a conn uuid in import connection list

Also, a code cleanup is performed.

Change-Id: I50e8e9392a39ef78719504cf083c0c22f5d39dcb
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1189
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1206: mdt: Fix error handling in mdt_mfd_open
Oleg Drokin [Thu, 15 Mar 2012 00:56:02 +0000 (20:56 -0400)]
LU-1206: mdt: Fix error handling in mdt_mfd_open

In mdt_mfd_open if the mo_open() call failed or we could not allocate
mfd, we also need to undo write/exec reference count in order to
not mess up with subsequent exec/write accesses.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I3bd98bd68368b48f2afaa7bb450d3a9947c992ac
Reviewed-on: http://review.whamcloud.com/2300
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-879 mds: Add a few rename stats under /proc
wangdi [Fri, 16 Dec 2011 01:19:31 +0000 (17:19 -0800)]
LU-879 mds: Add a few rename stats under /proc

1. Add samedir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of same dir rename.
2. Add crossdir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of cross dir rename.
3. Add /proc/fs/lustre/mds/lustre-MDT0000/rename_stats(YAML format)
to collect stats of rename stats happened on different size
directories.
The size of directories under which files are being removed.
With these patches, it will find out how many renames take place
in the same directory compared to how many renames are between
So during DNE implementation, we can know how rename may be
affected by DNE remote directories and large striped directories.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I4452ce196802c5724607455e0a9b4b372b06f159
Reviewed-on: http://review.whamcloud.com/1878
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-680 lfs: instance <-> mount point mapping from lfs
Richard Henwood [Wed, 2 Nov 2011 23:33:51 +0000 (19:33 -0400)]
LU-680 lfs: instance <-> mount point mapping from lfs

A new option to 'lfs' has been created to return the mapping
between Lustre filesystem instance and paths. The option
is 'getname' and it may be called with or without arguements.

'lfs getname' without arguments returns the instances of all
Lustre mount points.

'lfs getname [path...]' returns the instance of each specified
path. If the path is not a Lustre instance 'No such device' is
returned.

OBD_IOC_GETNAME has been added to file.c to provide consistent
behavior for file as well as directory paths.

A llapi_getname helper function has been added to liblustreapi
that returns a lustre instance name if a path is provided.

Documentation for 'lfs getname' is included inline an the lfs
man page has been updated.

Signed-off-by: Richard Henwood <rhenwood@whamcloud.com>
Signed-off-by: John L. Hammond <jhammond@tacc.utexas.edu>
Change-Id: Iab8ff12d604c7ce853f3c204b455e3b641f659f4
Reviewed-on: http://review.whamcloud.com/1373
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1098 debug: lower debug message level
Bobi Jam [Tue, 21 Feb 2012 01:23:11 +0000 (09:23 +0800)]
LU-1098 debug: lower debug message level

File info read and unlink race is normal, we'd lower the debug message
level since a lot of unnecessary unmasked messages will be generated
if mdt_object_find() cannot find those deleted objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7630e6a1456ffb435c8e67cc626bf38547b840d0
Reviewed-on: http://review.whamcloud.com/2165
Tested-by: Hudson
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-499 grant/cancel_rate are static when OST is idle
Lai Siyao [Fri, 5 Aug 2011 03:43:20 +0000 (20:43 -0700)]
LU-499 grant/cancel_rate are static when OST is idle

ldlm_pool_recalc() shouldn't be skipped if namespace->ns_bref eqauls
zero, instead a flag ns_stopping is added to mark ldlm namespace is
being freed.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Ic6485c34ec3e9868ae531a4dc25aee969c374eb5
Reviewed-on: http://review.whamcloud.com/1185
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-532 mdt: improve xattr ctime warning message
Andreas Dilger [Thu, 28 Jul 2011 22:13:53 +0000 (16:13 -0600)]
LU-532 mdt: improve xattr ctime warning message

Print out which xattr is not getting OBD_MD_FLCTIME set so that it
is possible to track down what code path on the client is failing.

Change-Id: I1918d2e8e0a1e03d8437846e823bca9df6f89b48
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1161
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-985 lprocfs: verify user buffer access
Bobi Jam [Fri, 13 Jan 2012 05:46:07 +0000 (13:46 +0800)]
LU-985 lprocfs: verify user buffer access

In lprocfs_xxx_evict_client(), need verify user's buffer when access
it.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I702e22f8d432edce200c6d91a0af8a1eac792008
Reviewed-on: http://review.whamcloud.com/1961
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-717 ldiskfs: MRP-222 Replace sysname with nodename in MMP
Nikitas Angelinas [Mon, 5 Dec 2011 22:31:12 +0000 (22:31 +0000)]
LU-717 ldiskfs: MRP-222 Replace sysname with nodename in MMP

sysname holds "Linux" by default, i.e. what appears when doing a
"uname -s"; nodename should be used to print the machine's hostname,
i.e. what is returned when doing a "uname -n" or "hostname", and what
gethostname(2)/sethostname(2) manipulate, in order to notify the
administrator of the node which is contending to mount the
filesystem.

Andreas says this was introduced when porting the MMP patches from
RHEL5 to RHEL6, and then also pushed upstream to ext4; a patch for
upstream ext4 has already been submitted.

Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Change-Id: I207bf145d114a9981b5a6add4bbf92ca76f71840
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1419
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-427 test: Test failure on test suite lfsck
yangsheng [Thu, 4 Aug 2011 03:36:23 +0000 (11:36 +0800)]
LU-427 test: Test failure on test suite lfsck

- Reset $MDSDB & $OSTDB in generate_db(). Else they will
  stale if user redefine $SHARED_DIRECTORY.
- Add a function check_shared_dir() to ensure
  $SHARED_DIRECTORY is shared among tests nodes.

Change-Id: Idf2a3d75e46c4cf768419adfea627511c24c495c
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1180
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-81 deadlock of changelog adding vs. changelog cancelling
Niu Yawei [Thu, 18 Aug 2011 04:22:19 +0000 (21:22 -0700)]
LU-81 deadlock of changelog adding vs. changelog cancelling

This is a workaround for the deadlock of changelog adding vs.
changelog cancelling. Changelog adding always start transaction
before acquiring the catlog lock(lgh_lock), whereas, changelog
cancelling do start transaction after holding the catlog lock.

We start transaction earlier to avoid above deadlock.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I9647b9a559f68a27dc0d4b4885857d3cf73b5b8e
Reviewed-on: http://review.whamcloud.com/1260
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-759 mdc: Clear rq_replay on error in mdc_enqueue()
Li Wei [Fri, 30 Sep 2011 08:30:09 +0000 (16:30 +0800)]
LU-759 mdc: Clear rq_replay on error in mdc_enqueue()

When mdc_enter_request() fails (e.g., due to signals) in mdc_enqueue(),
the request is freed without any care about its rq_replay field.  For
rq_replay requests, this results in assertion failures in
__ptlrpc_free_req().  This patch adds a call to mdc_clear_replay_flag()
to make sure __ptlrpc_free_req()'s assumption is respected.

Change-Id: I2185066a9f47b3d9563d9e1a8989754ef2e2dcb4
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1518
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-543 mdd: fix rename changelog
Niu Yawei [Sat, 5 Nov 2011 05:21:12 +0000 (22:21 -0700)]
LU-543 mdd: fix rename changelog

Current rename changelog stores source fid in both CL_RENAME & CL_EXT
records, which is redundant, and the 'tfid' in CL_EXT is never been
used.

Actually, we'd store target fid in the CL_EXT record, then application
could detect the fid unlinked by rename in changelog.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0c616f813657a2faefa60a707f4fc1d9dc971b39
Reviewed-on: http://review.whamcloud.com/1652
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-542 Fix mdt xattr handler logic error
Bobi Jam [Thu, 28 Jul 2011 14:12:38 +0000 (22:12 +0800)]
LU-542 Fix mdt xattr handler logic error

Record system ACL and user xattr change/deletion changelog.

Change-Id: I5aabf1879ec6e812361fe0d1b8255f84d0e817d6
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1158
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-935 quota: break early when b/i_unit_sz exceeded upper limit
Niu Yawei [Mon, 19 Dec 2011 10:18:28 +0000 (02:18 -0800)]
LU-935 quota: break early when b/i_unit_sz exceeded upper limit

While expanding b/i_unit_sz in dquot_create_oqaq(), we'd break the loop
early when the b/i_unit_sz exceeded upper limit, otherwise, qaq_b/iunit_sz
could be overflow and result in endless loop.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0bf069e9259627426d7a87ec42844eaed7a733b4
Reviewed-on: http://review.whamcloud.com/1890
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-931 mdd: store lu_fid instead of pointer in md_capainfo
Hongchao Zhang [Tue, 17 Jan 2012 04:10:25 +0000 (12:10 +0800)]
LU-931 mdd: store lu_fid instead of pointer in md_capainfo

in md_capainfo, mc_fid contains at most 5 pointers to lu_fid,
and if the corresponding lu_fid is freed, the pointer isn't notified
about it, then the pointer will point to freed memory!

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: I00088cbfeb145ceac0477467a8b2436f6cf1e530
Reviewed-on: http://review.whamcloud.com/1979
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1218 proc: Recovery timer in proc always displays 0
yangsheng [Thu, 15 Mar 2012 16:18:29 +0000 (00:18 +0800)]
LU-1218 proc: Recovery timer in proc always displays 0

Calculate remain recovery time for proc display.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I50c14859c704c7e2bc60b66b3d70350648feebb6
Reviewed-on: http://review.whamcloud.com/2334
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-952 quota: follow locking order of quota code
Niu Yawei [Fri, 6 Jan 2012 09:18:35 +0000 (01:18 -0800)]
LU-952 quota: follow locking order of quota code

The locking order of quota code is: i_mutex > dqonoff_sem >
journal_lock > dqptr_sem > dquot->dq_lock > dqio_mutex, so we
should call the ll_vfs_dq_init() after journal started to avoid
deadlock.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia88a2eb8c9dc3827afd4828e0160ee376a1f041e
Reviewed-on: http://review.whamcloud.com/1923
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-805 quota: lfs quota doesn't print grace time correctly
Niu Yawei [Tue, 8 Nov 2011 13:07:05 +0000 (05:07 -0800)]
LU-805 quota: lfs quota doesn't print grace time correctly

Lustre always trigger grace time when the allocated qunit exceeding
softlimit, however, user tools 'lfs quota' only print grace time
when the total usage greater than softlimit, so sometimes user can't
tell if the softlimit is already exceeded from 'lfs quota' output.

This patch changes the 'lfs quota' to use the data get from kernel
instead of comparing usage with softlimit.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia564c803ca33b2cf925759b6a6e4e4df2692f28d
Reviewed-on: http://review.whamcloud.com/1674
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-78 o2iblnd: kiblnd_check_conns can deadlock
Liang Zhen [Tue, 21 Feb 2012 04:40:25 +0000 (12:40 +0800)]
LU-78 o2iblnd: kiblnd_check_conns can deadlock

kiblnd_check_conns() called kiblnd_check_sends() with hold of global
rwlock, it's wrong because kiblnd_check_sends() could do many things:
 - call lnet_finalize() which is not safe with hold of spinlock
 - call kiblnd_close_conn() which requires to write_lock the same
   global lock
 - kiblnd_check_sends() might need to allocate NOOP message

It can be fixed by moving call of kiblnd_check_sends out from spinlock
This patch is from the fix of Bug 20288, with some small changes.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Icc9fedc70ecb25b0c41ebaf6d80c971f8281c9c6
Reviewed-on: http://review.whamcloud.com/2166
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-651 osc: suppress message in can_merge_pages()
Bobi Jam [Sun, 2 Oct 2011 08:08:56 +0000 (16:08 +0800)]
LU-651 osc: suppress message in can_merge_pages()

Thottle messages if adjacent brw pages are not mergeable with
different OBD_BRW_NOQUOTA flags.

Change-Id: I22ce6f8807e2541d3e6b3c9631f60faa36baa81a
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1328
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1212 ptlrpc: ptlrpc_grow_req_bufs is racy
Liang Zhen [Wed, 14 Mar 2012 04:41:08 +0000 (12:41 +0800)]
LU-1212 ptlrpc: ptlrpc_grow_req_bufs is racy

multiple ptlrpc service threads can enter ptlrpc_grow_req_bufs()
the same time if they found "low_water" in ptlrpc_check_rqbd_pool(),
each of these threads will allocate ptlrpc_service::srv_nbuf_per_group
request buffers and could consume all memory.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I83d6fe53a0f86691ae7e2afb3d75fb8677f58688
Reviewed-on: http://review.whamcloud.com/2308
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
12 years agoLU-960 utils: bad stripe count report, and validate stripe size
Minh Diep [Tue, 7 Feb 2012 17:12:58 +0000 (09:12 -0800)]
LU-960 utils: bad stripe count report, and validate stripe size

Need to use %d to print -1 instead of %u
Need to check for -1 in input for stripe size

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ic4ee84a45bdb5dc934a3e681a4fc2fcd51f14b99
Reviewed-on: http://review.whamcloud.com/2112
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-106 procfs: many proc entries are not accessed safely
Lai Siyao [Mon, 13 Jun 2011 03:56:03 +0000 (20:56 -0700)]
LU-106 procfs: many proc entries are not accessed safely

Some in memory data may be released/uninitialized at the time
of proc entry creation/removal, this patch includes the following
fixes:
* initialize data before proc entry creation
* free data after proc entry removal
* free proc entries in obd_precleanup() because
  obd_uuid/nid/nid_stats_hash are released in class_cleanup().
* free proc entries after obd_zombie_barrier() because obd_export
  hold one refcound of nid_stat.
* check osd->od_mount before accessing osd proc entries because the
  osd proc entries are created before mount.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: I03cb977e1be0747032a70f6a39fec804f81d70cc
Reviewed-on: http://review.whamcloud.com/326
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1109 llite: do splice read stripe by stripe
Jinshan Xiong [Thu, 23 Feb 2012 19:57:54 +0000 (11:57 -0800)]
LU-1109 llite: do splice read stripe by stripe

If nfsd is reading an across stripe buffer, and if the first stripe
happens to be 64KB(PIPE_BUFFERS*PAGE_SIZE), then first read will
occupy all pipe buffers and this makes nfsd stuck if it reads the
next stripe immediately.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I13cb54b37f738ee3c081dff1929630ea523b77fd
Reviewed-on: http://review.whamcloud.com/2182
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1050 o2iblnd: fix checking order of rdma_create_id() argument
Shuichi Ihara [Mon, 13 Feb 2012 16:47:38 +0000 (01:47 +0900)]
LU-1050 o2iblnd: fix checking order of rdma_create_id() argument

Replace rdma_create_id() with rdma_destroy_id() in
openib gen2 test and four argument check moves to
the back of openib test.

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: I0782183f15f58647291518a4222610601083c369
Reviewed-on: http://review.whamcloud.com/2097
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-1125 recovery: initial recovery thread's watchdog
Bobi Jam [Wed, 22 Feb 2012 06:30:41 +0000 (14:30 +0800)]
LU-1125 recovery: initial recovery thread's watchdog

Recovery thread does not have watchdog attached, correctly initialize
it.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I6993c39bbf18f47e9ccd965a5d2ba1919cfb7736
Reviewed-on: http://review.whamcloud.com/2174
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1128 ldlm: return -1 for server pool shrinker
Niu Yawei [Fri, 24 Feb 2012 05:21:51 +0000 (21:21 -0800)]
LU-1128 ldlm: return -1 for server pool shrinker

For ldlm server pool shrinker, we just use it to decrease SLV,
but never reclaim any memory directly, so it should always return
-1 to inform the kernel to break the shrink loop.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I17f51ac84eb0b8c70b2cee9ac7eeca34647c1990
Reviewed-on: http://review.whamcloud.com/2184
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-882 quota: Quota code compares unsigned < 0
Niu Yawei [Mon, 19 Dec 2011 10:01:36 +0000 (02:01 -0800)]
LU-882 quota: Quota code compares unsigned < 0

Port from b23858.

In check_cur_qunit(), it checks "if (limit + record < 0)", however,
the limit is unsigned, so this check will be always false, and when
limit is smaller than -record, following "limit += record" will make
limit a unreasonable large value.

This patch also fixed a similar defect in dqacq_handler().

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@oracle.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Iea02143dae5542f1a9f9cc823a684a18031b8a03
Reviewed-on: http://review.whamcloud.com/1889
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-617 recovery: setattr from open breaks recovery
Niu Yawei [Fri, 26 Aug 2011 02:58:36 +0000 (19:58 -0700)]
LU-617 recovery: setattr from open breaks recovery

The setattr from open(open(O_TRUNC)) is now serialized with
'cl_setattr_lock' on client and goes to a dedicate portal, which is
different with other reint operations, consequently, setattr RPC
can be parallel with other reint RPCs, and that result in the race of
updating last_transno/last_xid on server.

This patch removed the 'cl_setattr_lock' stuff to make all the reint
operations serialized by 'cl_rpc_lock', and the code on server side
which assumes client is holding DLM lock when setattr from open is also
removed, since it's not true.

The MDS_SETATTR_PORTAL service is preserved to keep the compatibility
with old client, and the MDS_SETATTR_FROM_OPEN is also preserved, since
we are using this flag to check write access for open(O_TRUNC), and
it probably can be used for some optimization purpose in future.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I45f83f8f05022ff0d31f8e7784381821c835785d
Reviewed-on: http://review.whamcloud.com/1654
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
12 years agoLU-1198 idl: move FID VER to DLM resource name[1]
Andreas Dilger [Thu, 8 Mar 2012 07:29:09 +0000 (15:29 +0800)]
LU-1198 idl: move FID VER to DLM resource name[1]

Until Lustre 1.8.7/2.1.1 the FID version was packed into name[2].

However, this leaves very little room in the LDLM resource name
for other uses.  The upcoming quota code needs to store another
FID into the LDLM resource to allow directory tree quotas, and
managed by the DLM.

The 32-bit VER, which is currently always 0, is moved into the high
bits of name[1] along with the 32-bit OID, to avoid consuming the
name[2] field.  Since future use of the FID version (including
snapshots, pools, etc) will need changes on the client side anyway,
there will never be non-zero VER on an existing client.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1e500cfb277dfc25bc056bb0c5763e48e7dcab0
Reviewed-on: http://review.whamcloud.com/2288
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-620 llite: add delete/remove_from_page_cache check
Bobi Jam [Wed, 21 Sep 2011 10:17:13 +0000 (18:17 +0800)]
LU-620 llite: add delete/remove_from_page_cache check

Later 2.6.32 kernel use memory cgroup feature but does not export
truncate_complete_page but export delete_from_page_cache or
remove_from_page_cache, we need properly use them for pachless client
code.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I33e3e7c32b548866ee77753ef8a8193c814d0ecb
Reviewed-on: http://review.whamcloud.com/2230
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1049 script: lc_net doesn't parse output correctly
Minh Diep [Mon, 6 Feb 2012 23:37:10 +0000 (15:37 -0800)]
LU-1049 script: lc_net doesn't parse output correctly

Port the second fix from bz=23234
The output of ping contains two ip addresses. The regex
matching picks up both addresses. The fix is to add
-m 1 to stop at the first match
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Iaefe42918b4587f88a8a0cb39cf9afb2a82021ba
Reviewed-on: http://review.whamcloud.com/2105
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: hongchao.zhang <hongchao.zhang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-874 ldlm: Fix ldlm_bl_* thread creation
Jinshan Xiong [Fri, 3 Feb 2012 19:12:46 +0000 (11:12 -0800)]
LU-874 ldlm: Fix ldlm_bl_* thread creation

Always create a new ldlm_bl_ thread when all threads
are busy, not just after returning from sleep.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: Ife23dd09694e26d11d49572bc8bb0a2c0b2d3eee
Reviewed-on: http://review.whamcloud.com/2088
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 osc: prioritize writeback pages
Jinshan Xiong [Fri, 3 Feb 2012 19:11:37 +0000 (11:11 -0800)]
LU-874 osc: prioritize writeback pages

When a lock is being canceled, we should prioritize those covering
pages which have already been submitted by page writeback daemon;
otherwise, this client may be evicted because there is no active IO
for that lock for a long time.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I2f914df0204375a51f4a7565a75640e9bb3c6d19
Reviewed-on: http://review.whamcloud.com/2087
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 ldlm: prioritize LDLM_CANCEL requests
Jinshan Xiong [Fri, 3 Feb 2012 19:10:55 +0000 (11:10 -0800)]
LU-874 ldlm: prioritize LDLM_CANCEL requests

If a lock canceling request has already reached server, there is no
reason of evicting the client when the waiting list times out.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: If87b3b321feb97f5755840a0901e747afd8bb32c
Reviewed-on: http://review.whamcloud.com/2086
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 ptlrpc: handle in-flight hqreq correctly
Jinshan Xiong [Fri, 3 Feb 2012 19:15:02 +0000 (11:15 -0800)]
LU-874 ptlrpc: handle in-flight hqreq correctly

If there are in-flight requests pending, we shouldn't timeout the
covering dlm locks; neither should we remove the requests from export
exp_hp_rpcs list until the requests are handled.

In this patch, the following things are improved:
1. leave IO rpcs in export's hp list until they are handled;
2. using interval tree to find rpc overlapped locks;
3. refresh the lock again after IO rpcs are finished to leave a time
   window for clients to cancel covering dlm locks;
4. rework repbody in ost_handler.c so as to not modify original obdo
5. cleanup the code.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I22873187aa6fdb28e95173dc0be9f992a54bbb9e
Reviewed-on: http://review.whamcloud.com/2085
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-983 llite: align readahead to 1M after ra_max adjustment
wangdi [Mon, 5 Mar 2012 19:44:34 +0000 (11:44 -0800)]
LU-983 llite: align readahead to 1M after ra_max adjustment

1. Align the readahead pages only if ria_start != 0, otherwise
   the readahead pages will be cut to zero.
2. Add test_101e to verify small reads for small size files.
   Add test_101f to verify max_read_ahead_whole_mb.
3. Port 101c from b1_8

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I8b31621f5b48c3bc98f8d888f91d65b945c51999
Reviewed-on: http://review.whamcloud.com/2260
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1029 build: build ofed 1.5.4 with qlgc_vnic failed
Minh Diep [Tue, 7 Feb 2012 16:40:33 +0000 (08:40 -0800)]
LU-1029 build: build ofed 1.5.4 with qlgc_vnic failed

Ofed failed to include proper header files to support
qlgc_vnic. Since we are not using ib on vnic in the lab,
we'll remove it

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I016fa4cc764fe163449229953631a76ef3b49b14
Reviewed-on: http://review.whamcloud.com/2111
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Michael MacDonald <mjmac@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1120 kernel: kernel update [RHEL6.2 2.6.32-220.4.2]
yangsheng [Mon, 27 Feb 2012 17:42:49 +0000 (01:42 +0800)]
LU-1120 kernel: kernel update [RHEL6.2 2.6.32-220.4.2]

Update RHEL6.2 kernel to 2.6.32-220.4.2.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I08a96bbf2ac3eeaaa9db13998531935a92d85fe7
Reviewed-on: http://review.whamcloud.com/2215
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1052 kernel: Kernel update [RHEL5.7 2.6.18-274.18.1.el5]
yangsheng [Tue, 31 Jan 2012 14:27:38 +0000 (22:27 +0800)]
LU-1052 kernel: Kernel update [RHEL5.7 2.6.18-274.18.1.el5]

Update RHEL5.7 kernel to 2.6.18-274.18.1.el5.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ie8439a864ec4ffdda7104ec1f2fd92fef3af2d0b
Reviewed-on: http://review.whamcloud.com/2069
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-143 fid hash improvements
Liang Zhen [Tue, 29 Mar 2011 08:44:41 +0000 (16:44 +0800)]
LU-143 fid hash improvements

Current hash function of fid is not good enough for lu_site and
ldlm_namespace.
We have to use two totally different hash functions to hash fid into
hash table with millions of entries.

Change-Id: I6261e63a406118a93d578210c31e67fc7f9e389c
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/374
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoTag 2.1.1rc3 2.1.1 2.1.1-RC4 v2_1_1_0 v2_1_1_0_RC4
Oleg Drokin [Fri, 17 Feb 2012 06:21:21 +0000 (01:21 -0500)]
Tag 2.1.1rc3

Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Ib1e74c23a0747526179d0904a03d3e4fa1c3b50a

12 years agoLU-1091 fix the tag matching code in configure
Brian J. Murrell [Fri, 10 Feb 2012 20:27:04 +0000 (15:27 -0500)]
LU-1091 fix the tag matching code in configure

Signed-off-by: Brian J. Murrell <brian@whamcloud.com>
Change-Id: I24dd4d0929364a2012afe725586626328ea584c0
Reviewed-on: http://review.whamcloud.com/2129
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-966 mdd: revert commit and use original patch
Mikhail Pershin [Wed, 15 Feb 2012 08:39:53 +0000 (12:39 +0400)]
LU-966 mdd: revert commit and use original patch

Use original patch which replaces LASSERT with CERROR in MDD,
because checking object exists in MDT ruins VBR recovery.
It happens too early but VBR should check that too before reply
to the client.

Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: If31586c9d1cc86c4c8059a7227c231acdc82a0f1
Reviewed-on: http://review.whamcloud.com/2148
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Yu Jian <yujian@whamcloud.com>
12 years agoLU-769 ptlrpc: Do not miss pending signals in ptlrpc_set wait
Oleg Drokin [Mon, 7 Nov 2011 03:34:41 +0000 (22:34 -0500)]
LU-769 ptlrpc: Do not miss pending signals in ptlrpc_set wait

conf_sanity test 23a highlighted a problem in ptlrpc_set_wait logic,
if we enter there with a signal pending and the import is not FULL,
there is no way to interrupt such a set because we block signals
all the time. Enabling signals all the time is not an option either.
Waiting until import reconnects is questionable too since it might
never come up after all (like in the test 23a).
So for the solution we will just manually mark the set as interrupted
after the initial wait.

Change-Id: Iaa3e356e971b4f75fd7f21cc579c85f7487719a0
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1657
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
12 years agoLU-820 Increase LU_CDEBUG_LINE so that osc page messages fit.
Oleg Drokin [Thu, 3 Nov 2011 03:41:09 +0000 (23:41 -0400)]
LU-820 Increase LU_CDEBUG_LINE so that osc page messages fit.

LU_CDEBUG_LINE at 255 does not quite fit osc-page message out of
osc_page_print that is quite long.
As a result we get annoying "does not end in newline" console messages
when this happens.

Doubling LU_CDEBUG_LINE certainly should be enough.

Change-Id: Iec8635d6578e192d3b33643c4b1dab1dae2be6b4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1642
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-778 o2iblnd: Add rdma_create_id() compatibility macro 2.1.1-RC2 v2_1_1_0_RC2
Ned Bass [Wed, 19 Oct 2011 21:47:50 +0000 (14:47 -0700)]
LU-778 o2iblnd: Add rdma_create_id() compatibility macro

As of RHEL6.2 kernel 2.6.32-204.el6, rdma_create_id() requires a
queue-pair type as a fourth argument.  This was previously inferred
from the rdma_port_space argument.  Add an autoconf test to detect
whether the fourth argument is expected and a compatibility macro
that discards the QP type argument if the 3-argument version of
rdma_create_id() is present.

Change-Id: Idb668e1f059954ecc994ad59b366d54da8b82dc8
Signed-off-by: Ned Bass <bass6@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1556
Tested-by: Hudson
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoUpdate version to 2.1.1(rc1) 2.1.1-RC1 v2_1_1_0_RC1
Oleg Drokin [Thu, 9 Feb 2012 00:56:41 +0000 (19:56 -0500)]
Update version to 2.1.1(rc1)

Change-Id: I4ac4d6a0701bc6217f968fe68fe3b2c5bb7cd341
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-797 tests: speed up ost-pools tests
Andreas Dilger [Thu, 26 Jan 2012 12:39:06 +0000 (05:39 -0700)]
LU-797 tests: speed up ost-pools tests

The test time of the ost-pools subtests is unreasonably long.

test_14 fills an OST to 90% full, regardless of the OST size.
Skip the test if the amount of data to be written is too large
to run in a practical time.

test_18 creates 3x3x30000 files to compare performance with/without
pools enabled.  Instead of creating a fixed number of files, use
createmany to run for a specific (short) time to measure performance.

test_23 tried to fill all OSTs 100% full.  Split this test into two:
- test_23a to test quota with a file in a pool
- test_23b to test OOS with a file striped over pool

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib851939a210ab68f52da1ac777781c2a922c500c
Reviewed-on: http://review.whamcloud.com/2028
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-322 tests: Pool test fixes for large number of OSTs
James Simmons [Wed, 30 Nov 2011 15:07:18 +0000 (10:07 -0500)]
LU-322 tests: Pool test fixes for large number of OSTs

This patch fixes issues with large numbers of OSTs with the
ost-pools test. We need to use the hexidecimal numbers for
the pool args since the OSTs UUID are named with hex numbers.
Curently the test do pass but errors can be seen in the logs
when more than 9 OSTs exist.

Change-Id: I2bed80a1c55fc7cd405dc530876ef517a380c423
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/251
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>