Whamcloud - gitweb
fs/lustre-release.git
13 years agoUpdate release date for new RC 2.1.2-RC2 v2_1_2_RC2
Oleg Drokin [Sun, 27 May 2012 16:31:18 +0000 (12:31 -0400)]
Update release date for new RC

Change-Id: I3d91ec3d3ef40f69844c09c1cab5c9a24ebbd642
Signed-off-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-549 llite: Improve statfs performance if selinux is disabled
Yevheniy Demchenko [Tue, 10 Apr 2012 20:01:14 +0000 (22:01 +0200)]
LU-549 llite: Improve statfs performance if selinux is disabled

Even if selinux is disabled, the client still tries to get selinux
attributes from the MDS. As xattrs are not yet cached, this significantly
slows down xattr-heavy operations like ls -l. This patch makes the
client return -EOPNOTSUPP locally if selinux is disabled.
It speeds up ls -l by 25% in the cold-cache case and by 50% in the
hot-cache case.
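
The short-circuit described above can be sketched in standalone C. This is
not the llite code: sketch_getxattr(), the selinux_enabled parameter and the
fetch_from_mds() stub are hypothetical names used only for illustration.
Only the -EOPNOTSUPP return for selinux attributes comes from this commit
message.

```c
#include <errno.h>
#include <string.h>

#define SELINUX_XATTR "security.selinux"

/* Hypothetical stand-in for the RPC to the MDS; it only counts calls. */
static int mds_rpc_count;

static int fetch_from_mds(const char *name)
{
    (void)name;
    mds_rpc_count++;
    return 0;
}

/*
 * Illustrative getxattr path: when selinux is disabled, answer a
 * security.selinux request locally with -EOPNOTSUPP instead of
 * sending a request to the MDS.
 */
static int sketch_getxattr(const char *name, int selinux_enabled)
{
    if (!selinux_enabled && strcmp(name, SELINUX_XATTR) == 0)
        return -EOPNOTSUPP;

    return fetch_from_mds(name);
}
```

With selinux disabled, sketch_getxattr("security.selinux", 0) returns
-EOPNOTSUPP without touching the stub, while any other attribute name still
goes to fetch_from_mds().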

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Signed-off-by: Keith Mannthey <keith@whamcloud.com>
Change-Id: I0c24bd8559818b0fae29a082790b392095f91ab5
Reviewed-on: http://review.whamcloud.com/2904
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1398 build: Module.symvers dependencies
Minh Diep [Thu, 24 May 2012 20:06:58 +0000 (13:06 -0700)]
LU-1398 build: Module.symvers dependencies

Ensure a Module.symvers file is generated with the correct
symbols for the configured lustre backend filesystems. This
is accomplished by adding a generic module-symvers rule which
depends on a filesystem specific version of the rule.  When a
filesystem is not configured the result is an empty rule.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I5333e226f69ca75b6a959cc1ed673d640da22b23
Reviewed-on: http://review.whamcloud.com/2898
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 years agoLU-969 Revert stack usage reduction patch
Oleg Drokin [Fri, 25 May 2012 04:06:59 +0000 (00:06 -0400)]
LU-969 Revert stack usage reduction patch

This patch introduced quite a few problems in the end.
Broke return values on 32bit systems (LU-1436)
Local io performance regression (LU-1408)

Revert "LU-969 debug: reduce stack usage"

This reverts commit b9cbe3616b6e0b44c7835b1aec65befb85f848f9.

Change-Id: I9966d9490e5016ef95d3ca088796ae187af318d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1134 test: can not assume lustre setup before nfs test 2.1.2-RC1 v2_1_2_RC1
Minh Diep [Thu, 24 May 2012 08:32:11 +0000 (16:32 +0800)]
LU-1134 test: can not assume lustre setup before nfs test

During autotest, lustre can be unmounted. parallel-scale-nfs
test should not assume that lustre is mounted and skip the setup.
This patch also includes the fix for LU-1213.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I81fd5a428f8367f68928716b5635bf94bcc7590c
Reviewed-on: http://review.whamcloud.com/2565
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-532 llite: trusted. xattr is invisible to non-root
Bob Glossman [Thu, 17 May 2012 18:41:59 +0000 (11:41 -0700)]
LU-532 llite: trusted. xattr is invisible to non-root

Filter out all invalid xattrs in listxattr.
This includes trusted. xattrs that can cause
unnecessary "EPERM" in subsequent getxattr operations.
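
A minimal sketch of this kind of filtering, assuming a listxattr-style
buffer of names separated by '\0'. The function name
sketch_filter_xattr_list() and the is_root flag are hypothetical, and the
real patch may filter more than the trusted. namespace.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compact a listxattr-style buffer (names separated by '\0') in place,
 * dropping every "trusted." name when the caller is not privileged.
 * Returns the new length of the buffer in bytes.
 */
static size_t sketch_filter_xattr_list(char *buf, size_t len, int is_root)
{
    size_t in = 0, out = 0;

    while (in < len) {
        char *end = memchr(buf + in, '\0', len - in);
        size_t n;

        if (end == NULL)    /* unterminated tail: stop copying */
            break;
        n = (size_t)(end - (buf + in));

        if (is_root || strncmp(buf + in, "trusted.", 8) != 0) {
            /* keep this name together with its terminating '\0' */
            memmove(buf + out, buf + in, n + 1);
            out += n + 1;
        }
        in += n + 1;
    }
    return out;
}
```

A non-root caller then never sees a trusted. name in the list, so it has no
reason to issue the getxattr that would fail with EPERM.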

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ic32fe262772370cd837bef878c9bfd9eefc0ec3c
Reviewed-on: http://review.whamcloud.com/2490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-629 ptlrpc: fix _debug_req to print opc/status
Andreas Dilger [Wed, 24 Aug 2011 21:17:53 +0000 (15:17 -0600)]
LU-629 ptlrpc: fix _debug_req to print opc/status

The 2.x _debug_req() function was changed in bug 16359/commit 5467a86021
to avoid problems with accessing unswabbed message buffers. Unfortunately,
this broke the printing of many/most _debug_req() messages, because it
didn't check whether swabbing was actually needed in the first place.

Also, in ptlrpc_expire_one_request() some extra debugging information was
added in bug 21636/commit 368689640 but never removed, making this common
message overly verbose.

Fix _debug_req() so that it prints opcode/flags/status, unless the
ptlrpc_body _needs_ to be swabbed, but isn't.  Also print out more
useful identifiers for the nodes (the obd_name and NID instead of
the connection UUID).  This removes some of the added verbosity from
ptlrpc_expire_one_request(), and most of the rest was already being
printed out (deadline, current, etc).

Change-Id: I88a78486becd19f5b38f5578e5cc30e649564908
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1286
Tested-by: Hudson
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Ported-by: Keith Mannthey <keith@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2875

13 years agoLU-1424 kernel: Kernel update [RHEL6.2 2.6.32-220.17.1.el6]
yangsheng [Mon, 21 May 2012 15:43:50 +0000 (23:43 +0800)]
LU-1424 kernel: Kernel update [RHEL6.2 2.6.32-220.17.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.17.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I01b238bd6d4ca52eeb8a36bc404f2557e5aa653b
Reviewed-on: http://review.whamcloud.com/2850
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-992 ldiskfs: fix typo for rhel5 ldiskfs patches
yangsheng [Wed, 11 Apr 2012 05:27:48 +0000 (13:27 +0800)]
LU-992 ldiskfs: fix typo for rhel5 ldiskfs patches

Fix a typo introduced a long time ago, even though rhel5
support is going to be deprecated.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I10564cd8dee7d62e05616869044dab0930a5638a
Reviewed-on: http://review.whamcloud.com/2506
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1274 osc: Do not grab mutex of cl_lock for glimpse
Jinshan Xiong [Fri, 30 Mar 2012 19:57:34 +0000 (12:57 -0700)]
LU-1274 osc: Do not grab mutex of cl_lock for glimpse

Otherwise this will cause client eviction if that lock is being
flushed and OST happens to be slow to finish the IO.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I4d7d9e8c275653d4e3f50f81dc416142d4905377
Reviewed-on: http://review.whamcloud.com/2808
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-425 tests: fix the issue of using "grep -w"
Yu Jian [Wed, 16 May 2012 08:19:00 +0000 (16:19 +0800)]
LU-425 tests: fix the issue of using "grep -w"

This patch fixes the following issue while using "grep -w"
to do exact match:

$ echo /mnt/nbp0-2 | grep -w /mnt/nbp0
/mnt/nbp0-2

Per the description of "-w" option:
-w, --word-regexp
Select only those lines containing matches that form whole words.
The test is that the matching substring must either be at the
beginning of the line, or preceded by a non-word constituent
character. Similarly, it must be either at the end of the line
or followed by a non-word constituent character. Word-constituent
characters are letters, digits, and the underscore.

So, the hyphen "-" character is a non-word constituent character
and "grep -w" does not do exact match on strings which contain it.
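
The pitfall can be reproduced outside of grep. The sketch below is a
hypothetical C re-implementation of the "-w" boundary test quoted above for
fixed strings (it is not grep's code, and it is not the fix applied by this
patch), next to a plain exact comparison. The helper names word_match() and
exact_match() are assumptions.

```c
#include <ctype.h>
#include <string.h>

/* Word-constituent characters: letters, digits, and the underscore. */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Return 1 if needle occurs in line delimited the way "grep -w"
 * requires: at the start of the line or after a non-word character,
 * and at the end of the line or before a non-word character.
 */
static int word_match(const char *line, const char *needle)
{
    size_t nlen = strlen(needle);
    const char *p = line;

    if (nlen == 0)
        return 0;

    while ((p = strstr(p, needle)) != NULL) {
        int left_ok = (p == line) || !is_word_char(p[-1]);
        int right_ok = (p[nlen] == '\0') || !is_word_char(p[nlen]);

        if (left_ok && right_ok)
            return 1;
        p++;
    }
    return 0;
}

/* Exact match: the whole line must equal the wanted string. */
static int exact_match(const char *line, const char *wanted)
{
    return strcmp(line, wanted) == 0;
}
```

word_match("/mnt/nbp0-2", "/mnt/nbp0") succeeds because "-" ends the word,
while exact_match() on the same arguments does not.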

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I61e611aad78748ad1e6362c7df3e0792e2766016
Reviewed-on: http://review.whamcloud.com/2801
Tested-by: Hudson
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: Report remaining recovery time consistently
Christopher J. Morrone [Mon, 27 Feb 2012 00:20:47 +0000 (16:20 -0800)]
LU-1095 debug: Report remaining recovery time consistently

Consistency is good: always report the remaining recovery time
in the mm:ss format.  This patch gets the last 3 remaining
instances where it is simply reported as a total number of seconds.
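
A minimal sketch of the conversion, assuming the remaining time is held as a
count of seconds. The helper name format_mmss() is hypothetical, and leaving
the minutes field unpadded is an assumption rather than something stated in
this commit.

```c
#include <stdio.h>

/*
 * Format a remaining-time value given in seconds as "mm:ss".
 * Minutes are not wrapped at 60, so 3725 seconds is written as "62:05".
 */
static int format_mmss(char *buf, size_t len, long seconds)
{
    if (seconds < 0)
        seconds = 0;
    return snprintf(buf, len, "%ld:%02ld", seconds / 60, seconds % 60);
}
```

Routing every message through one helper like this is one way to keep the
format identical across call sites.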

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: If5599d8c24b1cd862ab89670553fcd24672cadbc
Reviewed-on: http://review.whamcloud.com/2204
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
(cherry picked from commit e8c6d2e9647b2dc95edddac5e902168816e7f57b)
Reviewed-on: http://review.whamcloud.com/2834

13 years agoLU-1095 debug: Common client/server message standardization
Christopher J. Morrone [Mon, 27 Feb 2012 00:16:51 +0000 (16:16 -0800)]
LU-1095 debug: Common client/server message standardization

Enhance and standardize several common messages.  In particular,
when a peer is involved ensure the peer's NID is in the message, and
on the server include the obd name in the message.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: Iaea477e7dab240866a10c1863886d21d674e293d
Reviewed-on: http://review.whamcloud.com/2200
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Ported-by: Keith Mannthey <keith@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2833

13 years agoLU-1280 ldiskfs: remove EXT_ASSERT from ext3_ext_new_extent_cb()
Yu Jian [Thu, 17 May 2012 14:30:03 +0000 (22:30 +0800)]
LU-1280 ldiskfs: remove EXT_ASSERT from ext3_ext_new_extent_cb()

The EXT_ASSERT() in ext3_ext_new_extent_cb() is invalid since
new locking is introduced in ext4_ext_walk_space().

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I8de3ad4004c304a45be14347df50bf066d8f4caa
Reviewed-on: http://review.whamcloud.com/2827
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1366 utils: disable ldiskfs extents feature for MDT
Bobi Jam [Tue, 15 May 2012 16:03:03 +0000 (00:03 +0800)]
LU-1366 utils: disable ldiskfs extents feature for MDT

Explicitly disable "extents" for the MDT filesystem if it's based on ext4,
since it provides no benefit for the MDT.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I284c6c207fb8cc79537bebd60b6ab8d836fd4ed9
Reviewed-on: http://review.whamcloud.com/2798
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1205 tests: sanityn test_18 sometimes takes long time to run
Jinshan Xiong [Fri, 13 Apr 2012 23:15:51 +0000 (16:15 -0700)]
LU-1205 tests: sanityn test_18 sometimes takes long time to run

This is a live-lock problem where two processes are writing to the
same mmaped file via two nodes. To write a mmap region, both processes
will do:

  acquire cl_lock -> read page -> release cl_lock -> install page.

During the above steps, the page can be truncated once the lock is
released and then immediately cancelled by the other process, so the
kernel has to take the page fault again and may never complete.

Lustre can't handle this case well so this test case is disabled.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I0cbb00a1ca68715a0b97ce369a18c53fa8de19cb
Reviewed-on: http://review.whamcloud.com/2723
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1205 tests: cleanup code style in mmap_sanity.c
Andreas Dilger [Mon, 12 Mar 2012 20:43:45 +0000 (14:43 -0600)]
LU-1205 tests: cleanup code style in mmap_sanity.c

Cleanup numerous code style issues in the mmap_sanity.c test:
- whitespace at end of line
- spaces around operators
- indentation
- line wrapping at 80 columns

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: If47eeeb1dec2705b9aa4e70cba3c1bc9241546a7
Reviewed-on: http://review.whamcloud.com/2722
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1205 tests: add timestamps to sanityn 18 mmap
Andreas Dilger [Mon, 12 Mar 2012 20:23:07 +0000 (14:23 -0600)]
LU-1205 tests: add timestamps to sanityn 18 mmap

The sanityn.sh test_18 mmap_sanity.c test sometimes takes over
an hour to run, and sometimes only seconds.  Add timestamps to
the subtest results so that it is possible to debug where that
time is being spent.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I641566c9a0b204095ad0c2e3bee852a0e8fd6881
Reviewed-on: http://review.whamcloud.com/2721
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-980 llog: cleanup return value in llog_client_create
Hongchao Zhang [Thu, 12 Jan 2012 13:29:00 +0000 (21:29 +0800)]
LU-980 llog: cleanup return value in llog_client_create

In llog_client_create(), the newly allocated llog_handle is
returned through the parameter res, but res is not cleaned up
if one of the following operations fails and the corresponding
llog_handle has already been freed.
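
The pattern can be sketched in standalone C. The names sketch_handle,
sketch_remote_open() and sketch_create() are hypothetical and only mirror
the shape of the problem; they are not the llog API.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_handle {
    int id;
};

/* Hypothetical follow-up step that may fail after the allocation. */
static int sketch_remote_open(struct sketch_handle *h, int should_fail)
{
    (void)h;
    return should_fail ? -EIO : 0;
}

/*
 * Allocate a handle and return it through *res.  If a later step
 * fails, the handle is freed and *res is reset to NULL so the caller
 * is never left holding a dangling pointer.
 */
static int sketch_create(struct sketch_handle **res, int should_fail)
{
    struct sketch_handle *h;
    int rc;

    h = calloc(1, sizeof(*h));
    if (h == NULL)
        return -ENOMEM;
    *res = h;

    rc = sketch_remote_open(h, should_fail);
    if (rc != 0) {
        free(h);
        *res = NULL;    /* clear the out-parameter on failure */
    }
    return rc;
}
```

Without the reset, a caller that inspects *res after a failed call would see
a pointer to memory that has already been freed.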

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib8c40c53b071fff7de3550a39f009915cb8511a7
Reviewed-on: http://review.whamcloud.com/2806
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1102 crypto: correctly check crypto_alloc_blkcipher returns
Bobi Jam [Wed, 9 May 2012 19:22:58 +0000 (03:22 +0800)]
LU-1102 crypto: correctly check crypto_alloc_blkcipher returns

ll_crypto_alloc_blkcipher() can return an error value as well as a
NULL pointer, so its return value must be checked carefully.
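
A userspace sketch of such a check. The sketch_* helpers are simplified
stand-ins that imitate the kernel's ERR_PTR/IS_ERR/PTR_ERR convention, and
mapping a NULL result to -ENOMEM is an assumption made for illustration, not
something this commit message states.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define SKETCH_MAX_ERRNO 4095

static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/*
 * Map an allocator result to a return code, treating both NULL and
 * error-encoded pointers as failures.
 */
static int sketch_check_alloc(const void *tfm)
{
    if (tfm == NULL)
        return -ENOMEM;
    if (sketch_is_err(tfm))
        return (int)sketch_ptr_err(tfm);
    return 0;
}
```

Checking only for NULL would let an error-encoded pointer through, and
checking only for an error pointer would let NULL through; the sketch covers
both before the pointer is used.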

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I181b236406e2649580a04940886f849ad6071078
Reviewed-on: http://review.whamcloud.com/2703
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]
James Simmons [Wed, 9 May 2012 15:02:39 +0000 (11:02 -0400)]
LU-1374 kernel: Kernel update [RHEL5.8 2.6.18-308.4.1.el5]

Update RHEL5.8 kernel to 2.6.18-308.4.1.el5.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I1025558c97b1d6887d52020857a997cdc495d865
Reviewed-on: http://review.whamcloud.com/2684
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1345 tests: sanity test 215 non integer handling fix
James Simmons [Fri, 4 May 2012 11:40:49 +0000 (07:40 -0400)]
LU-1345 tests: sanity test 215 non integer handling fix

Sanity test 215 tests the format of various /proc/sys/lnet/* files.
Some of those files hold integer values, but there can be times when
no valid number is available, so NA is reported instead. This patch
handles those cases.
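
The acceptance rule can be sketched as follows. The real test is a shell
script, so this C helper (parse_int_or_na(), a hypothetical name) only
illustrates the idea of accepting either an integer or the literal NA.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Accept a field that is either a decimal integer or the literal "NA".
 * Returns 1 for an integer (stored in *val), 0 for "NA", -1 otherwise.
 */
static int parse_int_or_na(const char *field, long *val)
{
    char *end;
    long v;

    if (strcmp(field, "NA") == 0)
        return 0;

    errno = 0;
    v = strtol(field, &end, 10);
    if (errno != 0 || end == field || *end != '\0')
        return -1;

    *val = v;
    return 1;
}
```

A format check built on this rule passes for "42" and for "NA", and still
rejects malformed fields such as "4x".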

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If03a57e378e98ddd689c0e555fc8c9dc87d39138
Reviewed-on: http://review.whamcloud.com/2603
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-554 lnet: add gnilnd awareness to LNet
James Simmons [Wed, 9 May 2012 14:33:24 +0000 (10:33 -0400)]
LU-554 lnet: add gnilnd awareness to LNet

This allows servers on any network to talk to gnilnd routers.
This is 2.1 version of the Oracle 23884 attachment 31892.

Change-Id: I96777551b0caa50021ebb32755caaa01623ea97d
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/2449
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1308 Additional multihomed nid config fix
Oleg Drokin [Wed, 25 Apr 2012 19:28:22 +0000 (15:28 -0400)]
LU-1308 Additional multihomed nid config fix

Need to put the new nid addition at the last slot available,
not next after the last.

Change-Id: Icf9d898fba4c6e9c05f085b855a33282ea0d4b47
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2599
Reviewed-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years agoLU-1308 Properly add multihomed nids to peer table
Oleg Drokin [Tue, 17 Apr 2012 06:31:10 +0000 (02:31 -0400)]
LU-1308 Properly add multihomed nids to peer table

class_add_uuid had a copy&paste error where it was checking against
wrong entry for nid tables and as such had trouble finding multihomed
nid configurations.

Change-Id: I2d73bdde9cf7b0bf882b14b473b4491873e64c25
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2561
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
13 years agoLU-630 lnet: only router checks peer health
Lai Siyao [Mon, 5 Dec 2011 07:28:39 +0000 (15:28 +0800)]
LU-630 lnet: only router checks peer health

The peer health code is designed for routers, so a non-router (~rtr)
node always assumes peers to be alive.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ib794feace322112988a5b727ed40fb38f8f57370
Reviewed-on: http://review.whamcloud.com/2646
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: Send common recovery messages to D_HA
Christopher J. Morrone [Sun, 26 Feb 2012 23:05:14 +0000 (15:05 -0800)]
LU-1095 debug: Send common recovery messages to D_HA

These messages are always present at recovery time, and are not
understandable by a sysadmin.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I907b0ac49541b20699914dc4f8c5e0db3fb6bec9
Reviewed-on: http://review.whamcloud.com/2198
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: Improve recovery console messages
Christopher J. Morrone [Sat, 3 Mar 2012 01:41:45 +0000 (17:41 -0800)]
LU-1095 debug: Improve recovery console messages

Quiet and/or improve a few recovery messages.

A sysadmin will not understand this:

  2012-03-02 16:27:19 Lustre: 5211:0:(ldlm_lib.c:2072:
  target_queue_recovery_request()) Next recovery transno: 410629539,
  current: 410629539, replaying

Messages like this are too verbose for the console:

  2012-03-02 16:27:59 LustreError: 5286:0:
  (genops.c:1270:class_disconnect_stale_exports())
  lc3-OST0004: disconnect stale client
  47808f4f-9f36-e8eb-f363-14b1abe4ac57@<unknown>

and can be left to this simpler message:

  2012-03-02 16:27:59 Lustre: lc3-OST0005: disconnecting 0 stale
  clients

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I457602c3440ba10475e4ddca7c4e58ef8669922c
Reviewed-on: http://review.whamcloud.com/2249
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Liu Xuezhao <xuezhao.liu@emc.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: CWARN to CDEBUG for mds_notify() event
Brian Behlendorf [Fri, 19 Feb 2010 19:53:55 +0000 (11:53 -0800)]
LU-1095 debug: CWARN to CDEBUG for mds_notify() event

Both of these warnings represent correct behavior the administrator
does not need to know about, or more importantly do anything about.
As such I am moving both of these warnings to CDEBUG(D_CONFIG).

  Lustre: 8099:0:(mds_lov.c:1167:mds_notify()) MDS lc1-MDT0000:
  add target lc1-OST0023_UUID

  Lustre: lc1-MDT0000: in recovery, not resetting orphans on
  lc1-OST0007_UUID

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I66a98d87e3d5de7205420c74db4f6d9bcaaf31a7
Reviewed-on: http://review.whamcloud.com/2202
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: Improve messages for fake requests
Christopher J. Morrone [Mon, 27 Feb 2012 00:19:21 +0000 (16:19 -0800)]
LU-1095 debug: Improve messages for fake requests

Update the console filter to correctly handle fake requests and
squelch the lov_update_create_set() message for the
-ETIMEDOUT/-ENOTCONN case.

 LustreError: 7872:0:(lov_request.c:693:lov_update_create_set()) error
 creating fid 0x104c5e0b sub-object on OST idx 53/2: rc = -107

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I5f37f585566b053d515665fcddbcc8a3e653d89a
Reviewed-on: http://review.whamcloud.com/2203
Tested-by: Hudson
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 debug: Standardize, suppress mount/umount messages
Christopher J. Morrone [Mon, 27 Feb 2012 00:06:29 +0000 (16:06 -0800)]
LU-1095 debug: Standardize, suppress mount/umount messages

Standardize mount/umount console message to include profile name,
and optionally suppress them with the 'quiet' mount option.  We
have been using private namespaces for testing and mounting then
umounting the FS as needed for each job.  In this context these
messages end up causing a lot of syslog noise.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I7514f6016c337a358e5e31146644810dff292d02
Reviewed-on: http://review.whamcloud.com/2199
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1095 mgs: remove message from console
Christopher J. Morrone [Fri, 10 Feb 2012 23:24:06 +0000 (15:24 -0800)]
LU-1095 mgs: remove message from console

There is no good reason for a sysadmin to see this message
on the console.  Most of the time this will be a fluke
due to the vagaries of lnet networks (server decides
client is disconnected, but client doesn't know that yet,
messages arriving out of order, etc.).

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Change-Id: I0c18734f82a9c89a5e940ce4e2c602614e89ce26
Reviewed-on: http://review.whamcloud.com/2133
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-969 debug: reduce stack usage
Hongchao Zhang [Mon, 12 Mar 2012 08:11:47 +0000 (16:11 +0800)]
LU-969 debug: reduce stack usage

1, libcfs_debug_vmsg2 to accept libcfs_debug_msg_data structure
   to replace SUBSYSTEM, __FILE__, __FUNCTION__, __LINE__ and
   cdls on the stack

2, CDEBUG, DEBUG_CAPA use static libcfs_debug_msg_data

3, remove the local variable in RETURN/GOTO/__CHECK_STACK

4, reduce stack in recovery thread by moving lu_env,
   ptlrpc_thread to heap.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I75fe53027f56e27255b5f558e8fd57c7db833648
Reviewed-on: http://review.whamcloud.com/2668
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1361 build: enable kabi on rhel6
Minh Diep [Thu, 3 May 2012 23:00:46 +0000 (16:00 -0700)]
LU-1361 build: enable kabi on rhel6

Turn on USE_KABI=true to build with kabi on rhel6

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ie028ced17baf5a4540c59b8b63fb279a146718a6
Reviewed-on: http://review.whamcloud.com/2642
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1312 kernel: crash at boot time in isci driver
yangsheng [Wed, 2 May 2012 13:29:01 +0000 (21:29 +0800)]
LU-1312 kernel: crash at boot time in isci driver

Restore SG_ALL to its default value to avoid a crash in the isci driver.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I9c0e358d5cbc41af2c4c9549e837bc54f50820ad
Reviewed-on: http://review.whamcloud.com/2626
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-577 tests: FAIL replay-single test_70b rundbench load
James Simmons [Wed, 18 Apr 2012 14:13:54 +0000 (10:13 -0400)]
LU-577 tests: FAIL replay-single test_70b rundbench load

Test 70b for replay-single assumes that lustre is mounted on
/mnt/lustre, which is not the case for us. This patch passes
the proper MOUNT. The test also was not using the standard
DIR/tdir setup, which left generated data files not being
cleaned up. Increased the sleep period to match dbench's
warm up period. This gives dbench a chance to start up when
using many clients. Set the pdsh FANOUT environment variable
because by default pdsh launches in blocks of 32 nodes. This
way pdsh will launch all node jobs at the same time.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I5fd4160fc684c19990caf60b51ef62d18ff98249
Reviewed-on: http://review.whamcloud.com/2538
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1014 mountconf: MGS should process parameter config
Lai Siyao [Thu, 23 Feb 2012 08:23:25 +0000 (16:23 +0800)]
LU-1014 mountconf: MGS should process parameter config

The MGS doesn't have an llog config of its own, but it should process
the <profile>-params config, which holds global parameters for the whole
system.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I7c3a236fa0c24581494ba0e2a3ab40271a2e8c8f
Reviewed-on: http://review.whamcloud.com/2667
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1247 obdfilter: fix invalid check of precreate objects
Alexander.Boyko [Wed, 21 Mar 2012 17:47:53 +0000 (21:47 +0400)]
LU-1247 obdfilter: fix invalid check of precreate objects

The MDT precreates objects when its object count is less than
oscc->oscc_grow_count / 2. oscc->oscc_grow_count can be equal
to OST_MAX_PRECREATE, so the MDT's (last_id - next_id) is less than
(OST_MAX_PRECREATE * 3 / 2). This patch fixes the wrong condition in
filter_handle_precreate() when a delete orphans request happens.
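
The 3/2 bound follows from simple arithmetic, sketched below. Both helper
names are hypothetical, and the reading that each precreate request adds
grow_count objects is an assumption drawn from the description above.

```c
/*
 * Trigger described in the commit message: more objects are
 * requested once fewer than grow_count / 2 remain.
 */
static int sketch_needs_precreate(unsigned long available,
                                  unsigned long grow_count)
{
    return available < grow_count / 2;
}

/*
 * Exclusive upper bound on (last_id - next_id) right after such a
 * request, assuming the request adds grow_count objects: strictly
 * fewer than grow_count / 2 were left, plus grow_count new ones.
 */
static unsigned long sketch_outstanding_bound(unsigned long grow_count)
{
    return grow_count / 2 + grow_count;
}
```

For a hypothetical grow_count of 20000, the bound is 30000, which is
grow_count * 3 / 2. A check that caps the difference at grow_count alone
would therefore reject legitimate values.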

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Xyratex-bug-id: MRP-440

Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1e555c5480709d2acd4c3810a464b70767a6549f
Reviewed-on: http://review.whamcloud.com/2666
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexander Boyko <alexander_boyko@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-989 ldlm: Fix client's import destruction
Andriy Skulysh [Fri, 13 Jan 2012 14:08:57 +0000 (16:08 +0200)]
LU-989 ldlm: Fix client's import destruction

Move the client's import destruction from the disconnect to the cleanup
phase. This allows connect to be used after disconnect.

Xyratex-bug-id: MRP-288
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I0a63f66205ac5931ead0acea492f3e480669e237
Reviewed-on: http://review.whamcloud.com/2664
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1166 recovery: don't leak a connected client counter.
Alexey Lyashkov [Mon, 5 Mar 2012 16:17:19 +0000 (20:17 +0400)]
LU-1166 recovery: don't leak a connected client counter.

A race between target_handle_connect and client eviction may leak a
connected client counter, and some evicted clients will be counted twice.

Xyratex-bug: MRP-451

Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I13f8168baf904e214605514e4ddfc6f16ab077c9
Reviewed-on: http://review.whamcloud.com/2665
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-663 kernel: Some arch do not have NUMA features anymore
Gregoire Pichon [Wed, 7 Sep 2011 14:55:04 +0000 (16:55 +0200)]
LU-663 kernel: Some arch do not have NUMA features anymore

Some architectures, especially x86_64, do not have cpu_to_node()
defined as a macro, and node_to_cpumask() exported by the kernel
anymore.

The cpu_to_node() routine is defined either as a macro, as an inline
routine using another exported symbol, or as an exported symbol.
Anyway, the kernel defines this service since at least version
2.6.12.

The node_to_cpumask() routine has been replaced by cpumask_of_node()
for x86 architectures since kernel version 2.6.30.

The set_cpus_allowed() routine is not defined if
CONFIG_CPUMASK_OFFSTACK=y since kernel version 2.6.32.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I169a3e1f54816e0a29b265b1d2773f99dbf4eaff
Reviewed-on: http://review.whamcloud.com/2620
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-447 lnet: add lctl --net XXX push
James Simmons [Fri, 30 Mar 2012 12:50:09 +0000 (08:50 -0400)]
LU-447 lnet: add lctl --net XXX push

Lctl --net XXX push is used to clear out purgatory conns arbitrarily.
We use this with lctl --net XXX disconnect for regression testing.
This does not nuke the peer, so it shouldn't yield lnd_query failures
like del_peer does.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia7b033750134020022df676f451d91b20e4f5db4
Reviewed-on: http://review.whamcloud.com/2645
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-646 port bz23485 (clarification of lustre fsync behavior)
Lai Siyao [Tue, 30 Aug 2011 03:44:31 +0000 (20:44 -0700)]
LU-646 port bz23485 (clarification of lustre fsync behavior)

Add directory fsync operation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I339e0505da2de7dbe2de7f3d5f513df8332fe956
Reviewed-on: http://review.whamcloud.com/2643
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]
yangsheng [Fri, 4 May 2012 16:14:44 +0000 (00:14 +0800)]
LU-1358 kernel: Kernel update [RHEL6.2 2.6.32-220.13.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.13.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I927e544c990bebf51c38911962c24cf48e70cba7
Reviewed-on: http://review.whamcloud.com/2652
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()
Yu Jian [Thu, 3 May 2012 11:50:15 +0000 (19:50 +0800)]
LU-1280 ldiskfs: remove LASSERTF from ext3_ext_new_extent_cb()

The LASSERTF() in ext3_ext_new_extent_cb() was added for debugging,
to confirm that the race really happened, but was never removed
from the original patch in
http://review.whamcloud.com/1618 .

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I12482e7092320d7b80190c8a84014708bf67c75e
Reviewed-on: http://review.whamcloud.com/2639
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1319 mdt: increment MDT getattr stats
yangsheng [Wed, 2 May 2012 14:44:42 +0000 (22:44 +0800)]
LU-1319 mdt: increment MDT getattr stats

Move increment of MDT getattr stat from mdt_getattr() to
mdt_getattr_internal() so we don't miss other call paths
that may service getattr requests.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I25293fe98567b1250ecc2f9645295c1522345295
Reviewed-on: http://review.whamcloud.com/2637
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1350 debug: lower debug message level
Bobi Jam [Thu, 26 Apr 2012 17:18:44 +0000 (01:18 +0800)]
LU-1350 debug: lower debug message level

A race between file info read and unlink is normal, so lower the
debug message level; otherwise many unnecessary unmasked messages are
generated when mdt_object_find() cannot find the deleted objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: If4ec54fbd341bbdd16dbe0efc779be57e9640220
Reviewed-on: http://review.whamcloud.com/2608
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1017 handle -EAGAIN properly in lu_object_find_try()
Niu Yawei [Tue, 31 Jan 2012 06:06:47 +0000 (22:06 -0800)]
LU-1017 handle -EAGAIN properly in lu_object_find_try()

htable_lookup() could return -EAGAIN for a dying object; we should
handle it properly in lu_object_find_try().
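
The retry-on--EAGAIN pattern described above can be sketched in plain
user-space C. This is only an illustration: demo_object, demo_lookup()
and demo_find() are hypothetical stand-ins, not the real htable_lookup()
or lu_object_find_try(), and the "dying" countdown is an assumption made
so the example terminates.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached object; not a real Lustre type. */
struct demo_object {
	int dying;	/* > 0 while the object is still being torn down */
};

/* Lookup that reports -EAGAIN while the cached object is dying. */
int demo_lookup(struct demo_object *cached, struct demo_object **out)
{
	if (cached->dying > 0) {
		cached->dying--;	/* pretend teardown makes progress */
		return -EAGAIN;
	}
	*out = cached;
	return 0;
}

/* Caller treats -EAGAIN as "try again" and passes other results through. */
int demo_find(struct demo_object *cached, struct demo_object **out,
	      int max_tries)
{
	int rc = -EAGAIN;

	while (max_tries-- > 0) {
		rc = demo_lookup(cached, out);
		if (rc != -EAGAIN)
			break;
	}
	return rc;
}
```

The bounded max_tries is a choice made for the sketch; a real caller
would more likely wait for the dying object to go away before retrying.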

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: I1fa53a95f96f5a5c0d12158521d733fbd852b590
Reviewed-on: http://review.whamcloud.com/2629
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1217 osc: to not check a cl_lock's state w/o protection
Jinshan Xiong [Mon, 26 Mar 2012 19:17:17 +0000 (12:17 -0700)]
LU-1217 osc: to not check a cl_lock's state w/o protection

osc_page_putref_lock() used to check the cl_lock's refcount and the
corresponding osc_lock's ols_hold without any protection. This is
racy because another process can change the lock state and make the
assertion false.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I65fe1fa7fc55e8642fea6789784d7bb92a45d56f
Reviewed-on: http://review.whamcloud.com/2604
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-685 obdclass: lu_object reclamation is inefficient
Lai Siyao [Thu, 15 Sep 2011 06:45:13 +0000 (23:45 -0700)]
LU-685 obdclass: lu_object reclamation is inefficient

Put only non-referenced lu_objects in the lru list to speed up object
reclamation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ibde905bfe7ec5ec0b66f31a6070081cf3dc331cd
Reviewed-on: http://review.whamcloud.com/2628
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1084 ptlrpc: Change CWARNs to CDEBUGs
Christopher J. Morrone [Sat, 11 Feb 2012 01:34:32 +0000 (17:34 -0800)]
LU-1084 ptlrpc: Change CWARNs to CDEBUGs

These messages should not appear on the console.  A sysadmin
will have no idea what to make of most of them.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Ia8d7e033bcd14d7c8ea5b1b27f849ef81eb9ad4a
Reviewed-on: http://review.whamcloud.com/2621
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-459 quiet too noisy console messages at mount
Andreas Dilger [Mon, 22 Aug 2011 22:56:45 +0000 (16:56 -0600)]
LU-459 quiet too noisy console messages at mount

Quiet a number of extra debug messages printed to the console after a
remount or recovery.  They provide no value and just add to the general
confusion of reading Lustre debug messages.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bob Glossman <bogl@whamcloud.com>
Change-Id: Id83fa6c5538cf34f3af4503c1e16540a8de6e74e
Reviewed-on: http://review.whamcloud.com/2619
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-814 test: automate NFS over lustre testing
Minh Diep [Thu, 5 Jan 2012 16:55:48 +0000 (08:55 -0800)]
LU-814 test: automate NFS over lustre testing

Provide NFS setup within the auster framework.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Icfd61bf6772807a344576b92b5268a83a7b79e4b
Reviewed-on: http://review.whamcloud.com/1664
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-780 test: improve parallel-scale to support hyperion run
Minh Diep [Fri, 4 Nov 2011 23:12:57 +0000 (16:12 -0700)]
LU-780 test: improve parallel-scale to support hyperion run

We need to add support for srun/slurm, and a few tests
from the hyperion-sanity script that has been used for hyperion
testing.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I7f1baa0c99980ad9001436911d23f1030aa7d0fe
Reviewed-on: http://review.whamcloud.com/1615
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1137 ldlm: fix for the flock handling for 1.8 clients
Alexey Lyashkov [Fri, 24 Feb 2012 10:47:37 +0000 (02:47 -0800)]
LU-1137 ldlm: fix for the flock handling for 1.8 clients

This patch fixes incorrect filling of the flock owner field. The
issue is observed when 1.8 (and older) clients do not fill the
owner field correctly. With this patch the field is filled in on
the 2.x server side.

Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Iurii Golovach <iurii_golovach@xyratex.com>
Xyratex-bug-id: MRP-413
Change-Id: I88ba40eb9cb74d07b90862801669028c5dc94e08
Reviewed-on: http://review.whamcloud.com/2193
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years ago 2.1.2-rc0 2.1.2-RC0 v2_1_2_0_RC0
Oleg Drokin [Mon, 23 Apr 2012 18:44:03 +0000 (14:44 -0400)]
2.1.2-rc0

Change-Id: I97a7c0367c6db67282593b2c4e5c246b519e1d8f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1282 lprocfs: Add a module param to disable percpu stats
Bobi Jam [Thu, 12 Apr 2012 00:48:42 +0000 (08:48 +0800)]
LU-1282 lprocfs: Add a module param to disable percpu stats

Add an obdclass module option to choose to use a single lprocfs stats
structure rather than percpu data.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I45d5a05029197e629d4f7d161a5e4e5d01a93bf5
Reviewed-on: http://review.whamcloud.com/2515
Tested-by: Hudson
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-734 tests: add sub-tests into recovery-*-scale tests
Yu Jian [Wed, 11 Apr 2012 08:23:34 +0000 (16:23 +0800)]
LU-734 tests: add sub-tests into recovery-*-scale tests

This patch adds sub-tests into the recovery-*-scale tests
so that test results and logs could be gathered properly
and uploaded to Maloo.

The patch also does some cleanup work on the test scripts
and moves some common functions into test-framework.sh.

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: Ife174285d182ad5a2d4823767ca59df5a10b4aa4
Reviewed-on: http://review.whamcloud.com/2509
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-690 test: wait_osc_import_state() fixes
Yu Jian [Tue, 10 Apr 2012 04:40:44 +0000 (12:40 +0800)]
LU-690 test: wait_osc_import_state() fixes

-- increase maxtime to wait for the timeout of the 1st request;
   take into account the at_min value;
-- clean up wait_osc_import_state() to use _wait_import_state();

Oracle-bug-id: 24498

Signed-off-by: Elena Gryaznova <elena.gryaznova@oracle.com>
Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I0a3525a02e5ce6ca81082d177df0c5d7d68bea26
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@oracle.com>
Reviewed-on: http://review.whamcloud.com/2496
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 years ago LU-1092 ptlrpc: take export refcount during connect
Lai Siyao [Mon, 19 Mar 2012 08:41:54 +0000 (16:41 +0800)]
LU-1092 ptlrpc: take export refcount during connect

In the process of (re)connect, a refcount on the export should be
taken; otherwise disconnect of this export may be called, put the
last refcount of this export, and make access to this export
invalid.
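
A minimal sketch of why the extra reference matters, assuming a
simplified single-threaded model. demo_export, demo_export_get(),
demo_export_put() and demo_connect() are hypothetical names, the counter
is a plain int rather than an atomic, and the "freed" flag only stands
in for actually freeing the export.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an export; not the real obd_export. */
struct demo_export {
	int  refcount;	/* the object goes away when this drops to zero */
	bool freed;	/* stands in for the memory having been freed */
};

void demo_export_get(struct demo_export *exp)
{
	exp->refcount++;
}

void demo_export_put(struct demo_export *exp)
{
	if (--exp->refcount == 0)
		exp->freed = true;
}

/*
 * The connect path pins the export for the duration of its work, so a
 * disconnect that runs in the middle cannot drop the last reference.
 * Returns true if the export was still usable during the connect.
 */
bool demo_connect(struct demo_export *exp,
		  void (*racing_disconnect)(struct demo_export *))
{
	bool usable;

	demo_export_get(exp);		/* reference held by connect */
	if (racing_disconnect)
		racing_disconnect(exp);	/* drops the original reference */
	usable = !exp->freed;		/* still valid: we hold our own ref */
	demo_export_put(exp);		/* last put releases the export */
	return usable;
}
```

Without the demo_export_get() call, the racing disconnect would drop
the count to zero and the connect path would touch a freed export.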

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Iaf27e842ed516b8968c90bfce396609e39f52c85
Reviewed-on: http://review.whamcloud.com/2345
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1245 lprocfs: use correct cpu number
Bobi Jam [Wed, 18 Apr 2012 01:14:57 +0000 (09:14 +0800)]
LU-1245 lprocfs: use correct cpu number

Take care of correct cpu number in lprocfs_stats_collector().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib7bad581e216daa48bb6dc903b1720b44ddba9c0
Reviewed-on: http://review.whamcloud.com/2579
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1320 llite: fix a race between readpage and releasepage
Jinshan Xiong [Wed, 18 Apr 2012 04:40:24 +0000 (21:40 -0700)]
LU-1320 llite: fix a race between readpage and releasepage

This is a race between page stealing and readpage. If a just read
page is stolen, readpage will find the page is not uptodate, this
makes it panic so -EIO is returned to the reading application.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: Ib16d12d3bc3cc8c0545aa27f0836e4fd89c3a809
Reviewed-on: http://review.whamcloud.com/2564
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1241 kernel: Kernel update [RHEL6.2 2.6.32-220.7.1.el6]
yangsheng [Wed, 28 Mar 2012 06:28:30 +0000 (14:28 +0800)]
LU-1241 kernel: Kernel update [RHEL6.2 2.6.32-220.7.1.el6]

Update RHEL6.2 kernel to 2.6.32-220.7.1.el6.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I695f74e160b0b836c663c34cf185bcbab7b6c16c
Reviewed-on: http://review.whamcloud.com/2393
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-974 protocol: change OBD_CONNECT_GRANT_PARAM
Andreas Dilger [Tue, 13 Mar 2012 20:53:41 +0000 (14:53 -0600)]
LU-974 protocol: change OBD_CONNECT_GRANT_PARAM

Change the OBD_CONNECT_GRANT_PARAM flag value to avoid conflict
with the OBD_CONNECT_UMASK flag from LU-974.  While that patch is
not yet landed to our release tree, it is in use in production at
some customers.  While the risk of conflict is currently low, it
is easier to change the GRANT_PARAM value since it is only in use on
the orion branch, and isn't even handled by the client there yet.

Add (hopefully) clear comments for OBD_CONNECT and obd_connect_data
to ensure that they are not modified in some incompatible way across
branches.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I503892c3b595c0272b0941fa58a16a496321cab0
Reviewed-on: http://review.whamcloud.com/2299
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
13 years ago LU-573: conf-sanity test_22 failed with 41
Jinshan Xiong [Sat, 6 Aug 2011 02:32:22 +0000 (19:32 -0700)]
LU-573: conf-sanity test_22 failed with 41

Make sure recovery on the OST is finished before trying to create a file.

Change-Id: I4a36685a5cd9c55de729906bff50c29b1108c931
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1192
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-570: Add function to find connect uuid by nid
Jinshan Xiong [Wed, 10 Aug 2011 19:30:20 +0000 (12:30 -0700)]
LU-570: Add function to find connect uuid by nid

In this patch, two functions are added:
- class_find_uuid(), find conn uuid by peer nid
- client_import_find_conn(), find a conn uuid in import connection list

Also, a code cleanup is performed.

Change-Id: I50e8e9392a39ef78719504cf083c0c22f5d39dcb
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1189
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1206: mdt: Fix error handling in mdt_mfd_open
Oleg Drokin [Thu, 15 Mar 2012 00:56:02 +0000 (20:56 -0400)]
LU-1206: mdt: Fix error handling in mdt_mfd_open

In mdt_mfd_open, if the mo_open() call failed or we could not allocate
mfd, we also need to undo the write/exec reference count so as not to
interfere with subsequent exec/write accesses.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I3bd98bd68368b48f2afaa7bb450d3a9947c992ac
Reviewed-on: http://review.whamcloud.com/2300
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years ago LU-879 mds: Add a few rename stats under /proc
wangdi [Fri, 16 Dec 2011 01:19:31 +0000 (17:19 -0800)]
LU-879 mds: Add a few rename stats under /proc

1. Add samedir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of same dir rename.
2. Add crossdir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of cross dir rename.
3. Add /proc/fs/lustre/mds/lustre-MDT0000/rename_stats (YAML format)
to collect stats of renames that happen in directories of different
sizes, i.e. the size of the directories under which files are being
removed.

With these patches, we can find out how many renames take place
in the same directory compared to how many renames are between
directories. So during DNE implementation, we can know how rename
may be affected by DNE remote directories and large striped
directories.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I4452ce196802c5724607455e0a9b4b372b06f159
Reviewed-on: http://review.whamcloud.com/1878
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 years ago LU-680 lfs: instance <-> mount point mapping from lfs
Richard Henwood [Wed, 2 Nov 2011 23:33:51 +0000 (19:33 -0400)]
LU-680 lfs: instance <-> mount point mapping from lfs

A new option to 'lfs' has been created to return the mapping
between Lustre filesystem instance and paths. The option
is 'getname' and it may be called with or without arguments.

'lfs getname' without arguments returns the instances of all
Lustre mount points.

'lfs getname [path...]' returns the instance of each specified
path. If the path is not a Lustre instance, 'No such device' is
returned.

OBD_IOC_GETNAME has been added to file.c to provide consistent
behavior for file as well as directory paths.

An llapi_getname helper function has been added to liblustreapi
that returns a lustre instance name if a path is provided.

Documentation for 'lfs getname' is included inline and the lfs
man page has been updated.

Signed-off-by: Richard Henwood <rhenwood@whamcloud.com>
Signed-off-by: John L. Hammond <jhammond@tacc.utexas.edu>
Change-Id: Iab8ff12d604c7ce853f3c204b455e3b641f659f4
Reviewed-on: http://review.whamcloud.com/1373
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1098 debug: lower debug message level
Bobi Jam [Tue, 21 Feb 2012 01:23:11 +0000 (09:23 +0800)]
LU-1098 debug: lower debug message level

A race between file info read and unlink is normal, so lower the
debug message level; otherwise many unnecessary unmasked messages are
generated when mdt_object_find() cannot find the deleted objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7630e6a1456ffb435c8e67cc626bf38547b840d0
Reviewed-on: http://review.whamcloud.com/2165
Tested-by: Hudson
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-499 grant/cancel_rate are static when OST is idle
Lai Siyao [Fri, 5 Aug 2011 03:43:20 +0000 (20:43 -0700)]
LU-499 grant/cancel_rate are static when OST is idle

ldlm_pool_recalc() shouldn't be skipped if namespace->ns_bref equals
zero; instead, a flag ns_stopping is added to mark that the ldlm
namespace is being freed.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Ic6485c34ec3e9868ae531a4dc25aee969c374eb5
Reviewed-on: http://review.whamcloud.com/1185
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-532 mdt: improve xattr ctime warning message
Andreas Dilger [Thu, 28 Jul 2011 22:13:53 +0000 (16:13 -0600)]
LU-532 mdt: improve xattr ctime warning message

Print out which xattr is not getting OBD_MD_FLCTIME set so that it
is possible to track down what code path on the client is failing.

Change-Id: I1918d2e8e0a1e03d8437846e823bca9df6f89b48
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1161
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-985 lprocfs: verify user buffer access
Bobi Jam [Fri, 13 Jan 2012 05:46:07 +0000 (13:46 +0800)]
LU-985 lprocfs: verify user buffer access

In lprocfs_xxx_evict_client(), the user's buffer needs to be verified
when it is accessed.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I702e22f8d432edce200c6d91a0af8a1eac792008
Reviewed-on: http://review.whamcloud.com/1961
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-717 ldiskfs: MRP-222 Replace sysname with nodename in MMP
Nikitas Angelinas [Mon, 5 Dec 2011 22:31:12 +0000 (22:31 +0000)]
LU-717 ldiskfs: MRP-222 Replace sysname with nodename in MMP

sysname holds "Linux" by default, i.e. what appears when doing a
"uname -s"; nodename should be used to print the machine's hostname,
i.e. what is returned when doing a "uname -n" or "hostname", and what
gethostname(2)/sethostname(2) manipulate, in order to notify the
administrator of the node which is contending to mount the
filesystem.

Andreas says this was introduced when porting the MMP patches from
RHEL5 to RHEL6, and then also pushed upstream to ext4; a patch for
upstream ext4 has already been submitted.
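
The difference between the two fields can be seen from user space with
uname(2), which fills the same kind of structure. The helper below is a
hypothetical illustration (demo_describe_host() is not part of ldiskfs
or MMP); it formats the nodename first, with the sysname in parentheses.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Format "<nodename> (<sysname>)" into buf.
 * nodename is the hostname ("uname -n"); sysname is the OS name
 * ("uname -s"), which does not identify a particular node.
 * Returns 0 on success, -1 on failure or truncation.
 */
int demo_describe_host(char *buf, size_t len)
{
	struct utsname u;
	int n;

	if (uname(&u) != 0)
		return -1;
	n = snprintf(buf, len, "%s (%s)", u.nodename, u.sysname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

On a Linux machine the sysname part is the same on every node, which is
why it is of no use for telling an administrator which node is
contending for the mount.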

Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Change-Id: I207bf145d114a9981b5a6add4bbf92ca76f71840
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1419
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 years ago LU-427 test: Test failure on test suite lfsck
yangsheng [Thu, 4 Aug 2011 03:36:23 +0000 (11:36 +0800)]
LU-427 test: Test failure on test suite lfsck

- Reset $MDSDB & $OSTDB in generate_db(). Otherwise they go
  stale if the user redefines $SHARED_DIRECTORY.
- Add a function check_shared_dir() to ensure
  $SHARED_DIRECTORY is shared among test nodes.

Change-Id: Idf2a3d75e46c4cf768419adfea627511c24c495c
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1180
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-81 deadlock of changelog adding vs. changelog cancelling
Niu Yawei [Thu, 18 Aug 2011 04:22:19 +0000 (21:22 -0700)]
LU-81 deadlock of changelog adding vs. changelog cancelling

This is a workaround for the deadlock of changelog adding vs.
changelog cancelling. Changelog adding always starts the transaction
before acquiring the catlog lock (lgh_lock), whereas changelog
cancelling starts the transaction after taking the catlog lock.

We start the transaction earlier to avoid the above deadlock.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I9647b9a559f68a27dc0d4b4885857d3cf73b5b8e
Reviewed-on: http://review.whamcloud.com/1260
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-759 mdc: Clear rq_replay on error in mdc_enqueue()
Li Wei [Fri, 30 Sep 2011 08:30:09 +0000 (16:30 +0800)]
LU-759 mdc: Clear rq_replay on error in mdc_enqueue()

When mdc_enter_request() fails (e.g., due to signals) in mdc_enqueue(),
the request is freed without any care about its rq_replay field.  For
rq_replay requests, this results in assertion failures in
__ptlrpc_free_req().  This patch adds a call to mdc_clear_replay_flag()
to make sure __ptlrpc_free_req()'s assumption is respected.

Change-Id: I2185066a9f47b3d9563d9e1a8989754ef2e2dcb4
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1518
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-543 mdd: fix rename changelog
Niu Yawei [Sat, 5 Nov 2011 05:21:12 +0000 (22:21 -0700)]
LU-543 mdd: fix rename changelog

The current rename changelog stores the source fid in both the
CL_RENAME and CL_EXT records, which is redundant, and the 'tfid' in
CL_EXT is never used.

Instead, we should store the target fid in the CL_EXT record, so that
an application can detect the fid unlinked by rename in the changelog.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0c616f813657a2faefa60a707f4fc1d9dc971b39
Reviewed-on: http://review.whamcloud.com/1652
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-542 Fix mdt xattr handler logic error
Bobi Jam [Thu, 28 Jul 2011 14:12:38 +0000 (22:12 +0800)]
LU-542 Fix mdt xattr handler logic error

Record system ACL and user xattr change/deletion changelog.

Change-Id: I5aabf1879ec6e812361fe0d1b8255f84d0e817d6
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1158
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-935 quota: break early when b/i_unit_sz exceeded upper limit
Niu Yawei [Mon, 19 Dec 2011 10:18:28 +0000 (02:18 -0800)]
LU-935 quota: break early when b/i_unit_sz exceeded upper limit

While expanding b/i_unit_sz in dquot_create_oqaq(), we should break
out of the loop early when b/i_unit_sz exceeds the upper limit;
otherwise qaq_b/iunit_sz could overflow and result in an endless loop.
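
The failure mode can be sketched with a generic doubling loop. This is
an assumption-laden illustration, not the code in dquot_create_oqaq():
demo_expand_unit(), its parameters and the doubling step are all
hypothetical, chosen only to show how an unbounded expansion can wrap
around and never terminate.

```c
#include <stdint.h>

/*
 * Grow `unit` by doubling until it covers `needed`, but stop as soon as
 * another doubling would pass `upper_limit`. Without the early break a
 * large `needed` lets `unit` wrap around to 0, after which 0 * 2 == 0
 * and the loop never ends.
 */
uint64_t demo_expand_unit(uint64_t unit, uint64_t needed,
			  uint64_t upper_limit)
{
	while (unit < needed) {
		/* Break before the next doubling exceeds the limit. */
		if (unit == 0 || unit > upper_limit / 2)
			break;
		unit *= 2;
	}
	return unit;
}
```

Checking before the multiplication, rather than after it, also keeps the
comparison itself safe from wrap-around.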

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0bf069e9259627426d7a87ec42844eaed7a733b4
Reviewed-on: http://review.whamcloud.com/1890
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years ago LU-931 mdd: store lu_fid instead of pointer in md_capainfo
Hongchao Zhang [Tue, 17 Jan 2012 04:10:25 +0000 (12:10 +0800)]
LU-931 mdd: store lu_fid instead of pointer in md_capainfo

In md_capainfo, mc_fid contains at most 5 pointers to lu_fid. If the
corresponding lu_fid is freed, the pointer is not updated, so it
will point to freed memory!
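
A small sketch of storing the identifier by value instead of by pointer.
The types here are simplified, hypothetical stand-ins (demo_fid,
demo_capainfo, demo_capainfo_set() are not the real definitions); only
the count of 5 slots is taken from the description above.

```c
#include <stdint.h>

/* Simplified stand-in for a FID; field names are illustrative. */
struct demo_fid {
	uint64_t f_seq;
	uint32_t f_oid;
	uint32_t f_ver;
};

#define DEMO_CAPA_NUM 5

/* Holds copies of the FIDs, so it never refers to someone else's memory. */
struct demo_capainfo {
	struct demo_fid mc_fid[DEMO_CAPA_NUM];
};

/* Copy *fid into slot idx. Returns 0 on success, -1 on a bad index. */
int demo_capainfo_set(struct demo_capainfo *ci, int idx,
		      const struct demo_fid *fid)
{
	if (idx < 0 || idx >= DEMO_CAPA_NUM)
		return -1;
	ci->mc_fid[idx] = *fid;	/* struct copy: source may go away later */
	return 0;
}
```

The cost is a few extra bytes per slot; in exchange the structure stays
valid no matter what happens to the object the FID was taken from.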

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: I00088cbfeb145ceac0477467a8b2436f6cf1e530
Reviewed-on: http://review.whamcloud.com/1979
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1218 proc: Recovery timer in proc always displays 0
yangsheng [Thu, 15 Mar 2012 16:18:29 +0000 (00:18 +0800)]
LU-1218 proc: Recovery timer in proc always displays 0

Calculate the remaining recovery time for proc display.

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: I50c14859c704c7e2bc60b66b3d70350648feebb6
Reviewed-on: http://review.whamcloud.com/2334
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-952 quota: follow locking order of quota code
Niu Yawei [Fri, 6 Jan 2012 09:18:35 +0000 (01:18 -0800)]
LU-952 quota: follow locking order of quota code

The locking order of the quota code is: i_mutex > dqonoff_sem >
journal_lock > dqptr_sem > dquot->dq_lock > dqio_mutex, so we
should call ll_vfs_dq_init() after the journal is started to avoid
deadlock.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia88a2eb8c9dc3827afd4828e0160ee376a1f041e
Reviewed-on: http://review.whamcloud.com/1923
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-805 quota: lfs quota doesn't print grace time correctly
Niu Yawei [Tue, 8 Nov 2011 13:07:05 +0000 (05:07 -0800)]
LU-805 quota: lfs quota doesn't print grace time correctly

Lustre always triggers the grace time when the allocated qunit
exceeds the softlimit; however, the user tool 'lfs quota' only prints
the grace time when the total usage is greater than the softlimit, so
sometimes the user can't tell from 'lfs quota' output whether the
softlimit is already exceeded.

This patch changes 'lfs quota' to use the data obtained from the
kernel instead of comparing usage with the softlimit.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia564c803ca33b2cf925759b6a6e4e4df2692f28d
Reviewed-on: http://review.whamcloud.com/1674
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-78 o2iblnd: kiblnd_check_conns can deadlock
Liang Zhen [Tue, 21 Feb 2012 04:40:25 +0000 (12:40 +0800)]
LU-78 o2iblnd: kiblnd_check_conns can deadlock

kiblnd_check_conns() called kiblnd_check_sends() while holding the
global rwlock. This is wrong because kiblnd_check_sends() could do
many things:
 - call lnet_finalize(), which is not safe while holding a spinlock
 - call kiblnd_close_conn(), which needs to write_lock the same
   global lock
 - kiblnd_check_sends() might need to allocate a NOOP message

It can be fixed by moving the call to kiblnd_check_sends() out of
the spinlock. This patch is from the fix for Bug 20288, with some
small changes.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Icc9fedc70ecb25b0c41ebaf6d80c971f8281c9c6
Reviewed-on: http://review.whamcloud.com/2166
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-651 osc: suppress message in can_merge_pages()
Bobi Jam [Sun, 2 Oct 2011 08:08:56 +0000 (16:08 +0800)]
LU-651 osc: suppress message in can_merge_pages()

Throttle messages if adjacent brw pages are not mergeable because
they have different OBD_BRW_NOQUOTA flags.

Change-Id: I22ce6f8807e2541d3e6b3c9631f60faa36baa81a
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1328
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years ago LU-1212 ptlrpc: ptlrpc_grow_req_bufs is racy
Liang Zhen [Wed, 14 Mar 2012 04:41:08 +0000 (12:41 +0800)]
LU-1212 ptlrpc: ptlrpc_grow_req_bufs is racy

Multiple ptlrpc service threads can enter ptlrpc_grow_req_bufs()
at the same time if they find "low_water" in ptlrpc_check_rqbd_pool().
Each of these threads will allocate ptlrpc_service::srv_nbuf_per_group
request buffers, and together they could consume all memory.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I83d6fe53a0f86691ae7e2afb3d75fb8677f58688
Reviewed-on: http://review.whamcloud.com/2308
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
13 years ago LU-960 utils: bad stripe count report, and validate stripe size
Minh Diep [Tue, 7 Feb 2012 17:12:58 +0000 (09:12 -0800)]
LU-960 utils: bad stripe count report, and validate stripe size

Need to use %d to print -1 instead of %u
Need to check for -1 in input for stripe size
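
A minimal sketch of both points, using hypothetical helpers that are not
the lfs source. It assumes the stripe count is held in an int where -1
is a meaningful value, and that a negative stripe size should simply be
rejected.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Format a stripe count. With "%d" a value of -1 is shown as "-1";
 * with "%u" the same value would typically be shown as 4294967295 on
 * platforms with a 32-bit int.
 */
int demo_format_stripe_count(char *buf, size_t len, int stripe_count)
{
	return snprintf(buf, len, "stripe_count: %d", stripe_count);
}

/* Reject a negative stripe size (such as -1) given as input. */
int demo_check_stripe_size(long long stripe_size)
{
	return stripe_size < 0 ? -EINVAL : 0;
}
```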

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ic4ee84a45bdb5dc934a3e681a4fc2fcd51f14b99
Reviewed-on: http://review.whamcloud.com/2112
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-106 procfs: many proc entries are not accessed safely
Lai Siyao [Mon, 13 Jun 2011 03:56:03 +0000 (20:56 -0700)]
LU-106 procfs: many proc entries are not accessed safely

Some in-memory data may be released or uninitialized at the time
of proc entry creation or removal. This patch includes the following
fixes:
* initialize data before proc entry creation
* free data after proc entry removal
* free proc entries in obd_precleanup() because
  obd_uuid/nid/nid_stats_hash are released in class_cleanup().
* free proc entries after obd_zombie_barrier() because obd_export
  holds one refcount of nid_stat.
* check osd->od_mount before accessing osd proc entries because the
  osd proc entries are created before mount.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: I03cb977e1be0747032a70f6a39fec804f81d70cc
Reviewed-on: http://review.whamcloud.com/326
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1109 llite: do splice read stripe by stripe
Jinshan Xiong [Thu, 23 Feb 2012 19:57:54 +0000 (11:57 -0800)]
LU-1109 llite: do splice read stripe by stripe

If nfsd is reading a buffer that spans stripes, and the first stripe
happens to be 64KB (PIPE_BUFFERS*PAGE_SIZE), then the first read will
occupy all pipe buffers, and nfsd gets stuck if it reads the next
stripe immediately.
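
Reading stripe by stripe amounts to clamping each read so that it stops at the next stripe boundary. The helper below is a sketch under that assumption. chunk_within_stripe is a hypothetical name, not a function from llite, and the real patch may split the read differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes that can be read starting at pos without crossing
 * the next stripe boundary. stripe_size must be non-zero. */
size_t chunk_within_stripe(uint64_t pos, size_t count, uint64_t stripe_size)
{
	uint64_t left = stripe_size - (pos % stripe_size);

	return count < left ? count : (size_t)left;
}
```

A caller would loop, advancing pos by the returned length, so that no single read covers more than one stripe.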

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I13cb54b37f738ee3c081dff1929630ea523b77fd
Reviewed-on: http://review.whamcloud.com/2182
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1050 o2iblnd: fix checking order of rdma_create_id() argument
Shuichi Ihara [Mon, 13 Feb 2012 16:47:38 +0000 (01:47 +0900)]
LU-1050 o2iblnd: fix checking order of rdma_create_id() argument

Replace rdma_create_id() with rdma_destroy_id() in the
openib gen2 test, and move the four-argument check to
the end of the openib test.

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: I0782183f15f58647291518a4222610601083c369
Reviewed-on: http://review.whamcloud.com/2097
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years agoLU-1125 recovery: initial recovery thread's watchdog
Bobi Jam [Wed, 22 Feb 2012 06:30:41 +0000 (14:30 +0800)]
LU-1125 recovery: initial recovery thread's watchdog

The recovery thread does not have a watchdog attached; initialize it
correctly.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I6993c39bbf18f47e9ccd965a5d2ba1919cfb7736
Reviewed-on: http://review.whamcloud.com/2174
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1128 ldlm: return -1 for server pool shrinker
Niu Yawei [Fri, 24 Feb 2012 05:21:51 +0000 (21:21 -0800)]
LU-1128 ldlm: return -1 for server pool shrinker

The ldlm server pool shrinker is only used to decrease SLV and never
reclaims any memory directly, so it should always return -1 to tell
the kernel to break out of the shrink loop.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I17f51ac84eb0b8c70b2cee9ac7eeca34647c1990
Reviewed-on: http://review.whamcloud.com/2184
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-882 quota: Quota code compares unsigned < 0
Niu Yawei [Mon, 19 Dec 2011 10:01:36 +0000 (02:01 -0800)]
LU-882 quota: Quota code compares unsigned < 0

Port from b23858.

check_cur_qunit() checks "if (limit + record < 0)". However, limit
is unsigned, so this check is always false, and when limit is smaller
than -record, the following "limit += record" makes limit an
unreasonably large value.

This patch also fixes a similar defect in dqacq_handler().
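
The pitfall is easy to reproduce outside the quota code. In the sketch below the function names and the 64-bit types are assumptions chosen for illustration, and clamping to zero is only one possible reaction to underflow. The commit text does not say how the patch handles that case.

```c
#include <stdint.h>

/* Buggy form: the sum is computed as unsigned, so it is never
 * negative and this always returns 0. */
int would_underflow_buggy(uint64_t limit, int64_t record)
{
	return (limit + record) < 0;
}

/* Fixed form: compare limit against the magnitude of a negative
 * record before adding. The unsigned subtraction also handles
 * INT64_MIN without signed overflow. */
int would_underflow(uint64_t limit, int64_t record)
{
	uint64_t magnitude = 0 - (uint64_t)record;

	return record < 0 && limit < magnitude;
}

/* Apply record to limit, clamping at zero on underflow
 * (the clamp is an assumption of this sketch). */
uint64_t apply_record(uint64_t limit, int64_t record)
{
	if (would_underflow(limit, record))
		return 0;
	return limit + (uint64_t)record;
}
```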

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@oracle.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Iea02143dae5542f1a9f9cc823a684a18031b8a03
Reviewed-on: http://review.whamcloud.com/1889
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-617 recovery: setattr from open breaks recovery
Niu Yawei [Fri, 26 Aug 2011 02:58:36 +0000 (19:58 -0700)]
LU-617 recovery: setattr from open breaks recovery

The setattr from open (open(O_TRUNC)) is currently serialized with
'cl_setattr_lock' on the client and goes to a dedicated portal, which
is different from other reint operations. Consequently, the setattr
RPC can run in parallel with other reint RPCs, which results in a race
when updating last_transno/last_xid on the server.

This patch removes the 'cl_setattr_lock' code so that all reint
operations are serialized by 'cl_rpc_lock'. The server-side code
which assumes that the client holds a DLM lock during setattr from
open is also removed, since that assumption does not hold.

The MDS_SETATTR_PORTAL service is preserved for compatibility with
old clients. MDS_SETATTR_FROM_OPEN is also preserved, since this flag
is used to check write access for open(O_TRUNC), and it may be useful
for optimization in the future.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I45f83f8f05022ff0d31f8e7784381821c835785d
Reviewed-on: http://review.whamcloud.com/1654
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
13 years agoLU-1198 idl: move FID VER to DLM resource name[1]
Andreas Dilger [Thu, 8 Mar 2012 07:29:09 +0000 (15:29 +0800)]
LU-1198 idl: move FID VER to DLM resource name[1]

Until Lustre 1.8.7/2.1.1 the FID version was packed into name[2].

However, this leaves very little room in the LDLM resource name
for other uses.  The upcoming quota code needs to store another
FID in the LDLM resource, to allow directory tree quotas to be
managed by the DLM.

The 32-bit VER, which is currently always 0, is moved into the high
bits of name[1] along with the 32-bit OID, to avoid consuming the
name[2] field.  Since future use of the FID version (including
snapshots, pools, etc) will need changes on the client side anyway,
there will never be non-zero VER on an existing client.
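
The packing described above can be sketched as follows. The helper names are hypothetical and are not taken from the Lustre headers. Only the layout (VER in the high 32 bits of the word, OID in the low 32 bits) follows the text.

```c
#include <stdint.h>

/* Pack the 32-bit VER into the high half and the 32-bit OID into the
 * low half of one 64-bit resource name word. */
uint64_t pack_ver_oid(uint32_t ver, uint32_t oid)
{
	return ((uint64_t)ver << 32) | oid;
}

/* Recover the VER from the high 32 bits. */
uint32_t unpack_ver(uint64_t word)
{
	return (uint32_t)(word >> 32);
}

/* Recover the OID from the low 32 bits. */
uint32_t unpack_oid(uint64_t word)
{
	return (uint32_t)(word & 0xffffffffu);
}
```

With VER equal to 0, the packed word is numerically equal to the OID alone, which is why the change is invisible as long as no client uses a non-zero version.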

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1e500cfb277dfc25bc056bb0c5763e48e7dcab0
Reviewed-on: http://review.whamcloud.com/2288
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-620 llite: add delete/remove_from_page_cache check
Bobi Jam [Wed, 21 Sep 2011 10:17:13 +0000 (18:17 +0800)]
LU-620 llite: add delete/remove_from_page_cache check

Later 2.6.32 kernels use the memory cgroup feature and do not export
truncate_complete_page, but export delete_from_page_cache or
remove_from_page_cache instead. We need to use them properly in the
patchless client code.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I33e3e7c32b548866ee77753ef8a8193c814d0ecb
Reviewed-on: http://review.whamcloud.com/2230
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-1049 script: lc_net doesn't parse output correctly
Minh Diep [Mon, 6 Feb 2012 23:37:10 +0000 (15:37 -0800)]
LU-1049 script: lc_net doesn't parse output correctly

Port the second fix from bz=23234.
The output of ping contains two IP addresses, and the regex
matching picks up both of them. The fix is to add
-m 1 to stop at the first match.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Iaefe42918b4587f88a8a0cb39cf9afb2a82021ba
Reviewed-on: http://review.whamcloud.com/2105
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: hongchao.zhang <hongchao.zhang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 years agoLU-874 ldlm: Fix ldlm_bl_* thread creation
Jinshan Xiong [Fri, 3 Feb 2012 19:12:46 +0000 (11:12 -0800)]
LU-874 ldlm: Fix ldlm_bl_* thread creation

Always create a new ldlm_bl_ thread when all threads
are busy, not just after returning from sleep.

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: Ife23dd09694e26d11d49572bc8bb0a2c0b2d3eee
Reviewed-on: http://review.whamcloud.com/2088
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-874 osc: prioritize writeback pages
Jinshan Xiong [Fri, 3 Feb 2012 19:11:37 +0000 (11:11 -0800)]
LU-874 osc: prioritize writeback pages

When a lock is being canceled, we should prioritize the pages it
covers that have already been submitted by the page writeback daemon;
otherwise, this client may be evicted because there has been no
active IO for that lock for a long time.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I2f914df0204375a51f4a7565a75640e9bb3c6d19
Reviewed-on: http://review.whamcloud.com/2087
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>