Whamcloud - gitweb
fs/lustre-release.git
Oleg Drokin [Thu, 3 Nov 2011 03:41:09 +0000 (23:41 -0400)]
LU-820 Increase LU_CDEBUG_LINE so that osc page messages fit.

At 255, LU_CDEBUG_LINE does not quite fit the rather long osc-page
message printed by osc_page_print.
As a result we get annoying "does not end in newline" console messages
when this happens.

Doubling LU_CDEBUG_LINE certainly should be enough.

Change-Id: Iec8635d6578e192d3b33643c4b1dab1dae2be6b4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1642
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Bruce Korb [Tue, 6 Dec 2011 15:41:38 +0000 (07:41 -0800)]
LU-883 commit: commit-msg rewrite

 1. do global initializations in function.
    collect the read only and state monitoring initializations
    to one place where they can be easily identified.

 2. Move the signoff and changeid code to functions.
    Have them call a "ck_wrapup" function to validate
    the processing state for doing the summary/changeid/etc.
    change-id: may now appear before signed-off-by:

 3. Move the wrap up validations to the first-time-through-wrapup
    code in ck_wrapup.

 4. add a function for handling innocuous tag lines.
    Only call "ck_wrapup".

 5. Remove all $LINE echoing to the end of the case statement.
    Use "break" and/or "continue" to bypass line echo.

 6. use [ ${#foo} -gt 0 ] in preference to [ $foo ].
    This scriptlet will echo "no":
      $ f='! -z' ; [ $f ] && echo yes || echo no
    You could quote "$f", but directly checking length more
    correctly represents the intent.

 7. Removed conditions for setting HAS_LAST_BLANK=true because
    the conditions were written to be *ALWAYS* true.
      foo=false
      [ $foo ] && echo hi there
    will always echo out, "hi there".

 8. eliminate the convoluted and slightly wrong state management
    for trying to identify a "diff" block.  Instead, just read
    in the next line in that case clause and either break out
    of the read loop or echo out both of the lines.

 9. Do not fork and exec a bunch of programs when ${#LINE} will
    directly tell you how many characters are in the line.

10. do default line processing in a function, too.  Makes regex
    code selection much easier to understand.  Also allowed
    me to remove the ";&" case statement fall-through construct.
    Removed $HAS_SIGNOFF check since $IS_WRAPPING_UP is tested
    upon function entry.

11. emacs editor hints.

12. TODO: read the first two lines outside of the loop and
    verify that they are up to snuff.  Then remove a bunch of
    checks scattered about the code.  Line counting, etc., etc.
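
Items 6, 7 and 9 above can be reproduced in a few lines of POSIX shell.
This is a minimal sketch for illustration, not code taken from the
commit-msg script; the helper name is_set is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from commit-msg): succeeds when the argument is
# a non-empty string. Checking the length keeps [ ] from parsing the value
# itself as a test expression.
is_set() {
    [ ${#1} -gt 0 ]
}

f='! -z'
# Item 6: unquoted, $f expands to the two words "!" and "-z", so the test
# becomes [ ! -z ], i.e. "not (the string -z is non-empty)", which is false.
[ $f ] && echo "unquoted test: yes" || echo "unquoted test: no"
is_set "$f" && echo "length test: yes" || echo "length test: no"

# Item 7: any non-empty word is true inside [ ], including the word "false".
foo=false
[ $foo ] && echo "hi there"

# Item 9: ${#LINE} gives the character count without forking wc(1).
LINE='LU-883 commit: commit-msg rewrite'
echo "${#LINE}"
```

The length test treats the value purely as data, which is why it gives
the intended answer for strings that happen to look like test operators.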

Signed-off-by: Bruce Korb <bruce_korb@xyratex.com>
Change-Id: I30a4a9b77b82af9f9b965823487001cbd5c28230
Reviewed-on: http://review.whamcloud.com/1764
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Aurelien Degremont [Fri, 4 Nov 2011 22:01:59 +0000 (23:01 +0100)]
LU-810 changelog: Fix hsm_get_cl_xxx() helpers

Fix helpers for extracting information from HSM changelog records.

HSM records in the changelog contain additional information compared to
standard records. This includes the HSM event, error code and some flags.
Helper macros exist to help user-space tools, like Policy Engine, to
extract them. Those helpers were broken; this fixes them.

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: I7ba1e6a3ec7635b646f7a2dfa8173bf90529fbd9
Reviewed-on: http://review.whamcloud.com/1651
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Niu Yawei [Thu, 15 Sep 2011 02:37:54 +0000 (19:37 -0700)]
LU-847 quota: New quota tests to exercise accounting

Three tests are added to sanity-quota.sh:
- basic disk usage tracking for user & group;
- disk usage transfer for user & group;
- disk usage across restart for user & group;

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I605370db52f07875f725f48d7b0ce1bab9af80b3
Reviewed-on: http://review.whamcloud.com/1380
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Bobi Jam [Sun, 2 Oct 2011 08:08:56 +0000 (16:08 +0800)]
LU-651 osc: suppress message in can_merge_pages()

Throttle messages when adjacent brw pages cannot be merged because
their OBD_BRW_NOQUOTA flags differ.

Change-Id: I22ce6f8807e2541d3e6b3c9631f60faa36baa81a
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1328
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Gregoire Pichon [Wed, 7 Sep 2011 14:55:04 +0000 (16:55 +0200)]
LU-663 kernel: Some arch do not have NUMA features anymore

Some architectures, especially x86_64, no longer have cpu_to_node()
defined as a macro or node_to_cpumask() exported by the kernel.

The cpu_to_node() routine is defined either as a macro, as an inline
routine using another exported symbol, or as an exported symbol.
In any case, the kernel has provided this routine since at least
version 2.6.12.

The node_to_cpumask() routine has been replaced by cpumask_of_node()
for x86 architectures since kernel version 2.6.30.

The set_cpus_allowed() routine is not defined if
CONFIG_CPUMASK_OFFSTACK=y since kernel version 2.6.32.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: If81269f403f888d4cde89c6fda5a8d7e10ea70b0
Reviewed-on: http://review.whamcloud.com/1345
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
yangsheng [Mon, 5 Dec 2011 16:17:17 +0000 (00:17 +0800)]
LU-506 kernel: FC15 - blkdev_get_by_dev() used instead of open_by_devnum().

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I9d633ba5c3004fd23de9522ebc8089792b96ed2c
Reviewed-on: http://review.whamcloud.com/1800
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Nikitas Angelinas [Mon, 5 Dec 2011 22:31:12 +0000 (22:31 +0000)]
LU-717 ldiskfs: MRP-222 Replace sysname with nodename in MMP

sysname holds "Linux" by default, i.e. what appears when doing a
"uname -s"; nodename should be used to print the machine's hostname,
i.e. what is returned when doing a "uname -n" or "hostname", and what
gethostname(2)/sethostname(2) manipulate, in order to notify the
administrator of the node which is contending to mount the
filesystem.
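
The difference between the two fields is easy to see from a shell. The
sketch below only illustrates the two uname fields and is not the MMP
code itself; the function name mmp_node_label is hypothetical.

```shell
#!/bin/sh
# uname -s prints the kernel name (sysname). It is the same on every Linux
# node, so it cannot tell the administrator which node holds the mount.
sysname=$(uname -s)

# uname -n prints the network node name (nodename), i.e. the hostname.
nodename=$(uname -n)

# Hypothetical helper: the label an administrator would want to see in an
# MMP message.
mmp_node_label() {
    uname -n
}

echo "sysname:  $sysname"
echo "nodename: $nodename"
```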

Andreas says this was introduced when porting the MMP patches from
RHEL5 to RHEL6, and then also pushed upstream to ext4; a patch for
upstream ext4 has already been submitted.

Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Change-Id: I207bf145d114a9981b5a6add4bbf92ca76f71840
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1419
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
yangsheng [Thu, 8 Dec 2011 00:10:20 +0000 (08:10 +0800)]
LU-506 kernel: FC15 - tcp_sendpage() uses struct sock.

Since 2.6.36 tcp_sendpage() uses 'struct sock' as 1st argument
instead of 'struct socket'.

Change-Id: I1641541e3e52f3f47c2dfeb63a5311f5f9a0634a
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1739
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Sat, 3 Dec 2011 14:21:46 +0000 (22:21 +0800)]
LU-887 mgc: prevent client IR with old server

Prevent an IR-enabled client from starting IR handling with an
IR-unaware MGS server.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9773eafa437358dbf4988e0fddd490b0daf59358
Reviewed-on: http://review.whamcloud.com/1798
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Minh Diep [Tue, 6 Dec 2011 05:27:54 +0000 (21:27 -0800)]
LU-900 test: ncli.sh cause test failure

When a variable assignment such as SRUN=$(which srun) returns empty, it
causes the test to fail.
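
One way to guard against an empty lookup is sketched below. This is an
illustration only, not the change made to ncli.sh; the helper name
find_tool is hypothetical, and it uses "command -v" rather than which.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a command, or nothing if it is
# not installed. It always returns success, so callers that check the
# exit status (or run under "set -e") do not abort on a missing tool.
find_tool() {
    command -v "$1" 2>/dev/null || true
}

SRUN=$(find_tool srun)
if [ -n "$SRUN" ]; then
    echo "srun found at $SRUN"
else
    echo "srun not installed, skipping srun-based steps"
fi
```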
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I8e5550a6c2e65dc7fbc51c6a6ef6e7e62260d88b
Reviewed-on: http://review.whamcloud.com/1801
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andrew Perepechko [Mon, 14 Nov 2011 15:10:59 +0000 (19:10 +0400)]
LU-841 tests: sanity.sh 27q does not create a testing directory

sanity.sh 27q does not create a testing directory which causes
ENOENT errors from "ONLY=27q bash sanity.sh"

Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Change-Id: I4b0b3839cc3cfd8cf643c7e4964cd0c22af39bea
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1698
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bruce Korb [Mon, 28 Nov 2011 20:02:21 +0000 (12:02 -0800)]
LU-881 utils: running bumped w/o lock

bumped_running accessed without lock

Multiple threads may increment shared_data->running until
the setting of bumped_running becomes visible.

Also, use "bool" -- First released in Issue 6 of POSIX spec:
   included for alignment with the ISO/IEC 9899:1999 standard

When inserting #include <stdbool.h>, it was noticed that stdio.h was
included twice.  Collected and ordered system headers so that
won't happen again.

Signed-off-by: Bruce Korb <bruce_korb@xyratex.com>
Change-Id: I45891e125221d29f72efee38580a56888c3a266f
Reviewed-on: http://review.whamcloud.com/1749
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Lai Siyao [Thu, 15 Sep 2011 06:45:13 +0000 (23:45 -0700)]
LU-685 obdclass: lu_object reclamation is inefficient

Put only non-referenced lu_object in lru list to speed up object
reclamation.

Change-Id: I711bf14c99d63567590fa131cc0c8e41b7cc0cbe
Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1384
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Michael MacDonald [Tue, 25 Oct 2011 13:57:16 +0000 (09:57 -0400)]
LU-792 lbuild-rhel5 should use redhat's SRPM repo

Download EL5 kernel .src.rpm packages from redhat's repo instead
of trying to use CentOS's often-outdated repo.

Change-Id: I0e622e6481f11941b5038b36c0fd074fe470c595
Signed-off-by: Michael MacDonald <mjmac@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1590
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Thu, 28 Jul 2011 14:12:38 +0000 (22:12 +0800)]
LU-542 Fix mdt xattr handler logic error

Record system ACL and user xattr change/deletion changelog.

Change-Id: I5aabf1879ec6e812361fe0d1b8255f84d0e817d6
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1158
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
yangsheng [Thu, 24 Nov 2011 05:12:28 +0000 (13:12 +0800)]
LU-506 FC15: unplug_fn removed since 2.6.39.

The old plugging interface no longer exists, so
disable the unplug_fn callback in this case.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2396b845e2f2ef158f8e21e031a31ac83f3a5b02
Reviewed-on: http://review.whamcloud.com/1738
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Niu Yawei [Wed, 13 Jul 2011 06:20:07 +0000 (23:20 -0700)]
LU-482 replay-dual test_0a failed

Running LVM on top of a VM hypervisor exposes a write caching and write
reordering problem in kernels prior to 2.6.33.
This might corrupt the journal or fs metadata and lead to a mount failure
in the first replay-dual test being run (typically test_0a).

Adding a 10-second delay after the mount should be enough for the
changes to be flushed to disk.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I68bcb298f94480b26e506f92b3c018530cfe6106
Reviewed-on: http://review.whamcloud.com/1157
Tested-by: Hudson
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Jinshan Xiong [Thu, 10 Nov 2011 17:30:39 +0000 (09:30 -0800)]
LU-834 echo_client: fix page_is_vmlocked

->cpo_is_vmlocked() should return -EBUSY if page is locked,
or -ENODATA otherwise. Fix it for echo_client implementation.

Change-Id: I863591463fefd0c2afba32d04b62b983103a4e3d
Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1686
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Thu, 27 Oct 2011 01:51:39 +0000 (09:51 +0800)]
LU-508 ldiskfs: fix race in ext4_ext_walk_space()

We should not access on-disk data (e.g. path->p_ext->*) without
locking.

To be fixed in mainline ext4 as well.

Port from: ORI-291
Author: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic9dfcc9a90766c79c0da5ca54e9fbb2f917865a6
Reviewed-on: http://review.whamcloud.com/1618
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Fri, 11 Nov 2011 23:26:33 +0000 (16:26 -0700)]
LU-553 build: improve checks for commit-msg

Improve the checks done by the commit-msg script.  It now ensures
that all the parts of the commit message are present.
- validate that the Change-Id: generated from 'git hash-object' is
  not empty, since this can happen if git is unhappy with the options
- check for only one Change-Id: line (multiple Signed-off-by: OK)
- describe the "component:" field better, with some examples
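
The first two checks in the list above could look roughly like the
sketch below. It is written under stated assumptions and is not the
actual commit-msg code; the function names count_change_ids and
check_change_id are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that start with
# "Change-Id:". grep -c exits non-zero when the count is 0, so "|| true"
# keeps the helper itself from failing.
count_change_ids() {
    grep -c '^Change-Id:' || true
}

# Hypothetical check: exactly one Change-Id: line, with a non-empty value.
# Any number of Signed-off-by: lines is accepted.
# (msg, n and id are plain globals; POSIX sh has no "local".)
check_change_id() {
    msg=$1
    n=$(printf '%s\n' "$msg" | count_change_ids)
    [ "$n" -eq 1 ] || return 1
    id=$(printf '%s\n' "$msg" | sed -n 's/^Change-Id:[[:space:]]*//p')
    [ ${#id} -gt 0 ]
}

msg='LU-553 build: improve checks for commit-msg

Change-Id: I15cb3690560400a591598997424cf79dee3a039d
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>'
check_change_id "$msg" && echo "Change-Id OK"
```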

If there was an error committing the message, save a copy to a
temporary file, so that it can be edited and re-used, instead of
having to recreate it each time, or fetch it from .git/COMMIT_MSG.

Add a simple regression test with good & bad commit messages, so
it is easier to verify that any changes made to the script will
continue to both detect errors, and pass valid commit messages.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I15cb3690560400a591598997424cf79dee3a039d
Reviewed-on: http://review.whamcloud.com/1688
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
yangsheng [Fri, 9 Sep 2011 15:27:24 +0000 (23:27 +0800)]
LU-506 FC15: ctl_name & strategy removed from ctl_table.

The 'ctl_name' & 'strategy' fields were removed from
'struct ctl_table'.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I5eaeb46234083433109a150ea80408c9b3f9962b
Reviewed-on: http://review.whamcloud.com/1332
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
yangsheng [Mon, 29 Aug 2011 04:22:22 +0000 (12:22 +0800)]
LU-506 FC15: move some headers to include/generated

In FC15 (2.6.40-4) the generated kernel headers have moved
from "include/linux" to "include/generated". Update configure
scripts and makefiles to include this new directory. In a
number of cases, Lustre code was including generated headers
directly, but this was not really needed.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Id2094d8318681aa5ea08b416dc764bcf03bd8595
Reviewed-on: http://review.whamcloud.com/1329
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
yangsheng [Fri, 9 Sep 2011 16:15:13 +0000 (00:15 +0800)]
LU-506 FC15: update shrinker to use shrink_control callback

Linux 3.0 memory pressure shrinker now takes "struct shrink_control" as
its argument instead of "nr_to_scan" and "gfp_mask". This was backported
to Fedora 15.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Id9f6a9e10efe785d2837d1ad73098d2808a2f076
Reviewed-on: http://review.whamcloud.com/1331
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Aurelien Degremont [Mon, 14 Nov 2011 15:25:57 +0000 (16:25 +0100)]
LU-857 security: Lustre client tolerates enforced SELinux.

Fix a bug which prevents Lustre clients from accessing directories when
SELinux is enforcing, on RHEL 6.
This patch does not add real SELinux support to Lustre, but it allows
SELinux to be activated for all other local filesystems without Lustre
misbehaving.

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: Ia6692c96a8439eb9239cb55ce32a1c54958241d1
Reviewed-on: http://review.whamcloud.com/1703
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Niu Yawei [Wed, 9 Nov 2011 03:23:52 +0000 (19:23 -0800)]
LU-745 kernel: ost-pools test_23 hung

It could be caused by a jbd2 bug which results in sleeping forever
in do_get_write_access().

http://www.spinics.net/lists/linux-ext4/msg24689.html

In do_get_write_access() we wait on BH_Unshadow bit for buffer to get
from shadow state. The waking code in journal_commit_transaction() has
a bug because it does not issue a memory barrier after the buffer is moved
from the shadow state and before wake_up_bit() is called. Thus a waitqueue
check can happen before the buffer is actually moved from the shadow state
and waiting process may never be woken. Fix the problem by issuing proper
barrier.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I44dce352babc6699cdacc00263bfd3f24538400c
Reviewed-on: http://review.whamcloud.com/1675
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
yangsheng [Thu, 25 Aug 2011 08:14:38 +0000 (16:14 +0800)]
LU-624 Kernel update [RHEL6.1 2.6.32-131.17.1.el6]

Change-Id: I82ef82e11f846840707f9f65ca72bcda8885c4e0
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1632
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Minh Diep [Wed, 2 Nov 2011 21:41:45 +0000 (14:41 -0700)]
LU-737 utils: check device name for digit

We need to check the whole string for digits,
not only the first character.
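
The idea can be sketched in shell as follows. The actual fix lives in
the utils code and is not shown in this log; the helper name
is_all_digits and the sample names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every character of $1 is a digit.
is_all_digits() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

# A first-character-only check would wrongly accept a name like "0abc".
for name in 12 0abc lustre-OST0000; do
    if is_all_digits "$name"; then
        echo "$name: numeric device index"
    else
        echo "$name: device name"
    fi
done
```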

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I316d931bb344d3e3fe5bb7d7a2454f200b637017
Reviewed-on: http://review.whamcloud.com/1641
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Li Wei [Fri, 30 Sep 2011 08:30:09 +0000 (16:30 +0800)]
LU-759 mdc: Clear rq_replay on error in mdc_enqueue()

When mdc_enter_request() fails (e.g., due to signals) in mdc_enqueue(),
the request is freed without any care about its rq_replay field.  For
rq_replay requests, this results in assertion failures in
__ptlrpc_free_req().  This patch adds a call to mdc_clear_replay_flag()
to make sure __ptlrpc_free_req()'s assumption is respected.

Change-Id: I2185066a9f47b3d9563d9e1a8989754ef2e2dcb4
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1518
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Thu, 10 Nov 2011 07:43:44 +0000 (15:43 +0800)]
LU-831 header: struct bit field should be unsigned type

Make sure struct bit fields are of unsigned type; otherwise, when they
are read from the proc interface, they would show up as a big number
equivalent to -1.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib252b4e89ce6b3f898e2da11a60de9aa9201119c
Reviewed-on: http://review.whamcloud.com/1685
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Andreas Dilger [Wed, 9 Nov 2011 22:53:12 +0000 (15:53 -0700)]
LU-835 build: skip generated files in .gitignore

Skip files automatically generated during builds on FC-13.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1cff9a351137145bb87a89460d0a1b54fae29e2c
Reviewed-on: http://review.whamcloud.com/1681
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 27 Oct 2011 04:03:33 +0000 (08:03 +0400)]
LU-795 osd api: Commit callback per transaction

- ability to add commit callback per transaction in addition to
  per-device hooks. Now it is much simpler if only commit callback
  is needed.
- rewrite commit callbacks for last_commit and new_client, add commit
  callback in seq manager
- cleanup not-needed code: old commit callbacks, txn_keys
- remove osd od_env_for_commit environment and env param from commit
  callbacks
- use th_sync to mark sync operations

Change-Id: If5f8f2a6d3cd2f3e77fd13c802213a181043a2d7
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1621
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Thu, 3 Nov 2011 14:39:36 +0000 (10:39 -0400)]
LU-691 Fix OST index errors in test suite - sanity 133c defect

Several tests run do_facet ost, which assumes ost is ost0, which does not
exist according to the way the test suite works. This patch addresses
several areas where the wrong ost index is used. This patch also introduces
some shorthand functions to get OST properties from the index. Those
functions ensure ost1, ost10 and ost100 are seen as different, which is not
always the case.
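
The ost1/ost10/ost100 problem is typical of prefix or substring
matching. A minimal sketch of exact matching follows; the helper names
ost_index and facet_in_list are hypothetical and are not the functions
added by this patch.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric index from a facet name such as
# "ost10". Prints nothing and fails if the name is not "ost" plus digits.
ost_index() {
    idx=${1#ost}
    [ "$idx" != "$1" ] || return 1      # no "ost" prefix
    case $idx in
        ''|*[!0-9]*) return 1 ;;        # nothing, or non-digits, after it
    esac
    echo "$idx"
}

# Hypothetical helper: exact match of a facet in a space-separated list.
# A substring match (for example grep ost1) would also hit ost10 and ost100.
facet_in_list() {
    for f in $2; do
        [ "$f" = "$1" ] && return 0
    done
    return 1
}

facet_in_list ost1 "ost10 ost100" || echo "ost1 is not in the list"
```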

Change-Id: Ic31224794563964a3415d24abeebce9dacceb686
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/1425
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Refs: 2.1.52, 2.1.52.0, v2_1_52_0
Oleg Drokin [Sat, 12 Nov 2011 05:23:50 +0000 (00:23 -0500)]
Updated version to 2.1.52

Change-Id: I68a6934d997921a82670272cf30abdcc25d6a575
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Fan Yong [Fri, 11 Nov 2011 02:18:39 +0000 (10:18 +0800)]
ORNL-3 mntopt: consider low-layer options for MDT ACL flags

Currently, the MDT layer enables ACL support by default without checking
whether the low-layer (ldiskfs) enables ACL support or not, which causes
unnecessary ACL data traffic on the network and through the MDS stack.

So MDT should communicate with the low-layer before setting ACL flags.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I804f10bf486745ddd3b23b89e959dfd585589ac0
Reviewed-on: http://review.whamcloud.com/1211
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Mon, 7 Nov 2011 03:34:41 +0000 (22:34 -0500)]
LU-769 ptlrpc: Do not miss pending signals in ptlrpc_set wait

conf_sanity test 23a highlighted a problem in ptlrpc_set_wait logic,
if we enter there with a signal pending and the import is not FULL,
there is no way to interrupt such a set because we block signals
all the time. Enabling signals all the time is not an option either.
Waiting until import reconnects is questionable too since it might
never come up after all (like in the test 23a).
So for the solution we will just manually mark the set as interrupted
after the initial wait.

Change-Id: Iaa3e356e971b4f75fd7f21cc579c85f7487719a0
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1657
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
yangsheng [Thu, 3 Nov 2011 18:25:42 +0000 (02:25 +0800)]
LU-762 ldiskfs: don't drop directory nlink to 0

When landing the nlink patch for ext4, for an unknown reason the
logic in ext4_dec_count() was changed from the ext3 version of
the patch. It now temporarily drops nlink to 0 and then, if the
inode is a directory with nlink == 0, increases nlink again.

Instead, only drop nlink if it is larger than 2.

Change-Id: Ieeff3e45daea56f502848f9c2b0fb04f0a9d2b6d
Author: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1644
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Minh Diep [Fri, 4 Nov 2011 23:12:57 +0000 (16:12 -0700)]
LU-780 test: improve parallel-scale to support hyperion run

We need to add support for srun/slurm, and a few tests
from the hyperion-sanity script that has been used for hyperion
testing.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I7f1baa0c99980ad9001436911d23f1030aa7d0fe
Reviewed-on: http://review.whamcloud.com/1615
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander.Boyko [Mon, 7 Nov 2011 15:22:26 +0000 (18:22 +0300)]
LU-826 mdd: Fix for mdd_lov_create_finish memory free

In lov/lov_pack.c, lov_packmd() uses OBD_ALLOC_LARGE to allocate
memory for the lov_mds_md object (lmpp), but mdd_lov_create_finish()
uses OBD_FREE to free it. This bug doesn't affect the current
version, but may become relevant in the future.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: Ic37cb72022b9aac02368b11f370cbaad0c730e7c
Reviewed-on: http://review.whamcloud.com/1659
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jinshan Xiong [Thu, 20 Oct 2011 18:34:23 +0000 (11:34 -0700)]
ORNL-8 IR: minor fixes to IR patches

Summary of fixes:
1. ORNL-13: add comments to mgs_handle_fslog_hack();
2. ORNL-14: typo and accurate error/debug messages;
            minor fixes in recovery-small.sh;

Change-Id: I6317067eb6250faf5df21be719c82de17b0f4cc9
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1582
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Fan Yong [Tue, 8 Nov 2011 02:28:54 +0000 (10:28 +0800)]
ORNL-7 replace statahead hacking with its own cache

The original statahead hacks the dcache: it inserts a dentry into the
dcache and associates it with an inode without holding the parent's
lock, which breaks the VFS layer synchronization mechanism.

The new statahead does not build a dentry; it just pre-fetches the
inode's attributes and related ldlm locks and caches them in a small
statahead cache against the parent's inode. The statahead sponsor can
search this small cache, much as it would search the dcache, for what
it wants. On a cache hit, the statahead sponsor itself builds the
dentry, associates it with the inode, and inserts it into the dcache.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I6948f8a438a938c51563468d775e676e4185e580
Reviewed-on: http://review.whamcloud.com/1208
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Bobi Jam [Tue, 25 Oct 2011 10:12:34 +0000 (18:12 +0800)]
LU-774 test: wait till space from previous tests is released

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf4933f9b40c74b99bf763eb896680a4e0c942ff
Reviewed-on: http://review.whamcloud.com/1588
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-482 test-framework: Ensure dirty cache is flushed before barrier
Oleg Drokin [Sun, 6 Nov 2011 19:06:40 +0000 (14:06 -0500)]
LU-482 test-framework: Ensure dirty cache is flushed before barrier

With certain backend devices like LVM with older kernels the data
in dirty cache cannot be propagated all the way to the block device
with a single sync as there are multiple non-cooperating layers.

So convert such sync calls into triple syncs.

Change-Id: If82e25223a277ec165d150b0f5f960ff845af9b0
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1656
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
12 years agoLU-773 tests: sanity:test_105b failure
Jinshan Xiong [Fri, 21 Oct 2011 18:07:30 +0000 (11:07 -0700)]
LU-773 tests: sanity:test_105b failure

The root cause of this issue is that activating osc in test sanity 104a
wasn't successful. We should wait for the recovery to finish.

Change-Id: I940419bfb1f579c4a0233b7439ac1f459ee584ad
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1542
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-601 mdd: Fix transaction credits
Bobi Jam [Tue, 23 Aug 2011 03:34:06 +0000 (11:34 +0800)]
LU-601 mdd: Fix transaction credits

* mdd_create()/mdd_create_data() may need to delete orphan objects on
  OSTs, so we need to reserve enough transaction credits for llog
  records.
* mdd_attr_set() may write lov llogs.
* orphan_object_destroy() will also write a llog record, so we need to
  reserve a credit for it as well.
* add credits for the changelog record.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5124d2f368e2ff794b2b2b8194bec86f63e971cf
Reviewed-on: http://review.whamcloud.com/1276
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-778 o2iblnd: Add rdma_create_id() compatibility macro
Ned Bass [Wed, 19 Oct 2011 21:47:50 +0000 (14:47 -0700)]
LU-778 o2iblnd: Add rdma_create_id() compatibility macro

As of RHEL6.2 kernel 2.6.32-204.el6, rdma_create_id() requires a
queue-pair type as a fourth argument.  This was previously inferred
from the rdma_port_space argument.  Add an autoconf test to detect
whether the fourth argument is expected and a compatibility macro
that discards the QP type argument if the 3-argument version of
rdma_create_id() is present.

Change-Id: Idb668e1f059954ecc994ad59b366d54da8b82dc8
Signed-off-by: Ned Bass <bass6@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1556
Tested-by: Hudson
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-389 update wirecheck for master
Fan Yong [Thu, 27 Oct 2011 17:46:32 +0000 (01:46 +0800)]
LU-389 update wirecheck for master

Drop unused mds_body/mds_remote_perm definition and wirecheck.
Add missing wirecheck for mdt_xxx, ost_xxx, lu_xxx, and so on.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I858a02b0b5de8680cc8315e38aeeca271c26e9ad
Reviewed-on: http://review.whamcloud.com/985
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-26 prevent call md_set_lock_data() repeatly
Fan Yong [Thu, 3 Nov 2011 16:40:32 +0000 (00:40 +0800)]
ORNL-26 prevent call md_set_lock_data() repeatly

md_set_lock_data() is called from different functions in llite, and
may be called more than once for the same <inode, lock> pair. This is
harmless for correctness, but it costs some performance and should be
avoided.

Drop the dentry flag "DCACHE_LUSTRE_INVALID" only when we hold the
"MDS_INODELOCK_LOOKUP" lock.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I4dbe206af77ba6a619f3268c238ce98ac7aef4c0
Reviewed-on: http://review.whamcloud.com/1224
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-686 conf-sanity test 52 fails due to missing mount point
Hongchao Zhang [Tue, 1 Nov 2011 05:44:15 +0000 (13:44 +0800)]
LU-686 conf-sanity test 52 fails due to missing mount point

Create the mount point before mount

Change-Id: I6c813f5040e4636386eabb640526f3d1072f484b
Signed-off-by: HongChao Zhang <hongchao.zhang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1379
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-620 llite: add delete_from_page_cache and remove_from_page_cache check
Bobi Jam [Wed, 21 Sep 2011 10:17:13 +0000 (18:17 +0800)]
LU-620 llite: add delete_from_page_cache and remove_from_page_cache check

Later 2.6.32 kernels use the memory cgroup feature and do not export
truncate_complete_page, but do export delete_from_page_cache or
remove_from_page_cache. We need to use them properly in the patchless
client code.

Change-Id: I33e3e7c32b548866ee77753ef8a8193c814d0ecb
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1399
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-000 libcfs: allow debug to be set to numeric -1
Andreas Dilger [Fri, 21 Oct 2011 21:45:17 +0000 (15:45 -0600)]
LU-000 libcfs: allow debug to be set to numeric -1

Don't warn if debug is set to "-1", which should always mean
"enable all debugging".

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1503d2f5776043e9a4ec3bc4af710ae35ca97960
Reviewed-on: http://review.whamcloud.com/1577
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-723 Enhance lustre/ldiskfs build system
Brian Behlendorf [Wed, 20 Apr 2011 23:46:54 +0000 (16:46 -0700)]
LU-723 Enhance lustre/ldiskfs build system

Enhance the lustre/ldiskfs build system so it is more robust, flexible,
and consistent with the lustre/zfs build system.  This change is being
made in the interest of standardizing the infrastructure around backend
filesystems.

This change does not affect the current behavior of the --with-ldiskfs,
--enable-ext4, or --with-ldiskfsprogs configure options.  However, it
does remove the obsolete --with-ldiskfs-inkernel configure option, which
was only used by LLNL.  It also adds the --with-ldiskfs-obj configure
option, which improves flexibility, and the --enable-ldiskfs-build
configure option to support building against the lustre-ldiskfs-devel
package.  The behavior of these options is consistent with their zfs
counterparts; see commit 8c7266c for further details.

  --enable-ext4           enable ldiskfs build using ext4
  --enable-ldiskfs-build  enable ldiskfs configure/make
  --with-ldiskfs=path     set path to ldiskfs source
  --with-ldiskfs-obj=path set path to ldiskfs objects
  --with-ldiskfsprogs     use alternate names for ldiskfs-enabled e2fsprogs

Sample ./configure results when building lustre and ldiskfs using
the kernel-devel and kernel-debuginfo-common packages.

checking whether to enable ldiskfs... yes
checking ldiskfs source directory... /home/behlendo/src/git/lustre/ldiskfs
checking ldiskfs object directory... /home/behlendo/src/git/lustre/ldiskfs
checking ldiskfs module symbols... Module.symvers
checking ldiskfs source release... 3.3.0
checking whether to use ext3 or ext4 source... ext4
checking ext4 source directory...  /usr/src/debug/.../fs/ext4
checking whether to build ldiskfs... yes
checking for /home/behlendo/src/git/lustre/ldiskfs/configure... yes
checking for /usr/src/debug/.../fs/ext4/dir.c...  yes
checking for /usr/src/debug/.../fs/ext4/file.c...  yes
checking for /usr/src/debug/.../fs/ext4/inode.c...  yes
checking for /usr/src/debug/.../fs/ext4/super.c...  yes
checking if ext4_ext_walk_space() takes i_data_sem... yes
checking if LDISKFS_SINGLEDATA_TRANS_BLOCKS takes sb as argument... yes
checking if ldiskfs_discard_preallocations defined... yes
checking if ldiskfs_ext_insert_extent needs 5 arguments... yes

In the context of this change additional cleanup has been done and
all of the ldiskfs specific code relocated to lustre-build-ldiskfs.m4.

Note that this change moves us closer to supporting patchless Lustre
servers with ldiskfs.  Once the remaining kernel patches for Lustre
are dropped you will be able to build Lustre using the distribution
provided kernel-devel and kernel-debuginfo-common packages.

This change also incorporates ORI-340, commit f604951, which ensures
that the Module.symvers file will cleanly include the symbols for all
enabled Lustre backends.  While the only backend supported by master
right now is ldiskfs, this brings the master and orion branches into
sync in this regard.

Change-Id: I6f13f266944ec6967f4d7705a30b83ab8e577b15
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1566
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-22 ptlrpc: more comment for multi-threaded ptlrpcd
Fan Yong [Wed, 2 Nov 2011 04:41:09 +0000 (12:41 +0800)]
ORNL-22 ptlrpc:  more comment for multi-threaded ptlrpcd

Explain how the workload is shared between ptlrpcd partners.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I72711a93af321e43e6dbbbc52b427060be47f808
Reviewed-on: http://review.whamcloud.com/1638
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: hongchao.zhang <hongchao.zhang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-362 Fix error path in lu_kmem_init().
Mikhail Pershin [Wed, 19 Oct 2011 19:55:04 +0000 (23:55 +0400)]
LU-362 Fix error path in lu_kmem_init().

When the echo client fails to set up, memory leaks were noticed. They
are caused by missing error handling in some functions.

- Free all caches if lu_kmem_init() failed
- fix error handling in cl_global_init() and ccc_global_init()
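
The "free all caches on failure" pattern can be sketched as below. This
is an illustrative Python model rather than the kernel code:
init_caches() and the create/destroy callbacks are hypothetical
stand-ins for the cache setup and teardown calls.

```python
def init_caches(descriptors, create, destroy):
    """Create every cache, or none: undo partial work on failure."""
    created = []
    try:
        for desc in descriptors:
            created.append(create(desc))
    except Exception:
        # Error path: free everything created so far, newest first.
        for cache in reversed(created):
            destroy(cache)
        raise
    return created


live = set()  # tracks caches that currently exist


def create(name):
    if name == "bad":
        raise MemoryError(name)  # simulate an allocation failure
    live.add(name)
    return name


try:
    init_caches(["a", "b", "bad"], create, live.discard)
except MemoryError:
    pass
print(sorted(live))
```

After the failed call nothing is left in `live`, which is the property
the fix restores for the caches.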

Change-Id: Ide9e7ad6d40f99a7fbb4330ba63b168cc408356f
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/586
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-462 Don't alloc/free client data for self export
Mikhail Pershin [Fri, 21 Oct 2011 12:38:29 +0000 (16:38 +0400)]
LU-462 Don't alloc/free client data for self export

Self export doesn't need client data and ldlm initialization.
The patch uses uuid comparison to determine the self export.

Change-Id: Id26ef90e9857e4c1d3a0e7a3756eaf67607890d6
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1574
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-791 obdfilter: Don't clear OBD_MD_FLFLAGS mistakenly
Niu Yawei [Tue, 25 Oct 2011 11:31:11 +0000 (04:31 -0700)]
LU-791 obdfilter: Don't clear OBD_MD_FLFLAGS mistakenly

Instead of arbitrarily setting oa->o_valid to
OBD_MD_FLID | OBD_MD_FLGROUP in filter_handle_precreate(), the
assignment should be changed to "|=" to keep the OBD_MD_FLFLAGS set in
filter_precreate().

Otherwise, the client will not be aware that the OST is running out of
space, and lov_create() will wait for objects forever in that case.
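
The difference between "=" and "|=" here can be shown with a small
Python sketch. The bit values are placeholders chosen for the example
(the real definitions live in the Lustre headers), and
precreate_reply() is a hypothetical helper, not a Lustre function.

```python
# Placeholder bit values for illustration only.
OBD_MD_FLID = 0x1
OBD_MD_FLFLAGS = 0x2
OBD_MD_FLGROUP = 0x4


def precreate_reply(o_valid, overwrite):
    """Return o_valid after the ID and group bits are marked valid."""
    if overwrite:
        # Old behaviour: plain assignment drops every bit set earlier.
        return OBD_MD_FLID | OBD_MD_FLGROUP
    # Fixed behaviour: "|=" keeps bits such as OBD_MD_FLFLAGS.
    return o_valid | OBD_MD_FLID | OBD_MD_FLGROUP


earlier = OBD_MD_FLFLAGS  # set earlier, e.g. to carry an out-of-space flag
old = precreate_reply(earlier, overwrite=True)
new = precreate_reply(earlier, overwrite=False)
print(bool(old & OBD_MD_FLFLAGS), bool(new & OBD_MD_FLFLAGS))
```

With plain assignment the flags bit is lost before the reply goes out,
so the receiver has no way to see it.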

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I84be8dde59dbb2829cd800e10b7aa6f4402b7e56
Reviewed-on: http://review.whamcloud.com/1589
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: hongchao.zhang <hongchao.zhang@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-28 recovery: rework extend_recovery_timer()
Jinshan Xiong [Thu, 27 Oct 2011 05:52:43 +0000 (23:52 -0600)]
ORNL-28 recovery: rework extend_recovery_timer()

extend_recovery_timer() is used to adjust the timeout value of a
recovering target. The original implementation had a problem: it
stopped the target from firing the timer again in the version recovery
case.

Change-Id: I815a15fb5d3104e52a189eed1529c58d7a8d03b9
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1620
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-611 utils: add restripe option to lfs_migrate
Andreas Dilger [Fri, 19 Aug 2011 11:21:24 +0000 (05:21 -0600)]
LU-611 utils: add restripe option to lfs_migrate

Add the "-R" option to have lfs_migrate restripe a migrated file
instead of keeping the original striping.  This is useful if some
directory got the wrong striping and a bunch of files were created
with the wrong striping.

Avoid possible confusion between the lfs_migrate and lfs setstripe
command-line options.  For now, deprecate the old "-c" option, since
it is redundant in any case.

Change-Id: I3a39bad93ef5c079678c65960e53d22e51431df3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1265
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-532 mdt: improve xattr ctime warning message
Andreas Dilger [Thu, 28 Jul 2011 22:13:53 +0000 (16:13 -0600)]
LU-532 mdt: improve xattr ctime warning message

Print out which xattr is not getting OBD_MD_FLCTIME set so that it
is possible to track down what code path on the client is failing.

Change-Id: I1918d2e8e0a1e03d8437846e823bca9df6f89b48
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1161
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoRevert "LU-462 Don't alloc/free client data for self export"
Oleg Drokin [Mon, 31 Oct 2011 06:45:50 +0000 (02:45 -0400)]
Revert "LU-462 Don't alloc/free client data for self export"

This introduced a memory leak problem

This reverts commit 140178844e5c0e4f3cfed8199800e39bf7082cd9

Change-Id: I558da8c44e08f77e77f7d1fe79da892a579992c3
Reviewed-on: http://review.whamcloud.com/1631
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-22 general ptlrpcd threads pool support
Fan Yong [Sat, 15 Oct 2011 16:09:53 +0000 (00:09 +0800)]
ORNL-22 general ptlrpcd threads pool support

Originally, there were two ptlrpcd threads on each node to serve all
async RPCs on the node: one ptlrpcd for BRW, the other for everything
else. This load model cannot keep up with the growing number of async
RPCs processed on current large SMP nodes.

So we introduce a ptlrpcd thread pool. Any ptlrpcd thread in the pool
can be shared by all async RPCs, such as async I/O, async glimpse
lock, statahead, etc. The async RPC sponsor can affect the system
load model by specifying a load policy when it pushes the RPC into the
ptlrpcd queue. In addition, the pool supports flexible binding policies
that bind some ptlrpcd threads to CPU cores to reduce cross-CPU data
traffic, while allowing other ptlrpcd threads to be scheduled freely
on any CPU core to try to guarantee that async RPCs are processed in
time.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Icc0bd689df73b6863cc9adc544c3654c046cb8bd
Reviewed-on: http://review.whamcloud.com/1184
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-25 process dir page hash collision
Fan Yong [Fri, 28 Oct 2011 07:03:26 +0000 (15:03 +0800)]
ORNL-25 process dir page hash collision

If a dir page has a hash collision with others, remove that page from
the cache after use so that it is not found unexpectedly later.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I15ff85e5233248944d77a9d93292d8690e1a715f
Reviewed-on: http://review.whamcloud.com/1234
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-770 tests: MRP-250 sanity.sh fails in CLIENTONLY mode
Andriy Skulysh [Tue, 18 Oct 2011 12:08:37 +0000 (15:08 +0300)]
LU-770 tests: MRP-250 sanity.sh fails in CLIENTONLY mode

Skip sanity.sh tests which require remote access to
lustre servers.

Signed-off-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Change-Id: Ia853a8cf95bf6bf638e391aab30654a06a3b6589
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1537
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-554 add gnilnd awareness to LNet
Wally Wang [Wed, 24 Aug 2011 20:22:47 +0000 (13:22 -0700)]
LU-554 add gnilnd awareness to LNet

This allows servers on any network to talk to gnilnd routers.
This is 2.1 version of the Oracle 23884 attachment 31892.

Change-Id: I96777551b0caa50021ebb32755caaa01623ea97d
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/1285
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-715 lov: fix procfs reporting for qos values
Matt Ezell [Mon, 17 Oct 2011 14:43:22 +0000 (10:43 -0400)]
LU-715 lov: fix procfs reporting for qos values

When writing to
/proc/fs/lustre/lov/<fsname>-mdtlov/{qos_prio_free,qos_threshold_rr},
the values read back are often one less than the values written.
This happens because internally the value is stored as a number from
0-255 but accessed by the user as 0-100. Integer truncation in the
storage and retrieval stages often causes the value to read back
lower. Adding 255 in an internal step causes the bit-shift to "round
up".
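
A worked example of the truncation, as a Python sketch. The exact
expressions are assumptions made for illustration (scale by 256 on
write, shift right by 8 on read); the function names are not taken
from the lov code.

```python
def store(percent):
    """Convert a user value (0-100) to the internal scale (assumed formula)."""
    return (percent << 8) // 100


def read_truncating(stored):
    """Old read path: the shift truncates, so values can come back low."""
    return (stored * 100) >> 8


def read_rounding_up(stored):
    """Fixed read path: adding 255 before the shift rounds up."""
    return (stored * 100 + 255) >> 8


for pct in (50, 51, 99):
    s = store(pct)
    print(pct, s, read_truncating(s), read_rounding_up(s))
```

Under these assumptions, writing 51 stores 130, which reads back as 50
with the truncating shift and as 51 once 255 is added before shifting.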

Signed-off-by: Matt Ezell <ezell@nics.utk.edu>
Change-Id: I9050aadb55bfa82d14b94a78e399d315249ac48f
Reviewed-on: http://review.whamcloud.com/1532
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-700 osc: Tally BRW_{READ,WRITE}_BYTES by bytes transferred.
John Hammond [Wed, 21 Sep 2011 19:00:39 +0000 (14:00 -0500)]
LU-700 osc: Tally BRW_{READ,WRITE}_BYTES by bytes transferred.

Call ptlrpc_lprocfs_brw() in brw_interpret() rather than
osc_send_oap_rpc() and tally by the number of bytes transferred rather
than the number requested.

Change-Id: Ia7191972d9671f01d942a46eba069191f130f516
Signed-off-by: John L. Hammond <jhammond@tacc.utexas.edu>
Reviewed-on: http://review.whamcloud.com/1402
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-672 MRP-213 don't panic on geting version for non existent fid
Mikhail Pershin [Thu, 22 Sep 2011 13:53:38 +0000 (17:53 +0400)]
LU-672 MRP-213 don't panic on geting version for non existent fid

lctl getobjversion can be called for a file which was removed from
the MDT but still exists in cache. That produces an assertion in the
mdd code:
LustreError: 20825:0:(mdd_object.c:2474:mdd_version_get())
ASSERTION(mdd_object_exists(mdd_obj)) failed

Author: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Change-Id: I7442e7aee26736741482c158ee3713df9796c953
Reviewed-on: http://review.whamcloud.com/1365
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-723 ldiskfs: remove ext3 RHEL5 kernel series
Andreas Dilger [Wed, 26 Oct 2011 07:34:34 +0000 (01:34 -0600)]
LU-723 ldiskfs: remove ext3 RHEL5 kernel series

Remove the old ext3 RHEL5 kernel patch series.  This has been
deprecated since Lustre 1.8.6 in favour of the ext4 RHEL5 series.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1e09ff432f2e970446c3b43fb92f0c1a988159ae
Reviewed-on: http://review.whamcloud.com/1603
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-20 ldiskfs: remove spurious warning message
Andreas Dilger [Fri, 21 Oct 2011 10:06:16 +0000 (04:06 -0600)]
LU-20 ldiskfs: remove spurious warning message

Remove the spurious warning message that we added for the missing
extents option since the MDS never gets the extents option enabled.

ldiskfs_fill_super: extents feature not enabled on this
filesystem, use tune2fs

Remove the obsolete patch that removed this from the patch we added.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9d896258b89ec3db528db59094572daac8dc207b
Reviewed-on: http://review.whamcloud.com/1572
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-334 llite: Add LPROC_LL_OSC_{READ,WRITE}.
John Hammond [Wed, 21 Sep 2011 22:17:25 +0000 (17:17 -0500)]
LU-334 llite: Add LPROC_LL_OSC_{READ,WRITE}.

This patch adds quick and easy access to aggregate OSC BRW (bytes on
the wire) statistics through llite.  To accomplish this, we pass the
number of bytes transferred by a successful request to
cl_req_completion(), and on to ll_stats_ops_tally() by way of the
ccc_req_completion().

Signed-off-by: John L. Hammond <jhammond@tacc.utexas.edu>
Change-Id: I4b464c27f6dc87fcc19d35b2bc45dc7cb9bf7741
Reviewed-on: http://review.whamcloud.com/1341
Reviewed-by: Richard Henwood <rhenwood@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-646 port bz23485 (clarification of lustre fsync behavior)
Lai Siyao [Tue, 30 Aug 2011 03:44:31 +0000 (20:44 -0700)]
LU-646 port bz23485 (clarification of lustre fsync behavior)

Add directory fsync operation.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: I9915bab70291c503ff1462328d5f2fbcff2b700e
Reviewed-on: http://review.whamcloud.com/1309
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-553 build: fix commit-msg line width check
Andreas Dilger [Wed, 19 Oct 2011 20:05:33 +0000 (14:05 -0600)]
LU-553 build: fix commit-msg line width check

Fix the calculation of the commit message line width to skip the
trailing linefeed character.

Don't use the "git hash_object -t commit" option, since this makes
some versions of Git unhappy and generate an empty Commit-Id string.

Skip diffstat output from "commit -v" when validating commit comment.

Validate the Change-Id: line has the proper ID format.
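
The two checks can be sketched in Python as follows. This is not the
commit-msg hook itself: the width limit passed in is hypothetical, and
the Change-Id pattern ("I" plus 40 lowercase hex digits) is inferred
from the Change-Id lines in this log.

```python
import re

# Assumed format: "I" followed by 40 lowercase hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id: I[0-9a-f]{40}$")


def line_width(line):
    """Width of a line without its trailing linefeed."""
    return len(line.rstrip("\n"))


def too_long(line, limit):
    """True if the visible text exceeds the limit (limit is hypothetical)."""
    return line_width(line) > limit


def valid_change_id(line):
    """True if the line is a Change-Id: trailer in the assumed format."""
    return CHANGE_ID_RE.match(line.rstrip("\n")) is not None


msg_line = "x" * 70 + "\n"
print(too_long(msg_line, 70))
print(valid_change_id("Change-Id: I0bcfdedbe6d33fd5f81381ca33862a10d6b41b38\n"))
print(valid_change_id("Change-Id: 12345\n"))
```

Counting the linefeed would make a line of exactly the limit look one
character too long, which is the off-by-one this commit removes.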

Reported-by: Bobi Jam <bobijam@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bcfdedbe6d33fd5f81381ca33862a10d6b41b38
Reviewed-on: http://review.whamcloud.com/1553
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-629 ptlrpc: fix _debug_req to print opc/status
Andreas Dilger [Wed, 24 Aug 2011 21:17:53 +0000 (15:17 -0600)]
LU-629 ptlrpc: fix _debug_req to print opc/status

The 2.x _debug_req() function was changed in bug 16359/commit 5467a86021
to avoid problems with accessing unswabbed message buffers. Unfortunately,
this broke the printing of many/most _debug_req() messages, because it
didn't check whether swabbing was actually needed in the first place.

Also, in ptlrpc_expire_one_request() some extra debugging information was
added in bug 21636/commit 368689640 but never removed, making this common
message overly verbose.

Fix _debug_req() so that it prints opcode/flags/status, unless the
ptlrpc_body _needs_ to be swabbed, but isn't.  Also print out more
useful identifiers for the nodes (the obd_name and NID instead of
the connection UUID).  This removes some of the added verbosity from
ptlrpc_expire_one_request(), and most of the rest was already being
printed out (deadline, current, etc).

Change-Id: I88a78486becd19f5b38f5578e5cc30e649564908
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1286
Tested-by: Hudson
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-753 obdfilter: improper LASSERT in filter_commitrw_write()
Niu Yawei [Sat, 22 Oct 2011 08:16:55 +0000 (16:16 +0800)]
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

In rare cases fsfilt_commit_wait() will wake up and return after the
transaction has finished its work and updated j_commit_sequence but
before the commit callbacks have run. This improperly triggers the
LASSERT(oti->oti_transno <= obd->obd_last_committed).

We should just wait for the commit callback to finish instead of
putting an improper LASSERT here.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ibd5add8d352d2e7598be49b0bf8fa37d40ce6e1f
Reviewed-on: http://review.whamcloud.com/1583
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-24 skip unnecessary client obj hash lookup
Fan Yong [Sun, 9 Oct 2011 14:27:51 +0000 (22:27 +0800)]
ORNL-24 skip unnecessary client obj hash lookup

The client-side object is initialized when the inode is established
in memory. The object is inserted into a global hash table for
later use. The caller follows "lookup-alloc-lookup-insert". This is
the standard sequence, but it may be unnecessary in some cases. If it
is certain that only one caller will insert the newly established
object, then we can skip the two lookups and "alloc-insert" directly.
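
The two insertion sequences can be compared with a toy Python model.
ObjectTable and its methods are hypothetical and only count lookups;
they do not reflect the real object cache API or its locking.

```python
class ObjectTable:
    """Toy stand-in for the global object hash table (hypothetical)."""

    def __init__(self):
        self.objects = {}
        self.lookups = 0

    def lookup(self, fid):
        self.lookups += 1
        return self.objects.get(fid)

    def find_or_create(self, fid):
        """Standard path: lookup, allocate, look up again, then insert."""
        obj = self.lookup(fid)
        if obj is not None:
            return obj
        new_obj = {"fid": fid}      # allocate
        obj = self.lookup(fid)      # re-check for a racing insert
        if obj is not None:
            return obj
        self.objects[fid] = new_obj
        return new_obj

    def create_exclusive(self, fid):
        """Shortcut: the caller guarantees it is the only inserter."""
        new_obj = {"fid": fid}
        self.objects[fid] = new_obj
        return new_obj


table = ObjectTable()
table.find_or_create("fid-1")
standard_lookups = table.lookups
table.create_exclusive("fid-2")
print(standard_lookups, table.lookups - standard_lookups)
```

The shortcut is only safe under the stated guarantee of a single
inserter; otherwise the second lookup is what catches a racing insert.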

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Ie173c57627b2e5b4ed9b8a93f368d88ba8e54c31
Reviewed-on: http://review.whamcloud.com/1225
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-15 slow IO with read-intense application
yangsheng [Wed, 17 Aug 2011 18:43:49 +0000 (02:43 +0800)]
LU-15 slow IO with read-intense application

Align the readahead extent to 1M after it is trimmed by ra_max_pages.

Change-Id: I4102d2fe956fd01457949f0eb7c63654b5c2d095
signed-off-by: Wang Di <di.wang@whamcloud.com>
signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1255
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jay@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-555 ll_have_md_lock() optimization to accelerate multiple bits locks
jcl [Sat, 30 Jul 2011 12:54:01 +0000 (14:54 +0200)]
LU-555 ll_have_md_lock() optimization to accelerate multiple bits locks

Change-Id: Ie300ad2abf285a7413553da2a86cc74216cbae7d
Signed-off-by: jcl <jacques-charles.lafoucriere@cea.fr>
Reviewed-on: http://review.whamcloud.com/1170
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-488 ptlrpc_connection_put() LASSERT(!cfs_hlist_unhashed(&conn->c_hash))
Lai Siyao [Fri, 8 Jul 2011 16:35:47 +0000 (09:35 -0700)]
LU-488 ptlrpc_connection_put() LASSERT(!cfs_hlist_unhashed(&conn->c_hash))

The connection hash may be rehashed while ptlrpc_connection_put() is
being called, so ASSERT that &conn->c_refcount > 1 instead.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Iec6d35419e0c4d8497bd0b84c6210abc8eb23882
Reviewed-on: http://review.whamcloud.com/1074
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: zhen liang <liang.zhen@live.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-513 Make cfs_wait_event_interruptible_exclusive really exclusive
Christopher J. Morrone [Wed, 20 Jul 2011 00:43:15 +0000 (17:43 -0700)]
LU-513 Make cfs_wait_event_interruptible_exclusive really exclusive

Change-Id: Iea0556a006f8826f8597824131fb5110a848c434
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1118
Tested-by: Hudson
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoIncrease lustre version to 2.1.51 2.1.51 2.1.51.0 v2_1_51_0
Oleg Drokin [Tue, 25 Oct 2011 21:41:45 +0000 (17:41 -0400)]
Increase lustre version to 2.1.51

Change-Id: I91b5cc909aae9434349ec1ddbd16fd1eedadc412
Signed-off-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-28: remove lock contention in reconnecting
Jinshan Xiong [Mon, 19 Sep 2011 02:58:21 +0000 (19:58 -0700)]
ORNL-28: remove lock contention in reconnecting

target_handle_connect() used to grab obd::obd_recovery_task_lock to update
obd_next_recovery_transno and obd_connected_clients. In this patch, I revised
this piece of code by:
- change obd_connected_clients to a cfs_atomic_t
- grab obd_recovery_task_lock only if ocd_transno is less than
  obd_next_recovery_transno

In this way, target_handle_connect() doesn't grab any global lock.

Change-Id: I971897de82566e239fe750cf95f2c3f1646325bc
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1293
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years agoLU-735 Remove fixme() macro
Andreas Dilger [Mon, 3 Oct 2011 21:16:28 +0000 (14:16 -0700)]
LU-735 Remove fixme() macro

This NULL export warning is the only consumer of the fixme()
macro.  Change it to a CWARN and remove the fixme() macro.

Change-Id: I194a9d92369a6a4dce35701fe261631420f9894d
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1472
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-707 Fix FID string quoting in replay-vbr
Li Wei [Fri, 9 Sep 2011 06:05:56 +0000 (14:05 +0800)]
LU-707 Fix FID string quoting in replay-vbr

The FID string printed by the "path2fid" lfs command was passed to
do_facet() without any quoting.  Depending on the names of the files in
the remote shell's working directory, the FID string could be replaced,
by the shell's filename expansion, with a list of matching file names.
This patch adds the necessary quoting to prevent the remote shell from
doing filename expansion on the FID string.

Change-Id: I0bc28eb7f29df717a661a4c1bce8c918117b997c
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1409
Tested-by: Hudson
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
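
The LU-707 failure mode above can be reproduced locally with a minimal
sketch. This is not the code from the patch: the FID value and the scratch
directory are made up for illustration, and a local subshell stands in for
the remote shell reached through do_facet(). A FID string is a valid shell
bracket expression, so when it is left unquoted it matches any
single-character file name drawn from the characters between the brackets.

```shell
# Hypothetical FID value; real ones come from "lfs path2fid".
fid='[0x200000400:0x1:0x0]'

# Scratch directory holding a file named "0". As a glob pattern, the FID
# is one bracket expression that matches the single character "0".
workdir=$(mktemp -d)
touch "$workdir/0"

# Unquoted: the shell performs filename expansion on the FID string.
unquoted=$(cd "$workdir" && echo $fid)

# Quoted: the FID string is passed through untouched.
quoted=$(cd "$workdir" && echo "$fid")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$workdir"
```

In the real test the string crosses one more shell on the remote node, so
the quoting has to survive that extra round of parsing as well.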
13 years agoLU-443 LNet: Only squawk when md->start is NULL on non-zero length
Wally Wang [Wed, 24 Aug 2011 18:08:58 +0000 (11:08 -0700)]
LU-443 LNet: Only squawk when md->start is NULL on non-zero length

Only squawk when md->start is NULL on non-zero length.
The md->start == NULL check prevents anyone from creating an ME/MD
with no buffer. Such ME/MDs are used as backstop buffers to generate events
when traffic has exceeded the local buffer space.

Change-Id: I1389b0a45d3f8ff548f6400c66b30a69bafb4f39
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/989
Tested-by: Hudson
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-448 add lst stat --count
Wally Wang [Thu, 22 Sep 2011 23:19:28 +0000 (16:19 -0700)]
LU-448 add lst stat --count

This adds lst stat --count, making it easier to script up data
collection for performance tests.

The patch is from Oracle bug 22638 attachment 33001

Change-Id: I46aa1de12ea8e6c2d5eafff8ce5c31d7d618c06f
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/1004
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-357 racer test cleanup
hongchao.zhang [Thu, 23 Jun 2011 03:24:31 +0000 (11:24 +0800)]
LU-357 racer test cleanup
 1. Increase the test time to 300s (900s for SLOW).
 2. Fix the problem of racer.sh calling itself recursively.

Change-Id: I91ac7e5c42ed5bc98b3a647c30d7e37af0573f09
Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/905
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
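
Item 1 of LU-357 above amounts to a simple duration choice, sketched here
under stated assumptions. It is not the code from racer.sh: the function
name racer_duration and the DURATION variable are hypothetical, and the
sketch assumes SLOW is set to "yes" for slow runs.

```shell
# racer_duration SLOW_FLAG
# Print the run time in seconds: 900 when the flag is "yes", else 300.
# (Hypothetical helper; the "yes" value for SLOW is an assumption.)
racer_duration() {
    if [ "${1:-no}" = "yes" ]; then
        echo 900
    else
        echo 300
    fi
}

# Usage: pick the duration from the SLOW environment variable.
DURATION=$(racer_duration "${SLOW:-no}")
echo "racer will run for ${DURATION}s"
```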
13 years agoLU-743 conf-sanity: test_46a failure
Jinshan Xiong [Fri, 7 Oct 2011 18:27:19 +0000 (11:27 -0700)]
LU-743 conf-sanity: test_46a failure

This failure occurs because the client had not yet seen the newly added
OSTs, so it hit a problem when decoding the lsm: the number of OSTs
exceeded the tgt count on the client side.

Change-Id: I49ee22379734375a2a92f7495f6d849e47db0909
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1494
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-28: Set recovery timeout correctly
Jinshan Xiong [Fri, 26 Aug 2011 00:30:50 +0000 (17:30 -0700)]
ORNL-28: Set recovery timeout correctly

Make sure the recovery window uses the timeout value from the Lustre
config; the current implementation is wrong because it disregards the
timeout configuration.

Change-Id: I0cb0d777569cccd96f30da11834c6e333a673816
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1292
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-28: Solve reconnecting race between IR and SR
Jinshan Xiong [Wed, 24 Aug 2011 23:03:53 +0000 (16:03 -0700)]
ORNL-28: Solve reconnecting race between IR and SR

If a connect request is in flight when the client import is notified
by IR, set the corresponding conn uuid to a higher prio and set
imp_force_verify so that the import reconnects immediately if an RPC
timeout happens.

Change-Id: I77e799e1f12b49f3c0271585c1ce812d16dc1ef6
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1291
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-27: Cancel on completion lock on the MGS
Jinshan Xiong [Thu, 6 Oct 2011 20:35:02 +0000 (13:35 -0700)]
ORNL-27: Cancel on completion lock on the MGS

We should cancel the recover/config LCK_EX locks as soon as they are
granted, to accelerate the enqueue process.

Also, it doesn't make sense to add mgc recover/config locks to the
LRU list because these kinds of locks would never be canceled
voluntarily. Restore the LDLM_FL_NO_LRU flag and apply it to mgc
locks.

Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Change-Id: I369b57ca4780b0bfa07d33b4423b468481263ade
Reviewed-on: http://review.whamcloud.com/1261
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-14: Configuration and Unit test cases
Jinshan Xiong [Tue, 4 Oct 2011 04:35:24 +0000 (21:35 -0700)]
ORNL-14: Configuration and Unit test cases

This patch adds some procfs entries and test cases for imperative recovery.

Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Change-Id: Ied49b622208157a74e8547e4609d61ee5041f624
Reviewed-on: http://review.whamcloud.com/1219
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
13 years agoLU-687 clio: retry if fault page was truncated
Jinshan Xiong [Wed, 5 Oct 2011 05:54:48 +0000 (22:54 -0700)]
LU-687 clio: retry if fault page was truncated

In vvp_io_fault_start, if a page was truncated we should retry the fault in
ll_fault() instead of returning -EFAULT, because that causes a fake OOM.

Change-Id: Ia2ba40ca4ecd67170e6c2eb81ddc0ee34d9379a8
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1453
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-699 tests: replay-dual test_1 failure
Jinshan Xiong [Wed, 19 Oct 2011 23:54:26 +0000 (16:54 -0700)]
LU-699 tests: replay-dual test_1 failure

The root cause of this problem is data corruption in lustre_disk_data.
ldd_write() only waits for the log to be committed, NOT for the data to
actually be written to disk. So in this test case, if the data has not
reached the disk when we mark the device read-only, the data is lost and
remounting the MDT fails.

Change-Id: I8c46a925ce2df3d2db69b1e7fd8813eb0668401d
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1557
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
13 years agoORNL-13: NonIR client support
Jinshan Xiong [Thu, 11 Aug 2011 01:19:15 +0000 (18:19 -0700)]
ORNL-13: NonIR client support

This task adds NonIR support. NonIR clients are clients that don't know
the imperative recovery protocol, which means they won't be notified when
a target restarts.

To support NonIR clients, the MGS has to record how many NonIR clients each
file system has, and track those clients. This way, if there are NonIR
clients for a specific file system, the MGS should tell the restarting
target to disable imperative recovery; otherwise, these `old' clients would
easily be evicted.

Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Change-Id: I3725c66b74d702aa213644ee9a6f89d59b8a8083
Reviewed-on: http://review.whamcloud.com/1218
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-462 Don't alloc/free client data for self export
Mikhail Pershin [Mon, 27 Jun 2011 17:02:14 +0000 (21:02 +0400)]
LU-462 Don't alloc/free client data for self export

Self export doesn't need client data and ldlm initialization.

Change-Id: I31307d2212e3d11c79f1ab215edbb840c3cfb8c6
Signed-off-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1023
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoORNL-23 avoid unnecessary CMD lock contention in LMV
nasf [Mon, 5 Sep 2011 04:48:00 +0000 (12:48 +0800)]
ORNL-23 avoid unnecessary CMD lock contention in LMV

There are some redundant lock operations in the LMV layer which are only
useful under the old CMD mode. We should avoid those lock operations to
reduce unnecessary lock contention.

Change-Id: I2ad1ba728da39ce69b0e521ab3a992dd2b253f2e
Signed-off-by: nasf <yong.fan@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1226
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 years agoLU-515 canonicalize the devices names
Wally Wang [Wed, 20 Jul 2011 03:50:18 +0000 (20:50 -0700)]
LU-515 canonicalize the devices names

Perform a readlink on the device name if path is /dev/disk/by-id...
See Oracle bug 24487.

Change-Id: I964b224d764677d60064901f4238ae77b9cfb5ea
Signed-off-by: Wally Wang <wang@cray.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1120
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
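
The canonicalization described in LU-515 above can be sketched as follows.
This is an illustration, not the patch itself: the helper name
canonicalize_dev and the BY_ID_DIR override are hypothetical, and it assumes
a readlink that supports -f (as in GNU coreutils).

```shell
# Directory whose entries are symlinks to the real device nodes.
# BY_ID_DIR is a hypothetical override so the helper can be tried
# without real block devices.
BY_ID_DIR=${BY_ID_DIR:-/dev/disk/by-id}

# canonicalize_dev PATH
# Resolve by-id symlinks to the underlying device node and print the
# result; any other path is printed unchanged.
canonicalize_dev() {
    case "$1" in
        "$BY_ID_DIR"/*) readlink -f "$1" ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Usage: a path outside the by-id directory passes through as-is.
canonicalize_dev /dev/sda1
```

Resolving only the by-id paths keeps every other device name exactly as
the caller wrote it.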
13 years agoLU-760 obdecho: initialization and awk problem
Jinshan Xiong [Thu, 13 Oct 2011 23:27:08 +0000 (16:27 -0700)]
LU-760 obdecho: initialization and awk problem

echo_client registers the obd type and then initializes the kmem cache. This
is problematic because echo_key will be accessed immediately after the obd
type is registered, which will cause a kernel fault.

Also, an awk problem is fixed. If the output buffer is longer than
1024 bytes, awk runs into problems.

Change-Id: Ief287e8f4eeb6a39bc336e7a9f5c21e921b79a58
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1521
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
13 years agoLU-389 update packet-lustre.c for master
nasf [Thu, 14 Jul 2011 09:51:30 +0000 (17:51 +0800)]
LU-389 update packet-lustre.c for master

Drop unused mds_body/mds_rec_xxx/mds_status_req. Add missing mdt_body/mdt_rec_xxx.

Change-Id: Ic530541f58d12c721fa6efd0bc9a1096a15d7e33
Signed-off-by: nasf <yong.fan@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/995
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
13 years agoORNL-10: Basic IR implementation
Jinshan Xiong [Fri, 19 Aug 2011 23:44:50 +0000 (16:44 -0700)]
ORNL-10: Basic IR implementation

To support imperative recovery, a target status table is defined on the MGS
for each file system. When a target registers itself with the MGS, the MGS
updates this table accordingly.

One important field in the status table is the target NID. Clients use this
NID information to locate the server where the target lives. By transferring
this NID to clients, the MGS lets them learn of target restarts earlier. This
is the so-called imperative recovery: the MGS notifies clients to do recovery
imperatively instead of relying on timeout-based standard recovery.

To implement imperative recovery, clients are asked to cache a NID table, which
contains the location information of all servers. Clients need to hold a read
mode ldlm plain lock - recover lock - to cache this table. Whenever the MGS
wants to change this table, it will enqueue an EXCL recover lock so that all
clients will be notified of this change. Clients will then request a new read
recover lock and query the MGS for NID table updates.

Change-Id: I3b38ba142b810df507805b71972feeb1bade1ac2
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1217
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>