Whamcloud - gitweb
Johann Lombardi [Thu, 4 Feb 2010 22:24:44 +0000 (23:24 +0100)]
b=21961 add changelog entry
Nathan Rutman [Thu, 4 Feb 2010 21:58:08 +0000 (13:58 -0800)]
b=21961 (17914) ignore trailing -mdc when determining index number
a=jinshan.xiong
i=nathan
i=h.huang
Elena Gryaznova [Thu, 4 Feb 2010 16:30:52 +0000 (19:30 +0300)]
b=21990 add parallel-scale EXCEPT list
i=Minh.Diep
Andreas Dilger [Thu, 4 Feb 2010 12:37:29 +0000 (13:37 +0100)]
b=21966 avoid divide-by-zero in lprocfs_rd_import()
i=johann
Johann Lombardi [Thu, 4 Feb 2010 08:15:55 +0000 (09:15 +0100)]
b=16909 simplify MDT/OST service start message
i=nathan
i=adilger
Johann Lombardi [Thu, 4 Feb 2010 08:12:44 +0000 (09:12 +0100)]
b=21686 add changelog entry
Landen [Thu, 4 Feb 2010 07:57:49 +0000 (15:57 +0800)]
b=16909 Simplify MDT/OST service start message
i=nathan
i=adilger
Elena Gryaznova [Wed, 3 Feb 2010 21:58:09 +0000 (00:58 +0300)]
b=18489 test_116, test_118k cleanup
i=Andrew.Perepechko (panda)
i=Andreas.Dilger
Elena Gryaznova [Wed, 3 Feb 2010 21:37:08 +0000 (00:37 +0300)]
b=21953 use separate failover counter for each facet
i=Mikhail.Pershin (tappro)
Johann Lombardi [Wed, 3 Feb 2010 18:46:33 +0000 (19:46 +0100)]
add 1.8.3 section in the lnet changelog
Andrew Perepechko [Wed, 3 Feb 2010 17:13:59 +0000 (20:13 +0300)]
b=21147 call build_lqs only from generic_quota_on
i=Johann Lombardi
i=ZhiYong Tian
Dmitry Zogin [Wed, 3 Feb 2010 16:54:11 +0000 (11:54 -0500)]
b=21259 "lfs check" is only allowed for root.
Code cleanup around obd_class_*() functions and sanity test for non-root lfs check
i=adilger
i=andrew.perepechko
yangsheng [Tue, 2 Feb 2010 15:15:37 +0000 (23:15 +0800)]
b=21632 Kernel update to OEL5.4 2.6.18-164.11.1.0.1.el5.
i=johann
Johann Lombardi [Wed, 3 Feb 2010 18:35:51 +0000 (19:35 +0100)]
bump version to 1.8.2.50
Johann Lombardi [Wed, 3 Feb 2010 18:34:07 +0000 (19:34 +0100)]
add changelog section for 1.8.3
Johann Lombardi [Tue, 2 Feb 2010 10:49:29 +0000 (11:49 +0100)]
Revert debug patch from b=21364
This reverts commit
818de83d3200ae48dae7096500ba0118b8f95976.
I inadvertently committed my debug patch.
Landen [Fri, 29 Jan 2010 07:13:42 +0000 (15:13 +0800)]
b=20970 add an additional barrier for write_disjoint
i=rread
i=grev
Elena Gryaznova [Thu, 28 Jan 2010 15:06:18 +0000 (18:06 +0300)]
b=21948 skip parallel grouplock test for NFSCLIENT mode
i=Johann
Dmitry Zogin [Wed, 27 Jan 2010 18:04:27 +0000 (13:04 -0500)]
b=21900 ost-pools test_25: FAIL: /mnt/lustre/d0.ost-pools/d25/file1 not allocated from OSTs 0.
Modify ost-pools test_25 to wait for MDS-OST connection to re-establish.
i=johann
Landen [Wed, 27 Jan 2010 07:48:03 +0000 (15:48 +0800)]
delete test_12 in sanity-quota.sh
root [Sat, 30 Jan 2010 07:02:05 +0000 (15:02 +0800)]
b=21686 fail the request if its obd_device is stopping
In ldlm_handle_enqueue, the request should be failed
if its obd_device has been marked as "fail" (obd_fail=1),
which is set during umount.
i=johann@sun.com
i=oleg.drokin@sun.com
Johann Lombardi [Tue, 2 Feb 2010 10:09:15 +0000 (11:09 +0100)]
Merge branch 'b1_8' of git@git.lustre.org:prime/lustre into b1_8
Terry Rutledge [Sat, 30 Jan 2010 20:57:52 +0000 (13:57 -0700)]
Forgot to update these when I updated the lustre/ChangeLog. Added date
and another supported version of OFED.
Johann Lombardi [Fri, 29 Jan 2010 00:53:20 +0000 (01:53 +0100)]
b=21364 debug patch
Johann Lombardi [Wed, 27 Jan 2010 14:39:49 +0000 (15:39 +0100)]
b=21815 lustre_hash_rehash_key() should use lh_read_unlock()
lh_read_lock() is no-op if rehash is disabled, so we should
use lh_read_unlock() in this function.
This should not have any consequence, but better to fix it.
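The pairing rule behind this fix can be illustrated with a toy model. The sketch below is not Lustre code: `sketch_hash`, `sketch_read_lock()`, `sketch_read_unlock()` and `sketch_raw_unlock()` are hypothetical names, and a plain counter stands in for the real lock. It only assumes what the message states, namely that the lock wrapper does nothing when rehash is disabled, and contrasts the matching wrapper with an unconditional release.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hash table with an optional rehash lock. */
struct sketch_hash {
        bool rehash_enabled;
        int  read_depth;        /* toy lock: outstanding read locks */
};

/* No-op when rehash is disabled, as the commit message describes. */
void sketch_read_lock(struct sketch_hash *h)
{
        if (h->rehash_enabled)
                h->read_depth++;
}

/* Matching wrapper: releases only what sketch_read_lock() took. */
void sketch_read_unlock(struct sketch_hash *h)
{
        if (h->rehash_enabled)
                h->read_depth--;
}

/* Unconditional release, shown only for contrast (an assumption). */
void sketch_raw_unlock(struct sketch_hash *h)
{
        h->read_depth--;
}

/* Walks both release paths with rehash disabled. */
void sketch_lock_selfcheck(void)
{
        struct sketch_hash h = { false, 0 };

        sketch_read_lock(&h);
        sketch_read_unlock(&h);
        assert(h.read_depth == 0);      /* wrappers stay balanced */

        sketch_read_lock(&h);
        sketch_raw_unlock(&h);
        assert(h.read_depth == -1);     /* released a lock never taken */
}
```

With the wrappers the counter stays balanced whether or not rehash is enabled, which is the symmetry the commit restores.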
Johann Lombardi [Wed, 27 Jan 2010 14:27:07 +0000 (15:27 +0100)]
b=21815 move assertion under write lock
Johann Lombardi [Wed, 27 Jan 2010 12:49:37 +0000 (13:49 +0100)]
b=21815 print more debug info in lustre_hash_exit when assertion fails
Vladimir V. Saveliev [Tue, 26 Jan 2010 16:12:40 +0000 (17:12 +0100)]
b=19405 do not flag a request as rq_replay for non replayable imports
i=ericm
i=robert
Johann Lombardi [Tue, 26 Jan 2010 16:04:06 +0000 (17:04 +0100)]
b=21906 LBUG doesn't print stack trace on sles9 because show_stack not exported
Johann Lombardi [Sat, 23 Jan 2010 00:36:25 +0000 (01:36 +0100)]
Revert "b=21097 fix md5sum error in metadata-updates.sh"
This reverts commit
89b5d6f0e40b35bcc93d6830568e823d67e8f364.
Johann Lombardi [Fri, 22 Jan 2010 23:03:01 +0000 (00:03 +0100)]
b=17682 fix time unit in message
Johann Lombardi [Fri, 22 Jan 2010 21:16:02 +0000 (22:16 +0100)]
b=21448 send recovery rpc ASAP
i=robert.read
i=tappro
Johann Lombardi [Fri, 22 Jan 2010 21:03:26 +0000 (22:03 +0100)]
b=21406 fix deadlock between kjournald2 and ost_io thread
i=adilger
i=girish
Calling clear_page_dirty_for_io() is no longer needed since
we are guaranteed that no dirty pages can be left in the page
cache by partial truncate. The problem is that
clear_page_dirty_for_io() can temporarily mark the page
as dirty in the radix tree, which can cause a deadlock
between jbd commit and bulk write handling.
Johann Lombardi [Fri, 22 Jan 2010 20:49:33 +0000 (21:49 +0100)]
b=17569 add force_over_16tb for rhel5/ext4
16TB is the next limit.
Johann Lombardi [Fri, 22 Jan 2010 15:26:27 +0000 (16:26 +0100)]
b=17569 remove force_over_8tb for rhel5/ext4 since it is now tested
Johann Lombardi [Fri, 22 Jan 2010 15:00:30 +0000 (16:00 +0100)]
b=21686 revert attach 25564 bug 19557
yangsheng [Fri, 22 Jan 2010 12:53:02 +0000 (20:53 +0800)]
b=21632 Update RHEL5.4 kernel to 2.6.18-164.11.1.el5.
Andrew Perepechko [Thu, 21 Jan 2010 12:08:28 +0000 (15:08 +0300)]
b=21147 fix unnecessary semaphore release in generic_quota_on
i=Johann Lombardi
Johann Lombardi [Thu, 21 Jan 2010 10:43:42 +0000 (11:43 +0100)]
add missing changelog entries
Rahul Deshmukh [Thu, 21 Jan 2010 07:42:38 +0000 (13:12 +0530)]
b=21595 jbd2/rhel5: don't call jbd callback with spinlock held
Since the callback is allowed to sleep (e.g. take a semaphore), we should
not hold any spinlocks when invoking it.
jbd2/sles11 is fixed already.
i=johann
i=girish
Dmitry Zogin [Wed, 20 Jan 2010 23:54:49 +0000 (18:54 -0500)]
b=21828 drop number of active requests when queued for recovery
Now that we take a reference on the original request instead of
making a copy of it for recovery, we need to drop the number of
active requests, or the queued requests will prevent all request
processing when they exceed (srv->srv_threads_running - 1).
i=nathan.rutman
i=tappro
Andrew Perepechko [Wed, 20 Jan 2010 20:06:38 +0000 (23:06 +0300)]
b=21826 refuse to invalidate operational quota files when they are in use
An attempt to invalidate operational quota files on the quota master is not actually permitted by the VFS (which returns -EPERM), but we should not depend on that and should return the error earlier.
i=Johann Lombardi
i=ZhiYong Tian
Rahul Deshmukh [Mon, 18 Jan 2010 21:49:40 +0000 (22:49 +0100)]
b=19742 fix llite fiemap interfaces
i=johann
llite can get fiemap requests through ioctl or directly
through the ->fiemap vfs inode's operation (newer kernel).
Unfortunately, both interfaces take different arguments,
so the purpose of this patch is to fix this.
Rahul Deshmukh [Mon, 18 Jan 2010 21:48:01 +0000 (22:48 +0100)]
b=19742 fix fiemap patches for rhel5
i=girish
i=andreas
James Simmons [Mon, 18 Jan 2010 17:36:07 +0000 (18:36 +0100)]
b=21370 sanity 27x: double the qos_maxage timeout
i=adilger
Johann Lombardi [Mon, 18 Jan 2010 15:50:39 +0000 (16:50 +0100)]
b=19742 add missing fiemap patches to rhel5 series
i=adilger
i=girish
Johann Lombardi [Mon, 18 Jan 2010 15:01:59 +0000 (16:01 +0100)]
b=21846 define lqs_key for quota lqs
i=adilger
i=landen
Johann Lombardi [Mon, 18 Jan 2010 14:53:27 +0000 (15:53 +0100)]
update supported kernels in changelog and which_patch
yangsheng [Mon, 18 Jan 2010 14:35:18 +0000 (22:35 +0800)]
b=20758 Update kernel to 2.6.16.60-0.42.8.
i=johann
i=landen
Johann Lombardi [Mon, 18 Jan 2010 10:10:33 +0000 (11:10 +0100)]
b=18690 disable rehash for quota
Quota can use a key of 0 for root and the rehash code
asserts on this. Disable rehashing for quota lqs for now.
Terry Rutledge [Sat, 16 Jan 2010 18:26:35 +0000 (11:26 -0700)]
Updated with correct version string for 1.8.2 RC1.
Terry Rutledge [Sat, 16 Jan 2010 18:25:01 +0000 (11:25 -0700)]
Updated for 1.8.2 RC1. Added release date.
Girish Shilamkar [Sat, 16 Jan 2010 17:34:06 +0000 (18:34 +0100)]
b=21564 add changelog entry
Girish Shilamkar [Sat, 16 Jan 2010 16:16:57 +0000 (21:46 +0530)]
b=21564 Print mmp_check_interval in kmmpd, fix sles10 & rhel5/ext3 too
The patch not only prints mmp_check_interval but also makes it possible
to abort mount operation in case it takes too long.
i=adilger
Johann Lombardi [Sat, 16 Jan 2010 14:39:22 +0000 (15:39 +0100)]
b=21574 more pinger fixes
i=oleg
i=andrew
- ptlrpc_update_next_ping(): don't postpone next ping when "soon"
is set and a ping request is already scheduled before the new
deadline.
- It is usually fine to extend the deadline for the next ping
since we are guaranteed that the pinger will wake up before
this new deadline and update its timer.
However, the purpose of ptlrpc_pinger_commit_expected() is to
schedule ping earlier. To support this, I've changed
ptlrpc_update_next_ping() to wake up the pinger if the new
ping deadline is before the pinger is supposed to wake up.
Johann Lombardi [Sat, 16 Jan 2010 14:27:17 +0000 (15:27 +0100)]
b=21574 PING_INTERVAL_SHORT should not postpone the next ping
i=oleg
i=andrew
Most of our tests run with obd_timeout=20s, so PING_INTERVAL=5s, while
PING_INTERVAL_SHORT=7s. ptlrpc_pinger_commit_expected() was actually
not intended to delay pings.
Although we would prefer to schedule the next ping after a
bit more than 5s (jbd commit time), using 5s instead of 7s
is not a big deal since we will only have to wait for 5
additional seconds in the worst case.
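The arithmetic in this message can be sketched as follows. This is an illustration, not the patch itself: `sketch_ping_interval()` and `sketch_ping_interval_short()` are hypothetical names, the quarter-of-obd_timeout rule is inferred from the 20s to 5s example above, the 1s floor is an added assumption, and capping the short interval at the normal one is an assumed shape for the fix.

```c
#include <assert.h>

/* Fixed "short" interval quoted in the commit message. */
#define SKETCH_SHORT_FIXED 7u

/* Assumption: normal interval is obd_timeout / 4, with a 1s floor. */
unsigned int sketch_ping_interval(unsigned int obd_timeout)
{
        unsigned int quarter = obd_timeout / 4;

        return quarter > 1u ? quarter : 1u;
}

/* Assumed fix: the short interval never exceeds the normal one. */
unsigned int sketch_ping_interval_short(unsigned int obd_timeout)
{
        unsigned int normal = sketch_ping_interval(obd_timeout);

        return normal < SKETCH_SHORT_FIXED ? normal : SKETCH_SHORT_FIXED;
}

/* Reproduces the numbers from the message for obd_timeout=20s. */
void sketch_ping_selfcheck(void)
{
        assert(sketch_ping_interval(20u) == 5u);
        /* An uncapped 7s would postpone the ping past the normal 5s. */
        assert(SKETCH_SHORT_FIXED > sketch_ping_interval(20u));
        /* Capped, the short interval can no longer delay it. */
        assert(sketch_ping_interval_short(20u) == 5u);
}
```

With a larger timeout such as 100s, the normal interval in this sketch is 25s and the short interval stays at 7s, so the cap only matters for small timeouts like those used in testing.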
Johann Lombardi [Sat, 16 Jan 2010 14:21:28 +0000 (15:21 +0100)]
b=21574 define ptlrpc_pinger_commit_expected for liblustre
i=oleg
i=andrew
Johann Lombardi [Sat, 16 Jan 2010 14:19:03 +0000 (15:19 +0100)]
b=21574 schedule ping asap instead of delaying it
i=oleg
i=andrew
The intent was to schedule a ping as soon as possible to know
sooner rather than later that the transno has been committed.
This is used by the async journal feature to unpin pages
in memory sooner.
Johann Lombardi [Sat, 16 Jan 2010 14:16:03 +0000 (15:16 +0100)]
b=20928 skip sanity 202 on ib since unaligned dio not supported by o2iblnd
Girish Shilamkar [Sat, 16 Jan 2010 08:32:41 +0000 (14:02 +0530)]
b=21564 Print mmp_check_interval in kmmpd
The patch not only prints mmp_check_interval but also makes it possible
to abort mount operation in case it takes too long.
i=adilger
Johann Lombardi [Sat, 16 Jan 2010 01:04:38 +0000 (02:04 +0100)]
add missing changelog entries
Johann Lombardi [Sat, 16 Jan 2010 00:21:12 +0000 (01:21 +0100)]
b=18399 add missing patch to sles11 series
Ed Giesen [Sat, 16 Jan 2010 00:15:25 +0000 (01:15 +0100)]
b=21097 fix md5sum error in metadata-updates.sh
Johann Lombardi [Sat, 16 Jan 2010 00:05:23 +0000 (01:05 +0100)]
bump version
Johann Lombardi [Fri, 15 Jan 2010 23:36:37 +0000 (00:36 +0100)]
fix another build issue
Johann Lombardi [Fri, 15 Jan 2010 16:37:42 +0000 (17:37 +0100)]
fix build issue
Dmitry Zogin [Fri, 15 Jan 2010 14:35:48 +0000 (09:35 -0500)]
b=21565 filter_last_id() NULL dereference
lprocfs_filter_rd_last_id() should check for the fully setup obd device,
before proceeding further.
i=johann
i=andrew.perepechko
Johann Lombardi [Fri, 15 Jan 2010 11:03:58 +0000 (12:03 +0100)]
b=11680 don't call LBUG if reading force_lbug
Should not happen because the permission is 0200,
but better to check.
Johann Lombardi [Fri, 15 Jan 2010 10:32:27 +0000 (11:32 +0100)]
b=18690 enable rehash for hash tables that are intended to use it
Christopher J. Morrone [Fri, 15 Jan 2010 00:27:17 +0000 (01:27 +0100)]
b=11680 Add /proc/sys/lnet/force_lbug
This patch adds a proc entry called force_lbug.
Brian J. Murrell [Thu, 14 Jan 2010 21:15:02 +0000 (16:15 -0500)]
b=19720 use min_t() to force comparison to unsigned
In older kernels num_online_cpus() is an int, and in newer
kernels it is an unsigned so force the comparison to unsigned
so that it's portable to both new and old kernels.
i=panda
i=whitebear
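A user-space sketch of the idea follows. `SKETCH_MIN_T`, `sketch_limit_from_int()` and `sketch_limit_from_unsigned()` are hypothetical names, not the kernel's macro or the patched Lustre code; the macro only mimics the behaviour of casting both operands to one explicit type before comparing, so the same call site works whether the CPU count arrives as an int or as an unsigned.

```c
#include <assert.h>

/*
 * Stand-in for a min_t()-style macro: both operands are cast to "type"
 * before the comparison. Unlike the kernel macro, this portable version
 * evaluates its arguments more than once, so pass side-effect-free values.
 */
#define SKETCH_MIN_T(type, x, y) \
        ((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Models an older kernel where the CPU count is a signed int. */
unsigned int sketch_limit_from_int(int online_cpus, unsigned int cap)
{
        return SKETCH_MIN_T(unsigned int, online_cpus, cap);
}

/* Models a newer kernel where the CPU count is already unsigned. */
unsigned int sketch_limit_from_unsigned(unsigned int online_cpus,
                                        unsigned int cap)
{
        return SKETCH_MIN_T(unsigned int, online_cpus, cap);
}

/* Both variants agree for the same inputs. */
void sketch_min_selfcheck(void)
{
        assert(sketch_limit_from_int(4, 8u) == 4u);
        assert(sketch_limit_from_unsigned(4u, 8u) == 4u);
        assert(sketch_limit_from_int(16, 8u) == 8u);
}
```

The explicit type keeps the comparison identical in both cases, at the cost of silently converting the signed value, which is acceptable here because a CPU count is never negative.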
yangsheng [Thu, 14 Jan 2010 13:55:18 +0000 (21:55 +0800)]
b=21411 Improvement for AT.
i=nathan
i=tappro
Johann Lombardi [Thu, 14 Jan 2010 14:20:02 +0000 (15:20 +0100)]
fix tiny nit from previous commit
Johann Lombardi [Thu, 14 Jan 2010 14:16:54 +0000 (15:16 +0100)]
fix build error on ia64/ppc64
Johann Lombardi [Thu, 14 Jan 2010 13:13:07 +0000 (14:13 +0100)]
fix changelog
Johann Lombardi [Thu, 14 Jan 2010 13:00:47 +0000 (14:00 +0100)]
quiet noisy error message when rehash is disabled
Rehash is sometimes disabled temporarily, so quiet
message printed on the console when cur_bits != max_bits.
Alexander.Zarochentev [Thu, 14 Jan 2010 08:56:57 +0000 (11:56 +0300)]
b=21716 Reduce memory consumption in directio utility
use one memory mapped file buffer in directio.c instead of two.
i=tappro
i=andrew.perepechko
Fan Yong [Thu, 14 Jan 2010 02:34:48 +0000 (10:34 +0800)]
b=20139 prevent parent thread from being killed before its child becomes a daemon
i=tappro
i=robert
root [Tue, 12 Jan 2010 23:12:31 +0000 (07:12 +0800)]
b=21471 fix race problem during recovery
During recovery, "class_unlink_export", "class_set_export_delayed"
and "target_queue_last_replay_reply" may race while updating
(increasing/decreasing) "obd_recoverable_clients" and "obd_delayed_clients",
causing the recovery to wait forever.
i=tappro@sun.com
i=yong.fan@sun.com
Johann Lombardi [Thu, 14 Jan 2010 10:59:00 +0000 (11:59 +0100)]
b=20020 don't shrink if no mds_body
i=andrew
i=dmitry
Fix bug found on lol when the group upcall returns EIDRM.
We incorrectly shrink the reply while there is no mds_body.
Elena Gryaznova [Wed, 13 Jan 2010 22:28:01 +0000 (01:28 +0300)]
b=19387 integrate LST into acc-sm
i=Maxim.Patlasov
i=He.Huang
new acc-sm test suite: LNET_SELFTEST
Brian Behlendorf [Wed, 13 Jan 2010 22:28:18 +0000 (23:28 +0100)]
b=21800 fix spurious message from shrink_slab reporting negative nr
i=johann
i=andrew
Elena Gryaznova [Mon, 11 Jan 2010 12:25:00 +0000 (15:25 +0300)]
b=19702 fix COUNT to work properly
i=Andrew.Perepechko
Elena Gryaznova [Wed, 13 Jan 2010 20:19:21 +0000 (23:19 +0300)]
b=20866 DEPS assignment needs quotes
o=Brian
i=grev
Elena Gryaznova [Wed, 13 Jan 2010 20:00:58 +0000 (23:00 +0300)]
b=20918 improve log warning
i=Brian
Nathan Rutman [Wed, 13 Jan 2010 17:58:56 +0000 (09:58 -0800)]
b=21746 compare full fsname when erasing config files for writeconf
i=breitz
i=brian
Alexander.Zarochentev [Sun, 10 Jan 2010 19:57:06 +0000 (22:57 +0300)]
b=20816 improve simulation of late reply
ignore obd_fail_timeout for ping replies.
i=robert.read
i=johann
i=tappro
Alexander.Zarochentev [Sun, 10 Jan 2010 19:56:26 +0000 (22:56 +0300)]
b=20816 fix replay-single test 67b
Exhausting precreation before testing delayed file creation on OST.
i=johann
i=robert.read
Brian J. Murrell [Sat, 9 Jan 2010 14:34:22 +0000 (09:34 -0500)]
b=21759 Miscellaneous build fixes
The message string given to fatal() cannot be split with line continuations
as you would strings elsewhere -- for whatever reason. So let's just unsplit
them for now.
Coding style fixups.
Adds a "--set-var" option to lbuild to set/override an environment variable.
This is mainly meant for lbuild testers.
Fix missing - from tar so that the --exclude parameters will be honoured.
Some stderr->stdout redirections to get output into the correct log.
i=wangyb
i=yangsheng
Brian J. Murrell [Sat, 9 Jan 2010 04:36:05 +0000 (23:36 -0500)]
b=20617 old-school builds sles9 too
A portion of the patch for this didn't seem to get committed causing
SLES9 builds to fail.
Johann Lombardi [Fri, 8 Jan 2010 22:46:06 +0000 (23:46 +0100)]
bump version to 1.8.1.60
Johann Lombardi [Fri, 8 Jan 2010 22:44:03 +0000 (23:44 +0100)]
Disable async journal commit & cancel lock before replay features by default
This reverts commit
d7bdeb27b53b12f8c567fa220f82e1ca2a10a470.
Brian J. Murrell [Fri, 8 Jan 2010 16:38:20 +0000 (11:38 -0500)]
b=21586 More stderr/stdout redirections
Just a few more redirections to get some commands' output into the
appropriate log files.
We should actually return the 255, not just assign it to an unused variable.
i=yangsheng
i=wangyb
brian [Fri, 8 Jan 2010 16:38:19 +0000 (11:38 -0500)]
b=20315 Use libexecdir
Use the more standard libexecdir for scripts.
i=adilger
o=Christopher Morrone
Brian J. Murrell [Fri, 8 Jan 2010 16:38:18 +0000 (11:38 -0500)]
b=21754 RPM version update fix
It seems that Suse will release an updated RPM without updating the
kernel inside. In doing so, the kernel and the RPM file name have
different specifications of the version.
This fix allows for that.
i=yangsheng
i=wangyb
Brian J. Murrell [Fri, 8 Jan 2010 16:38:17 +0000 (11:38 -0500)]
b=21757 Update per make oldconfig.
Make oldconfig always winds up removing CONFIG_SD_IOSTATS from the RHEL4
config file on i686, so we should remove it from the source.
Dmitry Zogin [Fri, 8 Jan 2010 16:08:32 +0000 (11:08 -0500)]
b=20247 Disabling printing D_NETERROR messages on the console.
i=he.huang
i=johann
i=adilger
Remove neterror from libcfs_printk since it is too chatty
and can flood the console.
Johann Lombardi [Fri, 8 Jan 2010 15:24:51 +0000 (16:24 +0100)]
b=20805 rate limit D_NETERR messages
i=isaac
i=liang
add CNETERR() macro that uses CDEBUG_LIMIT() for D_NETERR messages
Girish Shilamkar [Fri, 8 Jan 2010 09:35:31 +0000 (15:05 +0530)]
b=20301 Fix mkfs.lustre for 16TB+ LUNs
Patch by James Simmons
i=adilger
i=girish
Mounting 16TB LUNs failed due to three bugs in mkfs.lustre. This patch
fixes this.
Brian J. Murrell [Thu, 7 Jan 2010 19:47:38 +0000 (14:47 -0500)]
b=20617 Update old-school build with new API changes
The API changes that were part of the previous landing for bug
20617 require that the old_school build be updated as well.
i=wangyb
i=yangsheng