Whamcloud - gitweb
yangsheng [Wed, 25 Aug 2010 16:18:46 +0000 (00:18 +0800)]
b=13752 Remove 2.4 kernel definition of cfs_waitq_wait_event_interruptible_timeout()
i=issac
yangsheng [Wed, 25 Aug 2010 13:03:20 +0000 (21:03 +0800)]
b=21610 Add lbuild-sles11sp1 file.
i=johann
Elena Gryaznova [Wed, 25 Aug 2010 12:48:00 +0000 (16:48 +0400)]
b=23515 recovery_*_scale tests need more than 2 clients
i=Mikhail.Pershin
skip these tests by env for number of remote clients < 2
pravin [Wed, 25 Aug 2010 11:50:43 +0000 (17:20 +0530)]
b=20563 add llite mount option to generate 32bit ino v2
i=vitaly
i=andreas
This patch adds a mount option to generate 32-bit inode numbers, which
can be used for 32-bit application compatibility.
Fan Yong [Wed, 25 Aug 2010 05:15:41 +0000 (13:15 +0800)]
b=22935 keep reference count for "lli_sai" to prevent it from being released during "statahead_enter()" and "statahead_exit()"
i=eric.mei
i=di.wang
Jian Yu [Wed, 25 Aug 2010 02:42:23 +0000 (10:42 +0800)]
b=19151 use single-level test directory for large-scale test 3a
$tdir is a two-level directory which causes the mode setting issue
while running MPI programs. Let's use a single-level test directory
for large-scale test 3a.
i=grev
i=robert.read
Brian J. Murrell [Tue, 24 Aug 2010 21:53:17 +0000 (17:53 -0400)]
debian/patches and non-GA tags
Make the code to populate debian/patches work better with non-GA tags
such as v1_8_3_53.
Brian J. Murrell [Tue, 24 Aug 2010 21:53:16 +0000 (17:53 -0400)]
b=22967 s/LB_LINUX_CONFTEST/AC_LANG_CONFTEST/
A macro update in more recent releases of autoconf requires us to use
AC_LANG_CONFTEST instead of the LB_LINUX_CONFTEST that we currently
use. The problem with LB_LINUX_CONFTEST is that, as configure
determines capabilities and sets capability flags, those flags are not
used when compiling further conftest.c programs.
So for example if a macro determines if foo is available and then sets
FOO if it is, and then a test in a subsequent macro tries to use FOO,
it will find it undefined.
i=michael.macdonald
i=andreas.dilger
Dmitry Zogin [Tue, 24 Aug 2010 20:45:25 +0000 (16:45 -0400)]
b=22378 Correct MDS client stats
Move some of the MDS stats from obdfilter layer to MDS layer.
i=andreas.dilger
i=amoly.liu
Andrew Perepechko [Tue, 24 Aug 2010 20:35:54 +0000 (00:35 +0400)]
b=16890 llapi_quotactl man page update with additional explanation of igrace/bgrace/itime/btime
Dmitry Zogin [Tue, 24 Aug 2010 20:30:07 +0000 (16:30 -0400)]
b=22378 Correct MDS client stats
test_133 sanity.sh has been added
i=andreas.dilger
Nathan Rutman [Tue, 24 Aug 2010 17:02:06 +0000 (10:02 -0700)]
b=23408 disable failure temporarily while we collect performance stats
i=vitaly
Johann Lombardi [Tue, 24 Aug 2010 20:09:06 +0000 (22:09 +0200)]
b=23595 fix broken patch
Nathan Rutman [Mon, 23 Aug 2010 17:59:22 +0000 (10:59 -0700)]
b=23595 return registration errors
i=johann
i=vitaly
Johann Lombardi [Tue, 24 Aug 2010 16:26:33 +0000 (18:26 +0200)]
b=21174 fix tiny nit in previous landing
Andrew Perepechko [Fri, 20 Aug 2010 21:51:56 +0000 (01:51 +0400)]
b=23216 a fix for a memory leak in echo_commitrw
i=Andreas Dilger
i=Zhen Liang
Elena Gryaznova [Fri, 20 Aug 2010 21:46:06 +0000 (01:46 +0400)]
b=23573 skip conf-sanity fs2 tests for HARD failure mode
i=Andrew.Perepechko
Elena Gryaznova [Fri, 20 Aug 2010 21:40:41 +0000 (01:40 +0400)]
b=20407 replay-ost-single: do not skip for HARD mode and mixed_ost_devs
i=Brian.Murrell
Nathan Rutman [Fri, 20 Aug 2010 21:16:04 +0000 (14:16 -0700)]
b=23595 fix conf-sanity 57 for remote ost
i=vitaly.fertman
Nathan Rutman [Fri, 20 Aug 2010 18:54:56 +0000 (11:54 -0700)]
b=22934 fix writeconf, redux
Andrew Perepechko [Fri, 20 Aug 2010 11:15:25 +0000 (13:15 +0200)]
b=21174 allow quotacheck over OSTs with sparse indices
i=Elena Gryaznova
Elena Gryaznova [Wed, 18 Aug 2010 13:31:59 +0000 (17:31 +0400)]
b=23335 Allocate echo objects that can be mapped to a valid FID
With the change to using valid FIDs for all OST objects in bug 19427,
the echo objid needs to be below 2^32, because regular FID numbers
are limited to 2^32 objects in a single sequence number.
o=andreas.dilger
i=aleksandr.guzovskiy
i=mikhail.pershin
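The constraint in the b=23335 entry above can be sketched as a simple range check. This is only an illustration: `echo_objid_is_valid` and `MAX_OBJECTS_PER_SEQ` are hypothetical names, not identifiers from the patch, and the only fact taken from the commit message is the 2^32 limit per sequence.

```python
# Limit stated in the commit message: 2^32 objects per FID sequence.
MAX_OBJECTS_PER_SEQ = 2**32


def echo_objid_is_valid(objid):
    """Return True if objid could be mapped into a single FID sequence.

    Hypothetical helper: it only checks the range [0, 2^32), which is
    the constraint the commit message describes.
    """
    return 0 <= objid < MAX_OBJECTS_PER_SEQ
```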
Elena Gryaznova [Tue, 17 Aug 2010 15:46:03 +0000 (19:46 +0400)]
b=23278 replay-single test 86 does not remount client
o=Oleg.Drokin
i=Elena.Gryaznova
Elena Gryaznova [Mon, 16 Aug 2010 16:23:10 +0000 (20:23 +0400)]
b=20407 TF: "HARD" failovers with multiple targets per server
i=Brian.Murrell
i=Li.Wei
Andrew Perepechko [Mon, 16 Aug 2010 12:00:02 +0000 (16:00 +0400)]
b=21174 allow quotacheck over OSTs with sparse indices
i=Johann Lombardi
i=ZhiYong Tian
yangsheng [Mon, 16 Aug 2010 05:07:58 +0000 (23:07 -0600)]
b=21610 Update changelog & which_kernel for sles11 sp1.
yangsheng [Fri, 13 Aug 2010 20:26:18 +0000 (04:26 +0800)]
b=21610 Kernel update for SLES11 SP1.
Update SLES11 SP1 to 2.6.32.13-0.5.1.
i=adilger, johann, kalpak
i=girish, rahul, whitebear
i=brian, wangyb
Dmitry Zogin [Wed, 11 Aug 2010 14:27:14 +0000 (10:27 -0400)]
b=23206 performance-sanity test_8 FAIL
Debug patch
Andreas Dilger [Tue, 10 Aug 2010 05:57:50 +0000 (01:57 -0400)]
b=23409 add -i to the setstripe usage and man page
Add the "-i" option to the "lfs setstripe" usage and man page.
Fix nroff formatting in the "lfs setstripe" and "lfs getstripe".
i=sheila.barthel
Nathan Rutman [Mon, 9 Aug 2010 19:20:29 +0000 (12:20 -0700)]
b=21720 fix test 18 to interleave tests, increase pass margin
i=rread
backport 2.0's fractional-second createmany
Andreas Dilger [Mon, 9 Aug 2010 14:25:34 +0000 (10:25 -0400)]
b=23270 simplify "lfs osts" (llapi_ostlist) code
Simplify "lfs osts" command so that it avoids the filesystem traversal code
entirely, and just calls setup_osts() to print the OST list.
Dmitry Zogin [Mon, 9 Aug 2010 12:59:45 +0000 (08:59 -0400)]
b=23316 BUG: soft lockup - CPU#2 stuck for 10s! [ll_cfg_requeue:2851]
Use SHARED_DIR_LOGS in error_noexit().
i=grev
Dmitry Zogin [Mon, 9 Aug 2010 12:51:27 +0000 (08:51 -0400)]
b=22891 Objects not getting deleted for files which have been removed
ll_have_md_lock() should differentiate between CR and CW OPEN locks.
Also sanityN.sh test_36b was added.
i=oleg.drokin
i=johann.lombardi
Fan Yong [Mon, 9 Aug 2010 09:02:57 +0000 (17:02 +0800)]
b=22979 ignore the case of zero unused lock before recovery for replay-single test_85
i=johann.lombardi
i=Hongchao.zhang
yangsheng [Mon, 9 Aug 2010 08:54:59 +0000 (16:54 +0800)]
b=22596 Avoid test failure on single OST.
i=adilger, grev
Use OSTCOUNT instead of a hard-coded 2 to handle the single-OST case.
This test does not require OSTCOUNT > 1.
yangsheng [Mon, 9 Aug 2010 08:51:31 +0000 (16:51 +0800)]
b=13585 Remove i_filterdata patches.
i=adilger
yangsheng [Mon, 9 Aug 2010 08:45:33 +0000 (16:45 +0800)]
b=22514 Update to latest RHEL5.5 kernel.
Andreas Dilger [Fri, 6 Aug 2010 19:43:04 +0000 (13:43 -0600)]
b=22906 mke2fs needs ~16TB LUNs to be 16TB-1 block
Adjust LUNs at or just over 16TB to be 1 block below 16TB, to avoid
problems with current (1.41) mke2fs being unhappy with this size.
i=zhiqi.tao
i=girish.shilamkar
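A minimal sketch of the size adjustment described in the b=22906 entry above, assuming sizes in bytes and a 4096-byte block. The function name `adjust_lun_size` and the `slack` threshold used to define "just over 16TB" are assumptions for illustration and are not taken from the patch.

```python
BLOCK_SIZE = 4096          # assumed filesystem block size in bytes
LIMIT_16TB = 16 * 2**40    # 16 TiB expressed in bytes


def adjust_lun_size(size_bytes, slack=BLOCK_SIZE * 1024):
    """Clamp sizes at or slightly above 16 TiB to one block below 16 TiB.

    `slack` is a hypothetical definition of "just over 16TB"; sizes
    outside [16 TiB, 16 TiB + slack] are returned unchanged.
    """
    if LIMIT_16TB <= size_bytes <= LIMIT_16TB + slack:
        return LIMIT_16TB - BLOCK_SIZE
    return size_bytes
```

Sizes well above the limit are deliberately left alone here, since the commit message only mentions LUNs at or just over 16TB.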
Vladimir Saveliev [Sun, 1 Aug 2010 05:21:42 +0000 (09:21 +0400)]
b=13698 LL_IOC_RECREATE_FID (1.8)
define new ioctl for object replicate
it uses IDIF FID instead of struct ll_recreate_obj
old LL_IOC_RECREATE is kept for compatibility
i=andreas.dilger
i=di.wang
Andreas Dilger [Wed, 4 Aug 2010 19:56:43 +0000 (13:56 -0600)]
b=22481 man page for lfs_migrate
Add a manual page for lfs_migrate.
Minor formatting changes of the lfs_migrate::usage() message.
minhdiep [Tue, 3 Aug 2010 21:53:36 +0000 (15:53 -0600)]
b=12197 use absolute path in mtab
i=johann
The mount point passed into mtab should be an absolute path.
Convert the mount point to its real path.
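The conversion described in the b=12197 entry above can be illustrated with the standard library's realpath, which returns an absolute path with symlinks, "." and ".." resolved. This is a sketch of the idea only; `canonical_mount_point` is a hypothetical name and the actual mount utility is not written in Python.

```python
import os


def canonical_mount_point(path):
    """Return an absolute, symlink-resolved form of a mount point path.

    Hypothetical helper: relative paths are resolved against the
    current working directory by os.path.realpath().
    """
    return os.path.realpath(path)
```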
Eric Mei [Tue, 3 Aug 2010 13:36:58 +0000 (07:36 -0600)]
b=22944 skip conf-sanity test 16 in interop mode.
r=andreas.dilger
r=jian.yu
r=grev
Andrew Perepechko [Mon, 2 Aug 2010 19:43:38 +0000 (23:43 +0400)]
b=23234 use a regular expression to parse ip_addr from ret_str in lc_net
lc_net does not parse unexpected output from pdsh well
a=Chris Horn (CRAY)
i=Elena Gryaznova
i=Andrew Perepechko
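A rough sketch of the approach named in the b=23234 entry above: search the returned string for an IPv4 address with a regular expression rather than assuming a fixed output layout. The pattern, the function name `parse_ip_addr`, and the sample pdsh-style strings are assumptions for illustration; the actual expression used in lc_net is not shown in this log.

```python
import re

# Matches a dotted-quad IPv4 address; octet ranges are not validated.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_ip_addr(ret_str):
    """Return the first IPv4-looking token in ret_str, or None."""
    match = IPV4_RE.search(ret_str)
    return match.group(0) if match else None
```

Searching anywhere in the string means extra lines or prefixes in the output do not break the parse, at the cost of accepting any dotted-quad token that happens to appear first.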
Fan Yong [Mon, 2 Aug 2010 15:40:28 +0000 (23:40 +0800)]
b=23310 Partially match the output message against the expected one to handle the different getfacl/setfacl output formats on different Linux distributions
i=tappro
i=wangyb
Andrew Perepechko [Sat, 31 Jul 2010 20:30:34 +0000 (00:30 +0400)]
b=22107 pin object's inode in memory to avoid certain timeouts
i=Andreas Dilger
i=Johann Lombardi
Elena Gryaznova [Fri, 30 Jul 2010 15:00:45 +0000 (19:00 +0400)]
b=14242 test_6g fails when b_release_1_6_4 is run on Cray XT3
i=Andreas.Dilger
acceptance-small test-framework RUNAS_GID changes
Dmitry Zogin [Fri, 30 Jul 2010 13:48:09 +0000 (09:48 -0400)]
b=21760 Application hung in direct I/O
Make sure that the bulk is aborted, if a request has been aborted in flight.
Call ptlrpc_abort_bulk() out of ptlrpc_check_set()
i=oleg.drokin
i=andrew.perepechko
Johann Lombardi [Mon, 2 Aug 2010 07:49:53 +0000 (09:49 +0200)]
Revert "b=23139 give the required grant for reconnection"
This reverts commit 307f1ef16b4f32b9deeefff4b0aa5a1f0f0d2efa.
Revert patch from bug 23139 since it causes build failure on i686
and it also contains a bogus LASSERT.
Johann Lombardi [Fri, 30 Jul 2010 13:59:23 +0000 (15:59 +0200)]
Update changelog section to 1.8.5
hongchao.zhang [Thu, 22 Jul 2010 04:25:27 +0000 (12:25 +0800)]
b=23139 give the required grant for reconnection
if a client is reconnecting to the filter, the grant
required by the client should be honored
i=oleg.drokin
i=eric.mei
Landen [Fri, 30 Jul 2010 07:35:08 +0000 (15:35 +0800)]
b=20433 we should recycle dentries and inodes if only locks being cancelled exist
i=green
i=adilger
Elena Gryaznova [Thu, 29 Jul 2010 12:25:57 +0000 (16:25 +0400)]
b=23382 t-f: do_nodes(): wrong sed RE
i=Andrew.Perepechko
hongchao.zhang [Wed, 21 Jul 2010 05:35:15 +0000 (13:35 +0800)]
b=23352 print the arrival time of late cancel RPC
in "ldlm_cancel_handler", print the arrival time of RPCs, which
cancel a lock whose corresponding export has already disappeared
i=nathan.rutman
i=hongchao.zhang
yangsheng [Thu, 29 Jul 2010 09:53:54 +0000 (17:53 +0800)]
b=23371 Avoid deadlock with i_data_sem.
i=adilger
i=andrew
ext4_ext_walk_space() currently takes i_data_sem, so we have to detect this case to avoid deadlock.
yangsheng [Thu, 29 Jul 2010 09:49:18 +0000 (17:49 +0800)]
b=23064 improve bdi usage
i=andreas
i=kalpak
i=johann
Johann Lombardi [Thu, 29 Jul 2010 08:56:33 +0000 (10:56 +0200)]
b=23439 fix some recovery debug messages
i=andrew
Nathan Rutman [Wed, 28 Jul 2010 23:02:38 +0000 (16:02 -0700)]
b=23228 handle any previous state in test_59
i=johann
Johann Lombardi [Fri, 23 Jul 2010 22:49:59 +0000 (00:49 +0200)]
b=22632 also build mptlinux on SLES11
i=johann
Add SLES11 to the list of platforms we build mptlinux on.
FWIW, RDAC fails to build on SLES11 so it has not been added here.
Johann Lombardi [Fri, 23 Jul 2010 22:46:54 +0000 (00:46 +0200)]
b=21587 add debug patch
i=andrew
E.Gryaznova [Fri, 23 Jul 2010 22:39:17 +0000 (00:39 +0200)]
b=23402 mmp_init() fix
i=jian.yu
Johann Lombardi [Fri, 23 Jul 2010 22:35:11 +0000 (00:35 +0200)]
add changelog entries
Johann Lombardi [Fri, 23 Jul 2010 22:25:55 +0000 (00:25 +0200)]
b=23368 fix conflicting ext4 mount flags
i=adilger
Johann Lombardi [Fri, 23 Jul 2010 21:57:44 +0000 (23:57 +0200)]
b=23368 fix bug in mainline rhel5/ext4 causing slab corruption when mount failed
i=adilger
Johann Lombardi [Fri, 23 Jul 2010 21:54:55 +0000 (23:54 +0200)]
b=23368 disable DELALLOC by default for RHEL5/ext4
i=yangsheng
As for SLES11, we should disable delayed allocation by default
since it is known to be buggy.
johann [Fri, 9 Jul 2010 22:36:13 +0000 (00:36 +0200)]
set version to 1.8.4 for rc1
johann [Fri, 9 Jul 2010 22:34:57 +0000 (00:34 +0200)]
set expected release date in the changelogs
johann [Fri, 9 Jul 2010 22:26:48 +0000 (00:26 +0200)]
Update copyright
johann [Fri, 9 Jul 2010 22:15:29 +0000 (00:15 +0200)]
b=23305 fix changelog entry to point to public bug
johann [Fri, 9 Jul 2010 21:12:30 +0000 (23:12 +0200)]
Move ext4-remove-extents-warning-rhel5.patch to correct place
Girish Shilamkar [Fri, 9 Jul 2010 10:41:48 +0000 (16:11 +0530)]
b=23302 Remove "extents disabled" warning
i=johann
yangsheng [Fri, 9 Jul 2010 14:47:09 +0000 (22:47 +0800)]
b=23122 Change config check for sles11 sp1.
i=johann
johann [Fri, 9 Jul 2010 11:14:20 +0000 (13:14 +0200)]
b=22771 add changelog entry
Girish Shilamkar [Thu, 8 Jul 2010 18:34:52 +0000 (00:04 +0530)]
b=22771 Patches to disable mb_cache
i=adilger
johann [Thu, 8 Jul 2010 22:04:08 +0000 (00:04 +0200)]
Add missing changelog entries
Dmitry Zogin [Thu, 8 Jul 2010 17:54:32 +0000 (13:54 -0400)]
b=19529 Avoid deadlock for local client writes.
Check the OBD_BRW_MEMALLOC flag correctly in the remote buffer.
i=johann
i=andreas.dilger
yangsheng [Thu, 8 Jul 2010 14:10:04 +0000 (22:10 +0800)]
b=23235 Reintroduce ext4_dquot_initialize() and ext4_dquot_drop() to avoid deadlock.
The problem is that Lustre has usually already started a transaction before calling the ldiskfs/quota
functions, so we still need quota drop & initialize to start the transaction first to avoid an
ordering issue with the other quota operations.
i=johann
i=landen
i=panda
johann [Wed, 7 Jul 2010 21:02:37 +0000 (23:02 +0200)]
Bump version to 1.8.3.58
Andrew Perepechko [Wed, 7 Jul 2010 20:59:09 +0000 (00:59 +0400)]
b=23216 a fix for a possible memory leak in ldiskfs_mb_load_buddy
i=Alex Zhuravlev
i=Johann Lombardi
johann [Wed, 7 Jul 2010 17:20:50 +0000 (19:20 +0200)]
b=23175 disable lockless truncate
lockless truncate is suspected to cause bug 23175. Disable it by
default for now to see if the problem happens again.
hongchao.zhang [Wed, 30 Jun 2010 15:57:49 +0000 (23:57 +0800)]
b=23139 workaround to avoid assertion in osc_init_grant
workaround for 1.6 servers which don't have
the patch from bug 20278 applied
i=oleg.drokin
i=eric.mei
Andrew Perepechko [Tue, 6 Jul 2010 21:51:13 +0000 (01:51 +0400)]
b=23216 prevent memory leak in ost_brw_read and ost_brw_write
i=Alexander Zarochentsev
i=Oleg Drokin
Dmitry Zogin [Mon, 5 Jul 2010 23:17:46 +0000 (19:17 -0400)]
b=21980 cache `ll_obdo_cache': Can't free all objects
Always use OBDO_ALLOC/FREE for obdo allocations to prevent slab fragmentation.
Other related fixes.
o=johann
i=di.wang
i=dmitry.zoguine
pravin [Mon, 5 Jul 2010 14:24:34 +0000 (19:54 +0530)]
b=20563 flush seq fix
i=alexander.zarochentsev
i=rahul
always flush seq on OBD_NOTIFY_INACTIVE event.
yangsheng [Mon, 5 Jul 2010 13:22:48 +0000 (21:22 +0800)]
b=23235 Init lnb[n].lnb_grant_used on recoverable_resent.
i=green
i=wangdi
Dmitry Zogin [Fri, 2 Jul 2010 22:11:37 +0000 (18:11 -0400)]
b=23237 mount.lustre dies with SIGSEGV: Unable to read 1.8 config /tmp/lustre_tmp.IfgmBK/mountdata
Do not try to close non-open file and return the error in get_mountdata()
i=andreas.dilger
i=johann
minhdiep [Fri, 2 Jul 2010 21:23:56 +0000 (15:23 -0600)]
b=23175 Use MPI_Bcast to send random value to all ranks
The file size will be different if each rank generates its own chunksize.
We need rank 0 to generate the random value and pass it to all ranks.
i=Johann
Johann Lombardi [Thu, 1 Jul 2010 13:16:56 +0000 (15:16 +0200)]
Bump version to 1.8.3.57
Johann Lombardi [Wed, 30 Jun 2010 23:00:27 +0000 (01:00 +0200)]
b=23192 ping_evictor_main() should skip export eviction only if we are still in recovery
i=panda
i=tappro
target_recovery_check_and_stop() now returns 1 when the OST is not in
recovery. This confuses the ping evictor, which decides not to process
exports that need to be evicted.
Andreas Dilger [Wed, 30 Jun 2010 21:09:54 +0000 (23:09 +0200)]
b=22360 Check for file_operations .flush fl_owner_t id parameter
i=johann
Since 2.6.18 (commit 75e1fcc0b18df0a65ab113198e9dc0e98999a08c) the
file_operations .flush() method has taken an fl_owner_t id parameter.
This is backported to some older vendor kernels, so a simple kernel
version check, as usual, is not sufficient to determine whether this
parameter is present or not.
Wally Wang [Wed, 30 Jun 2010 14:16:43 +0000 (16:16 +0200)]
b=21528 move "not all requested locks are canceled" message to dlmtrace
Andrew Perepechko [Tue, 29 Jun 2010 15:33:51 +0000 (19:33 +0400)]
b=22658 a conf-sanity test case for the proper missing llogs handling
a=Johann Lombardi
i=Johann Lombardi
i=Andrew Perepechko
Johann Lombardi [Wed, 30 Jun 2010 13:01:55 +0000 (15:01 +0200)]
remove LC_EXPORT___IGET since it is no longer used
./configure: line 11646: LC_EXPORT___IGET: command not found
Johann Lombardi [Wed, 30 Jun 2010 08:53:33 +0000 (10:53 +0200)]
b=23210 don't update obd->obd_osfs if target is gone already
i=wangdi
i=landen
lov_disconnect() can clean up lov->lov_tgts while the statfs interpret
routine of an rpc in flight has not been executed yet.
Johann Lombardi [Wed, 30 Jun 2010 08:28:46 +0000 (10:28 +0200)]
b=23131 issue OBD_NOTIFY_CREATE event in lov
i=nathan
i=andrew
(mds_lov.c:573:mds_lov_update_desc()) Process entered
(obd_class.h:376:obd_get_info()) Process entered
(obd_class.h:378:obd_get_info()) obd_get_info: NULL export
(obd_class.h:378:obd_get_info()) Process leaving (rc=18446744073709551597 : -19 : ffffffffffffffed)
The problem is that mds_lov_connect() calls obd_notify(OBD_NOTIFY_CREATE)
*before* obd_connect(...,&mds->mds_lov_exp). Since mds_notify(OBD_NOTIFY_CREATE)
requires mds->mds_lov_exp which is not yet initialized, it fails.
Johann Lombardi [Tue, 29 Jun 2010 11:59:03 +0000 (13:59 +0200)]
b=23192 fix race when ping evictor and srv thread execute target_recovery_check_and_stop() simultaneously
i=tappro
i=andrew
target_recovery_expired() wakes up both the ping evictor and the service thread
waiting in process_recovery_queue() which can race in
target_recovery_check_and_stop().
Brian J. Murrell [Mon, 28 Jun 2010 16:33:30 +0000 (12:33 -0400)]
b=23185 check for both arches
When we build our version of the SLES kernel, we optimize it for i686
whereas the SUSE kernel is i386. The actual arch makes a difference in
where the Module.symvers can be found, so just look in both locations
to cover both the upstream vendor kernel as well as our patched kernel.
ZhangHongChao [Mon, 28 Jun 2010 16:25:42 +0000 (18:25 +0200)]
b=17485 fix replay-single test 86 to remount client at the end
i=johann
Andrew Perepechko [Sun, 27 Jun 2010 18:27:17 +0000 (22:27 +0400)]
b=23196 obd reference fixes in lov_quota_check
i=Johann Lombardi
Johann Lombardi [Sun, 27 Jun 2010 05:48:51 +0000 (07:48 +0200)]
b=23196 quota broadcast crashes with inactive OSC
lov_quota_adjust_qunit() must take a lov reference
and check for lov->lov_tgts[i] != NULL when parsing
lov->lov_tgts.
Brian J. Murrell [Sat, 26 Jun 2010 02:57:23 +0000 (22:57 -0400)]
b=23185 use resolve_arch() in the test too
Missed a needed use of resolve_arch() in the test for the file we
extracted using resolve_arch().