Whamcloud - gitweb
fs/lustre-release.git
Elena Gryaznova [Wed, 26 May 2010 21:10:51 +0000 (01:10 +0400)]
b=22668 test_65b fix

i=Johann.Lombardi
i=Andrew.Perepechko

make ost io on OST0000

Elena Gryaznova [Wed, 26 May 2010 20:52:06 +0000 (00:52 +0400)]
b=22668 test_67b fix for ostcount > 10

i=Johann

Elena Gryaznova [Wed, 26 May 2010 20:29:15 +0000 (00:29 +0400)]
b=18489 test_116, test_118k cleanup

i=Andrew.Perepechko (panda)
i=Andreas.Dilger

Elena Gryaznova [Wed, 26 May 2010 19:24:47 +0000 (23:24 +0400)]
b=21953 use separate failover counter for each facet

i=Mikhail.Pershin (tappro)

Robert Read [Thu, 27 May 2010 15:56:13 +0000 (08:56 -0700)]
Revert "b=22244 delegate lock cancel to blocking thread"

This reverts commit 675dd06e429ee9551d0f874f3461ac3e5091c039.

pravin [Tue, 25 May 2010 14:29:14 +0000 (19:59 +0530)]
b=20339 remove ref to LIBCFS_SSIZE_T_LONG & LIBCFS_SIZE_T_LONG function

i=he.h.huang

Elena Gryaznova [Wed, 26 May 2010 11:03:21 +0000 (15:03 +0400)]
b=22215 mpi_run (): p4_error fix

i=Brian.Murrell

Fail if the mpirun output contains p4_error, regardless of the error message and value.
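
The rule above can be sketched in C as follows. This is only an illustration of the check, not the actual fix, which lives in the test framework scripts; the function name `demo_mpirun_failed` and the fallback to the exit status are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether an mpirun invocation failed.
 * Any occurrence of "p4_error" in the captured output counts as a
 * failure, whatever message or value follows it. Falling back to the
 * exit status otherwise is an assumption of this sketch.
 */
bool demo_mpirun_failed(int exit_status, const char *output)
{
    if (output != NULL && strstr(output, "p4_error") != NULL)
        return true;
    return exit_status != 0;
}
```

With this helper, output such as "p4_error: interrupt SIGx: 13" is reported as a failure even when the exit status is 0.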

Elena Gryaznova [Wed, 26 May 2010 06:46:11 +0000 (10:46 +0400)]
b=22841 local.sh mgs mkfs options MGSSIZE fix

i=Dmitry.Zoguine

Elena Gryaznova [Wed, 26 May 2010 06:36:35 +0000 (10:36 +0400)]
b=20918 report max recovery time estimated

i=Andrew.Perepechko

Nicolas Williams [Tue, 25 May 2010 23:51:58 +0000 (18:51 -0500)]
b22131 ASSERTION(request->rq_repdata == NULL) failed

    rq_repdata needs to be reset on free.

i=eric.mei@oracle.com
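
A minimal sketch of the "reset on free" pattern, assuming a simplified request type. `struct demo_request` and both helpers are hypothetical stand-ins and do not reproduce the real ptlrpc request structure or its allocation paths.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a request that owns a reply buffer. */
struct demo_request {
    void  *rq_repdata;      /* reply buffer, NULL when none is attached */
    size_t rq_repdata_len;  /* size of the attached buffer in bytes */
};

/*
 * Attach a fresh reply buffer. Returns 0 on success, -1 if a buffer is
 * still attached (the situation the assertion guards against) or if the
 * allocation fails.
 */
int demo_alloc_repdata(struct demo_request *req, size_t len)
{
    if (req->rq_repdata != NULL)
        return -1;
    req->rq_repdata = malloc(len);
    if (req->rq_repdata == NULL)
        return -1;
    req->rq_repdata_len = len;
    return 0;
}

/* Free the reply buffer and reset the pointer so the request can be reused. */
void demo_free_repdata(struct demo_request *req)
{
    free(req->rq_repdata);
    req->rq_repdata = NULL;   /* without this reset, a later reuse sees a stale pointer */
    req->rq_repdata_len = 0;
}
```

Because the pointer is cleared in the free path, a request that is recycled passes the "no buffer attached" check on its next allocation.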

Alexander.Zarochentsev [Tue, 25 May 2010 07:22:10 +0000 (11:22 +0400)]
b=22598 Disable COS by default

i=robert.read

Mikhail Pershin [Tue, 25 May 2010 05:00:28 +0000 (09:00 +0400)]
b=22830 correct interop warning message

i=zam

Dmitry Zogin [Mon, 24 May 2010 18:38:53 +0000 (14:38 -0400)]
b=17382 obdfilter-survey gives unreasonably high numbers

 Wait for all threads to complete when running test_brw.

 i=andreas.dilger
 i=oleg.drokin

Vitaly Fertman [Mon, 24 May 2010 19:23:45 +0000 (23:23 +0400)]
b=19325 adjust waiting extent locks during 1st enqueue

o=bobijam
i=vitaly
i=green

Re-landing. Adjust locks' extents on their first enqueue, so that at the time
they get granted, there is no need for another pass through the queues since
they are already shaped into the proper forms.

Rahul Deshmukh [Wed, 26 May 2010 08:48:17 +0000 (14:18 +0530)]
b=21815 per-nid stats should not access lustre hash internal structures directly

Fixed Yang Sheng's patch as per Robert's comment.

i=rread

Fan Yong [Tue, 25 May 2010 02:08:25 +0000 (10:08 +0800)]
b=21636 drop noisy debug messages

Drop noisy debug messages

i=robert.read
i=rahul.deshmukh

Fan Yong [Tue, 25 May 2010 02:01:39 +0000 (10:01 +0800)]
b=22342 invalidate dentry can not be counted as the same

Invalidate dentry can not be counted as the same.

i=robert.read
i=eric.mei

Robert Read [Thu, 20 May 2010 04:35:36 +0000 (21:35 -0700)]
b=22040 Don't run connectathon lock tests on nfsv4

We won't support flock on nfsv4 in 2.x until 14080 has been fixed.

i=green

Brian J. Murrell [Tue, 25 May 2010 21:06:43 +0000 (17:06 -0400)]
b=17258 remove resurrected files

build/{lmake,lustre-kernel-2.4.spec.in} died with commit
9aa2968b965a5acdf127a5635dfde8950bc8d3a2 but a cherry-pick seems to
have resurrected them.  ~sigh~  Bad cherry-pick, bad cherry-pick.
Let's try a wooden stake this time.

Brian J. Murrell [Tue, 25 May 2010 17:48:13 +0000 (13:48 -0400)]
b=22847 separate format and content in ext[34]_warning()

ext{3,4}_warning should not try to overload the message into the format,
but should instead pass a "%s" as the format and the string as an argument
for the %s.

i=panda
i=whitebear
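
The difference can be sketched with a printf-style helper. `demo_warning` and `demo_warn_message` are hypothetical stand-ins written for this example; they are not the kernel's ext3_warning()/ext4_warning() functions and do not share their signatures.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical printf-style warning helper that formats into a buffer. */
int demo_warning(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, outlen, fmt, ap);
    va_end(ap);
    return n;
}

/*
 * Report an already-built message. The message is passed as an argument
 * for "%s", never as the format itself, so any '%' characters it
 * contains are copied verbatim instead of being interpreted.
 */
int demo_warn_message(char *out, size_t outlen, const char *msg)
{
    return demo_warning(out, outlen, "%s", msg);
}
```

Calling `demo_warning(buf, sizeof(buf), msg)` directly would treat a message such as "quota at 100%s" as a format string and read a missing argument, which is undefined behavior.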

Brian J. Murrell [Tue, 25 May 2010 17:48:12 +0000 (13:48 -0400)]
b=22642 ldiskfs to figure out ext3/4 base itself

Ldiskfs should figure out whether to base itself on ext3 or ext4
by itself and not rely on lustre's configure to tell it.

i=mjmac
i=johann

Brian J. Murrell [Tue, 25 May 2010 17:48:11 +0000 (13:48 -0400)]
b=22787 update to ofed to 1.5.1

For O/Ses where we don't use the vendor supplied OFED, update the built
OFED to 1.5.1.

i=johann

Brian J. Murrell [Tue, 25 May 2010 17:48:10 +0000 (13:48 -0400)]
b=17258 give the BUILD_TESTS love to ldiskfs as well

Because ldiskfs re-uses so (too?) much of the lustre auto* goop we need
to stub the BUILD_TESTS assignment into its autoMakefile.am, even though
it's completely unused/unneeded there.

i=wangyb
i=yangsheng

Brian J. Murrell [Tue, 25 May 2010 17:48:09 +0000 (13:48 -0400)]
b=17258 fix error with make rpms after configure --disable-tests

If one configures lustre with "--disable-tests" a subsequent "make rpms"
will fail as it would still try to package up the lustre-tests RPM.
Fixing this provided the opportunity to fix another wart, that being the
subst'ing of the configure arguments into the lustre.spec.  Now they are
passed as a value with "--define 'configure_args ...'" when calling rpmbuild.

i=sheng.yang
i=yibin.wang

Brian J. Murrell [Fri, 21 May 2010 15:06:45 +0000 (11:06 -0400)]
b=20619 be flexible in version format

The version format for the OFED release needs to be more flexible than
the previous M.m format that was required.  Now it can be any dot-separated
format, e.g. 1 or 1.4 or 1.5.1 or 1.5.1.2, etc.  Of course, it has to be a
valid and unique OFED version (so the "1" example would be a well-formed
version string but still invalid, because there is no OFED version "1".
Ditto for 1.5.1.2 at the time of writing this comment.).

i=michael.macdonald
i=minh.diep
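
A sketch of what "any dot-separated format" could mean as a syntactic check. The real check is presumably done in the build scripts rather than in C; `demo_is_dotted_version` is a hypothetical name, and the assumption that each component consists only of decimal digits is taken from the examples in the message.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if s consists of one or more decimal components separated
 * by single dots, for example "1", "1.4", "1.5.1" or "1.5.1.2".
 * This checks the shape only; it cannot tell whether such an OFED
 * release actually exists.
 */
bool demo_is_dotted_version(const char *s)
{
    bool need_digit = true;   /* the current component has no digit yet */

    if (s == NULL || *s == '\0')
        return false;

    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s)) {
            need_digit = false;
        } else if (*s == '.' && !need_digit) {
            need_digit = true;    /* a new component starts after the dot */
        } else {
            return false;         /* empty component or foreign character */
        }
    }
    return !need_digit;           /* reject a trailing dot */
}
```

As the message notes, a string like "1" passes this kind of check but is still not a usable version, so a lookup against the known OFED releases is needed as a second step.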

Vladimir Saveliev [Fri, 21 May 2010 11:54:03 +0000 (15:54 +0400)]
b=21109 mds_lov_read_objids cleanup and conf-sanity tests

calculate mds->mds_lov_objid_lastidx and mds->mds_lov_objid_lastpage correctly
have index strings big enough to store indices in decimal
test to check size of lov_objid and
test to check configuration with big indices

i=adilger
i=panda
i=nathan

Rahul Deshmukh [Fri, 21 May 2010 06:20:30 +0000 (11:50 +0530)]
b=21951 2.6.32-fc13 patchless client support for HEAD


i=sheng.yang
i=sebastien.buisson

Improvement changes for proc_handler, head port

Dmitry Zogin [Fri, 21 May 2010 03:35:32 +0000 (23:35 -0400)]
b=17086 LSI Fusion MPT driver hacks to improve performance

 patch to set CONFIG_FUSION_MAX_SGE=256 for Rhel5

 i=johann

Nathan Rutman [Thu, 20 May 2010 19:08:50 +0000 (12:08 -0700)]
b=21900 verify that we can write to the OST

i=dmitry.zoguine
a=nathan

Vitaly Fertman [Thu, 20 May 2010 15:45:43 +0000 (19:45 +0400)]
b=22244 delegate lock cancel to blocking thread

i=adilger
i=green

instead of cancelling locks locally in the shrinking thread,
delegate it to a separate blocking thread.

Vitaly Fertman [Thu, 20 May 2010 15:45:42 +0000 (19:45 +0400)]
b=22244 ldlm cancel flags cleanup

i=adilger
i=green

cleanup of cancel flags passed to ldlm lock cancel code

Fan Yong [Thu, 20 May 2010 15:18:32 +0000 (23:18 +0800)]
b=22560 fix "obd_connect_names[]" for "OBD_CONNECT_FULL20" introduced

fix "obd_connect_names[]" for "OBD_CONNECT_FULL20" introduced

i=robert.read
i=andreas.dilger

Mikhail Pershin [Thu, 20 May 2010 15:03:05 +0000 (19:03 +0400)]
b=21485 allocate lcd inside obd_init_export()

i=rread
i=zam

Fan Yong [Thu, 20 May 2010 05:54:12 +0000 (13:54 +0800)]
b=22637 MDS returns OBD_MD_FLSIZE to client only when no OSS object allocated

MDS returns OBD_MD_FLSIZE to client only when no OSS object allocated.

i=robert.read
i=andreas.dilger

Alexander.Zarochentsev [Thu, 20 May 2010 07:33:48 +0000 (11:33 +0400)]
b=22827 restore locking for thread->t_flags

i=tappro
i=robert.read

Robert Read [Thu, 13 May 2010 23:59:42 +0000 (16:59 -0700)]
b=20938 Add liblustreapi.a to dependencies lists

I think this is why we occasionally fail to build lustre on LBATS
with a missing liblustreapi.a.

i=brian

Eric Mei [Tue, 18 May 2010 14:59:35 +0000 (08:59 -0600)]
b=22458 fix concurrent mgs lock revocation.

r=nathan
r=rread

Mikhail Pershin [Tue, 18 May 2010 05:52:31 +0000 (09:52 +0400)]
b=15587 ignore security.capability xattr on client side

i=adilger
i=johann

Mikhail Pershin [Tue, 18 May 2010 05:43:29 +0000 (09:43 +0400)]
b=21681 Quiet bogus previously committed transno error

i=zhang,panda

LiuYing [Tue, 18 May 2010 02:48:24 +0000 (10:48 +0800)]
b=22455 remove "lnet." prefix from lctl params display

remove "lnet." prefix from lctl params display and change one
"memused" to "lnet_memused".

o=adilger
i=johann
i=emoly.liu
i=rread

Fan Yong [Tue, 18 May 2010 02:35:09 +0000 (10:35 +0800)]
b=22731 server should not fall into LBUG if client send invalid parameter

server should not fall into LBUG if client send invalid parameter

i=robert.read
i=di.wang

Fan Yong [Tue, 18 May 2010 02:24:44 +0000 (10:24 +0800)]
b=22560 introduce "OBD_CONNECT_FULL20" to distinguish 1.8 client from 2.0 one for different checksum policy

Introduce "OBD_CONNECT_FULL20" to distinguish 1.8 client from 2.0 one for different checksum policy:
1) for 1.8 client, use fixed first 88 bytes of ptlrpc_body
2) for 2.0 client, use lm_buflens

i=andreas.dilger
i=robert.read
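
The two policies can be sketched as a choice of how many bytes of the body are covered by the checksum. Everything below is hypothetical: `demo_checksum_len`, the boolean standing in for a negotiated OBD_CONNECT_FULL20 flag, and the clamping for short buffers are assumptions of this example, and only the constant 88 comes from the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fixed prefix length named in the commit message for 1.8 clients. */
#define DEMO_BODY_V18_CSUM_LEN 88

/*
 * Hypothetical helper: pick the number of body bytes to checksum.
 *   peer_is_full20 - stands in for "the peer negotiated OBD_CONNECT_FULL20"
 *   body_buflen    - stands in for the body length recorded in lm_buflens
 * A 2.0 peer covers the whole buffer; a 1.8 peer covers only the fixed
 * prefix. Clamping to the buffer length is an assumption of this sketch.
 */
size_t demo_checksum_len(bool peer_is_full20, size_t body_buflen)
{
    if (peer_is_full20)
        return body_buflen;
    return body_buflen < DEMO_BODY_V18_CSUM_LEN ? body_buflen
                                                : DEMO_BODY_V18_CSUM_LEN;
}
```

Keying the choice on a connect flag lets both sides agree on the covered length even when their body structures differ in size.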

Nathan Rutman [Fri, 14 May 2010 16:18:43 +0000 (09:18 -0700)]
b=15253 add conf_param -d to remove permanent settings

i=adilger
i=rread

Rahul Deshmukh [Fri, 14 May 2010 10:29:39 +0000 (15:59 +0530)]
b=22625 Fix libcfs_debug_file_path module option

i=rahul

Landing patch by Brian Behlendorf <behlendorf1@llnl.gov>,
fix libcfs_debug_file_path module option

Maxim Patlasov [Fri, 14 May 2010 09:11:39 +0000 (13:11 +0400)]
b=21945 Adding WIRE_ATTR to lnet structures traversing network

i=liang
i=isaac

LST passed some lnet structures over the network even though they lacked the
WIRE_ATTR attribute. This made LST instances running on different platforms
incompatible.

LiuYing [Fri, 14 May 2010 00:47:07 +0000 (08:47 +0800)]
b=22455 add "list_param -R"

list parameters recursively with the "-R" option

o=adilger
i=emoly.liu
i=nathan

Tags: 1.10.0.42, v1_10_0_42
Terry Rutledge [Thu, 13 May 2010 20:30:07 +0000 (13:30 -0700)]
Updated for build 42.

Robert Read [Wed, 12 May 2010 19:07:34 +0000 (12:07 -0700)]
Revert "b=22637 MDS returns OBD_MD_FLSIZE to client only when no OSS object allocated"

Hit ASSERTION(attr->la_blocks == 0), see bug 22802.

This reverts commit 33b4bafea13bd2cfe90dba3a8651a175683f3999.

Vitaly Fertman [Wed, 12 May 2010 11:04:16 +0000 (15:04 +0400)]
b=22573 do not skip previously granted locks in osc_lock_enqueue_wait().

i=eric.mei
i=wangdi

as CLIO adds new locks to the tail, walk through the head of the queue
to cancel overlapping conflicting locks on enqueue.

Mikhail Pershin [Wed, 12 May 2010 16:25:20 +0000 (20:25 +0400)]
b=22518 mount client2 at the start of tests, disable COS, fix test 10

i=grev

pravin [Wed, 12 May 2010 14:25:24 +0000 (19:55 +0530)]
b=18857 enhance seq allocation scalability by updating seq data asynchronously.

this patch also removes seq replay. ref bug for details.

a=pravin,tappro
i=tappro
i=alexander.zarochentsev
i=pravin

Alexander.Zarochentsev [Wed, 12 May 2010 10:29:19 +0000 (14:29 +0400)]
b=21140 Fix srv_threads_running counting.

It was possible to overload n_active_request while processing incoming requests
and break the thread reservation logic. This was likely responsible for the long
processing of requests.

The patch makes srv_threads_running count exactly the running, not sleeping,
threads. All thread accounting and the comparison/reservation of threads are
done under the service spinlock, so the result is reliable. The thread
reservation logic is based on the new srv_threads_running value and cannot be
confused by inactive sleeping threads. The thread reservation logic is now
concentrated in one place, where the wakeup condition is checked (now in
ptlrpc_main_check_event); once a thread is woken up, it is counted as running
and does further work without additional checks.

i=zhen.liang
i=robert.read
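
The accounting scheme described above might look roughly like this in user space. It is a sketch under stated assumptions: a pthread mutex replaces the service spinlock, `struct demo_service` and both functions are hypothetical, and the reservation rule (keep `threads_reserved` threads idle) is invented for illustration rather than taken from the patch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified service state; not the real ptlrpc service. */
struct demo_service {
    pthread_mutex_t lock;      /* stands in for the service spinlock */
    int threads_total;         /* threads created for the service */
    int threads_running;       /* threads handling a request right now */
    int threads_reserved;      /* threads that must stay free (assumed policy) */
    int requests_pending;      /* queued incoming requests */
};

/*
 * Single place where the wakeup condition is checked. The check and the
 * increment of threads_running happen under the same lock, so a thread
 * that passes the check is already counted as running when it returns.
 */
bool demo_try_start_request(struct demo_service *svc)
{
    bool start = false;

    pthread_mutex_lock(&svc->lock);
    if (svc->requests_pending > 0 &&
        svc->threads_running < svc->threads_total - svc->threads_reserved) {
        svc->requests_pending--;
        svc->threads_running++;
        start = true;
    }
    pthread_mutex_unlock(&svc->lock);
    return start;
}

/* Called when a thread finishes its request and goes back to sleep. */
void demo_finish_request(struct demo_service *svc)
{
    pthread_mutex_lock(&svc->lock);
    svc->threads_running--;
    pthread_mutex_unlock(&svc->lock);
}
```

Because sleeping threads never appear in `threads_running`, the comparison against the reserve reflects only threads that are actually busy.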

Wang Di [Tue, 11 May 2010 17:58:48 +0000 (13:58 -0400)]
b=22683 remove unnecessary check and assert in the cfs_hash function.

o=Eric.Barton
i=Robert.Read
i=Di.Wang

Vladimir Saveliev [Tue, 11 May 2010 06:40:46 +0000 (10:40 +0400)]
b=13698 allow e2fsck part of lfsck.sh to be run without lfsck

this combines initial patch from Andreas (https://bugzilla.lustre.org/attachment.cgi?id=29696)
and several necessary fixes (https://bugzilla.lustre.org/attachment.cgi?id=29747)

i=adilger

Rahul Deshmukh [Tue, 11 May 2010 06:25:30 +0000 (11:55 +0530)]
b=20562 increase LUSTRE_SEQ_META_WIDTH to keep FLD compact

Fixed the build error for fc11 and fc12 patchless client.

i=rread

Fan Yong [Tue, 11 May 2010 03:01:45 +0000 (11:01 +0800)]
b=22598 diagnostic patch for lock cancel callback error processing

diagnostic patch for lock cancel callback error processing.

i=robert
i=di.wang

Fan Yong [Tue, 11 May 2010 01:55:54 +0000 (09:55 +0800)]
b=19986 cleanup lock to eliminate former test cases effect before replay-single test_53

cleanup lock to eliminate former test cases effect before replay-single test_53

i=robert
i=di.wang

Mikhail Pershin [Sun, 9 May 2010 09:09:23 +0000 (13:09 +0400)]
b=18143 Make VBR compatible with pdirops.

i=zam
i=bzzz

Nathan Rutman [Fri, 7 May 2010 22:33:29 +0000 (15:33 -0700)]
b=22283 clarify writeconf in man page

Nathan Rutman [Fri, 7 May 2010 19:29:07 +0000 (12:29 -0700)]
b=22671 Check for modules directly instead of keeping state

i=nico
i=rread

Elena Gryaznova [Fri, 7 May 2010 16:07:54 +0000 (20:07 +0400)]
b=22581 LOADS env var in ncli.sh should allow overwrite

i=Minh.Diep

Oleg Drokin [Wed, 5 May 2010 23:35:58 +0000 (19:35 -0400)]
b=22522 do not remove from res_list without locks

Patch in bug 21501 moved list manipulation of res_list outside of res_lock
introducing a race window in flock code. Move it back under the lock.

i=rread
i=adilger
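
A user-space sketch of keeping the list manipulation inside the lock. The types and `demo_res_list_remove` are hypothetical; a pthread mutex and a singly linked list stand in for the kernel lock and list primitives used by the real code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list node standing in for a lock queued on a resource. */
struct demo_lock_node {
    struct demo_lock_node *next;
    int id;
};

/* Hypothetical resource: a list of locks protected by res_lock. */
struct demo_resource {
    pthread_mutex_t res_lock;
    struct demo_lock_node *res_list;
};

/*
 * Find and unlink the node with the given id. The search and the unlink
 * both happen while res_lock is held, so no other thread can observe or
 * modify a half-updated list. Returns the removed node, or NULL if no
 * node has that id.
 */
struct demo_lock_node *demo_res_list_remove(struct demo_resource *res, int id)
{
    struct demo_lock_node **pp;
    struct demo_lock_node *found = NULL;

    pthread_mutex_lock(&res->res_lock);
    for (pp = &res->res_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            found = *pp;
            *pp = found->next;    /* unlink under the lock */
            found->next = NULL;
            break;
        }
    }
    pthread_mutex_unlock(&res->res_lock);
    return found;
}
```

If the unlink were done after the unlock, a concurrent walker could follow a pointer that is being rewritten, which is the kind of race window the message describes.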

Eric Mei [Wed, 5 May 2010 01:14:34 +0000 (19:14 -0600)]
b=22669 fix fault page index handler in newer kernel.

r=wangdi
r=rread

Rahul Deshmukh [Tue, 4 May 2010 15:16:01 +0000 (20:46 +0530)]
b=21502 symlink compatibility between 1.6 and 2.0

Fixed a sleep under spin lock: inode->i_sb->s_op->dirty_inode(inode)
was called with the spin lock held.

i=bzzz
i=pravin

Eric Mei [Tue, 4 May 2010 14:30:09 +0000 (08:30 -0600)]
b=22683 don't manipulate hash in lov_sub_enter/lov_sub_exit.

r=wangdi
r=rread

Tags: 1.10.0.41a, v1_10_0_41a
Robert Read [Fri, 30 Apr 2010 16:23:28 +0000 (09:23 -0700)]
Revert "b=19427 correct lmm_object_id and reserve fids for fid-on-OST."

This reverts commit 4c01e64e0a72c1682ebf0a8bd4cccf99fd04cd88.

This caused the interop issue seen in bug 22730.

Tags: 1.10.0.41, v1_10_0_41
Robert Read [Thu, 29 Apr 2010 22:09:34 +0000 (15:09 -0700)]
Prepare for Build 41

Fan Yong [Thu, 29 Apr 2010 06:06:37 +0000 (14:06 +0800)]
b=16680 remove some noisy debug messages

Remove some noisy debug messages.

i=robert.read
i=rahul.deshmukh

Jian Yu [Thu, 29 Apr 2010 05:59:13 +0000 (13:59 +0800)]
b=20326 Test suite for MMP feature

Tests for multiple mount protection (MMP) feature.

i=andreas.dilger
i=grev

Fan Yong [Thu, 29 Apr 2010 05:37:03 +0000 (13:37 +0800)]
b=22069 port "llapi_get_connect_flags()" API from b1_8 to master

Port "llapi_get_connect_flags()" API from b1_8 to master.

i=robert.read
i=landen

Manoj Joseph [Thu, 29 Apr 2010 03:27:21 +0000 (21:27 -0600)]
b=22075 buffalo-v2 should detect test timeouts

buffalo-v2 now detects test timeouts. It now generates status entries
in results.yml after a sub-test completes. If the test status is missing,
a timeout is assumed to have occurred.

i=robert.read
i=grev

Manoj Joseph [Thu, 29 Apr 2010 03:27:20 +0000 (21:27 -0600)]
b=21962 Quote the error message in results.yaml

Quote and escape the error message in results.yaml

i=robert.read
i=nicolas.williams
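
One way to quote and escape a message for YAML is to emit it as a double-quoted scalar. The sketch below is an assumption about the general technique, not the code from this commit; `demo_yaml_quote` is a hypothetical name, and only backslash, double quote, newline and tab are handled.

```c
#include <stddef.h>

/*
 * Write msg into out as a double-quoted YAML scalar, escaping backslash,
 * double quote, newline and tab. Other control characters are copied
 * unchanged in this sketch. Returns 0 on success, or -1 if out is too
 * small to hold the quoted string and its terminating NUL.
 */
int demo_yaml_quote(const char *msg, char *out, size_t outlen)
{
    size_t o = 0;

    if (outlen < 3)               /* room for two quotes and the NUL */
        return -1;
    out[o++] = '"';

    for (; *msg != '\0'; msg++) {
        const char *esc = NULL;

        switch (*msg) {
        case '\\': esc = "\\\\"; break;
        case '"':  esc = "\\\""; break;
        case '\n': esc = "\\n";  break;
        case '\t': esc = "\\t";  break;
        default:   break;
        }

        if (esc != NULL) {
            if (o + 4 > outlen)   /* 2 escape bytes + closing quote + NUL */
                return -1;
            out[o++] = esc[0];
            out[o++] = esc[1];
        } else {
            if (o + 3 > outlen)   /* 1 byte + closing quote + NUL */
                return -1;
            out[o++] = *msg;
        }
    }

    out[o++] = '"';
    out[o] = '\0';
    return 0;
}
```

Quoting keeps characters such as ':' or '#' in an error message from being parsed as YAML syntax, and escaping keeps embedded quotes and newlines from terminating the scalar early.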

Nathan Rutman [Wed, 28 Apr 2010 18:04:59 +0000 (11:04 -0700)]
b=22582 remove leading / from fid2path results print full path if given mountpoint

i=manoj
i=rread

Robert Read [Wed, 28 Apr 2010 17:06:59 +0000 (10:06 -0700)]
b=22456 Remove files for unsupported kernels

Removes support for fc3, fc5, rhel4, sles10, 2.6.18 vanilla and 2.6.22 vanilla.

i=adilger

Elena Gryaznova [Tue, 27 Apr 2010 15:32:56 +0000 (19:32 +0400)]
b=18649 set wait_recovery_complete() MAX value to max recovery time estimated

i=Mikhail.Pershin

Elena Gryaznova [Tue, 27 Apr 2010 15:23:04 +0000 (19:23 +0400)]
b=20918 t-f max recovery time estimation

i=Nathan.Rutman
i=Brian.Murrell

inform the user about the estimated maximum recovery time value
to help set the server FAILOVER period properly

Fan Yong [Tue, 27 Apr 2010 05:23:03 +0000 (13:23 +0800)]
b=22342 process racer condition between statahead and rename/unlink operation

1) process racer condition between statahead and rename/unlink operation
2) replace "lli_lock" with "lli_sa_lock" for statahead related processing

i=robert.read
i=tom.wang

Fan Yong [Tue, 27 Apr 2010 03:36:53 +0000 (11:36 +0800)]
b=22634 hold "mds_qonoff_sem" when call "lustre_read_quota()", and check parameter properly in such function

1) replace "cfs_semaphore_t" with "cfs_rw_semaphore_t" for "mds_qonoff_sem" to enhance the parallel processing of quota related operations
2) hold "mds_qonoff_sem" when call "lustre_read_quota()", and check parameter properly in such function

i=robert.read
i=landen

Fan Yong [Tue, 27 Apr 2010 02:51:27 +0000 (10:51 +0800)]
b=22614 enlarge MDSSIZE/OSTSIZE to increase default journal size for conf-sanity

1) enlarge MDSSIZE/OSTSIZE to increase default journal size for conf-sanity
2) journal handler error process in lustre_commit_dquot

i=robert.read
i=landen

Li Wei [Tue, 27 Apr 2010 02:28:09 +0000 (10:28 +0800)]
b=21251 ha.sh: Fix ha_wait_loads and ha_dump_logs

Report existence of hanging workloads.  Ignore "lctl dk" failures, since
some nodes may be down.

i=robert.read

Li Wei [Tue, 27 Apr 2010 02:28:08 +0000 (10:28 +0800)]
b=21251 Add lustre/tests/ha.sh

This is a simple failover test script that works with configurations
that are controlled by a CRM and have multiple targets per server.

i=robert.read
i=grev

Robert Read [Mon, 26 Apr 2010 22:11:52 +0000 (15:11 -0700)]
Revert "b=21379 Fix orphans proceeding in osc_create"

This reverts commit 2deb4f149f4601f9128fd39efd4705573520f277.

Eric Mei [Mon, 26 Apr 2010 14:59:52 +0000 (08:59 -0600)]
b=22458 move lcw_dump out of softirq context.

Now the message dump is done in thread context in lc_watchdogd.

r=rread
r=nathan

pravin [Mon, 26 Apr 2010 13:15:26 +0000 (18:45 +0530)]
b=21128 run sync ldlm_bl_to_thread_list() in separate thread to save stack space.

i=oleg.drokin
i=rahul

Fan Yong [Mon, 26 Apr 2010 07:38:37 +0000 (15:38 +0800)]
b=22637 MDS returns OBD_MD_FLSIZE to client only when no OSS object allocated

i=robert.read
i=landen

Wang Di [Mon, 26 Apr 2010 03:57:44 +0000 (23:57 -0400)]
b=22513 Remove unecessary lock in read-ahead process.

i=Robert.Read
i=Eric.Mei

hongchao.zhang [Wed, 21 Apr 2010 00:54:53 +0000 (08:54 +0800)]
b=21938 use the same set during replay

Some requests use their own ptlrpc_request_set to process their requests, but
Lustre uses a specific ptlrpc_request_set to process requests during recovery.
This patch fixes the problem by letting a request use its own set if it has
one.

i=johann@sun.com
i=tappro@sun.com

Mikhail Pershin [Sun, 25 Apr 2010 12:09:48 +0000 (16:09 +0400)]
b=15936 Unified target cleanups v2

i=rread
i=andreas

Cliff White [Tue, 6 Apr 2010 06:40:15 +0000 (23:40 -0700)]
b=20373 Putting parent lock for rep-ack on create is wasteful

Do not put locks if no create.

i=robert.read
i=tappro

Eric Mei [Fri, 23 Apr 2010 18:48:07 +0000 (12:48 -0600)]
b=22310 add a little more comment.

r=adilger

Isaac Huang [Fri, 23 Apr 2010 04:03:14 +0000 (00:03 -0400)]
b=21678 Add more debug info to lnd_query code path

The peer health code lacked some important debugging info in lnd_query
code paths. This patch added necessary debug prints, not just for bug
21678, but also for future troubleshooting.

i=liang
i=maxim

Wang Di [Fri, 23 Apr 2010 19:57:04 +0000 (12:57 -0700)]
b=19427 correct lmm_object_id and reserve fids for fid-on-OST.

1. Change lmm_object_id to fid.
2. Cleanup fid spaces reservation (for fid-on-OST). http://arch.lustre.org/index.php?title=Interoperability_fids_zfs#NEW.0
3. Rename group to Seq.

i= Andreas.diger
i= pravin.shelar

Vladimir Saveliev [Thu, 22 Apr 2010 19:15:32 +0000 (12:15 -0700)]
b=22615 fixes for regressions caused by 11063

set atime to past under PW EOF extent lock
fix truncate in liblustre

i=vitaly
i=ericm

Manoj Joseph [Thu, 22 Apr 2010 19:15:28 +0000 (12:15 -0700)]
b=22507 rm -rf not replicated

Support replication of recursive directory removal.

i=nathan.rutman
i=robert.read

Wang Di [Thu, 22 Apr 2010 19:15:24 +0000 (12:15 -0700)]
b=22520 set the thread to be uninterrupt before add to waitq

In lov_subobject_kill, if the thread needs to wait for the object to be
freed, it should set itself to be uninterruptible; otherwise, the thread
might spin there.

i=Eric.mei
i=Robert

Wang Di [Thu, 22 Apr 2010 19:15:20 +0000 (12:15 -0700)]
b=22296 Fix script problem for recovery-double-scale

Force the test threads to stop before shutting down the clients in
recovery-double-scale.

i=Jack.Chen
i=WangDi

Mikhail Pershin [Wed, 21 Apr 2010 18:43:20 +0000 (11:43 -0700)]
b=22161 Use LCK_PW for parent lock in mdt_link(). Pdirops test set

i=adilger
i=bzzz
i=rread

yangsheng [Wed, 21 Apr 2010 18:43:12 +0000 (11:43 -0700)]
b=19919 Supply a absolute path.

i=andreas
i=johann

Mikhail Pershin [Wed, 21 Apr 2010 18:43:11 +0000 (11:43 -0700)]
b=22190 Apply 19195 patch to add tls data for recovery thread.

This will be needed anyway when sync journal is ported

i=zam
i=oleg

Rahul Deshmukh [Wed, 21 Apr 2010 18:42:26 +0000 (11:42 -0700)]
fixed for bug 22237

b=22237 replay-single test-13 mmp failure, BUG: warning at
fs/proc/generic.c:764/remove_proc_entry()

The proc entry EXT4_MAX_DIR_SIZE_NAME was not removed in the cleanup path. It
is now fixed.

i=johann