Whamcloud - gitweb
cvs2svn [Thu, 27 Oct 2005 23:26:12 +0000 (23:26 +0000)]
This commit was manufactured by cvs2svn to create branch 'b_release_1_4_6'.
nathan [Thu, 27 Oct 2005 23:26:10 +0000 (23:26 +0000)]
b=9501
r=adilger
automatically create /dev/lnet, /dev/obd when needed.
nathan [Thu, 27 Oct 2005 23:24:45 +0000 (23:24 +0000)]
b=9501
r=adilger
automatically create /dev/lnet, /dev/obd when needed.
nathan [Thu, 27 Oct 2005 22:54:42 +0000 (22:54 +0000)]
Branch b_hd_newconfig
b=9501
fix 'lctl net down' twice assert
pjkirner [Thu, 27 Oct 2005 21:34:51 +0000 (21:34 +0000)]
* Key for peer db is ptl_process_id_t rather than just NID
(+ Associated log msg changes)
Necessary to support catamount processes that come up one right
after another using the same NID but different PIDs
* Fix to bug in peer_time() code with incorrect return code
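
The keying change in the first bullet above can be pictured roughly as follows. This is a minimal sketch, not the actual ptllnd code: `sketch_process_id` is a hypothetical stand-in for ptl_process_id_t, and `sketch_peer_key_equal` is an invented helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ptl_process_id_t: a NID plus a PID. */
struct sketch_process_id {
    uint64_t nid;
    uint32_t pid;
};

/* Two peers match only if both NID and PID match, so a process that
 * comes up on the same NID with a different PID is a distinct peer. */
static bool sketch_peer_key_equal(struct sketch_process_id a,
                                  struct sketch_process_id b)
{
    return a.nid == b.nid && a.pid == b.pid;
}
```

With a NID-only key, the two processes in the commit message would have collided on the same peer entry.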
pjkirner [Thu, 27 Oct 2005 21:05:15 +0000 (21:05 +0000)]
* Fixed handling of NOOP message
* Fixed dssstamp check to handle multiple routers & incoming connections
* Cleanup a few log messages + qkcc compiler warnings
lincent [Thu, 27 Oct 2005 08:20:27 +0000 (08:20 +0000)]
modify llog_reader to show args of lov_modify_tgts.
eeb [Thu, 27 Oct 2005 02:26:09 +0000 (02:26 +0000)]
* iiblnd: drain disconnecting sockets
* lnet: explicit configure/teardown for routers ("lctl network up" as well as
"lctl network down"). config_on_load=0 is the new default (setting it
effectively does "lctl network up" at module load time).
Added "net" as an alias for "network" to lctl so you don't have to type
"work" when you run "lctl net up" from the shell.
Also fixed a couple of bugs which required lnet to be unloaded before it
could be brought up again.
* lnet routing: restored automatic route disabling when comms to a router
fails (currently kernel elan and tcp only).
eeb [Thu, 27 Oct 2005 02:26:07 +0000 (02:26 +0000)]
* iiblnd: drain disconnecting sockets
* lnet: explicit configure/teardown for routers ("lctl network up" as well as
"lctl network down"). config_on_load=0 is the new default (setting it
effectively does "lctl network up" at module load time).
Added "net" as an alias for "network" to lctl so you don't have to type
"work" when you run "lctl net up" from the shell.
Also fixed a couple of bugs which required lnet to be unloaded before it
could be brought up again.
* lnet routing: restored automatic route disabling when comms to a router
fails (currently kernel elan and tcp only).
brian [Wed, 26 Oct 2005 23:15:01 +0000 (23:15 +0000)]
r=nic
Allow lbuild to build a kernel from a downloaded RHEL4 SRPM. It should only
need to do this once as it caches the .tar.bz2 it builds.
This technique can/should be ported to other vendor kernels we deal with.
Just more "hands off" automation of the build process.
adilger [Wed, 26 Oct 2005 18:35:46 +0000 (18:35 +0000)]
Branch b1_4
Don't compare .xml mtimes, they will always be different.
Split "long UUID" and "consistent .xml" into separate subtests.
nathan [Wed, 26 Oct 2005 16:46:51 +0000 (16:46 +0000)]
Branch b1_4
(lin)print lov_setup info
adilger [Wed, 26 Oct 2005 09:29:33 +0000 (09:29 +0000)]
Branch b1_4
Description: When migrating a subset of services from a node (e.g. failback
from a failover service node) the remaining services would
time out and evict clients.
Details : lconf --force (implied by --failover) sets the global obd_timeout
to 5 seconds in order to quickly disconnect, but this caused
other RPCs to time out too quickly. Do not change the global
obd_timeout for force cleanup, only set it for DISCONNECT RPCs.
b=6395, b=9514
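
The fix described in this entry can be sketched as a per-RPC timeout choice instead of a change to the global value. All names below (`sketch_opcode`, `sketch_rpc_timeout`) are hypothetical and only illustrate the idea; they are not taken from lconf or the Lustre sources.

```c
/* Hypothetical RPC classes; only DISCONNECT gets the short timeout. */
enum sketch_opcode { SKETCH_OP_DISCONNECT, SKETCH_OP_OTHER };

#define SKETCH_FORCE_TIMEOUT 5 /* seconds, the value named in the entry */

/* Returns the timeout for one RPC. The global obd_timeout is passed in
 * and never modified, so other RPCs keep their normal timeout. */
static int sketch_rpc_timeout(enum sketch_opcode op, int obd_timeout,
                              int force)
{
    if (force && op == SKETCH_OP_DISCONNECT &&
        obd_timeout > SKETCH_FORCE_TIMEOUT)
        return SKETCH_FORCE_TIMEOUT;
    return obd_timeout;
}
```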
adilger [Wed, 26 Oct 2005 00:30:10 +0000 (00:30 +0000)]
Branch b1_4
Use "if (likely(!ext3_mb_agressive))" since it is likely that this is not set
(it defaults to off). Also, check this first since it is the likely case.
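
A minimal sketch of the branch-hint pattern this entry describes, assuming a GCC or Clang compiler that provides `__builtin_expect`. The macro, flag, and function names are hypothetical and are not the mballoc code.

```c
/* Assumption: __builtin_expect is available (GCC/Clang). */
#define sketch_likely(x) __builtin_expect(!!(x), 1)

static int sketch_aggressive; /* hypothetical flag, defaults to off (0) */

/* Returns 0 on the common fast path, 1 if the aggressive path ran. */
static int sketch_alloc_path(void)
{
    if (sketch_likely(!sketch_aggressive))
        return 0; /* likely case is tested first and returns early */
    return 1;
}
```

The hint does not change behavior; it only tells the compiler which branch to lay out as the fall-through path.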
nic [Tue, 25 Oct 2005 22:54:10 +0000 (22:54 +0000)]
p=alex
- fix for write performance slowdown on 2.6 kernels. Return early when not in
'aggressive' mode and default aggressive mode to off
adilger [Tue, 25 Oct 2005 19:57:43 +0000 (19:57 +0000)]
Branch b1_4
Don't get ll_inode_size_lock() in ll_update_inode() as this can be called
with inode_lock (spinlock) held and deadlock. This was protecting the
setting of lli_smd to prevent ll_inode_size_unlock() from inconsistently
calling lov_stripe_unlock() when it was never locked because lli_smd changed
since ll_inode_size_lock() was called.
We now avoid this race by only ever calling ll_inode_size_lock() with
lli_smd already set, or with "lock_lsm = 0" so we don't care if it changes
between lock and unlock. This makes sense in any case, because if there is
no lli_smd we shouldn't be doing glimpse/enqueue on the OSTs anyways.
b=9547
r=nikita
nathan [Tue, 25 Oct 2005 19:37:04 +0000 (19:37 +0000)]
Branch b1_4
b=9477
r=adilger
- robustify mtime check
- forgot to add back in part of --service= checks
nathan [Tue, 25 Oct 2005 18:22:05 +0000 (18:22 +0000)]
Branch b1_4
b=8080
r=adilger
Change magic for new lmd
pjkirner [Tue, 25 Oct 2005 15:55:30 +0000 (15:55 +0000)]
* Fix for redundant routing on Catamount
* Cleanup of some logging build warnings
* Some additional logging
nathan [Tue, 25 Oct 2005 15:20:15 +0000 (15:20 +0000)]
b=8080
create /dev/lnet
lsy [Tue, 25 Oct 2005 11:36:40 +0000 (11:36 +0000)]
fixes for nikita's inspection.
adilger [Tue, 25 Oct 2005 06:39:23 +0000 (06:39 +0000)]
Branch b1_4
Fix bug 9482 regression for fix to 2.6 llap_shrink_cache() page cleanup.
b=6450, b=9482
r=green
eeb [Tue, 25 Oct 2005 00:07:44 +0000 (00:07 +0000)]
* socklnd: fixed my stupid blunder that could cause the assertion
failure...
LustreError: 20480:0:(socklnd_cb.c:788:ksocknal_launch_packet())
ASSERTION(peer->ksnp_accepting > 0 ||
ksocknal_find_connecting_route_locked(peer) != NULL) failed
* iiblnd: fixed connection race and tested on boston, but didn't manage
to exercise the race resolution code.
ericm [Mon, 24 Oct 2005 17:09:22 +0000 (17:09 +0000)]
branch: b1_4
lconf recognizes option --user_xattr (fake) to make ltest happy.
eeb [Mon, 24 Oct 2005 16:27:40 +0000 (16:27 +0000)]
* openiblnd: dropped unused 'rc' from kibnal_peer_connect_failed() to bring
it into line with iib,vib.
* iiblnd: coded connection race resolution
brian [Mon, 24 Oct 2005 16:27:30 +0000 (16:27 +0000)]
Make release 1.4.5.92 on current b1_4 head.
eeb [Sun, 23 Oct 2005 17:12:54 +0000 (17:12 +0000)]
* 9561: completed connection race fix for socklnd
eeb [Sun, 23 Oct 2005 14:08:15 +0000 (14:08 +0000)]
* socklnd:
fixed connection race that can occur with multiple routers
changed 'typed_conns' module parameter to RO
* nidstrings:
allow 0xnnnn parsing of numerical NIDs
libcfs_num_addr2str -> libcfs_decnum_addr2str (LO, QSW, PTL)
libcfs_hexnum_addr2str (GM)
* gmlnd:
change from decimal to hex representation of GM addresses (they're the
lowest 4 bytes of the NIC's MAC address).
* router:
compare routers first on # uncompleted bytes, then credits
for better load balance.
* llmount:
fixed error message (it's a NID, not a host)
fixed bug in checking mdx & profile string lengths
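
The router ordering mentioned under "router" above might look like the comparator below. This is a sketch under stated assumptions: the struct and function names are hypothetical, and the direction of each comparison (fewer uncompleted bytes preferred, then more credits) is an assumption, since the entry only gives the order of the criteria.

```c
#include <stdint.h>

struct sketch_router {
    uint64_t queued_bytes; /* bytes sent but not yet completed */
    int credits;           /* available send credits */
};

/* Returns <0 if a is preferred, >0 if b is preferred, 0 if tied.
 * Assumption: fewer uncompleted bytes wins first, then more credits. */
static int sketch_router_cmp(const struct sketch_router *a,
                             const struct sketch_router *b)
{
    if (a->queued_bytes != b->queued_bytes)
        return a->queued_bytes < b->queued_bytes ? -1 : 1;
    if (a->credits != b->credits)
        return a->credits > b->credits ? -1 : 1;
    return 0;
}
```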
eeb [Sun, 23 Oct 2005 14:08:10 +0000 (14:08 +0000)]
* socklnd:
fixed connection race that can occur with multiple routers
changed 'typed_conns' module parameter to RO
* nidstrings:
allow 0xnnnn parsing of numerical NIDs
libcfs_num_addr2str -> libcfs_decnum_addr2str (LO, QSW, PTL)
libcfs_hexnum_addr2str (GM)
* gmlnd:
change from decimal to hex representation of GM addresses (they're the
lowest 4 bytes of the NIC's MAC address).
* router:
compare routers first on # uncompleted bytes, then credits
for better load balance.
* llmount:
fixed error message (it's a NID, not a host)
fixed bug in checking mdx & profile string lengths
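
The "nidstrings" items above concern accepting 0xnnnn input and printing addresses as decimal or hex. A minimal sketch using only the C standard library follows; the `sketch_` functions are hypothetical and their output formats are assumptions, not the libcfs implementations.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a numeric address given as decimal or 0x-prefixed hex.
 * Note: base 0 also treats a leading "0" as octal.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int sketch_parse_num_addr(const char *str, uint32_t *addr)
{
    char *end;
    unsigned long val;

    if (!isdigit((unsigned char)str[0]))
        return -1; /* rejects signs, whitespace, and empty strings */
    errno = 0;
    val = strtoul(str, &end, 0);
    if (errno != 0 || *end != '\0' || val > UINT32_MAX)
        return -1;
    *addr = (uint32_t)val;
    return 0;
}

/* Decimal form (assumed format). */
static void sketch_decnum_addr2str(uint32_t addr, char *buf, size_t len)
{
    snprintf(buf, len, "%u", (unsigned)addr);
}

/* Hex form (assumed format). */
static void sketch_hexnum_addr2str(uint32_t addr, char *buf, size_t len)
{
    snprintf(buf, len, "0x%x", (unsigned)addr);
}
```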
adilger [Sat, 22 Oct 2005 07:36:18 +0000 (07:36 +0000)]
Branch b1_4
Save "options" from the wrath of strtok() so we can save them into /etc/mtab.
This is needed for buffalo testing of new features, among other things.
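
The problem behind this entry is that strtok() overwrites delimiters in its input with NUL bytes. A minimal sketch of the usual remedy, tokenizing a private copy so the original string survives, is shown below; `sketch_count_options` is a hypothetical helper, not the mount utility's code.

```c
#include <stdlib.h>
#include <string.h>

/* Counts comma-separated options without modifying the caller's string.
 * Returns -1 if the copy cannot be allocated. */
static int sketch_count_options(const char *options)
{
    size_t len = strlen(options) + 1;
    char *copy = malloc(len);
    int count = 0;
    char *tok;

    if (copy == NULL)
        return -1;
    memcpy(copy, options, len);
    /* strtok() scribbles on "copy" only; "options" stays intact. */
    for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ","))
        count++;
    free(copy);
    return count;
}
```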
brian [Fri, 21 Oct 2005 22:33:34 +0000 (22:33 +0000)]
Array element access in bash MUST be enclosed with {}.
eeb [Fri, 21 Oct 2005 16:15:45 +0000 (16:15 +0000)]
* Changed iiblnd to use Infinicon iba_xxx() API.
* Changed iiblnd to post multiple work items in 1 go.
eeb [Fri, 21 Oct 2005 15:22:28 +0000 (15:22 +0000)]
* iiblnd fixes (mid-way through changing Infinicon API)
* viblnd fix (tx_waiting not cleared on RDMA ops initiated on a new
connection that fails triggers an assertion failure).
* some prep for userspace ip2nets
* router selection round robins if other selection criteria are equal
* local_nid_dist_zero LNET module param for single-node LND testing.
* reformat LNET /proc buffer displays
* rename userspace tcplnd env params TCPNAL_xxx -> TCPLND_xxx
pjkirner [Fri, 21 Oct 2005 13:48:57 +0000 (13:48 +0000)]
* Make echo_test a proper liblustre test, now that it works even on Catamount!
nikita [Fri, 21 Oct 2005 10:34:56 +0000 (10:34 +0000)]
fix unnecessary line-wrap
nikita [Fri, 21 Oct 2005 10:13:09 +0000 (10:13 +0000)]
ENTRY has to be matched by RETURN()
niu [Fri, 21 Oct 2005 08:33:59 +0000 (08:33 +0000)]
Disable compiling userspace quota stuff for b1_4. This is a short-term solution
for the <linux/quota.h> problem before b1_4_quota lands.
-b 9542
-r adilger
lsy [Fri, 21 Oct 2005 03:28:06 +0000 (03:28 +0000)]
remove unused LC_CONFIG_QUOTA in lustre-build.m4.
remove check for if_dqblk/dqinfo.
change parameter types from int to __u32, since they are used
in ioctl, and make them aligned (according to Andreas' advice).
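
The reason for fixed-width, aligned ioctl parameters can be illustrated with a small sketch. The struct below is hypothetical (its field names are invented, and `uint32_t` stands in for the kernel's __u32); it assumes a C11 compiler for `_Static_assert`.

```c
#include <stdint.h>

/* Hypothetical ioctl argument. Fixed-width fields give the same layout
 * to 32-bit and 64-bit callers. */
struct sketch_quota_arg {
    uint32_t cmd;
    uint32_t type;
    uint32_t id;
    uint32_t padding; /* explicit pad keeps the size a multiple of 8 */
};

/* Compile-time check that the layout is the expected 16 bytes. */
_Static_assert(sizeof(struct sketch_quota_arg) == 16,
               "ioctl argument layout must not depend on the ABI");
```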
nathan [Thu, 20 Oct 2005 19:45:29 +0000 (19:45 +0000)]
add configure warning if building without CONFIG_KMOD
nathan [Thu, 20 Oct 2005 19:17:37 +0000 (19:17 +0000)]
Branch b1_4
allow auto-loading of lustre/LNET modules
nathan [Thu, 20 Oct 2005 18:23:35 +0000 (18:23 +0000)]
Branch b1_4
b=none
fix 'make dist' err with llog_reader
niu [Thu, 20 Oct 2005 09:45:15 +0000 (09:45 +0000)]
all children of b1_4 should use same lnet
niu [Thu, 20 Oct 2005 08:16:59 +0000 (08:16 +0000)]
define PORTAL_SYMBOL_GET/PUT for userspace
pjkirner [Thu, 20 Oct 2005 02:13:54 +0000 (02:13 +0000)]
* Added command line support for liblustre echo_test
* Added "hack" to allow echo_test to build on Catamount
nathan [Wed, 19 Oct 2005 21:50:44 +0000 (21:50 +0000)]
b=8080
better errors for 2.4 module autoloading
nathan [Wed, 19 Oct 2005 21:49:51 +0000 (21:49 +0000)]
b=8080
better errors for 2.4 module autoloading
green [Wed, 19 Oct 2005 18:58:42 +0000 (18:58 +0000)]
b=9482
r=adilger
Check that there is no page_mapped() before trying to define it for 2.4 kernels.
nathan [Wed, 19 Oct 2005 18:23:21 +0000 (18:23 +0000)]
Branch b_hd_newconfig
b=8080
add symlink from /proc/sys/portals to /proc/sys/lnet for old failover
scripts
pjkirner [Wed, 19 Oct 2005 18:21:09 +0000 (18:21 +0000)]
* Fixed a few problems in ptllnd_set_txiov relating to iovec mapping
* Remove a "paranoia" LASSERT that was not valid for Cray Portals
pjkirner [Wed, 19 Oct 2005 04:39:03 +0000 (04:39 +0000)]
* Fix case of PTLLND not being included because of incorrect #if value
nathan [Wed, 19 Oct 2005 00:41:25 +0000 (00:41 +0000)]
Branch b1_4
b=9477
These lines should not have been removed in the 9477 patch. They prevent
--select from limiting services in linux 2.4
ericm [Tue, 18 Oct 2005 21:37:30 +0000 (21:37 +0000)]
b1_4_acl has updated from b1_4, use same portals/lnet as b1_4
nikita [Tue, 18 Oct 2005 19:27:26 +0000 (19:27 +0000)]
Forgotten chunks of 7133 patch.
nathan [Tue, 18 Oct 2005 19:14:35 +0000 (19:14 +0000)]
Branch b1_4
b=9477
exempt bad/incomplete nids from network ping test
nathan [Tue, 18 Oct 2005 18:26:37 +0000 (18:26 +0000)]
b=8080
- modules aren't as automatic as they should be for linux 2.4
- for some reason python in linux 2.4 didn't understand this lmc construct
nathan [Tue, 18 Oct 2005 18:23:08 +0000 (18:23 +0000)]
b=8080
- modules aren't as automatic as they should be for linux 2.4
- for some reason python in linux 2.4 didn't understand this lmc construct
nathan [Tue, 18 Oct 2005 16:38:26 +0000 (16:38 +0000)]
b=8080
Landing LNET (b1_4_newconfig)
nathan [Tue, 18 Oct 2005 16:12:39 +0000 (16:12 +0000)]
b=8080
Landing LNET (b1_4_newconfig)
green [Mon, 17 Oct 2005 21:26:53 +0000 (21:26 +0000)]
Branch: b1_4
b=9482
r=adilger
Try to unmap pages before discarding them in llap_shrink_cache.
gord [Mon, 17 Oct 2005 19:25:47 +0000 (19:25 +0000)]
Commented out LC_CONFIG_QUOTA line, which had no definition.
pjkirner [Mon, 17 Oct 2005 19:13:03 +0000 (19:13 +0000)]
* Fix XT3 build issues
eeb [Sat, 15 Oct 2005 18:33:25 +0000 (18:33 +0000)]
* Changed linking of userspace ptllnd to depend on HAVE_CRAY_XT3 rather than
!HAVE_LIBPTHREAD.
nikita [Sat, 15 Oct 2005 17:26:04 +0000 (17:26 +0000)]
An optimization proposed by Andreas: do not grow (as part of ldlm lock policy)
extent locks acquired on server (e.g., OST-side locks introduced by previous
7311 fixes): server-side locks are not cached and would only conflict with
other threads.
b=7311
r=adilger
eeb [Sat, 15 Oct 2005 16:46:21 +0000 (16:46 +0000)]
* First cut iiblnd (compiles but untested)
* Removed #if LNET_SINGLE_THREADED and replaced with #if !HAVE_LIBPTHREAD
* Fixed LND module descriptions to say LND (not NAL)
* viblnd cleanups (removed unused struct members, fixed some formatting etc)
* minor cleanup in text buffer allocations (lnet/lnet/config.c)
* format string fix in klnd/ptllnd.c
* fixed lustre/utils/obd.c to work without libpthread/fork (disables --threads)
green [Fri, 14 Oct 2005 20:07:42 +0000 (20:07 +0000)]
b=7293
r=adilger
Add possibility (config option) to show minimal available OST free space
pjkirner [Fri, 14 Oct 2005 19:36:09 +0000 (19:36 +0000)]
* Rename max_immd_size -> max_msg_size to reduce further confusion, and align with Catamount PTLLND
pjkirner [Fri, 14 Oct 2005 19:31:48 +0000 (19:31 +0000)]
Catamount PTLLND now runs sanity
* Resolved a number of uninitialized variables
* Remove unused plb_refcount
* ptllnd_close_peer() released ref (could have been last) and then touched peer pointer.
* .lnd_wait was not defined in ptllnd.c causing LNET to reject it as a valid LND in single thread environment.
* Incorrect check on buf->plb_posted caused assert to be triggered in destroy_buffer
* buf->plb_posted was never set to 0 on unlink causing infinite loop.
* plni_nposted_buffers was never decremented, so it would hit an ASSERT
* Portals NID was not retrieved, and LNET NID was not created, causing init failure.
* type == INVALID value caused active_rdma to always issue PtlPut()
* (CLEANUP) variable ni, masks parameter ni
* Credits not given to peer correctly in HELLO message.
* Credits never decremented
* Matchbits did not handle reserved properly
* Bulk MD/ME should be inserted BEFORE (PTL_INS_BEFORE) all the small message MD for performance reasons.
* Missing PTL_ACK_DISABLE
* HELLO Message payload was being swab'ed even when it wasn't supposed to be.
* Some complex syntax of x?y:z combined with too few parentheses was causing md.options to be set incorrectly and messages to be dropped
* MD options not set properly in ptllnd_active_rdma
* MD threshold calculation compares against invalid type
* ptllnd_recv() used rlen rather than mlen
* Fixes for max_immd_size being max message size, not max immediate payload size.
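
The md.options item in the list above describes a conditional expression with too few parentheses. The sketch below shows one way such a bug arises in C; the flag names and functions are hypothetical and are not the ptllnd code. Compilers may warn about the buggy form, which is intentional here.

```c
#define SKETCH_OPT_BASE 0x1u
#define SKETCH_OPT_PUT  0x2u
#define SKETCH_OPT_GET  0x4u

/* Buggy: '|' binds tighter than '?:', so this parses as
 * (SKETCH_OPT_BASE | is_put) ? SKETCH_OPT_PUT : SKETCH_OPT_GET
 * and the base flag is lost. */
static unsigned sketch_options_buggy(int is_put)
{
    return SKETCH_OPT_BASE | is_put ? SKETCH_OPT_PUT : SKETCH_OPT_GET;
}

/* Fixed: parenthesize the conditional so the base flag is always set. */
static unsigned sketch_options_fixed(int is_put)
{
    return SKETCH_OPT_BASE | (is_put ? SKETCH_OPT_PUT : SKETCH_OPT_GET);
}
```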
adilger [Fri, 14 Oct 2005 17:39:06 +0000 (17:39 +0000)]
Branch b1_4_newconfig
- lustrecvs already has a regexp for all v* tags
- change buildcvs regexp to include lnet for v1.4.5.10+
mjmac [Fri, 14 Oct 2005 14:51:25 +0000 (14:51 +0000)]
Fixes to allow the harness to build v1_4_5_91.
alex [Fri, 14 Oct 2005 11:36:42 +0000 (11:36 +0000)]
b=7314
r=adilger,alex (original patch from Brian Behlendorf)
- adds ldiskfs tunables for mballoc
adilger [Fri, 14 Oct 2005 11:20:08 +0000 (11:20 +0000)]
Branch b1_4_newconfig
Bah, shouldn't have committed this.
adilger [Fri, 14 Oct 2005 11:08:10 +0000 (11:08 +0000)]
Merge b1_4_newconfig from b1_4 (20051014_0359)
Description: data loss during non-page-aligned writes to a single file from
both multiple nodes and multiple threads on one node at same time
Details : updates to KMS and lsm weren't protected by common lock. Resulting
inconsistency led to false short-reads, that were cached and later
used by ->prepare_write() to fill in partially written page,
leading to data loss.
b=5047
Description: lconf --abort_recovery fails with 'Operation not supported'
Details : lconf was attempting to abort recovery on the MDT device and not
the MDS device
b=7047
Description: add support for EAs (user and system) on lustre filesystems
Details : it is now possible to store extended attributes in the Lustre
client filesystem, and with the user_xattr mount option it
is possible to allow users to store EAs on their files also
b=8592
Fix sanity.sh test 56, 65a.
alex [Fri, 14 Oct 2005 10:20:29 +0000 (10:20 +0000)]
b=9516
r=alex
- limit number of in-flight async destroy rpcs MDS issues
to destroy OST objects
adilger [Fri, 14 Oct 2005 09:57:47 +0000 (09:57 +0000)]
Branch b1_4_newconfig
Update the CDEBUG macros to match the current portals HEAD. This
re-instantiates the CDEBUG_LIMIT() macro, and fixes liblustre CDEBUG()
from being a no-op. Improving the CDEBUG_LIMIT() code is the subject
of bug 6411 and can be discussed separately.
Fix conflicting subsystem macros between HEAD and b_hd_newconfig.
b=6411
adilger [Thu, 13 Oct 2005 23:04:41 +0000 (23:04 +0000)]
Branch b1_4
Update build version to 1.4.5.8, last release before LNET/b1_4_newconfig landing.
adilger [Thu, 13 Oct 2005 23:02:31 +0000 (23:02 +0000)]
Branch b1_4
Missing locking for direct IO.
Disable locking check for liblustre, it has many callsites that do not lock
and lov_stripe_{un,}lock() are noops there anyways.
b=5047
ericm [Thu, 13 Oct 2005 22:39:36 +0000 (22:39 +0000)]
branch: b1_4
xattr: minor fix, add comment.
adilger [Thu, 13 Oct 2005 22:37:23 +0000 (22:37 +0000)]
Branch b1_4
- add user_xattr to lov.sh, uml.sh client mount options
- don't run xattr sanity test 102 when not root or mounted with user_xattr
- add nouser_xattr mount option to allow user_xattr to be unset
- add check for linux/xattr_acl.h + compat (doesn't exist on BG/L)
- fix test 56, 65a with default striping on parent directory
- add ChangeLog entry for gmnalnid support to llmount.c
- add ChangeLog entry for xattr support
b=7979, b=8592, b=9504, b=9505
cliffw [Thu, 13 Oct 2005 22:12:10 +0000 (22:12 +0000)]
b=9508
r=adilger@clusterfs.com
Reverted 2.4 kernels
cliffw [Thu, 13 Oct 2005 21:53:49 +0000 (21:53 +0000)]
b=9508
r=adilger@clusterfs.com
When both gcc33 and gcc32 are available, we should use gcc33
eeb [Thu, 13 Oct 2005 18:24:45 +0000 (18:24 +0000)]
* the userspace version of LIBCFS_ALLOC() stopped zeroing memory
This was reported as bug 9506
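
Callers of a zeroing allocation macro depend on the returned memory being cleared. A minimal userspace sketch of that guarantee, using calloc(), is shown below; `sketch_alloc_zeroed` is a hypothetical name and not the LIBCFS_ALLOC() definition.

```c
#include <stdlib.h>

/* Hypothetical userspace allocator that always returns zeroed memory
 * (or NULL on failure). calloc() performs the zeroing. */
static void *sketch_alloc_zeroed(size_t size)
{
    return calloc(1, size);
}
```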
pjkirner [Thu, 13 Oct 2005 15:53:56 +0000 (15:53 +0000)]
* Now handle portals pid correctly (To support catamount clients)
nikita [Thu, 13 Oct 2005 14:39:20 +0000 (14:39 +0000)]
move misplaced comment.
ericm [Thu, 13 Oct 2005 05:23:42 +0000 (05:23 +0000)]
branch: b1_4
land b1_4_xattr: support manipulating user extended attributes.
nikita [Wed, 12 Oct 2005 20:46:16 +0000 (20:46 +0000)]
cleanup llap_from_page():
- add explicit LLAP_FROM_REMOVEPAGE
- make llap_from_page() static
- fix inverted condition in llap_from_page().
b=5047
r=adilger
nathan [Wed, 12 Oct 2005 20:45:30 +0000 (20:45 +0000)]
Branch b1_4
b=9445
nikita [Wed, 12 Oct 2005 19:02:39 +0000 (19:02 +0000)]
remove unneeded conditional compilation wrappers.
nic [Wed, 12 Oct 2005 18:11:47 +0000 (18:11 +0000)]
b=7047
p=adilger
r=nic
fix typo that was preventing lconf --abort_recovery from working
nic [Wed, 12 Oct 2005 17:58:10 +0000 (17:58 +0000)]
b=7047
p=adilger
r=nic
fix typo that was preventing lconf --abort_recovery from working
nikita [Wed, 12 Oct 2005 10:59:38 +0000 (10:59 +0000)]
typo fix.
nikita [Wed, 12 Oct 2005 10:55:00 +0000 (10:55 +0000)]
Add locking to provide consistency between kms and lsm.
b=5047
r=nikita
r=adilger
nathan [Tue, 11 Oct 2005 19:26:40 +0000 (19:26 +0000)]
Branch b_hd_newconfig
r=eeb
only drop self-reference on error, not already-initted return value
nic [Mon, 10 Oct 2005 23:59:13 +0000 (23:59 +0000)]
make sure we actually build the drivers...
nathan [Mon, 10 Oct 2005 22:50:21 +0000 (22:50 +0000)]
b=8076
update from b1_4
nic [Mon, 10 Oct 2005 22:16:12 +0000 (22:16 +0000)]
add qsnet patch for 2.6-rhel4
nic [Mon, 10 Oct 2005 20:22:44 +0000 (20:22 +0000)]
update to latest update from Suse.
pjkirner [Mon, 10 Oct 2005 13:53:54 +0000 (13:53 +0000)]
* weak alias doesn't seem to be working especially on the XT3 Catamount build system. Go to a more explicit registration based on #ifdefs.
pjkirner [Mon, 10 Oct 2005 13:45:45 +0000 (13:45 +0000)]
* Added include/libcfs/types.h to fix XT3 (Catamount) build error (i.e. no "types.h" in that environment)
nikita [Mon, 10 Oct 2005 06:52:34 +0000 (06:52 +0000)]
check returned value
nikita [Mon, 10 Oct 2005 06:36:45 +0000 (06:36 +0000)]
liblustre/tests/sanity.c: add test 51 to test for regression in
ldlm_cli_enqueue() introduced by 7311 fix.