Whamcloud - gitweb
eeb [Tue, 4 Oct 2005 15:07:49 +0000 (15:07 +0000)]
* Added lctl commands...
which_nid NID [NID...] prints the closest NID
list_nids [all] lists local NIDs (all to include lo)
* Stopped 'lctl network' from printing local NIDs when passed 0 args; this
arg overloading was a bit hackish anyway.
pjkirner [Tue, 4 Oct 2005 15:04:58 +0000 (15:04 +0000)]
* Fix XT3 build errors on Catamount (that were already resolved on Linux XT3 build) by making a common file that contains the necessary definitions.
eeb [Sat, 1 Oct 2005 11:48:42 +0000 (11:48 +0000)]
* Updated gm-reg-phys patch with the fix from Myricom that allows
userspace programs to link again.
pjkirner [Fri, 30 Sep 2005 17:58:06 +0000 (17:58 +0000)]
* Fix leaking MD on Put on sender side. Introduced when I added threshold +1 for get operations, which should only have been on the recv side.
* A few more stats to track MDs
* Cleanup a few ASSERTS
pjkirner [Fri, 30 Sep 2005 14:31:16 +0000 (14:31 +0000)]
* Cleanup stats
they are unsigned
shorten names
fix typo
add new stat for credit noop msg
* Fix bug with bucket check rounding error
pjkirner [Fri, 30 Sep 2005 13:07:52 +0000 (13:07 +0000)]
* Adjusted comment on PTL_MD_MAX_IOV now that I've got confirmation from Cray on this fact
pjkirner [Fri, 30 Sep 2005 13:03:38 +0000 (13:03 +0000)]
* Remove unused code
eeb [Fri, 30 Sep 2005 08:23:46 +0000 (08:23 +0000)]
* correction to ip2nets matching code
eeb [Fri, 30 Sep 2005 05:51:17 +0000 (05:51 +0000)]
* Added the 'ip2nets' lnet module parameter. It may be specified
instead of 'networks', and it consists of a list of networks and IP match
expressions. If any IP addresses on the node match the IP match
expression, the given network is added to the list of local networks.
This (along with recent route table fixes etc) allows a single site-wide
modprobe.conf
* Included a recursion check in lnet_finalize(). Recursion through
lnet_return_credits_locked() was possible if a set of blocked buffers
suddenly all complete in-line (e.g. when a peer dies). The recursion check
limits the number of threads actually processing finalized messages to the
number of CPUs.
* Cosmetic changes to /proc/lnet/{buffers,nis,peers} to make the
buffer/credit stats more readable.
* Fixed configure-on-load (it used to deadlock in modprobe when lnet loaded
LNDs) and made it the default.
* Added a shell script 'lnetunload' to unload lnet and any loaded LNDs.
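The 'ip2nets' matching described in this commit can be illustrated with a minimal Python sketch. The rule format (a network name paired with a glob-style IP pattern) and the function name match_ip2nets are assumptions made for illustration; they are not the module's actual parser or parameter syntax.

```python
from fnmatch import fnmatch

def match_ip2nets(rules, local_ips):
    """Return the networks whose IP pattern matches any local address.

    rules: list of (network, pattern) pairs in declaration order.
    The glob-style pattern syntax is a stand-in for the real
    IP match expressions, which this sketch does not reproduce.
    """
    nets = []
    for net, pattern in rules:
        # Add each network once, if any local IP matches its pattern.
        if net not in nets and any(fnmatch(ip, pattern) for ip in local_ips):
            nets.append(net)
    return nets

# Hypothetical site-wide rule list shared by every node.
rules = [("tcp0", "192.168.0.*"), ("elan0", "10.1.*.*")]
local_nets = match_ip2nets(rules, ["192.168.0.17", "127.0.0.1"])
```

Because every node evaluates the same rule list against its own addresses, one configuration can serve the whole site, which is the point the commit message makes about a single modprobe.conf.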
liangzhen [Fri, 30 Sep 2005 04:27:05 +0000 (04:27 +0000)]
- Add new option --disable-libpthread for building the Lustre library;
usocklnd will not be built when the option is used.
- Fix build problem when there is no libpthread on the system.
- Change configuration message to make it clearer.
pjkirner [Thu, 29 Sep 2005 20:55:13 +0000 (20:55 +0000)]
* Fix build breakage on XT3 introduced earlier today
pjkirner [Thu, 29 Sep 2005 20:04:49 +0000 (20:04 +0000)]
* make scheduler threads balance rx and tx more evenly
* move peer allocation outside spin lock
* fix bug with irqs not being disabled on spinlocks that need synchronization with callbacks
eeb [Thu, 29 Sep 2005 14:01:10 +0000 (14:01 +0000)]
* PTL_MTU,PTL_MD_MAX_IOV -> LNET_MTU,LNET_MAX_IOV
* moved portals mtu and max iov out of lib-types.h into types.h so ptllnd
can see PTL_MTU/PTL_MD_MAX_IOV
CAVEAT EMPTOR: this might break ptllnd builds with non-lustre portals
since these defines are not in the official spec. Also, ptllnd really
needs the underlying portals to cope with LNET's MTU and MAX_IOV; it
simply asserts it currently.
eeb [Thu, 29 Sep 2005 13:16:35 +0000 (13:16 +0000)]
* cleaned up socklnd descriptor handling
eeb [Thu, 29 Sep 2005 12:13:04 +0000 (12:13 +0000)]
* More PORTALS -> LND renaming
eeb [Thu, 29 Sep 2005 00:11:36 +0000 (00:11 +0000)]
* Changed LNetGet() LNetPut() to include a source nid parameter
Set this to LNET_NID_ANY to have the source nid set automatically,
otherwise it checks that the destination is reachable from the given
source NID and sends using that. This ensures the source NID is
deterministic in the presence of equivalent routes.
* Changed LNetDist() to return the count of networks traversed rather than
routers (i.e. 0 == local, 1 == local net, 2 == via 1 router etc). It also
returns a source NID (== NID of local LNET interface that will send).
* Changed LNet() completion events to include the target process ID so the app
can tell which NID the sender sent to.
* Removed the 'implicit_loopback' lnet module parameter. You have to use
0@lo explicitly if you want loopback.
* Changed lustre to remember the source NID returned by LNetDist() to ensure
client requests always arrive with a consistent initiator NID.
* Changed lustre to remember the target NID of incoming requests to ensure
server replies are always sent with the same source NID that the client
sent her requests to.
* Changed lustre to use a NID of 0@lo if the distance to the target NID is 0
(i.e. use the loopback LNET interface when talking to local servers).
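The distance semantics above (0 == local, 1 == local net, 2 == via 1 router) and the loopback rule can be sketched in Python. The function names, the NID string handling, and the shape of the routes table are all hypothetical; this is not the LNetDist() implementation.

```python
LOOPBACK_NID = "0@lo"

def net_of(nid):
    """Return the network part of an 'address@network' NID string."""
    return nid.split("@", 1)[1]

def lnet_dist(target_nid, local_nids, routes):
    """Return (distance, source_nid) for a target NID.

    distance counts networks traversed: 0 == the target is a local NID,
    1 == the target is on a local network, 1 + N == via N routers.
    routes maps a remote network to (router_count, local source NID);
    that table shape is an assumption of this sketch.
    """
    if target_nid in local_nids:
        return 0, target_nid
    net = net_of(target_nid)
    for nid in local_nids:
        if net_of(nid) == net:
            return 1, nid
    if net in routes:
        routers, source_nid = routes[net]
        return 1 + routers, source_nid
    return None, None  # unreachable in this sketch

def nid_to_use(target_nid, local_nids, routes):
    """Use the loopback NID when the target is at distance 0."""
    dist, _ = lnet_dist(target_nid, local_nids, routes)
    return LOOPBACK_NID if dist == 0 else target_nid

local_nids = ["192.168.0.1@tcp0"]
routes = {"elan0": (1, "192.168.0.1@tcp0")}
```

Remembering the returned source NID, as the lustre changes in this commit do, keeps the initiator NID stable even when several equivalent routes exist.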
pjkirner [Wed, 28 Sep 2005 18:01:31 +0000 (18:01 +0000)]
* Update license text for all PTLLND from phil's boilerplate
pjkirner [Wed, 28 Sep 2005 17:52:16 +0000 (17:52 +0000)]
* Add eager_recv
eeb [Wed, 28 Sep 2005 10:07:00 +0000 (10:07 +0000)]
* minor build fixes for ulnds/ptllnd
pjkirner [Wed, 28 Sep 2005 00:35:48 +0000 (00:35 +0000)]
* Add ptllnd.h to libptllnd_a_SOURCES
pjkirner [Wed, 28 Sep 2005 00:01:08 +0000 (00:01 +0000)]
* Commit small patch for lnet to allow Lustre to build against a 2.6.12 kernel.
(From adliger)
eeb [Tue, 27 Sep 2005 15:33:01 +0000 (15:33 +0000)]
* first cut ulnds/ptllnd
pjkirner [Tue, 27 Sep 2005 13:51:27 +0000 (13:51 +0000)]
Patch by Liang
1. lnet/ulnds/socklnd/debug.c will not be built anymore; lnet/libcfs/libcfs.a will be created when building liblustre, and the functions in lnet/ulnds/socklnd/debug.c are moved to lnet/libcfs/debug.c.
2. LNET_SINGLE_THREADED will be set to 1 if no libpthread is found.
3. usocklnd will not be built if there is no libpthread (I'm not sure if it's reasonable)
pjkirner [Mon, 26 Sep 2005 14:15:28 +0000 (14:15 +0000)]
* Cleanup NULL ptr usage
pjkirner [Mon, 26 Sep 2005 13:54:42 +0000 (13:54 +0000)]
* Complete the partial checkin of the MSG_TYPE_PUT typo. The build works again.
eeb [Sun, 25 Sep 2005 18:31:01 +0000 (18:31 +0000)]
* fixed typo PTLLND_MSG_TYPE_PUT
eeb [Fri, 23 Sep 2005 16:25:43 +0000 (16:25 +0000)]
* Added check for non-deterministic routes
pjkirner [Fri, 23 Sep 2005 15:00:56 +0000 (15:00 +0000)]
* Resolved "unload problem" due to PTL_MD_LUSTRE_COMPLETION_SEMANTICS not being specified explicitly when running on Cray Portals.
* Removed TESTING_WITH_LOOPBACK (early unit testing code will never have any value in the future)
* Added more comments on the differences between Portals implementations in the headers
eeb [Fri, 23 Sep 2005 14:30:47 +0000 (14:30 +0000)]
* Changed parsing of 'routes=' to allow routes via different networks
with different hopcounts; it just uses the shortest hopcount.
NB this still needs a change to LNet{Put,Get} to allow the caller to
specify the source NI and for ptlrpc to use it.
* forwarding="enabled" explicitly enables the node as a router
forwarding="disabled" explicitly disables the node as a router
otherwise if any of the node's NIDs are mentioned as routers in 'routes='
the node is implicitly enabled as a router.
* Added LNET_SINGLE_THREADED to tell LNET if the userspace runtime supports
posix threads.
* Added lnd_wait() to tell LNET to call into the LND if the APP wants to
block for an event.
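The forwarding rules in this commit reduce to a small decision, sketched here in Python. The function name is_router and the way gateways are passed in (as a plain collection of NIDs already extracted from 'routes=') are assumptions for illustration.

```python
def is_router(forwarding, local_nids, route_gateways):
    """Decide whether this node forwards messages.

    forwarding: value of the forwarding setting ("enabled", "disabled",
    or anything else, treated here as unset).
    route_gateways: NIDs named as routers in 'routes=' (hypothetical
    pre-parsed form).
    """
    if forwarding == "enabled":
        return True
    if forwarding == "disabled":
        return False
    # Unset: implicitly enabled if any local NID is named as a router.
    return any(nid in route_gateways for nid in local_nids)

gateways = {"192.168.0.10@tcp0"}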
liangzhen [Fri, 23 Sep 2005 12:01:32 +0000 (12:01 +0000)]
1. Add new option --disable-usocklnd, to disable build & link of libsocklnd.
2. Smallfix for link of liblustre.a
pjkirner [Fri, 23 Sep 2005 02:38:05 +0000 (02:38 +0000)]
* Fixed problem where struct iovec != ptl_md_iovec_t, causing >1 page PtlMDAttach() to fail with error 26.
pjkirner [Thu, 22 Sep 2005 21:29:33 +0000 (21:29 +0000)]
* ptllnd stats via /proc entry
* a few more stats.
pjkirner [Thu, 22 Sep 2005 19:07:37 +0000 (19:07 +0000)]
* Fix KIOV mapping when used with CRAY PORTALS
* Replaced KIOV mapping for LUSTRE PORTALS that was previously there.
pjkirner [Thu, 22 Sep 2005 17:34:14 +0000 (17:34 +0000)]
* Invalid flags to cfs_mem_cache_alloc (Caused oops on XT3)
pjkirner [Thu, 22 Sep 2005 15:42:27 +0000 (15:42 +0000)]
* Remove non-compatible irqs_disabled(). Basically backing out the changes from last night that introduced it.
eeb [Thu, 22 Sep 2005 13:05:19 +0000 (13:05 +0000)]
* Fixed bugs in new lnet_[k]iov2[k]iov() that skipped fragments incorrectly.
These bugs would manifest as data corruption or crashes.
* Reduced verbosity of router buffer allocation CDEBUGs
eeb [Thu, 22 Sep 2005 10:59:57 +0000 (10:59 +0000)]
* removed #include <asm/arch/system.h> from kp30.h
* Fixed bug in lnet_send() that broke portals compatibility.
* Fixed places where lnd_eager_recv() fails but lnet doesn't decref
the relevant peer before freeing the msg.
pjkirner [Thu, 22 Sep 2005 04:15:10 +0000 (04:15 +0000)]
* Use a real PID
* Fixed defect in a case where the code did lock, action, lock rather than lock, action, unlock
* Fixed locking in error path (x2)
* Use Cray SS IFACE type
pjkirner [Thu, 22 Sep 2005 03:23:27 +0000 (03:23 +0000)]
* Fixed asserts
pjkirner [Thu, 22 Sep 2005 03:07:33 +0000 (03:07 +0000)]
* Additional checks for correct locking
pjkirner [Thu, 22 Sep 2005 03:04:31 +0000 (03:04 +0000)]
* Add support for irqs_disabled() on all platforms
pjkirner [Wed, 21 Sep 2005 21:19:21 +0000 (21:19 +0000)]
* Add additional asserts to check caller context
pjkirner [Wed, 21 Sep 2005 20:41:14 +0000 (20:41 +0000)]
* Cleaned up send path
* Fixed lock being held while calling lnet_finalize (only in error path)
* Fixed calls to kptllnd_peer_queue_tx_locked(), to actually hold the lock.
eeb [Wed, 21 Sep 2005 18:45:40 +0000 (18:45 +0000)]
* Moved some #defines out of klnds/ptllnd.h into a shared place
pjkirner [Wed, 21 Sep 2005 18:31:25 +0000 (18:31 +0000)]
* Adjusted TX Pool to meet new LNET requirements (no blocking)
* Fixed bug with routed REPLY
pjkirner [Wed, 21 Sep 2005 18:15:26 +0000 (18:15 +0000)]
* Fix a typo
eeb [Wed, 21 Sep 2005 17:45:46 +0000 (17:45 +0000)]
* Fixed LND registration / setting the_lnet.ln_init ordering
eeb [Wed, 21 Sep 2005 16:54:30 +0000 (16:54 +0000)]
* Added lnet/ulnds/ptllnd
* Changed userspace LNET to register all LNDs it has been linked with
and construct the default set of networks from them.
* Added support for userspace LNET config via environment variables
LNET_NETWORKS and LNET_ROUTES. They work just like the kernel module
parameters.
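As a rough illustration of the environment-based configuration above, a userspace program could pick up the two variables like this. The function name and the returned dictionary are hypothetical; only the variable names LNET_NETWORKS and LNET_ROUTES come from the commit message.

```python
import os

def userspace_lnet_config(environ=None):
    """Read userspace LNET settings from the environment.

    A value of None means the variable was not set; per the commit
    message, userspace LNET then builds its default set of networks
    from the LNDs it was linked with.
    """
    environ = os.environ if environ is None else environ
    return {
        "networks": environ.get("LNET_NETWORKS"),
        "routes": environ.get("LNET_ROUTES"),
    }

# "tcp0" is only an example value.
config = userspace_lnet_config({"LNET_NETWORKS": "tcp0"})
```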
pjkirner [Wed, 21 Sep 2005 14:30:36 +0000 (14:30 +0000)]
* Fixed dist problem with ptllnd_wire.h file that moved
* Fixed dist problem with missing file in libcfs/linux
eeb [Wed, 21 Sep 2005 08:21:08 +0000 (08:21 +0000)]
* fixed a ./config warning
* moved klnds/ptllnd/ptllnd_wire.h into lnet/include/lnet/ so
ulnds/ptllnd can get at it.
* fixed/added some .cvsignore files
* fixed ulnds/tcplnd to give itself some send credits
pjkirner [Wed, 21 Sep 2005 05:13:14 +0000 (05:13 +0000)]
* Fixed some XT3 Compilation issues
* Clarified a few LNET vs PTL ambiguities
* Added code for mapping kiov -> iovec
pjkirner [Wed, 21 Sep 2005 02:08:05 +0000 (02:08 +0000)]
b=7982
* Portals LND
pjkirner [Wed, 21 Sep 2005 00:38:08 +0000 (00:38 +0000)]
* Undoing changes from the b_newconfig_rdmarouting landing that have negatively affected the ptllrpc build.
pjkirner [Tue, 20 Sep 2005 18:10:37 +0000 (18:10 +0000)]
* Fix buffalo build error with removed file.
pjkirner [Tue, 20 Sep 2005 17:43:10 +0000 (17:43 +0000)]
* Removed problematic building of ut (unit test tool) so we can move ahead with buffalo testing.
eeb [Tue, 20 Sep 2005 17:19:08 +0000 (17:19 +0000)]
* removed lnet_parse() rc ambiguity: 0 on success
pjkirner [Tue, 20 Sep 2005 17:00:04 +0000 (17:00 +0000)]
b=7981
* Landing of b_newconfig_rdmarouting
* Passed sanity.sh
* 9348 is still open, but this landing hasn't introduced it.
pjkirner [Mon, 19 Sep 2005 21:56:31 +0000 (21:56 +0000)]
* Fix incorrect path in godb file
pjkirner [Mon, 19 Sep 2005 20:41:31 +0000 (20:41 +0000)]
* Fixed problem with distributed build.
pjkirner [Mon, 19 Sep 2005 13:50:41 +0000 (13:50 +0000)]
* Add simple LNET unit test modules
pjkirner [Fri, 16 Sep 2005 16:42:38 +0000 (16:42 +0000)]
* Fixed build warning (and possible 64-bit error)
pjkirner [Fri, 16 Sep 2005 15:33:32 +0000 (15:33 +0000)]
b=8021
* Landing EEB's b_newconfig_rdmarouting branch
pjkirner [Fri, 16 Sep 2005 13:21:26 +0000 (13:21 +0000)]
* Apply Nikita's patch from portals tree to lnet
pjkirner [Fri, 16 Sep 2005 13:10:48 +0000 (13:10 +0000)]
* Removed extra unnecessary message
liangzhen [Fri, 16 Sep 2005 04:12:40 +0000 (04:12 +0000)]
Fix ptllnd build problem.
pjkirner [Thu, 15 Sep 2005 13:32:28 +0000 (13:32 +0000)]
* Fix 2.6.5 build issue
liangzhen [Thu, 15 Sep 2005 13:17:32 +0000 (13:17 +0000)]
Remove unused socklnd files in ulnds; they have been
moved to ulnds/socklnd.
liangzhen [Thu, 15 Sep 2005 09:56:38 +0000 (09:56 +0000)]
1. two options for build
a. --with-portals=<path to portals>, build ptllnd with external portals
b. --with-lustre-portals, build ptllnd and lustre portals
2. ulnd build patch, tcplnd is built as lnet/ulnds/socklnd
3. smallfix for lnet/ulnds/socklnd
eeb [Thu, 15 Sep 2005 08:54:50 +0000 (08:54 +0000)]
* Moved PTL_{MTU,MD_MAX_IOV} into types.h
* Removed ref in lustre_net.h to lib-types.h and simplified how
PTLRPC_MAX_BRW_{SIZE,PAGES} are defined.
nathan [Wed, 14 Sep 2005 19:04:23 +0000 (19:04 +0000)]
fix build
pjkirner [Wed, 14 Sep 2005 13:22:02 +0000 (13:22 +0000)]
b=9318
r=eeb
* Removed MOST of the instances of CRAY_PORTALS, especially the ones that were related to the networking portion.
* In addition to what was in the patch for 9318, also included the removal of build_check.h and references.
Note: This does NOT complete the work on this bug; there are still a number of outstanding references to CRAY_PORTALS.
pjkirner [Wed, 14 Sep 2005 03:56:39 +0000 (03:56 +0000)]
* Fixed LNET undefined symbol PDE() problem on 2.4 kernel (specifically Cray XT3)
nathan [Tue, 13 Sep 2005 03:28:32 +0000 (03:28 +0000)]
b=8080
update from b1_4
eeb [Mon, 12 Sep 2005 17:41:23 +0000 (17:41 +0000)]
* Changed nal_send() to include 'target_is_router' and 'routing' flags
Where 'target_is_router' == the immediate destination is a router
and 'routing' == This message is being forwarded from another LND.
NB The routing flag isn't set yet (but will be when all routing is done in
lib-move).
* Added support for RDMA-ed REPLYs in all relevant LNDs ready for RDMA
routing. LNDs must send IMMEDIATE GETs if the local node or the target
are routers, but may RDMA the REPLY (just like a PUT) on the return
route.
eeb [Mon, 12 Sep 2005 14:17:32 +0000 (14:17 +0000)]
* viblnd: applied ARP retry patch
eeb [Mon, 12 Sep 2005 13:49:36 +0000 (13:49 +0000)]
* tidied up NID printing (s/LPX64/%s/ && s/nid/libcfs_nid2str(nid)/)
eeb [Sun, 11 Sep 2005 13:54:35 +0000 (13:54 +0000)]
* Cleaned up portals compatibility tests into a couple of inlines
in lib-lnet.h
* Added (but didn't test) portals compatibility support for
gm, openib and ra.
eeb [Sat, 10 Sep 2005 17:12:16 +0000 (17:12 +0000)]
* Added check for portals compatibility mode in LNDs that don't support it
yet.
eeb [Sat, 10 Sep 2005 17:05:04 +0000 (17:05 +0000)]
* Got vibnal LNET/portals wire compatibility working
* Removed bad LPSZ in format strings (I got rid of lnet_size_t)
* Changed vibnal NID printing from LPX64 to %s(libcfs_nid2str(nid))
eeb [Sat, 10 Sep 2005 03:48:48 +0000 (03:48 +0000)]
* LNET/portals wire compatibility working on elan and tcp. Set the lnet
module parameter "portals_compatibility" to...
"strong" Compatible with portals and LNET "strong" and "weak"
"weak" Compatible with any value of LNET portals_compatibility
"none" Compatible with LNET "weak" and "none". This is the default.
Old XML and existing old configuration profiles (logs) can be used as-is.
* Updated GM README
* Backed out most of the change to lconf that used hostaddr to construct the
LNET NID. It now signals an error if the XML contains > 1 --hostaddr, or
if the --hostaddr doesn't match the NID, since it's likely manual
intervention will be required in these cases.
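The "portals_compatibility" table above can be encoded directly, as in this Python sketch. The label "portals" for a legacy portals peer and the function name compatible are assumptions; the table records only what the commit message states and says nothing about combinations it does not mention.

```python
# Peers each local setting can talk to, as listed in the commit message.
# "portals" is this sketch's label for a legacy (pre-LNET) portals node.
COMPATIBLE_PEERS = {
    "strong": {"portals", "strong", "weak"},
    "weak": {"strong", "weak", "none"},
    "none": {"weak", "none"},  # the default setting
}

def compatible(local_setting, peer):
    """Return True if a node with local_setting can talk to peer."""
    return peer in COMPATIBLE_PEERS[local_setting]

lnet_settings = ["strong", "weak", "none"]
```

Read this way, "weak" acts as the bridging setting between "strong" and "none" nodes, while only "strong" is stated to be compatible with portals itself.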
liangzhen [Fri, 9 Sep 2005 15:40:08 +0000 (15:40 +0000)]
optional build for portals.
1. Portals will not be built by default
2. To build portals: configure --with-portals=yes .....
pjkirner [Thu, 8 Sep 2005 20:37:10 +0000 (20:37 +0000)]
* Fix ogdb-host file generation, pickup the correct lnet modules as well as the legacy portals modules (for testing purposes only)
eeb [Thu, 8 Sep 2005 17:18:24 +0000 (17:18 +0000)]
* Added GM README
eeb [Thu, 8 Sep 2005 15:18:55 +0000 (15:18 +0000)]
* Removed unused parameters from LNet??? APIs (e.g. interface handle)
* Removed unused LNet??? APIs.
* Removed many scalar typedefs inherited from portals.
* fixed up alignment in some decls that s/ptl/lnet/ had unaligned.
* updated sanity.sh to s/portals.debug/lnet.debug/
* verified lnet can zeroconf mount a pre-lnet filesystem after
lctl --write_config <pre-lnet-xml>
liangzhen [Wed, 7 Sep 2005 13:33:08 +0000 (13:33 +0000)]
Smallfix for function define.
eeb [Tue, 6 Sep 2005 09:30:54 +0000 (09:30 +0000)]
* Added support for routable RDMA-ed REPLY messages to qswlnd
eeb [Tue, 6 Sep 2005 07:42:56 +0000 (07:42 +0000)]
* ptllnd: added .cvsignore
eeb [Mon, 5 Sep 2005 19:21:16 +0000 (19:21 +0000)]
* Removed nal_{send,recv}_pages() LND APIs (send and receive are passed
either VM frags (iov != NULL) or page frags (kiov != NULL) but not both).
* Ensure that the order of networks declared in the "networks" and "routes"
breaks ties when determining which peer NID to use.
eeb [Fri, 2 Sep 2005 18:56:56 +0000 (18:56 +0000)]
* Added hopcounts to route table
* Changed ptlrpc_uuid_to_peer() to choose the matching UUID with the shortest
hopcount
* Changed lconf to use a single UUID string for all target NIDs, so the
client can choose which one to use at runtime.
* Stripped out all the unused network configuration stuff from lconf
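Choosing the target NID with the shortest hopcount, as described for ptlrpc_uuid_to_peer() above, might look like the following Python sketch. The function name closest_nid and the (nid, hopcount) pair format are hypothetical, and resolving ties in favour of the first-listed candidate is a choice made by this sketch.

```python
def closest_nid(candidates):
    """Pick the NID with the smallest hopcount.

    candidates: list of (nid, hopcount) pairs in declaration order.
    min() returns the first minimal element, so ties go to the
    earliest-listed candidate.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[1])[0]

# Example NIDs are made up for illustration.
targets = [("10@elan0", 2), ("192.168.0.5@tcp0", 1), ("192.168.1.5@tcp1", 1)]
best = closest_nid(targets)
```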
pjkirner [Fri, 2 Sep 2005 12:52:24 +0000 (12:52 +0000)]
Add PTLLND to the LND enum.
liangzhen [Fri, 2 Sep 2005 09:32:11 +0000 (09:32 +0000)]
Patch for Lustre Networking Reorganization
DONE:
1. Fixing of building both lnet and portals
2. Fixing of conflicting symbols in lnet and portals
- exported APIs of lnet/libcfs with name like ptl_* are renamed to libcfs_*
- exported APIs of lnet/lnet with name like ptl_* are renamed to lnet_*
- exported APIs of portals/libcfs with name like libcfs_* are renamed to libptl_*
- module portals/libcfs/libcfs.ko is renamed to portals/libcfs/libptl.ko
3. /proc entry for lnet is /proc/sys/lnet
4. Listen port of socklnd is 988, listen port of socknal is 989
5. Pseudo device for lnet is /dev/lnet
6. Fixing of build lnet/klnds/ptllnd
7. Fixing of module path and /proc path in lnet/utils lustre/utils lustre/tests
TODO:
1. Renaming of unexported symbols in lnet.
2. Renaming of types and macro
3. Add option for building portals
4. Misc fix and testing
pjkirner [Thu, 1 Sep 2005 16:21:57 +0000 (16:21 +0000)]
Added build infrastructure for PTLLND.
Plus a dummy PTLLND that exercises basic interactions between LNET and PORTALS.
pjkirner [Thu, 1 Sep 2005 15:16:49 +0000 (15:16 +0000)]
Reorganize LNET API files, so that PTLLND
can include both LNET and PORTALS.
pjkirner [Thu, 1 Sep 2005 13:02:53 +0000 (13:02 +0000)]
Fix missing NAL->LND
liangzhen [Thu, 1 Sep 2005 04:11:07 +0000 (04:11 +0000)]
Smallfix for lnet build
pjkirner [Thu, 1 Sep 2005 03:46:00 +0000 (03:46 +0000)]
Changes necessary to make liblustre build after LNET rename
pjkirner [Thu, 1 Sep 2005 02:52:35 +0000 (02:52 +0000)]
Changes for LNET rename of NAL -> LND
pjkirner [Thu, 1 Sep 2005 00:21:57 +0000 (00:21 +0000)]
Rename Directories in LNET
knals -> klnds
unals -> ulnds
And associated build fixes.
eeb [Wed, 31 Aug 2005 21:34:02 +0000 (21:34 +0000)]
* Applied Andreas' tcpnal compiler optimization bugfix patch
from HEAD portals (different way of constructing tcp HELLO
header to avoid pointer aliasing) to lnet
* Applied qswnal build fix to lustre-portals.m4 from HEAD
portals to lnet
* lnet version of gmnal running @ HP
* fixed bad 64bit cast in acceptor.c
* fixed lconf to work with newconfig modules under lnet
eeb [Wed, 31 Aug 2005 12:51:50 +0000 (12:51 +0000)]
* Applied implicit_loopback fixes to portals (it was previously
applied to lnet)
* Minor formatting changes to lnet/lnet/{lib-move,router}.c