X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=lnet%2FChangeLog;h=aa6b8f4e7314a1c914ece7d97cfd343fcb592c20;hb=2ed2d63ab29abb102033a0aaf2383de7b12a200a;hp=d7fcde8345f4fb682226c7c0a9e570a2755a8063;hpb=7a6647fecc2526a898a4eac51ccab49b65175b7c;p=fs%2Flustre-release.git

diff --git a/lnet/ChangeLog b/lnet/ChangeLog
index d7fcde83..aa6b8f4 100644
--- a/lnet/ChangeLog
+++ b/lnet/ChangeLog
@@ -1,5 +1,62 @@
-TBD Cluster File Systems, Inc.
-        * version 1.4.10
+2006-06-22 Cluster File Systems, Inc.
+        * version 1.4.11 / 1.6.1
+        * Support for networks:
+          socklnd - kernels up to 2.6.16
+          qswlnd - Qsnet kernel modules 5.20 and later
+          openiblnd - IbGold 1.8.2
+          o2iblnd - OFED 1.1 and 1.2
+          viblnd - Voltaire ibhost 3.4.5 and later
+          ciblnd - Topspin 3.2.0
+          iiblnd - Infiniserv 3.3 + PathBits patch
+          gmlnd - GM 2.1.22 and later
+          mxlnd - MX 1.2.1 or later
+          ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
+        * bug fixes
+
+Severity : minor
+Details : lnet_clear_peer_table can wait forever if user forgets to
+             clear a lazy portal.
+
+Severity : minor
+Details : libcfs_id2str should check pid against LNET_PID_ANY.
+
+Severity : major
+Bugzilla : 10916
+Description: added LNET self test
+Details : landing b_self_test
+
+Severity : minor
+Frequency : rare
+Bugzilla : 12227
+Description: cfs_duration_{u,n}sec() wrongly calculate nanosecond part of
+             struct timeval.
+Details : do_div() macro is used incorrectly.
+
+2007-04-23 Cluster File Systems, Inc.
+
+Severity : normal
+Bugzilla : 11680
+Description: make panic on lbug configurable
+
+Severity : major
+Bugzilla : 12316
+Description: Add OFED1.2 support to o2iblnd
+Details : o2iblnd depends on OFED's modules, if out-tree OFED's modules
+             are installed (other than kernel's in-tree infiniband), there
+             could be some problem while insmod o2iblnd (mismatch CRC of
+             ib_* symbols).
+             If extra Module.symvers is supported in kernel (i.e, 2.6.17),
+             this link provides solution:
+             https://bugs.openfabrics.org/show_bug.cgi?id=355
+             if extra Module.symvers is not supported in kernel, we will
+             have to run the script in bug 12316 to update
+             $LINUX/module.symvers before building o2iblnd.
+             More details about this are in bug 12316.
+
+------------------------------------------------------------------------------
+
+2007-04-01 Cluster File Systems, Inc.
+        * version 1.4.10 / 1.6.0
         * Support for networks:
           socklnd - kernels up to 2.6.16
           qswlnd - Qsnet kernel modules 5.20 and later
@@ -12,6 +69,76 @@ TBD Cluster File Systems, Inc.
           mxlnd - MX 1.2.1 or later
           ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
         * bug fixes
+
+Severity : minor
+Frequency : rare
+Description: Ptllnd didn't init kptllnd_data.kptl_idle_txs before it could be
+             possibly accessed in kptllnd_shutdown. Ptllnd should init
+             kptllnd_data.kptl_ptlid2str_lock before calling kptllnd_ptlid2str.
+
+Severity : normal
+Frequency : rare
+Description: gmlnd ignored some transmit errors when finalizing lnet messages.
+
+Severity : minor
+Frequency : rare
+Description: ptllnd logs a piece of incorrect debug info in kptllnd_peer_handle_hello.
+
+Severity : minor
+Frequency : rare
+Description: the_lnet.ln_finalizing was not set when the current thread is
+             about to complete messages. It only affects multi-threaded
+             user space LNet.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 11472
+Description: Changed the default kqswlnd ntxmsg=512
+
+Severity : major
+Frequency : rare
+Bugzilla : 12458
+Description: Assertion failure in kernel ptllnd caused by posting passive
+             bulk buffers before connection establishment complete.
+
+Severity : major
+Frequency : rare
+Bugzilla : 12455
+Description: A race in kernel ptllnd between deleting a peer and posting
+             new communications for it could hang communications -
+             manifesting as "Unexpectedly long timeout" messages.
+
+Severity : major
+Frequency : rare
+Bugzilla : 12432
+Description: Kernel ptllnd lock ordering issue could hang a node.
+
+Severity : major
+Frequency : rare
+Bugzilla : 12016
+Description: node crash on socket teardown race
+
+Severity : minor
+Frequency : 'lctl peer_list' issued on a mx net
+Bugzilla : 12237
+Description: Enable lctl's peer_list for MXLND
+
+Severity : major
+Frequency : after Ptllnd timeouts and portals congestion
+Bugzilla : 11659
+Description: Credit overflows
+Details : This was a bug in ptllnd connection establishment. The fix
+             implements better peer stamps to disambiguate connection
+             establishment and ensure both peers enter the credit flow
+             state machine consistently.
+
+Severity : major
+Frequency : rare
+Bugzilla : 11394
+Description: kptllnd didn't propagate some network errors up to LNET
+Details : This bug was spotted while investigating 11394. The fix
+             ensures network errors on sends and bulk transfers are
+             propagated to LNET/lustre correctly.
 
 Severity : enhancement
 Bugzilla : 10316
@@ -78,7 +205,7 @@ Details : libcfs created a symlink from /proc/sys/portals to
              /proc/sys/lnet for backwards compatibility. This is no longer
              required and makes the Cray portals /proc variables
              inaccessible.
-
+
 Severity : minor
 Bugzilla : 11312
 Description: OFED FMR API change
@@ -86,14 +213,14 @@ Details : This changes parameter usage to reflect a change in
              ib_fmr_pool_map_phys() between OFED 1.0 and OFED 1.1. Note
              that FMR support is only used in experimental versions of the
              o2iblnd - this change does not affect standard usage at all.
-
+
 Severity : enhancement
 Bugzilla : 11245
 Description: new ko2iblnd module parameter: ib_mtu
 Details : the default IB MTU of 2048 performs badly on 23108 Tavor
              HCAs. You can avoid this problem by setting the MTU to 1024
              using this module parameter.
-
+
 Severity : enhancement
 Bugzilla : 11118/11620
 Description: ptllnd small request message buffer alignment fix
@@ -104,7 +231,7 @@ Details : Set the PTL_MD_LOCAL_ALIGN8 option on small message receives.
              running the correct protocol version which was fixed by
              always NAK-ing such requests and handling any misalignments
              they introduce.
-
+
 Severity : minor
 Frequency : rarely
 Description: When kib(nal|lnd)_del_peer() is called upon a peer whose
@@ -146,7 +273,7 @@ Details : Set the kptllnd module parameter "ptltrace_on_timeout=1" to
              dump Cray portals debug traces to a file. The kptllnd module
              parameter "ptltrace_basename", default "/tmp/lnet-ptltrace", is
              the basename of the dump file.
-
+
 Severity : major
 Frequency : infrequent
 Bugzilla : 11308
@@ -155,7 +282,7 @@ Details : Kernel ptllnd could produce protocol errors e.g. illegal
              matchbits and/or violate the credit flow protocol when trying
              to re-establish a connection with a peer after an error or
              timeout.
-
+
 Severity : enhancement
 Bugzilla : 10316
 Description: Allow /proc/sys/lnet/debug to be set symbolically
@@ -172,6 +299,17 @@ Details : In configurations with LNET routers if a router fails routers
 
 ------------------------------------------------------------------------------
 
+2006-12-09 Cluster File Systems, Inc.
+
+Severity : critical
+Frequency : very rarely, in configurations with LNET routers and TCP
+Bugzilla : 10889
+Description: incorrect data written to files on OSTs
+Details : In certain high-load conditions incorrect data may be written
+             to files on the OST when using TCP networks.
+
+------------------------------------------------------------------------------
+
 2006-07-31 Cluster File Systems, Inc.
         * version 1.4.7
           - rework CDEBUG messages rate-limiting mechanism b=10375
@@ -209,7 +347,7 @@ Details : In configurations with LNET routers if a router fails routers
              between different network fabrics.
             Lustre Networking Devices (LNDS) for the supported network
             fabrics have also been created for this new infrastructure.
-
+
 2005-08-08 Cluster File Systems, Inc.
         * version 1.4.4
         * bug fixes