Minor whitespace cleanup.
tbd Sun Microsystems, Inc.
* Support for networks:
socklnd - any kernel supported by Lustre,
qswlnd - Qsnet kernel modules 5.20 and later,
Bugzilla : 16034
Description: Change ptllnd timeout and watchdog timers
Details : Add ptltrace_on_nal_failed and bump ptllnd timeout to match
Severity : normal
Bugzilla : 16186
Bugzilla : 14132
Description: acceptor.c cleanup
Details : Code duplication in acceptor.c for the cases of kernel and
- user-space removed. User-space libcfs tcpip primitives
- uniformed to have prototypes similar to kernel ones. Minor
- cosmetic changes in usocklnd to use cfs_socket_t as
- representation of socket.
+ user-space removed. User-space libcfs tcpip primitives
+ uniformed to have prototypes similar to kernel ones. Minor
+ cosmetic changes in usocklnd to use cfs_socket_t as
+ representation of socket.
Severity : minor
Bugzilla : 11245
Description: IB path MTU mistakenly set to 1st path MTU when ib_mtu is off
Details : See comment 46 in bug 11245 for details - it's indeed a bug
- introduced by the original 11245 fix.
+ introduced by the original 11245 fix.
Severity : minor
Bugzilla : 15984
Description: uptllnd credit overflow fix
Details : kptl_msg_t::ptlm_credits could be overflown by uptllnd since
Severity : major
Bugzilla : 14634
Description: socklnd protocol version 3
Details : With current protocol V2, connections on router can be
- blocked and can't receive any incoming messages when there is no
+ blocked and can't receive any incoming messages when there is no
more router buffer, so ZC-ACK can't be handled (LNet message
can't be finalized) and will cause deadlock on router.
Protocol V3 has a dedicated connection for emergency messages
-------------------------------------------------------------------------------
12-31-2008 Sun Microsystems, Inc.
* Support for networks:
socklnd - any kernel supported by Lustre,
qswlnd - Qsnet kernel modules 5.20 and later,
Bugzilla : 15983
Description: workaround for OOM from o2iblnd
Details : OFED needs to allocate a big chunk of memory for QP while creating
- connection for o2iblnd, OOM can happen if there is no such contiguous
+ connection for o2iblnd, OOM can happen if there is no such contiguous
memory chunk.
QP size is decided by concurrent_sends and max_fragments of
o2iblnd, now we permit user to specify smaller value for
Bugzilla : 15093
Description: Support Zerocopy receive of Chelsio device
Details : Chelsio driver can support zerocopy for iov[1] if it's
- contiguous and large enough.
+ contiguous and large enough.
Severity : normal
Bugzilla : 13490
Bugzilla : 16308
Description: finalize network operation in reasonable time
Details : conf-sanity test_32a couldn't stop ost and mds because it
- tried to access non-existent peer and tcp connect took
+ tried to access non-existent peer and tcp connect took
quite long before timing out.
Severity : major
Bugzilla : 16338
Description: Continuous recovery on 33 of 413 nodes after lustre oss failure
Details : Lost reference on conn prevents peer from being destroyed, which
- could prevent new peer creation if peer count has reached upper
+ could prevent new peer creation if peer count has reached upper
limit.
Severity : normal
Bugzilla : 16102
Description: LNET Selftest results in Soft lockup on OSS CPU
Details : only hits when 8 or more o2ib clients involved and a session is
- torn down with 'lst end_session' without preceding 'lst stop'.
+ torn down with 'lst end_session' without preceding 'lst stop'.
Severity : minor
Bugzilla : 16321
-------------------------------------------------------------------------------
2009-02-07 Sun Microsystems, Inc.
* Support for networks:
socklnd - any kernel supported by Lustre,
qswlnd - Qsnet kernel modules 5.20 and later,
Bugzilla : 15983
Description: workaround for OOM from o2iblnd
Details : OFED needs to allocate a big chunk of memory for QP while creating
- connection for o2iblnd, OOM can happen if there is no such contiguous
+ connection for o2iblnd, OOM can happen if there is no such contiguous
memory chunk.
QP size is decided by concurrent_sends and max_fragments of
o2iblnd, now we permit user to specify smaller value for
Bugzilla : 15093
Description: Support Zerocopy receive of Chelsio device
Details : Chelsio driver can support zerocopy for iov[1] if it's
- contiguous and large enough.
+ contiguous and large enough.
Severity : normal
Bugzilla : 13490
Description: fix credit flow deadlock in uptllnd
Bugzilla : 16308
Description: finalize network operation in reasonable time
Details : conf-sanity test_32a couldn't stop ost and mds because it
- tried to access non-existent peer and tcp connect took
+ tried to access non-existent peer and tcp connect took
quite long before timing out.
Severity : major
Bugzilla : 16338
Description: Continuous recovery on 33 of 413 nodes after lustre oss failure
Details : Lost reference on conn prevents peer from being destroyed, which
- could prevent new peer creation if peer count has reached upper
+ could prevent new peer creation if peer count has reached upper
limit.
Severity : normal
Bugzilla : 16102
Description: LNET Selftest results in Soft lockup on OSS CPU
Details : only hits when 8 or more o2ib clients involved and a session is
- torn down with 'lst end_session' without preceding 'lst stop'.
+ torn down with 'lst end_session' without preceding 'lst stop'.
Severity : minor
Bugzilla : 16321
-------------------------------------------------------------------------------
11-03-2008 Sun Microsystems, Inc.
* Support for networks:
socklnd - any kernel supported by Lustre,
qswlnd - Qsnet kernel modules 5.20 and later,
Description: ptl_send_rpc hits LASSERT when ptl_send_buf fails
Details : only hits under out-of-memory situations
-------------------------------------------------------------------------------
04-26-2008 Sun Microsystems, Inc.
* version 1.6.5
* Support for networks:
Description: ksocklnd fails to establish connection if accept_port is high
Details : PID remapping must not be done for active (outgoing) connections
--------------------------------------------------------------------------------
2008-01-11 Sun Microsystems, Inc.
gmlnd - GM 2.1.22 and later,
mxlnd - MX 1.2.1 or later,
ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
Severity : normal
Bugzilla : 14387
Description: liblustre network error
Bugzilla : 12302
Description: new userspace socklnd
Details : Old userspace tcpnal that resided in lnet/ulnds/socklnd replaced
- with new one - usocklnd.
+ with new one - usocklnd.
Severity : enhancement
Bugzilla : 11686
Bugzilla : 13236
Description: TOE Kernel panic by ksocklnd
Details : offloaded sockets provide their own implementation of sendpage,
- can't call tcp_sendpage() directly
+ can't call tcp_sendpage() directly
Severity : normal
Bugzilla : 10778
Description: kibnal_shutdown() doesn't finish; lconf --cleanup hangs
Details : races between lnd_shutdown and peer creation prevent
- lnd_shutdown from finishing.
+ lnd_shutdown from finishing.
Severity : normal
Bugzilla : 13279
Description: open files rlimit 1024 reached while liblustre testing
Details : ulnds/socklnd must close open socket after unsuccessful
Severity : major
Bugzilla : 13482
Description: build error
Details : fix typos in gmlnd, ptllnd and viblnd
-------------------------------------------------------------------------------
+--------------------------------------------------------------------------------
2007-07-30 Cluster File Systems, Inc. <info@clusterfs.com>
* version 1.6.1
mxlnd - MX 1.2.1 or later,
ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
+--------------------------------------------------------------------------------
+
2007-06-21 Cluster File Systems, Inc. <info@clusterfs.com>
* version 1.4.11
* Support for networks:
Severity : major
Bugzilla : 12014
Description: ASSERTION failures when upgrading to the patchless zero-copy
Details : This bug affects "rolling upgrades", causing an inconsistent
- protocol version negotiation and subsequent assertion failure
+ protocol version negotiation and subsequent assertion failure
during rolling upgrades after the first wave of upgrades.
Severity : minor
Bugzilla : 11223
Details : Change "dropped message" CERRORs to D_NETERROR so they are
- logged instead of creating "console chatter" when a lustre
+ logged instead of creating "console chatter" when a lustre
timeout races with normal RPC completion.
Severity : minor
Details : lnet_clear_peer_table can wait forever if user forgets to
Severity : minor
Details : libcfs_id2str should check pid against LNET_PID_ANY.
Bugzilla : 12316
Description: Add OFED1.2 support to o2iblnd
Details : o2iblnd depends on OFED's modules, if out-tree OFED's modules
- are installed (other than kernel's in-tree infiniband), there
- could be some problem while insmod o2iblnd (mismatch CRC of
- ib_* symbols).
- If extra Module.symvers is supported in kernel (i.e, 2.6.17),
- this link provides solution:
- https://bugs.openfabrics.org/show_bug.cgi?id=355
- if extra Module.symvers is not supported in kernel, we will
- have to run the script in bug 12316 to update
- $LINUX/module.symvers before building o2iblnd.
- More details about this are in bug 12316.
+ are installed (other than kernel's in-tree infiniband), there
+ could be some problem while insmod o2iblnd (mismatch CRC of
+ ib_* symbols).
+ If extra Module.symvers is supported in kernel (i.e, 2.6.17),
+ this link provides solution:
+ https://bugs.openfabrics.org/show_bug.cgi?id=355
+ if extra Module.symvers is not supported in kernel, we will
+ have to run the script in bug 12316 to update
+ $LINUX/module.symvers before building o2iblnd.
+ More details about this are in bug 12316.
------------------------------------------------------------------------------
Severity : minor
Frequency : rare
Description: the_lnet.ln_finalizing was not set when the current thread is
- about to complete messages. It only affects multi-threaded
+ about to complete messages. It only affects multi-threaded
user space LNet.
Severity : normal
Frequency : rare
Bugzilla : 12458
Description: Assertion failure in kernel ptllnd caused by posting passive
- bulk buffers before connection establishment complete.
+ bulk buffers before connection establishment complete.
Severity : major
Frequency : rare
Bugzilla : 12445
Description: A race in kernel ptllnd between deleting a peer and posting
- new communications for it could hang communications -
+ new communications for it could hang communications -
manifesting as "Unexpectedly long timeout" messages.
Severity : major
Bugzilla : 11659
Description: Credit overflows
Details : This was a bug in ptllnd connection establishment. The fix
- implements better peer stamps to disambiguate connection
+ implements better peer stamps to disambiguate connection
establishment and ensure both peers enter the credit flow
state machine consistently.
Bugzilla : 11394
Description: kptllnd didn't propagate some network errors up to LNET
Details : This bug was spotted while investigating 11394. The fix
- ensures network errors on sends and bulk transfers are
+ ensures network errors on sends and bulk transfers are
propagated to LNET/lustre correctly.
Severity : enhancement
- renamed cfs_sleep_chan -> cfs_waitq
cfs_sleep_link -> cfs_waitlink
- - fixed race in linux version of arch-independent socknal
- (the ENOMEM/EAGAIN decision).
+ - fixed race in linux version of arch-independent socknal
+ (the ENOMEM/EAGAIN decision).
- Didn't fix problems in Darwin version of arch-independent socknal
- (resetting socket callbacks, eager ack hack, ENOMEM/EAGAIN decision)
+ (resetting socket callbacks, eager ack hack, ENOMEM/EAGAIN decision)
- removed libcfs types from non-socknal header files (only some types
in the header files had been changed; the .c files hadn't been