1 tbd Sun Microsystems, Inc.
3 * Support for networks:
4 socklnd - any kernel supported by Lustre,
5 qswlnd - Qsnet kernel modules 5.20 and later,
6 openiblnd - IbGold 1.8.2,
7 o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3
8 viblnd - Voltaire ibhost 3.4.5 and later,
9 ciblnd - Topspin 3.2.0,
10 iiblnd - Infiniserv 3.3 + PathBits patch,
11 gmlnd - GM 2.1.22 and later,
12 mxlnd - MX 1.2.1 or later,
13 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
22 Description: IB path MTU mistakenly set to 1st path MTU when ib_mtu is off
23 Details : See comment 46 in bug 11245 for details - it's indeed a bug
24 introduced by the original 11245 fix.
28 Description: uptllnd credit overflow fix
29 Details : kptl_msg_t::ptlm_credits could be overflowed by uptllnd since
34 Description: socklnd protocol version 3
35 Details : With the current protocol V2, connections on a router can be
36 blocked and unable to receive any incoming messages when there are no
37 more router buffers, so a ZC-ACK can't be handled (the LNet message
38 can't be finalized), which causes a deadlock on the router.
39 Protocol V3 has a dedicated connection for emergency messages
40 like ZC-ACK to the router; messages on this dedicated connection
41 don't need any credits, so they are never blocked. V3 can also send
42 keepalive pings at a specified interval for router health checking.
44 -------------------------------------------------------------------------------
46 2008-12-31 Sun Microsystems, Inc.
48 * Support for networks:
49 socklnd - any kernel supported by Lustre,
50 qswlnd - Qsnet kernel modules 5.20 and later,
51 openiblnd - IbGold 1.8.2,
52 o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3
53 viblnd - Voltaire ibhost 3.4.5 and later,
54 ciblnd - Topspin 3.2.0,
55 iiblnd - Infiniserv 3.3 + PathBits patch,
56 gmlnd - GM 2.1.22 and later,
57 mxlnd - MX 1.2.1 or later,
58 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
62 Description: workaround for OOM from o2iblnd
63 Details : OFED needs to allocate a big chunk of memory for the QP while creating
64 a connection for o2iblnd; OOM can happen if no such contiguous
66 QP size is determined by concurrent_sends and max_fragments of
67 o2iblnd. Users may now specify a smaller value for
68 concurrent_sends of o2iblnd (e.g. concurrent_sends=7), which
69 decreases the size of the memory block required to create the QP.
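
Since concurrent_sends is a module parameter, it would normally be set when the module is loaded. The following is a minimal sketch, assuming the usual modprobe "options" syntax and the ko2iblnd module name; the output file name (ko2iblnd.conf in the current directory) and the CONF_FILE variable are hypothetical, and the value 7 is only the one quoted in the entry above, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: emit a modprobe "options" line for ko2iblnd.
# conf_file is an assumption; copy the result into the local modprobe
# configuration as appropriate for the distribution.
conf_file="${CONF_FILE:-./ko2iblnd.conf}"
concurrent_sends=7   # smaller value => smaller QP allocation, per the entry above

printf 'options ko2iblnd concurrent_sends=%s\n' "$concurrent_sends" > "$conf_file"
cat "$conf_file"
```

The setting only takes effect the next time the module is loaded.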
73 Description: Support Zerocopy receive of Chelsio device
74 Details : Chelsio driver can support zerocopy for iov[1] if it's
75 contiguous and large enough.
79 Description: fix credit flow deadlock in uptllnd
83 Description: finalize network operation in reasonable time
84 Details : conf-sanity test_32a couldn't stop ost and mds because it
85 tried to access a non-existent peer and the tcp connect took
86 quite a long time before timing out.
90 Description: Continuous recovery on 33 of 413 nodes after lustre oss failure
91 Details : Lost reference on conn prevents peer from being destroyed, which
92 could prevent new peer creation if peer count has reached upper
97 Description: LNET Selftest results in Soft lockup on OSS CPU
98 Details : only hits when 8 or more o2ib clients are involved and a session is
99 torn down with 'lst end_session' without a preceding 'lst stop'.
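
The teardown order that avoids this case can be sketched as follows. This is a dry-run illustration only: the batch name (bulk_rw), the DRY_RUN switch and the helper functions are hypothetical, and it assumes that 'lst stop' takes the batch name as its argument and that LST_SESSION is already set in the shell when run for real.

```shell
#!/bin/sh
# Dry-run by default: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"
BATCH="${BATCH:-bulk_rw}"    # hypothetical batch name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

teardown() {
    # Stop the running batch first, and only then destroy the session.
    run lst stop "$BATCH" && run lst end_session
}

teardown
```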
103 Description: concurrent_sends in IB LNDs should not be changeable at run time
104 Details : concurrent_sends in IB LNDs should not be changeable at run time
108 Description: ptl_send_rpc hits LASSERT when ptl_send_buf fails
109 Details : only hits under out-of-memory situations
112 -------------------------------------------------------------------------------
114 2009-02-07 Sun Microsystems, Inc.
116 * Support for networks:
117 socklnd - any kernel supported by Lustre,
118 qswlnd - Qsnet kernel modules 5.20 and later,
119 openiblnd - IbGold 1.8.2,
120 o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3
121 viblnd - Voltaire ibhost 3.4.5 and later,
122 ciblnd - Topspin 3.2.0,
123 iiblnd - Infiniserv 3.3 + PathBits patch,
124 gmlnd - GM 2.1.22 and later,
125 mxlnd - MX 1.2.1 or later,
126 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
129 Description: workaround for OOM from o2iblnd
130 Details : OFED needs to allocate a big chunk of memory for the QP while creating
131 a connection for o2iblnd; OOM can happen if no such contiguous
133 QP size is determined by concurrent_sends and max_fragments of
134 o2iblnd. Users may now specify a smaller value for
135 concurrent_sends of o2iblnd (e.g. concurrent_sends=7), which
136 decreases the size of the memory block required to create the QP.
140 Description: Support Zerocopy receive of Chelsio device
141 Details : Chelsio driver can support zerocopy for iov[1] if it's
142 contiguous and large enough.
145 Description: fix credit flow deadlock in uptllnd
149 Description: finalize network operation in reasonable time
150 Details : conf-sanity test_32a couldn't stop ost and mds because it
151 tried to access a non-existent peer and the tcp connect took
152 quite a long time before timing out.
156 Description: Continuous recovery on 33 of 413 nodes after lustre oss failure
157 Details : Lost reference on conn prevents peer from being destroyed, which
158 could prevent new peer creation if peer count has reached upper
163 Description: LNET Selftest results in Soft lockup on OSS CPU
164 Details : only hits when 8 or more o2ib clients are involved and a session is
165 torn down with 'lst end_session' without a preceding 'lst stop'.
169 Description: concurrent_sends in IB LNDs should not be changeable at run time
170 Details : concurrent_sends in IB LNDs should not be changeable at run time
172 -------------------------------------------------------------------------------
174 2008-11-03 Sun Microsystems, Inc.
176 * Support for networks:
177 socklnd - any kernel supported by Lustre,
178 qswlnd - Qsnet kernel modules 5.20 and later,
179 openiblnd - IbGold 1.8.2,
180 o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3
181 viblnd - Voltaire ibhost 3.4.5 and later,
182 ciblnd - Topspin 3.2.0,
183 iiblnd - Infiniserv 3.3 + PathBits patch,
184 gmlnd - GM 2.1.22 and later,
185 mxlnd - MX 1.2.1 or later,
186 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
190 Description: ptl_send_rpc hits LASSERT when ptl_send_buf fails
191 Details : only hits under out-of-memory situations
194 -------------------------------------------------------------------------------
197 2008-04-26 Sun Microsystems, Inc.
199 * Support for networks:
200 socklnd - any kernel supported by Lustre,
201 qswlnd - Qsnet kernel modules 5.20 and later,
202 openiblnd - IbGold 1.8.2,
203 o2iblnd - OFED 1.1 and 1.2.0, 1.2.5
204 viblnd - Voltaire ibhost 3.4.5 and later,
205 ciblnd - Topspin 3.2.0,
206 iiblnd - Infiniserv 3.3 + PathBits patch,
207 gmlnd - GM 2.1.22 and later,
208 mxlnd - MX 1.2.1 or later,
209 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
213 Description: excessive debug information removed
214 Details : excessive debug information removed
218 Description: ksocknal_create_conn() hit ASSERTION during connection race
219 Details : ksocknal_create_conn() hit ASSERTION during connection race
223 Description: ksocknal_send_hello() hit ASSERTION while connecting race
224 Details : ksocknal_send_hello() hit ASSERTION while connecting race
228 Description: o2iblnd/ptllnd credit deadlock in a routed config.
229 Details : o2iblnd/ptllnd credit deadlock in a routed config.
233 Description: High load after starting lnet
234 Details : gmlnd should sleep in the rx thread in an interruptible way. Otherwise,
235 the uptime utility reports a confusingly high load.
239 Description: ksocklnd fails to establish connection if accept_port is high
240 Details : PID remapping must not be done for active (outgoing) connections
242 --------------------------------------------------------------------------------
244 2008-01-11 Sun Microsystems, Inc.
246 * Support for networks:
247 socklnd - any kernel supported by Lustre,
248 qswlnd - Qsnet kernel modules 5.20 and later,
249 openiblnd - IbGold 1.8.2,
250 o2iblnd - OFED 1.1 and 1.2.0, 1.2.5
251 viblnd - Voltaire ibhost 3.4.5 and later,
252 ciblnd - Topspin 3.2.0,
253 iiblnd - Infiniserv 3.3 + PathBits patch,
254 gmlnd - GM 2.1.22 and later,
255 mxlnd - MX 1.2.1 or later,
256 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
259 Description: liblustre network error
260 Details : liblustre clients should understand the LNET_ACCEPT_PORT environment
261 variable even if they don't start the lnet acceptor.
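
As an illustration of the lookup described above, the sketch below resolves the port from the environment first and falls back to a default otherwise. The accept_port function is hypothetical, and 988 is assumed here as the default acceptor port; it does not reproduce the liblustre code.

```shell
#!/bin/sh
# Resolve the acceptor port the way a client might: environment first,
# then a fallback. The fallback value 988 is an assumption.
accept_port() {
    echo "${LNET_ACCEPT_PORT:-988}"
}

echo "connecting to acceptor port $(accept_port)"
```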
265 Description: Strange message from lnet (Ignoring prediction from the future)
266 Details : Incorrect calculation of peer's last_alive value in ksocklnd
268 --------------------------------------------------------------------------------
270 2007-12-07 Cluster File Systems, Inc. <info@clusterfs.com>
272 * Support for networks:
273 socklnd - any kernel supported by Lustre,
274 qswlnd - Qsnet kernel modules 5.20 and later,
275 openiblnd - IbGold 1.8.2,
276 o2iblnd - OFED 1.1 and 1.2.0, 1.2.5.
277 viblnd - Voltaire ibhost 3.4.5 and later,
278 ciblnd - Topspin 3.2.0,
279 iiblnd - Infiniserv 3.3 + PathBits patch,
280 gmlnd - GM 2.1.22 and later,
281 mxlnd - MX 1.2.1 or later,
282 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
286 Description: ASSERTION(me == md->md_me) failed in lnet_match_md()
290 Description: increase send queue size for ciblnd/openiblnd
294 Description: new userspace socklnd
295 Details : The old userspace tcpnal that resided in lnet/ulnds/socklnd was
296 replaced with a new one, usocklnd.
298 Severity : enhancement
300 Description: Console message flood
301 Details : Make cdls ratelimiting more tunable by adding several tunables in
302 procfs: /proc/sys/lnet/console_{min,max}_delay_centisecs and
303 /proc/sys/lnet/console_backoff.
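
A minimal sketch for inspecting these tunables on a node, assuming only the three file names given above. The show_tunables helper is hypothetical, and on a node without LNET loaded the files will simply be reported as unavailable.

```shell
#!/bin/sh
# Print the current value of each cdls rate-limiting tunable, if present.
base=/proc/sys/lnet
tunables="console_min_delay_centisecs console_max_delay_centisecs console_backoff"

# show_tunables DIR: report each tunable found under DIR.
show_tunables() {
    for t in $tunables; do
        if [ -r "$1/$t" ]; then
            echo "$t = $(cat "$1/$t")"
        else
            echo "$t: not available"
        fi
    done
}

show_tunables "$base"
```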
305 --------------------------------------------------------------------------------
307 2007-09-27 Cluster File Systems, Inc. <info@clusterfs.com>
309 * Support for networks:
310 socklnd - any kernel supported by Lustre,
311 qswlnd - Qsnet kernel modules 5.20 and later,
312 openiblnd - IbGold 1.8.2,
313 o2iblnd - OFED 1.1 and 1.2,
314 viblnd - Voltaire ibhost 3.4.5 and later,
315 ciblnd - Topspin 3.2.0,
316 iiblnd - Infiniserv 3.3 + PathBits patch,
317 gmlnd - GM 2.1.22 and later,
318 mxlnd - MX 1.2.1 or later,
319 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
323 Description: /proc/sys/lnet has non-sysctl entries
324 Details : Updating dump_kernel/daemon_file/debug_mb to use sysctl variables
328 Description: TOE Kernel panic by ksocklnd
329 Details : offloaded sockets provide their own implementation of sendpage,
330 so tcp_sendpage() can't be called directly
334 Description: kibnal_shutdown() doesn't finish; lconf --cleanup hangs
335 Details : races between lnd_shutdown and peer creation prevent
336 lnd_shutdown from finishing.
340 Description: open files rlimit 1024 reached while liblustre testing
341 Details : ulnds/socklnd must close open socket after unsuccessful
346 Description: build error
347 Details : fix typos in gmlnd, ptllnd and viblnd
349 ------------------------------------------------------------------------------
351 2007-07-30 Cluster File Systems, Inc. <info@clusterfs.com>
353 * Support for networks:
354 socklnd - kernels up to 2.6.16,
355 qswlnd - Qsnet kernel modules 5.20 and later,
356 openiblnd - IbGold 1.8.2,
357 o2iblnd - OFED 1.1 and 1.2
358 viblnd - Voltaire ibhost 3.4.5 and later,
359 ciblnd - Topspin 3.2.0,
360 iiblnd - Infiniserv 3.3 + PathBits patch,
361 gmlnd - GM 2.1.22 and later,
362 mxlnd - MX 1.2.1 or later,
363 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
365 2007-06-21 Cluster File Systems, Inc. <info@clusterfs.com>
367 * Support for networks:
368 socklnd - kernels up to 2.6.16,
369 qswlnd - Qsnet kernel modules 5.20 and later,
370 openiblnd - IbGold 1.8.2,
372 viblnd - Voltaire ibhost 3.4.5 and later,
373 ciblnd - Topspin 3.2.0,
374 iiblnd - Infiniserv 3.3 + PathBits patch,
375 gmlnd - GM 2.1.22 and later,
376 mxlnd - MX 1.2.1 or later,
377 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
381 Description: Initialize cpumask before use
385 Description: ASSERTION failures when upgrading to the patchless zero-copy
387 Details : This bug affects "rolling upgrades", causing an inconsistent
388 protocol version negotiation and subsequent assertion failure
389 during rolling upgrades after the first wave of upgrades.
393 Details : Change "dropped message" CERRORs to D_NETERROR so they are
394 logged instead of creating "console chatter" when a lustre
395 timeout races with normal RPC completion.
398 Details : lnet_clear_peer_table can wait forever if user forgets to
402 Details : libcfs_id2str should check pid against LNET_PID_ANY.
406 Description: added LNET self test
407 Details : landing b_self_test
412 Description: cfs_duration_{u,n}sec() wrongly calculate nanosecond part of
414 Details : do_div() macro is used incorrectly.
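
The entry does not say how do_div() was misused. As background, the kernel's do_div(n, base) stores the quotient back into n and evaluates to the remainder, so confusing the two yields a wrong sub-second part. The arithmetic can be sketched as follows; HZ and the tick count are assumed example values, and this is not the libcfs code.

```shell
#!/bin/sh
# Model of do_div(n, base): n becomes the quotient, the macro's value is
# the remainder. HZ and ticks are assumed values for illustration.
HZ=250
ticks=1375            # 5.5 seconds at HZ=250

sec=$((ticks / HZ))               # what do_div leaves in n
rem=$((ticks % HZ))               # what do_div evaluates to
nsec=$((rem * 1000000000 / HZ))   # sub-second part scaled to nanoseconds

echo "sec=$sec nsec=$nsec"
```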
416 2007-04-23 Cluster File Systems, Inc. <info@clusterfs.com>
420 Description: make panic on lbug configurable
424 Description: Add OFED1.2 support to o2iblnd
425 Details : o2iblnd depends on OFED's modules. If out-of-tree OFED modules
426 are installed (rather than the kernel's in-tree infiniband), there
427 could be problems when loading o2iblnd with insmod (mismatched CRC of
429 If an extra Module.symvers is supported by the kernel (i.e., 2.6.17),
430 this link provides a solution:
431 https://bugs.openfabrics.org/show_bug.cgi?id=355
432 If an extra Module.symvers is not supported by the kernel, we will
433 have to run the script in bug 12316 to update
434 $LINUX/module.symvers before building o2iblnd.
435 More details about this are in bug 12316.
437 ------------------------------------------------------------------------------
439 2007-04-01 Cluster File Systems, Inc. <info@clusterfs.com>
440 * version 1.4.10 / 1.6.0
441 * Support for networks:
442 socklnd - kernels up to 2.6.16,
443 qswlnd - Qsnet kernel modules 5.20 and later,
444 openiblnd - IbGold 1.8.2,
446 viblnd - Voltaire ibhost 3.4.5 and later,
447 ciblnd - Topspin 3.2.0,
448 iiblnd - Infiniserv 3.3 + PathBits patch,
449 gmlnd - GM 2.1.22 and later,
450 mxlnd - MX 1.2.1 or later,
451 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
455 Description: Ptllnd didn't init kptllnd_data.kptl_idle_txs before it could be
456 possibly accessed in kptllnd_shutdown. Ptllnd should init
457 kptllnd_data.kptl_ptlid2str_lock before calling kptllnd_ptlid2str.
461 Description: gmlnd ignored some transmit errors when finalizing lnet messages.
465 Description: ptllnd logs a piece of incorrect debug info in kptllnd_peer_handle_hello.
469 Description: the_lnet.ln_finalizing was not set when the current thread is
470 about to complete messages. It only affects multi-threaded
476 Description: Changed the default kqswlnd ntxmsg=512
481 Description: Assertion failure in kernel ptllnd caused by posting passive
482 bulk buffers before connection establishment complete.
487 Description: A race in kernel ptllnd between deleting a peer and posting
488 new communications for it could hang communications -
489 manifesting as "Unexpectedly long timeout" messages.
494 Description: Kernel ptllnd lock ordering issue could hang a node.
499 Description: node crash on socket teardown race
502 Frequency : 'lctl peer_list' issued on a mx net
504 Description: Enable lctl's peer_list for MXLND
507 Frequency : after Ptllnd timeouts and portals congestion
509 Description: Credit overflows
510 Details : This was a bug in ptllnd connection establishment. The fix
511 implements better peer stamps to disambiguate connection
512 establishment and ensure both peers enter the credit flow
513 state machine consistently.
518 Description: kptllnd didn't propagate some network errors up to LNET
519 Details : This bug was spotted while investigating 11394. The fix
520 ensures network errors on sends and bulk transfers are
521 propagated to LNET/lustre correctly.
523 Severity : enhancement
525 Description: Fixed console chatter in case of -ETIMEDOUT.
527 Severity : enhancement
529 Description: Added D_NETTRACE for recording network packet history
530 (initially only for ptllnd). Also a separate userspace
531 ptllnd facility to gather history which should really be
532 covered by D_NETTRACE too, if only CDEBUG recorded history in
538 Description: o2iblnd handle early RDMA_CM_EVENT_DISCONNECTED.
539 Details : If the fabric is lossy, an RDMA_CM_EVENT_DISCONNECTED
540 callback can occur before a connection has actually been
541 established. This caused an assertion failure previously.
543 Severity : enhancement
545 Description: Multiple instances for o2iblnd
546 Details : Allow multiple instances of o2iblnd to enable networking over
547 multiple HCAs and routing between them.
551 Description: lnet deadlock in router_checker
552 Details : turned ksnd_connd_lock, ksnd_reaper_lock, and ksock_net_t:ksnd_lock
553 into BH locks to eliminate potential deadlock caused by
554 ksocknal_data_ready() preempting code holding these locks.
558 Description: Millions of failed socklnd connection attempts cause a very slow FS
559 Details : added a new route flag ksnr_scheduled to distinguish from
560 ksnr_connecting, so that a peer connection request is only turned
561 down for race concerns when an active connection to the same peer
562 is in progress (instead of just being scheduled).
564 ------------------------------------------------------------------------------
566 2007-02-09 Cluster File Systems, Inc. <info@clusterfs.com>
568 * Support for networks:
569 socklnd - kernels up to 2.6.16
570 qswlnd - Qsnet kernel modules 5.20 and later
571 openiblnd - IbGold 1.8.2
573 viblnd - Voltaire ibhost 3.4.5 and later
574 ciblnd - Topspin 3.2.0
575 iiblnd - Infiniserv 3.3 + PathBits patch
576 gmlnd - GM 2.1.22 and later
577 mxlnd - MX 1.2.1 or later
578 ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
581 Severity : major on XT3
583 Description: libcfs overwrites /proc/sys/portals
584 Details : libcfs created a symlink from /proc/sys/portals to
585 /proc/sys/lnet for backwards compatibility. This is no
586 longer required and makes the Cray portals /proc variables
591 Description: OFED FMR API change
592 Details : This changes parameter usage to reflect a change in
593 ib_fmr_pool_map_phys() between OFED 1.0 and OFED 1.1. Note
594 that FMR support is only used in experimental versions of the
595 o2iblnd - this change does not affect standard usage at all.
597 Severity : enhancement
599 Description: new ko2iblnd module parameter: ib_mtu
600 Details : the default IB MTU of 2048 performs badly on 23108 Tavor
601 HCAs. You can avoid this problem by setting the MTU to 1024
602 using this module parameter.
604 Severity : enhancement
605 Bugzilla : 11118/11620
606 Description: ptllnd small request message buffer alignment fix
607 Details : Set the PTL_MD_LOCAL_ALIGN8 option on small message receives.
608 Round up small message size on sends in case this option
609 is not supported. 11620 was a defect in the initial
610 implementation which effectively asserted all peers had to be
611 running the correct protocol version which was fixed by always
612 NAK-ing such requests and handling any misalignments they
617 Description: When kib(nal|lnd)_del_peer() is called upon a peer whose
618 ibp_tx_queue is not empty, kib(nal|lnd)_destroy_peer()'s
619 'LASSERT(list_empty(&peer->ibp_tx_queue))' will fail.
621 Severity : enhancement
623 Description: Patchless ZC (zero copy) socklnd
624 Details : New protocol for socklnd: socklnd can support zero copy without
625 a kernel patch, and it is compatible with the old socklnd. Checksum is
626 moved from tunables to modparams.
630 Description: When ksocknal_del_peer() is called upon a peer whose
631 ksnp_tx_queue is not empty, ksocknal_destroy_peer()'s
632 'LASSERT(list_empty(&peer->ksnp_tx_queue))' will fail.
635 Frequency : when ptlrpc is under heavy use and runs out of request buffer
637 Description: In lnet_match_blocked_msg(), md can be used without holding a
641 Frequency : very rarely
643 Description: If ksocknal_lib_setup_sock() fails, a ref on peer is lost.
644 If connd connects a route which has been closed by
645 ksocknal_shutdown(), ksocknal_create_routes() may create new
646 routes which hold references on the peer, causing shutdown
647 process to wait for peer to disappear forever.
649 Severity : enhancement
651 Description: Dump XT3 portals traces on kptllnd timeout
652 Details : Set the kptllnd module parameter "ptltrace_on_timeout=1" to
653 dump Cray portals debug traces to a file. The kptllnd module
654 parameter "ptltrace_basename", default "/tmp/lnet-ptltrace",
655 is the basename of the dump file.
658 Frequency : infrequent
660 Description: kernel ptllnd fix bug in connection re-establishment
661 Details : Kernel ptllnd could produce protocol errors e.g. illegal
662 matchbits and/or violate the credit flow protocol when trying
663 to re-establish a connection with a peer after an error or
666 Severity : enhancement
668 Description: Allow /proc/sys/lnet/debug to be set symbolically
669 Details : Allow debug and subsystem debug values to be read/set by name
670 in addition to numerically, for ease of use.
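
A minimal sketch of setting the mask by name. The set_debug helper and the DEBUG_FILE override are hypothetical, "neterror" is assumed to be the symbolic name corresponding to D_NETERROR, and writing a bare name is assumed to replace the current mask, so check the node's documentation before running this on a live system.

```shell
#!/bin/sh
# set_debug FILE MASK: write a symbolic debug mask to FILE.
# Note: this overwrites whatever mask FILE currently holds.
set_debug() {
    if [ -w "$1" ]; then
        echo "$2" > "$1"
    else
        echo "cannot write $1" >&2
        return 1
    fi
}

debug_file="${DEBUG_FILE:-/proc/sys/lnet/debug}"
if set_debug "$debug_file" "neterror"; then
    cat "$debug_file"    # read the mask back by name
fi
```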
673 Frequency : only in configurations with LNET routers
675 Description: routes automatically marked down and recovered
676 Details : In configurations with LNET routers, if a router fails, routers
677 now actively try to recover routes that are down, unless they
678 are marked down by an administrator.
680 ------------------------------------------------------------------------------
682 2006-12-09 Cluster File Systems, Inc. <info@clusterfs.com>
685 Frequency : very rarely, in configurations with LNET routers and TCP
687 Description: incorrect data written to files on OSTs
688 Details : In certain high-load conditions incorrect data may be written
689 to files on the OST when using TCP networks.
691 ------------------------------------------------------------------------------
693 2006-07-31 Cluster File Systems, Inc. <info@clusterfs.com>
695 - rework CDEBUG messages rate-limiting mechanism b=10375
696 - add per-socket tunables for socklnd if the kernel is patched b=10327
698 ------------------------------------------------------------------------------
700 2006-02-15 Cluster File Systems, Inc. <info@clusterfs.com>
702 - fix use of portals/lnet pid to avoid dropping RPCs b=10074
703 - iiblnd wasn't mapping all memory, resulting in comms errors b=9776
704 - quiet LNET startup LNI message for liblustre b=10128
705 - Better console error messages if 'ip2nets' can't match an IP address
706 - Fixed overflow/use-before-set bugs in linux-time.h
707 - Fixed ptllnd bug that wasn't initialising rx descriptors completely
708 - LNET teardown failed an assertion about the route table being empty
709 - Fixed a crash in LNetEQPoll(<invalid handle>)
710 - Future protocol compatibility work (b_rls146_lnetprotovrsn)
711 - improve debug message for liblustre/Catamount nodes (b=10116)
713 2005-10-10 Cluster File Systems, Inc. <info@clusterfs.com>
714 * Configuration change for the XT3
715 The PTLLND is now used to run Lustre over Portals on the XT3.
716 The configure option(s) --with-cray-portals are no longer
717 used. Rather --with-portals=<path-to-portals-includes> is
718 used to enable building on the XT3. In addition, to enable
719 XT3-specific features, the option --enable-cray-xt3 must be
722 2005-10-10 Cluster File Systems, Inc. <info@clusterfs.com>
723 * Portals has been removed, replaced by LNET.
724 LNET is the new networking infrastructure for Lustre. It includes a
725 reorganized network configuration mode (see the user
726 documentation for full details) as well as support for routing
727 between different network fabrics. Lustre Networking Devices
728 (LNDs) for the supported network fabrics have also been created
729 for this new infrastructure.
731 2005-08-08 Cluster File Systems, Inc. <info@clusterfs.com>
736 Frequency : rare (large Voltaire clusters only)
738 Description: the default number of reserved transmit descriptors was too low
739 for some large clusters
740 Details : As a workaround, the number was increased. A proper fix includes
743 2005-06-02 Cluster File Systems, Inc. <info@clusterfs.com>
748 Frequency : occasional (large-scale events, cluster reboot, network failure)
750 Description: too many error messages on console obscure actual problem and
751 can slow down/panic server, or cause recovery to fail repeatedly
752 Details : enable rate-limiting of console error messages, and some messages
753 that were console errors now only go to the kernel log
755 Severity : enhancement
757 Description: add /proc/sys/portals/catastrophe entry which will report if
758 that node has previously LBUGged
760 2005-04-06 Cluster File Systems, Inc. <info@clusterfs.com>
762 - update gmnal to use PTL_MTU, fix module refcounting (b=5786)
764 2005-04-04 Cluster File Systems, Inc. <info@clusterfs.com>
766 - handle error return code in kranal_check_fma_rx() (5915,6054)
768 2005-02-04 Cluster File Systems, Inc. <info@clusterfs.com>
770 - update vibnal (Voltaire IB NAL)
771 - update gmnal (Myrinet NAL), gmnalid
773 2005-02-04 Eric Barton <eeb@bartonsoftware.com>
775 * Landed portals:b_port_step as follows...
777 - removed CFS_DECL_SPIN*
778 just use 'spinlock_t' and initialise with spin_lock_init()
780 - removed CFS_DECL_MUTEX*
781 just use 'struct semaphore' and initialise with init_mutex()
783 - removed CFS_DECL_RWSEM*
784 just use 'struct rw_semaphore' and initialise with init_rwsem()
786 - renamed cfs_sleep_chan -> cfs_waitq
787 cfs_sleep_link -> cfs_waitlink
789 - fixed race in linux version of arch-independent socknal
790 (the ENOMEM/EAGAIN decision).
792 - Didn't fix problems in Darwin version of arch-independent socknal
793 (resetting socket callbacks, eager ack hack, ENOMEM/EAGAIN decision)
795 - removed libcfs types from non-socknal header files (only some types
796 in the header files had been changed; the .c files hadn't been