-Severity : major
-Frequency : only if OST filesystem is corrupted
-Bugzilla : 9829
-Description: client incorrectly hits assertion in ptlrpc_replay_req()
-Details : For a short time, RPCs with bulk IO are in the replay list,
- but replay of bulk IOs is unimplemented. If the OST filesystem
- is corrupted due to disk cache incoherency and replay is then
- started, it is possible to trip an assertion. Avoid the issue
- by never putting committed RPCs into the replay list.
-
-Severity : major
-Frequency : liblustre (e.g. catamount) on a large cluster with >= 8 OSTs
- per OSS
-Bugzilla : 11684
-Description: System hang on startup
-Details : This bug allowed the liblustre (e.g. catamount) client to
- return to the application before handling all startup RPCs.
- This could leave the node unresponsive to Lustre network
- traffic and manifested as a server ptllnd timeout.
-
-Severity : enhancement
-Bugzilla : 11667
-Description: Add "/proc/sys/lustre/debug_peer_on_timeout"
- (liblustre environment variable: LIBLUSTRE_DEBUG_PEER_ON_TIMEOUT)
- boolean to control whether to print peer debug info when a
- client's RPC times out.
-