Overview -------- These scripts will be used to collect application profiling info from lustre clients and servers. It will be run on a single (control) node, and collect all the profile info and create a tarball on the control node. lstat.sh : script for single node, will be run on each profile node. gather_stats_everywhere.sh : collect stats script. config.sh : customized configuration description Requirements ------- 1) Lustre is installed and setup on your cluster. 2) ssh/scp to these nodes works without requiring a password. Configuration ------ Configuration is very simple for this script - all of the profiling config VARs are in config.sh XXXX_INTERVAL: the profiling interval where value of interval means: 0 - gather stats at start and stop only N - gather stats every N seconds if XXX_INTERVAL isn't specified, XXX stats won't be collected XXX can be: VMSTAT, SERVICE, BRW, SDIO, MBALLOC, IO, JBD, CLIENT Running -------- The gather_stats_everywhere.sh should be run in three phases: a)sh gather_stats_everywhere.sh config.sh start It will start stats collection on each node specified in config.sh b)sh gather_stats_everywhere.sh config.sh stop It will stop collect stats on each node. If is provided, it will create a profile tarball /tmp/ c)sh gather_stats_everywhere.sh config.sh analyse log_tarball.tgz csv It will analyse the log_tarball and create a csv tarball for this profiling tarball. Example ------- When you want collect your profile info, you should 1) start the collect profile daemon on each node. sh gather_stats_everywhere.sh config.sh start 2) run your test. 3) stop the collect profile daemon on each node, cleanup the tmp file and create a profiling tarball. sh gather_stats_everywhere.sh config.sh stop log_tarball.tgz 4) create a csv file according to the profile. sh gather_stats_everywhere.sh config.sh analyse log_tarball.tgz csv TBD ------ Add liblustre profiling support and add more options for analyse.