hb_report man page on Mageia

hb_report man page on Mageia
Man page or keyword search:
man Server 17783 pages
apropos Keyword Search (all sections)
Output format
HB_REPORT(8)		    Pacemaker documentation		  HB_REPORT(8)

NAME
       hb_report - create report for CRM based clusters (Pacemaker)

SYNOPSIS
       hb_report -f {time|"cts:"testnum} [-t time] [-u user] [-l file] [-n
       nodes] [-E files] [-p patt] [-L patt] [-e prog] [-MSDCZAVsvhd] [dest]

DESCRIPTION
       The hb_report(1) is a utility to collect all information (logs,
       configuration files, system information, etc) relevant to Pacemaker
       (CRM) over the given period of time.

OPTIONS
       dest
	   The destination directory. Must be an absolute path. The resulting
	   tarball is placed in the parent directory and contains the last
	   directory element of this path. Typically something like
	   /tmp/standby-failed. If left out, the tarball is created in your
	   home directory named "hb_report-current_date", for instance
	   hb_report-Wed-03-Mar-2010.

       -d
	   Don’t create the compressed tar, but leave the result in a
	   directory.

       -f { time | "cts:"testnum }
	   The start time from which to collect logs. The time is in the
	   format as used by the Date::Parse perl module. For cts tests,
	   specify the "cts:" string followed by the test number. This option
	   is required.

       -t time
	   The end time to which to collect logs. Defaults to now.

       -n nodes
	   A list of space separated hostnames (cluster members). hb_report
	   may try to find out the set of nodes by itself, but if it runs on
	   the loghost which, as it is usually the case, does not belong to
	   the cluster, that may be difficult. Also, OpenAIS doesn’t contain a
	   list of nodes and if Pacemaker is not running, there is no way to
	   find it out automatically. This option is cumulative (i.e. use -n
	   "a b" or -n a -n b).

       -l file
	   Log file location. If, for whatever reason, hb_report cannot find
	   the log files, you can specify its absolute path.

       -E files
	   Extra log files to collect. This option is cumulative. By default,
	   /var/log/messages are collected along with the cluster logs.

       -M
	   Don’t collect extra log files, but only the file containing
	   messages from the cluster subsystems.

       -L patt
	   A list of regular expressions to match in log files for analysis.
	   This option is additive (default: "CRIT: ERROR:").

       -p patt
	   Additional patterns to match parameter name which contain sensitive
	   information. This option is additive (default: "passw.*").

       -A
	   This is an OpenAIS cluster. hb_report has some heuristics to find
	   the cluster stack, but that is not always reliable. By default,
	   hb_report assumes that it is run on a Heartbeat cluster.

       -u user
	   The ssh user. hb_report will try to login to other nodes without
	   specifying a user, then as "root", and finally as "hacluster". If
	   you have another user for administration over ssh, please use this
	   option.

       -S
	   Single node operation. Run hb_report only on this node and don’t
	   try to start slave collectors on other members of the cluster.
	   Under normal circumstances this option is not needed. Use if ssh(1)
	   does not work to other nodes.

       -Z
	   If destination directories exist, remove them instead of exiting
	   (this is default for CTS).

       -V
	   Print the version including the last repository changeset.

       -v
	   Increase verbosity. Normally used to debug unexpected behaviour.

       -h
	   Show usage and some examples.

       -D (obsolete)
	   Don’t invoke editor to fill the description text file.

       -e prog (obsolete)
	   Your favourite text editor. Defaults to $EDITOR, vim, vi, emacs, or
	   nano, whichever is found first.

       -C (obsolete)
	   Remove the destination directory once the report has been put in a
	   tarball.

EXAMPLES
       Last night during the backup there were several warnings encountered
       (logserver is the log host):

	   logserver# hb_report -f 3:00 -t 4:00 -n "node1 node2" /tmp/report

       collects everything from all nodes from 3am to 4am last night. The
       files are compressed to a tarball /tmp/report.tar.bz2.

       Just found a problem during testing:

	   # note the current time
	   node1# date
	   Fri Sep 11 18:51:40 CEST 2009
	   node1# /etc/init.d/heartbeat start
	   node1# nasty-command-that-breaks-things
	   node1# sleep 120 #wait for the cluster to settle
	   node1# hb_report -f 18:51 /tmp/hb1

	   # if hb_report can't figure out that this is openais
	   node1# hb_report -f 18:51 -A /tmp/hb1

	   # if hb_report can't figure out the cluster members
	   node1# hb_report -f 18:51 -n "node1 node2" /tmp/hb1

       The files are compressed to a tarball /tmp/hb1.tar.bz2.

INTERPRETING RESULTS
       The compressed tar archive is the final product of hb_report. This is
       one example of its content, for a CTS test case on a three node OpenAIS
       cluster:

	   $ ls -RF 001-Restart

	   001-Restart:
	   analysis.txt	    events.txt	logd.cf	      s390vm13/	 s390vm16/
	   description.txt  ha-log.txt	openais.conf  s390vm14/

	   001-Restart/s390vm13:
	   STOPPED  crm_verify.txt  hb_uuid.txt	 openais.conf@	 sysinfo.txt
	   cib.txt  dlm_dump.txt    logd.cf@	 pengine/	 sysstats.txt
	   cib.xml  events.txt	    messages	 permissions.txt

	   001-Restart/s390vm13/pengine:
	   pe-input-738.bz2  pe-input-740.bz2  pe-warn-450.bz2
	   pe-input-739.bz2  pe-warn-449.bz2   pe-warn-451.bz2

	   001-Restart/s390vm14:
	   STOPPED  crm_verify.txt  hb_uuid.txt	 openais.conf@	 sysstats.txt
	   cib.txt  dlm_dump.txt    logd.cf@	 permissions.txt
	   cib.xml  events.txt	    messages	 sysinfo.txt

	   001-Restart/s390vm16:
	   STOPPED  crm_verify.txt  hb_uuid.txt	 messages	 sysinfo.txt
	   cib.txt  dlm_dump.txt    hostcache	 openais.conf@	 sysstats.txt
	   cib.xml  events.txt	    logd.cf@	 permissions.txt

       The top directory contains information which pertains to the cluster or
       event as a whole. Files with exactly the same content on all nodes will
       also be at the top, with per-node links created (as it is in this
       example the case with openais.conf and logd.cf).

       The cluster log files are named ha-log.txt regardless of the actual log
       file name on the system. If it is found on the loghost, then it is
       placed in the top directory. Files named messages are excerpts of
       /var/log/messages from nodes.

       Most files are copied verbatim or they contain output of a command. For
       instance, cib.xml is a copy of the CIB found in
       /var/lib/heartbeat/crm/cib.xml. crm_verify.txt is output of the
       crm_verify(8) program.

       Some files are result of a more involved processing:

       analysis.txt
	   A set of log messages matching user defined patterns (may be
	   provided with the -L option).

       events.txt
	   A set of log messages matching event patterns. It should provide
	   information about major cluster motions without unnecessary
	   details. These patterns are devised by the cluster experts.
	   Currently, the patterns cover membership and quorum changes,
	   resource starts and stops, fencing (stonith) actions, and cluster
	   starts and stops. events.txt is always generated for each node. In
	   case the central cluster log was found, also combined for all
	   nodes.

       permissions.txt
	   One of the more common problem causes are file and directory
	   permissions. hb_report looks for a set of predefined directories
	   and checks their permissions. Any issues are reported here.

       backtraces.txt
	   gdb generated backtrace information for cores dumped within the
	   specified period.

       sysinfo.txt
	   Various release information about the platform, kernel, operating
	   system, packages, and anything else deemed to be relevant. The
	   static part of the system.

       sysstats.txt
	   Output of various system commands such as ps(1), uptime(1),
	   netstat(8), and ifconfig(8). The dynamic part of the system.

       description.txt should contain a user supplied description of the
       problem, but since it is very seldom used, it will be dropped from the
       future releases.

PREREQUISITES
       ssh
	   It is not strictly required, but you won’t regret having a
	   password-less ssh. It is not too difficult to setup and will save
	   you a lot of time. If you can’t have it, for example because your
	   security policy does not allow such a thing, or you just prefer
	   menial work, then you will have to resort to the semi-manual
	   semi-automated report generation. See below for instructions.

	   If you need to supply a password for your passphrase/login, then
	   please use the -u option.

       Times
	   In order to find files and messages in the given period and to
	   parse the -f and -t options, hb_report uses perl and one of the
	   Date::Parse or Date::Manip perl modules. Note that you need only
	   one of these. Furthermore, on nodes which have no logs and where
	   you don’t run hb_report directly, no date parsing is necessary. In
	   other words, if you run this on a loghost then you don’t need these
	   perl modules on the cluster nodes.

	   On rpm based distributions, you can find Date::Parse in
	   perl-TimeDate and on Debian and its derivatives in
	   libtimedate-perl.

       Core dumps
	   To backtrace core dumps gdb is needed and the packages with the
	   debugging info. The debug info packages may be installed at the
	   time the report is created. Let’s hope that you will need this
	   really seldom.

TIMES
       Specifying times can at times be a nuisance. That is why we have chosen
       to use one of the perl modules—they do allow certain freedom when
       talking dates. You can either read the instructions at the Date::Parse
       examples page[1]. or just rely on common sense and try stuff like:

	   3:00		 (today at 3am)
	   15:00	 (today at 3pm)
	   2007/9/1 2pm	 (September 1st at 2pm)
	   Tue Sep 15 20:46:27 CEST 2009 (September 15th etc)

       hb_report will (probably) complain if it can’t figure out what do you
       mean.

       Try to delimit the event as close as possible in order to reduce the
       size of the report, but still leaving a minute or two around for good
       measure.

       -f is not optional. And don’t forget to quote dates when they contain
       spaces.

SHOULD I SEND ALL THIS TO THE REST OF INTERNET?
       By default, the sensitive data in CIB and PE files is not mangled by
       hb_report because that makes PE input files mostly useless. If you
       still have no other option but to send the report to a public mailing
       list and do not want the sensitive data to be included, use the -s
       option. Without this option, hb_report will issue a warning if it finds
       information which should not be exposed. By default, parameters
       matching passw.* are considered sensitive. Use the -p option to specify
       additional regular expressions to match variable names which may
       contain information you don’t want to leak. For example:

	   # hb_report -f 18:00 -p "user.*" -p "secret.*" /var/tmp/report

       Heartbeat’s ha.cf is always sanitized. Logs and other files are not
       filtered.

LOGS
       It may be tricky to find syslog logs. The scheme used is to log a
       unique message on all nodes and then look it up in the usual syslog
       locations. This procedure is not foolproof, in particular if the syslog
       files are in a non-standard directory. We look in /var/log /var/logs
       /var/syslog /var/adm /var/log/ha /var/log/cluster. In case we can’t
       find the logs, please supply their location:

	   # hb_report -f 5pm -l /var/log/cluster1/ha-log -S /tmp/report_node1

       If you have different log locations on different nodes, well, perhaps
       you’d like to make them the same and make life easier for everybody.

       Files starting with "ha-" are preferred. In case syslog sends messages
       to more than one file, if one of them is named ha-log or ha-debug those
       will be favoured over syslog or messages.

       hb_report supports also archived logs in case the period specified
       extends that far in the past. The archives must reside in the same
       directory as the current log and their names must be prefixed with the
       name of the current log (syslog-1.gz or messages-20090105.bz2).

       If there is no separate log for the cluster, possibly unrelated
       messages from other programs are included. We don’t filter logs, but
       just pick a segment for the period you specified.

MANUAL REPORT COLLECTION
       So, your ssh doesn’t work. In that case, you will have to run this
       procedure on all nodes. Use -S so that hb_report doesn’t bother with
       ssh:

	   # hb_report -f 5:20pm -t 5:30pm -S /tmp/report_node1

       If you also have a log host which is not in the cluster, then you’ll
       have to copy the log to one of the nodes and tell us where it is:

	   # hb_report -f 5:20pm -t 5:30pm -l /var/tmp/ha-log -S /tmp/report_node1

       If you reconsider and want the ssh setup, take a look at the CTS README
       file for instructions.

OPERATION
       hb_report collects files and other information in a fairly
       straightforward way. The most complex tasks are discovering the log
       file locations (if syslog is used which is the most common case) and
       coordinating the operation on multiple nodes.

       The instance of hb_report running on the host where it was invoked is
       the master instance. Instances running on other nodes are slave
       instances. The master instance communicates with slave instances by
       ssh. There are multiple ssh invocations per run, so it is essential
       that the ssh works without password, i.e. with the public key
       authentication and authorized_keys.

       The operation consists of three phases. Each phase must finish on all
       nodes before the next one can commence. The first phase consists of
       logging unique messages through syslog on all nodes. This is the
       shortest of all phases.

       The second phase is the most involved. During this phase all local
       information is collected, which includes:

       ·   logs (both current and archived if the start time is far in the
	   past)

       ·   various configuration files (openais, heartbeat, logd)

       ·   the CIB (both as xml and as represented by the crm shell)

       ·   pengine inputs (if this node was the DC at any point in time over
	   the given period)

       ·   system information and status

       ·   package information and status

       ·   dlm lock information

       ·   backtraces (if there were core dumps)

       The third phase is collecting information from all nodes and analyzing
       it. The analyzis consists of the following tasks:

       ·   identify files equal on all nodes which may then be moved to the
	   top directory

       ·   save log messages matching user defined patterns (defaults to
	   ERRORs and CRITical conditions)

       ·   report if there were coredumps and by whom

       ·   report crm_verify(8) results

       ·   save log messages matching major events to events.txt

       ·   in case logging is configured without loghost, node logs and events
	   files are combined using a perl utility

BUGS
       Finding logs may at times be extremely difficult, depending on how
       weird the syslog configuration. It would be nice to ask syslog-ng
       developers to provide a way to find out the log destination based on
       facility and priority.

       If you think you found a bug, please rerun with the -v option and
       attach the output to bugzilla.

       hb_report can function in a satisfactory way only if ssh works to all
       nodes using authorized_keys (without password).

       There are way too many options.

AUTHOR
       Written by Dejan Muhamedagic, <dejan@suse.de[2]>

RESOURCES
       Pacemaker: http://clusterlabs.org/

       Heartbeat and other Linux HA resources: http://linux-ha.org/wiki

       OpenAIS: http://www.openais.org/

       Corosync: http://www.corosync.org/

SEE ALSO
       Date::Parse(3)

COPYING
       Copyright (C) 2007-2009 Dejan Muhamedagic. Free use of this software is
       granted under the terms of the GNU General Public License (GPL).

NOTES
	1. Date::Parse examples page
	   http://search.cpan.org/dist/TimeDate/lib/Date/Parse.pm#EXAMPLE_DATES

	2. dejan@suse.de
	   mailto:dejan@suse.de

hb_report 1.2			  10/17/2013			  HB_REPORT(8)
[top]

List of man pages available for Mageia

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome