docs: reflow docs/logging.rst to fill-column 77

Zooko O'Whielacronx 2011-08-18 21:41:03 -07:00
parent cd006ed46b
commit 41999430e0


@@ -26,9 +26,9 @@ went wrong.
The foolscap logging system is documented at
`<http://foolscap.lothar.com/docs/logging.html>`_.
The foolscap distribution includes a utility named "``flogtool``" (usually at
``/usr/bin/flogtool`` on Unix) which is used to get access to many foolscap
logging features.
Realtime Logging
================
@@ -45,11 +45,10 @@ this port and start emitting log information::
flogtool tail BASEDIR/private/logport.furl
The ``--save-to FILENAME`` option will save all received events to a file,
where they can be examined later with "``flogtool dump``" or "``flogtool
web-viewer``". The ``--catch-up`` option will ask the node to dump all stored
events before subscribing to new ones (without ``--catch-up``, you will only
hear about events that occur after the tool has connected and subscribed).
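For example, to have the node dump its stored events and then keep
following new ones, saving everything to a local file (the filename here is
purely illustrative), something like the following should work::

  flogtool tail --catch-up --save-to events.flog BASEDIR/private/logport.furl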
Incidents
=========
@@ -65,18 +64,18 @@ level or higher, but other criteria could be implemented.
The typical "incident report" we've seen in a large Tahoe grid is about 40kB
compressed, representing about 1800 recent events.
These "flogfiles" have a similar format to the files saved by
"``flogtool tail --save-to``". They are simply lists of log events, with a
small header to indicate which event triggered the incident.
These "flogfiles" have a similar format to the files saved by "``flogtool
tail --save-to``". They are simply lists of log events, with a small header
to indicate which event triggered the incident.
The "``flogtool dump FLOGFILE``" command will take one of these ``.flog.bz2``
files and print its contents to stdout, one line per event. The raw event
dictionaries can be dumped by using "``flogtool dump --verbose FLOGFILE``".
The "``flogtool web-viewer``" command can be used to examine the flogfile
in a web browser. It runs a small HTTP server and emits the URL on stdout.
This view provides more structure than the output of "``flogtool dump``":
the parent/child relationships of log events is displayed in a nested format.
The "``flogtool web-viewer``" command can be used to examine the flogfile in
a web browser. It runs a small HTTP server and emits the URL on stdout. This
view provides more structure than the output of "``flogtool dump``": the
parent/child relationships of log events are displayed in a nested format.
"``flogtool web-viewer``" is still fairly immature.
Working with flogfiles
@@ -86,10 +85,10 @@ The "``flogtool filter``" command can be used to take a large flogfile
(perhaps one created by the log-gatherer, see below) and copy a subset of
events into a second file. This smaller flogfile may be easier to work with
than the original. The arguments to "``flogtool filter``" specify filtering
criteria: a predicate that each event must match to be copied into the target
file. ``--before`` and ``--after`` are used to exclude events outside a given
window of time. ``--above`` will retain events above a certain severity
level. ``--from`` retains events sent by a specific tubid.
``--strip-facility`` removes events that were emitted with a given facility
(like ``foolscap.negotiation`` or ``tahoe.upload``).
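Combining a few of these options, a hypothetical invocation might look like
the following (the filenames and the spelling of the severity level are
placeholders; "``flogtool filter --help``" lists the exact argument forms)::

  flogtool filter --above WEIRD --strip-facility foolscap.negotiation \
      gathered.flog.bz2 filtered.flog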
@@ -102,8 +101,8 @@ modest) storage requirements to a different host and provides access to
logfiles from multiple nodes (web-API, storage, or helper) in a single place.
There are two kinds of gatherers: "log gatherer" and "stats gatherer". Each
produces a FURL which needs to be placed in the ``NODEDIR/tahoe.cfg`` file of
each node that is to publish to the gatherer, under the keys
"log_gatherer.furl" and "stats_gatherer.furl" respectively. When the Tahoe
node starts, it will connect to the configured gatherers and offer its
logport: the gatherer will then use the logport to subscribe to hear about
@@ -127,35 +126,38 @@ provided in ``misc/incident-gatherer/support_classifiers.py`` . There is
roughly one category for each ``log.WEIRD``-or-higher level event in the
Tahoe source code.
The incident gatherer is created with the "``flogtool
create-incident-gatherer WORKDIR``" command, and started with "``tahoe
start``". The generated "``gatherer.tac``" file should be modified to add
classifier functions.
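In practice the setup amounts to something like the following, where
``WORKDIR`` is a fresh directory for the gatherer to use::

  flogtool create-incident-gatherer WORKDIR
  tahoe start WORKDIR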
The incident gatherer writes incident names (which are simply the relative
pathname of the ``incident-\*.flog.bz2`` file) into ``classified/CATEGORY``.
For example, the ``classified/mutable-retrieve-uncoordinated-write-error``
file contains a list of all incidents which were triggered by an
uncoordinated write that was detected during mutable file retrieval (caused
when somebody changed the contents of the mutable file in between the node's
mapupdate step and the retrieve step). The ``classified/unknown`` file
contains a list of all incidents that did not match any of the classification
functions.
At startup, the incident gatherer will automatically reclassify any incident
report which is not mentioned in any of the ``classified/\*`` files. So the
usual workflow is to examine the incidents in ``classified/unknown``, add a
new classification function, delete ``classified/unknown``, then bounce the
gatherer with "``tahoe restart WORKDIR``". The incidents which can be
classified with the new functions will be added to their own
``classified/FOO`` lists, and the remaining ones will be put in
``classified/unknown``, where the process can be repeated until all events
are classifiable.
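A sketch of that loop, assuming the gatherer lives in ``WORKDIR``::

  cat WORKDIR/classified/unknown   # inspect the unclassified incidents
  # ...add a new classifier function to gatherer.tac...
  rm WORKDIR/classified/unknown
  tahoe restart WORKDIR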
The incident gatherer is still fairly immature: future versions will have a
web interface and an RSS feed, so operations personnel can track problems in
the storage grid.
In our experience, each incident takes about two seconds to transfer from the
node that generated it to the gatherer. The gatherer will automatically catch
up to any incidents which occurred while it was offline.
Log Gatherer
------------
@@ -170,12 +172,11 @@ contain events from many different sources, making it easier to correlate
things that happened on multiple machines (such as comparing a client node
making a request with the storage servers that respond to that request).
Create the Log Gatherer with the "``flogtool create-gatherer WORKDIR``"
command, and start it with "``tahoe start``". Then copy the contents of the
``log_gatherer.furl`` file it creates into the ``BASEDIR/tahoe.cfg`` file
(under the key ``log_gatherer.furl`` of the section ``[node]``) of all nodes
that should be sending it log events. (See `<configuration.rst>`_.)
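The resulting stanza in each node's ``tahoe.cfg`` looks something like this
(the FURL shown is an illustrative placeholder; use the value from your own
``log_gatherer.furl`` file)::

  [node]
  log_gatherer.furl = pb://abcdefghijklmnop@example.org:3456/log-gatherer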
The "``flogtool filter``" command, described above, is useful to cut down the
potentially large flogfiles into a more focussed form.
@@ -194,11 +195,11 @@ Local twistd.log files
In addition to the foolscap-based event logs, certain high-level events will
be recorded directly in human-readable text form, in the
``BASEDIR/logs/twistd.log`` file (and its rotated old versions:
``twistd.log.1``, ``twistd.log.2``, etc). This form does not contain as much
information as the flogfiles available through the means described
previously, but these logs are immediately available to the curious
developer, and are retained until the twistd.log.NN files are explicitly
deleted.
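For a quick look at a running node, plain Unix tools are enough; for
example::

  tail -F BASEDIR/logs/twistd.log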
Only events at the ``log.OPERATIONAL`` level or higher are bridged to
``twistd.log`` (i.e. not the ``log.NOISY`` debugging events). In addition,
@@ -224,22 +225,22 @@ but a few notes are worth stating here:
clustered with its parent. For example, a download process that involves
three sequential hash fetches could announce the send and receipt of those
hash-fetch messages with a ``parent=`` argument that ties them to the
overall download process. However, each new web-API download request should
be unparented.
* use the ``format=`` argument in preference to the ``message=`` argument.
E.g. use ``log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k)``
instead of ``log.msg("got %d shares, need %d" % (n,k))``. This will allow
later tools to analyze the event without needing to scrape/reconstruct the
structured data out of the formatted string.
* Pass extra information as extra keyword arguments, even if they aren't
included in the ``format=`` string. This information will be displayed in
the "``flogtool dump --verbose``" output, as well as being available to
other tools. The ``umid=`` argument should be passed this way.
* use ``log.err`` for the catch-all ``addErrback`` that gets attached to the
end of any given Deferred chain. When used in conjunction with
``LOGTOTWISTED=1``, ``log.err()`` will tell Twisted about the error-nature
of the log message, causing Trial to flunk the test (with an "ERROR"
indication that prints a copy of the Failure, including a traceback).
@@ -270,12 +271,12 @@ by running the tests like this::
make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1
The first environment variable will cause foolscap log events to be written
to ``./flog.out.bz2`` (instead of merely being recorded in the circular
buffers for the use of remote subscribers or incident reports). The second
will cause all log events to be written out, not just the higher-severity
ones. The third will cause twisted log events (like the markers that indicate
when each unit test is starting and stopping) to be copied into the flogfile,
making it easier to correlate log events with unit tests.
Enabling this form of logging appears to roughly double the runtime of the
unit tests. The ``flog.out.bz2`` file is approximately 2MB.