13 Commits

Author SHA1 Message Date
Brian Warner
8473a96ada #330: convert stats-gatherer into a .tac file service, add 'tahoe create-stats-gatherer' 2008-11-18 01:46:20 -07:00
Brian Warner
cd26f58305 #518: replace various BASEDIR/* config files with a single BASEDIR/tahoe.cfg, with backwards-compatibility of course 2008-09-30 16:21:49 -07:00
Brian Warner
462ef2a0ac run a stats provider even if there's no gatherer, since the HTTP /statistics page is then useful. Only run the once-per-second load-monitor if there is a gatherer configured 2008-05-08 11:37:30 -07:00
Brian Warner
a5a7ba24ef stats: add tests for CPUUsageMonitor, modify it a bit to facilitate testing 2008-04-30 11:39:13 -07:00
Brian Warner
3be921174b stats: add CPU-percentage monitor, with 1min/5min/15min moving-window averages, using time.clock() 2008-04-29 18:12:53 -07:00
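As a rough illustration of the moving-window averages this commit mentions, here is a minimal sketch. It is not the actual CPUUsageMonitor code: the class name `CPUWindow`, its methods, and the sampling scheme are all hypothetical. Note that time.clock() was removed in Python 3.8; a modern caller would probably feed in time.process_time() instead.

```python
from collections import deque


class CPUWindow:
    """Hypothetical helper: keeps (wall_time, cpu_time) samples and
    reports the fraction of CPU used over a trailing window."""

    def __init__(self, max_age=15 * 60):
        self.max_age = max_age      # oldest sample kept, in seconds
        self.samples = deque()      # (wall_seconds, cpu_seconds), oldest first

    def add_sample(self, wall, cpu):
        self.samples.append((wall, cpu))
        # Drop samples older than the largest window we care about.
        while wall - self.samples[0][0] > self.max_age:
            self.samples.popleft()

    def average(self, window):
        """CPU fraction over the last `window` seconds, or None if there
        is not yet enough data."""
        if len(self.samples) < 2:
            return None
        now_wall, now_cpu = self.samples[-1]
        # Find the oldest retained sample that is still inside the window.
        for wall, cpu in self.samples:
            if now_wall - wall <= window:
                break
        if wall == now_wall:
            return None
        return (now_cpu - cpu) / (now_wall - wall)
```

In use, a periodic timer would call something like `add_sample(time.time(), time.process_time())`, and the 1min/5min/15min figures would come from `average(60)`, `average(300)`, and `average(900)`.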
Brian Warner
5b8320442a stats: add /statistics web page to show them, add tests 2008-04-14 14:17:08 -07:00
robk-tahoe
766deaa9b6 stats_gatherer: reconcile helper stats gathering
I'd implemented stats gathering hooks in the helper a while back.
Brian did the same without reference to my changes.  This reconciles
those two changes, encompassing all the stats in both changes,
implemented through the stats_provider interface.

This also provides templates for all 10 helper graphs in the
tahoe-stats munin plugin.
2008-04-10 17:25:44 -07:00
robk-tahoe
0d2eb1edf6 stats_gatherer: verbose debug logging
One of the storage servers is throwing foolscap violations about the
return value of get_stats().  This logs the returned data to the
foolscap log event stream at debug level '12' (between NOISY(10) and
OPERATIONAL(20)).  Hopefully this will help find the cause of the
problem.
2008-04-09 16:10:53 -07:00
Brian Warner
7e159feb27 stats: make StatsGatherer happy about sharing a process with other services, add one during system test to get some test coverage 2008-03-03 23:55:58 -07:00
robk-tahoe
f5a803303f stats: fix service issues
Having moved initialisation into startService to handle tub init cleanly,
I neglected the up-call to startService, which wound up not starting the
load_monitor.

Also, I changed the 'running' attribute to 'started', since 'running' is
the name used internally by MultiService itself.
2008-02-01 18:57:31 -07:00
robk-tahoe
e5487bbe21 stats: added IStatsProducer interface, fixed stats provider startup
This adds an interface, IStatsProducer, defining the get_stats() method
which the stats provider calls on each registered producer, and makes the
register_producer() method check that the interface is implemented.

Also refine the startup logic, so that the stats provider doesn't try to
connect out to the stats gatherer until after the node declares the tub
'ready'.  This addresses an issue whereby providers would attach to
the gatherer without providing a valid furl, and hence the gatherer would
be unable to determine the tubid of the connected client, leading to lost
samples.
2008-01-31 21:10:15 -07:00
robk-tahoe
0700ccabaa stats_gatherer: reject "<unauth>" as a tubid, to avoid screwing up the data. 2008-01-31 19:11:31 -07:00
robk-tahoe
7b9f3207d0 stats: add a simple stats gathering system
We want to collect runtime statistics from multiple nodes, primarily
for server monitoring purposes.  This adds a simple implementation of
such a system, as a skeleton to build more sophistication upon.

Each client now looks for a 'stats_gatherer.furl' config file.  If it has
been configured to use a stats gatherer, then it instantiates internally
a StatsProvider.  This is a central place for code which wishes to offer
stats up for monitoring to report them to, either by calling 
stats_provider.count('stat.name', value) to increment a counter, or by
registering a class as a stats producer with sp.register_producer(obj).

The StatsProvider connects to the StatsGatherer server and presents
itself to it upon startup.  The StatsGatherer is then responsible for
polling the attached providers periodically to retrieve their data.
The provider queries each registered producer when the gatherer queries
the provider.  Both the internal 'counters' and the queried 'stats' are
then reported to the gatherer.
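The flow described above can be sketched roughly as follows. This is an illustrative stand-in, not the code from this commit: only the method names count(), register_producer(), and get_stats() come from the commit messages, while the class bodies, the `ExampleProducer` class, and the shape of the returned dictionary are assumptions. The real provider is polled remotely over foolscap, which is omitted here.

```python
class StatsProvider:
    """Hypothetical in-process sketch of a stats provider."""

    def __init__(self):
        self.counters = {}    # name -> running total
        self.producers = []   # objects offering get_stats()

    def count(self, name, delta=1):
        # Increment a named counter, creating it on first use.
        self.counters[name] = self.counters.get(name, 0) + delta

    def register_producer(self, producer):
        # Assumption: a duck-typed check stands in for a formal
        # interface check.
        if not callable(getattr(producer, "get_stats", None)):
            raise TypeError("producer must implement get_stats()")
        self.producers.append(producer)

    def get_stats(self):
        # Called when the gatherer polls: query every producer, then
        # report both the counters and the collected stats.
        stats = {}
        for producer in self.producers:
            stats.update(producer.get_stats())
        return {"counters": dict(self.counters), "stats": stats}


class ExampleProducer:
    """Hypothetical producer reporting a single made-up value."""

    def get_stats(self):
        return {"example.value": 42}
```

Code that wants to report an event would call `sp.count('stat.name', 1)`, while longer-lived objects register once and are queried on every poll.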

This also provides a simple gatherer app (cf. make stats-gatherer-run)
which prints its furl and listens for incoming connections.  Once a
minute, the gatherer polls all connected providers and writes the
retrieved data into a pickle file.

Also included is a munin plugin which knows how to read the gatherer's
stats.pickle and output data munin can interpret.  This plugin,
tahoe-stats.py, can be symlinked under multiple names within
munin's 'plugins' directory.  It inspects argv to determine which
data to display, doing a lookup in a table within that file.
It looks in the environment for 'statsfile' to determine the path to
the gatherer's stats.pickle.  An example plugins-conf.d file is
provided.
2008-01-30 20:11:07 -07:00
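The symlink-name dispatch described for the munin plugin in the commit above might look something like this minimal sketch. The table contents, the plugin name `tahoe_example_counter`, and the function names are hypothetical. Only the 'statsfile' environment variable and the idea of looking up the invoked name in a table come from the commit message.

```python
import os
import pickle

# Hypothetical lookup table: invoked (symlink) name -> what to graph.
PLUGINS = {
    "tahoe_example_counter": {
        "title": "Example counter",
        "key": "example.counter",
    },
}


def select_plugin(argv0, table=PLUGINS):
    """Pick a table entry from the name the script was invoked as."""
    name = os.path.basename(argv0)
    return table[name]


def load_stats(environ=os.environ):
    """Read the gatherer's pickle from the path given in 'statsfile'."""
    with open(environ["statsfile"], "rb") as f:
        return pickle.load(f)
```

A script built this way would call `select_plugin(sys.argv[0])` at startup, so a single file can serve many graphs, with each symlink name selecting one table entry.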