49 Commits

Author SHA1 Message Date
Zooko O'Whielacronx
c1184f51e4 docs: fix a few stale comments in code 2008-05-07 08:39:03 -07:00
robk-tahoe
35319c3380 stats gathering: fix storage server stats if not tracking consumed
the RIStatsProvider interface requires that counter and stat values be
ChoiceOf(float, int, long).  the recent changes to storage server to not
track 'consumed' led to returning None as the value of a counter.
this causes violations to be experienced by nodes whose stats are being
gathered.

this patch simply omits that stat if 'consumed' is not being tracked.
2008-04-09 18:23:06 -07:00
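A minimal sketch of the fix described in 35319c3380, not the actual storage server code: the stat is left out of the returned dict rather than reported as None. The function name and the stat names are hypothetical, chosen only for illustration.

```python
def build_storage_stats(allocated, consumed=None):
    """Return a stats dict, omitting 'consumed' when it is not tracked.

    Every value stays a number, which is what a ChoiceOf(float, int, long)
    style constraint would accept; None is never emitted.
    """
    stats = {"storage_server.allocated": allocated}  # hypothetical stat name
    if consumed is not None:
        # only report the stat when the server is tracking it
        stats["storage_server.consumed"] = consumed  # hypothetical stat name
    return stats
```

With consumed left at None, the returned dict simply has no entry for it, so a schema check on the values has nothing to reject.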
Zooko O'Whielacronx
8783eabf5a don't do a du on startup if there is no size limit configured
This also turns off the production of the "space measurement done" log message, if there is no size limit configured.
2008-04-08 11:36:56 -07:00
Brian Warner
d6be5116f5 storage: emit log messages on bucket allocate/read and mutable writev 2008-03-27 17:33:58 -07:00
Zooko O'Whielacronx
0aa0efa123 storage servers announce that they will support clients as old as v0.8.0
Not that anyone pays attention to what storage servers claim about what versions they will support.
2008-03-13 09:10:11 -07:00
Brian Warner
886ef22335 webish: download-results: add server_problems 2008-03-03 20:30:35 -07:00
Brian Warner
2b49605c51 webish: add 'download results', with some basic timing information 2008-03-03 19:19:21 -07:00
Brian Warner
d96f90e1fb log more peerinfo in download/upload/checker problems 2008-02-26 17:33:14 -07:00
Zooko O'Whielacronx
7c6de95bc6 switch from base62 to base32 for storage indices, switch from z-base-32 to rfc 3548 base-32 for everything, separate out base32 encoding from idlib 2008-02-14 19:27:47 -07:00
Zooko O'Whielacronx
3f8df27063 use base62 encoding for storage indexes, on disk and in verifier caps, and in logging and diagnostic tools
base62 encoding fits more information into alphanumeric chars while avoiding the troublesome non-alphanumeric chars of base64 encoding.  In particular, this allows us to work around the ext3 "32,000 entries in a directory" limit while retaining the convenient property that the intermediate directory names are leading prefixes of the storage index file names.
2008-02-12 20:48:37 -07:00
Brian Warner
5103bf8148 storage: change service name from 'storageserver' to 'storage' 2008-02-05 20:28:59 -07:00
Brian Warner
daecca6589 big introducer refactoring: separate publish+subscribe. Addresses #271. 2008-02-05 13:05:13 -07:00
robk-tahoe
e5487bbe21 stats: added IStatsProducer interface, fixed stats provider startup
this adds an interface, IStatsProducer, defining the get_stats() method
which the stats provider calls upon any registered producer, and made the
register_producer() method check that interface is implemented.

also refine the startup logic, so that the stats provider doesn't try and
connect out to the stats gatherer until after the node declares the tub
'ready'.  this is to address an issue whereby providers would attach to
the gatherer without providing a valid furl, and hence the gatherer would
be unable to determine the tubid of the connected client, leading to lost
samples.
2008-01-31 21:10:15 -07:00
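A rough sketch of the interface check described in e5487bbe21. The commit does not show how the check is implemented, so typing.Protocol is used here purely as a stand-in. Only the names IStatsProducer, get_stats() and register_producer() come from the message; DiskStats and the TypeError are assumptions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class IStatsProducer(Protocol):
    """Stand-in interface: a producer must offer get_stats()."""

    def get_stats(self) -> dict: ...


class StatsProvider:
    def __init__(self):
        self.producers = []

    def register_producer(self, producer):
        # refuse anything that does not provide get_stats()
        if not isinstance(producer, IStatsProducer):
            raise TypeError("producer must implement get_stats()")
        self.producers.append(producer)


class DiskStats:  # hypothetical producer
    def get_stats(self):
        return {"disk.free": 123}
```

Registering DiskStats() succeeds, while registering a plain object() raises TypeError at registration time instead of failing later during a poll.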
Brian Warner
670933ecee storage: clean up use of si_s vs si_dir, add test for BadWriterEnabler message, add some logging 2008-01-31 17:48:48 -07:00
Zooko O'Whielacronx
79c439d026 storage: make two levels of share directories so as not to exceed certain filesystems' limitations on directory size
The filesystem which gets my vote for most undeservedly popular is ext3, and it has a hard limit of 32,000 entries in a directory.  Many other filesystems (even ones that I like more than I like ext3) have either hard limits or bad performance consequences or weird edge cases when you get too many entries in a single directory.

This patch makes it so that there is a layer of intermediate directories between the "shares" directory and the actual storage-index directory (the one whose name contains the entire storage index (z-base-32 encoded) and which contains one or more share files named by their share number).

The intermediate directories are named by the first 14 bits of the storage index, which means there are at most 16384 of them.  (This also means that the intermediate directory names are not a leading prefix of the storage-index directory names -- to do that would have required us to have intermediate directories limited to either 1024 (2-char), which is too few, or 32768 (3-chars of a full 5 bits each), which would overrun ext3's funny hard limit of 32,000.)

This closes #150, and please see the "convertshares.py" script attached to #150 to convert your old tahoe-0.7.0 storage/shares directory into a new tahoe-0.8.0 storage/shares directory.
2008-01-31 16:26:28 -07:00
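The layout described in 79c439d026 can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the commit names directories with z-base-32, whereas this sketch uses hex for both levels to stay within the standard library, and the function names are hypothetical. The point is only that taking 14 bits yields at most 2**14 = 16384 intermediate directories.

```python
def first_14_bits(storage_index: bytes) -> int:
    """Return the leading 14 bits of the storage index as an integer."""
    if len(storage_index) < 2:
        raise ValueError("storage index must be at least 2 bytes long")
    first_16 = int.from_bytes(storage_index[:2], "big")
    return first_16 >> 2  # drop the low 2 bits, keeping the top 14


def share_dir(storage_index: bytes) -> str:
    """Build shares/<intermediate>/<full storage index> (hex stand-in)."""
    intermediate = format(first_14_bits(storage_index), "04x")
    full = storage_index.hex()  # assumption: hex instead of z-base-32
    return "/".join(["shares", intermediate, full])
```

Because the 14 bits do not fall on an encoding-character boundary, the intermediate name is generally not a leading prefix of the full name, which matches the caveat in the commit message.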
robk-tahoe
7b9f3207d0 stats: add a simple stats gathering system
We have a desire to collect runtime statistics from multiple nodes primarily
for server monitoring purposes.   This is a simple implementation of
such a system, as a skeleton to build more sophistication upon.

Each client now looks for a 'stats_gatherer.furl' config file.  If it has
been configured to use a stats gatherer, then it instantiates internally
a StatsProvider.  This is a central place for code which wishes to offer
stats up for monitoring to report them to, either by calling 
stats_provider.count('stat.name', value) to increment a counter, or by
registering a class as a stats producer with sp.register_producer(obj).

The StatsProvider connects to the StatsGatherer server and provides its
provider upon startup.  The StatsGatherer is then responsible for polling
the attached providers periodically to retrieve the data provided.
The provider queries each registered producer when the gatherer queries
the provider.  Both the internal 'counters' and the queried 'stats' are
then reported to the gatherer.

This provides a simple gatherer app (cf. make stats-gatherer-run)
which prints its furl and listens for incoming connections.  Once a
minute, the gatherer polls all connected providers, and writes the
retrieved data into a pickle file.

Also included is a munin plugin which knows how to read the gatherer's
stats.pickle and output data munin can interpret.  this plugin,
tahoe-stats.py, can be symlinked as multiple different names within
munin's 'plugins' directory, and inspects argv to determine which
data to display, doing a lookup in a table within that file.
It looks in the environment for 'statsfile' to determine the path to
the gatherer's stats.pickle.  An example plugins-conf.d file is
provided.
2008-01-30 20:11:07 -07:00
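The provider side described in 7b9f3207d0 might look roughly like the sketch below, with the foolscap networking and the gatherer left out. count() and register_producer() are named in the message; get_stats() on producers is taken from the later IStatsProducer commit listed above. The get() method, the default delta of 1, and the returned dict shape are assumptions.

```python
class StatsProvider:
    """In-process sketch: counters plus stats polled from producers."""

    def __init__(self):
        self.counters = {}
        self.producers = []

    def count(self, name, delta=1):
        # increment a named counter, creating it on first use
        self.counters[name] = self.counters.get(name, 0) + delta

    def register_producer(self, producer):
        self.producers.append(producer)

    def get(self):
        # called when the gatherer polls: query every registered producer
        stats = {}
        for producer in self.producers:
            stats.update(producer.get_stats())
        return {"counters": dict(self.counters), "stats": stats}


class UptimeProducer:  # hypothetical producer
    def get_stats(self):
        return {"node.uptime": 42}
```

A caller would increment counters as events happen (for example sp.count('stat.name', 3)) and the gatherer would receive both dicts on each poll.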
Brian Warner
8063aa8b86 WriteBucketProxy: improve __repr__ 2008-01-28 18:53:51 -07:00
Brian Warner
a6ca98ac53 upload: add Encoder.abort(), to abandon the upload in progress. Add some debug hooks to enable unit tests. 2008-01-14 21:22:55 -07:00
Brian Warner
76ee9cccfe storage: improve logging a bit 2008-01-14 11:58:58 -07:00
Brian Warner
841c1a8509 storage.py: factor out a common compare() routine 2007-12-05 00:20:34 -07:00
Zooko O'Whielacronx
3605354a95 fix several bugs and warnings -- thanks, pyflakes 2007-12-03 15:42:35 -07:00
Zooko O'Whielacronx
59d6c3c822 decentralized directories: integration and testing
* use new decentralized directories everywhere instead of old centralized directories
 * provide UI to them through the web server
 * provide UI to them through the CLI
 * update unit tests to simulate decentralized mutable directories in order to test other components that rely on them
 * remove the notion of a "vdrive server" and a client thereof
 * remove the notion of a "public vdrive", which was a directory that was centrally published/subscribed automatically by the tahoe node (you can accomplish this manually by making a directory and posting the URL to it on your web site, for example)
 * add a notion of "wait_for_numpeers" when you need to publish data to peers, which is how many peers should be attached before you start.  The default is 1.
 * add __repr__ for filesystem nodes (note: these reprs contain a few bits of the secret key!)
 * fix a few bugs where we used to equate "mutable" with "not read-only".  Nowadays all directories are mutable, but some might be read-only (to you).
 * fix a few bugs where code wasn't aware of the new general-purpose metadata dict that comes with each filesystem edge
 * sundry fixes to unit tests to adjust to the new directories, e.g. don't assume that every share on disk belongs to a chk file.
2007-12-03 14:52:42 -07:00
Brian Warner
ba43c033fa storage.py: add a little logging (disabled) 2007-11-07 14:14:54 -07:00
Brian Warner
c4f7412f1c stabilize on 20-byte nodeids everywhere, printed with foolscap's base32 2007-11-06 18:49:59 -07:00
Brian Warner
e08b091d9f storage: rewrite slot API, now use testv_and_readv_and_writev or readv 2007-11-05 20:17:14 -07:00
Brian Warner
8f21424449 storage: add readv_slots: get data from all shares 2007-11-05 00:37:01 -07:00
Brian Warner
bcf84c1238 storage.py: fix tests, timestamps get updated when leases are renewed 2007-10-31 12:31:33 -07:00
Brian Warner
70e7961088 storage.py: more test coverage, make sure leases survive resizing 2007-10-31 12:07:47 -07:00
Brian Warner
948e6b34dd storage.py: improve test coverage even more 2007-10-31 01:44:01 -07:00
Brian Warner
4bd739435f storage.py: more mutable-slot coverage, renewing/cancelling leases 2007-10-31 01:31:56 -07:00
Brian Warner
256ef1bf53 mutable slots: add some test coverage for lease-addition 2007-10-31 00:38:30 -07:00
Brian Warner
68d3d62002 mutable slots: finish up basic coding on server-side containers, add some tests. Remove all caching from MutableShareFile. 2007-10-31 00:10:40 -07:00
Brian Warner
b24c2925e8 checkpointing mutable-file work. Storage layer is 80% in place. 2007-10-30 19:47:36 -07:00
Brian Warner
8451b485a4 storage: fill alreadygot= with all known shares for the given storageindex, not just the ones they asked about 2007-09-17 00:48:40 -07:00
Brian Warner
d628d5f503 storage: remove the leftover incoming/XYZ/ directory when we're done with it 2007-09-15 14:34:04 -07:00
Brian Warner
e1e037e9b5 storage: always record lease expiration times as integers 2007-09-11 14:53:31 -07:00
Brian Warner
277e720f7c storage: add version number to share data. Closes #90. 2007-09-04 09:00:24 -07:00
Brian Warner
fb65aadd82 storage: don't add a duplicate lease, renew the old one instead 2007-09-02 21:39:47 -07:00
Brian Warner
89c7f27572 storage: remove get_or_add_owner, since I don't know what we need yet 2007-09-02 15:03:40 -07:00
Brian Warner
94233b8813 storage: remove unused delete_bucket() method, lease-cancellation covers it 2007-09-02 15:00:29 -07:00
Brian Warner
85f3107b12 storage: handle simultaneous uploads: add a lease for the pre-empted client 2007-09-02 14:57:49 -07:00
Brian Warner
0fe1205789 storage: replace sqlite with in-share lease records 2007-09-02 14:47:15 -07:00
Brian Warner
a605fe5cad storage: use sqlite from either python2.5's stdlib or the pysqlite2 package 2007-08-28 23:28:52 -07:00
Brian Warner
2a63fe8b01 deletion phase3: add a sqlite database to track renew/cancel-lease secrets, implement renew/cancel_lease (but nobody calls them yet). Also, move the shares from BASEDIR/storage/* down to BASEDIR/storage/shares/* 2007-08-27 23:41:40 -07:00
Brian Warner
739ae1ccde deletion phase1: send renew/cancel-lease secrets, but my_secret is fake, and the StorageServer discards them 2007-08-27 17:28:51 -07:00
Brian Warner
1aa22b9abd client.py: add a 'debug_no_storage' option to throw out all share data 2007-07-16 18:07:03 -07:00
Brian Warner
8a39ee9034 storage.py: turn some assertions into preconditions 2007-07-13 19:30:48 -07:00
Brian Warner
9fc687cdfc storage.py: handle num_segments != power-of-two without an assertion 2007-07-13 19:30:21 -07:00
Brian Warner
c6f52e379a rename storageserver.py to just storage.py, since it has both server and client sides now 2007-07-13 17:25:45 -07:00