The Fake*Node classes in test/common.py were accumulating share data in
a class-level dictionary, which persisted from one test run to the next.
As a result, running test_web.py over and over (with trial's
--until-failure feature) made this dictionary grow without bound,
eventually running out of memory.
This fix moves that dictionary into the FakeClient built fresh for each
test, so it doesn't build up. It does the same thing for "file_types",
which was much smaller but still lived at the class level.
Closes#1729
This stores IDisplayableServer-providing instances (StubServers or
NativeStorageServers) in the .servermap and .sharemap dictionaries. But
get_servermap()/get_sharemap() still return data structures with
serverids, not IServers, by translating their data on the way out. This
lets us put off changing the callers for a little bit longer.
Complete the getter-based transformation, by hiding ".uri" and updating
callers to use get_uri(). Also don't set a dummy self._uri, leave it
undefined until someone calls set_uri().
This hides attributes with e.g. _sharemap, and creates getters like
get_sharemap() to access them, for every field except .uri . This will
make it easier to modify the internal representation of .sharemap
without requiring callers to adjust quite yet.
".uri" has so many users that it seemed better to update it in a
subsequent patch.
Populate most of UploadResults (except .uri, which is learned later when
using a Helper) in the constructor, instead of allowing creators to
write to attributes later. This will help isolate the fields that we
want to change to use IServers.
This splits the pb.Copyable on-wire object (HelperUploadResults) out
from the local results object (UploadResults). To maintain compatibility
with older Helpers, we have to leave pb.Copyable classes alone and
unmodified, but we want to change UploadResults to use IServers instead
of serverids. So by using a different class on the wire, and translating
to/from it on either end, we can accomplish both.
Unlike set.union(), which returns a new set, DictOfSets.union() modified
the DictOfSets in-place. The name collision bit me when I changed some
code from using DictOfSets to a normal set, and expected that
set.union() would modify the set in-place. Since there was only one user
of DictOfSets.union, I figured it was safer to just get rid of it.
If a server did not respond to the pre-repair filecheck, but did respond
to the repair, that server was not correctly added to the
RepairResults.data["servers-responding"] list. (This resulted from a
buggy usage of DictOfSets.union() in filenode.py).
In addition, servers to which filecheck queries were sent, but did not
respond, were incorrectly added to the servers-responding list
anyawys. (This resulted from code in the checker.py not paying attention
to the 'responded' flag).
The first bug was neatly masked by the second: it's pretty rare to have
a server suddenly start responding in the one-second window between a
filecheck and a subsequent repair, and if the server was around for the
filecheck, you'd never notice the problem. I only spotted the smelly
code while I was changing it for IServer cleanup purposes.
I added coverage to test_repairer.py for this. Trying to get that test
to fail before fixing the first bug is what led me to discover the
second bug. I also had to update test_corrupt_file_verno, since it was
incorrectly asserting that 10 servers responded, when in fact one of
them throws an error (but the second bug was causing it to be reported
anyways).
Previously, test_runner sometimes fails because the _node_has_started()
poller fires after the portnum file has been opened, but before it has
actually been filled, allowing the test process to observe an empty file,
which flunks the test.
This adds a new fileutil.write_atomically() function (using the usual
write-to-.tmp-then-rename approach), and uses it for both node.url and
client.port . These files are written a bit before the node is really up and
running, but they're late enough for test_runner's purposes, which is to know
when it's safe to read client.port and use 'tahoe restart' (and therefore
SIGINT) to restart the node.
The current node/client code doesn't offer any better "are you really done
with startup" indicator.. the ideal approach would be to either watch the
logfile, or connect to its flogport, but both are a hassle. Changing the node
to write out a new "all done" file would be intrusive for regular
operations.
t=info contains randomly-generated ophandles, and t=rename-form contains the
name of the child being renamed, so neither is eligible for a
short-circuiting ETag. Enhanced test_web to exercise this. Had to improve
FakeCHKFileNode slightly to let it participate. Refs #443.
test_web.py: use shouldFail2(), safer than old shouldFail()
directory.py: forbid slashes in from_name=, return BAD_REQUEST instead of
GONE when trying to move into a non-directory
The move webapi function now takes a target_type argument which lets it
know whether the target is a subdirectory name or URI. This is an
improvement over the old system in which the move handler tried to guess
whether the target was a name or a URI. Also fixed a little docs
copypaste problem and tweaked some line wrapping.
This adds "move file" capability to the web UI's directory display. The
support and test framework is heavily based on the similar "rename file"
feature. Unit tests and documentation are included. Multiple in-progress
versions of this patch may be found in ticket 1579. This version
includes arbitrary URI target support and is compatible with the change
from tahoe_css to tahoe.css.
'serverid' is the pubkey (for V2 clients), falling back to the tubid (for V1
clients). This also required cleaning up the way the index is created for the
old V1 introducer.
This significantly cleans up the IntroducerServer web-status renderers.
Instead of poking around in the introducer's internals, now the web-status
renderers get clean AnnouncementDescriptor and SubscriberDescriptor
objects. They are still somewhat foolscap-centric, but will provide a clean
abstraction boundary for future improvements.
The specific #1721 bug was that old (V1) subscribers were handled by
wrapping their RemoteReference in a special WrapV1SubscriberInV2Interface
object, but the web-status display was trying to peek inside the object to
learn what host+port it was associated with, and the wrapper did not proxy
those extra attributes.
A test was added to test_introducer to make sure the introweb page renders
properly and at least contains the nicknames of both the V1 and V2 clients.
This was a premature feature addition to the mock filenode, and gets in the
way of the IServer refactoring I'm trying to do. Best to remove it now and
re-introduce it in a better form later when it's actually needed.
This avoids the name collision between the actual results
objects (defined in allmydata.check_results) and the code that renders
these objects into HTML (defined in allmydata.web.check_results). Only
the web-side objects were renamed.
SystemTest has a couple of different phases, separated by a poller which
waits for everything to be idle (all messages delivered, none in flight). It
does this by watching some internal "_debug_outstanding" counters in the
server and in each client, and waiting for them to hit zero.
Just before the last phase, we replace the server with a new one (to make
sure clients re-send their messages properly). Unfortunately, the polling
function closed over the variable holding the original server, and didn't see
the replacement. It kept polling the old server, and failed to notice the
outstanding messages for the new server. The last phase of the test (check3)
was started too early, which failed (since some messages had not yet been
delivered), and then exploded in a flurry of dirty-reactor errors (because
some messages were delivered after test shutdown).
This replaces the closed-over-variable with a "self.the_introducer", which
seems to fix the race.
One additional place to look at in the future: the client
announcement-receive path (remote_announce) uses an eventually(). If the
message has been received and the eventual-send posted (but not yet executed)
when the poller sees it, the poller might erroneously conclude that the
client is idle and cause the same problem as above. To fix this, the poller
(probably all pollers) could be enhanced to do a flushEventualQueue before
querying the are-we-done-yet predicate function.
This introduces new client and server halves to the Introducer (renaming the
old one with a _V1 suffix). Both have fallbacks to accomodate talking to a
different version: the publishing client switches on whether the server's
.get_version() advertises V2 support, the server switches on which
subscription method was invoked by the subscribing client.
The V2 protocol sends a three-tuple of (serialized announcement dictionary,
signature, pubkey) for each announcement. The V2 server dispatches messages
to subscribers according to the service-name, and throws errors for invalid
signatures, but does not otherwise examine the messages. The V2 receiver's
subscription callback will receive a (serverid, ann_dict) pair. The
'serverid' will be equal to the pubkey if all of the following are true:
the originating client is V2, and was told a privkey to use
the announcement went through a V2 server
the signature is valid
If not, 'serverid' will be equal to the tubid portion of the announced FURL,
as was the case for V1 receivers.
Servers will create a keypair if one does not exist yet, stored in
private/server.privkey .
The signed announcement dictionary puts the server FURL in a key named
"anonymous-storage-FURL", which anticipates upcoming Accounting-related
changes in the server advertisements. It also provides a key named
"permutation-seed-base32" to tell clients what permutation seed to use. This
is computed at startup, using tubid if there are existing shares, otherwise
the pubkey, to retain share-order compatibility for existing servers.
test_verify_mdmf_all_bad_sharedata tests for the regression described
in ticket 1648. In particular, it will trigger the misplaced assertion
in the share activation code. It also tests to make sure that
verification continues with fewer than k shares.