We no longer need the complexity of choosing the application name at
runtime. This removes the setup.py code which populates the _appname.py
file, and the code in __init__.py which reads it. It does not yet remove
the tests which compare the output of e.g. `tahoe --version` against
`allmydata.__appname__`, which I think could be removed, but that's more
invasive than I want to do right now.
closes ticket:2754
This doesn't reveal very much information, but does tell
you if magic-folder is currently working and if not it will
indicate when the last attempt to do a remote scan was.
Improve error-handling for directories if you ask for JSON from
the /uri endpoint, but an error occurs (you get a proper HTTP
status code and a valid JSON object).
For 'tahoe magic-folder status' e now retrieve *all* the remote data
required in the CLI before doing anything else so that errors can be
shown immediately. Use the improved JSON endpoints to print better
errors.
Previously, this file importing "allmydata.immutable" but assuming that
"allmydata.immutable.upload" was available, which only worked if some
other file had imported upload.py . This didn't affect running the
entire test suite (something imported upload.py before anything else
needed it), but caused errors when running specific tests like
test_repairer.py .
I can't currently test this (my OS-X laptop can't run those tests), but
based on how much time test_magic_folder takes on the buildbots, I
expect oneshare=True to help considerably.
This saves more time (as measured on my laptop):
* test_sftp: 17.7s -> 13s
* test_dirnode: 26.5s -> 20s
* test_ftp, test_configutil, test_web show negligible speedups
As before, some tests care about the number of shares, generally ones
which delete or corrupt shares and then expect to see the errors get
noticed or fixed. Those tests continue to use k=3/N=10.
Most of the CLI tests don't care about the actual shares. Configuring
the test client to use k=N=1 reduces the runtime from 180s to 90s on my
laptop.
A few tests *do* care, like test_check (which delete some shares, then
assert that 'tahoe check' shows the damage). These still use k=3/N=10.
Many of the test cases would exercise two copies of each file: one with
k=3/N=10, and a second with k=127/N=255 (255 being the maximum supported
by zfec).
Large number of shares increases the overhead of the testing apparatus,
which is pushing those shares to lots of local servers.
I don't think the "max_shares" case is necessary, and it takes forever.
Because of it, "mutable.Update" was consuming 15% of the total test
runtime, and a third of that was just a single
function (test_replace_locations_max_shares, now deleted). On a
Raspberry Pi 3 (our "slow computer" benchmark), including branch
coverage, this one class took 42 minutes to complete, and requires
disabling a bunch of timeouts to finish at all.
The total number of shares in a file ("N") affects one thing: the
width (and thus height) of the share hash tree. This should be exercised
in test_hashtree.
The number of required shares ("k") affects one thing: the segment size
must be a multiple of k. I don't think we need to exercise this, but if
so, it could be exercised by a few small values for k, rather than 127.
Removing the max_shares cases saves 82% of the mutable.update
runtime (on top of the previous three-segment fix), reducing it from 64s
to 11.3s on my laptop.
* uses @inlineCallbacks to turn the _lazy_tail recursion into
a "real" looking loop;
* remove the need for "immediate" vs delayed iteration of said loop;
* make it easier for the unit-tests to control the behavior of the
uploader/downloader;
* consolidates (some) setup/teardown code into the setUp and tearDown
hooks provided by unittest so unit-tests aren't doing that themselves
* re-factors some of the unit-tests to use an @inlineCallbacks style
so they're easier to follow and debug
This doesn't tackle the "how to know when our inotify events have arrived"
problem the unit-tests still have, nor does it eliminate the myriad bits
of state that get added to tests via all the MixIns.
Adds:
- a JSON endpoint
- CLI to display information
- QueuedItem + IQueuedItem for uploader/downloader
- IProgress interface + PercentProgress implementation
- progress= args to many upload/download APIs
This gives the integration-style CLI-based tests a chance
to set the delay to 0 before the first 3-second delayed
call is queued to _lazy_tail in the Downloader
1. Split alice/bob clocks to avoid races conditions
in the tests
2. Wrap ._notify so we can advance the clock after inotify
calls in the RealTest (since it takes >0ms to do the "real" notifies)
The yaml.SafeLoader.add_constructor() should probably only be done once,
and moving this all into a module gives us an opportunity to test it
directly.
Historical note: V2 introducers have been around for three years
now (released in 1.10.0), so it's time to drop v1. This branch removes a
lot of fallback code, and tests which exercised it. refs ticket:2784
This patch removes some now-unused code: v1-related support functions on
the client, "stub-client" handlers, and v1-tolerant remote methods on
the server. The unit tests have been cleaned up a bit too, now that
there are fewer cases to exercise.
* use yaml.safe_load and yaml.safe_dump
* configure SafeLoader to return unicode consistently, not str
* log+ignore bad cache, instead of throwing error, since we're already
in the log+ignore chain from connect_failed()
* use a local exception type, instead of one from storage_client.py
* delegate delivery to self._deliver_announcements
Using yaml.safe_dump gives us:
- ann:
my-version: tahoe-lafs/1.11.0.post96.dev0
nickname: node-4
instead of:
- ann:
!!python/unicode 'my-version': !!python/unicode 'tahoe-lafs/1.11.0.post96.dev0'
!!python/unicode 'nickname': !!python/unicode 'node-4'
We want SafeLoader to consistently return unicode instead of sometimes
plain strings (for ASCII-safe values) and sometimes unicode
(for everything else). The data we write into the cache was all unicode
to start with (it came from a JSON parser), so it seems better to get
back unicode too.
* Use tempfile for cache to avoid collisions
* Fix pyflakes complaints
* Remove test_client_cache_2, which exercises unsigned announcements.
These are scheduled to be removed soon (see ticket:2784) and don't
need to be tested.
Run with "tox -e coverage". Uses a new helper
module (allmydata.test.run_trial) to let us import+execute trial without
knowing exactly where the "trial" binary lives, which helps with using
"coverage run" under tox.
This makes IServer instances responsible for their own network
connections, which will help when we add HTTP-based servers in the
future. The StorageFarmBroker should not care about how the IServer uses
the network, it just provides the announcement (and local config).
This fixes some of the upcoming-deprecation warnings against Foolscap
(>=0.11.0). There are still a bunch related to the key-generator and the
stats gatherer.
This avoids a privacy leak when the web.static= directory is configured
but doesn't exist (which is almost always, since we set `web.static =
public_html` in the default config file, but nothing automatically
creates it). The nevow.static.File class tries to os.stat() the
directory before doing anything else, which causes an exception, which
renders the traceback to the HTTP client as a 500 Internal Server Error,
and the traceback includes the full path of the missing public_html
directory, which reveals the node's basedir.
Plain twisted.web.static.File doesn't do this check, and a missing
web.static directory just results in a plain old 404.
Closes ticket:1720.
This can be done synchronously because we now know the port number
earlier. This still uses get_local_addresses_sync() (not _async) to do
automatic IP-address detection if the config file didn't set
tub.location or used the special word "AUTO" in it.
The new implementation slightly changes the mapping from tub.location to
the assigned location string. The old code removed all instances of
"AUTO" from the location and then extended the hints with the local
ones (so "hint1:AUTO:hint2" turns into "hint1:hint2:auto1:auto2"). The
new code exactly replaces each "AUTO" with the local hints (so that
example turns into "hint1:auto1:auto2:hint2", and a silly
"hint1:AUTO:AUTO" would turn into "hint1:auto1:auto2:auto1:auto2"). This
is unlikely to affect anybody.
This test was depending upon the storage announcement happening *after*
startup, but the upcoming synchronous-Tub-startup change will modify the
ordering. Fix it in both cases by disabling storage in the client being
tested.
This reverts commit bb7184163e.
We changed test_runner.BinTahoe.run_bintahoe since this commit landed:
the new version can no longer cause the test to be skipped late (we've
gotten rid of the bin/tahoe script entirely, so it's no longer possible
for us to miss it). Hence I think we don't need this unsightly stall any
longer.
I set up a raspberry pi buildslave (which, on the "raspbian jesse"
image, uses a 32-bit python, and perhaps a 32-bit kernel too). It fails
test_util.TimeFormat.test_format_time_y2038 with a ValueError inside the
call to time.gmtime(). The test was looking for the equality check to
fail instead. I think catching ValueError is the more-correct way to
detect a system with a 32-bit time type.
With the new Foolscap-0.11.0 (which changed the way connections are
established), I'm seeing DirtyReactorErrors getting thrown by
allmydata.test.test_system.SystemTest.test_filesystem_with_cli_in_subprocess
, on a host that has three IP addresses (one is 127.0.0.1, two is wifi,
three is a VPN). The test itself is getting skipped because bin/tahoe
isn't in the expected place, but by that point, the nodes have already
been launched and have established connections over one of the three
hints (probably 127.0.0.1). The test terminates so quickly that the
connections to the other two addresses have not finished being
abandoned. The extra stall seems to give Foolscap enough time to reap
the cancelled connections and makes the DRT go away.
I think an offline test, or maybe one with a single external IP address,
wouldn't hit this case.
Arbitrary stalls are never very satisfactory, of course. Usually there
is some threshold delay value, below which it fails reliably, above
which it works on my own machine (for now). This one is weird: the
threshold seems to be below the resolution of the system clock. Stalling
for one nanosecond was enough to fix the problem, but using a simple
fireEventually() didn't work.
This little-used debugging feature allowed you to SSH or Telnet "into" a
Tahoe node, and get an interactive Read-Eval-Print-Loop (REPL) that
executed inside the context of the running process. The SSH
authentication code used a deprecated feature of Twisted, this code had
no unit-test coverage, and I haven't personally used it in at least 6
years (despite writing it in the first place). Time to go.
Also experiment with a Twisted-style "topfiles/" directory of NEWS
fragments. The idea is that we require all user-visible changes to
include a file or two (named as $TICKETNUM.$TYPE), and then run a script
to generate NEWS during the release process, instead of having a human
scan the commit logs and summarize the changes long after they landed.
Closes ticket:2367
Our install_requires= want foolscap>=0.10.1, and this check only fired
if we were given <0.6.4, so the check should be obsolete.
Also, the check was breaking my attempt to test Tahoe against a
development release of Foolscap, as the NormalizedVersion call threw an
IrrationalVersionError at my Versioneer-based "0.10.1+14.g37d8279"
version string.
With our new tox/pip/virtualenv -based environment, we no longer need
the bin/tahoe script, so the tests that examine it needed to change.
In particular, we no longer need to be running tests from the root of a
source tree. Instead, what we care about is that the subprocess 'tahoe'
is importing code from the same place that the unit test .py files live.
NumDict does not make any claims about the order of its repr(), so the
test needs to be prepared for it to be stringified in any order. On unix
the old test happened to pass, but on certain windows boxes (maybe
certain versions of python?), it failed. Fixes ticket:2736.
As discussed at https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1973 and in
previous pull request #129.
- replace lengthy timestamps with human-readable deltas (eg 1h 2m 3s)
- replace "announced" column with "Last RX" column
- remove service column (it always said the same thing, "storage")
- fix colspan on 'You are not presently connected' message
Previous versions, some with github comments: 3fe9053134 , 486dbfc7bd , and c89ea62580, 9fabb92486, bbd8b42a25
Unlike previous attempts, the tests on this one should pass in any timezone.
(But like current master, will fail with Nevow >=0.12...)
Thanks to an anonymous contributor who wrote some of the tests.
As discussed at https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1973 and in
previous pull request #129.
- replace lengthy timestamps with human-readable deltas (eg 1h 2m 3s)
- replace "announced" column with "Last RX" column
- remove service column (it always said the same thing, "storage")
- fix colspan on 'You are not presently connected' message
Previous versions, some with github comments: 3fe9053134 , 486dbfc7bd , and c89ea62580, 9fabb92486, bbd8b42a25
Unlike previous attempts, the tests on this one should pass in any timezone.
(But like current master, will fail with Nevow >=0.12...)
Thanks to an anonymous contributor who wrote some of the tests.
this includes a squash merge of dca1de6856 which
was previously seen in pull request #128, as well as daira's suggested changes
from pull request #204.
A long time ago, the introducer's status web page would show the
advertised IP addresses for all published services, by parsing their
FURL's connection hints. This hasn't worked since about 12-Aug-2014 when
foolscap-0.6.5 changed the internal format of these hints (the column
has been empty this whole time).
This removes the "Advertised IPs" column from the Service Announcements
table. Instead, the service's full connection hints (not just the IP
address) is displayed in a tooltip/popup on the "Announced" timestamp
column.
The code that pulls these connection hints is now tolerant of all three
foolscap styles:
* foolscap<=0.6.4 : tuples of ("ipv4",host,port)
* 0.6.5 .. 0.8.0 : tuples of ("tcp",host,port)
* foolscap>=0.9.0 : strings
fixes ticket:2510
The machine-parseable JSON output for the introducer status web page
used to include a key named "announcement_distinct_hosts", which counted
the number of distinct IP addresses advertised by all connected storage
servers. This hasn't worked since Aug-2014 when foolscap-0.6.5 change
the internal hints format.
This removes that field.
The previous version would incorrectly add to the output of
get_package_versions_string each time it was called.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
test_cli.Help was too sensitive to the way that the --help output was
wrapped, which caused failures on travis when COLUMNS= was set low and
the expected strings were split across separate lines.
Also:
* do some light refactoring of create-client/node
* make it clear that these commands' --basedir options do the same as
the global --node-directory option
* use "global-options" instead of "global-opts"
This avoids an error case where an empty child name resulted in a
duplicate mkdir. It adds a precondition check to guard against empty
child names, and some test cases. It also cleans up a funny redundancy
noticed earlier (refs ticket:2329).
This tests ftpd, but not sftpd. Doing this sort of test on sftpd
requires the creation of a valid pubkey/privkey file pair, which is more
work than I want to do right now.
init_ftp/init_sftp were changed to interpret the configured
accounts.file as relative to the node's basedir, with
abspath_expanduser_unicode(accountfile, base=self.basedir).
This would happen naturally in a real node, since it os.chdir()s
to the basedir before doing anything. But tests don't do that.
Author: Brian Warner <warner@lothar.com>
Author: Daira Hopwood <daira@jacaranda.org>
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
This substantially changes the internals of "tahoe cp", to behave in
accordance with the scheme developed in ticket:2329. test_cli_cp.py got
a large new test to exercise all the various combinations. This also
changes the set of error messages that "tahoe cp" can produce.
This modifies try_copy(), inserts a new implementation of
copy_things_to_directory() (and supporting methods), and fixes a few
bugs elsewhere.
fixes ticket:2329
This replaces the status display which was only distinct by color which is a usability issue for color-blind users. This commit includes test coverage by way of pattern matching on rendered templates. The PNG icons are conversions of original SVG source which I've included and placed in the public domain.
The latest setuptools (version 8) changed the way dependency
specifications ("I can handle libfoo version 2 or 3, but not 4") are
interpreted. The new version follows PEP440, which is simpler but
somewhat less expressive. Tahoe's _auto_deps.py now uses dep-specs which
are correctly parsed by both old and new setuptools.
Fixes ticket:2354.
* Restrict the requirements in _auto_deps.py to work with either the old
or PEP 440 semantics.
* Update check_requirement and tests to take account of changes for PEP
440 compatibility.
* Fix an error message.
* Remove a superfluous TODO.
add get_available_space() to NativeStorageServer
It uses a new 'available-space' key in the server's v1 version dict, or falls
back to 'maximum-immutable-share-size' (which presently always has the same
value but could have a different meaning in the future).
This is a squash merge of 9773555bb87fab71145ad7a0e84785a4e92d11f7
Instead of constructing a sys.argv for 'twistd' that reads the node's
.tac file, we construct arguments that tell twistd to use a special
in-memory-only plugin that creates the desired node instance directly.
We still use the name of the .tac file to decide which kind of instance
to make (Client, IntroducerNode, KeyGenerator, StatsGatherer), but never
actually read the contents of the .tac file. Later improvements could
change this to look inside the tahoe.cfg for a nodetype= directive, etc.
This also makes it easy to have "tahoe start BASEDIR" pass the rest of
its arguments on to twistd, so e.g. "tahoe start BASEDIR --nodaemon
--profile=prof.out" does what you'd expect "twistd --nodaemon
--profile=prof.out" to do. "tahoe run BASEDIR" is thus simply aliased to
"tahoe start BASEDIR --nodaemon". This removes the need to special-case
--profile and --syslog.
I also removed some of the default logging behavior:
before:
'tahoe start' = 'twistd --logfile BASEDIR logs/twistd.log'
'tahoe start --profile' adds '--profile=profiling_results.prof --savestats'
'tahoe run' = 'twistd --nodaemon --logfile BASEDIR/logs/tahoesvc.log'
after:
'tahoe start' = 'twistd --logfile BASEDIR logs/twistd.log'
unless --logfile, --nodaemon, or --syslog are passed
'tahoe start --profile' invalid, use 'tahoe start --profile=OUTPUT'
'tahoe run' = 'twistd --nodaemon'
so log messages go to stdout
This finally enables 'tahoe run' to work with all node types, including
the key-generator and stats-gatherer.
It gets 'tahoe start' one step closer to accepting --reactor= . To
actually accomplish this will require this file, the enclosing
__init_.py files, and everything they import to avoid importing the
reactor. (if anything imports twisted.internet.reactor before
startstop_node.start() gets to run, then --reactor= comes too late).
That will take a lot of work, and requires lazy-loading of many core
libraries (foolscap.logging in particular), and removing a lot of code
from src/allmydata/__init__.py .
Note fix following issues from origial commit:
refactor unittests, fix style, add test
(0) use CommonFixture as mixin to increase DRYness
(1) self.failUnlessIn('size', metadata.keys()) --> self.failUnlessIn('size', metdata)
(2) test_size_is_not_None --> test_size_is_0 AND test_size_is_1000
The stdlib 'subprocess' module in python-2.7.4 through 2.7.7 suffers
from http://bugs.python.org/issue18851 which causes unrelated file
descriptors to be closed when `subprocess.call()` fails the `exec()`,
such as when the executable being invoked does not actually exist. There
appears to be some randomness involved. This was fixed in python-2.7.8.
Tahoe's iputil.py uses subprocess.call on many different "ifconfig"-type
executables, most of which don't exist on any given platform (added in
git commit 8e31d66cd0). This results in a lot of file-descriptor
closing, which (at least during unit tests) tends to clobber important
things like Tub TCP sockets. This seems to be the root cause behind
ticket:2121, in which normal code tries to close already-closed sockets,
crashing the unit tests. Since different platforms have different
ifconfigs, some platforms will experience more failed execs than others,
so this bug could easily behave differently on linux vs freebsd, as well
as working normally on python-2.7.8 or 2.7.4.
This patch inserts a guard to make sure that os.path.isfile() is true
before allowing Popen.call() to try executing the target. This ought to
be enough to avoid the bug. It changes both iputil.py and
allmydata.__init__ (which uses Popen for calling "lsb_release"), which
are all the places where 'subprocess' is used outside of unit tests.
Other potential fixes: use the 'subprocess32' module from PyPI (which is
a bug-free backport of the Python3 stdlib subprocess module, but would
introduce a new dependency), or require python >= 2.7.8 (but this would
rule out development/deployment on the current OS-X 10.9 release, which
ships with 2.7.5, as well as other distributions like Ubuntu 14.04 LTS).
I believe this closes ticket:2121, and given the apparent relationship
between 2121 and 2023, I think it also closes ticket:2023 (although
since 2023 doesn't have copies of the failing log files, it's hard to
tell). I'm hoping that this will tide us over until 1.11 is released, at
which point we can execute on the plan to remove iputil.py entirely by
changing the way that nodes learn their externally-facing IP address.
Some Travis-CI workers report persistently empty disks, causing spurious
test failures. It's not really that important to assert used>0, so this
relaxes the test.
Closes ticket:2290
Closes ticket:2281 (trac).
This removes src/allmydata/test/trial_coverage.py, which was a
in-process way to run trial tests under the "coverage" code-coverage
tool. These days, the preferred way to do this is with "coverage run",
although the actual invocation is a bit messy because of the way
bin/trial uses subprocess.call() to invoke the real entrypoint script
with the right PYTHONPATH (see #1698 for details). Hopefully this will
be improved to use a simpler "coverage run .." command in the future.
This patch also removes twisted/plugins/allmydata_trial.py, which
enabled the "--reporter=bwverbose-coverage" option. Finally it modifies
setup.py to stop looking for that option and adding "trialcoverage" to
the dependencies list, which gets us closer to removing "setup_requires"
entirely.
The new rules for "bin/tahoe ARG1.. SUBCOMMAND ARG2.." arg:
* --node-directory is only accepted in ARG1, not ARG2
* create-*/start/stop/restart accept --basedir in ARG2, or an explicit
basedir argument
* only one of --node-directory/--basedir/explicit-basedir is accepted
* --quiet/--version is only accepted in ARG1, not ARG2
Closes#166
This should now fail quickly (during "tahoe start"). Previously this
would silently treat an unparseable size as "0", and the only way to
discover that it had had a problem would be to look at the foolscap log,
or examine the storage-service web page for the unexpected "Reserved
Size" number.
Previously, Introducers always used a swissnum of "introducer", so
anyone who could learn the (public) tubid of the introducer would be
able to connect to and use it. This changes new Introducers to use the
same randomly-generated swissnum as clients and storage servers do, so
that you absolutely must learn the introducer.furl from someone who
knows it already before you can connect.
This change also moves the location of the file that stores
introducer.furl from BASEDIR/introducer.furl to
BASEDIR/private/introducer.furl, since that's where we keep the private
things. The first time an introducer is started with the new code, it
will move any existing BASEDIR/introducer.furl into the new place.
Note that this will not change the FURL of existing introducers: it will
only affect newly created ones. When you change an introducer's FURL,
you must also update all of the nodes (clients and storage servers)
which connect to it, so upgrading it to an unguessable one isn't
something we should do automatically.
This stores the sequence number in BASEDIR/announcement-seqnum, and
increments it each time any service is published (every service
announcement is regenerated with the new sequence number). As everyone
knows, time is an illusion, and occasionally goes backwards, so a
counter is generally safer (and reveals less information about the
node).
Later, we'll improve the introducer client to tolerate rollbacks (where,
perhaps due to a VM being restarted from an earlier checkpoint, the
stored sequence number reverts to an earlier version).
twisted.web.html.escape was used to produce html-encoded string (to then look
it up in "value" attribute), but behavior of that function has changed between
Twisted 12.2.0 (simple custom implementation) and 12.3.0 (imported from stdlib
cgi module).
This also simplifies how case-insensitivity is handled, and fixes a corner case
where the wrong exception was raised when the size ends in "BB".
fixes#1812
Signed-off-by: David-Sarah Hopwood <davidsarah@mint>
This contains several merged patches. Individual messages follow, latest first:
* Fix a warning from check-miscaptures.
* In retrieve.py, explicitly test whether a key is in self.servermap.proxies
rather than catching KeyError.
* Added a new comment to the MDMF version of the test I removed, explaining
the removal of the SDMF version.
* Removed test_corrupt_all_block_hash_tree_late, since the entire block_hash_tree
is cached in the servermap for an SDMF file.
* Fixed several tests that require files larger than the servermap cache.
* Remove unused test_response_cache_memory_leak().
* Exercise the cache.
* Test infrastructure for counting cache misses on MDMF files.
* Removed the ResponseCache. Instead, the MDMFSlotReadProxy initialized
by ServerMap is kept around so Retrieve can access it. The ReadProxy
has a cache of the first 1000 bytes initially read from each share by
the ServerMap. We're able to satisfy a number of requests out of this
cache, so roundtrips are reduced from 84 to 60 in test_deepcheck_mdmf.
There is still some mystery about under what conditions the cache has
fewer than 1000 bytes. Also this breaks some existing unit tests that
depend on the inner behavior of ResponseCache.
* The servermap.proxies (a cache of SlotReadProxies) is now keyed
by (verinfo,serverid,shnum) rather than just (serverid,shnum)
* Minor cosmetic changes
* Added a test failure if the number of cache misses is too high.
Author: Andrew Miller <amiller@dappervision.com>
Signed-off-by: David-Sarah Hopwood <davidsarah@jacaranda.org>
This prints out which things are different when two sets are expected to be the
same. This was useful to me when debugging the code under test. Hm, this
pattern might be more generally useful...
This probably only works on Linux. It uses sudo to mount and unmount the tmpfs,
which may prompt for a password. refs #20
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
Nevow automatically HTML-escapes strings passed in stan without a raw marker.
Written by MK_FG. fixes#1143
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
The wait_for_connections() method, which is used at the start of
test_system to make sure that all the clients are connected to all the
servers, did not also wait for clients to be connected to their Helpers.
Every once in a while, the helper connection would take a bit longer,
and then
test_system.SystemTest.test_filesystem._test_web._got_welcome_helper
would fail, because we'd check for a helper connection before it was
ready.
The fix is to modify wait_for_connections's polling predicate to look
for helper connections (if configured) as well as the regular
introducer- and server- connections.
Tested by temporarily adding a large (30s) delay to the connectTo() call
in Uploader.startService, simulating a long helper
connection-establishment delay. This makes the test fail consistently.
Then I fixed wait_for_connections(), and the test passed (slowly). Then
I removed the delay.
Closes#1467
This makes it easy to distinguish between old V1-Introducer
nodes (identified by their Foolscap TubID) and new V2 nodes (identified
by their ed25519 pubkey).
This fixes a few places where we used to display a tubid even if we had
a pubkey, making it hard to visually correlate servers in two different
displays. It also cleans up the way we pass serverids to the JS-based
download timeline.
The "introweb" subscribed-clients list still shows tubids.
The _upload_resumable() test interrupts a Helper upload partway
through (by shutting down the Helper), then restarts the Helper and
resumes the upload. The control flow is kind of tricky: to do anything
"partway through" requires adding a hook to the Uploadable. The previous
flow depended upon a (fragile) call to self.stall(), which waits a fixed
number of seconds.
This removes one of those stall() calls (the remainder is in
test/common.py and I'll try removing it in a subsequent revision). It
also removes some now-redundant wait_for_connections() calls, since
bounce_client() doesn't fire its Deferred until the client has finished
coming back up (and uses wait_for_connections() internally to do so).
DeepResultsBase also has a get_corrupt_shares(), and it is populated
from CheckResults.get_corrupt_shares(). It has been updated too, along
with get_remaining_corrupt_shares().
Remove temporary get_new_corrupt_shares() and
get_new_incompatible_shares().
This changes all code which feeds CheckResults(sharemap=) to provide
IServer instances, but CheckResults converts these to old-style
serverids during output, so downstream code doesn't have to change yet.
It adds a temporary get_new_sharemap(), which *does* return IServer
instances, so the immutable repairer can build new CheckResults from an
old one. This will go away when get_sharemap() is updated to return
IServer (and downstream code is updated too).
i.e. change set_data() to accept lots of parameters, instead of taking
a single dictionary with lots of keys. Also Convert all CheckResults
creators to use it.
The goal is to make CheckResults more strongly typed, and remove the
ambiguous ".data" field in favor of a bunch of specific counters and
sharelists, so I can changes .sharemap and .servermap to use IServer
instances instead of string serverids. By cleaning this up first, I hope
to get that task done with less debugging.
The Fake*Node classes in test/common.py were accumulating share data in
a class-level dictionary, which persisted from one test run to the next.
As a result, running test_web.py over and over (with trial's
--until-failure feature) made this dictionary grow without bound,
eventually running out of memory.
This fix moves that dictionary into the FakeClient built fresh for each
test, so it doesn't build up. It does the same thing for "file_types",
which was much smaller but still lived at the class level.
Closes#1729
This stores IDisplayableServer-providing instances (StubServers or
NativeStorageServers) in the .servermap and .sharemap dictionaries. But
get_servermap()/get_sharemap() still return data structures with
serverids, not IServers, by translating their data on the way out. This
lets us put off changing the callers for a little bit longer.
Complete the getter-based transformation, by hiding ".uri" and updating
callers to use get_uri(). Also don't set a dummy self._uri, leave it
undefined until someone calls set_uri().
This hides attributes with e.g. _sharemap, and creates getters like
get_sharemap() to access them, for every field except .uri . This will
make it easier to modify the internal representation of .sharemap
without requiring callers to adjust quite yet.
".uri" has so many users that it seemed better to update it in a
subsequent patch.
Populate most of UploadResults (except .uri, which is learned later when
using a Helper) in the constructor, instead of allowing creators to
write to attributes later. This will help isolate the fields that we
want to change to use IServers.
This splits the pb.Copyable on-wire object (HelperUploadResults) out
from the local results object (UploadResults). To maintain compatibility
with older Helpers, we have to leave pb.Copyable classes alone and
unmodified, but we want to change UploadResults to use IServers instead
of serverids. So by using a different class on the wire, and translating
to/from it on either end, we can accomplish both.
Unlike set.union(), which returns a new set, DictOfSets.union() modified
the DictOfSets in-place. The name collision bit me when I changed some
code from using DictOfSets to a normal set, and expected that
set.union() would modify the set in-place. Since there was only one user
of DictOfSets.union, I figured it was safer to just get rid of it.
If a server did not respond to the pre-repair filecheck, but did respond
to the repair, that server was not correctly added to the
RepairResults.data["servers-responding"] list. (This resulted from a
buggy usage of DictOfSets.union() in filenode.py).
In addition, servers to which filecheck queries were sent, but did not
respond, were incorrectly added to the servers-responding list
anyawys. (This resulted from code in the checker.py not paying attention
to the 'responded' flag).
The first bug was neatly masked by the second: it's pretty rare to have
a server suddenly start responding in the one-second window between a
filecheck and a subsequent repair, and if the server was around for the
filecheck, you'd never notice the problem. I only spotted the smelly
code while I was changing it for IServer cleanup purposes.
I added coverage to test_repairer.py for this. Trying to get that test
to fail before fixing the first bug is what led me to discover the
second bug. I also had to update test_corrupt_file_verno, since it was
incorrectly asserting that 10 servers responded, when in fact one of
them throws an error (but the second bug was causing it to be reported
anyways).
Previously, test_runner sometimes fails because the _node_has_started()
poller fires after the portnum file has been opened, but before it has
actually been filled, allowing the test process to observe an empty file,
which flunks the test.
This adds a new fileutil.write_atomically() function (using the usual
write-to-.tmp-then-rename approach), and uses it for both node.url and
client.port . These files are written a bit before the node is really up and
running, but they're late enough for test_runner's purposes, which is to know
when it's safe to read client.port and use 'tahoe restart' (and therefore
SIGINT) to restart the node.
The current node/client code doesn't offer any better "are you really done
with startup" indicator.. the ideal approach would be to either watch the
logfile, or connect to its flogport, but both are a hassle. Changing the node
to write out a new "all done" file would be intrusive for regular
operations.
t=info contains randomly-generated ophandles, and t=rename-form contains the
name of the child being renamed, so neither is eligible for a
short-circuiting ETag. Enhanced test_web to exercise this. Had to improve
FakeCHKFileNode slightly to let it participate. Refs #443.
test_web.py: use shouldFail2(), safer than old shouldFail()
directory.py: forbid slashes in from_name=, return BAD_REQUEST instead of
GONE when trying to move into a non-directory
The move webapi function now takes a target_type argument which lets it
know whether the target is a subdirectory name or URI. This is an
improvement over the old system in which the move handler tried to guess
whether the target was a name or a URI. Also fixed a little docs
copypaste problem and tweaked some line wrapping.
This adds "move file" capability to the web UI's directory display. The
support and test framework is heavily based on the similar "rename file"
feature. Unit tests and documentation are included. Multiple in-progress
versions of this patch may be found in ticket 1579. This version
includes arbitrary URI target support and is compatible with the change
from tahoe_css to tahoe.css.
'serverid' is the pubkey (for V2 clients), falling back to the tubid (for V1
clients). This also required cleaning up the way the index is created for the
old V1 introducer.
This significantly cleans up the IntroducerServer web-status renderers.
Instead of poking around in the introducer's internals, now the web-status
renderers get clean AnnouncementDescriptor and SubscriberDescriptor
objects. They are still somewhat foolscap-centric, but will provide a clean
abstraction boundary for future improvements.
The specific #1721 bug was that old (V1) subscribers were handled by
wrapping their RemoteReference in a special WrapV1SubscriberInV2Interface
object, but the web-status display was trying to peek inside the object to
learn what host+port it was associated with, and the wrapper did not proxy
those extra attributes.
A test was added to test_introducer to make sure the introweb page renders
properly and at least contains the nicknames of both the V1 and V2 clients.
This was a premature feature addition to the mock filenode, and gets in the
way of the IServer refactoring I'm trying to do. Best to remove it now and
re-introduce it in a better form later when it's actually needed.
This avoids the name collision between the actual results
objects (defined in allmydata.check_results) and the code that renders
these objects into HTML (defined in allmydata.web.check_results). Only
the web-side objects were renamed.