On Linux, write tests are failing because data written to fuse isn't showing
up in tahoe by the time it's checked. It's not clear where this originates,
since the fuse implementation [should be] waiting for completion of tahoe
operations before returning from its calls. This adds an option to control the
duration of a pause between the fuse write and the check of tahoe. The pause
defaults to 2s on Linux, which - somewhat inexplicably - seems to 'fix' the
problem, insofar as it allows tests to complete.
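
A minimal sketch of such a pause, for illustration only. The helper names are hypothetical, and the 0s default on non-Linux platforms is an assumption (the message above only states the Linux default):

```python
import sys
import time

def default_write_pause(platform=sys.platform):
    # 2s on Linux per the description above; 0s elsewhere is an assumption.
    return 2.0 if platform.startswith('linux') else 0.0

def check_after_pause(check, pause=None, sleep=time.sleep):
    # Wait between the fuse write and the check against tahoe, then run the check.
    if pause is None:
        pause = default_write_pause()
    if pause > 0:
        sleep(pause)
    return check()
```
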
this test uploads a test file to tahoe, and then reads the file from fuse,
but reads the blocks of the file in a random order; this is designed to
exercise the asynchronous download feature of blackmatch - where the file
is downloaded from tahoe asynchronously, and rather than blocking open()
for the entirety of the download, instead individual read() calls are
blocked until enough of the file has been downloaded to satisfy them
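
The random-order read could look roughly like the sketch below. The function name and block size handling are illustrative, not taken from runtests:

```python
import os
import random

def read_blocks_in_random_order(path, blocksize, rng=random):
    # Visit every block offset exactly once, in shuffled order.
    size = os.path.getsize(path)
    offsets = list(range(0, size, blocksize))
    rng.shuffle(offsets)
    result = bytearray(size)
    with open(path, 'rb') as f:
        for offset in offsets:
            f.seek(offset)
            block = f.read(blocksize)
            # Reassemble by offset so the result can be compared with the upload.
            result[offset:offset + len(block)] = block
    return bytes(result)
```

A test would then compare the returned bytes against the data originally uploaded to tahoe.
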
the code had a 'fullcleanup' flag internally which controlled whether
working directories were cleaned up. this promotes that to a command
line option (negated) '--no-cleanup' defaulting to False, i.e. do cleanup
this avoids dumping the repr of 1MB of random data to stdout in the event
of a test failure; instead it dumps just the start and end of the errant
strings if their repr exceeds 200 chars
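
As a rough illustration (the helper name and the number of characters kept at each end are assumptions; only the 200 char threshold comes from the message above):

```python
def short_repr(value, limit=200, keep=40):
    # Return the full repr when it is short enough to print comfortably.
    full = repr(value)
    if len(full) <= limit:
        return full
    # Otherwise keep only the start and end, noting how much was elided.
    elided = len(full) - 2 * keep
    return '%s...<%d chars elided>...%s' % (full[:keep], elided, full[-keep:])
```
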
since the current tests assume that the implementation responds to changes made
to tahoe after mount, and impl_b prefetches and caches directory data, impl_b
fails the current 'read' test suite.
rather than reflect that problem in the overall failure of the runtests exit
code, this adds a 'todo' flag to the implementations table, and sets the todo
flag for impl_b. Errors from impl_b will thus still be reported in the output,
but will not cause a failing exit code.
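
One way to picture the effect of the flag is sketched below. The table layout and function name are hypothetical, not the actual runtests structures:

```python
def exit_code(failure_counts, implementations):
    # failure_counts maps implementation name -> number of failed tests.
    blocking_failures = 0
    for name, count in failure_counts.items():
        if implementations[name].get('todo'):
            # Known-broken implementation: failures are reported elsewhere
            # in the output but do not affect the exit code.
            continue
        blocking_failures += count
    return 1 if blocking_failures else 0
```
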
previously the runtests suite removed the webport file created by
tahoe create-client in all but the first node. now that the node config
is in tahoe.cfg by default this file might not exist.
This implements a client/server split for blackmatch, where the client
implements the fuse_main bindings and a simple blocking rpc client mechanism.
The server implements the other half of that rpc mechanism, and contains all
the actual logic for interpreting fuse requests in the context of the on disk
cache and requests to the tahoe node. The server is based on a twisted reactor.
The rpc mechanism implements a simple method dispatch including marshalling,
using json, of basic inert data types, in a flat namespace (no objects).
The client side is written in a blocking idiom, to interface with the threading
model used by the fuse_main bindings, whereas the server side is written for a
twisted reactor-based environment, intended to facilitate implementing more
sophisticated logic in that paradigm. The two communicate over a unix domain
socket, allocated within the nodedir.
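
The general shape of such an rpc can be sketched as follows. This is not the blackmatch code: the newline-delimited framing, the message keys, and the 'echo_length' method are assumptions for illustration, and the server half here uses a plain thread rather than a twisted reactor:

```python
import json
import socket
import threading

def send_message(sock, message):
    # Marshal a basic inert data structure as one line of json.
    sock.sendall(json.dumps(message).encode('utf-8') + b'\n')

def recv_message(sock):
    with sock.makefile('rb') as reader:
        return json.loads(reader.readline().decode('utf-8'))

# Flat namespace of methods: plain functions keyed by name, no objects.
HANDLERS = {'echo_length': lambda data: len(data)}

def handle_one_request(sock):
    # Server half: look up the named method and send back its result.
    request = recv_message(sock)
    handler = HANDLERS[request['method']]
    send_message(sock, {'result': handler(*request['args'])})

def blocking_call(sock, method, *args):
    # Client half: send the request, then block until the reply arrives.
    send_message(sock, {'method': method, 'args': list(args)})
    return recv_message(sock)['result']

def demo():
    # socketpair() stands in for the unix domain socket in the nodedir.
    client, server = socket.socketpair()
    worker = threading.Thread(target=handle_one_request, args=(server,))
    worker.start()
    try:
        return blocking_call(client, 'echo_length', 'hello')
    finally:
        worker.join()
        client.close()
        server.close()
```
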
Command line usage is unchanged; the server is launched automatically by the
client. The server daemonizes itself, so that the original parent process
(e.g. 'runtests') is not left waiting for the server to exit.
The client keeps open a 'keepalive' connection to the server; upon loss thereof
the server will exit. This addresses the fact that the python-fuse bindings
provide no notification of exit of the client process upon unmount.
The client thus provides a relatively thin 'shim' proxying requests from the
fuse_main bindings across the rpc to the server process, which handles the
logic behind each request.
For the time being, a '--no-split' option is provided to suppress the splitting
into client/server, yielding the prior behaviour. Once the server logic gets
more complex and more entrenched in a twisted idiom, this might be removed.
The 'runtests' test harness currently tests both modes, as 'impl_c' and
'impl_c_no_split'
this tests opening a file for update, overwriting a small part of it, and
ensuring that the end result constitutes an overwrite of the original file.
This tests that, e.g., the implementation doesn't open a 'fresh' file but does in
fact initialise the file to be uploaded with the contents of any extant
file before applying updates
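
In outline, the check amounts to something like this sketch (helper names are illustrative; the real test goes through the fuse mount and then reads back from tahoe):

```python
def overwrite_part(path, offset, data):
    # 'r+b' opens the existing file for update without truncating it.
    with open(path, 'r+b') as f:
        f.seek(offset)
        f.write(data)

def expected_after_overwrite(original, offset, data):
    # The original contents with only the overwritten span replaced.
    return original[:offset] + data + original[offset + len(data):]
```
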
changed the --tests option to be --suites, as it takes a prefix, e.g. 'read'
'write' (or 'all', the default) and runs those suites which are applicable to
each implementation being tested.
added a --tests option, which takes a list of tests, e.g. 'read_file_contents'
'write_overlapping_large_writes' and runs all tests specified without regard
to whether the implementation(s) under test are declared to support them.
this is basically to allow a specific test or two to be run, saving time
during development and debugging by not running the entire suite
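
The selection logic described here might be sketched as below. The function and parameter names are hypothetical, and treating an explicit --tests list as taking precedence over --suites is an assumption:

```python
def select_tests(available, impl_suites, requested_suites=('all',), requested_tests=()):
    # --tests: run exactly the named tests, ignoring what the implementation
    # is declared to support.
    if requested_tests:
        return [name for name in available if name in requested_tests]
    # --suites: match on the name prefix ('read', 'write', ...), restricted to
    # the suites applicable to this implementation.
    selected = []
    for name in available:
        suite = name.split('_', 1)[0]
        if suite not in impl_suites:
            continue
        if 'all' in requested_suites or suite in requested_suites:
            selected.append(name)
    return selected
```
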
this writes the test file in a randomised order, with randomly sized writes.
also for each 'slice' of the file written, a randomly chosen overlapping
write is also made to the file. this ensures that the file will be written
in its entirety in a thoroughly random order, with many overlapping writes.
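
A sketch of how such a write plan could be generated and applied (names and the way the overlapping write is chosen are assumptions, not the test's actual code):

```python
import os
import random

def random_write_plan(size, max_write, rng):
    # Partition [0, size) into randomly sized slices, then shuffle them.
    slices = []
    pos = 0
    while pos < size:
        length = min(rng.randint(1, max_write), size - pos)
        slices.append((pos, length))
        pos += length
    rng.shuffle(slices)
    plan = []
    for start, length in slices:
        plan.append((start, length))
        # Pair each slice with a write that starts inside it, so it overlaps.
        overlap_start = rng.randrange(start, start + length)
        overlap_length = min(rng.randint(1, max_write), size - overlap_start)
        plan.append((overlap_start, overlap_length))
    return plan

def apply_write_plan(path, data, plan):
    # Every write carries the correct bytes for its range, so the final
    # file should equal data regardless of ordering or overlap.
    with open(path, 'wb') as f:
        for offset, length in plan:
            f.seek(offset)
            f.write(data[offset:offset + length])
```
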
using both small and large blocksizes for writes, write a 1MB file to fuse
where every write overlaps another.
This serves a useful purpose - in manual testing of blackmatch some time ago
most operations e.g. bulk copies, worked fine, but using rsync caused data
corruption on most files. it turned out to be that rsync writes in 64K blocks,
but rather than making the last block short, the last block instead overlaps
the preceding (already written) block. This revealed a problem where cache
files were being opened 'append' rather than 'write' and hence the overlapping
write to the fuse layer caused the overlapping portion of the file to be
duplicated in cache, leading to oversized and corrupt files being uploaded.
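
The append-versus-write difference can be reproduced in miniature. The sketch below uses a 10 byte file rather than 64K blocks, and reopening the cache file for each write is an assumption made to keep the example short:

```python
def write_through_cache(path, blocks, mode):
    # Start from an empty cache file, then apply each (offset, data) write.
    open(path, 'wb').close()
    for offset, block in blocks:
        with open(path, mode) as cache:
            # In append mode the seek is ignored at write time:
            # data always lands at the end of the file.
            cache.seek(offset)
            cache.write(block)
    with open(path, 'rb') as cache:
        return cache.read()
```

With blocks `[(0, data[:6]), (4, data[4:])]` the second write overlaps the first. Mode 'r+b' yields the original 10 bytes, whereas mode 'ab' yields 12 bytes with the overlapping portion duplicated.
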
unit tests to test writing contiguous blocks linearly through the file,
for a variety of block sizes; 'tiny_file' is an entire file fitting within
a single io block / write operation. 'linear_{small,large}_writes' test
a 1MB file written with each write operation containing significantly less
or more, respectively, data than fuse will pass into the implementation as
a single operation (which on the mac at least is 64KiB)
this performs a very simple write through the fuse layer and confirms that
the file is stored correctly into the tahoe mesh. ('simple' in the sense
that the entire file body fits trivially in a single write() operation,
disk block etc)
similar to the --debug-wait option which causes the test harness to
pause at various stages of the process to facilitate debugging, this
option simplifies that debugging by automatically opening a web browser
to the root dir of that implementation's tests when tests are commenced.
in addition, if --web-open is specified but --debug-wait is not, the
harness will still pause after running tests but before tearing down
the tahoe grid - this allows all tests to run to completion, but
provide a debugging hook to investigate the end state of the grid's
contents thereafter.
For a variety of reasons, high amongst them the fact that many people
interested in fuse support for tahoe seem to have missed its existence,
the existing fuse implementation for tahoe, previously 'mac/tahoefuse.py'
has been renamed and moved.
It was suggested that, even though the mac build depends upon it,
the mac/tahoefuse implementation be moved into contrib/fuse along with
the other fuse implementations. The fact that it's not as extensively
covered by unit tests as mainline tahoe was given as corroboration.
In a bid to try and stem the confusion inherent in having tahoe_fuse,
tfuse and tahoefuse jumbled together (not necessarily helped by
referring to them as impl_a, b and c respectively) I'm hereby renaming
tahoefuse as 'blackmatch' (black match is, per wikipedia, "a type of
crude fuse"; hey, I'm a punny guy). Maybe one day it'll be promoted to
be 'quickmatch' instead...
Anyway, this patch moves mac/tahoefuse.py out to contrib/fuse/impl_c/
as blackmatch.py, and makes appropriate changes to the mac build process
to transclude blackmatch therein. this leaves the extant fuse.py and
fuseparts business in mac/ as-is and doesn't attempt to address such
issues in contrib/fuse/impl_c.
it is left as an exercise to the reader (or the reader of a message
to follow) as to how to deal with the 'fuse' python module on the mac.
as of this time, blackmatch should work on both mac and linux, and
passes the four extant tests in runtests. (fwiw neither impl_a nor
impl_b have I managed to get working on the mac yet)
since blackmatch supports a read-write and caching fuse interface to
tahoe, some write tests obviously need to be added to runtests.
This patch makes a significant number of changes to the fuse 'runtests' script
which stem from my efforts to integrate the third fuse implementation into this
framework. Perhaps not all were necessary to that end, and I beg nejucomo's
forbearance if I got too carried away.
- cleaned up the blank lines; imho blank lines should be empty
- made the unmount command switch based on platform, since macfuse just uses
'umount' not the 'fusermount' command (which doesn't exist)
- made the expected working dir for runtests the contrib/fuse dir, not the
top-level tahoe source tree - see also discussion of --path-to-tahoe below
- significantly reworked the ImplProcManager class. rather than subclassing
for each fuse implementation to be tested, the new version is based on
instantiating objects and providing relevant config info to the constructor.
this was motivated by a desire to eliminate the duplication of similar but
subtly different code between instances, framed by consideration of increasing
the number of platforms and implementations involved. each implementation to
test is thus reduced to the pertinent import and an entry in the
'implementations' table defining how to handle that implementation. this also
provides a way to specify which sets of tests to run for each implementation,
more on that below.
- significantly reworked the command line options parsing, using twisted.python.usage;
what used to be a single optional argument is now represented by the
--test-type option which allows one to choose between running unittests, the
system tests, or both.
the --implementations option allows for a specific (comma-separated) list of
implementations to be tested, or the default 'all'
the --tests option allows for a specific (comma-separated) list of test sets
to be run, or the default 'all'. note that only the intersection of tests
requested on the command line and tests relevant to each implementation will
be run. see below for more on test sets.
the --path-to-tahoe option allows for the path to the 'tahoe' executable to be
specified. it defaults to '../../bin/tahoe' which is the location of the tahoe
script in the source tree relative to the contrib/fuse dir by default.
the --tmp-dir option controls where temporary directories (and hence
mountpoints) are created during the test. this defaults to /tmp - a change
from the previous behaviour of using the system default dir for calls to
tempfile.mkdtemp(), a behaviour which can be obtained by providing an empty
value, e.g. "--tmp-dir="
the --debug-wait flag causes the test runner to pause waiting upon user
input at various stages through the testing, which facilitates debugging e.g.
by allowing the user to open a browser and explore or modify the contents of
the ephemeral grid after it has been instantiated but before tests are run,
or make environmental adjustments before actually triggering fuse mounts etc.
note that the webapi url for the first client node is printed out upon its
startup to facilitate this sort of debugging also.
- the default tmp dir was changed, and made configurable. previously the
default behaviour of tempfile.mkdtemp() was used. it turns out that, at least
on the mac, that led to temporary directories being created in a location
which ultimately led to mountpoint paths longer than could be handled by
macfuse - specifically mounted filesystems could not be unmounted and would
'leak'. by changing the default location to be rooted at /tmp this leads to
mountpoint paths short enough to be supported without problems.
- tests are now grouped into 'sets' by method name prefix. all the existing
tests have been moved into the 'read' set, i.e. with method names starting
'test_read_'. this is intended to accommodate the fact that some implementations
are read-only, and some support write, so the applicability of tests will vary
by implementation. the 'implementations' table, which governs the configuration
of the ImplProcManager responsible for a given implementation, provides a list
of 'test' (i.e. test set names) which are applicable to that implementation.
note no 'write' tests yet exist, this is merely laying the groundwork.
- the 'expected output' of the tahoe command, which is checked for 'surprising'
output by regex match, can be confused by spurious output from libraries.
specifically, testing on the mac produced a warning message about zope interface
resolution across multiple eggs. the 'check_tahoe_output()' function now has
a list of 'ignorable_lines' (each a regex) which will be discarded before the
remainder of the output of the tahoe script is matched against expectation.
- cleaned up a typo, and a few spurious imports caught by pyflakes
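
The 'ignorable_lines' filtering mentioned in the list above can be sketched roughly as follows. This is not the actual check_tahoe_output() code; the function names and the example pattern are hypothetical:

```python
import re

def strip_ignorable_lines(output, ignorable_lines):
    # Discard any output line matching one of the ignorable regexes.
    patterns = [re.compile(pattern) for pattern in ignorable_lines]
    kept = [line for line in output.splitlines()
            if not any(pattern.match(line) for pattern in patterns)]
    return '\n'.join(kept)

def output_matches(output, expected_regex, ignorable_lines=()):
    # Match only what remains after filtering against the expectation.
    remainder = strip_ignorable_lines(output, ignorable_lines)
    return re.match(expected_regex, remainder, re.DOTALL) is not None
```
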
The flow control has been de-obfuscated a bit.
Some output changes.
The test framework has quite a few race conditions, but it does a reasonable job of setting up and cleaning up.
This is a little convoluted because of the "layer" design, but it appears
to function correctly and do properly ordered cleanup.
Before system test setup is complete, tahoe_fuse.py needs to be modified
to allow arbitrary client base directories.