a typo in the 'flags2mode' code would wind up passing the O_APPEND
flag into the os open() call, which would cause the file to be opened
in 'strict append' mode, i.e. all writes extend the file, regardless of
calls to seek.
this causes a problem for tahoefuse in that the seek() calls made to
filehandles open through fuse would be ignored when write()s occurred.
this was evidenced by corruption seen when using rsync. it turns out
that rsync actually makes overlapping writes in some cases, i.e. even
when writing a new fresh file out, it still doesn't write a simple
contiguous span of data, but will make writes overlapping data already
written. this is probably related to the way it manages data blocks
internally for rolling checksums etc. at any rate, this bug would
thus cause rsync in those cases to write a chunk of duplicate data
into the file - leading to tahoe securely and reliably storing the
wrong data.
fixing this, so that non-append file opens do not pass O_APPEND seems
to eliminate this problem.
previously tahoefuse returned the fs stat for the filesystem the fuse plugin
was running upon (e.g. '/'). this works ok until you need to copy more to
tahoe than the local machine has free disk space, at which point Finder will
refuse to copy 'too much' data.
this changes it so that tahoe always reports 2TiB used of an 8TiB filesystem
this is entirely bogus, but allows copies of up to 2TiB to be initiated.
at one point I'd thrown in a 'str' since fuse api bits required a str instance
but tahoe returns unicode objects from its json parsing. that, naturally
enough should really be a utf8 encoded str of the unicode object...
various updates to improve the functionality of the mac fuse plugin
1. caching
previously, the experimental tahoefuse plugin pre-loaded the whole
structure of the specified mount into memory at launch time. changes
which were made through that fuse plugin would be remembered, but any
changes made through other tahoe clients would not be reflected.
now directory contents are only loaded when needed, and the data is
cached for a limited time. any use of Directory objects should first
call maybe_refresh() which will check the time since the cache was last
loaded, and if the data is older than some validity period (currently
26s) then the directory's contents will be refetched and reloaded.
this replaces the 'load_dir()' method of TFS
whenever a local change is made to a Directory object, or when the
aforementioned cache reloading notices a change in directory data, the
mtime of the directory is automatically updated.
2. stat / metadata
the retrieval of 'stat' information for getattr(), and the way that
metadata is handled, has been refactored to better reflect the fact that
metadata in tahoe is only represented by 'edges' (i.e entries in
directories) not on 'nodes' (files or directories themselves) hence a
stat lookup should be a query to the parent directory (specifically the
parent specified by the path being queried in the case that a node has
multiple parents) for details known by that directory for the given
child, rather than a query to the child itself.
the TStat utility class for returning stat information to the python-
fuse layer has been extended to accept a 'metadata' argument in its
constructor. any fields found in the metadata dict which match the
names of the stat attributes are loaded into the TStat object. the
'ctime' and 'mtime' attributes are translated to st_ctime and st_mtime
to mesh with the existing timestamp handling code. any fields specified
by kwargs to the constructor override things that might be loaded from
the metadata dict.
Directory objects now track their children as a dict mapping name to
(child_obj, metadata) tuples. This is because the metadata in tahoe
will be stored exclusively on the edges of the graph. each Directory
maintains its own mtime however, and get_stat() calls will report the
mtime of a directory based on the last modification of the Directory
object, not based on any mtime records from the parent directory's
metadata for that child. This addresses the fact that since directories
may be shared, a given parent may or may not reflect the latest changes,
however one of the Finder's behaviours is to examine the stat of a
directory, and not to bother doing a readdir() if the stat is unchanged.
i.e. unless directories report their changes in their stat info, the
Finder will not show changes within that directory.
3. refactoring
reporting of many error codes has been refactored to raise IOError
subclasses with the appropriate errno. this exploits python-fuse's
built-in mechanism for catching IOError and reporting the errno
embedded within it automatically, while simplifying the code within
the plugin.
the add_child() method on TFS was removed in favour of simply having an
add_child() method on Directory objects. this provides a more OO
approach in that Directory is responsible for maintaining its own in
memory state and also writing changes back to the node. similarly for
remove_child()
these changes, along with the new tfs.compose_url() method,
significantly simplify and improve readability of mkdir, rename methods
along with the newer link and unlink. these also get improved error
reporting.
various operations (chmod, chown, truncate, utime) are now ignored.
previously they would report an unsupported operation (EOPNOTSUPP)
but now are simply logged and ignored. this surpresses errors caused
by some client programs which try to use these operations, but at the
moment those operations are meaningless to the tahoe filesystem anyway.
4. link / unlink / rmdir
link, symlink calls are now supported, though with semantics differing
from posix, both equivalent. unlink, rmdir calls are now supported,
also equivalent.
link or symlink calls duplicate the uri of the named source and adds it
as a child of another directory according to the destination path. for
directories, this creates a 'hard' link, i.e. the same directory will
appear in multiple locations within the filesystem, and changes in
any place will be reflected everywhere. for files, by contrast, since
the uri being duplicated is an immutable CHK uri, link/symlink for files
is equivalent to a copy - though significantly cheaper. (a file copy
with the fuse plugin is likely to cause a new file to be written and
uploaded, the link command simply adds an entry referring to an
existing uri)
in testing, the 'ln' command is unable to make hard links (i.e. call
link()) for directories, though symlink ('ln -s') is supported.
either forms works equivalently for files.
unlink and rmdir both remove the specified entry from its parent
directory.
5. logging
the 'tfuse.log' file now only reports launches of the fuse plugin. once
the plugin has parsed the options, it reopens the log file with the
name of the mount, e.g. tfuse.root_dir.log, so that multiple instances
running concurrently will not interfere with each others' logging.
6. bug fixes
the tmp_file in the cache dir backing files opened for write was
intermittently failing to open the file. added O_CREAT to the os.open
call so that files will be created if missing, not throw errors.
a failure to correctly parse arguments if no mount (dir_cap) name was
given but also no fuse options were given has been fixed. now the
command 'tahoe fuse mountpoint' will correctly default to root_dir
also when running from source, arguments to tahoefuse were not handled
to correctly match the 'tahoe fuse ...' behaviour.
oops. I screwed up the makefile syntax further. buildslave would spend a
lot of fruitless time trawling the entire drive. this fixes that. and a
stray -n. ahem. [looks down sheepishly]
blah $( foo ) is more explicit than blah ` foo ` in a bash-like context
unfortunately it doesn't translate very well to makefiles, for which $(
means something else entirely
rather than trying to build a single .app with both 10.4 and 10.5 fuse
libraries embedded within it, for the time being, we're just going to
have independant 10.4 and 10.5 builds.
this provides a 10.5 _fusemodule.so, and build changes to copy the
appropriate versions of files for 10.4 or 10.5 from sub dirs of mac/
into the build tree before triggering py2app
the existing environment on otto requires a few build hints in order for
xml parsing to work properly. these hints are unnecessary, and moreover
their import by depends.py is broken, in the 10.5 environment in which
zandr's buildslave is running.
the make mac-upload target now requires an UPLOAD_DEST argument to be given,
which is the rsync destination (including trailing '/') to which the version
stamped directory containing the .dmg should be placed. the account the
build is running as (e.g. 'buildslave') should have ssh access to the account
specified in that dest. one might also consider locking the key down to the
target directory by adding something like
command="rsync --server -vlogDtpr . /home/amduser/public_html/dist/mac-blah/"
to the corresponding authorized_key entry on the target machine.
the mac/macfuse subdirectory needed to be added to the pythonpath in order
to build a binary incorporating the mac fuse system. this change should
make those modules accessible relative to the mac/ directory which is
implicitly included in the .app build process.
this provides a variety of changes to the macfuse 'tahoefuse' implementation.
most notably it extends the 'tahoe' command available through the mac build
to provide a 'fuse' subcommand, which invokes tahoefuse. this addresses
various aspects of main(argv) handling, sys.argv manipulation to provide an
appropriate command line syntax that meshes with the fuse library's built-
in command line parsing.
this provides a "tahoe fuse [dir_cap_name] [fuse_options] mountpoint"
command, where dir_cap_name is an optional name of a .cap file to be found
in ~/.tahoe/private defaulting to the standard root_dir.cap. fuse_options
if given are passed into the fuse system as its normal command line options
and the mountpoint is checked for existence before launching fuse.
the tahoe 'fuse' command is provided as an additional_command to the tahoe
runner in the case that it's launched from the mac .app binary.
this also includes a tweak to the TFS class which incorporates the ctime
and mtime of files into the tahoe fs model, if available.
This is the result of various experimentation done into using python-fuse
to provide access to tahoe on the mac. It's rough in quite a few places,
and is really the result of investigation more than a thorough
implemenation of the fuse api.
upon launch, it looks for the users root_dir by opening ~/.tahoe/node.url
and ~/.tahoe/private/root_dir.cap it then proceeds to cache the directory
structure found by walking the users tahoe drive (safely in the face of
directory loops) into memory and then mounts that filesystem.
when a file is read, it calls the tahoe node to first download the file
into a cache directory (~/.tahoe/_cache) and then serves up the file
from there.
when a file is written, a temporary file is allocated within the tmp dir
of the cache, and upon close() (specifically upon release()) the file is
uploaded to the tahoe node, and the new directory entry written.
note that while the durectory structure is cached into memory only when
the filesystem is mounted, that it is 'write through' i.e. changes made
via fuse are reflected into the underlying tahoe fs, even though changes
made to the tahoe fs otherwise show up only upon restart.
in addition to opening files for read and write, the mkdir() and rename()
calls are supported. most other file system operations are not yet
supported. notably stat() metadata is not currently tracked by tahoe,
and is variably reported by this fs depending on write cache files.
also note that this version does not fully support Finder. access through
normal unix commands such as cat, cp, mv, ls etc works fine, and read
access to file from within finder (including preview images and double-
click to open) work ok. but copies to the tahoe drive from within finder
may or may not succeed, but will always report an error. This is still
under investigation.
also note that this does not include any build integration. the included
_fusemodule.so was built on mac os 10.4 against macfuse 1.3.0, and is
known to not work against 10.5-1.3.1 it's possible it may also contain
dependencies upon parts of macports used to build the python that it was
built against. this will be cleaned up later.
usage:
python tahoefuse.py /Path/to/choice/of/mountpoint
or optionally
python tahoefuse.py -ovolicon=/Path/to/icon.icns /Path/to/mountpoint
upon startup, tahoefuse will walk the tahoe directory, then print a
summary of files and folders found, and then daemonise itself. to exit,
either eject the 'drive' (note: 10.5 doesn't show it as a drive, since
it considers fuse to be a connected server instead) or unmount it via
umount /Path/to/mountpoint etc.
this moves some of the code common to both windows and mac builds into the
allmydata module hierarchy, and cleans up the windows and mac build directories
to import the code from there.
remove debug messages (and traceback) from node output in the case that the
pkg resources hook can't find a requested file. it will now silently return
the empty string for files that can't be resolved
refine the logic in the .app which tries to install the 'tahoe' script.
now it will do nothing if 'tahoe' is found anywhere on the user's path,
and only if it's not present will it try to install it in each of the
candidate paths (/usr/local/bin ~/bin ~/Library/bin) which are on the
user's path
upon startup, the .app will look in '/usr/local/bin', '~/bin', '~/Library/bin'
if it finds one of these dirs, and can write into it, and there isn't already
a 'tahoe' present, it will write a small bach script which will launch the
binary contained within the .app bundle
this allows the .app bundle to offer the services of the 'tahoe' script
easily and simply
This patch adds support for a mac native build.
At the moment it's a fairly simple .app - i.e. so simple as to be unacceptable
for a shipping product, but ok for testing and experiment at this point.
notably once launched, the app's ui does not respond at all, although its dock
icon does allow it to be force-quit.
this produces a single .app bundle, which when run will look for a node basedir
in ~/.tahoe. If one is not found, one will be created in ~/Library/Application
Support/Allmydata Tahoe, and that will be symlinked to ~/.tahoe
if the basedir is lacking basic config (introducer.furl and root_dir.cap) then
the wx config wizard will be launched to log into an account and to set up
those files.
if a webport file is not found, the default value of 8123 will be written into
it.
once the node has started running, a webbrowser will be opened to the webish
interface at the users root_dir
note that, once configured, the node runs as the main thread of the .app,
no daemonisation is done, twistd is not involved.
the binary itself, from within the .app bundle, i.e.
"Allmydata Tahoe.app/Contents/MacOS/Allmydata Tahoe"
can be used from the command line and functions as the 'tahoe' executable
would in a unix environment, with one exception - when launched with no args
it triggers the default behaviour of running a node, and if necessary config
wizard, as if the user had launched the .app
one other gotcha to be aware of is that symlinking to this binary from some
other place in ones $PATH will most likely not work. when I tried this,
something - wx I believe - exploded, since it seems to use argv[0] to figure
out where necessary libraries reside and fails if argv[0] isn't in the .app
bundle. it's pretty easy to set up a script a la
#!/bin/bash
/Blah/blah/blah/Allmydata\ Tahoe.app/Contents/MacOS/Allmydata\ Tahoe "${@}"