Also adds a --poll-interval option to both 'magic-folder join'
and 'magic-folder create' so that the integration tests can pass
something "very short".
Previously, "tahoe create-node" without an --introducer= argument would
result in the literal string "None" being written into tahoe.cfg:
[client]
introducer.furl = None
We were using config.get("introducer",""), but that didn't suffice because
the key was actually present: it just had a value of None, which then got
stringified into "None" when writing out tahoe.cfg.
This briefly caused test/cli/test_create to fail, as the startup code tried
to parse "None" as a FURL. This only happened against a development version
of Foolscap which accidentally became sensitive to unparseable FURLs in
started Reconnectors. I fixed that in the final foolscap-0.12.5 release, so
we shouldn't hit this bug, but I wanted to fix it properly in the tahoe-side
source.
This adds tor-related CLI arguments to "create-node" and
"create-introducer", to control exactly how we should be using Tor.
* --tor-launch
* --tor-executable=
* --tor-control-port=
I went with "--tor-launch" instead of "--launch-tor" for consistency. I
don't particularly like the grammatical flow of it, and it doesn't
actually put all the tor-related arguments next to each other in the
--help output (the flags are put in one block, then the parameters in
the next). But it seems slightly more consistent to start all the
tor-related argument names with a "--tor*" prefix.
All server-like nodes (storage servers and introducers both) will need
this for the tor state directory and .onion private key file, and it
needs to exist before the config is written, so tor onion-service
private keys can be placed there.
Also remove a redundant import.
This puts the right inlineCallbacks in place to allow
write_node_config() to return a Deferred. The upcoming Tor support will
need this (since it must wait for an .onion address to be allocated
before it can write tahoe.cfg's tub.port and tub.location lines).
In addition, CLI functions are allowed to use sys.exit() instead of
always needing to return the exit code as an integer.
runner.py now knows about the blocking httplib calls in scripts/cli and
scripts/magic_folder, and uses deferToThread() to invoke them. Those
functions cannot return a Deferred: when rewrite them to use twisted.web
or treq, we'll remove this deferToThread call.
Option parsing was split out to a separate function for testing. We now
use twisted.internet.task.react() to start the reactor, which required
changing the way runner.py is tested.
closes ticket:2826
These are obsolete. Tests are run with 'tox', or by running 'trial
allmydata' from a populated virtualenv. A populated virtualenv is also
the right way to get a repl: just run 'python'.
refs ticket:2735
So "tahoe create-node --hide-ip" causes "reveal-IP-address = false" to
get written into tahoe.cfg . This also changes the default tahoe.cfg to
include "reveal-IP-address = true", for clarity.
refs ticket:1010
We now use::
tub.port = disabled
tub.location = disabled
instead of using an empty value (but the key still being present, since
if the key is missing entirely, that means "be automatic").
closes ticket:2816
Improve error-handling for directories if you ask for JSON from
the /uri endpoint, but an error occurs (you get a proper HTTP
status code and a valid JSON object).
For 'tahoe magic-folder status' e now retrieve *all* the remote data
required in the CLI before doing anything else so that errors can be
shown immediately. Use the improved JSON endpoints to print better
errors.
Adds:
- a JSON endpoint
- CLI to display information
- QueuedItem + IQueuedItem for uploader/downloader
- IProgress interface + PercentProgress implementation
- progress= args to many upload/download APIs
Meejah pointed out that new users might think the encoding parameters
are fixed, something you must pick correctly when you first set up the
node, and then are never allowed to change again, which is kind of
anxiety-inducing. This updates the comment to explain that the encoding
is stored in each filecap, and the tahoe.cfg values are only used for
newly-uploaded files.
This allows a python3-based "tox" (as shipped with modern debian and
ubuntu systems) to run setup.py egg_info, update_version, and sdist
commands. It moves the main "tahoe requires py2" check out of setup.py
and into allmydata.scripts.runner.run, where it gets applied at runtime
rather than build time.
It also changes the execfile(_auto_deps.py) and Versioneer-like "ask git
what our version string should be" code to work under both py2 and py3.
fixes ticket:2747
test_cli.Help was too sensitive to the way that the --help output was
wrapped, which caused failures on travis when COLUMNS= was set low and
the expected strings were split across separate lines.
Also:
* do some light refactoring of create-client/node
* make it clear that these commands' --basedir options do the same as
the global --node-directory option
* use "global-options" instead of "global-opts"
Subcommands "--help" is now rendered as:
```
tahoe [global-options] COMMAND [options] ARGS
(use 'tahoe --help' to view global options)
USAGE (flags/options)
DESCRIPTION
DESCRIPTION_UNWRAPPED
```
The new .description and .description_unwrapped fields allow
commands (subclasses of twisted.python.usage.Usage) better control over
how their explanations are rendered: the old .longdesc field was wrapped
unpleasantly.
This avoids an error case where an empty child name resulted in a
duplicate mkdir. It adds a precondition check to guard against empty
child names, and some test cases. It also cleans up a funny redundancy
noticed earlier (refs ticket:2329).
This substantially changes the internals of "tahoe cp", to behave in
accordance with the scheme developed in ticket:2329. test_cli_cp.py got
a large new test to exercise all the various combinations. This also
changes the set of error messages that "tahoe cp" can produce.
This modifies try_copy(), inserts a new implementation of
copy_things_to_directory() (and supporting methods), and fixes a few
bugs elsewhere.
fixes ticket:2329
This code will be replaced in the next commit with an entirely different
approach, and modifying it in a single commit would yield a completely
unreadable diff.
Instead of constructing a sys.argv for 'twistd' that reads the node's
.tac file, we construct arguments that tell twistd to use a special
in-memory-only plugin that creates the desired node instance directly.
We still use the name of the .tac file to decide which kind of instance
to make (Client, IntroducerNode, KeyGenerator, StatsGatherer), but never
actually read the contents of the .tac file. Later improvements could
change this to look inside the tahoe.cfg for a nodetype= directive, etc.
This also makes it easy to have "tahoe start BASEDIR" pass the rest of
its arguments on to twistd, so e.g. "tahoe start BASEDIR --nodaemon
--profile=prof.out" does what you'd expect "twistd --nodaemon
--profile=prof.out" to do. "tahoe run BASEDIR" is thus simply aliased to
"tahoe start BASEDIR --nodaemon". This removes the need to special-case
--profile and --syslog.
I also removed some of the default logging behavior:
before:
'tahoe start' = 'twistd --logfile BASEDIR logs/twistd.log'
'tahoe start --profile' adds '--profile=profiling_results.prof --savestats'
'tahoe run' = 'twistd --nodaemon --logfile BASEDIR/logs/tahoesvc.log'
after:
'tahoe start' = 'twistd --logfile BASEDIR logs/twistd.log'
unless --logfile, --nodaemon, or --syslog are passed
'tahoe start --profile' invalid, use 'tahoe start --profile=OUTPUT'
'tahoe run' = 'twistd --nodaemon'
so log messages go to stdout
This finally enables 'tahoe run' to work with all node types, including
the key-generator and stats-gatherer.
It gets 'tahoe start' one step closer to accepting --reactor= . To
actually accomplish this will require this file, the enclosing
__init_.py files, and everything they import to avoid importing the
reactor. (if anything imports twisted.internet.reactor before
startstop_node.start() gets to run, then --reactor= comes too late).
That will take a lot of work, and requires lazy-loading of many core
libraries (foolscap.logging in particular), and removing a lot of code
from src/allmydata/__init__.py .
The new rules for "bin/tahoe ARG1.. SUBCOMMAND ARG2.." arg:
* --node-directory is only accepted in ARG1, not ARG2
* create-*/start/stop/restart accept --basedir in ARG2, or an explicit
basedir argument
* only one of --node-directory/--basedir/explicit-basedir is accepted
* --quiet/--version is only accepted in ARG1, not ARG2
Closes#166
* fix CLI commands (put, mkdir) to send format=, not mutable-type=
* fix tests
* test_cli: fix tests that observe t=json output, don't ignore failures in
'tahoe put'
* fix handling of version= to make it easier to use the default
* interpret ?mutable=true&format=MDMF as MDMF, not SDMF
The filecaps used to be produced with hints for 'k' and segsize, but they
weren't actually used, and doing so had the potential to limit how we change
those filecaps in the future. Also the parsing code had some problems dealing
with other numbers of extensions. Removing the existing fields and making the
parser tolerate (and ignore) extra ones makes MDMF more future-proof.
I rerecorded this patch, originally by David-Sarah, to use "darcs replace" instead of editing to do the renames. This uncovered one missed rename in Client.init_drop_uploader. (Which also means that code isn't exercised by the current unit tests.)
refs #1429
I personally used "tahoe start/restart -m ../MY-TESTNET/node*" all the time,
to spin up or update a local testgrid while iterating over new code. However,
with the recent switch from "subprocess.Popen(/bin/twistd)" to "import and
call twistd.run()" in scripts/startstop_node.py (yay fewer processes!),
"start -m" broke, and fixing it requires os.fork, which is unavailable on
windows (boo windows!). And I was probably the only one using -m. So in the
interests of uniformity among platforms and simpler code (yay negative code
days!), we're just removing -m from everything. I will start using a little
shell script or something to simulate the removed functionality.
This patch also cleans up CLI-function calling a bit: get the basedir from
the config dict (instead of sometimes from a separate argument), and always
return a numeric exit code.
Specifically, test_runner.CreateNode.test_client failed, because the
os.fork-is-present test decided that --multiple should not be allowed on
windows, even though --multiple works just fine for 'tahoe create-client'.
The only restriction on --multiple is for 'tahoe start' and 'tahoe restart'.
This needs a different approach, probably by cleaning up BasedirMixin. We
should only be withholding --multiple on windows for "start" and
"restart". (we should continue withholding --multiple on all platforms for
"run").
This reverts (git) commit f3adb037ae:
"startstop_node.py: fix "tahoe start -m" by forking before non-final targets"
* don't advertise -m flag on tahoe start/restart/run unless os.fork is
available (i.e. windows)
* test_runner.py: add test to exercise "start/stop/restart -m"
This removes the need to use a locally-built (dependency) bin/twistd, and
removes a big chunk of behavior differences between unix and windows. It
also happens to resolve the "client node probably started" uncertainty.
Might help with #1190, #602, and #71.
fixes#530. I earlier tried this twice (see #530 for history) and then twice rolled it back due to some problems that arose. However, I didn't write down what the problems were in enough detail on the ticket that I can tell today whether those problems are still issues, so here goes the third attempt. (I did write down on the ticket that it would not create site.py or .pth files in the target directory with --multi-version mode, but I didn't explain why *that* was a problem.)
pyflakes pointed out that the exception handler fallback called an un-imported function, showing that the fallback wasn't being exercised.
I'm not 100% sure that this patch is right and would appreciate François or someone reviewing it.
Tahoe CLI commands working on local files, for instance 'tahoe cp' or 'tahoe
backup', have been improved to correctly handle filenames containing non-ASCII
characters.
In the case where Tahoe encounters a filename which cannot be decoded using the
system encoding, an error will be returned and the operation will fail. Under
Linux, this typically happens when the filesystem contains filenames encoded
with another encoding, for instance latin1, than the system locale, for
instance UTF-8. In such case, you'll need to fix your system with tools such
as 'convmv' before using Tahoe CLI.
All CLI commands have been improved to support non-ASCII parameters such as
filenames and aliases on all supported Operating Systems except Windows as of
now.
This patch modifies the regular expression used for verifying of '--node-url'
parameter. Support for accessing a Tahoe gateway over HTTPS was already
present, thanks to Python's urllib.
This handles the case where we upload a new tahoe directory for a
previously-processed local directory, possibly creating a new dircap (if the
metadata had changed). Now we replace the old dirhash->dircap record. The
previous behavior left the old record in place (with the old dircap and
timestamps), so we'd never stop creating new directories and never converge
on a null backup.
These edits were suggested by my watching over Jake Appelbaum's shoulder as he completely ignored/skipped/missed install.html and also as he decided that debian.txt wouldn't help him with basic installation. Then I threw in a few docs edits that have been sitting around in my sandbox asking to be committed for months.
This patch displays a warning to the user in two cases:
1. When special files like symlinks, fifos, devices, etc. are found in the
local source.
2. If files or directories are not readables by the user running the 'tahoe
backup' command.
In verbose mode, the number of skipped files and directories is printed at the
end of the backup.
Exit status returned by 'tahoe backup':
- 0 everything went fine
- 1 the backup failed
- 2 files were skipped during the backup
web/filenode.py: also serve edge metadata when using t=json on a
DIRCAP/childname object.
tahoe_ls.py: list file objects as if we were listing one-entry directories.
Show edge metadata if we have it, which will be true when doing
'tahoe ls DIRCAP/filename' and false when doing 'tahoe ls
FILECAP'
This forbids operations that would implicitly create a directory with a
zero-length (empty string) name, like what you'd get if you did "tahoe put
local /oops/blah" (#358) or "POST /uri/CAP//?t=mkdir" (#676). The error
message is fairly friendly too.
Also added code to "tahoe put" to catch this error beforehand and suggest the
correct syntax (i.e. without the leading slash).
The webapi has been looking for an Accept header since 1.4.0, but it treats a
missing header as equal to */* (to honor RFC2616). This change finally
modifies our CLI tools to ask for "text/plain, application/octet-stream",
which seems roughly correct (we either want a plain-text traceback or error
message, or an uninterpreted chunk of binary data to save to disk). Some day
we'll figure out how JSON fits into this scheme.
* backups now share dirnodes with any previous backup, in any location,
so renames and moves are handled very efficiently
* "tahoe backup" no longer bothers reading the previous snapshot
* if you switch grids, you should delete ~/.tahoe/private/backupdb.sqlite,
to force new uploads of all files and directories
* emit lease expiry date in ISO-8601'ish format as well as Brian's format
* rename iso_utc_time_to_localseconds() to iso_utc_time_to_seconds()
* add iso_utc_date()
* simplify the body of iso_utc_time_to_seconds()
This walks slowly through all shares, examining their leases, deciding which
are still valid and which have expired. Once enabled, it will then remove the
expired leases, and delete shares which no longer have any valid leases. Note
that there is not yet a tahoe.cfg option to enable lease-deletion: the
current code is read-only. A subsequent patch will add a tahoe.cfg knob to
control this, as well as docs. Some other minor items included in this patch:
tahoe debug dump-share has a new --leases-only flag
storage sharefile/leaseinfo code is cleaned up
storage web status page (/storage) has more info, more tests coverage
space-left measurement on OS-X should be more accurate (it was off by 2048x)
(use stat .f_frsize instead of f_bsize)
It is currently hardcoded in setup.py to be 'allmydata-tahoe'. Ticket #556 is to make it configurable by a runtime command-line argument to setup.py: "--appname=foo", but I suddenly wondered if we really wanted that and at the same time realized that we don't need that for tahoe-1.3.0 release, so this patch just hardcodes it in setup.py.
setup.py inspects a file named 'src/allmydata/_appname.py' and assert that it contains the string "__appname__ = 'allmydata-tahoe'", and creates it if it isn't already present. src/allmydata/__init__.py import _appname and reads __appname__ from it. The rest of the Python code imports allmydata and inspects "allmydata.__appname__", although actually every use it uses "allmydata.__full_version__" instead, where "allmydata.__full_version__" is created in src/allmydata/__init__.py to be:
__full_version__ = __appname + '-' + str(__version__).
All the code that emits an "application version string" when describing what version of a protocol it supports (introducer server, storage server, upload helper), or when describing itself in general (introducer client), usese allmydata.__full_version__.
This fixes ticket #556 at least well enough for tahoe-1.3.0 release.
This was somewhat sad; the assertion didn't say what path caused the
error, what went wrong. So... silently skip over things that are
neither dirs nor files.
Because the unit tests on the VirtualZooko? buildslave failed when it took 31 seconds for a process to go away.
Perhaps getting warning message after only 5 seconds instead of 40 seconds is desirable, and we should change the unit tests and set this back to 5, but I don't know exactly how to change the unit tests. Perhaps match this particular warning message about the shutdown taking a while and allow the code under test to pass if the only stderr that it emits is this warning.
The code for validating the share hash tree and the block hash tree has been rewritten to make sure it handles all cases, to share metadata about the file (such as the share hash tree, block hash trees, and UEB) among different share downloads, and not to require hashes to be stored on the server unnecessarily, such as the roots of the block hash trees (not needed since they are also the leaves of the share hash tree), and the root of the share hash tree (not needed since it is also included in the UEB). It also passes the latest tests including handling corrupted shares well.
ValidatedReadBucketProxy takes a share_hash_tree argument to its constructor, which is a reference to a share hash tree shared by all ValidatedReadBucketProxies for that immutable file download.
ValidatedReadBucketProxy requires the block_size and share_size to be provided in its constructor, and it then uses those to compute the offsets and lengths of blocks when it needs them, instead of reading those values out of the share. The user of ValidatedReadBucketProxy therefore has to have first used a ValidatedExtendedURIProxy to compute those two values from the validated contents of the URI. This is pleasingly simplifies safety analysis: the client knows which span of bytes corresponds to a given block from the validated URI data, rather than from the unvalidated data stored on the storage server. It also simplifies unit testing of verifier/repairer, because now it doesn't care about the contents of the "share size" and "block size" fields in the share. It does not relieve the need for share data v2 layout, because we still need to store and retrieve the offsets of the fields which come after the share data, therefore we still need to use share data v2 with its 8-byte fields if we want to store share data larger than about 2^32.
Specify which subset of the block hashes and share hashes you need while downloading a particular share. In the future this will hopefully be used to fetch only a subset, for network efficiency, but currently all of them are fetched, regardless of which subset you specify.
ReadBucketProxy hides the question of whether it has "started" or not (sent a request to the server to get metadata) from its user.
Download is optimized to do as few roundtrips and as few requests as possible, hopefully speeding up download a bit.
Because the unit tests on the VirtualZooko buildslave failed when it took 16 seconds for a process to go away.
Perhaps getting notification after only 5 seconds instead of 20 seconds is desirable, and we should change the unit tests and set this back to 5, but I don't know exactly how to change the unit tests. Perhaps match this particular warning message about the shutdown taking a while and allow the code under test to pass if the only stderr that it emits is this warning.
We're just going to mark unicode in the cli as unsupported for tahoe-lafs-1.3.0. Unicode filenames on the command-line do actually work for some platforms and probably only if the platform encoding is utf-8, but I'm not sure, and in any case for it to be marked as "supported" it would have to work on all platforms, be thoroughly tested, and also we would have to understand why it worked. :-)
Also encode all args to urllib as utf-8 because urllib doesn't handle unicode objects.
I'm not sure if it is appropriate to *assume* utf-8 encoding of cli args. Perhaps the Right thing to do is to detect the platform encoding. Any ideas?
This patch is mostly due to François Deppierraz.