Was not fetching payload of remote bundle, just manifest. The problem was
caused by a change of logic recently to not activate any queued fetch
candidates immediately, but wait until the next fd_poll(), so that parsing a
single packetful of rhizome advertisments would start fetching the most
important one first, instead of the first one parsed.
- close database after every command line operation
- don't cache rhizome enabled configuration
- don't send advertisements unless the database is open and the web server is running
- don't provess advertisements unless the database is open
Replace the main-loop scheduled periodic alarm with an "activate" alarm that is
scheduled whenever a fetch candidate is added to any queue, unless the alarm is
already scheduled.
Replace the "rhizome.fetch_interval_ms" config item with
"rhizome.fetch_delay_ms" [default 50], which is the number of milliseconds
between adding a fetch candidate and firing the "activate" alarm. This allows
time for a few more Rhizome advertisment packets to arrive after the first one,
before deciding which fetches to start first.
Add new `is_scheduled()` alarm primitive.
Overhauled the file fetch queue logic in rhizome_fetch.c.
Now the 'rhizomeprotocol' stress test passes in approximately 5 minutes on my
2009-vintage Dell laptop.
Added a call to rhizome_enqueue_suggestions() in rhizome_fetch_close() so that
a new Rhizome GET request is sent as soon as a fetch slot becomes free, instead
of waiting for the (default 5 second) timer to trigger the next GET.
as recommended a while back by Dan Bernstein as offering the fastest
implementation of the crypto_sign() primitives for ARM.
Indeed this implementation IS faster. See comparison below for a
Rock 500 handset (800MHz(?) ARM6, no NEON):
Original ref/ implementation on an R500 stock rom (non-rooted)::
mean signature generation time = 96.80ms
mean signature verification time = 272.20ms
ref10/ implementations on an R500 stock rom (non-rooted):
mean signature generation time = 4.00ms
mean signature verification time = 13.00ms
Approximately 20x speed up, just like that :)
Introduce __WHENCE__ macro and a block comment in log.h explaining it.
In "primitive" kinds of functions, rename 'whence' arguments to '__whence' and
use WHYF(), WARNF(), DEBUGF() macros instead of calling logMessage() directly.
In the case that the MANIFESTS 'author' column is not NULL, do not perform a
full bundle secret verification in order to clear the '.readonly' flag, just
check whether the author's SID is present in the keyring with a proper-size
rhizome secret.
Add test case for new feature of the "rhizome add" command: if the author SID
is not specified (empty arg) then it searches the keyring for the author.
Removed "authorSid" argument from several functions that also take a struct
rhizome_manifest * arg, since the author, if known, is now supplied in the
struct.
Improve return value handling and refactored some rhizome crypto code.
Replace ".selfsigned" column with ".author" and ".fromhere" columns in
output of "rhizome list" command. (Note that a "sender" column is
already present.)
Add 'author' field to struct rhizome_manifest.
Log all fully rendered SQL statements on DEBUG_RHIZOME.
Update 'rhizomeops' test cases and improve the assert_rhizome_list()
test function to be able to assert authorship of files.
The "rhizome direct push" command (and also sync) was not waiting for the
server's HTTP response, so it was exiting before the server had finished
storing the bundle, which led to a race with the subsequent "assert
bundle_received_by" test. Fixed by adding the missing code to receive the HTTP
response.
Refactored the code used for parsing HTTP responses in rhizome_fetch.c, and
used it in rhizome_direct_http.c.
It turns out that if the DB is locked, sqlite_prepare_v2() call can return
SQLITE_BUSY. The retry logic (implemented for issue #2) only provided for
sqlite_step() to return SQLITE_BUSY. It was a fairly straightforward matter to
extend the retry logic to cover statement preparation in an equally general
fashion.
The problem was observed while diagnosing failures in the rhizomeprotocol
DirectPush test case: the "servald rhizome list" command was failing due to a
locked database. See issue #9.
Must be enabled by using rhizome.api.addfile.*
Certainly polishing to be done, including using filename supplied
during HTTP POST. Now to fix that, and make it all work with
final rhizomeprotocol test case.
rhizomeprotocol test cases 8 and 9 currently fail post-merge. #9
at the end, and log2(filesize) instead of filesize. Equally
importantly BAR construction and parsing now uses #defines for
field sizes and offsets instead of it being hardwired without
meaningful documentation.
WILL BREAK BACKWARD COMPATIBILITY WITH PREVIOUS BUILDS.
YOU MUST DELETE AND REBUILD YOUR RHIZOME DATABASE AS OLD-FORMAT
BIDs WILL BE IN THERE AND GET SENT, AND STRANGE THINGS WILL HAPPEN.
This break with backwards compatibility is only reasonable to
consider because we have not yet had an official build using the
new Rhizome with old BAR format. 0.08 uses old Rhizome. #9
Objective is to avoid having to call system("servald rhizome import ...") to
handle a Rhizome direct POST /rhizome/bundle request. Antiquated code in and
around rhizome_import_bundle() needs much cleaning up, as indicated by some
TODO comments. Invocations must unnecessarily write the manifest into a file,
when they already have it in memory, ready to pass to the function.
All the 'rhizomeops' tests pass, but two 'rhizomeprotocol' tests are broken
by the changes in this commit.
Now returns series of "I have [newer]"'s and "Please send me"'s,
consisting of a 1 byte ID (0x01 or 0x02 respectively), followed
by the 64bit BID prefix from the BAR. As with all of Rhizome
Direct at present, the geo bounding box is ignored for now.
For some reason finds the same manifest several times (size bin
filtering seems to not be working right).
Also sync doesn't realise it has finished, and so doesn't return
when done.
size of associated data in a bundle, so that we can synchronise
small things first. Also preliminary work on making a general
cursor-type wrapper function for get_bars() so that it is easy
for any rhizome direct transport driver to iterate over the
known bundles in a rhizome datastore. #9
Closes#2.
Rewrite all Rhizome db query code using new retry primitives defined in
"rhizome.h": sqlite_step_retry(), sqlite_retry(), sqlite_retry_done(), etc.
Replace all calls to sqlite3_prepare_v2() with sqlite_prepare() which does
proper error logging.
Fix bug: re-invoking sqlite3_blob_close() on SQLITE_BUSY return causes process
to abort. Use an explicit BEGIN...COMMIT around the blob writing code instead.
Tested using repeated invocations of batphone/tests/meshms1.
Delete deprecated Rhizome db code in rhizome_crypto.c that has been replaced
with keyring file.
Much refactoring and removal of cruft.
SQL query errors are now logged with the filename, line number and function
where they were invoked, not of the low-level function that discovered the
error. This makes use of the new __HERE__ notation introduced last commit.
Replaces (const char *file, unsigned int line, const char *function) arguments
to all logging functions, simplifies malloc/free tracking code in
overlay_buffer.c and Rhizome manifest alloc/free tracking in rhizome_bundle.c.
Use __HERE__ macro instead of (__FILE__, __LINE__, __FUNCTION__) everywhere.
Special __NOWHERE__ macro is equivalent to (NULL, 0, NULL).
Declare net.c functions in new "net.h" header, so log.c doesn't have to pull
in the entire "serval.h" just to use write_str().
Facilitates progress on issue #2.
All the queries that used sqlite_exec_void() and sqlite_exec_int64() and
sqlite_exec_strbuf() now do a sleep-retry while the Rhizome db is locked.
There are other queries that still need conversion, and some old infinite
retry logic that needs replacing.
Add sqlite_exec_void_retry() function, use it in
rhizome_update_file_priority(). This should be reviewed to ensure that the
server process never sleeps.
The general problem remains of what the servald process should do if the
database is locked when it tries to update. Simplest solution is to sleep and
retry, but that blocks all other services and would hurt VoMP. A better
solution would be for each Rhizome operation to collect its database updates
into a single transaction and place that in a work queue that gets called using
schedule() (or even watch() if a file-descriptor event can somehow be used when
the database becomes available). Another solution is perhaps to perform all
Rhizome operations in a dedicated process that can block indefinitely on the
database without affecting servald responsiveness.
Do not add 'filehash' var to manifest if filesize=0
Do not accept 'filehash' var when parsing manifest with filesize=0
When responding to a new rhizome advertisement, do not try to HTTP
request a payload if filesize=0, just import the manifest directly
Various operations, eg "rhizome file add", do not report 'filehash'
fields where 'filesize' is zero
Do not delete rows from MANIFESTS table which have empty filehash
Various related bug fixes
Remove need to nul-terminate the received buffers in HTTP fetch reply handling
and HTTP server request parsing.
Remove redundant copying of data.
More rigorous parsing code, probably less vulnerable to overrun exploits.
Better debug logging of requests and responses.
Rhizome manifest parser now parses and validates all known fields, informs
about unsupported fields, and unpacks fields into relevant struct manifest
elements where appropriate. Is also stricter about whitespace.
Rhizome fetch code now logs debug messages if DEBUG_RHIZOME_RX bit is on.
Was not transmitting actual HTTP server port in rhizome announcements, was
always transmitting port 4110.
When trying for a free HTTP server port, sometimes bind() succeeds but listen()
fails with EADDRINUSE, so new logic to deal with that.
Tests assert that stderr contains no ERROR: lines after a successful exit
Rewrote sqlite_exec_int64() to separate error outcomes from legitimate
result values
Changed several WHY() calls to DEBUG()
Improved test framework
Improve regular expressions for common data types in test scripts
Revert column count field delimiter in "rhizome list" from ":" to "\n"
Add a few more test cases
that is called only when needed, and marks a manifest as finalised
if the verifcation fails. reading a manifest now never sets
finalised flag, as either _finalise() or _verify() must be called.
Move rhizome_new_manifest() out of rhizome_read_manifest_file() so that the
out-of-manifest report shows the names of the functions where the manifests
were really allocated.