It turns out that if the DB is locked, sqlite_prepare_v2() call can return
SQLITE_BUSY. The retry logic (implemented for issue #2) only provided for
sqlite_step() to return SQLITE_BUSY. It was a fairly straightforward matter to
extend the retry logic to cover statement preparation in an equally general
fashion.
The problem was observed while diagnosing failures in the rhizomeprotocol
DirectPush test case: the "servald rhizome list" command was failing due to a
locked database. See issue #9.
associated file before checking if it was already in the database.
Rhizome Direct can supply a manifest without associated file if
the file is already in the database, and so it was breaking.
Also removed "assert bundle_received_by"'s from rhizome direct
pull and sync tests because they are not needed, and were failing
because the same file contents was used for the files being
exchanged, and so file storage was not occurring, and thus the log
message being looked for was not being produced.
Push, pull and sync tests in rhizomeprotocol now pass, leaving
only two tests in error. #9
signatures were not being recorded. Also fixed separate issue
where rhizomeprotocols tests expected selfsigned to be 1 on
receiver end, when it will never be, because the BK doesn't match.
rhizome direct push test in rhizomeprotocols now passes. #9
Now the 'rhizomeprotocol' Push test case now passes. It should be renamed to
DirectPush.
Much refactoring of the Rhizome Direct HTTP request parsing. Now uses
strbuf_sprintf() instead of snprintf() in many places to check for buffer
overrun and ensure terminating nul. Still more of this kind of work is needed.
Improved debug that needs to be made conditional on DEBUG_RHIZOME_RX and
DEBUG_RHIZOME_TX. Some just needs removal.
Closes#2.
Rewrite all Rhizome db query code using new retry primitives defined in
"rhizome.h": sqlite_step_retry(), sqlite_retry(), sqlite_retry_done(), etc.
Replace all calls to sqlite3_prepare_v2() with sqlite_prepare() which does
proper error logging.
Fix bug: re-invoking sqlite3_blob_close() on SQLITE_BUSY return causes process
to abort. Use an explicit BEGIN...COMMIT around the blob writing code instead.
Tested using repeated invocations of batphone/tests/meshms1.
Delete deprecated Rhizome db code in rhizome_crypto.c that has been replaced
with keyring file.
Much refactoring and removal of cruft.
SQL query errors are now logged with the filename, line number and function
where they were invoked, not of the low-level function that discovered the
error. This makes use of the new __HERE__ notation introduced last commit.
All the queries that used sqlite_exec_void() and sqlite_exec_int64() and
sqlite_exec_strbuf() now do a sleep-retry while the Rhizome db is locked.
There are other queries that still need conversion, and some old infinite
retry logic that needs replacing.
Add sqlite_exec_void_retry() function, use it in
rhizome_update_file_priority(). This should be reviewed to ensure that the
server process never sleeps.
The general problem remains of what the servald process should do if the
database is locked when it tries to update. Simplest solution is to sleep and
retry, but that blocks all other services and would hurt VoMP. A better
solution would be for each Rhizome operation to collect its database updates
into a single transaction and place that in a work queue that gets called using
schedule() (or even watch() if a file-descriptor event can somehow be used when
the database becomes available). Another solution is perhaps to perform all
Rhizome operations in a dedicated process that can block indefinitely on the
database without affecting servald responsiveness.
Do not add 'filehash' var to manifest if filesize=0
Do not accept 'filehash' var when parsing manifest with filesize=0
When responding to a new rhizome advertisement, do not try to HTTP
request a payload if filesize=0, just import the manifest directly
Various operations, eg "rhizome file add", do not report 'filehash'
fields where 'filesize' is zero
Do not delete rows from MANIFESTS table which have empty filehash
Various related bug fixes
Tests assert that stderr contains no ERROR: lines after a successful exit
Rewrote sqlite_exec_int64() to separate error outcomes from legitimate
result values
Changed several WHY() calls to DEBUG()
Improved test framework
Improve regular expressions for common data types in test scripts
Revert column count field delimiter in "rhizome list" from ":" to "\n"
Add a few more test cases
that is called only when needed, and marks a manifest as finalised
if the verifcation fails. reading a manifest now never sets
finalised flag, as either _finalise() or _verify() must be called.