dirnode: add 'tahoe'/'linkcrtime' and 'tahoe'/'linkmotime' to take the place of what 'mtime'/'ctime' originally did, and make the 'tahoe' subdict be unwritable through the set_children API

Also add extensive documentation in docs/frontends/webapi.txt about the behaviors of these values.  See ticket #628.
Zooko O'Whielacronx 2009-04-11 15:52:05 -07:00
parent 664b69dd8d
commit 9729753692
8 changed files with 245 additions and 86 deletions

View File

@ -401,45 +401,55 @@ GET /uri/$DIRCAP/[SUBDIRS../]FILENAME?t=json
GET /uri/$DIRCAP/[SUBDIRS../]FILENAME?t=json :
[ "filenode", { "ro_uri": file_uri,
"verify_uri": verify_uri,
"size": bytes,
"mutable": false,
"metadata": {"ctime": 1202777696.7564139,
"mtime": 1202777696.7564139
}
} ]
[ "filenode", {
"ro_uri": file_uri,
"verify_uri": verify_uri,
"size": bytes,
"mutable": false,
"metadata": {
"ctime": 1202777696.7564139,
"mtime": 1202777696.7564139,
"tahoe": {
"linkcrtime": 1202777696.7564139,
"linkmotime": 1202777696.7564139
} } } ]
If it is a directory, then it includes information about the children of
this directory, as a mapping from child name to a set of data about the
child (the same data that would appear in a corresponding GET?t=json of the
child itself). The child entries also include metadata about each child,
including creation- and modification- timestamps. The output looks like
including link-creation- and link-change- timestamps. The output looks like
this:
GET /uri/$DIRCAP?t=json :
GET /uri/$DIRCAP/[SUBDIRS../]SUBDIR?t=json :
[ "dirnode", { "rw_uri": read_write_uri,
"ro_uri": read_only_uri,
"verify_uri": verify_uri,
"mutable": true,
"children": {
"foo.txt": [ "filenode", { "ro_uri": uri,
"size": bytes,
"metadata": {
"ctime": 1202777696.7564139,
"mtime": 1202777696.7564139
}
} ],
"subdir": [ "dirnode", { "rw_uri": rwuri,
"ro_uri": rouri,
"metadata": {
"ctime": 1202778102.7589991,
"mtime": 1202778111.2160511,
}
} ]
} } ]
[ "dirnode", {
"rw_uri": read_write_uri,
"ro_uri": read_only_uri,
"verify_uri": verify_uri,
"mutable": true,
"children": {
"foo.txt": [ "filenode", {
"ro_uri": uri,
"size": bytes,
"metadata": {
"ctime": 1202777696.7564139,
"mtime": 1202777696.7564139,
"tahoe": {
"linkcrtime": 1202777696.7564139,
"linkmotime": 1202777696.7564139
} } } ],
"subdir": [ "dirnode", {
"rw_uri": rwuri,
"ro_uri": rouri,
"metadata": {
"ctime": 1202778102.7589991,
"mtime": 1202778111.2160511,
"tahoe": {
"linkcrtime": 1202777696.7564139,
"linkmotime": 1202777696.7564139
} } } ] } } ]
In the above example, note how 'children' is a dictionary in which the keys
are child names and the values depend upon whether the child is a file or a
@ -453,6 +463,91 @@ GET /uri/$DIRCAP/[SUBDIRS../]FILENAME?t=json
field will be present if and only if the object has a verify-cap
(non-distributed LIT files do not have verify-caps).
==== About the metadata ====
The value of the 'mtime' key and of the 'tahoe':'linkmotime' is updated
whenever a link to a child is set. The value of the 'ctime' key and of the
'tahoe':'linkcrtime' key is updated whenever a link to a child is created --
i.e. when there was not previously a link under that name.
In Tahoe earlier than v1.4.0, only the 'mtime'/'ctime' keys were populated.
Starting in Tahoe v1.4.0, the 'linkmotime'/'linkcrtime' keys in the 'tahoe'
sub-dict are also populated.
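As a minimal sketch of how a client might read these values, the helper below prefers the 'tahoe' sub-dict and falls back to the older top-level keys for links written by Tahoe earlier than v1.4.0. The function name link_times is hypothetical and is not part of Tahoe; the input is assumed to be the "metadata" dict of one child as returned by GET ?t=json.

```python
def link_times(metadata):
    """Return (link_creation_time, link_modification_time) for one child.

    Hypothetical helper, not part of Tahoe. Prefers the 'tahoe' sub-dict
    (populated since Tahoe v1.4.0) and falls back to the top-level
    'ctime'/'mtime' keys. Either returned value may be None.
    """
    tahoe = metadata.get("tahoe", {})
    crtime = tahoe.get("linkcrtime")
    if crtime is None:
        crtime = metadata.get("ctime")
    motime = tahoe.get("linkmotime")
    if motime is None:
        motime = metadata.get("mtime")
    return crtime, motime
```

Note that the fallback values carry the caveats about 'ctime'/'mtime' described in this section, since "tahoe backup" may have overwritten them.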
The reason we added the new values in Tahoe v1.4.0 is that there is an
undocumented API (search the source code for 'set_children') which you can
use to overwrite the values of the 'mtime'/'ctime' pair, and this
set_children API is used by the "tahoe backup" command (both in Tahoe v1.3.0
and in Tahoe v1.4.0) to set the 'mtime' and 'ctime' values when backing up
files from a local filesystem into the Tahoe filesystem. As of Tahoe v1.4.0,
the set_children API cannot be used to set anything under the 'tahoe' key of
the metadata dict -- if you include 'tahoe' keys in your 'metadata' arguments
then it will silently ignore those keys.
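The rule above can be sketched as follows. This is a simplified illustration of the behavior described, not the dirnode code itself: the function name merge_supplied_metadata is hypothetical, and the sketch leaves out the timestamp updates that happen when a link is set.

```python
def merge_supplied_metadata(existing, supplied):
    """Replace a link's metadata with caller-supplied metadata.

    Hypothetical sketch: everything is overwritten by 'supplied' except
    the 'tahoe' sub-dict, which the caller cannot set. Any 'tahoe' key in
    'supplied' is silently dropped, and the 'tahoe' sub-dict already
    stored on the link (if any) is carried over unchanged.
    """
    merged = dict(supplied)
    merged.pop("tahoe", None)
    if "tahoe" in existing:
        merged["tahoe"] = existing["tahoe"]
    return merged


existing = {"ctime": 1.0, "mtime": 1.0,
            "tahoe": {"linkcrtime": 1.0, "linkmotime": 1.0}}
supplied = {"key": "value", "tahoe": {"linkcrtime": "bogus"}}
result = merge_supplied_metadata(existing, supplied)
```

Here result keeps the stored link timestamps and gains "key", while the caller's 'ctime'/'mtime' (absent in this call) are simply not present afterwards.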
Therefore, if the 'tahoe' sub-dict is present, you can rely on the
'linkcrtime' and 'linkmotime' values therein to have the semantics described
above. (This is assuming that only official Tahoe clients have been used to
write those links, and that their system clocks were set to what you expected
-- there is nothing preventing someone from editing their Tahoe client or
writing their own Tahoe client which would overwrite those values however
they like, and there is nothing to constrain their system clock from taking
any value.)
The meanings of the 'ctime'/'mtime' fields are slightly more complex.
The meaning of the 'mtime' field is: whenever the edge is updated (by an HTTP
PUT or POST, as is done by the "tahoe cp" command), then the mtime is set to
the current time on the clock of the updating client. Whenever the edge is
updated by "tahoe backup" then the mtime is instead set to the value which
the updating client read from its local filesystem for the "mtime" of the
local file in question, which means the last time the contents of that file
were changed. Note, however, that if the edge in the Tahoe filesystem points
to a mutable file and the contents of that mutable file are changed then the
"mtime" value on that edge will *not* be updated, since the edge itself
wasn't updated -- only the mutable file was.
The meaning of the 'ctime' field is even more complex. Whenever a new edge is
created (by an HTTP PUT or POST, as is done by "tahoe cp") then the ctime is
set to the current time on the clock of the updating client. Whenever the
edge is created *or updated* by "tahoe backup" then the ctime is instead set
to the value which the updating client read from its local filesystem. On
Windows, it reads the timestamp of when the local file was created and puts
that into the "ctime", and on other platforms it reads the timestamp of the
most recent time that either the contents or the metadata of the local file
was changed and puts that into the ctime. Again, if the edge points to a
mutable file and the content of that mutable file is changed then the ctime
will not be updated in any case.
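The platform difference comes from the local filesystem's own timestamps. As a rough illustration, assuming the backup client obtains them with Python's os.stat (an assumption; this text does not say which call "tahoe backup" uses), the hypothetical helper below shows the two values that would be copied onto the link:

```python
import os


def local_backup_times(path):
    """Return the local timestamps a backup-style client might record.

    Hypothetical helper, not part of Tahoe. os.stat() reports st_mtime as
    the last content change. st_ctime is the file creation time on
    Windows, and the most recent contents-or-metadata change on Unix.
    """
    st = os.stat(path)
    return {"mtime": st.st_mtime, "ctime": st.st_ctime}
```

The same call therefore yields a "ctime" with a different meaning depending on where the backup was run.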
Therefore there are several ways that the 'ctime' field could be confusing:
1. You might be confused about whether it reflects the time of the creation
of a link in the Tahoe filesystem or a timestamp copied in from a local
filesystem.
2. You might be confused about whether it is a copy of the file creation time
(if "tahoe backup" was run on a Windows system) or of the last
contents-or-metadata change (if "tahoe backup" was run on a different
operating system).
3. You might be confused by the fact that changing the contents of a mutable
file in Tahoe doesn't have any effect on any links pointing at that file in any
directories, although "tahoe backup" sets the link 'ctime'/'mtime' to reflect
timestamps about the local file corresponding to the Tahoe file to which the
link points.
4. Also, quite apart from Tahoe, you might be confused about the meaning of
the 'ctime' in unix local filesystems, which people sometimes think means
file creation time, but which actually means, in unix local filesystems, the
most recent time that the file contents or the file metadata (such as owner,
permission bits, extended attributes, etc.) has changed. Note that although
'ctime' does not mean file creation time in Unix, it does mean link creation
time in Tahoe, unless the "tahoe backup" command has been used on that link,
in which case it means something about the local filesystem file which
corresponds to the Tahoe file which is pointed at by the link. It means
either file creation time of the local file (if "tahoe backup" was run on
Windows) or file-contents-or-metadata-update-time of the local file (if
"tahoe backup" was run on a different operating system).
=== Attaching an existing File or Directory by its read- or write- cap ===

View File

@ -176,30 +176,29 @@ various pieces of a dirnode?
netstring(cap) = 4+len(cap)
encrypted(cap) = 16+cap+32
JSON({}) = 2
JSON({ctime=float,mtime=float}): 57
netstring(metadata) = 4+57 = 61
JSON({ctime=float,mtime=float,'tahoe':{linkcrtime=float,linkmotime=float}}): 137
netstring(metadata) = 4+137 = 141
so a CHK entry is:
5+ 4+len(name) + 4+97 + 5+16+97+32 + 4+57
And a 15-byte filename gives a 336-byte entry. When the entry points at a
5+ 4+len(name) + 4+97 + 5+16+97+32 + 4+137
And a 15-byte filename gives a 416-byte entry. When the entry points at a
subdirectory instead of a file, the entry is a little bit smaller. So an
empty directory uses 0 bytes, a directory with one child uses about 336
bytes, a directory with two children uses about 672, etc.
empty directory uses 0 bytes, a directory with one child uses about 416
bytes, a directory with two children uses about 832, etc.
When the dirnode data is encoded using our default 3-of-10, that means we
get 112ish bytes of data in each share per child.
get 139ish bytes of data in each share per child.
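The entry-size arithmetic above can be checked with a short sketch. The function names are hypothetical; the terms are copied from the formula in the text, with the 97-byte cap length and the 57-byte and 137-byte metadata lengths taken as given rather than recomputed.

```python
def chk_entry_size(name_len, metadata_len=137, cap_len=97):
    """Size in bytes of one CHK child entry, term by term as in the text:
    5+ 4+len(name) + 4+97 + 5+16+97+32 + 4+137
    """
    return (5
            + 4 + name_len            # 4 + len(name)
            + 4 + cap_len             # netstring(cap) = 4 + len(cap)
            + 5 + 16 + cap_len + 32   # 5 + encrypted(cap) = 16 + cap + 32
            + 4 + metadata_len)       # netstring(metadata)


def bytes_per_share(entry_size, k=3):
    """Approximate bytes per share per child for k-of-N encoding."""
    return int(round(entry_size / float(k)))


new_entry = chk_entry_size(15)                   # with the 'tahoe' sub-dict
old_entry = chk_entry_size(15, metadata_len=57)  # ctime/mtime only
```

With a 15-byte filename this gives 416 bytes for the new metadata layout and 336 bytes for the old one, and about 139 versus 112 bytes per share under 3-of-10 encoding.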
The pubkey, signature, and hashes form the first 935ish bytes of the
container, then comes our data, then about 1216 bytes of encprivkey. So if we
read the first:
1kB: we get 65bytes of dirnode data : only empty directories
1kiB: 89bytes of dirnode data : maybe one short-named subdir
2kB: 1065bytes: about 9 entries
3kB: 2065bytes: about 18 entries, or 7.5 entries plus the encprivkey
4kB: 3065bytes: about 27 entries, or about 16.5 plus the encprivkey
2kB: 1065bytes: about 8 entries
3kB: 2065bytes: about 15 entries, or 6 entries plus the encprivkey
4kB: 3065bytes: about 22 entries, or about 13 plus the encprivkey
So we've written the code to do an initial read of 2kB from each share when
So we've written the code to do an initial read of 4kB from each share when
we read the mutable file, which should give good performance (one RTT) for
small directories.

View File

@ -83,15 +83,41 @@ class Adder:
metadata = children[name][1].copy()
else:
metadata = {"ctime": now,
"mtime": now}
if new_metadata is None:
# update timestamps
"mtime": now,
"tahoe": {
"linkcrtime": now,
"linkmotime": now,
}
}
if new_metadata is not None:
# Overwrite all metadata.
newmd = new_metadata.copy()
# Except 'tahoe'.
if newmd.has_key('tahoe'):
del newmd['tahoe']
if metadata.has_key('tahoe'):
newmd['tahoe'] = metadata['tahoe']
metadata = newmd
else:
# For backwards compatibility with Tahoe < 1.4.0:
if "ctime" not in metadata:
metadata["ctime"] = now
metadata["mtime"] = now
else:
# just replace it
metadata = new_metadata.copy()
# update timestamps
sysmd = metadata.get('tahoe', {})
if not 'linkcrtime' in sysmd:
if "ctime" in metadata:
# In Tahoe < 1.4.0 we used the word "ctime" to mean what Tahoe >= 1.4.0
# calls "linkcrtime".
sysmd["linkcrtime"] = metadata["ctime"]
else:
sysmd["linkcrtime"] = now
sysmd["linkmotime"] = now
children[name] = (child, metadata)
new_contents = self.node._pack_contents(children)
return new_contents

View File

@ -374,7 +374,7 @@ class ServermapUpdater:
# fixed-size slots so we can retrieve less data. For now, we'll just
# read 2000 bytes, which also happens to read enough actual data to
# pre-fetch a 9-entry dirnode.
self._read_size = 2000
self._read_size = 4000
if mode == MODE_CHECK:
# we use unpack_prefix_and_signature, so we need 1k
self._read_size = 1000

View File

@ -65,8 +65,17 @@ def list(options):
name = unicode(name)
child = children[name]
childtype = child[0]
ctime = child[1]["metadata"].get("ctime")
mtime = child[1]["metadata"].get("mtime")
# See webapi.txt for a discussion of the meanings of unix local
# filesystem mtime and ctime, Tahoe mtime and ctime, and Tahoe
# linkmotime and linkcrtime.
ctime = child[1].get("metadata", {}).get('tahoe', {}).get("linkcrtime")
if not ctime:
ctime = child[1]["metadata"].get("ctime")
mtime = child[1].get("metadata", {}).get('tahoe', {}).get("linkmotime")
if not mtime:
mtime = child[1]["metadata"].get("mtime")
rw_uri = child[1].get("rw_uri")
ro_uri = child[1].get("ro_uri")
if ctime:

View File

@ -415,8 +415,8 @@ class Dirnode(unittest.TestCase,
d.addCallback(lambda res: n.get_metadata_for(u"child"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res:
self.shouldFail(NoSuchChildError, "gcamap-no",
@ -438,8 +438,8 @@ class Dirnode(unittest.TestCase,
child, metadata = res
self.failUnlessEqual(child.get_uri(),
fake_file_uri.to_string())
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"])
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"]))
d.addCallback(_check_child_and_metadata2)
d.addCallback(lambda res:
@ -447,22 +447,31 @@ class Dirnode(unittest.TestCase,
def _check_child_and_metadata3(res):
child, metadata = res
self.failUnless(isinstance(child, FakeDirectoryNode))
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"])
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"]))
d.addCallback(_check_child_and_metadata3)
# set_uri + metadata
# it should be possible to add a child without any metadata
d.addCallback(lambda res: n.set_uri(u"c2", fake_file_uri.to_string(), {}))
d.addCallback(lambda res: n.get_metadata_for(u"c2"))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata, {}))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata.keys(), ['tahoe']))
# You can't override the link timestamps.
d.addCallback(lambda res: n.set_uri(u"c2", fake_file_uri.to_string(), { 'tahoe': {'linkcrtime': "bogus"}}))
d.addCallback(lambda res: n.get_metadata_for(u"c2"))
def _has_good_linkcrtime(metadata):
self.failUnless(metadata.has_key('tahoe'))
self.failUnless(metadata['tahoe'].has_key('linkcrtime'))
self.failIfEqual(metadata['tahoe']['linkcrtime'], 'bogus')
d.addCallback(_has_good_linkcrtime)
# if we don't set any defaults, the child should get timestamps
d.addCallback(lambda res: n.set_uri(u"c3", fake_file_uri.to_string()))
d.addCallback(lambda res: n.get_metadata_for(u"c3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
# or we can add specific metadata at set_uri() time, which
# overrides the timestamps
@ -470,7 +479,8 @@ class Dirnode(unittest.TestCase,
{"key": "value"}))
d.addCallback(lambda res: n.get_metadata_for(u"c4"))
d.addCallback(lambda metadata:
self.failUnlessEqual(metadata, {"key": "value"}))
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"c2"))
d.addCallback(lambda res: n.delete(u"c3"))
@ -486,14 +496,14 @@ class Dirnode(unittest.TestCase,
n.set_node, u"d2", n2,
overwrite=False))
d.addCallback(lambda res: n.get_metadata_for(u"d2"))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata, {}))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata.keys(), ['tahoe']))
# if we don't set any defaults, the child should get timestamps
d.addCallback(lambda res: n.set_node(u"d3", n))
d.addCallback(lambda res: n.get_metadata_for(u"d3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
# or we can add specific metadata at set_node() time, which
# overrides the timestamps
@ -501,7 +511,8 @@ class Dirnode(unittest.TestCase,
{"key": "value"}))
d.addCallback(lambda res: n.get_metadata_for(u"d4"))
d.addCallback(lambda metadata:
self.failUnlessEqual(metadata, {"key": "value"}))
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"d2"))
d.addCallback(lambda res: n.delete(u"d3"))
@ -525,13 +536,15 @@ class Dirnode(unittest.TestCase,
d.addCallback(lambda children: self.failIf(u"new" in children))
d.addCallback(lambda res: n.get_metadata_for(u"e1"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res: n.get_metadata_for(u"e2"))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata, {}))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(['tahoe'])))
d.addCallback(lambda res: n.get_metadata_for(u"e3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(metadata, {"key": "value"}))
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"]))
and (metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"e1"))
d.addCallback(lambda res: n.delete(u"e2"))
@ -555,13 +568,15 @@ class Dirnode(unittest.TestCase,
d.addCallback(lambda children: self.failIf(u"new" in children))
d.addCallback(lambda res: n.get_metadata_for(u"f1"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res: n.get_metadata_for(u"f2"))
d.addCallback(lambda metadata: self.failUnlessEqual(metadata, {}))
d.addCallback(
lambda metadata: self.failUnlessEqual(set(metadata.keys()), set(['tahoe'])))
d.addCallback(lambda res: n.get_metadata_for(u"f3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(metadata, {"key": "value"}))
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"f1"))
d.addCallback(lambda res: n.delete(u"f2"))
@ -627,8 +642,8 @@ class Dirnode(unittest.TestCase,
d.addCallback(lambda res: n.set_node(u"no_timestamps", n))
d.addCallback(lambda res: n.get_metadata_for(u"no_timestamps"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res: n.delete(u"no_timestamps"))
d.addCallback(lambda res: n.delete(u"subdir"))
@ -658,8 +673,8 @@ class Dirnode(unittest.TestCase,
sorted([u"child", u"newfile"])))
d.addCallback(lambda res: n.get_metadata_for(u"newfile"))
d.addCallback(lambda metadata:
self.failUnlessEqual(sorted(metadata.keys()),
["ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res: n.add_file(u"newfile-metadata",
uploadable,
@ -668,7 +683,8 @@ class Dirnode(unittest.TestCase,
self.failUnless(IFileNode.providedBy(newnode)))
d.addCallback(lambda res: n.get_metadata_for(u"newfile-metadata"))
d.addCallback(lambda metadata:
self.failUnlessEqual(metadata, {"key": "value"}))
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"newfile-metadata"))
d.addCallback(lambda res: n.create_empty_directory(u"subdir2"))

View File

@ -19,6 +19,11 @@ def iso_utc(now=None, sep='_', t=time.time):
now = t()
return datetime.datetime.utcfromtimestamp(now).isoformat(sep)
def iso_local(now=None, sep='_', t=time.time):
if now is None:
now = t()
return datetime.datetime.fromtimestamp(now).isoformat(sep)
def iso_utc_time_to_seconds(isotime, _conversion_re=re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})[T_ ](?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})(?P<subsecond>\.\d+)?")):
"""
The inverse of iso_utc().

View File

@ -13,7 +13,7 @@ from nevow.inevow import IRequest
from foolscap.eventual import fireEventually
from allmydata.util import base32
from allmydata.util import base32, time_format
from allmydata.uri import from_string_dirnode
from allmydata.interfaces import IDirectoryNode, IFileNode, IMutableFileNode, \
ExistingChildError, NoSuchChildError
@ -592,16 +592,25 @@ class DirectoryAsHTML(rend.Page):
ctx.fillSlots("rename", rename)
times = []
TIME_FORMAT = "%H:%M:%S %d-%b-%Y"
if "ctime" in metadata:
ctime = time.strftime(TIME_FORMAT,
time.localtime(metadata["ctime"]))
times.append("c: " + ctime)
if "mtime" in metadata:
mtime = time.strftime(TIME_FORMAT,
time.localtime(metadata["mtime"]))
linkcrtime = metadata.get('tahoe', {}).get("linkcrtime")
if linkcrtime is not None:
times.append("lcr: " + time_format.iso_local(linkcrtime))
else:
# For backwards-compatibility with links last modified by Tahoe < 1.4.0:
if "ctime" in metadata:
ctime = time_format.iso_local(metadata["ctime"])
times.append("c: " + ctime)
linkmotime = metadata.get('tahoe', {}).get("linkmotime")
if linkmotime is not None:
if times:
times.append(T.br())
times.append("lmo: " + time_format.iso_local(linkmotime))
else:
# For backwards-compatibility with links last modified by Tahoe < 1.4.0:
if "mtime" in metadata:
mtime = time_format.iso_local(metadata["mtime"])
if times:
times.append(T.br())
times.append("m: " + mtime)
ctx.fillSlots("times", times)