dirnode.py: stop writing 'ctime' and 'mtime' fields. Includes documentation and test changes.

david-sarah 2010-06-18 16:01:19 -07:00
parent 72e395d878
commit 4712875193
4 changed files with 127 additions and 143 deletions

@ -664,25 +664,41 @@ GET /uri/$DIRCAP/[SUBDIRS../]FILENAME?t=json
==== About the metadata ====
The value of the 'mtime' key and of the 'tahoe':'linkmotime' is updated
whenever a link to a child is set. The value of the 'ctime' key and of the
'tahoe':'linkcrtime' key is updated whenever a link to a child is created --
i.e. when there was not previously a link under that name.
The value of the 'tahoe':'linkmotime' key is updated whenever a link to a
child is set. The value of the 'tahoe':'linkcrtime' key is updated whenever
a link to a child is created -- i.e. when there was not previously a link
under that name.
In Tahoe earlier than v1.4.0, only the 'mtime'/'ctime' keys were populated.
Starting in Tahoe v1.4.0, the 'linkmotime'/'linkcrtime' keys in the 'tahoe'
sub-dict are also populated. However, prior to v1.7.0, a bug caused the
'tahoe' sub-dict to be deleted by webapi requests in which new metadata
is specified, and not to be added to existing child links that lack it.
Note however, that if the edge in the Tahoe filesystem points to a mutable
file and the contents of that mutable file is changed, then the
'tahoe':'linkmotime' value on that edge will *not* be updated, since the
edge itself wasn't updated -- only the mutable file was.
The reason we added the new values in Tahoe v1.4.0 is that there is a
The timestamps are represented as a number of seconds since the UNIX epoch
(1970-01-01 00:00:00 UTC), excluding leap seconds.
In Tahoe earlier than v1.4.0, 'mtime' and 'ctime' keys were populated
instead of the 'tahoe':'linkmotime' and 'tahoe':'linkcrtime' keys. Starting
in Tahoe v1.4.0, the 'linkmotime'/'linkcrtime' keys in the 'tahoe' sub-dict
are populated. However, prior to Tahoe v1.7beta, a bug caused the 'tahoe'
sub-dict to be deleted by webapi requests in which new metadata is
specified, and not to be added to existing child links that lack it.
From Tahoe v1.7.0 onward, the 'mtime' and 'ctime' fields are no longer
populated or updated (see ticket #924), except by "tahoe backup" as
explained below. For backward compatibility, when an existing link is
updated and 'tahoe':'linkcrtime' is not present in the previous metadata
but 'ctime' is, the old value of 'ctime' is used as the new value of
'tahoe':'linkcrtime'.
The reason we added the new fields in Tahoe v1.4.0 is that there is a
"set_children" API (described below) which you can use to overwrite the
values of the 'mtime'/'ctime' pair, and this API is used by the "tahoe
backup" command (both in Tahoe v1.3.0 and in Tahoe v1.4.0) to set the
'mtime' and 'ctime' values when backing up files from a local filesystem
into the Tahoe filesystem. As of Tahoe v1.4.0, the set_children API cannot
be used to set anything under the 'tahoe' key of the metadata dict -- if
you include 'tahoe' keys in your 'metadata' arguments then it will silently
values of the 'mtime'/'ctime' pair, and this API is used by the
"tahoe backup" command (in Tahoe v1.3.0 and later) to set the 'mtime' and
'ctime' values when backing up files from a local filesystem into the
Tahoe filesystem. As of Tahoe v1.4.0, the set_children API cannot be used
to set anything under the 'tahoe' key of the metadata dict -- if you
include 'tahoe' keys in your 'metadata' arguments then it will silently
ignore those keys.
Therefore, if the 'tahoe' sub-dict is present, you can rely on the
@ -694,60 +710,45 @@ GET /uri/$DIRCAP/[SUBDIRS../]FILENAME?t=json
they like, and there is nothing to constrain their system clock from taking
any value.)
The meaning of the 'ctime'/'mtime' fields are slightly more complex.
When an edge is created or updated by "tahoe backup", the 'mtime' and
'ctime' keys on that edge are set as follows:
The meaning of the 'mtime' field is: whenever the edge is updated (by an HTTP
PUT or POST, as is done by the "tahoe cp" command), then the mtime is set to
the current time on the clock of the updating client. Whenever the edge is
updated by "tahoe backup" then the mtime is instead set to the value which
the updating client read from its local filesystem for the "mtime" of the
local file in question, which means the last time the contents of that file
were changed. Note however, that if the edge in the Tahoe filesystem points
to a mutable file and the contents of that mutable file is changed then the
"mtime" value on that edge will *not* be updated, since the edge itself
wasn't updated -- only the mutable file was.
* 'mtime' is set to the timestamp read from the local filesystem for the
"mtime" of the local file in question, which means the last time the
contents of that file were changed.
The meaning of the 'ctime' field is even more complex. Whenever a new edge is
created (by an HTTP PUT or POST, as is done by "tahoe cp") then the ctime is
set to the current time on the clock of the updating client. Whenever the
edge is created *or updated* by "tahoe backup" then the ctime is instead set
to the value which the updating client read from its local filesystem. On
Windows, it reads the timestamp of when the local file was created and puts
that into the "ctime", and on other platforms it reads the timestamp of the
most recent time that either the contents or the metadata of the local file
was changed and puts that into the ctime. Again, if the edge points to a
mutable file and the content of that mutable file is changed then the ctime
will not be updated in any case.
* On Windows, 'ctime' is set to the creation timestamp for the file
read from the local filesystem. On other platforms, 'ctime' is set to
the UNIX "ctime" of the local file, which means the last time that
either the contents or the metadata of the local file was changed.
Therefore there are several ways that the 'ctime' field could be confusing:
There are several ways that the 'ctime' field could be confusing:
1. You might be confused about whether it reflects the time of the creation
of a link in the Tahoe filesystem or a timestamp copied in from a local
filesystem.
of a link in the Tahoe filesystem (by a version of Tahoe < v1.7.0) or a
timestamp copied in by "tahoe backup" from a local filesystem.
2. You might be confused about whether it is a copy of the file creation time
(if "tahoe backup" was run on a Windows system) or of the last
contents-or-metadata change (if "tahoe backup" was run on a different
operating system).
2. You might be confused about whether it is a copy of the file creation
time (if "tahoe backup" was run on a Windows system) or of the last
contents-or-metadata change (if "tahoe backup" was run on a different
operating system).
3. You might be confused by the fact that changing the contents of a mutable
file in Tahoe don't have any effect on any links pointing at that file in any
directories, although "tahoe backup" sets the link 'ctime'/'mtime' to reflect
timestamps about the local file corresponding to the Tahoe file to which the
link points.
3. You might be confused by the fact that changing the contents of a
mutable file in Tahoe doesn't have any effect on any links pointing at
that file in any directories, although "tahoe backup" sets the link
'ctime'/'mtime' to reflect timestamps about the local file corresponding
to the Tahoe file to which the link points.
4. Also, quite apart from Tahoe, you might be confused about the meaning
of the "ctime" in UNIX local filesystems, which people sometimes think
means file creation time, but which actually means, in UNIX local
filesystems, the most recent time that the file contents or the file
metadata (such as owner, permission bits, extended attributes, etc.)
has changed. Note that although "ctime" does not mean file creation time
in UNIX, links created by a version of Tahoe prior to v1.7.0, and never
written by "tahoe backup", will have 'ctime' set to the link creation
time.
4. Also, quite apart from Tahoe, you might be confused about the meaning of
the 'ctime' in UNIX local filesystems, which people sometimes think means
file creation time, but which actually means, in UNIX local filesystems, the
most recent time that the file contents or the file metadata (such as owner,
permission bits, extended attributes, etc.) has changed. Note that although
'ctime' does not mean file creation time in UNIX, it does mean link creation
time in Tahoe, unless the "tahoe backup" command has been used on that link,
in which case it means something about the local filesystem file which
corresponds to the Tahoe file which is pointed at by the link. It means
either file creation time of the local file (if "tahoe backup" was run on
Windows) or file-contents-or-metadata-update-time of the local file (if
"tahoe backup" was run on a different operating system).
=== Attaching an existing File or Directory by its read- or write- cap ===
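
As a reading aid for the timestamp rules in the documentation hunk above, here is a minimal sketch of how a client might pick a link's creation and modification times out of a metadata dict. The function name `link_times` is hypothetical and is not part of Tahoe. The fallback to the legacy 'ctime' key mirrors the backward-compatibility rule stated in the documentation, and carries the same caveat: on a link written by "tahoe backup", 'ctime' describes the local file rather than the link.

```python
from datetime import datetime, timezone

def link_times(metadata):
    """Return (created, modified) as UTC datetimes, or None where unknown."""
    tahoe_md = metadata.get("tahoe", {})
    # Prefer 'tahoe':'linkcrtime'; fall back to the legacy 'ctime' only when
    # it is missing (assumption: such a link predates Tahoe v1.4.0).
    created = tahoe_md.get("linkcrtime", metadata.get("ctime"))
    modified = tahoe_md.get("linkmotime")

    def to_utc(seconds):
        if seconds is None:
            return None
        # Timestamps are seconds since the UNIX epoch.
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return to_utc(created), to_utc(modified)

created, modified = link_times(
    {"tahoe": {"linkcrtime": 0.0, "linkmotime": 86400.0}})
# created is 1970-01-01 00:00:00 UTC, modified is 1970-01-02 00:00:00 UTC
```

Returning None instead of raising keeps the helper usable on old edges that carry no timestamps at all.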

@ -30,13 +30,11 @@ def update_metadata(metadata, new_metadata, now):
Timestamps are set according to the time 'now'."""
if metadata is None:
metadata = {'ctime': now,
'mtime': now,
'tahoe': {
'linkcrtime': now,
'linkmotime': now,
}
}
metadata = {}
old_ctime = None
if 'ctime' in metadata:
old_ctime = metadata['ctime']
if new_metadata is not None:
# Overwrite all metadata.
@ -48,29 +46,19 @@ def update_metadata(metadata, new_metadata, now):
if 'tahoe' in metadata:
newmd['tahoe'] = metadata['tahoe']
# For backwards compatibility with Tahoe < 1.4.0:
if 'ctime' not in newmd:
if 'ctime' in metadata:
newmd['ctime'] = metadata['ctime']
else:
newmd['ctime'] = now
if 'mtime' not in newmd:
newmd['mtime'] = now
metadata = newmd
else:
# For backwards compatibility with Tahoe < 1.4.0:
if 'ctime' not in metadata:
metadata['ctime'] = now
metadata['mtime'] = now
# update timestamps
sysmd = metadata.get('tahoe', {})
if 'linkcrtime' not in sysmd:
# In Tahoe < 1.4.0 we used the word 'ctime' to mean what Tahoe >= 1.4.0
# calls 'linkcrtime'.
assert 'ctime' in metadata
sysmd['linkcrtime'] = metadata['ctime']
# calls 'linkcrtime'. This field is only used if it was in the old metadata,
# and 'tahoe:linkcrtime' was not.
if old_ctime is not None:
sysmd['linkcrtime'] = old_ctime
else:
sysmd['linkcrtime'] = now
sysmd['linkmotime'] = now
metadata['tahoe'] = sysmd
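
Because the +/- markers are not visible in this view, the post-commit behaviour of update_metadata is hard to read off the hunks above. The following is a minimal sketch reconstructed from the lines shown and from the expectations in test_update_metadata; it is not the committed code. The name `update_metadata_sketch` is hypothetical. The stripping of a caller-supplied 'tahoe' key and the final `return` fall in parts of the function that the hunks elide, so both are assumptions inferred from the tests. Working on copies rather than in place is a choice made for this sketch only.

```python
def update_metadata_sketch(metadata, new_metadata, now):
    """Return link metadata with the 'tahoe' timestamps refreshed to 'now'."""
    # Work on a copy so the caller's dict is left untouched (sketch choice).
    metadata = dict(metadata) if metadata is not None else {}

    # Remember a pre-1.4.0 'ctime' before the metadata is overwritten.
    old_ctime = metadata.get('ctime')

    if new_metadata is not None:
        # Caller-supplied metadata replaces the old metadata wholesale...
        newmd = dict(new_metadata)
        # ...except 'tahoe', which callers cannot set (assumption: this
        # step is not visible in the hunks and is inferred from the tests).
        newmd.pop('tahoe', None)
        if 'tahoe' in metadata:
            newmd['tahoe'] = metadata['tahoe']
        metadata = newmd

    sysmd = dict(metadata.get('tahoe', {}))
    if 'linkcrtime' not in sysmd:
        # An old 'ctime' is reused only when 'linkcrtime' is absent.
        sysmd['linkcrtime'] = old_ctime if old_ctime is not None else now
    sysmd['linkmotime'] = now
    metadata['tahoe'] = sysmd
    return metadata

t1, t2, t3 = 626644800.0, 634745640.0, 892226160.0
md1 = update_metadata_sketch({"ctime": t1}, {}, t2)
md2 = update_metadata_sketch(md1, {"key": "value", "tahoe": {"bad": "mojo"}}, t3)
```

With these inputs, `md1` carries only the 'tahoe' sub-dict, with 'linkcrtime' taken from the old 'ctime', and `md2` keeps that 'linkcrtime' while dropping the caller's "bad" key.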

@ -746,7 +746,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
d.addCallback(lambda res: n.get_metadata_for(u"child"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
set(["tahoe"])))
d.addCallback(lambda res:
self.shouldFail(NoSuchChildError, "gcamap-no",
@ -768,8 +768,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
child, metadata = res
self.failUnlessEqual(child.get_uri(),
fake_file_uri)
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"]))
d.addCallback(_check_child_and_metadata2)
d.addCallback(lambda res:
@ -777,8 +776,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
def _check_child_and_metadata3(res):
child, metadata = res
self.failUnless(isinstance(child, dirnode.DirectoryNode))
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"]))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"]))
d.addCallback(_check_child_and_metadata3)
# set_uri + metadata
@ -788,7 +786,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
{}))
d.addCallback(lambda res: n.get_metadata_for(u"c2"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
# You can't override the link timestamps.
d.addCallback(lambda res: n.set_uri(u"c2",
@ -806,7 +804,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
fake_file_uri, fake_file_uri))
d.addCallback(lambda res: n.get_metadata_for(u"c3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
# we can also add specific metadata at set_uri() time
d.addCallback(lambda res: n.set_uri(u"c4",
@ -814,7 +812,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
{"key": "value"}))
d.addCallback(lambda res: n.get_metadata_for(u"c4"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["key", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"c2"))
@ -832,20 +830,20 @@ class Dirnode(GridTestMixin, unittest.TestCase,
overwrite=False))
d.addCallback(lambda res: n.get_metadata_for(u"d2"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
# if we don't set any defaults, the child should get timestamps
d.addCallback(lambda res: n.set_node(u"d3", n))
d.addCallback(lambda res: n.get_metadata_for(u"d3"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
# we can also add specific metadata at set_node() time
d.addCallback(lambda res: n.set_node(u"d4", n,
{"key": "value"}))
d.addCallback(lambda res: n.get_metadata_for(u"d4"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["key", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata["key"] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"d2"))
@ -876,13 +874,13 @@ class Dirnode(GridTestMixin, unittest.TestCase,
d.addCallback(lambda children: self.failIf(u"new" in children))
d.addCallback(lambda res: n.get_metadata_for(u"e1"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
d.addCallback(lambda res: n.get_metadata_for(u"e2"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
d.addCallback(lambda res: n.get_metadata_for(u"e3"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["key", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata["key"] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"e1"))
@ -907,13 +905,13 @@ class Dirnode(GridTestMixin, unittest.TestCase,
d.addCallback(lambda children: self.failIf(u"new" in children))
d.addCallback(lambda res: n.get_metadata_for(u"f1"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
d.addCallback(lambda res: n.get_metadata_for(u"f2"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()), set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
d.addCallback(lambda res: n.get_metadata_for(u"f3"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["key", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata["key"] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"f1"))
@ -926,7 +924,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
{"tags": ["web2.0-compatible"], "tahoe": {"bad": "mojo"}}))
d.addCallback(lambda n1: n1.get_metadata_for(u"child"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["tags", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["tags", "tahoe"])) and
metadata["tags"] == ["web2.0-compatible"] and
"bad" not in metadata["tahoe"], metadata))
@ -953,43 +951,37 @@ class Dirnode(GridTestMixin, unittest.TestCase,
d.addCallback(lambda res: n.get_metadata_for(u"timestamps"))
def _check_timestamp1(metadata):
self.failUnless("ctime" in metadata)
self.failUnless("mtime" in metadata)
self.failUnlessGreaterOrEqualThan(metadata["ctime"],
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"]))
tahoe_md = metadata["tahoe"]
self.failUnlessEqual(set(tahoe_md.keys()), set(["linkcrtime", "linkmotime"]))
self.failUnlessGreaterOrEqualThan(tahoe_md["linkcrtime"],
self._start_timestamp)
self.failUnlessGreaterOrEqualThan(self._stop_timestamp,
metadata["ctime"])
self.failUnlessGreaterOrEqualThan(metadata["mtime"],
tahoe_md["linkcrtime"])
self.failUnlessGreaterOrEqualThan(tahoe_md["linkmotime"],
self._start_timestamp)
self.failUnlessGreaterOrEqualThan(self._stop_timestamp,
metadata["mtime"])
tahoe_md["linkmotime"])
# Our current timestamp rules say that replacing an existing
# child should preserve the 'ctime' but update the mtime
self._old_ctime = metadata["ctime"]
self._old_mtime = metadata["mtime"]
# child should preserve the 'linkcrtime' but update the
# 'linkmotime'
self._old_linkcrtime = tahoe_md["linkcrtime"]
self._old_linkmotime = tahoe_md["linkmotime"]
d.addCallback(_check_timestamp1)
d.addCallback(self.stall, 2.0) # accommodate low-res timestamps
d.addCallback(lambda res: n.set_node(u"timestamps", n))
d.addCallback(lambda res: n.get_metadata_for(u"timestamps"))
def _check_timestamp2(metadata):
self.failUnlessEqual(metadata["ctime"], self._old_ctime,
"%s != %s" % (metadata["ctime"],
self._old_ctime))
self.failUnlessGreaterThan(metadata["mtime"], self._old_mtime)
self.failUnlessIn("tahoe", metadata)
tahoe_md = metadata["tahoe"]
self.failUnlessEqual(set(tahoe_md.keys()), set(["linkcrtime", "linkmotime"]))
self.failUnlessEqual(tahoe_md["linkcrtime"], self._old_linkcrtime)
self.failUnlessGreaterThan(tahoe_md["linkmotime"], self._old_linkmotime)
return n.delete(u"timestamps")
d.addCallback(_check_timestamp2)
# also make sure we can add/update timestamps on a
# previously-existing child that didn't have any, since there are
# a lot of 0.7.0-generated edges around out there
d.addCallback(lambda res: n.set_node(u"no_timestamps", n, {}))
d.addCallback(lambda res: n.set_node(u"no_timestamps", n))
d.addCallback(lambda res: n.get_metadata_for(u"no_timestamps"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
d.addCallback(lambda res: n.delete(u"no_timestamps"))
d.addCallback(lambda res: n.delete(u"subdir"))
d.addCallback(lambda old_child:
self.failUnlessEqual(old_child.get_uri(),
@ -1017,8 +1009,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
set([u"child", u"newfile"])))
d.addCallback(lambda res: n.get_metadata_for(u"newfile"))
d.addCallback(lambda metadata:
self.failUnlessEqual(set(metadata.keys()),
set(["tahoe", "ctime", "mtime"])))
self.failUnlessEqual(set(metadata.keys()), set(["tahoe"])))
uploadable3 = upload.Data("some data", convergence="converge")
d.addCallback(lambda res: n.add_file(u"newfile-metadata",
@ -1028,7 +1019,7 @@ class Dirnode(GridTestMixin, unittest.TestCase,
self.failUnless(IImmutableFileNode.providedBy(newnode)))
d.addCallback(lambda res: n.get_metadata_for(u"newfile-metadata"))
d.addCallback(lambda metadata:
self.failUnless((set(metadata.keys()) == set(["key", "tahoe", "ctime", "mtime"])) and
self.failUnless((set(metadata.keys()) == set(["key", "tahoe"])) and
(metadata['key'] == "value"), metadata))
d.addCallback(lambda res: n.delete(u"newfile-metadata"))
@ -1120,19 +1111,21 @@ class Dirnode(GridTestMixin, unittest.TestCase,
return d
def test_update_metadata(self):
(t1, t2, t3) = (626644800, 634745640, 892226160)
(t1, t2, t3) = (626644800.0, 634745640.0, 892226160.0)
md1 = dirnode.update_metadata({}, {"ctime": t1}, t2)
self.failUnlessEqual(md1, {"ctime": t1, "mtime": t2,
"tahoe":{"linkcrtime": t1, "linkmotime": t2}})
md1 = dirnode.update_metadata({"ctime": t1}, {}, t2)
self.failUnlessEqual(md1, {"tahoe":{"linkcrtime": t1, "linkmotime": t2}})
md2 = dirnode.update_metadata(md1, {"key": "value", "tahoe": {"bad": "mojo"}}, t3)
self.failUnlessEqual(md2, {"key": "value", "ctime": t1, "mtime": t3,
self.failUnlessEqual(md2, {"key": "value",
"tahoe":{"linkcrtime": t1, "linkmotime": t3}})
md3 = dirnode.update_metadata({}, None, t3)
self.failUnlessEqual(md3, {"ctime": t3, "mtime": t3,
"tahoe":{"linkcrtime": t3, "linkmotime": t3}})
self.failUnlessEqual(md3, {"tahoe":{"linkcrtime": t3, "linkmotime": t3}})
md4 = dirnode.update_metadata({}, {"bool": True, "number": 42}, t1)
self.failUnlessEqual(md4, {"bool": True, "number": 42,
"tahoe":{"linkcrtime": t1, "linkmotime": t1}})
def test_create_subdirectory(self):
self.basedir = "dirnode/Dirnode/test_create_subdirectory"

@ -240,16 +240,18 @@ class WebMixin(object):
for (name,value)
in data[1]["children"].iteritems()] )
self.failUnlessEqual(kids[u"sub"][0], "dirnode")
self.failUnless("metadata" in kids[u"sub"][1])
self.failUnless("ctime" in kids[u"sub"][1]["metadata"])
self.failUnless("mtime" in kids[u"sub"][1]["metadata"])
self.failUnlessIn("metadata", kids[u"sub"][1])
self.failUnlessIn("tahoe", kids[u"sub"][1]["metadata"])
tahoe_md = kids[u"sub"][1]["metadata"]["tahoe"]
self.failUnlessIn("linkcrtime", tahoe_md)
self.failUnlessIn("linkmotime", tahoe_md)
self.failUnlessEqual(kids[u"bar.txt"][0], "filenode")
self.failUnlessEqual(kids[u"bar.txt"][1]["size"], len(self.BAR_CONTENTS))
self.failUnlessEqual(kids[u"bar.txt"][1]["ro_uri"], self._bar_txt_uri)
self.failUnlessEqual(kids[u"bar.txt"][1]["verify_uri"],
self._bar_txt_verifycap)
self.failUnlessEqual(kids[u"bar.txt"][1]["metadata"]["ctime"],
self._bar_txt_metadata["ctime"])
self.failUnlessEqual(kids[u"bar.txt"][1]["metadata"]["tahoe"]["linkcrtime"],
self._bar_txt_metadata["tahoe"]["linkcrtime"])
self.failUnlessEqual(kids[u"n\u00fc.txt"][1]["ro_uri"],
self._bar_txt_uri)
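
The assertions above walk the JSON structure returned for a directory with t=json. Below is a minimal sketch of the same traversal outside the test harness, assuming the response body has already been fetched as a string. The sample document and the helper name `child_link_times` are made up for illustration, and only the keys the test touches are shown.

```python
import json

# Hypothetical sample shaped like the structure the test indexes into:
# data[1]["children"][name] == [node_type, {..., "metadata": {...}}]
SAMPLE = json.dumps(
    ["dirnode", {"children": {
        "sub": ["dirnode", {"metadata": {"tahoe": {
            "linkcrtime": 626644800.0, "linkmotime": 634745640.0}}}],
        "bar.txt": ["filenode", {"size": 9, "metadata": {"tahoe": {
            "linkcrtime": 634745640.0, "linkmotime": 892226160.0}}}],
    }}])

def child_link_times(listing_json):
    """Map child name -> (node type, linkcrtime, linkmotime)."""
    _nodetype, info = json.loads(listing_json)
    result = {}
    for name, (child_type, child_info) in info["children"].items():
        # Links written before the 'tahoe' sub-dict existed may lack it.
        tahoe_md = child_info.get("metadata", {}).get("tahoe", {})
        result[name] = (child_type,
                        tahoe_md.get("linkcrtime"),
                        tahoe_md.get("linkmotime"))
    return result

times = child_link_times(SAMPLE)
```

Reading through `.get()` means a child without timestamps yields None values rather than a KeyError.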