remove zfec source from our tree, users should grab a tarball from our http://allmydata.org/trac/tahoe/wiki/Dependencies page, or from the python cheeseshop

Brian Warner 2007-08-25 15:37:25 -07:00
parent 4bbc423d70
commit 248f2dc260
22 changed files with 0 additions and 5690 deletions


@ -1,353 +0,0 @@
In addition to the terms written below, this licence comes with the added
permission that, if you become obligated to release a derived work under this
licence (as per section 2.b), you may delay the fulfillment of this
obligation for up to 12 months. If you are obligated to release code under
section 2.b of this licence, such code must be released under these same terms
including the 12-month grace period clause.
In addition to the terms written below, this licence comes with the added
permission that you may link this program with the OpenSSL library and
distribute executables, as long as you follow the requirements of this
licence in regard to all of the software in the executable aside from
OpenSSL.
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.


@ -1,235 +0,0 @@
* Intro and Licence
This package implements an "erasure code", or "forward error correction
code".
It is offered under the GNU General Public License (v2 or later), with the
added permission that, if you become obligated to release a derived work
under this licence (as per section 2.b), you may delay the fulfillment of
this obligation for up to 12 months. If you are obligated to release code
under section 2.b of this licence, such code must be released under these
same terms including the 12-month grace period clause. See the COPYING
file for details.
The most widely known example of an erasure code is the RAID-5 algorithm
which makes it so that in the event of the loss of any one hard drive, the
stored data can be completely recovered. The algorithm in the zfec package
has a similar effect, but instead of recovering from the loss of only a
single element, it can be parameterized to choose in advance the number of
elements whose loss it can tolerate.
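As a rough illustration of the RAID-5 analogy above (and not of the algorithm
that zfec actually uses), here is a minimal Python sketch of a single-parity
erasure code: k primary blocks plus one XOR parity block, so m = k + 1 and any
one lost block can be rebuilt. The function names are made up for this example
and are not part of zfec.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_parity(primary_blocks):
    """Return the k primary blocks followed by one XOR parity block."""
    return list(primary_blocks) + [xor_blocks(primary_blocks)]


def recover_missing(blocks):
    """Rebuild the single missing block (marked as None) from the survivors."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("this toy code tolerates exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    # XOR of all surviving blocks cancels everything except the lost block.
    return xor_blocks(survivors)


shares = encode_parity([b"abcd", b"efgh", b"ijkl"])
damaged = shares[:1] + [None] + shares[2:]  # pretend block 1 was lost
restored = recover_missing(damaged)
```

A code like this tolerates only one loss; the point of the k and m parameters
described below is that the number of tolerated losses (m - k) is chosen in
advance.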
This package is largely based on the old "fec" library by Luigi Rizzo et al.,
which is a mature and optimized implementation of erasure coding. The zfec
package makes several changes from the original "fec" package, including
addition of the Python API, refactoring of the C API to support zero-copy
operation, a few clean-ups and micro-optimizations of the core code itself,
and the addition of a command-line tool named "zfec".
* Installation
This package is managed with the "setuptools" package management tool. To
build and install the package directly into your system, just run "python
./setup.py install". If you prefer to keep the package limited to a specific
directory so that you can manage it yourself (perhaps by using the "GNU
stow" tool), then give it these arguments: "python ./setup.py install
--single-version-externally-managed
--record=${specificdirectory}/zfec-install.log --prefix=${specificdirectory}"
To run the self-tests, execute "python ./setup.py test" (or if you have
Twisted Python installed, you can run "trial zfec" for nicer output and test
options).
* Community
The source is currently available via darcs on the web with the command:
darcs get http://allmydata.org/source/zfec
More information on darcs is available at http://darcs.net
Please join the zfec mailing list and submit patches:
<http://allmydata.org/cgi-bin/mailman/listinfo/zfec-dev>
* Overview
This package performs two operations, encoding and decoding. Encoding takes
some input data and expands its size by producing extra "check blocks", also
called "secondary blocks". Decoding takes some data -- any combination of
"primary blocks" (blocks of the original data) and "secondary blocks" -- and
produces the original data.
The encoding is parameterized by two integers, k and m. m is the total number
of blocks produced, and k is how many of those blocks are necessary to
reconstruct the original data. m is required to be at least 1 and at most 256,
and k is required to be at least 1 and at most m.
(Note that when k == m then there is no point in doing erasure coding -- it
degenerates to the equivalent of the Unix "split" utility which simply splits
the input into successive segments. Similarly, when k == 1 it degenerates to
the equivalent of the Unix "cp" utility -- each block is a complete copy of the
input data. The "zfec" command-line tool does not implement these degenerate
cases.)
Note that each "primary block" is a segment of the original data, so its size
is 1/k'th of the size of original data, and each "secondary block" is of the
same size, so the total space used by all the blocks is m/k times the size of
the original data (plus some padding to fill out the last primary block to be
the same size as all the others). In addition to the data contained in the
blocks themselves there are also a few pieces of metadata which are necessary
for later reconstruction. Those pieces are: 1. the value of k, 2. the value
of m, 3. the sharenum of each block, 4. the number of bytes of padding
that were used. The "zfec" command-line tool compresses these pieces of data
and prepends them to the beginning of each share, so each sharefile
produced by the "zfec" command-line tool is between one and four bytes larger
than the share data alone.
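The arithmetic above can be sketched in Python as follows. block_layout() is a
hypothetical helper written for this example, not a zfec function. It assumes
that padding simply rounds the input up to a multiple of k, and it ignores the
one-to-four-byte metadata header that the command-line tool adds to each
sharefile.

```python
def block_layout(data_len, k, m):
    """Return (block_size, padding, total_size) for data_len bytes of input.

    Hypothetical helper: applies the limits 1 <= m <= 256 and 1 <= k <= m
    stated above, and assumes padding rounds the input up to a multiple of k.
    """
    if not 1 <= m <= 256:
        raise ValueError("m must be at least 1 and at most 256")
    if not 1 <= k <= m:
        raise ValueError("k must be at least 1 and at most m")
    block_size = -(-data_len // k)           # ceiling of data_len / k
    padding = block_size * k - data_len      # bytes added to the last primary block
    total_size = block_size * m              # all m blocks have the same size
    return block_size, padding, total_size


block_size, padding, total_size = block_layout(1000, k=3, m=10)
```

With 1000 bytes of input, k=3 and m=10, each block is 334 bytes, 2 bytes of
padding are needed, and the m blocks together occupy 3340 bytes, which is
roughly m/k times the input size.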
The decoding step requires as input k of the blocks which were produced by the
encoding step. The decoding step produces as output the data that was earlier
input to the encoding step.
* Command-Line Tool
The bin/ directory contains two Unix-style command-line tools, "zfec" and
"zunfec". Execute "zfec --help" or "zunfec --help" for usage instructions.
Note: a Unix-style tool like "zfec" does only one thing -- in this case
erasure coding -- and leaves other tasks to other tools. Other Unix-style
tools that go well with zfec include "GNU tar" for archiving multiple files
and directories into one file, "rzip" or "lrzip" for compression, and "GNU
Privacy Guard" for encryption or "sha256sum" for integrity. It is important
to do things in order: first archive, then compress, then either encrypt or
sha256sum, then erasure code. Note that if GNU Privacy Guard is used for
privacy, then it will also ensure integrity, so the use of sha256sum is
unnecessary in that case.
* Performance Measurements
On my Athlon 64 2.4 GHz workstation (running Linux), the "zfec" command-line
tool encoded a 160 MB file with m=100, k=94 (about 6% redundancy) in 3.9
seconds, where the "par2" tool encoded the file with about 6% redundancy in
27 seconds. zfec encoded the same file with m=12, k=6 (100% redundancy) in
4.1 seconds, where par2 encoded it with about 100% redundancy in 7 minutes
and 56 seconds.
The underlying C library in benchmark mode encoded from a file at about
4.9 million bytes per second and decoded at about 5.8 million bytes per second.
On Peter's fancy Intel Mac laptop (2.16 GHz Core Duo), it encoded from a file
at about 6.2 million bytes per second.
On my even fancier Intel Mac laptop (2.33 GHz Core Duo), it encoded from a file
at about 6.8 million bytes per second.
On my old PowerPC G4 867 MHz Mac laptop, it encoded from a file at about 1.3
million bytes per second.
* API
Each block is associated with a "blocknum". The blocknum of each primary block is
its index (starting from zero), so the 0th block is the first primary block,
which is the first few bytes of the file, the 1st block is the next primary
block, which is the next few bytes of the file, and so on. The last primary
block has blocknum k-1. The blocknum of each secondary block is an arbitrary
integer between k and 255 inclusive. (When using the Python API, if you don't
specify which blocknums you want for your secondary blocks when invoking
encode(), then it will by default provide the blocks with blocknums from k to
m-1 inclusive.)
** C API
fec_encode() takes as input an array of k pointers, where each pointer points
to a memory buffer containing the input data (i.e., the i'th buffer contains
the i'th primary block). There is also a second parameter which is an array of
the blocknums of the secondary blocks which are to be produced. (Each element
in that array is required to be the blocknum of a secondary block, i.e. it is
required to be >= k and < m.)
The output from fec_encode() is the requested set of secondary blocks which are
written into output buffers provided by the caller.
fec_decode() takes as input an array of k pointers, where each pointer points
to a buffer containing a block. There is also a separate input parameter which
is an array of blocknums, indicating the blocknum of each of the blocks which
are being passed in.
The output from fec_decode() is the set of primary blocks which were missing
from the input and had to be reconstructed. These reconstructed blocks are
written into output buffers provided by the caller.
** Python API
encode() and decode() take as input a sequence of k buffers, where a "sequence"
is any object that implements the Python sequence protocol (such as a list or
tuple) and a "buffer" is any object that implements the Python buffer protocol
(such as a string or array). The contents that are required to be present in
these buffers are the same as for the C API.
encode() also takes a list of desired blocknums. Unlike the C API, the Python
API accepts blocknums of primary blocks as well as secondary blocks in its list
of desired blocknums. encode() returns a list of buffer objects which contain
the blocks requested. For each requested block which is a primary block, the
resulting list contains a reference to the appropriate primary block from the
input list. For each requested block which is a secondary block, the list
contains a newly created string object containing that block.
decode() also takes a list of integers indicating the blocknums of the blocks
being passed in. decode() returns a list of buffer objects which contain all
of the primary blocks of the original data (in order). For each primary block
which was present in the input list, the result list simply contains a
reference to the object that was passed in the input list. For each primary
block which was not present in the input, the result list contains a newly
created string object containing that primary block.
Beware of a "gotcha" that can result from the combination of mutable data and
the fact that the Python API returns references to inputs when possible.
Returning references to its inputs is efficient since it avoids making an
unnecessary copy of the data, but if the object which was passed as input is
mutable and if that object is mutated after the call to zfec returns, then the
result from zfec -- which is just a reference to that same object -- will also
be mutated. This subtlety is the price you pay for avoiding data copying. If
you don't want to have to worry about this then you can simply use immutable
objects (e.g. Python strings) to hold the data that you pass to zfec.
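The gotcha can be reproduced without zfec at all. In the sketch below,
return_primaries() is a hypothetical stand-in that only mimics the
"return references to inputs" behaviour described above; it does no erasure
coding and is not part of the zfec API.

```python
def return_primaries(blocks, blocknums, k):
    """Hypothetical stand-in: return the k primary blocks by reference, in order.

    Only handles the case where every primary block is present in the input.
    """
    by_num = dict(zip(blocknums, blocks))
    if any(num not in by_num for num in range(k)):
        raise ValueError("this stand-in cannot reconstruct missing blocks")
    return [by_num[num] for num in range(k)]


# Mutable inputs: the result aliases the caller's objects.
mutable_input = [bytearray(b"abcd"), bytearray(b"efgh")]
result = return_primaries(mutable_input, [0, 1], k=2)
mutable_input[0][0:1] = b"X"      # mutate the input after the call returns
aliased = bytes(result[0])        # the "result" has changed too

# Immutable inputs: nothing can change behind the caller's back.
immutable_input = [b"abcd", b"efgh"]
safe = return_primaries(immutable_input, [0, 1], k=2)
```

After the mutation, result[0] reads b"Xbcd" because it is the very same
object as mutable_input[0]. With immutable byte strings the problem cannot
arise.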
* Utilities
The filefec.py module has a utility function for efficiently reading a file
and encoding it piece by piece. This module is used by the "zfec" and
"zunfec" command-line tools from the bin/ directory.
* Dependencies
A C compiler is required. To use the Python API or the command-line tools a
Python interpreter is also required. We have tested it with Python v2.4 and
v2.5.
* Acknowledgements
Thanks to the author of the original fec lib, Luigi Rizzo, and the folks that
contributed to it: Phil Karn, Robert Morelos-Zaragoza, Hari Thirumoorthy, and
Dan Rubenstein. Thanks to the Mnet hackers who wrote an earlier Python
wrapper, especially Myers Carpenter and Hauke Johannknecht. Thanks to Brian
Warner and Amber O'Whielacronx for help with the API, documentation,
debugging, compression, and unit tests. Thanks to the creators of GCC
(starting with Richard M. Stallman) and Valgrind (starting with Julian Seward)
for a pair of excellent tools. Thanks to my coworkers at Allmydata --
http://allmydata.com -- Fabrice Grinda, Peter Secor, Rob Kinninmont, Brian
Warner, Zandr Milewski, Justin Boreta, Mark Meras for sponsoring this work and
releasing it under a Free Software licence.
Enjoy!
Zooko Wilcox-O'Hearn
2007-04-27
Boulder, Colorado


@ -1,13 +0,0 @@
* INSTALL doc
* catch EnvironmentError when writing sharefiles and clean up
* try Duff's device in _addmul1()?
* memory usage analysis
* announce on lwn, p2p-hackers
* compile with Microsoft compiler, etc.
* test cmdline (tricky)
* test handling of filesystem exceptional conditions (even trickier)
* install the setuptools bootstrap magic
* include setuptools with package so that the bootstrap magic doesn't connect out
* use setuptools's dependency magic
* include all dependencies with package so that the dependency magic doesn't connect out


@ -1,46 +0,0 @@
from zfec import filefec

import os

from pyutil import benchutil

FNAME="benchrandom.data"

def _make_new_rand_file(size):
    open(FNAME, "wb").write(os.urandom(size))

def donothing(results, reslenthing):
    pass

import sha
hashers = [ sha.new() for i in range(100) ]
def hashem(results, reslenthing):
    for i, result in enumerate(results):
        hashers[i].update(result)

def _encode_file(N):
    filefec.encode_file(open(FNAME, "rb"), donothing, 25, 100)

def _encode_file_stringy(N):
    filefec.encode_file_stringy(open(FNAME, "rb"), donothing, 25, 100)

def _encode_file_stringy_easyfec(N):
    filefec.encode_file_stringy_easyfec(open(FNAME, "rb"), donothing, 25, 100)

def _encode_file_not_really(N):
    filefec.encode_file_not_really(open(FNAME, "rb"), donothing, 25, 100)

def _encode_file_not_really_and_hash(N):
    filefec.encode_file_not_really_and_hash(open(FNAME, "rb"), donothing, 25, 100)

def _encode_file_and_hash(N):
    filefec.encode_file(open(FNAME, "rb"), hashem, 25, 100)

def bench():
    # for f in [_encode_file_stringy_easyfec, _encode_file_stringy, _encode_file, _encode_file_not_really,]:
    # for f in [_encode_file,]:
    for f in [_encode_file_not_really, _encode_file_not_really_and_hash, _encode_file, _encode_file_and_hash,]:
        print f
        benchutil.bench(f, initfunc=_make_new_rand_file, TOPXP=23, MAXREPS=128, MAXTIME=64)

# bench()


@ -1,227 +0,0 @@
#!/usr/bin/env python
"""Bootstrap setuptools installation

If you want to use setuptools in your package's setup.py, just include this
file in the same directory with it, and add this to the top of your setup.py::

    from ez_setup import use_setuptools
    use_setuptools()

If you want to require a specific version of setuptools, set a download
mirror, or use an alternate download directory, you can do so by supplying
the appropriate options to ``use_setuptools()``.

This file can also be run as a script to install or upgrade setuptools.
"""
import sys
DEFAULT_VERSION = "0.6c6"
DEFAULT_URL = "http://cheeseshop.python.org/packages/%s/s/setuptools/" % sys.version[:3]
md5_data = {
'setuptools-0.6b1-py2.3.egg': '8822caf901250d848b996b7f25c6e6ca',
'setuptools-0.6b1-py2.4.egg': 'b79a8a403e4502fbb85ee3f1941735cb',
'setuptools-0.6b2-py2.3.egg': '5657759d8a6d8fc44070a9d07272d99b',
'setuptools-0.6b2-py2.4.egg': '4996a8d169d2be661fa32a6e52e4f82a',
'setuptools-0.6b3-py2.3.egg': 'bb31c0fc7399a63579975cad9f5a0618',
'setuptools-0.6b3-py2.4.egg': '38a8c6b3d6ecd22247f179f7da669fac',
'setuptools-0.6b4-py2.3.egg': '62045a24ed4e1ebc77fe039aa4e6f7e5',
'setuptools-0.6b4-py2.4.egg': '4cb2a185d228dacffb2d17f103b3b1c4',
'setuptools-0.6c1-py2.3.egg': 'b3f2b5539d65cb7f74ad79127f1a908c',
'setuptools-0.6c1-py2.4.egg': 'b45adeda0667d2d2ffe14009364f2a4b',
'setuptools-0.6c2-py2.3.egg': 'f0064bf6aa2b7d0f3ba0b43f20817c27',
'setuptools-0.6c2-py2.4.egg': '616192eec35f47e8ea16cd6a122b7277',
'setuptools-0.6c3-py2.3.egg': 'f181fa125dfe85a259c9cd6f1d7b78fa',
'setuptools-0.6c3-py2.4.egg': 'e0ed74682c998bfb73bf803a50e7b71e',
'setuptools-0.6c3-py2.5.egg': 'abef16fdd61955514841c7c6bd98965e',
'setuptools-0.6c4-py2.3.egg': 'b0b9131acab32022bfac7f44c5d7971f',
'setuptools-0.6c4-py2.4.egg': '2a1f9656d4fbf3c97bf946c0a124e6e2',
'setuptools-0.6c4-py2.5.egg': '8f5a052e32cdb9c72bcf4b5526f28afc',
'setuptools-0.6c5-py2.3.egg': 'ee9fd80965da04f2f3e6b3576e9d8167',
'setuptools-0.6c5-py2.4.egg': 'afe2adf1c01701ee841761f5bcd8aa64',
'setuptools-0.6c5-py2.5.egg': 'a8d3f61494ccaa8714dfed37bccd3d5d',
'setuptools-0.6c6-py2.5.egg': 'b2f8a7520709a5b34f80946de5f02f53',
}
import sys, os
def _validate_md5(egg_name, data):
    if egg_name in md5_data:
        from md5 import md5
        digest = md5(data).hexdigest()
        if digest != md5_data[egg_name]:
            print >>sys.stderr, (
                "md5 validation of %s failed! (Possible download problem?)"
                % egg_name
            )
            sys.exit(2)
    return data
def use_setuptools(
    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, min_version=None
    ):
    """Automatically find/download setuptools and make it available on sys.path
    `version` should be a valid setuptools version number that is available
    as an egg for download under the `download_base` URL (which should end with
    a '/'). `to_dir` is the directory where setuptools will be downloaded, if
    it is not already available. If an older version of setuptools is installed,
    this routine will print a message to ``sys.stderr`` and raise SystemExit in
    an attempt to abort the calling script.
    """
    try:
        import setuptools
        if setuptools.__version__ == '0.0.1':
            print >>sys.stderr, (
                "You have an obsolete version of setuptools installed. Please\n"
                "remove it from your system entirely before rerunning this script."
            )
            sys.exit(2)
    except ImportError:
        egg = download_setuptools(version, download_base, to_dir)
        sys.path.insert(0, egg)
        import setuptools; setuptools.bootstrap_install_from = egg
    import pkg_resources
    try:
        if not min_version:
            min_version = version
        pkg_resources.require("setuptools>="+min_version)
    except pkg_resources.VersionConflict, e:
        # XXX could we install in a subprocess here?
        # --arnowa here we would need an elegant update solution, i think
        print >>sys.stderr, (
            "The required version of setuptools (>=%s) is not available, and\n"
            "can't be installed while this script is running. Please install\n"
            " a more recent version first.\n\n(Currently using %r)"
        ) % (min_version, e.args[0])
        sys.exit(2)
def download_setuptools(
    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir
    ):
    """Download setuptools from a specified location and return its filename
    `version` should be a valid setuptools version number that is available
    as an egg for download under the `download_base` URL (which should end
    with a '/'). `to_dir` is the directory where the egg will be downloaded.
    """
    import urllib2, shutil
    egg_name = "setuptools-%s-py%s.egg" % (version,sys.version[:3])
    url = download_base + egg_name
    saveto = os.path.join(to_dir, egg_name)
    src = dst = None
    if not os.path.exists(saveto): # Avoid repeated downloads
        try:
            from distutils import log
            if True:
                log.warn("""
---------------------------------------------------------------------------
This script requires setuptools version %s to run (even to display
help). I will attempt to download it for you (from
%s), but
you may need to enable firewall access for this script first.
(Note: if this machine does not have network access, please obtain the file
%s
and place it in this directory before rerunning this script.)
---------------------------------------------------------------------------""",
                    version, download_base, url
                );
            log.warn("Downloading %s", url)
            src = urllib2.urlopen(url)
            # Read/write all in one block, so we don't create a corrupt file
            # if the download is interrupted.
            data = _validate_md5(egg_name, src.read())
            dst = open(saveto,"wb"); dst.write(data)
        finally:
            if src: src.close()
            if dst: dst.close()
    return os.path.realpath(saveto)
def main(argv, version=DEFAULT_VERSION):
    """Install or upgrade setuptools and EasyInstall"""
    try:
        import setuptools
    except ImportError:
        egg = None
        try:
            egg = download_setuptools(version)
            sys.path.insert(0,egg)
            from setuptools.command.easy_install import main
            return main(list(argv)+[egg]) # we're done here
        finally:
            if egg and os.path.exists(egg):
                os.unlink(egg)
    else:
        if setuptools.__version__ == "0.0.1": #Does this happen?? --arnowa
            # tell the user to uninstall obsolete version
            use_setuptools(version)
    req = "setuptools>="+version
    import pkg_resources
    try:
        pkg_resources.require(req)
    except pkg_resources.VersionConflict:
        try:
            from setuptools.command.easy_install import main
        except ImportError:
            from easy_install import main
        main(list(argv)+[download_setuptools()])
        sys.exit(0) # try to force an exit
    else:
        if argv:
            from setuptools.command.easy_install import main
            main(argv)
        else:
            print "Setuptools version",version,"or greater has been installed."
            print '(Run "ez_setup.py -U setuptools" to reinstall or upgrade.)'
def update_md5(filenames):
    """Update our built-in md5 registry"""
    import re
    from md5 import md5
    for name in filenames:
        base = os.path.basename(name)
        f = open(name,'rb')
        md5_data[base] = md5(f.read()).hexdigest()
        f.close()
    data = [" %r: %r,\n" % it for it in md5_data.items()]
    data.sort()
    repl = "".join(data)
    import inspect
    srcfile = inspect.getsourcefile(sys.modules[__name__])
    f = open(srcfile, 'rb'); src = f.read(); f.close()
    match = re.search("\nmd5_data = {\n([^}]+)}", src)
    if not match:
        print >>sys.stderr, "Internal error!"
        sys.exit(2)
    src = src[:match.start(1)] + repl + src[match.end(1):]
    f = open(srcfile,'w')
    f.write(src)
    f.close()
if __name__=='__main__':
    if len(sys.argv)>2 and sys.argv[1]=='--md5update':
        update_md5(sys.argv[2:])
    else:
        main(sys.argv[1:])
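
The `_validate_md5` helper above compares the digest of the downloaded bytes against a built-in registry and lets unlisted eggs through unchecked. A minimal Python 3 sketch of the same check follows, using `hashlib` in place of the old `md5` module. The names `validate_md5`, `known_digests`, `ChecksumMismatch` and the sample entry are illustrative assumptions, and the sketch raises an exception where the original prints to `sys.stderr` and exits.

```python
import hashlib

# Illustrative registry: the digest is computed from sample bytes here,
# not taken from a real setuptools egg.
SAMPLE_DATA = b"example egg contents"
known_digests = {"example-1.0-py3.egg": hashlib.md5(SAMPLE_DATA).hexdigest()}


class ChecksumMismatch(Exception):
    """Raised when downloaded bytes do not match the registered digest."""


def validate_md5(egg_name, data, registry=known_digests):
    """Return data unchanged if it matches the registry or is unlisted."""
    expected = registry.get(egg_name)
    if expected is not None:
        digest = hashlib.md5(data).hexdigest()
        if digest != expected:
            raise ChecksumMismatch("md5 validation of %s failed" % egg_name)
    return data
```

As in the original, a file whose name is missing from the registry is accepted without any check. MD5 is useful here for catching corrupted downloads; it is not collision-resistant, so it should not be relied on against deliberate tampering.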


@ -1,92 +0,0 @@
#!/usr/bin/env python
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
from ez_setup import use_setuptools
import sys
if 'cygwin' in sys.platform.lower():
    min_version='0.6c6'
else:
    min_version='0.6a9'
use_setuptools(min_version=min_version)
from setuptools import Extension, find_packages, setup
DEBUGMODE=False
# DEBUGMODE=True
extra_compile_args=[]
extra_link_args=[]
extra_compile_args.append("-std=c99")
undef_macros=[]
if DEBUGMODE:
    extra_compile_args.append("-O0")
    extra_compile_args.append("-g")
    extra_compile_args.append("-Wall")
    extra_link_args.append("-g")
    undef_macros.append('NDEBUG')
trove_classifiers=[
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: OSI Approved :: GNU General Public License (GPL)",
"License :: DFSG approved",
"Intended Audience :: Developers",
"Intended Audience :: End Users/Desktop",
"Intended Audience :: System Administrators",
"Operating System :: Microsoft",
"Operating System :: Microsoft :: Windows",
"Operating System :: Unix",
"Operating System :: POSIX :: Linux",
"Operating System :: POSIX",
"Operating System :: MacOS :: MacOS X",
"Operating System :: Microsoft :: Windows :: Windows NT/2000",
"Operating System :: OS Independent",
"Natural Language :: English",
"Programming Language :: C",
"Programming Language :: Python",
"Topic :: Utilities",
"Topic :: System :: Systems Administration",
"Topic :: System :: Filesystems",
"Topic :: System :: Distributed Computing",
"Topic :: Software Development :: Libraries",
"Topic :: Communications :: Usenet News",
"Topic :: System :: Archiving :: Backup",
"Topic :: System :: Archiving :: Mirroring",
"Topic :: System :: Archiving",
]
setup(name='zfec',
version='1.0.0',
description='a fast erasure code with command-line, C, and Python interfaces',
long_description='Fast, portable, programmable erasure coding a.k.a. "forward error correction": the generation of redundant blocks of information such that if some blocks are lost then the original data can be recovered from the remaining blocks.',
author='Zooko O\'Whielacronx',
author_email='zooko@zooko.com',
url='http://allmydata.org/source/zfec',
license='GNU GPL',
packages=find_packages(),
classifiers=trove_classifiers,
entry_points = { 'console_scripts': [ 'zfec = zfec.cmdline_zfec:main', 'zunfec = zfec.cmdline_zunfec:main' ] },
ext_modules=[Extension('_fec', ['zfec/fec.c', 'zfec/_fecmodule.c',], extra_link_args=extra_link_args, extra_compile_args=extra_compile_args, undef_macros=undef_macros),],
test_suite="zfec.test",
)


@ -1,36 +0,0 @@
"""
zfec -- fast forward error correction library with Python interface
maintainer web site: U{http://allmydata.com/source/zfec}
zfec web site: U{http://allmydata.com/source/zfec}
"""
from util.version import Version
# For an explanation of what the parts of the version string mean,
# please see pyutil.version.
__version__ = Version("1.0.0b3-0-STABLE")
# Please put a URL or other note here which shows where to get the branch of
# development from which this version grew.
__sources__ = ["http://allmydata.org/source/zfec",]
from _fec import Encoder, Decoder, Error
import filefec, cmdline_zfec, cmdline_zunfec
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
# mailto:zooko@zooko.com
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the
# COPYING file for details.
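
The `Encoder` and `Decoder` types that this module imports from `_fec` take two integers, `k` and `m`, and their initializers in `zfec/_fecmodule.c` reject anything outside `1 <= k <= m <= 256`. A pure-Python mirror of those precondition checks is sketched below. The function name `check_fec_params` is hypothetical, and it raises `ValueError` where the extension raises `_fec.Error`.

```python
def check_fec_params(k, m):
    """Validate (k, m) the way Encoder_init and Decoder_init do.

    k is the number of blocks required for reconstruction and
    m is the total number of blocks generated.
    """
    if k < 1:
        raise ValueError("k is required to be >= 1, but it was %d" % k)
    if m < 1:
        raise ValueError("m is required to be >= 1, but it was %d" % m)
    if m > 256:
        raise ValueError("m is required to be <= 256, but it was %d" % m)
    if k > m:
        raise ValueError(
            "k is required to be <= m, but they were %d and %d" % (k, m))
    return k, m
```

For example, `check_fec_params(25, 100)` accepts the parameters used by the benchmark script, while `check_fec_params(4, 257)` is rejected because block numbers have to fit in a single byte.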


@ -1,612 +0,0 @@
/**
* zfec -- fast forward error correction library with Python interface
*/
#include <Python.h>
#include <structmember.h>
#if (PY_VERSION_HEX < 0x02050000)
typedef int Py_ssize_t;
#endif
#include "fec.h"
#include "stdarg.h"
static PyObject *py_fec_error;
static PyObject *py_raise_fec_error (const char *format, ...);
static char fec__doc__[] = "\
FEC - Forward Error Correction \n\
";
static PyObject *
py_raise_fec_error(const char *format, ...) {
char exceptionMsg[1024];
va_list ap;
va_start (ap, format);
vsnprintf (exceptionMsg, 1024, format, ap);
va_end (ap);
exceptionMsg[1023]='\0';
PyErr_SetString (py_fec_error, exceptionMsg);
return NULL;
}
static char Encoder__doc__[] = "\
Hold static encoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {encode()} method.\n\n\
@param k: the number of packets required for reconstruction \n\
@param m: the number of packets generated \n\
";
typedef struct {
PyObject_HEAD
/* expose these */
short kk;
short mm;
/* internal */
fec_t* fec_matrix;
} Encoder;
static PyObject *
Encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) {
Encoder *self;
self = (Encoder*)type->tp_alloc(type, 0);
if (self != NULL) {
self->kk = 0;
self->mm = 0;
self->fec_matrix = NULL;
}
return (PyObject *)self;
}
static int
Encoder_init(Encoder *self, PyObject *args, PyObject *kwdict) {
static char *kwlist[] = {
"k",
"m",
NULL
};
int ink, inm;
if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm))
return -1;
if (ink < 1) {
py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", ink);
return -1;
}
if (inm < 1) {
py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", inm);
return -1;
}
if (inm > 256) {
py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", inm);
return -1;
}
if (ink > inm) {
py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm);
return -1;
}
self->kk = (short)ink;
self->mm = (short)inm;
self->fec_matrix = fec_new(self->kk, self->mm);
return 0;
}
static char Encoder_encode__doc__[] = "\
Encode data into m packets.\n\
\n\
@param inblocks: a sequence of k buffers of data to encode -- these are the k primary blocks, i.e. the input data split into k pieces (for best performance, make it a tuple instead of a list); All blocks are required to be the same length.\n\
@param desired_blocks_nums optional sequence of blocknums indicating which blocks to produce and return; If None, all m blocks will be returned (in order). (For best performance, make it a tuple instead of a list.)\n\
@returns: a list of buffers containing the requested blocks; Note that if any of the input blocks were 'primary blocks', i.e. their blocknum was < k, then the result sequence will contain a Python reference to the same Python object as was passed in. As long as the Python object in question is immutable (i.e. a string) then you don't have to think about this detail, but if it is mutable (i.e. an array), then you have to be aware that if you subsequently mutate the contents of that object then that will also change the contents of the sequence that was returned from this call to encode().\n\
";
static PyObject *
Encoder_encode(Encoder *self, PyObject *args) {
PyObject* inblocks;
PyObject* desired_blocks_nums = NULL; /* The blocknums of the blocks that should be returned. */
PyObject* result = NULL;
if (!PyArg_ParseTuple(args, "O|O", &inblocks, &desired_blocks_nums))
return NULL;
gf* check_blocks_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */
PyObject* pystrs_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */
unsigned num_check_blocks_produced = 0; /* The first num_check_blocks_produced elements of the check_blocks_produced array and of the pystrs_produced array will be used. */
const gf* incblocks[self->kk];
unsigned num_desired_blocks;
PyObject* fast_desired_blocks_nums = NULL;
PyObject** fast_desired_blocks_nums_items;
unsigned c_desired_blocks_nums[self->mm];
unsigned c_desired_checkblocks_ids[self->mm - self->kk];
unsigned i;
PyObject* fastinblocks = NULL;
for (i=0; i<self->mm - self->kk; i++)
pystrs_produced[i] = NULL;
if (desired_blocks_nums) {
fast_desired_blocks_nums = PySequence_Fast(desired_blocks_nums, "Second argument (optional) was not a sequence.");
if (!fast_desired_blocks_nums)
goto err;
num_desired_blocks = PySequence_Fast_GET_SIZE(fast_desired_blocks_nums);
fast_desired_blocks_nums_items = PySequence_Fast_ITEMS(fast_desired_blocks_nums);
for (i=0; i<num_desired_blocks; i++) {
if (!PyInt_Check(fast_desired_blocks_nums_items[i])) {
py_raise_fec_error("Precondition violation: second argument is required to contain int.");
goto err;
}
c_desired_blocks_nums[i] = PyInt_AsLong(fast_desired_blocks_nums_items[i]);
if (c_desired_blocks_nums[i] >= self->kk)
num_check_blocks_produced++;
}
} else {
num_desired_blocks = self->mm;
for (i=0; i<num_desired_blocks; i++)
c_desired_blocks_nums[i] = i;
num_check_blocks_produced = self->mm - self->kk;
}
fastinblocks = PySequence_Fast(inblocks, "First argument was not a sequence.");
if (!fastinblocks)
goto err;
if (PySequence_Fast_GET_SIZE(fastinblocks) != self->kk) {
py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastinblocks), self->kk);
goto err;
}
/* Construct a C array of gf*'s of the input data. */
PyObject** fastinblocksitems = PySequence_Fast_ITEMS(fastinblocks);
if (!fastinblocksitems)
goto err;
Py_ssize_t sz, oldsz = 0;
for (i=0; i<self->kk; i++) {
if (!PyObject_CheckReadBuffer(fastinblocksitems[i])) {
py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i);
goto err;
}
if (PyObject_AsReadBuffer(fastinblocksitems[i], (const void**)&(incblocks[i]), &sz))
goto err;
if (oldsz != 0 && oldsz != sz) {
py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz);
goto err;
}
oldsz = sz;
}
/* Allocate space for all of the check blocks. */
unsigned char check_block_index = 0; /* index into the check_blocks_produced and (parallel) pystrs_produced arrays */
for (i=0; i<num_desired_blocks; i++) {
if (c_desired_blocks_nums[i] >= self->kk) {
c_desired_checkblocks_ids[check_block_index] = c_desired_blocks_nums[i];
pystrs_produced[check_block_index] = PyString_FromStringAndSize(NULL, sz);
if (pystrs_produced[check_block_index] == NULL)
goto err;
check_blocks_produced[check_block_index] = (gf*)PyString_AsString(pystrs_produced[check_block_index]);
if (check_blocks_produced[check_block_index] == NULL)
goto err;
check_block_index++;
}
}
assert (check_block_index == num_check_blocks_produced);
/* Encode any check blocks that are needed. */
fec_encode(self->fec_matrix, incblocks, check_blocks_produced, c_desired_checkblocks_ids, num_check_blocks_produced, sz);
/* Wrap all requested blocks up into a Python list of Python strings. */
result = PyList_New(num_desired_blocks);
if (result == NULL)
goto err;
check_block_index = 0;
for (i=0; i<num_desired_blocks; i++) {
if (c_desired_blocks_nums[i] < self->kk) {
Py_INCREF(fastinblocksitems[c_desired_blocks_nums[i]]);
if (PyList_SetItem(result, i, fastinblocksitems[c_desired_blocks_nums[i]]) == -1) {
Py_DECREF(fastinblocksitems[c_desired_blocks_nums[i]]);
goto err;
}
} else {
if (PyList_SetItem(result, i, pystrs_produced[check_block_index]) == -1)
goto err;
pystrs_produced[check_block_index] = NULL;
check_block_index++;
}
}
goto cleanup;
err:
for (i=0; i<num_check_blocks_produced; i++)
Py_XDECREF(pystrs_produced[i]);
Py_XDECREF(result); result = NULL;
cleanup:
Py_XDECREF(fastinblocks); fastinblocks=NULL;
Py_XDECREF(fast_desired_blocks_nums); fast_desired_blocks_nums=NULL;
return result;
}
static void
Encoder_dealloc(Encoder * self) {
fec_free(self->fec_matrix);
self->ob_type->tp_free((PyObject*)self);
}
static PyMethodDef Encoder_methods[] = {
{"encode", (PyCFunction)Encoder_encode, METH_VARARGS, Encoder_encode__doc__},
{NULL},
};
static PyMemberDef Encoder_members[] = {
{"k", T_SHORT, offsetof(Encoder, kk), READONLY, "k"},
{"m", T_SHORT, offsetof(Encoder, mm), READONLY, "m"},
{NULL} /* Sentinel */
};
static PyTypeObject Encoder_type = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"_fec.Encoder", /*tp_name*/
sizeof(Encoder), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)Encoder_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
Encoder__doc__, /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Encoder_methods, /* tp_methods */
Encoder_members, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Encoder_init, /* tp_init */
0, /* tp_alloc */
Encoder_new, /* tp_new */
};
static char Decoder__doc__[] = "\
Hold static decoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {decode()} method.\n\n\
@param k: the number of packets required for reconstruction \n\
@param m: the number of packets generated \n\
";
typedef struct {
PyObject_HEAD
/* expose these */
short kk;
short mm;
/* internal */
fec_t* fec_matrix;
} Decoder;
static PyObject *
Decoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) {
Decoder *self;
self = (Decoder*)type->tp_alloc(type, 0);
if (self != NULL) {
self->kk = 0;
self->mm = 0;
self->fec_matrix = NULL;
}
return (PyObject *)self;
}
static int
Decoder_init(Decoder *self, PyObject *args, PyObject *kwdict) {
static char *kwlist[] = {
"k",
"m",
NULL
};
int ink, inm;
if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm))
return -1;
if (ink < 1) {
py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", ink);
return -1;
}
if (inm < 1) {
py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", inm);
return -1;
}
if (inm > 256) {
py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", inm);
return -1;
}
if (ink > inm) {
py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm);
return -1;
}
self->kk = (short)ink;
self->mm = (short)inm;
self->fec_matrix = fec_new(self->kk, self->mm);
return 0;
}
#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;}
static char Decoder_decode__doc__[] = "\
Decode a list of blocks into a list of segments.\n\
@param blocks a sequence of buffers containing block data (for best performance, make it a tuple instead of a list)\n\
@param blocknums a sequence of integers of the blocknum for each block in blocks (for best performance, make it a tuple instead of a list)\n\
\n\
@return a list of strings containing the segment data (i.e. ''.join(retval) yields a string containing the decoded data)\n\
";
static PyObject *
Decoder_decode(Decoder *self, PyObject *args) {
PyObject*restrict blocks;
PyObject*restrict blocknums;
PyObject* result = NULL;
if (!PyArg_ParseTuple(args, "OO", &blocks, &blocknums))
return NULL;
const gf*restrict cblocks[self->kk];
unsigned cblocknums[self->kk];
gf*restrict recoveredcstrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */
PyObject*restrict recoveredpystrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */
unsigned i;
for (i=0; i<self->kk; i++)
recoveredpystrs[i] = NULL;
PyObject*restrict fastblocknums = NULL;
PyObject*restrict fastblocks = PySequence_Fast(blocks, "First argument was not a sequence.");
if (!fastblocks)
goto err;
fastblocknums = PySequence_Fast(blocknums, "Second argument was not a sequence.");
if (!fastblocknums)
goto err;
if (PySequence_Fast_GET_SIZE(fastblocks) != self->kk) {
py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocks), self->kk);
goto err;
}
if (PySequence_Fast_GET_SIZE(fastblocknums) != self->kk) {
py_raise_fec_error("Precondition violation: Wrong length -- blocknums is required to contain exactly k blocks. len(blocknums): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocknums), self->kk);
goto err;
}
/* Construct a C array of gf*'s of the data and another of C ints of the blocknums. */
unsigned needtorecover=0;
PyObject** fastblocknumsitems = PySequence_Fast_ITEMS(fastblocknums);
if (!fastblocknumsitems)
goto err;
PyObject** fastblocksitems = PySequence_Fast_ITEMS(fastblocks);
if (!fastblocksitems)
goto err;
Py_ssize_t sz, oldsz = 0;
for (i=0; i<self->kk; i++) {
if (!PyInt_Check(fastblocknumsitems[i])) {
py_raise_fec_error("Precondition violation: second argument is required to contain int.");
goto err;
}
long tmpl = PyInt_AsLong(fastblocknumsitems[i]);
if (tmpl < 0 || tmpl > 255) {
py_raise_fec_error("Precondition violation: block nums can't be less than zero or greater than 255. %ld\n", tmpl);
goto err;
}
cblocknums[i] = (unsigned)tmpl;
if (cblocknums[i] >= self->kk)
needtorecover+=1;
if (!PyObject_CheckReadBuffer(fastblocksitems[i])) {
py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i);
goto err;
}
if (PyObject_AsReadBuffer(fastblocksitems[i], (const void**)&(cblocks[i]), &sz))
goto err;
if (oldsz != 0 && oldsz != sz) {
py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz);
goto err;
}
oldsz = sz;
}
/* move src packets into position */
for (i=0; i<self->kk;) {
if (cblocknums[i] >= self->kk || cblocknums[i] == i)
i++;
else {
/* put pkt in the right position. */
unsigned c = cblocknums[i];
SWAP (cblocknums[i], cblocknums[c], int);
SWAP (cblocks[i], cblocks[c], const gf*);
SWAP (fastblocksitems[i], fastblocksitems[c], PyObject*);
}
}
/* Allocate space for all of the recovered blocks. */
for (i=0; i<needtorecover; i++) {
recoveredpystrs[i] = PyString_FromStringAndSize(NULL, sz);
if (recoveredpystrs[i] == NULL)
goto err;
recoveredcstrs[i] = (gf*)PyString_AsString(recoveredpystrs[i]);
if (recoveredcstrs[i] == NULL)
goto err;
}
/* Decode any recovered blocks that are needed. */
fec_decode(self->fec_matrix, cblocks, recoveredcstrs, cblocknums, sz);
/* Wrap up both original primary blocks and decoded blocks into a Python list of Python strings. */
unsigned nextrecoveredix=0;
result = PyList_New(self->kk);
if (result == NULL)
goto err;
for (i=0; i<self->kk; i++) {
if (cblocknums[i] == i) {
/* Original primary block. */
Py_INCREF(fastblocksitems[i]);
if (PyList_SetItem(result, i, fastblocksitems[i]) == -1) {
Py_DECREF(fastblocksitems[i]);
goto err;
}
} else {
/* Recovered block. */
if (PyList_SetItem(result, i, recoveredpystrs[nextrecoveredix]) == -1)
goto err;
recoveredpystrs[nextrecoveredix] = NULL;
nextrecoveredix++;
}
}
goto cleanup;
err:
for (i=0; i<self->kk; i++)
Py_XDECREF(recoveredpystrs[i]);
Py_XDECREF(result); result = NULL;
cleanup:
Py_XDECREF(fastblocks); fastblocks=NULL;
Py_XDECREF(fastblocknums); fastblocknums=NULL;
return result;
}
static void
Decoder_dealloc(Decoder * self) {
fec_free(self->fec_matrix);
self->ob_type->tp_free((PyObject*)self);
}
static PyMethodDef Decoder_methods[] = {
{"decode", (PyCFunction)Decoder_decode, METH_VARARGS, Decoder_decode__doc__},
{NULL},
};
static PyMemberDef Decoder_members[] = {
{"k", T_SHORT, offsetof(Decoder, kk), READONLY, "k"},
{"m", T_SHORT, offsetof(Decoder, mm), READONLY, "m"},
{NULL} /* Sentinel */
};
static PyTypeObject Decoder_type = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"_fec.Decoder", /*tp_name*/
sizeof(Decoder), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)Decoder_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
Decoder__doc__, /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Decoder_methods, /* tp_methods */
Decoder_members, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Decoder_init, /* tp_init */
0, /* tp_alloc */
Decoder_new, /* tp_new */
};
static PyMethodDef fec_methods[] = {
{NULL}
};
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
#define PyMODINIT_FUNC void
#endif
PyMODINIT_FUNC
init_fec(void) {
PyObject *module;
PyObject *module_dict;
if (PyType_Ready(&Encoder_type) < 0)
return;
if (PyType_Ready(&Decoder_type) < 0)
return;
module = Py_InitModule3("_fec", fec_methods, fec__doc__);
if (module == NULL)
return;
Py_INCREF(&Encoder_type);
Py_INCREF(&Decoder_type);
PyModule_AddObject(module, "Encoder", (PyObject *)&Encoder_type);
PyModule_AddObject(module, "Decoder", (PyObject *)&Decoder_type);
module_dict = PyModule_GetDict(module);
py_fec_error = PyErr_NewException("_fec.Error", NULL, NULL);
PyDict_SetItemString(module_dict, "Error", py_fec_error);
}
/**
* zfec -- fast forward error correction library with Python interface
*
* Copyright (C) 2007 Allmydata, Inc.
* Author: Zooko Wilcox-O'Hearn
*
* This file is part of zfec.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version, with the added permission that, if you become obligated
* to release a derived work under this licence (as per section 2.b), you may
* delay the fulfillment of this obligation for up to 12 months. See the file
* COPYING for details.
*
* If you would like to inquire about a commercial relationship with Allmydata,
* Inc., please contact partnerships@allmydata.com and visit
* http://allmydata.com/.
*/
/**
* based on fecmodule.c by the Mnet Project, especially Myers Carpenter and
* Hauke Johannknecht
*/
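
The "move src packets into position" loop in `Decoder_decode` above swaps entries until every primary block (blocknum < k) sits at the index equal to its blocknum, leaving check blocks in whatever slots remain. A Python sketch of that loop is given below as an illustration, not a replacement. The function name `place_primary_blocks` is hypothetical, and the duplicate-blocknum guard is an addition that the C loop shown here does not have.

```python
def place_primary_blocks(blocknums, blocks, k):
    """Reorder so every primary block (blocknum < k) sits at index == blocknum.

    Check blocks (blocknum >= k) end up in whichever slots remain.
    Returns new lists; the inputs are not modified.
    """
    blocknums = list(blocknums)
    blocks = list(blocks)
    i = 0
    while i < k:
        c = blocknums[i]
        if c >= k or c == i:
            i += 1  # check block, or primary block already in place
        else:
            if blocknums[c] == c:
                # Added guard (not in the C loop): with a repeated blocknum
                # the swap would change nothing and the loop would never end.
                raise ValueError("duplicate blocknum %d" % c)
            # Move the primary block to its home slot, then re-examine slot i.
            blocknums[i], blocknums[c] = blocknums[c], blocknums[i]
            blocks[i], blocks[c] = blocks[c], blocks[i]
    return blocknums, blocks


nums, data = place_primary_blocks([3, 0, 2, 5], ["d3", "d0", "d2", "d5"], k=4)
print(nums, data)
```

With `k=4` and blocknums `[3, 0, 2, 5]`, the result is `[0, 5, 2, 3]`: blocks 0, 2 and 3 are in their home slots, and check block 5 occupies slot 1, which is the one primary block that has to be recovered.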


@ -1,75 +0,0 @@
#!/usr/bin/env python
# zfec -- a fast C implementation of Reed-Solomon erasure coding with
# command-line, C, and Python interfaces
import sys
from util import argparse
import filefec
from zfec import __version__ as libversion
from util.version import Version
__version__ = Version("1.0.0a1-0-STABLE")
def main():
if '-V' in sys.argv or '--version' in sys.argv:
print "zfec library version: ", libversion
print "zfec command-line tool version: ", __version__
sys.exit(0)
parser = argparse.ArgumentParser(description="Encode a file into a set of share files, a subset of which can later be used to recover the original file.")
parser.add_argument('inputfile', help='file to encode or "-" for stdin', type=argparse.FileType('rb'), metavar='INF')
parser.add_argument('-d', '--output-dir', help='directory in which share file names will be created (default ".")', default='.', metavar='D')
parser.add_argument('-p', '--prefix', help='prefix for share file names; If omitted, the name of the input file will be used.', metavar='P')
parser.add_argument('-s', '--suffix', help='suffix for share file names (default ".fec")', default='.fec', metavar='S')
parser.add_argument('-m', '--totalshares', help='the total number of share files created (default 16)', default=16, type=int, metavar='M')
parser.add_argument('-k', '--requiredshares', help='the number of share files required to reconstruct (default 4)', default=4, type=int, metavar='K')
parser.add_argument('-f', '--force', help='overwrite any file which is already in place of an output file (share file)', action='store_true')
parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true')
parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true')
args = parser.parse_args()
if args.prefix is None:
args.prefix = args.inputfile.name
if args.prefix == "<stdin>":
args.prefix = ""
if args.totalshares < 3:
print "Invalid parameters, totalshares is required to be >= 3\nPlease see the accompanying documentation."
sys.exit(1)
if args.totalshares > 256:
print "Invalid parameters, totalshares is required to be <= 256\nPlease see the accompanying documentation."
sys.exit(1)
if args.requiredshares < 2:
print "Invalid parameters, requiredshares is required to be >= 2\nPlease see the accompanying documentation."
sys.exit(1)
if args.requiredshares >= args.totalshares:
print "Invalid parameters, requiredshares is required to be < totalshares\nPlease see the accompanying documentation."
sys.exit(1)
args.inputfile.seek(0, 2)
fsize = args.inputfile.tell()
args.inputfile.seek(0, 0)
return filefec.encode_to_files(args.inputfile, fsize, args.output_dir, args.prefix, args.requiredshares, args.totalshares, args.suffix, args.force, args.verbose)
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
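
The range checks that main() applies to totalshares (m) and requiredshares (k) can be gathered into one function. What follows is a minimal Python 3 sketch and not part of zfec: the name check_share_params is hypothetical, and the messages are shortened forms of the ones printed above.

```python
def check_share_params(k, m):
    """Return None if (k, m) is acceptable, else a short error message.

    Hypothetical helper mirroring the four checks in main():
    3 <= m <= 256 and 2 <= k < m.
    """
    if m < 3:
        return "totalshares is required to be >= 3"
    if m > 256:
        return "totalshares is required to be <= 256"
    if k < 2:
        return "requiredshares is required to be >= 2"
    if k >= m:
        return "requiredshares is required to be < totalshares"
    return None
```

The checks run in the same order as in the tool, so the first failing condition determines the message.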

View File

@ -1,77 +0,0 @@
#!/usr/bin/env python
# zfec -- a fast C implementation of Reed-Solomon erasure coding with
# command-line, C, and Python interfaces
import os, sys
from util import argparse
import filefec
from zfec import __version__ as libversion
from util.version import Version
__version__ = Version("1.0.0a1-0-STABLE")
def main():
if '-V' in sys.argv or '--version' in sys.argv:
print "zfec library version: ", libversion
print "zunfec command-line tool version: ", __version__
return 0
parser = argparse.ArgumentParser(description="Decode data from share files.")
parser.add_argument('-o', '--outputfile', required=True, help='file to write the resulting data to, or "-" for stdout', type=str, metavar='OUTF')
parser.add_argument('sharefiles', nargs='*', help='share files to read the encoded data from', type=unicode, metavar='SHAREFILE')
parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true')
parser.add_argument('-f', '--force', help='overwrite any file which is already in place of the output file', action='store_true')
parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true')
args = parser.parse_args()
if len(args.sharefiles) < 2:
print "At least two sharefiles are required."
return 1
if args.force:
outf = open(args.outputfile, 'wb')
else:
try:
flags = os.O_WRONLY|os.O_CREAT|os.O_EXCL | (hasattr(os, 'O_BINARY') and os.O_BINARY)
outfd = os.open(args.outputfile, flags)
except OSError:
print "There is already a file named %r -- aborting. Use --force to overwrite." % (args.outputfile,)
return 2
outf = os.fdopen(outfd, "wb")
sharefs = []
# This sort() actually matters for performance (shares with numbers < k
# are much faster to use than the others), as well as being important for
# reproducibility.
args.sharefiles.sort()
for fn in args.sharefiles:
sharefs.append(open(fn, 'rb'))
try:
ret = filefec.decode_from_files(outf, sharefs, args.verbose)
except filefec.InsufficientShareFilesError, e:
print str(e)
return 3
return 0
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
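
The output file above is opened with O_CREAT|O_EXCL unless --force is given, so an existing file is never silently overwritten. The same idea can be sketched in Python 3 as follows; open_output_file is a hypothetical name, and getattr() stands in for the hasattr() test on O_BINARY used above.

```python
import os


def open_output_file(path, force=False):
    """Open path for binary writing; refuse to clobber it unless force is set.

    Hypothetical helper. With force=False the file is created with O_EXCL,
    so os.open() raises FileExistsError when the path already exists.
    """
    if force:
        return open(path, "wb")
    # O_BINARY only exists on some platforms (e.g. Windows); default to 0.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    return os.fdopen(fd, "wb")
```

Because creation and the existence check happen in one system call, there is no window in which another process could create the file between the check and the open.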

View File

@ -1,58 +0,0 @@
# zfec -- a fast C implementation of Reed-Solomon erasure coding with
# command-line, C, and Python interfaces
import zfec
# div_ceil() was copied from the pyutil library.
def div_ceil(n, d):
"""
The smallest integer k such that k*d >= n.
"""
return (n//d) + (n%d != 0)
class Encoder(object):
def __init__(self, k, m):
self.fec = zfec.Encoder(k, m)
def encode(self, data):
"""
@param data: string
"""
chunksize = div_ceil(len(data), self.fec.k)
numchunks = div_ceil(len(data), chunksize)
l = [ data[i:i+chunksize] for i in range(0, len(data), chunksize) ]
# padding
if len(l[-1]) != len(l[0]):
l[-1] = l[-1] + ('\x00'*(len(l[0])-len(l[-1])))
res = self.fec.encode(l)
return res
class Decoder(object):
def __init__(self, k, m):
self.fec = zfec.Decoder(k, m)
def decode(self, blocks, sharenums, padlen=0):
blocks = self.fec.decode(blocks, sharenums)
data = ''.join(blocks)
if padlen:
data = data[:-padlen]
return data
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
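
The chunking and padding step of Encoder.encode() can be illustrated on its own. The sketch below is Python 3 and works on bytes; split_and_pad and join_and_unpad are hypothetical names. It also departs from the listing in one respect, as an assumption of this sketch: it pads the whole input up to k * chunksize, so that exactly k chunks always result, instead of padding only the last chunk.

```python
def div_ceil(n, d):
    """The smallest integer q such that q*d >= n (for n >= 0, d > 0)."""
    return (n // d) + (n % d != 0)


def split_and_pad(data, k):
    """Split data into exactly k equal-sized chunks, zero-padding the tail.

    Returns (chunks, padlen); padlen is always smaller than k.
    """
    if not data:
        raise ValueError("data must not be empty")
    chunksize = div_ceil(len(data), k)
    padlen = chunksize * k - len(data)
    padded = data + b"\x00" * padlen
    chunks = [padded[i:i + chunksize] for i in range(0, len(padded), chunksize)]
    return chunks, padlen


def join_and_unpad(chunks, padlen):
    """Inverse of split_and_pad(): concatenate and strip the padding."""
    data = b"".join(chunks)
    return data[:-padlen] if padlen else data
```

For example, split_and_pad(b"hello", 4) yields four 2-byte chunks and a padlen of 3, and join_and_unpad() recovers the original five bytes.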

View File

@ -1,614 +0,0 @@
/**
* zfec -- fast forward error correction library with Python interface
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include "fec.h"
/*
* If you get an error returned (negative value) from a fec_* function,
* look in here for the error message.
*/
#define FEC_ERROR_SIZE 1025
char fec_error[FEC_ERROR_SIZE+1];
#define ERR(...) (snprintf(fec_error, FEC_ERROR_SIZE, __VA_ARGS__))
/*
* Primitive polynomials - see Lin & Costello, Appendix A,
* and Lee & Messerschmitt, p. 453.
*/
static const char*const Pp="101110001";
/*
* To speed up computations, we have tables for logarithm, exponent and
* inverse of a number. We use a table for multiplication as well (it takes
* 64K, no big deal even on a PDA, especially because it can be
* pre-initialized and put into a ROM!), otherwise we use a table of
* logarithms. In any case the macro gf_mul(x,y) takes care of
* multiplications.
*/
static gf gf_exp[510]; /* index->poly form conversion table */
static int gf_log[256]; /* Poly->index form conversion table */
static gf inverse[256]; /* inverse of field elem. */
/* inv[\alpha**i]=\alpha**(GF_SIZE-i-1) */
/*
* modnn(x) computes x % GF_SIZE, where GF_SIZE is 2**GF_BITS - 1,
* without a slow divide.
*/
static inline gf
modnn(int x) {
while (x >= 255) {
x -= 255;
x = (x >> 8) + (x & 255);
}
return x;
}
#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;}
/*
* gf_mul(x,y) multiplies two numbers. It is much faster to use a
* multiplication table.
*
* USE_GF_MULC, GF_MULC0(c) and GF_ADDMULC(x) can be used when multiplying
* many numbers by the same constant. In this case the first call sets the
* constant, and others perform the multiplications. A value related to the
* multiplication is held in a local variable declared with USE_GF_MULC . See
* usage in _addmul1().
*/
static gf gf_mul_table[256][256];
#define gf_mul(x,y) gf_mul_table[x][y]
#define USE_GF_MULC register gf * __gf_mulc_
#define GF_MULC0(c) __gf_mulc_ = gf_mul_table[c]
#define GF_ADDMULC(dst, x) dst ^= __gf_mulc_[x]
/*
* Generate GF(2**m) from the irreducible polynomial p(X) in p[0]..p[m]
* Lookup tables:
* index->polynomial form gf_exp[] contains j= \alpha^i;
* polynomial form -> index form gf_log[ j = \alpha^i ] = i
* \alpha=x is the primitive element of GF(2^m)
*
* For efficiency, gf_exp[] has size 2*GF_SIZE, so that a simple
* multiplication of two numbers can be resolved without calling modnn
*/
static void
_init_mul_table(void) {
int i, j;
for (i = 0; i < 256; i++)
for (j = 0; j < 256; j++)
gf_mul_table[i][j] = gf_exp[modnn (gf_log[i] + gf_log[j])];
for (j = 0; j < 256; j++)
gf_mul_table[0][j] = gf_mul_table[j][0] = 0;
}
/*
* I use malloc so many times that it is easier to put all the checks in
* one place.
*/
static void *
my_malloc (int sz, char *err_string) {
void *p = malloc (sz);
if (p == NULL) {
ERR("Malloc failure allocating %s\n", err_string);
exit (1);
}
return p;
}
#define NEW_GF_MATRIX(rows, cols) \
(gf*)my_malloc(rows * cols, " ## __LINE__ ## " )
/*
* initialize the data structures used for computations in GF.
*/
static void
generate_gf (void) {
int i;
gf mask;
mask = 1; /* x ** 0 = 1 */
gf_exp[8] = 0; /* will be updated at the end of the 1st loop */
/*
* first, generate the (polynomial representation of) powers of \alpha,
* which are stored in gf_exp[i] = \alpha ** i .
* At the same time build gf_log[gf_exp[i]] = i .
* The first 8 powers are simply bits shifted to the left.
*/
for (i = 0; i < 8; i++, mask <<= 1) {
gf_exp[i] = mask;
gf_log[gf_exp[i]] = i;
/*
* If Pp[i] == 1 then \alpha ** i occurs in poly-repr
* gf_exp[8] = \alpha ** 8
*/
if (Pp[i] == '1')
gf_exp[8] ^= mask;
}
/*
* now gf_exp[8] = \alpha ** 8 is complete, so we can also
* record its logarithm.
*/
gf_log[gf_exp[8]] = 8;
/*
* Poly-repr of \alpha ** (i+1) is given by poly-repr of
* \alpha ** i shifted left one-bit and accounting for any
* \alpha ** 8 term that may occur when poly-repr of
* \alpha ** i is shifted.
*/
mask = 1 << 7;
for (i = 9; i < 255; i++) {
if (gf_exp[i - 1] >= mask)
gf_exp[i] = gf_exp[8] ^ ((gf_exp[i - 1] ^ mask) << 1);
else
gf_exp[i] = gf_exp[i - 1] << 1;
gf_log[gf_exp[i]] = i;
}
/*
* log(0) is not defined, so use a special value
*/
gf_log[0] = 255;
/* set the extended gf_exp values for fast multiply */
for (i = 0; i < 255; i++)
gf_exp[i + 255] = gf_exp[i];
/*
* again special cases. 0 has no inverse. This used to
* be initialized to 255, but it should make no difference
* since no one is supposed to read from here.
*/
inverse[0] = 0;
inverse[1] = 1;
for (i = 2; i <= 255; i++)
inverse[i] = gf_exp[255 - gf_log[i]];
}
/*
* Various linear algebra operations that I use often.
*/
/*
* addmul() computes dst[] = dst[] + c * src[]
* This is used often, so better optimize it! Currently the loop is
* unrolled 16 times, a good value for 486 and pentium-class machines.
* The case c=0 is also optimized, whereas c=1 is not. These
* calls are infrequent in my typical apps so I did not bother.
*/
#define addmul(dst, src, c, sz) \
if (c != 0) _addmul1(dst, src, c, sz)
#define UNROLL 16 /* 1, 4, 8, 16 */
static void
_addmul1(register gf*restrict dst, const register gf*restrict src, gf c, size_t sz) {
USE_GF_MULC;
const gf* lim = &dst[sz - UNROLL + 1];
GF_MULC0 (c);
#if (UNROLL > 1) /* unrolling by 8/16 is quite effective on the pentium */
for (; dst < lim; dst += UNROLL, src += UNROLL) {
GF_ADDMULC (dst[0], src[0]);
GF_ADDMULC (dst[1], src[1]);
GF_ADDMULC (dst[2], src[2]);
GF_ADDMULC (dst[3], src[3]);
#if (UNROLL > 4)
GF_ADDMULC (dst[4], src[4]);
GF_ADDMULC (dst[5], src[5]);
GF_ADDMULC (dst[6], src[6]);
GF_ADDMULC (dst[7], src[7]);
#endif
#if (UNROLL > 8)
GF_ADDMULC (dst[8], src[8]);
GF_ADDMULC (dst[9], src[9]);
GF_ADDMULC (dst[10], src[10]);
GF_ADDMULC (dst[11], src[11]);
GF_ADDMULC (dst[12], src[12]);
GF_ADDMULC (dst[13], src[13]);
GF_ADDMULC (dst[14], src[14]);
GF_ADDMULC (dst[15], src[15]);
#endif
}
#endif
lim += UNROLL - 1;
for (; dst < lim; dst++, src++) /* final components */
GF_ADDMULC (*dst, *src);
}
/*
* computes C = AB where A is n*k, B is k*m, C is n*m
*/
static void
_matmul(gf * a, gf * b, gf * c, unsigned n, unsigned k, unsigned m) {
unsigned row, col, i;
for (row = 0; row < n; row++) {
for (col = 0; col < m; col++) {
gf *pa = &a[row * k];
gf *pb = &b[col];
gf acc = 0;
for (i = 0; i < k; i++, pa++, pb += m)
acc ^= gf_mul (*pa, *pb);
c[row * m + col] = acc;
}
}
}
/*
* _invert_mat() takes a matrix and produces its inverse
* k is the size of the matrix.
* (Gauss-Jordan, adapted from Numerical Recipes in C)
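* On a singular matrix it records a message in fec_error and returns
* without completing the inversion (the function itself returns void).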
*/
static void
_invert_mat(gf* src, unsigned k) {
gf c, *p;
unsigned irow = 0;
unsigned icol = 0;
unsigned row, col, i, ix;
unsigned* indxc = (unsigned*) my_malloc (k * sizeof(unsigned), "indxc");
unsigned* indxr = (unsigned*) my_malloc (k * sizeof(unsigned), "indxr");
unsigned* ipiv = (unsigned*) my_malloc (k * sizeof(unsigned), "ipiv");
gf *id_row = NEW_GF_MATRIX (1, k);
gf *temp_row = NEW_GF_MATRIX (1, k);
memset (id_row, '\0', k * sizeof (gf));
/*
* ipiv marks elements already used as pivots.
*/
for (i = 0; i < k; i++)
ipiv[i] = 0;
for (col = 0; col < k; col++) {
gf *pivot_row;
/*
* Zeroing column 'col', look for a non-zero element.
* First try on the diagonal, if it fails, look elsewhere.
*/
if (ipiv[col] != 1 && src[col * k + col] != 0) {
irow = col;
icol = col;
goto found_piv;
}
for (row = 0; row < k; row++) {
if (ipiv[row] != 1) {
for (ix = 0; ix < k; ix++) {
if (ipiv[ix] == 0) {
if (src[row * k + ix] != 0) {
irow = row;
icol = ix;
goto found_piv;
}
} else if (ipiv[ix] > 1) {
ERR("singular matrix");
goto fail;
}
}
}
}
found_piv:
++(ipiv[icol]);
/*
* swap rows irow and icol, so afterwards the diagonal
* element will be correct. Rarely done, not worth
* optimizing.
*/
if (irow != icol)
for (ix = 0; ix < k; ix++)
SWAP (src[irow * k + ix], src[icol * k + ix], gf);
indxr[col] = irow;
indxc[col] = icol;
pivot_row = &src[icol * k];
c = pivot_row[icol];
if (c == 0) {
ERR("singular matrix 2");
goto fail;
}
if (c != 1) { /* otherwise this is a NOP */
/*
* this is done often , but optimizing is not so
* fruitful, at least in the obvious ways (unrolling)
*/
c = inverse[c];
pivot_row[icol] = 1;
for (ix = 0; ix < k; ix++)
pivot_row[ix] = gf_mul (c, pivot_row[ix]);
}
/*
* from all rows, remove multiples of the selected row
* to zero the relevant entry (in fact, the entry is not zero
* because we know it must be zero).
* (Here, if we know that the pivot_row is the identity,
* we can optimize the addmul).
*/
id_row[icol] = 1;
if (memcmp (pivot_row, id_row, k * sizeof (gf)) != 0) {
for (p = src, ix = 0; ix < k; ix++, p += k) {
if (ix != icol) {
c = p[icol];
p[icol] = 0;
addmul (p, pivot_row, c, k);
}
}
}
id_row[icol] = 0;
} /* done all columns */
for (col = k; col > 0; col--)
if (indxr[col-1] != indxc[col-1])
for (row = 0; row < k; row++)
SWAP (src[row * k + indxr[col-1]], src[row * k + indxc[col-1]], gf);
fail:
free (indxc);
free (indxr);
free (ipiv);
free (id_row);
free (temp_row);
return;
}
/*
* fast code for inverting a vandermonde matrix.
*
* NOTE: It assumes that the matrix is not singular and _IS_ a vandermonde
* matrix. Only uses the second column of the matrix, containing the p_i's.
*
* Algorithm borrowed from "Numerical recipes in C" -- sec.2.8, but largely
* revised for my purposes.
* p = coefficients of the matrix (p_i)
* q = values of the polynomial (known)
*/
void
_invert_vdm (gf* src, unsigned k) {
unsigned i, j, row, col;
gf *b, *c, *p;
gf t, xx;
if (k == 1) /* degenerate case, matrix must be p^0 = 1 */
return;
/*
* c holds the coefficient of P(x) = Prod (x - p_i), i=0..k-1
* b holds the coefficient for the matrix inversion
*/
c = NEW_GF_MATRIX (1, k);
b = NEW_GF_MATRIX (1, k);
p = NEW_GF_MATRIX (1, k);
for (j = 1, i = 0; i < k; i++, j += k) {
c[i] = 0;
p[i] = src[j]; /* p[i] */
}
/*
* construct coeffs. recursively. We know c[k] = 1 (implicit)
* and start P_0 = x - p_0, then at each stage multiply by
* x - p_i generating P_i = x P_{i-1} - p_i P_{i-1}
* After k steps we are done.
*/
c[k - 1] = p[0]; /* really -p(0), but x = -x in GF(2^m) */
for (i = 1; i < k; i++) {
gf p_i = p[i]; /* see above comment */
for (j = k - 1 - (i - 1); j < k - 1; j++)
c[j] ^= gf_mul (p_i, c[j + 1]);
c[k - 1] ^= p_i;
}
for (row = 0; row < k; row++) {
/*
* synthetic division etc.
*/
xx = p[row];
t = 1;
b[k - 1] = 1; /* this is in fact c[k] */
for (i = k - 1; i > 0; i--) {
b[i-1] = c[i] ^ gf_mul (xx, b[i]);
t = gf_mul (xx, t) ^ b[i-1];
}
for (col = 0; col < k; col++)
src[col * k + row] = gf_mul (inverse[t], b[col]);
}
free (c);
free (b);
free (p);
return;
}
static int fec_initialized = 0;
static void
init_fec (void) {
generate_gf();
_init_mul_table();
fec_initialized = 1;
}
/*
* This section contains the proper FEC encoding/decoding routines.
* The encoding matrix is computed starting with a Vandermonde matrix,
* and then transforming it into a systematic matrix.
*/
#define FEC_MAGIC 0xFECC0DEC
void
fec_free (fec_t *p) {
if (p == NULL ||
p->magic != (((FEC_MAGIC ^ p->k) ^ p->n) ^ (unsigned long) (p->enc_matrix))) {
ERR("bad parameters to fec_free");
return;
}
free (p->enc_matrix);
free (p);
}
fec_t *
fec_new(unsigned k, unsigned n) {
unsigned row, col;
gf *p, *tmp_m;
fec_t *retval;
fec_error[FEC_ERROR_SIZE] = '\0';
if (fec_initialized == 0)
init_fec ();
retval = (fec_t *) my_malloc (sizeof (fec_t), "new_code");
retval->k = k;
retval->n = n;
retval->enc_matrix = NEW_GF_MATRIX (n, k);
retval->magic = ((FEC_MAGIC ^ k) ^ n) ^ (unsigned long) (retval->enc_matrix);
tmp_m = NEW_GF_MATRIX (n, k);
/*
* fill the matrix with powers of field elements, starting from 0.
* The first row is special, cannot be computed with exp. table.
*/
tmp_m[0] = 1;
for (col = 1; col < k; col++)
tmp_m[col] = 0;
for (p = tmp_m + k, row = 0; row < n - 1; row++, p += k)
for (col = 0; col < k; col++)
p[col] = gf_exp[modnn (row * col)];
/*
* quick code to build systematic matrix: invert the top
* k*k vandermonde matrix, multiply right the bottom n-k rows
* by the inverse, and construct the identity matrix at the top.
*/
_invert_vdm (tmp_m, k); /* much faster than _invert_mat */
_matmul(tmp_m + k * k, tmp_m, retval->enc_matrix + k * k, n - k, k, k);
/*
* the upper matrix is I so do not bother with a slow multiply
*/
memset (retval->enc_matrix, '\0', k * k * sizeof (gf));
for (p = retval->enc_matrix, col = 0; col < k; col++, p += k + 1)
*p = 1;
free (tmp_m);
return retval;
}
void
fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz) {
unsigned char i, j;
unsigned fecnum;
gf* p;
for (i=0; i<num_block_nums; i++) {
fecnum=block_nums[i];
assert (fecnum >= code->k);
memset(fecs[i], 0, sz);
p = &(code->enc_matrix[fecnum * code->k]);
for (j = 0; j < code->k; j++)
addmul(fecs[i], src[j], p[j], sz);
}
}
/**
* Build decode matrix into some memory space.
*
* @param matrix a space allocated for a k by k matrix
*/
void
build_decode_matrix_into_space(const fec_t*restrict const code, const unsigned*const restrict index, const unsigned k, gf*restrict const matrix) {
unsigned char i;
gf* p;
for (i=0, p=matrix; i < k; i++, p += k) {
if (index[i] < k) {
memset(p, 0, k);
p[i] = 1;
} else {
memcpy(p, &(code->enc_matrix[index[i] * code->k]), k);
}
}
_invert_mat (matrix, k);
}
void
fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz) {
gf m_dec[code->k * code->k];
build_decode_matrix_into_space(code, index, code->k, m_dec);
unsigned char outix=0;
for (unsigned char row=0; row<code->k; row++) {
if (index[row] >= code->k) {
memset(outpkts[outix], 0, sz);
for (unsigned char col=0; col < code->k; col++)
addmul(outpkts[outix], inpkts[col], m_dec[row * code->k + col], sz);
outix++;
}
}
}
/**
* zfec -- fast forward error correction library with Python interface
*
* Copyright (C) 2007 Allmydata, Inc.
* Author: Zooko Wilcox-O'Hearn
*
* This file is part of zfec.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version, with the added permission that, if you become obligated
* to release a derived work under this licence (as per section 2.b), you may
* delay the fulfillment of this obligation for up to 12 months. See the file
* COPYING for details.
*
* If you would like to inquire about a commercial relationship with Allmydata,
* Inc., please contact partnerships@allmydata.com and visit
* http://allmydata.com/.
*/
/*
* Much of this work is derived from the "fec" software by Luigi Rizzo, et
* al., the copyright notice and licence terms of which are included below
* for reference.
* fec.c -- forward error correction based on Vandermonde matrices
* 980624
* (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it)
*
* Portions derived from code by Phil Karn (karn@ka9q.ampr.org),
* Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari
* Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995
*
* Modifications by Dan Rubenstein (see Modifications.txt for
* their description.
* Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu)
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
* THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
* PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS
* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
* OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
* OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
* TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
* OF SUCH DAMAGE.
*/
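
The table construction in generate_gf() and the reduction in modnn() can be restated compactly. The following Python 3 sketch is an illustration and not a translation of the C code line by line: it builds the exp, log and inverse tables for GF(2^8) from the same polynomial string, reading Pp[i] as the coefficient of x^i, which gives 1 + x^2 + x^3 + x^4 + x^8 (0x11d). The function names follow the C source, but the return convention is an assumption of this sketch.

```python
PP = "101110001"  # coefficients of x^0 .. x^8, as in the C source


def generate_gf():
    """Build (gf_exp, gf_log, inverse) for GF(2^8)."""
    poly = sum(1 << i for i, bit in enumerate(PP) if bit == "1")  # 0x11d
    gf_exp = [0] * 510
    gf_log = [0] * 256
    x = 1
    for i in range(255):
        gf_exp[i] = x
        gf_log[x] = i
        x <<= 1              # multiply by alpha
        if x & 0x100:        # degree-8 term appeared: reduce by the polynomial
            x ^= poly
    gf_log[0] = 255          # log(0) is undefined; special value as in the C code
    for i in range(255):     # doubled table so log sums need no reduction
        gf_exp[i + 255] = gf_exp[i]
    inverse = [0, 1] + [gf_exp[255 - gf_log[i]] for i in range(2, 256)]
    return gf_exp, gf_log, inverse


def modnn(x):
    """Compute x % 255 without a division, as the C modnn() does."""
    while x >= 255:
        x -= 255
        x = (x >> 8) + (x & 255)
    return x
```

With these tables, the product of two non-zero elements a and b is gf_exp[gf_log[a] + gf_log[b]]; the C code precomputes all 65536 products into gf_mul_table instead.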

View File

@ -1,99 +0,0 @@
/**
* zfec -- fast forward error correction library with Python interface
*/
typedef unsigned char gf;
typedef struct {
unsigned long magic;
unsigned k, n; /* parameters of the code */
gf* enc_matrix;
} fec_t;
/**
* @param k the number of blocks required to reconstruct
* @param m the total number of blocks created
*/
fec_t* fec_new(unsigned k, unsigned m);
void fec_free(fec_t* p);
/**
* @param src the "primary blocks" i.e. the chunks of the input data
* @param fecs buffers into which the secondary blocks will be written
* @param block_nums the numbers of the desired blocks -- including both primary blocks (the id < k) which fec_encode() ignores and check blocks (the id >= k) which fec_encode() will produce and store into the buffers of the fecs parameter
* @param num_block_nums the length of the block_nums array
* @param sz size of a block in bytes
*/
void fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz);
/**
* @param inpkts an array of packets (size k)
* @param outpkts an array of buffers into which the reconstructed output packets will be written (only packets which are not present in the inpkts input will be reconstructed and written to outpkts)
* @param index an array of the blocknums of the packets in inpkts
* @param sz size of a packet in bytes
*/
void fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz);
/**
* zfec -- fast forward error correction library with Python interface
*
* Copyright (C) 2007 Allmydata, Inc.
* Author: Zooko Wilcox-O'Hearn
*
* This file is part of zfec.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version, with the added permission that, if you become obligated
* to release a derived work under this licence (as per section 2.b), you may
* delay the fulfillment of this obligation for up to 12 months. See the file
* COPYING for details.
*
* If you would like to inquire about a commercial relationship with Allmydata,
* Inc., please contact partnerships@allmydata.com and visit
* http://allmydata.com/.
*/
/*
* Much of this work is derived from the "fec" software by Luigi Rizzo, et
* al., the copyright notice and licence terms of which are included below
* for reference.
*
* fec.h -- forward error correction based on Vandermonde matrices
* 980614
* (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it)
*
* Portions derived from code by Phil Karn (karn@ka9q.ampr.org),
* Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari
* Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995
*
* Modifications by Dan Rubenstein (see Modifications.txt for
* their description.
* Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu)
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
* THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
* PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS
* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
* OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
* OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
* TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
* OF SUCH DAMAGE.
*/
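
The relationship between the k primary blocks and the check blocks can be made concrete with a small model of the systematic encoding matrix. The Python 3 sketch below is illustrative only and rests on several assumptions: all function names are hypothetical, it inverts matrices with plain Gauss-Jordan elimination instead of a specialized Vandermonde inversion, and its decode() rebuilds all k blocks from any k shares, whereas fec_decode() as documented above writes only the packets that are missing from its input.

```python
# GF(2^8) tables for the polynomial 1 + x^2 + x^3 + x^4 + x^8 (0x11d).
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = EXP[_i + 255] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def matmul(a, b):
    """C = A*B over GF(2^8); addition is XOR."""
    out = [[0] * len(b[0]) for _ in a]
    for r, row in enumerate(a):
        for c in range(len(b[0])):
            acc = 0
            for i, coef in enumerate(row):
                acc ^= mul(coef, b[i][c])
            out[r][c] = acc
    return out


def invert(mat):
    """Gauss-Jordan inversion; raises StopIteration if mat is singular."""
    k = len(mat)
    aug = [row[:] + [int(r == c) for c in range(k)] for r, row in enumerate(mat)]
    for col in range(k):
        piv = next(r for r in range(col, k) if aug[r][col])
        aug[col], aug[piv] = aug[piv], aug[col]
        scale = EXP[255 - LOG[aug[col][col]]]      # inverse of the pivot
        aug[col] = [mul(scale, v) for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [v ^ mul(f, p) for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def encoding_matrix(k, n):
    """n x k systematic matrix: a Vandermonde matrix times the inverse of its top k rows."""
    vdm = [[1] + [0] * (k - 1)]
    vdm += [[EXP[(r * c) % 255] for c in range(k)] for r in range(n - 1)]
    return matmul(vdm, invert(vdm[:k]))


def apply_matrix(mat, blocks):
    """Multiply a matrix by a column of equal-length byte blocks."""
    out = []
    for row in mat:
        acc = bytearray(len(blocks[0]))
        for coef, block in zip(row, blocks):
            for pos, byte in enumerate(block):
                acc[pos] ^= mul(coef, byte)
        out.append(bytes(acc))
    return out


def decode(enc, shares, nums):
    """Recover the k primary blocks from k shares with share numbers nums."""
    return apply_matrix(invert([enc[i] for i in nums]), shares)
```

Because the top k rows of the matrix form the identity, the first k shares are the primary blocks themselves, which is why decoding from low-numbered shares is the cheap case.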

View File

@ -1,496 +0,0 @@
import easyfec, zfec
from util import fileutil
from util.mathutil import log_ceil
import array, os, re, struct, traceback
CHUNKSIZE = 4096
class InsufficientShareFilesError(zfec.Error):
def __init__(self, k, kb, *args, **kwargs):
zfec.Error.__init__(self, *args, **kwargs)
self.k = k
self.kb = kb
def __repr__(self):
return "Insufficient share files -- %d share files are required to recover this file, but only %d were given" % (self.k, self.kb,)
def __str__(self):
return self.__repr__()
class CorruptedShareFilesError(zfec.Error):
pass
def _build_header(m, k, pad, sh):
"""
@param m: the total number of shares; 3 <= m <= 256
@param k: the number of shares required to reconstruct; 2 <= k < m
@param pad: the number of bytes of padding added to the file before encoding; 0 <= pad < k
@param sh: the shnum of this share; 0 <= sh < m
@return: a string (which is hopefully short) encoding m, k, sh, and pad
"""
assert m >= 3
assert m <= 2**8
assert k >= 2
assert k < m
assert pad >= 0
assert pad < k
assert sh >= 0
assert sh < m
bitsused = 0
val = 0
val |= (m - 3)
bitsused += 8 # the first 8 bits always encode m
kbits = log_ceil(m-2, 2) # num bits needed to store all possible values of k
val <<= kbits
bitsused += kbits
val |= (k - 2)
padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad
val <<= padbits
bitsused += padbits
val |= pad
shnumbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum
val <<= shnumbits
bitsused += shnumbits
val |= sh
assert bitsused >= 11
assert bitsused <= 32
if bitsused <= 16:
val <<= (16-bitsused)
cs = struct.pack('>H', val)
assert cs[:-2] == '\x00' * (len(cs)-2)
return cs[-2:]
if bitsused <= 24:
val <<= (24-bitsused)
cs = struct.pack('>I', val)
assert cs[:-3] == '\x00' * (len(cs)-3)
return cs[-3:]
else:
val <<= (32-bitsused)
cs = struct.pack('>I', val)
assert cs[:-4] == '\x00' * (len(cs)-4)
return cs[-4:]
def MASK(bits):
return (1<<bits)-1
def _parse_header(inf):
"""
@param inf: an object which I can call read(1) on to get another byte
@return: tuple of (m, k, pad, sh,); side-effect: the first one to four
bytes of inf will be read
"""
# The first 8 bits always encode m.
ch = inf.read(1)
if not ch:
raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,))
byte = ord(ch)
m = byte + 3
# The next few bits encode k.
kbits = log_ceil(m-2, 2) # num bits needed to store all possible values of k
b2_bits_left = 8-kbits
kbitmask = MASK(kbits) << b2_bits_left
ch = inf.read(1)
if not ch:
raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,))
byte = ord(ch)
k = ((byte & kbitmask) >> b2_bits_left) + 2
shbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum
padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad
val = byte & (~kbitmask)
needed_padbits = padbits - b2_bits_left
if needed_padbits > 0:
ch = inf.read(1)
if not ch:
raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,))
byte = struct.unpack(">B", ch)[0]
val <<= 8
val |= byte
needed_padbits -= 8
assert needed_padbits <= 0
extrabits = -needed_padbits
pad = val >> extrabits
val &= MASK(extrabits)
needed_shbits = shbits - extrabits
if needed_shbits > 0:
ch = inf.read(1)
if not ch:
raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,))
byte = struct.unpack(">B", ch)[0]
val <<= 8
val |= byte
needed_shbits -= 8
assert needed_shbits <= 0
gotshbits = -needed_shbits
sh = val >> gotshbits
return (m, k, pad, sh,)
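
The header layout used by _build_header() and _parse_header() packs four fields into 2 to 4 bytes: m-3 in 8 bits, then k-2, pad and sh in just as many bits as their ranges require, left-aligned and zero-filled. The Python 3 sketch below shows a round trip under stated assumptions: log_ceil here is a stand-in for util.mathutil.log_ceil (assumed to return the smallest e with b**e >= n), build_header and parse_header are hypothetical names, and the parser reads bit by bit from a bytes object instead of calling read(1) on a file.

```python
def log_ceil(n, b):
    """Smallest integer e such that b**e >= n (assumed behaviour)."""
    p, e = 1, 0
    while p < n:
        p *= b
        e += 1
    return e


def build_header(m, k, pad, sh):
    assert 3 <= m <= 256 and 2 <= k < m and 0 <= pad < k and 0 <= sh < m
    fields = [
        (m - 3, 8),                  # the first 8 bits always encode m
        (k - 2, log_ceil(m - 2, 2)),
        (pad, log_ceil(k, 2)),
        (sh, log_ceil(m, 2)),
    ]
    val = bits = 0
    for value, width in fields:
        val = (val << width) | value
        bits += width
    nbytes = (bits + 7) // 8         # 11 <= bits <= 32, so 2..4 bytes
    val <<= nbytes * 8 - bits        # left-align, zero-fill the tail
    return val.to_bytes(nbytes, "big")


def parse_header(data):
    """Return (m, k, pad, sh, header_length_in_bytes)."""
    pos = 0

    def take(width):
        nonlocal pos
        out = 0
        for _ in range(width):
            out = (out << 1) | ((data[pos // 8] >> (7 - pos % 8)) & 1)
            pos += 1
        return out

    m = take(8) + 3
    k = take(log_ceil(m - 2, 2)) + 2
    pad = take(log_ceil(k, 2))
    sh = take(log_ceil(m, 2))
    return m, k, pad, sh, (pos + 7) // 8
```

With the command-line defaults m=16 and k=4, the fields need 8 + 4 + 2 + 4 = 18 bits, so the header occupies three bytes.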
FORMAT_FORMAT = "%%s.%%0%dd_%%0%dd%%s"
RE_FORMAT = "%s.[0-9]+_[0-9]+%s"
def encode_to_files(inf, fsize, dirname, prefix, k, m, suffix=".fec", overwrite=False, verbose=False):
"""
Encode inf, writing the shares to specially named, newly created files.
@param fsize: calling read() on inf must yield fsize bytes of data and
then raise an EOFError
@param dirname: the name of the directory into which the sharefiles will
be written
"""
mlen = len(str(m))
format = FORMAT_FORMAT % (mlen, mlen,)
padbytes = zfec.util.mathutil.pad_size(fsize, k)
fns = []
fs = []
try:
for shnum in range(m):
hdr = _build_header(m, k, padbytes, shnum)
fn = os.path.join(dirname, format % (prefix, shnum, m, suffix,))
if verbose:
print "Creating share file %r..." % (fn,)
if overwrite:
f = open(fn, "wb")
else:
flags = os.O_WRONLY|os.O_CREAT|os.O_EXCL | (hasattr(os, 'O_BINARY') and os.O_BINARY)
fd = os.open(fn, flags)
f = os.fdopen(fd, "wb")
f.write(hdr)
fs.append(f)
fns.append(fn)
sumlen = [0]
def cb(blocks, length):
assert len(blocks) == len(fs)
oldsumlen = sumlen[0]
sumlen[0] += length
if verbose:
if int((float(oldsumlen) / fsize) * 10) != int((float(sumlen[0]) / fsize) * 10):
print str(int((float(sumlen[0]) / fsize) * 10) * 10) + "% ...",
if sumlen[0] > fsize:
raise IOError("Wrong file size -- possibly the size of the file changed during encoding. Original size: %d, observed size at least: %s" % (fsize, sumlen[0],))
for i in range(len(blocks)):
data = blocks[i]
fs[i].write(data)
length -= len(data)
encode_file_stringy_easyfec(inf, cb, k, m, chunksize=4096)
except EnvironmentError, le:
print "Cannot complete because of exception: "
print le
print "Cleaning up..."
# clean up
while fs:
f = fs.pop()
f.close() ; del f
fn = fns.pop()
if verbose:
print "Cleaning up: trying to remove %r..." % (fn,)
fileutil.remove_if_possible(fn)
return 1
if verbose:
print
print "Done!"
return 0
# Note: if you really prefer base-2 and you change this code, then please
# denote 2^20 as "MiB" instead of "MB" in order to avoid ambiguity.
# Thanks.
# http://en.wikipedia.org/wiki/Megabyte
MILLION_BYTES=10**6
def decode_from_files(outf, infiles, verbose=False):
"""
Decode from the first k files in infiles, writing the results to outf.
"""
assert len(infiles) >= 2
infs = []
shnums = []
m = None
k = None
padlen = None
byteswritten = 0
for f in infiles:
(nm, nk, npadlen, shnum,) = _parse_header(f)
if not (m is None or m == nm):
raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that m was %s but another share file previously said that m was %s" % (f.name, nm, m,))
m = nm
if not (k is None or k == nk):
raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that k was %s but another share file previously said that k was %s" % (f.name, nk, k,))
if nk > len(infiles):
raise InsufficientShareFilesError(nk, len(infiles))
k = nk
if not (padlen is None or padlen == npadlen):
raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that pad length was %s but another share file previously said that pad length was %s" % (f.name, npadlen, padlen,))
padlen = npadlen
infs.append(f)
shnums.append(shnum)
if len(infs) == k:
break
dec = easyfec.Decoder(k, m)
while True:
chunks = [ inf.read(CHUNKSIZE) for inf in infs ]
if [ch for ch in chunks if len(ch) != len(chunks[-1])]:
raise CorruptedShareFilesError("Share files were corrupted -- all share files are required to be the same length, but they weren't.")
if len(chunks[-1]) == CHUNKSIZE:
# Then this was a full read, so we're still in the sharefiles.
resultdata = dec.decode(chunks, shnums, padlen=0)
outf.write(resultdata)
byteswritten += len(resultdata)
if verbose:
if ((byteswritten - len(resultdata)) / (10*MILLION_BYTES)) != (byteswritten / (10*MILLION_BYTES)):
print str(byteswritten / MILLION_BYTES) + " MB ...",
else:
# Then this was a short read, so we've reached the end of the sharefiles.
resultdata = dec.decode(chunks, shnums, padlen)
outf.write(resultdata)
break # Done; leave the loop so that the verbose summary below is printed.
if verbose:
print
print "Done!"
def encode_file(inf, cb, k, m, chunksize=4096):
"""
Read in the contents of inf, encode, and call cb with the results.
First, k "input blocks" will be read from inf, each input block being of
size chunksize. Then these k blocks will be encoded into m "result
blocks". Then cb will be invoked, passing a list of the m result blocks
as its first argument, and the length of the encoded data as its second
argument. (The length of the encoded data is always equal to k*chunksize,
until the last iteration, when the end of the file has been reached and
less than k*chunksize bytes could be read from the file.) This procedure
is iterated until the end of the file is reached, in which case the space
of the input blocks that is unused is filled with zeroes before encoding.
Note that the sequence passed in calls to cb() contains mutable array
objects in its first k elements whose contents will be overwritten when
the next segment is read from the input file. Therefore the
implementation of cb() has to either be finished with those first k arrays
before returning, or if it wants to keep the contents of those arrays for
subsequent use after it has returned then it must make a copy of them to
keep.
@param inf the file object from which to read the data
@param cb the callback to be invoked with the results
@param k the number of shares required to reconstruct the file
@param m the total number of shares created
@param chunksize how much data to read from inf for each of the k input
blocks
"""
enc = zfec.Encoder(k, m)
l = tuple([ array.array('c') for i in range(k) ])
indatasize = k*chunksize # will be reset to shorter upon EOF
eof = False
ZEROES=array.array('c', ['\x00'])*chunksize
while not eof:
# This loop body executes once per segment.
i = 0
while (i<len(l)):
# This loop body executes once per chunk.
a = l[i]
del a[:]
try:
a.fromfile(inf, chunksize)
i += 1
except EOFError:
eof = True
indatasize = i*chunksize + len(a)
# padding
a.fromstring("\x00" * (chunksize-len(a)))
i += 1
while (i<len(l)):
a = l[i]
a[:] = ZEROES
i += 1
res = enc.encode(l)
cb(res, indatasize)
import sha
def encode_file_not_really(inf, cb, k, m, chunksize=4096):
enc = zfec.Encoder(k, m)
l = tuple([ array.array('c') for i in range(k) ])
indatasize = k*chunksize # will be reset to shorter upon EOF
eof = False
ZEROES=array.array('c', ['\x00'])*chunksize
while not eof:
# This loop body executes once per segment.
i = 0
while (i<len(l)):
# This loop body executes once per chunk.
a = l[i]
del a[:]
try:
a.fromfile(inf, chunksize)
i += 1
except EOFError:
eof = True
indatasize = i*chunksize + len(a)
# padding
a.fromstring("\x00" * (chunksize-len(a)))
i += 1
while (i<len(l)):
a = l[i]
a[:] = ZEROES
i += 1
# res = enc.encode(l)
cb(None, None)
def encode_file_not_really_and_hash(inf, cb, k, m, chunksize=4096):
hasher = sha.new()
enc = zfec.Encoder(k, m)
l = tuple([ array.array('c') for i in range(k) ])
indatasize = k*chunksize # will be reset to shorter upon EOF
eof = False
ZEROES=array.array('c', ['\x00'])*chunksize
while not eof:
# This loop body executes once per segment.
i = 0
while (i<len(l)):
# This loop body executes once per chunk.
a = l[i]
del a[:]
try:
a.fromfile(inf, chunksize)
i += 1
except EOFError:
eof = True
indatasize = i*chunksize + len(a)
# padding
a.fromstring("\x00" * (chunksize-len(a)))
i += 1
while (i<len(l)):
a = l[i]
a[:] = ZEROES
i += 1
# res = enc.encode(l)
for thing in l:
hasher.update(thing)
cb(None, None)
def encode_file_stringy(inf, cb, k, m, chunksize=4096):
"""
Read in the contents of inf, encode, and call cb with the results.
First, k "input blocks" will be read from inf, each input block being of
size chunksize. Then these k blocks will be encoded into m "result
blocks". Then cb will be invoked, passing a list of the m result blocks
as its first argument, and the length of the encoded data as its second
argument. (The length of the encoded data is always equal to k*chunksize,
until the last iteration, when the end of the file has been reached and
less than k*chunksize bytes could be read from the file.) This procedure
is iterated until the end of the file is reached, in which case the part
of the input shares that is unused is filled with zeroes before encoding.
@param inf the file object from which to read the data
@param cb the callback to be invoked with the results
@param k the number of shares required to reconstruct the file
@param m the total number of shares created
@param chunksize how much data to read from inf for each of the k input
blocks
"""
enc = zfec.Encoder(k, m)
indatasize = k*chunksize # will be reset to shorter upon EOF
while indatasize == k*chunksize:
# This loop body executes once per segment.
i = 0
l = []
ZEROES = '\x00'*chunksize
while i<k:
# This loop body executes once per chunk.
i += 1
l.append(inf.read(chunksize))
if len(l[-1]) < chunksize:
indatasize = (i-1)*chunksize + len(l[-1]) # i was already incremented for this chunk
# padding
l[-1] = l[-1] + "\x00" * (chunksize-len(l[-1]))
while i<k:
l.append(ZEROES)
i += 1
res = enc.encode(l)
cb(res, indatasize)
def encode_file_stringy_easyfec(inf, cb, k, m, chunksize=4096):
"""
Read in the contents of inf, encode, and call cb with the results.
First, chunksize*k bytes will be read from inf, then encoded into m
"result blocks". Then cb will be invoked, passing a list of the m result
blocks as its first argument, and the length of the encoded data as its
second argument. (The length of the encoded data is always equal to
k*chunksize, until the last iteration, when the end of the file has been
reached and less than k*chunksize bytes could be read from the file.)
This procedure is iterated until the end of the file is reached, in which
case the space of the input that is unused is filled with zeroes before
encoding.
@param inf the file object from which to read the data
@param cb the callback to be invoked with the results
@param k the number of shares required to reconstruct the file
@param m the total number of shares created
@param chunksize how much data to read from inf for each of the k input
blocks
"""
enc = easyfec.Encoder(k, m)
readsize = k*chunksize
indata = inf.read(readsize)
while indata:
res = enc.encode(indata)
cb(res, len(indata))
indata = inf.read(readsize)
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
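The segment loop described in the encode_file_stringy docstring above can be sketched on its own, without the zfec encoder. This is a minimal illustration in Python 3 (the deleted source is Python 2), and the name iter_segments is hypothetical, not part of zfec. It reads k*chunksize bytes per segment, zero-fills a short final segment, and splits it into k input blocks. The real functions then pass those blocks to the encoder and hand the m result blocks to cb.

```python
import io


def iter_segments(inf, k, chunksize=4096):
    """Yield (blocks, length) once per segment.

    blocks is a list of k byte strings of exactly chunksize bytes each;
    length is the number of real (unpadded) bytes read for the segment.
    Hypothetical helper for illustration only.
    """
    readsize = k * chunksize
    while True:
        indata = inf.read(readsize)
        if not indata:
            break  # end of file
        length = len(indata)
        # Fill the unused tail of a short final segment with zero bytes.
        padded = indata.ljust(readsize, b"\x00")
        blocks = [padded[i * chunksize:(i + 1) * chunksize] for i in range(k)]
        yield blocks, length


if __name__ == "__main__":
    source = io.BytesIO(b"Yellow Whirled!A")
    for blocks, length in iter_segments(source, k=3, chunksize=4):
        print(length, blocks)
```

With k=3 and chunksize=4, the 16-byte input yields one full segment of 12 bytes and one short segment of 4 bytes whose last two blocks are all zeroes.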

View File

@ -1,128 +0,0 @@
from twisted.trial import unittest
from zfec.util import mathutil
class Math(unittest.TestCase):
def test_div_ceil(self):
f = mathutil.div_ceil
self.failUnlessEqual(f(0, 1), 0)
self.failUnlessEqual(f(0, 2), 0)
self.failUnlessEqual(f(0, 3), 0)
self.failUnlessEqual(f(1, 3), 1)
self.failUnlessEqual(f(2, 3), 1)
self.failUnlessEqual(f(3, 3), 1)
self.failUnlessEqual(f(4, 3), 2)
self.failUnlessEqual(f(5, 3), 2)
self.failUnlessEqual(f(6, 3), 2)
self.failUnlessEqual(f(7, 3), 3)
def test_next_multiple(self):
f = mathutil.next_multiple
self.failUnlessEqual(f(5, 1), 5)
self.failUnlessEqual(f(5, 2), 6)
self.failUnlessEqual(f(5, 3), 6)
self.failUnlessEqual(f(5, 4), 8)
self.failUnlessEqual(f(5, 5), 5)
self.failUnlessEqual(f(5, 6), 6)
self.failUnlessEqual(f(32, 1), 32)
self.failUnlessEqual(f(32, 2), 32)
self.failUnlessEqual(f(32, 3), 33)
self.failUnlessEqual(f(32, 4), 32)
self.failUnlessEqual(f(32, 5), 35)
self.failUnlessEqual(f(32, 6), 36)
self.failUnlessEqual(f(32, 7), 35)
self.failUnlessEqual(f(32, 8), 32)
self.failUnlessEqual(f(32, 9), 36)
self.failUnlessEqual(f(32, 10), 40)
self.failUnlessEqual(f(32, 11), 33)
self.failUnlessEqual(f(32, 12), 36)
self.failUnlessEqual(f(32, 13), 39)
self.failUnlessEqual(f(32, 14), 42)
self.failUnlessEqual(f(32, 15), 45)
self.failUnlessEqual(f(32, 16), 32)
self.failUnlessEqual(f(32, 17), 34)
self.failUnlessEqual(f(32, 18), 36)
self.failUnlessEqual(f(32, 589), 589)
def test_pad_size(self):
f = mathutil.pad_size
self.failUnlessEqual(f(0, 4), 0)
self.failUnlessEqual(f(1, 4), 3)
self.failUnlessEqual(f(2, 4), 2)
self.failUnlessEqual(f(3, 4), 1)
self.failUnlessEqual(f(4, 4), 0)
self.failUnlessEqual(f(5, 4), 3)
def test_is_power_of_k(self):
f = mathutil.is_power_of_k
for i in range(1, 100):
if i in (1, 2, 4, 8, 16, 32, 64):
self.failUnless(f(i, 2), "but %d *is* a power of 2" % i)
else:
self.failIf(f(i, 2), "but %d is *not* a power of 2" % i)
for i in range(1, 100):
if i in (1, 3, 9, 27, 81):
self.failUnless(f(i, 3), "but %d *is* a power of 3" % i)
else:
self.failIf(f(i, 3), "but %d is *not* a power of 3" % i)
def test_next_power_of_k(self):
f = mathutil.next_power_of_k
self.failUnlessEqual(f(0,2), 1)
self.failUnlessEqual(f(1,2), 1)
self.failUnlessEqual(f(2,2), 2)
self.failUnlessEqual(f(3,2), 4)
self.failUnlessEqual(f(4,2), 4)
for i in range(5, 8): self.failUnlessEqual(f(i,2), 8, "%d" % i)
for i in range(9, 16): self.failUnlessEqual(f(i,2), 16, "%d" % i)
for i in range(17, 32): self.failUnlessEqual(f(i,2), 32, "%d" % i)
for i in range(33, 64): self.failUnlessEqual(f(i,2), 64, "%d" % i)
for i in range(65, 100): self.failUnlessEqual(f(i,2), 128, "%d" % i)
self.failUnlessEqual(f(0,3), 1)
self.failUnlessEqual(f(1,3), 1)
self.failUnlessEqual(f(2,3), 3)
self.failUnlessEqual(f(3,3), 3)
for i in range(4, 9): self.failUnlessEqual(f(i,3), 9, "%d" % i)
for i in range(10, 27): self.failUnlessEqual(f(i,3), 27, "%d" % i)
for i in range(28, 81): self.failUnlessEqual(f(i,3), 81, "%d" % i)
for i in range(82, 200): self.failUnlessEqual(f(i,3), 243, "%d" % i)
def test_ave(self):
f = mathutil.ave
self.failUnlessEqual(f([1,2,3]), 2)
self.failUnlessEqual(f([0,0,0,4]), 1)
self.failUnlessAlmostEqual(f([0.0, 1.0, 1.0]), .666666666666)
def failUnlessEqualContents(self, a, b):
self.failUnlessEqual(sorted(a), sorted(b))
def test_permute(self):
f = mathutil.permute
self.failUnlessEqualContents(f([]), [])
self.failUnlessEqualContents(f([1]), [[1]])
self.failUnlessEqualContents(f([1,2]), [[1,2], [2,1]])
self.failUnlessEqualContents(f([1,2,3]),
[[1,2,3], [1,3,2],
[2,1,3], [2,3,1],
[3,1,2], [3,2,1]])
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.

View File

@ -1,243 +0,0 @@
#!/usr/bin/env python
import cStringIO, os, random, re
import unittest
global VERBOSE
VERBOSE=False
import zfec
from base64 import b32encode
def ab(x): # debuggery
if len(x) >= 3:
return "%s:%s" % (len(x), b32encode(x[-3:]),)
elif len(x) == 2:
return "%s:%s" % (len(x), b32encode(x[-2:]),)
elif len(x) == 1:
return "%s:%s" % (len(x), b32encode(x[-1:]),)
elif len(x) == 0:
return "%s:%s" % (len(x), "--empty--",)
def _h(k, m, ss):
encer = zfec.Encoder(k, m)
nums_and_blocks = list(enumerate(encer.encode(ss)))
assert isinstance(nums_and_blocks, list), nums_and_blocks
assert len(nums_and_blocks) == m, (len(nums_and_blocks), m,)
nums_and_blocks = random.sample(nums_and_blocks, k)
blocks = [ x[1] for x in nums_and_blocks ]
nums = [ x[0] for x in nums_and_blocks ]
decer = zfec.Decoder(k, m)
decoded = decer.decode(blocks, nums)
assert len(decoded) == len(ss), (len(decoded), len(ss),)
assert tuple([str(s) for s in decoded]) == tuple([str(s) for s in ss]), (tuple([ab(str(s)) for s in decoded]), tuple([ab(str(s)) for s in ss]),)
def randstr(n):
return ''.join(map(chr, map(random.randrange, [0]*n, [256]*n)))
def _help_test_random():
m = random.randrange(1, 257)
k = random.randrange(1, m+1)
l = random.randrange(0, 2**10)
ss = [ randstr(l/k) for x in range(k) ]
_h(k, m, ss)
def _help_test_random_with_l(l):
m = 83
k = 19
ss = [ randstr(l/k) for x in range(k) ]
_h(k, m, ss)
class ZFec(unittest.TestCase):
def test_random(self):
for i in range(3):
_help_test_random()
if VERBOSE:
print "%d randomized tests pass." % (i+1)
def test_bad_args_enc(self):
encer = zfec.Encoder(2, 4)
try:
encer.encode(["a", "b", ], ["c", "I am not an integer blocknum",])
except zfec.Error, e:
assert "Precondition violation: second argument is required to contain int" in str(e), e
else:
self.fail("Should have gotten zfec.Error for wrong type of second argument.")
try:
encer.encode(["a", "b", ], 98) # not a sequence at all
except TypeError, e:
assert "Second argument (optional) was not a sequence" in str(e), e
else:
self.fail("Should have gotten TypeError for wrong type of second argument.")
def test_bad_args_dec(self):
decer = zfec.Decoder(2, 4)
try:
decer.decode(98, [0, 1]) # first argument is not a sequence
except TypeError, e:
assert "First argument was not a sequence" in str(e), e
else:
self.fail("Should have gotten TypeError for wrong type of first argument.")
try:
decer.decode(["a", "b", ], ["c", "d",])
except zfec.Error, e:
assert "Precondition violation: second argument is required to contain int" in str(e), e
else:
self.fail("Should have gotten zfec.Error for wrong type of second argument.")
try:
decer.decode(["a", "b", ], 98) # not a sequence at all
except TypeError, e:
assert "Second argument was not a sequence" in str(e), e
else:
self.fail("Should have gotten TypeError for wrong type of second argument.")
class FileFec(unittest.TestCase):
def test_filefec_header(self):
for m in [3, 5, 7, 9, 11, 17, 19, 33, 35, 65, 66, 67, 129, 130, 131, 254, 255, 256,]:
for k in [2, 3, 5, 9, 17, 33, 65, 129, 255,]:
if k >= m:
continue
for pad in [0, 1, k-1,]:
if pad >= k:
continue
for sh in [0, 1, m-1,]:
if sh >= m:
continue
h = zfec.filefec._build_header(m, k, pad, sh)
hio = cStringIO.StringIO(h)
(rm, rk, rpad, rsh,) = zfec.filefec._parse_header(hio)
assert (rm, rk, rpad, rsh,) == (m, k, pad, sh,), h
def _help_test_filefec(self, teststr, k, m, numshs=None):
if numshs is None:
numshs = m
TESTFNAME = "testfile.txt"
PREFIX = "test"
SUFFIX = ".fec"
fsize = len(teststr)
tempdir = zfec.util.fileutil.NamedTemporaryDirectory(cleanup=True)
try:
tempf = tempdir.file(TESTFNAME, 'w+b')
tempf.write(teststr)
tempf.seek(0)
# encode the file
zfec.filefec.encode_to_files(tempf, fsize, tempdir.name, PREFIX, k, m, SUFFIX, verbose=VERBOSE)
# select some share files
RE=re.compile(zfec.filefec.RE_FORMAT % (PREFIX, SUFFIX,))
fns = os.listdir(tempdir.name)
assert len(fns) >= m, (fns, tempdir, tempdir.name,)
sharefs = [ open(os.path.join(tempdir.name, fn), "rb") for fn in fns if RE.match(fn) ]
for sharef in sharefs:
tempdir.register_file(sharef)
random.shuffle(sharefs)
del sharefs[numshs:]
# decode from the share files
outf = tempdir.file('recovered-testfile.txt', 'w+b')
zfec.filefec.decode_from_files(outf, sharefs, verbose=VERBOSE)
outf.seek(0)
recovereddata = outf.read()
assert recovereddata == teststr
finally:
tempdir.shutdown()
def test_filefec_all_shares(self):
return self._help_test_filefec("Yellow Whirled!", 3, 8)
def test_filefec_all_shares_2(self):
return self._help_test_filefec("Yellow Whirled", 3, 8)
def test_filefec_all_shares_3(self):
return self._help_test_filefec("Yellow Whirle", 3, 8)
def test_filefec_all_shares_3_b(self):
return self._help_test_filefec("Yellow Whirle", 4, 16)
def test_filefec_all_shares_2_b(self):
return self._help_test_filefec("Yellow Whirled", 4, 16)
def test_filefec_all_shares_1_b(self):
return self._help_test_filefec("Yellow Whirled!", 4, 16)
def test_filefec_all_shares_with_padding(self, noisy=VERBOSE):
return self._help_test_filefec("Yellow Whirled!A", 3, 8)
def test_filefec_min_shares_with_padding(self, noisy=VERBOSE):
return self._help_test_filefec("Yellow Whirled!A", 3, 8, numshs=3)
def test_filefec_min_shares_with_crlf(self, noisy=VERBOSE):
return self._help_test_filefec("llow Whirled!A\r\n", 3, 8, numshs=3)
def test_filefec_min_shares_with_lf(self, noisy=VERBOSE):
return self._help_test_filefec("Yellow Whirled!A\n", 3, 8, numshs=3)
def test_filefec_min_shares_with_lflf(self, noisy=VERBOSE):
return self._help_test_filefec("Yellow Whirled!A\n\n", 3, 8, numshs=3)
def test_filefec_min_shares_with_crcrlflf(self, noisy=VERBOSE):
return self._help_test_filefec("Yellow Whirled!A\r\r\n\n", 3, 8, numshs=3)
class Cmdline(unittest.TestCase):
def test_basic(self, noisy=VERBOSE):
tempdir = zfec.util.fileutil.NamedTemporaryDirectory(cleanup=True)
fo = tempdir.file("test.data", "w+b")
fo.write("WHEHWHJEKWAHDLJAWDHWALKDHA")
fo.flush() # make sure the data is on disk before zfec opens the file by name
import sys
realargv = sys.argv
try:
DEFAULT_M=16
DEFAULT_K=4
sys.argv = ["zfec", os.path.join(tempdir.name, "test.data"),]
retcode = zfec.cmdline_zfec.main()
assert retcode == 0, retcode
RE=re.compile(zfec.filefec.RE_FORMAT % ('test.data', ".fec",))
fns = os.listdir(tempdir.name)
assert len(fns) >= DEFAULT_M, (fns, tempdir, tempdir.name,)
sharefns = [ os.path.join(tempdir.name, fn) for fn in fns if RE.match(fn) ]
random.shuffle(sharefns)
del sharefns[DEFAULT_K:]
sys.argv = ["zunfec",]
sys.argv.extend(sharefns)
sys.argv.extend(['-o', os.path.join(tempdir.name, 'test.data-recovered'),])
retcode = zfec.cmdline_zunfec.main()
assert retcode == 0, retcode
import filecmp
assert filecmp.cmp(os.path.join(tempdir.name, 'test.data'), os.path.join(tempdir.name, 'test.data-recovered'))
finally:
sys.argv = realargv
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
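The header round trip exercised by test_filefec_header can be sketched in isolation. The parser below follows the field order visible in _parse_header: 8 bits for m-3, then log_ceil(m-2, 2) bits for k-2, then log_ceil(k, 2) bits for pad, then log_ceil(m, 2) bits for the share number. The packer is an assumption, since _build_header is not shown in this part of the diff. It is simply the inverse of that layout, left-aligned and zero-filled to a whole number of bytes. The sketch is Python 3, and all function names here are illustrative.

```python
def log_ceil(n, b):
    """Smallest integer e such that b**e >= n (same definition as mathutil.log_ceil)."""
    p, e = 1, 0
    while p < n:
        p *= b
        e += 1
    return e


def build_header(m, k, pad, sh):
    """Pack (m, k, pad, sh) into a compact big-endian bit string.

    Hypothetical packer, inferred from the parser's field order.
    """
    kbits = log_ceil(m - 2, 2)
    padbits = log_ceil(k, 2)
    shbits = log_ceil(m, 2)
    val = m - 3                        # first 8 bits
    val = (val << kbits) | (k - 2)     # next kbits bits
    val = (val << padbits) | pad       # next padbits bits
    val = (val << shbits) | sh         # last shbits bits
    nbits = 8 + kbits + padbits + shbits
    nbytes = (nbits + 7) // 8
    val <<= nbytes * 8 - nbits         # left-align, unused low bits stay zero
    return val.to_bytes(nbytes, "big")


def parse_header(data):
    """Inverse of build_header; data must hold at least the whole header."""
    total = len(data) * 8
    val = int.from_bytes(data, "big")
    pos = 0

    def take(nbits):
        # Return the next nbits bits, counted from the most significant end.
        nonlocal pos
        pos += nbits
        return (val >> (total - pos)) & ((1 << nbits) - 1)

    m = take(8) + 3
    k = take(log_ceil(m - 2, 2)) + 2
    pad = take(log_ceil(k, 2))
    sh = take(log_ceil(m, 2))
    return m, k, pad, sh


if __name__ == "__main__":
    header = build_header(8, 3, 1, 5)  # packs into two bytes: 0x05 0x2d
    print(header.hex(), parse_header(header))
```

For m=8 and k=3 the fields take 8 + 3 + 2 + 3 = 16 bits, so the header is two bytes. For m=256 and k=255 every field needs 8 bits and the header grows to four bytes.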

File diff suppressed because it is too large

View File

@ -1,257 +0,0 @@
"""
Futz with files like a pro.
"""
import exceptions, os, stat, tempfile, time
try:
from twisted.python import log
except ImportError:
class DummyLog:
def msg(self, *args, **kwargs):
pass
log = DummyLog()
def rename(src, dst, tries=4, basedelay=0.1):
""" Here is a superkludge to workaround the fact that occasionally on
Windows some other process (e.g. an anti-virus scanner, a local search
engine, etc.) is looking at your file when you want to delete or move it,
and hence you can't. The horrible workaround is to sit and spin, trying
to move it, for a short time and then give up.
With the default values of tries and basedelay this sleeps for at most
0.7 seconds in total (0.1 + 0.2 + 0.4).
@param tries: number of tries -- each time after the first we wait twice
as long as the previous wait
@param basedelay: how long to wait before the second try
"""
for i in range(tries-1):
try:
return os.rename(src, dst)
except EnvironmentError, le:
# XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open) and execute the superkludge only in this case.
log.msg("XXX KLUDGE Attempting to move file %s => %s; got %s; sleeping %s seconds" % (src, dst, le, basedelay,))
time.sleep(basedelay)
basedelay *= 2
return os.rename(src, dst) # The last try.
def remove(f, tries=4, basedelay=0.1):
""" Here is a superkludge to workaround the fact that occasionally on
Windows some other process (e.g. an anti-virus scanner, a local search
engine, etc.) is looking at your file when you want to delete or move it,
and hence you can't. The horrible workaround is to sit and spin, trying
to delete it, for a short time and then give up.
With the default values of tries and basedelay this sleeps for at most
0.7 seconds in total (0.1 + 0.2 + 0.4).
@param tries: number of tries -- each time after the first we wait twice
as long as the previous wait
@param basedelay: how long to wait before the second try
"""
try:
os.chmod(f, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD)
except:
pass
for i in range(tries-1):
try:
return os.remove(f)
except EnvironmentError, le:
# XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open) and execute the superkludge only in this case.
if not os.path.exists(f):
return
log.msg("XXX KLUDGE Attempting to remove file %s; got %s; sleeping %s seconds" % (f, le, basedelay,))
time.sleep(basedelay)
basedelay *= 2
return os.remove(f) # The last try.
class _Dir(object):
"""
Hold a set of files and subdirs and clean them all up when asked to.
"""
def __init__(self, name, cleanup=True):
self.name = name
self.cleanup = cleanup
self.files = set()
self.subdirs = set()
def file(self, fname, mode=None):
"""
Create a file in the tempdir and remember it so as to close() it
before attempting to cleanup the temp dir.
@rtype: file
"""
ffn = os.path.join(self.name, fname)
if mode is not None:
fo = open(ffn, mode)
else:
fo = open(ffn)
self.register_file(fo)
return fo
def subdir(self, dirname):
"""
Create a subdirectory in the tempdir and remember it so as to call
shutdown() on it before attempting to clean up.
@rtype: _Dir instance
"""
ffn = os.path.join(self.name, dirname)
sd = _Dir(ffn, self.cleanup)
self.register_subdir(sd)
make_dirs(ffn)
return sd
def register_file(self, fileobj):
"""
Remember the file object and call close() on it before attempting to
clean up.
"""
self.files.add(fileobj)
def register_subdir(self, dirobj):
"""
Remember the _Dir object and call shutdown() on it before attempting
to clean up.
"""
self.subdirs.add(dirobj)
def shutdown(self):
if self.cleanup:
for subdir in hasattr(self, 'subdirs') and self.subdirs or []:
subdir.shutdown()
for fileobj in hasattr(self, 'files') and self.files or []:
fileobj.close() # "close()" is idempotent so we don't need to catch exceptions here
if hasattr(self, 'name'):
rm_dir(self.name)
def __repr__(self):
return "<%s instance at %x %s>" % (self.__class__.__name__, id(self), self.name)
def __str__(self):
return self.__repr__()
def __del__(self):
try:
self.shutdown()
except:
import traceback
traceback.print_exc()
class NamedTemporaryDirectory(_Dir):
"""
Call tempfile.mkdtemp(), store the name of the dir in self.name, and
rm_dir() when it gets garbage collected or "shutdown()".
Also keep track of file objects for files within the tempdir and call
close() on them before rm_dir(). This is a convenient way to open temp
files within the directory, and it is very helpful on Windows because you
can't delete a directory which contains a file which is currently open.
"""
def __init__(self, cleanup=True, *args, **kwargs):
""" If cleanup, then the directory will be rmrf'ed when the object is shutdown. """
name = tempfile.mkdtemp(*args, **kwargs)
_Dir.__init__(self, name, cleanup)
def make_dirs(dirname, mode=0777, strictmode=False):
"""
A threadsafe and idempotent version of os.makedirs(). If the dir already
exists, do nothing and return without raising an exception. If this call
creates the dir, return without raising an exception. If there is an
error that prevents creation or if the directory gets deleted after
make_dirs() creates it and before make_dirs() checks that it exists, raise
an exception.
@param strictmode if true, then make_dirs() will raise an exception if the
directory doesn't have the desired mode. For example, if the
directory already exists, and has a different mode than the one
specified by the mode parameter, then if strictmode is true,
make_dirs() will raise an exception, else it will ignore the
discrepancy.
"""
tx = None
try:
os.makedirs(dirname, mode)
except OSError, x:
tx = x
if not os.path.isdir(dirname):
if tx:
raise tx
raise exceptions.IOError, "unknown error prevented creation of directory, or deleted the directory immediately after creation: %s" % dirname # careful not to construct an IOError with a 2-tuple, as that has a special meaning...
tx = None
if hasattr(os, 'chmod'):
try:
os.chmod(dirname, mode)
except OSError, x:
tx = x
if strictmode and hasattr(os, 'stat'):
s = os.stat(dirname)
resmode = stat.S_IMODE(s.st_mode)
if resmode != mode:
if tx:
raise tx
raise exceptions.IOError, "unknown error prevented setting correct mode of directory, or changed mode of the directory immediately after creation. dirname: %s, mode: %04o, resmode: %04o" % (dirname, mode, resmode,) # careful not to construct an IOError with a 2-tuple, as that has a special meaning...
def rm_dir(dirname):
"""
A threadsafe and idempotent version of shutil.rmtree(). If the dir is
already gone, do nothing and return without raising an exception. If this
call removes the dir, return without raising an exception. If there is an
error that prevents deletion or if the directory gets created again after
rm_dir() deletes it and before rm_dir() checks that it is gone, raise an
exception.
"""
excs = []
try:
os.chmod(dirname, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD)
for f in os.listdir(dirname):
fullname = os.path.join(dirname, f)
if os.path.isdir(fullname):
rm_dir(fullname)
else:
remove(fullname)
os.rmdir(dirname)
except Exception, le:
# Ignore "No such file or directory"
if (not isinstance(le, OSError)) or le.args[0] != 2:
excs.append(le)
# Okay, now we've recursively removed everything, ignoring any "No
# such file or directory" errors, and collecting any other errors.
if os.path.exists(dirname):
if len(excs) == 1:
raise excs[0]
if len(excs) == 0:
raise OSError, "Failed to remove dir for unknown reason."
raise OSError, excs
def remove_if_possible(f):
try:
remove(f)
except:
pass
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.
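The retry pattern shared by rename() and remove() above can be sketched generically. This is a Python 3 illustration, not the module's API: retry and backoff_delays are hypothetical names, and the sleep function is injectable only so that the behaviour can be checked without actually waiting. As in the original, the operation is attempted tries times, the delay doubles after each failure, and the last attempt lets any exception propagate.

```python
import time


def backoff_delays(tries=4, basedelay=0.1):
    """Delays slept between attempts: basedelay, 2*basedelay, 4*basedelay, ..."""
    return [basedelay * 2 ** i for i in range(tries - 1)]


def retry(operation, tries=4, basedelay=0.1, sleep=time.sleep):
    """Call operation() up to `tries` times, sleeping longer after each failure."""
    for delay in backoff_delays(tries, basedelay):
        try:
            return operation()
        except EnvironmentError:  # an alias of OSError in Python 3
            sleep(delay)
    return operation()  # the last try; an exception here reaches the caller


if __name__ == "__main__":
    # Worst case with the defaults: three sleeps before the final attempt.
    print(backoff_delays())
```

With the default arguments the sleeps are 0.1, 0.2 and 0.4 seconds, so the total time spent sleeping is bounded by roughly 0.7 seconds.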

View File

@ -1,91 +0,0 @@
"""
A few commonly needed functions.
"""
import math
def div_ceil(n, d):
"""
The smallest integer k such that k*d >= n.
"""
return (n//d) + (n%d != 0)
def next_multiple(n, k):
"""
The smallest multiple of k which is >= n.
"""
return div_ceil(n, k) * k
def pad_size(n, k):
"""
The smallest non-negative number that has to be added to n to make the sum a multiple of k.
"""
if n%k:
return k - n%k
else:
return 0
def is_power_of_k(n, k):
return k**int(math.log(n, k) + 0.5) == n
def next_power_of_k(n, k):
p = 1
while p < n:
p *= k
return p
def ave(l):
return sum(l) / len(l)
def log_ceil(n, b):
"""
The smallest integer k such that b^k >= n.
log_ceil(n, 2) is the number of bits needed to store any of n values, e.g.
the number of bits needed to store any of 128 possible values is 7.
"""
p = 1
k = 0
while p < n:
p *= b
k += 1
return k
def permute(l):
"""
Return all possible permutations of l.
@type l: sequence
@rtype: a list of sequences
"""
if len(l) == 1:
return [l,]
res = []
for i in range(len(l)):
l2 = list(l[:])
x = l2.pop(i)
for l3 in permute(l2):
l3.append(x)
res.append(l3)
return res
# zfec -- fast forward error correction library with Python interface
#
# Copyright (C) 2007 Allmydata, Inc.
# Author: Zooko Wilcox-O'Hearn
#
# This file is part of zfec.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version, with the added permission that, if you become obligated
# to release a derived work under this licence (as per section 2.b), you may
# delay the fulfillment of this obligation for up to 12 months. See the file
# COPYING for details.
#
# If you would like to inquire about a commercial relationship with Allmydata,
# Inc., please contact partnerships@allmydata.com and visit
# http://allmydata.com/.

@ -1,139 +0,0 @@
# Copyright (c) 2004-2007 Bryce "Zooko" Wilcox-O'Hearn
# mailto:zooko@zooko.com
# http://zooko.com/repos/pyutil
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this work to deal in this work without restriction (including the rights
# to use, modify, distribute, sublicense, and/or sell copies).
"""
extended version number class
"""
from distutils import version
# End users see version strings like this:
# "1.0.0"
#  ^ ^ ^
#  | | |
#  | | '- micro version number
#  | '- minor version number
#  '- major version number
# The first number is the "major version number". The second number is the
# "minor version number" -- it gets bumped whenever we make a new release that
# adds or changes functionality. The third number is the "micro version
# number" -- it gets bumped whenever we make a new release that doesn't add or
# change functionality, but just fixes bugs (including performance issues).
# Early-adopter end users see version strings like this:
# "1.0.0a1"
#  ^ ^ ^^^
#  | | |||
#  | | ||'- release number
#  | | |'- alpha or beta (or none)
#  | | '- micro version number
#  | '- minor version number
#  '- major version number
# The optional "a" or "b" stands for "alpha release" or "beta release"
# respectively. The number after "a" or "b" gets bumped every time we
# make a new alpha or beta release. This has the same form and the same
# meaning as version numbers of releases of Python.
# Developers see "full version strings", like this:
# "1.0.0a1-55-UNSTABLE"
#  ^ ^ ^^^ ^  ^
#  | | ||| |  |
#  | | ||| |  '- tags
#  | | ||| '- nano version number
#  | | ||'- release number
#  | | |'- alpha or beta (or none)
#  | | '- micro version number
#  | '- minor version number
#  '- major version number
# The next number is the "nano version number". It is meaningful only to
# developers. It gets bumped whenever a developer changes anything that another
# developer might care about.
# The last part is the "tags" separated by "_". Standard tags are
# "STABLE" and "UNSTABLE".
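# A minimal sketch of the "full version string" layout described above,
# assuming exactly three dotted numbers, an optional a/b release marker, and
# an optional "-nano-tags" suffix. FULL_VERSION_RE and split_full_version are
# hypothetical names; the class that follows uses distutils' StrictVersion
# for the first part rather than a regular expression.

```python
import re

# Assumed pattern: major.minor.micro[a|b release][-nano-TAG_TAG...]
FULL_VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<micro>\d+)"
    r"(?:(?P<stage>[ab])(?P<release>\d+))?"
    r"(?:-(?P<nano>\d+)-(?P<tags>[A-Za-z_]+))?$"
)

def split_full_version(vstring):
    """Split a version string into the named parts from the diagrams above."""
    m = FULL_VERSION_RE.match(vstring)
    if m is None:
        raise ValueError("unrecognized version string: %r" % (vstring,))
    parts = m.groupdict()
    # Numeric fields become ints; absent optional fields stay None.
    for key in ("major", "minor", "micro", "release", "nano"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    # Tags are separated by "_"; no suffix means no tags.
    parts["tags"] = parts["tags"].split("_") if parts["tags"] else []
    return parts

parts = split_full_version("1.0.0a1-55-UNSTABLE")
assert (parts["major"], parts["minor"], parts["micro"]) == (1, 0, 0)
assert (parts["stage"], parts["release"]) == ("a", 1)
assert parts["nano"] == 55
assert parts["tags"] == ["UNSTABLE"]
```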
class Tag(str):
    def __cmp__(t1, t2):
        if t1 == t2:
            return 0
        if t1 == "UNSTABLE" and t2 == "STABLE":
            return 1
        if t1 == "STABLE" and t2 == "UNSTABLE":
            return -1
        return -2 # who knows
class Version:
    def __init__(self, vstring=None):
        self.major = None
        self.minor = None
        self.micro = None
        self.prereleasetag = None
        self.nano = None
        self.tags = None
        if vstring:
            self.parse(vstring)
    def parse(self, vstring):
        # str.find returns -1 (which is truthy) when '-' is absent, so compare
        # against -1 explicitly instead of truth-testing the index.
        i = vstring.find('-')
        if i != -1:
            svstring = vstring[:i]
            estring = vstring[i+1:]
        else:
            svstring = vstring
            estring = None
        self.strictversion = version.StrictVersion(svstring)
        self.major = self.strictversion.version[0]
        self.minor = self.strictversion.version[1]
        self.micro = self.strictversion.version[2]
        self.prereleasetag = self.strictversion.prerelease
        if estring:
            try:
                (self.nano, tags,) = estring.split('-')
            except:
                print estring
                raise
            self.tags = map(Tag, tags.split('_'))
            self.tags.sort()
            self.fullstr = '-'.join([str(self.strictversion), str(self.nano), '_'.join(self.tags)])
        else:
            # No nano/tags suffix: the full string is just the strict version.
            self.fullstr = str(self.strictversion)
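    # Note: on instances this method is shadowed by the self.tags attribute
    # assigned in __init__, so v.tags yields the attribute, not this method.
    def tags(self):
        return self.tags
    def user_str(self):
        return self.strictversion.__str__()
    def full_str(self):
        return self.fullstr
    def __str__(self):
        return self.full_str()
    def __repr__(self):
        return self.__str__()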
    def __cmp__ (self, other):
        if isinstance(other, basestring):
            other = Version(other)
        res = cmp(self.strictversion, other.strictversion)
        if res != 0:
            return res
        res = cmp(self.nano, other.nano)
        if res != 0:
            return res
        return cmp(self.tags, other.tags)
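# parse() locates the first '-' with str.find, which returns -1 (a truthy
# value) when the separator is missing. A hedged sketch of an alternative
# split using str.partition; split_version is a hypothetical helper and is
# not part of the original module.

```python
def split_version(vstring):
    # str.partition returns (head, sep, tail); sep is '' when '-' is absent.
    head, sep, tail = vstring.partition('-')
    return head, (tail if sep else None)

# str.find signals "not found" with -1, which is truthy.
assert "1.0.0".find('-') == -1
assert bool("1.0.0".find('-')) is True

# A plain version string has no developer suffix.
assert split_version("1.0.0") == ("1.0.0", None)
# A full version string splits at the first '-' only.
assert split_version("1.0.0a1-55-UNSTABLE") == ("1.0.0a1", "55-UNSTABLE")
```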