# File formats {#page_formats}

There are two file formats used by `librsync` and `rdiff`: the
*signature* file, which summarizes a data file, and the *delta* file,
which describes the edits from one data file to another.

librsync does not know or care about the format of the data files themselves.

All integers are big-endian.

All librsync files start with a `uint32` magic number identifying them. These are declared in `librsync.h`:

    /** A delta file. At present, there's only one delta format. **/
    RS_DELTA_MAGIC          = 0x72730236,      /* r s \2 6 */

    /**
     * A signature file with MD4 signatures. Backward compatible with
     * librsync < 1.0, but strongly deprecated because it creates a security
     * vulnerability on files containing partly untrusted data. See
     * <https://github.com/librsync/librsync/issues/5>.
     **/
    RS_MD4_SIG_MAGIC        = 0x72730136,      /* r s \1 6 */

    /**
     * A signature file using the BLAKE2 hash. Supported from librsync 1.0.
     **/
    RS_BLAKE2_SIG_MAGIC     = 0x72730137       /* r s \1 7 */

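
As a rough illustration of the two rules above (big-endian integers, leading magic number), the following C sketch decodes the first four bytes of a file and reports which kind of librsync file it appears to be. The `enum file_kind` type, the `MAGIC_*` macros, and the `read_be32` and `classify_magic` helpers are hypothetical names introduced here, not part of the librsync API. The magic values are copied from the listing above so that the sketch compiles without `librsync.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the magic values (same values as RS_DELTA_MAGIC,
 * RS_MD4_SIG_MAGIC and RS_BLAKE2_SIG_MAGIC in librsync.h). */
#define MAGIC_DELTA      0x72730236u
#define MAGIC_SIG_MD4    0x72730136u
#define MAGIC_SIG_BLAKE2 0x72730137u

/* Hypothetical result type for this sketch. */
enum file_kind { KIND_UNKNOWN, KIND_DELTA, KIND_SIG_MD4, KIND_SIG_BLAKE2 };

/* Hypothetical helper: decode a big-endian u32 from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: classify a file by its first four bytes.
 * Buffers shorter than four bytes are reported as unknown. */
static enum file_kind classify_magic(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 4)
        return KIND_UNKNOWN;
    switch (read_be32(buf)) {
    case MAGIC_DELTA:      return KIND_DELTA;
    case MAGIC_SIG_MD4:    return KIND_SIG_MD4;
    case MAGIC_SIG_BLAKE2: return KIND_SIG_BLAKE2;
    default:               return KIND_UNKNOWN;
    }
}
```

Decoding byte by byte, rather than casting the buffer to `uint32_t *`, keeps the result independent of host endianness and alignment.
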
Signatures consist of a header followed by a number of block
signatures.

Each block signature gives signature hashes for one block of
`block_len` bytes from the input data file. The final data block
may be shorter. The number of blocks in the signature is therefore

    ceil(input_len/block_len)

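
The block count can be computed in integer arithmetic without floating point. The sketch below is one way to do it; `sig_block_count` is a hypothetical helper name, and the 64-bit type for `input_len` is an assumption made here to allow large inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of block signatures for an input of
 * input_len bytes split into blocks of block_len bytes. */
static uint64_t sig_block_count(uint64_t input_len, uint32_t block_len)
{
    assert(block_len > 0);
    /* Integer form of ceil(input_len / block_len). Adding the remainder
     * test avoids the overflow that input_len + block_len - 1 could hit. */
    return input_len / block_len + (input_len % block_len != 0);
}
```

For example, a 10000-byte input with `block_len` 2048 gives four full blocks plus one short block of 1808 bytes, so five block signatures.
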
The signature header is (see `rs_sig_s_header`):

    u32 magic;           // either RS_MD4_SIG_MAGIC or RS_BLAKE2_SIG_MAGIC
    u32 block_len;       // bytes per block
    u32 strong_sum_len;  // bytes per strong sum in each block

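
A minimal sketch of reading this 12-byte header from a buffer, assuming the three fields are stored back to back with no padding. `struct sig_header`, `read_be32`, and `parse_sig_header` are hypothetical names used only for illustration; librsync's own parsing code is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the signature header. */
struct sig_header {
    uint32_t magic;
    uint32_t block_len;
    uint32_t strong_sum_len;
};

/* Hypothetical helper: decode a big-endian u32 from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, or -1 if the buffer is shorter than 12 bytes
 * or the magic is neither of the two signature magics. */
static int parse_sig_header(const unsigned char *buf, size_t len,
                            struct sig_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 12)
        return -1;
    out->magic = read_be32(buf);
    out->block_len = read_be32(buf + 4);
    out->strong_sum_len = read_be32(buf + 8);
    if (out->magic != 0x72730136u &&   /* RS_MD4_SIG_MAGIC */
        out->magic != 0x72730137u)     /* RS_BLAKE2_SIG_MAGIC */
        return -1;
    return 0;
}
```
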
Each block signature contains a rolling (weak) checksum, used to find
moved data, and a strong hash, used to check that a match is correct.
The weak checksum is computed as in `rollsum.c`. The strong hash is
either MD4 or BLAKE2, depending on the magic number.

To make signatures smaller, at the cost of a greater chance of collisions,
the `strong_sum_len` in the header can be less than the full hash length.
The strong sum is then truncated after computation, keeping only its
leading (leftmost) `strong_sum_len` bytes.

Each block signature has the following format (see `rs_sig_do_block`):

    u32 weak_sum;
    u8[strong_sum_len] strong_sum;

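
The sketch below serializes one block signature. It assumes the layout is a big-endian `u32` weak sum followed by the strong sum, and that truncation keeps the leading bytes of the full hash. `write_block_sig` and its parameters are hypothetical names, and computing the weak and strong sums themselves is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one block signature into out, which must
 * have room for 4 + strong_sum_len bytes. full_sum holds the
 * untruncated strong hash of full_len bytes. Returns bytes written. */
static size_t write_block_sig(unsigned char *out, uint32_t weak_sum,
                              const unsigned char *full_sum, size_t full_len,
                              uint32_t strong_sum_len)
{
    assert(out != NULL && full_sum != NULL);
    assert(strong_sum_len <= full_len);

    /* Weak sum as a big-endian u32. */
    out[0] = (unsigned char)(weak_sum >> 24);
    out[1] = (unsigned char)(weak_sum >> 16);
    out[2] = (unsigned char)(weak_sum >> 8);
    out[3] = (unsigned char)weak_sum;

    /* Assumption: the truncated strong sum is the leading bytes. */
    memcpy(out + 4, full_sum, strong_sum_len);
    return 4 + (size_t)strong_sum_len;
}
```

Under these assumptions a complete signature file occupies `12 + block_count * (4 + strong_sum_len)` bytes: the three-field header plus one fixed-size entry per block.
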
TODO(https://github.com/librsync/librsync/issues/46): Document delta format.