lzma_.index

Handling of .xz Index and related information

Source:

Author:
Lasse Collin (original liblzma author), Johannes Pfau (D bindings)

License:
public domain

struct lzma_index;

Opaque data type to hold the Index(es) and other information

lzma_index often holds just one .xz Index and possibly the Stream Flags of the same Stream and size of the Stream Padding field. However, multiple lzma_indexes can be concatenated with lzma_index_cat() and then there may be information about multiple Streams in the same lzma_index.

Notes about thread safety: Only one thread may modify lzma_index at a time. All functions that take a non-const pointer to lzma_index modify it. As long as no thread is modifying the lzma_index, information can be read from the same lzma_index by multiple threads at the same time, using functions that take a const pointer to lzma_index or that use lzma_index_iter. Each iterator must be used by only one thread at a time, but there can be as many iterators for the same lzma_index as needed.

struct lzma_index_iter;

Iterator to get information about Blocks and Streams

StreamStruct stream;
BlockStruct block;

enum lzma_index_iter_mode;

Operation mode for lzma_index_iter_next()

LZMA_INDEX_ITER_ANY

Get the next Block or Stream

Go to the next Block if the current Stream has at least one Block left. Otherwise go to the next Stream even if it has no Blocks. If the Stream has no Blocks (lzma_index_iter.stream.block_count == 0), lzma_index_iter.block will have undefined values.

LZMA_INDEX_ITER_STREAM

Get the next Stream

Go to the next Stream even if the current Stream has unread Blocks left. If the next Stream has at least one Block, the iterator will point to the first Block. If there are no Blocks, lzma_index_iter.block will have undefined values.

LZMA_INDEX_ITER_BLOCK

Get the next Block

Go to the next Block if the current Stream has at least one Block left. If the current Stream has no Blocks left, the next Stream with at least one Block is located and the iterator will be made to point to the first Block of that Stream.

LZMA_INDEX_ITER_NONEMPTY_BLOCK

Get the next non-empty Block

This is like LZMA_INDEX_ITER_BLOCK except that it will skip Blocks whose Uncompressed Size is zero.

ulong lzma_index_memusage(lzma_vli streams, lzma_vli blocks);

Calculate memory usage of lzma_index

On disk, the size of the Index field depends on both the number of Records stored and how large the values in those Records are (due to variable-length integer encoding). When the Index is kept in an lzma_index structure, the memory usage depends only on the number of Records/Blocks stored in the Index(es) and, in the case of concatenated lzma_indexes, the number of Streams. The size in RAM is almost always significantly bigger than the encoded form on disk.

This function calculates an approximate amount of memory needed to hold the given number of Streams and Blocks in an lzma_index structure. This value may vary between CPU architectures and also between liblzma versions if the internal implementation is modified.

ulong lzma_index_memused(const lzma_index* i);

Calculate the memory usage of an existing lzma_index

This is a shorthand for lzma_index_memusage(lzma_index_stream_count(i), lzma_index_block_count(i)).

lzma_index* lzma_index_init(lzma_allocator* allocator);

Allocate and initialize a new lzma_index structure

Returns:
On success, a pointer to an empty initialized lzma_index is returned. If allocation fails, NULL is returned.

void lzma_index_end(lzma_index* i, lzma_allocator* allocator);

Deallocate lzma_index

If i is NULL, this does nothing.

lzma_ret lzma_index_append(lzma_index* i, lzma_allocator* allocator, lzma_vli unpadded_size, lzma_vli uncompressed_size);

Add a new Block to lzma_index

Parameters:

lzma_index* i	Pointer to a lzma_index structure
lzma_allocator* allocator	Pointer to lzma_allocator, or NULL to use malloc()
lzma_vli unpadded_size	Unpadded Size of a Block. This can be calculated with lzma_block_unpadded_size() after encoding or decoding the Block.
lzma_vli uncompressed_size	Uncompressed Size of a Block. This can be taken directly from lzma_block structure after encoding or decoding the Block.

Appending a new Block does not invalidate iterators. For example, if an iterator was pointing to the end of the lzma_index, after lzma_index_append() it is possible to read the next Block with an existing iterator.

Returns:
- LZMA_OK
- LZMA_MEM_ERROR
- LZMA_DATA_ERROR: Compressed or uncompressed size of the Stream or size of the Index field would grow too big.
- LZMA_PROG_ERROR

lzma_ret lzma_index_stream_flags(lzma_index* i, const lzma_stream_flags* stream_flags);

Set the Stream Flags

Set the Stream Flags of the last (and typically the only) Stream in lzma_index. This can be useful when reading information from the lzma_index, because the integrity check type must be known in order to decode Blocks.

The given Stream Flags are copied into an internal preallocated structure in the lzma_index, so the caller doesn't need to keep *stream_flags available after calling this function.

Returns:
- LZMA_OK
- LZMA_OPTIONS_ERROR: Unsupported stream_flags->version.
- LZMA_PROG_ERROR

uint lzma_index_checks(const lzma_index* i);

Get the types of integrity Checks

If lzma_index_stream_flags() is used to set the Stream Flags for every Stream, lzma_index_checks() can be used to get a bitmask that indicates which Check types have been used. This can be useful, for example, when showing the Check types to the user.

Each Check type sets the bit 1 << check_id in the bitmask, e.g. CRC32 is 1 << 1 and SHA-256 is 1 << 10.
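Testing a bit in such a mask is plain integer arithmetic, so it can be sketched without calling the library at all. In this minimal C sketch the helper mask_has_check and the enum constants are hypothetical names; only the two Check IDs quoted above (1 for CRC32, 10 for SHA-256) are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check IDs quoted in the text above: CRC32 is 1 and SHA-256 is 10. */
enum { CHECK_ID_CRC32 = 1, CHECK_ID_SHA256 = 10 };

/* Hypothetical helper: true if `mask` reports that `check_id` was used. */
static bool mask_has_check(uint32_t mask, unsigned check_id)
{
    assert(check_id < 32); /* keep the shift within the width of the mask */
    return ((mask >> check_id) & 1u) != 0;
}
```

In real use the mask argument would be the value returned by lzma_index_checks().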

lzma_ret lzma_index_stream_padding(lzma_index* i, lzma_vli stream_padding);

Set the amount of Stream Padding

Set the amount of Stream Padding of the last (and typically the only) Stream in the lzma_index. This is needed when planning to do random-access reading within multiple concatenated Streams.

By default, the amount of Stream Padding is assumed to be zero bytes.

Returns:
- LZMA_OK
- LZMA_DATA_ERROR: The file size would grow too big.
- LZMA_PROG_ERROR

lzma_vli lzma_index_stream_count(const lzma_index* i);

Get the number of Streams

lzma_vli lzma_index_block_count(const lzma_index* i);

Get the number of Blocks

This returns the total number of Blocks in lzma_index. To get the number of Blocks in individual Streams, use lzma_index_iter.

lzma_vli lzma_index_size(const lzma_index* i);

Get the size of the Index field as bytes

This is needed to verify the Backward Size field in the Stream Footer.

lzma_vli lzma_index_stream_size(const lzma_index* i);

Get the total size of the Stream

If multiple lzma_indexes have been combined, this works as if the Blocks were in a single Stream. This is useful if you are going to combine Blocks from multiple Streams into a single new Stream.

lzma_vli lzma_index_total_size(const lzma_index* i);

Get the total size of the Blocks

This doesn't include the Stream Header, Stream Footer, Stream Padding, or Index fields.

lzma_vli lzma_index_file_size(const lzma_index* i);

Get the total size of the file

When no lzma_indexes have been combined with lzma_index_cat() and there is no Stream Padding, this function is identical to lzma_index_stream_size(). If multiple lzma_indexes have been combined, this also includes the headers of each separate Stream and any Stream Padding fields.

lzma_vli lzma_index_uncompressed_size(const lzma_index* i);

Get the uncompressed size of the file

void lzma_index_iter_init(lzma_index_iter* iter, const lzma_index* i);

Initialize an iterator

Parameters:

lzma_index_iter* iter	Pointer to a lzma_index_iter structure
lzma_index* i	lzma_index to which the iterator will be associated

This function associates the iterator with the given lzma_index, and calls lzma_index_iter_rewind() on the iterator.

This function doesn't allocate any memory, thus there is no lzma_index_iter_end(). The iterator is valid as long as the associated lzma_index is valid, that is, until lzma_index_end() or using it as source in lzma_index_cat(). Specifically, lzma_index doesn't become invalid if new Blocks are added to it with lzma_index_append() or if it is used as the destination in lzma_index_cat().

It is safe to make copies of an initialized lzma_index_iter, for example, to easily restart reading at some particular position.

void lzma_index_iter_rewind(lzma_index_iter* iter);

Rewind the iterator

Rewind the iterator so that the next call to lzma_index_iter_next() will return the first Block or Stream.

lzma_bool lzma_index_iter_next(lzma_index_iter* iter, lzma_index_iter_mode mode);

Get the next Block or Stream

Parameters:

lzma_index_iter* iter	Iterator initialized with lzma_index_iter_init()
lzma_index_iter_mode mode	Specify what kind of information the caller wants to get. See lzma_index_iter_mode for details.

Returns:
If the next Block or Stream matching the mode was found, *iter is updated and this function returns false. If no Block or Stream matching the mode is found, or if mode is set to an unknown value, *iter is not modified and this function returns true.

lzma_bool lzma_index_iter_locate(lzma_index_iter* iter, lzma_vli target);

Locate a Block

If it is possible to seek in the .xz file, it is possible to parse the Index field(s) and use lzma_index_iter_locate() to do random-access reading with the granularity of the Block size.

Parameters:

lzma_index_iter* iter	Iterator that was earlier initialized with lzma_index_iter_init().
lzma_vli target	Uncompressed target offset which the caller would like to locate from the Stream

If the target is smaller than the uncompressed size of the Stream (can be checked with lzma_index_uncompressed_size()):
- Information about the Stream and Block containing the requested uncompressed offset is stored into *iter.
- Internal state of the iterator is adjusted so that lzma_index_iter_next() can be used to read subsequent Blocks or Streams.
- This function returns false.

If the target is greater than or equal to the uncompressed size of the Stream, *iter is not modified, and this function returns true.

lzma_ret lzma_index_cat(lzma_index* dest, lzma_index* src, lzma_allocator* allocator);

Concatenate lzma_indexes

Concatenating lzma_indexes is useful when doing random-access reading in a multi-Stream .xz file, or when combining multiple Streams into a single Stream.

Parameters:

lzma_index* dest	lzma_index after which src is appended
lzma_index* src	lzma_index to be appended after dest. If this function succeeds, the memory allocated for src is freed or moved to be part of dest, and all iterators pointing to src will become invalid.
lzma_allocator* allocator	Custom memory allocator; can be NULL to use malloc() and free().

Returns:
- LZMA_OK: lzma_indexes were concatenated successfully. src is now a dangling pointer.
- LZMA_DATA_ERROR: *dest would grow too big.
- LZMA_MEM_ERROR
- LZMA_PROG_ERROR

lzma_index* lzma_index_dup(const lzma_index* i, lzma_allocator* allocator);

Duplicate lzma_index

Returns:
A copy of the lzma_index, or NULL if memory allocation failed.

lzma_ret lzma_index_encoder(lzma_stream* strm, const lzma_index* i);

Initialize .xz Index encoder

Parameters:

lzma_stream* strm	Pointer to properly prepared lzma_stream
lzma_index* i	Pointer to lzma_index which should be encoded.

The valid `action' values for lzma_code() are LZMA_RUN and LZMA_FINISH. It is enough to use only one of them (you can choose freely; use LZMA_RUN to support liblzma versions older than 5.0.0).

Returns:
- LZMA_OK: Initialization succeeded, continue with lzma_code().
- LZMA_MEM_ERROR
- LZMA_PROG_ERROR

lzma_ret lzma_index_decoder(lzma_stream* strm, lzma_index** i, ulong memlimit);

Initialize .xz Index decoder

Parameters:

lzma_stream* strm	Pointer to properly prepared lzma_stream
lzma_index** i	The decoded Index will be made available via this pointer. Initially this function will set *i to NULL (the old value is ignored). If decoding succeeds (lzma_code() returns LZMA_STREAM_END), *i will be set to point to a new lzma_index, which the application has to later free with lzma_index_end().
ulong memlimit	How much memory the resulting lzma_index is allowed to require.

The valid `action' values for lzma_code() are LZMA_RUN and LZMA_FINISH. It is enough to use only one of them (you can choose freely; use LZMA_RUN to support liblzma versions older than 5.0.0).

Returns:
- LZMA_OK: Initialization succeeded, continue with lzma_code().
- LZMA_MEM_ERROR
- LZMA_MEMLIMIT_ERROR
- LZMA_PROG_ERROR

lzma_ret lzma_index_buffer_encode(const lzma_index* i, ubyte* out_, uint* out_pos, size_t out_size);

Single-call .xz Index encoder

Parameters:

lzma_index* i	lzma_index to be encoded
ubyte* out_	Beginning of the output buffer
uint* out_pos	The next byte will be written to out_[*out_pos]. *out_pos is updated only if encoding succeeds.
size_t out_size	Size of the out_ buffer; the first byte into which no data is written is out_[out_size].

Returns:
- LZMA_OK: Encoding was successful.
- LZMA_BUF_ERROR: Output buffer is too small. Use lzma_index_size() to find out how much output space is needed.
- LZMA_PROG_ERROR

Note:
This function doesn't take an allocator argument since all the internal data is allocated on the stack.

lzma_ret lzma_index_buffer_decode(lzma_index** i, ulong* memlimit, lzma_allocator* allocator, const ubyte* in_, uint* in_pos, size_t in_size);

Single-call .xz Index decoder

Parameters:

lzma_index** i	If decoding succeeds, *i will point to a new lzma_index, which the application has to later free with lzma_index_end(). If an error occurs, *i will be NULL. The old value of *i is always ignored and thus doesn't need to be initialized by the caller.
ulong* memlimit	Pointer to how much memory the resulting lzma_index is allowed to require. The value pointed by this pointer is modified if and only if LZMA_MEMLIMIT_ERROR is returned.
lzma_allocator* allocator	Pointer to lzma_allocator, or NULL to use malloc()
ubyte* in_	Beginning of the input buffer
uint* in_pos	The next byte will be read from in_[*in_pos]. *in_pos is updated only if decoding succeeds.
size_t in_size	Size of the input buffer; the first byte that won't be read is in_[in_size].

Returns:
- LZMA_OK: Decoding was successful.
- LZMA_MEM_ERROR
- LZMA_MEMLIMIT_ERROR: Memory usage limit was reached. The minimum required memlimit value was stored to *memlimit.
- LZMA_DATA_ERROR
- LZMA_PROG_ERROR
