.\" Author: Lasse Collin
.\" This file has been put into the public domain.
.\" You can do whatever you want with this file.
.TH XZ 1 "2017-04-19" "Tukaani" "XZ Utils"
xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files
.BR "xz \-\-decompress" .
.BR "xz \-\-decompress \-\-stdout" .
.BR "xz \-\-format=lzma" .
.BR "xz \-\-format=lzma \-\-decompress" .
.BR "xz \-\-format=lzma \-\-decompress \-\-stdout" .
When writing scripts that need to decompress files,
it is recommended to always use the name
with appropriate arguments
is a general-purpose data compression tool with
command line syntax similar to
The native file format is the
format, but the legacy
format used by LZMA Utils and
raw compressed streams with no container format headers
compresses or decompresses each
according to the selected operation mode.
reads from standard input and writes the processed data
will refuse (display an error and skip the
to write compressed data to standard output if it is a terminal.
will refuse to read compressed data
from standard input if it is a terminal.
are written to a new file whose name is derived from the source
When compressing, the suffix of the target file format
is appended to the source filename to get the target filename.
When decompressing, the
suffix is removed from the filename to get the target filename.
also recognizes the suffixes
and replaces them with the
If the target file already exists, an error is displayed and the
Unless writing to standard output,
will display a warning and skip the
if any of the following applies:
is not a regular file.
Symbolic links are not followed,
and thus they are not considered to be regular files.
has more than one hard link.
has setuid, setgid, or sticky bit set.
The operation mode is set to compress and the
already has a suffix of the target file format
when compressing to the
when compressing to the
The operation mode is set to decompress and the
doesn't have a suffix of any of the supported file formats
After successfully compressing or decompressing the
copies the owner, group, permissions, access time,
and modification time from the source
If copying the group fails, the permissions are modified
so that the target file doesn't become accessible to users
who didn't have permission to access the source
doesn't support copying other metadata like access control lists
or extended attributes yet.
Once the target file has been successfully closed, the source
is never removed if the output is written to standard output.
process makes it print progress information to standard error.
This has only limited use since when standard error
will display an automatically updating progress indicator.
varies from a few hundred kilobytes to several gigabytes
depending on the compression settings.
The settings used when compressing a file determine
the memory requirements of the decompressor.
Typically the decompressor needs 5\ % to 20\ % of
the amount of memory that the compressor needed when
For example, decompressing a file created with
currently requires 65\ MiB of memory.
Still, it is possible to have
files that require several gigabytes of memory to decompress.
Users of older systems in particular may find
the possibility of very large memory usage annoying.
To prevent uncomfortable surprises,
has a built-in memory usage limiter, which is disabled by default.
While some operating systems provide ways to limit
the memory usage of processes, relying on them
wasn't deemed to be flexible enough (e.g. using
to limit virtual memory tends to cripple
The memory usage limiter can be enabled with
the command line option \fB\-\-memlimit=\fIlimit\fR.
Often it is more convenient to enable the limiter
by default by setting the environment variable
.BR XZ_DEFAULTS=\-\-memlimit=150MiB .
It is possible to set the limits separately
for compression and decompression
by using \fB\-\-memlimit\-compress=\fIlimit\fR and
\fB\-\-memlimit\-decompress=\fIlimit\fR.
Using these two options outside
is rarely useful because a single run of
cannot do both compression and decompression and
.BI \-\-memlimit= limit
(or \fB\-M\fR \fIlimit\fR)
is shorter to type on the command line.
If the specified memory usage limit is exceeded when decompressing,
will display an error and decompressing the file will fail.
If the limit is exceeded when compressing,
will try to scale the settings down so that the limit
is no longer exceeded (except when using \fB\-\-format=raw\fR
or \fB\-\-no\-adjust\fR).
This way the operation won't fail unless the limit is very small.
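.PP
The decompression-side behavior of the limiter can be tried out
with a minimal sketch.
It assumes Python's standard lzma module, which is a binding to liblzma
and not the xz command itself; the helper name try_decompress and the
chosen limits are illustrative only.
.PP
.nf
```python
import lzma

MIB = 1024 * 1024
PAYLOAD = b"example data " * 100

# Preset 6 records an 8 MiB dictionary in the stream headers,
# regardless of how small the input is.
blob = lzma.compress(PAYLOAD, preset=6)


def try_decompress(data, limit_bytes):
    """Return the payload, or None if the memory limit is too low."""
    try:
        return lzma.decompress(data, memlimit=limit_bytes)
    except lzma.LZMAError:
        return None


small = try_decompress(blob, 1 * MIB)    # below what the decoder needs
large = try_decompress(blob, 32 * MIB)   # comfortably above it
```
.fi
.PP
With the lower limit the decoder refuses to run instead of
allocating more memory than allowed; with the higher limit
the original payload is returned.
.PP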
The scaling of the settings is done in steps that don't
match the compression level presets, e.g. if the limit is
only slightly less than the amount required for
the settings will be scaled down only a little,
not all the way down to
.SS "Concatenation and padding with .xz files"
It is possible to concatenate
will decompress such files as if they were a single
It is possible to insert padding between the concatenated parts
or after the last part.
The padding must consist of null bytes and the size
of the padding must be a multiple of four bytes.
This can be useful e.g. if the
file is stored on a medium that measures file sizes
Concatenation and padding are not allowed with
files or raw streams.
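.PP
The concatenation rule and the two padding constraints can be
illustrated with a short sketch.
It assumes Python's standard lzma module rather than the xz command,
and the helper is_valid_stream_padding is a hypothetical name
that only restates the rule given above.
.PP
.nf
```python
import lzma


def is_valid_stream_padding(padding):
    """Padding must be all null bytes and a multiple of four bytes long."""
    return len(padding) % 4 == 0 and padding.count(0) == len(padding)


first = lzma.compress(b"hello, ")
second = lzma.compress(b"world")

# Two complete .xz streams placed back to back decode as one payload.
joined = lzma.decompress(first + second)
```
.fi
.PP
Here joined holds the two payloads in order, as if a single
file had been compressed.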
.SS "Integer suffixes and special values"
In most places where an integer argument is expected,
an optional suffix is supported to easily indicate large integers.
There must be no space between the integer and the suffix.
Multiply the integer by 1,024 (2^10).
are accepted as synonyms for
Multiply the integer by 1,048,576 (2^20).
are accepted as synonyms for
Multiply the integer by 1,073,741,824 (2^30).
are accepted as synonyms for
can be used to indicate the maximum integer value
supported by the option.
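.PP
The multipliers above can be sketched as a small parser.
This is an illustration only, not the parser used by the program:
the function name parse_size is hypothetical, only the suffixes
KiB, MiB, and GiB are handled, and neither the synonyms
nor the special maximum value is covered.
.PP
.nf
```python
MULTIPLIERS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def parse_size(text):
    """Parse strings such as "150MiB" into a plain integer."""
    for suffix, factor in MULTIPLIERS.items():
        if text.endswith(suffix):
            digits = text[:-len(suffix)]
            break
    else:
        digits, factor = text, 1
    # No space is allowed between the integer and the suffix.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected digits directly followed by a suffix")
    return int(digits) * factor


limit = parse_size("150MiB")
```
.fi
.PP
A string such as "150 MiB" is rejected because of the space.
.PP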
If multiple operation mode options are given,
the last one takes effect.
.BR \-z ", " \-\-compress
This is the default operation mode when no operation mode option
is specified and no other operation mode is implied from
the command name (for example,
.BR \-\-decompress ).
.BR \-d ", " \-\-decompress ", " \-\-uncompress
.BR \-t ", " \-\-test
Test the integrity of compressed
This option is equivalent to
.B "\-\-decompress \-\-stdout"
except that the decompressed data is discarded instead of being
written to standard output.
No files are created or removed.
.BR \-l ", " \-\-list
Print information about compressed
No uncompressed output is produced,
and no files are created or removed.
In list mode, the program cannot read
the compressed data from standard
input or from other unseekable sources.
The default listing shows basic information about
To get more detailed information, use also the
For even more information, use
twice, but note that this may be slow, because getting all the extra
information requires many seeks.
The width of verbose output exceeds
80 characters, so piping the output to e.g.\&
may be convenient if the terminal isn't wide enough.
The exact output may vary between
versions and different locales.
For machine-readable output,
.B \-\-robot \-\-list
.SS "Operation modifiers"
.BR \-k ", " \-\-keep
Don't delete the input files.
.BR \-f ", " \-\-force
This option has several effects:
If the target file already exists,
delete it before compressing or decompressing.
Compress or decompress even if the input is
a symbolic link to a regular file,
has more than one hard link,
or has the setuid, setgid, or sticky bit set.
The setuid, setgid, and sticky bits are not copied
cannot recognize the type of the source file,
copy the source file as is to standard output.
for files that have not been compressed with
might support new compressed file formats, which may make
decompress more types of files instead of copying them as is to
.BI \-\-format= format
can be used to restrict
to decompress only a single file format.
.BR \-c ", " \-\-stdout ", " \-\-to\-stdout
Write the compressed or decompressed data to
standard output instead of a file.
.B \-\-single\-stream
Decompress only the first
silently ignore possible remaining input data following the stream.
Normally such trailing garbage makes
never decompresses more than one stream from
files or raw streams, but this option still makes
ignore the possible trailing data after the
This option has no effect if the operation mode is not
Disable creation of sparse files.
By default, if decompressing into a regular file,
tries to make the file sparse if the decompressed data contains
long sequences of binary zeros.
It also works when writing to standard output
as long as standard output is connected to a regular file
and certain additional conditions are met to make it safe.
Creating sparse files may save disk space and speed up
the decompression by reducing the amount of disk I/O.
\fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf
When compressing, use
as the suffix for the target file instead of
If not writing to standard output and
the source file already has the suffix
a warning is displayed and the file is skipped.
When decompressing, recognize files with the suffix
in addition to files with the
If the source file has the suffix
the suffix is removed to get the target filename.
When compressing or decompressing raw streams
.RB ( \-\-format=raw ),
the suffix must always be specified unless
writing to standard output,
because there is no default suffix for raw streams.
\fB\-\-files\fR[\fB=\fIfile\fR]
Read the filenames to process from
is omitted, filenames are read from standard input.
Filenames must be terminated with the newline character.
is taken as a regular filename; it doesn't mean standard input.
If filenames are given also as command line arguments, they are
processed before the filenames read from
\fB\-\-files0\fR[\fB=\fIfile\fR]
This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except
that each filename must be terminated with the null character.
.SS "Basic file format and compression options"
\fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat
to compress or decompress:
the format of the input file is automatically detected.
Note that raw streams (created with
cannot be auto-detected.
file format, or accept only
files when decompressing.
Compress to the legacy
file format, or accept only
files when decompressing.
is provided for backwards compatibility with LZMA Utils.
Compress or uncompress a raw stream (no headers).
This is meant for advanced users only.
To decode raw streams, you need to use
and explicitly specify the filter chain,
which normally would have been stored in the container headers.
\fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck
Specify the type of the integrity check.
The check is calculated from the uncompressed data and
This option has an effect only when compressing into the
format doesn't support integrity checks.
The integrity check (if any) is verified when the
file is decompressed.
Don't calculate an integrity check at all.
This is usually a bad idea.
This can be useful when integrity of the data is verified
by other means anyway.
Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet).
Calculate CRC64 using the polynomial from ECMA-182.
This is the default, since it is slightly better than CRC32
at detecting damaged files and the speed difference is negligible.
This is somewhat slower than CRC32 and CRC64.
headers is always verified with CRC32.
It is not possible to change or disable it.
Don't verify the integrity check of the compressed data when decompressing.
The CRC32 values in the
headers will still be verified normally.
.B "Do not use this option unless you know what you are doing."
Possible reasons to use this option:
Trying to recover data from a corrupt .xz file.
Speeding up decompression.
This matters mostly with SHA-256 or
with files that have compressed extremely well.
It's recommended to not use this option for this purpose
unless the file integrity is verified externally in some other way.
Select a compression preset level.
If multiple preset levels are specified,
the last one takes effect.
If a custom filter chain was already specified, setting
a compression preset level clears the custom filter chain.
The differences between the presets are more significant than with
The selected compression settings determine
the memory requirements of the decompressor,
thus using too high a preset level might make it painful
to decompress the file on an old system with little RAM.
.B "it's not a good idea to blindly use \-9 for everything"
like it often is with
.BR "\-0" " ... " "\-3"
These are somewhat fast presets.
is sometimes faster than
while compressing much better.
The higher ones often have speed comparable to
with comparable or better compression ratio,
depend a lot on the type of data being compressed.
.BR "\-4" " ... " "\-6"
Good to very good compression while keeping
decompressor memory usage reasonable even for old systems.
is the default, which is usually a good choice
e.g. for distributing files that need to be decompressible
even on systems with only 16\ MiB RAM.
may be worth considering too.
but with higher compressor and decompressor memory requirements.
These are useful only when compressing files bigger than
8\ MiB, 16\ MiB, and 32\ MiB, respectively.
On the same hardware, the decompression speed is approximately
a constant number of bytes of compressed data per second.
In other words, the better the compression,
the faster the decompression will usually be.
This also means that the amount of uncompressed output
produced per second can vary a lot.
The following table summarises the features of the presets:
.TS
tab(;);
c c c c c
n n n n n.
Preset;DictSize;CompCPU;CompMem;DecMem
\-0;256 KiB;0;3 MiB;1 MiB
\-1;1 MiB;1;9 MiB;2 MiB
\-2;2 MiB;2;17 MiB;3 MiB
\-3;4 MiB;3;32 MiB;5 MiB
\-4;4 MiB;4;48 MiB;5 MiB
\-5;8 MiB;5;94 MiB;9 MiB
\-6;8 MiB;6;94 MiB;9 MiB
\-7;16 MiB;6;186 MiB;17 MiB
\-8;32 MiB;6;370 MiB;33 MiB
\-9;64 MiB;6;674 MiB;65 MiB
.TE
DictSize is the LZMA2 dictionary size.
It is a waste of memory to use a dictionary bigger than
the size of the uncompressed file.
This is why it is good to avoid using the presets
when there's no real need for them.
and lower, the amount of memory wasted is
usually low enough to not matter.
CompCPU is a simplified representation of the LZMA2 settings
that affect compression speed.
The dictionary size affects speed too,
so while CompCPU is the same for levels
.BR \-6 " ... " \-9 ,
higher levels still tend to be a little slower.
To get even slower and thus possibly better compression, see
CompMem contains the compressor memory requirements
in the single-threaded mode.
It may vary slightly between
Memory requirements of some of the future multithreaded modes may
be dramatically higher than that of the single-threaded mode.
DecMem contains the decompressor memory requirements.
That is, the compression settings determine
the memory requirements of the decompressor.
The exact decompressor memory usage is slightly more than
the LZMA2 dictionary size, but the values in the table
have been rounded up to the next full MiB.
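.PP
The relation between the DictSize and DecMem columns can be
reproduced with a few lines of arithmetic.
The sketch below assumes that the extra memory on top of the
dictionary never pushes the total past the next full MiB;
the names PRESET_DICT and dec_mem_mib are illustrative only.
.PP
.nf
```python
MIB = 1024 * 1024

# Dictionary sizes from the DictSize column, in bytes.
PRESET_DICT = {0: 256 * 1024, 1: 1 * MIB, 2: 2 * MIB, 3: 4 * MIB,
               4: 4 * MIB, 5: 8 * MIB, 6: 8 * MIB, 7: 16 * MIB,
               8: 32 * MIB, 9: 64 * MIB}


def dec_mem_mib(dict_size):
    """Smallest whole number of MiB strictly above dict_size."""
    return dict_size // MIB + 1


table = {level: dec_mem_mib(size) for level, size in PRESET_DICT.items()}
```
.fi
.PP
Each value in table matches the DecMem column for
the corresponding preset.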
.BR \-e ", " \-\-extreme
Use a slower variant of the selected compression preset level
.RB ( \-0 " ... " \-9 )
to hopefully get a little bit better compression ratio,
but with bad luck this can also make it worse.
Decompressor memory usage is not affected,
but compressor memory usage increases a little at preset levels
.BR \-0 " ... " \-3 .
Since there are two presets with dictionary sizes
4\ MiB and 8\ MiB, the presets
use slightly faster settings (lower CompCPU) than
That way no two presets are identical.
.TS
tab(;);
c c c c c
n n n n n.
Preset;DictSize;CompCPU;CompMem;DecMem
\-0e;256 KiB;8;4 MiB;1 MiB
\-1e;1 MiB;8;13 MiB;2 MiB
\-2e;2 MiB;8;25 MiB;3 MiB
\-3e;4 MiB;7;48 MiB;5 MiB
\-4e;4 MiB;8;48 MiB;5 MiB
\-5e;8 MiB;7;94 MiB;9 MiB
\-6e;8 MiB;8;94 MiB;9 MiB
\-7e;16 MiB;8;186 MiB;17 MiB
\-8e;32 MiB;8;370 MiB;33 MiB
\-9e;64 MiB;8;674 MiB;65 MiB
.TE
For example, there are a total of four presets that use
an 8\ MiB dictionary, whose order from the fastest to the slowest is
These are somewhat misleading aliases for
These are provided only for backwards compatibility
Avoid using these options.
.BI \-\-block\-size= size
When compressing to the
format, split the input data into blocks of
The blocks are compressed independently from each other,
which helps with multi-threading and
makes limited random-access decompression possible.
This option is typically used to override the default
block size in multi-threaded mode,
but this option can be used in single-threaded mode too.
In multi-threaded mode about three times
bytes will be allocated in each thread for buffering input and output.
is three times the LZMA2 dictionary size or 1 MiB,
Typically a good value is 2\-4 times
the size of the LZMA2 dictionary or at least 1 MiB.
less than the LZMA2 dictionary size is a waste of RAM
because then the LZMA2 dictionary buffer will never get fully used.
The sizes of the blocks are stored in the block headers,
which a future version of
will use for multi-threaded decompression.
In single-threaded mode no block splitting is done by default.
Setting this option doesn't affect memory usage.
No size information is stored in block headers,
thus files created in single-threaded mode
won't be identical to files created in multi-threaded mode.
The lack of size information also means that a future version of
won't be able to decompress the files in multi-threaded mode.
.BI \-\-block\-list= sizes
When compressing to the
format, start a new block after
the given intervals of uncompressed data.
of the blocks are specified as a comma-separated list.
Omitting a size (two or more consecutive commas) is a shorthand
to use the size of the previous block.
If the input file is bigger than the sum of
is repeated until the end of the file.
may be used as the last value to indicate that
the rest of the file should be encoded as a single block.
that exceed the encoder's block size
(either the default value in threaded mode or
the value specified with \fB\-\-block\-size=\fIsize\fR),
the encoder will create additional blocks while
keeping the boundaries specified in
For example, if one specifies
.B \-\-block\-size=10MiB
.B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB
and the input file is 80 MiB,
one will get 11 blocks:
5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB.
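.PP
The arithmetic behind this example can be checked with a short sketch.
It models only the two rules stated above (the last list entry is
repeated, and oversized entries are cut at the encoder's block size);
the function name split_blocks is hypothetical, all sizes are
in MiB, and the special last value for "rest of the file" and
omitted list entries are not modeled.
.PP
.nf
```python
def split_blocks(total, block_size, block_list):
    """Return the block sizes produced for an input of `total` units."""
    if block_size <= 0 or not block_list or min(block_list) <= 0:
        raise ValueError("sizes must be positive")
    blocks = []
    remaining = total
    index = 0
    while remaining > 0:
        # Past the end of the list, the last value is repeated.
        wanted = block_list[min(index, len(block_list) - 1)]
        wanted = min(wanted, remaining)
        remaining -= wanted
        index += 1
        # Oversized entries are cut at the encoder block size,
        # which keeps the requested boundaries in place.
        while wanted > 0:
            piece = min(wanted, block_size)
            blocks.append(piece)
            wanted -= piece
    return blocks


sizes = split_blocks(80, 10, [5, 10, 8, 12, 24])
```
.fi
.PP
The resulting list has 11 entries and matches the sequence given above.
.PP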
In multi-threaded mode the sizes of the blocks
are stored in the block headers.
This isn't done in single-threaded mode,
so the encoded output won't be
identical to that of the multi-threaded mode.
.BI \-\-flush\-timeout= timeout
When compressing, if more than
milliseconds (a positive integer) have passed since the previous flush and
reading more input would block,
all the pending input data is flushed from the encoder and
made available in the output stream.
This can be useful if
is used to compress data that is streamed over a network.
values make the data available at the receiving end
with a small delay, but large
values give better compression ratio.
This feature is disabled by default.
If this option is specified more than once, the last one takes effect.
can be used to explicitly disable this feature.
This feature is not available on non-POSIX systems.
.B "This feature is still experimental."
is unsuitable for decompressing the stream in real time due to how
.BI \-\-memlimit\-compress= limit
Set a memory usage limit for compression.
If this option is specified multiple times,
the last one takes effect.
If the compression settings exceed the
will adjust the settings downwards so that
the limit is no longer exceeded and display a notice that
automatic adjustment was done.
Such adjustments are not made when compressing with
In those cases, an error is displayed and
will exit with exit status 1.
can be specified in multiple ways:
can be an absolute value in bytes.
Using an integer suffix like
.B "\-\-memlimit\-compress=80MiB"
can be specified as a percentage of total physical memory (RAM).
This can be useful especially when setting the
environment variable in a shell initialization script
that is shared between different computers.
That way the limit is automatically bigger
on systems with more memory.
.B "\-\-memlimit\-compress=70%"
can be reset back to its default value by setting it to
This is currently equivalent to setting the
(no memory usage limit).
Once multithreading support has been implemented,
there may be a difference between
for the multithreaded case, so it is recommended to use
until the details have been decided.
See also the section
.BR "Memory usage" .
.BI \-\-memlimit\-decompress= limit
Set a memory usage limit for decompression.
This also affects the
If the operation is not possible without exceeding the
will display an error and decompressing the file will fail.
.BI \-\-memlimit\-compress= limit
for possible ways to specify the
\fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit
This is equivalent to specifying \fB\-\-memlimit\-compress=\fIlimit
\fB\-\-memlimit\-decompress=\fIlimit\fR.
Display an error and exit if the compression settings exceed
the memory usage limit.
The default is to adjust the settings downwards so
that the memory usage limit is not exceeded.
Automatic adjusting is always disabled when creating raw streams
.RB ( \-\-format=raw ).
\fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads
Specify the number of worker threads to use.
use as many threads as there are CPU cores on the system.
The actual number of threads can be less than
if the input file is not big enough
for threading with the given settings or
if using more threads would exceed the memory usage limit.
Currently the only threading method is to split the input into
blocks and compress them independently from each other.
The default block size depends on the compression level and
can be overridden with the
.BI \-\-block\-size= size
Threaded decompression hasn't been implemented yet.
It will only work on files that contain multiple blocks
with size information in block headers.
All files compressed in multi-threaded mode meet this condition,
but files compressed in single-threaded mode don't even if
.BI \-\-block\-size= size
.SS "Custom compressor filter chains"
A custom filter chain allows specifying
the compression settings in detail instead of relying on
the settings associated with the presets.
When a custom filter chain is specified,
preset options (\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR)
earlier on the command line are forgotten.
If a preset option is specified
after one or more custom filter chain options,
the new preset takes effect and
the custom filter chain options specified earlier are forgotten.
A filter chain is comparable to piping on the command line.
When compressing, the uncompressed input goes to the first filter,
whose output goes to the next filter (if any).
The output of the last filter gets written to the compressed file.
The maximum number of filters in the chain is four,
but typically a filter chain has only one or two filters.
Many filters have limitations on where they can be
in the filter chain:
some filters can work only as the last filter in the chain,
some only as a non-last filter, and some work in any position
Depending on the filter, this limitation is either inherent to
the filter design or exists to prevent security issues.
A custom filter chain is specified by using one or more
filter options in the order they are wanted in the filter chain.
That is, the order of filter options is significant!
When decoding raw streams
.RB ( \-\-format=raw ),
the filter chain is specified in the same order as
it was specified when compressing.
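.PP
The requirement to repeat the filter chain when decoding a raw stream
can be seen in a small sketch.
It assumes Python's standard lzma module, which exposes liblzma
filter chains as lists of dictionaries; it does not show the command
line syntax, and the sample data is arbitrary.
.PP
.nf
```python
import lzma

# An explicit two-filter chain: a BCJ filter first, LZMA2 last.
chain = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

data = bytes(range(256)) * 64

# A raw stream has no headers, so the chain is not recorded in it.
raw = lzma.compress(data, format=lzma.FORMAT_RAW, filters=chain)

# Decoding needs the same chain, listed in the same order.
restored = lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=chain)
```
.fi
.PP
The decoder receives no hint about the filters from the data itself,
which is why the chain has to be supplied again.
.PP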
Filters take filter-specific
as a comma-separated list.
Every option has a default value, so you need to
specify only those you want to change.
To see the whole filter chain and
This works also for viewing the filter chain options used by presets.
\fB\-\-lzma1\fR[\fB=\fIoptions\fR]
\fB\-\-lzma2\fR[\fB=\fIoptions\fR]
Add LZMA1 or LZMA2 filter to the filter chain.
These filters can be used only as the last filter in the chain.
LZMA1 is a legacy filter,
which is supported almost solely due to the legacy
file format, which supports only LZMA1.
version of LZMA1 to fix some practical issues of LZMA1.
format uses LZMA2 and doesn't support LZMA1 at all.
Compression speed and ratios of LZMA1 and LZMA2
are practically the same.
LZMA1 and LZMA2 share the same set of
Reset all LZMA1 or LZMA2
consist of an integer, which may be followed by single-letter
The integer can be from
matching the command line options \fB\-0\fR ... \fB\-9\fR.
The only supported modifier is currently
is specified, the default values of LZMA1 or LZMA2
are taken from the preset
Dictionary (history buffer)
indicates how many bytes of the recently processed
uncompressed data are kept in memory.
The algorithm tries to find repeating byte sequences (matches) in
the uncompressed data, and replace them with references
to the data currently in the dictionary.
The bigger the dictionary, the higher is the chance
Thus, increasing dictionary
usually improves compression ratio, but
a dictionary bigger than the uncompressed file is a waste of memory.
is from 64\ KiB to 64\ MiB.
The minimum is 4\ KiB.
The maximum for compression is currently 1.5\ GiB (1536\ MiB).
The decompressor already supports dictionaries up to
one byte less than 4\ GiB, which is the maximum for
the LZMA1 and LZMA2 stream formats.
together determine the memory usage of the LZMA1 or LZMA2 encoder.
The same (or bigger) dictionary
is required for decompressing that was used when compressing,
thus the memory usage of the decoder is determined
by the dictionary size used when compressing.
headers store the dictionary
.RI "2^" n " + 2^(" n "\-1),"
are somewhat preferred for compression.
will get rounded up when stored in the
Specify the number of literal context bits.
The minimum is 0 and the maximum is 4; the default is 3.
In addition, the sum of
All bytes that cannot be encoded as matches
are encoded as literals.
That is, literals are simply 8-bit bytes
that are encoded one at a time.
The literal coding makes an assumption that the highest
bits of the previous uncompressed byte correlate
E.g. in typical English text, an upper-case letter is
often followed by a lower-case letter, and a lower-case
letter is usually followed by another lower-case letter.
In the US-ASCII character set, the highest three bits are 010
for upper-case letters and 011 for lower-case letters.
is at least 3, the literal coding can take advantage of
this property in the uncompressed data.
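.PP
The bit patterns mentioned above are easy to verify.
The sketch below is plain arithmetic on US-ASCII codes and
says nothing about how the encoder itself is implemented;
the helper name top_bits is illustrative only.
.PP
.nf
```python
def top_bits(byte, count=3):
    """Return the highest `count` bits of an 8-bit value."""
    return byte >> (8 - count)


upper = {top_bits(ord(c)) for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}
lower = {top_bits(ord(c)) for c in "abcdefghijklmnopqrstuvwxyz"}
```
.fi
.PP
Both sets contain a single value: binary 010 for the upper-case
letters and binary 011 for the lower-case letters, so three context
bits are enough to tell the two groups apart.
.PP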
The default value (3) is usually good.
If you want maximum compression, test
Sometimes it helps a little, and
sometimes it makes compression worse.
If it makes it worse, test e.g.\&
Specify the number of literal position bits.
The minimum is 0 and the maximum is 4; the default is 0.
affects what kind of alignment in the uncompressed data is
assumed when encoding literals.
below for more information about alignment.
Specify the number of position bits.
The minimum is 0 and the maximum is 4; the default is 2.
affects what kind of alignment in the uncompressed data is
The default means four-byte alignment
which is often a good choice when there's no better guess.
When the alignment is known, setting
accordingly may reduce the file size a little.
E.g. with text files having one-byte
alignment (US-ASCII, ISO-8859-*, UTF-8), setting
can improve compression slightly.
If the alignment is an odd number like 3 bytes,
might be the best choice.
Even though the assumed alignment can be adjusted with
LZMA1 and LZMA2 still slightly favor 16-byte alignment.
It might be worth taking into account when designing file formats
that are likely to be often compressed with LZMA1 or LZMA2.
Match finder has a major effect on encoder speed,
memory usage, and compression ratio.
Usually Hash Chain match finders are faster than Binary Tree
The default depends on the
The following match finders are supported.
The memory usage formulas below are rough approximations,
which are closest to the reality when
Hash Chain with 2- and 3-byte hashing
Hash Chain with 2-, 3-, and 4-byte hashing
Binary Tree with 2-byte hashing
Binary Tree with 2- and 3-byte hashing
Binary Tree with 2-, 3-, and 4-byte hashing
specifies the method to analyze
the data produced by the match finder.
is used with Hash Chain match finders and
with Binary Tree match finders.
This is also what the
Specify what is considered to be a nice length for a match.
Once a match of at least
bytes is found, the algorithm stops
looking for possibly better matches.
can be 2\-273 bytes.
Higher values tend to give better compression ratio
at the expense of speed.
The default depends on the
Specify the maximum search depth in the match finder.
The default is the special value of 0,
which makes the compressor determine a reasonable
for Hash Chains is 4\-100 and 16\-1000 for Binary Trees.
Using very high values for
can make the encoder extremely slow with some files.
over 1000 unless you are prepared to interrupt
the compression in case it is taking far too long.
When decoding raw streams
.RB ( \-\-format=raw ),
LZMA2 needs only the dictionary
1486 \fB\-\-x86\fR[\fB=\fIoptions\fR]
1489 \fB\-\-powerpc\fR[\fB=\fIoptions\fR]
1491 \fB\-\-ia64\fR[\fB=\fIoptions\fR]
1493 \fB\-\-arm\fR[\fB=\fIoptions\fR]
1495 \fB\-\-armthumb\fR[\fB=\fIoptions\fR]
1497 \fB\-\-sparc\fR[\fB=\fIoptions\fR]
1499 Add a branch/call/jump (BCJ) filter to the filter chain.
1500 These filters can be used only as a non-last filter
1501 in the filter chain.
1503 A BCJ filter converts relative addresses in
1504 the machine code to their absolute counterparts.
1505 This doesn't change the size of the data,
1506 but it increases redundancy,
1507 which can help LZMA2 to produce 0\-15\ % smaller
1510 The BCJ filters are always reversible,
1511 so using a BCJ filter for the wrong type of data
1512 doesn't cause any data loss, although it may make
1513 the compression ratio slightly worse.
1515 It is fine to apply a BCJ filter on a whole executable;
1516 there's no need to apply it only on the executable section.
1517 Applying a BCJ filter on an archive that contains both executable
1518 and non-executable files may or may not give good results,
1519 so it generally isn't good to blindly apply a BCJ filter when
1520 compressing binary packages for distribution.
1522 These BCJ filters are very fast and
1523 use an insignificant amount of memory.
1524 If a BCJ filter improves compression ratio of a file,
1525 it can improve decompression speed at the same time.
1526 This is because, on the same hardware,
1527 the decompression speed of LZMA2 is roughly
1528 a fixed number of bytes of compressed data per second.
1530 These BCJ filters have known problems related to
1531 the compression ratio:
1534 Some types of files containing executable code
1535 (e.g. object files, static libraries, and Linux kernel modules)
1536 have the addresses in the instructions filled with filler values.
1537 These BCJ filters will still do the address conversion,
1538 which will make the compression worse with these files.
1540 Applying a BCJ filter on an archive containing multiple similar
1541 executables can make the compression ratio worse than not using
1543 This is because the BCJ filter doesn't detect the boundaries
1544 of the executable files, and doesn't reset
1545 the address conversion counter for each executable.
1548 Both of the above problems will be fixed
1549 in the future in a new filter.
1550 The old BCJ filters will still be useful in embedded systems,
1551 because the decoder of the new filter will be bigger
1552 and use more memory.
1554 Different instruction sets have different alignment:
1562 Filter;Alignment;Notes
1563 x86;1;32-bit or 64-bit x86
1564 PowerPC;4;Big endian only
1565 ARM;4;Little endian only
1566 ARM-Thumb;2;Little endian only
1567 IA-64;16;Big or little endian
1568 SPARC;4;Big or little endian
1573 Since the BCJ-filtered data is usually compressed with LZMA2,
1574 the compression ratio may be improved slightly if
1575 the LZMA2 options are set to match the
1576 alignment of the selected BCJ filter.
1577 For example, with the IA-64 filter, it's good to set
1579 with LZMA2 (2^4=16).
1580 The x86 filter is an exception;
1581 it's usually good to stick to LZMA2's default
1582 four-byte alignment when compressing x86 executables.
1584 All BCJ filters support the same
1591 that is used when converting between relative
1592 and absolute addresses.
1595 must be a multiple of the alignment of the filter
1596 (see the table above).
1597 The default is zero.
1598 In practice, the default is good; specifying a custom
1600 is almost never useful.
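The alignment rule for the start offset can be checked mechanically.
The following is only a minimal sketch based on the alignment table above;
the function names
bcj_alignment and valid_start_offset
are hypothetical helpers and are not part of xz or liblzma.

```shell
# Print the alignment (in bytes) of a BCJ filter, as listed in the table.
# The filter names match the long option names without the leading dashes.
bcj_alignment() {
    case $1 in
        x86) echo 1 ;;
        powerpc|arm|sparc) echo 4 ;;
        armthumb) echo 2 ;;
        ia64) echo 16 ;;
        *) return 1 ;;
    esac
}

# Succeed if the offset ($2) is a multiple of the alignment of filter $1.
# Returns 2 if the filter name is not in the table.
valid_start_offset() {
    align=$(bcj_alignment "$1") || return 2
    [ $(( $2 % align )) -eq 0 ]
}

if valid_start_offset ia64 32; then
    echo "offset 32 is usable with the IA-64 filter"
fi
```

Since the default offset of zero is a multiple of every alignment,
a check like this matters only in the rare case that a custom offset is used.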
1603 \fB\-\-delta\fR[\fB=\fIoptions\fR]
1604 Add the Delta filter to the filter chain.
1605 The Delta filter can be used only as a non-last filter
1606 in the filter chain.
1608 Currently only simple byte-wise delta calculation is supported.
1609 It can be useful when compressing e.g. uncompressed bitmap images
1610 or uncompressed PCM audio.
1611 However, special purpose algorithms may give significantly better
1612 results than Delta + LZMA2.
1613 This is true especially with audio,
1614 which compresses faster and better e.g. with
1624 of the delta calculation in bytes.
1631 and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be
1632 A1 B1 01 02 01 02 01 02.
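The byte-wise calculation in the example above can be reproduced
with a few lines of shell.
This is an illustrative sketch only, not the liblzma implementation;
delta_encode is a hypothetical helper that takes the distance
followed by the input bytes as two-digit hexadecimal values.

```shell
# delta_encode DIST BYTE...
# Subtract from each byte the byte DIST positions earlier (modulo 256).
# The first DIST bytes have no predecessor and are passed through as is.
delta_encode() {
    dist=$1
    shift
    i=0
    out=""
    for byte in "$@"; do
        cur=$((0x$byte))
        if [ "$i" -ge "$dist" ]; then
            # Fetch the byte seen DIST positions ago.
            eval 'prev=$hist_'"$((i - dist))"
        else
            prev=0
        fi
        # Remember the current input byte for later positions.
        eval "hist_$i=$cur"
        out="$out$(printf '%02X' $(( (cur - prev) & 255 ))) "
        i=$((i + 1))
    done
    echo "${out% }"
}

delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7
# prints: A1 B1 01 02 01 02 01 02
```

The output consists mostly of the repeating pair 01 02,
which is the kind of redundancy that LZMA2 can then compress well.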
1637 .BR \-q ", " \-\-quiet
1638 Suppress warnings and notices.
1639 Specify this twice to suppress errors too.
1640 This option has no effect on the exit status.
1641 That is, even if a warning was suppressed,
1642 the exit status to indicate a warning is still used.
1644 .BR \-v ", " \-\-verbose
1646 If standard error is connected to a terminal,
1648 will display a progress indicator.
1651 twice will give even more verbose output.
1653 The progress indicator shows the following information:
1656 Completion percentage is shown
1657 if the size of the input file is known.
1658 That is, the percentage cannot be shown in pipes.
1660 Amount of compressed data produced (compressing)
1661 or consumed (decompressing).
1663 Amount of uncompressed data consumed (compressing)
1664 or produced (decompressing).
1666 Compression ratio, which is calculated by dividing
1667 the amount of compressed data processed so far by
1668 the amount of uncompressed data processed so far.
1670 Compression or decompression speed.
1671 This is measured as the amount of uncompressed data consumed
1672 (compression) or produced (decompression) per second.
1673 It is shown after a few seconds have passed since
1675 started processing the file.
1677 Elapsed time in the format M:SS or H:MM:SS.
1679 Estimated remaining time is shown
1680 only when the size of the input file is
1681 known and a couple of seconds have already passed since
1683 started processing the file.
1684 The time is shown in a less precise format which
1685 never has any colons, e.g. 2 min 30 s.
1688 When standard error is not a terminal,
1692 print the filename, compressed size, uncompressed size,
1693 compression ratio, and possibly also the speed and elapsed time
1694 on a single line to standard error after compressing or
1695 decompressing the file.
1696 The speed and elapsed time are included only when
1697 the operation took at least a few seconds.
1698 If the operation didn't finish, e.g. due to user interruption,
1699 the completion percentage is also printed
1700 if the size of the input file is known.
1702 .BR \-Q ", " \-\-no\-warn
1703 Don't set the exit status to 2
1704 even if a condition worth a warning was detected.
1705 This option doesn't affect the verbosity level, thus both
1709 have to be used to not display warnings and
1710 to not alter the exit status.
1713 Print messages in a machine-parsable format.
1714 This is intended to ease writing frontends that want to use
1716 instead of liblzma, which may be the case with various scripts.
1717 The output with this option enabled is meant to be stable across
1724 .BR \-\-info\-memory
1725 Display, in human-readable format, how much physical memory (RAM)
1727 thinks the system has and the memory usage limits for compression
1728 and decompression, and exit successfully.
1730 .BR \-h ", " \-\-help
1731 Display a help message describing the most commonly used options,
1732 and exit successfully.
1734 .BR \-H ", " \-\-long\-help
1735 Display a help message describing all features of
1737 and exit successfully.
1739 .BR \-V ", " \-\-version
1740 Display the version number of
1742 and liblzma in human-readable format.
1743 To get machine-parsable output, specify
1749 The robot mode is activated with the
1752 It makes the output of
1754 easier to parse by other programs.
1757 is supported only together with
1759 .BR \-\-info\-memory ,
1762 It will be supported for compression and
1763 decompression in the future.
1766 .B "xz \-\-robot \-\-version"
1767 will print the version number of
1769 and liblzma in the following format:
1771 .BI XZ_VERSION= XYYYZZZS
1773 .BI LIBLZMA_VERSION= XYYYZZZS
1780 Even numbers are stable.
1781 Odd numbers are alpha or beta versions.
1784 Patch level for stable releases or
1785 just a counter for development releases.
1789 0 is alpha, 1 is beta, and 2 is stable.
1791 should always be 2 when
1796 are the same on both lines if
1798 and liblzma are from the same XZ Utils release.
1800 Examples: 4.999.9beta is
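A script can split the XYYYZZZS number back into its fields
with integer arithmetic.
The sketch below assumes the field layout described above
(X is the major version, YYY the minor version,
ZZZ the patch level, and S the stability);
decode_xz_version is a hypothetical helper, not something provided by xz.

```shell
# decode_xz_version XYYYZZZS
# Print the version as MAJOR.MINOR.PATCH followed by the stability name.
decode_xz_version() {
    v=$1
    major=$(( v / 10000000 ))
    minor=$(( v / 10000 % 1000 ))
    patch=$(( v / 10 % 1000 ))
    case $(( v % 10 )) in
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    echo "$major.$minor.$patch $stability"
}

decode_xz_version 50000002
# prints: 5.0.0 stable
```

The division is done on the whole number rather than on substrings,
so fields with leading zeros are not misread as octal by the shell.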
1806 .SS "Memory limit information"
1807 .B "xz \-\-robot \-\-info\-memory"
1808 prints a single line with three tab-separated columns:
1810 Total amount of physical memory (RAM) in bytes
1812 Memory usage limit for compression in bytes.
1813 A special value of zero indicates the default setting,
1814 which for single-threaded mode is the same as no limit.
1816 Memory usage limit for decompression in bytes.
1817 A special value of zero indicates the default setting,
1818 which for single-threaded mode is the same as no limit.
1820 In the future, the output of
1821 .B "xz \-\-robot \-\-info\-memory"
1822 may have more columns, but never more than a single line.
1825 .B "xz \-\-robot \-\-list"
1826 uses tab-separated output.
1827 The first column of every line has a string
1828 that indicates the type of the information found on that line:
1831 This is always the first line when starting to list a file.
1832 The second column on the line is the filename.
1835 This line contains overall information about the
1838 This line is always printed after the
1843 This line type is used only when
1848 lines as there are streams in the
1853 This line type is used only when
1858 lines as there are blocks in the
1863 lines are shown after all the
1865 lines; different line types are not interleaved.
1868 This line type is used only when
1870 was specified twice.
1871 This line is printed after all
1878 line contains overall information about the
1883 This line is always the very last line of the list output.
1884 It shows the total counts and sizes.
1892 Number of streams in the file
1894 Total number of blocks in the stream(s)
1896 Compressed size of the file
1898 Uncompressed size of the file
1900 Compression ratio, for example
1902 If ratio is over 9.999, three dashes
1904 are displayed instead of the ratio.
1906 Comma-separated list of integrity check names.
1907 The following strings are used for the known check types:
1913 For unknown check types,
1917 is the Check ID as a decimal number (one or two digits).
1919 Total size of stream padding in the file
1929 Stream number (the first stream is 1)
1931 Number of blocks in the stream
1933 Compressed start offset
1935 Uncompressed start offset
1937 Compressed size (does not include stream padding)
1943 Name of the integrity check
1945 Size of stream padding
1955 Number of the stream containing this block
1957 Block number relative to the beginning of the stream
1958 (the first block is 1)
1960 Block number relative to the beginning of the file
1962 Compressed start offset relative to the beginning of the file
1964 Uncompressed start offset relative to the beginning of the file
1966 Total compressed size of the block (includes headers)
1972 Name of the integrity check
1978 was specified twice, additional columns are included on the
1981 These are not displayed with a single
1983 because getting this information requires many seeks
1984 and can thus be slow:
1988 Value of the integrity check in hexadecimal
1994 indicates that compressed size is present, and
1996 indicates that uncompressed size is present.
1997 If the flag is not set, a dash
1999 is shown instead to keep the string length fixed.
2000 New flags may be added to the end of the string in the future.
2002 Size of the actual compressed data in the block (this excludes
2003 the block header, block padding, and check fields)
2005 Amount of memory (in bytes) required to decompress
2006 this block with this
2011 Note that most of the options used at compression time
2012 cannot be known, because only the options
2013 that are needed for decompression are stored in the
2025 Amount of memory (in bytes) required to decompress
2033 indicating if all block headers have both compressed size and
2034 uncompressed size stored in them
2042 version required to decompress the file
2060 Average compression ratio
2062 Comma-separated list of integrity check names
2063 that were present in the files
2069 keep the order of the earlier columns the same as on
2077 was specified twice, additional columns are included on the
2083 Maximum amount of memory (in bytes) required to decompress
2091 indicating if all block headers have both compressed size and
2092 uncompressed size stored in them
2100 version required to decompress the file
2104 Future versions may add new line types and
2105 new columns can be added to the existing line types,
2106 but the existing columns won't be changed.
2117 Something worth a warning occurred,
2118 but no actual errors occurred.
2120 Notices (not warnings or errors) printed on standard error
2121 don't affect the exit status.
2125 parses space-separated lists of options
2126 from the environment variables
2130 in this order, before parsing the options from the command line.
2131 Note that only options are parsed from the environment variables;
2132 all non-options are silently ignored.
2133 Parsing is done with
2135 which is used also for the command line arguments.
2138 User-specific or system-wide default options.
2139 Typically this is set in a shell initialization script to enable
2141 memory usage limiter by default.
2142 Excluding shell initialization scripts
2143 and similar special cases, scripts must never set or unset
2147 This is for passing options to
2149 when it is not possible to set the options directly on the
2152 This is the case e.g. when
2154 is run by a script or tool, e.g. GNU
2161 XZ_OPT=\-2v tar caf foo.tar.xz foo
2169 e.g. to set script-specific default compression options.
2170 It is still recommended to allow users to override
2172 if that is reasonable, e.g. in
2174 scripts one may use something like this:
2180 XZ_OPT=${XZ_OPT\-"\-7e"}
2187 .SH "LZMA UTILS COMPATIBILITY"
2188 The command line syntax of
2190 is practically a superset of
2195 as found from LZMA Utils 4.32.x.
2196 In most cases, it is possible to replace
2197 LZMA Utils with XZ Utils without breaking existing scripts.
2198 There are some incompatibilities though,
2199 which may sometimes cause problems.
2201 .SS "Compression preset levels"
2202 The numbering of the compression level presets is not identical in
2205 The most important difference is how dictionary sizes
2206 are mapped to different presets.
2207 Dictionary size is roughly equal to the decompressor memory usage.
2228 The dictionary size differences affect
2229 the compressor memory usage too,
2230 but there are some other differences between
2231 LZMA Utils and XZ Utils, which
2232 make the difference even bigger:
2239 Level;xz;LZMA Utils 4.32.x
2253 The default preset level in LZMA Utils is
2255 while in XZ Utils it is
2257 so both use an 8 MiB dictionary by default.
2259 .SS "Streamed vs. non-streamed .lzma files"
2260 The uncompressed size of the file can be stored in the
2263 LZMA Utils does that when compressing regular files.
2264 The alternative is to mark that uncompressed size is unknown
2265 and use end-of-payload marker to indicate
2266 where the decompressor should stop.
2267 LZMA Utils uses this method when uncompressed size isn't known,
2268 which is the case for example in pipes.
2271 supports decompressing
2273 files with or without end-of-payload marker, but all
2277 will use end-of-payload marker and have uncompressed size
2278 marked as unknown in the
2281 This may be a problem in some uncommon situations.
2284 decompressor in an embedded device might work
2285 only with files that have known uncompressed size.
2286 If you hit this problem, you need to use LZMA Utils
2287 or LZMA SDK to create
2289 files with known uncompressed size.
2291 .SS "Unsupported .lzma files"
2299 LZMA Utils can decompress files with any
2303 but always creates files with
2307 Creating files with other
2315 The implementation of the LZMA1 filter in liblzma
2316 requires that the sum of
2323 files, which exceed this limitation, cannot be decompressed with
2326 LZMA Utils creates only
2328 files which have a dictionary size of
2330 (a power of 2) but accepts files with any dictionary size.
2331 liblzma accepts only
2333 files which have a dictionary size of
2336 .RI "2^" n " + 2^(" n "\-1)."
2337 This is to decrease false positives when detecting
2341 These limitations shouldn't be a problem in practice,
2342 since practically all
2344 files have been compressed with settings that liblzma will accept.
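The dictionary size rule can be expressed as a small check.
This is only a sketch of the rule as stated above,
not the actual liblzma code, which may treat some special values differently;
both function names are hypothetical.
It relies on the fact that 2^n + 2^(n\-1) equals three times 2^(n\-1).

```shell
# Succeed if $1 is a positive power of two.
is_power_of_two() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Succeed if the dictionary size $1 (in bytes) has the form
# 2^n or 2^n + 2^(n-1), which is the same as 3 * 2^(n-1).
lzma_dict_size_accepted() {
    is_power_of_two "$1" && return 0
    [ $(( $1 % 3 )) -eq 0 ] && is_power_of_two $(( $1 / 3 ))
}

for size in 8388608 12582912 10485760; do
    if lzma_dict_size_accepted "$size"; then
        echo "$size: accepted"
    else
        echo "$size: rejected"
    fi
done
```

Here 8 MiB (2^23) and 12 MiB (2^23 + 2^22) pass the check,
while 10 MiB does not.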
2346 .SS "Trailing garbage"
2348 LZMA Utils silently ignore everything after the first
2351 In most situations, this is a bug.
2352 This also means that LZMA Utils
2353 don't support decompressing concatenated
2357 If there is data left after the first
2361 considers the file to be corrupt unless
2362 .B \-\-single\-stream
2364 This may break obscure scripts which have
2365 assumed that trailing garbage is ignored.
2369 .SS "Compressed output may vary"
2370 The exact compressed output produced from
2371 the same uncompressed input file
2372 may vary between XZ Utils versions even if
2373 compression options are identical.
2374 This is because the encoder can be improved
2375 (faster or better compression)
2376 without affecting the file format.
2377 The output can vary even between different
2378 builds of the same XZ Utils version,
2379 if different build options are used.
2381 The above means that once
2383 has been implemented,
2384 the resulting files won't necessarily be rsyncable
2385 unless both old and new files have been compressed
2386 with the same xz version.
2387 This problem can be fixed if a part of the encoder
2388 implementation is frozen to keep rsyncable output
2389 stable across xz versions.
2391 .SS "Embedded .xz decompressors"
2394 decompressor implementations like XZ Embedded don't necessarily
2395 support files created with integrity
2401 Since the default is
2402 .BR \-\-check=crc64 ,
2407 when creating files for embedded systems.
2409 Outside embedded systems, all
2411 format decompressors support all the
2413 types, or at least are able to decompress
2414 the file without verifying the
2415 integrity check if the particular
2419 XZ Embedded supports BCJ filters,
2420 but only with the default start offset.
2429 using the default compression level
2433 if compression is successful:
2449 even if decompression is successful:
2463 .RB ( "\-4 \-\-extreme" ),
2464 which is slower than e.g. the default
2466 but needs less memory for compression and decompression (48\ MiB
2467 and 5\ MiB, respectively):
2472 tar cf \- baz | xz \-4e > baz.tar.xz
2477 A mix of compressed and uncompressed files can be decompressed
2478 to standard output with a single command:
2483 xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt
2488 .SS "Parallel compression of many files"
2493 can be used to parallelize compression of many files:
2498 find . \-type f \e! \-name '*.xz' \-print0 \e
2499 | xargs \-0r \-P4 \-n16 xz \-T1
2508 sets the number of parallel
2511 The best value for the
2513 option depends on how many files there are to be compressed.
2514 If there are only a couple of files,
2515 the value should probably be 1;
2516 with tens of thousands of files,
2517 100 or even more may be appropriate to reduce the number of
2521 will eventually create.
2527 is there to force it to single-threaded mode, because
2529 is used to control the amount of parallelization.
2532 Calculate how many bytes have been saved in total
2533 after compressing multiple files:
2538 xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}'
2543 A script may want to know that it is using new enough
2547 script checks that the version number of the
2549 tool is at least 5.0.0.
2550 This method is compatible with old beta versions,
2551 which didn't support the
2558 if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" ||
2559 [ "$XZ_VERSION" \-lt 50000002 ]; then
2560 echo "Your xz is too old."
2562 unset XZ_VERSION LIBLZMA_VERSION
2567 Set a memory usage limit for decompression using
2569 but if a limit has already been set, don't increase it:
2574 NEWLIM=$((123 << 20)) # 123 MiB
2575 OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3)
2576 if [ $OLDLIM \-eq 0 \-o $OLDLIM \-gt $NEWLIM ]; then
2577 XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM"
2584 .SS "Custom compressor filter chains"
2585 The simplest use for custom filter chains is
2586 customizing an LZMA2 preset.
2588 because the presets cover only a subset of the
2589 potentially useful combinations of compression settings.
2591 The CompCPU columns of the tables
2592 from the descriptions of the options
2593 .BR "\-0" " ... " "\-9"
2596 are useful when customizing LZMA2 presets.
2597 Here are the relevant parts collected from those two tables:
2617 If you know that a file requires
2618 a somewhat big dictionary (e.g. 32 MiB) to compress well,
2619 but you want to compress it quicker than
2621 would do, a preset with a low CompCPU value (e.g. 1)
2622 can be modified to use a bigger dictionary:
2627 xz \-\-lzma2=preset=1,dict=32MiB foo.tar
2632 With certain files, the above command may be faster than
2634 while compressing significantly better.
2635 However, it must be emphasized that only some files benefit from
2636 a big dictionary while keeping the CompCPU value low.
2637 The most obvious situation,
2638 where a big dictionary can help a lot,
2639 is an archive containing very similar files
2640 of at least a few megabytes each.
2641 The dictionary size has to be significantly bigger
2642 than any individual file to allow LZMA2 to take
2643 full advantage of the similarities between consecutive files.
2645 If very high compressor and decompressor memory usage is fine,
2646 and the file being compressed is
2647 at least several hundred megabytes, it may be useful
2648 to use an even bigger dictionary than the 64 MiB that
2655 xz \-vv \-\-lzma2=dict=192MiB big_foo.tar
2662 .RB ( "\-\-verbose \-\-verbose" )
2663 like in the above example can be useful
2664 to see the memory requirements
2665 of the compressor and decompressor.
2666 Remember that using a dictionary bigger than
2667 the size of the uncompressed file is a waste of memory,
2668 so the above command isn't useful for small files.
2670 Sometimes the compression time doesn't matter,
2671 but the decompressor memory usage has to be kept low
2672 e.g. to make it possible to decompress the file on
2674 The following command uses
2676 .RB ( "\-6 \-\-extreme" )
2677 as a base and sets the dictionary to only 64\ KiB.
2678 The resulting file can be decompressed with XZ Embedded
2679 (that's why there is
2680 .BR \-\-check=crc32 )
2681 using about 100\ KiB of memory.
2686 xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo
2691 If you want to squeeze out as many bytes as possible,
2692 adjusting the number of literal context bits
2694 and number of position bits
2697 Adjusting the number of literal position bits
2699 might help too, but usually
2704 E.g. a source code archive contains mostly US-ASCII text,
2705 so something like the following might give
2706 a slightly (like 0.1\ %) smaller file than
2714 xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar
2719 Using another filter together with LZMA2 can improve
2720 compression with certain file types.
2721 E.g. to compress an x86-32 or x86-64 shared library
2722 using the x86 BCJ filter:
2727 xz \-\-x86 \-\-lzma2 libfoo.so
2732 Note that the order of the filter options is significant.
2739 because there cannot be any filter after LZMA2,
2740 and also because the x86 BCJ filter cannot be used
2741 as the last filter in the chain.
2743 The Delta filter together with LZMA2
2744 can give good results with bitmap images.
2745 It should usually beat PNG,
2746 which has a few more advanced filters than simple
2747 delta but uses Deflate for the actual compression.
2749 The image has to be saved in uncompressed format,
2750 e.g. as uncompressed TIFF.
2751 The distance parameter of the Delta filter is set
2752 to match the number of bytes per pixel in the image.
2753 E.g. a 24-bit RGB bitmap needs
2755 and it is also good to pass
2757 to LZMA2 to accommodate the three-byte alignment:
2762 xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff
2767 If multiple images have been put into a single archive (e.g.\&
2769 the Delta filter will work on that too as long as all images
2770 have the same number of bytes per pixel.
2782 XZ Utils: <https://tukaani.org/xz/>
2784 XZ Embedded: <https://tukaani.org/xz/embedded.html>
2786 LZMA SDK: <http://7-zip.org/sdk.html>