Python decompress PACK_MAGIC

A file compressed in pack format starts with the magic bytes \037\036 in octal, or 0x1f 0x1e in hex.
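As a quick illustration, here is a minimal sketch of checking a header for the pack magic. It assumes the byte order of gzip's PACK_MAGIC definition (0x1f followed by 0x1e); the helper name is_pack_file is hypothetical and is not taken from unpack.py.

```python
# Pack magic as two bytes: octal \037\036, hex 0x1f 0x1e.
PACK_MAGIC = b"\x1f\x1e"


def is_pack_file(header):
    """Return True if the first two bytes of `header` match the pack magic.

    `is_pack_file` is a hypothetical helper for illustration only.
    """
    return bytes(header[:2]) == PACK_MAGIC
```

A caller would typically read the first two bytes of the file in binary mode and pass them to this function before trying to decode anything.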

gzip can decode this format, as can the pcat program. As an exercise I converted the unpack.c module from gzip into its Python equivalent.

The slowest part of the code is the look_bits function, and this is where you can see how an interpreted language grinds compared to C.

Profiling it with the excellent line_profiler (https://pypi.python.org/pypi/line_profiler/) gives:

Timer unit: 1e-06 s = 1uS

Total time: 43.3399 s
File: unpack.py
Function: look_bits at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                               @profile
    37                                               def look_bits(self,bits,mask):
    38    351442       265280      0.8      0.6          while(self.valid < bits):
    39    140575     10361348     73.7     23.9              self.bitbuf <<= 8
    40    140575     14576495    103.7     33.6              self.bitbuf |= next(self.get_byte)
    41    140575       189102      1.3      0.4              self.valid += 8
    42    210867     17947709     85.1     41.4          return (self.bitbuf >> (self.valid - bits)) & mask
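One possible reading of these numbers: roughly 74 µs for a single shift is a lot even for an interpreter. Python integers are arbitrary precision, so if bitbuf is never trimmed it grows by a byte on every refill, and each shift then has to walk the whole accumulated value. In C the buffer is a fixed-width unsigned integer, so the old bits simply fall off the top. The sketch below is an assumption about the cause, not a measured fix. It keeps the same look_bits logic but masks the buffer down to the bits that are still valid. The BitReader class and the skip_bits method are stand-ins for illustration and may differ from what is in unpack.py.

```python
class BitReader:
    """Hypothetical stand-in for the bit-buffer state used by look_bits."""

    def __init__(self, data):
        # bytearray iteration yields integers on both Python 2 and 3.
        self.get_byte = iter(bytearray(data))
        self.bitbuf = 0
        self.valid = 0

    def look_bits(self, bits, mask):
        """Peek at the next `bits` bits without consuming them."""
        while self.valid < bits:
            self.bitbuf = (self.bitbuf << 8) | next(self.get_byte)
            self.valid += 8
            # Drop bits that were already consumed so the integer stays
            # small instead of growing with every byte read.
            self.bitbuf &= (1 << self.valid) - 1
        return (self.bitbuf >> (self.valid - bits)) & mask

    def skip_bits(self, bits):
        """Consume `bits` bits that were previously peeked."""
        self.valid -= bits


reader = BitReader(b"\xa5\x3c")
first = reader.look_bits(3, 0b111)   # top three bits of 0xa5
reader.skip_bits(3)
second = reader.look_bits(7, 0x7f)   # next seven bits, spanning both bytes
```

Whether this removes most of the cost would need to be confirmed by running the profiler again on the real input.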

unpack.zip

  • Last modified: 2014/10/24 11:09
  • by brett