> so i wrote a gzip decompressor from scratch
After skimming through the author's Rust code, it appears to be a fairly straightforward port of puff.c (included in the zlib source): https://github.com/madler/zlib/blob/develop/contrib/puff/puf...
Even the function names are identical :/
The difference in size seems to be mainly due to the missing documentation before each function.
This feels like it should have been mentioned in the article.
It makes me wonder if there was some LLM help, based on how similar the fn structure and identifier names are.
> It makes me wonder if there was some LLM help
I would bet there was
> This feels like it should have been mentioned in the article.
With an entire section complaining about how many lines of code existing implementations are, it looks like they did find a good, simple implementation to clone in Rust, then deliberately didn't mention it.
Oh, I didn't even know of this. But I got a lot of help from this one:
https://github.com/madler/infgen
EDIT: Didn't notice that it's even by the same person; of course it's very similar.
You could say it was a “puff” piece, eh, eh!?
Just like that author, many years ago, I went through the process of understanding the DEFLATE compression standard and producing a short and concise decompressor for gzip+DEFLATE. Here are the resources I published as a result of that exploration:
* https://www.nayuki.io/page/deflate-specification-v1-3-html
* https://www.nayuki.io/page/simple-deflate-decompressor
* https://github.com/nayuki/Simple-DEFLATE-decompressor
Those resources were a huge help when I was digging into the DEFLATE algorithm, thank you!
The function put me in mind of one of my early experiments in Rust. It would be interesting to compare an iterator-based form that just called .take(need).

I haven't written a lot of Rust, but one thing I did was to write an iterator that took an iterator of bytes as input and provided bits as output. Then I used an iterator that gave bytes from a block of memory.
It was mostly as a test to see how much high level abstraction left an imprint on the compiled code.
The disassembly showed it pulling in 32 bits at a time and shifting out the bits pretty much the same way I would have written it in ASM.
I was quite impressed. Although I tested that it was working by counting the bits, and someone criticized it for not using popcount, so I guess you can't have everything.
> I tested it was working by counting the bits and someone critizised it for not using popcount
PSA: Rust exposes the popcnt intrinsic via the `count_ones` method on integer types: https://doc.rust-lang.org/std/primitive.u32.html#method.coun...
Looks like that was added about 3 years after I wrote my test code.
According to the docs there, it was stabilized in 1.0, then stabilized in const contexts in 1.32. Were you testing this in 2012?
Maybe I was wrong; when I looked it up, it said it came in in 2020, 1.4-something.
I had a similar experience implementing simd instructions in my emulator, where I needed to break apart a 64-bit value into four eight-bit values, do an operation on each value, then pack it back together. My first implementation did it with all the bit shifts you’d expect, but my second one used two helpers to unpack into an array, map on the array to a second array, and pack the array again. The optimized output was basically the same.
> twenty five thousand lines of pure C not counting CMake files. ...
Keep in mind this is also 31 years of cruft and lord knows what.
Plan 9 gzip is 738 lines total:
Even the zipfs file server that mounts zip files as file systems is 391 lines.

edit: posted a link to said code: https://github.com/9front/9front/tree/front/sys/src/cmd/gzip
> ... (and whenever working with C always keep in mind that C stands for CVE).
Sigh.
You forgot to include https://github.com/9front/9front/tree/front/sys/src/libflate which gzip is built around, which brings it closer to 10k lines.
I wrote a standalone gzip decompressor in about 500 lines of code (including comments, with braces on the next line), with no dependencies at all: https://commandlinefanatic.com/cgi-bin/showarticle.cgi?artic...
Interesting, the decompressor in Jdeflate is around 4k LoC. https://github.com/Jpn666/jdeflate
His also omits CRC (which is part of the 25k lines), has no --fast/--best/etc., is missing some output formats, and so on. I'm sure the 25k includes a lot of bloat, but the comparison is odd. Comparing to your list would make much more sense.
I would expect a CRC to add a negligible number of lines of code. The reason that production-grade decompressors are tens of thousands of LOC is likely attributable to extreme manual optimization. For example, I wouldn't be surprised if a measurable fraction of those lines are actually inline assembly.
True. A most basic CRC implementation is about 7 lines of code (presented in Java to avoid some C/C++ footguns):
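The snippet itself appears to have been lost in formatting; a plausible reconstruction of the bit-at-a-time CRC-32 described (this is my sketch, not necessarily the original) would be:

```java
public class Crc32 {
    // Bit-at-a-time CRC-32 using the gzip/zlib polynomial in reflected form.
    static int crc32(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (int i = 0; i < data.length * 8; i++) { // one input bit per iteration
            crc ^= (data[i / 8] >>> (i % 8)) & 1;   // XOR in the next bit, LSB-first
            crc = (crc >>> 1) ^ ((crc & 1) * 0xEDB88320);
        }
        return ~crc;
    }

    public static void main(String[] args) {
        // Standard CRC-32 check value for the ASCII string "123456789".
        System.out.println(Integer.toHexString(crc32("123456789".getBytes())));
    }
}
```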
Or smooshed down slightly (with caveats). But one reason that many CRC implementations are large is that they include a pre-computed table of 256 32-bit constants so that one byte can be processed at a time. For example: https://github.com/madler/zlib/blob/7cdaaa09095e9266dee21314...

That's Java code, though... bit weird, esp. i % 8 (which is just i & 7). The compiler should be able to optimize it since 'i' is guaranteed to be non-negative, but it's still awkward.
Java CRC32 nowadays uses intrinsics and avx128 for crc32.
With C++20 you can use consteval to compute the table(s) at compile time from template parameters.
Doesn't need to be inline assembly, just pre-encoded lookup tables and intrinsics-based vectorized CRC alone will add quite a lot of code. Most multi-platform CRC algorithms tend to have at least a few paths for byte/word/dword at a time, hardware CRC, and hardware GF(2) multiply. It's not really extreme optimization, just better algorithms to match better hardware capabilities.
The Huffman decoding implementation is also bigger in production implementations for both speed and error checking. Two Huffman trees need to be exactly complete except in the special case of a single code, and in most cases they are flattened to two-level tables for speed (though the latest desktop CPUs have enough L1 cache to use single-level).
Finally, the LZ copy typically has special cases added for using wider-than-byte copies for non-overlapping, non-wrapping runs. This is a significant decoding speed optimization.
Yes, there's subdirs with language bindings for many non-C langs, an examples folder with example C code, win32 specific C code, test code, etc.
More reasons it's an odd comparison.
gzip also contains a significant amount of compatibility code for different platforms.
CRC32 can be written in a handful of lines of code, although it'd be better to use vector instructions, e.g. AVX, when available.
I was reading this and couldn't stop thinking of https://en.wikipedia.org/wiki/Literate_programming
> the only flag we care about is FNAME

The specification does not define an encoding for the file name. Different file systems may impose restrictions on certain names, so FNAME should not be used.
That reminds me of a zip file creator in a few lines of JS, now that CompressionStream is a built-in feature of the browser and Node. No need to use some bloated npm lib. But momentum and popularity (and LLMs) will keep people using JSZip for eternity.
That's what she said.
(yawn, tech fucks)
Another dev who doesn't show respect for what has been done and expects that a particular language will do wonders for him. Also, I don't see that this is much better in terms of readability.
Where do you see the lack of respect? The author wanted to learn how gzip works and chose to implement it in a language they like to do so. As a learning tool, not because the world needs another gzip decompressor.
Chose to paste the C into an LLM and said 'make it rust'?
What is your source for that accusation?
Original zlib code: https://github.com/madler/zlib/blob/develop/contrib/puff/puf...
The code of the article: https://github.com/ieviev/mini-gzip/blob/main/src/main.rs
Top level declarations of the C code:

    #define MAXBITS 15
    #define MAXLCODES 286
    #define MAXDCODES 30
    #define MAXCODES (MAXLCODES+MAXDCODES)
    #define FIXLCODES 288
    struct state
    local int bits(struct state *s, int need)
    local int stored(struct state *s)
    struct huffman
    local int decode(...)
    local int construct(...)
    local int codes(...)
    local int fixed(...)
    local int dynamic(...)
    int puff(...)

Top level declarations of the Rust code:

he does mention https://github.com/trifectatechfoundation/zlib-rs not just https://github.com/madler/zlib, but it would be interesting to hear from those developers also