phpser: a faster, HMAC-signed binary serializer for PHP cache workloads, benchmarked against igbinary

I've reached for igbinary on basically every PHP project I've shipped for the last decade. It's the obvious default for cache serialization. Two things about cache workloads kept nagging at me though, so I wrote phpser to see if a serializer built specifically for caches could do better.

The first is the read/write asymmetry. A cache decodes on every read and encodes once per write, easily 100:1 on a read-heavy cache, but igbinary (like most general-purpose serializers) balances the two sides. The second is trust: the bytes you decode often come from Redis, Memcached, or a cookie, any of which an attacker may be able to write to, and unserialize() on attacker-controlled input is one of PHP's oldest exploit primitives.

phpser is a C extension that goes after both. The wire format is designed around the reader (I borrowed the "make the reader do the least work" instinct from Rust's rkyv, though phpser is not zero-copy): a front-loaded string dictionary the decoder reuses by refcount instead of re-allocating, tagged scalar runs for packed numeric arrays, and pre-sized hashtables written in place. The encoder is fast too, with an O(1) pointer-hash string intern and plain objects serialized straight from their property slots.
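To make the front-loaded dictionary idea concrete, here is a toy userland sketch. It is not phpser's wire format or code: dict_encode() and dict_decode() are hypothetical names, and the payload is an ordinary PHP array rather than packed bytes. It only shows the shape of the trick: each distinct string is stored once up front, and the body refers to it by index.

```php
<?php
declare(strict_types=1);

// Toy illustration only: NOT phpser's wire format.
// Every distinct string (keys and string values) goes into $dict once;
// the body stores integer indexes into that dictionary.
function dict_encode(array $rows): array
{
    $index = [];   // string => position in $dict
    $dict  = [];   // position => string
    $intern = function (string $s) use (&$index, &$dict): int {
        if (!isset($index[$s])) {
            $index[$s] = count($dict);
            $dict[] = $s;
        }
        return $index[$s];
    };

    $body = [];
    foreach ($rows as $row) {
        $encoded = [];
        foreach ($row as $key => $value) {
            $encoded[] = [
                $intern((string) $key),
                // 's' marks a dictionary reference, 'v' an inline value
                is_string($value) ? ['s' => $intern($value)] : ['v' => $value],
            ];
        }
        $body[] = $encoded;
    }
    return ['dict' => $dict, 'body' => $body];
}

function dict_decode(array $payload): array
{
    $dict = $payload['dict'];
    $rows = [];
    foreach ($payload['body'] as $encoded) {
        $row = [];
        foreach ($encoded as [$keyId, $slot]) {
            // Repeated strings are looked up, never rebuilt from bytes.
            $row[$dict[$keyId]] = array_key_exists('s', $slot)
                ? $dict[$slot['s']]
                : $slot['v'];
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = [
    ['id' => 1, 'status' => 'active'],
    ['id' => 2, 'status' => 'active'],
];
$payload = dict_encode($rows);
var_dump(count($payload['dict']));        // three distinct strings
var_dump(dict_decode($payload) === $rows);
```

The sketch also hints at why the approach suits repetitive data best: when most strings appear only once, the dictionary adds indirection without saving anything.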

Benchmarks vs igbinary (PHP 8.4 NTS release build, 1000 iterations, median of 9 runs; negative numbers mean smaller or faster than igbinary):

Shape                            Size   Encode   Decode
packed_1k (range 0..999)         -65%     -70%     -75%
dto_1000 (Laravel queue shape)   -12%     -15%     -18%
rowset_1000 (mixed assoc)         +1%     -55%      +4%

It's not a clean sweep: mixed associative rowsets decode about 4% slower and come out slightly larger (about 1% in the table above), because the front-loaded dictionary (the thing that makes everything else fast) doesn't pay off when few strings repeat. The format isn't streamable either, for the same reason: the dictionary has to be complete before the body can be written.
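For anyone who wants to reproduce numbers like these on their own payloads, a minimal harness along the following lines should do. It is a sketch, not the author's benchmark script: bench(), median() and relative_pct() are made-up helper names, and native serialize()/unserialize() stand in for the serializer under test so the snippet runs without any extension installed.

```php
<?php
declare(strict_types=1);

// Hypothetical harness, not the author's benchmark script.
function median(array $xs): float
{
    sort($xs);
    $n = count($xs);
    $mid = intdiv($n, 2);
    return $n % 2 ? (float) $xs[$mid] : ($xs[$mid - 1] + $xs[$mid]) / 2;
}

// Runs $fn $iters times per run and returns the median ns-per-call
// across $runs runs (1000 iterations, 9 runs as in the post).
function bench(callable $fn, int $iters = 1000, int $runs = 9): float
{
    $samples = [];
    for ($r = 0; $r < $runs; $r++) {
        $start = hrtime(true);
        for ($i = 0; $i < $iters; $i++) {
            $fn();
        }
        $samples[] = (hrtime(true) - $start) / $iters;
    }
    return median($samples);
}

// Negative result: candidate is smaller/faster than the baseline.
function relative_pct(float $candidate, float $baseline): float
{
    return ($candidate / $baseline - 1.0) * 100.0;
}

$value = range(0, 999);            // same shape as packed_1k
$blob  = serialize($value);        // stand-in serializer

$encodeNs = bench(fn() => serialize($value));
$decodeNs = bench(fn() => unserialize($blob, ['allowed_classes' => false]));

printf("encode: %.0f ns/op, decode: %.0f ns/op, size: %d bytes\n",
    $encodeNs, $decodeNs, strlen($blob));
```

To compare two serializers, run bench() once per serializer on the same value and feed the two medians to relative_pct(); a candidate taking a quarter of the baseline's time comes out as -75%.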

On the security side there's an HMAC-SHA256 signed mode: phpser_serialize_signed($value, $key) and phpser_unserialize_signed($payload, $key). The signature is verified in constant time before any decoding happens, so a tampered or foreign-keyed payload returns null and never reaches the code that builds objects. There's also an allowed_classes option matching native unserialize().
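The verify-before-decode pattern can be sketched in plain PHP as below. This is a userland illustration under my own assumptions, not phpser's implementation or byte layout: sign_payload() and open_payload() are hypothetical names, the MAC is simply prepended to the body, and native serialize() plays the role of the encoder. The point is the ordering: hash_equals() does the constant-time comparison, and nothing is decoded unless it passes.

```php
<?php
declare(strict_types=1);

// Userland illustration of verify-before-decode.
// NOT phpser's implementation or wire layout.
const MAC_LEN = 32; // raw SHA-256 output is 32 bytes

function sign_payload(mixed $value, string $key): string
{
    $body = serialize($value);
    // Assumed layout for this sketch: MAC first, then the body.
    return hash_hmac('sha256', $body, $key, true) . $body;
}

function open_payload(string $payload, string $key): mixed
{
    if (strlen($payload) < MAC_LEN) {
        return null; // too short to contain a MAC
    }
    $mac      = substr($payload, 0, MAC_LEN);
    $body     = substr($payload, MAC_LEN);
    $expected = hash_hmac('sha256', $body, $key, true);

    // Constant-time comparison; bail out before touching the decoder.
    if (!hash_equals($expected, $mac)) {
        return null; // tampered, or signed with a different key
    }
    // Only authenticated bytes get here; still refuse to build objects.
    return unserialize($body, ['allowed_classes' => false]);
}

$key    = random_bytes(32);
$signed = sign_payload(['user' => 42], $key);

var_dump(open_payload($signed, $key));              // the original array
var_dump(open_payload($signed, random_bytes(32)));  // NULL: foreign key
```

One consequence of returning null on failure, at least in this sketch: a legitimately cached null is indistinguishable from a rejected payload, so callers that need to cache null may want to wrap values before signing.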

Install is via PIE: pie install iliaal/phpser

Repo: https://github.com/iliaal/phpser

Full writeup with the wire-format walkthrough and the complete benchmark table: https://ilia.ws/blog/phpser-a-fast-secure-binary-serializer-for-php-cache-workloads

I maintain php_excel and a few other PHP extensions; this one scratched a specific itch. Happy to answer questions, and I'd love feedback from anyone running heavy cache or queue traffic where decode time actually shows up in a profile.

submitted by /u/Ilia0001 to r/PHP