Back to list
CLWW の秩序保持暗号における 1/256 のバグの修正
Fixing a 1-in-256 bug in CLWW order-preserving encryption
Translated: 2026/4/25 4:56:58
Japanese Translation
多くのデータベースでは、秩序保持暗号(OPE)を利用することで、暗号化されたデータをソートや範囲クエリに使用できます。これは、暗号化文字列のバイトが明文字列の順序と同じ順序で比較される仕組みであり、標準の B-Tree インデックスや ORDER BY クラウスがそのまま機能します。2015 年に提唱された有名な構築法の一つが Chenette-Lewi-Weis-Wu (CLWW) です。この論文は、カスタム比較子を伴う秩序表示スキームを与え、かつ副次的な利益として同じ暗号化バイトを辞書式に比較することで秩序保持スキームが得られると一筆に触れています。しかし、その再解釈はほぼ正しいだけです。比較の約 1/256 が、実際の明文字列順序と一致しません。この記事では、実際に示す例を通じてその理由を説明し、エンコーディングの 2 つの微小な変更を提示します。これにより、残りの不一致が除去されますが、各暗号化バイトに 1 バイト追加、1 つの算術パスを要する代償が支払われます。この記事では構築と直観を踏まえます。完全な導出——三角形確率分布、信号/ノイズの議論、実証、スキームが洩らす情報についての議論——を詳しく学びたい場合は、以下の連作エンジニアリングノートをご覧ください:CLWW ORE から逆転送による正確な OPE(Backward Carry + MSB-Bit Placement)。データベースの列を標準的な対称暗号(AES-GCM、ChaCha20、お好みのもの)で暗号化すると、優れた機密性は得られますが、列の有効な他のすべての性質が失われます。暗号化文字列のバイト順序は明文字列順序とは関係ないため、B-Tree インデックス、ORDER BY クラウス、範囲クエリ(WHERE price BETWEEN 100 AND 500)は完全に機能しなくなります。秩序保持暗号は、代償を払うことでこれを修正します。OPE スキームでは、明文字列 x < y ならば、データベースがすでに理解する標準比較の下で enc(x) < enc(y) となります——通常はバイトの辞書式比較、つまり memcmp です。これらの暗号化文字列をいかなるオフザシェルフのソートインデックスや範囲スキャンにも適用すれば、即座に動作します。カスタム比較子も、クエリ書き換えミドルウェアも不要です。具体例として、以下の 8 ビット整数の明文字列とそれに対応する示唆を与える暗号化文字列を提示します——バイトは実際の暗号化スキームを実行して生成したものではありませんが、そのルールに従っています(各 9 バイト、予約済みの先頭バイト、上位の明文字列ビットが一致する共通接頭辞、最初の異なるバイトから約 128 ビットずらされたオフセット)。9 バイトのうち最初の 6 バイトのみが示されます:明文字列 7 00 5e 38 12 9a 2c … 暗号化文字列 (16 進) 42 00 5e 38 92 3f c4 … 明文字列 100 00 5e b8 4c 92 3f … 暗号化文字列 (16 進) 200 00 de 7a 23 51 8c … 明文字列 250 00 de 7a a3 ff 22 … 昇順に並んだ明文字列は、昇順のバイト辞書式順序の暗号化文字列を生み出します。 memcmp を任意の 2 行に対して適用すると、明文字列を直接比較した結果と同じ判断を下します。これらを B-Tree インデックスにドロップし、ORDER BY / 範囲クエリが即座に動作します。スキームの漏洩プロファイルがそのまま見えることも確認できます——明文字列の上位ビットが近いほど、暗号化文字列の先頭バイトの共有が多くなります。それが少し後で詳述します。代償は漏洩です。順序を保持するスキームは、順序も洩れます(同義句的),そして通常、明文字列 2 つの共通接頭辞についてのもう少し多くの情報を洩漏します。多くのワークロードではこのトレードオフが許容されますが、他のワークロードでは、より厳密な漏洩プロファイルを持つ構築法——例えば Lewi–Wu Block ORE スキーム——を望みます。CipherStash は、Postgres を通じて EQL として実装される生産環境の大部分において、それを利用しています。Block ORE はカスタム比較子を必要とし、Postgres の場合、それは CREATE OPERATOR CLASS / CREATE OPERATOR FAMILY を介してカスタムオペレータクラスを登録する必要があります。いくつかの管理プラットフォームでは、これらのステートメントがスーパーユーザ権限を伴わずに行える(AWS RDS はそれを行い、それが EQL をそこにおいて today 実行する理由である);他のプラットフォームでは(執筆時点で Supabase)行えません。後者のカテゴリにあるプラットフォームにいるときは、CLWW OPE が次の最善の選択肢です。暗号化文字列上の標準の辞書式比較は、いかなる標準の B-Tree インデックスとも動作し、カスタムオペレータクラスの指定が不要です。CLWW の具体的な洩漏に少し触れますが、それは最後に触れます。CLWW はビットバイビットの ORE スキームです。n ビットの明文字列 p = b₀ b₁ ...
Original Content
Many databases let you sort and range-query encrypted data using order-preserving encryption (OPE): a scheme where ciphertext bytes compare in the same order as their plaintexts, so standard B-tree indexes and ORDER BY clauses Just Work. A well-known construction is Chenette-Lewi-Weis-Wu (CLWW) from 2015 — small, fast, and widely deployed. The paper gives an order-revealing scheme with a custom comparator, and notes in passing that the same ciphertext bytes can be compared lexicographically to yield an order-preserving scheme as a side benefit. But that reinterpretation is only almost right. Roughly 1 in 256 comparisons disagrees with the true plaintext order. In this post I'll walk through why, with a worked example, and show how two small changes to the encoding eliminate the residual — at the cost of one extra byte per ciphertext and a single arithmetic pass. This post walks through the construction and the intuition. If you want the full derivation — triangular probability distributions, the signal/noise argument, empirical verification, and a discussion of what the scheme leaks — head over to the companion engineering note: Exact OPE from CLWW ORE via Backward Carry + MSB-Bit Placement. When you encrypt a column in a database with a standard symmetric cipher — AES-GCM, ChaCha20, pick your favourite — you get great confidentiality. You also lose every other useful property of the column. The ciphertext byte order has nothing to do with the plaintext order, so your B-tree index, your ORDER BY clauses, and your range queries (WHERE price BETWEEN 100 AND 500) stop working entirely. Order-preserving encryption fixes this, at a cost. In an OPE scheme, if plaintext x < y, then enc(x) < enc(y) under the standard comparison your database already understands — usually byte-lexicographic compare, i.e. memcmp. Drop these ciphertexts into any off-the-shelf sorted index and range scans work out of the box. No custom comparator. No query-rewriting middleware. To make that concrete, here's a handful of 8-bit integer plaintexts alongside illustrative ciphertexts — the bytes are made up rather than produced by running the scheme, but they follow its rules (9 bytes each, reserved leading byte, shared prefix where the high-order plaintext bits agree, first differing byte offset by ~128). Only the first 6 of 9 bytes are shown: Plaintext Ciphertext (hex) 7 00 5e 38 12 9a 2c … 42 00 5e 38 92 3f c4 … 100 00 5e b8 4c 92 3f … 200 00 de 7a 23 51 8c … 250 00 de 7a a3 ff 22 … Plaintexts in ascending order produce ciphertexts in ascending byte-lexicographic order: memcmp over any two rows returns the same verdict as comparing the plaintexts directly. Drop these into a B-tree index and ORDER BY / range queries work out of the box. You can also see the scheme's leakage profile hiding in plain sight — the closer two plaintexts' high-order bits are, the more leading bytes their ciphertexts share. More on that in a minute. The cost is leakage. A scheme that preserves order also leaks order (tautologically), and typically a bit more — usually some information about the common prefix of two plaintexts. For many workloads that tradeoff is acceptable; for others you'd reach for a construction with a tighter leakage profile, like the Lewi–Wu Block ORE scheme — which is what CipherStash uses for most production paths, exposed to Postgres through EQL. Block ORE needs a custom comparator, which in Postgres means registering a custom operator class via CREATE OPERATOR CLASS / CREATE OPERATOR FAMILY. Some managed platforms permit those statements without superuser (AWS RDS does, which is why EQL runs there today); others don't (Supabase, as of writing). When you're on a platform in the latter category, CLWW OPE is the next best thing — standard lex-compare on the ciphertext works with any stock B-tree index, no custom operator class required. I'll touch on what CLWW specifically leaks toward the end. CLWW is a bit-by-bit ORE scheme. For an n-bit plaintext p = b₀ b₁ … bₙ₋₁ (MSB-first), it produces an n-byte ciphertext. The construction feeds the plaintext bits through a keyed hash function one bit at a time — we'll model that keyed hash as a pseudorandom function (PRF), a function whose output is indistinguishable from uniform random bytes to anyone who doesn't have the key. In practice that's HMAC-SHA256, BLAKE3 in keyed mode, or similar; our implementation uses BLAKE3. At each step we pull one PRF byte and add the plaintext bit to it: The arrows between the PRF boxes are the important part. PRFᵢ depends on all the plaintext bits b₀ … bᵢ₋₁ that came before it. So for two plaintexts X and Y that first differ at bit k: At positions 0 through k, the PRF inputs are identical in both plaintexts, so PRFᵢ^X = PRFᵢ^Y. At positions k+1 onward, the chain has diverged (one side saw bₖ = 0, the other saw bₖ = 1), so the PRFs look independent-random. At the first differing byte itself, then, the ciphertexts differ by exactly b_k^Y − b_k^X, which is +1 or −1. The native CLWW comparator exploits this: find the first differing byte, check whether c_Y ≡ c_X + 1 (mod 256). If yes, X < Y; otherwise Y < X. Note the mod 256 in that check — we'll come back to it. Section 3.2 of the CLWW paper observes that the ciphertext bytes agree up to the first differing byte, and at that byte one side is exactly one more than the other. So if you forget about the custom comparator and just sort the ciphertexts lexicographically, you'll usually get the right answer. "Usually" is doing a lot of work there. Consider X = 0 and Y = 1 as 8-bit plaintexts. Bits of X are 0000 0000; bits of Y are 0000 0001. The PRF chain for Y sees only zeros up to bit 6, so PRF₀ … PRF₇ are identical in both plaintexts. Every ciphertext byte agrees, except byte 7 — which encodes the one differing bit. Now imagine PRF₇ = 0xFF. A 1-in-256 event. Byte PRF X's bit Y's bit c^X c^Y 0–6 PRFᵢ 0 0 PRFᵢ PRFᵢ 7 0xFF 0 1 0xFF (0xFF + 1) mod 256 = 0x00 Lex compare: bytes 0 through 6 agree, byte 7 has 0x00 on the Y side and 0xFF on the X side. The comparator says Y < X. The plaintext says Y > X. Wrong answer. The CLWW comparator handles this case because its check is c_Y ≡ c_X + 1 (mod 256), which is true for (0xFF, 0x00) as well as for (0x42, 0x43). Lex compare doesn't have the mod-256 awareness — when the byte wraps around, the ordering flips. Per-comparison error rate: 1/256 ≈ 3.9 × 10⁻³. For a range scan touching a few thousand ciphertexts, that's multiple mis-orderings per query. Unacceptable. Here's the key reframe: instead of thinking of the ciphertext as a byte stream, think of it as a big-endian integer. When PRFₖ + bₖ wraps past 256, we don't lose the +1 — we propagate it as an arithmetic carry to the byte on the left, exactly like schoolbook long addition. We reserve a leading byte at the top of the ciphertext to absorb any carry that makes it all the way up. fn encrypt_ope_v1(K, p): hasher = keyed_hasher(K) # Pass 1: generate PRF bytes, record the plaintext bits. prfs = []; bits = [] for bit in bits_msb_first(p): prfs.push(hasher.next_byte()) bits.push(bit) hasher.update(bit) # Pass 2: compute P + B as an (n+1)-byte big-endian integer, # with a reserved leading byte that absorbs the final carry. out = [0] + prfs carry = 0 for i in reversed(0..n): total = out[i + 1] + bits[i] + carry out[i + 1] = total mod 256 carry = total >> 8 # always 0 or 1 out[0] = carry return out Comparison is plain byte-lex compare — no custom comparator required. Let's replay the failing case. X = 0, Y = 1, PRF₇ = 0xFF. The ciphertext is now 9 bytes (one reserved + eight content). X (all bits zero): pass 2 adds nothing and no carry propagates. c^X = [0, PRF₀, …, PRF₆, 0xFF]. Y (bit 7 = 1): at byte 8 we compute 0xFF + 1 + 0 = 0x100 → byte 8 becomes 0x00, carry out = 1. At byte 7 we compute PRF₆ + 0 + 1 = PRF₆ + 1, carry out = 0. Other bytes unchanged. c^Y = [0, PRF₀, …, PRF₅, PRF₆ + 1, 0x00]. Lex compare: c^X and c^Y agree through byte 6. At byte 7, c^X has PRF₆ and c^Y has PRF₆ + 1. Y is bigger — correctly. ✓ The +1 didn't disappear when the byte wrapped; it just moved one position to the left, where it reappears at the byte the lex comparator hits first. A big improvement, but it turns out this scheme is only almost right. The per-comparison error rate drops from ~1/256 to around 7.7 × 10⁻⁶ — about 500× better — but it's still not zero. The residual is a signal/noise mismatch. At the first differing byte, the plaintext injects a +1 signal — but the backward carry coming up from the byte to its right can also swing the result by ±1. Signal and noise are the same order of magnitude, so in rare PRF configurations they cancel exactly and the two ciphertexts tie where the plaintexts don't. The full derivation — triangular distributions, tail probabilities, the small correction for the m = 2 case — lives in the engineering note. The question is: can we kill the residual without making the ciphertext larger? The signal-vs-noise framing suggests the obvious move: make the signal bigger. We can't shrink the noise — ±1 per byte is intrinsic to how the carry works — but we can inject the plaintext bit at a higher-weight position within its byte. Concretely: instead of adding +bit (which puts the signal at bit 0 of the byte, weight 1), add +bit · 128 (which puts it at bit 7, weight 128). Same ciphertext size. Same PRF chain. One line of code: - total = out[i + 1] + bits[i] + carry + total = out[i + 1] + bits[i] * 128 + carry The signal is now +128 at the first differing byte. The carry noise is still ±1. They can't cancel — lex compare resolves at the first differing byte by a margin of at least 127 (or, if that byte happens to wrap and the carry propagates one position left, at the byte before it, which sits at 256× the integer weight and so dominates anyway). The engineering note has the full case analysis. Exact OPE. Verified empirically: 1 310 700 adjacent u16 pairs under 20 randomly sampled keys, zero ordering errors. Two small changes — a backward carry pass with a reserved byte, and placing the plaintext bit at the top bit of its byte rather than the bottom — turn CLWW's paper-suggested OPE reinterpretation from a ~1/256-wrong scheme into an exact one. Same number of PRF evaluations, one extra ciphertext byte, no custom comparator needed. What this gives you: drop-in ciphertexts for any ordered index or sort operation, with ordering that matches plaintext ordering exactly. What it doesn't give you: strict-ORE-level leakage hiding (the scheme leaks first-differing-bit position, same as CLWW ORE), and no decryption — for round-trip you pair the OPE ciphertext with an authenticated symmetric cipher like AES-GCM. The implementation lives in our cllw-ore crate on crates.io, with a fuller derivation of the error rate and the exactness argument in the engineering note. If you've got a production encrypted-database use case that needs order-preserving indexes — or you just enjoy small clean cryptographic primitives — I'd love to hear what you think.