Intro to VIA: a chain of forced moves

Private Information Retrieval (PIR) asks a seemingly impossible question: how can a client fetch record \(i\) from an \(N\)-record server database without the server learning \(i\)? VIA is a collection of cryptographic schemes that make that possible.

This intro goes over VIA not as a list of features but as a sequence of forced moves: start from the dumbest scheme that works, find what is wrong with it, fix that, then notice the fix has created a new problem, and repeat. VIA is where the chain terminates.

Scope. Moves 0–3 build base VIA (the construction every variant shares), whose headline is that it needs no offline phase: no one-time "hint" download (a big precomputed summary of the database that older single-server schemes make you fetch before your first query). Move 4 adds a Compression refinement called VIA-C; the batch variant VIA-B layers extra batch-repacking machinery on top of VIA-C and is out of scope here.

Move 0: the only obviously-correct scheme

Send the client the entire database. The server learns nothing about \(i\) because the client asked for all of it. It is correct and perfectly private, and it moves \(O(N)\) bytes, which is hopeless at scale. Everything that follows is an attempt to keep this privacy while making the server do the selecting, without ever seeing what it selected.

Move 1: hand the server an encrypted selector

Represent the request as a one-hot vector: a \(1\) at position \(i\) and \(0\) elsewhere: say \([0,0,1,0]\) to ask for record \(2\) of four. Treat the database as a matrix \(D\) whose rows are the records and multiply the two: the vector-times-matrix product multiplies each record by its bit and adds the results. For the four-record case that is \(0\!\cdot\! r_0 + 0\!\cdot\! r_1 + 1\!\cdot\! r_2 + 0\!\cdot\! r_3 = r_2\): every record annihilated except the one asked for. The product is a single record, not an \(N\)-wide blend. But the vector states \(i\) outright, so we must encrypt it with a scheme the server can still compute on.

0 0 1 0 query (one-hot) × r0 r1 r2 r3 database D = 0·r0 + 0·r1 + 1·r2 + 0·r3 = r2 record i

VIA's encryption is LWE (Learning With Errors). An LWE ciphertext is literally one linear equation in a long secret vector \(S\): \[ B \;=\; A\cdot S \;+\; E \;+\; \Delta\, m \pmod q, \] where \(A\) is a public random vector, \(m\) is the message, \(E\) is a small random error, and \(\Delta = q/p\) is a scaling factor that lifts \(m\) to a large value \(\Delta m\) so the small error sits harmlessly below it (the worked example below makes that "room underneath" visible). Anyone holding \(S\) computes \(B - A\cdot S = \Delta m + E\) and reads \(m\) off the top; anyone without \(S\) sees only the pair \((A, B)\), which (because \(E\) is random) looks exactly like two random numbers, with no test that tells a real ciphertext from noise (cryptographers call this "computationally indistinguishable"). Recovering \(S\) or \(m\) from many such equations is the Learning-With-Errors problem, believed hard even for quantum computers. So "encrypted" means exactly this: trivially readable with the key, provably noise-like without it.1

This is also where privacy comes from, and it is worth stating up front because the rest of the tour is about correctness: the server runs every step that follows on these ciphertexts and never holds \(S\). To it, the query, the half-built selectors, and the final answer are all just noise. That, not any single gadget below, is what hides which record you wanted; everything else here is about keeping the computation correct without ever lifting that veil.

Decryption is "subtract \(A\cdot S\), then round to the nearest multiple of \(\Delta\)." That rounding is the catch that drives the entire design: it returns the right answer only while the error stays under half a step, \(|E| < \Delta/2\). Call that inequality the noise budget.2 And every homomorphic operation (every computation the server does on ciphertexts) spends some of it, because each one makes \(E\) grow.

Worked example: how noise accumulates, and what happens when it overflows

Take tiny numbers: modulus \(q = 64\), plaintext modulus \(p = 2\) (so \(m\) is a single bit), giving \(\Delta = q/p = 32\). A bit \(m=1\) is stored as the value \(\Delta\cdot 1 + E = 32 + E\); decryption asks "is this value nearer \(0\) or nearer \(32\)?" It answers \(1\) as long as the value stays inside \((16, 48)\): i.e. as long as \(|E| < \Delta/2 = 16\).

Accumulation. Adding two ciphertexts adds their two errors. A fresh ciphertext might hold only \(E = 3\) (value \(35\), comfortably inside the safe band). But the server's whole job is to add ciphertexts together: sum \(k\) of them, each carrying an error of about \(3\), and by that rule the result's error can be as large as \(3 + 3 + \dots = 3k\). So the error grows with how many terms you sum: eight such terms already push \(E\) to \(8\times 3 = 24\), and the stored value becomes \(32 + 24 = 56\). (This is not hypothetical: it is exactly what a one-hot selector does: it sums one term per database row, so a bigger database means more terms and more error.)

0 32 = Δ·1 64 ≡ 0 decodes m = 0 decodes m = 1 decodes m = 0 correct while |E| < Δ/2 = 16 (either side of 32) fresh: 32+3 = 35 ✓ → 1 overflowed: 32+24 = 56 ✗ → 0
The decryption "ruler" for \(q=64,\ \Delta=32\). A value decodes to the slot it rounds to. Stay within \(\Delta/2=16\) of \(32\) (green band) and the bit reads \(1\); drift past \(48\) and it rounds up to \(64\equiv0\) and reads \(0\).

Overflow returns the wrong value, silently. At \(E=24\) the value \(56\) has crossed the boundary at \(48\); it is now nearer to \(64\equiv 0\) than to \(32\), so decryption rounds it to \(0\). The bit has silently flipped from \(1\) to \(0\). Nothing raises an error (the arithmetic is all valid mod \(q\)), so the client simply receives the wrong record with no way to tell. That is why noise is the hard constraint, not a nuisance: the same picture holds at realistic scale (with \(\Delta\) in the hundreds and a budget half of that). It also exposes Move 1's second flaw: error grows with the number of terms summed, so a length-\(N\) selector flips the bit on any sizable database. Move 1 loses on both counts: \(N\) to upload and noise that scales with \(N\).

Move 2: send the index in binary, expand it server-side

Lay the database out as an \(I\times J\) grid, so a record's address splits into a row \(\alpha\) and a column \(\beta\) (a third index \(\gamma\) picks a slot inside the record, covered in Move 3). Move 2 is only about selecting the row. Here is the trick: instead of sending an \(I\)-entry one-hot vector, the client sends \(\alpha\) in binary (just \(\log_2 I\) encrypted bits) and the server rebuilds the one-hot row selector from those bits itself, using a tree of DMux gates.3

A DMux ("de-multiplexer") gate takes one ciphertext and one encrypted control bit and emits two ciphertexts: it sends the input down one of the two outputs and puts \(\text{enc}(0)\) on the other: the bit decides which side gets the input. Wire \(\log_2 I\) of these into a binary tree, one level per address bit: drop a single \(\text{enc}(1)\) in at the root, and at each level the next bit of \(\alpha\) steers that live payload down one branch while every other branch fills with \(\text{enc}(0)\). After the last level, the \(I\) leaves are all \(\text{enc}(0)\) except the one at address \(\alpha\), which holds \(\text{enc}(1)\): that row of leaves is the encrypted one-hot selector, rebuilt from \(\log_2 I\) bits.

graph TD
  R["root payload = enc(1)"]
  R -->|"high bit b₁ = 0"| N0["enc(0)"]
  R -->|"high bit b₁ = 1 ✓"| N1["enc(1)"]
  N0 --> L0["row 0 → enc(0)"]
  N0 --> L1["row 1 → enc(0)"]
  N1 -->|"low bit b₀ = 0 ✓"| L2["row 2 → enc(1)"]
  N1 -->|"low bit b₀ = 1"| L3["row 3 → enc(0)"]
  classDef live fill:#e8f7ee,stroke:#1f9d6b,stroke-width:2px;
  classDef dead fill:#f5f7fa,stroke:#cdd5df,color:#9aa4b2;
  class R,N1,L2 live;
  class N0,L0,L1,L3 dead;
DMux tree for \(I=4\), routing to row \(\alpha = 2\) (binary \(10\)). The high bit \(b_1\) picks the right subtree, the low bit \(b_0\) picks the left leaf inside it (✓ = the live path); every other branch fills with \(\text{enc}(0)\). The four leaves are the encrypted one-hot selector \([0,0,1,0]\): rebuilt by the server from just \(\log_2 4 = 2\) bits.

Now the server holds an encrypted one-hot row selector: \(I\) ciphertexts, all encrypting \(0\) except the \(\alpha\)-th, which encrypts \(1\). Reading out row \(\alpha\) is then almost mechanical: multiply each database row by its matching ciphertext and add the results. The database rows are ordinary plaintext, so "plaintext row \(\times\) encrypted bit" is the cheap kind of multiplication: every row but the \(\alpha\)-th is multiplied by an encryption of \(0\) and vanishes, while row \(\alpha\) is multiplied by an encryption of \(1\) and survives. The server ends up holding an encryption of exactly the row the client asked for, having never learned which one. (VIA calls this the first-dimension step.)4

First dimension: (encrypted one-hot) × (plaintext database), then add selector plaintext DB product 0 × db row 0 = ×0 → enc(0), vanishes 0 × db row 1 = ×0 → enc(0), vanishes 1 × db row 2 = ×1 → db row 2 survives 0 × db row 3 = ×0 → enc(0), vanishes Σ (add all four) → an encryption of db row 2 — the server never learns it was row 2
The first-dimension step makes "select without seeing" concrete: the encrypted selector zeroes out every row but the chosen one, and the multiplications are the cheap plaintext×ciphertext kind. The server holds the answer encrypted and is none the wiser about \(\alpha\).

A symmetric tree of CMux gates (a CMux is the mirror image of a DMux: given two ciphertexts and an encrypted control bit, it outputs whichever one the bit selects) then folds the \(J\) columns of that row down to the single target cell. Finally a controlled rotation (CRot) cyclically shifts the packed slots so that coefficient-slot \(\gamma\) lands in the readable position.5

The payoff is all in the noise. A DMux tree is \(\log_2 I\) levels deep, and (take this on faith for one more Move; Move 3 earns it) each level adds only a fixed amount of noise variance. Swapping a length-\(I\) selector for \(\log_2 I\) control bits turns noise that grew with \(I\) into noise that grows with \(\log_2 I\): the single most important budgeting move in the design. (The earlier "coefficient expansion" technique that VIA replaces built the selector in a way whose noise variance grew quadratically, exactly the trap Move 2 sidesteps.)6

Move 3: make "multiply by an encrypted bit" cheap

Move 2 leaned on an operation it never justified: multiplying a ciphertext by an encrypted bit, cheaply. The danger case is multiplying two encrypted values directly: there the two error terms cross-multiply, so the result's noise is roughly the product of the inputs', and it grows multiplicatively. A tree of such multiplications would blow the budget after a couple of levels (red curve below). (Multiplying an encrypted value by ordinary plaintext, as the first-dimension step does, is not the problem: that one is cheap.)

noise gate operations (tree levels) → 1 2 3 4 5 noise budget — decryption fails above additive — RGSW external product multiplicative — naïve ciphertext × ciphertext
Why the gate primitive matters. Naïve ciphertext multiplication compounds noise and crosses the budget within a few operations; the RGSW external product adds only a fixed amount per gate, so a whole tree of them stays safe.

The fix is a special, bulkier ciphertext called RGSW that supports the external product \(\text{RGSW} \otimes \text{RLWE} \to \text{RLWE}\): it multiplies a ciphertext by an encrypted bit while adding only a fixed amount of noise (green curve).7 Two ingredients make it work.

First, RLWE: LWE lifted from single numbers up to whole polynomials. Instead of one secret number, the secret is now a degree-\(n\) polynomial (its coefficients taken mod \(q\)); the set of all such polynomials, with ordinary add-and-multiply, is what cryptographers call a ring, written \(R_{n,q}\). The practical win is packing and speed: one RLWE ciphertext carries \(n\) message values at once, and two of them multiply far faster than you'd expect: the schoolbook way of multiplying two degree-\(n\) polynomials costs about \(n^2\) steps, while a classic fast-multiplication trick (the Number-Theoretic Transform, a cousin of the FFT from signal processing) cuts that to about \(n\log n\). That speedup is what makes it practical to operate on thousands of packed values inside a single ciphertext.

Second, gadget decomposition, which is what keeps the multiplication's noise additive. Rather than multiply by one big mod-\(q\) value (big value → big noise), the gadget writes that value in a small base \(B\) as \(\ell\) little "digits" (the same idea as storing a big integer as an array of machine words) and multiplies digit by digit. Because every partial product is small, so is the noise each contributes. (VIA uses an approximate gadget decomposition: it drops the least-significant digits, accepting a tiny reconstruction error that is simply folded into the noise budget, in exchange for speed.) With RGSW and the external product in hand, the DMux and CMux gates of Move 2 are exactly the additive-noise gates the trees needed: settling the IOU from Move 2.8

Move 4: stop shipping big gate-keys

RGSW ciphertexts are large: they are bundles of many RLWE ciphertexts. Transmitting the control bits as RGSW (the "gate-keys" the trees consume) re-inflates the query we worked to shrink. The fix, in the variant called VIA-C: the client sends only tiny LWE ciphertexts of the index bits, and the server rebuilds the full RGSW gate-keys itself, in two homomorphic re-encoding steps: an LWE→RLWE conversion followed by an RLWE→RGSW conversion.9

graph LR
  Q["client sends tiny LWE ciphertexts of the index bits, about ℓ log N numbers"]
  Q --> C1["server: LWE to RLWE conversion"]
  C1 --> C2["RLWE to RGSW conversion"]
  C2 --> G["full-size RGSW gate-keys rebuilt on the server"]
  G --> T["feed the DMux / CMux / CRot trees (Move 2)"]
  classDef a fill:#e8f7ee,stroke:#1f9d6b;
  classDef b fill:#eaf2fb,stroke:#2b5dd1;
  class Q a; class C1,C2,G,T b;
VIA-C moves the cost of building gate-keys from the wire to the server: a logarithmic-size query is decompressed back into RGSW on the server side. (Green = on the client; blue = on the server.)

These conversions are the same family of operation as key-, ring-, and modulus-switching: using public "switching keys," the server can move a ciphertext to a different secret key, a smaller ring, or a smaller modulus without decrypting it, re-expressing the same hidden value under new parameters. A conversion just changes a ciphertext's shape (LWE → RLWE → RGSW) while preserving what it hides, which is why the server can rebuild gate-keys it could never have read.

The per-query upload collapses to about \(\ell\cdot\log N\) numbers (\(\ell\) is the gadget digit-count from Move 3). Two costs come with this. First, the conversion and switching keys are themselves a one-time setup payload the client supplies up front, so VIA-C (unlike base VIA, which sends nothing ahead of time) does reintroduce a modest amount of offline communication; the zero-offline headline stays base VIA's. Second, supplying keys that encrypt key material asks for one extra, mild assumption, circular security: that it stays safe to encrypt a secret under a key related to that same secret, which is exactly what these conversion keys do (base VIA, sending no keys, doesn't need it).10 What VIA-C buys for these two costs is a dramatically smaller cost on every query that follows.

Move 5: pay for the conversions, then balance the whole ledger

Re-encoding is not free: each conversion injects noise, so a careless LWE→RLWE step would bankrupt the budget on its own. VIA's central technical contribution is a conversion whose noise variance grows only as \(O(n\log n)\), where the prior method's grew as \(O(n^3)\), and that frugality is exactly what makes Move 4 affordable rather than ruinous. (Why the new method is so much quieter is the paper's main result and beyond our scope here; take it as an asserted result, not one we prove.)11

With every step's cost bounded, correctness reduces to one accounting question: is the total noise accumulated across conversion → DMux tree → first-dimension sum → CMux tree → CRot → response compression still below \(\Delta/2\) when the client finally rounds? VIA keeps the answer "yes" with a descending modulus chain \(q_1 > q_2 > q_3 > q_4\). A modulus switch rescales a ciphertext to a smaller modulus and, in doing so, scales its noise down too: shrinking the data and buying back budget at once. A ring switch likewise compresses the answer into a smaller ring before it travels back.12

noise decryption threshold Δ/2 mod-switch ↓ ring-switch ↓ convert DMux first-dim CMux CRot decode
The noise ledger. Error climbs as the server works (the first-dimension sum is the steepest rise), but well-timed modulus and ring switches knock it back down, so the final value lands just under \(\Delta/2\) and the record decrypts correctly.

And travel back it does: the server returns this final, compressed ciphertext, and the client (the only party holding \(S\)) decrypts it with the very same "round to the nearest \(\Delta\)" rule from Move 1. That last rounding is the reason the whole budget had to stay under \(\Delta/2\) in the first place.

Why it all holds together

Each primitive is the forced answer to the previous step's flaw (the one-hot selector, the binary index plus DMux tree, the RGSW external product, the LWE→RLWE→RGSW conversion, and the low-noise conversion with its modulus ledger), and the chain ends with the right record decrypted, just under budget. Privacy, meanwhile, was secured back in Move 1 and never spent: because every value the server touches is a ciphertext it cannot read, it processes the entire query while learning nothing about \(i\). Formally, under the RLWE hardness assumption and its module variant MLWE (plus the circular-security assumption that Move 4 introduced), the query is provably indistinguishable from random.13 VIA is simply what you get when you keep fixing the problem your last fix created, without ever letting the server see.

Footnotes

paper stated in Liu et al. (2025) code observed in via-rs (pinned to commit 9575286)

  1. paper Liu et al. (2025) §2.2 (Lattice-Based Homomorphic Encryption), p.5 — RLWE is parameterised by \((n,q,\chi_S,\chi_E)\) with \(\{(A,A\!\cdot\!S+E)\}\approx_c\{(A,R)\}\); encryption \(c=(A,AS+E+M)\), \(M=\Delta m\), \(\Delta=\lceil q/p\rceil\), decryption \(\lfloor(B-AS)/\Delta\rceil\). The intro introduces this in its single-number (LWE) form first and lifts it to polynomials (RLWE) in Move 3.
  2. inference The \(|E|<\Delta/2\) decode condition is the direct consequence of rounding \(\lfloor(B-AS)/\Delta\rceil\) (fn 1): the nearest-multiple-of-\(\Delta\) rule recovers \(m\) iff the additive error has not crossed the half-step boundary.
  3. paper Liu et al. (2025) §3.1 (The VIA Protocol), p.8 — "VIA substitutes coefficient expansion techniques with DMux for generating the encrypted one-hot vector." The address is split into a row \(\alpha\) and column \(\beta\) over an \(I\times J\) grid (p.8).
  4. client/decompose.rs — index split \(\texttt{idx}=\gamma(IJ)+\alpha J+\beta\) into \((\alpha,\beta,\gamma)=\)(row, column, slot); \(\alpha\) drives DMux (MSB-first), \(\beta\)/\(\gamma\) LSB-first. The first-dimension plaintext×ciphertext inner product is server/first_dim.rs.
  5. primitives/gates/mux.rs (CMux/DMux + trees, built on the RGSW external product) and primitives/gates/rotate.rs (crot, RGSW-controlled rotation). Paper §4.3/§4.4, p.9.
  6. paper Liu et al. (2025) §3.1, p.8 — DMux introduces "only logarithmic noise variance in the one-hot vector length", whereas conventional ciphertext-compression (coefficient expansion) approaches "exhibit at least quadratic noise growth."
  7. paper Liu et al. (2025) §2.2, p.6 — external product \(\boxdot: \mathrm{RGSW}_S(M_1)\times\mathrm{RLWE}_S(M_2)\to \mathrm{RLWE}_S(M_1M_2)\) (the intro writes \(\otimes\)). Implemented in primitives/encryption/rgsw.rs.
  8. paper Liu et al. (2025) §2.2, p.6 — "approximate gadget decomposition proposed in [40]: for gadget parameters \((\ell,B)\)". Code: primitives/encryption/gadget.rs — keeps the top \(L\) digits, "≈" hiding a reconstruction error bounded by \(\mathrm{round}(q/B^L)/2\) absorbed into noise.
  9. paper Liu et al. (2025) §4 (Query compression, Figure 6), p.13 — the client encrypts the control/select/rotation bits as LWE ciphertexts (\(O(\log N)\) elements of \(\mathbb Z_{q_1}\)); the server runs LWE→RLWE then forms the RGSW gate-keys. Code: conversion/cascade.rs (LWE→RLWE) and gates/convert.rs (RLWE→RGSW).
  10. paper Liu et al. (2025) §3.1, p.9 — VIA's security holds "if the encryption system is secure along with a 'circular security' assumption"; VIA-B's relies on circular / key-dependent message security (§C.1). Base VIA, which ships no keys, does not need it.
  11. paper Liu et al. (2025) §1, p.2 and §4.2 (Lemma 4.2), p.12 — the new LWE-to-RLWE conversion "introduces only \(O(n\log n)\) noise variance" (\(\theta'\le\theta_c+2\theta_{ks}\log n\)), versus the prior method of [28] whose noise variance is \(O(n^3)\) (cubic in the polynomial degree, §4.1, p.11). unverified the \(O(n\log n)\) figure is the paper's asserted result, re-derived noise budgets live in via-estimator, not on this page.
  12. paper Liu et al. (2025) §3.1, p.8 — moduli \(q_1>q_2>q_3>q_4>p\); modulus switching is applied before first-dimensional folding (p.8), and response compression uses ring switching (§4.3).
  13. paper Liu et al. (2025) §3.1, p.9 — query indistinguishability under the RLWE/MLWE assumption plus circular security. This is the paper's claim; this page deliberately states no concrete security-bit figure — the in-repo source of trust for any such number is via-estimator.