← Back to digest

Quantum Search without Global Diffusion

Authors: John Burke, Ciaran Mc Goldrick (Trinity College Dublin) · arXiv:2604.15435 · submitted 20 April 2026 · score 8/10 (HIGH)

Abstract

Quantum search is among the most important algorithms in quantum computing, offering fundamentally new approaches to optimisation and information processing. At its core is quantum amplitude amplification, a technique that achieves a quadratic speedup over classical search by combining two global reflections: the oracle, which marks the target, and the diffusion operator, which reflects about the initial state. We show that this speedup can be preserved when the oracle is the only global operator, with all other operations acting locally on non-overlapping partitions of the search register. We present a recursive construction that, when the initial and target states both decompose as tensor products over these chosen partitions, admits an exact closed-form solution for the algorithm's dynamics. This is enabled by an intriguing degeneracy in the principal angles between successive reflections, which collapse to just two distinct values governed by a single recursively defined angle. Applied to unstructured search, a problem that naturally satisfies the tensor decomposition, the approach retains the O(√N) oracle complexity of Grover search when each partition contains at least log2(log2 N) qubits. On an 18-qubit search problem, partitioning into two stages reduces the non-oracle circuit depth by as much as 51%–96% relative to Grover, requiring up to 9% additional oracle calls.

Executive summary

Burke and Mc Goldrick prove that the global diffusion operator in Grover/amplitude amplification is not strictly necessary: if the state-preparation unitary and target state both factor as tensor products across a partition of the qubit register, the global reflection can be replaced entirely by local, register-scoped reflections, with the oracle as the only operator that still spans the full register. The proof identifies a surprising degeneracy — the principal angles between successive reflection subspaces (whose ambient dimension grows exponentially) collapse to two values governed by a scalar recursion. For unstructured search the overhead is negligible once each partition holds at least log2(log2 N) qubits, and at n = 50 the two-stage scheme matches Grover's success probability at oracle cost increase <0.1% while slashing non-oracle depth. For Yuan, this is directly load-bearing for Y4's Grover-ADMM pipeline: the dominant hardware cost of a Grover rotation for a cardinality-constrained feasible set is the non-oracle diffusion on the full n-qubit register, exactly what this paper localises.

Main contribution

The central object is a family of reflection operators Wi, defined recursively from the oracle Sx and local partial-diffusion reflections Sψi = I − 2|ψi⟩⟨ψi| acting only on register i:

Wi = (Sψi Wi−1)ti Sψi (Wi−1 Sψi)ti, W0 = Sx

Within the span of the initial and target register states, the product Sψi Wi−1 acts as a rotation with two distinct principal angles {γi, π/2 − γi} where sin(2γi) = sin(2θi) · sin(2ti−1γi−1) and γ1 = θ1. This scalar recursion yields a closed-form success probability per stage and an oracle overhead bounded by 1/∏i cos(θi) relative to standard amplitude amplification. Measure the non-final registers, reinitialise, repeat on the shrinking remainder — the target is recovered stage-by-stage with a geometric-series overhead dominated by the first round.

Key theorems / lemmas / algorithms

Detailed walkthrough

The paper's pitch is structural: standard Grover/AA requires two global reflections — an oracle Sx that marks the target, and a diffusion operator Sψ = I − 2|ψ⟩⟨ψ| that reflects about the initial state. The second is typically implemented as a multi-controlled operation that touches every qubit; on near-term hardware with limited connectivity, this is expensive. Prior work on partial-diffusion operators (Grover-Radhakrishnan-Korepin, partial search, Zalka-style tradeoffs) reduces but does not eliminate the global reflection; when global diffusion is replaced entirely, the dynamics become a unitary on an exponentially growing subspace and do not admit a closed-form analysis in general.

The key insight of Sec. 3 (algorithm overview) is that if A = ⊗i Ai and |x⟩ = ⊗i |xi⟩ factor across a register partition of sizes n1+…+nm = n (unstructured search with Hadamards satisfies this trivially with any partition), then the recursive reflection Wi defined above acts on span{|ψi⟩, |ψi⟩} as a rotation with just two principal angles. The proof in Sec. algorithm/proof.tex is an induction: at each level of the recursion the ambient invariant subspace doubles, but the reflection only "sees" two directions per register. As a consequence, the squared projections onto |ψi⟩ and |ψi⟩ are cos2(2tiγi) and sin2(2tiγi) irrespective of the lower-register state — which is what makes the stage-by-stage analysis tractable.

The success-probability formula is then derived by tracking projections onto the register-m target component. Measurement collapses the outer registers; the target register is in a mixed state on span{|ψm⟩, |ψm⟩}. The remaining registers are reinitialised and the search iterates on a shrinking register. Because each stage introduces only a multiplicative 1/cos(θi) in query count, and the partition sizes geometrically shrink (for unstructured search) or can be held uniform, the total oracle budget is a geometric series dominated by the first round.

Sec. 5 (Evaluation) does two things. First, an 18-qubit Qiskit simulation: for target |1⟩⊗18, a 3-stage [6,6,6] configuration with (102, 25, 6) iterations per stage measures the target 96.5% of the time vs. the theoretical 96.7%; the 2-stage [9,9] configuration with (201, 17) iterations achieves 99.5% vs. 99.8%. Grover with matched success probability is the baseline. The authors evaluate non-oracle circuit depth under three MCX-decomposition scenarios — S1 (no ancillae), S2 (dirty V-chain using idle qubits), S3 (one clean ancilla). Under S1 at the 3-stage configuration, diffusion depth drops by 95%; under S2 by 97%; with one clean ancilla (S3) by 63%. The oracle-call overheads are 30% (3-stage) and 9% (2-stage). Importantly, the authors compute the break-even oracle-to-diffusion-depth ratio at which the recursive scheme still beats Grover in total depth: for 3-stage it is ∼3.1, 3.2, 1.7 (S1/S2/S3); for 2-stage it is ∼8.8, 10.6, 5.2.

Sec. 5.2 (Scaling analysis) extends to n = 20–50 qubits using the closed-form expressions. The headline: both failure probability and oracle overhead improve with n for fixed stage count. By n = 50, 2-stage failure < 10−7 with <0.1% oracle overhead; 3-stage < 10−4 with <1%; 4-stage < 10−3 with <5%. At this scale the break-even oracle depth ratio under S2 reaches ∼103 for 2-stage, meaning the scheme is depth-favourable even when the oracle is three orders of magnitude deeper than a global diffuser. Applied to an AES-128 key-recovery oracle (oracle-to-diffusion depth ratio ∼10×), the recursive scheme reduces total depth by ∼5%.

A subtle point: success probability per stage saturates below Grover's (99.99%+) because each stage's probability is bounded by the scalar recursion — the paper is explicit that quadratic-speedup preservation comes with a small probability deficit that shrinks with n. For applications that tolerate a ~10−3–10−7 miss rate (essentially all quantum-advantage arguments for combinatorial search), this is a very favourable trade: dramatically shallower non-oracle circuits at essentially free success probability. For applications requiring exact success (deterministic Grover variants), this scheme is not a drop-in.

Figures

Figure 1
Figure 1. Simulation and depth comparison of the recursive search scheme against standard Grover on an 18-qubit single-target problem. Top row: measurement-outcome distributions for 3-stage [6,6,6] and 2-stage [9,9] configurations over 1000 shots. Bottom left: diffusion-operator circuit depth (log scale) under decomposition scenarios S1 (no ancillae), S2 (dirty V-chain), S3 (one clean ancilla). Bottom right: total oracle-call counts vs. Grover baseline matched to the same success probability.
Figure 2
Figure 2. Scaling of the recursive scheme for k = 2, 3, 4 near-equal stages across n = 20 to 50 qubits (unstructured single-target search). Left: failure probability 1 − p on log scale. Right: relative oracle overhead vs. Grover matched to the same success. Both quantities improve rapidly with n for every stage count.
Figure 3
Figure 3. Depth scaling of the recursive scheme vs. Grover for k = 2, 3, 4 stages across n = 20 to 50. Top row: break-even oracle-to-diffusion depth ratio under S1/S2/S3 decompositions (below this ratio the recursive scheme is strictly shallower). Bottom row: total circuit depth relative to Grover for oracle-depth multipliers 1×, 5×, 10× under S3. Dashed red line marks parity with Grover.

Citations to Yuan's papers

No direct citation to any of Y1–Y6 found in bibliography.

Overlap with Y1–Y6

Recommended action for Yuan