← Back to digest

Quantum Search without Global Diffusion

John Burke, Ciaran Mc Goldrick · arXiv:2604.15435 · submitted 2026-04-20 · score 8/10 (HIGH)

Abstract

Quantum search is among the most important algorithms in quantum computing, offering fundamentally new approaches to optimisation and information processing. At its core is quantum amplitude amplification, a technique that achieves a quadratic speedup over classical search by combining two global reflections: the oracle, which marks the target, and the diffusion operator, which reflects about the initial state. We show that this speedup can be preserved when the oracle is the only global operator, with all other operations acting locally on non-overlapping partitions of the search register. We present a recursive construction that, when the initial and target states both decompose as tensor products over these chosen partitions, admits an exact closed-form solution for the algorithm's dynamics. Applied to unstructured search, the approach retains the O(√N) oracle complexity of Grover search when each partition contains at least log₂(log₂ N) qubits. On an 18-qubit search problem, partitioning into two stages reduces the non-oracle circuit depth by 51%–96% relative to Grover, requiring up to 9% additional oracle calls.

Executive summary

Burke and Mc Goldrick prove that the global diffusion reflection at the heart of Grover/amplitude amplification is unnecessary whenever the state preparation unitary A and the target state both factor as tensor products across a partition of the qubit register. They give a recursive construction in which only the oracle remains global; every other reflection acts within a single partition. The principal angles between the recursively defined reflections collapse to two values governed by a single recursion sin(2γᵢ) = sin(2θᵢ) sin(2tᵢ₋₁γᵢ₋₁), yielding a closed-form expression for success probability and an oracle overhead of 1/∏cos(θᵢ) over standard QAA. For Yuan, the relevance is direct: Y4 builds a Grover-based algorithm for cardinality-constrained binary optimisation, and the cardinality-constrained feasible space (under quasi-binary encoding from Y2) is exactly the kind of structured product space where local diffusion may yield significant depth savings on hardware.

Main contribution

The central theorem (Section Per-Stage Oracle Complexity) shows that, with intermediate stage iterates set to tᵢ = 1, the recursive construction Wᵢ = (Sψᵢ Wᵢ₋₁)tᵢ Sψᵢ (Wᵢ₋₁ Sψᵢ)tᵢ drives the amplitude on the outermost register towards the target with the per-round oracle cost bounded by C ≤ CQAA/∏ᵢ cos(θᵢ). The key technical observation is that despite the reflected eigenspaces growing exponentially (dimension 2i-1), the principal angles between consecutive reflections collapse to two values, reducing the analysis at every level to a scalar 2×2 problem. The total complexity sums geometrically over recursive rounds and the dominant cost remains the first round.

Key theorems / lemmas / algorithms

Detailed walkthrough

The paper formalises a striking observation about Grover-style amplitude amplification: the global diffusion 2|ψ⟩⟨ψ| − I is conventionally implemented as an n-qubit reflection, requiring multi-controlled gates that dominate the non-oracle depth on near-term hardware. The authors show that whenever the prepared state and the target state both factor across a chosen partition of the n-qubit register into m blocks of sizes n₁, …, nₘ, the global diffusion can be replaced by a recursive construction built only from local partial diffusions Sψᵢ (each acting on register i, implemented from Aᵢ, Aᵢ†, and a local zero-state phase flip) and the global oracle Sx — defining W₀ = Sx and Wᵢ = (SψᵢWᵢ₋₁)tᵢ Sψᵢ (Wᵢ₋₁ Sψᵢ)tᵢ (overview, Section 3.1).

The decisive technical result is the principal-angle degeneracy proved in Section 3.2 (proof.tex, the longest file in the source). Although Wᵢ reflects in a subspace of dimension 2i-1, all principal angles between the eigenspaces of consecutive reflections collapse to just two values γᵢ and π/2 − γᵢ. Without this collapse, the dynamics would require analysing an exponentially growing number of mixing rates; with it, the amplification reduces to a scalar SU(2)-like rotation per recursion level. Equation stage_succ then gives the closed-form success probability for the outermost register and the oracle-optimal iteration count tₘ* = ⌊π/(4γₘ) − ½⌉. The authors prove that intermediate iterates tᵢ < m are optimal at tᵢ = 1 via a doubling argument, leaving a single recursion sin(2γₘ) = ∏ sin(2θᵢ) = 2m sin(θ) ∏ cos(θᵢ).

The total cost (Section Total Oracle Complexity, final_complexity.tex) is dominated by the first round; remaining rounds operate on progressively smaller spaces and form a geometric series. For unstructured single-target search with equal stage size s, the bound is Ctotal ≤ CGrover(n) / [∏cos(θᵢ) · (1 − 21−s/2)]. Both factors converge to 1 once s ≥ log₂ log₂ N, recovering Grover's O(√N) query count exactly. This matches the asymptotic gate scaling of earlier sparse-diffusion Grover variants (Grover '02 trade-offs) but eliminates global non-oracle operators completely.

The evaluation (Section 4, evaluation.tex) backs the theory with Qiskit simulation on an 18-qubit search for |1⟩⊗18. The 3-stage [6,6,6] configuration measures the target in 96.5% of shots vs. theoretical 96.7%; the 2-stage [9,9] in 99.5% vs. 99.8%. Three multi-controlled-X decomposition scenarios (no ancilla, dirty V-chain via idle qubits, single clean ancilla) are compared. Under S2 the diffusion-circuit depth drops 96–97% versus Grover at 9–30% extra oracle calls; with a clean ancilla (S3) depth still drops 51–63%. Failure modes also retain partial-prefix structure: 57% of 3-stage failures still recover the first-stage substring correctly. The scaling analysis (n=20 to n=50) shows failure probability and oracle overhead both decay rapidly: 4-stage at n=20 has 10% failure / 55% overhead, both improving to <0.1%/<5% at n=50. Break-even oracle-to-diffusion ratios under S2 reach ~10³ at n=50 (2-stage), so even AES-128 oracles (≈10× depth) gain about 5% total depth reduction.

The discussion calls out three limitations: mid-circuit measurement noise (mitigated by scaling: 9 measurements vs. 250 oracle calls for 100 qubits in 10 stages), per-stage rounding error from integer iteration counts (potentially fixable via fixed-point amplitude amplification or phase-tunable variants), and the tensor-decomposition requirement on state preparation (a hard constraint, but the target-state version of the constraint can be relaxed at the cost of breaking the angle degeneracy and computing each rotation plane independently). The authors note their construction is particularly suited to distributed quantum architectures because all non-oracle operations stay within a single processing node, removing exactly the inter-node entanglement that dominates noise in distributed Grover demonstrations.

Figures

Figure 1
Figure 1. Scaling behaviour of the recursive search scheme for k = 2, 3, 4 stages of near equal size, computed using the expressions from §3 applied to unstructured searches over n = 20 to n = 50 qubits with a single target. Left: Failure probability 1 − p (log scale). Right: Relative oracle overhead vs. standard Grover matched to the same success probability.
Figure 2
Figure 2. Depth scaling of the recursive scheme relative to standard Grover for k = 2, 3, 4 stages across n = 20–50. Top: Break-even oracle-to-diffusion depth ratio under decomposition scenarios S1 (no ancillae), S2 (dirty V-chain), S3 (clean ancilla). The recursive scheme is shallower than Grover whenever the oracle depth lies below the plotted ratio. Bottom: Total relative depth assuming a single oracle costs 1×, 5×, 10× the diffusion-operator depth.
Figure 3
Figure 3. Simulation and depth comparison of the recursive scheme against standard Grover on an 18-qubit single-target problem. Top: Measurement outcome distributions for [6,6,6] and [9,9] configurations over 1000 shots. Bottom left: Diffusion operator depth (log scale) under S1/S2/S3 with percent of Grover. Bottom right: Total oracle calls with percent overhead at matched success probability.

Citations to Yuan's papers

No direct citation to any of Y1–Y6 found in bibliography.

Overlap with Y1–Y6

Recommended action for Yuan