
Improvement of performance of Grover's algorithm on three generations of Heron family IBM QPUs without and with topological dynamical decoupling

Authors: Tihomir G. Tenev, Nayden P. Nedev, Nikolay V. Vitanov · arXiv:2604.23228 · submission cycle 2026-04-28 · score 8/10 (HIGH)

Abstract

We investigate the performance of Grover's algorithm on three different generations of IBM Heron QPUs. On Heron family of IBM QPUs the success probabilities for three, four and five qubits without dynamical decoupling is better than results reported for previous generations of QPUs. The success probability as function of number of iterations of Grover operator is considered. A study of the improvement of results of Grover's algorithm for five qubit case with the help of topological dynamical decoupling is considered. For a six qubit case on Heron r3 QPU a clear result for finding the sought-after bitstring is reported for theoretically suboptimal number of iterations of Grover operator with the help of dynamical decoupling.

Executive summary

An empirical hardware study running Grover's search at 3–6 qubits on three IBM Heron-generation devices (Torino r1, Marrakesh r2, Pittsburgh r3), with and without dynamical-decoupling (DD) sequences. Headline results: (i) success probabilities without DD on the new Heron r3 (Pittsburgh) already exceed prior generations' DD-protected numbers for n=3, 4, 5; (ii) in the 6-qubit case the success probability at the theoretically optimal 6 iterations is at or below the random-guess floor on all three devices, but with T4 or XY4 DD at 2–3 iterations Pittsburgh recovers a clean target signal with success probability up to ~0.06; (iii) the recently proposed topological DD family Tn matches or slightly beats XY4 in some configurations, with smaller-pulse-count sequences (T2, T4) generally best. This is squarely on the same Heron platform Y6 used for the PBR test, and the Grover algorithm underlies Y4's cardinality-constrained quantum advantage claim, so this paper is a useful empirical baseline for both lines.

Main contribution

A controlled cross-device benchmark of Grover at increasing problem sizes on the three Heron generations (calibration-data-aware run scheduling, fixed transpiler seed=1234, 10 000 shots/run, balanced 0/1 target bitstrings), combined with a head-to-head comparison of CPMG, XY4, and the topological Tn DD sequences (Nedev 2025) on the 5-qubit and 6-qubit cases. The empirical conclusion is that (a) Heron r3 (Pittsburgh) hardware is the first IBM superconducting platform to support useful unprotected Grover at 5 qubits and DD-assisted Grover at 6 qubits, and (b) in the 5-qubit case the best Tn sequence (T2) outperforms XY4 by ~14% on Pittsburgh.

Key experimental protocol

Detailed walkthrough

The 3–5 qubit case (Section III-A): the success probability increases monotonically from Heron r1 to r3 at every problem size. For n=5, Pittsburgh (Heron r3) reaches ~0.35 unprotected, "almost twice as good" as DD-protected previous-generation IBM results. The authors attribute this to better T1 and T2 and lower two-qubit (2Q) gate error on Heron r3, tabulated in the paper.

The 5q iteration sweep (Section III-B) reveals the canonical noise-vs-signal trade-off: the highest unprotected success probability is attained below the theoretically optimal 4 iterations, at 2 on Torino, 2–3 on Marrakesh, and 3 on Pittsburgh (peaking near 0.38). The 2Q-gate count grows by 136–139 per iteration (127 → 263 → 402 → 538), and gate-error accumulation overtakes the algorithmic amplification past the device-dependent sweet spot. This is the same mechanism Y3 quantifies for QAOA on portfolio optimisation: in the thermal-relaxation regime, deeper circuits stop helping.
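The trade-off can be illustrated with a minimal sketch. The ideal curve uses the standard single-target Grover formula sin^2((2k+1)θ) with θ = arcsin(1/√N). The noisy curve is a toy global-depolarizing model assumed here for illustration only, not the authors' analysis: each 2Q gate is taken to succeed with a hypothetical fidelity f, and a failed circuit returns a uniformly random bitstring. The mapping of the quoted gate counts to iterations 1–4 is also an assumption.

```python
import math

def ideal_success(n_qubits: int, iterations: int) -> float:
    """Noiseless Grover success probability for one marked bitstring."""
    theta = math.asin(1.0 / math.sqrt(2 ** n_qubits))
    return math.sin((2 * iterations + 1) * theta) ** 2

def optimal_iterations(n_qubits: int) -> int:
    """Textbook optimum floor(pi/4 * sqrt(N)) with N = 2**n_qubits."""
    return math.floor(math.pi / 4 * math.sqrt(2 ** n_qubits))

# 2Q-gate counts quoted in the digest for the 5-qubit sweep.
# ASSUMPTION: the four values correspond to 1, 2, 3, 4 iterations.
GATE_COUNTS_5Q = {1: 127, 2: 263, 3: 402, 4: 538}

def toy_noisy_success(n_qubits: int, iterations: int,
                      gate_count: int, gate_fidelity: float) -> float:
    """ASSUMED toy model: the circuit survives with probability
    gate_fidelity**gate_count; otherwise the output is uniformly random."""
    survival = gate_fidelity ** gate_count
    floor = 1.0 / 2 ** n_qubits
    return survival * ideal_success(n_qubits, iterations) + (1 - survival) * floor

def best_iteration(gate_fidelity: float) -> int:
    """Iteration count that maximises the toy noisy success probability."""
    return max(
        GATE_COUNTS_5Q,
        key=lambda k: toy_noisy_success(5, k, GATE_COUNTS_5Q[k], gate_fidelity),
    )

if __name__ == "__main__":
    f = 0.998  # hypothetical per-gate fidelity, not a device calibration value
    for k, gates in GATE_COUNTS_5Q.items():
        print(k, round(ideal_success(5, k), 3),
              round(toy_noisy_success(5, k, gates, f), 3))
    print("toy optimum:", best_iteration(f), "ideal optimum:", optimal_iterations(5))
```

With the hypothetical f = 0.998 the toy optimum falls at 3 iterations rather than the ideal 4, which reproduces the qualitative shift described above. The numerical values should not be read as a fit to any of the three devices.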

The DD comparison (Section III-C, 5 qubits): on Torino, T8 gives the largest enhancement (~30% over the unprotected case); on Marrakesh, XY4 and T4 are tied; on Pittsburgh, T2 edges out XY4 by ~14%. The trend across Tn is non-monotone: the success probability oscillates with pulse count, and shorter sequences (T2, T4) outperform T10 and T12. The authors attribute this to a trade-off between the number of inserted DD blocks (which grows when each block is short) and the protection per block. The star-topology qubit selection means two qubits dominate the gate load while three are mostly idle. DD has a stronger effect on the idle qubits, which partially explains why pulse-count and timing details matter so much.
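For readers less familiar with DD, the following sketch shows why a sequence such as XY4 can be inserted into idle windows without changing the circuit's logic. It assumes ideal instantaneous pulses and a static frequency offset that acts as a Z rotation during each idle interval; both are simplifications made for this digest. The Tn sequences are not modelled because their pulse definitions are not given here.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
IDENTITY = [[1, 0], [0, 1]]

def free_evolution(phase):
    """Idle-interval unitary exp(-i * phase * Z) for a static offset."""
    return [[cmath.exp(-1j * phase), 0], [0, cmath.exp(1j * phase)]]

def compose(gates):
    """Net unitary of gates listed in time order (later gates act last)."""
    total = IDENTITY
    for gate in gates:
        total = matmul(gate, total)
    return total

def xy4_with_idle(phase):
    """idle, X, idle, Y, idle, X, idle, Y with equal idle intervals."""
    u = free_evolution(phase)
    return compose([u, X, u, Y, u, X, u, Y])

def idle_only(phase):
    """The same four idle intervals with no pulses applied."""
    u = free_evolution(phase)
    return compose([u, u, u, u])

def is_identity_up_to_phase(m, tol=1e-12):
    """True if m equals the identity times a unit-modulus global phase."""
    return (abs(m[0][1]) < tol and abs(m[1][0]) < tol
            and abs(m[0][0] - m[1][1]) < tol
            and abs(abs(m[0][0]) - 1) < tol)

if __name__ == "__main__":
    print(is_identity_up_to_phase(xy4_with_idle(0.37)))  # dephasing refocused
    print(is_identity_up_to_phase(idle_only(0.37)))      # phase error accumulates
```

In this idealised setting the unwanted phase cancels exactly. On hardware, finite pulse duration, pulse errors, and time-dependent noise break the exact cancellation, which is consistent with pulse count and timing mattering in the results above.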

The 6q case (Section III-E) is the headline experimental result. Without DD, the success probability at the optimal 6 iterations is at or below the 1/64 random-guess floor on all three devices; that is, unprotected Grover fails on Heron at 6 qubits. With T4 or XY4, Pittsburgh produces an unambiguous target peak at 2–3 iterations: ~0.06 with T4 at 2 iterations, ~0.05 at 3 iterations. Both values are well above the 1/64 (≈0.016) floor and clearly distinguishable from non-target bitstrings (Figures 8–10 in the paper, viewable on the arXiv abstract page; see the figure-rendering note below). On Torino and Marrakesh the 6q result is at or below the floor regardless of DD.
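A quick sanity check on "well above the floor" is to express the measured success probability as a number of binomial standard deviations above the uniform-guess expectation. The sketch below assumes the 10 000 shots/run quoted in the protocol also applies to the 6-qubit runs and treats shots as independent; the function name is illustrative.

```python
import math

def floor_zscore(success_prob: float, n_qubits: int, shots: int) -> float:
    """Standard deviations by which the target count exceeds the count
    expected if every bitstring were equally likely (binomial model)."""
    p0 = 1 / 2 ** n_qubits                      # random-guess floor, 1/64 for n=6
    expected = shots * p0                       # mean target count under uniform output
    sigma = math.sqrt(shots * p0 * (1 - p0))    # binomial standard deviation
    return (success_prob * shots - expected) / sigma

if __name__ == "__main__":
    for p in (0.06, 0.05):
        print(p, round(floor_zscore(p, n_qubits=6, shots=10_000), 1))
```

Under these assumptions both reported values sit more than 25 standard deviations above the floor, so the signal is statistically unambiguous even though its absolute size is small.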

Section IV's conclusion is properly cautious: the Grover circuit at 6 qubits on Heron r3 returns the right answer with small but resolvable probability for sub-optimal iteration counts when paired with topological or XY4 DD. This is, to the authors' knowledge, the first 6-qubit Grover demonstration with a clean target signal on superconducting hardware. The signal is fragile: at 4–6 iterations on Pittsburgh the success probability collapses back toward the floor.

Figure rendering: the paper's figures are EPS-only and this digest pipeline lacks an EPS converter, so figures could not be embedded inline. Refer to figures 1, 7, and 8–10 in the source PDF for the iteration sweep plots (5q, 6q) and the bitstring histograms confirming target dominance on Pittsburgh under DD.

Citations to Yuan's papers

No direct citation to any of Y1–Y6 found in bibliography.

Overlap with Y1–Y6

Recommended action for Yuan