← Back to digest

A Resource-Efficient Variational Quantum Framework for the Traveling Salesman Problem

Yuefeng Lin, Chao Zheng, Cong Guo (SpinQ Technology / North China University of Technology) · arXiv:2605.00739 · submitted May 2026 · score 8/10 (HIGH)

Abstract

The Traveling Salesman Problem (TSP) is a prototypical combinatorial optimization problem, but its quantum implementation is limited by the O(n2)-qubit overhead of standard one-hot encodings. Here, we propose a resource-efficient variational quantum framework based on compact binary-register encoding, a permutation-preserving problem-inspired ansatz, and a complementary divide-and-conquer execution strategy. The compact encoding reduces the data-qubit requirement to O(n log n), while the divide-and-conquer formulation lowers the number of qubits required in each local hardware execution to the size of the largest subsystem. Numerical simulations on TSP instances with 4, 5, and 6 cities achieve best average success rates of 100%, 100%, and 95.5%, respectively. A local two-qubit implementation of the divide-and-conquer approximation is further evaluated for a 5-city TSP instance on SpinQ Gemini Pro and SpinQ Triangulum II NMR quantum computers. Taken together, the results indicate how compact encoding and divide-and-conquer execution with classical post-processing can be used to study small combinatorial optimization instances on resource-constrained quantum hardware.

Executive summary

The authors attack the same qubit-overhead bottleneck that motivated Yuan's quasi-binary encoding for portfolio optimization (Y2): instead of one binary variable per city/position pair (n2 qubits), each tour position is held in a ⌈log2(n−1)⌉-qubit register, giving an M⌈log2 M⌉ data-qubit budget after fixing one starting city. They pair this compact encoding with a feasibility-preserving "permutation ansatz" built from ancilla-controlled register-wise SWAPs — the binary-register analogue of a hard-constraint mixer. On 4–6 city benchmarks (10 random graphs, 100 random inits each) the ansatz hits 100%, 100%, 95.5% best-average success. A second contribution is a "divide-and-conquer" execution strategy that exploits the diagonality of the cost Hamiltonian in the computational basis to factorize global expectation values into per-subsystem local measurements, used here to actually run a 5-city instance on a 2-qubit NMR processor. The paper does not cite any of Y1–Y6, but it is the closest methodological analogue this announce cycle to Yuan's compact-encoding + hard-mixer line of work and is worth tracking as a parallel implementation of the same ideas in a different (TSP / NMR) setting.

Main contribution

The central object is a cyclic-symmetry-reduced binary-register Hamiltonian for the TSP whose ground state is an optimal Hamiltonian cycle. Fixing the starting city kills the n-fold cyclic redundancy and leaves M = n−1 positions; each position is a k = ⌈log2 M⌉-qubit register holding a binary city label. A projector Pi(a) onto code a in register i is built as ∏b(1+(−1)abZi,b)/2, and three terms — repetition penalty, invalid-code penalty, and a tour-length term that adds a tilde{D}abPi(a)Pi+1(b) for every adjacent register pair — assemble the full diagonal Ising Hamiltonian (Eqs. 7–10 of the paper). Atop this Hamiltonian sit two complementary engines: (i) a permutation-preserving ansatz that mixes amplitude only inside the feasible permutation subspace via parameterized controlled register swaps, eliminating the need for repetition / invalid-code penalty terms during optimization; and (ii) a divide-and-conquer execution that drops inter-subsystem entanglement so that the diagonal Hamiltonian factorizes over local two-qubit measurements. The first targets compact-but-still-multi-qubit hardware; the second targets the most extreme NISQ regime (the 2-qubit NMR machines used here).

Key algorithms / constructions

Detailed walkthrough

The paper opens with the standard one-hot QUBO (Eqs. 1–3): n2 binary variables xi,u with row- and column-sum equality constraints, a distance term Σ Duv xi,u xi+1,v, and two penalty terms enforcing each-city-once and each-position-once. The pain point — n2 qubits and constraint-violating energies that reshape the cost landscape — is the same one Y2 attacks for portfolio optimization with quasi-binary encoding plus a hard XY-mixer that never leaves the feasible cardinality slice. Section 2.2 of the present paper rebuilds the Hamiltonian on top of binary city-label registers with cyclic-symmetry reduction (M = n−1), giving the M⌈log2 M⌉-qubit count. They are explicit that this is conceptually the same move as Ramezani et al. (2024) but composed with the symmetry reduction.

Section 2.3 is the methodologically interesting half. Rather than QAOA's alternating cost-mixer schedule, they construct a single problem-inspired ansatz that intrinsically lives inside the feasible permutation subspace. The trick is a controlled register swap: an auxiliary qubit prepared in |0⟩ is rotated by Ry(2θ) and then conditionally swaps two adjacent k-qubit registers (k CSWAPs in parallel). Because adjacent transpositions generate SM, an L-layer chain of such blocks can in principle reach any feasible tour. Final tour probabilities are obtained from the marginal over the auxiliary qubit. This is the binary-register analogue of Y2's hard XY-mixer: the "feasibility" is built into the gate primitive rather than enforced via penalty terms in the cost. Because penalties are dropped, the variational objective is just ⟨Hdist⟩, which simplifies the optimization landscape — the same logical motivation that drives Y2's mixer choice.

The simulator results (Fig. 2 / Section 3.2) sweep depth L from n−1 to 30 over 10 random graphs × 100 random inits. The 4-city instance trivially saturates 100%; the 5-city case rises from 92.7% at minimum L to 100% from L=10 onward; the 6-city case shows the most depth dependence, climbing from 62.3% to 95.5% at the largest tested depth and never quite reaching 100% on average. The shaded bands narrow with depth, suggesting better instance-to-instance robustness as depth increases — a pattern that mirrors what Y3 found for QAOA layerwise optimization on DGMVP portfolios. Section 3.3 then benchmarks against Matsuo et al. (2023)'s one-hot feasible-subspace ansatz on the same 4/5/6-city set: 100% / 100% / 95.5% (this work) vs. 94.9% / 88.3% / 58.5% (Matsuo). The 6-city gap is the dominant evidence — 37 percentage points absolute, with the minimum-success-rate floor lifting from 28% to 70% — though the authors are careful to flag this as "not a fully gate-budget-matched benchmark" because the encodings differ.

Table 2 of the paper makes the resource scaling concrete: the proposed circuit needs M⌈log2 M⌉+1 qubits and (M−1)L parameters, with ⌈log2 M⌉(M−1)L CSWAPs; the prior one-hot subspace ansatz needs M2 qubits, M2−M+2 two-qubit gates, and a cubic ⅓M3−½M2+⅙M−1 CSWAPs. The CSWAP-decomposition cost on hardware-native gate sets is acknowledged as a confound that motivates further matched-budget benchmarks.

Section 3.4 deploys the divide-and-conquer engine on hardware. For the 5-city instance the symmetry-reduced problem is an 8-qubit Hamiltonian (M=4, k=2 → 4 registers × 2 qubits = 8); they partition into 4 subsystems of 2 qubits each, prepare the 2-qubit hardware-efficient block on each, and reconstruct the global energy by classical multiplication of subsystem expectation values. SPSA drives 120 optimization iterations with 1024 shots/circuit. On Gemini Pro the raw loss falls 101.81 → 98.75, with target-state probability 0.714 → 0.829; IBU mitigation pushes those to 98.35 / 0.868. On Triangulum II raw measurements are noisier (loss fluctuating 108–112, target-state probability 0.651), but IBU-mitigated trajectories sit in 97–100 with target-state probability 0.959 (peaking >0.98). The authors attribute IBU's edge over matrix-inversion mitigation to its preservation of physical probability constraints under noisy calibration — the same reason matrix-inversion error mitigation is fragile in Y3-style shot-noise studies.

The conclusion (Section 4) is appropriately hedged: the work covers only 4–6 city instances; trainability under barren plateaus is unanalyzed; and the divide-and-conquer approximation deliberately discards inter-subsystem entanglement, which limits accuracy when the optimal-solution distribution is correlated. The authors flag matched-budget benchmarks against one-hot ansatze and CSWAP decomposition cost as the obvious next steps.

Figures

Success rate vs depth for 4, 5, 6 city TSP
Figure 2 (paper). Success rate as a function of circuit depth for the proposed problem-inspired ansatz, for TSP instances with n = 4, 5, 6 cities. Solid curves are the average success rate over 10 randomly generated graph instances; shaded regions are the absolute min–max range across the 10 instances.
Bar chart comparing proposed ansatz to Matsuo et al. (2023)
Figure 3 (paper). Performance comparison between the proposed problem-inspired ansatz and the prior one-hot feasible-subspace ansatz of Matsuo et al. (2023). Bars give the average success rate over 10 random TSP instances per problem size; error bars show the absolute min–max range across instances.
Loss trajectory on Gemini Pro
Figure 4(a) (paper). Loss on SpinQ Gemini Pro for the 5-city TSP instance, comparing raw noisy measurements, iterative Bayesian unfolding (IBU), matrix-inversion-based mitigation, and the noise-free simulator.
Target state probability on Gemini Pro
Figure 4(b) (paper). Target-state probability during optimization on SpinQ Gemini Pro for the same 5-city TSP instance.
Loss trajectory on Triangulum II
Figure 5(a) (paper). Loss on SpinQ Triangulum II for the 5-city TSP instance, with the four measurement-processing variants overlaid.
Target state probability on Triangulum II
Figure 5(b) (paper). Target-state probability during optimization on SpinQ Triangulum II.
Per-instance success rate vs depth, 5 cities
Supplementary figure. Per-instance success rate vs circuit depth for the 5-city case (caption not in main text — this is a per-instance breakdown supporting Fig. 2).
Per-instance success rate vs depth, 6 cities
Supplementary figure. Per-instance success rate vs circuit depth for the 6-city case (caption not in main text — supporting per-instance breakdown for Fig. 2).

Citations to Yuan's papers

No direct citation to any of Y1–Y6 found in bibliography.

Overlap with Y1–Y6

Recommended action for Yuan