Formulating Subgroup Discovery as a Quantum Optimization Problem for Network Security
Note: LaTeX source unavailable — analysis based on the publicly-served PDF. Per-section references and equation labels are taken from the PDF text.
Abstract
While current network intrusion detection systems achieve satisfactory accuracy, they often lack explainability. Subgroup Discovery (SD) addresses this by building interpretable rules that characterize feature interactions associated with attack traffic. With large datasets, classical heuristic beam search methods struggle with exponentially scaling search spaces and can prune critical multi-feature interactions. This paper introduces a quantum-enhanced pipeline for SD applied to network intrusion detection using NSL-KDD, formulating SD as quantum optimization for the first time. By encoding feature selection as a Quadratic Unconstrained Binary Optimization (QUBO) and solving it via the Quantum Approximate Optimization Algorithm (QAOA) on IBM Quantum hardware (ibm_pittsburgh), the pipeline identifies subgroups of network features that discriminate normal from attack traffic. Hardware scaling experiments on ibm_pittsburgh (10–30 qubits) reveal that QAOA at depth p = 1 shows WRAcc ratios of 0.983 at 10 qubits, 0.971 at 15 qubits, 0.855 at 20 qubits, and 0.624 at 25 qubits, degrading to 0.039 at 30 qubits as circuit noise dominates, establishing an empirical NISQ scaling boundary.
Executive summary
This is a NISQ-era empirical paper executing QAOA on a real IBM Heron-class device (ibm_pittsburgh, 156-qubit) for a QUBO derived from a feature-selection problem with cardinality constraint. It mirrors Yuan's Y3 setup (end-to-end QAOA for a QUBO on real IBM hardware, with attention to noise-driven degradation), and Y2/Y4 (cardinality-constrained QUBO with squared-deviation penalty). The standout empirical contribution is a measured 10→30 qubit scaling curve for dense, structured QUBOs at p=1, which gives a directly comparable benchmark against Y3's NISQ-noise analysis. The "QAOA-unique subgroup" finding — that QAOA's superposition-based search recovers multi-feature interactions that classical beam search prunes early — is the kind of soft, application-level NISQ benefit Y3 also explored in the portfolio domain.
Main contribution
The paper builds a five-phase pipeline: (1) preprocess NSL-KDD intrusion data with information-gain feature ranking; (2) compute classical baselines via beam search and (where tractable) exhaustive enumeration; (3) construct a QUBO whose objective is a least-squares fit of the WRAcc functional augmented by a squared-cardinality penalty; (4) execute p=1 QAOA on ibm_pittsburgh from 10 up to 30 qubits with COBYLA optimisation; (5) evaluate via two complementary approximation ratios (energy ratio rE and WRAcc-recovery ratio rW). The cardinality penalty λ(Σxi − K)² is calibrated per-instance to make target-cardinality solutions favourable. The empirical scaling curve and a set of "QAOA-unique" interpretable subgroups are the headline results.
Key constructions and results
- WRAcc QUBO formulation (Sec. 3.3): Weighted Relative Accuracy is fit by a quadratic function of binary feature-selection variables; cardinality is enforced by an additive penalty λ(Σxi − K)² with λ calibrated so that wrong-cardinality solutions are strictly worse than the target-cardinality ground state.
- Dual approximation ratios (Sec. 3.5): energy ratio rE = ⟨HC⟩QAOA/Eground and application ratio rW = WRAccQAOA(K)/WRAcc*. The dual framework decouples Hamiltonian-quality from solution-quality, an explicit acknowledgement that low energy ≠ best operational answer.
- Hardware experiments (Sec. 4.3, Table I): p=1 QAOA on ibm_pittsburgh, recovering rW = 0.983 (10 qubits, k=3 Probe), 0.971 (15 qubits, k=5 R2L), 0.855 (20 qubits, k=3 Probe), 0.624 (25 qubits, k=5 R2L), and 0.039 (30 qubits, k=3 Probe). Noise dominates by 30 qubits.
- Noiseless simulator baseline (Sec. 4.3): Aer p=1 at 10 qubits achieves rW = 1.0 (recovers the exhaustive-optimal subgroup); p=2 increases probability mass at the target cardinality (11.8% → 13.7%) without changing the ratio.
- QAOA-unique subgroups (Sec. 4.4): a six-feature R2L conjunction rule {service_ftp_data, srv_count, count, service_http, dst_host_srv_diff_host_rate, dst_host_count} achieves 99.6% test precision; this rule is found by QAOA but pruned by beam search because intermediate two- and three-feature subgroups score weakly.
Detailed walkthrough
Subgroup discovery in network security is the problem of finding small interpretable feature conjunctions that classify attack traffic. The standard quality measure is Weighted Relative Accuracy (Sec. 2.1): WRAcc(S) = (|sg|/N)·|p(sg)−p0|, balancing coverage and contrast against the baseline attack rate. The classical workhorse is beam search, which extends subgroups one feature at a time and retains only the top-ranked candidates at each depth. Beam search is fast but greedy: a critical multi-feature interaction whose two- and three-feature precursors score weakly will be pruned before the relevant feature can be added.
The paper formulates this as a QUBO (Sec. 3.3) by fitting WRAcc to a least-squares quadratic function of binary feature-selection variables. The Q matrix is normalised by max|Qobj| to preserve the WRAcc signal-to-penalty ratio. A squared-deviation cardinality penalty Σxi − K)² with calibrated weight λ enforces the cardinality constraint Σxi = K. The penalty calibration computes, for each wrong cardinality k ≠ K, the minimum λ that makes the target-K solution strictly better than the best k-solution; the final λ is set to 1.5× the maximum such value with a 0.1 floor. This per-instance calibration is conceptually similar to Y2's avoidance of penalty-based encoding entirely (Y2 uses a hard mixer instead, which removes the calibration question). The authors also support a FREE_CARDINALITY mode that drops the penalty and lets QAOA explore all cardinalities.
QAOA execution (Sec. 3.4) uses depth p=1 with COBYLA classical optimisation. The hardware backend is ibm_pittsburgh, a 156-qubit Heron-class superconducting processor — the same family as the IBM Heron2 used by Yuan's Y6 PBR experiment. The paper's hardware scaling table (Table I) is the most directly comparable empirical artefact to Y3's NISQ analysis: at 10 qubits the WRAcc ratio is 98.3%, at 15 qubits 97.1%, at 20 qubits 85.5%, at 25 qubits 62.4%, and at 30 qubits 3.9%. The collapse is sharp; the authors place the practical NISQ boundary for dense p=1 QAOA QUBOs at 20–25 qubits on current Heron hardware. This is roughly aligned with the noise-floor crossover Y3 identified in its DGMVP study, but for a different problem class and on a slightly newer Heron generation.
Section 4.4 develops the "QAOA-unique" finding. Among the top-ranked R2L attack-detection rules, several feature combinations are present in the QAOA sampling distribution but absent from beam-search candidates. The headline example is a six-feature subgroup that achieves 91.6% R2L attack rate within the training subgroup (489 connections) and 99.6% precision on the test set (281 matched connections). The paper interprets this as a coverage benefit: superposition-based search evaluates the full combinatorial space simultaneously, while beam search's per-depth pruning blocks discovery of rules whose component features are individually weak. The authors are careful (Sec. 5.1, 5.3) to disclaim that this is not a runtime advantage — beam search runs in milliseconds, QAOA on cloud hardware takes minutes per run — and to position the contribution as "search completeness, not speed."
Two limitations are worth flagging for cross-reading with Y3. First, the cardinality penalty λ appears in the QUBO and inflates the dynamic range of cost coefficients; this is exactly the issue Y2 avoids via its hard mixer construction. A natural follow-on for Yuan's group would be to test whether replacing the penalty with a Y2-style hard mixer keeps the WRAcc ratio above the 30-qubit cliff (where currently the QUBO penalty term plus circuit noise drives rW to 0.039). Second, the choice of p=1 leaves QAOA's true potential underexplored — Y3's analysis shows that layerwise optimisation at higher p is robust under thermal noise; revisiting this paper's experiments at p=3–5 with layerwise initialisation could meaningfully change the scaling boundary.
Figures
No figures extracted from source — the e-print is a PDF without separate figure files.
Citations to Yuan's papers
Overlap with Y1–Y6
- Y3 (DGMVP portfolio QAOA on NISQ): Same template — QUBO objective + cardinality-style structural constraint + QAOA on real IBM hardware + careful noise-scaling analysis. The 156-qubit Heron-class processor is the immediate successor to Y3's hardware. Y3's primary conclusion (thermal relaxation precludes quantum advantage but shot-noise regime gives favourable scaling) is essentially what this paper finds for SD: noise dominates above ~25 qubits, but at 10–20 qubits QAOA is competitive with beam search.
- Y2 (quasi-binary, hard mixer, no penalty): The paper's cardinality penalty λ(Σxi−K)² is exactly the construction Y2 dispenses with. A direct follow-up would be to plug a Y2-style hard mixer into this pipeline and measure whether the 30-qubit collapse is mitigated.
- Y4 (Grover + ADMM for cardinality-constrained binary opt): Same problem family (cardinality-constrained binary optimisation), different algorithm (Grover+ADMM versus QAOA). Y4's epsilon-approximation guarantees stand in contrast to this paper's empirical-only NISQ scaling analysis.
- Y6 (PBR test on IBM Heron2): Shared hardware platform (Heron-class superconducting, 156 qubits). Y6 used Heron2; this paper uses ibm_pittsburgh, also a Heron-class device. Cross-checking single- and two-qubit gate fidelities between the two experiments would let Yuan calibrate expected QAOA-noise behaviour against the PBR-test hardware characterisation.
Recommended action for Yuan
- Cite as a NISQ-empirical complement to Y3: in any forward-looking version of Y3 (e.g., the 2026 follow-up cycle), this paper provides a directly comparable scaling curve on a different application.
- Try the Y2 hard-mixer substitution: use the public NSL-KDD pipeline as a third benchmark (after MaxCut and DGMVP) for Y2's quasi-binary / hard-mixer encoding. The QAOA-unique subgroup story is robust, so the comparison would be on whether hard-mixer encoding pushes the noise boundary past 25 qubits.
- Email the authors: Spell and Shyu (Missouri Quantum Innovation Center) are likely open to collaboration; their pipeline is a well-instrumented testbed for any QAOA improvement Yuan wants to validate empirically on Heron-class hardware.