| CMS-JME-25-001 ; CERN-EP-2026-111 | ||
| Particle transformers for identifying Lorentz-boosted Higgs bosons decaying to a pair of W bosons | ||
| CMS Collaboration | ||
| 10 April 2026 | ||
| Submitted to the Journal of High Energy Physics | ||
| Abstract: A novel deep neural network classifier, a ``particle transformer'' (PART), is introduced for the identification of highly Lorentz-boosted resonances reconstructed as single, multipronged jets in measurements and searches performed by the CMS Collaboration at the CERN LHC. Based on a self-attention mechanism that allows the model to weigh the importance of different particles, PART is trained on a wide variety of topologies, notably demonstrating strong performance for the first time on jets originating from boosted Higgs boson decays to W bosons. The PART algorithm achieves a tagging efficiency of more than 50% for such jets at a background efficiency of 1%, while maintaining decorrelation from the jet mass. A calibration is performed in proton-proton collision data collected by CMS at a center-of-mass energy of 13 TeV, with a data set corresponding to a total luminosity of 138 fb$ ^{-1} $. Data-to-simulation selection efficiency scale factors are measured to be in the 0.9--1.0 range, with relative uncertainties between 7 and 23%. The tagging capability of PART enhances the sensitivity of standard model measurements and searches for beyond-the-standard-model resonances decaying to hadronic diboson systems. | ||
| Links: e-print arXiv:2604.09809 [hep-ex] ; CDS record ; INSPIRE record | ||
| Figures | |
|
png pdf |
Figure 1:
Diagram of the PART model architecture. The model processes two different sets of input features per jet, from PF candidates and SVs. These features are embedded using separate MLPs into 128-dimensional representations before being concatenated and passed through eight PABs. Pairwise features are also calculated between each input element, which are embedded using a single one-dimensional convolutional layer and used as attention biases for each PAB. Two CABs then use the learned features to update a randomly initialized class token, which aggregates these features into a global representation of the jet. Their output is then finally passed through an MLP that outputs the class probabilities, which are normalized by a softmax function, as well as the jet mass. |
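To illustrate how pairwise features can act as attention biases, here is a minimal single-head sketch in plain Python. It assumes the bias is added to the scaled dot-product logits before the softmax, which is a common convention but is not spelled out in the caption. The names `biased_attention` and `pair_bias` are hypothetical, and learned projections, multiple heads, and the embedding layers are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(queries, keys, values, pair_bias):
    """Single-head attention with an additive pairwise bias (assumed form).

    pair_bias[i][j] is added to the logit between element i and element j
    before the softmax, so it can raise or suppress that pair's weight.
    """
    d = len(queries[0])
    n_out = len(values[0])
    output = []
    for i, q in enumerate(queries):
        logits = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) + pair_bias[i][j]
            for j, k in enumerate(keys)
        ]
        weights = softmax(logits)
        output.append(
            [sum(weights[j] * values[j][c] for j in range(len(values)))
             for c in range(n_out)]
        )
    return output
```

With zero queries and zero bias the weights are uniform; a large negative entry in `pair_bias` effectively removes that pair from the weighted sum.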
|
png pdf |
Figure 2:
Full suite of AK8 jet topologies considered for the PART multiclass classification task. Jet types are first categorized by the number of quarks and leptons in the final state, and then further separated by flavor, as shown in the table on the left. The symbols $ \tau_\mathrm{e} $, $ \tau_\mu $, and $ \tau_\mathrm{h} $ refer to $ \tau $ lepton decays to electrons, muons, and hadrons, respectively. The total number of subclasses for each process, therefore, is given by the tensor product ($ \otimes $) between the different final states and flavors. Diagrams illustrating the corresponding jet topologies, which are not exhaustive, are shown on the right. |
|
png pdf |
Figure 3:
Evolution of the loss function values for the PART model on the training and validation data sets over training epochs, shown separately for the classification (cross-entropy) and regression (log-cosh) terms. |
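The two loss terms named in the caption can be sketched as follows. This is a minimal per-jet illustration, not the training code: the function names and the relative weight `reg_weight` are assumptions, since the caption does not say how the two terms are combined.

```python
import math


def cross_entropy(class_probs, true_index):
    """Classification term: negative log-probability of the true class."""
    return -math.log(class_probs[true_index])


def log_cosh(predicted, target):
    """Regression term: log(cosh(predicted - target)).

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2),
    which avoids overflow for large residuals.
    """
    x = abs(predicted - target)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)


def total_loss(class_probs, true_index, mass_pred, mass_true, reg_weight=1.0):
    """Sum of both terms; reg_weight is a hypothetical balancing factor."""
    return (cross_entropy(class_probs, true_index)
            + reg_weight * log_cosh(mass_pred, mass_true))
```

The log-cosh term behaves like a squared error for small residuals and like an absolute error for large ones, which makes it less sensitive to outliers than a pure mean squared error.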
|
png pdf |
Figure 4:
Comparison of jet mass reconstruction ($ {m_\text{reco}} $) using the SD, PARTICLENET, and PART algorithms, for $ \mathrm{H}\to\mathrm{b}\overline{\mathrm{b}} $ (upper left), $ \mathrm{H}\to\mathrm{W}\mathrm{W} $ (upper right), $ \mathrm{t}\to\mathrm{b}\mathrm{q}\overline{\mathrm{q}} $(lower left), and QCD (lower right) jets with the SM values of $ {m_\mathrm{H}} $ and $ {m_\mathrm{t}} $. An offline selection is applied to the AK8 jets of $ p_{\mathrm{T}} > $ 400 GeV and $ |\eta| < $ 2.4. Statistical uncertainties in the bin yields originating from the limited number of simulated events are represented by vertical error bars. The mass at the peak ($ {m_\text{peak}} $) for each algorithm, calculated using Gaussian kernel density estimation, and the mass resolution, quantified by the FWHM of the resonance peak, are shown as well for H and t jets. |
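As an illustration of the two summary quantities in this caption, the sketch below estimates a peak position with a Gaussian kernel density estimate and reads off the FWHM on a grid. The bandwidth, the grid range, and the function names are assumptions made for this example; the caption does not specify them.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels of fixed width."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def peak_and_fwhm(samples, bandwidth, lo, hi, n_grid=1501):
    """Locate the KDE maximum on a grid and measure the width at half maximum.

    The width is taken between the first grid points at or below half
    maximum on either side of the peak, so it is accurate to about two
    grid steps and assumes a single dominant peak inside [lo, hi].
    """
    density = gaussian_kde(samples, bandwidth)
    step = (hi - lo) / (n_grid - 1)
    xs = [lo + i * step for i in range(n_grid)]
    ys = [density(x) for x in xs]
    i_peak = max(range(n_grid), key=ys.__getitem__)
    half = 0.5 * ys[i_peak]
    left = i_peak
    while left > 0 and ys[left] > half:
        left -= 1
    right = i_peak
    while right < n_grid - 1 and ys[right] > half:
        right += 1
    return xs[i_peak], xs[right] - xs[left]
```

For a single Gaussian kernel of width 10 GeV, the returned FWHM is close to the analytic value of about 23.5 GeV, up to the grid resolution.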
|
png pdf |
Figure 5:
Receiver operating characteristic curves for $ \mathrm{H}\to\mathrm{W}\mathrm{W} $ signal jets, with the SM values of $ {m_\mathrm{H}} $ and $ {m_\mathrm{t}} $, versus background jets from simulated QCD multijet (left) and $ \mathrm{t} \overline{\mathrm{t}} $ events (right), for the PART $T_{HWW}$ and the DEEPAK8-MD scores in the $ p_{\mathrm{T}} $ ranges 200--400, 400--600, and 600--1000 GeV. An offline selection is applied to the AK8 jets of $ p_{\mathrm{T}} > $ 200 GeV and $ |\eta| < $ 2.4. Signal jets are required to contain all four generator-level quarks from the W boson decays within $ \Delta R (\text{jet}, \mathrm{q}) < $ 0.8. |
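A single point on such a curve, the signal efficiency at a fixed background efficiency, can be computed as in the following sketch. The function names are hypothetical, the inputs are plain lists of tagger scores, and jets are taken to pass when their score is strictly above the cut; per-jet weights are ignored for simplicity.

```python
def threshold_at_background_efficiency(background_scores, target_eff):
    """Score cut for which roughly target_eff of background jets pass.

    Jets pass when score > cut. With tied scores the achieved efficiency
    can fall below the target.
    """
    ordered = sorted(background_scores, reverse=True)
    n_pass = int(round(target_eff * len(ordered)))
    if n_pass >= len(ordered):
        return float("-inf")  # every jet passes
    return ordered[n_pass]


def efficiency(scores, cut):
    """Fraction of jets with a score strictly above the cut."""
    return sum(1 for s in scores if s > cut) / len(scores)


def signal_efficiency_at(background_scores, signal_scores, target_eff):
    """Signal efficiency at the working point fixed on the background."""
    cut = threshold_at_background_efficiency(background_scores, target_eff)
    return efficiency(signal_scores, cut)
```

Scanning `target_eff` over a range of values and plotting the resulting pairs traces out the full curve.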
|
png pdf |
Figure 6:
Receiver operating characteristic curves for $ Y \to\mathrm{W}\mathrm{W}^* $ signal jets, with varying $ m_{Y} $ and SM $ {m_\mathrm{W}} $, versus background jets from simulated QCD multijet (left) and $ \mathrm{t} \overline{\mathrm{t}} $ events (right), for the PART $T_{HWW}$ score. An offline selection is applied to the AK8 jets of 600 $ < p_{\mathrm{T}} < $ 1000 GeV, $ |\eta| < $ 2.4, and $ {m_\mathrm{SD}} > $ 30 GeV. |
|
png pdf |
Figure 7:
Confusion matrix with each row indicating the fraction of jets per category classified as the column category by PART. An offline selection is applied to the AK8 jets of $ p_{\mathrm{T}} > $ 200 GeV and $ |\eta| < $ 2.4. |
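The row normalization described here can be sketched as follows, assuming integer class indices for the true and predicted categories. The function name is hypothetical and unweighted jets are assumed.

```python
def row_normalized_confusion(true_labels, predicted_labels, n_classes):
    """Confusion matrix where entry [t][p] is the fraction of jets of true
    class t that are assigned to class p, so each non-empty row sums to 1."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Leave an all-zero row for classes with no jets.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```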
|
png pdf |
Figure 8:
Distributions of $ {m_\mathrm{SD}} $ for jets from QCD multijet events, in the $ p_{\mathrm{T}} $ ranges 200--400 GeV (upper), 400--600 GeV (middle) and 600--1000 GeV (lower), after no selections (``inclusive'') on the PART $T_{HWW}$ score (left) and the DEEPAK8-MD score (right) as well as selections corresponding to QCD jet selection efficiencies ($ \epsilon_B $) of 5.0%, 1.0%, and 0.5%. The error bars represent the statistical uncertainties originating from the limited number of simulated events. The lower panels display the ratio of the normalized $ {m_\mathrm{SD}} $ distributions for the different selection efficiencies ($ N_\mathrm{mistag} $) to the normalized inclusive $ {m_\mathrm{SD}} $ distribution ($ N_\mathrm{inclusive} $). An offline selection is applied to the AK8 jets of $ p_{\mathrm{T}} > $ 400 GeV, $ |\eta| < $ 2.4, and $ {m_\mathrm{SD}} > $ 30 GeV. |
|
png pdf |
Figure 9:
The Jensen--Shannon distance (JSD) using base 2 between the $ {m_\mathrm{SD}} $ distribution of jets from QCD multijet events with and without a selection on the PART and DEEPAK8-MD tagger scores. On the left, the JSD is plotted for tagger selections corresponding to different QCD jet selection efficiencies ($ {\epsilon_\mathrm{B}} $), with an offline selection of 600 $ < p_{\mathrm{T}} < $ 1000 GeV, $ |\eta| < $ 2.4, and 30 $ < {m_\mathrm{SD}} < $ 250 GeV applied to the jets. On the right, the JSD is plotted for different jet $ p_{\mathrm{T}} $ bins, at a fixed $ {\epsilon_\mathrm{B}} $ of 1%. |
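For reference, the Jensen--Shannon distance between two binned distributions can be computed as in this sketch. It takes two histograms with identical binning, normalizes them, and returns the square root of the base-2 Jensen--Shannon divergence, which lies between 0 (identical shapes) and 1 (no overlap). The function names are illustrative.

```python
import math


def normalize(hist):
    """Scale bin contents so that they sum to one."""
    total = sum(hist)
    return [h / total for h in hist]


def kl_divergence_base2(p, q):
    """Kullback-Leibler divergence in bits; bins with p = 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(hist_a, hist_b):
    """Square root of the base-2 Jensen-Shannon divergence of two histograms."""
    p = normalize(hist_a)
    q = normalize(hist_b)
    # The mixture is nonzero wherever p or q is nonzero, so the ratios are safe.
    mixture = [0.5 * (a + b) for a, b in zip(p, q)]
    divergence = (0.5 * kl_divergence_base2(p, mixture)
                  + 0.5 * kl_divergence_base2(q, mixture))
    return math.sqrt(max(divergence, 0.0))
```

A small value after a tagger selection indicates that the selection leaves the shape of the mass distribution nearly unchanged.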
|
png pdf |
Figure 10:
Schematic of the LJP calibration method for $ \mathrm{H}\to\mathrm{W}\mathrm{W} $ tagging. Ratios of primary LJP densities in data and simulation are first measured per subjet in merged two-pronged W jets, with an example of such a ratio reproduced from Ref. [30]. These are then used to derive correction factors for $ \mathrm{H}\to\mathrm{W}\mathrm{W} $ signal jets per prong. |
|
png pdf |
Figure 11:
Distributions of the PART $T_{HWW}^{\text{No top}}$ (left) and DEEPAK8-MD (No top) (right) discriminants with and without the LJP corrections for t-matched jets for data and individual simulated processes in the upper panels, and data-to-simulation ratios in the lower panels. The combined uncertainties from LJP-based SFs per bin are shown in shaded gray, and the statistical uncertainty in the number of data events per bin is represented by vertical error bars in the upper and lower panels. The $ \chi^2 $ test statistic values between data and simulation, normalized to the number of degrees of freedom (ndof), are also shown for both discriminants with and without LJP corrections. |
| Tables | |
|
png pdf |
Table 1:
Summary of particle masses in the PART training samples. |
|
png pdf |
Table 2:
The complete set of input features per AK8 jet used for the PART model training. Two types of inputs are considered: PF candidates and secondary vertices (SVs). The PF candidate features marked with a star $ (\star) $ apply only to charged PF candidates and a null value is used for neutral candidates. |
|
png pdf |
Table 3:
Relative weights of the classes used for training the PART model. The four major processes ($ \mathrm{H}\to\mathrm{W}\mathrm{W} $, $ \mathrm{H}\to\text{2-pronged} $, $ \mathrm{t}\to\mathrm{b}\mathrm{W} $, and QCD jets) are weighted equally, and each has a dedicated row. |
|
png pdf |
Table 4:
Signal efficiency SFs and uncertainties for the BDT selections on the PART HWW tagging outputs in the $ {\mathrm{H}\mathrm{H}\to\mathrm{b}\overline{\mathrm{b}}\mathrm{W}\mathrm{W}} $ search, measured using the LJP calibration method for different $ {\mathrm{H}\mathrm{H}} $ signals and analysis regions. Both the total combined uncertainty and the components defined in the text are shown. |
| Summary |
| The particle transformer (PART) deep neural network for classifying a wide variety of jets from decays of Lorentz-boosted resonances has been presented. In particular, PART enables effective identification of all-hadronic Higgs boson to W boson ($ \mathrm{H}\to\mathrm{W}\mathrm{W}^*\to4\mathrm{q} $) decays by the CMS experiment for the first time. A novel training strategy is used to address challenges pertaining to $ \mathrm{H}\to\mathrm{W}\mathrm{W} $ classification, through which PART achieves $ > $50% $ \mathrm{H}\to\mathrm{W}\mathrm{W}^*\to4\mathrm{q} $ selection efficiency at a multijet background efficiency of 1%, while maintaining decorrelation from the jet mass. The performance is calibrated on data using the primary Lund jet planes of individual subjets, with data-to-simulation scale factors measured in the 0.9--1.0 range and relative uncertainties between 7 and 23%. The PART algorithm represents a significant advance in the identification of multiprong jets from highly boosted resonances in CMS, as illustrated by the first search for boosted Higgs boson pair production in the all-hadronic $ \mathrm{b}\overline{\mathrm{b}}\mathrm{W}\mathrm{W} $ channel. |
| References | ||||
| 1 | CMS Collaboration | Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC | PLB 716 (2012) 30 | CMS-HIG-12-028 1207.7235 |
| 2 | CMS Collaboration | Observation of a new boson with mass near 125 GeV in pp collisions at $ \sqrt{s} = $ 7 and 8 TeV | JHEP 06 (2013) 081 | CMS-HIG-12-036 1303.4571 |
| 3 | ATLAS Collaboration | Observation of a new particle in the search for the standard model Higgs boson with the ATLAS detector at the LHC | PLB 716 (2012) 1 | 1207.7214 |
| 4 | CMS Collaboration | Identification of heavy, energetic, hadronically decaying particles using machine-learning techniques | JINST 15 (2020) P06005 | CMS-JME-18-002 2004.08262 |
| 5 | CMS Collaboration | Performance of the mass-decorrelated DeepDoubleX classifier for double-b and double-c large-radius jets with the CMS detector | CMS Detector Performance Summary CMS-DP-2022-041, 2022 CDS |
|
| 6 | E. A. Moreno et al. | JEDI-net: a jet identification algorithm based on interaction networks | EPJC 80 (2020) 58 | 1908.05318 |
| 7 | E. A. Moreno et al. | Interaction networks for the identification of boosted $ \mathrm{H}\to\mathrm{b}\overline{\mathrm{b}} $ decays | PRD 102 (2020) 012010 | 1909.12285 |
| 8 | H. Qu and L. Gouskos | PARTICLENET: Jet tagging via particle clouds | PRD 101 (2020) 056019 | 1902.08570 |
| 9 | CMS Collaboration | Identification of highly Lorentz-boosted heavy particles using graph neural networks and new mass decorrelation techniques | CMS Detector Performance Summary CMS-DP-2020-002, 2020 CDS |
|
| 10 | CMS Collaboration | Mass regression of highly-boosted jets using graph neural networks | CMS Detector Performance Summary CMS-DP-2021-017, 2021 CDS |
|
| 11 | CMS Collaboration | Measurement of boosted Higgs bosons produced via vector boson fusion or gluon fusion in the $ \mathrm{H}\to\mathrm{b}\overline{\mathrm{b}} $ decay mode using LHC proton-proton collision data at $ \sqrt{s} = $ 13 TeV | JHEP 12 (2024) 035 | CMS-HIG-21-020 2407.08012 |
| 12 | CMS Collaboration | Search for Higgs boson decay to a charm quark-antiquark pair in proton-proton collisions at $ \sqrt{s}= $ 13 TeV | PRL 131 (2023) 061801 | CMS-HIG-21-008 2205.05550 |
| 13 | CMS Collaboration | Search for nonresonant pair production of highly energetic Higgs bosons decaying to bottom quarks | PRL 131 (2023) 041803 | 2205.06667 |
| 14 | CMS Collaboration | Search for a massive scalar resonance decaying to a light scalar and a Higgs boson in the four b quarks final state with boosted topology | PLB 842 (2023) 137392 | 2204.12413 |
| 15 | CMS Collaboration | Search for resonant pair production of Higgs bosons in the $ \mathrm{b}\overline{\mathrm{b}}\mathrm{b}\overline{\mathrm{b}} $ final state using large-area jets in proton-proton collisions at $ \sqrt{s} = $ 13 TeV | JHEP 02 (2025) 040 | 2407.13872 |
| 16 | CMS Collaboration | Search for heavy resonances decaying to a pair of Lorentz-boosted Higgs bosons in final states with leptons and a bottom quark pair at $ \sqrt{s} = $ 13 TeV | JHEP 05 (2022) 005 | 2112.03161 |
| 17 | CMS Collaboration | Search for resonances decaying to three W bosons in the hadronic final state in proton-proton collisions at $ \sqrt{s} = $ 13 TeV | PRD 106 (2022) 012002 | 2112.13090 |
| 18 | CMS Collaboration | Search for resonances decaying to three W bosons in proton-proton collisions at $ \sqrt{s} = $ 13 TeV | PRL 129 (2022) 021802 | 2201.08476 |
| 19 | A. J. Larkoski, I. Moult, and B. Nachman | Jet substructure at the Large Hadron Collider: A review of recent advances in theory and machine learning | Phys. Rept. 841 (2020) 1 | 1709.04464 |
| 20 | H. Qu, C. Li, and S. Qian | Particle transformer for jet tagging | in Proc. 39th Int. Conf. on Machine Learning, PMLR 162 (2022) 18281 | 2202.03772 |
| 21 | CMS Collaboration | Search for a massive resonance decaying into a Higgs boson and a W or Z boson in hadronic final states in proton-proton collisions at $ \sqrt{s}= $ 8 TeV | JHEP 02 (2016) 145 | CMS-EXO-14-009 1506.01443 |
| 22 | G. C. Branco et al. | Theory and phenomenology of two-Higgs-doublet models | Phys. Rept. 516 (2012) 1 | 1106.0034 |
| 23 | N. Craig, J. Galloway, and S. Thomas | Searching for signs of the second Higgs doublet | 1305.2424 | |
| 24 | F. Domingo and S. Paßehr | About the bosonic decays of heavy Higgs states in the (N)MSSM | EPJC 82 (2022) 962 | 2207.05776 |
| 25 | K. S. Agashe et al. | LHC signals from cascade decays of warped vector resonances | JHEP 05 (2017) 078 | 1612.00047 |
| 26 | K. Agashe et al. | Dedicated strategies for triboson signals from cascade decays of vector resonances | PRD 99 (2019) 075016 | 1711.09920 |
| 27 | H.-Y. Ren, L.-H. Xia, and Y.-P. Kuang | Model-independent probe of anomalous heavy neutral Higgs bosons at the LHC | PRD 90 (2014) 115002 | 1404.6367 |
| 28 | Y.-P. Kuang, H.-Y. Ren, and L.-H. Xia | Further investigation of the model-independent probe of heavy neutral Higgs bosons at LHC Run 2 | Chin. Phys. C 40 (2016) 023101 | 1506.08007 |
| 29 | F. A. Dreyer, G. P. Salam, and G. Soyez | The Lund jet plane | JHEP 12 (2018) 064 | 1807.04758 |
| 30 | CMS Collaboration | A method for correcting the substructure of multiprong jets using the Lund jet plane | JHEP 11 (2025) 038 | CMS-JME-23-001 2507.07775 |
| 31 | CMS Collaboration | Combination of searches for nonresonant Higgs boson pair production in proton-proton collisions at $ \sqrt{s}= $ 13 TeV | Submitted to J. Phys. G | CMS-HIG-20-011 2510.07527 |
| 32 | CMS Collaboration | The CMS experiment at the CERN LHC | JINST 3 (2008) S08004 | |
| 33 | CMS Collaboration | Development of the CMS detector for the CERN LHC Run 3 | JINST 19 (2024) P05064 | CMS-PRF-21-001 2309.05466 |
| 34 | CMS Collaboration | Performance of the CMS Level-1 trigger in proton-proton collisions at $ \sqrt{s} = $ 13 TeV | JINST 15 (2020) P10017 | CMS-TRG-17-001 2006.10165 |
| 35 | CMS Collaboration | The CMS trigger system | JINST 12 (2017) P01020 | CMS-TRG-12-001 1609.02366 |
| 36 | CMS Collaboration | Performance of the CMS high-level trigger during LHC Run 2 | JINST 19 (2024) P11021 | CMS-TRG-19-001 2410.17038 |
| 37 | CMS Collaboration | Electron and photon reconstruction and identification with the CMS experiment at the CERN LHC | JINST 16 (2021) P05014 | CMS-EGM-17-001 2012.06888 |
| 38 | CMS Collaboration | Performance of the CMS muon detector and muon reconstruction with proton-proton collisions at $ \sqrt{s}= $ 13 TeV | JINST 13 (2018) P06015 | CMS-MUO-16-001 1804.04528 |
| 39 | CMS Collaboration | Description and performance of track and primary-vertex reconstruction with the CMS tracker | JINST 9 (2014) P10009 | CMS-TRK-11-001 1405.6569 |
| 40 | CMS Tracker Group | The CMS phase-1 pixel detector upgrade | JINST 16 (2021) P02027 | 2012.14304 |
| 41 | CMS Collaboration | Track impact parameter resolution for the full pseudo rapidity coverage in the 2017 dataset with the CMS phase-1 pixel detector | CMS Detector Performance Summary CMS-DP-2020-049, 2020 CDS |
|
| 42 | CMS Collaboration | 2017 tracking performance plots | CMS Detector Performance Summary CMS-DP-2017-015, 2017 CDS |
|
| 43 | CMS Collaboration | Particle-flow reconstruction and global event description with the CMS detector | JINST 12 (2017) P10003 | CMS-PRF-14-001 1706.04965 |
| 44 | CMS Collaboration | Technical proposal for the Phase-II upgrade of the Compact Muon Solenoid | CMS Technical Proposal CERN-LHCC-2015-010, CMS-TDR-15-02, 2015 CDS |
|
| 45 | CMS Collaboration | Offline secondary vertex reconstruction in the CMS detector | PoS LHCP 236, 2025 link |
|
| 46 | M. Cacciari, G. P. Salam, and G. Soyez | The anti-$ k_{\mathrm{T}} $ jet clustering algorithm | JHEP 04 (2008) 063 | 0802.1189 |
| 47 | M. Cacciari, G. P. Salam, and G. Soyez | FastJet user manual | EPJC 72 (2012) 1896 | 1111.6097 |
| 48 | CMS Collaboration | Pileup removal algorithms | CMS Physics Analysis Summary, 2014 | CMS-PAS-JME-14-001 |
| 49 | D. Bertolini, P. Harris, M. Low, and N. Tran | Pileup per particle identification | JHEP 10 (2014) 059 | 1407.6013 |
| 50 | CMS Collaboration | Pileup mitigation at CMS in 13 TeV data | JINST 15 (2020) P09018 | CMS-JME-18-001 2003.00503 |
| 51 | CMS Collaboration | Jet energy scale and resolution in the CMS experiment in pp collisions at 8 TeV | JINST 12 (2017) P02014 | CMS-JME-13-004 1607.03663 |
| 52 | S. Catani, Y. L. Dokshitzer, M. H. Seymour, and B. R. Webber | Longitudinally invariant $ k_{\mathrm{T}} $ clustering algorithms for hadron hadron collisions | NPB 406 (1993) 187 | |
| 53 | S. D. Ellis and D. E. Soper | Successive combination jet algorithm for hadron collisions | PRD 48 (1993) 3160 | hep-ph/9305266 |
| 54 | Y. L. Dokshitzer, G. D. Leder, S. Moretti, and B. R. Webber | Better jet clustering algorithms | JHEP 08 (1997) 001 | hep-ph/9707323 |
| 55 | M. Wobisch and T. Wengler | Hadronization corrections to jet cross-sections in deep inelastic scattering | in Proc. Workshop on Monte Carlo Generators for HERA Physics, 1998 | hep-ph/9907280 |
| 56 | A. J. Larkoski, S. Marzani, G. Soyez, and J. Thaler | Soft drop | JHEP 05 (2014) 146 | 1402.2657 |
| 57 | E. Bols et al. | Jet flavour classification using DeepJet | JINST 15 (2020) P12012 | 2008.10519 |
| 58 | CMS Collaboration | Performance of missing transverse momentum reconstruction in proton-proton collisions at $ \sqrt{s} = $ 13 TeV using the CMS detector | JINST 14 (2019) P07004 | CMS-JME-17-001 1903.06078 |
| 59 | J. Alwall et al. | The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations | JHEP 07 (2014) 079 | 1405.0301 |
| 60 | S. Bolognesi et al. | On the spin and parity of a single-produced resonance at the LHC | PRD 86 (2012) 095031 | 1208.4018 |
| 61 | Particle Data Group | Review of particle physics | PRD 110 (2024) 030001 | |
| 62 | T. Sjöstrand et al. | An introduction to PYTHIA8.2 | Comput. Phys. Commun. 191 (2015) 159 | 1410.3012 |
| 63 | M. Cacciari and G. P. Salam | Pileup subtraction using jet areas | PLB 659 (2008) 119 | 0707.1378 |
| 64 | P. Nason | A new method for combining NLO QCD with shower Monte Carlo algorithms | JHEP 11 (2004) 040 | hep-ph/0409146 |
| 65 | S. Frixione, P. Nason, and C. Oleari | Matching NLO QCD computations with parton shower simulations: the POWHEG method | JHEP 11 (2007) 070 | 0709.2092 |
| 66 | S. Alioli, P. Nason, C. Oleari, and E. Re | A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG box | JHEP 06 (2010) 043 | 1002.2581 |
| 67 | E. Bagnaschi, G. Degrassi, P. Slavich, and A. Vicini | Higgs production via gluon fusion in the POWHEG approach in the SM and in the MSSM | JHEP 02 (2012) 088 | 1111.2854 |
| 68 | M. Grazzini et al. | Higgs boson pair production at NNLO with top quark mass effects | JHEP 05 (2018) 059 | 1803.02463 |
| 69 | S. Dawson, S. Dittmaier, and M. Spira | Neutral Higgs boson pair production at hadron colliders: QCD corrections | PRD 58 (1998) 115012 | hep-ph/9805244 |
| 70 | D. de Florian and J. Mazzitelli | Higgs boson pair production at next-to-next-to-leading order in QCD | PRL 111 (2013) 201801 | 1309.6594 |
| 71 | D. de Florian and J. Mazzitelli | Higgs pair production at next-to-next-to-leading logarithmic accuracy at the LHC | JHEP 09 (2015) 053 | 1505.07122 |
| 72 | J. Baglio et al. | Gluon fusion into Higgs pairs at NLO QCD and the top mass scheme | EPJC 79 (2019) 459 | 1811.05692 |
| 73 | S. Borowka et al. | Higgs boson pair production in gluon fusion at next-to-leading order with full top-quark mass dependence | PRL 117 (2016) 012001 [Erratum: doi:10.1103/PhysRevLett.117.079901] | 1604.06447 |
| 74 | D. Y. Shao, C. S. Li, H. T. Li, and J. Wang | Threshold resummation effects in Higgs boson pair production at the LHC | JHEP 07 (2013) 169 | 1301.1245 |
| 75 | CMS Collaboration | Extraction and validation of a new set of CMS PYTHIA8 tunes from underlying-event measurements | EPJC 80 (2020) 4 | CMS-GEN-17-001 1903.12179 |
| 76 | NNPDF Collaboration | Parton distributions for the LHC Run II | JHEP 04 (2015) 040 | 1410.8849 |
| 77 | NNPDF Collaboration | Parton distributions from high-precision collider data | EPJC 77 (2017) 663 | 1706.00428 |
| 78 | GEANT4 Collaboration | GEANT 4---a simulation toolkit | NIM A 506 (2003) 250 | |
| 79 | A. Vaswani et al. | Attention is all you need | in Int. Conf. on Neural Information Processing Systems, NIPS'17, Curran Associates Inc., Red Hook, NY, USA, 2017 Proc. 3 (2017) 6000 | 1706.03762 |
| 80 | H. Touvron et al. | Going deeper with image transformers | in Proc. IEEE/CVF Int. Conf. on Computer Vision (ICCV), 2021 link | 2103.17239 |
| 81 | J. Bridle | Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters | in Advances in Neural Information Processing Systems, D. Touretzky, ed., volume 2. Morgan-Kaufmann, 1989 link |
|
| 82 | F. A. Dreyer and H. Qu | Jet tagging in the Lund plane with graph networks | JHEP 03 (2021) 052 | 2012.08526 |
| 83 | CMS Collaboration | Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV | JINST 13 (2018) P05011 | CMS-BTV-16-002 1712.07158 |
| 84 | H. Qu | Weaver: A machine learning R&D framework for high energy physics applications | https://github.com/hqucms/weaver-core | |
| 85 | A. Paszke et al. | PyTorch: An imperative style, high-performance deep learning library | Advances in Neural Information Processing Systems 3 (2019) 8024 | 1912.01703 |
| 86 | M. Zhang, J. Lucas, J. Ba, and G. E. Hinton | Lookahead optimizer: $ k $ steps forward, 1 step back | Advances in Neural Information Processing Systems 3 (2019) 2 | 1907.08610 |
| 87 | L. Liu et al. | On the variance of the adaptive learning rate and beyond | in Proc. Int. Conf. on Learning Representations (ICLR), 2020 link | 1908.03265 |
| 88 | CMS Collaboration | Search for heavy scalar resonances decaying to Lorentz-boosted Higgs and Higgs-like bosons in the $ \mathrm{b}\overline{\mathrm{b}} 4\mathrm{q} $ final state at $ \sqrt{s} = $ 13 TeV | Submitted to JHEP | 2602.00273 |
| 89 | J. Dolen et al. | Thinking outside the ROCs: Designing decorrelated taggers (DDT) for jet substructure | JHEP 05 (2016) 156 | 1603.00027 |
| 90 | J. Lin | Divergence measures based on the Shannon entropy | IEEE Trans. on Inf. Th. 37 (1991) 145 | |
| 91 | S. Kullback and R. A. Leibler | On information and sufficiency | Ann. Math. Statist. 22 (1951) 79 | |
| 92 | CMS Collaboration | Performance of heavy-flavour jet identification in Lorentz-boosted topologies in proton-proton collisions at $ \sqrt{s} = $ 13 TeV | JINST 20 (2025) P11006 | CMS-BTV-22-001 2510.10228 |
| 93 | ATLAS Collaboration | Search for pair production of boosted Higgs bosons via vector-boson fusion in the $ \mathrm{b}\overline{\mathrm{b}}\mathrm{b}\overline{\mathrm{b}} $ final state using pp collisions at $ \sqrt{s} = $ 13 TeV with the ATLAS detector | PLB 858 (2024) 139007 | 2404.17193 |
|