Compact Muon Solenoid
LHC, CERN

CMS-MUO-24-001 ; CERN-EP-2024-325
Identification of low-momentum muons in the CMS detector using multivariate techniques in proton-proton collisions at $ \sqrt{s} = $ 13.6 TeV
Submitted to J. Instrum.
Abstract: "Soft" muons with a transverse momentum below 10 GeV are featured in many processes studied by the CMS experiment, such as decays of heavy-flavor hadrons or rare tau lepton decays. Maximizing the selection efficiency for these muons, while simultaneously suppressing backgrounds from long-lived light-flavor hadron decays, is therefore important for the success of the CMS physics program. Multivariate techniques have been shown to deliver better muon identification performance than traditional selection techniques. To take full advantage of the large data set currently being collected during Run 3 of the CERN LHC, a new multivariate classifier based on a gradient-boosted decision tree has been developed. It offers a significantly improved separation of signal and background muons compared to a similar classifier used for the analysis of the Run 2 data. The performance of the new classifier is evaluated on a data set collected with the CMS detector in 2022 and 2023, corresponding to an integrated luminosity of 62 fb$ ^{-1} $.
Figures

Figure 1:
Distributions of the fraction of valid hits on the tracker track (left) and of $ -\ln(p_{\text{global}}) $ (right), where $ p_{\text{global}} $ is the global track fit probability, for signal muons (solid blue line) and background muons (dashed red line) used in the training.

Figure 2:
The ROC curve for the Run 3 soft-muon MVA (blue line) compared to those for the Run 2 soft-muon MVA (orange line) and the Muon MVA (green line). The working points defined in Table 1 are indicated with round, colored markers, with the very loose, loose, medium, and tight working points being represented by the grey, blue, red, and purple markers, respectively. For comparison, the performance of the cut-based IDs is indicated by the colored stars, with the loose, soft, and medium IDs being represented by the green, blue, and red markers, respectively. The upper plot shows the ROC curves for all muons, while the lower ones split the muon sample into those that fired the HLT (left) and those that did not (right).
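As a rough illustration of how a ROC curve relates to a working point, the sketch below scans thresholds on a classifier score and picks the loosest threshold whose background efficiency stays at or below a target. The scores, function names, and target value are hypothetical and are not taken from the CMS analysis.

```python
def roc_points(sig_scores, bkg_scores):
    """Return (threshold, signal_eff, background_eff) for each candidate threshold."""
    thresholds = sorted(set(sig_scores) | set(bkg_scores))
    points = []
    for t in thresholds:
        # A muon is selected when its score is at or above the threshold.
        sig_eff = sum(s >= t for s in sig_scores) / len(sig_scores)
        bkg_eff = sum(b >= t for b in bkg_scores) / len(bkg_scores)
        points.append((t, sig_eff, bkg_eff))
    return points


def working_point(sig_scores, bkg_scores, max_bkg_eff):
    """Loosest threshold whose background efficiency does not exceed max_bkg_eff."""
    # Thresholds are scanned in ascending order, so the first match is the loosest.
    for t, sig_eff, bkg_eff in roc_points(sig_scores, bkg_scores):
        if bkg_eff <= max_bkg_eff:
            return t, sig_eff, bkg_eff
    return None


# Toy scores (hypothetical): signal tends towards 1, background towards 0.
signal = [0.9, 0.8, 0.7, 0.6, 0.4]
background = [0.1, 0.2, 0.3, 0.5, 0.7]
wp = working_point(signal, background, max_bkg_eff=0.2)
print(wp)
```

Each point returned by `roc_points` corresponds to one possible marker position on a ROC curve; a named working point is simply one such threshold fixed in advance.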

Figure 3:
Distribution of the Run 3 soft-muon MVA score for muons of different origins.

Figure 4:
Cumulative distribution of the classifier score of the Run 3 soft-muon MVA, comparing data from early 2022 (black markers) with inclusive dilepton simulation (blue histogram) for muon pairs with 2.7 $ < m_{\mu\mu} < $ 3.5 GeV. The lower panel shows the ratio of data to simulation. Uncertainties represented by the vertical bars are statistical only.
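A cumulative score distribution and a data-to-simulation ratio of this kind could be built along the lines of the sketch below. The direction of accumulation (from low to high scores), the bin edges, and the toy scores are assumptions made for illustration; simulation weights are not modelled.

```python
def cumulative_fraction(scores, edges):
    """Fraction of entries with score <= each upper edge (assumed direction)."""
    n = len(scores)
    return [sum(s <= e for s in scores) / n for e in edges]


def ratio(data_cdf, sim_cdf):
    """Point-by-point data/simulation ratio; None where simulation is zero."""
    return [d / s if s > 0 else None for d, s in zip(data_cdf, sim_cdf)]


edges = [0.25, 0.5, 0.75, 1.0]
data_scores = [0.1, 0.3, 0.6, 0.9]          # hypothetical data
sim_scores = [0.2, 0.4, 0.45, 0.8, 0.95]    # hypothetical simulation
data_cdf = cumulative_fraction(data_scores, edges)
sim_cdf = cumulative_fraction(sim_scores, edges)
data_over_sim = ratio(data_cdf, sim_cdf)
print(data_cdf, sim_cdf, data_over_sim)
```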

Figure 5:
Efficiencies of the Run 2 (blue) and Run 3 (red) soft-muon MVA as functions of muon $ p_{\mathrm{T}} $ for muons with 0 $ < |\eta| < $ 0.9 (upper left), 0.9 $ < |\eta| < $ 1.2 (upper right), and 1.2 $ < |\eta| < $ 2.4 (lower). For the Run 3 soft-muon MVA, the medium working point is used. The vertical bars indicate the total uncertainty including statistical and systematic uncertainties.
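For orientation, a binned efficiency of this type can be sketched as a simple pass/total count per $ p_{\mathrm{T}} $ bin. The counting approach, the bin edges, and the toy muons below are assumptions; the uncertainty is a normal-approximation binomial estimate covering the statistical part only, whereas the bars in the figure also include systematic uncertainties.

```python
import math


def binned_efficiency(muons, pt_edges):
    """muons: list of (pt, passed). Returns (eff, stat_unc) per bin, or None if empty."""
    results = []
    for lo, hi in zip(pt_edges[:-1], pt_edges[1:]):
        in_bin = [passed for pt, passed in muons if lo <= pt < hi]
        n = len(in_bin)
        if n == 0:
            results.append(None)
            continue
        eff = sum(in_bin) / n
        # Normal-approximation binomial uncertainty (statistical only).
        results.append((eff, math.sqrt(eff * (1.0 - eff) / n)))
    return results


# Hypothetical muons: (pT in GeV, passed the MVA selection).
muons = [(2.5, False), (3.0, True), (3.5, True), (3.8, False),
         (4.5, True), (5.0, True), (5.5, True), (5.9, False)]
effs = binned_efficiency(muons, [2.0, 4.0, 6.0, 8.0])
print(effs)
```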

Figure 6:
Efficiencies of the Run 2 (blue) and Run 3 (red) soft-muon MVA as functions of muon $ \eta $ for muons with 3 $ < p_{\mathrm{T}} < $ 6 GeV (left) and $ p_{\mathrm{T}} > $ 6 GeV (right). For the Run 3 soft-muon MVA, the medium working point is used. The vertical bars indicate the total uncertainty including statistical and systematic uncertainties.

Figure 7:
Efficiencies of the different working points of the Run 3 soft-muon MVA as functions of muon $ p_{\mathrm{T}} $ for muons with 0 $ < |\eta| < $ 0.9 (upper left), 0.9 $ < |\eta| < $ 1.2 (upper right), and 1.2 $ < |\eta| < $ 2.4 (lower). The vertical bars indicate the total uncertainty including statistical and systematic uncertainties.

Figure 8:
Efficiencies of different working points of the Run 3 soft-muon MVA as functions of muon $ \eta $ for muons with 3 $ < p_{\mathrm{T}} < $ 6 GeV (left) and $ p_{\mathrm{T}} > $ 6 GeV (right). The vertical bars indicate the total uncertainty including statistical and systematic uncertainties.

Figure 9:
Background rates for muons from pions at different working points of the Run 3 soft-muon MVA, for 2022–2023 data and simulation, as functions of the pion decay length $ L_{xy} $ in cm (left) and of the muon $ p_{\mathrm{T}} $ (right). Muons in the $ L_{xy} $ plot are required to have $ p_{\mathrm{T}} > $ 4 GeV, whereas the misidentification rate for muons with 2 $ < p_{\mathrm{T}} < $ 4 GeV is measured only for $ |\eta| > $ 1, since central muons in this $ p_{\mathrm{T}} $ range do not reach the muon detector. The vertical bars indicate the statistical uncertainty.

Figure 10:
Background rates for muons from kaons at different working points of the Run 3 soft-muon MVA, for 2022–2023 data and simulation, as a function of muon $ p_{\mathrm{T}} $. The misidentification rate for muons with 2 $ < p_{\mathrm{T}} < $ 4 GeV is measured only for $ |\eta| > $ 1, since central muons in this $ p_{\mathrm{T}} $ range do not reach the muon detector. The vertical bars indicate the statistical uncertainty.
Tables

Table 1:
Working points for the Run 3 soft-muon MVA classifier. For each working point, the efficiency of selecting muons from heavy-flavor decays, the purity of the selected muon sample, and the fraction of selected objects that are not associated with a real muon are given. These metrics are evaluated on the inclusive dilepton sample.
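The three quantities named in this caption could be computed for a given score threshold roughly as sketched below. The exact definitions used in the paper are not given here, so the sketch assumes: efficiency is the selected fraction of heavy-flavor muons, purity is the heavy-flavor fraction of all selected objects, and the fake fraction is the share of selected objects without a real muon. The origin labels (`"hf"`, `"lf"`, `"fake"`), the threshold, and the toy candidates are hypothetical.

```python
def working_point_metrics(candidates, threshold):
    """candidates: list of (score, origin) with origin in {"hf", "lf", "fake"}."""
    selected = [origin for score, origin in candidates if score >= threshold]
    n_hf_total = sum(origin == "hf" for _, origin in candidates)
    n_sel = len(selected)
    n_hf_sel = selected.count("hf")
    return {
        # Assumed definitions, see the lead-in above.
        "efficiency": n_hf_sel / n_hf_total if n_hf_total else None,
        "purity": n_hf_sel / n_sel if n_sel else None,
        "fake_fraction": selected.count("fake") / n_sel if n_sel else None,
    }


# Hypothetical candidates: heavy-flavor muons, light-flavor decay muons, fakes.
candidates = [
    (0.95, "hf"), (0.85, "hf"), (0.70, "hf"), (0.40, "hf"),
    (0.75, "lf"), (0.30, "lf"), (0.20, "lf"),
    (0.60, "fake"), (0.10, "fake"),
]
metrics = working_point_metrics(candidates, threshold=0.5)
print(metrics)
```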
Summary
A multivariate (MVA) classifier based on gradient-boosted decision trees has been developed for the selection of soft muons from the decay of heavy-flavor hadrons and various rare decays for transverse muon momenta $ p_{\mathrm{T}} $ below 10 GeV. It is optimized for the analysis of data recorded by the CMS experiment during Run 3 of the LHC. The classifier was trained to separate these muons from background muons from pion and kaon decays. The training uses muons that pass a looser selection in a larger phase space, compared with the training of a similar classifier used in the analysis of Run 2 data, thus increasing the sensitivity to the processes outlined above. The new training also takes into account a larger set of input features and uses input samples matching the beam and detector conditions in Run 3. Consequently, the new Run 3 soft-muon MVA offers significantly improved selection efficiency for the same background rejection as the Run 2 version in both simulated samples and collision data recorded in 2022 and 2023. This improvement is most apparent for muons with $ p_{\mathrm{T}} < $ 4 GeV and muons reconstructed in the forward direction of the CMS detector, which were not included in the training of the Run 2 version of the classifier. The efficiency and background rate of the Run 3 soft-muon MVA measured in collision data are generally well described by the CMS simulation, with some significant differences observed for forward muons, especially for strict selections on the MVA score, which can be corrected at the analysis level.
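The summary does not say how such differences are corrected. One common approach, shown here purely as an assumption, is to weight simulated muons by a binned data-to-simulation efficiency ratio (a scale factor). The $ |\eta| $ edges follow the ranges used in the efficiency figures; the $ p_{\mathrm{T}} $ edges, the efficiency values, and all names are hypothetical.

```python
from bisect import bisect_right

PT_EDGES = [2.0, 4.0, 6.0, 10.0]   # GeV, hypothetical binning
ETA_EDGES = [0.0, 0.9, 1.2, 2.4]   # |eta| ranges as in the efficiency figures

# Hypothetical efficiencies keyed by (pt_bin, eta_bin); not measured values.
EFF_DATA = {(0, 2): 0.60, (1, 0): 0.88}
EFF_SIM = {(0, 2): 0.75, (1, 0): 0.88}


def find_bin(value, edges):
    """Index of the bin containing value, or None if out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def scale_factor(pt, abs_eta):
    """Data/simulation efficiency ratio for a muon; 1.0 if no entry exists."""
    key = (find_bin(pt, PT_EDGES), find_bin(abs_eta, ETA_EDGES))
    if key not in EFF_DATA or EFF_SIM.get(key, 0) == 0:
        return 1.0
    return EFF_DATA[key] / EFF_SIM[key]


sf_forward = scale_factor(3.0, 2.0)   # low-pT forward muon
sf_central = scale_factor(5.0, 0.5)   # central muon
print(sf_forward, sf_central)
```

In this toy setup the forward bin receives a weight below one because the assumed simulated efficiency exceeds the assumed data efficiency, while the central bin is left unchanged.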