CMS-PAS-MLG-23-005  
Development of systematic-aware neural network trainings for binned-likelihood analyses at the LHC  
CMS Collaboration  
25 July 2024  
Abstract: We demonstrate a neural network training that accounts for the effects of systematic variations of the utilized data model in the training process, and describe its extension to neural network multiclass classification. Trainings for binary and multiclass classification with seven output classes are performed, based on a comprehensive data model with 86 nontrivial shape-altering systematic variations, as used for a previous measurement. The neural network output functions are used to infer the signal strengths for inclusive Higgs boson production, as well as for Higgs boson production via gluon fusion ($ r_{\mathrm{ggH}} $) and vector boson fusion ($ r_{\mathrm{qqH}} $). With respect to a conventional training based on cross-entropy, we observe improvements of 12 and 16% in the sensitivity to $ r_{\mathrm{ggH}} $ and $ r_{\mathrm{qqH}} $, respectively.  
Figures  
Figure 1:
Flow chart of (upper part) a $ \mathrm{CENNT} $ and (lower part) a $ \mathrm{SANNT} $. In the figure, $ D_{i} $ denotes the dataset; $ n $ ($ d $) the number of events (observables) in the initial dataset $ D_{X} $; $ l $ the number of classes after event classification; and $ h $ the number of histogram bins entering the statistical inference of the POIs. The function symbol $ \mathbb{P} $ represents the multinomial distribution; the symbol $ \mathcal{L} $ is defined in Eq. 1. 
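The data flow named in this caption can be illustrated with a minimal Python sketch. It only traces the shapes from $ D_{X} $ ($ n $ events, $ d $ observables) through $ l $ output classes to $ h $ histogram bins. The sizes, the random linear map standing in for a trained network, the assignment of each event to the class with the largest output, and the choice of an equal number of bins per class are all assumptions made for illustration, not details taken from this document.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, l, bins_per_class = 1000, 5, 3, 4    # hypothetical sizes

X = rng.normal(size=(n, d))                # D_X: n events with d observables each
W = rng.normal(size=(d, l))                # stand-in for a trained network (assumption)

# Softmax over the l class outputs; each row of y_hat sums to 1.
logits = X @ W
logits -= logits.max(axis=1, keepdims=True)
y_hat = np.exp(logits)
y_hat /= y_hat.sum(axis=1, keepdims=True)

# Assign each event to the class with the largest output (assumed convention).
cls = y_hat.argmax(axis=1)
score = np.clip(y_hat.max(axis=1), 1.0 / l, 1.0)   # largest output lies in [1/l, 1]

# One histogram per class; together they form the h bins used for inference.
edges = np.linspace(1.0 / l, 1.0, bins_per_class + 1)
H = np.stack([np.histogram(score[cls == c], bins=edges)[0] for c in range(l)])
h = H.size                                 # h = l * bins_per_class in this sketch
```

In this sketch every event ends up in exactly one bin, so the bin contents sum to $ n $.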
Figure 2:
Custom functions $ \mathcal{B}_{i} $ for the backward pass of the backpropagation algorithm, as used (left) in Ref. [5] and (right) in this paper. The first row of each subfigure shows the same 20 random samples of a simple setup of pseudoexperiments, as described in Section 3.2. The second row shows the resulting histogram $ H $; the third and fourth rows show the functions $ \mathcal{B}_{0} $ and $ \mathcal{B}_{1} $ for the individual bins $ H_{0} $ and $ H_{1} $; and the last row shows the collective effect of $ \sum\mathcal{B}_{i} $. 
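The idea behind a custom backward function can be sketched in Python as follows. A hard bin count has zero gradient almost everywhere, so the backward pass substitutes a smooth surrogate. The functional forms of $ \mathcal{B}_{i} $ used in Ref. [5] and in this paper are not given in this section; the Gaussian surrogate below, with a width of half the bin width, is an assumption chosen purely for illustration, and the names `bin_count` and `surrogate_grad` are hypothetical.

```python
import numpy as np

def bin_count(y, lo, hi):
    """Forward pass: hard count of outputs with lo <= y < hi."""
    return int(np.sum((y >= lo) & (y < hi)))

def surrogate_grad(y, lo, hi):
    """Backward-pass stand-in for d(count)/dy, evaluated per sample.

    ASSUMPTION: derivative of a Gaussian bump centered on the bin,
    with sigma equal to half the bin width. Illustrative only.
    """
    center = 0.5 * (lo + hi)
    sigma = 0.5 * (hi - lo)
    z = (y - center) / sigma
    return -z / sigma * np.exp(-0.5 * z**2)

rng = np.random.default_rng(1)
y = rng.uniform(0.0, 1.0, size=20)         # 20 toy network outputs
edges = [(0.0, 0.5), (0.5, 1.0)]           # two bins, H_0 and H_1

H = [bin_count(y, lo, hi) for lo, hi in edges]
grads = [surrogate_grad(y, lo, hi) for lo, hi in edges]
total = grads[0] + grads[1]                # collective effect of both bins
```

With this surrogate, a sample below a bin center receives a positive gradient for that bin and a sample above it a negative one, so the gradient pulls samples towards the bin center.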
Figure 3:
Evolution of the loss functions CE, $ \Delta r_{s}^{\mathrm{stat.}} $, and $ \Delta r_{s} $, as used (left) in Ref. [5] and (right) for this paper. The upper panels show the evolution of $ \hat{y} $ during training for 50 randomly selected signal (blue) and 50 randomly selected background (orange) samples. The gray shaded area indicates the pretraining. The second and third panels from above show the evolution of CE and $ \Delta r_{s}^{\mathrm{stat.}} $. The lowest panels show the evolution of $ L_{\mathrm{SANNT}}=\Delta r_{s} $. The evaluation on the training (validation) dataset is indicated in blue (orange). The dashed orange curves indicate the loss function that is inactive during or after pretraining, evaluated on the validation dataset. 
Figure 4:
Expected distributions of $ \hat{y}(\,\cdot\,) $ for a binary classification task separating $ S $ from $ B $, for the (left) $ \mathrm{CENNT} $ and (right) $ \mathrm{SANNT} $, prior to any fit to $ D_{H}^{\mathcal{A}} $. The individual distributions for $ S $ and $ B $ are shown by the nonstacked open blue and filled orange histograms, respectively. The lower panels show the expected values of $ S/B+1 $. The gray bands correspond to the combined statistical and systematic uncertainty in $ B $. 
Figure 5:
The 20 nuisance parameters $ \{\theta_{j}\} $ with the largest impacts on $ r_{s} $. The gray lines refer to the $ \mathrm{CENNT} $ and the colored bars to the $ \mathrm{SANNT} $. The impacts can be read off from the $ x $ axis. The labels of the $ \theta_{j} $ are shown on the $ y $ axis, ordered by decreasing impact from top to bottom of the figure. The association of each $ \theta_{j} $ label with the systematic variation it refers to is summarized in Table 1. A more detailed discussion is given in the text. 
Figure 6:
Negative log of the profile likelihood $ -2\Delta\log\mathcal{L} $ as a function of $ r_{s} $, taking into account (red) all and (blue) only the statistical uncertainties in $ \Delta r_{s} $. The results obtained from the $ \mathrm{CENNT} $ are indicated by the dashed lines; the median expected result of an ensemble of 100 repetitions of the $ \mathrm{SANNT} $ with varying random initializations is indicated by the continuous lines. The red and blue shaded bands surrounding the median expectations indicate the 68% central intervals of these ensembles. The lower panels show the distributions underlying these central intervals. 
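As an illustration of how a curve of this kind yields an uncertainty, the following Python sketch scans $ -2\Delta\log\mathcal{L} $ for a binned Poisson likelihood evaluated on an Asimov dataset and reads off the interval where the curve stays below 1. It is a statistics-only toy without nuisance parameters, so it corresponds at most to the blue curves; the signal and background yields are invented for the example and are not taken from the analysis.

```python
import numpy as np

s = np.array([5.0, 20.0, 40.0])        # hypothetical signal yields per bin
b = np.array([1000.0, 200.0, 30.0])    # hypothetical background yields per bin
n_asimov = s + b                        # Asimov data for r_s = 1

def q(r):
    """-2 * Delta log L of a binned Poisson likelihood, no nuisance parameters."""
    mu = r * s + b
    return 2.0 * np.sum(mu - n_asimov + n_asimov * np.log(n_asimov / mu))

# Scan r_s and keep the region where the curve is at most 1 (68% interval).
r_grid = np.linspace(0.0, 2.0, 2001)
q_grid = np.array([q(r) for r in r_grid])
inside = r_grid[q_grid <= 1.0]
r_lo, r_hi = inside.min(), inside.max()
delta_r_stat = 0.5 * (r_hi - r_lo)      # symmetrized statistical uncertainty
```

The curve vanishes at $ r_{s}=1 $ by construction of the Asimov dataset. A profile likelihood including systematic uncertainties would additionally minimize over the nuisance parameters at each scan point, which this sketch does not attempt.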
Figure 7:
Expected distributions of $ \hat{y}_{l}(\,\cdot\,) $ for multiclass classification, based on seven event classes, as used for a differential STXS stage-0 cross section measurement of H production in Ref. [1], prior to any fit to $ D_{H}^{\mathcal{A}} $. The upper (lower) part of the figure shows the results obtained after $ \mathrm{CENNT} $ ($ \mathrm{SANNT} $). The background processes of $ \Omega_{X} $ are indicated by stacked, differently colored, filled histograms. The expected $ \mathrm{g}\mathrm{g}\mathrm{H} $ and $ \mathrm{q}\mathrm{q}\mathrm{H} $ contributions are indicated by the nonstacked, cyan- and red-colored, open histograms. The lower panels show the expected values of $ S/B+1 $. The gray bands correspond to the combined statistical and systematic uncertainty in the background model. 
Figure 8:
Negative log of the profile likelihood $ -2\Delta\log\mathcal{L} $ as a function of $ r_{s} $, for a differential STXS stage-0 cross section measurement of H production in the $ \mathrm{H}\to\tau\tau $ decay channel, taking (red) all and (blue) only the statistical uncertainties in $ \Delta r_{s} $ into account. The left part of the figure shows $ r_{\mathrm{inc.}} $ for an inclusive measurement; the middle and right parts show $ r_{\mathrm{g}\mathrm{g}\mathrm{H}} $ and $ r_{\mathrm{q}\mathrm{q}\mathrm{H}} $ for a combined differential STXS stage-0 measurement of these two contributions to the signal in two bins. The results obtained from the $ \mathrm{CENNT} $ are indicated by the dashed lines; the median expected result of an ensemble of 100 repetitions of the $ \mathrm{SANNT} $ with varying random initializations is indicated by the continuous lines. The red and blue shaded bands surrounding the median expectations indicate the 68% central intervals of these ensembles. 
Figure 9:
Evolution of the loss functions CE, $ \Delta r_{s}^{\mathrm{stat.}} $, and $ \Delta r_{s} $, as used for this paper. Instead of the custom functions $ \mathcal{B}_{i} $, the identity operation (the so-called straight-through estimator) is used for the SANNT. The upper panel shows the evolution of $ \hat{y} $ during training for 50 randomly selected signal (blue) and 50 randomly selected background (orange) samples. The gray shaded area indicates the pretraining. The second panel from above shows the evolution of CE. Though not actively used for the SANNT, $ \Delta r_{s}^{\mathrm{stat.}} $ is also shown, in the third panel from above. The lower panel shows the evolution of $ \Delta r_{s} $. The evaluation on the training (validation) dataset is indicated in blue (orange). The evolution of inactive loss functions, evaluated on the validation dataset, is indicated by the dashed orange curves. 
Tables  
Table 1:
Association of nuisance parameters $ \{\theta_{j}\} $ with the systematic variations they refer to, for the 20 $ \{\theta_{j}\} $ with the largest impacts on $ r_{s} $, as shown in Fig. 5. The label of each corresponding uncertainty is given in the first column; the type of uncertainty, the process that it applies to, and its rank in Fig. 5 are given in the second, third, and fourth columns, respectively. A more detailed discussion is given in the text. 
Table 2:
Expected combined statistical and systematic uncertainties $ \Delta r_{s} $ and statistical uncertainties $ \Delta r_{s}^{\mathrm{stat.}} $ in the parameters $ r_{\mathrm{inc.}} $ for an inclusive, and $ r_{\mathrm{g}\mathrm{g}\mathrm{H}} $ and $ r_{\mathrm{q}\mathrm{q}\mathrm{H}} $ for a differential STXS stage-0 cross section measurement of H production in the $ \mathrm{H}\to\tau\tau $ decay channel, as obtained from fits to $ D_{H}^{\mathcal{A}} $. The second (third) column shows the results after $ \mathrm{SANNT} $ ($ \mathrm{CENNT} $). 
Summary 
We have demonstrated a neural network training that accounts for the effects of systematic variations of the utilized data model in the training process, and described its extension to neural network multiclass classification. Trainings for binary and multiclass classification with seven output classes have been performed, based on a comprehensive data model with 86 nontrivial shape-altering systematic variations, as used for a previous measurement. The neural network output functions have been used to infer the signal strengths for inclusive Higgs boson production, as well as for Higgs boson production via gluon fusion ($ r_{\mathrm{g}\mathrm{g}\mathrm{H}} $) and vector boson fusion ($ r_{\mathrm{q}\mathrm{q}\mathrm{H}} $). With respect to a conventional training based on cross-entropy, we observe improvements of 12 and 16% in the sensitivity to $ r_{\mathrm{g}\mathrm{g}\mathrm{H}} $ and $ r_{\mathrm{q}\mathrm{q}\mathrm{H}} $, respectively. This is the first time that such a systematic-aware training has been demonstrated on a data model of this complexity, and the first time that it has been applied to multiclass classification. 
References  
1  CMS Collaboration  Measurements of Higgs boson production in the decay channel with a pair of $ \tau $ leptons in proton-proton collisions at $ \sqrt{s}= $ 13 TeV  EPJC 83 (2023) 562  CMS-HIG-19-010  2204.12957 
2  R. M. Neal  Computing Likelihood Functions for High-Energy Physics Experiments when Distributions are Defined by Simulators with Nuisance Parameters  link  
3  K. Cranmer, J. Pavez, and G. Louppe  Approximating Likelihood Ratios with Calibrated Discriminative Classifiers  link  1506.02169 
4  P. De Castro and T. Dorigo  INFERNO: Inference-Aware Neural Optimisation  Comput. Phys. Commun. 244 (2019) 170  1806.04743 
5  S. Wunsch, S. Jörger, R. Wolf, and G. Quast  Optimal Statistical Inference in the Presence of Systematic Uncertainties Using Neural Network Optimization Based on Binned Poisson Likelihoods with Nuisance Parameters  Comput. Softw. Big Sci. 5 (2021) 4  2003.07186 
6  N. Simpson and L. Heinrich  neos: End-to-End-Optimised Summary Statistics for High Energy Physics  J. Phys. Conf. Ser. 2438 (2023) 012105  2203.05570 
7  CMS Collaboration  Performance of the CMS Level-1 trigger in proton-proton collisions at $ \sqrt{s} = $ 13 TeV  JINST 15 (2020) P10017  CMS-TRG-17-001  2006.10165 
8  CMS Collaboration  The CMS trigger system  JINST 12 (2017) P01020  CMS-TRG-12-001  1609.02366 
9  CMS Collaboration  The CMS experiment at the CERN LHC  JINST 3 (2008) S08004  
10  CMS Collaboration  Particle-flow reconstruction and global event description with the CMS detector  JINST 12 (2017) P10003  CMS-PRF-14-001  1706.04965 
11  CMS Collaboration  Technical proposal for the Phase-II upgrade of the Compact Muon Solenoid  CMS Technical Proposal CERN-LHCC-2015-010, CMS-TDR-15-02, 2015  CDS 
12  CMS Collaboration  Performance of electron reconstruction and selection with the CMS detector in proton-proton collisions at $ \sqrt{s} = $ 8 TeV  JINST 10 (2015) P06005  CMS-EGM-13-001  1502.02701 
13  CMS Collaboration  Electron and photon reconstruction and identification with the CMS experiment at the CERN LHC  JINST 16 (2021) P05014  CMS-EGM-17-001  2012.06888 
14  CMS Collaboration  Performance of CMS muon reconstruction in pp collision events at $ \sqrt{s}= $ 7 TeV  JINST 7 (2012) P10002  CMS-MUO-10-004  1206.4071 
15  CMS Collaboration  Performance of the CMS muon detector and muon reconstruction with proton-proton collisions at $ \sqrt{s}= $ 13 TeV  JINST 13 (2018) P06015  CMS-MUO-16-001  1804.04528 
16  M. Cacciari, G. P. Salam, and G. Soyez  The anti-$ k_{\mathrm{T}} $ jet clustering algorithm  JHEP 04 (2008) 063  0802.1189 
17  M. Cacciari, G. P. Salam, and G. Soyez  FastJet user manual  EPJC 72 (2012) 1896  1111.6097 
18  CMS Collaboration  Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV  JINST 13 (2018) P05011  CMS-BTV-16-002  1712.07158 
19  E. Bols et al.  Jet flavour classification using DeepJet  JINST 15 (2020) P12012  2008.10519 
20  CMS Collaboration  Performance of reconstruction and identification of $ \tau $ leptons decaying to hadrons and $ \nu_\tau $ in pp collisions at $ \sqrt{s}= $ 13 TeV  JINST 13 (2018) P10005  CMS-TAU-16-003  1809.02816 
21  CMS Collaboration  Identification of hadronic tau lepton decays using a deep neural network  JINST 17 (2022) P07023  CMS-TAU-20-001  2201.08458 
22  CMS Collaboration  Performance of the CMS missing transverse momentum reconstruction in pp data at $ \sqrt{s}= $ 8 TeV  JINST 10 (2015) P02006  CMS-JME-13-003  1411.0511 
23  D. Bertolini, P. Harris, M. Low, and N. Tran  Pileup per particle identification  JHEP 10 (2014) 059  1407.6013 
24  CMS Collaboration  An embedding technique to determine $ \tau\tau $ backgrounds in proton-proton collision data  JINST 14 (2019) P06032  CMS-TAU-18-001  1903.01216 
25  CMS Collaboration  Measurement of the $ \mathrm{Z}\gamma^{*}\to\tau\tau $ cross section in pp collisions at $ \sqrt{s}= $ 13 TeV and validation of $ \tau $ lepton analysis techniques  EPJC 78 (2018) 708  CMS-HIG-15-007  1801.03535 
26  CMS Collaboration  Search for additional neutral MSSM Higgs bosons in the $ \tau\tau $ final state in proton-proton collisions at $ \sqrt{s}= $ 13 TeV  JHEP 09 (2018) 007  CMS-HIG-17-020  1803.06553 
27  J. Alwall et al.  MadGraph 5: Going beyond  JHEP 06 (2011) 128  1106.0522 
28  J. Alwall et al.  The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations  JHEP 07 (2014) 079  1405.0301 
29  P. Nason  A new method for combining NLO QCD with shower Monte Carlo algorithms  JHEP 11 (2004) 040  hep-ph/0409146 
30  S. Frixione, P. Nason, and C. Oleari  Matching NLO QCD computations with parton shower simulations: the POWHEG method  JHEP 11 (2007) 070  0709.2092 
31  S. Alioli, P. Nason, C. Oleari, and E. Re  NLO Higgs boson production via gluon fusion matched with shower in POWHEG  JHEP 04 (2009) 002  0812.0578 
32  S. Alioli, P. Nason, C. Oleari, and E. Re  A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG BOX  JHEP 06 (2010) 043  1002.2581 
33  S. Alioli et al.  Jet pair production in POWHEG  JHEP 04 (2011) 081  1012.3380 
34  E. Bagnaschi, G. Degrassi, P. Slavich, and A. Vicini  Higgs production via gluon fusion in the POWHEG approach in the SM and in the MSSM  JHEP 02 (2012) 088  1111.2854 
35  T. Sjöstrand et al.  An introduction to PYTHIA 8.2  Comput. Phys. Commun. 191 (2015) 159  1410.3012 
36  S. Agostinelli et al.  GEANT4 - a simulation toolkit  NIM A 506 (2003) 250  
37  K. Fukushima  Cognitron: A self-organizing multilayered neural network  Biological Cybernetics 20 (1975) 121  
38  V. Nair and G. E. Hinton  Rectified linear units improve restricted Boltzmann machines  in Proceedings of the 27th International Conference on Machine Learning, ICML'10, Madison, USA, 2010  
39  L. Bianchini, J. Conway, E. K. Friis, and C. Veelken  Reconstruction of the Higgs mass in $ H\to\tau\tau $ events by dynamical likelihood techniques  J. Phys. Conf. Ser. 513 (2014) 022035  
40  A. V. Gritsan, R. Röntsch, M. Schulze, and M. Xiao  Constraining anomalous Higgs boson couplings to the heavy flavor fermions using matrix element techniques  PRD 94 (2016) 055023  1606.03107 
41  LHC Higgs Cross Section Working Group  Handbook of LHC Higgs cross sections: 4. Deciphering the nature of the Higgs sector  CERN Report CERN-2017-002-M, 2016  link  1610.07922 
42  N. Berger et al.  Simplified template cross sections - stage 1.1  LHC Higgs Cross Section Working Group Report LHCHXSWG-2019-003, DESY-19-070, 2019  link  1906.02754 
43  G. Cowan, K. Cranmer, E. Gross, and O. Vitells  Asymptotic formulae for likelihood-based tests of new physics  EPJC 71 (2011) 1554  1007.1727 
44  CMS Collaboration  The CMS statistical analysis and combination tool: Combine  Submitted to Comput. Softw. Big Sci., 2024  CMS-CAT-23-001  2404.06614 
45  R. J. Barlow and C. Beeston  Fitting using finite Monte Carlo samples  Comput. Phys. Commun. 77 (1993) 219  
46  C. R. Rao  Information and the accuracy attainable in the estimation of statistical parameters  in Breakthroughs in statistics, Springer, 1992  
47  H. Cramér  Mathematical methods of statistics, volume 9  Princeton university press, 1999  
48  R. A. Fisher  Theory of statistical estimation  Mathematical Proceedings of the Cambridge Philosophical Society 22 (1925) 700  
49  S. Wunsch, R. Friese, R. Wolf, and G. Quast  Identifying the relevant dependencies of the neural network response on characteristics of the input space  Comput. Softw. Big Sci. 2 (2018) 5  1803.08782 
50  J. Platt and A. Barr  Constrained differential optimization  in Neural Information Processing Systems, D. Anderson, ed., American Institute of Physics, 1987 link 
Compact Muon Solenoid LHC, CERN 