Performance of reconstruction and identification of τ leptons in their decays to hadrons and ντ in LHC Run-2
Abstract: The CMS hadrons-plus-strips (HPS) algorithm, developed to reconstruct tau leptons in their hadronic decays, performed well in LHC Run-1: it achieves an identification efficiency of 50-60%, with probabilities for quark and gluon jets, electrons, and muons to be misidentified as τ leptons between the per cent and per mille levels. This paper describes improvements to the HPS algorithm for LHC Run-2. The performance is evaluated with a sample of proton-proton collisions recorded at a center-of-mass energy of √s = 13 TeV in 2015. The data sample corresponds to a total integrated luminosity of 2.3 fb⁻¹.
Figure 1-a:
Distance in η (a) and in ϕ (b) between the τh and the e/γ candidates that originate from tau decay products, as a function of the e/γ pT. A sample of simulated τh decays is used. The window is larger in the ϕ direction because of bending in the magnetic field. The points show the 95% quantile in each bin, and the dashed lines represent the fitted functions f and g given by Eq. 3.

Figure 1-b:
Distance in η (a) and in ϕ (b) between the τh and the e/γ candidates that originate from tau decay products, as a function of the e/γ pT. A sample of simulated τh decays is used. The window is larger in the ϕ direction because of bending in the magnetic field. The points show the 95% quantile in each bin, and the dashed lines represent the fitted functions f and g given by Eq. 3.
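
The per-bin 95% quantile shown by the points can be sketched as follows. This is only an illustration in plain Python: the function names, the binning, and the linear-interpolation quantile are assumptions, and the fitted functions f and g of Eq. 3 are not reproduced here.

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def quantile_per_bin(pt, delta, bin_edges, q=0.95):
    """Quantile of |delta| in each pT bin; None for empty bins.

    pt and delta are parallel sequences (hypothetical inputs): the e/gamma pT
    and its signed distance in eta or phi from the tau_h candidate.
    """
    bins = [[] for _ in range(len(bin_edges) - 1)]
    for p, d in zip(pt, delta):
        for i in range(len(bins)):
            if bin_edges[i] <= p < bin_edges[i + 1]:
                bins[i].append(abs(d))  # a distance is non-negative
                break
    return [quantile(b, q) if b else None for b in bins]
```

A window that contains 95% of the decay products in each bin could then be obtained by fitting a smooth function of pT to these values.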

Figure 2-a:
Misidentification probability as a function of τh identification efficiency, evaluated using H → ττ and QCD MC samples (a), and Z′(2 TeV) and QCD MC samples (b). Four configurations of the reconstruction and isolation method are compared (from top to bottom): Run-1 fixed-size strip with Δβ = 0.46; Run-1 fixed-size strip with Δβ = 0.46 and the pT(strip,outer) cut; Run-1 fixed-size strip with Δβ = 0.2 and the pT(strip,outer) cut; Run-2 dynamic strip with Δβ = 0.2 and the pT(strip,outer) cut. The three points on each curve correspond, from left to right, to the Tight, Medium, and Loose working points. The misidentification probability is calculated with respect to jets that pass minimal τ reconstruction requirements.

Figure 2-b:
Misidentification probability as a function of τh identification efficiency, evaluated using H → ττ and QCD MC samples (a), and Z′(2 TeV) and QCD MC samples (b). Four configurations of the reconstruction and isolation method are compared (from top to bottom): Run-1 fixed-size strip with Δβ = 0.46; Run-1 fixed-size strip with Δβ = 0.46 and the pT(strip,outer) cut; Run-1 fixed-size strip with Δβ = 0.2 and the pT(strip,outer) cut; Run-2 dynamic strip with Δβ = 0.2 and the pT(strip,outer) cut. The three points on each curve correspond, from left to right, to the Tight, Medium, and Loose working points. The misidentification probability is calculated with respect to jets that pass minimal τ reconstruction requirements.
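
The role of the Δβ factor can be illustrated with a short sketch. The caption does not spell out the isolation sum, so the form below (charged sum plus pileup-corrected photon sum, floored at zero) is an assumption based on the usual Δβ-corrected isolation, and all argument names are hypothetical.

```python
def isolation_sum(sum_pt_charged, sum_pt_photons, sum_pt_pileup_charged,
                  delta_beta=0.2):
    """Delta-beta corrected isolation sum in GeV (assumed form).

    sum_pt_charged:        pT sum of charged particles from the primary vertex
    sum_pt_photons:        pT sum of photons in the isolation cone
    sum_pt_pileup_charged: pT sum of charged particles from pileup vertices,
                           used to estimate the neutral pileup contribution
    delta_beta:            0.46 for the Run-1 setting, 0.2 for the Run-2 setting
    """
    neutral = max(0.0, sum_pt_photons - delta_beta * sum_pt_pileup_charged)
    return sum_pt_charged + neutral
```

With identical inputs, a larger Δβ subtracts more of the photon sum, so the two settings compared in the figure generally give different isolation values for the same candidate.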

Figure 3-a:
Misidentification probability as a function of τh identification efficiency, evaluated using H → ττ and QCD MC samples (a), and Z′(2 TeV) and QCD MC samples (b). The MVA-based discriminators are compared with the isolation-sum discriminators. The points correspond to the working points of the discriminators. The three working points of the isolation-sum discriminator are Loose, Medium, and Tight. The six working points of the MVA-based discriminators are Very Loose, Loose, Medium, Tight, Very Tight, and Very Very Tight. The misidentification probability is calculated with respect to jets that pass minimal τ reconstruction requirements.

Figure 3-b:
Misidentification probability as a function of τh identification efficiency, evaluated using H → ττ and QCD MC samples (a), and Z′(2 TeV) and QCD MC samples (b). The MVA-based discriminators are compared with the isolation-sum discriminators. The points correspond to the working points of the discriminators. The three working points of the isolation-sum discriminator are Loose, Medium, and Tight. The six working points of the MVA-based discriminators are Very Loose, Loose, Medium, Tight, Very Tight, and Very Very Tight. The misidentification probability is calculated with respect to jets that pass minimal τ reconstruction requirements.
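
Each point on such a curve pairs an efficiency with a misidentification probability at one working point. A minimal sketch, assuming a discriminator whose score is higher for more τh-like candidates and using hypothetical threshold values:

```python
def pass_fraction(scores, threshold):
    """Fraction of candidates whose discriminator score exceeds the threshold."""
    if not scores:
        raise ValueError("need at least one candidate")
    return sum(1 for s in scores if s > threshold) / len(scores)

def working_point_curve(tau_scores, jet_scores, thresholds):
    """Map each working point to (efficiency, misidentification probability).

    tau_scores: scores of genuine tau_h candidates (signal sample)
    jet_scores: scores of jets passing minimal tau reconstruction requirements
    thresholds: dict of working-point name -> cut value (hypothetical values)
    """
    return {
        name: (pass_fraction(tau_scores, cut), pass_fraction(jet_scores, cut))
        for name, cut in thresholds.items()
    }
```

Tighter working points move a point toward lower efficiency and lower misidentification probability.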

Figure 4-a:
Efficiency of the τh identification estimated with simulated Z/γ* → ττ events (a) and the misidentification probability estimated with simulated QCD multijet events (b) for the Very Loose, Loose, Medium, Tight, Very Tight, and Very Very Tight working points of the MVA-based τh isolation algorithm. The efficiency is shown as a function of the τh transverse momentum, while the misidentification probability is shown as a function of the jet transverse momentum.

Figure 4-b:
Efficiency of the τh identification estimated with simulated Z/γ* → ττ events (a) and the misidentification probability estimated with simulated QCD multijet events (b) for the Very Loose, Loose, Medium, Tight, Very Tight, and Very Very Tight working points of the MVA-based τh isolation algorithm. The efficiency is shown as a function of the τh transverse momentum, while the misidentification probability is shown as a function of the jet transverse momentum.

Figure 5-a:
Efficiency of the τh identification estimated with simulated Z/γ* → ττ events (a) and the e → τh misidentification probability estimated with simulated Z/γ* → ee events (b) for the Very Loose, Loose, Medium, Tight, and Very Tight working points of the MVA-based anti-e discrimination algorithm. The efficiency is shown as a function of the τh transverse momentum, while the misidentification probability is shown as a function of the electron transverse momentum. Both the efficiency and the misidentification probability are calculated for τh candidates with a reconstructed decay mode that pass the Loose working point of the isolation-sum discriminator.

Figure 5-b:
Efficiency of the τh identification estimated with simulated Z/γ* → ττ events (a) and the e → τh misidentification probability estimated with simulated Z/γ* → ee events (b) for the Very Loose, Loose, Medium, Tight, and Very Tight working points of the MVA-based anti-e discrimination algorithm. The efficiency is shown as a function of the τh transverse momentum, while the misidentification probability is shown as a function of the electron transverse momentum. Both the efficiency and the misidentification probability are calculated for τh candidates with a reconstructed decay mode that pass the Loose working point of the isolation-sum discriminator.

Figure 6-a:
Postfit distributions in the pass (a,c) and fail (b,d) control regions, using mvis (a,b) or Ncharged (c,d) as observable, for the Loose working point of the MVA-based isolation.

Figure 6-b:
Postfit distributions in the pass (a,c) and fail (b,d) control regions, using mvis (a,b) or Ncharged (c,d) as observable, for the Loose working point of the MVA-based isolation.

Figure 6-c:
Postfit distributions in the pass (a,c) and fail (b,d) control regions, using mvis (a,b) or Ncharged (c,d) as observable, for the Loose working point of the MVA-based isolation.

Figure 6-d:
Postfit distributions in the pass (a,c) and fail (b,d) control regions, using mvis (a,b) or Ncharged (c,d) as observable, for the Loose working point of the MVA-based isolation.
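
The pass and fail regions feed a tag-and-probe efficiency. The sketch below assumes that signal yields for the two regions have already been extracted (for example from the fits shown); the function and argument names are hypothetical, and the simultaneous fit itself is not reproduced.

```python
def efficiency(n_pass, n_fail):
    """Efficiency from signal yields in the pass and fail regions."""
    total = n_pass + n_fail
    if total <= 0:
        raise ValueError("total yield must be positive")
    return n_pass / total

def scale_factor(data_pass, data_fail, mc_pass, mc_fail):
    """Data/MC scale factor: efficiency in data divided by efficiency in MC."""
    return efficiency(data_pass, data_fail) / efficiency(mc_pass, mc_fail)
```

A scale factor close to 1 indicates that the simulation describes the identification efficiency observed in data.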

Figure 7-a:
Postfit distributions in the μτh (a) and μμ (b) regions, for the Loose working point of the isolation-sum discriminator, as derived using the Z → ττ / Z → μμ ratio method.

Figure 7-b:
Postfit distributions in the μτh (a) and μμ (b) regions, for the Loose working point of the isolation-sum discriminator, as derived using the Z → ττ / Z → μμ ratio method.

Figure 8-a:
The transverse mass distribution in the selected sample of W → μν events after the maximum likelihood fit (a). The measured transverse mass distribution in the sample of selected W → τhν events with the Medium working point of the isolation-sum discriminator (b).

Figure 8-b:
The transverse mass distribution in the selected sample of W → μν events after the maximum likelihood fit (a). The measured transverse mass distribution in the sample of selected W → τhν events with the Medium working point of the isolation-sum discriminator (b).

Figure 9-a:
The distributions of mvis of the muon-τh system with all τh decay modes included. The observed data are compared with predictions for different shifts applied to the τh energy scale: −6% (a), 0% (b), and +6% (c).

Figure 9-b:
The distributions of mvis of the muon-τh system with all τh decay modes included. The observed data are compared with predictions for different shifts applied to the τh energy scale: −6% (a), 0% (b), and +6% (c).

Figure 9-c:
The distributions of mvis of the muon-τh system with all τh decay modes included. The observed data are compared with predictions for different shifts applied to the τh energy scale: −6% (a), 0% (b), and +6% (c).
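
How an energy-scale shift moves mvis can be sketched with plain four-vector arithmetic. This assumes the shift rescales the full τh four-momentum (E, px, py, pz) by a common factor, which is a simplification; the function names are hypothetical.

```python
import math

def scale_four_vector(p4, shift):
    """Scale (E, px, py, pz) by 1 + shift, e.g. shift=-0.06 for a -6% shift."""
    return tuple((1.0 + shift) * component for component in p4)

def visible_mass(p4_a, p4_b):
    """Invariant mass of the sum of two four-vectors (E, px, py, pz) in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p4_a, p4_b))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding
```

For example, with a massless muon (30, 30, 0, 0) and a back-to-back massless τh (40, -40, 0, 0), shifting the τh by +6% raises the visible mass by a factor of √1.06, so comparing shifted templates with data constrains the energy scale.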

Figure 10-a:
Postfit distributions in the SS (a) and OS (b) regions for the charge misidentification probability measurement.

Figure 10-b:
Postfit distributions in the SS (a) and OS (b) regions for the charge misidentification probability measurement.

Figure 12-a:
Post-fit plots of the tag and probe mass in the pass category for the Loose (a), Medium (b), Tight (c) and Very Tight (d) working point of the anti-e discriminator in the barrel region.

Figure 12-b:
Post-fit plots of the tag and probe mass in the pass category for the Loose (a), Medium (b), Tight (c) and Very Tight (d) working point of the anti-e discriminator in the barrel region.

Figure 12-c:
Post-fit plots of the tag and probe mass in the pass category for the Loose (a), Medium (b), Tight (c) and Very Tight (d) working point of the anti-e discriminator in the barrel region.

Figure 12-d:
Post-fit plots of the tag and probe mass in the pass category for the Loose (a), Medium (b), Tight (c) and Very Tight (d) working point of the anti-e discriminator in the barrel region.

Table 1:
Data/MC scale factors for the different working points of the isolation-sum and MVA-based discriminators, and for two values of the isolation cone size. An uncertainty of 3.9% has been added in quadrature to the uncertainty returned by the fit to account for the tracking efficiency uncertainty. The tag-and-probe method is used to measure the efficiency and its uncertainty.
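
Adding the tracking uncertainty in quadrature can be sketched as below. The sketch assumes the 3.9% is relative to the scale factor and that the fit uncertainty is absolute; both conventions are assumptions, as are the names.

```python
import math

def total_uncertainty(scale_factor, fit_uncertainty, tracking_rel=0.039):
    """Combine the fit uncertainty with a relative tracking uncertainty.

    scale_factor:    measured data/MC scale factor
    fit_uncertainty: absolute uncertainty returned by the fit
    tracking_rel:    relative tracking efficiency uncertainty (3.9% here)
    """
    tracking_abs = tracking_rel * scale_factor
    return math.hypot(fit_uncertainty, tracking_abs)
```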

Table 2:
Data/MC scale factors for the different working points of the isolation-sum and MVA-based discriminators, and for two values of the isolation cone size. The Z → τμτh / Z → μμ ratio is used as the discriminating variable.

Table 3:
τh identification efficiency scale factor, the normalization r of σ(pp → W + X | mW > 200 GeV), and the correlation coefficient between the two quantities, as obtained from the fit. The scale factors are measured for both the isolation-sum and the MVA-based discriminators.

Table 4:
Energy scale corrections for τh measured in Z → ττ events, for τh reconstructed in different decay modes. The inclusive result is obtained from an independent fit and hence may differ from the average of the τh energy scale corrections measured for individual decay modes.

Table 5:
Probability for electrons to pass the different working points of the MVA-based anti-e discriminator, split into the barrel and endcap regions. For each working point, the e → τh misidentification probability is defined as the fraction of probes passing the given discriminator with respect to the total number of probes.

Summary: The algorithm used in Run-2 to reconstruct and identify hadronically decaying taus has been described in this note, with particular emphasis on the changes with respect to Run-1. These changes include, among others, a dynamic strip reconstruction and additional variables in the MVA discriminators against jets and electrons.

The performance has been measured in data collected in 2015 at a center-of-mass energy of √s = 13 TeV. The tau identification and reconstruction techniques described are now fully commissioned and ready for use in CMS physics analyses for the remainder of Run-2. The τh identification efficiency measured in data, in both the low- and high-pT regions, is similar to that in Monte Carlo simulation, while the jet → τh misidentification probability is found to be moderately different. The τh energy scale is measured, and its response relative to the Monte Carlo simulation is found to be close to 1. The rejection of electrons misidentified as τh is seen to perform well in Run-2, and the corresponding scale factors have been measured.
Additional Figure 1:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the ratio between the total ECAL energy and the inner track momentum, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 2:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of (NGSFhits − NKFhits)/(NGSFhits + NKFhits), for hadronic τ decays (blue) and electrons (red). The quantities NGSFhits and NKFhits are, respectively, the numbers of valid hits in the tracker that are associated with the track reconstructed by the GSF and Kalman filter (KF) algorithms. The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 3:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the χ² per degree of freedom (DoF) of the track fit performed with the GSF algorithm, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 4:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of Fbrem = |pin − pout|/pin, for hadronic τ decays (blue) and electrons (red). The quantities pin and pout are the momenta, measured from the track curvature at the innermost and outermost positions in the tracker, of the tracks reconstructed using the GSF algorithm. The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.
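
The two track-based inputs defined in the captions of Additional Figures 2 and 4 are simple ratios. A minimal sketch, with hypothetical function names and with the handling of empty or non-physical inputs chosen here for illustration only:

```python
def hit_asymmetry(n_gsf_hits, n_kf_hits):
    """(N_GSF - N_KF) / (N_GSF + N_KF); None if both tracks have no hits."""
    total = n_gsf_hits + n_kf_hits
    if total == 0:
        return None
    return (n_gsf_hits - n_kf_hits) / total

def f_brem(p_in, p_out):
    """Relative momentum change |p_in - p_out| / p_in along the GSF track."""
    if p_in <= 0:
        raise ValueError("p_in must be positive")
    return abs(p_in - p_out) / p_in
```

An electron that radiates strongly in the tracker has a much smaller momentum at the outermost position than at the innermost one, which gives a large value of this ratio, whereas a charged hadron typically loses little momentum.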

Additional Figure 5:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the ratio between the ECAL energy associated with the leading track of the τh candidate and the leading track momentum, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 6:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of (Δϕ)² pTγ in-sigcone/GeV, computed from the pT-weighted square of the distance in ϕ between each photon included in a strip and the leading track of the τh candidate, for hadronic τ decays (blue) and electrons (red). This variable is computed separately for photons inside and outside the τh candidate signal cone in order to increase its separation power. The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 7:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of (Δη)² pTγ in-sigcone/GeV, computed from the pT-weighted square of the distance in η between each photon included in a strip and the leading track of the τh candidate, for hadronic τ decays (blue) and electrons (red). This variable is computed separately for photons inside and outside the τh candidate signal cone in order to increase its separation power. The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 8:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the fraction of τh energy carried by photons, for hadronic τ decays (blue) and electrons (red). This variable is computed separately for photons inside and outside the τh candidate signal cone in order to increase its separation power. The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 9:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the ratio between the energy deposited in the ECAL and the sum of the ECAL and HCAL energy deposits that are associated with the decay products of the τh candidate, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 10:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the number of valid hits of the track reconstructed by the GSF algorithm, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 11:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the τh candidate visible mass, computed by summing the four-momenta of photons and charged particles inside the τh candidate signal cone, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 12:
Input variable for the MVA-based anti-electron discriminator. Distribution, normalized to unity, of the ratio between the HCAL energy associated with the leading track of the τh candidate and the leading track momentum, for hadronic τ decays (blue) and electrons (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, to pass the loose working point of the cut-based isolation, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 13:
Distribution of the pT sum of charged hadrons in the isolation cone, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 14:
Distribution of the pT sum of photons in the isolation cone, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 15:
Distribution of the signed transverse impact parameter of the leading track, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 16:
Distribution of the signed transverse impact parameter significance of the leading track, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 17:
Distribution of the signed 3D impact parameter of the leading track, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 18:
Distribution of the signed 3D impact parameter significance of the leading track, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 20:
Distribution of the flight distance significance for three-prong τ decays, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 22:
Distribution of the fraction of electromagnetic energy in the signal cone, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.

Additional Figure 24:
Distribution of the pT-weighted ΔR of photons within the signal cone, normalized to unity, which is used as an input variable for the MVA-based τh-isolation discriminator, for hadronic τ decays (blue) and jets (red). The τh candidates are required to have pT > 20 GeV and |η| < 2.3, and to be reconstructed in one of the decay modes h±, h±π⁰, h±π⁰π⁰, or h±h∓h±.
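
A pT-weighted ΔR can be sketched as below. The caption does not give the formula, so the definition used here (Σ pT·ΔR / Σ pT, with ΔR measured from the τh axis) is an assumption, and the function names and input layout are hypothetical.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    return math.remainder(phi1 - phi2, 2.0 * math.pi)

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def pt_weighted_delta_r(tau_eta, tau_phi, photons):
    """pT-weighted mean Delta R of photons relative to the tau_h axis.

    photons: list of (pt, eta, phi) tuples; returns None if the pT sum is zero.
    """
    sum_pt = sum(pt for pt, _, _ in photons)
    if sum_pt <= 0:
        return None
    weighted = sum(pt * delta_r(eta, phi, tau_eta, tau_phi)
                   for pt, eta, phi in photons)
    return weighted / sum_pt
```

With this definition, collimated photons, as expected from π⁰ decays in a genuine τh, give small values, while the broader energy flow of a quark or gluon jet tends to give larger ones.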
