Compact Muon Solenoid
LHC, CERN

CMS-MLG-23-001; CERN-EP-2023-303
Portable acceleration of CMS computing workflows with coprocessors as a service
Comput. Softw. Big Sci. 8 (2024) 17
Abstract: Computing demands for large scientific experiments, such as the CMS experiment at the CERN LHC, will increase dramatically in the next decades. To complement the future performance increases of software running on central processing units (CPUs), explorations of coprocessor usage in data processing hold great potential and interest. Coprocessors are a class of computer processors that supplement CPUs, often improving the execution of certain functions due to architectural design choices. We explore the approach of Services for Optimized Network Inference on Coprocessors (SONIC) and study the deployment of this as-a-service approach in large-scale data processing. In the studies, we take a data processing workflow of the CMS experiment and run the main workflow on CPUs, while offloading several machine learning (ML) inference tasks onto either remote or local coprocessors, specifically graphics processing units (GPUs). With experiments performed at Google Cloud, the Purdue Tier-2 computing center, and combinations of the two, we demonstrate the acceleration of these ML algorithms individually on coprocessors and the corresponding throughput improvement for the entire workflow. This approach can be easily generalized to different types of coprocessors and deployed on local CPUs without decreasing the throughput performance. We emphasize that the SONIC approach enables high coprocessor utilization and the portability to run workflows on different types of coprocessors.
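Conceptually, the as-a-service pattern amounts to the client packaging the model inputs, sending them to an inference server, and reading back the outputs. As a minimal illustration only (not the actual CMSSW implementation, which realizes this pattern in C++ within the SONIC framework), a Python client for an NVIDIA Triton server might look like the following; the server URL, model name, and tensor names are placeholders.

import numpy as np
import tritonclient.grpc as grpcclient

# Connect to a Triton server, local or remote; 8001 is Triton's default gRPC port.
# The hostname here is a placeholder.
client = grpcclient.InferenceServerClient(url="triton.example.org:8001")

# Hypothetical batch of inputs; real producers build these from event data,
# e.g. one row per jet for a jet-tagging model.
batch = 16
features = np.random.rand(batch, 100).astype(np.float32)

# Tensor and model names below are illustrative, not the names used in CMSSW.
inputs = [grpcclient.InferInput("INPUT__0", list(features.shape), "FP32")]
inputs[0].set_data_from_numpy(features)
outputs = [grpcclient.InferRequestedOutput("OUTPUT__0")]

# The server runs the model on whatever backend and coprocessor it was configured with;
# the client only sees the request/response exchange.
result = client.infer(model_name="jet_tagger", inputs=inputs, outputs=outputs)
scores = result.as_numpy("OUTPUT__0")
print(scores.shape)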
Figures

Figure 1:
An example inference as a service setup with multiple coprocessor servers. Clients usually run on CPUs, shown on the left side; servers hosting different models run on coprocessors, shown on the right side.

Figure 2:
Illustration of the SONIC implementation of IaaS in CMSSW. The figure also shows the possibility of an additional load-balancing layer in the SONIC scheme. For example, if multiple coprocessor-enabled machines are used to host servers, a Kubernetes engine can be set up to distribute inference calls across the machines [84]. Image adapted from Ref. [63].

Figure 3:
Illustration of the jet information in the Run 2 simulated $ \mathrm{t} \overline{\mathrm{t}} $ data set used in subsequent studies. Distributions of the number of jets per event (left) and the number of particles per jet (right) are shown for AK4 jets (upper) and AK8 jets (lower). For the distributions of the number of particles, the rightmost bin is an overflow bin.


Figure 4:
Average processing time (left) and throughput (right) of the PN-AK4 algorithm served by a TRITON server running on one NVIDIA Tesla T4 GPU, presented as a function of the batch size. Values are shown for different inference backends: ONNX (orange), ONNX with TRT (green), and PYTORCH (red). Performance values for these backends when running on a CPU-based TRITON server are given in dashed lines, with the same color-to-backend correspondence.

Figure 5:
Average processing time (left) and throughput (right) of one of the AK8 ParticleNet algorithms served by a TRITON server running on one NVIDIA Tesla T4 GPU, presented as a function of the batch size. Values are shown for different inference backends: ONNX (orange), ONNX with TRT (green), and PYTORCH (red). Performance values for these backends when running on a CPU-based TRITON server are given in dashed lines, with the same color-to-backend correspondence.

Figure 6:
Average processing time (left) and throughput (right) of the DeepMET algorithm served by a TRITON server running on one NVIDIA Tesla T4 GPU, presented as a function of the batch size. The corresponding performance when running on a CPU-based TRITON server is shown as a dashed line.

Figure 7:
Average processing time (left) and throughput (right) of the DeepTau algorithm served by a TRITON server running on one NVIDIA Tesla T4 GPU, presented as a function of the batch size. Values are shown for different inference backends: TENSORFLOW (TF) (orange), and TENSORFLOW with TRT (blue). Performance values for these backends when running on a CPU-based TRITON server are given in dashed lines, with the same color-to-backend correspondence.
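The batch-size scans in Figs. 4-7 characterize each model's latency and throughput on the server; Triton's own profiling tools (e.g., the Model Analyzer of Ref. [102]) automate such measurements. Purely as an illustration of what is being measured, a client-side scan could be sketched as follows, with a hypothetical server URL, model name, and tensor names.

import time
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="triton.example.org:8001")

def measure(model_name, batch, n_features=100, n_repeats=200):
    """Average per-request latency (s) and throughput (inferences/s) at a given batch size."""
    data = np.random.rand(batch, n_features).astype(np.float32)
    inp = grpcclient.InferInput("INPUT__0", list(data.shape), "FP32")
    inp.set_data_from_numpy(data)
    out = grpcclient.InferRequestedOutput("OUTPUT__0")
    start = time.perf_counter()
    for _ in range(n_repeats):
        client.infer(model_name=model_name, inputs=[inp], outputs=[out])
    latency = (time.perf_counter() - start) / n_repeats
    return latency, batch / latency

# Scan the batch sizes shown on the x axes of Figs. 4-7.
for batch in (1, 2, 4, 8, 16, 32, 64, 128):
    latency, throughput = measure("jet_tagger", batch)
    print(f"batch {batch:4d}: {1e3 * latency:7.2f} ms/request, {throughput:9.1f} inferences/s")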

Figure 8:
The GPU saturation scan performed in GCP, where the per-event throughput is shown as a function of the number of parallel CPU clients for the PYTORCH version of PN-AK4 (black), DeepMET (blue), DeepTau optimized with TRT (red), and all PYTORCH versions of PN-AK8 on a single GPU (green). Each parallel client job ran in a four-threaded configuration on GCP VMs, and the TRITON servers were hosted on separate single-GPU VMs, also in GCP. The line for direct-inference jobs represents the baseline configuration, measured by running all algorithms without the SONIC approach or any GPUs. Each solid line represents running one of the specified models on a GPU via the SONIC approach.
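The saturation point of such a scan determines how many client jobs a single GPU can serve, and therefore how many GPUs are needed for a given number of jobs. A back-of-the-envelope sizing sketch, with all numbers purely hypothetical (real values come from scans like Fig. 8), is:

import math

# Hypothetical profiling results, for illustration only.
gpu_throughput = 5000.0      # inferences per second sustained by one saturated GPU
inferences_per_event = 10.0  # average inference requests per event for this model
job_event_rate = 0.25        # events per second processed by one client job
n_jobs = 2500                # number of client jobs to be supported

jobs_per_gpu = gpu_throughput / (inferences_per_event * job_event_rate)
n_gpus = math.ceil(n_jobs / jobs_per_gpu)
print(f"~{jobs_per_gpu:.0f} jobs per GPU -> {n_gpus} GPUs for {n_jobs} jobs")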

Figure 9:
Production tests across different sites. The CPU tasks always run at Purdue, while the servers with GPU inference tasks for all the models run at Purdue (blue) and at GCP in Iowa (red). The throughput values are higher than those shown in Fig. 8 because the CPUs at Purdue are more powerful than those comprising the GCP VMs.

Figure 10:
Scale-out test results on Google Cloud. The average throughput of the workflow with the SONIC approach is 4.0 events/s (solid blue), while the average throughput of the direct-inference workflow is 3.5 events/s (dashed red).
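For reference, the quoted averages correspond to a throughput ratio of $ 4.0/3.5 \approx $ 1.14, i.e., roughly a 14% increase in throughput, or equivalently about a 12% reduction in the average per-event processing time.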

Figure 11:
Throughput (upper) and throughput ratio (lower) between the SONIC approach and direct inference in the local CPU tests at the Purdue Tier-2 cluster. To ensure the CPUs are always saturated, the number of threads per job multiplied by the number of jobs is set to 20.
Tables

Table 1:
Average event size of different CMS data tiers with Run 2 data-taking conditions [81,69,78].

Table 2:
The average time for Mini-AOD processing (without the SONIC approach) with one thread on a single CPU core. The average processing times of the algorithms supported by the SONIC approach are listed in the column labeled "Time". The column labeled "Fraction" gives the fraction of the full workflow's processing time consumed by the algorithm in question. Together, the algorithms currently supported by the SONIC approach consume about 9% of the total processing time. The column labeled "Input" lists the expected server input created per event for each model type in Run 2 $ \mathrm{t} \overline{\mathrm{t}} $ events.
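Since the offloaded algorithms account for a fraction $ f \approx $ 0.09 of the per-event processing time, a simple Amdahl-type bound (not spelled out in the text, but implicit in the expectation quoted in the Summary) gives the maximum workflow speedup if their inference cost is removed entirely: $ S_\text{max} = 1/(1-f) \approx 1/(1-0.09) \approx $ 1.10, i.e., about a 10% throughput gain, which sets the scale of the improvements reported in Fig. 10.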

Table 3:
Memory usage with direct inference and the SONIC approach in the local CPU tests at the Purdue Tier-2 cluster. The last column is calculated as the sum of client and server memory usage divided by the direct inference memory usage. To ensure the CPUs are always saturated, the number of threads $ n_\text{T} $ per job multiplied by the number of jobs is set to 20.
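Writing $ M_\text{client} $ and $ M_\text{server} $ for the client and server memory usage with the SONIC approach, and $ M_\text{direct} $ for the memory usage with direct inference, the quantity in the last column is $ (M_\text{client} + M_\text{server}) / M_\text{direct} $.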
Summary
Within the next decade, the data-taking rate at the LHC will increase dramatically, straining the expected computing resources of the LHC experiments. At the same time, more of the algorithms that run on these resources will be converted into either machine learning (ML) or domain-specific algorithms that are easily accelerated with coprocessors, such as graphics processing units (GPUs). By pursuing heterogeneous architectures, it is possible to alleviate potential shortcomings of the available central processing unit (CPU) resources.

Inference as a service (IaaS) is a promising scheme for integrating coprocessors into CMS computing workflows. In IaaS, client code simply assembles the input data for an algorithm, sends that input to an inference server running either locally or remotely, and retrieves the output from the server. The implementation of IaaS discussed throughout this paper, the Services for Optimized Network Inference on Coprocessors (SONIC) approach, employs NVIDIA TRITON Inference Servers to host models on coprocessors, as demonstrated here in studies on GPUs, CPUs, and Graphcore Intelligence Processing Units (IPUs).

In this paper, the SONIC approach in the CMS software framework (CMSSW) is demonstrated in a sample Mini-AOD workflow, where algorithms for jet tagging, tau lepton identification, and missing transverse momentum regression are ported to run on inference servers. These algorithms account for nearly 10% of the total processing time per event in a simulated data set of top quark-antiquark events. After model profiling, which is used to optimize server performance and determine the number of GPUs needed for a given number of client jobs, the expected 10% decrease in per-event processing time was achieved in a large-scale test of Mini-AOD production with the SONIC approach that used about 10,000 CPU cores and 100 GPUs. The network bandwidth is large enough to support high input-output model inference for the workflow tested, and it will be monitored as the fraction of algorithms using remote GPUs increases. In addition to meeting performance expectations, we demonstrated that the throughput results are not highly sensitive to the physical client-to-server distance, at least up to distances of hundreds of kilometers. Running inference through TRITON servers on local CPU resources does not affect the throughput compared with the standard approach of running inference directly on CPUs in the job thread. We also performed a test using Graphcore IPUs to demonstrate the flexibility of the SONIC approach.

The SONIC approach to IaaS represents a flexible way to accelerate algorithms, which is increasingly valuable for LHC experiments. Using a realistic workflow, we highlighted many of its benefits, including the use of remote resources, workflow acceleration, and portability to different processor technologies. To make it a viable and robust paradigm for CMS computing in the future, additional studies are ongoing or planned to monitor and mitigate potential issues such as excessive network and memory usage or server failures.
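As a rough cross-check of the resource balance in the large-scale test: about 10,000 CPU cores served by roughly 100 GPUs corresponds to $ 10000/100 = $ 100 CPU cores per GPU, or on the order of 25 client jobs per GPU if the jobs run four-threaded as in the saturation scans of Fig. 8.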
References
1 L. Evans and P. Bryant LHC machine JINST 3 (2008) S08001
2 ATLAS Collaboration The ATLAS experiment at the CERN Large Hadron Collider JINST 3 (2008) S08003
3 CMS Collaboration The CMS experiment at the CERN LHC JINST 3 (2008) S08004
4 ATLAS Collaboration Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC PLB 716 (2012) 1 1207.7214
5 CMS Collaboration Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC PLB 716 (2012) 30 CMS-HIG-12-028
1207.7235
6 CMS Collaboration Observation of a new boson with mass near 125 GeV in pp collisions at $ \sqrt{s} = $ 7 and 8 TeV JHEP 06 (2013) 081 CMS-HIG-12-036
1303.4571
7 CMS Collaboration Search for supersymmetry in proton-proton collisions at 13 TeV in final states with jets and missing transverse momentum JHEP 10 (2019) 244 CMS-SUS-19-006
1908.04722
8 CMS Collaboration Combined searches for the production of supersymmetric top quark partners in proton-proton collisions at $ \sqrt{s} = $ 13 TeV EPJC 81 (2021) 970 CMS-SUS-20-002
2107.10892
9 CMS Collaboration Search for higgsinos decaying to two Higgs bosons and missing transverse momentum in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JHEP 05 (2022) 014 CMS-SUS-20-004
2201.04206
10 CMS Collaboration Search for supersymmetry in final states with two oppositely charged same-flavor leptons and missing transverse momentum in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JHEP 04 (2021) 123 CMS-SUS-20-001
2012.08600
11 ATLAS Collaboration Search for squarks and gluinos in final states with hadronically decaying $ \tau $-leptons, jets, and missing transverse momentum using pp collisions at $ \sqrt{s} = $ 13 TeV with the ATLAS detector PRD 99 (2019) 012009 1808.06358
12 ATLAS Collaboration Search for top squarks in events with a Higgs or Z boson using 139 fb$ ^{-1} $ of pp collision data at $ \sqrt{s}= $ 13 TeV with the ATLAS detector EPJC 80 (2020) 1080 2006.05880
13 ATLAS Collaboration Search for charginos and neutralinos in final states with two boosted hadronically decaying bosons and missing transverse momentum in pp collisions at $ \sqrt{s} = $ 13 TeV with the ATLAS detector PRD 104 (2021) 112010 2108.07586
14 ATLAS Collaboration Search for direct pair production of sleptons and charginos decaying to two leptons and neutralinos with mass splittings near the W-boson mass in $ \sqrt{s} = $ 13 TeV pp collisions with the ATLAS detector JHEP 06 (2023) 031 2209.13935
15 ATLAS Collaboration Search for new phenomena in events with an energetic jet and missing transverse momentum in pp collisions at $ \sqrt{s}= $ 13 TeV with the ATLAS detector PRD 103 (2021) 112006 2102.10874
16 CMS Collaboration Search for new particles in events with energetic jets and large missing transverse momentum in proton-proton collisions at $ \sqrt{s}= $ 13 TeV JHEP 11 (2021) 153 CMS-EXO-20-004
2107.13021
17 ATLAS Collaboration Search for new resonances in mass distributions of jet pairs using 139 fb$ ^{-1} $ of pp collisions at $ \sqrt{s}= $ 13 TeV with the ATLAS detector JHEP 03 (2020) 145 1910.08447
18 ATLAS Collaboration Search for high-mass dilepton resonances using 139 fb$ ^{-1} $ of pp collision data collected at $ \sqrt{s}= $ 13 TeV with the ATLAS detector PLB 796 (2019) 68 1903.06248
19 CMS Collaboration Search for narrow and broad dijet resonances in proton-proton collisions at $ \sqrt{s}= $ 13 TeV and constraints on dark matter mediators and other new particles JHEP 08 (2018) 130 CMS-EXO-16-056
1806.00843
20 CMS Collaboration Search for high mass dijet resonances with a new background prediction method in proton-proton collisions at $ \sqrt{s}= $ 13 TeV JHEP 05 (2020) 033 CMS-EXO-19-012
1911.03947
21 CMS Collaboration Search for resonant and nonresonant new phenomena in high-mass dilepton final states at $ \sqrt{s}= $ 13 TeV JHEP 07 (2021) 208 CMS-EXO-19-019
2103.02708
22 O. Aberle et al. High-Luminosity Large Hadron Collider (HL-LHC): Technical design report CERN Yellow Rep. Monogr. 10 (2020)
23 O. Brüning and L. Rossi, eds. The High Luminosity Large Hadron Collider: the new machine for illuminating the mysteries of universe World Scientific, ISBN 978-981-4675-46-8, 978-981-4678-14-8
link
24 CMS Collaboration The Phase-2 upgrade of the CMS level-1 trigger CMS Technical Design Report CERN-LHCC-2020-004, CMS-TDR-021, 2020
CDS
25 ATLAS Collaboration Technical design report for the Phase-II upgrade of the ATLAS TDAQ system ATLAS Technical Design Report CERN-LHCC-2017-020, ATLAS-TDR-029, 2017
link
26 A. Ryd and L. Skinnari Tracking triggers for the HL-LHC ARNPS 70 (2020) 171 2010.13557
27 ATLAS Collaboration Operation of the ATLAS trigger system in Run 2 JINST 15 (2020) P10004 2007.12539
28 CMS Collaboration The CMS trigger system JINST 12 (2017) P01020 CMS-TRG-12-001
1609.02366
29 CMS Collaboration The Phase-2 upgrade of the CMS data acquisition and high level trigger CMS Technical Design Report CERN-LHCC-2021-007, CMS-TDR-022, 2021
CDS
30 CMS Offline Software and Computing Group CMS Phase-2 computing model: Update document CMS Note CMS-NOTE-2022-008, 2022
31 ATLAS Collaboration ATLAS software and computing HL-LHC roadmap LHCC Public Document CERN-LHCC-2022-005, LHCC-G-182, 2022
32 R. H. Dennard et al. Design of ion-implanted MOSFET's with very small physical dimensions IEEE J. Solid-State Circuits 9 (1974) 256
33 Graphcore Intelligence processing unit link
34 Z. Jia, B. Tillman, M. Maggioni, and D. P. Scarpazza Dissecting the Graphcore IPU architecture via microbenchmarking 1912.03413
35 D. Guest, K. Cranmer, and D. Whiteson Deep learning and its application to LHC physics ARNPS 68 (2018) 161 1806.11484
36 K. Albertsson et al. Machine learning in high energy physics community white paper JPCS 1085 (2018) 022008 1807.02876
37 D. Bourilkov Machine and deep learning applications in particle physics Int. J. Mod. Phys. A 34 (2020) 1930019 1912.08245
38 A. J. Larkoski, I. Moult, and B. Nachman Jet substructure at the Large Hadron Collider: A review of recent advances in theory and machine learning Phys. Rept. 841 (2020) 1 1709.04464
39 M. Feickert and B. Nachman A living review of machine learning for particle physics 2102.02770
40 P. Harris et al. Physics community needs, tools, and resources for machine learning in Proc. 2021 US Community Study on the Future of Particle Physics, 2022
link
2203.16255
41 C. Savard et al. Optimizing high throughput inference on graph neural networks at shared computing facilities with the NVIDIA Triton Inference Server Accepted by Comput. Softw. Big Sci., 2023 2312.06838
42 S. Farrell et al. Novel deep learning methods for track reconstruction in 4th International Workshop Connecting The Dots (2018) 1810.06111
43 S. Amrouche et al. The Tracking Machine Learning challenge: Accuracy phase Springer Cham, 2019
link
1904.06778
44 X. Ju et al. Performance of a geometric deep learning pipeline for HL-LHC particle tracking EPJC 81 (2021) 876 2103.06995
45 G. DeZoort et al. Charged particle tracking via edge-classifying interaction networks CSBS 5 (2021) 26 2103.16701
46 S. R. Qasim, J. Kieseler, Y. Iiyama, and M. Pierini Learning representations of irregular particle-detector geometry with distance-weighted graph networks EPJC 79 (2019) 608 1902.07987
47 J. Kieseler Object condensation: one-stage grid-free multi-object reconstruction in physics detectors, graph and image data EPJC 80 (2020) 886 2002.03605
48 CMS Collaboration GNN-based end-to-end reconstruction in the CMS Phase 2 high-granularity calorimeter JPCS 2438 (2023) 012090 2203.01189
49 J. Pata et al. MLPF: Efficient machine-learned particle-flow reconstruction using graph neural networks EPJC 81 (2021) 381 2101.08578
50 CMS Collaboration Machine learning for particle flow reconstruction at CMS JPCS 2438 (2023) 012100 2203.00330
51 F. Mokhtar et al. Progress towards an improved particle flow algorithm at CMS with machine learning in Proc. 21st Intern. Workshop on Advanced Computing and Analysis Techniques in Physics Research: AI meets Reality, 2023 2303.17657
52 F. A. Di Bello et al. Reconstructing particles in jets using set transformer and hypergraph prediction networks EPJC 83 (2023) 596 2212.01328
53 J. Pata et al. Improved particle-flow event reconstruction with scalable neural networks for current and future particle detectors 2309.06782
54 E. A. Moreno et al. JEDI-net: a jet identification algorithm based on interaction networks EPJC 80 (2020) 58 1908.05318
55 H. Qu and L. Gouskos ParticleNet: Jet tagging via particle clouds PRD 101 (2020) 056019 1902.08570
56 E. A. Moreno et al. Interaction networks for the identification of boosted $ \mathrm{H}\to\mathrm{b}\overline{\mathrm{b}} $ decays PRD 102 (2020) 012010 1909.12285
57 E. Bols et al. Jet flavour classification using DeepJet JINST 15 (2020) P12012 2008.10519
58 CMS Collaboration Identification of heavy, energetic, hadronically decaying particles using machine-learning techniques JINST 15 (2020) P06005 CMS-JME-18-002
2004.08262
59 H. Qu, C. Li, and S. Qian Particle transformer for jet tagging in Proc. 39th Intern. Conf. on Machine Learning, K. Chaudhuri et al., eds., volume 162, 2022 2202.03772
60 CMS Collaboration Muon identification using multivariate techniques in the CMS experiment in proton-proton collisions at $ \sqrt{s} = $ 13 TeV Submitted to JINST, 2023 CMS-MUO-22-001
2310.03844
61 J. Duarte et al. FPGA-accelerated machine learning inference as a service for particle physics computing CSBS 3 (2019) 13 1904.08986
62 D. Rankin et al. FPGAs-as-a-service toolkit (FaaST) in Proc. 2020 IEEE/ACM Intern. Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC)
link
63 J. Krupa et al. GPU coprocessors as a service for deep learning inference in high energy physics MLST 2 (2021) 035005 2007.10359
64 M. Wang et al. GPU-accelerated machine learning inference as a service for computing in neutrino experiments Front. Big Data 3 (2021) 604083 2009.04509
65 T. Cai et al. Accelerating machine learning inference with GPUs in ProtoDUNE data processing Comput. Softw. Big Sci. 7 (2023) 11 2301.04633
66 A. Gunny et al. Hardware-accelerated inference for real-time gravitational-wave astronomy Nature Astron. 6 (2022) 529 2108.12430
67 ALICE Collaboration Real-time data processing in the ALICE high level trigger at the LHC CPC 242 (2019) 25 1812.08036
68 R. Aaij et al. Allen: A high level trigger on GPUs for LHCb CSBS 4 (2020) 7 1912.09161
69 LHCb Collaboration The LHCb upgrade I 2305.10515
70 A. Bocci et al. Heterogeneous reconstruction of tracks and primary vertices with the CMS pixel tracker Front. Big Data 3 (2020) 601728 2008.13461
71 CMS Collaboration CMS high level trigger performance comparison on CPUs and GPUs JPCS 2438 (2023) 012016
72 D. Vom Bruch Real-time data processing with GPUs in high energy physics JINST 15 (2020) C06010 2003.11491
73 CMS Collaboration Mini-AOD: A new analysis data format for CMS JPCS 664 (2015) 7 1702.04685
74 CMS Collaboration CMS physics: Technical design report volume 1: Detector performance and software CMS Technical Design Report CERN-LHCC-2006-001, CMS-TDR-8-1, 2006
CDS
75 CMS Collaboration CMSSW on Github link
76 CMS Collaboration Performance of the CMS Level-1 trigger in proton-proton collisions at $ \sqrt{s} = $ 13 TeV JINST 15 (2020) P10017 CMS-TRG-17-001
2006.10165
77 CMS Collaboration Electron and photon reconstruction and identification with the CMS experiment at the CERN LHC JINST 16 (2021) P05014 CMS-EGM-17-001
2012.06888
78 CMS Collaboration Description and performance of track and primary-vertex reconstruction with the CMS tracker JINST 9 (2014) P10009 CMS-TRK-11-001
1405.6569
79 CMS Collaboration Particle-flow reconstruction and global event description with the CMS detector JINST 12 (2017) P10003 CMS-PRF-14-001
1706.04965
80 oneAPI Threading Building Blocks link
81 A. Bocci et al. Bringing heterogeneity to the CMS software framework EPJWC 245 (2020) 05009 2004.04334
82 CMS Collaboration A further reduction in CMS event data for analysis: the NANOAOD format EPJWC 214 (2019) 06021
83 CMS Collaboration Extraction and validation of a new set of CMS PYTHIA8 tunes from underlying-event measurements EPJC 80 (2020) 4 CMS-GEN-17-001
1903.12179
84 CMS Collaboration Pileup mitigation at CMS in 13 TeV data JINST 15 (2020) P09018 CMS-JME-18-001
2003.00503
85 CMS Collaboration NANOAOD: A new compact event data format in CMS EPJWC 245 (2020) 06002
86 NVIDIA Triton Inference Server link
87 gRPC A high performance, open source universal RPC framework link
88 Kubernetes Kubernetes documentation link
89 K. Pedro et al. SonicCore link
90 K. Pedro et al. SonicTriton link
91 K. Pedro SonicCMS link
92 A. M. Caulfield et al. A cloud-scale acceleration architecture in Proc. 49th Annual IEEE/ACM Intern. Symp. on Microarchitecture (MICRO), 2016
link
93 V. Kuznetsov vkuznet/TFaaS: First public version link
94 V. Kuznetsov, L. Giommi, and D. Bonacorsi MLaaS4HEP: Machine learning as a service for HEP CSBS 5 (2021) 17 2007.14781
95 KServe Documentation website link
96 NVIDIA Triton Inference Server README (release 22.08) link
97 A. Paszke et al. PyTorch: An imperative style, high-performance deep learning library Advances in Neural Information Processing Systems 32 (2019) 8024
link
1912.01703
98 NVIDIA TensorRT link
99 ONNX Open Neural Network Exchange (ONNX) link
100 M. Abadi et al. TensorFlow: A system for large-scale machine learning 1605.08695
101 T. Chen and C. Guestrin XGBoost: A scalable tree boosting system in Proc. 22nd ACM SIGKDD Intern. Conf. on Knowledge Discovery and Data Mining, 2016
link
1603.02754
102 NVIDIA Triton Inference Server Model Analyzer link
103 P. Buncic et al. CernVM -- a virtual software appliance for LHC applications JPCS 219 (2010) 042003
104 S. D. Guida et al. The CMS condition database system JPCS 664 (2015) 042024
105 L. Bauerdick et al. Using Xrootd to federate regional storage JPCS 396 (2012) 042009
106 CMS Collaboration Performance of missing transverse momentum reconstruction in proton-proton collisions at $ \sqrt{s}= $ 13 TeV using the CMS detector JINST 14 (2019) P07004 CMS-JME-17-001
1903.06078
107 CMS Collaboration ParticleNet producer in CMSSW link
108 CMS Collaboration ParticleNet SONIC producer in CMSSW link
109 M. Cacciari, G. P. Salam, and G. Soyez The anti-$ k_{\mathrm{T}} $ jet clustering algorithm JHEP 04 (2008) 063 0802.1189
110 M. Cacciari, G. P. Salam, and G. Soyez FastJet user manual EPJC 72 (2012) 1896 1111.6097
111 CMS Collaboration Performance of the ParticleNet tagger on small and large-radius jets at high level trigger in Run 3 CMS Detector Performance Note CMS-DP-2023-021, 2023
CDS
112 CMS Collaboration Identification of highly Lorentz-boosted heavy particles using graph neural networks and new mass decorrelation techniques CMS Detector Performance Note CMS-DP-2020-002, 2020
CDS
113 CMS Collaboration Mass regression of highly-boosted jets using graph neural networks CMS Detector Performance Note CMS-DP-2021-017, 2021
CDS
114 Y. Feng A new deep-neural-network-based missing transverse momentum estimator, and its application to W recoil PhD thesis, University of Maryland, College Park, 2020
link
115 CMS Collaboration Identification of hadronic tau lepton decays using a deep neural network JINST 17 (2022) P07023 CMS-TAU-20-001
2201.08458
116 NVIDIA Corporation NVIDIA T4 70W low profile PCIe GPU accelerator
117 B. Holzman et al. HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation CSBS 1 (2017) 1 1710.00100
118 Intel Corporation Intel 64 and IA-32 architectures software developer's manual Intel Corporation, Santa Clara, 2023
119 SchedMD Slurm workload manager link
120 Advanced Micro Devices, Inc. AMD EPYC 7002 series processors power electronic health record solutions Advanced Micro Devices, Inc., Santa Clara, 2020