Model Card: SophonAK4
The SophonAK4 model is a realistic small-radius (R = 0.4) anti-kT jet tagger developed for fast-simulation (Delphes) datasets under the JetClass-II configuration, designed to emulate the CMS detector conditions at the LHC.
Here, "realistic" means that the model achieves tagging performance comparable to that of state-of-the-art jet taggers used in the ATLAS and CMS experiments.
The model is constructed to cover a broad range of final states, including partons and leptons of various flavors and charges.
Model Details
SophonAK4 is trained using a multi-class classification approach based on di-X resonance processes, where the resonance X decays into multiple two-prong final states. Truth labelling is performed by associating reconstructed anti-kT jets with partons or leptons originating from these two-prong decays.
A total of 23 jet labels are defined:
Single-prong labels: , , , , , , , , , , , , , , , , and . These correspond to cases where a single truth particle (either a parton or a lepton) is matched to the jet within ΔR(jet, particle) < 0.4, while the other particle from the same resonance decay is not matched to the jet.
Two-prong labels: , , , , , and . These labels are assigned when both particles from the same resonance decay are matched within the same jet.
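The matching rule above can be illustrated with a minimal Python sketch. It is not the authors' labelling code: the `(eta, phi)` tuple layout and the function names `delta_r` and `prong_category` are hypothetical. It only distinguishes the single-prong and two-prong cases, and leaves out the flavor and charge information that determines the actual label.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance ΔR = sqrt(Δη² + Δφ²), with Δφ wrapped into [-π, π)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def prong_category(jet, daughters, radius=0.4):
    """Classify a jet by how many of the two resonance daughters it contains.

    `jet` and each entry of `daughters` are (eta, phi) tuples; this layout
    is an assumption made for the example, not a Delphes data format.
    """
    n_matched = sum(
        delta_r(jet[0], jet[1], d[0], d[1]) < radius for d in daughters
    )
    return {0: "unmatched", 1: "single-prong", 2: "two-prong"}[n_matched]
```

Wrapping Δφ matters for jets near φ = ±π, where a naive difference would place two nearby objects almost 2π apart.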
Uses
Integrating SophonAK4/Sophon Models
The SophonAK4 model, together with the Sophon model, provides a realistic benchmark for small- and large-R jet tagging on fast-simulation (Delphes) datasets, achieving performance comparable to state-of-the-art taggers used in the ATLAS and CMS experiments.
For an example of integrating them into C++ workflows to analyze Delphes files, check [here]. (Note: SophonAK4 support is planned from April 2025.)
For an example of how to integrate these models into the Delphes processing workflow, refer to the following GitHub repository: https://github.com/jet-universe/delphes/tree/jet-models (Note: planned to be available from May 2025.)
Evaluation
The performance of SophonAK4 is evaluated on Standard Model events to enable direct comparison with performance benchmarks from ATLAS and CMS. Details are provided in [Appendix B of the paper] and are summarized below.
For b- and c-tagging, genuine b, c, and light-flavor jets are selected via jet-parton matching as implemented in Delphes. Jets are required to satisfy pT > 30 GeV and |η| < 2.5, consistent with CMS configurations.
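As a minimal sketch of the kinematic selection stated above, the snippet below applies the pT and pseudorapidity cuts to a list of jets. The `(pt, eta)` tuple layout, the function name `passes_selection`, and the example values are hypothetical; a real analysis would read these quantities from the Delphes jet collection.

```python
def passes_selection(pt, eta, pt_min=30.0, abs_eta_max=2.5):
    """Return True if a jet satisfies pT > pt_min (GeV) and |eta| < abs_eta_max."""
    return pt > pt_min and abs(eta) < abs_eta_max

# Toy jets as (pt [GeV], eta) pairs; the values are made up for illustration.
jets = [(45.0, 1.2), (25.0, 0.3), (80.0, -2.7), (31.0, -2.4)]
selected = [jet for jet in jets if passes_selection(*jet)]
```

Here the second jet fails the pT cut and the third fails the pseudorapidity cut, so only the first and last jets are kept.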
The following b-tagging discriminant is constructed from SophonAK4's raw output scores to evaluate b vs. light and b vs. c jet performance:
The following c-tagging discriminants are defined for c vs. light and c vs. b jets, respectively.
- The ROC performance for b vs. light/c jets and c vs. light/b jets is shown below and can be compared to CMS benchmarks (Figs. 1 and 3 for the tt̄ process).
Conclusion
- The b vs. light jet performance is slightly below that of the widely adopted DeepJet tagger in CMS.
- The b vs. c and c vs. light/b jet performances fall between DeepJet and ParticleNet taggers in CMS.
- Similar trends are found in the comparison with ATLAS's widely adopted DL1r tagger; see Appendix B of the paper.
- Performance across different pT and |η| regions is benchmarked below and can be compared with CMS benchmarks (Figs. 17, 19, 21, 23, 25, 27, 29, and 31).
Conclusion
- Tagging performance degrades in the low-pT and high-|η| regions but reaches a plateau beyond the turn-on point, indicating that the SophonAK4 tagger exhibits realistic flavor-tagging behavior across kinematic regimes.
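The ROC comparisons in this section reduce to one operation: choose a cut on a discriminant that keeps a given fraction of signal jets, then count how many background jets pass the same cut. The sketch below illustrates that operation on plain lists of scores. It is not the evaluation code used for the model card; the function name `mistag_rate_at_efficiency` and the toy scores are hypothetical.

```python
import math

def mistag_rate_at_efficiency(signal_scores, background_scores, target_eff):
    """Return (threshold, mistag_rate) for a cut `score >= threshold`
    whose signal efficiency is at least `target_eff`."""
    ranked = sorted(signal_scores, reverse=True)
    # Number of signal jets that must pass the cut to reach the target.
    n_pass = max(1, math.ceil(target_eff * len(ranked)))
    threshold = ranked[n_pass - 1]
    n_mistag = sum(score >= threshold for score in background_scores)
    return threshold, n_mistag / len(background_scores)

# Toy discriminant values; higher means more signal-like.
b_jet_scores = [0.9, 0.8, 0.6, 0.3]
light_jet_scores = [0.7, 0.4, 0.2, 0.1]
threshold, mistag = mistag_rate_at_efficiency(b_jet_scores, light_jet_scores, 0.75)
```

Scanning `target_eff` from 0 to 1 traces out the full ROC curve. Repeating the calculation in bins of pT or |η| gives the kind of kinematic dependence discussed above.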
Citation
If you find the SophonAK4 model useful in your research, please cite our [paper] that introduces the model:
@article{Zhao:2025rci,
author = "Zhao, Yuzhe and Li, Congqiao and Agapitos, Antonios and Fu, Dawei and Gao, Leyun and Mao, Yajun and Li, Qiang",
title = "{Novel $|V_{cb}|$ extraction method via boosted $bc$-tagging with in-situ calibration}",
eprint = "2503.00118",
archivePrefix = "arXiv",
primaryClass = "hep-ph",
month = "2",
year = "2025"
}