Model Card: SophonAK4

The SophonAK4 model is a realistic small-radius (R = 0.4) anti-k_T jet tagger developed for fast-simulation (Delphes) datasets under the JetClass-II configuration, designed to emulate the CMS detector conditions at the LHC.

Here, realistic indicates that the model achieves tagging performance comparable to state-of-the-art jet taggers used in the ATLAS and CMS experiments.

The model is constructed to cover a broad range of final states, including partons and leptons of various flavors and charges.

Model Details

SophonAK4 is trained using a multi-class classification approach based on di-X resonance processes, where the resonance X decays into multiple two-prong final states. Truth labelling is performed by associating reconstructed anti-k_T jets with partons or leptons originating from these two-prong decays.

A total of 23 jet labels are defined:

Single-prong labels: $b$, $\bar{b}$, $c$, $\bar{c}$, $s$, $\bar{s}$, $d$, $\bar{d}$, $u$, $\bar{u}$, $g$, $e^{-}$, $e^{+}$, $\mu^{-}$, $\mu^{+}$, $\tau_{\rm h}^{-}$, and $\tau_{\rm h}^{+}$. These correspond to cases where a single truth particle (either a parton or a lepton) is matched to the jet within ΔR(jet, particle) < 0.4, while the other particle from the same resonance decay is not matched to the jet.
Two-prong labels: $b\bar{b}$, $c\bar{c}$, $s\bar{s}$, $d\bar{d}$, $u\bar{u}$, and $gg$. These labels are assigned when both particles from the same resonance decay are matched within the same jet.
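
The matching rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the labelling code used to train SophonAK4: the function names, the `(name, eta, phi)` tuple layout, and the choice to return `None` when neither decay product is matched are all hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def matched_prongs(jet, decay_products, radius=0.4):
    """Return the names of the decay products matched to the jet.

    jet: (eta, phi) of the reconstructed jet.
    decay_products: the two (name, eta, phi) particles from one resonance decay.
    """
    jet_eta, jet_phi = jet
    return [name for name, eta, phi in decay_products
            if delta_r(jet_eta, jet_phi, eta, phi) < radius]

def prong_category(jet, decay_products, radius=0.4):
    """Classify the jet as single-prong, two-prong, or unmatched (None)."""
    n_matched = len(matched_prongs(jet, decay_products, radius))
    if n_matched == 2:
        return "two-prong"
    if n_matched == 1:
        return "single-prong"
    return None  # assumption: unmatched jets receive no label
```

For example, a jet at (η, φ) = (0, 0) with a $b$ quark at (0.1, 0.1) and a $\bar{b}$ quark far away would fall in the single-prong category, while the same jet with both quarks inside ΔR < 0.4 would fall in the two-prong category.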

Uses

Integrating SophonAK4/Sophon Models

The SophonAK4 model, together with the Sophon model, provides a realistic benchmark for small- and large-R jet tagging on fast-simulation (Delphes) datasets, achieving performance comparable to state-of-the-art taggers used in the ATLAS and CMS experiments.

For an example of integrating them into C++ workflows to analyze Delphes files, check [here]. (Note: SophonAK4 support is expected from April 2025.)
For an example of integrating these models into the Delphes processing workflow, refer to the following GitHub repository: https://github.com/jet-universe/delphes/tree/jet-models (Note: expected to be available from May 2025.)

Evaluation

The performance of SophonAK4 is evaluated on Standard Model tt̅ events to enable direct comparison with performance benchmarks from ATLAS and CMS. Details are summarized below.

For b- and c-tagging, genuine b, c, and light-flavor jets are selected via ghost-association following the CMS convention. Jets are required to satisfy p_T > 30 GeV and |η| < 2.5, consistent with CMS configurations.
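
As a minimal sketch, the kinematic selection above can be written as a boolean mask with NumPy. The function name and argument layout are hypothetical, and the ghost-association step that defines the jet flavor is not reproduced here.

```python
import numpy as np

def select_jets(pt, eta, pt_min=30.0, abs_eta_max=2.5):
    """Boolean mask for jets with pT > pt_min (GeV) and |eta| < abs_eta_max."""
    pt = np.asarray(pt, dtype=float)
    eta = np.asarray(eta, dtype=float)
    return (pt > pt_min) & (np.abs(eta) < abs_eta_max)
```

The mask can then be applied to any per-jet array, for instance `scores[select_jets(pt, eta)]`.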

The following b-tagging discriminant is constructed from SophonAK4's raw output scores to evaluate b vs. light and b vs. c jet performance:

$\text{discr (SophonAK4 $b$ tagging)} = g_{b} + g_{\bar{b}} + g_{b\bar{b}}.$

The following c-tagging discriminants are defined for c vs. light and c vs. b jets, respectively.

$\text{discr (SophonAK4 $c$ tagging)} = g_{c} + g_{\bar{c}} + g_{c\bar{c}},$
$\text{discr (SophonAK4 $c$ vs. $b$ tagging)} = \frac{g_{c} + g_{\bar{c}} + g_{c\bar{c}}}{g_{c} + g_{\bar{c}} + g_{c\bar{c}} + g_{b} + g_{\bar{b}} + g_{b\bar{b}}}.$
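
The three discriminants can be computed directly from the per-class scores. The sketch below assumes the scores are available as a mapping from label name to a NumPy array with one entry per jet. The key names (`"b"`, `"bbar"`, `"bb"`, `"c"`, `"cbar"`, `"cc"`) are hypothetical placeholders, and the actual names and ordering should be taken from the model's own configuration.

```python
import numpy as np

def b_discriminant(g):
    """g_b + g_bbar + g_bb, used for b vs. light and b vs. c."""
    return (np.asarray(g["b"], dtype=float)
            + np.asarray(g["bbar"], dtype=float)
            + np.asarray(g["bb"], dtype=float))

def c_discriminant(g):
    """g_c + g_cbar + g_cc, used for c vs. light."""
    return (np.asarray(g["c"], dtype=float)
            + np.asarray(g["cbar"], dtype=float)
            + np.asarray(g["cc"], dtype=float))

def c_vs_b_discriminant(g):
    """Ratio of the c sum to the (c + b) sum, used for c vs. b."""
    c = c_discriminant(g)
    total = c + b_discriminant(g)
    # Assumption: return 0 where both sums vanish, to avoid dividing by zero.
    return np.divide(c, total, out=np.zeros_like(total), where=total > 0)
```

Because the ratio depends only on the $b$- and $c$-related scores, it is insensitive to how much probability the model assigns to the other classes.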

The ROC performance for b vs. light/c jets and c vs. light/b jets is shown below and can be compared to CMS benchmarks (Figs. 1 and 3 for the tt̅ process).
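
For readers who want to reproduce such curves, a ROC point is the pair (signal efficiency, mistag rate) obtained by cutting on a discriminant at a given threshold. The following is a minimal threshold-scan sketch with hypothetical function and argument names. It is not the evaluation code behind the figures.

```python
import numpy as np

def roc_points(sig_scores, bkg_scores, thresholds):
    """Signal efficiency and background mistag rate for each threshold.

    sig_scores: discriminant values for genuine signal jets (e.g. b jets).
    bkg_scores: discriminant values for background jets (e.g. light or c jets).
    """
    sig = np.asarray(sig_scores, dtype=float)
    bkg = np.asarray(bkg_scores, dtype=float)
    eff = np.array([(sig > t).mean() for t in thresholds])
    mistag = np.array([(bkg > t).mean() for t in thresholds])
    return eff, mistag
```

Scanning `thresholds` over, say, `np.linspace(0, 1, 101)` and plotting `mistag` against `eff` on a logarithmic axis gives a curve in the style commonly used for flavor-tagging comparisons.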

Conclusion

The b- and c-tagging performance of SophonAK4, evaluated on the Delphes simulation, is compatible with that of the CMS taggers: it falls between DeepJet and UParT and is generally comparable to ParticleNet.


Performance across different p_T and |η| regions is benchmarked below and can be compared with CMS benchmarks (Figs. 17, 19, 21, 23, 25, 27, 29, and 31).
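
A kinematic breakdown of this kind amounts to computing the tagging efficiency at a fixed working point in bins of p_T (or |η|). The sketch below illustrates the idea. The function name, bin edges, and threshold are hypothetical, and empty bins are reported as NaN by assumption.

```python
import numpy as np

def efficiency_per_bin(values, scores, threshold, bin_edges):
    """Fraction of jets with score > threshold in each [edge_i, edge_{i+1}) bin.

    values: the binning variable per jet (e.g. pT in GeV, or |eta|).
    """
    values = np.asarray(values, dtype=float)
    passed = np.asarray(scores, dtype=float) > threshold
    idx = np.digitize(values, bin_edges) - 1  # bin index per jet
    n_bins = len(bin_edges) - 1
    eff = np.full(n_bins, np.nan)  # NaN marks bins with no jets
    for i in range(n_bins):
        in_bin = idx == i
        if in_bin.any():
            eff[i] = passed[in_bin].mean()
    return eff
```

Applying it separately to genuine b, c, and light-flavor jets gives the efficiency and mistag rates per bin.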

Conclusion

Tagging performance degrades in the low-p_T and high-|η| regions but reaches a plateau beyond the turn-on point, indicating that the SophonAK4 tagger exhibits realistic flavor-tagging behavior across kinematic regimes.

Citation

If you find the SophonAK4 model useful in your research, please cite our [paper] that introduces the model:

@article{Zhao:2025rci,
    author = "Zhao, Yuzhe and Li, Congqiao and Agapitos, Antonios and Fu, Dawei and Gao, Leyun and Mao, Yajun and Li, Qiang",
    title = "{Novel $|V_{cb}|$ extraction method via boosted $bc$-tagging with in-situ calibration}",
    eprint = "2503.00118",
    archivePrefix = "arXiv",
    primaryClass = "hep-ph",
    month = "2",
    year = "2025"
}
