Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors, which are unlikely to contain the ground truth. Under what circumstances does optimizing a proper loss over a restricted family yield calibrated models? What precise calibration guarantees does it give? In this work, we provide a rigorous answer to these questions. We replace global optimality with a local optimality condition stipulating that the (proper) loss of the predictor cannot be reduced much by post-processing its predictions with a certain family of Lipschitz functions. We show that any predictor with this local optimality satisfies smooth calibration as defined in Kakade-Foster (2008) and Błasiok et al. (2023). Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal.
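As a rough sketch of the two central quantities (the notation here is illustrative rather than the paper's; the exact family of post-processing functions and the normalizations are specified in the work itself), the smooth calibration error of a predictor $f$ and its post-processing gap under a proper loss $\ell$ can be written as

\[
\mathrm{smCE}(f) \;=\; \sup_{\substack{\eta:\,[0,1]\to[-1,1] \\ 1\text{-Lipschitz}}} \mathbb{E}\big[\eta(f(x))\,(y - f(x))\big],
\qquad
\mathrm{pGap}(f) \;=\; \mathbb{E}\big[\ell(f(x), y)\big] \;-\; \inf_{\kappa \in K} \mathbb{E}\big[\ell(\kappa(f(x)), y)\big],
\]

where $K$ is a family of Lipschitz post-processing functions. In this notation, local optimality means that $\mathrm{pGap}(f)$ is small, and the two directions described above say that a small post-processing gap implies small $\mathrm{smCE}$, and conversely that nearly calibrated predictors have a small post-processing gap.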