This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Ulysse Gazin, Université Paris Cité and Sorbonne Université, CNRS, Laboratoire de Probabilités, Statistique et Modélisation,
(2) Gilles Blanchard, Université Paris Saclay, Institut Mathématique d'Orsay,
(3) Etienne Roquain, Sorbonne Université and Université Paris Cité, CNRS, Laboratoire de Probabilités, Statistique et Modélisation.
5 Conclusion
The main takeaway from this work is the characterization of a "universal" joint distribution Pn,m for conformal p-values based on n calibration points and m test points. As a consequence, we derived a non-asymptotic concentration inequality for the empirical distribution function of the p-values; the accompanying numerical procedures can also be used for calibration in practice. This yields uniform error bounds on the false coverage/false discovery proportion that hold with high probability, whereas standard results are only marginal or in expectation and are not uniform in the decision. Since the results hold under the score exchangeability assumption alone, they apply to adaptive score procedures that use both the calibration and test sets for training.
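To make the object of study concrete, the following is a minimal sketch of split-conformal p-values computed from n calibration scores and m test scores, as in the setting above. The function name and the direction of the score comparison (larger score = more nonconforming) are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """Split-conformal p-value for each test point: the rank of its score
    among the n calibration scores, i.e.
    p_i = (1 + #{j : cal_scores[j] >= test_scores[i]}) / (n + 1)."""
    n = len(cal_scores)
    counts = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1 + counts) / (n + 1)

rng = np.random.default_rng(0)
cal = rng.normal(size=100)    # n = 100 calibration scores
test = rng.normal(size=20)    # m = 20 exchangeable test scores
pvals = conformal_pvalues(cal, test)
```

Under exchangeability of the n + m scores, each such p-value is marginally super-uniform, but the m p-values are dependent because they share the same calibration set; that joint dependence is exactly what the universal distribution Pn,m describes.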
Acknowledgements
We would like to thank Anna Ben-Hamou and Claire Boyer for constructive discussions and Ariane Marandon for her support with the code. The authors acknowledge the grants ANR-21-CE23-0035 (ASCAI) and ANR-19-CHIA-0021-01 (BISCOTTE) of the French National Research Agency ANR and the Emergence project MARS.
References
Balasubramanian, V., Ho, S.-S., and Vovk, V. (2014). Conformal prediction for reliable machine learning: theory, adaptations and applications. Morgan Kaufmann.
Bashari, M., Epstein, A., Romano, Y., and Sesia, M. (2023). Derandomized novelty detection with FDR control via conformal E-values. arXiv preprint arXiv:2302.07294.
Bates, S., Candès, E., Lei, L., Romano, Y., and Sesia, M. (2023). Testing for outliers with conformal p-values. Ann. Statist., 51(1):149–178.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B, 57(1):289–300.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist., 29(4):1165–1188.
Bian, M. and Barber, R. F. (2022). Training-conditional coverage for distribution-free predictive inference. arXiv preprint arXiv:2205.03647.
Blanchard, G., Neuvial, P., and Roquain, E. (2020). Post hoc confidence bounds on false positives using reference families. Ann. Statist., 48(3):1281–1303.
Boyer, C. and Zaffran, M. (2023). Tutorial on conformal prediction. https://claireboyer.github.io/tutorial-conformal-prediction/.
Courty, N., Flamary, R., Habrard, A., and Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. In Advances in neural information processing systems 30 (NIPS 2017), volume 30.
Jin, Y. and Candès, E. J. (2023). Model-free selective inference under covariate shift via weighted conformal p-values. arXiv preprint arXiv:2307.09291.
Lei, J., G'Sell, M., Rinaldo, A., Tibshirani, R. J., and Wasserman, L. (2018). Distribution-free predictive inference for regression. J. Amer. Statist. Assoc., 113(523):1094–1111.
Li, J., Maathuis, M. H., and Goeman, J. J. (2022). Simultaneous false discovery proportion bounds via knockoffs and closed testing. arXiv preprint arXiv:2212.12822.
Marandon, A. (2022). Machine learning meets FDR. https://github.com/arianemarandon/adadetect#machine-learning-meets-fdr.
Marandon, A., Lei, L., Mary, D., and Roquain, E. (2022). Machine learning meets false discovery rate. arXiv preprint arXiv:2208.06685.
Marques F., P. C. (2023). On the universal distribution of the coverage in split conformal prediction. arXiv preprint arXiv:2303.02770.
Massart, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Probab., 18(3):1269–1283.
Papadopoulos, H., Proedrou, K., Vovk, V., and Gammerman, A. (2002). Inductive confidence machines for regression. In 13th European Conference on Machine Learning (ECML 2002), pages 345–356. Springer.
Romano, J. P. and Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. J. Amer. Statist. Assoc., 100(469):94–108.
Romano, Y., Patterson, E., and Candès, E. (2019). Conformalized quantile regression. In Advances in neural information processing systems 32 (NeurIPS 2019).
Sarkar, S. and Kuchibhotla, A. K. (2023). Post-selection inference for conformal prediction: Trading off coverage for precision. arXiv preprint arXiv:2304.06158.
Saunders, C., Gammerman, A., and Vovk, V. (1999). Transduction with confidence and credibility. In 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), pages 722–726.
Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika, 73(3):751–754.
Vanschoren, J., van Rijn, J. N., Bischl, B., and Torgo, L. (2013). OpenML: networked science in machine learning. SIGKDD Explorations, 15(2):49–60.
Vovk, V. (2012). Conditional validity of inductive conformal predictors. In 4th Asian conference on machine learning (ACML 2012), pages 475–490. PMLR.
Vovk, V. (2013). Transductive conformal predictors. In Artificial Intelligence Applications and Innovations: 9th IFIP WG 12.5 International Conference (AIAI 2013), pages 348–360. Springer.
Vovk, V., Gammerman, A., and Shafer, G. (2005). Algorithmic learning in a random world. Springer.
Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., and He, Q. (2020). A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1):43–76.