Publications
Journal Articles (refereed)
- Bao, H., & Takatsu, A.
Proper Losses Regret at Least 1/2-order.
Journal of Machine Learning Research, 2025. (minor revision, to appear)
[arXiv] (alphabetical ordering)
- Bao, H.
Feature Normalization Prevents Collapse of Non-contrastive Learning Dynamics.
Neural Computation 37(11), 2025. (to appear)
[link][arXiv]
- Takezawa, Y., Sato, R., Bao, H., Niwa, K., & Yamada, M.
Necessary and Sufficient Watermark for Large Language Models.
Transactions on Machine Learning Research, 2025. (expert certificate)
[OpenReview][arXiv][github]
- Lin, X., Bao, H., Cui, Y., Takeuchi, K., & Kashima, H.
Scalable Individual Treatment Effect Estimator for Large Graphs.
Machine Learning, 114(23), 2025.
(Presented at the 16th Asian Conference on Machine Learning (ACML2024), Vietnam, Dec. 5-8, 2024)
[link]
- Takezawa, Y., Bao, H., Niwa, K., Sato, R., & Yamada, M.
Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data.
Transactions on Machine Learning Research, 2023.
[OpenReview][arXiv][github]
- Bao, H., & Sakaue, S.
Sparse Regularized Optimal Transport with Deformed q-Entropy.
Entropy, 24(11):1634, 2022.
[link]
- Yamada, M., Takezawa, Y., Sato, R., Bao, H., Kozareva, Z., & Ravi, S.
Approximating 1-Wasserstein Distance with Trees.
Transactions on Machine Learning Research, 2022.
[OpenReview][arXiv]
- Shimada, T., Bao, H., Sato, I., & Sugiyama, M.
Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization.
Neural Computation 33(5):1234-1268, 2021.
[link][arXiv]
- Bao, H., Sakai, T., Sato, I., & Sugiyama, M.
Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags.
Neural Networks 105:132-141, 2018.
[link][arXiv]
Conference Proceedings (refereed)
- Sakaue, S., Tsuchiya, T., Bao, H., & Oki, T.
Online Inverse Linear Optimization: Improved Regret Bound, Robustness to Suboptimality, and Toward Tight Regret Analysis.
Advances in Neural Information Processing Systems 38 (NeurIPS2025), (to appear), San Diego, CA, USA, Dec. 2-7, 2025.
[link][arXiv]
- Bao, H., Sakaue, S., & Takezawa, Y.
Any-stepsize Gradient Descent for Separable Data under Fenchel–Young Losses.
Advances in Neural Information Processing Systems 38 (NeurIPS2025), (to appear), San Diego, CA, USA, Dec. 2-7, 2025.
(spotlight) [link][arXiv]
- Cao, Y., Bao, H., Feng, L., & An, B.
Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses.
Advances in Neural Information Processing Systems 38 (NeurIPS2025), (to appear), San Diego, CA, USA, Dec. 2-7, 2025.
(spotlight) [link][arXiv]
- Sakaue, S., Bao, H., & Tsuchiya, T.
Revisiting Online Learning Approach to Inverse Linear Optimization: A Fenchel–Young Loss Perspective and Gap-Dependent Regret Analysis.
In Proceedings of 28th International Conference on Artificial Intelligence and Statistics (AISTATS2025), PMLR 258:46-54, Phuket, Thailand, May 3-5, 2025.
[link][arXiv]
- Bao, H., & Sakaue, S.
Inverse Optimization with Prediction Market: A Characterization of Scoring Rules for Eliciting System States.
In Proceedings of 28th International Conference on Artificial Intelligence and Statistics (AISTATS2025), PMLR 258:451-459, Phuket, Thailand, May 3-5, 2025.
[link]
- Bao, H., & Charoenphakdee, N.
Calm Composite Losses: Being Improper Yet Proper Composite.
In Proceedings of 28th International Conference on Artificial Intelligence and Statistics (AISTATS2025), PMLR 258:2800-2808, Phuket, Thailand, May 3-5, 2025.
[link]
- Shing, M., Misaki, K., Bao, H., Yokoi, S., & Akiba, T.
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models.
In Proceedings of 13th International Conference on Learning Representations (ICLR2025), Singapore, Apr. 24-28, 2025.
(spotlight) [OpenReview][arXiv][github][blog (ja)][blog (en)]
- Ishikawa, S.*, Yamada, M.*, Bao, H., & Takezawa, Y.
PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis.
In Proceedings of 13th International Conference on Learning Representations (ICLR2025), Singapore, Apr. 24-28, 2025.
[OpenReview][arXiv][github] (* equal contribution)
- Yokoi, S., Bao, H., Kurita, H., & Shimodaira, H.
Zipfian Whitening.
Advances in Neural Information Processing Systems 37 (NeurIPS2024), 122259-122291, Vancouver, BC, Canada, Dec. 9-15, 2024.
[link][arXiv]
- Takezawa, Y., Bao, H., Sato, R., Niwa, K., & Yamada, M.
Parameter-free Clipped Gradient Descent Meets Polyak.
Advances in Neural Information Processing Systems 37 (NeurIPS2024), 44575-44599, Vancouver, BC, Canada, Dec. 9-15, 2024.
[link][arXiv]
- Sakaue, S., Bao, H., Tsuchiya, T., & Oki, T.
Online Structured Prediction with Fenchel–Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss.
In Proceedings of 37th Annual Conference on Learning Theory (COLT2024), PMLR 247:4458-4486, Edmonton, Canada, Jun. 30-Jul. 3, 2024.
[link][arXiv]
- Bao, H., Hataya, R., & Karakida, R.
Self-attention Networks Localize When QK-eigenspectrum Concentrates.
In Proceedings of 41st International Conference on Machine Learning (ICML2024), PMLR 235:2903-2922, Vienna, Austria, Jul. 22-27, 2024.
[link][arXiv]
- Houry, G., Bao, H., Zhao, H., & Yamada, M.
Fast 1-Wasserstein Distance Approximations Using Greedy Strategies.
In Proceedings of 27th International Conference on Artificial Intelligence and Statistics (AISTATS2024), PMLR 238:325-333, Valencia, Spain, May 2-4, 2024.
[link]
- Takezawa, Y.*, Sato, R.*, Bao, H., Niwa, K., & Yamada, M.
Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence.
Advances in Neural Information Processing Systems 36 (NeurIPS2023), 76692-76717, New Orleans, LA, USA, Dec. 10-16, 2023.
[link][arXiv][github] (* equal contribution)
- Hataya, R., Bao, H., & Arai, H.
Will Large-scale Generative Models Corrupt Future Datasets?
In Proceedings of IEEE International Conference on Computer Vision (ICCV2023), 20555-20565, Paris, France, Oct. 2-6, 2023.
[link][arXiv][dataset]
- Lin, X., Zhang, G., Lu, X., Bao, H., Takeuchi, K., & Kashima, H.
Estimating Treatment Effects Under Heterogeneous Interference.
In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD2023), LNCS 14169:576-592, Turin, Italy, Sep. 18-22, 2023.
[link][arXiv]
- Bao, H.
Proper Losses, Moduli of Convexity, and Surrogate Regret Bounds.
In Proceedings of 36th Annual Conference on Learning Theory (COLT2023), PMLR 195:525-547, Bangalore, India, Jul. 12-15, 2023.
[link]
- Arase, Y., Bao, H., & Yokoi, S.
Unbalanced Optimal Transport for Unbalanced Word Alignment.
In Proceedings of 61st Annual Meeting of the Association for Computational Linguistics (ACL2023), 3966-3986, Toronto, Canada, Jul. 9-14, 2023.
[link][arXiv][github]
- Nakamura, S., Bao, H., & Sugiyama, M.
Robust Computation of Optimal Transport by β-potential Regularization.
In Proceedings of 14th Asian Conference on Machine Learning (ACML2022), PMLR 189:770-785, Hyderabad, India, Dec. 12-14, 2022.
[link][arXiv]
- Bao, H., Nagano, Y., & Nozawa, K.
On the Surrogate Gap between Contrastive and Supervised Losses.
In Proceedings of 39th International Conference on Machine Learning (ICML2022), PMLR 162:1585-1606, Baltimore, MD, USA, Jul. 17-23, 2022.
[link][arXiv][poster][github] (equal contribution & alphabetical ordering)
- Bao, H.*, Shimada, T.*, Xu, L., Sato, I., & Sugiyama, M.
Pairwise Supervision Can Provably Elicit a Decision Boundary.
In Proceedings of 25th International Conference on Artificial Intelligence and Statistics (AISTATS2022), PMLR 151:2618-2640, online, Mar. 28-30, 2022.
[link][arXiv][poster] (* equal contribution)
- Dan, S., Bao, H., & Sugiyama, M.
Learning from Noisy Similar and Dissimilar Data.
In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD2021), LNCS 12976:233-249, online, Sep. 13-17, 2021.
[link][arXiv]
- Bao, H., & Sugiyama, M.
Fenchel-Young Losses with Skewed Entropies for Class-posterior Probability Estimation.
In Proceedings of 24th International Conference on Artificial Intelligence and Statistics (AISTATS2021), PMLR 130:1648-1656, online, Apr. 13-15, 2021.
[link][poster][github]
- Nordström, M., Bao, H., Löfman, F., Hult, H., Maki, A., & Sugiyama, M.
Calibrated Surrogate Maximization of Dice.
In Proceedings of 23rd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI2020), LNCS 12264:269-278, online, Oct. 4-8, 2020.
[link]
- Bao, H., Scott, C., & Sugiyama, M.
Calibrated Surrogate Losses for Adversarially Robust Classification.
In Proceedings of 33rd Annual Conference on Learning Theory (COLT2020), PMLR 125:408-451, online, Jul. 9-12, 2020.
[link][arXiv (corrigendum)][slides] (the arXiv version contains a corrigendum modifying the definition of calibrated losses)
- Bao, H., & Sugiyama, M.
Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification.
In Proceedings of 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), PMLR 108:2337-2347, online, Aug. 26-28, 2020.
[link][arXiv][slides]
- Wu, Y.-H., Charoenphakdee, N., Bao, H., Tangkaratt, V., & Sugiyama, M.
Imitation Learning from Imperfect Demonstration.
In Proceedings of 36th International Conference on Machine Learning (ICML2019), PMLR 97:6818-6827, Long Beach, CA, USA, Jun. 9-15, 2019.
[link][arXiv][poster][github]
- Kuroki, S., Charoenphakdee, N., Bao, H., Honda, J., Sato, I., & Sugiyama, M.
Unsupervised Domain Adaptation Based on Source-guided Discrepancy.
In Proceedings of 33rd AAAI Conference on Artificial Intelligence (AAAI2019), 33(01):4122-4129, Honolulu, HI, USA, Jan. 27-Feb. 1, 2019.
[link][arXiv]
- Bao, H., Niu, G., & Sugiyama, M.
Classification from Pairwise Similarity and Unlabeled Data.
In Proceedings of 35th International Conference on Machine Learning (ICML2018), PMLR 80:461-470, Stockholm, Sweden, Jul. 10-15, 2018.
[link][arXiv][slides][poster][github]
Preprints
- Sakaue, S., Bao, H., & Cao, Y.
Non-Stationary Online Structured Prediction with Surrogate Losses.
[arXiv]
- Liu, W., Bao, H., Yamada, M., Huang, Z., Zheng, N., & Qian, H.
Many-to-Many Matching via Sparsity Controlled Optimal Transport.
[arXiv]
- Zhang, G., Bao, H., & Kashima, H.
Online Policy Learning from Offline Preferences.
[arXiv]
- Sato, R., Takezawa, Y., Bao, H., Niwa, K., & Yamada, M.
Embarrassingly Simple Text Watermarks.
[arXiv]
Books
- Mochihashi, D., & Suzuki, T. (Eds.), Ishiguro, K., Ito, S., Kajino, H., Kuroki, Y., Komiyama, J., Sato, R., Suzuki, T., Bao, H., Teshima, T., Hataya, R., Futami, F., Minami, K., Mochihashi, D., & Yokoi, S. (Trans.)
Probabilistic Machine Learning: An Introduction, Asakura Pub., Tokyo, Japan, 2025.
(確率的機械学習:入門編,朝倉書店,2025)
- Omata, H. R. (Ed.), Schuab, J.-F., Sato, S., Minaka, N., Matsumoto, T., & Bao, H.
Where Facts Intersect, Nakanishiya Pub., Kyoto, Japan, 2025.
(「事実」の交差点—科学的対話が生まれる文脈を探して,ナカニシヤ出版,2025)
[link]
- Sugiyama, M., Bao, H., Ishida, T., Lu, N., Sakai, T., & Niu, G.
Machine Learning from Weak Supervision: An Empirical Risk Minimization Approach, MIT Press, Cambridge, MA, USA, 2022.
[link]
[link (vol 1)][link (vol 2)] (Japanese translation)
PhD Thesis
- Bao, H.
Excess Risk Transfer and Learning Problem Reduction towards Reliable Machine Learning, UTokyo Repository, 2022.
Date granted: 2022/03/24. Japanese title: "信頼性の高い機械学習を目指した剰余リスク転移と学習問題の帰着"
[link]