Presentation · 17 March 2023
Scaling laws for deep neural networks
Yasaman Bahri
Proceedings Volume PC12438, AI and Optical Data Sciences IV; PC124380J (2023) https://doi.org/10.1117/12.2659096
Event: SPIE OPTO, 2023, San Francisco, California, United States
Abstract
Recent work in deep learning has underscored the importance of measuring and understanding trends in model performance as a function of basic variables, such as the size of the training dataset, the number of model parameters, and the amount of compute. These trends are often, though not always, governed by power-law scaling. I will survey some of the existing empirical evidence for these so-called “scaling laws” and then discuss regimes where we have a theoretical understanding of these trends, based on joint work with collaborators. I will close by discussing connections to optical implementations of deep neural networks.
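To make the power-law form concrete, below is a minimal sketch of how such a trend is typically extracted from measurements: a power law L(D) = c · D^(-alpha) is a straight line in log-log space, so it can be fit by linear regression on the logs. The dataset sizes, loss values, and resulting exponent here are illustrative assumptions, not results from the talk.

import numpy as np

# Illustrative (dataset size, test loss) pairs -- hypothetical numbers, not measured results.
dataset_sizes = np.array([1e4, 1e5, 1e6, 1e7])
test_losses = np.array([3.2, 2.1, 1.4, 0.95])

# A power law L(D) = c * D**(-alpha) is linear in log-log space,
# so fit a degree-1 polynomial to the logarithms of the data.
slope, log_c = np.polyfit(np.log(dataset_sizes), np.log(test_losses), 1)
alpha = -slope  # scaling exponent (negative slope of the log-log line)
print(f"fitted exponent alpha ~ {alpha:.2f}, prefactor c ~ {np.exp(log_c):.2f}")

Analogous fits apply when the independent variable is the number of model parameters or the amount of compute rather than the dataset size.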
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yasaman Bahri "Scaling laws for deep neural networks", Proc. SPIE PC12438, AI and Optical Data Sciences IV, PC124380J (17 March 2023); https://doi.org/10.1117/12.2659096
KEYWORDS: Neural networks, Data modeling, Performance modeling