Recent work in deep learning has underscored the importance of measuring and understanding trends in model performance as a function of basic variables, such as the size of the training dataset, the number of model parameters, and the amount of compute. These trends are often, though not always, governed by power-law scaling. I will survey some of the existing empirical evidence for these so-called “scaling laws” and then discuss regimes in which we have a theoretical understanding of these trends, based on joint work with collaborators. I will close by discussing connections to optical implementations of deep neural networks.