Scaling Laws

The extent to which scaling laws (Kaplan et al., 2020) govern the current artificial intelligence revolution is widely unknown and underappreciated, given how much predictability and how clear a picture of future trends they provide.

This site aims to demystify scaling laws and make them easier to understand through a few mini-calculators.

These tools mainly aim to answer and visualize three simple questions:

Compute

This section deals with the first question: how do you calculate compute (C)? Here is a simple formula for the total compute used:

Total compute (C) = peak FLOP/s per GPU x number of GPUs x training time (s)

(Interactive calculator with presets for Custom, GPT-4, and Llama-3-70b: peak FLOP/s x number of GPUs x training time = total compute.)
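The same calculation can be written in a few lines of code. The sketch below is only illustrative: the preset values (a ~312 TFLOP/s A100-class GPU, 10,000 GPUs, a 30-day run) are assumptions chosen for demonstration, not figures for any specific model.

```python
# Minimal sketch of the compute formula above.
# All preset numbers are illustrative assumptions, not real training runs.

def total_compute(peak_flops_per_gpu: float, num_gpus: int, training_seconds: float) -> float:
    """Total compute C = peak FLOP/s per GPU x number of GPUs x training time (s)."""
    return peak_flops_per_gpu * num_gpus * training_seconds

if __name__ == "__main__":
    peak = 312e12              # assumed peak throughput per GPU (A100-class, BF16), FLOP/s
    gpus = 10_000              # assumed cluster size
    seconds = 30 * 24 * 3600   # assumed 30-day training run
    print(f"Total compute: {total_compute(peak, gpus, seconds):.2e} FLOP")  # ~8.1e24 FLOP
```

Note that this gives an upper bound: real training runs achieve only a fraction of peak throughput, so the actual FLOPs used are lower than what this simple formula reports.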

Future

Here is a graph of major models trained by frontier artificial intelligence labs. The Biden Executive Order set its reporting requirement at 1e26 FLOP; as the graph shows, we are approaching that threshold quickly, with the largest model by training compute as of April 2024 being Google DeepMind's Gemini 1.0 Ultra.
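To make the threshold concrete, the short sketch below compares hypothetical training-compute estimates against the 1e26 FLOP reporting requirement. The model estimates are placeholders for illustration only, not published figures.

```python
# Hedged sketch: how close is a training run to the 1e26 FLOP reporting threshold?
# The estimates below are illustrative placeholders, not real published numbers.

REPORTING_THRESHOLD = 1e26  # FLOP threshold from the Biden Executive Order

def fraction_of_threshold(total_flop: float) -> float:
    """Return a run's total compute as a fraction of the reporting threshold."""
    return total_flop / REPORTING_THRESHOLD

estimates = {
    "hypothetical frontier model": 5e25,   # assumed
    "hypothetical mid-size model": 4e24,   # assumed
}

for name, flop in estimates.items():
    print(f"{name}: {fraction_of_threshold(flop):.0%} of the 1e26 threshold")
```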

This dataset and several other detailed articles covering the ongoing artificial intelligence revolution can be found at Epochai.org!

Sources

This project is available on GitHub.

All the numbers, metrics, and formulas used in this project, along with the fact-checking behind them, are sourced from the articles, tweets, and research papers mentioned below.

I hope this project is helpful, and I would love to hear feedback, suggestions for improvements, and any corrections needed in the calculations!


Tanay Desai