I'm curious to see a Google Gemini Nano and Apple hardware partnership. Nice coming here later, having watched the predictions and the trend toward distillation and smaller models, instead of just bigger and faster, and then hearing Google come out with Nano. Looking forward to seeing its capabilities: multimodal from scratch and running at the edge on Apple silicon.
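For anyone wondering what "distillation and smaller" looks like in practice, here is a minimal sketch of the classic teacher-student distillation objective in PyTorch. This is purely my own illustration (the function name, temperature value, and toy tensors are placeholders), not how Gemini Nano or any specific on-device model was actually trained:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with a temperature, then push the student
    # toward the teacher via KL divergence (scaled by T^2, as in Hinton et al.).
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature**2

# Toy usage: a "teacher" and a smaller "student" both producing logits over 10 classes.
teacher_logits = torch.randn(4, 10)
student_logits = torch.randn(4, 10, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```

The idea is simply that the small model learns from the large model's full output distribution rather than from hard labels alone, which is part of why "smaller" can still perform well at the edge.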
Great update Casey! I have been thinking about the end of Moore's Law too. You mentioned: "Now that Moore's Law is 'over', chip architecture is likely the biggest driver of increases in compute speed (aka FLOPS). This may lead traditionally-overlooked chip types, like FPGA (Field Programmable Gate Array), to come to the fore."
I am curious whether you have a sense of how much LLM and other AI workloads are limited by the end of Moore's Law, given that they run on GPUs rather than CPUs. Perhaps it's a dumb question, but are GPUs earlier in their development curve than CPUs?
Thanks Matt! I'm not sure I can answer the GPUs vs CPUs part, as I'm early in my understanding of all things hardware, but I believe your first question pairs well with what Björn Ommer said here: "In the last 5 years we’ve seen a 15x increase in model size. That growth outstrips the growth in compute power by a factor of 9x. It’s unsustainable to keep focusing on scaling." My sense from speaking to folks is that LLMs are indeed constrained by the slower growth in compute power we're seeing now that we can't fit many more transistors on a chip, as well as by issues like interconnect bandwidth. Check out the "lessons for the future" section of this blog post: https://ai.meta.com/blog/meta-training-inference-accelerator-AI-MTIA/
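One way to read Ommer's numbers literally, as back-of-the-envelope arithmetic (my own interpretation; the "outstrips by a factor of 9x" phrasing is ambiguous, so treat this purely as a sketch):

```python
# Rough arithmetic on the Ommer quote (my interpretation, not from the post):
# if model size grew ~15x over 5 years and that growth outstrips compute
# growth by ~9x, then compute only grew ~15/9 ≈ 1.7x over the same window.
model_size_growth = 15.0   # 5-year growth factor in model size (from the quote)
outstrip_factor = 9.0      # how far model growth exceeded compute growth
compute_growth = model_size_growth / outstrip_factor

print(f"Implied 5-year compute growth: ~{compute_growth:.1f}x")
print(f"Implied annual compute growth: ~{compute_growth ** (1/5):.2f}x per year")
```

However you slice the exact numbers, the gap between model-size growth and compute growth is the point: scaling models faster than hardware improves is what makes the current approach look unsustainable.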
Great thanks Casey, will check it out!
Two big predictions, Casey!
1. "This is especially true if the AI bubble loses momentum next year (likely)"
2. "investors will be less likely to foot the bill for high variable cost AI businesses."
Re 1 above, I don't want to split hairs about 2025 vs. 2024, but I agree this will happen, and when it does it will be a good thing: it will help us get to the serious players with deep domain experience in the problem areas, who can ensure the solutions work and add value.
Re 2, are you saying VCs will do this because they didn't anticipate this upfront?
Thanks Arun!
RE 1: I agree, and over-hype tends to delegitimise a technology (see crypto).
RE 2: I think investors make trade-offs proportional to their excitement levels. If they're super excited about a technology, team, or product, they may make more trade-offs. As excitement/hype wanes, they may be less willing to make the cost trade-off and invest in businesses that are actually cost-inefficient. I'd say there are investors who haven't anticipated it, and others who know they're investing in high-cost startups but are willing to do so because of their excitement about the opportunity.