🎙️ Semi Doped: AI is Eating Memory

🎙️ Semi Doped: Micron's Record Profits, Apple's CXMT Plea... AI is Eating All the Memory!

Micron's 85% gross margin, Apple's CXMT request, Korean government's $500B investment, HBM's wafer consumption, and more

Semi Doped

Jul 03, 2026

Austin Lyons and Vik Sekar break down the escalating memory crisis, revealing how AI’s insatiable demand is impacting everything from consumer device prices to geopolitical semiconductor strategies. They explore how memory companies are achieving unprecedented profits while consumers face “shrinkflation” and companies like GoPro struggle to survive. The hosts also delve into the technical reasons behind the memory crunch and debate the future of AI demand amidst political headwinds and cost optimization.

Things we cover:

Micron’s record-breaking revenue and gross margins
Apple’s unprecedented mid-cycle price hikes and CXMT request
South Korea’s half-trillion-dollar investment in memory capacity
The impact of memory prices on consumer devices and companies like GoPro
Technical reasons for HBM’s high wafer consumption
The future of AI model training and inference costs

This podcast is lightly edited for clarity.

The Memory Crisis Hits the Streets

Vik: So check this out. Micron booked more revenue in a single quarter than in any full year of its 50-year history. They booked basically 41.5 billion dollars, and their gross margin hit 85%. That’s more than Nvidia’s gross margin. And for a brief moment there, Micron, a memory company, was worth more than Meta. And to add to all this, the South Korean government is investing over half a trillion dollars in SK Hynix and Samsung just to increase memory capacity so that they can own the memory business entirely. And in the midst of all this, Apple is now increasing prices, and they’re going to China’s CXMT to get memory capacity. So while the memory boom is creating winners on one side, it’s also causing a lot of trouble. This whole memory crisis is coming to a head now.

Austin: Welcome to another Semi Doped podcast. I’m Austin Lyons from Chipstrat and with me is Vik Sekar from Vik’s Newsletter. Hey, Vik, what’s up, man?

Vik: Not that much. Just been dealing with the memory crisis. I don’t know. I’m not really dealing with it. What do I have to do with it? I’m not buying any devices. My Mac M3 Pro is doing amazing, so I don’t have to buy a laptop or anything. But I am being more careful with it. I’m not going to drop it anytime, because if I have to replace that thing, it’s terrible. The prices have gone up. Apple has raised prices and all that. This was basically the entire discussion in the bar yesterday. I met a bunch of friends, we were having a few beers, and this was the whole discussion: memory prices. What has the world come to? Everybody’s talking LLMs. And mind you, I’m not in the Bay Area, okay? Of course, Bangalore is a tech city, so it’s not unheard of, I suppose, but it was very fascinating. Everybody’s like, “Oh man, I can’t buy a laptop. Apple has increased prices.” Another friend of mine was like, “I can’t even buy musical instruments anymore, because you can’t buy these synths and stuff, because all of them have memory in them.” So even musical instruments have gone up. This is a serious memory crisis. It’s hit the streets.

Austin: Totally. Totally. Man, yeah, if it’s a conversation at the bar, then it’s truly impacting the consumer.

Vik: Exactly. When people start talking about the fact that memory prices from AI are affecting their day-to-day purchases, we are in trouble, because I don’t think the general population understood how much memory AI has been sucking up. We’ve been talking about it on the podcast on and off, right? But when it hits the streets like this, you get a visible reaction, with people asking me, “Hey, you do the podcast, what’s happening with memory?” I’m like, “Yeah, yeah, let me tell you about HBM, let me tell you about DRAM.” And they’re all like, “Oh, wow, is it that much, huh?” And I’m like, “Wow, back to you.” For us, it’s second nature: “Oh, so much DRAM, memory is short.” Yeah, yeah, we know this. Memory companies are making money. Sure, we get it, it’s in the news all the time for what we do. But for everybody else who is not on this train entirely like we are, it’s actually news. It’s like, “AI is killing my consumer devices.”

Austin: Yes, yeah, yes. So, the simplest definition of inflation is just sustained increased prices for a period of time, a year or two. And traditionally, when the consumer thinks about inflation, core inflation, it’s usually the price of food, but it’s also the price of oil, essentially, gas and whatnot. And so normal consumers tend to have to pay attention to various random industries, based on the things they buy, that cause inflation for them. For example, in the United States, if you like to eat beef, you might know that there’s a shortage in the US beef herd, or cattle herd. And it’s a very random thing, but you’re like, “Yeah, because when I go to buy steaks now, it costs way more than it used to.” And what’s interesting is that because of AI, normal consumers are going to have to start paying attention, or at least right now they are, to the memory market. Normal people are going to have to learn, “What is memory? Why is it so expensive?” Because to your point, it’s just like, “Yo, I’m just a person that produces music with a synth and suddenly it’s expensive.” You just start pulling on that thread to find out why. And as much as I love AI, which I totally do, it’s interesting to say it as plainly as, “AI is causing inflation.” AI is going to make it so that your grandparents in the United States who are on a fixed income and using Social Security, for example, find that their Social Security dollars don’t go as far, because the price of certain things has gone up because of AI. So it’s crazy to think how it’s truly impacting everyday people around the world too. A lot of the AI is produced in the United States, but it’s impacting people around the whole world, which is also crazy to think.

Vik: I think the whole sentiment in the market now, because of these increased memory prices, is that when it hits the consumer and the consumer starts to feel the pinch, people are going to stop buying devices. And as much as we talk about AI and the deployment in data centers and how many GPUs people are buying and how much these data centers are spending, the consumer market is actually very significant. It has always been a very big driver of the semiconductor business. So it’s not to be discounted or trifled with, right? So when the prices become this high and everybody sees their electronics prices jumping up, the first tendency is what I said in the beginning, right? Look, I better not drop my laptop, because I’m not going to buy anything at these prices. And that’s what everybody is going to think as well, at least along those lines, most people, I would think. So what that means is that if people stop buying consumer devices, demand drops and then supply will equalize better. And once people stop buying consumer devices, the market’s going to pull back or something. People anticipate this because they’ve seen it happen in the past. And so the market softened last week around this news. That’s my interpretation anyway.

Consumer vs. AI Demand

Austin: Hmm. Yeah, so definitely consumers are going to delay purchases. Or potentially, depending on what kind of consumer you are, you might say, “Well, I have to buy a new laptop.” Or it’s at work and it’s time to refresh. They’re probably going to say, “Let’s just delay a quarter or whatever.” Or they might say, “Well, we’ll get you that new MacBook still, because it’s been three years. We promised you that, but we’re not going to go all out on memory, of course.” But yes, it’s definitely going to impact consumers. And then the interesting part is it’s not only going to impact the low-end consumers, but even high-end consumers. Apple always has the best customers, who are willing to pay the most for the premium experience of having an iPhone. But it’s pretty crazy when even the premium customers who are willing to spend three, four, five thousand dollars on a laptop are saying, “Well, I guess I better wait, or I’ll delay it. Let’s see.” The question I have is, if we have a drop in demand from all the consumer devices, of which there are many and which consume a lot of memory, is that actually going to help even out the memory price increases? Is that going to improve supply? Or is that extra supply, if you will, that’s no longer going to consumers just going to be sucked up by AI and the hyperscalers and all the accelerators who can’t get enough, right?

Vik: Yeah, the answer to that comes from looking at the supply commitments that these memory companies already have waiting. It looks like people are booked out through 2027 at least. That’s what Micron said in their quarter, at least: that this is not going to let up in 2027, because they have these long-term agreements, and they also have some kind of prepayment clause, which is basically that you pay for it whether you take it or not. So they’re going to continue to make money. The AI orders are in, they’re all booked up, they’ve been prepaid for. So even if consumer demand softens, I don’t see why it will let up in the near term. In the long term, I think supply capacity is being built up, and when that comes online, maybe then we will see something letting up here. But up until that point, I don’t see anything changing. But let’s see how right I am a year from now. We should revisit this episode.

Austin: Right. But so you’re saying it’s not going to let up in the near term, because all the AI accelerator companies are already prepaying for it and they’re going to suck up any extra supply that’s possibly out there, whether the supply is from a new fab or from consumers no longer buying. That’s going to be really hard on consumer device companies, because it means that they’re going to continue to have this pain. And I know you and I were talking earlier about some companies that are having pain and trying to work around it. So GoPro, for example, tell us about GoPro.

Vik: GoPro was such an awesome company. It was such a lifestyle statement to have that GoPro stuff. When it came out, people were attaching it to their guitar headstocks, and then you could see them playing a live show with the guitar moving around. It was so amazing. Obviously, there were all these adventure sports videos that came out. I even considered for some time mounting one on my surfboard when I lived in San Diego, so that I could see how it looks as I surf. You see these surf videos of people going down the line and all that. So it was such an amazing concept, but today what has happened is they don’t have money for memory. You know, how much will you pay for a GoPro anyway? It’s already something that you don’t really need. It’s a fun thing to have to do all these crazy things. But nobody really needs a GoPro, right? And now the company has become a penny stock, basically. There’s a good chance that they’ll actually go under. It’s just a sad thing, because it’s a device that was a lifestyle statement at some point. And now the memory prices are driving companies like this out of business. It’s sad, actually.

Austin: Yeah, it is sad. But yeah, you make a good point. Especially any discretionary spending, where people don’t definitely need it, those companies are going to definitely be hurting. What about Apple? How are they trying to deal with not passing on such crazy price hikes to customers?

Vik: So, Apple is actually the biggest giant of our times, right? Before Nvidia, they were the first trillion-dollar company, if I remember right. This is a company that has been the gorilla in the room, so to speak, for decades, because they had pricing power. They could always come in and say, “Hey, we’re going to sell a billion units, we need this price or we need this performance.” Now it’s, “We need this capacity.” And there were companies before whose sole existence relied on Apple orders. I worked for some of them, actually. Their whole existence is whether they make it into the next Apple design cycle or not. And Apple was, in a sense, brutal with their business practices. It’s well known in the industry that they’re really tough to deal with, and if suppliers didn’t need the money, they’d rather avoid Apple. Apple is a tough customer to work with. And this is the first time that Apple has increased product prices mid-cycle. This is not a next refresh, a next generation that comes in or whatever. They raised prices in the middle of nowhere, saying, “The same product is now more expensive.” They’ve never really done that, as far as I know.

Austin: Yeah, yeah, it’s pretty wild. Now, tell me, I feel like there’s a paradox here, which is that Apple is the big company, they make the premium products. They could probably afford to pass on the pricing to their customers the most, because they’ve got the most premium customers, right? So let’s say you were making an Android phone and I was making an iPhone, and memory’s going to get more expensive. It should be easier for me to pass it on, “Hey, this phone’s going to be an extra 100 bucks,” than for you to pass it on as an Android phone maker, because you’re like, “100 bucks, I can’t charge my customers 100 bucks.” But then, paradoxically, isn’t it Apple who’s probably getting the best memory prices because they have such scale? So for me as Apple, wouldn’t my cost for memory actually be less than yours, Android maker, because I have such scale? Or what do you think there? I mean, I know Apple captures a lot of the margin, but I do know that there’s Samsung and other Android phone makers that still sell tons of units.

Vik: Yeah, so I think the distinction that we have to make here is premium versus non-premium devices. The whole argument that I’ve been reading in recent times is that premium-tier devices are actually okay, because they have a high enough selling price that they can absorb the price increase in memory, although their gross margin will reduce. But in general, yes, it’s okay. Now, with the Apple price increases, I don’t know if Apple wants to hold on to their gross margins and therefore they increase the price of the device in lockstep with memory prices, so that they can retain gross margins and look good in the next earnings cycle. It’s a gamble, right? Because people may not buy as many if you increase it. So now you lose on revenue anyway. So I’m not sure which way they’re going. But there was news, and we’ve mentioned this on the podcast too because we’ve spoken about memory so much: Apple was supposedly buying up DRAM like crazy. They were buying it up at premium prices. If they could lock their competitors out of DRAM, they were going to do it. And if there is a company that could do it, it’s Apple with their big cash reserves, right?

Austin: Right, right, right. Yeah, so interesting to think about. Yes, there’s price elasticity here. The more they increase the price, the fewer people want to buy it. So of course, if they could absorb it all, then it wouldn’t affect the demand, but then it’s going to affect their margins. And so, as a public company, you always have to play that game of how much of a margin hit do we want to take versus how many fewer units do we want to sell. But then on top of it, it’s kind of like, well, are investors going to compare them to the previous quarter and say, “Hey, you sold fewer units than expected,” or to the previous year? Year-over-year comparisons are probably best for something seasonal like smartphone sales. Or are they just going to compare them to everyone else and say, “Yeah, your margins went down, but they didn’t go down as far as everyone else’s, so actually, we will still reward you for that”?
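The margin-versus-units tradeoff described here can be sketched with a toy constant-elasticity demand model. This is only an illustration: every number below (base price, unit cost, cost increase, elasticity of -1.5) is a made-up assumption, and `pass_through` and `Outcome` are hypothetical names, not anything from Apple or the hosts.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    price: float
    units: float
    gross_margin: float
    gross_profit: float

def pass_through(share, base_price=1000.0, base_cost=600.0,
                 cost_increase=100.0, base_units=100.0, elasticity=-1.5):
    """Pass `share` (0..1) of a memory cost increase on to buyers.

    Demand follows a constant-elasticity curve: units scale with
    (price / base_price) ** elasticity. All defaults are hypothetical.
    """
    price = base_price + share * cost_increase
    units = base_units * (price / base_price) ** elasticity
    unit_cost = base_cost + cost_increase  # the maker pays the higher cost either way
    margin = (price - unit_cost) / price
    return Outcome(price, units, margin, (price - unit_cost) * units)

for share in (0.0, 0.5, 1.0):
    o = pass_through(share)
    print(f"pass on {share:.0%}: price ${o.price:.0f}, units {o.units:.1f}, "
          f"margin {o.gross_margin:.1%}, gross profit ${o.gross_profit:,.0f}")
```

Absorbing the increase keeps unit sales flat but compresses margin; passing it on protects margin at the cost of units. Which side wins depends entirely on the assumed elasticity.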

Vik: Yeah, the expectation these days has become very AI-centric, right? Everybody wants a massive increase in earnings and massive boosts, because the idea is, “Why would I invest in Apple when I can invest in Micron?” Because Micron is making money. “How about I invest in Micron? You guys are losing gross margin, so why should I invest in Apple?” Investors are not really tied to Apple as a company or anything. They’re just there for the money, right? We’re here to make money, so the logical thing is to go to the company that makes money. So, I don’t know, ultimately it seems like even the premium tier was not spared. And recently, over the weekend, there was also the discussion that Apple is now asking CXMT, which is a Chinese DRAM manufacturer, for DRAM supply. And CXMT is actually on the entity list, which is a list of no-no companies in China for US companies to do business with, because those companies are also suppliers to the Chinese military. As a US company, you don’t want to give money to a company that then supplies the Chinese military, because that’s going to bite the US back in the future, right? They’ll have a stronger military or whatever. Anyway, so they put CXMT on the entity list, and now Apple is asking the US government, “Can we please get CXMT to give us DRAM, because we need the supply?” This is another first, actually, because I don’t think there’s any company that’s gone to the US government and said, “Hey, you blacklisted this company, but now we actually need to buy from them.” Imagine Apple actually doing this. That’s a big deal.

Austin: Yes, things not on my 2026 bingo card: Apple asking if CXMT could come off the entity list, or any Chinese company could come off the entity list. It just goes to show the lengths that they’re willing to go to to prevent the inflation that we talked about, and obviously to try to get more supply. Of course, the interesting thing is that it’s a temporary problem, in that, as you said, we know more supply is coming online. Micron, Samsung, SK Hynix, these guys are building fabs. They’re going to come on in 2028, 2029 and so on. If you ask for CXMT to come off the entity list, that’s bringing supply on indefinitely, not just to solve this temporary problem. Which of course, when you start to game-theory this out, makes you wonder if the big three had it on their game theory bingo card that CXMT supply could get brought online. But I’m sure it does feel like a slippery slope, in that, yes, you would like more supply right now, but do you want CXMT DRAM a year from now or two years from now? I mean, maybe at that point they just stop buying from them. They just say, “Hey, thank you for your service and for helping us in 2026. Even though you’re not on the entity list, we don’t want to buy from you anymore.” But yeah, it’s just so interesting to think through all the implications.

Vik: Yeah, so now the long and short of it is that there are the premium-tier companies, like Apple and the Samsungs. Samsung is a bit different, right? Because they have DRAM supply by themselves, and they make phones. Think about that. Apple doesn’t make DRAM. Samsung does. And they also make phones, which is a unique advantage if you come to think of it. But apart from these premium-tier companies, what about all the mid and low tier? They cannot absorb the costs. So the one thing that will happen in this scenario is that those phones will stay at the same price, but they’ll get de-specced. So a phone that was selling with 4 GB of RAM or whatever, some low-tier, mid-tier phone, will only come with 2 GB of RAM now. They get de-specced, and that is one way to handle the lower tier. But nobody likes that. Now you’re like, “Okay, fine, I’ve spent the same amount of money, but I get a crappier phone,” right?

Austin: Right, totally, totally. Yeah, that’s a—what do they call that when in consumer foods where you pay the same, but your bag of chips is half as big?

Vik: Yeah, yeah, yeah. Isn’t it a form of inflation?

Austin: Yeah, yeah, yeah. There’s a funny term for it, but this is the same thing. Shrinkflation. That’s what it’s called. Shrinkflation.

Vik: Shrinkflation. Yeah, yeah.

Memory Market Dynamics

Austin: Shrinkflation hits low-end consumer phones. So you mentioned something interesting. It was kind of an aside, but you said, “Oh, Samsung, they make phones, they make memory.” They also have a foundry and can make logic chips, right?

Vik: Yeah.

Austin: So in theory, Samsung could presumably make quite a lot of a phone in-house, vertically integrated to some extent. And it would be interesting to think that through. If you’re doing the Taiwan-China war gaming of what happens if TSMC goes down, you could ask which smartphone makers are best poised to continue to supply new smartphones, and maybe it’s Samsung.

Vik: I think Samsung plays this game too, because for profitability they also have to make an internal decision as to where to direct their DRAM supply. Would you sell it for a nice hefty profit to AI companies, or would you sell it into the consumer market and lose money? They will even refuse their own handset division if it comes to it, because it’s all about getting the most out of the DRAM supply you have. So it’s very funny, actually. And talking about Samsung and SK Hynix, maybe this is a good time to mention this. The Korean government itself is investing north of half a trillion dollars in DRAM capacity expansion in just these two companies, because they realize that these two companies now hold the key to the AI semiconductor super cycle. Yes, there’s TSMC and Taiwan, who have the key to manufacturing. Yes, there’s Nvidia in the US, who have the dominance in training chips, and all the hyperscalers are US-based too. All this is good. But what do all of them need in common? What is the one breaking point? It’s memory. And who has all the memory? Two-thirds of it is Korea, right? Micron is US, but two-thirds is Korean companies, SK Hynix and Samsung. So the Korean government is all in. They basically pushed all their chips in and said, “Here you go, we’re going to own this thing,” and they’re spending half a trillion dollars on it. And then, interestingly, in other Korean memory company news, SK Hynix actually surpassed Samsung for the first time to become the most valuable company in Korea, ending a reign that Samsung has held for 25 years. So that’s the power of memory. The other thing I wanted to tell you was about Kioxia, because Kioxia is not really a DRAM company, it’s more of a NAND flash maker, and they spun out of Toshiba. And they recently passed Toyota as the most valuable company in Japan. So you can see what AI is doing. Toyota is a big deal, it’s in every country, but now it’s been passed by a NAND flash company that Toshiba actually spun off. What a bad idea. I mean, talk about a bad bet. So funny.

Austin: Yeah, pretty wild, pretty wild. Things are changing, that’s for sure.

Micron’s Profitability and Market Cycles

Vik: Yeah. And finally, we have to talk about Micron. We mentioned it in the cold open, obviously. But think about all the people who are suffering in the AI memory crunch, the GoPros going out of business. I also read about a small company that was making some routers or something. They were selling their routers for, I don’t know, $500, let’s say. And now they have to sell them for $2,000 to even stay profitable. And the makers are like, “No, nobody’s going to buy this at $2,000. This is not a product like that.” Electronics have a certain intrinsic value. Nobody buys a laptop for $20,000. You buy a laptop for, I don’t know, $2,000, right? There’s a logical limit for a certain class of devices that you can’t just break. So this is one of them. So for all the suffering that memory is causing, Micron, among other memory makers, is making a killing, right? They are at 84% or something, 85% gross margin, which means that you’re basically selling it for more than six times what it cost you to make. So if you make this widget for $10, you can sell it for about $67, I think. So anyway, that’s a lot of money to be made here.
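The margin-to-markup arithmetic can be checked in a few lines. Gross margin is (price - cost) / price, so price = cost / (1 - margin). The $10 unit cost is the hypothetical widget from the conversation, and `price_multiple` is just an illustrative helper name.

```python
def price_multiple(gross_margin: float) -> float:
    """Selling price as a multiple of unit cost for a given gross margin.

    gross_margin = (price - cost) / price, so price / cost = 1 / (1 - margin).
    """
    if not 0 <= gross_margin < 1:
        raise ValueError("gross margin must be in [0, 1)")
    return 1.0 / (1.0 - gross_margin)

unit_cost = 10.0  # the hypothetical $10 widget
for margin in (0.50, 0.80, 0.85):
    multiple = price_multiple(margin)
    print(f"{margin:.0%} gross margin -> {multiple:.2f}x cost, "
          f"${unit_cost * multiple:.2f} per widget")
```

Under this definition an 80% margin is a 5x price, and 85% is roughly 6.7x. The multiple climbs steeply as the margin approaches 100%, which is why the last few points of margin matter so much.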

Austin: Yes, totally. Yes. Micron’s gross margins, which I wrote about on Chipstrat, you can check it out: in the last four quarters they went from 45% to 56% to 75% to now almost 85%. And they’re actually projecting another 85%, 86% next quarter.

Vik: That’s insane. Earlier you mentioned, maybe this isn’t price gouging, but I’m going to say it is. It is absolutely price gouging. And I think that a lot of people now have this inherent hate for memory companies, because everybody’s hurting because of these three. And so basically, I think when the down cycle comes, the memory makers aren’t going to like it. But everybody else is maybe secretly saying, “It had to happen. I’m glad it did.” Puts these people back where they belong.

Austin: Yes, yes. I mean, to Micron’s credit, when I think price gouging, I think of someone trying to take advantage of people. And I think, to their credit and other memory makers’, it’s just that literally their customers are like, “No, I’ll pay more than that, dude, because I need it.” And the next dude’s like, “No, I’ll pay more than them, because I need it.” I mean, there’s probably, in effect, a bidding war going on. So it is obviously very opportunistic, but if you have a customer saying, “Hey, you’ll sell that to Vik for $3, I’ll give you $5,” how do you say no to that?

Vik: But maybe that’s one way of looking at it: maybe they’re offering more to get the supply locked in, and Micron is just like, “Sure, why not? You pay more, you get it.” And the next person comes by and they’re like, “Sure, you can.” Or maybe they’re just going, “No, that’s not the price. This is our price. You want to take it or you want to leave it?” You can leave it, because if you don’t take it, somebody else will. So there’s this whole FOMO thing. We’ll put up this other chart I have from the Wall Street Journal that shows basically Micron’s adjusted operating earnings every fiscal quarter from 2023 to 2024. If people are not aware of the history of this company, they had negative earnings, okay? They were losing money. It was pretty miserable. But then you can see what happened on this chart as you go from basically the beginning of 2024. They start making a turnaround, and now they’re making, I don’t know, 41.5 billion in a single quarter. It’s insane. If you look at the period between 2024 and 2025, ’26, the last two years, they haven’t made 41 billion dollars with all those years put together. Now they do it in a single quarter. It’s interesting, I think.

Austin: It’s wild. Yes, it is so wild. I mean, we see this across AI, these companies where they’re spending more or making more in a quarter than they did in an entire year. So for the CAPEX guys, for the hyperscalers, it’s like, all of a sudden now they’re spending in a quarter in CAPEX what they used to spend in an entire year. And of course, on the other end, the people who the hyperscalers are buying from, they’re making more in a quarter than they used to make in an entire year. And it’s just pretty wild to have all these companies suddenly making more per quarter than they did in an entire year. And for some of them in several years added up. And for Micron, I mean, 2022 wasn’t that long ago. 2023 wasn’t that long ago. So pretty crazy how their fortunes have turned so quickly.

Vik: Yeah. I like the way you put it in your Substack. Basically, you didn’t show the most recent earnings data at first. It’s like, “You see, memory is a cyclical industry. You can see it goes up and down, it’s like a sine wave.” It’s almost like a sine wave if you draw a line joining all the peaks of these bar charts you have in your Substack article. By the way, whoever’s listening should go read the whole article. But then later you’re like, “Ah, but I sneakily did not put the most recent one in. Here it is with that recent one,” and then there’s this huge bar that goes up. You tell me, is this a cycle? Does it look like a cycle to you? Because that’s a peak. That’s a peak.

Austin: Right. Well, what’s interesting, and I linked to this nice chart from Doug O’Laughlin, is where he tries to show where different industries or submarkets are in terms of a cycle, and I think of it as a merry-go-round. There are these four quadrants: is their inventory increasing or decreasing, and are their sales increasing or decreasing? If you’re in a quadrant where your inventory is decreasing and your sales are increasing, that’s obviously very good. Prices are going to go up. But eventually it tips over and you start to build inventory back up. So even if prices are high and you’re still selling a lot, you’re building up inventory. And then eventually there’s going to be a point when you have so much inventory that you actually tip over and you don’t sell as much, and then you start to get in this place where it’s like, “Oh, now we’ve got more inventory than we need and prices are going down.” And it’s kind of like a merry-go-round. You just go all the way around. A lot of people are trying to make the argument that, “Oh, we have all these new demand drivers,” which is totally true, but they’re kind of arguing that the merry-go-round doesn’t exist. I think the merry-go-round still exists, but now there are other drivers that are going to keep AI memory in a particular quadrant for a lot longer. We can talk about this, but actually, let’s definitely get into this really quick. When you buy HBM, you’re essentially buying DRAM wafers for stacked HBM. You can unpack this. And then there are also people needing to buy plain DRAM. So, for example, I need lots of CPUs for agentic AI, and those need DRAM. So there are these different drivers that want to take the same wafers. And no matter how you slice it, it feels like the demand is going to continue to be high, and we know that supply is not going to increase until 2027, 2028, 2029. I’ve got it in the Substack as well. And so I’m a believer that the merry-go-round is still there, but there are just so many drivers that things aren’t going to tip over. The supply is not going to catch up. The inventory is not going to build back up for a lot longer.
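The four-quadrant merry-go-round can be written down as a tiny classifier over the signs of inventory and sales changes. This is only a sketch of the idea as described in the conversation: the quadrant labels ("tight", "building", "glut", "clearing") and the sample numbers are illustrative assumptions, not taken from the chart being discussed.

```python
def cycle_quadrant(inventory_change: float, sales_change: float) -> str:
    """Classify a period by the direction of its inventory and sales changes."""
    inventory_up = inventory_change > 0
    sales_up = sales_change > 0
    if not inventory_up and sales_up:
        return "tight: inventory falling, sales rising (prices tend to rise)"
    if inventory_up and sales_up:
        return "building: sales still rising, inventory accumulating"
    if inventory_up and not sales_up:
        return "glut: inventory rising, sales falling (prices tend to fall)"
    return "clearing: inventory and sales both falling"

# Hypothetical quarter-over-quarter changes (percent), tracing one lap.
lap = [(-10, 20), (5, 10), (15, -10), (-5, -5)]
for inventory, sales in lap:
    print(f"inventory {inventory:+d}%, sales {sales:+d}% -> "
          f"{cycle_quadrant(inventory, sales)}")
```

In these terms, the argument in the conversation is that AI demand keeps memory parked in the first quadrant for longer than a normal cycle would, not that the other three quadrants have gone away.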

Technical Reasons for Memory Shortage

Austin: Yeah, so basically to the point of unpacking what you said, it’s that we are really short of memory because of AI. Okay, that’s one broad way of putting it, but essentially because we need high bandwidth memory, which requires stacking these DRAM chips one on top of each other. To make those DRAM chips that you have to actually stack, actually takes three times as many wafers to make the equivalent number of bits. You know, so I don’t know, to simplify that statement, let’s say you have to make a 1GB DRAM chip. Now, if I were just making a one single chip that I didn’t have to stack up into HBM, I would maybe make it in one wafer, and you can get one wafer’s worth of DRAM chips or whatever, 1GB chips. And let’s say you, I don’t know, maybe you get a thousand chips out of a wafer, each of which is 1GB, okay? Now, you want to make the same 1GB chip, but that 1GB DRAM is going to go into HBM. You’re going to need three times as many wafers than to make that same bit capacity if it were going to be used for HBM. Okay, there’s a technical reason for this. We won’t get into too much of it, but it’s basically that you have to drill holes through this thing and have through silicon vias, which is how you stack them. You actually stack them by connecting them through each DRAM die. Whenever you drill through it, your bit density drops. You can’t have as many bits now because you can’t put memory cells around these drilled holes for safety reasons or it won’t work. So, because of that, basically for every 1GB of HBM and for every 1GB of DRAM, the HBM takes three times as many wafers than the regular DRAM does. So, it’s a big suck up, it’s sucking up all the DRAM supply big time. That’s one big reason. The other reason is that we right now even need DRAM chips just as is. Forget about HBM. We need DRAM because apparently agentic AI needs CPUs and now CPUs need a lot of DRAM. Great. So now you have to get a whole lot of DRAM chips. 
Then it turns out that we want to run long agentic workloads, which means the AI has to remember things for a very long time and keep the context of what it is doing in its brain, whether that’s a large code base or a very long conversation. You want to keep asking it questions. You don’t want it to have amnesia, ever, and you want instant answers, right? You don’t want to wait. So all of that context it holds in its brain is also stored in DRAM. It could be stored in HBM, but that’s very expensive, so mostly you store it in DRAM, which is the next fastest memory. And now people are looking at pooling DRAM. You take all these DRAM sticks, put them together in an appliance and say, “Okay, look, this is my KV cache server.” It could be CXL pooled DRAM or it could be CXL pooled NAND flash. So everything is going into AI. It doesn’t matter whether you’re looking at agentic CPUs, HBM or just KV cache storage. It’s all memory. So, long story short, this is why we are in this pickle.
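To get a feel for why long contexts eat so much memory, here is a rough sizing sketch in Python. It uses the standard estimate for a transformer KV cache (one key and one value vector per layer, per KV head, per token). The model dimensions below are hypothetical and were not mentioned in the episode; they only show the order of magnitude.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    bytes_per_value: int = 2,  # assumes 16-bit (FP16/BF16) cache entries
) -> int:
    """Approximate KV cache size for one sequence."""
    # The leading 2 accounts for storing both keys and values.
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
per_token = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, context_tokens=1)
total = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, context_tokens=128_000)
gib = total / 2**30

print(f"{per_token} bytes per token, {gib:.2f} GiB for a 128,000-token context")
```

With these assumed dimensions, a single 128,000-token session needs tens of gigabytes of cache, and a server holding many such sessions at once multiplies that, which is the case for parking the cache in pooled DRAM rather than HBM.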

Vik: Totally. And that’s just the data center demand for DRAM, or the enterprise demand if you will. There’s obviously still consumer demand for DRAM. The interesting thing is that the data center buyers are basically inelastic in their demand. If the price goes up, they still want as much. And how much do they want? Every wafer you can make, right? The demand curve goes almost straight up. It’s like, “The price went up? I don’t care, I want the same.” On the other side, consumer demand is very elastic. So there’s this interesting bifurcation into data center demand and consumer demand. Data center demand bifurcates into HBM and DRAM, and DRAM bifurcates into what CPUs need and what storage appliances serving your KV cache need, right? So you have all these demand drivers while supply is still coming online. Meanwhile, of course, the consumers are hung out to dry. It’s impacting how far grandma’s dollar can go, and she can’t get you the new GoPro for Christmas because they’re going out of business, right? So consumers are getting beat up. But it’s very difficult to see data center demand ever waning, and I don’t think people should expect it to. I think consumer demand will wane, right? Because it’s phones and, like we talked about, “My laptop’s expensive, I’m going to put it off.” But the data centers are just going to keep buying. That’s why we think of this as an AI super cycle in memory, if you will. Like you said, there’s a little sine wave and then all of a sudden, boom, it’s accelerating and going a lot higher. So it’s going to be there for a while.

The Future of AI Demand

Austin: Yeah, yeah. I think we should wrap this up. I have one question we can think on, because it’s a bit of a macro question and I don’t know if anybody really knows the answer, but I’d like to know what you think. What do you see happening from now to 2030? Okay, that’s a broad question, so let me pin it down a little more. Leading frontier labs are not really being allowed to release leading edge models: Fable/Mythos was cut off by the US government, and then OpenAI’s ChatGPT 5.6, which they code name Soul, was also not released for general public use. There is a lot of fear of distillation of these models by Chinese competition. So that is one front. Maybe the frontier won’t advance as much, and not for scaling or technical reasons but for political or competitive ones. Do you think the next generation of leading edge models is going to be trained with the same gusto, knowing that they will not be released? That is my first question. What do you think will happen between now and then, because training seems to be at an inflection point of sorts? Then I’ll ask you something else, because I have more.

Vik: Okay, so if you assume that the government will always prevent the leading edge from going out, then it does raise an interesting question: how much money do we want to spend on advancing the leading edge if we can never release it? If that were all true, there is of course an argument that the leading edge labs would continue to train even better models, use them internally, and get continuously better and more efficient at what they do to improve their business. However, I’m not so sure that the government is going to prevent this forever. Interestingly, this could be shooting oneself in the foot, in that lately with the agentic AI stuff, it felt like the American labs had really run ahead of the Chinese labs. Now it’s not just training the model, it’s also training the harness, so it’s this bigger system that you’re training and delivering. And Fable’s so awesome that the government’s like, “Whoa, wait a minute, let’s not do that.” But that pause actually gives Chinese labs a chance to start figuring out how to train harnesses and catch up again, if you will. So I won’t be surprised if the pendulum swings back and forth, where it’s like, “Oh, never mind. Fable’s back on. We figured out how to feel comfortable about its safety,” because we feel like the rest of the world is catching up and we need to run ahead. By the way, this is going to impact Anthropic’s and OpenAI’s IPOs. There’s going to be revenue left on the table because they weren’t allowed to share the best technology with the customers who would pay the most for it, right? So I think there’ll be all sorts of incentives for the government and for these companies to get these models back out there.
But I definitely see your point: for as long as the government prevents them from releasing, it kind of disincentivizes training the next model at the scaling-law pace of, “Yeah, now we need a million accelerators. Now we need 5 million,” right? And so, to your point, that could actually flatten the growth of the demand curve a little for memory or for accelerators or whatnot.

Vik: Yeah, yeah. That’s the point I wanted to make: if this giant rush towards AGI is being limited or governed by competitive or political interference, then it’s questionable how much more we will need. But that’s only the training side, right? The demand for inference is still enormous. We have just started with agents. We have not even scratched the surface of what is possible. I truly believe that, right? And it seems to be very, very helpful. The people I’ve spoken to have said, “I can’t believe this thing.” All companies now need an agentic AI approach to doing their job. Or their product, if it’s a software company, needs to have AI in it somewhere. Otherwise, as a startup, even a software startup, they don’t have any valuation. You can’t go anywhere if there isn’t a component of AI. So it’s a strong driver everywhere, and it is helping productivity. If you look at the OpenAI jalapeno chip release with Broadcom, they say in the press release that it was accelerated by using ChatGPT. I’m sure it wrote some code to verify the chip or whatever. And there are a whole lot of EDA startups doing AI-enabled this and that. All of these are productivity gains. I truly believe we’re only going to need more tokens from here, not fewer. This leads me to my second question. I think companies are now becoming a little sensitive to token costs. It’s not so much about the number of tokens as about the cost of tokens, which is where things like the GLM 5.2 model come in, because it is a very capable model, but you can serve it at a fraction of the cost of OpenAI’s models, right? Or you can talk about DeepSeek V4 Pro, and maybe that’s enough intelligence. You don’t always need cutting edge intelligence for everything.
So, basically my question is, do you think that people will stop using these expensive frontier models in an attempt to cost optimize inference going forward?

Austin: My hot take is both are going to grow. I think companies are going to continue to be thoughtful and say, “Guys, we don’t need Opus for these little jobs. I’m summarizing these 50 articles and giving you a daily digest. You don’t need Opus to do that. Just use Sonnet or whatever.” So that kind of stuff is going to grow. And by the way, someone else is going to say, “Don’t use Sonnet. Let’s buy a token generator, use an open source model, and generate these tokens for free, if you will, for the price of the CAPEX of buying the on-premises token generator.” So I think that’s definitely going to grow big time, because lots of CFOs are going to go, “Wait a minute, we just went crazy, we have to...” But people are always going to want the frontier, and the frontier is always going to unlock more. Every time I touch the new frontier, I can do way more productive software building and automating and stuff. It was amazing with Opus. It was amazing with Fable while it lasted. We had a great 24-hour session where so much happened, and I want Fable back. So I definitely believe that the frontier will always unlock new things. It’s going to be frontier video models. It’s going to be continued frontier things. So I think both will grow. And of course, I do think that companies are going to start to fine tune their own models for their own proprietary use cases, and I think they’re doing this already. Fable’s going to be amazing, but we can’t expect it to have the best biology data for your company as you’re doing RNA sequencing or whatever. I don’t know anything about that, but you can tell that’s not just going to be baked into Fable. You’re going to have data, and you’re going to fine tune. That fine tuning may continue to be done on open source models, American or not, and we don’t seem to have a ton of American ones right now.
So I see both growing. That’s my reaction. What about you?
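The "tokens for the price of the CAPEX" trade-off can be framed as a simple break-even calculation. The Python sketch below is a rough illustration under made-up numbers: the hardware price, API price, and on-premises running cost are all hypothetical, and real comparisons would also need to account for utilization, depreciation, and model quality.

```python
from typing import Optional


def breakeven_million_tokens(
    capex_usd: float,
    api_usd_per_million: float,
    onprem_opex_usd_per_million: float,
) -> Optional[float]:
    """Millions of tokens after which owning hardware beats paying per token.

    Returns None when the on-premises running cost per token is not lower
    than the API price, because the hardware then never pays for itself.
    """
    saving_per_million = api_usd_per_million - onprem_opex_usd_per_million
    if saving_per_million <= 0:
        return None
    return capex_usd / saving_per_million


# All three inputs are hypothetical, for illustration only.
breakeven = breakeven_million_tokens(
    capex_usd=500_000,
    api_usd_per_million=10.0,
    onprem_opex_usd_per_million=2.0,
)
print(f"break-even after {breakeven:,.0f} million tokens")
```

The useful part is the shape of the answer rather than the number: the cheaper the hosted model, the more tokens a company has to generate before buying its own hardware makes sense.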

Vik: Yeah, I think one more thing that will become standard going forward is model routing, because you essentially have to send the right requests to the right model. That is going to be a very big cost optimizer for companies: making sure the wrong request isn’t burning frontier model tokens. So that’s a piece of engineering that’s going to become very important within a company.
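As a concrete picture of what model routing could look like, here is a minimal rule-based sketch in Python. Everything in it is an assumption for illustration: the tier names, prices, and task types are invented, and a production router would more likely classify requests dynamically than rely on a fixed table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_million_tokens: float


# Hypothetical tiers and prices, not real model names or rates.
TIERS = {
    "small": ModelTier("small-open-model", 0.5),
    "frontier": ModelTier("frontier-model", 15.0),
}

# Hypothetical mapping from task type to the cheapest tier that is good enough.
ROUTES = {
    "summarize": "small",
    "classify": "small",
    "code_refactor": "frontier",
}


def route(task_type: str) -> ModelTier:
    """Pick a model tier; unknown task types fall back to the frontier tier."""
    return TIERS[ROUTES.get(task_type, "frontier")]


def estimated_cost_usd(task_type: str, tokens: int) -> float:
    """Estimated spend for a request of the given size on its routed tier."""
    return route(task_type).usd_per_million_tokens * tokens / 1_000_000


print(route("summarize").name, estimated_cost_usd("summarize", 2_000_000))
print(route("code_refactor").name, estimated_cost_usd("code_refactor", 2_000_000))
```

Falling back to the frontier tier for unrecognized tasks favors answer quality over cost. A cost-first team might flip that default and escalate only when the cheap model fails.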

Austin: Yes, yes. I’ve been wanting to do this on my own. I’ve got lots of agents doing different tasks and I want to get better at saying, “Oh, these tasks should only use these models.” Right now, how I solve it is I just use OpenRouter and I’m like, “Yeah, use Gemini Flash-Lite or whatever for these tasks.” But it’s still hard, because there’s not a good user interface for visualizing all this. I’d like a better way to see all the different agents, what they’re doing, and which models they’re using, and to make sure it’s all cost optimized. I’m sure there are startups out there solving this problem already and I just haven’t learned about them. So if you’re listening and you know how to help me here, feel free to send me an email or comment down below.

Vik: Let us know. We really only do hardware stuff. We don’t know much about this stuff otherwise. We play around with models, right? What do we know? But the other thing is, somebody said, “No, no, no, I don’t think companies can deploy on-premises models at scale,” to which I replied that, actually, look at how big companies do simulation workloads today. They run a lot of workloads on premises because they don’t want their simulation data to leave the company. These Load Sharing Facility, or LSF, farms are basically data centers run by companies, and they are only reachable from within the company network. That’s how I ran simulations, for example. All the big simulations were dispatched to a server somewhere in the world and I got my results back. That’s how it always worked, because the Linux terminals I was given at many companies in the past were always virtual machines. It’s just a 2GB RAM virtual machine, not nearly enough to do the engineering work I had to do, so the work was always sent out on LSF. In the same way, companies can develop AI farms: they can put in hardware locally and deploy local models, even pretty big models, that do exactly what they need. So they don’t actually have to go to the cloud. I also just read a piece of news that Meta is now worried that using Claude on the cloud is somehow leaking data to the model, and that it will then be used to distill future models. So they’re issuing warnings to their people saying, “Be careful, don’t give away all the information.” So yeah, there are all kinds of concerns coming in as AI becomes more mature in its use in the industry, and it’s interesting to see how all this will have an impact. It’s not only about, “Look, look, we need more AI, more AI.” It’s about noticing that frontier models seem to face some kind of headwind at the moment.
As you say, it may not be forever. And companies are getting smarter about what they actually use, because they need to show revenue. You can use all the tokens you want, but at the end of the day, your CFO is going to come knocking: “Okay, you blew, I don’t know, $100 million worth of tokens this year, but my company revenue hasn’t proportionally skyrocketed. I want to see multiples of that $100 million spent on tokens, because we even laid off people for these tokens, remember? So I need to see that revenue.” This is going to become a big driving factor, I think.

Austin: Yeah, totally. It’s always going to be, at what cost? Obviously at what monetary cost, but also at what security cost, at what latency cost and so on. All right folks, we hope you liked our “AI is eating memory” deep dive. Thanks for listening. Thanks to our YouTube commenters. We love you. There’s a core group of you. Thank you for that. Thank you to our podcast listeners. Thank you to the person out there who was like, “Yo, where’s the podcast? It’s late this week.” We don’t really stick to a schedule per se. We try to get it out weekly, but we had some travel. I love that some of you love the podcast so much that you’re telling us, “Guys, get it out there. I’m ready for it.” That’s awesome. Keep listening and we’ll keep bringing this to you. Feel free also to check out our Substacks, ViksNewsletter.com and Chipstrat.com, and share this with a friend. Thank you.

Vik: And check out semidoped.com, where you get daily news picks with our own commentary on them. Think of it as a little readable version of this podcast brought to you every day. So, don’t forget to check it out.

Austin: Yes, yes. If you like our podcasts and we can’t bring our podcasts to you fast enough, check out semidoped.com to get Vik and Austin every single day.

Vik: Yes, and also leave us a five-star review on Apple. I’ve been told it’s very important. Yeah.

Austin: All right, that’s a wrap.