🎙️Huawei's Tau Scaling Law: Is the "EUV Killer" Real?

1.4nm-class performance, logic folding via hybrid bonding, near-packaged optics, EDA, and more

Huawei dropped a paper claiming 1.4nm-class performance without EUV, and the internet declared ASML dead. 🔥

Is it real, or is it marketing? Vik and Austin unpack what Huawei actually announced and explain why the “EUV killer” headline gets the story backwards. EUV dead? Nope. ASML hurt? Actually, the opposite.

Things we cover:

  • What the “tau scaling law” actually is, and why tau means delay

  • Logic folding: stacking logic on logic via hybrid bonding

  • The Kirin 2026, and doubling transistor count without shrinking

  • Who can build it, and whether hybrid bonding tools can be export-controlled

  • Why stacking is bullish for ASML, not bearish

  • The other tau knobs: a unified memory bus and near-packaged optics

This podcast is lightly edited for clarity.

The Huawei Paper That Broke the Internet

Austin: Welcome to another Semi Doped episode. I’m Austin Lyons with Chipstrat, and with me is Vik Sekar from Vik’s Newsletter. Hey Vik, what’s up?

Vik: Yeah, how’s it going? It’s going well. It’s a nice time to chat. We usually do this later in the week — usually we record on a Thursday, but today we’re doing Tuesday. And it’s perfect, because we were actually going to talk about something else, but then there’s this paper that dropped from Huawei that said something to the effect of, we’re bypassing EUV, we’re going to be at TSMC’s 14 angstrom equivalent by 2031. And everything online all at once is hitting me in the face, saying Huawei has circumvented EUV, ASML is dead in the water, US dominance is gone in leading-edge nodes. At least that’s the sentiment I get. And considering we recorded the deep dive on lithography last week, I texted you and said, okay, can we just talk about this early this week? Because it’s so fresh in my mind, we should just do this.

Austin: Yes, yes. So this is perfect. Let’s talk about it. And in fact, it was Memorial Day in the United States yesterday, so I was hardly even online, but the internet was blowing up. So I’m glad that you texted me, I’m glad that you read it, and I look forward to learning from you through this conversation. And I will say, I wonder how the United States government — anyone in those circles, like the Commerce Department — how they’re feeling. Because of course lithography, EUV lithography, is the big geopolitical chip on the table. And I bet some of them have to be freaking out if they heard or saw on X, like, yeah, what you used for controls is now rendered useless. They also have to be like, wait a minute, is this real, or is this marketing? So that’s the goal of this — to unpack what Huawei actually announced, and whether the marketing speak and the interpretation were different than the technical paper and the technical innovations here.

Vik: Yeah, so we wrote this down in the Semi Doped newsletter — this is kind of where our first reactions go. So if you’re not signed up to that, as the listener, you should. It’s free on Substack. You should go to semidoped.com and sign up, because whatever news comes our way, we just try to write up quickly the first thought that comes to mind. Usually it’s the most natural response. But after that, I got a few questions saying, hey, what do you think of this piece of news? And I responded and said, yeah, this is really interesting that Huawei has this approach of trying to improve performance of chips overall, but without going to the fancy machines, which they can’t get a hold of because they’re all under export control.

So without much ado, I think we should first explain what the whole claim was. There’s this conference called ISCAS 2026 that I believe is being held in Shanghai this week. And He Tingbo, who is a Huawei director and the head of HiSilicon, which is the chip arm of Huawei, gave a talk. I did not hear the talk. Do you know if He Tingbo is a he or a she? Because I might totally get this wrong.

Austin: I saw a picture of a woman, so I think whoever gave the talk was a woman, I’m pretty sure.

Vik: Okay, I’m glad I asked.

What Tau Scaling Actually Is

Vik: But anyway, the point is, the whole idea is that there’s this new guiding principle called the tau scaling law. And the spiel, at least, is that it’s going to replace Moore’s law. Because over time Moore’s law has stopped scaling, and we’ve been squeaking along ever since we hit the EUV nodes below 7 nanometer. The whole idea is, okay, let’s look at something else. And this is where I actually like this framing — the tau scaling law has a much more fundamental scaling that I truly appreciate. And I need to explain why this tau thing, first of all. Tau is a measure of delay. It could be delay on the chip, it could be delay through the interconnect, it could be delay between racks, it could be delay between entire data centers, anything.

So the whole idea of going to smaller transistors was essentially to minimize this delay. The smaller you made a transistor, the lesser the delay got from the input to the output of the transistor, and that only meant it went faster. So for the longest time, the only way to make things go faster was to reduce the delay. So Huawei’s interpretation is to stop thinking about dimensions of the transistor, which they really can’t scale without EUV machines. But why not go down one further level and ask what the transistor was solving anyway? And the answer was delay.

So then the next logical question is, okay, we can’t improve the transistor delay anymore because we don’t have the machines. So where else can we improve the delay? Because it’s not like delay comes only from the chip or the GPU or the CPU or whatever it is. Delay is everywhere. Delay is in software, delay is in interconnects and how you hook up memory, what memory protocols and handshakes you do. So they were like, okay, let’s reinvent everything and think of this from a whole-system perspective. So this is what they call their tau scaling law. We will now scale down tau at the system level, not so much at the transistor level that has been done historically, but over the entire system. It could be a phone to begin with, or an entire AI data center. We are now scaling down delay.

Austin: Okay, that makes sense. So summarizing it, they’re basically saying, hey, Moore’s law is dead. And by the way, even if it’s not dead, we can’t shrink transistors anyway because we can’t get our hands on EUV tools. So if we want to continue to increase performance, and we can’t reduce the geometric footprint of each transistor, how can we increase performance? And so they zoomed out and said, well, wait a minute, maybe the geometric scaling of transistors was actually ultimately about reducing delay. And so they’re trying to reorient around tau — this time-delay, resistance-capacitance product — and say, okay, fine, we have one knob that we can’t turn, but what are all the other knobs we could turn to continue to reduce delay?

And I do like the point of not just delay at the transistor level, but extreme co-optimization, or STCO, DTCO, which we could talk about. I saw you had a tweet about this. Which is just saying, how can we look up and down the whole stack — from transistors and devices to circuits to systems to racks to interconnects to the whole data center to software on top of it — and how can we try to co-optimize amongst all of those?

Vik: Yeah. So whenever they say this is kind of a law, it’s always nice to see some equation. And I read the whole paper, actually; it’s an easy read for a paper. And they had this nice equation which says the tau of the system is basically the delay through the transistor, delay through the circuit, delay through the chip, and delay through the system. And what they want at every subsequent generation is for the tau of the next generation to be the tau at this generation divided by some factor alpha, where they think that alpha factor is like 1.3x a year for mobile, and 1.5x for auto, and maybe even 10x for AI workloads. So think about that: optimizing across the system, they’re thinking they can reduce the delay by 10x for AI workloads. That is a significant improvement. And that is why they feel like they can get 1.4 nanometer-class performance by tweaking other parts of the system, not just the transistor.
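
For readers who want the arithmetic spelled out, here is a minimal sketch of the relation Vik describes, where each generation divides delay by a factor alpha. The function names are hypothetical, and treating the total delay as a plain sum of the four contributions is an assumption; the conversation does not say how the paper combines them.

```python
def tau_after(tau_now: float, alpha: float, generations: int) -> float:
    """Delay after `generations` steps, assuming tau_next = tau_now / alpha each step."""
    return tau_now / (alpha ** generations)


def system_tau(transistor: float, circuit: float, chip: float, system: float) -> float:
    """Total delay modeled as a plain sum of four contributions (an assumption)."""
    return transistor + circuit + chip + system


# Alpha values quoted in the conversation; the keys are just labels.
ALPHAS = {"mobile": 1.3, "auto": 1.5, "ai": 10.0}

if __name__ == "__main__":
    for workload, alpha in ALPHAS.items():
        # Start from a normalized delay of 1.0 and apply three generations.
        remaining = tau_after(1.0, alpha, generations=3)
        print(f"{workload}: {remaining:.4f} of the original delay after 3 generations")
```

The point of the sketch is only that the reduction compounds: a modest alpha applied repeatedly still shrinks delay quickly, and a 10x alpha does so dramatically.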

Austin: Okay, so this law — most of these laws are not actual physical laws, they’re observations. So they must have had in their paper... were they showing chips that they created and measured these constants, and that’s where they’re seeing the scaling? Or where did this data come from in this empirical observation?

Logic Folding: Stacking Logic on Logic

Vik: It’s not anywhere. So they have some silicon, and let’s get to that in a bit. But their optimizations were interesting, because they happen, from what I could tell, across three dimensions. The first dimension was that they just want to increase transistor density. That’s the whole thing that EUV does: it lets you pack in more transistors per unit area of the chip by making transistors smaller. So they stepped back and asked, okay, we can’t make transistors smaller, so how do we scale up the number of transistors in a chip? So they decided, okay, fine, we’ll just take two chips, stack them one on top of the other, and hybrid bond them.

Hybrid bonding is a packaging technique that’s very interesting, because you can have millions of connections between these two logic chips. And the way it works is that you just heat them and put pressure, and they literally stick to each other. In the simplest way, that’s what hybrid bonding is. So you can have very, very fine connections that are closely spaced — the pitch between connections is something like 1.5 micron, across a massive area of a chip. Think about that. So hybrid bonding is a very fine-pitched packaging technology, which is probably the most advanced packaging technology you can get.
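
To see where "millions of connections" comes from, here is a rough sketch that counts bond pads on a square grid at the 1.5 micron pitch Vik quotes. The 100 mm² die area is a hypothetical figure chosen for illustration, and covering the whole die with a uniform grid is an assumption that gives an upper bound rather than a real design.

```python
def bonds_per_mm2(pitch_um: float) -> float:
    """Connections per square millimeter for a square grid at the given pitch."""
    per_mm = 1000.0 / pitch_um  # connections along one millimeter
    return per_mm ** 2


def bonds_on_die(pitch_um: float, die_area_mm2: float) -> float:
    """Upper bound on connections if the whole die area were covered by the grid."""
    return bonds_per_mm2(pitch_um) * die_area_mm2


if __name__ == "__main__":
    pitch = 1.5        # micron, the figure quoted in the conversation
    die_area = 100.0   # mm^2, a hypothetical die size for illustration
    print(f"{bonds_per_mm2(pitch):,.0f} connections per mm^2")
    print(f"{bonds_on_die(pitch, die_area):,.0f} connections on a {die_area:.0f} mm^2 die")
```

At that pitch the grid works out to roughly 444,000 connections per square millimeter, so even a modest die lands in the tens of millions.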

So that first dimension was, okay, let’s just stack two chips together. So in the space of one chip, we now get two chips. Hooray. That’s one way to get transistor density. That’s kind of cheating, because now you also use two times the silicon area, since you’ve got to sandwich two wafers together. But considering the cost of EUV, which we discussed in the last podcast episode, maybe it’s not a big deal, just saying.

Austin: Yeah, yeah, totally. Okay, so you’re saying they’ve got a chip, they can’t increase the transistor density because they’re at their fundamental limits with DUV and multi-patterning and whatnot. And historically, by the way, how the industry is kind of quote-unquote getting around Moore’s law is systems of chips. Whether you’re using chiplets or whatever — okay, well, let’s use 2.5D integration and put chips next to each other, and then we’ll have CoWoS, or interposers, connecting things. But you’re saying that Huawei said, no, wait a minute, what if, instead of putting those transistors far apart and having increased delay because now we’ve got to route between them, what if we just try to decrease the delay by stacking them in three dimensions, to increase the density, if you will, in a unit volume really.

Vik: Yeah, they call it logic folding, which is a nice name. But it’s really, if you think of it, no different, I feel, compared to what Intel Foveros is, or how AMD stacked SRAM with V-Cache. I guess it’s not technically logic-to-logic stacking when you’re talking about AMD’s V-Cache, because they put SRAM on top of a logic wafer to boost L3 cache on it. But in principle, yeah, they hybrid bonded an SRAM wafer onto a logic chip, and that’s kind of what this is all about.

So logic-to-logic stacking is not easy, right? Because how are we going to get heat out of this thing? That’s one example — thermals are quite challenging. So there are a lot of challenges to doing this stuff. And you’ve got to align it. Think about it — the connections are like 1.5 micron pitch. It’s actually ridiculously tiny. And so the alignment between the bonds needs to be perfect. And hybrid bonding itself is a crazy packaging technology, because the surfaces that you’re bonding need to be very flat and defect-free and all that. Because when you squeeze it together, if there’s a dust particle between the two of them, you know what happens — now you’ve got an open connection between the two sandwiched chips, and that’s bad. So it’s a challenging process.

That’s the whole question — does China actually have the equipment to do this? Funnily, they do, for two reasons. One, they do have this expertise, because they have been doing memory stacking for NAND at YMTC using wafer-to-wafer stacking and hybrid bonding of chips. That’s how NAND chips work — they’ve even done hundreds of layers of NAND. So they are familiar with it. But memory is a little bit of an easier problem, because memory has so much redundancy that you can route around stuff — have failovers in the memory architecture and stuff like that. So stacking is a different problem in memory than it is for logic. Stacking two GPUs on top of each other is a significantly harder problem than trying to stack 400 layers of NAND memory, you know?

Austin: Totally. So, okay, you’re saying historically stacking things is not a new concept. Even stacking logic is not necessarily a new concept. But normally when we’re stacking — let’s say logic on an interposer — that interposer is passive, so it’s not as big of a deal, you’re just routing through it. And even if you’re stacking memory on logic, that’s a little... and of course memory, and NAND and HBM, a lot of these things are already three-dimensional. So we’re already used to figuring out how to create things in three dimensions and stack them. But it is a bit of a different beast when we stack logic on top of logic, because they’re both active, they’re both powered, they’re both giving off thermals, and you have to make sure all the connections are correct. And there’s not this built-in redundancy where, if something fails, just route around it. But conceptually, to the industry, logic on logic is not a new thing that Huawei has invented.

Vik: No, it’s not, and it’s been around in principle. So that’s what makes it interesting — it’s a challenging problem, and it’s impressive that they do have silicon. It’s called the Kirin 2026. This is a mobile SoC processor, and they actually have this implemented, and they have plans to keep going in the future. The paper has all these numbers — yeah, here I have them. They’ve managed to double their transistor count, obviously by stacking. And so they kind of jump nodes, in principle. Because, we spoke about this, there is no such thing as 5 nanometers or 2 nanometers anymore, because the transistor architectures have changed. So this is just the nomenclature now anyway. So you can go to the same class node by doing other things.

Gate-all-around was one of those other things you could do to go to 2 nanometers. So Huawei’s approach is, yeah, we’ll just stack transistors and we’ll get the same transistor density. In principle, this is like CFET maybe — a complementary FET — where people were like, why should I put an NMOS and a PMOS transistor next to each other? Because for a CMOS, the complementary MOS, you require P-type and N-type transistors. What if I put them on top of each other? You can save space. So this is along the same inspired lines, not exactly the same thing. But basically, why not put a whole transistor wafer on top and stack them up like that, and you can jump generations forward. So they’ve done it, and this is very impressive engineering — all kudos to them. I’m not going to take away from their engineering achievement here. So that is the whole logic stacking aspect of it.

There are two more things that I think are very useful, but I want to get to those after you ask me this question. Yes.

Who Builds It, and Can the Tools Be Restricted?

Austin: Thank you for letting me graciously interrupt. So on the logic stacking — one of the things this reminds me of, of course, is DeepSeek, in that they could not get enough compute. They could not get enough compute and enough memory bandwidth with the chips that they were given. Allegedly — okay, maybe they did find their way to some H100s, but allegedly they had these stripped-down chips, which caused the DeepSeek team to have to innovate in other dimensions because they were constrained on one. And so they got to be the first to think deeply about other things, like, how do we offload some communication, overlap some communication and compute, do other little tricks so that we still unlock the right performance.

And so what I’m thinking about is, okay, if Huawei is constrained to not use EUV, and therefore they’re thinking about how can we reduce delay in other parts of our system — and one of them is, it’s forcing them to go to logic-to-logic stacking, maybe sooner than the rest of the industry feels that they have to. My question is, who is manufacturing it? And is this giving them an advantage in just getting more practice manufacturing logic on logic? Like, will they be able to run ahead a little bit because they are forced to build this for their customers sooner than, say, a TSMC?

Vik: Yes. This is a good question, and I’m glad you asked this now before I went on to talk about something else. Because the fact that we are stacking logic on logic has two implications. The first one is, it is entirely centered around hybrid bonding, because that is the secret sauce that allows them to increase transistor density. So you can increase the density two times. Can you stack it three times? I don’t know. Can you stack it four times? I don’t know. So what is the limit on hybrid bonding here? How many layers? That’s something I don’t know. But remember, it gets extremely difficult, because if you are stacking logic on logic, I think you would want to do die-to-wafer stacking, not wafer-to-wafer stacking, because with wafer-to-wafer you will get wrecked on yield. As it is, these logic chips are kind of big. And yield is such an important thing, because you don’t get as much of it as people imagine, especially at the very cutting edge. But maybe at 7 nanometer nodes it’s okay.
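
The yield argument can be sketched with a simplified model. It assumes the textbook Poisson die-yield formula, assumes wafer-to-wafer bonding pairs dies blindly, and assumes die-to-wafer bonding places only tested-good top dies. The die area, defect density, and bond yield below are hypothetical numbers, not figures from the paper or the conversation.

```python
import math


def poisson_yield(die_area_cm2: float, defect_density_per_cm2: float) -> float:
    """Classic Poisson die-yield model: Y = exp(-A * D0)."""
    return math.exp(-die_area_cm2 * defect_density_per_cm2)


def wafer_to_wafer_yield(y_top: float, y_bottom: float, y_bond: float) -> float:
    """Dies are paired blindly, so both dies and the bond must all be good."""
    return y_top * y_bottom * y_bond


def die_to_wafer_yield(y_bottom: float, y_bond: float) -> float:
    """Only tested-good top dies are placed, so top-die yield drops out per site."""
    return y_bottom * y_bond


if __name__ == "__main__":
    # Hypothetical inputs: a 1 cm^2 logic die, 0.2 defects/cm^2, 98% bond yield.
    y_die = poisson_yield(die_area_cm2=1.0, defect_density_per_cm2=0.2)
    y_bond = 0.98
    print(f"single die yield:     {y_die:.3f}")
    print(f"wafer-to-wafer stack: {wafer_to_wafer_yield(y_die, y_die, y_bond):.3f}")
    print(f"die-to-wafer stack:   {die_to_wafer_yield(y_die, y_bond):.3f}")
```

Under these assumptions the wafer-to-wafer stack pays the die-yield penalty twice, which is why larger logic dies push toward die-to-wafer bonding.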

So that is one aspect of it — that it’s entirely based on hybrid bonding. And the question is, are they capable of it? And the answer to that is, at this point, memory-to-memory was all wafer-to-wafer — so memory stacking is all wafer-to-wafer. But logic needs die-to-wafer, and that is kind of new even to BESI and these companies that specialize in hybrid bonding. Even they have product releases that are very recent. So die-to-wafer bonding in logic chips is very cutting edge. And luckily, from what I was looking at, there is no export restriction on hybrid bonding machines. You have extremely high limitations on what you can do in EUV, but not so much on hybrid bonding machines. So that’s one thing — they do have the machines that they can do this with. And if you look at BESI’s business, 35% of all their business is actually China-based, if you see the last quarter. So that is a large fraction of their machines that actually do go to China. So I’m pretty sure they’ve stacked up on some BESI machines before they let this cat out of the bag.

Because if you already have a piece of silicon that’s stacked up and working, as in the Kirin mobile SoCs, you can believe that they have been working on this for years. This is not an overnight achievement.

Austin: So you’re saying SMIC is the manufacturer here, and they can’t get EUV machines, but they can get hybrid bonding machines, and hybrid bonding is the secret sauce here to logic stacking. So do you think that someone’s going to try to go say, now you can’t buy hybrid bonding equipment anymore?

Vik: So Huawei never said it’s SMIC, by the way — everybody assumes it is. Because it’s a reasonable assumption, and their stock went up and all that, which is cool. But I don’t think it’s SMIC who will do the stacking aspect of it either, because there are specialized people in packaging in China whose names we shouldn’t get into. I’m writing an article on Substack on this, so all those details will be on there. But there is another company that will do the hybrid bonding, because there is a whole learning curve on learning to do hybrid bonding. So there are companies who have patents just on hybrid bonding, and that is something that is not easy to do. So that’s a separate skill set. China is working on that as well.

So that’s the one important thing — it’s all hybrid bonding based. And the next question is, okay, does China have any local hybrid bonding machines? The answer is yes, they do, but I don’t think they are at the same level of sophistication there is for wafer-to-wafer bonding, for example. So that is something they still rely on BESI for. And to answer your question — yes, they can impose restrictions on it, on China, I presume. But it really comes down to, if they can get EV Group wafer-to-wafer bonding... there is a company called EV Group, which is an Austria-based company, and they are not in this axis of export restrictions that BESI is in, because BESI is a Netherlands-based company, and they’re in the same boat as ASML and stuff like that. These countries have a lot of export restrictions. But if they can do wafer-to-wafer bonding — who knows? Maybe they’re not subject to export restrictions, because Austria is not part of this thing.

Austin: Yeah. Well, okay, I did not expect that this conversation would get so much into hybrid bonding, so that’s cool. And I’m like, I need to go learn more about hybrid bonding in the market and who all the competitors are. And then two — yes, these poor European countries, they’re like, we invented something, we’re awesome at wafer-to-wafer hybrid bonding, and then they’re going to get caught up in the crossfire of geopolitics.

Why This Is Actually Good for ASML

Vik: The one thing I anticipate is, wait, if China can’t do EUV, is ASML affected by this news of tau scaling? No, because first of all, they were never buying EUV machines from ASML; they can’t. So there was never a business to begin with. Secondly, I will argue that this is actually good for ASML. Because remember, now they have to make two wafers using deep ultraviolet (DUV) for every chip. So they need more DUV machines, which is a positive for ASML, right?

Austin: There you go. I like it. That’s a positive spin.

Vik: Yeah, that’s a positive spin on it. So that’s the whole thing about how this whole logic thing works.

Austin: So then, really quick — logic to logic, maybe last thing. We’re talking about companies who build the die and other companies that package them. It’s kind of the front end and the back end. Of course, Intel Foundry can do both — they can make the wafers, they can also do the advanced packaging. And you mentioned Intel Foveros Direct. So can you say 10 more seconds on Foveros, and whether this is a direction that Intel Foundry could support with the logic-on-logic stacking and packaging?

Vik: Yeah, I don’t see why not. That is essentially what Intel Foveros is, as far as I understand it. And what this shows is — this is the other thing I wanted to mention, it comes to me now that you asked the question — basically, there is no reason that any US fab, like Intel or anybody else, like TSMC, shouldn’t start stacking wafers now. Because if you stack a 7 nanometer wafer and then you get tau scaling to work, imagine what will happen when you stack a 5 nanometer wafer, or a 3 nanometer process node, or a 2 nanometer process node. You’re going to leapfrog past what China can do with tau scaling.

So they may be able to catch up, but what this will do now is drive the ability to do hybrid bonding in the advanced EUV nodes — because why not? It’s not an overnight thing, it’s a very complicated thing to do. But think of it long term. If you can stack 2 nanometer node wafers, that is an amazing amount of compute in a small area. And we may get to CFET before that — maybe we don’t need it. We may do hybrid bonding of gate-all-around FETs before CFET shows up. Don’t know. Or we may hybrid bond CFET wafers together — the ultimate density move.

Austin: Totally. We should do it all, right? The front-end folks should keep working on the transistor innovations, and the packaging folks should keep improving stacking and hybrid bonding, and then slam it all together. And I do agree with your point, which is — okay, let’s say I can do a billion transistors in this little area, and now I can stack them so I get two billion. And then you’re on an advanced node and you can do 1.5 billion transistors in the same area, and now you stack it and now you have three billion transistors. So it compounds, totally.

Vik: Yeah. So the whole tau scaling thing is not an “I will replace EUV technology and leapfrog around you without the right tools.” It is a temporary measure, where yes, you can bump up the performance of silicon with this technique. But if the people who are EUV-enabled do start doing the same thing — and they will, because that’s how the industry works, they’re not going to sit down and not do something about it — then the gap widens. It doesn’t narrow. The gap widens. That’s a good thing.
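
Austin's numbers make the "gap widens" point easy to check. This small sketch uses his example figures of 1 billion and 1.5 billion transistors in the same area; the function name is hypothetical and two identical layers per stack is assumed.

```python
def stacked_count(per_die_billions: float, layers: int = 2) -> float:
    """Transistors in a stack of identical dies, in billions."""
    return per_die_billions * layers


if __name__ == "__main__":
    older_node = 1.0      # billions per die, from Austin's example
    advanced_node = 1.5   # billions per die in the same area, from Austin's example

    gap_flat = advanced_node - older_node
    gap_stacked = stacked_count(advanced_node) - stacked_count(older_node)
    ratio_stacked = stacked_count(advanced_node) / stacked_count(older_node)

    print(f"gap without stacking: {gap_flat:.1f}B")
    print(f"gap with both sides stacking: {gap_stacked:.1f}B")
    print(f"ratio with both sides stacking: {ratio_stacked:.1f}x")
```

Note what the sketch does and does not show: the absolute gap doubles from 0.5 billion to 1 billion transistors once both sides stack, while the ratio between the two stays at 1.5x.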

Austin: Yes. And maybe to make the point yet a third time — if, after Huawei’s big tau scaling announcement, someone came to their silicon manufacturer, whoever that is, and said, would you like EUV as well? I’m sure Huawei would say, sounds great, let’s have that and tau scaling.

Vik: Yes, exactly. That’s what you do. That’s the logical thing to do. So absolutely. So that’s the whole aspect about logic folding, and that is mostly the discussion that is going on here.

The Other Tau Knobs: Memory and Optics

Vik: But their paper actually talks about a few other dimensions that are at least worth mentioning. And that is basically what they call the unified bus for memory. Because they say, look, if you have all these different memory standards talking to each other, and then you have to have all these handshakes and converters and gearboxes and all of this stuff that adds latency, then you’re wasting cycles here. You’re wasting tau. Don’t waste tau. What we will do is have a universal language, which everything in the rack or the system or the data center — I don’t know, the earth, if you could — will all speak the same language, so that there’s no translations happening. And so that is one way to scale down the entire thing, the delay, and speed up stuff. It’s a very good one. I mean, I love it. It’s a very good thing to do anyway, regardless of whether you have EUV or not.

Austin: Totally. So isn’t that ultimately like — weren’t we trying to do that with RDMA and things, and just say, how do we make it so that GPUs can communicate with other GPUs to share their memory directly without so many handshakes? Was the industry already on this path? And does this just speed it up and say, hey, there’s more latency to get rid of?

Vik: Exactly. That’s the whole point. The industry has been there already. Again, this is not a new concept.

Austin: Just a different prioritization.

Vik: Yeah, exactly. When Jensen talks about extreme co-design, what do you think he’s thinking about? This is what he’s saying. Don’t just think about one thing, like memory in isolation, and work with your own standard, and when you try to plug it into a system it has to talk a different language, and now everybody’s like, can you convert this language to that language? Don’t do that. Let’s look at the whole thing as one picture, and then optimize everything for that system-level optimization. So this is the STCO argument — or you can call it extreme co-design, whatever fancy word you want to use. Tau scaling seems like the fanciest word I’ve heard yet.

Austin: Yeah, very good. Their marketing folks did great. So, you may have heard the term STCO — system technology co-optimization. Jensen took it up a notch with extreme co-design, and now Huawei’s trying to one-up with tau scaling. Which, I mean, it does sound pretty sweet.

Vik: It is pretty sweet. Now, one other thing they want to save tau on is networking. Because they’re like, why don’t I just do near-packaged optics and eliminate DSPs from the entire system if possible? DSPs are terrible for latency. They are tau killers. DSP is the tau killer, like “fear is the mind killer” in Dune, if you’ve ever read the books. DSP is the tau killer. Because you have to wait for all the bits to arrive, and then you have to wait for the parity bits to come, and then the DSP looks at it and goes, are these bits correct? They’re not correct? Cool, then I have to correct for the error, and then it does all this computation. You need a leading-edge node to do this DSP stuff. Sucks power, sucks latency, and they want to do away with it.

They want to just get rid of DSPs, don’t use all this pluggable stuff. Use as close as you can to CPO, which is maybe near-packaged optics. Maybe you don’t — maybe they’ll try to package the optical engine right on top of this stack, logic, folded logic, whatever. I don’t know. But anyway, at least in the near term, they can put the optical engine as close as possible to the actual compute silicon.
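
The waiting Vik describes can be put into rough numbers. This sketch assumes a block code where the receiver must collect a full codeword before it can start correcting, and it uses Reed-Solomon RS(544,514) with 10-bit symbols purely as an illustration; the conversation does not name a specific code. The lane rate and decode time are hypothetical placeholders.

```python
def accumulation_latency_ns(codeword_bits: int, line_rate_gbps: float) -> float:
    """Time to receive one full codeword. Bits divided by Gb/s gives nanoseconds."""
    return codeword_bits / line_rate_gbps


def fec_latency_ns(codeword_bits: int, line_rate_gbps: float, decode_ns: float) -> float:
    """Accumulation time plus a decode time supplied by the caller."""
    return accumulation_latency_ns(codeword_bits, line_rate_gbps) + decode_ns


if __name__ == "__main__":
    # RS(544,514) with 10-bit symbols: 544 symbols per codeword, 30 of them parity.
    codeword_bits = 544 * 10
    line_rate_gbps = 100.0   # illustrative lane rate, not from the conversation
    decode_ns = 50.0         # hypothetical decoder time, purely a placeholder

    wait = accumulation_latency_ns(codeword_bits, line_rate_gbps)
    total = fec_latency_ns(codeword_bits, line_rate_gbps, decode_ns)
    print(f"waiting for the codeword: {wait:.1f} ns")
    print(f"with decode: {total:.1f} ns per FEC stage")
```

Every hop that re-runs this kind of correction adds the same wait again, which is the latency that removing DSPs from the path is meant to save.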

Austin: Stack it up. Let’s go.

Vik: That’s one way to reduce cycles. So they’re looking at all of this stuff. Obviously you can do software optimizations, all that kind of stuff. So at the system level, they want to squeeze as much performance as possible — which is the only logical thing you do when you don’t have access to leading-edge silicon. What do you do? You do everything else. Exactly. That’s what tau scaling is.

Austin: Okay, gotcha. Yeah, of course, NPO and CPO make a ton of sense. Everything comes with trade-offs, though. So there’s the whole, like, does the supply chain support these things? Are they ready for it? Can they build it reliably? Do you have multiple sources? So I think it feels a little bit academic, in that, on one hand, anyone could sit down and look at a system and just say, what are all the different ways we could wave a magic wand to reduce tau? I don’t know if they talked about all the practical bits in the paper, or if this was more of just whiteboarding out where the bottlenecks are.

Takeaways: Bullish Packaging, EDA, and Multiphysics

Vik: Very high level. It is interesting that they have some silicon to show for it, which I love. But it’s also a lot of high-level, hand-wavy, equation-y stuff. It’s not very complicated equations — you can read the paper, it’s online, you can find it. It’s not a very complex paper, it’s very marketing-like, but it’s a good read. Because I think it’s another knob the industry hasn’t entirely paid attention to. I think we are getting there, and this is one of those signs. We realize that we need more speed, we realize co-packaged optics is coming along, we realize that memory bottlenecks are the biggest problem. It’s not compute flops, it’s memory and the interconnect that is really holding back everything now. And this is the next step — okay, how do we squeeze and make active silicon in CPUs, GPUs, or whatever that is, more dense? And the insane way to do it is start stacking them and hybrid bonding them. But this whole thing is insane. So what’s new? AI is insane to begin with. So what’s new?

Austin: Totally. Yes. Never before have we tried to co-optimize on such a grand scale and such a miniature scale. And so I do like looking at a different constraint to optimize around, up and down. So I guess, maybe final takeaways — what does this mean? And you kind of alluded to it before — what does this mean for everyone else, for TSMC, for ASML? Other than the fact that they’re not dead and EUV’s not going away, are there any other takeaways?

Vik: I don’t have too many, unless there is something that comes up that I haven’t thought about. But whatever I thought about, I already said. As a broad summary: I think going to stacking chips is a positive for ASML, because you need more machines to make more chips for the same product — you need to make two times as many wafers, which is a good thing. The other thing is that you will see most of the industry now starting to optimize across the entire stack. That’s already happening, nothing new about that. And then, I’m guessing that we will start seeing some activity around stacking up wafers. Maybe somebody’s going to try to use Intel Foveros and stack GPUs. I don’t know, that’s just a guess. But yeah, once Huawei talks about this logic folding idea, more people are going to do it. And that’s a good thing. Gotta try more complicated stuff — that’s how we move ahead.

Austin: Totally. I guess what’s coming to mind to me — obviously this is bullish advanced packaging, because it’s just more complicated connections. And then also maybe bullish EDA, and multiphysics.

Vik: Oh my God, great point. Actually, that’s a great takeaway. Yeah.

Austin: Yep. So we’re basically at time, so I won’t get into it too much, but really quick, at a high level, where I’m thinking is, okay, now this is a three-dimensional problem that involves thermals, it involves mechanical stress, it involves electricity, it involves optics potentially, if you’re talking NPO/CPO. And so this becomes more and more of a challenge. This of course reminds me of, like, Synopsys/Ansys’s multiphysics engine. It’s going to be less and less of the silicon guy does this thing and the packaging guy does that thing and the thermals guy does that thing; all of it needs to be brought together to figure out how do we stack logic on logic and remove the heat and still meet timing closure and still have reliability, and so on.

Vik: Yes. It is hard enough — EDA is a hard enough problem already, where we are on single-layer transistors, and you know how you scale them up to make the GPUs they are today. There’s a lot of advancement in the use of AI for EDA, and the whole EDA industry is actually very bullish right now, because you can basically sell licenses per agent rather than per person. And it can do a lot of stuff now. And what this adds is a level of complexity that we haven’t seen so far in the transistor world, when you start stacking entire wafers and running complete logic across two wafers stacked, or maybe four wafers stacked in the future. That’s crazy. So there’s going to be a lot of challenges there. And the paper actually does mention that this is a challenge. So that’s very much a good point to bring up.

Austin: Nice, cool. All right. Well, folks, with that, we hope you liked this episode. Check out, of course, our Substacks. Also go to semidoped.com, where you can sign up for our free newsletter. If you like Vik and me and our takes, we try to give takes there every single day. And sometimes I come in later than Vik, so I just get to take a take on his take, and he doesn’t get to respond before I hit send. So we try to keep it lighthearted and fun, too. But thanks for listening, and we’ll catch you guys next time.
