October 22, 2024
In AI, time is money. Top AI players are spending billions to build computing infrastructure to satisfy that need for speed. However, these companies are constrained at the chip, memory, and I/O levels, and those bottlenecks are slowing down AI. In that regard, startup Ayar Labs is in the right place at the right time.
Ayar Labs’ solution is to replace electrical wires with pulses of light so that complex chips and memory can communicate faster over short distances. That improves system utilization, which translates into more revenue and productivity.
Ayar Labs’ products are almost ready for prime time, and CEO Mark Wade sat down with HPCwire to discuss the company’s products and path forward.
HPCwire: Can you give a high-level update as to where you’re at, what you’re focused on, and what the path forward for your company looks like?
MARK WADE: We’re building optical I/O solutions, which means the full set of products that enable optical communications straight from ASIC packages.
We have two main revenue-generating products today. One is our SuperNova light source — this is outside of the package, a remote light source. Think of it as an optical power supply that sits somewhere outside of the ASIC package.
We also build and sell the TeraPHY optical I/O chiplet — silicon that has about 70 million transistors and more than 10,000 optical devices. We integrate silicon photonics devices into a CMOS process, resulting in a piece of silicon we sell as a chiplet. That chiplet gets integrated into customer SOC packages.
The whole point is to enable optical communications straight from that SOC package. A lot of the system-level performance bottlenecks are coming from connectivity and bandwidth limits between different SOC or ASIC packages.
If you push optics in the right way into that SOC package, and you have high bandwidth, low power, low latency, and optical connectivity straight from the package, you also break the traditional bandwidth-distance trade-off of electrical communications. You could go half a meter, 10 meters, or a kilometer, all over the same optical fiber.
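To make that distance insensitivity concrete, here is a back-of-the-envelope sketch in Python. The copper-reach figures and the 10 dB optical power budget are illustrative assumptions for this sketch, not Ayar Labs or industry specifications:

```python
# Illustrative numbers only -- assumptions for this sketch, not Ayar Labs
# or industry specifications.
ELECTRICAL_REACH_M = {28: 3.0, 56: 1.5, 112: 0.7}  # Gb/s lane rate -> assumed copper reach (m)
FIBER_LOSS_DB_PER_KM = 0.2  # typical loss for single-mode fiber

def optical_margin_db(distance_m: float, budget_db: float = 10.0) -> float:
    """Optical power budget left after fiber loss (assumed 10 dB budget)."""
    return budget_db - FIBER_LOSS_DB_PER_KM * (distance_m / 1000.0)

for rate, reach in ELECTRICAL_REACH_M.items():
    print(f"electrical @ {rate:>3} Gb/s: ~{reach} m reach")
for dist in (0.5, 10.0, 1000.0):
    print(f"optical    @ {dist:>6} m: {optical_margin_db(dist):.2f} dB margin left")
```

Fiber loss at these distances is negligible, which is why the same optical link can serve half a meter or a kilometer, while electrical reach collapses as lane rates climb.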
HPCwire: Why is optical better for larger implementations, especially in the context of AI and high-performance computing?
WADE: The value of our optical solutions plays out when you’re looking at massive amounts of bandwidth escaping the package and running workloads that need hundreds of SOC packages working together. This is a regime that the high-performance computing community is very well versed in.
With Moore’s Law slowing down, you can still get more transistors per package, but the rate at which you can feed data into them isn’t keeping up. Looking at trend lines in memory capacity and memory bandwidth, we realized you really needed a way to break this I/O bound. We made that bet a long time ago.
For the last 10+ years, we’ve been saying that compute systems are on a path where electrical I/O is going to fail, and it’s going to start showing up more and more acutely.
HPCwire: When did you start focusing on AI workloads?
WADE: We had an outlook that recognized large-scale AI systems, or AI workloads, were on a path to really needing HPC-like systems to run effectively. Our 2018 Series A pitch was based on the idea that large-scale AI clusters would be the biggest opportunity in commercial data centers, and to scale the large-scale AI workload, you’re going to have to have optical I/O.
What really changed the world’s perspective was ChatGPT. Everyone started realizing there’s an AI workload that looks very different from what we thought AI was.
HPCwire: Do customers have to buy the chip directly, or is it IP that they have to buy and manufacture themselves?
WADE: The main business model we have right now is selling the actual product. There’s been this whole paradigm shift in the SOC world to enable chiplet adoption. If you pop the lid off of an ASIC, you’ll see multiple chips inside of it.
We’re selling what we call a known good optical chiplet into our customers’ packages. On the optical I/O chiplet side, we sell this as a revenue-generating product. They just buy the die directly from us.
HPCwire: In terms of product delivery, could you give an idea of when you established yourself, when the products would be in the market, and the things that are happening in between? What has the process been so far?
WADE: A few years ago, we were developing core technologies with our manufacturing partners. Now, these technologies are working, and we’re shipping low-volume products.
In the last 18 months, we’ve shipped over 15,000 units, with steady monthly shipments to multiple tier-one commercial customers. These are primarily for small-scale system builds, helping customers refine their manufacturing and integration processes for large-scale, deeply integrated optical systems.
We’re delivering thousands to tens of thousands of units annually. This sets the stage for volume production in our two-year focus window from mid-2026 to mid-2028. We anticipate scaling to hundreds of thousands to millions of units monthly, potentially reaching 100+ million units annually by 2028 and beyond.
HPCwire: What are those volume intercepts? It’s a two-part question, but I’ll let you answer the first one, and then I’ll follow that up.
WADE: The main commercial intercept that’s driving volume is large-scale AI systems – AI clusters, rack-scale, and multi-rack scale AI clusters for both training and inference. That’s really driving the vast majority of the volume adoption. There are a bunch of other bespoke things that are interesting as well, but they’re much smaller compared to the AI drive.
HPCwire: Is that the first intercept? And then you mentioned the second intercept?
WADE: It’s really a multi-generational set of products that are driving large-scale optical fabrics in optically connected AI systems. We’ve also had other applications in telecommunications and in more generic data center architectures and infrastructure.
The US government has been a longtime supporter of our company, and there are a number of applications in the defense and aerospace realm that we ship into today.
Over time, we view this transition to optical I/O as a ubiquitous paradigm shift that happens in lots of different application segments, but the volume adoption and the justification for large-scale investment into these optical technologies and products are all being driven by AI systems.
HPCwire: These AI systems come from the few chip makers that drive the industry right now. Does your business rely on those chip makers, or is the customer base broader?
WADE: Our go-to-market strategy focuses on addressing high-volume, high-quality manufacturing in photonics. We’ve established strategic relationships with key players like GlobalFoundries, Applied Materials, Intel, and TSMC, engaging with all tier-one CMOS manufacturers.
We also have a strategic partnership with Nvidia, the leader in large-scale AI systems, collaborating on integrating our technologies into future AI systems. Our direct customers are building SOCs and SOC systems, with the tier-one ecosystem including companies like Nvidia, AMD, Intel, Broadcom, and Qualcomm.
End customers building large-scale AI models, such as Anthropic and OpenAI, are crucial. Many acute problems are emerging in data centers as they try to scale AI workloads. We find it validating that these companies share the vision of the future we’ve been predicting for years.
Our success depends on insertions into these sockets. We’re addressing the challenges in photonics technologies, particularly in high-volume, high-quality manufacturing. This approach allows us to work with major players in the industry while also catering to the needs of end users, pushing the boundaries of AI technology.
HPCwire: Do you need customers to have a mindset of thinking of chiplet first as opposed to chip first?
WADE: There has been a lot of education around why people are using chiplets and whether the chiplet ecosystem is stabilizing fast enough that customers can view it as a low-risk insertion point in their designs.
I don’t know that I’d say customers are thinking chiplet first, but they’re more and more thinking system first. And then how do those requirements drive down into what requirements are coming from the SOC package?
AMD, Intel, and Nvidia — all the tier-one guys have already chipletized. The rest of the ecosystem has to follow because the tier-one guys are carving out that path. We’re more trying to springboard off that… now we just need to introduce this idea of optical chiplets.
HPCwire: There are tier-two and tier-three companies offering various chiplets, like CPUs or GPUs. How do you see your product fitting into this ecosystem? For example, could your optical chiplets be an add-on for companies now selling RISC-V CPU chiplets?
WADE: Yes, that’s definitely one option we’re looking at. Our model is to be a pure-play optical solutions provider, and we’re agnostic to what package we’re getting integrated into. This creates an exciting mix of business models and solutions, allowing us to flexibly provide connectivity value that can scale gracefully.
We might have a customer that only wants one optical chiplet per package, but we also have customers looking at 8 to 12 optical chiplets per package. We can address different customers and give them flexibility in how they adopt our technology – how many chiplets they use, what system-level form factor they’re integrating into.
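As a rough illustration of that flexibility, aggregate escape bandwidth simply scales with chiplet count. The per-chiplet figure below is an assumed placeholder, not a TeraPHY specification:

```python
# Hypothetical sketch: how off-package bandwidth scales with optical
# chiplet count. PER_CHIPLET_TBPS is an assumed placeholder, not a
# TeraPHY specification.
PER_CHIPLET_TBPS = 4.0

def escape_bandwidth_tbps(num_chiplets: int) -> float:
    """Aggregate optical bandwidth escaping the SOC package, in Tbps."""
    return num_chiplets * PER_CHIPLET_TBPS

for n in (1, 8, 12):  # the per-package counts Wade mentions
    print(f"{n:>2} chiplet(s) -> {escape_bandwidth_tbps(n):.0f} Tbps escape bandwidth")
```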
HPCwire: How do you justify the cost and power consumption of optical interconnects compared to electrical ones?
WADE: We focus on the application-level economics, particularly for large-scale AI. Current unit economics for large-scale AI are broken – it’s too expensive. We need to compare AI application-level figures of merit, not just component-level metrics like power consumption.
We’ve developed a system architecture simulator to estimate profitability, interactivity, and throughput for both AI workloads and core technology components. Our results show that while current systems improve performance generation over generation, there are no significant gains when normalized to profitability.
However, when comparing next-generation systems built with electrical I/O versus optical I/O, we see massive differences in profitability and interactivity. This economic argument is driving the motivation to move to optical I/O.
HPCwire: Is this more of a CapEx or OpEx consideration for your customers?
WADE: It’s primarily a CapEx consideration. The main issue for customers is unit economics – tokens per second per dollar. This is dominated by CapEx, specifically the system cost amortized over its ability to produce high-throughput token streams. Our estimates show the cost structure is about 80-90% CapEx amortization and 10-20% OpEx.
Essentially, the CapEx of the system is divided by the total useful throughput it can deliver.
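That formula is easy to sketch. The inputs below (system cost, amortization period, throughput, utilization, OpEx share) are hypothetical placeholders, not Ayar Labs data; comparing electrical-I/O and optical-I/O designs amounts to evaluating this figure for each candidate system:

```python
# Minimal sketch of the unit economics described above: cost per token is
# dominated by system CapEx amortized over useful token throughput, with
# OpEx adding roughly 10-20%. Every input here is a hypothetical placeholder.
SECONDS_PER_YEAR = 365 * 24 * 3600

def cost_per_million_tokens(capex_usd: float,
                            amortization_years: float,
                            tokens_per_second: float,
                            opex_fraction: float = 0.15,   # assumed OpEx share of total cost
                            utilization: float = 0.6) -> float:  # assumed useful-duty fraction
    """CapEx spread over lifetime useful throughput, grossed up for OpEx."""
    useful_tokens = tokens_per_second * utilization * amortization_years * SECONDS_PER_YEAR
    capex_per_token = capex_usd / useful_tokens
    # If CapEx is ~85% of total cost, dividing by (1 - opex_fraction) adds OpEx back in.
    return 1e6 * capex_per_token / (1.0 - opex_fraction)

# Hypothetical rack: $5M CapEx, 4-year amortization, 200,000 tokens/s sustained.
print(f"${cost_per_million_tokens(5e6, 4, 2e5):.2f} per million tokens")
```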
HPCwire: How are you handling scaling challenges as a startup?
WADE: We’re in a financing phase aimed at driving our company’s scale for the next two to three years. Our key challenge is navigating this growth with our go-to-market ecosystem partners, including supply chain and early customers.
We view tier-one companies as ecosystem establishers. With the right business model and product strategy, we aim to enable tier-two and tier-three customers within 9-18 months after tier-one companies establish high-volume production.
HPCwire: Do you license your IP to companies if they want it?
WADE: While we’re not opposed to IP licensing, our current focus is on delivering the actual product. This approach is more scalable and simplifies optical I/O adoption for our customers. For the next few years, we believe we’re best positioned to deliver products that integrate successfully into the manufacturing ecosystem.
We support customization and IP licensing conversations for customers serving different application spaces. However, about 90% of our focus is on delivering optical chiplet products, with only a small portion dedicated to IP models.