When we start talking about processor performance, it's important to know that different applications have a wide variety of needs. It goes beyond simple “latency vs. throughput” concerns. Many factors go into actually executing a program, and all of them affect how it performs.

While we tend to think “serial code = CPUs” and “parallel code = GPUs”, there are some tasks that are best executed on CPUs yet still need massive amounts of memory bandwidth and big caches, or code that's almost entirely crunchy floating-point compute with minimal I/O. Particularly in the HPC world, you'll find a lot of these exotic workloads that simply don't run that well on typical systems owing to their unusual demands.
As evinced by its event at ISC 2022, Intel seems to be seeking to serve these customers with all kinds of customizable supercomputing hardware. The company gave the first performance data, vague and relative as it is, for its upcoming Sapphire Rapids-family Xeon Scalable processors with on-package High-Bandwidth Memory (HBM), and it also talked about a couple of upcoming products: Rialto Bridge (the successor to Ponte Vecchio) and the Falcon Shores “XPU”.

Sapphire Rapids With HBM For HPC Workloads

First up: the Sapphire Rapids numbers. All of the data that Intel published came in the form of relative comparisons against its current, third-generation Xeon Scalable parts. Overall, it looks like Intel is promising between double and triple the performance from its upcoming HBM-equipped chips compared to the extant models. That's a heck of a lot better than the typical gen-on-gen increases we're used to.

Intel's Ponte Vecchio data center GPU hasn't actually debuted as a commercial product yet, but that may be partly because Intel is shipping every available package it can fabricate to Argonne National Laboratory in Illinois. There, Intel and Cray are building the Aurora supercomputer that Intel promises will surpass the just-launched Frontier machine powered by rival AMD. AMD beat Intel to the punch with the first “exascale” machine, so Intel aims to punch back by doubling Frontier's performance.

Rialto Bridge Will Supplant Ponte Vecchio

The successor to Ponte Vecchio will be called Rialto Bridge. Intel divulged little information about the upcoming processors, but there are a few tidbits worth noting. Where Ponte Vecchio tops out at 128 Xe cores, Rialto Bridge will apparently sport up to 160 cores, a 25% increase.

That, obviously, means more compute throughput, but Intel is also promising increased memory bandwidth (by “more GT/s”, meaning a higher memory transfer rate) as well as “increased I/O bandwidth.” Ponte Vecchio already uses PCIe 5.0, and it's fairly unlikely that Intel is talking about PCIe 6.0. Instead, the company is probably referring to its own Xe Link interconnect, the analog to NVIDIA's NVLink.

The blue team also notes that Rialto Bridge will bring support for the second revision of the Open Accelerator Module (OAM) socket specification. OAM is part of an effort (known as OAI) to define a common specification for compute accelerators, including form factor, baseboard, socket, and so on. OAM v2 specifies an increase in power delivery up to 800 watts.

Intel’s Powerful Falcon Shores XPU

After Rialto Bridge and Emerald Rapids, Intel will be launching a product codenamed “Falcon Shores.” Intel refers to Falcon Shores as an “XPU”, and that's because it's neither CPU nor GPU, but both (or perhaps “either” might be more appropriate). Essentially, it's a processor that can mix and match Xeon CPU tiles and Xe GPU tiles to provide the right balance of compute performance for all kinds of workloads.

We've actually already heard about Falcon Shores before, as the company announced it at its winter investor meeting this year. In fact, there's not really any new information on Falcon Shores in today's presentation, just the same claims that, in comparison to extant parts, it will bring a greater-than-5X improvement in performance-per-watt, compute density, and both memory bandwidth and capacity.

Even for a product that isn't launching until 2024 at the earliest, these are some pretty bold claims, but given Intel's aggressive scheduling for its manufacturing division, they don't sound unrealistic. If Intel can keep up the pace, its competitors may have their work cut out for them.