432-Core Chiplet-Based RISC-V Chip Nearly Ready to Blast Into Space

The Occamy processor, which uses a chiplet architecture, packs 432 RISC-V and AI accelerators and comes with 32GB of HBM2E memory, has taped out. The chip is backed by the European Space Agency and developed by engineers from ETH Zürich and the University of Bologna, reports HPC Wire

The ESA-backed Occamy processor uses two chiplets with 216 32-bit RISC-V cores, an unknown number of 64-bit FPUs for matrix calculations, and carries two 16GB HBM2E memory packages from Micron. The cores are interconnected using a silicon interposer, and the dual-tile CPU can deliver 0.75 FP64 TFLOPS of performance and 6 FP8 TFLOPS of compute capability. 

Neither ESA nor its development partners have disclosed the Occamy CPUs’ power consumption, but it is said that the chip can be passively cooled, meaning it might be a low-power processor.


(Image credit: HPC Wire)

Each Occamy chiplet has 216 RISC-V cores and matrix FPUs, totaling around a billion transistors spread over 73mm^2 of silicon. The tiles are made by GlobalFoundries using its 14LPP fabrication process. 

The 73mm^2 chiplet isn’t a particularly large die. For example, Intel’s Alder Lake (with six high-performance cores) has a die size of 163 mm^2. As far as performance is concerned, Nvidia’s A30 GPU with 24GB of HBM2 memory delivers 5.2 FP64/10.3 FP64 Tensor TFLOPS as well as 330/660 (with sparsity) INT8 TOPS.

Meanwhile, one of the advantages of chiplet designs is that ESA and its partners from ETH Zürich and the University of Bologna can add other chiplets to the package to accelerate certain workloads if needed.

The Occamy CPU is developed as a part of the EuPilot program, and it is one of many chips that the ESA is considering for spaceflight computing. However, there are no guarantees that the process will indeed be used onboard spaceships. 

The Occamy design aims to support high-performance and AI workloads through a bare-metal runtime, but it is not yet clear whether the runtime will be at a container level or at the bare-metal level. The Occamy processor can be emulated on FPGAs. The implementation has been tested on two AMD Xilinx Virtex UltraScale+ HBM FPGAs and the Virtex UltraScale+ VCU1525 FPGA.