Kaby Lake-G’s Vega Credentials Questioned: Rapid Packed Math Not Working

As with any shocking new development, rumors have swirled around Intel’s “Kaby Lake-G” since AMD announced it was selling custom graphics chips to its longtime rival. Many of the early reports suggested that Intel was purchasing previous-generation “Polaris”-based silicon. Intel later quashed those rumors when it gave us the full rundown of its new Kaby Lake-G processors with Radeon RX Vega graphics. But recent discoveries throw those credentials into question.

We recently tested the Kaby Lake-G Core i7-8809G in Intel’s Hades Canyon NUC. As we noted in our coverage, many third-party utilities weren’t reporting certain parameters correctly, such as package power consumption and temperatures. That certainly isn’t out of the ordinary for pre-release hardware. Given the obvious lack of optimization for the new processors, we chalked these disparities up to reporting errata.

Now Kaby Lake-G is back in the news with speculation that its graphics are more akin to the Polaris architecture than Vega. Rather than simply reporting on that speculation, we ran some of our own tests to see if Intel supports one of the features AMD advertises as Vega-specific. What we found was more than a little surprising.

Reigniting The Polaris Fire

PC World re-opened the Vega/Polaris debate last week when it reported that AIDA64, a third-party testing utility, identified the Hades Canyon NUC’s Radeon GPU as Polaris 22. The utility also lists the architecture as AMD GCN4 (Polaris). Conversely, SiSoftware’s Sandra utility identifies the graphics engine as Radeon RX Vega M GH. 

Unfortunately, device IDs don’t tell the whole story. Utilities simply report what they read from identifiers that aren’t always accurate. Kaby Lake-G’s Radeon graphics, for instance, are identified by the gfx804 OpenCL device name, which is the same device name AMD uses for Polaris 12, found on Radeon RX 550 graphics cards. This could be due to an identifier that wasn’t updated correctly or a detection error in the OpenCL run-time; some utilities automatically fall back to the latest known code name if an unexpected value is returned. Software makers can also override the code name for display purposes. That explains why some utilities identify the NUC’s Radeon graphics architecture as Vega, while others list it as Polaris 22. Third-party utilities also use PCI device IDs for identification. That’s 694C and 694E for Kaby Lake-G graphics. But those device IDs also trace back to Polaris 22.
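To make the disagreement between utilities concrete, here is a minimal sketch of how an ID-based lookup with a fallback and a display override could behave. It is not how AIDA64 or Sandra actually work; the table, the `identify` function, and its parameters are hypothetical, and the only real values are the two PCI device IDs and the names quoted above.

```python
from typing import Optional

# Illustrative table built only from the PCI device IDs mentioned in the article.
PCI_ID_TO_CODE_NAME = {
    0x694C: "Polaris 22",
    0x694E: "Polaris 22",
}


def identify(pci_device_id: int,
             display_override: Optional[str] = None,
             fallback: str = "Unknown") -> str:
    """Return the name a hypothetical utility would display for a GPU.

    A vendor-supplied override wins; otherwise the ID table is consulted,
    and an unrecognized ID yields the fallback name.
    """
    if display_override is not None:
        return display_override
    return PCI_ID_TO_CODE_NAME.get(pci_device_id, fallback)


if __name__ == "__main__":
    # Same hardware ID, two different answers depending on the tool's policy.
    print(identify(0x694C))
    print(identify(0x694C, display_override="Radeon RX Vega M GH"))
```

The same device ID produces either name depending on whether the tool trusts its table or a display override, which is why the reported name alone cannot settle the architecture question.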

More telling: PC World also noticed that the Hades Canyon NUC’s Radeon graphics do not support DirectX 12.1, as Vega graphics chips do. Instead, they only support (up to) DirectX 12.0, similar to Polaris. Even AMD’s Ryzen 5 2400G processor, armed with on-die Vega graphics, supports DirectX 12.1. This bolsters the theory that Intel’s Kaby Lake-G uses the Polaris graphics architecture, rather than Vega.

Where There’s Smoke, There’s…Rapid Packed Math

AMD introduced several new features with its Vega design, including the Next-Generation Geometry path and Draw-Stream Binning Rasterizer. Some of those capabilities are difficult to isolate and test for, so we instead focused on Vega’s Next-Generation Compute Unit (NCU) and its Rapid Packed Math feature.

The Vega NCU’s ability to pack two 16-bit operations into a 32-bit register is new, allowing 64 shaders to perform up to 256 16-bit ops per clock (or even 512 eight-bit ops per clock), versus 128 32-bit ops per clock. This can improve performance any time FP16 operations can be used instead of FP32 ops. Far Cry 5, one of the most recent AAA titles, supports this functionality, for example. In comparison, Polaris’ Compute Units do not support Rapid Packed Math. As a result, they handle FP16 operations at a rate similar to 32-bit ops. A simple test should tell us if Kaby Lake-G’s graphics engine contains Vega’s NCUs or Polaris’ older CUs.
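The per-clock figures follow from simple arithmetic, sketched below under the assumption that each shader issues one fused multiply-add per clock and that the multiply-add is counted as two operations. The function names are ours, not AMD’s, and the packing helper only illustrates how two FP16 values share one 32-bit word on the CPU; it does not model the GPU hardware.

```python
import struct
from typing import Tuple


def peak_ops_per_clock(shaders: int, op_bits: int, register_bits: int = 32) -> int:
    """Peak ops per clock when values are packed into 32-bit registers.

    Assumption: one fused multiply-add per shader per clock, counted as 2 ops.
    """
    lanes = register_bits // op_bits  # how many values fit in one register
    return shaders * 2 * lanes


def pack_two_halves(low: float, high: float) -> int:
    """Pack two FP16 values into one 32-bit word (first value in the low bits)."""
    return struct.unpack("<I", struct.pack("<2e", low, high))[0]


def unpack_two_halves(word: int) -> Tuple[float, float]:
    """Recover the two FP16 values stored in a 32-bit word."""
    return struct.unpack("<2e", struct.pack("<I", word))


if __name__ == "__main__":
    for bits in (32, 16, 8):
        print(f"{bits}-bit: {peak_ops_per_clock(64, bits)} ops/clock")
    word = pack_two_halves(1.5, -2.0)
    print(hex(word), unpack_two_halves(word))
```

On hardware without packed math, the lane count effectively stays at one regardless of data width, which is why FP16 and FP32 rates come out about the same.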

We used SiSoftware Sandra’s GPGPU Processing test to measure FP16 performance. It has to run in Direct3D 11 mode to correctly utilize FP16 operations.

We began our tests with a Radeon RX 470 (Polaris) to confirm that its FP32 and FP16 rates are similar. Indeed, they are. We also tested Radeon RX Vega 64 and 56, which returned a 64.2% and 65.8% improvement with FP16, respectively. Then we followed up with Ryzen 5 2400G and its integrated Vega 11 core. That engine yielded a 61.8% increase. Based on those experiments, it appears that the benchmark and API leverage AMD’s Rapid Packed Math functionality.

Interestingly, Intel’s Core i7-8809G and its Vega M GH engine responded more like the Polaris-based RX 470. What does that tell us? At least for now, it seems Rapid Packed Math isn’t enabled on Kaby Lake-G’s Radeon GPU. Since Intel leaves its HD Graphics 630 block enabled on Kaby Lake-G, we also tested that. An outcome of 585.22 MPix/s with FP32 operations and 868.21 MPix/s with FP16 demonstrates Intel’s own support for mixed data types.
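For reference, the percentage gains quoted in this section come from comparing the FP16 and FP32 throughput figures. The sketch below applies that calculation to the HD Graphics 630 numbers given above; the 25% cutoff in `looks_accelerated` is an arbitrary threshold we picked for illustration, not a value from Sandra, AMD, or Intel.

```python
def fp16_gain_percent(fp32_rate: float, fp16_rate: float) -> float:
    """Percentage improvement of FP16 throughput over FP32 throughput."""
    return (fp16_rate / fp32_rate - 1.0) * 100.0


def looks_accelerated(fp32_rate: float, fp16_rate: float,
                      threshold_percent: float = 25.0) -> bool:
    """Rough check for a meaningful FP16 speed-up (threshold is an assumption)."""
    return fp16_gain_percent(fp32_rate, fp16_rate) >= threshold_percent


if __name__ == "__main__":
    # HD Graphics 630 results from our Sandra run, in MPix/s.
    hd630_fp32, hd630_fp16 = 585.22, 868.21
    gain = fp16_gain_percent(hd630_fp32, hd630_fp16)
    print(f"HD Graphics 630 FP16 gain: {gain:.1f}%")
    print("FP16 acceleration apparent:", looks_accelerated(hd630_fp32, hd630_fp16))
```

A part whose FP16 and FP32 rates are nearly identical, as with the RX 470 and Vega M GH, would land near 0% and fail the check.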

Semi-Custom Means Semi-Custom

AMD’s semi-custom business provides other vendors with tailored solutions based on its IP blocks, which are “modular” components that can be used in a chip design. The company’s own GPUs employ these pieces, including memory controllers, interrupt handlers, system management controllers, and hardware-accelerated video encode/decode blocks. These subsystems go through their own development phases, but it isn’t uncommon for a single revision of one component or another to be used in several different graphics architectures.

Many of the semi-custom chip builds are an amalgamation of different features that can blur the line between established architecture code names. For instance, Intel’s Kaby Lake-G uses HBM2, otherwise unique to the Radeon RX Vega add-in cards. But it’s been suggested that Polaris could support similar functionality by reworking its memory controller. The Xbox One and PS4 Pro (both of which use custom AMD silicon) likewise incorporate a unique blend of resources for their intended purposes.

So, Intel may have selected a combination of logic for Kaby Lake-G that doesn’t match up exactly with AMD’s discrete cards. Or, as some theorize, it may have designed Kaby Lake-G before all of the Vega-class features were ready. Either way, our test isn’t the final word on whether Intel’s Kaby Lake-G is or isn’t a true Vega-based part. Our results don’t tell us, either, whether the graphics core on Intel’s AMD-equipped chips is capable of handling Rapid Packed Math. All we really know now is that the feature doesn’t appear to be working at the moment.

What Does It All Mean?

We reached out to Intel for comment. The company gave us the same response PC World received:

“This is a custom Radeon graphics solution built for Intel. It is similar to the desktop Radeon RX Vega solution with a high bandwidth memory cache controller and enhanced compute units with additional ROPs.”

Intel points to the HBM controller and additional ROPs as evidence that Kaby Lake-G wields a Vega-class Radeon graphics chip, also citing “enhanced compute units,” implying the NCU. Given the amount of customization that went into this design, it’s possible that Kaby Lake-G’s Radeon graphics are a hybrid with enough Vega functionality to justify Vega marketing.

It’s also important to remember that Intel and AMD approved the branding. These are both publicly traded companies that are averse to legal exposure. So, we imagine the naming convention had to pass quite a bit of scrutiny.

According to these preliminary test results, Kaby Lake-G’s Radeon GPU doesn’t currently feature Rapid Packed Math. This doesn’t change anything about the performance we observed in our launch coverage. In fact, application support remains limited. We’re simply using the capability, which is tied to AMD’s Next-Generation Compute Unit, as an indicator of Vega-class functionality on a Vega-branded chip.

In any case, this adds to a growing body of evidence that Kaby Lake-G’s Radeon graphics may be more Polaris than Vega. Truthfully, it’s all a bunch of branding buzzword bingo, anyway. And underlying architectures may not mean much to gamers beyond their correlation to performance. That said, we do want to see companies stay consistent with their messaging so consumers can be sure of what they’re getting. Transparency helps as well, and while we’re not quite pointing fingers here, both AMD and Intel could have done a better job explaining just what is (and isn’t) included in these intriguing new chips.