Test Results for AMD Ryzen (agner.org)
92 points by matt_d on May 2, 2017 | 8 comments


Interesting:

> The gain in total performance that you get from running two threads per core is much higher in the Ryzen than in Intel processors because of the higher throughput of the AMD core


I've seen the throughput argument being made continuously for the past decade, including in academic papers where the Opteron's higher throughput has been shown to beat Xeons in HPC applications.

Another interesting aspect is that in HPC applications, where floating-point performance matters most, only about 1 in 7 operations is actually a floating-point operation. The remaining 6 operations are there to move data around, including requests to push data down the cache hierarchy. That's one of the reasons why the performance of AMD's Bulldozer and Piledriver lines scaled practically linearly with core count, in spite of each floating-point unit being shared between a pair of CPU cores.
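To put a rough number on that, here is a back-of-the-envelope sketch. It assumes a deliberately crude model that is mine, not the papers': each core issues one operation per cycle, and the shared FPU can accept one floating-point operation per cycle. The function name `shared_fpu_load` is hypothetical.

```python
from fractions import Fraction

def shared_fpu_load(cores_sharing_fpu, fp_fraction):
    """Fraction of the shared FPU's issue slots that the cores demand.

    Assumption (illustrative only): every core issues one operation per
    cycle and the FPU accepts one floating-point operation per cycle.
    """
    return cores_sharing_fpu * fp_fraction

# Two cores per FPU, 1 in 7 operations is floating point.
load = shared_fpu_load(2, Fraction(1, 7))
print(load)  # 2/7
```

Under that model the shared unit is asked for well under half of its slots, which is consistent with the sharing not showing up as a bottleneck. Real workloads are burstier than this, so treat it as intuition only.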

Consequently, HPC research tends to focus on strategies that minimize the amount of data being moved around and the number of cache misses, or that take advantage of technology with higher throughput, such as GPGPU. As AMD's Ryzen offers greater throughput, performance also increases.


So glad AMD's back.


We should all be, no matter what your preference is: ultimately it means lower prices (and better motivation for further development).


What is this about time measurement? Can you get frequency invariant nanosecond resolution timestamps from Ryzen with a single instruction like you can with Intel?


Intel's "real clock" is the `TSC` (time stamp counter), which counts processor cycles. Sandy Bridge (and up, I think) ensures the `TSC` rate never fluctuates but stays constant (at roughly the clock speed quoted by CPUID). This _real clock_ is generally referred to as `RDTSC`, even though that is just the instruction that reads the `TSC`.

Some rumblings from Intel suggest they may discontinue this. Either way this is addressed

`APERF` seems to be AMD's version of `RDTSC`, but reading it requires the code to execute in Ring-0. So it sounds like Agner was building a kernel module to wrap the existing test suite.


From the article I think that's backward. AMD's RDTSC is a constant-rate cycle counter and APERF counts actual cycles. So with clock boosting from XFR, etc, APERF will increase while RDTSC will not.
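To make the difference concrete, here is a minimal sketch of how the two counters could be compared once you have readings of both. The function name `boost_ratio` and the sample numbers are hypothetical; actually reading `APERF` needs Ring-0 access and is not shown. It also assumes the core stays busy for the whole interval, since as far as I know `APERF` does not advance while the core is halted.

```python
COUNTER_MODULUS = 2 ** 64  # both counters are treated as 64-bit values

def boost_ratio(tsc_start, tsc_end, aperf_start, aperf_end):
    """Average actual clock relative to the constant TSC rate.

    A result above 1.0 means the core ran faster than the TSC rate
    (boosting); below 1.0 means it ran slower (throttled or idle).
    """
    # Modular subtraction keeps the deltas correct across a wraparound.
    tsc_delta = (tsc_end - tsc_start) % COUNTER_MODULUS
    aperf_delta = (aperf_end - aperf_start) % COUNTER_MODULUS
    if tsc_delta == 0:
        raise ValueError("TSC did not advance; interval is empty")
    return aperf_delta / tsc_delta

# Hypothetical readings: the TSC advanced by 3.6M ticks while APERF
# advanced by 4.0M cycles, i.e. the core boosted above the TSC rate.
ratio = boost_ratio(1_000, 3_601_000, 5_000, 4_005_000)
print(f"{ratio:.3f}")
```

Multiplying the ratio by the TSC frequency would give an estimate of the average core clock over the interval.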


So the issue is that you can't get an invariant TSC without Ring-0 access? I assumed AMD already had invariant TSC before Ryzen that was accessible from userspace. Certainly Intel has had that since Nehalem if I recall. It just seems an odd thing to be hard considering how important it is.

I never thought about it being a cycle count since I usually access via a wrapper. I guess the wrappers are converting from cycles to nanoseconds.
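For what it's worth, the conversion such a wrapper does is presumably something like the sketch below. This is a guess at the arithmetic, not any particular library's code: `ticks_to_ns` is a hypothetical name, and `tsc_hz` (the constant TSC rate) is assumed to have been measured or looked up elsewhere.

```python
NS_PER_SECOND = 1_000_000_000

def ticks_to_ns(ticks, tsc_hz):
    """Convert a TSC tick delta to nanoseconds, given the TSC rate in Hz.

    Only valid when the TSC is invariant, i.e. ticks at a constant rate
    regardless of the core's current clock. Integer arithmetic is used
    so large tick counts do not lose precision.
    """
    if tsc_hz <= 0:
        raise ValueError("tsc_hz must be positive")
    return ticks * NS_PER_SECOND // tsc_hz

# Hypothetical 3.6 GHz TSC: 36 ticks is 10 ns.
print(ticks_to_ns(36, 3_600_000_000))
```

Real wrappers usually precompute a multiplier and shift rather than dividing on every call, but the idea is the same.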



