"When taking the geometric mean of the benchmarks for this article today, The Threadripper 3990X came out overall 26% faster than the dual Xeon Platinum 8280, which is a very nice accomplishment since such a configuration currently retails for $20,000 USD worth of processors alone."
See https://stackoverflow.com/questions/38029698/why-do-we-use-c... from a few years ago. The gist is that CPUs are more versatile in the work they can do, support larger scenes, and the render doesn't have to be in real time, so it's not the make-or-break comparison it would be for video games.
I'm not thinking of the specialized GPU parts for shading and mapping, but of the pure floating point hardware. With 3D you are doing lots and lots of floating point ops that should map easily onto what GPUs do best.
OTOH, GPUs are still harder to program than CPUs, and I can only imagine what the SFX crew would hear when they have to explain that the movie can't be released in the summer because the GPUs are too hard to write software for.
As Linux becomes more competitive with Windows from a user experience point of view, I wonder if superior support for these new crazy AMD processors might help give it a boost. I'm sure many graphics people would move (if only Linux had good support for applications like Photoshop).
I'm curious what performance would be like on a Linux host running a 128-thread Windows VM. I expect it to be worse than running Windows on the bare metal, but maybe not. I could see a series of deficiencies in Windows being eliminated with the Linux host -> Windows guest topology.
I assume that the issue with processor groups on Windows is that something somewhere is hardcoded to represent some kind of per-CPU state as bits in a machine word. Having some kind of hypervisor as an additional layer will not help with that.
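For context (a sketch only, not authoritative): the Windows limit is visible from user code, since affinity masks are one machine word, so a processor group tops out at 64 logical CPUs and anything beyond that needs group-aware APIs. A quick Windows-only Python/ctypes peek using kernel32's GetActiveProcessorGroupCount/GetActiveProcessorCount:

    import ctypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    ALL_PROCESSOR_GROUPS = 0xFFFF  # documented "all groups" sentinel

    groups = kernel32.GetActiveProcessorGroupCount()
    total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)

    for g in range(groups):
        # each group holds at most 64 logical CPUs (one affinity-mask word)
        print(f"group {g}: {kernel32.GetActiveProcessorCount(g)} logical CPUs")
    print(f"{groups} group(s), {total} logical CPUs total")

A 3990X with SMT on shows up as two groups of 64, which is exactly where software that isn't group-aware falls over.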
I have the 3970X and use it with Ubuntu. It's fun to have a lot of cores, but early on it highlighted some areas where open-source programs aren't assuming you're going to use that many cores (similar to what they talk about for Windows). So you see OOM errors with make, programs crashing because defaults are too low (such as creating more Postgres connections than the default limit allows), things like that.
Not taking anything away from those developers, just sharing my experience. When you do get something that works really well (the newest version of Julia makes it easy to write multithreaded code), it's hilarious running htop and seeing all 64 threads lit up green.
I've got a 256-thread machine under my desk at the moment, and I find myself frequently disabling threads because I'm out of memory.
I don't think people fully grasp that lots of applications' memory requirements scale with the number of threads. Just running gcc over the C files in a make build can burn half a GB per process; multiply that by the 8 cores most people have, and 16 GB of RAM still leaves plenty of memory for Firefox etc.
OTOH, 128 GB of RAM in a 256-thread machine is borderline out of RAM. Build a project where it's more like 2 GB of RAM per process (looking at a lot of recent big public Google projects) and I need more like half a TB plus a bunch of swap to keep the build from OOMing.
So at this point I might use that as a general rule: 2 GB per thread. That means you're looking at another ~$1.5k or so in RAM for this machine for most purposes.
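As a back-of-envelope (purely illustrative numbers, using the per-job figures guessed above rather than anything measured), sizing -j by RAM instead of by cores looks like this:

    import os

    ram_gib = 128            # installed RAM
    gib_per_job = 2          # assumed peak memory per compile job
    threads = os.cpu_count() or 1

    safe_jobs = min(threads, ram_gib // gib_per_job)
    print(f"{threads} threads, {ram_gib} GiB RAM -> make -j{safe_jobs}")
    # 256 threads at 2 GiB/job wants ~512 GiB; with 128 GiB you'd cap at -j64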
I second this. My experience with many-core machines (32-64 cores) has been kind of disappointing so far.
Yes, running "make -j" on these machines feels amazing at first, but you will soon either 1) run out of memory, or 2) notice that you don't need so many cores anyway due to build dependencies.
Hmm, my experience has been uniformly great; we just upgraded to 64-thread 3970Xs with 256 GB of RAM, and pretty much everything worked smoothly out of the box; I'm very pleasantly surprised by the lack of tuning necessary.
We did do the math and decided that for us 4 GB per thread is likely safe up front; 2 GB per thread might have been pushing it. Another reason to avoid the 3990X (for us) was the tricky scaling of the previous generation's 2990WX; we don't have faith that all of our code would run well on the more memory-channel-constrained part.
I mean, if you're just running a single fairly small build, sure, it's likely overkill...
It sure is fun to see the entire top half of htop being taken up by individual CPU bars, but this is another program whose assumptions have been overtaken by the current silicon: At some point it is not that helpful anymore and makes looking at the processes themselves harder.
Sadly the development of htop seems to have stalled, patches/PRs for that and many other issues remain unmerged and unanswered.
> It sure is fun to see the entire top half of htop being taken up by individual CPU bars,
You should have seen it with the Xeon Phi - the Phi could do 4 threads per core and the big ones had 72 of them, for a mind-blowing 288 threads. It was, however, much less forgiving than either of these beasts - L2 cache was Atom-sized and cache misses were not cheap.
IIRC, the SGI Origin 2800 could go up to 512 MIPS R14000 processors. It's one thread per socket but, with 512 sockets and a single memory image, it'd most likely end up crashing htop. ;-)
Yes, that was generally my point. Even where available code already takes advantage of multiple cores, my experience has been that these chips bring out latent assumptions about what hardware the code will be running on (whether that's OOM, UX, or whatever).
While most of us nerds may be geeking out over a 64-core CPU running at 3.4 GHz+, priced at $4K and beating its $20K Intel counterpart [0], the reality is that the market for these seems to be very small. If we look at recent AMD [1] and Intel [2] results, it is clear AMD isn't gaining much. 70% of the PC market is laptops, and the majority of the 30% desktop market is business use, which generally prefers Intel. The server market moves extremely slowly because of long-term contracts and other reasons.
I just wish they could market themselves better to CIOs / CTOs / CEOs rather than the prosumer, enthusiast market.
Or they could try to convince Apple to use it in the Mac; I am very certain the rest would follow.
Mobile growth was huge, and this was even with the older 12nm chips, and at best, mid-tier laptop models. This year I suspect will be even better as the 7nm 4000 series beats the Intel competition across the board, and as OEMs seem to be finally putting the AMD chips in higher-end models.
HP, Dell, Lenovo, Apple, ASUS: just these 5 vendors alone represent 70%+ of the market. And if you start adding smaller (comparatively speaking, in unit shipments) brands such as Samsung, Microsoft, Sony, Acer, and Toshiba, you are edging close to 90% of the market.
Those top 5 vendors have long-term relationships with Intel, so their AMD offerings seem very lacking, or exist purely as a play to gain more bargaining power with Intel. Not to mention Intel's consumer marketing is far better than AMD's.
Unless consumers (not us, but average consumers) react favourably, I don't think it will make much of a difference in terms of volume and unit shipments. It might do well in gaming laptops, but it seems most gamers prefer Nvidia.
AnandTech mentions a 512GB RAM limit for the 3990X - that would mean there are 64GB ECC UDIMM modules somewhere? All I could find were rare 32GB UDIMM modules at best, which caps RAM at 256GB instead...
I'm waiting to see if/when EPYC Milan is released, hoping for price cuts on Rome. I already have an SP3 board and a dual-CPU + VRM watercooling block. Some other bits on order are delayed because of the coronavirus, which is fine because I'd rather people be safe than fill orders for material stuff.
Say you created 32 dual-vCPU VMs with this, each with 2GB of RAM. How would they perform running simultaneously, versus each being a separate dual-core PC (like a Core 2 Duo)? I don't know very much about the performance differences, or if it's even possible to compare such a thing.
Probably not nearly as well, if they're all actually doing anything. You'd hit bottlenecks moving data around: the bus used to fetch memory isn't 32 times as wide, the memory isn't 32 times as fast, and the same goes for other resources.
You could in practice run quite a few VMs before you'd run into trouble, especially if most of them are idle at any given point.
That's some data I'd love to see.
32 1U servers all connected with fiber or 10GbE, vs. 32 VMs on the same host, vs. 32 k8s pods on the same node. I'm wondering whether the same test would scale up from 4 hosts/VMs/pods.
I don't even think there's a benchmark out there for such a test.
It's still much faster than any Core 2. It's faster than 2 cores of Sandy Bridge on my older 2950X, so I'd expect the 3990X to be even faster, probably beating older Skylake.
Yeah, I'm sure it's going to depend a lot on workload. If each one is in some crazy tight loop, CPU-bound without touching much of a memory working set at all, it should be faster.
Or more realistically if they're usual VMs and idle the vast majority of the time, that should be faster too.
VMs in general are more memory-bound than CPU-bound (exceptions for things like SQL servers, encoders, etc). Hypervisors are generally pretty good about spreading VMs across a pool of CPUs and grabbing whichever is idle at the time. You can manually set affinities to always use specific cores, but it's generally wasteful to do so.
One caveat (at least with how vSphere 5.x worked) is that the hypervisor has to schedule all of a VM's vCPUs at the same time in order for it to do work, even if some of the guest's CPUs are idle. For example, if I have a 4-core VM on a 6-core host, it has to wait for 4 of the 6 to be free before the VM gets to do anything. So sometimes VMs with fewer vCPUs can outperform ones with more for the same workload. Getting proper measurements of your loads (peak/avg CPU, memory, disk IOPS, etc.) is critical to a good migration.
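On the "manually set affinities" point above: hypervisors have their own vCPU schedulers for this, but just to illustrate what pinning means at the OS level, here's a Linux-only Python sketch using os.sched_getaffinity/os.sched_setaffinity (the core numbers are arbitrary):

    import os

    pid = 0  # 0 means "the calling process"
    print("allowed CPUs before:", sorted(os.sched_getaffinity(pid)))

    # pin this process to the first two logical CPUs; the scheduler will
    # only ever place it there, no matter how idle the other cores are
    os.sched_setaffinity(pid, {0, 1})
    print("allowed CPUs after: ", sorted(os.sched_getaffinity(pid)))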
What he did was take a single 32-core AMD CPU and use it to replace all the computers in his house, including gaming PCs. At around 00:54, he mentions that the cores on that CPU are not "weak cores".
The VMs, in addition to dealing with the normal VM overhead we all already understand, would have much lower (like an order of magnitude) bandwidth to main memory and a similarly smaller share of L3 cache, in both cases because those resources are shared across all 32 guests.
So the answer to the question depends heavily on how cache-resident the problem you are throwing at them is. They'd do great mining bitcoin and be a total disaster as memcached hosts. More typical workloads will be somewhere in the middle.
It depends on the target dual-core machine's spec. If all the VMs are running full or heavy tasks, you might face issues with the scheduler. But I believe that with a lot of tuning on the host (NUMA awareness, CPU passthrough, pinning) and light guests, you could get close to those numbers with a reasonable workload.
I've not gotten my hands on a TR3. But Rome gives you the option of 1, 2 or 4 NUMA nodes per socket (NPS). Various benchmarks (e.g. STREAM) published by vendors such as Dell show better performance in NPS4 mode. My own testing on Netflix workloads shows a similar speedup.
I run my desktop (2990WX) in NUMA mode, exposing the real topology to the scheduler, and I find that it seems to help compilation times.
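If you want to check what topology the kernel is actually exposing (NPS1 vs NPS4 shows up as 1 vs 4 nodes), here's a quick Linux-only sketch reading the standard /sys/devices/system/node layout:

    import glob
    import pathlib

    for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
        name = pathlib.Path(node).name
        cpulist = pathlib.Path(node, "cpulist").read_text().strip()
        memtotal = pathlib.Path(node, "meminfo").read_text().splitlines()[0].strip()
        print(f"{name}: cpus {cpulist} | {memtotal}")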
> I feel an intense desire to learn high performance computing to take advantage of this thing.
Learning high performance computing will let you take advantage of other things too. Like, for example, current desktop workstation processors with "just" four cores.
On the other hand, writing and debugging a program that uses 4 of 4 cores will help you feel confident enough to write and debug a program that can use 32 of 32.
Elixir, Erlang, and Julia make using your system's SMP easy and natural. Go (you have to set up a bunch of things like channels and maybe worry about cleaning them up) and Java work too, if you don't mind a bit of a struggle.
If you want low level you can use C or C++ (which are both dangerous), Rust, or Zig (I just did some multithreaded stuff in Zig and it's fantastically easy).
1) A bit about what I'm doing with Zig. I am working on an FFI interface between Elixir and Zig, with the intent of letting you write Zig code inline in Elixir and have it work correctly (it does): https://github.com/ityonemo/zigler/. Arguably with zigler it's currently easier to FFI a C library than it is from C (I'm planning on making it even easier; see the example in the readme).
2) The specific not-in-master-branch feature I'm working on now is running your Zig code in a branched-off thread. Fun fact about the Erlang VM: if you run native code, it can throw the scheduler out of whack if the code runs too long. You can run it as a "dirty" NIF, but the system restricts how many of those you can run at any given time. A better choice is to spawn a new OS thread, but that requires a lot of boilerplate and is probably easy to get wrong. Making it a comprehensive part of Erlang's monitoring and resource safety system is also challenging, so there's a lot to do to keep it in line with Zigler's philosophy of making correctness simple.
3) Zig does have its own, opinionated way of doing concurrency. I honestly find it to be a bit confusing, but it's new (as of 6 months ago) and not well documented. I believe the design constraints are guided by "not having red/blue functions" and "being able to write concurrent library code that is safe to run on non-threaded/non-threadable systems".
4) The native Zig way of doing concurrency is incompatible with exporting to a C ABI (without a shim layer), so I prefer not to use it anyway.
5) Zig ships with std.thread. I believe it's in the stdlib and not the language because some systems will not support threading. But since I'm writing something intended to bind into the Erlang VM (BEAM), it's probably on a system that supports threading. Also, I believe std.thread will seamlessly pick either pthreads or not-pthreads based on the build target, which makes cross-compiling easy.
6) So yes, figuring this all out is not easy (Zig is young, the docs are not mature), but once you figure out what you're supposed to do, the actual code itself is a breeze. This is the code I use to pack the information connecting the BEAM to a Linux thread and launch it: https://github.com/ityonemo/zigler/blob/async/lib/zigler/lon.... I really hope the docs get guides that will make this easy in the near future.
I'm relearning C++ right now just because I am building a poker solver as a toy project.
You are right, btw: if 32-core CPUs become common because of a race to the bottom in prices, I imagine there will be a massive increase in demand for programmers with experience programming massively parallel systems.
I'm enjoying the Rust ecosystem. Everyone writes programs with multi-threading in mind, because the language requires everything to be thread-safe anyway.
It only requires thread-safety for stuff that's actually being shared across threads. Which is even more important since it means you're not paying for thread safety via reduced performance where it isn't needed.
Yep, and with the ease of concurrency in Go and Rust, it'll be freakin' awesome. And hopefully, we'll get some novel security research in areas dealing with attacks against concurrency and parallel execution.
...but without a sentence or three about the reasons you think Rust is more suited for parallel workloads you are just giving Rust users a bad name again (see Rust Evangelism Strikeforce).
That's completely fair. Rather than take on why it handles parallel workloads better, I tend to take a different approach with Rust and parallelism: I enjoy the compiler and, for parallel workloads, the elimination (well, reduction, let's face facts) of safety concerns and data races. It isn't that these same things can't be done, and done as well, in other languages, but with Rust the tool chain does it out of the box, without extensions. Now, there is a learning curve and a different programming paradigm (though not as radically different as I was warned), which I happen to enjoy. It won't be for everyone. No language is. The resources are there and free (online) and I do encourage people with some spare time to give it a spin, but I don't think it's the end of the world if people don't want to =)
That being said, I like it, but I tend to use Python more. I wish I had more of a chance to use Rust in my daily life, but I don't use it at work =/
Python is just about the worst major programming language to write high-performance code in. The only slower major language is Ruby.
If you're using Python to invoke highly-optimised native-code, then your performance will be excellent (as shown by the various Python numerical libraries), but performance-sensitive code shouldn't run in the Python interpreter.
As others have said, Python also lacks true multithreading (its threads are capable of concurrency but not parallelism, on account of the GIL), but you do have the option of just running a bunch of Python processes in parallel. I imagine that's a workable solution at least some of the time, but I've never explored this, so I don't know how good the library support is.
Edit: Someone else mentioned 'mpi4py' which seems to be a Python library for multi-process work.
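To see the GIL effect for yourself, here's a minimal sketch (a toy CPU-bound function, not a benchmark; timings will vary by machine). The thread pool runs roughly serially, while the process pool scales across cores:

    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def burn(n):
        # pure-Python CPU-bound work: holds the GIL the whole time it runs
        total = 0
        for i in range(n):
            total += i * i
        return total

    def timed(executor_cls, workers=8):
        start = time.perf_counter()
        with executor_cls(max_workers=workers) as ex:
            list(ex.map(burn, [5_000_000] * workers))
        return time.perf_counter() - start

    if __name__ == "__main__":  # guard needed for the process pool on some platforms
        print("threads:  ", timed(ThreadPoolExecutor))   # ~serial due to the GIL
        print("processes:", timed(ProcessPoolExecutor))  # parallel across cores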
Or Python with mpi4py. MPI is the perfect multiprocessing paradigm for parallel Python code, since you avoid the GIL. You can easily use MPI on a single workstation, or scale it to run on a supercomputer.
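For anyone curious, the mpi4py version of "use every core" is about this small (a sketch assuming an MPI implementation plus mpi4py are installed; launch with something like mpirun -np 64 python script.py):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's id, 0..size-1
    size = comm.Get_size()   # total number of MPI processes

    # each rank is its own interpreter/process, so there is no GIL contention
    local = sum(i * i for i in range(rank, 10_000_000, size))
    total = comm.reduce(local, op=MPI.SUM, root=0)

    if rank == 0:
        print(f"{size} ranks, total = {total}")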
Python with asyncio and multithreading doesn't take good advantage of multiple cores due to the global interpreter lock. With multiprocessing, one pays big costs in IPC.
Python is not at all suitable today for parallelism. Which is one reason why languages like Go and Elixir are gaining so much traction.
Subinterpreters share a common GIL. Also, subinterpreters don't share Python objects, which means among other things that all modules are imported separately in each subinterpreter, which increases startup time and memory usage and reduces cache effectiveness.
It's a band-aid. If you want to run Python code in parallel, without large overhead, then CPython is simply not your environment to do so, and Python is not a good choice overall in that kind of endeavour.
Elixir uses a slow, interpreted VM; you can't put it in the same high-performance category. And Go, Java, C++ and Rust can't do well on massively multicore systems on their own: you would have to fight them to do it, ditch idiomatic approaches, and effectively build your own runtime and your own concurrency model (not shared-memory multithreading!). So they are more like every other language that sits somewhat close to the primitives the OS and hardware provide, not actually well suited to the job itself.
Elixir is compiled. And as someone who worked in HPC, I would really not call Go high performance (by the criteria people in HPC use for "high performance").
It's not like they're intentionally broken; they just have whatever bugs come with being the first revision. These are the units they send out to journalists and big companies to run benchmarks, test software, plan large-scale rollouts, etc. Having them work the same as the retail units is pretty essential for that purpose.
That said, one of my units has a clock speed that doesn't match any of the retail models (I guess they didn't end up selling that model?), and another doesn't seem to work with threading (or whatever AMD calls it) enabled. But that's a small price to pay for the money saved.
The 8 memory channels of EPYC versus the 4 of Threadripper that the article mentions are not just about how much memory you can install, but about memory bandwidth. An L3 cache miss should be roughly twice as expensive on the desktop platform as on the server one.
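Back-of-envelope on the bandwidth side (assuming DDR4-3200 on both platforms; real sustained numbers are lower, but the 2x ratio is the point):

    # theoretical peak = channels * transfers/s * 8 bytes per 64-bit channel
    MT_PER_S = 3200e6          # DDR4-3200
    BYTES_PER_TRANSFER = 8

    for name, channels in [("Threadripper 3990X (4ch)", 4), ("EPYC Rome (8ch)", 8)]:
        gb_s = channels * MT_PER_S * BYTES_PER_TRANSFER / 1e9
        print(f"{name}: ~{gb_s:.0f} GB/s peak")
    # ~102 GB/s vs ~205 GB/s: the same 64 cores fighting over half the bandwidth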
Except that in the '90s you didn't have hardware from 15 years prior that was still generally usable for baseline tasks. Nobody was running Linux on their old ZX Spectrums or Commodore PETs! In fact, running an up-to-date Linux with mainstream features (a GUI) was only for relatively high-end hardware.
Unfortunately, I'd be hesitant to call a whole lot of hardware from 15 years ago "generally usable for baseline tasks", either. Even running just a web browser, a 2005-era machine is gonna struggle quite a bit on modern websites with a modern browser and underlying OS.
Wow, we're getting close to the amount of cores/threads that Xeon Phis boasted, but with actual per-core performance. And on an (almost) mainstream platform!
https://www.phoronix.com/scan.php?page=article&item=3990x-th...
"When taking the geometric mean of the benchmarks for this article today, The Threadripper 3990X came out overall 26% faster than the dual Xeon Platinum 8280, which is a very nice accomplishment since such a configuration currently retails for $20,000 USD worth of processors alone."