  • S_x96x_S

    veteran member

    in reply to Petykemano's post #3555

    > Can the reason for the antipathy be summed up in a few sentences?

    As I read it, his biggest problem is fragmentation: the large number of AVX-512 variants
    https://en.wikichip.org/wiki/x86/avx-512#Implementation
    which are hard to support. And compared with ARM's SVE2, AVX-512 looks like a kludge.
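To make the support problem concrete, here is a minimal sketch of what software ends up doing: it cannot just ask "is AVX-512 there?", it has to check that every subset it needs is present. The flag names follow the Linux /proc/cpuinfo spelling; the three flag lists, `KERNEL_NEEDS` and both helper functions are made up for illustration and do not describe real CPU models or any real library.

```python
# Sketch only: the flag lists below are invented examples, not real CPU models.

def avx512_subsets(flags_line):
    """Return the AVX-512 feature flags found in a space-separated flag list."""
    return {flag for flag in flags_line.split() if flag.startswith("avx512")}


def can_run(required, flags_line):
    """True only if every required AVX-512 subset is advertised."""
    return set(required) <= avx512_subsets(flags_line)


# Hypothetical flag lists for three machines (illustrative only).
CPU_A = "fpu sse2 avx avx2 avx512f avx512cd"
CPU_B = "fpu sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
CPU_C = "fpu sse2 avx avx2"

# A hypothetical kernel that needs three subsets at the same time.
KERNEL_NEEDS = ["avx512f", "avx512bw", "avx512vl"]

if __name__ == "__main__":
    for name, flags in [("A", CPU_A), ("B", CPU_B), ("C", CPU_C)]:
        print(name, sorted(avx512_subsets(flags)), can_run(KERNEL_NEEDS, flags))
```

In this sketch machine A does advertise some AVX-512, yet the kernel still cannot run on it, so a separate fallback path has to be written and tested anyway.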

    ARM's SVE2 arrived late, but it is more carefully thought out than Intel's improvisation, and it scales better: from mobile phones up to ARM-based HPC there is one instruction set that can be used anywhere.
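The "one instruction set anywhere" idea can be sketched roughly like this: the loop never hard-codes the vector width, and a per-lane predicate handles the tail. This is a plain Python emulation, not SVE code; `vla_add` and the `vl` parameter (lanes per vector) are hypothetical names, and the predicate is only similar in spirit to what SVE's WHILELT instruction produces.

```python
# Scalar emulation of a vector-length-agnostic loop (an illustration, not SVE).

def vla_add(a, b, vl):
    """Add two equal-length lists in chunks of `vl` lanes, with a predicated tail."""
    if vl < 1:
        raise ValueError("vl must be at least 1")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    n = len(a)
    out = [0.0] * n
    i = 0
    while i < n:
        # Predicate: lane k is active only while i + k is still inside the array.
        pred = [i + k < n for k in range(vl)]
        for k in range(vl):
            if pred[k]:
                out[i + k] = a[i + k] + b[i + k]
        i += vl  # step by whatever the vector length happens to be
    return out


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    b = [10.0] * 7
    for vl in (2, 4, 16):
        print(vl, vla_add(a, b, vl))
```

The same function gives the same result for a narrow `vl` (think phone) and a wide one (think HPC), with no separate tail loop and no per-width code path, which is the property being praised above.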

    He explained it in more detail in a later message:

    -------------------------------
    "Now, that said, do I hate MMX/SSE/AVX/AVX2 with the same burning passion as AVX512? No. Because there's a big difference between them.

    MMX/SSE was a first-attempt (plus fixes). The i387 was a particularly nasty thing to be compatible with anyway, it's entirely understandable why it was done the way it was done. In hindsight, maybe it could have been done better, but a "in hindsight" argument is always complete BS. So that's not a valid argument. MMX/SSE was fine.

    AVX/AVX2 were reasonable cleanups and honestly, I don't think 256 bits is a huge pain even as a baseline. And Intel has been good about keeping AVX always there. Afaik, new CPU's really have gotten AVX reliably. So it hasn't been a fragmentation issue, and while I think it has the same state dirtying issue ("helper function using MMX instructions and saves/restores the instructions it modifies will be clearing upper bits in AVX registers and trashing state"), I think it was a fairly reasonable extension.

    So again, AVX/AVX2 was fine. Was it "lovely"? No. But I think it's a reasonable baseline.

    So what's different with AVX512?

    One fundamental difference is that fragmentation issue. It came up before AVX512 was even out, with the failed multi-core Knights atoms having a completely different versions. But it's really been obvious lately, with even today, in CPU's being sold, it being a "marketing feature".

    But the other - and to me really annoying - fundamental issue is "by now, you should have damn well have learnt from your mistakes".

    Here, look at the real competition for Intel and x86 long-term: ARM. They had an equally disgusting and horrendously bad FPU situation originally. Yes, their FPU situation was differently bad from the i387, but the whole soft-FP vs VFP vs random other implementations was arguably worse than Intel ever had, even if at the time, you would find the usual ARM fanbois that made excuses for just how horrendous the situation was.

    But then ARM got their act together, and NEON happened. I'd say that was roughly the equivalent to SSE, because I'll call the original mess of nasty shit comparable to the nofp/i387/IBM-mis-wiring-the-exception-pin/MMX era. The timing may not line up, but with NEON, ARM at least had gotten rid of their messy lack of standards, and I think it's fair to compare it to Intel and SSE conceptually.

    So ARM did SVE, and I'll call that their AVX/AVX2. But now you see signs of differences. Part of it is just the name. "S" for "Scalable". ARM is starting to do something interesting and fundamentally different from what AVX was for Intel.

    And then ARM designed SVE2, and again, let's see how it actually plays out in real life, but I think it has the potential to be their "AVX512 done right". And they designed it to have a reasonable downgrade/upgrade path, to be extensible, to do that masking and memory accesses etc that is so important for compilers to auto-parallelize.

    Honestly, if I were into HPC and vectorization, I'd be all in on the ARM bandwagon.

    As it happens, I'm not into HPC and vectorization, and it's possible that exactly because I'm not into it, I'm missing why SVE2 has some horrible problems. And I realize that AVX512 does some things that a very very very small minority of people care deeply about (I don't know why, but some people really love the shuffle instructions and will put up with absolutely anything if they get them).

    So just as a bystander, I'm looking at AVX512, and I'm looking at SVE2, and I'm going "AVX512 really is nasty, isn't it"?

    And by now it's the third big generation, and the "it wasn't clear what the right answer was" is no longer an excuse for doing things wrong. People knew that scaling up and down the CPU stack was an issue. This wasn't something where Intel couldn't have seen it coming - when Intel was designing AVX512, Intel was still trying to also enter the smartphone and IoT area.

    Have I sufficiently explained why I absolutely despise AVX512?

    And yes, maybe in five years, AVX512 is there everywhere and my fragmentation argument goes away.

    But maybe in five years, SVE2 is everywhere too, and is happily working in cellphones and in supercomputers, and I think I won't be the only person in the room that says "AVX512 is a butt-ugly disgrace".

    We'll see, even if it might take years. I'm happy to be proven wrong.

    And I'm here for the heated technical discussion anyway. Tell me why I'm a pinhead and a nincompoop, and why SVE2 is so bad, and why AVX512 is clearly better.

    Because this forum is about architecture design and implementation, isn't it? So I think it's very fair to put down that gauntlet: AVX512 vs SVE2. "Gong plays" - FIGHT!

    Linus"
    https://www.realworldtech.com/forum/?threadid=193189&curpostid=193248

    Motto: "Competition is good!"
