  • reply to Raymond's post #41486

    AMD counts it differently, because it calls it a Dual Compute Unit and counts the whole unit as 2 CUs. And it compares only half of the DCU to a GCN CU, not the whole DCU. I rewrote the table according to AMD's terminology.

    [ Edited ]

    RIOS is extremely user-friendly, it is just selective about its friends.

  • #82819712

    deleted member

    reply to Raymond's post #41486

    The RDNA Compute Unit sees the bulk of AMD's innovation. Groups of two CUs make a "Dual Compute Unit" that shares a scalar data cache, a shader instruction cache, and a local data share. Each CU is now split into two SIMD units, each with 32 stream processors, a vector register file, and a scalar unit. This way, AMD doubled the number of scalar units on the silicon to 80, twice the CU count. Each scalar unit is similar in concept to a CPU core and is designed to handle heavy, indivisible scalar workloads. Each SIMD unit has its own scheduler. Four TMUs are part of each CU. This massive redesign of the SIMD and CU hierarchy achieves a doubling of scalar and vector instruction rates, and resource pooling between every two adjacent CUs.
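
    As a quick sanity check on the counts above, here is a minimal Python sketch, assuming the 40-CU configuration implied by the "80 scalar units, double the CU count" statement; the constant names are illustrative, not AMD terminology.

```python
# Back-of-the-envelope tally of the CU hierarchy described above.
SIMDS_PER_CU = 2          # each CU is split into two SIMD units
LANES_PER_SIMD = 32       # 32 stream processors per SIMD
SCALAR_UNITS_PER_CU = 2   # one scalar unit per SIMD
TMUS_PER_CU = 4
CUS_PER_DCU = 2           # a Dual Compute Unit groups two CUs

cu_count = 40             # assumed: follows from "80 scalar units, double the CU count"

stream_processors = cu_count * SIMDS_PER_CU * LANES_PER_SIMD
scalar_units = cu_count * SCALAR_UNITS_PER_CU
tmus = cu_count * TMUS_PER_CU
dcus = cu_count // CUS_PER_DCU

print(f"{dcus} DCUs, {stream_processors} stream processors, "
      f"{scalar_units} scalar units, {tmus} TMUs")
# -> 20 DCUs, 2560 stream processors, 80 scalar units, 160 TMUs
```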

    Groups of five RDNA dual compute units share a primitive unit, a rasterizer, 16 ROPs, and a large L1 cache. Two such groups make a Shader Engine, and the two Shader Engines meet at a centralized Graphics Command Processor that marshals workloads between the various components, alongside a Geometry Processor and four Asynchronous Compute Engines (ACEs).
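
    The same kind of tally works for this grouping; the sketch below just multiplies out the stated 5 DCUs per group, 2 groups per Shader Engine, and 2 Shader Engines.

```python
# Multiplying out the grouping: 5 DCUs per group, 2 groups per Shader Engine,
# 2 Shader Engines, 16 ROPs per group.
DCUS_PER_GROUP = 5
ROPS_PER_GROUP = 16
GROUPS_PER_SHADER_ENGINE = 2
SHADER_ENGINES = 2

groups = SHADER_ENGINES * GROUPS_PER_SHADER_ENGINE
dcus = groups * DCUS_PER_GROUP
print(f"{groups} groups -> {dcus} DCUs ({dcus * 2} CUs), {groups * ROPS_PER_GROUP} ROPs")
# -> 4 groups -> 20 DCUs (40 CUs), 64 ROPs
```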

    The second major redesign "Navi" features over previous generations is the cache hierarchy. Each RDNA dual-CU has a local fast cache AMD refers to as L0 (level zero). Each 16 KB L0 unit is made up of the fastest SRAM and cushions direct transfers between the compute units and the L1 cache, bypassing the compute unit's I-cache and K-cache. The 128 KB L1 cache shared between five dual-CUs is a 16-way block of fast SRAM cushioning transfers between the shader engines and the 4 MB of L2 cache.
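
    A similar sketch for the cache hierarchy, assuming the 20 DCUs and 4 L1 blocks that follow from the grouping in the previous paragraph:

```python
# Cache capacities per level, using the sizes quoted above.
L0_KB_PER_DCU = 16      # one 16 KB L0 per dual-CU
L1_KB_PER_GROUP = 128   # one 128 KB L1 per group of five dual-CUs
L2_KB = 4 * 1024        # 4 MB of L2 for the whole GPU

dcus, groups = 20, 4    # assumed from the grouping above

print(f"L0 total: {dcus * L0_KB_PER_DCU} KB, "
      f"L1 total: {groups * L1_KB_PER_GROUP} KB, L2: {L2_KB} KB")
# -> L0 total: 320 KB, L1 total: 512 KB, L2: 4096 KB
```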

    In all, RDNA helps AMD achieve a 2.3x gain in performance per area and a 1.5x gain in performance per Watt.

    The caches work with direct data forwarding. By the way, this data sharing in the dual compute unit reminds me of how Bulldozer works :)
