Oliverda
Topic owner
"Interesting to note are the "two tightly linked cores" sharing resources. The shared FPU (which seems to be capable of 1x256-bit FMAC or 2x128-bit FMAC/FADD/FMUL in any combination per cycle) was proposed many years ago by Fred Weber (AMD's CTO at that time). He already said that two cores might share an FPU sitting between them. The whole thing is a CMT-capable processor, as speculated before. And if we look at the core counts of Bulldozer-based MPUs, we should remember that 2 such cores are accompanied by 1 FPU, so an 8-core Zambezi actually contains 4 of the blocks shown on the Bulldozer slide.
Further, the 4 int pipelines per core/cluster aren't detailed, while for Bobcat they are. In the latter case we see 2 ALU + 1 Store + 1 Load pipes. For BD I still think that we'll see 2 ALU + 2 AGU (combined Load/Store) pipes. Those "multi mode" AGUs would simply be a better fit for achieving higher bandwidth and more flexibility, because the FPU will also make use of these pipes. BTW, compared to the combined 48B/cycle L1 bandwidth of Sandy Bridge (which could be used e.g. as 48B load bandwidth), we might only see 32B/cycle L1 bandwidth per core but up to 64B/cycle combined L1 bandwidth per FPU (although shared by 2 threads). Finally, nobody knows the clock speeds of these processors, so no real comparison is possible right now.
Today rumour site Fudzilla posted some rumoured details of BD, e.g. DDR3-1866 compatibility, 8 MB L3 (for 8 cores) and "APM Boost Technology". A German site even mentioned some mysterious patents pointing in the same direction... Well, given the time frame and filing dates, I see both a chance for simple core-level overclocking like Intel's Turbo Boost, because this is covered in AMD patents, and a chance for more complex power management on a core component level, as described in my second-to-last blog posting."
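The module and bandwidth arithmetic in the quote above can be written out as a minimal sketch. All figures are the quoted poster's speculation, not confirmed specifications, and the constant and function names are hypothetical, chosen only for illustration.

```python
# Figures come from the quoted post and are speculative, not confirmed specs.
CORES_PER_MODULE = 2                   # two integer cores share one FPU
L1_BYTES_PER_CYCLE_PER_CORE = 32       # speculated Bulldozer L1 bandwidth per core
SANDY_BRIDGE_L1_BYTES_PER_CYCLE = 48   # combined L1 bandwidth quoted for Sandy Bridge


def modules_for(cores: int) -> int:
    """Number of two-core blocks (each with one shared FPU) for a core count."""
    if cores <= 0 or cores % CORES_PER_MODULE != 0:
        raise ValueError("core count must be a positive multiple of 2")
    return cores // CORES_PER_MODULE


def l1_bandwidth_per_fpu() -> int:
    """Combined L1 bytes/cycle that the shared FPU could draw from both cores."""
    return CORES_PER_MODULE * L1_BYTES_PER_CYCLE_PER_CORE


if __name__ == "__main__":
    print("8-core Zambezi modules:", modules_for(8))
    print("L1 bytes/cycle per core:", L1_BYTES_PER_CYCLE_PER_CORE)
    print("L1 bytes/cycle per FPU:", l1_bandwidth_per_fpu())
    print("Sandy Bridge L1 bytes/cycle:", SANDY_BRIDGE_L1_BYTES_PER_CYCLE)
```

As the quote stresses, these are per-cycle numbers only; without clock speeds they say nothing about absolute throughput.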
"Take the current decoders and add a fourth. This would work most of the time but would work better if you add another small buffer between the predecode and decode stages or if you increase the depth of the pick buffer. Then you continue with 4 pipelines to the reorder buffer and down to four dispatch and four ALU/AGU units. You can't break these apart without redesigning the front end decoder and the dispatch units.
The FMAC units are now twice as fast so you really don't need more than two per pair of cores. Besides, you can't keep the same ratio while doubling the speed without busting your thermal limits. Even at that, the L1 Data bus is too small. So, you double the width of the L2 cache bus and get your SSE data directly from there. This leaves the existing L1 Data caches to service the Integer units. That increases the data bandwidth by 50%. The current limit is two 128-bit loads per core. If we allow the same for FP then we have six 128-bit loads per core pair. The current limit is two 64-bit saves per core. They could leave this unchanged on the integer side and beef up the FP unit to allow two 128-bit loads, one 128-bit load plus one 128-bit save, or two 128-bit saves. That would give the FP sufficient bandwidth. The front end doesn't really have to be changed since current FP instructions are already sent to another bus. It's a good question whether each core gets one FMAC all to itself or whether they can intermingle. Either would be possible since threading the FP pipeline would only require extra tags to tell the two threads apart. Two threads would also break some dependencies and partially make up for the extra volume due to tighter pipeline packing and fewer stalls.
Presumably, widening the pipelines to four would give a good boost to Integer performance. I suppose 33% would be the max unless they beef up the decoders, but 20% would be enough. I'm guessing they would also change the current behavior of the decoders, which now tend to stall when decoding complex instructions, and would probably reduce the decode time on more of the Integer instructions.
The only reason for a complete redesign would be if the architecture extensions can't fit into the thermal limits."
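The percentages in the second quote follow from simple ratios, sketched below. The sketch assumes, as the quote does, a current design with three decode pipelines and two 128-bit loads per core, plus two extra 128-bit loads for the shared FP unit fed from L2. The names are hypothetical and the numbers are the poster's guesses, not measured data.

```python
# Assumptions taken from the quoted speculation, not from any official spec.
LOAD_BITS = 128
CURRENT_LOADS_PER_CORE = 2     # "two 128 bit loads per core"
CORES_PER_PAIR = 2
EXTRA_FP_LOADS_PER_PAIR = 2    # the shared FP unit gets the same two loads via L2
CURRENT_PIPELINES = 3          # "take the current decoders and add a fourth"
WIDENED_PIPELINES = 4


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) * 100.0 / old


def loads_per_pair(with_fp_path: bool) -> int:
    """128-bit loads per cycle for a core pair, with or without the FP path."""
    loads = CURRENT_LOADS_PER_CORE * CORES_PER_PAIR
    if with_fp_path:
        loads += EXTRA_FP_LOADS_PER_PAIR
    return loads


if __name__ == "__main__":
    before, after = loads_per_pair(False), loads_per_pair(True)
    print("loads per pair:", before, "->", after)
    print("load bits per cycle:", before * LOAD_BITS, "->", after * LOAD_BITS)
    print("bandwidth gain %:", percent_increase(before, after))
    print("max integer gain %:", percent_increase(CURRENT_PIPELINES, WIDENED_PIPELINES))
```

Going from four to six loads per pair gives the 50% figure, and going from three to four pipelines gives the 33% upper bound, which would only be reached if the decoders could keep all four pipelines fed.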

