Except where explicitly stated otherwise, all content of this account belongs to the public domain under CC0.
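#TIL Trivia unlocked: in Honor of Kings (王者荣耀), if you stay online for seven hours in a single day, the system makes you log off even if you are an adult (and the show specifically mentioned adults).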
Dolphin 5.0-11590 - Avoid using PDEP and PEXT on AMD Zen by Techjar
> However, unknown to any of us, it turns out that PEXT is extremely slow on AMD Zen and Zen 2 architectures. People have recorded these instructions taking up to 289 cycles, versus just one cycle on other CPU architectures. The reason these instructions are so slow is that they are not directly implemented on the CPU itself, but were instead implemented in microcode.
> Please note that other CPU architectures are not affected by this change; to our knowledge all other architectures either support PEXT directly or not at all.
> I just ran some tests: the performance seems to depend heavily on the value in the last operand; this is also the case for the register variants. If the last operand is set to -1 (i.e., all bits are 1), the instr. has 518 uops and needs more than 289 cycles!
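To make the quotes above concrete, here is a minimal sketch of what PEXT and PDEP compute, written as a bit-by-bit software loop. The function names `pext` and `pdep` and the `width` parameter are illustrative assumptions, and the loop is only a model of the semantics, not AMD's actual microcode. It does run once per set bit in the mask, which fits the observation that an all-ones mask is the worst case.

```python
def pext(src: int, mask: int, width: int = 64) -> int:
    """Gather the bits of src selected by mask into the low bits of the result."""
    limit = (1 << width) - 1
    src &= limit
    mask &= limit
    result = 0
    out_pos = 0
    while mask:                      # one iteration per set bit in mask
        lowest = mask & -mask        # isolate the lowest set bit of mask
        if src & lowest:
            result |= 1 << out_pos
        out_pos += 1
        mask &= mask - 1             # clear that bit and continue
    return result


def pdep(src: int, mask: int, width: int = 64) -> int:
    """Scatter the low bits of src to the bit positions selected by mask."""
    limit = (1 << width) - 1
    src &= limit
    mask &= limit
    result = 0
    in_pos = 0
    while mask:                      # one iteration per set bit in mask
        lowest = mask & -mask
        if (src >> in_pos) & 1:
            result |= lowest
        in_pos += 1
        mask &= mask - 1
    return result


if __name__ == "__main__":
    # Extract the upper nibble of 0xB2, then deposit it back in place.
    print(hex(pext(0xB2, 0xF0)))     # 0xb
    print(hex(pdep(0xB, 0xF0)))      # 0xb0
```

A loop like this is also roughly what a software fallback looks like on CPUs where the hardware instruction is missing or too slow to be worth using.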
GitHub: llvm/llvm-project, llvm/lib/Target/X86/X86ScheduleZnver1.td (as of LLVM 12.0.1)
// PDEP PEXT.
def : InstRW<[WriteMicrocoded], (instregex "PDEP(32|64)rr", "PEXT(32|64)rr")>;
def : InstRW<[WriteMicrocoded], (instregex "PDEP(32|64)rm", "PEXT(32|64)rm")>;
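AMD truly lives up to its nickname 农企 ("the farm company"): it effortlessly pulled off something no other CPU vendor could manage.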
The complete guide for open sourcing video games
Have fun and play together~