This also makes the inline assembly easier for LLVM to optimize: instead of emitting explicit movs, we tell LLVM which registers must hold which values and let it decide how to get them there. This is how the x86_64 code already worked, so the code can now be unified across the two architectures. As a bonus, I threw in x86 support.
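To illustrate the difference, here is a minimal sketch, not the code from this change. It assumes Rust's stable `asm!` macro on x86_64, and the function names `add_explicit_movs` and `add_register_constraints` are hypothetical. The first version copies its inputs into fixed registers by hand; the second only states which registers the values must be in, so LLVM can arrange for them to already be there.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

// Explicit movs: the inputs arrive in registers of LLVM's choosing and the
// template copies them into rax/rcx itself. LLVM cannot remove these copies.
#[cfg(target_arch = "x86_64")]
fn add_explicit_movs(a: u64, b: u64) -> u64 {
    let out: u64;
    unsafe {
        asm!(
            "mov rax, {a}",
            "mov rcx, {b}",
            "add rax, rcx",
            a = in(reg) a,
            b = in(reg) b,
            out("rax") out, // result is read back from rax
            out("rcx") _,   // rcx is overwritten by the template
        );
    }
    out
}

// Register constraints: we only say that `a` must be in rax and `b` in rcx.
// LLVM decides how (and whether) to move the values there.
#[cfg(target_arch = "x86_64")]
fn add_register_constraints(a: u64, b: u64) -> u64 {
    let out: u64;
    unsafe {
        asm!(
            "add rax, rcx",
            inout("rax") a => out,
            in("rcx") b,
            options(pure, nomem, nostack),
        );
    }
    out
}

// Portable fallbacks so the sketch still compiles on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_explicit_movs(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
fn add_register_constraints(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Both functions compute the same wrapping sum, for example `add_register_constraints(2, 3)` returns `5`. The second form leaves register allocation to the compiler, which is the property that lets the same constraint-based structure be shared between architectures.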