tensor core

Posted on 2025-10-05

Tensor Core

1st
4 * 2 * 64 FP16 FMA/clock = 512 per SM per clock

2nd

3rd
4 * 1 * 256 FP16 FMA/clock = 1024 per SM per clock
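
The per-SM figures above are simple products, so a small sketch can make the factors explicit. It assumes the three factors mean, in order: processing blocks (sub-cores) per SM, tensor cores per processing block, and FP16 FMAs per tensor core per clock. The post does not spell this out, so treat it as one reading. The function names, and the `sm_count` and `clock_ghz` parameters, are hypothetical and only serve the illustration.

```python
def fma_per_sm_per_clock(sub_cores: int,
                         tensor_cores_per_sub_core: int,
                         fma_per_tensor_core: int) -> int:
    """FP16 FMAs one SM can issue per clock (assumed meaning of the three factors)."""
    return sub_cores * tensor_cores_per_sub_core * fma_per_tensor_core


def peak_tflops(fma_per_sm: int, sm_count: int, clock_ghz: float) -> float:
    """Theoretical peak in TFLOPS; one FMA counts as 2 floating-point operations."""
    flops_per_second = fma_per_sm * 2 * sm_count * clock_ghz * 1e9
    return flops_per_second / 1e12


# Factors taken from the post.
first_gen = fma_per_sm_per_clock(4, 2, 64)    # 1st generation
third_gen = fma_per_sm_per_clock(4, 1, 256)   # 3rd generation

print(first_gen, third_gen)  # prints: 512 1024
```

To turn a per-SM figure into a chip-level peak, supply the SM count and clock of a specific part. For example, `peak_tflops(first_gen, sm_count=80, clock_ghz=1.53)` gives roughly 125 TFLOPS, where 80 SMs and 1.53 GHz are illustrative inputs rather than values from the post.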