GPU Instruction Throughput

Posted on 2025-11-04

Instruction Throughput

Instruction Latencies and Instructions/SM

Little’s law

required number of threads = latency * throughput

Arithmetic Instruction Latency

Memory Instruction Latency

bytes read per clock cycle = memory bandwidth / clock frequency
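
The two formulas in this note can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and every number in it (latencies, per-cycle throughput, bandwidth, clock) are hypothetical placeholders, not measurements of any particular GPU. The last step, applying Little's law to memory by multiplying the memory latency by the bytes read per cycle, is an assumption about how the two formulas are meant to combine.

```python
def required_parallelism(latency_cycles: float, throughput_per_cycle: float) -> float:
    """Little's law: work that must be in flight = latency * throughput."""
    return latency_cycles * throughput_per_cycle


def bytes_per_cycle(bandwidth_bytes_per_s: float, clock_hz: float) -> float:
    """Bytes read per clock cycle = memory bandwidth / clock frequency."""
    return bandwidth_bytes_per_s / clock_hz


# Hypothetical arithmetic case: 20-cycle latency, 32 operations per cycle per SM.
arith_threads = required_parallelism(latency_cycles=20, throughput_per_cycle=32)

# Hypothetical memory case: 144 GB/s bandwidth at a 1.5 GHz clock.
mem_bytes_per_cycle = bytes_per_cycle(bandwidth_bytes_per_s=144e9, clock_hz=1.5e9)

# Assumed combination: bytes that must be in flight to hide a 400-cycle memory latency.
mem_bytes_in_flight = required_parallelism(
    latency_cycles=400, throughput_per_cycle=mem_bytes_per_cycle
)

print(f"arithmetic: {arith_threads:.0f} threads in flight per SM")
print(f"memory: {mem_bytes_per_cycle:.0f} bytes/cycle, {mem_bytes_in_flight:.0f} bytes in flight")
```

With real hardware, the latency and throughput figures would come from the vendor's documentation or from microbenchmarks, and the result is a lower bound on the parallelism needed to keep the units busy.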