Using half-floats (FP16) reportedly gives a speedup of about 1.6x when the data in use does not fit in the CPU cache.
3.2. Results
Using half-floats provides a performance benefit over 32-bit floats when 32-bit float data does not fit into the L1 cache. Specifically, half-floats provide an average speedup of 1.05x when 32-bit data would fit in the L2 cache, an average speedup of 1.3x when 32-bit data would fit in the L3 cache, and an average speedup of 1.6x when the 32-bit data would fit into memory. Additionally, while half-floats may not provide a direct performance benefit when 32-bit data would fit into the L1 cache, you may still experience an auxiliary benefit when using half-floats in your program because half-floats will use half as much space, which allows for significantly more of your program's data to reside in L1.
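To make the "half as much space" point concrete, here is a rough sketch in Python (not code from the Intel article) that computes the footprint of an array in FP32 and FP16 and reports which cache level each would fit in. The cache sizes in CACHE_SIZES are assumptions chosen for illustration, and the helper names are made up for this example; real sizes depend on the CPU. It only shows the storage side and does not measure or reproduce the speedups quoted above.

```python
import struct

# Hypothetical cache capacities in bytes (assumed for illustration only;
# check your own CPU's specifications for real values).
CACHE_SIZES = [
    ("L1", 32 * 1024),
    ("L2", 256 * 1024),
    ("L3", 8 * 1024 * 1024),
]


def smallest_fit(num_bytes):
    """Return the first assumed cache level that can hold num_bytes, else 'memory'."""
    for name, capacity in CACHE_SIZES:
        if num_bytes <= capacity:
            return name
    return "memory"


def footprint(num_elements):
    """Return (fp32_bytes, fp16_bytes) for an array of num_elements values."""
    fp32_size = struct.calcsize("<f")  # IEEE 754 binary32: 4 bytes
    fp16_size = struct.calcsize("<e")  # IEEE 754 binary16: 2 bytes
    return num_elements * fp32_size, num_elements * fp16_size


def roundtrip_fp16(x):
    """Store x as a half-float and read it back, exposing the precision loss."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    for n in (4_096, 16_384, 100_000, 4_000_000):
        fp32_bytes, fp16_bytes = footprint(n)
        print(
            f"{n:>9} elements: FP32 {fp32_bytes:>10} B -> {smallest_fit(fp32_bytes):<6} | "
            f"FP16 {fp16_bytes:>10} B -> {smallest_fit(fp16_bytes)}"
        )
    # Half-floats keep only about 3 significant decimal digits.
    print("0.1 stored as FP16 reads back as", roundtrip_fp16(0.1))
```

With these assumed sizes, an array of 16,384 values spills out of L1 as FP32 but still fits as FP16, which is the auxiliary benefit the quote describes. The price is precision, as the round trip of 0.1 shows.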
Showing posts with label Patrick Konsor. Show all posts
2016-10-29
Performance Benefits of Half Precision Floats | Intel® Software
https://software.intel.com/en-us/articles/performance-benefits-of-half-precision-floats/