Aditya Blogs Here

GPU-related only!

In a high-performance GEMM kernel, the number of requests to LDS and data transfers from it is quite high (around 8 thousand). In this post, we describe a way to reduce them (to around 3 thousand).
