Published on March 24, 2025 | Interesting Tidbits from GTC 2025: Asynchronicity Beyond Streams | Tags: cuda, gtc | Some notes for CUDA programmers who haven't kept up with the times; this first post covers in-kernel pipelining.
Published on March 16, 2025 | AtomicMinFloat; overloading integer-only atomics for floating-point numbers in CUDA | Tags: c++, atomic, cuda | Convincing you (and myself) that with some minor edits, we can still use atomicMin for floats in CUDA.
Published on March 10, 2025 | Gotta go (randomly) fast, thrust vs cuRAND | Tags: c++, c, cuda, rng, thrust, curand | How fast can you generate (pseudo-)random numbers on the GPU?
Published on February 23, 2025 | When circumstances don’t allow you to use unique_ptr | Tags: c++, memorymanager, unique_ptr | Some background on my single-header-file memory manager and why I made it.
Published on January 30, 2025 | Cross-platform trigonometric SIMD and how the C ABI confused me | Tags: c++, c, simd, svml, intrinsic | My small journey in discovering libmvec functions and the C problems I encountered.