Published on July 20, 2025
Quickly sketching out a BlockQuickSelect
cuda, quickselect, cub
I wrote a blockwide quickselect. That's the post.

Published on July 14, 2025
Old dog, old C struct tricks
c, struct, memory
Look at this code snippet and tell me with a straight face you didn't think it was a memory leak at first either.

Published on March 25, 2025
Interesting Tidbits from GTC 2025: CUDA Graphs
cuda, gtc
In the 2nd post of this series, I give a short introduction to something that is also not particularly new, but new to me: CUDA Graphs!

Published on March 24, 2025
Interesting Tidbits from GTC 2025: Asynchronicity Beyond Streams
cuda, gtc
Some notes for CUDA programmers who haven't kept up with the times; this first post covers in-kernel pipelining.

Published on March 16, 2025
AtomicMinFloat; overloading integer-only atomics for floating-point numbers in CUDA
c++, atomic, cuda
Convincing you (and myself) that with some minor edits, we can still use atomicMin for floats in CUDA.