Published on September 10, 2025 | Stumbling into a sweep line algorithm's edge case | Tags: c++, sweepline, algorithm | Mistakes that you should learn from, lesson 1.
Published on August 13, 2025 | Some tips for integrating small bits of CUDA code into larger codebases | Tags: cuda, c++, pimpl | Some lessons from general C++ come in handy here.
Published on July 14, 2025 | Old dog, old C struct tricks | Tags: c, struct, memory | Look at this code snippet and tell me with a straight face you didn't think it was a memory leak at first either.
Published on March 16, 2025 | AtomicMinFloat; overloading integer-only atomics for floating-point numbers in CUDA | Tags: c++, atomic, cuda | Convincing you (and myself) that with some minor edits, we can still use atomicMin for floats in CUDA.
Published on March 10, 2025 | Gotta go (randomly) fast, thrust vs cuRAND | Tags: c++, c, cuda, rng, thrust, curand | How fast can you generate (pseudo-)random numbers on the GPU?
Published on February 23, 2025 | When circumstances don’t allow you to use unique_ptr | Tags: c++, memory, manager, unique_ptr | Some background on my single header file memory manager and why I made it.
Published on January 30, 2025 | Cross-platform trigonometric SIMD and how the C ABI confused me | Tags: c++, c, simd, svml, intrinsic | My small journey in discovering libmvec functions and the C problems I encountered.
Published on October 18, 2024 | Esoteric Errors: 'hidden symbol is referenced by DSO' | Tags: c++, gcc, build, errors, linker | Hidden linkage isn't something I'd encountered until now, so maybe this will help someone else too.
Published on September 9, 2024 | Who knew a simple logger class would be this complicated? | Tags: c++, c++20, printf, logger, source_location | Writing a printf-based C++ logger class was more of a journey than I originally thought.
Published on July 15, 2024 | Some notes on 2D real-to-complex Fourier transforms | Tags: dft, c++, r2c, ipp, cufft, numpy, python | IPP in particular has some very niche ways of packing R2C DFT output, but otherwise there are a few pointers here to keep in mind for how they are implemented in most libraries.
Published on May 4, 2024 | CRTP, method chaining, and static polymorphism | Tags: c++, crtp | Yes, it's yet another blogpost about CRTP and how it'd be useful.
Published on March 11, 2024 | Clang and Eigen's alternatives to complex multiplication SIMD | Tags: assembly, c++, avx, simd, intrinsics, clang, eigen | Clang isn't much better than MSVC for complex number multiplication, while Eigen is equivalent to GCC but uses slightly different instructions.
Published on February 24, 2024 | MSVC's terrible auto-vectoriser for AVX | Tags: assembly, c++, avx, simd, intrinsics, gcc, msvc | MSVC has extremely lackluster auto-vectorisation, so I handrolled intrinsic calls by backtranslating GCC's output.
Published on February 19, 2024 | Getting IPP to work on non-Intel chips | Tags: ipp, c++ | Intel Performance Primitives is not guaranteed to work on non-Intel chips, but there are some ways around it.