Jino's Blog
Featured Projects
Python
Tachyon - An educational LLM inference engine that runs on consumer hardware. It generates 600 tokens/s on an RTX 4060 Ti.
C++ / CUDA
fastcv - A C++/CUDA rewrite of the image filters in the OpenCV library, with PyTorch bindings.
C / C++
InferGPT - A high-performance C/C++ inference engine for GPT-based architectures that runs on CPU.
CUDA C++
Kernels - A collection of CUDA C++ and CuTe DSL kernels written for matrix multiplication.
ML Theory
Advanced ML - A collection of advanced machine learning topics implemented from scratch.