Tuning Flash Attention for Peak Performance in NVIDIA CUDA Tile | NVIDIA Technical Blog
… His experience spans the full AI stack, from GPU kernel optimization to AI product leadership. Before NVIDIA, he led the team at IBM Research that shipped the Watson Code Assistant, one of the earliest large-scale generative AI products. …