Welcome to Dr. Harisankar Sadasivan’s Homepage

Tech lead, AI Performance Engineering, NVIDIA | Aff. Asst. Professor, UW Seattle | Distinguished Visitor, IEEE-CS

Hi, I'm Hari Sadasivan

Sarah Chen
Seattle, WA, USA

At NVIDIA, I serve as a Tech Lead for AI GPU performance engineering. Key responsibilities include:

  • Lead CUDA, Triton, CUTLASS, CuTe, cuTile, and cuDNN kernel optimization for Hopper, Blackwell, and Rubin-class GPUs.
  • Improve AI and LLM training/inference by fixing bottlenecks, fusing kernels, and optimizing memory bandwidth.
  • Enable PyTorch and JAX integration through XLA, MLIR, TorchInductor, and torch.compile.
  • Use Nsight Compute and Nsight Systems for profiling and performance analysis.
  • Work on fused GEMMs, attention kernels, distributed training, and inference optimization.
  • Drive co-design feedback for future GPU software and hardware.

At AMD, I worked as a GPU performance engineer and technical lead for AI workloads.

  • Optimized training and inference on MI300 and MI200 platforms.
  • Improved throughput, memory efficiency, and scaling for transformer workloads.
  • Collaborated with framework, architecture, and distributed systems teams.
  • Contributed to Composable Kernel and ROCm-based AI libraries.
  • Optimized attention, GEMM, batching, KV-cache, and multi-GPU communication.
  • Helped establish and lead the AMD Center of Excellence in AI at UW Seattle.

"Nothing in life is to be feared, it is only to be understood." - Marie Curie

Teaching

Bridging industry and academia to connect AI hardware, software, and student learning.

Advising

I continue to advise students on GPU optimizations. Past advisees:
- Melissa Queen, University of Washington Seattle
- M. Emin Ozturk, University of Utah
- Juechu Dong and Xueshen Liu, University of Michigan, Ann Arbor

News

Latest updates.

[December 2025] — Symposium Chair
I'm chairing an IEEE joint-symposium on Systems for AI and robotics in Seattle.

[December 2024] — Invited to speak at the US Dept. of Defense
The Defense Intelligence Agency, Department of Defense, USA, invited me to speak on the hardware-software challenges facing the future of AI and genomics.

[Nov 2024] — Minimap2 Chaining – ACM BCB 2024
Our GPU-accelerated Minimap2 chaining paper was presented at ACM BCB 2024. Here’s an AMD blog post on the collaboration.

[August 2024] — Speeding up AI matmuls on GPUs
Our work on improving work-partitioning for AI on GPUs is out. Here’s a pre-print.

[July 2024] — Defining the future of genomics HW and SW
Our paper on setting the direction of genomics acceleration for the coming decades is out. Here’s a pre-print.

Research

I envision a world where AI is advanced and performant enough to help diagnose and find cures for all human diseases through techniques such as precision medicine and drug discovery. That is a big leap, so for now I have broken my interests down into the following areas.

High Performance Artificial Intelligence (AI):

Tall & Skinny GEMMs, Stream-K, Performance issues on multi-chiplet GPUs, LLM inference optimizations, Faster attention kernels, Parallelism, Disaggregation

High Performance -omics & Drug Discovery:

Long-read DNA/MSA/raw-signal alignment, AI-based basecalling, Genomics/metagenomics, Protein/RNA structure prediction for drug discovery

Recent Publications

Research contributions across AI, genomics, GPU acceleration, computational biology, and high-performance computing.

Conference 2025

Stream-K++: Adaptive GPU GEMM Kernel Selection and Scheduling for AI Using Bloom Filters

Harisankar Sadasivan, Muhammad Emin, Muhammad Osama, Chris Millette et al.

Journal 2024

Brain lymphoma diagnostics through nanopore sequencing of cytology-negative CSF

J. Hench, C. Hultschig, I. Bratic Hench, Harisankar Sadasivan et al. Acta Neuropathologica. Springer.

Genomic Computing Revolution
Conference 2024

Genomic Computing Revolution: Defining the Next Decades of Accelerating Genomics

Harisankar Sadasivan, Artur Klauser, Juergen Hench et al. IEEE HPEC.

mm2-gb GPU Accelerated
Workshop 2024

GPU Accelerated Minimap2 for Long Read DNA Mapping

Juechu Dong, Xueshen Liu, Harisankar Sadasivan et al. Biosys Workshop @ ASPLOS 2024.

DTW GPU
Journal 2024

Dynamic Time Warping on GPU for Selective Nanopore Sequencing

Harisankar Sadasivan, Daniel Stiffler, Ajay Tirumala et al. Journal of Biotechnology and Biomedicine.

Minimap2 GPU
Journal 2023

Minimap2 for Accurate Long Read Alignment on GPUs

Harisankar Sadasivan, Milos Maric, Eric Dawson et al. Journal of Biotechnology and Biomedicine.

SquiggleFilter
Conference 2021

An Accelerator for Portable Virus Detection

Presented at MICRO 2021. MICRO Top Picks 2022 Honorable Mention.

RawMap
Journal 2023

Rapid Real-time Squiggle Classification for Read until using RawMap

Harisankar Sadasivan, Jack Wadden, Kush Goliya et al. Archives of Clinical and Biomedical Research.
