Hawk: Harnessing Hardware-Aware Knowledge for High-Performance NPU Kernel Generation

Preview

arXiv:2607.01590v1 (announce type: new)

Abstract: Developing high-performance kernels for Neural Processing Units (NPUs) is a critical industry bottleneck, requiring developers to manually navigate implicit hardware constraints and strict memory hierarchies. While large language models offer immense automation potential, they fail catastrophically on NPUs due to a fundamental lack of hardware-specific priors. Naively transplanting code snippets from similar NPU kernels may pass the compiler, but

