TNPU: Supporting Trusted Execution with Tree-less Integrity Protection for Neural Processing Unit
Published in HPCA, 2022
Sunho Lee, Jungwoo Kim, Seonjin Na, Jongse Park, and Jaehyuk Huh, "TNPU: Supporting Trusted Execution with Tree-less Integrity Protection for Neural Processing Unit", the 28th IEEE International Symposium on High-Performance Computer Architecture ( HPCA ), Feburary 2022
As neural processing units (NPUs) for machine learning inference have been incorporated into a wide range of system-on-a-chips, NPUs are processing more and more mission-critical computations such as autonomous driving. With the increasing application scenarios, securing NPU operations from potential attacks has become crucial for the safety of the entire system. To address the security challenges of NPU operations, this study investigates how the trusted execution technology can be extended to harden the NPU execution by hardware supports. This paper proposes trusted NPU (TNPU) which supports trusted execution for NPUs integrated in a processor. For securing NPUs, a key performance challenge is in the encryption and integrity protection for external memory. This work proposes a novel tree-less integrity protection by exploiting the data flow semantics of DNN computation. The tree-less integrity protection maintains a version number for each tensor or sub-tensor inside the CPU enclave which drives the NPU computation. By exploiting the data flow of tensor updates, a per-tensor version number can efficiently verify the recency of the data in the tensor. The tree-less integrity protection eliminates performance losses by counter and hash cache misses, which are major performance overheads of hardware-based memory protection. Our evaluation with simulated NPUs shows that the performance overheads for trusted NPUs can be significantly reduced from the prior tree-based design, improving the performance of a single NPU by 10.0% and 7.5% on average over the prior one with two different NPU configurations. When the number of NPUs is increased to three, the performance gains further improve, achieving on average 13.3% and 8.7% improvements.