Exploring the Full Potential of Emerging Hardware Technologies

Performance engineering depends on being able to investigate how efficiently applications use hardware components. As new hardware technologies, including hardware accelerators and fast, non-volatile memory, emerge and are integrated into systems, profiling tools must allow performance engineers to re-examine the interactions between applications and these emerging technologies.

The Extreme Storage & Computer Architecture Laboratory (ESCAL) at UC Riverside, led by Prof. Hung-Wei Tseng, recently released two profiling tools, TPUPoint and Tier Memory Profiler, to fill the missing pieces of profiling in modern datacenter servers. TPUPoint helps programmers understand how artificial intelligence (AI) and machine learning (ML) workloads use Google’s Cloud Tensor Processing Units (Cloud TPUs). Experimental results published earlier this year in an ISPASS paper by Abenezer Wudenhe and Hung-Wei Tseng, both from ESCAL [1], show that the bottlenecks in AI/ML workloads are shifting to data transformation and exchange overhead as datasets and TPU performance scale. Programmers therefore need to rethink optimizations for these “non-computational” parts to streamline computation.

Tier Memory Profiler (TMP) is a tool released jointly by researchers from ESCAL (Jinyoung Choi and Hung-Wei Tseng) and AMD (Sergey Blagodurov). The IPDPS paper by TMP’s authors [2] shows that, given the more complete picture of big-data applications' memory usage that TMP provides, the same memory allocation policy can work more efficiently. In other words, the profiling tool matters, probably even more than the policy!

For more information about the research of Prof. Hung-Wei Tseng and his group, please visit ESCAL’s website.

Please also take a look at our GitHub repositories for 
TPUPoint —

[1] Abenezer Wudenhe and Hung-Wei Tseng, "TPUPoint: Automatic Characterization of Hardware-Accelerated Machine-Learning Behavior for Cloud Computing," 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2021, pp. 254-264.

[2] Jinyoung Choi, Sergey Blagodurov, and Hung-Wei Tseng, "Dancing in the Dark: Profiling in the Age of Tiered Memory," 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2021.