Jingtong Hu

Associate Professor
William Kepler Whiteford Faculty Fellow
Electrical and Computer Engineering

about

(2021) IEEE Transactions on Computer-Aided Design Donald O. Pederson Best Paper Award.

(2020) Best Paper Award Nomination, ASP-DAC 2020.

(2019) ACM SIGDA Meritorious Service Award.

(2019) Employer Diversity Recognition Award.

(2019) Best Paper Award Nomination, CODES+ISSS 2019.

(2019) Best Paper Award Nomination, DAC 2019.

(2018) Best Paper Award Nomination, ISQED 2018.

(2017) Best Paper Award Nomination, DAC 2017.

PhD, University of Texas at Dallas, 2007 - 2013

B.E., Shandong University, 2003 - 2007

Deng, J., Dong, S., Chen, L., Hu, J., & Zhuo, C. (2024). STDF: Spatio-Temporal Deformable Fusion for Video Quality Enhancement on Embedded Platforms. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 23(2), 1-25.Association for Computing Machinery (ACM). doi: 10.1145/3645113.

Dong, P., Zhuang, J., Yang, Z., Ji, S., Li, Y., Xu, D., Huang, H., Hu, J., Jones, A.K., Shi, Y., Wang, Y., & Zhou, P. (2024). EQ-ViT: Algorithm-Hardware Co-Design for End-to-End Acceleration of Real-Time Vision Transformer Inference on Versal ACAP Architecture. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(11), 3949-3960.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2024.3443692.

Jia, Z., Zhou, T., Yan, Z., Hu, J., & Shi, Y. (2024). Personalized Meta-Federated Learning for IoT-Enabled Health Monitoring. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(10), 3157-3170.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2024.3388908.

Tang, Y., Song, Y., Elango, N., Priya, S.R., Jones, A.K., Xiong, J., Zhou, P., & Hu, J. (2024). CHEF: A Framework for Deploying Heterogeneous Models on Clusters with Heterogeneous FPGAs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 43(11), 3937-3948.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/tcad.2024.3438994.

Zhuang, J., Lau, J., Ye, H., Yang, Z., Ji, S., Lo, J., Denolf, K., Neuendorffer, S., Jones, A., Hu, J., Shi, Y., Chen, D., Cong, J., & Zhou, P. (2024). CHARM 2.0: Composing Heterogeneous Accelerators for Deep Learning on Versal ACAP Architecture. ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 17(3), 1-31.Association for Computing Machinery (ACM). doi: 10.1145/3686163.

Ollivier, S., Li, S., Tang, Y., Cahoon, S., Caginalp, R., Chaudhuri, C., Zhou, P., Tang, X., Hu, J., & Jones, A.K. (2023). Sustainable AI Processing at the Edge. IEEE MICRO, 43(1), 19-28.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/MM.2022.3220399.

Ollivier, S., Longofono, S., Dutta, P., Hu, J., Bhanja, S., & Jones, A.K. (2023). Toward Comprehensive Shifting Fault Tolerance for Domain-Wall Memories With PIETT. IEEE TRANSACTIONS ON COMPUTERS, 72(4), 1095-1109.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TC.2022.3188206.

Shi, J., Wu, Y., Zeng, D., Tao, J., Hu, J., & Shi, Y. (2023). Self-Supervised On-Device Federated Learning From Unlabeled Streams. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 42(12), 4871-4882.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2023.3274956.

Wu, T., Ma, K., Hu, J., Xue, J., Li, J., Shi, X., Yang, H., & Liu, Y. (2023). Reliable and Efficient Parallel Checkpointing Framework for Nonvolatile Processor With Concurrent Peripherals. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 70(1), 228-240.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCSI.2022.3208523.

Delanerolle, G., Hu, J., Cavalini, H., Yardley, L., Barnard-Kelly, K., Elliot, K., Raymont, V., Rathod, S., Shi, J.Q., & Phiri, P. (2022). Impact of SARS-Cov-2 on Clinical Trial Unit workforce in the United Kingdom; An observational study. 2022.06.30.22277052.Cold Spring Harbor Laboratory. doi: 10.1101/2022.06.30.22277052.

Jia, Z., Shi, Y., & Hu, J. (2022). Personalized Neural Network for Patient-Specific Health Monitoring in IoT: A Metalearning Approach. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(12), 5394-5407.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2022.3162182.

Li, Y., Wu, Y., Zhang, X., Hu, J., & Lee, I. (2022). Energy-Aware Adaptive Multi-Exit Neural Network Inference Implementation for a Millimeter-Scale Sensing System. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 30(7), 849-859.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2022.3171308.

Ollivier, S., Zhang, X., Tang, Y., Choudhuri, C., Hu, J., & Jones, A.K. (2022). FPIRM: Floating-point Processing in Racetrack Memories.

Ollivier, S., Zhang, X., Tang, Y., Choudhuri, C., Hu, J., & Jones, A.K. (2022). Pod-racing: bulk-bitwise to floating-point compute in racetrack memory for machine learning at the edge. IEEE MICRO, 42(6), 9-16.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/MM.2022.3195761.

Tang, Y., Wu, Y., Zhou, P., & Hu, J. (2022). Enabling Weakly Supervised Temporal Action Localization From On-Device Learning of the Video Stream. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(11), 3910-3921.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2022.3197536.

Tang, Y., Zhang, X., Zhou, P., & Hu, J. (2022). EF-Train: Enable Efficient On-device CNN Training on FPGA through Data Reshaping for Online Adaptation or Personalization. ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 27(5), 1-36.Association for Computing Machinery (ACM). doi: 10.1145/3505633.

Wu, Y., Jia, Z., Fang, F., & Hu, J. (2022). Cooperative Communication Between Two Transiently Powered Sensor Nodes by Reinforcement Learning. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(1), 76-90.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2021.3054329.

Wu, Y., Zeng, D., Wang, Z., Shi, Y., & Hu, J. (2022). Distributed contrastive learning for medical image segmentation. Med Image Anal, 81, 102564.Elsevier. doi: 10.1016/j.media.2022.102564.

Hu, J., Zhu, Q., & Jha, S. (2021). Introduction to the Special Issue on Artificial Intelligence and Cyber-Physical Systems: Part 1. ACM Transactions on Cyber-Physical Systems, 5(4), 1-3.Association for Computing Machinery (ACM). doi: 10.1145/3471164.

Jiang, W., Lou, Q., Yan, Z., Yang, L., Hu, J., Hu, X.S., & Shi, Y. (2021). Device-Circuit-Architecture Co-Exploration for Computing-in-Memory Neural Accelerators. IEEE TRANSACTIONS ON COMPUTERS, 70(4), 595-605.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TC.2020.2991575.

Li, Y., Gao, Y., Shao, M., Tonecha, J.T., Wu, Y., Hu, J., & Lee, I. (2021). Implementation of Multi-Exit Neural-Network Inferences for an Image-Based Sensing System with Energy Harvesting. JOURNAL OF LOW POWER ELECTRONICS AND APPLICATIONS, 11(3), 34.MDPI. doi: 10.3390/jlpea11030034.

Ollivier, S., Longofono, S., Dutta, P., Hu, J., Bhanja, S., & Jones, A.K. (2021). PIRM: Processing In Racetrack Memories.

Xu, X., Zhang, X., Yu, B., Hu, X.S., Rowen, C., Hu, J., & Shi, Y. (2021). DAC-SDC Low Power Object Detection Challenge for UAV Applications. IEEE Trans Pattern Anal Mach Intell, 43(2), 392-403.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TPAMI.2019.2932429.

Zhang, X., Wu, Y., Zhou, P., Tang, X., & Hu, J. (2021). Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 20(5), 1-24.Association for Computing Machinery (ACM). doi: 10.1145/3477002.

Clark, R.M., Dickerson, S., Bedewy, M., Chen, K.P., Dallal, A., Gomez, A., Hu, J., Kerestes, R., & Luangkesorn, L. (2020). Social-Driven Propagation of Active Learning and Associated Scholarship Activity in Engineering: A Case Study. INTERNATIONAL JOURNAL OF ENGINEERING EDUCATION, 36(5), 1667-1680.

Jiang, W., Yang, L., Dasgupta, S., Hu, J., & Shi, Y. (2020). Standing on the Shoulders of Giants: Hardware and Neural Architecture Co-Search With Hot Start. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), 4154-4165.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2020.3012863.

Jiang, W., Yang, L., Sha, E.H.M., Zhuge, Q., Gu, S., Dasgupta, S., Shi, Y., & Hu, J. (2020). Hardware/Software Co-Exploration of Neural Architectures. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(12), 4805-4815.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2020.2986127.

Liu, K., Zhao, M., Ju, L., Jia, Z., Hu, J., & Xue, C.J. (2020). Applying Multiple Level Cell to Non-volatile FPGAs. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 19(4), 1-22.Association for Computing Machinery (ACM). doi: 10.1145/3400885.

Qiu, K., Li, Q., Hu, J., Zhang, W., & Xue, C.J. (2020). Write Mode Aware Loop Tiling for High-Performance Low-Power Volatile PCM in Embedded Systems. In Smart Sensors and Systems. (pp. 171-198).Springer Nature. doi: 10.1007/978-3-030-42234-9_10.

Wang, Y., Liu, J., & Hu, J. (2020). Communication-Aware Task Scheduling for Energy-Harvesting Nonvolatile Processors. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 28(8), 1796-1806.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2020.2978543.

Zhang, X., Patterson, C., Liu, Y., Yang, C., Xue, C.J., & Hu, J. (2020). Low Overhead Online Data Flow Tracking for Intermittently Powered Non-Volatile FPGAs. ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 16(3), 1-20.Association for Computing Machinery (ACM). doi: 10.1145/3371392.

Chang, Y.H., Hu, J., Tahoori, M.B., & DeMara, R.F. (2019). Guest Editorial: IEEE Transactions on Computers Special Section on Emerging Non-Volatile Memory Technologies: From Devices to Architectures and Systems. IEEE Transactions on Computers, 68(8), 1111-1113.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/tc.2019.2923033.

Fu, C., Liu, Q., Wu, P., Li, M., Xue, C.J., Zhao, Y., Hu, J., & Han, S. (2019). Real-Time Data Retrieval in Cyber-Physical Systems with Temporal Validity and Data Availability Constraints. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 31(9), 1779-1793.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TKDE.2018.2866842.

Jiang, W., Sha, E.H.M., Zhuge, Q., Yang, L., Chen, X., & Hu, J. (2019). On the Design of Time-Constrained and Buffer-Optimal Self-Timed Pipelines. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 38(8), 1515-1528.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2018.2846642.

Li, F., Qiu, K., Zhao, M., Hu, J., Liu, Y., Guan, Y., & Xue, C.J. (2019). Checkpointing-Aware Loop Tiling for Energy Harvesting Powered Nonvolatile Processors. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 38(1), 15-28.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2018.2803624.

Pan, C., Xie, M., Han, S., Mao, Z.H., & Hu, J. (2019). Modeling and Optimization for Self-powered Non-volatile IoT Edge Devices with Ultra-low Harvesting Power. ACM Transactions on Cyber-Physical Systems, 3(3), 1-26.Association for Computing Machinery (ACM). doi: 10.1145/3324609.

Xie, M., Pan, C., Zhang, Y., Hu, J., Liu, Y., & Xue, C.J. (2019). A Novel STT-RAM-Based Hybrid Cache for Intermittently Powered Processors in IoT Devices. IEEE MICRO, 39(1), 24-32.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/MM.2018.2890257.

Jiang, W., Sha, E.H.M., Zhuge, Q., Yang, L., Chen, X., & Hu, J. (2018). Heterogeneous FPGA-Based Cost-Optimal Design for Timing-Constrained CNNs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 37(11), 2542-2554.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2018.2857098.

Ju, L., Sui, X., Li, S., Zhao, M., Xue, C.J., Hu, J., & Jia, Z. (2018). NVM-Based FPGA Block RAM With Adaptive SLC-MLC Conversion. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 37(11), 2661-2672.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2018.2857261.

Li, J., Liu, Y., Li, H., Yuan, Z., Fu, C., Yue, J., Feng, X., Xue, C.J., Hu, J., & Yang, H. (2018). PATH: Performance-Aware Task Scheduling for Energy-Harvesting Nonvolatile Processors. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 26(9), 1671-1684.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2018.2825605.

Luo, H., Liu, Q., Hu, J., Li, Q., Shi, L., Zhuge, Q., & Sha, E.H.M. (2018). Write Energy Reduction for PCM via Pumping Efficiency Improvement. ACM TRANSACTIONS ON STORAGE, 14(3), 1-21.Association for Computing Machinery (ACM). doi: 10.1145/3200139.

Pan, C., Xie, M., & Hu, J. (2018). ENZYME: An Energy-Efficient Transient Computing Paradigm for Ultralow Self-Powered IoT Edge Devices. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 37(11), 2440-2450.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2018.2858478.

Xie, M., Li, S., Glova, A.O., Hu, J., & Xie, Y. (2018). Securing Emerging Nonvolatile Main Memory With Fast and Energy-Efficient AES In-Memory Implementation. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 26(11), 2443-2455.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2018.2865133.

Xie, M., Pan, C., Zhao, M., Liu, Y., Xue, C.J., & Hu, J. (2018). Avoiding Data Inconsistency in Energy Harvesting Powered Embedded Systems. ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 23(3), 1-25.Association for Computing Machinery (ACM). doi: 10.1145/3182170.

Chen, R., Wang, Y., Hu, J., Liu, D., Shao, Z., & Guan, Y. (2017). vFlash: Virtualized Flash for Optimizing the I/O Performance in Mobile Devices. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 36(7), 1203-1214.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2016.2618881.

Ding, C., Liu, N., Wang, Y., Heidari, S., Hu, J., Li, J., & Liu, Y. (2017). Multisource Indoor Energy Harvesting for Nonvolatile Processors. IEEE DESIGN & TEST, 34(3), 42-49.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/MDAT.2017.2682242.

Guo, J., Wen, W., Hu, J., Wang, D., Li, H., & Chen, Y. (2017). FlexLevel NAND Flash Storage System Design to Reduce LDPC Latency. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 36(7), 1167-1180.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2016.2619480.

Pan, C., Xie, M., Yang, C., Chen, Y., & Hu, J. (2017). Exploiting Multiple Write Modes of Nonvolatile Main Memory in Embedded Systems. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 16(4), 1-26.Association for Computing Machinery (ACM). doi: 10.1145/3063130.

Yuan, Z., Liu, Y., Li, J., Hu, J., Xue, C.J., & Yang, H. (2017). CP-FPGA: Energy-Efficient Nonvolatile FPGA With Offline/Online Checkpointing Optimization. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 25(7), 2153-2163.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2017.2680464.

Zhao, M., Fu, C., Li, Z., Li, Q., Xie, M., Liu, Y., Hu, J., Jia, Z., & Xue, C.J. (2017). Stack-Size Sensitive On-Chip Memory Backup for Self-Powered Nonvolatile Processors. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36(11), 1804-1816.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/tcad.2017.2666606.

Zhao, M., Xue, Y., Hu, J., Yang, C., Liu, T., Jia, Z., & Xue, C.J. (2017). State Asymmetry Driven State Remapping in Phase Change Memory. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 36(1), 27-40.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2016.2561408.

Chen, R., Wang, Y., Hu, J., Liu, D., Shao, Z., & Guan, Y. (2016). Image-Content-Aware I/O Optimization for Mobile Virtualization. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 16(1), 1-24.Association for Computing Machinery (ACM). doi: 10.1145/2950059.

Gu, S., Sha, E.H.M., Zhuge, Q., Chen, Y., & Hu, J. (2016). A Time, Energy, and Area Efficient Domain Wall Memory-Based SPM for Embedded Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(12), 2008-2017.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/tcad.2016.2547903.

Gu, S., Zhuge, Q., Yi, J., Hu, J., & Sha, E.H.M. (2016). Data Allocation with Minimum Cost under Guaranteed Probability for Multiple Types of Memories. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 84(1), 151-162.Springer Nature. doi: 10.1007/s11265-015-0985-5.

Pan, C., Gu, S., Xie, M., Liu, Y., Xue, J., & Hu, J. (2016). Wear-Leveling Aware Page Management for Non-Volatile Main Memory on Embedded Systems. IEEE Transactions on Multi-Scale Computing Systems, 2(2), 129-142.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/tmscs.2016.2525999.

Qiu, K., Li, Q., Hu, J., Zhang, W., & Xue, C.J. (2016). Write Mode Aware Loop Tiling for High Performance Low Power Volatile PCM in Embedded Systems. IEEE TRANSACTIONS ON COMPUTERS, 65(7), 2313-2324.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TC.2015.2479605.

Zhang, Y., Zhang, X., Hu, J., Nan, J., Zheng, Z., Zhang, Z., Zhang, Y., Vernier, N., Ravelosona, D., & Zhao, W. (2016). Ring-shaped Racetrack memory based on spin orbit torque driven chiral domain wall motions. Sci Rep, 6(1), 35062.Springer Nature. doi: 10.1038/srep35062.

Gu, S., Zhuge, Q., Yi, J., Hu, J., & Sha, E.H.M. (2015). Optimizing Task and Data Assignment on Multi-Core Systems with Multi-Port SPMs. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 26(9), 2549-2560.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TPDS.2014.2356194.

Hu, J., Xie, M., Pan, C., Xue, C.J., Zhuge, Q., & Sha, E.H.M. (2015). Low Overhead Software Wear Leveling for Hybrid PCM plus DRAM Main Memory on Embedded Systems. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 23(4), 654-663.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2014.2321571.

Yi, J., Zhuge, Q., Hu, J., Gu, S., Qin, M., & Sha, E.H.M. (2015). Reliability-Guaranteed Task Assignment and Scheduling for Heterogeneous Multiprocessors Considering Timing Constraint. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 81(3), 359-375.Springer Nature. doi: 10.1007/s11265-014-0958-0.

Hu, J., Zhuge, Q., Xue, C.J., Tseng, W.C., & Sha, E.H.M. (2014). Management and Optimization for Nonvolatile Memory-Based Hybrid Scratchpad Memory on Multicore Embedded Processors. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 13(4), 1-25.Association for Computing Machinery (ACM). doi: 10.1145/2560019.

Hu, J., Zhuge, Q., Xue, C.J., Tseng, W.C., Gu, S., & Sha, E.H.M. (2014). Scheduling to Optimize Cache Utilization for Non-Volatile Main Memories. IEEE TRANSACTIONS ON COMPUTERS, 63(8), 2039-2051.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TC.2013.11.

Liu, J., Zhuge, Q., Gu, S., Hu, J., Zhu, G., & Sha, E.H.M. (2014). Minimizing System Cost with Efficient Task Assignment on Heterogeneous Multicore Processors Considering Time Constraint. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 25(8), 2101-2113.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TPDS.2013.312.

Long, L., Liu, D., Hu, J., Gu, S., Zhuge, Q., & Sha, E.H.M. (2014). A space allocation and reuse strategy for PCM-based embedded systems. JOURNAL OF SYSTEMS ARCHITECTURE, 60(8), 655-667.Elsevier. doi: 10.1016/j.sysarc.2014.07.002.

Sun, Q., Zhuge, Q., Hu, J., Yi, J., & Sha, E.H.M. (2014). Efficient grouping-based mapping and scheduling on heterogeneous cluster architectures. COMPUTERS & ELECTRICAL ENGINEERING, 40(5), 1604-1620.Elsevier. doi: 10.1016/j.compeleceng.2014.03.009.

Xu, Y., Li, K., Hu, J., & Li, K. (2014). A genetic algorithm for task scheduling on heterogeneous computing systems using multiple priority queues. INFORMATION SCIENCES, 270, 255-287.Elsevier. doi: 10.1016/j.ins.2014.02.122.

Du, J., Wang, Y., Zhuge, Q., Hu, J., & Sha, E.H.M. (2013). Efficient Loop Scheduling for Chip Multiprocessors with Non-Volatile Main Memory. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 71(3), 261-273.Springer Nature. doi: 10.1007/s11265-012-0703-5.

Guo, Y., Zhuge, Q., Hu, J., Yi, J., Qiu, M., & Sha, E.H.M. (2013). Data Placement and Duplication for Embedded Multicore Systems With Scratch Pad Memory. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 32(6), 809-817.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2013.2238990.

Hu, J., He, Y., Zhuge, Q., Sha, E.H.M., Xue, C.J., & Zhao, Y. (2013). Minimizing accumulative memory load cost on multi-core DSPs with multi-level memory. JOURNAL OF SYSTEMS ARCHITECTURE, 59(7), 389-399.Elsevier. doi: 10.1016/j.sysarc.2013.05.003.

Hu, J., Xue, C.J., Qiu, M., Tseng, W.C., & Sha, E.H.M. (2013). Algorithms to Minimize Data Transfer for Code Update on Wireless Sensor Network. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 71(2), 143-157.Springer Nature. doi: 10.1007/s11265-012-0689-z.

Hu, J., Xue, C.J., Zhuge, Q., Tseng, W.C., & Sha, E.H.M. (2013). Write Activity Reduction on Non-Volatile Main Memories for Embedded Chip Multiprocessors. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 12(3), 1-27.Association for Computing Machinery (ACM). doi: 10.1145/2442116.2442127.

Hu, J., Xue, C.J., Zhuge, Q., Tseng, W.C., & Sha, E.H.M. (2013). Data Allocation Optimization for Hybrid Scratch Pad Memory With SRAM and Nonvolatile Memory. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 21(6), 1094-1102.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TVLSI.2012.2202700.

Mei, J., Li, K., Hu, J., Yin, S., & Sha, E.H.M. (2013). Energy-aware preemptive scheduling algorithm for sporadic tasks on DVS platform. MICROPROCESSORS AND MICROSYSTEMS, 37(1), 99-112.Elsevier. doi: 10.1016/j.micpro.2012.11.002.

Hu, J., Xue, C.J., Tseng, W.C., Zhuge, Q., Zhao, Y., & Sha, E.H.M. (2012). Memory access schedule minimization for embedded systems. JOURNAL OF SYSTEMS ARCHITECTURE, 58(1), 48-59.Elsevier. doi: 10.1016/j.sysarc.2011.10.002.

Zhang, D., Liao, X., Qiu, M., Hu, J., & Sha, E.H.M. (2012). Randomized execution algorithms for smart cards to resist power analysis attacks. JOURNAL OF SYSTEMS ARCHITECTURE, 58(10), 426-438.Elsevier. doi: 10.1016/j.sysarc.2012.08.004.

Zhuge, Q., Guo, Y., Hu, J., Tseng, W.C., Xue, C.J., & Sha, E.H.M. (2012). Minimizing Access Cost for Multiple Types of Memory Units in Embedded Systems Through Data Allocation and Scheduling. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 60(6), 3253-3263.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TSP.2012.2189768.

Hu, J., Tseng, W.C., Xue, C.J., Zhuge, Q., Zhao, Y., & Sha, E.H.M. (2011). Write Activity Minimization for Nonvolatile Main Memory Via Scheduling and Recomputation. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 30(4), 584-592.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2010.2097307.

Tseng, W.C., Hu, J., Zhuge, Q., He, Y., & Sha, E.H.M. (2010). Algorithms for Optimally Arranging Multicore Memory Structures. EURASIP Journal on Embedded Systems, 2010(1), 871510.Springer Nature. doi: 10.1155/2010/871510.

Xue, C.J., Hu, J., Shao, Z., & Sha, E.H.M. (2010). Iterational Retiming with Partitioning: Loop Scheduling with Complete Memory Latency Hiding. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 9(3), 1-26.Association for Computing Machinery (ACM). doi: 10.1145/1698772.1698780.

Xu, C.Q., Xue, C.J., Hu, J., & Sha, E.H.M. (2009). Optimizing Scheduling and Intercluster Connection for Application-Specific DSP Processors. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 57(11), 4538-4547.Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TSP.2009.2024870.

Zhuge, Q., Xue, C.J., Qiu, M., Hu, J., & Sha, E.H.M. (2008). Timing optimization via nest-loop pipelining considering code size. MICROPROCESSORS AND MICROSYSTEMS, 32(7), 351-363.Elsevier. doi: 10.1016/j.micpro.2008.02.002.

Zhuang, J., Yang, Z., Ji, S., Huang, H., Jones, A.K., Hu, J., Shi, Y., & Zhou, P. (2024). SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration. In Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, (pp. 55-66).Association for Computing Machinery (ACM). doi: 10.1145/3626202.3637569.

Zhou, P., Zhuang, J., Cahoon, S., Tang, Y., Yang, Z., Chen, X., Shi, Y., Hu, J., & Jones, A.K. (2023). REFRESH FPGAs: Sustainable FPGA Chiplet Architectures. In Proceedings of the 14th International Green and Sustainable Computing Conference, (pp. 1-3).Association for Computing Machinery (ACM). doi: 10.1145/3634769.3634798.

Zhuang, J., Lau, J., Ye, H., Yang, Z., Du, Y., Lo, J., Denolf, K., Neuendorffer, S., Jones, A., Hu, J., Chen, D., Cong, J., & Zhou, P. (2023). CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture. In Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, (pp. 153-164).Association for Computing Machinery (ACM). doi: 10.1145/3543622.3573210.

Jia, Z., Hong, F., Ping, L., Shi, Y., & Hu, J. (2021). Enabling On-Device Model Personalization for Ventricular Arrhythmias Detection by Generative Adversarial Networks. In 2021 58th ACM/IEEE Design Automation Conference (DAC), (pp. 163-168).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/dac18074.2021.9586123.

Jia, Z., Shi, Y., Saba, S., & Hu, J. (2021). On-device Prior Knowledge Incorporated Learning for Personalized Atrial Fibrillation Detection. In ACM Transactions on Embedded Computing Systems, 20(5s), (pp. 1-25).Association for Computing Machinery (ACM). doi: 10.1145/3476987.

Li, Y., Wu, Y., Zhang, X., Hamed, E., Hu, J., & Lee, I. (2021). Developing a miniature energy-harvesting-powered edge device with multi-exit neural network. In Proceedings - IEEE International Symposium on Circuits and Systems, 2021-May. doi: 10.1109/ISCAS51556.2021.09401799.

Wang, Z., Wu, Y., Jia, Z., Shi, Y., & Hu, J. (2021). Lightweight Run-Time Working Memory Compression for Deployment of Deep Neural Networks on Resource-Constrained MCUs. In Proceedings of the 26th Asia and South Pacific Design Automation Conference, (pp. 607-614).Association for Computing Machinery (ACM). doi: 10.1145/3394885.3439194.

Wu, Y., Wang, Z., Zeng, D., Shi, Y., & Hu, J. (2021). Enabling On-Device Self-Supervised Contrastive Learning with Selective Data Contrast. In 2021 58th ACM/IEEE Design Automation Conference (DAC), (pp. 655-660).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/dac18074.2021.9586228.

Zeng, D., Ding, Y., Yuan, H., Huang, M., Xu, X., Zhuang, J., Hu, J., & Shi, Y. (2021). Invited: Hardware-aware Real-time Myocardial Segmentation Quality Control in Contrast Echocardiography. In 2021 58th ACM/IEEE Design Automation Conference (DAC), (pp. 1339-1342).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/dac18074.2021.9586158.

Zhang, X., Wu, Y., Zhou, P., Tang, X., & Hu, J. (2021). Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices. In ACM Transactions on Embedded Computing Systems, 20(5s). doi: 10.1145/3477002.

Jia, Z., Wang, Z., Hong, F., Ping, L., Shi, Y., & Hu, J. (2020). Personalized deep learning for ventricular arrhythmias detection on medical IoT systems. In Proceedings of the 39th International Conference on Computer-Aided Design, 2020-November, (pp. 1-9).Association for Computing Machinery (ACM). doi: 10.1145/3400302.3415774.

Qiu, K., Zhao, M., Jia, Z., Hu, J., Xue, C.J., Ma, K., Li, X., Liu, Y., & Narayanan, V. (2020). Design Insights of Non-volatile Processors and Accelerators in Energy Harvesting Systems. In Proceedings of the 2020 on Great Lakes Symposium on VLSI, (pp. 369-374).Association for Computing Machinery (ACM). doi: 10.1145/3386263.3407596.

Wu, Y., Wang, Z., Jia, Z., Shi, Y., & Hu, J. (2020). Intermittent Inference with Nonuniformly Compressed Multi-Exit Neural Network for Energy Harvesting Powered Devices. In 2020 57th ACM/IEEE Design Automation Conference (DAC), (pp. 1-6).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/dac18072.2020.9218526.

Wu, Y., Wang, Z., Shi, Y., & Hu, J. (2020). Enabling On-Device CNN Training by Self-Supervised Instance Filtering and Error Map Pruning. In IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), (pp. 3445-3457).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/TCAD.2020.3012216.

Yang, L., Jiang, W., Liu, W., Sha, E.H.M., Shi, Y., & Hu, J. (2020). Co-Exploring Neural Architecture and Network-on-Chip Design for Real-Time Artificial Intelligence. In 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), 2020-January, (pp. 85-90).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/asp-dac47756.2020.9045595.

Zhang, X., Jiang, W., & Hu, J. (2020). Achieving Full Parallelism in LSTM via a Unified Accelerator Design. In 2020 IEEE 38th International Conference on Computer Design (ICCD), (pp. 469-477).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/iccd50377.2020.00086.

Jia, Z., Wu, Y., & Hu, J. (2019). Q-learning based routing for transiently powered wireless sensor network. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, (pp. 1-2).Association for Computing Machinery (ACM). doi: 10.1145/3349567.3351732.

Jiang, W., Sha, E.H.M., Zhang, X., Yang, L., Zhuge, Q., Shi, Y., & Hu, J. (2019). Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference. In ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 18(5), (pp. 1-23).Association for Computing Machinery (ACM). doi: 10.1145/3358192.

Jiang, W., Zhang, X., Sha, E.H.M., Yang, L., Zhuge, Q., Shi, Y., & Hu, J. (2019). Accuracy vs. Efficiency. In Proceedings of the 56th Annual Design Automation Conference 2019, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/3316781.3317757.

Wu, Y., Jia, Z., Fang, F., & Hu, J. (2019). Cooperative communication between two transiently powered sensors by reinforcement learning. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, (pp. 1-2).Association for Computing Machinery (ACM). doi: 10.1145/3349567.3351723.

Xie, M., Wu, Y., Jia, Z., & Hu, J. (2019). In-memory AES Implementation for Emerging Non-volatile Main Memory. In 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), (p. 103).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/isvlsi.2019.00027.

Zhang, X., Jiang, W., Shi, Y., & Hu, J. (2019). When Neural Architecture Search Meets Hardware Implementation: from Hardware Awareness to Co-Design. In 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), (pp. 25-30).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/isvlsi.2019.00014.

Ma, X., Zhang, Y., Yuan, G., Ren, A., Li, Z., Han, J., Hu, J., & Wang, Y. (2018). An Area and Energy Efficient Design of Domain-Wall Memory-Based Deep Convolutional Neural Networks Using Stochastic Computing. In 2018 19th International Symposium on Quality Electronic Design (ISQED), 2018-March, (pp. 314-321).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/isqed.2018.8357306.

Wu, Y., Sun, Y., Jia, Z., Zhang, L., Liu, Y., & Hu, J. (2018). Prototyping Energy Harvesting Powered Systems with Nonvolatile Processor (Invited Paper). In 2018 International Symposium on Rapid System Prototyping (RSP), (pp. 49-55).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rsp.2018.8631991.

Xie, M., Li, S., Glova, A.O., Hu, J., Wang, Y., & Xie, Y. (2018). AIM: Fast and Energy-Efficient AES In-Memory Implementation for Emerging Non-Volatile Main Memory. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018-January, (pp. 625-628).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.23919/date.2018.8342085.

Zhang, X., Patterson, C., Liu, Y., Yang, C., Xue, C.J., & Hu, J. (2018). Low Overhead Online Checkpoint for Intermittently Powered Non-volatile FPGAs. In 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2018-July, (pp. 238-244).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/isvlsi.2018.00052.

Li, J., Guo, Q., Su, F., Yuan, Z., Yue, J., Hu, J., Yang, H., & Liu, Y. (2017). CNN-Based Pattern Recognition on Nonvolatile IoT Platform for Smart Ultraviolet Monitoring (Invited Paper). In 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017-November, (pp. 888-893).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/iccad.2017.8203874.

Liu, K., Zhao, M., Ju, L., Jia, Z., Xue, C.J., & Hu, J. (2017). Design Exploration for Multiple Level Cell Based Non-volatile FPGAs. In 2017 IEEE International Conference on Computer Design (ICCD), (pp. 257-264).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/iccd.2017.46.

Lou, Q., Zhao, M., Ju, L., Xue, C.J., Hu, J., & Jia, Z. (2017). Runtime and Reconfiguration Dual-Aware Placement for SRAM-NVM Hybrid FPGAs. In 2017 IEEE 6th Non-Volatile Memory Systems and Applications Symposium (NVMSA), (pp. 1-6).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/nvmsa.2017.8064477.

Pan, C., Xie, M., & Hu, J. (2017). Maximize Energy Utilization for Ultra-Low Energy Harvesting Powered Embedded Systems. In 2017 IEEE 23rd International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), (pp. 1-6).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2017.8046325.

Pan, C., Xie, M., Liu, Y., Wang, Y., Xue, C.J., Wang, Y., Chen, Y., & Hu, J. (2017). A lightweight progress maximization scheduler for non-volatile processor under unstable energy harvesting. In Proceedings of the 18th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, 52(5), (pp. 101-110).Association for Computing Machinery (ACM). doi: 10.1145/3078633.3081038.

Xue, Y., Yang, C., & Hu, J. (2017). Age-aware Logic and Memory Co-Placement for RRAM-FPGAs. In Proceedings of the 54th Annual Design Automation Conference 2017, Part 128280, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/3061639.3062198.

Ding, C., Heidari, S., Wang, Y., Liu, Y., & Hu, J. (2016). Multi-Source In-Door Energy Harvesting for Non-Volatile Processors. In 2016 IEEE International Symposium on Circuits and Systems (ISCAS), 2016-July, (pp. 173-176).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/iscas.2016.7527198.

Ding, C., Li, H., Hu, J., Liu, Y., & Wang, Y. (2016). Dynamic Converter Reconfiguration for Near-Threshold Non-Volatile Processors Using in-door Energy Harvesting. In 2016 IEEE 34th International Conference on Computer Design (ICCD), (pp. 289-295).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/iccd.2016.7753292.

Li, H., Liu, Y., Fu, C., Xue, C.J., Xiang, D., Yue, J., Li, J., Zhang, D., Hu, J., & Yang, H. (2016). Performance-aware task scheduling for energy harvesting nonvolatile processors considering power switching overhead. In Proceedings of the 53rd Annual Design Automation Conference, 05-09-June-2016, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2897937.2898059.

Liu, N., Ding, C., Wang, Y., & Hu, J. (2016). Neural Network-based Prediction Algorithms for In-Door Multi-Source Energy Harvesting System for Non-Volatile Processors. In Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 18-20-May-2016, (pp. 275-280).Association for Computing Machinery (ACM). doi: 10.1145/2902961.2903037.

Luo, H., Hu, J., Shi, L., Xue, C.J., & Zhuge, Q. (2016). Two-step state transition minimization for lifetime and performance improvement on MLC STT-RAM. In Proceedings of the 53rd Annual Design Automation Conference, 05-09-June-2016, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2897937.2898106.

Luo, H., Hu, J., Shi, L., Xue, C.J., & Zhuge, Q. (2016). Peak-to-average Pumping Efficiency Improvement for Charge Pump in Phase Change Memories. In 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), 25-28-January-2016, (pp. 450-455).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2016.7428053.

Mao, M., Wen, W., Liu, X., Hu, J., Wang, D., Chen, Y., & Li, H. (2016). TEMP. In Proceedings of the 53rd Annual Design Automation Conference, 05-09-June-2016, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2897937.2898103.

Xie, M., Zhao, M., Pan, C., Li, H., Liu, Y., Zhang, Y., Xue, C.J., & Hu, J. (2016). Checkpoint aware hybrid cache architecture for NV processor in energy harvesting powered systems. In Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, (pp. 1-10).Association for Computing Machinery (ACM). doi: 10.1145/2968456.2968477.

Xue, Y., Cronin, P., Yang, C., & Hu, J. (2016). Routing Path Reuse Maximization for Efficient NV-FPGA Reconfiguration. In 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), 25-28-January-2016, (pp. 360-365).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2016.7428038.

Zha, M., Qiu, K., Xie, Y., Hu, J., & Xue, C.J. (2016). Redesigning Software and Systems for Non-volatile Processors on Self-powered Devices. In 2016 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), (pp. 1-6).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/vlsi-soc.2016.7753544.

Chen, R., Wang, Y., Hu, J., Liu, D., Shao, Z., & Guan, Y. (2015). Virtual Machine Image Content Aware I/O Optimization for Mobile Virtualization. In 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, (pp. 1031-1036).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/hpcc-css-icess.2015.90.

Chen, R., Wang, Y., Hu, J., Liu, D., Shao, Z., & Guan, Y. (2015). Unified Non-Volatile Memory and NAND Flash Memory Architecture in Smartphones. In The 20th Asia and South Pacific Design Automation Conference, (pp. 340-345).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2015.7059028.

Gu, S., Sha, E.H.M., Zhuge, Q., Chen, Y., & Hu, J. (2015). Area and performance co-optimization for domain wall memory in application-specific embedded systems. In Proceedings of the 52nd Annual Design Automation Conference, 2015-June, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2744769.2744800.

Guo, J., Wen, W., Hu, J., Wang, D., Li, H., & Chen, Y. (2015). FlexLevel. In Proceedings of the 52nd Annual Design Automation Conference, 2015-July, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2744769.2744843.

Heidari, S., Ding, C., Liu, Y., Wang, Y., & Hu, J. (2015). Multi-Source Energy Harvesting Management and Optimization for Non-volatile Processors. In 2015 Sixth International Green and Sustainable Computing Conference (IGSC), (pp. 1-2).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/igcc.2015.7393721.

Khouzani, H.A., Yang, C., & Hu, J. (2015). Improving Performance and Lifetime of DRAM-PCM Hybrid Main Memory through a Proactive Page Allocation Strategy. In The 20th Asia and South Pacific Design Automation Conference, (pp. 508-513).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2015.7059057.

Li, Q., Zhao, M., Hu, J., Liu, Y., He, Y., & Xue, C.J. (2015). Compiler directed automatic stack trimming for efficient non-volatile processors. In Proceedings of the 52nd Annual Design Automation Conference, 2015-July, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2744769.2744809.

Mao, M., Hu, J., Chen, Y., & Li, H. (2015). VWS. In Proceedings of the 52nd Annual Design Automation Conference, 2015-July, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2744769.2744931.

Pan, C., Xie, M., Yang, C., Shao, Z., & Hu, J. (2015). Nonvolatile Main Memory Aware Garbage Collection in High-Level Language Virtual Machine. In 2015 International Conference on Embedded Software (EMSOFT), (pp. 197-206).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/emsoft.2015.7318275.

Xie, M., Pan, C., Hu, J., Yang, C., & Chen, Y. (2015). Checkpoint-Aware Instruction Scheduling for Nonvolatile Processor with Multiple Functional Units. In The 20th Asia and South Pacific Design Automation Conference, (pp. 316-321).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2015.7059024.

Xie, M., Zhao, M., Pan, C., Hu, J., Liu, Y., & Xue, C.J. (2015). Fixing the broken time machine. In Proceedings of the 52nd Annual Design Automation Conference, 2015-July, (pp. 1-6).Association for Computing Machinery (ACM). doi: 10.1145/2744769.2744842.

Xue, Y., Cronin, P., Yang, C., & Hu, J. (2015). Fine-Tuning CLB Placement to Speed Up Reconfigurations in NVM-Based FPGAs. In 2015 25th International Conference on Field Programmable Logic and Applications (FPL), (pp. 1-8).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/fpl.2015.7294013.

Xue, Y., Cronin, P., Yang, C., & Hu, J. (2015). Non-Volatile Memories in FPGAs: Exploiting Logic Similarity to Accelerate Reconfiguration and Increase Programming Cycles. In 2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), 2015-October, (pp. 92-97).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/vlsi-soc.2015.7314398.

Zhao, M., Li, Q., Xie, M., Liu, Y., Hu, J., & Xue, C.J. (2015). Software Assisted Non-volatile Register Reduction for Energy Harvesting Based Cyber-Physical System. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015, 2015-April, (pp. 567-572).EDAA. doi: 10.7873/date.2015.0619.

Chen, R., Wang, Y., Hu, J., Liu, D., Shao, Z., & Guan, Y. (2014). Virtual-Machine Metadata Optimization for I/O Traffic Reduction in Mobile Virtualization. In 2014 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA), (pp. 1-2).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/nvmsa.2014.6927204.

Gu, S., Zhuge, Q., Hu, J., Yi, J., & Sha, E.H.M. (2014). Minimum-Cost Data Allocation with Guaranteed Probability on Multiple Types of Memory. In 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, (pp. 1-9).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2014.6910510.

Lu, R., Wu, G., Xie, B., & Hu, J. (2014). Stream Bench: Towards Benchmarking Modern Distributed Stream Computing Frameworks. In 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing, (pp. 69-78).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/ucc.2014.15.

Pan, C., Xie, M., Hu, J., Chen, Y., & Yang, C. (2014). 3M-PCM. In Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, (pp. 1-10).Association for Computing Machinery (ACM). doi: 10.1145/2656075.2656076.

Pan, C., Xie, M., Hu, J., Qiu, M., & Zhuge, Q. (2014). Wear-Leveling for PCM Main Memory on Embedded System via Page Management and Process Scheduling. In 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, (pp. 1-9).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2014.6910513.

Qiu, M., Chen, L., Zhu, Y., Hu, J., & Qin, X. (2014). Online Data Allocation for Hybrid Memories on Embedded Tele-Health Systems. In 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), (pp. 574-579).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/hpcc.2014.98.

Xie, M., Pan, C., Hu, J., Xue, C.J., & Zhuge, Q. (2014). Non-Volatile Registers Aware Instruction Selection for Embedded Systems. In 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, (pp. 1-9).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2014.6910508.

Gu, S., Zhuge, Q., Hu, J., Yi, J., & Sha, E.H.M. (2013). Efficient Task Assignment and Scheduling for MPSoC DSPs with VS-SPM Considering Concurrent Accesses through Data Allocation. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, (pp. 2615-2619).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icassp.2013.6638129.

Guo, Y., Zhuge, Q., Zhang, J., Hu, J., & Sha, E.H.M. (2013). Optimal Data Allocation Algorithm for Loop-Centric Applications on Scratch-Pad Memories. In SiPS 2013 Proceedings, (pp. 383-388).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/sips.2013.6674537.

Hu, J., Zhuge, Q., Xue, C.J., Tseng, W.C., & Sha, E.H.M. (2013). Software Enabled Wear-Leveling for Hybrid PCM Main Memory on Embedded Systems. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, (pp. 599-602).EDAA. doi: 10.7873/date.2013.131.

Long, L., Liu, D., Hu, J., Gu, S., Zhuge, Q., & Sha, E.H.M. (2013). A Space-Based Wear Leveling for PCM-Based Embedded Systems. In 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications, (pp. 145-148).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2013.6732213.

Yi, J., Zhuge, Q., Hu, J., Gu, S., Qin, M., & Sha, E.H.M. (2013). Optimizing Task Assignment for Heterogeneous Multiprocessor System with Guaranteed Reliability and Timing Constraint. In 2013 IEEE 19th International Conference on Embedded and Real-Time Computing Systems and Applications, (pp. 193-200).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/rtcsa.2013.6732219.

Hu, J., Zhuge, Q., Xue, C.J., Tseng, W.C., & Sha, E.H.M. (2012). Optimizing Data Allocation and Memory Configuration for Non-Volatile Memory based Hybrid SPM on Embedded CMPs. In 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum, (pp. 982-989).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/ipdpsw.2012.120.

Li, Q., Zhao, Y., Hu, J., Xue, C.J., Sha, E., & He, Y. (2012). MGC: Multiple Graph-Coloring for Non-Volatile Memory Based Hybrid Scratchpad Memory Behaviors. In 2012 16th Workshop on Interaction between Compilers and Computer Architectures (INTERACT), 1, (pp. 17-24).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/interact.2012.6339622.

Tseng, W.C., Xue, C.J., Zhuge, Q., Hu, J., & Sha, E.H.M. (2012). PRR: A Low-Overhead Cache Replacement Algorithm for Embedded Processors. In 17th Asia and South Pacific Design Automation Conference, 1, (pp. 35-40).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/aspdac.2012.6164972.

Wang, L., Liu, J., Hu, J., Zhuge, Q., & Sha, E.H.M. (2012). Optimal Assignment for Tree-Structure Task Graph on Heterogeneous Multicore Systems Considering Time Constraint. In 2012 IEEE 6th International Symposium on Embedded Multicore SoCs, (pp. 121-127).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/mcsoc.2012.11.

Wang, L., Liu, J., Hu, J., Zhuge, Q., Liu, D., & Sha, E.H.M. (2012). Efficient Task Assignment on Heterogeneous Multicore Systems Considering Communication Overhead. In Lecture Notes in Computer Science, 7439(PART 1), (pp. 171-185).Springer Nature. doi: 10.1007/978-3-642-33078-0_13.

Wang, Y., Du, J., Hu, J., Zhuge, Q., & Sha, E.H.M. (2012). Loop Scheduling Optimization for Chip-Multiprocessors with Non-Volatile Main Memory. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (pp. 1553-1556).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icassp.2012.6288188.

Wu, G., Zhang, H., Dong, Y., & Hu, J. (2012). CAR: Securing PCM Main Memory System with Cache Address Remapping. In 2012 IEEE 18th International Conference on Parallel and Distributed Systems, (pp. 628-635).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icpads.2012.90.

Guo, Y., Zhuge, Q., Hu, J., & Sha, E.H.M. (2011). Optimal Data Placement for Memory Architectures with Scratch-Pad Memories. In 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, (pp. 1045-1050).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/trustcom.2011.143.

Guo, Y., Zhuge, Q., Hu, J., Qiu, M., & Sha, E.H.M. (2011). Optimal Data Allocation for Scratch-Pad Memory on Embedded Multi-core Systems. In 2011 International Conference on Parallel Processing, (pp. 464-471).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icpp.2011.79.

Hu, J., Xue, C.J., Zhuge, Q., Tseng, W.C., & Sha, E.H.M. (2011). Towards energy efficient hybrid on-chip scratch pad memory with non-volatile memory. In Proceedings -Design, Automation and Test in Europe, DATE, (pp. 746-751).

Hu, J., Xue, C.J., Tseng, W.C., He, Y., Qiu, M., & Sha, E.H.M. (2010). Reducing write activities on non-volatile memories in embedded CMPs via data migration and recomputation. In Proceedings of the 47th Design Automation Conference, (pp. 350-355).Association for Computing Machinery (ACM). doi: 10.1145/1837274.1837363.

Hu, J., Xue, C.J., Tseng, W.C., Zhuge, Q., & Sha, E.H.M. (2010). Minimizing Write Activities to Non-volatile Memory via Scheduling and Recomputation. In 2010 IEEE 8th Symposium on Application Specific Processors (SASP), (pp. 101-106).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/sasp.2010.5521139.

Li, J., Qiu, M., Niu, J., Liu, M., Wang, B., & Hu, J. (2010). Impacts of Inaccurate Information on Resource Allocation for Multi-Core Embedded Systems. In 2010 10th IEEE International Conference on Computer and Information Technology, (pp. 2692-2697).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/cit.2010.452.

Li, J., Qiu, M., Hu, J., & Sha, E.H.M. (2010). Thermal-Aware Rotation Scheduling for 3D Multi-Core with Timing Constraint. In 2010 IEEE Workshop On Signal Processing Systems, (pp. 323-326).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/sips.2010.5624809.

Shi, L., Xue, C.J., Hu, J., Tseng, W.C., Zhou, X., & Sha, E.H.M. (2010). Write activity reduction on flash main memory via smart victim cache. In Proceedings of the 20th symposium on Great lakes symposium on VLSI, (pp. 91-94).Association for Computing Machinery (ACM). doi: 10.1145/1785481.1785503.

Tseng, W.C., Xue, C.J., Zhuge, Q., Hu, J., & Sha, E.H.M. (2010). Optimal Scheduling to Minimize Non-Volatile Memory Access Time with Hardware Cache. In 2010 18th IEEE/IFIP International Conference on VLSI and System-on-Chip, (pp. 131-136).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/vlsisoc.2010.5642609.

Hu, J., Xue, C.J., He, Y., & Sha, E.H.M. (2009). Reprogramming with Minimal Transferred Data on Wireless Sensor Network. In 2009 IEEE 6th International Conference on Mobile Adhoc and Sensor Systems, (pp. 160-167).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/mobhoc.2009.5337000.

Hu, J., Xue, C.J., Tseng, W.C., Qiu, M., Zhao, Y., & Sha, E.H.M. (2009). Minimizing Memory Access Schedule for Memories. In 2009 15th International Conference on Parallel and Distributed Systems, (pp. 104-111).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icpads.2009.86.

Qiu, M., Wu, G., Xue, C.J., Hu, J., Tseng, W.C., & Sha, E.H.M. (2009). Energy Minimization and Latency Hiding for Heterogeneous Parallel Memory Modules. In 2009 15th International Conference on Parallel and Distributed Systems, (pp. 503-510).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icpads.2009.132.

Hu, J., Xue, C.J., Qiu, M., Tseng, W.C., Xu, C.Q., Zhang, L., & Sha, E.H.M. (2008). Minimizing Transferred Data for Code Update on Wireless Sensor Network. In Lecture Notes in Computer Science, 5258, (pp. 349-360).Springer Nature. doi: 10.1007/978-3-540-88582-5_34.

Qiu, M., Wu, J., Hu, J., He, Y., & Sha, E.H.M. (2008). Dynamic and Leakage Power Minimization with Loop Voltage Scheduling and Assignment. In 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, 1, (pp. 192-198).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/euc.2008.90.

Qiu, M., Wu, J., Xue, C.J., Hu, J.A., Tseng, W.C., & Sha, E.H.M. (2008). Loop Scheduling and Assignment to Minimize Energy while Hiding Latency for Heterogeneous Multi-Bank Memory. In 2008 International Conference on Field Programmable Logic and Applications, (pp. 459-462).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/fpl.2008.4629983.

Xue, C.J., Liu, T., Shao, Z., Hu, J., Jia, Z., Jia, W., & Sha, E.H.M. (2008). Address Assignment Sensitive Variable Partitioning and Scheduling for DSPs with Multiple Memory Banks. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, (pp. 1453-1456).Institute of Electrical and Electronics Engineers (IEEE). doi: 10.1109/icassp.2008.4517894.

Qiu, M., Hu, J., & Sha, E.H.M. (2007). Adaptive online energy saving for heterogeneous sensor networks. In Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, (pp. 294-299).