- Education
- PhD, Wayne State University, 2008.
Major: Computer Engineering
Dissertation Title: Failure-Aware Reconfigurable Distributed Virtual Machine for Dependable and High Productivity Computing
- MS, Nanjing University, 2002.
Major: Computer Science
- BS, Nanjing University of Aeronautics and Astronautics, 1999.
Major: Computer Science
- Research
As parallel and distributed computing systems grow in scale and complexity, new foundations are needed for understanding and controlling their integral properties. Dr. Fu’s research is dedicated to the investigation, establishment, and experimental evaluation of new theoretical foundations and system artifacts that significantly improve system resilience, power and energy efficiency, and performance. His research interests are primarily in high-performance computing and distributed and cloud systems, including:
- Resilience and Failure/Error Management
- Anomaly Detection and Failure Diagnosis
- Smart Storage and Storage Reliability
- Power Management and Energy Efficiency
- Autonomic Resource Management and Reconfiguration
- High Performance Computing
- Smart Cities and Smart Communities
- Publications
Peer-Reviewed Conference and Journal Publications
- Song Fu, Hsing-Bung Chen and George Qiao
A Machine Learning based Disk Health Status Assessment, Failure Prediction, and Pre-Failure Data Recovery Approach for Supporting Always-On Extreme Scale Storage Systems
ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
- Zongze Li, Matthew Davidson, Song Fu, Sean Blanchard and Michael Lang
Event Block Analysis for Effective Anomaly Detection on Production HPC Systems
ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
- Song Huang, Song Fu, Weisong Shi and Devesh Tiwari
Proactive Disk Failure Management and Data Protection for Highly Available Storage Systems
ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
- Mohit Kumar, Devesh Tiwari, Weisong Shi, Saurabh Gupta and Song Fu
Towards Understanding Interconnect Failures in HPC Systems
Greater Chicago Area Systems Research Workshop (GCASR), April 2017.
- Hsing-Bung Chen and Song Fu
Improving Coding Performance and Energy Efficiency of Erasure Coding Process for Storage Systems
IEEE International Conference on Cloud Computing (CLOUD), July 2016.
- Song Huang, Song Fu, Scott Pakin and Michael Lang
Characterizing Power and Energy Efficiency of Legion Runtime and Applications: An Early Experience
IEEE International Green and Sustainable Computing Conference (IGSC), November 2016.
- Song Huang, Zhiang Deng and Song Fu
Quantifying Topology Criticality for Fault Impact Analysis in Software-Defined Networks
The 35th IEEE International Performance Computing and Communications Conference (IPCCC), December 2016.
- Jacob Hochstetler, Lauren Hochstetler, and Song Fu
An Optimal Police Patrol Planning Strategy for Smart City Safety
The 14th IEEE International Conference on Smart City (SmartCity), December 2016.
- Elisabeth Baseman, Sean Blanchard, Zongze Li, and Song Fu
Relational Synthesis of Text and Numeric Data for Anomaly Detection on Computing System Logs
The 15th IEEE International Conference on Machine Learning and Applications (ICMLA), December 2016.
- Hsing-Bung Chen and Song Fu
Parallel Erasure Coding: Exploring Task Parallelism in Erasure Coding for Enhanced Bandwidth
IEEE International Conference on Networking, Architecture and Storage (NAS), August 2016.
- Ziming Zhang, Michael Lang, Scott Pakin, and Song Fu
TracSim: Simulating and Scheduling Trapped Power Capacity to Maximize Machine Room Throughput
Parallel Computing (ParCo) Journal, Vol. 57: 108-124, September 2016.
- Qiang Guan, Nathan DeBardeleben, Sean Blanchard and Song Fu
Addressing Statistical Significance of Fault Injection: Empirical Studies of the Soft Error Susceptibility
International Journal of High Performance Computing and Networking, June 2016.
- Q. Guan, N. DeBardeleben, S. Blanchard, S. Fu, C. H. Davis, and W. M. Jones
Analyzing the Robustness of HPC Applications Using a Fine-Grained Soft Error Fault Injection Tool
Book Chapter, Innovative Research and Applications in Next-Generation High Performance Computing, pp. 277-305, IGI Global, July 2016.
- S. Huang, S. Fu, Q. Zhang and W. Shi
Characterizing Disk Failures with Quantified Disk Degradation Signatures: An Early Experience
IEEE International Symposium on Workload Characterization (IISWC), 10 pages, October 2015.
- Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
Addressing Statistical Significance of Fault Injection: Empirical Studies of the Soft Error Susceptibility
The 21st IEEE/IFIP International Symposium on Dependable Computing, 10 pages, November 2015.
- S. Huang, S. Fu, N. DeBardeleben, Q. Guan and C. Xu
Differentiated Failure Remediation with Action Selection for Resilient Computing
The 21st IEEE/IFIP International Symposium on Dependable Computing, 10 pages, November 2015.
- H.-B. Chen, S. Fu, Z. Qiao, S. Liang and S. Huang
A Parallel, Reliable and Scalable Storage Software Infrastructure for Active Storage System and I/O Environments
The 34th IEEE International Performance Computing and Communications Conference, December 2015.
- Z. Qiao, S. Liang, H. Jiang and S. Fu
A Customizable MapReduce Framework for Complex Data-Intensive Workflows on GPUs
The 34th IEEE International Performance Computing and Communications Conference, December 2015.
- S. Huang, M. Lang, S. Pakin and S. Fu
Measurement and Characterization of Haswell Power and Energy Consumption
Energy Efficient Supercomputing in IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC'15), 10 pages, November 2015.
- Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
Empirical Studies of the Soft Error Susceptibility of Sorting Algorithms to Statistical Fault Injection
Fault Tolerance for HPC at eXtreme Scale in the 24th International ACM Symposium on High Performance Distributed Computing (HPDC), Pages 33-40, June 2015.
- Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
Soft Error Susceptibility of Sorting Algorithms
IEEE International Workshop on Silicon Errors in Logic - System Effects (SELSE), March 2015.
- Z. Qiao, S. Liang, H. Jiang, and S. Fu
MR-Graph: a Customizable GPU MapReduce
IEEE International Conference on Cyber Security and Cloud Computing, November 2015.
- B. Arigong, M. Zhou, H. Ren, J. Shao, S. Fu, H. Kim and H. Zhang
System Application of Planar Couplers
IEEE Symposium on Wireless and Microwave Circuits and Systems, April 2015.
- H. Ren, J. Shao, B. Arigong, M. Zhou, S. Fu, H. Kim and H. Zhang
Simplified Doherty Power Amplifier Structures
IEEE Symposium on Wireless and Microwave Circuits and Systems, April 2015.
- J. Shao, H. Ren, M. Zhou, B. Arigong, J. Ding, S. Fu, H. Kim, and H. Zhang
Design of a Dual-Band Sequential Power Amplifier
Microwave and Optical Technology Letters, 2015.
- J. Shao, H. Ren, M. Zhou, B. Arigong, J. Ding, S. Fu, H. Kim, and H. Zhang
Design of a Tunable Sequential Power Amplifier
Microwave and Optical Technology Letters, 2015.
- B. Arigong, J. Ding, H. Ren, M. Zhou, J. Shao, H. Kim, S. Fu, and H. Zhang
An Improved Design of Dual-band 3dB 180° Directional Coupler
Progress in Electromagnetic Research (PIER), Vol. 56: 153-162, 2015.
- J. Shao, R. Zhou, S. Yoon, S. Fu, H. Kim, and H. Zhang
Design of Dual-Band GaN Doherty Power Amplifier Using a Simplified Structure
Microwave and Optical Technology Letters, Vol. 57(4): 953-956, 2015.
- Qiang Guan, Song Fu, Nathan DeBardeleben and Sean Blanchard
F-SEFI: A Fine-grained Soft Error Fault Injection Tool for Profiling Application Vulnerability
The 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp.1-10, May 2014.
- Ziming Zhang, Michael Lang, Scott Pakin and Song Fu
Trapped Capacity: Scheduling under a Power Cap to Maximize Machine-Room Throughput
Workshop on Energy Efficient Supercomputing in conjunction with IEEE/ACM Supercomputing Conference (SC), pp.1-10, November 2014.
- Qiang Guan, Nathan DeBardeleben, Sean Blanchard and Song Fu
Towards Exploring the Soft Error Susceptibility of Heapsort Algorithms
The 44th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), June 2014.
- Xiajun Wang, Song Huang, Song Fu and Krishna Kavi
Characterizing Workload of Web Applications on Virtualized Servers
The 4th Workshop on Big Data Benchmarks, Performance Optimization, and Emerging Hardware in conjunction with the 19th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp.1-8, March 2014.
- Bayaner Arigong, Jun Ding, Han Ren, Mi Zhou, Jin Shao, Rongguo Zhou, Hyoungsoo Kim, Yuankun Lin, Song Fu and Hualiang Zhang
Transformation Optics for Microwave and Optical Device Design
IEEE International Conference on Electromagnetics in Advanced Applications (ICEAA), August 2014.
- Jin Shao, Rongguo Zhou, Sang-Woong Yoon, Song Fu, Hyoungsoo Kim and Hualiang Zhang
Design of Dual-Band GaN Doherty Power Amplifier Using a Simplified Structure
Microwave and Optical Technology Letters, in press, December 2014.
- Qiang Guan and Song Fu
Adaptive Anomaly Identification by Exploring Metric Subspace in Cloud Computing Infrastructures
The 32nd IEEE International Symposium on Reliable Distributed Systems (SRDS), pp.1-10, 2013.
- Qiang Guan and Song Fu
Autonomic Failure Identification and Diagnosis for Building Dependable Computing Systems
ACM/IEEE Supercomputing Conference (SC), 2013.
- Qiang Guan, Song Fu, Nathan DeBardeleben and Sean Blanchard
Exploring Time and Frequency Domains for Accurate and Automated Anomaly Detection in Cloud Computing Systems
The 19th IEEE/IFIP International Symposium on Dependable Computing (PRDC), pp.1-10, 2013.
- Qiang Guan and Song Fu
Wavelet-Based Multi-Scale Anomaly Identification in Cloud Computing Systems
IEEE Global Communications Conference (GLOBECOM), pp.1-7, 2013.
- Husanbir Pannu, Jianguo Liu and Song Fu
AAD: Adaptive Anomaly Detection System for Cloud Computing Infrastructures
The 31st IEEE International Symposium on Reliable Distributed Systems (SRDS), 2012.
- Ziming Zhang, Qiang Guan and Song Fu
An Adaptive Power Management Framework for Autonomic Resource Configuration in Cloud Computing Infrastructures
The 31st IEEE International Performance Computing and Communications Conference (IPCCC), pp.1-10, 2012.
- Husanbir Pannu, Jianguo Liu, Qiang Guan and Song Fu
An Autonomic Failure Detection System for Cloud Computing Infrastructures
The 31st IEEE International Performance Computing and Communications Conference (IPCCC), pp.1-10, 2012.
- Qiang Guan, Chi-Chen Chiu and Song Fu
A Cloud Dependability Analysis Framework for Assessing System Dependability in Cloud Computing Infrastructures
The 18th IEEE/IFIP International Symposium on Dependable Computing (PRDC), pp.1-10, 2012.
- Husanbir Pannu, Jianguo Liu and Song Fu
A Hybrid Anomaly Detection Framework in Cloud Computing using One-Class and Two-Class Support Vector Machines
International Conference on Advanced Data Mining and Applications (ADMA), pp.1-12, 2012.
- Qiang Guan, Chi-Chen Chiu, Ziming Zhang and Song Fu
Efficient and Accurate Anomaly Identification Using Reduced Metric Space in Utility Clouds
IEEE International Conference on Networking, Architecture, and Storage (NAS), pp.1-10, 2012.
- Husanbir Pannu, Jianguo Liu and Song Fu
Autonomic Anomaly Identification for Developing Highly Dependable Utility Clouds
IEEE Global Communications Conference (GLOBECOM), pp.1-7, 2012.
- Qiang Guan, Ziming Zhang and Song Fu
Ensemble of Bayesian Predictors and Decision Trees for Proactive Failure Management in Cloud Computing Systems
Journal of Communications, Vol. 7, No. 1, pp. 52-61, 2012.
- Qiang Guan, Ziming Zhang and Song Fu
A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers
International Journal of Computer Theory and Engineering, In press, 2012.
- Ziming Zhang and Song Fu
Characterizing Power and Energy Usage in Cloud Computing Systems
IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2011.
- Qiang Guan, Ziming Zhang and Song Fu
Proactive Failure Management by Integrated Unsupervised and Semi-Supervised Learning for Dependable Cloud Systems
IEEE International Conference on Availability, Reliability and Security (ARES), 2011.
- Song Fu, Qiang Guan, Ziming Zhang
Failure Detection and Prediction for Dependable Cloud Computing Systems
IEEE Global Communication Conference (GLOBECOM), 2011.
- Nathan DeBardeleben, Sean Blanchard, Qiang Guan, Ziming Zhang, and Song Fu
Experimental Framework for Injecting Logic Errors in a Virtual Machine to Profile Applications for Soft Error Resilience
Resilience, the 17th International European Conference on Parallel and Distributed Computing (Euro-Par), September 2011.
- Ziming Zhang, and Song Fu
macropower: A Coarse-Grain Power Profiling Framework for Energy-Efficient Cloud Computing
The 30th IEEE International Performance Computing and Communications Conference (IPCCC), 2011.
- Qiang Guan, Ziming Zhang and Song Fu
Ensemble of Bayesian Predictors for Autonomic Failure Management in Cloud Computing
The 20th IEEE International Conference on Computer Communications and Networks (ICCCN), 2011.
- Song Fu, Chengzhong Xu and Helen Shen
Randomized Load Balancing Strategies with Churn Resilience in Peer-to-Peer Networks
Journal of Network and Computer Applications, Elsevier, Vol. 34, No. 1, pp. 252-261, 2011.
- Song Fu and Chengzhong Xu
Failure-Aware Resource Management for High-Availability Computing Clusters with Distributed Virtual Machines
Journal of Parallel and Distributed Computing, Elsevier, Vol. 70, No. 4, pp. 384-393, 2010.
- Song Fu and Chengzhong Xu
Quantifying Event Correlations for Proactive Failure Management in Networked Computing Systems
Journal of Parallel and Distributed Computing, Elsevier, Vol. 70, No. 11, pp. 1100-1109, 2010.
- Ziming Zhang and Song Fu
A Hierarchical Failure Management Framework for Dependability Assurance in Compute Clusters
International Journal of Computational Science, Vol. 4, No. 4, pp. 313-326, 2010.
- Ziming Zhang and Song Fu
Failure Prediction for Autonomic Management of Networked Computer Systems with Availability Assurance
DPDNS, IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
- Qiang Guan and Song Fu
auto-AID: A Data Mining Framework for Autonomic Anomaly Identification in Networked Computer Systems
The 29th IEEE International Performance Computing and Communications Conference (IPCCC), December 2010.
- Song Fu
Dependability Enhancement for Coalition Clusters with Autonomic Failure Management
The 15th IEEE International Symposium on Computers and Communications (ISCC), June 2010.
- Ziming Zhang and Song Fu
Proactive Failure Management for High Availability Computing in Computer Clusters
IEEE International Conference on Computational Sciences and Optimization (CSO), May 2010.
- Qiang Guan, Derek Smith and Song Fu
Anomaly Detection in Large-Scale Coalition Clusters for Dependability Assurance
The 17th IEEE International Conference on High Performance Computing (HiPC), December 2010.
- Song Fu
Failure-Aware Construction and Reconfiguration of Distributed Virtual Machines for High Availability Computing
The 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), 2009.
- Song Fu and Chengzhong Xu
Proactive Resource Management for Failure Resilient High Performance Computing Clusters
The IEEE International Conference on Availability, Reliability and Security (ARES), 2009.
- Song Fu, Chengzhong Xu and Haiying Shen
Random Choices for Churn Resilient Load Balancing in Peer-to-Peer Networks
The 22nd ACM/IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2008.
- Song Fu and Chengzhong Xu
Exploring Event Correlation for Failure Prediction in Coalitions of Clusters
The ACM/IEEE Supercomputing Conference (SC), 2007. (Acceptance rate: 20%)
- Song Fu and Chengzhong Xu
Quantifying Temporal and Spatial Correlation of Failure Events for Proactive Management
The 26th IEEE International Symposium on Reliable Distributed Systems (SRDS), 2007.
- Song Fu and Chengzhong Xu
Coordinated Access Control with Spatial Constraints in Coalition Mobile Computing Systems
Journal of Future Generation Computer Systems, Elsevier, Vol. 23, No. 6, pp. 804-815, 2007.
- Song Fu and Chengzhong Xu
Stochastic Modeling and Analysis of Hybrid Mobility in Reconfigurable Distributed Virtual Machines
Journal of Parallel and Distributed Computing, Elsevier, Vol. 66, No. 11, pp. 1442-1454, 2006.
- Song Fu, Chengzhong Xu, Brian Wims, and Ramzi Basharahil
Distributed Shared Arrays: A Distributed Virtual Machine with Mobility Support for Reconfiguration
Journal of Cluster Computing, Vol. 9, No. 3, pp. 237-255, 2006.
- Song Fu and Chengzhong Xu
Service Migration in Distributed Virtual Machines for Adaptive Grid Computing
The 34th IEEE International Conference on Parallel Processing (ICPP), 2005. (Best Paper Nominee)
- Song Fu and Chengzhong Xu
A Coordinated Spatio-Temporal Access Control Model for Mobile Computing in Coalition Environments
The 19th ACM/IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2005.
- Song Fu and Chengzhong Xu
Mobility Support for Adaptive Grid Computing
Scalable and Secure Internet Services and Architecture, Chapman & Hall/CRC, 2005.
- Song Fu and Chengzhong Xu
Mobile Code and Protection
Handbook of Information Security, John Wiley & Sons, 2005.
- Song Fu and Chengzhong Xu
Migration Decision for Hybrid Mobility in Reconfigurable Distributed Virtual Machines
The 33rd IEEE International Conference on Parallel Processing (ICPP), 2004.
- Ramzi Basharahil, Brian Wims, Chengzhong Xu, and Song Fu
Distributed Shared Array: an Integration of Message Passing and Multithreading on SMP Clusters
Journal of Supercomputing, Vol. 31, No. 2, pp. 161-184, 2004.
- Chengzhong Xu and Song Fu
Privilege Delegation and Agent-oriented Access Control in Naplet
IEEE International Workshop on Mobile Distributed Computing (In conjunction with ICDCS), 2003.
- Song Fu, Zhiquan Jin and Peipei Chen
A Rate-Based Multicast Protocol for Large-Scale Reliable Transport
The 17th IEEE International Conference on Advanced Information Networking and Applications, 2003.
Research Posters
- Jason He (TAMS student)
CODY: Characterizing Power and Energy Usage with Resource Auto-Configuration in Cloud Computing
DFW Science and Engineering Fair, Dallas Texas, 2013.
- Husanbir Singh Pannu
Adaptive Anomaly Detection System for Cloud Computing Infrastructures
The 31st IEEE International Symposium on Reliable Distributed Systems (SRDS), 2012.
- Song Fu
Proactive Failure Management for Dependable Networked Computer Systems
Department of Computer Science, University of North Texas, 2011.
- Professional Experience
- Panelist for U.S. National Science Foundation; Proposal Reviewer for Portuguese Foundation for Science and Technology, Canada Foundation for Innovation, Research Grants Council of Hong Kong, Kentucky Science and Engineering Foundation, South Carolina Institutions of Higher Education.
- General Chair, 35th IEEE International Performance Computing and Communications Conference (IPCCC 2016)
- Workshop Chair, 26th IEEE International Conference on Computer Communications and Networks (ICCCN 2017)
- Travel Grant Chair, 24th IEEE International Conference on Computer Communications and Networks (ICCCN 2015)
- General Vice-Chair, 32nd IEEE International Performance Computing and Communications Conference (IPCCC 2013)
- Program Chair, 31st IEEE International Performance Computing and Communications Conference (IPCCC 2012)
- Publication Chair, 30th IEEE International Performance Computing and Communications Conference (IPCCC 2011)
- Track Chair, 20th IEEE International Conference on Computer Communications and Networks (ICCCN 2011)
- Registration Chair, IEEE International Symposium on Electronic System Design (ISED 2011)
- Poster Chair, 29th IEEE International Performance Computing and Communications Conference (IPCCC 2010)
- Program Committee, IEEE/ACM IPDPS 2018, IEEE ISM 2015, IEEE CLOUD 2015, IEEE CLOUDNET 2015, IEEE CLOUDNET 2014, ACM BodyNet 2014, IARIA CLOUD COMPUTING 2014, ACM BodyNet 2013, IEEE NAS 2013, IARIA INTERNET 2013, IEEE COMPSAC 2012, IEEE NAS 2012, IEEE I-SPAN 2012, FTRA FutureTech 2012, IEEE ICPADS 2010, IEEE AINA 2010, IEEE CloudCom 2010, ACM IC3 2010, IEEE I-SPAN 2009, ACM Compute 2009, IEEE FCST 2009, IEEE AINA 2009, ACM IC3 2009, IEEE/IFIP EUC 2008.
- Paper Reviewer, IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Computers (TOC), IEEE Transactions on Emerging Topics in Computing (TETC), IEEE Transactions on Services Computing (TSC), ACM Transactions on Autonomous and Adaptive Systems (TAAS), Journal of Parallel and Distributed Computing (JPDC), Journal of Systems and Software (JSS), Journal of Future Generation Computer Systems (FGCS), Journal of Supercomputing (JSC).
- IEEE Senior Member
- Member of ACM, ASEE and Sigma Xi.