
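Faculty profile:
Professor and doctoral supervisor at Sun Yat-sen University; recipient of the Guangdong Province Distinguished Young Scholars fund; deputy director of the Institute of Quantum Computing and Software; deputy director of the Guangdong Provincial Key Laboratory of Big Data Analysis and Processing; Yixian Scholar of Sun Yat-sen University. Received a Ph.D. from the Department of Computer Science and Technology, Xi'an Jiaotong University, in June 2016. Interned at Microsoft Research Asia as a "Star of Tomorrow" from July to November 2012; worked as a researcher in the cloud computing department of IBM Research - China from June 2016 to January 2018; and worked as a visiting researcher at the IBM T.J. Watson Research Center from February to April 2017. Main research directions: distributed systems, operating systems, computer networks, cloud computing, serverless computing, and software reliability. In recent years, has published more than 80 high-quality papers in international conferences and journals, and serves as a reviewer for several international journals and conferences.
If you are interested in cloud computing (IaaS, PaaS, SaaS, FaaS), distributed systems, computer networks, or operating systems, and can commit to focused research, you are welcome to get in touch about Ph.D. or postdoctoral positions!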
Research areas:
Distributed systems, cloud computing, operating systems, computer networks, software reliability
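Education:
- 2009.9-2016.6, Department of Computer Science and Technology, Xi'an Jiaotong University, master's and Ph.D.
- 2005.9-2009.6, Department of Computer Science and Technology, Xi'an Jiaotong University, undergraduate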
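Work experience:
- 2024.04-present, Sun Yat-sen University, Professor, doctoral supervisor
- 2018.1-2024.04, School of Data and Computer Science, Sun Yat-sen University, Associate Professor ("Hundred Talents Program"), doctoral supervisor
- 2016.6-2018.1, IBM Research - China, researcher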
Overseas experience:
2017.2-2017.4, IBM T.J. Watson Research Center, visiting researcher
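Awards and honors:
2020: inaugural Outstanding Doctoral Dissertation Award of the Shaanxi Computer Society. Dissertation title: Research on Software Performance Management in Networked Distributed Environments; advisor: Prof. Yong Qi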
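Research projects:
Ongoing:
- "Building and Evolving Intelligent Operations Capabilities for Cloud-Native Systems Based on Large Models", National Key R&D Program of China project (2024YFB4505904), 2025.1-2027.12, principal investigator;
- "AI-Based Fine-Grained Management and Precise Operations for Government Networks", National Key R&D Program of China project (2019YFB1804002), 2020.01~2023.12, principal investigator;
- "Adaptive Fault Detection and Localization for Cloud-Native Systems Based on Multi-Modal Data Fusion", NSFC General Program (62272495), 2023.01~2026.12, principal investigator;
- "Research and Practice of Data-Driven Intelligent Operations Methods for Cloud-Native Systems", Guangdong Province Distinguished Young Scholars fund (2023B1515020054), 2023.01~2026.12, principal investigator;
- "Key Technologies and Application Demonstration of a Secure and Reliable Cloud Operating System", Guangdong Province Major Project (2020B010165002), 2020.01~2024.10, sub-project lead;
- "Reliability Methods for Intelligent Computing Clusters", Sun Yat-sen University (SYSU)-Huawei collaboration project, 2024.07~2025.07, principal investigator;
- "Fault Demarcation and Root Cause Analysis for Dynamic Microservice Topologies Based on Graph Neural Networks", SYSU-Tencent collaboration project, 2024.06~2024.11, principal investigator;
- "Automated Log Instrumentation and Log Mining for the HarmonyOS Operating System", SYSU-Huawei collaboration project, 2024.06~2025.06, principal investigator;
- "Non-Intrusive Full-Link Cloud-Native Observability", SYSU-Huawei collaboration project, 2023.06~2024.06, principal investigator;
- "Problem Diagnosis and Stability Improvement for Cloud-Native Platforms", SYSU-Ant Financial collaboration project, 2023.07~2024.07, principal investigator;
- "Service Mesh Performance Optimization with eBPF and Next-Generation Zero-Intrusion Microservice Governance", Alibaba Innovative Research (AIR) program, 2022.06~2024.07, principal investigator;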
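Completed projects:
- "Intelligent Operations Research for the Euler Operating System and Microservices", Sun Yat-sen University (SYSU)-Huawei university-industry collaboration, 2023.03-2024.03, principal investigator;
- "eBPF-Based Cloud-Edge Network Performance Optimization, Problem Localization Methods, and Prototype System", CCF-Lenovo Blue Ocean program, 2022.08~2023.08, principal investigator;
- "Dynamic Topology Construction and Fault Analysis under Microservice Architectures", SYSU-Tencent collaboration project, 2023.05~2023.10, principal investigator;
- "Big-Data-Based Intelligent Early Warning and Prevention System for Local Financial Security", National Natural Science Foundation of China (NSFC)-Guangdong Provincial People's Government Big Data Science Research Center project (U1811462), 2019.01~2022.12, sub-project lead;
- "Performance Anomaly Detection and Localization for Cloud-Native Systems", Guangdong Natural Science Foundation General Program (No. 2019A1515012229), 2019.10~2022.10, principal investigator;
- "End-to-End Performance Problem Localization and Recovery for Cloud-Native Systems", Guangzhou Basic and Applied Basic Research project (No. 202002030328), 2020.4~2022.3, principal investigator;
- "Fault Root Cause Localization Based on Microservice Dependency Graphs", Tencent WeChat Rhino-Bird project, 2021.04~2022.04, principal investigator;
- "SRE Intelligent Operations in Cloud-Native Environments", SYSU-Huawei collaboration project, 2021.11~2022.11, principal investigator;
- "Proactive Fault Injection Experiment Platform for Microservice Systems", university-industry collaboration project, 2020.02~2021.02, principal investigator;
- "Research and Application of Converging Blockchain with Low-Latency, Bandwidth-Sensitive IoT Scenarios in 5G Environments", Ministry of Education blockchain core project (No. 2020KJ010801), SYSU lead;
- "Software Performance Diagnosis for Containerized Microservice Architectures", NSFC Young Scientists Fund (No. 61802448), 2019.01~2021.12, principal investigator;
- "General Fault Discovery and Fault Analysis for XXX Services", Tencent, 2021-2022, principal investigator;
- "Frontier Intelligent Operations Technologies for ICT Environments", SYSU-Huawei collaboration project, 2020.06~2021.06, principal investigator;
- "AI Cloud Computing Platform System for Deep Learning", SYSU-Huawei collaboration project, 2020.08~2020.11, principal investigator;
- "Software Aging Patterns and Rejuvenation Techniques in Networked Environments", NSFC Key Program (No. 60933003), 2010.1-2013.12, key project member;
- "Key Problems of Power Capping in Data Centers under Cloud Computing", NSFC General Program (No. 61272460), 2013.1-2019.12, participant;
- "Prediction and Isolation of Performance Interference among Multi-Application Workloads", NSFC General Program (No. 61672421), 2016.1-2019.12, participant;
- "ICBC PaaS Cloud Support Project", commercial collaboration project between ICBC and IBM, 2017.03-2018.01, sub-project lead;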
Academic service
Reviewer: ACM TOSEM, IEEE TDSC, IEEE TPDS, IEEE TC, IEEE TSC, Journal of Software (软件学报), IEEE Transactions on Cybernetics, Information Science, Neurocomputing, Soft Computing, FGCS, JSEP, etc.
Editor: JSEP, Computer Technology and Development (计算机技术与发展)
Courses taught
- Principles of Operating Systems (undergraduate)
- Advanced Distributed Systems (graduate)
- Distributed Systems (undergraduate)
Selected publications
- [CCF A, IEEE TPDS] Wanqi Yang (student), Pengfei Chen*, et al., ZeroTracer: In-band eBPF-based Trace Generator with Zero Instrumentation for Microservice Systems, IEEE Transactions on Parallel and Distributed Systems (CCF-A journal), 2025, to appear.
- [CCF A, IEEE TSC] Wanqi Yang (student), Pengfei Chen*, et al., eProbe: eBPF-enhanced Accurate Container Status Probing in Cloud-native Systems, IEEE Transactions on Services Computing (CCF-A journal), 2025, to appear.
- [CCF A, IEEE TC] Liang Ai (student), Pengfei Chen*, et al., A Self-Tuning and Fair Shared Replacement Cache, IEEE Transactions on Computers (CCF-A journal), 2025, to appear.
- [CCF A, IEEE/ACM ToN] Hongyang Chen (student), Pengfei Chen*, et al., NetScope: Fault Localization in Programmable Networking Systems with Low-cost In-Band Network Telemetry and In-Network Detection, IEEE/ACM Transactions on Networking (CCF-A journal), 2025, to appear.
- [CCF A, ACM TOSEM] Zilong He (student), Pengfei Chen*, et al., On the Practicability of Deep Learning based Anomaly Detection for Modern Online Software Systems: A Pre-Train-and-Align Framework, ACM Transactions on Software Engineering and Methodology (CCF-A journal), 2025, to appear.
- [CCF A, ACM TOSEM] Yu, Guangba (student); Tang, Gou; Huang, Haojia; Zhang, Zhenyu; Chen, Pengfei*, A Survey on Failure Analysis and Fault Injection in AI Systems, ACM Transactions on Software Engineering and Methodology (ACM TOSEM, CCF-A journal), 2025, to appear.
- [CCF A, ACM ASPLOS'25] Haiyu Huang (student), Pengfei Chen*, et al., Mint: Cost-Efficient Tracing with All Requests Collection via Commonality and Variability Analysis, the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (CCF-A conference), 2025.
- [CCF A, ACM FSE'24] Guangba Yu (student), Pengfei Chen*, et al., ChangeRCA: Finding Root Causes from Software Changes in Large Online Systems, FSE'2024 (CCF-A), 2024.
- [CCF A, ACM FSE'24] Haiyu Huang (student), Pengfei Chen*, et al., TraStrainer: Adaptive Sampling for Distributed Traces with System Runtime State, FSE'2024 (CCF-A, Distinguished Paper Award), 2024.
- [CCF A, ACM ASE'24] Yilun Wang (student), Pengfei Chen*, et al., FaaSConf: QoS-aware Hybrid Resources Configuration for Serverless Workflows, ASE'2024 (CCF-A), 2024, to appear.
- [CCF A, IEEE TDSC] Hongyang Chen (student), Pengfei Chen*, et al., MicroFI: Non-Intrusive and Prioritized Request-Level Fault Injection for Microservice Applications, IEEE Transactions on Dependable and Secure Computing (CCF-A journal), 2024.
- [CCF A, IEEE TPDS] Hui Dou, Yilun Wang (student), Pengfei Chen, et al., DeepCAT+: A Low-Cost and Transferrable Online Configuration Auto-Tuning Approach for Big Data Frameworks, IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS, CCF-A journal), 2024, to appear.
- [CCF A, IEEE TSC] Guangba Yu (student), Pengfei Chen*, et al., FaaSDeliver: Cost-efficient and QoS-aware Function Delivery in Computing Continuum, IEEE Transactions on Services Computing (IEEE TSC, CCF-A), accepted, 2023.
- [CCF A, IEEE TDSC] Xiaoyun Li (student), Pengfei Chen*, et al., SwissLog: Robust Anomaly Detection and Localization for Interleaved Unstructured Logs, IEEE Transactions on Dependable and Secure Computing (IEEE TDSC, CCF-A), accepted, 2022.
- [CCF A, Journal of Software] Zicheng Huang (student), Pengfei Chen*, Guangba Yu, Hongyang Chen, Transparent Request Tracing and Sampling Methods for Java Microservice Systems (in Chinese), Journal of Software (软件学报, CCF-recommended class A Chinese journal), 2021.
- [CCF A, IEEE TDSC] Pengfei Chen, Yong Qi, Xinyi Li, Di Hou, Michael Lyu, ARF-Predictor: Effective Prediction of Aging-Related Failure Using Entropy[J]. IEEE Transactions on Dependable and Secure Computing (IEEE TDSC, CCF-A), pp. 1-19, 2016 (published online, DOI: 10.1109/TDSC.2016.2604381).
- [CCF A, IEEE TSC] Pengfei Chen, Yong Qi, Di Hou, CauseInfer: Automated End-to-End Performance Diagnosis with Hierarchical Causality Graph in Cloud Environment[J]. IEEE Transactions on Services Computing (IEEE TSC, CCF-A), pp. 1-17, 2016 (published online, DOI: 10.1109/TSC.2016.2607739).
- [CCF A, ACM ICSE'24] Xiaoyun Li (student), Pengfei Chen, et al., LogShrink: Effective Log Compression by Leveraging Commonality and Variability of Log Data, The 46th International Conference on Software Engineering (ICSE'24, CCF-A conference), 2024, to appear.
- [CCF A, ACM ESEC/FSE'23] Zhiming Chen (student), Pengfei Chen* (corresponding author), et al., DiagConfig: Configuration Diagnosis of Performance Violations in Configurable Software Systems, The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (CCF-A conference), accepted, 2023.
- [CCF A, ACM ESEC/FSE'23] Guangba Yu (student), Pengfei Chen* (corresponding author), et al., Nezha: Interpretable Fine-Grained Root Causes Analysis for Microservices on Multi-Modal Observability Data, The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (CCF-A conference), accepted, 2023.
- [CCF A, ACM ICSE'23] Guangba Yu (student), Pengfei Chen* (corresponding author), et al., LogReducer: Identify and Reduce Log Hotspots in Kernel on the Fly, The 45th International Conference on Software Engineering (ICSE'23, CCF-A conference), 2023.
- [CCF A, ACM ASE'22] Zilong He (student), Pengfei Chen* (corresponding author), et al., Graph based Incident Extraction and Diagnosis in Large-Scale Online Systems, The 37th IEEE/ACM International Conference on Automated Software Engineering (CCF-A conference), accepted, 2022.
- [CCF A, ACM ASE'22] Tao Huang, Pengfei Chen* (co-first author, corresponding author), et al., A Transferable Time Series Forecasting Service using Deep Transformer model for Online Systems, The 37th IEEE/ACM International Conference on Automated Software Engineering (CCF-A conference), conditionally accepted, 2022.
- [CCF A, WWW'22] Tao Huang, Pengfei Chen* (co-first author, corresponding author), et al., A Semi-Supervised VAE Based Active Anomaly Detection Framework in Multivariate Time Series for Online Systems, accepted as full paper by WWW 2022 (CCF-A conference), 2022.
- [CCF A, WWW'21] Guangba Yu (student), Pengfei Chen* (corresponding author), et al., MicroRank: End-to-End Latency Issue Localization with Extended Spectrum Analysis in Microservice Environments, WWW 2021 (CCF-A conference), 2021.
- [CCF A, WWW'20] Meng Ma, Ping Wang, Jingmin Xu, Pengfei Chen, Yuan Wang, et al., AutoMAP: Diagnose Your Microservice-based Web Applications Automatically, WWW 2020 (CCF-A conference), pp. 246-258, 2020.
- [CCF A, IEEE INFOCOM'14] Pengfei Chen*, Yong Qi, Pengfei Zheng, Di Hou, CauseInfer: Automatic and distributed performance diagnosis with hierarchical causality graph in large distributed systems[C]. Proceedings of the 33rd Annual IEEE International Conference on Computer Communications (INFOCOM'14) (CCF-A conference), April 27-May 2, 2014, Toronto, Canada, 2014: 1887-1895. (EI: 20143017978534).
- [CCF B, CN] Hongyang Chen (student), Pengfei Chen*, et al., MoTor: Resource-efficient Cloud-Native Network Acceleration with Programmable Switches, Computer Networks (CCF-B journal), 2025, to appear.
- [CCF B, IEEE DSN'24] Junyu Zhang (student), Pengfei Chen*, et al., Real-Time Intrusion Detection and Prevention with Neural Network in Kernel using eBPF, The 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN'24 (CCF-B), 2024, to appear.
- [CCF B, IEEE ISSRE'24] Jin Huang (student), Pengfei Chen*, et al., FaaSRCA: Full Lifecycle Root Cause Analysis for Serverless Applications, IEEE ISSRE'24 (CCF-B conference), 2024, to appear.
- [CCF B, IEEE SANER'25] Haojia Huang, Pengfei Chen*, et al., Conan: Uncover Consensus Issues in Distributed Databases Using Fuzzing-driven Fault Injection, The IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER, CCF-B conference), 2025, to appear.
- [CCF B, CN] Hongyang Chen (student), Pengfei Chen*, et al., Graph Neural Network based Robust Anomaly Detection at Service Level in SDN Driven Microservice System, Computer Networks (CCF-B journal), 2023, to appear.
- [CCF C, JNCA] Wanqi Yang (student), Pengfei Chen*, et al., Network shortcut in data plane of service mesh with eBPF, Journal of Network and Computer Applications (CAS Zone 2), 2023, to appear.
- [CCF B, IEEE TNNLS] Zilong He (student), Pengfei Chen*, Xiaoyun Li, Yongfeng Wang, Guangba Yu, et al., "A Spatio-Temporal Deep Learning Approach for Unsupervised Anomaly Detection in Cloud Systems", in IEEE Transactions on Neural Networks and Learning Systems (CAS Zone 1, ESI Highly Cited Paper), to appear, 2020.
- [CCF C, IEEE TCC] Guangba Yu (student), Pengfei Chen*, Zibin Zheng, "Microscaler: Cost-effective Scaling for Microservice Applications in the Cloud with an Online Learning Approach", in IEEE Transactions on Cloud Computing (CAS Zone 1), to appear, 2020.
- [CCF B, IEEE TETC] Pengfei Chen, Yong Qi, Di Hou, InvarNet-X: A Black-box Invariant-based Approach to Diagnosing Big Data Systems[J]. IEEE Transactions on Emerging Topics in Computing (TETC, CAS Zone 2), 5(4), pp. 450-465, 2017, DOI: 10.1109/TETC.2015.2497143.
- [CCF B, IEEE TR] Pengfei Zheng, Yong Qi, Yangfan Zhou, Pengfei Chen, Jianfeng Zhan, Michael Rung-Tsong Lyu, An Automatic Framework for Detecting and Characterizing Performance Degradation of Software Systems, IEEE Transactions on Reliability (IEEE TR), 2014, 63(4): 927-943.
- [CCF B, ACM ICPP'23] Jingrun Zhang (student), Pengfei Chen* (corresponding author), et al., DeepPower: Deep Reinforcement Learning based Power Management for Latency Critical Applications in Multi-core Systems, the 52nd International Conference on Parallel Processing (CCF-B conference), 2023.
- [CCF B, ACM ICPP'23] Benran Wang (student), Pengfei Chen* (corresponding author), et al., MARS: Fault Localization in Programmable Networking Systems with Low-cost In-Band Network Telemetry, the 52nd International Conference on Parallel Processing (CCF-B conference), 2023.
- [CCF B, ICSOC'22] Yufeng Li (student), Guangba Yu, Pengfei Chen* (corresponding author), et al., MicroSketch: Lightweight and Adaptive Sketch based Performance Issue Detection and Localization in Microservice Systems, The 20th International Conference on Service-Oriented Computing (CCF-B conference), 2022.
- [CCF B, IEEE ISSRE'22] Xiaoyun Li (student), Pengfei Chen* (corresponding author), et al., Going through the Life Cycle of Faults in Clouds: Guidelines on Fault Handling, The 33rd IEEE International Symposium on Software Reliability Engineering (CCF-B conference), 2022.
- [CCF B, IEEE ISSRE'22] Zilong He (student), Pengfei Chen* (corresponding author), et al., Share or Not Share? Towards the Practicability of Deep Models for Unsupervised Anomaly Detection in Modern Online Systems, The 33rd IEEE International Symposium on Software Reliability Engineering (CCF-B conference, sole Best Paper Award of the conference), 2022.
- [CCF B, ACM ICPP'22] Hui Dou, Pengfei Chen, et al., DeepCAT: A Cost-Efficient Online Configuration Auto-Tuning Approach for Big Data Frameworks, the 51st International Conference on Parallel Processing (CCF-B conference).
- [CCF B, IEEE ICWS'22] Zijun Hu (student), Pengfei Chen* (corresponding author), Guangba Yu, Zilong He, Xiaoyun Li, TS-InvarNet: Anomaly Detection and Localization based on Tempo-spatial KPI Invariants in Distributed Services, to appear at IEEE ICWS 2022 (CCF-B conference), 2022.
- [CCF B, IEEE DSN'22] Wenlu Wang, Pengfei Chen* (corresponding author), et al., Active-MTSAD: Multivariate Time Series Anomaly Detection With Active Learning, to appear at IEEE/IFIP DSN 2022 (CCF-B conference), 2022.
- [CCF B, IEEE ICWS'21] Zicheng Huang (student), Pengfei Chen* (corresponding author), Guangba Yu, Hongyang Chen, Sieve: Attention-based Sampling of End-to-End Trace Data in Distributed Microservice Systems (CCF-B conference), accepted as full paper by IEEE ICWS, 2021.
- [CCF C, ACM CCGRID'21] Zihao Ye (student), Pengfei Chen* (corresponding author), Guangba Yu, T-Rank: A Lightweight Spectrum based Fault Localization Approach for Microservice Systems, CCGrid 2021 (CCF-C conference, CORE A conference), 2021.
- [CCF B, IEEE ISSRE'20] Xiaoyun Li (student), Pengfei Chen* (corresponding author), Linxiao Jing, Zilong He, Guangba Yu, SwissLog: Robust and Unified Deep Learning Based Log Anomaly Detection for Diverse Faults, ISSRE 2020 (CCF-B conference), 2020.
- [CCF B, IEEE ICWS'19] Guangba Yu (student), Pengfei Chen* (corresponding author), Zibin Zheng, Microscaler: Automatic Scaling for Microservices with an Online Learning Approach, IEEE ICWS 2019 (CCF-B conference), 2019.
- [CCF B, ICSOC'18] Jinjin Lin (student), Pengfei Chen* (corresponding author), Zibin Zheng, Microscope: Pinpoint the Abnormal Services with Causal Graphs in Micro-service Environments, IEEE ICSOC 2018 (CCF-B conference), 2018.
- [CCF B, IEEE ICWS'17] Tong Jia, Pengfei Chen*, Lin Yang, Ying Li, Fanjing Meng, Jingmin Xu. An Approach for Anomaly Diagnosis Based on Hybrid Graph Model with Logs for Distributed Services[C]. IEEE International Conference on Web Services (IEEE ICWS, CCF-B conference). IEEE, 2017: 25-32, Hawaii, USA.
- [CCF C, IEEE Cloud'17] Tong Jia, Lin Yang*, Pengfei Chen, Ying Li, Fanjing Meng, Jingmin Xu. LogSed: Anomaly Diagnosis through Mining Time-Weighted Control Flow Graph in Logs[C]. IEEE International Conference on Cloud Computing (CCF-C conference), 2017: 447-455, Hawaii, USA.
- [CCF C, ACM CCGRID'18] Ping Wang, Jingmin Xu, Meng Ma, Pengfei Chen and Yuan Wang, et al. CloudRanger: Root Cause Identification for Cloud Native Systems[C]. Proceedings of the International Symposium on Cluster, Cloud and Grid Computing (CCF-C conference), May 1-4, 2018.