Software Performance Engineering

At GWDG, we provide performance engineering services that help our customers achieve their goals with a high level of accuracy and efficiency. These services improve our customers' computing experience and help them integrate AI solutions into their HPC workloads.

Goals

  • Improve efficiency and productivity on our HPC systems
  • Improve the customer experience
  • Increase competitiveness

Key Components

  • Testing and Measurements
    • Use of software development and analysis tools to test and measure key performance metrics
  • Root cause analysis
    • Use performance tools to analyze performance data and identify possible root causes of performance issues
  • Modeling
    • Characterization of hardware and software features
    • Predictive analytics: Apply machine learning algorithms to performance data to predict possible performance degradation caused by changes in the systems
  • Monitoring
    • Real-time performance monitoring: Use dashboards to identify bottlenecks in utilization of HPC resources
  • Optimization
    • Use performance tools to identify bottlenecks
    • Use performance models to identify optimization options
    • Analyze performance data to explore the optimization space
    • Use AI-powered tools to optimize resource utilization and power consumption
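The measurement and degradation-detection steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of the tools GWDG uses: the function names and the example workload are hypothetical, and a simple median comparison with a fixed tolerance stands in for the machine-learning-based predictive analytics mentioned in the list.

```python
import statistics
import time
from typing import Callable, List


def measure_runtime(workload: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run `workload` several times; return the wall-clock time of each run in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def is_regression(baseline: List[float], current: List[float], tolerance: float = 0.10) -> bool:
    """Flag a regression when the current median runtime exceeds the
    baseline median by more than `tolerance` (10% by default)."""
    return statistics.median(current) > statistics.median(baseline) * (1.0 + tolerance)


def example_workload() -> int:
    # Hypothetical stand-in for a real HPC kernel or application run.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    samples = measure_runtime(example_workload)
    print(f"median runtime: {statistics.median(samples):.6f} s")
```

Comparing medians rather than means keeps a single noisy run from triggering a false alarm. In practice, the baseline samples would come from measurements taken before a system or software change.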

Performance of LLM Inference

We aim to investigate the scalability and memory utilization of LLM inference on GWDG systems and to advise our customers on performance best practices.
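As an illustration of why memory utilization matters for LLM inference, the following sketch estimates the size of the key/value cache of a transformer decoder. It assumes a standard attention layout in which one key and one value vector are cached per layer, per KV head, and per token. The model configuration in the example is hypothetical and does not describe any model deployed at GWDG. Model weights and activations are not included in the estimate.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int,
    bytes_per_value: int = 2,  # 2 bytes per value for fp16/bf16
) -> int:
    """Estimate the KV-cache size in bytes for a transformer decoder."""
    # The leading factor 2 accounts for storing both keys and values.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration, chosen only for illustration.
    size = kv_cache_bytes(
        num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096, batch_size=1
    )
    print(f"KV cache: {size / 2**30:.2f} GiB")  # prints "KV cache: 2.00 GiB"
```

The estimate grows linearly with both sequence length and batch size, so long contexts or many concurrent requests can make the cache, rather than the weights, the limiting factor for GPU memory.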

AI-powered Performance Engineering (Research options)

We aim to integrate AI technologies and machine learning techniques into our services to improve the performance and efficiency of our software systems.

  • Develop research ideas in collaboration with customers and partner institutions
  • Horovod + Ray

Domains covered

  • All Scientific Domains
    • Physics, Chemistry, Mathematics
  • Applied AI
    • ML, DNN and LLM services
  • High Performance Data Analytics

Tools

Use cases

  • Using Score-P to instrument NNI - AutoML
  • Characterization of resource utilization (CPU, memory, network, etc.) - WLCG
  • Monitoring energy consumption of GROMACS on Emmy

Performance Engineering Competition

We plan student competitions and hackathons on performance engineering on a regular basis.

Contact

  • GWDG Academy courses
  • Write a ticket to hpc-support mentioning PE in the subject