Software Performance Engineering

In modern computing, efficiently managing and optimizing applications, both in terms of computation and input/output (I/O), is of paramount importance, particularly when facing the challenges posed by Big Data, supercomputing, and parallel/distributed computing. Our team at GWDG can help users improve the performance of their applications in these complex and resource-intensive settings.

What is Performance Engineering?

Performance engineering is the process of examining an application, finding what slows it down or prevents it from performing well, and improving those parts. This includes analysis of computation and input/output (I/O) patterns.

Why studying performance matters

Performance issues in HPC applications arise from complex interactions between computation, memory, and I/O. Computational inefficiencies, such as poor parallelization, load imbalance, excessive synchronization, or inefficient use of CPU/GPU resources, can significantly increase execution time. At the same time, suboptimal I/O patterns, contention on shared storage, or misconfigured software stacks can cause unnecessary stalls and reduce throughput. As HPC clusters consume huge amounts of energy, inefficient jobs can easily waste a lot of money and cause substantial unnecessary CO2 emissions (if the data center does not run on 100% renewable energy, as ours does). These resources could instead be spent on far more useful computations in the time saved by properly optimizing your code.

Modern workloads typically combine intensive computation with high data movement, so even small inefficiencies in either domain can propagate across the entire application, increasing runtime, resource usage, and operational costs.

Evaluating these behaviors is inherently challenging. Computation and I/O performance depend simultaneously on the application’s algorithmic structure, the parallel programming model (e.g., MPI, OpenMP, CUDA), the I/O stack and access patterns, the hardware architecture, and the heterogeneous nature of HPC systems.

Computation Performance

Computation performance describes how efficiently a computing system uses its resources to execute a given task. As a simple example, it can refer to utilizing all available CPU cores to achieve the maximum speedup a node can offer. It reflects the system’s ability to complete work quickly, efficiently, and with the minimum amount of wasted resources for a specific workload.

In parallel and HPC environments, computation performance depends strongly on how well an application maps its work onto the available resources. Factors such as load balancing, communication overhead, synchronization, and the parallelization methods used can have a profound effect on performance. In such environments, many resource elements, such as CPU cores, GPUs, memory hierarchies, and interconnects, interact simultaneously. Their behaviors are tightly coupled: a small imbalance in work distribution can increase communication, increased communication can intensify synchronization, and synchronization delays can leave compute units idle. These effects cascade across nodes and accelerators, making computation performance the result of intertwined factors rather than isolated components. As a result, evaluating and improving the computation performance of an application requires understanding how these resource constraints and interactions collectively shape the application’s behavior at scale.

I/O Performance

I/O encompasses all operations outside pure computation. Any data that must be consumed or produced by compute units needs to move across the system: between memory layers, into and out of processors, and across nodes in a distributed environment. These data movements, whether local or across the network, constitute I/O activity.

I/O performance refers to how efficiently a system can move data and the rate at which individual read/write operations can be serviced. For example, when data required by a computation is not locally available, accessing it through storage systems or over the network can introduce delays that directly affect application progress.

Our Offer

Analyzing performance is not a trivial task. Each analysis tool is designed for specific purposes and relies on its own definitions and calculations of performance metrics. As a result, tools may report different values for the same workload. For example, some tools aggregate I/O metrics, while others separate them by interface, such as POSIX and MPI-IO. In addition, instrumenting applications is not always straightforward, as many tools require code changes, environment configuration, or runtime constraints. Differences in sampling granularity, tracing overhead, and support for parallel workloads further complicate the analysis.

Our service helps you understand and improve the performance of your applications. We analyze the code with appropriate tools, uncover the root causes of inefficiencies, and provide guidance to help you optimize it.

Tools

Because HPC systems and applications interact across thousands of possible events and layers, no single tool can capture all performance aspects. Some tools measure CPU and memory utilization; others trace filesystem operations; some aggregate metrics across processes, while others separate them by interface such as POSIX or MPI-IO.

Instrumenting applications is also not always straightforward due to tool-specific build requirements, environment settings, tracing overhead, and limited support for certain frameworks.

When switching tools, measurement continuity becomes a challenge: results must remain comparable, yet performance tools differ in methodology, output formats, and metric definitions, making consistency difficult to verify. This is especially true because performance tools fall into different categories (profiling, tracing, and monitoring), each focusing on a different way of capturing application behavior (see, e.g., DOI: 10.1109/CLUSTERWorkshops61457.2023.00013).

In the following, we list some of the tools that are provided on the cluster of the GWDG.

Score-P: Scalable Performance Measurement Infrastructure for Parallel Codes

Score-P is a performance-measurement framework for C, C++, and Fortran applications that use multi-processing (MPI, SHMEM), thread parallelism (OpenMP, Pthreads), and accelerator models (HIP, CUDA, OpenCL, OpenACC, OpenMP-offload), as well as combinations of these. A Python interface is also available for tracing and profiling Python applications.

Score-P currently supports only MPI or SHMEM for multiprocessing. Other multiprocessing approaches (e.g., Torch distributed methods) cannot be traced. Score-P also does not support importlib.reload().
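
As an illustration, a supported Python application could be traced roughly as follows. This is a minimal sketch assuming the Score-P Python bindings are available in the loaded environment; myscript.py is a placeholder:

# MPI-parallel Python scripts typically also need the --mpp=mpi option
python -m scorep myscript.py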

Score-P generates profiles in the CUBE4 format and event traces in the OTF2 format, which can be analyzed with tools such as Scalasca, Vampir, TAU, and Periscope.

To instrument an application, the user must recompile it with the Score-P instrumentation wrapper. This wrapper is added as a prefix to the original compile and link commands. Score-P automatically detects the programming paradigm by parsing the original build instructions and selects the appropriate instrumentation methods, cf. [Score-P web page].
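
For illustration, a minimal sketch of prefixing the separate compile and link steps of an MPI code (compiler and file names are placeholders):

scorep mpicc -c myApp.c        # compile step, instrumented by the wrapper
scorep mpicc -o myApp myApp.o  # link step, instrumented by the wrapper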

Usage

Use module spider scorep to list available versions on the cluster, and module spider scorep/<VERSION> to display dependencies for a specific version.
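
For example:

module spider scorep
module spider scorep/<VERSION>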

To compile a program, prefix your compile command with scorep:

scorep gcc -o myApp [-fopenmp] myApp.c

To run the instrumented program, simply execute it as usual; the measurement system built into the binary records performance data automatically:

./myApp
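
By default, an instrumented run records a call-path profile; full event tracing and the output location can be controlled through Score-P environment variables. A minimal sketch (the experiment directory name is illustrative):

export SCOREP_ENABLE_TRACING=true
export SCOREP_EXPERIMENT_DIRECTORY=scorep_myApp_run
./myApp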

Example Output (scorep-score)

Below is an example of the output produced by scorep-score, the most basic Score-P analysis tool. It provides high-level trace information, including total trace size, memory requirements, and per-function statistics such as visit counts and time spent.

Example output of scorep-score
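
Such a report can be generated from the profile of an instrumented run, for example (assuming the default experiment directory naming):

scorep-score -r scorep-*/profile.cubex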

Vampir

Vampir is a visualization tool for handling and analyzing instrumented and sampled event traces generated by Score-P. It provides powerful performance-analysis capabilities for highly parallel applications.

Vampir has been available as a commercial product since 1996; “standard” and “professional” licenses must be purchased directly from vampir.eu. However, it is usable as a module on the cluster of the GWDG.

To use Vampir remotely, X11 forwarding must be enabled when connecting to a node:

ssh -X glogin-gpu.hpc.gwdg.de

Vampir can be loaded with the same module commands used for Score-P. For example:

module load vampir/10.7.0
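
Once the module is loaded, a Score-P trace can be opened directly, for example (assuming tracing was enabled and the default experiment directory naming):

vampir scorep-*/traces.otf2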

Below is an example Vampir view, showing a timeline-based visualization of I/O operations alongside function summaries and activity distributions.

Vampir example output

NVTOP

In HPC environments, GPU-intensive workloads often require real-time monitoring of GPU utilization. NVTOP (Neat Videocard TOP) is an htop-like Linux command-line tool that interactively reports GPU usage. On the cluster, module spider nvtop can be used to check available versions. Loading nvtop also requires gcc/13.2.0 or gcc/13.2.0-nvptx:

module load gcc/13.2.0
module load nvtop
Info

NVTOP monitors GPU activity on compute nodes, so it must be run on a node equipped with an active GPU. For information on allocating GPU nodes, refer to the cluster’s GPU usage documentation.
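
A hypothetical interactive session might look as follows; the srun options are purely illustrative, and the correct partition and GPU flags for this cluster are given in the GPU usage documentation:

srun --pty -G 1 bash   # request an interactive shell on a GPU node (options illustrative)
module load gcc/13.2.0
module load nvtop
nvtop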

Sample output from NVTOP

Nsight

NVIDIA Nsight Systems is a system-wide performance analysis tool used to visualize and analyze application behavior. It helps identify optimization opportunities and supports scalability analysis across CPUs and GPUs, from large servers to small system-on-chip (SoC) platforms.

Nsight Systems provides:

  • Unbiased, system-level activity traces on a unified timeline
  • Low-overhead data collection suitable for production workloads
  • CPU parallelization and scheduling
  • GPU SM (streaming multiprocessor) utilization
  • CUDA workload and kernel activity
  • Communication, I/O, and OS interactions
  • Library-level traces

This tool enables users to examine relationships across hardware and software layers and diagnose performance bottlenecks in complex heterogeneous workloads.
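
On the command line, a trace is typically captured with the nsys CLI and the resulting report is then opened in the Nsight Systems GUI; a minimal sketch (the application name and selected trace sources are illustrative):

nsys profile -t cuda,nvtx,osrt -o myApp_report ./myApp   # writes myApp_report.nsys-rep for GUI analysis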

Sample from Nsight Systems (image taken from the NVIDIA Nsight Systems documentation).

To use the module on the cluster of the GWDG, please refer to this page.

Contact

  • Do you need assistance in improving your application’s performance?
    • Write a ticket to our support, mentioning Performance Engineering in the subject
  • Are you looking for a course in this field?