Publications

Sascha Hunold

Journal Publications

  1. Ioannis Vardas, Jesper Larsson Träff, Ruben Laso, and Sascha Hunold. Mpisee: Communicator-Centric Profiling of MPI Applications. In Concurrency and Computation: Practice and Experience, vol. 37, no. 15-17, doi: 10.1002/CPE.70158, 2025. PDF
  2. Majid Salimi Beni, Sascha Hunold, and Biagio Cosenza. Analysis and prediction of performance variability in large-scale computing systems. In The Journal of Supercomputing, vol. 80, no. 10, doi: 10.1007/S11227-024-06040-W, pp. 14978-15005, 2024. PDF
  3. Jesper Larsson Träff, Sascha Hunold, Guillaume Mercier, and Daniel J. Holmes. MPI collective communication through a single set of interfaces: A case for orthogonality. In Parallel Computing, vol. 107, doi: 10.1016/j.parco.2021.102826, pp. 102826, 2021. PDF
  4. Thomas Kainrad, Sascha Hunold, Thomas Seidel, and Thierry Langer. LigandScout Remote: A New User-Friendly Interface for HPC and Cloud Resources. In Journal of Chemical Information and Modeling, vol. 59, no. 1, doi: 10.1021/acs.jcim.8b00716, pp. 31-37, 2019. PDF
  5. Raphaël Bleuse, Sascha Hunold, Safia Kedad-Sidhoum, Florence Monna, Grégory Mounié, and Denis Trystram. Scheduling Independent Moldable Tasks on Multi-Cores with GPUs. In IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 9, doi: 10.1109/TPDS.2017.2675891, pp. 2689-2702, 2017. PDF Code
  6. Alexandra Carpen-Amarie, Sascha Hunold, and Jesper Larsson Träff. On Expected and Observed Communication Performance with MPI Derived Datatypes. In Parallel Computing, vol. 69, doi: 10.1016/j.parco.2017.08.006, pp. 98-117, 2017. PDF Code
  7. Sascha Hunold and Alexandra Carpen-Amarie. Reproducible MPI Benchmarking Is Still Not As Easy As You Think. In IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 12, doi: 10.1109/TPDS.2016.2539167, pp. 3617-3630, 2016. PDF Code
  8. Sascha Hunold. One Step towards Bridging the Gap between Theory and Practice in Moldable Task Scheduling with Precedence Constraints. In Concurrency and Computation: Practice and Experience, vol. 27, no. 4, doi: 10.1002/cpe.3372, pp. 1010-1026, 2015. PDF
  9. Rémi Bertin, Sascha Hunold, Arnaud Legrand, and Corinne Touati. Fair scheduling of bag-of-tasks applications using distributed Lagrangian optimization. In Journal of Parallel and Distributed Computing, vol. 74, no. 1, doi: 10.1016/j.jpdc.2013.08.011, pp. 1914-1929, 2014. PDF
  10. Sascha Hunold, Thomas Rauber, and Gudula Rünger. Combining Building Blocks for Parallel Multi-level Matrix Multiplication. In Parallel Computing, vol. 34, no. 6-8, doi: 10.1016/j.parco.2008.03.003, pp. 411-426, 2008. PDF

Conference Publications

  1. Guillaume Pallez, Judith Hill, and Sascha Hunold. Implementing a Reproducibility Initiative in HPC: Experiences from SC'24. In Proceedings of the 3rd ACM Conference on Reproducibility and Replicability, doi: 10.1145/3736731.3746148, pp. 76-84, 2025. PDF
  2. Ruben Laso, Diego Krupitza, and Sascha Hunold. Exploring Scalability in C++ Parallel STL Implementations. In Proceedings of the 53rd International Conference on Parallel Processing, ICPP 2024, Gotland, Sweden, August 12-15, 2024, doi: 10.1145/3673038.3673065, pp. 284-293, 2024. PDF
  3. Majid Salimi Beni, Biagio Cosenza, and Sascha Hunold. MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns. In IEEE CLUSTER, 2024. PDF
  4. Jesper Larsson Träff, Sascha Hunold, Ioannis Vardas, and Nikolaus Manes Funk. Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), 2023. PDF
  5. Joseph Schuchart, Sascha Hunold, and George Bosilca. MPI Process Synchronization in Space and Time. In Proceedings of EuroMPI, 2023. PDF
  6. Sascha Hunold and Klaus Kraßnitzer. A Quantitative Analysis of OpenMP Task Runtime Systems. In Proceedings of the 14th BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench 2022), 2022. PDF
  7. Sascha Hunold, Abhinav Bhatele, George Bosilca, and Peter Knees. Predicting MPI Collective Communication Performance Using Machine Learning. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTER49012.2020.00036, pp. 259-269, 2020. PDF
  8. Konrad von Kirchbach, Markus Lehr, Sascha Hunold, Christian Schulz, and Jesper Larsson Träff. Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTER49012.2020.00011, pp. 1-11, 2020. PDF
  9. Jesper Larsson Träff and Sascha Hunold. Decomposing MPI Collectives for Exploiting Multi-lane Communication. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTER49012.2020.00037, pp. 270-280, 2020. PDF
  10. Jesper Larsson Träff, Sascha Hunold, Guillaume Mercier, and Daniel J. Holmes. Collectives and Communicators: A Case for Orthogonality. In EuroMPI/USA, doi: 10.1145/3416315.3416319, pp. 31-38, 2020. PDF
  11. Jesper Larsson Träff and Sascha Hunold. Cartesian Collective Communication. In Proceedings of the 48th International Conference on Parallel Processing (ICPP), doi: 10.1145/3337821.3337848, pp. 48:1-48:11, 2019. PDF Code
  12. Sascha Hunold and Alexandra Carpen-Amarie. Hierarchical Clock Synchronization in MPI. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTER.2018.00050, pp. 325-336, 2018. PDF
  13. Sascha Hunold and Alexandra Carpen-Amarie. Autotuning MPI Collectives using Performance Guidelines. In Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia), doi: 10.1145/3149457.3149461, pp. 64-74, 2018. PDF Code
  14. Franz C. Heinrich, Tom Cornebize, Augustin Degomme, Arnaud Legrand, Alexandra Carpen-Amarie, Sascha Hunold, Anne-Cécile Orgerie, and Martin Quinson. Predicting the Energy Consumption of MPI Applications at Scale Using a Single Node. In Proceedings of the 2017 IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTER.2017.66, 2017. PDF
  15. Sascha Hunold, Alexandra Carpen-Amarie, Felix Donatus Lübbe, and Jesper Larsson Träff. Automatic Verification of Self-consistent MPI Performance Guidelines. In Euro-Par 2016, doi: 10.1007/978-3-319-43659-3_32, pp. 433-446, 2016. PDF Code
  16. Alexandra Carpen-Amarie, Sascha Hunold, and Jesper Larsson Träff. On the Expected and Observed Communication Performance with MPI Derived Datatypes. In EuroMPI, doi: 10.1145/2966884.2966905, pp. 108-120, 2016. PDF Code
  17. Sascha Hunold and Alexandra Carpen-Amarie. On the Impact of Synchronizing Clocks and Processes on Benchmarking MPI Collectives. In EuroMPI, doi: 10.1145/2802658.2802662, pp. 8:1-8:10, 2015. PDF Code
  18. Jesper Larsson Träff, Felix Donatus Lübbe, Antoine Rougier, and Sascha Hunold. Isomorphic, Sparse MPI-like Collective Communication Operations for Parallel Stencil Computations. In EuroMPI, doi: 10.1145/2802658.2802663, pp. 10:1-10:10, 2015. PDF
  19. Sascha Hunold, Alexandra Carpen-Amarie, and Jesper Larsson Träff. Reproducible MPI Micro-Benchmarking Isn’t As Easy As You Think. In EuroMPI/ASIA, doi: 10.1145/2642769.2642785, pp. 69:69-69:76, 2014. PDF Code
  20. Jesper Larsson Träff, Antoine Rougier, and Sascha Hunold. Implementing a Classic: Zero-copy All-to-all Communication with MPI Datatypes. In Proceedings of the 28th International Conference on Supercomputing (ICS'14), doi: 10.1145/2597652.2597662, 2014. PDF
  21. Sascha Hunold. Scheduling Moldable Tasks with Precedence Constraints and Arbitrary Speedup Functions on Multiprocessors. In Proceedings of the 10th International Conference on Parallel Processing and Applied Mathematics (PPAM), doi: 10.1007/978-3-642-55195-6_2, 2013. PDF
  22. Sascha Hunold and Joachim Lepping. Evolutionary Scheduling of Parallel Tasks Graphs onto Homogeneous Clusters. In Proceedings of the IEEE International Conference on Cluster Computing (Cluster 2011), doi: 10.1109/CLUSTER.2011.45, 2011. PDF
  23. Marvin Ferber, Sascha Hunold, and Thomas Rauber. Combining OO Design and SOA with Remote Objects over Web Services. In Proceedings of the 8th European Conference on Web Services (ECOWS 2010), doi: 10.1109/ECOWS.2010.19, 2010. PDF
  24. Marvin Ferber, Sascha Hunold, and Thomas Rauber. BPEL Remote Objects: Integrating BPEL Processes into Object-Oriented Applications. In Proceedings of the 7th IEEE 2010 International Conference on Services Computing (SCC 2010), doi: 10.1109/SCC.2010.84, 2010. PDF
  25. Sascha Hunold. Low-Cost Tuning of Two-Step Algorithms for Scheduling Mixed-Parallel Applications onto Homogeneous Clusters. In Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), doi: 10.1109/CCGRID.2010.52, 2010. PDF
  26. Marvin Ferber, Sascha Hunold, Björn Krellner, Thomas Rauber, Thomas Reichel, and Gudula Rünger. Reducing the Class Coupling of Legacy Code by a Metrics-Based Relocation of Class Members. In Proceedings of the 4th IFIP TC2 Central and East European Conference on Software Engineering Techniques (CEE-SET), doi: 10.1007/978-3-642-28038-2_16, 2009. PDF
  27. Sascha Hunold, Björn Krellner, Thomas Rauber, Thomas Reichel, and Gudula Rünger. Pattern-based Refactoring of Legacy Software Systems. In Proceedings of the 11th International Conference on Enterprise Information Systems (ICEIS), doi: 10.1007/978-3-642-01347-8_7, pp. 78-89, 2009. PDF
  28. Sascha Hunold, Thomas Rauber, and Frédéric Suter. Redistribution Aware Two-Step Scheduling for Mixed-Parallel Applications. In Proceedings of the 10th IEEE International Conference on Cluster Computing (CL), doi: 10.1109/CLUSTR.2008.4663755, 2008. PDF
  29. Sascha Hunold, Matthias Korch, Björn Krellner, Thomas Rauber, Thomas Reichel, and Gudula Rünger. Transformation of Legacy Software into Client/Server Applications through Pattern-based Rearchitecturing. In Proc. of the IEEE Computer Software and Applications Conference (COMPSAC 2008), doi: 10.1109/COMPSAC.2008.158, 2008. PDF
  30. Sascha Hunold, Thomas Rauber, and Gudula Rünger. Design and Evaluation of a Parallel Data Redistribution Component for TGrid. In Proceedings of the International Symposium on Parallel and Distributed Processing and Applications (ISPA), doi: 10.1007/11946441_58, 2006. PDF
  31. Sascha Hunold and Thomas Rauber. Reducing the Overhead of Intra-Node Communication in Clusters of SMPs. In Proc. of the 3rd International Symposium on Parallel and Distributed Processing and Applications (ISPA), doi: 10.1007/11576235_10, 2005. PDF
  32. Sascha Hunold and Thomas Rauber. Automatic Tuning of PDGEMM Towards Optimal Performance. In Proc. of the Euro-Par Conference 2005, doi: 10.1007/11549468_91, 2005. PDF
  33. Sascha Hunold, Thomas Rauber, and Gudula Rünger. Multilevel Hierarchical Matrix Multiplication on Clusters. In Proceedings of the 18th Annual ACM International Conference on Supercomputing (ICS), doi: 10.1145/1006209.1006230, pp. 136-145, 2004. PDF
  34. Sascha Hunold, Thomas Rauber, and Gudula Rünger. Hierarchical Matrix-Matrix Multiplication based on Multiprocessor Tasks. In Proceedings of the International Conference on Computational Science (ICCS), Part II, doi: 10.1007/978-3-540-24687-9_1, pp. 1-8, 2004. PDF

Workshop Publications

  1. Majid Salimi Beni, Ruben Laso, Biagio Cosenza, Siegfried Benkner, and Sascha Hunold. Exploring NCCL Tuning Strategies for Distributed Deep Learning. In Proceedings of the IPDPS Workshops, 2025. PDF
  2. Sascha Hunold. Verifying Performance Guidelines for MPI Collectives at Scale. In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023, Denver, CO, USA, November 12-17, 2023, doi: 10.1145/3624062.3625532, pp. 1264-1268, 2023. PDF
  3. Philippe Swartvagher, Sascha Hunold, Jesper Larsson Träff, and Ioannis Vardas. Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023, Denver, CO, USA, November 12-17, 2023, doi: 10.1145/3624062.3624109, pp. 404-415, 2023. PDF
  4. Majid Salimi Beni, Sascha Hunold, and Biagio Cosenza. Algorithm Selection of MPI Collectives Considering System Utilization. In Proceedings of the Euro-Par Workshops, doi: 10.1007/978-3-031-48803-0_37, pp. 302-307, 2023. PDF
  5. Ioannis Vardas, Sascha Hunold, Philippe Swartvagher, and Jesper Larsson Träff. Exploring Mapping Strategies for Co-allocated HPC Applications. In Proceedings of the Euro-Par Workshops, doi: 10.1007/978-3-031-48803-0_31, pp. 271-276, 2023. PDF
  6. Sascha Hunold, Jordy I. Ajanohoun, Ioannis Vardas, and Jesper Larsson Träff. An Overhead Analysis of MPI Profiling and Tracing Tools. In Proceedings of the 2nd Workshop on Performance EngineeRing, Modelling, Analysis, and VisualizatiOn Strategy (PERMAVOST), doi: 10.1145/3526063.3535353, pp. 5-13, 2022. PDF
  7. Sascha Hunold and Sebastian Steiner. OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning. In Proceedings of the International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2022. PDF
  8. Ioannis Vardas, Sascha Hunold, Jordy I. Ajanohoun, and Jesper Larsson Träff. mpisee: MPI Profiling for Communication and Communicator Structure. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium, IPDPS Workshops, doi: 10.1109/IPDPSW55747.2022.00092, pp. 520-529, 2022. PDF
  9. Sascha Hunold and Bartlomiej Przybylski. Teaching Complex Scheduling Algorithms. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, doi: 10.1109/IPDPSW52791.2021.00058, pp. 321-327, 2021. PDF
  10. Sascha Hunold and Sebastian Steiner. Benchmarking Julia’s Communication Performance: Is Julia HPC ready or Full HPC? In Proceedings of the PMBS@SC, 2020. PDF
  11. Sascha Hunold and Alexandra Carpen-Amarie. Algorithm Selection of MPI Collectives Using Machine Learning Techniques. In Proceedings of the IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS@SC), doi: 10.1109/PMBS.2018.8641622, pp. 45-50, 2018. PDF
  12. Sascha Hunold, Henri Casanova, and Frédéric Suter. From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling. In Proceedings of the 13th Workshop on Advances in Parallel and Distributed Computational Models (APDCM), doi: 10.1109/IPDPS.2011.201, 2011. PDF
  13. Sascha Hunold, Ralf Hoffmann, and Frédéric Suter. Jedule: A Tool for Visualizing Schedules of Parallel Applications. In Proceedings of the 1st International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), doi: 10.1109/ICPPW.2010.34, 2010. PDF
  14. Marvin Ferber, Sascha Hunold, and Thomas Rauber. Load Balancing Concurrent BPEL Processes by Dynamic Selection of Web Service Endpoints. In Proceedings of the Fifth International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS'09), doi: 10.1109/ICPPW.2009.18, 2009. PDF
  15. Sascha Hunold, Thomas Rauber, and Frédéric Suter. Scheduling Dynamic Workflows onto Clusters of Clusters using Postponing. In Proceedings of the 3rd International Workshop on Workflow Systems in e-Science (WSES 08), doi: 10.1109/CCGRID.2008.44, 2008. PDF
  16. Ralf Hoffmann, Sascha Hunold, Matthias Korch, and Thomas Rauber. Towards Scalable Parallel Numerical Algorithms and Dynamic Load Balancing Strategies. In Third Joint HLRB and KONWIHR Result and Reviewing Workshop 2007, doi: 10.1007/978-3-540-69182-2_40, 2008. PDF
  17. Sascha Hunold, Thomas Rauber, and Georg Wille. Sequential and Parallel Implementation of a Constraint-based Algorithm for Searching Protein Structures. In Proceedings of the 9th IEEE International Conference on Cluster Computing (CLUSTER), doi: 10.1109/CLUSTR.2007.4629254, 2007. PDF
  18. Sascha Hunold, Thomas Rauber, and Gudula Rünger. Dynamic Scheduling of Multi-Processor Tasks on Clusters of Clusters. In Proceedings of the Sixth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks (Heteropar'07), doi: 10.1109/CLUSTR.2007.4629277, 2007. PDF
  19. Sascha Hunold, Thomas Rauber, and Gudula Rünger. TGrid - Grid Runtime Support for Hierarchically Structured Task-parallel Programs. In Proceedings of the Fifth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks (Heteropar'06), doi: 10.1109/CLUSTR.2006.311910, 2006. PDF

Technical Reports

  1. Kate Keahey, Marc Richardson, Rafael Tolosana-Calasanz, Sascha Hunold, Jay F. Lofstead, Tanu Malik, and Christian Pérez. Report on Challenges of Practical Reproducibility for Systems and HPC Computer Science. In CoRR, vol. abs/2505.01671, doi: 10.48550/ARXIV.2505.01671, 2025. PDF
  2. Ruben Laso, Diego Krupitza, and Sascha Hunold. pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations. In CoRR, vol. abs/2402.06384, doi: 10.48550/ARXIV.2402.06384, 2024. PDF
  3. Sascha Hunold, Konrad von Kirchbach, Markus Lehr, Christian Schulz, and Jesper Larsson Träff. Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In CoRR, vol. abs/2005.09521, 2020. PDF
  4. Sascha Hunold and Bartlomiej Przybylski. Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia. In CoRR, vol. abs/2003.05217, 2020. PDF
  5. Sascha Hunold and Alexandra Carpen-Amarie. Tuning MPI Collectives by Verifying Performance Guidelines. In CoRR, vol. abs/1707.09965, 2017. PDF Code
  6. Sascha Hunold, Alexandra Carpen-Amarie, Felix Donatus Lübbe, and Jesper Larsson Träff. PGMPI: Automatically Verifying Self-Consistent MPI Performance Guidelines. In CoRR, vol. abs/1606.00215, 2016. PDF Code
  7. Jesper Larsson Träff, Alexandra Carpen-Amarie, Sascha Hunold, and Antoine Rougier. Message-Combining Algorithms for Isomorphic, Sparse Collective Communication. In CoRR, vol. abs/1606.07676, 2016. PDF
  8. Alexandra Carpen-Amarie, Sascha Hunold, and Jesper Larsson Träff. MPI Derived Datatypes: Performance Expectations and Status Quo. In CoRR, vol. abs/1607.00178, 2016. PDF Code
  9. Raphaël Bleuse, Sascha Hunold, Safia Kedad-Sidhoum, Florence Monna, Grégory Mounié, and Denis Trystram. Scheduling Independent Moldable Tasks on Multi-Cores with GPUs. Technical report, Inria Grenoble Rhône-Alpes, Université de Grenoble, URL: https://hal.archives-ouvertes.fr/hal-01263100, 2016. PDF
  10. Sascha Hunold and Alexandra Carpen-Amarie. MPI Benchmarking Revisited: Experimental Design and Reproducibility. In CoRR, vol. abs/1505.07734, 2015. PDF Code
  11. Sascha Hunold. A Survey on Reproducibility in Parallel Computing. In CoRR, vol. abs/1511.04217, 2015. PDF
  12. Sascha Hunold and Jesper Larsson Träff. On the State and Importance of Reproducible Experimental Research in Parallel Computing. In CoRR, vol. abs/1308.3648, 2013. PDF
  13. Rémi Bertin, Sascha Hunold, Arnaud Legrand, and Corinne Touati. From Flow Control in Multi-path Networks to Multiple Bag-of-tasks Application Scheduling on Grids. Technical report, INRIA, URL: http://hal.inria.fr/inria-00627532/en/, pp. 26, 2011. PDF

Publications in German

  1. J. Berndt, M. Ferber, S. Hunold, B. Krellner, I. Nobbers, T. Rauber, T. Reichel, and G. Rünger. Transformation monolithischer Business-Softwaresysteme in verteilte, workflowbasierte Client-Server-Architekturen - Schlussbericht BMBF-Verbundprojekt TransBS. Technical report, TU Chemnitz, URL: urn:nbn:de:bsz:ch1-201001140, 2010.
  2. Marvin Ferber, Sascha Hunold, Björn Krellner, Thomas Rauber, Thomas Reichel, and Gudula Rünger. Softwaremodernisierung durch werkzeugunterstütztes Verschieben von Codeblöcken. In Proceedings der Software Engineering 2009 - Beiträge zu den Workshops, 2009.
  3. Sascha Hunold, Matthias Korch, Björn Krellner, Thomas Rauber, Thomas Reichel, and Gudula Rünger. Inkrementelle Transformation einer monolithischen Geschäftssoftware. In Proceedings der Software Engineering 2008 - Beiträge zu den Workshops, 2008. PDF