Faculty of Informatics – Università della Svizzera italiana (USI)

Profiling Concurrency on the JVM


Developing multi-threaded applications is increasingly important for exploiting the massively parallel computing resources of modern hardware. While parallel programming can substantially speed up applications, it can also lead to suboptimal performance if not done with care. To assess the performance of parallel applications and to locate optimization opportunities, it is fundamental to analyze their behavior from multiple perspectives, particularly with respect to their use of concurrency and synchronization constructs.

We tackle this issue with P3, a novel profiling suite for parallel applications focused on metrics related to parallelism, concurrency, and synchronization. Specifically, P3 profiles the use of concurrent entities, of constructs and classes that implement synchronization, of lock-free operations, and of synchronized and concurrent collections. Our tool enables the collection of metrics that, to the best of our knowledge, are not targeted by other profilers, such as the use of volatile memory accesses, futures and promises, synchronizers, synchronized collections, and concurrent collections. P3 can be run on any standard JVM that supports the JVM Tool Interface (JVMTI). P3 is composed of several profiling modules that can be enabled individually, each incurring only moderate profiling overhead. In addition, P3 can be immediately applied to popular benchmark suites for the JVM (e.g., Renaissance, DaCapo, ScalaBench, SPECjvm2008) and can be readily used to conduct large-scale analyses on public software repositories via NAB. Internally, P3 relies on efficient lock-free data structures, on a careful architectural design that minimizes the computations done in the inserted instrumentation code, and on advanced techniques such as the reification of reflective information in a separate instrumentation process.
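The idea of keeping inserted instrumentation code cheap by backing it with lock-free data structures can be illustrated with a minimal sketch. The class below is not taken from P3: the name MetricCounters, the record hook, and the metric name volatile_read are hypothetical and chosen only for illustration. It assumes that each profiled event (for example, a volatile read) triggers a single call to a hook that increments a per-metric counter without taking any explicit lock.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-metric event counter; an illustration, not P3's actual code. */
public final class MetricCounters {
    // One LongAdder per metric name. LongAdder spreads contended updates
    // over several internal cells instead of serializing them on one lock.
    private static final ConcurrentHashMap<String, LongAdder> COUNTERS =
            new ConcurrentHashMap<>();

    private MetricCounters() {}

    /** Hook that inserted instrumentation code could call once per event. */
    public static void record(String metric) {
        // Fast path: ConcurrentHashMap.get does not block.
        LongAdder adder = COUNTERS.get(metric);
        if (adder == null) {
            // Slow path, taken only the first time a metric is seen.
            adder = COUNTERS.computeIfAbsent(metric, k -> new LongAdder());
        }
        adder.increment();
    }

    /** Returns the current totals, sorted by metric name. */
    public static Map<String, Long> snapshot() {
        Map<String, Long> result = new TreeMap<>();
        COUNTERS.forEach((name, adder) -> result.put(name, adder.sum()));
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        final int threads = 4;
        final int eventsPerThread = 1000;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < eventsPerThread; j++) {
                    record("volatile_read"); // hypothetical metric name
                }
            });
            workers[i].start();
        }
        // Joining all workers makes their increments visible to sum().
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(snapshot()); // prints {volatile_read=4000}
    }
}
```

In this sketch the hook only performs a map lookup and an increment, while aggregation is deferred to snapshot, which mirrors the general principle of doing as little work as possible inside the instrumented code path.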

P3 has been fundamental in the development of the Renaissance suite, where we used P3 attached to NAB to select candidate workloads hosted in public software repositories that show a high degree of concurrency and synchronization. Moreover, we used P3 to filter out workloads showing low parallelism and concurrency, which did not fall within the scope of the suite. We also used P3 to obtain key metrics on concurrency and synchronization for the selected benchmarks, which demonstrated the higher diversity of Renaissance with respect to other prevalent benchmark suites for the JVM. Finally, with P3 we analyzed the variability of different metrics across multiple iterations of the Renaissance benchmarks.

This work has been published at APLAS’20 [1]. An evaluation version of P3 is publicly available [A].


Key Publications


[1] Andrea Rosà, Walter Binder: P3: A Profiler Suite for Parallel Applications on the Java Virtual Machine. APLAS 2020: 364-372 [pdf][video][slides]


Software


[A] See the software page