tgp is a task-granularity profiler for shared-memory multithreaded JVM applications. tgp is built on top of the DiSL and Shadow VM frameworks, which enable the detection of all spawned tasks, including those in the Java class library (which is notoriously hard to instrument).
tgp detects each task created by an application and measures its granularity either as a bytecode count (i.e., the number of bytecode instructions executed) or as a reference-cycle count (i.e., the number of cycles elapsed during task execution, counted at the nominal frequency of the CPU even when frequency scaling is active).
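Because reference cycles are counted at the nominal frequency, a reference-cycle count can be translated into an approximate duration. The sketch below illustrates that arithmetic only; the class `ReferenceCycles`, the method `toNanos`, and the 3.0 GHz figure are hypothetical and not part of tgp, and the sketch assumes the counter ticks at exactly the nominal frequency.

```java
// Hypothetical helper, not part of tgp: converts a reference-cycle count
// into nanoseconds, assuming the counter ticks at the nominal frequency.
final class ReferenceCycles {

    // 1 GHz corresponds to one cycle per nanosecond, so ns = cycles / GHz.
    static double toNanos(long referenceCycles, double nominalGHz) {
        if (nominalGHz <= 0) {
            throw new IllegalArgumentException("nominal frequency must be positive");
        }
        return referenceCycles / nominalGHz;
    }

    public static void main(String[] args) {
        // 3,000,000 reference cycles on a CPU with an assumed 3.0 GHz
        // nominal frequency correspond to about one millisecond.
        System.out.println(toNanos(3_000_000L, 3.0) + " ns");
    }
}
```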
tgp assists developers in locating performance drawbacks related to suboptimal task granularity. On the one hand, tgp helps detect situations where many small tasks each carry out little computation, potentially introducing parallelization overheads due to significant interference and contention between them. On the other hand, tgp helps detect situations where only a few large tasks are spawned, each performing substantial computation, potentially resulting in low CPU utilization or load imbalance.
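To make the two situations concrete, the following minimal sketch counts how many tasks in a list of per-task granularities fall below or above two cut-offs. It is an illustration, not tgp's analysis: the class `GranularitySummary`, its fields, and both threshold values are assumptions chosen for the example, and sensible thresholds depend on the application and the machine.

```java
import java.util.List;

// Hypothetical helper, not part of tgp: summarizes per-task granularities
// (e.g., bytecode counts) against two illustrative thresholds.
final class GranularitySummary {
    final long fineGrained;   // tasks below the lower threshold
    final long coarseGrained; // tasks above the upper threshold
    final long total;         // all tasks in the input

    private GranularitySummary(long fineGrained, long coarseGrained, long total) {
        this.fineGrained = fineGrained;
        this.coarseGrained = coarseGrained;
        this.total = total;
    }

    static GranularitySummary of(List<Long> granularities, long fineBelow, long coarseAbove) {
        long fine = 0;
        long coarse = 0;
        for (long g : granularities) {
            if (g < fineBelow) {
                fine++;
            } else if (g > coarseAbove) {
                coarse++;
            }
        }
        return new GranularitySummary(fine, coarse, granularities.size());
    }

    public static void main(String[] args) {
        // Made-up sample values; real values would come from a profile.
        List<Long> sample = List.of(120L, 80L, 95L, 5_000_000L, 40_000L);
        GranularitySummary s = of(sample, 1_000L, 1_000_000L);
        System.out.println(s.fineGrained + " fine-grained, "
                + s.coarseGrained + " coarse-grained, " + s.total + " total");
    }
}
```

A large share of fine-grained tasks hints at the first situation (overhead from many small tasks), while a handful of coarse-grained tasks on a machine with many cores hints at the second (low utilization or load imbalance).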
In addition, tgp profiles calling contexts, i.e., all methods open on the call stack upon the creation, submission, or execution of a task. This information helps the user locate classes and methods to modify to optimize task granularity.
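The idea of a calling context can be approximated in plain Java by snapshotting the call stack when a task object is created. The sketch below does this with the standard `StackWalker` API (Java 9+); it is not how tgp collects contexts (tgp relies on DiSL instrumentation), and the classes `ContextRecordingTask` and `ContextDemo` are hypothetical names introduced for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical wrapper, not part of tgp: remembers which methods were open
// on the call stack when the task object was created.
final class ContextRecordingTask implements Runnable {
    private final Runnable delegate;
    private final List<String> creationContext;

    ContextRecordingTask(Runnable delegate) {
        this.delegate = delegate;
        // Walk the current stack from the innermost frame outward and skip
        // the first frame, which is this constructor itself.
        this.creationContext = StackWalker.getInstance().walk(frames -> frames
                .skip(1)
                .map(f -> f.getClassName() + "." + f.getMethodName())
                .collect(Collectors.toList()));
    }

    // Methods open on the stack at creation time, innermost first.
    List<String> creationContext() {
        return creationContext;
    }

    @Override
    public void run() {
        delegate.run();
    }
}

final class ContextDemo {
    static ContextRecordingTask spawn() {
        return new ContextRecordingTask(() -> { });
    }

    public static void main(String[] args) {
        ContextRecordingTask task = spawn();
        task.run();
        // Prints the creating method first, followed by its callers.
        task.creationContext().forEach(System.out::println);
    }
}
```

The same snapshot could be taken at submission or execution time (for example, inside `run`) to approximate the other two contexts mentioned above.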
tgp collects accurate task-granularity profiles even for tasks with complex patterns, such as nested tasks, tasks executed multiple times, and tasks with recursive operations. To enable a detailed and accurate analysis of task granularity, tgp uses vertical profiling: it collects a carefully selected set of metrics from the whole system stack and aligns them via offline analysis. tgp relies on a novel and efficient profiling methodology, instrumentation, and data structures to keep profiling overhead low (a factor of 1.05x on average), which in turn reduces perturbation of the collected task-granularity profiles. Overall, tgp helps developers locate performance and scalability problems related to task granularity. To the best of our knowledge, tgp is the first task-granularity profiler for the JVM.
tgp is an open-source project hosted on GitHub.
Andrea Rosà, Eduardo Rosales, Walter Binder: Analyzing and Optimizing Task Granularity on the JVM. CGO 2018: 27-37
Andrea Rosà, Eduardo Rosales, Walter Binder: Analysis and Optimization of Task Granularity on the Java Virtual Machine. ACM Trans. Program. Lang. Syst. 41(3): 19:1-19:47 (2019)