Authors:
(1) Joanna C. S. Santos, University of Notre Dame, USA; (2) Mehdi Mirakhorli, University of Hawaii at Manoa, USA; (3) Ali Shokri, Virginia Tech, USA.
As shown in Table 4, only our serialization-aware call graph construction (Seneca) and Salsa passed all nine test cases. Only three other algorithms provided partial support for callback methods, namely SootRTA and SootCHA (2 out of 9) and OpalRTA (5 out of 9) [Reif et al. 2019]. The remaining algorithms, i.e., Soot (VTA and Spark), Wala (RTA, 0-CFA, 1-CFA, 0-1-CFA), and Doop (context-insensitive), provided no support at all for serialization-related callback methods.
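To illustrate the kind of construct these test cases exercise, consider the following minimal, hypothetical sketch (the class name Account and its fields are illustrative only). The readObject callback is never invoked explicitly anywhere in the program; the JVM calls it reflectively during deserialization, so a call graph builder that does not model serialization semantics has no edge into it.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

// Hypothetical example of a serialization callback. No source code ever
// calls readObject directly; the JVM invokes it during deserialization.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient int cachedBalance;

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();   // restore the non-transient fields
        this.cachedBalance = -1;  // re-initialize derived (transient) state
    }
}
```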
It is also important to highlight that the frameworks that provided partial support for serialization-related features (SootRTA, SootCHA, and OpalRTA) use imprecise call graph construction algorithms (CHA [Dean et al. 1995] or RTA [Bacon and Sweeney 1996]). Table 5 compares the sizes of the resulting call graphs in terms of nodes and edges. As the table shows, the algorithms used by Soot and Opal that provided partial support for serialization create much larger call graphs (in terms of the number of nodes and edges). Since these algorithms rely only on static types when computing the possible targets of a method invocation, they introduce spurious nodes and edges, thereby increasing the call graph's size.
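The following hypothetical sketch shows why type-based algorithms inflate the graph (the Shape hierarchy is illustrative only). At the virtual call site below, a points-to-based analysis such as 0-CFA can resolve the receiver to the single Circle allocation, whereas CHA resolves the call against every subtype of the declared type, and RTA against every instantiated subtype, adding spurious edges.

```java
// Hypothetical illustration of CHA/RTA imprecision.
interface Shape { double area(); }

class Circle implements Shape {
    public double area() { return Math.PI; }
}

class Square implements Shape {
    public double area() { return 1.0; }
}

class Demo {
    public static void main(String[] args) {
        Shape unused = new Square(); // makes Square "live" for RTA
        Shape s = new Circle();
        s.area(); // CHA/RTA: edges to Circle.area() AND Square.area();
                  // a points-to-based analysis resolves Circle.area() only
    }
}
```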
5.1.2 Dataset #2: XCorpus Dataset. Figure 7 depicts the percentage of edges in the runtime call graph of the projects that are missing from the static call graph computed by each approach. From this chart, we notice that Seneca outperformed Wala and Salsa. Our approach has fewer missing edges than the other approaches, i.e., it is able to soundly infer hidden paths through serialization callbacks.
When inspecting the edges that Seneca missed, we observed that these edges were unrelated to serialization callbacks. That is, these were edges for which the underlying pointer analysis algorithm could not soundly infer the points-to sets of variables. For example, we observed edges that were missed because instructions used reflection to invoke methods. These were constructs for which the 0-1-CFA and 1-CFA pointer analyses provided by Wala (our baseline framework) could not correctly resolve the dispatch.
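A minimal, hypothetical example of such a construct is shown below (the use of String.trim is illustrative only). Because the target method name is only known at runtime, a pointer analysis such as WALA's 0-1-CFA cannot statically resolve the dispatch, and the runtime edge to the invoked method is absent from the static call graph.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of a reflective call that static
// pointer analyses cannot resolve.
class ReflectiveCall {
    public static void main(String[] args) throws Exception {
        String methodName = args[0];                    // known only at runtime
        Method m = String.class.getMethod(methodName);  // e.g., "trim"
        Object result = m.invoke(" hello ");            // edge to String.trim() is missed
        System.out.println(result);
    }
}
```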
One of the reasons why Salsa performed similarly to Seneca on the CATS test suite but poorly on the XCorpus dataset has to do with its inability to compute potential method dispatches from classes in the classpath. As described in their work [Santos et al. 2021, 2020], the approach relies on downcasts of objects to infer which object(s) are being deserialized. When downcasts are unavailable, the approach falls back to computing all possible dispatches, but limited to classes in the application scope. Our approach, on the other hand, follows Java's serialization specification and includes all classes in the classpath, irrespective of their scope (i.e., extension, primordial, or application scope).
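The two patterns can be contrasted with a hypothetical sketch (the class Invoice and both method names are illustrative only). With a downcast, the expected type of the deserialized object is visible at the call site; without one, the type must be over-approximated.

```java
import java.io.ObjectInputStream;
import java.io.Serializable;

// Hypothetical sketch of deserialization with and without a downcast.
class Invoice implements Serializable {}

class DeserializePatterns {
    // The downcast reveals the expected type, so an analysis like Salsa
    // can narrow the dispatch targets to Invoice and its subtypes.
    static Invoice readWithCast(ObjectInputStream in) throws Exception {
        return (Invoice) in.readObject();
    }

    // No downcast: Salsa falls back to all classes in the application
    // scope, whereas Seneca, following the serialization specification,
    // considers every serializable class on the classpath regardless of scope.
    static Object readWithoutCast(ObjectInputStream in) throws Exception {
        return in.readObject();
    }
}
```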
5.2.1 Dataset #1: CATS. Figure 8 depicts the number of edges in the static call graph that were not present in the runtime call graph for the test cases in the CATS test suite [Reif et al. 2019]. As shown in this chart, Seneca was able to provide full support for serialization callbacks (passing all test cases, see Table 4) while maintaining reasonably sized call graphs. The call graphs derived by Soot and OPAL were far more imprecise: while Opal and Soot had over 800 imprecise edges, Seneca had between 95 and 343 incorrect edges.
5.2.2 Dataset #2: XCorpus Dataset. Figure 9 plots the percentage of edges that are in the static call graph of each approach but not in the runtime call graph. As observed in this chart, and unsurprisingly, increasing the soundness of the call graph also increased the number of imprecise edges (i.e., edges that did not arise at runtime). Seneca's increase in imprecise edges is comparable to that of Salsa.