🤖 AI Summary
Existing object serialization formats (e.g., Protobuf, JSON, XML) exhibit poor readability and limited auditability in source-embedded contexts such as test cases. This paper introduces ProDJ, a plain-code serialization prototype for Java that converts runtime objects directly into syntactically valid, executable, and readable Java source code. Its core idea is to use the target language’s native syntax as the serialization format, integrating reflective introspection, abstract syntax tree (AST) generation, and cycle-aware object graph traversal so that the output is readable, executable, and maintainable. In the evaluation, ProDJ successfully serializes 174,699 objects observed during the execution of 4 open-source Java applications, with no noticeable performance impact. A user study indicates that developers prefer ProDJ-generated Java code over XML or JSON representations of the same objects inside automatically generated tests.
📝 Abstract
In managed languages, objects are typically serialized in bespoke binary formats such as Protobuf, or in markup languages such as XML or JSON. The major limitation of these formats is readability: human developers cannot read binary formats and, in most cases, struggle with the syntax of XML or JSON. This is a major issue when objects are meant to be embedded and read in source code, such as in test cases. To address this problem, we propose plain-code serialization. Our core idea is to serialize objects observed at runtime in the native syntax of a programming language. We realize this vision in the context of Java, and demonstrate a prototype which serializes Java objects to Java source code. The resulting source faithfully reconstructs the objects seen at runtime. Our prototype is called ProDJ and is publicly available. We use ProDJ to successfully plain-code serialize 174,699 objects observed during the execution of 4 open-source Java applications. Our measurements show that the performance impact is not noticeable. Through a user study, we demonstrate that developers prefer plain-code serialized objects within automatically generated tests over their representations as XML or JSON.
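To make the idea of serializing an object "in the native syntax of a programming language" concrete, the following is a minimal sketch and not ProDJ's implementation. The class `PlainCodeSketch`, the method `toJavaSource`, and the example type `Point` are all hypothetical names introduced here for illustration. The sketch assumes the object has a public no-argument constructor and only public `int` or `String` fields; it does not handle nested objects, collections, or cycles, all of which a real tool would need to address.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of plain-code serialization; not ProDJ's API.
class PlainCodeSketch {

    // Hypothetical example type with public fields and a default constructor.
    public static class Point {
        public int x;
        public int y;
        public String label;
    }

    // Emits Java statements that rebuild `obj` in a local variable named `var`.
    // Assumes the generated code is placed where the object's class is in scope.
    public static String toJavaSource(Object obj, String var) throws IllegalAccessException {
        Class<?> type = obj.getClass();
        String typeName = type.getSimpleName();
        StringBuilder src = new StringBuilder();
        src.append(typeName).append(' ').append(var)
           .append(" = new ").append(typeName).append("();\n");

        // getFields() returns public fields in no guaranteed order,
        // so sort by name to keep the output deterministic.
        Field[] fields = type.getFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        for (Field f : fields) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue; // static state is not part of the object
            }
            src.append(var).append('.').append(f.getName())
               .append(" = ").append(literal(f.get(obj))).append(";\n");
        }
        return src.toString();
    }

    // Renders a supported value as a Java literal.
    // Only backslashes and double quotes are escaped in strings.
    public static String literal(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            String s = (String) value;
            return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        if (value instanceof Integer) {
            return value.toString();
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point();
        p.x = 3;
        p.y = 4;
        p.label = "demo";
        System.out.print(toJavaSource(p, "point"));
    }
}
```

Running `main` prints four lines: `Point point = new Point();`, `point.label = "demo";`, `point.x = 3;`, and `point.y = 4;`. That output can be pasted into a test body as ordinary Java, which is the readability benefit the abstract describes in contrast to an XML or JSON fixture.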