A JSON Schema to Bytecode compiler
API-First development (also known as Contract-driven development) starts by designing and documenting the contracts between the components that make up your system (services, user interfaces, third-party providers, etc.) before writing the source code.
There are many benefits to this approach. Some of the main ones include:
- Central Source of Truth: The API specification serves as the definitive guide for component behavior, simplifying testing and validation.
- Parallel Development: Provider and consumer components can be developed simultaneously.
- Automation: Shared artifacts of the system can be generated directly from the specification documents.
An API specification document describes two things: component behavior and shared data schema. For component behavior we use specifications like OpenAPI for REST APIs, or AsyncAPI for Event-driven Architectures. These specifications allow us to define how our components interact, which endpoints they expose, which messages they publish and consume.
On the other hand, for shared schema definitions, we can use JSON Schema to define the data types that our components will interchange. Both OpenAPI and AsyncAPI support JSON Schema definition for requests, responses, and asynchronous message payloads.
A typical process involves using a code generator to create part of the source code from the specification. This is useful when implementing behavior described in the API specification. However, sometimes the goal is only to generate the shared model.
It can be beneficial to create an artifact that contains all the data types exchanged by the components, and let that artifact be imported as a dependency in several services.
For these cases, we can still use a code generator to write the source files, and then a compiler to generate the binary artifact. But since we would not need to add behavior to such a library, it is somewhat overkill to install a full development kit only to write a library that contains some shared data types.
json-schema-compiler is a command-line utility that compiles JSON Schema files to bytecode. It can be used to write a set of class files, or a single jar library, that can then be integrated into any application running on a JVM. This means that the generated models can be instantiated and manipulated from any programming language that can run in a JVM, from Java to Kotlin, from Scala to Groovy or Clojure.
Using the compiler
The compiler is a single self-contained binary that can be downloaded and used on any compatible system (or you can build it for your specific OS and architecture if needed). Suppose we have the following JSON Schema file called Product.json:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/product.schema.json",
  "title": "Product",
  "description": "A product from Acme's catalog",
  "type": "object",
  "properties": {
    "productId": {
      "description": "The unique identifier for a product",
      "type": "integer"
    },
    "productName": {
      "description": "Name of the product",
      "type": "string"
    },
    "price": {
      "description": "The price of the product",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "dimensions": {
      "type": "object",
      "properties": {
        "length": {
          "type": "number"
        },
        "width": {
          "type": "number"
        },
        "height": {
          "type": "number"
        }
      },
      "required": [ "length", "width", "height" ]
    }
  },
  "required": [ "productId", "productName", "price" ]
}
To create a JAR file with equivalent classes, use the following command:
cat Product.json | json-schema-compiler -o product.jar -p es.nachobrito.product
This generates a library called product.jar with the following structure:
$ jar -tf product.jar
META-INF/MANIFEST.MF
es/nachobrito/product/ProductDimensions.class
es/nachobrito/product/Product.class
This approach has some advantages over generating source code:
- No need to install a full JDK and create a project. This eases the generation of such libraries from automated CI/CD pipelines.
- Faster execution time, as this direct compilation will typically take milliseconds, compared to several seconds of code generation, compilation and packaging.
You can review the project documentation for more details and usage instructions.
⚠️ IMPORTANT: This project makes heavy use of the Java Class-File API, which is currently in preview and will be part of the Java 24 release in March 2025. Some important features are still pending, so consider it an experiment for the moment. The project is open source under the Apache 2.0 license, so please feel free to fork and contribute if you think there is something you can fix or enhance :-)
If you are curious about how the compiler works, keep reading. The rest of this article contains some technical documentation of the internal parts.
How to generate bytecode using the Java Class-File API
JEP 484, “Class-File API”, provides “a standard API for parsing, generating, and transforming Java class files”. It was first introduced in JDK 22 as a preview feature and is planned for final release in JDK 24. It is targeted at framework and tooling developers who need to manipulate bytecode at runtime, and it removes the need for third-party libraries that become outdated each time the class-file format changes (remember there is a new version of the JDK every six months!).
The java.lang.classfile.ClassFile interface is the entry point for class file parsing and manipulation.
When it comes to creating new classes, a ClassFile object exposes the build(ClassDesc, Consumer<? super ClassBuilder>) method, which returns the bytes of the new class. The Consumer parameter is a callback you provide: it receives a ClassBuilder instance that you use to define the parts of the class.
Let’s see some example code to understand how this works.
Create a new class
Since JSON Schema defines data types used to transfer information between components, it makes sense to represent them in Java as Records, which are already immutable data containers and a perfect fit for this task.
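To make the target shape concrete, here is a hand-written sketch of records that roughly correspond to the Product schema shown earlier. The type mapping (integer to Integer, number to Double) and the component order are my assumptions for illustration, not the documented output of json-schema-compiler; the generated classes may differ, for example by ordering components alphabetically.

```java
public class RecordSketch {

    // Assumed mapping: JSON "number" -> Double (hypothetical, for illustration only)
    record ProductDimensions(Double length, Double width, Double height) {}

    // Assumed mapping: "integer" -> Integer, "string" -> String, nested object -> its own record
    record Product(Integer productId, String productName, Double price, ProductDimensions dimensions) {}

    public static void main(String[] args) {
        var dimensions = new ProductDimensions(10.0, 5.0, 2.5);
        var product = new Product(1, "Widget", 9.99, dimensions);

        // Records provide accessors, equals, hashCode and toString out of the box
        System.out.println(product.productName());
        System.out.println(product.equals(new Product(1, "Widget", 9.99, dimensions)));
    }
}
```

This is why the compiler needs generators for the constructor, accessors, equals, hashCode and toString: javac normally writes those members for a record, but a tool that emits bytecode directly has to produce them itself.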
The JSON Schema compiler is based on the visitor pattern, defining a set of objects that implement the ModelGenerator interface and are capable of adding different parts of the new class to the ClassBuilder:
public interface ModelGenerator {
  static Set<ModelGenerator> of(
      RuntimeConfiguration runtimeConfiguration,
      ClassGenerationParams params) {
    return Set.of(
        new ConstructorGenerator(runtimeConfiguration, params),
        new PropertiesGenerator(runtimeConfiguration, params),
        new EqualsGenerator(runtimeConfiguration, params),
        new HashCodeGenerator(runtimeConfiguration, params),
        new ToStringGenerator(runtimeConfiguration, params));
  }

  void generatePart();
}
Where ClassGenerationParams is a simple Record that stores the class description, the ClassBuilder and the sorted map of properties:
public record ClassGenerationParams(
ClassDesc classDesc, ClassBuilder classBuilder, SortedMap<String, Property> properties) {}
Once the JSON Schema file is read and the type definitions are identified, the compiler obtains a ClassFile instance and builds a new class with it: it declares the class as a subclass of java.lang.Record and passes the ClassBuilder to a set of generators that compose the constructor, properties and all the other required parts:
var className = /* Fully qualified class name */;
var properties = /* A sorted Map of property definitions, indexed by name */;
// of(...) is a static import of ClassDesc.of
var bytes =
    ClassFile.of()
        .build(of(className), classBuilder -> {
          // define the new class as public final, extending from java.lang.Record
          classBuilder.withFlags(ACC_PUBLIC | ACC_FINAL).withSuperclass(of("java.lang.Record"));
          var classDesc = of(className);
          var params = new ClassGenerationParams(classDesc, classBuilder, properties);
          ModelGenerator.of(runtimeConfiguration, params).forEach(ModelGenerator::generatePart);
        });
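To try the build step in isolation, the following is a minimal self-contained sketch, assuming JDK 24 or later (or JDK 22/23 with preview features enabled). It builds only the empty class shell, with no generators, and then parses the resulting bytes back with the same API to inspect them. The class name BuildAndInspect, the helper buildRecordShell and the com.example.Product name are hypothetical and not part of json-schema-compiler.

```java
import java.lang.classfile.ClassFile;
import java.lang.classfile.ClassModel;
import java.lang.constant.ClassDesc;
import java.lang.reflect.AccessFlag;

public class BuildAndInspect {

    // Builds a public final class whose superclass is java.lang.Record, with no members yet
    static byte[] buildRecordShell(String className) {
        return ClassFile.of().build(ClassDesc.of(className), classBuilder ->
            classBuilder
                .withFlags(ClassFile.ACC_PUBLIC | ClassFile.ACC_FINAL)
                .withSuperclass(ClassDesc.of("java.lang.Record")));
    }

    public static void main(String[] args) {
        byte[] bytes = buildRecordShell("com.example.Product");

        // Parse the generated bytes back into a model to check what was written
        ClassModel model = ClassFile.of().parse(bytes);
        System.out.println(model.thisClass().asInternalName());
        System.out.println(model.superclass().orElseThrow().asInternalName());
        System.out.println(model.flags().has(AccessFlag.FINAL));
    }
}
```

The sketch only inspects the bytes and does not load the class. A usable record still needs its fields, constructor and the other members, which is what the generators contribute.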
Example: Constructor generation
The constructor generator is the simplest one, and serves as an example of how generators work.
As we want to generate Records, we need to create a constructor that receives all the properties. This is done with the ClassBuilder.withMethod method, which receives the following parameters:
- The method name. In this case there is a special constant for the “<init>” value reserved for constructors.
- An instance of MethodTypeDesc that describes the return and parameter types.
- A set of access flags. In this case it is a public constructor.
- A Consumer that will receive the MethodBuilder and use it to write the method body.
To understand the method body we need to think in bytecode: each method works on an operand stack. Operations work like a reverse Polish notation calculator: you push values onto the stack, then you add operations that pop values from the stack as arguments and push their results back. There is also a local variable array that holds the parameters and variables in scope.
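The stack model can be illustrated with a toy simulation in plain Java. This is not how the JVM is implemented and it does not use the Class-File API; the Op and Insn types are hypothetical, named after the real opcodes they imitate. The program evaluates locals[0] + locals[1] * 2.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class ToyStackMachine {

    // Hypothetical mini instruction set, named after the JVM opcodes it imitates
    enum Op { ILOAD, ICONST, IMUL, IADD }

    // arg is the local slot for ILOAD, the constant for ICONST, and unused otherwise
    record Insn(Op op, int arg) {}

    static int run(int[] locals, List<Insn> program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : program) {
            switch (insn.op()) {
                case ILOAD -> stack.push(locals[insn.arg()]);       // copy a local onto the stack
                case ICONST -> stack.push(insn.arg());              // push a constant
                case IMUL -> stack.push(stack.pop() * stack.pop()); // pop two operands, push the product
                case IADD -> stack.push(stack.pop() + stack.pop()); // pop two operands, push the sum
            }
        }
        return stack.pop(); // the result is whatever is left on top
    }

    public static void main(String[] args) {
        int[] locals = {3, 4};
        List<Insn> program = List.of(
            new Insn(Op.ILOAD, 0),
            new Insn(Op.ILOAD, 1),
            new Insn(Op.ICONST, 2),
            new Insn(Op.IMUL, 0),
            new Insn(Op.IADD, 0));
        System.out.println(run(locals, program)); // prints 11
    }
}
```

Note how each operation consumes its operands: once IMUL has run, the values 4 and 2 are gone from the stack and only their product remains. The same happens to the this reference after a constructor call, which is why it has to be loaded again.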
For example, the first thing a constructor does is call the parent class constructor. You can see how the constructor loads the local variable at position 0 (the this reference) and then invokes the java.lang.Record constructor on it (remember that this removes this from the stack, so you need to push it again before performing another operation on it).
Then, the constructor enters a loop that does the following for each property:
- load this onto the stack again
- load the parameter associated with the current index (starting from 1, as 0 is this)
- use the putfield operation, which stores the value in the given field.
@Override
public void generatePart() {
  var propertyTypes = /* a sorted array of ClassDesc objects representing the properties to create */;
  params.classBuilder().withMethod(
      INIT_NAME,
      MethodTypeDesc.of(CD_void, propertyTypes),
      ACC_PUBLIC,
      methodBuilder -> {
        methodBuilder.withCode(
            codeBuilder -> {
              // Invoke parent constructor
              codeBuilder
                  .aload(0)
                  .invokespecial(of("java.lang.Record"), INIT_NAME, MethodTypeDesc.of(CD_void));
              // Set params.properties():
              int index = 1;
              for (var entry : params.properties().entrySet()) {
                codeBuilder
                    // [this]
                    .aload(0)
                    // [the parameter]
                    .aload(index++)
                    // store [the parameter] in [this], associated with the given property name
                    .putfield(
                        params.classDesc(), entry.getValue().formattedName(), entry.getValue().type());
              }
              codeBuilder.return_();
            });
        if (runtimeConfiguration.withJacksonAnnotations()) {
          // One annotation list per constructor parameter, each holding a @JsonProperty
          var annotations =
              params.properties().values().stream()
                  .map(
                      property ->
                          List.of(
                              Annotation.of(
                                  ClassDesc.of(JsonProperty.class.getName()),
                                  AnnotationElement.ofString("value", property.key()))))
                  .toList();
          methodBuilder.with(RuntimeVisibleParameterAnnotationsAttribute.of(annotations));
        }
      });
}
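For readers who want to run the constructor pattern end to end, here is a reduced, self-contained sketch, assuming JDK 24 or later (or JDK 22/23 with preview features enabled). It is not the project's generator: ConstructorSketch, generate, load and demo.Item are hypothetical names, the class extends java.lang.Object instead of java.lang.Record to keep the example loadable without the remaining record members, and the fields are public so they can be read reflectively.

```java
import java.lang.classfile.ClassFile;
import java.lang.constant.ClassDesc;
import java.lang.constant.MethodTypeDesc;
import java.util.LinkedHashMap;
import java.util.Map;

import static java.lang.constant.ConstantDescs.CD_Integer;
import static java.lang.constant.ConstantDescs.CD_Object;
import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_void;
import static java.lang.constant.ConstantDescs.INIT_NAME;
import static java.lang.constant.ConstantDescs.MTD_void;

public class ConstructorSketch {

    // Generates a class with one public final field per entry and a constructor that sets them all
    static byte[] generate(String className, Map<String, ClassDesc> fields) {
        ClassDesc self = ClassDesc.of(className);
        ClassDesc[] paramTypes = fields.values().toArray(ClassDesc[]::new);
        return ClassFile.of().build(self, cb -> {
            cb.withFlags(ClassFile.ACC_PUBLIC | ClassFile.ACC_FINAL);
            fields.forEach((name, type) ->
                cb.withField(name, type, ClassFile.ACC_PUBLIC | ClassFile.ACC_FINAL));
            cb.withMethodBody(INIT_NAME, MethodTypeDesc.of(CD_void, paramTypes), ClassFile.ACC_PUBLIC, code -> {
                // super(): load this, then call Object.<init>, which consumes it
                code.aload(0).invokespecial(CD_Object, INIT_NAME, MTD_void);
                int slot = 1; // slot 0 is this, parameters start at 1
                for (var field : fields.entrySet()) {
                    // this.<field> = <parameter>; aload is only valid for reference types
                    code.aload(0).aload(slot++).putfield(self, field.getKey(), field.getValue());
                }
                code.return_();
            });
        });
    }

    // Defines the generated class in a throwaway class loader
    static Class<?> load(String className, byte[] bytes) {
        return new ClassLoader(ConstructorSketch.class.getClassLoader()) {
            Class<?> define() {
                return defineClass(className, bytes, 0, bytes.length);
            }
        }.define();
    }

    public static void main(String[] args) throws Exception {
        Map<String, ClassDesc> fields = new LinkedHashMap<>();
        fields.put("name", CD_String);
        fields.put("quantity", CD_Integer);

        Class<?> itemClass = load("demo.Item", generate("demo.Item", fields));
        Object item = itemClass.getConstructor(String.class, Integer.class).newInstance("widget", 3);
        System.out.println(itemClass.getField("name").get(item) + " x " + itemClass.getField("quantity").get(item));
    }
}
```

One caveat about this pattern: aload and a slot counter that advances by one only work when every parameter is a reference type. Primitive parameters need the matching typed load instruction, and long and double values occupy two local slots each.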
The other generators are similar to this one, but a little bit more complex as they have logic to generate methods like equals, hashCode or toString.
Why?
The idea of creating json-schema-compiler came when I was reading about this new Class-File API, thinking about possible pet projects to give it a try.
I have been working a lot on API-first development techniques and event-driven architectures recently, in particular with code generation templates for AsyncAPI specifications. I thought it would make sense to generate bytecode from the specification directly, without going through the source code generation step, to have the shared models in a single jar file that interested modules could depend on.
The vision is to have a small binary utility that can be used in CI pipelines to generate the library automatically every time the specification documents are updated.
Future Development
This is a pet project I created to learn a new API. I think it does have applications for real-world use cases, so I will continue maintaining it and implementing the missing bits for a while. I have released it as open source, under the Apache 2.0 license, in case it is of use to anyone. If that happens, I would need some volunteer help to keep it updated and maintained. Feel free to contact me if you are interested!
References:
The following are some reference materials I’ve used to learn about the Class-File API and the Java Virtual Machine instruction set:
- A Basic Introduction to the Classfile API
- Build A Compiler With The JEP 457 Class-File API
- The Java Virtual Machine Instruction Set
- JEP 484: Class-File API
- Looking at Java 22: Class-File API
- Class File API: Not Your Everyday Java API
- Java Bytecode Crash Course