AREX Agent Source Code Analysis

This article walks through the AREX Agent source code to explain how the agent works.

AREX Startup Process

General Java Agent Startup Process

A Java Agent is a JAR that the JVM loads alongside a Java application, typically at startup, so that it can monitor and modify the application's behavior at runtime. Java Agents are commonly used for performance analysis, code coverage, security checks, and other purposes.

Here is the startup process of a Java Agent:

1. Write a Java Agent program that implements the premain method. The premain method is the entry point of the Java Agent and is called when the Java application starts. In the premain method, you can perform initialization operations such as setting up proxies, loading configuration files, etc.
2. Package the Java Agent into a JAR file and specify the Premain-Class attribute in the MANIFEST.MF file. This attribute specifies the entry class of the Java Agent.
3. When starting the Java application, specify the path to the Java Agent's JAR file using the -javaagent parameter. For example:
```
java -javaagent:/path/to/agent.jar -jar myapp.jar
```
In the above command, /path/to/agent.jar is the path to the Java Agent's JAR file, and myapp.jar is the path to the Java application's JAR file.
4. When the Java application starts, the JVM loads the Java Agent's JAR file and calls the premain method. In the premain method, the Java Agent can use the Java Instrumentation API to modify the bytecode of the Java application and achieve monitoring and modification of the application.
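
As a minimal sketch of the first and last steps above, the class below implements premain and registers a transformer that only observes class loading. SketchAgent, LoggingTransformer, and the com/example/ package prefix are hypothetical names chosen for illustration; they are not part of AREX.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class; its fully qualified name would go into Premain-Class.
// It is package-private here only to keep the sketch in one snippet.
class SketchAgent {

    // Called by the JVM before the application's main method runs.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }

    // A transformer that watches class loading and leaves every class unchanged.
    static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // Class names arrive in internal form, with slashes instead of dots.
            if (className != null && className.startsWith("com/example/")) {
                System.out.println("loading " + className);
            }
            return null; // null tells the JVM to keep the original bytecode
        }
    }
}
```

A real agent would return modified bytecode from transform() for the classes it wants to change, which is the job AREX hands to ByteBuddy later in this article.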

The startup process of AREX from the perspective of source code

Step 1

In the pom.xml file of the arex-agent module, the Premain-Class attribute is set to io.arex.agent.ArexJavaAgent through the manifestEntries configuration. As a result, when arex-agent.jar is built, its manifest file names the ArexJavaAgent class as the entry point of the agent.
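
To check which entry class a manifest declares, it can be read back with java.util.jar.Manifest. The sketch below parses manifest text directly; ManifestSketch and premainClass are hypothetical helper names, and the manifest text used with it is only an illustration of what the build described above would produce.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical helper, not part of AREX.
class ManifestSketch {

    // Returns the value of Premain-Class from the main section, or null if absent.
    static String premainClass(String manifestText) {
        byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
        try {
            Manifest manifest = new Manifest(new ByteArrayInputStream(bytes));
            return manifest.getMainAttributes().getValue("Premain-Class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, passing "Manifest-Version: 1.0\nPremain-Class: io.arex.agent.ArexJavaAgent\n" returns the class name that the JVM will look up when it processes -javaagent.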

Step 2

In the ArexJavaAgent class, the premain method is implemented as the entry point method for the agent. In the premain method, it calls the agentmain method. In the agentmain method, it further calls the init(Instrumentation inst, String agentArgs) function. This function accepts an Instrumentation object and a string parameter agentArgs.
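
The call chain described here can be sketched as follows. EntryPointSketch is a hypothetical stand-in for the entry class, and the CALLS list exists only so that the order of calls can be observed; only the premain, agentmain, and init chain mirrors the text.

```java
import java.lang.instrument.Instrumentation;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the agent entry class.
class EntryPointSketch {

    // Records the order of calls so the chain can be inspected.
    static final List<String> CALLS = new ArrayList<>();

    // Entry point used with -javaagent at JVM startup.
    public static void premain(String agentArgs, Instrumentation inst) {
        CALLS.add("premain");
        agentmain(agentArgs, inst);
    }

    // Entry point the JVM uses when an agent is attached to a running process.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        CALLS.add("agentmain");
        init(inst, agentArgs);
    }

    // Shared initialization reached from both entry points.
    private static void init(Instrumentation inst, String agentArgs) {
        CALLS.add("init");
        // A real agent would install its bootstrap jar and run its initializer here.
    }
}
```

Routing premain through agentmain keeps a single initialization path regardless of how the agent was loaded.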

Step 3

In the init function, there are two important operations: installBootstrapJar() and AgentInitializer.initialize().

installBootstrapJar()

The installBootstrapJar() function locates the jar file that contains the AgentInitializer.class and adds it to the search path of the Bootstrap ClassLoader by calling inst.appendToBootstrapClassLoaderSearch(jar). The Bootstrap ClassLoader is a special class loader in the Java virtual machine responsible for loading core libraries such as java.lang and java.util. By calling the appendToBootstrapClassLoaderSearch method, custom libraries can be added to the search path of the Bootstrap ClassLoader, allowing Java applications to use these custom libraries.

To find the jar file from which a given class was loaded, follow these steps:

1. Obtain the Class object of the desired class.
2. Call the getProtectionDomain() method on the Class object to retrieve the ProtectionDomain object of the class.
3. Call the getCodeSource() method on the ProtectionDomain object to obtain the CodeSource object of the class.
4. Call the getLocation() method on the CodeSource object to retrieve the URL of the jar file in which the class is located.
5. Use the getFile() method on the URL object to obtain the file path of the jar file.
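
The steps above can be sketched as a small helper. JarLocator and locate are hypothetical names; the sketch converts the URL through toURI() first so that encoded characters such as spaces are handled, and falls back to the URL's path otherwise.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper, not part of AREX.
class JarLocator {

    // Returns the jar (or class directory) a class was loaded from,
    // or null when the JVM does not expose a code source for it.
    static File locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null; // typical for classes loaded by the bootstrap loader
        }
        URL location = source.getLocation();
        try {
            return new File(location.toURI());
        } catch (URISyntaxException | IllegalArgumentException e) {
            // Fall back to the raw path when the URL is not a plain file URI.
            return new File(location.getPath());
        }
    }
}
```

The resulting file can then be wrapped in a java.util.jar.JarFile and passed to inst.appendToBootstrapClassLoaderSearch(jar), as described above.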

AgentInitializer.initialize()

In the AgentInitializer.initialize() function, the following steps are performed:

1. It locates the jar file containing ArexJavaAgent.class (this logic is in AgentInitializer.java) and sets the arex.agent.jar.file.path variable to the directory where the agent jar file is located.
2. It searches for an /extensions/ subdirectory within that directory and reads all the jar files found there. These jar files are the extension packages.
3. It calls the createAgentClassLoader(agent jar, extension jars) function to create an AgentClassLoader object, which is a custom class loader provided by AREX. Using a custom class loader isolates the AREX Agent's code and prevents the application from accessing it.
4. It calls the createAgentInstaller() function, which uses the previously created AgentClassLoader to load the io.arex.agent.instrumentation.InstrumentationInstaller class, retrieves its constructor, creates an instance, and returns it as a reference of the AgentInstaller interface type.
5. The AdviceClassesCollector collects the agent jar file and the extension jar files.
6. Through the AgentInstaller reference returned earlier, it calls the install() function. This resolves to the install() function of the BaseAgentInstaller class, which calls init(String agentArgs) for initialization. The init() function performs the following operations:
   a. Initialization of the TraceContextManager, which generates an IDGenerator used for generating TransactionID.
   b. Initialization of the installSerializer.
   c. Initialization of the RecordLimiter and setting the recording frequency limit.
   d. Loading the agent configuration using the ConfigService, including settings for debug mode, dynamic class configuration, excluded operations configuration, Dubbo replay threshold, record rate configuration, and more.
   e. Initialization of the data collector, which is chosen based on the running mode, and starting the data collector.
   f. Retrieving the agent configuration from the server again, with three retry attempts. The configuration is then parsed and updated. (Note: There is a bug where the Dubbo replay threshold doesn't get updated after the second retrieval from the server.)
7. The install() function of the BaseAgentInstaller class also calls an abstract function named transform(). The actual implementation of this abstract function is the transform() function of the InstrumentationInstaller class.
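
The extension scan, the isolated class loader, and the reflective creation of the installer can be sketched with plain JDK classes. This is an illustration under stated assumptions rather than AREX's code: AgentLoaderSketch and its methods are hypothetical names, a URLClassLoader with a null parent stands in for AgentClassLoader, and the installer is assumed to have a no-argument constructor.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of AREX.
class AgentLoaderSketch {

    // Collects every *.jar under <agentDir>/extensions; empty when the folder is absent.
    static List<File> findExtensionJars(File agentDir) {
        List<File> jars = new ArrayList<>();
        File[] found = new File(agentDir, "extensions")
                .listFiles((dir, name) -> name.endsWith(".jar"));
        if (found != null) {
            for (File jar : found) {
                jars.add(jar);
            }
        }
        return jars;
    }

    // Builds a loader whose only parent is the bootstrap loader, so classes on the
    // application class path are not visible through it.
    static URLClassLoader createIsolatedLoader(File agentJar, List<File> extensionJars) {
        List<URL> urls = new ArrayList<>();
        try {
            urls.add(agentJar.toURI().toURL());
            for (File jar : extensionJars) {
                urls.add(jar.toURI().toURL());
            }
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
        return new URLClassLoader(urls.toArray(new URL[0]), null);
    }

    // Loads a class by name through the given loader and instantiates it reflectively.
    static Object instantiate(ClassLoader loader, String className) {
        try {
            Class<?> type = loader.loadClass(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + className, e);
        }
    }
}
```

Because the installer is created through reflection, the code that starts the agent never needs a compile-time reference to classes that live only inside the isolated loader.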

With these configurations and operations, the ArexJavaAgent class serves as the entry point for the agent and is loaded when the Java application starts. It also extends the search path of the Bootstrap ClassLoader, so that the classes in the bootstrap jar are visible to the application.

Step 4

The transform() function in the InstrumentationInstaller class implements the code injection operation for the target application.

1. It obtains an instance of ByteBuddy's AgentBuilder through getAgentBuilder().
2. It retrieves a list of all classes annotated with @AutoService(ModuleInstrumentation.class), which are identified as ModuleInstrumentation classes using the com.google.auto.service SPI mechanism.
3. For each class in the list, it calls InstallModule() to register the module using the AgentBuilder and ModuleInstrumentation.
4. Within each ModuleInstrumentation class, it retrieves a list of TypeInstrumentation instances and, for each one, finds the corresponding list of MethodInstrumentation.
5. For each MethodInstrumentation, it invokes the transform() function of AgentBuilder.Identified to perform the code injection.

In summary, this step implements modular instrumentation. By implementing the ModuleInstrumentation interface, modules that require code injection can be defined. Within each module, by implementing the TypeInstrumentation interface, specific types that need code injection can be defined. Similarly, within each type, by implementing the MethodInstrumentation interface, specific methods that require code injection can be defined. This way, the AREX Agent can inject recording and replaying code into the respective methods based on these definitions, enabling recording and replaying functionality.
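
The three-level structure can be modelled without ByteBuddy to show how the levels nest. In the sketch below, Module, TypeRule, and MethodRule are simplified, hypothetical stand-ins for ModuleInstrumentation, TypeInstrumentation, and MethodInstrumentation, and plain string predicates replace ByteBuddy's element matchers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified model; the real interfaces use ByteBuddy matchers instead of predicates.
class InstrumentationModelSketch {

    // Innermost level: which methods to patch and which advice class to apply.
    static class MethodRule {
        final Predicate<String> methodMatcher;
        final String adviceClassName;

        MethodRule(Predicate<String> methodMatcher, String adviceClassName) {
            this.methodMatcher = methodMatcher;
            this.adviceClassName = adviceClassName;
        }
    }

    // Middle level: which types to patch, together with their method rules.
    static class TypeRule {
        final Predicate<String> typeMatcher;
        final List<MethodRule> methodRules;

        TypeRule(Predicate<String> typeMatcher, List<MethodRule> methodRules) {
            this.typeMatcher = typeMatcher;
            this.methodRules = methodRules;
        }
    }

    // Outer level: a named group of type rules, for example one per framework.
    static class Module {
        final String name;
        final List<TypeRule> typeRules;

        Module(String name, List<TypeRule> typeRules) {
            this.name = name;
            this.typeRules = typeRules;
        }
    }

    // Walks module -> type -> method and returns the advice classes that apply.
    static List<String> advicesFor(List<Module> modules, String typeName, String methodName) {
        List<String> advices = new ArrayList<>();
        for (Module module : modules) {
            for (TypeRule type : module.typeRules) {
                if (!type.typeMatcher.test(typeName)) {
                    continue;
                }
                for (MethodRule method : type.methodRules) {
                    if (method.methodMatcher.test(methodName)) {
                        advices.add(method.adviceClassName);
                    }
                }
            }
        }
        return advices;
    }
}
```

Supporting a new framework then means adding one more module definition, without touching the code that walks the structure.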

Step 5

After completing the injection of all classes and the initialization process, AREX starts running.

AREX Recording and Replay

Recording & Replay Overview

AREX's recording feature goes beyond capturing individual request messages. It aims to save not only the requests and response messages but also the internal invocations and their corresponding requests and responses. The main objective is to establish associations between requests, responses, and internal calls and store them together. AREX employs a tracing technique similar to OpenTelemetry to achieve end-to-end tracing and save the associated tracing IDs.

Recording

Recording in AREX is divided into two parts: entry recording and internal invocation recording. In the entry recording, the requests do not have a tracing ID initially, so a unique tracing ID is generated and recorded. The entry recording captures the requests along with the generated tracing ID. In the internal invocation recording, the tracing ID and the requests and responses of internal invocations are saved.

It is also important to record the response messages of entry requests. This includes the response of the entry call along with the associated tracing ID, which is referred to as the AREX Record ID in subsequent sections.

Replay

During the playback process, the entry requests contain AREX-Replay-ID and Record ID in their messages. The response corresponding to the Record ID is retrieved from the database and returned to the caller. At the same time, the Replay ID is associated with the recorded data and saved to the database to track the playback process.

In the case of internal invocations, if the system detects that it is in the playback state, it retrieves the data from the database based on the Record ID and returns it as a simulated response. The internal invocation requests are recorded, associated with the Replay ID, and saved to the database.

Using the Replay ID, the response messages of the entry calls and the request messages of the internal invocations are retrieved, allowing for a comparison of differences between the recording scenario and the playback scenario.

Finally, the differences are output as the result, and the playback process concludes.
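
The relationship between the Record ID and the Replay ID can be sketched with in-memory maps standing in for the database. RecordReplaySketch and its methods are hypothetical names; the sketch covers only an internal invocation, where recording performs the real call and replay returns the stored response instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// In-memory illustration; AREX stores this data in a database.
class RecordReplaySketch {

    // recordId -> (call key -> recorded response)
    private final Map<String, Map<String, String>> recorded = new HashMap<>();
    // replayId -> requests observed during replay, kept for later comparison
    private final Map<String, List<String>> replayed = new HashMap<>();

    // Recording: perform the real call and store its response under the record ID.
    String record(String recordId, String callKey, Supplier<String> realCall) {
        String response = realCall.get();
        recorded.computeIfAbsent(recordId, id -> new HashMap<>()).put(callKey, response);
        return response;
    }

    // Replay: skip the real call, note the request under the replay ID,
    // and return the response stored for the record ID (null if none exists).
    String replay(String recordId, String replayId, String callKey) {
        replayed.computeIfAbsent(replayId, id -> new ArrayList<>()).add(callKey);
        Map<String, String> calls = recorded.get(recordId);
        return calls == null ? null : calls.get(callKey);
    }

    // The requests captured during one replay run, used for the final comparison.
    List<String> requestsSeenDuring(String replayId) {
        return replayed.getOrDefault(replayId, new ArrayList<>());
    }
}
```

Comparing requestsSeenDuring(replayId) with the calls stored under the record ID is the in-memory counterpart of the difference check described above.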

Entry Recording and Replay for AREX Servlet

Code location directory: arex-agent-java\arex-instrumentation\servlet

Three elements of AREX injection code

ModuleInstrumentation: FilterModuleInstrumentationV3
TypeInstrumentation: FilterInstrumentationV3
MethodInstrumentation:

    @Override
    public List<MethodInstrumentation> methodAdvices() {
        // Match doFilter(ServletRequest, ServletResponse, ...) by name and argument types.
        ElementMatcher<MethodDescription> matcher = named("doFilter")
                .and(takesArgument(0, named("javax.servlet.ServletRequest")))
                .and(takesArgument(1, named("javax.servlet.ServletResponse")));

        // Attach FilterAdvice to every method that satisfies the matcher.
        return Collections.singletonList(new MethodInstrumentation(matcher, FilterAdvice.class.getName()));
    }

Steps to Record and Replay

1. Modify the doFilter(request, response) method of the javax.servlet.Filter class.
2. Perform modifications at the entry point (OnMethodEnter) and retrieve two parameters: the request at position 0 and the response at position 1.
   a. Invoke ServletAdviceHelper.onServiceEnter(), passing the request and response.
   b. Invoke CaseEventDispatcher.onEvent(CaseEvent.ofEnterEvent()), which includes calling TimeCache.remove(), TraceContextManager.remove(), and ContextManager.overdueCleanUp().
   c. Invoke CaseEventDispatcher.onEvent(CaseEvent.ofCreateEvent()), which includes calling initContext(source) and initClock().
      - The initContext() function sets the ArexContext, generating a TraceID at the entry point. The parameter createIfAbsent in ContextManager.currentContext(true, source.getCaseId()) is set to true, which calls TRACE_CONTEXT.set(messageId).
      - The initClock() function checks if the system is in playback state. If it is, it parses the time and calls TimeCache.put(millis). If the system is in recording state (that is, ContextManager.needRecord() reports that the ArexContext is not empty and the system is not in playback state), it calls RecordMocker.
3. Perform modifications at the exit point (OnMethodExit) and invoke ServletAdviceHelper.onServiceExit().
   a. Invoke the new ServletExtractor<>(adapter, httpServletRequest, httpServletResponse).execute() function.
   b. Then, call doExecute() to build the Mocker object and set request headers, body, and attributes. Also, set the response object, body, and type for the Mocker object.
   c. If the system is currently in playback state, replay the Mocker data. If it is in recording state, save the Mocker data.
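
The context handling at the entry and exit points can be sketched with a ThreadLocal. TraceContextSketch, Context, and the returned marker strings are hypothetical; the sketch assumes that an incoming record ID means the request is a replay, and that a request without one gets a freshly generated ID and is recorded.

```java
import java.util.UUID;

// Hypothetical illustration of per-request context handling.
class TraceContextSketch {

    // Holds the ID of the current request and whether it is being replayed.
    static class Context {
        final String id;
        final boolean replay;

        Context(String id, boolean replay) {
            this.id = id;
            this.replay = replay;
        }
    }

    private static final ThreadLocal<Context> TRACE_CONTEXT = new ThreadLocal<>();

    // Entry point: clear whatever a previous request left on this pooled thread,
    // then create the context for the new request.
    static void onEnter(String incomingRecordId) {
        TRACE_CONTEXT.remove();
        boolean replay = incomingRecordId != null && !incomingRecordId.isEmpty();
        String id = replay ? incomingRecordId : UUID.randomUUID().toString();
        TRACE_CONTEXT.set(new Context(id, replay));
    }

    static Context current() {
        return TRACE_CONTEXT.get();
    }

    // Exit point: decide between saving the captured data and replaying it.
    static String onExit() {
        Context context = TRACE_CONTEXT.get();
        if (context == null) {
            return "skipped";
        }
        return (context.replay ? "replayed:" : "recorded:") + context.id;
    }
}
```

Clearing the ThreadLocal on entry matters because servlet containers reuse threads, so a context left behind by one request could otherwise leak into the next.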

Other entry points are recorded and replayed on the same principle:

For Dubbo, the logic is implemented in the onServiceEnter() method of the DubboProviderExtractor class.
For Netty, the injection targets the functions whose names start with add, and the replace function, of the io.netty.channel.DefaultChannelPipeline class.

Recording and Replay of AREX internal calls

Code location directory: arex-agent-java\arex-instrumentation\netty\arex-netty-v4

Three elements of AREX injection code

ModuleInstrumentation: NettyModuleInstrumentation
TypeInstrumentation: ChannelPipelineInstrumentation
MethodInstrumentation:

    @Override
    public List<MethodInstrumentation> methodAdvices() {
        // Match methods named add* or replace whose argument 1 is a String
        // and whose argument 2 is a ChannelHandler, and attach AddHandlerAdvice.
        return singletonList(new MethodInstrumentation(
                isMethod().and(nameStartsWith("add").or(named("replace")))
                        .and(takesArgument(1, String.class))
                        .and(takesArgument(2, named("io.netty.channel.ChannelHandler"))),
                AddHandlerAdvice.class.getName()));
    }

Steps to Record and Replay

In Java Netty, the ChannelPipeline is an event processing mechanism used to handle inbound and outbound events. It is one of the core components of Netty and is responsible for managing the processing flow of ChannelHandlers. When an event is triggered, it is passed to the ChannelPipeline, and each ChannelHandler in the pipeline processes it in sequence. Each ChannelHandler can handle the event or forward it to the next ChannelHandler. The addAfter method is used to add a new ChannelHandler to the ChannelPipeline and insert it after a specified ChannelHandler. This method allows dynamic modification of the processing flow in the ChannelPipeline, enabling the addition or removal of handlers as needed at runtime.

When the functions whose names start with add, or the replace function, of the io.netty.channel.DefaultChannelPipeline class are modified, the OnMethodExit function can obtain the current ChannelPipeline object together with the parameters handlerName (parameter 1) and handler (parameter 2).

We can perform the following checks and processing:

If the handler is an instance of HttpRequestDecoder, call RequestTracingHandler() to handle playback data.
If the handler is an instance of HttpResponseEncoder, call ResponseTracingHandler() to handle recorded data.
If the handler is an instance of HttpServerCodec, call ServerCodecTracingHandler() for processing. HttpServerCodec is a ChannelHandler in Java Netty used for encoding and decoding HTTP requests and responses into HTTP messages. It implements the encoding and decoding of the HTTP protocol, converting HTTP requests and responses into byte streams for transmission over the network.
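
The type checks above can be illustrated with stand-in classes, since Netty itself is not assumed to be on the class path here. Everything in the sketch is hypothetical: PipelineSketch imitates a pipeline as an ordered list, the nested handler classes imitate Netty's types, and placing the tracing handler directly after the handler that was just added is an assumption about how the exit advice uses addAfter.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in model of a handler pipeline; the real types live in io.netty.
class PipelineSketch {

    interface Handler {}
    static class RequestDecoder implements Handler {}
    static class ResponseEncoder implements Handler {}
    static class RequestTracing implements Handler {}
    static class ResponseTracing implements Handler {}

    private final List<String> names = new ArrayList<>();
    private final List<Handler> handlers = new ArrayList<>();

    // Plays the role of an instrumented add* method: append, then run the exit hook.
    void addLast(String name, Handler handler) {
        names.add(name);
        handlers.add(handler);
        onHandlerAdded(name, handler);
    }

    // Inserts a handler directly after the handler called baseName.
    void addAfter(String baseName, String name, Handler handler) {
        int index = names.indexOf(baseName);
        if (index < 0) {
            throw new IllegalArgumentException("no handler named " + baseName);
        }
        names.add(index + 1, name);
        handlers.add(index + 1, handler);
    }

    // Plays the role of the exit advice: inspect what was added and attach tracing.
    private void onHandlerAdded(String name, Handler handler) {
        if (handler instanceof RequestDecoder) {
            addAfter(name, "tracing-request", new RequestTracing());
        } else if (handler instanceof ResponseEncoder) {
            addAfter(name, "tracing-response", new ResponseTracing());
        }
    }

    // Handler names in pipeline order.
    List<String> names() {
        return new ArrayList<>(names);
    }
}
```

Adding a decoder, an encoder, and an application handler in that order yields a pipeline in which each codec is immediately followed by its tracing handler.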

Handling of asynchronous access

In the Java ecosystem, there are various asynchronous frameworks and libraries available, such as Reactor, RxJava, etc. Additionally, some libraries provide implementations for asynchronous access. For example, Lettuce offers both synchronous and asynchronous access to Redis. Different scenarios often require different solutions.

Taking ApacheAsyncClient as an example, it achieves asynchronous processing by listening for responses and initiating callbacks (Callback) in dedicated running threads. Throughout the entire process of invoking, listening, and callback execution, it is important to ensure the propagation of traces across multiple threads.

In the injection code, the Trace needs to be propagated using TraceTransmitter from the FutureCallbackWrapper. The specific injection points are as follows:

ModuleInstrumentation: SyncClientModuleInstrumentation
TypeInstrumentation: InternalHttpAsyncClientInstrumentation (for asynchronous cases), InternalHttpClientInstrumentation
MethodInstrumentation: Inject into the execute function of the org.apache.http.impl.nio.client.InternalHttpAsyncClient class, identified using the named("execute") method.

Steps to Record and Replay

In the injection code, we target the execute function of the org.apache.http.impl.nio.client.InternalHttpAsyncClient class and use the named("execute") method to identify it.

First, we retrieve the FutureCallback parameter of the execute function and assign it to the callback field of the FutureCallbackWrapper, a wrapper class implemented by AREX. The FutureCallback interface (org.apache.http.concurrent.FutureCallback) defines three methods: completed, failed, and cancelled. The completed method is called when the asynchronous operation finishes successfully and receives its result as a parameter. The failed method is called when the asynchronous operation fails and receives the exception as a parameter. The cancelled method is called when the operation is cancelled.

Next, we perform the following checks:

If recording is required, the FutureCallbackWrapper overrides the completed(T) function: it saves the response data and then calls the completed method of the original FutureCallback. The wrapper overrides the failed() function in the same way: it records the response data and then calls the failed method of the original FutureCallback.
If replay is required, we retrieve the replay data and store it in the local mockResult variable.

Finally, at the exit point of the injection function, if the mockResult variable is not empty and the callback is an instance of the AREX wrapper class, we call the replay function of the wrapper class to perform the replay operation.

Through these operations, we handle the propagation of traces across threads at the entry and exit points of the execute function, including the implementation of recording and replay functionality.
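
The wrapper idea can be sketched without the Apache client on the class path. In the block below, Callback is a stand-in with the same three methods as org.apache.http.concurrent.FutureCallback, TRACE is a hypothetical ThreadLocal holding the trace ID, and TracingCallback plays the role of FutureCallbackWrapper: it captures the trace on the calling thread and restores it on whichever thread later runs the callback.

```java
import java.util.concurrent.atomic.AtomicReference;

// Stand-in types; not the AREX or Apache classes themselves.
class CallbackWrapperSketch {

    // Same shape as org.apache.http.concurrent.FutureCallback<T>.
    interface Callback<T> {
        void completed(T result);
        void failed(Exception ex);
        void cancelled();
    }

    static final ThreadLocal<String> TRACE = new ThreadLocal<>();

    static class TracingCallback<T> implements Callback<T> {
        private final Callback<T> delegate;
        // Captured when the wrapper is created, that is, on the calling thread.
        private final String capturedTrace = TRACE.get();
        // Stands in for saving the response (or the failure) as recorded data.
        final AtomicReference<Object> recorded = new AtomicReference<>();

        TracingCallback(Callback<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public void completed(T result) {
            runWithTrace(() -> {
                recorded.set(result);
                delegate.completed(result);
            });
        }

        @Override
        public void failed(Exception ex) {
            runWithTrace(() -> {
                recorded.set(ex);
                delegate.failed(ex);
            });
        }

        @Override
        public void cancelled() {
            runWithTrace(delegate::cancelled);
        }

        // Installs the captured trace for the duration of the callback, then restores.
        private void runWithTrace(Runnable action) {
            String previous = TRACE.get();
            TRACE.set(capturedTrace);
            try {
                action.run();
            } finally {
                if (previous == null) {
                    TRACE.remove();
                } else {
                    TRACE.set(previous);
                }
            }
        }
    }
}
```

Restoring the previous value in the finally block keeps the callback thread, which usually belongs to a shared pool, from carrying one request's trace into unrelated work.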

AREX recording frequency setting

The recording frequency of AREX is enforced in the onServiceEnter function of the ServletAdviceHelper class, which is called when a request enters the servlet.

CaseEventDispatcher.onEvent(CaseEvent.ofEnterEvent());
if (shouldSkip(adapter, httpServletRequest)) {
    return null;
}

First, the recording decision is determined based on the request headers and configuration:

If the request headers contain the caseID field and the configuration item arex.disable.replay is set to true, recording is skipped.
If the request headers contain the arex-force-record field and its value is true, recording cannot be skipped.
If the request headers contain the arex-replay-warm-up field and its value is true, recording is skipped.

Next, the request message is parsed:

If the request URL is empty, recording is skipped.
If the request URL is in the recording ignore list defined in the configuration, recording is skipped.

Then, the invalidRecord method of the Config class is called to check the validity of the recording:

If the configuration is in debug mode, recording cannot be skipped and false is returned.
If the recording rate in the configuration is less than 0, recording is skipped.

Finally, the decision to skip recording is made based on the request path and recording rate. The acquire function of the com.google.common.util.concurrent.RateLimiter class is used for this purpose. RateLimiter is a class in the Google Guava library used to limit the rate of operations. It can be used to control how many times an operation can be executed within a certain period of time. To use the RateLimiter class, a RateLimiter object is created and the rate limit is specified. Then, the acquire() method is used to acquire a permit, indicating that an operation can be performed.

If the current rate limit has been reached, the acquire function will block until a permit can be obtained.
If a permit can be obtained, recording is not skipped.
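
The checks described in this section can be condensed into one decision function. The sketch follows the order given above and is not AREX's code: RecordDecisionSketch, Request, and Config are hypothetical types, and the rate limiter is passed in as a BooleanSupplier so that the block needs no Guava dependency. With Guava, that supplier could be backed by a RateLimiter, for example through its non-blocking tryAcquire() method, which is an assumption made here for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical condensation of the skip decision.
class RecordDecisionSketch {

    // What was read from the incoming request.
    static class Request {
        boolean hasCaseId;    // the caseID header is present
        boolean forceRecord;  // arex-force-record is true
        boolean warmUp;       // arex-replay-warm-up is true
        String url;
    }

    // The relevant configuration values.
    static class Config {
        boolean disableReplay; // arex.disable.replay
        boolean debug;
        int recordRate;
        Set<String> ignoredUrls = new HashSet<>();
    }

    // Returns true when the request should not be recorded.
    static boolean shouldSkip(Request request, Config config, BooleanSupplier permit) {
        // Header checks
        if (request.hasCaseId && config.disableReplay) return true;
        if (request.forceRecord) return false;
        if (request.warmUp) return true;
        // Request checks
        if (request.url == null || request.url.isEmpty()) return true;
        if (config.ignoredUrls.contains(request.url)) return true;
        // Configuration checks
        if (config.debug) return false;
        if (config.recordRate < 0) return true;
        // Rate limit: record only when a permit is available
        return !permit.getAsBoolean();
    }
}
```

Keeping the cheap header and configuration checks ahead of the rate limiter means that a permit is consumed only for requests that could actually be recorded.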

AREX Code Isolation

In the Java Virtual Machine (JVM), when comparing two classes for equality, not only their fully qualified names are compared but also their class loaders. If two classes have the same fully qualified name but are loaded by different class loaders, the JVM considers them as different classes.

This design helps ensure the security and isolation of the Java Virtual Machine. Different class loaders can load the same class, but the classes they load are independent and not visible to each other. This avoids class conflicts and interference between different applications or modules.
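
The rule that a class is identified by its name together with its class loader can be observed directly. The sketch below is a self-contained demonstration and has nothing AREX-specific in it: LoaderIdentitySketch and DefiningLoader are hypothetical names, and the class file bytes of an empty class are written by hand so that no compiled class file is needed on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Demonstration only; not part of AREX.
class LoaderIdentitySketch {

    // Builds the class file of an empty public class with the given name (no package).
    static byte[] emptyClassBytes(String name) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);   // magic number
            out.writeShort(0);          // minor version
            out.writeShort(52);         // major version 52 (Java 8)
            out.writeShort(5);          // constant pool count: entries #1 to #4, plus one
            out.writeByte(7); out.writeShort(2);                // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(name);               // #2 Utf8, this class
            out.writeByte(7); out.writeShort(4);                // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8, super class
            out.writeShort(0x0021);     // access flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);          // this_class is #1
            out.writeShort(3);          // super_class is #3
            out.writeShort(0);          // no interfaces
            out.writeShort(0);          // no fields
            out.writeShort(0);          // no methods
            out.writeShort(0);          // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // A loader whose only parent is the bootstrap loader and that defines classes from bytes.
    static class DefiningLoader extends ClassLoader {
        DefiningLoader() {
            super(null);
        }

        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Defines the same bytes in two loaders and reports whether the JVM sees one class.
    static boolean sameClassAcrossLoaders(String name) {
        byte[] bytes = emptyClassBytes(name);
        Class<?> first = new DefiningLoader().define(name, bytes);
        Class<?> second = new DefiningLoader().define(name, bytes);
        return first == second;
    }
}
```

Both Class objects report the same name, yet sameClassAcrossLoaders returns false, which is the property AREX relies on to keep agent classes apart from application classes.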

In AREX, the following class loaders are involved:

arex-agent: Loaded by the AppClassLoader, it is responsible for loading the core components of the AREX Agent.
arex-agent-bootstrap: Loaded by the Bootstrap ClassLoader, it is responsible for loading the bootstrap classes of the AREX Agent.
arex-agent-core: Loaded by the AgentClassLoader, which is a custom ClassLoader in AREX, responsible for loading arex-agent-core and other related JAR files.
arex-instrumentation: Loaded by the UserClassLoader, it is responsible for loading AREX's Instrumentation, Modules, Advices, and other components.
- XXX Instrumentation & Module & Advice: Loaded by the AgentClassLoader, it is responsible for loading specific implementations of Instrumentation, Modules, Advices, and other components.
arex-instrumentation-api: Loaded by the AgentClassLoader, it includes both the API and Runtime parts.
- api: Loaded by the AgentClassLoader, it provides APIs for users to interact with.
- runtime: Loaded by the AppClassLoader, it provides runtime functionalities for AREX.
arex-instrumentation-foundation: Loaded by the AgentClassLoader, it is responsible for loading the foundational functionalities of AREX, such as backend implementations.

These class loaders are isolated from one another, which keeps each component independent and secure.

In the context of AREX:

AgentClassLoader: It is a custom ClassLoader in AREX. It is responsible for loading classes specific to AREX's agent.
Bootstrap ClassLoader: The Java Instrumentation API is a powerful tool introduced in Java SE 5, allowing runtime modification of Java class behavior. The Instrumentation class is one of the core classes in the Java Instrumentation API, providing methods to monitor and modify the runtime behavior of Java applications.
The appendToBootstrapClassLoaderSearch method is a method in the Instrumentation class. Its purpose is to add the specified JAR file to the search path of the Bootstrap ClassLoader.
The Bootstrap ClassLoader is a special class loader in the Java Virtual Machine responsible for loading the core libraries of the Java runtime environment, such as java.lang and java.util.
By calling the appendToBootstrapClassLoaderSearch method, custom libraries can be added to the search path of the Bootstrap ClassLoader, allowing Java applications to use these custom libraries. Note that this method changes the runtime state of the Java Virtual Machine and can be called only by code that holds the Instrumentation instance, which the JVM passes to the agent's premain or agentmain method.
AppClassLoader: It is the default ClassLoader for Java applications. It is responsible for loading classes of the application. The AppClassLoader searches for class files in the paths specified by the CLASSPATH environment variable or the java.class.path system property.
Following the parent delegation model, the AppClassLoader first delegates a load request to its parent ClassLoader, and so on up to the Bootstrap ClassLoader. It searches its own path only if none of its parents can load the class.
UserClassLoader: It is a user-defined ClassLoader. In the SPIUtil class, the Load method uses the following code to obtain the ClassLoader:
```
ClassLoader cl = Thread.currentThread().getContextClassLoader();
```
This code retrieves the current thread's context ClassLoader, which can be set by the application or framework to load classes from a specific source or location.
