Add a cache to LaunchedURLClassloader to improve startup performance #30016
This commit adds a ClassLoaderCache to LaunchedURLClassLoader to improve startup performance. URLClassLoader shows awful performance when finding classes/resources which do not exist.
This looks very promising. Thank you, @zjulbj. Have you done any comparisons of heap usage with and without the cache?
Hi wilkinsona. Thanks for your reply. Regarding heap usage, there are two ways to control it:
    protected <K, V> Map<K, V> createCache(int maxSize) {
        // Access-ordered map, so the eldest entry is the least recently used one.
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(maxSize, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict only once the cache holds more than maxSize entries.
                return size() > maxSize;
            }
        });
    }
    /**
     * Clear URL caches.
     */
    public void clearCache() {
        this.loaderCache.clearCache();
        if (this.exploded) {
            return;
        }
        for (URL url : getURLs()) {
            try {
                URLConnection connection = url.openConnection();
                if (connection instanceof JarURLConnection) {
                    clearCache(connection);
                }
            }
            catch (IOException ex) {
                // Ignore
            }
        }
    }
Here is a comparison of memory usage, and I think it's acceptable:
[Figure: memory usage with cache]
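As a rough illustration of the first control, here is a minimal standalone sketch of a size-bounded, access-ordered cache. The class name `LruCacheSketch` and the cached keys are hypothetical and not taken from the PR; the eviction check is written as `size() > maxSize` so that exactly `maxSize` entries are retained.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; not part of the PR.
final class LruCacheSketch {

    // Access-ordered LinkedHashMap that drops the least recently used entry
    // once more than maxSize entries are present.
    static <K, V> Map<K, V> createCache(int maxSize) {
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        });
    }

    public static void main(String[] args) {
        Map<String, Boolean> cache = createCache(2);
        cache.put("a.Missing", Boolean.FALSE);
        cache.put("b.Missing", Boolean.FALSE);
        cache.get("a.Missing");                // touch: a.Missing becomes most recently used
        cache.put("c.Missing", Boolean.FALSE); // third entry: b.Missing is evicted
        System.out.println(cache.keySet());    // prints [a.Missing, c.Missing]
    }
}
```

The heap cost is then bounded by `maxSize` entries per cache, at the price of re-doing a slow lookup for any entry that has been evicted.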
Thanks for the heap comparisons! I'm going to bring this up on our next team call.
Hey, thanks for your contribution and your work. We want to run more benchmarks on it and want to explore other paths (like indexing the classpath etc.) and don't want to commit yet on the cache solution. I'll put it on hold until we have more information / ideas.
Many thanks for your reply. If I understand correctly, you want to explore other paths to solve this issue from a broader perspective? That is also what I am going to do next. My rough idea is to generate an index file at compile time that maps classes/resources/packages to jar URLs, a little bit like INDEX.LIST, which I believe can improve the lookup time complexity to O(1). I am not sure whether this is what 'indexing the classpath' means, or whether I am on the right path.
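The index idea in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `ClasspathIndexSketch` and its method names are hypothetical, and the index is built in memory by scanning jars rather than generated as a file at compile time as the comment proposes.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch of a classpath index; names are illustrative and do not
// come from the PR or from Spring Boot.
final class ClasspathIndexSketch {

    // Entry name (for example "com/example/Foo.class") to the jars that contain it.
    private final Map<String, List<Path>> jarsByEntry = new HashMap<>();

    // Records that the given jar contains the given entry.
    void add(Path jar, String entryName) {
        jarsByEntry.computeIfAbsent(entryName, key -> new ArrayList<>()).add(jar);
    }

    // Scans a jar once and indexes every non-directory entry.
    void addJar(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    add(jar, entry.getName());
                }
            }
        }
    }

    // One hash lookup instead of a walk over the classpath;
    // an empty list means the entry is in none of the indexed jars.
    List<Path> jarsContaining(String entryName) {
        return jarsByEntry.getOrDefault(entryName, Collections.emptyList());
    }

    // Class names map to entry names by replacing dots and appending ".class".
    List<Path> jarsContainingClass(String className) {
        return jarsContaining(className.replace('.', '/') + ".class");
    }
}
```

With such an index a lookup for a missing class returns an empty list immediately, which is the case that is slow with a linear classpath scan.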
Yes, that is what was meant by "indexing the classpath" :)
@mhalbritter Is there any progress on "indexing the classpath"? Will it be implemented in 2.7.x, or only supported in 3.x?
This will be 3.0.x at the earliest, but more likely later in the 3.x cycle. I have updated the milestone accordingly.
@wilkinsona @mhalbritter
Not at the moment, unfortunately. We have other, higher-priority work at the moment. Perhaps you could build the changes in this PR and share some before and after numbers for your apps?
Those interested in this PR, it would be really useful to know if you've tried running your app using class data sharing (CDS), particularly with Java 21. We've seen some very promising results with Spring Boot apps and would like to know if you have a similar experience. You can learn more in this blog post from @sdeleuze that includes some information about a script that unpacks a Spring Boot executable jar into a format that's as CDS-friendly as possible.
I wrote about my experience with CAS here. I did not try the script that does the unpacking, but in a nutshell, what I saw (with a WAR file) was:
    jar -xf build/libs/cas.war
    cd build/libs
    java org.springframework.boot.loader.launch.JarLauncher
    ...
    ...
    ...
    <Started CasWebApplication in 7.693 seconds (process running for 9.285)>
Then:
    jar -xf build/libs/cas.war
    cd build/libs
    java -cp "WEB-INF/classes:WEB-INF/lib/*" org.apereo.cas.web.CasWebApplication
    ...
    ...
    ...
    <Started CasWebApplication in 6.873 seconds (process running for 7.469)>
and then with CDS:
    java -XX:+AutoCreateSharedArchive -XX:SharedArchiveFile=cas.jsa \
        org.springframework.boot.loader.launch.JarLauncher
    ...
    ...
    ...
    <Started CasWebApplication in 6.766 seconds (process running for 7.582)>
Not that much difference, but I can see what I might have missed with the script.
Please check the guidelines printed when using the
It is expected that you get no significant gain when using the
Thanks! I made small modifications to the script so it can run against a WAR file (replaced
    unpack-executable-jar.sh -d tmp build/libs/cas.war
Then:
    $ java -jar tmp/run-app.jar
    .
    .
    - <Started CasWebApplication in 6.637 seconds (process running for 8.178)>
Then:
    java -XX:ArchiveClassesAtExit=application.jsa -Dspring.context.exit=onRefresh -jar tmp/run-app.jar
    java -XX:SharedArchiveFile=application.jsa -jar tmp/run-app.jar
    .
    .
    - <Started CasWebApplication in 6.284 seconds (process running for 6.693)>
I had a deeper look; this CAS application seems to perform a lot of processing at startup. To put things in perspective, on my powerful MacBook Pro M2, the CAS application takes 5.285 s to start (after disabling key generation at startup) while Petclinic JDBC takes 1.448 s to start. CDS improves the loading of classes; it can't optimize custom application processing at startup.
If we analyze the data points in more detail (the unpacked variant is done with the unpack-executable-jar.sh script, which replicates the #38276 behavior):
JARexe: Started CasWebApplication in 7.854 seconds (process running for 8.709)
JARexe: Started PetClinicApplication in 1.586 seconds (process running for 1.842)
So CDS seems to optimize the loading of classes pretty consistently on the 2 samples (0.573 s reduction versus 0.47 s). This half a second on a powerful machine turns out to be multiple seconds when deployed on a cheap cloud server.
I think the layout used by the script and proposed by #38276 is immune, even without CDS, to the performance issue discussed in this PR, since it does not use
AOT could be used to optimize the Spring Boot configuration processing of this application even more, but I was not able to make it work with CAS due to some double bean registration (feel free to raise a bug on https://github.com/spring-projects/spring-framework if you think there is something to fix on the Spring Framework side).
Cache Data Store (Project Leyden premain) will be able to optimize with a wider scope than just class loading (optionally combined with AOT). It will, for example, include a JVM AOT cache and maybe some classpath check caches. For the rest, it will be up to the application itself to make careful use of the processing steps happening before startup.
Thanks very much, @sdeleuze.
Thank you very much @sdeleuze. (At the moment AOT processing of the CAS application is blocked due to a bug in Spring Data. I will revisit this once the fix is out, but even at the time I was able to build native images, startup time of the native image was still around 2~3 seconds) I hope this isn't hijacking the thread here, but does the team have any recommendations on how to analyze the startup data and performance beyond the likes of VisualVM, etc? One suspicion I have (at least in the case of this CAS app) is that given it's a Spring Cloud app and uses
@mmoayyed You can probably use FlightRecorderApplicationStartup.
Sure thing, thank you very much!
Thanks for your pull request. With this cache, our app's start time has decreased by 50%.
The AppCDS technology relies on generating an archive file every time the JVM exits. AppCDS for custom class loaders has a more complex loading process: the JVM uniquely identifies a class using a (classloader, className) tuple.
To conclude, CDS is not the key to solving the startup time problem.
@qsLI CDS indeed does not apply to custom class loaders, but I am wondering whether you are taking the perspective of running your Spring Boot application directly from the executable JAR in production.
It is kind of addressed via a more integrated feature that automatically performs the training run without fully starting your application; see https://docs.spring.io/spring-boot/how-to/class-data-sharing.html for more details. There are still side effects like early database interaction, but that could evolve and we have documented how to avoid that easily.
Could you please confirm this is compared to the startup time of an executable JAR? Can you measure against an application extracted as described in https://docs.spring.io/spring-boot/reference/packaging/efficient.html?
@sdeleuze Due to historical reasons, we have been using the Spring Boot executable JAR mode for deployment, so most class loaders are LaunchedURLClassLoader, which makes it impossible to use the CDS feature. Switching to the extracted JAR mode should resolve this issue. Additionally, our project is filled with various lazy implementations, which may require actual traffic to trigger the corresponding class loading. Generating the CDS file during the compile phase might not fully resolve the issues we are encountering. The 50% performance improvement data is indeed based on the executable JAR. We will try the extracted mode when we have the time.
Motivation
URLClassLoader shows awful performance when loading classes/resources which don't exist. URLClassPath takes linear time to find a resource, and it will traverse the whole classpath even if the resource doesn't exist. This will obviously slow down startup, which is even worse when running from an executable fat jar or if the application's class path is a long list.
Scenarios which may lead to this issue are as follows.
1. Repeatedly find a class which apparently doesn't exist
Related issue: spring-projects/spring-framework#13653
Here are the top 10 calls of loadClass in the startup of my application
Here are the top 4 calls of loadClass in the startup of my application
For example, org.springframework.http.converter.json.Jackson2ObjectMapperBuilder
2. Repeatedly find a resource which doesn't exist
Here are the top 4 calls of findResource/findResources in the startup of my application
3. Repeatedly find the same resource
For example
4. Reflection repeatedly triggers sun.reflect.GeneratedMethodAccessor<N> class loading at runtime
All the cases listed above happen at startup time, but there are also cases at runtime, which make runtime performance worse. Here is a case which triggers sun.reflect.GeneratedMethodAccessor<N> class loading: https://blog.actorsfit.com/a?ID=00001-e5b50c75-c8a5-4cdb-9b0c-5558fb985a60
We generally solve the issues mentioned above case by case, and the solutions are normally tricky. Is there a final solution that solves this fundamentally?
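The description does not say how the loadClass call counts were gathered. One possible way to collect similar numbers, shown here only as a sketch, is a delegating class loader that tallies every request by class name. The class `CountingClassLoader` and its methods are hypothetical and not part of the PR.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

// Hypothetical diagnostic loader; not part of the PR.
final class CountingClassLoader extends ClassLoader {

    private final Map<String, LongAdder> requests = new ConcurrentHashMap<>();

    CountingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Count the request before delegating, so failed lookups are counted too.
        requests.computeIfAbsent(name, key -> new LongAdder()).increment();
        return super.loadClass(name, resolve);
    }

    // Attempts a load and reports whether it succeeded.
    boolean tryLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException ex) {
            return false;
        }
    }

    // Names with the most loadClass requests, most frequent first.
    List<String> top(int n) {
        return requests.entrySet().stream()
                .sorted(Comparator.comparingLong(
                        (Map.Entry<String, LongAdder> e) -> e.getValue().sum()).reversed())
                .limit(n)
                .map(e -> e.getKey() + " x" + e.getValue().sum())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        CountingClassLoader loader = new CountingClassLoader(ClassLoader.getSystemClassLoader());
        for (int i = 0; i < 3; i++) {
            loader.tryLoad("com.example.DoesNotExist");
        }
        System.out.println(loader.top(1)); // prints [com.example.DoesNotExist x3]
    }
}
```

Names that appear near the top with a high count and never load successfully are the repeated misses described in scenario 1.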
Approach
This commit tries to fundamentally solve this issue by adding a ClassLoaderCache to LaunchedURLClassLoader.
- Cache ClassNotFoundException and fast throw it the next time
- Cache the getResource result directly
- Cache the getResources NOT-EXIST result
- Set loader.cache.enable=true to enable the cache
Benefit
Here is the startup performance of our applications after this enhancement, which shows a 50% acceleration on average
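The negative-lookup caching described under Approach can be sketched as below. This is a simplified illustration under stated assumptions, not the PR's ClassLoaderCache: the class `NegativeCachingClassLoader`, the `fullLookups` counter, and `tryLoad` are hypothetical names, and only class lookups are covered. Resource lookups would follow the same pattern.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the ClassLoaderCache in the PR may work differently.
class NegativeCachingClassLoader extends URLClassLoader {

    // Bounded LRU map of class names that are known to be missing.
    private final Map<String, ClassNotFoundException> misses;

    // Counts lookups that went through the full (slow) delegation path.
    final AtomicInteger fullLookups = new AtomicInteger();

    NegativeCachingClassLoader(URL[] urls, ClassLoader parent, int maxSize) {
        super(urls, parent);
        this.misses = Collections.synchronizedMap(
                new LinkedHashMap<String, ClassNotFoundException>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, ClassNotFoundException> eldest) {
                        return size() > maxSize;
                    }
                });
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        ClassNotFoundException cached = misses.get(name);
        if (cached != null) {
            throw cached; // fast path: skip the classpath walk entirely
        }
        fullLookups.incrementAndGet();
        try {
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException ex) {
            misses.put(name, ex); // remember the miss for next time
            throw ex;
        }
    }

    // Must be called if the classpath changes, or cached misses become stale.
    void clearCache() {
        misses.clear();
    }

    // Attempts a load and reports whether it succeeded.
    boolean tryLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException ex) {
            return false;
        }
    }
}
```

Two trade-offs are worth noting: a rethrown cached exception carries the stack trace of the first failed lookup, and a cached miss is only valid while the set of URLs stays unchanged, which is why a `clearCache()` hook is included.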