This is a working document
The Java application technology stack
Performance Issues in my own experience
Java application implementation
Issue 1: The Java Virtual Machine spends more time on garbage collection when more live objects exist and more temporary objects are created.
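As a minimal sketch of Issue 1, the two methods below build the same string. The first creates fresh temporary objects on every iteration, while the second reuses one buffer. The class and method names are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration of temporary-object churn (names are assumptions).
final class TempObjects {

    // Each iteration allocates a new String (plus intermediate objects),
    // all of which become garbage almost immediately.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    // One StringBuilder is reused for the whole loop, so far fewer
    // short-lived objects are handed to the garbage collector.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }
}
```

Both methods return the same result; the difference only shows up in allocation rate, which is why it tends to go unnoticed until GC time is measured.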
Issue 2: Application algorithms are often not optimal, which leads to performance bottlenecks.
Issue 2a: A method that performs well on its own is called too often, so its cumulative cost becomes a bottleneck.
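One common way to reduce the cost of a method that is called too often with the same arguments is to cache its results. The sketch below is only an illustration under that assumption: the class name, the `square` method, and the `computations` counter are hypothetical, and the cache is neither bounded nor thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical memoizing wrapper (not thread-safe, unbounded cache).
final class CallCountingCache {
    private final Map<Integer, Long> cache = new HashMap<>();

    // Counts how often the "real" computation actually ran.
    int computations = 0;

    long square(int x) {
        Long cached = cache.get(x);
        if (cached != null) {
            return cached;              // repeated call: no recomputation
        }
        computations++;
        long result = (long) x * x;     // stands in for an expensive computation
        cache.put(x, result);
        return result;
    }
}
```

An unbounded cache like this one trades CPU time for memory, so it can itself turn into the memory leak described in Issue 5 if the key space is large.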
Issue 3: When the Java Virtual Machine cannot allocate an object (because the heap is exhausted and the garbage collector cannot free any more memory), an OutOfMemoryError is thrown, which can crash the application or leave it running in an unstable state.
Issue 4: An application that uses a lot of memory reduces available physical memory for itself and other programs, and thus forces the operating system to swap memory pages to and from the disk more frequently. This leads to serious overall system performance degradation.
Issue 5: A memory leak is the existence of objects that are no longer needed according to the application logic but still retain memory and cannot be collected, because a bug in the application keeps them referenced from other live objects.
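The leak pattern described in Issue 5 can be sketched with a long-lived collection that holds on to everything registered with it. The `LeakyRegistry` class and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry: a static collection keeps every element reachable
// for the lifetime of the class, whether or not the application still needs it.
final class LeakyRegistry {
    private static final List<Object> LISTENERS = new ArrayList<>();

    static void register(Object listener) {
        LISTENERS.add(listener);
    }

    // Forgetting to call this is the bug: the listener stays referenced
    // from a live object and can never be garbage collected.
    static void unregister(Object listener) {
        LISTENERS.remove(listener);
    }

    static int size() {
        return LISTENERS.size();
    }
}
```

In a heap dump, such a leak typically shows up as one collection whose retained size keeps growing between snapshots.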
Issue 6: Lock contention
Issue 7: Deadlock
Issue 8: Livelock
Issue 9: Starvation
Issue 10: Race conditions
Issue 11: Too many threads (each thread reserves its own stack, sized by -Xss)
Issue 12: Oversynchronization
Issue 13: Useless synchronization
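As one small example from the concurrency issues above, a race condition on a shared counter (Issue 10) can be avoided without a synchronized block by using an atomic variable. This is a sketch under that assumption; `CounterDemo` and `countAtomically` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: several threads increment one shared counter.
final class CounterDemo {

    static int countAtomically(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Atomic read-modify-write; a plain int with counter++
                    // could lose updates when threads interleave.
                    counter.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }
}
```

With an atomic counter the final value is always threads * perThread. The atomic variable also avoids holding a lock for a single increment, which is one way to keep lock contention (Issue 6) down.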
Improper heap sizing
Issue 14: Heap size is too large or too small
Issue 15: Heap segmentation is not appropriate for the application
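Before changing heap settings it helps to see what the running JVM actually uses. The sketch below reads the current heap figures through the standard `java.lang.management` API; the `HeapReport` class and its methods are hypothetical helpers, not part of any library.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that reports the heap usage of the current JVM.
final class HeapReport {

    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    static String describe() {
        MemoryUsage usage = heap();
        long mb = 1024 * 1024;
        // getMax() returns -1 when the maximum is undefined.
        String max = usage.getMax() < 0 ? "undefined" : usage.getMax() / mb + "MB";
        return "used=" + usage.getUsed() / mb + "MB"
             + " committed=" + usage.getCommitted() / mb + "MB"
             + " max=" + max;
    }
}
```

Logging `HeapReport.describe()` periodically gives a rough picture of whether the configured maximum is far above or uncomfortably close to what the application needs.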
Improper GC configuration
Target: Have as few full GCs as possible!
Issue 16: The Young space is collected in parallel, but the Tenured space may not be. If a full collection occurs under load, it is a serial 'stop-the-world' event: all application threads are taken off the CPU while the garbage collector thread runs. This can have severe consequences if requests continue to accrue during these 'outage' periods.
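To check how often collections actually happen, the collectors of the running JVM can be queried through the standard `GarbageCollectorMXBean` interface. This is a minimal sketch; `GcStats` is a hypothetical helper, and the collector names it prints depend on the JVM and the configured garbage collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that summarizes GC activity of the current JVM.
final class GcStats {

    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              // Both values are -1 if the collector does not report them.
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Comparing two snapshots taken some minutes apart shows whether the old-generation collector is running at all during normal load, which is the quantity the target above asks to minimize.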
Improper JVM flags
Issue 17: The HotSpot server JVM has specific code-path optimizations which yield a gain of approximately 10% over the client version. Most installations should already select it by default, but it is still wise to force it with -server, especially on some Windows machines.
Issue 18: If explicit GC is allowed, a server application can get into severe trouble when code repeatedly triggers collections by calling System.gc() (which it shouldn't). Explicit GC can be disabled with the flag -XX:+DisableExplicitGC at startup.
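Whether a running JVM was started with -XX:+DisableExplicitGC can be checked from its input arguments. The sketch below assumes the flag is only ever set on the command line and that the last occurrence wins; `ExplicitGcCheck` and its methods are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: inspects JVM arguments for the DisableExplicitGC flag.
final class ExplicitGcCheck {

    static boolean isExplicitGcDisabled(List<String> jvmArgs) {
        boolean disabled = false; // assumed default: explicit GC is allowed
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+DisableExplicitGC")) {
                disabled = true;
            } else if (arg.equals("-XX:-DisableExplicitGC")) {
                disabled = false;
            }
        }
        return disabled;
    }

    // Applies the check to the arguments the current JVM was started with.
    static boolean isExplicitGcDisabledForThisJvm() {
        return isExplicitGcDisabled(
                ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

This only sees arguments passed to the launcher, so a flag set through another mechanism would not be detected by this sketch.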
Java container configuration (application servers)
Issue 19: Too small database connection pool
Issue 20: Wrong max. client thread configuration (too high/low)
Issue 21: Long running database queries
Issue 22: Bad third-party libraries
General Performance Links
Hotspot JVM Architecture
Hotspot Memory Management
Hotspot Tuning White Paper
Presentation: Debugging Java Performance Problems
Presentation: Java Performance Tuning
Jack Shirazi's performance tuning blog
Hotspot GC Tuning and the summary
IBM's Diagnostics Guide
The Top Java Memory Problems: Part 1
The Top Java Memory Problems: Part 2
Hotspot Trouble Shooting Guide
Sun's Java Performance Documentation:
Java Performance JVM Options Categorized
Building robust benchmarks
Copy arrays fast
A benchmarking framework
Java bytecode instruction listing
Dr. Garbage bytecode analyzer
Bytecode outline plugin
Print assembly code (Java 7)
X86 assembly language
X86 instruction listing
Measuring Object Sizes
Estimating Object Sizes with Instrumentation
Java Agent for Memory Measurements
Determining Memory Usage in Java
Garbage Collection in Hotspot JVM
Understanding Java Garbage Collection
JVM configuration Links
Java HotSpot VM
Oracle JRockit JVM
Hotspot VM Options
the Java application launcher
Eclipse Heap Analyzer
Top Memory Problems - Part 1
Top Memory Problems - Part 2
How to analyze leaky web apps
Memory Analyzer Documentation
Java Performance - Charlie Hunt, 2011
Dozens of JVM options can affect benchmarking. Some relevant ones are:
Type of JVM: server (-server) versus client (-client).
Ensuring sufficient memory is available (-Xmx).
Type of garbage collector used (advanced JVMs offer many tuning options, but be careful).
Whether or not class garbage collection is allowed (-Xnoclassgc). The default is that class GC occurs; it has been argued that using -Xnoclassgc is a bad idea.
Whether or not escape analysis is being performed (-XX:+DoEscapeAnalysis).
Whether or not large page heaps are supported (-XX:+UseLargePages).
If thread stack size has been changed (for example, -Xss128k).
Whether or not JIT compiling is always used (-Xcomp), never used (-Xint), or only done on hotspots (-Xmixed; this is the default, and highest performance option).
The amount of profiling that is accumulated before JIT compilation occurs (-XX:CompileThreshold), whether background JIT compilation is disabled (-Xbatch), and whether tiered JIT compilation is used (-XX:+TieredCompilation).
Whether or not biased locking is being performed (-XX:+UseBiasedLocking); note that it is enabled by default from JDK 1.6 on, but deprecated and disabled by default since JDK 15.
Whether or not the latest experimental performance tweaks have been activated (-XX:+AggressiveOpts).
Enabling or disabling assertions (-enableassertions and -enablesystemassertions).
Enabling or disabling strict native call checking (-Xcheck:jni).
Enabling memory location optimizations for NUMA multi-CPU systems (-XX:+UseNUMA).
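Because so many options influence the numbers, it is useful to record the JVM configuration together with every benchmark result. The sketch below collects that information through the standard `RuntimeMXBean`; `BenchmarkEnvironment` is a hypothetical helper, and the exact output depends on the JVM it runs on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper that describes the JVM a benchmark is running on.
final class BenchmarkEnvironment {

    static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("vm=").append(runtime.getVmName())
          .append(' ').append(runtime.getVmVersion()).append('\n');
        sb.append("processors=")
          .append(Runtime.getRuntime().availableProcessors()).append('\n');
        sb.append("maxHeapBytes=")
          .append(Runtime.getRuntime().maxMemory()).append('\n');
        // One line per JVM option passed on the command line (-Xmx, -XX:..., etc.).
        for (String arg : runtime.getInputArguments()) {
            sb.append("arg=").append(arg).append('\n');
        }
        return sb.toString();
    }
}
```

Printing `BenchmarkEnvironment.describe()` at the start of a benchmark run makes it possible to tell later whether two results were obtained under the same options.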