If your IQ instance suddenly crashes or stops responding, the following troubleshooting steps can help diagnose the root cause:
- Collect Diagnostics Before Restarting: Before restarting the IQ service, collect thread dumps and system diagnostics. This helps identify what the process was doing at the time of the crash. See "Generate thread dumps from Nexus Lifecycle java process" for details.
- Review IQ Server Logs: Examine clm-server.log and request.log for errors or warnings around the time of the crash. Look for abrupt stops, out-of-memory errors, or network issues.
- Check JVM and Memory Settings: Ensure that JVM heap and direct memory settings are appropriate for your server's available resources. Over-allocating memory to the JVM can starve the OS and lead to crashes.
- Check System Logs and Resources:
  - Review the systemd service logs (for example, with journalctl) for any errors or warnings around the time of the crash.
  - Review CPU and memory usage on the server. Insufficient memory or CPU can cause the JVM to be killed by the OS (e.g., by the Linux OOM killer). This can be verified by checking /var/log/messages for entries like "killed process" to confirm whether the OS terminated the process due to resource exhaustion.
  - Consider increasing RAM and CPU if resources are low.
- Monitor and Gather Telemetry: If possible, collect 24 hours of telemetry (memory, IO, network, etc.) prior to the incident to help identify trends or resource exhaustion.
- Check for External Factors: If Nexus is behind a load balancer or reverse proxy, check those logs for connection issues. Also verify network stability and database connectivity if using an external database.
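The log checks in the list above can be partly automated. The following is a minimal Python sketch, not a Sonatype tool. It assumes the kernel writes OOM-killer entries containing "Killed process <pid> (<name>)", which is the usual wording on Linux but can vary by kernel version, and that Java heap exhaustion appears in clm-server.log as the standard `java.lang.OutOfMemoryError` string. The function names are hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple

# Assumed kernel wording, e.g. "Out of memory: Killed process 4242 (java) ...".
# Matching is case-insensitive because capitalization differs between kernels.
OOM_KILL = re.compile(r"killed process (\d+) \(([^)]+)\)", re.IGNORECASE)


class OomKill(NamedTuple):
    line_no: int
    pid: int
    name: str
    text: str


def find_oom_kills(lines: Iterable[str], process_name: str = "java") -> Iterator[OomKill]:
    """Yield OOM-killer entries whose victim process is named process_name."""
    for line_no, line in enumerate(lines, start=1):
        match = OOM_KILL.search(line)
        if match and match.group(2) == process_name:
            yield OomKill(line_no, int(match.group(1)), match.group(2), line.rstrip())


def find_java_oom_errors(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line number, text) for lines that mention java.lang.OutOfMemoryError."""
    for line_no, line in enumerate(lines, start=1):
        if "java.lang.OutOfMemoryError" in line:
            yield line_no, line.rstrip()
```

To use it, open the file and pass the handle, for example `find_oom_kills(open("/var/log/messages", errors="replace"))`. Reading system logs usually requires elevated privileges. A hit only shows that the OS killed a process with that name, so compare the timestamp and PID with the IQ Server incident before drawing conclusions.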
Following these steps maximizes the chances of identifying the root cause of a sudden Nexus crash. If the issue persists or you need additional help with the root cause analysis (RCA), provide all collected data to Sonatype support, including a full support zip (without file limits), for further assistance.
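For the JVM and memory settings check, the comparison between configured JVM memory and physical RAM can be sketched in Python as below. It reads the standard HotSpot options `-Xmx` and `-XX:MaxDirectMemorySize` from a list of JVM arguments. The helper names and the 75% threshold are illustrative assumptions, not Sonatype sizing guidance. The result is a lower bound, since the JVM also uses memory for metaspace, thread stacks, and the code cache.

```python
import re
from typing import Dict, Iterable, Union

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_jvm_size(value: str) -> int:
    """Convert a JVM size string such as '4g' or '512m' to bytes."""
    match = re.fullmatch(r"(\d+)([kmgt]?)", value.strip().lower())
    if not match:
        raise ValueError(f"unrecognized JVM size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def jvm_memory_budget(
    jvm_args: Iterable[str],
    total_ram_bytes: int,
    max_fraction: float = 0.75,  # assumed threshold, adjust for your environment
) -> Dict[str, Union[int, bool]]:
    """Compare max heap plus max direct memory against a share of physical RAM."""
    heap = direct = 0
    for arg in jvm_args:
        if arg.startswith("-Xmx"):
            heap = parse_jvm_size(arg[len("-Xmx"):])
        elif arg.startswith("-XX:MaxDirectMemorySize="):
            direct = parse_jvm_size(arg.split("=", 1)[1])
    requested = heap + direct
    return {
        "heap": heap,
        "direct": direct,
        "requested": requested,
        "fits": requested <= total_ram_bytes * max_fraction,
    }
```

For example, `jvm_memory_budget(["-Xmx4g", "-XX:MaxDirectMemorySize=2g"], 8 * 1024**3)` reports whether 6 GiB of configured JVM memory fits within the assumed share of an 8 GiB host. If an option is absent from the argument list, it is counted as 0 rather than the JVM default, so treat missing values as unknown.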