When troubleshooting a Production Down issue, an administrator who wants to investigate the root cause must collect diagnostic information before the environment is restarted, because the restart discards the state of the running process.
Even when all this information is available, it may not be possible to determine a root cause.
Prerequisites
It is assumed that you have prepared commands for faster operations as mentioned in our Installation Guide and our knowledge base article Setup Bash Shell Aliases.
Otherwise, you can create the command needed for the steps below with:
alias dmpid='ps -ef | grep -i "java.*jetty.*datameer" | grep -v grep | tr -s " " | cut -d " " -f2'
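Bash does not expand aliases in non-interactive shells unless expand_aliases is set, so the alias above may not resolve inside a script. As a minimal sketch, assuming the same process pattern as the alias, the lookup can be written as a shell function instead. The helper name dm_filter_pids is hypothetical and is not part of Datameer's tooling.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef`-style lines on stdin and print the PID
# (second column) of every line that matches the Datameer Jetty JVM pattern.
dm_filter_pids() {
  grep -i "java.*jetty.*datameer" | grep -v grep | awk '{print $2}'
}

# Function equivalent of the dmpid alias; usable in scripts as $(dmpid).
dmpid() {
  ps -ef | dm_filter_pids
}
```

If more than one process matches, the function prints one PID per line, so check its output before passing it to jmap or jstack.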
Items to Collect before Restarting Datameer
- Gather the output of dmesg and the /var/log/messages file.
- Get all logs from the logs/ directory that were updated in at least the past day or two, especially the application log file (also known as conductor.log):
tar -zcvf rca.tar.gz logs/*.*
- To list and collect the open files (lsof), run the following on the Datameer environment:
lsof > lsof.out
- Gather information about network connections that are in a WAIT state:
netstat -tonp | grep -i WAIT > netstat.out
- Force a heap dump
$JAVA_HOME/bin/jmap -F -dump:format=b,file=heapdump.hprof $(dmpid)
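- If the /dev page is accessible, collect a thread dump (i.e. http(s)://<host>:<port>/dev/threaddump).
- Also collect a Java thread dump, if possible, with jstack or by sending SIGQUIT (kill -3), which makes the JVM write the dump to its standard output:
$JAVA_HOME/bin/jstack -l $(dmpid) > jstack.out
kill -3 $(dmpid)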
- If a heap dump exists from an OutOfMemoryError, collect the heap dump file.
- If a javacore file exists, collect the javacore file.
- Gather the current status of the MySQL metastore processlist and save it to a file. Log into the MySQL host shell and run:
mysql -h<host> -u<userid> -p<password> -e 'SHOW FULL PROCESSLIST;' > processlist.txt
- Gather an application database dump, if possible.
mysqldump -h<host> -u<userid> -p<password> <database name> | gzip > Datameer-<version>-<dist>-<date>.sql.gz
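During an outage it is easy to skip a step or to stop at the first command that fails. The following is a minimal sketch, not a Datameer-provided tool, that runs the host-level commands from the list above one after another, keeps going when a step fails, and records the failures. The names RCA_DIR, run_step, and collect_all are hypothetical. The sketch assumes it is run from the Datameer installation directory (so that logs/ resolves), that JAVA_HOME is set, and that the PID is passed in by the caller. The MySQL steps are left out because they need credentials.

```shell
#!/bin/sh
# Illustrative collection wrapper. Nothing is executed until collect_all is called.

# Output directory; override by exporting RCA_DIR before calling the functions.
RCA_DIR="${RCA_DIR:-rca-$(date +%Y%m%d-%H%M%S)}"

# run_step <output-file> <command...>
# Runs the command, saves stdout to $RCA_DIR/<output-file>, appends stderr to
# errors.log, and reports the result without aborting the remaining steps.
run_step() {
  _step_out="$1"
  shift
  mkdir -p "$RCA_DIR"
  if "$@" > "$RCA_DIR/$_step_out" 2>> "$RCA_DIR/errors.log"; then
    echo "ok: $_step_out"
  else
    echo "FAILED: $_step_out ($*)" | tee -a "$RCA_DIR/errors.log"
  fi
}

# collect_all <pid>: run the collection commands from this article in order.
collect_all() {
  run_step dmesg.out    dmesg
  run_step messages.out cat /var/log/messages
  # tar -v lists the archived files on stdout, which ends up in tar.out.
  run_step tar.out      tar -zcvf "$RCA_DIR/rca.tar.gz" logs/*.*
  run_step lsof.out     lsof
  # grep exits 1 when nothing matches, which is not an error here.
  run_step netstat.out  sh -c 'netstat -tonp | grep -i WAIT || true'
  run_step jstack.out   "$JAVA_HOME/bin/jstack" -l "$1"
  run_step jmap.out     "$JAVA_HOME/bin/jmap" -F \
                        "-dump:format=b,file=$RCA_DIR/heapdump.hprof" "$1"
}
```

With the dmpid command available, a run might look like collect_all "$(dmpid)". Afterwards, review errors.log in the output directory to see which items could not be collected.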
Further Troubleshooting
Even when all the information above is collected, the root cause itself may require further troubleshooting to diagnose. If an issue is recurring, Datameer Support may recommend activating Memory Profiling as described in our Documentation. Memory Profiling is not generally recommended and should be activated only at the request of Datameer Support.