How to troubleshoot performance degradations.
If a performance degradation is experienced, it is often not clear which components are involved and what needs to be checked. This script should provide some guidance.
- Note the software and release versions of involved Operating System (OS), Datameer (DM) package, Cluster distribution (HDP, CDH, ...) and Browser.
- Capture logs! Like application log file, job trace(s). Double check for correct information. Force to gather Hadoop task logs if missing. Include from
hadoop-metrics.log.*. Do this as early as possible.
- Capture configuration files from
<datameer-install-path>/etc. Check them for valid values.
- Create a Heap Dump for Datameer Java Process. It might be necessary to force em.
- Gather a database dump.
- Gather Datameer Hadoop Cluster configuration.
hdfs-site.xmlfrom the cluster.
- Review the settings regarding the
job-conf-cluster.xmlfrom available job traces.
- Stop unnecessary imports, jobs and workbooks. Remove them from the queue.
- Depending on the issue get the data pipeline (workbook and any upstream workbooks and import jobs / data links) as JSON and sample data.
- Capture screenshots.
- Explore more details about the pattern for which workbooks or sheets this occurs on and which workbooks sheets this does not occur on to try to detect a predictable pattern.
Contact Datameer support. It might be necessary to proceed further with Data Collection for Root Cause Analysis. Providing as much information as possible will give us the capability to resolve an issue fast.