Why can SAP HANA go offline?
Power failure in data center causes downtime.
Server-level hardware failures (CPU/memory) contribute.
Storage-level hardware failures (disk) impact operations.
Network-level hardware failures (switches/routers) disrupt connectivity.
Operating system errors (Linux) lead to downtime.
Storage system errors (SAN/NAS) contribute to issues.
Database errors (SAP HANA) affect overall performance.
Human errors in server, router, storage, Linux, and SAP HANA cause downtime.
What is the system-down scenario? In this situation SAP HANA cannot be accessed through SQL or any other connection method. SAP HANA cockpit might only be able to connect partially to the SAP HANA system.
Ping selected hosts in the data center.
Ensure stable and responsive network connections.
Both external and internal network connections are important for an SAP HANA system, so test both by pinging SAP HANA and non-SAP HANA hosts in your network.
ping <SAP HANA host>
ping <internal host>
ping <external host>
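When there are many hosts to test, the ping checks above can be wrapped in a small loop. This is only a sketch: the function name check_hosts is made up for illustration, and the -c and -W options assume the Linux iputils version of ping.

```shell
# check_hosts HOST...: ping each host three times and report the result.
check_hosts() {
    failed=0
    for host in "$@"; do
        # -c 3: send three echo requests; -W 2: wait up to 2 seconds per reply
        if ping -c 3 -W 2 "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=$((failed + 1))
        fi
    done
    # Return a non-zero status if at least one host did not answer
    [ "$failed" -eq 0 ]
}

# Usage (replace the placeholders with your own host names):
#   check_hosts <SAP HANA host> <internal host> <external host>
```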
It is possible that the host is reachable but the network packets are taking a longer route than expected because of a routing problem. Use traceroute to see the path they take.
traceroute <SAP HANA host>
traceroute <internal host>
If your company utilizes a virtual desktop infrastructure (VDI) solution for end-user network connections or operates within a dedicated network, it is advisable to perform network connection tests specifically within these infrastructures.
Log in using SSH to selected hosts.
Verify the operating system's status and ensure there are no issues.
Check when the host was last rebooted. An unexpected reboot some time ago may be what caused the issue.
last | grep boot
You can also use the uptime command.
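On Linux the time since boot is also available in /proc/uptime, which is handy in scripts. The helper names below are illustrative only, and the sketch assumes a Linux host where /proc/uptime exists.

```shell
# format_uptime SECONDS: print a number of seconds as "Nd Nh Nm".
format_uptime() {
    secs=${1%%.*}   # drop the fraction; /proc/uptime reports values like 12345.67
    days=$((secs / 86400))
    hours=$(( (secs % 86400) / 3600 ))
    mins=$(( (secs % 3600) / 60 ))
    printf '%dd %dh %dm\n' "$days" "$hours" "$mins"
}

# host_uptime: read the first field of /proc/uptime (seconds since boot).
host_uptime() {
    read -r up _ < /proc/uptime
    format_uptime "$up"
}

# Usage:
#   host_uptime
```

A value of only a few minutes or hours tells you the host was rebooted recently.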
Analyzing the Linux system log files is one of the most important tasks during troubleshooting. Since the move from syslog to systemd, kernel messages and messages of system services are handled by systemd.
Systemd was introduced in SLES 12 and RHEL 7 and replaces the traditional init scripts. Systemd also introduced its own logging system called journal. Systemd manages the journal as a system service under the name systemd-journald.service and it is switched on by default.
journalctl -n 50 -p err -b
-n = number of messages to display
-p = message priority (err shows messages of priority err and more severe)
-b = show only messages from the current boot
If you use -k instead of -b, you get the kernel messages.
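Two variations of the journalctl call are sketched below: one for kernel messages only, and one limited to a time window. The wrapper function names are invented for this example. The --since and --no-pager options are standard journalctl options.

```shell
# kernel_errors: kernel messages of priority "err" or worse (-k implies the current boot).
kernel_errors() {
    journalctl -k -p err --no-pager
}

# errors_since WHEN: messages of priority "err" or worse since WHEN,
# for example "1 hour ago" or "2024-01-31 08:00".
errors_since() {
    journalctl -p err --since "$1" --no-pager
}

# Usage:
#   kernel_errors | tail -n 50
#   errors_since "1 hour ago"
```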
In some scenarios journalctl is not activated and therefore does not work. In that case you can check the logs in /var/log:
grep -i 'error' /var/log/messages | tail -n 50
The /var/log directory contains several other log files that you can search in the same way.
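Since the same search is usually repeated for several log files, it can be wrapped in a small function. This is a sketch: recent_errors is a hypothetical name, and which files exist under /var/log depends on your distribution.

```shell
# recent_errors FILE [N]: print the last N lines (default 50) of FILE
# that contain "error", ignoring case.
recent_errors() {
    file=$1
    count=${2:-50}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    grep -i 'error' "$file" | tail -n "$count"
}

# Usage:
#   recent_errors /var/log/messages
#   recent_errors /var/log/messages 10
```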
Storage System Test:
Create, read, or delete a file on the storage system.
Confirm seamless connection and functionality.
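The create, read, and delete test can be scripted as below. This is a minimal sketch: the function name storage_rw_test and the probe file name are made up for illustration, and the directory you pass in should be one of your own SAP HANA mount points.

```shell
# storage_rw_test DIR: create, read back, and delete a small file in DIR.
storage_rw_test() {
    dir=$1
    # mktemp fails if DIR is missing or not writable
    probe=$(mktemp "$dir/hana_probe.XXXXXX") || { echo "create failed in $dir"; return 1; }
    token="probe-$$"
    echo "$token" > "$probe" || { echo "write failed: $probe"; rm -f "$probe"; return 1; }
    [ "$(cat "$probe")" = "$token" ] || { echo "read-back mismatch: $probe"; rm -f "$probe"; return 1; }
    rm "$probe" || { echo "delete failed: $probe"; return 1; }
    echo "storage OK: $dir"
}

# Usage (the path is an assumption; use your own data and log volumes):
#   storage_rw_test /hana/data
```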
To check the storage status:
df -Th | grep hana
df -Th | grep root
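A full filesystem is easy to miss in a long df listing. The sketch below reads df output and prints only the mount points at or above a usage limit. It uses df -P instead of -Th because the POSIX format has a fixed column layout. The function name and the 90 percent limit are illustrative choices.

```shell
# over_threshold LIMIT: read `df -P` output on stdin and print the mount
# points whose usage percentage is at or above LIMIT.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {
            use = $5              # "Capacity" column, e.g. 95%
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6 " " use "%"
        }'
}

# Usage:
#   df -P | over_threshold 90
```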
SAN and NAS feature integrated error correction and redundancy to handle power and hardware failures. Modern Linux file systems, being journal-based, can rectify errors caused by power interruptions. Furthermore, databases employ diverse techniques to withstand power failures and cope with improper service shutdown situations.
SAP HANA Database Services:
Use sapcontrol to test the status of all SAP HANA database services.
Ensure all services are running as expected.
Check the SAP HANA process list:
sapcontrol -nr <instance number> -function GetProcessList
To see the status of all instances of the system, use:
sapcontrol -nr <instance number> -function GetSystemInstanceList
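GetProcessList prints one comma-separated line per service, with the status color (GREEN, YELLOW, RED, or GRAY) in the third column. The sketch below filters that output down to the services that are not GREEN. The function name not_green is made up, and the parsing assumes the usual "name, description, dispstatus, ..." layout of the output.

```shell
# not_green: read GetProcessList output on stdin and print every service
# whose dispstatus is not GREEN. Exit status is 1 if any such service exists.
not_green() {
    awk -F', ' '
        $3 ~ /^(GREEN|YELLOW|RED|GRAY)$/ && $3 != "GREEN" { print $1 " " $3; bad = 1 }
        END { exit bad + 0 }
    '
}

# Usage:
#   sapcontrol -nr <instance number> -function GetProcessList | not_green
```

No output means that every listed service reports GREEN.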
SQL Database Test:
Use hdbsql to test the SQL database for the application user(s).
Confirm proper functionality and responsiveness of the SAP HANA SQL database.
The default port number range for tenant databases is 3<instance>40 - 3<instance>99.
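As a worked example of the port pattern above, the following sketch prints the default tenant port range for a given two-digit instance number. The function name tenant_port_range is an illustrative choice.

```shell
# tenant_port_range NN: print the default tenant port range for instance NN.
tenant_port_range() {
    case $1 in
        [0-9][0-9]) echo "3${1}40-3${1}99" ;;
        *) echo "instance number must be two digits, for example 00" >&2; return 1 ;;
    esac
}

# Usage:
#   tenant_port_range 00    # prints 30040-30099
#   tenant_port_range 12    # prints 31240-31299
```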
The next step is to check whether the SQL interface connection is working.
From the HANA host itself:
hdbsql -n localhost -i <instance_number> -d <Tenant_name> -u <Your_database_user>
It is crucial to test all your tenants, as each tenant has a different SQL port.
If you want to connect using a secure user store (hdbuserstore) key, use -U <key> instead of -u.
It is possible that you can connect from the HANA host itself but the application is not able to log in. In that case, check from the application host:
hdbsql -n <SAP HANA host> -i <instance_number> -d <Tenant_name> -u <Your_database_user>
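To test every tenant in one go, you can loop over one hdbuserstore key per tenant and run a trivial query with each. This is a sketch under assumptions: the function name check_sql and the key names in the usage comment are hypothetical, the keys must already exist in the user store of the OS user running the script, and it assumes hdbsql returns a non-zero exit status when the connection or the query fails.

```shell
# check_sql KEY...: run a trivial query with each hdbuserstore key.
check_sql() {
    failed=0
    for key in "$@"; do
        # -U selects a stored key, so no password is typed or shown
        if hdbsql -U "$key" "SELECT * FROM DUMMY" >/dev/null 2>&1; then
            echo "SQL OK   $key"
        else
            echo "SQL FAIL $key"
            failed=$((failed + 1))
        fi
    done
    # Return a non-zero status if at least one connection failed
    [ "$failed" -eq 0 ]
}

# Usage (key names are examples only):
#   check_sql TENANT1KEY TENANT2KEY
```

Run it once on the HANA host and once on the application host. A key that works on the first but fails on the second points to a network or client-side problem.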
By performing these steps you will be able to identify the exact issue that is causing your HANA system to be down.