
SAP HANA SAVEPOINT : An Introduction

This blog is for beginners in HANA savepoints. If you already know how they work, feel free to skip this blog and read the one mentioned as Advanced HANA Savepoint. I have used Klaus's work as a reference in this blog, and you can check his blog at the link.



We all know that the unique selling point of HANA is that it is an in-memory database, which means that the data is stored and processed in RAM. The first thing that popped into my mind after hearing this was: if the data is stored in RAM, what happens when you turn off the system? As RAM is volatile in nature, how is persistency maintained?


So where persistency is concerned, SAVEPOINTS come into action. Savepoints are required to synchronize the changes in memory with the data on disk. A savepoint is a periodic point in time at which all changed data is written to storage in the form of pages; the data is flushed from memory to the data volumes.


In layman's terms: how is the data saved from RAM to disk, which is non-volatile storage? How does HANA as a database fulfil the D (durability) of the ACID properties? The answer to all of this is the savepoint.


All modified pages of the row and column store are written to disk during a savepoint. A page can be considered the block of data that is transferred from memory to disk.


Points to Note for HANA Savepoints :-


  1. Each SAP HANA host and service has its own savepoint.

  2. The data that belongs to a savepoint represents a consistent state of the data on disk.

  3. No changes are made to this savepoint data until the next savepoint operation has completed [the previous consistent state is not modified until the next savepoint is complete].


When are savepoints triggered?


  1. Savepoint interval (automatic) : During normal operation, savepoints are automatically triggered after a specific time interval. This interval is controlled by the parameter [persistence] -> savepoint_interval_s in global.ini.

The default value is 300 seconds, so savepoints are taken every 300 seconds, i.e. every 5 minutes.

  2. Manual trigger : We can trigger a savepoint manually with ALTER SYSTEM SAVEPOINT.

  3. Soft shutdown

A soft shutdown triggers a savepoint, which is why you get a quick restart afterwards (the data on disk is in a consistent state and the log segments do not need to be replayed). This is not the case with a hard shutdown, where the logs have to be processed (not from the beginning, but only from the last savepoint position).

  4. Backup

A global savepoint is performed before a data backup is started, and a savepoint is written after the backup of a service has finished.

  5. Startup

After a consistent database state is reached during startup, a savepoint is performed.

  6. Reclaim Data Volume

  7. Auto Merge Function (mergedog)

  8. Snapshots

A savepoint normally overwrites the older savepoint, but it is possible to freeze a savepoint; this is known as a snapshot. Snapshots are savepoints that are preserved for longer use and are therefore not overwritten by the next savepoint.
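The interval parameter and the manual trigger described above can be sketched on SQL level as follows (a minimal sketch; the 300 is simply the default value used for illustration, verify the syntax against your revision):

```sql
-- Check/change the automatic savepoint interval (global.ini -> [persistence]).
-- 'SYSTEM' scope applies the change to all hosts; WITH RECONFIGURE activates it online.
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('persistence', 'savepoint_interval_s') = '300' WITH RECONFIGURE;

-- Trigger a savepoint manually, e.g. before a planned restart:
ALTER SYSTEM SAVEPOINT;
```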

A HANA savepoint is split into three individual phases:

Phase 1 (PAGEFLUSH): All changed pages that have not yet been written to disk are determined. The savepoint coordinator triggers the writing of all these pages and waits until the I/O operations are complete. Write transactions are still allowed in this phase.
 
Phase 2 (BLOCKING):

The majority of the savepoint is performed online without holding a lock, but the finalization of the savepoint requires a lock. (Allow me to add that a savepoint interval below 5 minutes does not cause immediate problems, but locks must be held at every savepoint, so issues can surface; in real-life scenarios we have seen problems with a 3-minute interval.) This step is called the blocking phase of the savepoint. It consists of two sub-phases:


Sub-phase WaitForLock (thread detail: enterCriticalPhase(waitForLock)): this is the time spent waiting to acquire all the required locks. Before the critical phase can be entered, the savepoint must acquire a ConsistentChangeLock. If this lock is held by other threads / transactions, the duration of this phase increases. At the same time, all DML on the underlying tables (INSERT, UPDATE or DELETE) is blocked by the savepoint via the ConsistentChangeLock.

Sub-phase Critical (thread detail: processCriticalPhase): this is one short moment when the database is in a sort of hung state; "NO" operations are performed, and finalization is done here. Once the ConsistentChangeLock is acquired, the actual critical phase is entered and the remaining I/O writes are performed in order to guarantee a consistent set of data on disk level. During this time, other transactions are not allowed to perform changes on the underlying tables and are blocked by the ConsistentChangeLock.



Phase 3 (POSTCRITICAL): Changes are allowed in this phase again. The savepoint coordinator waits until all asynchronous I/O operations related to the savepoint are finished and marks the savepoint as completed.
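While a savepoint is running, you can watch which sub-phase its thread is currently in. A sketch using M_SERVICE_THREADS (the THREAD_TYPE value follows the SPS 10 behaviour; the exact THREAD_DETAIL strings depend on your revision):

```sql
-- Show the current state of running savepoint threads;
-- THREAD_DETAIL reflects the sub-phase, e.g. enterCriticalPhase(waitForLock).
SELECT HOST, PORT, THREAD_STATE, THREAD_METHOD, THREAD_DETAIL
FROM   M_SERVICE_THREADS
WHERE  THREAD_TYPE = 'PeriodicSavepoint';
```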

Helpful views when we talk about savepoints

M_SAVEPOINT_STATISTICS : global savepoint information per host and service.

M_SAVEPOINTS : detailed information for individual savepoints.

M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES : as of SAP HANA SPS 10, savepoint details are logged for THREAD_TYPE = 'PeriodicSavepoint' (see SAP Note 2114710).
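A quick way to use the first two views (plain SELECTs; the START_TIME column name is from memory, so verify it against the view definition in your revision):

```sql
-- One row per host/service with aggregated savepoint figures:
SELECT * FROM M_SAVEPOINT_STATISTICS;

-- The most recent individual savepoints, newest first:
SELECT TOP 10 * FROM M_SAVEPOINTS ORDER BY START_TIME DESC;
```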


Helpful SQL scripts when we talk about savepoints.

SAP Note 1969700 – SQL statement collection for SAP HANA contains the self-explanatory scripts SQL: "HANA_IO_Savepoints" [for savepoints] and SQL: "HANA_IO_Snapshots" [for snapshots].

Known issues with savepoints


Symptom: long waitForLock phase (thread detail: enterCriticalPhase(waitForLock)). Long durations of the blocking phase (outside of the critical phase) are typically caused by SAP HANA internal lock contention on the ConsistentChangeLock. Starting with Rev. 102 you can configure the following parameter in order to trigger a runtime dump (SAP Note 2400007) in case waiting to enter the critical phase takes longer than <seconds> seconds:

indexserver.ini -> [persistence] -> runtimedump_for_blocked_savepoint_timeout = '<seconds>'

(This is not a default parameter; add it manually.)

Symptom: long critical phase (thread detail: processCriticalPhase). Delays during the critical phase are often caused by problems in the disk I/O area.
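The runtime-dump parameter can also be set on SQL level. A sketch (the value 120 is an example threshold, not a recommendation; verify the syntax against your revision):

```sql
-- Add the non-default parameter to indexserver.ini at SYSTEM scope;
-- a runtime dump is then written if entering the critical phase
-- is blocked for longer than 120 seconds.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
  SET ('persistence', 'runtimedump_for_blocked_savepoint_timeout') = '120'
  WITH RECONFIGURE;
```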



Analyzing the runtime dumps


All the dumps are created in the trace directory (a quick way to get there is the cdtrace alias). Look for the file indexserver_<hostname>.30003.rtedump.<timestamp>.savepoint_blocked.trc, which is written when the parameter runtimedump_for_blocked_savepoint_timeout triggers.



In the dump we can find the savepoint thread: its call stack contains "DataAccess::SavepointLock::lockExclusive", i.e. the exclusive lock needs to be acquired. Other threads (SQL threads) waiting for the lock have call stacks containing "DataAccess::SavepointSPI::lockSavepoint".

Runtime dump section [SAVEPOINT_SHAREDLOCK_OWNERS]:

Most of the time the savepoint hangs because the exclusive lock cannot be granted while other threads hold the savepoint lock in shared mode. This situation can be resolved with the help of SAP Note 2100009.

When you check the owners of the shared savepoint locks, you get the threads that currently hold the lock. After you have the thread ID of a shared-lock owner, search for it and find its parent thread ID. Once you have found the parent, analyze and resolve the issue with that thread first; the parent process will then eventually release the lock.

Runtime dump section [STATISTICS], view M_SAVEPOINTS:

We check two values here.

CRITICAL_PHASE_WAIT_TIME : a large value here is the time required to acquire the ConsistentChangeLock. It indicates an issue with the savepoint, specifically with the exclusive lock.

CRITICAL_PHASE_DURATION : a large value here means there is an issue with the I/O.
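To separate the two cases, these columns can be read directly from M_SAVEPOINTS. A sketch (the START_TIME column name is from memory; verify the column list on your system):

```sql
-- Savepoints with the longest blocking phases, worst first.
-- High CRITICAL_PHASE_WAIT_TIME -> lock contention (ConsistentChangeLock)
-- High CRITICAL_PHASE_DURATION  -> disk I/O problems
SELECT TOP 10 START_TIME, CRITICAL_PHASE_WAIT_TIME, CRITICAL_PHASE_DURATION
FROM   M_SAVEPOINTS
ORDER  BY CRITICAL_PHASE_WAIT_TIME DESC;
```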


