Row Store in HANA Database

Hello Consultants, in the last blog we looked at the different table types and their impact.

Let's dive deeper and understand what the row store is and how it works.

In a row store, records are kept in main memory in the same form in which they are inserted into the table, i.e. the data is stored in a form resembling the logical table structure. Each record is saved in memory as one concatenated chunk that holds the values of all its columns.



Source : HANA ADMIN BOOK
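
To make the "one concatenated chunk per record" idea concrete, here is a minimal Python sketch. It is only an illustration and not how HANA lays out its pages internally: the fixed-width record format, the sample employees and the helper names are all made up for this example.

```python
import struct

# Hypothetical fixed-width record: employee id (int32), name (10 bytes), city (10 bytes).
RECORD = struct.Struct("<i10s10s")

rows = [
    (1, b"Asha", b"Pune"),
    (2, b"Ben", b"Berlin"),
    (3, b"Chen", b"Pune"),
]

# Row store idea: every record is one contiguous chunk, and the chunks follow each other.
row_store = b"".join(RECORD.pack(*r) for r in rows)

def read_row(buf, i):
    """Fetch one complete record with a single contiguous read."""
    emp_id, name, city = RECORD.unpack_from(buf, i * RECORD.size)
    return emp_id, name.rstrip(b"\x00"), city.rstrip(b"\x00")

def read_city_column(buf):
    """Reading just one column still walks over every complete record."""
    return [read_row(buf, i)[2] for i in range(len(buf) // RECORD.size)]

print(read_row(row_store, 1))
print(read_city_column(row_store))
```

Fetching a whole record is a single read at a known offset, while collecting only the city values still touches every record. That is the trade-off discussed in the advantages and disadvantages of the row store.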




Properties in HANA :-


Advantages :-

1. The logical table layout and the operations performed on it map directly to the actual data manipulation that happens in memory, which makes the row store easy to understand for developers and administrators.

2. When records are most often accessed with all of their columns, and mass data processing and analysis do not play any role, row store tables can show better performance than column store tables.

Disadvantages :-

1. The DBMS cannot directly access a specific column of a table; whole data pages need to be transferred.

2. Structuring the data representation by row is not very effective for many types of operations. Every value is stored again for each occurrence of that value within the table.

3. Even with normalized data models, the repetition of data, especially for very common values, cannot be prevented because foreign key references need to be stored. On top of that, these references need to be resolved during processing by joins, which need high computational power.

Note :- Row store tables are held entirely in main memory, unlike column store tables.


Limitations of HANA Row Store 

1. Row store tables cannot be partitioned, which limits the possible total size of all row store tables to the memory available on the single server on which the tables are located.

What this means: suppose you have 2 servers with 512 GB of memory each and a table of 1024 GB. In row store you cannot store it across the two servers: you will need a single server with at least 1024 GB of memory.

In column store: you can partition the table and distribute it across the two servers.
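
The sizing rule from this example can be written down as a tiny check. This is a simplified sketch: the function names are hypothetical, and it ignores everything else that competes for memory on a real host (working memory, other tables, code and stack).

```python
def fits_on_single_host(table_gb, host_memory_gb):
    """Row store: the whole table must fit into the memory of one host."""
    return any(table_gb <= mem for mem in host_memory_gb)

def fits_when_partitioned(table_gb, host_memory_gb):
    """Column store: partitions can be spread over all hosts."""
    return table_gb <= sum(host_memory_gb)

hosts = [512, 512]  # two servers with 512 GB each
print(fits_on_single_host(1024, hosts))    # row store case
print(fits_when_partitioned(1024, hosts))  # partitioned column store case
```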


2. HANA offers no compression for row store tables.


3. Columns in row store cannot be accessed independently and in parallel.

For example, take a table with the columns Name, City, Mobile No, Employee ID etc. You cannot access only Name and City and expect fast processing in HANA. It does not work that way.

But this does not mean that row store data cannot be processed in parallel. In fact, many operations such as sorting, grouping, index creation and window function processing can be heavily parallelized.


4. Row store tables cannot be displaced from memory. They must always be in memory while the system is up and running. Therefore the tables are automatically loaded into memory during system startup. Obviously, this increases the startup time.


5. If the row store tables cannot be loaded fully into memory, the system cannot start.


6. In most SAP HANA information models, row store tables cannot be used directly as a data source.


For SAP NW systems running on SAP HANA, it is predefined which tables shall be row store tables. Upon installation or migration of an SAP NW system on the SAP HANA database, the correct assignment is performed automatically.


If you want to check all the tables that are stored in row store :-

select * from M_RS_TABLES

select * from M_RS_TABLES where HOST='<worker node>'

(if checking on particular hosts)

Two important aspects of HANA :-


1. Multi-version concurrency control (MVCC) : Lock-free data access and manipulation while maintaining transactional consistency.

2. Indexes : A technique for optimizing data access.


Diving deep into both of these aspects


Multi-Version Concurrency Control

MVCC is a well-known technique that gives multiple sessions parallel access to the same pieces of information, even when one or more sessions are actively changing that information.

This is achieved by keeping copies of the original versions of a record and presenting each session with the version appropriate to the sequence of system changes (COMMITs) that the session has been exposed to.



For developers and administrators this happens automatically, and no additional care or precaution is needed. However, it is implemented in different ways in row store and column store, which brings different challenges for the administrator.

In row store, each changed page is first copied and placed into a chain of page versions, with each version reflecting the state of the data at a specific commit point. These page chains are stored in virtual container structures called undo cleanup files, which can be monitored in M_UNDO_CLEANUP_FILES. This is generally not a concern for the administrator, as it is managed by the garbage collector. The noteworthy point is that clearing these versions won't immediately result in free usable memory.

The garbage collector can only remove those old versions for which the transaction has completed (either committed or rolled back).

One known issue :- If a transaction modifies tens of thousands of records without committing them, we end up in a situation in which a large amount of redundant row store data needs to be kept in main memory, as there will be tens of thousands of record locks and new active record versions kept in the database.

Source : wiki.scn.sap.com
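
The version chain idea can be sketched in a few lines of Python. This is a toy model for understanding only and not HANA's implementation: the class and method names are invented, versions are kept per record rather than per page, and it assumes that commit ids only ever increase and that the garbage collector knows the oldest snapshot still in use.

```python
class VersionedRecord:
    """Toy MVCC record: a chain of (commit_id, value) versions, oldest first."""

    def __init__(self, value):
        self.versions = [(0, value)]  # committed versions
        self.pending = {}             # txn_id -> uncommitted value

    def write(self, txn_id, value):
        # An uncommitted change is kept apart from the committed chain.
        self.pending[txn_id] = value

    def commit(self, txn_id, commit_id):
        self.versions.append((commit_id, self.pending.pop(txn_id)))

    def rollback(self, txn_id):
        self.pending.pop(txn_id, None)

    def read(self, snapshot_commit_id):
        """Return the newest version committed at or before the reader's snapshot."""
        visible = [v for c, v in self.versions if c <= snapshot_commit_id]
        return visible[-1]

    def garbage_collect(self, oldest_active_snapshot):
        """Drop committed versions that no active snapshot can see any more."""
        keep_from = 0
        for i, (c, _) in enumerate(self.versions):
            if c <= oldest_active_snapshot:
                keep_from = i
        self.versions = self.versions[keep_from:]


rec = VersionedRecord("A")
rec.write(txn_id=7, value="B")   # not visible to anyone else yet
print(rec.read(0))               # a reader still sees the old version
rec.commit(txn_id=7, commit_id=1)
print(rec.read(0), rec.read(1))  # old snapshot vs. new snapshot
rec.garbage_collect(oldest_active_snapshot=1)
print(rec.versions)
```

Note how the pending change stays around until the transaction commits or rolls back. A transaction that touches tens of thousands of records without committing keeps that many extra versions alive in the same way.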

Indexes 

As in other DBMSs, HANA also offers the concept of indexes.

Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed.

For the purpose of understanding, indexing is something like this



To review information about indexes on row store tables, we can use the monitoring view M_RS_INDEXES.

Indexes on row store tables are not saved to the persistence layer; they are rebuilt when the table is loaded into memory. This happens during index server startup, and the corresponding log entries are written to the trace file of the index server process.


For row store tables, two types of index are available :-

1. Classic B-tree index :- Used for all data types other than string, binary string or decimal types.

E.g. based on whatisdbms.com

Let’s take an example to explain how B-tree indexing is helpful. Imagine the books in a college library are arranged alphabetically, and the library has books for all departments such as Automobile, Aeronautical, Bio-tech, Chemical, Civil, Electronics and so on. After entering the library, you see that the ground floor contains books for department names A-G, the first floor H-N, the second floor O-U and the third floor V-Z. So, based on your requirement, you can quickly find the required book. Now consider the equivalent database search: imagine a books database table with a B-tree index on the dpt_name column. To find your Civil book, you can simply run a query that filters on the dpt_name column.
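
The library example can be sketched in Python. Please note that this is not a real B-tree: as a stand-in it uses a sorted list of keys with binary search, which shows the same idea of jumping to the right place instead of scanning every row. The sample books and the lookup function are made up for illustration; only the dpt_name column comes from the example above.

```python
import bisect

# Hypothetical books table: (dpt_name, title). Row position acts as the row id.
books = [
    ("Civil", "Structural Analysis"),
    ("Automobile", "Engine Design"),
    ("Electronics", "Circuit Theory"),
    ("Civil", "Surveying Basics"),
    ("Chemical", "Process Control"),
]

# Index on dpt_name: sorted (key, row position) pairs.
index = sorted((dpt, pos) for pos, (dpt, _) in enumerate(books))
keys = [dpt for dpt, _ in index]

def lookup(dpt_name):
    """Binary search in the sorted keys, then follow the row positions."""
    lo = bisect.bisect_left(keys, dpt_name)
    hi = bisect.bisect_right(keys, dpt_name)
    return [books[pos][1] for _, pos in index[lo:hi]]

print(lookup("Civil"))
print(lookup("Physics"))
```

Without the index, finding the Civil books means checking every row. With the sorted keys, the search narrows down in a few steps, just like going straight to the right floor of the library.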

2. CPB+-tree index :- Compressed prefix B+-tree index, which is highly optimized to handle character-based index keys in memory. It uses partial keys to store and navigate within the index structure.

To understand :-

This basically means, the B-tree index and leaf nodes do not contain the full strings for keys. Instead, the parts of the key-strings that are common among the keys (the prefixes) are stored separately. The leaf and index nodes then only contain

1. the pointer to the prefix
2. a kind of “delta” that contains the remaining key (this is where the partial key from the pkB-tree comes in)
3. and a pointer to the data record (row id)

This technique is rather common in many DBMS, usually attached to a feature called “index compression” 

HANA uses this for columns of string, binary string or decimal types.
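
Here is a small sketch of the prefix idea for a single leaf node. It is a simplification for understanding and does not reflect the real CPB+-tree internals: the function names, the dictionary layout and the sample keys are assumptions, and a real index would manage prefixes across many nodes.

```python
import os

def build_leaf(keys_with_row_ids):
    """Store the common prefix once; each entry keeps only its delta and the row id."""
    entries = sorted(keys_with_row_ids)
    prefix = os.path.commonprefix([key for key, _ in entries])
    return {
        "prefix": prefix,
        "entries": [(key[len(prefix):], row_id) for key, row_id in entries],
    }

def find(leaf, key):
    """Compare against the prefix first, then search only the short deltas."""
    if not key.startswith(leaf["prefix"]):
        return None
    delta = key[len(leaf["prefix"]):]
    for entry_delta, row_id in leaf["entries"]:
        if entry_delta == delta:
            return row_id
    return None

leaf = build_leaf([
    ("MATERIAL_0107", 30),
    ("MATERIAL_0001", 10),
    ("MATERIAL_0002", 20),
])
print(leaf["prefix"])
print(leaf["entries"])
print(find(leaf, "MATERIAL_0002"))
```

The long shared part of the keys is stored only once, and each entry carries just the few characters that differ, which is why this works so well for character-based keys.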

We will be publishing follow-up blogs on this topic, so stay tuned and let us know if anything needs to be added here.





