Hello Consultants, in the last blog we looked at the different table types and their impact.
Let's dive deeper and understand what the row store is and how it works.
In the row store, records are kept in main memory in the same form in which they are inserted into the table, i.e. the data is stored in a form resembling the logical table structure. Each record is saved in memory as one concatenated chunk containing the values of all its columns.
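To make the idea of "one concatenated chunk per record" concrete, here is a minimal Python sketch. The table, the column names and the fixed-width layout are all assumptions made up for illustration; this is not how HANA actually encodes its pages.

```python
import struct

# Hypothetical example table: (employee_id, name, city).
rows = [
    (1, "Asha", "Pune"),
    (2, "Ben", "Berlin"),
    (3, "Chen", "Pune"),
]

# Assumed fixed-width record layout, chosen only for illustration:
# 4-byte integer, 8-byte name, 8-byte city.
RECORD = struct.Struct("<i8s8s")


def to_row_store(rows):
    """Concatenate the column values of each record into one chunk per row."""
    return b"".join(
        RECORD.pack(emp_id, name.encode(), city.encode())
        for emp_id, name, city in rows
    )


def read_record(buffer, index):
    """Fetch one complete record with a single contiguous read."""
    emp_id, name, city = RECORD.unpack_from(buffer, index * RECORD.size)
    return emp_id, name.rstrip(b"\0").decode(), city.rstrip(b"\0").decode()


row_store = to_row_store(rows)
second_record = read_record(row_store, 1)
```

Because all values of a record sit next to each other, reading a whole record is a single lookup at `index * RECORD.size`.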
Properties in HANA :-
Advantages :-
1. Direct mapping of the logical table layout, and of the operations performed on it, to the actual data manipulation that happens in memory, which makes it easy for developers and administrators to understand.
2. When records are mostly accessed with all their columns, and mass data processing and analysis do not play a role, row store tables can show better performance than column store tables.
Disadvantages :-
1. The DBMS cannot directly access a specific column of a table; whole data pages need to be transferred.
2. Structuring the data representation by row is not very effective for many types of operations. Every value is stored again for each occurrence of that value within the table.
3. Even with normalized data models, the repetition of data, especially for very common values, cannot be prevented because foreign key references need to be stored. On top of that, these references need to be resolved during processing by joins, which require high computational power.
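The first two disadvantages can be sketched in a few lines of Python. The table and the helper names below are hypothetical, and counting "values touched" is only a simplified stand-in for the page transfers a real DBMS performs.

```python
from collections import Counter

# Hypothetical table; each tuple stands for one record stored contiguously.
rows = [
    (1, "Asha", "Pune"),
    (2, "Ben", "Berlin"),
    (3, "Chen", "Pune"),
    (4, "Dia", "Pune"),
]


def scan_column_row_layout(rows, col):
    """Return the values of one column and the number of values touched.

    In a row layout the whole record is fetched even though only one
    column is needed, so every value of every record is touched.
    """
    touched = 0
    result = []
    for record in rows:
        touched += len(record)
        result.append(record[col])
    return result, touched


def stored_occurrences(rows, col):
    """Count how many times each value of a column is physically stored."""
    return Counter(record[col] for record in rows)


cities, touched = scan_column_row_layout(rows, 2)
occurrences = stored_occurrences(rows, 2)
```

Here only 4 city values are wanted, but all 12 values of the table are touched, and "Pune" is stored three separate times.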
Note :- Unlike column store tables, row store tables are always held entirely in main memory.
Limitations of HANA Row Store
1. Row store tables cannot be partitioned, which limits the possible total size of all row store tables to the memory available on the single server on which the tables are located.
What does this mean? Suppose you have 2 servers with 512 GB of memory each, and a table of 1024 GB. In the row store you cannot store it: you would need a single server with 1024 GB of memory.
In the column store: you can partition the table and distribute it across the two servers.
2. No compression is offered by HANA for row store tables.
3. Columns in the row store cannot be accessed independently and in parallel.
For example, take a table with the columns Name, City, Mobile No, Employee ID, etc. You cannot access only Name and City and expect fast processing in HANA. It does not work that way.
But this does not mean that the row store cannot be processed in parallel. In fact, many operations such as sorting, grouping, index creation and window function processing can be heavily parallelized.
4. Row store tables cannot be displaced from memory. They must always be in memory while the system is up and running. Therefore the tables are automatically loaded into memory during system startup. Obviously, this increases the startup time.
5. If the row store tables cannot be fully loaded into memory, the system cannot start.
6. In most SAP HANA information models, row store tables cannot be used directly as a data source.
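The sizing example from the first limitation (two servers with 512 GB each, one table of 1024 GB) can be written down as a small Python check. The function and host names are hypothetical, and the sketch deliberately ignores the memory that a real system needs for everything other than the table itself.

```python
def hosts_for_row_store(table_gb, server_memory_gb):
    """A row store table cannot be partitioned, so it must fit on one host.

    Returns the hosts whose memory alone is large enough for the table.
    """
    return [host for host, mem in server_memory_gb.items() if mem >= table_gb]


def fits_column_store(table_gb, server_memory_gb):
    """A partitioned column store table can be spread across all hosts."""
    return sum(server_memory_gb.values()) >= table_gb


# Assumed landscape from the example above: two servers with 512 GB each.
servers = {"host1": 512, "host2": 512}

row_store_hosts = hosts_for_row_store(1024, servers)
column_store_ok = fits_column_store(1024, servers)
```

With these numbers no single host qualifies for the row store table, while the partitioned column store table fits across both servers.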
For an SAP NW system running on SAP HANA, it is predefined which tables shall be row store tables. Upon installation or migration of an SAP NW system on the SAP HANA database, the correct assignment is performed automatically.
If you want to check all the tables that are stored in the row store :-
select * from M_RS_TABLES
select * from M_RS_TABLES where HOST='<worker node>'
(if checking on a particular host)
Two important aspects of HANA :-
1. Multi version concurrency control : Lock-free data access and manipulation while maintaining transactional consistency.
2. Indexes : A technique for optimizing data access.
Diving deep into both of these aspects
Multi version Concurrency control
MVCC is a well-known technique that allows multiple sessions parallel access to the same bits of information, even when one or more sessions are actively changing this information.
This is achieved by keeping copies of the original version of the record and presenting each session with the version appropriate to the sequence of system changes (COMMITs) that the session has been exposed to.
For developers and administrators this happens automatically, and no additional care or precaution is needed. However, it is implemented in different ways in the row store and the column store, which brings different challenges for the administrator.
In the row store, each changed page is first copied and placed into a chain of page versions, with each version reflecting the state of the data at a specific commit point. These page chains are stored in virtual container structures called undo cleanup files, which can be monitored in M_UNDO_CLEANUP_FILES. This is generally not a concern for the administrator, as it is managed by the garbage collector. The noteworthy point is that clearing these versions won't immediately result in free, usable memory.
The garbage collector can only remove those old versions for which the transaction is completed (either committed or rolled back).
One known issue :- If a transaction modifies tens of thousands of records without committing them, we end up in a situation in which a large amount of redundant row store data needs to be kept in main memory, as there will be tens of thousands of record locks and new active record versions kept in the database.
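The behaviour described above can be illustrated with a toy version chain in Python. This is only a sketch under simplifying assumptions: it tracks versions per record instead of per page, it models commits but not rollbacks, and the class and method names are invented for illustration. It is not HANA's implementation.

```python
class VersionedRecord:
    """Toy version chain for a single record."""

    def __init__(self, value):
        # Each version is (value, id of the writing transaction).
        # Transaction 0 stands for the already committed initial load.
        self.versions = [(value, 0)]

    def write(self, value, tx_id):
        """Append a new version instead of overwriting the old one."""
        self.versions.append((value, tx_id))

    def read(self, committed, own_tx=None):
        """Return the newest version that is visible to the reader."""
        for value, tx_id in reversed(self.versions):
            if tx_id in committed or tx_id == own_tx:
                return value
        raise LookupError("no visible version")

    def garbage_collect(self, committed):
        """Drop old versions, but only once every writer has committed."""
        if all(tx_id in committed for _, tx_id in self.versions):
            removed = len(self.versions) - 1
            self.versions = self.versions[-1:]
            return removed
        return 0


committed = {0}
record = VersionedRecord("Pune")
record.write("Berlin", tx_id=7)                 # transaction 7 is still open

seen_by_others = record.read(committed)         # other sessions see the old value
seen_by_writer = record.read(committed, own_tx=7)
removed_while_open = record.garbage_collect(committed)

committed.add(7)                                # transaction 7 commits
removed_after_commit = record.garbage_collect(committed)
```

As long as transaction 7 stays open, both versions have to be kept and nothing can be cleaned up. Multiply this by tens of thousands of uncommitted changes and you get the memory situation described in the known issue above.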
As in other DBMSs, HANA also offers the concept of indexes.