摘要
像C語言或者C++語言這些編程會失去數據的完整性 因為編程過程中鍵入了不安全的數據處理,它會使得對程序至關重要的數據內存崩潰或者帶來錯誤的程序執行。即使是近期的內存保護技術,例如數據分割、分頁、保護秘鑰和模擬分割都無法確保數據的完整性。我們引進了一種新的方式叫做“關鍵內存”,用來保護程序中用來進行修正的關鍵數據。“關鍵內存”被運用到程序中,阻礙不關鍵數據運行導致的關鍵數據的內存崩潰。這種內存模型會在應用層上進行工作,并且能夠幫助內存管理部件定位和保護程序中的關鍵數據,以此來修正程序的執行。為了能夠在軟件中運用“關鍵內存”,我們做出了“武士”系統。“武士”是一個使得“關鍵內存”能在軟件上操作的運行式系統,它的操作基本原則是復制并且前向糾錯,以此來保證使用關鍵數據時的數據完整性。在應用層上使用“武士”不會影響程序的錯誤,因為它只會對關鍵數據進行操作,并不會影響其他非關鍵數據的運行,還有就是“武士”跟系統中的其他程序庫是兼容的。我們將會對“武士”的使用表現進行評價,觀察它在應用和程序庫上的表現,并且將它與其他的應用層內存保護技術做對比。
ABSTRACT
Programming languages like C and C++ looses data integrity due to type unsafe nature of handling data which leads to corruption of data in memory critical to the program and results in incorrect program execution. Even recent memory protection techniques like segmentation, paging, protection keys and simulated segmentation flunks to assure data integrity. We introduce a new method called Critical memory to protect critical data in program necessary for correct program execution. Critical memory is implemented in software, which barricades memory corruption in such way that operations on non critical data will not corrupt critical data. It is a memory model that works at application layer and assists memory management unit to locate and protect critical data in program which leads to correct program execution. For implementing critical memory in software we are presenting Samurai. Samurai is a run time system that implements critical memory in software. Basic principle of operation of samurai is replication and forward error correction to assure data integrity using critical memory. Implementation of samurai in application layer will not step down system performance because Samurai operates only on critical data without interfering non critical data operations. Compatibility of samurai is not affected by outside libraries in system. We will appraise performance of Samurai using applications, libraries and by comparing it with other application layer memory protection techniques (Die-hard).
I. INTRODUCTION
Memory allocation and memory protection is a fundamental aspect of a computing system. Since 1960 this aspect is swaying between solved or insoluble one. For correct execution of a program, memory must hold correct value of data which is critical for that program. In many standard programming languages like (C, C++) programmers are allowed to assign a block of memory to a program. This particular block is supposed to hold particular data type value which is declared in program. Any attempt to assign different data type value to this memory block results in memory data corruption followed by program execution error. For example in type unsafe language C it is possible for an integer to be viewed as a function pointer, which can be then jumped to and executed causing errors such as segmentation faults . Even though language semantics explicitly allows for these violations, language never defines what should happen when for example, a
floating point value is treated as a pointer. We can categorize such memory errors into following four categories:
1) Buffer Overflow 2) Heap Metadata Overwrites 3) Dangling Pointers 4) Uninitialized Read
Out of these, errors due to Buffer overflow, Heap Metadata overwrites and dangling pointers causes memory corruption. To overcome these errors we devise a new memory model, Critical Memorythat forbids whimsical memory writes from corrupting critical data. Critical memory is separate block of memory assigned to store critical data. Programmer decides which data is critical while writing a program. Memory read and memory write operations are separate for critical memory. Local reasoning about safety from memory corruptions is enabled by differentiating critical data.
Critical memory is implemented in runtime system by using an object based technique called Samurai. Samurai is an object oriented method of implementing critical memory. Samurai assures safety for critical data from corruption but this guarantee is relative to the size of critical data used by program. Samurai tries to implement substance of ideal critical memory by using replication and forward error correction method. There is another technique called Die-Hard which uses randomization and replication approach for achieving memory safety. But Die-Hard needs very large memory which is double of what is required without it. Due to large memory requirement Die-Hard steps down system performance.
We are also evaluating performance of Samurai using applications like a ray shading application , 4 SPECINT2000 Benchmarks , a multithreaded web server (written in C++) and in libraries like STL list class. Results will show how performance of Samurai will vary relative to amount of critical data present in a system. Our results will also show how Samurai will recover critical data successfully. Major advantage for using Samurai is it does not step down system performance and it's compatibility with third party libraries in program.
II. Critical Memory
Critical memory is a memory model which is implemented in software. Programmers using critical memory are allowed to locate and protect data which is critical for correct execution of program. Critical memory extends normal load/store instruction set to next level by introducing critical load and critical store. Instructions to load and store critical memory, are separate than non critical load and store. But critical store instruction is assigned higher privilege to write non critical memory too. Normal load and store instructions are not allowed to access critical memory.
A: critical_store 100, (&balance);
B: store 1000(&balance)
C: load r1, (&balance)
D: critical_load r2, (&balance)
critical_store 100, (&balance); critical_load r2,(&balance)
Store 1000(&balance) Citical_load returns 100
Loadr1,(&balance)
Load returns 1000
Normal Memory
&balance
A B C D
Critical load returns value stored by earliest critical store occurrences of normal store do not affect value read by critical store hence critical data is not written so it is secured from normal non critical memory writes.
As shown in Fig .1 consider a programming sequence is applied to study critical memory operation. When Critical memory is exactly assumed to be parallel to normal memory and a processor can perform both critical and normal memory write and reads at once. On the advent of 1st critical store both normal and critical memory locations having address ‘&balance' are written with value 100, next normal store will load value 1000 into normal memory location so critical memory will retain its old value to make it available for further reading. Next instruction in sequence which is normal load will load value 1000 from normal memory and critical load followed by normal load will load last stored data in critical memory by critical store. This shows critical value 100 is not corrupted by normal load store operations. Though operations appear to be simple and accurate, there is a chance of error if normal and critical load store operations execute out of synchronization. To avoid this critical load and store will detect difference between normal and critical memory values and it will generate trap if behavior is desired otherwise that instruction will cause memory exception.
To implement critical memory object oriented environment critical objects are declared. A critical object will share memory from both normal and critical memory and it will provide operations like normal load/store and critical load store which will help to address either critical or normal memory shared by critical object. Fig 2 shows how critical object is declared.
Critical memory with its critical operations not only acts as memory but it also acts as a recovery system which will provide checkpoints to debug a system. In case of exception or error critical memory can restart system from a point where exception was occurred.
For proficient use of critical memory with sustained system performance selection of critical data is very important aspect. We are implementing critical memory in software so if we implement all program data as critical then it will have performance impact on system. So data which will be most useful in restoring a system after exception occurs should be declared as critical. Apart from this to run a program, system uses library which holds standard functions, tables, data structures and constants which are used by all programs. Such libraries should be declared as critical because without that program will never restore its state.
As we said earlier to implement critical memory we are using a run time application ‘Samurai'. For interfacing this critical memory with runtime application certain APIs (Application Program Interface) need to be provided. API's are nothing but operations on critical memory mapped in the form of standard functions which can read, write, allocate and de-allocate memory. APIs are available in programmer model. Samurai uses these API's to perform operations on critical memory.
III. Samurai Runtime System
Samurai runtime system increases program reliability that implements critical memory using replication and error correction. This is possible as Samurai uses DieHard as its memory manager. DieHard allows the critical data to perform critical load and critical store operations which makes the allocation and deallocation of memory efficient. For each critical data object, two copies are maintained by samurai. These copies have the same critical contents as the object has and are known as shadows. At the time of critical store, shadows are updated instead of the original data. However, at the tie of critical load, the critical data object is loaded into the memory. Before loading the data object, it is compared with its shadows to find any inconsistency. If any inconsistency is found, majority voting method is performed to perform updates. In majority voting method, consistency of data is checked between the shadows as they are updated at the time of critical store. If both the shadows are same and critical data object differs, that means, the data object needs to be updated with the contents of shadow. The shadows and critical data object, all these use same address space. Therefore, if any program error occurs, the shadows might get corrupted.
A. Heap Organization
Heap organization is important in Samurai as we have got two shadows and one critical data object which are to be compared at every update and load operations. In case of Samurai, the object metadata is stored on top of the heap and in order to access the object data, we use Big Bag of Pages (BIBOP) mechanism that is fast enough to access the main object as well as the shadow copies. Therefore, the heap is organized in BIBOP form. In BIBOP heap organization, the heap is divided into contiguous regions, storing objects of same type in one region. As each reason contains similar objects, the size of those objects is rounded of to the nearest power of two. By this similarity condition, we can mask any pointer pointing to the object easily, to the pointer pointing to the base of the region. This special pointer is known as the base pointer that helps in determining the size of the objects present in the region. To determine the size of the objects in a region, a look-up table is created that also helps in determining offsets of pointer from base to the object in a region and offset of the pointer from object to the base of the region. The look-up table gives us a functionality to determine all these characteristics of the region and the pointer to an object. However, in order to actually compute this data, we require a getBase function that returns all the pertinent values.
B. Metadata
Data about data is known as metadata. However, in case of a critical data object, that has been allocated memory, the memory manager in Samurai, creates two mirrors on heap, which are also known as shadows. As the shadow object use the same address space, they are used as metadata and the read pointer to the critical data object points to its shadows too. Also, we know that the original data object, critical in nature is loaded into the memory, the compare or the update pointer also points to the critical data object so that the process of majority voting can be executed easily. The fields of metadata vary according to the purpose. In case of valid tags, also known as special flags, two bytes of memory are allocated. The checksum and shadow pointers are allocated a given memory space between one to four bytes.
C. Checking for valid objects
Critical objects should only access the valid objects. To make this statement true, valid objects in the metadata have a specific bit pattern. Hence while loading a data object, it is the critical load operation that checks for this bit pattern so that the critical data only accesses the valid objects. In case of corruption of the pattern of bits, the critical load operation aborts the operation. As the objects are stored in rounded off fashion, Samurai makes it sure that critical store operation stores the data up to the actual size of the object not to the one after rounding up the size of the object to nearest power of two. By monitoring these conditions, corruption of adjacent objects, be it critical or non critical, is kind of impossible because even in the worst case scenario, the critical store operation will not step over the memory allocated the object. Though, the objects in the heap are protected with individual checksum values, if in case the objects get corrupted, for safety purpose, an entire copy of heap is stored in the virtual memory that is not accessible by the critical store operation. In case of corruption, this extra heap an be used for the restoration purpose.
Similarly, updating the shadow objects is the responsibility of critical store and critical load operations. The pointer pointing to each shadow object is retrieved from the metadata by invoking the getShadowAddresses function. This return the address of the pointer and the critical store and critical load operations can virtually link up to update or compare the shadow objects. Explicitly, the memory allocation and de-allocation routines are run to allocate or de-allocate memory from the object and its shadows. These routines also map the memory addresses of the object and the shadows in the hash table and within object's metadata.
D. Optimization Techniques
As per the Die-Hard optimization by Microsoft, we take two cases to optimize Samurai. In our attempt to optimize Samurai, the first observation was made on the critical data object and the shadow object. It is a general phenomenon that, while performing a critical load, it is important to perform a compare operation between the critical data object and one of the shadow image. This helps in detecting an error while loading the data into memory. If in case there is a mismatch, the second shadow can be used to restore the data in the first shadow as well as the critical data. But in case, if the second shadow isn't used for a long time, it becomes corrupted and is of no use. In the mean time, if the critical data and the first shadow is corrupted doe to a program error, it is impossible for us to recover the data. Hence, this problem is solved by performing a periodic check over both the shadow objects. By this both the shadow images are of use and by performing a periodic check after hefty amount of transactions, the cost of the check can be optimized. However, if a critical data object looks up for the same object in the look table, the periodic check is delayed as the critical load function is not going to change any object values.
As, the objects need to look up for the data, the second optimization technique implements a software cache. Over here, the size of cache is important because larger sized cache increase the risk of cache misses and reduces the cache hit percentage. Hence the best way is to implement a single element cache that improves hit ratio significantly. However, there are a few things which are to taken under consideration while implementing a software cache. For example if both the shadow objects have been corrupted in a same way, Samurai does not has any way to correct that. Also, Samurai does not has a capability to handle the corruption of entire heap which is of large scale and happens in a fraction of time. Similarly if the look-up table is corrupted, Samurai cannot detect this kind of corruption. Hence for both of these problems that can be encountered while implementing a software cache, there is a solution. If the shadows are randomly distributed on the heap, it mitigates the probability of similar error or corruption at both the shadow objects. Secondly, the look table is referred only in case of allocating and de-allocating the memory locations, a page level mechanism mitigates the possibility of corruption. Another pre-requisite while implementing Samurai is to make sure that the program accessing critical data objects and using Samurai, is free of race conditions and accessing the critical memory location is mutually exclusive among threads in a multi—threaded environment. The reason behind this pre requisite is that, while performing a critical store, memory is updated many times with critical data by Samurai and is at the same time another process or another thread performs a memory update at the same location, the contents get corrupted. However, in case of inconsistency, Samurai globally locks all the memory locations accessed by it to repair the inconsistency. Therefore, critical loads do not create much problem even if the memory access is not mutually exclusive. To explain this pre requisite, let us compare the critical store and load operations with a famous reader/writer's problem in operating system. In the problem we have a file which is a shared resource and we have also got readers, who are waiting in a queue to read the contents of that file and there are writers, waiting in another queue to write there observations to the file. At a time only one writer can write to avoid the anomaly or inconsistency of the shared file. However, at a given time, multiple readers can read the shared file provided no writer is writing on to the file. Therefore, it becomes apparent that no thread should perform a critical store if any of the thread is performing a critical load operation and at a given time, only one thread can perform a critical store operation.
IV. Performance Evaluation
While evaluating the performance of Samurai, we overcame the biggest challenge of not having a good compiler which was capable of determining the store and load operations critical once critical data is declared. For the same we checked for the data at runtime. If the data was critical, the corresponding load and store operations were also declared as critical. By doing this, the system dynamically determines the critical data at runtime, also it invokes the appropriate function of Samurai for the required operation. Therefore, after overcoming the problem of determining the critical data, we analyzed a few fault injection experiments performed on Samurai to evaluate Samurai's fault resilience. It was found that, to evaluate he performance of Samurai, two paths can be followed. The first path of performance evaluation focuses on the corruption of the critical data. In this case a critical data object s randomly selected. Within this critical data object, we take select an offset randomly. Also, we take a set of bytes randomly, can be termed as random numbers. These random numbers are then replaced with the offset selected. We need to make it sure that the random numbers selected should be equal in length to the offset selected. This turns out to be a perfect corruption of a single critical data object. When Samurai processes the critical data, it automatically replaces the corrupted values of the critical data object with one of the shadow objects. The second path focuses on corruption of non critical data. In this case Samurai is tested if any non critical data tries to corrupt a critical data object. Therefore, an emphasis is given to data errors, control errors and pointer errors. Data error, refers to the error made by a critical store operation if an incorrect value is written. However, if the program follows an incorrect path, it leads to a control error and a pointer error is the one when a critical store operation writes data to an incorrect address. To insert these faults into the system, we need to have all the details such as the value about the program counter while the critical store operation is being performed, the address of the location where the critical store is being made and the data that is being stored. All of these contents can be found using a trace mechanism termed as The Golden Trace. Once we are set wit all the details, we inject the faults into a non critical data object and run the program. After running one instance, we perform a golden trace again to check the mismatches. Mismatch can be found at various places. A mismatch in the data occurs because of the faults injected into the program and refers to a data error. A deviated value of Program Counter refers to a program error and pointer error is indicated by the critical store being performed at the wrong address.
A. Results of Performance Evaluation
Several SPEC benchmarks discussed earlier have been used to measure the performance overhead, fault tolerance. The performance measurement has been done with a Standard Template Library (STL) list structure in a multi-threaded web server. The benchmarks have been used with and without being modified to use the critical memory. The overheads for measurement were normalized to 1 and lower values of it are better. For vpr, crafty, parser and gzip the four SPEC benchmarks the ref input were used to measure the slowdown whereas in case of rayshade a coin being flipped was used.
Except for gzip all the other benchmarks has a performance overhead of less than 10% (See Fig. 3) and only for gzip the value did overshoot to 2.7 times the normalized value. This abnormal increase in the value of gzip has been accounted due to relatively high fraction of critical loads and stores in comparison to other benchmarks. This phenomenon can also be explained by the fact that in Huffman decompression the table consists of critical data and all the access to this table is on the critical path.
Two situations have been analyzed for performance measurement with Samurai, one by injecting fault into the critical data and the other by non-critical data. In injecting fault into critical data, one fault was injected into the heap every N memory accesses where N was the period of the injection. For each value of N, the application was executed 10 times and the results were classified into success or failure. The fault period was varied from 100,000 to 1,000,000 in increments of 100,000. Also faults have been inserted in the critical data both with and without Samurai. Statistics of only one benchmark vpr has been shown for fault projection without Samurai (See Fig. 4) and with Samurai protection (See Fig. 5). Clearly the results show that the application with Samurai protection has very less fault rates.
Now we provide results of injecting faults into non-critical data with all the five benchmark applications. The results of the injections (See Fig. 6) are classified as data, control or pointer errors which will be explained later. The numbers in the table that are not in parentheses indicate how many of the trials produced a result and the those that are in parentheses are which resulted in a failure. It also measures the propagation of faults from non-critical data to critical data and resulting in a data, control or pointer error is relatively small which is almost less than 2%.
The number of data erorrs may be high but the number of failures is less due to the fact that address allocated using normal malloc will be stored in the critical data, and corrupting the value of a random strore will increase the likelihood of corrupting the data structure used in the standard allocator. Thus the corrupted structure will result in an erroneous address being returned by the malloc which is different from the address in the correct execution. Although this different address does not result in overall error for the application as it does not impact the correctness of the data, but it may propagate to the critical data.
A. Reliable Memory Allocator
Memory allocators usually store metadata about the allocated data on the heap. Though the metadata is open for corruption by the application and therefore there lies a chance of fatal security vulnerabilities. An important undertaking in the in the design of a fool proof memory allocator is the separation between the application data and the allocator's metadata. This problem poses a challenge because metadata is often stored by the memory allocators in a contiguous format. The current approach uses virtual memory protection to protect the pages around the metadata. The approaches mentioned above do incur a lot of re-modification of the allocator's code and may not be a practical solution to implement for all the memory allocators. Above all the Heap Server approach can be very expensive and as it needs bypassing the process boundary via a system call or through any inter process communication.
In Samurai a reliable memory allocator is built for higher security and to protect the allocator metadata. Memory allocators use some sort of an interface to allocate and deallocate huge chunks of memory, which is used for both the code segment and data segment as well as the allocator metadata. This large segment of memory is replaced by Samurai. Therefore by only applying changes to the to the allocator clients the required results can be achieved. The only one overhead of using Samurai at this point is that when the allocator references its metadata it must use critical load and critical store operation. At this point of time the allocator client performs normal load and store operation on the heap data even though this data has not been allocated on the Samurai heap. Whatsoever the situation may be there may be some discerete incidents of inconsistency in the allocator metadata, then there needs to be repair strategy of Samurai. On an alternate basis the repair strategy of Samurai can be completely avoided and simply an exception has to be raised when a corrupt metadata is detected. This approach can be verified by modigying the HeapLayers paackage by Samurai. Four packages have been used to implement the layers.
StrictSegHeap. It provides a demarcation of the object on the basis of its size and each object is rounded to the nearest power of two and stored in its corresponding location.
SizeHeap. It stores the object size with the objects for faster lookup.
FreeListHeap. It maintains all the objects in a link list and recycles it from this list for faster allocation.
SbrkHeap. This heap has the flexibility of expanding and contracting the Kingsley heap by allocating and deallocating the pages from the operating system. With the tests run by replacing the allocator with Samurai and by taking a mean of three runs with minimal variance, the overhead of using the reliable allocator was found to be 21%.
V. Modifications
In order to make Samurai work in a multi-threaded environment, there are some modifications need to be made on the actual concept of Samurai. Firstly, we know that, in order to repair an inconsistency, Samurai acquires a global lock on the entire set of critical memory. However, while updating the lookup table, we require a global lock too in a multi-threaded environment as updating the lookup table maintains the mapping between the critical data object and its shadow objects. As Samurai is based on the underlying technology of DieHard, no locking mechanism is needed while allocating or deallocating the memory as the DieHard memory allocator is designed for multithreaded environment. Secondly, we always assume on fact that the program trying to access Samurai is free of any race condition. While performing the critical store operation, the assumption made is that the critical data within the program is always updated using a single thread or the program is synchronized in a way that more than one thread do not get access to perform a memory based operation at a given time. In case of critical load operation, there are no modifications needed as the critical data objects or its shadow objects are not modified. Thirdly, in case of repair, when an inconsistency is found by Samurai, the repair routine needs a modification to acquire the lock on all the critical memory locations before performing any kind of operations. As the repair routine emphasizes on the repair of shadow images and recovering of the critical data, it might also need access to the lookup table. Hence all the pertinent data is required to be locked before any such operation is performed.
VI. Conclusion
This report introduced a critical memory model that protects data from arbitrary changes from any local interference. The critical memory mentioned here can be used in variety of purposes as it enables local reasoning and consistency of memory. The semantics of the critical memory is described here and various performance analysis reports have been presented. It provides a probabilistic memory safety to type unsafe languages like C and C++. Samurai implements its technology by using replication layered above a robust runtime system. Samurai can be used to protect key data inside an application. Samurai has been demonstrated here by using five benchmarks from SPEC and a Standard Template Library(STL) with a memory allocator. It allowed selective protection of application data. It can also be combined with Software Transaction Management (STM) for detecting concurrency violations. Finally it is shown that Samurai decreases the number of faults in an application by 10% or less to a factor of 2.7 times in worst case.