Đăng ký Đăng nhập
Trang chủ Airline reservation system using concurrency control computer science ...

Tài liệu Airline reservation system using concurrency control computer science

.PDF
67
1
106

Mô tả:

Airline Reservation System using Concurrency Control Graduation Thesis Submitted to Hanoi University for the degree of Bachelor of Computer Science THƯVIỆNĐẠI HỌC HANỘI HANOI UN IVERSITY LIB R A R Y M;. ! 2 I lieu OÛOO _ By Ta Thỉ Hong (Computer Science) December,2009 Abstract Doing business on the Internet can be considered as a crucial source o f profit. When transactions are performed online, the possibility o f concurrent, competing transactions is increased. Thus, how to ensure such events do not violate the data integrity o f a database system is one important question that needs to be solved when designing a commercial system. Concurrency control is known as one key mechanism to deal with this problem. Knowing its importance, this project aims to provide a sufficient knowledge base o f concurrency control and related issues. Since it is impossible to cover all aspects o f concurrency control as well as its mechanisms, this paper focuses on investigating four transaction isolation levels: read uncommitted, read committed, repeatable read and serializable by building an airline reservation system for demonstration. Although two months is a short time, this project ends with several surprising findings about highest level o f transaction isolation - serializable 一 in the MySQL open source DBMS . Acknow ledgm ents I would like to thank my dear Instructor Mrs. Marry Gallagher for her guidance and enthusiasm. Without her detailed suggestions and encouragement, I never would have taken on this project. I also thank Faculty o f Information Technology in Hanoi University that gives me a great chance to undertake this project. Finally, I am grateful to my family, friends for their supports and encouragement. Ta Thi Hong December 2009 Table of Contents Table o f Contents................................................................................................................. iv List o f fig u res...................................................................................................................... vi Chapter 1 Introduction............................................................................................................1 Objectives...........................................................................................................................1 Motivation..........................................................................................................................2 Methodology...................................................................................................................... 3 An Overview o f the Rest o f the Document....................................................................... 4 Chapter 2 Background Knowledge........................................................................................5 Concurrency control definition..........................................................................................5 Transactions and AC ID properties....................................................................................7 Schedules........................................................................................................................... 9 Serial schedules.................................................................................................................10 Serializable schedules....................................................................................................... 11 Transaction Isolation........................................................................................................ 12 READ UNCOM M ITTED............................................................................................ 13 READ C O M M IT T E D ................................................................................................. 14 REPEATABLE R E A D ................................................................................................ 14 S E R IA L IZ A B L E ..........................................................................................................15 Chapter 3 Airline reservation system...................................................................................20 Database design............................................................................................................... 20 Data objects and classes................................................................................................ 22 Reservation Process...................................................................................................... 25 User interface.................................................................................................................2ö Concurrency...................................................................................................................28 Chapter 4 Investigating isolation levels in airline reservation system............................. 33 Remarkable problems in building the demo system.....................................................33 Four isolation levels in airline reservation system........................................................34 Read Uncommitted isolation level............................................................................ 36 Read committed isolation level..................................................................................37 Repeatable read isolation level..................................................................................39 Serializable isolation level.........................................................................................41 Chapter 5 Summary and Conclusions................................................................................48 References.......................................................................................................................... 50 Appendix 1 Application Code........................................................................................... 51 PHP Code..................................................................................................................................... 51 V List of Figures Figure 1: Two transactions...................................................................................................10 Figure 2: Serial schedule in which T ị precedes Tl. ..............................................................11 Figure 3: T ị and 72 interleaves............................................................................................. 11 Figure 4: D irty read.............................................................................................................. 17 Figure 5: Non-repeatable read.............................................................................................18 Figure 6: Phantom reads...................................................................................................... 19 Figure 7: ER diagram........................................................................................................... 21 Figure 8: Flight table............................................................................................................ 21 Figure 9: Customer table.................................................................................................................. 21 Figure 10: Reservation table................................................................................................22 Figure 11: Table relationship...............................................................................................22 Figure 12: List o f available flights.......................................................................................27 Figure 13: Check availability...............................................................................................27 Figure 14: Personal information form..................................................................................27 Figure 15: Confirm ation.................................................................................................... 28 Figure 16: Flow chart......................................................................................................... 29 Figure 17: Check availability code....................................................................................30 Figure 18: Code handles submit action..............................................................................31 Figure 19: Code handles commit or rollback action......................................................... 32 Figure 20: Customer reservation........................................................................................35 Figure 21 : First test scenario Figure 22: Serializable tes scenario Chapter 1 Introduction Objectives In the age o f information systems, where everything is shared on the Internet, people all over the world are connected together via an international channel. Therefore, more and more businesses, services, social networks etc. are taking advantage o f this trend. Every day, there are hundreds o f millions o f people doing business on-line. When transactions are performed on-line, the chance that two or more people w ill want the same product at the same time is inevitable. It is a critical problem especially when it relates to money. In term o f database management systems (DBMS), there is a mechanism to deal with this issue: concurrcncy control. Since ¡I is important in designing a database system, the initial objective o f this paper is to provide a better understanding about concurrency control and its effect on a database management system. Building a simple airline reservation system as a demonstration system to investigate differences among isolation levels o f transactions is another main objective o f this project. M otivation Among many interesting topics about computer science, I decided to choose topic uairline reservation system using concurrency согПгоГ for my graduation thesis for the following reasons: First concurrency control o f transactions is a problem that must be considered in the design o f most commercial systems. The accuracy and consistency o f data always are two important properties in a database management system. However, when two or more transactions are performed simultaneously (especially when both try to update the same data at the same time), these two properties are more likely to be broken. With concurrency control, the occurance o f inconsistent data is lower. Unfortunately, in most undergraduate courses, concurrency control is not a subject covered in depth. A t university, there are many subjects about programming languages, database management system, data structure algorithm or information system etc. With this knowledge, a student can have an overall picture o f a system but this is not sufficient. In case o f a real commercial system, there are many issues that need to be concerned and concurrency control is one o f them. Thus, learning concurrency control o f transactions would help an undergraduate student have a better knowledge o f database management systems. Lastly but most importantly, working in database field always brings me lot o f excitement. Despite o f its hardness, database management system is an interesting subject. In addition, since concurrency control is an important technique in database technologies, seizing the essence o f this matter is a competitive advantage for me in future jobs. 2 Methodology To carry out this project, both researching methods and building an application are used. Before creating anv demo system, I need to really to understand the concept o f concurrency control. Therefore, I have been reading related books and, finding information in the Internet. After that, I have to analyze the information, choose one suitable algorithm for this project. This part would help me have a sufficient background o f concurrency control. To get a real understanding o f this issue, I decided tobuild an airline reservation system in which I would try to apply the concurrencycontrol theory. Here is the environment for this demo system. - OS environment: Windows - Database server: MySQL version 5.1 or higher - Apache Web Server: Xampp version 1.7.2 - Web browsers: Firefox, Chrome, Internet Explorer (two browsers are required) - Programming language: PHP By building a demo system, the problem w ill be visualized, easy to understand and it is the best way to demonstrate the effect o f concurrency control on a database management system. 3 An Overview of the Rest of the Document This paper includes the following main parts. Chapter 2: Background Knowledge This chapter provides a summary o f knowledge needed for this project. Definitions o f concurrency control and related terms are presented. Examples and related issues are discussed to ensure that readers could go through the content o f this paper without difficulties. Chapter 3: Airline reservation system This chapter describes the design as well as the structure o f the airline reservation system. A ll classes and their functions are explained to provide better understanding o f system implementation. Chapter 4: Investigating isolation levels in airline reservation system This chapter describes the process o f applying transaction isolation levels in the demo system. Test results are discussed to provide reader a clearer view o f differences among four levels o f transaction isolation levels. This chapter ends with a summary o f findings. 4 Chapter 2 Background Knowledge Concurrency control definition Most Database Management Systems (DBMS) run in a multiuser environment, with the expectation that users w ill be able to share the data contained in the database. In applications like Web services, banking, or airline reservations, hundreds o f operations per second may be performed on the database. I f users are only reading data, no data integrity problems w ill be encountered, because no changes w ill be made in the database. However, i f one or more users are updating data, then potential problems with maintaining data integrity arise. When more than one transaction is being processed in a database at the same time, the transactions are considered to he concurrent, interactions among concurrently executing transactions can cause the database state to become inconsistent, even when the transactions individually preserve correctness o f the state and there is no system failure. In a database management system, concurrency control is a mechanism that “ensures that concurrent transactions are performed correctly without violating the data integrity o f the database” (Wikipedia 2009b). A full definition o f concurrency control is “The process o f managing simulìaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multiuser enviroriment ’’ (Hoffer, Prescott & McFadden 2004, p. 471). 5 There are two basic mechanisms used for concurrency control. They are optimistic and pessimistic. Optimistic: This mechanism assumes that although conflicts are possible, they w ill be very unlikely. Therefore, it is not necessary to lock1 every record every time it is used. The system w ill allow multiple users to simultaneously access a shared database and perform their transactions. However, i f two users actually do try to update the same record at the same time, then one user’s updates are aborted. Optimistic concurrency control is called “ optimistic,,because the system assumes the best case - it assumes that there would not exist two users who want to update the same record at the same time. Pessimistic: This mechanism is also known as 'locking". Locks allow multiple users to safely share a database as long as all users are updating different data at the same time. When locks are used, it is impossible for two users to update a row at the same time. As soon as one user gets a lock, no one else can process that row. This is a safe, conceptually simple approach. The disadvantage is that it requires overhead for every operation, whether or not two or more users are actually trying to access the same record. Furthermore, every time that a user tries to access a row, the system must also check whether the requested row(s) are already locked by another user or not. Pessimistic concurrency control is called "pessimistic" because the system assumes the worst — it assumes that two users w ill usually want to update the same record at the same time, and then prevents that possibility by locking records, no matter how unlikely conflicts actually are (IB M 2009). 1 A lo c k is a mechanism fo r lim itin g other users’ access to a piece o f data. When one user has a lock on a record,the lock prevents other users from changing (and in some cases reading) that record. 6 When you use optimistic locking, you do not find out there is a conflict until just before you write the updated data. In pessimistic locking, you find out there is a conflict as soon as you try to read the data. In the SQL Guide on the IBM website,the writer has an interesting example o f optimistic and pessimistic locking. He considers a bank as a database system and bank’ s customers as system’ s users. In this case, he calls pessimistic concurrency control like "having a guard at the bank door who checks your account number when you tiy to enter. I f someone else is already in the bank accessing your account, then you cannot enter until that other person finishes her transactiori and leaves” (IBM 2009). Optimistic concurrency control, on the other hand, "allows you to walk into the bank at any time arid try to do your business, but at the risk that as you are walking out the door the bank guard will tell you that your transaction conflicted with someone else’s and yo u ’ll have to go back and do the transaction again” (IBM 2009). Optimistic and pessimistic concurrency control differ in another important way besides the time at which conflicts are detected and error messages are issued. The pessimistic mechanism allows one user to not only block another user from updating the same record, but even from reading that record. With the optimistic mechanism, however, we do not check for conflicts except at the time that we write updated data to the database. Transactions and ACID properties To have a clear understanding o f concurrency control, we first need to understand transactions. Informally, a transaction is a collection o f one or more operations on the database that must be executed atomically; that is either all operations are performed or 7 none are. When processing a transaction the DBMS must ensure that an executed transaction should follow ACID properties which are: atomicity, consistency, isolation and durability. Atomicity: The ability o f a DBMS to tkguarantee that either all o f the tasks o f a transactiOfi are performed or none o f them ¡S” (Wikipedia 2009a). I f and only i f all parts o f the transaction succeed, the whole transaction w ill be committed and written in the database. Otherwise, the whole transaction w ill be rolled back and none o f the tasks is written. For example, suppose that the program accepts a new customer order, increases Balance Due, and stores the updated Customer record. However, suppose that the new Order record is not inserted successftilly for some reason. In this case, we want none o f the parts o f the transaction to affect the database. Consistency: The ability o f a DBMS to “ensure that the database remains in a consistent state before the start o f a transaction and after the transaction is over (whether it is successful or notj” (Wikipedia 2009a). It states that only valid data w ill be written to the database. If, for some reason, an executed transaction violates the database’s consistency rules, the entire transaction w ill be rolled back and the database w ill be restored to a state consistent with those rules. For example, i f the inventory on-hand balance must be the difference between total receipts minus total issues, this w ill be true both before and after an order transaction, which depletes on-hand balance to satisfy the order. Isolation: The ability o f a DBMS to ensure that other operations cannot access or see the data in an intermediate state during a transaction or changes to the database are not revealed to users until the transaction is committed. This constraint is required to 8 maintain the performance as well as the consistency between transactions in the DBMS. Thus, each transaction is unaware o f other transactions executing concurrently in the system. For example, this property means that other users do not know what the new inventory on-hand is until an inventory transaction is complete; this property then usually means that other users are prohibited from simultaneously updating and possibly even reading data that are in the process o f being updated. Durability: The ability o f a DBMS to "guarantee that once the user has been notified o f success, the transaction will persist, and not be urtdorie" (Wikipedia 2009a). Thus, i f a transaction is committed, no subsequent failure o f the database can reverse the effect o f the transaction. Durability does not imply a permanent state o f the database. Another transaction may overwrite any changes made by the current transaction without obstructing durability. Among all o f these transaction properties, providing isolation is the main goal o f concurrency control. Thus, in this paper, we are going to focus on examining isolation levels in term o f concurrency control. Schedules In the field o f database and transaction processing, a schedule is a “sequence o f actions taken by one or more transactions'1,(Garcia-Molina, H, Ullman J & Widon, J 2009, p.884). Not all transaction operation types should be included in a schedule, and typically, only some operation types (e.g., data access operations) are included, as needed. Schedules and schedule properties are fundamental concepts in concurrency control theory. Consider the following example o f two transactions. 9 Example 1: Assume that there are two transactions Tj and 丁2 and their actions are shown in Figure 1. Ti T: > READ (A, t) READ (B, v) t := t+50 WRITE (A , t ) V := v*2 WRITE (B , R E A D ( A, READ (Bf t) t := t+50 WRITE(B, V) v) V := v*2 t) WRITE(A, V) Figure 1: Two transactions A and В are database elements that can be accessed by Ti or T 2 whereas t and s are local variables o f Ti and 丁2 . In this example the only consistency constraint is that the value in A must equal the value in в when the transaction starts and when it completes. According to the above figure, transaction Tj adds 50 to both A and в ;T2 multiplies both A and В by 2. I f these transactions run separately, i.e. in isolation, the database remains in a consistent state. Serial schedules A schedule is a serial schedule i f all actions o f each transaction occur consecutively (Garcia-Molina, H, Ullman J & Widon, J 2009,p.885). No mixing o f the actions is allowed. Exam ple 2: There are two serial schedules for the transactions o f Figure 2,one is that T ị precedes T2 and the other is vise versa. Figure 2 shows the schedule in which Tj precedes T2 and the initial state is A=B=20. 10 T] T2 READ(A, t) t := t+50 WRITE(A, t) R E A D ( B f t) t := t+50 WRITE(B, t) A 20 В 20 70 70 RE AD ( B f V ) V := v*2 W R I T E (В, V ) READ(A, V ) V := v*2 WRITE(A, V ) 140 140 Figure 2: Serial schedule in which Tị precedes т2 Serializable schedules When two or more transactions are performed concurrently, the risk o f lost update might occur which causes incorrect data. Let us consider two transactions in Figure 2. In this case, we w ill let the transaction 丁2 execute after the first transaction Tj finishes w riting to A. Ti T2 READ ( A, t) t := t+50 WRITE(A, t) A 20 В 20 70 RE AD ( B f V ) := v*2 W R I T E (В, V ) V 40 READ(B, t) t := t+50 WRITE(B, t) 90 R E A D ( A, V ) V := v*2 WRITE(A, v) 140 Figure 3: T] and Гі interleaves II As shown in Figure 3,at the end o f the transactions, the value o f A is 140 while the value o f в is only 90. A t this time, the constraint A=B is violated. In other words, this schedule leads to inconsistent results. We say that concurrent transactions need to be processed “ in isolation., so that they do not interfere with each other. Procedures that process transactions so that the outcome is the same as this are called serializable. Processing transactions using a serializable schedule w ill give the same results as i f the transactions had been processed one after the other. Schedules are designed so that transactions that w ill not interfere with each other can still be run in parallel. Transaction Isolation One solution for the problems o f interleaving between concurrent transactions is grouping execution statements into transactions. Recall from the “ Transactions and ACID properties” part, a transaction is a set o f operations that must be executed together or none is. A transaction can be started by using the SQL command “start transaction”. An executing transaction only ends when it executes a “ commỉt” or a “ rollback” statement. With the “ commit , , statement, all changes made by the transaction are written successfully in the database. On the other hand, i f a transaction ends with the statement “ rollback” the changes made by that transaction w ill be aborted. MySQL implements transaction isolation levels to give clients control over what kind o f changes made by other transactions they w ill be able to see. Different isolation levels allow or prevent the various problems that can occur when different transactions run simultaneously 12
- Xem thêm -

Tài liệu liên quan

Tài liệu xem nhiều nhất