COMPEC Oracle Show: Day 1
Let me try to summarize what we talk about today under some headlines:
What is data?
Data is something you may encounter when you look anywhere in the universe. Everything is data such that probably it will be never possible to store whole data in the universe. Let’s take this phylosophical discussion aside and explain the concept of data with practical examples. For eaxmple 28 is a data. It may the 28th of November or age of your brother or the temprature of the weather. Even more it may not be the number 28. It may be the sentences “Bob Miner writes the half code of first Oracle version” in some strange Oraclish language. You can’t load context to data.
What is information?
Information is a piece of knowledge that may be apriori given or that may be inferrred. Think in this way:
- Information 1(given): Student Ahmet has a GPA of 3.98 out of 4.0
- Information 2(given): If a student’s GPA is above 3.5 he/she is a high honour student.
- Information 3(inferred): Student Ahmet is a high honour student.
What makes data information ?
It is the semantic data model that transforms data into information. Semantic Data Model can be thought as a mathematical function:
SemanticDMFunc1(Ahmet, 3.98) –> Student Ahmet has a GPA of 3.98 out of 4.0
Notice that only semantic data model (or can be thought as bussiness requreiment or environment) can tell you that,
- Ahmet stands for the name of a student
- 3.98 stands for GPA
- 4.00 is the maximum GPA level.
What is database ?
Database is a collection of data used to represent a modeled real world information based on a semantic model. In this definition like sentences the most important thing is the phrase “modeled real world”. Modelling in this sense is the simplification of real world according to your semantic model or according to your need. A real person has a father and his/her father has a name. But including father name into your database is related with what we call business. The line that seperates the real world and the database may be so ambigious that you may not decide whether to put a real world information or not into. In those cases it is usually better to include it into the database with the cost of increasing system size(not only storage size, I mean system complexity) or to phase the needs to handle them in parts.
What is DBMS ?
DBMS is the abbreviation used for DataBase Management System. A database management system, or DBMS, is software designed to assist in
maintaining and utilizing large collections of data, and the need for such systems, as well as their use, is growing rapidly. Oracle, TimesTen, BerkeleyDB are all different kinds of database management systems.
File System vs DBMS
A classical question is that why not to use a filesystem to handle all those stuff rather than having a dedicated software. Here are some drawbacks of using file system with compared to DBMS:
Assume that you have data to be stored of size 500 GB and you all of the data as file systems.
- We probably do not have 500 GB of main memory to hold all the data. We must
therefore store data in a storage device such as a disk or tape and bring relevant
parts into main memory for processing as needed.
- We have to write special programs to answer each question that users may want
to ask about the data. These programs are likely to be complex because of the
large volume of data to be searched.
- We must protect the data from inconsistent changes made by different users accessing the data concurrently. If programs that access the data are written with such concurrent access in mind, this adds greatly to their complexity.
- We must ensure that data is restored to a consistent state if the system crashes while changes are being made.
- Operating systems provide only a password mechanism for security. This is not sufficiently flexible to enforce security policies in which differet users have permission to access different subsets of the data.
Those are the very first problems you will encounter in the beginning of the story. There much harder stuff to deal with 🙂
Atomicity in a DBMS
Atomicity is to group set of actions such that whole members of this action set either performed together or not at all. Let’s remember the bank transaction scenario we did in class:
Bob wants to send $1000 to Eve and you are the developer of the system. The design is as follows:
- Withdraw $1000 from account of Bob at time 09:00:00,001.
- Add $1000 to the account of Eve at time 09:00:00,006.
Assume that this two opperation executed in a totally independent manner. That is no atomicity takes place. If everything goes fine transaction will succeed. But assume that at time 09:00:00,004 Evil came to database room and unplugged the power cord of your database. Goosh. Do you see the result. As your system recovered Bob is $1000 poorer but Eve is not $1000 richer. $1000 is get lost. I think your manager will appriciate your magicial power in losing $1000.
Assume now that exactly the same scenario in a system including atomicity. As the system recovered system will identified that some in complete atomic operations exist. Yes we have completed the first part of atomic bank transaction but cord is unplugged before the second part so the system itself will decide to UNDO the whole operation. Which means Bob is not poorer and Eve is not richer by keeping their initial states.
Isolation in a DBMS
Isolation concept may vary among different DBMS. But let’s talk about Oracle isolation over the same example we have used in atomicity section with a little bit of extension.
Assume that we need two toy reports about the accounts of Bob and Eve from two different sessions while the bank transaction is performed from yet another session. Totally we have three sessions.
- Withdraw $1000 from account of Bob at time 09:00:00,001 from session 1.
- The company’s “Technical Marketting Rechargeble of High Capacity Working Effort the Most General Manager” takes a report on current balance of Bob’s account at time 09:00:00,004 from session 2.
- Add $1000 to the account of Eve at time 09:00:00,006 from session 1.
- Commit at time 09:00:00,007 from session 1.
- The company’s “Capable Planning of High Recursively Defined Human Department Head” takes another report on current balance of Eve’s account at time09:00:00,009 from session 3.
Assume that there is no isolation among sessions and reporters can see changes of modifers independent of commit. Then at step 2 “… The Most General Manager” will take a report that says Bob currently has $0 (assuming it has $1000 initially). And the manager make a presentation to the company board that Mr. Bobby has no money. One hour later he will take a telephone saying that at time 09:00:00,005 database were crashed and DBA performed a recovery. Finally database has been consistently reopened. “…The Most General Manager” will thank to the DBA and one day after “..The Most General Manager” will be fired because Mr. Bobby reported to have a zero account indeed now has $1000. A wrong report to board.
Do you see the reason why Oracle displays $1000 in such a case ? Because it is not guaranteed that the each transaction started will be completed successfully. It may fail so you can’t display a fuzzy stated record to other users. This is called read consistency in Oracle. You can not see any uncommited change on a row unless you are the session making that change. Next week we will examplify this using our XE.
See you next week.