What is a DBMS a RDBMS, OLAP and OLTP ?

What is a DBMS ?

A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data.

DBMS is a general term for a database with built in functions for data retrieval, data definition, and administration.

Using a DBMS you don’t access the data files directly, you only access them thru the provided interface.

For contrast, an example of a non-DBMS database is ISAM

A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.

Any software platform that manages data can be considered a DBMS.

Well-known DBMSs include:

Microsoft SQL Server

Portablility across different DBMSs is not an obligation, but different DBMS can interoperate by using standards such as SQL and ODBC or JDBC to allow a single application to work with more than one DBMS.

Database management systems are often classified according to the database model that they support; the most popular database systems since the 1980s have all supported the relational model as represented by the SQL language.

What is a RDBMS ?

Under DBMS we have Relational data base management systems.

These are platforms that stress the relationships as being important.

RDBMS is a DBMS that enforces the relational database design which was first proposed by E. F. Codd in the 1970s.

The idea is that you take data that would be repeated in a flat database and move it into separate database tables with a key relationship tieing it back to the parent table.

For example, a flat implementation of a school roll would have a single table that listed each student with the name of their teacher in the same record.

In a relational model of the school database you would have a pointer in the student table that linked to the apppropriate teacher in the teacher table. Seperating data this way is called normalization.

Under RDBMS we have both OLAP and OLTP.

Generally OLTP is for transactional implementations and OLAP is for analytical implementations.

OLTP is a term used to describe the type of data processing that occurs in the system. The database design for an OLTP application fits the relational model very well and they tend to be very normalized.

OLAP applications are for analyzing data to find knowledge about historical data that is useful to the business for making decisions.

They typically analyze huge amounts of data. Because of this they don’t fit the relational model as well. A highly degree of normalization comes at a cost and that cost can be higher access times. The design of an OLAP database breaks some of the rules of the relational model so that the data retrieval times can be faster. There is much more to know about OLAP database design – the basis of which is a “cube”.

OLTP and OLAP are really two different types of applications – not databases. The application needs drive the design of the database and so we have database designs called OLTP and OLAP that support the two types of applications.

What is OLAP?

Online analytical processing is a computer technology term referring to systems focused on analysing data in a specific database.

This kind of systems are characterized for their analytical capabilities, addressing multi-dimensional or one dimension data, processing all the information. The standard applications of OLAP are bussiness intelligence, data writing and reporting, throught data mining processes.
OLAP operations and databases

On the database level, these systems operation is defined by a low level of transactions, dealing with archived and historical information. This data is seldom updated, identifying the SELECT database operation as the key feature of the system. Therefore, this kind of databases are based on READ operations, aggregating all available information.

Databases that work as data warehouses apply this methodology, optimizing reading and aggregation operations of its multidimensional data model. Thus providing a great support for data analysis and reporting operations, critical in these kind of databases.
Data cube

The main component of these systems is a OLAP cube.

A cube consists in combining data warehouse’s structures like facts and dimensions. Those are organized as schemas: star schema, snowflake schema and fact constellation. The merging of all the cubes creates a multidimensional data warehouse.
System types

There are many types of OLAP systems, depending on it’s structure characteristics. The most common ones are: MOLAP, ROLAP and HOLAP.

The most important real world applications of these systems are: bussiness management and reporting, financial reporting, marketing, research and another data related issues. These processes are growing faster on these days, making them absolutely critical in a world that is becoming dependent of data. In the next paragraph we will provide a real world example of what we described before.

Real World Example: In a hospital there is 20 years of very complete patient information stored. Someone on the administration wants a detailed report of the most common deseases, sucess rate of treatment, intership days and a lot of relevant data. For this, we apply OLAP operations to our data warehouse with historical information, and throught complex queries we get these results. Then they can be reported to the administration for further analysis.

What is OLTP?

Online Transaction Processing is a information system type that prioritizes transaction processing, dealing with operational data.

This kind of computer systems are identified by the large number of transactions they support, making them the best to address online application. The main applications of this method are all kind of transactional systems like databases, commercial, hospital applications and so on.

In a simple way, these systems gather input information and store them on a database, in a large scale. Most of today’s applications are based on this interaction methodology, with implementations of centralized or descentralized systems.

OLTP database and operations:

On the database level, these transactional systems base their operation on multi-access, fast and effective querys to the database. The most used operations are INSERT, UPDATE and DELETE, since they are directly modifying the data, providing new information on new trasactions. So, in these systems, data is frequently updated, requiring a effective write operations support.

One special characteristic of those databases is the normalization of it’s data. This happens because data normalization provides a faster and more effective way to perform database writes. The main concern is the atomicity of the trasanctions and ensuring that concurrent accesses don’t damage data and also don’t degradate system’s performance.
Other systems

OLTP is not only about databases, but also other types of interaction mecanisms. All client-server architectures are based on these processes, taking benefit of the fast transaction and concurrent models. Descentralized systems are also online transaction processing, as all broker programs and web servervices are transaction oriented.

Real World Example: A banking transaction system is a classic example. There are many users executing operations into their accounts and the system must guarantee the completeness of the actions. In this case there are several concurrent transactions at the same time, being data coherence and efficient operations the main goal.

Comparing OLTP vs OLAP

OLTP, also known as Online Transaction Processing, and OLAP which stands for Online Analytical Processing, are two distinct kinds of information systems technologies.

Both are related to information databases, which provide the means and support for these two types of functioning.

Each one of the methods creates a different branch on data management system, with it’s own ideas and processes, but they complement themselves.

To analyse and compare them we’ve built this resource!

Basically, OLAP and OLTP are very different approaches to the use of databases, but not only. In one hand online analytical processing is more focused on data analysis and reporting, on the other hand online transaction processing target a transaction optimized system, with a lot of data changes.

For someone learning about data sciences, related to IT methods, it is important to know the difference between these two approaches to information. This is the base idea to systems like business intelligence, data mining, data warehousing, data modelling, etl processes and big data.

Regarding the previous descriptions of the systems, we can compare them in a lot of distinct categories.

The review is detailed in the next table. Then we have a further discussion on each compared item which could evoke some doubts, to ensure you understood.

The first obvious difference is the focus of each. OLTP is best suited to update existing data, becoming the logical choice to operational systems which work with every day actions. In opposite, OLAP provides great analysis habilities of stored information, making this kind of systems perfect for reporting purposes. They usually base their work in historical data, which is best to analyse, getting more accurate reports.

These systems have very different purposes.

To achiveve the better performance on each one, there are some architectures that suit best operational actions, and others that boost reporting and analysis.

Then OLTP database schemas are usually on a normalized form, providing a better performance for the usual queries.

On the other hand, OLAP databases have specific data warehouse organization schemas.

Based on this first and critical differences, we can infer some other contrasts.

In OLTP, where the common work is operational, the queries on the system should be much more simpler than on OLAP, where complex digging on data is performed. Therefore, OLTP systems tend to be much faster than OLAP ones.

One of the subjects that hasn’t been discussed yet is data refresh.

In the OLTP system information is constantly changing, so this refresh rate is immediate after each actions. In the other hand, on OLAP systems, the refresh is a pre-defined processing job that stores large sets of data simultaneously. This take long time because of the data size and the process of normalizing the information.

Another term we didn’t discuss is the backup situation.

Since data on OLTP systems is absolutely critical, it needs a complex backup system.

Full backups of the data combined with incremental backups are required.

OLAP only needs a backup from time to time, since it’s data is not critical and doesn’t keep the system running.

Space requirements on both sides is, obviously dependent on the size of information stored.

However is safe to assume that historical data will need more storage space, since a lot more information is stored.

Next subject will be NoSQL Database :Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.