What Is a Database?

Introduction

Databases are the backbone of most modern information systems. Because contemporary databases are stored on computers, they can hold data of almost any size and complexity. There are many ways to collect and arrange data, depending on the data type and how the data is used.

This article provides a comprehensive overview of databases and database systems.

What Is a Database?

Database Definition

A database is an organized set of logically connected data. The data is structured and maintained to fit the user's needs so that it can be turned into useful information. Apart from storing the data itself, a database also keeps the relationships between data points.
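To illustrate how a database keeps relationships between data points, here is a minimal sketch using Python's built-in sqlite3 module. The authors/books schema and the helper names are hypothetical, chosen only for this example.

```python
import sqlite3

def build_library_db() -> sqlite3.Connection:
    """Create a small in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE authors (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE books (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES authors(id)
        );
    """)
    conn.executemany("INSERT INTO authors (id, name) VALUES (?, ?)",
                     [(1, "Ada"), (2, "Edgar")])
    conn.executemany("INSERT INTO books (title, author_id) VALUES (?, ?)",
                     [("Notes", 1), ("Relations", 2), ("Sketches", 1)])
    conn.commit()
    return conn

def titles_by_author(conn: sqlite3.Connection, author_name: str) -> list:
    """Follow the books -> authors relationship to list one author's titles."""
    rows = conn.execute(
        """SELECT books.title
           FROM books
           JOIN authors ON authors.id = books.author_id
           WHERE authors.name = ?
           ORDER BY books.title""",
        (author_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Calling titles_by_author(build_library_db(), "Ada") follows the stored relationship from books to authors. Because the relationship is declared in the schema, the database itself rejects a book that points to a nonexistent author.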

In the broader sense, a database is an integrated set of information about a system, together with the procedures for maintaining and using it. Unlike a spreadsheet, a database is designed to be accessed by multiple users and applications at once.

What Are Databases Used For?

Databases have a broad application spectrum. Typical examples include:

  • Banking systems keep databases for clients, bank accounts, credits, transactions, etc.
  • Airlines keep information about flights, ticket reservations, and similar. Airline companies were among the first to use geographically distributed databases.
  • Universities use databases to record information about students, applications, grades, courses, etc.
  • Credit card systems keep tabs on payments and generate monthly reports.
  • Telecommunication companies store information about calls, generate monthly bills, track the communication line, etc.
  • The finance sector tracks the sales and purchases of financial instruments such as bonds and shares.
  • Commerce and e-commerce businesses store data about consumers, products, and various price catalogs.
  • Manufacturing businesses manage supply chains, production lines, and storage facilities, generate invoices, etc.
  • Human resources stores information about employees, paychecks, taxes, benefits, etc.

The list above shows how crucial databases are for any business type. Modern user interfaces hide the backend, so many users are unaware that they use a database daily.

A Brief History of Databases

The first automated database is associated with Herman Hollerith, who filed a patent for an automatic data processing system in 1884. The 1890 US census used his punched card tabulating system, and the collection of cards can be seen as the first automated database.

Each card recorded information about one person as a pattern of punched holes. Processing the previous census by hand had taken years. With the punched cards and the automated card reading mechanism, an initial count was reportedly ready in around six weeks. Punched cards remained in use throughout the 20th century, for example for voting and for clocking in at work.

After World War II, companies and government institutions started using computers for simple linear accounting databases. The first computerized databases were used for specific tasks and lacked flexibility.

Databases in the 1960s

In the 1960s, file-based data storage still had a dominant role. The first database management systems appeared in this decade and were used for big and complex projects, such as the Apollo moon landing program.

Databases in the 1970s

Databases became a commercial reality in the 1970s. Hierarchical and network systems for managing data were introduced, mainly for handling complex data structures, such as the accounts a factory keeps when purchasing raw materials.

This decade saw the first generation of commercial DBMSs, some of which are still in use today. The first-generation systems had several drawbacks:

  • The data was hard to retrieve. Even simple data access required complex programs.
  • Limited data independence made information hard to change and update.
  • The hierarchical and network models lacked a theoretical foundation.

Databases in the 1980s

The 1980s addressed the drawbacks of the previous decade. The relational data model had appeared in the 1970s, and second-generation DBMSs based on it found commercial use in the 1980s. In the relational model, all data is in a familiar tabular format, and a relatively simple query language (SQL) retrieves the data from the database.

The new database model made data access easier for people who were not programmers, addressing the most significant issue with the previous models. The relational model was also convenient for client/server communication and parallel data transfers, and graphical user interfaces (GUIs) made usage simpler.

Databases in the 1990s

The 1990s gave rise to internet applications and the data storage systems behind them. Multimedia data (graphics, sound, pictures, and videos) became more common, and massive amounts of both structured and unstructured data became standard. As data complexity rose, relational database systems adopted object-oriented features.

Databases in the 2000s

Three new database types appeared: XML, NoSQL, and NewSQL databases.

XML databases are a highly structured document-based type. Querying is allowed through XML attributes with varying degrees of flexibility.

NoSQL databases answer the strong demand for highly flexible distributed database systems, which often rely on eventual consistency and do not require a fixed schema. The NoSQL type is highly scalable and typically stores denormalized data.
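The idea of schema-free, denormalized data can be sketched in plain Python. This is not a NoSQL engine, only an illustration of the data shape; the order documents and helper functions below are hypothetical.

```python
import json

# Hypothetical order documents. Each document embeds the customer details it
# needs (denormalized) and may carry fields that the other documents lack.
orders = [
    {"_id": 1, "customer": {"name": "Ana", "city": "Lisbon"}, "items": ["laptop"]},
    {"_id": 2, "customer": {"name": "Ben"}, "items": ["pen", "ink"], "gift_wrap": True},
    {"_id": 3, "customer": {"name": "Ana", "city": "Lisbon"}, "items": [], "note": "call first"},
]

def find_with_field(collection, field):
    """Return the documents that contain the given top-level field."""
    return [doc for doc in collection if field in doc]

def orders_for(collection, customer_name):
    """Return the _id of every order whose embedded customer has this name."""
    return [doc["_id"] for doc in collection
            if doc["customer"]["name"] == customer_name]

def serialize(collection):
    """Encode each document as a JSON string, as a JSON document store might."""
    return [json.dumps(doc, sort_keys=True) for doc in collection]
```

No schema change is needed to add the gift_wrap or note fields, and reading an order requires no join because the customer data travels with it. The trade-off is that Ana's details are repeated in every one of her orders.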

NewSQL aims to combine the best attributes from NoSQL databases, such as scalability, while using SQL and maintaining ACID compliance.
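ACID compliance includes atomicity: a group of changes either succeeds as a whole or leaves no trace. The following sketch shows that behavior with Python's built-in sqlite3 module, which stands in here for any ACID-compliant engine. The accounts table and the transfer function are assumptions made for the example.

```python
import sqlite3

def open_bank() -> sqlite3.Connection:
    """Create a hypothetical accounts table that forbids negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 20)])
    conn.commit()
    return conn

def transfer(conn, sender, receiver, amount) -> bool:
    """Move money between accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, receiver))
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back together with it.
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, sender))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn) -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw the sender returns False, and both balances stay exactly as they were, even though the credit statement had already run inside the transaction.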

Different Types of Databases

There are many different database types currently available, each with benefits and drawbacks. Every database type creates a specific environment for storing data and the relationships between data.

  • Relational databases store data in table-like structures as rows and columns, with a focus on data consistency and the relations between data. It is the most widely used database type.
  • Object-oriented databases combine the object-oriented programming (OOP) principles with relational database standards.
  • Distributed databases spread across multiple sites and scale horizontally.
  • Data warehouses integrate data from various sources consistently into a single decision support system. Warehouses cater to large volumes of data and commonly reside on big data servers.
  • NoSQL databases are structurally diverse types of databases with a focus on high availability. NoSQL systems are best suited for large volumes of unstructured data.
  • Graph databases are a type of NoSQL database with a focus on relationships between data points. With a network structure of nodes and edges, graph databases are well suited to exploring and discovering relationships.
  • OLTP databases focus on short day-to-day transactions, supporting a large userbase with high data integrity and effectiveness in simultaneous queries.
  • Open-source databases are open to modifications and free to use. Customizable user preferences and the low cost make this database type widely adopted.
  • Cloud databases have all the traditional database features with cloud computing flexibility.
  • Multi-model databases provide a single engine for working with multiple database model types.
  • Document/JSON databases are NoSQL storage systems that store data as JSON documents.
  • Self-driving cloud databases (autonomous databases) use machine learning to automate various tasks in the DBMS.
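As an illustration of why a graph structure suits relationship discovery, the sketch below stores relationships as edges and explores them with a breadth-first search. It is plain Python rather than a graph database, and the follows data and the reachable_within function are hypothetical.

```python
from collections import deque

# Hypothetical "who follows whom" edges: each key points to the nodes it follows.
follows = {
    "ana": ["ben", "cleo"],
    "ben": ["dina"],
    "cleo": ["dina", "emil"],
    "dina": [],
    "emil": ["ana"],
}

def reachable_within(graph, start, max_hops):
    """Breadth-first search: map each node reachable from start to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the start node is not a discovered relationship
    return hops
```

For example, reachable_within(follows, "ana", 2) finds both direct follows and follows of follows. In a relational design, the same question usually requires one self-join per hop.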

Components of a Database

Five main components make up a database system.

Hardware

Hardware encompasses the physical devices that connect computers with the real world. For databases, the hardware includes the servers, storage disks, and various data collection devices needed to run and populate a database.

Software

The software includes a wide array of programs used to access, manipulate, and control the databases. On the lower levels, the software includes the operating systems on which the databases reside, the network software for communication with the databases, and the programs used to access the data.

Data

Data is the raw facts about an item or event that the database stores. The data requires processing to gain meaning and become information. Processing also extracts insights from the data and aids decision-making.

Procedures

Database procedures include all the functions operating within a database system. Whether they are regular backups, report generation, or other day-to-day operations, procedures are sets of instructions run in the database management system.

Database Access Language

The database access language is the language used to retrieve, insert, update, and delete data stored in a database. The database management system executes queries written in the database access language directly.
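SQL is one widely used database access language. The sketch below runs the four basic operations through Python's built-in sqlite3 module; the students table and its contents are made up for the example.

```python
import sqlite3

def run_crud_demo() -> list:
    """Insert, update, delete, and finally select rows in a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL, grade INTEGER)"
    )

    # INSERT: add new rows.
    conn.executemany("INSERT INTO students (name, grade) VALUES (?, ?)",
                     [("Mira", 8), ("Tom", 6), ("Lena", 9)])

    # UPDATE: change a value in an existing row.
    conn.execute("UPDATE students SET grade = ? WHERE name = ?", (7, "Tom"))

    # DELETE: remove a row.
    conn.execute("DELETE FROM students WHERE name = ?", ("Lena",))
    conn.commit()

    # SELECT: retrieve the remaining rows.
    rows = conn.execute("SELECT name, grade FROM students ORDER BY name").fetchall()
    conn.close()
    return rows
```

The application only states what should happen to the data. How the rows are located and stored on disk is left to the database management system.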

What Is a Database Management System (DBMS)?

A database management system (DBMS) serves as the interface between the user (or applications) and the database. The program allows direct communication with the database, permitting data retrieval, updates, optimization, and the overall management of the information stored in the database.

What Is a Database Server?

A database server is a dedicated server that provides database services to clients. One part of the server stores the DBMS, while another stores the database itself. Usually, database servers have a large storage capacity and a large amount of memory (RAM).

Why Are Databases Important?

A database system stores essential data about a business. When analyzed, the data becomes valuable information about the company and helps in the decision-making process.

Likewise, a database helps build an archive of the business, which makes the company more aware of where it is headed when making choices. Data stored about each interaction provides insight that helps the company develop further and increase profits.

There are many advantages when using a system with a database:

  • Independence between programs and data. Data definitions (metadata) are kept separate from the applications that use the data. As a result, an organization can change its data or transfer it to a different computer system without changing the programs that process the information.
  • Minimal data redundancy. Databases address the challenge of data repetition by integrating information into one logical structure in which each piece of data is repeated as little as possible. However, databases do not eliminate redundancies entirely. The system allows a database designer to plan the extent of redundancies depending on the use case.

Note: Data redundancies are desirable in some cases because they can increase database performance. For example, storing a copy of frequently read data can increase search speed.

  • Improved data sharing. Databases are a company resource that many employees and sectors use. Specific internal and external users operate the database, and each person or group has specialized views of the data.
  • Greater data security. A DBMS has administrative functions which help control the security over sensitive information through privileges and user roles.
  • Increased application development productivity. Developing new applications is faster thanks to database systems. A programmer can concentrate on functions necessary for a new application without having to define data. A DBMS helps automate activities such as the design and implementation of a database.
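The benefit of minimal data redundancy can be sketched as follows, again with Python's built-in sqlite3 module. In this hypothetical schema, a customer's email is stored once in a customers table instead of being repeated in every order, so a change touches a single row.

```python
import sqlite3

def build_shop_db() -> sqlite3.Connection:
    """Create a hypothetical schema where the customer email is stored only once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            id    INTEGER PRIMARY KEY,
            email TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            item        TEXT NOT NULL
        );
        INSERT INTO customers (id, email) VALUES (1, 'old@example.com');
        INSERT INTO orders (customer_id, item)
            VALUES (1, 'keyboard'), (1, 'monitor'), (1, 'mouse');
    """)
    return conn

def change_email(conn, customer_id, new_email) -> int:
    """Update the email and return how many rows had to change."""
    cursor = conn.execute("UPDATE customers SET email = ? WHERE id = ?",
                          (new_email, customer_id))
    conn.commit()
    return cursor.rowcount

def order_emails(conn) -> list:
    """List every order together with the email of the customer who placed it."""
    return conn.execute(
        """SELECT orders.item, customers.email
           FROM orders
           JOIN customers ON customers.id = orders.customer_id
           ORDER BY orders.item"""
    ).fetchall()
```

One updated row is enough for all three orders to show the new address. If the email were copied into each order row instead, every copy would need to be updated, and a missed copy would leave the data inconsistent.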

Common Database Challenges

Building and running a database system comes with challenges, risks, and expenses:

  • Staff training. A business that opts for a database system must enlist or train people to design, implement, and maintain a database. Because database technologies change constantly, continual training is necessary to use them efficiently.
  • Installation and management expenses and complexities. Multiuser database systems are extensive and complex software architectures, often with yearly costs for technical support and expansions. Software updates with continual security improvements are a must-have when working with data.
  • Backups, recovery, and security. Regular backups protect against data loss and support high availability. Clear security and database recovery procedures are necessary for a modern database system.
  • Organization conflicts. Shared databases require agreement on data definitions and data ownership. Additionally, a dedicated person is required to maintain the data. Therefore, a capable database administrator and meaningful access roles are necessary.
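A backup and recovery routine can be sketched with the backup method that Python's sqlite3 connections provide (available since Python 3.7). The invoices table and the in-memory databases are assumptions for the example. A production setup would write backups to separate, durable storage on a schedule.

```python
import sqlite3

def make_source() -> sqlite3.Connection:
    """Create a hypothetical database that holds a few invoices."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL NOT NULL)")
    conn.executemany("INSERT INTO invoices (total) VALUES (?)", [(19.99,), (5.00,)])
    conn.commit()
    return conn

def invoice_count(conn) -> int:
    """Count the invoices, reading the whole result so no statement stays open."""
    return conn.execute("SELECT COUNT(*) FROM invoices").fetchall()[0][0]

def backup_and_restore_demo() -> tuple:
    """Back up a database, simulate data loss, then restore from the backup."""
    source = make_source()
    backup = sqlite3.connect(":memory:")

    source.backup(backup)                   # copy the whole database
    source.execute("DELETE FROM invoices")  # simulate accidental data loss
    source.commit()
    after_loss = invoice_count(source)

    backup.backup(source)                   # restore by copying the backup back
    after_restore = invoice_count(source)
    return after_loss, after_restore
```

The demo returns the number of invoices right after the simulated loss and again after the restore, which shows that the backup copy is unaffected by later changes to the source.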

The Future of Databases

Autonomous databases are the technology with the most potential impact on the future of databases. Machine learning helps automate provisioning, management, tuning, and upgrades. Automation also frees up more attention for database security, which is likely to remain one of the biggest challenges these systems face.

Conclusion

This article provided an insight into database history, database components, and how and when databases are used.

Milica Dancuk
Milica Dancuk is a technical writer at phoenixNAP with a passion for programming. With a background in Electrical Engineering and Computing, coupled with her teaching experience, she excels at simplifying complex technical concepts in her writing.