Skip to main content

Indexing in DBMS

Indexing in DBMS

  • Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed. 
  • The index is a type of data structure. It is used to locate and access the data in a database table quickly.

Index structure:
Indexes can be created using some database columns.


  • The first column of the database is the search key that contains a copy of the primary key or candidate key of the table. The values of the primary key are stored in sorted order so that the corresponding data can be accessed easily. 
  • The second column of the database is the data reference. It contains a set of pointers holding the address of the disk block where the value of the particular key can be found.

Indexing Methods



Ordered indices
The indices are usually sorted to make searching faster. The indices which are sorted are known as ordered indices.

Example: Suppose we have an employee table with thousands of record and each of which is 10 bytes long. If their IDs start with 1, 2, 3....and so on and we have to search student with ID-543.

  • In the case of a database with no index, we have to search the disk block from starting till it reaches 543. The DBMS will read the record after reading 543*10=5430 bytes.
  • In the case of an index, we will search using indexes and the DBMS will read the record after reading 542*2= 1084 bytes which are very less compared to the previous case.

Primary Index
  • If the index is created on the basis of the primary key of the table, then it is known as primary indexing. These primary keys are unique to each record and contain 1:1 relation between the records.
  • As primary keys are stored in sorted order, the performance of the searching operation is quite efficient. 
  • The primary index can be classified into two types: Dense index and Sparse index.

Dense index
  • The dense index contains an index record for every search key value in the data file. It makes searching faster.
  • In this, the number of records in the index table is same as the number of records in the main table.
  • It needs more space to store index record itself. The index records have the search key and a pointer to the actual record on the disk.



Sparse index
In the data file, index record appears only for a few items. Each item points to a block.
In this, instead of pointing to each record in the main table, the index points to the records in the main table in a gap.



Clustering Index
  • A clustered index can be defined as an ordered data file. Sometimes the index is created on non-primary key columns which may not be unique for each record.
  • In this case, to identify the record faster, we will group two or more columns to get the unique value and create index out of them. This method is called a clustering index.
  • The records which have similar characteristics are grouped, and indexes are created for these group.
Example: suppose a company contains several employees in each department. Suppose we use a clustering index, where all employees which belong to the same Dept_ID are considered within a single cluster, and index pointers point to the cluster as a whole. Here Dept_Id is a non-unique key.



The previous schema is little confusing because one disk block is shared by records which belong to the different cluster. If we use separate disk block for separate clusters, then it is called better technique.



Secondary Index
In the sparse indexing, as the size of the table grows, the size of mapping also grows. These mappings are usually kept in the primary memory so that address fetch should be faster. Then the secondary memory searches the actual data based on the address got from mapping. If the mapping size grows then fetching the address itself becomes slower. In this case, the sparse index will not be efficient. To overcome this problem, secondary indexing is introduced.

In secondary indexing, to reduce the size of mapping, another level of indexing is introduced. In this method, the huge range for the columns is selected initially so that the mapping size of the first level becomes small. Then each range is further divided into smaller ranges. The mapping of the first level is stored in the primary memory, so that address fetch is faster. The mapping of the second level and actual data are stored in the secondary memory (hard disk).



For example:
  • If you want to find the record of roll 111 in the diagram, then it will search the highest entry which is smaller than or equal to 111 in the first level index. It will get 100 at this level. 
  • Then in the second index level, again it does max (111) <= 111 and gets 110. Now using the address 110, it goes to the data block and starts searching each record till it gets 111. 
  • This is how a search is performed in this method. Inserting, updating or deleting is also done in the same manner.

Comments

Popular posts from this blog

DBMS Keys

DBMS Keys KEYS in DBMS is an attribute or set of attributes which helps you to identify a row (tuple) uniquely in a relation(table). They allow you to find the relation between two tables. Keys help you uniquely identify a row in a table by a combination of one or more columns in that table. Key is also helpful for finding unique record or row from the table. Database key is also helpful for finding unique record or row from the table. Example: Employee ID FirstName LastName 11 Andrew Johnson 22 Tom Wood 33 Alex Hale In the above-given example, employee ID is a primary key because it uniquely identifies an employee record. In this table, no other employee can have the same employee ID. Here are some reasons for using sql key in the DBMS system. Keys help you to identify any row of data in a table. In a real-world application, a table could contain thousands of records. Moreover, the records could be duplicated. Keys in RDBMS ensure that you can uniquely identify a table record despite ...

Colors in CSS

Ways to declare Colors in CSS (Image by - Sharma Guides | Subham232330) 1. Color Name 2. Hex Value 3. RGB() and RGBA() 4. HSL() and HSLA() 5. HWB() * Color Name:- background-color:red; * HEX Value:- background-color:#001122; * RGB():- background-color:rgb(25,31,52); * RGBA():- background-color:rgba(0,0,0,1.5);          |           Transparency The hexadecimal system uses values from 0 to 255 but in RGB we can use 0% to 100% as well.

Computer Short Questions

Computer Short Questions & Answers: 1. What is any part of the computer that you can physically touch? – Hardware 2. Which generation of computers is still under development? – Fifth 3. What is the most common storage device for the personal computer? – Hard Disk Drive 4. Which key is used in combination with another key to perform a specific task? – Control 5. What is the pattern of printed lines on most products? – Barcodes 6. To make the number pad act as a directional arrow, we press which key? – Shift 7. Which devices let the computer communicate with you? – Input 8. What is the most frequently used piece of hardware for inputting data? – Hardware 9. What is the place where the computer stores programs and data? – Storage unit 10. What is the process of dividing the disk into tracks and sectors? – Formatting 11. What is the space in your computer that loads’ and works with data? – RAM memory 12. What is the storage which stores or retains data after power off? – Non-volatile s...