Cardinality Examples in SQL


What is Cardinality(SQL)?

In SQL, cardinality refers to the uniqueness of data in a specific column of a table. A table would be said to have less cardinality if it has more duplicated data in a column. So, more the cardinality less the data duplication (in a column) of SQL database table. In databases, the term data cardinality is used. It is related to the performance of a query. It uncovers how many unique values are in an attribute.

How to calculate Cardinality(SQL)?

A cardinality estimation framework is used which includes base statistics that estimate the foreseen number of rows that would return from the primary table. The base statistics consists of three components:

  • Table Metadata

  • Column Density

  • Column Distribution Histogram

Histograms created either manually or automatically are used for cardinality estimation CE in SQL. Logical rewrites and constraint information can also be used by SQL server to calculate cardinality.

Types and example:

For instance, suppose we have a table with a "Gender" column which has just two potential values of "Men" and "Women". At that point, that "Gender" column would have a cardinality of 2 on the grounds that there are just two one of a kind values that might show up in that column – Men and Women. A column with zero cardinality means that column does not hold any unique value.

Cardinality Examples in SQL

1. Column cardinality:

Suppose you have a table named Employees with the following data:

EmployeeID FirstName LastName Department
1 Alice Smith HR
2 Bob Johnson IT
3 Carol Williams IT
4 Dave Brown HR

The cardinality of the Department column is 2, as there are two distinct departments: HR and IT.

2. Query result cardinality:

Consider the following SQL query on the Employees table:

SELECT Department FROM Employees;

The result set contains 4 rows, so the cardinality of the result set is 4.

3. Cardinality in JOIN operations:

Suppose you have another table named Projects with the following data:

ProjectID ProjectName Department
1 ProjectA HR
2 ProjectB IT
3 ProjectC IT

When you perform an INNER JOIN between the Employees and Projects tables based on the Department column, the cardinality of the result set depends on the matching rows:

SELECT E.EmployeeID, E.FirstName, E.LastName, P.ProjectID, P.ProjectName
FROM Employees E
INNER JOIN Projects P ON E.Department = P.Department;

The result set contains 5 rows, so the cardinality of the result set is 5.

Cardinality is essential in understanding database performance and optimization, as high cardinality columns are better suited for indexing and can result in more efficient query execution plans.

Let’s build data apps to transform your business processes

Start for Free
Scale operations fast
Backed by Y Combinator
1250 Missour St San Francisco CA 94010
Copyright © 2023 Acho Software Inc.