Conditional CASE expressions. Conditional expressions CASE HAVING - imposing a selection condition on grouped data

In this part, we will meet:
  1. with a CASE expression, which allows you to include conditional expressions in a query;
  2. with aggregate functions that allow you to get various kinds of totals (aggregated values) calculated on the basis of detailed data obtained by the “SELECT … WHERE …” operator;
  3. with the GROUP BY clause, which, in combination with aggregate functions, allows you to get totals for detailed data by group;
  4. with the HAVING clause, which allows you to filter by grouped data.

CASE expression - SQL conditional statement

This operator allows you to check the conditions and return, depending on the fulfillment of a particular condition, one or another result.

The CASE statement has 2 forms:

Expressions can also act as values ​​here.

Let's take the first form of CASE as an example:

SELECT ID,Name,Salary, CASE WHEN Salary>=3000 THEN "SR>= 3000" WHEN Salary>=2000 THEN "2000<= ЗП < 3000" ELSE "ЗП < 2000" END SalaryTypeWithELSE, CASE WHEN Salary>=3000 THEN "ZP >= 3000" WHEN Salary>=2000 THEN "2000<= ЗП < 3000" END SalaryTypeWithoutELSE FROM Employees

WHEN conditions are checked sequentially, from top to bottom. When the first satisfying condition is reached, further checking is aborted and the value specified after the THEN word associated with this WHEN block is returned.

If none of the WHEN conditions are met, then the value specified after the ELSE word is returned (which in this case means "ELSE RETURN ...").

If the ELSE block is not specified and none of the WHEN conditions are met, then NULL is returned.

In both the first and second forms, the ELSE block comes at the very end of the CASE construct, i.e. after all WHEN conditions.

Let's take the second form of CASE as an example:

Suppose, for the new year, they decided to reward all employees and asked to calculate the amount of bonuses according to the following scheme:

  • Give employees of the IT department 15% of the salary;
  • Employees of the Accounting Department 10% of the salary;
  • Everyone else gets 5% of the salary.

We use a query with a CASE expression for this task:

SELECT ID,Name,Salary,DepartmentID, -- for clarity, display the percentage as a string CASE DepartmentID -- value to be checked WHEN 2 THEN "10%" -- 10% of the RFP give Accountants WHEN 3 THEN "15%" -- 15% from RFP give IT staff ELSE "5%" -- everyone else get 5% END NewYearBonusPercent, -- construct an expression using CASE to see the amount of the bonus Salary/100* CASE DepartmentID WHEN 2 THEN 10 -- issue 10% of the RFP Accountants WHEN 3 THEN 15 -- 15% of salary to be given to IT staff ELSE 5 -- everyone else 5% each END BonusAmount FROM Employees

This is a sequential check of the DepartmentID value with WHEN values. When the first DepartmentID is equal to the WHEN value, the check is aborted and the value specified after the THEN word associated with this WHEN block is returned.

Accordingly, the value of the ELSE block is returned if DepartmentID does not match any WHEN value.

If there is no ELSE block, NULL will be returned if DepartmentID does not match any WHEN value.

The second form of CASE is easy to represent using the first form:

SELECT ID,Name,Salary,DepartmentID, CASE WHEN DepartmentID=2 THEN "10%" -- 10% of RFP to issue to Accountants WHEN DepartmentID=3 THEN "15%" -- 15% of RFP to issue to IT staff ELSE "5% " -- everyone else gets 5% END NewYearBonusPercent, -- construct an expression using CASE to see the amount of the bonus Salary/100* CASE WHEN DepartmentID=2 THEN 10 -- give 10% of the salary to Accountants WHEN DepartmentID=3 THEN 15 -- 15% of the salary to be given to ELSE 5 IT staff -- to everyone else 5% each END BonusAmount FROM Employees

So the second form is just a simplified notation for when we need to do an equality comparison of the same test value with every WHEN value/expression.

Note. The first and second forms of CASE are part of the SQL language standard, so most likely they should be applicable in many DBMS.

With MS SQL version 2012, a simplified notation for IIF has appeared. It can be used to simplify the CASE construct when only 2 values ​​are returned. The IIF construct looks like this:

IIF(condition, true_value, false_value)

Those. this is essentially a wrapper for the following CASE construct:

CASE WHEN condition THEN true_value ELSE false_value END

Let's look at an example:

SELECT ID,Name,Salary, IIF(Salary>=2500,"Salary >= 2500","Salary< 2500") DemoIIF, CASE WHEN Salary>=2500 THEN "WR >= 2500" ELSE "WR< 2500" END DemoCASE FROM Employees

CASE, IIF constructs can be nested into each other. Consider an abstract example:

SELECT ID,Name,Salary, CASE WHEN DepartmentID IN(1,2) THEN "A" WHEN DepartmentID=3 THEN CASE PositionID -- nested CASE WHEN 3 THEN "B-1" WHEN 4 THEN "B-2" END ELSE " C" END Demo1, IIF(DepartmentID IN(1,2),"A", IIF(DepartmentID=3,CASE PositionID WHEN 3 THEN "B-1" WHEN 4 THEN "B-2" END,"C")) Demo2 FROM Employees

Since the CASE and IIF constructs are expressions that return a result, we can use them not only in the SELECT block, but also in other blocks that allow the use of expressions, for example, in WHERE or ORDER BY blocks.

For example, let's say we were given the task of creating a list for issuing a RFP on hand, as follows:

  • First of all, the RFP should be received by employees whose salary is less than 2500
  • Those employees who have a salary greater than or equal to 2500 receive salary in the second place
  • Inside these two groups, you need to sort the lines by last name (Name field)

Let's try to solve this problem by adding a CASE expression to the ORDER BY block:

SELECT ID,Name,Salary FROM Employees ORDER BY CASE WHEN Salary>=2500 THEN 1 ELSE 0 END, -- issue PO first to those who have it below 2500 Name -- further sort the list in order of full name

As you can see, Ivanov and Sidorov will be the last to leave work.

And an abstract example of using CASE in the WHERE clause:

SELECT ID,Name,Salary FROM Employees WHERE CASE WHEN Salary>=2500 THEN 1 ELSE 0 END=1 -- all records with expression equal to 1

You can try to remake the last 2 examples yourself with the IIF function.

And finally, let's remember once again about NULL values:

SELECT ID,Name,Salary,DepartmentID, CASE WHEN DepartmentID=2 THEN "10%" -- 10% of the RFP to issue to Accountants WHEN DepartmentID=3 THEN "15%" -- 15% of the RFP to issue to IT staff WHEN DepartmentID IS NULL THEN "-" -- we don't give bonuses to freelancers (we use IS NULL) ELSE "5%" -- everyone else gets 5% END NewYearBonusPercent1, -- otherwise you can't check for NULL, remember what was said about NULL in the second part of CASE DepartmentID - - value to be checked WHEN 2 THEN "10%" WHEN 3 THEN "15%" WHEN NULL THEN "-" -- !!! in this case, using the second form of CASE is not appropriate ELSE "5%" END NewYearBonusPercent2 FROM Employees

Of course, it could have been written something like this:

SELECT ID,Name,Salary,DepartmentID, CASE ISNULL(DepartmentID,-1) -- use replacement for NULL with -1 WHEN 2 THEN "10%" WHEN 3 THEN "15%" WHEN -1 THEN "-" -- if we are sure that there is no and will not be a department with ID equal to (-1) ELSE "5%" END NewYearBonusPercent3 FROM Employees

In general, the flight of fancy in this case is not limited.

For example, let's see how we can model the ISNULL function using CASE and IIF:

SELECT ID,Name,LastName, ISNULL(LastName,"Not Specified") DemoISNULL, CASE WHEN LastName IS NULL THEN "Not Specified" ELSE LastName END DemoCASE, IIF(LastName IS NULL,"Not Specified",LastName) DemoIIF FROM Employees

The CASE construct is a very powerful feature of the SQL language that allows you to impose additional logic to calculate the values ​​of the result set. In this part, possession of the CASE-construct will still be useful to us, therefore, in this part, first of all, attention is paid to it.

Aggregate functions

Here we will only cover the basic and most commonly used aggregate functions:
Name Description
COUNT(*) Returns the number of rows obtained by the "SELECT ... WHERE ..." statement. In the absence of WHERE, the number of all records in the table.
COUNT(column/expression) Returns the number of values ​​(not null) in the specified column/expression
COUNT(DISTINCT column/expression) Returns the number of unique non-null values ​​in the specified column/expression
SUM(column/expression) Returns the sum over the values ​​of a column/expression
AVG(column/expression) Returns the average value over the values ​​of a column/expression. NULL values ​​are not counted.
MIN(column/expression) Returns the minimum value over the values ​​of a column/expression
MAX(column/expression) returns maximum value by column/expression values

Aggregate functions allow us to calculate the total value for a set of rows obtained using the SELECT statement.

Let's look at each function with an example:

SELECT COUNT(*) [Total number of employees], COUNT(DISTINCT DepartmentID) [Number of unique departments], COUNT(DISTINCT PositionID) [Number of unique positions], COUNT(BonusPercent) [Number of employees with bonus %] , MAX(BonusPercent) [Maximum bonus percentage], MIN(BonusPercent) [Minimum bonus percentage], SUM(Salary/100*BonusPercent) [Sum of all bonuses], AVG(Salary/100*BonusPercent) [Average bonus amount], AVG (Salary) [Average Salary] FROM Employees

For clarity, I decided to make an exception here and used the […] syntax to set column aliases.

Let's analyze how each returned value turned out, and for one thing, recall the constructions of the basic syntax of the SELECT statement.

First, because If we did not specify WHERE conditions in the query, then the totals will be calculated for the detailed data that is obtained by the query:

SELECT * FROM Employees

Those. for all rows in the Employees table.

For clarity, we select only the fields and expressions that are used in aggregate functions:

SELECT DepartmentID, PositionID, BonusPercent, Salary/100*BonusPercent , Salary FROM Employees

DepartmentID PositionID BonusPercent Salary/100*BonusPercent Salary
1 2 50 2500 5000
3 3 15 225 1500
2 1 NULL NULL 2500
3 4 30 600 2000
3 3 NULL NULL 1500
NULL NULL NULL NULL 2000

These are the initial data (detailed rows), which will be used to calculate the totals of the aggregated query.

Now let's break down each aggregated value:

COUNT(*)– because If we didn't specify filtering conditions in the WHERE block in the query, then COUNT(*) gave us the total number of records in the table, i.e. is the number of rows the query returns:

SELECT * FROM Employees

COUNT(DISTINCT Department ID)- returned us the value 3, i.e. this number corresponds to the number of unique department values ​​specified in the DepartmentID column, excluding NULL values. Let's go through the values ​​of the DepartmentID column and color the same values ​​in the same color (feel free, all methods are good for learning):

We discard NULL, after which we got 3 unique values ​​(1, 2 and 3). Those. the value obtained by COUNT(DISTINCT DepartmentID), in expanded form, can be represented by the following sample:

SELECT DISTINCT DepartmentID -- 2. take only unique values ​​FROM Employees WHERE DepartmentID IS NOT NULL -- 1. discard NULL values


COUNT(DISTINCT PositionID)- the same as was said about COUNT(DISTINCT DepartmentID), only for the PositionID field. We look at the values ​​of the PositionID column and do not spare colors:


COUNT(BonusPercent)– returns the number of rows that have a BonusPercent value, i.e. the number of records with BonusPercent IS NOT NULL is counted. Here it will be easier for us, because. you don't need to count unique values, just discard records with NULL values. We take the values ​​of the BonusPercent column and cross out all NULL values:

There are 3 values ​​left. Those. In expanded form, the sample can be represented as follows:

SELECT BonusPercent -- 2. take all values ​​FROM Employees WHERE BonusPercent IS NOT NULL -- 1. discard NULL values

Because Since we did not use the words DISTINCT, then repeated BonusPercents will be counted if they exist, ignoring BonusPercent equal to NULL. For example, let's compare the result with and without DISTINCT. For greater clarity, let's use the values ​​of the DepartmentID field:

SELECT COUNT(*), -- 6 COUNT(DISTINCT DepartmentID), -- 3 COUNT(DepartmentID) -- 5 FROM Employees


MAX(BonusPercent)- returns the maximum value of BonusPercent, again excluding NULL values.
We take the values ​​of the BonusPercent column and look for the maximum value among them, we do not pay attention to NULL values:

SELECT TOP 1 BonusPercent FROM Employees WHERE BonusPercent IS NOT NULL ORDER BY BonusPercent DESC -- sort descending

MIN(BonusPercent)– returns the minimum value of BonusPercent, again ignoring NULL values. As in the case of MAX, only looking for the minimum value, ignoring NULL:

Those. we get the following value:

SELECT TOP 1 BonusPercent FROM Employees WHERE BonusPercent IS NOT NULL ORDER BY BonusPercent -- sort ascending

A visual representation of MIN(BonusPercent) and MAX(BonusPercent):


SUM(Salary/100*BonusPercent)– returns the sum of all non-NULL values. Parsing the values ​​of the expression (Salary/100*BonusPercent):

Those. the following values ​​are summed:

SELECT Salary/100*BonusPercent FROM Employees WHERE Salary/100*BonusPercent IS NOT NULL


AVG(Salary/100*BonusPercent)- returns the average of the values. NULL expressions are not taken into account, i.e. this matches the second expression:

SELECT AVG(Salary/100*BonusPercent), -- 1108.33333333333 SUM(Salary/100*BonusPercent)/COUNT(Salary/100*BonusPercent), -- 1108.33333333333 SUM(Salary/100*BonusPercent)/COUNT(*) -- 566.6616 FROM Employees

Those. again, NULL values ​​don't count towards the count.

If you need to calculate the average of all employees, as in the third expression, which gives 554.166666666667, then use the preliminary conversion of NULL values ​​to zero:

SELECT AVG(ISNULL(Salary/100*BonusPercent,0)), -- 554.166666666667 SUM(Salary/100*BonusPercent)/COUNT(*) -- 554.166666666667 FROM Employees

AVG(Salary)– in fact, everything here is the same as in the previous case, i.e. if an employee's Salary is NULL, then he will not be counted. To take into account all employees, respectively, do a preliminary NULL transformation of the values ​​AVG(ISNULL(Salary,0))

Let's sum up some results:
  • COUNT(*) - serves to count the total number of rows that are received by the "SELECT ... WHERE ..." statement
  • in all other aggregate functions listed above, when calculating the total, NULL values ​​are not taken into account
  • if we need to take into account all rows, this is more relevant for the AVG function, then it is first necessary to process NULL values, for example, as shown above "AVG (ISNULL (Salary, 0))"

Accordingly, when specifying an additional condition with aggregate functions in the WHERE block, only the totals for the rows that satisfy the condition will be calculated. Those. aggregate values ​​are calculated for the final set, which is obtained using the SELECT construct. For example, let's do the same, but only in the context of the IT department:

SELECT COUNT(*) [Total number of employees], COUNT(DISTINCT DepartmentID) [Number of unique departments], COUNT(DISTINCT PositionID) [Number of unique positions], COUNT(BonusPercent) [Number of employees with bonus %] , MAX(BonusPercent) [Maximum bonus percentage], MIN(BonusPercent) [Minimum bonus percentage], SUM(Salary/100*BonusPercent) [Sum of all bonuses], AVG(Salary/100*BonusPercent) [Average bonus amount], AVG (Salary) [Average Salary] FROM Employees WHERE DepartmentID=3 -- consider IT department only

I suggest you, for a better understanding of the work of aggregate functions, independently analyze each received value. We carry out calculations here, respectively, according to the detailed data received by the request:

SELECT DepartmentID, PositionID, BonusPercent, Salary/100*BonusPercent , Salary FROM Employees WHERE DepartmentID=3 -- consider IT department only

DepartmentID PositionID BonusPercent Salary/100*BonusPercent Salary
3 3 15 225 1500
3 4 30 600 2000
3 3 NULL NULL 1500

Move on. If the aggregate function returns NULL (for example, all employees do not have the Salary value), or there are no records in the selection, and in the report, for this case we need to show 0, then the ISNULL function can wrap the aggregate expression:

SELECT SUM(Salary), AVG(Salary), -- handle total with ISNULL ISNULL(SUM(Salary),0), ISNULL(AVG(Salary),0) FROM Employees WHERE DepartmentID=10 -- non-existent department specified here so that the query returns no records

(No column name) (No column name) (No column name) (No column name)
NULL NULL 0 0

I think it's very important to understand the purpose of each aggregate function and how they calculate, because in SQL, it is the main tool used to calculate totals.

In this case, we have considered how each aggregate function behaves independently, i.e. it applied to the values ​​of the entire record set returned by the SELECT command. Next, we'll look at how these same functions are used to calculate group totals using the GROUP BY clause.

GROUP BY - grouping data

Before that, we already calculated totals for a specific department, something like this:

SELECT COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 -- data for IT department only

Now imagine that we were asked to get the same numbers in the context of each department. Of course we can roll up our sleeves and run the same request for each department. So, said and done, we write 4 queries:

SELECT "Administration" Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=1 -- Administration data SELECT "Accounting" Info, COUNT(DISTINCT PositionID) PositionCount, COUNT( *) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=2 -- Accounting data SELECT "IT" Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 -- IT department data SELECT "Other" Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID IS NULL -- and don't forget freelance data

As a result, we will get 4 data sets:

Please note that we can use fields specified as constants - "Administration", "Accounting", ...

In general, we obtained all the figures that we were asked about, combine everything in Excel and give it to the director.

The director liked the report, and he says: "And add another column with information on the average salary." And as always, this needs to be done very urgently.

Hmm, what to do?! In addition, imagine that we have not 3 departments, but 15.

This is exactly what the GROUP BY construct serves for such cases:

SELECT DepartmentID, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg -- plus do the director's wish FROM Employees GROUP BY DepartmentID

DepartmentID PositionCount EmplCount SalaryAmount SalaryAvg
NULL 0 1 2000 2000
1 1 1 5000 5000
2 1 1 2500 2500
3 2 3 5000 1666.66666666667

We received all the same data, but now using only one request!

For now, do not pay attention to the fact that our departments are displayed in the form of numbers, then we will learn how to display everything beautifully.

In the GROUP BY clause, you can specify several fields "GROUP BY field1, field2, ..., fieldN", in this case, grouping will occur in groups that form the values ​​of these fields "field1, field2, ..., fieldN".

For example, let's group the data in the context of Departments and Positions:

SELECT DepartmentID,PositionID, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID,PositionID

After that, a run is made through each combination and aggregate functions are calculated:

SELECT COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID IS NULL AND PositionID IS NULL SELECT COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=1 AND PositionID=2 -- ... SELECT COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 AND PositionID=4

And then all these results are combined together and given to us as one set:

From the main one, it is worth noting that in the case of grouping (GROUP BY), in the list of columns in the SELECT block:

  • We can only use the columns listed in the GROUP BY block
  • You can use expressions with fields from the GROUP BY block
  • You can use constants, because they do not affect the grouping result
  • All other fields (not listed in the GROUP BY block) can only be used with aggregate functions (COUNT, SUM, MIN, MAX, ...)
  • It is not necessary to list all columns from the GROUP BY block in the SELECT column list

And a demo of it all:

SELECT "String constant" Const1, -- constant as a string 1 Const2, -- constant as a number -- expression using the fields involved in the grouping CONCAT("Department № ",DepartmentID) ConstAndGroupField, CONCAT("Department № ",DepartmentID ,", Position № ",PositionID) ConstAndGroupFields, DepartmentID, -- field from the list of fields participating in the grouping -- PositionID, -- field participating in the grouping, it is not necessary to duplicate here COUNT(*) EmplCount, -- number of rows in per group -- other fields can only be used with aggregate functions: COUNT, SUM, MIN, MAX, … SUM(Salary) SalaryAmount, MIN(ID) MinID FROM Employees GROUP BY DepartmentID,PositionID -- grouping by DepartmentID,PositionID fields

It is also worth noting that grouping can be done not only by fields, but also by expressions. For example, let's group the data by employees, by year of birth:

SELECT CONCAT("Year of birth - ",YEAR(Birthday)) YearOfBirthday, COUNT(*) EmplCount FROM Employees GROUP BY YEAR(Birthday)

Let's look at an example with a more complex expression. For example, we get the gradation of employees by year of birth:

SELECT CASE WHEN YEAR(Birthday)>=2000 THEN "from 2000" WHEN YEAR(Birthday)>=1990 THEN "1999-1990" WHEN YEAR(Birthday)>=1980 THEN "1989-1980" WHEN YEAR(Birthday)>= 1970 THEN "1979-1970" WHEN Birthday IS NOT NULL THEN "before 1970" ELSE "not specified" END RangeName, COUNT(*) EmplCount FROM Employees GROUP BY CASE WHEN YEAR(Birthday)>=2000 THEN "from 2000" WHEN YEAR (Birthday)>=1990 THEN "1999-1990" WHEN YEAR(Birthday)>=1980 THEN "1989-1980" WHEN YEAR(Birthday)>=1970 THEN "1979-1970" WHEN Birthday IS NOT NULL THEN "before 1970" ELSE "not specified" END

RangeName EmplCount
1979-1970 1
1989-1980 2
not specified 2
before 1970 1

Those. in this case, the grouping is done according to the CASE-expression previously calculated for each employee:

SELECT ID, CASE WHEN YEAR(Birthday)>=2000 THEN "from 2000" WHEN YEAR(Birthday)>=1990 THEN "1999-1990" WHEN YEAR(Birthday)>=1980 THEN "1989-1980" WHEN YEAR(Birthday) >=1970 THEN "1979-1970" WHEN Birthday IS NOT NULL THEN "before 1970" ELSE "not specified" END FROM Employees

And of course, you can combine expressions with fields in the GROUP BY block:

SELECT DepartmentID, CONCAT("Birth year - ",YEAR(Birthday)) YearOfBirthday, COUNT(*) EmplCount FROM Employees GROUP BY YEAR(Birthday),DepartmentID -- the order may not match the order they are used in the SELECT ORDER BY DepartmentID block, YearOfBirthday -- finally we can sort the result

Let's return to our original task. As we already know, the director really liked the report, and he asked us to do it weekly so that he could monitor changes in the company. In order not to interrupt each time in Excel the digital value of the department with its name, we will use the knowledge that we already have and improve our query:

SELECT CASE DepartmentID WHEN 1 THEN "Administration" WHEN 2 THEN "Accounting" WHEN 3 THEN "IT" ELSE "Other" END Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary ) SalaryAvg -- plus we fulfill the director's wish FROM Employees GROUP BY DepartmentID ORDER BY Info -- add sorting by the Info column for more convenience

Although it may look scary from the outside, it is still better than it was originally. The disadvantage is that if a new department and its employees are started, then we will need to add the CASE expression so that the employees of the new department do not fall into the “Other” group.

But nothing, over time, we will learn how to do everything beautifully so that our selection does not depend on the appearance of new data in the database, but is dynamic. I'll jump ahead a bit to show what kind of queries we're aiming to get into writing:

SELECT ISNULL(dep.Name,"Others") DepName, COUNT(DISTINCT emp.PositionID) PositionCount, COUNT(*) EmplCount, SUM(emp.Salary) SalaryAmount, AVG(emp.Salary) SalaryAvg -- plus fulfill director's wish FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID GROUP BY emp.DepartmentID,dep.Name ORDER BY DepName

In general, do not worry - everyone started with a simple one. For now, you just need to understand the essence of the GROUP BY construct.

Finally, let's see how you can build summary reports using GROUP BY.

For example, let's display a pivot table, broken down by departments, so that the total wages received by employees by position are calculated:

SELECT DepartmentID, SUM(CASE WHEN PositionID=1 THEN Salary END) [Accountants], SUM(CASE WHEN PositionID=2 THEN Salary END) [Directors], SUM(CASE WHEN PositionID=3 THEN Salary END) [Programmers], SUM( CASE WHEN PositionID=4 THEN Salary END) [Senior Programmers], SUM(Salary) [Department Total] FROM Employees GROUP BY DepartmentID

Those. we are free to use any expressions within aggregate functions.

Of course, you can also rewrite it using IIF:

SELECT DepartmentID, SUM(IIF(PositionID=1,Salary,NULL)) [Accountant], SUM(IIF(PositionID=2,Salary,NULL)) [Director], SUM(IIF(PositionID=3,Salary,NULL)) [Programmers], SUM(IIF(PositionID=4,Salary,NULL)) [Senior Programmers], SUM(Salary) [Department Total] FROM Employees GROUP BY DepartmentID

But in the case of IIF, we will have to explicitly specify NULL, which is returned if the condition is not met.

In similar cases, I like to use CASE without an ELSE block rather than write NULL again. But this is of course a matter of taste, which is not disputed.

And let's remember that NULL values ​​are not taken into account in aggregate functions during aggregation.

To consolidate, do an independent analysis of the received data on a detailed request:

SELECT DepartmentID, CASE WHEN PositionID=1 THEN Salary END [Accountant], CASE WHEN PositionID=2 THEN Salary END [Directors], CASE WHEN PositionID=3 THEN Salary END [Programmers], CASE WHEN PositionID=4 THEN Salary END [Senior Programmers] ], Salary [Total by Department] FROM Employees

DepartmentID accountant Directors Programmers senior programmers Total by department
1 NULL 5000 NULL NULL 5000
3 NULL NULL 1500 NULL 1500
2 2500 NULL NULL NULL 2500
3 NULL NULL NULL 2000 2000
3 NULL NULL 1500 NULL 1500
NULL NULL NULL NULL NULL 2000

And let's also remember that if instead of NULL we want to see zeros, then we can process the value returned by the aggregate function. For example:

SELECT DepartmentID, ISNULL(SUM(IIF(PositionID=1,Salary,NULL)),0) [Accountant], ISNULL(SUM(IIF(PositionID=2,Salary,NULL)),0) [Director], ISNULL(SUM (IIF(PositionID=3,Salary,NULL)),0) [Programmers], ISNULL(SUM(IIF(PositionID=4,Salary,NULL)),0) [Senior Programmers], ISNULL(SUM(Salary),0 ) [Total by Department] FROM Employees GROUP BY DepartmentID

Now for practice purposes, you can:

  • output department names instead of their IDs, for example by adding a CASE statement handling DepartmentID in a SELECT block
  • add sorting by department name using ORDER BY

GROUP BY, in conjunction with aggregate functions, is one of the main tools used to obtain summary data from the database, because data is usually used in this form, because. usually we are required to provide summary reports, not detailed data (sheets). And of course, it all revolves around knowing the basic design, because. before you summarize (aggregate) something, you first need to select it correctly using "SELECT ... WHERE ...".

Practice plays an important role here, so if you set a goal to understand the SQL language, not to study, but to understand - practice, practice and practice, sorting through the most different options that you can think of.

At the initial stages, if you are not sure about the correctness of the aggregated data obtained, make a detailed selection, including all the values ​​that are being aggregated. And check the correctness of the calculations manually according to these detailed data. In this case, the use of Excel can be very helpful.

Let's say you've reached this point

Let's say that you are an accountant Sidorov S.S., who decided to learn how to write SELECT queries.
Let's say that you have already read this tutorial up to this point, and you are already confidently using all of the above basic constructions, i.e. you can:
  • Select detailed data by WHERE clause from one table
  • Know how to use aggregate functions and grouping from one table
Since at work they thought that you already knew everything, you were given access to the database (and this sometimes happens), and now you have developed and are pulling out the same weekly report for the director.

Yes, but they didn't take into account that you still don't know how to build queries from several tables, but only from one, i.e. you can't do something like this:

SELECT emp.*, -- return all fields of the Employees table dep.Name DepartmentName, -- add the Name field from the Departments table pos.Name PositionName to these fields -- and also add the Name field from the Positions table FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID LEFT JOIN Positions pos ON emp.PositionID=pos.ID

Despite the fact that you do not know how to do this, believe me, you have done well, and have already achieved so much.

So, how can you use your current knowledge and get even more productive results at the same time?! Let's use the power of the collective mind - we go to the programmers who work for you, i.e. to Andreev A.A., Petrov P.P. or Nikolaev N.N., and ask one of them to write a view for you (VIEW or just "View", so they even, I think, will understand you faster), which, in addition to the main fields from the Employees table, will also return fields with “Department name” and “Position name”, which you so lack now for the weekly report that Ivanov I.I. loaded you with.

Because you explained everything correctly, then IT specialists immediately understood what they wanted from them and created, especially for you, a view called ViewEmployeesInfo.

We imagine that you do not see the following command, because IT people do it:

CREATE VIEW ViewEmployeesInfo AS SELECT emp.*, -- return all fields of the Employees table dep.Name DepartmentName, -- add the Name field from the Departments table pos.Name PositionName to these fields -- and also add the Name field from the Positions table FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID LEFT JOIN Positions pos ON emp.PositionID=pos.ID

Those. for you, all this, so far scary and incomprehensible, text remains behind the scenes, and IT people give you only the name of the ViewEmployeesInfo view, which returns all the above data (that is, what you asked them for).

You can now work with this view as with a regular table:

SELECT * FROM ViewEmployeesInfo

Because now all the data necessary for the report is in one “table” (a la view), then you can easily redo your weekly report:

SELECT DepartmentName, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg FROM ViewEmployeesInfo emp GROUP BY DepartmentID,DepartmentName ORDER BY DepartmentName

Now all the names of the departments are in place, plus the query has become dynamic, and will change when new departments and their employees are added, i.e. now you do not need to redo anything, but it is enough to execute the request once a week and give its result to the director.

Those. for you in this case, as if nothing has changed, you continue to work with one table in the same way (only it’s more correct to say with the ViewEmployeesInfo view), which returns all the data you need. Thanks to the help of IT people, the details of obtaining DepartmentName and PositionName have been left in a black box for you. Those. the view looks just like a regular table to you, think of it as an extended version of the Employees table.

For example, let's also form a statement so that you make sure that everything is valid as I said (that the entire selection comes from one view):

SELECT ID, Name, Salary FROM ViewEmployeesInfo WHERE Salary IS NOT NULL AND Salary>0 ORDER BY Name

I hope you understand this request.

Using views in some cases makes it possible to significantly expand the boundaries of users who know how to write basic SELECT queries. In this case, the view is a flat table with all the data the user needs (for those who understand OLAP, this can be compared with an approximate similarity of an OLAP cube with facts and dimensions).

Clipping from Wikipedia. Although SQL was conceived as an end user tool, it eventually became so complex that it became a programmer's tool.

As you can see, dear users, the SQL language was originally conceived as a tool for you. So, everything is in your hands and desire, do not let go.

HAVING - applying a selection condition to grouped data

Actually, if you understand what grouping is, then there is nothing complicated with HAVING. HAVING - somewhat similar to WHERE, only if the WHERE condition is applied to the detailed data, then the HAVING condition is applied to the already grouped data. For this reason, in the conditions of the HAVING block, we can use either expressions with fields included in the grouping, or expressions enclosed in aggregate functions.

Consider an example:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID HAVING SUM(Salary)>3000

DepartmentID SalaryAmount
1 5000
3 5000

Those. this query returned to us grouped data only for those departments whose total salary of all employees exceeds 3000, i.e. "SUM(Salary)>3000".

Those. here, first of all, grouping occurs and data is calculated for all departments:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID -- 1. get grouped data across all departments

And already the condition specified in the HAVING block is applied to this data:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID -- 1. get grouped data for all departments HAVING SUM(Salary)>3000 -- 2. condition for filtering grouped data

In the HAVING condition, you can also build complex conditions using the AND, OR and NOT operators:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID HAVING SUM(Salary)>3000 AND COUNT(*)<2 -- и число людей меньше 2-х

As you can see here, an aggregate function (see "COUNT(*)") can only be specified in a HAVING block.

Accordingly, we can display only the number of the department that falls under the HAVING condition:

SELECT DepartmentID FROM Employees GROUP BY DepartmentID HAVING SUM(Salary)>3000 AND COUNT(*)<2 -- и число людей меньше 2-х

An example of using a HAVING condition on a field included in GROUP BY:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID -- 1. make a grouping HAVING DepartmentID=3 -- 2. apply a filter on the grouping result

This is just an example, because in this case, it would be more logical to check through the WHERE condition:

SELECT DepartmentID, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 -- 1. filter detailed data GROUP BY DepartmentID -- 2. group only by selected records

Those. first filter employees by department 3, and only then do the calculation.

Note. In fact, even though these two queries look different, the DBMS optimizer can execute them in the same way.

I think this is the end of the story about HAVING conditions.

Summing up

Let's summarize the data obtained in the second and third parts and consider the specific location of each structure we studied and indicate the order in which they are performed:
Structure/Block Order of execution Executable function
SELECT return expressions 4 Returning the data received by the request
FROM source 0 In our case, this is so far all the rows of the table
WHERE condition for fetching from source 1 Only rows matching the condition are selected
GROUP BY grouping expressions 2 Create groups based on the specified grouping expression. Calculation of aggregated values ​​for these groups used in SELECT or HAVING blocks
HAVING filter by grouped data 3 Filtering applied to grouped data
ORDER BY result sort expression 5 Sort data by specified expression

Of course, you can also apply the DISTINCT and TOP clauses learned in Part 2 to grouped data.

These suggestions in this case will apply to the final result:

SELECT TOP 1 -- 6. apply last SUM(Salary) SalaryAmount FROM Employees GROUP BY DepartmentID HAVING SUM(Salary)>3000 ORDER BY DepartmentID -- 5. sort result

How did you get these results, analyze for yourself.

Conclusion

The main goal that I set in this part is to reveal to you the essence of aggregate functions and groupings.

If the basic design allowed us to obtain the necessary detailed data, then the use of aggregate functions and groupings on these detailed data made it possible for us to obtain summary data on them. So, as you can see, everything is important here, because. one relies on the other - without knowing the basic design, we will not be able, for example, to correctly select the data for which we need to calculate the results.

Here I deliberately try to show only the basics in order to focus beginners on the most important constructions and not overload them with unnecessary information. A solid understanding of the basic constructions (which I will continue to talk about in subsequent parts) will give you the opportunity to solve almost any problem of retrieving data from RDB. The basic constructions of the SELECT statement are applicable in the same form in almost all DBMS (the differences are mainly in the details, for example, in the implementation of functions - for working with strings, time, etc.).

In the future, a solid knowledge of the database will give you the opportunity to easily learn various extensions of the SQL language, such as:

  • GROUP BY ROLLUP(…), GROUP BY GROUPING SETS(…), …
  • PIVOT, UNPIVOT
  • etc.
For the purpose of this tutorial, I decided not to talk about these extensions, because. and without their knowledge, knowing only the basic structures of the SQL language, you can solve a very wide range of problems. Extensions of the SQL language, in fact, serve to solve a certain range of tasks, i.e. allow you to solve a problem of a certain class more elegantly (but not always more efficient in terms of speed or resources spent).

If you are taking your first steps in SQL, then first of all, focus on learning the basic structures, because. owning the base, everything else will be much easier for you to understand, and besides, on your own. First of all, you need to understand in depth the possibilities of the SQL language, i.e. what kind of operations it generally allows to perform on the data. To convey information to beginners in volumetric form is another reason why I will show only the most important (iron) structures.

Good luck with learning and understanding the SQL language.

Part four -

Team CASE allows you to select for one from multiple command sequences. This construct has been present in the SQL standard since 1992, although it was not supported in Oracle SQL until Oracle8i, and in PL/SQL until Oracle9i Release 1. Starting with this version, the following varieties of CASE commands are supported:

  • simple command CASE - Associates one or more sequences of PL/SQL commands with corresponding values ​​(the sequence to be executed is selected based on the result of evaluating an expression that returns one of the values).
  • Search team CASE - selects one or more sequences of commands to be executed, depending on the results of checking a list of boolean values. The sequence of commands associated with the first condition that evaluates to TRUE is executed.

NULL or UNKNOWN?

In the article on the IF statement, you may have learned that the result of a boolean expression can be TRUE , FALSE , or NULL .

In PL/SQL, this statement is true, but in the broader context of relational theory, it is considered incorrect to talk about returning NULL from a boolean expression. Relational theory says that a comparison with NULL is of the following form:

2 < NULL

yields the logical result UNKNOWN , and UNKNOWN is not equivalent to NULL . However, don't worry too much about PL/SQL's use of NULL for UNKNOWN . However, you should be aware that the third value in 3-valued logic is UNKNOWN . And I hope you never get screwed (like I did!) by using the wrong term when discussing three-valued logic with relational experts.

In addition to CASE commands, PL/SQL also supports CASE statements. Such an expression is very similar to the CASE command, it allows you to select one or more expressions to evaluate. The result of a CASE expression is a single value, while the result of a CASE command is the execution of a sequence of PL/SQL commands.

Simple CASE Commands

The simple CASE command allows you to select one of several sequences of PL/SQL commands to execute, depending on the result of the evaluation of an expression. It is written as follows:

CASE expression WHEN result_1 THEN command_1 WHEN result_2 THEN command_2 ... ELSE command_else END CASE;

The ELSE branch is optional here. When executing such a command, PL/SQL first evaluates the expression, after which the result is compared with result_1 . If they match, then commands_1 are executed. Otherwise, result_2 is checked, and so on.

Here is an example of a simple CASE command, in which the bonus is calculated depending on the value of the employee_type variable:

CASE employee_type WHEN "S" THEN award_salary_bonus(employee_id); WHEN "H" THEN award_hourly_bonus(employee_id); WHEN "C" THEN award_commissioned_bonus(employee_id); ELSE RAISE invalid_employee_type; END CASE;

In this example, there is an explicit ELSE clause, but in general it is not required. Without the ELSE section, the PL/SQL compiler implicitly substitutes code like this:

ELSE RAISE CASE_NOT_FOUND;

In other words, if you don't specify the ELSE keyword, and if none of the results in the WHEN clauses match the result of the expression in the CASE command, PL/SQL throws a CASE_NOT_FOUND exception. This is the difference between this command and IF . When the IF command does not contain the ELSE keyword, nothing happens if the condition is not met, while in the CASE command, a similar situation results in an error.

It is interesting to see how to implement the bonus logic described at the beginning of the chapter using a simple CASE command. At first glance, this seems impossible, but by approaching the matter creatively, we arrive at the following solution:

CASE TRUE WHEN salary >= 10000 AND salary<=20000 THEN give_bonus(employee_id, 1500); WHEN salary >20000 AND salary<= 40000 THEN give_bonus(employee_id, 1000); WHEN salary >40000 THEN give_bonus(employee_id, 500); ELSE give_bonus(employee_id, 0); END CASE;

The important thing here is that the expression and result elements can either be scalar values ​​or expressions that evaluate to scalar values.

Returning to the IF...THEN...ELSIF command, which implements the same logic, you will see that the ELSE section is defined in the CASE command, while the ELSE keyword is absent in the IF-THEN-ELSIF command. The reason for adding ELSE is simple: if none of the bonus conditions are met, the IF instruction does nothing and the bonus is zero. The CASE command throws an error in this case, so the situation with a zero premium has to be programmed explicitly.

To prevent CASE_NOT_FOUND errors, ensure that any value of the expression being tested will satisfy at least one of the conditions.

The above CASE TRUE command may sound like a gimmick to some, but it's actually just an implementation of the CASE search command, which we'll talk about in the next section.

CASE search team

The CASE search command examines a list of boolean expressions; upon encountering an expression equal to TRUE , it executes the sequence of commands associated with it. In essence, the CASE search command is the equivalent of the CASE TRUE command, an example of which was given in the previous section. The CASE search command has the following notation:

CASE WHEN expression_1 THEN command_1 WHEN expression_2 THEN command_2 ... ELSE command_else END CASE; It is ideal for implementing bonus logic: CASE WHEN salary >= 10000 AND salary<=20000 THEN give_bonus(employee_id, 1500); WHEN salary >20000 AND salary<= 40000 THEN give_bonus(employee_id, 1000); WHEN salary >40000 THEN give_bonus(employee_id, 500); ELSE give_bonus(employee_id, 0); END CASE;

The search command CASE , like the simple command, is subject to the following rules:

  • Command execution ends immediately after the execution of the sequence of executable commands associated with the true expression. If several expressions are true, then the commands associated with the first of them are executed.
  • The ELSE keyword is optional. If it is not set and none of the expressions are TRUE , a CASE_NOT_FOUND exception is thrown.
  • WHEN conditions are checked in a strictly defined order, from beginning to end.

Consider another implementation of the bonus calculation logic, which uses the fact that the WHEN conditions are checked in the order they are written. Individual expressions are simpler, but is it possible to say that the meaning of the entire command has become more understandable?

CASE WHEN salary > 40000 THEN give_bonus(employee_id, 500); WHEN salary > 20000 THEN give_bonus(employee_id, 1000); WHEN salary >= 10000 THEN give_bonus(employee_id, 1500); ELSE give_bonus(employee_id, 0); END CASE;

If an employee's salary is $20,000, then the first two conditions are FALSE and the third is TRUE, so the employee will receive a bonus of $1,500. If the salary is $21,000, then the result of the second condition will be TRUE and the bonus will be $1,000. The execution of the CASE command will end at the second WHEN branch, and the third condition will not even be checked. Whether it is worth using this approach when writing CASE commands is a moot point. Be that as it may, keep in mind that writing such a command is possible, and special care is required when debugging and editing programs in which the result depends on the order of expressions.

The logic that depends on the order of homogeneous branches of WHEN is a potential source of errors that occur when they are rearranged. As an example, consider the following CASE search command, in which, with a salary value of 20,000, the conditions in both WHEN branches evaluate to TRUE:

CASE WHEN salary BETWEEN 10000 AND 20000 THEN give_bonus(employee_id, 1500); WHEN salary BETWEEN 20000 AND 40000 THEN give_bonus(employee_id, 1000); ...

Imagine that the maintainer of this program carelessly rearranges the WHEN branches to order them in descending salary order. Don't turn down this opportunity! Programmers often tend to "finish" perfectly working code, guided by some internal idea of ​​​​order. The CASE command with the WHEN sections rearranged looks like this:

CASE WHEN salary BETWEEN 20000 AND 40000 THEN give_bonus(employee_id, 1000); WHEN salary BETWEEN 10000 AND 20000 THEN give_bonus(employee_id, 1500); ...

At first glance, everything is correct, right? Unfortunately, due to the overlap of the two WHEN branches, an insidious bug appears in the program. Now an employee with a salary of 20,000 will receive a bonus of 1,000 instead of the 1,500 expected. It may be desirable in some situations to overlap between WHEN branches, but should still be avoided if possible. Always remember that the order of the branches is important, and resist the urge to improve the already working code - "don't fix what's not broken".

Because the WHEN conditions are evaluated in order, you can make your code a little more efficient by putting branches with the most likely conditions at the top of the list. Also, if you have a branch with "expensive" expressions (for example, requiring significant CPU time and memory), you can put them at the end to minimize the chance of checking them out. See the Nested IF Commands section for details.

CASE search commands are used when the commands to be executed are determined by a set of logical expressions. The simple CASE command is used when a decision is made based on the result of a single expression.

Nested CASE Commands

CASE commands, like IF commands, can be nested. For example, a nested CASE statement is present in the following (rather convoluted) implementation of the bonus logic:

CASE WHEN salary >= 10000 THEN CASE WHEN salary<= 20000 THEN give_bonus(employee_id, 1500); WHEN salary >40000 THEN give_bonus(employee_id, 500); WHEN salary > 20000 THEN give_bonus(employee_id, 1000); END CASE; WHEN salary< 10000 THEN give_bonus(employee_id,0); END CASE;

Any command can be used in a CASE command, so the internal CASE command is easily replaced by an IF command. Similarly, any command can be nested in an IF command, including CASE .

CASE expressions

CASE expressions solve the same problem as CASE commands, only not for executable commands, but for expressions. A simple CASE expression selects one of several expressions to evaluate based on a given scalar value. The CASE search expression sequentially evaluates the expressions in the list until one of them evaluates to TRUE , and then returns the result of its associated expression.

The syntax for these two varieties of CASE expressions is:

Simple_Expression_Case:= CASE expression WHEN result_1 THEN result_expression_1 WHEN result_2 THEN result_expression_2 ... ELSE result_expression_else END; Search_expression_Case:= CASE WHEN expression_1 THEN result_expression_1 WHEN expression_2 THEN result_expression_2 ... ELSE result_expression_else END;

The CASE expression returns a single value, the result of the expression selected to be evaluated. Each branch of WHEN must be assigned one resulting expression (but not a command). Neither a semicolon nor END CASE is placed at the end of a CASE expression. The CASE expression ends with the END keyword.

The following is an example of a simple CASE expression used in conjunction with the PUT_LINE procedure of the DBMS_OUTPUT package to display the value of a boolean variable.
(Recall that the PUT_LINE program does not directly support Boolean types.) In this example, the CASE expression converts a Boolean value to a character string, which is then output by the PUT_LINE procedure:

DECLARE boolean_true BOOLEAN:= TRUE; boolean_false BOOLEAN:= FALSE; boolean_null BOOLEAN; FUNCTION boolean_to_varchar2 (flag IN BOOLEAN) RETURN VARCHAR2 IS BEGIN RETURN CASE flag WHEN TRUE THEN "True" WHEN FALSE THEN "False" ELSE "NULL" END; END; BEGIN DBMS_OUTPUT.PUT_LINE(boolean_to_varchar2(boolean_true)); DBMS_OUTPUT.PUT_LINE(boolean_to_varchar2(boolean_false)); DBMS_OUTPUT.PUT_LINE(boolean_to_varchar2(boolean_null)); END;

To implement the bonus calculation logic, you can use the CASE search expression, which returns the bonus value for a given salary:

DECLARE salary NUMBER:= 20000; employee_id NUMBER:= 36325; PROCEDURE give_bonus (emp_id IN NUMBER, bonus_amt IN NUMBER) IS BEGIN DBMS_OUTPUT.PUT_LINE(emp_id); DBMS_OUTPUT.PUT_LINE(bonus_amt); END; BEGIN give_bonus(employee_id, CASE WHEN salary >= 10000 AND salary<= 20000 THEN 1500 WHEN salary >20000 AND salary<= 40000 THEN 1000 WHEN salary >40000 THEN 500 ELSE 0 END); END;

The CASE expression can be used wherever expressions of any other type are allowed. The following example uses a CASE expression to calculate the bonus amount, multiplies it by 10, and assigns the result to a variable that is displayed by DBMS_OUTPUT:

DECLARE salary NUMBER:= 20000; employee_id NUMBER:= 36325; bonus_amount NUMBER; BEGIN bonus_amount:= CASE WHEN salary >= 10000 AND salary<= 20000 THEN 1500 WHEN salary >20000 AND salary<= 40000 THEN 1000 WHEN salary >40000 THEN 500 ELSE 0 END * 10; DBMS_OUTPUT.PUT_LINE(bonus_amount); END;

Unlike the CASE command, if no WHEN branch condition is met, the CASE expression does not throw an error, but simply returns NULL .

In this syntax, Oracle compares the input expression (e) to each comparison expression e1, e2, …, en.

If the input expression equals any comparison expression, the CASE expression returns the corresponding result expression (r).

If the input expression e does not match any comparison expression, the CASE expression returns the expression in the ELSE clause if the ELSE clause exists, otherwise, it returns a null value.

Oracle uses short-circuit evaluation for the simple CASE expression. It means that Oracle evaluates each comparison expression (e1, e2, .. en) only before comparing one of them with the input expression (e). Oracle does not evaluate all comparison expressions before comparing any of them with the expression (e). As a result, Oracle never evaluates a comparison expression if a previous one equals the input expression (e).

Simple CASE expression example

We will use the products table in the for the demonstration.

The following query uses the CASE expression to calculate the discount for each product category i.e., CPU 5%, video card 10%, and other product categories 8%

SELECT

CASE category_id

WHEN 1

THEN ROUND (list_price*0.05,2)-- CPU

WHEN 2

THEN ROUND (List_price*0.1,2)-- Video Card

ELSE ROUND (list_price*0.08,2)-- other categories

END discount

FROM

ORDER BY

Note that we used the ROUND () function to round the discount to two decimal places.

Searched CASE expression

The Oracle searched CASE expression evaluates a list of Boolean expressions to determine the result.

The searched CASE statement has the following syntax:

CASE

WHEN e1THEN r1

WHEN condition_1 THEN result_1 WHEN condition_2 THEN result_2 ... WHEN condition_n THEN result_n ELSE result END

Parameters or Arguments

expression optional. It is the value that you are comparing to the list of conditions. (ie: condition_1, condition_2, ... condition_n) condition_1, condition_2, ... condition_n The conditions that must all be the same datatype. The conditions are evaluated in the order listed. Once a condition is found to be true, the CASE statement will return the result and not evaluate the conditions any further. result_1, result_2, ... result_n Results that must all be the same datatype. This is the value returned once a condition is found to be true.

returns

The CASE statement returns any datatype such as a string, numeric, date, etc. (BUT all results must be the same datatype in the CASE statement.)
If all conditions are not the same datatype, an ORA-00932 error will be returned.
If all results are not the same datatype, an ORA-00932 error will be returned.
if no condition is found to be true, then the CASE statement will return the value in the ELSE clause.
If the ELSE clause is omitted and no condition is found to be true, then the CASE statement will return NULL.

note

  • You can have up to 255 comparisons in a CASE statement. Each WHEN ... THEN clause is considered 2 comparisons.

Applies To

The CASE statement can be used in the following versions of Oracle/PLSQL:

  • Oracle 12c, Oracle 11g, Oracle 10g, Oracle 9i

example

The CASE statement can be used in Oracle/PLSQL.

You could use the CASE statement in a SQL statement as follows: (includes the expression clause)

SELECT table_name, CASE owner WHEN "SYS" THEN "The owner is SYS" WHEN "SYSTEM" THEN "The owner is SYSTEM" ELSE "The owner is another value" END FROM all_tables;

Or you could write the SQL statement using the CASE statement like this: (omits the expression clause)

SELECT table_name, CASE WHEN owner="SYS" THEN "The owner is SYS" WHEN owner="SYSTEM" THEN "The owner is SYSTEM" ELSE "The owner is another value" END FROM all_tables;

The above two CASE statements are equivalent to the following IF-THEN-ELSE statement:

IF owner = "SYS" THEN result:= "The owner is SYS"; ELSIF owner = "SYSTEM" THEN result:= "The owner is SYSTEM""; ELSE result:= "The owner is another value"; END IF;

The CASE statement will compare each owner value, one by one.

One thing to note is that the ELSE clause within the CASE statement is optional. You could have omitted it. Let's look at the SQL statement above with the ELSE clause omitted.

Your SQL statement would look as follows:

SELECT table_name, CASE owner WHEN "SYS" THEN "The owner is SYS" WHEN "SYSTEM" THEN "The owner is SYSTEM" END FROM all_tables;

With the ELSE clause omitted, if no condition was found to be true, the CASE statement would return NULL.

Comparing 2 Conditions

Here is an example that demonstrates how to use the CASE statement to compare different conditions:

SELECT CASE WHEN a< b THEN "hello" WHEN d < e THEN "goodbye" END FROM suppliers;

Frequently Asked Questions

Question: Can you create a CASE statement that evaluates two different fields? I want to return a value based on the combinations in two different fields.

Answer: Yes, below is an example of a case statement that evaluates two different fields.

SELECT supplier_id, CASE WHEN supplier_name = "IBM" and supplier_type = "Hardware" THEN "North office" WHEN supplier_name = "IBM" and supplier_type = "Software" THEN "South office" END FROM suppliers;

So if supplier_name field is IBM and the supplier_type field is Hardware, then the CASE statement will return north office. If the supplier_name field is IBM and the supplier_type is Software, the CASE statement will return south office.

mob_info