Skip to main content

Use Cases for SQL’s CASE statement

Have you ever wanted to conditionally do some processing in an SQL SELECT statement? I‘m sure you have, as this isn’t really an unusual situation. In RPG and other languages, we have the IF or SELECT statements for this. In an SQL procedure, you can also use IF. But in a SELECT statement, the way to conditionally test and perform options, is the CASE statement. 

There are a couple basic formats of the CASE statement: 

Case    when <conditional 1> then <statement 1> 

            when <conditional 2> then <statement 2> 

            else   <statement 3> 

End

Format 1.

Case    <variable or expression>

            when <Value 1> then <statement 1> 

            when <Value 2> then <statement 2> 

            else   <statement 3> 

End

Format 2.

The conditionalexpression or variable portions in the above formats, enable you to use whichever is most convenient to your processing. A complex predicate or calculation can be substituted for the conditional and then tested against a value to influence which statement will be executed. And, statement can also be a simple or complex calculation. 

In Figure 1, we see examples of inserting both Format 1 and 2 CASE phrases into a select statement to determine a column value and create a new description column. Section A shows ACCTYPE field used repeatedly while B shows it used just once. The key here, is that we wished to test the same value on each WHEN, so format B provides a shortcut. When these Select statements are run, each CASE performs the same conditional processing, placing the result into a column named Status.

Inserting Format 1 and 2 CASE phrases into a select statement

Figure 1

Other Formats and Uses

Another situation for CASE could be to test the validity of a value for a date in a numeric column where the date may contain special meaningful values. 

Suppose in your ERP system, a numeric date which hasn’t yet been entered, is represented as ZERO or perhaps a value like 9999-99-99 represents a future but as yet undetermined date. 

The DATE data type provides additional flexibility, so you might wish to convert the column to a date. The SQL to handle this could look as follows, employing CASE to first check the content for special values, and then create a valid DATE column containing either an actual date or null. Other values such as 0001-01-01 or 9999-12-31 are valid dates, could visually signal NO DATE or FUTURE DATA to users, but carry a risk of large results when used in date arithmetic. You’ll need to decide what best suits your site or circumstance. SQL will reduce to place an invalid date in a converted column. 

In Figure 2, the database column XOSDAT (ship date) can be either 0 or 99999999 which mean undefined and future respectively. Neither of these values can be placed in true DATE column. Therefore CASE, detecting those special values, sets those to null and 9999-12-31. When it isn’t a special value, the SUBSTR function reformats and combines the date parts into a syntax usable within the DATE keyword. The resulting SHIPDT_REALDATE column can be used in date arithmetic or decomposed into constituent parts YEAR, MONTH, etc. 

Notice the final date, 20210431 is an invalid date because April contains only 30-days. Therefore, the converted date came out NULL. Checking for anomalous dates and fixing them could be a topic for another article, but I suggest reading up on Db2 for i date functions, such as TO_DATE, LAST_DAY which may be helpful in manipulating or validating dates.

database column XOSDAT (ship date)
Figure 2

Suppose the logic you require needs more IF/THEN logic than can be performed in a single WHEN clause? Multiple CASE statements can be used together or they can be nested if need be. Figure 3 shows data from an SQL statement containing multiple CASE statements used in tandem to provide additional information in the Status columns. For this one, it’s helpful to see the result first and then review the SQL code in Figure 4.

Figure 3Figure 3

Figure 4Figure 4

We need to pull Figure 4 apart to see what’s going on. We wanted to be able to convert a code to a meaningful description, for Active and Open Status, identify how many years they’ve been in that status, and list that time period as part of the text. We wished to differentiate between Accounts this year or prior. 

By nesting another case under the first case, at points A and B we can differentiate between this year and a prior year.  At C, by concatenating ( | | or CONCAT) the prior case information to the result of the number of years calculation, we can add that numeric value into the string we’ve created, forming the final result for Status. Finally at D, we’re using a CASE with various calculations and functions on each WHEN, to determine the relationship between two dates and then show a result as the Time since last activity column. 

While the above tutorials on CASE include some made up examples, we have shown various implementations of the phrasing which may aid your future reporting. 

Using CASE in a Reporting Example

Next, we’ll consider how CASE logic could be used within an SQL procedure, to provide varying data to the consumer based on parameters passed at runtime. A SQL procedure can utilize hardcoded SQL statements defined by a DECLARE, or SQL statements constructed on the fly. The advantage of building something on the fly is the added flexibility, and the ability to produce different reports, based on passed parameters, giving more utility to consumers than a single report. As an example, Figure 5 shows a snapshot of a Yahoo Finance web page for an individual stock. This page has information about the stock plus a chart section dedicated to performance over time. 

Figure 5
Figure 5
 

Let’s consider just the chart aspect of that page for the moment. 

In Figure 5a, on the left in red, we see the chart of stock price during a single day. The light green shaded area, at the top of that panel, contains links for 1D, 5D, 1M etc. Initially, 1D was selected at the top, so the chart depicts one-day performance. In this case, the stock was dropping, hence the red. The Yahoo pages allow you to change the view by clicking a duration you’d like to see. By switching from the 1D to the 5D, we produce the green chart on the right, which shows a five-day gain. Viewing the two, we discern a different trend than what the single-day picture showed. This is informative, and the technique allowing changes may be one we could mimic to our advantage? 

Chart of stock price during a single day
Figure 5a
 

Suppose we wanted to create a similar application, allowing a business analyst to review sales data and trends. The objective, as in the Yahoo scenario, would be to summarize data for various periods and for stated durations. How could SQL enable this? Using a similar syntax for CASE as we discussed above, we might develop SQL logic such that passing differing parameters allowed us to create and run statements to return varying result sets. We could present such results in textual or graphical formats. A chart similar to the Yahoo example appeals most to me, so I am showing an example of running such charts from a single report definition in Db2 Web Query for i, driven by a stored procedure that massages the data into various shaped results.

In Figures 6, 7, and 8 below, we see examples of such charts. By using parameters for Period and Duration, the results change what’s being shown. Possible periods for my chart include: years, quarters, months, weeks, days. The durations can be a variety of numbers between one and 90. The horizontal axis shows time periods, the vertical axis shows revenue dollars and the size of the bubble at each period plot, indicates profit dollars. The user can change the time scale as seen in the figures, to extract information by period and hovering over the bubble yields tool-tip text 

Four quarters
Figure 6: Four quarters

Six weeksFigure 7: Six weeks

Six months
Figure 8:
Six months

The above example, using on-the-fly variations to adjust the viewpoint, illustrates one of the potential benefits of using CASE in SQL. 

The procedure code to populate the result set, in various ways, is shown in the text area below:

CREATE OR REPLACE PROCEDURE GetordSP (

    IN PPERIOD CHAR(8), 

    IN PDURATION INTEGER)

  DYNAMIC RESULT SETS 1

  LANGUAGE SQL

  SPECIFIC GETORDSP

  NOT DETERMINISTIC

  MODIFIES SQL DATA

  CALLED ON NULL INPUT

  

BEGIN

 DECLARE STMT1 CHAR(2000) DEFAULT ' ';

 DECLARE STMT2 CHAR(2000) DEFAULT

'select char(aaaa1,100) as Heading1, char(aaaa2,100) as Heading2, producttype, xxxx as period , SUM(quantity) AS TOTQTY, SUM(Linetotal) AS TOTRev, SUM(costofgoodssold) AS TOTCost, SUM(Linetotal-costofgoodssold) AS TOTProfit from qwqcent.orders o

 join qwqcent.inventory i on o.productnumber=i.productnumber WHERE shipdate BETWEEN bbbb and eeee GROUP BY  producttype, xxxx order by 1,2,3,4 '

;

 DECLARE STMT3 CHAR(2000);

 DECLARE STMT4 CHAR(2000);

 DECLARE RANGEBGN DATE DEFAULT '0001-01-01';

 DECLARE RANGEEND DATE DEFAULT '2099-12-31';

 DECLARE TEXTMSG CHAR(100);

 DECLARE DATERANGE CHAR(50);

 DECLARE C1 CURSOR FOR S1;

 CASE

  WHEN PPERIOD = 'YEARS' THEN 

    SET STMT1 = ' char( year(shipdate)) ';

    SET TEXTMSG = 'Shipments by Year for ' || TRIM(CHAR(PDURATION)) || ' years';

  WHEN PPERIOD = 'QUARTERS' THEN 

    SET STMT1 = ' char( year(shipdate) || ''-Q'' || quarter(shipdate) ) '  ;

    SET TEXTMSG = 'Shipments by Quarter for ' || TRIM(CHAR(PDURATION)) || ' quarters’;

  WHEN PPERIOD = 'MONTHS' THEN 

    SET STMT1 =  ' concat(year(shipdate) || ''-'' ,  substr(digits(month(shipdate)),9,2)) ‘;

    SET TEXTMSG = 'Shipments by Month for ' || TRIM(CHAR(PDURATION)) || ' months';

  WHEN PPERIOD = 'WEEKS' THEN 

    SET STMT1 =  ' char( concat(year(shipdate) || ''-'' ,  substr(digits(week(shipdate)),9,2))) ';

    SET TEXTMSG = 'Shipments by Week for ' || TRIM(CHAR(PDURATION)) || ' weeks';

  ELSE 

    SET STMT1 = ' char(shipdate) ';

    SET TEXTMSG = 'Shipments by Day for ' || TRIM(CHAR(PDURATION)) || ' days';

 END CASE;

-- determine date range begin and end 

 SET RANGEEND = current date; 


-- figure out the duration backwards by subtracting from current date

set pduration = pduration - 1;

  set  RANGEBGN = 

  case 

    when PPERIOD = 'YEARS'        then current date - PDURATION years

    when PPERIOD = 'QUARTERS'  then current date - (PDURATION*3) months

    when PPERIOD = 'MONTHS'     then current date - PDURATION months

    when PPERIOD = 'WEEKS'       then current date - (PDURATION*7) days

    when PPERIOD = 'DAYS'          then current date - PDURATION days

  end ;


 SET STMT3 = CHAR(REPLACE(STMT2, 'xxxx', TRIM(STMT1)), 2000);

 SET STMT4 = REPLACE(STMT3, 'eeee', '''' || CHAR(RANGEEND) || '''');

 SET STMT3 = STMT4;

 SET STMT4 = REPLACE(STMT3, 'bbbb', '''' || CHAR(RANGEBGN) || '''');

 SET STMT3 = STMT4;

 SET STMT4 = REPLACE(STMT3, 'aaaa1', '''' || trim(TEXTMSG) || '''');

 SET STMT3 = STMT4;

 SET DATERANGE = CHAR(RANGEBGN) || ' thru ' ||  CHAR(RANGEEND);

 SET STMT4 = REPLACE(STMT3, 'aaaa2', '''' || DATERANGE || ‘''');


PREPARE S1 FROM STMT4;

 OPEN C1;

 SET RESULT SETS CURSOR C1;

END; 

(Copy/pastable version above, image of code below)

Code
In the code, the lines in red highlight how CASE is used to build the SQL Statements for processing depending upon the PERIOD parameter. For YEARS there was one style of processing, for MONTHS another, etc.  Calculations for duration also are influenced by CASE to determine the range of data to be used. The phrases built on the fly are placed into a basic Select statement, using the SQL Replace function. 

This technique could potentially be used for other reporting scenarios. Hopefully this exposure to CASE in these ways is valuable and gives you ideas to use in your reporting.