STRUCTURED QUERY LANGUAGE (SQL) – MORE ON QUERYING THE DATABASE

 

Topics

 

Selecting From More Than One Table (Review)

 

Aggregate Functions:  Examples

 

Aggregate Functions:  Exercises

 

Complete Syntax For Select Statement

 

Subqueries:  Examples

 

Subqueries:  Exercises

 

Subqueries:  Syntax

 

Self-Joins

 

Statistics in SQL (if time permits)

 

 

 

SELECTING FROM MORE THAN ONE TABLE (review)

 

Many of the examples and exercises that follow use more than one table, with one or more INNER JOINs.  To review:

 

- Run the Order Details Extended query.

 

- Go to SQL for this query (ANSI style) and separate the statement into its clauses (SELECT, FROM, INNER JOIN, etc.).  Do NOT save changes you make at this point.

 

- In a new SQL window, rewrite the SQL with the JOIN condition in a WHERE clause.

 

Note:  To simplify rewriting the query, you may omit the table names wherever they are not needed, use table aliases if you wish, and calculate the total price without discounting (that is, replace the "Extended Price" calculation) as follows:

 

            [Unit Price] * Quantity AS NondiscountPrice

 

The Unit Price column appears in both tables used in this query, so you will have to specify which table the database should use.

 

AGGREGATE FUNCTIONS

 

A series of straightforward math or statistical functions that apply to sets or subsets of data rather than to individual instances (rows).

 

Examples

 

Order Subtotals:  The (discounted) total $ amount for all items in each order

Sales By Category:  The (discounted) total $ amount for each product, by category, for 1997

Category Sales for 1997  (based on next query):  Total sales, by category.

Product Sales for 1997:  Total product sales, by quarter.

 

 

Standard aggregate functions:  MIN, MAX, AVG, SUM, COUNT

 

Used with GROUP BY to create the subsets

            with WHERE before GROUP BY to set criteria before calculating

             and HAVING after GROUP BY to set criteria for the calculated results

 

For example: 

 

In the Category Sales for 1997 query, to see only totals for the first quarter, add the following criteria before the GROUP BY clause (ShippedQuarter is a column in the Product Sales for 1997 query):

 

            WHERE ShippedQuarter like "*1"

 

In the modified query, to see only totals greater than $20000, add the following criteria after the GROUP BY clause:

 

            HAVING Sum(ProductSales) > 20000

 

Note:  No need to include the table or query name – the ProductSales column only appears in one of the tables or queries.

 

Note on aggregate functions and nulls:  The aggregate functions do not include rows that have null values, that is, nulls are not handled as if they were zero.  The only function that can return a value for all rows, regardless of the presence of nulls is the COUNT function:

 

            E.g.      COUNT(column_name) returns a count of all non-null rows, but

                        COUNT(*) returns the count of all rows, even those with nulls

 

Exercises

 

(Note:  None of these exercises require more than one table.)

 

How many different products are represented in the database?

How many products in each category?

 

Find the average freight charged by each shipper.

            Modify the first query:  Average freight by shipper for freight above $75

            Modify the first query:  Average freight by shipper showing only averages above $75

 

Find the most popular product (as measured by how many ordered)

Find the most expensive product

 

Practical example (contrived in this database): 

            Find duplicate (or more) product listings in Order Details.

            Find duplicate (or more) product-price pairings (same product and price) in Order Details

            (Compare with Find Duplicates wizard in each case)

 

AGGREGATE FUNCTIONS:  SYNTAX

 

            SELECT column list, function(), function(), ...

            FROM table1

                INNER JOIN table2

                ...

                ON

                ...

            WHERE criteria for row selection

            GROUP BY column list

            HAVING criteria for function results

            ORDER BY column list

 

SUBQUERIES

 

Used to structure queries.  In  many cases, can be used instead of JOIN (and vice versa).

In SQL 92 can also be used to provide one or more values in the SELECT clause.

Typically part of the WHERE clause, as follows:

 

WHERE column IN (subquery)

or

WHERE column <comparison> (subquery)

or

WHERE EXISTS (subquery)

 

Examples

 

IN

                       

Find the price of all products in a particular category, for example condiments:

 

            SELECT ProductName, UnitPrice

            FROM Products

            WHERE CategoryID In

                        (SELECT CategoryID

                        FROM Categories

                        WHERE CategoryName = "Condiments");

 

Same query using a JOIN:

 

            SELECT ProductName, UnitPrice

            FROM Products

                        INNER JOIN Categories

                        ON Products.CategoryID = Categories.CategoryID

            WHERE CategoryName = "Condiments";

 

With comparison operator

 

Run the query Products Above Average Price; check in SQL view.

 

EXISTS

                       

Create a list of suppliers from whom we buy dairy products:

 

            SELECT CompanyName

            FROM Suppliers AS s

            WHERE EXISTS

                        (SELECT *

                         FROM Products p, Categories c

                        WHERE p.SupplierID = s.SupplierID

                        AND p.CategoryID = c.CategoryID

                        AND CategoryName LIKE "*Dairy*");

 

Exercises

 

Create a list of customers located in the same city as a supplier, showing company name, city, and country (try with a subquery and check with a join).

 

Find all freight charges that are greater than average and list along with the shipper's name (find the average first, as a check).

 

How many shipments has each shipper made at charges greater than average?

 

Which  suppliers do not sell dairy products?

 

SYNTAX

 

            SELECT column list

            FROM table(s)

            WHERE [column] [ IN   |

                                           <comparison>  |

                                            EXISTS]

                        (SELECT column list

                        FROM table(s)

                         ... )

            GROUP BY ...

            HAVING ...

            ORDER BY

 

SELF-JOIN

 

A self-join is a query in which a table is joined (compared) to itself.  Self-joins are used to compare values in a column with other values in the same column in the same table.

 

To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value is equal to itself.

 

Example

           

Which customers are located in the same state (column name is Region)?

 

            SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region

            FROM Customers AS c1, Customers AS c2

            WHERE c1.Region = c2.Region

            AND c1.Region is not null

            AND c2.Region is not null

            AND c1.ContactName <> c2.ContactName

            ORDER BY c1.Region, c1.ContactName;

 

Exercise

 

Which customers are located in the same city?

 

STATISTICS

 

Note: Can be done in the database, but best done in a spreadsheet or using statistics software!

 

            AVG

 

            Mode

                        Price frequency

                        Price mode

 

            Median (concept)

 

 

            Ranking

                        Products numbered consecutively

                        Numbered customers

 

            Running sum

                        Running Total for Orders

                        Running Total and Count for Orders