STRUCTURED QUERY LANGUAGE (SQL) – MORE
ON QUERYING THE DATABASE
Topics
Selecting From More Than One Table (Review)
Aggregate Functions:
Examples
Aggregate Functions:
Exercises
Complete Syntax For Select
Statement
Subqueries:
Examples
Subqueries: Exercises
Subqueries:
Syntax
Self-Joins
Statistics in SQL (if time permits)
SELECTING FROM MORE THAN ONE TABLE (review)
Many of the examples and
exercises that follow use more than one table, with one or more INNER
JOINs. To review:
- Run the Order Details
Extended query.
- Go to SQL
for this query (ANSI style) and separate the statement into its clauses
(SELECT, FROM, INNER JOIN, etc.). Do NOT
save changes you make at this point.
- In a new SQL window, rewrite
the SQL with the JOIN condition in a WHERE clause.
Note: To simplify rewriting the query, you may omit
the table names wherever they are not needed, use table aliases if you wish, and
calculate the total price without discounting (that is, replace the "Extended
Price" calculation) as follows:
[Unit Price] * Quantity AS NondiscountPrice
The Unit Price column appears
in both tables used in this query, so you will have to specify which table the
database should use.
AGGREGATE FUNCTIONS
A series of
straightforward math or statistical functions that apply to sets or subsets of
data rather than to individual instances (rows).
Examples
Order Subtotals: The (discounted) total $ amount for all items
in each order
Sales By Category: The (discounted) total $ amount for each
product, by category, for 1997
Category Sales for 1997 (based on next
query): Total sales, by category.
Product Sales for 1997: Total product sales, by quarter.
Standard aggregate functions: MIN, MAX, AVG, SUM, COUNT
Used with GROUP BY to create
the subsets
with WHERE before GROUP BY to set criteria before calculating
and HAVING after GROUP BY to set criteria for the calculated
results
For example:
In the Category Sales for
1997 query, to see only totals for the first quarter, add the following
criteria before the GROUP BY clause (ShippedQuarter is a column in the Product
Sales for 1997 query):
WHERE ShippedQuarter like "*1"
In the modified query, to see
only totals greater than $20000, add the following criteria after the GROUP BY
clause:
HAVING Sum(ProductSales) > 20000
Note: No need to include the table or query name –
the ProductSales column only appears in one of the tables or queries.
Note on aggregate functions
and nulls: The aggregate functions do
not include rows that have null values, that is, nulls are not handled as if
they were zero. The only function that can return a value for all rows,
regardless of the presence of nulls is the COUNT function:
E.g. COUNT(column_name) returns a count of all non-null rows, but
COUNT(*)
returns the count of all rows, even those with nulls
Exercises
(Note: None of these exercises require more than one
table.)
How many different products
are represented in the database?
How many products in each
category?
Find the average freight
charged by each shipper.
Modify the first query:
Average freight by shipper for freight above $75
Modify the first query:
Average freight by shipper showing only averages above $75
Find the most popular product
(as measured by how many ordered)
Find the most expensive
product
Practical example (contrived
in this database):
Find duplicate (or more) product
listings in Order Details.
Find duplicate (or more) product-price pairings (same
product and price) in Order Details
(Compare with Find Duplicates wizard in each case)
AGGREGATE FUNCTIONS:
SYNTAX
SELECT column list, function(),
function(), ...
FROM table1
INNER JOIN
table2
...
ON
...
WHERE criteria for row
selection
GROUP BY column list
HAVING criteria for
function results
ORDER BY column list
SUBQUERIES
Used to structure
queries. In many cases, can be used instead of
JOIN (and vice versa).
In SQL 92 can also be used to
provide one or more values in the SELECT clause.
Typically part of the WHERE
clause, as follows:
WHERE column IN (subquery)
or
WHERE column <comparison>
(subquery)
or
WHERE EXISTS (subquery)
Examples
IN
Find the price of all products
in a particular category, for example condiments:
SELECT ProductName, UnitPrice
FROM Products
WHERE CategoryID In
(SELECT CategoryID
FROM
Categories
WHERE
CategoryName = "Condiments");
Same query using a JOIN:
SELECT ProductName, UnitPrice
FROM Products
INNER JOIN Categories
ON Products.CategoryID =
Categories.CategoryID
WHERE CategoryName = "Condiments";
With comparison operator
Run the query Products Above Average Price; check in SQL view.
EXISTS
Create a list of suppliers from
whom we buy dairy products:
SELECT CompanyName
FROM Suppliers AS s
WHERE EXISTS
(SELECT
*
FROM Products p, Categories c
WHERE
p.SupplierID = s.SupplierID
AND
p.CategoryID = c.CategoryID
AND
CategoryName LIKE "*Dairy*");
Exercises
Create a list of customers
located in the same city as a supplier, showing company name, city, and country
(try with a subquery and check with a join).
Find all freight charges that
are greater than average and list along with the shipper's name (find the
average first, as a check).
How many shipments has each
shipper made at charges greater than average?
Which suppliers do not sell dairy products?
SYNTAX
SELECT column list
FROM table(s)
WHERE [column] [ IN |
<comparison> |
EXISTS]
(SELECT column list
FROM table(s)
... )
GROUP BY ...
HAVING ...
ORDER BY
SELF-JOIN
A self-join is a query in
which a table is joined (compared) to itself.
Self-joins are used to compare values in a column with other values in
the same column in the same table.
To write the query, select
from the same table listed twice with different aliases, set up the comparison,
and eliminate cases where a particular value is equal to itself.
Example
Which customers are located in
the same state (column name is Region)?
SELECT DISTINCT c1.ContactName, c1.Address, c1.City,
c1.Region
FROM Customers AS c1, Customers AS c2
WHERE c1.Region = c2.Region
AND c1.Region is not null
AND c2.Region is not null
AND c1.ContactName <> c2.ContactName
ORDER BY c1.Region, c1.ContactName;
Exercise
Which customers are located
in the same city?
STATISTICS
Note: Can be done in the database, but best done in a
spreadsheet or using statistics software!
AVG
Mode
Price frequency
Price mode
Median (concept)
Ranking
Products numbered consecutively
Numbered customers
Running sum
Running Total for Orders
Running Total and Count for Orders