AGGREGATE FUNCTIONS
Aggregate functions are
math or statistical functions that apply to
sets or subsets of data rather than to individual instances (rows).
Examples
The
standard
aggregate functions are MIN, MAX, AVG, SUM,
and COUNT.
They
are almost always used with
GROUP BY to create the subsets
with WHERE
before GROUP BY to set criteria before
calculating
and
HAVING after
GROUP BY to set criteria
for the calculated results
The GROUP BY clause must include all columns in the SELECT clause
except the calculated aggregate columns. Grouping occurs from
left to right, as written in the GROUP BY clause, and groups "break"
(to a new group) as soon as a different column value appears in the
left-to-right order. For example, in the last query
above, note the break between Chai and Chang, and the individual breaks
for each quarter. Scrolling down in the query, there is a break
(a new group or subset) for each product category.
E.g. COUNT(column_name)
returns a count of all rows where the value of column_name is not null,
but
COUNT(*) returns the count of all
rows, even those with
nulls in column_name
AGGREGATE FUNCTIONS
Note:
No need to include the table or query name \226
the ProductSales column only appears in one of the tables or queries.
Run the query
and compare the results with the previous query - only 4 values are
selected.
Exercises using other aggregate functions
(Note: None of these exercises require more than one
table.)
Modify the original freight query:
Average
freight by shipper for freight above
$75
Modify the original freight query:
Average
freight by shipper showing only
averages above $75
Find duplicate (or
more) product listings in Order Details.
Find duplicate (or more)
product-price pairings (same product and price) in Order
Details
(Compare with Find Duplicates wizard
in each case)
COMPLETE SYNTAX FOR
SELECT STATEMENT
FROM
table1
INNER JOIN
table2
...
ON table1.col1 =
table2.col2
...
WHERE criteria
for row selection
[AND
criteria for row
selection]
[OR criteria
for row selection]
HAVING criteria
for function results
ORDER BY
column
list
SUBQUERIES
Subqueries
are used to
structure queries. In many cases, a subquery can be used
instead of a JOIN (and vice versa).
For
database systems fully compliant with the SQL 92
standard, a subquery can also be used to provide one or more values in
the SELECT clause.
In
most database systems, subqueries are typically
part of the WHERE clause, as follows:
Examples
&nb
sp;
Find
the
price of all products in a particular category, for example
condiments.
Type
this query in the SQL window and check against the result shown
below:
SELECT ProductName, UnitPrice
FROM Products
WHERE CategoryID In
(SELECT CategoryID
FROM
Categories
WHERE
CategoryName = "Condiments");
This JOIN should give the
same result:
SELECT ProductName, UnitPrice
FROM Products
INNER JOIN Categories
ON Products.CategoryID =
Categories.CategoryID
WHERE CategoryName =
"Condiments";
Run
the query
Products Above Average Price; check in SQL
view (shown here):
&nb
sp;
Create
a
list of suppliers from whom we buy dairy products. Type the
following query in the SQL window - use the INNER JOIN ... ON
syntax in the subquery, if you prefer.
SELECT CompanyName
FROM Suppliers AS s
WHERE EXISTS
(SELECT
*
FROM Products p, Categories
c
WHERE
p.SupplierID = s.SupplierID
AND
p.CategoryID = c.CategoryID
AND
CategoryName LIKE "*Dairy*");
Result:
SUBQUERIES
Exercises
Create
a
list of customers located in the same city as a supplier, showing
company name,
city, and country (try with a subquery and check with a
join). (14 rows)
Find
all
freight charges that are greater than average and list along with the
shipper's
name (find the average first, as a check). (242
rows)
How
many
shipments has each shipper made at charges greater than average?
(3 rows, 2 or 3 columns)
Which suppliers do not
sell dairy products? (25 rows)
SUBQUERIES:
SYNTAX
SELECT column
list
FROM
table(s)
WHERE
<column_name>
<comparison>
|
EXISTS]
(SELECT column
list
FROM table(s)
...
)
GROUP BY
...
HAVING
...
ORDER
BY column list
SELF-JOIN
A
self-join
is a query in which a table is joined (compared) to itself. Self-joins are used to compare values in a
column with other values in the same
column in the same table. One practical use for
self-joins: obtaining running counts and running totals in an SQL
query.
To
write
the query, select from the same table listed twice with different
aliases, set
up the comparison, and eliminate cases where a particular value would
be equal to
itself.
Example
Which
customers are located in the same state (column name is Region)?
Type this statement in the SQL window:
SELECT DISTINCT c1.ContactName,
c1.Address, c1.City, c1.Region
FROM Customers AS c1, Customers AS
c2
WHERE c1.Region = c2.Region
AND c1.ContactName <>
c2.ContactName
ORDER BY c1.Region, c1.ContactName;
The result
should look like this:
Exercise
Which
customers are located in the same city? (32
rows)
STATISTICS
IN SQL
Note: Can be done in the
database, but
best done in a spreadsheet or using statistics software!
EXAMPLES
SQL for the two queries:
SELECT Products.UnitPrice, Count(*)
AS Frequency
FROM Products
GROUP BY Products.UnitPrice
ORDER BY Count(*) DESC ,
UnitPrice;
SELECT UnitPrice, Count(*) AS
Frequency
FROM Products
GROUP BY UnitPrice
HAVING Count(*) >= ALL
(SELECT Count(*)
FROM Products
GROUP BY
UnitPrice);
Note: We need the subquery because aggregate functions cannot be
nested.
FROM Products AS a
INNER JOIN Products AS b
ON a.UnitPrice >= b.UnitPrice
GROUP BY a.ProductID, a.ProductName, a.UnitPrice
HAVING
Count(a.ProductID)
Between (SELECT
(Count(ProductID)/2)-0.5
&nbs
p;
FROM Products)
And (SELECT
(Count(a.ProductID)/2)+0.5
&nbs
p;
FROM Products)
ORDER BY a.UnitPrice;