7. Joins

Sections
Qualifying Column Names
Creating Table Aliases with AS
Using Joins
Creating Joins with JOIN or WHERE
Creating a Cross Join with CROSS JOIN
Creating a Natural Join with NATURAL JOIN
Creating an Inner Join with INNER JOIN
Creating Outer Joins with OUTER JOIN
Creating a Self-Join

All the queries so far have retrieved rows from a single table. This chapter explains how to use joins to retrieve rows from multiple tables simultaneously. Recall from “Relationships” in Chapter 2 that a relationship is an association established between common columns in two tables. A join is a table operation that uses related columns to combine rows from two input tables into one result table. You can chain joins to retrieve rows from an unlimited number of tables.

Why do joins matter? The most important database information isn’t so much stored in the rows of individual tables; rather, it’s the implied relationships between sets of related rows. In the sample database, for example, the individual rows of the tables authors, publishers, and titles contain important values, of course, but it’s the implied relationships that let you understand and analyze your data in its entirety: Who wrote what? Who published what? To whom do we send royalty checks? For how much? And so on.

This chapter explains the different types of joins, why they’re used, and how to create a SELECT statement that uses them.

Qualifying Column Names

Recall from “Tables, Columns, and Rows” in Chapter 2 that column names must be unique within a table but can be reused in other tables. The tables authors and publishers in the sample database both contain a column named city, for example.

To identify an otherwise-ambiguous column uniquely in a query that involves multiple tables, use its qualified name. A qualified name is a table name followed by a dot and the name of the column in the table. Because tables must have different names within a database, a qualified name identifies a single column uniquely within the entire database.

To qualify a column name:

Listing 7.1Here, the qualified names resolve otherwise-ambiguous references to the column city in the tables authors and publishers. See Figure 7.1 for the result.

SELECT au_id, authors.city
  FROM authors
  INNER JOIN publishers
    ON authors.city = publishers.city;

Figure 7.1Result of Listing 7.1. This result lists authors who live in the same city as some publisher; the join syntax is explained later in this chapter.

au_id city
----- -------------
A03   San Francisco
A04   San Francisco
A05   New York

Tips for Qualified Column Names

Creating Table Aliases with AS

You can create table aliases by using AS just as you can create column aliases; see “Creating Column Aliases with AS” in Chapter 4. Table aliases:

To create a table alias:

Listing 7.2Tables aliases make queries shorter and easier to read. Note that you can use an alias in the SELECT clause before it’s actually defined later in the statement. See Figure 7.2 for the result.

SELECT au_fname, au_lname, a.city
  FROM authors a
  INNER JOIN publishers p
    ON a.city = p.city;

Figure 7.2Result of Listing 7.2.

au_fname  au_lname city
--------- -------- -------------
Hallie    Hull     San Francisco
Klee      Hull     San Francisco
Christian Kells    New York

Tips for AS

Using Joins

You can use a join to extract data from more than one table. The rest of this chapter explains the different types of joins (Table 7.1), why they’re used, and how to create SELECT statements that use them.

Table 7.1Types of Joins
Join Description
Cross join Returns all rows from the first table in which each row from the first table is combined with all rows from the second table.
Natural join A join that compares, for equality, all the columns in the first table with corresponding columns that have the same name in the second table.
Inner join A join that uses a comparison operator to match rows from two tables based on the values in common columns from each table. Inner joins are the most common type of join.
Left outer join Returns all the rows from the left table, not just the ones in which the joined columns match. If a row in the left table has no matching rows in the right table, then the associated result row contains nulls for all SELECT-clause columns coming from the right table.
Right outer join The reverse of a left outer join. All rows from the right table are returned. Nulls are returned for the left table if a right-table row has no matching left-table row.
Full outer join Returns all rows in both the left and right tables. If a row has no match in the other table, then the SELECT-clause columns from the other table contain nulls. If there is a match between the tables, then the entire result row contains values from both tables.
Self-join A join of a table to itself.

The important characteristics of joins are:

Domains and Comparisons

The values that you compare in joins and WHERE clauses must be meaningfully comparable—that is, have the same data type and the same meaning. The sample-database columns au_id and pub_id, for example, have the same data type—both are CHAR(3), a letter followed by two digits—but mean different things, so they can’t be compared sensibly.

Recall from “Tables, Columns, and Rows” in Chapter 2 that a domain is the set of permissible values for a column. To prevent meaningless comparisons, the relational model requires that comparable columns draw from domains that have the same meaning. Unfortunately, SQL and DBMSs stray from the model and have no intrinsic mechanism that prevents users from comparing, say, IQ and shoe size. If you’re building a database application, then it’s up to you to stop (or warn) users from making meaningless comparisons that waste processing time or, worse, yield results that might be interpreted as valid.

Creating Joins with JOIN or WHERE

You have two alternative ways of specifying a join: by using JOIN syntax or WHERE syntax. SQL-92 and later standards prescribe JOIN syntax, but older standards prescribe WHERE; hence, both JOIN and WHERE are used widely.

This section explains the general syntax for JOIN and WHERE joins that involve two tables. The actual syntax that you’ll use in real queries will vary by the join type, the number of columns joined, the number of tables joined, and the syntax requirements of your DBMS. The syntax diagrams and examples in the following sections show you how to create specific joins.

To create a join by using JOIN:

To create a join by using WHERE:

Listings 7.3a and 7.3b show equivalent queries that use JOIN and WHERE syntax. See Figure 7.3 for the result.

Listing 7.3aA join that uses JOIN syntax. See Figure 7.3 for the result.

SELECT au_fname, au_lname, a.city
  FROM authors a
  INNER JOIN publishers p
    ON a.city = p.city;

Listing 7.3bThe same join, using WHERE syntax. See Figure 7.3 for the result.

SELECT au_fname, au_lname, a.city
  FROM authors a, publishers p
  WHERE a.city = p.city;

Figure 7.3Result of Listings 7.3a and 7.3b.

au_fname  au_lname city
--------- -------- -------------
Hallie    Hull     San Francisco
Klee      Hull     San Francisco
Christian Kells    New York

Query Execution Sequence

When your DBMS processes joins, it uses a logical sequence to execute the entire query. The DBMS:

  1. Applies the join conditions in the JOIN clause.
  2. Applies the join conditions and search conditions in the WHERE clause.
  3. Groups rows according to the GROUP BY clause.
  4. Applies the search conditions in the HAVING clause to the groups.
  5. Sorts the result according to the ORDER BY clause.

Tips for JOIN and WHERE

The USING Clause

For JOIN syntax, the SQL standard also defines a USING clause that can be used instead of the ON clause if the joined columns have the same name and are compared for equality:

FROM table1 join_type table2
  USING (columns)

columns is a comma-separated list of one or more column names. The parentheses are required. The query performs an equijoin on the named pair(s) of columns. The type of join is called a named columns join. Rewriting Listing 7.3a with USING:

SELECT au_fname, au_lname, city
  FROM authors
  INNER JOIN publishers
    USING (city);

The USING clause acts like a natural join, except that you can use it if you don’t want to join all pairs of columns with the same name in both tables. Note that the preceding USING example joins only on the column city in both tables, whereas a natural join would join on both the columns city and state common to the tables. See “Creating a Natural Join with NATURAL JOIN” later in this chapter.

USING is a syntactic convenience that doesn’t add extra functionality to SQL. A USING clause always can be replicated with an ON clause in JOIN syntax or with a WHERE clause in WHERE syntax.

Microsoft Access, Microsoft SQL Server, and Db2 don’t support USING. MySQL requires the SELECT clause’s common column names to be qualified in USING queries. To run the preceding example, change city to authors.city in the SELECT clause.

Creating a Cross Join with CROSS JOIN

A cross join:

To create a cross join:

Listing 7.4A cross join displays all possible combinations of rows from two tables. See Figure 7.4 for the result.

SELECT
    au_id,
    pub_id,
    a.state AS "au_state",
    p.state AS "pub_state"
  FROM authors a
  CROSS JOIN publishers p;

Figure 7.4Result of Listing 7.4.

au_id pub_id au_state pub_state
----- ------ -------- ---------
A01   P01    NY       NY
A02   P01    CO       NY
A03   P01    CA       NY
A04   P01    CA       NY
A05   P01    NY       NY
A06   P01    CA       NY
A07   P01    FL       NY
A01   P02    NY       CA
A02   P02    CO       CA
A03   P02    CA       CA
A04   P02    CA       CA
A05   P02    NY       CA
A06   P02    CA       CA
A07   P02    FL       CA
A01   P03    NY       NULL
A02   P03    CO       NULL
A03   P03    CA       NULL
A04   P03    CA       NULL
A05   P03    NY       NULL
A06   P03    CA       NULL
A07   P03    FL       NULL
A01   P04    NY       CA
A02   P04    CO       CA
A03   P04    CA       CA
A04   P04    CA       CA
A05   P04    NY       CA
A06   P04    CA       CA
A07   P04    FL       CA

Tips for CROSS JOIN

Creating a Natural Join with NATURAL JOIN

A natural join:

To create a natural join:

When your DBMS runs Listing 7.5, it will join rows in the table publishers with rows in the table titles that have equal values in the columns publishers.pub_id and titles.pub_id—the two columns that have the same name in both tables. See Figure 7.5 for the result.

Listing 7.5List each book’s publisher. See Figure 7.5 for the result.

SELECT
    title_id,
    pub_id,
    pub_name
  FROM publishers
  NATURAL JOIN titles;

Figure 7.5Result of Listing 7.5.

title_id pub_id pub_name
-------- ------ -------------------
T01      P01    Abatis Publishers
T02      P03    Schadenfreude Press
T03      P02    Core Dump Books
T04      P04    Tenterhooks Press
T05      P04    Tenterhooks Press
T06      P01    Abatis Publishers
T07      P03    Schadenfreude Press
T08      P04    Tenterhooks Press
T09      P04    Tenterhooks Press
T10      P01    Abatis Publishers
T11      P04    Tenterhooks Press
T12      P01    Abatis Publishers
T13      P03    Schadenfreude Press

In Listing 7.6, I’ve added another join to Listing 7.5 to retrieve the advance for each book. The WHERE condition retrieves books with advances less than $20000. When your DBMS runs Listing 7.6, it will join the pub_id columns in the tables publishers and titles, and it will join the title_id columns in the tables titles and royalties. See Figure 7.6 for the result.

Listing 7.6List each book’s publisher and advance for books with advances less than $20000. See Figure 7.6 for the result.

SELECT
    title_id,
    pub_id,
    pub_name,
    advance
  FROM publishers
  NATURAL JOIN titles
  NATURAL JOIN royalties
  WHERE advance < 20000;

Figure 7.6Result of Listing 7.6.

title_id pub_id pub_name            advance
-------- ------ ------------------- -------
T01      P01    Abatis Publishers     10000
T02      P03    Schadenfreude Press    1000
T03      P02    Core Dump Books       15000
T08      P04    Tenterhooks Press         0
T09      P04    Tenterhooks Press         0

Tips for NATURAL JOIN

Creating an Inner Join with INNER JOIN

An inner join:

To create an inner join:

Tips for INNER JOIN

Listing 7.7 joins two tables on the column au_id to list the books that each author wrote (or cowrote). Each author’s au_id in the table authors matches zero or more rows in the table title_authors. See Figure 7.7 for the result. Note that author A07 (Paddy O'Furniture) is omitted from the result because he has written no books and so has no matching rows in title_authors.

Listing 7.7List the books that each author wrote (or cowrote). See Figure 7.7 for the result.

SELECT
    a.au_id,
    a.au_fname,
    a.au_lname,
    ta.title_id
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  ORDER BY a.au_id ASC, ta.title_id ASC;

Figure 7.7Result of Listing 7.7.

au_id au_fname  au_lname  title_id
----- --------- --------- --------
A01   Sarah     Buchman   T01
A01   Sarah     Buchman   T02
A01   Sarah     Buchman   T13
A02   Wendy     Heydemark T06
A02   Wendy     Heydemark T07
A02   Wendy     Heydemark T10
A02   Wendy     Heydemark T12
A03   Hallie    Hull      T04
A03   Hallie    Hull      T11
A04   Klee      Hull      T04
A04   Klee      Hull      T05
A04   Klee      Hull      T07
A04   Klee      Hull      T11
A05   Christian Kells     T03
A06             Kellsey   T08
A06             Kellsey   T09
A06             Kellsey   T11

Tips for Listing 7.7

Listing 7.8 joins two tables on the column pub_id to list each book’s title name and ID, and each book’s publisher name and ID. Note that the join is necessary to retrieve only the publisher name (the fourth column in the result); all the other three columns are available in the table titles. See Figure 7.8 for the result.

Listing 7.8List each book’s title name and ID and each book’s publisher name and ID. See Figure 7.8 for the result.

SELECT
    t.title_id,
    t.title_name,
    t.pub_id,
    p.pub_name
  FROM titles t
  INNER JOIN publishers p
    ON p.pub_id = t.pub_id
  ORDER BY t.title_name ASC;

Figure 7.8Result of Listing 7.8.

title_id title_name                          pub_id pub_name
-------- ----------------------------------- ------ -------------------
T01      1977!                               P01    Abatis Publishers
T02      200 Years of German Humor           P03    Schadenfreude Press
T03      Ask Your System Administrator       P02    Core Dump Books
T04      But I Did It Unconsciously          P04    Tenterhooks Press
T05      Exchange of Platitudes              P04    Tenterhooks Press
T06      How About Never?                    P01    Abatis Publishers
T07      I Blame My Mother                   P03    Schadenfreude Press
T08      Just Wait Until After School        P04    Tenterhooks Press
T09      Kiss My Boo-Boo                     P04    Tenterhooks Press
T10      Not Without My Faberge Egg          P01    Abatis Publishers
T11      Perhaps It's a Glandular Problem    P04    Tenterhooks Press
T12      Spontaneous, Not Annoying           P01    Abatis Publishers
T13      What Are The Civilian Applications? P03    Schadenfreude Press

Tips for Listing 7.8

Listing 7.9 uses two join conditions to list the authors who live in the same city and state as some publisher (any publisher). See Figure 7.9 for the result. Note that this query is a natural join on the identically named, nonkey columns city and state in the two tables (see “Creating a Natural Join with NATURAL JOIN” earlier in this chapter). An equivalent query is:

SELECT a.au_id, a.au_fname,
    a.au_lname, a.city, a.state
  FROM authors a
  NATURAL JOIN publishers p
  ORDER BY a.au_id ASC;

Listing 7.9List the authors who live in the same city and state in which a publisher is located. See Figure 7.9 for the result.

SELECT
    a.au_id,
    a.au_fname,
    a.au_lname,
    a.city,
    a.state
  FROM authors a
  INNER JOIN publishers p
    ON a.city = p.city
    AND a.state = p.state
  ORDER BY a.au_id;

Figure 7.9Result of Listing 7.9.

au_id au_fname  au_lname city          state
----- --------- -------- ------------- -----
A03   Hallie    Hull     San Francisco CA
A04   Klee      Hull     San Francisco CA
A05   Christian Kells    New York      NY

Tips for Listing 7.9

Listing 7.10 combines an inner join with WHERE conditions to list books published in California or outside the large North American countries; see “Filtering Rows with WHERE” in Chapter 4. See Figure 7.10 for the result.

Listing 7.10List the books published in California or outside the large North American countries. See Figure 7.10 for the result.

SELECT
    t.title_id,
    t.title_name,
    p.state,
    p.country
  FROM titles t
  INNER JOIN publishers p
    ON t.pub_id = p.pub_id
  WHERE p.state = 'CA'
     OR p.country NOT IN
      ('USA', 'Canada', 'Mexico')
  ORDER BY t.title_id ASC;

Figure 7.10Result of Listing 7.10.

title_id title_name                          state country
-------- ----------------------------------- ----- -------
T02      200 Years of German Humor           NULL  Germany
T03      Ask Your System Administrator       CA    USA
T04      But I Did It Unconsciously          CA    USA
T05      Exchange of Platitudes              CA    USA
T07      I Blame My Mother                   NULL  Germany
T08      Just Wait Until After School        CA    USA
T09      Kiss My Boo-Boo                     CA    USA
T11      Perhaps It's a Glandular Problem    CA    USA
T13      What Are The Civilian Applications? NULL  Germany

Tips for Listing 7.10

Listing 7.11 combines an inner join with the aggregate function COUNT() and a GROUP BY clause to list the number of books that each author wrote (or cowrote). For information about aggregate functions and GROUP BY, see Chapter 6. See Figure 7.11 for the result. Note that, as in Figure 7.7, author A07 (Paddy O'Furniture) is omitted from the result because he has written no books and so has no matching rows in title_authors. See Listing 7.30 in “Creating Outer Joins with OUTER JOIN” later in this chapter for an example that lists authors who have written no books.

Listing 7.11List the number of books that each author wrote (or cowrote). See Figure 7.11 for the result.

SELECT
    a.au_id,
    COUNT(ta.title_id) AS "Num books"
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  GROUP BY a.au_id
  ORDER BY a.au_id ASC;

Figure 7.11Result of Listing 7.11.

au_id Num books
----- ---------
A01           3
A02           4
A03           2
A04           4
A05           1
A06           3

Tips for Listing 7.11

Listing 7.12 uses WHERE conditions to list the advance paid for each biography. See Figure 7.12 for the result.

Listing 7.12List the advance paid for each biography. See Figure 7.12 for the result.

SELECT
    t.title_id,
    t.title_name,
    r.advance
  FROM royalties r
  INNER JOIN titles t
    ON r.title_id = t.title_id
  WHERE t.type = 'biography'
    AND r.advance IS NOT NULL
  ORDER BY r.advance DESC;

Figure 7.12Result of Listing 7.12.

title_id title_name                advance
-------- ------------------------- -----------
T07      I Blame My Mother          1000000.00
T12      Spontaneous, Not Annoying    50000.00
T06      How About Never?             20000.00

Tips for Listing 7.12

Listing 7.13 uses aggregate functions and a GROUP BY clause to list the count and total advance paid for each type of book. See Figure 7.13 for the result.

Listing 7.13List the count and total advance paid for each type of book. See Figure 7.13 for the result.

SELECT
    t.type,
    COUNT(r.advance)
      AS "COUNT(r.advance)",
    SUM(r.advance)
      AS "SUM(r.advance)"
  FROM royalties r
  INNER JOIN titles t
    ON r.title_id = t.title_id
  WHERE r.advance IS NOT NULL
  GROUP BY t.type
  ORDER BY t.type ASC;

Figure 7.13Result of Listing 7.13.

type       COUNT(r.advance) SUM(r.advance)
---------- ---------------- --------------
biography                 3     1070000.00
children                  2           0.00
computer                  1       15000.00
history                   3       31000.00
psychology                3      220000.00

Tips for Listing 7.13

Listing 7.14 is similar to Listing 7.13, except that it uses an additional grouping column to list the count and total advance paid for each type of book by publisher. See Figure 7.14 for the result.

Listing 7.14List the count and total advance paid for each type of book, by publisher. See Figure 7.14 for the result.

SELECT
    t.type,
    t.pub_id,
    COUNT(r.advance) AS "COUNT(r.advance)",
    SUM(r.advance) AS "SUM(r.advance)"
  FROM royalties r
  INNER JOIN titles t
    ON r.title_id = t.title_id
  WHERE r.advance IS NOT NULL
  GROUP BY t.type, t.pub_id
  ORDER BY t.type ASC, t.pub_id ASC;

Figure 7.14Result of Listing 7.14.

type       pub_id COUNT(r.advance) SUM(r.advance)
---------- ------ ---------------- --------------
biography  P01                   2       70000.00
biography  P03                   1     1000000.00
children   P04                   2           0.00
computer   P02                   1       15000.00
history    P01                   1       10000.00
history    P03                   2       21000.00
psychology P04                   3      220000.00

Tips for Listing 7.14

Listing 7.15 uses a HAVING clause to list the number of coauthors of each book written by two or more authors. For information about HAVING, see “Filtering Groups with HAVING” in Chapter 6. See Figure 7.15 for the result.

Listing 7.15List the number of coauthors of each book written by two or more authors. See Figure 7.15 for the result.

SELECT
    ta.title_id,
    COUNT(ta.au_id) AS "Num authors"
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  GROUP BY ta.title_id
  HAVING COUNT(ta.au_id) > 1
  ORDER BY ta.title_id ASC;

Figure 7.15Result of Listing 7.15.

title_id Num authors
-------- -----------
T04                2
T07                2
T11                3

Tips for Listing 7.15

You also can join values in two columns that aren’t equal. Listing 7.16 uses greater-than (>) join to find each book whose revenue (= price × sales) is at least 10 times greater than the advance paid to the author(s). See Figure 7.16 for the result. The use of <, <=, >, and >= joins is common, but not-equal joins (<>) are used rarely. Generally, not-equal joins make sense only when used with a self-join; see “Creating a Self-Join” later in this chapter.

Listing 7.16List each book whose revenue (= price × sales) is at least 10 times greater than its advance. See Figure 7.16 for the result.

SELECT
    t.title_id,
    t.title_name,
    r.advance,
    t.price * t.sales AS "Revenue"
  FROM titles t
  INNER JOIN royalties r
    ON t.price * t.sales > r.advance * 10
    AND t.title_id = r.title_id
  ORDER BY t.price * t.sales DESC;

Figure 7.16Result of Listing 7.16.

title_id title_name                          advance    Revenue
-------- ----------------------------------- ---------- -----------
T07      I Blame My Mother                   1000000.00 35929790.00
T05      Exchange of Platitudes               100000.00  1400008.00
T12      Spontaneous, Not Annoying             50000.00  1299012.99
T03      Ask Your System Administrator         15000.00  1025396.65
T13      What Are The Civilian Applications?   20000.00   313905.33
T06      How About Never?                      20000.00   225834.00
T02      200 Years of German Humor              1000.00   190841.70
T09      Kiss My Boo-Boo                            .00    69750.00
T08      Just Wait Until After School               .00    40950.00

Tips for Listing 7.16

Complicated queries can arise from simple questions. In Listing 7.17, I must join three tables to list the author names and the names of the books that each author wrote (or cowrote). See Figure 7.17 for the result.

Listing 7.17List the author names and the names of the books that each author wrote (or cowrote). See Figure 7.17 for the result.

SELECT
    a.au_fname,
    a.au_lname,
    t.title_name
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  ORDER BY a.au_lname ASC,
    a.au_fname ASC, t.title_name ASC;

Figure 7.17Result of Listing 7.17.

au_fname  au_lname  title_name
--------- --------- -----------------------------------
Sarah     Buchman   1977!
Sarah     Buchman   200 Years of German Humor
Sarah     Buchman   What Are The Civilian Applications?
Wendy     Heydemark How About Never?
Wendy     Heydemark I Blame My Mother
Wendy     Heydemark Not Without My Faberge Egg
Wendy     Heydemark Spontaneous, Not Annoying
Hallie    Hull      But I Did It Unconsciously
Hallie    Hull      Perhaps It's a Glandular Problem
Klee      Hull      But I Did It Unconsciously
Klee      Hull      Exchange of Platitudes
Klee      Hull      I Blame My Mother
Klee      Hull      Perhaps It's a Glandular Problem
Christian Kells     Ask Your System Administrator
          Kellsey   Just Wait Until After School
          Kellsey   Kiss My Boo-Boo
          Kellsey   Perhaps It's a Glandular Problem

Tips for Listing 7.17

Expanding on Listing 7.17, Listing 7.18 requires a four-table join to list the publisher names along with the names of the authors and books. See Figure 7.18 for the result.

Listing 7.18List the author names, the names of the books that each author wrote (or cowrote), and the publisher names. See Figure 7.18 for the result.

SELECT
    a.au_fname,
    a.au_lname,
    t.title_name,
    p.pub_name
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  INNER JOIN publishers p
    ON p.pub_id = t.pub_id
  ORDER BY a.au_lname ASC, a.au_fname ASC,
    t.title_name ASC;

Figure 7.18Result of Listing 7.18.

au_fname  au_lname  title_name                          pub_name
--------- --------- ----------------------------------- -------------------
Sarah     Buchman   1977!                               Abatis Publishers
Sarah     Buchman   200 Years of German Humor           Schadenfreude Press
Sarah     Buchman   What Are The Civilian Applications? Schadenfreude Press
Wendy     Heydemark How About Never?                    Abatis Publishers
Wendy     Heydemark I Blame My Mother                   Schadenfreude Press
Wendy     Heydemark Not Without My Faberge Egg          Abatis Publishers
Wendy     Heydemark Spontaneous, Not Annoying           Abatis Publishers
Hallie    Hull      But I Did It Unconsciously          Tenterhooks Press
Hallie    Hull      Perhaps It's a Glandular Problem    Tenterhooks Press
Klee      Hull      But I Did It Unconsciously          Tenterhooks Press
Klee      Hull      Exchange of Platitudes              Tenterhooks Press
Klee      Hull      I Blame My Mother                   Schadenfreude Press
Klee      Hull      Perhaps It's a Glandular Problem    Tenterhooks Press
Christian Kells     Ask Your System Administrator       Core Dump Books
          Kellsey   Just Wait Until After School        Tenterhooks Press
          Kellsey   Kiss My Boo-Boo                     Tenterhooks Press
          Kellsey   Perhaps It's a Glandular Problem    Tenterhooks Press

Tips for Listing 7.18

Listing 7.19 calculates the total royalties for all books. The gross royalty of a book is the book’s revenue (= sales × price) times the royalty rate (the fraction of revenue paid to the author). In most cases, the author receives an initial advance against royalties. The publisher deducts the advance from the gross royalty to get the net royalty. If the net royalty is positive, then the publisher must pay the author; if the net royalty is negative or zero, then the author gets nothing because he or she still hasn’t “earned out” the advance. See Figure 7.19 for the result. Gross royalties are labeled “Total royalties”, gross advances are labeled “Total advances”, and net royalties are labeled “Total due to authors”.

Listing 7.19 calculates total royalties for all books; the subsequent examples in this section will show you how to break down royalties by author, book, publisher, and other groups.

Listing 7.19Calculate the total royalties for all books. See Figure 7.19 for the result.

SELECT
    SUM(t.sales * t.price *
      r.royalty_rate)
      AS "Total royalties",
    SUM(r.advance)
      AS "Total advances",
    SUM((t.sales * t.price *
      r.royalty_rate) - r.advance)
      AS "Total due to authors"
  FROM titles t
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  WHERE t.sales IS NOT NULL;

Figure 7.19Result of Listing 7.19.

Total royalties Total advances Total due to authors
--------------- -------------- --------------------
     4387219.55     1336000.00           3051219.55

Tips for Listing 7.19

Listing 7.20 uses a three-table join to calculate the royalty earned by each author for each book that the author wrote (or cowrote). Because a book can have multiple authors, per-author royalty calculations involve each author’s share of a book’s royalty (and advance). The author’s royalty share for each book is given in the table title_authors in the column royalty_share. For a book with a sole author, royalty_share is 1.0 (100 percent). For a book with multiple authors, the royalty_share of each author is a fractional amount between 0 and 1 (inclusive); all the royalty_share values for a particular book must sum to 1.0 (100 percent). See Figure 7.20 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.

Listing 7.20Calculate the royalty earned by each author for each book that the author wrote (or cowrote). See Figure 7.20 for the result.

SELECT
    ta.au_id,
    t.title_id,
    t.pub_id,
    t.sales * t.price *
      r.royalty_rate * ta.royalty_share
      AS "Royalty share",
    r.advance * ta.royalty_share
      AS "Advance share",
    (t.sales * t.price *
      r.royalty_rate * ta.royalty_share) -
      (r.advance * ta.royalty_share)
      AS "Due to author"
  FROM title_authors ta
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  WHERE t.sales IS NOT NULL
  ORDER BY ta.au_id ASC, t.title_id ASC;

Figure 7.20Result of Listing 7.20.

au_id title_id pub_id Royalty share Advance share Due to author
----- -------- ------ ------------- ------------- -------------
A01   T01      P01           622.32      10000.00      -9377.68
A01   T02      P03         11450.50       1000.00      10450.50
A01   T13      P03         18834.32      20000.00      -1165.68
A02   T06      P01         18066.72      20000.00      -1933.28
A02   T07      P03       1976138.45     500000.00    1476138.45
A02   T12      P01        116911.17      50000.00      66911.17
A03   T04      P04          8106.38      12000.00      -3893.62
A03   T11      P04         15792.90      30000.00     -14207.10
A04   T04      P04          5404.26       8000.00      -2595.74
A04   T05      P04        126000.72     100000.00      26000.72
A04   T07      P03       1976138.45     500000.00    1476138.45
A04   T11      P04         15792.90      30000.00     -14207.10
A05   T03      P02         71777.77      15000.00      56777.77
A06   T08      P04          1638.00           .00       1638.00
A06   T09      P04          3487.50           .00       3487.50
A06   T11      P04         21057.20      40000.00     -18942.80

Tips for Listing 7.20

Listing 7.21 is similar to Listing 7.20 except that it adds a join to the table authors to print the author names and includes a WHERE condition to retrieve rows with only positive royalties. See Figure 7.21 for the result.

Listing 7.21List only positive royalties earned by each author for each book that the author wrote (or cowrote). See Figure 7.21 for the result.

SELECT
    a.au_id,
    a.au_fname,
    a.au_lname,
    t.title_name,
    (t.sales * t.price *
      r.royalty_rate * ta.royalty_share) -
      (r.advance * ta.royalty_share)
      AS "Due to author"
  FROM authors a
  INNER JOIN title_authors ta
    ON a.au_id = ta.au_id
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  WHERE t.sales IS NOT NULL
    AND (t.sales * t.price *
      r.royalty_rate * ta.royalty_share) -
      (r.advance * ta.royalty_share) > 0
  ORDER BY a.au_id ASC, t.title_id ASC;

Figure 7.21Result of Listing 7.21.

au_id au_fname  au_lname  title_name                    Due to author
----- --------- --------- ----------------------------- -------------
A01   Sarah     Buchman   200 Years of German Humor          10450.50
A02   Wendy     Heydemark I Blame My Mother                1476138.45
A02   Wendy     Heydemark Spontaneous, Not Annoying          66911.17
A04   Klee      Hull      Exchange of Platitudes             26000.72
A04   Klee      Hull      I Blame My Mother                1476138.45
A05   Christian Kells     Ask Your System Administrator      56777.77
A06             Kellsey   Just Wait Until After School        1638.00
A06             Kellsey   Kiss My Boo-Boo                     3487.50

Tips for Listing 7.21

Listing 7.22 uses a GROUP BY clause to calculate the total royalties paid by each publisher. The aggregate function COUNT() computes the total number of books for which each publisher pays royalties. Note that each author’s royalty share is unnecessary here because no per-author calculations are involved. See Figure 7.22 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.

Listing 7.22Calculate the total royalties paid by each publisher. See Figure 7.22 for the result.

SELECT
    t.pub_id,
    COUNT(t.sales)
      AS "Num books",
    SUM(t.sales * t.price * r.royalty_rate)
      AS "Total royalties",
    SUM(r.advance)
      AS "Total advances",
    SUM((t.sales * t.price
      * r.royalty_rate) -
      r.advance)
      AS "Total due to authors"
  FROM titles t
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  WHERE t.sales IS NOT NULL
  GROUP BY t.pub_id
  ORDER BY t.pub_id ASC;

Figure 7.22Result of Listing 7.22.

pub_id Num books Total royalties Total advances Total due to authors
------ --------- --------------- -------------- --------------------
P01            3       135600.21       80000.00             55600.21
P02            1        71777.77       15000.00             56777.77
P03            3      3982561.72     1021000.00           2961561.72
P04            5       197279.85      220000.00            -22720.15

Tips for Listing 7.22

Listing 7.23 is similar to Listing 7.22 except that it calculates the total royalties earned by each author for all books written (or cowritten). See Figure 7.23 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.

Listing 7.23Calculate the total royalties earned by each author for all books written (or cowritten). See Figure 7.23 for the result.

SELECT
    ta.au_id,
    COUNT(sales)
      AS "Num books",
    SUM(t.sales * t.price *
      r.royalty_rate * ta.royalty_share)
      AS "Total royalties share",
    SUM(r.advance * ta.royalty_share)
      AS "Total advances share",
    SUM((t.sales * t.price *
      r.royalty_rate * ta.royalty_share) -
      (r.advance * ta.royalty_share))
      AS "Total due to author"
  FROM title_authors ta
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  WHERE t.sales IS NOT NULL
  GROUP BY ta.au_id
  ORDER BY ta.au_id ASC;

Figure 7.23Result of Listing 7.23.

au_id Num books   Total royalties share Total advances share Total due to author
----- ----------- --------------------- -------------------- -------------------
A01             3              30907.14             31000.00              -92.86
A02             3            2111116.34            570000.00          1541116.34
A03             2              23899.28             42000.00           -18100.72
A04             4            2123336.32            638000.00          1485336.32
A05             1              71777.77             15000.00            56777.77
A06             3              26182.70             40000.00           -13817.30

Tips for Listing 7.23

Listing 7.24 uses two grouping columns to calculate the total royalties to be paid by each U.S. publisher to each author for all books written (or cowritten) by the author. The HAVING condition returns retrieve rows with only positive net royalties, and the WHERE condition retrieves only U.S. publishers. See Figure 7.24 for the result.

Listing 7.24Calculate the positive net royalties to be paid by each U.S. publisher to each author for all books written (or cowritten) by the author. See Figure 7.24 for the result.

SELECT
    t.pub_id,
    ta.au_id,
    COUNT(*)
      AS "Num books",
    SUM(t.sales * t.price *
      r.royalty_rate * ta.royalty_share)
      AS "Total royalties share",
    SUM(r.advance * ta.royalty_share)
      AS "Total advances share",
    SUM((t.sales * t.price *
      r.royalty_rate * ta.royalty_share) -
      (r.advance * ta.royalty_share))
      AS "Total due to author"
  FROM title_authors ta
  INNER JOIN titles t
    ON t.title_id = ta.title_id
  INNER JOIN royalties r
    ON r.title_id = t.title_id
  INNER JOIN publishers p
    ON p.pub_id = t.pub_id
  WHERE t.sales IS NOT NULL
    AND p.country IN ('USA')
  GROUP BY t.pub_id, ta.au_id
  HAVING SUM((t.sales * t.price *
    r.royalty_rate * ta.royalty_share) -
    (r.advance * ta.royalty_share)) > 0
  ORDER BY t.pub_id ASC, ta.au_id ASC;

Figure 7.24Result of Listing 7.24.

pub_id au_id Num books   Total royalties share Total advances share Total due to author
------ ----- ----------- --------------------- -------------------- -------------------
P01    A02             2             134977.89             70000.00            64977.89
P02    A05             1              71777.77             15000.00            56777.77
P04    A04             3             147197.87            138000.00             9197.87

Tips for Listing 7.24

Creating Outer Joins with OUTER JOIN

In the preceding section, you learned that inner joins return rows only if at least one row from both tables satisfies the join condition(s). An inner join eliminates the rows that don’t match with a row from the other table, whereas an outer join returns all rows from at least one of the tables (provided that those rows meet any WHERE or HAVING search conditions).

Outer joins are useful for answering questions that involve missing quantities: authors who have written no books or classes with no enrolled students, for example. Outer joins also are helpful for creating reports in which you want to list all the rows of one table along with matching rows from another table: all authors and any books that sold more than a given number of copies, for example, or all products with order quantities, including products that no one ordered.

Unlike other joins, the order in which you specify the tables in outer joins is important, so the two join operands are called the left table and the right table. Outer joins come in three flavors:

To summarize, all rows are retrieved from the left table referenced in a left outer join, all rows are retrieved from the right table referenced in a right outer join, and all rows from both tables are retrieved in a full outer join. In all cases, unmatched rows are padded with nulls. In the result, you can’t distinguish the nulls (if any) that were in the input tables originally from the nulls inserted by the outer-join operation. Remember that the conditions NULL = NULL and NULL = any_value are unknown and not matches; see “Nulls” in Chapter 3.

To create a left outer join:

To create a right outer join:

To create a full outer join:

Tips for OUTER JOIN

For reference in the following four examples, Listing 7.25 and Figure 7.25 show the city for each author and publisher.

Listing 7.25List the cities of the authors and the cities of the publishers. See Figure 7.25 for the result.

SELECT a.au_fname, a.au_lname, a.city
  FROM authors a;

SELECT p.pub_name, p.city
  FROM publishers p;

Figure 7.25Result of Listing 7.25.

au_fname  au_lname    city
--------- ----------- -------------
Sarah     Buchman     Bronx
Wendy     Heydemark   Boulder
Hallie    Hull        San Francisco
Klee      Hull        San Francisco
Christian Kells       New York
          Kellsey     Palo Alto
Paddy     O'Furniture Sarasota

pub_name            city
------------------- -------------
Abatis Publishers   New York
Core Dump Books     San Francisco
Schadenfreude Press Hamburg
Tenterhooks Press   Berkeley

Listing 7.26 performs an inner join of the tables authors and publishers on their city columns. The result, Figure 7.26, lists only the authors who live in cities in which a publisher is located. You can compare the result of this inner join with the results of the outer joins in the following three examples.

Listing 7.26List the authors who live in cities in which a publisher is located. See Figure 7.26 for the result.

SELECT a.au_fname, a.au_lname, p.pub_name
  FROM authors a
  INNER JOIN publishers p
    ON a.city = p.city;

Figure 7.26Result of Listing 7.26.

au_fname  au_lname pub_name
--------- -------- -----------------
Hallie    Hull     Core Dump Books
Klee      Hull     Core Dump Books
Christian Kells    Abatis Publishers

Tips for Listing 7.26

Listing 7.27 uses a left outer join to include all authors in the result, regardless of whether a publisher is located in the same city. See Figure 7.27 for the result.

Listing 7.27This left outer join includes all rows in the table authors in the result, whether or not there’s a match in the column city in the table publishers. See Figure 7.27 for the result.

SELECT a.au_fname, a.au_lname, p.pub_name
  FROM authors a
  LEFT OUTER JOIN publishers p
    ON a.city = p.city
  ORDER BY p.pub_name ASC,
    a.au_lname ASC, a.au_fname ASC;

Figure 7.27Result of Listing 7.27. Note that there’s no matching data for four of the listed authors, so these rows contain null in the column pub_name.

au_fname  au_lname    pub_name
--------- ----------- -----------------
Sarah     Buchman     NULL
Wendy     Heydemark   NULL
          Kellsey     NULL
Paddy     O'Furniture NULL
Christian Kells       Abatis Publishers
Hallie    Hull        Core Dump Books
Klee      Hull        Core Dump Books

Tips for Listing 7.27

Listing 7.28 uses a right outer join to include all publishers in the result, regardless of whether an author lives in the publisher’s city. See Figure 7.28 for the result.

Listing 7.28This right outer join includes all rows in the table publishers in the result, whether or not there’s a match in the column city in the table authors. See Figure 7.28 for the result.

SELECT a.au_fname, a.au_lname, p.pub_name
  FROM authors a
  RIGHT OUTER JOIN publishers p
    ON a.city = p.city
  ORDER BY p.pub_name ASC,
    a.au_lname ASC, a.au_fname ASC;

Figure 7.28Result of Listing 7.28. Note that there’s no matching data for two of the listed publishers, so these rows contain nulls in the columns au_fname and au_lname.

au_fname  au_lname pub_name
--------- -------- -------------------
Christian Kells    Abatis Publishers
Hallie    Hull     Core Dump Books
Klee      Hull     Core Dump Books
NULL      NULL     Schadenfreude Press
NULL      NULL     Tenterhooks Press

Tips for Listing 7.28

Listing 7.29 uses a full outer join to include all publishers and all authors in the result, regardless of whether a publisher and author are located in the same city. See Figure 7.29 for the result.

Listing 7.29This full outer join includes all rows in the tables authors and publishers in the result, whether or not there’s a match in the city columns. See Figure 7.29 for the result.

SELECT a.au_fname, a.au_lname, p.pub_name
  FROM authors a
  FULL OUTER JOIN publishers p
    ON a.city = p.city
  ORDER BY p.pub_name ASC,
    a.au_lname ASC, a.au_fname ASC;

Figure 7.29Result of Listing 7.29. This result contains nine rows: four rows for authors who have no matching rows in the table publishers, three rows in which the author and publisher coexist in the same city, and two rows for publishers who have no matching city in the table authors.

au_fname  au_lname    pub_name
--------- ----------- -------------------
Sarah     Buchman     NULL
Wendy     Heydemark   NULL
          Kellsey     NULL
Paddy     O'Furniture NULL
Christian Kells       Abatis Publishers
Hallie    Hull        Core Dump Books
Klee      Hull        Core Dump Books
NULL      NULL        Schadenfreude Press
NULL      NULL        Tenterhooks Press

Tips for Listing 7.29

Listing 7.30 uses a left outer join to list the number of books that each author wrote (or cowrote). See Figure 7.30 for the result. Note that in contrast to Listing 7.11 in “Creating an Inner Join with INNER JOIN” earlier in this chapter, the author A07 (Paddy O'Furniture) appears in the result even though he has written no books.

Listing 7.30List the number of books that each author wrote (or cowrote), including authors who have written no books. See Figure 7.30 for the result.

SELECT
    a.au_id,
    COUNT(ta.title_id) AS "Num books"
  FROM authors a
  LEFT OUTER JOIN title_authors ta
    ON a.au_id = ta.au_id
  GROUP BY a.au_id
  ORDER BY a.au_id ASC;

Figure 7.30Result of Listing 7.30.

au_id Num books
----- ---------
A01           3
A02           4
A03           2
A04           4
A05           1
A06           3
A07           0

Tips for Listing 7.30

Listing 7.31 uses a WHERE condition to test for null and list only the authors who haven’t written a book. See Figure 7.31 for the result.

Listing 7.31List the authors who haven’t written (or cowritten) a book. See Figure 7.31 for the result.

SELECT a.au_id, a.au_fname, a.au_lname
  FROM authors a
  LEFT OUTER JOIN title_authors ta
    ON a.au_id = ta.au_id
  WHERE ta.au_id IS NULL;

Figure 7.31Result of Listing 7.31.

au_id au_fname au_lname
----- -------- -----------
A07   Paddy    O'Furniture

Tips for Listing 7.31

Listing 7.32 combines an inner join and a left outer join to list all authors and any books they wrote (or cowrote) that sold more than 100,000 copies. In this example, first I created a filtered INNER JOIN result and then OUTER JOINed it with the table authors, from which I wanted all rows. See Figure 7.32 for the result.

Listing 7.32List all authors and any books written (or cowritten) that sold more than 100,000 copies. See Figure 7.32 for the result.

SELECT a.au_id, a.au_fname, a.au_lname,
    tta.title_id, tta.title_name, tta.sales
  FROM authors a
  LEFT OUTER JOIN
  (SELECT ta.au_id, t.title_id,
      t.title_name, t.sales
    FROM title_authors ta
    INNER JOIN titles t
      ON t.title_id = ta.title_id
    WHERE sales > 100000) tta
    ON a.au_id = tta.au_id
  ORDER BY a.au_id ASC, tta.title_id ASC;

Figure 7.32Result of Listing 7.32.

au_id au_fname  au_lname    title_id title_name                sales
----- --------- ----------- -------- ------------------------- -------
A01   Sarah     Buchman     NULL     NULL                         NULL
A02   Wendy     Heydemark   T07      I Blame My Mother         1500200
A02   Wendy     Heydemark   T12      Spontaneous, Not Annoying  100001
A03   Hallie    Hull        NULL     NULL                         NULL
A04   Klee      Hull        T05      Exchange of Platitudes     201440
A04   Klee      Hull        T07      I Blame My Mother         1500200
A05   Christian Kells       NULL     NULL                         NULL
A06             Kellsey     NULL     NULL                         NULL
A07   Paddy     O'Furniture NULL     NULL                         NULL

Tips for Listing 7.32

Creating a Self-Join

A self-join is a normal SQL join that joins a table to itself and retrieves rows from a table by comparing values in one or more columns in the same table. Self-joins often are used in tables with a reflexive relationship, which is a primary-key/foreign-key relationship from a column or combination of columns in a table to other columns in that same table. For information about keys, see “Primary Keys” and “Foreign Keys” in Chapter 2.

Suppose that you have the following table, named employees:

emp_id emp_name          boss_id
------ ----------------- -------
E01    Lord Copper       NULL
E02    Jocelyn Hitchcock E01
E03    Mr. Salter        E01
E04    William Boot      E03
E05    Mr. Corker        E03

emp_id is a primary key that uniquely identifies the employee, and boss_id is an employee ID that identifies the employee’s manager. Each manager also is an employee, so to ensure that each manager ID that is added to the table matches an existing employee ID, boss_id is defined as a foreign key of emp_id. Listing 7.33 uses this reflexive relationship to compare rows within the table and retrieve the name of the manager of each employee. (You wouldn’t need a join at all to get just the manager’s ID.) See Figure 7.33 for the result.

Listing 7.33List the name of each employee and the name of his or her manager. See Figure 7.33 for the result.

SELECT
    e1.emp_name AS "Employee name",
    e2.emp_name AS "Boss name"
  FROM employees e1
  INNER JOIN employees e2
    ON e1.boss_id = e2.emp_id;

Figure 7.33Result of Listing 7.33. Note that Lord Copper, who has no boss, is excluded from the result because his null boss_id doesn’t satisfy the join condition.

Employee name     Boss name
----------------- -----------
Jocelyn Hitchcock Lord Copper
Mr. Salter        Lord Copper
William Boot      Mr. Salter
Mr. Corker        Mr. Salter

The same table (employees) appears twice in Listing 7.33 with two different aliases (e1 and e2) that are used to qualify column names in the join condition:

e1.boss_id = e2.emp_id

As with any join, a self-join requires two tables, but instead of adding a second table to the join, you add a second instance of the same table. That way, you can compare a column in the first instance of the table to a column in the second instance. As with all joins, the DBMS combines and returns rows of the table that satisfy the join condition. You actually aren’t creating another copy of the table—you’re joining the table to itself—but the effect might be easier to understand if you think about it as being two tables.

To create a self-join:

Tips for Self Joins

Listing 7.34 uses a WHERE search condition and self-join from the column state to itself to find all authors who live in the same state as author A04 (Klee Hull). See Figure 7.34 for the result.

Listing 7.34List the authors who live in the same state as author A04 (Klee Hull). See Figure 7.34 for the result.

SELECT a1.au_id, a1.au_fname,
    a1.au_lname, a1.state
  FROM authors a1
  INNER JOIN authors a2
    ON a1.state = a2.state
  WHERE a2.au_id = 'A04';

Figure 7.34Result of Listing 7.34.

au_id au_fname au_lname state
----- -------- -------- -----
A03   Hallie   Hull     CA
A04   Klee     Hull     CA
A06            Kellsey  CA

Tips for Listing 7.34

For every biography, Listing 7.35 lists the other biographies that outsold it. Note that the WHERE search condition requires type = 'biography' for both tables t1 and t2 because the join condition considers the column type to be two separate columns. See Figure 7.35 for the result.

Listing 7.35For every biography, list the title ID and sales of the other biographies that outsold it. See Figure 7.35 for the result.

SELECT t1.title_id, t1.sales,
    t2.title_id AS "Better seller",
    t2.sales AS "Higher sales"
  FROM titles t1
  INNER JOIN titles t2
    ON t1.sales < t2.sales
  WHERE t1.type = 'biography'
    AND t2.type = 'biography'
  ORDER BY t1.title_id ASC, t2.sales ASC;

Figure 7.35Result of Listing 7.35.

title_id sales  Better seller Higher sales
-------- ------ ------------- ------------
T06       11320 T12                 100001
T06       11320 T07                1500200
T12      100001 T07                1500200

Tips for Listing 7.35

Listing 7.36 is a self-join to find all pairs of authors within New York state. See Figure 7.36 for the result.

Listing 7.36List all pairs of authors who live in New York state. See Figure 7.36 for the result.

SELECT
    a1.au_fname, a1.au_lname,
    a2.au_fname, a2.au_lname
  FROM authors a1
  INNER JOIN authors a2
    ON a1.state = a2.state
  WHERE a1.state = 'NY'
  ORDER BY a1.au_id ASC, a2.au_id ASC;

Figure 7.36Result of Listing 7.36.

au_fname  au_lname au_fname  au_lname
--------- -------- --------- --------
Sarah     Buchman  Sarah     Buchman
Sarah     Buchman  Christian Kells
Christian Kells    Sarah     Buchman
Christian Kells    Christian Kells

Tips for Listing 7.36

The first and fourth rows of Figure 7.36 are unnecessary because they indicate that Sarah Buchman lives in the same state as Sarah Buchman, and likewise for Christian Kells. Adding a join condition retains only those rows in which the two authors differ (Listing 7.37 and Figure 7.37).

Listing 7.37List all different pairs of authors who live in New York state. See Figure 7.37 for the result.

SELECT
    a1.au_fname, a1.au_lname,
    a2.au_fname, a2.au_lname
  FROM authors a1
  INNER JOIN authors a2
    ON a1.state = a2.state
    AND a1.au_id <> a2.au_id
  WHERE a1.state = 'NY'
  ORDER BY a1.au_id ASC, a2.au_id ASC;

Figure 7.37Result of Listing 7.37.

au_fname  au_lname au_fname  au_lname
--------- -------- --------- --------
Sarah     Buchman  Christian Kells
Christian Kells    Sarah     Buchman

Tips for Listing 7.37

Listing 7.37 still isn’t quite what I want because the two result rows are redundant. The first row states that Sarah Buchman lives in the same state as Christian Kells, and the second row gives the same information. To eliminate this redundancy, I’ll change the second join condition’s comparison operator from not-equal to less-than (Listing 7.38 and Figure 7.38).

Listing 7.38List all different pairs of authors who live in New York state, with no redundancies. See Figure 7.38 for the result.

SELECT
    a1.au_fname, a1.au_lname,
    a2.au_fname, a2.au_lname
  FROM authors a1
  INNER JOIN authors a2
    ON a1.state = a2.state
    AND a1.au_id < a2.au_id
  WHERE a1.state = 'NY'
  ORDER BY a1.au_id ASC, a2.au_id ASC;

Figure 7.38Result of Listing 7.38.

au_fname au_lname au_fname  au_lname
-------- -------- --------- --------
Sarah    Buchman  Christian Kells

Tips for Listing 7.38