Introduction

Sections
About SQL
About This Book
What You’ll Need

SQL (pronounced es-kyoo-el) is the standard programming language for creating, updating, and retrieving information that’s stored in databases. With SQL, you can turn your ordinary questions (“Where do our customers live?”) into statements that your database system can understand (SELECT DISTINCT city, state FROM customers;). You might already know how to extract this type of information by using a graphical query or reporting tool, but perhaps you’ve noticed that this tool becomes limiting or cumbersome as your questions grow in complexity—that’s where SQL comes in.

You also can use SQL to add, change, and delete data and database objects. All modern relational database management systems (DBMSs) support SQL, although support varies by product (more about that later in this introduction).

Database vs. DBMS

A database is not the same as the database software that you’re running; it’s incorrect to say, “Oracle is a database.” Database software is called a database management system (DBMS). A database, which is just one component of a DBMS, is the data itself—that is, it’s a container (one or more files) that stores structured information. Besides controlling the organization, integrity, and retrieval of data in databases, DBMSs handle such tasks as physical storage, security, backup, replication, and error recovery.

DBMS also is abbreviated RDBMS, in which the R stands for relational. An RDBMS organizes data according to the relational model (see Chapter 2) rather than, say, a hierarchical or network model. This book covers only relational systems, so the initial R is implied in DBMS, .

About SQL

SQL is:

A programming language.SQL is a formal language in which you write programs to create, modify, and query databases. Your database system executes your SQL program, performs the tasks you’ve specified, and then displays the results (or an error message). Programming languages differ from natural (spoken) languages in that programming languages are designed for a specific purpose, have a small vocabulary, and are inflexible and unambiguous. So if you don’t get the results you expect, then it’s because your program contains an error—or bug—and not because the computer misinterpreted your instructions. (Debugging one’s programs is a cardinal programming task.)

SQL, like any formal language, is defined by rules of syntax, which determine the words and symbols you can use and how they can be combined, and semantics, which determine the actual meaning of a syntactically correct statement. Note that you can write a legal SQL statement that expresses the wrong meaning (good syntax, bad semantics). Chapter 3 introduces SQL syntax and semantics.

Easy to learn.Easy compared with other programming languages, that is. If you’ve never written a program before, then you’ll find the transition from natural to formal language to be frustrating. Still, SQL’s statements read like sentences to make things easy on humans. A novice programmer probably would understand the SQL statement SELECT au_fname, au_lname FROM authors ORDER BY au_lname; to mean “List the authors’ first and last names, sorted by last name,” whereas the same person would find the equivalent C or Perl program impenetrable.

Declarative.If you’ve never programmed, then you can skip this point without loss of continuity. If you’ve programmed in a language such as C or JavaScript, then you’ve used a procedural language, in which you specify the explicit steps to follow to produce a result. SQL is a declarative language, in which you describe what you want and not how to do it; your database system’s optimizer will determine the “how.” As such, standard SQL lacks traditional control-flow constructs such as if-then-else, while, for, and goto statements.

To demonstrate this difference, I’ve written programs that perform an equivalent task in Microsoft Access Visual Basic (VB, a procedural language) and SQL. Listing i.1 shows a VB program that extracts author names from a table that contains author information. You needn’t understand the entire program, but note that it uses a Do Until loop to define explicitly how to extract data. Listing i.2 shows how to do the same task with a single SQL statement (as opposed to about 20 lines of VB code). With SQL, you specify only what needs to be accomplished; the DBMS determines and performs internally the actual step-by-step operations needed to get the result.

Listing i.1This Microsoft Access Visual Basic routine extracts the first and last names from a database table containing author information and places the results in an array.

Sub GetAuthorNames()
  Dim db As Database
  Dim rs As Recordset
  Dim i As Integer
  Dim au_names() As String
  Set db = CurrentDb()
  Set rs = db.OpenRecordset("authors")
  rs.MoveLast
  ReDim au_names(rs.RecordCount - 1, 1)
  With rs
    .MoveFirst
    i = 0
    Do Until .EOF
      au_names(i, 0) = ![au_fname]
      au_names(i, 1) = ![au_lname]
      i = i + 1
      .MoveNext
    Loop
  End With
  rs.Close
  db.Close
End Sub

Listing i.2This single SQL statement performs the same query as the Visual Basic routine in Listing i.1. Access’s internal optimizer determines the best way to extract the data.

SELECT au_fname, au_lname
  FROM authors;

Moreover, Listing i.2 is a trivial SQL query. After you add common operations such as sorts, filters, and joins, you might need more than 100 lines of procedural code to accomplish what a single SQL SELECT statement can do.

Interactive or embedded.In interactive SQL, you issue SQL commands directly to your DBMS, which displays the results as soon as they’re produced. DBMS servers come with both graphical and command-line tools that accept typed SQL statements or text files that contain SQL programs (scripts).

If you’re developing database applications, then you can “embed” SQL statements in programs written in a host language, which commonly is a general-purpose language (C++, Java, or COBOL, for example) or a scripting language (Python, R, or PHP). A PHP script can use an SQL statement to query a MySQL database, for example; MySQL will pass the query result back to a PHP variable for further analysis or webpage display. Drawing from the preceding examples, I’ve embedded an SQL statement in an Access Visual Basic program in Listing i.3.

Listing i.3Here, Visual Basic serves as the host language for embedded SQL.

Sub GetAuthorNames2()
  Dim db As Database
  Dim rs As Recordset
  Set db = CurrentDb()
  Set rs = db.OpenRecordset("SELECT au_fname,
      au_lname FROM authors;")
  '  --Do something with rs here.
  rs.Close
  db.Close
End Sub

This book covers interactive SQL. In general, any SQL statement that can be used interactively also can be used in a host language, although perhaps with slight syntactic differences, depending on your DBMS, host language, and operating environment.

Standardized.SQL isn’t “owned” by any particular firm. It’s an open standard defined by an international standards working group, under the joint leadership of the International Organization for Standardization (ISO) and the International Engineering Consortium (IEC). The American National Standards Institute (ANSI) participates in the working groups and has ratified the standard (Figure i.1). The terms ISO SQL, ANSI SQL, and standard SQL are synonymous. For more information, see “SQL Standards and Conformance” in Chapter 3.

Figure i.1This is the cover of ISO/IEC 9075 (Part 1), which defines the SQL language officially. You can buy it in electronic format at ansi.org or iso.org if you like. Its intended audience is not SQL programmers, however, but people who design DBMS systems, compilers, and optimizers.

Document: Cover of publication ISO/IEC 9075-1, Information technology - Database languages - SQL - Part 1: Framework (SQL/Framework)

All DBMS vendors add proprietary features to standard SQL to enhance the language. These extensions usually are additional commands, keywords, functions, operators, data types, and control-flow constructs such as if-then, while, and goto statements. Microsoft, Oracle, and IBM have added so many features to standard SQL that the resulting languages—Transact-SQL, PL/SQL, and SQL PL, respectively—are actually separate languages in their own right, rather than mere supersets of standard SQL. One vendor’s extensions generally are incompatible with other vendors’ products. I don’t cover proprietary SQL extensions, but I do point out when a vendor’s SQL dialect doesn’t comply with the standard SQL examples in this book; see “Using SQL with a Specific DBMS” later in this introduction.

Used to change data and database objects.SQL statements are divided into three categories:

Not an acronym.It’s a common misconception that SQL stands for structured query language; it stands for S–Q–L and nothing else. Why? Because ISO says so. The official name is Database Language SQL (refer to Figure i.1). Furthermore, referring to it as a structured query language is a disservice to new SQL programmers. It amuses insiders to point out that “structured query language” is the worst possible description because SQL:

About This Book

Chapters 1 through 3 contain expository material about DBMSs, the relational model, and SQL syntax. Later chapters all use a learn-by-example approach to teach you how to use the SQL programming language to maintain and query database information.

You don’t need prior programming experience but you must be familiar with your operating system’s filesystem (the hierarchy of folders and files) and know how to issue commands at a command prompt or shell.

This book isn’t an exhaustive guide to SQL; I’ve limited its scope to the most-used statements and functions (itself a vast topic). For information about other SQL statements, refer to your DBMS’s documentation.

Typographic Conventions

I use the following typographic conventions:

Bold type introduces new terms.

Italic type represents replaceable variables in regular text.

Monospace type denotes SQL code and syntax in listings and in regular text. It also shows command-prompt text.

Highlighted monospace type highlights SQL code fragments and results that are explained in the accompanying text.

Italic monospace type denotes a variable in SQL code that you must replace with a value. You’d replace column with the name of an actual column, for example.

Syntax Conventions

SQL is a free-form language without restrictions on line breaks or the number of words per line, so I use a consistent style in SQL syntax diagrams and code listings to make the code easy to read and maintain:

Using SQL with a Specific DBMS

The DBMS icon flags a vendor-specific departure from the SQL standard. If you see this icon, then it means that a particular vendor’s SQL dialect doesn’t comply with the standard, and you must modify the listed SQL program to run on your DBMS. For example, the standard SQL operator that joins (concatenates) two strings is || (a double pipe), but Microsoft products use + (a plus sign) and MySQL uses the CONCAT() function instead, so you’ll need to change all occurrences of a || b in the example SQL listing to a + b (if you’re using Microsoft Access or Microsoft SQL Server) or to CONCAT(a, b) (if you’re using MySQL). In most cases, the SQL examples will work as is or with minor syntactic changes. Occasionally, SQL code won’t work at all because the DBMS doesn’t support a particular feature.

This book covers the following DBMSs (see Chapter 1 for details):

If you’re using a different DBMS (such as SAP, Teradata, MariaDB, FileMaker, or SAS SQL), and one of the SQL examples doesn’t work, then read the documentation to see how your DBMS’s SQL implementation departs from the SQL standard.

Tip: To compare general and technical information for a number of DBMSs, read the Wikipedia article “Comparison of relational database management systems”.

Server vs. Desktop DBMSs

A server DBMS acts as the server part of a client/server network; it stores databases and responds to SQL requests made by many clients. A client is an application or computer that sends an SQL request to a server and accepts the server’s response. The server does the actual work of executing the SQL against a database; the client merely accepts the answer. If your network uses a client/server architecture, then the client is the computer on your desk, and the server is a powerful, specialized machine in another room, building, or country. Standard database access protocols and interfaces such as ODBC, JDBC, and ADO.NET define how client/server requests and responses are transmitted.

A desktop DBMS is a stand-alone program that can store a database and do all the SQL processing itself or behave as a client of a server. A desktop DBMS can’t accept requests from other clients (that is, it can’t act like a server).

Servers include Microsoft SQL Server, Oracle Database, IBM Db2 Database, MySQL, and PostgreSQL. Desktop systems include Microsoft Access and FileMaker. By convention, I use the terms client and server to refer to client and server software itself or to the machine on which the software runs, unless the distinction is important.

What You’ll Need

To replicate this book’s examples on your own computer, you’ll need:

A text editor.Typing short or ad-hoc interactive SQL statements at a prompt is convenient, but you’ll want to store nontrivial SQL programs in text files. A text editor is a program that you use to open, create, and edit text files, which contain only printable letters, numbers, and symbols—no fonts, formatting, invisible codes, colors, graphics, or any of the clutter usually associated with a word processor. Every operating system includes a free text editor. Windows has Notepad, Unix and Linux have vi and emacs, and macOS has TextEdit, for example. By convention, SQL files have the filename extension .sql, but you can use .txt (or any extension) if you prefer. For programming, professional editors such as Sublime Text (sublimetext.com) and Vim (vim.org) are superior to the basic editors that come with Windows and macOS.

The sample database.The examples in this book use the same database, described in “The Sample Database” in Chapter 2. To build the sample database, follow the instructions in “Creating the Sample Database” in Chapter 2. If you’re working with a production-server DBMS, then you might need permission from your database administrator to run SQL programs that create, update, and delete data and database objects.

A database management system.How do you get SQL? You don’t—you get a DBMS that understands SQL and feed it an SQL program. The DBMS runs your program and displays the results, as described in the next chapter.