Linux DevCenter    
 Published on Linux DevCenter (http://www.linuxdevcenter.com/)
 See this if you're having trouble printing code examples


Filtering SELECTed Data with WHERE

Filtering SELECTed Data with WHERE

11/03/2000

In the last column, we took our first look at selecting and sorting columns of data from a SQL database using the SELECT statement. All of our queries last week, however, returned results from the entire database -- what about returning some subset of the data? SQL databases can easily be filtered by virtually any set of criteria using a single clause.

The WHERE clause

As we've seen before, SQL is a very English-like language with verbs (SELECT), objects (tables, fields), and clauses (FROM xxx, ORDER BY xxx). And, as in standard English grammar, clauses can serve to modify some other portion of a statement. One of the most powerful SQL clauses is the WHERE clause, which is used to filter results on a specific set of criteria. The syntax is trivial.

SELECT ... WHERE (search criteria)

The search criteria can be any combination of database fields, values, and operators. So what operators exist? Table 1 lists some standard SQL operators which should be basically familiar to you if you've done any programming or scripting in the past. Table 2 lists the Boolean operators that SQL recognizes.

Table 1: SQL Operator

Operator Meaning
<> non-equality
= equality
< less than
<= less than or equal to
> greater than
>= greater than or equal to
BETWEEN is in a range of values
IN is in a set of values
LIKE is similar to a string value

Table 2: SQL Boolean Operator

Operator Meaning
AND combines two conditions
OR either or both of two conditions
NOT removes occurrences of condition from query
IS equivalent (equal) to some condition

These operators provide a suprisingly rich ability to filter virtually any kind of SQL data.

WHERE in action

We can jump right into action with the WHERE clause using the example MusicCollection database from the last column, which is reproduced in the following table.

MusicCollection
ID Title Artist Year
1 Pet Sounds The Beach Boys 1966
2 Security Peter Gabriel 1990
3 The Way It Is Bruce Hornsby 1986
4 Joshua Judges Ruth Lyle Lovett 1992

Starting with the full database, let's quickly review last week by displaying all the album titles using the SELECT statement.

SELECT Title FROM MusicCollection;

Title
--------------------
Pet Sounds
Security
The Way It Is
Joshua Judges Ruth

We can now modify that simple SELECT statement to return Title field values by adding a WHERE clause. Let's filter the results by choosing only titles from the past decade:

SELECT Title FROM MusicCollection WHERE Year>=1990;

Title
--------------------
Security
Joshua Judges Ruth

The result is that the Title field is filtered by the condition that the Year field must be greater than or equal to the value 1990. This seems pretty simple, but we've now drastically increased our power to manipulate the database by adding a single clause. We can take any of the SELECT techniques we learned last time -- selecting multiple columns and sorting -- and combine them with the WHERE clause to create powerful queries. For example:

SELECT Title,Artist FROM MusicCollection WHERE Year>=1990 ORDER BY Title;

Title Artist
-------------------- --------------------
Joshua Judges Ruth Lyle Lovett
Security Peter Gabriel

Now we're starting to get some serious work done! We've filtered the database table by Year, sorted by Title, and returned the Title and Artist fields. That's a lot of power in a few simple clauses. But if you've really been paying attention, this query isn't quite what we wanted -- I wanted to find all albums from the past decade (the 1990s). Using a single >= operator will return improper results as years pass and newer albums are added to the database. What I really want to do is combine two sets of criteria -- now it is time for those Boolean operators!

SELECT Title,Artist FROM MusicCollection WHERE (Year>=1990) AND (Year<=1999) ORDER BY Title;

Title Artist
-------------------- --------------------
Joshua Judges Ruth Lyle Lovett
Security Peter Gabriel

This query returns exactly the same results, but has the added advantage of working properly when I add albums released in 2000, 2001, etc. Just for fun, we'll rewrite this query using the BETWEEN operator:

SELECT Title,Artist FROM MusicCollection WHERE (Year BETWEEN 1990 AND 1999) ORDER BY Title;

Title Artist
-------------------- --------------------
Joshua Judges Ruth Lyle Lovett
Security Peter Gabriel

Different query, same result. This points up one of the fundamental pieces of SQL wisdom:

SQL Wisdom #1) There are often multiple ways to implement a SQL query to produce a given result.

It is really important to realize that there are many, many pieces of the SQL language -- some that often seem redundant or superfluous -- but they all get the job done, which is the real goal.

Next steps

This time we delved a bit deeper into SQL's SELECT statement. Adding this week to last week, we've got the following syntax for SELECT:

SELECT column_name(s) [AS alias] FROM table_name [ORDER BY column_name(s) (ASC | DESC) ] WHERE (search_criteria);

where the statements in square brackets are optional.

Next time, we'll take a slight detour and look at how to get data into the database. Until then, try practicing the SELECT statement to show what you know aboutSQL.

John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.


Read more aboutSQL columns.

Discuss this article in the O'Reilly Network Linux Forum.

Return to the Linux DevCenter.

 

Copyright © 2009 O'Reilly Media, Inc.