SQL Fundamentals – WHERE

WHERE – Filtering Results

SELECT retrieves data, but without filtering, you get everything. In a production database with millions of rows, "everything" is rarely what you want. The WHERE clause filters rows based on conditions, returning only the data that meets your criteria. This transforms generic queries into precise tools for test validation.

Basic WHERE Syntax

The WHERE clause appears after the FROM clause and specifies conditions that rows must satisfy to be included in results:

SELECT FirstName, LastName, Email
FROM Users
WHERE IsActive = 1;

This returns only users where the IsActive column equals 1 (true). Users with IsActive = 0 are excluded from results. The WHERE clause evaluates each row individually – if the condition is true, the row is included; if false, it's excluded.

Comparison Operators

SQL supports standard comparison operators for filtering:

  • = Equal to
  • != or <> Not equal to
  • < Less than
  • > Greater than
  • <= Less than or equal to
  • >= Greater than or equal to

Examples demonstrating each:

-- Equal to
SELECT * FROM Products WHERE CategoryId = 3;

-- Not equal to (both syntaxes work)
SELECT * FROM Users WHERE IsActive != 1;
SELECT * FROM Users WHERE IsActive <> 1;

-- Greater than
SELECT * FROM Orders WHERE TotalAmount > 100.00;

-- Less than or equal to
SELECT * FROM Products WHERE Price <= 50.00;

-- Greater than or equal to (date comparison)
SELECT * FROM Orders WHERE OrderDate >= '2024-01-01';

Date and string comparisons work exactly as you'd expect. Dates in quotes are parsed by the database, and string comparisons are case-sensitive or case-insensitive depending on your database configuration (SQL Server is typically case-insensitive; PostgreSQL is case-sensitive by default).

Combining Conditions with AND, OR, NOT

Real-world queries rarely filter on a single condition. Logical operators combine multiple conditions:

AND: Both conditions must be true

SELECT * FROM Orders
WHERE Status = 'Shipped'
  AND OrderDate >= '2024-01-01';

This returns orders that are both shipped and created on or after January 1, 2024. If either condition is false, the row is excluded.

OR: At least one condition must be true

SELECT * FROM Products
WHERE CategoryId = 1
   OR CategoryId = 2;

This returns products in category 1 or category 2 (or both, if a product somehow belonged to multiple categories).

NOT: Negates a condition

SELECT * FROM Users
WHERE NOT IsActive = 1;

-- Equivalent to:
SELECT * FROM Users
WHERE IsActive != 1;

Combining AND and OR requires careful attention to operator precedence:

-- WRONG: This doesn't do what you might expect
SELECT * FROM Products
WHERE CategoryId = 1 OR CategoryId = 2 AND Price < 100;

-- This reads as: CategoryId = 1 OR (CategoryId = 2 AND Price < 100)
-- So it returns ALL products in category 1 (any price)
-- PLUS products in category 2 that cost less than $100

AND has higher precedence than OR, so it evaluates first. Use parentheses to make your intent explicit:

-- CORRECT: Use parentheses for clarity
SELECT * FROM Products
WHERE (CategoryId = 1 OR CategoryId = 2)
  AND Price < 100;

This returns products in category 1 or 2 that also cost less than $100. Always use parentheses when mixing AND and OR – it eliminates ambiguity and makes your queries self-documenting.

Pattern Matching with LIKE

Exact equality comparisons (=) work for precise matches, but what if you need partial matches? The LIKE operator enables pattern matching with wildcards:

  • % matches any sequence of characters (including zero characters)
  • _ matches exactly one character
-- Emails ending with @example.com
SELECT * FROM Users
WHERE Email LIKE '%@example.com';

-- First names starting with 'A'
SELECT * FROM Users
WHERE FirstName LIKE 'A%';

-- First names containing 'ann' anywhere
SELECT * FROM Users
WHERE FirstName LIKE '%ann%';

-- Exactly 5-character product codes
SELECT * FROM Products
WHERE ProductCode LIKE '_____';

-- Product codes starting with 'A' followed by exactly 3 characters
SELECT * FROM Products
WHERE ProductCode LIKE 'A___';

Testing Context: LIKE is invaluable for test data cleanup. If your test framework generates users with emails following a pattern (like test_<guid>@example.com), you can clean them up easily:

DELETE FROM Users
WHERE Email LIKE 'test_%@example.com';

This removes all test users without touching real user accounts.

The IN Operator – Testing Membership

When you need to check if a value matches any item in a list, the IN operator is cleaner than multiple OR conditions:

-- Verbose way with OR
SELECT * FROM Products
WHERE CategoryId = 1
   OR CategoryId = 3
   OR CategoryId = 5
   OR CategoryId = 7;

-- Cleaner way with IN
SELECT * FROM Products
WHERE CategoryId IN (1, 3, 5, 7);

Both queries are functionally equivalent, but IN is more readable and maintainable. IN works with strings too:

SELECT * FROM Orders
WHERE Status IN ('Pending', 'Processing', 'Shipped');

You can negate IN with NOT IN to exclude values:

-- Get products NOT in categories 2, 4, or 6
SELECT * FROM Products
WHERE CategoryId NOT IN (2, 4, 6);

Range Queries with BETWEEN

The BETWEEN operator checks if a value falls within a range (inclusive of endpoints):

-- Products priced between $50 and $200 (inclusive)
SELECT * FROM Products
WHERE Price BETWEEN 50 AND 200;

-- Equivalent to:
SELECT * FROM Products
WHERE Price >= 50 AND Price <= 200;

BETWEEN works with dates, making it perfect for time-range queries:

-- Orders placed in January 2024
SELECT * FROM Orders
WHERE OrderDate BETWEEN '2024-01-01' AND '2024-01-31';

-- Orders placed in the last 30 days
SELECT * FROM Orders
WHERE OrderDate BETWEEN CURRENT_DATE - INTERVAL '30 days' AND CURRENT_DATE;

Testing Context: BETWEEN is excellent for isolating test runs by timestamp. If your test data includes creation timestamps, you can query only the data created during a specific test execution window.

Handling NULL Values

NULL represents the absence of a value – it's not zero, not an empty string, but unknown or missing data. NULL requires special handling in SQL because it doesn't equal anything, including itself.

This is wrong and will never return rows:

-- WRONG: This always returns zero rows
SELECT * FROM Orders
WHERE ShipmentDate = NULL;

Why? Because NULL = NULL evaluates to UNKNOWN (not TRUE), so the WHERE condition never passes. To check for NULL, use IS NULL or IS NOT NULL:

-- CORRECT: Find orders that haven't shipped yet
SELECT * FROM Orders
WHERE ShipmentDate IS NULL;

-- Find orders that have shipped
SELECT * FROM Orders
WHERE ShipmentDate IS NOT NULL;

Testing Context: NULL checks are critical for validation queries. After testing a feature that should populate a field, verify it's not NULL:

-- Verify all completed orders have a completion timestamp
SELECT OrderId, Status, CompletedDate
FROM Orders
WHERE Status = 'Completed'
  AND CompletedDate IS NULL;

If this query returns any rows, your application has a bug – completed orders are missing their completion timestamp.

Common Pitfall – NULL Comparisons

NULL behavior trips up even experienced developers. Remember these critical rules:

  • NULL = NULL is UNKNOWN (not TRUE) – use IS NULL instead
  • NULL != NULL is also UNKNOWN – use IS NOT NULL instead
  • Any arithmetic operation involving NULL results in NULL: 10 + NULL = NULL
  • String concatenation with NULL produces NULL: 'Hello' || NULL = NULL
  • Comparisons with NULL are always UNKNOWN: 5 > NULL is UNKNOWN, not false

When debugging "why isn't my WHERE clause working?", NULL values are often the culprit. Use IS NULL and IS NOT NULL explicitly rather than equality comparisons.

The WHERE clause transforms SELECT from a blunt instrument into a precision tool. By combining comparison operators, logical operators, pattern matching, and NULL handling, you can filter result sets to contain exactly the rows your test needs to validate. Next, we'll learn how to sort and limit these filtered results, giving you complete control over query output.