This lesson introduces fundamental database concepts and basic SQL (Structured Query Language) for data retrieval and manipulation.
A database is an organized collection of structured information, typically stored electronically in a computer system. Databases are designed to efficiently store, retrieve, and manage data for various applications.
Relational Databases:
NoSQL Databases:
Understanding the fundamental building blocks of relational databases is essential for effective data management and analysis.
Definition: A table is a collection of related data organized in rows and columns, similar to a spreadsheet.
Table Structure:
+------------+-----------+----------+------------+
| employee_id| name | department| salary |
+------------+-----------+----------+------------+
| 101 | John Smith| IT | 75000 |
| 102 | Mary Jones | HR | 65000 |
| 103 | Bob Brown | Sales | 70000 |
+------------+-----------+----------+------------+
Table Components:
Definition: A record (or row) represents a single entry in a table, containing values for each column.
Record Characteristics:
Example Record:
| 101 | John Smith | IT | 75000 | 2023-01-15 | active |
Definition: A field (or column) represents a specific attribute or piece of information about the records in a table.
Field Properties:
Common Data Types:
Numeric Types:
String Types:
Date/Time Types:
Keys are essential for establishing relationships between tables and ensuring data integrity in relational databases.
Definition: A primary key is a unique identifier for each record in a table. It ensures that each row can be uniquely identified.
Primary Key Characteristics:
Types of Primary Keys:
Surrogate Keys:
employee_id INT AUTO_INCREMENT PRIMARY KEYNatural Keys:
Composite Primary Keys:
(order_id, product_id) in order details tablePrimary Key Examples:
-- Single column primary key
CREATE TABLE employees (
employee_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50)
);
-- Composite primary key
CREATE TABLE order_details (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
Definition: A foreign key is a field in one table that uniquely identifies a row in another table. It establishes a relationship between tables.
Foreign Key Characteristics:
Foreign Key Relationships:
One-to-Many Relationship:
Departments Table:
+-------------+-----------+
| department_id| name |
+-------------+-----------+
| 1 | IT |
| 2 | HR |
+-------------+-----------+
Employees Table:
+-------------+-----------+---------------+
| employee_id | name | department_id |
+-------------+-----------+---------------+
| 101 | John | 1 |
| 102 | Mary | 2 |
| 103 | Bob | 1 |
+-------------+-----------+---------------+
Foreign Key Example:
CREATE TABLE employees (
employee_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
Foreign Key Constraints:
| Relationship Type | Description | Example |
|---|---|---|
| One-to-One | One record in table A relates to one record in table B | Person ↔ Passport |
| One-to-Many | One record in table A relates to many records in table B | Department ↔ Employees |
| Many-to-Many | Many records in table A relate to many records in table B | Students ↔ Courses |
SQL (Structured Query Language) is the standard language for managing and manipulating data in relational databases.
Definition: The SELECT statement retrieves data from one or more tables.
Basic Syntax:
SELECT column1, column2, ...
FROM table_name;
SELECT Examples:
Select All Columns:
SELECT * FROM employees;
Select Specific Columns:
SELECT employee_id, name, department FROM employees;
Select with Calculations:
SELECT
name,
salary,
salary * 1.1 AS salary_with_bonus
FROM employees;
Select with Concatenation:
SELECT
first_name,
last_name,
CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;
Select with Conditional Logic:
SELECT
name,
salary,
CASE
WHEN salary > 80000 THEN 'High'
WHEN salary > 60000 THEN 'Medium'
ELSE 'Low'
END AS salary_category
FROM employees;
Definition: The WHERE clause filters records based on specified conditions.
Basic Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition;
WHERE Operators:
Comparison Operators:
=: Equal to!= or <>: Not equal to>: Greater than<: Less than>=: Greater than or equal to<=: Less than or equal toLogical Operators:
AND: All conditions must be trueOR: At least one condition must be trueNOT: Negates the conditionBETWEEN: Within a rangeIN: In a list of valuesLIKE: Pattern matchingIS NULL: Check for NULL valuesWHERE Examples:
Single Condition:
SELECT * FROM employees WHERE department = 'IT';
Multiple Conditions with AND:
SELECT * FROM employees
WHERE department = 'IT' AND salary > 70000;
Multiple Conditions with OR:
SELECT * FROM employees
WHERE department = 'IT' OR department = 'HR';
BETWEEN Operator:
SELECT * FROM employees
WHERE salary BETWEEN 60000 AND 80000;
IN Operator:
SELECT * FROM employees
WHERE department IN ('IT', 'HR', 'Sales');
LIKE Operator for Pattern Matching:
-- Starts with 'J'
SELECT * FROM employees WHERE name LIKE 'J%';
-- Ends with 'n'
SELECT * FROM employees WHERE name LIKE '%n';
-- Contains 'oh'
SELECT * FROM employees WHERE name LIKE '%oh%';
-- Second character is 'o'
SELECT * FROM employees WHERE name LIKE '_o%';
NULL Checks:
SELECT * FROM employees WHERE manager_id IS NULL;
SELECT * FROM employees WHERE manager_id IS NOT NULL;
Complex Conditions:
SELECT * FROM employees
WHERE (department = 'IT' OR department = 'HR')
AND salary > 65000
AND hire_date >= '2023-01-01';
Definition: The ORDER BY clause sorts the result set based on specified columns.
Basic Syntax:
SELECT column1, column2, ...
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
Sorting Options:
ORDER BY Examples:
Single Column Sorting:
SELECT name, salary FROM employees ORDER BY salary;
-- Same as ORDER BY salary ASC
Descending Order:
SELECT name, salary FROM employees ORDER BY salary DESC;
Multiple Column Sorting:
SELECT name, department, salary
FROM employees
ORDER BY department ASC, salary DESC;
Sorting by Expression:
SELECT name, salary
FROM employees
ORDER BY salary * 1.1 DESC;
Sorting by Column Position:
SELECT name, department, salary
FROM employees
ORDER BY 2, 3 DESC; -- Order by department, then salary descending
Advanced filtering and sorting techniques help you retrieve exactly the data you need in the desired format.
Complete Query Structure:
SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
Practical Examples:
Example 1: Find High-Earning IT Employees
SELECT
employee_id,
name,
salary,
hire_date
FROM employees
WHERE department = 'IT'
AND salary > 75000
AND hire_date >= '2022-01-01'
ORDER BY salary DESC, hire_date ASC;
Example 2: Find Recent Hires in Specific Departments
SELECT
name,
department,
salary,
hire_date
FROM employees
WHERE department IN ('IT', 'HR', 'Sales')
AND hire_date BETWEEN '2023-01-01' AND '2023-12-31'
ORDER BY hire_date DESC;
Example 3: Find Employees with Specific Name Patterns
SELECT
name,
department,
salary
FROM employees
WHERE name LIKE '%Smith%'
OR name LIKE '%Johnson%'
ORDER BY department ASC, salary DESC;
LIMIT Clause (MySQL/PostgreSQL):
-- Get top 5 highest paid employees
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
Pagination:
-- Get page 2 of results (10 per page)
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 10 OFFSET 10;
Let's work with a simple company database:
-- Departments table
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(50) NOT NULL,
manager_id INT
);
-- Employees table
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
department_id INT,
salary DECIMAL(10,2),
hire_date DATE,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
Business Question 1: Find all IT employees earning over $70,000
SELECT
CONCAT(first_name, ' ', last_name) AS full_name,
email,
salary
FROM employees
WHERE department_id = (
SELECT department_id
FROM departments
WHERE department_name = 'IT'
)
AND salary > 70000
ORDER BY salary DESC;
Business Question 2: Find employees hired in the last 6 months
SELECT
CONCAT(first_name, ' ', last_name) AS full_name,
department_name,
hire_date,
DATEDIFF(CURRENT_DATE, hire_date) AS days_employed
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE hire_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH)
ORDER BY hire_date DESC;
Use Specific Columns:
-- Good: Select only needed columns
SELECT employee_id, name, salary FROM employees;
-- Avoid: Select all columns when not needed
SELECT * FROM employees;
Use Appropriate Indexes:
-- Create indexes on frequently filtered columns
CREATE INDEX idx_employee_department ON employees(department_id);
CREATE INDEX idx_employee_salary ON employees(salary);
Use Consistent Formatting:
SELECT
employee_id,
first_name,
last_name,
department,
salary
FROM employees
WHERE department = 'IT'
AND salary > 70000
ORDER BY last_name, first_name;
Use Meaningful Aliases:
SELECT
e.first_name,
e.last_name,
d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;
Missing Commas:
-- Error: Missing comma
SELECT first_name last_name FROM employees;
-- Correct: Include comma
SELECT first_name, last_name FROM employees;
Incorrect Quotes:
-- Error: Single quotes for column names
SELECT 'first_name' FROM employees;
-- Correct: No quotes for column names
SELECT first_name FROM employees;
-- Correct: Single quotes for string literals
SELECT * FROM employees WHERE department = 'IT';
NULL Comparisons:
-- Error: Won't work with NULL values
WHERE manager_id = NULL;
-- Correct: Use IS NULL
WHERE manager_id IS NULL;
In the next lesson, we'll explore joins and data relationships to learn how to combine data from multiple tables effectively.