Connect to the database file (it will be created if it doesn‘t exist)
As a Python developer, being able to work with databases is an essential skill. Databases allow you to store, organize, and retrieve data for your applications. While there are several types of databases, relational databases that use Structured Query Language (SQL) are very common.
In this guide, we‘ll go in-depth on how to create tables and manipulate data in a SQL database, all from Python code. We‘ll use SQLite in our examples, which is a lightweight embedded database, but the same principles apply to other databases like MySQL and PostgreSQL.
Connecting to a SQL Database from Python
To work with a SQL database from Python, you first need to establish a connection to it. Python‘s standard library includes the sqlite3
module for working with SQLite databases. For other databases, you‘ll typically install a 3rd-party module, like mysql-connector-python
for MySQL.
Here‘s how to connect to a SQLite database file using Python:
import sqlite3con = sqlite3.connect(‘example.db‘)
cur = con.cursor()
This code will open a connection to a SQLite database file named "example.db" in the current directory. If the file doesn‘t exist, it will be created. The cursor()
method returns a Cursor object that lets us execute SQL statements and fetch results.
Creating a New Table
With a database connection established, we‘re ready to start working with tables. If you‘re starting with an empty database, the first thing you‘ll need to do is create a table to store your data.
We can create a new table with the CREATE TABLE
SQL statement. Here‘s an example:
# Create a table named "users" cur.execute(‘‘‘ CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL UNIQUE, age INTEGER ) ‘‘‘)
This SQL statement creates a table named "users" with four columns:
id
– an integer that uniquely identifies each row and automatically incrementsname
– the user‘s name, stored as text, required for each rowemail
– the user‘s email address, which must be uniqueage
– the user‘s age (optional)
The CREATE TABLE
statement defines the table‘s schema – the names and data types of each column, along with any constraints. You can use various data types like TEXT
, INTEGER
, REAL
(floating-point number), and BLOB
(binary data).
Some useful constraints to know are:
PRIMARY KEY
– specifies that a column uniquely identifies the rowNOT NULL
– requires a value to be provided for the columnUNIQUE
– enforces that all values in the column are different
Inserting Data into a Table
Once you have a table created, you can insert data into it using the INSERT
statement. It lets you add one or more new rows to a table.
Here‘s how to INSERT
a single row into the "users" table:
cur.execute(‘‘‘ INSERT INTO users (name, email, age) VALUES (‘John Doe‘, ‘[email protected]‘, 30) ‘‘‘)
You provide the table name, list the columns you‘re providing values for, and then include a matching number of values. The values are bound to the query using SQLite‘s parameterized statement feature, which escapes special characters to prevent SQL injection attacks.
To insert multiple rows at once, you can use executemany():
users = [ (‘Jane Doe‘, ‘[email protected]‘, 35), (‘Bob Smith‘, ‘[email protected]‘, 48), (‘Alice Jones‘, ‘[email protected]‘, 27) ]cur.executemany(‘INSERT INTO users (name, email, age) VALUES (?, ?, ?)‘, users)
executemany()
lets you run the same SQL statement repeatedly with different values. Use placeholders like ?
in the query, and pass a list of tuples containing the values.
After executing INSERT
statements, make sure to call commit()
on the Connection object to save the changes permanently:
con.commit()
Querying Data from a Table
To retrieve data from a table, we use the versatile SELECT
statement. It allows querying all or parts of a table with various criteria.
Here are some examples of SELECT
queries on the "users" table:
# Retrieve all columns for all rows cur.execute(‘SELECT * FROM users‘)cur.execute(‘SELECT name, email FROM users WHERE age > 30‘)
cur.execute(‘SELECT * FROM users ORDER BY age DESC‘)
The first query selects all columns (*
) from the "users" table. The second retrieves only the "name" and "email" columns for users over age 30. The last query selects all columns but sorts the results by the "age" column in descending order.
After executing a SELECT
query, you can retrieve the matching rows using methods on the Cursor object:
fetchone()
– retrieve the next row of a query result, returning a tuple or Nonefetchmany(size=cursor.arraysize)
– retrieve the next set of rows, returning a list of tuplesfetchall()
– retrieve all remaining rows of a query result, returning a list of tuples
Here‘s an example of processing each row returned by a query:
cur.execute(‘SELECT name, email, age FROM users ORDER BY age‘)for row in cur: print(f‘Name: {row[0]}, Email: {row[1]}, Age: {row[2]}‘)
This code executes a SELECT
query and then loops over the Cursor object itself, retrieving one row at a time. The row is represented as a tuple of values corresponding to the queried columns.
Updating Table Data
To modify existing data in a table, use the UPDATE
statement. It allows changing column values for one or more rows matching specified criteria.
Here‘s an example of updating a row in the "users" table:
cur.execute(‘‘‘ UPDATE users SET age = 31, email = ‘[email protected]‘ WHERE id = 1 ‘‘‘) con.commit()
This UPDATE
statement modifies the "age" and "email" columns for the row where "id" equals 1. The WHERE
clause specifies the condition for matching rows to update.
Be careful when constructing UPDATE
statements. If you omit the WHERE
clause, all rows will be updated!
After executing an UPDATE
, call commit()
to make the changes permanent.
Deleting Table Data
To remove rows from a table, use the DELETE
statement. Like UPDATE
, it can affect one or more rows matching a condition.
cur.execute(‘DELETE FROM users WHERE id = 2‘) con.commit()
This statement deletes the row from "users" whose "id" equals 2. Again, be cautious when writing DELETE
statements. Accidentally omitting the WHERE
clause will remove all rows from the table!
Remember to call commit()
after a DELETE
to persist the changes.
Conclusion
In this guide, we covered the basics of using Python to create tables and manipulate data in a SQL database. With the sqlite3
module and SQL statements, you can:
- Create a new table specifying its columns and constraints
- Insert new rows of data into a table
- Query a table to retrieve rows matching certain criteria
- Update column values for existing table rows
- Delete rows from a table
The same techniques work with other relational databases too, like MySQL and PostgreSQL.
Once you‘re comfortable with these fundamentals, you can dive into more advanced SQL topics like:
- Joining data from multiple related tables
- Grouping and aggregating data for analytical queries
- Creating indexes to optimize query performance
Learning to combine SQL and Python effectively is a powerful skill that will enable you to build data-driven applications. You can create Python programs that query databases, process the results, and then update the data – a common pattern for ETL (extract, transform, load) pipelines and other data flows.
As a next step, I recommend exploring Python‘s rich ecosystem of database tools. SQLAlchemy is a popular Python library that provides an object-relational mapping (ORM) layer over many different databases. ORMs let you work with databases using familiar Python classes and objects.
Integrating a database will let you build more capable, data-driven Python applications. With practice, you‘ll be writing complex queries in no time!