
Understand SQL Injection and Learn to Avoid It in Python with SQLAlchemy | by Lynn Kwong | Apr, 2023

Image by mohamed_hassan on Pixabay

SQL Injection is one of the most common and most dangerous web security vulnerabilities. It allows attackers to inject malicious SQL code into plain SQL queries that are built from unvalidated and unsanitized input. It is also an issue that new developers often overlook.

The cause of SQL Injection and its solution are actually pretty simple. In this post, we will explore SQL Injection with some simple queries and pretend to be an attacker exploiting our own database. By the end of this post, you should understand how SQL Injection works and, having seen its power and danger, how to avoid it.

Preparation

As usual, we will create a MySQL database using Docker:

# Create a volume to persist the data.
$ docker volume create mysql8-data

# Create the container for MySQL.
$ docker run --name mysql8 -d -e MYSQL_ROOT_PASSWORD=root -p 13306:3306 -v mysql8-data:/var/lib/mysql mysql:8

# Connect to the local MySQL server in Docker.
$ docker exec -it mysql8 mysql -u root -proot

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.31 |
+-----------+
1 row in set (0.00 sec)

Note that the root user is used for simplicity in this post, but it should never be used directly by web applications in practice.

Then let’s create a database and a table to play with. For simplicity, the data set is the same as the one used in the previous series of posts.

CREATE DATABASE `data`;

CREATE TABLE `data`.`student_scores` (
`student_id` smallint NOT NULL,
`subject` varchar(50) NOT NULL,
`score` tinyint DEFAULT '0',
PRIMARY KEY (`student_id`,`subject`),
KEY `ix_subject` (`subject`),
KEY `ix_score` (`score`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
;

INSERT INTO `data`.student_scores
(student_id, subject, score)
VALUES
(1, 'Literature', 90),
(1, 'Math', 60),
(2, 'Literature', 80),
(2, 'Math', 80),
(3, 'Literature', 70),
(3, 'Math', 95)
;

Since we will use SQLAlchemy in this post for the database connection, we need to install the necessary libraries. As usual, it’s recommended to create a separate virtual environment so that the libraries won’t impact the system or other virtual environments.

conda create --name sql python=3.11
conda activate sql

pip install -U "SQLAlchemy>=2.0.0,<2.1.0"
pip install -U "pymysql>=1.0.0,<1.1.0"
pip install -U "cryptography>=40.0.0,<40.1.0"

Explore SQL Injection

Let’s now create a simple function to read some data:

from sqlalchemy import create_engine, text

db_url = "mysql+pymysql://root:root@localhost:13306/data"
engine = create_engine(db_url, pool_size=5, pool_recycle=3600)
conn = engine.connect()

def read_student_score(student_id):
    sql_text = text(f"""
        SELECT subject, score
        FROM data.student_scores
        WHERE student_id = {student_id}
    """)
    result = list(conn.execute(sql_text))

    print(result)

The read_student_score() function looks perfectly normal from a coding point of view. However, it has a serious security issue that malicious users can exploit.

With ordinary input, it works as expected:

read_student_score(1)
# [('Literature', 90), ('Math', 60)]

However, malicious users can make it return data that it is not supposed to return. The first trick is to return all the records, even those that a user is not supposed to see:

read_student_score('-1 OR 1')
# [('Literature', 90), ('Math', 60), ('Literature', 80), ('Math', 80), ('Literature', 70), ('Math', 95)]

This is possible because the read_student_score() function neither sanitizes nor validates the input parameter and simply interpolates it into the query string. With the input '-1 OR 1', the WHERE clause becomes student_id = -1 OR 1, and since 1 is always true, every row matches.

This is not uncommon. Actually, I have seen quite a lot of legacy code written this way. The owners are lucky that they have not been hacked yet. Or maybe they have …

SQL Injection can be far more harmful than shown above: it can expose any data that the database user behind the connection is allowed to read.
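To make this concrete, it helps to print the SQL text that the f-string actually produces. The sketch below uses a hypothetical helper, build_query(), which is not part of the original function. It only mirrors the string formatting and never touches the database:

```python
def build_query(student_id):
    """Hypothetical helper: return the SQL text that the f-string
    in read_student_score() would send to the database."""
    return (
        "SELECT subject, score "
        "FROM data.student_scores "
        f"WHERE student_id = {student_id}"
    )


# A normal call produces the query the developer had in mind.
print(build_query(1))
# SELECT subject, score FROM data.student_scores WHERE student_id = 1

# The attacker's input becomes part of the SQL syntax itself.
print(build_query("-1 OR 1"))
# SELECT subject, score FROM data.student_scores WHERE student_id = -1 OR 1
```

The database has no way to tell which part of the final string was written by the developer and which part came from the user, so it simply executes the whole thing.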

Now, let’s pretend we are malicious users and try to get some information that is not supposed to be returned by this function.

The first thing a hacker wants to know is how many columns are returned. It is obvious in this example that two columns are returned. However, when the output is displayed with some user interface, it may not be that obvious.

There are many ways to guess how many columns are returned; two common ones are ORDER BY and UNION. Let’s see how they work:

read_student_score('-1 ORDER BY 1')
# []
read_student_score('-1 ORDER BY 2')
# []
read_student_score('-1 ORDER BY 3')
# OperationalError: (pymysql.err.OperationalError) (1054, "Unknown column '3' in 'order clause'")

From the above queries, the results, and the errors, we know that two columns are returned.

We can arrive at the same conclusion using UNION:

read_student_score('-1 UNION SELECT 1')
# OperationalError: (pymysql.err.OperationalError) (1222, 'The used SELECT statements have a different number of columns')
read_student_score('-1 UNION SELECT 1,2')
# [('1', 2)]

Using UNION, we are able to find the correct number of columns with fewer tests. In fact, UNION is one of the most commonly used tools for extracting data through SQL Injection.
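The trial-and-error above is easy to automate, which is part of why this kind of bug is so dangerous. The following is only a rough sketch under some assumptions: it uses an in-memory SQLite database from the standard library as a stand-in for the MySQL server, so the error class and messages differ from the pymysql ones shown above, and both vulnerable_read() and count_columns() are hypothetical helpers written for this illustration:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server used in the post.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE student_scores (student_id INTEGER, subject TEXT, score INTEGER)"
)
demo_conn.execute("INSERT INTO student_scores VALUES (1, 'Math', 60)")


def vulnerable_read(student_id):
    """Deliberately unsafe: interpolates the input into the SQL text."""
    sql = f"SELECT subject, score FROM student_scores WHERE student_id = {student_id}"
    return demo_conn.execute(sql).fetchall()


def count_columns(max_columns=10):
    """Hypothetical helper: grow the UNION SELECT list until it is accepted."""
    for n in range(1, max_columns + 1):
        select_list = ", ".join(str(i) for i in range(1, n + 1))
        try:
            vulnerable_read(f"-1 UNION SELECT {select_list}")
        except sqlite3.OperationalError:
            continue  # column counts differ, so try one more column
        return n
    return None


print(count_columns())
# 2
```

The loop stops as soon as the database accepts the UNION, which happens when the injected SELECT has the same number of columns as the original query.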

Let’s try to read something that’s not supposed to be returned normally:

read_student_score('-1 UNION SELECT DATABASE(), @@VERSION')
# [('data', '8.0.31')]

The database name and version are returned!

Let’s see something even scarier:

read_student_score('-1 UNION SELECT user, authentication_string FROM mysql.user')
# [('root', '$A$005$j\x1cZ\x1aj*t\x16_aI\t.\tk\x1a0b8,6nT16rTboTxEGJsq8R.xLN1dlygQWOe12XurOijG5v9'), ('mysql.infoschema', '$A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED'), ('mysql.session', '$A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED'), ('mysql.sys', '$A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED'), ('root', '$A$005$\x0c=\x10gE\x7f]g\x18WQNnB`Y&I1\x18zPIQ3wM3cj43wk4Qq4/Tt88B0ypKrwYLYnD3BpGqfY5')]

The usernames and authentication strings of all DB users are returned! Using brute-force tools, the attacker can crack the passwords in a short time, especially if simple passwords are used.

How to avoid SQL Injection?

Now that we have understood what SQL Injection is and how dangerous it can be, let’s see how it can be avoided in practice.

The most effective way to prevent SQL Injection is to use parametrized queries, which in SQLAlchemy is achieved with the :param_name syntax:

def read_student_score(student_id):
    sql_text = text("""
        SELECT subject, score
        FROM data.student_scores
        WHERE student_id = :student_id
    """)
    result = list(conn.execute(sql_text, parameters={"student_id": student_id}))

    print(result)

More details about running plain SQL queries in SQLAlchemy can be found in this post. However, note that SQLAlchemy 2.0 is used in this post and thus the syntax for specifying parameters will be different from that in SQLAlchemy 1.x (normally 1.4).

Let’s see what the evil queries will return with parametrized queries:

read_student_score('-1 OR 1')
# []
read_student_score('-1 UNION SELECT DATABASE(), @@VERSION')
# []
read_student_score('-1 UNION SELECT user, authentication_string FROM mysql.user')
# []

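All these malicious inputs now return an empty result, because each one is passed to the database as a single value rather than being spliced into the SQL text. This is much safer than before.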
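If you want to try the difference without a MySQL server, the same idea can be reproduced with the sqlite3 module from the standard library. This is a minimal sketch, not the setup used in this post: the in-memory SQLite database, the shortened data set, and the helpers read_unsafe(), read_safe() and parse_student_id() are all assumptions made for illustration. The last helper shows an optional extra check that rejects non-integer input before any query is run:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server used in the post.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE student_scores (student_id INTEGER, subject TEXT, score INTEGER)"
)
demo_conn.executemany(
    "INSERT INTO student_scores VALUES (?, ?, ?)",
    [(1, "Literature", 90), (1, "Math", 60), (2, "Literature", 80)],
)


def read_unsafe(student_id):
    """Vulnerable: the input is interpolated into the SQL text."""
    sql = f"SELECT subject, score FROM student_scores WHERE student_id = {student_id}"
    return demo_conn.execute(sql).fetchall()


def read_safe(student_id):
    """Parametrized: the input is bound as a value, never parsed as SQL."""
    sql = "SELECT subject, score FROM student_scores WHERE student_id = :student_id"
    return demo_conn.execute(sql, {"student_id": student_id}).fetchall()


def parse_student_id(raw):
    """Optional extra check: accept only input that is a valid integer."""
    return int(raw)  # raises ValueError for input such as '-1 OR 1'


print(len(read_unsafe("-1 OR 1")))  # every row leaks
# 3
print(read_safe("-1 OR 1"))  # the whole string is compared as one value
# []
print(read_safe(parse_student_id("1")))
# [('Literature', 90), ('Math', 60)]
```

Validation like parse_student_id() is a useful second line of defense, but it does not replace parametrized queries, which remain the main protection.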


