How can I prevent SQL injection in PHP?


If user input is inserted without modification into an SQL query, then the application becomes vulnerable to SQL injection, like in the following example:

$unsafe_variable = $_POST['user_input']; 

mysql_query("INSERT INTO `table` (`column`) VALUES ('$unsafe_variable')");

That's because the user can input something like value'); DROP TABLE table;--, and the query becomes:

INSERT INTO `table` (`column`) VALUES('value'); DROP TABLE table;--')

What can be done to prevent this from happening?



Answers


Use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.

You basically have two options to achieve this:

  1. Using PDO (for any supported database driver):

    $stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
    
    $stmt->execute(array('name' => $name));
    
    foreach ($stmt as $row) {
        // do something with $row
    }

  2. Using MySQLi (for MySQL):

    $stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = ?');
    $stmt->bind_param('s', $name);
    
    $stmt->execute();
    
    $result = $stmt->get_result();
    while ($row = $result->fetch_assoc()) {
        // do something with $row
    }

If you're connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (e.g. pg_prepare() and pg_execute() for PostgreSQL). PDO is the universal option.

Correctly setting up the connection

Note that when using PDO to access a MySQL database real prepared statements are not used by default. To fix this you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:

$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'pass');

$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

In the above example, the error mode isn't strictly necessary, but it is advisable to set it. With PDO::ERRMODE_EXCEPTION, PDO reports database errors by throwing PDOExceptions, which gives the developer the chance to catch and handle them instead of letting failures pass silently.

What is mandatory however is the first setAttribute() line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren't parsed by PHP before sending it to the MySQL server (giving a possible attacker no chance to inject malicious SQL).

Although you can set the charset in the options of the constructor, it's important to note that 'older' versions of PHP (< 5.3.6) silently ignored the charset parameter in the DSN.
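
As a rough sketch of what the exception error mode buys you, the snippet below triggers a deliberate failure and catches it. It assumes the pdo_sqlite extension is available; the in-memory SQLite database and the table name missing_table are stand-ins chosen for illustration, not part of the original setup.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the MySQL server
// so that the sketch runs without credentials.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$errorMessage = null;
try {
    // The table does not exist, so PDO throws instead of failing silently.
    $pdo->query('SELECT * FROM missing_table');
} catch (PDOException $e) {
    // Keep the details for the developer; do not echo them to the visitor.
    $errorMessage = $e->getMessage();
}
```

In a real application the catch block would typically log the message rather than just store it.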

Explanation

What happens is that the SQL statement you pass to prepare is parsed and compiled by the database server. By specifying parameters (either a ? or a named parameter like :name in the example above) you tell the database engine where the values you want to filter on will go. Then, when you call execute, the prepared statement is combined with the parameter values you specify.

The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn't intend. Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name variable contains 'Sarah'; DELETE FROM employees the result would simply be a search for the string "'Sarah'; DELETE FROM employees", and you will not end up with an empty table.
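
The behaviour described above can be checked with a small, self-contained sketch. It assumes the pdo_sqlite extension is available; the in-memory SQLite database stands in for MySQL, and the two sample rows are made up for illustration.

```php
<?php
// Assumption: in-memory SQLite replaces the real MySQL connection.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE employees (name TEXT)');
$pdo->exec("INSERT INTO employees (name) VALUES ('Sarah'), ('Tom')");

// The malicious input from the explanation above.
$name = "'Sarah'; DELETE FROM employees";

$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute(['name' => $name]);

// The whole input is compared as one string, so nothing matches...
$matches = $stmt->fetchAll(PDO::FETCH_ASSOC);

// ...and the DELETE never runs: both rows are still there.
$remaining = (int) $pdo->query('SELECT COUNT(*) FROM employees')->fetchColumn();
```

Here $matches ends up empty and $remaining is still 2, because the input was only ever treated as a value.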

Another benefit with using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.

Oh, and since you asked about how to do it for an insert, here's an example (using PDO):

$preparedStatement = $db->prepare('INSERT INTO table (column) VALUES (:column)');

$preparedStatement->execute(array('column' => $unsafeValue));
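
To illustrate the reuse benefit mentioned earlier, here is a hedged sketch that prepares one insert statement and executes it several times. It assumes the pdo_sqlite extension; the in-memory database, the employees table and the sample names are placeholders for illustration.

```php
<?php
// Assumption: in-memory SQLite stands in for the real database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE employees (name TEXT)');

// Prepared once...
$insert = $db->prepare('INSERT INTO employees (name) VALUES (:name)');

// ...executed many times with different values, quotes and all.
foreach (["O'Brien", 'Sarah', 'Tom'] as $unsafeValue) {
    $insert->execute(['name' => $unsafeValue]);
}

$count = (int) $db->query('SELECT COUNT(*) FROM employees')->fetchColumn();
```

After the loop $count is 3, and the apostrophe in O'Brien needed no special handling.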

Can prepared statements be used for dynamic queries?

While you can still use prepared statements for the query parameters, the structure of the dynamic query itself cannot be parametrized and certain query features cannot be parametrized.

For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.

// Value whitelist
// $dir can only be 'DESC' otherwise it will be 'ASC'
if (empty($dir) || $dir !== 'DESC') {
   $dir = 'ASC';
}
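
One possible way to package that check, shown only as a sketch: the helper name sortDirection and the employees query are hypothetical, and the logic is the same "DESC or else ASC" rule as above.

```php
<?php
// Hypothetical helper: returns 'DESC' only for an exact match,
// and falls back to 'ASC' for anything else (including null).
function sortDirection(?string $dir): string
{
    return $dir === 'DESC' ? 'DESC' : 'ASC';
}

// Whatever the visitor sends, only one of two fixed keywords
// can ever reach the query string.
$safe   = 'SELECT * FROM employees ORDER BY name ' . sortDirection('DESC');
$forced = 'SELECT * FROM employees ORDER BY name ' . sortDirection('DESC; DROP TABLE employees');
```

The second call falls back to ASC because the input is not exactly 'DESC'.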



Warning: The question's sample code uses PHP's mysql extension, which was deprecated in PHP 5.5.0 and removed entirely in PHP 7.0.0.

If you're using a recent version of PHP, the mysql_real_escape_string option outlined below will no longer be available (though mysqli::escape_string is a modern equivalent). These days the mysql_real_escape_string option would only make sense for legacy code on an old version of PHP.


You've got two options - escaping the special characters in your unsafe_variable, or using a parameterized query. Both would protect you from SQL injection. The parameterized query is considered the better practice but will require changing to a newer mysql extension in PHP before you can use it.

We'll cover the lower-impact option, string escaping, first.

//Connect

$unsafe_variable = $_POST["user-input"];
$safe_variable = mysql_real_escape_string($unsafe_variable);

mysql_query("INSERT INTO table (column) VALUES ('" . $safe_variable . "')");

//Disconnect

See also, the details of the mysql_real_escape_string function.

To use the parameterized query, you need to use MySQLi rather than the MySQL functions. To rewrite your example, we would need something like the following.

<?php
    $mysqli = new mysqli("server", "username", "password", "database_name");

    // TODO - Check that connection was successful.

    $unsafe_variable = $_POST["user-input"];

    $stmt = $mysqli->prepare("INSERT INTO table (column) VALUES (?)");

    // TODO check that $stmt creation succeeded

    // "s" means the database expects a string
    $stmt->bind_param("s", $unsafe_variable);

    $stmt->execute();

    $stmt->close();

    $mysqli->close();
?>

The key function you'll want to read up on there would be mysqli::prepare.

Also, as others have suggested, you may find it useful/easier to step up a layer of abstraction with something like PDO.

Please note that the case you asked about is a fairly simple one and that more complex cases may require more complex approaches. In particular:

  • If you want to alter the structure of the SQL based on user input, parameterized queries are not going to help, and the escaping required is not covered by mysql_real_escape_string. In this kind of case, you would be better off passing the user's input through a whitelist to ensure only 'safe' values are allowed through.
  • If you use integers from user input in a condition and take the mysql_real_escape_string approach, you will suffer from the problem described by Polynomial in the comments below. This case is trickier because integers would not be surrounded by quotes, so you could deal with it by validating that the user input contains only digits.
  • There are likely other cases I'm not aware of. You might find this is a useful resource on some of the more subtle problems you can encounter.
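
For the integer case in the list above, a digits-only check might look like the following sketch. The function name parseId is hypothetical, and the sketch assumes the ctype extension (enabled by default in most PHP builds).

```php
<?php
// Hypothetical helper: accepts only non-empty strings made of the
// digits 0-9 and returns them as an int; anything else yields null.
function parseId(string $input): ?int
{
    if (!ctype_digit($input)) {
        return null;
    }
    // Note: a digit string too large for an int is not handled here.
    return (int) $input;
}

$ok  = parseId('42');        // 42
$bad = parseId('1 OR 1=1');  // null, so the query should not be run
```

A prepared statement with a bound parameter avoids the problem altogether; this check is only relevant if you stay with the escaping approach.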



Every answer here covers only part of the problem.
In fact, there are four different query parts which we can add to a query dynamically:

  • a string
  • a number
  • an identifier
  • a syntax keyword.

and prepared statements cover only two of them.

But sometimes we have to make our query even more dynamic, adding operators or identifiers as well.
So, we will need different protection techniques.

In general, such a protection approach is based on whitelisting. In this case, every dynamic parameter should be hardcoded in your script and chosen from that set.
For example, to do dynamic ordering:

$orders  = array("name","price","qty"); //field names
$key     = array_search($_GET['sort'], $orders); // see if we have such a name
$orderby = $orders[$key]; //if not, first one will be set automatically. smart enuf :)
$query   = "SELECT * FROM `table` ORDER BY $orderby"; //value is safe

However, there is another way to secure identifiers - escaping. As long as you have an identifier quoted, you can escape backticks inside by doubling them.
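
A minimal sketch of that escaping rule for MySQL-style backtick identifiers; the function name quoteIdentifier is hypothetical and not part of any PHP extension.

```php
<?php
// Hypothetical helper: wraps an identifier in backticks and doubles
// any backtick inside it, so the value cannot break out of the quotes.
function quoteIdentifier(string $identifier): string
{
    return '`' . str_replace('`', '``', $identifier) . '`';
}

$plain  = quoteIdentifier('price');   // `price`
$tricky = quoteIdentifier('na`me');   // `na``me`

$query = 'SELECT * FROM `table` ORDER BY ' . $plain;
```

Whitelisting is still the stricter option, since escaping only guarantees valid syntax, not that the column is one you meant to expose.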

As a further step, we can borrow a truly brilliant idea of using some placeholder (a proxy to represent the actual value in the query) from the prepared statements and invent a placeholder of another type - an identifier placeholder.

So, to make a long story short: it is the placeholder, not the prepared statement, that can be considered a silver bullet.

So, a general recommendation may be phrased as
As long as you are adding dynamic parts to the query using placeholders (and these placeholders properly processed of course), you can be sure that your query is safe.

Still, there is an issue with SQL syntax keywords (such as AND, DESC and such) but white-listing seems the only approach in this case.

Update

Although there is general agreement on the best practices regarding SQL injection protection, there are still many bad practices as well, and some of them are too deeply rooted in the minds of PHP users. For instance, on this very page there are (although invisible to most visitors) more than 80 deleted answers, all removed by the community due to bad quality or for promoting bad and outdated practices. Worse yet, some of the bad answers aren't deleted but are instead prospering.

For example, there are still many answers, including the second most upvoted answer, suggesting manual string escaping, an outdated approach that is proven to be insecure.

Or there is a slightly better answer that suggests just another method of string formatting and even boasts of it as the ultimate panacea. Of course, it is not. This method is no better than regular string formatting, yet it keeps all its drawbacks: it is applicable to strings only and, like any other manual formatting, it is essentially an optional, non-obligatory measure, prone to human error of any sort.

I think all this is because of one very old superstition, supported by such authorities as OWASP or the PHP manual, which proclaims equality between whatever "escaping" and protection from SQL injections.

Regardless of what the PHP manual said for ages, *_escape_string by no means makes data safe and was never intended to. Besides being useless for any SQL part other than a string, manual escaping is wrong because it is manual, as opposed to automated.

And OWASP makes it even worse, stressing escaping user input, which is utter nonsense: there should be no such words in the context of injection protection. Every variable is potentially dangerous - no matter the source! Or, in other words - every variable has to be properly formatted to be put into a query - no matter the source again. It's the destination that matters. The moment a developer starts to separate the sheep from the goats (thinking whether some particular variable is "safe" or not) he takes his first step towards disaster. Not to mention that even the wording suggests bulk escaping at the entry point, resembling the very magic quotes feature - already despised, deprecated and removed.

So, unlike whatever "escaping", prepared statements are the measure that indeed protects from SQL injection (when applicable).

If you're still not convinced, here is a step-by-step explanation I wrote, The Hitchhiker's Guide to SQL Injection prevention, where I explained all these matters in detail and even compiled a section entirely dedicated to bad practices and their disclosure.




Escaping single quote in PHP when inserting into MySQL

You should be escaping each of these strings (in both snippets) with mysql_real_escape_string().

http://us3.php.net/mysql-real-escape-string

The reason your two queries are behaving differently is likely because you have magic_quotes_gpc turned on (which you should know is a bad idea). This means that strings gathered from $_GET, $_POST and $_COOKIE are escaped for you (i.e., "O'Brien" -> "O\'Brien").

Once you store the data, and subsequently retrieve it again, the string you get back from the database will not be automatically escaped for you. You'll get back "O'Brien". So, you will need to pass it through mysql_real_escape_string().




For anyone finding this solution in 2015 and moving forward...

The mysql_real_escape_string() function is deprecated as of PHP 5.5.0.

See: php.net

Warning

This extension is deprecated as of PHP 5.5.0, and will be removed in the future. Instead, the MySQLi or PDO_MySQL extension should be used. See also MySQL: choosing an API guide and related FAQ for more information. Alternatives to this function include:

mysqli_real_escape_string()

PDO::quote()




You should do something like this to help you debug

$sql = "insert into blah blah....";
echo $sql;

You will probably find that the single quote is escaped with a backslash in the working query. This might have been done automatically by PHP via the magic_quotes_gpc setting, or maybe you did it yourself in some other part of the code (addslashes and stripslashes might be functions to look for).

See http://php.net/manual/en/security.magicquotes.php




How to deal with Apostrophe while writing into Mysql database

The process of encoding data which contains characters MySQL might interpret is called "escaping". You must escape your strings with mysql_real_escape_string, which is a PHP function, not a MySQL function, meaning you have to run it in PHP before you pass your query to the database. You must escape any data that comes into your program from an external source. Any data that isn't escaped is a potential SQL injection.

You have to escape your data before you build your query. Also, you can build your query programmatically using PHP's looping constructs and range:

// Build tag fields    
$tags = 'tag' . implode(', tag', range(1,30));

// Escape each value in the uniqkey array
$values = array_map('mysql_real_escape_string', $uniqkey);

// implode values with quotes and commas
$values = "'" . implode("', '", $values) . "'";

$query = "INSERT INTO alltags (id, $tags) VALUES ('', $values)";    

mysql_query($query) or die(mysql_error());
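
Since the mysql extension is no longer available in current PHP, the same 30-column insert can be sketched with PDO and generated placeholders. This is an illustration under assumptions: it requires the pdo_sqlite extension, uses an in-memory SQLite database in place of MySQL, and fills $uniqkey with made-up sample values.

```php
<?php
// Assumption: in-memory SQLite stands in for the MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Column names tag1..tag30 are generated by the script, never by the user.
$columns = [];
foreach (range(1, 30) as $i) {
    $columns[] = 'tag' . $i;
}
$pdo->exec(
    'CREATE TABLE alltags (id INTEGER PRIMARY KEY, '
    . implode(' TEXT, ', $columns) . ' TEXT)'
);

// Made-up sample data: 30 values, each containing an apostrophe.
$uniqkey = array_fill(0, 30, "O'Brien");

// One "?" per column; id is left out so the database assigns it.
$placeholders = implode(', ', array_fill(0, count($columns), '?'));
$sql = 'INSERT INTO alltags (' . implode(', ', $columns) . ") VALUES ($placeholders)";

$stmt = $pdo->prepare($sql);
$stmt->execute($uniqkey);

$stored = $pdo->query('SELECT tag1 FROM alltags')->fetchColumn();
```

The value comes back as O'Brien with no escaping step, because the data never becomes part of the SQL text.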



Using mysql_real_escape_string is a safer approach to handling characters for SQL insertion/updating:

$query = "INSERT INTO YOUR_TABLE VALUES ('"
    . mysql_real_escape_string($var1) . "', '"
    . mysql_real_escape_string($var2) . "')";

Also, I'd change your columns back from TEXT to VARCHAR - searching, besides indexing, works much better.

Update for your update

Being that id is an auto_increment column you can:

  • leave it out of the list of columns, so you don't have to provide a value in the VALUES clause:

    INSERT INTO alltags
      (tag1,tag2,tag3,tag4,tag5,tag6,tag7,tag8,tag9,tag10,tag11,tag12,tag13,tag14,tag15,tag16,tag17,tag18,tag19,tag20,tag21,tag22,tag23,tag24,tag25,tag26,tag27,tag28,tag29,tag30)
    VALUES      
      (mysql_real_escape_string($uniqkey[0]),mysql_real_escape_string($uniqkey[1]),mysql_real_escape_string($uniqkey[2]),mysql_real_escape_string($uniqkey[3]),mysql_real_escape_string($uniqkey[4]),mysql_real_escape_string($uniqkey[5]),mysql_real_escape_string($uniqkey[6]),mysql_real_escape_string($uniqkey[7]),mysql_real_escape_string($uniqkey[8]),mysql_real_escape_string($uniqkey[9]),mysql_real_escape_string($uniqkey[10]),mysql_real_escape_string($uniqkey[11]),mysql_real_escape_string($uniqkey[12]),mysql_real_escape_string($uniqkey[13]),mysql_real_escape_string($uniqkey[14]),mysql_real_escape_string($uniqkey[15]),mysql_real_escape_string($uniqkey[16]),mysql_real_escape_string($uniqkey[17]),mysql_real_escape_string($uniqkey[18]),mysql_real_escape_string($uniqkey[19]),mysql_real_escape_string($uniqkey[20]),mysql_real_escape_string($uniqkey[21]),mysql_real_escape_string($uniqkey[22]),mysql_real_escape_string($uniqkey[23]),mysql_real_escape_string($uniqkey[24]),mysql_real_escape_string($uniqkey[25]),mysql_real_escape_string($uniqkey[26]),mysql_real_escape_string($uniqkey[27]),mysql_real_escape_string($uniqkey[28]),mysql_real_escape_string($uniqkey[29])) "; 
  • include id in the list of columns, which requires you use either value in its place in the VALUES clause:

    • NULL
    • DEFAULT

Here's an example using NULL as the id placeholder:

INSERT INTO alltags
  (id,tag1,tag2,tag3,tag4,tag5,tag6,tag7,tag8,tag9,tag10,tag11,tag12,tag13,tag14,tag15,tag16,tag17,tag18,tag19,tag20,tag21,tag22,tag23,tag24,tag25,tag26,tag27,tag28,tag29,tag30)
 VALUES      
  (NULL,mysql_real_escape_string($uniqkey[0]),mysql_real_escape_string($uniqkey[1]),mysql_real_escape_string($uniqkey[2]),mysql_real_escape_string($uniqkey[3]),mysql_real_escape_string($uniqkey[4]),mysql_real_escape_string($uniqkey[5]),mysql_real_escape_string($uniqkey[6]),mysql_real_escape_string($uniqkey[7]),mysql_real_escape_string($uniqkey[8]),mysql_real_escape_string($uniqkey[9]),mysql_real_escape_string($uniqkey[10]),mysql_real_escape_string($uniqkey[11]),mysql_real_escape_string($uniqkey[12]),mysql_real_escape_string($uniqkey[13]),mysql_real_escape_string($uniqkey[14]),mysql_real_escape_string($uniqkey[15]),mysql_real_escape_string($uniqkey[16]),mysql_real_escape_string($uniqkey[17]),mysql_real_escape_string($uniqkey[18]),mysql_real_escape_string($uniqkey[19]),mysql_real_escape_string($uniqkey[20]),mysql_real_escape_string($uniqkey[21]),mysql_real_escape_string($uniqkey[22]),mysql_real_escape_string($uniqkey[23]),mysql_real_escape_string($uniqkey[24]),mysql_real_escape_string($uniqkey[25]),mysql_real_escape_string($uniqkey[26]),mysql_real_escape_string($uniqkey[27]),mysql_real_escape_string($uniqkey[28]),mysql_real_escape_string($uniqkey[29])) "; 

I really want to stress that you should not set up your columns like that.




Slight improvement of meagar's answer:

EDIT: meagar updated his post, so his answer is now better.

$query = 'INSERT INTO alltags (id, ';

// append tag1, tag2, etc.
$query .= 'tag' . implode(', tag', range(1, 30)) . ") VALUES ('', ";

// escape each value in the uniqkey array
$escaped_tags = array_map('mysql_real_escape_string', $uniqkey);

// implode values with quotes and commas, and add closing bracket
$query .= "'" . implode("', '", $escaped_tags) . "')";

// actually query
mysql_query($query) or die(mysql_error());



What is the equivalent of real_escape_string() for PDO?

You should use PDO Prepare

From the link:

Calling PDO::prepare() and PDOStatement::execute() for statements that will be issued multiple times with different parameter values optimizes the performance of your application by allowing the driver to negotiate client and/or server side caching of the query plan and meta information, and helps to prevent SQL injection attacks by eliminating the need to manually quote the parameters.




PDO offers an alternative designed to replace mysql_escape_string() with the PDO::quote() method.

Here is an excerpt from the PHP website:

<?php
    $conn = new PDO('sqlite:/home/lynn/music.sql3');

    /* Simple string */
    $string = 'Nice';
    print "Unquoted string: $string\n";
    print "Quoted string: " . $conn->quote($string) . "\n";
?>

The above code will output:

Unquoted string: Nice
Quoted string: 'Nice'



Use prepared statements. Those keep the data and syntax apart, which removes the need for escaping MySQL data. See e.g. this tutorial.