Programming Tips Grab Bag No. 3

Apr 21, 2009

It's been quite a while since my last programming tips grab bag article, and it's high time for another. As promised, I'm discussing PHP this time around. Although simple, each of these tips is geared towards writing cleaner code, which is always a good thing.

1. Use Helper Functions to Get Incoming Data

Data is typically passed to a given web page through either GET or POST requests. To make things easy, PHP gives us a superglobal array for each of these request types: $_GET and $_POST, respectively. I prefer to use helper functions to poke around in these superglobal arrays; it results in cleaner-looking code. Here are the helper functions I typically use:

// Helper function for getting $_GET data
function getGet($key)
{
    if(isset($_GET[$key]))
    {
        if(is_array($_GET[$key]))
            return $_GET[$key];
        else
            return (trim($_GET[$key]));
    }
    else
        return null;
}

// Helper function for getting $_POST data
function getPost($key)
{
    if(isset($_POST[$key]))
    {
        if(is_array($_POST[$key]))
            return $_POST[$key];
        else
            return (trim($_POST[$key]));
    }
    else
        return null;
}

Calling these functions is super simple:

$someValue = getGet('some_value');

If the some_value parameter is set, the variable will get the appropriate value. If it's not set, the variable gets assigned null. So all that's needed after calling getGet or getPost is a test to make sure the variable is non-null:

if(! is_null($someValue))
{
    // ... do something
}

Note that these functions also handle the case where the incoming data may be an array (useful when processing lots of similar data fields at once). If the data is simply a scalar value, I run it through the trim function to make sure there's no stray whitespace on either side of the incoming value.
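
As a minimal sketch of the three cases described above (scalar, array, and missing), the following fills $_GET by hand to stand in for a real request. The parameter names name, tags, and missing are made up for the example, and getGet is repeated only so the snippet runs on its own:

```php
<?php
// Helper from above, repeated so the sketch is self-contained.
function getGet($key)
{
    if(isset($_GET[$key]))
    {
        if(is_array($_GET[$key]))
            return $_GET[$key];
        else
            return trim($_GET[$key]);
    }
    else
        return null;
}

// Simulate a request such as page.php?name=%20Alice%20&tags[]=php&tags[]=tips
// (hypothetical parameter names, assigned by hand for illustration).
$_GET = array(
    'name' => '  Alice  ',
    'tags' => array('php', 'tips'),
);

$name    = getGet('name');    // scalar: returned with surrounding whitespace trimmed
$tags    = getGet('tags');    // array: returned as-is, its elements are not trimmed
$missing = getGet('missing'); // not present in the request: null
```

Note that only scalar values are trimmed; if the elements of an array need trimming, that has to happen after the call.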

2. Write Your Own SQL Sanitizer

The first and most important rule when accepting data from a user is: never trust the user, even if that user is you! When incoming data is going to be put into a database, you need to sanitize the input to avoid SQL injection attacks. As with the superglobal arrays above, I like using a helper function for this task:

function dbSafe($string)
{
    global $db; // MySQLi extension instance
    return "'" . $db->escape_string($string) . "'";
}

In this example, I'm making use of the MySQLi extension. The $db variable is an instance of the mysqli class, which gets created in another file. Here's an example of creating that instance, minus all the error checking (which you should do); the constants used as parameters should be self-explanatory, and are defined elsewhere in my code:

$db = new mysqli(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME);

Back to our dbSafe function: all it does is build a string value consisting of a single quote, followed by the escaped version of the incoming data, followed by another single quote. Let's assume that my test data is the following:

$string = dbSafe("Isn't this the greatest?");

The resulting value of $string becomes 'Isn\'t this the greatest?'. Nice and clean for insertion into a database! Again, this helper makes writing code faster and cleaner.
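
Because the helper adds the surrounding quotes itself, the query text must not add its own. Here is a rough sketch of how such a value might be dropped into a query. It is an illustration under assumptions, not the code used on this site: dbSafeWith is a variant that takes the connection as an argument instead of using global, and buildUserQuery, the users table, and the name column are hypothetical.

```php
<?php
// Variant of dbSafe that receives the connection as an argument
// (an assumption made so the sketch does not rely on a global).
// $db is expected to be a connected mysqli instance.
function dbSafeWith($db, $string)
{
    return "'" . $db->escape_string($string) . "'";
}

// Hypothetical query builder; the users table and name column are illustrative only.
function buildUserQuery($db, $name)
{
    // No quotes are written around the value here: dbSafeWith supplies them.
    return "SELECT * FROM users WHERE name = " . dbSafeWith($db, $name);
}
```

Passing the connection in explicitly is only one option; the global-based version above behaves the same way as long as $db has been created first.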

3. Make a Simple Output Sanitizer

If you work with an application that displays user-generated content (and after all, isn't that what PHP is for?), you have to deal with cross-site scripting (XSS) attacks as well. All such data that is to be rendered to the screen must be sanitized. The htmlentities and htmlspecialchars functions provide us with the capability to encode HTML entities, thus making our output safe. I prefer using the latter, since it's a little safer when working with UTF-8 encoded data (see my article Unicode and the Web: Part 1 for more on that topic). As before, I wrap the call to this function in a helper to save me some typing:

function safeString($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8', FALSE);
}

Everything here should be self-explanatory (see the htmlspecialchars manual entry for explanations of the parameters to that function). I make sure to use this any time I display user-generated content, even content that I myself generate! Not only is it important from an XSS point of view, but it also helps keep your HTML valid.
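
A short sketch of what the helper does to a few inputs follows. The sample strings are invented for the example, and safeString is repeated so the snippet runs on its own:

```php
<?php
// Helper from above, repeated so the sketch is self-contained.
function safeString($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8', FALSE);
}

// Markup and quotes are encoded, so the browser shows the text instead of running it.
$escaped = safeString("<script>alert('hi')</script>");
// $escaped is now: &lt;script&gt;alert(&#039;hi&#039;)&lt;/script&gt;

// A bare ampersand is encoded...
$raw = safeString('Fish & Chips');

// ...but an existing entity is left alone, because the fourth
// argument (double_encode) is FALSE.
$once = safeString('Fish &amp; Chips');
```

Both $raw and $once end up as the same string, which is what makes the helper safe to apply to text that may already have been encoded once.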

4. Use Alternate Conditional Syntax for Cleaner Code

Displaying HTML based on a certain condition is incredibly handy when working with any web application. I used to write this kind of code like this:

<?php
if($someCondition)
{
    echo "\t<div class=\"myclass\">Some element to insert</div>\n";
}
else
{
    echo "\t<div class=\"myclass\"></div>\n"; // Empty element
}
?>

Not only do the backslashed double quotes look bad, but the whole thing is generally messy. Instead, I now make use of PHP's alternative syntax for control structures. Using this alternative syntax, the above code is modified to become:

<?php if($someCondition): ?>
    <div class="myclass">Some element to insert</div>
<?php else: ?>
    <div class="myclass"></div>
<?php endif; ?>

Isn't that better? The second form is much easier to read, arguably making things much easier to maintain down the road. And no more backslashes!
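
The same alternative syntax exists for loops (foreach/endforeach, for/endfor, while/endwhile), so it combines nicely with the if form. As a sketch only: renderTags, the tags class name, and the placeholder text below are hypothetical and not taken from any real page.

```php
<?php
// Hypothetical helper: prints a list of tags, or a placeholder when there are none.
function renderTags(array $tags)
{
?>
<?php if(count($tags) > 0): ?>
    <ul class="tags">
<?php foreach($tags as $tag): ?>
        <li><?php echo htmlspecialchars($tag, ENT_QUOTES, 'UTF-8'); ?></li>
<?php endforeach; ?>
    </ul>
<?php else: ?>
    <p>No tags yet.</p>
<?php endif; ?>
<?php
}
```

Calling renderTags(array('php', 'tips')) writes the list straight to the output, with each tag encoded on the way out.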

2 Comments

kip

9:55 PM on Apr 21, 2009
Nice, I am totally going to start using your getGet() and getPost() functions. I find I repeat code like that all the time. I had forgotten about the alternate if/else syntax because I thought it was useless, but in that format it makes sense. I guess I'll reciprocate these tips with a little PHP tip I use myself (and which I'll probably combine with your getGet and getPost functions):
function stripslashes_smart($string)
{
    if(function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
        return stripslashes($string);
    return $string;
}
This makes code more portable, since magic quotes is really just there to keep novice users safe from SQL injections, but if you know what you're doing it really interferes. The check for function_exists is because I think I read that the function will be removed starting with PHP6 (not sure though).

Gary

12:35 PM on Apr 22, 2009
Good ideas all. I adopted your getGet for Tar Heel Reader and expanded it to include a default return value and validation. The $rule argument below is a regular expression that the input must match.
function getGet($key, $default = null, $rule = null)
{
    if(isset($_GET[$key])) {
        if(is_array($_GET[$key])) {
            $result = $_GET[$key];
            // foreach also handles arrays whose keys are not 0..n-1 (e.g. ?key[a]=1)
            foreach($result as $value)
                if($rule && !preg_match($rule, $value))
                    return $default;
        } else {
            $result = trim($_GET[$key]);
            if($rule && !preg_match($rule, $result))
                return $default;
        }
        return $result;
    }
    return $default;
}
