Wednesday, May 14, 2008

Performance difference using mysql_real_escape_string versus intval

Someone asked me today what the performance difference would be if they used mysql_real_escape_string() on all of their variable inputs into a database instead of casting to variable type integers for the int inputs.
I created the below bechmark test to demonstrate what performance differences there would be in using each of the possible functions in such a situation.

It’s important to note that all user input variables that are intended to be strings go through the mysql_real_escape_string() function to ensure proper sanitisation for those strings.
But for integers, what about those? should we use the mysql_real_escape_string() function on them, or should we cast the variable type int?
Lets find out what the performance differences are:

PHP Code:
<?php

$link 
mysql_connect('localhost''username''password');
if (!
$link)
{
    die(
'Could not connect: ' mysql_error());
}
echo 
"Connected successfully<br />\n";

$string "12345'?;/FOO/FUM\';Example String =";
$num 500000;
echo 
"benchmark: Begin<br />\n";

$start_time microtime(true);

for (
$i 0$i $num$i++)
{
    
$escaped mysql_real_escape_string($string);
}

$escape_benchmark microtime(true) - $start_time;

$start_time microtime(true);

for (
$i 0$i $num$i++)
{
    
$intval intval($string);
}

$intval_benchmark microtime(true) - $start_time;

$start_time microtime(true);

for (
$i 0$i $num$i++)
{
    
$int = (int) $string;
}

$int_benchmark microtime(true) - $start_time;

echo 
"escape_string time: $escape_benchmark <br />\n";
echo 
"intval time: $intval_benchmark <br />\n";
echo 
"int time: $int_benchmark <br />\n";

echo 
'intval vs escape_string difference: ' . ($escape_benchmark $intval_benchmark) . "<br />\n";
echo 
'int vs escape_string difference: ' . ($escape_benchmark $int_benchmark) . "<br />\n";
echo 
'int vs intval difference: ' . ($intval_benchmark $int_benchmark) . "<br />\n";

echo 
'Percent faster intval is than escape: ' round((- ($intval_benchmark $escape_benchmark)) * 100) . "%<br />\n";
echo 
'Percent faster int is than escape: ' round((- ($int_benchmark $escape_benchmark)) * 100) . "%<br />\n";
echo 
'Percent faster int is than intval: ' round((- ($int_benchmark $intval_benchmark)) * 100) . "%<br />\n";
echo 
"<br />\nThe result of the sanitisation:<br />\n";
echo 
"escape_string: $escaped <br />\n";
echo 
"intval func: $intval <br />\n";
echo 
'cast int: ' $int;
mysql_close($link);

?>

Result:

Connected successfully
benchmark: Begin
escape_string time: 0.697912931442
intval time:  0.349272966385
int time:   0.16778588295

intval vs escape_string difference: 0.348639965057
int vs escape_string difference: 0.530127048492
int vs intval difference:   0.181487083435

Percent faster intval is than escape: 50%
Percent faster int is than escape:  76%
Percent faster int is than intval:  52%

The result of the sanitisation:

escape_string: 12345\'?;/FOO/FUM\\\';Example String = 
intval func: 12345
cast int:  12345

From the benchmark, we can deduce that (int) produces a 70% to 75% performance advantage over using mysql_real_escape_string(); and a 50% performance advantage over using the intval() function.
In addition, if you are using an UPDATE or INSERT SQL query, forcing var type int on integer variables will mean that your query will produce a valid result, whereas an intended int variable that contains a string which goes through the mysql_real_escape_string() function is going to throw an SQL Error if the SQL database is expecting a float or integer value of some sort.

Of course, don’t take my word for it... the above script is provided for you to test yourselves as well. :)
If you want to test the benchmark without making a database connect, substitute the mysql_real_escape_string() with the older mysql_escape_string() function, which does not require a database link to process.
Enjoy!
- Highway of Life

Friday, May 2, 2008

Why Sanitisation and Type Casting is Important in PHP

PHP is a language that is very open to different types of coding.
Many good, many bad. There are always multiple ways of doing something in PHP, that’s one reason it is such a great language.
Everyone has their own way of doing something. The most common problem, however, is lack of proper knowledge of the language. Take someone who has worked on Java or C for 20 years and put them in front of PHP, they will learn the language quickly, but they will not necessarily pick up the true and correct methods and concepts of the language. Amazingly, few do.

In working with many PHP programmers as an instructor, I’ve seen coders from every level, starting from novice to advanced.
From inexperienced beginners to 20-30 years of experience programming C-based languages.
It is an easy language to learn, but a difficult language to master -- properly.
But even as easy as the language is, as popular as the language is, as many people program in it, few code correctly and securely as the language was intended to be programmed in. -- As an instructor, I regularly see code that is vulnerable in one way or another, about 1 in every 3 scripts has a severe vulnerability of some kind (75% of the time, it is an SQL Injection and usually due to poor or no sanitisation methods).
These are mostly due to not understanding typecasting properly, not using it, or not understanding the concepts behind proper coding in PHP, including coding PHP securely and the way it was intended to be used.

The reason that there were 20,000 sites hacked between January and April of 2008 was due to SQL Injections in web applications that hackers exploited to insert code that would inject iframes into every web-based dynamic (ASP or PHP) file on the website.
Some 200,000 plus webpages contained these iframes.
So how do people address this? How do you, as a site owner and coder, secure your site or ensure that the applications or scripts that you use or code are going to be secure and free from exploits?
You code in the way that PHP was intended, and the way that PHP was intended to work with databases.
Sure, you can program PHP in the way that you want to, you can use shortcuts, or do things your way because you feel like it, or you can program in a way that is correct and secure.

One of the most important concepts that I’ve had to drill into the heads of my students has been proper sanitisation of variables AND type casting.
Type casting is equally as important as sanitisation of strings.
PHP is built to be a dynamic language where you can dynamically assign an assortment of type definitions to a variable, it does not require or support explicit type definitions, this means that it is set by the content that is assigned to the variable.

If your Database field is intended to contain a float, you, the programmer must ensure that the data you are placing into that field is a float. If the field is an integer, you must ensure that the variable is an integer. If it is meant to contain a string, you must ensure that the variable type is going to be a string.
User data can be entered as a string, integer, float, or an array.
You cannot insert an array into a string field, a string field into an integer field, or float into an integer field (for example).

When you create a database, you define the fields to be exact database types, exact numeric data types such as Integer, decimal and numeric, approximate numeric data types such as float, real or double, a bit data type (as of MySQL 5), date and time data types, string data types such as char, varchar, text, binary (similar to char, but stores binary byte strings), blob and enum.

When creating a database, you do not create each field to be a string, you create the field with the data type of the information it will be storing -- I hope I don’t have to go into examples about *why* you must do this with databases --.
You then assign data to these database tables through your database interface, most commonly the built-in mysql functions in PHP.
When inserting data, they are meant to contain the data type that you have assigned to the field. If you created an numeric data type field, you need to ensure that the variable is an integer or float (float, double or real) as necessary.
If you do not, you introduce many potential and guaranteed problems into both your code and your database, database errors notwithstanding.

If you insert string data types, you need to ensure that the data that you are inputting is not a resource or array. -- Integer data types are allowed in string data type fields.
That means that even when pulling data, such as $_POST['my_variable'], you cannot just input this straight into your database, obviously, you need to sanitise the variable.
Your method of sanitisation does depend on what data type the field is that you will be inserting it into.
If it is int, force var-type int, if it is a float, you need to force it to be a float or double. Remember, PHP is a dynamic language, so it will change the var-type easily.
The variable may be an array, which you will need to deal with if you are attempting to insert the data into a string data type field.

All these must be must be considered if one is to create a script that is completely secure from Injection. -- If you want to ensure you are secure, create your code the way PHP intended for it to be created. Use proper type-casting.
It will save you many headaches down the road, it is well worth the research anyone would bother to spend on data types and type casting in PHP and your database.

People take database manipulation and interaction much too lightly, and it shows... 20,000 sites hacked in 4 months is too many. Certainly unnecessary and could have been avoided if the programmers had bothered to become familiar with correct sanitisation methods and typecasting.

If you are a user who is unsure about your script, you are unsure if you are using proper sanitisation and type casting methods, talk to programmers who you know are experienced in the language you are working with.
Do some research, on type casting and proper usage of the mysql_real_escape_string() function. If you are not using MySQL, check your DBMS type to see what function it uses to properly sanitise strings.
And last but not least, read Type Casting in PHP on how to properly use typecasting in PHP.
And use Google to research the topic in question, a little bit of research will go a long way to providing you with the necessary knowledge to build secure and stable scripts for your site or your clients.

- Highway of Life