Tuesday, February 17, 2009

A couple of things on PHP

Hi!

As I promised yesterday in Bash Tricks I, here is a spin-off article on things related to PHP.

I want to talk about two things, actually.

1 - Performance: Are constants faster than variables?
2 - Security: My usage of an FS_ROOT constant (which could become a variable depending on the results of point 1) to hide all non-starting scripts from Apache.

Variables vs Constants
Yesterday I ran each script 1000 times to see whether a script that handles a variable or one that handles a constant is faster. The one with the constant was a little faster. Let's redo the test with 10000 runs instead:

variable.php (without the php starting tags):
$VARIABLE = 5;
echo $VARIABLE . "\n";

constant.php:
define('CONSTANT', 5);
echo CONSTANT . "\n";

Let's run them then:
echo Variable; time ( i=0; while [ $i -lt 10000 ]; do php variable.php > /dev/null; i=$(( $i + 1)); done ); echo Constant; time ( i=0; while [ $i -lt 10000 ]; do php constant.php > /dev/null; i=$(( $i + 1)); done )
Variable

real 12m49.269s
user 4m31.665s
sys 2m23.161s
Constant

real 12m36.780s
user 4m28.013s
sys 2m27.241s

Well... not much difference, really: the constant version was less than 2% faster in real time.
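The one-liner above can also be written as a small reusable helper. This is only a sketch: the name run_n_times is made up for illustration, and the usage lines assume that php and both scripts are available in the current directory.

```shell
# Run a command COUNT times, discarding its standard output.
# Usage: run_n_times COUNT COMMAND [ARGS...]
# The function body is a subshell (parentheses), so the loop
# variables do not leak into the calling shell.
run_n_times() (
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
)

# Hypothetical usage in bash, where `time` can time a function call:
#   time run_n_times 10000 php variable.php
#   time run_n_times 10000 php constant.php
```

Passing the command as arguments means the same loop is used for both scripts, so the only thing that changes between runs is the script itself.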

Now... does this difference carry over to a PHP script running on Apache? To find out with bash, I'll use one of my favorite (and most basic) web development tricks: acting as a web client from a terminal. Strictly speaking, bash won't be the client, but I'll use bash to repeat the request many times. How does it work? As most web developers know, we can use telnet to make requests to a web server. Let's do a basic request: connect to yahoo.com on port 80 and ask for the default page of www.yahoo.com to see what it says:

telnet yahoo.com 80
Trying 68.180.206.184...
Connected to yahoo.com.
Escape character is '^]'.
GET http://www.yahoo.com HTTP/1.0

HTTP/1.1 301 Moved Permanently
Date: Tue, 17 Feb 2009 19:15:04 GMT
Location: http://www.yahoo.akadns.net/
Connection: close
Content-Type: text/html; charset=utf-8

The document has moved here.



Connection closed by foreign host.

After I connected to the web server successfully, I sent the request GET http://www.yahoo.com HTTP/1.0 followed by an empty line, and the server replied with the headers, an empty line, and then the content of the web page.
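Since a response is just headers, an empty line, and then the content, it is easy to pick apart with standard tools. A minimal sketch follows; the helper names http_status and http_body and the file name response.txt are made up for illustration.

```shell
# Print the numeric status code from the first line of an HTTP response
# read from standard input (for example 301).
http_status() {
    awk 'NR == 1 { print $2; exit }'
}

# Print only the body: every line after the first empty line.
# The optional \r allows for the CRLF line endings servers normally send.
http_body() {
    awk 'in_body { print; next } /^\r?$/ { in_body = 1 }'
}

# Hypothetical usage, assuming a response was saved to response.txt:
#   http_status < response.txt
#   http_body < response.txt
```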

Cool... but I won't be doing that 10000 times by hand to test my scripts on Apache, right? Instead of telnet, let's use another tool: netcat. We can feed the request to netcat's standard input, and netcat sends it on to the web server. Like this:
{ echo GET http://www.yahoo.com/ HTTP/1.0; echo; } | netcat yahoo.com 80
HTTP/1.1 301 Moved Permanently
Date: Tue, 17 Feb 2009 19:20:15 GMT
Location: http://www.yahoo.akadns.net/
Connection: close
Content-Type: text/html; charset=utf-8

The document has moved here.
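A slightly stricter variation on the request above uses printf instead of echo, so each line ends in CRLF as the HTTP specification asks, and sends the path plus a Host header instead of a full URL in the request line. This is a sketch, not what was run for this article, and the name http_get_request is made up for illustration.

```shell
# Print a minimal HTTP/1.0 GET request.
# $1 is the path to request, $2 is the host name.
# Every line ends in CRLF (\r\n) and an empty line terminates the request.
http_get_request() {
    printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$1" "$2"
}

# Hypothetical usage against a local server:
#   http_get_request /variable.php localhost | netcat 127.0.0.1 80
```

When timing several scripts, building every request with the same helper keeps the requests identical apart from the path.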

Cool, now I can make as many requests as I want, one after the other, and see how long each of the scripts takes to run. So, let's see:

echo Variable; time ( i=0; while [ $i -lt 10000 ]; do { echo GET http://localhost/variable.php HTTP/1.0; echo; } | netcat 127.0.0.1 80 > /dev/null; i=$(( $i + 1)); done ); echo Constant; time ( i=0; while [ $i -lt 10000 ]; do { echo GET http://localhost/constant.php; echo; } | netcat 127.0.0.1 80 > /dev/null; i=$(( $i + 1)); done )
Variable

real 2m17.096s
user 0m58.724s
sys 1m8.092s
Constant

real 2m7.158s
user 0m55.163s
sys 1m2.608s

Well... I notice two things here:
1 There was a roughly 7% reduction in real time when using constants
2 This puts the "spawning processes is expensive" mantra in perspective, doesn't it? A reduction of roughly 80% in real time compared with running the script with the PHP binary.
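The percentages can be checked directly from the real times printed by time. Here is a small sketch of the arithmetic; the helper names to_seconds and pct_reduction are made up for illustration.

```shell
# Convert a duration as printed by `time` (for example 2m17.096s) to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the percentage reduction going from BEFORE to AFTER (both in seconds).
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Variable vs constant through Apache (real times from the run above):
#   pct_reduction "$(to_seconds 2m17.096s)" "$(to_seconds 2m7.158s)"    # prints 7.2
# PHP binary vs Apache, variable script:
#   pct_reduction "$(to_seconds 12m49.269s)" "$(to_seconds 2m17.096s)"  # prints 82.2
```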

A parenthesis: I tried with 1000 runs instead of 10000 and something weird happened. Each test finished in under 7 seconds (always with the constant as the winner), so the 10000-run tests took almost 20 times longer instead of the expected 10 times. Any explanations for it? End of parenthesis.

Now, if you wanted to use the value of the variable inside a function without passing it as a parameter, you would have to declare it with "global" before using it, but there's no need to do that with a constant. Let's see how that changes the result:
variable.php:
$VARIABLE = 5;

function printValue() {
    global $VARIABLE;
    echo $VARIABLE . "\n";
}

printValue();


constant.php:
define('CONSTANT', 5);

function printValue() {
    echo CONSTANT . "\n";
}

printValue();


Let's run the scripts 10000 times:
echo Variable; time ( i=0; while [ $i -lt 10000 ]; do { echo GET http://localhost/variable.php HTTP/1.0; echo; } | netcat 127.0.0.1 80 > /dev/null; i=$(( $i + 1)); done ); echo Constant; time ( i=0; while [ $i -lt 10000 ]; do { echo GET http://localhost/constant.php; echo; } | netcat 127.0.0.1 80 > /dev/null; i=$(( $i + 1)); done )
Variable

real 2m20.279s
user 0m59.952s
sys 1m7.548s
Constant

real 2m11.901s
user 0m57.440s
sys 1m3.176s

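Again the constant wins, this time with a reduction of roughly 6% in real time. One caveat: in both Apache runs the Constant request was sent without the HTTP/1.0 token that the Variable request carries, so the two requests were not strictly identical and part of the gap may come from that. With that caveat, I guess that's it on this subject: the constant came out ahead in every test.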

My usage of FS_ROOT

After so much time dealing with PHP and requires/includes, I came to use a constant that always tells me where the root of the project is on the file system. This value, along with a number of others (HTTP_ROOT, DB settings and so on), lives in a single script (unsurprisingly, it's called conf.php). Now, no matter where I start the execution of a script, I know where to find conf.php relative to that starting point, and then I just don't care where the other scripts are... I always load them using FS_ROOT as the root directory:
require_once FS_ROOT . "/model/one_model.php";
require_once FS_ROOT . "/utilities.php";

You get the idea. I guess that's not rocket science... but then something weird happened. I was integrating phpweby's ip2country into my new project. This module has a script that updates the ip2country information in the DB, and it has to be deleted afterwards for security reasons... and then it hit me: that script, which is not supposed to be called from the web, could live outside the directory tree that Apache serves, and then I wouldn't have to delete it, would I? And that brought me to another thought: what if I didn't have to publish ANY of the scripts I use in my project, besides the starting scripts? That's when FS_ROOT becomes vital. By using FS_ROOT to locate all the other scripts, they can simply sit outside the files Apache serves, and so the project is "safer", don't you think?

So, now the project I'm working on has about 20 scripts "inside" Apache, and all the other scripts (a whole bunch of them) are safely protected outside of Apache's reach. Now, I've never used chroot jails, so I don't know if a jail would allow this kind of behavior. What do you think of the trick?
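One way to double-check such a layout is to list every PHP file that sits under the directory Apache serves: anything that shows up can be requested directly from the web. This is only a sketch; the function name list_exposed_scripts and the public/ directory in the usage line are assumptions, not the layout of my project.

```shell
# List every .php file under a document root, relative to that root.
# $1 is the directory the web server serves.
# The body runs in a subshell, so the cd does not affect the caller.
list_exposed_scripts() (
    cd "$1" || exit 1
    find . -type f -name '*.php' | sed 's|^\./||' | sort
)

# Hypothetical usage, assuming the served directory is /var/www/project/public:
#   list_exposed_scripts /var/www/project/public
```

If the list contains only the starting scripts, then conf.php and everything loaded through FS_ROOT are out of direct reach of a web request.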

Well. That's it for today. I hope you can take advantage of this information.
