web development war stories from the frontlines to the backend

Pretty major day in the world of the web, eh?

The PHP development team announced the release of version 5.3.0. This is a major milestone. Here's what I'm most excited about:

  • Lambda Functions and Closures
  • Ternary shortcut "?:"
  • Optional garbage collection for cyclic references

I'm excited about support for namespaces, but not the particular syntactical implementation chosen.

Also, the Mozilla team announced the official release of Firefox 3.5. Firefox is my browser of choice thanks to the great web development plugins available for it.

Version 3.5 looks to make an already excellent browser even faster.

I like Python. I like Ruby. I like C, C++, and Objective-C. I like Java. I also (actually) like PHP. I like programming - get it? Use whatever gets the job done and done well. Use whatever achieves the performance and scalability you require for a given task. Use what makes sense given a specific problem's domain. Use whatever aligns itself with the way your mind works. Don't be a one trick pony (that isn't a Django reference). Learn multiple languages, their strengths, and their weaknesses. Understand when a language's strengths will allow you to solve a problem faster, easier, better. Don't force a square peg through a round hole.

Watch this video. It's excellent food for thought and drives home the importance, as a programmer, of learning new, different languages to expand your ability to solve problems in a variety of ways. The video mentions the Sapir–Whorf hypothesis, which suggests that a particular language influences how a person understands and interacts with the world. This makes a lot of sense.

A programmer who thinks only in terms of a single language will attempt to solve every problem with that language. You need more than one tool on your belt because not every problem is a nail.

PHP Drinks Java


This post got me thinking about exactly why it is that PHP developers dislike Java.

While researching I stumbled upon this post which attempts to explain why you shouldn't treat PHP as if it were Java. The example singleton code looks nearly identical in both PHP and Java. The author suggests that an "experienced" PHP developer wouldn't attempt to think in terms of Java and would instead choose to implement this concept in PHP with the use of a global variable. This is ridiculous. It's crap like this that gives PHP a bad rap.

No, PHP isn't Java and shouldn't be treated as such. Nor should you abandon all logic and reason and disregard features of the language that make it appear Java-like.

PHP's object model borrows heavily from Java. This is fact. Its property and method visibility, single inheritance, interfaces, final classes, and abstractions are all very Java-like. Not to mention exceptions and garbage collection. I'll be damned if that doesn't cover a large portion of the features that developers use to write software in both PHP and Java. Are all of these concepts exclusive to Java? No, they're not. It just seems that the PHP team felt that Java made some good decisions and decided to emulate them.

I'll bet that lots of PHP code is unknowingly written Java-like. Is this because a PHP developer is "thinking in Java"? Perhaps it's because the foundational toolset which a PHP developer was given was designed with Java in mind. I'm not sure it makes sense to decide to use PHP to solve a problem and then fight with, ignore, or fail to take advantage of the features of the language. I think that would be a good reason to choose another language, no?

How could a language (obviously) borrow so heavily from another language and simultaneously shun the mention of that language?

Found some great PHP resources that I'd like to share. I haven't seen much talk of these so I'm hoping I can help spread the word.

First off, libmemcached.

Most PHP folks are familiar with the memcache (note the lack of a 'd' in the name) PECL extension. This extension exposes a simple API for PHP apps to interact with memcache instances. It works - it's simple, stable, and has been available since 2004. Nothing special.

On the other hand, libmemcached "is a small, thread-safe client library for the memcached protocol. The code has all been written with an eye to allow for both web and embedded usage." - "It has been designed to be light on memory usage, thread safe, and provide full access to server side methods." And, fortunately, there's a new PECL extension called memcached (note the 'd') that wraps libmemcached in a client library for PHP. It was released in late January and is still considered "beta"; however, in my testing it has been stable. This extension provides a rich interface to your memcache instances including the new check and set (cas), replace, and append operations. As libmemcached becomes more widely adopted and its development continues, it makes sense to unify support behind a common library.

Lastly, igbinary. This is a PHP extension which provides binary serialization for PHP objects and data. It's a drop-in replacement for PHP's built-in serializer. Why is this important? When storing data in memcache it is first serialized (this is done automatically by the client library, such as memcached). Conversely, when retrieving data from memcache the data is unserialized. The default PHP serializer uses a textual representation of data and objects. This is a waste of memory. Also, as objects increase in size and complexity the time it takes to (un)serialize increases significantly. Igbinary stores data in a compact binary format which reduces the memory footprint and performs operations faster. Most importantly, memcached has built-in support to take advantage of igbinary as its default serializer, yet another reason to use it as your memcache client library.

Check these resources out and let me know how they work for you!

Background

"Long polling" is the name used to describe a technique in which:

  • An AJAX request is made (utilizing a JavaScript framework such as jQuery)
  • The server loops and sleeps until the requested data is available (your server-side PHP script)
  • The request is repeated after data is returned to the client and processed (usually in your AJAX request's onComplete callback function)

This essentially simulates a continuous real-time stream from the server to the client. It can be more efficient than a regular polling technique because of the reduction in HTTP requests. You're not asking over and over and over again for new data - you ask once and wait for an answer. In most cases this reduces the latency with which data becomes available to your application.

There are a variety of use cases in which this technique can be handy. At the top of the list are real-time web-based chat applications. Each client executes a long polling loop for chat and user events (sign on/sign off/new message). Meebo is perhaps the greatest example of this.

It's important to note some of the server side technical limitations of long polling. Because connections remain open considerably longer than in a typical HTTP request/response cycle, you want your web server to be able to handle a large number of simultaneous connections. Apache isn't the best candidate for this type of situation. nginx and lighttpd are two lightweight web servers built from the ground up to handle a high volume of simultaneous connections. Both support the FastCGI interface and as such can be configured to support PHP. Again, Meebo uses lighttpd.

For similar reasons, it's also a good idea to choose a different sub-domain to handle long polling traffic. Browsers limit the number of simultaneous connections they open to a single host, so you don't want long polling connections interfering with the regular HTTP traffic delivering page and media resources for your application.

Implementation

jQuery makes implementation a breeze.

var lpOnComplete = function(response) {
	alert(response);
	// do more processing
	lpStart();
};

var lpStart = function() {
	$.post('/path/to/script', {}, lpOnComplete, 'json');
};

$(document).ready(lpStart);

Straightforward. When the document is ready the loop begins. Each iteration the returned data is processed and the loop is restarted.

On the server side - just like we discussed earlier:

$time = time();
while((time() - $time) < 30) {
	// query memcache, database, etc. for new data
	$data = $datasource->getLatest();

	// if we have new data return it
	if(!empty($data)) {
		echo json_encode($data);
		break;
	}

	// pause 25 ms between checks; sleep() takes whole seconds, usleep() takes microseconds
	usleep(25000);
}

A couple of points of interest here. We don't loop infinitely on the server side. You may have noticed the logic for the while loop - if we've executed for more than 30 seconds we discontinue the loop and return nothing. This nearly eliminates the possibility of substantial memory leaks. Also, if we didn't put a cap on execution time we would need to print a "space" character and flush output buffers on every iteration of the loop to keep PHP abreast of the status of this process/connection. Without output being sent PHP cannot determine whether the connection was lost via connection_status() or connection_aborted(). As a result this could lead to a situation where there are an increasing number of "ghost" processes eating up server resources. Not good!
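The capped wait loop isn't specific to PHP. Here's a minimal sketch of the same idea in Python, just to show the control flow in isolation. The names long_poll and get_latest are made up for this example, and the 30 second cap and 25 ms pause simply mirror the numbers used above.

```python
import time


def long_poll(get_latest, timeout=30.0, interval=0.025):
    """Check get_latest() until it returns data or the timeout elapses.

    get_latest is a hypothetical callable standing in for the
    memcache/database lookup in the PHP script.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = get_latest()
        if data:
            return data          # new data: answer the request right away
        time.sleep(interval)     # nothing yet: pause briefly, then check again
    return None                  # cap reached: return nothing, the client asks again


# Simulated data source: nothing on the first two checks, data on the third.
responses = iter([None, None, {"message": "hi"}])
result = long_poll(lambda: next(responses), timeout=1.0, interval=0.01)

# A source that never has data runs into the cap and yields None.
empty = long_poll(lambda: None, timeout=0.05, interval=0.01)

print(result, empty)
```

The important part is that both exits are bounded: the request ends either as soon as data shows up or when the deadline passes, so no request can hang around forever.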

That pretty much sums it up! Not that difficult, right?

As always, questions/comments are welcome, hope this helps!

I'm not sure why I haven't posted this yet. The following code block is one way of simulating named parameters in PHP for class method calls. It utilizes PHP's magic method __call and takes advantage of PHP 5's Reflection API for determining default values of parameters not passed.

Parameters can be passed two ways:

$obj->method(array('key' => 'value', 'key2' => 'value2'));
$obj->method(':key = value', ':key2 = value2');

Also, if using the latter (preferred) method of parameter passing you can define one parameter as an array in your method declaration and this will intelligently handle that as well. The TestClass below is an example of this.

$tc = new TestClass;
$tc->test(':a = testing', ':b = true', array('p', 'h', 'p'));

class TestClass extends NamedParameters
{
	public function _test($a = 'test', $b = false, $c = array())
	{
		echo '$a = '.$a."\n";
		echo '$b = '.($b ? 'true' : 'false').' ('.gettype($b).")\n";
		echo "\$c:\n";
		foreach($c as $k => $v) {
			echo '   '.$k.' = '.$v."\n";
		}
	}
}

And finally, here's the implementation:

class NamedParameters
{
	/**
	 * Implementation of PHP's magic method __call to support named parameters
	 *
	 * Actual helper method names that want to use named parameters are
	 * prefixed with _ so calls are redirected through this method. It
	 * uses PHP's reflection api to simulate named parameters.
	 *
	 * Named parameters can be passed in one of two ways.
	 * 		$obj->method(array('key' => 'value', 'key2' => 'value2'));
	 * or
	 *		$obj->method(':key = value', ':key2 = value2');
	 *
	 * @param string $n name of method application actually called
	 * @param array $a parameters passed to method
	 * @return mixed output of method call
	 * @access public
	 */
	public function __call($n, $a)
	{
		if(method_exists($this, '_'.$n)) {
			$methodParams = array();
			$passedParams = array();

			$reflectMethod = new ReflectionMethod(get_class($this), '_'.$n);
			if(isset($a[0]) && is_array($a[0])) {
				// first parameter passed is an array, so we assume all parameters are in this array
				$passedParams = $a[0];
			} else {
				// passing parameters as strings using named parameter syntax
				foreach($a as $v) {
					if(is_string($v)) {
						// format is ':parameterName = parameterValue'
						if(preg_match("/^:([a-z0-9]+)\s*=\s*(.+)$/isD", $v, $out)) {
							$passedParams[$out[1]] = $out[2];
						}
					} elseif(is_array($v)) {
						// technique allows one parameter to be an "array" parameter
						$passedParams['__array__'] = $v;
					}
				}
			}

			// loop through the parameters of the function being called
			foreach($reflectMethod->getParameters() as $i => $param) {
				$defaultAvailable = $param->isDefaultValueAvailable();
				$parameterName = $param->getName();
				if($defaultAvailable) {
					$default = $param->getDefaultValue();
					if(($paramType = gettype($default)) == 'array') {
						// match this parameter of the function being called to the array
						// that was passed, if any (see above)
						$parameterName = '__array__';
					}
				}

				if(array_key_exists($parameterName, $passedParams)) {
					// this parameter of the function being called was passed in the named parameters
					$val = $passedParams[$parameterName];
					if($defaultAvailable && is_string($val)) {
						// if the function being called specified default values for this parameter
						// we can type cast
						switch($paramType) {
							case 'boolean':
								$val = (strtolower($val) == 'true') ? true : false;
								break;
							case 'integer':
								$val = intval($val);
								break;
							case 'double':
								$val = floatval($val);
								break;
							default:
								break;
						}
					}
					$methodParams[] = $val;
				} elseif($defaultAvailable) {
					// parameter was not passed, assign a default value
					$methodParams[] = $default;
				} else {
					// parameter was not passed and no default value exists, trigger an error
					trigger_error("Beast __call to '".get_class($this)."::".$n."' missing parameter #".($i+1)." (".$parameterName.")", E_USER_ERROR);
					return null;
				}
			}

			// call the underlying method with the resolved parameters and return its result
			return call_user_func_array(array($this, '_'.$n), $methodParams);
		} else {
			// method doesn't exist, trigger an error
			trigger_error('Beast __call to a non-existent method '.get_class($this).'::'.$n, E_USER_ERROR);
			return null;
		}
	}
}

Hope you find this useful!

This post is simply stating the obvious. Sometimes even obvious things, in the wee hours of the morning, aren't so.

When you specify parameters in your URLconf like:

urlpatterns = patterns('',
    url(r'^mark/(?P<id>\d+)/(?P<complete>\d+)/$', views.mark, name='mark'),
)

Keep in mind that each captured argument is a Python string. Even if the regex only captures integers - 'complete' is still passed as a Python string to your 'mark' view.

So if you intended to pass a 0 for false and 1 for true you must make sure to convert to an integer, because only a string of length 0 is False. (bool('0') is True)

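from django.http import HttpResponse
from django.shortcuts import get_object_or_404

# Todo is assumed to be the model behind this view (import it from your app's models)
def mark(request, id, complete):
    todo = get_object_or_404(Todo, pk=id)
    todo.complete = int(complete)  # '0' -> 0 (false), '1' -> 1 (true)
    todo.save()
    return HttpResponse()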
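If you want to see the gotcha outside of Django, here's a quick sketch in plain Python. The captured dict is just a stand-in I made up for the keyword arguments the URL resolver would hand to the view.

```python
# Hypothetical captured values: every regex capture arrives as a string.
captured = {"id": "7", "complete": "0"}

# A non-empty string is truthy, even when it spells zero.
as_string = bool(captured["complete"])

# Converting to int first gives the false value that was intended.
as_int = bool(int(captured["complete"]))

print(as_string, as_int)  # True False
```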

[Image: 37signals_php]

What exactly are they using it for? Front-end?

I've spent more time with Django the past couple days. Read my installation guide and my first impressions to get caught up. I wanted to address a couple issues I came across as I was exposed to certain architectural designs of Django.

It might be helpful to note which books available today cover Django 1.0. I do realize that the "official" Django book covering 1.0 is in the works, but, in the meantime I recommend Python Web Development with Django and Pro Django.

Let's start with the template system. While Django's template system is powerful, I'm sure, it's basically a free-for-all of file names and directory structures. You could conceivably create a single 'templates' directory and place all of a project's template files in that directory (naming the files as you please) for any and all the apps that compose the project. While it's not recommended you actually do that, Django would support it because it doesn't seem to really dictate or enforce any particular convention. I feel like Rails, comparatively speaking, provides solid conventions for a developer to follow in terms of file naming, directory naming, and directory structure.

I was also surprised at the fact that Django does NOT bundle a pluralization library. Even simple cases (ending in y to ies) aren't handled automatically. A model named 'Category' appears as 'Categorys'. It does provide support for explicitly specifying plural names for Models via the verbose_name_plural Meta property:

class Category(models.Model):
    name = models.CharField(max_length=128)
    class Meta:
        verbose_name_plural = 'categories'
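To make the 'Categorys' behavior concrete, here's a small sketch in plain Python. Both functions are hypothetical helpers written for this post, not anything Django ships: the first just appends an 's' (which is the result you see for 'Category'), the second adds the single "y to ies" rule mentioned above.

```python
def naive_plural(name):
    """Append 's' unconditionally, which is how 'Categorys' comes about."""
    return name + "s"


def simple_plural(name):
    """Handle only consonant + 'y' -> 'ies', otherwise fall back to 's'."""
    if len(name) > 1 and name.endswith("y") and name[-2].lower() not in "aeiou":
        return name[:-1] + "ies"
    return name + "s"


print(naive_plural("Category"))   # Categorys
print(simple_plural("Category"))  # Categories
print(simple_plural("Day"))       # Days
```

Even this one rule needs an exception for vowel + 'y' words, which hints at why a framework might prefer an explicit verbose_name_plural over guessing.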

Coming from a world of a custom built PHP framework - it's REALLY nice to have features like the command line sandbox to be able to play with Models and interact with the project's codebase. It's also great that Django bundles a development webserver for a quick and easy create-test-edit cycle. It recognizes changes to your code while it's running - there's no need to restart manually.

The fact that Python functions are first-class has far-reaching effects. Django uses this extensively (such as in your URL configuration or default values for Models). It's extremely intuitive to be able to do the following in your Model definition:

from datetime import datetime

class Post(models.Model):
    stamp = models.DateTimeField(default=datetime.now)

Notice this wasn't written as datetime.now(). That would actually execute the function when the class is declared and all entries would receive an identical stamp. Instead we're passing a reference to the function. Django detects this and calls the function when the Model is instantiated.
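Here's a stripped down sketch of the difference in plain Python. The Field class is a hypothetical stand-in I wrote for illustration, not Django's actual implementation, and next_stamp plays the role of datetime.now (a counter makes the result easier to see than real timestamps).

```python
import itertools

_counter = itertools.count(1)


def next_stamp():
    """Stand-in for datetime.now: returns a new value on every call."""
    return next(_counter)


class Field:
    """Hypothetical, simplified field that accepts a value or a callable default."""

    def __init__(self, default=None):
        self.default = default

    def get_default(self):
        # Call the default if it is callable, otherwise use the value as-is.
        return self.default() if callable(self.default) else self.default


evaluated_once = Field(default=next_stamp())  # called right here, at definition time
evaluated_late = Field(default=next_stamp)    # reference only, called on each use

fixed = [evaluated_once.get_default() for _ in range(3)]
fresh = [evaluated_late.get_default() for _ in range(3)]

print(fixed)  # [1, 1, 1]
print(fresh)  # [2, 3, 4]
```

With the parentheses every "entry" shares the one value computed up front; without them each one gets its own.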

I'm also excited about the way the Django framework handles requests and responses. The concept of the view receiving an HttpRequest object, giving context, and returning an HttpResponse object just makes sense. It's very much in line with the way HTTP works. It's simple, powerful, and elegant.

I've read about (and watched video of) some issues that people have with Django. These usually refer to difficulties one may have scaling it. Lack of built-in support for multiple databases and sharding are cited, among other reasons. I think, for what it's able to do right out of the box, it's fantastic. Scaling (in general) isn't always straightforward. In many cases it requires specific tools and solutions for the task at hand. These issues should in no way prevent you from using Django for a project!

More soon.

After trying for many years to maintain a personal blog, with many starts and stops, I have given up. I realized I don't *want* to write personal thoughts on the web. Instead, I've been messing around with my own custom tumblelog scripts. I use delicious, Flickr, last.fm, and other services that I want to aggregate in one place. I know there are services out there that do this already, but I thought rolling my own would be a good excuse to dust off my PHP skills.

One feature I wanted to have was the ability to grab images from other sites, re-size them, and copy them to my local server. Nothing special, but very useful.

$remote_image = "some_img_url";

$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $remote_image);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 0);

$file_as_string = curl_exec($ch);
$new_file_name =  substr(strrchr($remote_image, "/"), 1);
$path_to_img = '/path/to/image/folder/'.$new_file_name;

resizeImage($file_as_string, 600, $path_to_img);

curl_close($ch);

function resizeImage($img_as_string, $max_width, $dst){
   $image = imagecreatefromstring($img_as_string);
   $orig_width = imagesx($image);
   $orig_height = imagesy($image);

   $width = $orig_width;
   $height = $orig_height;

   if ($orig_width > $max_width){
      // round to a whole pixel; imagecreatetruecolor() expects integer dimensions
      $height = (int) round(($orig_height * $max_width) / $orig_width);
      $width = $max_width;
   }

   $image_p = imagecreatetruecolor($width, $height);

   imagecopyresampled($image_p, $image, 0, 0, 0, 0, $width, $height, $orig_width, $orig_height);

   imagejpeg($image_p, $dst, 100);
}

What this script does is:

  • using curl, grab the image remotely (as a string)
  • parse the name of the image, create a string that represents the server path to copy the image to
  • proportionally resize the image if it's wider than 600 pixels
  • save it

This code can be easily modified to resize the image if it's taller than x pixels.  I made a very simple form, so now I can supply the script with the image url, and it takes care of the rest.
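The "taller than x pixels" variation comes down to the same proportion applied to the other dimension. Here's a sketch of just the arithmetic in Python, leaving the image handling out. The function name fit_within and its max_height parameter are my own additions for illustration and aren't part of the script above.

```python
def fit_within(width, height, max_width=None, max_height=None):
    """Return (width, height) scaled down proportionally to fit the given limits."""
    scale = 1.0
    if max_width is not None and width > max_width:
        scale = min(scale, max_width / width)
    if max_height is not None and height > max_height:
        scale = min(scale, max_height / height)
    # Round to whole pixels and never go below 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1200, 800, max_width=600))   # (600, 400)
print(fit_within(800, 1200, max_height=600))  # (400, 600)
print(fit_within(300, 200, max_width=600))    # (300, 200)
```

Taking the smaller of the two scale factors means an image that is both too wide and too tall still fits inside both limits, and images already within the limits are left alone.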
