Tudor Barbu's professional blog

Ramblings about software development

Code refactoring

I had a discussion with a friend a few days ago about code refactoring. I think that most of the problems in IT emerge from the fact that almost everybody follows this pattern:

- Does it compile?
- Yes, but…
- Then ship it!!! Now!

I consider code refactoring a very important step of the development process, especially when dealing with strict deadlines and short iterations. When you receive the specs for a project today and the project manager tells you that the deadline was yesterday, you don’t have the time to properly implement all the design patterns or write documentation.

Unfortunately I have seen a lot of projects where nobody bothered to refactor the code as it was deemed uneconomic – “How will we explain this to the customer? Paying twice for the same thing…” – but most of the time the customer ends up paying several times more, because the development process becomes really slow, the deliverables become extremely buggy and the number of billed hours grows exponentially.

Or I just have a lot to learn about business practices and this is the desired way of doing development :) Who knows? Because I’ve seen this happening way too often.

I came across this article yesterday. The author – who calls himself Uncle Bob – has a pretty interesting biography, but that article proves that even the best of us can come up with really weird ideas. I’m not sure if regulating software development is worse than SOPA or not.

After reading a letter from another developer which tells a story about a greedy & careless (like any other) manager who jeopardizes the life of several patients by sacrificing good practices in order to hit a deadline, “Uncle Bob” comes to the conclusion that software development should be a regulated profession. And here is where things become weird.

Who can regulate software development? Who is to say that John Doe is a good developer and James Doe a bad one!? Except the market?

And furthermore, how would this be done? Through academic certification? Everyone will agree that this idea is incredibly dumb, as many industry icons don’t have a college degree. Here are some examples:

  • Steve Jobs (dropped out)
  • Bill Gates (dropped out)
  • Larry Ellison (dropped out twice)
  • Michael Dell (dropped out)

… just to name a few. Who can say that these people can’t work in technology because they didn’t pass some lame tests?

Or let’s just have an independent body – like the WWW Consortium – which regulates the industry. Who will fund it? I for sure am not going to pay a share of my salary to keep some boring bureaucrats in office just to tell me what I can and cannot write.

And what happens with new – yet unregulated – technologies? Can we use those? If a new framework appears, do we have to wait until it gets tested and approved? That could take years; the HTML5 specs have been in the works since 2004…

I agree that we all should be held accountable for our actions, but there are enough regulating bodies out there. If you’re developing medical software, then it should be clinically tested like any other medical equipment or if it’s software for cars, the bodies that issue car licenses will test it. The same for software managing nuclear power-plants or airplanes. In order to say that a nuclear reactor is safe to use, the people testing it will just have to test the software also – which I’m sure happens anyway.

If it fails, the company that wrote the software gets fined, the product is banned and the people involved get sacked. And if you get canned enough times, then nobody will hire you.

Problem solved, the profession is regulated. No bureaucracy needed!

Open Hack EU

Some asshole flying a plane in front of my house at 7 in the morning just woke me up, interrupting my well-deserved rest after Yahoo’s 24-hour hackathon. But this post is not about Cessnas spraying chemicals, it’s about one of the best programmer-oriented events I’ve ever attended. Thank you Yahoo! for making it possible.

I teamed up with my colleague from Vendo Adolfo Abegg and Alex Brausewetter, a former colleague that decided to start his own company called Helpdesk, aimed at delivering the best customer support and ticketing software on the market. Some friendly SEO can’t hurt, right ;) ?

The hacking event was great, a lot of cool hacks, both software and hardware. Yahoo’s Pipes service was hacked by members of the Romanian Security Team on the stage, while Yahoo’s CTO was watching – awkward. It seems that they got pulled into a “private chat” by Yahoo!’s officials shortly after. Read the whole story here.

Another hack that got my attention was evA-Ziune – a site where you can report small fraud, like not getting receipts, using Foursquare’s API. This will quickly overload their servers :) !

The hardware hacks were also impressive, especially the one called Yahoo! Farm, a table-sized farm which can tell you how many friends you have online on YMessenger using robotic sheep. Pretty nifty! It won the Hacker’s choice prize! Anyway, all the projects are available here and the winners here. Take a look! There are a lot of cool projects out there!

Our project was mooooody.com, a site which detects your location, queries Yahoo! Weather for that location and plays music accordingly. Although it has no market value – or any other value whatsoever – we had fun building it and we got a lot of positive feedback from other attendees. We mashed up a lot of clever hacks and technologies, but a project that doesn’t address a real problem and is built around illegally playing copyrighted music from YouTube – a service belonging to Yahoo!’s arch-rival Google – couldn’t possibly impress the jury, so we didn’t win anything. Nevertheless, we’re proud of it!

Congratulations to the winners, congratulations to Yahoo! for organizing such an event and congratulations to all attendees for creating such a cool atmosphere.

I like the way open source works. For example, my former colleague and manager started his own company that produces customer support and ticketing software called Helpdesk and one of the challenges he had to solve was building a reliable search system that will allow his users to search through thousands of tickets.

And since we all know that providing a reliable search system is not an easy job, a decision was made to outsource the searching features to IndexTank, a company that provides scaled real-time search. Here’s where the open source part comes in: since there wasn’t any suitable Zend Framework component for integration with IndexTank, Alex decided to write his own and open-source it on Github. It’s really cool and well documented. Deutschland uber alles :)

Also check out the post on IndexTank’s blog and Helpdesk‘s website.

Also talked about by Dan Pink at TED. I think this is one of the main reasons why open source projects work the way they do. A whopping 99.9% of all the software engineers I know think that they could do a much better job if they had more autonomy. And most software we see is buggy, unusable, over-time and over-budget. I wonder why…

If you’re into management, I think you should have a good look at this video. I know a lot of people that desperately need this kind of knowledge…

PS: holding a gun to somebody’s head also works as a great motivator, as proven by the Spanish government earlier this month, when it used the military to “incentivise” the air traffic controllers to go back to work :))

You know the saying: if debugging means taking the bugs out, then programming means putting them in. Yes. We all have bugs in our code. And since not all of them can be marketed as “undocumented features”, from time to time we have to debug our applications.

The best debugger for PHP I’ve used so far is Zend’s. Zend Platform together with Zend Studio constitutes a very good development environment and a great debugging environment. Because Zend Studio is a little pricey, I don’t use it any more; instead I’m using a highly customised vim. This makes a great development environment, but unfortunately it isn’t that great for debugging. I know you can use vim with Xdebug, but it’s quite a chore, and I don’t like it. Since old school debugging with var_dump() or print_r() is out of the question, I was looking for another way to debug my applications. And I’ve found just the thing: FirePHP. It’s a Firefox extension, just like FireBug, that can receive debug information from the server.

Since I do most of my bugging programming on Zend Framework, I also need debugging for this platform. I use the OOP based bootstrapping method, where you extend your Bootstrap class from Zend_Application_Bootstrap_Bootstrap. And in the .htaccess file of the /public/ directory, I set an environment variable with the current state of the application (usually one of development/staging/production):

SetEnv APPLICATION_ENV development

Normally, I need the debug information only when the application is in the “development” state, so I’m using this method in the Bootstrap class.

class Bootstrap extends Zend_Application_Bootstrap_Bootstrap {
    // Bootstrap other components

    /**
     * inits FirePHP for debugging
     *
     * @return void
     */
    protected function _initFirebugDebugger() {
        if(APPLICATION_ENV == 'development') {
            // don't debug while not in "development"
            $logger = new Zend_Log();
            $writer = new Zend_Log_Writer_Firebug();
            $logger->addWriter($writer);

            Zend_Registry::set('logger',$logger);
        }
    }
}

I also like to have some syntactic sugar when developing, so I’ve defined this function in the Bootstrap.php file. Yes, I know that this might be perceived as blasphemy by some of the OOP purists out there, but I really don’t care. If you don’t want non-OOP “stains” on your code, simply create a YourApplication_Utility_Firebug class or whatever with a static debug() method and paste the code in it.

/**
 * syntactic sugar for logging errors
 * and debug messages to FireBug
 *
 * @param mixed $message
 * @param string|null $label
 * @return void
 */
function fb($message, $label = null) {
    if($label !== null) {
        $message = array($label, $message);
    }

    if(Zend_Registry::isRegistered('logger')) {
        // Zend_Log::log() requires a priority as its second argument
        Zend_Registry::get('logger')->log($message, Zend_Log::DEBUG);
    }
}

And now, any time you need to debug something, simply type:

fb($variable);
fb($_POST); // and so on

And all these variables will be sent to Firefox’s FirePHP toolbar and you can inspect them from there. For even better results, you can also send debug information from your ErrorController to FirePHP (comes in handy when using Ajax).

PS: I’ll have a look into FireLogger for PHP. It also looks pretty interesting, although it’s only in Beta.

Sometimes, when you’re dealing with large images, flash movies or large javascript files, it’s generally a good idea to force them into the client’s cache.

A very simple way to achieve this is by using Apache’s mod_expires. For instance, if you add the following example, taken from the manual, to your .htaccess file – assuming of course that mod_expires is properly installed and configured – it will tell the browser to cache all the files for a month.

ExpiresDefault "access plus 4 weeks"
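As a rough illustration of what that directive amounts to, the Python sketch below computes the two response headers that mod_expires is documented to control (Expires and the max-age directive of Cache-Control) for a lifetime of four weeks counted from the moment of access. The function name `expiry_headers()` is made up for this example; it is not part of Apache.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(access_time, weeks=4):
    """Headers for a resource that stays fresh for `weeks` after access."""
    lifetime = timedelta(weeks=weeks)
    return {
        # HTTP dates are always expressed in GMT
        "Expires": format_datetime(access_time + lifetime, usegmt=True),
        "Cache-Control": "max-age=%d" % int(lifetime.total_seconds()),
    }


if __name__ == "__main__":
    print(expiry_headers(datetime.now(timezone.utc)))
```

Four weeks is 2,419,200 seconds, which is the value the browser sees as max-age.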

So, every time the client returns in the following month to the site, the browser won’t download all the static content again, but load it locally from the cache, thus minimising the loading time. Of course, there are some issues that usually appear after updates. Particularly after updates of the cached files :)

For instance, if you add new Javascript functionality to the site or make some changes in the css or swf files and the user doesn’t hit CTRL+F5 to fully refresh the page and clear its cache, then he will see the old version of the site. Of course, one can take the short road to LamerVille and post a message on the site, asking the user to refresh the page. But that’s a little too lame to be taken into consideration, especially when dealing with respectable sites.

But there’s another way, much more elegant. First of all, place all the static, cache-able files in a separate folder. The browser will cache all the files based on their URL. If you want the browser to reload all the static data after every new release, you need to change the URLs on every new release.

Let’s say, the site resides at www.example.com and that all the static information will be served from www.example.com/static/. Now, a good idea is to make the links look like this:

http://www.example.com/static/(release-number)/css/style.css

http://www.example.com/static/(release-number)/js/cool-ajax-app.js

Where release-number is a number that increments with every new release. This way, the URLs will be different after each release thus forcing the browsers to fetch the new files. You don’t need to go through lots of files and increment the release number by hand, you can just use the following python script:

#!/usr/bin/env python3
"""
Read more about what this script actually does on

http://blog.motane.lu/2009/09/21/caching-problems-with-mod_expires/

Usage:
	python increment_release.py start_directory static_prefix

Author:
	Tudor Barbu http://blog.motane.lu
"""
import os
import re
import sys


def build_regex(cached_dir):
    # matches ="<cached_dir>/<optional release number>/<path>" in single or double quotes
    return re.compile(
        r'=([\'"])' + re.escape(cached_dir) + r'/(?:(\d+)/)?([^\'"]+)([\'"])'
    )


def main():
    if len(sys.argv) < 3:
        print('Read the comments in the source code')
        sys.exit(1)
    start_folder, cached_dir = sys.argv[1], sys.argv[2]
    regex = build_regex(cached_dir)

    # walk the whole tree and rewrite every file that references the static prefix
    for basedir, _subdirs, files in os.walk(start_folder):
        for name in files:
            perform_replace(os.path.join(basedir, name), regex, cached_dir)


def perform_replace(path, regex, cached_dir):
    # surrogateescape and newline='' round-trip unknown bytes and line endings unchanged
    with open(path, 'r', encoding='utf-8', errors='surrogateescape', newline='') as f:
        contents = f.read()
    if regex.search(contents):
        updated = regex.sub(lambda m: handle_match(m, cached_dir), contents)
        with open(path, 'w', encoding='utf-8', errors='surrogateescape', newline='') as f:
            f.write(updated)


def handle_match(match, cached_dir):
    # bump the existing release number, or start at 1 if the URL has none yet
    if match.group(2) is not None:
        revision_number = int(match.group(2)) + 1
    else:
        revision_number = 1
    return '=%s%s/%s/%s%s' % (match.group(1), cached_dir, revision_number,
                              match.group(3), match.group(4))


if __name__ == '__main__':
    main()

…and, of course, there’s no need to create lots of directories either. A simple .htaccess rewrite rule will do. Just rewrite all the URLs like /static/(number)/css/style.css to point to /static/css/style.css, by adding these 2 lines in the /static/.htaccess file. Because the rule lives in /static/.htaccess, the pattern is matched against the path relative to that directory, so it must not include the leading /static/ part:

RewriteEngine On
RewriteRule ^(\d+)/(.*)$ $2 [NC,L]

This should solve all your caching related problems. If you want to look savvy, you can use the version number of the head revision from subversion, or whatever versioning system you might be using, instead of a simple incremental number.
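To make the round trip concrete, here is a small Python sketch of both directions: building a release-stamped URL for the markup, and mapping it back to the real file the way the rewrite rule does. The names `STATIC_PREFIX`, `versioned_url()` and `strip_release()` are invented for this illustration and the prefix is assumed to be /static, as in the examples above.

```python
import re

STATIC_PREFIX = "/static"  # assumed prefix, as in the examples above


def versioned_url(path, release):
    """Build /static/<release>/<path> for a file that lives under /static/."""
    return "%s/%d/%s" % (STATIC_PREFIX, release, path.lstrip("/"))


def strip_release(url):
    """Drop the release segment, mimicking what the rewrite rule achieves."""
    pattern = r"^" + re.escape(STATIC_PREFIX) + r"/\d+/"
    return re.sub(pattern, STATIC_PREFIX + "/", url)


if __name__ == "__main__":
    url = versioned_url("css/style.css", 7)
    print(url, "->", strip_release(url))
```

Two different releases produce two different URLs for the browser's cache, yet both resolve to the same file on disk.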

Yes, I know that the script is a little bit buggy but it works for me. If you have an improved version, post a comment below. Credits will be given.

The “August” error

I’ve just come back from Wurbe, where I enjoyed a late evening snack (consisting of beer, Pepsi and pizza) and chatted with fellow developers on a large range of subjects, ranging from current trends in development to Klingon grammar.

Bogdan Lucaciu told us a funny story about a strange bug. It goes like this: in Javascript, when you parse a number from a string – let’s say a month’s index like 07 – with parseInt you must be very careful, because, for parseInt, the leading 0 is an indicator that the number is written in octal instead of decimal.

alert(parseInt('07')); // for July - will echo 7
alert(parseInt('08')); // for August - will echo 0 as there is no 08 in octal

So if your application mysteriously stops working on the first of August, this might be your problem. Although this “feature” is marked as deprecated, it’s still present in many browsers, so one can never be too careful: always pass the radix explicitly, as in parseInt('08', 10). Also keep in mind that by default, numbers starting with 0x are considered to be hexadecimal.

This is a post I’ve been trying to write for about 2 weeks now. As some of you might know, I’ve spent the previous weeks studying python and writing small scripts and I’ve decided to write a blog entry about it. As a matter of fact, I’ve also looked over the Pylons framework, but I’ll write about it in another post. So here it is, my opinion about python alone:
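To see why '08' ends up as 0, here is a small Python emulation of the old prefix rules. It is only an illustration: `legacy_parse_int()` is a made-up function, it ignores signs, and it uses None where Javascript would return NaN. The parser picks the base from the prefix, then consumes digits until it meets one that is not valid in that base.

```python
import string


def legacy_parse_int(text, radix=None):
    """Emulate the old parseInt prefix rules (illustration only, no sign handling)."""
    s = text.strip().lower()
    if radix is None:
        if s.startswith("0x"):
            radix, s = 16, s[2:]   # hexadecimal prefix
        elif s.startswith("0") and len(s) > 1:
            radix = 8              # a leading zero means octal
        else:
            radix = 10
    digits = (string.digits + string.ascii_lowercase)[:radix]
    value = None
    for ch in s:
        if ch not in digits:
            break                  # stop at the first digit invalid in this base
        value = (value or 0) * radix + digits.index(ch)
    return value                   # None stands in for NaN
```

With '08' the parser reads the 0, switches to octal, and stops at the 8, which is not an octal digit. Passing 10 as the radix skips the guessing altogether.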

What I like about python

Well, I loooooooooove the indentation. I really do. Python made it impossible for lamers to write ugly “one liners”. Everything must be indented and in its place or it won’t even compile (compiling aka no syntax errors as python is an interpreted language). After years of dealing with ugly sources with no braces, no indentation and so on, this feature is like a gift from heavens for me. I really hope it will catch on and be implemented in other languages.

I also like the for in iteration over…well…everything. This code:

for item in collection:
    do_stuff(item)

…works in most cases, even when collection is a file. In which case the loop iterates over the file’s lines. Tuples, dictionaries and lists are cool features.
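A quick sketch of what that looks like in practice; `collect()` is just a throwaway helper for this example, and `io.StringIO` stands in for a real file so the snippet is self-contained.

```python
import io


def collect(collection):
    """Gather whatever a plain for-in loop yields for the given object."""
    items = []
    for item in collection:
        items.append(item)
    return items


if __name__ == "__main__":
    print(collect([1, 2, 3]))                    # list: its elements
    print(collect(("a", "b")))                   # tuple: its elements
    print(collect({"x": 1}))                     # dictionary: its keys
    print(collect(io.StringIO("one\ntwo\n")))    # file-like object: its lines
```

Note that iterating over a dictionary yields its keys, and iterating over a file yields each line with its trailing newline still attached.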

What I don’t like about python

Of course, there are some things I dislike about this programming language. The first thing is that sometimes it is too verbose. Python doesn’t have a post/pre-increment operator. You can’t write i++ (that one is a syntax error) or ++i, although the latter compiles. Furthermore, it compiles and does nothing, taking the act of debugging to a whole new level of annoyance.

You always have to write i += 1. It also doesn’t have the C-style ternary operator. If you write a = (condition) ? b : c it will give you a syntax error; the Python spelling is a = b if condition else c.
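A short sketch of both points, using throwaway variable names: `++i` is parsed as two unary plus signs applied to `i`, so it evaluates to the same value and changes nothing, and Python's conditional expression covers the ground of the `? :` operator.

```python
i = 5

same = ++i   # parsed as +(+i): two unary pluses, i is left untouched
i += 1       # the actual increment

# conditional expression: <value if true> if <condition> else <value if false>
label = "big" if i > 5 else "small"

print(same, i, label)
```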

Another weak point is its OOP capabilities. Object-oriented programming is very strangely implemented in python. A class example in python looks something like this:

class MyClass:
    def __init__(self):
        self.attribute = 'default value'
    def custom_method(self, attribute):
        self.attribute = attribute
    def print_data(self):
        print(self.attribute)

obj = MyClass()
obj.custom_method('wassabi')
obj.print_data()

As you can see, there are no access modifiers (private, protected, public), no instantiation operator (new), the this keyword is replaced by self, and you must write it every single time you define a new method in the class. And python also allows multiple inheritance, which does one thing: annoys people.

Conclusion

Apart from some really annoying “features”, I’m starting to like python. It provides a quick and pretty clean way to get things done. And in the end, this is all that matters…Python is cool!

This post is about a quite common problem that I’ve encountered over and over again without ever looking for a proper answer. The problem is what happens when I request via AJAX a page that makes an HTTP redirect to another one, and how do I fix it to behave “normally”. It’s quite a common problem with pages that require authentication.

Let’s say we have the following PHP script:

if( !is_user_allowed() ) {
    header( 'Location: ' . PATH_TO_LOGIN_PAGE );
    die();
}

Pretty simple and self explanatory. If the user isn’t logged in, he gets sent to the login form. But what happens if the request is made via AJAX? A simple AJAX request made using the prototype.js library looks like this:

new Ajax.Request(
    'server_side.php',
    {
        onSuccess: function( t ) {
            $( 'container' ).update( t.responseText );
        }
    }
);

What happens if the user isn’t logged in? The browser makes two requests to the server, the second one to the page containing the login form.

And the whole login page gets loaded into a small container on the current page, ruining the design and confusing the user.

Although I’ve encountered this problem on several occasions, I didn’t give it much thought. I’ve used that very popular design pattern “I know it’s lame, but hey, it works…moving on”. The implementation in that case was like this: add a token to the login page’s markup, something like <!--login-->, and check to see if this string is in the response text. Like this:

new Ajax.Request(
    'server_side.php',
    {
        onSuccess: function( t ) {
            if( t.responseText.indexOf( '<!--login-->') != -1 ) {
                document.location = 'login.php';
            }
            else {
                $( 'container' ).update( t.responseText );
            }
        }
    }
);

Plain and simple. And lame. I admit it. But hey! It works :) Still, I’ve found a better way of doing things, that relies on HTTP headers. First of all, in the login page, instead of a lame string message, I’ve added some custom headers to the response.

header( 'HTTP/1.0 401 Authorization Required' );
header( 'Login-path: ' . PATH_TO_LOGIN_PAGE );

And then, I’ve altered the javascript a little. Since a response served with a 401 header won’t trigger the onSuccess callback function, this script uses onComplete.

new Ajax.Request(
    'server_side.php',
    {
        onComplete: function( t ) {
            switch( t.status ) {
                case 200:
                    $( 'container' ).update( t.responseText );
                    break; 
                case 401:
                    document.location = t.getHeader( 'Login-path' );
                    break;
            }
        }
    }
);

It works. And it’s not so lame. Mission accomplished…
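Stripped of PHP and prototype.js, the protocol boils down to two small decisions, sketched here in Python. Both function names and the `/login.php` path are made up for the illustration; only the 401 status and the Login-path header come from the code above.

```python
LOGIN_PATH = "/login.php"  # hypothetical path, stands in for PATH_TO_LOGIN_PAGE


def login_page_response(login_path=LOGIN_PATH):
    """What the login page sends: a 401 status plus a custom Login-path header."""
    return 401, {"Login-path": login_path}


def handle_ajax_response(status, headers, body=""):
    """Mirror of the onComplete switch: decide what the client should do."""
    if status == 200:
        return ("update", body)                       # inject the markup into the container
    if status == 401:
        return ("redirect", headers.get("Login-path"))  # send the whole page to the login form
    return ("ignore", None)                           # any other status is left alone


if __name__ == "__main__":
    status, headers = login_page_response()
    print(handle_ajax_response(status, headers))
```

The point of the design is that the client no longer has to sniff the response body; the status code alone tells it that the user must log in, and the header tells it where.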