Tudor Barbu's blog

Ramblings about software development

Uploadify is an awesome script and it works like a charm. But – there’s always a but – sometimes it throws a mysterious 302 error. This happened to me all day long and it drove me crazy. Well, not really, I was already crazy :) So, what to do when the HTTP 302 error pops up? A quick look over the HTTP statuses should point me in the right direction:

The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.

The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).

If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

That’s quoted from the HTTP/1.1 specification (RFC 2616). In simple English, it means a redirect. So what happens!? For security reasons, I turned on the cookie-httponly setting, so the client-side script was unable to access the cookies and pass the session id back to the server-side script, which in turn would see this connection as coming from a non-authenticated user and issue a redirect to the login page. Thus the mysterious 302 status.

The problem can be solved really easily, by turning the cookie-httponly setting off for the entire application. If that’s not desirable, there’s a more complicated solution. First, Uploadify must send the session id to the server together with the file:

$('#fileUpload').uploadify({
    'uploader'   : '/uploadify/uploadify.swf',
    'script'     : '/images/upload/',
    'cancelImg'  : '/uploadify/cancel.png',
    'auto'       : true,
    'fileExt'    : '*.jpg;*.gif;*.png',
    'fileDesc'   : 'Image Files',
    'sizeLimit'  : 2097152,
    'scriptData' : {'sid' : '<?=Zend_Session::getId();?>'},
    onComplete   : function(event, id, fileObj, response, data) {
        // bla bla bla
    }
});

…then, turn off the auto-start in the application.ini file:

phpSettings.session.strict	= "On"

…and in the Bootstrap.php file:

protected function _initSession()
{
	if (isset($_POST['sid'])) {
		Zend_Session::setId($_POST['sid']);
	}

	Zend_Session::start();
}
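
To spot this failure mode faster next time, the upload’s error callback can translate the status into a readable hint. This is only a minimal sketch: explainUploadError is a hypothetical helper, and the shape of the error object (a type field and an info field holding the HTTP status) is an assumption that should be checked against the Uploadify version in use.

```javascript
// Hypothetical helper, not part of Uploadify.
// Assumption: errorObj looks like { type: 'HTTP', info: 302 }.
function explainUploadError(errorObj) {
    // A 302 during upload usually means the server did not recognise the session
    // and redirected the request to the login page.
    if (errorObj && errorObj.type === 'HTTP' && Number(errorObj.info) === 302) {
        return 'Upload was redirected (302): the session id probably did not reach the server.';
    }
    // Anything else is reported as-is.
    return 'Upload failed: ' + (errorObj ? errorObj.type + ' ' + errorObj.info : 'unknown error');
}

console.log(explainUploadError({ type: 'HTTP', info: 302 }));
```

The helper could be called from whatever error callback the uploader offers, so the 302 no longer looks mysterious.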

Of course, there are some security issues with both approaches, but nothing serious. Took me about 2 hours to figure it out :(

02 Jun

Javascript based Flash Player

Posted by Tudor.

I stumbled upon this Slashdot article about Smokescreen, a Flash player written entirely in Javascript, which can…are you ready for this…run on the iPhone/iPad/iPod Touch. In your face, Mr. Jobs!!!

Take a look at the demo, it’s very impressive.

I’m quite sure that by the end of the week Adobe will buy RevShock, the company that developed this player.

17 Sep

The “August” error

Posted by Tudor.

I’ve just come back from Wurbe, where I’ve enjoyed a late evening snack (consisting of beer, Pepsi and pizza) and chatted with fellow developers on a large range of subjects, ranging from current trends in development to Klingon grammar.

Bogdan Lucaciu told us a funny story about a strange bug. It goes like this: in Javascript, when you parse a number from a string – let’s say a month’s index like 07 – with parseInt you must be very careful, because, for parseInt, the leading 0 is an indicator that the number is written in octal instead of decimal.

alert(parseInt('07')); // for July - will echo 7
alert(parseInt('08')); // for August - will echo 0 where a leading 0 means octal, as 8 is not an octal digit

So if your application mysteriously stops working on the first of August, this might be your problem. Although this “feature” was deprecated, and ECMAScript 5 removed it from parseInt altogether, older browsers still have it, so one can never be too careful: always pass the radix explicitly, as in parseInt('08', 10). Also keep in mind that by default, strings starting with 0x are parsed as hexadecimal.
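
A minimal sketch of the safe way to parse a zero-padded month, assuming the value arrives as a string such as '08'. The parseMonth name is made up for this example.

```javascript
// Hypothetical helper: parse a zero-padded month string such as '08'.
function parseMonth(text) {
    // The explicit radix 10 forces decimal, so a leading zero is never read as octal.
    return parseInt(text, 10);
}

console.log(parseMonth('07')); // 7
console.log(parseMonth('08')); // 8
console.log(parseMonth('12')); // 12
```

With the radix spelled out, the result no longer depends on which browser runs the code.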

I have found a cool Javascript library for manipulating dates called datejs. It’s really simple to use and offers a lot of syntactic sugar (examples taken from their website)

// Add 3 days to Today
Date.today().add(3).days();

// Is today Friday?
Date.today().is().friday();

// Number fun
(3).days().ago();

// 6 months from now
var n = 6;
n.months().fromNow();

It saves a lot of headaches ;)

Last week I got a new assignment at my job: a crawler that was supposed to periodically visit some sites and download their content. Sounds simple, doesn’t it? Well, it’s not. Mainly because we also want to get all the Flash content, and some of it is inserted with Javascript, via various libraries like SWFObject or directly with document.write in some cases. I needed a snapshot of what the page actually looks like when the user is looking at it in a browser.

This meant that I had to get the content *after* all the Javascript contained in the page finished executing. In developer language, this means after the window.onload event takes place. And, of course, I also needed a Javascript interpreter. So any attempt to use wget/cURL/file_get_contents was destined to fail from the start. I needed browser power :) So I googled around for some.

The first thing I came across was using COM to connect to an Internet Explorer instance from Python, use it to navigate back and forth and get the HTML content as it’s interpreted by IE’s engine. This had 3 major drawbacks:

  • it requires Internet Explorer
  • it requires Microsoft Windows
  • it requires an opened IE window

Since we want to migrate everything from our Windows servers to Linux, it would be pointless to go with this approach, since I’d have to rewrite it in a month or so. Leaving aside the “lameness” of the technologies involved :) I’m also looking for a solution that doesn’t require an open browser window, mainly because it should work on servers without X, and because I don’t want one :P (GTK doesn’t work without X – credits go to Alex Novac – and yes, it was silly of me to think otherwise).

This solution wasn’t good enough, so I kept looking and came across the HtmlUnit Java library. This library is used to write tests in Java for web based applications. Pretty cool. And not so much. Although Java was once my one true love, after all these years spent with scripting languages, declaring variables, compiling the code, writing only OOP code and so on seemed a little…unfamiliar. But it takes more than anApiWithReallyLongCamelCasedClassNames to stop me, so I installed Eclipse and ran some tests. Disappointing! The library isn’t very tolerant of messy HTML and Javascript, and, since nobody out there, in the real world, actually abides by W3C recommendations, this library is somewhat useless in my case.

The next thing I tried was a Python-based solution that relied on integration with Gecko via hulahop. I must admit that I couldn’t get it to work under Ubuntu Jaunty Jackalope, due to incompatibilities in the system’s libraries. I’m sure that with enough time and patience, it can be persuaded to work. But, as I didn’t have any, I moved on and tried pywebkitgtk. This proved to be quite okay (I’m not a Safari fan) and it worked out of the box.

After spending several days searching the web, reading articles and trying out different pieces of software, I decided to share my findings with the world and write a tutorial on how to get the content of a page in Python *after* its Javascript has finished executing. Here it goes:

First of all, install pywebkitgtk. Under Ubuntu, you can do it directly from the repository:

sudo apt-get install python-webkitgtk libwebkit-1.0-1 libwebkit-dev

…it will attempt to install a lot of other stuff, linked libraries and so on. Just say yes :P
After the installation is complete, it’s generally a good idea to test it! The following code should display a window with Google’s first page in it:

#!/usr/bin/env python

import gtk
import webkit

window = gtk.Window()
view = webkit.WebView()
view.open('http://www.google.com')
window.add(view)
window.show_all()
window.connect('delete-event', lambda window, event: gtk.main_quit())

gtk.main()

…if it doesn’t, maybe you did something wrong. See if all the packages are in their place. For the conversation’s sake, let’s assume it worked and move on. As I said in the first paragraph, I want to load a webpage, wait for it to execute all the JS in it and take the generated HTML source. A strange problem with pywebkitgtk is that neither the WebView object nor the encapsulated WebFrame object has a “get_html()” method or something similar. Really, there is no clean way to get the site’s content. But, fortunately, on pywebkitgtk’s wiki I’ve found this hack that does just that:

class WebView(webkit.WebView):
    def get_html(self):
        self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
        html = self.get_main_frame().get_title()
        self.execute_script('document.title=oldtitle;')
        return html

It executes a javascript that takes the content of the whole document and stores it in the title. And since there is a get_title() method that returns the title’s content, this workaround gets the job done. Kind of lame, but it suffices.

As previously stated, in my application I didn’t want to have a browser window open, and with GTK it is possible to run your app without calling window.show() or window.show_all(). Long story short, this is how I did it:

#!/usr/bin/env python
import sys, thread # kudos to Nicholas Herriot (see comments)
import gtk
import webkit
import warnings
from time import sleep
from optparse import OptionParser

warnings.filterwarnings('ignore')

class WebView(webkit.WebView):
	def get_html(self):
		self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
		html = self.get_main_frame().get_title()
		self.execute_script('document.title=oldtitle;')
		return html

class Crawler(gtk.Window):
	def __init__(self, url, file):
		gtk.gdk.threads_init() # suggested by Nicholas Herriot for Ubuntu Koala
		gtk.Window.__init__(self)
		self._url = url
		self._file = file

	def crawl(self):
		view = WebView()
		view.open(self._url)
		view.connect('load-finished', self._finished_loading)
		self.add(view)
		gtk.main()

	def _finished_loading(self, view, frame):
		with open(self._file, 'w') as f:
			f.write(view.get_html())
		gtk.main_quit()

def main():
	options = get_cmd_options()
	crawler = Crawler(options.url, options.file)
	crawler.crawl()

def get_cmd_options():
	"""
		gets and validates the input from the command line
	"""
	usage = "usage: %prog [options] args"
	parser = OptionParser(usage)
	parser.add_option('-u', '--url', dest = 'url', help = 'URL to fetch data from')
	parser.add_option('-f', '--file', dest = 'file', help = 'Local file path to save data to')

	(options,args) = parser.parse_args()

	if not options.url:
		print 'You must specify a URL.',sys.argv[0],'--help for more details'
		sys.exit(1)
	if not options.file:
		print 'You must specify a destination file.',sys.argv[0],'--help for more details'
		sys.exit(1)

	return options

if __name__ == '__main__':
	main()

Download it, try it out. It worked wonders for me and I hope it will prove useful to other people too…

15 Apr

Javascript debate – generation clash

Posted by Tudor.

My previous post spawned a debate over ymessenger with a friend of mine, called Raul. He’s older, but not more savvy in the client side programming it seems. The debate was the following: what will the following code produce:

    f = function() { return function() { alert( 'message 1' ); } }
    window.onload = f;
    alert( 'message 2' );

…versus…

    f = function() { return function() { alert( 'message 1' ); } }
    window.onload = f();
    alert( 'message 2' );

In the first example, the code will only display “message 2” and in the second one it will show “message 2” followed by “message 1”. As a simple test will prove, I was right…again :P
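
The difference can be reproduced outside a browser with a small simulation. This is only a sketch: fakeWindow stands in for the real window object, messages are collected in an array instead of being shown with alert, and the simulateLoad helper is invented for the example.

```javascript
// Builds f, which returns a function that records 'message 1' when called.
function makeHandlerFactory(log) {
    return function f() {
        return function () { log.push('message 1'); };
    };
}

// Hypothetical helper: runs one variant and returns the messages in order.
function simulateLoad(assignHandler) {
    var log = [];
    var fakeWindow = {};               // stand-in for the browser's window
    var f = makeHandlerFactory(log);
    assignHandler(fakeWindow, f);      // either onload = f or onload = f()
    log.push('message 2');
    // Simulate the load event: the browser calls whatever function is in onload.
    if (typeof fakeWindow.onload === 'function') {
        fakeWindow.onload();
    }
    return log;
}

var first = simulateLoad(function (w, f) { w.onload = f; });
var second = simulateLoad(function (w, f) { w.onload = f(); });
console.log(first);  // [ 'message 2' ]
console.log(second); // [ 'message 2', 'message 1' ]
```

In the first variant the load event calls f, which merely returns the inner function without running it. In the second, f() has already been evaluated, so the inner function itself is the handler.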

14 Apr

Javascript question

Posted by Tudor.

I’ve found the best way to probe somebody’s Javascript knowledge. Ask him a simple question! One question: what will the following code produce and, most importantly, why?

window.onload = alert( 'message 1' );
alert( 'message 2' );

23 Mar

Changing brs to newlines and vice-versa

Posted by Tudor.

A few days ago, I was searching for a Javascript function that could translate all newline characters (carriage returns, line feeds and combinations) to <br />s. Something like the nl2br PHP function. Not finding a suitable one, I’ve decided to write one myself and share it with the world.

String.prototype.nl2br = function() {
	var br;
	if( typeof arguments[0] != 'undefined' ) {
		br = arguments[0];
	}
	else {
		br = '<br />';
	}
	return this.replace( /\r\n|\r|\n/g, br );
}

String.prototype.br2nl = function() {
	var nl;
	if( typeof arguments[0] != 'undefined' ) {
		nl = arguments[0];
	}
	else {
		nl = '\r\n';
	}
	return this.replace( /\<br(\s*\/|)\>/g, nl );
}

Some quick examples:

var myString = document.getElementById( 'textarea' ).value; // the text, not the element
var brString = myString.nl2br();
// using HTML? specify the tag you want to use as a line ender
var brString = myString.nl2br( '<br>' );

The br2nl method converts all <br />s, <br>s and other variations to newlines. Here you can also specify the desired line delimiter (\r, \r\n or \n – defaults to \r\n, Windows style).
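
For those who would rather not extend String.prototype, the same idea can be sketched as two standalone functions. This is a variation, not the code above: the function signatures are my own, and the i flag on the second regular expression (so that upper-case <BR> also matches) is an addition.

```javascript
// Replace every newline variant (\r\n, \r, \n) with a break tag.
function nl2br(text, br) {
    if (typeof br === 'undefined') {
        br = '<br />';
    }
    return text.replace(/\r\n|\r|\n/g, br);
}

// Replace <br>, <br/>, <br /> (any case) with a newline sequence.
function br2nl(text, nl) {
    if (typeof nl === 'undefined') {
        nl = '\r\n'; // Windows style, same default as above
    }
    return text.replace(/<br\s*\/?>/gi, nl);
}

console.log(nl2br('one\r\ntwo\nthree'));               // one<br />two<br />three
console.log(br2nl('one<br>two<br />three', '\n'));
```

Passing the delimiter as the second argument works the same way as the optional argument of the prototype methods.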

As previously stated, I intend to get my bachelor’s degree this year, and I have to choose from a variety of projects. I have 2 options: either choose a PHP project that will be ready in no time and get rid of it fast, or choose a project with a technology I don’t know yet and learn that technology in the process, so that I can honestly say that I’ve learned something useful while in school. I decided to give AIR a try for one possible project: an RSS aggregator and an RSS reader (desktop app). If I get along with AIR, this will be painless. I’ll develop the browser based application and use AIR to build the desktop one on top of it.

Installing AIR on my Ubuntu box was a piece of cake; it worked right out of the box. Unfortunately I can’t say the same about Aptana, which kept popping up errors, and I had a hard time installing the AIR plugin. But I get this type of problem a lot when dealing with Eclipse based software and I got used to it in time…

So I’ve stepped on the gas and tried out some code examples. Copy/paste, monkey style, see how well it does. Well, what can I say…I’m impressed. If this was a gaming blog, I would have to say “AIR pwnz”. Really! AIR is a much more developer-friendly platform than I initially thought. So today I’m starting a 1 week AIR learning quest, at the end of which I’ll decide if I’ll go forward with AIR for my project. As I’m a seasoned web developer, I want to see how easy it is to port a web application from a browser based app to an AIR one. Yes, I know AIR does that by default. And not quite. For instance, in order to send something to the server, in Javascript you use Ajax, like so:

new Ajax.Request( url, options);

…while in AIR you have to write…

var request = new air.URLRequest( url );
var loader = new air.URLLoader();
loader.load( request );

If I were to use a design pattern like MVC, it would be a lot easier to port a browser application to AIR. I’d have to rewrite the model and probably the view. No meddling with the controller; the application’s logic would remain unchanged (parse some feeds and display them to the user). This would be a big plus. So first of all I wanted to try AIR with Javascript MVC and do some sniffing in the controller to see if the application is running inside the browser or on AIR, then load a different model and a different view for each case. But after giving it some thought I’ve realised that I’d also have to read Javascript MVC’s documentation, try it out and get used to it. And I just don’t have enough time for that, because those South Park episodes aren’t going to watch themselves. Or do I? I’ll post some code examples as things go forward. This is a plea to my readers: if any of you is familiar with AIR, please leave a comment. I would appreciate having someone to share my problems with ;)
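
The sniffing step could look roughly like the sketch below. It is an untested idea rather than working AIR code: pickTransport is a hypothetical helper, and it assumes that the air object used in the snippet above only exists when the application runs inside AIR.

```javascript
// Hypothetical helper: decide which transport the model layer should use.
// globalObj is the global object (window in both the browser and AIR).
function pickTransport(globalObj) {
    // Assumption: the `air` alias object is only defined inside the AIR runtime.
    if (globalObj && typeof globalObj.air !== 'undefined') {
        return 'air';   // use air.URLRequest / air.URLLoader
    }
    return 'ajax';      // use Ajax.Request in a regular browser
}

console.log(pickTransport({}));          // ajax
console.log(pickTransport({ air: {} })); // air
```

The controller would then ask for the transport once and load the matching model and view, without caring about the runtime anywhere else.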

Motto: I’m not fat, I’m just big boned. Back to South Park…

This post is about a quite common problem that I’ve encountered over and over again but never looked into properly. The problem is what happens when I request via AJAX a page that makes an HTTP redirect to another page, and how to fix it so that it behaves “normally”. It’s a quite common problem with pages that require authentication.

Let’s say we have the following PHP script:

if( !is_user_allowed() ) {
    header( 'Location: ' . PATH_TO_LOGIN_PAGE );
    die();
}

Pretty simple and self explanatory. If the user isn’t logged in, he gets sent to the login form. But what happens if the request is made via AJAX? A simple AJAX request made using the prototype.js library looks like this:

new Ajax.Request(
    'server_side.php',
    {
        onSuccess: function( t ) {
            $( 'container' ).update( t.responseText );
        }
    }
);

What happens if the user isn’t logged in? The browser makes two requests to the server, the second one to the page containing the login form.

And the whole login page gets loaded into a small container on current page, ruining the design and confusing the user.

Although I’ve encountered this problem on several occasions, I didn’t give it much thought. I’ve used that very popular design pattern “I know it’s lame, but hey, it works…moving on”. The implementation in this case was: add a token to the login page’s markup, something like <!--login-->, and check to see if this string is in the response text. Like this:

new Ajax.Request(
    'server_side.php',
    {
        onSuccess: function( t ) {
            if( t.responseText.indexOf( '<!--login-->') != -1 ) {
                document.location = 'login.php';
            }
            else {
                $( 'container' ).update( t.responseText );
            }
        }
    }
);

Plain and simple. And lame. I admit it. But hey! It works :) Still, I’ve found a better way of doing things, that relies on HTTP headers. First of all, in the login page, instead of a lame string message, I’ve added some custom headers to the response.

header( 'HTTP/1.0 401 Authorization Required' );
header( 'Login-path: ' . PATH_TO_LOGIN_PAGE );

And then, I’ve altered the Javascript a little. Since a response served with a 401 header won’t trigger the onSuccess callback function, this script uses onComplete.

new Ajax.Request(
    'server_side.php',
    {
        onComplete: function( t ) {
            switch( t.status ) {
                case 200:
                    $( 'container' ).update( t.responseText );
                    break;
                case 401:
                    document.location = t.getHeader( 'Login-path' );
                    break;
            }
        }
    }
)
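
The switch above can also be pulled out into a small function that only decides what to do, which makes it testable without a browser. This is a sketch under my own assumptions: decideAction and the shape of the object it returns are invented for the example, and the default branch for other statuses is an addition.

```javascript
// Hypothetical helper: map an HTTP status to the action the page should take.
function decideAction(status, loginPath) {
    switch (status) {
        case 200:
            return { type: 'update' };                    // put the response in the container
        case 401:
            return { type: 'redirect', url: loginPath };  // send the user to the login form
        default:
            return { type: 'error', status: status };     // anything else: report it
    }
}

console.log(decideAction(200, 'login.php').type); // update
console.log(decideAction(401, 'login.php').url);  // login.php
```

Inside onComplete, the callback would call decideAction( t.status, t.getHeader( 'Login-path' ) ) and act on the result.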

It works. And it’s not so lame. Mission accomplished…