Changing the time zone in Ubuntu

Posted on Monday, January 4th, 2010

This one is from the “d’oh” category. I recently moved to Barcelona and, since Bucharest and Barcelona are in different time zones, I wanted to set my system’s clock an hour back so that it would display the correct time.

I went the easy way: right-clicked the clock in the upper right, selected “Adjust date and time” and made the change. After a few minutes the clock was showing Bucharest’s time again and my changes had been overwritten. I thought I must have imagined setting the clock correctly in the first place – since I had “tasted” Moritz thoroughly – and did it again.

I ignored the clock for a while and, later on, when I looked at it, it was displaying Bucharest’s time again :(

D’oh! Ubuntu queries its time servers from time to time and sets the clock from what they return. The servers only provide the current time in UTC; the system converts it to local time using the configured time zone. Since mine was still set to Europe/Bucharest, every update put the clock back on Bucharest time. I changed the time zone accordingly and everything started working. Magic!

To change the time zone on your Ubuntu system, go to System > Administration > Time and Date. Or, if you’re a console freak, just punch in:

sudo dpkg-reconfigure tzdata

in order to set the correct time zone.
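To see why the zone setting matters more than the clock itself, here is a minimal sketch of one instant shown in both zones. It assumes Python 3.9 or newer (for the standard zoneinfo module) and a system tz database; the wall_clock helper is just an illustrative name, and Europe/Madrid is the zone that covers Barcelona.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def wall_clock(utc_instant, zone_name):
    """Return the HH:MM wall-clock time that zone_name shows for a UTC instant."""
    return utc_instant.astimezone(ZoneInfo(zone_name)).strftime("%H:%M")


# One instant, expressed in UTC the way a time server would report it.
instant = datetime(2010, 1, 4, 12, 0, tzinfo=timezone.utc)

bucharest = wall_clock(instant, "Europe/Bucharest")
barcelona = wall_clock(instant, "Europe/Madrid")  # zone that covers Barcelona

print("Bucharest:", bucharest)
print("Barcelona:", barcelona)
```

The instant is the same in both cases; only the zone used for display differs, which is why setting the clock by hand gets undone at the next sync.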

Advertising blog entries on Pidgin’s status

Posted on Friday, August 21st, 2009

I’ve wanted to write this post ever since I read Radu’s “Fortune and Pidgin’s status on Ubuntu” post. Radu’s approach is quite lame and hard to use, because it relies on the user exporting an SQL dump file every time he posts something on the blog.

And really, do you need *all* the posts in the database? Do you really want to advertise blog entries from 2 years ago? Come on!

My idea is much better…obviously :P …and it goes like this: parse the blog’s RSS feed and take “inspiration” from there for the statuses. The weapon of choice for this task is Python with its feedparser library.

First, some prerequisites:

sudo apt-get install python-feedparser

…and then, the python script:

#!/usr/bin/python
import random
import subprocess

import feedparser

# feed url
FEED = 'http://feeds2.feedburner.com/motanelu'

feed = feedparser.parse(FEED)
# pick one entry uniformly at random (the last entry included)
entry = random.choice(feed['items'])
message = '%s %s' % (entry.title, entry.link)
# pass the command as an argument list so quotes in a title never reach a shell
subprocess.call(['purple-remote', 'setstatus?status=available&message=' + message])

# EOF

I consider this approach better than Radu’s, because it doesn’t require exporting the database or messing around with fortune. Read this post to see how to update Pidgin’s status using cron. And enjoy :P
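If you also want to avoid advertising posts from 2 years ago, the feed entries can be filtered by age before one is picked. This is only a sketch: it assumes each entry exposes a published_parsed field (a UTC time tuple, which is what recent feedparser versions provide), and the recent_entries helper, the sample titles and the example.com links are made up for illustration.

```python
import calendar
import random
import time


def recent_entries(entries, max_age_days, now=None):
    """Keep only entries published within the last max_age_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    fresh = []
    for entry in entries:
        published = entry.get("published_parsed")  # UTC struct_time, or None
        if published is not None and calendar.timegm(published) >= cutoff:
            fresh.append(entry)
    return fresh


# Stand-in data shaped like feedparser entries (hypothetical titles and links).
now = calendar.timegm((2009, 8, 21, 12, 0, 0))
sample = [
    {"title": "New post", "link": "http://example.com/new",
     "published_parsed": time.gmtime(now - 3 * 86400)},
    {"title": "Old post", "link": "http://example.com/old",
     "published_parsed": time.gmtime(now - 800 * 86400)},
]

fresh = recent_entries(sample, max_age_days=365, now=now)
entry = random.choice(fresh)
status = "%s %s" % (entry["title"], entry["link"])
print(status)
```

With a real feed, the list passed in would be feed['items'] and the now argument could simply be left out.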

Pidgin, Ubuntu and Yahoo! Messenger

Posted on Wednesday, June 24th, 2009

Starting yesterday, my Pidgin stopped connecting to Yahoo!’s messenger service (MSN kept working). Being lazy, I just changed the scs.msg.yahoo.com host to 66.163.181.166 in the user profile panel – the first solution I came across on Google – and it started working again. For another day or so.

But today it stopped working again and I couldn’t get it back. So I used Meebo to go online and ask others whether their YM! clients worked, whether they had also run into this problem and, most importantly, what they did to fix it. And – thanks to Radu – I found the root of the problem: Yahoo! changed the specifications of its IM protocol. Fortunately, the Pidgin guys don’t waste time, and the problem can be solved by upgrading to Pidgin 2.5.7.

Details here and here.

Pywebkitgtk – execute Javascript from python

Posted on Thursday, June 18th, 2009

Last week I got a new assignment at my job: a crawler that was supposed to periodically visit some sites and download their content. Sounds simple, doesn’t it? Well, it’s not. Mainly because we also want to get all the Flash content, and some of it is inserted with Javascript, via libraries like SWFObject or, in some cases, directly with document.write. I needed a snapshot of what the page actually looks like when the user is viewing it in a browser.

This meant that I had to get the content *after* all the Javascript in the page had finished executing. In developer language, this means after the window.onload event fires. And, of course, I also needed a Javascript interpreter. So any attempt to use wget/cURL/file_get_contents was destined to fail from the start. I needed browser power :) So I googled around for some.

The first thing I came across was using COM to connect to an Internet Explorer instance from python, use it to navigate back and forth and get the HTML content as it’s interpreted by IE’s engine. This had 3 major drawbacks:

  • it requires Internet Explorer
  • it requires Microsoft Windows
  • it requires an opened IE window

Since we want to migrate everything from our Windows servers to Linux, it would be pointless to go with this approach, since I’d have to rewrite it in a month or so. Let alone the “lameness” of the technologies involved :) And I’m looking for a solution that doesn’t require an open browser window. Not because it should work on servers without X (GTK doesn’t work without X – credits go to Alex Novac – and yes, it was foolish of me to think otherwise), but because I don’t want one :P

This solution wasn’t good enough, so I kept looking and came across the HtmlUnit Java library. This library is used to write tests in Java for web-based applications. Pretty cool. And not so much. Although Java was once my one true love, after all these years spent with scripting languages, declaring variables, compiling the code, writing only OOP code and so on seemed a little…unfamiliar. But it takes more than anApiWithReallyLongCamelCasedClassNames to stop me, so I installed Eclipse and ran some tests. Disappointing! The library isn’t very tolerant of messy HTML and Javascript and, since nobody out there in the real world actually abides by the W3C recommendations, it is somewhat useless in my case.

The next thing I tried was a Python-based solution that relied on integration with Gecko via hulahop. I must admit that I couldn’t get it to work under Ubuntu Jaunty Jackalope, due to incompatibilities in the system’s libraries. I’m sure that with enough time and patience it can be persuaded to work. But, as I didn’t have any, I moved on and tried pywebkitgtk. This proved to be quite okay (I’m not a Safari fan) and it worked out of the box.

After spending several days searching the web, reading articles and trying out different pieces of software, I decided to share my findings with the world and write a tutorial on how to get the content of a page in Python *after* its Javascript has finished executing. Here it goes:

First of all, install pywebkitgtk. Under Ubuntu, you can do it directly from the repository:

sudo apt-get install python-webkitgtk libwebkit-1.0-1 libwebkit-dev

…it will attempt to install a lot of other stuff, linked libraries and so on. Just say yes :P
After the installation is complete, it’s generally a good idea to test it! The following code should display a window with Google’s first page in it:

#!/usr/bin/env python
 
import gtk
import webkit
 
window = gtk.Window()
view = webkit.WebView()
view.open('http://www.google.com')
window.add(view)
window.show_all()
window.connect('delete-event', lambda window, event: gtk.main_quit())
 
gtk.main()

…if it doesn’t, maybe you did something wrong. Check that all the packages are in place. For the sake of conversation, let’s assume it worked and move on. As I said at the beginning, I want to load a web page, wait for it to execute all the JS in it and take the generated HTML source. A strange problem with pywebkitgtk is that neither the WebView object nor the encapsulated WebFrame object has a “get_html()” method or anything similar. Really, there is no clean way to get the site’s content. But, fortunately, on pywebkitgtk’s wiki I found this hack that does just that:

class WebView(webkit.WebView):
    def get_html(self):
        self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
        html = self.get_main_frame().get_title()
        self.execute_script('document.title=oldtitle;')
        return html

It executes a javascript that takes the content of the whole document and stores it in the title. And since there is a get_title() method that returns the title’s content, this workaround gets the job done. Kind of lame, but it suffices.

As previously stated, in my application I didn’t want to have a browser window open, and with GTK it is possible to run your app without calling window.show() or window.show_all(). Long story short, this is how I did it:

#!/usr/bin/env python
import sys
import gtk
import webkit
import warnings
from time import sleep
from optparse import OptionParser
 
warnings.filterwarnings('ignore')
 
class WebView(webkit.WebView):
	def get_html(self):
		self.execute_script('oldtitle=document.title;document.title=document.documentElement.innerHTML;')
		html = self.get_main_frame().get_title()
		self.execute_script('document.title=oldtitle;')
		return html
 
class Crawler(gtk.Window):
	def __init__(self, url, file):
		gtk.gdk.threads_init() # suggested by Nicholas Herriot for Ubuntu Koala
		gtk.Window.__init__(self)
		self._url = url
		self._file = file
 
	def crawl(self):
		view = WebView()
		view.open(self._url)
		view.connect('load-finished', self._finished_loading)
		self.add(view)
		gtk.main()
 
	def _finished_loading(self, view, frame):
		with open(self._file, 'w') as f:
			f.write(view.get_html())
		gtk.main_quit()
 
def main():
	options = get_cmd_options()
	crawler = Crawler(options.url, options.file)
	crawler.crawl()
 
def get_cmd_options():
	"""
		gets and validates the input from the command line
	"""
	usage = "usage: %prog [options] args"
	parser = OptionParser(usage)
	parser.add_option('-u', '--url', dest = 'url', help = 'URL to fetch data from')
	parser.add_option('-f', '--file', dest = 'file', help = 'Local file path to save data to')
 
	(options,args) = parser.parse_args()
 
	if not options.url:
		parser.error('You must specify a URL.')
	if not options.file:
		parser.error('You must specify a destination file.')
 
	return options
 
if __name__ == '__main__':
	main()

Download it, try it out. It worked wonders for me and I hope it will prove useful to other people too…
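The command-line handling can be tried on its own, without starting GTK, by handing optparse an explicit argument list instead of letting it read sys.argv. A small sketch under a few assumptions: it is written for Python 3 (note the print function), the build_parser and parse names are only for illustration, and the URL and file name are placeholders.

```python
from optparse import OptionParser


def build_parser():
    """Create a parser with the same two options the crawler uses."""
    parser = OptionParser("usage: %prog [options] args")
    parser.add_option('-u', '--url', dest='url', help='URL to fetch data from')
    parser.add_option('-f', '--file', dest='file', help='Local file path to save data to')
    return parser


def parse(argv):
    """Parse an explicit argument list; exit with a usage error if an option is missing."""
    parser = build_parser()
    options, _args = parser.parse_args(argv)
    if not options.url:
        parser.error('You must specify a URL.')
    if not options.file:
        parser.error('You must specify a destination file.')
    return options


options = parse(['-u', 'http://www.example.com', '-f', 'page.html'])
print(options.url, options.file)
```

parser.error() prints the usage line together with the message and exits with status 2, so a missing option stops the script before any window is created.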

Ubuntu with 4 GB of RAM

Posted on Monday, June 8th, 2009

I’ve upgraded my laptop by changing its RAM modules. It used to have two 1GB DIMMs and now it has two 2GB DIMMs. Unfortunately, Ubuntu “sees” only 3 GB of RAM. After searching the web, I found out that installing the server kernel should do the trick. And it did. A simple…

sudo apt-get install linux-restricted-modules-server linux-headers-server linux-image-server linux-server

…followed by a restart did the job. Now my system has 4GB of RAM:

tudor@thor:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          4022        764       3257          0         28        310
-/+ buffers/cache:        425       3596
Swap:         3851          0       3851

…which will allow me to run a decent 2GB virtual machine with Windows, so I won’t have any problems running Flex Builder, and perhaps I will emulate a MacOS machine, probably Leopard. I’ve never used a Mac so far and I’m very curious about it. And a real Mac is a little too…”precious” for me at this moment.
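As far as I understand it (take this as an assumption, not something I measured), the 3 GB limit comes from plain 32-bit physical addressing: the whole address space is 4 GiB, and part of it is reserved for device mappings. The server kernel is built with PAE, which widens physical addresses to 36 bits. The arithmetic, as a small Python sketch with a made-up helper name:

```python
MIB = 2 ** 20


def addressable_mib(address_bits):
    """Size in MiB of the physical address space for a given address width."""
    return 2 ** address_bits // MIB


plain_32bit = addressable_mib(32)  # total space, shared with device mappings
with_pae = addressable_mib(36)     # PAE: 36-bit physical addresses

print(plain_32bit, with_pae)
```

So 4096 MiB is the ceiling without PAE, before the hardware takes its share, while 36 bits leave plenty of room for 4 GB of RAM.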

PS: I still have two 1GB dimms on a shelf…anyone interested?

Ubuntu 8.10 – simply works

Posted on Monday, December 22nd, 2008

I bought a new laptop several weeks ago: an HP Compaq 6820s. Top specs, 2 gigs of RAM. Since it’s mainly for home use, I tried to put Windows XP on it, thinking it would be a piece of cake. And it was, until Windows booted for the first time. And then all hell broke loose. Microsoft’s OS needed drivers for most of the components. And getting those drivers from HP’s site wasn’t as easy as one might imagine, because HP’s designers have never heard of Jakob Nielsen’s teachings.

After about half an hour of searching, swearing and downloading, I had all the drivers on the laptop’s hard drive. But wait, they weren’t tested in Microsoft’s labs and those lame warnings kept popping up. “Are you sure you want to install this driver?” Well, the decision was this: either install the untested drivers and risk system failure, or use my laptop without a wireless connection, without Bluetooth, without a decent screen resolution and so on. And since I had paid a lot of money for those extra features, I chose to ignore the warnings, which proved to be another bad idea (the first bad idea was installing Windows in the first place). My system stopped working. Nothing was displayed on the screen (possibly because of the untested graphics card driver).

So I decided to leave Windows aside for now and give the new Ubuntu a try. I was amazed. It had drivers for all the hardware and it worked right out of the box. Without any lame warnings, without any boring EULAs and so on…

Ubuntu Intrepid Ibex – it simply works :)