Friday, December 09, 2011

Porting to Python 3: What are you waiting for?

Lately, there seem to be a lot of posts encouraging people to port their apps/libraries/modules to Python 3.  I'd like to add my voice and, as a reminder, point to this old post: more than 2 years ago, Crunchy was made compatible with Python 2.4, 2.5, 2.6 ... and 3.1.  At the time, we decided not to follow what seemed to be the official recommendation, namely to have one code base based on 2.x and use the 2to3 tool to do the automatic translation.  Instead, we decided to have a single code base that runs unchanged on both 2.x and 3.x, which seems to be the way people are doing it these days.

There were of course some challenges, especially as we kept compatibility all the way back to Python 2.4.  Since Crunchy puts a Python interpreter inside your browser and manipulates what's there, regardless of the encoding, it has to do a lot of string (and, for 3.x, bytes) manipulation, and it has to deal with incompatible syntax for handling exceptions, etc., when running the code that is fed to it.  This was done when very little was known about best practices for porting from 2.x to 3.  Partly as a result, Crunchy's code is not the best example to look at when it comes to doing such a port ... but it certainly demonstrated that it was possible to port a non-trivial program with a tiny team.  Now the situation is quite different: many more projects have been ported, and you can benefit from their experience.
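To give a flavour of the two challenges mentioned above (text versus bytes, and exception syntax that differs between 2.x and 3.x), here is a minimal sketch of the kind of tricks a single code base can rely on.  It is not taken from Crunchy; the names text_type, to_text and safe_divide are made up for this illustration.

```python
import sys

# Pick the text type without using any syntax that Python 2.4 cannot parse.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # this branch only runs on Python 2


def to_text(data, encoding="utf-8"):
    """Return data as text, decoding it first if it is a byte string."""
    if isinstance(data, text_type):
        return data
    return data.decode(encoding)


def safe_divide(a, b):
    """Divide a by b, reporting a division by zero as a message."""
    try:
        return a / b
    except ZeroDivisionError:
        # "except ZeroDivisionError, err" is a syntax error on 3.x, and
        # "except ZeroDivisionError as err" is a syntax error on 2.4/2.5,
        # so the active exception is fetched from sys.exc_info() instead.
        err = sys.exc_info()[1]
        return "error: %s" % err
```

The point of going through sys.exc_info() is that the same source file can be parsed by every interpreter from 2.4 to 3.x, without any translation step.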

So, what are you waiting for?

Thursday, September 22, 2011

Errors should never pass silently

 - The Zen of Python

Lately, I have been programming a lot in Javascript, and have come to appreciate even more what Tim Peters expressed so well in writing what is now known as The Zen of Python.  In particular, I appreciate the philosophy that "Errors should never pass silently."

Those who don't care about Javascript can stop reading now.  Others may benefit from, and perhaps even enlighten me further about, a "feature" I came across.

I'm working on a new website which will include only interactive tutorials.  [No, not Crunchy-based ones, but honest-to-goodness interactive tutorials that require nothing but a browser.]   To make my life easier, I use jQuery as well as CodeMirror.  A user can edit the code in the CodeMirror editor and view the result "live" in a separate window (html iframe).  Every time the content of the editor is changed, the iframe gets updated, as illustrated here.

I started by using the standard $ jQuery notation inside the html iframe without including a link to jQuery in the iframe, since I already included it in the parent page, and got an error in the console to the effect that $ was not defined.  Fair enough.  I changed the source to also include a link to jQuery inside the content of the editor (to be copied in the iframe) and the error vanished.  However, no javascript code was run from the iframe, not even a simple test alert.  Meanwhile, the javascript console remained silent.

Errors should never pass silently.

Eventually, by searching on the web, I was able to find a solution.  However, before I found a solution, I encountered a truly puzzling "feature".

After including a link to jQuery in the editor (which then was copied into the iframe) and reloading the page, if I then proceeded to remove that link from the editor, suddenly all javascript code embedded in the iframe, including all jQuery calls, would work.

I can just imagine giving these instructions to beginners:

    We are going to use jQuery, as you can see from the code in the editor.  Now, to begin, you have to select the line where jQuery is loaded and delete it so that the browser (iframe) can use it.

I still have not figured out how that phantom jQuery worked... and strongly believe that some error message or warning should have appeared in the javascript console (for Chrome, or Firebug for Firefox).

In any event, to solve my problem, I had to do 3 things:

  1. Only load jQuery once, in the main window.
  2. Write $ = parent.$; inside the iframe before using the $ notation.  
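  3. As I wanted to draw things on a canvas, replace the usual $("#canvas") jQuery idiom with $("#canvas", document) everywhere, so that the jQuery borrowed from the parent searches the iframe's own document rather than the parent page.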

Now everything appears to work just as expected.

Sunday, January 16, 2011

Book review: Pro Python System Administration

Summary: Pro Python System Administration is a comprehensive book showing how Python can be used effectively to perform a variety of system administration tasks.  I would recommend it highly to anyone having to do system administration work.  For more information, please consult the author's web site.

There is a saying that "no good deed goes unpunished".  I feel that a counterpart should be "no bad talk goes unrewarded".  At PyCon 2009, I gave a talk on plugins that has to be amongst the worst presentations I ever gave. Yet, as an unexpected result of that talk, I received a free copy of Pro Python System Administration written by Rytis Sileika. This blog entry is a review of that book.

This book is written for system administrators, a field in which I have no experience; therefore, this review will definitely not have the depth that an expert might have given it.

Four general areas of system administration are covered: network management, web server and web application management, database system management, and system monitoring.  Examples are given on a Linux system with no indication as to whether or not a given example is applicable to other operating systems. Given that Python works with all major operating systems, and that the book focuses on using Python packages, I suspect that the content would be easily adaptable to other environments.

While the book is classified, on the cover, as being addressed to advanced Python programmers, the author, in the book's introduction, indirectly suggests that it would be appropriate for people who have some minimal experience with Python.  I suspect that the classification on the book cover was done (wrongly) by the editor, as I found the examples very readable and I would not claim to be an advanced Python programmer.

The book is divided into 13 chapters, each focused on one or a few well-defined tasks. While the tasks in a given chapter are relatively independent of those of other chapters, there is a natural progression in terms of topics introduced and it is probably better to read the book in the natural sequence rather than reading chapters randomly - in other words, this book is not simply a random collection of recipes for a series of tasks.

    1. Reading and Collecting Performance Data Using SNMP

In this chapter, Sileika introduces the Simple Network Management Protocol, or SNMP, after which he shows how one can query SNMP devices using Python and the PySNMP library. In the second part of that chapter, he introduces RRDTool, an application for graphing monitoring data, and shows how to interact with it using the rrdtool module. In the last section of this first chapter, he shows how to create web pages (with the eventual goal of displaying monitoring data on the web) using the Jinja2 templating system.

    2. Managing Devices Using the SOAP API 

In this chapter, Sileika introduces the Simple Object Access Protocol, or SOAP, and gives examples based on using the Zolera SOAP Infrastructure (ZSI) package.  The bulk of the chapter focuses on explaining how to manage and monitor Citrix Netscaler load balancers. The Python logging module is introduced and used.

    3. Creating a Web Application for IP Address Accountancy  

In this chapter, the Django framework is introduced and used to build a web application that maintains IP address allocation on an internal network.  Rather than using the web server included with Django, Sileika shows how to use Django with the Apache web server.

    4. Integrating the IP Address Application with DHCP

This chapter is a continuation of the previous one, where the application previously developed is enhanced with the addition of Dynamic Host Configuration Protocol (DHCP) services as well as a few others. More advanced Django methods are included as well as some AJAX calls.

    5. Maintaining a List of Virtual Hosts in an Apache Configuration File

Another Django-based application is introduced, this time with a focus on the administration side.

    6. Gathering and Presenting Statistical Data from Apache Log Files

This chapter focuses on building a plugin-based modular framework to analyze log files.  The content of this chapter is the reason why I received a free copy of the book in the first place: Sileika mentioned to me in an email that the architecture described was mostly inspired by the presentation I gave at PyCon with a few modifications that allow for information exchange between the plug-in modules.  When I got the original email, I was really surprised given that I had tried to forget about the talk I had given on plugins. 

In my opinion, Sileika does an excellent job in explaining how plugins can be easily created with Python and used effectively to design modular applications.
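As a rough illustration of what dynamic discovery and loading of plugins can look like, here is a minimal sketch.  It is not the code from the book: the load_plugins name, the one-file-per-plugin layout and the convention that a plugin exposes a process(line) function are all assumptions made up for this example, and it relies on Python 3's importlib rather than on whatever the book uses.

```python
import importlib.util
import os


def load_plugins(directory):
    """Import every plugin file found in directory; return {name: module}."""
    plugins = {}
    for filename in sorted(os.listdir(directory)):
        # Skip anything that is not a regular, public Python source file.
        if not filename.endswith(".py") or filename.startswith("_"):
            continue
        name = filename[:-3]
        path = os.path.join(directory, filename)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Hypothetical convention: only modules defining process() count.
        if hasattr(module, "process"):
            plugins[name] = module
    return plugins
```

With something like this, adding a new kind of log analysis amounts to dropping a new file in the plugin directory; the main program then calls process(line) on each loaded plugin for every line of the log file.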

Dynamic discovery and loading of plugins is illustrated, and the GeoIP Python library is used to find the physical location corresponding to a given IP address.

    7. Performing Complex Searches and Reporting on Application Log Files

This chapter shows how to use Exctractor, an open source log file parser tool, to parse more complex log files than those generated by an Apache server as illustrated in the previous chapter. Of note is the introduction and usage of Python generators as well as an introduction to parsing XML files with Python.

    8. A Web Site Availability Check Script for Nagios

This chapter shows how to use Python with Nagios, a network monitoring system. Python packages/modules illustrated include BeautifulSoup, urllib and urllib2.  The monitoring scripts developed do more than simply check for site availability: they actually test the web application logic.

    9. Management and Monitoring Subsystem 
    10. Remote Monitoring Agents
    11. Statistics Gathering and Reporting  

Chapters 9, 10 and 11 explain how to build a "simple" distributed monitoring system.  A number of Python modules and packages are used, including xmlrpclib, SimpleXMLRPCServer, CherryPy, sqlite3, multiprocessing, ConfigParser, subprocess, NumPy, matplotlib and others.  The application developed is a good example of using Python as a "glue language" to tie together third-party modules and packages.  One weakness is that the introduction to statistics included is rather elementary and, I believe, could have been shortened considerably given the intended audience.

    12. Automatic MySQL Database Performance Tuning 

This chapter revisits the use of plugins, with a slightly more advanced application, where information can be exchanged between the different plugins.

    13. Using Amazon EC2/S3 as a Data Warehouse Solution

This last chapter gives a crash course on Amazon's "cloud computing" offerings.  It is a good final note to the book, and a good starting point for future explorations.

Overall, I found the book quite readable even though it is outside of my area of expertise.  Occasionally, I had the feeling that there were a few "fillers" (e.g. overlong log listings) that could have been shortened without losing anything of real value.  This is very much an applied book, with real-life examples that could be either used "as-is" or used as starting points for more extended applications.

I would recommend this book highly to anyone who has to perform any one of the system administration tasks mentioned above. It is also a good source of non-trivial examples demonstrating the use of a number of Python modules and packages.  The code that is given would likely save many hours of development.   As is becoming the norm, the source code included in the book is available from the publisher.  Also, many of the prototypes covered in the book are available as open source projects at the author's web site, where a discussion forum is also available.