technology from back to front

Archive for November, 2008

Profiting from Agile

In September we sponsored the Agile Conference, and although it was held before the global meltdown, the turn-out was not great. Certainly attendance at the event does not seem to be growing at anything like the pace at which Agile is being adopted. Is this because it’s now ubiquitous amongst the type of organisations that send people to conferences?

In general the most entertaining conference speakers were also the most informative, while too many “case studies” turned into hard pitches. Rob Thomsett made a compelling case that an agile mindset is needed at board level for agile programmes to succeed.

Peter Merrick was interesting on the subject of working with new clients in an Agile context. While our approach has always been to help clients discover their requirements through the early planning phases, he advocates taking a more hands-on approach to requirements analysis. Our approach to new business seemed to resonate with him too.

While there were a fair few of the familiar faces there hawking their enterprise wares, it’s refreshing to meet the smaller practitioners who have adopted an agile approach through a will to “do the right thing and do it right”, rather than just jump on the latest bandwagon.

by mike on 25/11/08

Final electoral chart now online

I’d anticipated making this post within days of the election, but while the winner was known as soon as they called California, the result in Missouri has only been called in the last couple of days following a tight recount. In the end the state went to John McCain, a blow to the pride of the former “bellwether state” which has gone to the winner in every Presidential election in the last century except this one and 1956. So we are now ready to present the final chart for the 2008 US Presidential elections, including scattergrams that show how this year compares to 2004, and how the final results compare to the final projections from Nate Silver’s fivethirtyeight.com that we used through the night. Incidentally, Silver accurately predicted the winner in every state but one, Indiana, which went to Obama by less than a 1% margin.

This was useful on election night, but it was a lot less useful than I had hoped, because what I didn’t take into account is that states are “called” for one side or another long before any estimates of the final voting percentages are available. Next time around I shall re-design it to take that into account. For now, time to get busy on something to watch during the next UK general election!

by Paul Crowley on 20/11/08

Tracing Python memory leaks

While I was writing a Python daemon, I noticed that my application’s memory usage was growing over time. The amount of data wasn’t increasing, so there had to be a memory leak.

It’s not so easy for a Python application to leak memory. Usually there are three scenarios:

  1. some low level C library is leaking
  2. your Python code has global lists or dicts that grow over time, and you forgot to remove the objects after use
  3. there are some reference cycles in your app
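
The third scenario is the easiest to reproduce in isolation. The following is a minimal sketch, assuming CPython and using a hypothetical Node class: two objects that point at each other survive the loss of their last outside reference, and only go away once the cycle collector runs.

```python
import gc
import weakref


class Node:
    """A toy object that can point at another Node (hypothetical class)."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Build two nodes that reference each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other = b
    b.other = a
    return weakref.ref(a)


gc.disable()              # keep the automatic collector out of the way for the demo
probe = make_cycle()      # the only strong references now live inside the cycle
alive_before = probe() is not None
found = gc.collect()      # an explicit pass finds the unreachable cycle
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after, found)
```

The weak reference is only there to observe whether the object is still alive without keeping it alive.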

I remembered the post from Marius Gedminas in which he traced his memory leaks, but I hadn’t noticed before that he had published his tools.
The tools are awesome. Just take a look at my session:

$ pdb ./myserver.py
> /server.py(12)<module>()
-> import sys
(Pdb) r
2008-11-13 23:15:36,619 server.py      INFO   Running with verbosity 10 (>=DEBUG)
2008-11-13 23:15:36,620 server.py      INFO   Main dir='./server', args=[]

After some time, once my application had accumulated some garbage, I pressed Ctrl+C:

2008-11-13 18:41:40,136 server.py      INFO   Quitting
(Pdb) import gc
(Pdb) gc.collect()
58
(Pdb) gc.collect()
0

Let’s see some statistics of object types in memory:

(Pdb) import objgraph
(Pdb) objgraph.show_most_common_types(limit=20)
dict                       378631
list                       184791
builtin_function_or_method 57542
tuple                      55478
Message                    48129
function                   45575
instancemethod             31949
NonBlockingSocket          31876
NonBlockingConnection      31876
_socketobject              31876
_Condition                 28320
AMQPReader                 14900
cell                       9678
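
A rough approximation of this table can be had from the standard library alone. This is a sketch rather than a replacement for objgraph: it counts only the objects that the garbage collector tracks, and the Message class here is a hypothetical stand-in used to simulate a leak.

```python
import gc
from collections import Counter


def most_common_types(limit=10):
    """Return (type name, count) pairs for the objects the collector tracks."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


class Message:
    """Hypothetical stand-in for a leaked class."""


leaked = [Message() for _ in range(50000)]   # simulate a leak
for name, count in most_common_types(limit=5):
    print("%-30s %d" % (name, count))
```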

Message objects definitely shouldn’t still be in memory. Let’s see where they are referenced:

(Pdb) objgraph.by_type('Message')[1]
<amqplib.client_0_8.Message object at 0x8a5b7ac>
(Pdb) import random
(Pdb) obj = objgraph.by_type('Message')[random.randint(0,48000)]
(Pdb) objgraph.show_backrefs([obj], max_depth=10)
Graph written to objects.dot (15 nodes)
Image generated as objects.png

This is what I saw:

[Image: Message object references]
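
When Graphviz is not to hand, the standard library can answer the same “who still points at this object?” question in a cruder way. The sketch below uses gc.get_referrers; the Message and Channel classes are hypothetical stand-ins, not the amqplib classes from the session above.

```python
import gc


class Message:
    """Hypothetical stand-in for the object that refuses to die."""


class Channel:
    """Hypothetical holder that keeps messages in a list."""

    def __init__(self):
        self.pending = []


channel = Channel()
msg = Message()
channel.pending.append(msg)

# Who holds a reference to msg? Skip the module namespace, which always does.
referrers = [r for r in gc.get_referrers(msg) if r is not globals()]
holds_list = any(r is channel.pending for r in referrers)
print(holds_list)
```

Repeating the call on each referrer in turn walks the same chain that the graph shows in one picture.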

OK. A Channel object still holds a reference to our Message. Let’s move on and see why the Channel is not freed:

(Pdb) obj = objgraph.by_type('Channel')[random.randint(0,31000)]
(Pdb) objgraph.show_backrefs([obj], max_depth=10)
Graph written to objects.dot (35 nodes)
Image generated as objects.png

Channel object references are much more interesting – we just caught a reference cycle here!

[Image: Channel object references]

There is also one other class that’s not being freed – NonBlockingConnection:

(Pdb) obj = objgraph.by_type('NonBlockingConnection')[random.randint(0,31000)]
(Pdb) objgraph.show_backrefs([obj], max_depth=10)
Graph written to objects.dot (135 nodes)
Image generated as objects.png

Here’s the cycle we’re looking for:

[Image: NonBlockingConnection object references]

To fix this issue it’s enough to break each reference loop in one place. This is the code that does it:

        # we don't need channel and connection any more
        channel.close()
        connection.close()
        # remove the reference cycles:
        del channel.callbacks
        del connection.channels
        del connection.connection
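
The effect of breaking a loop by hand can be checked in isolation. This is a minimal sketch, assuming CPython’s reference counting and using hypothetical Connection and Channel classes rather than the real amqplib ones: once the loop is cut, the object is freed immediately, without any help from the cycle collector.

```python
import gc
import weakref


class Connection:
    """Hypothetical connection that keeps a table of its channels."""

    def __init__(self):
        self.channels = {}


class Channel:
    """Hypothetical channel holding a back-reference to its connection."""

    def __init__(self, connection, channel_id):
        self.connection = connection
        connection.channels[channel_id] = self   # this closes the cycle

    def close(self):
        # Drop both directions of the link; either one alone would break the loop.
        self.connection.channels.clear()
        self.connection = None


gc.disable()                      # rely on reference counting alone
conn = Connection()
chan = Channel(conn, 1)
probe = weakref.ref(chan)
chan.close()
del chan, conn
freed_without_gc = probe() is None
gc.enable()
print(freed_without_gc)
```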

by marek on 14/11/08

Firefox tabs are finally usable

If you use Firefox, go and install the Ctrl-Tab add-on.

Tabs are great for reducing clutter, but they fail to make life much easier because the tab navigation doesn’t support the common patterns of use. For example, I end up opening the same page in multiple tabs because it is quicker to do that than to hunt for it in the existing tabs. Eventually even that is unmanageable and I have to manually garbage collect tabs.

Ctrl-Tab makes tabs usable again by fixing the navigation. The key combination Control-Tab cycles through the tab history (in the same way as task switchers), so that a single tap takes you to the previous tab you visited, which is almost always the one you wanted. Successive taps of Tab show thumbnails of the tab contents while you flip through them. Better still, Control-Shift-A shows thumbnails of all open tabs and lets you filter incrementally by typing in the auto-focussed box, then select the tab with the arrow keys or mouse. In one fell swoop it’s now much easier for me to switch to something I have open than to open it again, which is the way it should be.

by mikeb on 12/11/08

Simple inter-process locks

I recently faced a very common problem: how to make sure that only one instance of my program is running at a time on the host.

There are a lot of approaches that can be taken to solve this problem, but I needed a portable solution for Python.

My first idea was to use widely known IPC techniques to lock some global resource. In C I would just create a semaphore and lock it. One problem is that a semaphore is not unlocked when a process dies. Another issue is the lack of support for named semaphores in Python.

The best solution on Unix is to gain an exclusive write lock on a file using fcntl(LOCK_EX).
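
A minimal sketch of the file-lock idea, Unix only. It uses fcntl.flock with LOCK_EX | LOCK_NB, which is an assumption on my part: the interlocks module may well use a different call from the fcntl family. The AlreadyLocked exception and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile


class AlreadyLocked(Exception):
    """Raised when another holder already has the lock (hypothetical name)."""


def grab_lock(path):
    """Take an exclusive, non-blocking lock on path; return the open file."""
    handle = open(path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        raise AlreadyLocked(path)
    return handle          # keep this open: closing it releases the lock


lock_path = os.path.join(tempfile.gettempdir(), "interlock-demo-%d.lock" % os.getpid())
first = grab_lock(lock_path)
try:
    grab_lock(lock_path)
    second_attempt_failed = False
except AlreadyLocked:
    second_attempt_failed = True
first.close()
os.remove(lock_path)
print(second_attempt_failed)
```

Because the kernel closes a dead process’s files, the lock disappears with the process, which is exactly the property the semaphore lacked.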

Of course that doesn’t work on Windows. On that OS the solution is to take advantage of its mutex facilities via the pywin32 module. I was surprised to see that this method works quite well.

It’s also possible to use the fact that only one process at a time can bind to a specific TCP/IP port (unless you use SO_REUSEPORT). This is the most portable, but also the most obscure method.
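
The port trick can be sketched in a few lines. This is an illustration under stated assumptions, not the code from the interlocks module: grab_port_lock is a hypothetical helper, and a real program would use a fixed, agreed port number instead of asking the OS for a free one.

```python
import socket


def grab_port_lock(port):
    """Bind a TCP socket on localhost; raises OSError if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError:
        sock.close()
        raise
    return sock            # the "lock" lasts as long as this socket stays open


# Ask the OS for a free port so the demo does not collide with a real service.
holder = grab_port_lock(0)
demo_port = holder.getsockname()[1]
try:
    grab_port_lock(demo_port)
    port_was_free = True
except OSError:
    port_was_free = False
holder.close()
print(port_was_free)
```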

Here’s the code for this inter-process “locking”. It’s not really locking, because you can’t block and wait for a lock: all you can do is grab the lock or get an exception. But this is enough to make sure that only one process at a time is using a resource. This is how you can use this module:

import interlocks, time

lock = interlocks.InterProcessLock("my resource name")
try:
    lock.lock()
except interlocks.SingleInstanceError:
    print "Other process has acquired this lock."
else:
    print "Press CTRL+C to release the lock..."
    while True: time.sleep(32767)

Test code for the interlocks module needs to open an external process that blocks the resource. The code is not perfect (race conditions), but should be enough for just a test case:

import os, signal, time
import interlocks

def execute(cmd):
    ''' spawn a new python process that will execute 'cmd' '''
    cmd = '''import time;''' + cmd + '''time.sleep(10);'''
    pid = os.spawnv(os.P_NOWAIT,'/usr/bin/python', ['/usr/bin/python', '-c', cmd])
    time.sleep(1) # poor man's synchronization
    return pid

lock = interlocks.InterProcessLock('test')

# lock resource from other process
pid = execute("import interlocks; a=interlocks.InterProcessLock('test');a.lock();")
try: # fail to grab a lock
    lock.lock()
except interlocks.SingleInstanceError: print "success: the lock is blocked by spawned process"
else: print "FAILURE: the lock should be blocked by spawned process (pid=%i), but isn't" % (pid,)

os.kill(pid, signal.SIGKILL)
time.sleep(1) # poor man's synchronization

Coding the tests wasn’t so painful; making them run on Windows was much more problematic. Obviously we need an os.kill replacement for this platform. The next problem is getting os.spawnv() to work on Windows at all: which slashes to use, and how to encode spaces in the path. Another issue is that the pid returned from os.spawnv() can’t be killed; it seems that the return value is not really a proper pid. Don’t waste your time like I did: use subprocess.Popen(). The fixed test code, without os.spawnv(), is included in the lib.
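
For illustration, here is roughly what the helper looks like when built on subprocess.Popen(). This is a sketch, not the fixed test code shipped with the lib; it assumes a Python version where Popen.kill() is available (2.6 or later) and uses sys.executable in place of a hard-coded interpreter path.

```python
import subprocess
import sys
import time


def execute(cmd):
    """Spawn a new Python process that runs cmd and then sleeps."""
    code = "import time;" + cmd + "time.sleep(10);"
    # sys.executable avoids hard-coding the interpreter path and its slashes.
    return subprocess.Popen([sys.executable, "-c", code])


child = execute("pass;")
time.sleep(0.5)              # poor man's synchronization, as in the original test
still_running = child.poll() is None
child.kill()                 # the same call works on Unix and on Windows
child.wait()
exit_code = child.returncode
print(still_running, exit_code is not None)
```

Passing the arguments as a list also sidesteps the question of how to quote spaces in the path.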

by marek on 05/11/08

Electoral diagrams will be updated live

These diagrams are based on the latest projections from fivethirtyeight.com and have more detailed explanations; there’s also a cartogram and two scattergrams to show how accurate polling is and how things have changed since 2004. I’ll be updating them during the night as states are called. If you’re watching the election, do check the diagram from time to time and see if you find it informative – and do comment, I’ll be here to reply.

by Paul Crowley

Who’s winning on election night?

I find the maps and charts that the TV networks provide nearly useless for understanding the state of play during an election night, so I’ve taken to designing my own diagrams. For tomorrow’s Presidential elections, I’ve turned the projections on fivethirtyeight.com into a graph which illustrates the likely outcome of the election and the paths to victory for the two candidates:

[Image: Today’s predictions]

The x-axis represents the projected margin of victory – leftwards for Obama, rightwards for McCain. The y-axis represents electoral votes. The states are ordered by margin of victory.

From this graph you can immediately see that Obama is projected to take all the Kerry states by a margin of 6.9% or more, and the Bush states Iowa and New Mexico appear to be firmly in his pocket with projected margins of 11.0% and 8.6%. That puts him 5 EVs from the middle line – a draw – and thus 6 EVs from victory. So if Obama wins all of these states plus any other state with 6 EVs or more – or any two other states – he wins the election.
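
The arithmetic behind that paragraph can be checked in a few lines. The electoral vote figures below are not in the text above and are my assumptions: 538 votes in total, 252 for the states Kerry carried in 2004, 7 for Iowa and 5 for New Mexico. They do reproduce the 5 and 6 quoted above.

```python
TOTAL_EVS = 538
TO_WIN = TOTAL_EVS // 2 + 1     # 270: a strict majority
DRAW = TOTAL_EVS // 2           # 269: the middle line

kerry_states = 252              # assumption: EVs of the states Kerry carried in 2004
iowa, new_mexico = 7, 5         # assumption: 2008 allocations

banked = kerry_states + iowa + new_mexico
short_of_draw = DRAW - banked
short_of_win = TO_WIN - banked
print(banked, short_of_draw, short_of_win)   # 264 5 6

# No state has fewer than 3 EVs, so any two further states are always enough.
SMALLEST_STATE = 3
two_states_suffice = 2 * SMALLEST_STATE >= short_of_win
print(two_states_suffice)                    # True
```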

That’s useful for now, but what about during election night itself? You can see a chart that says something like Kerry is 10 EVs ahead of Bush, but that doesn’t help clarify which of them is really doing better. If they’ve called New York and a whole load of New England states while a lot of Southern states are still waiting to announce, then the Democratic lead might be no more than you would expect, or it might even be less.

Here’s a fantasy scenario showing what I have lined up for the Presidential elections tomorrow [removed - the final result is now online]. When they call states, I’ll move them to the top or bottom of the graph as appropriate. The area in between the called states is the remaining battleground, and anyone who can win all the states up to and across the finish line can win the election.
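
The bookkeeping behind the battleground idea can be sketched as follows. Everything here is hypothetical: the state names, vote counts and margins are toy data chosen to add up to 538, and the function names are mine, not taken from whatever generates the real chart.

```python
TO_WIN = 270

# Hypothetical toy data: (state, electoral votes, projected margin).
# Negative margins lean to the left-hand candidate, positive to the right.
projections = [
    ("Safe Left", 200, -15.0),
    ("Lean Left", 64, -7.0),
    ("Tossup A", 20, -2.0),
    ("Tossup B", 27, 1.5),
    ("Lean Right", 67, 6.0),
    ("Safe Right", 160, 14.0),
]
called = {"Safe Left": "left", "Safe Right": "right"}


def battleground(projections, called):
    """Return banked EVs per side and the uncalled states ordered by margin."""
    banked = {"left": 0, "right": 0}
    remaining = []
    for state, evs, margin in sorted(projections, key=lambda row: row[2]):
        if state in called:
            banked[called[state]] += evs
        else:
            remaining.append((state, evs, margin))
    return banked, remaining


def path_to_victory(side, banked, remaining):
    """States a side must add, starting from its friendliest, to reach TO_WIN."""
    ordered = remaining if side == "left" else list(reversed(remaining))
    total, path = banked[side], []
    for state, evs, _ in ordered:
        if total >= TO_WIN:
            break
        total += evs
        path.append(state)
    return path if total >= TO_WIN else None


banked, remaining = battleground(projections, called)
left_path = path_to_victory("left", banked, remaining)
right_path = path_to_victory("right", banked, remaining)
print(banked, left_path, right_path)
```

With this toy data the left-hand side needs two more states and the right-hand side three, which is the kind of comparison a raw “10 EVs ahead” figure hides.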

I will also be maintaining a scattergram showing how the projections have done against reality, and a cartogram illustrating the electoral college.

This is tricky information to represent in a single graph, so any ideas for improvements will be gratefully received – thanks!

by Paul Crowley on 03/11/08


You are currently browsing the LShift Ltd. blog archives for November, 2008.


2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK. +44 (0)20 7729 7060