From: dizzyd@jabber.org
Subject: Some thoughts on Threading
Date: May 25, 2004 2:53:54 PM PDT
To: bkirsch@osafoundation.org
Cc: lisa@osafoundation.org

Greetings,

I've been poking around Chandler a little bit more, reading docs, and just generally trying to understand what makes Chandler tick. One thing that stands out to me is the lack of a clear threading strategy. I'd like to throw out a strawman idea.

First off, I would submit that it is unnecessary to have more than two threads running in Chandler at any given time. Python is well known for having a GIL (Global Interpreter Lock) that prohibits two threads from executing Python code simultaneously. Unless the majority of code in Chandler is not in Python, you'll spend time waiting on the GIL when you could be executing code.

With the appropriate async event loops and primitives, threading is overkill for the majority of desktop applications. This statement is reinforced by the fact that most (if not all) of the major graphical toolkits in use today have at their heart an event loop. Additionally, threading a desktop application that is meant to be a platform for people to build "parcels" on means that you are increasing the prerequisite development knowledge necessary to build a stable, comprehensible "parcel".

All of that said, given the technology choices that Chandler has made, I think it is unavoidable that it use at least two threads. However, at least with two threads you can provide a reasonable API that will hide (for the most part) the fact that code is running on different threads.

There has been some indication that the Chandler community is interested in using Twisted, and I have some familiarity with Twisted, so my strawman incorporates this framework. Specifically, I would suggest that Chandler should have one thread for the GUI event loop and one thread for the Twisted event loop. All graphical operations, callbacks, etc. would happen on the GUI thread, while agent ops, network ops, etc. would live on the Twisted thread.

Twisted provides a nice facility for doing timed callbacks and long-lived (efficient) network operations. Considering that Chandler is a platform for integrating various forms of personal information management, and that more and more personal information is exchanged via the 'Net, it makes sense to have at the core of your platform a system that allows you to manage tens of sockets simultaneously without introducing gross locking complexities (that _will_ emerge if every socket is on its own thread).

The one argument I've heard thus far against using an async model (such as Twisted) versus threads is that threads are a universal concept, whereas Twisted is very Python-specific (esp. once you get into deferreds, etc.). I would note that while threads are indeed a universal concept, the semantics and implementation of them vary wildly from platform to platform (language to language, etc.). Just because people understand the concept of threads doesn't mean they know how to write solid code with them. While Twisted is admittedly Python-centric, it's conceivable that a reasonably small intermediate API could be developed that would hide most of the Python-oddities/specificity.

Admittedly, all of this would require a rather dramatic change in the underlying architecture of Chandler. However, if you intend to build a serious product that is meant to be the platform for the next generation of personal information management tools, you're going to have to make solid architectural decisions sooner rather than later. As most everyone knows, it's far cheaper to invest development time near the beginning of a project than at the end.

So, in a nutshell, that's my $0.02. I would be glad to help out in the prototyping of the idea, as I have time.
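The two-thread arrangement proposed in the email can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's `threading` and `queue` modules as stand-ins for both the GUI toolkit's event loop and the Twisted reactor, and the names `LoopThread`, `call_from_thread`, `fetch`, and `show_result` are hypothetical, not part of Chandler or Twisted. The point is the shape of the API: each thread runs its own loop, and work crosses threads only by being queued onto the other loop.

```python
import queue
import threading


class LoopThread:
    """A toy single-threaded event loop (hypothetical stand-in for a GUI
    loop or the Twisted reactor): it runs queued callables one at a time."""

    def __init__(self, name):
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)

    def start(self):
        self._thread.start()

    def call_from_thread(self, func, *args):
        # Safe to call from any thread: the call is only queued here and
        # is executed later on this loop's own thread.
        self._calls.put((func, args))

    def stop(self):
        # A None sentinel tells the loop to exit after earlier calls finish.
        self._calls.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._calls.get()
            if item is None:
                break
            func, args = item
            func(*args)


def demo():
    gui = LoopThread("gui")
    net = LoopThread("network")
    log = []
    done = threading.Event()

    def show_result(text):
        # Runs on the GUI thread only.
        log.append((threading.current_thread().name, text))
        done.set()

    def fetch(url):
        # Runs on the network thread, then hands the result to the GUI thread.
        log.append((threading.current_thread().name, "fetching " + url))
        gui.call_from_thread(show_result, "fetched " + url)

    gui.start()
    net.start()
    net.call_from_thread(fetch, "example://inbox")
    done.wait(timeout=5)
    net.stop()
    gui.stop()
    return log


if __name__ == "__main__":
    for thread_name, message in demo():
        print(thread_name, message)
```

Because neither callback touches the other thread's state directly, no locks appear in the application code; the queues are the only shared objects.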
The Reactor is technically not thread-safe. All code accessing the Reactor must run in the Reactor thread. To schedule a call onto the Reactor thread from any other thread, use reactor.callFromThread.
It really depends on what your program is doing, but the most common cause is this: it is firing -- but it's an error, not a success, and you have forgotten to add an errback, so nothing happens. Always add errbacks!

The reason this happens is that unhandled errors in Deferreds get printed when the Deferred is garbage collected. Make sure your Deferred is garbage collected by deleting all references to it when you are done with it, e.g. after callback() is called.
You don't. Deferreds don't magically turn a blocking function call into a non-blocking one. A Deferred is just a simple object that represents a deferred result, with methods to allow convenient adding of callbacks. (This is a common misunderstanding; suggestions on how to make this clearer in the Deferred Execution howto are welcome!)

If you have blocking code that you want to use non-blockingly in Twisted, either rewrite it to be non-blocking, or run it in a thread. There is a convenience function, deferToThread, to help you with the threaded approach -- but be sure to read Using Threads in Twisted.