Chile’s Ministry of Energy announced today that Chile will be observing
daylight saving time again. Chile Standard Time will be changed back to UTC
-4 at 00:00 on 15 May, and DST will be observed from 00:00 on 14 August 2016,
changing time in Chile to UTC -3.
Chile used to observe DST every year until a permanent UTC offset of -3 was
introduced in 2015.
It is unclear whether the time change also applies to Easter Island.
I was looking to make more room on my phone’s home screen, and I realized that
my use of App.net had dwindled more than enough to remove it. I never
post anymore, but there are a couple of people I would still like to follow
who don’t cross-post to Twitter.
App.net has RSS feeds for every user, but they include both posts and replies.
I only want to see posts, so I dusted off my primitive XSLT skills.
I wrote an XSLT program to delete RSS items that begin with @. While I was at
it, I replaced each title with the user’s name, since the text of the post is
also available in the description tag.
Here is the transformation that would filter my posts, if I had any:
Now I can use xsltproc to filter the RSS.
In order to fill in the username automatically, I wrapped the XSLT program in a
shell script that also invokes curl.
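For readers who would rather not touch XSLT, the same filtering idea can be sketched in Python. This is not the XSLT program described above, only a rough equivalent using the standard library's xml.etree.ElementTree. It assumes a plain RSS 2.0 feed whose item and title elements carry no namespace and whose title holds the post text; the function name filter_replies is made up for illustration.

```python
import xml.etree.ElementTree as ET


def filter_replies(rss_text, username):
    """Drop items whose title begins with "@" and retitle the rest.

    Assumes un-namespaced RSS 2.0: <rss><channel><item><title>...
    """
    root = ET.fromstring(rss_text)
    for channel in root.iter("channel"):
        # Iterate over a copy, because items are removed inside the loop.
        for item in list(channel.findall("item")):
            title = item.find("title")
            if title is None:
                continue
            if (title.text or "").startswith("@"):
                # A reply: remove the whole item from the feed.
                channel.remove(item)
            else:
                # A regular post: the text is still in <description>,
                # so the title can simply show who posted it.
                title.text = username
    return ET.tostring(root, encoding="unicode")
```

Fetching the feed (the part the shell script hands to curl) is left out here; any HTTP client that returns the feed as text would do.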
While adding multithreading support to a Python script,
I found myself thinking again about the difference between multithreading and
multiprocessing in the context of Python.
For the uninitiated, Python multithreading uses threads to do parallel processing.
This is the most common way to do parallel work in many programming languages.
But CPython has the Global Interpreter Lock (GIL), which means that
no two Python statements (bytecodes, strictly speaking) can execute at the same
time. So this form of parallelization is only
helpful if most of your threads are either not actively doing anything
(for example, waiting for input),
or doing something that happens outside the GIL
(for example launching a subprocess or doing a numpy calculation).
Threads are very lightweight; for example, they share memory space.
Python multiprocessing, on the other hand, uses multiple system-level processes;
that is, it starts up multiple instances of the Python interpreter.
This gets around the GIL limitation, but obviously has more overhead.
In addition, communicating between processes is not as easy as reading and
writing shared memory.
To illustrate the difference, I wrote two functions. The first is called idle
and simply sleeps for two seconds. The second is called busy and
computes a large sum. I ran each 15 times using 5 workers, once using threads
and once using processes. Then I used matplotlib to visualize the results.
Here are the two idle graphs, which look essentially identical.
(Although if you look closely, you can see that the multiprocess version is
And here are the two busy graphs. The threads are clearly not helping
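The experiment described above might be sketched roughly as follows. This is not the original script: it targets Python 3, uses concurrent.futures for both the thread and the process pools, leaves out the matplotlib plotting, and the helper name timed_run and the size of the sum are assumptions made for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def idle(seconds=2.0):
    """Sleep for a while; the GIL is released while sleeping."""
    time.sleep(seconds)
    return seconds


def busy(n=5_000_000):
    """Compute a large sum in pure Python; this holds the GIL."""
    total = 0
    for i in range(n):
        total += i
    return total


def timed_run(executor_cls, fn, arg, tasks=15, workers=5):
    """Run fn(arg) `tasks` times on `workers` workers.

    Returns (elapsed_seconds, list_of_results).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(fn, [arg] * tasks))
    return time.perf_counter() - start, results
```

Calling timed_run(ThreadPoolExecutor, idle, 2.0) and timed_run(ProcessPoolExecutor, idle, 2.0) should take about the same time, while the busy function is expected to finish noticeably sooner with ProcessPoolExecutor on a multi-core machine. When using the process pool, make the calls from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.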
I have a Python script that downloads OFX files
from each of my banks and credit cards.
For a long time, I have been intending to make the HTTP requests multithreaded,
since it is terribly inefficient to wait for one response to arrive before
sending the next request.
Here is the single-threaded code block I was working with.
Using the Python 2.7 standard library, I would probably use either the
threading module or multiprocessing.pool.ThreadPool.
With a plain threading.Thread, you can call a function in a separate thread but
you cannot directly access the return value (ThreadPool does hand results back
through map and apply_async). To use threading in my code, I would need to alter
Download to take a second parameter and store the output there. If the second
parameter is shared across multiple threads, I have to worry about thread safety.
Doable, but ugly.
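To make the "doable, but ugly" approach concrete, here is a minimal sketch using the threading module. The real Download function is not shown in this post, so download_into is a hypothetical stand-in that fabricates a string instead of making an HTTP request; the shared dictionary and the lock are the "second parameter" and the thread-safety worry mentioned above.

```python
import threading


def download_into(account, results, lock):
    """Hypothetical stand-in for an altered Download.

    Instead of returning the data, it stores it in the shared dict.
    """
    data = "OFX data for " + account  # placeholder for the real HTTP request
    with lock:
        # Guard the shared dict so concurrent writes cannot interleave.
        results[account] = data


def download_all(accounts):
    """Start one thread per account and collect the outputs."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=download_into, args=(name, results, lock))
        for name in accounts
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every download to finish
    return results
```

The extra parameters and the explicit lock are exactly the kind of plumbing that has to be threaded through the download function in this style.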
In Python 3.2 and higher, the concurrent.futures module
makes this much easier. (It is also backported to Python 2.)
Each time you submit a function to be run on a separate thread, you get a Future
object. When you ask for the result, the main thread blocks until your thread is
complete. But the main benefit is that I don’t have to make any changes to
Download.
In a typical run, my 6 accounts take 3, 4, 5, 6, 8, and 10 seconds to
download. Using a single thread, this adds up to 36 seconds. Using multiple
threads, we only have to wait about 10 seconds, the time of the slowest
download, for all responses to arrive.
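A minimal sketch of the concurrent.futures approach might look like this. The download function here is a hypothetical stand-in that sleeps for a given delay instead of contacting a bank, and the account names and delays passed in are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download(account, delay):
    """Hypothetical stand-in for Download: waits, then returns fake data."""
    time.sleep(delay)  # simulates waiting for the HTTP response
    return "OFX data for " + account


def download_all(accounts, max_workers=6):
    """accounts maps an account name to its simulated delay in seconds."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns a Future right away; all requests start together.
        futures = {
            name: pool.submit(download, name, delay)
            for name, delay in accounts.items()
        }
        # result() blocks until that particular download has finished.
        return {name: future.result() for name, future in futures.items()}
```

Note that download returns its value in the ordinary way; the Future carries it back to the main thread, so no shared state or locking is needed. With delays of 3, 4, 5, 6, 8, and 10 seconds and six workers, the whole call should take roughly as long as the slowest download.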