A rate limited wrapper for Python’s thread safe queues.
Some external APIs have rate limits that allow faster-than-consecutive
queries, e.g. if the rate limit is very high or the API response is
quite slow. To make the most of the API, the best option is often to
make API calls from multiple threads. Then you can put the requests or
URLs to call in a
queue.Queue and have the threads consume the URLs
as they make the calls. However, you still have to make sure that the
total calls from all your threads don’t exceed the rate limit, which
requires some nontrivial coordination.
ratelimitqueue package extends the three built-in Python queues
queue package -
PriorityQueue - with configurable, rate limited counterparts.
get() method is rate limited across all threads so
that workers can safely consume from the queue in an unlimited loop, and
putting the items in the queue doesn’t need to require blocking the main
ratelimitqueue, simply install it with
pip install ratelimitqueue
The most basic usage is rate limiting calls in the main thread by
RateLimitQueue. For a rate limit of 2 calls per
rlq = ratelimitqueue.RateLimitQueue(calls=2, per=1) # load up the queue for url in LIST_OF_URLS: rlq.put(url) # make the calls while rlq.qsize() > 0: url = rlq.get() make_call_to_api(url) rlq.task_done()
A more typical use case would be to have a pool of workers making API calls in parallel:
rlq = ratelimitqueue.RateLimitQueue(calls=3, per=2) n_workers = 4 def worker(rlq): """Makes API calls on URLs from queue until it is empty.""" while rlq.qsize() > 0: url = rlq.get() make_call_to_slow_api(url) rlq.task_done() # load up the queue for url in LIST_OF_URLS: rlq.put(url) # make the calls with multiprocessing.dummy.Pool(n_workers, worker, (rlq,)) as pool: rlq.join()
Working versions of these examples can be found in the in the examples directory.