Python Celery Best Practices
Tips and tricks to help you build scalable distributed apps with Celery

Building standalone systems that are resilient is challenging enough. Add distribution and suddenly you have many more moving parts to worry about. Naturally, the more moving parts a piece of software has, the more complexity it carries and the more time it takes to maintain — Python Celery is no exception.
In the first installment of this series of Celery articles, we looked at getting started with Celery using standalone Python and integrating it into your Django web application projects.
In this installment, we’ll look at the best practices you should follow to make your Celery-enabled applications more resilient and better performing, and to improve monitoring and observability.
Celery for the Right Use Cases
It’s easy to think of Celery as a one-size-fits-all solution for every conceivable problem. When you don’t understand the tool well enough, it’s tempting to try to fit it into every use case. But Celery may be overkill when you have a simple use case and you’re not looking for distribution. If you have a resource that needs to be throttled, a simple queue such as AWS SQS should suffice — it’s easier to configure and maintain than Celery.
What Celery is useful for is the execution of tasks that cost you in terms of performance and resource utilization — for example, within the handler of an HTTP request, or when you need to handle complex computation or ETL work that may take time to execute. In situations like these, it makes sense to use Celery because, although you lose fine-grained control with the level of abstraction provided, you gain asynchronous and distributed behavior.
While Celery can handle big data depending on how you code your work, it is not a direct replacement for open-source solutions such as Apache Spark — although Celery can complement Spark by handling task orchestration and letting Spark do what it does best.