Your Django view takes 30 seconds because it sends emails, generates PDFs, and calls three external APIs. Your users are staring at a loading spinner. The fix is not faster code — it is moving slow work to a background task queue.
Celery is the standard solution for Python. This guide takes you from basic tasks to production-grade workflows with retries, chains, monitoring, and the gotchas that bite every team.
Producer
task.delay()
Broker
message queue
Consumer
executes task
Backend
Redis / DB
Setup: Django + Celery + Redis
# Install
pip install celery[redis] django-celery-beat django-celery-results
# project/celery.py
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
app = Celery('project')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks() # Auto-find tasks.py in each app
# project/__init__.py
from .celery import app as celery_app
__all__ = ('celery_app',)
# settings.py
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
Basic Tasks
# orders/tasks.py
from celery import shared_task
from django.core.mail import send_mail
@shared_task
def send_order_confirmation(order_id: int):
"""Send order confirmation email in the background."""
order = Order.objects.select_related('customer').get(id=order_id)
send_mail(
subject=f'Order #{order.id} Confirmed',
message=f'Thank you for your order of {order.total}.',
from_email='noreply@example.com',
recipient_list=[order.customer.email],
)
return f"Email sent for order {order_id}"
@shared_task
def generate_invoice_pdf(order_id: int):
"""Generate PDF invoice and upload to S3."""
order = Order.objects.get(id=order_id)
pdf = render_invoice(order)
url = upload_to_s3(pdf, f"invoices/order-{order_id}.pdf")
order.invoice_url = url
order.save(update_fields=['invoice_url'])
return url
# Call from your view:
def create_order(request):
order = Order.objects.create(...)
# Fire and forget - returns immediately
send_order_confirmation.delay(order.id)
generate_invoice_pdf.delay(order.id)
return JsonResponse({"order_id": order.id}) # Responds instantly!
Task Retries: Handling Failures Gracefully
@shared_task(
bind=True,
max_retries=3,
default_retry_delay=60, # 60 seconds between retries
)
def call_payment_api(self, order_id: int):
"""Process payment with automatic retries."""
order = Order.objects.get(id=order_id)
try:
result = payment_gateway.charge(
amount=order.total,
token=order.payment_token,
)
order.payment_status = 'completed'
order.save(update_fields=['payment_status'])
return result
except payment_gateway.TemporaryError as exc:
# Retry with exponential backoff
raise self.retry(
exc=exc,
countdown=60 * (2 ** self.request.retries), # 60s, 120s, 240s
)
except payment_gateway.PermanentError as exc:
# Do not retry permanent failures
order.payment_status = 'failed'
order.save(update_fields=['payment_status'])
raise # Let it fail
Task Chains: Sequential Workflows
from celery import chain
# Execute tasks in sequence, passing results forward
workflow = chain(
validate_order.s(order_id), # Step 1: validate
process_payment.s(), # Step 2: charge (receives validation result)
send_confirmation.s(), # Step 3: email (receives payment result)
generate_invoice.s(), # Step 4: PDF
)
# Execute the chain
workflow.delay()
# Each task receives the return value of the previous task
@shared_task
def validate_order(order_id):
order = Order.objects.get(id=order_id)
if order.total <= 0:
raise ValueError("Invalid order total")
return {"order_id": order_id, "total": float(order.total)}
@shared_task
def process_payment(validation_result):
order_id = validation_result["order_id"]
# ... process payment
return {"order_id": order_id, "payment_id": "pay_123"}
Task Groups and Chords: Parallel Execution
from celery import group, chord
# Group: run tasks in parallel (fan-out)
batch = group(
send_notification.s(user_id) for user_id in user_ids
)
batch.delay() # All notifications sent in parallel
# Chord: parallel tasks + callback when ALL complete (fan-out, fan-in)
workflow = chord(
[
fetch_price.s('AAPL'),
fetch_price.s('GOOGL'),
fetch_price.s('MSFT'),
],
compile_report.s() # Called with list of all results
)
workflow.delay()
@shared_task
def fetch_price(symbol):
price = stock_api.get_price(symbol)
return {"symbol": symbol, "price": price}
@shared_task
def compile_report(prices):
# prices = [{"symbol": "AAPL", "price": 150}, ...]
report = generate_market_report(prices)
return report
Rate Limiting
# Limit to 10 calls per minute (external API rate limit)
@shared_task(rate_limit='10/m')
def call_external_api(item_id):
return api_client.process(item_id)
# Per-worker rate limit
@shared_task(rate_limit='100/h')
def send_sms(phone, message):
sms_gateway.send(phone, message)
# Rate limiting in settings (global)
CELERY_TASK_DEFAULT_RATE_LIMIT = '100/m'
Periodic Tasks with Celery Beat
# settings.py
from celery.schedules import crontab
CELERY_BEAT_SCHEDULE = {
'cleanup-expired-sessions': {
'task': 'accounts.tasks.cleanup_sessions',
'schedule': crontab(hour=3, minute=0), # Daily at 3 AM
},
'send-daily-digest': {
'task': 'notifications.tasks.send_daily_digest',
'schedule': crontab(hour=8, minute=0, day_of_week='1-5'), # Weekdays 8 AM
},
'sync-inventory': {
'task': 'inventory.tasks.sync_from_warehouse',
'schedule': 300.0, # Every 5 minutes
},
}
# Run the beat scheduler:
# celery -A project beat --loglevel=info
Monitoring with Flower
# Install and run
pip install flower
celery -A project flower --port=5555
# Flower dashboard shows:
# - Active/completed/failed task counts
# - Worker status and resource usage
# - Task execution time histograms
# - Real-time task queue depth
# - Individual task details and tracebacks
# Access at http://localhost:5555
Production Configuration
# settings.py - Production-ready config
CELERY_BROKER_URL = 'redis://redis:6379/0'
CELERY_RESULT_BACKEND = 'redis://redis:6379/1'
# Reliability
CELERY_TASK_ACKS_LATE = True # Acknowledge after execution (not before)
CELERY_WORKER_PREFETCH_MULTIPLIER = 1 # Fetch one task at a time
CELERY_TASK_REJECT_ON_WORKER_LOST = True # Requeue if worker crashes
# Performance
CELERY_WORKER_CONCURRENCY = 4 # Workers per process
CELERY_TASK_COMPRESSION = 'gzip' # Compress large payloads
CELERY_RESULT_EXPIRES = 3600 # Clean up results after 1 hour
# Safety
CELERY_TASK_TIME_LIMIT = 300 # Hard kill after 5 minutes
CELERY_TASK_SOFT_TIME_LIMIT = 240 # Raise SoftTimeLimitExceeded at 4 min
CELERY_TASK_ALWAYS_EAGER = False # Never True in production!
Common Pitfalls
- Passing ORM objects to tasks: Always pass IDs, not model instances. Objects cannot be serialized to JSON, and even if pickled, the data may be stale by the time the task runs.
- TASK_ALWAYS_EAGER in production: This runs tasks synchronously (no queue). Great for testing, disastrous in production.
- No time limits: A hung HTTP call inside a task blocks the worker forever. Always set TASK_TIME_LIMIT.
- Ignoring task results: If you never read results, set
ignore_result=Trueto avoid filling up Redis. - Database connections exhaustion: Each worker opens its own DB connection. With 20 workers, that is 20 connections. Use connection pooling (pgbouncer).
- Not handling idempotency: Tasks can be retried or delivered twice. Design tasks so running them twice produces the same result (use database constraints, check before insert).
Key Takeaways
- Move anything over 500ms to a background task — emails, PDFs, API calls, data processing
- Always pass IDs, not objects to Celery tasks — re-fetch from the database inside the task
- Use retries with exponential backoff for external API calls — temporary failures are normal
- Chains for sequential workflows, chords for fan-out/fan-in — compose complex pipelines from simple tasks
- Set time limits on every task — a hung worker is worse than a failed task
- Monitor with Flower — you cannot fix what you cannot see
- Design tasks to be idempotent — they will be retried, and that must be safe
Celery transforms your Django app from a synchronous request-response system into an asynchronous workflow engine. The key is starting simple — one task, one worker, one queue — and adding complexity (chains, chords, multiple queues) only when your workload demands it.