Celery provides asynchronous task processing for FKApi, enabling background jobs like data scraping and scheduled tasks. This guide covers installation, configuration, and usage.

Overview

Celery is completely optional for FKApi. The system automatically falls back to threading if Celery is not available. However, Celery is recommended for production as it provides:
  • Better performance for background tasks
  • Task scheduling with Celery Beat
  • Task monitoring with Flower
  • Distributed task processing
  • Task retry and error handling

When to Use Celery

Use Celery

  • Production deployments
  • Scheduled tasks (daily scraping)
  • High-traffic environments
  • Task monitoring required

Skip Celery

  • Development/testing
  • Low-traffic deployments
  • Simpler setup preferred
  • No scheduled tasks needed

Architecture

FKApi uses Celery with the following components:
  • Redis: Message broker and result backend
  • Celery Worker: Processes async tasks
  • Celery Beat: Scheduler for periodic tasks
  • Flower: Web-based monitoring dashboard
┌─────────────┐
│   Django    │
│  (Web App)  │
└──────┬──────┘
       │ Submit tasks
       ▼
┌─────────────┐     ┌──────────────┐
│    Redis    │────→│Celery Worker │
│   (Broker)  │     │  (Executor)  │
└─────────────┘     └──────────────┘
       ▲
       │ Schedule tasks
┌──────┴──────┐
│ Celery Beat │
│ (Scheduler) │
└─────────────┘

Prerequisites

Before installing Celery, ensure you have:
  • Redis installed and running
  • Python 3.10 or higher
  • FKApi base installation complete

Installation

Install Redis

Redis is required as the message broker for Celery.
Linux (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install redis-server
sudo systemctl start redis-server
sudo systemctl enable redis-server

# Verify Redis is running
redis-cli ping
# Should return: PONG
macOS
brew install redis
brew services start redis

# Verify Redis is running
redis-cli ping
# Should return: PONG
Windows
Option 1: Using Memurai (Recommended)
  1. Download from https://www.memurai.com/
  2. Install and start the service
  3. Redis will run on localhost:6379
Option 2: Using Docker
docker run -d --name redis -p 6379:6379 redis:7-alpine
Option 3: Using WSL2
# In WSL2 terminal
sudo apt-get update
sudo apt-get install redis-server
sudo service redis-server start
Install Celery Packages

Install Celery and related dependencies:
# Activate virtual environment
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows

# Install Celery packages
pip install "celery[redis]==5.4.0"
pip install django-celery-beat==2.6.0
pip install flower==2.0.1
pip install django-redis==5.4.0
These packages are already included in requirements.txt, so if you installed all dependencies, you already have Celery.

Configure Environment Variables

Update your .env file to enable Celery:
# Enable Celery
ENABLE_CELERY=True

# Redis configuration
REDIS_URL=redis://localhost:6379/1
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/0
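A hedged sketch of how these variables might be consumed in fkapi/settings.py — the exact parsing is an assumption; only the three variable names above come from the .env file:

```python
import os

# Sketch (assumed layout): map the .env variables into Django settings.
# ENABLE_CELERY toggles between Celery and the threading fallback.
ENABLE_CELERY = os.environ.get("ENABLE_CELERY", "False").lower() in ("true", "1", "yes")

# Celery picks up any setting prefixed with CELERY_ via config_from_object.
CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0")
CELERY_RESULT_BACKEND = os.environ.get("CELERY_RESULT_BACKEND", CELERY_BROKER_URL)
```

Note the result backend defaulting to the broker URL — both point at Redis database 0 in the .env example above.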
Verify Celery Configuration

The Celery app is configured in fkapi/celery.py:
import os
from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "fkapi.settings")

app = Celery("fkapi")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()
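For this app to load whenever Django starts, the standard Celery-with-Django template also imports it in the project package. If FKApi follows that upstream pattern, fkapi/__init__.py contains:

```python
# fkapi/__init__.py — standard Celery/Django bootstrap (per the Celery docs;
# assumed to be present in FKApi, as autodiscovery depends on it)
from .celery import app as celery_app

__all__ = ("celery_app",)
```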
This configuration:
  • Automatically discovers tasks in Django apps
  • Loads settings from Django with the CELERY_ prefix
  • Uses Redis as broker and result backend

Running Celery

Starting Workers

Celery workers process async tasks. Start a worker with:
# Using management command
python manage.py celery worker --loglevel=info --pool=threads --concurrency=4

# Or using Celery directly
celery -A fkapi worker --loglevel=info --pool=threads --concurrency=4

  • -A fkapi: Name of the Celery app
  • --loglevel=info: Set logging level (debug, info, warning, error, critical)
  • --pool=threads: Use a thread pool (required on Windows)
  • --concurrency=4: Number of worker threads (adjust based on CPU cores)

Starting Beat Scheduler

Celery Beat schedules periodic tasks. Start the scheduler with:
celery -A fkapi beat --loglevel=info --scheduler django_celery_beat.schedulers:DatabaseScheduler

The DatabaseScheduler stores schedules in the database, allowing runtime configuration through the Django admin.

Starting Flower (Monitoring)

Flower provides a web UI for monitoring Celery:
celery -A fkapi flower --port=5555

Access the dashboard at http://localhost:5555. Flower shows:
  • Active workers
  • Running tasks
  • Task history
  • Success/failure rates
  • Worker resource usage

Running All Services

For development, you’ll need to run at least three separate processes (four with Flower):
# Terminal 1: Django
python manage.py runserver

# Terminal 2: Celery Worker
celery -A fkapi worker --loglevel=info --pool=threads --concurrency=4

# Terminal 3: Celery Beat
celery -A fkapi beat --loglevel=info --scheduler django_celery_beat.schedulers:DatabaseScheduler

# Terminal 4 (optional): Flower
celery -A fkapi flower --port=5555

Scheduled Tasks

FKApi includes scheduled tasks configured in settings.py:
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'scrape_daily': {
        'task': 'core.tasks.scrape_daily',
        'schedule': crontab(hour=0, minute=0),  # Daily at midnight
    },
}

Modifying Schedules

With the DatabaseScheduler, you can modify schedules at runtime through the Django admin:
1. Go to http://localhost:8000/admin/
2. Navigate to Periodic Tasks
3. Add or edit tasks
4. Changes take effect immediately

Task Examples

Creating a Task

Define tasks in your Django app’s tasks.py:
from celery import shared_task
import logging

logger = logging.getLogger(__name__)

@shared_task
def scrape_user_collection(userid):
    """Scrape user collection from FootballKitArchive."""
    logger.info(f"Starting scrape for user {userid}")
    try:
        # Your scraping logic here
        result = perform_scrape(userid)
        logger.info(f"Completed scrape for user {userid}")
        return result
    except Exception as e:
        logger.error(f"Error scraping user {userid}: {e}")
        raise

Calling a Task

Call tasks asynchronously from your views or API endpoints:
from core.tasks import scrape_user_collection

# Async execution (with Celery)
task = scrape_user_collection.delay(userid=12345)

# Get task ID
task_id = task.id

# Check task status later
from celery.result import AsyncResult
result = AsyncResult(task_id)
if result.ready():
    print(result.result)

Fallback to Threading

If Celery is not available, FKApi automatically uses threading:
import threading

def scrape_async(userid):
    """Fallback to threading if Celery unavailable."""
    thread = threading.Thread(target=perform_scrape, args=(userid,))
    thread.start()
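The choice between the two paths can be sketched as a small dispatcher. Here core.tasks and perform_scrape are stand-ins for FKApi's real modules, and the thread is joined only so the example is deterministic — the real fallback runs fire-and-forget:

```python
import threading

def perform_scrape(userid):
    # Placeholder for the real scraping logic (hypothetical).
    return f"scraped {userid}"

def dispatch_scrape(userid, results):
    """Run the scrape via Celery if importable, else in a background thread."""
    try:
        from core.tasks import scrape_user_collection  # present only with Celery
    except ImportError:
        # Fallback path: plain thread, mirroring FKApi's behaviour
        thread = threading.Thread(target=lambda: results.append(perform_scrape(userid)))
        thread.start()
        thread.join()  # joined here only to make the sketch deterministic
        return None
    return scrape_user_collection.delay(userid)
```

Callers get a Celery AsyncResult when Celery is active and None on the threading path, so anything that inspects task state must handle both.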

Monitoring

Command Line

Monitor Celery from the command line:
# Check active workers
celery -A fkapi inspect active

# Check registered tasks
celery -A fkapi inspect registered

# Check worker stats
celery -A fkapi inspect stats

# Monitor events in real-time
celery -A fkapi events

# Purge all tasks
celery -A fkapi purge

Flower Dashboard

Flower provides comprehensive monitoring:
1. Start Flower: celery -A fkapi flower --port=5555
2. Open http://localhost:5555
3. View:
  • Tasks: History of all tasks
  • Workers: Active workers and their status
  • Monitor: Real-time task execution
  • Broker: Redis connection stats

Troubleshooting

Redis Connection Failed

Error: Redis is not running

Solution:
# Check if Redis is running
redis-cli ping
# Should return: PONG

# Start Redis
# Linux: sudo systemctl start redis-server
# Mac: brew services start redis
# Windows: Start Memurai service or Docker container

Tasks Not Executing

Error: Tasks appear in queue but don’t execute

Solution:
1. Check worker is running: celery -A fkapi inspect active
2. Check for errors in worker logs
3. Verify task is registered: celery -A fkapi inspect registered
4. Check Redis connection: redis-cli ping

ModuleNotFoundError

Error: ModuleNotFoundError: No module named 'celery'

Solution:
# Install Celery
pip install "celery[redis]" django-celery-beat

# Or disable Celery in .env
ENABLE_CELERY=False

Worker Crashes

Error: Worker exits unexpectedly

Solution:
1. Check worker logs for errors
2. Increase worker timeout
3. Check database connection
4. Verify memory availability
5. Check for task deadlocks

Tasks Slow or Hanging

Error: Tasks take too long or never complete

Solution:
1. Check database performance
2. Add timeouts to external API calls
3. Monitor Redis memory usage
4. Increase worker concurrency
5. Profile task code for bottlenecks
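For bounding an external API call from inside a task, Celery's own time-limit options are the built-in route; a library-free alternative is to run the call in a worker thread and abandon it on timeout — a minimal sketch (the function name is invented):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def call_with_timeout(fn, *args, timeout=10.0):
    """Run fn(*args) in a worker thread; raise if it exceeds `timeout` seconds.

    Note: the underlying call is not killed — it runs to completion in the
    background, and pool shutdown waits for it. Prefer timeouts native to the
    client library (e.g. requests' timeout parameter) when available.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            raise TimeoutError(f"call exceeded {timeout}s")
```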

Production Deployment

For production, run Celery as a system service:

Systemd Service (Linux)

Create /etc/systemd/system/celery.service:
[Unit]
Description=Celery Worker
After=network.target redis-server.service

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/var/www/fkapi
Environment="PATH=/var/www/fkapi/venv/bin"
ExecStart=/var/www/fkapi/venv/bin/celery -A fkapi worker \
    --loglevel=info \
    --pool=threads \
    --concurrency=4 \
    --logfile=/var/log/celery/worker.log
Restart=always

[Install]
WantedBy=multi-user.target

Type=simple is used because the worker stays in the foreground; Type=forking would require starting it with --detach.

Enable and start:
sudo systemctl enable celery
sudo systemctl start celery
sudo systemctl status celery

Docker Deployment

See the Docker guide for containerized deployment with docker-compose.

Best Practices

Task design:
  • Keep tasks small and focused
  • Make tasks idempotent (safe to retry)
  • Handle errors gracefully
  • Add logging for debugging
  • Set task timeouts
  • Configure task retries
  • Validate task inputs
  • Use result backends for important tasks

Operations:
  • Use appropriate concurrency settings
  • Use task routing for different priorities
  • Implement rate limiting for external APIs
  • Profile slow tasks
  • Monitor Redis memory usage
  • Monitor task success rates
  • Set up alerting for failures

Security and reliability:
  • Back up Redis data in production
  • Use strong Redis passwords
  • Limit Redis network access
  • Use encrypted connections in production
  • Keep Celery and Redis updated
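The first two practices can be illustrated together: a task that records a completion key is safe to retry. A sketch (the names are invented; in real code the key would live in the database or Redis, not process memory):

```python
# Hypothetical idempotency guard; _completed stands in for a durable store.
_completed = set()

def scrape_once(userid):
    """Idempotent task body: retries for the same user become no-ops."""
    if userid in _completed:
        return "skipped"
    # ... perform the actual scrape here ...
    _completed.add(userid)
    return "scraped"
```

With this shape, a Celery retry after a transient failure cannot double-scrape a user that already succeeded.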

Next Steps