URLs site with status filter, refactoring, django-tasks-scheduler low high priority queues

Luciano Gervasoni
2025-03-25 21:44:26 +01:00
parent 24b4614049
commit 9d2550b374
9 changed files with 111 additions and 55 deletions

@@ -4,7 +4,6 @@ conda create -n matitos_urls python=3.12
conda activate matitos_urls
# Core
pip install django psycopg[binary] django-redis django-tasks-scheduler
# django-rq
# Fetcher
pip install feedparser python-dateutil newspaper4k lxml[html_clean] googlenewsdecoder gnews duckduckgo_search GoogleNews
# News visualization
@@ -88,33 +87,41 @@ RQ_DEFAULT_TIMEOUT=${RQ_DEFAULT_TIMEOUT:-900}
RQ_DEFAULT_RESULT_TTL=${RQ_DEFAULT_RESULT_TTL:-3600}
```
* Django DB
* Deploy
```
# Generate content for models.py
python manage.py inspectdb
# Migrations
python manage.py makemigrations api; python manage.py migrate --fake-initial
# Create user
python manage.py createsuperuser
```
* Deploy
```
# Server
# 1) Server
python manage.py runserver
# Workers
# python manage.py rqworker high default low
# 2) Workers
python manage.py rqworker high default low
# Visualize DB
http://localhost:8080/?pgsql=matitos_db&username=supermatitos&db=matitos&ns=public&select=urls&order%5B0%5D=id
```
* Scheduled tasks
```
Names: Fetch Feeds, Fetch Parser, Fetch Search
Callable: api.tasks.fetch_feeds, api.tasks.fetch_parser, api.tasks.fetch_search
Task type: Repeatable task (or cron...)
Queue: Default
Interval: 15min, 2h, 30min
Names: Process raw URLs, Process error URLs, Process MissingKids URLs
Callable: api.tasks.process_raw_urls, api.tasks.process_error_urls, api.tasks.process_missing_kids_urls_50
Task type: Repeatable task (or cron...)
Queue: Low, Low, Default
Interval: 1h, 4h, 2h
```
* Utils
```
python manage.py rqstats
python manage.py rqstats --interval=1 # Refreshes every second
python manage.py rqstats --json # Output as JSON
python manage.py rqstats --yaml # Output as YAML
```
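* Scheduled tasks as data (sketch)

The scheduled-task table in this README could also be kept as plain data, which makes the queue assignments and intervals easier to review. The following is a minimal sketch and not part of the project: `ScheduledTask`, `parse_interval`, and `tasks_by_queue` are hypothetical helper names, and the lowercase queue names are assumed to match the `rqworker high default low` arguments. It does not call django-tasks-scheduler; the callables, queues, and intervals are copied from the table.
```python
from dataclasses import dataclass
from datetime import timedelta
import re


@dataclass(frozen=True)
class ScheduledTask:
    """One row of the scheduled-task table (hypothetical helper type)."""
    name: str
    callable_path: str
    queue: str
    interval: timedelta


# Maps the unit suffixes used in the README to timedelta keyword names.
_UNITS = {"min": "minutes", "h": "hours"}


def parse_interval(text: str) -> timedelta:
    """Convert strings such as '15min' or '2h' into a timedelta."""
    match = re.fullmatch(r"(\d+)(min|h)", text.strip())
    if match is None:
        raise ValueError(f"Unsupported interval: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


# Values copied from the README table; queue names lowercased (assumption).
TASKS = [
    ScheduledTask("Fetch Feeds", "api.tasks.fetch_feeds", "default", parse_interval("15min")),
    ScheduledTask("Fetch Parser", "api.tasks.fetch_parser", "default", parse_interval("2h")),
    ScheduledTask("Fetch Search", "api.tasks.fetch_search", "default", parse_interval("30min")),
    ScheduledTask("Process raw URLs", "api.tasks.process_raw_urls", "low", parse_interval("1h")),
    ScheduledTask("Process error URLs", "api.tasks.process_error_urls", "low", parse_interval("4h")),
    ScheduledTask("Process MissingKids URLs", "api.tasks.process_missing_kids_urls_50", "default", parse_interval("2h")),
]


def tasks_by_queue(tasks):
    """Group task names by the queue they are assigned to."""
    grouped = {}
    for task in tasks:
        grouped.setdefault(task.queue, []).append(task.name)
    return grouped


if __name__ == "__main__":
    for queue, names in tasks_by_queue(TASKS).items():
        print(f"{queue}: {', '.join(names)}")
```
Keeping the table in one place like this would let a review catch a task assigned to a queue that no worker listens on; the tasks themselves still have to be registered through the scheduler's admin UI as described above.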