Dockerization, whitenoise serving static, refactor

Luciano Gervasoni
2025-04-04 10:53:16 +02:00
parent 5addfa5ba9
commit 4dbe2e55ef
39 changed files with 708 additions and 1238 deletions

@@ -5,6 +5,14 @@
- Fetch parsing URL host
- Fetch from RSS feed
- Fetch searching (Google search & news, DuckDuckGo, ...)
++ Sources -> Robustness to TooManyRequests block
- Selenium based
- Sites change their logic, request captcha, ...
- Brave Search API
- Free up to X requests per day. Requires associating a credit card (no charges)
- Bing API
- Subscription required
- Yandex. No API?
- Process URLs -> Updates raw URLs
- Extracts title, description, content, image and video URLs, main image URL, language, keywords, authors, tags, published date
- Determines whether the content is a valid article
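
One way the "Robustness to TooManyRequests block" item above could be approached is to retry a rate-limited source with exponential backoff and then fall back to the next source. The sketch below is illustrative only: `TooManyRequests`, `search_with_fallback`, and the idea that each source is a plain callable returning a list of URLs are assumptions, not names taken from this repository.

```python
import random
import time
from typing import Callable, Iterable, List


class TooManyRequests(Exception):
    """Hypothetical error a search source raises when rate limited (e.g. HTTP 429)."""


def search_with_fallback(
    query: str,
    sources: Iterable[Callable[[str], List[str]]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Try each source in order, retrying with backoff when it is rate limited."""
    for source in sources:
        for attempt in range(max_retries):
            try:
                return source(query)
            except TooManyRequests:
                if attempt < max_retries - 1:
                    # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
                    delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
                    sleep(delay)
        # This source stayed blocked after all retries; try the next one.
    return []
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting. The order of `sources` expresses preference, for example a free source first and a paid API last.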