Matitos

  • Scheduled tasks
    • Fetcher -> Inserts raw URLs
      • Fetch by parsing the URL host
      • Fetch from RSS feeds
      • Fetch by searching (Google Search and Google News, DuckDuckGo, ...)
    • Process URLs -> Updates raw URLs
      • Extracts title, description, content, image and video URLs, main image URL, language, keywords, authors, tags, published date
      • Determines whether the page contains valid article content
    • Valid URLs
      • Generate summary
      • Classification
        • 5W: Who, What, When, Where, Why of a Story
        • Related to child abuse?
        • ...
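
The fetch and process steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual implementation: the `UrlRecord` class, the `raw`/`valid`/`invalid` status names, the in-memory `store` dictionary and the `min_chars` validity rule are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class UrlRecord:
    """Hypothetical row for one URL; status names are assumptions."""
    url: str
    status: str = "raw"  # raw -> valid | invalid
    title: str = ""
    description: str = ""


def urls_from_rss(rss_xml: str) -> list[str]:
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = (item.findtext("link", default="") for item in root.iter("item"))
    return [link.strip() for link in links if link.strip()]


def insert_raw(store: dict[str, UrlRecord], urls: list[str]) -> int:
    """Insert unseen URLs as raw records; return how many were added."""
    added = 0
    for url in urls:
        if url not in store:
            store[url] = UrlRecord(url)
            added += 1
    return added


class MetaParser(HTMLParser):
    """Collects the <title> text and the meta description of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") == "description":
                self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def process(record: UrlRecord, html: str, min_chars: int = 200) -> UrlRecord:
    """Fill in metadata and update the status of a raw record.

    The validity rule (non-empty title and a minimum page length) is a
    placeholder assumption, not the project's real check.
    """
    parser = MetaParser()
    parser.feed(html)
    record.title = parser.title.strip()
    record.description = parser.description.strip()
    is_valid = bool(record.title) and len(html) >= min_chars
    record.status = "valid" if is_valid else "invalid"
    return record
```

Keying the store by URL means a second fetch of the same feed inserts nothing new, so a scheduled fetcher can be re-run safely. Downloading the pages, and extracting the remaining fields (language, keywords, authors, published date), is left out of the sketch.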

Reference for the 5W scheme: Georgia Institute of Technology, https://comm.gatech.edu (resources, writers)

  • Visualization of URLs

    • Filter URLs
      • By status, search, source, language
    • Charts
  • Content generation

    • Select URLs:
      • Valid content
      • language=en
      • published_date during last_week
      • Use classifications
    • Merge summaries, ...
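
The selection step above amounts to a filter followed by a merge. The sketch below is one possible reading, not the project's code: the `Article` fields, the `valid` status value, the interpretation of `last_week` as the trailing seven days, and the newest-first plain-text merge are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Article:
    """Hypothetical processed URL with its summary and classification labels."""
    url: str
    status: str
    language: str
    published_date: datetime
    summary: str
    labels: set[str] = field(default_factory=set)


def select_articles(articles, now, language="en", days=7,
                    required_labels=frozenset()):
    """Keep valid articles in one language published in the last `days` days.

    `required_labels` stands in for "use classifications": an article is
    kept only if it carries every requested label.
    """
    cutoff = now - timedelta(days=days)
    return [
        a for a in articles
        if a.status == "valid"
        and a.language == language
        and cutoff <= a.published_date <= now
        and required_labels <= a.labels
    ]


def merge_summaries(articles) -> str:
    """Join summaries newest first, each followed by its source URL."""
    ordered = sorted(articles, key=lambda a: a.published_date, reverse=True)
    return "\n\n".join(f"{a.summary} ({a.url})" for a in ordered)
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the selection reproducible when the task is re-run or tested.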