Pushover and PodPing from RSS

25 May, 2021 08:00AM ยท 4 minute read

In my efforts to support the Podcasting 2.0 initiative, I thought I should see how easy it was to incorporate their new PodPing concept, which is effectively a distributed RSS notification system specifically tailored for Podcasts. The idea is that when a new episode goes live, you notify the PodPing server and it then adds that notification to the distributed Hive blockchain system and then any apps can simply watch the blockchain and this can trigger the download of the new episode in the podcast client.

This has come predominantly from their attempts to leverage existing technology in WebSub, however when I tried the WebSub angle a few months ago, the results were very disappointing with many minutes, hours passing before a notification was seen and in some cases it wasn’t seen at all.

I leveraged parts of an existing Python script I’ve been using for years for my RSS social media poster, but stripped it down to the bare minimum. It consists of two files, checkfeeds.py (which just creates an instance of the RssChecker class) and then the actual code is in rss.py.

This beauty of this approach is that it will work on ANY site’s RSS target. Ideally if you have a dynamic system you could trigger the GET request on an episode posting event, however since my sites are statically generated and the posts are created ahead of time (and hence don’t appear until the site builder coincides with a point in time after that post is set to go live) it’s problematic to create a trigger from the static site generator.

Whilst I’m an Electrical Engineer, I consider myself a software developer of many different languages and platforms, but for Python I see myself more of a hacker and a slasher. Yes, there are better ways of doing this. Yes, I know already. Thanks in advance for keeping that to yourself.

Both are below for your interest/re-use or otherwise:

from rss import RssChecker

rssobject=RssChecker()

checkfeeds.py

CACHE_FILE = '<Cache File Here>'
CACHE_FILE_LENGTH = 10000
POPULATE_CACHE = 0
RSS_URLS = ["https://RSS FEED URL 1/index.xml", "https://RSS FEED URL 2/index.xml"]
TEST_MODE = 0
PUSHOVER_ENABLE = 0
PUSHOVER_USER_TOKEN = "<TOKEN HERE>"
PUSHOVER_API_TOKEN = "<TOKEN HERE>"
PODPING_ENABLE = 0
PODPING_AUTH_TOKEN = "<TOKEN HERE>"
PODPING_USER_AGENT = "<USER AGENT HERE>"

from collections import deque
import feedparser
import os
import os.path
import pycurl
import json
from io import BytesIO

class RssChecker():
    feedurl = ""

    def __init__(self):
        '''Initialise'''
        self.feedurl = RSS_URLS
        self.main()
        self.parse()
        self.close()

    def getdeque(self):
        '''return the deque'''
        return self.dbfeed

    def main(self):
        '''Main of the FeedCache class'''
        if os.path.exists(CACHE_FILE):
            with open(CACHE_FILE) as dbdsc:
                dbfromfile = dbdsc.readlines()
            dblist = [i.strip() for i in dbfromfile]
            self.dbfeed = deque(dblist, CACHE_FILE_LENGTH)
        else:
            self.dbfeed = deque([], CACHE_FILE_LENGTH)

    def append(self, rssid):
        '''Append a rss id to the cache'''
        self.dbfeed.append(rssid)

    def clear(self):
        '''Append a rss id to the cache'''
        self.dbfeed.clear()

    def close(self):
        '''Close the cache'''
        with open(CACHE_FILE, 'w') as dbdsc:
            dbdsc.writelines((''.join([i, os.linesep]) for i in self.dbfeed))

    def parse(self):
        '''Parse the Feed(s)'''
        if POPULATE_CACHE:
            self.clear()
        for currentfeedurl in self.feedurl:
            currentfeed = feedparser.parse(currentfeedurl)

            if POPULATE_CACHE:
                for thefeedentry in currentfeed.entries:
                    self.append(thefeedentry.get("guid", ""))
            else:
                for thefeedentry in currentfeed.entries:
                    if thefeedentry.get("guid", "") not in self.getdeque():
#                        print("Not Found in Cache: " + thefeedentry.get("title", ""))
                        if PUSHOVER_ENABLE:
                            crl = pycurl.Curl()
                            crl.setopt(crl.URL, 'https://api.pushover.net/1/messages.json')
                            crl.setopt(pycurl.HTTPHEADER, ['Content-Type: application/json' , 'Accept: application/json'])
                            data = json.dumps({"token": PUSHOVER_API_TOKEN, "user": PUSHOVER_USER_TOKEN, "title": "RSS Notifier", "message": thefeedentry.get("title", "") + " Now Live"})
                            crl.setopt(pycurl.POST, 1)
                            crl.setopt(pycurl.POSTFIELDS, data)
                            crl.perform()
                            crl.close()

                        if PODPING_ENABLE:
                            crl2 = pycurl.Curl()
                            crl2.setopt(crl2.URL, 'https://podping.cloud/?url=' + currentfeedurl)
                            crl2.setopt(pycurl.HTTPHEADER, ['Authorization: ' + PODPING_AUTH_TOKEN, 'User-Agent: ' + PODPING_USER_AGENT])
                            crl2.perform()
                            crl2.close()

                        if not TEST_MODE:
                            self.append(thefeedentry.get("guid", ""))

rss.py

The basic idea is:

  1. Create a cache file that keeps a list of all of the RSS entries you already have and are already live
  2. Connect up PushOver (if you want push notifications, or you could add your own if you like)
  3. Connect up PodPing (ask @[email protected] or @[email protected] for a posting API TOKEN)
  4. Set it up as a repeating task on your device of choice (preferably a server, but should work on a Synology, a Raspberry Pi or a VPS)

VPS

I built this initially on my Macbook Pro using the Homebrew installed Python 3 development environment, then installed the same on a CentOS7 VPS I have running as my Origin web server. Assuming you already have Python 3 installed, I added the following so I could use pycurl:

CENTOS

yum install -y openssl-devel
yum install python3-devel
yum group install "Development Tools"
yum install libcurl-devel
python3 -m pip install wheel
python3 -m pip install --compile --install-option="--with-openssl" pycurl

Alpine Linux

apk add openssl-dev
apk add python3-dev
python3 -m pip install --upgrade pip
apk add curl-dev
python3 -m pip install --compile --install-option="--with-openssl" pycurl

Whether you like “pycurl” or not, obviously there are other options but I stick with what works. Rather than refactor for a different library I just jumped through some extra hoops to get pycurl running.

Finally I bridge the checkfeeds.py with a simply bash script wrapper and call it from a CRON Job every 10 minutes.

Job done.

Enjoy.