FeedDirectory.org as a possible alternative for gpodder.net

Hello All,

There is a new service for synchronizing RSS feed subscriptions and episode positions: https://feeddirectory.org/

It is still in its early stages, but it might be worth following. If it matures, it could be an alternative to gpodder.net.

Best Regards,
Guilherme

Sounds pretty nice. They even plan to open-source the code: see “Building feeddirectory.org: We're live, kinda”. I really like the fact that they use hashing, so they do not need to store the email address.

A few things I am missing before we can integrate:

  • Get the state of multiple episodes at once using the API
  • Get all episode/subscription changes after a specific date (otherwise, we need to request every single episode of every single subscription on each sync)
  • Private subscriptions (I think the API can currently query who is subscribed to which feed, which I consider a privacy issue)
  • Clarification on how they deal with URL redirects (do they keep the old URL or do they switch? What happens to episode states if they switch?)

All of these shouldn’t be an issue… give me a week or so and I’ll have a solution for at least the first 3 points.

With regard to the “URL redirects” - feeddirectory.org doesn’t perform any processing or fetching from the URLs themselves. The URLs are only used to identify unique feeds, so I’m not sure how this is relevant… maybe I’m misunderstanding the requirement, though.

Okay, so FeedDirectory (unlike gpodder) is just a database that does not do any feed parsing by itself. Then the fourth point is no longer relevant :slight_smile:

To clarify my message above: the feature “get state of multiple episodes” would be used for the initial sync - so basically downloading the complete information stored for one user.

@stucoates First of all thanks for working on this - it’s a great initiative. I have two questions:

  • Since you don’t do any processing or fetching URLs, I’m wondering: do you ‘aggregate’ data in a way? As in: are you able to provide/develop a list of popular feeds (average for the whole userbase)?
  • And a related question, are the two datasets (‘the things you publish’ and ‘the things you consume’) separate or linked somehow?

I’m asking because of the following scenario:
I as an AntennaPod user follow xyz.com/rss, and the feed changes address (to abc.com/rss) with a nice 301/308 redirect on the server. Because of the redirect, AntennaPod updates the feed it follows. However, the person owning/hosting the feed forgot to update their entry on feeddirectory.org, which thus stays xyz.com/rss. My AntennaPod installation and the feeddirectory.org entry are then ‘conflicting’.

(Also, but you might label this ‘out of scope’, it would be cool to get data on which feeds are popular, so that relational recommendations (‘those subscribed to x are subscribed to y’) could be presented, for example, directly in AntennaPod for those that sync with your service.)

There’s no API at the moment to do any “top 20” type of thing, but it’s pretty trivial to do. I’d have to think about whether there are any possible privacy issues relating to this, but I suppose if it just uses public subscriptions and ignores private ones then there wouldn’t be an issue.

The published items and the subscriptions are held separately but are joined by the account, so “those subscribed to x are subscribed to y” is perfectly possible. As before, though, I’ll have to think about any possible privacy issues.

The automatic upgrading of links in feeddirectory.org is something I’m thinking about for when I start building the “health check” backend… if a 301/308 is returned when the crawler visits a link, it’s pretty easy to update everything. Whether this is the correct thing to do will require some thought.

Here’s a preview of the APIs that’ll be rolling out this week to “fix” the things mentioned in this thread: FeedDirectory.org

I still need to do some testing prior to putting the code live, but thought I’d give you an early look into my thinking about how things should be working with FD.

Hope this all makes sense - please let me know if there’s anything too confusing.

Thanks for the quick reply :slight_smile:

Do I correctly understand that while you note there’s a link between the two:

there’s no risk associated with the ‘conflict’ between client and server?

Nice that a top x and relational recommendations are technically possible; understandably, privacy is a concern. From a user perspective I would keep my subscriptions private, but I wouldn’t mind making them usable for ‘the algorithm’ to determine relational recommendations (provided that there’s no way to de-anonymise my data, e.g. if no results are returned until there are 25 or 50 users subscribed to a given feed).
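To make the threshold idea concrete, here is a minimal sketch. It is not how FeedDirectory works; the function name, the data layout (a mapping from user id to a set of feed URLs) and the default of 25 are all assumptions made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

MIN_SUBSCRIBERS = 25  # hypothetical threshold, taken from the "25 or 50 users" suggestion


def co_subscriptions(
    subscriptions: Dict[str, Set[str]],
    feed: str,
    min_subscribers: int = MIN_SUBSCRIBERS,
) -> List[Tuple[str, int]]:
    """Return (other_feed, count) pairs for users who follow `feed`.

    Nothing is returned while `feed` has fewer than `min_subscribers`
    subscribers, and a recommended feed is only listed if at least
    `min_subscribers` of those users follow it as well.
    """
    subscribers = [feeds for feeds in subscriptions.values() if feed in feeds]
    if len(subscribers) < min_subscribers:
        return []
    counts = Counter(
        other for feeds in subscribers for other in feeds if other != feed
    )
    return [(other, n) for other, n in counts.most_common() if n >= min_subscribers]
```

With a threshold like this, a feed that only a handful of people follow never shows up in a recommendation, so a single user's subscription list cannot be read back out of the results.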

Just to recap ByteHamster’s comments:

  • See above, using lookback-seconds

About the privacy/permissions: I understand from the docs that subscription privacy has two switches: private and follower-only. However, they seem mutually exclusive to me, so I would rather expect one switch with three possible values: public, followers only, private (as you suggested in your reply on Mastodon).

Also, as a user I probably wouldn’t set this on a per-feed level, but rather for my account as a whole. Would it make sense to add POST/GET calls for the account level, and that this account value is taken as default for all newly added feeds?

Thanks, I just had a quick look at it. Wouldn’t specifying a timestamp be more precise than lookback-seconds? What if the request somehow takes a few seconds to be processed and returned? Also, both you and the clients need to do date calculations (for example when using paged loading), whereas with a simple timestamp we can just pass the most recent timestamp +1 to get the next page.
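A rough sketch of the difference between the two approaches, from the client side. Only the parameter name lookback-seconds comes from the preview docs; the `modified` field, the integer-seconds resolution and the 5-second margin are assumptions for illustration.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def lookback_seconds(last_sync: datetime, now: datetime, margin: int = 5) -> int:
    """Offset to send so the server-side window still reaches back to `last_sync`.

    The server subtracts the offset from its own clock when the request
    arrives, so `margin` pads the window against transit and processing delay.
    """
    elapsed = (now - last_sync).total_seconds()
    return max(0, math.ceil(elapsed)) + margin


def next_cursor(page: List[Dict]) -> Optional[int]:
    """Timestamp variant: continue right after the newest item already seen."""
    if not page:
        return None
    return max(item["modified"] for item in page) + 1


last_sync = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
now = last_sync + timedelta(seconds=90)
print(lookback_seconds(last_sync, now))  # 95
```

One thing to watch with the "+1" trick: if timestamps only have second resolution and two items share the same second across a page boundary, the second one would be skipped.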

I think the client can handle this :slight_smile:

Sure, but if in AP there is a ‘global’ setting that determines the behaviour of individual requests, the behaviour would be different on the client (AP) and the server’s web interface (or another client) - or I would have to change that setting on every client (incl. web), assuming those other clients offer such a setting at all. That doesn’t feel very logical.

I’m not sure I understand the concern around a possible conflict between published items and subscriptions. They are held separately but against the same account - different APIs also.

I’d also thought about having thresholds around inclusion in any data aggregation algorithms… will have a think about what the best approach is here once I have some real-life data to look at.

Completely understand the possible confusion here. The private and follower-only switches are kind of mutually exclusive in this case - the “most private” case will win (e.g. private overrules follower-only). There’s a reason why I’ve done the implementation like this - it makes adding extra switches a lot easier at a later date. App developers can decide how they surface the settings and flip the appropriate bits on the FD side. It’s possible to have a higher-level API that flips multiple switches in one shot, but there’s a danger of API bloat if too many of those exist.
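For app developers, the “most private wins” rule could be wrapped roughly like this. It is only a sketch: the enum and the function names are invented here, and the dictionary keys `private` and `follower-only` are an assumption about how the switches are named in the API.

```python
from enum import IntEnum
from typing import Dict


class Visibility(IntEnum):
    # Ordered from least to most private so that max() picks the winner.
    PUBLIC = 0
    FOLLOWERS_ONLY = 1
    PRIVATE = 2


def effective_visibility(private: bool, follower_only: bool) -> Visibility:
    """Collapse the two independent switches into one value; most private wins."""
    candidates = [Visibility.PUBLIC]
    if follower_only:
        candidates.append(Visibility.FOLLOWERS_ONLY)
    if private:
        candidates.append(Visibility.PRIVATE)
    return max(candidates)


def switches_for(visibility: Visibility) -> Dict[str, bool]:
    """Translate a single three-way UI setting back into the two switches."""
    return {
        "private": visibility is Visibility.PRIVATE,
        "follower-only": visibility is Visibility.FOLLOWERS_ONLY,
    }
```

This way a client can show one three-way setting (public, followers only, private) while the service keeps separate switches that are easy to extend later.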

There will (soon) be switches at the account level for defaulting the subscription/publish settings so that, for example, all new subscriptions are private (or follower-only) by default.

The reason I’ve gone for an offset from current time is that I’ve had so many issues in the past (far too many years to mention) with people (“developers”) not understanding the concept of timezones, let alone the ability to specify a correctly formatted timestamp, so for the sake of my own sanity I’ve removed that from the equation completely. The FD service will work at UTC, and the apps can work with whatever they want, as long as they remain consistent.

If there’s enough demand I’ll add the option to supply a timestamp instead, but maybe with the caveat that if you F it up you’re on your own. :wink:

Yeah, I already suspected timezones to be the problem. How about returning the timestamp in the server responses? Then a client can just take that same timestamp when synchronizing next time.

Yep, good idea - I could return the server timestamp in all API responses. The only problem I can see here is that a lot can happen at the same time. If the client relies on the timestamp being the exact point at which the data is read, it’s possible for a parallel write to have a timestamp before the read but not yet be visible to the read, due to the way RDBMS transaction isolation works… only a matter of milliseconds, but still possible.

So, whatever I do, it’s probably safer for clients to assume that they may receive duplicate items if they’re being retrieved based on the time. A “safer” way would be to provide a serial event number where the client says “last event I received is 12345, send me everything from event 12346”, but this means a greater amount of smarts on the client, a more complex API, and more complexity on the server to assign sequential event IDs.
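On the client side, “assume you may receive duplicates” mostly means making the merge idempotent. A minimal sketch, assuming each change carries `feed`, `episode` and `modified` fields (these names are hypothetical, not taken from the FD API):

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (feed URL, episode identifier)


def apply_changes(state: Dict[Key, dict], changes: Iterable[dict]) -> Dict[Key, dict]:
    """Merge a batch of changes into local state.

    A change only replaces what is stored if it is at least as new, so
    receiving the same item twice, or an older copy after a newer one,
    leaves the state untouched.
    """
    for change in changes:
        key = (change["feed"], change["episode"])
        known = state.get(key)
        if known is None or change["modified"] >= known["modified"]:
            state[key] = change
    return state
```

With a merge like this, the client can deliberately request a slightly overlapping time window without worrying about applying something twice.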

I’ll go ahead with the timestamp in API responses and the ability to send a timestamp instead of “lookback-seconds” and see how it goes.

Have just put the new code live on feeddirectory.org, along with a short blogpost: Building feeddirectory.org: Permissions, Followers, Comments

I think I’ve covered all of the features that have been requested here and would really appreciate any further feedback. Oh, and yes @ByteHamster, it now supports timestamps. :wink:

Sadly, I don’t own an Android device to enable me to test any AP integration, but will assist where I can with getting things up and running.

Oops, forgot to mention… as part of each API response there’s now a “server-time” component - this is the database transaction time - which should help with any subsequent “modified-since” requests. Use with the caveat I mentioned earlier about updates happening in parallel with the read being invisible.
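A sketch of how a client might carry “server-time” from one sync to the next. The `fetch` callable stands in for the real HTTP request, and the response shape (a `changes` list next to `server-time`, both as shown here) and the 2-second overlap are assumptions, not the documented API.

```python
from typing import Callable, List, Optional, Tuple

OVERLAP_SECONDS = 2  # hypothetical margin for writes committed in parallel with the read


def sync_once(
    fetch: Callable[[Optional[int]], dict],
    last_server_time: Optional[int],
) -> Tuple[List[dict], int]:
    """Run one sync and return (changes, server time to remember for next time).

    `fetch(since)` is expected to return everything modified since `since`,
    or everything stored for the user when `since` is None (initial sync).
    """
    if last_server_time is None:
        since = None
    else:
        # Step back a little so a write that was invisible last time is picked up.
        since = max(0, last_server_time - OVERLAP_SECONDS)
    response = fetch(since)
    return response["changes"], response["server-time"]
```

Because the window is stepped back on purpose, some items will arrive more than once, so the client still has to tolerate duplicates.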

I guess it’s time to create an issue on GitHub for the technical implementation. I’m wondering though, @stucoates, do you plan to make a web interface for people to create & manage accounts?
I can create mock-ups for how that could work in AntennaPod, but I feel that account creation is out of scope (for gpodder.net we also only offer ‘Login’).

Hold off doing any design/implementation for a few days. I’ve had a discussion with another app developer about how these things can work and it may be that FD issues unique tokens for an app/user combination - this way a secret reset will not result in the need to re-authenticate everything that’s connected to an account.

For this, yes I will be building a simple account management UI and associated new APIs.

I’ll get back to you once this is done - should be a few days before I’ve made the decision and built the APIs, the UI will probably be done at some point next week.
