Sounds pretty nice. They even plan to open-source the code: "Building feeddirectory.org: We're live, kinda". I really like the fact that they use hashing, so they do not need to store the email address.
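Just to illustrate what I mean by that (a minimal sketch in Kotlin, assuming a plain SHA-256 over the normalised address; I don't know the exact scheme feeddirectory.org uses):

```kotlin
import java.security.MessageDigest

// Minimal sketch of hashing an e-mail address instead of storing it.
// SHA-256 over the trimmed, lower-cased address is an assumption here;
// the actual scheme used by feeddirectory.org may differ.
fun hashEmail(email: String): String {
    val normalised = email.trim().lowercase()
    val digest = MessageDigest.getInstance("SHA-256")
        .digest(normalised.toByteArray(Charsets.UTF_8))
    return digest.joinToString("") { "%02x".format(it) }
}

fun main() {
    // Same hash regardless of case or surrounding whitespace
    println(hashEmail(" Someone@Example.org "))
}
```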
A few things I am missing before we can integrate:
1. Get the state of multiple episodes at once using the API
2. Get all episode/subscription changes after a specific date (otherwise, we need to request every single episode of every single subscription on each sync); see the sketch after this list
3. Private subscriptions (I think the API can currently query who is subscribed to which feed, which I consider a privacy issue)
4. Clarification on how they deal with URL redirects (do they keep the old URL or do they switch? What happens to episode states if they switch?)
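To make the first two points a bit more concrete, here is roughly the kind of call I have in mind. Everything below (base URL, paths, parameter names) is made up for illustration and not part of any existing FD API:

```kotlin
import java.net.URI
import java.net.URLEncoder
import java.time.Instant

// Hypothetical request builders for points 1 and 2 above. The base URL,
// endpoint paths and parameter names are all invented for illustration;
// they are not part of any existing FeedDirectory API.
fun batchEpisodeStateUri(base: String, guids: List<String>): URI {
    // Point 1: fetch the state of many episodes in a single round trip
    val joined = guids.joinToString(",") { URLEncoder.encode(it, "UTF-8") }
    return URI.create("$base/episodes/state?guids=$joined")
}

fun changesSinceUri(base: String, since: Instant): URI =
    // Point 2: only fetch what changed after a given point in time
    URI.create("$base/changes?since=${since.epochSecond}")

fun main() {
    val base = "https://feeddirectory.example/api/v1" // placeholder, not the real base URL
    println(batchEpisodeStateUri(base, listOf("guid-1", "guid-2")))
    println(changesSinceUri(base, Instant.parse("2024-01-01T00:00:00Z")))
}
```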
All of these shouldn't be an issue… give me a week or so and I'll have a solution for at least the first 3 points.
With regards to the "URL redirects" - feeddirectory.org doesn't perform any processing or fetching from the URLs themselves, the URLs are used to identify unique feeds so I'm not sure how this is relevant… maybe I'm misunderstanding the requirement though.
Okay, so FeedDirectory (unlike gpodder) is just a database that does not do any feed parsing by itself. Then the fourth point is no longer relevant.
To clarify my message above: The feature "get state of multiple episodes" would be used for the initial sync - so basically downloading the complete information stored for one user.
@stucoates First of all thanks for working on this - it's a great initiative. I have two questions:
Since you don't do any processing or fetching URLs, I'm wondering: do you "aggregate" data in a way? As in: are you able to provide/develop a list of popular feeds (average for the whole userbase)?
And a related question, are the two datasets ("the things you publish" and "the things you consume") separate or linked somehow?
I'm asking because of the following scenario:
I as an AntennaPod user follow xyz.com/rss, and the feed changes address (to abc.com/rss) with a nice 301/308 redirect on the server. Because of the redirect AntennaPod updates the feed it follows. However, the person owning/hosting the feed forgot to update their entry on feeddirectory.org, which thus stays xyz.com/rss. My AntennaPod installation and the feeddirectory.org entry are then "conflicting".
(Also, but you might label this "out of scope", it would be cool to get data on which feeds are popular, so that relational recommendations ("those subscribed to x are subscribed to y") could be presented for example directly in AntennaPod for those that sync with your service.)
There's no API at the moment to do any "top 20" type of thing but it's pretty trivial to do. I'd have to think if there are any possible privacy issues relating to this, but I suppose if it just uses public subscriptions and ignores private ones then there wouldn't be an issue.
The published feeds and subscriptions are held separately but are joined by the account, so "those subscribed to x are subscribed to y" is perfectly possible, but as before, I'll have to think about any possible privacy issues.
The automatic upgrading of links in feeddirectory.org is something I'm thinking about when I start building the "health check" backend… if a 301/308 is returned when the crawler visits a link, it's pretty easy to update everything. Whether this is the correct thing to do will require some thought.
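Very roughly, that health-check step could look something like this; a sketch only, nothing here reflects code that exists yet:

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Rough sketch of the "health check" idea: if a feed URL answers with a
// permanent redirect (301/308), the stored URL could be upgraded to the target.
// Whether FD should actually do this automatically is still an open question;
// nothing here reflects existing FD code.
fun resolvePermanentRedirect(feedUrl: String): String {
    val conn = URL(feedUrl).openConnection() as HttpURLConnection
    conn.instanceFollowRedirects = false // inspect the redirect instead of silently following it
    conn.requestMethod = "HEAD"
    return try {
        when (conn.responseCode) {
            301, 308 -> conn.getHeaderField("Location") ?: feedUrl // upgrade to the new address
            else -> feedUrl                                        // keep the old address
        }
    } finally {
        conn.disconnect()
    }
}
```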
Here's a preview of the APIs that'll be rolling out this week to "fix" the things mentioned in this thread: FeedDirectory.org
I still need to do some testing prior to putting the code live, but thought I'd give you an early look into my thinking about how things should be working with FD.
Hope this all makes sense - please let me know if there's anything too confusing.
Do I correctly understand that while you note there's a link between the two:
there's no risk associated with the "conflict" between client and server?
Nice that a top x and relational recommendations are technically possible; understandably privacy is a concern. From a user perspective I would keep my subscriptions private, but I wouldn't mind making them usable for "the algorithm" to determine relational recommendations (provided that there's no way to de-anonymise my data, e.g. if no results are returned until there are 25 or 50 users subscribed to a given feed).
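To illustrate what I mean by that threshold (a purely hypothetical sketch, data shapes invented):

```kotlin
// Purely illustrative: leave a feed out of any aggregated / recommendation output
// until enough distinct users subscribe to it, so individual users cannot be
// singled out. The data shape is invented; 25 is just the example value from above.
data class FeedStats(val feedUrl: String, val subscriberCount: Int)

fun aggregatable(stats: List<FeedStats>, minSubscribers: Int = 25): List<FeedStats> =
    stats.filter { it.subscriberCount >= minSubscribers }

fun main() {
    val stats = listOf(
        FeedStats("https://xyz.example/rss", 3),    // too few subscribers, never surfaces
        FeedStats("https://abc.example/rss", 120)   // enough subscribers, can be recommended
    )
    println(aggregatable(stats).map { it.feedUrl })
}
```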
About the privacy/permissions: I understand from the docs that subscription privacy has two switches: private and follower-only. However, they seem mutually exclusive to me, so I would rather expect one switch with three possible values: public, followers only, private (as you suggested in your reply on Mastodon).
Also, as a user I probably wouldn't set this on a per-feed level, but rather for my account as a whole. Would it make sense to add POST/GET calls for the account level, so that this account-level value is taken as the default for all newly added feeds?
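Something like this is what I have in mind, as an illustration only; the names are invented, not taken from the FD docs:

```kotlin
// Hypothetical model for the account-wide default suggested above; none of these
// names come from the FD API, they only illustrate "new feeds inherit the account value".
enum class Visibility { PUBLIC, FOLLOWERS_ONLY, PRIVATE }

data class Account(val id: String, val defaultVisibility: Visibility)
data class Subscription(val feedUrl: String, val visibility: Visibility)

fun subscribe(account: Account, feedUrl: String, perFeedVisibility: Visibility? = null) =
    Subscription(feedUrl, perFeedVisibility ?: account.defaultVisibility) // per-feed override still possible

fun main() {
    val me = Account("acct-1", Visibility.PRIVATE)
    println(subscribe(me, "https://xyz.example/rss")) // PRIVATE, inherited from the account default
}
```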
Thanks, I just had a quick look at it. Wouldn't specifying a timestamp be more precise than lookback-seconds? What if the request somehow takes a few seconds to be processed and returned? Also, both you and the clients need to do date calculations (for example when using paged loading), whereas with a simple timestamp, we can just pass the most recent timestamp +1 to get the next page.
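For illustration, paging with a timestamp could be as simple as this (a sketch; the types and parameter names are invented):

```kotlin
import java.time.Instant

// Sketch of the paging idea described above: the client feeds the newest timestamp
// it has seen, plus one second, back into the next request. The Change type and the
// request shape are invented for illustration.
data class Change(val feedUrl: String, val timestamp: Instant)

fun nextPageSince(lastPage: List<Change>, previousSince: Instant): Instant =
    lastPage.maxOfOrNull { it.timestamp }?.plusSeconds(1) ?: previousSince

fun main() {
    val page = listOf(Change("https://xyz.example/rss", Instant.parse("2024-03-01T10:15:30Z")))
    // Value to pass as ?since= on the next call: 2024-03-01T10:15:31Z
    println(nextPageSince(page, Instant.EPOCH))
}
```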
Sure, but if in AP there is a "global" setting that determines the behaviour of individual requests, the behaviour would be different on the client (AP) and the server's web interface (or another client) - or I would have to change that setting on every client (incl. web). If these other clients offer such a setting, anyway. Doesn't feel very logical.
I'm not sure I understand the concern around a possible conflict between published items and subscriptions. They are held separately but against the same account - different APIs also.
I'd also thought about having thresholds around inclusion in any data aggregation algorithms… will have a think about what the best approach is here once I have some real life data to look at.
Completely understand the possible confusion here. The private and follower-only switches are kind of mutually exclusive in this case - the "most private" case will win (e.g. private overrules follower-only). There's a reason why I've done the implementation like this - it makes adding extra switches a lot easier at a later date. App developers can decide how they surface the settings and flip the appropriate bits on the FD side. It's possible to have a higher-level API that flips multiple switches in one shot, but there's a danger there of API bloat if too many of those exist.
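As a sketch of how I think about that resolution (names invented, not the actual implementation):

```kotlin
// Sketch of the "most private wins" resolution over the two current switches.
// Field and type names are invented; the point is just that private overrules
// follower-only, and that more switches can be added later without breaking this.
data class SubscriptionFlags(val isPrivate: Boolean, val isFollowerOnly: Boolean)

enum class EffectiveVisibility { PUBLIC, FOLLOWERS_ONLY, PRIVATE }

fun effectiveVisibility(flags: SubscriptionFlags): EffectiveVisibility = when {
    flags.isPrivate      -> EffectiveVisibility.PRIVATE        // most private always wins
    flags.isFollowerOnly -> EffectiveVisibility.FOLLOWERS_ONLY
    else                 -> EffectiveVisibility.PUBLIC
}

fun main() {
    // Both switches set: private overrules follower-only
    println(effectiveVisibility(SubscriptionFlags(isPrivate = true, isFollowerOnly = true)))
}
```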
There will (soon) be switches at the account level for defaulting the subscription/publish settings so that, for example, all new subscriptions are private (or follower-only) by default.
The reason I've gone for an offset from current time is that I've had so many issues in the past (far too many years to mention) with people ("developers") not understanding the concept of timezones let alone the ability to specify a correctly formatted timestamp, so for the sake of my own sanity I've removed that from the equation completely. The FD service will work at UTC, and the apps can work with whatever they want, as long as they remain consistent.
If there's enough demand I'll add the option to supply a timestamp instead, but maybe with the caveat that if you F it up you're on your own.
Yeah, I already suspected timezones to be the problem. How about returning the timestamp in the server responses? Then a client can just take that same timestamp when synchronizing next time.
Yep, good idea - I could return the server timestamp in all API responses. The only problem I can see here is that because a lot can happen at the same time, if the client relies on the timestamp being the exact point that data is read, it's possible for a parallel write to have a timestamp before the read but not yet be visible to the read due to the way RDBMS transaction isolation works… only a matter of milliseconds, but still possible.
So, whatever I do, it's probably safer for clients to assume that they may receive duplicate items if they're being retrieved based on time. A "safer" way would be to provide a serial event number where the client says "last event I received is 12345, send me everything from event 12346", but this means a greater amount of smarts on the client, a more complex API, and more complexity on the server to assign sequential event IDs.
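On the client side, tolerating duplicates could be as simple as merging by a stable id; a sketch, with invented types:

```kotlin
import java.time.Instant

// Sketch of the "clients should tolerate duplicates" point: merge incoming items by
// a stable id, so receiving the same item twice because of the timestamp race is a
// no-op. The Event type and its fields are invented for illustration.
data class Event(val id: String, val timestamp: Instant, val payload: String)

fun mergeEvents(known: Map<String, Event>, incoming: List<Event>): Map<String, Event> =
    known + incoming.associateBy { it.id } // a duplicate simply overwrites itself

fun main() {
    val first = Event("ep-42", Instant.parse("2024-03-01T10:15:30Z"), "position=120")
    val store = mergeEvents(emptyMap(), listOf(first))
    println(mergeEvents(store, listOf(first)).size) // still 1: the duplicate did no harm
}
```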
I'll go ahead with the timestamp in API responses and the ability to send a timestamp instead of "lookback-seconds" and see how it goes.
I think I've covered all of the features that have been requested here and would really appreciate any further feedback. Oh, and yes @ByteHamster, it now supports timestamps.
Sadly, I don't own an Android device to enable me to test any AP integration, but will assist where I can with getting things up and running.
Oops, forgot to mention… as part of each API response there's now a "server-time" component - this is the database transaction time - which should help with any subsequent "modified-since" requests. Use it with the caveat I mentioned earlier about updates that happen in parallel with a read not yet being visible to that read.
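For a client, using it could boil down to something like this (a sketch; only the "server-time" field itself is real, the wrapper types are invented):

```kotlin
import java.time.Instant

// Sketch of how a client could use the new "server-time" component: remember it after
// each sync and send it back as the modified-since value next time. Only "server-time"
// comes from the post above; the wrapper types here are invented.
data class SyncResponse(val serverTime: Instant, val items: List<String>)

class SyncState(var lastServerTime: Instant = Instant.EPOCH)

fun afterSync(state: SyncState, response: SyncResponse) {
    state.lastServerTime = response.serverTime // next request: modified-since = lastServerTime
}

fun main() {
    val state = SyncState()
    afterSync(state, SyncResponse(Instant.parse("2024-03-01T10:15:30Z"), emptyList()))
    println(state.lastServerTime)
}
```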
I guess it's time to create an issue on GitHub for the technical implementation. I'm wondering though, @stucoates, do you plan to make a web interface for people to create & manage accounts?
I can create mock-ups for how that could work in AntennaPod, but I feel that account creation is out of scope (for gpodder.net we also only offer "Login").
Hold off doing any design/implementation for a few days. I've had a discussion with another app developer about how these things can work and it may be that FD issues unique tokens for an app/user combination - this way a secret reset will not result in the need to re-authenticate everything that's connected to an account.
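Conceptually it would be something like this; nothing is final, and all names here are invented:

```kotlin
import java.util.UUID

// Sketch of the "one token per app/user combination" idea: each connected app gets its
// own token, so resetting the account secret (or revoking one app) does not force every
// other app to re-authenticate. This is an assumption about a design that is not built yet.
data class AppToken(val account: String, val appId: String, val token: String)

fun issueToken(account: String, appId: String): AppToken =
    AppToken(account, appId, UUID.randomUUID().toString())

fun main() {
    println(issueToken("acct-1", "antennapod")) // independent of any other app's token
}
```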
For this, yes I will be building a simple account management UI and associated new APIs.
I'll get back to you once this is done - should be a few days before I've made the decision and built the APIs; the UI will probably be done at some point next week.