Convert timestamps in the description to pseudo chapter marks

ghast · February 24, 2021, 7:32am

Some podcasts, like Das Coronavirus-Update von NDR Info (NDR.de), have timestamps in their show notes but, for technical reasons, can’t produce proper chapters.
It would be nice if AntennaPod could fall back to using those timestamps when no standard chapter marks are present.
I know that not every number in a description is a valid chapter mark.

ByteHamster · February 24, 2021, 8:11am

AntennaPod already does that. Try clicking the timestamps.

ghast · February 24, 2021, 8:43am

We can click them, but they are not part of the chapter tab, which IMHO they should be.

ByteHamster · February 24, 2021, 12:07pm

Oh, okay. Now I get it. I don’t think there is any reliable way to parse them. Descriptions are full HTML: the chapters could be in tables, lists, plain text, etc. They could be on one line or spread over multiple lines. The chapter titles may or may not sit next to their timestamps.

The publishers should actually use one of the many ways to specify real chapters (e.g. in the media file, using Podlove chapters, or using Podcasting 2.0 chapters) instead of having apps guess what could be chapters and what not. Even just detecting timestamps is unreliable enough: if there is something like “19:00 Uhr”, it is detected as a timestamp, too.
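
To illustrate that last point, here is a minimal Python sketch. It is not AntennaPod's actual code: the pattern, the function name `find_timestamps`, and the sample notes are all made up for illustration. A plain pattern match cannot tell a broadcast time from a chapter position.

```python
import re

# Hypothetical pattern (an assumption, not AntennaPod's): one or two digits
# followed by one or two ":MM" groups, e.g. 19:00 or 00:01:30.
TIMESTAMP = re.compile(r"\b(\d{1,2})((?::[0-5]\d){1,2})\b")

def find_timestamps(text):
    """Return every substring that looks like MM:SS, HH:MM or HH:MM:SS."""
    return [m.group(0) for m in TIMESTAMP.finditer(text)]

# Made-up show notes: a broadcast time followed by two chapter positions.
notes = "Sendung vom Dienstag, 19:00 Uhr. 00:01:30 Intro, 12:45 Impfstoffe"
print(find_timestamps(notes))  # the broadcast time "19:00" is matched too
```

The match on "19:00" is syntactically identical to the real chapter positions, so the pattern alone cannot filter it out.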

keunes · February 25, 2021, 7:26am

I also thought it would be nice if the shownotes could be parsed and used for the chapter list. Given the technical challenges and limitations, I guess it doesn’t make sense to try to make it work.

ghast · February 25, 2021, 8:20am

How about putting some constraints on it?

The time has to be within the episode’s duration, and there has to be more than one possible timestamp?
That would skip a simple “this show originally aired at 19:00” and similar problems.
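
Roughly, the two constraints could look like this in Python. This is only a sketch of the idea: the function names are made up, and it assumes two-part timestamps mean MM:SS.

```python
def to_seconds(stamp):
    """Convert 'MM:SS' or 'HH:MM:SS' to seconds (two parts are read as MM:SS)."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def plausible_chapters(stamps, duration_seconds):
    """Keep timestamps inside the episode; require more than one of them."""
    in_range = [s for s in stamps if to_seconds(s) <= duration_seconds]
    return in_range if len(in_range) > 1 else []

# A 30-minute episode (1800 s):
print(plausible_chapters(["19:00"], 1800))                    # single stamp is rejected
print(plausible_chapters(["00:00", "12:45", "45:00"], 1800))  # 45:00 is out of range
```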

ByteHamster · February 25, 2021, 9:13am

We already do that. We even guess whether the publisher meant HH:MM or MM:SS (or HH:MM:SS).
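
For readers wondering what such a guess can look like, here is one possible heuristic as a Python sketch. It is an assumption for illustration, not necessarily what AntennaPod does: two-part timestamps are read as HH:MM only if all of them fit inside the episode that way, and as MM:SS otherwise. The name `guess_seconds` is hypothetical.

```python
def guess_seconds(stamps, duration_seconds):
    """Convert timestamps to seconds, guessing HH:MM vs MM:SS for two-part ones."""
    parsed = [[int(p) for p in s.split(":")] for s in stamps]
    two_part = [p for p in parsed if len(p) == 2]
    # Assumed rule: HH:MM is only plausible if every two-part stamp fits the episode.
    as_hours = all(a * 3600 + b * 60 <= duration_seconds for a, b in two_part)
    result = []
    for p in parsed:
        if len(p) == 3:                      # HH:MM:SS is unambiguous
            result.append(p[0] * 3600 + p[1] * 60 + p[2])
        elif as_hours:                       # read A:B as HH:MM
            result.append(p[0] * 3600 + p[1] * 60)
        else:                                # read A:B as MM:SS
            result.append(p[0] * 60 + p[1])
    return result

print(guess_seconds(["12:45", "24:00"], 1800))  # 30-minute episode: must be MM:SS
print(guess_seconds(["00:05", "01:30"], 7200))  # 2-hour episode: HH:MM fits
```

Even this stays a guess: in a long episode, "01:30" fits both readings, and the rule above simply prefers HH:MM.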

keunes · February 26, 2021, 9:31am

Suppose I’m a podcast host and I’ve got two events to announce: a live event of a friend and a webinar of my own. The first one is at 15:00 Friday night, the other one next month on the last Sunday at 19:00. → That’s two timestamps in the shownotes, and the episode happens to be 25 minutes long. They’d be listed under chapters.

A podcaster might use simple text, but then use the vertical separator | to separate the title from a longer description or a link to more information. If we implement processing of a basic time annotation followed by the section title, then the next person will ask that the section description (separated only by | ) is processed correctly. The next might ask for handling || as a separator because that’s what their podcast uses.

Also, if the podcaster decides to put them in a table, the app would have to interpret the table, which is extra complex. Think of all the possible variations of tables that could exist, with just two examples below:

time	title	description
12:00	Goin in-depth	In this segment our expert explains about the depth of depths: the earth’s core!
24:00	Music!	This week aired Kraantje Pappie and Justin Bieber

time	title
12:00	Goin in-depth
	In this segment our expert explains about the depth of depths: the earth’s core!
24:00	Music!
	This week aired Kraantje Pappie and Justin Bieber

Code would have to be written to determine the table structure and then process the data in it.
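
As a rough illustration, here is a Python sketch with a hypothetical parser, `parse_rows`, written for the first layout only (one chapter per row). The descriptions are shortened. Fed the second layout, it silently produces "chapters" without a time.

```python
def parse_rows(table_text):
    """Assume one chapter per row: time, title, optional description."""
    chapters = []
    for line in table_text.splitlines()[1:]:   # skip the header row
        cells = line.split("\t")
        title = cells[1] if len(cells) > 1 else ""
        chapters.append((cells[0], title))
    return chapters

layout_a = (
    "time\ttitle\tdescription\n"
    "12:00\tGoin in-depth\tAbout the earth's core\n"
    "24:00\tMusic!\tKraantje Pappie and Justin Bieber\n"
)
layout_b = (
    "time\ttitle\n"
    "12:00\tGoin in-depth\n"
    "\tAbout the earth's core\n"
    "24:00\tMusic!\n"
    "\tKraantje Pappie and Justin Bieber\n"
)

print(parse_rows(layout_a))  # two chapters, as intended
print(parse_rows(layout_b))  # four rows; two have an empty time and a description as title
```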

In other words: there are so many exceptions and edge cases that it’s impossible to make a complete list of constraints that covers all cases. Instead, the most efficient approach for everyone is for podcasters to adhere to one of the 3(+) standards that many podcast apps already support. So the best advice we can give here is to reach out to NDR and ask them to go about it the proper way.