slurdge / goeland
An alternative to rss2email written in golang with many filters
License: MIT License
Describe the bug
I've downloaded release 0.13.0 (goeland_linux_amd64) from here, but when I run the version command it reports 0.11.0
Expected behavior
I would expect to get 0.13.0
Server (please complete the following information):
First, I'd like to say I really like goeland. It's simple, yet powerful and I like the fact that it's stateless and easily configurable with a config file.
Now to my problem, what I hoped I could get by using goeland is a weekly digest of all the new articles of the blogs I follow sent to me in one email. It would look something like:
### Weekly digest
All the new posts from last week:
#### [Blog one](https://linktotheweb.site)
* [How to make a cake?](https://linktotheblog.post)
* [Are cakes still relevant in 2022?](https://linktotheblog.post)
#### [Blog three](https://linktotheweb.site)
* [Make a pie in three steps](https://linktotheblog.post)
There is no Blog two
because it won't have posted anything the past week.
However, right now this is not possible for several reasons, from what I see in the code:
* URL field to the Source struct).
* The merge source type makes you lose the information about the subsources, like source.Title, which I would need to do what I want.
Is this a use case you'd like to support? If yes, is there a certain way you would like it implemented? I am down to work on this.
I believe it would be cleaner to loop over all the sources and their entries in the HTML template than to do it in Go and embed a single long string in the HTML.
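A minimal sketch of what looping in the template could look like, using Go's standard html/template package. The Source and Entry types, their fields, and the template text are hypothetical and only illustrate the idea; goeland's actual structs and template are not shown in this issue.

```go
package main

import (
	"html/template"
	"os"
	"strings"
)

// Entry and Source are hypothetical types for this sketch; goeland's real
// structs may be named and shaped differently.
type Entry struct {
	Title string
	URL   string
}

type Source struct {
	Title   string
	URL     string
	Entries []Entry
}

// digestTemplate loops over sources and their entries inside the template.
// Sources without entries are skipped, like "Blog two" in the example above.
const digestTemplate = `<h3>Weekly digest</h3>
{{range .}}{{if .Entries}}<h4><a href="{{.URL}}">{{.Title}}</a></h4>
<ul>
{{range .Entries}}<li><a href="{{.URL}}">{{.Title}}</a></li>
{{end}}</ul>
{{end}}{{end}}`

// renderDigest executes the template against the given sources.
func renderDigest(sources []Source) (string, error) {
	tmpl, err := template.New("digest").Parse(digestTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, sources); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	sources := []Source{
		{Title: "Blog one", URL: "https://linktotheweb.site", Entries: []Entry{
			{Title: "How to make a cake?", URL: "https://linktotheblog.post"},
		}},
		{Title: "Blog two", URL: "https://linktotheweb.site"}, // no new posts, skipped
	}
	out, err := renderDigest(sources)
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

Keeping the grouping in the template means the per-source title and link stay available at render time, which is the information the merge source type currently drops.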
This makes the check fail for no reason
This issue documents that fetching reddit, either through RSS or by plain downloading of pages, is broken with the latest versions of golang. Either 429 or 403 is answered, depending on the variation.
The fix is as follows:
// This one is needed because of an incompatibility between the latest golang and reddit.
defaultClient = http.Client{
	Transport: &http.Transport{TLSClientConfig: &tls.Config{}},
}
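For reference, the fix above can be written as a small self-contained sketch. The function name newFeedClient is hypothetical. Go's net/http documentation notes that providing a TLSClientConfig conservatively disables automatic HTTP/2 unless ForceAttemptHTTP2 is set; whether that is the reason reddit accepts the request is an assumption here, not something the issue confirms.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newFeedClient (hypothetical name) returns a client whose transport carries
// an explicit, empty TLS configuration, mirroring the fix in the issue.
func newFeedClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{},
		},
	}
}

func main() {
	client := newFeedClient()
	fmt.Printf("custom transport configured: %t\n", client.Transport != nil)
}
```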
Describe the bug
Relative links in certain RSS feeds don't get rewritten
To Reproduce
Expected behavior
Relative links in the linked rss feed get rewritten
With the following pipe definition:
[sources.lexfridman]
url = "https://nitter.nl/lexfridman/search/rss?f=tweets&e-replies=on"
type = "feed"
filters = ["unseen", "links", "includelink", "digest"]
I get an email that looks like this:
As you can see, the tweet body appears as both the title and the content. I'd ideally like the title to be the tweet author name (which may be different from @lexfridman in the case of retweets or if I'm making a digest of multiple Twitter users), and then let the body be the same.
I know you've got that replace filter for simple text manipulation in the body, but have you got any ideas about more complex field manipulation that might make it possible to insert the <dc:creator> tag into the title or something like that?
With the following source:
[sources.apnews]
url = "https://rss.kylrth.com/:proxy:items=||*[class=content]||p/https://apnews.com/apf-topnews"
type = "feed"
filters = ["unseen", "lasthours", "links", "includelink", "combine(3)"]
I get the following logs:
Executing pipe named: apnews
Fetching source: apnews of type feed
Retrieved 10 feeds for source apnews
Executing unseen filter with args: []
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
After unseen: 0 feeds
After unseen: []
Executing lasthours filter with args: []
After lasthours: 0 feeds
After lasthours: []
Executing links filter with args: []
After links: 0 feeds
After links: []
Executing includelink filter with args: []
After includelink: 0 feeds
After includelink: []
Executing combine filter with args: [3]
After combine(3): 0 feeds
After combine(3): []
&{apnews Top News: US & International Top News Stories Today | AP News []}
I'm running v0.10.2 in Docker. Somehow the URLs for the items are not showing up in the keys for the "unseen" filter, so all items have the same key.
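One way to avoid identical keys when a feed item has no URL is to fall back to another identifier. The sketch below is only an illustration under assumptions: entryKey and its parameters are hypothetical, and the log above only shows that keys look like the source name followed by a slash and an (empty) identifier.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entryKey (hypothetical) builds the key used to remember an entry.
// It prefers the feed-provided UID, then the URL, and finally a short hash
// of the title so that two entries without UID or URL do not collide.
func entryKey(source, uid, url, title string) string {
	id := uid
	if id == "" {
		id = url
	}
	if id == "" {
		sum := sha256.Sum256([]byte(title))
		id = hex.EncodeToString(sum[:8])
	}
	return source + "/" + id
}

func main() {
	fmt.Println(entryKey("apnews", "", "https://apnews.com/article/example", "Some title"))
	fmt.Println(entryKey("apnews", "", "", "Some title"))
}
```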
This is my config.toml
loglevel = "info"
dry-run = false
run-at-startup = true
purge-days = 5
[email]
host = "smtp.xxx.com"
port = 2525
authentication = "none"
[sources]
[sources.hackernews]
url = "https://hnrss.org/newest"
type = "feed"
filters = ["all", "today"]
allow-insecure = false
[pipes]
[pipes.hackernews]
disabled = false
source = "hackernews"
destination = "email"
email_to = ["[email protected]", "[email protected]"]
email_from = ["[email protected]"]
When running 'docker-compose up -d', goeland sends an email, but when the feeds are updated later, no email is sent.
This issue can be worked on for Hacktoberfest.
Difficulty: ⭐
Time needed: A few hours
Right now, the project needs more documentation, especially with the filters.
Steps:
Hi,
Could you add a plain text option? I'd rather not use HTML in emails if given the choice.
docker run -d -v /root/goeland/config:/config --name=goeland slurdge/goeland:latest
It didn't work !
For example, some wikipedia articles have a ' in their title.
I kept getting massive amounts of logs (gigabytes every few days), finally narrowed it down to goeland, and turned off debug logging, thinking the problem was solved. But I wasn't getting any emails from a particular pipe and I was confused by that. Turns out I had accidentally included the merge source in its own set of sources, causing an infinite loop. All the other pipes ran as expected, which was why I wasn't tipped off sooner.
It could be good to include a check for self-reference like that. Obviously you could still come up with contrived examples to produce infinite looping, like by having two merge sources reference each other, but this would warn users quickly if they've made a simple mistake. Thoughts?
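A check like this can be sketched as a depth-first search over the merge references. The representation below (a map from each merge source to the sources it merges) and the name findMergeCycle are assumptions made for illustration; they are not goeland's actual data structures. This version also catches the case of two merge sources referencing each other.

```go
package main

import "fmt"

// findMergeCycle (hypothetical) reports whether any merge source can reach
// itself through its references. refs maps a merge source name to the names
// of the sources it merges. The returned name is a source from which a cycle
// is reachable.
func findMergeCycle(refs map[string][]string) (string, bool) {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)

	var visit func(name string) bool
	visit = func(name string) bool {
		switch state[name] {
		case inProgress:
			return true // reached a source that is still being expanded
		case done:
			return false
		}
		state[name] = inProgress
		for _, child := range refs[name] {
			if visit(child) {
				return true
			}
		}
		state[name] = done
		return false
	}

	for name := range refs {
		if visit(name) {
			return name, true
		}
	}
	return "", false
}

func main() {
	refs := map[string][]string{
		"all": {"hackernews", "all"}, // accidental self-reference
	}
	if name, found := findMergeCycle(refs); found {
		fmt.Printf("merge source %q is part of, or leads to, a reference cycle\n", name)
	}
}
```

Running such a check once when the configuration is loaded would let the application fail fast with a clear message instead of looping.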
Right now the default email template is embedded within the application, so people who want to customize it have to get it from the source code.
It would be useful to be able to get it from within the application through an action, such as output-email-template.
Hello! Any plans for SSL/TLS support for SMTP servers? I've tried a couple of servers (Google, Yandex, Mail.ru) and none of them work, supposedly because of the lack of encryption (ERR conn timed out). It would be nice to be able to toggle SSL usage via the config file.
level=fatal msg="cannot create email pool: Mail Error: SMTP Connection timed out"
\Downloads\goeland> smtp connection timed out
When I give the command goeland ./goeland.exe run, which I intend to use to send news from the RSS source I created via email, it gives the above error. Can anyone with information help me?
(I am very new and my knowledge is really limited)
Now that daemon mode is supported we need a proper Dockerfile
Would it be possible to add a filter which limits the maximum number of characters taken from an article in the RSS feed and put into the email message?
In other words, say an article from the RSS feed has 5000 characters. In the current version of goeland, the whole article text is placed in the email message. I would like to be able to limit the number of characters placed in the email message.
Example email message:
Article title #1
Text of article #1 limited to 100 characters.
Article title #2
Text of article #2 limited to 100 characters.
Please let me know if there are more details or different description is required.
Thank you
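As a rough illustration of the requested behaviour, the sketch below truncates plain text to a maximum number of characters. The function name truncateRunes is hypothetical, and the sketch assumes the content is plain text: cutting HTML content at an arbitrary position could split a tag, so a real filter would likely need to strip or parse the markup first.

```go
package main

import "fmt"

// truncateRunes (hypothetical) limits s to at most max characters and appends
// an ellipsis when text was cut. It counts runes rather than bytes so that
// multi-byte characters are never split.
func truncateRunes(s string, max int) string {
	if max < 0 {
		max = 0
	}
	runes := []rune(s)
	if len(runes) <= max {
		return s
	}
	return string(runes[:max]) + "…"
}

func main() {
	article := "Text of article #1, which in the feed may be several thousand characters long."
	fmt.Println(truncateRunes(article, 30))
}
```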
Hello,
would it be possible (or is it already possible?) to add an option to the configuration file to accept self-signed SSL certificates (certificates signed by an unknown authority) and expired SSL certificates in HTTPS connections?
I'm aware of the security risk, but sometimes it is difficult to get the SSL certificate corrected or updated.
Thank you
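In Go, accepting such certificates is usually done through the InsecureSkipVerify field of crypto/tls. The sketch below shows the general technique only; the function names and the allowInsecure parameter are hypothetical (config samples elsewhere on this page show an allow-insecure key per source, but how goeland wires it up is not shown here).

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// tlsConfigFor (hypothetical) returns a TLS configuration that skips
// certificate verification when allowInsecure is true. Skipping verification
// accepts self-signed and expired certificates, and also removes protection
// against man-in-the-middle attacks.
func tlsConfigFor(allowInsecure bool) *tls.Config {
	return &tls.Config{InsecureSkipVerify: allowInsecure}
}

// newClient (hypothetical) builds an HTTP client for one source.
func newClient(allowInsecure bool) *http.Client {
	return &http.Client{
		Transport: &http.Transport{TLSClientConfig: tlsConfigFor(allowInsecure)},
	}
}

func main() {
	client := newClient(true)
	fmt.Printf("client ready: %t\n", client != nil)
}
```

Scoping the option per source, rather than globally, keeps verification enabled for every feed that does not need the exception.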
As I was migrating feeds into goeland, I noticed this one seems to lose its video links:
Probably a consequence of how they're referenced in the feed content (the src of an iframe, within a figure). Any way to preserve these?
Is your feature request related to a problem? Please describe.
Some websites respond with HTTP 403 to goeland, but when I use a standard web browser, the RSS feed is available. I assume these websites are checking the User-Agent.
Describe the solution you'd like
Add a per-source configuration option for the User-Agent (for example, so that goeland can identify as Firefox, or use a custom configurable string).
Describe alternatives you've considered
I'm not aware of a more effective option at the moment.
Additional context
Not required.
Thank you
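With Go's net/http, a custom User-Agent is set on the request headers. The sketch below assumes a hypothetical helper newFeedRequest and a hypothetical per-source userAgent value; it only builds the request and does not show how goeland fetches feeds.

```go
package main

import (
	"fmt"
	"net/http"
)

// newFeedRequest (hypothetical) builds a GET request for a feed and applies
// a custom User-Agent when one is configured. With an empty userAgent, Go's
// default User-Agent is used when the request is sent.
func newFeedRequest(url, userAgent string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if userAgent != "" {
		req.Header.Set("User-Agent", userAgent)
	}
	return req, nil
}

func main() {
	req, err := newFeedRequest("https://example.com/feed.xml", "Mozilla/5.0 (compatible; example)")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.UserAgent())
}
```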
There are a few places where the text could be i18n
Since you now offer a docker image, I put together a docker-compose.yml file containing:
services:
  app:
    image: slurdge/goeland
    restart: unless-stopped
    volumes:
      - ~/docker/volumes/goeland:/data
Issues I've encountered:
* ~/docker/volumes/goeland (as user 1000), it gets created by docker as root and config.toml cannot be written to it. Shouldn't it be created automatically by docker as the container user (1000?) and just work?
* config.toml is created/edited successfully, a "permission denied" error occurs when creating goeland.db.
I admit that I'm pretty new to docker (and to goeland!), so perhaps this is all expected behaviour and I'm "doing it wrong". Either way, I'm hoping for guidance.
P.S. Is there a way to default the container to start with the --run-at-startup flag?
Feeds often include only a URL or a summary. With the existing filters, it's possible to cleanly grab the content of the article; however, it would also be good to have an 'auto content' option for people who are not used to CSS.
I have been playing with this fantastic project, but it took me a while to get it working with a particular SMTP server. It seemed to boil down to authentication options. It seems like AuthPlain is the default, but the underlying package allows for other options; I needed AuthLogin in my case.
I have no experience with Go or SMTP, but I got to a minimal working example with the following additions to run.go, by looking at the surrounding code:
// Map the email.auth config value to an authentication type, defaulting to plain.
authentications := map[string]email.AuthType{"plain": email.AuthPlain, "login": email.AuthLogin, "crammd5": email.AuthCRAMMD5}
authentication, found := authentications[config.GetString("email.auth")]
if !found {
	authentication = email.AuthPlain
}
server.Authentication = authentication
which allows configuration files to have an email.auth field.
Happy to do a pull request as per the contributing guidelines with that change and updating the documentation if that sounds sensible, but I am afraid I do not know how to go about tests here.
Many thanks!
Right now, the emails are sent with all the recipients visible. We may want to hide the recipients in the case of large mailings.
I know it's easy to just run this as a cron job, but being able to run goeland as a daemon or a Docker container would mean I can manage goeland along with all the other services I may have on the same machine.
I've created an example in this repo that demonstrates how I'm currently building goeland to run as a simple docker-compose service. This works just fine for me, but I wanted to bring it up here to see if you were interested in supporting this kind of usage.
Supporting this could be as simple as adding a goeland daemon subcommand, if you aren't interested in supporting Docker.
Describe the bug
I have two RSS feeds, from which the RSS digest is not working:
http://www.dsl.sk/export/rss_articles.php
http://www.zive.sk/rss/najnovsie/
Instead of an RSS digest, only the last article or no article is received. The listed feeds are polled every 24h.
To Reproduce
Steps to reproduce the behavior:
goeland configuration, which is not working for listed RSS feeds, but it is working for 200+ other RSS feeds:
[sources]
[sources.src1]
url = "http://www.dsl.sk/export/rss_articles.php"
type = "feed"
filters = ["unseen", "includelink", "embedimage(left)", "digest(4)", "retrieve"]
[sources.src2]
url = "http://www.zive.sk/rss/najnovsie/"
type = "feed"
filters = ["unseen", "includelink", "embedimage(left)", "digest(4)", "retrieve"]
[pipes]
[pipes.src1]
source = "src1"
destination = "email"
email_to = "[email protected]"
email_from = "[email protected]"
email_title = "{{.EntryTitle}}"
[pipes.src2]
source = "src2"
destination = "email"
email_to = "[email protected]"
email_from = "[email protected]"
email_title = "{{.EntryTitle}}"
Expected behavior
Get the digest of all articles from last run.
Screenshots
N/A
The attached huge and complicated patch adds a URL variable for use in the email template, to allow adding a clickable link to the original article like so:
<a href="{{.URL}}">{{.EntryTitle}}</a>
Patch:
diff -urN goeland-0.16.0.dist/cmd/run.go goeland-0.16.0.cet/cmd/run.go
--- goeland-0.16.0.dist/cmd/run.go 2023-10-27 07:34:30.000000000 -0700
+++ goeland-0.16.0.cet/cmd/run.go 2023-12-16 01:56:03.808909272 -0800
@@ -160,6 +160,7 @@
EntryFooter string
ContentID string
CSS string
+ URL string
}{
EntryTitle: html.EscapeString(entry.Title),
EntryContent: entry.Content,
@@ -169,6 +170,7 @@
EntryFooter: footer,
ContentID: "cid:" + logoAttachmentName,
CSS: defaultCSS,
+ URL: entry.URL,
}
if destination == "htmlfile" {
data.ContentID = "data:image/png;base64," + base64.StdEncoding.EncodeToString(logoBytes)
Hi folks,
Thank you for Goeland. It's FANTASTIC software!!!
I'm trying to get the rss2email experience: a single e-mail for each new post in the RSS feed.
I configured Goeland like this:
...
[sources.xyz]
url = "https://.../"
type = "feed"
filters = ["unseen", "retrieve", "includesourcetitle", "includelink", "untrack"]
allow-insecure = false
...
[pipes.xyz]
disabled = false
source = "xyz"
destination = "email"
email_title = "xyz: {{.EntryTitle}}"
email_to = ["[email protected]"]
email_cc = []
email_bcc = []
email_from = "xyz <from@example.>"
# email_title =
...
But I'm getting e-mails with plain text - with no links to the post.
Is it a bug?
Thank you.
Desktop
I'm trying to create a custom template, but I found no documentation about the placeholders available in templates, and no way to simply get a link for an entry.
This feature would grab all the titles and create a TOC at the beginning of the mail.
It would need CSS for <li> items in templates.
Using source as
[sources.calcio]
url = "https://www.corrieredellosport.it/rss/calcio"
type = "feed"
filters = ["unseen","includelink","embedimage","digest"]
The email works fine, displaying the embedded image.
If I use this resource:
[sources.focus_scienza]
url = "https://www.focus.it/rss/scienza.rss"
type = "feed"
filters = ["unseen","includelink","embedimage","digest"]
The email doesn't display the image, showing a blank line instead. The tag in the RSS is the same, called Enclosure. My RSS client (Vienna) displays both of them. It would be great if I didn't have to specify "embedimage" as a filter, and the image were displayed by default, as an RSS client normally does.
cron = "*/5 * * * *"
Many entries are sent repeatedly within a short time!
I only want new entries to be sent; entries that were already sent should not be sent again.
How do I configure this in config.toml? Thanks
Hi,
I'm trying to get emails for the following feed : https://www.youtube.com/feeds/videos.xml?channel_id=UCb0MyY46T9ZYOzDHkYnIoXg
Unfortunately, the emails I receive only contain the goeland logo and footer.
Here's the content of one of these emails:
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=UTF-8
<!DOCTYPE html><html lang=3D"en"><head><title>Portable Nuclear Weapons | Pl=
ainly Difficult Short</title><meta charset=3D"utf-8"/><meta name=3D"viewpor=
t" content=3D"width=3Ddevice-width,initial-scale=3D1"/><meta http-equiv=3D"=
x-ua-compatible" content=3D"IE=3Dedge"/><style type=3D"text/css">"a:hover" =
{
text-decoration: none !important
}@media screen and (min-width:600px){
h1 {
font-size: 48px !important;
line-height: 48px !important
}
.intro {
font-size: 24px !important;
line-height: 36px !important
}
}
</style></head><body style=3D"-webkit-text-size-adjust:100%;-ms-text-size-a=
djust:100%;overflow-wrap:break-word;height:100%;width:100%;margin:0;padding=
:0"><!--[if (gte mso 9)|(IE)]><table cellspacing=3D0 cellpadding=3D0 =
border=3D0 width=3D720 align=3Dcenter role=3Dpresentation><tr><=
td><![endif]--><div role=3D"article" aria-label=3D"Portable" nuclear=
=3D"" weapons=3D"" |=3D"" plainly=3D"" difficult=3D"" short=3D"" lang=3D"en=
" style=3D"font-family: 'Avenir Next', -apple-system, BlinkMacSyste=
mFont, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple=
Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol'; fon=
t-size: 18px; font-weight: 400; line-height: 28px; margin: 0 auto; max-widt=
h: 720px; padding: 40px 20px 40px 20px;"><header><a href=3D"https://www.git=
hub.com/slurdge/goeland" style=3D"-webkit-text-size-adjust:100%;-ms-text-si=
ze-adjust:100%;color:rgb(16, 120, 189);font-weight:600;text-decoration:unde=
rline"><center><img style=3D"-ms-interpolation-mode:bicubic;border:0;height=
:auto;line-height:100%;outline:none;text-decoration:none;max-width:100%;bac=
kground-color:white;padding:16px;border-radius:16px" src=3D"cid:20230312.20=
[email protected]"/></center></a>
</header><main></main><footer style=3D"margin-top: 24px; padding: 16px; bor=
der-radius: 16px;"><center><p style=3D"font-size: 16px; font-weight: 400; l=
ine-height: 16px;">Enjoy your =F0=9F=93=A7 by <a href=3D"https://www.github=
.com/slurdge/goeland" style=3D"-webkit-text-size-adjust:100%;-ms-text-size-=
adjust:100%;color:rgb(16, 120, 189);font-weight:600;text-decoration:underli=
ne">goeland</a></p></center></footer></div><!--[if (gte mso 9)|(IE)]><=
;/table><![endif]--></body></html>
--2eb3659b3f9ed2b1fd5958bb7479e3a2bdb9cb44b07ae7141f18ae081ba2--
--26b4b4ee37adb6c6ba198b9fc94b43fc695f2f35bbc27842d0e44c47a5dc
Content-Disposition: inline;
filename="logo.png"
Content-Id: <[email protected]>
Content-Transfer-Encoding: base64
Content-Type: image/png;
name="logo.png"
iVBORw0KGgoAAAANSUhEUgAAAPoAAABoCAMAAADvnB1HAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJ
bWFnZVJlYWR5ccllPAAAAF1QTFRF////TU1Npqam9PT0WFhYenp6bm5u09PT3t7e95MevLy8kJCQ
6enpY2NjhYWFsbGxm5ubx8fH+a5W/eTHonA2+rxy//jxzYIq+Jos/uvV+8mPwn0tpZmK/NCd////
u6G8bwAAAB90Uk5T////////////////////////////////////////AM0ZdhAAAAL4SURBVHja
7J3ZduMgDIZZjYONl7bT2fv+jznTbAfb9cLmGCn/VXuaQ/UZkEREFPKBVsT6EYem6ASRBugElyx0
gk43dEIwshOk5Bd0glOI0T/hn+jI0WvmLK3qh9jNVe9urJpF76mfZNfzPbEVk8LP0mpuSM/xzira
fWaf6ybETDXzMGmYpE4OXlYizEY2My4NlWFJF76SwRbOoBMTPDIV7ZHBKZ3blaWIMLhRScDrJoJt
VC/4ENdw0cliMn6XYNW3k1kRsnI1Nr4nVu1oRkwZ262P/kGxczRdVD80Lu6OL80jwqjDZmRiS+7g
IT0YWJEDitvwMtqK1DZ4TQ4qXlnrkkcnLxQ5sJSJzK7X05HDTHwTlV2nCxsJxCKy6wS+I6V0NHad
JGIk3fAiDnt+5PZhIIQ9R/I47HmSx2DPlTycPV/yUPacycPY8yYPYc+d3J89f3JfdgjkfuwwyH3Y
oZC7s8Mhn2cv1VSwyGfYe+NbE82dvaYoyL9i71bLghwqe/PIuuVj2dstlxVKkOx8Sz1baJjslZxq
8jw6qL7uC3HFJEBHvzWvG9QtwbGveG+7bsmAscu1V9Zy7bpXruxy/aX3pEdwUOxb0pX7QaaBlNts
810tsCVPeMvU0l+l7G43Jyq6eXsAeC7nm2iitH8DNO1LYsPdrYDt9iXdQhoZLXmOD71evd8JRtdZ
NuNngWDF15ew34+Du0Cw2cv/02x6y+Wv3OWGLIMnvI3VgDq/+UT6P6+f+okR/e101je86KdfeNFf
EKFXeNGv6dzvE77Nfk1pvl/If7zjIb99wPjlrL/v+LZ6gS+h4QJUAcojtNES7aQbtGcX2qIj18Dq
Lw5vWwiK9MR6JzdoydG9Q9NCu1SyVVZ5HVciVyf4PHQW3Nq+TgiDXG9q+jG8PwaCnBc+7Y1AzHkV
sXNQZnLvJCShHNdc0Q2curLbgm/6PWz6OJqbE43eybvt1ltSH65d47OtJkr0Z/dglOiI22WjZMfb
Gv/5hQhYvwbjnwADABz/SJCHx6o2AAAAAElFTkSuQmCC
--26b4b4ee37adb6c6ba198b9fc94b43fc695f2f35bbc27842d0e44c47a5dc--
And here's my config.toml :
## Log level
## Either "none", "error", "debug", "info"
loglevel = "none"
## Dry run
## Do not output anything or send email after fecthing the sources
#dry-run = false
## Do not sanitize input
## This is not sanitize (the default) any input.
## Use at your own risk as you will include everything from your sources, including scripts, etc.
## You can always sanitize afterwards with the 'sanitize' filter.
unsafe-no-sanitize-filter = false
## Run all the pipes once at startup in daemon mode
run-at-startup = false
## Purge days
## Number of days to keep the entries when the purge command is used
## Can be overrided by command line switch
purge-days = 90
## Auto purge
## Automatically run the purge command after the run command
auto-purge = true
[email]
host = ""
port = 587
username = ""
password = ""
## Include header in email
## Put a nice goeland logo in emails
#include-header = true
## Include footer in email
## Put "Sent with ❤️ by goeland in the bottom of HTML emails"
#include-footer = true
## Include title in header
include-title = true
## Email timeout in milliseconds
#timeout-ms = 5000
## Logo file
#logo = internal:goeland.png
## Template file
#template = "/path/to/template.html"
[sources]
[sources.hackernews]
url = "https://hnrss.org/newest"
type = "feed"
# See doc for available filters
filters = ["all", "today"]
# Allow invalid certificates
allow-insecure = false
[sources.youtube]
url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCb0MyY46T9ZYOzDHkYnIoXg"
type = "feed"
# See doc for available filters
filters = ["all"]
# Allow invalid certificates
allow-insecure = false
[pipes]
[pipes.youtube]
#Either put disabled = true or prefix pipes with disalbed like this: disabled.pipes.hackernews
disabled = false
source = "youtube"
destination = "email"
email_to = [""]