Pool Of Ideas — Automate Getting News Portal Articles For PR Linkbuilding

MacLoush · Published in Ecommerce Tips · May 17, 2023 · 3 min read

I want to keep an inventory of the news articles published on a news portal, an aggregator site that collects headlines from many publishers. So I wrote an AHK script to scrape the portal for the article titles.

Photo by Florian Olivo on Unsplash

With this setup I get a .txt file with all the titles. It is also possible to schedule the scraping, repeating it every 15 minutes or so, whenever the news portal refreshes its content.

So my list will be automatically updated.
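
If you want to wire up the timed part, here is a minimal sketch of how it could look with SetTimer in AHK v1. The ScrapeTitles function and the titles.txt path are just placeholders for whatever your own scraping routine does:

#Persistent  ; keep the script alive so the timer keeps firing

SetTimer, RunScrape, 900000  ; 900000 ms = 15 minutes
return

RunScrape:
ScrapeTitles()
return

ScrapeTitles() {
    ; placeholder: download the portal page and save the titles here
    ; (for now it just logs a timestamp so you can see the timer working)
    FileAppend, % A_Now " - scrape ran`n", % A_Desktop "\titles.txt"
}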

What is this all good for?

I will have a pool of article titles at my disposal, and I will have the option to filter it.

If I find something appealing, I can check out the content later, go back in time, do some research, or get some inspiration.
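
Filtering that pool does not need anything fancy. Here is a minimal sketch, assuming the titles sit one per line in a titles.txt file on the desktop; the file name and the keyword are only examples:

; example usage: show every saved title that contains the given keyword
MsgBox % FilterTitles("ecommerce")

FilterTitles(keyword) {
    FileRead, titles, % A_Desktop "\titles.txt"
    filtered := ""
    Loop, Parse, titles, `n, `r
    {
        if InStr(A_LoopField, keyword)
            filtered .= A_LoopField "`n"
    }
    return filtered
}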

THEN SOMETHING CHANGED

I decided that I also wanted the articles’ URLs, and then I failed.

I tried and tried, did my research, read tons of forums.

But nothing. I found a way to extract URLs, but not the articles’ URLs specifically: it grabbed every URL on the page.

Damn.
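
Looking back, the difference is easy to show. A minimal sketch with a tiny made-up HTML sample (not the real portal markup): grabbing every <a> returns navigation, ads and footer links too, while a narrower CSS selector keeps only the headline links, which is what the final script below ends up doing:

; tiny HTML sample; the X-UA-Compatible meta is needed so querySelectorAll works in the HTMLFile COM object
html := "<html><head><meta http-equiv='X-UA-Compatible' content='IE=Edge'></head><body>"
      . "<a href='https://example.com/menu'>Menu</a>"
      . "<h2><a href='https://example.com/article'>An article headline</a></h2>"
      . "</body></html>"
document := ComObjCreate("HTMLFile")
document.write(html)

allLinks := document.getElementsByTagName("a")     ; every link on the page
articleLinks := document.querySelectorAll("h2 a")  ; only links inside <h2>, i.e. the headlines

MsgBox % "All links: " allLinks.length "`nHeadline links: " articleLinks.length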

The good thing about AHK is that there is a massive community behind it, helpful and fast, so I asked for help on Reddit, in the AHK group.

Guess what: within a day, someone replied to my post and asked me to format my code properly, and the very next day a solution was provided.

Which did not work.

But I figured out that AHK v2 code is not going to run on an installed v1 interpreter.

So that issue was resolved.
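
If you are not sure which version is installed on a machine, a quick check helps. A_AhkVersion shows what is actually running the script, and on v1.1.33 or newer (and on v2) the #Requires directive makes a version mismatch fail with a clear error instead of mysterious syntax complaints:

#Requires AutoHotkey v1.1.33+  ; refuse to run under the wrong interpreter

MsgBox % "This script is running under AutoHotkey v" A_AhkVersion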

Example of the result of scraping news articles

Would you like to have the updated script?

Here you are.

F1::GetArticlesUrls()  ; press F1 to run the scrape

GetArticlesUrls() {
    ; download the portal's front page to a temporary file
    URLDownloadToFile https://www.hirstart.hu, % A_Temp "\html"
    ; read it back as UTF-8 (codepage 65001)
    FileRead html, % "*P65001 " A_Temp "\html"
    ; switch the IE engine to Edge mode so querySelectorAll becomes available
    html := StrReplace(html, "<head>", "<head><meta http-equiv='X-UA-Compatible' content='IE=Edge'>")
    document := ComObjCreate("HTMLFile")
    document.write(html)
    ; article headlines are the links inside <h2> elements
    headers := document.querySelectorAll("h2 a")
    all := ""
    Loop % headers.length {
        elem := headers[A_Index - 1]  ; the node list is zero-based
        if (elem.href ~= "^http")
            all .= elem.innerText "`t" elem.href "`n"
    }
    document := ""  ; release the COM document object
    FileDelete % A_Temp "\html"
    ; write the tab-separated title/URL pairs to a file on the desktop
    FileOpen(A_Desktop "\hu.csv", 0x1, "UTF-8").Write(all)
    MsgBox 0x40040, Complete!, Scraping from hirstart.hu finished!
}

Where could you use some inspiration from news articles?

Of course, in your own PR linkbuilding activities.


You don’t know how to use AutoHotkey?

I am also just a padawan in this topic, but an enthusiastic one.

You can just bravely turn to the AHK community if you need help. They will be there for you.
