Scrolling Hacker News every morning and copy-pasting interesting titles into a spreadsheet is the kind of busywork that nobody admits to doing, yet a lot of people do it. This template gives you a one-click way to grab the front page — every story title, every link — and save it as a CSV file you can open in Excel, Numbers, or import into a notebook. Useful if you write a weekly tech digest, track tone of voice on YC content, or just want a clean dataset of what got upvoted today.
How this workflow works
The workflow uses five blocks chained together in a straight line. No branches, no loops — it's the simplest possible scrape-and-export pipeline.
- manual_trigger: You hit Run in the sidepanel. The trigger is configured with targetTab: "new", so the workflow opens a fresh tab instead of hijacking the one you're reading.
- navigate: Loads https://news.ycombinator.com in that new tab.
- wait_for: Waits until at least one tr.athing row is visible. That's the CSS class HN uses on every story row, and it's been stable for years. matchFirst: true means the block resolves the moment the first row paints, not when all 30 are loaded.
- scrape_list: The actual extraction. It iterates over every tr.athing row and, for each one, reads two fields: title (from the .titleline > a text) and url (from the same anchor's href attribute). The result is an array of {title, url} objects available downstream as $('Scrape story rows').items.
- export_data: Takes that array, converts it to CSV, and triggers a browser download named hn-headlines.csv. The CSV header row is auto-generated from the field names you defined.
You'll see Chrome's native download bar appear with the file ready. No server round-trip, no clipboard juggling — the scrape happens inside the page you opened, in your own browser session.
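If it helps to see the scrape_list and export_data steps outside the browser, here is a minimal Python sketch of the same idea. It is not how the workflow blocks are implemented; it only mirrors the selectors described above (tr.athing rows, .titleline > a for the title and href) against a small hand-written HTML sample. The sample markup, the StoryRowParser class, and the scrape and to_csv helpers are all hypothetical names made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser

# Hand-written sample modelled on HN-style rows (an assumption, not fetched from the live site).
SAMPLE_HTML = """
<table>
  <tr class="athing" id="1">
    <td class="title"><span class="titleline"><a href="https://example.com/a">First story</a><span class="sitebit"> (<a href="from?site=example.com">example.com</a>)</span></span></td>
  </tr>
  <tr><td class="subtext">meta row</td></tr>
  <tr class="athing" id="2">
    <td class="title"><span class="titleline"><a href="https://example.com/b">Second, with a comma</a></span></td>
  </tr>
</table>
"""


class StoryRowParser(HTMLParser):
    """Collects {title, url} for anchors that are direct children of .titleline inside tr.athing."""

    def __init__(self):
        super().__init__()
        self.stack = []       # open elements as (tag, class list)
        self.items = []       # extracted rows
        self._current = None  # row whose title text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        parent = self.stack[-1] if self.stack else None
        in_story_row = any(t == "tr" and "athing" in c for t, c in self.stack)
        # Rough equivalent of "tr.athing .titleline > a": the parent must carry the class.
        if tag == "a" and in_story_row and parent and "titleline" in parent[1]:
            self._current = {"title": "", "url": attrs.get("href", "")}
        self.stack.append((tag, classes))

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.items.append(self._current)
            self._current = None
        # Pop back to the matching open tag (simplified; ignores void elements).
        while self.stack:
            open_tag, _ = self.stack.pop()
            if open_tag == tag:
                break


def scrape(html):
    parser = StoryRowParser()
    parser.feed(html)
    return parser.items


def to_csv(items, fields=("title", "url")):
    # The header row comes from the field names, as in the export step described above.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buf.getvalue()


items = scrape(SAMPLE_HTML)
csv_text = to_csv(items)
print(csv_text)
```

Note that the second anchor in the first row (the site link) is skipped because its parent is .sitebit rather than .titleline, which is the point of the child combinator in the selector.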
Customising it for your case
A few changes you'll probably want to make.
- Scrape a different ranking. Swap the url field on the navigate block from https://news.ycombinator.com to https://news.ycombinator.com/newest (just posted), /show (Show HN), or /ask (Ask HN). The selectors don't change.
- Add the points and comment count. Hacker News exposes those on the row immediately after tr.athing (the metadata row), which is slightly outside the simple-list shape because each list item is scoped to its own tr.athing row. The author link (.hnuser) sits on that same metadata row, so a field like { "name": "user", "selector": ".hnuser" } added to the existing scrape_list may come back empty if the selector is resolved only inside tr.athing. For points and comment count, and for the author if that field stays empty, you'll need a follow-up scrape over the metadata rows.
- Change the output format. In the export_data block, set format to "json" and rename the file to hn-headlines.json. Same data, different shape; handy if you're feeding it into a script.
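To make the metadata-row point concrete, here is a rough Python sketch that pairs each story row with the row after it. It runs against a hand-written sample, not the live page, and it is not how scrape_list works internally. The td.subtext and .score class names are assumptions about HN's markup that the text above does not state, and RowPairParser is a hypothetical helper.

```python
from html.parser import HTMLParser

# Hand-written sample; the class names td.subtext and .score are assumptions.
SAMPLE_HTML = """
<table>
  <tr class="athing" id="1">
    <td class="title"><span class="titleline"><a href="https://example.com/a">First story</a></span></td>
  </tr>
  <tr>
    <td class="subtext"><span class="score">120 points</span> by <a class="hnuser" href="user?id=alice">alice</a> | <a href="item?id=1">45 comments</a></td>
  </tr>
  <tr class="athing" id="2">
    <td class="title"><span class="titleline"><a href="https://example.com/b">Job posting</a></span></td>
  </tr>
  <tr>
    <td class="subtext">2 hours ago</td>
  </tr>
</table>
"""


class RowPairParser(HTMLParser):
    """Attaches points and user from the metadata row to the preceding tr.athing row."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._capture = None      # field currently receiving text, if any
        self._want_title = False  # True between span.titleline and its first anchor
        self._in_subtext = False  # True while inside the metadata cell

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "tr" and "athing" in classes:
            # A new story row starts a new item with empty metadata.
            self.items.append({"title": "", "url": "", "points": "", "user": ""})
        elif tag == "span" and "titleline" in classes:
            self._want_title = True
        elif tag == "a" and self._want_title and self.items:
            self.items[-1]["url"] = attrs.get("href", "")
            self._capture = "title"
            self._want_title = False
        elif tag == "td" and "subtext" in classes:
            self._in_subtext = True
        elif self._in_subtext and tag == "span" and "score" in classes:
            self._capture = "points"
        elif self._in_subtext and tag == "a" and "hnuser" in classes:
            self._capture = "user"

    def handle_data(self, data):
        if self._capture and self.items:
            self.items[-1][self._capture] += data

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._capture = None
        elif tag == "td":
            self._in_subtext = False


parser = RowPairParser()
parser.feed(SAMPLE_HTML)
rows = parser.items
for row in rows:
    print(row)
```

The second sample row has no score or author, so its points and user fields stay empty. Whatever approach you use for the follow-up scrape, it is worth allowing for rows like that rather than assuming every story has both.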
Common gotchas
HN is one of the friendlier scraping targets: no JavaScript-heavy SPA, no Cloudflare gate, and no rate limiting that a single front-page load will trip. But two things bite people. First, if you run the workflow back-to-back, you'll get the same 30 stories, because the front page only refreshes every few minutes. Second, if you're logged into HN, your personal "hide" decisions affect what shows up; the scrape sees only the rows your account can see. Run in an incognito window if you want the canonical front page.
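If you do end up with overlapping exports from back-to-back runs, one way to combine them is to deduplicate on the url field. The sketch below assumes each run has already been loaded as a list of {title, url} dictionaries; merge_runs and the sample data are hypothetical and not part of the template.

```python
def merge_runs(*runs):
    """Concatenate several scrape results, keeping the first row seen for each url."""
    seen = set()
    merged = []
    for run in runs:
        for row in run:
            if row["url"] not in seen:
                seen.add(row["url"])
                merged.append(row)
    return merged


# Made-up example data standing in for two exports taken a minute apart.
run_a = [
    {"title": "First story", "url": "https://example.com/a"},
    {"title": "Second story", "url": "https://example.com/b"},
]
run_b = [
    {"title": "Second story", "url": "https://example.com/b"},
    {"title": "Third story", "url": "https://example.com/c"},
]

merged = merge_runs(run_a, run_b)
print([row["title"] for row in merged])
```

Keying on url rather than title is a design choice: titles on HN can be edited after submission, while the link usually stays the same.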
FAQ
Do I need an API key for this? No. HN's front page is fully public HTML and no headers are required. If you'd rather hit their official API (hacker-news.firebaseio.com), see the json-feed-to-csv template instead — it's a better fit for clean JSON data.
Can I run this on a schedule? Yes. Swap the manual_trigger block for a schedule_trigger and pick a cron expression like 0 8 * * * (every day at 8 AM). The download still triggers in your active browser session, so the browser needs to be open.
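To read the cron expression 0 8 * * *, it can help to see the five fields (minute, hour, day of month, month, day of week) checked against a timestamp. The sketch below is a deliberately simplified matcher that only understands * and plain integers. It is an assumption that schedule_trigger follows the standard five-field layout, and cron_matches is a hypothetical helper, not part of the workflow tool.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` satisfies a five-field cron expression.

    Simplified: supports only "*" and plain integers, with all fields ANDed.
    Ranges, lists, and step values are not handled.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    # Cron counts Sunday as 0; isoweekday() counts Monday as 1 and Sunday as 7.
    actual = [when.minute, when.hour, when.day, when.month, when.isoweekday() % 7]
    for field, value in zip(fields, actual):
        if field != "*" and int(field) != value:
            return False
    return True


print(cron_matches("0 8 * * *", datetime(2024, 1, 15, 8, 0)))
print(cron_matches("0 8 * * *", datetime(2024, 1, 15, 20, 0)))
```

The first call checks 08:00 and the second checks 20:00 on the same day, so only the first satisfies the expression.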
Will this break if HN redesigns the page? Probably. The tr.athing and .titleline > a selectors have survived since 2019, but if YC ever ships a real redesign, you'll need to update the containerSelector and fields[].selector values. Automa users will recognise the pattern — same fragility, same fix.