BNOD

Wikipedia — capture article intro

Open a Wikipedia article URL, capture the article title (h1) and the first body paragraph, log both. Selectors are stable across all language editions.

Install in BNOD

Opens BNOD sidepanel with this template installed. Requires BNOD extension.

You're researching a topic and want the canonical one-paragraph definition — the lede that Wikipedia editors fight over for months. Manually clicking through 20 articles to copy each intro takes longer than the research itself. This template grabs the title and the first body paragraph from any Wikipedia URL, in any language edition, and logs them so you can paste a clean batch into your notes. Useful for academic literature reviews, building a glossary, or feeding LLM context windows with vetted definitions.

How this workflow works

Six blocks in a straight chain. The structure is deliberately minimal so it stays robust across Wikipedia's 300+ language editions, which all share the same DOM structure.

  1. manual_trigger — Sidepanel Run button. Exposes one input called article of type url, defaulting to the Service Worker article. Override per run by typing or pasting any other Wikipedia URL.
  2. navigate — Opens {{vars.input.article}}. Because targetTab: "new", this happens in a fresh tab.
  3. wait_for — Waits for the #firstHeading element to render. That's the ID Wikipedia uses for the article title, consistent across desktop and mobile skins, and across all language editions.
  4. get_text (first one) — Reads the text of #firstHeading. Captured as $('Capture article title').text.
  5. get_text (second one) — Reads the first body paragraph using the selector #mw-content-text .mw-parser-output > p:not(.mw-empty-elt). The :not(.mw-empty-elt) filter is critical: Wikipedia includes invisible spacer paragraphs at the top of some articles, and without that filter you'd get an empty string. matchFirst: true makes sure only the first matching paragraph is captured.
  6. log_data — Concatenates "{{title}} — {{first paragraph}}" and writes it to the workflow log with the label wikipedia. You'll see this in the sidepanel Run history.

The whole flow runs in 1-2 seconds on a reasonable connection.
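As a rough illustration of what the two get_text blocks and the log_data block do, here is a minimal TypeScript sketch. It is not BNOD's implementation: the interface and function names (QueryRoot, readText, captureIntro) are hypothetical, and only the two selectors and the "title, separator, first paragraph" log format come from the steps above.

```typescript
// Minimal structural types so the sketch works with a browser `document`
// or with a hand-written stub. These names are assumptions for illustration.
interface TextNode { textContent: string | null }
interface QueryRoot { querySelector(selector: string): TextNode | null }

// Selectors taken from the workflow description.
const TITLE_SELECTOR = "#firstHeading";
const LEDE_SELECTOR = "#mw-content-text .mw-parser-output > p:not(.mw-empty-elt)";

// Separator used by the log_data step (space, U+2014, space).
const LOG_SEPARATOR = " \u2014 ";

// Read the first match for a selector, collapse whitespace runs,
// and return "" when the element is missing.
function readText(root: QueryRoot, selector: string): string {
  const node = root.querySelector(selector);
  const raw = node && node.textContent ? node.textContent : "";
  return raw.replace(/\s+/g, " ").trim();
}

// Mirrors the final step: title, separator, first body paragraph.
function captureIntro(root: QueryRoot): string {
  const title = readText(root, TITLE_SELECTOR);
  const lede = readText(root, LEDE_SELECTOR);
  return title + LOG_SEPARATOR + lede;
}
```

In a browser console on an article page, captureIntro(document) should return the same string the template logs, since querySelector returns only the first match, which is the behaviour matchFirst: true describes.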

Customising it for your case

The template is a starting point — a few directions you can take it.

Common gotchas

Wikipedia's article DOM is extremely stable, but a few edge cases trip people up. Disambiguation pages (e.g. /wiki/Apple_(disambiguation)) have a different structure: the first paragraph is usually a one-line "... may refer to" note instead of a real definition. Articles marked as stubs may have only one short sentence. On mobile URLs (en.m.wikipedia.org), the wait_for still works, but the run can be slower because the mobile skin loads more JavaScript. If a captured paragraph is empty, something is probably sitting ahead of the real lede, such as a .hatnote element; try tightening the selector to > p:not(.mw-empty-elt):not(.hatnote) and check the result in DevTools.
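If you post-process the captured text outside BNOD, a small check can flag the cases described above before they reach your notes. The sketch below is a heuristic under stated assumptions: the function names are hypothetical, and the "may refer to" pattern only covers English-language disambiguation pages (and not every wording of them).

```typescript
// Possible outcomes for a captured first paragraph.
type LedeKind = "empty" | "disambiguation" | "ok";

// Classify a captured paragraph. The regular expression is an
// English-only heuristic, not something Wikipedia guarantees.
function classifyLede(text: string): LedeKind {
  const trimmed = text.trim();
  if (trimmed.length === 0) return "empty";
  if (/\bmay (also )?refer to\b/i.test(trimmed)) return "disambiguation";
  return "ok";
}

// Given several candidate paragraphs in page order, return the first
// one that actually contains text, or "" if none does.
function firstUsableParagraph(paragraphs: string[]): string {
  for (const p of paragraphs) {
    if (classifyLede(p) !== "empty") return p.trim();
  }
  return "";
}
```

A result of "disambiguation" is a signal to pick a more specific article URL rather than to change the selector.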

FAQ

Do I need an API key? No. Wikipedia is fully open. There's an official API at en.wikipedia.org/api/rest_v1/page/summary/<title> that returns the same intro paragraph as JSON if you'd rather skip the DOM scrape — replace this template's navigate + get_text blocks with one http_request block.
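For the API route, the following TypeScript sketch shows how the summary URL might be built and how the response could be turned into the same log line. The helper names are hypothetical, and the PageSummary interface lists only the fields this sketch reads (title, extract, type); the real response carries more.

```typescript
// Fields this sketch reads from the summary response (assumed subset).
interface PageSummary { title?: string; extract?: string; type?: string }

// Build the REST summary URL for a title in a given language edition.
// Spaces become underscores, then the title is percent-encoded.
function summaryUrl(title: string, lang: string = "en"): string {
  const slug = encodeURIComponent(title.trim().replace(/\s+/g, "_"));
  return "https://" + lang + ".wikipedia.org/api/rest_v1/page/summary/" + slug;
}

// Turn a parsed response into "title, separator, extract".
// Returns null for disambiguation pages or when a field is missing.
function formatSummary(summary: PageSummary): string | null {
  if (summary.type === "disambiguation") return null;
  if (!summary.title || !summary.extract) return null;
  return summary.title + " \u2014 " + summary.extract;
}

// Possible usage where a global fetch is available:
//   fetch(summaryUrl("Service worker"))
//     .then((res) => res.json())
//     .then((json) => console.log(formatSummary(json)));
```

Returning null instead of an empty string makes it easy to filter out pages that need a manual look.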

Does it work on Wikimedia Commons or Wiktionary? Partially. They share some templates, and Wiktionary uses #firstHeading too, but the page layout differs and Commons pages may not match these selectors. Test before relying on it.

How is this different from copy-paste? You can run it in a loop. Chain it with a loop block iterating over an array of article URLs and, at 1-2 seconds per article, a 100-article extract finishes in a few minutes. Automa and Browserflow both have equivalent loop blocks; the selectors port directly.
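One way to prepare the array for such a loop is to generate it from a list of article titles. This is a minimal sketch, assuming the standard https://<lang>.wikipedia.org/wiki/<title> URL pattern; the function names are hypothetical and nothing here is specific to BNOD's loop block.

```typescript
// Build an article URL from a title: spaces become underscores,
// then the title is percent-encoded.
function articleUrl(title: string, lang: string = "en"): string {
  const slug = encodeURIComponent(title.trim().replace(/\s+/g, "_"));
  return "https://" + lang + ".wikipedia.org/wiki/" + slug;
}

// Turn a list of titles into a de-duplicated array of URLs,
// skipping blank entries and preserving the original order.
function buildBatch(titles: string[], lang: string = "en"): string[] {
  const seen: { [url: string]: boolean } = {};
  const urls: string[] = [];
  for (const title of titles) {
    if (title.trim().length === 0) continue;
    const url = articleUrl(title, lang);
    if (!seen[url]) {
      seen[url] = true;
      urls.push(url);
    }
  }
  return urls;
}
```

De-duplicating up front avoids opening the same article twice when a title list has been pasted together from several sources.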

Will it break on a Wikipedia redesign? Unlikely. Wikipedia rolled out a new skin (Vector 2022) without breaking these selectors. The IDs #firstHeading and #mw-content-text are part of MediaWiki's core HTML output, not theme-specific.

Blocks used

  • manual_trigger
  • navigate
  • wait_for
  • get_text
  • log_data

Works on

  • https://*.wikipedia.org/*

Free. No signup required.
