(French version)

A browser extension for extracting and downloading tweets for text mining.

Cite this software

If you use this extension for your research, please reference it as follows:

Moncomble, F. (2024). 𝕏-Scraper (Version 0.4) [JavaScript]. Arras, France: Université d’Artois. Available at: https://fmoncomble.github.io/X-scraper/

Installation

Firefox (recommended: automatic updates)

Chrome/Edge

    Remember to pin the add-on to the toolbar.

    Instructions for use

    • Navigate to 𝕏/Twitter and perform a search (simple or advanced)
      • Using a dedicated account for scraping is advised
    • Click the add-on’s icon in the toolbar
    • Click Start scraping
    • The interface appears as a layer over the current webpage:
      • (Optional) Set the maximum number of tweets to scrape
      • Choose your preferred output format:
        • XML/XTZ for an XML file to import into TXM using the XML/TEI-Zero + CSV module
          • When initiating the import process, open the “Textual planes” section and type ref in the field labelled “Out of text to edit”
        • TXT for plain text
        • CSV
        • XLSX (Excel spreadsheet)
        • JSON
    • You can abort at any time
    • Click Download to collect the output

    Known issues and limitations

    Too many requests

    The add-on collects tweets by automatically scrolling the search results page. Each scroll triggers further requests to the 𝕏/Twitter server, which eventually starts rejecting them with a 429 response (Too Many Requests). When that happens (generally after scraping ~900 tweets), download the file, click Reset, allow a few minutes for the server to ‘cool down’, then adjust your search parameters to avoid collecting duplicates and resume scraping.
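
If several sessions still overlap after the search parameters have been adjusted, duplicates can be removed once the files are downloaded. The following is a minimal sketch, not part of the add-on: it assumes the JSON output has been loaded as arrays of objects and that each record carries a unique `url` field. Both the field name and the sample records are hypothetical, so check them against your own files.

```javascript
// Merge several scraping sessions and drop duplicate tweets.
// Assumption: each record has a unique `url` field (hypothetical name);
// pass a different key function if your files use another identifier.
function mergeTweets(batches, keyOf = (tweet) => tweet.url) {
  const seen = new Map();
  for (const batch of batches) {
    for (const tweet of batch) {
      const key = keyOf(tweet);
      // Keep only the first occurrence of each tweet.
      if (!seen.has(key)) {
        seen.set(key, tweet);
      }
    }
  }
  return [...seen.values()];
}

// Made-up records standing in for two downloaded JSON files.
const session1 = [
  { url: 'https://example.com/status/1', text: 'first' },
  { url: 'https://example.com/status/2', text: 'second' },
];
const session2 = [
  { url: 'https://example.com/status/2', text: 'second' },
  { url: 'https://example.com/status/3', text: 'third' },
];

const merged = mergeTweets([session1, session2]);
console.log(merged.length); // 3
```

A `Map` keyed on the identifier preserves the order in which tweets were first seen, so the merged list follows the order of the sessions passed in.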

    Interface redesign

    ⚠️ Important! In v0.2, the add-on’s popup window needs to remain open for the extension to behave properly. Clicking outside it, switching to another tab/window, or switching to a different app will cause it to close, effectively preventing the user from interacting with the extension during or after the scraping process.

    This is addressed from v0.3 onwards through a redesigned interface: make sure you are running the latest version.

    Create an ad-hoc account

    Although Elon Musk has repeatedly expressed his opposition to scraping 𝕏/Twitter data, collecting publicly available data for research purposes is legal in most countries. However, as a precaution, it is advisable to create an ad-hoc account for this specific purpose.