A browser extension for extracting and downloading tweets for text mining.
Cite this software
If you use this extension for your research, please reference it as follows:
Moncomble, F. (2024). 𝕏-Scraper (Version 0.4) [JavaScript]. Arras, France: Université d’Artois. Available at: https://fmoncomble.github.io/X-scraper/
Installation
Firefox (recommended: automatic updates)
Chrome/Edge
Remember to pin the add-on to the toolbar.
Instructions for use
- Navigate to 𝕏/Twitter and perform a search (simple or advanced)
- It is advised to create a specific account for the purpose of scraping content
- Click the add-on’s icon in the toolbar
- Click “Start scraping”
- The interface appears as a layer over the current webpage:
- (Optional) Set the maximum number of tweets to scrape
- Choose your preferred output format:
  - XML/XTZ for an XML file to import into TXM using the XML/TEI-Zero + CSV module
    - When initiating the import process, open the “Textual planes” section and type “ref” in the field labelled “Out of text to edit”
  - TXT for plain text
  - CSV
  - XLSX (Excel spreadsheet)
  - JSON
- You can abort at any time
- Click “Download” to collect the output
Known issues and limitations
Too many requests
The add-on collects tweets by automatically scrolling the search results page. This makes repeated calls to the 𝕏/Twitter server, which eventually rejects further requests with a 429 response (Too Many Requests). When that happens (generally after scraping ~900 tweets), download the file, click “Reset”, allow a few minutes for the server to ‘cool down’, then adjust your search parameters to avoid collecting duplicates and resume scraping.
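One way to apply this workaround is to split a large search into date windows and merge the downloaded files afterwards. The sketch below is not part of the add-on: `windowedQuery` and `mergeUnique` are hypothetical helpers, the `since:` and `until:` operators are typed into the 𝕏/Twitter search box by hand, and the `url` field used to spot duplicates is an assumption about the JSON output, so check the field names in your own files first.

```javascript
// Build a search string limited to a date window (dates as YYYY-MM-DD),
// using the `since:` and `until:` search operators.
// Hypothetical helper: paste the result into the 𝕏/Twitter search box.
function windowedQuery(terms, since, until) {
  return `${terms} since:${since} until:${until}`;
}

// Merge several batches of scraped records, keeping the first copy of each.
// ASSUMPTION: every record carries a unique value under `keyField`
// ("url" is a placeholder name, not a documented field of the add-on).
function mergeUnique(batches, keyField = "url") {
  const seen = new Set();
  const merged = [];
  for (const batch of batches) {
    for (const record of batch) {
      const key = record[keyField];
      // Records without a key cannot be compared, so they are kept as-is.
      if (key === undefined || key === null) {
        merged.push(record);
        continue;
      }
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(record);
      }
    }
  }
  return merged;
}

// Usage with made-up data: two batches that overlap on one record.
const firstBatch = [{ url: "a", text: "one" }, { url: "b", text: "two" }];
const secondBatch = [{ url: "b", text: "two" }, { url: "c", text: "three" }];

console.log(windowedQuery("climate", "2024-01-01", "2024-01-08"));
// climate since:2024-01-01 until:2024-01-08
console.log(mergeUnique([firstBatch, secondBatch]).length); // 3
```

Scraping one window per session keeps each run below the point where the server starts refusing requests, and the merge step removes any records that still overlap between sessions.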
Interface redesign
⚠️ Important! In v0.2, the add-on’s popup window needs to remain open for the extension to behave properly. Clicking outside it, switching to another tab/window, or switching to a different app will cause it to close, effectively preventing the user from interacting with the extension during or after the scraping process.
This is addressed in v0.3 through a redesigned interface: make sure you are using the latest version.
Create an ad-hoc account
Although Elon Musk has repeatedly expressed his opposition to scraping 𝕏/Twitter data, collecting publicly available data for research purposes is legal in most countries. However, as a precaution, it is advisable to create an ad-hoc account for this specific purpose.