What is TwitterAPI.io?

TwitterAPI.io is a real-time X (Twitter) API alternative that provides 75+ endpoints covering tweets, users, search, and trends at $0.00015 per read — about 33× cheaper than the official X API ($0.005–$0.010 per read). It uses pay-per-use pricing with no subscriptions or minimums, built for developers and AI agents that need predictable per-call cost at scale.

How much does the X API cost vs TwitterAPI.io?

The official X API charges $0.005 per post read and $0.010 per user/DM/following read on its pay-per-usage tier; TwitterAPI.io charges a flat $0.00015 per read across the same endpoints — about 33× cheaper. For one million reads per month, the official X API costs roughly $5,000 vs $150 on TwitterAPI.io, with the same data structure and no monthly cap on either side.

Does TwitterAPI.io have a monthly cap or rate limit?

TwitterAPI.io has no monthly cap on API calls; it uses pay-per-use pricing with built-in rate limiting and a user-configurable spending limit. This matches the official X API model (which also has no platform monthly cap, only user-set spending limits) but at about 1/33 of the per-call cost, making it predictable for high-volume use cases.

What endpoints does TwitterAPI.io provide?

TwitterAPI.io provides 75+ endpoints covering tweets (advanced search, lookups, timeline), users (profiles, followers, following), trends, lists, spaces, and tweet filtering rules (real-time webhook stream). The endpoint surface is X API-compatible — same resource shape — so existing integrations migrate by swapping the host and using an X-API-Key header instead of OAuth.

Twitter (X) Scraping — A Developer's Guide

By Michael Park • 6 min read

Twitter (X) scraping covers a wide span of workflows: pulling user timelines for OSINT research, fetching hashtag conversations for brand monitoring, archiving historical posts for academic study, and building real-time monitoring dashboards. The implementation choice matters more than most dev teams realize. The wrong path means weekly maintenance whenever X tweaks its HTML; the right path means stable, structured JSON with a few lines of code.

This guide walks through the four paths with runnable Python, per-path cost taken from each provider's published pricing, and a practical decision rule for which to pick. Pricing references cite their source URLs, and cost ratios are derived from those figures.

01 — Section

The four paths — at a glance

Path 1 — twitterapi.io API: structured JSON returned directly, $0.00015 per tweet (twitterapi.io/pricing), no Developer Console required, no HTML parsing, no UI-change breakage.

Path 2 — X official API: structured JSON, $0.005 per post (docs.x.com), requires X Developer Console + bearer token. 7-day window on recent-search; full-archive search is enterprise tier.

Path 3 — Browser-automation scrapers (Playwright / Puppeteer): free to run, you write the HTML parsing yourself, breaks on every X UI change, ToS-risk if X detects automation patterns. Reasonable for one-off small jobs; painful as a production dependency.

Path 4 — Third-party scraper SaaS (Apify / scrapfly / similar): abstracts the browser path, hosts the maintenance, priced per actor run or per credit. Pay-as-you-go without the dev-overhead of building your own Playwright stack.

02 — Section

Path 1 — twitterapi.io (recommended default)

Auth is a single X-API-Key header — no OAuth, no X account required (the API authenticates by key, not by user-login). Sign up at twitterapi.io with email, receive the key, start calling.

Pricing per twitterapi.io/pricing: $0.00015 per returned tweet, $0.00018 per profile lookup, no monthly minimums.

python

import os, requests

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def scrape_user_timeline(handle: str, max_pages: int = 10):
    """Scrape a user's timeline — pure API, no HTML parsing."""
    tweets, cursor = [], None
    for _ in range(max_pages):
        params = {"userName": handle}
        if cursor: params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/user/last_tweets",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        tweets.extend(resp.get("data", []))
        cursor = resp.get("next_cursor")
        if not cursor: break
    return tweets

def scrape_hashtag(tag: str, max_pages: int = 10):
    """Scrape hashtag conversations — full advanced-search operators."""
    tweets, cursor = [], None
    for _ in range(max_pages):
        params = {"query": f"#{tag} -is:retweet lang:en"}
        if cursor: params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        tweets.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor: break
    return tweets

# Both functions return structured JSON — no HTML parsing
for t in scrape_user_timeline("nasa")[:5]:
    print(f"  {t['id']}: {t.get('text', '')[:80]}")
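The loops above call raise_for_status() and stop on the first error. For longer pulls, retrying transient failures with exponential backoff may be worth adding. The sketch below is not taken from twitterapi.io's documentation: the helper names get_with_retry and backoff_delays, the set of retryable status codes, and the delay values are all illustrative assumptions.

```python
import time

# Status codes treated as transient here are an assumption; adjust them to
# whatever your provider actually returns.
RETRYABLE = {429, 500, 502, 503, 504}

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0):
    """Wait (in seconds) before each retry: base, 2*base, 4*base, ... capped."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

def get_with_retry(url, *, headers=None, params=None, retries=3,
                   get=None, sleep=time.sleep):
    """GET a URL, retrying transient HTTP statuses with exponential backoff.

    `get` and `sleep` are injectable so the helper can be exercised offline.
    """
    if get is None:
        import requests  # same HTTP client as the examples above
        get = requests.get
    delays = backoff_delays(retries)
    for attempt in range(retries + 1):
        r = get(url, headers=headers, params=params, timeout=15)
        if r.status_code not in RETRYABLE or attempt == retries:
            r.raise_for_status()  # raises on any remaining 4xx/5xx
            return r
        sleep(delays[attempt])
```

Inside the pagination loops, the requests.get(...) call could be swapped for get_with_retry(...) with the same url, headers, and params.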

03 — Section

Path 2 — X official API

Requires X Developer Console onboarding (developer.x.com), which requires an X account in good standing. Auth via OAuth bearer token. Recent-search returns last 7 days; full-archive (/2/tweets/search/all) is academic/enterprise tier.

Pricing per docs.x.com/x-api/getting-started/pricing: $0.005 per post read.

python

# pip install tweepy
import tweepy

client = tweepy.Client(bearer_token="YOUR_X_BEARER")

def scrape_x_official(query: str, max_results: int = 100):
    tweets = []
    for page in tweepy.Paginator(
        client.search_recent_tweets,
        query=query,
        max_results=max_results,
        tweet_fields=["created_at", "public_metrics", "author_id"],
        limit=10,
    ):
        tweets.extend(page.data or [])
    return tweets

for t in scrape_x_official("#machinelearning -is:retweet lang:en")[:5]:
    print(f"  {t.id}: {t.text[:80]}")
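Because the official API bills per post read, the max_results and limit arguments above also bound what a single call can cost. A minimal arithmetic sketch, assuming the $0.005 figure quoted above applies to every post returned (the function name is illustrative):

```python
X_PRICE_PER_POST = 0.005  # USD per post read, the figure quoted above

def max_search_cost(max_results: int = 100, page_limit: int = 10,
                    price_per_post: float = X_PRICE_PER_POST) -> float:
    """Upper bound on spend for one paginated search: pages x posts per page x price."""
    return max_results * page_limit * price_per_post

# Defaults mirror scrape_x_official: up to 10 pages of 100 posts each.
print(f"worst case per call: ${max_search_cost():.2f}")
```

With the defaults used above, one call can read up to 1,000 posts, or about $5 under this assumption; lowering limit is the simplest way to cap it.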

04 — Section

Path 3 — Browser-automation (Playwright)

The classic 'scrape it ourselves' path. Spin up a headless browser, navigate to a tweet URL or search page, parse the rendered HTML. Works without any API key but has three structural problems:

Problem 1 — Maintenance: X changes their HTML structure regularly. Every change breaks your selectors. Dev teams running Playwright scrapers typically spend half a day per month patching parsers.

Problem 2 — Login required for most content: anonymous browsers see limited tweets; full content requires login. Login automation triggers X's detection (captcha, account lock, anti-bot flags). Maintaining a pool of working accounts is its own engineering project.

Problem 3 — ToS-risk: Playwright + automated login patterns violate X's terms. Accounts get suspended, IPs get rate-limited, your scraper stops working at the worst possible moment (right before a deadline).

Reasonable for: a one-off small scrape, learning the platform, or a workflow that genuinely needs DOM-level data not exposed via API. Not reasonable for: a production dependency.

python

# pip install playwright
# python -m playwright install chromium
from playwright.sync_api import sync_playwright

def scrape_via_browser(tweet_url: str):
    """Illustrative only; production use should prefer the API path."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(tweet_url, wait_until="networkidle")
            # Selectors here break whenever X redesigns. Treat as fragile.
            text = page.locator("[data-testid='tweetText']").first.text_content()
            likes = page.locator("[data-testid='like']").first.text_content()
        finally:
            browser.close()  # release the browser even if a selector times out
        return {"text": text, "likes": likes}

# Real-world: this works today, breaks next month, requires constant maintenance
# Most production teams move off this path within 6 months
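One more parsing cost of this path: text_content() returns whatever text the page renders, so the likes value above is a display string, not a number. A helper for converting such strings might look like the sketch below. It assumes abbreviated forms such as "1.2K" or "3M"; the rendered format is controlled by X and may differ or change, and the function name is illustrative.

```python
import re

_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_display_count(text):
    """Convert a rendered count such as '1,234' or '1.2K' to an int; None if unparseable."""
    if not text:
        return None  # empty or missing text, e.g. no count rendered
    m = re.fullmatch(r"\s*([\d,]*\.?\d+)\s*([KMB]?)\s*", text.upper())
    if not m:
        return None
    number = float(m.group(1).replace(",", ""))
    return round(number * _SUFFIX[m.group(2)])
```

Returning None rather than raising keeps one malformed value from aborting a whole scrape; abbreviated values are also lossy, which the API paths avoid by returning exact integers.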

05 — Section

Path 4 — Third-party scraper SaaS

Tools like Apify, scrapfly, and ScraperAPI host the browser-automation maintenance for you. You call their API; they handle the proxy rotation, the parser updates, and the anti-bot evasion. Pricing varies by provider: Apify's Twitter scraper actor is roughly $0.40 to $2.00 per 1,000 tweets depending on plan, while scrapfly charges per credit per scrape, with cost dependent on render mode and proxy tier.

Reasonable for: workflows that need the abstraction over Playwright. Less reasonable for: most workflows, because if you're paying anyway, paying per structured tweet (paths 1 and 2) is operationally simpler than paying per actor run with parsing edge cases.

06 — Section

Side-by-side — 4-path matrix

Per-tweet cost derived from each provider's published pricing page. Apify and scrapfly pricing approximate based on their public plan tiers at apify.com/pricing and scrapfly.io/pricing.

Dimension	twitterapi.io API	X official API	Playwright DIY	Apify / scrapfly
Per-tweet cost	$0.00015 (twitterapi.io/pricing)	$0.005 (docs.x.com)	$0 + dev time	$0.0004-$0.002 per tweet (~typical actor pricing)
Setup friction	API key (email signup)	X Developer Console (X account required)	code + accounts + proxies	API key (provider signup)
Maintenance	none — provider handles	none — provider handles	weekly to monthly	none — provider handles
HTML parsing	none — JSON returned	none — JSON returned	full — your code	none — provider returns JSON
ToS risk	low (read-only public data)	low (official)	medium-high	medium
Best for	most workloads, default choice	already on X bill	one-off learning	when you specifically want the Apify ecosystem

Two practical observations: (a) the dev-time cost of DIY scraping dominates the dollar cost of API paths for any sustained workload; (b) at 33× cheaper per call than X official, twitterapi.io's economics let you scrape at scale without re-deciding budget every month.
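Both observations can be checked with simple arithmetic. The sketch below uses the per-tweet prices from the matrix; for the DIY path, the default of 4 maintenance hours mirrors the half-day-per-month figure mentioned under Path 3, and the $75 hourly developer rate is purely a placeholder assumption, so substitute your own numbers.

```python
PRICE_PER_TWEET = {  # USD, taken from the matrix above
    "twitterapi.io": 0.00015,
    "x_official": 0.005,
    "apify_low": 0.0004,
    "apify_high": 0.002,
}

def monthly_cost(tweets_per_month: int, dev_hours: float = 4.0,
                 hourly_rate: float = 75.0) -> dict:
    """Monthly USD cost per path; Playwright is priced as dev time only."""
    costs = {name: tweets_per_month * price
             for name, price in PRICE_PER_TWEET.items()}
    # Hypothetical rate; proxy and account costs are not modeled.
    costs["playwright_diy"] = dev_hours * hourly_rate
    return costs

# Print the paths from cheapest to most expensive at 1M tweets per month.
for name, usd in sorted(monthly_cost(1_000_000).items(), key=lambda kv: kv[1]):
    print(f"{name:>15}: ${usd:,.2f}")
```

The DIY figure stays flat as volume grows while the API figures scale linearly, so the ranking depends on both the volume and the hourly rate you plug in.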

07 — Section

Common scraping workloads — which path fits

Brand monitoring (track mentions of your brand): twitterapi.io advanced_search with "your brand" query, hourly cron, ~500 tweets/run = $0.075/run × 24/day = $1.80/day. Stable, no maintenance.

OSINT / journalism (research a person or topic): same advanced_search pattern, ad-hoc queries. Per-query cost is single-digit cents.

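Academic research (full-archive multi-year pull): twitterapi.io combines archive depth with a low per-tweet cost. A five-year archive of a moderate-activity account costs single-digit dollars; for example, 10 tweets a day for five years is about 18,250 tweets, or roughly $2.74 at $0.00015 per tweet.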

Real-time monitoring (instant alerts): WebSocket streaming via twitterapi.io's stream endpoint, or polling the search endpoint every 30s. Stream is the cleaner pattern for high-volume.

Analytics product (build a dashboard for users): twitterapi.io for the read layer + your warehouse + your dashboard frontend. Per-call cost stays linear with usage.
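For the polling variant of real-time monitoring, consecutive polls of the same query will mostly return tweets already seen, so the loop needs to de-duplicate by id. A minimal sketch follows; fetch is a hypothetical callable that returns a list of tweet dicts with an id field (for example, a one-page call to the search function shown earlier), and sleep is injectable so the loop can be tested without waiting.

```python
import time

def poll_new_tweets(fetch, interval_s: float = 30.0, max_polls: int = 3,
                    sleep=time.sleep):
    """Call `fetch()` repeatedly and yield only tweets whose id is new."""
    seen = set()
    for i in range(max_polls):
        for tweet in fetch():
            if tweet["id"] not in seen:
                seen.add(tweet["id"])
                yield tweet
        if i < max_polls - 1:
            sleep(interval_s)  # no wait after the final poll
```

Two caveats: the seen set grows without bound in a long-running process, and, assuming billing is per returned tweet as quoted earlier, overlapping results are paid for on every poll. Both are reasons the stream pattern is cleaner at high volume.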

08 — Section

Picking the path — decision rule

Default: twitterapi.io API. Lowest setup friction, lowest per-call cost, zero maintenance burden. No Developer Console gating.

Already on X official for other workflows: X official; marginal cost rides on the same auth.

Genuinely need DOM-level data (visible-rendering details, paid-tier-only fields): Playwright for the specific case, kept small and isolated from your main pipeline.

Want the Apify ecosystem (actor marketplace + workflow integration): Apify, knowing the per-tweet cost is higher than API paths.

Most teams that end up at twitterapi.io get there the long way: they try Playwright first and hit fragility, move to X official and hit cost, then land at twitterapi.io. Starting there saves the iteration time.
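The rule is small enough to write down as a function, which can be handy in a project README or a planning script. In the sketch below, the function name, the flag names, and the order in which the conditions are checked are assumptions; the guide itself does not rank the three exceptions against each other.

```python
def pick_path(already_on_x_api: bool = False, needs_dom_data: bool = False,
              wants_apify: bool = False) -> str:
    """Encode the decision rule above; the first matching condition wins."""
    if needs_dom_data:
        return "playwright (small, isolated)"
    if already_on_x_api:
        return "x official api"
    if wants_apify:
        return "apify"
    return "twitterapi.io"  # the default when no exception applies

print(pick_path())
print(pick_path(needs_dom_data=True))
```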

python

# Practical example: stable brand-mention scraper running on cron.
import os, requests, json
from datetime import datetime, timezone

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def scrape_brand_mentions(brand: str, hours_back: int = 1, out_path: str = "mentions.jsonl"):
    query = f'"{brand}" -is:retweet within_time:{hours_back}h'
    tweets, cursor = [], None
    for _ in range(20):  # cap pages so a runaway query doesn't drain credit
        params = {"query": query}
        if cursor: params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        tweets.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor: break
    snapshot = {
        "brand": brand,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "count": len(tweets),
        "tweets": tweets[:200],  # cap for storage
    }
    with open(out_path, "a") as f:
        f.write(json.dumps(snapshot) + "\n")
    return snapshot

snap = scrape_brand_mentions("twitterapi.io")
print(f"captured {snap['count']} mentions in last hour at {snap['captured_at']}")
for t in snap["tweets"][:3]:
    print(f"  @{t.get('author', {}).get('userName')}: {t.get('text', '')[:80]}")

# Cost framing (math from cited pricing pages):
#   ~500 tweets per hourly run × $0.00015 = $0.075 per run
#   Hourly × 24 × 30 = $54/month for continuous brand monitoring
#   Same workload via Playwright: 0 dollars + 4 hours/month patching selectors after X redesigns
# The dev-time cost makes the dollar cost negligible — API path wins on total cost
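Hourly runs append one snapshot per line, and consecutive query windows can overlap, so a downstream reader may want to merge snapshots by tweet id. A minimal sketch, assuming each stored tweet dict carries an id field as in the first example (the function name is illustrative):

```python
import json

def load_unique_mentions(path: str = "mentions.jsonl") -> list:
    """Read every snapshot line and return tweets de-duplicated by id (first seen wins)."""
    unique = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            for tweet in json.loads(line).get("tweets", []):
                unique.setdefault(tweet["id"], tweet)
    return list(unique.values())
```

Keeping the first occurrence preserves the earliest captured copy of each tweet; switching to plain assignment (unique[tweet["id"]] = tweet) would keep the latest copy instead, which may be preferable if engagement counts matter.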

09 — Questions

Questions readers ask

Is scraping Twitter (X) against the terms of service?

It depends on the path. Using the official X API or third-party API providers like twitterapi.io is the standard developer workflow within terms. Browser-automation + automated-login at scale is the path where ToS-violation risk lives. Read-only data via documented APIs is generally fine; review docs.x.com developer terms for your specific commercial use case.

Can I scrape tweets without an X account?

Yes, via twitterapi.io — API-key auth doesn't require an X account. X official requires an account for the Developer Console. Playwright + anonymous browsing returns limited content (login-walled features hidden). See /blog/twitter-no-account-api-read-only for the no-X-account path detail.

How fast can I scrape — what are the rate limits?

twitterapi.io rate limits are per-account, with defaults in the thousands of requests/hour. X official's rate limits vary by tier (50/15-min on the basic search). Playwright is limited by your browser instance + X's anti-bot detection (slow + flaky). For high-volume, the API path always wins.
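Whatever the provider's limit is, a client-side throttle keeps a loop under it. The sketch below spaces calls to a fixed rate. The Throttle class is illustrative, and the 10 requests per second in the usage line is an arbitrary example value, not a documented limit of any provider.

```python
import time

class Throttle:
    """Block so that successive wait() calls are at least 1/rate seconds apart."""

    def __init__(self, rate_per_s: float, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 1.0 / rate_per_s
        self._clock, self._sleep = clock, sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_gap - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

throttle = Throttle(rate_per_s=10)  # arbitrary example rate
# Call throttle.wait() before each requests.get(...) in a pagination loop.
```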

What about scraping deleted tweets?

Deleted tweets are typically filtered out by the search endpoints. For deleted-tweet research, see /blog/deleted-tweet-search and /blog/twitter-archive-tweet-finder-guide — different specialty workflows.

Can I scrape and store the data commercially?

Public tweet data + profile data storage is generally allowed by both X's developer terms and most third-party API providers. Restrictions vary by use case (resale of data, training AI models, advertising) — review the specific terms before commercial use. Personal data + DMs are not accessible via public scraping regardless.

What's the cost difference at scale (1M tweets / month)?

Math from cited pricing pages: 1M tweets costs $150/month at twitterapi.io and $5,000/month at X official; the Apify actor cost varies by plan tier (see apify.com/pricing). Playwright costs $0 in dollars, but the dev-maintenance time compounds each time X changes its HTML. Counting dollars plus dev time, the API paths usually win for sustained workloads.
