twitterapi.io is an independent third-party service. Not affiliated with X Corp.


Twitter (X) Analysis API — A Developer's Guide

By Michael Park · 4 min read

Twitter (X) analysis is a broad umbrella term in developer workflows: it covers content analytics, sentiment analysis, brand monitoring, competitive intelligence, campaign measurement, and academic research. Underneath all of those use cases, the workflow decomposes the same way: pull matching tweets via the search API, aggregate client-side by your chosen dimensions, and persist the results for trend analysis.

This guide walks through the primitives in Python with runnable code, per-tweet cost taken from each provider's published pricing page, and the canonical aggregation patterns. Pricing references cite their source URLs.

01 — Section

The three-step pattern under every analysis workflow

1. Fetch — query the search API with operators matching your scope (keyword / hashtag / from:user / date range). Paginate via cursor until exhausted or budget hit.

2. Aggregate — group client-side by the dimensions that matter for your analysis: author, time bucket (hour / day / week), engagement bucket (likes / retweets / replies), hashtag overlap, language.

3. Persist + repeat — write each snapshot to your warehouse with a captured-at timestamp. The longitudinal series is where the most useful analytics signals live (trends, growth rates, peer comparisons).
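
The time-bucket dimension in step 2 can be sketched as follows. This is a minimal example that assumes timestamps arrive as ISO 8601 strings (the X v2 shape); other providers may use a different format, in which case the parsing line needs adjusting. `bucket_key` is a hypothetical helper name, and the sample timestamps are illustrative rather than real API output.

```python
from collections import Counter
from datetime import datetime

def bucket_key(ts: str, granularity: str = "hour") -> str:
    """Map an ISO 8601 timestamp to an hour, day, or ISO-week bucket label."""
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalize it.
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if granularity == "hour":
        return dt.strftime("%Y-%m-%dT%H")
    if granularity == "day":
        return dt.strftime("%Y-%m-%d")
    if granularity == "week":
        year, week, _ = dt.isocalendar()
        return f"{year}-W{week:02d}"
    raise ValueError(f"unknown granularity: {granularity}")

# Illustrative timestamps (assumption: ISO 8601, UTC).
sample = ["2024-05-01T09:15:00Z", "2024-05-01T09:45:00Z", "2024-05-02T18:00:00Z"]
by_hour = Counter(bucket_key(ts, "hour") for ts in sample)
by_week = Counter(bucket_key(ts, "week") for ts in sample)
```

Using a parsed datetime rather than string slicing keeps the hour, day, and week buckets consistent even when the timestamp carries fractional seconds or an explicit offset.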

02 — Section

Path 1 — twitterapi.io

twitterapi.io's /twitter/tweet/advanced_search accepts the full X advanced-search expression as the query parameter. Authentication uses the X-API-Key header.

Pricing per twitterapi.io/pricing: $0.00015 per returned tweet. No monthly minimum.
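
Since billing is per returned tweet, a dollar budget translates directly into a cap on how many tweets to fetch. The sketch below shows that arithmetic using the rate quoted above; `max_tweets_for_budget` and `estimated_cost` are hypothetical helper names, not part of any provider SDK.

```python
PRICE_PER_TWEET_USD = 0.00015  # rate quoted from the pricing page above

def max_tweets_for_budget(budget_usd: float, price: float = PRICE_PER_TWEET_USD) -> int:
    """Largest number of returned tweets that fits inside a dollar budget."""
    # Work in whole micro-dollars to avoid float rounding at the boundary.
    return int(round(budget_usd * 1_000_000)) // int(round(price * 1_000_000))

def estimated_cost(tweet_count: int, price: float = PRICE_PER_TWEET_USD) -> float:
    """Dollar cost of a pull that returned `tweet_count` tweets."""
    return round(tweet_count * price, 6)

cap = max_tweets_for_budget(1.00)   # tweets affordable for one dollar
cost = estimated_cost(500)          # cost of a 500-tweet pull
```

In a pagination loop, the cap can serve as the "budget hit" stop condition: break once the number of collected tweets reaches it.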

python
import os, requests
from collections import Counter

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def analysis_for(query: str, max_pages: int = 10):
    """Run analysis pull + client-side aggregation."""
    tweets, cursor = [], None
    for _ in range(max_pages):
        params = {"query": query}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        tweets.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor: break

    # Aggregate by author
    by_author = Counter()
    # Aggregate by hour
    by_hour = Counter()
    # Engagement distribution
    likes = []
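    for t in tweets:
        author = t.get("author", {}).get("userName", "unknown")
        by_author[author] += 1
        # Field names vary by provider: accept camelCase or X v2-style keys.
        # This is an assumption; verify against a real response before relying on it.
        ts = t.get("createdAt") or t.get("created_at") or ""
        # Slicing to 13 characters yields an hour bucket only for ISO 8601 timestamps.
        if ts: by_hour[ts[:13]] += 1
        pm = t.get("public_metrics", {})
        likes.append(t.get("likeCount", pm.get("like_count", 0)))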

    return {
        "total": len(tweets),
        "top_authors": by_author.most_common(10),
        "by_hour": dict(by_hour),
        "engagement_p50": sorted(likes)[len(likes)//2] if likes else 0,
    }

result = analysis_for('"machine learning" min_faves:50 lang:en')
print(f"total: {result['total']}")
for author, count in result["top_authors"]:
    print(f"  {author}: {count}")
03 — Section

Path 2 — X official

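X's /2/tweets/search/recent accepts the X API v2 search operators. That set overlaps with the advanced-search operators but is not identical: engagement operators such as min_faves: are not in the documented v2 operator list, so engagement thresholds are applied client-side on public_metrics. Recent search covers roughly the last seven days. Auth is a bearer token from the X Developer Console.

Pricing per docs.x.com/x-api/getting-started/pricing: $0.005 per post read, 24h UTC dedup window.

python
# pip install tweepy
import tweepy
from collections import Counter

client = tweepy.Client(bearer_token="YOUR_X_BEARER")

def analysis_x(query: str, max_results: int = 100, min_likes: int = 50):
    tweets = []
    for page in tweepy.Paginator(
        client.search_recent_tweets,
        query=query,
        max_results=max_results,
        tweet_fields=["created_at", "public_metrics", "author_id"],
        limit=10,
    ):
        tweets.extend(page.data or [])
    # No min_faves: operator in v2 search, so apply the like threshold here.
    tweets = [t for t in tweets if (t.public_metrics or {}).get("like_count", 0) >= min_likes]
    by_author = Counter(t.author_id for t in tweets)
    return {"total": len(tweets), "top_authors": by_author.most_common(10)}

print(analysis_x('"machine learning" lang:en -is:retweet'))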
04 — Section

Common aggregation dimensions

Six aggregations that production analysis workflows commonly compute:

- By author — count tweets per author, sum engagement per author. Surfaces top contributors / influencers in a topic.

- By time bucket — hour / day / week of created_at. Surfaces activity curves; useful for campaign-moment detection or steady-state monitoring.

- By engagement bucket — distribution of like_count across the result set. Surfaces long-tail (a few viral, many quiet).

- By hashtag overlap — extract #tag from each tweet body, count co-occurrences. Surfaces topic clusters.

- By language — group by lang field. Useful for cross-regional brand analysis.

- By mention graph — extract @user mentions, count co-mentions. Surfaces network ties.
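
The hashtag-overlap and mention-graph aggregations share one mechanism: extract the tokens from each tweet body, then count the pairs that appear together. A minimal sketch follows, assuming plain tweet text as input and simple regular expressions for extraction. A production pipeline might prefer structured entity fields if the API response provides them. `cooccurrence` is a hypothetical helper, and the sample texts are made up.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")

def cooccurrence(texts, pattern):
    """Count unordered pairs of matches that appear in the same text."""
    pairs = Counter()
    for text in texts:
        # Lowercase and de-duplicate so "#AI #ai" counts as a single tag.
        items = sorted({m.lower() for m in pattern.findall(text)})
        pairs.update(combinations(items, 2))
    return pairs

# Illustrative tweet bodies, not real data.
texts = [
    "Training tips #ML #AI via @alice and @bob",
    "New paper #ai #ml #nlp",
    "Thanks @alice @bob!",
]
tag_pairs = cooccurrence(texts, HASHTAG_RE)
mention_pairs = cooccurrence(texts, MENTION_RE)
```

Sorting each tweet's tokens before pairing gives every pair a canonical order, so ("ai", "ml") and ("ml", "ai") land in the same counter bucket.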

05 — Section

Side-by-side comparison — 2 API paths

Same job (run analysis pull) framed across the two paths. Costs derived from cited pricing.

| Dimension | twitterapi.io | X official |
| --- | --- | --- |
| Per-tweet cost | $0.00015 (twitterapi.io/pricing) | $0.005 (docs.x.com) |
| Auth | X-API-Key header | bearer token |
| Query syntax | full advanced-search operators | X API v2 search operators (overlapping set; no min_faves:) |
| 24h UTC dedup | no, re-reads are re-billed | yes |
| Library | none | xdk or tweepy |
| Best for | analytics workloads at scale, archive backfills | already-on-X-bill mixed workloads |

Two practical observations: (a) the ~33× per-tweet cost ratio ($0.005 / $0.00015) compounds at any meaningful volume; (b) X official's 24h UTC dedup helps workloads that re-poll the same posts within one day.
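
How much the dedup window matters depends on how often the same posts are re-read. The sketch below models the extreme case where every poll in a UTC day returns the same set of tweets. It assumes, per the comparison above, that a deduplicated read is billed once per day while a non-deduplicated read is billed on every poll. Real polls return a mix of new and repeated tweets, so this is an upper bound on the dedup benefit, and `daily_cost` is a hypothetical helper.

```python
def daily_cost(tweets_per_poll: int, polls_per_day: int, price: float, dedup: bool) -> float:
    """Worst-case re-polling cost when every poll returns the same tweets."""
    # With dedup, repeated reads inside the window are billed once.
    billed_reads = tweets_per_poll if dedup else tweets_per_poll * polls_per_day
    return round(billed_reads * price, 6)

# Hourly polling of a 500-tweet result set, using the per-tweet rates cited above.
no_dedup = daily_cost(500, 24, 0.00015, dedup=False)
with_dedup = daily_cost(500, 24, 0.005, dedup=True)
```

Under these illustrative numbers the daily cost is $1.80 without dedup at the lower rate and $2.50 with dedup at the higher rate, so the gap narrows considerably for high-frequency re-polling of a stable result set.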

06 — Section

Persistence + trend analysis

Storage shape: (query_id, captured_at, top_authors, by_hour, p50_engagement) row per pull. Daily snapshot cadence builds the longitudinal dataset for trend analysis.

Schema choice: SQLite for small workloads, Postgres for production multi-user analytics, BigQuery / Snowflake for warehouse-scale.
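
For the small-workload case, the storage shape above maps onto a single SQLite table. The following is a sketch under assumptions: the table name `snapshots`, the JSON encoding of the nested columns, and the `save_snapshot` helper are illustrative choices, and the input dict mirrors the keys returned by the aggregation function earlier in this guide.

```python
import json
import sqlite3
from datetime import datetime, timezone

def save_snapshot(conn, query_id: str, result: dict) -> None:
    """Append one aggregated pull as a row keyed by (query_id, captured_at)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               query_id TEXT NOT NULL,
               captured_at TEXT NOT NULL,
               top_authors TEXT,      -- JSON-encoded list
               by_hour TEXT,          -- JSON-encoded mapping
               p50_engagement REAL,
               PRIMARY KEY (query_id, captured_at)
           )"""
    )
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)",
        (
            query_id,
            datetime.now(timezone.utc).isoformat(),
            json.dumps(result.get("top_authors", [])),
            json.dumps(result.get("by_hour", {})),
            result.get("engagement_p50", 0),
        ),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # use a file path for real runs
save_snapshot(conn, "ml-en", {
    "top_authors": [["alice", 3]],
    "by_hour": {"2024-05-01T09": 3},
    "engagement_p50": 72,
})
rows = conn.execute("SELECT query_id, p50_engagement FROM snapshots").fetchall()
```

Keeping the nested aggregates as JSON text keeps the schema flat; if per-author queries become common, a separate child table per author row would likely serve better.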

Compute metrics over time: 7-day rolling mean of top-author share, week-over-week change in posting cadence, hashtag-trend ranks. Plot in Metabase / Grafana / your own React dashboard.
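
Two of these metrics can be sketched in plain Python over a list of daily values. The example assumes one value per day in chronological order with no gaps; the helper names and the sample series are illustrative.

```python
def rolling_mean(values, window=7):
    """Trailing mean over the last `window` points; None until the window fills."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out

def week_over_week(values):
    """Fractional change of the latest 7-day total versus the 7 days before it."""
    if len(values) < 14:
        return None
    prev, curr = sum(values[-14:-7]), sum(values[-7:])
    return (curr - prev) / prev if prev else None

# Illustrative daily tweet totals for 14 consecutive days.
daily_totals = [100, 120, 110, 90, 130, 140, 150, 160, 150, 170, 180, 160, 190, 200]
smoothed = rolling_mean(daily_totals)
wow = week_over_week(daily_totals)
```

The same functions apply to any daily series, such as a top author's share of total tweets, once the snapshots are read back in date order.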

Anomaly detection: alert when a topic's tweet volume exceeds N× normal — useful for trending-event triggers.
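
A minimal version of this check compares the current volume against a baseline from recent history. The sketch below uses the median as "normal", which is one reasonable choice among several (a mean or a seasonal baseline would also work). The default multiplier, the minimum history length, and the name `is_spike` are assumptions for illustration.

```python
from statistics import median

def is_spike(history, current, n=3.0, min_history=7):
    """True when `current` exceeds n times the median of recent volumes."""
    if len(history) < min_history:
        return False  # not enough data to define "normal"
    baseline = median(history)
    return baseline > 0 and current > n * baseline

# Illustrative daily volumes for the past week.
history = [100, 120, 110, 90, 130, 140, 150]
spike = is_spike(history, 400)
quiet = is_spike(history, 300)
```

The median is less sensitive than the mean to a previous spike still sitting in the history window, which reduces the chance that one viral day masks the next.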

07 — Section

Picking a path — the decision rule

Building analysis into a product or dashboard? → twitterapi.io. Per-call cost makes broad-coverage analytics economically viable.

Already on the X bill for other workflows? → X official; the marginal analysis cost rides on the same auth, and the 24h dedup helps.

Research / one-off analysis? → either works at low-volume cost.

A common production setup pairs twitterapi.io for the bulk-fetch layer with your own warehouse for longitudinal storage and a BI tool for visualization.

python
# Practical example: longitudinal brand-mention analysis with daily snapshots.
import os, requests, json
from datetime import datetime, timezone
from collections import Counter

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def daily_analysis(brand: str, out_path: str = "brand_analysis.jsonl"):
    tweets, cursor = [], None
    for _ in range(10):
        params = {"query": f'"{brand}" min_faves:10 lang:en'}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        tweets.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor: break

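    by_author = Counter(t.get("author", {}).get("userName", "unknown") for t in tweets)
    # The like-count key varies by provider: accept camelCase or X v2-style keys.
    # This is an assumption; verify against a real response before relying on it.
    likes = sorted(t.get("likeCount", t.get("public_metrics", {}).get("like_count", 0)) for t in tweets)
    p50 = likes[len(likes)//2] if likes else 0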

    snapshot = {
        "brand": brand,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "total": len(tweets),
        "top_authors": dict(by_author.most_common(10)),
        "engagement_p50": p50,
    }
    with open(out_path, "a") as f:
        f.write(json.dumps(snapshot) + "\n")
    return snapshot

snap = daily_analysis("twitterapi.io")
print(snap)

# Cost framing (math from cited pricing pages):
#   ~500 tweets per query × $0.00015 = $0.075 per daily run
#   Per brand × 30 days = $2.25/mo
#   Same workload via X official: $0.005 × 500 × 30 = $75/mo (~33x more)
# Brand coverage at twitterapi.io scales linearly + cheaply.
08 — Questions

Questions readers ask

How is this different from competitor analysis or hashtag analytics?

It's the umbrella pattern: competitor analysis (track specific accounts), hashtag analytics (track specific tags), brand monitoring (track brand mentions), and academic research are all specializations of the same fetch-aggregate-persist primitive. See the per-use-case guides for the specific aggregations each one surfaces.

What sentiment analysis library should I use?

The API returns raw tweet text — sentiment is downstream. Options: OpenAI / Anthropic LLM with a classifier prompt (cents per 100 tweets at gpt-4-class rates), local fine-tuned transformer model (one-time setup, no per-call cost), or a simple keyword-cue baseline (free, less accurate). Pick by accuracy + cost budget.
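
The keyword-cue baseline is the simplest of the three options to show. The sketch below is illustrative only: the cue-word lists are tiny made-up examples, and the approach ignores negation, sarcasm, and context, which is why it is less accurate than the model-based options.

```python
import re

# Tiny illustrative cue lists; a real baseline would use a larger lexicon.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"bad", "hate", "broken", "slow", "terrible"}

def keyword_sentiment(text: str) -> str:
    """Label text by counting cue words; ignores negation and sarcasm."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

labels = [keyword_sentiment(t) for t in [
    "Love the new release, great docs",
    "Search is slow and the UI is broken",
    "Version 2 ships on Monday",
]]
```

A baseline like this is mainly useful as a sanity check: if a more expensive classifier disagrees with it on obvious cases, that is worth investigating.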

How often should I re-pull for ongoing analysis?

Depends on your decision cadence. Daily snapshots fit most analytics dashboards; hourly catches campaign-moment events; weekly is fine for steady-state brand monitoring. Match the polling cadence to how often someone actually looks at the dashboard.

Can I store user data from analysis?

Public tweet data + profile data is generally fine for storage per X's developer terms. Personal data (private DMs, protected accounts) is not surfaced by the public read API. For commercial use, review X's developer agreement + your jurisdiction's privacy laws.

What's a reasonable engagement-threshold filter for analysis?

Depends on your topic's volume. For broad terms (machine learning), min_faves:50 filters to high-signal content. For niche terms, min_faves:5 or no filter catches more. Tune against your sample data.
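
One way to tune the threshold against sample data is to pull once without a filter and count how many tweets would survive each candidate cutoff. The helper name and the sample like counts below are illustrative.

```python
def survivors_by_threshold(like_counts, thresholds=(0, 5, 10, 50, 100)):
    """For each candidate min-likes threshold, count tweets that would pass."""
    return {t: sum(1 for c in like_counts if c >= t) for t in thresholds}

# Illustrative like counts from an unfiltered sample pull.
sample_likes = [0, 2, 3, 7, 12, 48, 55, 130, 900]
table = survivors_by_threshold(sample_likes)
```

If a cutoff leaves only a handful of tweets, the topic is probably too niche for that threshold; if nearly everything passes, the filter is not adding signal.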

How do I extend this to multi-language analysis?

Drop the lang:en operator to get all languages, then group results by lang field client-side. Some sentiment classifiers handle multi-language; if not, route per-language to specialized models.
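
The client-side grouping step can be sketched as below. It assumes each tweet is a dict with an optional `lang` key, and it uses "und" as the fallback label for missing values; both are assumptions to check against the actual response shape.

```python
from collections import defaultdict

def group_by_lang(tweets):
    """Bucket tweets by their lang field, with 'und' for missing values."""
    groups = defaultdict(list)
    for t in tweets:
        groups[t.get("lang") or "und"].append(t)
    return dict(groups)

# Illustrative tweets, not real API output.
tweets = [
    {"text": "hello", "lang": "en"},
    {"text": "hola", "lang": "es"},
    {"text": "hi there", "lang": "en"},
    {"text": "???"},
]
groups = group_by_lang(tweets)
counts = {lang: len(items) for lang, items in groups.items()}
```

Each bucket can then be passed to whichever sentiment model handles that language, with the "und" bucket either skipped or sent to a multilingual fallback.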
