Twitter (X) Hashtag Analytics — API Guide for Devs
Hashtag analytics on Twitter (X), such as top contributors, engagement distribution, time-of-day activity, and sentiment input, comes from one underlying primitive: search tweets by #hashtag and aggregate the returned tweets client-side. Most dashboard tools wrap this and charge per seat for the UI; the dev path is direct API access plus your own aggregation.
This guide walks through the two practical API paths in 2026 (twitterapi.io advanced_search and X official recent search), gives runnable Python with hashtag-specific aggregations, and derives per-tweet cost from each provider's published pricing page so you can model your own bill before committing.
What 'hashtag analytics' means in practice
The underlying primitive is: given #hashtag, return matching tweets with their public_metrics. Everything an analytics dashboard shows is client-side aggregation over that set.
Common aggregations you'll build:
- Top contributors — group tweets by author, sum engagement, rank by total reach
- Engagement distribution — histogram of like_count, retweet_count across the matched set
- Activity-over-time — bucket tweets by hour or day, count or sum engagement per bucket
- Sentiment seeds — feed tweet text to your sentiment classifier (an LLM or a fine-tuned model) and roll up the score per author / per time bucket
- Velocity — rate of new tweets matching the hashtag per minute, useful for trending-event detection
Path 1 — twitterapi.io `/twitter/tweet/advanced_search`
twitterapi.io's advanced_search accepts the full X advanced-search expression as the query parameter, which includes #hashtag plus optional engagement / language / date filters.
Pricing per twitterapi.io/pricing: $0.00015 per returned tweet. No monthly minimum, no free tier. Auth is X-API-Key header.
Pick this when you need to query hashtags at scale (running dashboards, doing batch analytics, building research datasets). The per-tweet cost compounds favorably for read-heavy workloads.
import os
import requests
from collections import defaultdict

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def hashtag_analytics(tag: str, min_faves: int = 0, max_pages: int = 20):
    """Collect tweets matching #tag and rank authors by total engagement."""
    rows, cursor = [], None
    for _ in range(max_pages):
        params = {"query": f"#{tag} min_faves:{min_faves}"}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        rows.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor:
            break

    # Top contributors by total engagement.
    # Field names follow this guide; confirm them against a live response.
    by_author = defaultdict(int)
    for t in rows:
        a = t.get("author", {}).get("userName", "unknown")
        pm = t.get("public_metrics", {})
        by_author[a] += pm.get("like_count", 0) + pm.get("retweet_count", 0)
    top = sorted(by_author.items(), key=lambda x: -x[1])[:10]
    return {"total_tweets": len(rows), "top_contributors": top}

result = hashtag_analytics("AIagents", min_faves=50)
print(f"matched {result['total_tweets']} tweets")
for user, engagement in result["top_contributors"]:
    print(f"  {user}: {engagement} total engagement")
Path 2 — X official `/2/tweets/search/recent`
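X's official search endpoint also accepts #hashtag queries, but its v2 query language is not identical to the web advanced-search grammar: for example, it uses is:retweet and is:reply rather than filter: operators, and min_faves: does not appear in its documented operator list. Check the Build a query reference before porting a query. Auth is a bearer token from the X Developer Console. Pricing per docs.x.com/x-api/getting-started/pricing: $0.005 per post read.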
Pick this when you're already on the X bill for other workflows — the marginal hashtag-search cost rides on the same auth + bill.
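# pip install tweepy
import tweepy
from collections import defaultdict

client = tweepy.Client(bearer_token="YOUR_X_BEARER")

# min_faves: is a web advanced-search operator and is not listed among the
# v2 "Build a query" operators, so the like threshold is applied client-side.
# Note that every post read is still billed, including those filtered out here.
query = "#AIagents lang:en"
MIN_LIKES = 50

rows = []
for page in tweepy.Paginator(
    client.search_recent_tweets,
    query=query,
    max_results=100,
    tweet_fields=["created_at", "public_metrics", "author_id"],
    limit=10,
):
    rows.extend(page.data or [])

# Keep only tweets at or above the like threshold.
rows = [t for t in rows if t.public_metrics["like_count"] >= MIN_LIKES]

by_author = defaultdict(int)
for t in rows:
    pm = t.public_metrics
    by_author[t.author_id] += pm["like_count"] + pm["retweet_count"]

print(f"matched {len(rows)} tweets")
for uid, eng in sorted(by_author.items(), key=lambda x: -x[1])[:10]:
    print(f"  user_id={uid}: {eng} total engagement")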
Aggregations — what to compute client-side
Once you have the matched tweets, the analytics work happens in your code. Build these as reusable helpers:
- Top contributors — group by author.user_name, sum like_count + retweet_count, rank descending
- Hourly bucket — parse created_at to UTC, bucket by hour, count or sum engagement per bucket; useful for activity-curve visualizations
- Engagement distribution — compute percentiles (p50, p90, p99) on like_count to see the long-tail shape
- Sentiment input — concatenate or batch tweet text and feed to your sentiment classifier (OpenAI API, local model, or any text classifier); roll up the score by author or hour
- Co-occurrence — extract other hashtags from each tweet, count co-occurrences with your target hashtag to surface related topics
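Three of the helpers above (hourly bucket, engagement percentiles, co-occurrence) can be sketched as follows. This is a minimal illustration, not either provider's schema: the row keys `text`, `created_at`, and `like_count` are assumed names for rows you have already normalized, timestamps are assumed to be ISO 8601, and the percentile uses the simple nearest-rank method.

```python
import math
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical normalized rows; the keys are assumptions, not a provider schema.
tweets = [
    {"text": "Shipping #AIagents with #LLM tooling", "created_at": "2026-01-05T14:05:00Z", "like_count": 120},
    {"text": "#aiagents demo day #LLM #opensource", "created_at": "2026-01-05T14:40:00Z", "like_count": 8},
    {"text": "Why #AIAgents need evals", "created_at": "2026-01-05T15:10:00Z", "like_count": 3},
    {"text": "#AIagents roundup", "created_at": "2026-01-05T15:20:00Z", "like_count": 45},
]

def hourly_counts(rows):
    """Count tweets per UTC hour, keyed 'YYYY-MM-DDTHH'."""
    counts = Counter()
    for row in rows:
        # Replace the trailing Z so fromisoformat works on older Python versions.
        dt = datetime.fromisoformat(row["created_at"].replace("Z", "+00:00"))
        counts[dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")] += 1
    return dict(sorted(counts.items()))

def percentile(values, p):
    """Nearest-rank percentile (p between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def co_occurring_tags(rows, target):
    """Count hashtags appearing alongside the target, ignoring case."""
    counts = Counter()
    target = target.lower()
    for row in rows:
        tags = {tag.lower() for tag in re.findall(r"#(\w+)", row["text"])}
        if target in tags:
            counts.update(tags - {target})
    return counts.most_common()

likes = [row["like_count"] for row in tweets]
hours = hourly_counts(tweets)
p50, p90 = percentile(likes, 50), percentile(likes, 90)
related = co_occurring_tags(tweets, "AIagents")
print(hours)
print(p50, p90)
print(related)
```

Lowercasing tags before counting matters here because the search matches hashtags case-insensitively, so the returned text can carry any casing of the same tag.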
Side-by-side comparison — 2 API paths, 5 dimensions
Same job (collect tweets matching #hashtag, run client-side aggregations) framed across the two paths. Costs are derived from each provider's published pricing page.
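Three honest patterns: (a) both paths accept #hashtag queries, though the operator grammars are not fully identical (X's v2 query language uses is:retweet where the advanced-search grammar uses filter:retweets), so test each query on the path you pick; (b) the cost ratio per returned tweet is about 33.3× ($0.005 / $0.00015), derivable from each provider's pricing page, and at scale this dominates the bill; (c) X official has 24h UTC dedup (the same tweet read again in the same UTC day is one charge), which helps re-polling workloads, while twitterapi.io has no equivalent, so you cache results yourself.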
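The ratio in (b) and the monthly totals it implies can be checked with a few lines of arithmetic. The sketch below uses only the two per-tweet prices quoted in this guide; the workload numbers (100 tweets per poll, hourly polling, 30 days) are illustrative assumptions, and the X figure ignores the 24h dedup in (c), so treat it as an upper bound.

```python
# Prices as quoted in this guide; verify against the live pricing pages.
TWITTERAPI_IO_PER_TWEET = 0.00015  # USD per returned tweet
X_OFFICIAL_PER_TWEET = 0.005       # USD per post read

def monthly_cost(tweets_per_poll, polls_per_day, price_per_tweet, days=30):
    """Monthly spend in USD for one tracked hashtag, with no dedup applied."""
    return tweets_per_poll * polls_per_day * days * price_per_tweet

io_cost = monthly_cost(100, 24, TWITTERAPI_IO_PER_TWEET)
x_cost = monthly_cost(100, 24, X_OFFICIAL_PER_TWEET)
ratio = X_OFFICIAL_PER_TWEET / TWITTERAPI_IO_PER_TWEET

print(f"twitterapi.io: ${io_cost:.2f}/mo")
print(f"X official:    ${x_cost:.2f}/mo")
print(f"ratio: {ratio:.2f}x")
```

Swap in your own tweets-per-poll and cadence to model a different workload; the ratio stays fixed because it depends only on the two prices.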
Practical patterns — paginating, deduplicating, rate-limiting
Pagination: results larger than one page require following the cursor / next_token returned by the API. Loop until cursor is null or you've hit your budget cap.
Deduplication: tweet IDs may repeat across re-runs (the same tweet still matches the hashtag). Track IDs you've already saved and skip duplicates downstream.
Rate limits: both providers publish rate limits. Wrap each call with retry-on-429 + jittered backoff. Treat 5xx the same way.
Polling cadence: hashtags evolve over hours. For trending-event detection poll every 1-5 minutes; for routine analytics poll every 15-60 minutes; for batch backfill, run once with deep pagination then stop.
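The deduplication and retry patterns above can be sketched together. This is a minimal illustration under stated assumptions: `dedupe_by_id` and `call_with_backoff` are hypothetical helper names, each row is assumed to carry an `id` key, and `send` is any zero-argument callable that returns an object with a `status_code` attribute (for example a lambda wrapping `requests.get`). The demo uses a fake response so it runs without network access.

```python
import random
import time

def dedupe_by_id(rows, seen_ids):
    """Return only rows whose id is new, recording each id in seen_ids."""
    fresh = []
    for row in rows:
        tweet_id = str(row["id"])
        if tweet_id not in seen_ids:
            seen_ids.add(tweet_id)
            fresh.append(row)
    return fresh

def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until the status is not 429/5xx or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = send()
        retryable = response.status_code == 429 or response.status_code >= 500
        if not retryable or attempt == max_attempts - 1:
            break
        # Full jitter: wait a random time up to the capped exponential delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, delay))
    return response

class FakeResponse:
    """Stand-in for an HTTP response, used only for this demo."""
    def __init__(self, status_code):
        self.status_code = status_code

statuses = iter([429, 503, 200])
delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(lambda: FakeResponse(next(statuses)), sleep=delays.append)

seen = set()
first = dedupe_by_id([{"id": "1"}, {"id": "2"}], seen)
second = dedupe_by_id([{"id": "2"}, {"id": "3"}], seen)
print(result.status_code, len(delays), len(first), second)
```

Persisting `seen` between runs (a file or a database table) is what makes the dedup useful across re-polls; an in-memory set only covers a single process.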
Picking a path — the decision rule
Building hashtag analytics into a product (dashboard, alert, monitoring) at any meaningful volume? → twitterapi.io. The per-tweet price of $0.00015 compounds favorably against X official's $0.005 per tweet at any volume above pure prototype.
Already paying X for credits because you write or read other surfaces? → use X official; the marginal hashtag-search cost rides on the same auth, and the 24h UTC dedup helps re-polling.
Prototyping or one-off research? → either works. The bill at small volume is pennies; pick by whichever auth is easiest for you to set up.
A common split is twitterapi.io for the bulk hashtag-search layer plus X official for any compound read+write workflow that needs unified auth.
# Practical example: take a snapshot of a hashtag. Run it hourly and diff
# consecutive snapshots to get delta-per-hour as the trending signal.
import os
import requests
from datetime import datetime, timezone
from collections import defaultdict

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def hashtag_snapshot(tag: str) -> dict:
    """Single snapshot: tweet count and top contributors at this moment."""
    rows, cursor = [], None
    for _ in range(10):  # cap pages
        params = {"query": f"#{tag}"}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        rows.extend(resp.get("tweets", []))
        cursor = resp.get("next_cursor")
        if not cursor:
            break

    # Sum likes and retweets per author.
    by_author = defaultdict(int)
    for t in rows:
        a = t.get("author", {}).get("userName", "unknown")
        pm = t.get("public_metrics", {})
        by_author[a] += pm.get("like_count", 0) + pm.get("retweet_count", 0)
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "total_tweets": len(rows),
        "top_contributors": sorted(by_author.items(), key=lambda x: -x[1])[:10],
    }

snap = hashtag_snapshot("AIagents")
print(f"{snap['captured_at']}: {snap['total_tweets']} tweets")
for user, eng in snap["top_contributors"]:
    print(f"  {user}: {eng}")
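The snapshot code above captures one point in time; the delta-per-hour signal comes from comparing consecutive snapshots. A minimal sketch, assuming each stored snapshot keeps the `captured_at` and `total_tweets` keys returned by `hashtag_snapshot` (the function name `deltas_per_hour` and the sample values are hypothetical). The delta is only meaningful when snapshots cover comparable windows and are not truncated by the page cap.

```python
from datetime import datetime

def deltas_per_hour(snapshots):
    """Change in total_tweets between consecutive snapshots, scaled to one hour."""
    out = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        t0 = datetime.fromisoformat(prev["captured_at"])
        t1 = datetime.fromisoformat(curr["captured_at"])
        hours = (t1 - t0).total_seconds() / 3600
        if hours <= 0:
            continue  # skip out-of-order or duplicate snapshots
        change = curr["total_tweets"] - prev["total_tweets"]
        out.append({"captured_at": curr["captured_at"], "delta_per_hour": change / hours})
    return out

# Illustrative stored snapshots, oldest first.
history = [
    {"captured_at": "2026-01-05T14:00:00+00:00", "total_tweets": 120},
    {"captured_at": "2026-01-05T15:00:00+00:00", "total_tweets": 150},
    {"captured_at": "2026-01-05T16:30:00+00:00", "total_tweets": 240},
]
signal = deltas_per_hour(history)
for point in signal:
    print(point["captured_at"], point["delta_per_hour"])
```

Scaling by elapsed hours keeps the signal comparable when a scheduled run is late or skipped.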
# Cost framing (math from cited pricing):
# 100 tweets per snapshot × $0.00015 = $0.015 per snapshot
# Hourly for a month: 24 × 30 × $0.015 = $10.80/mo (per hashtag tracked)
# Same workload via X official: 24 × 30 × 100 × $0.005 = $360/mo (per hashtag)
# Multiply by N tracked hashtags. Verify against live pricing before committing.
Questions readers ask
Does the hashtag search return retweets, replies, and quotes too?
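By default, yes: any tweet containing #hashtag matches. To narrow, use operators. In the advanced-search grammar that twitterapi.io accepts: #hashtag -filter:replies (no replies), #hashtag -filter:retweets (no retweets), #hashtag filter:quotes (quotes only). In X official's v2 query language the equivalents are -is:reply, -is:retweet, and is:quote. See the operator reference for the full filter list.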
How do I track multiple hashtags at once?
Boolean OR in the query: #AIagents OR #LLM OR #ai. Each unique tweet returned is one billable unit. Group by hashtag client-side after the fetch; a tweet matching multiple hashtags is counted once but you can tally it under each matched tag.
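Building the OR query and tallying each returned tweet under every tag it matches might look like the sketch below. The helper names and the `id` and `text` row keys are assumptions for illustration; hashtags are pulled from the tweet text with a simple regex and lowercased before matching.

```python
import re
from collections import defaultdict

TRACKED = {"aiagents", "llm", "ai"}  # lowercase, without the leading '#'

def build_query(tags):
    """Join tags into a single OR query, e.g. '#ai OR #aiagents OR #llm'."""
    return " OR ".join(f"#{tag}" for tag in sorted(tags))

def tally_by_tag(rows, tracked=TRACKED):
    """Map each tracked tag to the ids of tweets that contain it."""
    by_tag = defaultdict(list)
    for row in rows:
        tags = {tag.lower() for tag in re.findall(r"#(\w+)", row["text"])}
        for tag in sorted(tags & tracked):
            by_tag[tag].append(row["id"])
    return dict(by_tag)

rows = [
    {"id": "1", "text": "#AIagents and #LLM in production"},
    {"id": "2", "text": "#ai news roundup"},
    {"id": "3", "text": "#LLM evals"},
]
query = build_query(TRACKED)
tally = tally_by_tag(rows)
print(query)
print(tally)
```

Tweet 1 is billed once but appears under both `aiagents` and `llm`, which matches the counting rule described above.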
What's the right cadence for trending-event detection?
5-minute polling catches most trending-event signals. Faster than that (1 min) mostly returns the same tweets. For routine monitoring, 15-60 min is sufficient. The X surface itself doesn't update finer than every few minutes for trends, so polling faster buys nothing.
Can I get a sentiment score directly from the API?
No — both providers return raw tweet content. Sentiment classification is your downstream task. Common approach: batch tweet text to an LLM (OpenAI, Anthropic, etc.) with a classifier prompt, or run a local fine-tuned model. Costs and accuracy depend on the classifier you pick.
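The roll-up step is independent of which classifier you choose. A minimal sketch, assuming rows with hypothetical `author` and `text` keys and a classifier passed in as any callable that returns a float score; the keyword-based `toy_classifier` is a placeholder so the example runs, not a real sentiment model.

```python
from collections import defaultdict
from statistics import mean

def rollup_sentiment(rows, classify):
    """Average classify(text) per author."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["author"]].append(classify(row["text"]))
    return {author: mean(values) for author, values in scores.items()}

def toy_classifier(text):
    """Placeholder scorer: +1 for 'love', -1 for 'hate', otherwise 0."""
    lowered = text.lower()
    return (1.0 if "love" in lowered else 0.0) - (1.0 if "hate" in lowered else 0.0)

rows = [
    {"author": "alice", "text": "I love #AIagents"},
    {"author": "alice", "text": "#AIagents release notes"},
    {"author": "bob", "text": "I hate flaky #AIagents demos"},
]
summary = rollup_sentiment(rows, toy_classifier)
print(summary)
```

Replacing `toy_classifier` with a call to your LLM or local model leaves the roll-up unchanged; grouping by hour instead of author only changes the dictionary key.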
Do I need different code for case-insensitive hashtags?
Hashtags are case-insensitive on X — #AIagents and #aiagents match the same tag. The search API handles this transparently; you don't normalize case in code.
What about archived hashtags — can I backfill historical data?
Both twitterapi.io's advanced_search and X's search_recent_tweets (which covers roughly the last 7 days) have published time-coverage limits; check each endpoint's docs to confirm your date range is covered. For full historical archives, X's search_all_tweets (where available on your access tier) covers deeper history at the same per-call rate.
Continue
- twitterapi.io — pricing
- X API — pricing (docs.x.com, 2026 verified)
- X official — Build a query (operators)
- Tweepy documentation
- Twitter (X) API — cluster hub
- Twitter (X) Advanced Search API guide
- Twitter (X) hashtag tracker tools — dev roundup
- Twitter (X) analytics tools — dev roundup