Twitter (X) History API — How to Export Tweet Timeline
Exporting an account's tweet history programmatically is a foundational research task: academic discourse analysis, journalist evidence archives, brand-content backup, AI training datasets. The workflow is straightforward: paginate over the account's archive with a from:handle query and persist each page to disk as it arrives.
This guide walks through the workflow in Python, with cost framing taken from each provider's published pricing page. Pricing references are cited by URL so you can re-derive the numbers.
What 'history export' actually involves
Three sub-tasks the workflow covers:
1. Pagination: the account's full archive may be thousands of tweets; iterate over cursor pages until exhausted.
2. Persistence: write each fetched batch to disk immediately (JSONL is the canonical format); a crashed run shouldn't lose progress.
3. Deduplication: on restart, load the IDs already in the persisted file and skip them when writing, so a resumed run never stores the same tweet twice.
These three together turn a one-off API call into a robust archive-export workflow.
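The three sub-tasks can be sketched offline before any real API is involved. The block below is a minimal illustration, not either provider's client: `fetch_page`, `FAKE_PAGES`, and `fake_fetch` are hypothetical stand-ins for an API that returns a batch of rows plus a next cursor, and the `id` field name is an assumption carried over from the examples later in this guide.

```python
import json
import os
import tempfile


def load_seen_ids(path):
    """Return the set of tweet IDs already persisted in a JSONL file."""
    seen = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    seen.add(json.loads(line)["id"])
    return seen


def export_pages(fetch_page, out_path):
    """Paginate with fetch_page(cursor) -> (rows, next_cursor); append unseen rows."""
    seen = load_seen_ids(out_path)           # deduplication state
    cursor = None
    while True:
        rows, cursor = fetch_page(cursor)     # pagination
        with open(out_path, "a", encoding="utf-8") as f:
            for row in rows:
                if row["id"] in seen:
                    continue
                seen.add(row["id"])
                f.write(json.dumps(row) + "\n")  # persistence, one batch at a time
        if not cursor:
            break
    return len(seen)


# Hypothetical in-memory "API": two pages that overlap on one ID.
FAKE_PAGES = {
    None: ([{"id": "1"}, {"id": "2"}], "p2"),
    "p2": ([{"id": "2"}, {"id": "3"}], None),
}


def fake_fetch(cursor):
    return FAKE_PAGES[cursor]


demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
first_run = export_pages(fake_fetch, demo_path)
second_run = export_pages(fake_fetch, demo_path)  # a rerun writes nothing new
```

Swapping `fake_fetch` for a function that calls a real endpoint leaves the loop unchanged, which is the point of keeping the three concerns separate.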
Path 1 — twitterapi.io `/twitter/tweet/advanced_search`
Pass query=from:handle (no date filter means the full archive) and follow next_cursor until no results remain. Per twitterapi.io/pricing: $0.00015 per returned tweet.
twitterapi.io's advanced_search reaches further back than X official's recent-search window, which makes it practical for full-archive exports. To target a specific time window, add since:YYYY-MM-DD until:YYYY-MM-DD to the query.
import json
import os
import time

import requests

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"

def export_history(handle: str, out_path: str = "history.jsonl"):
    """Restart-safe full-archive export for @handle."""
    # Load IDs already on disk so a rerun never writes a tweet twice.
    seen = set()
    if os.path.exists(out_path):
        with open(out_path) as f:
            for line in f:
                seen.add(json.loads(line)["id"])
        print(f"resume: {len(seen)} tweets already saved")
    cursor = None
    while True:
        params = {"query": f"from:{handle}"}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        resp = r.json()
        rows = resp.get("tweets", [])
        new = 0
        # Append this page immediately; a crash keeps everything written so far.
        with open(out_path, "a") as f:
            for t in rows:
                if t["id"] in seen:
                    continue
                seen.add(t["id"])
                f.write(json.dumps(t) + "\n")
                new += 1
        print(f"+{new} tweets (total: {len(seen)})")
        cursor = resp.get("next_cursor")
        if not cursor:
            break
        time.sleep(0.3)  # gentle pace
    return len(seen)

total = export_history("twitterapi_io")
print(f"archived: {total} tweets")
Path 2 — X official `/2/tweets/search/all`
X's official full-archive endpoint requires elevated access. The standard /2/tweets/search/recent is a ~7-day rolling window — not suitable for historical exports. search_all_tweets covers deeper history but is gated by access tier.
Pricing per docs.x.com/x-api/getting-started/pricing: $0.005 per post read.
# pip install tweepy (search_all_tweets requires elevated access)
import json

import tweepy

# wait_on_rate_limit=True makes tweepy sleep through rate-limit windows instead of raising.
client = tweepy.Client(bearer_token="YOUR_X_BEARER", wait_on_rate_limit=True)

def export_history_x(handle: str, out_path: str = "history_x.jsonl"):
    with open(out_path, "a") as f:
        for page in tweepy.Paginator(
            client.search_all_tweets,
            query=f"from:{handle}",
            max_results=500,
            tweet_fields=["created_at", "public_metrics"],
        ):
            for t in page.data or []:
                record = {
                    "id": t.id,
                    "text": t.text,
                    "created_at": str(t.created_at),
                    "public_metrics": t.public_metrics,
                }
                f.write(json.dumps(record) + "\n")

export_history_x("twitterapi_io")
Side-by-side comparison — 2 paths to history export
Same job (export full archive) across both providers. Costs derived from cited pricing.
Two practical observations: (a) cost ratio (~33×) compounds at archive scale — 10K tweets is $1.50 vs $50; (b) X official's full-archive access requires the right tier — twitterapi.io has no such gating.
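The cost arithmetic behind that comparison is easy to re-derive for other archive sizes. The sketch below uses only the two per-tweet rates quoted in this guide; the function and constant names are illustrative, and the rates should be re-checked against both pricing pages before budgeting.

```python
from decimal import Decimal

# Per-tweet rates as quoted in this guide (twitterapi.io/pricing and
# docs.x.com/x-api/getting-started/pricing). Treat them as inputs, not constants.
RATE_TWITTERAPI_IO = Decimal("0.00015")
RATE_X_OFFICIAL = Decimal("0.005")


def export_cost(n_tweets, rate):
    """Cost in dollars of reading n_tweets at a flat per-tweet rate."""
    return rate * n_tweets


def compare(n_tweets):
    """Return (twitterapi.io cost, X official cost, ratio) for one archive size."""
    cheap = export_cost(n_tweets, RATE_TWITTERAPI_IO)
    official = export_cost(n_tweets, RATE_X_OFFICIAL)
    return cheap, official, official / cheap


for n in (1_000, 10_000, 100_000):
    cheap, official, ratio = compare(n)
    print(f"{n:>7,} tweets: ${cheap:.2f} vs ${official:.2f} (~{ratio:.1f}x)")
```

Because both rates are flat per tweet, the ratio stays constant at every scale; only the absolute gap grows.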
Restart-safe + production patterns
Incremental writes: append to JSONL per batch — never wait until the end to flush. A crash mid-export should leave a usable partial archive.
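Dedup on resume: load seen IDs from the existing file at startup and skip tweets that are already on disk. This prevents duplicate rows, but on its own it does not save API calls: a run that restarts from the first page still re-fetches the earlier pages.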
Rate handling: 429s happen on large exports. Wrap each call with retry-on-429 + jittered backoff.
Cursor persistence: if the run is multi-day, persist the latest cursor too, so a restart resumes where it left off instead of re-scanning (and paying again for) pages it already fetched.
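Two of the patterns above, retry on 429 and cursor persistence, can be sketched as small helpers. This is a minimal illustration under stated assumptions: `call_with_backoff`, `save_cursor`, and `load_cursor` are hypothetical names, the retry counts and delays are arbitrary defaults, and whether a saved cursor stays valid across long pauses is something to confirm with the provider.

```python
import json
import os
import random
import time


def call_with_backoff(do_request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call do_request() until it returns a non-429 response or retries run out.

    do_request must return an object with a .status_code attribute
    (a requests.Response qualifies).
    """
    for attempt in range(max_retries + 1):
        resp = do_request()
        if resp.status_code != 429:
            return resp
        if attempt == max_retries:
            break
        # Exponential backoff with full jitter: random wait in [0, base * 2^attempt].
        sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError(f"still rate-limited after {max_retries} retries")


def save_cursor(path, cursor):
    """Persist the latest cursor; write to a temp file, then rename over the target."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"cursor": cursor}, f)
    os.replace(tmp, path)  # avoids leaving a half-written checkpoint


def load_cursor(path):
    """Return the saved cursor, or None when no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f).get("cursor")
```

In an export loop, the request would be wrapped as `call_with_backoff(lambda: requests.get(url, headers=HEADERS, params=params, timeout=15))`, with `save_cursor` called after each page has been written to the JSONL file, so the checkpoint never runs ahead of the data on disk.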
Use cases
Academic discourse research: full-archive corpora for topic modeling, sentiment analysis, network analysis. At $0.00015 per tweet, an archive of up to roughly 66,000 tweets costs under $10, which covers most accounts.
Brand content archive — backup your own brand's tweets in case of platform change or moderation event.
AI training data — train language models or recommendation systems on real public X data. Respect X's developer terms regarding data use.
Journalism evidence — preserve evidence of public statements for ongoing coverage. Combine with media-CDN download for visual evidence.
Picking a path — the decision rule
Building a research dataset on a budget? → twitterapi.io. Per-tweet pricing makes multi-account archive builds economically viable.
Already on the X bill with elevated access? → X official search_all_tweets; the export runs under credentials and billing you already have.
One-off archive for a single account? → twitterapi.io. Faster onboarding, lower bill.
# Practical example: full-archive export with progress reporting + cost estimate.
import json
import os
import time

import requests

HEADERS = {"X-API-Key": os.environ["TWITTERAPI_IO_KEY"]}
BASE = "https://api.twitterapi.io"
COST_PER_TWEET = 0.00015  # from twitterapi.io/pricing

def export_with_progress(handle: str, out_path: str):
    seen = set()
    if os.path.exists(out_path):
        with open(out_path) as f:
            for line in f:
                seen.add(json.loads(line)["id"])
    cursor = None
    api_calls = 0
    billed = 0  # tweets returned in this run; pricing is per returned tweet
    while True:
        params = {"query": f"from:{handle}"}
        if cursor:
            params["cursor"] = cursor
        r = requests.get(
            f"{BASE}/twitter/tweet/advanced_search",
            headers=HEADERS, params=params, timeout=15,
        )
        r.raise_for_status()
        api_calls += 1
        resp = r.json()
        rows = resp.get("tweets", [])
        billed += len(rows)  # duplicates skipped below were still returned
        new = 0
        with open(out_path, "a") as f:
            for t in rows:
                if t["id"] in seen:
                    continue
                seen.add(t["id"])
                f.write(json.dumps(t) + "\n")
                new += 1
        cost_so_far = billed * COST_PER_TWEET
        print(f"page {api_calls}: +{new} | total {len(seen)} | cost ~${cost_so_far:.4f}")
        cursor = resp.get("next_cursor")
        if not cursor:
            break
        time.sleep(0.3)
    return len(seen), billed

total, billed = export_with_progress("twitterapi_io", "archive.jsonl")
print(f"\nfinal: {total} tweets on disk, ~${billed * COST_PER_TWEET:.4f} spent this run")

# Cost framing (math from cited pricing pages):
# 10,000 tweets via twitterapi.io: 10,000 × $0.00015 = $1.50
# Same workload via X official: 10,000 × $0.005 = $50
# For research-grade archive builds, twitterapi.io is the clear cost-efficient path.
Questions readers ask
How far back can I export with twitterapi.io?
Coverage depth varies — check the advanced_search endpoint documentation for the current supported time window. For most accounts the depth covers multi-year histories. Verify against your specific date range needs before committing to a large export.
What if the account is deleted or suspended mid-export?
Deleted accounts return restricted data; suspended accounts return errors. Persist what you have, log the error, and treat that as a data event — useful information for research timelines.
Can I export retweets and quote-tweets too?
Yes. from: matches retweets and quotes the account posted. If you want originals only, either add -filter:retweets -filter:quotes to the query so they are never returned, or export everything and filter client-side.
How do I export only tweets within a specific time window?
Add since:YYYY-MM-DD until:YYYY-MM-DD to the query. Useful for research on a specific period (an election, an event window, a campaign).
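A small helper keeps these query strings consistent across runs. The sketch below only assembles the operators already named in these answers (from:, since:, until:, -filter:retweets, -filter:quotes); `build_query` and its parameters are hypothetical names, and it makes no claim about how either provider interprets the boundary dates.

```python
from datetime import date


def build_query(handle, since=None, until=None, originals_only=False):
    """Assemble a search query string for one account and an optional date window."""
    if since and until and since >= until:
        raise ValueError("since must be earlier than until")
    parts = [f"from:{handle.lstrip('@')}"]
    if since:
        parts.append(f"since:{since.isoformat()}")   # YYYY-MM-DD
    if until:
        parts.append(f"until:{until.isoformat()}")   # YYYY-MM-DD
    if originals_only:
        parts += ["-filter:retweets", "-filter:quotes"]
    return " ".join(parts)


window_query = build_query("@example", date(2024, 1, 1), date(2024, 2, 1))
originals_query = build_query("example", originals_only=True)
print(window_query)
print(originals_query)
```

The resulting string goes into the `query` parameter of either export function in place of the bare `from:` query.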
Are there X terms-of-service concerns with mass exports?
API-backed exports of public data within X's documented developer terms are standard. Research and academic use have generally-accepted norms. Commercial use of bulk-exported data should be reviewed against your jurisdiction's privacy laws + X's developer agreement.
What if I need media files (images, videos) from the archive?
Tweet objects include media_keys references. Loop through them and fetch each URL from X's media CDN — the file fetch itself isn't metered by per-tweet read rate. See /blog/twitter-images-api-extraction-guide for the media-download pattern.
Continue
- twitterapi.io — pricing
- X API — pricing (docs.x.com, 2026 verified)
- X official — search_all_tweets reference
- Twitter (X) API — cluster hub
- How to download tweets via API — Python guide
- Twitter (X) Advanced Search API guide
- Twitter (X) images API — extract media URLs