Build a Web Scraper in 5 Minutes
Scrape any website with JavaScript rendering, pagination, and structured data extraction using BrowserFabric.
Most web scraping tools break on JavaScript-heavy sites. BrowserFabric gives you a full Chromium browser in the cloud that renders everything: SPAs, dynamic content, lazy-loaded images.
Prerequisites
```shell
pip install browserfabric
export BROWSERFABRIC_API_KEY=bf_your_key_here
```
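Before running the examples, it can help to confirm the key is actually exported rather than debugging an auth failure later. A minimal sketch; the `bf_` prefix is taken from the sample key above, so adjust the check if your keys look different:

```python
import os

def check_api_key(env=None):
    """Return True if a plausible BrowserFabric key is configured.

    Keys in this tutorial start with "bf_"; anything else (or a
    missing variable) fails the check.
    """
    env = os.environ if env is None else env
    key = env.get("BROWSERFABRIC_API_KEY", "")
    return key.startswith("bf_") and len(key) > len("bf_")

# A well-formed key passes; a missing one does not.
print(check_api_key({"BROWSERFABRIC_API_KEY": "bf_your_key_here"}))  # True
print(check_api_key({}))  # False
```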
Step 1: Scrape a single page
```python
import asyncio

import browserfabric

async def scrape_page():
    async with browserfabric.browser() as session:
        await session.navigate(
            "https://news.ycombinator.com",
            wait_until="networkidle"
        )

        # Extract data with JavaScript
        stories = await session.evaluate_js("""
            Array.from(document.querySelectorAll('.athing'))
                .slice(0, 10)
                .map(row => ({
                    title: row.querySelector('.titleline a')?.textContent,
                    url: row.querySelector('.titleline a')?.href,
                    rank: row.querySelector('.rank')?.textContent,
                }))
        """)

        for story in stories:
            print(f"{story['rank']} {story['title']}")

asyncio.run(scrape_page())
```

Step 2: Handle pagination
Use `click` and `wait_for` to navigate through pages:
```python
async def scrape_with_pagination():
    async with browserfabric.browser() as session:
        await session.navigate("https://news.ycombinator.com")

        all_titles = []
        for page in range(3):
            titles = await session.evaluate_js("""
                Array.from(document.querySelectorAll('.titleline a'))
                    .map(a => a.textContent)
            """)
            all_titles.extend(titles)
            print(f"Page {page+1}: {len(titles)} titles")

            # Click "More" and wait for the next page's content;
            # stop if the link is gone (last page) or the click fails
            try:
                await session.click("a.morelink")
                await session.wait_for(".athing")
            except Exception:
                break

        print(f"Total: {len(all_titles)} titles")

asyncio.run(scrape_with_pagination())
```

Step 3: Use batch operations
For maximum efficiency, use the batch endpoint to run multiple operations in a single HTTP call:
```shell
curl -X POST https://api.browserfabric.com/api/v1/services/browseruse/batch \
  -H "Authorization: Bearer bf_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "YOUR_SESSION_ID",
    "operations": [
      {"tool_name": "navigate", "arguments": {"url": "https://example.com", "wait_until": "networkidle"}},
      {"tool_name": "evaluate_js", "arguments": {"expression": "document.title"}},
      {"tool_name": "take_screenshot", "arguments": {"full_page": true}}
    ]
  }'
```

Tips
- Use `wait_until="networkidle"` for JavaScript-heavy pages
- Use `wait_for` before interacting with dynamically loaded elements
- Use `scroll` to trigger lazy-loaded content before scraping
- Save persistent contexts with `persist=True` to avoid re-authentication on sites that require login
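The batch endpoint from Step 3 can also be called from plain Python. A sketch using only the standard library: the URL and payload shape mirror the curl example above, while `bf_your_key` and `YOUR_SESSION_ID` are placeholders you would substitute with real values.

```python
import json
import urllib.request

API_URL = "https://api.browserfabric.com/api/v1/services/browseruse/batch"

def build_batch_request(api_key, session_id, operations):
    """Assemble one POST request that carries several tool invocations."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"session_id": session_id, "operations": operations})
    return urllib.request.Request(API_URL, body.encode("utf-8"), headers)

req = build_batch_request(
    "bf_your_key",
    "YOUR_SESSION_ID",
    [
        {"tool_name": "navigate",
         "arguments": {"url": "https://example.com", "wait_until": "networkidle"}},
        {"tool_name": "evaluate_js",
         "arguments": {"expression": "document.title"}},
    ],
)
# Send it with: urllib.request.urlopen(req)
print(req.get_header("Authorization"))  # Bearer bf_your_key
```

Bundling operations this way trades one round trip for several, which matters most when the client is far from the API.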
Check out the full API documentation for all 28 available browser tools.