
Extract Data from Dynamic Websites with Puppeteer

Learn how to scrape dynamic websites using Puppeteer, a powerful Node.js library.

Master techniques for handling JavaScript-rendered content, infinite scrolling, and automated browser interactions.

How to Extract Data from Dynamic Websites with Puppeteer


Scraping static websites is relatively straightforward, but what happens when you need data from dynamic websites that load their content through JavaScript? That’s where Puppeteer comes in. It is a Node.js library that controls Chrome or Chromium, so you can automate browser actions and extract data from JavaScript-rendered pages.

Understanding Dynamic Websites and Why Traditional Scraping Falls Short

Traditional scraping tools, such as plain HTTP requests parsed with Cheerio, often fail on modern websites. Why? Because they only see the initial HTML the server delivers and never run the page’s JavaScript, while these websites load their content dynamically afterwards. Think of single-page applications (SPAs), infinite scrolling feeds, or any content that appears only after clicking a button.
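One way to see this gap on a given page is to compare the HTML the server sent with the DOM after scripts have run. The following is a minimal sketch, not a definitive test: the function name `checkWhereContentAppears` and the `marker` parameter (a text snippet you expect to see on the page) are illustrative assumptions, and `page` is assumed to be a Puppeteer page you have already opened.

```javascript
// Sketch: find out whether a piece of text is present in the server's HTML
// or only shows up after the page's JavaScript has run.
// `marker` is a hypothetical text snippet you expect to see on the page.
async function checkWhereContentAppears(page, url, marker) {
  // page.goto resolves with the main document's HTTP response (or null).
  const response = await page.goto(url, { waitUntil: 'networkidle0' });
  const initialHtml = response ? await response.text() : '';

  // page.content() serializes the DOM as it exists now, after scripts ran.
  const renderedHtml = await page.content();

  return {
    inInitialHtml: initialHtml.includes(marker),
    inRenderedDom: renderedHtml.includes(marker),
  };
}
```

If `inRenderedDom` is true while `inInitialHtml` is false, the content is injected by JavaScript, and a plain HTTP request would not see it.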


Getting Started with Puppeteer

First, let’s set up our project. Create a new directory and initialize it with npm:

mkdir puppeteer-scraper
cd puppeteer-scraper
npm init -y
npm install puppeteer

Here’s a basic example that navigates to a website and takes a screenshot:

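const puppeteer = require('puppeteer');

async function scrapeWebsite() {
  // Launch a headless browser and open a new tab
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    // Save a screenshot of the rendered page to the working directory
    await page.screenshot({ path: 'screenshot.png' });
  } finally {
    // Close the browser even if navigation or the screenshot fails
    await browser.close();
  }
}

scrapeWebsite().catch(console.error);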
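A screenshot confirms that the page renders, but scraping usually means reading values out of the DOM. As a rough sketch, the hypothetical helper below uses `page.evaluate()` to run a callback inside the browser and return a small summary. The function name and the choice of fields (`title`, `heading`, `linkCount`) are assumptions made for illustration.

```javascript
// Sketch: read a few values from the rendered page instead of taking a screenshot.
// `page` is assumed to be a Puppeteer Page that has already navigated somewhere.
async function extractPageSummary(page) {
  // The callback runs inside the browser, so `document` refers to the page's DOM.
  // Only serializable values (strings, numbers, plain objects) can be returned.
  return page.evaluate(() => {
    const heading = document.querySelector('h1');
    return {
      title: document.title,
      heading: heading ? heading.textContent.trim() : null,
      linkCount: document.querySelectorAll('a').length,
    };
  });
}
```

You would call it after `page.goto(...)` in the example above, for instance `console.log(await extractPageSummary(page))`.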

Advanced Data Extraction Techniques

When dealing with dynamic websites, you’ll often need to wait for specific elements to load or interact with the page before extracting data. Here’s a real-world example of extracting data from an infinite scroll page:

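const puppeteer = require('puppeteer');

async function scrapeInfiniteScroll(targetCount = 100) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com/feed');
    let items = [];
    while (items.length < targetCount) {
      // Read every item currently in the DOM; tolerate missing child elements
      items = await page.evaluate(() =>
        Array.from(document.querySelectorAll('.item')).map(item => ({
          title: item.querySelector('.title')?.innerText ?? '',
          description: item.querySelector('.description')?.innerText ?? ''
        }))
      );
      if (items.length >= targetCount) break;
      // Scroll to the bottom and wait for the page to grow
      const previousHeight = await page.evaluate(() => document.body.scrollHeight);
      await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
      try {
        await page.waitForFunction(
          height => document.body.scrollHeight > height,
          { timeout: 10000 },
          previousHeight
        );
      } catch (error) {
        break; // The page stopped growing: assume we reached the end of the feed
      }
      // page.waitForTimeout() was removed in recent Puppeteer versions; use a plain timer
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
    return items;
  } finally {
    // Always close the browser, even if scraping fails midway
    await browser.close();
  }
}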
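The loop above covers scrolling. The other case mentioned earlier, waiting for a specific element to load, can be sketched with `page.waitForSelector()`. The helper name `extractWhenReady` and the ten-second default timeout are assumptions for illustration. Treating a timeout as "no results" is a design choice that may not suit every scraper.

```javascript
// Sketch: wait until at least one element matching `selector` exists,
// then return the trimmed text of every match.
// `page` is assumed to be a Puppeteer Page that has already navigated.
async function extractWhenReady(page, selector, timeoutMs = 10000) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (error) {
    // Puppeteer reports an expired wait with an error named 'TimeoutError'.
    // Here we assume that simply means the element never rendered.
    if (error.name === 'TimeoutError') return [];
    throw error;
  }
  // $$eval runs the callback in the browser against all matching elements.
  return page.$$eval(selector, (elements) =>
    elements.map((el) => el.textContent.trim())
  );
}
```

For example, `await extractWhenReady(page, '.item .title')` would return the titles from the feed once the first item appears, or an empty array if nothing shows up in time.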


Best Practices and Performance Tips

  1. Always close your browser instances to prevent memory leaks
  2. Use page.evaluate() strategically to run code in the browser context
  3. Implement proper error handling and retries
  4. Consider using a stealth plugin to avoid detection
  5. Cache results when possible to minimize repeated requests

Here’s an example implementing these practices:

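const puppeteer = require('puppeteer');

async function resilientScrape(url, maxRetries = 3) {
  let browser;
  try {
    browser = await puppeteer.launch({
      headless: true,
      // Disabling the sandbox is sometimes needed in containers; omit these flags if you can
      args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
    const page = await browser.newPage();
    await page.setViewport({ width: 1920, height: 1080 });
    await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36');
    let retries = 0;
    while (retries < maxRetries) {
      try {
        // Wait until the network has been idle, so late-loading content is present
        await page.goto(url, { waitUntil: 'networkidle0' });
        // Your scraping logic here
        break;
      } catch (error) {
        retries++;
        if (retries === maxRetries) throw error;
        // Back off a little longer after each failure.
        // page.waitForTimeout() was removed in recent Puppeteer versions; use a plain timer
        await new Promise(resolve => setTimeout(resolve, 1000 * retries));
      }
    }
  } catch (error) {
    console.error('Scraping failed:', error);
    throw error;
  } finally {
    // Close the browser on success and on failure to avoid leaking processes
    if (browser) await browser.close();
  }
}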
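The example above covers cleanup and retries but not the fifth practice, caching. One possible approach is a small in-memory cache wrapped around any scrape function. This is a sketch under stated assumptions: `createCachedScraper`, the five-minute default lifetime, and the injectable `now` clock are all hypothetical names and values, not part of Puppeteer.

```javascript
// Sketch: wrap an async scrape function with an in-memory, time-limited cache.
// `scrapeFn` is any function that takes a URL and resolves with scraped data.
// `now` is injectable so the expiry logic can be exercised without real waiting.
function createCachedScraper(scrapeFn, ttlMs = 5 * 60 * 1000, now = Date.now) {
  const cache = new Map(); // url -> { value, expiresAt }

  return async function cachedScrape(url) {
    const hit = cache.get(url);
    // Serve the stored result while it is still fresh.
    if (hit && hit.expiresAt > now()) return hit.value;

    // Otherwise scrape again and remember the result.
    const value = await scrapeFn(url);
    cache.set(url, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Wrapping a scraper this way, for instance `const cachedScrape = createCachedScraper(myScrape)` where `myScrape` is your own function, means repeated calls for the same URL within the lifetime launch no browser at all. The cache lives only as long as the process does. Persisting it to disk or a database is a separate decision.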

Conclusion

Puppeteer is a powerful tool for extracting data from dynamic websites. By waiting for content to render, retrying failed navigations, and always closing the browser, you can build robust scrapers for modern web applications.

