How to scrape dynamic pages with Python and Selenium

Jul 29, 2022

Getting the HTML source of a web page is easy with Python. With the requests library it takes a single line.
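As a minimal sketch, the one-liner is requests.get(url).text, wrapped here in a small helper. The helper name fetch_html, the timeout value and the status check are illustrative assumptions added for robustness.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Return the HTML of a page exactly as the server sends it (no scripts are run)."""
    # The "one line": download the page and read the body as text.
    response = requests.get(url, timeout=timeout)
    # Assumption: fail loudly on 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text
```

Calling fetch_html("https://example.com") (a placeholder URL) returns the page source as a string.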

This works well as long as the content of the page is static. Some dynamic pages, however, build their content with JavaScript, so the raw HTML contains little useful information until the scripts have run. The solution is to execute the scripts in a browser first and then scrape the result.

Tools

First you will need a WebDriver, which is essentially an API for controlling a browser. Each major browser has its own WebDriver; in this example I used Firefox’s (GeckoDriver). You can install the WebDriver manually, but that requires additional configuration and manual updates. It is easier to use Webdriver Manager, which installs the WebDriver and keeps it up to date.

You will also need Selenium to talk to the WebDriver. Selenium provides its own Python bindings, which are used in this example.

Solution

Install the Python libraries, selenium and webdriver-manager, for example with pip.

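With both libraries installed, Selenium can load the page in Firefox, let the scripts run, and return the resulting HTML through the driver’s page_source attribute.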

Parsing

The page source can then be parsed like that of a static web page. Python’s standard library includes its own parser, html.parser. Another popular Python parser is Beautiful Soup, for which I recommend this tutorial on pythonprogramming.net.
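As a small illustration of the standard-library route, the sketch below uses html.parser to collect all link targets from an HTML string. The class name LinkCollector, the helper extract_links and the sample markup are made up for this example; in practice the input would be the page source obtained from the browser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Return all link targets found in the given HTML, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup standing in for a scraped page source.
sample = '<html><body><a href="/a">A</a><p>text</p><a href="https://example.com/b">B</a></body></html>'
print(extract_links(sample))  # prints ['/a', 'https://example.com/b']
```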

Originally published at https://lucaf.eu on July 29, 2022.

Written by Luca Franceschini