Using Selenium & Chrome to automatically download Blob files


The Selenium WebDriver is a brilliant way to programmatically interact with websites. You can write little Python scripts which can click around inside browser windows and do "stuff".

I use it to download a file generated by a JavaScript Blob and automatically save it to disk. Here's how.

Set up the WebDriver

After you've installed Selenium and the Chrome WebDriver, this is the standard boilerplate to use it in Python:

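import os    # used later to rename the downloaded file
import time  # used later to wait for the download to finish

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By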

Set Up Chrome

You can pass whatever options and arguments you need. I use headless mode, which means Chrome doesn't display a window.

chrome_options = Options()
chrome_options.add_argument('--headless=new')
chrome_options.add_argument('--window-size=1920,1080')

Set where to save the files

These options force the blob to download automatically to a specific location.
Note: there is no way to set the default file name.

chrome_options.add_experimental_option("prefs", {
        "download.default_directory"  : "/tmp/",
        "download.prompt_for_download": False,
        "download.directory_upgrade"  : True,
})
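If you'd rather not hard-code "/tmp/", here is a minimal sketch that builds the same prefs with a fresh, empty download directory. The helper name make_download_prefs and the directory prefix are my own inventions for illustration, not part of Selenium. An empty directory makes it much easier to spot the downloaded file later.

```python
import os
import tempfile


def make_download_prefs(download_dir=None):
    """Build the Chrome "prefs" dict used above for silent downloads.

    Hypothetical helper: if no directory is given, a fresh temporary
    one is created so it starts out empty.
    """
    if download_dir is None:
        download_dir = tempfile.mkdtemp(prefix="selenium-downloads-")
    # Use an absolute path and make sure the directory exists.
    download_dir = os.path.abspath(download_dir)
    os.makedirs(download_dir, exist_ok=True)
    return {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }


prefs = make_download_prefs()
# Then pass it to Chrome exactly as before:
# chrome_options.add_experimental_option("prefs", prefs)
```

Keep hold of prefs["download.default_directory"] so you know where to look once the download has finished.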

Download the file

This opens the browser, loads the page, finds the download button, then clicks on it. Your XPath will be different to mine.

driver = webdriver.Chrome(options=chrome_options)
driver.get("https://example.com")

download_button = driver.find_element(By.XPATH, "//button[@data-integration-name='button-download-data-csv']")
download_button.click()

Rename from the default name

As mentioned, there's no way to set a default file name. But if you know what the file is going to be called, you can rename it after it has been downloaded.

time.sleep(2) # Wait until the file has been downloaded. Increase if it is a big file
os.rename("/tmp/example.csv", "/home/me/newfile.csv")

# Stop the driver
driver.quit()
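A fixed sleep is a guess. As an alternative, here is a rough sketch that polls the download directory until the file has finished. It assumes the directory holds nothing but this download, and it relies on Chrome giving in-progress downloads a .crdownload suffix. The names wait_for_download and move_download are hypothetical helpers of mine. I've used shutil.move rather than os.rename because os.rename can fail when the source and destination are on different filesystems, which is common for /tmp.

```python
import os
import shutil
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.25):
    """Return the path of the newest finished file in `directory`.

    Hypothetical helper: polls until at least one regular file exists
    and no Chrome partial downloads (*.crdownload) remain.
    Raises TimeoutError if that doesn't happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        in_progress = [n for n in names if n.endswith(".crdownload")]
        finished = [
            os.path.join(directory, n)
            for n in names
            if not n.endswith(".crdownload")
            and os.path.isfile(os.path.join(directory, n))
        ]
        if finished and not in_progress:
            # Pick the most recently modified file.
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"No completed download in {directory!r} after {timeout}s"
            )
        time.sleep(poll_interval)


def move_download(directory, destination, timeout=30.0):
    """Wait for the download, then move it to `destination`."""
    source = wait_for_download(directory, timeout=timeout)
    # shutil.move copes with moves across filesystems.
    return shutil.move(source, destination)
```

You'd call something like move_download("/tmp/downloads", "/home/me/newfile.csv") after clicking the button and before driver.quit(). As a bonus, you no longer need to know the file's default name in advance.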

And there you go. Stick that in a script and you can automatically download and rename files from Blob URLs.

