Downloading with chrome headless and selenium

Question

I'm using python-selenium and Chrome 59 and trying to automate a simple download sequence. When I launch the browser normally, the download works, but when I do so in headless mode, the download doesn't work.

# Headless implementation
from selenium import webdriver

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument("headless")

driver = webdriver.Chrome(chrome_options=chromeOptions)

driver.get('https://www.mockaroo.com/')
driver.find_element_by_id('download').click()
# ^^^ Download doesn't start

# Normal Mode
from selenium import webdriver

driver = webdriver.Chrome()

driver.get('https://www.mockaroo.com/')
driver.find_element_by_id('download').click()
# ^^^ Download works normally

I've even tried adding a default path:

prefs = {"download.default_directory" : "/Users/Chetan/Desktop/"}
chromeOptions.add_argument("headless")
chromeOptions.add_experimental_option("prefs",prefs)

Adding a default path works in the normal implementation, but the same problem persists in the headless version.

How do I get the download to start in headless mode?

I've also tried using submit and sending Keys.ENTER. It works for the normal browser, but not the headless one. – TheChetan, Commented Aug 12, 2017 at 11:45
Does this need to work in Chrome only, or would Firefox also do? – Prakash Palnati, Commented Aug 14, 2017 at 7:09
Why not just use urllib to download the file? Clicking on the file to simulate downloading only covers some of the use cases. I've used browsers that open a "save as" window before the download starts. If you are clicking to see whether the file exists on the server, or to verify its contents, urllib is probably your best bet. – TehTris, Commented Aug 16, 2017 at 20:56
@TehTris the problem is, I'm doing this on another site that requires me to have logged in earlier. That sets both headers and cookies, so I need to set both before using it. But using just JS, there seems to be no way to get the request headers from the client side... So I can't use urllib. – TheChetan, Commented Aug 17, 2017 at 1:55
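
If the download URL is known, one way around the login problem described in these comments is to reuse the browser session's cookies in a plain HTTP request. This is a minimal sketch, assuming the site only checks cookies (plus, optionally, the user agent) and that the cookie list has the shape returned by Selenium's `driver.get_cookies()`. `cookie_header` and `build_request` are hypothetical helper names, not part of any library.

```python
import urllib.request


def cookie_header(cookies):
    """Build a Cookie header value from a list of {'name': ..., 'value': ...} dicts."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def build_request(url, cookies, user_agent=None):
    """Create a urllib Request that carries the browser session's cookies."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header(cookies))
    if user_agent:
        # Optional: some sites also check that the user agent matches the browser.
        request.add_header("User-Agent", user_agent)
    return request
```

After logging in with Selenium, something like `urllib.request.urlopen(build_request(file_url, driver.get_cookies()))` would then fetch the file, where `file_url` is a placeholder for the real download link. Sites that also check other request headers would need those added as well.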

Michael Mintz · Accepted Answer · 2023-01-13 18:43:23Z

66

The Chromium developers recently added a 2nd headless mode (in 2021). See https://bugs.chromium.org/p/chromium/issues/detail?id=706008#c36

They later renamed the option in 2023 for Chrome 109 -> https://github.com/chromium/chromium/commit/e9c516118e2e1923757ecb13e6d9fff36775d1f4

For Chrome 109 and above, the --headless=new flag will now allow you to get the full functionality of Chrome in the new headless mode, and you can even run extensions in it. (For Chrome versions 96 through 108, use --headless=chrome)

Usage: (Chrome 109 and above):

options.add_argument("--headless=new")

Usage: (Chrome 96 through Chrome 108):

options.add_argument("--headless=chrome")

If something works in regular Chrome, it should now work with the newer headless mode too.
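
The version cutoffs above can be captured in a small helper. This is a minimal sketch, not part of the original answer: `headless_flag` is a hypothetical name, and falling back to the plain `--headless` flag (the old headless mode) below Chrome 96 is an assumption.

```python
def headless_flag(chrome_major_version: int) -> str:
    """Return the headless switch for a given Chrome major version.

    Cutoffs follow the answer above: 109 and later use --headless=new,
    96 through 108 use --headless=chrome. Falling back to plain
    --headless for older versions is an assumption.
    """
    if chrome_major_version >= 109:
        return "--headless=new"
    if chrome_major_version >= 96:
        return "--headless=chrome"
    return "--headless"
```

With Selenium, the result would be passed along as `options.add_argument(headless_flag(112))`.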

edited Jan 13, 2023 at 18:43

answered Sep 24, 2022 at 19:43


1

I just edited my existing solution because in 2023 for Chrome 109, they renamed the previous option from --headless=chrome to --headless=new.
– Michael Mintz
Commented Jan 13, 2023 at 18:46
2

Tested today, it worked.
– Rolandas Ulevicius
Commented Feb 1, 2023 at 11:11
2

This is freaking magical ;)
– Sasha Kolsky
Commented Feb 22, 2023 at 7:13
3

FWIW, this is now officially announced as shipping in Chrome 112: developer.chrome.com/articles/new-headless Please report any bugs you run into so we can fix them!
– Mathias Bynens
Commented Feb 23, 2023 at 9:27
2

Thank you so much you saved my day. Till 108 version, even --headless worked for me.
– Nandan A
Commented Feb 28, 2023 at 16:52


Shawn Button · Accepted Answer · 2020-05-01 13:43:06Z

63

Yes, it's a "feature", for security. As mentioned before here is the bug discussion: https://bugs.chromium.org/p/chromium/issues/detail?id=696481

Support was added in chrome version 62.0.3196.0 or above to enable downloading.

Here is a python implementation. I had to add the command to the chromedriver commands. I will try to submit a PR so it is included in the library in the future.

def enable_download_in_headless_chrome(self, driver, download_dir):
    # add missing support for chrome "send_command"  to selenium webdriver
    driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

    params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
    command_result = driver.execute("send_command", params)

For reference here is a little repo to demonstrate how to use this: https://github.com/shawnbutton/PythonHeadlessChrome

update 2020-05-01 There have been comments saying this is not working anymore. Given this patch is now over a year old it's quite possible they have changed the underlying library.
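
On newer setups the same DevTools command can usually be sent without patching `_commands`. The sketch below assumes a Selenium release whose Chrome driver exposes `execute_cdp_cmd`; check that your installed version has it. `allow_headless_downloads` is a hypothetical helper name, and whether `Page.setDownloadBehavior` is still honored depends on your Chrome build.

```python
def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to allow downloads into download_dir.

    Assumes `driver` is a Selenium Chrome driver that provides execute_cdp_cmd.
    """
    # Same command and parameters as the send_command workaround above.
    params = {"behavior": "allow", "downloadPath": download_dir}
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
```

It would be called once after creating the driver and before clicking the download link, for example `allow_headless_downloads(driver, "/path/to/download/dir")`.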

edited May 1, 2020 at 13:43

answered Nov 18, 2017 at 14:06


3

I tried this and it doesn't work for me :( When I try exactly like that, I get nothing, and when I just turn off the "headless" mode, I get the file, but then Chrome crashes. If I completely remove the code from this answer together with the headless mode, Chrome works like expected. I guess Chrome's API has changed?
– bitstream
Commented Jul 6, 2018 at 10:37
@bitstream It worked for me on Chromium 68.0.3440.75 & chromedriver 2.38, check my full example
– Fayçal
Commented Aug 7, 2018 at 11:11
@shawn-button How do I download videos? HTML5 videos seem to play in Chrome by default.
– Mostafa
Commented Dec 28, 2018 at 19:05
You said that support was added in Chrome version 62.0.3196.0 or above to enable downloading. I am currently working with Chrome 71, and it does not work there either. The same workaround still needs to be followed.
– Saradamani
Commented Jan 31, 2019 at 9:02
3

Is this still up to date? I tried the method from the GitHub repo, and it doesn't download the file. I verified that the same code downloads the file when not in headless mode. The printout shows: response from browser: result:value:None
– Henry
Commented Oct 4, 2019 at 23:45


Fayçal · Accepted Answer · 2018-08-07 11:22:38Z

28

Here's a working example for Python based on Shawn Button's answer. I've tested this with Chromium 68.0.3440.75 & chromedriver 2.38

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_experimental_option("prefs", {
  "download.default_directory": "/path/to/download/dir",
  "download.prompt_for_download": False,
})

chrome_options.add_argument("--headless")
driver = webdriver.Chrome(chrome_options=chrome_options)

driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': "/path/to/download/dir"}}
command_result = driver.execute("send_command", params)

driver.get('http://download-page.url/')
driver.find_element_by_css_selector("#download_link").click()

edited Aug 7, 2018 at 11:22

answered Aug 7, 2018 at 11:09


Also be careful that element target is not set to "_blank" otherwise switching tab and trying to download the file won't work
– romainm
Commented Nov 16, 2019 at 16:52
1

Thanks for posting. I also needed to add chromedriver_location = "/path/to/chromedriver", and then reference that in the driver definition, i.e. driver = webdriver.Chrome(chromedriver_location, options=chrome_options). Side note: the chrome_options param is being deprecated and has already been replaced by the options param, as shown in my little example here.
– gannagainz
Commented Feb 28, 2020 at 18:31


Some1Else · Accepted Answer · 2017-08-15 05:08:25Z

18

+50

This is a Chrome feature that prevents software from downloading files to your computer. There is a workaround, though. Read more about it here.

You need to enable downloads via DevTools, with something like this:

async function setDownload () {
  const client = await CDP({tab: 'ws://localhost:9222/devtools/browser'});
  const info =  await client.send('Browser.setDownloadBehavior', {behavior : "allow", downloadPath: "/tmp/"});
  await client.close();
}

This is the solution someone gave in the discussion mentioned above. Here is their comment.

edited Aug 15, 2017 at 5:08

answered Aug 14, 2017 at 6:56


5

This solution requires to patch Chrome, it's not a workaround. The command Browser.setDownloadBehavior is not present in Chrome v62.0.3186.0.
– Florent B.
Commented Aug 16, 2017 at 10:39
I jumped into the same issue a couple of months ago. Haven't found any solution until today, thanks to a dude commenting my question and pointing me here. Reading this answer makes me happy, but I truly have no clue on how to copy or adapt this code in my source.
– aPugLife
Commented Aug 16, 2017 at 14:58
@TheChetan thanks! interesting link, though i am developing it in java and a chromePrefs.put("Browser.setDownloadBehavior", "allow"); would help more, if only this string was a real one and working.. ):
– aPugLife
Commented Aug 18, 2017 at 13:41
7

How would you do it in Python with selenium?
– Martin Thoma
Commented Aug 21, 2017 at 7:53
@Nihvel Are you able to address this in Java ?Can you please post the solution
– Coded9
Commented Feb 2, 2018 at 13:06


gannagainz · Accepted Answer · 2021-03-05 00:02:25Z

UPDATED PYTHON SOLUTION - TESTED Mar 4, 2021 on chromedriver v88 and v89

This will allow you to click to download files in headless mode.

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.chrome.options import Options

    # Instantiate headless driver
    chrome_options = Options()

    # Windows path
    chromedriver_location = 'C:\\path\\to\\chromedriver_win32\\chromedriver.exe'
    # Mac path (uncomment to use). You may have to allow chromedriver in the macOS security preferences.
    # chromedriver_location = '/Users/path/to/chromedriver'

    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    
    chrome_prefs = {"download.default_directory": r"C:\path\to\Downloads"} # (windows)
    chrome_options.experimental_options["prefs"] = chrome_prefs

    driver = webdriver.Chrome(chromedriver_location,options=chrome_options)

    # Download your file
    driver.get('https://www.mockaroo.com/')
    driver.find_element_by_id('download').click()

This should be the accepted answer, only added the download path in chrome_prefs as experimental_options and did the trick, thanks! — Darklord5, Commented May 19, 2022 at 17:38

Hazem · Accepted Answer · 2017-08-18 07:08:44Z

4

The website may return different HTML to different browsers, which means the XPath or id you are looking for may differ in the headless browser. Try saving the page source from the headless browser and opening it as an HTML page to check the id or XPath you need. For a C# example, see "How to hide FirefoxDriver (using Selenium) without findElement function error in PhantomDriver?".
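
Saving the headless page for inspection could look like the following minimal sketch. It assumes only that `driver` has Selenium's `page_source` attribute; `dump_page_source` and the default file name are hypothetical.

```python
from pathlib import Path


def dump_page_source(driver, path="headless_page.html"):
    """Write the current page's HTML to disk so it can be opened and inspected."""
    target = Path(path)
    target.write_text(driver.page_source, encoding="utf-8")
    return target
```

Calling `dump_page_source(driver)` right after `driver.get(...)` in both normal and headless mode gives two files that can be compared for the element in question.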

answered Aug 18, 2017 at 7:08


After I get the page and I do driver.get_screenshot_as_file('foo.png'), I get an image of the actual thing and it looks ok. Also, the driver is able to find the button. Investigating this.
– Jugurtha Hadjar
Commented Aug 20, 2017 at 17:57


user7018603 · Accepted Answer · 2018-08-01 17:26:28Z

A full working example for JavaScript with selenium-cucumber-js / selenium-webdriver:

const chromedriver = require('chromedriver');
const selenium = require('selenium-webdriver');
const command = require('selenium-webdriver/lib/command');
const chrome = require('selenium-webdriver/chrome');

module.exports = function() {

  const chromeOptions = new chrome.Options()
    .addArguments('--no-sandbox', '--headless', '--start-maximized', '--ignore-certificate-errors')
    .setUserPreferences({
      'profile.default_content_settings.popups': 0, // disable download file dialog
      'download.default_directory': '/tmp/downloads', // default file download location
      "download.prompt_for_download": false,
      'download.directory_upgrade': true,
      'safebrowsing.enabled': false,
      'plugins.always_open_pdf_externally': true,
      'plugins.plugins_disabled': ["Chrome PDF Viewer"]
    })
    .windowSize({width: 1600, height: 1200});

  const driver = new selenium.Builder()
    .withCapabilities({
      browserName: 'chrome',
      javascriptEnabled: true,
      acceptSslCerts: true,
      path: chromedriver.path
    })
    .setChromeOptions(chromeOptions)
    .build();

  driver.manage().window().maximize();

  driver.getSession()
    .then(session => {
      const cmd = new command.Command("SEND_COMMAND")
        .setParameter("cmd", "Page.setDownloadBehavior")
        .setParameter("params", {'behavior': 'allow', 'downloadPath': '/tmp/downloads'});
      driver.getExecutor().defineCommand("SEND_COMMAND", "POST", `/session/${session.getId()}/chromium/send_command`);
      return driver.execute(cmd);
    });

  return driver;
};

The key part is:

  driver.getSession()
    .then(session => {
      const cmd = new command.Command("SEND_COMMAND")
        .setParameter("cmd", "Page.setDownloadBehavior")
        .setParameter("params", {'behavior': 'allow', 'downloadPath': '/tmp/downloads'});
      driver.getExecutor().defineCommand("SEND_COMMAND", "POST", `/session/${session.getId()}/chromium/send_command`);
      return driver.execute(cmd);
    });

Tested with:

Chrome 67.0.3396.99
Chromedriver 2.36.540469
selenium-cucumber-js 1.5.12
selenium-webdriver 3.0.0

Thanks for posting the javascript solution. It wasn't completely obvious how to execute the command. — Unnamed, Commented Oct 19, 2018 at 16:47

victorvartan · Accepted Answer · 2019-01-25 22:29:18Z

1

Usually it's redundant to see the same thing just written in another language, but because this issue drove me crazy, I hope I'm saving someone else from the pain... so here's the C# version of Shawn Button's answer (tested with headless chrome=71.0.3578.98, chromedriver=2.45.615279, platform=Linux 4.9.125-linuxkit x86_64):

            var enableDownloadCommandParameters = new Dictionary<string, object>
            {
                { "behavior", "allow" },
                { "downloadPath", downloadDirectoryPath }
            };
            var result = ((OpenQA.Selenium.Chrome.ChromeDriver)driver).ExecuteChromeCommandWithResult("Page.setDownloadBehavior", enableDownloadCommandParameters);

edited Jan 25, 2019 at 22:29

answered Jan 25, 2019 at 22:05


chromeOptions.AddArgument("--headless"); chromeOptions.AddArgument("--no-sandbox"); chromeOptions.AddArgument("--disable-dev-shm-usage"); var driver = new ChromeDriver(chromeOptions); driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(5000); var enableDownloadCommandParameters = new Dictionary<string, object> { { "behavior", "allow" }, { "downloadPath",@"C:/Users/<name>/Downloads/" } }; Still not working and downloading the pdf file
– Arijit Sarkar
Commented Apr 18, 2024 at 14:51
@ArijitSarkar I tested it on a Linux environment and with specific chrome and driver versions (very important to have compatible versions) and in your comment you specified "C:/Users/<name>/Downloads/" (which by the way is not even a valid Windows path, you need to actually replace <name> with an user name or better yet use another path altogether)...
– victorvartan
Commented Dec 11, 2024 at 13:55


Manasi Vora · Accepted Answer · 2018-09-18 10:53:03Z

The following is the equivalent in Java with Selenium, chromedriver and Chrome v71.x. The lines marked KEY in the comments are what allow downloads to be saved. Additional jars: com.fasterxml.jackson.core, com.fasterxml.jackson.annotation, com.fasterxml.jackson.databind

System.setProperty("webdriver.chrome.driver", "C:\\libraries\\chromedriver.exe");

String downloadFilepath = "C:\\Download";
HashMap<String, Object> chromePreferences = new HashMap<String, Object>();
chromePreferences.put("profile.default_content_settings.popups", 0);
chromePreferences.put("download.prompt_for_download", false);
chromePreferences.put("download.default_directory", downloadFilepath);
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.setBinary("C:\\pathto\\Chrome SxS\\Application\\chrome.exe");

chromeOptions.addArguments("start-maximized");
chromeOptions.addArguments("disable-infobars");

// KEY: run Chrome in headless mode
chromeOptions.addArguments("headless");

chromeOptions.setExperimentalOption("prefs", chromePreferences);
DesiredCapabilities cap = DesiredCapabilities.chrome();
cap.setCapability(CapabilityType.ACCEPT_SSL_CERTS, true);
cap.setCapability(ChromeOptions.CAPABILITY, chromeOptions);

// KEY (start): send Page.setDownloadBehavior to chromedriver's send_command endpoint
ChromeDriverService driverService = ChromeDriverService.createDefaultService();
ChromeDriver driver = new ChromeDriver(driverService, chromeOptions);

Map<String, Object> commandParams = new HashMap<>();
commandParams.put("cmd", "Page.setDownloadBehavior");
Map<String, String> params = new HashMap<>();
params.put("behavior", "allow");
params.put("downloadPath", downloadFilepath);
commandParams.put("params", params);
ObjectMapper objectMapper = new ObjectMapper();
HttpClient httpClient = HttpClientBuilder.create().build();
String u = driverService.getUrl().toString() + "/session/" + driver.getSessionId() + "/chromium/send_command";
HttpPost request = new HttpPost(u);
request.addHeader("content-type", "application/json");
try {
    // Serializing and posting both throw IOException subclasses, so they share one try block
    String command = objectMapper.writeValueAsString(commandParams);
    request.setEntity(new StringEntity(command));
    httpClient.execute(request);
} catch (IOException e2) {
    e2.printStackTrace();
}
// KEY (end)

// Continue using the driver for automation
driver.manage().window().maximize();

Matheus Araujo · Accepted Answer · 2019-07-10 15:35:51Z

0

I solved this problem by using the workaround shared by @Shawn Button and passing the full (absolute) path in the 'downloadPath' parameter. Using a relative path did not work and gave me an error.

Versions:
Chrome Version 75.0.3770.100 (Official Build) (32-bit)
ChromeDriver 75.0.3770.90
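
One way to make sure the path is always absolute before it is passed as 'downloadPath' is sketched below, using only the standard library. `resolve_download_dir` is a hypothetical helper name, and creating the directory up front is an extra precaution rather than something this answer requires.

```python
import os


def resolve_download_dir(path):
    """Return an absolute version of `path`, creating the directory if needed."""
    absolute = os.path.abspath(os.path.expanduser(path))
    os.makedirs(absolute, exist_ok=True)
    return absolute
```

The result can then be used wherever the workaround expects the download directory, e.g. `resolve_download_dir("downloads")` instead of the bare relative string.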

edited Jul 10, 2019 at 15:35

answered Jul 8, 2019 at 14:32


2

Please write your chrome and chrome driver versions. Major changes are being done in each releases and the workarounds can be useless.. For example bugs.chromium.org/p/chromium/issues/detail?id=696481#c198
– Ferhat S. R.
Commented Jul 10, 2019 at 6:42


Jorge Mendes · Accepted Answer · 2020-10-29 16:57:48Z

Using: google-chrome-stable amd64 86.0.4240.111-1, chromedriver 86.0.4240.22, selenium 3.141.0, python 3.8.3

Tried multiple proposed solutions, and nothing really worked for chrome headless, also my testing website opens a new blank tab and then the data is downloaded.

Finally gave up on headless and used pyvirtualdisplay with Xvfb to emulate an X server, something like:

from selenium.webdriver.chrome.options import Options
from pyvirtualdisplay import Display  # provides the virtual X display used below
import selenium.webdriver as webdriver
import tempfile

url = "https://really_badly_programmed_website.org"

tmp_dir = tempfile.mkdtemp(prefix="hamster_")

driver_path="/usr/bin/chromedriver"

chrome_options = Options() 
chrome_options.binary_location = "/usr/bin/google-chrome"

prefs = {'download.default_directory': tmp_dir,}
chrome_options.add_experimental_option("prefs", prefs)

with Display(backend="xvfb",size=(1920,1080),color_depth=24) as disp:

    driver = webdriver.Chrome(options=chrome_options, executable_path=driver_path)
    driver.get(url)

In the end everything worked, and the downloaded file was in the tmp folder.
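
To confirm from the script itself that the file has arrived in the temporary folder, the directory can be polled. This is a minimal sketch using only the standard library. `wait_for_download` is a hypothetical helper; it assumes the directory starts out empty (as with `tempfile.mkdtemp`) and relies on Chrome naming in-progress downloads with a `.crdownload` suffix.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Wait until `directory` holds at least one finished file.

    Files ending in .crdownload are treated as unfinished Chrome downloads.
    Returns the sorted list of finished file names, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        finished = sorted(n for n in names if not n.endswith(".crdownload"))
        in_progress = [n for n in names if n.endswith(".crdownload")]
        if finished and not in_progress:
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished download in {directory!r}")
        time.sleep(poll_interval)
```

In the example above it would be called as `wait_for_download(tmp_dir)` inside the `with Display(...)` block, after the click that triggers the download.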

Jason · Accepted Answer · 2021-04-18 05:50:20Z

I finally got it to work by upgrading to Chromium 90! I previously had versions 72-78, but I saw that it had been fixed recently: https://bugs.chromium.org/p/chromium/issues/detail?id=696481 so I decided to give it a shot.

So after upgrading, which took a while (Homebrew on macOS is so slow...), I simply did the following, without setting options or anything (this is a JavaScript example):

await driver.findElement(By.className('download')).click();

And it worked! I saw the downloaded PDF in the same working folder that I had been trying to download for a long time...
