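Recently, while scraping data with Selenium, I needed both a proxy and JavaScript rendering. Pages rendered with PhantomJS left some of the data unparseable, so I switched to Chrome. Every example I could find for configuring ChromeDriver with a password-protected proxy was written in Python. After several attempts yesterday I finally got a Java version working, so I am recording it here: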
1. Write background.js
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {
            scheme: "http",
            host: "your proxy IP or domain name",
            port: YOUR_PROXY_PORT // replace with your proxy port (an integer)
        },
        bypassList: ["list of domains that bypass the proxy, separated by commas"]
    }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
// Answer the proxy's authentication challenge with fixed credentials.
function callbackFn(details) {
    return {
        authCredentials: {
            username: "proxy username, e.g. user1",
            password: "proxy password, e.g. pwd1"
        }
    };
}
chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {urls: ["<all_urls>"]},
    ['blocking']
);
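If the proxy host, port, or credentials change between runs, editing background.js by hand gets tedious. The following is a minimal sketch of generating the script from Java instead. The class name `ProxyBackgroundScript`, the `render` and `jsString` helpers, the empty `bypassList`, and the sample values in `main` are all assumptions made for illustration, not part of the original setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper that renders background.js for one HTTP proxy endpoint. */
final class ProxyBackgroundScript {

    /** Wraps a value in double quotes, escaping backslashes and double quotes. */
    static String jsString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the script text; bypassList is left empty here (an assumption). */
    static String render(String host, int port, String username, String password) {
        return String.join("\n",
            "var config = {",
            "  mode: \"fixed_servers\",",
            "  rules: {",
            "    singleProxy: { scheme: \"http\", host: " + jsString(host) + ", port: " + port + " },",
            "    bypassList: []",
            "  }",
            "};",
            "chrome.proxy.settings.set({value: config, scope: \"regular\"}, function() {});",
            "chrome.webRequest.onAuthRequired.addListener(",
            "  function(details) {",
            "    return { authCredentials: { username: " + jsString(username)
                + ", password: " + jsString(password) + " } };",
            "  },",
            "  {urls: [\"<all_urls>\"]},",
            "  ['blocking']",
            ");",
            "");
    }

    public static void main(String[] args) throws IOException {
        // Sample values only; replace them with your own proxy settings.
        String js = render("proxy.example.com", 8080, "user1", "pwd1");
        Files.write(Paths.get("background.js"), js.getBytes(StandardCharsets.UTF_8));
    }
}
```

The escaping only covers backslashes and double quotes, which is enough for typical host names and passwords but not for values containing line breaks.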
2. Write manifest.json
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version": "22.0.0"
}
3. Compress background.js and manifest.json into proxy.zip. Both files must sit at the root of proxy.zip and must not be nested inside any directory:

proxy.zip
    background.js
    manifest.json
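The archive can also be built from Java, which avoids accidentally zipping the parent folder. This is a sketch using only `java.util.zip`; the class name `ProxyExtensionPacker` and its `pack` method are hypothetical, and the paths in `main` simply reuse the f:/tmp/proxy directory from the later steps.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper that packs extension files into a flat zip archive. */
final class ProxyExtensionPacker {

    /** Stores each file under its bare file name, so nothing ends up in a subdirectory. */
    static void pack(Path zipFile, Path... files) throws IOException {
        try (OutputStream os = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(os)) {
            for (Path file : files) {
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes both source files already exist in this directory.
        Path dir = Paths.get("f:/tmp/proxy");
        pack(dir.resolve("proxy.zip"),
             dir.resolve("background.js"),
             dir.resolve("manifest.json"));
    }
}
```

Using `getFileName()` for the entry name is what keeps the two files at the root of the archive regardless of where they live on disk.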
4. Add the proxy extension to ChromeOptions and set the path to chromedriver:
ChromeOptions co = new ChromeOptions();
co.addExtensions(new File("f:/tmp/proxy/proxy.zip")); // add the proxy extension to ChromeOptions
System.setProperty("webdriver.chrome.driver","drivers/chromedriver.exe"); // path to chromedriver.exe
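Because a nested archive is an easy mistake to make, it can help to check the zip before passing it to `addExtensions`. The sketch below is one possible pre-flight check using only the standard library; the class `ProxyExtensionCheck` and the method `missingRootEntries` are hypothetical names introduced for this example.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

/** Hypothetical pre-flight check for the proxy extension archive. */
final class ProxyExtensionCheck {

    /** Returns the required files that are NOT present at the root of the archive. */
    static List<String> missingRootEntries(File zip) throws IOException {
        List<String> missing = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip)) {
            for (String name : new String[] {"background.js", "manifest.json"}) {
                // getEntry matches the full entry name, so "proxy/background.js" does not count.
                if (zf.getEntry(name) == null) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        List<String> missing = missingRootEntries(new File("f:/tmp/proxy/proxy.zip"));
        if (missing.isEmpty()) {
            System.out.println("proxy.zip looks fine");
        } else {
            System.out.println("Missing at archive root: " + missing);
        }
    }
}
```

An empty list means both files are where the extension loader expects them; anything else is worth fixing before starting the browser.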
5. Using Baidu as an example, create a WebDriver and wait until the search box has finished loading:
RemoteWebDriver webdriver = new ChromeDriver(co);
webdriver.get("https://www.baidu.com/");
WebDriverWait wait = new WebDriverWait(webdriver, 10);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector("input#kw")));
The complete code:
import java.io.File;

import org.openqa.selenium.By;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class ProxyedChromeDriver {
    public static void main(String[] args) {
        ChromeOptions co = new ChromeOptions();
        co.addExtensions(new File("f:/tmp/proxy/proxy.zip")); // add the proxy extension to ChromeOptions
        System.setProperty("webdriver.chrome.driver","drivers/chromedriver.exe"); // path to chromedriver.exe
        RemoteWebDriver webdriver = new ChromeDriver(co);
        webdriver.get("https://www.baidu.com/");
        // Wait up to 10 seconds for the search box to become visible.
        WebDriverWait wait = new WebDriverWait(webdriver, 10);
        wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector("input#kw")));
        webdriver.quit();
    }
}