欧美日韩中文婷婷,亚洲色图手机在线视频

好久沒(méi)更新博客了，最近在做一個(gè)知乎的小爬蟲(chóng)，基于springboot+myabtis+webmagic
webmagic是一個(gè)簡(jiǎn)單靈活的Java爬蟲(chóng)框架?；赪ebMagic，支持多線程爬取，爬取邏輯明確、是一個(gè)易維護(hù)的爬蟲(chóng)。

官方給出的流程圖是像下面這樣的：

webmagic.png

Downloader 代表負(fù)責(zé)從互聯(lián)網(wǎng)上下載頁(yè)面，以便后續(xù)處理
PageProcessor相當(dāng)于將一個(gè)網(wǎng)頁(yè)與其他頁(yè)面相同的標(biāo)簽邏輯抽取出來(lái)，并將其解析加入一個(gè)Page對(duì)象當(dāng)中
Scheduler主要是負(fù)責(zé)管理待抓取的URL，以及一些去重的工作
Pipeline是將PageProcessor中抽取的邏輯放入爬取隊(duì)列，包括計(jì)算、持久化到文件、數(shù)據(jù)庫(kù)等。WebMagic默認(rèn)提供了“輸出到控制臺(tái)”和“保存到文件”兩種結(jié)果處理方案。

Pipeline定義了結(jié)果保存的方式，如果你要保存到指定數(shù)據(jù)庫(kù)，則需要編寫(xiě)對(duì)應(yīng)的Pipeline。對(duì)于一類(lèi)需求一般只需編寫(xiě)一個(gè)Pipeline。

項(xiàng)目結(jié)構(gòu)

![project.png](https://upload-images.jianshu.io/
/5309010-3636e4e84fbe07b6.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

定制DownLoader ：HttpClientDownloaderExtend

DownLoader實(shí)現(xiàn)思路
由于webmagic繼承于HttpDownLoader的下載器,所以我們需要先對(duì)HttpDownLoader的繼承關(guān)系先做一下簡(jiǎn)要說(shuō)明:

structrue.png

httpDownLoader的設(shè)計(jì)思路是為了方便多線程爬取,為了明確這個(gè)思路,讓我們看看在其所繼承的AbstractDownLoader當(dāng)中一個(gè)請(qǐng)求是如何在下載時(shí)進(jìn)行傳遞的呢?

工作流程:發(fā)出請(qǐng)求-下載-處理響應(yīng)
- 當(dāng)我們發(fā)出一個(gè)獲取頁(yè)面的請(qǐng)求時(shí),需要先建立一個(gè)httpClient對(duì)象，然后利用雙重鎖檢查httpclient實(shí)例是否被上一個(gè)線程獲取到鎖提前釋放，如果當(dāng)前httpclient獲取到鎖但未申請(qǐng)實(shí)例，則認(rèn)為已經(jīng)處于可關(guān)閉的狀態(tài),加入關(guān)閉連接集合
- 當(dāng)設(shè)置好當(dāng)前httpclient的連接屬性之后,我們通過(guò)handlResponse方法返回處理之后的Page對(duì)象,這個(gè)Page對(duì)象包含了頁(yè)面的全部信息，包括標(biāo)簽，帶爬取的頁(yè)面集合
- 在發(fā)出請(qǐng)求和返回請(qǐng)求的流程中，download方法即是將請(qǐng)求中需要發(fā)送的uri進(jìn)行限定,然后將請(qǐng)求交給handleResponse返回Page對(duì)象
需求分析:
我們需要設(shè)置例如像https://github.com/code4craft這樣的連接，但是如果我們需要匹配到具體的項(xiàng)目鏈接，修改正則顯得比較麻煩。假設(shè)需要獲取https://github.com/code4craft/webmagic這樣的鏈接，通過(guò)httpDownloader設(shè)置需要拓展的連接后綴/webmagic可以達(dá)到我們想要的效果

package com.complone.zhihumagic.downloader;

import com.google.common.collect.Sets;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.client.methods.RequestBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Request;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Task;
import us.codecraft.webmagic.downloader.AbstractDownloader;
import us.codecraft.webmagic.downloader.HttpClientGenerator;
import us.codecraft.webmagic.selector.PlainText;
import us.codecraft.webmagic.utils.HttpConstant;
import us.codecraft.webmagic.utils.UrlUtils;

import java.io.IOException;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
/** 
* @Description 拓展了 HttpClientDownloader
* 允許在下載的時(shí)候 對(duì)url  進(jìn)行再加工處理
* @author complone
*/
public class HttpClientDownloaderExtend extends AbstractDownloader {

    private Logger logger = LoggerFactory.getLogger(getClass());

    private final Map<String, CloseableHttpClient> httpClients = new HashMap<String, CloseableHttpClient>();

    private HttpClientGenerator httpClientGenerator = new HttpClientGenerator();

    private String urlExtend;

    //允許對(duì)從隊(duì)列里面獲取的url 進(jìn)行處理
    public HttpClientDownloaderExtend(String urlExtend){
        this.urlExtend = urlExtend;
    }

    private CloseableHttpClient getHttpClient(Site site) {
        if (site == null) {
            return httpClientGenerator.getClient(null);
        }
        String domain = site.getDomain();
        CloseableHttpClient httpClient = httpClients.get(domain);
        if (httpClient == null) {
            synchronized (this) {
                httpClient = httpClients.get(domain);
                if (httpClient == null) {
                    httpClient = httpClientGenerator.getClient(site);
                    httpClients.put(domain, httpClient);
                }
            }
        }
        return httpClient;
    }

    @Override
    public Page download(Request request, Task task) {
        Site site = null;
        if (task != null) {
            site = task.getSite();
        }
        Set<Integer> acceptStatCode;
        String charset = null;
        Map<String, String> headers = null;
        if (site != null) {
            acceptStatCode = site.getAcceptStatCode();
            charset = site.getCharset();
            headers = site.getHeaders();
        } else {
            acceptStatCode = Sets.newHashSet(200);
        }
        request.setUrl(request.getUrl()+urlExtend);
        logger.info("downloading page {}", request.getUrl());
        CloseableHttpResponse httpResponse = null;
        int statusCode=0;
        try {
            HttpUriRequest httpUriRequest = getHttpUriRequest(request, site, headers);
            httpResponse = getHttpClient(site).execute(httpUriRequest);
            statusCode = httpResponse.getStatusLine().getStatusCode();
            request.putExtra(Request.STATUS_CODE, statusCode);
            if (statusAccept(acceptStatCode, statusCode)) {
                Page page = handleResponse(request, charset, httpResponse, task);
                onSuccess(request);
                return page;
            } else {
                logger.warn("code error " + statusCode + "\t" + request.getUrl());
                return null;
            }
        } catch (IOException e) {
            logger.warn("download page " + request.getUrl() + " error", e);
            if (site.getCycleRetryTimes() > 0) {
                return addToCycleRetry(request, site);
            }
            onError(request);
            return null;
        } finally {
            request.putExtra(Request.STATUS_CODE, statusCode);
            try {
                if (httpResponse != null) {
                    //ensure the connection is released back to pool
                    EntityUtils.consume(httpResponse.getEntity());
                }
            } catch (IOException e) {
                logger.warn("close response fail", e);
            }
        }
    }

    @Override
    public void setThread(int thread) {
        httpClientGenerator.setPoolSize(thread);
    }

    protected boolean statusAccept(Set<Integer> acceptStatCode, int statusCode) {
        return acceptStatCode.contains(statusCode);
    }

    protected HttpUriRequest getHttpUriRequest(Request request, Site site, Map<String, String> headers) {
        RequestBuilder requestBuilder = selectRequestMethod(request).setUri(request.getUrl());
        if (headers != null) {
            for (Map.Entry<String, String> headerEntry : headers.entrySet()) {
                requestBuilder.addHeader(headerEntry.getKey(), headerEntry.getValue());
            }
        }
        RequestConfig.Builder requestConfigBuilder = RequestConfig.custom()
                .setConnectionRequestTimeout(site.getTimeOut())
                .setSocketTimeout(site.getTimeOut())
                .setConnectTimeout(site.getTimeOut())
                .setCookieSpec(CookieSpecs.BEST_MATCH);
        if (site.getHttpProxyPool().isEnable()) {
            HttpHost host = site.getHttpProxyFromPool();
            requestConfigBuilder.setProxy(host);
            request.putExtra(Request.PROXY, host);
        }
        requestBuilder.setConfig(requestConfigBuilder.build());
        return requestBuilder.build();
    }

    protected RequestBuilder selectRequestMethod(Request request) {
        String method = request.getMethod();
        if (method == null || method.equalsIgnoreCase(HttpConstant.Method.GET)) {
            //default get
            return RequestBuilder.get();
        } else if (method.equalsIgnoreCase(HttpConstant.Method.POST)) {
            RequestBuilder requestBuilder = RequestBuilder.post();
            NameValuePair[] nameValuePair = (NameValuePair[]) request.getExtra("nameValuePair");
            if (nameValuePair.length > 0) {
                requestBuilder.addParameters(nameValuePair);
            }
            return requestBuilder;
        } else if (method.equalsIgnoreCase(HttpConstant.Method.HEAD)) {
            return RequestBuilder.head();
        } else if (method.equalsIgnoreCase(HttpConstant.Method.PUT)) {
            return RequestBuilder.put();
        } else if (method.equalsIgnoreCase(HttpConstant.Method.DELETE)) {
            return RequestBuilder.delete();
        } else if (method.equalsIgnoreCase(HttpConstant.Method.TRACE)) {
            return RequestBuilder.trace();
        }
        throw new IllegalArgumentException("Illegal HTTP Method " + method);
    }

    protected Page handleResponse(Request request, String charset, HttpResponse httpResponse, Task task) throws IOException {
        String content = getContent(charset, httpResponse);
        Page page = new Page();
        page.setRawText(content);
        page.setUrl(new PlainText(request.getUrl()));
        page.setRequest(request);
        page.setStatusCode(httpResponse.getStatusLine().getStatusCode());
        return page;
    }

    protected String getContent(String charset, HttpResponse httpResponse) throws IOException {
        if (charset == null) {
            byte[] contentBytes = IOUtils.toByteArray(httpResponse.getEntity().getContent());
            String htmlCharset = getHtmlCharset(httpResponse, contentBytes);
            if (htmlCharset != null) {
                return new String(contentBytes, htmlCharset);
            } else {
                logger.warn("Charset autodetect failed, use {} as charset. Please specify charset in Site.setCharset()", Charset.defaultCharset());
                return new String(contentBytes);
            }
        } else {
            return IOUtils.toString(httpResponse.getEntity().getContent(), charset);
        }
    }

    protected String getHtmlCharset(HttpResponse httpResponse, byte[] contentBytes) throws IOException {
        String charset;
        // charset
        // 1、encoding in http header Content-Type
        String value = httpResponse.getEntity().getContentType().getValue();
        charset = UrlUtils.getCharset(value);
        if (StringUtils.isNotBlank(charset)) {
            logger.debug("Auto get charset: {}", charset);
            return charset;
        }
        // use default charset to decode first time
        Charset defaultCharset = Charset.defaultCharset();
        String content = new String(contentBytes, defaultCharset.name());
        // 2、charset in meta
        if (StringUtils.isNotEmpty(content)) {
            Document document = Jsoup.parse(content);
            Elements links = document.select("meta");
            for (Element link : links) {
                // 2.1、html4.01 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
                String metaContent = link.attr("content");
                String metaCharset = link.attr("charset");
                if (metaContent.indexOf("charset") != -1) {
                    metaContent = metaContent.substring(metaContent.indexOf("charset"), metaContent.length());
                    charset = metaContent.split("=")[1];
                    break;
                }
                // 2.2、html5 <meta charset="UTF-8" />
                else if (StringUtils.isNotEmpty(metaCharset)) {
                    charset = metaCharset;
                    break;
                }
            }
        }
        logger.debug("Auto get charset: {}", charset);
        // 3、todo use tools as cpdetector for content decode
        return charset;
    }
}

保存用戶信息實(shí)體類(lèi)GithubUserInfo
ps:這里用了注解模式,在webamgic當(dāng)中有一種存儲(chǔ)隊(duì)列叫做PageModel<T>有興趣的朋友可以查閱文檔配合使用


import us.codecraft.webmagic.model.annotation.ExtractBy;
import us.codecraft.webmagic.model.annotation.HelpUrl;
import us.codecraft.webmagic.model.annotation.TargetUrl;

import javax.persistence.Column;
import javax.persistence.Id;

/**
 * Created by complone on 2018/11/2.
 */
@TargetUrl("https://github.com/\\w+/\\w+")
@HelpUrl("https://github.com/\\w+")
public class GithubUserInfo {


    @Id
    @Column(name = "g_id")
    private Integer githubId;

    @Column(name = "nickname")
    @ExtractBy(value = "http://h1[@class='vcard-names']/span[2]/text()")
    private String nickname;

    @Column(name = "author")
    @ExtractBy(value = "http://h1[@class='vcard-names']/span[2]/text()")
    private String author;


    public void setAuthor(String author) {
        this.author = author;
    }

    public void setGithubId(Integer githubId) {
        this.githubId = githubId;
    }

    public void setNickname(String nickname) {
        this.nickname = nickname;
    }

    public Integer getGithubId() {
        return githubId;
    }

    public String getAuthor() {
        return author;
    }

    public String getNickname() {
        return nickname;
    }
}

定制PageProcessor：GithubProcessor

我們先嘗試爬取Github上作者的連接進(jìn)行測(cè)試

在工作流程中PageProcessor是負(fù)責(zé)抽取爬取邏輯并將其保存至一個(gè)持久化對(duì)象的組件
需求分析：爬取相關(guān)的頁(yè)面標(biāo)簽，結(jié)果集的作者字段為空則進(jìn)入下一個(gè)頁(yè)面進(jìn)行爬取,并將爬取成功的結(jié)果存入持久化對(duì)象

ps: 在webamgic的工作流程中所有的爬取組件都應(yīng)該作為一個(gè)bean對(duì)象為spring所管理,所以@Component需要在每個(gè)定制的組件上加入(至于為什么不能直接在processor中啟動(dòng)爬蟲(chóng)，后續(xù)再下面會(huì)講)

package com.complone.zhihumagic.processor;

import com.complone.zhihumagic.downloader.HttpClientDownloaderExtend;
import com.complone.zhihumagic.mapper.GithubUserInfoMapper;
import com.complone.zhihumagic.model.GithubUserInfo;
import com.complone.zhihumagic.pipeline.GithubUserPipeline;
import com.complone.zhihumagic.service.GithubUserService;
import org.apache.http.HttpHost;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Site;
import us.codecraft.webmagic.Spider;
import us.codecraft.webmagic.downloader.HttpClientDownloader;
import us.codecraft.webmagic.model.OOSpider;
import us.codecraft.webmagic.pipeline.Pipeline;
import us.codecraft.webmagic.processor.PageProcessor;

@Component
public class GitHubProcessor implements PageProcessor {

    @Autowired
    private GithubUserService githubUserService;


    private static final String start_url = "https://github.com/code4craft";


    // 部分一：抓取網(wǎng)站的相關(guān)配置，包括編碼、抓取間隔、重試次數(shù)等
    private Site site = Site.me().setRetryTimes(3).setSleepTime(1000);
//            .setHttpProxy(new HttpHost("45.32.50.126",4399));

    GithubUserInfo githubUserInfo = new GithubUserInfo();

    @Override
    // process是定制爬蟲(chóng)邏輯的核心接口，在這里編寫(xiě)抽取邏輯
    public void process(Page page) {
        // 部分二：定義如何抽取頁(yè)面信息，并保存下來(lái)
        page.putField("author", page.getHtml().xpath("http://h1[@class='vcard-names']/span[2]/text()").toString());
        page.putField("name", page.getHtml().xpath("http://h1[@class='vcard-names']/span[1]/text()").toString());
        page.putField("readme", page.getHtml().xpath("http://div[@id='readme']/tidyText()"));


        if (page.getResultItems().get("name") == null) {
            //skip this page
            page.setSkip(true);

        }
        githubUserInfo.setAuthor(page.getHtml().xpath("http://h1[@class='vcard-names']/span[2]/text()").toString());
        githubUserInfo.setNickname(page.getHtml().xpath("http://h1[@class='vcard-names']/span[1]/text()").toString());

        System.out.println(githubUserInfo.getNickname() + " ------------------ "+githubUserInfo.getAuthor());
        // 部分三：從頁(yè)面發(fā)現(xiàn)后續(xù)的url地址來(lái)抓取
        page.addTargetRequests(page.getHtml().links().regex("(https://github\\.com/[\\w-]+)").all());

        page.putField("githubUserInfo",githubUserInfo);
//        githubUserService.insertGithubUserInfo(githubUserInfo);

    }

    @Override
    public Site getSite() {
        return site;
    }

    public void start(PageProcessor pageProcessor, Pipeline pipeline) {
        Spider.create(pageProcessor).addUrl(start_url).addPipeline(pipeline).thread(5).run();
    }

    public static void main(String[] args) {

        Spider spider = Spider.create(new GitHubProcessor())
                .addUrl(start_url)
                .addPipeline(new GithubUserPipeline())
                //從"https://github.com/code4craft"開(kāi)始抓
                //開(kāi)啟5個(gè)線程抓取
                .thread(5);
        spider.run();
    }
}

定制Pipline： GithubUserPipline

Pipeline在工作流程中作為一個(gè)將之前持久化對(duì)象導(dǎo)出的組件，此時(shí)我們需要將其導(dǎo)出到mysql
需求分析：在process方法中將持久化對(duì)象檢查是否存在author為空的情況，避免https://github.com/topic這樣的github模塊存儲(chǔ)到數(shù)據(jù)庫(kù),同時(shí)在多個(gè)線程進(jìn)行爬取時(shí)，記錄當(dāng)前爬取的數(shù)據(jù)條數(shù)

package com.complone.zhihumagic.pipeline;

import com.complone.zhihumagic.mapper.GithubUserInfoMapper;
import com.complone.zhihumagic.model.GithubUserInfo;
import com.complone.zhihumagic.service.GithubUserService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import us.codecraft.webmagic.ResultItems;
import us.codecraft.webmagic.Task;
import us.codecraft.webmagic.pipeline.Pipeline;

import java.util.Map;

/**
 * Created by complone on 2018/11/2.
 */
@Component("githubUserPipeline")
public class GithubUserPipeline implements Pipeline{

    @Autowired
    private GithubUserService githubUserService;
    private volatile int count = 0;

    @Override
    public void process(ResultItems resultItems, Task task) {
        GithubUserInfo githubUserInfo = resultItems.get("githubUserInfo");

        for (Map.Entry<String, Object> entry : resultItems.getAll().entrySet()) {
            System.out.println(entry.getKey() + ":\t" + entry.getValue());
            if (entry.getKey() == "author"&& resultItems.get("author")==null){
                //防止github可能爬取到topic之類(lèi)的鏈接，防止數(shù)據(jù)行為空
                    continue;
            }
        }
        githubUserService.insertGithubUserInfo(githubUserInfo);
        count++;
        System.out.println("已經(jīng)插入第"+count+"條數(shù)據(jù)");
    }
}

啟動(dòng)爬蟲(chóng)控制器：Basecontroller
將GithubPageProcessor與GithubPipeline作為bean組件傳入爬蟲(chóng)，避免在bean未注冊(cè)完成時(shí)存入數(shù)據(jù)庫(kù)

package com.complone.zhihumagic.controller;

import com.alibaba.fastjson.JSON;
import com.complone.zhihumagic.mapper.GithubUserInfoMapper;
import com.complone.zhihumagic.mapper.UserDetailInfoMapper;
import com.complone.zhihumagic.model.GithubUserInfo;
import com.complone.zhihumagic.model.UserBaseInfo;
import com.complone.zhihumagic.model.UserDetailInfo;
import com.complone.zhihumagic.pipeline.GithubUserPipeline;
import com.complone.zhihumagic.processor.GitHubProcessor;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;
import tk.mybatis.mapper.entity.Example;

import java.util.List;

@Controller
public class BaseController {
    @Autowired
    private UserDetailInfoMapper userDetailInfoMapper;

    @Autowired
    private GithubUserInfoMapper githubUserInfoMapper;

    @RequestMapping(value = "/searchByName")
    public  @ResponseBody
    List<UserDetailInfo> searchByName(@RequestParam(value = "name",  required = true)String name){
        Example example1 = new Example(UserBaseInfo.class);
        example1.selectProperties("nickname","location","weiboUrl","headline","description");
        example1.createCriteria().andLike("nickname", name);
        List<UserDetailInfo> result = (List<UserDetailInfo>) userDetailInfoMapper.selectByExample(example1);
        System.out.println("查找昵稱(chēng)為"+name+"結(jié)果為 "+JSON.toJSONString(result));
        return result;
    }


    @RequestMapping(value = "/test",method = RequestMethod.GET)
    public @ResponseBody int test(){
        UserDetailInfo ui = new UserDetailInfo();
        ui.setPageurl("https://www.geo43.com");
        ui.setNickname("v2ex");
        int row = userDetailInfoMapper.insertOne(ui);
        return row;
    }

    @RequestMapping(value = "/desc",method = RequestMethod.POST)
    public @ResponseBody int testGIthub(){
        GithubUserInfo githubUserInfo = new GithubUserInfo();
        githubUserInfo.setNickname("nacy");
        githubUserInfo.setAuthor("complone");
        int row = githubUserInfoMapper.insertGithubUserInfo(githubUserInfo);
        return row;
    }

    @Autowired
    private GitHubProcessor gitHubProcessor;

    @Autowired
    private GithubUserPipeline githubUserPipeline;

    @RequestMapping("/start")
    public String start() {
        gitHubProcessor.start(gitHubProcessor,githubUserPipeline);
        return "GithubSpider is close!";
    }
}

保存多線程抽取邏輯時(shí)出現(xiàn)的并發(fā)任務(wù)，參考流水線操作SaveTask。
需要開(kāi)啟事務(wù)@Transactional顯示設(shè)置需要的提交模式

package com.complone.zhihumagic;

import com.complone.zhihumagic.mapper.UserDetailInfoMapper;
import com.complone.zhihumagic.model.UserDetailInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;

import java.util.concurrent.BlockingDeque;
import java.util.concurrent.BlockingQueue;


public class SavingTask implements Runnable {

    Logger logger = LoggerFactory.getLogger(SavingTask.class);

    @Autowired
    private UserDetailInfoMapper userDetailInfoMapper;

    private BlockingQueue<UserDetailInfo> blockingDeque; //新建一個(gè)阻塞隊(duì)列

    private volatile boolean isStop = false; //標(biāo)記多線程征用資源時(shí)，鎖是否得到釋放

    private volatile int i =0; //多線程存儲(chǔ)次數(shù)計(jì)數(shù)器

    public UserDetailInfoMapper getUserDetailInfoMapper() {
        return userDetailInfoMapper;
    }

    public void setUserDetailInfoMapper(UserDetailInfoMapper userDetailInfoMapper) {
        this.userDetailInfoMapper = userDetailInfoMapper;
    }

    public SavingTask(BlockingDeque<UserDetailInfo> blockingDeque) {
        this.blockingDeque = blockingDeque;
    }



    @Override
    public void run() {
        while (true) {
            if (isStop) { //線程標(biāo)記符，判斷是否終止
                return;
            }
            UserDetailInfo userDetailInfo = blockingDeque.poll(); //獲取進(jìn)入阻塞隊(duì)列的對(duì)象
            if (userDetailInfo == null) {
                try {
                    Thread.currentThread().sleep(2000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            } else {
                synchronized (this){ //分離寫(xiě)對(duì)象操作,加鎖防止線程爭(zhēng)用
                    try{
                        userDetailInfoMapper.insertSelective(userDetailInfo);
                        logger.info("-------------存貯了：{}------------",++i);
                    }catch (Exception e){
                        logger.error("-------出現(xiàn)問(wèn)題------{}---{}",e,userDetailInfo );
                    }

                }
            }
        }
    }

    public void startSave() {
        this.isStop = true;
    }

    public void stopSave() {
        this.isStop = true;
    }
}

----------------------------- 分割線，今日待更------------------------------------------

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

記一次初學(xué)Webmagic的踩坑之旅：爬取知乎數(shù)據(jù)

記一次初學(xué)Webmagic的踩坑之旅：爬取知乎數(shù)據(jù)

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

記一次初學(xué)Webmagic的踩坑之旅：爬取知乎數(shù)據(jù)

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av