Verification
Run on the command line: scrapyd
Output like the following indicates that it started successfully:
bdccl@bdccl-virtual-machine:~$ scrapyd
Removing stale pidfile /home/bdccl/twistd.pid
2017-12-15T19:01:09+0800 [-] Removing stale pidfile /home/bdccl/twistd.pid
2017-12-15T19:01:09+0800 [-] Loading /usr/local/lib/python2.7/dist-packages/scrapyd/txapp.py...
2017-12-15T19:01:10+0800 [-] Scrapyd web console available at http://127.0.0.1:6800/
2017-12-15T19:01:10+0800 [-] Loaded.
2017-12-15T19:01:10+0800 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 17.9.0 (/usr/bin/python 2.7.12) starting up.
2017-12-15T19:01:10+0800 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
2017-12-15T19:01:10+0800 [-] Site starting on 6800
2017-12-15T19:01:10+0800 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site instance at 0x7f9589b0fa28>
2017-12-15T19:01:10+0800 [Launcher] Scrapyd 1.2.0 started: max_proc=4, runner=u'scrapyd.runner'
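Besides reading the log, the running daemon can also be checked programmatically. The following is a minimal sketch, assuming Python 3 for the client script and the daemonstatus.json endpoint that Scrapyd exposes as of the 1.2.0 release shown in the log; the helper names parse_daemon_status and scrapyd_is_up are hypothetical, not part of Scrapyd.

```python
import json
from urllib.request import urlopen


def parse_daemon_status(body: str) -> bool:
    """Return True when a daemonstatus.json response body reports status "ok"."""
    return json.loads(body).get("status") == "ok"


def scrapyd_is_up(base_url: str = "http://localhost:6800", timeout: float = 5.0) -> bool:
    """Ask the Scrapyd daemon for its status; any connection or parse error counts as down.

    base_url defaults to the address printed in the startup log (an assumption
    for a local setup; adjust it for a remote host).
    """
    try:
        with urlopen(f"{base_url}/daemonstatus.json", timeout=timeout) as resp:
            return parse_daemon_status(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts,
        # ValueError covers a body that is not valid JSON.
        return False
```

Calling scrapyd_is_up() from a script returns True once the web console line has appeared in the log, and False if nothing is listening on port 6800.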
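Deploying spiders
Common commands:
Deploy a spider to scrapyd:
First switch to the root directory of the spider project, edit scrapy.cfg, and uncomment the following line:
url = http://localhost:6800/
Then run the following command in a terminal:
scrapyd-deploy <target> -p PROJECT_NAME
(target is the name of the deploy target and corresponds to the [deploy] section in scrapy.cfg; it is optional.)
Then open http://localhost:6800/ or http://127.0.0.1:6800/ in a browser to view the execution status of the spider jobs and the job_id of each one.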
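To see how the target argument relates to scrapy.cfg, the sketch below parses a sample configuration and lists the deploy targets it defines. It assumes Python 3; the project name myproject, the staging target, and its URL are hypothetical placeholders, and deploy_targets is an illustrative helper rather than part of scrapyd-deploy. The unnamed [deploy] section is reported as "default", which is how scrapyd-deploy refers to it.

```python
from configparser import ConfigParser

# Hypothetical scrapy.cfg content: one unnamed target and one named target.
SAMPLE_CFG = """
[settings]
default = myproject.settings

[deploy]
url = http://localhost:6800/
project = myproject

[deploy:staging]
url = http://staging.example.com:6800/
project = myproject
"""


def deploy_targets(cfg_text: str) -> dict:
    """Map each deploy target name to its url, as read from scrapy.cfg text."""
    parser = ConfigParser(interpolation=None)  # URLs may contain '%'
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section == "deploy":
            name = "default"                   # the unnamed [deploy] section
        elif section.startswith("deploy:"):
            name = section.split(":", 1)[1]    # [deploy:NAME] -> NAME
        else:
            continue                           # e.g. [settings]
        targets[name] = parser.get(section, "url", fallback=None)
    return targets
```

With this sample file, running scrapyd-deploy staging -p PROJECT_NAME would address the [deploy:staging] section, while omitting the target falls back to the unnamed [deploy] section.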
Check status (lists the available deploy targets):
scrapyd-deploy -l
Start a spider:
curl http://localhost:6800/schedule.json -d project=PROJECT_NAME -d spider=SPIDER_NAME
Stop a spider:
curl http://localhost:6800/cancel.json -d project=PROJECT_NAME -d job=JOB_ID
Delete a project:
curl http://localhost:6800/delproject.json -d project=PROJECT_NAME
List the deployed projects:
curl http://localhost:6800/listprojects.json
List the spiders in a project:
curl "http://localhost:6800/listspiders.json?project=PROJECT_NAME"
List the jobs of a project:
curl "http://localhost:6800/listjobs.json?project=PROJECT_NAME"
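The same endpoints can be called from a script instead of curl. The following is a rough sketch, assuming Python 3 and a Scrapyd instance at http://localhost:6800; the function names (build_request, call, schedule, cancel, list_jobs, running_job_ids) are hypothetical helpers, not an official client. As with curl, -d style parameters are sent as a POST body and the list endpoints use a GET query string.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:6800"  # assumed local Scrapyd address


def build_request(endpoint: str, params: dict, post: bool) -> Request:
    """Build (but do not send) the HTTP request for a Scrapyd JSON endpoint."""
    query = urlencode(params)
    url = f"{BASE_URL}/{endpoint}"
    if post:
        # Equivalent to: curl URL -d key=value -d key=value
        return Request(url, data=query.encode("ascii"), method="POST")
    # Equivalent to: curl "URL?key=value"
    return Request(f"{url}?{query}" if query else url, method="GET")


def call(endpoint: str, params: dict, post: bool) -> dict:
    """Send the request and decode the JSON reply."""
    with urlopen(build_request(endpoint, params, post), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def schedule(project: str, spider: str) -> dict:
    return call("schedule.json", {"project": project, "spider": spider}, post=True)


def cancel(project: str, job_id: str) -> dict:
    return call("cancel.json", {"project": project, "job": job_id}, post=True)


def list_jobs(project: str) -> dict:
    return call("listjobs.json", {"project": project}, post=False)


def running_job_ids(listjobs_reply: dict) -> list:
    """Extract job ids from the "running" list of a listjobs.json reply."""
    return [job["id"] for job in listjobs_reply.get("running", [])]
```

A typical sequence would be schedule("PROJECT_NAME", "SPIDER_NAME"), then running_job_ids(list_jobs("PROJECT_NAME")) to find the JOB_ID to pass to cancel.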