Getting Started with the Scrapy Framework - Installation

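Earlier I (Douzi) covered the basic urllib module, which is enough for writing simple crawlers. For medium and large crawling projects, urllib becomes too bare-bones. The Scrapy framework is a better fit at that point, because much of the basic handling is already built in.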

Start with the installation. The recommended order is as follows:

  1. Upgrade pip first

C:\WINDOWS\system32>python -m pip install --upgrade pip
Requirement already up-to-date: pip in c:\python36\lib\site-packages

  2. Install wheel

C:\WINDOWS\system32>pip install wheel
Requirement already satisfied: wheel in c:\python36\lib\site-packages

  3. Install lxml

C:\WINDOWS\system32>pip install lxml
Collecting lxml
  Downloading lxml-4.1.1-cp36-cp36m-win32.whl (3.2MB)
    100% |████████████████████████████████| 3.2MB 307kB/s
Installing collected packages: lxml
Successfully installed lxml-4.1.1

  4. Install Twisted (if the online installation fails, consider installing it offline)

Search for twisted on https://www.lfd.uci.edu; it is available in many versions.

Run the python command to check the current interpreter. As the output shows, mine is version 3.6.2, 32-bit:

c:\Users\yuan.li\Downloads>python
Python 3.6.2 (v3.6.2:5fd33b5, Jul  8 2017, 04:14:34) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

In that case, downloading the matching build, Twisted-17.9.0-cp36-cp36m-win32.whl, is enough.
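Matching the wheel file to the interpreter can also be checked from Python itself. The following is a minimal sketch, not part of the original walkthrough: the helper name `wheel_tags` is hypothetical, the `cp` prefix assumes the standard CPython interpreter, and the platform names assume an x86 Windows machine.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_guess) for the running interpreter."""
    # "cp36"-style tag: "cp" (assumes CPython) plus major and minor version digits.
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
    bits = struct.calcsize("P") * 8
    # Assumption: x86 Windows wheels use "win32" (32-bit) or "win_amd64" (64-bit).
    platform_guess = "win32" if bits == 32 else "win_amd64"
    return python_tag, platform_guess


if __name__ == "__main__":
    py_tag, plat = wheel_tags()
    print("Look for a wheel whose name contains {} and {}".format(py_tag, plat))
```

On a 32-bit Python 3.6 interpreter like the one shown above, this should point to the `cp36` and `win32` parts of the file name.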
After downloading it, install it manually:

c:\Users\yuan.li\Downloads>pip install Twisted-17.9.0-cp36-cp36m-win32.whl
Processing c:\users\yuan.li\downloads\twisted-17.9.0-cp36-cp36m-win32.whl
Requirement already satisfied: incremental>=16.10.1 in c:\python36\lib\site-packages (from Twisted==17.9.0)
Requirement already satisfied: Automat>=0.3.0 in c:\python36\lib\site-packages (from Twisted==17.9.0)
Requirement already satisfied: zope.interface>=4.0.2 in c:\python36\lib\site-packages (from Twisted==17.9.0)
Requirement already satisfied: hyperlink>=17.1.1 in c:\python36\lib\site-packages (from Twisted==17.9.0)
Requirement already satisfied: constantly>=15.1 in c:\python36\lib\site-packages (from Twisted==17.9.0)
Requirement already satisfied: six in c:\python36\lib\site-packages (from Automat>=0.3.0->Twisted==17.9.0)
Requirement already satisfied: attrs in c:\python36\lib\site-packages (from Automat>=0.3.0->Twisted==17.9.0)
Requirement already satisfied: setuptools in c:\python36\lib\site-packages (from zope.interface>=4.0.2->Twisted==17.9.0)
Installing collected packages: Twisted
Successfully installed Twisted-17.9.0
  5. Finally, install Scrapy

c:\Users\yuan.li\Downloads>pip install scrapy
Collecting scrapy
  Downloading Scrapy-1.5.0-py2.py3-none-any.whl (251kB)
    100% |████████████████████████████████| 256kB 2.3MB/s
Collecting pyOpenSSL (from scrapy)
  Downloading pyOpenSSL-17.5.0-py2.py3-none-any.whl (53kB)
    100% |████████████████████████████████| 61kB 4.5MB/s
Collecting cssselect>=0.9 (from scrapy)
  Downloading cssselect-1.0.3-py2.py3-none-any.whl
Collecting parsel>=1.1 (from scrapy)
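The installation order above can be sketched as a small script. This is only an illustration under stated assumptions: the function names `build_install_commands` and `install_all` are hypothetical, and nothing is installed unless `dry_run` is explicitly set to `False`. It calls pip as `python -m pip` through `sys.executable`, so the packages go to the same interpreter that runs the script.

```python
import subprocess
import sys


def build_install_commands(twisted_wheel=None):
    """Return the pip commands for the five steps as argument lists."""
    pip = [sys.executable, "-m", "pip", "install"]
    # Use the local wheel file when one is given, otherwise install from the index.
    twisted_target = twisted_wheel if twisted_wheel else "twisted"
    return [
        pip + ["--upgrade", "pip"],  # step 1: upgrade pip
        pip + ["wheel"],             # step 2: wheel
        pip + ["lxml"],              # step 3: lxml
        pip + [twisted_target],      # step 4: Twisted (online or offline)
        pip + ["scrapy"],            # step 5: Scrapy
    ]


def install_all(twisted_wheel=None, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_install_commands(twisted_wheel):
        print(" ".join(command))
        if not dry_run:
            # check_call raises CalledProcessError if pip exits with an error.
            subprocess.check_call(command)
```

Calling `install_all("Twisted-17.9.0-cp36-cp36m-win32.whl")` only prints the commands, which makes it easy to review them before running anything for real.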

Once the installation finishes, run scrapy to check that it works:

c:\Users\yuan.li\Downloads>scrapy
Scrapy 1.5.0 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory
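Besides running the scrapy command, the installed packages can be checked from Python. The sketch below is an addition, not part of the original steps: the helper names `check_module` and `report` are hypothetical, and it assumes only that each package is importable under the names `lxml`, `twisted`, and `scrapy`. Since not every package is guaranteed to expose a `__version__` attribute, the code falls back to "unknown".

```python
import importlib


def check_module(name):
    """Try to import a module; return (imported_ok, version_or_None)."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, None
    # Fall back to "unknown" when the module has no __version__ attribute.
    return True, getattr(module, "__version__", "unknown")


def report(names=("lxml", "twisted", "scrapy")):
    """Return one status line per package name."""
    lines = []
    for name in names:
        ok, version = check_module(name)
        status = "OK, version {}".format(version) if ok else "NOT INSTALLED"
        lines.append("{:<8} {}".format(name, status))
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A "NOT INSTALLED" line for any of the three packages suggests repeating the corresponding installation step.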
