A Detailed Guide to Python's urllib Library

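What is urllib?

urllib is the HTTP request library built into Python. It consists of four modules:

urllib.request: the request module
urllib.error: the exception-handling module
urllib.parse: the URL-parsing module
urllib.robotparser: the robots.txt-parsing module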
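
The sections that follow exercise urllib.request, urllib.error and urllib.parse, but not urllib.robotparser. As a minimal sketch of that fourth module, the snippet below feeds a small, made-up robots.txt (the rules and the example.com URLs are assumptions for illustration) to RobotFileParser.parse() and asks whether two paths may be fetched. No network access is involved.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, supplied inline so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch(useragent, url) applies the first rule whose path prefix matches.
allowed = parser.can_fetch('*', 'http://example.com/index.html')
blocked = parser.can_fetch('*', 'http://example.com/private/data.html')

print(allowed, blocked)
```

Against a real site, set_url() followed by read() would download the file instead of parse().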

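Changes from Python 2

In Python 2, urlopen lived in the urllib2 module; in Python 3 it moved to urllib.request.

Python 2: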

import urllib2
response = urllib2.urlopen('http://www.baidu.com')

Python 3:

import urllib.request
response = urllib.request.urlopen('http://www.baidu.com')

urllib

urlopen

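urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

Note: cafile, capath and cadefault were deprecated in Python 3.6 and removed in Python 3.13; use context instead.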

import urllib.request
response = urllib.request.urlopen('http://www.baidu.com')
print(response.read().decode('utf-8'))
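
The snippet above hard-codes UTF-8 and never closes the response. A slightly more defensive variant, offered only as a sketch, uses the response as a context manager and decodes with whatever charset the server declares. The helper name read_text and its fallback_charset parameter are assumptions, not part of urllib.

```python
import urllib.request


def read_text(url, timeout=10, fallback_charset='utf-8'):
    """Fetch url and decode the body with the declared charset (hypothetical helper)."""
    # The with-block closes the underlying connection when it exits.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # get_content_charset() returns None when the Content-Type
        # header does not declare a charset, so fall back in that case.
        charset = response.headers.get_content_charset() or fallback_charset
        return response.read().decode(charset)
```

A call such as read_text('http://www.baidu.com') would then return the page as a str.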

import urllib.parse
import urllib.request

data = bytes(urllib.parse.urlencode({'word':'hello'}),encoding='utf8')
response = urllib.request.urlopen('http://httpbin.org/post',data=data)
print(response.read())
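
What the POST example passes as data can be checked without touching the network. The sketch below repeats the encoding step on its own: urlencode() produces a str, and urlopen() expects bytes, so the string is encoded before sending. Supplying data is also what makes urlopen() issue a POST rather than a GET.

```python
from urllib.parse import urlencode

form = {'word': 'hello'}

query_string = urlencode(form)        # str: key=value pairs joined by '&'
body = query_string.encode('utf-8')   # bytes: the form accepted by data=

print(query_string, body)
```

Calling .encode('utf-8') on the string is equivalent to the bytes(..., encoding='utf8') form used above.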

import urllib.request
response = urllib.request.urlopen('http://httpbin.org/get',timeout=1)
print(response.read())

import socket
import urllib.request
import urllib.error
try:
    response = urllib.request.urlopen('http://httpbin.org/get',timeout=0.1)
except urllib.error.URLError as e:
    if isinstance(e.reason,socket.timeout):
        print('TIME OUT')
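
One caveat with the check above: a timeout while connecting is wrapped in URLError, but a timeout while waiting for the response body can surface as a bare socket.timeout (an alias of TimeoutError since Python 3.10). The sketch below handles both cases; the helper name is_timeout is an assumption introduced for illustration.

```python
import socket
import urllib.error


def is_timeout(exc):
    """Return True if exc is a timeout, whether or not URLError wraps it."""
    if isinstance(exc, urllib.error.URLError):
        # URLError stores the underlying cause in .reason
        return isinstance(exc.reason, socket.timeout)
    return isinstance(exc, socket.timeout)
```

To use it, catch (urllib.error.URLError, socket.timeout) around urlopen() and pass the caught exception to is_timeout().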

Response

Response type

import urllib.request
response = urllib.request.urlopen('https://www.python.org')
print(type(response))

Status code and response headers

import urllib.request
response = urllib.request.urlopen('http://www.python.org')
print(response.status)
print(response.getheaders())

Request

import urllib.request
request = urllib.request.Request('https://python.org')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

from urllib import request,parse
url = 'http://httpbin.org/post'
headers = {
    'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
    'Host':'httpbin.org'
}
# Named form_data rather than dict, to avoid shadowing the built-in dict type
form_data = {
    'name':'puqunzhu'
}
data = bytes(parse.urlencode(form_data),encoding='utf8')
req = request.Request(url=url,data=data,headers=headers,method='POST')
response = request.urlopen(req)
print(response.read().decode('utf-8'))

from urllib import request,parse
url = 'http://httpbin.org/post'
# Named form_data rather than dict, to avoid shadowing the built-in dict type
form_data = {
    'name':'puqunzhu'
}
data = bytes(parse.urlencode(form_data),encoding='utf8')
req = request.Request(url=url,data=data,method='POST')
req.add_header('User-Agent','Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36')
response = request.urlopen(req)
print(response.read().decode('utf-8'))
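
A Request object can be inspected before it is sent, which is a convenient way to see what the two examples above build. The sketch below constructs a request without opening it; the User-Agent value my-client/0.1 is a placeholder, not a real browser string.

```python
from urllib import parse, request

form_data = {'name': 'puqunzhu'}
data = parse.urlencode(form_data).encode('utf-8')

req = request.Request('http://httpbin.org/post', data=data)
req.add_header('User-Agent', 'my-client/0.1')  # placeholder value

# With data present and no explicit method, the request defaults to POST.
method = req.get_method()

# add_header() stores keys via str.capitalize(), so look the header up
# as 'User-agent' rather than 'User-Agent'.
user_agent = req.get_header('User-agent')

# Without data, the default method is GET.
plain_method = request.Request('http://httpbin.org/get').get_method()

print(method, user_agent, plain_method)
```

Nothing is transmitted until the object is passed to urlopen().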

Handler

Proxy

import urllib.request
proxy_handler = urllib.request.ProxyHandler({
    'http':'http://61.135.217.7:80',
    'https':'https://61.150.96.27:46111',
})
opener = urllib.request.build_opener(proxy_handler)
response = opener.open('http://www.baidu.com')
print(response.read())
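
The proxy setup above can be wrapped in a small function and checked without sending any traffic. This is a sketch only: build_proxy_opener is a hypothetical helper, and 127.0.0.1:8080 is a placeholder for a proxy you actually run.

```python
import urllib.request


def build_proxy_opener(proxies):
    """Return an opener that routes requests through the given proxies."""
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)


# Placeholder address; replace it with a real proxy before opening URLs.
opener = build_proxy_opener({'http': 'http://127.0.0.1:8080'})

# The opener keeps its handlers in a list; ours replaces the default
# ProxyHandler that build_opener() would otherwise add.
proxy_handlers = [h for h in opener.handlers
                  if isinstance(h, urllib.request.ProxyHandler)]

print(proxy_handlers[0].proxies)
```

Passing the opener to urllib.request.install_opener() would make plain urlopen() calls use the proxy as well.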

Cookie

# Collect the cookies the server sets in a CookieJar and print them
import http.cookiejar,urllib.request
cookie = http.cookiejar.CookieJar()
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open('http://www.baidu.com')
for item in cookie:
    print(item.name+"="+item.value)

# Save the cookies to a file in the Mozilla (Netscape) cookies.txt format
import http.cookiejar,urllib.request
filename = "cookie.txt"
cookie = http.cookiejar.MozillaCookieJar(filename)
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open('http://www.baidu.com')
cookie.save(ignore_discard=True,ignore_expires=True)

# Save the cookies to a file in the LWP (libwww-perl) format
import http.cookiejar,urllib.request
filename = "cookie.txt"
cookie = http.cookiejar.LWPCookieJar(filename)
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open("http://www.baidu.com")
cookie.save(ignore_discard=True,ignore_expires=True)

# Load cookies from the file; the jar class must match the format the file was saved in
import http.cookiejar,urllib.request
cookie = http.cookiejar.LWPCookieJar()
cookie.load('cookie.txt',ignore_discard=True,ignore_expires=True)
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open("http://www.baidu.com")
print(response.read().decode('utf-8'))
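
The save and load steps can be tried offline by putting a hand-built cookie into the jar. The sketch below assumes a temporary directory for cookie.txt and an invented session cookie for example.com; make_cookie is a hypothetical helper. It also shows why the loading class has to match the saving class.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain):
    """Build a session cookie by hand (normally the server sets these)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), 'cookie.txt')

# Save in LWP format. ignore_discard=True keeps session cookies,
# which would otherwise be skipped.
jar = http.cookiejar.LWPCookieJar(path)
jar.set_cookie(make_cookie('session', 'abc123', 'example.com'))
jar.save(ignore_discard=True, ignore_expires=True)

# Loading with the matching class restores the cookie.
loaded = http.cookiejar.LWPCookieJar()
loaded.load(path, ignore_discard=True, ignore_expires=True)
loaded_pairs = [(c.name, c.value) for c in loaded]

# Loading with the other class fails: the file header does not match.
try:
    http.cookiejar.MozillaCookieJar().load(path)
    mismatch_detected = False
except http.cookiejar.LoadError:
    mismatch_detected = True

print(loaded_pairs, mismatch_detected)
```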

Exception handling

from urllib import request,error
try:
    response = request.urlopen('http://cuiqingcai.com/index.htm')
except error.URLError as e:
    print(e.reason)

from urllib import request,error
try:
    response = request.urlopen('http://cuiqingcai.com/index.htm')
except error.HTTPError as e:  # HTTPError is a subclass of URLError, so catch it first
    print(e.reason,e.code,e.headers,sep="\n")
except error.URLError as e:
    print(e.reason)
else:
    print("Request Successfully")
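
The order of the except clauses matters because HTTPError derives from URLError. The sketch below demonstrates this offline by constructing the two exceptions by hand (urlopen() normally raises them); describe is a hypothetical helper, and the example.com URL and messages are invented.

```python
from urllib import error


def describe(exc):
    """Summarize a urllib exception, testing the most specific type first."""
    if isinstance(exc, error.HTTPError):
        return 'HTTP error {}: {}'.format(exc.code, exc.reason)
    if isinstance(exc, error.URLError):
        return 'URL error: {}'.format(exc.reason)
    return 'other error: {}'.format(exc)


is_subclass = issubclass(error.HTTPError, error.URLError)

# Built by hand for illustration only.
not_found = error.HTTPError('http://example.com/missing', 404, 'Not Found', None, None)
unreachable = error.URLError('name resolution failed')

print(is_subclass, describe(not_found), describe(unreachable))
```

If the URLError branch came first, it would also swallow every HTTPError and the status code would never be reported.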

import socket
import urllib.request
import urllib.error
try:
    response = urllib.request.urlopen('http://www.baiduc.com',timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    if isinstance(e.reason,socket.timeout):
        print("TIME OUT")

URL parsing

urlparse

urllib.parse.urlparse(urlstring,scheme='',allow_fragments=True)

from urllib.parse import urlparse
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment')
print(type(result),result)
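
urlparse() returns a ParseResult, a named tuple with six fields: scheme, netloc, path, params, query and fragment. As a small sketch, the fields can be read by name or by index, and the query string can be broken down further with parse_qs().

```python
from urllib.parse import parse_qs, urlparse

result = urlparse('http://www.baidu.com/index.html;user?id=5#comment')

# Access by attribute name...
scheme, netloc, path = result.scheme, result.netloc, result.path
params, query, fragment = result.params, result.query, result.fragment

# ...or by position, since ParseResult is a tuple.
first_field = result[0]

# parse_qs() turns the query string into a dict of lists.
query_dict = parse_qs(result.query)

print(result)
print(query_dict)
```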

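# When the URL has no scheme, the scheme argument supplies the default
from urllib.parse import urlparse
result = urlparse('www.baidu.com/index.html;user?id=5#comment',scheme='https')
print(result)

# A scheme already present in the URL takes precedence over the scheme argument
from urllib.parse import urlparse
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment',scheme='https')
print(result)

# With allow_fragments=False the fragment stays attached to the preceding component (here the query)
from urllib.parse import urlparse
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment',allow_fragments=False)
print(result)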

from urllib.parse import urlparse
result = urlparse('http://www.baidu.com/index.html#comment',allow_fragments=False)
print(result)
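
The effect of the scheme and allow_fragments arguments is easiest to see by comparing the parsed fields side by side. The sketch below passes both as keyword arguments to urlparse() and records what each call returns.

```python
from urllib.parse import urlparse

url = 'http://www.baidu.com/index.html;user?id=5#comment'

# No scheme in the URL: the default is used, and because there is no '//'
# the host name ends up in path rather than netloc.
no_scheme = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')

# The URL's own scheme wins over the default.
has_scheme = urlparse(url, scheme='https')

# Fragments disabled: '#comment' stays glued to the query...
frag_in_query = urlparse(url, allow_fragments=False)

# ...or to the path when there is no query.
frag_in_path = urlparse('http://www.baidu.com/index.html#comment', allow_fragments=False)

for r in (no_scheme, has_scheme, frag_in_query, frag_in_path):
    print(r)
```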

urlunparse

from urllib.parse import urlunparse
data =['http','www.baidu.com','index.html','user','a=6','comment']
print(urlunparse(data))
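
urlunparse() is the inverse of urlparse(): it takes an iterable of exactly six components and assembles a URL. The sketch below shows the round trip, plus one way to change a single component with the named tuple's _replace() method and rebuild the URL with geturl().

```python
from urllib.parse import urlparse, urlunparse

parts = ['http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment']
url = urlunparse(parts)

# Parsing the result gives the components back; note that the path
# gains a leading slash once a netloc is present.
parsed = urlparse(url)

# Swap one component and rebuild the URL.
updated = parsed._replace(query='a=7').geturl()

print(url)
print(updated)
```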

urljoin

from urllib.parse import urljoin
print(urljoin('http://www.baidu.com','?category=2#comment'))
print(urljoin('http://www.baidu.com/about.html','http://www.baidu.com/FAQ.html'))
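
urljoin() resolves its second argument against the first, much as a browser resolves a link against the page it appears on. A few representative cases are sketched below; the /docs/ base path and the relative links are made up for illustration.

```python
from urllib.parse import urljoin

base = 'http://www.baidu.com/docs/about.html'

sibling = urljoin(base, 'FAQ.html')                  # relative to the base directory
from_root = urljoin(base, '/FAQ.html')               # relative to the site root
parent = urljoin(base, '../img/logo.png')            # '..' climbs one directory
absolute = urljoin(base, 'https://www.python.org/')  # a full URL replaces the base
query_only = urljoin('http://www.baidu.com', '?category=2#comment')

for u in (sibling, from_root, parent, absolute, query_only):
    print(u)
```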

urlencode

from urllib.parse import urlencode
params = {
    'name':'puqunzhu',
    'age':23
}
base_url="http://www.baidu.com?"
url=base_url + urlencode(params)
print(url)
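
urlencode() also takes care of escaping, which matters as soon as values contain spaces or non-ASCII characters. The sketch below uses made-up parameters to show the escaping, the reverse step with parse_qs(), the difference from quote(), and the doseq flag for repeated keys.

```python
from urllib.parse import parse_qs, quote, urlencode

params = {'q': 'hello world', 'lang': '中文'}

# Spaces become '+', and non-ASCII text is percent-encoded as UTF-8.
encoded = urlencode(params)

# parse_qs() reverses the encoding; every value comes back as a list.
decoded = parse_qs(encoded)

# quote() is meant for path segments and encodes a space as %20 instead.
path_segment = quote('hello world')

# doseq=True expands a sequence into one key=value pair per item.
multi = urlencode({'tag': ['a', 'b']}, doseq=True)

print(encoded)
print(decoded)
print(path_segment, multi)
```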
