Several Ways to Retry Timed-Out Requests in a Python Crawler
Method 1
import requests

headers = {}
url = 'https://www.baidu.com'
try:
    proxies = None
    response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
except requests.RequestException:
    # logdebug('requests failed one time')
    try:
        proxies = None
        response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
    except requests.RequestException:
        # logdebug('requests failed two times')
        print('requests failed two times')
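Summary: the code is fairly redundant. The more retries you want, the more nested try blocks and lines you need, but logging each failure is straightforward.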
Method 2
import requests

def requestDemo(url):
    headers = {}
    trytimes = 3  # number of attempts
    for i in range(trytimes):
        try:
            proxies = None
            response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
            # Note: other status codes such as 302 are also possible here
            if response.status_code == 200:
                return response
        except requests.RequestException:
            # logdebug(f'requests failed {i + 1} time')
            print(f'requests failed {i + 1} time')
    return None  # every attempt failed
Summary: the loop is clearly much simpler than method 1, and logging is just as easy.
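A plain loop retries immediately, which can put extra load on a server that is already struggling. The sketch below adds a growing pause between attempts. It is a minimal illustration and not part of the original article: `get_with_backoff`, `fetch`, `base_delay` and `FakeResponse` are hypothetical names, and `fetch` stands for any callable that returns an object with a `status_code` attribute, for example `lambda u: requests.get(u, timeout=3)`.

```python
import time


def get_with_backoff(fetch, url, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times and return the first 200 response.

    Waits base_delay * 2**i seconds after failed attempt i (0-based),
    except after the last one. Returns None if every attempt fails.
    """
    for i in range(attempts):
        try:
            response = fetch(url)
            if response.status_code == 200:
                return response
            print(f"attempt {i + 1}: status {response.status_code}")
        except Exception as exc:  # with requests, prefer requests.RequestException
            print(f"attempt {i + 1}: {exc!r}")
        if i < attempts - 1:
            sleep(base_delay * 2 ** i)
    return None


class FakeResponse:
    """Stand-in for requests.Response, used only for this demo."""

    def __init__(self, status_code):
        self.status_code = status_code


# Demo without network access: raise once, return 503 once, then succeed.
outcomes = iter([TimeoutError("timed out"), FakeResponse(503), FakeResponse(200)])


def fake_fetch(url):
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


delays = []  # record the requested pauses instead of actually sleeping
result = get_with_backoff(fake_fetch, "https://www.baidu.com", sleep=delays.append)
print(result.status_code, delays)
```

Passing `sleep` as a parameter is only a convenience for the demo, so that the pauses can be recorded rather than waited out.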
Method 3
import requests

def requestDemo(url, times=1):
    headers = {}
    try:
        proxies = None
        response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
        html = response.text  # .text is a property, not a method
        # TODO: normal processing logic goes here
        return html
    except Exception:
        # logdebug(f'requests failed {times} time')
        trytimes = 3  # maximum number of attempts
        if times < trytimes:
            times += 1
            return requestDemo(url, times)
        return 'out of maxtimes'
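Summary: recursion looks more sophisticated, and an error raised by the processing code in the middle also triggers a retry. Drawbacks: it is harder to follow and easier to get wrong, and putting too much code inside the try block may slow the program down.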
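If retrying on processing errors is not wanted, one option is to keep only the network call inside `try` and run the processing step outside it. The following is a sketch under that assumption, with a loop in place of the recursion; `fetch_then_parse`, `fetch` and `parse` are hypothetical names, and `fetch` could be something like `lambda u: requests.get(u, timeout=3).text`.

```python
def fetch_then_parse(fetch, parse, url, attempts=3):
    """Retry only the fetch step; errors raised by parse propagate at once."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            body = fetch(url)  # only the network call is guarded
        except OSError as exc:  # requests.RequestException is an OSError subclass
            last_error = exc
            print(f"fetch failed on attempt {attempt}: {exc!r}")
            continue
        return parse(body)  # outside the try block, so never retried
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Demo without network access: the fake fetch times out twice, then succeeds.
calls = {"fetch": 0}


def flaky_fetch(url):
    calls["fetch"] += 1
    if calls["fetch"] < 3:
        raise TimeoutError("timed out")
    return "<html>ok</html>"


print(fetch_then_parse(flaky_fetch, str.upper, "https://www.baidu.com"))
print(calls["fetch"])
```

With this layout a bug in `parse` surfaces on the first call instead of being hidden behind repeated requests.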
Method 4
import requests

# The decorator must be defined before it is applied.
def retry(times):
    def wrapper(func):
        def inner_wrapper(*args, **kwargs):
            i = 0
            while i < times:
                try:
                    print(i)
                    return func(*args, **kwargs)
                except Exception:
                    # Log here; func.__name__ is the name of the decorated function
                    print("logdebug: {}()".format(func.__name__))
                    i += 1
            # All attempts failed: falls through and returns None
        return inner_wrapper
    return wrapper

@retry(3)  # number of attempts: 3
def requestDemo(url):
    headers = {}
    proxies = None
    response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
    html = response.text  # .text is a property, not a method
    # TODO: normal processing logic goes here
    return html
Summary: the advantage of a decorator is that it can be reused across many functions and is very convenient to apply.
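To illustrate the reuse mentioned in the summary, here is a slightly extended sketch of the decorator. The `exceptions` and `delay` parameters and the `crawler.retry` logger name are assumptions added for illustration. `functools.wraps` keeps the decorated function's name and docstring, and the last exception is re-raised once the attempts run out rather than silently returning `None`.

```python
import functools
import logging
import time

logger = logging.getLogger("crawler.retry")  # hypothetical logger name


def retry(times, exceptions=(Exception,), delay=0.0):
    """Retry the decorated function, making at most `times` attempts in total."""
    def wrapper(func):
        @functools.wraps(func)
        def inner_wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    logger.debug("%s() failed on attempt %d: %r",
                                 func.__name__, attempt, exc)
                    if attempt == times:
                        raise  # attempts exhausted: surface the last error
                    time.sleep(delay)
        return inner_wrapper
    return wrapper


# The same decorator reused on two unrelated functions.
state = {"calls": 0}


@retry(3)
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


@retry(2, exceptions=(ValueError,))
def always_bad():
    raise ValueError("never works")


print(flaky(), state["calls"], flaky.__name__)
try:
    always_bad()
except ValueError as exc:
    print("gave up:", exc)
```

Restricting `exceptions` to something like `requests.RequestException` would keep the decorator from retrying on programming errors.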
Method 5
#!/usr/bin/python
# -*- coding: utf-8 -*-
import requests
import warnings

warnings.filterwarnings("ignore")  # silences the warning caused by verify=False


def get_xiaomi():
    url = "https://www.mi.com/22"
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
        "Connection": "keep-alive",
        "Host": "www.mi.com",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.90 Safari/537.36"
    }
    res = "error"  # returned if every attempt raises
    for _ in range(5):  # try up to 5 times
        try:
            # The try sits inside the loop so that a failed request is retried
            response = requests.get(url, headers=headers, timeout=10, verify=False)
        except requests.RequestException:
            continue
        res = response.text
        print(response.status_code)
        if response.status_code == 200:
            break
    return res


if __name__ == '__main__':
    print(get_xiaomi())
Method 6