Using the urllib Library in Python

urllib is a package in the Python standard library for working with URLs. It contains four modules: urllib.request, urllib.error, urllib.parse, and urllib.robotparser.
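
As a rough illustration of the two modules that do not touch the network directly, the sketch below splits a URL with urllib.parse and checks a robots.txt rule with urllib.robotparser. The URLs and the robots.txt lines are made up for the example and are not taken from any real site.

```python
from urllib.parse import parse_qs, urljoin, urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical URL, used only for illustration.
page_url = "http://www.example.com/path/page.html?lang=en&id=7#top"

# urllib.parse: split a URL into its components.
parts = urlparse(page_url)
query_params = parse_qs(parts.query)          # values are lists of strings
sibling_url = urljoin(page_url, "other.html")  # resolve a relative link

# urllib.robotparser: evaluate robots.txt rules supplied as lines of text.
# (In real use, set_url() and read() would fetch the file from the site.)
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])
allowed_public = robots.can_fetch("*", "http://www.example.com/index.html")
allowed_private = robots.can_fetch("*", "http://www.example.com/private/data.html")

print(parts.scheme, parts.netloc, parts.path)
print(query_params)
print(sibling_url)
print(allowed_public, allowed_private)
```

urllib.request and urllib.error are the two modules that come into play once a request is actually sent.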

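Of these, the urllib.request module opens and reads URLs. It supports several protocols, including HTTP, HTTPS, FTP, and local files, and can be used for tasks such as fetching web pages and downloading files. The functions and classes most often used together with urllib.request are listed below: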

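urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT, *, cafile=None, capath=None, cadefault=False, context=None): opens a URL and returns a file-like response object whose contents can be read. For HTTP and HTTPS URLs this is an http.client.HTTPResponse. The cafile, capath, and cadefault parameters have been deprecated since Python 3.6 and were removed in Python 3.13; pass an ssl.SSLContext through context instead.
urlretrieve(url, filename=None, reporthook=None, data=None): downloads the resource a URL points to, saves it to the local file system, and returns a (filename, headers) tuple. The official documentation describes it as a legacy interface.
urlencode(query, doseq=False, safe='', encoding=None, errors=None, quote_via=quote_plus): converts a dictionary or a sequence of two-element tuples into a URL-encoded string. Note that this function lives in urllib.parse, not urllib.request.
Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None): builds a request object on which the request headers, request method, and other details can be set.
build_opener([handler, …]): creates an OpenerDirector object that chains handlers to process requests, for example to add proxy, cookie, or authentication support.
install_opener(opener): installs an OpenerDirector object as the global default opener; every subsequent urlopen() call is handled by that opener.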
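
To show how these pieces can fit together, here is a minimal sketch that encodes form fields with urlencode, wraps them in a Request, and builds an opener with cookie support. The endpoint URL and the field names are hypothetical, and nothing is actually sent over the network.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical endpoint and form fields, used only for illustration.
url = "http://www.example.com/search"
fields = {"q": "python urllib", "page": 2}

# urlencode returns a str; a request body must be bytes.
query = urllib.parse.urlencode(fields)
body = query.encode("ascii")

# Supplying data makes the request a POST unless method= says otherwise.
req = urllib.request.Request(url, data=body, headers={"User-Agent": "Mozilla/5.0"})

# An opener that remembers cookies between requests.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie_jar))

print(query)
print(req.get_method())

# To send the request, one would call opener.open(req), or call
# urllib.request.install_opener(opener) and then use urlopen(req).
```

Calling opener.open(req) directly keeps the cookie handling local to that opener, whereas install_opener changes the behaviour of every later urlopen() call in the process.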

Here is a simple example:

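import urllib.request

# Open a URL and read its contents; the with block closes the response
with urllib.request.urlopen('http://www.baidu.com') as response:
    html = response.read()  # read() returns bytes, not str
print(html.decode('utf-8', errors='replace'))

# Download a file and save it locally
url = 'http://www.example.com/somefile.zip'
urllib.request.urlretrieve(url, 'localfile.zip')

# Build an HTTP request and set a request header
req = urllib.request.Request('http://www.example.com')
req.add_header('User-Agent', 'Mozilla/5.0')
with urllib.request.urlopen(req) as response:
    html = response.read()
print(html.decode('utf-8', errors='replace'))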
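
Requests like the ones above can fail, and urllib.error defines the exceptions that urlopen raises in that case. The following is a minimal sketch of one way to handle them; fetch_text is a hypothetical helper name, and the 10-second timeout and UTF-8 default are assumptions rather than recommendations.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10.0, encoding="utf-8"):
    """Return the decoded body of `url`, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode(encoding, errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it is caught first.
        print(f"HTTP error {exc.code} for {url}: {exc.reason}")
    except urllib.error.URLError as exc:
        # No usable response: DNS failure, refused connection, missing file, ...
        print(f"Could not reach {url}: {exc.reason}")
    return None
```

A call such as fetch_text('http://www.example.com') would then return the page text on success and None on failure, so the caller can decide whether to retry or give up.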

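Note: in Python 3.x, the URL-handling modules of Python 2 (urllib, urllib2, urlparse, and robotparser) were reorganized into the urllib package, which is split into four submodules: urllib.request, urllib.error, urllib.parse, and urllib.robotparser. Import whichever submodule you actually need.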