11. A Deep Dive into Selected Python Standard Library Modules

Latest recommended article published on 2026-07-02 16:02:29

This content is licensed under CC 4.0 BY-SA

Part of the column: Python from Beginner to Mastery

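A Deep Dive into Selected Python Standard Library Modules

Standard Library Overview
Enhanced Data Structures
- 2.1 collections: namedtuple, deque, defaultdict, Counter, OrderedDict, ChainMap
- 2.2 heapq: heap queue
- 2.3 bisect: binary search on sorted lists
Functional Programming Tools
- 3.1 functools: partial, lru_cache, reduce, wraps, singledispatch
- 3.2 itertools: infinite iterators, permutations and combinations, chaining and grouping
- 3.3 operator: operators as functions
Dates and Times
- 4.1 datetime: date, time, datetime, timedelta
- 4.2 time: timestamps, sleeping, formatting
Math and Randomness
- 5.1 math: constants and elementary functions
- 5.2 decimal: exact decimal floating point
- 5.3 fractions: rational numbers
- 5.4 random: random number generation
- 5.5 statistics: statistical functions
Files and System Interaction
- 6.1 os and os.path: operating system interface
- 6.2 pathlib: object-oriented path handling
- 6.3 shutil: high-level file operations
- 6.4 tempfile: temporary files and directories
- 6.5 glob: filename pattern matching
Data Serialization and Interchange
- 7.1 json: working with JSON
- 7.2 csv: comma-separated values
- 7.3 pickle: native Python serialization
Networking and Communication
- 8.1 A brief look at urllib and requests
- 8.2 http.server: a simple HTTP server
- 8.3 smtplib and email: handling mail
Concurrency and Async
- 9.1 threading: threads
- 9.2 multiprocessing: processes
- 9.3 asyncio: asynchronous I/O
Development Aids
- 10.1 logging: the logging system
- 10.2 unittest: unit testing framework
- 10.3 pdb and breakpoint: the debugger
- 10.4 typing: type hints
Summary and Learning Path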

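1. Standard Library Overview

Python's standard library is the collection of modules distributed with the interpreter. It covers file handling, networking, data structures, math, concurrency, testing, debugging, and more, which is why Python is described as "batteries included". Knowing the commonly used modules can greatly improve development efficiency and saves you from reinventing the wheel.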

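2. Enhanced Data Structures

2.1 `collections`: namedtuple, deque, defaultdict, Counter, OrderedDict, ChainMap

namedtuple – creates lightweight immutable objects that combine the performance of a tuple with the readability of an object.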

from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x, p[0])   # 10 10

deque – a double-ended queue with efficient O(1) insertion and removal at both ends.

from collections import deque
dq = deque([1,2,3])
dq.appendleft(0)
dq.pop()           # 3

defaultdict – calls a factory function to create a default value when a missing key is accessed.

from collections import defaultdict
dd = defaultdict(list)
dd['a'].append(1)   # no need to initialize an empty list first

Counter – counts how often each hashable item occurs.

from collections import Counter
cnt = Counter("abracadabra")
print(cnt.most_common(2))   # [('a', 5), ('b', 2)]

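OrderedDict – preserves insertion order (a plain dict is also ordered in Python 3.7+, but OrderedDict still offers extra methods such as move_to_end).
ChainMap – combines several dictionaries into a single logical view.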
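
OrderedDict and ChainMap are described above without code, so here is a minimal sketch of both. The names `lru`, `defaults`, `overrides`, and `settings` are illustrative and do not come from the original article.

```python
from collections import OrderedDict, ChainMap

# OrderedDict: move_to_end reorders an existing key in place.
lru = OrderedDict(a=1, b=2, c=3)
lru.move_to_end('a')              # 'a' becomes the most recent entry
lru.popitem(last=False)           # drop the oldest entry, which is now 'b'
print(list(lru))                  # ['c', 'a']

# ChainMap: lookups search the underlying maps from left to right.
defaults = {'color': 'red', 'size': 'M'}
overrides = {'color': 'blue'}
settings = ChainMap(overrides, defaults)
print(settings['color'], settings['size'])   # blue M
```

Putting `overrides` first is what gives it priority; the underlying dictionaries are not copied or merged.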

2.2 `heapq`: heap queue

Implements a min-heap on top of a plain list.

import heapq
heap = []
heapq.heappush(heap, 3)
heapq.heappush(heap, 1)
heapq.heappush(heap, 2)
print(heapq.heappop(heap))   # 1
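
Beyond push and pop, the module can heapify an existing list and pick the top few items. The following is a small sketch with made-up sample data (`scores` is a hypothetical list).

```python
import heapq

scores = [5, 1, 8, 3, 9, 2]

# heapify rearranges a list into a min-heap in place, in linear time.
heap = list(scores)
heapq.heapify(heap)
print(heap[0])                       # 1, the smallest item always sits at index 0

# nsmallest / nlargest return sorted results without mutating the input.
print(heapq.nsmallest(2, scores))    # [1, 2]
print(heapq.nlargest(2, scores))     # [9, 8]
```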

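2.3 `bisect`: binary search on sorted lists

Finds insertion points in an already sorted list and keeps the list sorted as items are added.

import bisect
lst = [1, 3, 5, 7]
pos = bisect.bisect_left(lst, 4)   # 2
bisect.insort(lst, 4)              # inserts in place, keeping order → [1, 3, 4, 5, 7]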

3. Functional Programming Tools

3.1 `functools`: partial, lru_cache, reduce, wraps, singledispatch

partial – freezes some of a function's arguments.

from functools import partial
power_of_2 = partial(pow, exp=2)   # f(x) = x**2; pow accepts keyword arguments since Python 3.8
print(power_of_2(5))               # 25

lru_cache – caches a function's return values, keyed by its arguments.

from functools import lru_cache
@lru_cache(maxsize=128)
def fib(n):
    if n < 2: return n
    return fib(n-1) + fib(n-2)
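
To see the cache at work, the decorated function exposes `cache_info()` and `cache_clear()`. This sketch repeats the `fib` definition so it runs on its own.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))            # 832040
info = fib.cache_info()   # named tuple: hits, misses, maxsize, currsize
print(info.misses)        # 31, one miss per distinct n from 0 to 30
fib.cache_clear()         # empty the cache and reset the statistics
```

Without the cache, the same call would recompute the same subproblems over and over; with it, each value of `n` is computed once.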

reduce – folds a sequence into a single value by applying a function cumulatively.

from functools import reduce
print(reduce(lambda a, b: a * b, [1,2,3,4]))   # 24

singledispatch – function overloading that dispatches on the type of the first argument.

from functools import singledispatch
@singledispatch
def process(val):
    print(f"Unknown type: {type(val)}")
@process.register(int)
def _(val):
    print(f"Handling int: {val}")
@process.register(str)
def _(val):
    print(f"Handling str: {val}")
process(10)   # Handling int: 10
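
The section heading also lists wraps, which has no example of its own. A minimal sketch follows; the `logged` decorator and the `add` function are hypothetical names chosen for illustration.

```python
from functools import wraps

def logged(func):
    """Hypothetical decorator that prints each call."""
    @wraps(func)                      # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@logged
def add(a, b):
    """Return a + b."""
    return a + b

print(add(2, 3))          # prints "calling add", then 5
print(add.__name__)       # add (it would be "wrapper" without @wraps)
print(add.__doc__)        # Return a + b.
```

Preserving the metadata matters for debugging output, documentation tools, and anything else that inspects the function.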

3.2 `itertools`: infinite iterators, permutations and combinations, chaining and grouping

import itertools
# Infinite iterators
# itertools.count(), itertools.cycle(), itertools.repeat()
# Permutations and combinations
for p in itertools.combinations('ABC', 2):   # ('A','B'), ('A','C'), ('B','C')
    pass
# chain joins several iterables
list(itertools.chain([1,2], [3,4]))   # [1,2,3,4]
# groupby groups consecutive items that share a key, so sort the input by that key first
data = [('a', 1), ('a', 2), ('b', 3)]
for k, g in itertools.groupby(data, key=lambda x: x[0]):
    print(k, list(g))
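
The infinite iterators are only named in a comment above. One way to use them safely is to cut off a finite prefix with `islice`, as in this sketch; the `words` list is made-up sample data.

```python
import itertools

# islice takes a finite prefix of an otherwise infinite iterator.
evens = list(itertools.islice(itertools.count(start=0, step=2), 5))
print(evens)                                   # [0, 2, 4, 6, 8]

colors = list(itertools.islice(itertools.cycle('RGB'), 5))
print(colors)                                  # ['R', 'G', 'B', 'R', 'G']

# repeat is infinite unless a count is given as the second argument.
print(list(itertools.repeat('x', 3)))          # ['x', 'x', 'x']

# groupby only merges adjacent items, so sort by the same key first.
words = ['banana', 'apple', 'blueberry', 'avocado']
by_letter = {
    k: list(g)
    for k, g in itertools.groupby(sorted(words), key=lambda w: w[0])
}
print(by_letter)   # {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry']}
```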

3.3 `operator`: operators as functions

Exposes built-in operators (such as +, *, and []) as functions, which is handy with map, sorted, and similar tools.

import operator
nums = [1, 2, 3]
print(list(map(operator.neg, nums)))                     # [-1, -2, -3]
data = [('a', 2), ('b', 1)]
data.sort(key=operator.itemgetter(1))                   # sort by the second element

4. Dates and Times

4.1 `datetime`: date, time, datetime, timedelta

from datetime import datetime, date, timedelta

# Current time
now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S"))   # e.g. 2025-05-12 10:30:00

# Date arithmetic
tomorrow = date.today() + timedelta(days=1)
diff = datetime(now.year, 12, 31) - now    # relative to the current year rather than a hard-coded one
print(f"{diff.days} days left until the end of the year")
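
Going the other way, from text to a datetime, uses `strptime` with the same format codes. The timestamp string below is an arbitrary example, which keeps the output deterministic.

```python
from datetime import datetime, timedelta

# strptime parses text into a datetime using the same format codes as strftime.
start = datetime.strptime("2025-05-12 10:30:00", "%Y-%m-%d %H:%M:%S")
end = start + timedelta(days=1, hours=2)

print(end.isoformat())            # 2025-05-13T12:30:00
elapsed = end - start             # subtracting datetimes yields a timedelta
print(elapsed.total_seconds())    # 93600.0
print(start.date(), start.time()) # 2025-05-12 10:30:00
```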

4.2 `time`: timestamps, sleeping, formatting

import time
ts = time.time()               # Unix timestamp (seconds since the epoch)
time.sleep(0.5)                # pause for 0.5 seconds
print(time.ctime(ts))          # human-readable form
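
For custom formatting and for timing code, `time.strftime` and `time.perf_counter` are the usual tools. A small sketch, using the Unix epoch so that the formatted output is predictable:

```python
import time

# gmtime(0) is the Unix epoch in UTC, so this output is deterministic.
epoch = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(0))
print(epoch)                      # 1970-01-01 00:00:00

# perf_counter is a monotonic clock intended for measuring short durations.
t0 = time.perf_counter()
total = sum(range(100_000))
elapsed = time.perf_counter() - t0
print(f"summed to {total} in {elapsed:.6f} s")
```

Prefer `perf_counter` over `time.time()` for measuring intervals, because the wall clock can jump when the system time is adjusted.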

5. Math and Randomness

5.1 `math`: constants and elementary functions

import math
print(math.pi, math.e)         # 3.14159... 2.71828...
print(math.sqrt(2))            # 1.414...
print(math.factorial(5))       # 120
print(math.gcd(12, 18))        # 6

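5.2 `decimal`: exact decimal floating point

Avoids binary floating-point representation error, which makes it a good fit for financial calculations.

from decimal import Decimal, getcontext
getcontext().prec = 10        # set the precision (significant digits)
print(Decimal('0.1') + Decimal('0.2'))   # 0.3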
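
The error being avoided is easy to show side by side. The sketch below also rounds a hypothetical price to two decimal places with `quantize`.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True

# Build Decimals from strings: Decimal(0.1) would inherit the float's error.
price = Decimal('19.995')
rounded = price.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(rounded)                                            # 20.00
```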

5.3 `fractions`: rational numbers

from fractions import Fraction
print(Fraction(1, 3) + Fraction(1, 6))   # 1/2

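5.4 `random`: random number generation

import random
print(random.randint(1, 10))          # random integer, both endpoints included
print(random.choice(['a','b','c']))   # random choice
lst = [1, 2, 3, 4, 5]
random.shuffle(lst)                   # shuffles the list in place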

5.5 `statistics`: statistical functions

import statistics
data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
print(statistics.mean(data))          # mean
print(statistics.median(data))        # median
print(statistics.stdev(data))         # sample standard deviation

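6. Files and System Interaction

6.1 `os` and `os.path`: operating system interface

import os
# Environment variables
print(os.environ.get('HOME'))         # None if the variable is not set
# Path handling
path = os.path.join('dir', 'file.txt')
dir_name = os.path.dirname(path)
# File operations (each raises OSError if the path is missing or, for mkdir, already exists)
os.rename('old.txt', 'new.txt')
os.remove('file.txt')
os.mkdir('newdir')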

6.2 `pathlib`: object-oriented path handling

from pathlib import Path
p = Path('.') / 'src' / 'main.py'
print(p.suffix)            # .py
print(p.read_text())       # reads the whole file as text (the file must exist)
# Iterate over matching files
for py in p.parent.glob('*.py'):
    print(py)
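
The snippet above assumes `src/main.py` already exists. A self-contained variant, sketched here, builds a throwaway tree with `tempfile.TemporaryDirectory`; the file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a small tree inside a directory that is deleted on exit.
    src = Path(tmp) / 'src'
    src.mkdir()
    (src / 'main.py').write_text("print('hi')\n", encoding='utf-8')
    (src / 'notes.txt').write_text('todo\n', encoding='utf-8')

    main = src / 'main.py'
    print(main.name, main.stem, main.suffix)     # main.py main .py
    existed = main.exists()
    print(existed)                               # True
    py_files = sorted(p.name for p in src.glob('*.py'))
    print(py_files)                              # ['main.py']
```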

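6.3 `shutil`: high-level file operations

import shutil
shutil.copy('src.txt', 'dst.txt')        # copy a single file
shutil.copytree('src_dir', 'dst_dir')    # copy a directory tree recursively
shutil.move('old', 'new')                # move or rename a file or directory
shutil.rmtree('dir')                     # delete a directory tree (cannot be undone)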

6.4 `tempfile`: temporary files and directories

import tempfile
with tempfile.NamedTemporaryFile(mode='w+t', suffix='.txt', delete=True) as tmp:
    tmp.write('Hello')
    tmp.seek(0)
    print(tmp.read())

6.5 `glob`: filename pattern matching

import glob
for file in glob.glob('*.py'):
    print(file)

7. Data Serialization and Interchange

7.1 `json`: working with JSON

import json
data = {'name': 'Alice', 'age': 30}
# Serialize
json_str = json.dumps(data, indent=2, ensure_ascii=False)
# Deserialize
parsed = json.loads(json_str)
# Write directly to a file (json.load reads it back)
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f)
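
The file example above only writes. A round trip through a temporary file, plus the effect of `ensure_ascii`, might look like the following sketch; the `city` field is an addition made here for illustration.

```python
import json
import tempfile
from pathlib import Path

data = {'name': 'Alice', 'age': 30, 'city': '北京'}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'data.json'
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    with open(path, encoding='utf-8') as f:
        loaded = json.load(f)

print(loaded == data)                                    # True

# By default non-ASCII characters are escaped; ensure_ascii=False keeps them.
escaped = json.dumps({'city': '北京'})
print(escaped)                                           # {"city": "\u5317\u4eac"}
readable = json.dumps({'city': '北京'}, ensure_ascii=False)
print(readable)                                          # {"city": "北京"}
```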

7.2 `csv`: comma-separated values

import csv
# Assumes data.csv has a header row with a 'Name' column
with open('data.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['Name'])
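
The reader example depends on an existing file. This sketch writes and reads the same data through an in-memory buffer instead; the rows are made-up sample data.

```python
import csv
import io

rows = [{'Name': 'Alice', 'Age': 30}, {'Name': 'Bob', 'Age': 25}]

# An in-memory text buffer stands in for a file opened with newline=''.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['Name', 'Age'])
writer.writeheader()
writer.writerows(rows)

buffer.seek(0)
read_back = list(csv.DictReader(buffer))
print([r['Name'] for r in read_back])   # ['Alice', 'Bob']
print(repr(read_back[0]['Age']))        # '30'
```

Note that every value comes back as a string; the csv module does not convert types for you.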

7.3 `pickle`: native Python serialization

import pickle
obj = {'a': [1,2,3]}
with open('obj.pkl', 'wb') as f:
    pickle.dump(obj, f)
# Only unpickle data you trust: loading a pickle can execute arbitrary code
with open('obj.pkl', 'rb') as f:
    restored = pickle.load(f)

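8. Networking and Communication

8.1 A brief look at `urllib` and `requests`

The standard library's urllib.request can make HTTP requests, although the third-party requests library is generally recommended instead. Only the standard library usage is shown here: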

from urllib.request import urlopen, Request
req = Request('https://httpbin.org/get', headers={'User-Agent': 'Python'})
with urlopen(req) as resp:
    print(resp.read().decode())

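8.2 `http.server`: a simple HTTP server

You can start one straight from the command line: python -m http.server 8000.
It can also be customized in code:

from http.server import HTTPServer, BaseHTTPRequestHandler
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.end_headers()
        self.wfile.write(b'Hello')
# Uncomment to serve on port 8000 until interrupted:
# HTTPServer(('', 8000), Handler).serve_forever()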

8.3 `smtplib` and `email`: handling mail

import smtplib
from email.mime.text import MIMEText

msg = MIMEText('Message body')
msg['Subject'] = 'Subject'
msg['From'] = 'sender@example.com'
msg['To'] = 'receiver@example.com'

# Assumes an SMTP server is listening on localhost
with smtplib.SMTP('localhost') as server:
    server.send_message(msg)

9. Concurrency and Async

9.1 `threading`: threads

import threading, time

def worker(n):
    print(f"Thread {n} starting")
    time.sleep(0.5)
    print(f"Thread {n} done")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads: t.start()
for t in threads: t.join()

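9.2 `multiprocessing`: processes

Sidesteps the GIL, which makes it suitable for CPU-bound tasks.

from multiprocessing import Process
def heavy(n):
    return sum(i*i for i in range(n))

if __name__ == '__main__':
    # Process discards the target's return value; use a Queue or multiprocessing.Pool to collect results
    ps = [Process(target=heavy, args=(10**6,)) for _ in range(4)]
    for p in ps: p.start()
    for p in ps: p.join()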

9.3 `asyncio`: asynchronous I/O

import asyncio

async def fetch(url):
    print(f"Fetching {url}")
    await asyncio.sleep(1)   # simulate I/O
    return f"data from {url}"

async def main():
    results = await asyncio.gather(fetch('url1'), fetch('url2'))
    print(results)

asyncio.run(main())

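10. Development Aids

10.1 `logging`: the logging system

import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logging.info('Program started')
try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True appends the traceback; it is only meaningful while an exception is being handled
    logging.error('An error occurred', exc_info=True)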

10.2 `unittest`: unit testing framework

import unittest

def add(a, b): return a + b

class TestMath(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)

if __name__ == '__main__':
    unittest.main()

10.3 `pdb` and `breakpoint`: the debugger

def buggy(a, b):
    breakpoint()   # built in since Python 3.7; drops into pdb by default
    return a / b

10.4 `typing`: type hints

from typing import List, Dict, Optional
def process(items: List[str], config: Optional[Dict] = None) -> str:
    return ",".join(items)
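
On Python 3.9 and later, the built-in `list` and `dict` can be subscripted directly, so `typing.List` and `typing.Dict` are no longer required. The sketch below is a hypothetical variant of the function above; the name `join_items` and the `sep` configuration key are assumptions made for illustration.

```python
from typing import Optional

def join_items(items: list[str], config: Optional[dict[str, str]] = None) -> str:
    """Hypothetical variant of process() using built-in generics (Python 3.9+)."""
    sep = (config or {}).get('sep', ',')
    return sep.join(items)

print(join_items(['a', 'b']))                    # a,b
print(join_items(['a', 'b'], {'sep': ' | '}))    # a | b
```

Type hints are not enforced at runtime; a static checker such as mypy is what reports mismatches.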

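11. Summary and Learning Path

The standard library is one of the most worthwhile things for a Python developer to invest time in. From enhanced data structures and functional tools, through the file system, networking, and concurrency, to the debugging and testing modules, using them well lets you get more done with less effort. Some suggestions:

- Start with the modules you will use every day (collections, datetime, os, json, logging, unittest).
- When a need comes up, check the standard library first, then consider third-party packages.
- Read the "Library Reference" in the official documentation for the latest details.

The standard library gives your Python programming a solid foundation, and continuing to explore it will improve both your productivity and your code quality.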

