- The code in this article was run in Jupyter Notebook.
First, import the required libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
Read the CSV dataset and take a quick look at it:
df = pd.read_csv('d:/boston_house_prices.csv')
df
df.describe()
             CRIM          ZN       INDUS        CHAS         NOX          RM         AGE         DIS         RAD         TAX     PTRATIO           B       LSTAT        MEDV
count  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000
mean     3.613524   11.363636   11.136779    0.069170    0.554695    6.284634   68.574901    3.795043    9.549407  408.237154   18.455534  356.674032   12.653063   22.532806
std      8.601545   23.322453    6.860353    0.253994    0.115878    0.702617   28.148861    2.105710    8.707259  168.537116    2.164946   91.294864    7.141062    9.197104
min      0.006320    0.000000    0.460000    0.000000    0.385000    3.561000    2.900000    1.129600    1.000000  187.000000   12.600000    0.320000    1.730000    5.000000
25%      0.082045    0.000000    5.190000    0.000000    0.449000    5.885500   45.025000    2.100175    4.000000  279.000000   17.400000  375.377500    6.950000   17.025000
50%      0.256510    0.000000    9.690000    0.000000    0.538000

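This article demonstrates, in a Jupyter Notebook environment, how to build a simple (one-variable) linear regression model with the sklearn library. It first imports the required libraries, then loads and previews the CSV dataset and uses the correlation coefficients to choose the variables to study. It then extracts the CRIM and NOX columns, computes the regression coefficient and intercept, evaluates goodness of fit with R², and presents the complete model. Finally, it uses the model to make predictions.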
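The steps summarized above can be sketched roughly as follows. This is a minimal illustration, not the article's original code: the helper `fit_simple_regression` is a hypothetical name, the choice of NOX as the predictor and CRIM as the response is an assumption (the text does not say which column plays which role), and the data is synthetic stand-in data so that the block runs without the CSV file.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model


def fit_simple_regression(df, x_col, y_col):
    """Fit y = coef * x + intercept; return (model, coef, intercept, r2)."""
    X = df[[x_col]].to_numpy()  # sklearn expects a 2-D array, shape (n, 1)
    y = df[y_col].to_numpy()
    model = linear_model.LinearRegression()
    model.fit(X, y)
    r2 = model.score(X, y)      # R-squared on the data used for fitting
    return model, float(model.coef_[0]), float(model.intercept_), float(r2)


# Synthetic stand-in data (NOT the real dataset): the column names mirror
# the article, the values are made up so the example is self-contained.
rng = np.random.default_rng(0)
nox = rng.uniform(0.4, 0.9, size=200)
crim = 30.0 * nox - 12.0 + rng.normal(0.0, 1.0, size=200)
demo = pd.DataFrame({"NOX": nox, "CRIM": crim})

# Step 1: inspect pairwise correlation coefficients to pick the variables.
corr = demo.corr()

# Step 2: fit the one-variable model, read off slope, intercept and R-squared.
# Assumption: NOX is the predictor and CRIM the response.
model, coef, intercept, r2 = fit_simple_regression(demo, "NOX", "CRIM")
print(f"CRIM = {coef:.3f} * NOX + {intercept:.3f}  (R^2 = {r2:.3f})")

# Step 3: predict for a new NOX value (input must also be 2-D).
pred = model.predict(np.array([[0.5]]))
print(pred)
```

With the real data, `demo` would be replaced by the `df` loaded earlier, for example `fit_simple_regression(df, "NOX", "CRIM")`. For a one-variable model, R² equals the square of the Pearson correlation between the two columns, which is why the correlation table is a reasonable guide for choosing the variables.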



