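I have read and searched through many articles on how to use pandas. Most are incomplete: long-winded but not very practical. So here are my own practice notes, tidied up for reference.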
1. Imports
import pandas as pd  # import pandas under the alias pd
import os  # used to check whether a file exists, among other things
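2. Write a read function
# Read an index CSV file so that its head, tail, and summary info can be inspected
def ReadIndexCsv(code='999999.SH'):
    # print("\n Get start end date csv def()")
    # A raw string keeps the backslashes in the Windows path from being read as escapes
    codefile_path = os.path.join(r'E:\PythonDataOut\index', code + '.csv')  # D_000001.SZ
    flag1 = os.path.isfile(codefile_path)  # check whether the file exists
    if flag1:
        # print(filename1,'file exists')
        # read the file into a DataFrame
        p1 = pd.read_csv(codefile_path)
        return p1
    else:
        return -1  # -1 means the file was not found; the caller checks for it later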
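The function above returns -1 when the file is missing, which the caller has to remember to check. As a minimal sketch of an alternative (not the version used in the rest of this post), the variant below takes the directory as a parameter and returns None instead. The names `read_index_csv` and `data_dir` are hypothetical, introduced only for illustration.

```python
from pathlib import Path
from typing import Optional

import pandas as pd


def read_index_csv(code: str = "999999.SH",
                   data_dir: str = r"E:\PythonDataOut\index") -> Optional[pd.DataFrame]:
    """Return the CSV for `code` as a DataFrame, or None if the file is missing."""
    # data_dir is an assumed parameter; the default mirrors the path used in this post.
    csv_path = Path(data_dir) / f"{code}.csv"
    if not csv_path.is_file():
        return None
    return pd.read_csv(csv_path)
```

A caller can then write `df = read_index_csv()` followed by `if df is None: ...`, which is harder to misuse than comparing a DataFrame with -1.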
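3. Put a file at the path below containing the daily data of the Shanghai Composite Index. If you do not have one, paste the data below into WordPad, save it as plain text, and rename the file to match: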
E:\PythonDataOut\index\999999.SH.csv
trade_date,open,high,low,close,vol,amount
2021/1/4,3474.68,3511.66,3457.21,3502.96,380790800,5.23E+11
2021/1/5,3492.19,3528.68,3484.72,3528.68,407995934,5.68E+11
2021/1/6,3530.91,3556.8,3513.13,3550.88,370230926,5.22E+11
2021/1/7,3552.91,3576.2,3526.62,3576.2,405348226,5.46E+11
2021/1/8,3577.69,3588.06,3544.89,3570.11,345557896,5.02E+11
2021/1/11,3571.32,3597.7,3516.99,3531.5,362479154,5.27E+11
2021/1/12,3518.01,3608.34,3517.47,3608.34,323406153,4.86E+11
2021/1/13,3613.28,3622.35,3575.59,3598.65,388127791,5.55E+11
2021/1/14,3584.93,3599.06,3559.6,3565.9,347666819,5.11E+11
2021/1/15,3566.28,3589.27,3533.79,3566.38,324612744,4.80E+11
2021/1/18,3554.8,3608.77,3544.26,3596.22,301652565,4.52E+11
2021/1/19,3596.36,3603.15,3553.02,3566.38,323439541,4.67E+11
2021/1/20,3564.12,3589.96,3556.44,3583.09,272271661,3.97E+11
2021/1/21,3590.92,3636.24,3585.8,3621.26,327466995,4.70E+11
2021/1/22,3616.54,3616.54,3585.03,3606.75,324463084,4.85E+11
2021/1/25,3605.36,3637.1,3591.02,3624.24,327341070,5.27E+11
2021/1/26,3610.97,3610.97,3564.74,3569.43,278139884,4.36E+11
2021/1/27,3567.55,3578.8,3546.49,3573.34,264107161,3.96E+11
2021/1/28,3534.67,3549.54,3496.88,3505.18,270862461,3.92E+11
2021/1/29,3521.72,3531.6,3446.55,3483.07,293662664,4.17E+11
2021/2/1,3477.17,3506.39,3469.88,3505.28,277567901,3.71E+11
2021/2/2,3510.81,3535.5,3495.57,3533.68,264540630,3.84E+11
2021/2/3,3531.15,3544.01,3508.51,3517.31,297351716,4.14E+11
2021/2/4,3503.78,3524.72,3465.77,3501.86,298834854,4.19E+11
2021/2/5,3509.49,3536.54,3492.96,3496.33,290146174,3.97E+11
2021/2/8,3504.56,3542.21,3492.13,3532.45,249799619,3.59E+11
2021/2/9,3539.77,3604.01,3528.68,3603.49,253821760,3.72E+11
2021/2/10,3612.61,3662.77,3612.51,3655.09,257940824,3.93E+11
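If creating the file by hand is inconvenient, the same rows can also be loaded from a string. This is a minimal sketch using only the first three rows of the data above. The `parse_dates` argument is an addition made here, on the assumption that a datetime `trade_date` column is more useful than plain text; the read function in this post does not use it.

```python
import io

import pandas as pd

# First three rows of the sample data above, kept in a string instead of a file.
csv_text = """trade_date,open,high,low,close,vol,amount
2021/1/4,3474.68,3511.66,3457.21,3502.96,380790800,5.23E+11
2021/1/5,3492.19,3528.68,3484.72,3528.68,407995934,5.68E+11
2021/1/6,3530.91,3556.8,3513.13,3550.88,370230926,5.22E+11
"""

# read_csv accepts any file-like object; parse_dates converts trade_date to datetime.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["trade_date"])
print(df.dtypes)
print(df.shape)
```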
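4. The main block is below, with comments
if __name__ == '__main__':
    c1 = ReadIndexCsv()
    if not isinstance(c1, pd.DataFrame):  # ReadIndexCsv returns -1 when the file is missing
        raise SystemExit('CSV file not found')
    print(c1.head(5))  # print the first rows; the default is 5
    print(c1.tail(5))  # print the last rows; the default is 5
    print(pd.__version__)  # print the pandas version
    print(c1.dtypes)  # print the data type of each column
    print(c1.shape)  # (number of rows, number of columns)
    print(c1.shape[0])  # number of rows
    print(c1.shape[1])  # number of columns
    print("---------------------------------")
    print(c1.index)  # print the row labels (the index)
    print(c1.columns)  # print the column names
    for i in c1.index:
        print("Current row index:", i)
    print(c1.sample(n=4))  # draw 4 rows at random
    print(c1.sample(frac=0.5))  # draw a random fraction (frac) of the rows
    print("---------------------------------")
    print(c1.describe())  # summary statistics for the numeric columns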
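Because `sample` draws rows at random, each run prints different rows. Here is a short sketch of making the draw repeatable with the `random_state` parameter. It uses a small hand-made frame (a hypothetical `df` built from four closing prices in the data above) so that it runs without the CSV file.

```python
import pandas as pd

# Four rows taken from the sample data above; df is a stand-in for c1.
df = pd.DataFrame({
    "trade_date": ["2021/1/4", "2021/1/5", "2021/1/6", "2021/1/7"],
    "close": [3502.96, 3528.68, 3550.88, 3576.2],
})

# The same random_state selects the same rows on every call.
first = df.sample(n=2, random_state=0)
second = df.sample(n=2, random_state=0)
print(first.equals(second))

# frac=0.5 of 4 rows returns 2 rows.
half = df.sample(frac=0.5, random_state=0)
print(half.shape)

# describe() on one numeric column: count, mean, std, min, quartiles, max.
print(df["close"].describe())
```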
5. Run it and see.
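This post showed how to use Python's pandas library to work with daily data for the Shanghai Composite Index: importing the library, writing a custom read function, and a sample of real data to try it on. The example code covers basic pandas operations.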