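1. Two JSON formats
(1) Object format
{"name":"JSON","address":"北京市西城区","age":25}  # a JSON string in object format
(2) Array-of-objects format
[{"name":"JSON","address":"北京市西城区","age":25},{"name":"ROSE","address":"北京市东城区","age":23}]  # array-of-objects format
This article explains how to convert both object-format JSON and array-of-objects JSON to and from a pandas DataFrame.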
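One way to tell the two formats apart in Python is to parse each with the standard-library json.loads: an object-format string becomes a dict, and an array-of-objects string becomes a list of dicts. The sketch below reuses the two sample strings above; the variable names are illustrative only.

```python
import json

# Object format: a single JSON object.
object_text = '{"name":"JSON","address":"北京市西城区","age":25}'
# Array-of-objects format: a JSON array whose elements are objects.
array_text = (
    '[{"name":"JSON","address":"北京市西城区","age":25},'
    '{"name":"ROSE","address":"北京市东城区","age":23}]'
)

parsed_object = json.loads(object_text)
parsed_array = json.loads(array_text)

print(type(parsed_object).__name__)  # dict
print(type(parsed_array).__name__)   # list
print(parsed_array[1]["name"])       # ROSE
```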
2. Converting JSON to a DataFrame
2.1 Object-format JSON
- JSON file to DataFrame
import json
import pandas as pd

obj = {"col_1": [1, 2, 3, 4],
       "col_2": [4, 3, 2, 1]}
# Write the dict to disk as JSON text
with open("test.json", "w", encoding="utf-8") as f:
    json.dump(obj, f)
# The keys are column names, which matches the default orient="columns"
df = pd.read_json("test.json", encoding="utf-8")
print(df)
Output:
   col_1  col_2
0      1      4
1      2      3
2      3      2
3      4      1

- JSON object (variable) to DataFrame
import pandas as pd

obj = {"col_1": [1, 2, 3, 4],
       "col_2": [4, 3, 2, 1]}
# A dict of equal-length lists maps directly to columns
df = pd.DataFrame.from_dict(obj)
print(df)
Output:
   col_1  col_2
0      1      4
1      2      3
2      3      2
3      4      1

Object-format JSON has the same structure as a Python dict, so pd.DataFrame.from_dict() can convert it to a DataFrame directly.
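pd.DataFrame.from_dict() treats dict keys as column names by default. When the keys are row labels instead, the orient="index" argument handles that case. The following is a minimal sketch; the by_row data and its labels are made up for illustration and do not come from the article.

```python
import pandas as pd

# Column-oriented dict: each key is a column name (default orient="columns").
by_column = {"col_1": [1, 2, 3, 4], "col_2": [4, 3, 2, 1]}
df_columns = pd.DataFrame.from_dict(by_column)

# Row-oriented dict (hypothetical data): each key is a row label.
by_row = {"row_a": [1, 4], "row_b": [2, 3]}
df_rows = pd.DataFrame.from_dict(by_row, orient="index", columns=["col_1", "col_2"])

print(df_columns.shape)  # (4, 2)
print(df_rows.shape)     # (2, 2)
```

The columns argument is only accepted together with orient="index"; with the default orientation the column names come from the dict keys.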
2.2 Array-of-objects JSON
- JSON file to DataFrame
import json
import pandas as pd

obj = [{"姓名": "小明",
        "数学": "100",
        "英语": "99",
        "语文": "80"},
       {"姓名": "小红",
        "数学": "98",
        "英语": "95",
        "语文": "98"}]
# Write the list of records to disk as JSON text
with open("test.json", "w", encoding="utf-8") as file_obj:
    json.dump(obj, file_obj)
# Each array element is one row, so use orient="records"
df = pd.read_json("test.json", encoding="utf-8", orient="records")
print(df)
Output: a DataFrame with two rows (one per record) and the columns 姓名, 数学, 英语, 语文.

- JSON object (variable) to DataFrame
import pandas as pd
from pandas import json_normalize

obj = [{"姓名": "小明",
        "数学": "100",
        "英语": "99",
        "语文": "80"},
       {"姓名": "小红",
        "数学": "98",
        "英语": "95",
        "语文": "98"}]
# Each dict in the list becomes one row; the values stay strings
df = json_normalize(obj)
print(df)
Output: a DataFrame with two rows (one per record) and the columns 姓名, 数学, 英语, 语文.

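json_normalize is most useful when the records contain nested objects, because it flattens nested keys into separate columns. The sketch below assumes a hypothetical nested layout (a "成绩" object per student) that does not appear in the article; it is only meant to show the flattening behavior.

```python
import pandas as pd

# Hypothetical nested records: each student carries a nested "成绩" (scores) object.
records = [
    {"姓名": "小明", "成绩": {"数学": 100, "英语": 99}},
    {"姓名": "小红", "成绩": {"数学": 98, "英语": 95}},
]

# Nested keys are flattened into column names joined by `sep` (the default is ".").
df_nested = pd.json_normalize(records, sep="_")

print(df_nested.columns.tolist())
print(df_nested["成绩_数学"].tolist())  # [100, 98]
```

For flat records such as the ones above, json_normalize(obj) and pd.DataFrame(obj) give the same table, so the flattening is the main reason to prefer it.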
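3. Converting a DataFrame to JSON
3.1 The df.to_json() method
The examples below use the df built in section 2.2 with json_normalize, so the scores are strings.
For a DataFrame, the orient parameter of to_json() accepts the following values:
"columns" (default), "split", "records", "index", "values" (pandas also provides "table", which is not covered here)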
- columns
data = df.to_json(orient="columns", force_ascii=False)
data
Output:
'{"姓名":{"0":"小明","1":"小红"},"数学":{"0":"100","1":"98"},"英语":{"0":"99","1":"95"},"语文":{"0":"80","1":"98"}}'
As shown, each column name maps to an object whose keys are the index labels and whose values are the elements of that column at those positions.
- split
df.to_json(orient="split", force_ascii=False)
Output:
'{"columns":["姓名","数学","英语","语文"],"index":[0,1],"data":[["小明","100","99","80"],["小红","98","95","98"]]}'
- records
df.to_json(orient="records", force_ascii=False)
Output:
'[{"姓名":"小明","数学":"100","英语":"99","语文":"80"},{"姓名":"小红","数学":"98","英语":"95","语文":"98"}]'
As shown, orient="records" produces exactly the array-of-objects format.
- index
df.to_json(orient="index", force_ascii=False)
Output:
'{"0":{"姓名":"小明","数学":"100","英语":"99","语文":"80"},"1":{"姓名":"小红","数学":"98","英语":"95","语文":"98"}}'
This is the transpose of orient="columns": the outer keys are the index labels and the inner keys are the column names.
- values
df.to_json(orient="values", force_ascii=False)
Output:
'[["小明","100","99","80"],["小红","98","95","98"]]'
This form carries no column names and no index information.
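A practical consequence of the orient values above is that reading the JSON back needs the same orient that was used for writing. The sketch below round-trips the sample data through each labeled orient. It rebuilds df inline so that it runs on its own, wraps the text in StringIO because recent pandas versions deprecate passing raw JSON strings to read_json, and passes dtype=False on the assumption that the scores should stay strings rather than being inferred as numbers.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame(
    {"姓名": ["小明", "小红"], "数学": ["100", "98"], "英语": ["99", "95"], "语文": ["80", "98"]}
)

restored = {}
for orient in ("columns", "split", "records", "index"):
    text = df.to_json(orient=orient, force_ascii=False)
    # Read back with the same orient; dtype=False disables dtype inference on the data.
    restored[orient] = pd.read_json(StringIO(text), orient=orient, dtype=False)

# orient="values" drops the labels, so only the raw cell values come back.
values_text = df.to_json(orient="values", force_ascii=False)
values_only = pd.read_json(StringIO(values_text), orient="values", dtype=False)

for orient, frame in restored.items():
    # Compare column contents with the original frame.
    print(orient, frame.to_dict(orient="list") == df.to_dict(orient="list"))
print(values_only.shape)
```

Each labeled orient should reproduce the original column contents, while the "values" round trip keeps the shape but replaces the column names with default integer labels.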
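3.2 The to_dict() method
This method returns Python objects (a dict, or a list of dicts for 'records'), not a JSON string.
orient can be 'dict' (default), 'list', 'series', 'split', 'records', or 'index'.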
- 'dict'
df.to_dict(orient="dict")
Output:
{'姓名': {0: '小明', 1: '小红'},
 '数学': {0: '100', 1: '98'},
 '英语': {0: '99', 1: '95'},
 '语文': {0: '80', 1: '98'}}
- 'list'
df.to_dict(orient='list')
Output:
{'姓名': ['小明', '小红'],
 '数学': ['100', '98'],
 '英语': ['99', '95'],
 '语文': ['80', '98']}
- 'series'
df.to_dict(orient='series')
Output:
{'姓名': 0    小明
 1    小红
 Name: 姓名, dtype: object,
 '数学': 0    100
 1     98
 Name: 数学, dtype: object,
 '英语': 0    99
 1    95
 Name: 英语, dtype: object,
 '语文': 0    80
 1    98
 Name: 语文, dtype: object}
- 'records'
df.to_dict(orient='records')
Output:
[{'姓名': '小明', '数学': '100', '英语': '99', '语文': '80'},
 {'姓名': '小红', '数学': '98', '英语': '95', '语文': '98'}]
- 'index'
df.to_dict(orient='index')
Output:
{0: {'姓名': '小明', '数学': '100', '英语': '99', '语文': '80'},
 1: {'姓名': '小红', '数学': '98', '英语': '95', '语文': '98'}}
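Because to_dict() returns Python objects rather than text, a JSON string still requires a serialization step. The sketch below shows one way to do that with the standard-library json.dumps, and also shows the 'split' orient, which is listed above but has no example. It rebuilds df inline so that it runs on its own; the variable names are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame(
    {"姓名": ["小明", "小红"], "数学": ["100", "98"], "英语": ["99", "95"], "语文": ["80", "98"]}
)

# to_dict returns Python objects; with orient="records" it is a list of dicts, one per row.
records = df.to_dict(orient="records")

# json.dumps produces JSON text; ensure_ascii=False keeps the Chinese characters readable.
records_text = json.dumps(records, ensure_ascii=False)

# orient="split" separates the labels from the data.
split_dict = df.to_dict(orient="split")

print(records_text)
print(sorted(split_dict))  # ['columns', 'data', 'index']
```

For this data the parsed result matches df.to_json(orient="records"), so the two routes are interchangeable when the values are plain strings; the 'series' orient, by contrast, holds pandas Series objects and is not directly serializable this way.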

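This article described how to convert the two JSON formats (object format and array-of-objects format) to a pandas DataFrame and how to convert a DataFrame back to JSON. It focused on the effect of the orient parameter of df.to_json() ('columns', 'split', 'records', 'index', 'values') and on the orient options of to_dict().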