1. Fix: install the pandas module (for example with pip install pandas)
ImportError Traceback (most recent call last)
<ipython-input-1-7dd3504c366f> in <module>()
----> 1 import pandas as pd
ImportError: No module named pandas
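Before reinstalling, it can help to confirm which modules the current interpreter can actually see. The following is a minimal sketch for Python 3, not part of the original notes; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the packages these notes rely on.
for name in ("pandas", "numpy"):
    if is_installed(name):
        print("%s -> installed" % name)
    else:
        print("%s -> missing, try: pip install %s" % (name, name))
```

If a package shows as missing even though pip reports it as installed, pip and the notebook kernel are probably using different Python installations.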
2. Downgrade the numpy package (it was updated recently)
c:\python27\lib\site-packages\pandas\_libs\__init__.py:4: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from .tslib import iNaT, NaT, Timestamp, Timedelta, OutOfBoundsDatetime
c:\python27\lib\site-packages\pandas\__init__.py:26: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import (hashtable as _hashtable,
c:\python27\lib\site-packages\pandas\core\dtypes\common.py:6: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import algos, lib
c:\python27\lib\site-packages\pandas\core\util\hashing.py:7: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import hashing, tslib
c:\python27\lib\site-packages\pandas\core\indexes\base.py:7: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import (lib, index as libindex, tslib as libts,
c:\python27\lib\site-packages\pandas\tseries\offsets.py:21: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
import pandas._libs.tslibs.offsets as liboffsets
c:\python27\lib\site-packages\pandas\core\ops.py:16: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import algos as libalgos, ops as libops
c:\python27\lib\site-packages\pandas\core\indexes\interval.py:32: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs.interval import (
c:\python27\lib\site-packages\pandas\core\internals.py:14: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import internals as libinternals
c:\python27\lib\site-packages\pandas\core\sparse\array.py:33: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
import pandas._libs.sparse as splib
c:\python27\lib\site-packages\pandas\core\window.py:36: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
import pandas._libs.window as _window
c:\python27\lib\site-packages\pandas\core\groupby\groupby.py:68: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import (lib, reduction,
c:\python27\lib\site-packages\pandas\core\reshape\reshape.py:30: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import algos as _algos, reshape as _reshape
c:\python27\lib\site-packages\pandas\io\parsers.py:45: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
import pandas._libs.parsers as parsers
c:\python27\lib\site-packages\pandas\io\pytables.py:50: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected zd, got zd
from pandas._libs import algos, lib, writers as libwriters
Run pip show numpy to check the installed numpy version.

Run pip install -U numpy==1.12.0 to downgrade numpy.
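This warning usually suggests that pandas was compiled against a different numpy than the one currently installed, so it is worth checking both versions before and after the downgrade. Here is a small sketch that assumes Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the two packages involved in the warning.
for package in ("numpy", "pandas"):
    print(package, installed_version(package))
```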
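3. On Windows, write file paths as absolute paths with unambiguous escaping: separate the components with / (the system default is \) or with \\. In the call below, the \r in \ratings.csv is read as a carriage return, which is why the error message at the end of the traceback looks scrambled.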
IOError Traceback (most recent call last)
<ipython-input-2-d37f255501ae> in <module>()
----> 1 ratings=pd.read_csv("C:\data\datasets\movielens\ml-20m\ratings.csv",header=0)
c:\python27\lib\site-packages\pandas\io\parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
676 skip_blank_lines=skip_blank_lines)
677
--> 678 return _read(filepath_or_buffer, kwds)
679
680 parser_f.__name__ = name
c:\python27\lib\site-packages\pandas\io\parsers.pyc in _read(filepath_or_buffer, kwds)
438
439 # Create the parser.
--> 440 parser = TextFileReader(filepath_or_buffer, **kwds)
441
442 if chunksize or iterator:
c:\python27\lib\site-packages\pandas\io\parsers.pyc in __init__(self, f, engine, **kwds)
785 self.options['has_index_names'] = kwds['has_index_names']
786
--> 787 self._make_engine(self.engine)
788
789 def close(self):
c:\python27\lib\site-packages\pandas\io\parsers.pyc in _make_engine(self, engine)
1012 def _make_engine(self, engine='c'):
1013 if engine == 'c':
-> 1014 self._engine = CParserWrapper(self.f, **self.options)
1015 else:
1016 if engine == 'python':
c:\python27\lib\site-packages\pandas\io\parsers.pyc in __init__(self, src, **kwds)
1706 kwds['usecols'] = self.usecols
1707
-> 1708 self._reader = parsers.TextReader(src, **kwds)
1709
1710 passed_names = self.names is None
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()
atings.csv does not existC:\data\datasets\movielens\ml-20m
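The escaping problem can be reproduced without touching the file system. The sketch below shortens the broken path so that it contains only the `\r` escape; the three alternative spellings use the path from the traceback above.

```python
# "\r" inside an ordinary string literal is one carriage-return character.
broken = "ml-20m\ratings.csv"

# Three spellings that avoid the problem.
raw = r"C:\data\datasets\movielens\ml-20m\ratings.csv"          # raw string
escaped = "C:\\data\\datasets\\movielens\\ml-20m\\ratings.csv"  # doubled backslashes
forward = "C:/data/datasets/movielens/ml-20m/ratings.csv"       # forward slashes

print("carriage return present:", "\r" in broken)
print("raw and escaped are equal:", raw == escaped)
print("forward-slash form names the same path:", forward.replace("/", "\\") == raw)
```

Any of the three working spellings can be passed to `pd.read_csv` in place of the original string.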
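4. After a library upgrade, some methods stop working and must be replaced with their newer equivalents.
e.g. Series no longer has an order attribute; use the sort_values method instead.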
AttributeError Traceback (most recent call last)
<ipython-input-12-dec372faf98a> in <module>()
1 #sort in descending order
----> 2 rating_by_title.order(ascending=False)[:10]
c:\python27\lib\site-packages\pandas\core\generic.pyc in __getattr__(self, name)
4374 if self._info_axis._can_hold_identifiers_and_holds_name(name):
4375 return self[name]
-> 4376 return object.__getattribute__(self, name)
4377
4378 def __setattr__(self, name, value):
AttributeError: 'Series' object has no attribute 'order'
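When the same notebook has to run against both old and new pandas installations, one option is to detect which method exists. This is only a compatibility sketch: `sort_descending` and `demo` are hypothetical names, and the sample ratings are invented for illustration.

```python
def sort_descending(series):
    """Sort a pandas Series from largest to smallest value.

    Prefers `sort_values`; falls back to the legacy `order` method only
    when `sort_values` is missing (very old pandas releases).
    """
    if hasattr(series, "sort_values"):
        return series.sort_values(ascending=False)
    return series.order(ascending=False)


def demo():
    """Hypothetical usage; requires pandas to be installed."""
    import pandas as pd

    rating_by_title = pd.Series({"A": 3.5, "B": 4.8, "C": 4.1})
    return sort_descending(rating_by_title)[:10]
```

On a current pandas, the failing line from the traceback can simply be written as `rating_by_title.sort_values(ascending=False)[:10]`.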
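5. Install both a python2 and a python3 kernel in Jupyter
Adding multiple Python kernels to Jupyter Notebook
- List the kernels known to Jupyter Notebook
jupyter kernelspec list
- Install or remove additional kernels
ipython kernel install --name python2 #install the python2 kernel
jupyter kernelspec uninstall python2 #remove the python2 kernel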
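The kernel list can also be read from a script, for example to check that both kernels are registered. The sketch below assumes Python 3.7 or later and that the `jupyter` command is on PATH; it returns None otherwise. The function name `list_kernels` is made up for illustration.

```python
import json
import shutil
import subprocess


def list_kernels():
    """Return {kernel_name: resource_dir}, or None if jupyter is unavailable."""
    if shutil.which("jupyter") is None:
        return None
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    try:
        specs = json.loads(result.stdout).get("kernelspecs", {})
    except ValueError:
        # The output was not valid JSON; treat it as "unknown".
        return None
    return {name: info.get("resource_dir") for name, info in specs.items()}


print(list_kernels())
```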
6.NameError: name 'raw_input' is not defined
raw_input() was renamed to input(): the raw_input function from python2 is called input() in python3.
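Code that must run under both interpreters can pick the right function once at import time. This is a minimal sketch; `read_line` and `ask` are hypothetical names introduced here.

```python
try:
    read_line = raw_input  # Python 2: raw_input returns the typed text
except NameError:
    read_line = input  # Python 3: input() has that behaviour


def ask(prompt, reader=read_line):
    """Prompt for one line of text and strip surrounding whitespace."""
    return reader(prompt).strip()
```

Calling `ask("Your name: ")` then behaves the same on both versions, and the `reader` argument makes the function easy to exercise without a keyboard.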
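7. Installing scrapy is difficult; install it through anaconda instead.
8. Downloads in chrome are slow (usually for files hosted overseas); look for a domestic mirror or a Baidu Cloud copy.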
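9. pandas raises an error while reading a csv: ParserError: Error tokenizing data. C error: Expected 1 fields in line 29, saw 2
By default the file uses a comma as the separator, but commas are very common in Chinese text, so scraped Chinese data is easily split in the wrong places. When writing the csv with pandas you can therefore set sep='\t' so that fields are separated by tabs, which rarely appear in Chinese text.
When you later read that csv for processing, remember to pass the delimiter parameter:
delimiter="\t"
#read it like this:
df=pd.read_csv('path',delimiter="\t")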
Otherwise the printed dataframe is squeezed into a single column with nothing split, and later processing of the csv can raise the error from the heading:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 29, saw 2
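The mismatch behind this error can be seen without pandas. The sketch below swaps in the standard-library `csv` module purely to count fields per line; the sample rows are invented for illustration.

```python
import csv
import io

rows = [
    ["title", "comment"],
    ["Movie A", "good, really good"],  # the text itself contains a comma
    ["Movie B", "fine"],
]

# Write tab-separated text into an in-memory buffer.
buffer = io.StringIO(newline="")
csv.writer(buffer, delimiter="\t").writerows(rows)
text = buffer.getvalue()


def field_counts(data, delimiter):
    """Count the fields seen on each line when parsing with `delimiter`."""
    reader = csv.reader(io.StringIO(data, newline=""), delimiter=delimiter)
    return [len(row) for row in reader]


print(field_counts(text, "\t"))  # same count on every line
print(field_counts(text, ","))   # counts differ from line to line
```

Parsing with the wrong delimiter gives one field on most lines and two on the line whose text contains a comma, which is the same shape of mismatch that the ParserError reports.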
If this still does not split the columns correctly, the following parameters tend to work better:
df_status0_invertory = pd.read_csv(inventory_dir + inventory_status0_file_name, delimiter=',', header=None,
error_bad_lines=False)
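Fix:
Add the parameter error_bad_lines=False, which skips malformed lines instead of raising an error (pandas 1.3 and later spell this on_bad_lines='skip').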
10. Quick Anaconda installation
https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/
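Summary: these notes cover common problems met while doing data analysis in Python: installing the pandas module, downgrading numpy, handling Windows paths, adapting to methods that changed after a library upgrade, and configuring multiple Python kernels in Jupyter. They also cover the raw_input NameError, the csv ParserError, and separator issues when reading and writing csv files, which were resolved by changing the separator and setting the bad-line handling parameter.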