
python - Pandas merge error: TypeError: '>' not supported between instances of 'int' and 'str'


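I have a dataset made up of several tables, each with a country, a year, and some indicators. I converted all the Excel sheets to CSV files and then merged them into a single table.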

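The problem is that some tables refuse to merge and raise the following message: TypeError: '>' not supported between instances of 'int' and 'str'
I have tried everything I could think of, with no luck; the same error keeps appearing.
I have tried hundreds of different files, and dozens of them still hit this problem.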

The sample files are file17.csv and file35.csv (in case anyone wants to reproduce it). This is the code I used:

# To load the first file
import pandas as pd
filename1 = 'file17.csv'
df1 = pd.read_csv(filename1, encoding='cp1252', low_memory=False)
df1.set_index(['Country', 'Year'], inplace=True)
df1.dropna(axis=0, how='all', inplace=True)
df1.head()

Out >>>

+-------------+------+--------+--------+
|             |      | ind500 | ind356 |
| Country     | Year |        |        |
| Afghanistan | 1800 | 603.0  | NaN    |
|             | 1801 | 603.0  | NaN    |
|             | 1802 | 603.0  | NaN    |
|             | 1803 | 603.0  | NaN    |
|             | 1804 | 603.0  | NaN    |
+-------------+------+--------+--------+

In >>>

# To load the second file
filename2 = 'file35.csv'
df2 = pd.read_csv(filename2, encoding='cp1252', low_memory=False)
df2.set_index(['Country', 'Year'], inplace=True)
df2.dropna(axis=0, how='all', inplace=True)
df2.head()

Out >>>

# To merge the two dataframes
gross_df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer')
gross_df.dropna(axis=0, how='all', inplace=True)
print(gross_df.shape)
gross_df.to_csv('merged.csv')

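Important note:
I noticed that in all the files that merge successfully, the column names appear in ascending order, e.g. ind001, ind009, ind012, because they are sorted automatically. The files that fail have one or more columns out of order, such as ind500 followed by ind356 in the first table; the same holds for the second sample provided.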

Note that both dataframes are indexed on two levels (Country and Year).
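The traceback in the Error section fails while sorting the join keys (`_factorize_keys`, `safe_sort`), which suggests that column order is not the cause. A minimal sketch, using made-up values with the column names from the samples (`ind475`/`ind476` placement and all numbers are assumptions), in which a merge with out-of-order columns goes through:

```python
import pandas as pd

# Two tiny frames sharing a (Country, Year) index; values are illustrative only.
idx = pd.MultiIndex.from_tuples(
    [("Afghanistan", 1800), ("Afghanistan", 1801)], names=["Country", "Year"]
)
# Column names are deliberately NOT in ascending order.
left = pd.DataFrame({"ind500": [603.0, 603.0], "ind356": [float("nan")] * 2}, index=idx)
right = pd.DataFrame({"ind476": [1.0, 2.0], "ind475": [3.0, 4.0]}, index=idx)

merged = pd.merge(left, right, left_index=True, right_index=True, how="outer")
print(list(merged.columns))
```

If this runs without error while the real files fail, the difference is more likely in the index values than in the column names.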

Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in safe_sort(values, labels, na_sentinel, assume_unique)
    480         try:
--> 481             sorter = values.argsort()
    482             ordered = values.take(sorter)

TypeError: '>' not supported between instances of 'int' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-11-960b2698de60> in <module>()
----> 1 gross_df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer', sort=False)
      2 gross_df.dropna(axis=0, how='all', inplace=True)
      3 print (gross_df.shape)
      4 gross_df.to_csv('merged.csv')

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
     52                          right_index=right_index, sort=sort, suffixes=suffixes,
     53                          copy=copy, indicator=indicator)
---> 54     return op.get_result()
     55 
     56 

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in get_result(self)
    567                 self.left, self.right)
    568 
--> 569         join_index, left_indexer, right_indexer = self._get_join_info()
    570 
    571         ldata, rdata = self.left._data, self.right._data

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _get_join_info(self)
    720             join_index, left_indexer, right_indexer = \
    721                 left_ax.join(right_ax, how=self.how, return_indexers=True,
--> 722                              sort=self.sort)
    723         elif self.right_index and self.how == 'left':
    724             join_index, left_indexer, right_indexer = \

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\indexes\base.py in join(self, other, how, level, return_indexers, sort)
   2995             else:
   2996                 return self._join_non_unique(other, how=how,
-> 2997                                              return_indexers=return_indexers)
   2998         elif self.is_monotonic and other.is_monotonic:
   2999             try:

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\indexes\base.py in _join_non_unique(self, other, how, return_indexers)
   3076         left_idx, right_idx = _get_join_indexers([self.values],
   3077                                                  [other._values], how=how,
-> 3078                                                  sort=True)
   3079 
   3080         left_idx = _ensure_platform_int(left_idx)

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _get_join_indexers(left_keys, right_keys, sort, how, **kwargs)
    980 
    981     # get left & right join labels and num. of levels at each location
--> 982     llab, rlab, shape = map(list, zip(* map(fkeys, left_keys, right_keys)))
    983 
    984     # get flat i8 keys from label lists

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _factorize_keys(lk, rk, sort)
   1409     if sort:
   1410         uniques = rizer.uniques.to_array()
-> 1411         llab, rlab = _sort_labels(uniques, llab, rlab)
   1412 
   1413     # NA group

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\reshape\merge.py in _sort_labels(uniques, left, right)
   1435     labels = np.concatenate([left, right])
   1436 
-> 1437     _, new_labels = algos.safe_sort(uniques, labels, na_sentinel=-1)
   1438     new_labels = _ensure_int64(new_labels)
   1439     new_left, new_right = new_labels[:l], new_labels[l:]

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in safe_sort(values, labels, na_sentinel, assume_unique)
    483         except TypeError:
    484             # try this anyway
--> 485             ordered = sort_mixed(values)
    486 
    487     # labels:

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\pandas\core\algorithms.py in sort_mixed(values)
    469         str_pos = np.array([isinstance(x, string_types) for x in values],
    470                            dtype=bool)
--> 471         nums = np.sort(values[~str_pos])
    472         strs = np.sort(values[str_pos])
    473         return _ensure_object(np.concatenate([nums, strs]))

C:\ProgramData\Anaconda2\envs\conda_python3\lib\site-packages\numpy\core\fromnumeric.py in sort(a, axis, kind, order)
    820     else:
    821         a = asanyarray(a).copy(order="K")
--> 822     a.sort(axis=axis, kind=kind, order=order)
    823     return a
    824 

TypeError: '>' not supported between instances of 'int' and 'str'
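The last frames of the traceback show NumPy sorting an object array. As a minimal sketch (the two year values are invented for illustration), the same kind of failure can be triggered without pandas by sorting an array that mixes an integer with a string:

```python
import numpy as np

# An object array mixing an int year and a str year, as a merged index might.
mixed_years = np.array([1967, "1968"], dtype=object)

message = None
try:
    np.sort(mixed_years)
except TypeError as exc:
    # Python 3 refuses to order int against str; the exact operator shown
    # in the message may differ from the one in the traceback above.
    message = str(exc)
    print(message)
```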

Solution:

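This error indicates that the indexes of the DataFrames being merged have different dtypes: one index level holds integers while the matching level holds strings, so pandas cannot sort the combined join keys.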
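One way to check for this before merging is to list the Python types found in each index level. The helper name `level_types` and the two small frames below are hypothetical, a sketch rather than part of the original answer:

```python
import pandas as pd

def level_types(frame):
    """Map each index level name to the sorted type names of its values."""
    return {
        name: sorted({type(v).__name__ for v in frame.index.get_level_values(name)})
        for name in frame.index.names
    }

# Illustrative frames: Year is an int in the first and a str in the second.
a = pd.DataFrame(
    {"ind500": [603.0]},
    index=pd.MultiIndex.from_tuples([("Afghanistan", 1800)], names=["Country", "Year"]),
)
b = pd.DataFrame(
    {"ind456": [1.0]},
    index=pd.MultiIndex.from_tuples([("Bahamas, The", "1967")], names=["Country", "Year"]),
)

print(level_types(a))
print(level_types(b))
```

A level that reports `int` in one frame and `str` in the other (or both within a single frame) is the one to convert.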

Demo - how to convert a string index level to int:

In [183]: df
Out[183]:
              0         1         2         3
bar 1 -0.205037  0.762509  0.816608 -1.057907
    2  1.249104  0.338777 -0.982084  0.329330
baz 1  0.845695 -0.996365  0.548100 -0.113733
    2  1.247092 -2.674061 -0.071993 -0.734242
foo 1 -1.233825 -0.195377 -0.240303  1.168055
    2 -0.108942 -0.615612 -1.299512  0.908641
qux 1  0.844421  0.251425 -0.506877  1.307800
    2  0.038580  0.045072 -0.262974  0.629804

In [184]: df.index
Out[184]:
MultiIndex(levels=[['bar', 'baz', 'foo', 'qux'], ['1', '2']],
           labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]])

In [185]: df.index.get_level_values(1)
Out[185]: Index(['1', '2', '1', '2', '1', '2', '1', '2'], dtype='object')

In [187]: # convert the unique level values; set_levels expects one entry per distinct level value, not one per row
     ...: df.index = df.index.set_levels(pd.to_numeric(df.index.levels[1], errors='coerce'), level=1)

Result:

In [189]: df.index.get_level_values(1)
Out[189]: Int64Index([1, 2, 1, 2, 1, 2, 1, 2], dtype='int64')
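The `df` used in the demo is not constructed in the source. The following sketch builds a comparable frame (random values, so the numbers will differ) and converts level 1 through its unique level values, since recent pandas versions require the values passed to `set_levels` to be unique:

```python
import numpy as np
import pandas as pd

# Comparable to the demo frame: level 1 holds the strings '1' and '2'.
index = pd.MultiIndex.from_product([["bar", "baz", "foo", "qux"], ["1", "2"]])
df = pd.DataFrame(np.random.randn(8, 4), index=index)

# Convert the distinct level values once; the per-row codes stay untouched.
df.index = df.index.set_levels(pd.to_numeric(df.index.levels[1]), level=1)

print(df.index.get_level_values(1))
```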

Update: try this:

In [247]: d1 = pd.read_csv('https://docs.google.com/uc?id=1jUsbr5pw6sUMvewI4fmbpssroG4RZ7LE&export=download', index_col=[0,1])

In [248]: d2 = pd.read_csv('https://docs.google.com/uc?id=1Ufx6pvnSC6zQdTAj05ObmV027fA4-Mr3&export=download', index_col=[0,1])

In [249]: d2 = d2[pd.to_numeric(d2.index.get_level_values(1), errors='coerce').notna()]

In [250]: d2.index = pd.MultiIndex.from_arrays([d2.index.get_level_values(0), pd.to_numeric(d2.index.get_level_values(1))], names=d2.index.names)

In [251]: d1.reset_index().merge(d2.reset_index(), on=['Country','Year'], how='outer').set_index(['Country','Year'])
Out[251]:
                            ind500  ind356  ind475  ind476        ind456
Country               Year
Afghanistan           1800   603.0     NaN     NaN     NaN           NaN
                      1801   603.0     NaN     NaN     NaN           NaN
                      1802   603.0     NaN     NaN     NaN           NaN
                      1803   603.0     NaN     NaN     NaN           NaN
                      1804   603.0     NaN     NaN     NaN           NaN
                      1805   603.0     NaN     NaN     NaN           NaN
                      1806   603.0     NaN     NaN     NaN           NaN
                      1807   603.0     NaN     NaN     NaN           NaN
                      1808   603.0     NaN     NaN     NaN           NaN
                      1809   603.0     NaN     NaN     NaN           NaN
...                            ...     ...     ...     ...           ...
Bahamas, The          1967     NaN     NaN     NaN     NaN  18381.131314
Gambia, The           1967     NaN     NaN     NaN     NaN    937.355288
Korea, Dem. Rep.      1967     NaN     NaN     NaN     NaN   1428.689253
Lao PDR               1967     NaN     NaN     NaN     NaN   1412.359955
Netherlands Antilles  1967     NaN     NaN     NaN     NaN  14076.731352
Russian Federation    1967     NaN     NaN     NaN     NaN  11794.726437
Serbia and Montenegro 1967     NaN     NaN     NaN     NaN   2987.080489
Syrian Arab Republic  1967     NaN     NaN     NaN     NaN   2015.913906
Yemen, Rep.           1967     NaN     NaN     NaN     NaN   1075.693355
Bahamas, The          1968     NaN     NaN     NaN     NaN  18712.082830

[46607 rows x 5 columns]
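The same idea can be sketched end to end without touching the MultiIndex: clean `Year` while it is still a regular column, then merge on the columns. The CSV text, the `load` helper, and the stray `unknown` value below are assumptions standing in for the real files:

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two CSV files. The second one contains a
# non-numeric Year, which makes pandas read the whole column as strings.
csv1 = "Country,Year,ind500\nAfghanistan,1800,603.0\nAfghanistan,1801,603.0\n"
csv2 = "Country,Year,ind456\nAfghanistan,1800,1.5\nAlbania,1967,2015.9\nAlbania,unknown,3.0\n"

def load(text):
    """Read a CSV and force Year to int64, dropping rows where it is not numeric."""
    frame = pd.read_csv(io.StringIO(text))
    frame = frame.assign(Year=pd.to_numeric(frame["Year"], errors="coerce"))
    frame = frame.dropna(subset=["Year"])
    return frame.astype({"Year": "int64"})

d1, d2 = load(csv1), load(csv2)
merged = (
    d1.merge(d2, on=["Country", "Year"], how="outer")
      .set_index(["Country", "Year"])
)
print(merged)
```

Coercing before setting the index keeps both key columns at the same dtype, so the merge never has to compare an int with a str. Rows dropped by `dropna` are worth inspecting separately, since they may point at stray header or footer lines in the exported files.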
