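[Python String Extraction]
Summary: for each string, split on ";", keep the part of every segment before the first ",", then remove duplicates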
Import the library needed for the analysis
import pandas as pd
Build the dataset
as1 = pd.DataFrame({'a':[1,2,3,4],
'b':['adwdea,asdw;swa,des','swa,dwad;asdw;swa','se;dw,asd;erf,de','de']})
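Each value in column b holds several segments separated by ";", and each segment holds comma-separated fields. As a minimal sketch of the extraction rule on a single value, in plain Python (the variable names raw, segments, and prefixes are illustrative and not part of the original post):

```python
# First sample value from column 'b'
raw = 'adwdea,asdw;swa,des'

# Step 1: split the string into segments on ";"
segments = raw.split(";")

# Step 2: keep only the text before the first "," in each segment
prefixes = [segment.split(",")[0] for segment in segments]

print(segments)   # ['adwdea,asdw', 'swa,des']
print(prefixes)   # ['adwdea', 'swa']
```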
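Write the analysis function
def trans(b):
    # Split each string on ";" into a list of segments
    parts = b.str.split(";")
    # Keep only the text before the first "," in each segment
    return parts.apply(lambda segs: [s.split(",")[0] for s in segs])

# Store the extracted prefixes as one list per row
as1['c'] = trans(as1['b'])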
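# Deduplicate each row's list with set() and join the result with commas
# (a set has no guaranteed order, so the item order in 'd' may vary)
as1['d'] = as1['c'].apply(lambda x: ",".join(set(x)))
as1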
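Because set() does not guarantee an order, the joined string in column d may list items in a different order than they appear in column b. If first-seen order matters, one possible variant is to deduplicate with dict.fromkeys instead. The sketch below is an assumption about what might be wanted, not the original author's method, and extract_unique is a hypothetical helper name:

```python
def extract_unique(text):
    """Hypothetical helper: take the prefix before the first ',' in each
    ';'-separated segment and drop duplicates, keeping first-seen order."""
    prefixes = [segment.split(",")[0] for segment in text.split(";")]
    # dict.fromkeys keeps the first occurrence of each key, and dicts
    # preserve insertion order (guaranteed since Python 3.7)
    return ",".join(dict.fromkeys(prefixes))

# Same sample strings as column 'b' in the post
samples = ['adwdea,asdw;swa,des', 'swa,dwad;asdw;swa', 'se;dw,asd;erf,de', 'de']
results = [extract_unique(s) for s in samples]
print(results)   # ['adwdea,swa', 'swa,asdw', 'se,dw,erf', 'de']
```

With the DataFrame from the post, the same helper could be applied row by row as as1['b'].apply(extract_unique).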
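To repost this article, please contact the original author for permission, and state that it comes from Li Li's (李立) ScienceNet blog.
Link: http://blog.sciencenet.cn/blog-3262505-1137397.html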