DataFrame merging: the merge operation in pandas


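The difference between merge and concat is that merge combines two DataFrames on a shared key (a common column or the row index), whereas concat simply stacks them along an axis.

When you call pd.merge() without specifying a key, it automatically uses the column whose name appears in both DataFrames as the key.

Note that the values in the key column do not have to appear in the same order in both DataFrames.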
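The two points above can be checked with a minimal sketch. The frame names `left` and `right` and the helper variable `shared` are illustrative only; the sketch assumes the default inner join and simply compares the implicit key with an explicitly passed one, then shuffles one side to show that row order does not affect which rows are matched.

```python
import pandas as pd

left = pd.DataFrame({"employee": ["Bob", "Jake", "Lisa"],
                     "group": ["Accounting", "Engineering", "Engineering"]})
right = pd.DataFrame({"employee": ["Lisa", "Bob", "Jake"],
                      "hire_date": [2004, 2008, 2012]})

# With no `on=` argument, merge keys on the column names common to both frames.
shared = [c for c in left.columns if c in right.columns]

implicit = pd.merge(left, right)             # key inferred
explicit = pd.merge(left, right, on=shared)  # same key, spelled out

# Shuffle the right-hand rows: the same employees are still paired up.
shuffled = pd.merge(left, right.sample(frac=1, random_state=0))


def normalized(df):
    """Sort by the key and reset the index so results can be compared."""
    return df.sort_values("employee").reset_index(drop=True)
```

Here `normalized(shuffled)` and `normalized(implicit)` hold the same rows, which is what "order does not matter" means in practice.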

1. One-to-one merge

    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    from IPython.display import display  # only needed outside Jupyter

    df1 = DataFrame({'employee': ['Bob', 'Jake', 'Lisa'],
                     'group': ['Accounting', 'Engineering', 'Engineering']})
    df2 = DataFrame({'employee': ['Lisa', 'Bob', 'Jake'],
                     'hire_date': [2004, 2008, 2012]})
    display(df1, df2)

Output:

      employee        group
    0      Bob   Accounting
    1     Jake  Engineering
    2     Lisa  Engineering

      employee  hire_date
    0     Lisa       2004
    1      Bob       2008
    2     Jake       2012

    # 'employee' is the only shared column, so it becomes the key
    df3 = pd.merge(df1, df2)
    df3

Output:

      employee        group  hire_date
    0      Bob   Accounting       2008
    1     Jake  Engineering       2012
    2     Lisa  Engineering       2004
           

Many-to-one merge

    df3 = DataFrame({'employee': ['Lisa', 'Jake'],
                     'group': ['Accounting', 'Engineering'],
                     'hire_date': [2004, 2016]})
    df4 = DataFrame({'group': ['Accounting', 'Engineering', 'Engineering'],
                     'supervisor': ['Carly', 'Guido', 'Steve']})
    # 'group' is the shared key; 'Engineering' appears twice in df4,
    # so Jake's row is repeated once per matching supervisor
    display(df3, df4, pd.merge(df3, df4))

Output:

      employee        group  hire_date
    0     Lisa   Accounting       2004
    1     Jake  Engineering       2016

             group supervisor
    0   Accounting      Carly
    1  Engineering      Guido
    2  Engineering      Steve

      employee        group  hire_date supervisor
    0     Lisa   Accounting       2004      Carly
    1     Jake  Engineering       2016      Guido
    2     Jake  Engineering       2016      Steve
           

Many-to-many merge

    df1 = DataFrame({'employee': ['Bob', 'Jake', 'Lisa'],
                     'group': ['Accounting', 'Engineering', 'Engineering']})
    df5 = DataFrame({'group': ['Engineering', 'Engineering', 'HR'],
                     'supervisor': ['Carly', 'Guido', 'Steve']})
    display(df1, df5, pd.merge(df1, df5))  # many-to-many
    # for comparison: concat stacks the rows and fills missing columns with NaN
    display(pd.concat([df1, df5]))

Output:

      employee        group
    0      Bob   Accounting
    1     Jake  Engineering
    2     Lisa  Engineering

             group supervisor
    0  Engineering      Carly
    1  Engineering      Guido
    2           HR      Steve

      employee        group supervisor
    0     Jake  Engineering      Carly
    1     Jake  Engineering      Guido
    2     Lisa  Engineering      Carly
    3     Lisa  Engineering      Guido

      employee        group supervisor
    0      Bob   Accounting        NaN
    1     Jake  Engineering        NaN
    2     Lisa  Engineering        NaN
    0      NaN  Engineering      Carly
    1      NaN  Engineering      Guido
    2      NaN           HR      Steve
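One way to reason about the size of a many-to-many result: with the default inner join, each key value contributes (occurrences on the left) x (occurrences on the right) rows, and keys present on only one side contribute none. The following is a rough sketch of that arithmetic; the names `left`, `right` and `expected_rows` are illustrative and not part of the original example.

```python
import pandas as pd

left = pd.DataFrame({"employee": ["Bob", "Jake", "Lisa"],
                     "group": ["Accounting", "Engineering", "Engineering"]})
right = pd.DataFrame({"group": ["Engineering", "Engineering", "HR"],
                      "supervisor": ["Carly", "Guido", "Steve"]})

merged = pd.merge(left, right)  # inner join on the shared "group" column

# Count how often each key occurs on either side.
left_counts = left["group"].value_counts()
right_counts = right["group"].value_counts()

# Multiply the counts per key; a key missing on one side counts as 0 there.
per_key = left_counts.mul(right_counts, fill_value=0)
expected_rows = int(per_key.sum())

print(len(merged), expected_rows)
```

Only "Engineering" occurs on both sides (twice each), so the product is 2 x 2 and both numbers agree.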
           

(1) Normalizing the key

Use on= to explicitly specify which column serves as the key.
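As a minimal sketch of why an explicit key can matter, assume two frames that share more than one column name. The frames `df_a` and `df_b` below are hypothetical data made up for illustration. Without on=, every shared column becomes part of the key; with on=, only the named column is matched and the remaining shared column is kept from both sides with pandas' default `_x` / `_y` suffixes.

```python
import pandas as pd

# Hypothetical frames that share two column names: "employee" and "group".
df_a = pd.DataFrame({"employee": ["Bob", "Jake", "Lisa"],
                     "group": ["Accounting", "Engineering", "Engineering"]})
df_b = pd.DataFrame({"employee": ["Bob", "Jake", "Lisa"],
                     "group": ["Sales", "Engineering", "HR"],
                     "hire_date": [2008, 2012, 2004]})

# Default: both shared columns form the key, so a row survives only if
# employee AND group match.
by_default = pd.merge(df_a, df_b)

# Explicit key: match on "employee" only. The other shared column appears
# twice, as group_x (from df_a) and group_y (from df_b).
by_employee = pd.merge(df_a, df_b, on="employee")

print(by_default)
print(by_employee)
```

With this data the default merge keeps a single row (Jake), while the explicit merge keeps all three employees.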

df1 = DataFrame({'employee':['Jack',"Summer