
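ML LoR: Binary fraud classification on a credit card dataset, combining undersampling {NearMiss/KMeans/TomekLinks/ENN} and oversampling {SMOTE/ADASYN} with the LoR (logistic regression) algorithm, evaluated with PR and ROC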

Design approach

[Figure: design approach]

Output results

[Figures: output results (four images)]
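Because the results are judged by PR and ROC, it may help to see how the two summary scores behave on imbalanced labels. The following is a minimal pure-Python sketch, not the code used in this post: the function names, labels, and scores are hypothetical, and the average-precision helper assumes there are no tied scores.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at each rank where a positive appears.

    Assumes no tied scores (a simplification made for this sketch).
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Hypothetical toy data: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.80, 0.70, 0.90]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC AUC: {auc:.4f}")
print(f"Average precision: {ap:.4f}")
```

On this toy data a single high-scoring negative costs the ROC AUC little but pulls average precision down noticeably, which is one reason PR curves are commonly read alongside ROC curves when positives are rare.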

Implementation code

F:\Program Files\Python\Python36\lib\site-packages\matplotlib\axes\_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.

 warnings.warn("The 'normed' kwarg is deprecated, and has been "

0    284315

1       492

Name: Class, dtype: int64
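The class counts above can be turned into a quick imbalance check. This small sketch only redoes the arithmetic on the two printed numbers; it assumes Class 1 marks the fraudulent transactions, and the variable names are illustrative.

```python
# Class counts copied from the value_counts() output above.
counts = {0: 284315, 1: 492}

total = sum(counts.values())
fraud_share = counts[1] / total            # fraction of samples in class 1
imbalance_ratio = counts[0] / counts[1]    # majority samples per minority sample
all_negative_accuracy = counts[0] / total  # accuracy of always predicting class 0

print(f"total samples: {total}")
print(f"class 1 share: {fraud_share:.4%}")
print(f"majority : minority = {imbalance_ratio:.1f} : 1")
print(f"accuracy of always predicting 0: {all_negative_accuracy:.4%}")
```

A model that always predicts class 0 would already exceed 99.8% accuracy here, so accuracy alone says little; this is the motivation for resampling and for PR/ROC evaluation.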

Default method

Undersampling RandomUnderSampler method

F:\Program Files\Python\Python36\lib\site-packages\imblearn\under_sampling\_prototype_selection\_nearmiss.py:178: UserWarning: The number of the samples to be selected is larger than the number of samples available. The balancing ratio cannot be ensure and all samples will be returned.

 "The number of the samples to be selected is larger"

Undersampling NearMissV1 method

F:\Program Files\Python\Python36\lib\site-packages\sklearn\svm\_base.py:977: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.

 "the number of iterations.", ConvergenceWarning)

Undersampling NearMissV2 method

Undersampling NearMissV3 method

Undersampling ClusterCentroids method

Undersampling TomekLinks method

Undersampling EditedNearestNeighbours method

Majority-class sample count after data cleaning

Original:  227451

After Tomek Link:  227429

After ENN:  227326
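By these counts, Tomek link removal drops only 227451 - 227429 = 22 majority samples, and the ENN count is 125 below the original (227451 - 227326). Both are cleaning methods that touch only samples near the class boundary, so they barely change the class ratio. The sketch below illustrates the Tomek link idea in pure Python on hypothetical 2-D points; it is not imblearn's implementation, and the helper names are made up for illustration.

```python
import math


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0), (3.2, 3.1)]
labels = [0, 0, 0, 1, 1, 1]

links = tomek_links(points, labels)

# Undersampling step: drop the majority-class member of each link.
to_drop = {i if labels[i] == 0 else j for i, j in links}
kept = [k for k in range(len(points)) if k not in to_drop]

print("Tomek links:", links)
print("kept indices:", kept)
```

Only the one majority point that sits right next to a minority point is removed; the other majority points stay, which matches the small reduction seen in the counts.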

Oversampling RandomOverSampler method

Oversampling SMOTE method

Oversampling ADASYN method
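To make the oversampling step more concrete, here is a minimal pure-Python sketch of the interpolation idea behind SMOTE: a synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. This is an illustration under assumptions, not imblearn's implementation; the function name, parameters, and toy points are hypothetical. ADASYN uses the same kind of interpolation but generates more synthetic points around minority samples that have many majority-class neighbours.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical minority-class points in 2-D.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.4, 1.2)]
new_points = smote_sample(minority, k=2, n_new=4, seed=0)

for point in new_points:
    print(tuple(round(v, 3) for v in point))
```

Because every synthetic point lies between two existing minority samples, the new data stays inside the region the minority class already occupies, unlike random oversampling, which only duplicates existing rows.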