fuzzywuzzy uses edit distance (Levenshtein distance) to measure the difference between sequences.
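To make "edit distance" concrete, here is a minimal sketch of the classic dynamic-programming Levenshtein distance in pure Python. The function name `levenshtein` is made up for illustration and is not part of fuzzywuzzy; the library's own scoring may differ in detail.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The smaller the distance relative to the string lengths, the more similar the two strings are.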
GitHub:
https://github.com/seatgeek/fuzzywuzzy
Installation
pip install fuzzywuzzy
Code example
from fuzzywuzzy import fuzz
text1 = "北京绿色公交占比年底将达93.7%"
text2 = "北京的绿色公交车,年底占比将达到93.7%"
print(fuzz.ratio(text1, text2))
# 74
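For intuition about where a 0 to 100 score like this comes from, here is a rough standard-library approximation. As far as I know, fuzzywuzzy falls back to `difflib.SequenceMatcher` when python-Levenshtein is not installed, but treat that as an assumption; `simple_ratio` is a hypothetical helper, not a fuzzywuzzy function, and its score may not match `fuzz.ratio` exactly.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Similarity as an integer from 0 to 100.

    SequenceMatcher.ratio() returns 2 * M / T, where M is the number of
    matched characters and T is the total length of both strings.
    """
    matcher = SequenceMatcher(None, s1, s2)
    return int(round(100 * matcher.ratio()))


text1 = "北京绿色公交占比年底将达93.7%"
text2 = "北京的绿色公交车,年底占比将达到93.7%"
print(simple_ratio(text1, text2))
```

Identical strings score 100 and strings with no characters in common score 0.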
Reference
Python: fuzzywuzzy study notes (python: fuzzywuzzy学习笔记)