Head First Python: Chapters 4-5

Chapter 4: Persistent Storage

1. Program-generated data

strip(): removes unwanted whitespace from a string.

man=[]
other=[]
try:
    data=open('sketch.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            # strip() removes unwanted whitespace from line_spoken;
            # the stripped string is assigned back to the same name
            line_spoken=line_spoken.strip()
            # Add line_spoken to the appropriate list, based on the value of role
            if role=='Man':
                man.append(line_spoken)
            elif role=='Other Man':
                other.append(line_spoken)
        except ValueError:
            # Lines without a colon cannot be unpacked; skip them
            pass
    data.close()
except IOError:
    print('The datafile is missing!')
# Print each list
print(man)
print(other)
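
To see what split(':',1) and strip() do without needing sketch.txt on disk, here is a minimal sketch. The sample lines are hypothetical stand-ins for the file's contents, assuming a "role: spoken line" format.

```python
# Hypothetical sample lines in the "role: spoken line" format assumed for sketch.txt.
sample_lines = [
    "Man: Is this the right room for an argument?\n",
    "Other Man: Let's get one thing clear: I told you!\n",
    "(pause)\n",  # no colon, so split() yields one item and unpacking fails
    "Man: You most certainly did not!\n",
]

man = []
other = []
for each_line in sample_lines:
    try:
        # maxsplit=1 splits at the first colon only, so later colons stay in the text.
        role, line_spoken = each_line.split(':', 1)
    except ValueError:
        continue  # skip lines that are not in "role: text" form
    line_spoken = line_spoken.strip()  # drop the leading space and trailing newline
    if role == 'Man':
        man.append(line_spoken)
    elif role == 'Other Man':
        other.append(line_spoken)

print(man)
print(other)
```

The second sample line shows why the maxsplit argument matters: without it, the extra colon would produce three parts and the two-name unpacking would fail.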

2. Opening a file in write mode

The file argument of print() controls where the data is sent or saved.

try:
    # Open the two files, assigning each to a file object
    man_file=open('man_data.txt','w')
    other_file=open('other_data.txt','w')
    # Use print() to save each list to its disk file
    print(man,file=man_file)
    print(other,file=other_file)

    man_file.close()
    other_file.close()
except IOError:
    print('File error.')

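If an IOError occurs before the files are closed, the data may be corrupted, so make sure the files are always closed.

The finally suite always runs, whether or not the try/except statement raises an exception.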

try:
    # Open the two files, assigning each to a file object
    man_file=open('man_data.txt','w')
    other_file=open('other_data.txt','w')
    # Use print() to save each list to its disk file
    print(man,file=man_file)
    print(other,file=other_file)

except IOError:
    print('File error.')

finally:
    # Moving the close() calls here reduces the chance of data corruption and makes sure the files are closed properly
    man_file.close()
    other_file.close()

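Finding out the specifics of an error

Python variables

A Python variable holds only a reference to a data object. Strings in Python are immutable, a tuple is an immutable list, and all number types are immutable as well. The original data cannot be changed because other variables may also refer to it, and you can never know which other variables refer to a particular string.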
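
A minimal sketch of references and immutability. The variable names are illustrative only and do not come from the book.

```python
# Two names can refer to the same string object.
greeting = "hello"
alias = greeting
print(alias is greeting)        # both names point at one object

# String methods return a new string; the original is left untouched.
shouted = greeting.upper()
print(greeting, shouted)

# Item assignment is not allowed on an immutable string.
try:
    greeting[0] = "J"
except TypeError as err:
    print("TypeError:", err)

# A list, by contrast, is mutable, and every name that refers to it sees the change.
times = ["2.34", "3.21"]
same_times = times
times.append("2.58")
print(same_times)
```

Because a string can never change in place, it does not matter how many other names refer to it. With a mutable list, a change made through one name is visible through all of them.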

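locals()

Returns a dictionary of all the names defined in the current scope.

An exception object is passed to the except suite and assigned to an identifier with the as keyword.

str(): returns the string representation of any data object that supports string conversion.

The in operator tests for membership.

The "+" operator concatenates two strings when used with strings and adds two numbers when used with numbers.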

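try:
    data=open('missing.txt')
    print(data.readline(),end='')
# Give the exception object a name
except IOError as err:
    # str() asks the exception object to represent itself as a string
    print('File error:'+str(err))
finally:
    # The in operator tests for membership
    if 'data' in locals():
        # If the name 'data' is among those returned by locals(), the file was
        # opened and it is safe to call close(); otherwise there is nothing to close
        data.close()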

File error:[Errno 2] No such file or directory: 'missing.txt'

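Working with files using with

The with statement uses a Python technique called the context management protocol. It automatically closes any file it opened, even if an exception occurs, and it also uses the as keyword.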

try:
    # Open the two files, assigning each to a file object
    man_file=open('man_data.txt','w')
    other_file=open('other_data.txt','w')
    # Use print() to save each list to its disk file
    print(man,file=man_file)
    print(other,file=other_file)

except IOError as err:
    print('File error:'+str(err))

finally:
    # Close each file only if it was actually opened
    if 'man_file' in locals():
        man_file.close()
    if 'other_file' in locals():
        other_file.close()

The with version

try:
    with open('man_data.txt','w') as man_file:
        print(man,file=man_file)
    with open('other_data.txt','w') as other_file:
        print(other,file=other_file)
except IOError as err:
    print('File error:'+str(err))
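
The following sketch checks the claim that with closes the file even when an exception is raised inside the block. It writes to a throwaway temporary directory, and the RuntimeError is a simulated failure added purely for illustration.

```python
import os
import tempfile

# Write to a throwaway file in a temporary directory (the path is illustrative only).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'man_data.txt')
    try:
        with open(path, 'w') as man_file:
            print(['line one', 'line two'], file=man_file)
            raise RuntimeError('simulated failure inside the with block')
    except RuntimeError as err:
        print('Error: ' + str(err))
    # The with statement has already closed the file, despite the exception.
    closed_after_error = man_file.closed
    print(closed_after_error)
```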

print() displays a list in a format that mimics how the Python interpreter stores it, showing what the data "looks like" in memory.
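
A small sketch of what actually lands in the file when a whole list is passed to print(). An in-memory io.StringIO buffer stands in for the file object here, and the list contents are made up.

```python
import io

man = ["Is this the right room?", "I've told you once."]

buffer = io.StringIO()          # in-memory stand-in for a file opened with 'w'
print(man, file=buffer)
saved_text = buffer.getvalue()

print(saved_text, end='')
# The saved text is the list's repr() plus a newline, not one item per line.
print(saved_text == repr(man) + '\n')
```

This is why the saved file is awkward to read back: the brackets, quotes and commas are all part of the text.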

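Parsing the data format

Custom code

Standard output: sys.stdout

The fourth parameter of print_lol() identifies where the data is written. Always give this parameter a default value of sys.stdout, so that a call that does not specify a file object still writes to the screen.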

import sys
def print_lol(the_list,indent=False,level=0,fn=sys.stdout):
    for each_item in the_list:
        if isinstance(each_item,list):
            # Recurse into nested lists, one level deeper
            print_lol(each_item,indent,level+1,fn)
        else:
            if indent:
                for tab_stop in range(level):
                    # One tab character per nesting level
                    print("\t",end='',file=fn)
            print(each_item,file=fn)

import nester
try:
    with open('man_data.txt','w') as man_file:
        nester.print_lol(man,fn=man_file)
    with open('other_data.txt','w') as other_file:
        nester.print_lol(other,fn=other_file)
except IOError as err:
    print('File error:'+str(err))
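
A self-contained sketch of the fn parameter in action. It defines print_lol() inline rather than importing the nester module, and sends the output to an io.StringIO buffer instead of a real file. The nested movies list is sample data.

```python
import io
import sys

def print_lol(the_list, indent=False, level=0, fn=sys.stdout):
    """Print each item of a possibly nested list on its own line."""
    for each_item in the_list:
        if isinstance(each_item, list):
            print_lol(each_item, indent, level + 1, fn)
        else:
            if indent:
                print('\t' * level, end='', file=fn)
            print(each_item, file=fn)

movies = ['The Holy Grail', 1975, ['Graham Chapman', ['Michael Palin', 'John Cleese']]]

buffer = io.StringIO()
print_lol(movies, indent=True, fn=buffer)   # written to the buffer
written = buffer.getvalue()

print_lol(movies)                            # no fn given, so it goes to the screen
```

Passing fn by keyword lets the caller skip the level argument and still redirect the output.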

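"Pickling data"

The pickle library saves and loads almost any Python data object, including lists. dump() saves data and load() restores it, and the file must be opened in binary access mode. It is an efficient mechanism for saving data to disk and restoring it from disk.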

import pickle
try:
    # Change the access mode to writable binary ('wb') and replace the earlier calls with pickle.dump()
    with open('man_data.txt','wb') as man_file, open('other_data.txt','wb') as other_file:
        pickle.dump(man,man_file)
        pickle.dump(other,other_file)
except IOError as err:
    print('File error:'+str(err))
# Handle exceptions raised while pickling the data
except pickle.PickleError as perr:
    print('Pickle error:'+str(perr))

Use pickle for generic file I/O
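
The notes mention load() but only show dump(). The sketch below does a full round trip through a temporary file; the file name and list contents are hypothetical.

```python
import os
import pickle
import tempfile

man = ["Is this the right room?", "I've told you once."]
restored = None

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'man_data.pickle')  # hypothetical file name
    try:
        # Save the list in binary write mode...
        with open(path, 'wb') as man_file:
            pickle.dump(man, man_file)
        # ...and restore it in binary read mode.
        with open(path, 'rb') as man_file:
            restored = pickle.load(man_file)
    except IOError as err:
        print('File error: ' + str(err))
    except pickle.PickleError as perr:
        print('Pickle error: ' + str(perr))

print(restored)
print(restored == man, restored is man)
```

The restored list is equal to the original but is a new object. Only load pickle files from sources you trust, since unpickling can execute arbitrary code.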

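Chapter 5: Comprehending Data: Working with Your Data

Read the data from each file into its own list

Method chaining: data.strip().split(). strip() is applied to the line held in data and removes unwanted whitespace, then split() creates a list, which is assigned to the target identifier.

Read a method chain from left to right.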

# Open the file
with open('james.txt') as jaf:
    # Read the line of data
    data=jaf.readline()
# Convert the data to a list
james=data.strip().split(',')
with open('julie.txt') as juf:
    data=juf.readline()
julie=data.strip().split(',')
with open('mikey.txt') as mif:
    data=mif.readline()
mikey=data.strip().split(',')
with open('sarah.txt')as saf:
    data=saf.readline()
sarah=data.strip().split(',')

print(james)
print(julie)
print(mikey)
print(sarah)
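
To make the left-to-right reading of the chain concrete, this sketch runs it on a literal string instead of a file. The sample line is hypothetical, assuming the comma-separated, mixed-separator format that the later sanitize() function is written to handle.

```python
# Hypothetical line of timing data with mixed separators.
data = '2-34,3:21,2.34,2.45\n'

# Chained form: read left to right.
chained = data.strip().split(',')

# The same work, one step at a time.
stripped = data.strip()          # '2-34,3:21,2.34,2.45'
stepwise = stripped.split(',')   # ['2-34', '3:21', '2.34', '2.45']

print(chained)
print(chained == stepwise)
```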


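Sorting

In-place sorting: sort() replaces the original data with the sorted data.

Copied sorting: sorted() returns a sorted copy and leaves the original order intact.

Both sort in ascending order by default. Pass reverse=True for descending order.

Function chaining: read from right to left.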
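
A short sketch contrasting the two kinds of sorting. The times are sample values, stored as strings as they are in these notes, so they are compared character by character.

```python
times = ['2.58', '2.34', '3.21', '2.22']

copied = sorted(times)                      # new list; times keeps its order
descending = sorted(times, reverse=True)    # descending copy
after_sorted = times[:]                     # snapshot: still the original order

times.sort()                                # in place; returns None

# Function chaining reads right to left: sorted() runs first, then print().
print(sorted(after_sorted))
print(descending)
print(times)
```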

# Open the file
with open('james.txt') as jaf:
    # Read the line of data
    data=jaf.readline()
# Convert the data to a list
james=data.strip().split(',')
with open('julie.txt') as juf:
    data=juf.readline()
julie=data.strip().split(',')
with open('mikey.txt') as mif:
    data=mif.readline()
mikey=data.strip().split(',')
with open('sarah.txt')as saf:
    data=saf.readline()
sarah=data.strip().split(',')

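# Define a function that cleans up the time format
def sanitize(time_string):
    # Use the in operator to check whether the string contains a dash or a colon
    if '-' in time_string:
        splitter='-'
    elif ':' in time_string:
        splitter=':'
    else:
        # If the string needs no cleaning, return it unchanged
        return(time_string)
    # Split the string into its minutes and seconds parts
    (mins,secs)=time_string.split(splitter)
    return(mins+'.'+secs)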


# Create four new lists that start out empty
clean_james=[]
clean_julie=[]
clean_mikey=[]
clean_sarah=[]

# Sanitize each item of the original lists and append the cleaned item to the appropriate new list
for each_t in james:
    clean_james.append(sanitize(each_t))
for each_t in julie:
    clean_julie.append(sanitize(each_t))
for each_t in mikey:
    clean_mikey.append(sanitize(each_t))
for each_t in sarah:
    clean_sarah.append(sanitize(each_t))
# Print the cleaned lists in sorted order
print(sorted(clean_james))
print(sorted(clean_julie))
print(sorted(clean_mikey))
print(sorted(clean_sarah))


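List comprehensions reduce repetitive code:

A list comprehension reduces the code needed to transform one list into another, and is one example of Python's support for functional programming concepts.

Transforming one list into another requires:

Create a new list to hold the transformed data
Iterate over each item in the original list
Perform the transformation on each iteration
Append the transformed data to the new list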
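
The four steps can be sketched as a plain loop and then collapsed into a comprehension. The minutes-to-seconds conversion is an illustrative transformation, not one taken from the book.

```python
mins = [1, 2, 3]

# The four steps written out as a loop.
secs_loop = []                    # 1. create a new list
for m in mins:                    # 2. iterate over the original list
    secs_loop.append(m * 60)      # 3. transform, 4. append

# The same transformation as a list comprehension.
secs_comp = [m * 60 for m in mins]

print(secs_loop)
print(secs_comp)
```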


# Open the file
with open('james.txt') as jaf:
    # Read the line of data
    data=jaf.readline()
# Convert the data to a list
james=data.strip().split(',')
with open('julie.txt') as juf:
    data=juf.readline()
julie=data.strip().split(',')
with open('mikey.txt') as mif:
    data=mif.readline()
mikey=data.strip().split(',')
with open('sarah.txt')as saf:
    data=saf.readline()
sarah=data.strip().split(',')

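# Define a function that cleans up the time format
def sanitize(time_string):
    # Use the in operator to check whether the string contains a dash or a colon
    if '-' in time_string:
        splitter='-'
    elif ':' in time_string:
        splitter=':'
    else:
        # If the string needs no cleaning, return it unchanged
        return(time_string)
    # Split the string into its minutes and seconds parts
    (mins,secs)=time_string.split(splitter)
    return(mins+'.'+secs)

# A list comprehension performs the transformation, then sorted() orders the new list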
print(sorted([sanitize(t) for t in james]))
print(sorted([sanitize(t) for t in julie]))
print(sorted([sanitize(t) for t in mikey]))
print(sorted([sanitize(t) for t in sarah]))

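Removing duplicates by iteration

This works like a filter: to drop duplicates, the code has to check the new list while that list is being built.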

# Replace the unordered, inconsistent data with cleaned, sorted copies
james=sorted([sanitize(t) for t in james])
julie=sorted([sanitize(t) for t in julie])
mikey=sorted([sanitize(t) for t in mikey])
sarah=sorted([sanitize(t) for t in sarah])

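# Create an empty list to hold the unique items
unique_james=[]
# Iterate over the existing data
for each_t in james:
    # If the item is not already in the new list, append it
    if each_t not in unique_james:
        unique_james.append(each_t)
# Slice the list to get the first three items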
print(unique_james[0:3])

unique_julie=[]
for each_t in julie:
    if each_t not in unique_julie:
        unique_julie.append(each_t)
print(unique_julie[0:3])

unique_mikey=[]
for each_t in mikey:
    if each_t not in unique_mikey:
        unique_mikey.append(each_t)
print(unique_mikey[0:3])

unique_sarah=[]
for each_t in sarah:
    if each_t not in unique_sarah:
        unique_sarah.append(each_t)
print(unique_sarah[0:3])
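
The four near-identical loops could be folded into one helper. This is a sketch only: unique_in_order() is a hypothetical name, not a function from the book, and the sample times are made up.

```python
def unique_in_order(items):
    """Return the items with duplicates dropped, keeping first-seen order."""
    unique = []
    for each_item in items:
        if each_item not in unique:
            unique.append(each_item)
    return unique

# Hypothetical, already sanitized and sorted times.
james = ['2.01', '2.01', '2.22', '2.34', '2.34', '2.45']
top_three = unique_in_order(james)[0:3]
print(top_three)
```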


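Removing duplicates with sets

The items in a set are unordered, and duplicates are not allowed.

Factory function: creates a new data item of a particular type. set() creates a new set, and any duplicates in the supplied list are ignored.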
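
A minimal sketch of set() as a duplicate filter, using made-up times:

```python
times = ['2.34', '2.01', '2.34', '2.22', '2.01']

distinct = set(times)            # duplicates are ignored
print(len(times), len(distinct))

# A set has no order, so sort it before slicing.
fastest = sorted(distinct)[0:3]
print(fastest)
```

Because a set is unordered, it cannot be sliced directly. sorted() turns it back into an ordered list first.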

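# Define a function that cleans up the time format
def sanitize(time_string):
    # Use the in operator to check whether the string contains a dash or a colon
    if '-' in time_string:
        splitter='-'
    elif ':' in time_string:
        splitter=':'
    else:
        # If the string needs no cleaning, return it unchanged
        return(time_string)
    # Split the string into its minutes and seconds parts
    (mins,secs)=time_string.split(splitter)
    return(mins+'.'+secs)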

# Define a new function that takes a file name as its only argument
def get_coach_data(filename):
    # Add exception-handling code
    try:
        # Open the file and read the data
        with open(filename) as f:
            data=f.readline()
        # Strip the whitespace and split the data before returning it
        return(data.strip().split(','))
    except IOError as ioerr:
        # Tell the user about the error and return None to signal failure
        print('File error:'+str(ioerr))
        return(None)
# Call the function and assign each result to a list
sarah=get_coach_data('sarah.txt')
james=get_coach_data('james.txt')
julie=get_coach_data('julie.txt')
mikey=get_coach_data('mikey.txt')

# set() removes the duplicates; slice the list returned by sorted() to get the first three items
print(sorted(set([sanitize(t) for t in james]))[0:3])
print(sorted(set([sanitize(t) for t in julie]))[0:3])
print(sorted(set([sanitize(t) for t in mikey]))[0:3])
print(sorted(set([sanitize(t) for t in sarah]))[0:3])
