
A Python script for calculating genome chromosome lengths

Usage: python cal_chrom_length.py hg19.fasta hg19.len

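import subprocess
import sys

fasta_path = sys.argv[1]  # input FASTA file, e.g. hg19.fasta
len_path = sys.argv[2]    # output length file, e.g. hg19.len

# Run the external fastalength tool, which prints one
# "<length> <name>" line per sequence. Passing the arguments as a list
# avoids shell quoting problems with unusual file names, and check=True
# raises an error if the tool fails instead of writing an empty file.
result = subprocess.run(
    ["fastalength", fasta_path],
    capture_output=True, text=True, check=True,
)

# Swap the two columns so each output line reads "<name><TAB><length>".
with open(len_path, "w") as fw:
    for line in result.stdout.splitlines():
        cols = line.split()
        if len(cols) < 2:
            continue  # skip blank or malformed lines
        fw.write("\t".join([cols[1], cols[0]]) + "\n")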

Each line of the output file hg19.len has the form: chromosome name<TAB>length. This format is suitable for downstream tools such as genomeCoverageBed and bedGraphToBigWig.
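If fastalength is not installed, the same name<TAB>length file can be produced in pure Python. The following is only a minimal sketch, not part of the original script: the function names chrom_lengths and write_lengths are hypothetical, and it assumes an uncompressed FASTA file in which the sequence name is the first whitespace-delimited token of each ">" header line.

```python
def chrom_lengths(fasta_path):
    """Return a list of (name, length) pairs in the order they appear."""
    lengths = []
    name, size = None, 0
    with open(fasta_path) as f:
        for line in f:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    lengths.append((name, size))
                # Assumption: the name is the first token after ">".
                parts = line[1:].split()
                name, size = (parts[0] if parts else ""), 0
            elif name is not None:
                # Sequence line: count its characters (blank lines add 0).
                size += len(line)
    if name is not None:
        lengths.append((name, size))
    return lengths


def write_lengths(fasta_path, len_path):
    """Write one "<name><TAB><length>" line per sequence."""
    with open(len_path, "w") as fw:
        for name, size in chrom_lengths(fasta_path):
            fw.write(f"{name}\t{size}\n")
```

Calling write_lengths("hg19.fasta", "hg19.len") would then play the same role as the script above. The file is read line by line, so memory use stays small even for a whole-genome FASTA.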
