Python code for primer sequence analysis

바닐라스카이 2021. 8. 17. 09:33

Classifying reads in NGS data by their adapter and primer sequences.

The code below is pseudocode, so adapt it as needed.

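import regex  # third-party package (pip install regex); supports fuzzy matching

# IUPAC ambiguous base codes and the bases each one stands for
ambiguous_base_dic = {"N":"ATGC","R":"AG","Y":"TC","K":"GT","M":"AC","S":"GC","W":"AT","B":"CGT","D":"AGT","H":"ACT","V":"ACG"}

# example inputs (placeholders; replace with your own primer and read)
primer_f = "ACGTRYGG"
sequence = "ACGTACGGTTTT"

# convert any ambiguous base in the primer into a regex character class
for word, initial in ambiguous_base_dic.items():
    primer_f = primer_f.replace(word, "[" + initial + "]")

# anchor the primer at the start of the read and allow at most one error
# ({e<=1} counts substitutions, insertions, and deletions; use {s<=1} for mismatches only)
primer = r"^(?:{0})".format(primer_f) + "{e<=1}"

# print the read if it starts with the primer
if regex.search(primer, sequence):
    print(sequence)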
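
To sort reads into groups by primer, the same idea can be wrapped in a small function. The sketch below is only one possible approach and swaps in a different technique: instead of the regex module's fuzzy matching it counts mismatches position by position (Hamming distance) using only the standard library, so it tolerates substitutions but not insertions or deletions. The primer names, primer sequences, and reads are made up for illustration, and `classify_read` and `mismatches` are hypothetical helper names.

```python
# IUPAC ambiguity codes, same table as above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ATGC", "R": "AG", "Y": "TC", "K": "GT", "M": "AC", "S": "GC",
    "W": "AT", "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def mismatches(primer, read_prefix):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, read_prefix))


def classify_read(read, primers, max_mismatch=1):
    """Return the name of the first primer matching the start of the read, else None."""
    for name, primer in primers.items():
        prefix = read[:len(primer)]
        # Reads shorter than the primer cannot match.
        if len(prefix) == len(primer) and mismatches(primer, prefix) <= max_mismatch:
            return name
    return None


# Hypothetical primers and reads, made up for illustration only.
primers = {"primer_A": "ACGTRYGG", "primer_B": "TTGACNCA"}
reads = ["ACGTACGGTTTT", "TTGACGCAAAAA", "ACGAACGGTTTT", "GGGGGGGGGGGG"]

# Group reads by the primer they start with; unmatched reads go under None.
bins = {}
for read in reads:
    bins.setdefault(classify_read(read, primers), []).append(read)
```

Here an uncalled base (`N`) in a read counts as a mismatch, because only the primer side is treated as ambiguous. If indels near the primer matter for your data, the fuzzy regex approach above is the better fit.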

License: Attribution, NonCommercial, NoDerivatives
