I send the sequence as text in the tbSeq field.
I found this sequence at http://pro-161-70.ib.unicamp.br/~itaraju/tools/pimw/what.htm
The request returns some results and an image (shown below), which the script saves to disk as 'output.gif'.
import requests
import lxml.html

url = 'http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/pimw.cgi'

payload = {
    'arquivo': '',
    'opShowTitle': 'ON',
    'opShowSeq': 'ON',
    'opShowStat': 'ON',
    'opShowpimw': 'ON',
    'opGelVirtual': 'ON',
    'opMap': 'gel0.def',
    'opPK': 'Default',
    'tbCt': 3.55,
    'tbNt': 7,
    'tbArg': 12.01,
    'tbAsp': 4.06,
    'tbCys': 9,
    'tbGlu': 4.45,
    'tbHis': 5.985,
    'tbLys': 10.01,
    'tbTyr': 10.01,
    'tbSeq': '''>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVH
CTNLMNTTVTTGLLLNGSYSENRTQIWQKHRTSNDS
ALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ
KYNLRLRQAWCHFPSNWKGAWKEVKEEIVNLPKER
YRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPG
PCVQRTYVACHIRSVIIWLETISKKTYAPPREGHLECT
STVTGMTVELNYIPKNRTNVTLSPQIESIWAAELDRY
KLVEITPIGFAPTEVRRYTGGHERQKRVPFVXXXXXX
XXXXXXXXXXXXXXXXVQSQHLLAGILQQQKNL
LAAVEAQQQMLKLTIWGVK''',
}

# send POST
r = requests.post(url, data=payload)
#print(r.text)

# convert HTML string into HTML tree
html = lxml.html.fromstring(r.text)

# get all images
imgs = html.cssselect('img')

# get second image
if len(imgs) > 1:
    url = 'http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/' + imgs[1].attrib['src'].strip()
    print("Downloading ...", url)
    r = requests.get(url, stream=True)
    if r.ok:
        # write the image to disk in 1 KB blocks, one dot per block
        with open('output.gif', 'wb') as handle:
            for block in r.iter_content(1024):
                if not block:
                    break
                handle.write(block)
                print('.', end=' ')
        print()
    else:
        # Something went wrong
        print("Download failed with status", r.status_code)

# get data
for tr in html.cssselect('tr'):
    # the inner selector is 'tr', so it yields the row itself plus any nested rows
    for row in tr.cssselect('tr'):
        print(row.text_content().strip().replace('\n', ' | '), end=' ')
    print()
Result:
Downloading ... http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/../../../tools/htdocs/tmp/gel.15548.gif
. . . . . . . . . . . . . . . . . . . . . . . . . .
ORF:
gi|532319|pir|TVFV2E|TVFV2E envelope protein
Sequence:
ELRLRYCAPAGFALLKCNDADYDGFKTNCS NVSVVHCTNLMNTTVTTGLLLNGSYSENRT QIWQKHRTSNDSALILLNKHYNLTVTCKRP GNKTVLPVTIMAGLVFHSQKYNLRLRQAWC HFPSNWKGAWKEVKEEIVNLPKERYRGTND PKRIFFQRQWGDPETANLWFNCHGEFFYCK MDWFLNYLNNLTVDADHNECKNTSGTKSGN KRAPGPCVQRTYVACHIRSVIIWLETISKK TYAPPREGHLECTSTVTGMTVELNYIPKNR TNVTLSPQIESIWAAELDRYKLVEITPIGF APTEVRRYTGGHERQKRVPFVXXXXXXXXX XXXXXXXXXXXXXVQSQHLLAGILQQQKNL LAAVEAQQQMLKLTIWGVK
MW: | pI:
40969.02 | | 9.35
Amino-acid composition
Ala (A) | 20 | 5.3% | | Cys (C) | 12 | 3.2% | | Asp (D) | 10 | 2.6% | | Glu (E) | 19 | 5.0% | | Phe (F) | 12 | 3.2% | | Gly (G) | 20 | 5.3% | | His (H) | 11 | 2.9% | | Ile (I) | 16 | 4.2% | | Lys (K) | 24 | 6.3% | | Leu (L) | 34 | 9.0% | | | | | Met (M) | 5 | 1.3% | | Asn (N) | 27 | 7.1% | | Pro (P) | 16 | 4.2% | | Gln (Q) | 17 | 4.5% | | Arg (R) | 21 | 5.5% | | Ser (S) | 16 | 4.2% | | Thr (T) | 30 | 7.9% | | Val (V) | 24 | 6.3% | | Trp (W) | 10 | 2.6% | | Tyr (Y) | 13 | 3.4% Ala (A) | 20 | 5.3% Cys (C) | 12 | 3.2% Asp (D) | 10 | 2.6% Glu (E) | 19 | 5.0% Phe (F) | 12 | 3.2% Gly (G) | 20 | 5.3% His (H) | 11 | 2.9% Ile (I) | 16 | 4.2% Lys (K) | 24 | 6.3% Leu (L) | 34 | 9.0% Met (M) | 5 | 1.3% Asn (N) | 27 | 7.1% Pro (P) | 16 | 4.2% Gln (Q) | 17 | 4.5% Arg (R) | 21 | 5.5% Ser (S) | 16 | 4.2% Thr (T) | 30 | 7.9% Val (V) | 24 | 6.3% Trp (W) | 10 | 2.6% Tyr (Y) | 13 | 3.4%
Ala (A) | 20 | 5.3%
Cys (C) | 12 | 3.2%
Asp (D) | 10 | 2.6%
Glu (E) | 19 | 5.0%
Phe (F) | 12 | 3.2%
Gly (G) | 20 | 5.3%
His (H) | 11 | 2.9%
Ile (I) | 16 | 4.2%
Lys (K) | 24 | 6.3%
Leu (L) | 34 | 9.0%
Met (M) | 5 | 1.3%
Asn (N) | 27 | 7.1%
Pro (P) | 16 | 4.2%
Gln (Q) | 17 | 4.5%
Arg (R) | 21 | 5.5%
Ser (S) | 16 | 4.2%
Thr (T) | 30 | 7.9%
Val (V) | 24 | 6.3%
Trp (W) | 10 | 2.6%
Tyr (Y) | 13 | 3.4%
Total: | 379
Theoretical 2D gel:
A small red dot :)
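The download URL printed in the result still contains `../../../` because the script glues the base directory and the image's relative `src` together as strings. As a possible alternative (a minimal sketch, not something the original script does), `urllib.parse.urljoin` from the standard library can resolve the relative path against the CGI URL. The `src` value below is taken from the printed output; the variable names are only illustrative.

```python
from urllib.parse import urljoin

# URL the form was posted to (from the script above)
cgi_url = 'http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/pimw.cgi'

# relative src of the second <img>, as seen in the printed download URL
src = '../../../tools/htdocs/tmp/gel.15548.gif'

# urljoin resolves the '../' segments against the directory of cgi_url
image_url = urljoin(cgi_url, src)
print(image_url)
# -> http://pro-161-70.ib.unicamp.br/~itaraju/tools/htdocs/tmp/gel.15548.gif
```

The server accepts both forms, so this is mostly a matter of getting a cleaner URL and not hard-coding the base directory a second time.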
EDIT: example with a file - the file has to be sent in the field named arquivo
import requests
import lxml.html

url = 'http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/pimw.cgi'

payload = {
    # 'arquivo': '', # remove it
    'opShowTitle': 'ON',
    'opShowSeq': 'ON',
    'opShowStat': 'ON',
    'opShowpimw': 'ON',
    'opGelVirtual': 'ON',
    'opMap': 'gel0.def',
    'opPK': 'Default',
    'tbCt': 3.55,
    'tbNt': 7,
    'tbArg': 12.01,
    'tbAsp': 4.06,
    'tbCys': 9,
    'tbGlu': 4.45,
    'tbHis': 5.985,
    'tbLys': 10.01,
    'tbTyr': 10.01,
    'tbSeq': '',
}

# read the FASTA file; its content is sent in the file field 'arquivo'
with open('sequence.fasta') as fasta:
    files = {'arquivo': fasta.read()}

#url = 'http://httpbin.org/post' # special portal for tests

# send POST
r = requests.post(url, data=payload, files=files)
#print(r.text)

# convert HTML string into HTML tree
html = lxml.html.fromstring(r.text)

# get all images
imgs = html.cssselect('img')

# get second image
if len(imgs) > 1:
    url = 'http://pro-161-70.ib.unicamp.br/~itaraju/cgi-bin/itaraju/bioinf/' + imgs[1].attrib['src'].strip()
    print("Downloading ...", url)
    r = requests.get(url, stream=True)
    if r.ok:
        # write the image to disk in 1 KB blocks, one dot per block
        with open('output.gif', 'wb') as handle:
            for block in r.iter_content(1024):
                if not block:
                    break
                handle.write(block)
                print('.', end=' ')
        print()
    else:
        # Something went wrong
        print("Download failed with status", r.status_code)

# get data
for tr in html.cssselect('tr'):
    # the inner selector is 'tr', so it yields the row itself plus any nested rows
    for row in tr.cssselect('tr'):
        print(row.text_content().strip().replace('\n', ' | '), end=' ')
    print()
The sequence.fasta file used:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVH
CTNLMNTTVTTGLLLNGSYSENRTQIWQKHRTSNDS
ALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQ
KYNLRLRQAWCHFPSNWKGAWKEVKEEIVNLPKER
YRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPG
PCVQRTYVACHIRSVIIWLETISKKTYAPPREGHLECT
STVTGMTVELNYIPKNRTNVTLSPQIESIWAAELDRY
KLVEITPIGFAPTEVRRYTGGHERQKRVPFVXXXXXX
XXXXXXXXXXXXXXXXVQSQHLLAGILQQQKNL
LAAVEAQQQMLKLTIWGVK
furas, 29.06.2014
pimw.htm does nothing by itself. You have to send a POST to pimw.cgi, and you have to post all the fields available on pimw.htm. - furas, 30.06.2014
pimw.htm only shows the form. You have to send the data to pimw.cgi, because pimw.cgi generates the results. - furas, 30.06.2014
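Since every field of the form has to be posted, it can help to list the field names before building the payload. Below is a rough sketch using only the standard library's `html.parser`. The `SAMPLE_FORM` markup is made up for illustration (it is not copied from pimw.htm and only borrows a few field names from the payload above), and `FormFieldCollector` is a hypothetical helper, not part of requests or lxml.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect names (and default values, where given) of HTML form controls."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get('name')
        if tag in ('input', 'select', 'textarea') and name:
            # <input> carries its default in the value attribute;
            # <select>/<textarea> defaults are left empty in this sketch
            self.fields[name] = attrs.get('value') or ''


# Hypothetical form markup, NOT the real content of pimw.htm
SAMPLE_FORM = '''
<form action="pimw.cgi" method="post" enctype="multipart/form-data">
  <input type="file" name="arquivo">
  <input type="checkbox" name="opShowTitle" value="ON" checked>
  <input type="text" name="tbNt" value="7">
  <textarea name="tbSeq"></textarea>
</form>
'''

collector = FormFieldCollector()
collector.feed(SAMPLE_FORM)
field_names = sorted(collector.fields)
print(field_names)
```

On the real page you would feed the downloaded HTML of pimw.htm into the collector instead of `SAMPLE_FORM`, then compare the collected names with the keys of `payload`.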