INSTRUCTIONS:
a. The cellular function (or role, or definition, or description, or whatever you want to call it) of the gene product. This includes an EC number when appropriate. It does not include the organism. I do not just want a copy of the NCBI definition (in some cases it will be too vague, or even incorrect). Be as specific as you reasonably can, but a precise function is not always possible. There might be more than one function. It is possible for a protein to have more than one function (e.g., by fusion). It is possible for a DNA region to include more than one gene. Be sure to describe which you mean. For example: the protein has phosphomannomutase and phosphoglucomutase activities (EC ..., EC ...); the protein is a fusion of RNA polymerase subunit beta (EC ...) and RNA polymerase subunit beta' (EC ...); or the DNA encodes proteins for (i) DNA polymerase I (EC ...) and (ii) DNA ligase (EC ...).
b. The organism (including strain) that the sequence is from, if a reasonable conjecture can be made. If not knowable, then say so.
c. The reasons for your assessment. What tools did you use? What databases did you search? Be specific: "I used blast." is not specific; it does not tell me the program or the database or how the results were interpreted. I do not want a copy of your blast results (I have my own, and if you tell me what you did, I should be able to reproduce anything I do not have).
This assignment cannot be successfully completed with just a single tool and database. The questions range from trivial to very difficult. If you put your answers on a separate page, make clear which question you are talking about.
I tried to answer these questions, but I am stuck and do not know how to proceed. Please help me; I am worried that I will not be able to finish.
Websites used:
http://www.ncbi.nlm.nih.gov/sites/gquery
http://blast.ncbi.nlm.nih.gov/Blast.cgi ... =blasthome
http://blast.ncbi.nlm.nih.gov/Blast.cgi ... =BlastHome
http://theseed.uchicago.edu/FIG/index.cgi
1. ref|ZP_01016162
[Note: this is the NCBI accession number for the protein in question]
ANSWER: ???
a) Description: hypothetical protein PB2503_01477 [Parvularcula bermudensis HTCC2503] (also listed as gi|84690833|gb|EAQ16673.1)
EC????
b) ORGANISM: Parvularcula bermudensis HTCC2503
Bacteria; Proteobacteria; Alphaproteobacteria; Parvularculales;
Parvularculaceae; Parvularcula.
c) I used the NCBI BLAST Home page: http://blast.ncbi.nlm.nih.gov/Blast.cgi.
I chose the blastp ("protein blast") program to search for similar sequences in the nr (non-redundant) protein database. I pasted the accession number into the Enter Query Sequence box. I used the following settings: Database, Non-redundant protein sequences (nr); Algorithm, blastp; Max target sequences, 500; Short queries, Automatically adjust ...; Expect threshold, 10; Word size, 3; Matrix, BLOSUM62; Gap Costs, Existence: 11 Extension: 1; Compositional adjustments, Conditional compositional score ...; Filter, unchecked; and Mask, both choices unchecked.
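Since part (c) has to be reproducible, the settings above could also be written down as one parameter set for the NCBI BLAST URL API. This is only a sketch: the parameter names are my reading of the BLAST URL API documentation, the value 2 for "Conditional compositional score ..." is an assumption, the helper name blastp_params is made up, and nothing is actually submitted to NCBI here.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"

def blastp_params(query):
    """Return the search settings listed above as BLAST URL API parameters."""
    return {
        "CMD": "Put",            # submit a new search
        "PROGRAM": "blastp",
        "DATABASE": "nr",
        "QUERY": query,          # accession number or raw sequence
        "HITLIST_SIZE": 500,     # "Max target sequences"
        "EXPECT": 10,            # "Expect threshold"
        "WORD_SIZE": 3,
        "MATRIX": "BLOSUM62",
        "GAPCOSTS": "11 1",      # existence 11, extension 1
        "COMPOSITION_BASED_STATISTICS": 2,  # assumed code for the conditional adjustment
        "FILTER": "F",           # low-complexity filter off
    }

params = blastp_params("ZP_01016162")
# Only builds and prints the request URL; it does not contact NCBI.
print(BLAST_URL + "?" + urlencode(params))
```

Writing the settings once like this also makes it obvious that only the query changes between questions 1 to 3.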
2. gb|ACO77251
[Note: this is the GenBank accession number for the protein in question]
a) DEFINITION: It is Rhodanese-like protein [Azotobacter vinelandii DJ].
EC: EC(318599)????
b) ORGANISM: Azotobacter vinelandii DJ Bacteria; Proteobacteria; Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae; Azotobacter.
Inherited blast name: g-proteobacteria
c) I used the NCBI BLAST Home page: http://blast.ncbi.nlm.nih.gov/Blast.cgi. I chose the blastp ("protein blast") program to search for similar sequences in the nr (non-redundant) protein database. I pasted the accession number into the Enter Query Sequence box. I used the following settings: Database, Non-redundant protein sequences (nr); Algorithm, blastp; Max target sequences, 500; Short queries, Automatically adjust ...; Expect threshold, 10; Word size, 3; Matrix, BLOSUM62; Gap Costs, Existence: 11 Extension: 1; Compositional adjustments, Conditional compositional score ...; Filter, unchecked; and Mask, both choices unchecked.
3. ref|YP_521192
a) DEFINITION: hypothetical protein DSY4959 [Desulfitobacterium hafniense Y51].
EC=???
b) ORGANISM Desulfitobacterium hafniense Y51 Bacteria; Firmicutes; Clostridia; Clostridiales; Peptococcaceae; Desulfitobacterium.
c) Same as before
4. ref|YP_891092
[Note: for full credit you need to explain why this protein is so much bigger than the corresponding protein (ref|YP_001848402.1) in Mycobacterium marinum M]
a) DEFINITION: replicative DNA helicase [Mycobacterium smegmatis str. MC2 155].
EC =
/EC_number="3.1.-.-"
/EC_number="3.6.1.-"
b) ORGANISM Mycobacterium smegmatis str. MC2 155 Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium.
c)
5. ref|ZP_03623824
[Note: The obvious answer is wrong. The peptidoglycan of most Spirochetes, including that of Borrelia burgdorferi, does not contain 2,6-diaminopimelate (Beck, G., Benach, J. L., and Habicht, G. S. 1990. Isolation, preliminary chemical characterization, and biological activity of Borrelia burgdorferi peptidoglycan. Biochem. Biophys. Res. Commun. 167: 89-95). However, the investigators at Laboratory of Human Bacterial Pathenogenesis, Rocky Mountain Laboratories, National Institutes of Allergy and Infectious Disease, know their spirochete biology and got the annotation correct in the genomes of two closely related species (even as they misspell the name of their laboratory). Be sure to explain how you get to your answer (there are lots of ways from here to there).]
6. ref|YP_629528
---
I think that for questions 7-10 the sequence needs to be copied and pasted into .... http://blast.ncbi.nlm.nih.gov/Blast.cgi ... =blasthome
7. dna07
>dna07
gggagaggttggccggctggtgccgccccgggacttcaaatcccgtggga
ggtcccgcaagggagctccggagggttcgattccctccctct
8. dna08
>dna08
atgtatcctatggaacaaagggaagttaacatacaggacgagctgttaaacaggttcaaa
caacagggaacaaccattacagtatttctaacgaggggaaacaggatagtgggaaaggtt
ctcgaccacgacaggtacaccattctccttgaagttgaaggacagcctcacctgatttac
aagcacgcggtttctaccatagtagagggtggatga
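
Before choosing a BLAST program for a DNA query like dna08, one quick check is to translate it and see whether it reads as a single open reading frame. A minimal sketch, assuming the standard genetic code and reading frame 1; the names CODON_TABLE and translate are my own and do not come from any library.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    seq = "".join(dna.split()).upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

DNA08 = """
atgtatcctatggaacaaagggaagttaacatacaggacgagctgttaaacaggttcaaa
caacagggaacaaccattacagtatttctaacgaggggaaacaggatagtgggaaaggtt
ctcgaccacgacaggtacaccattctccttgaagttgaaggacagcctcacctgatttac
aagcacgcggtttctaccatagtagagggtggatga
"""

protein = translate(DNA08)
print(len(protein), protein)
```

If the translation starts with M and has a single stop codon at the very end, the DNA is probably one protein-coding gene, so the translated sequence could be used as a blastp query (or the DNA itself as a blastx query) rather than relying on a nucleotide search alone.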
9. dna09
>dna09
gtggaatgtatttttattattagtttagtgtatatttaaaggaaaaatatgacaacattt
ttagccaaaaattggagttcgttaattaagccaacaaaggttcagtatgaagctgttgat
aataatccaaatattaagactatggttgttgagccattggagcgaggattaggattaact
ttaggtaattctttgcgtagggtgttattatcttctttacgtggagcggcaattacttca
ataaaaattccaggagtagaacatgaattatcaccagtaagtggagttaaagaagattta
acagatattattttaaatattcgagatgtaattgttaaaatggattcagttcaaaagtgt
aatttaaggctagaagttacaggaccggctgttgtaacggctggtatgattactgttact
gataaacaagatgttacaatattaaatcctcaacatgttatttgtaatttaagcaaagga
tttaatttagaaatggatcttatatgtgaacaaggaaaaggatatgtgcctactagttgt
ttacataacagtgattctcctataggagcaattcatcttgatgcattgtttaatcctgta
agaagagtgtcttataaagttgaaaactcaatggttggacagatgactaactatgataag
ttaataattactgttgaaactaatggtgtagttaatccggatgcagcattagggttagca
gcaagaatattactagatcaattgcaggtatttatcaattttcaagaagttgaagaagaa
aaaccagaaaaattagaacttcaaactattaatcctgttttattaaagaaagtatatgaa
ttagagttatctgtaagatctcaaaattgcttaaaaaatgaaaatatagtttatgttggt
gatttagtagctagaactgaaactcaaatgcttaaaacagctaattttggtcgtaaatca
ttgaatgagcttaagaaagttttagctaattttaacttagaatttggcatgaaagatatt
ggttggcctcctgagaatcttgaatcacttgcaaaaaaacatgaagatcaatattaatat
aatagtaggtaattaatgcgtcatagagttagtggtagaaagttaaatagaacaactagt
catttattagcaatgttagcaaatatgtctgtatcattaatacaacatgagcagattaat
actacattacctaaagcaaaagaacttaggccttttgttgaaaagcttataactgttgca
aaaaaaggtaacttaaatgctcgtagatacctgatatctaaaattaagaatgagcttgca
gttgaaaaattgatgactacattagctcctagatatgctgaacgtcacggtggatatata
cgaattttaaaagctgggtttaggtatggagatatggctcctatggcgtatatagaattt
gtagaccgtaatattgaatctaaaggtaaagaatttaaagcattgaagaatgactctagg
aatgcaaagttaatagcagaacaatctaactaaatattgatacaaattaattgatgtaca
atagataaagttaataatggataatattttaaattaggatatagagatttaaagatggca
aatcataaatcaacacaaaaatcaattaggcaagatcaaaagaggaatttgataaataag
agtagaaaatctaatgttaaaacttttctaaagagagtaacattagcaattaatgctgga
gataaaaaagttgctagtgaagctctaagtgcagctcattcaaaattagctaaagcagca
aataaaggaatttacaagttaaatactgtttcaaggaaagttagtagattatctaggaag
attaaacagcttgaagataaaatataatcagaagattatattttgttctggtgtttaatg
ttaattattttgttaatcatattacacagcg
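
The instructions say a DNA region can include more than one gene, and dna09 is long enough that this is worth checking. Below is a rough sketch of a six-frame open reading frame scan. It assumes the standard genetic code, treats ATG, GTG and TTG as possible start codons, and uses a made-up cutoff of 50 codons; find_orfs and the other names are hypothetical, and a real gene finder does considerably more than this.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STARTS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons

def find_orfs(dna, min_codons=50):
    """Yield (strand, start, end, protein) for ORFs in all six frames.

    Coordinates are 0-based and relative to the strand being scanned.
    """
    seq = "".join(dna.split()).upper()
    reverse = seq.translate(COMPLEMENT)[::-1]
    for strand, s in (("+", seq), ("-", reverse)):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i  # first start codon since the last stop
                elif start is not None and CODON_TABLE.get(codon, "X") == "*":
                    if (i - start) // 3 >= min_codons:
                        body = "".join(
                            CODON_TABLE.get(s[j:j + 3], "X")
                            for j in range(start + 3, i, 3)
                        )
                        # Any start codon is read as methionine.
                        yield strand, start, i + 3, "M" + body
                    start = None

# Toy example; paste the dna09 sequence in place of `toy` to scan it.
toy = "ATG" + "GCT" * 5 + "TAA"
orfs = list(find_orfs(toy, min_codons=3))
print(orfs)
```

Each protein that comes out of a scan like this could then be run through blastp separately, which is one way to describe "(i) ... and (ii) ..." when a region encodes several proteins.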
10. pep10
>pep10
mpkneslkkilvlgsgaikigeagefdysgsqclkaihedgiksvlinpniatiqtdtrf
adqvyllpvtpnyvesivekerpdgimlayggqtalncgvkleeagilkkydvkvlgtqv
dgikntedrqlfkdsmkeagvpvlksktvtnfedakkvaeeleypviirvaytlggrggg
iahneielheivergckaslvgqvlveeyighwkqieyevmqdydgnnvivcnmenvlsm
kvhtgdnivvapsqtinnheyhmlrtaalratkhvgivgecniqyaldadsdryvaiein
prlsrssalaskatgyplaymsakiglgynlselvnritksttacfepsldyvvckhprw
dfskfelvnrklgvtmksvgevmavgrtfeeslqkairmldigndglvlnrsngktytee
eieyklshhddqilynvaialkmgvsvdriyklsaidpwfiekiqnivntesnlkeseln
esvlrnakkmgfsdnqiarvkektpdevrkirkdfgiipavkqidtlaaewpavtnylyl
tyggnsndiqvspdekgvvvvgagpyrigssvefdwgtvnmvwglqengeknvsvvncnp
etvstdydictrlyfeeltqerlldisdfenpkgvitcvggqtannltpglaqrginimg
tsaidvdraedrskfsaeldklhiqqpkwqafsnlnearnfaqdvgfpvivrpsyvlsga
amkvvwsqdelktyvkeatdvspdhpvviskfmldslevdvdgisngkevvigaivehid
sagvhsgdammcippwrlsnkiietindytkkialtfnvkgpfnlqflihddhvyvieln
irasrsmpfvsklvktnlislaskaildkplpkvpenkwqkihnygikvpqfsfmqldga
dialgvemqstgeaacfgnsfydalskgltsvgynlpdkgtalvtaggsqnkekllpsia
klkqlgfkilatehtaeffeekvgdieivhkiseperkpnisdllyekkidfiinipstl
slekyvgmlddeyqirrkslelgipvlttieladsfvktlewlqnnettrdpiepydiie
Please help soon.
