[Baypiggies] Hiring / Bioinformatics Tutor/Hack day
Corey Coughlin
corey.coughlin at comcast.net
Mon May 17 09:44:23 CEST 2010
+1, sounds interesting
On 5/12/2010 5:21 PM, Glen Jarvis wrote:
> This email covers two topics (which can be, but don't have to be,
> related):
>
> * A job opening
> * A tutor/hack day to give the computer scientists a real
> Bioinformatics problem to solve
>
> I put them together as a benefit for those who may be considering a
> job in this field. You can have a day to work on these types of
> problems to see if it interests you or bores you to tears.
>
>
> === Job Opening ===
> Some time back, I sent out an email regarding my bioinformatics lab
> hiring a programmer. I tried to give a feel for what work would be
> like on a daily basis. And, I tried to set your expectation for pay
> (less than industry).
>
> We still have that job opening -- probably because I set your
> expectation so well :( .
>
> I was intentionally not involved in the interviewing/hiring process
> because I wanted to have no appearance of impropriety (as I was also
> interviewing for a position to move from contractor to full time
> employee). So, if you weren't hired, I don't really know why. I
> intentionally stayed out of that loop to keep things as professional
> as possible. I only know the position is still open.
>
> With that said, my boss is talking about hiring another programmer
> for a short term (possibly a year or less), although, if it works
> out on both sides, it could turn into a permanent position (as it
> was for me - I was hired full time). Finding a fit for this
> position is actually difficult (on both sides).
>
> Sooooooo...... I'm going to stick my neck out and try something new:
> Working on a small bioinformatics problem in an open source environment.
>
>
> === Tutor/Hack day ===
> I've been wanting to get the open source community more involved with
> some of the problems that we're tackling. Open Source code is *so*
> much better than code reviewed by only a few eyes. And, this would
> also give everyone a chance to see what a problem would be like.
>
> There are some *real* bioinformaticians on this list (I don't
> consider myself on that level yet -- although I'm getting there). So,
> if you're a real bioinformatician, this may be a trivial problem for
> you. But, if you want to come and help explain things/help others work
> this out, that'd be cool!
>
> I'd like to get together (on a weekend, possibly) and hack on this
> problem. I will describe the things that I think you need to know:
>
> * What is FASTA format (http://www.ncbi.nlm.nih.gov/blast/fasta.shtml)
> * A brief introduction to BioPython (http://biopython.org/)
> * What is a genome
> * What is a gene
> * What are amino acids (contrasting against DNA data)
> * What is a 'percent identity' between genes
> * What is a species
> * What is a strain (loosely defined because it seems to be very loose
> in this problem)
> * The term taxa (plural) and taxon (singular)
> * How can genes vary and still be the same gene
> * How errors can exist in different databases
> * An introduction to the JGI (http://www.jgi.doe.gov/) database
> * An introduction to the UniProt (http://www.uniprot.org/) database
>
>
> With this introduction, you should have a theoretical understanding of
> all that you need to solve this problem -- the rest is coding. (That
> is, if I do my job and explain things well -- and don't fall into
> potholes of information that I don't know.) Also, I'll simplify by
> skipping things that you don't need to know for this problem (e.g.,
> we won't talk about open reading frames at all or what that means.
> Since we're already given amino acids, we don't care).
>
> The problem is:
>
> I will give you a file in FASTA format of the genes for a particular
> species (let's say: Chlamydophila pneumoniae). That file will contain
> a list of genes, one after the other, again in FASTA format. The file
> will have the JGI unique identifiers. However, we also want the
> UniProt identifier for this same gene.
>
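
For anyone who hasn't seen FASTA before: each record is a header line starting with ">" followed by one or more lines of sequence. Below is a minimal sketch of pulling (identifier, sequence) pairs out of such a file in plain Python. The identifiers and sequences in EXAMPLE are made up for illustration, and taking the first whitespace-delimited token of the header as the identifier is an assumption; real JGI headers may be laid out differently. Biopython's Bio.SeqIO.parse(handle, "fasta") does the same job more robustly.

```python
import io


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA-formatted file object."""
    identifier = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Assumption: the identifier is the first token after '>'.
            tokens = line[1:].split()
            identifier = tokens[0] if tokens else ""
            chunks = []
        elif identifier is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


# Made-up records, for illustration only.
EXAMPLE = """>jgi_0001 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>jgi_0002 another made-up record
MSDNELKQ
"""

records = list(read_fasta(io.StringIO(EXAMPLE)))
for identifier, sequence in records:
    print(identifier, len(sequence))
```

With a real file, open(path) can be passed in place of the io.StringIO wrapper.
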
> Now, this should be as simple as: "Take the gene from the JGI
> database, look up the same gene in UniProt, record the number, dust
> off your hands - you're done" -- There are lots of little tedious
> problems, however, that keep it from being this easy.
>
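
The "should be as simple as" version can be sketched as an exact-sequence lookup: index the UniProt records by amino acid sequence, then look each JGI sequence up in that index. This is only a sketch under assumptions; the function name, the identifiers, and the idea of having both sets available locally as (identifier, sequence) pairs are hypothetical, not how either database is actually queried.

```python
def map_by_exact_sequence(jgi_records, uniprot_records):
    """Map each JGI identifier to the UniProt identifiers whose
    amino acid sequence is exactly the same string."""
    by_sequence = {}
    for uniprot_id, sequence in uniprot_records:
        # Several entries can share one sequence, so keep a list.
        by_sequence.setdefault(sequence, []).append(uniprot_id)
    return {
        jgi_id: by_sequence.get(sequence, [])
        for jgi_id, sequence in jgi_records
    }


# Made-up identifiers and sequences, for illustration only.
jgi = [
    ("jgi_0001", "MKTAYIAKQRQISFVKSHFS"),
    ("jgi_0002", "MSDNELKQ"),
]
uniprot = [
    ("UP_A", "MKTAYIAKQRQISFVKSHFS"),  # exact match for jgi_0001
    ("UP_B", "MSDNELKR"),              # differs from jgi_0002 at one position
]

mapping = map_by_exact_sequence(jgi, uniprot)
print(mapping)
```

Here jgi_0002 comes back with no match even though UP_B differs from it by a single residue, which is exactly the kind of tedious problem that keeps the lookup from being this easy.
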
> For example, if two genes have exactly the same amino acid sequence
> except at a single position, are they actually the same gene? What
> if the sequence found was from a strain rather than from the exact
> original species?
>
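
One way to put a number on "the same except at a single position" is a percent identity. Below is a minimal sketch that assumes the two sequences are already the same length and line up position by position; real comparisons normally need an alignment step first, and tools differ on what they divide by, so treat this definition as an assumption. The sequences are made up.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (align them first)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Two made-up 20-residue sequences that differ only at the last position.
first = "MKTAYIAKQRQISFVKSHFS"
second = "MKTAYIAKQRQISFVKSHFT"

identity = percent_identity(first, second)
print(identity)  # 19 of 20 positions match -> 95.0
```

Whether 95% is "the same gene" is a judgment call: the threshold has to be chosen, and it may reasonably differ between a strain and the exact species.
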
> Let me ask another question: If you were to somehow magically sequence
> your personal entire genome (everything - not just genes) from a cell
> in your toe and also sequence your entire genome from a cell from your
> nose, would they be identical? I bet not... I'll explain why. Now, we
> expect fewer differences in actual genes (than in other parts of your
> genome), but even then, there can be some variation...
>
> These are the types of questions/problems that we'll be getting into
> if you're so interested...
>
> Who's up for this? We'll set a date and time once we have a group of
> interested people...
>
> You don't have to be interested in this job to be interested in this
> problem (and/or to do more in bioinformatics).
>
>
> Cheers,
>
>
>
> Glen
> --
> Whatever you can do or imagine, begin it;
> boldness has beauty, magic, and power in it.
>
> -- Goethe
>
>