[Tutor] vcf_files and strings

Thu Oct 6 07:47:33 CEST 2011

On 2011-10-05 21:29, Anna Olofsson wrote:
> vcf file: 2 rows, 10 columns.
>
> The important column is 7 where the ID is, i.e.
> refseq.functionalClass=missense. It's a missense mutation, so then I
> want to extract refseq.name=NM_003137492, or I want to extract only
> the ID, which in this case is NM_003137492.
>
> Then I want to do exactly the same thing for all the other mutations,
> but only for the missense mutations not the other ones. How do I
> accomplish that? Where do I start?

I would split each row into its columns (inspect your file to find the
separator), then look for "missense" in the 7th column of every row and,
if it is found, use a regex to pull out the name/ID.
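
A minimal sketch of those steps, in case it helps. It rests on several
assumptions that are not confirmed by your description: that the columns
are tab-separated, that the "7th column" is counted from 1 (index 6),
that header lines start with "#", and that the refseq.name value ends at
a ";" or at whitespace. The names missense_ids, SEPARATOR, INFO_COLUMN
and NAME_PATTERN are made up for this illustration.

```python
import re

# Assumption: columns are tab-separated; change this after inspecting the file.
SEPARATOR = "\t"
# Assumption: "7th column" is counted from 1, so its list index is 6.
INFO_COLUMN = 6
# Assumption: the value after "refseq.name=" ends at ";" or whitespace.
NAME_PATTERN = re.compile(r"refseq\.name=([^;\s]+)")


def missense_ids(lines, separator=SEPARATOR, column=INFO_COLUMN):
    """Return the refseq.name value of every row marked as missense."""
    ids = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip empty lines and (assumed) header lines starting with "#".
        if not line or line.startswith("#"):
            continue
        fields = line.split(separator)
        # Skip rows that are too short to contain the wanted column.
        if len(fields) <= column:
            continue
        info = fields[column]
        # Only rows whose column mentions "missense" are of interest.
        if "missense" not in info:
            continue
        match = NAME_PATTERN.search(info)
        if match:
            ids.append(match.group(1))
    return ids
```

The function accepts any iterable of lines, so an open file object can
be passed to it directly, and so can a plain list of strings for testing.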

Are you able to code that yourself or do you need more hints?

Bye, Andreas