[Tutor] regex advice

Norman Khine norman at khine.net
Tue Jan 6 12:43:01 CET 2015


Hello,
I have the following code:

import os
import sys
import re

walk_dir = ["app", "email", "views"]
#t(" ")
gettext_re = re.compile(r"""[t]\((.*)\)""").findall

for x in walk_dir:
    curr_dir = "../node-blade-boiler-template/" + x
    # topdown=True so that removing entries from dirs actually prunes the walk
    for root, dirs, files in os.walk(curr_dir, topdown=True):
        if ".git" in dirs:
            dirs.remove(".git")
        # npm's dependency directory is named node_modules (underscore)
        if "node_modules" in dirs:
            dirs.remove("node_modules")

        for filename in files:
            file_path = os.path.join(root, filename)
            print('\n- file %s (full path: %s)' % (filename, file_path))
            # text mode, so the str pattern also works on Python 3
            with open(file_path, 'r') as f:
                f_content = f.read()
                msgid = gettext_re(f_content)
                print(msgid)

This traverses the listed directories and tries to extract every string
that appears inside

t(" ")

For example, I have a Blade template file:

replace page
  .row
    .large-8.columns
      form( method="POST", action="/product/saveall/#{style._id}" )
        input( type="hidden" name="_csrf" value=csrf_token )
        h3 #{t("Generate Product for")} #{tt(style.name)}
        .row
          .large-6.columns
            h4=t("Available Attributes")
            - for(var i = 0; i < attributes.length; i++)
              - var attr = attributes[i]
              - console.log(attr)
                ul.attribute-block.no-bullet
                  li
                    b= tt(attr.name)
                  - for(var j = 0; j < attr.values.length; j++)
                    - var val = attr.values[j]
                      li
                        label
                          input( type="checkbox" title="#{tt(attr.name)}: #{tt(val.name)}" name="#{attr.id}" value="#{val.id}")
                          |
                          =tt(val.name)
                          = " [Code: " + (val.code || val._id) + "]"
                          !=val.htmlSuffix()
          .large-6.columns
            h4 Generated Products
            ul#products
        button.button.small
          i.icon-save
          |=t("Save")
        =" "
        a.button.small.secondary( href="/product/list/#{style.id}" )
          i.icon-cancel
          |t=("Cancel")

When I run the above code, I get:

- file add.blade (full path: ../node-blade-boiler-template/views/product/add.blade)
 type="hidden" name="_csrf" value=csrf_token
"Generate product for")} #{tt(style.name
"Available Attributes"
attr.name
 type="checkbox" title="#{tt(attr.name)}: #{tt(val.name)}" name="#{attr.id}" value="#{val.id}"
val.name
"Generated products"
"Save"



So gettext_re = re.compile(r"""[t]\((.*)\)""").findall is not correct, as
it also includes results such as

input( type="hidden" name="_csrf" value=csrf_token )
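
The over-matching can be reproduced in isolation. The sketch below runs the same pattern against two lines copied from the template. The character class [t] is just a literal t, so the final t of input( qualifies as well, and .* is greedy, so each match runs on to the last ) on the line. The variable names are only for illustration.

```python
import re

# The pattern from the script above, unchanged.
gettext_re = re.compile(r"""[t]\((.*)\)""").findall

# Two lines copied from the template.
input_line = 'input( type="hidden" name="_csrf" value=csrf_token )'
h3_line = 'h3 #{t("Generate Product for")} #{tt(style.name)}'

# The "t(" at the end of "input(" is matched like any other "t(".
input_matches = gettext_re(input_line)
print(input_matches)

# The greedy .* swallows everything up to the last ")" on the line,
# so the t(...) and tt(...) calls end up in a single match.
h3_matches = gettext_re(h3_line)
print(h3_matches)
```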

What is the correct way to pull all values that are within t(" ") while
excluding any tt( ) and input( ) calls?
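
One possible direction, as a minimal sketch rather than a definitive answer: anchor the t with a word boundary so it cannot be the tail of tt or input, and capture only a quoted string literal with a non-greedy match. The pattern, the name GETTEXT_RE and the helper extract_msgids are hypothetical and not part of the script above. It assumes each t( ) call holds a single quoted literal on one line with no escaped quotes inside it.

```python
import re

# Hypothetical pattern (an assumption, not from the original script):
#   \b       word boundary: the "t" cannot be the tail of "tt" or "input"
#   t\(      a literal "t("
#   \s*      optional whitespace
#   (["'])   the opening quote, remembered as group 1
#   (.*?)    the message text, non-greedy, group 2
#   \1       the same quote character that opened the string
#   \s*\)    optional whitespace and the closing parenthesis
GETTEXT_RE = re.compile(r"""\bt\(\s*(["'])(.*?)\1\s*\)""")


def extract_msgids(text):
    """Return the string literals passed to t(...) in text."""
    return [m.group(2) for m in GETTEXT_RE.finditer(text)]


# A few lines copied from the template above.
sample = '''
input( type="hidden" name="_csrf" value=csrf_token )
h3 #{t("Generate Product for")} #{tt(style.name)}
h4=t("Available Attributes")
b= tt(attr.name)
|=t("Save")
'''

msgids = extract_msgids(sample)
print(msgids)  # ['Generate Product for', 'Available Attributes', 'Save']
```

finditer is used instead of findall because the pattern has two groups and only the second one is wanted. A line such as |t=("Cancel") is not matched, since the t there is followed by = rather than (.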

Any advice is much appreciated.

norman


-- 
%>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or chr(97+(ord(c)-83)%26) for
c in ",adym,*)&uzq^zqf" ] )

