[Tutor] regex advice
Norman Khine
norman at khine.net
Tue Jan 6 12:43:01 CET 2015
Hello,
I have the following code:
import os
import sys
import re

walk_dir = ["app", "email", "views"]
#t(" ")
gettext_re = re.compile(r"""[t]\((.*)\)""").findall

for x in walk_dir:
    curr_dir = "../node-blade-boiler-template/" + x
    for root, dirs, files in os.walk(curr_dir, topdown=False):
        if ".git" in dirs:
            dirs.remove(".git")
        if "node-modules" in dirs:
            dirs.remove("node-modules")
        for filename in files:
            file_path = os.path.join(root, filename)
            print('\n- file %s (full path: %s)' % (filename, file_path))
            with open(file_path, 'rb') as f:
                f_content = f.read()
                msgid = gettext_re(f_content)
                print msgid
This traverses the listed directories and tries to extract all strings
that are within t(" ").
For example, I have a blade template file such as:
replace page
.row
.large-8.columns
form( method="POST", action="/product/saveall/#{style._id}" )
input( type="hidden" name="_csrf" value=csrf_token )
h3 #{t("Generate Product for")} #{tt(style.name)}
.row
.large-6.columns
h4=t("Available Attributes")
- for(var i = 0; i < attributes.length; i++)
- var attr = attributes[i]
- console.log(attr)
ul.attribute-block.no-bullet
li
b= tt(attr.name)
- for(var j = 0; j < attr.values.length; j++)
- var val = attr.values[j]
li
label
input( type="checkbox" title="#{tt(attr.name)}: #{tt(val.name)}" name="#{attr.id}" value="#{val.id}")
|
=tt(val.name)
= " [Code: " + (val.code || val._id) + "]"
!=val.htmlSuffix()
.large-6.columns
h4 Generated Products
ul#products
button.button.small
i.icon-save
|=t("Save")
=" "
a.button.small.secondary( href="/product/list/#{style.id}" )
i.icon-cancel
|t=("Cancel")
When I run the above code, I get:
- file add.blade (full path: ../node-blade-boiler-template/views/product/add.blade)
type="hidden" name="_csrf" value=csrf_token
"Generate product for")} #{tt(style.name
"Available Attributes"
attr.name
type="checkbox" title="#{tt(attr.name)}: #{tt(val.name)}" name="#{attr.id}" value="#{val.id}"
val.name
"Generated products"
"Save"
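So gettext_re = re.compile(r"""[t]\((.*)\)""").findall is not correct:
it matches any "t(" at all, including the end of "input(" and "tt(", and
the greedy (.*) runs on to the last ")" on the line. It therefore returns
results such as input( type="hidden" name="_csrf" value=csrf_token ).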
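
The over-matching can be reproduced on a few lines copied from the template. This is only a small sketch to show the behaviour of the pattern from the code above; the names lines and results are made up for the illustration.

```python
import re

# The pattern from the post: "[t]" is just the letter t, and "(.*)" is greedy.
gettext_re = re.compile(r"""[t]\((.*)\)""").findall

# Three lines copied from the template above.
lines = [
    'input( type="hidden" name="_csrf" value=csrf_token )',
    'h3 #{t("Generate Product for")} #{tt(style.name)}',
    'b= tt(attr.name)',
]

results = [gettext_re(line) for line in lines]
for line, found in zip(lines, results):
    print('%s -> %r' % (line, found))
```

The first line matches because "input(" ends in "t(", the second runs from the first t( to the last ")" on the line, and the third matches the second t of "tt(".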
What is the correct way to pull all values that are within t(" ") but
exclude any tt( ) and input( )?
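
One direction that might work, as a sketch only: anchor the t with a word boundary and capture just the double-quoted text. It assumes that every translatable string is a double-quoted literal with no escaped quotes inside, which may not hold for all templates. The names sample and msgids are made up for the illustration.

```python
import re

# \b      : word boundary, so the t inside "tt(" or "input(" cannot start a match
# [^"]*   : capture only up to the closing double quote, never past it
# Assumption: strings are double-quoted and contain no escaped quotes.
gettext_re = re.compile(r'\bt\(\s*"([^"]*)"\s*\)').findall

# A few lines copied from the template above.
sample = '''
input( type="hidden" name="_csrf" value=csrf_token )
h3 #{t("Generate Product for")} #{tt(style.name)}
h4=t("Available Attributes")
b= tt(attr.name)
|=t("Save")
'''

msgids = gettext_re(sample)
print(msgids)
```

Single-quoted strings or strings built by concatenation would need a different pattern, or a proper parser for the template language.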
Any advice is much appreciated.
norman
--
%>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or chr(97+(ord(c)-83)%26) for
c in ",adym,*)&uzq^zqf" ] )