Sorry, I should have been more specific about the encoding: it's EUC-JP. I googled a little, and you can still use re.findall with Japanese kana, but I didn't find anything about kanji.
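
For what it's worth, here is a minimal sketch of one way kanji might be matched. It assumes Python 3 and that the EUC-JP bytes are decoded to a Unicode string before matching. The sample sentence and the names `KANA` and `KANJI` are made up for illustration. The kanji range used here is only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), so rarer characters outside it would not match.

```python
import re

# Hypothetical sample; in practice `raw` would be bytes read from an EUC-JP file.
sample = "私はカタカナとひらがなと漢字を使います"
raw = sample.encode("euc_jp")

# Decode to a Unicode string first, so the regex works on characters, not bytes.
text = raw.decode("euc_jp")

# Hiragana (U+3040-U+309F) and katakana (U+30A0-U+30FF).
KANA = re.compile(r"[\u3040-\u309F\u30A0-\u30FF]+")
# Basic CJK Unified Ideographs block only (an assumption; extensions are not covered).
KANJI = re.compile(r"[\u4E00-\u9FFF]+")

kana_runs = KANA.findall(text)
kanji_runs = KANJI.findall(text)

print(kana_runs)
print(kanji_runs)
```

Matching on the decoded string avoids having to write patterns against the raw two-byte EUC-JP sequences, where a byte-level character class could match across character boundaries.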