[Ironpython-users] [Newbee] - Ironpython and .Net - Problem of encoding accented characters

bruno gallart bruno.gallart at orange.fr
Thu Jan 3 23:38:59 CET 2013


Hi,

First, I wish you a happy and healthy New Year.
As for me, I am starting the year with a silly little encoding problem:

I am writing a spellchecker with NHunspell, the Hunspell library for
.NET, for a southern European language with many accented characters.
The spellchecker works fine for words without accented characters, but
as soon as there are some I get a strange problem. For example:

    self.mySpellChecker = NHunspell.Hunspell("xx_FR.aff","xx_FR.dic")
    self.suggestions = self.mySpellChecker.Suggest("ome")

If I pass the word "ome" directly (a misspelling of "òme", "the man"),
self.suggestions gives me "òme" ("the man") and "íme" ("the hymn"),
which is exactly what I want.

But if I take all the words from a Word 2007 document I have created,
then when the loop reaches "ome", self.suggestions is empty. I think it
comes from the accented character.

The difference may come from
self.theWords.Item(self.ltItemsErreurs[self.index]).Text, which gives
the misspelled word, but as a String. In .NET, if I remember correctly,
a String object is Unicode? I have tried unicode, decode and encode with
iso-8859-15; my xx_FR.dic and xx_FR.aff are in UTF-8; the Word document
is a 2007 version, etc. I don't understand any more, I am drowning in my
error. Can somebody give me some leads?
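
One way to check this would be to look at the code points of the two
strings rather than at how they print. The following is only a minimal
sketch in plain Python, not the NHunspell or Word API: describe is a
hypothetical helper, and the sample strings are made up to show two ways
a word read from a document could differ from a literal typed in the
source (a trailing space, or an accent stored as a separate combining
character). Whether either of these is what really happens here is an
assumption.

```python
import unicodedata


def describe(text):
    """Return one (code point, Unicode name) pair per character of text."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
            for ch in text]


# Hypothetical samples, not taken from the real document.
typed = u"ome"
from_document = u"ome "       # same letters plus a trailing space
composed = u"\u00f2me"        # o-grave as one precomposed character
decomposed = u"o\u0300me"     # "o" followed by a combining grave accent

for label, value in [("typed", typed), ("from_document", from_document),
                     ("composed", composed), ("decomposed", decomposed)]:
    # Strings that look the same on screen can differ in length and content.
    print(label, len(value), describe(value))
```

Running the same helper on the literal and on the .Text value would show
whether they are really the same sequence of characters.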

Thanks in advance to anyone who can give me some advice, and sorry for
my poor English.

And this week I am going to buy the book "IronPython in Action"; I
think that will help.
Cheers,
Bruno


_Suggestion part of the program:_

        self.theWords.Item(self.ltItemsErreurs[self.index]).Font.Color = 255
        self.suggestions = self.mySpellChecker.Suggest(self.theWords.Item(self.ltItemsErreurs[self.index]).Text)

        if self.suggestions:
            for sug in self.suggestions:
                self._ltSuggestions.Items.Add(sug)
        else:
            self._ltSuggestions.Items.Add("No proposition")

        # set the previous word back to black
        if self.index >= 1:
            self.theWords.Item(self.ltItemsErreurs[self.index-1]).Font.Color = 0

        if self.index == len(self.ltItemsErreurs)-1:
            self._ltSuggestions.Items.Add("End of tractament")
            self.theWords.Item(self.ltItemsErreurs[self.index]).Font.Color = 0
            self.index = 0
        else:
            self.index = self.index + 1

_Small piece of code that loads all the words and spell-checks them:_

        # indexes of the misspelled words in the document
        self.ltItemsErreurs = []
        self.doc = self.word_application.Documents.Open("c:\\test\\LeTestUltime_2007.docx")
        self.lesMots = self.doc.Content.Words
        self.nbrMots = self.doc.Content.Words.Count
        # Word collections are 1-based, so the last index is nbrMots
        for n in xrange(1, self.nbrMots + 1):
            if self.lesMots.Item(n).Text.strip() not in self.ltLettresAEviter:
                existe = self.monCorrecteur.Spell(self.lesMots.Item(n).Text.strip())
                if not existe:
                    self.ltItemsErreurs.append(n)
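
Note that the loop above strips the text before calling Spell, while the
suggestion part passes .Text to Suggest as it is. If that difference
matters (an assumption, nothing here verifies it), one option would be
to clean every word in one place before it reaches the spellchecker. The
sketch below is plain Python: clean_word is a hypothetical helper, the
sample values only stand in for Words.Item(n).Text, and normalizing to
NFC is a guess at what a dictionary with precomposed accents would
expect.

```python
import unicodedata


def clean_word(text):
    """Strip surrounding whitespace and compose accents (NFC form)."""
    return unicodedata.normalize("NFC", text.strip())


# Hypothetical values standing in for Words.Item(n).Text.
raw_words = [u"ome ", u"o\u0300me\r", u"\u00f2me"]
cleaned = [clean_word(w) for w in raw_words]

# After cleaning, the decomposed and precomposed spellings compare equal.
print([len(w) for w in cleaned])
print(cleaned[1] == cleaned[2])
```

The same cleaned string could then be passed to both Spell and Suggest.
If unicodedata is not available in a given IronPython version, .NET
strings also have a Normalize() method that could play the same role.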