[Ironpython-users] [Newbee] - Ironpython and .Net - Problem of encoding accented characters
bruno gallart
bruno.gallart at orange.fr
Thu Jan 3 23:38:59 CET 2013
Hi,
First, I wish you a happy new year and good health.
I am starting the year with a silly little encoding problem:
I am writing a spellchecker with NHunspell, the .NET port of the
Hunspell library, for a southern European language with many accented
characters. The spellchecker works fine for words without accented
characters, but when a word has one I get a strange problem. For example:
    self.mySpellChecker = NHunspell.Hunspell("xx_FR.aff", "xx_FR.dic")
    self.suggestions = self.mySpellChecker.Suggest("ome")
If I pass the word "ome" directly, which is a misspelling of "òme"
("the man"), the result (self.suggestions) gives me "òme" ("the man")
and "íme" ("the hymn")... which is exactly right.
But if I take all the words from a Word 2007 document that I created,
when the loop reaches "ome", self.suggestions is empty. I think it
comes from the accented character.
The difference may come from
    self.theWords.Item(self.ltItemsErreurs[self.index]).Text
which returns the misspelled word, but as a String. If I remember
correctly, a .NET String object is Unicode? I have tried unicode(),
decode and encode with iso-8859-15; my xx_FR.dic and xx_FR.aff are in
UTF-8, the Word document is a 2007 version, etc. I no longer understand
what is going on; I am drowning in my error. Can somebody give me some
pointers?
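A minimal sketch of one way to check what is really being passed to Suggest: list the code points of the string that comes back from Word and compare them with the literal that works. The helper name `code_points` is hypothetical; it is not part of NHunspell or IronPython and uses only built-in string functions.

```python
def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return ["U+%04X" % ord(ch) for ch in text]

# A literal typed directly in the source file.
print(code_points(u"ome"))
# The same word followed by a space, which looks identical when displayed.
print(code_points(u"ome "))
```

In the program, the same call could be applied to the `.Text` value just before it is handed to Suggest, so that any invisible extra character shows up in the list.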
Thanks in advance for any advice, and sorry for my poor English.
Also, this week I am going to buy the book "IronPython in Action"; I
think that will help.
Cheers,
Bruno
The suggestion part of the program:

    self.theWords.Item(self.ltItemsErreurs[self.index]).Font.Color = 255
    self.suggestions = self.mySpellChecker.Suggest(
        self.theWords.Item(self.ltItemsErreurs[self.index]).Text)

    if self.suggestions:
        for sug in self.suggestions:
            self._ltSuggestions.Items.Add(sug)
    else:
        self._ltSuggestions.Items.Add("No proposition")

    # set the previous word back to black
    if self.index >= 1:
        self.theWords.Item(self.ltItemsErreurs[self.index-1]).Font.Color = 0

    if self.index == len(self.ltItemsErreurs)-1:
        self._ltSuggestions.Items.Add("End of tractament")
        self.theWords.Item(self.ltItemsErreurs[self.index]).Font.Color = 0
        self.index = 0
    else:
        self.index = self.index + 1

A small part of the code that loads all the words and spell-checks them:

    ltItemsErreurs = []
    self.doc = self.word_application.Documents.Open(
        "c:\\test\\LeTestUltime_2007.docx")
    self.lesMots = self.doc.Content.Words
    self.nbrMots = self.doc.Content.Words.Count
    for n in xrange(1, self.nbrMots):
        if self.lesMots.Item(n).Text.strip() not in self.ltLettresAEviter:
            existe = self.monCorrecteur.Spell(self.lesMots.Item(n).Text.strip())
            if not existe:
                self.ltItemsErreurs.append(n)
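
Comparing the two snippets, the Spell call receives `.Text.strip()` while the Suggest call receives `.Text` as is. A minimal sketch of a single clean-up step that both calls could share, assuming the text coming from Word may carry trailing whitespace or decomposed accents (an assumption, not something verified here). `clean_word` is a hypothetical helper name; if the `unicodedata` module is not available under IronPython, the .NET method `String.Normalize()` is a possible alternative.

```python
import unicodedata

def clean_word(text):
    """Strip surrounding whitespace and compose accents to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", text.strip())

# "o" followed by a combining grave accent becomes the single character U+00F2.
decomposed = u"o\u0300me "
print(clean_word(decomposed) == u"\u00f2me")
```

With such a helper, both `Spell(clean_word(word))` and `Suggest(clean_word(word))` would see the same string for a given word.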