[Python-ideas] unicodedata.itergraphemes (or str.itergraphemes / str.graphemes)

Joshua Landau joshua at landau.ws
Wed Jul 10 14:10:37 CEST 2013


On 10 July 2013 13:04, Philipp A. <flying-sheep at web.de> wrote:
> 2013/7/10 David Kendal <me at dpk.io>
>
> > Well, right. I meant “a new type” like dict.keys() and dict.values() are
> > “view types” on a dictionary that provide iterator interfaces. This would
> > just be a “grapheme view” on a string.
>
> i think that’s the way to go. who would want dozens of new functions in
> unicodedata?

You've missed both of our points. Consider:

>>> {}.keys()
dict_keys([])
>>> iter({}.keys())
<dict_keyiterator object at 0x7fe3d633a890>

There are good reasons why a "view" should not be its iterator: a view can
be iterated any number of times, whereas an iterator is consumed by a
single pass.
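
To make the distinction concrete, here is a minimal sketch of a re-iterable
view whose __iter__ hands out a fresh iterator on every call, the way
dict.keys() does. GraphemeView is a hypothetical name, not an existing or
proposed API, and the clustering rule (a character plus any combining marks
that follow it, detected with unicodedata.combining) is a simplifying
assumption rather than the full Unicode segmentation rules.

```python
import unicodedata


class GraphemeView:
    """Hypothetical re-iterable "grapheme view" on a string (illustration only)."""

    def __init__(self, string):
        self._string = string

    def __iter__(self):
        # Return a *new* iterator on every call, so the view can be
        # iterated many times; the view itself has no __next__.
        return self._clusters()

    def _clusters(self):
        # Simplified clustering: a character followed by any combining marks.
        # This is NOT the full Unicode (UAX #29) segmentation algorithm.
        cluster = ""
        for ch in self._string:
            if cluster and unicodedata.combining(ch) == 0:
                yield cluster
                cluster = ""
            cluster += ch
        if cluster:
            yield cluster


view = GraphemeView("e\u0301a")   # 'e' + COMBINING ACUTE ACCENT, then 'a'
first = list(view)
second = list(view)               # a view can be iterated again

it = iter(view)
list(it)                          # exhaust the iterator
leftover = list(it)               # an iterator is one-shot, nothing is left
```

If the view were its own iterator, the second list(view) would come back
empty, which is exactly the surprise dict views avoid.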

> how about something like the following? it can easily be extended to get a
> reverse iterator.
>
> setting its pos and calling find_grapheme or __next__ or previous allows for
> bruce’s usecases.
>
> class GraphemeIterator:
>     def __init__(self, string, start=0):
>         self.string = string
>         self.pos = start
>
>     def __iter__(self):
>         return self
>
>     def __next__(self):
>         _, next_pos, grapheme = self.find_grapheme()
>         self.pos = next_pos
>         return grapheme
>
>     def previous(self):
>         prev_pos, _, grapheme = self.find_grapheme(backwards=True)
>         self.pos = prev_pos
>         return grapheme
>
>     def find_grapheme(self, i=None, *, backwards=False):
>         """finds next complete grapheme in string, starting at position i
>         if backwards is not set, finds grapheme starting at i, or the next
>         one if i is in the middle of one
>         if it is set, it finds the grapheme which i points to, even if
>         that’s the middle.
>         if str[i] is the beginning of a grapheme, backwards finds the one
>         before it.
>         """
>         if i is None:
>             i = self.pos
>         ...
>         return (start, end, grapheme)
>
> def find_grapheme(string, i, backwards=False):
>     """ convenience function for oneshotting it """
>     return GraphemeIterator(string, i).find_grapheme(backwards=backwards)
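
The body of find_grapheme is elided above, so the following is only a sketch
of the forward and backward behaviour its docstring describes. find_cluster
and _is_boundary are hypothetical names, and the boundary rule (a cluster
starts at any non-combining character) is the same simplifying assumption as
before, not real grapheme segmentation. Raising IndexError when nothing is
found is also my assumption.

```python
import unicodedata


def _is_boundary(string, k):
    # Simplifying assumption: a cluster starts at any non-combining
    # character; position 0 and the end of the string are boundaries too.
    return k <= 0 or k >= len(string) or unicodedata.combining(string[k]) == 0


def find_cluster(string, i, backwards=False):
    """Hypothetical stand-in for the elided search; returns (start, end, text).

    Assumes 0 <= i <= len(string).
    """
    n = len(string)
    if backwards:
        # If i is inside a cluster, move to the end of that cluster;
        # if i is already at a cluster start, the wanted cluster ends at i.
        end = min(i, n)
        while not _is_boundary(string, end):
            end += 1
        if end == 0:
            raise IndexError("no grapheme before position 0")
        start = end - 1
        while not _is_boundary(string, start):
            start -= 1
    else:
        # If i is inside a cluster, skip ahead to the next cluster start.
        start = i
        while not _is_boundary(string, start):
            start += 1
        if start >= n:
            raise IndexError("no grapheme at or after position %d" % i)
        end = start + 1
        while not _is_boundary(string, end):
            end += 1
    return start, end, string[start:end]


s = "e\u0301a\u0300b"                          # two accented letters, then 'b'
forward = find_cluster(s, 1)                   # i is mid-cluster: next one
inside = find_cluster(s, 3, backwards=True)    # i is mid-cluster: that one
before = find_cluster(s, 2, backwards=True)    # i is a start: the one before
```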

