[Chennaipy] Fwd: Building Open Source Tamil Spellchecker – Day 8 – Porting from C# to Python

Shrinivasan T tshrinivasan at gmail.com
Fri Aug 28 14:45:27 EDT 2020


---------- Forwarded message ---------
From: Shrinivasan T <tshrinivasan at gmail.com>
Date: Sat, 29 Aug 2020, 12:14 AM
Subject: Building Open Source Tamil Spellchecker – Day 8 – Porting from C#
to Python
To: <ilugc at freelists.org>


Recently, Tamil Virtual Academy released 10 Tamil NLP tools as Free/Open
Source Software, with source code. One of them is a spellchecker. Read more
about the release here.

https://goinggnu.wordpress.com/2020/08/16/tamilvu-released-10-tamil-software-as-free-open-source-software/

The spellchecker is written in C#. I want to port it to Python so that we
can extend it easily.

C# is a very new language for me, so I asked on the ChennaiPy mailing list
and on social media. Many friends encouraged me and came forward to help.

We first thought of porting it to Mono and then to Python, but that is a
long path. Most of the C# code parses a big JSON file and is otherwise made
of if, else, and for statements, so I hope we can port it by reading it line
by line and rewriting it in Python.
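
To give an idea of what such a line-by-line port can look like, here is a
minimal sketch. It is not taken from the actual spellchecker: the JSON
layout, the "words" key, and the function names load_words and is_known are
all made up for illustration, and the C# lines in the comments are only
rough equivalents.

```python
import json

# Hypothetical stand-in for the big JSON file; the real file used by the
# spellchecker has its own layout, which is not shown here.
SAMPLE_JSON = '{"words": ["அம்மா", "அப்பா", "தமிழ்"]}'


def load_words(json_text):
    """Parse JSON text and return the non-empty words as a set."""
    data = json.loads(json_text)
    words = set()
    for word in data["words"]:   # roughly: foreach (var word in data["words"])
        if word:                 # roughly: if (!string.IsNullOrEmpty(word))
            words.add(word)
    return words


def is_known(word, words):
    """Return True when the word is present in the loaded word set."""
    return word in words


words = load_words(SAMPLE_JSON)
print(is_known("தமிழ்", words))
print(is_known("தமிழ", words))
```

The loops and conditions map almost one to one between the two languages,
which is why reading and rewriting line by line looks practical for code of
this kind.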

I had many doubts about the C# code and its syntax. I wanted to open the
code in an IDE and run it in debug mode. As I don't have Windows, I was
looking for help from friends.

Some of my friendships go back decades. We don't even remember when we
first got to know each other. I have many such friends, mostly from free
software communities. We help each other with many things.

Today Manik, a friend from the FreeTamilComputing community, pinged me and
asked about the requirement. He spun up a Windows VM, loaded the code,
compiled it, and explained the basic workflow to me.

We immediately started to read the code line by line and rewrite it in
Python line by line. He is also very new to C#, so he googled for C# syntax
and I googled for Python syntax. Yes, we are long-time programmers who
still google for basic syntax. 🙂

https://github.com/tshrinivasan/Tamilinaiya-Spellchecker/blob/master/PythonPort/from_Csharp.py

Here is our half-cooked initial version. There are still more functions to
be ported. Once they are ported, the code will need many rounds of
debugging, auditing, improving, etc. Still, it is a good start. For weeks I
had been dreaming about how to do this. A quick call, helping hands, and
some time gave a great boost to the progress.

I also got a call from Rajesh, a C# programmer, offering to help with this.
Once the initial port is done, by next week, I hope to get his help with a
deep study and analysis of the code.

The big dream of bringing out an open source Tamil spellchecker is
happening. Happy to be a small part of this.

Tons of thanks to all the friends for their helping hands and good hearts.

Read the previous days' notes on building the Tamil spellchecker.

   1. Study notes on open-tamil spellchecker – day 1
   <https://goinggnu.wordpress.com/2020/05/22/study-notes-on-open-tamil-spellchecker-day-1/>
   2. Building Tamil Spellchecker – Day 2 – Bloom Filter to quick query on
   dataset
   <https://goinggnu.wordpress.com/2020/05/23/building-tamil-spellchecker-day-2-bloom-filter-to-quick-query-on-dataset/>
   3. Building Tamil Spellchecker – Day 3 – Collecting all Tamil Nouns
   <https://goinggnu.wordpress.com/2020/05/24/building-tamil-spellchecker-day-3-collecting-all-tamil-nouns/>
   4. Building Tamil Spellchecker – Day 4 – Shall we collect ALL Tamil
   Words?
   <https://goinggnu.wordpress.com/2020/05/25/building-tamil-spellchecker-day-4-shall-we-collect-all-tamil-words/>
   5. Building Tamil Spellchecker – Day 5 – started collecting ALL Tamil
   Words
   <https://goinggnu.wordpress.com/2020/05/26/building-tamil-spellchecker-day-5-started-collecting-all-tamil-words/>
   6. Building Open Source Tamil Spellchecker – Day 6 – How fast is bloom
   filter for 24 lakh words?
   <https://goinggnu.wordpress.com/2020/05/27/building-open-source-tamil-spellchecker-day-6-how-fast-is-bloom-filter/>
   7. Building Open Source Tamil Spellchecker – Day 7 – Scrapping websites
   to get more words
   <https://goinggnu.wordpress.com/2020/06/04/building-open-source-tamil-spellchecker-day-7-scrapping-websites-to-get-more-words/>
   8. Building Open Source Tamil Spellchecker – Day 8 – Porting from C# to
   Python
   <https://goinggnu.wordpress.com/2020/08/29/building-open-source-tamil-spellchecker-day-8-porting-from-c-to-python/>