From jeffrey.fischer at gmail.com  Fri Sep 16 16:07:46 2022
From: jeffrey.fischer at gmail.com (Jeff Fischer)
Date: Fri, 16 Sep 2022 13:07:46 -0700
Subject: [Baypiggies] BayPiggies meeting next Thursday (Sept 22): Debugging,
 Scraping, and NLP
Message-ID: <CAPFNg_2q6PSxxgW9bc84+Z0ShRU680BWS-uj5p7XWahtakqaZQ@mail.gmail.com>

BayPiggies Sept 22, 2022 7:00 pm - 8:30 pm PDT (online)
This month, we'll have a lightning talk from Ryan Kuhl on debugging and a
full talk from Stephen McInerney on Web scraping and NLP. We hope that you
can join us!

*Lightning Talk: Debugging with ipdb*
*Speaker:* Ryan Kuhl
*Speaker Bio:*
Ryan is a Miami-based software engineer at Tatari, co-founder of Public
Sector ML, and student at Georgia Institute of Technology. Ryan has been
programming professionally with Python for 9 years and loves to build
performant APIs and chunky SQL queries! When not programming for work he's
studying machine learning and quantum computing. Connect to Ryan via email
at ryan at kuhl.dev, LinkedIn at linkedin.com/in/kuhl or GitHub at
GitHub.com/lame.

*Main Talk: NLP, Topic Modeling and Scraping of conference talks to find
which topics are hot and not*
*Speaker:* Stephen McInerney
NLP (Natural Language Processing) and Topic Modeling are subdomains of
Machine Learning which are core technologies for Python data scientists;
and the automated collection of data by Scraping (in a TOS-compliant,
ethical way) is a rarely-discussed practice. Outline:

   - Review the basic steps, present a typical pipeline for
   Scraping+NLP+Topic Modeling and cover packages used
   - As a motivating example, we investigate changes in Python conference
   topics 2016-2022, and statistically extract conclusions on what's hot and
   not, as of 2022
   - We also handle foreign-language abstracts and outline how machine
   translation can be used for Topic Modeling
   - We illustrate best practices in Scraping on text data, maximally
   preserving and augmenting with metadata
   - Review the basic steps, present a typical pipeline (segmentation,
   handling Unicode, Levenshtein distance, word-vectors, Transformer, NER, IE).
   - Overview of related NLP/ML/Deep Learning packages we use both for
   prototyping and production.
   - Topic Modeling using LDA is a highly iterative clustering process to
   "learn" which topics seem to be similar/related/identical/different
   - In this specific case, we augment conference abstracts with whatever
   metadata is helpful to topic-modeling e.g. speaker interests, affiliation,
   links to Twitter
   - Example: "token" means an entirely different topic when it co-occurs
   with "crypto"/"blockchain"/"web3" versus when it co-occurs with
   "API"/"authentication"/"appsec"/"2FA"/"Oauth". But how do we automatically
   learn hundreds and then thousands of such cases?
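
The "token" example above can be illustrated with a toy co-occurrence check.
This is only a minimal sketch, not the speaker's method: the cue-word lists
are copied by hand from the outline, and the names CONTEXTS and guess_sense
are hypothetical. Topic modeling aims to learn such groupings from data
instead of hard-coding them.

```python
# Hand-written cue words (an assumption for illustration only).
CONTEXTS = {
    "crypto": {"crypto", "blockchain", "web3"},
    "security": {"api", "authentication", "appsec", "2fa", "oauth"},
}


def guess_sense(abstract):
    """Pick the context whose cue words overlap most with the abstract."""
    # Lowercase each word and strip surrounding punctuation.
    words = {w.strip('.,;:!?"()').lower() for w in abstract.split()}
    # Score each context by how many of its cue words appear in the text.
    scores = {sense: len(words & cues) for sense, cues in CONTEXTS.items()}
    best = max(scores, key=scores.get)
    # With no cue words present, the sense of "token" stays undecided.
    return best if scores[best] > 0 else "unknown"


print(guess_sense("Issuing a token on the blockchain for web3 apps"))
print(guess_sense("Refreshing an OAuth token for API authentication"))
```

Maintaining such lists by hand does not scale to hundreds or thousands of
ambiguous terms, which is the problem the outline poses.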

*Speaker Bio:* Stephen McInerney
Data scientist and NLP specialist for over a decade, specializing in
domain-specific (biotech/legal/financial) and multilingual NLP, in both
startups and large companies. Kaggle competitor; have led "Kaggle Together"
classes. Former Data Science co-chair of SF Bay Area ACM and organizer of
multiple Data Science Camps. Passionate about open-source.
www.linkedin.com/in/stephenmcinerney

*RSVP*
We will conduct the meeting via Zoom. To RSVP, go to
https://www.meetup.com/baypiggies/events/288471326/. When you RSVP "Yes" to
this event, the link to the Zoom meeting will become visible in MeetUp.

*Code of Conduct*
https://baypiggies.net/pages/code_of_conduct.html
Interactions online have less nuance than in-person interactions. Please be
Open, Considerate and Respectful. Also, please refrain from discussing
topics unrelated to the Python community or the technical content of the
meeting.

From jeffrey.fischer at gmail.com  Mon Sep 19 16:55:37 2022
From: jeffrey.fischer at gmail.com (Jeff Fischer)
Date: Mon, 19 Sep 2022 13:55:37 -0700
Subject: [Baypiggies] An upcoming talk on "productionizing Pandas"
Message-ID: <CAPFNg_3W2Tg=iNtF=f7GQfeyzGd8Gxm4_ECmx1FNin7MHhnt1g@mail.gmail.com>

Hi everyone,
Tomorrow (Tuesday) night, an engineering team from my company (C3.ai) is
giving a talk on technology they developed to use Pandas as the basis of
production machine learning (rather than rewriting to something like
SparkSQL). If you are interested, here is the link on Meetup:
https://www.meetup.com/c3-ai-enterprise-ai/events/288213225/

Thanks,
Jeff

From glen at glenjarvis.com  Fri Sep 23 00:30:40 2022
From: glen at glenjarvis.com (Glen Jarvis)
Date: Fri, 23 Sep 2022 04:30:40 +0000
Subject: [Baypiggies] Zoom bombing / apologies
Message-ID: <JASbx5B1Szoj_YH27aOdmzpzFluz12LguaIHh8JmODS_XQ1ycYMqrZpehBLxPi2ZsLguSXCXr7m7N-LMRfuwpYxvGqmvXs6jDk0zH_kOUgQ=@glenjarvis.com>

There was an individual who Zoom bombed us tonight for our meeting. I'm usually good at muting stray microphones, kicking bad users (usually before they get disruptive), spotlighting the speakers so their camera shows on the video, etc.

But, whoever was doing this Zoom bomb was able to elude me, unfortunately. They masked their activity as another user (so it was harder to kick them), they were able to get audio when it was disabled, etc. I also was removing the screen annotations as soon as they were being put up -- but, they were able to keep putting them up.

I want to deeply apologize as, at least once, there was something written with a Zoom annotation that wasn't just juvenile but was offensive. We ended the meeting early.

Why don't we use Webinar Format?

Because many of our members originally did not like the idea of registering their identity just to attend a meeting, as well as signing NDAs, when we were in the physical world, I've been trying to respect that as much as possible in the virtual world. Real meetings are also more interactive and engaging.

However, because of this event, we may be forced to require registrations and go back to Webinar format. I have an open ticket with Zoom support (#15460891) for a root cause analysis and security suggestions.

It is always a struggle to strike that real balance between a completely open environment and enforcing good behavior. Some of our original open source systems were high trust and assumed good behavior. Rarely was it wrong.

Kindest Regards,

Glen Jarvis

From glen at glenjarvis.com  Fri Sep 23 14:08:58 2022
From: glen at glenjarvis.com (Glen Jarvis)
Date: Fri, 23 Sep 2022 18:08:58 +0000
Subject: [Baypiggies] Zoom bombing / apologies
In-Reply-To: <JASbx5B1Szoj_YH27aOdmzpzFluz12LguaIHh8JmODS_XQ1ycYMqrZpehBLxPi2ZsLguSXCXr7m7N-LMRfuwpYxvGqmvXs6jDk0zH_kOUgQ=@glenjarvis.com>
References: <JASbx5B1Szoj_YH27aOdmzpzFluz12LguaIHh8JmODS_XQ1ycYMqrZpehBLxPi2ZsLguSXCXr7m7N-LMRfuwpYxvGqmvXs6jDk0zH_kOUgQ=@glenjarvis.com>
Message-ID: <3Adspyq4BSGTGXKT8jSogYrGGYEBEa2pH3rhKLMCJWgkidOCY7ZW8FI2brX_wD1Q0BJG8nLg9A6axyrgAMaz89aPqshDalBLlLewsSIxAQc=@glenjarvis.com>

After this was all over last night, trying to make sure the speakers were okay, trying to make sure the other organizers were okay, trying to make sure the audience was okay, reviewing security settings, opening tickets with Zoom, etc., I realized that I wasn't feeling so hot myself.

This morning, I picked up an old classic that I love, "Daring Greatly", and I suddenly remembered "The Man in the Arena" speech. I err and come up short again and again. But, I do it daring greatly :)

I've modified it to be gender neutral:

> "It is not the critic who counts, not the one who points out how the strong person stumbled or how the doer of deeds might have done them better. The credit belongs to the one who is actually in the arena, whose face is marred with sweat and dust and blood; who strives valiantly; who errs and comes up short again and again; who knows the great enthusiasms, the great devotions, and spends oneself in a worthy cause; who, if he or she wins, knows the triumph of high achievement; and who, if he or she fails, at least fails while daring greatly, so that his or her place shall never be with those cold and timid souls who know neither victory nor defeat."

Kindest Regards,

Glen
