More usenet usage statistics, by programming language

Carl Banks imbosol at vt.edu
Sat Jan 25 03:37:55 EST 2003


Aaron K. Johnson wrote:
> In message <%djY9.51$Xw5.38 at nwrddc04.gnilink.net>, Carl Banks wrote:
>> Aaron K. Johnson wrote:
>> > In message <v339gg9p1rlb3e at news.supernews.com>, "John Roth" wrote:
>> >> 
>> >> I don't understand. Number of unique posters in the last 200 posts to a
>> >> newsgroup I understand,
>> >> and 647 to the (one) Python newsgroup I understand, but I don't
>> >> understand how you get
>> >> 647 different posters out of the last 200 posts.
>> >> 
>> >> Oh, and Clipper is an old data base language, somewhere in the dbase
>> >> family.
>> > 
>> > oops, sorry....I meant 2000!
>> 
>> 
>> Ok, then how do you account for 3715 different posters for Java?
> 
> I should have explained....each comp.lang.x.subgroup hierarchy gets
> totalled together as one group. So comp.lang.java.moderated,
> comp.lang.java.whatever get added together.....which makes me
> realize, I should make sure there are no cross-posts....back to the
> drawing board.

This approach has more problems than just the possibility of
cross-posting.  As I expected, you took 2000 articles from each of the
six(?) Java newsgroups, meaning that the total you gave for Java is
from a sample of 12000 (or whatever) posts, but the total you gave for
Python is from a sample of only 2000 posts.  With a sample that much
larger, of course Java's going to show more unique posters.
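
To illustrate the sampling problem, here is a minimal Python sketch of
a fairer count.  It is not the survey's script (which isn't shown in
this thread); the (message_id, poster, timestamp) records and the name
unique_posters are hypothetical, made up for the example.  The idea is
to pool every group in a hierarchy, drop cross-posted duplicates by
Message-ID, and then cut every language to the same sample size.

```python
def unique_posters(posts, sample_size):
    """Count distinct posters among the newest `sample_size` posts.

    `posts` is an iterable of (message_id, poster, timestamp) tuples
    pooled from every newsgroup in one comp.lang.x hierarchy.
    (Hypothetical record layout, assumed for illustration.)
    """
    # A cross-posted article shows up once per group with the same
    # Message-ID, so keep only the first copy seen.
    by_id = {}
    for message_id, poster, timestamp in posts:
        by_id.setdefault(message_id, (poster, timestamp))

    # Newest first, then truncate to the common sample size.
    newest = sorted(by_id.values(), key=lambda p: p[1], reverse=True)
    return len({poster for poster, _ in newest[:sample_size]})


# Made-up data: "<2@x>" was cross-posted to two groups.
java_posts = [
    ("<1@x>", "alice", 10),
    ("<2@x>", "bob", 11),
    ("<2@x>", "bob", 11),
    ("<3@x>", "carol", 12),
    ("<4@x>", "alice", 13),
]
print(unique_posters(java_posts, 3))
```

Using the same sample_size for every language at least makes the
totals comparable, though it does nothing about the other objections
below.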



>> > survey begins here:
>> > 
>> > java 3715
>> > (c/c++ taken together) 1511
>> > basic 1292
>> > perl 1240
>> > c++ 953
>> > pascal 905
>> 
>> 
>> FWIW, N different posters in the last M posts is an irrelevant
>> statistic for me.  N different posters in the X period of time might
>> mean something--it certainly correlates somewhat to actual
>> popularity--but there are too many factors involved to get anything
>> better than a vague estimate from it.
> 
> I'm just going for 'less and less vague each time'. Any more suggestions?

I suggest that even a theoretically perfect measurement of newsgroup
posting activity will still only be a vague (and skewed) estimate of
language popularity.  Vague because there are a lot of factors that
can contribute to higher or lower newsgroup volume; skewed because
these factors don't affect all languages equally.
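
For what it's worth, the "N different posters in period X" count
quoted above could be sketched like this.  Again the record layout
(poster, datetime) and the function name are assumptions for
illustration only, and the result is still subject to everything I
just said about vagueness and skew.

```python
from datetime import datetime


def posters_in_window(posts, start, end):
    """Count distinct posters with at least one post in [start, end).

    `posts` is an iterable of (poster, when) pairs, where `when` is a
    datetime.  (Hypothetical record layout, assumed for illustration.)
    """
    return len({poster for poster, when in posts if start <= when < end})


# Made-up data for one newsgroup.
posts = [
    ("alice", datetime(2003, 1, 3)),
    ("bob", datetime(2003, 1, 10)),
    ("alice", datetime(2003, 1, 20)),
    ("carol", datetime(2003, 2, 2)),
]
january = posters_in_window(posts, datetime(2003, 1, 1), datetime(2003, 2, 1))
print(january)
```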


-- 
CARL BANKS
