More usenet usage statistics, by programming language 

Hello all,

For what it's worth, I've updated my usenet language stats (interpret them how
you will).

This time, the data reflects the actual number of DIFFERENT posters to each
group in the last 200 posts to each group, as opposed to just the total posts
to the group.
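
A minimal sketch of the "distinct posters in the last N posts" count, for anyone who wants to reproduce it. It assumes the From headers of a group are already available as a list, oldest first; the function name and sample addresses are hypothetical, and this is not the script that produced the numbers below.

```python
def count_unique_posters(authors, last_n):
    """Count distinct authors among the most recent `last_n` posts.

    `authors` is a list of From-header strings, oldest first (an assumption).
    """
    # Guard against last_n == 0, because authors[-0:] would return everything.
    recent = authors[-last_n:] if last_n > 0 else []
    # Normalize case and whitespace so trivial variants of one address match.
    return len({a.strip().lower() for a in recent})


sample = ["ann@example.com", "bob@example.com", "Ann@example.com ", "cy@example.com"]
print(count_unique_posters(sample, 3))
```

Matching on the raw From header is a simplification: one person posting under two addresses would still be counted twice.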

I think this gives a better idea of how many programmers, relative to the whole
population of programmers, use each language. There are some surprises. Like,
what in the hell is Clipper?

You may also decide to put c/c++ together, in which case the total is 1511,
right under java (which makes intuitive sense).

Best,
Aaron.

survey begins here:

java 3715
(c/c++ taken together) 1511
basic 1292
perl 1240
c++ 953
pascal 905
clipper 690
python 647
labview 579
javascript 570
c 558
clarion 491
tcl 443
asm 381
scheme 361
vhdl 359
fortran 341
lisp 325
ruby 315
cobol 310
postscript 308
ada 267
smalltalk 262
prolog 231
functional 230
idl-pvwave 226
forth 223
verilog 220
awk 205
vrml 151
rexx 142
apl 139
mumps 124
pl1 117
objective-c 106
misc 105
eiffel 99
logo 78
ml 72
asm370 63
dylan 37
modula3 30
oberon 23
modula2 22
pop 19
icon 13
idl 12
limbo 5
clos 4
prograph 2
clu 2
visual 0
sather 0
hermes 0
comp 0
beta 0



Wed, 13 Jul 2005 04:26:31 GMT  
 More usenet usage statistics, by programming language



Quote:

> Hello all,

> For what it's worth, I've updated my usenet language stats (interpret them how
> you will)

> This time, the data reflects the actual number of DIFFERENT posters to each
> group in the last 200 posts to each group, as opposed to just the total posts
> to the group.

> I think this gives a better idea about how many people relative to population
> of programmers use each language. There are some surprises. Like, what in the
> hell is Clipper?

> You may also decide to put c/c++ together, in which case the total is 1511,
> right under java (which makes intuitive sense)

> Best,
> Aaron.

> survey begins here:

> java 3715
> (c/c++ taken together) 1511
> basic 1292
> perl 1240
> c++ 953
> pascal 905
> clipper 690
> python 647
> labview 579
> javascript 570
> c 558
> clarion 491
> tcl 443
> asm 381
> scheme 361
> vhdl 359
> fortran 341
> lisp 325
> ruby 315
> cobol 310
> postscript 308
> ada 267
> smalltalk 262
> prolog 231
> functional 230
> idl-pvwave 226
> forth 223
> verilog 220
> awk 205
> vrml 151
> rexx 142
> apl 139
> mumps 124
> pl1 117
> objective-c 106
> misc 105
> eiffel 99
> logo 78
> ml 72
> asm370 63
> dylan 37
> modula3 30
> oberon 23
> modula2 22
> pop 19
> icon 13
> idl 12
> limbo 5
> clos 4
> prograph 2
> clu 2
> visual 0
> sather 0
> hermes 0
> comp 0
> beta 0

I don't understand. Number of unique posters in the last 200 posts to a
newsgroup I understand, and 647 to the (one) Python newsgroup I understand,
but I don't understand how you get 647 different posters out of the last
200 posts.

Oh, and Clipper is an old database language, somewhere in the dBase
family.

John Roth



Wed, 13 Jul 2005 04:47:21 GMT  
 More usenet usage statistics, by programming language

Quote:

> I don't understand. Number of unique posters in the last 200 posts to a
> newsgroup I understand,
> and 647 to the (one) Python newsgroup I understand, but I don't
> understand how you get
> 647 different posters out of the last 200 posts.

> Oh, and Clipper is an old data base language, somewhere in the dbase
> family.

oops, sorry....I meant 2000!


Wed, 13 Jul 2005 05:37:40 GMT  
 More usenet usage statistics, by programming language

Quote:


> > I don't understand. Number of unique posters in the last 200 posts to a
> > newsgroup I understand,
> > and 647 to the (one) Python newsgroup I understand, but I don't
> > understand how you get
> > 647 different posters out of the last 200 posts.

> > Oh, and Clipper is an old data base language, somewhere in the dbase
> > family.

> oops, sorry....I meant 2000!

So, among other problems, this means if a given newsgroup had a single
large thread with a half-dozen regulars posting ten times each in a
big argument, that particular language would appear less popular...

-Peter



Wed, 13 Jul 2005 06:14:35 GMT  
 More usenet usage statistics, by programming language

Quote:


>> I don't understand. Number of unique posters in the last 200 posts to a
>> newsgroup I understand,
>> and 647 to the (one) Python newsgroup I understand, but I don't
>> understand how you get
>> 647 different posters out of the last 200 posts.

>> Oh, and Clipper is an old data base language, somewhere in the dbase
>> family.

> oops, sorry....I meant 2000!

Ok, then how do you account for 3715 different posters for Java:

Quote:
> survey begins here:

> java 3715
> (c/c++ taken together) 1511
> basic 1292
> perl 1240
> c++ 953
> pascal 905

FWIW, N different posters in the last M posts is an irrelevant
statistic for me.  N different posters in the X period of time might
mean something--it certainly correlates somewhat to actual
popularity--but there are too many factors involved to get anything
better than a vague estimate from it.

--
CARL BANKS



Wed, 13 Jul 2005 06:36:43 GMT  
 More usenet usage statistics, by programming language

Quote:

> prograph 2
> clu 2
> visual 0
> sather 0
> hermes 0
> comp 0
> beta 0

Another question: how can there be 0 different posters in the last
2000 posts in comp.lang.beta?  It seems to me that, irrespective of
this being a very rough indicator of language popularity, your method
has some very grave flaws in it.  (That, or you've utterly failed to
explain how it works.)

Unless you can justify your methodology better, I would say these
numbers are useless even for a rough (and skewed) estimate.

--
CARL BANKS



Wed, 13 Jul 2005 06:36:43 GMT  
 More usenet usage statistics, by programming language

Quote:


> > I don't understand. Number of unique posters in the last
> > 200 posts to a newsgroup I understand,
> > and 647 to the (one) Python newsgroup I understand, but I don't
> > understand how you get
> > 647 different posters out of the last 200 posts.

> > Oh, and Clipper is an old data base language, somewhere in the dbase
> > family.

> oops, sorry....I meant 2000!

Which begs the question...: How does Java get 3715 posters in 2000
messages?

lies-damned-lies-and-statistics'ly y'rs
-- bjorn



Wed, 13 Jul 2005 06:09:51 GMT  
 More usenet usage statistics, by programming language

Quote:


> >> I don't understand. Number of unique posters in the last 200 posts to a
> >> newsgroup I understand,
> >> and 647 to the (one) Python newsgroup I understand, but I don't
> >> understand how you get
> >> 647 different posters out of the last 200 posts.

> >> Oh, and Clipper is an old data base language, somewhere in the dbase
> >> family.

> > oops, sorry....I meant 2000!

> Ok, then how do you account for 3715 different posters for Java:

I should have explained: each comp.lang.x.subgroup hierarchy gets totalled
together as one group. So comp.lang.java.moderated, comp.lang.java.whatever,
and so on get added together... which makes me realize I should make sure
there are no cross-posts. Back to the drawing board.
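
To make the double-counting concrete, here is a small sketch, assuming the authors of each subgroup have already been collected into lists. The group names and the function are hypothetical. Summing per-group counts counts a cross-poster once per subgroup, while taking the union of the author sets counts each person once for the whole hierarchy.

```python
def summed_vs_merged(groups):
    """`groups` maps a newsgroup name to the list of author strings seen in it.

    Returns (sum of per-group unique counts, unique count over the hierarchy).
    """
    per_group = {
        name: {a.strip().lower() for a in authors}
        for name, authors in groups.items()
    }
    summed = sum(len(s) for s in per_group.values())
    # The union collapses anyone who appears in more than one subgroup.
    merged = len(set().union(*per_group.values()))
    return summed, merged


hierarchy = {
    "comp.lang.x.one": ["ann", "bob"],
    "comp.lang.x.two": ["Ann", "cy"],  # ann posts (or cross-posts) here too
}
print(summed_vs_merged(hierarchy))
```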

Quote:
> > survey begins here:

> > java 3715
> > (c/c++ taken together) 1511
> > basic 1292
> > perl 1240
> > c++ 953
> > pascal 905

> FWIW, N different posters in the last M posts is an irrelevant
> statistic for me.  N different posters in the X period of time might
> mean something--it certainly correlates somewhat to actual
> popularity--but there are too many factors involved to get anything
> better than a vague estimate from it.

I'm just going for 'less and less vague each time'. Any more suggestions?
It'll take some tinkering with my code, but I might be able to get that
date-based evaluation soon. I'll let you know.


Wed, 13 Jul 2005 08:48:25 GMT  
 More usenet usage statistics, by programming language

Quote:


> > > I don't understand. Number of unique posters in the last 200 posts to a
> > > newsgroup I understand,
> > > and 647 to the (one) Python newsgroup I understand, but I don't
> > > understand how you get
> > > 647 different posters out of the last 200 posts.

> > > Oh, and Clipper is an old data base language, somewhere in the dbase
> > > family.

> > oops, sorry....I meant 2000!

> So, among other problems, this means if a given newsgroup had a single
> large thread with a half-dozen regulars posting ten times each in a
> big argument, that particular language would appear less popular...

> -Peter

Peter,

Each poster is counted only once.

-Aaron.



Wed, 13 Jul 2005 08:43:05 GMT  
 More usenet usage statistics, by programming language

Quote:




> > > > I don't understand. Number of unique posters in the last 200 posts to a
> > > > newsgroup I understand,
> > > > and 647 to the (one) Python newsgroup I understand, but I don't
> > > > understand how you get
> > > > 647 different posters out of the last 200 posts.

> > > > Oh, and Clipper is an old data base language, somewhere in the dbase
> > > > family.

> > > oops, sorry....I meant 2000!

> > So, among other problems, this means if a given newsgroup had a single
> > large thread with a half-dozen regulars posting ten times each in a
> > big argument, that particular language would appear less popular...

> > -Peter

> Peter,

> Each poster is counted only once.

I understand that most basic point.  Let me try out an example
to help clarify *my point*.

There are 2000 posts retrieved from comp.lang.noisy.  There is
a recent thread involving five people who each contributed 201
messages.  That means 1000 of those 2000 messages are eliminated
instantly by your filtering of non-unique posters.  That leaves
only 995 posts from which to measure the number of unique
authors, aside from these prolific five.
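
The effect can be checked with a tiny simulation. This is only a sketch with made-up poster names: both groups supply exactly 2000 posts, but in the noisy one five regulars account for 201 posts each.

```python
def unique_in_window(authors, window=2000):
    """Distinct authors among the last `window` posts (oldest first)."""
    return len(set(authors[-window:]))


# Quiet group: every one of the 2000 posts comes from a different person.
quiet = [f"poster{i}" for i in range(2000)]

# Noisy group: 995 one-off posters, then five regulars posting 201 times each.
noisy = [f"poster{i}" for i in range(995)]
noisy += [f"regular{i % 5}" for i in range(5 * 201)]

print(unique_in_window(quiet))  # 2000
print(unique_in_window(noisy))  # 1000
```

Same window size, half the count, purely because one long thread used up half the sample.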

Does that help?  The comments about needing to examine across
a fixed duration are probably reasonable...

-Peter



Wed, 13 Jul 2005 11:17:14 GMT  
 More usenet usage statistics, by programming language

Quote:




> > > > > I don't understand. Number of unique posters in the last 200 posts to a
> > > > > newsgroup I understand,
> > > > > and 647 to the (one) Python newsgroup I understand, but I don't
> > > > > understand how you get
> > > > > 647 different posters out of the last 200 posts.

> > > > > Oh, and Clipper is an old data base language, somewhere in the dbase
> > > > > family.

> > > > oops, sorry....I meant 2000!

> > > So, among other problems, this means if a given newsgroup had a single
> > > large thread with a half-dozen regulars posting ten times each in a
> > > big argument, that particular language would appear less popular...

> > > -Peter

> > Peter,

> > Each poster is counted only once.

> I understand that most basic point.  Let me try out an example
> to help clarify *my point*.

> There are 2000 posts retrieved from comp.lang.noisy.  There is
> a recent thread involving five people who each contributed 201
> messages.  That means 1000 of those 2000 messages are eliminated
> instantly by your filtering of non-unique posters.  That leaves
> only 1000 posts from which to measure the number of unique
> authors, aside from these prolific five.

> Does that help?  The comments about needing to examine across
> a fixed duration are probably reasonable...

> -Peter

Yes, thanks.

I'm now exploring some features of nntplib, in particular xhdr, to find and
organize date data. Your comments have been helpful! I'll post the results as I
complete the work!
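
A rough sketch of how From and Date headers might be paired up with xhdr. It assumes `conn` behaves like an `nntplib.NNTP` connection, where `group()` returns `(response, count, first, last, name)` and `xhdr()` returns `(response, [(article_number, header_text), ...])`. The function name is hypothetical, and the code deliberately does not import nntplib, since that module was removed from the standard library in Python 3.13.

```python
from email.utils import parsedate_to_datetime


def fetch_authors_and_dates(conn, group):
    """Return [(author, posted_at)] for the articles currently in `group`.

    `conn` is assumed to offer nntplib-style group() and xhdr() methods.
    """
    _, _, first, last, _ = conn.group(group)
    span = f"{first}-{last}"
    _, from_lines = conn.xhdr("from", span)
    _, date_lines = conn.xhdr("date", span)

    dates = {}
    for number, text in date_lines:
        try:
            dates[number] = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            continue  # skip articles whose Date header cannot be parsed

    # Keep only articles for which both headers were usable.
    return [(author, dates[number]) for number, author in from_lines
            if number in dates]
```

The resulting (author, date) pairs are enough to count posters over a date range instead of over a fixed number of posts.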

Thanks,
Aaron.



Wed, 13 Jul 2005 13:28:30 GMT  
 More usenet usage statistics, by programming language

Quote:




>> >> I don't understand. Number of unique posters in the last 200 posts to a
>> >> newsgroup I understand,
>> >> and 647 to the (one) Python newsgroup I understand, but I don't
>> >> understand how you get
>> >> 647 different posters out of the last 200 posts.

>> >> Oh, and Clipper is an old data base language, somewhere in the dbase
>> >> family.

>> > oops, sorry....I meant 2000!

>> Ok, then how do you account for 3715 different posters for Java:

> I should have explained....each comp.lang.x.subgroup hierarchy gets
> totalled together as one group. So comp.lang.java.moderated,
> comp.lang.java.whatever get added together.....which makes me
> realize, I should make sure there are no-cross posts....back to the
> drawing board.

This approach has more problems than just the possibility of
cross-posting.  As I expected, you considered 2000 articles from all
six(?) Java newsgroups, meaning that the total you gave for Java is
from a sample of 12000 (or whatever) posts, but the total you gave for
Python is from a sample of only 2000 posts.  Of course Java's going to
have more unique posters then.
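
One way to give every language the same sample size is to pool a hierarchy's articles first and only then take the most recent N. The sketch below assumes each article is available as a (message_id, author, posted_at) tuple; the names and data are hypothetical.

```python
def unique_posters_pooled(groups, sample_size):
    """Distinct authors in the newest `sample_size` articles of a hierarchy.

    `groups` maps newsgroup name -> list of (message_id, author, posted_at).
    """
    if sample_size <= 0:
        return 0
    pooled = {}
    for articles in groups.values():
        for message_id, author, posted_at in articles:
            # Keying on Message-ID keeps a cross-posted article only once.
            pooled[message_id] = (posted_at, author)
    newest = sorted(pooled.values())[-sample_size:]
    return len({author for _, author in newest})


demo = {
    "comp.lang.x.one": [("<1>", "ann", 1), ("<2>", "bob", 2)],
    "comp.lang.x.two": [("<2>", "bob", 2), ("<3>", "cy", 3), ("<4>", "ann", 4)],
}
print(unique_posters_pooled(demo, 2))
```

With this, a six-group hierarchy and a single-group language are both judged on the same number of posts.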


Quote:
>> > survey begins here:

>> > java 3715
>> > (c/c++ taken together) 1511
>> > basic 1292
>> > perl 1240
>> > c++ 953
>> > pascal 905

>> FWIW, N different posters in the last M posts is an irrelevant
>> statistic for me.  N different posters in the X period of time might
>> mean something--it certainly correlates somewhat to actual
>> popularity--but there are too many factors involved to get anything
>> better than a vague estimate from it.

> I'm just going for 'less and less vague each time'. Any more suggestions?

I suggest that even a theoretically perfect measurement of newsgroup
posting activity will still only be a vague (and skewed) estimate of
language popularity.  Vague because there are a lot of factors that
can contribute to higher or lower newsgroup volume; skewed because
these factors don't affect all languages equally.

--
CARL BANKS



Wed, 13 Jul 2005 16:37:55 GMT  
 More usenet usage statistics, by programming language
Quote:

> Hello all,

> For what it's worth, I've updated my usenet language stats (interpret them how
> you will)

> This time, the data reflects the actual number of DIFFERENT posters to each
> group in the last 200 posts to each group, as opposed to just the total posts
> to the group.

<snip>
This is very interesting to me.  I can suggest an improvement, if you
want to spend some more time on this.  Instead of using the last 2000
posts, use the last week or month.  A popular language will get 2000
posts much faster than an unpopular language.
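
A minimal sketch of the time-window variant, assuming each post is an (author, posted_at) pair with `posted_at` as a `datetime`. The function name, the 30-day default, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def unique_posters_since(posts, now, days=30):
    """Distinct authors who posted within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return len({author for author, posted_at in posts
                if cutoff <= posted_at <= now})


now = datetime(2002, 12, 31)
posts = [
    ("ann", datetime(2002, 12, 30)),
    ("bob", datetime(2002, 12, 10)),
    ("ann", datetime(2002, 12, 29)),
    ("cy", datetime(2002, 10, 1)),
]
print(unique_posters_since(posts, now, days=30))
```

A busy group and a quiet group are then compared over the same stretch of calendar time, however many posts that happens to be.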

I hope you redo it in this way.

z

--
"When one enters an unplowed field he is sure to leave furrows, even if
his plow is not of the sharpest." - Timin



Thu, 14 Jul 2005 08:26:56 GMT  
 More usenet usage statistics, by programming language

Quote:


>> Hello all,

>> For what it's worth, I've updated my usenet language stats (interpret them how
>> you will)

>> This time, the data reflects the actual number of DIFFERENT posters to each
>> group in the last 200 posts to each group, as opposed to just the total posts
>> to the group.
><snip>
>This is very interesting to me.  I can suggest an improvement, if you
>want to spend some more time on this.  Instead of using the last 2000
>posts, use the last week or month.  A popular language will get 2000
>posts much faster than an unpopular language.

>I hope you redo it in this way.

Actually, I think a better measure might be counting distinct posters with a
minimum of two posts. This would indicate interest (e.g., dialog or more than
one question) more strongly than a single post, IWT. A table including both
would be interesting. I agree on time interval vs flat latest count also.
I guess you might expect spikes around significant announcements (good or bad)
which could skew results from different languages.
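
A small sketch of the "both columns" table, assuming a flat list of author names for one group. The function name and dictionary keys are hypothetical.

```python
from collections import Counter


def poster_table(authors, min_posts=2):
    """Count all distinct posters and those with at least `min_posts` posts."""
    counts = Counter(authors)
    repeat = sum(1 for n in counts.values() if n >= min_posts)
    return {"all_posters": len(counts), "repeat_posters": repeat}


print(poster_table(["ann", "bob", "ann", "cy", "bob", "ann"]))
```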

Another interesting measure might be the aggregate volume of unquoted text vs total,
and those numbers divided by number of unique posters, for average level of interest.
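
The volume measure could be sketched roughly as follows, assuming article bodies are available as plain strings and that any line starting with ">" is quoted text. That heuristic, and the function names, are assumptions; signatures and other quoting styles are not handled.

```python
def text_volume(body):
    """Return (unquoted_chars, total_chars) for one article body."""
    unquoted = total = 0
    for line in body.splitlines():
        n = len(line.strip())
        total += n
        if not line.lstrip().startswith(">"):
            unquoted += n
    return unquoted, total


def volume_per_poster(articles):
    """`articles` is a list of (author, body); returns per-poster averages."""
    posters = {author for author, _ in articles}
    if not posters:
        return 0.0, 0.0
    unquoted = total = 0
    for _, body in articles:
        u, t = text_volume(body)
        unquoted += u
        total += t
    return unquoted / len(posters), total / len(posters)


print(volume_per_poster([("ann", "> hi\nhello"), ("bob", "yo")]))
```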

Regards,
Bengt Richter



Thu, 14 Jul 2005 10:02:02 GMT  
 More usenet usage statistics, by programming language

Quote:
> Actually, I think a better measure might be counting distinct posters with a
> minimum of two posts. This would indicate interest (e.g., dialog or more than
> one question) more strongly than a single post, IWT. A table including both
> would be interesting. I agree on time interval vs flat latest count also.
> I guess you might expect spikes around significant announcements (good or bad)
> which could skew results from different languages.

> Another interesting measure might be the aggregate volume of unquoted text vs total,
> and those numbers divided by number of unique posters, for average level of interest.

> Regards,
> Bengt Richter

I already corrected the script to use a fixed time window rather than a fixed
number of posts.

See the results for December in the 'December 2002 comp.lang.* stats' thread.

Your other ideas are interesting...thanks....

Best,
Aaron.



Thu, 14 Jul 2005 10:23:31 GMT  
 