Quantifying Popular Programming Languages


November, 2003: Quantifying Popular Programming Languages

Thomas has authored four books on C, and coauthored Efficient C (with Jim Brodie) and C++ Programming Guidelines (with Daniel Saks). Plum Hall (his company) provides test suites for C, C++, Java, and C#. Thomas can be contacted at tplum@plumhall.com.


For any number of reasons, people always want to know the relative popularity of various programming languages. To provide one measure of language popularity, we focused upon one publicly available and objective measurement—the number of web-based job offers that specify requirements for different programming languages. Our most recent analysis covered the 12-month period from July 2002 to June 2003. We scanned job offers in the category "software" from several employment web sites, eliminating duplicates. Table 1 presents our results.

There are several technical issues: We eliminated duplicate offers on a monthly basis. We used case-insensitive matches. In counting "Java" requirements, we had to avoid false hits on "JAVASCRIPT" (obviously). We counted "J2EE," "J2SE," and "J2ME" as equivalent to "JAVA." We added "JAVASCRIPT," "JSCRIPT," and "ECMASCRIPT" together to make a "J*script" total. To exclude "VBA" and "VBSCRIPT" from the "Vbasic" total, we matched "VB followed by any letter except A or S," adding that to the matches for "VISUAL BASIC." The total indicated as "Vbasic.net" counts job offers that matched "Vbasic" and also ".NET." We noticed false hits on "PASCAL" when the name "Pascale" appeared in the job offer, so we counted only "PASCAL followed by a nonletter." Several web sites could not properly handle "C#" as a lookup keyword, so we scanned the full text of all "software" offers and performed our own keyword search. The number of (nonduplicate) job offers per month varied, but was never fewer than 4000.
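The matching rules above can be sketched in a few lines of Python. This is only an illustration reconstructed from the prose: the survey's actual expressions are not given in the article, and the names PATTERNS and languages_in are hypothetical.

```python
import re

# Hypothetical patterns rebuilt from the prose rules; the survey's own
# expressions were not published in the article.
FLAGS = re.IGNORECASE  # the article states that matches were case-insensitive

PATTERNS = {
    # "JAVA" not followed by "SCRIPT", plus J2EE/J2SE/J2ME as equivalents.
    "Java": re.compile(r"JAVA(?!SCRIPT)|J2[ESM]E", FLAGS),
    # Three spellings added together into one "J*script" total.
    "J*script": re.compile(r"JAVASCRIPT|JSCRIPT|ECMASCRIPT", FLAGS),
    # "VB" followed by any letter except A or S (taken literally from the
    # prose), or the words "VISUAL BASIC".
    "Vbasic": re.compile(r"VB[B-RT-Z]|VISUAL BASIC", FLAGS),
    # "PASCAL" not followed by a letter, which skips the name "Pascale".
    "Pascal": re.compile(r"PASCAL(?![A-Z])", FLAGS),
    # Plain full-text search, since some sites could not look up "C#".
    "C#": re.compile(r"C#", FLAGS),
}


def languages_in(offer_text):
    """Return the set of language labels whose pattern occurs in one offer."""
    found = {name for name, pat in PATTERNS.items() if pat.search(offer_text)}
    # "Vbasic.net" counts offers that matched "Vbasic" and also ".NET".
    if "Vbasic" in found and re.search(r"\.NET", offer_text, FLAGS):
        found.add("Vbasic.net")
    return found
```

Read literally, the "VB followed by any letter" rule does not match "VB" followed by a digit or punctuation; whether the survey's own expression behaved that way is not stated in the article.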

Determining the percentages for "C" was the most challenging. Our first attempts used a simple regular expression like the other matches: "[^A-Za-z0-9]C[^.A-Za-z0-9#+]" (which means "the letter C, preceded by a character that isn't a letter or digit, and followed by a character that isn't a letter or a digit or a period or a sharp-sign or a plus"). We then visually scanned a month's data, finding that about 5 percent of the "C" hits were clearly not C programming jobs: "Bldg C," "Suite C," "Unit C," "A/C" (air conditioning?), "C-Level" ("C-level sales," "C-level executive"), "4.6.C," "C-1426," and so on. Therefore, we refined the match in several steps: we prefiltered "C-LANG" and "C-CODE" into plain "C," we prefiltered "C-SHARP" or "C SHARP" into "C#," we prefiltered "C ++" or "C PLUS PLUS" into "C++," then we excluded any match with a period before or after the "C," and we excluded any match with a hyphen after the "C."
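A minimal Python sketch of the refined "C" match follows, assuming the prefilters are plain text substitutions applied before the final expression. The names PREFILTERS, PLAIN_C, and mentions_plain_c are hypothetical, and padding the text with spaces (so that a "C" at the very start or end of an offer can still match) is an assumption, not something the article describes.

```python
import re

I = re.IGNORECASE

# Prefilters, applied in the order the article lists them (hypothetical
# reconstruction; the survey's real substitutions are not published).
PREFILTERS = [
    # "C-LANG..." and "C-CODE" become plain "C" followed by a space.
    (re.compile(r"\bC-(LANG|CODE)", I), r"C \1"),
    # "C-SHARP" or "C SHARP" becomes "C#".
    (re.compile(r"\bC[- ]SHARP\b", I), "C#"),
    # "C ++" or "C PLUS PLUS" becomes "C++".
    (re.compile(r"\bC \+\+|\bC PLUS PLUS\b", I), "C++"),
]

# The first-attempt expression with the two extra exclusions folded in:
# no period before the C, and no period or hyphen after it.
PLAIN_C = re.compile(r"[^.A-Za-z0-9]C[^-.A-Za-z0-9#+]", I)


def mentions_plain_c(offer_text):
    """Return True if the offer appears to require plain C."""
    text = f" {offer_text} "  # assumption: pad so edge-of-text hits can match
    for pattern, replacement in PREFILTERS:
        text = pattern.sub(replacement, text)
    return PLAIN_C.search(text) is not None
```

Under these assumptions, "C-1426," "C-level," and "4.6.C" are rejected, while a phrase such as "A/C" would still match, because neither the period rule nor the hyphen rule applies to it.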

We added another category—"C/C++"—which contained all the "C" cases that also contain "C++" somewhere (usually, but not always, the keyword "C/C++"). The "C/C++" percentage was usually more than half of the "C" percentage, and about half of the "C++" percentage.
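The relationship between the three categories can be illustrated with a short Python sketch. It assumes that duplicates are verbatim copies of an offer within one month and that a percentage is the share of unique offers in a category; the article does not spell out either detail, and the function name c_family_percentages is hypothetical.

```python
import re

I = re.IGNORECASE
# Refined plain-"C" expression as described in the text (reconstruction).
PLAIN_C = re.compile(r"[^.A-Za-z0-9]C[^-.A-Za-z0-9#+]", I)
CPP = re.compile(r"C\+\+", I)


def c_family_percentages(offers):
    """Return percentages of unique offers in the C, C++ and C/C++ categories."""
    unique = set(offers)  # assumption: duplicates are verbatim copies
    counts = {"C": 0, "C++": 0, "C/C++": 0}
    if not unique:
        return {name: 0.0 for name in counts}
    for offer in unique:
        text = f" {offer} "  # pad so a C at the edge of the text can match
        has_c = PLAIN_C.search(text) is not None
        has_cpp = CPP.search(text) is not None
        counts["C"] += has_c
        counts["C++"] += has_cpp
        # "C/C++": a plain-C hit in an offer that also contains "C++" somewhere.
        counts["C/C++"] += has_c and has_cpp
    return {name: 100.0 * n / len(unique) for name, n in counts.items()}
```

For example, given the made-up offers "C/C++ developer" (listed twice), "C++ and Java", "Embedded C and assembly", and "Perl scripting", the function counts four unique offers and reports 50 percent for "C", 50 percent for "C++", and 25 percent for "C/C++".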

From early reviewers, we received some comments on our methodology. One reader cautioned that percentages based on published job offers overlook those offers filled internally within the organization and that the percentages of jobs filled internally might be significantly different. We agree, but can't study internally filled jobs with our methodology. Other readers requested more languages. We've added all the languages requested so far; if you want more languages or more detailed analysis, just ask us.

If you believe that some publicly available job sites are biased in favor of, or against, any particular programming language, we would be grateful for your information. (As of today, we are unaware of any such biases.)

My special thanks to Doug Teeple (teeple@wi-fone.com) and John Breeden (jbreeden@plumhall.com), for valuable assistance with the survey software.

TPJ

