RationalWiki talk:Pages by popularity

So is the number roughly equal to the number of pageviews? 16:51, 15 November 2015 (UTC)
 * The number is literally the number of pageviews, at least coming in via the /wiki form of the URL. (Other forms are not counted, e.g. diffs, in-MediaWiki redirections, etc. So some might need to be hand-merged if you cared a lot about the count.) - David Gerard (talk) 19:58, 15 November 2015 (UTC)

I suggest we add Category:Entry point to the highest-up pages. Bicycle wheel  20:16, 15 November 2015 (UTC)
 * Yeah, that's what that cat is for: pages in need of attention and improvement - David Gerard (talk) 20:18, 15 November 2015 (UTC)

Free code review
Eww, that command line. Try this on for size:

awk '/\/wiki\/[^?]/ {sub(/.*\/wiki\//,"",$2); sub(/\/.*/,"",$2); a[$2]++} END {for(i in a){print a[i],i}}' "$1" | sort -n -r -k 1

Fully POSIX! Best practices! Agile waterfall development model! In the cloud! --Ymir (talk) 13:33, 16 November 2015 (UTC)


 * Hideous one-liners are supposed to be Perl, aren't they? Mostly what it actually needs is more processing of HTML escapes, but not all of them (or I'd have whacked recode in there to do the escaping). It also needs to canonicalise to initial uppercase any requests that go direct to lower-case versions of page names. - David Gerard (talk) 14:56, 16 November 2015 (UTC)


 * But basically when there's a script that doesn't suck I'll do the rest. Got the page hit logs back to Dec 2012 - David Gerard (talk) 14:57, 16 November 2015 (UTC)

Heh, I wouldn't say it's that hideous, at least compared to some of the truly awe-inspiring code golf out there. If you want nicer-looking code:
#!/bin/sh

awk '/\/wiki\/[^?]/ {
    sub(/.*\/wiki\//, "", $2)
    sub(/\/.*/, "", $2)
    $2 = toupper(substr($2, 1, 1)) substr($2, 2)
    a[$2]++
}
END {
    for (i in a) { print a[i], i }
}' "$1" | sort -n -r -k 1

That also does what I think you wanted regarding canonicalization of page names. Want a Perl version? :) By the way, if there actually isn't anything else to the script, you should use "$@" instead of "$1", so you can pass to the script as many filenames as you want. --Ymir (talk) 13:19, 17 November 2015 (UTC)
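
For illustration, here is a rough sketch of the "$@" suggestion, wrapping the same awk program in a function so it can take any number of log files. It assumes, as the script above does, that the request path is the second whitespace-separated field of each log line; the function name count_pages is made up for this example, and the real log layout may differ.

```shell
#!/bin/sh
# Sketch only: count /wiki/ page requests across any number of log files.
# Assumption: the request path is field 2 of each log line.
# count_pages is a hypothetical name, not part of the original script.

count_pages() {
    awk '/\/wiki\/[^?]/ {
        sub(/.*\/wiki\//, "", $2)   # strip everything up to and including /wiki/
        sub(/\/.*/, "", $2)         # strip anything from the next slash onward
        $2 = toupper(substr($2, 1, 1)) substr($2, 2)   # force initial uppercase
        a[$2]++                     # tally hits per page name
    }
    END {
        for (i in a) { print a[i], i }
    }' "$@" | sort -n -r -k 1
}

# Only run when filenames were given, so the script never waits on stdin.
if [ "$#" -gt 0 ]; then
    count_pages "$@"
fi
```

Saved under a hypothetical name such as popularity.sh, it could be run as `sh popularity.sh access.log access.log.1`, and awk would read the files in order and merge the counts into one sorted list.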