Complete word count analysis of Security Now, episode 1 through 370.

Published: 09-09-2012 | Author: Remy van Elst


Security Now is a podcast by Leo Laporte and Steve Gibson, released on the TWiT.tv network.

Steve pays to have the podcast transcribed, and the transcript files are up over at grc.com.

I decided to run my analyzer over the complete podcast text archive. This covers episodes 001 to 371.

Get the files:

for i in {001..371}; do curl http://www.grc.com/sn/sn-${i}.txt >> sn.txt; echo $i; done
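The loop above appends whatever the server returns, including any error page for a missing episode. A minimal sketch of a more defensive download, assuming missing transcripts are answered with an HTTP error status; the helper names `episode_url` and `fetch_episodes` are hypothetical and not part of the original commands:

```shell
#!/bin/sh
# episode_url N: print the transcript URL for episode N.
# N is a plain integer without leading zeros (hypothetical helper).
episode_url() {
    printf 'http://www.grc.com/sn/sn-%03d.txt\n' "$1"
}

# fetch_episodes FIRST LAST OUTFILE: append each transcript to OUTFILE.
# Note: uses the global variable n as a loop counter (hypothetical helper).
fetch_episodes() {
    n=$1
    while [ "$n" -le "$2" ]; do
        # -f: fail on HTTP errors instead of saving the error page
        # -sS: no progress meter, but still print error messages
        curl -fsS "$(episode_url "$n")" >> "$3" \
            || echo "episode $n: download failed" >&2
        n=$((n + 1))
    done
}

# Usage (needs network access, so it is not run here):
#   fetch_episodes 1 371 sn.txt
episode_url 1
```

Reporting failures on stderr makes it easier to see which episodes are missing from sn.txt before the counts are trusted.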

Clean the files up:

cat sn.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > csn.txt
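To see what this cleanup step does, here is a small sketch that runs the same `tr -cd` filter on one sample sentence. The function name `clean_text` is an assumption for illustration. Because every character that is not alphanumeric or whitespace is deleted, contractions lose their apostrophe, which is why words like "its" show up in the counts:

```shell
#!/bin/sh
# clean_text: delete every character on stdin that is not
# alphanumeric or whitespace (hypothetical helper name).
clean_text() {
    LC_CTYPE=C tr -cd '[:alnum:][:space:]'
}

# "It's" becomes "Its", and the comma and question mark disappear.
printf "It's 9 o'clock, isn't it?\n" | clean_text
```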

Analyze the text file:

cat csn.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > count-combined.txt

Result:

ed count-combined.txt
461930
1,20np
1       65548 the
2       49919 to
3       42284 that
4       40759 STEVE
5       40065 I
6       39496 a
7       35321 of
8       31706 and
9       30845 it
10      29930 is
11      24634 you
12      22213 And
13      20365 in
14      16467 this
15      14406 was
16      13811 So
17      13761 its
18      13711 for
19      12847 have
20      11599 on

Full result
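In the list above, "and" and "And" (and likewise "So") are counted as separate words, because the pipeline is case-sensitive. If that is not wanted, one option is to fold everything to lowercase before counting. The following is only a sketch of that variant, not the method used for the numbers in this article; `count_words_folded` is a hypothetical name:

```shell
#!/bin/sh
# count_words_folded: read text on stdin and print "count word" lines,
# most frequent first, ignoring case (hypothetical helper name).
count_words_folded() {
    LC_ALL=C tr -cd '[:alnum:][:space:]' \
        | LC_ALL=C tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | LC_ALL=C sort -k1,1nr -k2
}

# "And", "and" and "AND" are now all counted as the same word.
printf 'And so it goes, and and AND on.\n' | count_words_folded
```

The second sort key (`-k2`) only makes the order of words with equal counts deterministic.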

Steve only

cat sn.txt | grep "STEVE:" > stonly.txt
cat stonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > stonlyclean.txt
cat stonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > sto.txt

Result

ed sto.txt
461930
1,20np
1       65548 the
2       49919 to
3       42284 that
4       40759 STEVE
5       40065 I
6       39496 a
7       35321 of
8       31706 and
9       30845 it
10      29930 is
11      24634 you
12      22213 And
13      20365 in
14      16467 this
15      14406 was
16      13811 So
17      13761 its
18      13711 for
19      12847 have
20      11599 on

Steve only

Leo only

cat sn.txt | grep "LEO:" > leoonly.txt
cat leoonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > leoonlyclean.txt
cat leoonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > leoc.txt

Result

ed leoc.txt
367236
1,20np
1       40349 LEO
2       30161 the
3       25301 to
4       24623 I
5       23060 a
6       19027 you
7       17115 it
8       16441 that
9       15115 of
10      13676 and
11      12256 is
12      9785 in
13      8689 And
14      8282 this
15      7633 have
16      7552 on
17      7094 for
18      6492 its
19      6032 do
20      5922 know

Leo only
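The per-speaker lists above contain the speaker label itself (STEVE, LEO) as a word, because `grep` keeps the whole matching line. A possible refinement, sketched here under the assumption that each spoken line starts with the label followed by a colon, is to strip the label before counting. The function name `speaker_text` is hypothetical:

```shell
#!/bin/sh
# speaker_text SPEAKER: read a transcript on stdin and print only the
# lines that start with "SPEAKER:", with that label removed.
# Assumes the label sits at the very start of the line.
speaker_text() {
    sed -n "s/^$1:[[:space:]]*//p"
}

# Only Steve's two lines are printed, without the "STEVE:" prefix.
printf 'STEVE: Yes, exactly.\nLEO: Right.\nSTEVE: And so on.\n' \
    | speaker_text STEVE
```

The output of `speaker_text` can be piped into the same cleanup and counting commands as before.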

Tags: analyze, articles, bash, leo-laporte, podcast, security-now, steve-gibson