Have Fun With Your Tweets in PowerShell

I can’t remember who got me interested in wordle.net, but it struck me that I would love to see all of my Twitter posts as a “Beautiful Word Cloud” (their site’s phrase). The Java applet on that site transforms a block of text (or a site) into a word cloud in which the size of each word reflects its frequency in the text.  I saw a post (with source code) from Adam Franco on retrieving Twitter content using PHP, but I was more interested in a PowerShell version, and I didn’t want the posts as XML, since that wouldn’t be useful input to Wordle.

Starting from the end … what I wanted was to use either of the following lines in PowerShell:

"testfirst" | Export-Tweets | out-file adam.txt
or
Export-Tweets -User "testfirst" | out-file adam.txt

The first of these uses the pipeline to receive the Twitter username, which is useful if you want to put it into a loop to, say, extract the contents of more than one user account.  The second is a more literal request for the tweets of a single Twitterer.  In both cases, the output is saved to a file name of my choice using the handy out-file cmdlet.  In other words, I don’t want a script that hard-codes the output file name.

Recognize that you first have to load the Export-Tweets function into memory, so a more complete scenario depicting my intended usage is the following, typed into a PowerShell console:

.\Export-Tweets.ps1
"testfirst" | Export-Tweets | out-file adam.txt

The following script accomplishes these modest goals.  There are two PowerShell tricks to point out.  The first is enabling pipeline input for a function.  You’ll see that in the definition of Export-Tweets, which has begin{}, process{}, and end{} blocks.  The process{} block runs once for each item arriving on the pipeline and handles the piped form, while the end{} block handles the explicit form with the User parameter.  The trick is to define the function that does the work in the begin{} block, so that both of the other blocks can call it.
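
To see the begin{}/process{}/end{} pattern on its own, here is a minimal sketch that leaves Twitter out entirely. Show-Name and FormatName are made-up names for illustration and are not part of the script; the function only echoes what it receives, so it can be run offline.

```powershell
# Hypothetical stand-in for Export-Tweets: it echoes names instead of fetching tweets.
function Show-Name([string] $Name)
{
    begin
    {
        # Helper defined once, before any pipeline input arrives.
        function FormatName([string] $n) { "Hello, $n" }
    }

    process
    {
        # Runs once per pipeline item; $_ is empty when nothing is piped in.
        if ($_) { FormatName $_ }
    }

    end
    {
        # Runs once at the end; $Name is set only in the explicit-parameter form.
        if ($Name) { FormatName $Name }
    }
}

"ann", "bob" | Show-Name      # pipeline form
Show-Name -Name "cat"         # explicit-parameter form
```

Because the helper lives in the begin{} block, both call styles share a single implementation.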

The second trick is substituting % for foreach.  Here % is the built-in alias for the ForEach-Object cmdlet.  This really cuts down on script length; however, use the substitution with caution if you’re writing scripts that other people have to read.  It’s a useful and learnable enough shorthand that I wanted to use it in this case.
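
As a quick illustration of that shorthand, the sketch below runs the same pipeline twice, once with the full cmdlet name and once with the alias. The variable names are only for the example.

```powershell
# Long form: the cmdlet spelled out.
$long = 1..3 | ForEach-Object { $_ * 2 }

# Short form: % is an alias for the same cmdlet.
$short = 1..3 | % { $_ * 2 }

# Ask PowerShell what the alias resolves to.
(Get-Alias '%').Definition
```

Both variables end up holding the same three doubled values, so the choice between the two forms is purely about readability.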

Here’s the script:

# Writes the text of every status element on the page to the output stream.
function global:AppendTweets([xml] $tweetsPage)
{
    $tweetsPage.selectnodes("/statuses/status") | % { $_.selectSingleNode("text").get_InnerXml() | write-output }
}

# Returns the number of status elements on the page (0 once we run past the last page).
function global:CountTweets([xml] $tweetsPage)
{
    return $tweetsPage.selectnodes("/statuses/status").Count
}

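# Downloads one page of a user's timeline and returns it as an XML document.
function global:GetTweetsPage([string] $user, [string] $page)
{
    [string]$urlbase="http://twitter.com/statuses/user_timeline/"
    [string]$url=$urlbase + $user + ".xml?page=" + $page

    # write-host goes to the console, not the pipeline, so this stays out of the output file.
    write-host "Connecting to URL " $url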

    $webclient= New-Object "System.Net.WebClient"
    [xml] $tweetsPage=$webclient.DownloadString($url)
    return $tweetsPage
}

# Accepts the user name either from the pipeline or through the User parameter.
function global:Export-Tweets([string] $User)
{
    begin
    {
        function GetUserTweets ([string] $User)
        {
            $pageIndex = 1
            $numTweets = 1
            while ($numTweets -ge 1)
            { 
                $tweetsPage = GetTweetsPage $user $pageIndex
                $numTweets = CountTweets $tweetsPage
                if ($numTweets -ge 1)
                {
                    AppendTweets $tweetsPage
                }
                $pageIndex += 1
            }
        }

    }

    process
    {
        if ($_)
        {
            GetUserTweets $_
        }
    }

    end
    {
        if ($User)
        {
            GetUserTweets $User
        }
    }

}
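
To try the XML handling without a network connection, here is a small sketch that runs the same XPath queries against a hand-written sample. The sample document is made up; its shape is only inferred from the queries in the script ("/statuses/status" and "text"), and it reads InnerText where the script uses get_InnerXml() so that any escaped characters come back decoded.

```powershell
# Made-up sample in the shape the script's XPath queries expect.
[xml] $sample = @'
<statuses>
  <status><text>first tweet</text></status>
  <status><text>second tweet</text></status>
</statuses>
'@

# Same selection AppendTweets performs, collected into a variable.
$texts = $sample.SelectNodes("/statuses/status") | % { $_.SelectSingleNode("text").InnerText }

# Same count CountTweets returns; the paging loop stops when this reaches 0.
$count = $sample.SelectNodes("/statuses/status").Count

$texts
$count
```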
