Showing posts from July, 2012

Tweets per Day Analysis

“Twitter Venn” from Jeff Clerk is another awesome tool.
You can enter the brand and competitor names separated by commas and search. After the data is retrieved, a Venn diagram shows the rate of tweets containing the search terms in their various combinations.
According to Jeff Clerk, this tool supports investigation into the relationship between how words are used within the messages of all the people using Twitter.
It also helps visualize the overlap between various sets of topics.
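
To make the idea of "tweets in the various combinations" concrete, here is a minimal Python sketch of how each Venn region could be counted. It is not how Twitter Venn itself works internally; the function name `venn_counts` and the sample tweets are made up for illustration, matching is a simple case-insensitive substring check, and it counts tweets rather than computing a per-day rate.

```python
from collections import Counter

def venn_counts(tweets, terms):
    """Count tweets by the exact combination of search terms they mention."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        # The set of terms found in this tweet identifies one Venn region.
        region = frozenset(t for t in terms if t.lower() in text)
        if region:  # ignore tweets that mention none of the terms
            counts[region] += 1
    return counts

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Wipro announces quarterly results",
    "Cognizant and Infosys both hiring this year",
    "Infosys vs Cognizant: who grows faster?",
    "Wipro and Cognizant win new deals",
    "Nothing about IT companies here",
]

counts = venn_counts(sample_tweets, ["Wipro", "Cognizant", "Infosys"])
for region, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(" & ".join(sorted(region)), n)
```

Each key in `counts` is one bubble or overlap in the diagram, so the largest value points to the combination that is tweeted about most.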

When you click on a bubble, a cloud of related keywords appears at the bottom left, and the average number of tweets per day appears at the bottom right.
Clicking a bubble also shows the original tweets. For example, when I clicked on the brown bubble, which represents the tweets common to Wipro and Cognizant, the tweet below appeared:

You can see that tweets mentioning both Cognizant and Infosys are more frequent than the other combinations. This will help you ana…

Twitter Words Association Analysis

Recently I came across the Twitter Spectrum tool from Jeff Clerk. This tool is a modified version of the News Spectrum tool.
Here you can enter two topics and then analyse the associated words based on Twitter data. Blue and red represent the words associated with each of the two topics, whereas purple represents the words common to both.
You can click on any word to see the related tweets. The visualization is really awesome and makes the data easy to analyze.
For example, I have taken “icici” and “hdfc” as the two topics. Below is the Twitter Spectrum based on these two topics:
If you analyze the associated words, you will find that “Security” and “Insurance” are associated more with ICICI Bank, whereas “digital” is associated with HDFC Bank.
But terms like “online banking” and “mobile banking” are still missing. This type of visualization can help a brand plan its strategy for the type of tweets to post on Twitter.
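
The split into topic-specific and common words can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual method: the function names `associated_words` and `spectrum` and the sample tweets are made up, and "associated" here simply means "appears in the same tweet as the topic".

```python
import re
from collections import Counter

def associated_words(tweets, topic):
    """Count words that appear in the same tweet as `topic`."""
    words = Counter()
    for tweet in tweets:
        tokens = re.findall(r"[a-z]+", tweet.lower())
        if topic in tokens:
            words.update(t for t in tokens if t != topic)
    return words

def spectrum(tweets, topic_a, topic_b):
    """Return (words only with A, words only with B, words common to both)."""
    a = set(associated_words(tweets, topic_a))
    b = set(associated_words(tweets, topic_b))
    common = a & b
    return a - common, b - common, common

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "icici insurance offer",
    "icici security alert",
    "hdfc digital wallet",
    "hdfc bank offer",
    "icici bank update",
]

only_icici, only_hdfc, common = spectrum(sample_tweets, "icici", "hdfc")
print("icici only:", sorted(only_icici))
print("hdfc only:", sorted(only_hdfc))
print("common:", sorted(common))
```

The three sets correspond to the blue, red and purple words in the visualization.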

Tool URL :
Do you …

To find frequency of the words using RapidMiner

In my previous post, I wrote about how to read and write data in RapidMiner. In this post, I cover how to count word frequencies in text using RapidMiner. The model contains the following operators:
Read Excel
Nominal to Text
Process Documents
Tokenize
Transform Cases
Filter Stopwords

The RapidMiner model is shown below:

Inside the Process Documents operator, add 3 operators as shown below:

The Tokenize operator splits the text of a document into a sequence of tokens.
The Transform Cases operator converts the case of the words to the desired format.
The Filter Stopwords operator removes English stopwords, such as and, or, not, is, an, from a document.
Output :

If you would like the XML of this word frequency model, leave your email ID in the comment box.
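
For readers who want to see the same three steps outside RapidMiner, here is a rough Python sketch of tokenizing, transforming case and filtering stopwords before counting. It is an analogue, not an export of the RapidMiner model: the function name `word_frequencies`, the tiny stopword list and the sample sentences are all assumptions, and tokens are split on non-letter characters.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption); a real English list is much longer.
STOPWORDS = {"and", "or", "not", "is", "an", "a", "the", "of", "to", "in"}

def word_frequencies(documents):
    """Count word occurrences across documents after the three cleaning steps."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[A-Za-z]+", doc)              # Tokenize: split on non-letters
        tokens = [t.lower() for t in tokens]                # Transform Cases: lower case
        tokens = [t for t in tokens if t not in STOPWORDS]  # Filter Stopwords
        counts.update(tokens)
    return counts

# Made-up sample text, for illustration only.
docs = [
    "The service is good and the app is fast",
    "Good app, not a good website",
]

freq = word_frequencies(docs)
for word, n in freq.most_common():
    print(word, n)
```

Lower-casing before filtering matters here: without it, "The" would not match the stopword "the" and "Good" would be counted separately from "good".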


How to Read, Write data and Transform Cases in RAPIDMINER

I recently started exploring RapidMiner to do sentiment analysis and text classification of social media data, so I am going to post some tutorials based on what I have learned so far about this tool. In this post, I am writing about a very basic thing: how to read and write data and transform cases in RapidMiner.
RapidMiner is a free tool and can be downloaded from . Make sure you have the Text Analytics plugin for RapidMiner installed.
Below is the model I have built in RapidMiner to read and write text.
It includes 5 operators:
Read Excel
Nominal to Text
Process Documents
Transform Cases
Write Excel
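
As a rough analogue of this read, transform and write flow, here is a small Python sketch. It uses CSV text from the standard library as a stand-in for the Excel file, since reading real spreadsheets needs a third-party library; the column names, the sample rows and the function name `transform_cases` are made up for illustration.

```python
import csv
import io

def transform_cases(rows, column, to_lower=True):
    """Return copies of the rows with the text in `column` lower- or upper-cased."""
    convert = str.lower if to_lower else str.upper
    return [{**row, column: convert(row[column])} for row in rows]

# Hypothetical input standing in for the spreadsheet.
source = io.StringIO("id,text\n1,Great Service\n2,SLOW Delivery\n")
rows = list(csv.DictReader(source))      # stand-in for Read Excel
rows = transform_cases(rows, "text")     # Transform Cases

output = io.StringIO()                   # stand-in for Write Excel
writer = csv.DictWriter(output, fieldnames=["id", "text"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```

Swapping `io.StringIO` for files opened with `open(...)` would turn this into a file-to-file conversion.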

We start with the Read Excel operator. The “Read Excel” operator loads data from MS Excel spreadsheets. It can read data from Excel 95, 97, 2000, XP, 2003 and 2007.

Select the Excel file on your system that you want to load.

The Excel file that I loaded using the Read Excel operator is shown below.

Connect it with the “Nominal to Text” operator.  This operator replaces all …