Jolt Productivity Award: Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites, by Matthew A. Russell
There is a new gold rush happening today. The gold is not a mineral; rather, it is data: data we can mine and turn into valuable information. Instead of digging into the earth or panning rivers, we need to mine the Web, specifically the Social Web. If you are planning to find fame and fortune from this gold rush, make sure to pack Matthew Russell's Mining the Social Web in your toolkit.
The Social Web has a wealth of data waiting to be discovered, analyzed, and turned into valuable information. Huge companies, such as Google and Facebook, depend upon this information in order to remain profitable. But you don't have to be a big company in order to mine it. Russell gives you everything you need to dig in and get started.
Mining the Social Web serves up 10 bite-sized chapters that will take you from tenderfoot to knowledgeable Social Web hacker. Spend an hour working through the first chapter and you'll be hooked. Russell takes you through the steps necessary to analyze the latest trends on Twitter, see who's tweeting about the trends, organize the data, and visualize it with tools such as Graphviz.
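To give a flavor of that workflow, here is a minimal sketch in Python. It is not code from the book. The sample tweets, the function names, and the "RT @user" convention for spotting retweets are all assumptions made for illustration, and no Twitter API is called. The sketch organizes who retweeted whom and emits a description in Graphviz's DOT language.

```python
import re
from collections import Counter

# Hypothetical sample data standing in for tweets fetched from a search API.
SAMPLE_TWEETS = [
    {"user": "alice", "text": "RT @bob: Data is the new gold"},
    {"user": "carol", "text": "RT @bob: Data is the new gold"},
    {"user": "dave", "text": "Panning the social web for insights"},
    {"user": "alice", "text": "RT @carol: Graphviz makes this easy"},
]

# Assumption: a retweet is marked by the textual convention "RT @username".
RT_PATTERN = re.compile(r"\bRT @(\w+)")


def retweet_edges(tweets):
    """Count (original_author, retweeter) pairs found in the tweet text."""
    edges = Counter()
    for tweet in tweets:
        match = RT_PATTERN.search(tweet["text"])
        if match:
            edges[(match.group(1), tweet["user"])] += 1
    return edges


def to_dot(edges):
    """Render the edge counts as a directed graph in the DOT language."""
    lines = ["digraph retweets {"]
    for (source, target), count in sorted(edges.items()):
        lines.append(f'    "{source}" -> "{target}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(retweet_edges(SAMPLE_TWEETS)))
```

If the output is saved to a file such as retweets.dot, Graphviz can render it with a command like `dot -Tpng retweets.dot -o retweets.png`.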
While Russell shows you exactly how to perform each step, and provides plenty of ideas for you to try, he also encourages you to explore on your own. His style reminds one of a great teacher. He poses a problem, shows you how to solve that problem, and then expands on it and challenges you to reinforce your learning by going further on your own.
In order to take advantage of this book, you must understand Python. All of the examples are written in Python and many external modules are used. Russell makes it as easy as possible for you to understand what is happening, even if you aren't Python-fluent, but this only goes so far. If you really want to get the most out of this book, you should have a working knowledge of the Python language.
Don't read this book when you have no Internet connection. It reads better in electronic form than on the printed page, and it makes liberal use of hyperlinks to reference information. In the preface, Russell says that he does this so that the reader can look at reliable, current information rather than out-of-date material. Without the links, that material would need to be included directly in the text, making the book larger and less direct.
Once Russell sets the hook with Twitter hacking in the first chapter, he reels you in with a series of fascinating chapters, beginning with capturing information using microformats (such as geo, XFN, and others). He shows you how to build graphs that express relationships between pages that include microformat notation. While the book is not a tutorial on analytics, it contains plenty of examples of data analysis techniques, with references for further reading. It also shows you how to use many other tools for massaging data and extracting informational nuggets.
A significant portion of the book is devoted to different sources of data that are ready to be mined. These include mail messages, Twitter, LinkedIn, blogs, Google Buzz, and Facebook. The Google Buzz chapter indicates how quickly things change (today, that chapter would focus on Google+).
The final short chapter discusses the Semantic Web. There isn't much code, simply because the Semantic Web vision has yet to materialize. What is clear, however, is that Mining the Social Web can position you to take advantage of the Semantic Web when it arrives.
Gary Pollice


