Saturday, August 10, 2013

Field Directed Work Week 7

This week I spent time on a variety of tasks.  The first task was loading web logs into IBM’s Analyst Notebook and Tableau, to familiarize me with the different fields of a web log.  In this case the logs were Squid proxy access logs.  I learned quite a bit about using the features of Microsoft Access and Notepad to manipulate the data into a usable format.  I also had to help define the headings for the fields in a Squid proxy access log.  The information you can obtain from these logs includes the IP address the request came from, the website (URL) being accessed, how long the request took, and whether the request was successful or not.  Here are a few screenshots of the data in both tools.


Analyst Notebook view of an IP address node and its link to the URL the user is trying to access



Tableau allows quick drag-and-drop of attributes to create views, graphs and visualizations on the fly, and it also handles calculations, statistics and trend lines



Squid Result Codes and their Counts



Bubble view of IP address counts



Number of websites accessed per IP address (Client Address)
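
For anyone who would rather script the clean-up than do it by hand in Access and Notepad, here is a minimal Python sketch of splitting a Squid access log into named fields and counting result codes and requests per client address, like the views above.  It assumes Squid's default ("native") log layout; the field names are my own labels and the sample lines are made up for illustration, not taken from the real logs.

```python
from collections import Counter

# Labels for the columns of Squid's default ("native") access.log layout.
# These names are assumptions chosen for readability.
FIELDS = ["timestamp", "elapsed_ms", "client_address", "result_code",
          "bytes", "method", "url", "user", "hierarchy", "content_type"]

def parse_squid_line(line):
    """Split one log line on whitespace and label each column."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        return None  # skip blank or truncated lines
    record = dict(zip(FIELDS, parts))
    record["elapsed_ms"] = int(record["elapsed_ms"])
    # The result code looks like TCP_MISS/200: Squid action, then HTTP status.
    action, _, status = record["result_code"].partition("/")
    record["squid_action"] = action
    record["http_status"] = status
    return record

# Made-up sample lines, for illustration only.
sample_log = """\
1376150400.123    245 192.168.1.10 TCP_MISS/200 4512 GET http://example.com/ - DIRECT/93.184.216.34 text/html
1376150401.456     12 192.168.1.10 TCP_HIT/200 1024 GET http://example.com/logo.png - NONE/- image/png
1376150402.789    310 192.168.1.22 TCP_DENIED/403 1500 GET http://blocked.example.org/ - NONE/- text/html
"""

records = [r for r in map(parse_squid_line, sample_log.splitlines()) if r]

# Counts comparable to the "result codes" and "per client address" views.
result_code_counts = Counter(r["result_code"] for r in records)
requests_per_client = Counter(r["client_address"] for r in records)

print(result_code_counts)
print(requests_per_client)
```

The list of records could then be written out as CSV for import into Tableau or Analyst Notebook.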


The second task was loading citations into IBM’s Analyst Notebook.  A Web of Science search for “code reuse” returns approximately 4000 citations.  I want to demonstrate the relationship between title, author and keyword.  The issue with this connection is that each record has multiple authors and multiple keywords.  In the RIS format of these citations the authors appear as A1, A1, etc. and the keywords as KW, KW, etc.; they aren’t designated by individual tags (A1, A2, A3 or K1, K2, K3).  Therefore the visualization includes one long line of information, which can be messy.  Separating the author fields and keyword fields per record will take some programming, because pairing every author with every keyword creates a multiple-attribute system (a Cartesian product).  Here is an Analyst Notebook screenshot of the records in journals, books and chapters connected to their Web of Science addresses.

ANB WOS Code Reuse 

Journal, Book and Chapter links to Web of Science



Literature search for the past 15 years on Web of Science, each node denotes a year




Year linked to Title linked to Keywords or Authors
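
The programming needed to separate the repeated author and keyword tags might look something like the Python sketch below.  It is only a rough idea, not what I have actually built: it assumes the title is stored under a T1 tag (the export may use a different tag), that each record ends with an ER line, and the sample record is made up.  It collects the repeated A1 and KW values into lists and then produces one row per title, author and keyword combination.

```python
import re
from itertools import product

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

def parse_ris(text):
    """Return one dict per record, mapping each tag to a list of its values."""
    records, current = [], {}
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if not match:
            continue  # ignore blank lines and continuation text
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end of record
            records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records

def expand(record):
    """One (title, author, keyword) row per combination: a Cartesian product."""
    titles = record.get("T1", [""])    # T1 as the title tag is an assumption
    authors = record.get("A1", [""])
    keywords = record.get("KW", [""])
    return list(product(titles, authors, keywords))

# Made-up record, for illustration only.
sample_ris = """\
TY  - JOUR
T1  - A made-up title about code reuse
A1  - Author, First
A1  - Author, Second
KW  - code reuse
KW  - software libraries
KW  - components
PY  - 2010
ER  -
"""

rows = [row for record in parse_ris(sample_ris) for row in expand(record)]
for row in rows:
    print(row)
```

Two authors and three keywords give six rows for a single title, which shows how fast the row count grows.  Keeping two separate link lists (title to author, title to keyword) would avoid that growth, at the cost of not pairing authors with keywords directly.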

The third task was gathering all the datasets so that on Monday I can present to my mentors which sets can go into which tools thus far.
