Sunday, November 17, 2013

Tutorial: Teaching Research Staff at Pacific Northwest National Laboratory How to Use Web of Science

For Lorraine Bruce’s LIS560 “Instructional and Training Strategies for Information Professionals” class, I focused on the information needs and information-seeking behaviors of “senior” and “early/mid career” research scientists at Pacific Northwest National Laboratory. Based on the learning styles and behaviors of this audience, I created a tutorial for the research staff on how to access and use the widely used scientific database “Web of Science”. I also created a rubric to assess competencies and skills following the tutorial.


The five learning tasks of the tutorial were to:

  • Locate and access PNNL’s Technical Library Website
  • Navigate the Web of Science Interface
  • Perform basic and advanced searches
  • Utilize faceting and refinements of the search results
  • Explore other features of the database

Other materials included in this blog post are links to:

Training Module Presentation: Information Seeking Behaviors and Learning Styles of Research Scientists at PNNL

Rubric on the Web of Science Skills

Web of Science Quick Tips Handout

Pacific Northwest National Laboratory’s Technical Library Contact Information

LIS 560 Training Module Part A

LIS 560 Training Module Part B

Friday, August 23, 2013

Field Directed Work Week 9

This is my final week in Virginia for the summer. I spent the majority of the week working on my final artifact (PowerPoint slides), which covers all of the datasets and visual analytic tools I worked with. The visual analytic tools were IN-SPIRE, Tool “B”, IBM’s i2 Analyst Notebook and Tableau Software. The dataset types were library citations (database searches in Web of Science and IEEE), Twitter data, emails from database sources, and web logs. Each raw dataset had to be formatted for each designated tool. Each tool is built on different theories, algorithms, and background mathematics to display the dataset visually, so the user can discover trends or contextual relationships in the data that they may not have seen with the naked eye.

Introduction Slide


Visual Analytic Tools and Different Data Sources


Data Flow into the Visualization tools


Included with the roadmap are the initial views of the tools and the types of figures a user can obtain to discover the relationships within the dataset.  These views were created throughout the summer and can be seen in previous weeks’ blog entries.


Information can come from a variety of sources and interpreting large sets of data can be quite daunting, but with the use of visual analytic tools, users can discover relationships and trends in a given dataset.

The final artifact, a PowerPoint presentation, was sent directly to the Directed Fieldwork Advisor. If you would like to see the final slides, feel free to email me or leave a comment here. I did not want the information to be distributed publicly. Thanks!

Friday, August 16, 2013

Field Directed Work Week 8

This week I spent time ingesting Twitter datasets into Tableau Software. Tableau Software is a great tool for making tables, charts and graphs on the fly by dragging and dropping the dimensions and measures you want calculated instantaneously. The datasets are mostly in CSV or Excel format. The dimensions are dictated by the field or column headings. The data must be formatted in Excel to designate whether each field is a number, string, text, etc.


(1) Arrows pointing to the dimensions (fields) from the Excel spreadsheet. (2) Measured values. The table combines the designated dimension and measured values; here it displays tweet counts (4) by user (3).


Dragging and dropping dimensions and measures to formulate graphs or figures; in this bar chart we’ve plotted tweet counts by time zone


The various types of figures you can visualize based on the number of dimensions (fields) and measures.  You can make tables, maps, heat maps, tree maps, stacked or horizontal bar charts, side by side style, line, dual line, area, pie, scatter, circle, bullet, Gantt, packed bubble and histogram charts.  The title of this widget is “Show Me” so it shows you your options based on your criteria.



Stacked bar chart that displays the number of tweets per year by designated Twitter ID (color block)



“Packed Bubble” visualization of Twitter data by device or method; the larger the bubble, the higher the number of devices



Graph of web log data that displays the website URL and number of hits (counts)


I also spent the majority of this week gathering all the datasets and writing up my final artifacts for my mentor and for my field directed work. I also experimented with Tool “D”, ingesting photos and videos and examining the capabilities that tool can offer intelligence analysts.

Saturday, August 10, 2013

Field Directed Work Week 7

This week I spent time on a variety of tasks. The first task was inputting web logs into IBM’s Analyst Notebook and Tableau. This task was meant to familiarize me with the different field headings of web logs. In this case the logs were Squid proxy access logs. I learned quite a bit about the features and capabilities of Microsoft Access and Notepad for manipulating the data into a usable format. I also had to help define the headings of the fields in a Squid proxy access log. The information you can obtain from these logs includes the IP address a request comes from, the website (URL) being accessed, how long the access took, and whether the access succeeded. Here are a few screenshots of the data in both tools.


Analyst Notebook view of IP Address node and link to the URL user is trying to access



Tableau allows for quick drag and drop of attributes to create views/graphs/visualizations on the fly and it does calculations, statistics and trend lines as well



Squid Result Codes and their Counts



Bubble view of IP address counts



Number of websites accessed per IP address (Client Address)
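The fields described above (client IP address, URL, duration and result code) can also be pulled out of a log with a short script before loading it into a tool. The following is a minimal sketch, assuming the log uses Squid's default "native" access.log layout (timestamp, elapsed milliseconds, client address, result code/HTTP status, bytes, method, URL, ...). The sample lines, addresses and the `SquidEntry` name are made up for illustration and are not from the actual dataset.

```python
from collections import Counter
from typing import NamedTuple


class SquidEntry(NamedTuple):
    """One request from a Squid access log (hypothetical helper type)."""
    timestamp: float   # seconds since the Unix epoch
    elapsed_ms: int    # how long the request took, in milliseconds
    client_ip: str     # address the request came from
    result_code: str   # e.g. TCP_MISS, TCP_HIT, TCP_DENIED
    http_status: int   # e.g. 200, 403
    size_bytes: int
    method: str
    url: str


def parse_squid_line(line: str) -> SquidEntry:
    """Split one whitespace-delimited log line into named fields."""
    fields = line.split()
    result_code, http_status = fields[3].split("/")
    return SquidEntry(
        timestamp=float(fields[0]),
        elapsed_ms=int(fields[1]),
        client_ip=fields[2],
        result_code=result_code,
        http_status=int(http_status),
        size_bytes=int(fields[4]),
        method=fields[5],
        url=fields[6],
    )


# Invented sample lines in the assumed native format.
SAMPLE_LOG = """\
1375123456.789    120 192.168.1.10 TCP_MISS/200 4512 GET http://example.com/ - DIRECT/93.184.216.34 text/html
1375123457.012     15 192.168.1.10 TCP_HIT/200 1024 GET http://example.com/logo.png - NONE/- image/png
1375123460.500      2 192.168.1.22 TCP_DENIED/403 350 GET http://blocked.example.org/ - NONE/- text/html
"""

entries = [parse_squid_line(line) for line in SAMPLE_LOG.splitlines() if line.strip()]

# Tallies comparable to the "result codes and their counts" and
# "requests per client address" views.
result_code_counts = Counter(entry.result_code for entry in entries)
requests_per_client = Counter(entry.client_ip for entry in entries)
```

A log written with a custom Squid `logformat` would need a different field order, so the positions used here should be checked against the real file first.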


The second task was inputting citations into IBM’s Analyst Notebook. A Web of Science search for “code reuse” returns approximately 4,000 citations. I want to demonstrate the relationship of title, author and keyword. The issue with this connection is that each record has multiple authors and multiple keywords. In the RIS format of these citations the authors appear as A1, A1, etc. and keywords as KW, KW, etc. They aren’t designated by individual tags (A1, A2, A3 or K1, K2, K3). Therefore the visualization includes one long line of information, which can be messy. Separating the author fields and keyword fields per record will take some programming because it creates a multiple-attribute system (Cartesian multiplication). Here is an Analyst Notebook screenshot of the records in journals, books and chapters connected to their Web of Science address.

Journal, Book and Chapter links to Web of Science



Literature search for the past 15 years on Web of Science, each node denotes a year




Year linked to Title linked to Keywords or Authors
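The repeated A1 and KW tags described in the second task can be separated per record with a little scripting. Below is a minimal sketch, not the approach actually used for this dataset. It assumes each RIS line has the usual `TAG  - value` shape, that titles use the T1 tag, and that records end with an ER line. The sample record is invented. It shows both the Cartesian product (one row per author and keyword pair) and a lighter alternative that keeps two separate link lists.

```python
import re
from collections import defaultdict
from itertools import product

# Two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -\s*(.*)$")

# Invented sample record for illustration only.
SAMPLE_RIS = """\
TY  - JOUR
T1  - A hypothetical study of code reuse
A1  - Smith, J.
A1  - Lee, K.
KW  - code reuse
KW  - software engineering
KW  - libraries
PY  - 2010
ER  -
"""


def parse_ris(text):
    """Return one dict per record, mapping each tag to a list of its values."""
    records, current = [], defaultdict(list)
    for line in text.splitlines():
        match = RIS_LINE.match(line)
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end of record
            records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value)
    return records


def cartesian_rows(record):
    """One (title, author, keyword) row for every author/keyword combination."""
    title = record.get("T1", ["(untitled)"])[0]
    return [(title, author, keyword)
            for author, keyword in product(record.get("A1", []), record.get("KW", []))]


def edge_lists(record):
    """Alternative: separate title-author and title-keyword links, no multiplication."""
    title = record.get("T1", ["(untitled)"])[0]
    author_edges = [(title, author) for author in record.get("A1", [])]
    keyword_edges = [(title, keyword) for keyword in record.get("KW", [])]
    return author_edges, keyword_edges


records = parse_ris(SAMPLE_RIS)
rows = cartesian_rows(records[0])
author_edges, keyword_edges = edge_lists(records[0])
```

With two authors and three keywords, the Cartesian version yields six rows while the two link lists hold five links in total. The gap widens quickly for records with many authors and keywords, which is one reason to consider importing the two link lists separately.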

The third task was gathering all the datasets so that on Monday I can present to my mentors which sets can go into which tools so far.

Friday, August 2, 2013

Field Directed Work Week 6

One of my big passions is working with the outreach programs at PNNL (high school, undergraduate, graduate, or community college students). One of the programs I worked with throughout the year is SULI (Science Undergraduate Laboratory Internships), offered by the Department of Energy. I went back home to Richland, Washington to participate in the first wave of the summer students’ symposium. One of the students, Claudia Gallegos, presented a way to visualize the Twitter feeds from the Boston Marathon bombings. I contacted the student and her mentors to obtain her raw datasets and then import the sets into IN-SPIRE. The sets included data from a few months before the bombing through the days of the manhunt. A tremendous amount of information! Visualizing the tweets, retweets, times, hashtags and buzzwords used is very interesting. Here are a few of her slides from her presentation.




Other applications for visualizing tweets include tracking the response to disease epidemics and natural disasters. It seems to be the norm that people with smartphones, computers and mobile devices can spread information through social media faster than a news reporter can, but it is up to the viewer to decide whether the information is fact or fiction. It was a really interesting way to see how social media can be used and visualized from a large dataset.

In this short week, I was also able to obtain an unrestricted Open Source Center account and set up a feed of data into email. The database has an advanced search option that lets you specify which source countries you want the records to come from. In this case I set up a search on Syria with source countries of Qatar, Saudi Arabia, Turkey, Iran, Lebanon, Israel, Russia, Jordan and France. The resultant records are designated either classified or unclassified. These emails accrued for a few days and I was able to input the text into IN-SPIRE. For these emails to be imported into the other tools, templates that convert them into XML or CSV format will be needed to define the metadata fields.
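As an illustration of what such a conversion template might look like, here is a minimal sketch that turns raw email text into CSV rows using Python's standard `email` and `csv` modules. The choice of metadata fields (date, sender, subject, body) is an assumption, and the sample messages are invented. The real feed emails may carry other fields that would need their own columns.

```python
import csv
import io
from email import message_from_string

# Assumed metadata fields for the CSV; adjust to whatever the target tool expects.
FIELDS = ["date", "sender", "subject", "body"]

# Invented sample messages, not real feed content.
SAMPLE_EMAILS = [
    "From: alerts@example.org\n"
    "Subject: Hypothetical report one\n"
    "Date: Mon, 29 Jul 2013 08:15:00 +0000\n"
    "\n"
    "Body text of the first report.\n",
    "From: alerts@example.org\n"
    "Subject: Hypothetical report two\n"
    "Date: Tue, 30 Jul 2013 09:30:00 +0000\n"
    "\n"
    "Body text of the second report.\n",
]


def email_to_row(raw):
    """Map one plain-text (non-multipart) email onto the CSV fields."""
    message = message_from_string(raw)
    return {
        "date": message["Date"],
        "sender": message["From"],
        "subject": message["Subject"],
        "body": message.get_payload().strip(),
    }


def emails_to_csv(raw_emails):
    """Return CSV text with a header row and one row per email."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for raw in raw_emails:
        writer.writerow(email_to_row(raw))
    return buffer.getvalue()


csv_text = emails_to_csv(SAMPLE_EMAILS)
parsed_rows = list(csv.DictReader(io.StringIO(csv_text)))
```

The `csv` module quotes values that contain commas, such as the date header, so the columns stay aligned when the file is opened in Excel or imported into another tool.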

Friday, July 26, 2013

Field Directed Work Week 5

This was a successful week of inputting the “SS” dataset into IBM’s Analyst Notebook. A shorter practice dataset was inserted into the tool first to test whether the data could be imported. Based on the fields and their definitions available from the dataset, I created an attribute relationship. Once it went in cleanly, all 50,000 records were inserted. A few glitches occurred, so we looked at the raw data in Excel and determined that null data was causing column misalignment. After the columns were adjusted in Excel, we were able to render the entire set in a peacock view in Analyst Notebook. Here is a screenshot of what peacock data looks like.

SS Peacock
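The null-data glitch described above was fixed by hand in Excel, but the affected rows can also be located with a script before import. This is a minimal sketch under the assumption that the data is exported as CSV and that a misaligned row shows up as a row with a different number of fields than the header. The column names and values are invented.

```python
import csv
import io

# Invented sample: row 3 has an empty cell, row 4 is missing a field entirely.
SAMPLE_CSV = """\
id,name,city,amount
1,Alice,Richland,10
2,Bob,,20
3,Carol,30
"""


def audit_rows(text):
    """Report rows likely to cause trouble on import.

    Returns (header, misaligned, with_blanks), where the two lists hold
    1-based line numbers (the header is line 1).
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    misaligned = []   # field count differs from the header
    with_blanks = []  # right field count, but at least one empty cell
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            misaligned.append(line_number)
        elif any(cell.strip() == "" for cell in row):
            with_blanks.append(line_number)
    return header, misaligned, with_blanks


header, misaligned, with_blanks = audit_rows(SAMPLE_CSV)
```

Rows flagged as misaligned need a placeholder inserted in the right column, while rows with blank cells are aligned but may still need a decision about how the tool should treat the missing value.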


The relationships in the data stem from what the user wants to correlate. For example, I created a fake dataset of name, birth date and gender in Excel.

Association Chart

I chose an association chart, but the user has many options on how they want the data to relate


Import Specification

I then determined how I wanted to correlate the data; in this case I wanted the name associated with the date associated with the gender


Male Female

Once the data is imported, an individual male chart and a female chart are created showing the relationships of name to date of birth.


At the end of the week, upon returning to Richland for a few days, I met with a co-worker to discuss next steps in massaging some Twitter data for input into the tools and how to deal with columns containing null information.

Sunday, July 21, 2013

Happy 5th Anniversary, Charlottesville and the Grand Caverns

2013 marks our fifth anniversary and we’re spending it over here in Virginia with Lucy. We decided to get out of town and go to Charlottesville.


Here are this year’s Glassy Babies to add to our collection.  The Creme Brulee colors represent our trip to Utah last year with all the reds, orange, and rust colors of the landscape.  The Canary color is all the happiness we experienced last year with all of Lucy’s milestones and celebration of another fun year together.  The Hudson colors represent the sunsets we experienced in Kauai last year.  The colors include blues, purples and hues of rust.  Last year was a great year!


Charlottesville is a super cute town, home to the University of Virginia, Monticello and lots of Virginia history. We stayed at the Boar’s Head Inn for two nights and it was phenomenal.






Our fancy breakfast, our first time having REAL maple syrup.  Yummy!


Monticello is Thomas Jefferson’s home. It is an engineering marvel. He was such a forward thinker for his time, from his agricultural research, to his weather predicting, to his water collection methods and his clock and calendar contraption. He also had a HUGE library, which was really neat to see. Monticello is on the back of a nickel.









Downtown Charlottesville




Grand Caverns

We went to the Grand Caverns on our way to the Shenandoah Mountains.  It was really neat to see. Lucy loved it too.
