WebTracker: A Tool for Understanding Web Use

Don Turnbull


This paper presents a study, ending in 1998 and conducted over 16 months, that developed a comprehensive view of Web usage using a new combination of data collection and analysis methods devised by Choo, Detlor, & Turnbull (1998). The main approach to studying Web use was the triangulation of information from three diverse sources: an initial survey questionnaire; usage logs gathered with a custom-developed Web tracking application; and follow-up interviews with study participants. Findings from the study's empirical investigation are reported to show how the proposed methodology was used to study corporate Web users.

Keywords

Web Use; Survey; Questionnaire; Client Application; WebTracker; Interview; Methodology

Overview

The second tool used in this study, WebTracker, is a tool for gathering Web browsing metrics developed for the Faculty of Information Studies at the University of Toronto. WebTracker was designed for two reasons: proxy or firewall server logs are too inaccurate (Pitkow, 1997) for studying micro moves on the Web, and there was no current, publicly available browser code for the Windows environment with which to instrument a browser. Previous studies used XMosaic (Catledge & Pitkow, 1995; Cunha, Bestavros, & Crovella, 1995) on UNIX systems, but as our study focused on corporate users who predominantly work on Microsoft Windows platforms, we required a different tool. Although newer, Windows-specific Web browser source code had become available from the Mozilla project (Eich et al., 1998), we felt that installing a new, instrumented browser would not allow us to observe the actual behavior of users participating in the study. With WebTracker, users can simply work on the Web as they did before, with their usual technical configurations and browser preferences, including bookmarks and toolbar choices.

WebTracker Architecture

WebTracker runs on Windows 3.1, Windows 95, Windows NT 3.5x and 4, and Windows 98 environments. It is a 32-bit application with standard Windows controls and behaviors. Moreover, WebTracker behaves like any typical Windows application: it uses normal install procedures and standard system processes, and can therefore be uninstalled easily as well.

Primarily, WebTracker watches the Web browser and collects menu choices, button bar selections, and keystroke actions. These actions are associated with the open Web page (URL), tagged with a date-time stamp and recorded in a daily log file. This tracking method enables log analysis that can essentially reconstruct move-by-move how participants looked for information on the Web. The log file uses the following format:

User ID | Browser Action | Date-Time Stamp | URL | Web Page Title

The User ID field is taken from the entry made in WebTracker's User Identification dialog, while Browser Action is taken from a code file, installed in the WebTracker directory, that is specific to the browser version on the participant's machine. The Date-Time Stamp is taken from the system clock, and the URL is the actual protocol and address of the page loaded into the Web browser. Finally, the Web Page Title is taken from the HTML <TITLE> tag of each Web page, as displayed by the browser in its Title Bar.

WebTracker was designed to collect the most relevant browser actions, mainly interaction using the buttons, menus, and keys that control the Web browser's functionality. Mouse clicks are only recorded when a link is selected on the current Web page. Table 1 shows all of the different actions logged. Note that scroll bar use and certain menu functions were specifically not implemented but could be added for future studies.

| Interface Object | Browser Action | User Activity |
|---|---|---|
| Button | back | The Back button on the Navigation Toolbar |
| Button | forward | The Forward button on the Navigation Toolbar |
| Button | reload | The Reload button on the Navigation Toolbar used to reload the Web page |
| Button | home | The Home button on the Navigation Toolbar |
| Button | search | The Search button on the Navigation Toolbar |
| Button | guide | The Netscape button on the Navigation Toolbar |
| Button | print | The Print button on the Navigation Toolbar |
| Button | security | The Security button on the Navigation Toolbar |
| Button | stop | The Stop button on the Navigation Toolbar |
| Menu | file new | File - New to open another Browser window |
| Menu | file open | File - Open Page… to enter a URL to access or open a local file |
| Menu | save as | File - Save As… to save the Web page locally |
| Menu | send page | File - Send Page… to send a page via the Browser's or other email application |
| Menu | open page | File - Edit Page to open the Web page in the browser's built-in Editor |
| Menu | print preview | File - Print Preview |
| Menu | print | File - Print… |
| Menu | edit copy | Edit - Copy to copy selected text from the Web page onto the Windows Clipboard |
| Menu | select all | Edit - Select All |
| Menu | find in page | Edit - Find in Page… |
| Menu | search internet | Edit - Search Internet… |
| Menu | search directory | Edit - Search Directory… |
| Menu | reload | View - Reload |
| Menu | refresh | View - Refresh |
| Menu | page source | View - Page Source |
| Menu | back | Go - Back |
| Menu | forward | Go - Forward |
| Menu | Go home | Go - Home to return to the user-specified Home page |
| Menu | add bookmarks | Communicator - Bookmarks - Add Bookmark to add the current Web page to the bookmark file |
| Menu | edit bookmarks | Communicator - Bookmarks - Edit Bookmarks… to open the Bookmark file (for reorganizing, searching, or editing bookmarks) |
| Menu | history | Communicator - Tools - History to open the browser's History window |
| Key | edit bookmark | Ctrl + B to open the Bookmark file (for reorganizing, searching, or editing bookmarks) |
| Key | copy | Ctrl + C to copy selected text from the Web page onto the Windows Clipboard |
| Key | add bookmark | Ctrl + D to add the current Web page to the bookmark file |
| Key | history | Ctrl + H to open the browser's History window |
| Key | new window | Ctrl + N to open another browser window |
| Key | open page | Ctrl + O to enter a URL to open |
| Key | print | Ctrl + P to print the URL |
| Key | reload | Ctrl + R to reload the Web page |
| Key | save as | Ctrl + S to save the Web page locally |
| Key | page source | Ctrl + U to view the HTML of the Web page |
| Key | back | Alt + ← (left arrow key) |
| Key | forward | Alt + → (right arrow key) |
| Key | stop | The Esc key to stop the Web page from loading |
| Key | page up | The PageUp key to move through the Web page |
| Key | page down | The PageDown key to move through the Web page |


Table 1: Browser Actions Recorded by WebTracker

These browser actions are recorded in an ASCII text, tab-delimited log file named with the system date using a ".TXT" extension (e.g. 110198.TXT). For each day of use, WebTracker creates a separate log file. These files can be viewed with any text editor application.
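Because the log is plain tab-delimited text, it can be read with very little code. The following is a minimal sketch in Python, not part of WebTracker itself. The names LogEntry, read_log, and TIMESTAMP_FORMAT are hypothetical, and the timestamp layout (month/day/year with a 12-hour clock) is an assumption inferred from the sample entries in Table 2.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterator, TextIO

# Assumed timestamp layout, inferred from sample entries such as "1/1/02 4:35:09 PM".
TIMESTAMP_FORMAT = "%m/%d/%y %I:%M:%S %p"


@dataclass
class LogEntry:
    """One line of a daily log file (hypothetical record type for this sketch)."""
    user_id: str
    action: str
    timestamp: datetime
    url: str
    title: str


def read_log(stream: TextIO) -> Iterator[LogEntry]:
    """Yield one LogEntry per non-empty, tab-delimited line of the stream."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not any(field.strip() for field in row):
            continue  # skip blank lines
        # STARTUP and SHUTDOWN entries carry no URL or title, so pad to five fields.
        row = (row + [""] * 5)[:5]
        user_id, action, stamp, url, title = (field.strip() for field in row)
        yield LogEntry(user_id, action,
                       datetime.strptime(stamp, TIMESTAMP_FORMAT), url, title)


# Two lines in the documented field order, standing in for a daily log file.
sample = (
    "DT\tSTARTUP\t1/1/02 4:29:44 PM\t\t\n"
    "DT\tLINK_TO\t1/1/02 4:34:59 PM\t"
    "http://donturn.fis.utoronto.ca/test/index.html\tTest Site Home\n"
)
entries = list(read_log(io.StringIO(sample)))
```

A real daily file could be passed in the same way by opening it in text mode, for example with open("110198.TXT", newline="").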

WebTracker Deployment

During the initial stages of the study, we physically visited the users' individual work environments and installed WebTracker to run at system startup as a minimized application. Because WebTracker was developed as a standalone, typical Windows application, participants could see it running and had it available for suspending logging or viewing their usage logs. After verifying that WebTracker was functional, we again explained how WebTracker works by showing the few user functions available. These included the option of turning WebTracker logging off by selecting the Web Tracker is INACTIVE radio button. Also, the current WebTracker log file can be viewed by selecting Today's Data from the View menu, as shown in Figure 1.

 

Figure 1: WebTracker Main Window

Next, we showed each participant how to enter a User Identification string that is appended to each entry in the WebTracker log file. This is the only actual interaction with WebTracker that is required by the participant. Once configured to load at system startup as minimized, WebTracker runs without any additional intervention for the duration of the study.

 

Figure 2: WebTracker Setup Window

As shown in Figure 3, the Expanded window mode also allows the user to make logging either active or inactive and to adjust the interval at which the Web browser is polled for data. This interval can be adjusted from 1 to 5 seconds to accommodate various system speeds, eliminate double log entries, and prevent interference with other client applications. The Message History display box shows the codes that correspond to the user's actions in the Web browser. In this example, the URL http://witanweb.iit.nrc.ca/www/AuthorFAQ is being selected by the user from a link on the current Web page and noted in the Last Event display box as the "LINK_TO" activity.

 

 

Figure 3: WebTracker Expanded Window
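The polling behavior described above can be illustrated with a short Python sketch. How WebTracker actually reads the browser's state is not described here, so read_event is a hypothetical stand-in supplied by the caller, and poll_browser and clamp_interval are illustrative names only. The sketch also drops a reading that is identical to the previous one; this is one possible way to avoid double entries and is an assumption, not necessarily WebTracker's own mechanism.

```python
import time
from typing import Callable, List, Optional, Tuple

Event = Tuple[str, str]  # (browser action, URL); a simplified, hypothetical event


def clamp_interval(seconds: float) -> float:
    """Keep the polling interval inside the 1 to 5 second range."""
    return max(1.0, min(5.0, seconds))


def poll_browser(read_event: Callable[[], Optional[Event]],
                 interval: float,
                 polls: int,
                 sleep: Callable[[float], None] = time.sleep) -> List[Event]:
    """Sample the browser `polls` times, recording each new event once."""
    interval = clamp_interval(interval)
    recorded: List[Event] = []
    last: Optional[Event] = None
    for _ in range(polls):
        event = read_event()
        # Record only when something happened and it differs from the last reading.
        if event is not None and event != last:
            recorded.append(event)
        last = event
        sleep(interval)
    return recorded


# Simulated readings: the same link event is seen on two consecutive polls.
readings = iter([("LINK_TO", "a"), ("LINK_TO", "a"), None, ("button back", "a")])
result = poll_browser(lambda: next(readings), interval=0.2, polls=4,
                      sleep=lambda seconds: None)
```

Passing the sleep function as a parameter lets the loop be exercised without real delays, as in the simulated run above.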

 

Once WebTracker has been demonstrated, participants are encouraged to use their Web browsers as they normally would. From that point on, their Web use is collected into log files.

WebTracker Logs

The strength of this study's methodology rests upon the collection and storage of daily Web usage activity in actual organizational settings. This information is recorded in each participant's log files. Table 2 shows a short but typical set of log entries that WebTracker might record during a Web browsing session.

| User ID | Browser Action | Date and Time | URL Visited | Web Page Title |
|---|---|---|---|---|
| DT | STARTUP | 1/1/02 4:29:44 PM | | |
| DT | LINK_TO | 1/1/02 4:34:59 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | LINK_TO | 1/1/02 4:35:08 PM | http://donturn.fis.utoronto.ca/test/test1.html | Test Site Page 1 |
| DT | button back | 1/1/02 4:35:09 PM | http://donturn.fis.utoronto.ca/test/test1.html | Test Site Page 1 |
| DT | LINK_TO | 1/1/02 4:35:17 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | key add bookmark | 1/1/02 4:35:17 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | Key open page | 1/1/02 4:35:28 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | LINK_TO | 1/1/02 4:35:35 PM | http://www.ora.com/ | OReilly Home |
| DT | Key back | 1/1/02 4:35:41 PM | http://www.ora.com/ | OReilly Home |
| DT | LINK_TO | 1/1/02 4:35:44 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | menu save as | 1/1/02 4:35:46 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | LINK_TO | 1/1/02 4:36:11 PM | http://donturn.fis.utoronto.ca/test/printme.html | Test Site Print Me |
| DT | menu print | 1/1/02 4:36:15 PM | http://donturn.fis.utoronto.ca/test/printme.html | Test Site Print Me |
| DT | button back | 1/1/02 4:36:46 PM | http://donturn.fis.utoronto.ca/test/printme.html | Test Site Print Me |
| DT | LINK_TO | 1/1/02 4:36:56 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | button reload | 1/1/02 4:36:59 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | LINK_TO | 1/1/02 4:37:14 PM | http://donturn.fis.utoronto.ca/test/index.html | Test Site Home |
| DT | SHUTDOWN | 1/1/02 4:41:11 PM | | |

Table 2: WebTracker Log View

In this session, the Web browser was started at 4:29 pm on January 1, 2002. The "Test Site Home" Web page was accessed, as it is the default browser Home Page. From there, a link was followed to the "Test Site Page 1" Web page, and then the Back button on the Navigation Toolbar was clicked to return to the previous page, which was then bookmarked. While on the "Test Site Home" page, the Ctrl + O keystroke was used to open the Open Page dialog box, where the URL "http://www.ora.com/" was entered. The "OReilly Home" page was loaded, then the Back keystroke (Alt + left arrow) was used while on that page, returning to the previous page, "Test Site Home". This page was then saved as a file on the local hard drive using the File - Save As… menu command. Next, the "Test Site Print Me" link was selected, opening that page in the browser. The page was printed using the File - Print… menu command, and then the Back toolbar button was selected to return to the "Test Site Home" page. The Reload toolbar button was selected, the page was reloaded into the browser window, and finally the browser was closed at 4:41 pm.
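A move-by-move reading of this kind can also be produced mechanically. The Python sketch below is illustrative only and is not the analysis procedure used in the study. It treats each LINK_TO entry as the start of a page visit, attaches later actions to that visit, and measures the visit as the time until the next LINK_TO or SHUTDOWN entry. The function name page_visits, the simplified (action, timestamp, title) tuples, and the timestamp layout are assumptions of the sketch.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from the sample log entries.
TIMESTAMP_FORMAT = "%m/%d/%y %I:%M:%S %p"


def page_visits(entries):
    """Group (action, timestamp, title) tuples into a list of page visits."""
    visits = []
    for action, stamp, title in entries:
        when = datetime.strptime(stamp, TIMESTAMP_FORMAT)
        # A new page load or a shutdown closes the visit that is still open.
        if visits and visits[-1]["seconds"] is None and action in ("LINK_TO", "SHUTDOWN"):
            visits[-1]["seconds"] = (when - visits[-1]["start"]).total_seconds()
        if action == "LINK_TO":
            visits.append({"title": title, "start": when,
                           "actions": [], "seconds": None})
        elif visits and action not in ("STARTUP", "SHUTDOWN"):
            # Any other browser action belongs to the page currently displayed.
            visits[-1]["actions"].append(action)
    return visits


# The first entries of the sample session, reduced to three fields.
session = [
    ("STARTUP", "1/1/02 4:29:44 PM", ""),
    ("LINK_TO", "1/1/02 4:34:59 PM", "Test Site Home"),
    ("LINK_TO", "1/1/02 4:35:08 PM", "Test Site Page 1"),
    ("button back", "1/1/02 4:35:09 PM", "Test Site Page 1"),
    ("LINK_TO", "1/1/02 4:35:17 PM", "Test Site Home"),
    ("key add bookmark", "1/1/02 4:35:17 PM", "Test Site Home"),
]
visits = page_visits(session)
```

The final visit in this fragment has no duration because no later page load or shutdown entry closes it.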

After the ten-business-day tracking period, we visited each user site, uninstalled WebTracker, and collected the individual log files for analysis.

WebTracker Results

The tracking logs from 33 participants were collected and analyzed. From this large data set, 61 significant episodes of information seeking were isolated and analyzed in terms of their modes of viewing or searching, and their associated Web information moves. The selection of episodes was guided by evidence of the episode having consumed a relatively substantial amount of time and effort or having been a recurrent activity.

Data from the tracking log files helped determine the moves exercised by participants as they used their Web browsers to view and find information. Data about the sequence of site visits, repetitions of these sequences, movements backwards and forwards between pages, the use of bookmarking, the selection of sites from stored bookmarks, the use of search engines, printing, and other actions and events captured by WebTracker were examined to trace the selection and development of information seeking moves over the duration of each episode. Using Ellis' model of information seeking behaviors as a guide (Ellis, 1989; Ellis et al., 1993; Ellis & Haugan, 1997), the participants' Web moves were classified into starting, chaining, browsing, differentiating, monitoring, and extracting information seeking behaviors.
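Some of the raw counts that feed this kind of examination can be tallied directly from the log. The Python sketch below is a hypothetical illustration: it counts how often each browser action occurs and which URLs were loaded more than once. It does not perform the classification into Ellis' categories, which in the study was an interpretive step carried out by the researchers. The function name summarize_moves and the simplified (action, URL) pairs are assumptions of the sketch.

```python
from collections import Counter


def summarize_moves(entries):
    """Return (count of each action, URLs loaded more than once) for (action, url) pairs."""
    by_action = Counter()
    page_loads = Counter()
    for action, url in entries:
        by_action[action] += 1
        if action == "LINK_TO":
            page_loads[url] += 1  # each LINK_TO entry is one page load
    revisited = {url: count for url, count in page_loads.items() if count > 1}
    return by_action, revisited


# Shortened placeholder URLs ("a", "b") stand in for real log values.
moves = [
    ("LINK_TO", "a"),
    ("button back", "a"),
    ("LINK_TO", "b"),
    ("LINK_TO", "a"),
    ("key add bookmark", "a"),
]
by_action, revisited = summarize_moves(moves)
```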

The analysis of the WebTracker data led to the production of a behavioral framework which relates motivations (the strategies and modes of viewing and searching) and moves (the tactics used to find and use information). More details of the framework can be found in Choo, Detlor, & Turnbull (forthcoming). A preliminary analysis of the tracking data for the pilot portion of the study can be found in Choo, Detlor, & Turnbull (1998).

The follow-up interviews provided insight into the context behind each participant's Web usage within their organizational settings.

Summary

The research presented here outlines a tool used to gather empirical evidence for studying Web use. WebTracker may be used within a larger methodological framework (Choo, Detlor, & Turnbull, 1998; Choo, Detlor, & Turnbull, 1999) for a rich portrayal of how individuals use Web-based information in their natural work settings and interact with Web pages, as well as of Web page usability and Web browser interaction.

The author expresses his thanks to Professor Chun Wei Choo for his theoretical knowledge and practical guidance. Brian Detlor was also of great assistance with editorial and methodological issues. This research was supported by a grant from the Social Sciences and Humanities Research Council of Canada. More information about studies utilizing WebTracker is available at http://choo.fis.utoronto.ca/esproject/.

References

Catledge, L. D., & Pitkow, J. E. (1995). Characterizing Browsing Strategies in the World-Wide Web. Computer Networks and ISDN Systems, 27, 1065-1073.

Choo, C.W., Detlor, B. & Turnbull, D. (1998). A Behavioral Model of Information Seeking on the Web: Preliminary Results of a Study of How Managers and IT Specialists Use the Web. Proceedings of the 61st Annual Meeting of the American Society of Information Science, 290-302.

Cunha, C.R., Bestavros, A. & Crovella, M.E. (1995). Characteristics of WWW Client-Based Traces. http://www.cs.bu.edu/techreports/abstracts/95-010.

Eich, B., et al. (1998). Mozilla.org. http://www.mozilla.org/.

Ellis, D. & Haugan, M. (1997). Modelling the Information Seeking Patterns of Engineers and Research Scientists in an Industrial Environment. Journal of Documentation, 53(4), 384-403.

Ellis, D., Cox, D., & Hall, K. (1993). A Comparison of the Information Seeking Patterns of Researchers in the Physical and Social Sciences. Journal of Documentation, 49(4), 356-369.

Ellis, D. (1989). A Behavioural Model for Information Retrieval System Design. Journal of Information Science, 15(4/5), 237-247.

Flanagan, J. C. (1954). The critical incident technique. Psychological Bulletin, 51(4), 327-358.

Kehoe, C., Pitkow, J. & Rogers, J. (1998). GVU's Ninth WWW User Survey Report. http://www.gvu.gatech.edu/user_surveys/survey-1998-04.

Pitkow, J. and Recker, M. (1994). Results from the first World-Wide Web survey. Special issue of Journal of Computer Networks and ISDN systems, 27, 2.

Pitkow, J. (1997, April 7-11). In Search of Reliable Usage Data on the WWW. Sixth International World Wide Web Conference Proceedings, Santa Clara, CA.