Anuj Nanavati
INF 385Q - Knowledge Management Systems
March 1, 2005
This paper examines the Haystack platform, created as part of the Haystack project at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), as an innovative tool for personal information management built on Semantic Web technologies. The first section presents an overview of personal information management systems and their limitations in today's collaborative environment. The second section explains the Haystack project and how it can change the way personal information is created, visualized, and organized. The last section details some of the important features of the Haystack platform as they relate to personal information management.
Personal information management (PIM) is a system or strategy designed by individuals to organize and integrate personally important information. It transforms random pieces of information into something that can be applied systematically and that expands one's personal knowledge. Professor Paul Dorsey of Millikin University identified seven skills necessary for personal knowledge management: (1) retrieving information; (2) evaluating/assessing information; (3) organizing information; (4) analyzing information; (5) presenting information; (6) securing information; and (7) collaborating around information [6]. Thus the main objective of an effective PIM system is to retrieve information specific to the user's needs and to organize and manage it so that the user can work better in a collaborative environment. The growing variety of information in today's collaborative environment poses a major challenge to achieving this objective. As the Web and other databases grow larger and the amount of non-Web information (e-mail, chat, spreadsheets, etc.) in our lives continues to proliferate, managing the content of our personal storehouses of data is becoming more and more important.
Traditional information management systems organize and manage information for an individual's needs, but they do not allow different types of information from different users to be integrated at the same time [1]. Users are forced to use different systems for different types of data: blogs, e-mails, instant messages, Web pages, and so on each have to be viewed in a different application. The Semantic Web promises to open innumerable opportunities for automation and information retrieval by standardizing the protocols for metadata exchange [5]. Semantic Web technologies express the relations between data items, and between different types of data, using the Resource Description Framework (RDF). They allow users to organize, manage, and present their information in a single customized, flexible environment, a job that in the past required multiple applications [3]. This personalization is possible in a Semantic Web environment because users gain direct access to the different types of underlying information and control how it is presented to them [4].
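The idea of relating different types of data through RDF can be made concrete with a small sketch. RDF describes information as subject-predicate-object statements ("triples"). The Python below is only an illustration under stated assumptions: plain tuples stand in for RDF statements, and every identifier in it (the item names such as mail:42, the predicates "from" and "mentions", and the helper functions) is hypothetical rather than taken from Haystack or from any RDF library.

```python
# Illustrative only: plain tuples stand in for RDF statements, and all
# identifiers (item names, predicates, helpers) are made up for this sketch.
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

TRIPLES: List[Triple] = [
    ("mail:42", "rdf:type", "Email"),
    ("mail:42", "from", "person:alice"),
    ("mail:42", "mentions", "page:haystack"),
    ("chat:7", "rdf:type", "ChatMessage"),
    ("chat:7", "from", "person:alice"),
    ("page:haystack", "rdf:type", "WebPage"),
    ("page:haystack", "title", "Haystack project home"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the given fields (None = wildcard)."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def items_from(triples: List[Triple], person: str) -> List[str]:
    """Every item sent by one person, regardless of what kind of item it is."""
    return sorted(t[0] for t in match(triples, p="from", o=person))


def kind_of(triples: List[Triple], item: str) -> Optional[str]:
    """The declared type of an item, or None if no type statement exists."""
    found = match(triples, s=item, p="rdf:type")
    return found[0][2] if found else None


if __name__ == "__main__":
    # One query spans e-mail and chat because both live in the same store.
    for item in items_from(TRIPLES, "person:alice"):
        print(item, kind_of(TRIPLES, item))
```

Because an e-mail and a chat message are described with the same kind of statement, a single query can reach both, which is the integration that separate applications make difficult.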
The Haystack project is an ongoing research effort within the knowledge access user technologies component of Project Oxygen [2] at MIT CSAIL. Project Oxygen aims to create a human-centered pervasive computing environment that is freely available everywhere (like oxygen) through a combination of user-specific and system technologies. One of the user technologies of Project Oxygen is knowledge access, which offers greatly improved access to information, customized to the needs of people, applications, and software systems. These technologies allow users to access their own knowledge bases, the knowledge bases of friends and associates, and those on the Web through semantic connection networks. Their main objective is to support personalized, collaborative, and communal knowledge so that people can find and organize the information they use. They observe and adapt to their users so as to better meet their needs. As stated earlier, these characteristics were missing from traditional information management systems [1].
Haystack and the Semantic Web support personalized information management and collaboration through metadata management and manipulation. The Haystack project seeks to apply Semantic Web technologies to personal information management. Haystack is a platform for creating, visualizing, and organizing information using RDF. It was designed to let individuals manage their information in the ways that make the most sense to them. Haystack lets users define whichever arrangements of, connections between, and views of information they find most effective. It does so by removing the arbitrary barriers created by applications that handle only certain information types and record only the relationships defined by their developers. It provides maximum flexibility in describing and organizing data, the freedom to group related items together (regardless of the programs used to edit them), ease in manipulating and visualizing information in ways appropriate to the task at hand, and the ability to delegate tasks to software agents.
Haystack exhibits a number of improvements over traditional information management approaches [1]:
Haystack's key features relate to four major aspects of personal information management, which together set the project apart from other tools in this arena: information in one place, working with information not programs, operations as information objects, and personalized and situational access to information.
In the past, information was scattered among e-mail clients and servers, filesystems, calendars, address books, the Web, and other custom repositories. Haystack eliminates this partition so that individuals can work with their information in a unified fashion. As shown in screenshot 1, different types of information such as e-mail messages, news feeds, search results, contact information, chat messages, and Web pages can be viewed and managed from a single window.
Screenshot 1: Haystack home page - information in one place
Traditional tools force users to remember which program controls which kind of information. In Haystack, users focus on their information, not the program. This lets users apply a function that belongs to one program while working in another program that does not provide it. For example, rotating a picture is an inherent function of a graphics editor, but Haystack allows it to be performed in the e-mail reader when the need arises while composing a message [screenshot 2].
Screenshot 2: Context menu on an image - working with information not programs
Drag and drop is another example. Traditional e-mail programs do not allow attachments to be added by dragging and dropping files from the directory structure, whereas Haystack does.
In Haystack, every entity, whether it is simple text in an e-mail message or the e-mail message itself, is treated as an information object. Anything can be right-clicked to show its context menu, giving immediate access to all the operations that make sense for that object. Operations can also be downloaded from outside applications and become available for use immediately. Screenshot 3 shows the context menu listing all the possible operations on a particular object (here, the word "plethora").
Screenshot 3: Context menu on a word (object) - operations as information objects
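One way to picture operations attached to information objects is a registry that maps each kind of object to the operations that make sense for it. The sketch below is a simplified illustration, not Haystack's implementation: the registry, the function names (register, context_menu, invoke), and the sample operations are all hypothetical.

```python
# Illustrative only: a toy registry of operations per object type.
# None of these names come from Haystack; they are assumptions for the sketch.
from typing import Callable, Dict, List

# item type -> {operation name -> function applied to the object's value}
OPERATIONS: Dict[str, Dict[str, Callable[[str], str]]] = {}


def register(item_type: str, name: str, func: Callable[[str], str]) -> None:
    """Make an operation available for every object of the given type."""
    OPERATIONS.setdefault(item_type, {})[name] = func


def context_menu(item_type: str) -> List[str]:
    """List the operations that currently apply to this type of object."""
    return sorted(OPERATIONS.get(item_type, {}))


def invoke(item_type: str, name: str, value: str) -> str:
    """Run one named operation on an object's value."""
    return OPERATIONS[item_type][name](value)


# Operations that ship with the (hypothetical) system.
register("word", "Copy", lambda w: w)
register("word", "Uppercase", lambda w: w.upper())
register("image", "Rotate", lambda img: img + " (rotated 90 degrees)")

if __name__ == "__main__":
    print(context_menu("word"))
    # An operation obtained later shows up in the menu as soon as it is registered.
    register("word", "Count letters", lambda w: str(len(w)))
    print(context_menu("word"))
    print(invoke("word", "Count letters", "plethora"))
```

Because the menu is computed from the registry each time it is requested, a newly registered operation is available right away, which mirrors the behavior described for downloaded operations.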
Haystack provides new ways of accessing information that are specific to a situation and a person. Suppose a user wants to find particular elements within a collection for the situation at hand. Haystack helps retrieve them by analyzing all the elements in the collection for their commonalities and letting the user make a selection, based on the situation, from the resulting data. Each time a selection is made, Haystack shows a subcollection of the original collection. A selection can also incorporate other criteria that were not part of the user's initial information request [Screenshot 4].
Screenshot 4: Personalized and situational access to information
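The refinement process described above, in which each selection narrows a collection to a subcollection, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Haystack's algorithm: the sample items, their properties ("kind", "author"), and the functions facets and refine are hypothetical.

```python
# Illustrative only: toy items and helper names are assumptions for this sketch.
from collections import Counter
from typing import Dict, List, Tuple

Item = Dict[str, str]

ITEMS: List[Item] = [
    {"title": "Budget draft", "kind": "email", "author": "alice"},
    {"title": "Budget sheet", "kind": "spreadsheet", "author": "bob"},
    {"title": "Lunch?", "kind": "chat", "author": "alice"},
    {"title": "Meeting notes", "kind": "email", "author": "bob"},
]


def facets(items: List[Item], skip: Tuple[str, ...] = ("title",)) -> Dict[str, Counter]:
    """Summarize what the items have in common: for each property,
    count how many items carry each value."""
    summary: Dict[str, Counter] = {}
    for item in items:
        for prop, value in item.items():
            if prop in skip:
                continue
            summary.setdefault(prop, Counter())[value] += 1
    return summary


def refine(items: List[Item], prop: str, value: str) -> List[Item]:
    """Return the subcollection whose items have the chosen property value."""
    return [item for item in items if item.get(prop) == value]


if __name__ == "__main__":
    print(facets(ITEMS))                      # choices offered for the full collection
    emails = refine(ITEMS, "kind", "email")   # first selection
    print(facets(emails))                     # choices offered for the subcollection
    print(refine(emails, "author", "alice"))  # second selection narrows further
```

Recomputing the summary after every selection is what lets later choices draw on criteria, such as the author, that the user did not have in mind at the start.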