May 04, 2003

KDD & IR PPT Presentation

Here is the link to my class presentation on KDD & IR - Download file

Posted by amdonovan at 12:31 PM | Comments (0)

May 03, 2003

Any comments on the connection between IR and information policy?

I plan to write my Master's report this fall. The topic will be the connection or integration of Information Retrieval (IR) technology and information policy.

But I haven't decided on a specific topic yet. Do you have any ideas?

Posted by judith at 01:45 PM | Comments (0)

April 30, 2003

Differences between IR and IF

"Information retrieval(IR) is very closely related to information filtering(IF) in that they both have the goal of retrieving information relevant to what a user wants, while minimizing the amount of irrelevant information retrieved" (Foltz & Dumais, 1992, p.52)

However, there are three primary differences between IR and IF.
First, user preferences in information filtering represent long-term interests, while queries in IR tend to represent a short-term interest that can be satisfied by the retrieval.
Second, information filtering is applied to streams of incoming data, while in IR, changes to the database do not occur often, and retrieval is not limited to the new items in the database.
Third, IF involves the process of removing information from a stream, while IR involves the process of finding information in that stream (Foltz & Dumais).
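
To make the contrast concrete, here is a minimal Python sketch of the two modes. It is only an illustration under my own assumptions, not anything from Foltz & Dumais: the names (`retrieve`, `StandingFilter`, `tokenize`) are hypothetical, and relevance is reduced to a simple count of shared words. IR takes a one-off query against a fairly static collection; IF keeps a long-term profile and decides, item by item, what to pass through from an incoming stream.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it on whitespace into a set of words."""
    return set(text.lower().split())


def overlap(a: set[str], b: set[str]) -> int:
    """Toy relevance score (an assumption): number of shared words."""
    return len(a & b)


def retrieve(query: str, collection: list[str]) -> list[str]:
    """IR: a short-term query is run once over the whole stored collection,
    old and new items alike, and matching documents come back ranked."""
    q = tokenize(query)
    scored = [(overlap(q, tokenize(doc)), doc) for doc in collection]
    # Python's sort is stable, so ties keep their original collection order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0]


@dataclass
class StandingFilter:
    """IF: a long-term interest profile applied to a stream of new items."""
    profile: set[str]
    threshold: int = 1  # hypothetical cutoff for keeping an item

    def process(self, stream: Iterable[str]) -> Iterator[str]:
        """Yield items as they arrive, removing those below the threshold."""
        for item in stream:
            if overlap(self.profile, tokenize(item)) >= self.threshold:
                yield item


if __name__ == "__main__":
    collection = [
        "latent semantic indexing for retrieval",
        "cooking with cast iron",
        "query expansion in retrieval systems",
    ]
    print(retrieve("retrieval indexing", collection))

    interests = StandingFilter(profile=tokenize("data mining privacy"))
    incoming = [
        "new privacy policy announced",
        "football scores tonight",
        "data mining workshop call for papers",
    ]
    print(list(interests.process(incoming)))
```

The query in `retrieve` is thrown away after one use, while the `StandingFilter` profile persists and is applied to whatever arrives next, which is the first of the three differences above in miniature.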

Posted by judith at 10:29 PM | Comments (0)

April 13, 2003

Database types - an overview

To understand how data is discovered in and retrieved from databases, I've been reading up on how different kinds of databases are used and what they are useful for. This is the best short summary I've found on the subject, written in 2001: Object-Relational DBMSs - The Road Ahead

Posted by amdonovan at 03:53 PM | Comments (0)

A KDD and IR perspective on TIA

A take on TIA that bucks the opinion trend and focuses on the KDD and IR aspects of TIA. Written by Howard Bloom, author of Global Brain: The Evolution of Mass Mind. Wired 11.04: VIEW
Quote -- "One of TIA's component programs, Genisys, aims to totally reinvent the database, increasing its usefulness and its contents by an order of magnitude. It will be the database of databases, with an add-on "Babblefish" able to parse and cross-reference every possible information stream. The most ambitious TIA initiative, Genoa II, is working to produce cognitive amplifiers - a symbiotic thinking system that weaves together human and machine intelligences more tightly than ever before."


Posted by amdonovan at 03:19 PM | Comments (0)

April 04, 2003

Enhancing Privacy and Trust

Privacy and trust are major issues in promoting the functions of corporate portals: gathering, sharing, and disseminating information. Those issues are also related to every topic in KMS.

This article provides "new non-third party mechanisms to overcome" the barriers against privacy and trust, as well as solutions for "finding shared preferences, discovering communities with shared values, removing disincentives posed by liabilities, and negotiating on behalf of a group," and techniques "to enable these new capabilities."

Posted by judith at 03:40 PM | Comments (0)

Knowledge Creation by Knowledge Pump

I found more specific information about the Knowledge Pump system, which we learned about in the "collaborative filtering" class.
The Knowledge Pump can foster an environment that encourages the flow, use, and creation of knowledge by supporting social networks and electronic repositories.

Posted by judith at 01:19 PM | Comments (0)

April 01, 2003

Open Directory Project

My analysis of the Open Directory Project (ODP) is posted in .pdf form at http://www.ischool.utexas.edu/~khaack/TKMS/ODP.pdf

ODP is the world's largest human (volunteer)-edited Web directory. It can be used as a search engine but allows the user much more freedom in choosing their direction of searching than search engines that simply return ranked results.

Here is the ODP website (which is often slammed) http://dmoz.org

Posted by khaack at 09:43 AM | Comments (0)

March 18, 2003

XML and Databases

For those who would like more info about how XML works in/with databases: XML and Databases
Quote -- "This paper gives a high-level overview of how to use XML with databases. It describes how the differences between data-centric and document-centric documents affect their usage with databases, how XML is commonly used with relational databases, and what native XML databases are and when to use them."

Posted by amdonovan at 10:18 AM | Comments (0)

D-Lib Magazine: Uncovering Information Hidden in Web Archives

Uncovering Information Hidden in Web Archives. A view of KDD and IR from an archival perspective. A good primer on how data warehousing works from D-Lib Magazine.

Posted by amdonovan at 10:16 AM | Comments (0)

March 11, 2003

zote

Many classmates have understandable skepticism about blogs as a KM tool, but here's an interesting example of how blogs can be used to share tacit knowledge. A few months back, I posted a little story about a neighbor and her difficulties using Zote, a Mexican laundry detergent. Apparently someone from Zote found my site through Google, and posted a comment explaining Zote. I thanked him via email, and he responded with a more detailed discussion of Zote, which I've posted here. I never would have imagined someone would have found my question and answered it, but it certainly worked out.

Posted by mcchris at 09:19 PM | Comments (8)

March 10, 2003

ACM Ubiquity interview with director of RLG

I ran across this recent interview with James Michalko, the director of RLG, about Information Access on the Wide Open Web.

He hits a lot of issues related to the difficulties of searching on the Web. Here is his comment on the availability of scholarly resources on the Web: "Some firms, such as Amazon, have created algorithms and done the computational analysis that asks, 'Did you really mean this?' and says 'If this is what you want, then you will find the following things relevant.' We must deliver authoritative trusted information using those kinds of paradigms or we will simply become museums of long-term storage instead of current use. These are some of the ways in which the CS community could make accessible on behalf of the broad Internet community enormous amounts of wonderful resources that right now are either inaccessible or severely under used."

Posted by amdonovan at 01:03 PM | Comments (0)

March 05, 2003

Information-Seeking behavior

I had an assignment on information-seeking behavior during my first semester in the U.S. The assignment required interviewing someone with a pre-formatted, open-ended questionnaire, so it could not include closed questions such as yes/no questions.

While I was interviewing my interviewee, who was seeing a doctor regularly because of an eye problem, I realized that the study of information-seeking behavior is fundamental to our field, library and information science.

I found a good resource on it.

Solomon, P. (1997) "Conversation in Information-Seeking Contexts: A Test of an Analytical Framework." Library & Information Science Research 19 no. 3, 217-248.

Also available on the WWW. URL:
http://ils.unc.edu/~solomon/hp/ConInfo.html


Abstract

"This article develops an analytical framework to support the analysis of conversations in information seeking contexts. The framework brings together linguistic and sociolinguistic issues such as vocabulary, cohesion, coherence, turn taking, turn allocation, overlaps, gaps, openings, closings, frames, repairs, role specification, and stylistic features. These issues serve as viewpoints for exploring how information-seeking conversations differ from casual conversation and conversations in restricted conversational domains (e.g., teacher-student; physician-patient). A sample of nine conversations from two information seeking contexts (i.e., school library media center, public library) is used to test the utility of the analytical framework and explore possible characteristics of information seeking conversations. The findings support the utility of the framework for various purposes including: training of information specialists, feedback on their performance, design of human-computer dialogues, elicitation of decision making processes during information seeking, and support for natural language processing."

Posted by judith at 11:44 PM | Comments (0)

March 02, 2003

Knowledge Representation, KDD, and IR

Listed below are some of the resources I have run across while trying to educate myself on the basics of KDD/IR (especially concerning description/discovery of Web resources). Some of the tutorials might prove helpful when tackling the assigned class readings on KDD/IR (eons from now).

KDD Glossaries

Machine Learning Glossary of Terms
Special Issue on Applications of Machine Learning and the Knowledge Discovery Process
http://robotics.stanford.edu/~ronnyk/glossary.html

Machine Discovery Terminology
compiled by W. Kloesgen and J. Zytkow
http://orgwis.gmd.de/projects/explora/terms.html

Data Mining Glossary from Two Crows -
http://www.twocrows.com/glossary.htm

Datawarehouse Terminology
by Creative Data:
http://www.credata.com/research/terminology.html

Introductory material:

KDD -

Knowledge Discovery In Databases: Tools and Techniques
by Peggy Wright
http://www.acm.org/crossroads/xrds5-2/kdd.html?ROLES=0PSA0STA0EMA&DOMAIN=.acm.org

Data Mining -

Introduction to Data Mining and Knowledge Discovery. 3rd Ed. Published by Two Crows Corporation
http://www.twocrows.com/intro-dm.pdf

Web Resources IR -

Practical Issues for Automated Categorization of Web Sites
by John M. Pierre, Metacode Technologies, Inc.
September 2000
http://www.ics.forth.gr/isl/SemWeb/proceedings/session3-3/html_version/semanticweb.html

Info on DAML+OIL from daml.org:

http://www.daml.org/

Tutorials on DAML+OIL from xml.com:

http://www.xml.com/pub/a/2002/01/30/daml1.html
http://www.xml.com/pub/a/2002/03/13/daml.html

Basic basics on Ontology Inference Layer (OIL):

http://www.ontoknowledge.org/oil/

And, of course, more on Web Ontology Language (OWL):

http://www.w3.org/TR/2002/WD-owl-guide-20021104/#Abstract

Current work on Web Resource representation and IR:

For an overview of clickstream analysis of Web activity:

INFORMATIONWEEK.com News, March 12, 2001
Pan For Gold In The Clickstream
http://www.informationweek.com/828/prmining.htm

Using Topic Maps for Web Resources description and IR:

http://www.xml.com/pub/a/2002/09/11/topicmaps.html?page=1

Project Aristotle(sm): Automated Categorization of Web Resources, is a clearinghouse of projects, research, products and services that are investigating or which demonstrate the automated categorization, classification or organization of Web resources. A working bibliography of key and significant reports, papers and articles, is also provided. Projects and associated publications have been arranged by the name of the university, corporation, or other organization with which the principal investigator of a project is affiliated.
http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm

An online textbook for those who REALLY want to get into the nitty gritty of Information Retrieval:

INFORMATION RETRIEVAL, 2nd Ed (1979), by C.J. van Rijsbergen
Department of Computing Science, University of Glasgow:
http://www.dcs.gla.ac.uk/~iain/keith/

Posted by amdonovan at 12:43 PM | Comments (0)

February 28, 2003

Email, Forwarding, Privacy, and Copyright

Slashdot pointed me to this fascinating piece by James Grimmelmann on LawMeme (which I've never followed before but will watch in the future):

Accidental Privacy Spills: Musings on Privacy, Democracy, and the Internet

The article discusses the spread of a personal email by Laurie Garrett, a journalist attending the January World Economic Forum in Davos. In particular, it addresses Garrett's (justifiable?) anger at learning that her "private" email had been forwarded without her permission by someone in her circle of trust and had subsequently been discussed by "techno-liberalists" on lists such as MetaFilter.

All in all, I think that this piece ties together a lot of the themes we've discussed so far in class.

Posted by dcplumer at 10:38 PM | Comments (1)

February 25, 2003

UT groups sharing knowledge

Yesterday there was an interesting day of "Show and Tell" that was co-hosted by the College of Communication / School of Information / Department of Electrical Engineering / Department of Computer Science. I am not sure why the School of Information didn't make this day more apparent to its students, but I will be keeping my eyes and ears open for future forums of this sort.

Speakers (including our own Turnbull, Bias, and Chen) spoke on topics ranging from "Autonomous Learning Agents in Dynamic, Multiagent Environments" to "Educational Content in Video Games" to "Data Mining for Informational Retrieval" ...

What was great about this forum is that innovative, dynamic research is being done across campus and it appears that professors and students are recognizing the importance of sharing knowledge throughout the university community.

The talk by Joydeep Ghosh (his website), who discussed his research on "Data Mining for Info. Retrieval," was particularly timely to our reading of "Lifestreams ..." because he is researching the idea that "Personal data should be accessible anywhere and compatibility should be automatic" (81). A great deal of the advancement of personal information management seems to be locked in the future of creating intelligent agents that will be able to "interact" with users by differentiating the levels of importance that users attach to the information they encounter.
--- Is a website important because the user accesses it frequently, or do they access it frequently simply because it is a default? What banking information is really necessary ... can certain things be hidden, like investment notations, and just appear as calendar reminders? ----

< short rant >
Presently, I stay away from a lot of PIM programs and devices because I have to go through so many steps to program in my preferences, schedules, etc. What I would use is something that might not "be smarter" than me but that would give me confidence because it appears to be ... maybe that is a naive statement, but I think that if I purchase a device to be my "assistant" then it should be as assistant-like as possible. After all, if a real-life assistant couldn't keep all of a boss's comings and goings straight, they'd probably be fired!

< / rant >

To conclude, there appears to be some truly innovative research happening (and/or on the brink of happening) at UT in the fields of knowledge management, data mining, and general information goodness. As a graduate student, it is comforting to learn about the various groups on campus coming together because it means opportunities are ripe for future research.

Posted by khaack at 01:43 PM | Comments (2)

February 04, 2003

the world's oldest profession?

I've been doing some background reading on knowledge discovery and information retrieval, and I find myself lost in the mists of time. The importance of writing to human memory is something I had thought about before, of course, but I had not thought of it in the context of early pictographic writing and how it was used to record data and establish collective knowledge bases. The idea is that an individual's knowledge was no longer bounded by what he/she could remember.

Posted by amdonovan at 08:40 AM | Comments (0)