INF 384C: Organizing and Providing Access to Information

 

Syllabus


Course Information

Instructor: Miles Efron
Office: SZB 562E
Office Hours: Tues. 10:00-11:00
Email: miles@ischool.utexas.edu 
Web: http://www.ischool.utexas.edu/~miles

 

Class Meeting Time and Place:

Tue. 12:00-3:00 SZB 468

Unique Number: 27425

Catalog Description of Course: Introduction to general principles and features of organizing and providing access to information, including varieties and numbers of information-bearing objects, different traditions of practice, user concerns, metadata and metadata formats, document representation and description, subject access, and information system features and evaluation.

 

Instructor’s Description of Course: This is an information science class.  It is about ideas, not particular software or technologies.  Completing this course will help you understand what information is and how organizing it relates to information access.  Our work will focus on three core problems: representing information, classification, and indexing.  Aside from this conceptual framework, students who complete the course will gain a foundation in the following technical skills:

1.    Representing and manipulating structured documents using XML

2.    Data modeling and simple relational database design

3.    Creating and evaluating information retrieval systems.

 

From this list, you can see that this is not a course where you will learn to catalog materials.  Nor is it a course where you will learn archival arrangement.  Perhaps less obviously, this is not a remedial technology course.  We will use computers, but students will largely be expected to perform computation outside of class.  In other words, we won’t devote much time to learning any particular software; I take it for granted that you can do this on your own.  Finally, it should be noted that this course (especially the latter half of it) is inherently mathematical in its focus. 

 

Email policy

Email will be the primary method of communication outside of class.  Thus it is crucial that all enrolled students subscribe to our class listserv. Instructions for joining the list are at

 

            https://utlists.utexas.edu/sympa/

 

The name of the list is inf384cs09.  

 

N.B. You may email the instructor outside of class, but before you do, please ask yourself the following questions:

1.  Can the matter wait?  In this case, save the issue for our next meeting or office hours.

2.  Is it likely that other students share my concern?  In this case, send your note to the class list.

3.  Is the instructor the right person to answer this question?  Don’t be afraid to lean on your fellow students and the purpleshirts in the lab.

 

If you still think it’s the right thing to do, email the instructor.

 

Reading

I ask you to buy the following books for this course:

 

1.  Weinberger, David.  (2007).  Everything is Miscellaneous.  Times Books.

2.  Hunter, Eric J.  (2002).  Classification Made Simple. 2nd Edition.  Ashgate.

3.  Ray, Erik T. (2003). Learning XML. 2nd Edition. Sebastopol: O’Reilly Media.

 

 

 

All other readings for this course are available electronically.  Where possible, I have linked directly to them in the syllabus below.  Where this was not possible, you can find the readings in scholarly journals that are available online through the UT Library website (e-journals).


Graded Assignments

Assignment                           Weight
Metadata quality analysis (group)    15%
XML creation                         10%
Data Modeling                        10%
Midterm exam                         20%
Final exam                           30%
Class engagement                     15%

 

Grading Details:

I will use the following schedule in calculating final grades:

 

A+ = 100

A = 95-99

A- = 90-94

B+ = 85-89

B = 80-84

B- = 75-79

C+ = 70-74

C = 65-69

C- = 60-64

 

F = <60
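
As a rough illustration of how the assignment weights and the grade schedule above could combine, here is a minimal Python sketch. It rests on assumptions the syllabus does not state: that each component is scored from 0 to 100, that the final score is the weighted average of the components, and that a fractional score falls into the band whose lower bound it reaches. The function names are hypothetical.

```python
# Illustrative sketch only. Assumes each component is scored 0-100 and
# the final score is the weighted average of the components.

# Weights in percent, taken from the Graded Assignments table.
WEIGHTS = {
    "Metadata quality analysis": 15,
    "XML creation": 10,
    "Data Modeling": 10,
    "Midterm exam": 20,
    "Final exam": 30,
    "Class engagement": 15,
}

# Lower bound of each band in the grade schedule, highest first.
SCHEDULE = [
    (100, "A+"), (95, "A"), (90, "A-"),
    (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"),
]


def weighted_score(scores):
    """Return the weighted average of the component scores."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a numeric score to a letter grade.

    Assumption: a fractional score belongs to the band whose lower
    bound it reaches; anything below 60 is an F.
    """
    for lower_bound, letter in SCHEDULE:
        if score >= lower_bound:
            return letter
    return "F"


example = {
    "Metadata quality analysis": 90,
    "XML creation": 85,
    "Data Modeling": 80,
    "Midterm exam": 88,
    "Final exam": 92,
    "Class engagement": 95,
}
final = weighted_score(example)
print(final, letter_grade(final))
```

With these example scores the weighted average is 89.45, which falls in the B+ band under the assumptions above.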

 

 

 

A few notes on grading:

  1. Late work is not fair to your fellow students.  Therefore, late assignments will be penalized 1/3 of a letter grade per day, beginning at the time the assignment is due.  Thus an A paper turned in two hours late becomes an A-.  The next day (at the hour class begins) it becomes a B+.  The next day it becomes a B.  After the next day (at class time) the work will receive no credit.

  2. Grading is inherently subjective.  I promise to think seriously about your grades.  Likewise, I expect you to take me at my word.  Please don’t ask me to change a grade unless you truly think I’ve made a mistake.

  3. As the name implies, “class engagement” measures not only the degree to which you evince understanding of course material, but also the extent to which you help the class move forward.  Throughout the semester you should be asking questions (not just of me, but of the class) and expressing opinions about the issues we are covering.

  4. Finally: not everybody in this class will get an A.  Please keep in mind that a B is a very good grade; I reserve anything higher for truly outstanding work.

 

Class Attendance

All students must attend both the midterm and final exams.  Missing either of these exams will result in a zero grade on that test.  The only exceptions that will be made to this rule are:

1.    Students who will be absent due to a religious holiday may reschedule the exam.  However, students must inform the instructor of this plan in writing by the second class meeting.

2.    Students who are too ill to attend class should contact the instructor as soon as possible.  With a note from the student’s doctor, a makeup will be arranged.

Short of these contingencies, all students are required to take the exams on the dates and times shown in the schedule below.

 

On non-exam days, attendance is your prerogative.  Be aware, though, that your class engagement grade depends on the extent to which you distinguish yourself among your peers.  I won’t take roll each week.  But I will notice who routinely says intelligent things.  If you are gone, you won’t be among these students.

 


Detailed Outline

Date / Due / Topics

1/20

 

Preliminaries: information and organization

 

Class slides

 

Motivations for organizing information

 

In-class exercise: Library OPAC search

 

 

Reading:

  1. [obviously optional] Borgman, C. L. (1996).  Why are online catalogs still hard to use? JASIS 47(7), pp. 493-503.  (online through UT e-journals)

 

 

1/27

 

Is ‘everything’ miscellaneous?

 

Class slides

 

In-class list-making exercise

 

Reading:

1.    Weinberger, D.  (2007).  Everything is Miscellaneous.  Times Books.

 

 

Two venues for information organization

Institutional repositories

Reading:

  1. Crow, R.  (2002).  The case for institutional repositories: A SPARC position paper. (available online)

 

Personal Information Management

Reading:

1.    Czerwinski, Mary et al.  (2006).  Digital memories in an era of ubiquitous computing and abundant storage.  Communications of the ACM.  49(1), 44-50. (available online)

2/3

 

Introduction to metadata

 

Class slides

 

In-class Dublin Core exercise

 

Reading:

  1. Wason, Thomas D. and David Wiley.  (2001).  Structured Metadata Spaces. In Greenberg, J. (ed.) Metadata and Organizing Educational Resources on the Internet. NY: Haworth Press.  (pp. 263-277).  [available online]
  2. Hillmann, D. I. (2000). Using Dublin Core. (available online).
  3. Lagoze and Van de Sompel. (2001).  The Open Archives Initiative: Building a low-barrier interoperability framework. JCDL 2001. (available online).
  4. Efron, M.  (2006).  Metadata use in OAI-Compliant institutional repositories.  Digital Curation and Trusted Repositories, JCDL Workshop. (available online).

 

 

 

2/10

Due: Metadata quality analysis

XML 1

 

In-class XML practice

 

Class slides

 

All in-class XML examples are available online

 

Reading:

  1. Ray, Erik T.  Learning XML. Chs. 1, 2, 4.

 

 

2/17

 

 

XML 2 -- XML transformations, XML schemas, RDF, Semantic Web

 

Second in-class XML practice

 

Class slides

 

 

Reading:

  1. Ray, Erik T. Learning XML. Chs. 5, 6, 7.

 

2/24

Due: XML analysis

Semantic Web; RDF, RDF Schema, Ontologies

 

Class slides

 

Reading:

1.    Tim Berners-Lee, James Hendler, and Ora Lassila, "The Semantic Web" Scientific American (May 2001). (available online).

2.    Natalya Noy and Deborah McGuinness "Ontology 101 (1-20, through section 4)". (available online).

3.    Catherine Marshall. "Taking a Stand on the Semantic Web". (available online).

 

 

3/3

 

Metadata and document representation wrap-up

 

Reading:

1.    Shirky, Clay. XML: No Magic Problem Solver [available online].

2.    Weibel, Stuart L. (2005).  Border crossings: Reflections on a decade of metadata consensus building.  D-Lib Magazine.  11(7/8).  [available online].

 

Exam review

 

3/10

 

 

Midterm Exam

 

3/17

 

 

Spring Break -- No class meeting

 

 

 

3/24

 

Introduction to Relational databases (data modeling)

 

class slides (pdf); and slides (keynote)

In-class practice

 

Reading:

1.    Teorey, T. et al. (2006). Database Modeling and Design. Morgan Kaufmann. Ch. 1, Ch. 2.2, Ch. 4.1-4.3 [available through Blackboard].

 

3/31

 

Introduction to classification

 

Class slides

 

Reading:

  1. Hunter.  Classification Made Simple Chs. 1-3, 4.1, 5, 9, 10 (pp. 70-72), 12, 16.

 

In-Class Exercise: Newspaper headline classification

 

4/7

 

 

ECIR -- no class meeting

 

4/14

Due: Data Modeling

Lecture by iSchool visitor

 

Statistical classification

 

Class slides

 

Reading:

  1. Witten, I. and Frank, E.  (2005).  Data Mining: Practical Machine Learning Tools and Techniques.  Morgan Kaufmann. Chs. 1.1-1.2, 2.1-2.3, 4.3, 5.1-5.2. [Available through the UT libraries as an electronic book]

 

4/21

 

Introduction to information retrieval I

 

Class slides

 

Reading:

1.    Manning, Raghavan, and Schütze. (2007) Introduction to Information Retrieval. Cambridge. Ch. 1. (available online)

 

4/28

 

 

 

Introduction to information retrieval II

 

In-Class Exercise: Search Engine Evaluation

 

Reading:

1.    Manning, Raghavan, and Schütze. (2007) Introduction to Information Retrieval. Cambridge. Chs. 1, 6. (available online)

 

5/5

 

 

FINAL EXAM

 

 

 


Administrativa

The University of Texas Honor Code

The core values of the University of Texas at Austin are learning, discovery, freedom, leadership, individual opportunity, and responsibility. Each member of the University is expected to uphold these values through integrity, honesty, trust, fairness, and respect toward peers and community.

Electronic Mail Notification Policy

All students should become familiar with the University's official e-mail student notification policy. It is the student's responsibility to keep the University informed as to changes in his or her e-mail address. Students are expected to check e-mail on a frequent and regular basis in order to stay current with University-related communications, recognizing that certain communications may be time-critical. It is recommended that e-mail be checked daily, but at a minimum, twice per week. The complete text of the policy is available at http://www.utexas.edu/its/policies/emailnotify.html.

In this course e-mail will be used as a means of communication with students. You will be responsible for checking your e-mail regularly for class work and announcements. Note: if you are an employee of the University, your e-mail address in Blackboard is your employee address.

Contacting and Meeting with the Instructor

The instructor keeps weekly office hours (see above).  Students may email the instructor if necessary, but are encouraged to seek advice either during office hours or in class.

Students with disabilities

Any student with a documented disability (physical or cognitive) who requires academic accommodations should contact the Services for Students with Disabilities area of the Office of the Dean of Students at 471.6259 (voice) or 471.4641 (TTY for users who are deaf or hard of hearing) as soon as possible to request an official letter outlining authorized accommodations.

 
