HISTORY AND TYPES OF INDEXES
INDEXING HISTORY
Classifying,
categorizing and organizing information seems to be a basic human activity. Flowing
from this is the need to retrieve information using those skills. Indexing is
the basic mode of doing this. The core of an index is using a known set of
symbols, such as the alphabet.
The book index is the oldest among the
figurative or applied senses of the word, and this specific usage (like
the word itself) goes back to ancient Rome. There, when used in relation to
literary works, the term index was used for the little slip attached to papyrus
scrolls on which the title of the work (and sometimes also the name of the
author) was written, so that each scroll on the shelves could be easily
identified without having to pull it out for inspection. Copyists might take
bits of parchment to make these title slips. From this developed
the usage of index for the title of books.
Finding
the actual first index is difficult because of a lack of
detailed book and manuscript catalogues. Cataloguers of Hebrew and Latin
manuscripts often refer to a table of contents as an index. Many manuscripts
are being digitized, but the cataloguing seems to be inadequate: the presence
of indexes is often not mentioned.
In the
10th century, Hebrew scholars called Masoretes appear to have compiled lists of
words from the Hebrew Bible arranged in alphabetical order, with the phrases
from which they come aligned with them.
An early
form of indexing can also be seen in the tables (canons) comparing the verses
of the four Gospels found in manuscripts from the Dark Ages. The first step in
indexing is the breaking up of manuscripts into chapters, done especially for
the Bible and Justinian’s legal code. Stephen Langton (1150-1228) did this for
the Bible. Chapters had been assigned in 2nd and 3rd century Greek New
Testament manuscripts. He included verse numbers with his chapter divisions.
This made the compilation of concordances to the Bible possible.
Medieval
indexing reflected the interests of the institutions, libraries and monastic
houses where the indexes were compiled. They were what we would call in-house
products. Alphabetical lists of moral issues in a compendium of treatises might
be produced, but neutral information, such as that on rivers, rocks and flowers,
might be ignored. Their usefulness was often limited by copyists’ errors and
mistakes in the numbering of leaves.
Knowledge
of the alphabet could not be assumed. Giovanni di Genoa in his Catholicon (1286) thought it necessary
to explain this:
“‘Amo’ comes before ‘bibo’ because ‘a’ is the first
letter of the former and ‘b’ is the first letter of the latter and ‘a’
comes before ‘b’ … by the grace of God working in me, I have devised this
order”. Eisenstein (1979)
As late
as 1604 Robert Cawdrey in his Table
Alphabeticall of hard English words noted that ‘the reader must learn the
alphabet, to wit: the order of the letters as they stand’. Eisenstein (1979).
FIRST PRINTED INDEXES
One of
the consequences of the invention of printing after 1456 was the
commercialization of publishing. Production now came from printers’ workshops in
commercial centres rather than from stationers and scribes working in
university towns or monastic scriptoria. The use of the alphabet in
arranging the type was also important.
The
oldest printed indexes are found in two editions of St Augustine's De arte
praedicandi, published respectively by Fust and Schoeffer (the printers of
Gutenberg's Bible) in Mainz, and by Mentelin in Strassburg, probably in the
early 1460s. Previous research has established Fust's priority, while Mentelin
probably copied Fust's edition, including the index. The book's preface
specifically mentions the index and explains its use. The index, whose locators
refer to paragraphs indicated by letters, contains 230 entries for only 29
pages of text; it has many cross-references and some rotated multi-word
entries. In a later advertisement for his books, Schoeffer mentioned the index
to Augustine's book as a useful feature. The first dated index appeared in 1468
in Speculum vitae, a moral treatise printed by Sweynheym and Pannartz in Rome.
This index was also reprinted many times by other early printers.
Among all
the books and other printed material that came from the joint press of Fust
and Schoeffer before 1467, there is only one that has an index, namely
St Augustine's De arte praedicandi (On the art of preaching), which is the
fourth part of his larger work De doctrina Christiana.
Shortly
after the Augustine index, whose exact dating is still unknown, the first
dated index appeared in the editio princeps of Speculum vitae, a moral
treatise discussing the advantages and merits as well as the disadvantages and
perils of various professions from king to shepherd, written by the Spanish
bishop Rodrigo de Zamora (Rodericus Sancius Zamorensis, 1404-1470), and printed
by Sweynheym and Pannartz in Rome as their fifth publication in 1468. The book
has 300 large pages (287 x 200 mm), 292 of which contain the preface, table of
contents and text, and only six and a half pages (leaves 147a-150b) of index.
The
printer Peter Schoeffer (ca. 1425-1503) of Mainz specifically mentions in his
catalogues that his better arranged books have complete indexes, an index being
an important sales point. The earliest dated printed list that loosely
represents an index appeared in Epistolae
Hieronymi in 1470; according to Colin Clair (1969) it was a list of the
first words of each of the gatherings. Later that year Ulrich Han printed his
own, with the catch words of each double page listed. He gives a specific date
for Exposito Psalteri, 4 October 1470. Schoeffer
did not date all his first books, so there might have been an earlier one. Hans
H. Wellisch (1986) claims that it was Fust and Schoeffer’s edition of Saint
Augustine’s De arte praedicandi which,
as it can be dated to the early 1460s, had the first printed index.
In these
early years of printing the distinction between the contents page, a register
and an index was not very clear. In a 1554 commentary by St John Chrysostom
on the Epistles of Paul, the ‘indice copiosisimo’ (most copious index) is
placed at the beginning of the work and there is no contents page. The wording
suggests that the publishers thought the index, mentioned on the
title page, would be a drawcard to encourage scholars to buy the book. The
Latin title of the index itself also shows it to be an important feature. It
translates as “Most copious index of all the matters which we have in this
book”.
Toscanelli’s
1568 commentary on Virgil has an alphabetical table of contents, which is in
fact as comprehensive as the previously mentioned ‘indice copiosisimo’. The
table has a title in Italian that translates as ‘table taking up the notable
content in this present work of observations on Virgil’.
Ortelius
in his 1570 Theatrum Orbis Terrarum starts with an ‘Index tabularum’, which
lists the maps alphabetically. For a more specific location you have to refer to
the gazetteer at the end, which does not give page numbers, only the name
of the area in which a place lies, and does not necessarily relate to the maps.
Indexes
go back well before the 17th century. Gerard's Herbal from the 1590s had
several fascinating indexes, according to Hilary Calvert. Barbara Cohen writes
that the alphabetical listing in the earliest ones only went as far as the
first letter of the entry; no one thought at first to index each entry in
either letter-by-letter or word-by-word order. Maja-Lisa writes that Peter
Heylyn's 1652 Cosmographie in Four Bookes includes a series of tables at
the end. They are alphabetical indexes and he prefaces them with "Short
Tables may not seeme proportionable to so long a Work, especially in an Age
wherein there are so many that pretend to learning, who study more the Index
then they do the Book."
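The difference between first-letter ordering and the later letter-by-letter and word-by-word orders can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the entry list and the function names are hypothetical, and real filing rules handle punctuation, numerals and diacritics in more detail.

```python
import re

# Hypothetical index entries, in the order an indexer happened to write them down
entries = ["New York", "Newark", "Bede", "New Zealand", "Newton", "Bacon"]

def first_letter_order(items):
    # Group by initial letter only; within a letter the original order is kept
    # (Python's sort is stable), as in the earliest alphabetical lists
    return sorted(items, key=lambda s: s[0].lower())

def letter_by_letter(items):
    # Ignore spaces and punctuation and compare the remaining characters
    return sorted(items, key=lambda s: re.sub(r"[^a-z0-9]", "", s.lower()))

def word_by_word(items):
    # Compare one word at a time, so "New ..." files before "Newark"
    return sorted(items, key=lambda s: s.lower().split())

for order in (first_letter_order, letter_by_letter, word_by_word):
    print(order.__name__, order(entries))
```

Under first-letter ordering "Bede" can still precede "Bacon"; the other two orders differ only in where the space in "New York" falls.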
EIGHTEENTH CENTURY
The 18th
century produced some interesting index entries. The Lady's Magazine, or
Entertaining Companion for the Fair Sex, Vol. 7, 1776, has separate indexes
for essays and prose, for poetry, and for births, marriages and deaths.
An
important indexer from this period was Alexander
Cruden (1699-1770), with his famous A Complete Concordance to the Holy Scriptures (1737). He is reputed to have been so enthusiastic
about compiling it, working late into the night, that he failed to notice that
the stock of his bookshop was being depleted, and was surprised
by the subsequent fall in sales.
NINETEENTH CENTURY
Indexing came
to be more professional in this period, but at first there was more of the
same.
Another
magazine, from the early 19th century, was The
Lady's Monthly Museum. Vol. 6, 1801, has an index
similar to the earlier one. Notes and
Queries was a journal for artists, antiquaries, genealogists and
literary people. The index of Vol. 1, 1849-50, included titles and headings
exactly as they appeared in the text, with some curious entries as a result.
John
Fiske, sometime librarian at Harvard, wrote Darwinism and Other Essays (1893) and might have indexed it himself.
These
people might have benefitted from the services of Mary Petherbridge (1870-1940).
She set up a Secretarial Bureau in 1894, offering secretarial and library
services which included training in those fields as well as training in
indexing. In 1904 she published a manual called “The technique of indexing”. She worked as a freelance indexer
for various publishers and even government departments, especially the India
Office. She wrote an article, ‘Indexing as a profession for women’, for the
magazine Good Housekeeping in
1923. She explained the steps in indexing and the various types of indexes,
for books and periodicals as well as for documents:
“There are four stages in indexing:
The writing of the slips;
The alphabetical listing of them;
The critical editing and its attendant research;
The proof reading, this must always be done by the
indexer personally.”
However,
indexes in the modern sense, giving exact locations of names and subjects in a
book, were not compiled in antiquity, and only very few seem to have been made
before the age of printing. There are several reasons for this. First, as long
as books were written in the form of scrolls, there were neither page nor leaf
numbers nor line counts (as we have them now for classical texts). Also, even
had there been such numerical indicators, it would have been impractical to
append an index giving exact references, because in order for a reader to
consult the index, the scroll would have to be unrolled to the very end and
then to be rolled back to the relevant page. (Whoever has had to read a book
available only on microfilm, the modern successor of the papyrus scroll, will
have experienced how difficult and inconvenient it is to go from the index to
the text.) Second, even though popular works were written in many copies
(sometimes up to several hundreds), no two of them would be exactly the same,
so that an index could at best have been made to chapters or paragraphs, but
not to exact pages.
THE ORIGINS OF ALPHABETICAL INDEXING
Rouse and
Rouse report that subject indexing was invented in Paris in the thirteenth
century. (This is ironic in light of the absence of indexes in many modern
French works.) It is an interesting coincidence that the citation index
described by Gaster was produced in the same century as the first subject
indexes, and in the same country — in the city of Avignon.
Wellisch
(1994) suggests that subject indexing began in the 4th century with the
Apothegmata, a compilation of sayings of the Greek Church Fathers. Witty
describes this as an alphabetically arranged tool rather than a subject index
to a narrative text (1973, p. 196). Richardson (1939, p. 844) states that the
earliest Biblical dictionary was the Onomasticon by Eusebius (264-340 C.E.),
but it was not in alphabetical order.
Bacher
(1912) notes that a Greek dictionary of Biblical proper names is ascribed to
Philo Judaeus, who lived in Alexandria from 20 B.C.E. to 40 C.E.
FUNCTION OF AN INDEX
The function
of an index is to provide users with an effective and systematic means for
locating documentary units (complete documents or parts of documents) that are
relevant to information needs or requests. An index should therefore:
a. identify
documentary units that treat particular topics or possess particular features.
b. indicate
all important topics or features of documentary units in accordance with the
level of exhaustivity appropriate for the index.
c.
discriminate between major and minor treatments of particular topics or manifestations
of particular features.
d. provide
access to topics or features using the terminology of prospective users.
e. provide
access to topics or features using the terminology of verbal texts being
indexed whenever possible.
f. use
terminology that is as specific as documentary units warrant and the indexing
language permits.
g. provide
access through synonymous and equivalent terms.
h. guide
users to terms representing related concepts (narrower terms, other related
terms, and if possible, broader terms).
i. provide
for the combination of terms to facilitate the identification of particular
types or aspects of topics or features and to eliminate unwanted types or
aspects.
j. provide a
means for searching for particular topics or features by means of a systematic
arrangement of entries in displayed indexes or, for non-displayed indexes, by
means of a clearly documented and displayed method for entering, combining, and
modifying terms to create search statements and for reviewing retrieved items.
TYPES OF INDEX
Indexes may
be categorized by type of object to which headings refer; by type of term used
for index headings; by type or extent of indexable matter used to produce the
index; by method of arranging entries; by method of term coordination; by type,
format, genre, or medium of documents being indexed; by medium of the index; by
mode of publication; by periodicity, that is, whether the index is a one-time
(closed-end) index or a continuing (open-end) index; and by type of authorship.
The following examples illustrate common types of indexes. They are by no means
exhaustive.
Indexes by type of object referred to
A. authors:
all types of document creators such as writers, composers, illustrators,
translators, editors, choreographers, artists, sculptors, painters, inventors.
B. subjects
(topics or features): topics treated in documents and/or features of
documentary units (for example, genre, format, methodological approach).
Separate indexes are often devoted to special types of topics such as persons,
places, or corporate bodies; features, such as genres (for example, poetry,
drama); or notations, such as International Standard Book Numbers (ISBN).
Indexes by type of term used for headings
A. names:
proper nouns, such as names of persons, places, corporate bodies.
B. numbers
or notations: numerical or coded designations, such as classification notation,
patent number, ISBN, date.
C. words and
phrases: common words and phrases (as opposed to names or proper nouns).
Indexes by type or extent of indexable matter on which
an index is based
A. full text
of documents.
B.
abstracts.
C. titles
only.
D. first
lines only (for example, first lines of poems).
E. citations
(reference citations to other documents)
Indexes by arrangement of entries
A.
alphabetical or alphanumeric.
B.
classified: Headings arranged on the basis of relations among concepts
represented by headings, for example, hierarchy, inclusion, chronology, or
other association. Classified indexes are often based on existing classification
schemes, such as the Dewey Decimal Classification.
C.
alphabetico-classed: Broad headings arranged alphabetically. Narrower headings
are grouped under broad headings and arranged alphanumerically or relationally
on the basis of hierarchy, inclusion, chronology, or other association.
NOTE:
Electronic indexes often have no arrangement that is apparent to the user.
However, indexes designed for human scanning, browsing, and examination must
have some arrangement, regardless of medium.
Indexes by method of document analysis
A. human
intellectual analysis and identification of topics and concepts expressed and/or
features manifested.
B. computer
algorithms designed to identify useful terms, phrases, or features.
C.
combination of computer-based and human analysis.
Indexes by method of term selection
A.
assignment of terms to represent topics and features (whether or not the term
is in the documentary unit being indexed).
B.
extraction of terms from the documentary unit.
C. a
combination of assignment and extraction methods
Indexes by method of term coordination
A.
pre-coordinate combination, such as subject heading indexes, string indexes,
chain indexes, keyword indexes (including KWIC, KWOC, KWAC indexes), rotated,
and permuted indexes.
B.
post-coordinate combination. Includes the use of Boolean operators, proximity
measures, and the combination of weighted terms
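Post-coordinate combination can be illustrated with a small sketch in which each document is indexed under single terms and the searcher combines them at query time with Boolean operators. The documents, terms and helper names below are hypothetical, and the sketch leaves out proximity measures and term weighting.

```python
from collections import defaultdict

# Hypothetical documents, each already assigned single index terms
documents = {
    1: {"indexing", "history", "printing"},
    2: {"indexing", "automation"},
    3: {"printing", "history"},
    4: {"indexing", "history"},
}

def build_postings(docs):
    # Invert the assignment: each term maps to the set of documents indexed under it
    postings = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return dict(postings)

postings = build_postings(documents)

def lookup(term):
    # Unknown terms retrieve nothing rather than raising an error
    return postings.get(term, set())

# Terms are coordinated only when the search is made
both = lookup("indexing") & lookup("history")        # AND
either = lookup("automation") | lookup("printing")   # OR
without = lookup("indexing") - lookup("printing")    # AND NOT
print(sorted(both), sorted(either), sorted(without))
```

A pre-coordinate index, by contrast, would fix a combination such as "printing, history of" in the heading itself when the index is made.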
Indexes by type, periodicity, format, genre, or medium
of document(s) being indexed
Examples
are: books, monographs, periodicals, serials, poetry, fiction, short stories,
films, videos, illustrations, pictures, paintings, artifacts, software,
computer-readable texts, maps, and sound recordings.
Indexes by medium of index
A. printed
or written.
B.
microform.
C. electronic
media, including online, CD-ROM.
D. Braille
Indexes by proximity of documentary units
A. indexes
published together with the documentary units to which they refer, including
both back-of-the-book indexes and full-text databases.
B. indexes
published separately from the documentary units to which they refer.
Indexes by periodicity of the index
A. one-time,
closed-end indexes.
B.
continuing, open-end indexes.
Indexes by authorship
A. authored:
a separately authored document, distinct from the document(s)
being indexed. It is created independently by one or more persons
through intellectual analysis of text, as distinguished from indexes that are
created solely through algorithmic analysis of text carried out electronically.
B.
automatically generated.
Genealogical Indexing
Genealogical
indexes allow users to look up people’s names and find information about
personal and family relationships. They often eliminate the need to access
original source materials such as cemetery inscriptions.
Book indexing
Book
indexes provide access to detailed contents of books. Back of the book indexes
are made for all types of non-fiction books, including textbooks, multi-volume
works, technical reports and annual reports. Book indexes are lists of words,
generally alphabetical, at the back of a book, giving a page location of the
subject or name associated with each word.
Legal Indexing
Legal
indexing involves indexing of legal materials by form and content. Legal
indexers are familiar with legal concepts and classification and are able to
translate the classification into accessible indexes.
Periodical and Newspaper Indexes
Periodical
and newspaper indexes give access to the contents of individual articles and
other items in serialized publications. Many periodical and newspaper indexes
are based on a controlled vocabulary to ensure consistent use of terms from
year to year.
Pictorial Indexes
Indexes
to images help users identify relevant pictures in collections of photographs,
art works, videos and films.
Word Indexes
Word
and name indexes, which are sometimes called concordances, are indexes to the
individual names and words that the author used, and in one sense most closely
represent the information and ideas the author had in mind when creating the
manuscript. These indexes list the exact terms or words as they occur in the
document, pinpointing the subject discussed and its location.
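A word index of this kind can be sketched as a mapping from each word to the places where it occurs, together with a little surrounding context. The sample text, the function name `build_concordance` and the `width` parameter are assumptions made for illustration; a real concordance would also deal with spelling variants and very common words.

```python
import re
from collections import defaultdict

def build_concordance(lines, width=3):
    """Map each word (lower-cased) to a list of (line number, context) pairs."""
    concordance = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        words = re.findall(r"[A-Za-z']+", line)
        for i, word in enumerate(words):
            # Keep up to `width` words on either side of the indexed word
            context = " ".join(words[max(0, i - width): i + width + 1])
            concordance[word.lower()].append((line_no, context))
    # Return the headings in alphabetical order
    return dict(sorted(concordance.items()))

# Illustrative two-line text
text = [
    "In the beginning was the Word",
    "and the Word was with God",
]
conc = build_concordance(text, width=1)
for heading, locations in conc.items():
    print(heading, locations)
```

Every word becomes a heading here, which is what separates a concordance from a subject index chosen by intellectual analysis.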
String Indexes
String
indexing is a modern term for the idea of displaying a series of rotated
index entries generated from the basic list of index terms that make up the string. The
objective is to give the user an entry point for all index terms and to display
them in context with each other. A string index is usually, but not necessarily,
output by a computer. Except for basic keyword in context (KWIC) indexing,
string indexing is generally not a fully automated process.
Permuted Title Indexes
Permuted
title indexes are created by systematically rotating the information-conveying
words in a title so that each serves as a subject entry point into the index.
The premise of a permuted title index is that titles effectively indicate the
content of documents. Because it reflects the content of a document, a permuted
title index helps users decide whether that document would satisfy their
information needs.
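The rotation described above can be sketched as follows. The stop-word list, the "/" marker for the original start of the title and the function name are assumptions for illustration; published KWIC indexes use their own stop lists and display layouts.

```python
# Words assumed to carry no subject information (a hypothetical stop list)
STOP_WORDS = {"a", "an", "and", "as", "for", "in", "of", "on", "the", "to"}

def permuted_title_index(titles):
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            if i == 0:
                rotated = title
            else:
                # Bring the keyword to the front; "/" marks where the title began
                rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    # File the rotated entries alphabetically by keyword
    return sorted(entries)

titles = ["The Technique of Indexing", "Indexing as a Profession for Women"]
for keyword, entry in permuted_title_index(titles):
    print(f"{keyword:12} {entry}")
```

Each title yields one entry per significant word, so the first title is reachable under both "technique" and "indexing".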
Multimedia Indexes
These
indexes integrate images, sounds, and textual material.
Internet Indexes
Internet
indexes exist in traditional forms, in hypermedia forms, in automatic forms,
and in implicit forms.
Hypermedia Indexes
Hypermedia
is text and non-textual material in an electronic form that allows users to
search non-linearly by associations. This type of index allows users to
thread their way to what they want through electronic nodes and links between
those nodes.
First-Line Indexes
First-line
indexes generally refer to poems. In these indexes all the words in the first
line of the poem are listed in their alphabetical position in the index.
Faceted Indexes
Facet,
by definition, means one side of something that has many sides. In a faceted
indexing system, any subject is not a single unit but has many aspects; thus a
faceted index attempts to discover all the individual aspects of a subject and
then synthesize them in a way that best describes the subject under discussion.
Cumulative Indexes
A
cumulative index is a combination or merging of a set of indexes over time.
Such indexes for established works can often cover many decades. Generally,
such indexes apply to journals and to large, important works and are published
as separate volumes. Cumulative indexes are complex and usually are done by
teams of indexers.
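At its simplest, the merging step behind a cumulative index can be sketched as below, assuming each volume's index is already available as a mapping from heading to page numbers. The data and the function name `cumulate` are hypothetical, and the editorial work of reconciling headings that changed between volumes is not shown.

```python
from collections import defaultdict

def cumulate(volume_indexes):
    """Merge per-volume indexes; locators become (volume, page) pairs."""
    merged = defaultdict(list)
    for volume, index in volume_indexes.items():
        for heading, pages in index.items():
            merged[heading].extend((volume, page) for page in pages)
    # Alphabetical headings, each with its locators in volume and page order
    return {heading: sorted(locs) for heading, locs in sorted(merged.items())}

# Hypothetical indexes for two volumes of the same journal
volumes = {
    1: {"concordances": [12, 40], "scrolls": [3]},
    2: {"concordances": [7], "printing": [15, 16]},
}
cumulative = cumulate(volumes)
for heading, locators in cumulative.items():
    print(heading, locators)
```

The volume number has to travel with every page number, since the same page number recurs in each volume.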
Coordinate Indexes
Coordinate
indexes allow terms to be combined or coordinated.
Classified Indexes
A
classified index has its contents arranged systematically by classes or subject
headings.
Citation Indexes
A
citation index consists of a list of articles, with a sub-list under each
article of the subsequently published papers that cite it. In other words,
given a particular paper, a citation index shows who cited that paper at a
later point in time. Basically, this kind of index implies that a cited paper
has an internal subject relationship with the papers that cite it, and this
relationship is used to cluster related documents.
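The structure of a citation index can be sketched by inverting reference lists: starting from what each paper cites, build the list of later papers that cite each earlier one. The paper labels and the function name are hypothetical placeholders, not real publications.

```python
from collections import defaultdict

# Hypothetical papers, each mapped to the earlier papers in its reference list
references = {
    "Paper C (1990)": ["Paper A (1980)", "Paper B (1985)"],
    "Paper D (1995)": ["Paper A (1980)"],
    "Paper B (1985)": ["Paper A (1980)"],
}

def build_citation_index(refs):
    # Invert the reference lists: cited paper -> papers that cite it
    cited_by = defaultdict(list)
    for citing, cited_list in refs.items():
        for cited in cited_list:
            cited_by[cited].append(citing)
    return {cited: sorted(citing) for cited, citing in sorted(cited_by.items())}

citation_index = build_citation_index(references)
for cited, citing in citation_index.items():
    print(cited, "is cited by", citing)
```

Papers that share a cited paper, such as the three listed under "Paper A (1980)", form the kind of cluster of related documents described above.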