BdD des Sciences d’Information

Experiences of Harvesting Web Resources in Engineering using Automatic Classification

Jessica Lindholm, Tomas Schönthal and Kjell Jansson

Tuesday 4 January 2005, by Collecte CND R.L

The story behind Engine-e [1], a recently created robot-generated Web index, is best told by starting in 1994 with the development and maintenance of EELS (Engineering Electronic Library, Sweden) [2], a manually indexed, quality-controlled subject gateway in Engineering. EELS was accompanied by the experimental robot-generated index "All" Engineering [3], created within the DESIRE framework [4]. The solution already used in "All" Engineering is similar to that of Engine-e, though with some distinct differences.

Work on EELS was initiated by SUTL (Swedish University Technology Libraries) in 1994 with the purpose of giving the technology libraries and universities an opportunity to explore the Internet from selected links to valuable resources with a special focus on resources from Sweden and other Nordic countries.

As such it was a very early implementation of a subject-based information gateway [5].

A group of some ten subject editors from Swedish technology universities carried out the tasks of collecting, evaluating, indexing, cataloguing and updating resources in EELS [6].

The technical development of EELS was carried out by NetLab [7], a research and development department at Lund University Libraries, Sweden. Traugott Koch, senior librarian and digital library scientist at NetLab, suggested at an early stage the development of a robot-generated index within EELS, which was later realised in "All" Engineering. The intention was to integrate "All" Engineering as far as possible into the same structure as the original EELS, which used the classification scheme of Engineering Information Inc. [8]. The harvesting robot in "All" Engineering collected resources starting from seven reliable quality-controlled subject gateways and followed their links down to two or three sub-levels [9].
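The harvesting strategy described above (start from a set of trusted gateways and follow links only a few levels down) can be illustrated with a depth-limited breadth-first traversal. The following is a minimal sketch under stated assumptions, not the actual "All" Engineering robot: the function name `harvest`, the `get_links` callback and the toy link graph `TOY_WEB` are all hypothetical, and an in-memory dictionary stands in for fetching real Web pages.

```python
from collections import deque


def harvest(seeds, get_links, max_depth=2):
    """Return {url: depth} for every page within max_depth link hops of a seed.

    `get_links` is an assumed callback that returns the outgoing links of a
    page; a real robot would fetch and parse the page here.
    """
    seen = {url: 0 for url in seeds}       # seeds sit at depth 0
    queue = deque(seen.items())
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:             # do not follow links any deeper
            continue
        for target in get_links(url):
            if target not in seen:         # visit each page only once
                seen[target] = depth + 1
                queue.append((target, depth + 1))
    return seen


# Toy link graph standing in for the Web (all names are hypothetical).
TOY_WEB = {
    "gateway-a": ["page-1", "page-2"],
    "page-1": ["page-3"],
    "page-2": ["gateway-a"],
    "page-3": ["page-4"],
}

found = harvest(["gateway-a"], lambda url: TOY_WEB.get(url, []), max_depth=2)
print(sorted(found.items()))
```

With `max_depth=2`, `page-4` is never collected because it lies three hops from the seed. Limiting depth in this way keeps the harvest close to the vetted starting points, which is presumably why the robot stopped at two or three sub-levels.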

As matters progressed, it became apparent that the work done by the subject editors was the most problematic part of the EELS project. During the period 1994-2000 the Web had expanded exponentially and search engines had greatly improved their performance.

Thus the problem of coverage in EELS became increasingly urgent. The editors discovered how labour-intensive it was to keep EELS up to date in this new environment. At the same time users’ need for EELS seemed to diminish as new generation search engines grew increasingly popular.

Furthermore, the funding for technical development from the Royal Library’s Department for National Co-ordination and Development, BIBSAM [10], had come to an end, and would resume only if EELS were integrated into a planned national service together with the quality-controlled subject gateways of other Swedish national resource libraries.

Eventually the SUTL consortium could no longer guarantee the quality of resources in EELS, due to difficulties experienced by the subject editors. All work on EELS was frozen in 2001.
