Franco BERTACCINI
Università di Bologna
bertaccini@sslmit.unibo.it

 

XTERM: A Flexible Standard-Compliant XML-Based Termbase Management System

 

This demo introduces XTerm, a termbase management system currently under development at the Terminology Center of the School for Interpreters and Translators of the University of Bologna.

 

The project "Languages and productive activities"

The XTerm termbase was originally designed to handle terminological data resulting from more than 7 years of work carried out by students, researchers and staff of the School within the framework of the "Languages and productive activities" project.

The project's primary goal is to promote collaboration between the University and companies located in Emilia Romagna and surrounding regions. Almost every entry in the termbase has been revised by in-house experts working in one of the many companies that participate in the project (e.g. Ferrari, Aprilia and Ferragamo).

The migration from the previous format is still underway, but the system is already fully functional. Once the migration is complete, the termbase will contain more than 70,000 terms, which have already been normalised to minimise inconsistencies. The final result will be a constantly expanding termbase that needs virtually no technical support and whose data can be seamlessly exported in any XML dialect.

However, XTerm is not bound to this particular project and can easily be adapted to other terminological projects. In fact, it is not merely a termbase but a TermBase Management System (TBMS).

The whole system consists of four main components:

1.  the database engine (MySQL, Oracle, Access), which takes care of the low-level handling of the raw data;

2.  XTerm.NET, a graphic environment for data insertion, termbase management, querying and visualisation;

3.  one or more XML Schemata defining the data structure of the termbase (a virtually unlimited number of differently structured termbases can be hosted on the same machine);

4.  XTerm.portal, a web site that provides general access to the termbase(s) through a querying engine.

 

Xml TERMinology for NETworks

XTerm.NET is a terminology management solution that allows users to create, manage and view multilingual terminological databases.

With XTerm.NET it is possible to manage anything from a small monolingual project to a great number of large projects containing millions of terms in all ISO 639 defined languages (depending on the capabilities of the underlying database engine).

The system combines the data consistency of traditional relational databases with the flexibility of the XML Schema definition. It thus becomes not only possible but also extremely easy to customise and replicate the terminological database structure.

The application consists of a small core of base projects and data-handling functionalities. This core is then extended through a number of dedicated plug-ins, resulting in a highly modular and open system. Plug-ins developed so far include:

1.  Termbase-related plug-ins (used to connect the application to various database engines);

2.  Visualisation-related plug-ins (useful to customise the rendering of visual information);

3.  Import and Export plug-ins (that allow terminological data exchange and conversions between XTerm and almost every other terminological format).

The XTerm.NET interface is meant to be extremely user-friendly. Intuitive icons and toolbars ensure smooth navigation through the application, and the uniform look-and-feel, based on consistent use of standard Microsoft Windows interface features, is designed to shorten the learning curve. The interface is customisable, so users can choose their own personal settings. Easy-to-follow, indexed online help with a built-in search function assists the user in finding information on any XTerm.NET feature or function.

 

Metamorphic terminological data definition

XTerm addresses the increasing need to experiment with different schemes for storing and representing terminological data by adopting a standard data definition language: XML Schema. Starting from data definitions stated in this flexible but well-defined XML format, XTerm.NET maps the hierarchical structure defined by the XML Schema onto a relational data definition, which it can handle regardless of what the data definition itself contains. The system will accept just about *any* data structure that conforms to a minimum set of limitations imposed on the XML Schema, which makes it truly dynamic.
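The mapping from a hierarchical XML Schema to relational tables can be illustrated with a minimal sketch. It is not XTerm.NET's actual algorithm: the schema below, the element names and the mapping rule (one table per complex element, one column per simple child, a foreign key to the enclosing element) are all assumptions made for the sake of the example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for illustration only; not the actual XTerm schema.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="entry">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="term" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="language" type="xs:string"/>
              <xs:element name="definition" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def schema_to_tables(element, parent=None, tables=None):
    """Map a complex element to a table and its simple children to columns."""
    if tables is None:
        tables = {}
    name = element.get("name")
    columns = ["id"]
    if parent is not None:
        # Foreign key pointing at the enclosing element's table.
        columns.append(parent + "_id")
    tables[name] = columns
    sequence = element.find(XS + "complexType/" + XS + "sequence")
    children = sequence.findall(XS + "element") if sequence is not None else []
    for child in children:
        if child.find(XS + "complexType") is not None:
            # Nested complex element: gets its own table.
            schema_to_tables(child, name, tables)
        else:
            # Simple element: becomes a column of the current table.
            columns.append(child.get("name"))
    return tables


root = ET.fromstring(SCHEMA)
tables = schema_to_tables(root.find(XS + "element"))
for table, columns in tables.items():
    print(f"{table}: {', '.join(columns)}")
```

Under these assumptions, changing the schema changes the generated table layout without touching the code, which is the kind of flexibility the section describes.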

 

Scaling and portability

The XTerm.NET terminology system scales in both directions. Large companies and institutions can use a central database server (currently, we support MySQL and Oracle Sybase) so that a large number of users can work on a single terminology system, while freelance translators can run it on a desktop database (such as MS Access).

XTerm.NET's ability to communicate with different database engines is currently achieved through a set of plug-ins, i.e. small programs that handle different database types. We are currently working on a web service component that will make XTerm.NET totally independent from the underlying database engine when using a remote connection to a central database server.
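A plug-in layer of this kind can be sketched as a common interface with one implementation per engine. The class and method names below are hypothetical and do not reflect XTerm.NET's actual plug-in API; SQLite (from the Python standard library) stands in for a real engine only so that the sketch runs without a database server.

```python
import sqlite3
from abc import ABC, abstractmethod


class TermbasePlugin(ABC):
    """Hypothetical interface every database plug-in would implement."""

    @abstractmethod
    def add_term(self, language, term):
        """Store one term for the given language code."""

    @abstractmethod
    def find_terms(self, language):
        """Return all terms stored for the given language, sorted."""


class SQLitePlugin(TermbasePlugin):
    """Stand-in for a desktop database plug-in."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS term (language TEXT, term TEXT)"
        )

    def add_term(self, language, term):
        self.conn.execute("INSERT INTO term VALUES (?, ?)", (language, term))

    def find_terms(self, language):
        rows = self.conn.execute(
            "SELECT term FROM term WHERE language = ? ORDER BY term",
            (language,),
        )
        return [row[0] for row in rows]


class InMemoryPlugin(TermbasePlugin):
    """Trivial plug-in that keeps terms in a Python dictionary."""

    def __init__(self):
        self.terms = {}

    def add_term(self, language, term):
        self.terms.setdefault(language, []).append(term)

    def find_terms(self, language):
        return sorted(self.terms.get(language, []))


# Registry mapping an engine name to the plug-in that handles it.
PLUGINS = {"sqlite": SQLitePlugin, "memory": InMemoryPlugin}


def open_termbase(engine):
    """Instantiate the plug-in registered for the requested engine."""
    return PLUGINS[engine]()


for engine in PLUGINS:
    termbase = open_termbase(engine)
    termbase.add_term("it", "pistone")
    termbase.add_term("it", "albero a gomiti")
    print(engine, termbase.find_terms("it"))
```

The calling code never refers to a specific engine, so supporting a new database in this sketch only requires registering another plug-in class.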

 

ISO 12620 compliance

XTerm is being developed with XML and ISO standards in mind, to ensure maximum compatibility with other termbases. The termbase currently under development has been defined as a superset of ISO 12620 (i.e. it contains a higher degree of specificity, especially as far as relations are concerned).

Thanks to XML support, terminological data can then be exported in standard formats such as TBX and MARTIF. Moreover, the metamorphic data structure ensures compatibility with new terminology interchange formats and with future updates of the current standards.
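As an illustration of XML export, the following minimal sketch serialises a few entries into a structure loosely modelled on TBX. The element names (martif, text, body, termEntry, langSet, tig, term) follow the TBX core structure, but the output omits the required header and has not been validated, so it should not be read as conformant TBX or as XTerm's actual export plug-in. The function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serialises as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def export_entries(entries):
    """Serialise {entry_id: {language: [terms]}} to a TBX-like XML string."""
    root = ET.Element("martif", {"type": "TBX"})
    body = ET.SubElement(ET.SubElement(root, "text"), "body")
    for entry_id, terms_by_language in entries.items():
        entry = ET.SubElement(body, "termEntry", {"id": entry_id})
        for language, terms in sorted(terms_by_language.items()):
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: language})
            for term in terms:
                tig = ET.SubElement(lang_set, "tig")
                ET.SubElement(tig, "term").text = term
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data.
sample = {"c1": {"en": ["crankshaft"], "it": ["albero a gomiti"]}}
print(export_entries(sample))
```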

 

XHTML 1.1 compliant web interface

Web-based access to the termbase is achieved through XTerm.portal, a web application that interacts with the termbase.

The portal provides a wide array of search and visualisation options to suit the needs and tastes of every user. “One-button search” provides an extremely simple search mechanism while “Expert search” allows users to fine-tune the search options to retrieve accurate results.

Terminological record display can be customised to show the type and quantity of information the user desires: from simple bilingual glossaries to full-fledged terminological entries containing linguistic, semantic and ontological information (such as grammatical notes, contexts, definitions and relations).

 

XHTML 1.1 compliance ensures cross-platform compatibility and accessibility.

 

CARMA (Computer Aided Relation MAnagement)

Computer Aided Relation MAnagement is a plug-in that helps the terminographer establish relations between entries in the termbase.

Ideally, every entry in the termbase should be related to other entries. CARMA helps keep relations consistent throughout the database and avoid broken links in the conceptual systems by suggesting possible relations between terms. This is done by analysing existing relations and proposing new ones (e.g., if A is related to B, and B is related to C, then it is likely that A is also related to C).
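The A, B, C example can be written as a short sketch. It only implements the single transitive step described above and treats relations as untyped, directed pairs; both simplifications are assumptions, and the function name is hypothetical rather than part of CARMA.

```python
def suggest_relations(relations):
    """Propose (A, C) whenever (A, B) and (B, C) exist but (A, C) does not."""
    related = {}
    for source, target in relations:
        related.setdefault(source, set()).add(target)

    suggestions = set()
    for a, neighbours in related.items():
        for b in neighbours:
            for c in related.get(b, ()):
                # Skip self-relations and relations that already exist.
                if c != a and c not in neighbours:
                    suggestions.add((a, c))
    return sorted(suggestions)


# Hypothetical sample relations.
existing = {("engine", "piston"), ("piston", "piston ring")}
print(suggest_relations(existing))
```

The suggestions are only candidates; in the workflow described above the terminographer would still decide whether each proposed relation is valid.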

 

COSY generator (COnceptual SYstem generator)

This is a plug-in that generates graphical representations of conceptual systems automatically or semiautomatically.

The relations expressed between terms in the database are used to visualise computer-generated representations of the conceptual systems, which terminographers can then edit to suit their own tastes.

This is a great productivity improvement, since terminographers who want to provide graphical representations of a conceptual system no longer have to draw them by hand, possibly wasting even more time and resources on learning to use a drawing tool.
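One way to derive a diagram from stored relations is to emit a graph description that an external layout tool can render. The sketch below produces text in the Graphviz DOT language; the choice of DOT, the function name and the sample relations are assumptions, since the format used by the COSY generator is not specified here.

```python
def to_dot(relations, name="conceptual_system"):
    """Render (source, relation type, target) triples as a DOT digraph.

    Terms containing double quotes are not escaped in this sketch.
    """
    lines = [f"digraph {name} {{"]
    for source, kind, target in sorted(relations):
        lines.append(f'  "{source}" -> "{target}" [label="{kind}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample relations.
sample_relations = [
    ("engine", "has part", "piston"),
    ("engine", "has part", "crankshaft"),
]
print(to_dot(sample_relations))
```

The resulting text can be laid out automatically by Graphviz and then adjusted by hand, which mirrors the automatic-then-edit workflow described above.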

 

Conclusions

At the time of writing, XTerm is a working prototype and there are as yet no specific plans for its release policy. However, we do not rule out the possibility of making it available for research purposes in forms yet to be defined.