The MANUS Internet system is used to develop and share source documents in the TEI P5 XML standard (http://www.tei-c.org/). The standard allows very rich metadata about documents and their structure to be recorded, and their text to be represented in rich layers of various tags (see the guidelines). It is increasingly popular among humanities scholars in various fields as a way of presenting texts for online research. The structured, standardized form of the XML files makes them machine-readable, a feature that the MANUS system uses at many stages.
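To illustrate what machine-readable means here, the following minimal sketch reads a few header fields and the paragraphs of a TEI P5 document using only the Python standard library. The sample document and the function name read_header are invented for illustration and are not taken from MANUS; only the TEI namespace URI and the element names come from the TEI P5 standard.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document, for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample letter</title>
        <author>Jan Kowalski</author>
      </titleStmt>
      <publicationStmt><p>Unpublished sample.</p></publicationStmt>
      <sourceDesc><p>Invented for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </body>
  </text>
</TEI>"""


def read_header(tei_xml):
    """Return the title, author and body paragraphs of a TEI P5 document."""
    root = ET.fromstring(tei_xml)
    title_stmt = "tei:teiHeader/tei:fileDesc/tei:titleStmt"
    return {
        "title": root.findtext(f"{title_stmt}/tei:title", namespaces=TEI_NS),
        "author": root.findtext(f"{title_stmt}/tei:author", namespaces=TEI_NS),
        "paragraphs": [
            p.text for p in root.findall("tei:text/tei:body/tei:p", TEI_NS)
        ],
    }


if __name__ == "__main__":
    print(read_header(SAMPLE_TEI))
```

Because every field sits at a predictable path in the header, the same few lines work for any document that follows the same schema.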
The system defines several TEI P5 metadata schemas for various types of documents: from complex descriptions of manuscripts and old prints (with a rich header covering physical description, physical condition and content), through books and chapters, journals, issues and articles, to individual documents not connected by any structure. The system can be configured to display each of these types in one or more separate collections. For each type, the system provides dedicated, user-friendly web forms for editing its TEI P5 metadata. As a result, there is no need to edit the XML file directly, which has a very complex structure. This solution not only makes it easier to enter data through appropriate components, but also allows errors to be corrected efficiently. For advanced users, it is possible to connect a web-based XML editor from the eXistdb database (http://www.exist-db.org/) associated with the system, or to upload XML files edited outside the system. Editors can also upload scans associated with a given TEI P5 document, and the system will automatically add references to these scans to the document.
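How MANUS records scan references is not described here. As a rough sketch of one common TEI P5 approach, the code below adds a facsimile element, with one surface and graphic per scan, directly after the header. The function name add_scan_references, the skeleton document and the scan paths are hypothetical; the element names facsimile, surface and graphic and the url attribute are standard TEI P5.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements without a prefix (default namespace).
ET.register_namespace("", TEI)

# Skeleton document, invented for illustration (not a complete TEI header).
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader/>"
    "<text><body><p>Page text.</p></body></text>"
    "</TEI>"
)


def add_scan_references(tei_xml, scan_urls):
    """Return the document with a <facsimile> listing one <graphic> per scan."""
    root = ET.fromstring(tei_xml)
    facsimile = ET.Element(f"{{{TEI}}}facsimile")
    for url in scan_urls:
        surface = ET.SubElement(facsimile, f"{{{TEI}}}surface")
        ET.SubElement(surface, f"{{{TEI}}}graphic", {"url": url})
    # Place the facsimile right after the header, or first if there is none.
    header = root.find(f"{{{TEI}}}teiHeader")
    position = list(root).index(header) + 1 if header is not None else 0
    root.insert(position, facsimile)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_scan_references(SAMPLE, ["scans/page-001.jpg", "scans/page-002.jpg"]))
```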
The system also defines a TEI P5 schema for biographies of people who are authors, editors, translators, etc. of documents stored in the database. It provides pages that group the documents linked to a person through these roles, regardless of the pseudonym or name form under which the person signed a given document.
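The grouping idea can be sketched as follows, assuming each attribution carries a stable person identifier in addition to the name as signed (in TEI this is often done with a ref attribute on persName, but whether MANUS does so is an assumption). The document identifiers, person identifiers and the function name group_by_person are hypothetical.

```python
from collections import defaultdict

# Hypothetical attributions: (document id, name as signed, person identifier).
# "Bolesław Prus" was the pseudonym of Aleksander Głowacki.
ATTRIBUTIONS = [
    ("doc-001", "Bolesław Prus", "person-glowacki"),
    ("doc-002", "Aleksander Głowacki", "person-glowacki"),
    ("doc-003", "Eliza Orzeszkowa", "person-orzeszkowa"),
]


def group_by_person(attributions):
    """Group document ids by person identifier, ignoring the signed name."""
    grouped = defaultdict(list)
    for doc_id, _signed_name, person_id in attributions:
        grouped[person_id].append(doc_id)
    return dict(grouped)


if __name__ == "__main__":
    for person, docs in group_by_person(ATTRIBUTIONS).items():
        print(person, docs)
```

Keying on the identifier rather than on the signed name is what lets documents signed with a pseudonym appear on the same page as those signed with the legal name.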
The system is integrated with the Solr search engine (http://lucene.apache.org/solr/), which gives users advanced faceted search across different text layers and takes into account lemmatization of Polish (or of another language available in Solr). The system can easily be adjusted to index only selected XML nodes, selected text layers, and selected languages within a text layer. Search results can be limited to a given layer and language and can be exported to TSV files (text/tab-separated-values).
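A search of this kind might translate into a Solr request along the following lines. This is a sketch under stated assumptions, not the MANUS configuration: the core name manus and the field names text, layer and lang are hypothetical, while q, df, fq, facet, facet.field, wt and csv.separator are standard Solr request parameters. The code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed endpoint: Solr's default port and /select handler; the core name
# "manus" is hypothetical.
SOLR_SELECT = "http://localhost:8983/solr/manus/select"


def build_search_url(text, layer=None, language=None, export_tsv=False):
    """Build a Solr select URL, optionally filtered by text layer and language."""
    params = [("q", text), ("df", "text")]  # "text" is a hypothetical field
    if layer:
        params.append(("fq", f"layer:{layer}"))  # hypothetical field
    if language:
        params.append(("fq", f"lang:{language}"))  # hypothetical field
    if export_tsv:
        # Solr's CSV response writer with a tab separator yields TSV rows.
        params += [("wt", "csv"), ("csv.separator", "\t")]
    else:
        # Ask for facet counts over the same hypothetical fields.
        params += [
            ("wt", "json"),
            ("facet", "true"),
            ("facet.field", "layer"),
            ("facet.field", "lang"),
        ]
    return f"{SOLR_SELECT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("kodeks", layer="transcription", language="pl"))
    print(build_search_url("kodeks", language="pl", export_tsv=True))
```

The filter queries narrow the result set without affecting relevance scoring, which matches the idea of limiting results to a given layer and language.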
The portal's clear graphical interface allows users not only to search and browse the contents of XML files, but also to navigate efficiently through related data and to view associated scans together with their text layers. All of this is made possible by a purpose-built XQuery subsystem, which MANUS uses to connect to the associated XML database. Additional functions for working with documents are implemented to meet the needs of specific research projects.
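The text does not say how the XQuery subsystem talks to the database. One possibility, shown here purely as an illustration, is the REST interface of eXist-db, which accepts an XQuery expression in the _query parameter of a GET request. The host, the collection path /db/manus/documents and the function name build_xquery_url are assumptions; _query and _howmany are parameters of the eXist-db REST interface. The code only builds the URL and does not contact a server.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of a local eXist-db REST endpoint.
EXIST_REST = "http://localhost:8080/exist/rest"

# XQuery that lists document titles; the TEI namespace must be declared.
TITLES_QUERY = (
    'declare namespace tei="http://www.tei-c.org/ns/1.0"; '
    "//tei:titleStmt/tei:title"
)


def build_xquery_url(collection, xquery, how_many=10):
    """Build a GET URL that runs an XQuery against a database collection."""
    query_string = urlencode({"_query": xquery, "_howmany": how_many})
    return f"{EXIST_REST}{quote(collection)}?{query_string}"


if __name__ == "__main__":
    # "/db/manus/documents" is a hypothetical collection path.
    print(build_xquery_url("/db/manus/documents", TITLES_QUERY))
```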