
S-7 Automated data and knowledge discovery in engineering literature

The aim of Task S-7-1 is to provide a corpus of engineering research literature in a machine-readable, uniformly structured format. As a starting point, we identified literature that engineering researchers at TU Darmstadt rated as relevant. Based on this literature list, we harvested sample articles from one journal via the publisher’s API. Even in this rather small sample, some articles were delivered as metadata only, without the full text.

In addition to these initial technical tests, we began examining the legal framework in cooperation with Task S-7-2 (TIB Hannover). To this end, we inspected the text and data mining sections, where available, of the existing license agreements between the University and State Library Darmstadt and the publishers, as well as the publishers’ text and data mining policies, where available. The results vary from publisher to publisher: not all publishers state whether text and data mining of their content is allowed, some allow it under certain conditions, and some require consultation or a separate license agreement. To obtain legal certainty, we contacted several publishers. Although we cannot yet provide structured literature for text and data mining, this preliminary work is a prerequisite for providing it in a legally compliant way in the future.
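
The gap between full-text and metadata-only records observed in the harvesting test can be illustrated with a minimal sketch. It does not reproduce the actual harvesting code or any publisher's API: the record type `HarvestedArticle`, its fields, the helper `split_by_full_text`, and the sample DOIs are all hypothetical and only assume that each harvested record carries some metadata and an optional full-text field.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HarvestedArticle:
    """Hypothetical record for one article returned by a publisher API."""
    doi: str
    title: str
    full_text: Optional[str] = None  # None or empty when only metadata was delivered


def has_full_text(article: HarvestedArticle) -> bool:
    """Return True if the record carries non-empty full text."""
    return bool(article.full_text and article.full_text.strip())


def split_by_full_text(
    articles: List[HarvestedArticle],
) -> Tuple[List[HarvestedArticle], List[HarvestedArticle]]:
    """Separate records with full text from metadata-only records."""
    with_text = [a for a in articles if has_full_text(a)]
    metadata_only = [a for a in articles if not has_full_text(a)]
    return with_text, metadata_only


# Made-up sample records; the DOIs are placeholders, not real articles.
sample = [
    HarvestedArticle("10.0000/example.1", "Article A", "Full text of article A."),
    HarvestedArticle("10.0000/example.2", "Article B"),
    HarvestedArticle("10.0000/example.3", "Article C", "   "),
]

with_text, metadata_only = split_by_full_text(sample)
print(f"{len(with_text)} with full text, {len(metadata_only)} metadata only")
```

A check of this kind would make it possible to report, per journal or publisher, how much of a harvested sample is actually usable for text and data mining.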
