LEXIA: A Data Science Environment for Semantic Analysis of German Legal Texts
The analysis of legal data with information technology, and in particular with text and data mining algorithms, has attracted considerable interest in legal informatics. Legal science and practice consist of data-, knowledge-, and time-intensive tasks, which have always been a focus of the field. This paper contributes a data science environment that is particularly suited to legal texts, e.g. documents from legislation and jurisdiction, but also contracts and patents. The environment consists of a reference architecture and a specific data model. It also integrates an easily adaptable and extensible text mining engine that allows components to be reused. The baseline architecture for the text mining engine is Apache UIMA. The environment enables users to collaboratively specify linguistic and semantic structures using an existing rule-based scripting language, Apache Ruta. The paper shows how the system can be used to unveil legal definitions in the German Civil Code (BGB): it not only finds them but also determines which legal term is defined and how. This functionality turns unstructured information, i.e., text, into structured information, which allows data scientists and legal experts to investigate and explore legal texts semantically.
Table of contents
- 1. Introduction
- 2. Research method and objectives: The legal domain – a challenge for data science
- 3. Related work
- 4. Reference architecture and data model
- 4.1. Reference architecture for the data science environment
- 4.2. Data Model, Data Storage and Access
- 4.3. Text Mining Engine
- 4.4. Importer and Exporter
- 5. Unveiling semantic structures in laws
- 5.1. Determining legal definitions in legal texts using Apache UIMA and Apache Ruta
- 5.2. UIMA pipeline for semantic annotation of legal texts
- 5.3. Accessing results and annotations through the user interface
- 6. Conclusion, outlook, and future applications
- 7. Acknowledgement
- 8. Bibliography