Relations from Italian Wikipedia using Unsupervised Information Extraction

Main Authors: Pierpaolo Basile, Lucia Siciliani, Pierluigi Cassotti, Marco de Gemmis, Pasquale Lops
Format: info dataset
Language: ita
Published: 2021
Subjects:
Online Access: https://zenodo.org/record/5498034
Table of Contents:
  • This dataset contains relations extracted from the Italian Wikipedia by the WikiOIE framework. WikiOIE relies on UDPipe and the Universal Dependencies project for text processing, and it makes it easy to customize the information extraction (IE) approach used to automatically extract triples (subject, predicate, object). The relations in this dataset were extracted by two unsupervised IE methods: the first (simple) is based only on PoS-tag patterns; the second (simpledep) also uses syntactic dependencies. The output of the extraction process is provided in JSON format. More information and the Java code are available at https://github.com/pippokill/WikiOIE
  • Pierluigi Cassotti, Lucia Siciliani, Pierpaolo Basile, Marco de Gemmis, and Pasquale Lops. 2021. Extracting relations from Italian Wikipedia using unsupervised information extraction. In Proceedings of the 11th Italian Information Retrieval Workshop 2021 (IIR 2021). CEUR-WS.
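To make the idea of PoS-tag-pattern extraction concrete, here is a minimal Python sketch of how a (subject, predicate, object) triple might be read off a sentence that has already been tagged with Universal Dependencies UPOS tags. It is not the WikiOIE implementation (which is written in Java) and does not reproduce its actual patterns: the tag sets, the single "nominal run, verbal run, nominal run" pattern, and the names `extract_triples` and `_run_end` are assumptions chosen for illustration. In practice the tags would come from a tagger such as UDPipe; here they are supplied by hand.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, UPOS tag), assumed to be produced by a tagger

# Illustrative tag sets; these are assumptions, not WikiOIE's actual patterns.
NOMINAL = {"NOUN", "PROPN"}
PREDICATE = {"AUX", "VERB", "ADP"}
VERBAL = {"AUX", "VERB"}


def _run_end(tokens: List[Token], start: int, tags: set) -> int:
    """Return the index just past the longest run starting at `start` whose tags are in `tags`."""
    end = start
    while end < len(tokens) and tokens[end][1] in tags:
        end += 1
    return end


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Extract triples matching: nominal run, predicate run with a verb, optional DET, nominal run."""
    triples = []
    i = 0
    while i < len(tokens):
        subj_end = _run_end(tokens, i, NOMINAL)
        if subj_end == i:  # no subject candidate starts here
            i += 1
            continue
        pred_end = _run_end(tokens, subj_end, PREDICATE)
        has_verb = any(tag in VERBAL for _, tag in tokens[subj_end:pred_end])
        # Skip one optional determiner before the object.
        obj_start = pred_end
        if obj_start < len(tokens) and tokens[obj_start][1] == "DET":
            obj_start += 1
        obj_end = _run_end(tokens, obj_start, NOMINAL)
        if has_verb and obj_end > obj_start:
            triples.append((
                " ".join(w for w, _ in tokens[i:subj_end]),
                " ".join(w for w, _ in tokens[subj_end:pred_end]),
                " ".join(w for w, _ in tokens[obj_start:obj_end]),
            ))
            i = obj_start  # the object may serve as the subject of a later triple
        else:
            i = subj_end
    return triples


# Hand-tagged example sentence: "Dante scrisse la Divina Commedia"
sentence = [
    ("Dante", "PROPN"),
    ("scrisse", "VERB"),
    ("la", "DET"),
    ("Divina", "PROPN"),
    ("Commedia", "PROPN"),
]
print(extract_triples(sentence))
```

A pattern of this kind looks only at the linear order of tags, which is why a method that also consults syntactic dependencies, as simpledep is described as doing, can attach subjects and objects that are not adjacent to the verb.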