ATIS DATASETS (EN & GR)
The Airline Travel Information System (ATIS) corpus is a popular benchmark dataset in named entity recognition (NER) research. The domain of the dataset is airline travel, and the intents include finding a flight, enquiring about airlines or their services, asking about ticket prices, etc. The ATIS pilot corpus was designed to measure progress in Spoken Language Systems that include both a speech and a natural language component. Because of this, it has been used extensively to develop, train and test NER and intent classification (IC) networks and question answering systems, for both spoken and written language. The dataset consists of audio recordings with their corresponding manual transcripts, and every utterance is fully labelled, so it suits both spoken-language and written-language approaches.
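To make the idea of a fully labelled utterance concrete, the sketch below shows one possible in-memory representation: a token sequence, one slot tag per token, and a single intent label. The BIO tagging scheme, the label names (`fromloc.city_name`, `toloc.city_name`, `atis_flight`), the example sentence, and the `LabelledUtterance` and `extract_entities` names are assumptions chosen for illustration; the description above does not specify the annotation format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LabelledUtterance:
    """Hypothetical container for one transcript with NER and IC labels."""
    tokens: List[str]
    slot_tags: List[str]  # one BIO tag per token (assumed scheme)
    intent: str


def extract_entities(utt: LabelledUtterance) -> List[Tuple[str, str]]:
    """Group consecutive B-/I- tags into (slot_name, text) pairs."""
    entities = []
    name, words = None, []
    for token, tag in zip(utt.tokens, utt.slot_tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if any.
            if name is not None:
                entities.append((name, " ".join(words)))
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            # Continuation of the entity currently being built.
            words.append(token)
        else:
            # "O" tag (or an inconsistent I- tag): close any open entity.
            if name is not None:
                entities.append((name, " ".join(words)))
            name, words = None, []
    if name is not None:
        entities.append((name, " ".join(words)))
    return entities


# Illustrative example, not taken from the corpus itself.
example = LabelledUtterance(
    tokens=["show", "flights", "from", "boston", "to", "new", "york"],
    slot_tags=["O", "O", "O", "B-fromloc.city_name", "O",
               "B-toloc.city_name", "I-toloc.city_name"],
    intent="atis_flight",
)
entities = extract_entities(example)
print(example.intent, entities)
```

An NER model would be trained to predict `slot_tags` from `tokens`, while an IC model would predict `intent` for the whole utterance.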
UNIWAY DATASET
The UniWay dataset simulates the interaction between students and the university’s back-office staff. The data consists of questions asked by graduate students (including Erasmus students) of the Athens University of Economics and Business (AUEB) and the Aristotle University of Thessaloniki (AUTH), collected through a fully anonymous and GDPR-compliant process. Students were asked to provide two versions of each question, one in English and one in Greek, whenever possible.
PV Power Forecasting (SEGAN, 2023)
We propose a Day Ahead Market-Intra Day Market (DAM-IDM) forecasting tool that facilitates the participation of PV power in the DAM and IDM and consists of four forecasters. The forecasters have been developed according to the operational rules of the DAM and IDM. For the implementation of the forecasting tool, we compared two Deep Learning models, i.e., CNN and Transformer. We emphasize the development of a simple, low-cost, and highly efficient forecasting methodology that is attractive to potential stakeholders.
The forecasting tool has been evaluated on data from five PV plants in the Kozani area, Greece (five CSV files: PV#1, PV#2, …, PV#5). The historical PV production data cover the period from 01/01/2020 to 10/03/2020 at 15-min resolution.
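A 15-min resolution implies 96 readings per plant per day, which gives a simple completeness check before any forecasting model is trained. The following is a minimal sketch of such a check using only the Python standard library. The function names and the one-day example with a single dropped reading are hypothetical, and the column layout of the CSV files is not described here, so the sketch works on plain timestamps rather than on the files themselves.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)  # resolution stated in the dataset description


def expected_timestamps(start, end, step=STEP):
    """Return every timestamp from start (inclusive) to end (exclusive)."""
    stamps = []
    current = start
    while current < end:
        stamps.append(current)
        current += step
    return stamps


def missing_timestamps(observed, start, end, step=STEP):
    """List grid timestamps that are absent from the observed readings."""
    seen = set(observed)
    return [t for t in expected_timestamps(start, end, step) if t not in seen]


# Hypothetical one-day example with one reading removed on purpose.
day_start = datetime(2020, 1, 1)
day_end = day_start + timedelta(days=1)
grid = expected_timestamps(day_start, day_end)
observed = [t for t in grid if t != datetime(2020, 1, 1, 12, 0)]
gaps = missing_timestamps(observed, day_start, day_end)
print(len(grid), gaps)
```

In practice, `observed` would be filled with the timestamps parsed from one plant's file, and any non-empty `gaps` list would indicate readings to impute or exclude.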