Social Sciences & Humanities
Advancing media history by transparent automatic genre classification
The amount of information online and its impact on society imply the need for automated assessment of information quality. As quality assessments can be perceived as subjective or biased, and since commercial social networks often operate in a non-transparent manner, technology supporting such assessments needs to be transparent and tunable to support specific scholarly requirements.
To achieve these goals, we build on the existing QuPiD and NEWSGAC platforms. QuPiD is a proof-of-concept pipeline for information quality assessment that involves crowdsourcing, machine learning, and symbolic reasoning.
To allow scholars to benefit from the platform, we need to empower the user to tune such pipelines. For example, she may decide to collect training data manually or from a crowdsourcing platform, or to use either supervised or unsupervised machine learning methods to analyse the quality of documents. In each case, she should be aware of the implications of her decisions.
We build on the NEWSGAC framework because it allows domain specialists to investigate and tune their machine learning pipelines. We aim to extend NEWSGAC's transparency-enabling architecture to fulfil the above requirements, considering hybrid pipelines that combine crowdsourcing, symbolic reasoning, and machine learning.