Institute of Information Science, Academia Sinica
Topic: TIGP (SNHCC)--Conquering Cross-source Failure for News Credibility: Learning Generalizable Representations beyond Content Embedding
Speaker: Prof. Yi-Shin Chen (Department of Computer Science, National Tsing Hua University)
Date: 2020-06-15 (Mon) 14:00 – 16:00
Location: Webex Meeting ID: 581 250 658 / Password: sfBbyVa429f



Log in after 1:30 PM, June 15. The lecture will start at 2:00 PM. 

False information on the Internet has caused severe damage to society. Researchers have proposed methods for determining the credibility of news and have obtained good results. However, because different media sources (publishers) have different content generators (writers) and may focus on different topics or aspects, the word/topic distribution of each media source diverges from that of the others. We expose a challenge in the generalizability of existing content-based methods: they fail to perform consistently when applied to news from media sources absent from the training set, a problem we call cross-source failure. A cross-source setting can reduce the accuracy of current methods by more than 15–19%; content-sensitive features are considered one of the major causes of cross-source failure for content-based approaches. To overcome this challenge, we propose a syntactic network for news credibility (SYNC), which focuses on function words and syntactic structure to learn generalizable representations for news credibility and to reinforce cross-source robustness across different media. Experiments with cross-validation on 194 real-world media sources showed that the proposed method learns generalizable features and outperforms state-of-the-art methods on unseen media sources. Extensive analysis of the embedding feature representations demonstrates the strength of the proposed method compared with current content-embedding approaches. We envision that, on account of its good generalizability, SYNC will be more robust in real-life applications.