Study reveals the language patterns that AI models link to factual and false articles.
New work from MIT researchers looks under the hood of an automated fake news detection system, revealing how machine learning models pick up on subtle but consistent differences in the language of real and false stories. The research also highlights that fake news detectors need to undergo more rigorous testing to be effective in real-world applications.
Popularized as a concept during the presidential election in the United States, fake news is a form of propaganda created to mislead readers, whether to generate views on websites or to steer public opinion. In response, researchers began developing automated fake news detectors: neural networks that “learn” from scores of data to recognize linguistic cues indicative of false articles. Given new articles to assess, these networks can separate fact from fiction with high accuracy in controlled settings.
One issue, however, is the “black box” problem: there is no telling what linguistic patterns the networks analyze during training. The networks are also trained and tested on the same topics, which may limit their ability to generalize to new topics, a necessity for analyzing news across the internet. In a paper presented at the Conference and Workshop on Neural Information Processing Systems, the researchers address both of these issues. They developed a deep learning model that learns to detect the language patterns of fake and real news.
Part of their work “cracks open” the black box to find the words and phrases that the model captures to make its predictions.
In addition, they tested their model on a new topic that it did not see in training. This approach classifies individual articles based solely on language patterns, which more closely represents a real-world application for news readers. Traditional fake news detectors, by contrast, classify articles based on text combined with source information, such as a Wikipedia page or website.
“In our case, we wanted to understand the decision-making process of a classifier based only on language, because this can give us an idea of what the language of fake news is,” says co-author Xavier Boix, a postdoc in the lab of Eugene McDermott Professor Tomaso Poggio at the Center for Brains, Minds and Machines (CBMM) in the Department of Brain and Cognitive Sciences (BCS).
The model identifies sets of words that tend to appear more often in either real or fake news. Some are obvious, others much less so. The findings, the researchers say, point to subtle yet consistent differences between fake news, which favors exaggerations and superlatives, and real news, which leans toward more conservative word choices.
The researchers’ model is a convolutional neural network that trains on datasets of fake news and real news. For training and testing, the researchers used a popular fake news research dataset from Kaggle, which contains approximately 12,000 fake news sample articles from 244 different websites. They also compiled a dataset of real news samples, using more than 2,000 from the New York Times and 9,000 from The Guardian.
In training, the model captures the language of an article as “word embeddings,” in which words are represented as vectors (basically, arrays of numbers), with words of similar semantic meaning clustered closer together. In doing so, it captures triplets of words as patterns that provide some context, such as a negative comment about a political party. Given a new article, the model scans the text for similar patterns and sends them through a series of layers. A final output layer determines the probability of each pattern: real or fake.
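The scan over word triplets can be illustrated with a minimal sketch. It is not the researchers' model: the two-dimensional embeddings, the single hand-set filter, and the max-pooling and sigmoid output are all assumptions chosen to keep the example small, and every name in it is hypothetical.

```python
import math

# Toy 2-d word vectors (hypothetical values; a real model learns these in training).
EMBED = {
    "the": [0.0, 0.0],
    "most": [0.8, 0.1],
    "shocking": [0.9, 0.2],
    "scandal": [0.7, 0.3],
    "officials": [0.1, 0.8],
    "said": [0.0, 0.9],
    "on": [0.0, 0.1],
    "tuesday": [0.1, 0.7],
}
UNKNOWN = [0.0, 0.0]


def embed(tokens):
    """Map each token to its vector, using a zero vector for unknown words."""
    return [EMBED.get(t.lower(), UNKNOWN) for t in tokens]


def trigram_scores(tokens, weights, bias=0.0):
    """Slide a width-3 filter over the text and score every word triplet."""
    vectors = embed(tokens)
    scores = []
    for i in range(len(vectors) - 2):
        # Concatenate the three word vectors into one window.
        window = vectors[i] + vectors[i + 1] + vectors[i + 2]
        scores.append(sum(w * x for w, x in zip(weights, window)) + bias)
    return scores


def fake_probability(tokens, weights, bias=0.0):
    """Max-pool the triplet scores and squash the result into (0, 1)."""
    scores = trigram_scores(tokens, weights, bias)
    if not scores:
        raise ValueError("need at least three tokens")
    return 1.0 / (1.0 + math.exp(-max(scores)))


# One hand-set filter (an assumption): it rewards dimension 0 and penalizes
# dimension 1 for each of the three positions in the window.
FILTER = [1.0, -1.0] * 3

sensational = "the most shocking scandal".split()
measured = "officials said on tuesday".split()
print(fake_probability(sensational, FILTER) > fake_probability(measured, FILTER))
```

A trained network would learn many such filters and their weights from data rather than having one set by hand.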
The researchers first trained and tested the model in the traditional way, using the same topics. However, they thought this might create an inherent bias in the model, because certain topics are more often the subject of fake or real news. For example, fake news stories are more likely to contain the words “Trump” and “Clinton.”
“But that’s not what we wanted,” O’Brien says. “That just shows topics that weigh heavily in fake and real news. That was not enough for us: we wanted to find the actual patterns in language that are indicative of those.”
Next, the researchers trained the model on all topics without any mention of the word “Trump,” and tested it only on samples that had been set aside from the training data and that did contain the word “Trump.” The traditional approach reached 93 percent accuracy; the second approach reached 87 percent. This accuracy gap, the researchers say, highlights the importance of using topics held out from the training process to ensure the model can generalize what it has learned to new topics.
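A held-out-topic split of this kind can be sketched as follows. This is only an illustration under stated assumptions: the four articles are made up, the helper names are hypothetical, and the whole-word, case-insensitive keyword match is one possible way to decide whether an article mentions the topic.

```python
import re


def mentions(text, keyword):
    """True if the keyword appears as a whole word, ignoring case."""
    return keyword.lower() in re.findall(r"[a-z]+", text.lower())


def split_by_keyword(articles, keyword):
    """Train on articles that never mention the keyword; test on those that do."""
    train, test = [], []
    for text, label in articles:
        (test if mentions(text, keyword) else train).append((text, label))
    return train, test


def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    if not examples:
        raise ValueError("no examples to score")
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


# Tiny made-up corpus, for illustration only.
articles = [
    ("Trump rally draws unbelievable record crowd", "fake"),
    ("Senate committee schedules budget hearing", "real"),
    ("Trump signs spending bill, officials said", "real"),
    ("Miracle cure stuns doctors everywhere", "fake"),
]
train, test = split_by_keyword(articles, "Trump")
print(len(train), len(test))
```

Comparing `accuracy` on this held-out set with accuracy on an ordinary random split gives the kind of gap the researchers report.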
More research needed
To open the “black box,” the researchers then retraced their steps. Each time the model makes a prediction about a word triplet, a certain part of the model activates, depending on whether the triplet is more likely to come from a real or a fake news story. The researchers designed a method that retraces each prediction back to its designated part and then finds the exact words that made it activate.
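The idea of tracing an activation back to the words that caused it can be sketched in a few lines. The scoring function below is a stand-in, not the researchers' network: it simply counts words from a small, hypothetical list of emphatic terms, and all names are assumptions made for illustration.

```python
# Hypothetical list of emphatic words standing in for a learned filter.
EMPHATIC = {"most", "shocking", "unbelievable", "ever"}


def toy_activation(triplet):
    """Stand-in for one filter of a trained network: counts emphatic words."""
    return sum(1.0 for word in triplet if word.lower() in EMPHATIC)


def strongest_triplet(tokens, score_triplet):
    """Return the word triplet with the highest activation, plus its score."""
    if len(tokens) < 3:
        raise ValueError("need at least three tokens")
    # Index of the first window whose score is maximal.
    best = max(range(len(tokens) - 2), key=lambda i: score_triplet(tokens[i:i + 3]))
    triplet = tokens[best:best + 3]
    return triplet, score_triplet(triplet)


tokens = "this is the most shocking story ever told".split()
triplet, score = strongest_triplet(tokens, toy_activation)
print(triplet, score)
```

With a real model, `score_triplet` would be replaced by the activation of a learned filter, so the returned triplet shows which words drove that part of the prediction.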
More research is needed to determine how useful this information is to readers, Boix says. In the future, the model could potentially be combined with automated fact-checkers and other tools to give readers an edge in combating misinformation. After some refinement, the model could also be the basis of a browser extension or app that alerts readers to potential fake news language.