
arXiv Network Science Abstracts: 5 Papers (2019-07-11)

Author: Web Research Express

Pairwise link prediction;

Democratic summary of public opinions in free-response surveys;

Dynamics of team library adoptions: an exploration of GitHub commit logs;

A measure for characterizing heavy-tailed networks;

Mobile phone data has the potential to inform infrastructure planning in developing countries;

Pairwise link prediction

Original title: Pairwise Link Prediction

Address: http://arxiv.org/abs/1907.04503

By Huda Nassar, Austin R. Benson, David F. Gleich

Abstract: Link prediction is a common problem in network science that cuts across many disciplines. The goal is to forecast the appearance of new links or to find links that are missing from the network. Typical link prediction methods use the topology of the network to predict the most likely future or missing connections between a pair of nodes. However, network evolution is often mediated by higher-order structures involving more than pairs of nodes. For example, cliques on three nodes (also known as triangles) are key to the structure of social networks, but the standard link prediction framework does not directly predict these structures. To address this gap, we propose a new link prediction task, called pairwise link prediction, which directly targets the prediction of new triangles: given an edge, the task is to find which nodes are most likely to form a triangle with it. We develop two PageRank-based methods for the pairwise link prediction problem and make natural extensions to existing link prediction methods. Our experiments on a variety of networks show that diffusion-based methods are less sensitive to the type of graph used and more consistent in their results. We also show how the pairwise link prediction framework can be used to get better predictions in the context of standard link prediction evaluation.
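To make the task concrete, here is a minimal sketch of one diffusion-style scorer: a seeded (personalized) PageRank whose restart mass is split between the two endpoints of the given edge. This is an illustration only, not the authors' method. The function names, the `alpha` value, and the fixed iteration count are all assumptions made for this example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def seeded_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration for PageRank that restarts uniformly on `seeds`."""
    nodes = list(adj)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            # Every node in `adj` has at least one neighbor by construction.
            share = alpha * rank[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        rank = nxt
    return rank


def rank_triangle_candidates(edges, seed_edge, alpha=0.85):
    """Rank nodes by how likely they are to close a triangle with seed_edge.

    Assumption for this sketch: the score is the PageRank value obtained by
    seeding on both endpoints of the edge. Nodes that already form a
    triangle with the edge are excluded from the ranking.
    """
    adj = build_adjacency(edges)
    u, v = seed_edge
    scores = seeded_pagerank(adj, {u, v}, alpha)
    already_closed = adj[u] & adj[v]
    candidates = [n for n in adj if n not in (u, v) and n not in already_closed]
    return sorted(candidates, key=lambda n: scores[n], reverse=True)


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("a", "c"), ("b", "d"),
                 ("c", "d"), ("d", "e"), ("e", "f")]
    print(rank_triangle_candidates(toy_edges, ("a", "b")))
```

On the toy graph, nodes near the seed edge rank ahead of distant ones. A local heuristic such as counting common neighbors of both endpoints could be substituted for `seeded_pagerank` to compare the two families of methods the abstract mentions.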

Democratic summary of public opinions in free-response surveys

Original title: Democratic summary of public opinions in free-response surveys

Address: http://arxiv.org/abs/1907.04359

By Tatsuro Kawamoto, Takaaki Aoki

Abstract: Social surveys are widely used as a method of gauging public opinion. Sometimes it is preferable to collect opinions by asking questions in a free-response format rather than a multiple-choice format. Despite their advantages, free-response questions are rarely used in practice because they usually require manual analysis. Classifying free-form text can therefore be a formidable task in large-scale surveys, and it can be influenced by the analysts' interpretations. In this study, we propose a network-based survey framework in which responses are automatically classified in a statistically principled manner. This is possible because, in addition to providing text, each respondent also assesses the similarities among responses. We demonstrate our approach using a poll on the 2016 U.S. presidential election and a survey of graduates of a particular university. The proposed method helps analysts interpret the underlying semantics of responses in large-scale surveys.

Dynamics of team library adoptions: an exploration of GitHub commit logs

Original title: Dynamics of Team Library Adoptions: An Exploration of GitHub Commit Logs

Address: http://arxiv.org/abs/1907.04527

By Pamela Bilo Thomas, Rachel Krohn, Tim Weninger

Abstract: When a group of people strives to understand new information, struggle ensues as various ideas compete for attention. Steep learning curves are overcome as teams learn together. To understand how these team dynamics play out in software development, we explore Git logs, which provide a complete change history of software repositories. In these repositories, we observe code additions, which represent successfully implemented ideas, and code deletions, which represent ideas that have failed or been superseded. By examining the patterns between these commit types, we can begin to understand how teams adopt new information. We focus on what happens after a project adopts a software library, i.e., when the library is first used in the project. We find that a variety of factors, including team size, library popularity, and prevalence on Stack Overflow, are associated with how quickly teams learn and successfully adopt new software libraries.
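As a rough illustration of the raw signal described here, the sketch below tallies lines added and deleted per commit from the output of `git log --numstat`. It is not the authors' pipeline. The `@` commit marker, the function names, and the repository path are choices made for this example.

```python
import subprocess


def parse_numstat(log_text):
    """Return {commit_hash: (lines_added, lines_deleted)}.

    Expects output from:
        git log --numstat --pretty=format:@%H
    where each commit starts with a line "@<hash>" (the "@" prefix is an
    arbitrary marker chosen for this sketch), followed by tab-separated
    "added<TAB>deleted<TAB>path" lines. Binary files are reported by git
    as "-<TAB>-<TAB>path" and are skipped here.
    """
    totals = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            current = line[1:].strip()
            totals[current] = (0, 0)
        elif line.strip() and current is not None:
            parts = line.split("\t")
            if len(parts) < 3 or parts[0] == "-":
                continue
            added, deleted = totals[current]
            totals[current] = (added + int(parts[0]), deleted + int(parts[1]))
    return totals


def read_git_log(repo_path):
    """Run git in `repo_path` and return the numstat log as text."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--pretty=format:@%H"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "." is a placeholder; point this at any local Git repository.
    for commit, (added, deleted) in parse_numstat(read_git_log(".")).items():
        print(commit[:8], f"+{added}", f"-{deleted}")
```

Tracking when a library's import lines first appear, and how additions and deletions evolve afterwards, would require inspecting the diffs themselves. This sketch stops at per-commit totals.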

A measure for characterizing heavy-tailed networks

Original title: A measure for characterizing heavy-tailed networks

Address: http://arxiv.org/abs/1907.04808

By Scott A. Hill

Abstract: Heavy-tailed networks are usually characterized in the literature by how closely their degree distribution resembles a power law. However, many real-life heavy-tailed networks do not have a power-law degree distribution, and in many applications the scale-free nature of the network does not matter as long as the network has hubs. Here, we propose the Cooke-Nieboer index (CNI), a non-asymptotic measure of the heavy-tailedness of a network's degree distribution that does not assume a power-law form. The CNI is easy to compute and clearly distinguishes networks with power-law, exponential, and symmetric degree distributions.

Mobile phone data has the potential to inform infrastructure planning in developing countries

Original title: Mobile phone data’s potential for informing infrastructure planning in developing countries

Address: http://arxiv.org/abs/1907.04812

By Hadrien Salat, Zbigniew Smoreda, Markus Schläpfer

Abstract: High-quality census data are not always available in developing countries. Instead, mobile phone data are becoming a go-to proxy for assessing population density, activity, and social characteristics. They offer additional advantages for infrastructure planning, such as being updated in real time, including mobility information, and recording the activity of temporary visitors. We combine various datasets from Senegal to assess the potential of mobile phone data to replace insufficient census data for infrastructure planning in developing countries. As an application case, we test their ability to accurately predict domestic electricity consumption. We show that, contrary to popular belief, average mobile phone activity does not correlate well with population density. However, it can provide better estimates of electricity consumption than basic census data. More importantly, we successfully use curve and network clustering techniques to improve the accuracy of the predictions, to recover good population mapping potential, and to reduce the collection of informative data for planning to substantially smaller samples.

Disclaimer: The copyright of the arXiv abstracts belongs to the original authors of the papers. The translation and compilation are my own; please do not reprint without consent. This series is updated simultaneously on the WeChat public account "Web Science Research Express" (WeChat netsci) and the personal blog https://www.complexly.me (RSS subscription available).