CONTENTS

  • A Machine Learning (ML) model for automatic detection of pneumonia, K. Stokes, University of Warwick
  • ORANGE, Open source machine learning and data visualization for novice and expert. Interactive data analysis workflows with a large toolbox. B. Zupan, University of Ljubljana
  • Eye and Eye Identification, Z. Emersic + Face Deidentification, B. Meden, University of Ljubljana
  • AMORE project: Humans meet Models on Object Naming, G. Boleda et al., University of Pompeu Fabra
  • Unsupervised-Learning Power Control for Cell-Free Wireless Systems, A. Lozano et al., University of Pompeu Fabra
  • The Few-Get-Richer: A Surprising Consequence of Popularity-Based Rankings, G. Le Mens et al., University of Pompeu Fabra

A Machine Learning (ML) model for automatic detection of pneumonia

Katy Stokes

MRC DTP, Applied Biomedical Signal Processing and Intelligent eHealth Lab. S

Pneumonia has a devastating impact on health worldwide, but the majority of the burden falls on low- and middle-income countries (LMICs). To combat this, global access to early diagnosis and referral is key. Machine Learning (ML) is a tool that may be used to overcome certain diagnostic challenges, such as low specificity of symptoms, lack of accessible diagnostic tests and varied clinical presentation. Such models are also suitable for incorporation into phone apps or wearable sensors.

The aim of this study is to investigate the feasibility of an evidence-based and interpretable ML model, using symptoms and signs as features, for the purpose of distinguishing pneumonia patients from bronchitis patients.

Data from 4500 patients (1500 bronchitis, 3000 pneumonia), containing information on population characteristics, symptoms and laboratory test results, were used to train and compare the performance of three of the most common ML methods: logistic regression, decision tree and support vector machine. Manual feature selection was performed so that the model would be built using 6 easily recognised symptoms. This selection was based on both relevance to pneumonia/bronchitis and prior use in similar models reported in the literature. Models were developed through a thorough hold-out process of training, validation and testing.
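The hold-out process described above can be sketched as a stratified split that preserves the bronchitis/pneumonia ratio in each subset. This is only an illustration: the 60/20/20 proportions, the function name `stratified_holdout` and the random seed are assumptions, since the abstract does not report how the data were partitioned.

```python
import random


def stratified_holdout(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split sample indices into train/validation/test sets, class by class.

    The 60/20/20 fractions are an assumption made for illustration only.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle each class separately, then cut it with the same proportions.
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        splits["train"] += indices[:n_train]
        splits["validation"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]
    return splits


# Class sizes taken from the abstract: 1500 bronchitis, 3000 pneumonia.
labels = ["bronchitis"] * 1500 + ["pneumonia"] * 3000
splits = stratified_holdout(labels)
print({name: len(idx) for name, idx in splits.items()})
```

Under these assumed proportions, candidate models would be fitted on the training subset, compared on the validation subset, and only the selected model would be scored once on the test subset.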

The best-performing model was a decision tree, with sensitivity and specificity both above 80%. The tree-based model has the benefit of being easily interpretable, which gives it an advantage over previously reported ML models. Further, it is suitable for incorporation into a diagnostic tool for the early diagnosis and treatment of pneumonia patients in low-resource settings.
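For readers less familiar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity are computed from true and predicted labels. Treating pneumonia as the positive class is an assumption, and the example labels are invented purely to exercise the function; they are not results from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive="pneumonia"):
    """Return (sensitivity, specificity) for a binary classification task.

    Sensitivity = TP / (TP + FN): share of positive cases that are detected.
    Specificity = TN / (TN + FP): share of negative cases correctly ruled out.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy labels (not study data): one miss in each class.
y_true = ["pneumonia"] * 4 + ["bronchitis"] * 4
y_pred = ["pneumonia", "pneumonia", "pneumonia", "bronchitis",
          "bronchitis", "bronchitis", "bronchitis", "pneumonia"]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```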

AMORE project – ERC Starting Grant

Carina Silberer, Sina Zarrieß, Matthijs Westera, Gemma Boleda

The AMORE project combines human data and computational modelling in order to understand how we use language to refer to entities in the world (things, animals, people). We give a concrete illustration of this approach by presenting an example of our recent research. Earlier, we collected a dataset we called ManyNames, by asking hundreds of people to provide names for objects in images, for a total of 25K images and 36 people per image. This poster presents our subsequent work aimed at improving the quality of our data (by filtering out errors and noise) and using the resulting dataset, ManyNames version 2 (MN v2), to compare the object naming behavior of humans to that of current computational models. We analyze issues in the data collection method originally employed, which is standard in Language & Vision research, and find that the main source of noise in the data comes from simulating a naming context solely from an image with a target object marked with a bounding box, which causes subjects to sometimes disagree regarding which object is the target. We also find that both the degree of this uncertainty in the original data and the amount of true naming variation in MN v2 differ substantially across object domains (e.g., vehicles vs. people vs. animals). We use MN v2 to analyze a popular Language & Vision model and demonstrate its effectiveness on the task of object naming. However, our fine-grained analysis reveals that what appears to be human-like model behavior is not stable across domains, e.g., the model confuses people and clothing objects much more frequently than humans do. We also find that standard evaluations underestimate the actual effectiveness of the naming model: on the single-label names of the original dataset (Visual Genome), its accuracy is 27 percentage points lower than on MN v2.
Our work demonstrates the importance, for understanding language in humans as well as models of Language & Vision, of a dataset like MN v2 that does justice to the variation and nuance in natural, human behavioral data, as collected from a large pool of participants. [Gemma Boleda]

Unsupervised-Learning Power Control for Cell-Free Wireless Systems

Rasoul Nikbakht, Anders Jonsson, Angel Lozano

This work applies feedforward neural networks to the problem of centralized power allocation in the downlink of cell-free wireless systems with conjugate beamforming. The formulation relies only on large-scale channel gains. Most importantly, the learning is unsupervised, forgoing the taxing precomputation of training data that supervised learning would require. Two loss metrics are entertained, namely (i) the max-min of the user signal-to-interference ratios (SIRs), or more precisely a generalized form of max-min that can be softened at will to regulate the tradeoff between average performance and fairness, and (ii) the max-product of the SIRs, which intrinsically effects such a tradeoff. The results indicate that the unsupervised-learning approach can match the performance of vastly more computationally demanding methods.
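To make the two loss metrics more concrete, here is a minimal sketch of how they might be written for a given vector of user SIRs. The abstract does not specify the form of the generalized max-min, so the log-mean-exp soft minimum below (with a hypothetical sharpness parameter `alpha`) is an assumption chosen for illustration, not the authors' formulation. The mapping from large-scale channel gains and allocated powers to SIRs is omitted, and the SIR values are invented.

```python
import math


def soft_min(values, alpha):
    """Smooth lower bound on the mean that approaches min(values) as alpha grows.

    Assumed log-mean-exp form; for alpha near 0 it approaches the average.
    """
    m = min(values)  # shift by the minimum for numerical stability
    mean_exp = sum(math.exp(-alpha * (v - m)) for v in values) / len(values)
    return m - math.log(mean_exp) / alpha


def soft_max_min_loss(sirs, alpha=10.0):
    """Loss (i): minimizing it maximizes a softened minimum of the SIRs."""
    return -soft_min(sirs, alpha)


def max_product_loss(sirs):
    """Loss (ii): minimizing it maximizes the product of the SIRs (via logs)."""
    return -sum(math.log(s) for s in sirs)


sirs = [1.0, 2.0, 4.0]  # invented SIR values (linear scale), for illustration only
print(soft_max_min_loss(sirs, alpha=200.0))  # close to -min(sirs)
print(soft_max_min_loss(sirs, alpha=1e-6))   # close to -mean(sirs)
print(max_product_loss(sirs))
```

In an unsupervised setup of this kind, a loss like these would be evaluated directly on the network's power allocations, so no precomputed optimal allocations are needed as training targets.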

Work supported by the Spanish Ministry of Economy and Competitiveness, under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and by the European Research Council (grant agreement 694974).

The Few-Get-Richer: A Surprising Consequence of Popularity-Based Rankings [ID:1863]

Fabrizio Germano (1,2), Vicenç Gómez (1), Gaël Le Mens (1,2,3)

(1) Universitat Pompeu Fabra, Barcelona, Spain   
(2) Barcelona Graduate School of Economics   
(3) Southern Denmark University

Ranking algorithms play a crucial role in online platforms ranging from search engines to recommender systems. In this paper, we identify a surprising consequence of popularity-based rankings: the fewer the items reporting a given signal, the higher the share of the overall traffic they collectively attract. This few-get-richer effect emerges in settings where there are few distinct classes of items (e.g., left-leaning news sources versus right-leaning news sources), and items are ranked based on their popularity. We demonstrate analytically that the few-get-richer effect emerges when people tend to click on top-ranked items and have heterogeneous preferences for the classes of items. Using simulations, we analyse how the strength of the effect changes with assumptions about the setting and human behaviour. We also test our predictions experimentally in an online experiment with human participants. Our findings have important implications for understanding the spread of misinformation.
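The kind of setting described above can be explored with a toy simulation such as the one below. It is a rough sketch under assumed behaviour, not the model analysed in the paper: items belong to one of two classes, each user prefers one class, scans the popularity-ordered list from the top, clicks the first item of the preferred class, and otherwise gives up with some probability at each position. All names and parameter values (`n_minority`, `p_prefers_minority`, `p_continue`, and so on) are hypothetical.

```python
import random


def simulate_popularity_ranking(n_items=20, n_minority=4, n_users=5000,
                                p_prefers_minority=0.5, p_continue=0.7, seed=0):
    """Return the share of clicks going to each class of items (toy model)."""
    rng = random.Random(seed)
    classes = ["minority"] * n_minority + ["majority"] * (n_items - n_minority)

    # Random but fixed tie-breaking, so that no class is favoured before any clicks.
    tiebreak = list(range(n_items))
    rng.shuffle(tiebreak)

    popularity = [0] * n_items
    class_clicks = {"minority": 0, "majority": 0}

    for _ in range(n_users):
        preferred = "minority" if rng.random() < p_prefers_minority else "majority"
        # Popularity-based ranking: most-clicked items are shown first.
        ranking = sorted(range(n_items), key=lambda i: (-popularity[i], tiebreak[i]))
        for item in ranking:
            if classes[item] == preferred:
                popularity[item] += 1
                class_clicks[preferred] += 1
                break
            if rng.random() > p_continue:  # user stops scanning the list
                break

    total = max(1, sum(class_clicks.values()))
    return {cls: n / total for cls, n in class_clicks.items()}


shares = simulate_popularity_ranking(seed=1)
print(shares)
```

Varying `n_minority` while holding the other parameters fixed is one way to examine how the traffic share of a class changes with the number of items in it under these assumptions.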