Here, I highlight some of my most recent and relevant publications. For a complete list of my publications, visit my Google Scholar profile.

  1. Learning Latent Variable Models from Non-Stationary Data Streams

    Making inferences from a data stream is challenging for several reasons. First, it requires continuous model updating and the ability to handle a posterior distribution conditioned on an unbounded data set. Second, the underlying data distribution may drift from one time step to the next, so the classic i.i.d. (or data exchangeability) assumption no longer holds. In this paper, we present a Bayesian approach that addresses these issues for general latent variable models within the conjugate exponential family. Our proposal relies on a novel scheme based on hierarchical (non-conjugate) priors to explicitly model temporal changes of the model parameters, which induces an exponential forgetting mechanism with adaptive forgetting rates. We derive a variational inference scheme that maintains the computational efficiency of variational methods over conjugate models. A toy sketch of the forgetting mechanism is given after the reference below.

    Masegosa, A. R., Ramos-López, D., Nielsen, T. D., Langseth, H., & Salmerón, A. Learning Latent Variable Models from Non-Stationary Data Streams. Submitted to Bayesian Analysis, 2019.
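
    Not code from the paper, just a minimal sketch of the underlying idea: in a conjugate Beta-Bernoulli model, raising the previous posterior to a power ρ ∈ (0, 1] before each update ("exponential forgetting") geometrically down-weights old observations. The paper goes further and places a hierarchical prior on the forgetting rate so that it adapts to the amount of drift; the fixed rho below is purely illustrative.

    ```java
    /**
     * Minimal illustrative sketch (not the paper's full algorithm): exponential
     * forgetting in a conjugate Beta-Bernoulli model over a data stream.
     * Raising the previous posterior Beta(a, b) to a power rho in (0, 1] yields
     * Beta(rho*(a-1)+1, rho*(b-1)+1), which geometrically down-weights old data.
     * In the paper the forgetting rate is given a hierarchical prior and adapts;
     * here it is fixed for simplicity.
     */
    public class ExponentialForgettingSketch {

        public static void main(String[] args) {
            double a = 1.0, b = 1.0;   // Beta(1,1) prior pseudo-counts
            double rho = 0.9;          // fixed forgetting rate (adaptive in the paper)

            // A stream whose success probability drifts from roughly 0.8 to 0.2.
            int[][] batches = {
                {1, 1, 1, 0, 1}, {1, 1, 0, 1, 1},   // before the drift
                {0, 0, 1, 0, 0}, {0, 1, 0, 0, 0}    // after the drift
            };

            for (int[] batch : batches) {
                // Power-prior step: exponentiate the previous posterior by rho.
                a = rho * (a - 1.0) + 1.0;
                b = rho * (b - 1.0) + 1.0;

                // Standard conjugate update with the new batch.
                for (int x : batch) {
                    if (x == 1) a += 1.0; else b += 1.0;
                }
                System.out.printf("posterior mean of p = %.3f%n", a / (a + b));
            }
        }
    }
    ```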

  2. Scaling up Bayesian variational inference using distributed computing clusters

    In this paper, we present an approach for scaling up Bayesian learning using variational methods by exploiting distributed computing clusters managed by modern big data processing tools such as Apache Spark or Apache Flink, which efficiently support iterative map-reduce operations. Our approach is defined as a distributed projected natural gradient ascent algorithm, has excellent convergence properties, and covers a wide range of conjugate exponential family models. We evaluate the proposed algorithm on three real-world datasets from different domains (the PubMed abstracts dataset, a GPS trajectory dataset, and a financial dataset) and with several models (LDA, factor analysis, mixture of Gaussians, and linear regression). Our approach compares favorably to stochastic variational inference and streaming variational Bayes, two of the main current proposals for scaling up variational methods. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes and approx. latent variables, using a computer cluster with 128 processing units (AWS). The proposed methods are released as part of an open-source toolbox for scalable probabilistic machine learning (http://www.amidsttoolbox.com). A conceptual sketch of the underlying map-reduce computation is given after the reference below.

    Andrés R. Masegosa, Ana M. Martinez, Helge Langseth, Thomas D. Nielsen, Antonio Salmerón, Darío Ramos-López, & Anders L. Madsen. Scaling up Bayesian variational inference using distributed computing clusters. International Journal of Approximate Reasoning, 88, 435-451, 2017.
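
    A conceptual sketch, not code from the AMIDST toolbox: for conjugate exponential-family models, the natural gradient of the ELBO with respect to the global variational parameters decomposes into a sum of (expected) sufficient statistics over the data, so the expensive part of each update is an embarrassingly parallel map-reduce. Below, Java 8 parallel streams stand in for the Spark/Flink cluster used in the paper, on a toy fully observed Normal model with known variance and a Normal prior on the mean; all names and values are illustrative.

    ```java
    import java.util.Arrays;
    import java.util.stream.IntStream;

    /**
     * Conceptual sketch only: for conjugate exponential-family models the natural
     * gradient of the ELBO w.r.t. the global variational parameters is
     *   (prior natural parameters) + (sum of sufficient statistics) - (current parameters),
     * so the costly part is a data-parallel sum that maps directly onto map-reduce.
     * Parallel streams are used here as a stand-in for a Spark/Flink cluster.
     */
    public class DistributedVISketch {

        public static void main(String[] args) {
            double priorMean = 0.0, priorVar = 10.0, likVar = 1.0;  // toy model choices
            double[] data = new double[1_000_000];
            Arrays.setAll(data, i -> 3.0 + Math.cos(i));            // synthetic observations

            int partitions = 8;
            int chunk = data.length / partitions;

            // "Map": each partition computes local sufficient statistics (sum, count).
            // "Reduce": partial statistics are summed; both phases run in parallel.
            double[] stats = IntStream.range(0, partitions).parallel()
                    .mapToObj(p -> {
                        double sum = 0.0;
                        for (int i = p * chunk; i < (p + 1) * chunk; i++) sum += data[i];
                        return new double[]{sum, chunk};
                    })
                    .reduce(new double[]{0.0, 0.0},
                            (x, y) -> new double[]{x[0] + y[0], x[1] + y[1]});

            // For this fully observed conjugate model, a unit-length natural-gradient
            // step lands exactly on the conjugate posterior.
            double posteriorPrecision = 1.0 / priorVar + stats[1] / likVar;
            double posteriorMean = (priorMean / priorVar + stats[0] / likVar) / posteriorPrecision;
            System.out.printf("q(mu) = Normal(%.4f, %.4f)%n", posteriorMean, 1.0 / posteriorPrecision);
        }
    }
    ```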

  3. Bayesian Models of Data Streams with Hierarchical Power Priors

    Making inferences from data streams is a pervasive problem in many modern data analysis applications, but it requires addressing the problems of continuously updating the model and adapting to changes or drift in the underlying data-generating distribution. In this paper, we approach these problems from a Bayesian perspective covering general conjugate exponential models. Our proposal makes use of non-conjugate hierarchical priors to explicitly model temporal changes of the model parameters. We also derive a novel variational inference scheme which handles these non-conjugate priors while maintaining the computational efficiency of variational methods over conjugate models. The approach is validated on three real data sets using three latent variable models. The core recursion is sketched after the reference below.

    Masegosa, A. R., Nielsen, T. D., Langseth, H., Ramos-López, D., Salmerón, A., & Madsen, A. L. Bayesian Models of Data Streams with Hierarchical Power Priors. ICML 2017.
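
    Schematically, and in my simplified reading of the idea (the notation is mine, and the paper additionally places a hierarchical prior on the exponent so that the forgetting rate is learned from the stream), the power-prior update tempers the previous posterior before conditioning on the new batch y_t:

    ```latex
    \[
      p_t(\theta) \;\propto\; p(y_t \mid \theta)\,\bigl[p_{t-1}(\theta)\bigr]^{\rho_t},
      \qquad \rho_t \in (0, 1],
    \]
    % rho_t = 1 recovers standard Bayesian updating, while rho_t < 1
    % geometrically down-weights older batches (exponential forgetting).
    ```

    For standard conjugate priors, this tempering simply rescales the prior's natural parameters, which is what keeps the update tractable.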

  4. Probabilistic graphical models on multi-core CPUs using Java 8

    In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelization of a collection of algorithms that deal with inference and learning of PGMs from data, namely maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimization problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and of processing same-size data batches in parallel using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open-source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions. A small self-contained example of this style of parallelization is given after the reference below.

    Masegosa, A. R., Martinez, A. M., & Borchani, H. (2016). Probabilistic graphical models on multi-core CPUs using Java 8. IEEE Computational Intelligence Magazine, 11(2), 41-54.
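
    An illustrative example in the spirit of the paper, not code taken from the AMIDST toolbox: importance sampling is embarrassingly parallel, so a Java 8 parallel stream can spread the sample draws across the available cores. The toy query below estimates P(X > 2) for X ~ N(0, 1) using a N(2, 1) proposal, for which the importance weight simplifies to exp(2 − 2x); the exact answer is about 0.02275.

    ```java
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.stream.IntStream;

    /**
     * Illustrative sketch: an embarrassingly parallel importance sampler written
     * with Java 8 parallel streams. We estimate P(X > 2) for X ~ N(0,1) with a
     * N(2,1) proposal; the importance weight p(x)/q(x) simplifies to exp(2 - 2x).
     * Each sample is independent, so .parallel() distributes the work over cores.
     */
    public class ParallelImportanceSampling {

        public static void main(String[] args) {
            int numSamples = 10_000_000;

            double estimate = IntStream.range(0, numSamples).parallel()
                    .mapToDouble(i -> {
                        // Draw from the proposal q(x) = N(2, 1).
                        double x = 2.0 + ThreadLocalRandom.current().nextGaussian();
                        // Weighted indicator: 1{x > 2} * p(x) / q(x).
                        return (x > 2.0 ? Math.exp(2.0 - 2.0 * x) : 0.0);
                    })
                    .average()
                    .orElse(Double.NaN);

            // Exact value is 1 - Phi(2) ~= 0.02275.
            System.out.printf("P(X > 2) estimate: %.5f%n", estimate);
        }
    }
    ```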

  5. Stochastic Discriminative EM

    Stochastic discriminative EM (sdEM) is an online-EM-type algorithm for the discriminative training of probabilistic generative models belonging to the exponential family. In this work, we introduce and justify this algorithm as a stochastic natural gradient descent method, i.e., a method which accounts for the information geometry of the parameter space of the statistical model. We show how this learning algorithm can be used to train probabilistic generative models by minimizing different discriminative loss functions, such as the negative conditional log-likelihood and the hinge loss. The models trained by sdEM are always generative (i.e., they define a joint probability distribution) and can therefore deal with missing data and latent variables in a principled way, both during learning and when making predictions. The performance of the method is illustrated on several text classification problems for which a multinomial naive Bayes classifier and a latent Dirichlet allocation based classifier are learned using different discriminative loss functions. A toy sketch of the discriminative gradient behind this approach is given after the reference below.

    Masegosa, A. R. Stochastic Discriminative EM. UAI 2014.
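
    A toy sketch of the idea, not the paper's sdEM algorithm (which uses natural gradients over exponential-family sufficient statistics and keeps the updates inside the valid parameter space): train a generative classifier by plain stochastic gradient ascent on the conditional log-likelihood log p(y|x). The model below is a two-class, one-dimensional class-conditional Gaussian with unit variance and equal class priors; the model, data, and step size are all illustrative assumptions.

    ```java
    import java.util.Random;

    /**
     * Toy sketch: discriminative training of a generative classifier by stochastic
     * gradient ascent on log p(y|x). For a two-class, 1-D class-conditional
     * Gaussian with unit variance and equal class priors,
     *   d log p(y|x) / d mu_c = (1{c = y} - p(c|x)) * (x - mu_c).
     * The paper's sdEM instead takes natural gradient steps; this is plain SGD.
     */
    public class DiscriminativeGradientSketch {

        public static void main(String[] args) {
            Random rng = new Random(0);
            double[] mu = {0.0, 0.0};   // class-conditional means, initialised equal
            double stepSize = 0.05;

            for (int t = 0; t < 20_000; t++) {
                // Simulate a labelled observation: class 0 ~ N(-1,1), class 1 ~ N(+1,1).
                int y = rng.nextInt(2);
                double x = (y == 0 ? -1.0 : 1.0) + rng.nextGaussian();

                // Class posterior p(c|x) under the current generative model.
                double[] logJoint = {-0.5 * (x - mu[0]) * (x - mu[0]),
                                     -0.5 * (x - mu[1]) * (x - mu[1])};
                double max = Math.max(logJoint[0], logJoint[1]);
                double z = Math.exp(logJoint[0] - max) + Math.exp(logJoint[1] - max);

                // Stochastic gradient step on log p(y|x) for each mean.
                for (int c = 0; c < 2; c++) {
                    double posterior = Math.exp(logJoint[c] - max) / z;
                    double indicator = (c == y) ? 1.0 : 0.0;
                    mu[c] += stepSize * (indicator - posterior) * (x - mu[c]);
                }
            }
            System.out.printf("learned means: mu0 = %.3f, mu1 = %.3f%n", mu[0], mu[1]);
        }
    }
    ```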