Search Results
-
Comparison of Estimation Algorithms for Latent Dirichlet Allocation
Latent Dirichlet Allocation (LDA; Blei et al., J Mach Learn Res 3:993–1022, 2003) is a probabilistic topic model that has been used to detect the...
-
Penalized Latent Dirichlet Allocation Model in Single-Cell RNA Sequencing
Single-cell RNA sequencing (scRNA-seq) quantifies RNA transcripts at individual cell level, providing cellular-level resolution of gene expression...
-
Sample Size for Latent Dirichlet Allocation of Constructed-Response Items
Over the past decade, topic models have been used to analyze students’ responses to constructed-response items. Analyzing students’ responses using...
-
An Investigation of Prior Specification on Parameter Recovery for Latent Dirichlet Allocation of Constructed-Response Items
Latent Dirichlet Allocation (LDA) is a probabilistic model to analyze textual data. It was originally developed for corpora containing large amount...
-
An Empirical Study of Developing Automated Scoring Engine Using Supervised Latent Dirichlet Allocation
The use of constructed-response and performance-oriented items is becoming increasingly more common in educational measurement. These items may be in...
-
Variational Bayes estimation of hierarchical Dirichlet-multinomial mixtures for text clustering
In this paper, we formulate a hierarchical Bayesian version of the Mixture of Unigrams model for text clustering and approach its posterior inference...
-
Pseudo-document simulation for comparing LDA, GSDMM and GPM topic models on short and sparse text using Twitter data
Topic models are a useful and popular method to find latent topics of documents. However, the short and sparse texts in social media micro-blogs such...
-
Dynamic hierarchical Dirichlet processes topic model using the power prior approach
The hierarchical Dirichlet processes (HDP) topic model is a Bayesian nonparametric model that provides a flexible mixed-membership to documents...
-
Automatic Topic Title Assignment with Word Embedding
In this paper, we propose TAWE (title assignment with word embedding), a new method to automatically assign titles to topics inferred from sets of...
-
Natural language processing and financial markets: semi-supervised modelling of coronavirus and economic news
This paper investigates the reactions of US financial markets to press news from January 2019 to 1 May 2020. To this end, we deduce the content and...
-
Clustering and Latent Factor Models
Hierarchical models were previously discussed in Sect. 3.3. This chapter gives further details...
-
Exclusive Topic Model
Digital documents are generated, disseminated, and disclosed in books, research papers, newspapers, online feedback, and other content containing...
-
Mixtures of Dirichlet-Multinomial distributions for supervised and unsupervised classification of short text data
Topic detection in short textual data is a challenging task due to its representation as high-dimensional and extremely sparse document-term matrix....
-
Two-Step Approach to Topic Modeling to Incorporate Covariate and Outcome
This study investigates the applicability of topic modeling to analyze educational data. Topic modeling is useful because it reveals the latent topic...
-
Lasso-based variable selection methods in text regression: the case of short texts
Communication through websites is often characterised by short texts, made of few words, such as image captions or tweets. This paper explores the...
-
Deep mixtures of unigrams for uncovering topics in textual data
Mixtures of unigrams are one of the simplest and most efficient tools for clustering textual data, as they assume that documents related to the same...
-
Identification of Key Concerns and Sentiments Towards Data Quality and Data Strategy Challenges Using Sentiment Analysis and Topic Modeling
In the era of Fourth Industrial Revolution, data and information became a valuable resource. In this data-driven economy, it is extremely important...
-
Bayesian estimation of the latent dimension and communities in stochastic blockmodels
Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower dimensional...
-
Mixture polarization in inter-rater agreement analysis: a Bayesian nonparametric index
In several observational contexts where different raters evaluate a set of items, it is common to assume that all raters draw their scores from the...
-
Divorce in Italy: A Textual Analysis of Cassation Judgment
Cataldo, Rosanna; Grassia, Maria Gabriella; Marino, Marina; Mazza, Rocco; Pastena, Vincenzo; Zavarrone, Emma
The dissolution of marriage is a complex...