Comments (7)
@timm Prof., any comments? I require at least 5 matching terms before two topics count as overlapping.
from pits_lda.
Are these results stable? I.e., do different runs generate different topics?
And is there any discussion in the literature about LDA topic instability?
- These graphs show stability. With a minimum of x terms matched, they show the percentage of runs in which the same topic is generated. Some topics (~20%) are generated in all the runs.
- But there has been next to no overlap between T1 and T2.
- Will get back to you on the literature review. There have been some studies.
Project A - T1
Project A - T2
I can't get an executive summary out of these.
Only 20% of these topics are stable across multiple runs?
If you run N times and collect the topics from all N runs, are there repeated patterns?
Stable means that a topic has occurred more than 5 times in 10 runs. This answers your 3rd question as well.
And yes, only 20% of topics are stable. But here I am only finding the top 10 topics.
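The stability rule described in this thread (a topic counts as matched when at least 5 terms overlap, and as stable when it shows up in more than 5 of 10 runs) could be sketched roughly as below. This is only an illustration, not the code used for the graphs: the function names are hypothetical, each topic is assumed to be a set of its top terms, and taking the first run as the reference set of topics is an assumption.

```python
from typing import List, Set

Topic = Set[str]        # a topic is represented by the set of its top terms
Run = List[Topic]       # one LDA run yields a list of topics


def topics_match(a: Topic, b: Topic, min_terms: int = 5) -> bool:
    """Two topics overlap if they share at least `min_terms` terms."""
    return len(a & b) >= min_terms


def stability_counts(runs: List[Run], min_terms: int = 5) -> List[int]:
    """For each topic of the first run (assumed reference), count the runs
    (including the first) that contain at least one matching topic."""
    reference = runs[0]
    counts = []
    for topic in reference:
        hits = sum(
            any(topics_match(topic, other, min_terms) for other in run)
            for run in runs
        )
        counts.append(hits)
    return counts


def stable_fraction(runs: List[Run], min_terms: int = 5, more_than: int = 5) -> float:
    """Fraction of reference topics seen in more than `more_than` runs."""
    counts = stability_counts(runs, min_terms)
    return sum(c > more_than for c in counts) / len(counts)
```

With 10 runs, `stable_fraction(runs)` would correspond to the "20% of topics are stable" style of number, under the assumptions stated above.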
Hmmm... looks like it's time to check if anyone else has found topics to be unstable.
Is this paper useful to you?
How to Effectively Use Topic Models for Software Engineering Tasks? An Approach Based on Genetic Algorithms
@inproceedings{panichella2013effectively,
title={How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms},
author={Panichella, Annibale and Dit, Bogdan and Oliveto, Rocco and Di Penta, Massimiliano and Poshyvanyk, Denys and De Lucia, Andrea},
booktitle={Proceedings of the 2013 International Conference on Software Engineering},
pages={522--531},
year={2013},
organization={IEEE Press}
}
It uses a "grid-search" idea to tune 4 hyper-parameters of LDA, each divided into 10 bins, to investigate "What is the impact of the configuration parameters on LDA’s performance in the context of software engineering tasks?"
This research question aims at justifying the need for an automatic approach that calibrates LDA’s settings when LDA is applied to support SE tasks. They analyzed a large number of LDA configurations for three software engineering tasks. The presence of a high variability in LDA’s performances indicates that, without a proper calibration, such a technique risks being severely under-utilized.
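To make the size of that search concrete: 4 hyper-parameters with 10 bins each gives 10^4 = 10,000 configurations. The sketch below enumerates such a grid and picks the best-scoring configuration. It is not the paper's code; the four parameters are assumed here to be the number of topics, the number of iterations, alpha, and beta, the value ranges are made up, and `score` is a hypothetical callback that would train LDA and evaluate it on the SE task.

```python
from itertools import product
from typing import Callable, Iterator, List, Tuple

Config = Tuple[int, int, float, float]  # (k, iterations, alpha, beta)


def linspace(lo: float, hi: float, bins: int) -> List[float]:
    """`bins` evenly spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (bins - 1)
    return [lo + i * step for i in range(bins)]


def grid(bins: int = 10) -> Iterator[Config]:
    """Enumerate every combination of the binned hyper-parameters.
    All ranges below are illustrative assumptions."""
    ks = [int(round(v)) for v in linspace(10, 100, bins)]       # number of topics
    iters = [int(round(v)) for v in linspace(100, 1000, bins)]  # sampling iterations
    alphas = linspace(0.1, 1.0, bins)                           # document-topic prior
    betas = linspace(0.1, 1.0, bins)                            # topic-word prior
    return product(ks, iters, alphas, betas)


def grid_search(score: Callable[[Config], float], bins: int = 10) -> Config:
    """Return the configuration with the highest score."""
    return max(grid(bins), key=score)
```

In practice each call to `score` means a full LDA run, so an exhaustive grid like this is expensive, which is presumably part of why the paper turns to a genetic algorithm instead.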
Do you know how to find who has cited a paper?
Step1: look for it in Google Scholar
Step3: click on the "cited by 73" link:
https://scholar.google.com/scholar?cites=9122112158639969994&as_sdt=5,34&sciodt=0,34&hl=en
Enjoy!