
T1 - T2 (pits_lda issue, closed)

amritbhanu commented on July 20, 2024
T1 - T2

from pits_lda.

Comments (7)

amritbhanu commented on July 20, 2024

@timm Prof., any comments? I require at least 5 matching terms before two topics count as overlapping.
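A minimal sketch of the overlap rule mentioned above, assuming each topic is represented as a list of its top terms. The function names are hypothetical and are not taken from pits_lda.

```python
def topics_overlap(topic_a, topic_b, min_shared=5):
    """Return True if the two topics share at least `min_shared` terms."""
    return len(set(topic_a) & set(topic_b)) >= min_shared


def count_overlaps(topics_t1, topics_t2, min_shared=5):
    """Count the T1 topics that overlap with at least one T2 topic."""
    return sum(
        any(topics_overlap(a, b, min_shared) for b in topics_t2)
        for a in topics_t1
    )
```

Here `topics_t1` and `topics_t2` would be the topic lists produced for T1 and T2; a count near zero corresponds to "next to no overlap".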


timm commented on July 20, 2024

are these results stable? i.e. do different runs generate different topics?

and is there any discussion in the literature about lda topic instability?


amritbhanu commented on July 20, 2024
  • These graphs show stability: with a minimum of x terms matched, they give the percentage of runs in which the same topic is generated. Some topics (~20%) are generated in all the runs.
  • But there has been next to no overlap between T1 and T2.
  • I will get back to you on the literature review. There have been some studies.

Project A - T1

file

Project A - T2

file


timm commented on July 20, 2024

can't get an executive summary from these.

only 20% of these topics are stable across multiple runs?

if you run N times and collect the topics from all N runs, are there repeated patterns?


amritbhanu commented on July 20, 2024

Stable means that a topic has occurred more than 5 times in 10 runs. This answers your 3rd question as well.
And yes, only 20% of topics are stable. But here I am only finding the top 10 topics.
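A rough sketch of that stability check, under two assumptions that are mine rather than confirmed by the thread: two topics count as "the same" when they share at least 5 terms, and a topic's own run counts toward its total. Names are hypothetical.

```python
def shares_terms(topic_a, topic_b, min_shared=5):
    """Treat two topics as the same if they share at least `min_shared` terms."""
    return len(set(topic_a) & set(topic_b)) >= min_shared


def stable_topics(runs, min_shared=5, min_runs=6):
    """Return topics from the first run that recur in at least `min_runs` runs.

    `runs` is a list of runs; each run is a list of topics; each topic is a
    list of terms. With 10 runs, min_runs=6 means "more than 5 times".
    """
    stable = []
    for topic in runs[0]:
        # Count the runs (including the first) that contain a matching topic.
        hits = sum(
            any(shares_terms(topic, other, min_shared) for other in run)
            for run in runs
        )
        if hits >= min_runs:
            stable.append(topic)
    return stable
```

This only considers topics seen in the first run, so a topic that is absent there but frequent elsewhere would be missed; a fuller version would cluster topics across all runs.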


timm commented on July 20, 2024

Hmmm.... looks like it's time to check if anyone else has found topics to be unstable

is this paper useful to you?

How to Effectively Use Topic Models for Software Engineering Tasks? An Approach Based on Genetic Algorithms ==> paper

@inproceedings{panichella2013effectively,
title={How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms},
author={Panichella, Annibale and Dit, Bogdan and Oliveto, Rocco and Di Penta, Massimiliano and Poshyvanyk, Denys and De Lucia, Andrea},
booktitle={Proceedings of the 2013 International Conference on Software Engineering},
pages={522--531},
year={2013},
organization={IEEE Press}
}

They use a "grid-search" idea to tune 4 hyper-parameters of LDA, each divided into 10 bins, to investigate "What is the impact of the configuration parameters on LDA’s performance in the context of software engineering tasks?"

This research question aims at justifying the need for an automatic approach that calibrates LDA’s settings when LDA is applied to support SE tasks. They analyzed a large number of LDA configurations for three software engineering tasks. The presence of a high variability in LDA’s performances indicates that, without a proper calibration, such a technique risks being severely under-utilized.
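To make the size of such a search concrete, here is a hedged sketch of a grid over 4 hyper-parameters with 10 bins each, which gives 10^4 = 10,000 configurations. The parameter names, ranges, and the `score` callback are illustrative assumptions, not values from the paper.

```python
from itertools import product


def bins(low, high, n=10):
    """Split [low, high] into n evenly spaced values (endpoints included)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


# Hypothetical hyper-parameters and ranges, 10 bins each.
grid = {
    "n_topics": [int(round(v)) for v in bins(10, 100)],
    "n_iterations": [int(round(v)) for v in bins(100, 1000)],
    "alpha": bins(0.1, 1.0),
    "beta": bins(0.1, 1.0),
}


def configurations(grid):
    """Yield every combination of bin values as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def best_configuration(grid, score):
    """Return the configuration with the highest score.

    `score` is a caller-supplied function that would train LDA with the
    given configuration and return a task-specific quality measure.
    """
    return max(configurations(grid), key=score)
```

Evaluating all 10,000 combinations means training LDA that many times.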


timm commented on July 20, 2024

do you know how to find who has cited a paper?

Step 1: look for it in Google Scholar

https://scholar.google.com/scholar?hl=en&q=How+to+effectively+use+topic+models+for+software+engineering+tasks%3F+an+approach+based+on+genetic+algorithms&btnG=&as_sdt=1%2C34&as_sdtp=

image

Step 2: click on the "cited by 73" link:

https://scholar.google.com/scholar?cites=9122112158639969994&as_sdt=5,34&sciodt=0,34&hl=en

enjoy!

