ai-se / pits_lda
IST journal 2017: Tuning LDA
Home Page: https://github.com/amritbhanu/LDADE-package
parameters
pits
on Wikipedia sets
To Show:
In process:
To Dos:
[bibtex](@inproceedings{lau2014machine,
title={Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality.},
author={Lau, Jey Han and Newman, David and Baldwin, Timothy},
booktitle={EACL},
pages={530--539},
year={2014}
})
General:
Measures:
Problems:
Research Question:
Terminologies:
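ACTUAL

| | T1 | T2 | T3 | ... |
|---|---|---|---|---|
| Doc1 | | | | |
| Doc2 | | | | |
| Doc3 | | | | |

PREDICTED - Selected from Dominant topic from doc topic distribution.

| | W1 | W2 | W3 | ... |
|---|---|---|---|---|
| Doc1 | | | | |
| Doc2 | | | | |
| Doc3 | | | | |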
**According to the literature, if a document is assigned to a single dominant
topic (hard assignment), the top words of that dominant topic should appear in the
actual document. If they do not:
- either the probability of the dominant topic is very low, and another topic
could be made dominant instead,
- or the top words were selected wrongly, and better word weights could recover
the same dominant topic.**
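The check described above can be sketched roughly as follows. This is only an illustration: the function names `dominant_topic` and `top_word_coverage` and the toy data are hypothetical and not taken from the repository code.

```python
def dominant_topic(doc_topic_row):
    """Return the index of the most probable topic (hard assignment)."""
    return max(range(len(doc_topic_row)), key=lambda t: doc_topic_row[t])


def top_word_coverage(top_words, doc_tokens):
    """Fraction of a topic's top words that occur in the document."""
    tokens = set(doc_tokens)
    return sum(1 for w in top_words if w in tokens) / len(top_words)


# Hypothetical toy data: one stemmed document and three topics.
doc_tokens = ["flight", "mode", "fault", "spacecraft", "safe"]
doc_topic_row = [0.15, 0.70, 0.15]  # assumed doc-topic distribution
topic_top_words = [
    ["code", "line", "file", "valu"],
    ["mode", "fault", "safe", "state"],
    ["test", "script", "verifi", "data"],
]

k = dominant_topic(doc_topic_row)
coverage = top_word_coverage(topic_top_words[k], doc_tokens)
print(k, coverage)
```

A low coverage for the dominant topic would point to one of the two failure cases listed above.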
Suppose we have x documents. For example, x=4 and k (number of topics)=3.
For x=4, we have [D1,D2,D3,D4]
Actual=[1,1,2,0]
Predicted=[1,0,2,0]
Three of the four documents match (D1, D3, D4), so the score is 3/4=0.75
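A minimal sketch of this score, assuming it is simply the fraction of documents whose predicted dominant topic equals the actual one. The name `agreement_score` is hypothetical.

```python
def agreement_score(actual, predicted):
    """Fraction of documents whose predicted topic equals the actual topic."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one document")
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# The four-document example from the text: [D1, D2, D3, D4].
actual = [1, 1, 2, 0]
predicted = [1, 0, 2, 0]
score = agreement_score(actual, predicted)
print(score)
```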
projecta control inertia design specif perform attitud tabl spacecraft note
interrupt uplink srup fsw verif error follow specif eeprom scr
tabl initi use fals address ppu obc function event dump
checksum calcul enabl progress process task idl text oper discuss
text fault memori error plenum initi second number pressur indic
mode flight issu execut sequenc current indic point set vml
switch messag case file mode type code flexelint function use
wait int variabl vml read dump write verif task verifi
miss oper set text paramet state check valid indic number
subaddress address telemetri packet word fsw data buffer request bootload
obc safe fault projecta power address flight mode state spacecraft
code data function line valu access variabl use messag record
srobc rate spacecraft flight memori prd alloc provid link point
non load int unsign bit eeprom obc comput control data
control mode point attitud error plenum sroac target main high
grand word tlm type packet count cmd byte header command
file defin line tlm data statu macro array ambi len
command softwar flight trace link srup task uplink time spacecraft
variabl initi messag line code entri valu use extern mode
test script verifi mode engcntrl link indic issu procedur data
K=22, a=0.847433736937, b = 0.763774618977
Run: 0
Topic 0: optim method solut factor advantag softwar guarante produc known tool
Topic 1: configur templat variabl time kernel linux patch schedul spreadsheet compil
Topic 2: workshop intern confer program messag member ics list summari review
Topic 3: softwar idf invers analysi engin objectori abstract star program autom
Topic 4: softwar analysi redocument objectori engin program tool autom experi abstract
Topic 5: panel law equat length softwar debat demet storyboard tell qualifi
Topic 6: softwar objectori analysi abstract engin tool autom design program use
Topic 7: objectori trait softwar lisp analysi tool engin program design experi
Topic 8: softwar analysi tool abstract engin design autom experi program use
Topic 9: test program use techniqu gener analysi execut approach case algorithm
Topic 10: code sourc detect clone open file similar base type techniqu
Topic 11: subclass umpl substitut superclass analysi abstract softwar umplif objectori autom
Topic 12: softwar analysi tool program abstract autom objectori engin design use
Topic 13: softwar analysi objectori autom engin abstract experi design use tool
Topic 14: ide eclips plug abstract plugin netbean softwar array framework chart
Topic 15: model languag specif formal use aspect design concern implement base
Topic 16: comput context network mobil resourc awar applic devic distribut platform
Topic 17: softwar analysi autom abstract engin tool objectori use evolut design
Topic 18: applic compon architectur web servic user transform integr engin framework
Topic 19: softwar use chang tool approach program design inform code sourc
Topic 20: softwar develop project engin process research studi qualiti bug use
Topic 21: objectori softwar design analysi abstract engin program autom tool realtim
Run: 1
Topic 0: code detect sourc clone refactor pattern tool chang base studi
Topic 1: mainten softwar matter correct massiv tell taxonomi experiment comprehens cost
Topic 2: mobil devic comment data net game load analyst tune network
Topic 3: busi inform compani technolog workflow corpor lead divis comprehens execut
Topic 4: robot race challeng softwar later intellig insight pipelin win artifici
Topic 5: softwar experi analysi autom visual framework use model tool design
Topic 6: scenario chart sequenc stereotyp msc reactiv messag impli visual lsc
Topic 7: objectori model metamodel ontolog framework evolut mda omg eventu softwar
Topic 8: slice static dynam rang wide case rel larg method propos
Topic 9: softwar develop chang studi use bug project sourc report result
Topic 10: softwar model develop design process architectur use tool requir support
Topic 11: softwar analysi experi autom model design visual realtim framework abstract
Topic 12: softwar remodular analysi visual autom music multidimension experi assess abstract
Topic 13: languag specif model formal transform semant program verif gener use
Topic 14: safeti certif proof critic complianc iso siemen certifi softwar ambigu
Topic 15: program test use techniqu analysi gener approach execut base present
Topic 16: secur protocol vulner access network control polici analysi schedul attack
Topic 17: peer componentbas node cach softwar volatil har analysi netbean framework
Topic 18: aspect point concern orient modular crosscut aop messag join aspectj
Topic 19: layout softwar visualis visual autom analysi experi framework design distribut
Topic 20: applic web data revers legaci extract engin queri databas tool
Topic 21: softwar engin research workshop commun comput discuss intern industri challeng
Run: 2
Topic 0: optim method solut deviat advantag guarante factor produc known pool
Topic 1: workshop research intern engin comput track tutori topic session confer
Topic 2: signatur alert match massiv defens notif worm softwar jone smoke
Topic 3: safeti critic certif complianc hardwar healthcar ambigu certifi nasa mission
Topic 4: inspect review commit kernel linux patch author driver port peer
Topic 5: softwar experi autom componentbas engin increment assess use largescal case
Topic 6: design object compon class orient pattern aspect servic featur method
Topic 7: test code use techniqu approach chang sourc detect bug result
Topic 8: configur word assert identifi scheme macro artefact preprocessor split expand
Topic 9: softwar autom componentbas experi engin tool increment use largescal environ
Topic 10: inform sourc extract open repositori data busi list visual retriev
Topic 11: vulner schedul optim array real time buffer overflow alloc trade
Topic 12: model languag specif use gener tool approach base requir implement
Topic 13: robot win grand softwar autom intellig componentbas race autonom home
Topic 14: softwar autom recoveri largescal componentbas experi engin tool case realtim
Topic 15: softwar assess experi autom tool componentbas use largescal engin visual
Topic 16: program analysi dynam static execut use slice algorithm graph techniqu
Topic 17: applic web user databas interact client interfac migrat data approach
Topic 18: softwar autom assess componentbas experi engin use environ largescal studi
Topic 19: softwar develop engin process architectur project use mainten studi product
Topic 20: smell spreadsheet end bad templat subject speed tabl cell formula
Topic 21: context awar inconsist conflict merg resolv ide resolut revis chang
Run: 3
Topic 0: model check databas constraint data logic queri tempor satisfi schema
Topic 1: revers grammar fact word pars extract reengin engin parser cobol
Topic 2: requir design method product softwar engin goal optim process support
Topic 3: platform mobil devic android app permiss micro portabl bytecod phone
Topic 4: objectori omnipres model softwar framework largescal tool visual use object
Topic 5: graph scenario concurr interact behavior sequenc specif event behaviour monitor
Topic 6: code chang sourc softwar studi evolut clone develop detect open
Topic 7: inform legaci busi migrat reengin process recov workflow technolog compani
Topic 8: architectur compon softwar view adapt decis configur distribut support environ
Topic 9: softwar model framework objectori use tool analysi aspectori panel abstract
Topic 10: visual metaphor music boundari analyt hill climb largescal softwar overlap
Topic 11: applic analysi web use secur flow access function user detect
Topic 12: formal properti specif verif verifi composit infer reason refin state
Topic 13: softwar develop engin process research project servic manag mainten paper
Topic 14: bug report perform time use predict measur defect develop data
Topic 15: class refactor programm maintain parallel smell conflict improv ide merg
Topic 16: test techniqu gener case execut use fault input suit autom
Topic 17: softwar matrix model objectori sla tool agreement framework aspectori largescal
Topic 18: program dynam static slice analysi depend algorithm comput condit comprehens
Topic 19: model use approach tool languag base paper implement present gener
Topic 20: templat confer member compil chair metaprogram calculu welcom committe debugg
Topic 21: remodular eye idf layout invers model hyper softwar movement framework
Run: 4
Topic 0: workshop intern review research track program messag confer list session
Topic 1: model use program languag specif gener design approach tool base
Topic 2: context parallel conflict inconsist awar comput merg resolv resolut middlewar
Topic 3: procedur method softwar cobol ownership node layout solut instruct encapsul
Topic 4: visualis debugg dimension breakpoint emul comprehens workbench softwar multidimension objectori
Topic 5: compon product featur applic line reus secur approach configur softwar
Topic 6: softwar realtim autom framework analysi objectori use tool experi architectur
Topic 7: macro artefact preprocessor stream hidden preprocess expand expans scheme actor
Topic 8: factori renov constructor softwar jstar autom objectori analysi experi model
Topic 9: kernel devic schedul linux driver buffer window overflow array interrupt
Topic 10: spreadsheet end smell decomposit dataflow hierarch formula stabil speed modeldriven
Topic 11: pair propag compat renam evolut micro prioriti late makefil programm
Topic 12: visual extract databas data inform sourc tool fact xml schema
Topic 13: program analysi dynam static slice algorithm precis comput flow execut
Topic 14: objectori softwar analysi use autom framework tool design experi model
Topic 15: objectori softwar autom analysi framework legaci use largescal tool evolut
Topic 16: web page browser string constant html php javascript ajax server
Topic 17: anti antipattern linguist scc certif pattern imped occurr greater softwar
Topic 18: softwar develop engin process architectur project tool use research mainten
Topic 19: test code use sourc techniqu chang approach softwar result studi
Topic 20: softwar analysi tool use autom objectori framework comprehens evolut experi
Topic 21: optim search insight yield near soft solut sbse softwar engin
Run: 5
Topic 0: smell word macro bad identifi cognit split expand taxonomi renam
Topic 1: architectur compon applic framework base approach web interfac user use
Topic 2: tool softwar mainten assess use studi approach experi evolut realtim
Topic 3: ownership restructur organiz domin spi measur surpris incom owner encapsul
Topic 4: factori constructor softwar mainten experi cell tool molecular inspector use
Topic 5: test techniqu case gener execut fault use input effect suit
Topic 6: period month churn seri forecast firefox foundat chrome softwar latenc
Topic 7: program analysi dynam static slice algorithm use graph depend flow
Topic 8: odc softwar mainten studi experi use tool recoveri support process
Topic 9: softwar engin research revers develop industri practic workshop experi discuss
Topic 10: languag design object model tool orient class aspect transform use
Topic 11: fuzzi imperfect toss softwar mainten instabl reverseengin altogeth sdk etp
Topic 12: calibr spc softwar process spreadsheet use mainten experi variat tool
Topic 13: develop process project servic manag applic requir softwar product technolog
Topic 14: port peer breakpoint item adjust remedi estim bia softwar inspect
Topic 15: pair evolut empir ecosystem growth law softwar contributor studi regular
Topic 16: reengin method open sourc list xml tool patch mail oss
Topic 17: inform lead corpor compani divis comprehens analyst iso vice consolid
Topic 18: tool softwar mainten realtim use objectori assess experi recoveri studi
Topic 19: code softwar use sourc chang develop approach studi result tool
Topic 20: anti antipattern remodular linguist cluster occurr softwar mainten scc finegrain
Topic 21: model specif use check formal properti state approach gener verif
Run: 6
Topic 0: softwar architectur develop process use compon approach mainten support paper
Topic 1: objectori softwar analysi framework autom tool largescal increment use abstract
Topic 2: applic secur web profil vulner synthesi server polici time real
Topic 3: analysi objectori softwar autom abstract framework aspectori componentbas realtim largescal
Topic 4: inform busi compani legaci workflow lead corpor technolog divis analyst
Topic 5: softwar analysi objectori autom framework use abstract tool experi approach
Topic 6: code sourc bug use chang detect studi develop softwar identifi
Topic 7: softwar engin develop research project servic web applic revers technolog
Topic 8: reconstruct decompil obfusc readabl ast reverseengin polymorph birthmark standalon bytecod
Topic 9: parallel concurr platform perform sequenti net hardwar distribut thread multi
Topic 10: program test use gener techniqu analysi execut approach present base
Topic 11: visualis softwar objectori dimension analysi shrimp tool comprehens autom visual
Topic 12: regular law equat softwar length demet autom smalltalk analysi friend
Topic 13: optim solut method factor search comparison advantag softwar produc known
Topic 14: malwar anti breakpoint defens analysi emul worm mitig card unpack
Topic 15: remot factori renov softwar notif constructor metalanguag sent usabl analysi
Topic 16: analysi softwar autom objectori tool framework experi use largescal program
Topic 17: softwar autom analysi objectori tool largescal framework experi use approach
Topic 18: machin binari translat spreadsheet virtual packag compil window instruct assembl
Topic 19: adapt dynam runtim run time self assur failur fault reconfigur
Topic 20: softwar analysi objectori autom use tool experi abstract program framework
Topic 21: model design languag specif use class object formal pattern transform
Run: 7
Topic 0: chang softwar evolut manag mainten evolv configur version support impact
Topic 1: develop bug sourc project report code softwar studi open use
Topic 2: program code use tool sourc approach analysi techniqu java pattern
Topic 3: optim solut method profil schedul guarante produc speed advantag factor
Topic 4: cognit anti antipattern occurr linguist taxonomi softwar scc visual imped
Topic 5: trait actor gpu scala autom analysi basset softwar tool use
Topic 6: alert signatur immut defens mitig jone autom analysi mutabl worm
Topic 7: intens inter incorpor anoth intric softwar novel stream green depend
Topic 8: applic web user secur interfac servic databas migrat client interact
Topic 9: softwar analysi visual autom experi use largescal design objectori reengin
Topic 10: pair agil review confer program member track accept panel regular
Topic 11: model specif use design gener approach tool languag base requir
Topic 12: featur aspect composit modular line product orient concern increment compos
Topic 13: visual softwar star analysi gxl tool experi shrimp autom exchang
Topic 14: malwar visualis behaviour obfusc harm growth certifi emul malici viru
Topic 15: dynam static slice analysi condit precis rang execut case larg
Topic 16: busi legaci inform compani process ibm corpor reengin cobol technolog
Topic 17: test softwar use techniqu studi case result approach qualiti effect
Topic 18: objectori redocument analysi softwar visual tool mainten experi design use
Topic 19: softwar autom analysi visual experi objectori reengin largescal recoveri use
Topic 20: analysi realtim softwar visual autom use data mainten tool experi
Topic 21: softwar engin develop architectur research compon distribut design process challeng
Run: 8
Topic 0: visualis softwar autom analysi visual abstract largescal realtim componentbas framework
Topic 1: binari secur licens attack complianc malwar permiss free protect enforc
Topic 2: privaci threat anonym regul mitig softwar dca analysi autom law
Topic 3: cot shelf softwar analysi autom componentbas stateflow notif framework stand
Topic 4: applic web distribut servic class environ user perform deploy develop
Topic 5: model specif use compon languag architectur tool base approach requir
Topic 6: bug report predict defect fix develop project use repositori file
Topic 7: microblog softwar dissemin twitter million realtim analysi autom visual observatori
Topic 8: optim solut method advantag factor produc guarante support known tool
Topic 9: analysi softwar realtim visual architectur componentbas autom experi objectori use
Topic 10: code sourc pattern design detect clone program refactor tool use
Topic 11: negoti softwar win analysi autom visual abstract architectur experi componentbas
Topic 12: edit script systemat scm interrupt ident induc umpl autom session
Topic 13: artefact actor softwar hidden scala messag shrimp mcc basset analysi
Topic 14: softwar develop chang use studi process mainten project paper evolut
Topic 15: softwar engin research revers commun workshop challeng discuss industri practic
Topic 16: test program use techniqu analysi gener execut algorithm approach case
Topic 17: orient aspect object concern program modular point modul mechan separ
Topic 18: schema format exchang fact extractor organis standard softwar testabl phase
Topic 19: legaci busi reengin inform compani migrat databas technolog process workflow
Topic 20: flaw objectori mathemat softwar analysi prey predat popul ssa use
Topic 21: softwar analysi use tool autom abstract visual reengin experi largescal
Run: 9
Topic 0: bug report code sourc api fix detect use predict approach
Topic 1: design architectur compon softwar requir pattern framework approach base model
Topic 2: agreement softwar sla analysi visual realtim tool experi largescal abstract
Topic 3: optim solut method advantag visual layout softwar produc guarante known
Topic 4: anchor adjust softwar matter cognit tool analysi visual wikipedia use
Topic 5: test techniqu gener case execut use fault approach input autom
Topic 6: factori renov constructor softwar scaffold proprietari experi analysi mode largescal
Topic 7: process servic legaci technolog busi reengin migrat comprehens inform environ
Topic 8: regular wrapper length law lexic sourcecod equat softwar extrapol intension
Topic 9: softwar analysi experi realtim tool abstract largescal visual distribut autom
Topic 10: engin softwar research revers commun workshop challeng comput discuss industri
Topic 11: realtim softwar tool analysi largescal abstract experi visual model use
Topic 12: model applic use languag specif tool gener approach web base
Topic 13: object class metric orient method measur use concept coupl code
Topic 14: softwar develop chang studi code use project sourc mainten process
Topic 15: platform mobil devic deploy applic network resourc driver hardwar android
Topic 16: softwar eve abstract interact tool autom analysi model experi largescal
Topic 17: softwar use analysi realtim abstract experi tool autom largescal support
Topic 18: program code analysi use java dynam refactor type static slice
Topic 19: spreadsheet end formula microblog bidirect templat modeldriven excel dataflow cell
Topic 20: classif classifi categori taxonomi csp capac item analyst notif orthogon
Topic 21: configur wide rang conflict static merg rel larg case applic
Runtime: --- 425.590034962 seconds ---
Score: 0.9
A low alpha value places more weight on having each document composed of only a few dominant topics, whereas a high value yields many more relatively dominant topics. Similarly, a low beta value places more weight on having each topic composed of only a few dominant words.
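The effect of alpha can be illustrated by sampling topic mixtures from a symmetric Dirichlet prior. The sketch below uses only the Python standard library and is not the LDA implementation used in this repository. The helper names, the choice of 5 topics, and the alpha values 0.1 and 10 are assumptions made for illustration.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one k-dimensional sample from a symmetric Dirichlet(alpha)."""
    # Normalised independent Gamma(alpha, 1) draws follow a Dirichlet.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_share(alpha, k=5, n=2000, seed=0):
    """Average share of the single largest topic over n sampled mixtures."""
    rng = random.Random(seed)
    return sum(max(sample_dirichlet(alpha, k, rng)) for _ in range(n)) / n


low = mean_largest_share(0.1)    # sparse mixtures: one topic dominates
high = mean_largest_share(10.0)  # flat mixtures: topics share the mass
print(low, high)
```

With a low alpha the largest topic takes most of the probability mass in a typical sample, while with a high alpha the mass is spread more evenly. The same reasoning applies to beta and the per-topic word distributions.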
30 topics:
TOPIC 0
capabl 0.0072857702548080085
detect 0.006889243638298703
thruster 0.006700064944549625
int 0.006607782123785966
verifi 0.0065153889947990404
monitor 0.00650296321145424
provid 0.006375887077561054
version 0.006342073856340687
mode 0.006232402158806654
illeg 0.006220333047750552
TOPIC 1
sequenc 0.033129759453011234
flight 0.03252374735588161
address 0.022959018219828313
configur 0.019362618270488675
execut 0.018645697987183907
spacecraft 0.018536089237481138
vml 0.016712577126068412
contain 0.014304561669005917
dump 0.012876027216572137
capabl 0.012353371621085145
TOPIC 2
text 0.014176183805144018
indic 0.01402861490766197
disabl 0.011578996197979974
initi 0.01114483984031251
power 0.010708437769477594
process 0.010701123372393252
issu 0.010398400658366539
number 0.008637492236074887
configur 0.007928624859391881
manag 0.007903798812742504
TOPIC 3
mode 0.07966584673217642
state 0.03718070354145717
initi 0.03572169251594875
address 0.031429716725763605
fault 0.030452571566128406
spacecraft 0.027279151643025204
submod 0.022740745115632146
event 0.021906758483994815
array 0.01824853606074547
safe 0.017724767764738542
TOPIC 4
paramet 0.0071188055404684805
sram 0.006827464128244202
mode 0.00673273831926996
srup 0.006591190141582033
trace 0.006586444155181463
lead 0.006553998061153735
verifi 0.00645559544261332
packet 0.00642680558984484
refer 0.0064026813981321005
statu 0.006329258144168653
TOPIC 5
step 0.07957132256961552
procedur 0.0645619658601564
valv 0.041675091536043866
execut 0.036320039850185164
engcntrl 0.03142032361285736
rvm 0.031151429519339312
ppu 0.028139784762838016
latch 0.026210595939019456
control 0.01665968261541891
safe 0.01489455854502568
TOPIC 6
detect 0.006901424936164037
illeg 0.006830546315072616
ssp 0.006780407223867901
calcul 0.00675266664234023
chang 0.006670584494538672
baselin 0.006639908707605641
size 0.006587158292765058
base 0.006470646527962334
byte 0.006417814822230279
attitud 0.006399922089239512
TOPIC 7
engcntrl 0.12864125725078046
rvm 0.08106913144631238
miss 0.06797058286501821
paramet 0.04712804740169639
oper 0.03959955807878258
question 0.03548013555542337
set 0.035106402441915874
lead 0.035063628473402754
baselin 0.0331610948442407
valid 0.025475517399741195
TOPIC 8
text 0.013554263569267398
issu 0.011318615133417836
monitor 0.010701013969762457
function 0.01063652335226498
data 0.010621648784653277
eeprom 0.008556587731938457
indic 0.008555600925061647
manag 0.008527574670884068
flight 0.0083612822637015
number 0.008008184417918564
TOPIC 9
refer 0.007059825821001276
lead 0.006984447280031191
scr 0.006838087440533121
receiv 0.006717873863744479
uplink 0.006674271056485936
process 0.0066346156141782455
iru 0.006563385386916043
index 0.006443017673025953
safe 0.006432388815486642
pressur 0.006411195054815109
TOPIC 10
fals 0.03580471202584637
initi 0.010931205194610211
file 0.008206711770178607
num 0.007587271599197137
word 0.007044442564413284
code 0.007024508831246105
verif 0.006556217191035221
fail 0.0064630813488873295
number 0.006456010424001303
baselin 0.006374013212012404
TOPIC 11
projecta 0.06746218996739925
point 0.04184767002882261
tabl 0.028350651612809887
specif 0.026752048078420344
control 0.02492652800690521
inertia 0.024851322524132656
design 0.019213967081341876
perform 0.01911143149108216
analysi 0.01904415564239937
note 0.018525507516305707
TOPIC 12
scr 0.007000582800495213
paramet 0.006666637905827172
oper 0.006660309291516552
process 0.006600259070505539
state 0.006473211101004202
procedur 0.0063467868940232505
latch 0.006293818254668043
task 0.00625789441880334
monitor 0.006250431817756035
fals 0.006204256309834193
TOPIC 13
text 0.05542830868272589
issu 0.03725262264123818
function 0.035717358930808504
use 0.03449499026069172
number 0.024030362307005032
configur 0.023222596144389325
file 0.02161114366150162
differ 0.02041661569850046
rate 0.019600705898817104
data 0.019428473501135756
TOPIC 14
plenum 0.06249745243079222
engcntrl 0.0409187547507851
rvm 0.036013477398814595
pressur 0.03230371555369164
paramet 0.02545298747765356
second 0.022874174938327285
check 0.02236197131682671
valu 0.022107342696258078
data 0.018644210188598787
miss 0.017962971753087812
TOPIC 15
integr 0.006770258735374216
paramet 0.0065478122693753025
state 0.006507774808691103
event 0.006457570853626896
step 0.006382456198675023
result 0.006294514557105005
sequenc 0.006211944228406801
contain 0.006182191108707061
pse 0.006174884134512087
file 0.006140997936215436
TOPIC 16
perform 0.006759823859313174
gener 0.006753369449746122
idl 0.006579616748933607
discuss 0.006444489487713062
verif 0.00644447573029011
illeg 0.006349126276971217
calcul 0.006300919187424522
respons 0.006298347915265842
fault 0.0062068278503890325
projecta 0.006161297742674911
TOPIC 17
float 0.06443236109201486
equal 0.047199710697273065
variabl 0.02934617624497164
constant 0.023256584077755587
accept 0.017902067085217875
point 0.014308891713200092
number 0.01066318416415156
sun 0.01047228634419052
line 0.010271237303141185
safe 0.010187038486312423
TOPIC 18
constant 0.0067272032380626
event 0.006509176463676518
accept 0.0064645037686356
invalid 0.006408657163372205
statu 0.006308655882355975
follow 0.006282119776234993
updat 0.006242530188278966
case 0.0062374334951456134
valid 0.006232355334315194
gener 0.006214309165593506
TOPIC 19
code 0.06089576666196665
document 0.023529369371275612
calcul 0.019601636551500184
line 0.017346617131557968
bit 0.01578553107263362
float 0.014946003104453131
equal 0.0116140446933427
valu 0.01101304871095636
variabl 0.010571777572977425
flight 0.010009773380919672
TOPIC 20
text 0.01319767734485295
engcntrl 0.012989456529975841
issu 0.011008447357302964
miss 0.010487501094100416
valv 0.010455669829729019
rvm 0.010301555088397404
state 0.010288119759840264
initi 0.010162147401351915
use 0.010023395745515468
manag 0.00805930979917511
TOPIC 21
file 0.03711533387181546
line 0.036662071751758965
prioriti 0.03252689221436745
ace 0.024601020266133027
defin 0.023781163143673706
counter 0.02375448623140276
buffer 0.023409858058972895
valu 0.02262365120226543
size 0.02100273544220952
error 0.018441500948256993
TOPIC 22
rvm 0.058557052788672445
engcntrl 0.05147203358485272
bootload 0.04588724664333789
checksum 0.045278735080949054
calcul 0.041654115486330225
memori 0.03298984546384696
text 0.03269440107902858
idl 0.026195712265557315
fsw 0.023523170638217756
address 0.021366050392282596
TOPIC 23
num 0.006609283590812713
subaddress 0.0066003824882191
fals 0.006592974808638571
switch 0.006572564017413236
rate 0.006493912398838559
fsw 0.006470888042171489
control 0.006409973661028382
sun 0.006394022472625757
set 0.006350634655874299
launch 0.0063247526201570085
TOPIC 24
suncrosscalib 0.00683729202567074
init 0.006589179818997157
address 0.006555629737091954
hop 0.006453259698936264
valu 0.006421198805018765
dump 0.006304229634976303
scrub 0.006231210980935616
checksum 0.006222658102646131
type 0.0062151285407071565
limit 0.0062148949738669084
TOPIC 25
access 0.007015605770948273
packet 0.006878738003382888
sub 0.00681656724573433
obc 0.006680487905656781
prioriti 0.006644301963427974
write 0.006550542311169962
wait 0.006507482714691407
correct 0.006478934758681904
initi 0.006461548505262755
hlp 0.0064226804419323874
TOPIC 26
transit 0.007274267735111653
use 0.0067174487488747
verif 0.0067061381279485445
ppu 0.00658514885657192
receiv 0.0065420117405110695
scrub 0.006522674447596902
execut 0.006514190194472578
defin 0.006414629726145246
main 0.006374296721232405
capabl 0.0063654238087001505
TOPIC 27
statu 0.03141061634696384
messag 0.02692935272088277
bound 0.02655592019775904
access 0.026544893112142195
flexelint 0.020997808517881016
line 0.020127632879032676
file 0.01948517834033966
attitud 0.016821070971116785
int 0.01674476456958524
valid 0.015860426415481667
TOPIC 28
request 0.006491484194327534
limit 0.006442496939425952
bootload 0.006403388811680738
fdc 0.006399974023558598
progress 0.00639587365713958
index 0.006275722391109265
control 0.006249616915266672
includ 0.006200056435239194
vcdu 0.006174630782103055
calcul 0.006159157887654544
TOPIC 29
ang 0.007032597978994753
pcontrol 0.006737330705273606
rate 0.00672458742169172
rvm 0.006693384392581234
address 0.006614323139184355
flight 0.00656254355888187
valv 0.006546939310696933
switch 0.00648290719226052
sub 0.006415598605858243
spacecraft 0.006386103243420132
[bibtex](@inproceedings{koltcov2014latent,
title={Latent dirichlet allocation: stability and applications to studies of user-generated content},
author={Koltcov, Sergei and Koltsova, Olessia and Nikolenko, Sergey},
booktitle={Proceedings of the 2014 ACM conference on Web science},
pages={161--165},
year={2014},
organization={ACM}
})
General:
Problem:
Old Solutions for stability:
Evaluation Metric:
Preprocessing step:
[bibtex](@PhDThesis{yang2015improving,
title={Improving the Usability of Topic Models},
author={Yang, Yi},
year={2015},
school={NORTHWESTERN UNIVERSITY}
})
Problems:
Motivation:
Terminologies:
Stability Measures:
General:
Datasets:
Analogy when topics are unstable:
Summary:
[bibtex](@incollection{greene2014many,
title={How many topics? stability analysis for topic models},
author={Greene, Derek and O’Callaghan, Derek and Cunningham, P{\'a}draig},
booktitle={Machine Learning and Knowledge Discovery in Databases},
pages={498--513},
year={2014},
publisher={Springer}
})
Idea:
[bibtex](@inproceedings{panichella2013effectively,
title={How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms},
author={Panichella, Annibale and Dit, Bogdan and Oliveto, Rocco and Di Penta, Massimiliano and Poshyvanyk, Denys and De Lucia, Andrea},
booktitle={Proceedings of the 2013 International Conference on Software Engineering},
pages={522--531},
year={2013},
organization={IEEE Press}
})
Approaches:
Parameters up for tuning:
Definitions:
Evaluation criteria:
Needs clarity: how to convert text into data points in order to do the cluster-goodness evaluation.
Actual LDA-GA
Assumptions:
[bibtex](@Article{o2015analysis,
title={An analysis of the coherence of descriptors in topic modeling},
author={O’Callaghan, Derek and Greene, Derek and Carthy, Joe and Cunningham, P{\'a}draig},
journal={Expert Systems with Applications},
volume={42},
number={13},
pages={5645--5657},
year={2015},
publisher={Elsevier}
})
General:
Measures:
We have baseline results for SVM without SMOTE and for SVM with SMOTE.