
wiki_econ_capability's People

Contributors

johnchuang, notconfusing, wazaahhh


wiki_econ_capability's Issues

explain snapshots earlier

    R1.9. It is not clear how the snapshots enter into the
    calibration process until Fig. 6 is introduced. It would be
    good to explain the whole process better on page 4.

ground-truth not exogenous?

The ground-truth comparisons aren’t exactly exogenous! They’re
based on degree and other network artifacts measured here.

Add acknowledgement

As you prepare the camera-ready version of the CSCW paper, can you please include the following acknowledgement:

"This research was supported in part by the National Science Foundation under award CCF-0424422 (TRUST)."

situate paper in CSCW discipline

Related work: R1 and R3 suggest literature that should be incorporated
in the paper. As mentioned in the first major point of this meta-review,
the related work should also show that the paper is situating itself in
the CSCW discipline. Given past work, some of the claims of the paper
need to be tempered or justified.

clarify mathematical notation

Both R1 and R2 find some of the mathematical notation confusing. This
should be fixed for clarity's sake. R1 has also identified some typos.

   R1.6. I found the notation in Eq. 1 confusing. $k_e$ should be
   a scalar, but taken literally, Eq. 1 would imply that $k_e$
   is simply $N_a$ times matrix $M$, causing a lot of confusion
   in Eq. 2 with the $\frac{1}{k_e}$ term. The fact that you
   implicitly denote a generic entry $M_{ea}$ as $M$ is buried
   in the text, so it doesn't help much. In general, be more
   clear when you implicitly simplify your notation.


    R1.7. There is a typo in the notation of page 5: $\omega_a$
    should be $\overline{\omega}_e$.

    R1.8. For the sake of clarity, please specify that the $\max$
    operator of Eq. 5 is over $\alpha$ and $\beta$.
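
For reference, a minimal sketch of what R1.6 is getting at, assuming Eq. 1 is
the usual zeroth-order method-of-reflections degree, $k_e = \sum_a M_{ea}$
(a scalar per editor, not $N_a$ times the matrix):

```python
import numpy as np

# Toy binary edit matrix M (editors x articles); M[e, a] = 1 if editor e
# edited article a. The paper's M may differ; this is only an illustration.
M = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 0, 1]])

k_e = M.sum(axis=1)   # editor degree: number of articles edited (one scalar per e)
k_a = M.sum(axis=0)   # article degree: number of editors who edited it
print(k_e, k_a)       # [2 2 1] [2 1 2]
```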

Minor Points of R2 and R3

R2 Minor Points

(is there anything to even do here?):

The new discussion of the bipartite network random walker model was very
clear. The method as discussed in the paper now is more aligned with the
bipartite HITS algorithm; this corresponds to the eigenvector centrality
of the two projections of the matrix (M^TM, MM^T). I'm not sure that I
understood the beta parameter before. The analogous PageRank algorithm
would be the version in which a jump from e to a would be allowed with
nonzero probability, even with M_ea = 0.
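
For anyone revisiting this: a minimal sketch of the bipartite-HITS view R2
describes, with editor and article scores as the leading eigenvectors of the
projections M M^T and M^T M. This is the comparison point R2 names, not our
parameterized walker:

```python
import numpy as np

def bipartite_hits(M, iters=100):
    """Power iteration on the bipartite projections: editor scores converge to
    the leading eigenvector of M M^T, article scores to that of M^T M."""
    e_score = np.ones(M.shape[0])
    a_score = np.ones(M.shape[1])
    for _ in range(iters):
        e_score = M @ a_score              # editors inherit article scores
        a_score = M.T @ e_score            # articles inherit editor scores
        e_score /= np.linalg.norm(e_score)
        a_score /= np.linalg.norm(a_score)
    return e_score, a_score

np.random.seed(0)
M = (np.random.rand(5, 8) < 0.4).astype(float)   # toy edit matrix
e_score, a_score = bipartite_hits(M)
```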

Figure 5 is interesting, and it's a nice tie-in to Figure 3. It seems
like this is partial identifiability for the parameters.

R3 Minor Points

  • Some minor grammatical errors throughout that would benefit from a
    thorough copyediting pass (missing spaces, "two-node").

relate back to socio-technical practices

Expand discussion of findings back to Wikipedia and its socio-technical
practices. Many of the most significant contributions come from the
optimal alpha and beta variables found. The paper only touches briefly on
what these optimal values for various Wikipedia categories mean. The
paper should expand significantly its discussion on the broader meaning
of the alpha and beta functions. For example, I enjoyed the (too brief)
discussion on the meaning of the beta values in the Sexual acts and
Military History of the US. A significantly expanded discussion of these
findings (and perhaps for other categories) is merited. There is also a
lot of speculation in the current findings, e.g., "presumably there are a
lot of unmediated editors fighting". Any expanded discussion should be
grounded in either empirical findings or literature.

R2.3. A related point to validation: there is discussion of a result on
multiple editors creating marginal dis-value. It’s not clear to me that
this result isn’t the product of the types of editors in those fields.
If they are editors who as a population edit fewer articles per person
than in other fields, then I believe this result would appear as an
artifact of the algorithm.

=> make a graph of beta as a function of ratio = articles/editor
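
A minimal sketch of that diagnostic plot. All numbers below are placeholders;
`beta_star`, `n_articles`, and `n_editors` stand in for whatever we load from
the calibration output per category:

```python
import matplotlib.pyplot as plt

# Placeholder inputs, NOT real calibration results.
categories = ["Sexual acts", "Military history of the US"]
beta_star  = [0.4, -0.2]
n_articles = [350, 2100]
n_editors  = [900, 5200]

ratio = [a / e for a, e in zip(n_articles, n_editors)]

plt.scatter(ratio, beta_star)
for c, x, y in zip(categories, ratio, beta_star):
    plt.annotate(c, (x, y), fontsize=8)
plt.xlabel("articles per editor")
plt.ylabel("optimal beta")
plt.title("Is beta an artifact of articles-per-editor?")
plt.show()
```

If beta tracks the ratio closely, that would support R2.3's artifact concern.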

parameters introduced without explanations

Parameters are sometimes introduced in the paper without proper
explanation for a broad audience. For example, the semantic meaning of
alpha and beta, beyond the fact that they control the transition
probabilities, should be explained early.

Final comb

really close read for spelling/grammar

aren't we just approximating ground-truth?

R1.12. The authors claim (page 6, Discussion) to have tested their
model, but what they did is only a calibration against a number
of ground-truth metrics. How would you validate this model?
In other words, unless the model was really bad,
I would expect anyway that an optimization process would
give you parameters that reproduce the ground truth well,
but does the model really capture quality and expertise, or
just the ground truth you gave it to approximate? The
discussion section could be a good place to talk about
this.

notation of M

I realize there is precedent for this, but the notation that M == M_ea
(i.e., M_ea indicates the matrix) and the use of M_ea to indicate the
variable for editor e and article a is unclear.

defend stability of rho

R1.11. The text on page 6 says that $\beta^*$ is stable, but
Figure 6 shows $\rho_a$ and $\rho_e$, not $\beta^*$ itself.
Could it be that $\beta$ changed much (have a look at the
contour lines in Fig. 5, for example) even if the resulting
correlation with the ground truth didn't? Without any
explanation the reader is left to wonder what is really
going on. This is important, since the robustness of the
calibrated parameters makes it possible to interpret them.

metrics too close to start with?

The manuscript describes "ground truth" of article quality and editor
expertise, but it's unclear what specific construct was used that is
independent of the editing behavior. Editor expertise is measured as labor
hours, but this seems intrinsically correlated with the number of
articles edited. Article quality was measured with a PCA combining markup
complexity, headings, article length, citations, and links. The
implementation of the adapted random walker method returns correlations
in excess of .6 with other measures of article and editor quality.
There's also a question of how much gain this walker method produces
over more naive baseline methods that don't account for the network
structure and influence.
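
For context when we respond: a minimal sketch of the PCA-based quality score
the reviewer describes, with the feature columns named after the manuscript's
list (the paper's actual preprocessing may differ):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder feature matrix: one row per article, columns = markup
# complexity, headings, article length, citations, links.
X = np.random.rand(100, 5)

X_std = StandardScaler().fit_transform(X)
quality_score = PCA(n_components=1).fit_transform(X_std).ravel()
```

A naive degree baseline (number of editors per article) correlated against
`quality_score` would give the comparison the reviewer asks for.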

independence of ground truth

  1. Fitting some of the parameters of the model seems dependent on having
    access to ground-truth data. An analysis of results using a combined or
    out-of-sample parameter setting seems like the most appropriate, missing
    test. (For example, train on 11 Wikipedia categories to pick alpha, beta,
    and then try on the 12th.) This would make explicit how to select
    parameters without ground-truth, or with only partial ground-truth.
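
A minimal sketch of that leave-one-category-out test. `fit_alpha_beta` and
`rank_correlation` are hypothetical placeholders for our calibration and
evaluation routines, not functions that exist in the repo:

```python
import numpy as np

def leave_one_out(categories, fit_alpha_beta, rank_correlation):
    """Fit (alpha, beta) on the other 11 categories, evaluate on the held-out one."""
    results = {}
    for held_out in categories:
        train = [c for c in categories if c != held_out]
        fits = np.array([fit_alpha_beta(c) for c in train])   # one (alpha, beta) per category
        alpha, beta = fits.mean(axis=0)                        # pooled training estimate
        results[held_out] = rank_correlation(held_out, alpha, beta)
    return results
```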

motivate state of the art evaluations

Per R3's comment: The paper needs to better motivate its
"state-of-the-art ground-truth evaluations" for editor expertise and
article quality. It should explain how (and whether we should expect)
article quality and expertise measures to be independent from each other.

forest the trees

R3 is concerned about the clarity of the paper and found him/herself "losing the forest for the trees" when trying to connect the "interpretations to bigger picture questions about what are social mechanisms for facilitating the production of high-quality user-generated content." I agree with this statement.

economics discussion unrelated

R2. Much of the previous work on the method of reflections is done on
country exports and the global economy, but some of the discussion on
this method and global economic analysis seems unrelated.

correlate degree

R2's suggestion for "a comparison of performance to degree, or a
correlation of the output of the method to degree" should be considered.

relation to degree correlation

R2.2. Centrality measures and the method of reflections.
The model presented is based on the previously established method of
reflections, and uses the PageRank-like variation on that method
suggested by Caldarelli et al. (2012). This means the inferred ranking is
a bipartite PageRank variation similar to a bipartite HITS algorithm,
which has been previously analyzed. Most importantly, the rankings output
by these types of algorithms will be strongly correlated with degree
(here, the number of articles an editor has edited, or the number of editors who have edited an article). Since
degree is one of the major components of the validation/ground-truth
rankings, a significant amount of agreement could be accomplished this
way.

        2. A comparison of performance to degree, or a correlation of the output
        of the method to degree, should be shown or acknowledged in the text.
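
A minimal sketch of the degree check R2 asks for, using Spearman rank
correlation; `editor_score` below is a stand-in for the walker's actual
output on the same matrix:

```python
import numpy as np
from scipy.stats import spearmanr

np.random.seed(0)
M = (np.random.rand(200, 500) < 0.05).astype(float)   # toy binary edit matrix
editor_degree = M.sum(axis=1)                          # articles edited per editor

# Replace this with the method's editor ranking; the noisy transform is only
# here so the snippet runs end to end.
editor_score = editor_degree + np.random.rand(200)

rho, p = spearmanr(editor_score, editor_degree)
print(f"Spearman rho(output, degree) = {rho:.2f} (p = {p:.1g})")
```

The same check applies on the article side against the number of editors per
article.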

provide intuition

Provide the audience with an intuitive way, with examples, to understand the
meaning behind the alpha and beta transition probabilities. The text on pg. 4
describes what the various ranges of alpha and beta mean, but the authors
should expand on this and possibly use illustrations/figures to show how
changing the values leads to different relations between article quality,
# of edits, and/or expertise.
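
One way to do this is a toy walk-through. The transition weights below
(M_ea * k_e^alpha from article to editor and M_ea * k_a^beta from editor to
article) are an assumption made for illustration only; they are not
necessarily the paper's exact parameterization:

```python
import numpy as np

def biased_walker(M, alpha, beta, iters=200):
    """Toy degree-biased bipartite walk. M must have no empty rows/columns.
    alpha > 0 pushes walker mass toward prolific editors; alpha < 0 away."""
    k_e, k_a = M.sum(axis=1), M.sum(axis=0)
    W_ae = (M * k_e[:, None] ** alpha).T            # article -> editor weights
    W_ea = M * k_a[None, :] ** beta                 # editor -> article weights
    W_ae = W_ae / W_ae.sum(axis=1, keepdims=True)   # row-normalize to probabilities
    W_ea = W_ea / W_ea.sum(axis=1, keepdims=True)
    a_score = np.ones(M.shape[1]) / M.shape[1]
    for _ in range(iters):
        e_score = a_score @ W_ae                    # occupancy over editors
        a_score = e_score @ W_ea                    # occupancy over articles
    return e_score, a_score

M = np.array([[1, 1, 1, 1],   # one prolific editor
              [1, 0, 0, 0],   # three narrow editors
              [0, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

for alpha in (-1.0, 0.0, 1.0):
    e_score, _ = biased_walker(M, alpha=alpha, beta=0.0)
    print(alpha, np.round(e_score, 3))   # watch the prolific editor's share shift
```

A figure built from a toy matrix like this could make the alpha/beta ranges
concrete for readers.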

binary values

The paper should address R1's question of why binary values were used
in M.

refute out of sample and validity necessity

Both R1 (point 12) and R2 (point 1) bring up a good point that alpha
and beta are optimized to rho_e and rho_a and then used to validate the
performance of the model (or at least the rate at which the algorithm
converges towards rho_e and rho_a). So in some sense, the higher
correlations are somewhat unsurprising because that is what we are trying
to maximize. The paper should explain this limitation; the authors may
also consider one of the further examinations suggested by R2 (see R2:
Major points: 1. Validation). For example, an out-of-sample test would be
helpful (and maybe doable in the time period).

       R1.12. The authors claim (page 6, Discussion) to have tested their
       model, but what they did is only a calibration against a number
       of ground-truth metrics. How would you validate this model?
       In other words, unless the model was really bad,
       I would expect anyway that an optimization process would
       give you parameters that reproduce the ground truth well,
       but does the model really capture quality and expertise, or
       just the ground truth you gave it to approximate? The
       discussion section could be a good place to talk about
       this.

       R2.1. The model has some parameters (alpha, beta) which are fit using the rank
       correlation metrics (rho_e and rho_a). However, these parameters are also
       used to validate and describe the overall performance of the model. At
       the exploratory stage, this seems excusable, but merits at least one of
       the following: (a) an exploration of which aspects of the ground truth
       metrics are un/correlated (see [2]); (b) another type of validation
       (prediction? other rankings?); (c) an out-of-sample test, using the
       parameters fit to either the previous time steps or other subject areas
       in the model; (d) a discussion of how the model’s ranking qualitatively
       differs from the ground-truth.
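
Toward option (a), a minimal sketch of checking how correlated the
ground-truth metrics are with one another; the column names are placeholders
for whatever metrics we actually use:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
# Placeholder ground-truth table: one row per article (or editor).
ground_truth = pd.DataFrame({
    "labor_hours":   np.random.rand(100),
    "pca_quality":   np.random.rand(100),
    "article_count": np.random.rand(100),
})

# Pairwise Spearman rank correlations among the ground-truth metrics.
print(ground_truth.corr(method="spearman"))
```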

explain broad impact

Explain the research's contributions/broad impact. The paper only cites
one CSCW article. I don't have a problem with this, but I think this is
indicative that the paper needs to better situate its contribution to the
CSCW discipline. What can we do with this new way of modeling article
quality and expertise? What are its implications towards Wikipedia and/or
other similar online collaborative editing systems? How do the optimal
alpha and beta parameters and the model inform socio-technical theories
of Wikipedia practices? What would be the next steps (beyond tweaking the
algorithm)?

        R1.1. Introduction: It is a bit surprising that the topic of
       information cascades is introduced referencing only two
       unpublished works (one of them under review). It would be
       great if the authors could also include other pointers to
       the literature of team organization and complexity.


        R1.2. The authors claim their work to be “the first attempt to
       quantify the value of collective contribution environments
       from the collaboration structure alone”, but I think that,
       at least in the domain of Wikipedia, the work by Luca de
       Alfaro and collaborators should be credited. A lot of work
       in the field of trust and reputation management tried also
       to look at the quality of contributions, as well as some works by
       Ulrik Brandes on collaboration networks.

       R1.3. The related work section could use a more thorough review
       of the literature. Some words should be spent on the
       connection between public goods and team production for
       example. In Microeconomics, for example, this is studied as
       a coordination problem. There is also a large literature in
       software engineering about software teams and quality.
       Another work that looked at the growth of Wikipedia pages
       and quality is the one by Huberman and Wilkinson.

       R1.4. I would tone down the claim that measuring quality is
       impossible. This is simply not true. Experts are able to
       evaluate non-code artifacts, and even opinions by
       non-experts can be aggregated into meaningful signals. OTOH,
       it seems a good idea to talk also about unpredictability of
       quality (cf. the work by Watts, Dodds and Salganik). Also,
       natural language can be evaluated with a lot of quantitative
       metrics (some of which you actually use as ground truth
       later in the paper), so please tone down also the language
       there.

       R1.5. The point about number and quality is well taken. To this
       end, I would suggest looking at the work by Scott Page on
       diversity and teams.

patterns in inferred rankings

I’d be interested in any systematic patterns in the inferred
rankings, and how they differ from the ground-truth rankings -- the
maximum achievable rank correlations are quite low on some categories.
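
One cheap way to surface such patterns: look at the items whose rank moves
most between the inferred and ground-truth orderings. `inferred` and `truth`
are placeholder score arrays over the same items:

```python
import numpy as np
from scipy.stats import rankdata

np.random.seed(0)
inferred = np.random.rand(50)    # placeholder: model scores
truth    = np.random.rand(50)    # placeholder: ground-truth scores

displacement = rankdata(inferred) - rankdata(truth)
biggest_movers = np.argsort(-np.abs(displacement))[:10]
print(biggest_movers, displacement[biggest_movers])
```

Inspecting the biggest movers per category should reveal any systematic
pattern, e.g., whether the model consistently promotes low-degree items.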

Reference Formatting

R3 says:

  • The formatting of the references can be tightened up to just read
    "Proc. of CHI'10" rather than "Proceedings of the SIGCHI..." and drop the
    corresponding redundant booktitle element in BibTeX.

Do you know how to do this easily?

Define Coordination

Metareviewer

The issue of coordination was raised by R3. I think this can be solved by
acknowledging other definitions and then defining clearly what the
authors meant by coordination (since that term is used throughout the
paper).

R3

However, the paper constructs some terms like "coordination"
problematically. The argument that the paper "[reverse engineers] a
measure of coordination" (p 2) appears to conceive of coordination as a
feature to be quantified rather than an on-going process that produces
other measurable outcomes. There's also an underlying confounding of
coordination and article quality, which may not be wrong but is
inaccurate: there may be articles that are extensively coordinated
(Israel-Palestine, global warming, abortion) that may not necessarily be
high quality. Given that the article is measuring article quality
throughout and is only looking at static structures rather than changes
over time, I would recommend the authors be very cautious about
generalizing to coordination processes.

split data section

R1.10. Section “Data” reports on both the data used in the
study and on the calibration procedure itself. I would split
it into two separate sections for the sake of clarity.

is dis-value real

The paper should address R2's concern that variability across types of
editors is a major factor in the creation of "marginal dis-value."
