MetaAnalysisResources

This overview provides a description of some meta-analytic resources for neuroimagers (mostly Neurosynth and BrainMap). Since I've been working more closely with Neurosynth, I've given a much more thorough description of this database here (its strengths, weaknesses, how to get it working, etc.). While the description of Neurosynth will only allude to analyses that you can perform in this meta-analytic framework, code to get you started is provided in GettingStartedWithNeurosynth.md. Some code is also provided for formatting data for analysis with BrainMap, and for converting between different spaces (referenced in ConversionsForBrainMap.md), which will come in handy when working with BrainMap.

Neurosynth

Neurosynth (NS) is a framework for performing large-scale meta-analysis by analyzing the frequency of words within published papers in conjunction with their activity patterns (as reported in tables). By text-mining published work, it is possible to perform meta-analyses with very large sample sizes (e.g., NS contains 15k+ studies). The default interface for NS is online, but it has a more powerful analysis package available for Python (though it is currently not being updated and is being folded into something larger).

Issues with large-scale meta-analysis

Large-scale meta-analysis typically trades precision and subtlety for sample size. The idea behind NS is to use word frequencies (for 3k different phrases that are commonly used in neuroimaging) as an index of psychological or cognitive constructs. This might work fine in principle, but in practice you'll notice some problems. For example, NS associates all coordinates in a study with all of its frequently used phrases. This is a problem because the main effect described by a study might be overrepresented by certain words in-text but proportionally underrepresented in the full list of foci that is reported. In other words, the study might be about some cognitive process that is tapped into by just one of the many analyses presented in the study. In fact, this kind of thing seems to occur rather frequently in fMRI studies. Maybe you set out to test a specific contrast, but you report others in the interest of more thoroughly exploring the data. You might even be inclined to do such a thing in order to demonstrate that only the specific contrast you're interested in is capable of activating some specific brain region that you have a theory about. In any case, the result of this NS treatment is a risk of inserting greater noise into your meta-analysis, a risk which grows with the number of effects explored within each study and with the degree of imbalance between the reported foci and the specific effect emphasized in the text.

Here is a more concrete example: if a study reports foci for cognitive processes A and B, but refers to process A infrequently within the text, the study will get attached only to process B in the NS database, yet the activity will represent both processes equally. Depending on how different processes A and B are, this could contribute a lot of noise to some meta-analyses. Foci reported for more complex contrasts can also present problems. A study interested in contrasting process A with process B might report peaks of activity for A>B and B>A; NS would have no way of distinguishing between the two.

There is also a certain degree of disorganization in this large-scale approach to meta-analysis. For instance, you'll encounter some terms which are uninteresting (e.g., BOLD, PET, structural connectivity, network, etc.) but on closer inspection can represent meaningful structure in the database (e.g., the phrase "PET" might produce a significant meta-analytic map simply because older neuroimaging studies relying on PET were interested in a different set of cognitive processes). Other times, terms that might be interesting will be invariably associated with studies that are uninteresting, or represent something else, requiring extra work to uncover and analyze them. A good example of this is studies that involve speech repetition. There is no phrase "speech repetition" in NS, so if you're interested in performing a meta-analysis for this phrase, you might be inclined to look at the meta-analysis for "repetition" instead. If you did this, you would end up with a meta-analytic map that makes little sense. That's because, on closer inspection, roughly 70% of "repetition" studies refer to the method of repetition suppression. While it's easy enough to instead do a meta-analysis over studies that frequently use both the phrases "speech" and "repetition", this is not possible using the default (online) NS interface.
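
To give you a sense of what that looks like, here's a minimal sketch using the (now legacy) Neurosynth python package. The file names are placeholders for the database and feature files you download separately (see installation tips below), and the threshold shown is just the default:

```python
# Minimal sketch of a "conjunction"-style meta-analysis with the legacy
# Neurosynth python package. 'database.txt' and 'features.txt' are
# placeholders for the data files you download separately.
from neurosynth.base.dataset import Dataset
from neurosynth.analysis import meta

dataset = Dataset('database.txt')     # builds the activation table (slow)
dataset.add_features('features.txt')  # attaches term frequencies to studies

# Studies that frequently use BOTH phrases: intersect the two study lists
ids_speech = dataset.get_studies(features=['speech'], frequency_threshold=0.001)
ids_rep = dataset.get_studies(features=['repetition'], frequency_threshold=0.001)
ids = list(set(ids_speech) & set(ids_rep))

# Meta-analysis comparing these studies against the rest of the database
ma = meta.MetaAnalysis(dataset, ids)
ma.save_results('.', 'speech_repetition')
```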

As a result, you'll probably want to download the NS python module, which will allow you to poke around in the dataset much more freely. However, here you might run into some issues. The python tools are no longer being supported because they are meant to be folded into a much larger meta-analysis software and database package (NiMARE). The issue is that this package has been and continues to be in development, and the NS side of it hasn't really come online yet. My understanding (last I checked) is that it's possible to use this package, but it remains in development and the documentation is still sparse. Luckily, you can still get the NS module to work as long as you're willing to mess around with different versions of its dependencies (I have a guide on what I've found to be successful below).

While we're at it, it's also worth mentioning some smaller issues that contribute to a lack of precision in NS. For one, the automated extraction of coordinates is not foolproof. You'll occasionally find a study where the foci simply don't add up. Other times, studies will not report the space in which their coordinates were computed. This presents a problem because NS tries to detect (again using text parsing) whether the coordinates are in Talairach space and, if so, converts them to MNI.

The point of Neurosynth is reverse-inference

For all the problems with NS, there are ways the framework tries to compensate for inaccuracies. The most obvious of these is the much larger sample size. But another is a slightly more stringent method for performing meta-analysis. Typically, a meta-analysis looks at a group of carefully selected studies and evaluates where consistent patterns of activity are reported. In NS, a meta-analysis looks at a group of less carefully selected studies, but evaluates where activity is reported more consistently in this particular group than in all of the other studies in the database. The critical difference here is that NS tries to account for base rates of activity in the brain. That is, some information about selectivity is being evaluated in the way the meta-analysis is performed. This is useful because lots of brain regions show relatively wide "functional tunings". For instance, the insula has been reported to activate under a wide range of conditions (one of the widest). That means that when you do a conventional meta-analysis for any cognitive process, you are likely to discover some consistent activity in the insula. Arguably, this activity is less interesting than other consistent activation patterns that are unique to the cognitive process you're interested in. By comparing your cognitive process of interest to other studies in the database, you are setting a higher bar for measuring brain-behavior relationships. This is also useful if for no other reason than to smooth out some of the inaccuracies and issues associated with text-mining neuroimaging articles (as discussed above).

Here's a figure from reference [2] that shows base rates of activation across studies in NS. Numbers represent the percentage of studies that report activity within an area.

[Figure: base rates of activation across NS studies, from reference 2]

Initially, NS referred to this approach for meta-analysis as reverse-inference because the fundamental goal was to reason backwards from patterns of activity to cognitive processes [1]. To understand why this is the case, consider that evaluating consistency of activity associated with a cognitive process (i.e., a group of studies) amounts to describing the probability (for each voxel) of finding activity in the brain given that it is engaged in that process. This is a case of forward inference, where we start with a cognitive construct (or frequently used phrase) and pin it onto the brain. However, if the goal of our analysis is to understand which brain regions are specifically involved in the cognitive process we are interested in over and above others, it now becomes necessary to reason backwards from activity to cognitive constructs and ask: what is the likelihood that a cognitive process is being engaged given that there is activity in the brain? Bringing this back to the context of word frequencies in studies, this translates to finding the probability that a phrase is frequently used given that activity is reported (in a voxel). If you are interested in reading up on the intricacies of reverse inference, there are a few studies worth looking into [1,2,3].
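
To make the forward/reverse distinction concrete, here is the Bayes' rule relationship being described (my notation; per voxel, with "act" standing for reported activation and "term" for frequent use of a phrase):

```latex
% Forward inference (what a conventional meta-analysis estimates):
%   P(\mathrm{act} \mid \mathrm{term})
% Reverse inference (what NS is after), obtained via Bayes' rule:
P(\mathrm{term} \mid \mathrm{act}) =
  \frac{P(\mathrm{act} \mid \mathrm{term})\, P(\mathrm{term})}
       {P(\mathrm{act} \mid \mathrm{term})\, P(\mathrm{term})
        + P(\mathrm{act} \mid \neg\mathrm{term})\, P(\neg\mathrm{term})}
```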

But Neurosynth measures association by default

There are some issues with the way reverse-inference is operationalized in NS that have led the framework to transition towards using the phrase "association" map to describe the meta-analysis it performs. To explain why that's the case, let me first outline how an association map is produced in NS. First, all studies in NS are separated into two groups: those that frequently use some phrase you've selected, and those that don't. Whether a study frequently uses a phrase is determined by a frequency threshold you can set. The website assumes a frequency of 1/1000 words based on a suggestion from the original NS paper, which generally found this to be an adequate threshold for controlling incidental word usage [1]. In my experience, setting a higher threshold is often necessary to create a more consistent/homogeneous group of studies. Using the python module will allow you to vary this threshold, which can also be useful for ensuring that the effects you find aren't contingent on looser frequency thresholds. The next step in generating an association map involves using these two sets of studies to construct contingency tables for each voxel that describe the presence or absence of activity in conjunction with the presence or absence of the phrase of interest. A chi-square test is then performed to test for statistical association and to produce z-scores (a schematic sketch of this test follows below). A high z-score in a region means activity there is more likely to be found in studies frequently using the phrase you are interested in compared to those that don't. It's worth pointing out that all meta-analytic results can depend on the radius of the sphere that is drawn around each of the analyzed foci. Typically, published meta-analyses will vary this parameter to ensure the effect they are reporting is robust. This is another parameter of the meta-analysis that is unavailable on the NS website, but available in the python module.
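
Here's a schematic illustration of that per-voxel test (not NS's actual implementation; the counts are invented for illustration):

```python
# Schematic per-voxel association test: a 2x2 contingency table of
# activation presence vs. phrase usage, tested with chi-square.
# Counts are invented for illustration.
import numpy as np
from scipy import stats

#                       voxel active   voxel inactive
table = np.array([[40,   160],     # studies that use the phrase
                  [300, 6000]])    # studies that don't

chi2, p, dof, expected = stats.chi2_contingency(table)
z = stats.norm.isf(p / 2.0)  # convert the two-tailed p-value to a z-score
print(chi2, p, z)
```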

The reason NS has transitioned to calling reverse-inference maps "association" maps is that the chi-square test, which produces z-scores, does not exactly provide evidence for selectivity (which is typically what we would like to address when relying on reverse-inference). Finding a significant result for some brain region using this test does not support the conclusion that this region is specific to, or selective for, the cognitive process that was meta-analyzed. This is because the analysis does not directly compare your cognitive process of interest to each of the others and instead collapses across all of them. You can use the python module to perform individual contrasts between meta-analyses, but these can be difficult to interpret because: i) you're going to be working with smaller sample sizes (and they are often inadequate [4,5]), ii) different phrases have drastically different sample sizes, and iii) it's hard to predict on theoretical grounds which contrasts will be relevant. On the last point: theoretically uninteresting or difficult-to-interpret contrasts may show activity in the region for which you are attempting to show selectivity. The rub is that showing evidence for selectivity requires generating comparisons for all cognitive processes (i.e., phrases), so this scenario seems difficult to avoid. An even bigger issue with this approach of using pairwise contrasts to show evidence for selectivity is that the logic is somewhat fragile: a region may still be important to some cognitive process even if the particular cognitive process you're interested in is more likely to activate it. Arguably it would be more convincing to return to the association maps and show that the cognitive process you're interested in activates the region you'd like to show specificity for, while all of the other cognitive processes don't. Alas, that seems like an unlikely outcome in most cases.
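
For what it's worth, here's a sketch of such a pairwise contrast with the python module. I believe MetaAnalysis accepts a second list of study ids to contrast against (rather than the rest of the database), but treat that as an assumption and verify it against your version; the phrases are arbitrary examples:

```python
# Sketch of a direct contrast between two sets of studies (continuing from
# the dataset built earlier). 'pain' and 'touch' are arbitrary example
# phrases; the second id list is assumed to switch the comparison group
# from "everything else" to that second set of studies.
ids_pain = dataset.get_studies(features=['pain'], frequency_threshold=0.001)
ids_touch = dataset.get_studies(features=['touch'], frequency_threshold=0.001)

contrast = meta.MetaAnalysis(dataset, ids_pain, ids_touch)
contrast.save_results('.', 'pain_vs_touch')
```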

In short, the z-scores generated by the default NS analysis outlined above only represent something akin to confidence in a positive association between the cognitive process of interest and a region of the brain. If you want to generate maps closer to true reverse-inference maps, you still can, using the python package. However, these are unlikely to provide clearer evidence for selectivity. Using the NS module you'll be able to generate probabilities of a study using some phrase when activation is observed at each voxel in the brain (using Bayesian statistics; for details see: [2,3]). However, even for the most obvious brain-behavior relationships, you'll find that posterior probabilities are generally very low, indicating little confidence in being able to conclude that a participant is engaged in some cognitive process given that you've observed activity in a particular region [4,5]. Ultimately, a number of other phrases in NS will predict activation in the same area nearly as well, which complicates any interpretation of selectivity from such posterior probability maps [4,5]. The reason these posterior probabilities tend to be so low is that they are intricately linked to NS term frequencies. These frequencies serve as empirical priors, and as such, terms that dominate word frequencies in NS will almost always dominate posterior probabilities. This means that largely uninteresting phrases like "vision", "audition", and "memory" will always come out on top, just by virtue of the fact that these kinds of phrases act as topics that each study can fall into. You might be inclined to ditch the empirical prior (i.e., the frequency with which a word appears in NS) for an atheoretical uniform one, especially because the prior is meant to reflect the frequency with which brains engage in the cognitive process you're interested in, period; word frequencies in NS are more likely to reflect which processes research communities are interested in. This is probably the better way to go [4], but it still makes the strong assumption that the prior likelihood of a brain engaging in some cognitive process is 50%.
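
Here's a toy calculation (all numbers invented) showing how strongly the prior drives these posteriors:

```python
# Toy illustration (all numbers invented) of how the prior dominates
# the posterior probability P(term | activation).
def posterior(p_act_given_term, p_act_given_rest, prior):
    """Bayes' rule for P(term | activation) at a single voxel."""
    num = p_act_given_term * prior
    return num / (num + p_act_given_rest * (1.0 - prior))

# Identical likelihoods, very different conclusions:
print(posterior(0.30, 0.10, prior=0.02))  # empirical-style prior -> ~0.06
print(posterior(0.30, 0.10, prior=0.50))  # uniform prior         -> 0.75
```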

Here's a comparison of forward and reverse inference maps from reference [1]. Association (formerly reverse-inference) maps are considerably sparser, helping to smooth out some of the aforementioned issues with large-scale meta-analysis by setting a higher bar for significance.

[Figure: reverse vs. forward inference maps, from reference 1]

Installation tips

The Neurosynth python module can be found here.

You can also install it automatically using pip. Note that for some reason the data provided at that link is grossly out of date, so you'll want to grab the latest release here.

It's generally tough to determine which data release you're working with, so be careful. Once you grab the latest data and features text files, either overwrite the ones in the NS module directory or just make sure you are linking to the more recent data when you construct your dataset. In my experience, the newer data doesn't drastically change the outcome of any meta-analysis, but it does tend to produce somewhat smoother results (I imagine because something like 3,000 studies have been added since the penultimate release in 2014).
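
In practice, that just means pointing the Dataset constructor at the newer files. A sketch (paths are placeholders; I believe r is the sphere radius in mm around each focus, defaulting to 6, but treat that as an assumption):

```python
# Build the dataset from the newer data release rather than the files
# bundled with the module. Paths are placeholders; r is (I believe) the
# sphere radius in mm drawn around each focus.
from neurosynth.base.dataset import Dataset

dataset = Dataset('current_data/database.txt', r=6)
dataset.add_features('current_data/features.txt')
dataset.save('dataset.pkl')  # pickle it; the initial build is slow

# Later sessions can then reload it quickly:
# dataset = Dataset.load('dataset.pkl')
```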

In recent years, I have not been able to get NS running with the current requirements.txt file. I imagine this is because it's not being updated while NiMARE comes online. Here is what I've had to change to get it up and running:

  • Use python 2.7

Downgrade to the following (collected into a single pip command after the list):

  • kiwisolver 1.1.0
  • matplotlib 2.2.5
  • neurosynth 0.3.5
  • nibabel 2.5.2
  • numpy 1.16.6
  • pandas 0.24.2
  • scikit-learn 0.20.4
  • scipy 0.16.1
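
As a single pip command (assuming a python 2.7 environment with pip available):

```bash
pip install kiwisolver==1.1.0 matplotlib==2.2.5 neurosynth==0.3.5 \
    nibabel==2.5.2 numpy==1.16.6 pandas==0.24.2 \
    scikit-learn==0.20.4 scipy==0.16.1
```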

Additional notes

  • "Decoding" is often used in the context of NS. This can mean a couple of different things all of which aim at telling you which phrases are associated with some pattern of activity that you already have. The most popular approach for doing this is to correlate your brain map with the meta-analytic map for each phrase in NS. By default association maps are used but you can imagine doing the same thing with posterior probability maps. I think this approach is okay for whole-brain maps, but it makes less sense with smaller ROIs. It also works best with unthresholded maps. If you use the NS based decoding function of Neurovault they will warn you about this. A simpler decoding approach which is the default one on the NS website (when it is available; you might not see it) is just to take some peak coordinate and extract it's association or posterior probability for each phrase in NS.

  • By the way, Neurovault is part of a larger open science initiative to share brain maps from studies. The end goal is that it will help facilitate better meta-analyses that can capture effect shapes and other subtleties of fMRI results that we currently have no access to when relying on activation peaks. In fact, they've recently released some functionality to search maps from published studies and do exactly this, so you should check it out.

  • Resting-state maps shown on the NS website are from the Human Connectome Project (the 1000-subjects release). I believe this is much older data that has not undergone the fixes to preprocessing, analysis, etc. that HCP has released since, so I would treat it as an approximation. The NS website also presents co-activation maps for comparison. These maps represent the likelihood that other regions of the brain activate alongside some preselected seed region across all studies in the NS database. Co-activation maps generally tend to be highly similar to functional connectivity maps.

  • If you ever get lost in Neurosynth terms, a helpful resource is the Cognitive Atlas. This is an initiative [6] for crowdsourcing a cognitive ontology. The best way to describe it is as a kind of Wikipedia for cognitive science that organizes cognitive/psychological concepts into categories, showing their relationships to each other and giving examples of tasks that index them. For example, the category attention includes concepts like selective and sustained attention. Selective attention can be measured using visual pursuit, and sustained attention can be measured using the Leiter International Performance Scale.

  • There have been some efforts to organize terms in NS into topics using author-topic models [10]. The results are available to download on the NS website. These models identify frequently co-occurring words, and the resulting topics then guide the grouping of studies for the standard association meta-analysis in the NS framework. More sophisticated approaches that consider both brain activity and word co-occurrences when defining topics have been applied to NS more recently [11].
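
Here's the decoding sketch referenced in the first note, using the legacy package's decode module ('my_map.nii.gz' is a placeholder for your own unthresholded map, and dataset is the object built in the installation tips above):

```python
# Sketch of map-based decoding: correlate your (unthresholded) map with
# every term's meta-analytic map. 'my_map.nii.gz' is a placeholder.
from neurosynth.analysis import decode

decoder = decode.Decoder(dataset)  # builds maps for all features (slow)
results = decoder.decode(['my_map.nii.gz'], save='decoding_results.txt')
```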

References

[1] Paper proposing NS.

[2] Further info on reverse inference and why it's not so bad. One of the authors of [1] has previously written about the fragile assumptions involved in reverse-inference as it is typically employed by individual studies; one purpose of NS is to provide a more rigorous framework for doing reverse-inference, and that is partly the motivation here.

[3] Another good paper to read if you're curious about reverse-inference.

[4] Great resource for correctly interpreting NS results.

[5] Great resource for correctly interpreting NS results.

[6] Building a cognitive atlas (and how neurosynth might help).

[7] Paper making some wild (incorrect) claims using NS.

[8] Response to [7].

[9] Response to [8].

[10] Modeling topics using word frequencies in studies.

[11] Modeling topics using word frequencies and activation tables in studies.

References [4] and [5] are from Tal Yarkoni's blog, so they are the best info you can refer to for interpreting NS results. These two blog posts criticize the interpretation of a paper that used NS association maps to claim selectivity for pain in the dorsal ACC [7]. Some of what seem like more minor criticisms were formalized in a response letter in PNAS [8]. The original authors of the study wrote a reply [9]. This is when the default NS maps became "association" maps rather than "reverse-inference" maps.

BrainMap

If you're concerned about the issues associated with large-scale meta-analytic databases, there are more conventional ones that you can turn to. One of these is BrainMap, which carefully organizes studies into a taxonomy that you can use to perform highly targeted meta-analyses. For instance, you can pull out studies based on the kind of stimulus used, the content of the stimulus, the modality in which the stimulus was presented, the kind of instructions that were provided, the type of response that was required, the type of contrast that was reported, etc. You can even do higher-level searches based on behavioral domains and subtypes. Results can be filtered by combinations of queries, and you can even filter by subject demographics. Of course, the trade-off here is that you are working with far fewer studies. Like Neurosynth, occasionally you'll find something miscategorized (though less often), since BrainMap relies on volunteers and RAs to help code studies.

Installation tips

BrainMap offers several software packages that you can download here. Sleuth is used to search the database for coordinates. These coordinates can then be saved into a text file and analyzed with GingerALE. Some scripts in this folder can help you format coordinates for meta-analysis, as well as convert between Talairach and MNI spaces (a sketch of one common conversion follows below).
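
If you just need to convert a handful of coordinates, here's a rough sketch of the classic Brett mni2tal piecewise-affine approximation. This is one common approximation (Lancaster's icbm2tal is generally considered more accurate), and not necessarily what the scripts in this folder implement:

```python
# Rough sketch: MNI -> Talairach via the Brett (mni2tal) approximation.
# Different affines apply above vs. below the AC plane (z >= 0 vs. z < 0).
import numpy as np

ABOVE = np.array([[0.9900,  0.0000, 0.0000],
                  [0.0000,  0.9688, 0.0460],
                  [0.0000, -0.0485, 0.9189]])
BELOW = np.array([[0.9900,  0.0000, 0.0000],
                  [0.0000,  0.9688, 0.0420],
                  [0.0000, -0.0485, 0.8390]])

def mni2tal(xyz):
    xyz = np.asarray(xyz, dtype=float)
    M = ABOVE if xyz[2] >= 0 else BELOW
    return M.dot(xyz)

print(mni2tal((10, -20, 30)))  # -> [ 9.9, -17.996, 28.537]
```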

One last thing to note: BrainMap servers can go down relatively often for long periods of time, so just be aware of that. You'll need access to those servers in order to search studies in Sleuth.

Some final notes

The activation likelihood estimation (ALE) that GingerALE performs amounts to analyzing the consistency of activity reported across a set of studies. This analysis involves drawing a Gaussian probability distribution around each focus (restricted to grey matter) and varying the width of the distribution depending on study sample size, based on empirical tests of inter-scan and inter-lab variability [2]. The union of these probability maps can then be tested for significance in a number of ways: permutation testing (i.e., evaluating whether the spatial association between experiments is random), random-effects analysis (i.e., making inferences over experiments rather than foci), etc. One advantage of this analysis over NS is permutation testing, which NS does not perform (although I guess you could rig it to do so). BrainMap meta-analyses can easily be corrected for cluster size, whereas NS maps are typically reported with FDR correction but without cluster correction.
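
To make the core idea concrete, here's a schematic sketch (not GingerALE itself; sample-size-dependent kernel widths and significance testing are simplified away):

```python
# Schematic ALE: model each experiment's foci as Gaussian blobs, then take
# the voxelwise union across experiments. A toy grid stands in for the brain.
import numpy as np

shape = (20, 20, 20)
grid = np.indices(shape).reshape(3, -1).T  # all voxel coordinates

def modeled_activation(foci, sigma=2.0):
    """One experiment's map: max Gaussian probability over its foci."""
    ma = np.zeros(len(grid))
    for f in foci:
        d2 = ((grid - np.asarray(f)) ** 2).sum(axis=1)
        ma = np.maximum(ma, np.exp(-d2 / (2.0 * sigma ** 2)))
    return ma

experiments = [[(10, 10, 10), (5, 5, 5)],  # foci from experiment 1
               [(11, 9, 10)]]              # foci from experiment 2

# ALE value = probabilistic union of the modeled activation maps
mas = [modeled_activation(f) for f in experiments]
ale = 1.0 - np.prod([1.0 - m for m in mas], axis=0)
print(ale.max())
```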

BrainMap can also be used to produce co-activation maps. My intuition is that you would be better off using NS for this, because this kind of analysis does not require any careful organization of studies, and NS has the advantage of sample size.

You can also decode studies using BrainMap behavioral domains. To do this, you will need to download Mango, which is a volume viewer that can also perform some basic analyses. You will then want to download the behavioral analysis plugin. Now you'll be able to load in your map and "decode" it in much the same way as in NS, but using a limited set of cognitive domains. Note that I think your map needs to be in Talairach space for this to work. You can use my coordinate converter in this folder to prep your map.

Some other things to note: your meta-analysis should include a minimum of 17 studies; otherwise, a single experiment can dominate the others [3]. You also need a minimum of 20 studies to achieve sufficient power for moderate effects (more are needed if there is task variance) [3].

References

[1] A paper outlining the BrainMap project.

[2] Good explanation of ALE and the BrainMap approach to meta-analysis (the updated approach).

[3] My notes show [2] as the reference for this, but I don't think that's correct. I'll hunt down the right study and update this ref.
