As someone who used to work at the Cognitive Neurophysiology Lab at the Scripps Research Institute -- doing some work on functional brain imaging -- I can confirm this was not news even thirty years ago. I guess this is trying to make some point to lay people?
fMRI has been abused by a lot of researchers, doctors, and authors over the years even though experts in the field knew the reality. It’s worth repeating the challenges of interpreting fMRI data to a wider audience.
The way I understood it is that while individual fMRI studies can be amazing, it is borderline impossible to compare them when made using different people or even different MRI machines. So reproducibility is a big issue, even though the tech itself is extremely promising.
This isn't really true. The issue is that when you combine data across multiple MRI scanners (sites), you need to account for random effects (e.g. site-specific means and variances); see solutions like ComBat. Also, if the sites have different equipment versions or manufacturers, those scanners can have different SNR profiles. The other issue is that there are many preprocessing steps, each with many ways to perform it. It's not that researchers process the data in multiple ways and choose the way that gives them the result they want, or anything nefarious like that, but it does make comparisons difficult, since the effects of different preprocessing variations can be significant. To defend against this, many peer reviewers, like myself, request that researchers perform the preprocessing multiple ways to assess how robust the results are to those choices. Another way the field has combated this issue is standardized software like fMRIPrep.
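For anyone curious what the site-effects correction looks like mechanically, here is a deliberately simplified sketch of the core idea (per-site centering and variance rescaling). To be clear, the function name and the approach here are my own toy version; real ComBat additionally uses empirical-Bayes shrinkage of the site parameters and preserves covariates of interest, so don't use this for actual harmonization.

```python
import numpy as np

def harmonize_sites(values, sites):
    """Crude site harmonization for a single measure: remove each site's
    mean and rescale each site to the pooled standard deviation.
    Toy illustration of the idea behind ComBat, not a substitute for it."""
    values = np.asarray(values, dtype=float)
    sites = np.asarray(sites)
    out = np.empty_like(values)
    pooled_std = values.std()
    for s in np.unique(sites):
        mask = sites == s
        site_vals = values[mask]
        site_std = site_vals.std()
        # Guard against a site with constant values
        scale = pooled_std / site_std if site_std > 0 else 1.0
        out[mask] = (site_vals - site_vals.mean()) * scale
    return out
```

After this step the site-specific offsets are gone, which is exactly the "random effects" problem described above; what it cannot fix is genuinely different SNR profiles across scanner hardware.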
It is in fact even difficult to compare the same person on the same fMRI machine (and especially in developmental contexts).
Herting, M. M., Gautam, P., Chen, Z., Mezher, A., & Vetter, N. C. (2018). Test-retest reliability of longitudinal task-based fMRI: Implications for developmental studies. Developmental Cognitive Neuroscience, 33, 17–26. https://doi.org/10.1016/j.dcn.2017.07.001
I read that paper as suggesting that development, behavior, and fMRI are all hard.
It's not at all clear to me that teenagers' brains OR behaviours should be stable across years, especially when it involves decision-making or emotions. Their Figure 3 shows that sensory experiments are a lot more consistent, which seems reasonable.
The technical challenges (registration, motion, etc.) seem like things that will improve, and there are some practical suggestions as well (counterbalancing items, etc.).
While I agree I wouldn't expect too much stability in developing brains, unfortunately there are pretty serious stability issues even in non-developing adult brains (quote below from the paper, for anyone who doesn't want to click through).
I agree it makes a lot of sense that the sensory experiments are more consistent; somatosensory and sensorimotor localization results generally seem to be the most consistent fMRI findings. I am not sure registration or motion correction is really going to help much here; I suspect the reality is just that the BOLD response is a lot less longitudinally stable than we thought (the brain is changing more often and more quickly than we expected).
Or if we do get better at this, it will be through more sophisticated "correction" methods (e.g. deep learners that can predict typical longitudinal BOLD changes, which would better allow such changes to be "subtracted out", or something like that). But I am skeptical about progress here given the amount of data needed to develop any kind of corrective improvement in cases where longitudinal reliabilities are this low.
===
> Using ICCs [intraclass correlation coefficients], recent efforts have examined test-retest reliability of task-based fMRI BOLD signal in adults. Bennett and Miller performed a meta-analysis of 13 fMRI studies between 2001 and 2009 that reported ICCs. ICC values ranged from 0.16 to 0.88, with the average reliability being 0.50 across all studies. Others have also suggested a minimal acceptable threshold of task-based fMRI ICC values of 0.4–0.5 to be considered reliable [...] Moreover, Bennett and Miller, as well as a more recent review, highlight that reliability can change on a study-by-study basis depending on several methodical considerations.
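For readers unfamiliar with ICCs: here is a minimal, illustrative implementation of ICC(3,1), one of the commonly reported test-retest variants (two-way mixed effects, consistency, single measurement). The function name and variant choice are mine -- the papers quoted above report several variants -- and established packages (e.g. pingouin) implement the full ICC family with confidence intervals.

```python
import numpy as np

def icc_3_1(data):
    """ICC(3,1) for an (n_subjects x n_sessions) matrix of scores.
    Values near 1 mean subjects keep their rank across sessions;
    0.4-0.5 is the 'minimally acceptable' threshold cited above."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape                      # n subjects, k sessions
    grand = data.mean()
    row_means = data.mean(axis=1)          # per-subject means
    col_means = data.mean(axis=0)          # per-session means
    # Two-way ANOVA sums of squares
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((data - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

A perfectly stable measure (each subject identical across sessions) gives 1.0; a measure where subjects' ranks scramble between sessions can go all the way to negative values, which is why the 0.16 end of the quoted range is so damning.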
The article is pointing out that one of the base assumptions behind fMRI -- that increased blood flow (which is what the machine can image) is strongly correlated with increased brain activity (which is what you want to measure) -- is not true in many situations. This means the whole approach is suspect if you can't tell which situation you're in.
fMRI usually measures BOLD, i.e. changes in blood oxygenation (well, deoxygenation). The point of the paper is that you can get relative changes like that in lots of ways: you could have more or less blood, or extract more or less oxygen from the same blood.
These can be measured themselves separately (that's exactly what they did here!) and if there's a spatial component, which the figures sort of suggest, you can also look at what a particular spot tends to do. It may also be interesting/important to understand why different parts of the brain seem to use different strategies to meet that demand.
Individual fMRI is not a useful diagnostic tool for general conditions. There have been some clinics trying to push it (or SPECT) as a tool for diagnosing things like ADHD or chronic pain, but there is no scientific basis for this. The operator can basically crank up the noise and get some activity to show up, then tell the patient it’s a sign they have “ring of fire type ADHD” because they set the color pattern to reds and a circular pattern showed up at some point.
Are there proposed reasons for increased blood flow to brain regions other than neural activity? Are neurons flushing waste products or something when less active?
The BOLD response (the coupling between blood oxygenation and neuronal activity) has long been accepted in neuroscience. There have been criticisms of it (non-neuronal contributions, the mystery of negative responses/correlations), but in general the coupling itself is well established.
The measurement of the BOLD response is well accepted, but its interpretation with respect to cognition is still largely unclear. Most papers that assume the BOLD response can be uniformly interpreted as "activation" are quite dubious.
Yes, I stupidly read the headline and said "no duh" but they are making a point about our understanding of brain activity. I was thinking about the part of the signal that is reliably filtered out, they are talking about something else. Sorry, I was wrong.
They are indeed coupled, but the coupling is complicated and may be situationally dependent.
Honestly, it's hard to imagine many aggregate measurements that aren't. For example, suppose you learn that the average worker's pay increased. Is it because a) the economy is booming or b) the economy crashed and lower-paid workers have all been laid off (and are no longer counted).
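To make that concrete with made-up numbers:

```python
# Toy illustration: an average can rise even though no individual got a
# raise, purely because the composition of who is counted changed.
before = [30, 40, 50, 120]        # salaries (k$) with everyone employed
after = [50, 120]                 # crash: the two lowest-paid laid off
avg_before = sum(before) / len(before)
avg_after = sum(after) / len(after)
print(avg_before, avg_after)      # 60.0 85.0 -- "average pay" went up
```

The same ambiguity applies to a voxel's BOLD average: the aggregate moving says little about which underlying component moved.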