Mawson et al. 2017 (two papers) – internet survey of homeschoolers recruited from anti-vaccine groups; non-random, self-reported, unverified health outcomes. Retracted by the publisher after criticism.
Hooker & Miller 2020/2021 – analysis of “control group” data also from self-selected surveys; same methodological problems.
Lyons-Weiler & Thomas 2020, 2022 – data from a single pediatric practice run by one of the authors; serious selection bias.
Joy Garner / NVKP surveys – activist-run online surveys with no verification.
Enriquez et al. 2005 – a small cross-sectional study about allergy self-reports, not about overall neurodevelopment.
Large, well-controlled population studies (Denmark, Finland, the U.S. Vaccine Safety Datalink, etc.) comparing vaccinated vs. unvaccinated children show no increase in autism, neurodevelopmental disorders, or overall morbidity attributable to recommended vaccines.
Take a look at the next sentence from your linked article.
“I am pro-life personaly [sic] but it wasn’t those,” he said, using the jail’s internal messaging system. “I will just say there is a lot of information that will come out in future that people will look at and judge for themselves that goes back 24 months before the 14th. If the gov ever let’s [sic] it get out.”
I think for this to be interesting to an outside observer, it should at least address the bulk of the debate around the topic. If the GP is right and it's missing the key points of the anti-abundance critique, then I'm afraid it's missing the forest for the trees and is misleading to a general audience.
Insofar as "the problem" is sifting through pages and pages of documentation to find the relevant information for an answer, I think that current-day LLMs are actually quite capable at this.
You can measure the sharpness of the position, as in section 2.3 ("Complexity of a position") of this paper. The authors find their metric correlates with human performance.
I think this is something a bit different. That sort of assessment is going to find humans perform poorly in extremely sharp positions with lots of complicated lines that are difficult to evaluate. And that is certainly true. A tactical position that a computer can 'solve' in a few seconds can easily be missed by even very strong humans.
But the position Ding was in was neither sharp nor complex. A good analog to the position there is the rook + bishop v rook endgame. With perfect play that is, in most cases, a draw - and there are even formalized drawing techniques in any endgame text. But in practice it's really quite difficult, to the point that even grandmasters regularly lose it.
In those positions, on almost every move, any move is a draw. But the side with the bishop does have ways to inch up the pressure, and so the difficulty is recognizing when you finally reach one of the moves where you actually need to deal with a concrete threat. The position Ding forced was very similar.
Almost every move, at every point, led to a draw - until it didn't. Gukesh had all sorts of ways to try to prod at Ding's position and make progress - prodding Ding's bishop, penetrating with his king, maneuvering his bishop to a stronger diagonal, cutting off Ding's king, and of course eventually pushing one of the pawns. He was going to be able to play for hours, constantly prodding, while Ding would have to stay 100% alert to when a critical threat emerged.
And this is all why Ding lost. His final mistake looks (and was) elementary, and he noticed it immediately after moving - but the reason he made that mistake is that he was thinking about how to parry the countless other dangerous threats, and he simply missed one. This is why almost everybody was shocked that Ding went for this endgame. It's just so dangerous in practical play, even if the computer can easily show you a zillion ways to draw it.
Real world systems are complicated. In theory, you could do belief propagation to update your beliefs through the whole network, if your brain worked something like a Bayesian network.
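For a concrete (toy) picture of what that updating would look like, here is a minimal sum-product sketch on a three-node chain A -> B -> C with made-up probability tables; observing the last node and passing messages backwards gives the updated belief about the first:

```python
import numpy as np

# Minimal belief-propagation (sum-product) sketch on a chain A -> B -> C.
# All probability tables are hypothetical, purely for illustration.
p_a = np.array([0.7, 0.3])            # P(A)
p_b_given_a = np.array([[0.9, 0.1],   # P(B | A=0)
                        [0.2, 0.8]])  # P(B | A=1)
p_c_given_b = np.array([[0.8, 0.2],   # P(C | B=0)
                        [0.3, 0.7]])  # P(C | B=1)

# Observe C = 1 and propagate the evidence back to A.
msg_c_to_b = p_c_given_b[:, 1]            # P(C=1 | B=b) for each b
msg_b_to_a = p_b_given_a @ msg_c_to_b     # sum_b P(B=b | A=a) * msg(b)
posterior_a = p_a * msg_b_to_a
posterior_a /= posterior_a.sum()
print(posterior_a)                        # P(A | C=1)
```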
Natural selection didn't wire our brains to work like a Bayesian network. If it had, wouldn't it be easier to make converts to the Church of Reverend Bayes? /s
Alternatively, brains ARE Bayesian networks with hard coded priors that cannot be changed without CRISPR.
Isn't that Edwin T. Jaynes example just p-hacking? If only 1 out of 100 experiments produces a statistically significant result, and you only report the one, I would intuitively consider that evidence to be worth less. Can someone more versed in Bayesian statistics better explain the example?
> One who thinks that the important question is: "Which quantities are random?" is then in this situation. For the first researcher, n was a fixed constant, r was a random variable with a certain sampling distribution. For the second researcher, r/n was a fixed constant (approximately), and n was the random variable, with a very different sampling distribution. Orthodox practice will then analyze the two experiments in different ways, and will in general draw different conclusions about the efficacy of the treatment from them.
But so then the data _are_ different between the two experiments, because they were observing different random variables -- so why is it concerning if they arrive at different conclusions? In fact, the _fact that the 2nd experiment finished_ is also an observation on its own (e.g. if the treatment was in fact a dangerous poison, perhaps it would have been infeasible for the 2nd researcher to reach their stopping criteria).
I think the point is that the different planned stopping rules of each researcher--their subjective thoughts--should not affect what we consider the objective or mathematical significance of their otherwise-identical process and results. (Not unless humans have psychic powers.)
It's illogical to deride one of those two result-sets as telling us less about the objective universe just because the researcher had a different private intent (e.g. "p-hacking") for stopping at n=100.
_________________
> According to old-fashioned statistical procedure [...] It’s quite possible that the first experiment will be “statistically significant,” the second not. [...]
> But the likelihood of a given state of Nature producing the data we have seen, has nothing to do with the researcher’s private intentions. So whatever our hypotheses about Nature, the likelihood ratio is the same, and the evidential impact is the same, and the posterior belief should be the same, between the two experiments. At least one of the two Old Style methods must discard relevant information—or simply do the wrong calculation—for the two methods to arrive at different answers.
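A quick numerical sketch of that point (the 100 patients and the 60% figure come from the essay; the 70 cures is a number I made up for illustration): the fixed-n binomial likelihood and the fixed-r negative-binomial likelihood differ only by a constant factor in theta, so any prior gives the same posterior for both researchers, while the frequentist tail-area calculations differ.

```python
import numpy as np
from scipy.stats import binom, nbinom

# Illustrative numbers: r = 70 cures observed in n = 100 patients,
# null cure rate theta0 = 0.6.
r, n, theta0 = 70, 100, 0.6
thetas = np.linspace(0.01, 0.99, 5)

# Researcher 1: n fixed in advance, r is the random variable (binomial).
lik_fixed_n = binom.pmf(r, n, thetas)
# Researcher 2: stop at the r-th cure, n is the random variable
# (negative binomial: n - r failures before the r-th success).
lik_fixed_r = nbinom.pmf(n - r, r, thetas)

# The ratio is the constant n / r, independent of theta, so the
# evidential impact on any Bayesian posterior is identical.
print(lik_fixed_n / lik_fixed_r)          # ~1.43 for every theta

# But the tail-area "significance" calculations differ:
p_fixed_n = binom.sf(r - 1, n, theta0)    # P(R >= r | n fixed)
p_fixed_r = nbinom.cdf(n - r, r, theta0)  # P(N <= n | r fixed)
print(p_fixed_n, p_fixed_r)
```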
If you have two researchers, and one is "trying" to p-hack by repeating an experiment with different parameters, and one is trying to avoid p-hacking by preregistering their parameters, you might expect the paper published by the latter one to be more reliable.
However, if you know that the first researcher just happened to get a positive result on their first try (and therefore didn't actually have to modify parameters), Bayesian math says that their intentions didn't matter, only their result. If, however, they did 100 experiments and chose the best one, then their intentions... still don't matter! but their behavior does matter, and so we can discount their paper.
Now, if you _only_ know their intentions but not their final behavior (because they didn't say how many experiments they did before publishing), then their intentions matter because we can predict their behavior based on their intentions. But once you know their behavior (how many experiments they attempted), you no longer care about their intentions; the data speaks for itself.
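A purely illustrative simulation of why the behavior (not the intent) is what gets discounted: if a treatment that does nothing beyond a 60% baseline is tested 100 times and only the most favorable run is reported, a "significant" result is nearly guaranteed.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)

# Hypothetical setup: true cure rate equals the null value theta0, each
# experiment treats n patients, and only the best of k experiments is kept.
theta0, n, k_experiments, alpha = 0.6, 100, 100, 0.05

def best_of_k_significant():
    cures = rng.binomial(n, theta0, size=k_experiments)
    p_values = binom.sf(cures - 1, n, theta0)   # one-sided p per experiment
    return p_values.min() < alpha

runs = 2000
print(sum(best_of_k_significant() for _ in range(runs)) / runs)
# Far above 0.05 -- close to certainty -- even though the treatment does nothing.
```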
Well, no, because it's talking about either a fixed sample size or stopping once a given cure percentage is reached. Neither necessarily implies a favourable p-value.
I think the author means that the two methods happen to collect equivalent data yet may draw different conclusions based on their initial assumptions. The question is how you make coherent sense of that.
At level 1 depth it’s insightful.
At level 2 depth it’s a straw man.
At level 3 depth, just keep drinking until you’re back at level 1 depth.
> The other ... decided he would not stop until he had data indicating a rate of cures definitely greater than 60%
I believe that "definitely greater than 60%" is supposed to imply that the researcher is stopping when the p-value of their HA (theta>=60%) is below alpha, so an optional stopping (ie. "p-hacking") situation.
Yes, the possible programs are enumerable, and you can start searching with the least complex programs and work your way up in complexity. Once you find a program that explains the available data, you cannot guarantee it will continue to explain future data, unless, as you mention, you constrain the program space to a finite set. What you're describing is generally how people make models of the external world.
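As a toy sketch of that search (a tiny hand-rolled expression grammar standing in for "programs", purely illustrative): enumerate candidates from simplest to more complex and return the first one that reproduces the observed data; nothing guarantees it keeps matching future observations.

```python
from itertools import product

# Toy "program" space: arithmetic expressions in n built from a tiny,
# hypothetical grammar, searched in order of increasing structural depth.
ATOMS = ["n", "1", "2", "3"]
OPS = ["+", "-", "*"]

def expressions(max_depth):
    """Yield expression strings from single atoms up to deeper combinations."""
    level = list(ATOMS)
    yield from level
    for _ in range(max_depth):
        level = [f"({a}{op}{b})" for a, op, b in product(level, OPS, ATOMS)]
        yield from level

def fits(expr, data):
    try:
        return all(eval(expr, {}, {"n": n}) == y for n, y in data)
    except Exception:
        return False

# Observations so far: f(0)=1, f(1)=3, f(2)=5.  The search returns the first
# (i.e. simplest, in this ordering) expression consistent with them.
data = [(0, 1), (1, 3), (2, 5)]
print(next(e for e in expressions(3) if fits(e, data)))   # e.g. ((n+n)+1)
```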
Just for the sake of argument, I can name an entire field of science that was invalidated in light of genetic & neuroscience evidence: phrenology. At the time, it was the newest advancement in the gleaming era of scientific Enlightenment. It just happened to justify the colonial policies of its time. Now, a couple hundred years later, we're walking back a widely supported but misguided "scientific" field.