Talk:Bayesian probability
Mathematics Start‑class High‑priority
Reference Classes
It's true that the remarks on the absence of reference classes are confusing, mostly because they're incorrect (now fixed in the controversy section at least). There must always be a reference class for any application of probability theory under anybody's interpretation. The basic point is that uncertainty is all expressed probabilistically. Whether the uncertainty comes from known empirical sampling variation in repeated trials, or just brute subjective states, is irrelevant. (Viz)
---
Hey authors, I'm sure the maths in this article is great, but as a non-mathematician, I can't get past the second paragraph in this article without being completely confused:
"The Bayesian interpretation of probability allows probabilities to be assigned to all propositions (or, in some formulations, to the events signified by those propositions) independently of any reference class within which purported facts can be thought to have a relative frequency. "
Reference class? What is that? There is no link to what that means... Just an idea: a bit of a rewrite, so that non-mathematicians (or logical philosophers) can understand this article would be great.--Bilz0r 21:03, 28 March 2006 (UTC)
---
Types of probability: Bayesian, subjective, personal, epistemic
"Bayesian probability is also known as subjective probability, personal probability, or epistemic probability." Is this true? I'm no expert on this, but I thought subjective probability was more or less the "feel" for how probable something was, that personal probability was a very close synonym of subjective probability, and epistemic probability was, well, something different yet: the degree to which a person's belief is probably true given the total evidence that the person has. Again, I could easily be wrong, but wouldn't it be an at least somewhat controversial theory about subjective, personal, and epistemic probability to say that they are each reducible to Bayesian probability? --LMS
- The way you describe it, they all seem like the same thing. Cox's theorem suggests that either they must be the same thing or that one or more of the theories contains inconsistencies or violates "common sense".
- Bayesian probabilities are subjective in the sense that two different people can assign different probabilities to the same event because they each have different information available to them. Probabilities are subjective because they depend on what you know. In the frequentist world view probabilities are properties of the system, and don't vary with the observer. Jaynes talked about this a lot. -- Gavin Crooks
I definitely agree with Larry (i.e. "LMS") on this one. The fact that different probabilities may be assigned given different information certainly is not enough to mean they're subjective. Michael Hardy 30 June 2005 23:50 (UTC)
They are not the same thing. The mathematics involved is the same thing according to what Richard Cox is saying, but what the mathematics is applied to is the difference. Mathematicians like to think probability theory is only about mathematics and nothing else, but that's just narrowness of mathematicians. (Shameless plug: See my paper on pages 243-292 of the August 2002 issue of Advances in Applied Mathematics. I assert there that I think Cox's assumptions are too strong, although I don't really say why. I do say what I would replace them with.) -- Mike Hardy
I agree with what Larry Sanger said above. The article was wrong to say the following (which I will alter once I've figured out what to replace it with): "The term 'subjective probability' arises since people with different 'prior information' could apply the theory correctly and obtain different probabilities for any particular statement. However given a particular set of prior information, the theory will always lead to the same conclusions." That people with the SAME prior information could assign DIFFERENT probabilities is what subjectivity suggests; if people with the same prior information were always led to the same epistemic probability assignments when they correctly applied the theory, then those would be epistemically OBJECTIVE assessments of probability. They would not be "objective" in the sense in which that word unfortunately gets used most often; i.e., the probabilities would not be "in the object"; they would not be relative frequencies of successes in independent trials, nor proportions of populations, etc. I distinguish between "logical" epistemic probabilities, which are epistemically objectively assigned (and nobody has any idea how to do that except in very simple cases) and "subjective" epistemic probabilities, which measure how sure someone is of something --- the "feel", as Larry puts it. I have been known to say in public that the words "random" and "event" should be banished from the theory of probability, but we're getting into gigantic Augean stables when we say that. Michael Hardy 22:27 Jan 18, 2003 (UTC)
- It's hard to overstate the difficulties in understanding the foundations of probability. A quick web search turns up [1] which mentions several schools of thought in a few paragraphs, inevitably including several different schools within the "subjective probability" camp! User:(
It's not strictly correct to say that no one knows how to assign logical probabilities in non-trivial cases. For example, the empirical fact of Benford's Law can be logically explained using scale invariance, which is a special case of E.T. Jaynes's Principle of transformation groups. Another non-trivial case which can be solved with this principle is Bertrand's Problem. Jaynes's articles are available here. Cyan 22:42 Mar 27, 2003 (UTC)
Jaynes' arguments are very interesting, but I don't think they have yet reached the stage of proof-beyond-all-reasonable-doubt, and there's a lot of work to be done before many statisticians can reliably apply them in typical applied statistics problems. Michael Hardy 00:31 Mar 28, 2003 (UTC)
But you must admit that while Jaynes' position is not complete, it has wider scope and greater consistency than the frequentist approach (muddle of ad-hoc methods)? For me, Jaynes' recent book makes this case, and does it by focusing on comparison of results (I've added an ext-link to it on his page). (I fear that you may be interpreted as implying 'even the best cars can get stuck in the mud - so you should always walk...'. While researchers rightly focus on what is left to be added, someone reading an encyclopedia is looking to learn what is.) 193.116.20.220 16:58 16 May 2003 (UTC)
- It is also at times an attempt to describe the scientific method of starting with an initial set of beliefs about the relative plausibility of various hypotheses, collecting new information (for example by conducting an experiment), and adjusting the original set of beliefs in the light of the new information to produce a more refined set of beliefs on the plausibility of the different hypotheses.
This sentence and the paragraph "Applications of Bayesian Probability" should be removed or moved to Bayes Theorem. In order to avoid confusion it is crucial to distinguish the philosophical interpretation of probability from the mathematical formula developed by Bayes. These are not the same, they are often misunderstood, and the current version of the article makes it easy to get it wrong. Bayes' Theorem is a mathematical formula whose truth cannot be reasonably disputed. Bayesian probability is an interpretation of a mathematical construct (probability) and there was significant dispute in the past. I suggest we discuss the philosophy and the historical dispute in this article and the math with its applications in Bayes Theorem. 134.155.53.23 14:28, 23 Dec 2003 (UTC)
Crow Paradox
Isn't this related to the All Crows are Black Paradox?
- Hempel's paradox says that "All Crows are Black" (uncontroversial) logically implies "All not-Crows are not-Black" (manifestly untrue). What's your point? 217.42.117.73 14:46, 6 Mar 2005 (UTC)
- No, it's equivalent to "all non-black things are not crows". Banno 19:44, Mar 6, 2005 (UTC)
- Could it be that the paradox intended here is that a brown cow counterintuitively becomes a confirming instance of the hypothesis that all crows are black? PJTraill 00:08, 6 November 2006 (UTC)
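The distinction drawn in the replies above (contrapositive versus inverse) can be checked by brute force. What follows is only a minimal sketch, not anything taken from the article: it enumerates every small "world" of objects, each of which is or is not a crow and is or is not black, and compares the three statements. The function names and the world size of three objects are illustrative assumptions.

```python
from itertools import product

# Each object is an (is_crow, is_black) pair; a "world" is a tuple of objects.
KINDS = list(product([False, True], repeat=2))

def all_crows_black(world):
    return all(black for crow, black in world if crow)

def all_nonblack_noncrows(world):
    # Contrapositive: every non-black thing is a non-crow.
    return all(not crow for crow, black in world if not black)

def all_noncrows_nonblack(world):
    # Inverse: every non-crow is non-black (a different claim).
    return all(not black for crow, black in world if not crow)

worlds = list(product(KINDS, repeat=3))  # every world with three objects

contrapositive_equivalent = all(
    all_crows_black(w) == all_nonblack_noncrows(w) for w in worlds
)
inverse_equivalent = all(
    all_crows_black(w) == all_noncrows_nonblack(w) for w in worlds
)

print(contrapositive_equivalent)  # True
print(inverse_equivalent)         # False
```

A world containing only black non-crows is enough to separate the statements: "all crows are black" holds vacuously there, while "all non-crows are non-black" fails.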
Observation and question
I don't know how or if this can be incorporated, but it's been my experience from comparison of frequentist multiple-range tests (e.g., Ryan-Einot-Gabriel-Welsch) with a Bayesian test (e.g., Waller-Duncan) that the former are more subject to perturbation by the overall distribution of the dataset. Specifically, if one mean is of very much greater magnitude than all the other means, the frequentist test will assign the extreme mean to one group while assigning all other means to a second group, no matter how much difference there may be among the remaining means and no matter how tightly each treatment's results group around their respective means. The Bayesian test, on the other hand, does not do this. While none of us poor molecular biologists in our lab have our heads around the math, the Bayesian outcome "smells better" given our overall understanding of the specific system, developed over multiple experiments. Since we're cautious, we just report both analyses in our papers. Dogface 04:07, 13 Oct 2004 (UTC)
problem of information
I just added a section about (well-known) problem of conveying information with probabilities; I think it should be better integrated with the rest, but I am not an expert in the rest. :) Samohyl Jan 19:59, 11 Feb 2005 (UTC)
- Looks interesting, but can it be revised to either use "information" in the technical sense, or some other word if not? It seems like we should omit the disclaimer note that the use of "information" here is not a strict mathematical one if possible. Regards & happy editing, Wile E. Heresiarch 00:37, 12 Feb 2005 (UTC)
- You're right. I have changed "information" to "evidence", and it looks much better now. Samohyl Jan 10:16, 12 Feb 2005 (UTC)
Oh, man. People really shouldn't write sections when they have no clue what they are talking about. A Bayesian would assign the same probability of picking a head; however, the prior probability over hypotheses (what is the probability of the coin landing heads) would most likely be represented as a Beta distribution whose density is
P(theta; a, b) = Gamma(a+b) / (Gamma(a) Gamma(b)) * theta^(a-1) * (1-theta)^(b-1)
I'm going to place a NPOV message on this article until these errors are sorted out.
- why don't you just fix it? Why the fuss?Banno 06:06, August 27, 2005 (UTC)
- I think the section is about that there is other uncertainty - ignorance - than probability, and as such it cannot be represented by the probability distribution itself. You could of course represent both as a probability over the space of hypotheses, but, imho, this is exactly the core of the frequentist/bayesian debate about what can/should probabilities represent. You chose beta distribution for technical reasons (conjugate prior), but in fact, you're in this case only interested in the mean and variance of the probability distribution, so many other distributions would do. This notion is exactly what Dempster-Shafer theory or upper probability tries to model. On the other hand, you're right, from a practical POV, bayesians commonly do it this way, and it (probably) yields the same results as using Dempster-Shafer theory (except I am not sure if you won't run into computational difficulties in more complex cases than coin flipping). But I am not expert, and will happily read opinion of the others. Samohyl Jan 13:21, 27 August 2005 (UTC)
- I'm not solely interested in the mean and variance. In fact, if I were going to determine P(something), where something depended on theta then I would treat theta as a nuisance parameter and integrate it out, taking every piece of information conveyed in the distribution into account. The inappropriate thing about the section (and the reason I fixed it) is that this page is about Bayesian probability, and in Bayesian probability, you can express uncertainty in hypothesis using probability theory. I think that, eventually, these pages on Bayesian probability and inference should be rearranged and that a new page on the war between frequentists and bayesians be separated out. It's important to know that there is controversy, but the back-and-forth "this is what bayesians think but..." makes an already difficult subject matter even more difficult.
- Now, Samohyl. You say that "this is exactly the core of the debate." I agree. However, we cannot have the articles flopping back and forth. Bayesians choose to interpret probabilities as degrees of beliefs. Frequentists don't. So your section on how frequentists point out some shortfall of Bayesian analysis, when in fact, that's NOT how bayesians DO IT, is absurd. Frequentists think it's a shortfall because they don't allow the solution Bayesians use.
- I think we should hear some suggestions about what might go on a "controversy page"... My suggestion would be a page on "Frequentists paradoxes", "Bayesian paradoxes", "Bayesian vs Frequentist", and then change the Bayesian pages to cite that there is controversy but then explain clearly the point of view of Bayesians (and similarly on the frequentist pages). roydanroy 13:48, 27 August 2005 (EST)
- Fine. I just found probability interpretations, which seems to be the right page. Samohyl Jan 20:36, 27 August 2005 (UTC)
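For anyone following the Beta-prior argument in this thread, here is a minimal sketch of what "integrating theta out" looks like. It assumes a Beta(a, b) prior over the coin's heads probability theta, using the density quoted earlier in the thread, and evaluates P(heads) as the integral of theta * p(theta) over [0, 1]. The function names, the midpoint rule and the two sets of parameter values are illustrative choices, not anything from the article.

```python
import math

def beta_pdf(theta, a, b):
    """Beta(a, b) density: Gamma(a+b)/(Gamma(a)Gamma(b)) * theta^(a-1) * (1-theta)^(b-1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * theta ** (a - 1) * (1 - theta) ** (b - 1)

def prob_heads(a, b, steps=100_000):
    """P(heads) = integral of theta * p(theta) over [0, 1], by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += theta * beta_pdf(theta, a, b)
    return total * h

# Two priors with the same mean but very different spread (illustrative values).
vague = prob_heads(1, 1)    # uniform prior: little evidence about the coin
sharp = prob_heads(50, 50)  # concentrated near 0.5: much more evidence

print(round(vague, 6), round(sharp, 6))
```

Both priors give a probability of heads very close to 0.5, in agreement with the closed form a / (a + b). The difference in evidence shows up in the spread of the distribution over theta rather than in that single number, which seems to be the point both sides of the thread are circling.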
Disputed: Paradoxes in Bayesian inference
I would dispute the following:
This can lead to paradox situations such as false positives in bayesian inference, where your prior probability can be based on more evidence (contains more information) than your posterior probability, if the conditional probability is not based on enough evidence.
What evidence is there that paradoxes occur? User:Blaise 21:11, 2 Mar 2005 (UTC)
- You are right, I thought there was a connection, but it isn't, so I will remove the sentence. Thanks for correction. But some example where this is a problem would be fine, if there is any. Samohyl Jan 19:54, 3 Mar 2005 (UTC)
- Actually, like most "paradoxes", this is not really a paradox. First, it is perfectly reasonable that some observations can result in an increase in a prior state of uncertainty (e.g. the experimental result that "raises more questions than answers"). Second, it can be shown that such results will always be less likely than results that reduce uncertainty. The net effect is that the probability weighted average of the different results, even if some results increase uncertainty, will always produce an expected (i.e. probability weighted average) decrease in uncertainty. One example is the medical test for some rare condition. It is possible that - given one's family history, age and other factors - there is a prior probability of 10% that a person has the condition. Given the severity of the condition, this 10% chance is considered high enough for a doctor to recommend an inexpensive - and imperfect - preliminary test. A negative result means it becomes very unlikely the person has the condition, let's say 0.001%. There is still a chance the negative result could be in error, but it is very small. A positive result, on the other hand, could just increase the chance of having the condition to 50%. In short, a positive result has a higher uncertainty than the initial state according to Shannon's entropy formula H = -p(x) log2(p(x)) - (1 - p(x)) log2(1 - p(x)), where p(x) is the probability of having the medical condition we are looking for. The test is still useful because it can easily determine if an even more expensive test should be undertaken. But if we compute the expected uncertainty reduction, we find that the probability weighted average of the potential test results is a reduction in uncertainty. The result that decreases uncertainty will ALWAYS be more likely than results that increase uncertainty. It is, in fact, impossible to even hypothetically design a test that would violate this condition.
In the case of this example, the expected change in Shannon entropy would be: dH = p(PR)*H(PR) + p(NR)*H(NR) - H(prior), where p(PR) is the probability of a positive result, H(PR) is the entropy after getting a positive result, p(NR) is the probability of a negative result, H(NR) is the entropy after getting a negative result, and H(prior) is the entropy before the test. Any combination of probabilities of results and conditional probabilities of having the medical condition given a result must also be consistent with the rule that mutually exclusive and collectively exhaustive events must have a total probability of 1 and must be consistent with the general form of Bayes Theorem. If you don't believe me, try to find a set of probabilities in the medical test example such that dH is positive (that is, probability weighted average H increases) yet meets the condition that the probabilities of mutually exclusive and collectively exhaustive events add to 1. ERosa (talk) 05:10, 13 January 2008 (UTC)
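The medical-test numbers above can be plugged into the dH formula directly. The following is a minimal sketch under stated assumptions: the 10%, 50% and 0.001% figures come from the comment, while the probability of a positive result is not given there and is derived here from the law of total probability. The variable names are illustrative. The sketch checks only the averaged claim (dH is negative for these numbers); it does not test the stronger claim about which result is more likely.

```python
import math

def binary_entropy(p):
    """Shannon entropy in bits of a yes/no variable with P(yes) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

prior = 0.10        # P(condition) before the test, from the comment above
post_pos = 0.50     # P(condition | positive result)
post_neg = 0.00001  # P(condition | negative result), i.e. 0.001%

# P(positive) is fixed by the law of total probability (derived, not quoted):
# prior = p_pos * post_pos + (1 - p_pos) * post_neg
p_pos = (prior - post_neg) / (post_pos - post_neg)
p_neg = 1 - p_pos

h_prior = binary_entropy(prior)
h_pos = binary_entropy(post_pos)
h_neg = binary_entropy(post_neg)

# Expected change in entropy, as defined in the comment above.
dH = p_pos * h_pos + p_neg * h_neg - h_prior

print(f"P(positive) = {p_pos:.4f}")
print(f"H(prior) = {h_prior:.4f}, H(PR) = {h_pos:.4f}, H(NR) = {h_neg:.6f}")
print(f"dH = {dH:.4f}")
```

With these inputs the positive result does raise the entropy (H(PR) is 1 bit, above H(prior)), yet dH comes out negative, so the expected uncertainty still falls.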
Another dispute: What Laplace meant
In the history section, this apparent paraphrasing of Laplace bugs me:
- 'It is a bet of 11000 to 1 that the error in this result is not within 1/100th of its value'
Am I reading this wrong, or is that saying exactly the opposite of what is meant? Shouldn't "not within" be replaced by either "within" or "not more than" in this context? Or am I reading the stating of the odds backwards? I traced it all the way back to the edit which added that whole section, so it's not just some random minor vandal sticking a "not" in there, at least. --John Owens (talk) 23:28, 2005 Mar 14 (UTC)
- I am morally certain that you are correct. The sentence doesn't make sense as written, and doesn't comport with my memory of what Laplace wrote. However, I cannot find the original quote in my library. Stigler doesn't have it, nor does J. O. Berger (at least, not so I can find it from the index). I had thought that Jaynes might have it, but the index in his finally printed version is not very good. I would be comfortable in making the change, but think that if someone can find the quote we would be on more solid ground. Perhaps the original contributor of this quotation can verify it? Bill Jefferys 20:34, 29 August 2005 (UTC)
I believe I have found where the original contributor got this. The quotation from the Wiki article is
- For instance, Laplace estimated the mass of Saturn, given orbital data that were available to him from various astronomical observations. He presented the result together with an indication of its uncertainty, stating it like this: 'It is a bet of 11000 to 1 that the error in this result is not within 1/100th of its value'. Based on current estimates, he would have won the bet, as another 150 years' accumulation of data has changed the estimate by only 0.63%.
D. S. Sivia, in Data Analysis: A Bayesian Tutorial (Oxford: Clarendon Press 1996, pp. 7-8) writes
- Laplace stated that '...it is a bet of 11,000 to 1 that the error of this result is not 1/100 of its value'. He would have won the bet, as another 150 years' accumulation of data has changed the estimate by only 0.63%!
The only significant difference is the inclusion of the word 'within' in the Wiki article, which appears to be a mistake. Accordingly, I will remove that word from the article.
However, the second sentence now becomes problematic, as it appears to have been lifted from Sivia's book without attribution and only minor change. I do not know what the Wiki convention for this would be. It seems to me that a paraphrase is in order, or else an attribution to Sivia. Can someone with more experience with Wikipedia comment? Bill Jefferys 23:05, 29 August 2005 (UTC)
- Hello Bill. I'm glad to see you're here. I am familiar with your posts to sci.stat.math (mine appear under the name "Robert Dodier"). Anyway, about the unattributed quotation, we do try to keep our collective noses clean wrt copyrights. I see two solutions: (1) "put it in quote marks" and attribute the remark to Sivia. (2) replace it with a paraphrase. Maybe someone has already done one or the other. Thanks for your contributions and keep up the good work! Wile E. Heresiarch 01:12, 30 August 2005 (UTC)
Hello Robert, I remember you well (but have stopped reading UseNet as it was taking too much of my time). Thanks for your remarks. I have modified the article to explicitly credit Sivia with the comment, and have added his book in the references section.
Meantime, if you (or anyone else) wishes to look at the article I wrote on admissible decision rules or the additions I made to the section on uninformative priors, I would be grateful for any additions or corrections. Also, articles are needed on reference priors, complete class theorems and Stein effect. Is anyone willing to take a crack at some of these? Bill Jefferys 20:43, 2 September 2005 (UTC)
Need a better example of frequentist probability
In the section titled Bayesian and Frequentist probability, the statement is:
- 'For example, Laplace estimated the mass of Saturn (described above) in this way. According to the frequency probability definition, however, the laws of probability are not applicable to this problem. This is because the mass of Saturn is a constant and not a random variable, therefore, it has no frequency distribution and so the laws of probability cannot be used.'
My reading of this is that the statement refers to the value for the mass arrived at experimentally, not the absolute mass of Saturn. The experimental results will have a frequency distribution, depending on the degree of error associated with the method and apparatus used. I don't have an alternative example, but I find this one less than ideal.
- WRONG WRONG WRONG WRONG!!!!!! This is a good example precisely because it's about the absolute mass and that is NOT a random variable. The whole point of Bayesianism is to assign probability distributions NOT to things that vary randomly, but to things that are uncertain. Michael Hardy 23:17, 28 August 2005 (UTC)
- Note that some authorities say that the essence of Bayesianism is that quantities such as the mass of Saturn are considered to be random variables, with the data fixed and not themselves random variables (although a particular data set does arise from a random process) whereas under frequentism it is the other way around. Much of the Bayesian literature talks in this way. But I am in agreement with Michael Hardy that the Saturn mass example is an excellent one (besides being a classical example that is commonly given as such). Bill Jefferys 20:16, 29 August 2005 (UTC)
- Indeed, James Berger is one of the Bayesian authorities from whom you may hear that, and he will tell you explicitly that he really doesn't care about this particular issue, so at least on that point he is not an authority. Other "Bayesian authorities" use the language of "random variables" merely because it's conventional in probability theory, and it takes considerable effort to extensively revise conventional nomenclature to be consistent with one's philosophical position. Bottom line: those "Bayesian authorities" who say things like that are not necessarily taking a well-thought-out philosophical position; they're just being conventional. But see Edwin Jaynes's much-discussed book Probability Theory: the Logic of Science. Michael Hardy 01:17, 30 August 2005 (UTC)
- I asked Jim Berger about exactly this point not too long ago. He agreed that from the Bayesian POV, parameters are properly considered random variables. He didn't volunteer that he doesn't particularly care about the issue, and I didn't ask. I'm puzzled why you would think that his not caring (if this is indeed the case) would make him not an authority! But in any case, from an axiomatic point of view, parameters in Bayesian statistics possess all of the characteristics of random variables, so it seems hard to avoid concluding that they are. Of course, if you are using the phrase 'random variable' in a vernacular sense ("things that vary randomly") then they aren't, but I don't see a good reason to abandon a precise mathematical definition just because the general public uses 'random' in an imprecise way.
- I have the published version of Jaynes' book (Cambridge Press, 2003)...can you cite a page in particular? I don't want to have to search the entire book, and the index isn't very good. Bill Jefferys 19:38, 2 September 2005 (UTC)
- From a mathematical point of view, they are "random variables". I'll try to find an appropriate page in the Jaynes book, but until then, look in the index under "mind projection fallacy". Michael Hardy 00:00, 3 September 2005 (UTC)
- The word random is avoided by Bayesians because most things are not random (and some think nothing is). Jaynes goes through a lot of trouble explaining that a coin flip's result is uncertain not because coin flips are random and unpredictable, but instead, that the initial conditions of the coin flip (which 100% characterize its final head/tail state) are uncertain and that this uncertainty propagates and results in uncertainty in its final resting state. People who do bayesian analysis but still use the frequentist language (random variables, IID, etc) sometimes are still confused about this issue, falling prey to Jaynes' mind projection fallacy as Mr. Hardy cited. Roydanroy
Beta Prior Distribution at Bayesian data analysis example
Hi,
When "probability that the next ball is black is p" is calculated under situation 1 (no knowledge as to the quantities) a Beta prior distribution is calculated "B(alfa_B=1,alfa_W=1)". Nevertheless when I checked the Beta distribution definition at "http://en.wikipedia.org/wiki/Beta_distribution" it doesn't correspond. Fita should be raised to the power "alfa_B-1" and "1-fita" should be raised to the power "alfa_W-1", which should be in our case "0" for both. This is not what is shown in the example, where both values are raised to the power "1".
I also think there is a typo at the same point. Later it refers to the case where m black balls and n white balls are drawn and it is defined "alfa_B=1+m" and "alfa_B=1+n". I assume that the second should be "alfa_W=1+n".
Besides that it would help if same notation is used everywhere ("alfa_B" should be "alfa" and "alfa_W" should be "beta").
I hope it is clear enough. Sorry for my English.
Regards
Alex
- You're right. Although some authorities define the beta distribution as in this page, the most standard definition incorporates a '-1' in the exponents of both θ and (1 − θ). Also, the Wiki article on the beta distribution defines it in this way. So for uniformity, the article on Bayesian probability should be revised to conform. Note that the normalizing constant also needs correction as per the beta distribution article. Are there any disagreements? Bill Jefferys 17:08, 4 September 2005 (UTC)
- Thanks (good eye). I've fixed it.
Hi again,
I'm not so sure that this new equation "P(fita)=1" regardless fita value is correct. It implies that statement that the probability that the next ball is black is ex. 30% is 100%, also that statement that the probability that the next ball is black is 60% is also 100%, ... it would make more sense to me if only P(0.5)=1 and all other values 0, what do you think?
Alex
- I assume by 'fita' you mean 'theta' or in symbols either or . But nowhere in the article can I find that anyone is asserting that P(θ) = 1. In example three it is suggested that when you know how many balls are in the box (and that the number of black balls equals the number of white balls), that an appropriate prior is , though I think the writer meant since isn't defined anywhere (someone check me on this). I think this isn't quite right...rather, it should be written p(θ) = δ(θ − 1/2), which puts a unit mass on θ = 1/2 and no mass anywhere else. The lower case p is appropriate since the Dirac delta function is a (limiting) probability density rather than a distribution on a finite space. Bill Jefferys 16:54, 9 September 2005 (UTC)
- Hi Bill, You are right, I meant theta "θ". Sorry :). My question was on example 1 "You have a box with white and black balls, but no knowledge as to the quantities". After the last editorial modification it states that letting θ = p represent the statement "the probability that the next ball is black is p". From that, I conclude that independently of the p value (from 0 to 1) the probability of θ is always 1. So, regardless of the probability value (ex. p=0.3, p=0.9, ...) a Bayesian will assign a 100% probability to the sentence. That doesn't make sense to me ... It makes sense to me that all sentences (with different "p" values) may have the same probability value since all can be equally right, except for values p=0 (since we know there are black balls in the box) and p=1 (since we know there are white balls in the box), but, to me that probability value cannot be 1. What are your thoughts? Besides that I agree with you, p(θ) = δ(θ − 1/2) is more correct. Best regards, Alex
- I am missing something here. I do not see where it states, after example 1, after the last editorial modification, where it says that P(θ) = 1. If it were to say this, it would certainly be wrong in the context of the example. But I cannot find this statement. Even when I search the source code for the article for 'P(\theta)=1', I cannot find it.
- Please show me specifically where the statement is. Bill Jefferys 01:52, 10 September 2005 (UTC)
- Example 1 states , so, at the end it is
- OK, now I see the confusion. The statement indeed says that P(θ) = 1, but that is not a statement that dogmatically. Rather, it is a statement that θ can take on any value in its range, which for the beta distribution is the interval [0,1]. Equal probability is assigned to each value in the range. Recall that P(θ) is a probability density function. The value 1 is just the normalization (since the integral over all of the range of θ has to equal 1). In another case, say where θ ranges from -1 to 1 (for example, a correlation coefficient), one would have range [-1,1] and P(θ) = 1/2 so that the integral would be 1.
- So, there is nothing wrong with this equation. It is exactly correct.
- Does this clarify things for you? Bill Jefferys 13:54, 10 September 2005 (UTC)
- Hi, I've got to the same conclusion this afternoon while I was driving. P(θ) = 1 means that all statements (all θ values) have the same probability. But it doesn't mean that they have 100% probability. I only would like to emphasize that only the p = 0 and p = 1 values should have 0% probability since we know for sure that in the box there are black and white balls. Regards. Alex
- Technically, if you know that there are both black balls and white balls in the box, the probability is zero for θ = 0 and θ = 1. This would mean that P(θ) = 1 only on the open interval (0,1) but not on the closed interval [0,1]. However, this is only a technicality, since probabilities are derived by integration over intervals and generally speaking (for well-behaved functions) the integrals are the same regardless of whether one includes or excludes the endpoints.
- To make things even more complicated, probabilities are defined on a sigma-algebra of sets, and for the real line these are usually defined as sets that are half-open and half-closed, e.g., [a,b). The technicalities of constructing such a sigma-algebra may lead to odd looking things happening at the endpoints of intervals. But, as I say, these technicalities need not concern anyone but the most fastidious, since they don't arise under most well-behaved circumstances. Bill Jefferys 22:31, 10 September 2005 (UTC)
- Do last editing modifications on example 1 affect earlier sentence "...they would choose probability 1/2 in all three cases..." refering to Bayesian? Alex
- Hi, at example 1, "..., ...." should read "..., ", shoudn't it? Alex
- You're right. I've fixed this. Bill Jefferys 15:14, 19 September 2005 (UTC)
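The density-versus-probability point settled in this thread can be illustrated numerically. This is a minimal sketch, assuming the standard Beta density with the '-1' exponents discussed above; the helper names, the midpoint-rule integrator and the sample points are illustrative, not taken from the article. It shows that the Beta(1, 1) density equals 1 at every theta, that it integrates to 1 over [0, 1], that the probability of theta lying in a small interval is small, and that a uniform density on [-1, 1] must be 1/2.

```python
import math

def beta_pdf(theta, alpha, beta):
    """Beta density with exponents alpha - 1 and beta - 1."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * theta ** (alpha - 1) * (1 - theta) ** (beta - 1)

def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

def uniform_prior(theta):
    return beta_pdf(theta, 1, 1)

# The density is 1 at every theta: a statement about density, not probability.
densities = [uniform_prior(t) for t in (0.1, 0.3, 0.6, 0.9)]

# Probabilities come from integrating the density over an interval.
total = integrate(uniform_prior, 0.0, 1.0)      # whole range of theta
near_03 = integrate(uniform_prior, 0.29, 0.31)  # theta within 0.01 of 0.3

# A uniform density on [-1, 1] (e.g. a correlation coefficient) is 1/2.
total_wide = integrate(lambda r: 0.5, -1.0, 1.0)

print(densities)
print(round(total, 6), round(near_03, 6), round(total_wide, 6))
```

The probability that theta falls within 0.01 of 0.3 comes out near 0.02, not 100%, even though the density there is 1. Dropping the endpoints 0 and 1 from the range would not change any of these integrals.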
A good overview for "dummies" would be nice.
I read this article and had a little trouble deciphering exactly what Bayesian probability is. This article seems like a good body of knowledge for people who already know about this topic. I found a good overview on the web at http://www.bayesian.org/openpage.html (Overview Section) and then it follows up with http://www.bayesian.org/bayesexp/bayesexp.htm . This type of simplified explanation is exactly what I think is missing from this article. —The preceding unsigned comment was added by A67676767 (talk • contribs) 18 December 2005.
- I agree. I'm doing an assignment about errors in information systems and I need a paragraph summing up Bayesian probability. This article was useless to me.--195.194.178.251 16:07, 6 March 2007 (UTC)
Technical tag added, this is something only an expert can understand SeanWDP 20:27, 24 March 2007 (UTC)
Bogus Portrait?
Note that the portrait of Thomas Bayes in the article is believed by most authorities not to be of Thomas Bayes, although the portrait is widely used. Don't know what to do about this. At the very least, the fact that the authenticity of the portrait is disputed should be noted. Bill Jefferys 23:28, 30 January 2006 (UTC)
It might be appropriate to assign a Bayesian probability value to the assertion that this is his portrait. —Preceding unsigned comment added by 81.153.168.64 (talk) 16:04, 24 September 2007 (UTC)
Bayesian and frequentist probability
I find this comment somewhat unclear: The Bayesian approach is in contrast to the concept of frequency probability where probability is held to be derived from observed or imagined frequency distributions or proportions of populations. If you allow someone to invent some imagined frequency distribution, then how can you rule out them imagining a multiverse, each member of which has a Saturn of different mass (distributed according to some prior)? Jdannan 13:19, 2 February 2006 (UTC)
- I have changed "imagined" distributions to "predicted" frequency distributions in the article; but I don't know whether that helps.
- I've changed it to "defined". It seems to me that a very clear division can be made over whether the prior is explicitly (implicitly) provided as part of the problem, or must be chosen by the analyst. If someone asks me for the probability of rainfall on a random February day, it would be natural to use an (essentially frequentist) analysis of historical data. If someone asks me for the probability of rain tomorrow, there is no obvious prior and I have to make some judgements. Jdannan 00:16, 4 February 2006 (UTC)
- Such imagined distributions, or statistical ensembles as they are called, are widely used by both Bayesians and frequentists to mentally visualise probabilities. The basic issue is that frequentists deny that you can meaningfully form a prior for the mass of Saturn in any objective way; on the other hand, if you know the (observable) pattern of random errors introduced by your instruments, you can form an objective law of probability for the results of your observations for given fixed unknown parameters. And this set of probabilities would be falsifiable: it would reflect the predicted spread of repeated observations, which you could check by performing the experiment again and again. On the other hand there is only one Saturn, so (from a frequentist's point of view), any probability distribution for it is unobjective, unfalsifiable, and therefore meaningless. For the frequentist it is only probabilities which correspond to predicted frequencies of observations that you could in principle perform that can have any meaning. -- Jheald 14:32, 2 February 2006 (UTC).
FOUNDATIONS
As is known, the idea of a probabilistic logic of inductive inference based on some form of the Principle of Indifference has always retained a powerful appeal. Keynes recommended a modified version of the principle in order to achieve this aim. Carnap followed Keynes in this attempt to create a purely logical theory of induction. However, up to now all modifications of the principle have failed. A modified version of the Principle of Indifference may be provided without generating paradoxes and inconsistencies. Besides, a general criterion of assignment of prior probabilities in case of initial ignorance may be suggested, thus providing a reply to the objections to so-called objective Bayesianism. The new (or modified) principle of indifference prescribes a uniform distribution over a partition of hypotheses where there is no reason to believe one more likely to be true than any other, in the sense of both irrelevance of prior information and impartiality of design d (or method of inquiry). In order to ensure the impartiality of d with respect to the parameter theta, it is sufficient that, for all possible likelihood functions that can be obtained from d, the respective maxima of likelihood remain constant with respect to theta itself. In other words, the prior is uniform if the maximum of each curve is constant (that is, each maximum lies at the same level as every other). Besides, we can assume the prior for a parameter to be proportional to the corresponding maximum value of likelihood for all possible likelihood functions obtainable from the projected design. cf. de Cristofaro, Foundations of the Objective Bayesian Inference. This paper was presented to the First Symposium on Philosophy, History and Methodology of ERROR held at Virginia Tech in Blacksburg (Virginia) on 1-5 June 2006. Available on the Internet. —Preceding unsigned comment added by 150.217.32.23 (talk) 20 June 2006, possibly Rodolfo de Cristofaro (signed that way elsewhere).
Credence: is it a synonym for subjective probability
The classic statement of the Sleeping Beauty problem employs the term credence, apparently as a synonym for subjective probability. I have been unable to find convincing examples of the word in this sense on the internet. The Oxford English Dictionary of 1933 offers 8 senses, of which only the first means belief, and it does not have any shade of meaning of degree of belief. Do other contributors regard credence as a technical term, or was the formulation of the problem unfortunate? PJTraill 00:50, 6 November 2006 (UTC)
- For some reason this problem (Sleeping Beauty) is usually stated using the word "credence." Maybe Adam Elga (the inventor) wanted to emphasize the subjective nature of his question in a theory-neutral manner—thus avoiding theory-loaded expressions like "subjective probability" and "Bayesianism"? INic 04:03, 6 November 2006 (UTC)
subject matter
The article doesn't clearly state what "Bayesian probability" is. It should probably be contrasted to frequency-based probability. This is not done anywhere, not even in the section claiming to do it in its name. --MarSch 12:49, 18 February 2007 (UTC)
Fallacious Article
Wow, this wikipedia article is completely biased against bayesianism and uses a straw man fallacy to dismiss it.
Let me go over it point by point.
“Laplace, however, didn't consider this general theorem to be important for probability theory. He instead adhered to the classical definition of probability.”
First of all, if you read Laplace’s book you will see that to him bayesianism was just an obvious consequence of the classical definition. He doesn’t explicitly argue in favour of bayesianism because there were no alternative theories at the time. However, he states the classical definition and then goes on to solve most of the problems in his book using Bayesian methods.
“On the Bayesian interpretation, the theorems of probability relate to the rationality of partial belief in the way that the theorems of logic are traditionally seen to relate to the rationality of full belief. Degrees of belief should not be regarded as extensions of the truth-values (true and false) but rather as extensions of the attitudes of belief and disbelief. “
What is this?? E. T. Jaynes, possibly the most famous Bayesian that exists, based the whole of his theory on probability theory being an extension of logic (true and false). This is what the Cox theorem (mentioned later) proves!
“Truth and falsity are metaphysical notions, while belief and disbelief are epistemic (or doxastic) ones.”
I’m pretty sure that Jaynes would consider this statement to be an example of the Mind Projection Fallacy. There is no reason that truth, or at least the practical interpretation of truth, should be metaphysical instead of epistemic. Jaynes is relentless about this in his book. He goes to great lengths to show that attributing our thoughts to the universe instead of to our own human reasoning faculty is fallacious.
“Bayesian probability is supposed to measure the degree of belief an individual has in an uncertain proposition, and is in that respect subjective. Some people who call themselves Bayesians do not accept this subjectivity. The chief exponents of this objectivist school were Edwin Thompson Jaynes and Harold Jeffreys. Perhaps the main objectivist Bayesian now living is James Berger of Duke University. Jose Bernardo and others accept some degree of subjectivity but believe a need exists for "reference priors" in many practical situations.”
Notice that there is no reference to proponents of subjective Bayesianism here. Although subjective Bayesianism is something that exists, it probably belongs more to the realm of cognitive science, psychology or human estimation. In the mathematical and scientific sphere pretty much NO ONE sees it as a viable theory that could replace frequentism. This is at the heart of the fallacy of this whole Wikipedia article. It builds a straw man by portraying the Bayesianism of the mathematicians and scientists as the subjective kind, the kind they would never use, and then goes on to pit it against frequentism. Bayesians only ever push the objective version of Bayesianism, which is IMO, from what I have read, much better than frequentism.
“The difference between Bayesian and Frequentist interpretations of probability has important consequences in statistical practice. For example, when comparing two hypotheses using the same data, the theory of hypothesis tests, which is based on the frequency interpretation of probability, allows the rejection or non-rejection of one model/hypothesis (the 'null' hypothesis) based on the probability of mistakenly inferring that the data support the other model/hypothesis more.”
Here someone should add that Bayesians believe null hypothesis testing to be one of the most notoriously pathological methods of frequentism, a problem that is solved easily with Bayesianism (by using credible intervals, which need a prior to be calculated). Bayesians believe that it doesn’t make sense to do null hypothesis testing because, given enough data, you can ALWAYS reject the null hypothesis. Unless the data agree _exactly_ with your null model (which pretty much never happens in the real world), with enough data points you’ll find significant results! Want significant results no matter what? No problem! Just include more individuals in your study! Small biases (maybe from your measuring instrument, maybe from some small indirect effect) that should have no significance in practice will make your null hypothesis test report significance! Bayesians find it weird that frequentists believe in tests that can be so easily manipulated.
“Controversially, Bayesian probability assignments are relative to a (possibly hypothetical) rational subject: it is therefore not inconsistent to be able to assign different probabilities to the same proposition by Bayesian methods based on the same observed data.”
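The sample-size point above can be illustrated with a small sketch. It is not taken from the article: it assumes a two-sided z-test of "mean = 0" with known standard deviation, and an idealised sample whose mean equals a hypothetical bias of exactly 0.01 standard deviations. The function name and the numbers are made up for the illustration.

```python
import math

def two_sided_p_value(bias, sigma, n):
    """Two-sided z-test p-value for H0: mean = 0 when the sample mean equals `bias`."""
    z = abs(bias) * math.sqrt(n) / sigma
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

# Hypothetical numbers: a bias of 0.01 standard deviations, negligible in practice.
p_values = {n: two_sided_p_value(0.01, 1.0, n) for n in (100, 10_000, 1_000_000)}
for n, p in p_values.items():
    print(f"n = {n:>9,d}   p-value = {p:.3g}")
```

The effect size never changes; only the number of observations does. The test statistic grows like the square root of n, so the same negligible bias goes from unremarkable to "highly significant" once the sample is large enough.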
But previously it was said that:
“Advocates of logical (or objective epistemic) probability, such as Harold Jeffreys, Rudolf Carnap, Richard Threlkeld Cox and Edwin Jaynes, hope to codify techniques whereby any two persons having the same information relevant to the truth of an uncertain proposition would calculate the same probability.”
See the contradiction? All the proponents of Bayesianism are telling us that, to be valid, we have to find the least biased, “maximum entropy” priors that would make any rational agent arrive at the same conclusion in all the different situations, whatever units or scales he uses. They further say that Bayesianism is the only technique that allows us to be this objective, and that frequentist techniques give different solutions depending on the assumptions, units and scales you give to your problems, while at the same time hiding some implicit a priori assumptions that could influence results.
Let me rephrase this, as I think it is at the core of the Bayesianism/frequentism debate. Bayesians do not think they use priors more than frequentists do. Frequentists’ priors are just hidden as implicit assumptions and, as a result, frequentists often make mistakes on their priors, as they are not stated in their problems. It is mathematically possible to reverse a frequentist problem, calculate the implicit prior, and find out whether it is a biased prior.
Bayesians spill all the beans from the beginning: they put their priors where everyone can see them and judge them to make sure they are completely unbiased given the scales and units used in the problem. Furthermore, Bayesians have developed the principle of maximum entropy, which they say is always the least biased. I must admit that implementing this can be a little complicated when you start integrating all kinds of units and dimensions in your problem. Note that for most simple problems frequentists and Bayesians arrive at the same solutions, since the assumptions are so obvious that frequentists got them right intuitively.
Now, I don’t feel quite qualified to make a good replacement article. I’m just a Bayesian enthusiast with no formal education in probability theory, and I get mixed up on some of the finer mathematical points and equations. However, this desperately needs to be done! --BenE 16:33, 12 May 2007 (UTC)
Mailing lists / forums for Bayesians?
Are there any mailing lists or forums where Bayesian topics are discussed? So far I have mainly found mailing lists for different software packages. ParkerJones2007 10:02, 5 June 2007 (UTC)
Too technical?
I removed the "too technical" tag from this page. It is the opposite of too technical. It almost needs a more mathematically technical treatment in at least one section. —The preceding unsigned comment was added by Jeremiahrounds (talk • contribs) 09:36, 22 June 2007.
User:Logicus has added a new theory to this article
Please see this diff, in which User:Logicus added what seems to be his personal view on Bayesian probability. Edit summary was 'Mentions apparent fatal problem for Bayesian philosophy of science'. He included some famous names in his newly-added text, but did not provide any references that share his view that it is a fatal problem. My sense is that this does not belong in the article unless sources are provided. Please join in the discussion if you have an opinion. Thanks, EdJohnston 19:21, 11 August 2007 (UTC)
- I think it's an old objection to Bayesian epistemology, but I'm pretty sure it was resolved a long time ago through variants of Occam's razor and measures of entropy; e.g. the Principle of maximum entropy. It's probably worth noting that the problem of having an infinite or indefinite number of possible hypotheses/beliefs/theories is an old one in philosophy; you can find it in Hume, for example, who points out that there are indefinite numbers of theories for a given set of facts - you can have a theory that the sun will always rise tomorrow, but on the basis of all the data we humans have right now, it is exactly as valid as the theory that the sun will rise tomorrow, except if it is January 1st 2037, in which case the universe will end. --Gwern (contribs) 19:29 11 August 2007 (GMT)
- User:Logicus has provided Howson and Urbach's book as a source, which helps clarify things. Gwern, if you have a source for your side of that issue perhaps you could locate it and add it to the article. I'd still like to see Logicus (or anyone) provide a footnote for the quoted line from Lakatos about the 'ocean of counterexamples.'
- Another response one could make to Gwern's comment is that any defence against Logicus's 'fatal problem' that depends on the Principle of maximum entropy is unlikely to sway large numbers of people, since that principle is controversial in its own right, and involves many subtleties. EdJohnston 15:26, 12 August 2007 (UTC)
- I'm afraid that it's just general knowledge on my part. But you don't have to depend on that specific principle - as I said, some formulation of Occam's razor will select the simplest explanation, and that gives you a unique theory. --Gwern (contribs) 16:20 12 August 2007 (GMT)
User Logicus' addition is original research and his own personal opinion. The addition of a reference to Howson & Urbach does not remove it from this category. It has no business in this article. Bill Jefferys 17:34, 12 August 2007 (UTC)
OK, let's start with the following, which I shall try to improve and complete asap, and rebut what I regard as false allegations of original research.
- PROPOSED TEXT: However, an apparently fatal problem for the theory that scientific belief is Bayesian or even probabilist is posed by the Hegelian radical fallibilist philosophy of science of such as Duhem(1), Lenin, Bradley, Bosanquet (2), Joachim, Popper(3), Lakatos(4), Feyerabend and many others that maintains all scientific laws are false and will come to be falsified and replaced by more general laws and so on ad infinitum. For it follows a priori from this philosophy that the prior probability that any scientific law is true must be zero, whereby its posterior probability must also be zero. So for those who believe in the falsificationist fallibilist model of science and reject the Aristotelian infallibilist doctrine of positivism that science accumulates final absolute truths that are never subsequently refuted and revised, a doctrine which probably most scientists do reject, probabilism must be inapplicable. Certainly physicist Duhem claimed all physicists believe in this endless replacement of scientific laws. And there are few who agree with Hawking that one day a final theory of everything will be attained. To date the pro probabilist philosophy of science literature has notably failed to deal adequately with this apparently fatal objection to the thesis that scientific reasoning is probabilist(5), nor with Lakatos's even more radical fallibilist thesis that 'all theories are born and die refuted, ever remaining immersed in an ocean of counterexamples.'(6), also Kuhn's view in effect. (7) This fundamental problem may possibly be solved by probabilism somehow, but to date has not been.
Footnotes to be improved and completed.
1) Lenin quoted Duhem's radical fallibilism approvingly in his 1908 Materialism & Empirio-Criticism , whose fallibilist anti-idealist dialectical materialist philosophy of science was to exert a major influence on 20th century philosophy and philosophy of science along with Hegelian fallibilism in general, as follows: "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law [in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn.]" [p.376-7 Materialism and Empirio-Criticism , Peking 1972 edition, quoting Duhem's 1905 Aim and Structure of Physical Theory as to be found on p177 of the Athaneum 1962 publication of its 1914 second edition. My italics, and inclusion in brackets of what Lenin omitted.] Lenin commented on Duhem's fallibilist falsificationist model of scientific development: "This would be a quite correct exposition of dialectical materialism if the author firmly held to the existence of this objective reality independent of humanity."
2) Russell mistook the core thesis of the dominant Hegelian fallibilism of the Edwardian Anglo-Hegelians such as Bradley, Bosanquet, Joachim and others, namely that absolute truth is not finitely expressible because it requires endless qualification, for scepticism. This is what motivated him to devise the logicist programme of Principia Mathematica by first demonstrating mathematics consists of finitely expressed absolute certain truths in order to refute it. This and Russell's associated anti Hegelian-fallibilist dogmatist philosophy of logical positivism launched the major philosophical methodenstreit of the 20th century, positivism versus dialectics, and its associated major institutional schism between such as so-called 'analytical' versus 'Continental' philosophy.
3) The locus classicus of Popper's radical fallibilism of endless conjectures and refutations with no ultimate terminus, which he dubbed 'modified essentialism' (i.e. modified Aristotelianism), is his 1957 The Aim of Science paper, published as Chapter 5 of his 1972 Objective Knowledge. Popper had co-translated Lenin's radical fallibilist M & E-C from Russian to German in 1919 when he was 17. And he also developed Duhem's anti-inductivist thesis that Kepler's laws were refuted by Newton's theory of gravity, rather than the latter being deduced from them.
4) For Lakatos's expression of it, see his 1960 Necessity, Kneale and Popper 'Philosophical Papers Volume 2', Cambridge University Press, 1978, especially pp122-3 and pp125-6.
5) In its discussion of this problem in the first two editions of the main textbook of Bayesian philosophy of science, Scientific Reasoning: The Bayesian Approach by Howson and Urbach, in a special section in its concluding chapter dealing with objections entitled 'The prior probability of universal hypotheses must be zero', it points out "Popper...asserts that the probability of a universal hypothesis must, for logical reasons, be zero. Were Popper correct, that would be the end of our enterprise in this book, for the truth of Popper's thesis would imply that we could never regard unrestricted universal laws as confirmed by observational or experimental data... " [p261 E1, p391-2 E2]. After presenting their rejection of Popper's thesis that the logical reason why the probability of all universal hypotheses must be zero is that it is a logical consequence of the probability calculus, on page 261/394 Howson then turns to the problem that it is a logical consequence of his falsificationist fallibilist philosophy of science with the following remark: "Nevertheless, Popper has called our attention to a fact which deserves some comment, namely that the history of science is a history of great explanatory theories eventually being refuted. In view of this, ought we not rationally to expect all theories to be eventually refuted?" And of course, after the historical evidence of several millennia of repeated refutations, on Bayesian reasoning the correct short answer is 'Yes!', which Howson then unsuccessfully tries to deny. The valid Bayesian conclusion here is that the probability of all universal hypotheses is most probably zero. But this whole critical discussion has very notably now been completely dropped from the 2006 third edition of the book, thus leaving the issue not adequately resolved.
6) See p5, [Science and Pseudoscience http://www.lse.ac.uk/collections/lakatos/scienceAndPseudoscienceTranscript.htm] in 'The methodology of scientific research programmes', Imre Lakatos (Eds Worrall & Currie) CUP, 1978, also p48 of that same volume.
7) In defence of confirmationist theories of scientific method that forbid there being any known counterexamples to a theory, which includes probabilism, John Watkins unsuccessfully attempted to rebut Lakatos's fatal thesis for the case of Kepler's laws, which Lakatos claimed Kepler knew to be refuted by Jupiter-Saturn perturbations by 1625 before the 1629 publication of the Rudolphine Tables. Under the heading 'Are all scientific theories born refuted?' he wrote: Kuhn wrote "There are always some discrepancies [in the fit between theory and nature]. (1962, p81) and Lakatos declared that all theories 'are born refuted and die refuted' (1978, i, p5 link). Taken at its face value this thesis, if correct would wreck our theory [of confirmation]..." [See p330-334 of Watkins' Science and Scepticism, Princeton University Press, 1984] [To be completed]
--Logicus 18:15, 13 August 2007 (UTC)
Updated 14 August --Logicus 18:09, 14 August 2007 (UTC)
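The step in the proposed text from a zero prior to a zero posterior is just Bayes' theorem, and a short sketch may make it concrete. The function names and the likelihood values (0.99 and 0.01) are hypothetical, chosen only for illustration; nothing here settles whether a zero prior is the right one to assign.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """P(H | E) by Bayes' theorem for a hypothesis H and a piece of evidence E."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    if evidence == 0.0:
        raise ValueError("the evidence has probability zero under both H and not-H")
    return prior * likelihood_if_true / evidence

def update_repeatedly(prior, likelihood_if_true, likelihood_if_false, rounds):
    """Apply the same favourable piece of evidence `rounds` times in a row."""
    p = prior
    for _ in range(rounds):
        p = posterior(p, likelihood_if_true, likelihood_if_false)
    return p

# Fifty confirmations, each 99 times likelier if H is true than if it is false.
print(update_repeatedly(0.0, 0.99, 0.01, 50))    # a prior of exactly zero never moves
print(update_repeatedly(1e-9, 0.99, 0.01, 50))   # a tiny nonzero prior is driven towards 1
```

The numerator of Bayes' theorem contains the prior as a factor, so a prior of exactly zero cannot be raised by any finite amount of evidence, whereas any nonzero prior, however small, can be.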
Sounds like original research to me. Where do Duhem, Lakatos, Popper, etc. specifically apply their notions to Bayesian probability theory?
As a physical scientist (astronomer) and a Bayesian, I find most of the ramblings of philosophers of science way off the point. They do not, in general, reflect how I and my colleagues think. Bill Jefferys 12:53, 14 August 2007 (UTC)
- I agree with Jefferys that the philosophers named above don't have much connection to Bayesian probability theory. Logicus's new material is not germane here, though it might hold some interest for publication outside Wikipedia. The article by Lakatos cited by Logicus does not mention Bayes, Laplace or probability. Our own article doesn't mention Popper, so it's not stepping into that world of issues at all. EdJohnston 13:13, 14 August 2007 (UTC)
The idea that all hypotheses are false is nothing new and is of no consequence to Bayesian theory. As the eminent Bayesian statistician, George E. P. Box famously remarked, "All models are wrong; but some models are useful." By this he meant that, of course, we don't expect that at any point in time a given model or theory of nature is the final, ultimate, exact one. But that doesn't mean that it isn't applicable to some degree in a certain limited region of applicability.
In real-world Bayesian application (as contrasted with the rarefied version that Logicus imagines) we don't think that the hypotheses under consideration are the be-all and end-all. Indeed, it's expected that they aren't ultimately correct. But, they are expected to be useful in the sense that they may provide useful predictions or extrapolations to regions where the data are unavailable or are yet to be observed. Indeed, it is well-known that if we are choosing between several models, none of which is the "true" model, then from a posterior predictive point of view, the model with the highest posterior probability will, asymptotically in the limit of large data sets, be the model that predicts new data the best.
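A toy sketch of the point about choosing between models that are all wrong is given below. Everything in it is an assumption made for illustration: the data are idealised as exactly 55% successes, the two candidate models are fixed coin biases of 0.5 and 0.7, and the models get equal prior weight. It is not a proof of the asymptotic result mentioned above.

```python
import math

def log_likelihood(p, successes, trials):
    """Bernoulli log-likelihood of `successes` in `trials` under success probability p."""
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)

def posterior_model_probs(model_ps, successes, trials):
    """Posterior probabilities of the candidate models under equal prior weights."""
    logs = [log_likelihood(p, successes, trials) for p in model_ps]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical setup: the data behave like a coin with success rate 0.55,
# but the only candidate models are p = 0.5 and p = 0.7, both of them wrong.
candidates = [0.5, 0.7]
for trials in (20, 200, 2000):
    successes = round(0.55 * trials)  # idealised data: exactly 55% successes
    probs = posterior_model_probs(candidates, successes, trials)
    print(trials, [f"{q:.4f}" for q in probs])
```

Neither candidate is the "true" model, yet as the data accumulate the posterior weight concentrates on p = 0.5, the candidate whose predictions are closer to what is actually observed.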
For sure, the ontologically "true" model will almost certainly not be amongst the ones we consider in any given analysis. This means that we also need to understand and consider how well the models under consideration fit the data actually observed; and when the fit of the best models is hopeless, that means that we need to invent/search for models that will do a better job. The Bayesian statistician Andrew Gelman (see "Posterior predictive assessment of model fitness via realized discrepancies (with discussion)," Statistica Sinica 6, 733--807. http://www.stat.columbia.edu/~gelman/research/published/A6n41.pdf) has argued this point of view persuasively. The key notion here is "model". A model is a mathematical framework for connecting observations or potential observations to unobserved parameters, in such a way that we can make predictions about what will be observed in the real world. No one thinks that a given model is the final, ontologically "true" model. But in no way does this point of view vitiate or undermine the notion that Bayesianism is a viable model of how scientists think and work.
Thus, I continue to object to Logicus' addition to this article. It is, until he presents citations to authorities (other than himself) who regard this as a "fatal flaw" in Bayesianism, merely his personal opinion and original research. Note that the Howson & Urbach reference doesn't qualify, since these authors clearly do not believe that a philosophical objection such as Logicus' is any kind of "fatal flaw." Logicus seems to think that scientists regard Bayesianism as a way of finding ontological truth. This is not the case. Bill Jefferys 00:26, 15 August 2007 (UTC)
Logicus to Bill
Dear Bill
Thank you very much indeed for this most intriguing exposition of your very own pragmatic idealist philosophy of science, allegedly also that of your many scientific colleagues, and thank you also for your previous most invaluable exceedingly forceful assertions. Indeed in respect of their main thrust, by way of humour I was reminded of the £5 argument in that most amusing Monty Python sketch 'The Argument Shop' i.e. 'mere assertion' in Michael Palin's customer complaint to Cleese.
However, with respect to your main point, which unfortunately echoes Johnston's unfortunate assertions of 11 August, that my proposed addition is Wiki-original research and not shown to be otherwise by the reference to Howson & Urbach 1993, to the very contrary that reference does indeed qualify as evidence that the central claim of my proposed text is not Wiki-original research. For that text does not categorically claim, as Johnston and you misrepresent it, that the thesis in question - that the widespread philosophy that all scientific laws are false implies they must all be assigned prior probability zero on Bayesian reasoning - actually is a fatal problem for Bayesian philosophy of science. Rather it only claims it is "an apparently fatal problem", by which was meant 'fatal if correct'. As it happens, whether this claim actually is fatal is unclear to me.
But the only Wikipedia-relevant issue here is rather whether it is recognised in the Bayesian philosophy of science literature that the doctrine that all scientific hypotheses are false is 'fatal if correct'. Whether or not people who do acknowledge this also deny or assert it actually is correct is irrelevant to this issue. In order to prove that the claim that this doctrine is apparently fatal is not Wiki-original research, in Wiki rules all that is required is to document that other significant writers in the literature also acknowledge it would be fatal if correct. And I clearly did that! And I did that purposely with reference to a book that was already listed in the article's References. And moreover I am told that book is no mere marginal minority view text, but rather happens to be a leading academic teaching text for Bayesian philosophy of science in English-language speaking academia.
So maybe I should repeat once again, especially for the hard of understanding, that book's specific phrase "that would be the end of our enterprise in this book" if the prior probability of universal hypotheses must be zero. (That book's patently failed enterprise was to demonstrate (by eliminative proof) that Bayesianism is the only theory capable of putting inductive inference on a sound foundation. But this conclusion was invalid because in fact its eliminative proof method entailed that no theory is, including Bayesianism. For the key eliminative principle of its eliminative proof method was that any inductive method that makes evidentially prior ontological suppositions about the world, such as the Principle of Indifference in frequentist theory for example, is thereby unsound. But of course the Bayesian inductive method is quintessentially also such a method, and therefore a fortiori unsound. On this basis it would seem that Popper's judgment that 'Bayesianism is in tatters', ridiculed by Lindley on the cover, in fact still stands.)
Moreover, I also provided a second reference from a book not in the Wikipedia listed References that is an example that Imre Lakatos's even more radical fallibilist thesis that theories are always refuted from birth to death is regarded as fatal to confirmation theory more generally, in its admission "Taken at its face value this thesis, if correct would wreck our theory [of confirmation]". [Watkins 1984]
These two examples surely document recognition in the literature of the potential fatality for Bayesian philosophy of science of believing all theories to be false, and even more so of believing them to be practically refuted at all times. QED (i.e. Logicus's proposed text is not Wiki-original research).
Some further points
Bill Jefferys wrote in his very first sentence on 15 August: "The idea that all hypotheses are false is nothing new and is of no consequence to Bayesian theory." Bill is quite right in the first conjunct of this sentence inasmuch as this doctrine dates at least from the ancient Greek Eleatics such as Parmenides, Zeno and Melissus, if not possibly from before in doctrines that it is impossible for mere mortals, unlike the gods, to state and know the truth and hubris to think they can. But whatever, it certainly dates from long before Bill and his doughty American colleagues adopted it. But Bill is arguably fatally wrong in his second conjunct insofar as the former idea is a fatal consequence for 'Bayesian theory' if that means 'Bayesian philosophy of science'. Indeed Bill has yet to explain how anybody can possibly believe a theory to be false, as he claims he and other Bayesians believe a priori all hypotheses are, and yet assign it a non-zero prior probability such as required to render Bayesian epistemology operational, without thereby being inconsistent. Thus the suspicion arises that Bill is doing original sociological research in Bayesian philosophy of science to support his hitherto unfounded objections to Logicus's claim which is not original research.
So Bill, in your philosophy of science it seems you agree all scientific hypotheses are false, along with such great and influential philosophers of science as Parmenides, Duhem, Lenin and Lakatos etc. So in that case, as well as believing all hypotheses are false, do you also assign prior probability zero to all hypotheses ? The essential question here is - If you definitely believe a hypothesis to be definitely false for whatever reason, what prior probability do you assign it ?
And as Pilate might have asked, what is 'ontological truth' as distinct from 'truth' ?
Forthcoming asap:
Models versus Laws and Box
Box and Models
On Box, Jefferys wrote: "The idea that all hypotheses are false is nothing new and is of no consequence to Bayesian theory. As the eminent Bayesian statistician, George E. P. Box famously remarked, 'All models are wrong; but some models are useful.'"
But in the first instance, contrary to Jefferys' claim that it is inconsequential, the belief that all hypotheses are false is absolutely fatal to Bayesian philosophy of science, because that belief means that all hypotheses must be assigned prior probability zero, as all propositions believed to be false must be according to subjective Bayesianism. However, did Box really believe all hypotheses are false ? For the quotation given does not show he did, but rather only that he believed all MODELS are false. But in the traditional philosophy of science distinction between scientific LAWS or HYPOTHESES on the one hand and MODELS on the other, the latter are only the postulated initial conditions or parameters of LAWS which are applied to them to deduce predictions. Thus, for example, in one early member of Newton's series of increasingly realistic false models of the planetary system in the Principia, the planets are point masses subject to a fixed centripetal force without any mutual gravitational attraction, and the inverse-square LAW of gravitational attraction was applied to this false MODEL to deduce the falsehood of elliptical orbits.
But is there any textual evidence that Box also believed all HYPOTHESES are false, as well as all MODELS ? It would be most interesting if he did, because then the question arises of what prior probability he assigned to hypotheses he believed to be false, and thus whether he was really a subjective Bayesian or not, or at least an inconsistent one. Did Box assign probabilities to propositions according to strength of belief in their TRUTH, as in epistemic probabilism, or rather according to strength of belief in their USEFULNESS, as in pragmatic probabilism ?
The ultimate philosophical significance of this problem is that as their core thesis probabilist philosophers of science maintain the axioms of the probability calculus impose logical consistency constraints on scientific reasoning in their epistemic interpretation. Howson & Urbach open their textbook Scientific Reasoning with "According to the Bayesian view, scientific and indeed much of everyday reasoning is conducted in probabilistic terms", and conclude that the axioms of the probability calculus constitute consistency constraints: "the Bayesian theory is a theory of consistent probabilistic reasoning" [p301 Ed3] in which "the consistency constraints represented by the probability axioms are both stringent and very objective, as stringent and objective as those of deductive logic." [p297 Ed3]
But this fallacy is just an aspect of the larger fallacy of the Edwardian philosophy of logical positivism, of which Bayesian probabilism is just the latest fashion in its intellectual and moral degeneration, that scientific reasoning and belief is logical and not inconsistent. But just as scientists adopt and develop all sorts of inconsistent theoretical systems, so the claim that they consistently obey the probability calculus is also obviously hogwash if scientists are flagrantly inconsistent by virtue of assigning non-zero prior probabilities to hypotheses they firmly believe to be false as a basic credo of their fallibilist philosophy of science, a credo to which Jefferys himself testifies.--Logicus 18:14, 4 September 2007 (UTC)
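The formal point behind the dispute over zero priors can be illustrated with a short sketch. It is not drawn from any source cited on this page; the function names and the likelihood values are hypothetical, chosen only for illustration. It shows the standard property of Bayes' Theorem that a hypothesis given prior probability exactly zero stays at zero however favourable the evidence, whereas any non-zero prior, however small, can be raised by it.

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """Apply Bayes' Theorem once for a hypothesis h and its negation.

    prior            -- P(h), the probability of h before the evidence
    likelihood_h     -- P(e | h)
    likelihood_not_h -- P(e | not h)
    Returns P(h | e).
    """
    # Total probability of the evidence e.
    evidence = likelihood_h * prior + likelihood_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(e) is zero, so conditioning on e is undefined")
    return likelihood_h * prior / evidence


def update_repeatedly(prior, likelihood_h, likelihood_not_h, n):
    """Condition n times on independent evidence with the same likelihoods."""
    p = prior
    for _ in range(n):
        p = posterior(p, likelihood_h, likelihood_not_h)
    return p


if __name__ == "__main__":
    # Illustrative numbers only: evidence 99 times likelier under h than under not-h.
    print(update_repeatedly(0.0, 0.99, 0.01, 50))    # a zero prior never moves
    print(update_repeatedly(0.001, 0.99, 0.01, 50))  # a small non-zero prior does
```

The sketch takes no side on whether someone who believes a hypothesis false must assign it prior zero; it only shows what follows within the calculus if they do.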
Jefferys' Pragmatic idealism
--Logicus 14:56, 18 August 2007 (UTC)
- I see that having lost the discussion, Logicus has resorted to sarcasm. But sarcasm doesn't change the fact that his argument remains original research.
- I put prior probabilities on models proportional to my prior belief that each is likely to be useful in a posterior predictive sense. All I care about is whether a model will accurately predict observations that have not yet been made or which have been made but which I haven't seen yet. So, for example, in fitting orbits of planets, I will use a Newtonian model, perhaps with PPN corrections if warranted, even though I know that such models are not correct (and even if I went to full general relativity it would be incorrect, since GR is not a quantum theory it cannot be correct). But it doesn't matter...the theory I get will accurately get a space mission to Mars, or predict where an asteroid will be at a later time so that I can deflect it from a collision with Earth.
- In other words, the theories I want to use are ones that are very good in their domain, and the results I get using Bayesian methods are very good results.
- As for great and influential philosophers of science, my opinion is summarized in the old anecdote about the dean of a university who was approached by the head of the physics department, who was seeking funds for an expensive piece of equipment. The dean complained, "You physicists cost so much money! Why can't you be like the mathematicians? All they need is a pencil, paper, and a wastebasket. Or the philosophers? All they need is a pencil and paper." Bill Jefferys 17:04, 18 August 2007 (UTC)
Logicus to Bill: Thank you for your comments. However I have not lost any discussion so far as I am aware. And nor did I intend any sarcasm anywhere (what do you refer to ?), but rather only good humour: my apologies if misinterpreted, but the ‘mere assertion’ humorous point I made referred to your pre 15 August comments, which you would surely agree were no more than authoritarian assertions without evidence. But now it still does rather appear you think one proves something merely by asserting it. Thus you keep asserting my claim is original research, on the one hand in spite of the evidence I have provided to the contrary, and on the other hand you never offer any proof that it is original research. Please do explain why you think it is, given I have demonstrated other people in the literature who think it would be fatal if correct, which is surely all that is required here ?
As for your other points reporting your scientific practices and beliefs, whilst most interesting and useful (and I really mean that and am not being sarcastic in case you think I am), unfortunately they are logically quite irrelevant to the main issue of demonstrating your Wiki-original research thesis. But they do raise other interesting issues and it is perhaps a pity you did not wait until I had time to supply the forthcoming topics announced: ‘Models versus Laws and Box’ and ‘Jefferys’ Pragmatic idealism.’ Perhaps more enlightenment will follow when I do ?
Here I just say I get the impression you have now gone way off-beam. Nobody is denying Bayesian probability may be very useful in science, and it is the philosophy of probability I tend to believe in. But it is not the Bayesian philosophy of probability that is at issue here, but rather the Bayesian PHILOSOPHY OF SCIENCE that is at issue, in the article’s claim: “Some regard Bayesian inference as an application of the scientific method …”, and what has been recognised as a potentially fatal objection to this thesis. The question is whether Bayesianism can provide a general theory of scientific method, including scientific change. More later...
Best wishes --Logicus 19:03, 18 August 2007 (UTC)
- OK, I will wait for your comments. I will be interested in what you say.
- My objection (that the edit is original research) is that you still haven't provided a single citation to anyone who claims that the point you make is a "fatal objection to the thesis" that "some regard Bayesian inference as an application of the scientific method." Obviously, many scientists would disagree with such a dogmatic assertion. You have provided a citation to Howson and Urbach's earlier edition in which they dismiss this notion. If you have a cite, cite it, chapter and verse. And in any case, your original wording (that it is a fatal objection) is obviously your personal opinion and has no place in the article. A more circumspect comment, e.g., "Furd & Froomkin (cite) say blah blah blah, but Howson & Urbach (cite) say blah blah blah" might be appropriate.
- Supposing you can find an appropriate cite, then at most your comments would deserve a short paragraph. Your original edits responded to a very short comment about what some people think with a huge detailed refutation. This is an encyclopedia, not a dissertation, and the amount to be added should, if we decide that it is deserving of inclusion, be commensurate with the rest of the article. You might ask yourself whether the size and dogmatic attitude of your original edit is not ultimately the source of the objections that it occasioned, and whether anyone would have objected had your original edit been more modest both in length and language. (It would still require appropriate citations, though.) Bill Jefferys 19:39, 18 August 2007 (UTC)
Logicus to Bill
[Thanks for your latest comments just above. But if you care to read my text yet again but maybe with reading glasses on you should see that you are again quite wrong in claiming I said it definitely was a fatal objection. Rather I only ever claimed it is 'an apparently fatal objection.', which it is. I shall explain why your comments are largely mistaken later. But re your second paragraph, see my own self-criticism in agreement at the end below.]
Dear Bill
Thanks for the old anti-philosophers joke. Of course the philosophers' joke on the joke is that the dean was herself a philosopher and correctly believed philosophers are infallible, so don't need wastebaskets. But to business.
Here let me try and explain to you why your philosophy of probability as expounded in your comments is not the Bayesian theory as understood in the literature on the philosophy of probability and correspondingly that on Bayesian philosophy of science, at least from what I know of it. It is rather some kind of pragmatist theory.
Although not well explained in the current article, the philosophy of probability concerns the interpretation of the probability function or predicate 'P' in such statements as 'P(h)', meaning 'the probability of 'h' '. (Here I am dealing with propositional-probability where 'h' is a proposition, rather than event-probability where 'h' is an event, since Bayesian probability is a species of propositional probability.) According to the Bayesian interpretation of 'probability', 'P(h)' means 'the degree of belief that proposition 'h' is TRUE'. And accordingly logically true tautologies, which are logically true, are assigned probability 1, contradictions, which are logically false, probability 0.
However what you seem to subscribe to is a quite different non-Bayesian pragmatist philosophy of probability, with a quite different interpretation of the term 'probability' in which P(h) means 'the degree of belief that proposition 'h' is USEFUL'. For you say "I put prior probabilities on models proportional to my prior belief that each is likely to be useful...".
And useful for what ? On that you seem to be saying 'useful for predicting novel observations': "I put prior probabilities on models proportional to my prior belief that each is likely to be useful in a posterior predictive sense. All I care about is whether a model will accurately predict observations that have not yet been made or which have been made but which I haven't seen yet." (I ignore the peculiar second disjunct of the last sentence.)
Now presumably a radical difference between this non-Bayesian theory of probability and the Bayesian theory is most dramatically illustrated by the fact that a tautology, e.g. The Moon is the Moon, is generally agreed to be utterly informatively useless, conveying zero information about the world. Thus I presume your pragmatist theory of probability would/must assign it probability 0, that is, 'the degree of belief that it is likely to be USEFUL', to express the belief with certainty that it is likely to be USELESS. (Or do your astronomer colleagues find such statements useful for predicting novel facts ?)
But Bayesian probability theory assigns a tautology probability 1, the degree of belief that a tautology is TRUE being that of absolute certainty. Thus your pragmatist probability theory is clearly not a Bayesian theory of probability. QED.
I do hope this helps you understand why you are not endorsing a Bayesian theory of probability as traditionally understood (albeit you obviously employ Bayes' Theorem (or rather Laplace's) as probabilists do), but rather advocating some kind of pragmatist theory. You are thereby apparently doing Wiki-original research and illegitimately trying to block my proposed contribution to the article because in your view it conflicts with your illegitimate 'original research', that is, your non-Bayesian pragmatist theory of probability which should have no place in an article on Bayesian probability, nor in influencing that article's one paragraph on Bayesian philosophy of science as an application of Bayesian probability. In short, you are apparently completely out of order.
However, I must admit my original addition became much too long, especially with the footnotes added to answer criticism from Johnston of 11 August and further objections. Also in the light of the invalidity of all the objections of Johnston, Gwern and yourself, I am persuaded to reformulate my proposed addition to try and make it clearer for the underinformed and logically confused, and much more briefly. And so I withdraw the proposal of the current drafted text, except for the purpose of Talk page discussion of issues. It even occurs to me that maybe you three are possibly not even aware that the second paragraph of the 'Applications' section of the article is stepping into philosophy of science, or that Bayesianism is about the degree of belief in THE TRUTH of a proposition. --Logicus 14:48, 19 August 2007 (UTC)
Proposed edit: Improving the first sentence ?
I propose the current first sentence of the article
"Bayesian probability is an interpretation of probability suggested by Bayesian theory, which holds that the concept of probability can be defined as the degree to which a person believes a proposition"
be replaced by the following sentence:
'Bayesian probability is an interpretation of the probability calculus which holds that the concept 'probability' should be defined as the degree to which a person (or community) believes a proposition is true.'
Reasons:
- The term 'probability' raises the mystery of what it is, but the addition of 'calculus' at least gives a clue inasmuch as it at least gives a reference to an axiomatic mathematical theory, if not to any particular axiomatisation.
- The phrase "suggested by Bayesian theory" is redundant here and only introduces a complicating further unexplained term. It can be introduced and explained subsequently if needs be. (If 'Bayesian theory' means the rules of Bayesian methodology for evaluating hypotheses in science, such as 'First, assign an evidentially prior probability to your hypothesis', then obviously these should be listed.)
- And the omission of "is true" after "proposition" raises the crucial question in this context of just what it is the person believes about a proposition, whether they believe it is short, long, in French, false or whatever? In sentential Bayesianism it is the probability of their belief that it is TRUE, and thus 'P(h)' means 'the degree of belief that 'h' is true', where 'h' is some factual proposition.
- For communal belief, see Ch 8 of Gillies 2000
--Logicus 14:47, 18 August 2007 (UTC)
- Getting the first sentence right is probably an unsolvable problem, since there are so many flavors of Bayesians, but I don't think your new version is better than what we have now. If you could find a published definition of Bayesian probability that you think is good, perhaps we should take a look at it. Such a published definition could then be attributed to its author. EdJohnston 04:04, 19 August 2007 (UTC)
Logicus to Johnston
Dear Ed Johnston
Thank you very much indeed for your wholly mistaken criticisms of my proposed edit. I also note it is your misrepresentation of my text as dogmatically claiming the problem is fatal rather than apparently fatal that started a wild goose chase.
"Getting the first sentence right is probably an unsolvable problem.."
Getting the first sentence right may well be an insoluble problem. But getting it better is not. And my proposed replacement is arguably significantly better than what "we" have now, as suggested by the fact that you notably fail to rebut any of the stated reasons why it is better. It is primarily better because it at least makes clear what the article currently does not, namely that it is the TRUTH of propositions Bayesian probability is concerned with, not their usefulness as in Jefferys' non-Bayesian 'Pragmatist probability', for example.
"since there are so many flavors of Bayesians,"
There may well be many flavours of Bayesianism. But I was unaware there are any Bayesian philosophers of probability who dissent from the view that 'probability' means 'the probability that a proposition is TRUE'. Can you name any who do, apart from Jefferys who calls himself a Bayesian, but apparently believes 'probability' means 'the probability that a proposition is USEFUL' rather than TRUE? Certainly the current article does not mention any such dissenters in its 'Variety' section, nor anywhere else.
"but I don't think your new version is better than what we have now "
But contrary to YOUR opinion, I do think MY new version is better than what "we" have now (Is this the ROYAL "we", or just the view of a Wikipedia employee ?). At least for the good education of those many poor people who cannot afford good Encyclopedias, but have access to the web, should not my will be editorially decisive over yours in this matter ?
"If you could find a published definition of Bayesian probability that you think is good, perhaps we should take a look at it."
If you can find any published definition of Bayesian probability that supports your view of it as being 'diverse and many flavoured' with respect to differing on the crucial issue here of what 'probability' might possibly mean other than 'the degree of belief that a proposition is true', then indeed perhaps 'we' should take a look at it. And I would certainly be most interested if you could find such. But otherwise, in the event of your failing to do so, perhaps you would be kind enough to review and revise your understanding of 'Bayesian probability' ?
"Such a published definition could then be attributed to its author. "
Attributing some published definition of Bayesian probability to its author would be logically irrelevant to determining what the Bayesian view of probability is unless you first define what you mean by 'the Bayesian view of probability'. But for whatever it may be worth, on page 15 of the 3rd edition of their Scientific Reasoning (which advocates a Bayesian subjectivist philosophy of science), Howson & Urbach tell us that in epistemic probability in general "the meaning of P(a)...indicates something like the degree to which it is felt some assumed body of background knowledge renders the truth of a more or less likely..." or in other words to cut the filigree and filibuster: 'P(a) means the degree of likeliness proposition 'a' is true', which in the case of subjectivist Bayesian epistemic probability becomes 'the degree of belief that 'a' is true'. The important point here is that it is the truth of 'a' that is at issue, and not its usefulness for some purpose, for example. In the first two editions of Scientific Reasoning the meaning of 'P(h)' was not explicitly defined for either epistemic probability in general nor for Bayesian subjective probability in particular, but it was clearly 'degree of belief that 'h' is true', as evidenced by such statements in its Edition 2 as:
"In what follows [i.e. betting quotients and degrees of belief] we shall talk of the truth and falsity of hypotheses rather than the occurrence and non-occurrence of events." p75 and "For the odds you take to be fair on h will clearly reflect the extent to which you believe it is likely that h will turn out to be true." p76
In conclusion, I submit my proposed edit should be adopted. --Logicus 18:14, 20 August 2007 (UTC)
- I'm not an expert in this field. I am surprised by the great length of your response, which has overtones of WP:SOAP. If your proposal managed to win the approval of User:Billjefferys who has published work in Bayesian statistics, I might be more inclined to study your idea in detail. I submit that this page should be more attuned to satisfying the statisticians than the philosophers. Arguments from credentials are frowned upon, I know, but I would need more motivation to study a position presented at such great length as above. Do you think you could summarize the main point in a couple of sentences? EdJohnston 18:27, 20 August 2007 (UTC)
Logicus to EdJohnston:
The main point of these remarks in just one sentence is that you have no valid objection to my proposed edit of the first sentence, so it should be implemented. Note my previously proposed edit has been provisionally withdrawn.
For your information this article is not about Bayesian statistics, and so publications in Bayesian statistics are not relevant qualifications for advising on editing an article about the philosophy of probability, whereby approval by User Bill Jefferys is not relevant. And I note your other adviser Gwern was so off-beam on my previous proposed edit that they mistook the problem of all hypotheses being false for the grue type problem of induction of infinitely many hypotheses being compatible with the same data. In my view you choose wholly inappropriate advisers in a field about which, very far from being an expert as you admit, you apparently know little nor have any interest. Why do you intervene in it then ? Are you a formal employee of Wikipedia? And with some kind of formal powers over editors like myself ? Why should I pay any attention to what you think ? Please make Wikipedia structure transparent.
This article wrongly classified as Bayesian statistics
Johnston says:“I submit that this page should be more attuned to satisfying the statisticians than the philosophers.”
But the categorisation of the current article under 'Bayesian statistics' and as for statisticians is very clearly profoundly mistaken since on the one hand it tells us nothing whatever about Bayesian statistics (e.g. there is not even any statement nor explanation of such basics as Bayes'/Laplace's Theorems), and on the other hand from the very outset it is very clearly explicitly concerned with the philosophy of probability in its central concern with the interpretation of the fundamental concept 'probability', as its very first sentence clearly states. Enquiries and expositions on the meanings and interpretations of the most fundamental concepts of a subject are traditionally regarded as being the province of the philosophy of the subject, conceptual analysis traditionally being regarded as the meat of philosophy. For example, the philosophy of history is traditionally concerned with the meaning and interpretations of the concept 'history'. And this article is very clearly concerned with expounding the Bayesian philosophy of probability, that is, the meaning it gives to the concept 'probability', and with contrasting it with other philosophies of probability. I suggest its proper categorisation should therefore be under 'Philosophy of probability' and 'Bayesian philosophy'.
In further support of this re-classification, note the article's current references to books in the philosophy of science by professional academic philosophers of science, such as Howson & Urbach and Bovens & Hartmann, all four academic philosophers of science of the London School of Economics Department of Philosophy, Logic and Scientific Method, and indeed also now Gillies who is also a London University philosopher. But qua philosophers, of course the learned Professor Jefferys has nothing but contempt for such people on the evidence of his personal railings against philosophers of 14 August and his joke that philosophers are not self-critical and regard their work as infallible, whose expressions of his American pragmatist 'red-neck' attitudes are surely a serious breach of Wikipedia etiquette.
Further evidence in the article's references that the subject matter of the article is that of the traditional academic subject 'Logic' or 'Logic and Scientific Method' and so 'Philosophy of science' is to be found in the very titles of books such as Jaynes's 'Probability Theory: The Logic of Science', and also of various books called 'The Foundations of...' by Kolmogorov, Ramsey and Savage, denoting their fundamental philosophical provenance inasmuch as 'foundations' are traditionally the subject of philosophy, 'The Queen of all the Sciences' and their spawning ground. --Logicus 18:11, 22 August 2007 (UTC)
Proposed edits
These are drafts to be tidied up and revised as appropriate
- Edit first sentence as proposed above by Logicus on 18 August
- The following paragraph in 'History'
"Frank P. Ramsey in The Foundations of Mathematics (1931) first proposed using subjective belief as a way of interpreting probability. Ramsey saw this interpretation as a complement to the frequency interpretation of probability, which was more established and accepted at the time. The statistician Bruno de Finetti in 1937 adopted Ramsey's view as an alternative to the frequency interpretation of probability. L. J. Savage expanded the idea in The Foundations of Statistics (1954)."
to be replaced by
'The subjective theory of probability which interprets 'probability' as 'subjective degree of belief in a proposition' was discovered independently and at about the same time by Bruno de Finetti in Italy in 'Fondamenti Logici del Ragionamento Probabilistico' (1930) and Frank Ramsey in Cambridge in 'The Foundations of Mathematics' (1931).[See p50-1, Gillies 2000] It was devised to solve the problems of the classical/frequency/logical theory of probability and replace it. L.J. Savage expanded the idea in The Foundations of Statistics (1954).'["The subjective theory of probability was discovered independently and at about the same time by Frank Ramsey in Cambridge and Bruno de Finetti in Italy." p50 Gillies 2000]
What is the evidence that Ramsey saw his theory as a complement to frequentism rather than, like de Finetti, its replacement ?
- The following paragraph in 'History'
“Formal attempts have been made to define and apply the intuitive notion of a "degree of belief". The most common interpretation is based on betting: a degree of belief is reflected in the odds and stakes that the subject is willing to bet on the proposition at hand.”
to be followed by
'However, a fundamental problem is posed for the betting definition of 'degree of belief' by the fact that the only fair betting quotient on a universal hypothesis is zero, since a bet on its truth can never be won.' [Gillies 'Induction and Probability' Parkinson (ed) An Encyclopedia of Philosophy 1988; p263-4, Howson & Urbach 1989]
REVISED DRAFT 22 August: 'However, the probability of infinitistic universal hypotheses poses a problem for the betting definition of 'degree of belief' because the only fair betting quotient on an infinitistic universal hypothesis is zero, since a bet on its truth can never be won because its truth can never be decided. This becomes a fundamental problem for Bayesian philosophy of science that seeks to reduce scientific method to gambling, since scientific hypotheses are typically infinitistic universal generalisations.' [Gillies 'Induction and Probability' Parkinson (ed) An Encyclopedia of Philosophy 1988; p263-4, Howson & Urbach 1989] And notably by 1981 De Finetti himself came to reject the betting conception of probability: "...betting strictly speaking does not pertain to probability but to the Theory of Games" ["The role of 'Dutch Books' and 'Proper Scoring Rules' " in British Journal for the Philosophy of Science 32 1981 55-6.] ' --Logicus 17:34, 22 August 2007 (UTC)
- In 'Varieties' the two sentences:
“Degrees of belief should not be regarded as extensions of the truth-values (true and false) but rather as extensions of the attitudes of belief and disbelief. Truth and falsity are metaphysical notions, while belief and disbelief are epistemic (or doxastic) ones.”
should all be deleted as unintelligible mumbo-jumbo or possibly seriously misleading in creating the false impression that subjective Bayesianism has nothing whatever to do with truth or falsity as all epistemic probability does, whereas in fact it interprets 'probability' as 'the degree of belief in the truth of a proposition'. These two sentences add nothing but potential confusion to the exposition.
- In the 'Varieties' section, in the sentence
“Bayesian probability is supposed to measure the degree of belief an individual has in an uncertain proposition, and is in that respect subjective.”
replace "an uncertain" by 'a', since Bayesian probability also measures belief in certain propositions, such as tautologies to which it may assign probability 1, as well as measuring belief in uncertain propositions.
- The current para 2 of 'Applications' is unsourced and I know of nobody who makes this peculiar claim, although there may be such. But in fact it is apparently logically confused inasmuch as the 'Applications' section in general is apparently supposed to be about the applications of Bayesian probability to various subjects, and not about the applications of scientific method which properly belong to an article on scientific method. And so presumably it should rather say 'Some regard the scientific method as an application of Bayesian inference...' ? And if so, then the H & U book is an appropriate source/reference, with its page 1 opening sentence "According to the Bayesian view, scientific reasoning is conducted in probabilistic terms."
I propose the following replacement for this paragraph:
'Some regard the scientific method as an application of Bayesian probabilist inference because Bayes's Theorem is used to update the strength of prior scientific beliefs in the truth of hypotheses in the light of new information from observations or experiments. This is said to be done by the use of Bayes's Theorem to calculate a posterior probability using that evidence. Adjusting.....'
REVISED DRAFT 22 August: 'Some regard the scientific method as an application of Bayesian probabilist inference because Bayes's Theorem is used to update the strength of prior scientific beliefs in the truth of hypotheses in the light of new information from observations or experiments. This is said to be done by the use of Bayes's Theorem to calculate a posterior probability using that evidence and is justified by the Principle of Conditionalisation that P'(h) = P(h/e), where P'(h) is the posterior probability of the hypothesis 'h' in the light of the evidence 'e', but which principle is denied by some [Refs to Hacking Kyburg Howson etc discussions]. Adjusting.....' --Logicus 17:34, 22 August 2007 (UTC)
--Logicus 18:15, 21 August 2007 (UTC)
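The Principle of Conditionalisation quoted in the revised draft above, P'(h) = P(h/e), can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `conditionalise` and the numbers used are hypothetical and come from no cited source, and the evidence is assumed to bear on just two alternatives, h and not-h.

```python
from fractions import Fraction


def conditionalise(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P'(h) = P(h/e), computed with Bayes's Theorem.

    Hypothetical helper: assumes only h and its negation are in play.
    """
    # Total probability of the evidence e over h and not-h.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    if p_e == 0:
        raise ValueError("evidence has zero probability; P(h/e) is undefined")
    return p_e_given_h * prior_h / p_e


# Illustrative numbers only: prior 1/4, evidence three times likelier under h.
posterior = conditionalise(Fraction(1, 4), Fraction(9, 10), Fraction(3, 10))
print(posterior)  # 1/2

# A hypothesis given prior probability zero stays at zero whatever the evidence.
print(conditionalise(Fraction(0), Fraction(1), Fraction(1, 2)))  # 0
```

The second call shows why a zero prior matters for the dispute about universal hypotheses: conditionalisation can never raise a prior of zero.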
- First sentence: I like Logicus' version. EdJohnston has given no substantive reason to prefer the current version. Logicus' reads more smoothly; "suggested by Bayesian theory" is redundant as pointed out, and disturbs the flow. "probability calculus" evokes, for me, an image of arithmetic about probability being carried out in a particular way while different views may be expressed of the meaning of such arithmetic. This puts a slightly less abstract, more graspable flavour on the whole definition. However, I'm not sure whether "believes a proposition is true" is good grammar or not; perhaps it should be "believes that a proposition is true" or "believes a proposition to be true". Also, "is true" could be left off without harm (though perhaps it's slightly better with it).
- Re betting quotient: I don't understand this sentence at all and am not sure I agree with it. The terms "betting quotient" and "universal hypothesis" would have to be defined.
- Re "Degrees of belief ..." These sentences make sense to me; they are not unintelligible. I can try to explain them to you if you like. I don't see how they could create the impression that "Bayesianism has nothing whatever to do with truth or falsity", since belief and disbelief obviously have something to do with truth and falsity; that is, belief is "about" truth or falsity.
- Re replacing "an uncertain" with "a": Technically, you're right. Intuitively, I find that the version with "an uncertain" appeals more to the imagination and helps the reader form an image in the mind. I prefer the version with "an uncertain". --Coppertwig 21:45, 21 August 2007 (UTC)
- I think that Logicus is making a very constructive proposal here (thank you). I do agree with Coppertwig's comment on the "betting quotient" paragraph. I would note that the difficulty, if it is one, with universal hypotheses may be rather less than it appears. In most situations, one is dealing with hypotheses (e.g., "The Nile is more than 2000 km long," "a person selected randomly from the street has a particular medical condition") that are not universal. Surely in cases like this, assigning betting quotients is a reasonable procedure. Most hypotheses in science are of this nature (e.g., "There is a 95% probability that the Hubble constant is between 65 and 80.") Even with regard to universal hypotheses (e.g., "all ravens are black", "Nothing can go faster than the speed of light") there are two ways to view setting betting quotients. One is to assign zero prior probability, as Logicus has argued above (the sticking point here is that no matter how many black ravens you see you will not be able to show that the hypothesis is correct...I am less certain that a rational case cannot be made that observing a very large number of black ravens, given identifiable hypotheses and background information, would not change one's confidence in that universal hypothesis); but the other is rather more commonsense and intuitive, and closer to the way working scientists do it, and it is to think of the assignment of a prior probability as reflecting one's confidence in the hypothesis, whether as a working model (as I have argued above) or as an epistemic truth, the evidence for which would come from an oracle that we can consult when the bets are in to determine what is the case, just as in a horse race (the archetype of assigning betting quotients) what is the case is determined by running the race. I think that it is in this sense that one can describe many, if not most working scientists as "in their gut" Bayesians; this may be different for philosophers of science.
Bill Jefferys 22:32, 21 August 2007 (UTC)
- Oh -- apparently by "universal hypothesis" you mean a statement of a form like "All X are Y." That was not at all obvious and would have to be explained to the reader. The term "betting quotient" has still not been defined. Neither of these phrases appears elsewhere on the page nor has a Wikipedia page by that name to define it. It is not at all obvious that a bet on a universal hypothesis can never be won; some can be proven one way or the other, e.g. "All prime numbers are divisible by themselves" or "All prime numbers have exactly 7 divisors". There could conceivably be a way to prove that all ravens are black (by definition, perhaps, or by viewing all ravens on Earth.) --Coppertwig 22:51, 21 August 2007 (UTC)
- Logicus on what Coppertwig says as follows:"It is not at all obvious that a bet on a universal hypothesis can never be won; some can be proven one way or the other, e.g. "All prime numbers are divisible by themselves" or "All prime numbers have exactly 7 divisors". There could conceivably be a way to prove that all ravens are black (by definition, perhaps, or by viewing all ravens on Earth.)"
- But these claims are both mistaken and also illegitimate because they constitute Wiki-original research. See 'Logicus contra Coppertwig on Bayesian probability' in User Talk: Coppertwig of 29 August for their correction, posted there to avoid any further use of this Talk page with original research discussions --Logicus 16:11, 31 August 2007 (UTC)
- That's my understanding of the meaning of the "universal hypothesis." I am not a philosopher, so Logicus can confirm or refute me. The "betting quotient" idea, I agree, has to be defined. The basic idea is that given a hypothesis X, and its contradiction ~X (for example), a rational person ought to be able to assign odds O (the betting quotient) on X versus ~X, such that he would be willing to take either side of a bet on that proposition with those odds and feel it a fair bet. For example, if you have what you believe to be a fairly tossed fair die, a bet on '1' against the alternative at fair odds would be 1:5. I put in 1 dollar, you put in 5 dollars, and the winner takes the pot. You would regard this as a fair bet for either of us. The Dutch book argument extends this to more than one proposition to justify (not prove) that probability theory is the right formulation of rational thinking under uncertainty. Once one has betting quotients, it is simple to extend these to probabilities, e.g., the probability of X is (in the above example) 1/(1+5)=1/6. Bill Jefferys 23:51, 21 August 2007 (UTC)
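The die example just above can be checked with a few lines of Python. This is a sketch only: the function names `expected_gain` and `probability_from_stakes` are hypothetical, and the die is assumed to be fair so that '1' comes up with probability 1/6.

```python
from fractions import Fraction


def expected_gain(p_win, my_stake, opponent_stake):
    """Expected net gain for a bettor who wins the opponent's stake with
    probability p_win and otherwise loses their own stake."""
    return p_win * opponent_stake - (1 - p_win) * my_stake


def probability_from_stakes(my_stake, opponent_stake):
    """Probability implied by stakes regarded as fair: 1:5 gives 1/(1+5)."""
    return Fraction(my_stake, my_stake + opponent_stake)


p_one = Fraction(1, 6)  # assumed chance of rolling '1' with a fair die

# Bettor who puts in 1 dollar on '1' against 5 dollars.
print(expected_gain(p_one, 1, 5))      # 0
# Bettor who puts in 5 dollars against '1' to win 1 dollar.
print(expected_gain(1 - p_one, 5, 1))  # 0
print(probability_from_stakes(1, 5))   # 1/6
```

Both sides have an expected gain of zero, which is the sense in which either party could take either side of the bet and regard it as fair.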
Logicus: Gentlemen, thanks very much indeed for your constructive comments and criticisms here, which I shall digest.--Logicus 18:14, 22 August 2007 (UTC)
- I note Logicus's addendums.
- Logicus notes that "since scientific hypotheses are typically infinitistic universal generalisations...", but I think you will find if you read any ordinary scientific journal that the hypotheses being entertained are almost always not universal. Most of science is done assuming as background information the validity of such generalizations (e.g., "the speed of light cannot be exceeded", "the momentum-energy tensor is invariably conserved," etc.) You will have to search long and hard in the usual scientific journals to find cases where anyone is actually trying to test such universal hypotheses. Thus, from a point of view of the ordinary scientific enterprise, I think that this comment grossly overstates what scientists do day to day.
- It may be that there is a valid comment that could be made regarding how philosophers of science regard such generalizations, but for the vast majority of normal science, the point is off-message. However, Howson & Urbach dispute even the validity of this point, and any reference to it should cite this authority and state their view. Bill Jefferys 00:01, 23 August 2007 (UTC)
- In response to Logicus: I consider this article to be primarily a mathematics article and also partially a philosophical one. I see no reason to give prominence to philosophical definitions here. Logicus says it's clear but has not made a convincing case. Probability is fundamental to the mathematics of statistics, and writing definitions is a common activity of mathematicians. --Coppertwig 00:10, 23 August 2007 (UTC)
Thanks again, most interesting observations Bill, will digest.--Logicus 18:13, 23 August 2007 (UTC)
Shouldn't footnotes 1 and 2 be reversed in order (or perhaps combined into a single footnote)? Footnote 2 with its quote about "independently" seems to match perfectly with the sentence to which footnote 1 is attached. --Coppertwig 18:55, 24 August 2007 (UTC)
Logicus revised edit proposal:
Does anybody wish to propose any improvement to the following attempted improvement on the above 3rd proposed edit of 22 August above before I post it, or similar ? (It has two footnotes marked 'F')
< However, the probability of infinitistic spatio-temporally unbounded and logically universal hypotheses that are so fundamental to science - such as Newton's law of inertia or his law of universal gravitation - poses a problem for the betting definition of 'degree of belief' because the only fair betting-quotient[F] on an infinitistic universal hypothesis is zero, since a bet on its truth can never be won because its truth can never be decided. This becomes a fundamental problem for Bayesian philosophy of science that scientific reasoning is Bayesian, which thereby seeks to reduce scientific method to gambling, but some regard it as soluble. [F See Gillies 'Induction and Probability' Parkinson (ed) An Encyclopedia of Philosophy 1988; p263-4, Howson & Urbach 1989] But it is also noteworthy that by 1981 De Finetti himself came to reject the betting conception of probability: "...betting strictly speaking does not pertain to probability but to the Theory of Games" ["The role of 'Dutch Books' and 'Proper Scoring Rules' " in British Journal for the Philosophy of Science 32 1981 55-6.]
F[ 'A betting quotient is the quantity p = k/(1+k), where k are the odds on a hypothesis you believe fair and that will therefore be taken as your degree of belief it is true. 'p' is called the betting-quotient associated with the odds k. Odds can be recovered uniquely from betting-quotients by means of the reverse transformation k = p/(1-p).' Paraphrase of p76 Howson & Urbach 1993, last paragraph. ] >
COMMENT: In response to the valid criticisms of Coppertwig and Jefferys this further revised version elaborates universal hypotheses with examples and also explains 'betting-quotient'. (But the elaboration of ‘universal hypotheses’ is very ugly, and should perhaps be an expository footnote ?) But Jefferys’ lengthy criticism that not all scientific hypotheses are logically universal statements is wholly logically irrelevant since the crucial problem for the Bayesian theory of scientific reasoning is that some are, and moreover arguably they are the most important ones (e.g. GTR, Quantum Theory etc). Nor is it relevant to start disputing whether recognised problems in the literature are or are not such or advocating solutions or criticisms of them as Bill does re recognised universal hypothesis problems, since that is Wiki-original research. In fact in the light of mistaken accusations of Logicus doing such, who nevertheless does not practice original research nor pedagogically irrelevant dissertations on Talk pages, it is perhaps pertinent to point out here that in his comments on singular hypotheses in science and betting-quotients Bill is practicing a combination of original research with lessons in elementary Bayesian philosophy of science for Coppertwig on the Talk page. But as Kenneth Williams said “Infamy, Infamy ! They’ve all got it in for ME !” --Logicus 18:39, 25 August 2007 (UTC)
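The two transformations in the proposed footnote, p = k/(1+k) and k = p/(1-p), can be sketched in Python as follows. The function names `betting_quotient` and `odds` are hypothetical labels chosen for this illustration, and k is assumed to be the odds in favour of the hypothesis, expressed as a single non-negative number.

```python
from fractions import Fraction


def betting_quotient(k):
    """Betting-quotient p = k / (1 + k) associated with odds k."""
    if k < 0:
        raise ValueError("odds must be non-negative")
    return k / (1 + k)


def odds(p):
    """Reverse transformation k = p / (1 - p); undefined at p = 1."""
    if not 0 <= p < 1:
        raise ValueError("betting-quotient must satisfy 0 <= p < 1")
    return p / (1 - p)


k = Fraction(1, 5)               # odds of 1:5, as in the fair-die example
p = betting_quotient(k)
print(p)                         # 1/6
print(odds(p) == k)              # True
print(betting_quotient(Fraction(0)))  # 0
```

A betting-quotient of zero corresponds to odds of zero, which is the value the proposed text says is the only fair one for an infinitistic universal hypothesis.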
I disagree with the statement that the only fair betting quotient is 0 because its truth can never be decided, as I've explained earlier. Could you provide a (reasonably short) quote here on the talk page that supports that statement, and then let's consider perhaps using prose attribution and possibly direct quotation (if not leaving it out). I might have more comments later. --Coppertwig 19:47, 25 August 2007 (UTC)
It is immaterial whether YOU disagree with the statement or not, since that is your Wiki-original research. The point is it is at least considered as a problem in the literature, whether people think it correct or not. Go and read the literature such as the references given. And please don't come back and tell me the logically irrelevant point that so-and-so disagrees with it. HOWEVER, I will consider rephrasing it more to your subjective taste, and maybe providing a quote in a footnote (-: This text anyway needs more improvement with more stuff taken out of main text to go in footnotes.--Logicus 14:37, 26 August 2007 (UTC)
Logicus revised edit proposal:
Logicus’s 5th proposed edit of 21 August was as follows:
In the 'Varieties' section, in the sentence “Bayesian probability is supposed to measure the degree of belief an individual has in an uncertain proposition, and is in that respect subjective.” replace "an uncertain" by 'a', since Bayesian probability also measures belief in certain propositions, such as tautologies to which it may assign probability 1, as well as measuring belief in uncertain propositions.
Coppertwig commented:
Re replacing "an uncertain" with "a": Technically, you're right. Intuitively, I find that the version with "an uncertain" appeals more to the imagination and helps the reader form an image in the mind. I prefer the version with "an uncertain". --Coppertwig 21:45, 21 August 2007 (UTC)
This objection from Coppertwig's subjective intuitions is clearly mistaken. The purpose here should not be that of appealing to the imagination and forming images in the reader's mind about UNCERTAIN propositions, which are a red-herring here. Rather it should be that of clarifying the main point being made in this 'Varieties' section, whose purpose is clearly that of explaining the SUBJECTIVITY of subjective Bayesianism as concerned with SUBJECTIVE BELIEF in contrast with OBJECTIVE probability. Introducing the complicating but irrelevant fact that it also deals with uncertain propositions as well as certain propositions simply diverts attention away from the main point about its subjectivity by introducing a different focus of attention on UNCERTAINTY, but which has logically nothing to do with the issue of subjectivity in subjective Bayesianism.
I now propose the current sentence
"Bayesian probability is supposed to measure the degree of belief an individual has in an uncertain proposition, and is in that respect subjective.”
be replaced by
'Bayesian probability means the degree of belief (or strength of belief) an individual has in the truth of a proposition, and is in that respect subjective.'
And I shall now implement this clarifying improvement in the expectation there will be no valid objections to it.
(NB Dear Coppertwig, Logicus does not use capitals to indicate emotion as you mistakenly suggest on his User Talk page, but rather as an alternative to italicisation to indicate where the emphasis of meaning lies, and for the simple reason this saves time fiddling about with italics formatting of text. And in my view your other headmasterly editorial advice on that page is also almost all totally mistaken or misplaced, as I shall explain later if I get time.) --Logicus 14:38, 26 August 2007 (UTC)
- Dear Logicus. I have kept out of your edits until now, preparing for classes and also giving you room for your edits. I think you have significantly improved the article. Your edits were both modest and accurate. Bill Jefferys 02:41, 30 August 2007 (UTC)
Logicus contra Coppertwig
Coppertwig wrote on 23 August:
"In response to Logicus: I consider this article to be primarily a mathematics article and also partially a philosophical one. I see no reason to give prominence to philosophical definitions here. Logicus says it's clear but has not made a convincing case. Probability is fundamental to the mathematics of statistics, and writing definitions is a common activity of mathematicians. --Coppertwig 00:10, 23 August 2007 (UTC)"
Coppertwig's comments are wrong and confused. The article as it stands is definitely primarily a philosophy of mathematics article, being concerned as it is with the meaning of the basic mathematical concept 'probability' and with expounding its Bayesian interpretation in contrast with others. This should hopefully become clear if one reads the article, moreover which itself tells us "...well known proponents of Bayesian probability have included...many philosophers of the 20th century."
But secondly, contrary to what Coppertwig suggests Logicus is not advocating giving prominence to "philosophical definitions", whatever they may be, and so has not made any case whatever for such, convincing or otherwise. Rather the ungainsayable case he has made is that since the article is a philosophy of maths article concerned with expounding the meaning/interpretation of the term 'probability' and its different meanings, then it is pedagogically crucial that it should provide clear definitions of these different meanings/interpretations of 'probability' it deals with, such as classical, logical and frequentist probability, for example. And this will require this article's editors to research the literature to determine these definitions, rather than self-introspecting or indulging in invalid and misplaced criticisms of Logicus's contributions. Logicus has initiated this project by providing a definition of classical probability by Bernoulli. Could Coppertwig and Jefferys possibly research the logical and frequentist definitions of 'probability' in the literature and propose representative definitions of them? As for Coppertwig's two facts that 'probability is fundamental to statistics and that writing definitions is a common activity of mathematicians', their logical relevance against Logicus's case here is entirely lost on Logicus. Both facts are subjects of the philosophy of maths. --80.6.94.131 18:15, 30 August 2007 (UTC)
- Logicus, I admire your ability to find relevant reference works. I probably won't be making that kind of contribution to this page in the near future at least, unfortunately. Re page categorization: actually, I think Wikipedia pages are supposed to have more than one category, anyway, so I think it's fine to leave it as it is. Other than that I'm unclear as to whether there are any edits still in dispute and if so what ones. --Coppertwig 22:56, 30 August 2007 (UTC)
Logicus to Coppertwig:Outstanding Edit
The following seems to be the only outstanding proposed edit:
The following:
“Degrees of belief should not be regarded as extensions of the truth-values (true and false) but rather as extensions of the attitudes of belief and disbelief. Truth and falsity are metaphysical notions, while belief and disbelief are epistemic (or doxastic) ones.”
should all be deleted as unintelligible mumbo-jumbo or possibly seriously misleading in creating the false impression that subjective Bayesianism has nothing whatever to do with truth or falsity as all 'epistemic' probability does, whereas in fact it interprets 'probability' as 'the degree of belief in the truth of a proposition'. These two sentences add nothing but potential confusion to the exposition.
You objected:
- Re "Degrees of belief ..." These sentences make sense to me; they are not unintelligible. I can try to explain them to you if you like. I don't see how they could create the impression that "Bayesianism has nothing whatever to do with truth or falsity", since belief and disbelief obviously have something to do with truth and falsity; that is, belief is "about" truth or falsity.
So I challenge you to translate them into something intelligible. What on earth does ‘doxastic’ mean, for example ? And what does “extensions” mean here ? And what does “Truth and falsity are metaphysical notions.” mean ? And you are quite wrong that “belief and disbelief obviously have something to do with truth and falsity”, since rather than belief in the truth of a proposition, there is also belief in the usefulness of a proposition for some purpose (such as Bill Jefferys believes in), or belief in the existence of something, for example.
So if you explicate these two sentences more fully, their nonsense is made apparent as follows:
“Degrees of belief should not be regarded as extensions of the truth-values (true and false) but rather as extensions of the attitudes of belief and disbelief IN THE TRUTH OF PROPOSITIONS. Truth and falsity are metaphysical notions, while belief and disbelief are epistemic (or doxastic) ones.”
And strictly speaking belief is not ‘epistemic’, that is, about knowledge, traditionally held to be propositions with some degree of truth-content, but is rather psychological, a psychological feature of subjects.
I submit these two sentences only muddy the water and serve no positive expository purpose, so should be deleted. --Logicus 18:40, 31 August 2007 (UTC)
What is probability? Researching it.
Here I would like to start a Wiki article research project for people interested in the philosophy of probability to contribute to in order to determine what the various conceptions of probability are in the literature, and whose very clear expression and contrast is so fundamentally important for a basic educational Encyclopedia article on the philosophy of probability. The fundamental problem here is that from my admittedly very limited knowledge of the literature, even those who should be most perspicuously clear about this issue, namely so-called philosophers, are in fact not so, but notably shy away from providing clear definitions of the various traditions from the very outset of their expositions. (For example, it was not until the 2006 third edition of the Howson & Urbach textbook that any defining conception of subjective Bayesian probability was attempted, and even then it was rather dithering, as quoted above.)
So the question here is, What are the allegedly different conceptions of 'probability' espoused in the various allegedly different traditions, such as the classical, the frequentist, the logical, the epistemic, the pragmatist, the subjectivist, the Bayesian conceptions etc ? I suggest 'we' need to set out an identification of all the various traditions that clearly formulates their basic conceptions/definitions of 'probability' and their logical relations e.g. Is Bayesian subjectivist probability a logical subspecies of epistemic probability which generally conceives 'probability' as 'the probability of the truth of a proposition' as distinct from event-probability that conceives probability as a property of events rather than of propositions ? And since in turn it conceives 'probability' itself as 'degree of subjective belief in the truth of a proposition' or 'strength of belief in the truth of a proposition', is it really 'epistemic', or rather only 'fideist' and 'non-epistemic' ? This is all potentially very confusing for the logically clear thinking beginner and indeed promotes the impression 'probability' is ultimately 'mumbo-jumbo', however practically successful its calculus may be.--Logicus 14:44, 20 October 2007 (UTC)
Let me start the ball rolling here by saying it seems the classical conception of probability of such as Bernoulli and Laplace seems to be as follows:
Defn: Classical 'probability' = 'the degree of certainty in the truth of a proposition'
and is conceived as an objective feature of propositions rather than a subjective variable, such that the probability of a tautology = 1 because the truth of a tautology is completely certain.
But how does the logical conception of such as Keynes and Carnap differ from the classical conception, if at all, and how does the frequentist conception differ from these, for example ?
Thus I am proposing the article and also the article 'Philosophy of probability' need to provide clear identifications of the different conceptions of probability in the kind of clear Definitional format I have provided above, but whose identifications need some serious research.
Jefferys' Probability
For example, something like Bill Jefferys' highly interesting non standard Pragmatist conception of probability that he claims is used by himself and his astronomer colleagues is as follows:
Defn: Jefferys' Pragmatist subjective Leibnizian 'probability' = 'the degree of subjective belief it is likely a proposition will be useful for predicting novel facts'.
I suspect the latter may be a highly important conception of probability, albeit not one that obeys the traditional probability calculus, as I have already proven inasmuch as it breaches its tautology axiom P(t) = 1 since presumably P(t) = 0. Its historical scientific significance is that it seems it was Leibniz (and Huygens) who first proposed that the hypothetico-deductive prediction of novel phenomena was the next best proof of the truth of a hypothesis given the logical impossibility of absolute proof of truth from the 'phenomena'. Newton's attempts at the latter for his theory of universal gravitation were clearly invalid and robustly criticised by Leibniz, whose critique was most forcefully endorsed by Roger Cotes (in an early 1713 letter to Newton). Cotes's brilliant Preface to the second edition of the Principia to try and answer Leibniz's critique cunningly revamped Newton's earlier Book 3 proof in Propositions 1 to 8 to try and solve the problem of its collapse at Proposition 5 Corollary 1, where it patently begged the question of gravitational attraction at a distance in an unproven lemma about Law 3 and mutual attraction. Cotes's Preface proof shows clear signs of adopting the Leibnizian non-inductive hypothetico-deductive standard of proof in the prediction of novel phenomena (e.g. it appeals to hypothetico-deductive predictions of Saturn-Jupiter gravitational mutual perturbation, to moving aphelia of all the planets, and the times of high tides, albeit as a matter of fact all of which were either undetectable or refuted at the time.) It was arguably that other great German philosopher and astronomer Kant, following on Hume, who wound up and rang the death knell of absolute proof of truth from the phenomena based on his critique of the invalidity of Cotes's Principia E2 Preface proof.
And on the other hand it was the stunningly successful novel prediction of the return of Halley's comet in the 1750s that finally persuaded the French Academy to terminate the prize it offered for the refutation of Newton's theory of gravity, and which seemingly embodied and heralded the adoption of the Leibnizian criterion of proof of truth, albeit not one of certainty.
The pragmatisation of the Leibnizian standard of evaluating hypotheses to become not a proof nor sign of their truth, but rather maybe making the feature of predicting novel phenomena a value in itself in terms of usefulness, seems a plausible candidate for explicating the notion of the probability of a proposition in scientific reasoning. The essential idea here is that scientists evaluate hypotheses according to how likely they believe they will predict novel phenomena, and so the 'probability' of a hypothesis becomes 'the strength of belief in its usefulness for predicting novel phenomena'. Thus one might argue the French Academy employed a non-standard 'Leibnizian pragmatist Bayesian' conception of probability because it raised the probability it assigned to Newton's theory of gravity posterior to the evidence of the return of Halley's comet, which clearly established it could predict novel facts. However, a contra-pragmatist counterargument here is that like Leibniz they were only employing prediction of novel facts as a (fallible) criterion of truth or of degree of verisimilitude. But in turn an apparently fatal problem for this criterion is that tautologies are maximally absolutely true but totally useless for predicting novel facts, whereby they would surely absurdly come out totally false on that criterion. 'Strength of belief in utility for predicting novel facts' may well be an important candidate for a non-standard conception of 'probability' that breaches the standard axioms of the probability calculus (e.g. the tautology axiom). But it would seem to be a massive original research undertaking investigating what other standard probability axioms it breaches/obeys. Such are the delights of the philosophy of probability and of science (-: --Logicus 18:11, 23 August 2007 (UTC)
- I'm afraid that Wikipedia isn't an online university or discussion forum, not even the talk pages. There are plenty of places where you can discuss the topics you are interested in. This is not the right place, however. So please try to keep your comments short and to the point.
- I don't see the need for yet another "Philosophy of probability" article when we have Probability interpretations already. A redirect might be a good idea tho. iNic 19:06, 23 August 2007 (UTC)
- I agree with Inic that this is not the place to develop a theory. I think that Logicus hasn't understood my idea, though, since I would not say that the probability of a tautology would be anything other than 1. Logicus, please contact me by email (go to my Wiki webpage, you can contact me that way. My email address is on my homepage, given there.) and we can continue this part of the conversation offline. Bill Jefferys 00:00, 24 August 2007 (UTC)
Logicus to iNic:I disagree with you iNic. This is exactly the right place to raise the problem that this article and related articles fail to make clear what the different conceptions of probability are and how they differ by means of simply giving their definitions of ‘probability’. But to find out what they are for these articles one needs to research the literature, and people can post their findings to this Talk discussion topic. Note this article is already officially condemned for its lack of references, and I am trying to help by improving that situation. Also, you seem unaware that there already is a ‘Philosophy of probability’ article to which I referred, but sorely inadequate in my view. So I think you thoroughly misunderstand the point of my discussion here and your objections are wholly invalid. But thanks for pointing out the 'Probability interpretations' article of which I was unaware, and which is much better than the ‘Philosophy of probability’ article. --81.178.234.129 19:37, 24 August 2007 (UTC)
And Bill, I am not developing any theory here anyway, again a misunderstanding. I was only discussing your theory as an example of conceptions that need clarification, a non-standard one in your case. This is all very simple if you just stick to the point that what is needed here is clear definitions of the different conceptions of 'probability'. --81.178.234.129 19:37, 24 August 2007 (UTC)
- Thanks for adding references to the article, Logicus. That's very useful. --Coppertwig 20:04, 24 August 2007 (UTC)
- Pleased to learn somebody in Wikipedia thinks I do some useful things in Wikipedia! The reference for the definition of the classical conception of probability provided above is James Bernoulli’s 1713 Ars Conjectandi. But maybe it should be ‘the RATIONAL degree of certainty in the truth of a proposition’, where ‘rational’ will have to be explicated.--Logicus 19:28, 25 August 2007 (UTC)
Logicus to Bill Jefferys:
You say above, 24 August:“I think that Logicus hasn't understood my idea [of probability] though, since I would not say that the probability of a tautology would be anything other than 1.”
Interested to hear you assign probability 1 to a tautology, rather than zero because they are useless for making novel predictions. But does that therefore mean you believe tautologies are the most likely of all propositions to usefully make novel predictions? To review this issue, you strongly objected to recognition in the article of the apparently fatal problem posed for Bayesian philosophy of science by the radical fallibilist philosophy that all hypotheses are false on the invalid ‘original research’ ground that YOU deny it is a fatal problem. But as I drew you out on your philosophy of probability, it quickly became apparent that this was because you (claim you) do not assign probabilities on the basis of belief in their TRUTH as Bayesianism does, but rather in belief in their likely USEFULNESS for predicting novel facts, and whereby this apparently fatal problem (for Bayesianism) is not a problem for your non-epistemic philosophy of probability. But of course imposing your original research on the article to block edits is invalid. And by the way, Coppertwig’s recent example of an alleged tautology useful for predicting things on User Talk:Logicus is (i) not a tautology, but at most an analytical truth dependent on the axioms of arithmetic which are not themselves tautologies, and (ii) logically irrelevant because it is prediction of NOVEL facts that is at issue, not mere prediction.--Logicus 19:53, 25 August 2007 (UTC)
- Logicus, this is not the place to discuss it. I will be happy to discuss it offline, and have provided a pointer to my email address. Bill Jefferys 18:14, 26 August 2007 (UTC)
BenE to Logicus:
Formal attempts have been made to define and apply the intuitive notion of a "degree of belief". The most common interpretation is based on betting: a degree of belief is reflected in the odds and stakes that the subject is willing to bet on the proposition at hand. However, the probability of the kind of spatio-temporally universal hypotheses that are so fundamental to science - such as Newton's law of inertia or his law of universal gravitation - poses a problem for the betting definition of 'degree of belief' on the ground that the only fair betting-quotient on such universal hypothesis is zero, since a bet on its truth can never be won because its truth can never be decided.
Mathematically bayesianism doesn't allow to compute a probability on a single proposition like the one above. The mathematics only apply to the comparison of two (or more) models using a well defined set of parameters which, at least in science, refer to real things in the universe, things that are described symbolically using variables that have dimensions that are at least isomorphic to the real dimensions of the universe.
In order to be mathematically tractable: It only applies to models, and it only applies to models compared to each other. This is a result of the fact that models don't need to be mutually exclusive (Sometimes they are, sometimes they're not). Bayesian science cannot answer the question: Which model is the true one? (unless we can show that all the possible models for a phenomenon are mutually exclusive; we can almost never do that for complex models) It can only answer the more relative: Which model is the truest? Which is defined as: which one makes better predictions. A third requirement if we wish to call it a science is that it only applies to models that describe/predict real physical things in the universe. These requirements are not a problem because this is precisely the things science deals with. Science doesn't deal with tautologies, and logical truths about languages or other symbolic systems. These systems, as all mathematics, are tools used by science. Science doesn't reveal truths about the working of these systems, it's the other way around, it uses these logics, where they are useful, to make predictions about the universe. Bayesian principles can be used to evaluate a model that is described using logic and other symbolic languages against the prediction of another similar model as long as these models make predictions about real physical things and are not limited to self referential statements or definitions that don't relate to anything physical. Bayesian science doesn't claim to evaluate anything further than the real and the physical.
In the above example, Bayesian probability would not say that the only fair bet is zero. It would not say anything about this bet. It would leave you clueless as you didn't define any other models to compare with Newton's.
Instead it could say, for example, that Einstein's model predicts planetary movement to a precision of 10 decimals and Newton's model only predicts it to 3 decimals, hence Einstein's model is that much more predictive, and by definition that much more probable. However, it cannot say whether any of these models are ultimate truths or even give a number that represents their absolute probability. Unless one can observe all of the universe from its beginning to its end, that is an unanswerable question and Bayesian science is right in classifying it as such.
This problem of the Bayesian philosophy of probability becomes a fundamental problem for the Bayesian philosophy of science
Nope, there is no problem. Bayesian philosophers of science simply say that it is a fallacy to even pose questions similar to the one above. Also see: http://yudkowsky.net/bayes/technical.html --BenE 18:16, 10 September 2007 (UTC)
- BenE, thank you for explaining the point I was trying to get over to Logicus. Bill Jefferys 14:06, 11 September 2007 (UTC)
- Logicus to Bill Jefferys: Since you seem to find BenE's remarks intelligible, or at least some of them, perhaps you would be so kind as to explain (1) what exactly is the point you were trying to get over to Logicus that you refer to here, most preferably stating it in one coherent well-formed proposition in the English language and (2) how any of BenE's remarks possibly explain that point, preferably again stating the explanation in one proposition and in good English.
- Further, please note that BenE's disagreement does not appear to be with me, but rather with the first two sentences of the text he quotes which I did not write, and particularly the claim that "a degree of belief is reflected in the odds and stakes that the subject is willing to bet on the proposition at hand.". This clearly identifies Bayesian belief as belief in propositions, including a single proposition, which is indeed a correct account of the literature on subjective Bayesian epistemic probability. BenE's disagreement is thus not with me, but rather an original research disagreement with the literature and Wikipedia article, which is not my concern.
- Also note BenE's comments, where they are intelligible, are mistaken and confused even from their very first sentence: "Mathematically bayesianism doesn't allow to compute a probability on a single proposition like the one above."
This claim is obviously false, since mathematically Bayes' Theorem itself does enable computing a posterior probability on a single proposition, including on each of the two different single propositions cited in my text, whether that probability be zero or non-zero. BenE may believe there is some other non-mathematical rule of some Bayesian methodology he has in mind that forbids doing this, but it is certainly not mathematically impossible.
- BenE appears to be doing Wiki-original research in advocating some non-standard Bayesian methodology, in particular apparently a form of event-probability, rather than epistemic-probability concerned with the probability of propositions, and thus notably gives no references nor sources for the string of dogmatic assertions he makes. But even if what he says were correct, it would mean that scientific reasoning is indeed not Bayesian epistemic as represented in the Bayesian philosophy of science literature, which is also my view, but for entirely different reasons.
- Finally on BenE, note this section of Talk 'What is probability ?' was intended to be a research project for people to provide simple, clear sourced definitions of the different conceptions of probability for this article, but which BenE never provides. Like many Wiki editors on these pages, he simply shoots his mouth off about his own opinions without sources or references.
- On another point, since you yourself at least agree with the radical fallibilist philosophy of science that all scientific laws are false according to your 15 August testimony on this Talk page, then what probability would you assign a scientific law if you assigned probabilities to propositions according to strength of belief in their TRUTH, as subjective Bayesian epistemic probabilists do according to the literature ? (I appreciate you do not accept the subjective Bayesian epistemic interpretation of probability, but have your own utilitarian pragmatic interpretation of it as 'likely usefulness of a hypothesis in making novel predictions', but just imagine you did accept it. What probability would you assign a hypothesis you believe to be definitely false ?)
- I would also be grateful to know why you assign tautologies probability 1, and thus why you evaluate them as maximally useful for making novel predictions. For instance, how is 'The Moon is the Moon' useful for making novel astronomical predictions.
- Finally, I would be grateful if you would kindly give me the reference for your wonderful George Box quotation where you said:
- “The idea that all hypotheses are false is nothing new and is of no consequence to Bayesian theory. As the eminent Bayesian statistician, George E. P. Box famously remarked, "All models are wrong; but some models are useful." “ Where did Box say this ?
- And if you still have any difficulty understanding why the belief that all hypotheses are false might be fatal to an epistemology that assigns probabilities to propositions as 'strength of belief in their TRUTH', do let me know. Logicus 17:32, 18 September 2007 (UTC)
- Since the semester started, I have been rather busy, and expect to be so for quite some time, so will answer only the question about the Box quotation, and make one more comment. The other questions will have to wait, weeks probably. I can only say that you profoundly misunderstand my position. I still invite you to contact me directly. Believe me, I do have coherent and justifiable reasons to write what I did.
- But here is one cite to Box's comment:
- See the entry in WikiQuotes under George Box for other versions. It's not hard to find. Try Google. It works very well.
- I'll make one more comment. The reason why you have to consider more than one hypothesis is that if only one hypothesis is possible (that is, the universe of hypotheses under consideration consists of that one hypothesis alone), its prior and posterior probabilities are by definition 1. This is well-known, and any decent book on Bayesian statistics should set you straight on this point. Try Jim Berger's book.
- Now I have to go back to preparing my teaching for tomorrow. Bill Jefferys 00:02, 19 September 2007 (UTC)
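The normalisation point made above (a hypothesis that is alone in its universe of hypotheses gets posterior probability 1) can be illustrated with a minimal sketch. It assumes a finite set of mutually exclusive and exhaustive hypotheses; the function name `posterior`, the hypothesis labels and the likelihood values are all hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem over a finite, exhaustive set of exclusive hypotheses.

    priors and likelihoods are dicts keyed by hypothesis label.
    """
    # Unnormalised posterior: prior times likelihood of the data.
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the data are impossible under every hypothesis")
    # Dividing by the evidence makes the posteriors sum to 1.
    return {h: joint[h] / evidence for h in joint}


# Only one hypothesis under consideration: its prior must be 1, and the
# normalisation forces its posterior to 1 however poorly it fits the data.
single = posterior({"H0": Fraction(1)}, {"H0": Fraction(1, 1000)})

# Adding a definite alternative (illustrative numbers) makes the comparison
# informative: H0 now loses probability to the better-fitting H1.
pair = posterior(
    {"H0": Fraction(1, 2), "H1": Fraction(1, 2)},
    {"H0": Fraction(1, 1000), "H1": Fraction(1, 10)},
)

print(single)
print(pair)
```

Exact fractions are used here only so that the single-hypothesis posterior comes out as exactly 1 rather than a floating-point approximation.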
BenE:
- "This claim is obviously false, since mathematically Bayes' Theorem itself does enable computing a posterior probability on a single proposition"-Logicus
To quote E.T. Jaynes:
- "A practical difficulty of this was pointed out by Jeffreys (1939); there is not the slightest use in rejecting any hypothesis H0 unless we can do it in favor of some definite alternative H1 which better fits the facts.
- Of course, we are concerned here with hypotheses which are not themselves statements of observable fact. If the hypothesis H0 is merely that x < y, then a distinct, error-free measurement of x and y which confirms this inequality constitutes positive proof of the correctness of the hypothesis, independently of any alternatives. We are considering hypotheses which might be called 'scientific theories' in that they are suppositions about what is not observable directly; only some of their consequences - logical or causal - can be observed by us.
- For such hypothesis, Bayes' theorem tells us this: Unless the observed facts are absolutely impossible on hypothesis H0, it is meaningless to ask how much those facts tend 'in themselves' to confirm or refute H0. Not only the mathematics, but also our innate common sense (if we think about it for a moment) tell us that we have not asked any definite, well-posed question until we specify the possible alternatives to H0. Then, as we saw in Chapter 4, probability theory can tell us how our hypothesis fares relative to the alternatives that we have specified; it does not have the creative imagination to invent new hypotheses for us."
Logicus: I note that insofar as it is intelligible, this highly confused quotation from Jaynes confirms my claim that BenE's claim that "Mathematically bayesianism doesn't allow to compute a probability on a single proposition like the one above [i.e. Newton's law of inertia or of gravity]." is false because "mathematically Bayes' Theorem itself does enable computing a posterior probability on a single proposition" as I point out. Jaynes seems to mistake a methodological rule for mathematics, arguably typical of his confusion.--Logicus 18:15, 16 October 2007 (UTC)
Jaynes' emphasis, not mine. From Jaynes' book. Free draft here. But, go ahead, try to write down the mathematical equations that describe the probability of Newton's law in a consistent way. As you said yourself you end up with nonsense like having to pose all prior probabilities to be zero. This isn't a flaw in Bayesianism, it's a feature! Bayesianism, analogously to the way arithmetic doesn't allow division by zero, doesn't allow you to pose these ill-formed questions!--BenE 14:13, 20 September 2007 (UTC)
BenE to Logicus:
Logicus, notice how, although I have never interacted with B. Jefferys before this page, he and I have the same views, which are consistent with each other, consistent with how Bayesian theory is used in practice and, most importantly, consistent within the language of mathematics and its relation with the real world. We can actually write down the equations and solve real world problems, which is more than I can say for your views. You keep giving examples that are outside the bounds of a useful Bayesian theory. Your perspective is pure epistemology taken to such an extreme that everything becomes meaningless with regard to the real world. If I used your rationale and applied it to simple algebra I would reject this theory on the basis that it can't divide by zero and thus isn't a consistent and complete epistemological system for numbers. Eventually my world-views in general would fall to complete nihilism.
My Bayesian theory is epistemic, just not so purely epistemic as to be disconnected from reality. I and most Bayesians define the level of truth a theory has relative to the accuracy of its predictions about the real world (not, BTW, as "likely usefulness of a hypothesis in making novel predictions"). Bayesianism is deemed to be useful when the truth of theories is defined in terms of the accuracy of the verifiable predictions they make relating to the real world. A theory is defined as true (relatively, compared to other theories) when its predictions are accurate. Is this so hard to understand? Notice how I use the words "real world and reality" everywhere. The reason is that for most people it is an obvious assumption (so much so that you'll probably have a hard time finding it in the literature as nobody seems to find the need to state it) that any scientific theory has reality as its subject. You, for some reason, don't seem to make that assumption and that's what's causing you problems. A tautology isn't reality, it's a self-referential circular statement true by definition. We don't need to test its truth externally with a scientific theory, it's defined as true in its own symbolic language. It is part of the reasoning system, it's not to be reasoned upon with the reasoning system (unless one wants to test its capacity at circular logic). Bayesianism as a scientific theory doesn't apply here.
Also you keep using the terms subjective and objective without any definitions. Bayesian theory, according to Jaynes anyway, is subjective and objective at the same time! It is subjective in that it describes directly only our state of knowledge about the real world, not anything that could be measured in an experiment. It's objective in that it is independent of the personality of the user. Two people with the same information will always arrive independently at the same conclusion if they follow strict Bayesian mathematics.
Maybe you are not expressing yourself very well. Maybe you actually have a consistent alternative theory which you think is better than Bayesianism for science. If so, and if it's a well known theory, please create an alternate Wikipedia page which describes this theory and its benefits over Bayesianism as science. Then come back here and create a link to that page so that people are aware of the alternative. But coming here to punch down hypothetical straw men that live in a world isolated from reality while telling us you found Bayesianism's "fundamental problems" is a waste of everybody's time.--BenE 15:33, 20 September 2007 (UTC)
- I think this debate is interesting and would love to take part in it myself. However, wikipedia is the wrong place. Therefore I would like to propose that this debate moves to another place. Here is a place we can use for example. We can cut and paste the discussion in this section to that place and continue the debate there. We could just leave a link here (and at other related wikipedia talk pages) so that others can find and join our open debate. iNic 13:28, 22 September 2007 (UTC)
BenE to Logicus: I was skimming through Jaynes' book and stumbled upon, in an appendix, a reason we might be talking past each other. You seem to be mostly aware of the de Finetti system of probability while I'm more knowledgeable about the Bayes, Laplace, Jeffreys, Jaynes line. Jaynes mentions de Finetti in Appendix A:
There is today an active school of thought, most of whose members call themselves 'Bayesians' but who are actually followers of Bruno de Finetti and concerned with matters that Bayes never dreamt of. In 1937, de Finetti published a work which expressed a philosophy somewhat like ours and contained not only his marvellous and indispensable exchangeability theorem, but also sought to establish the foundations of probability theory on the notion of 'coherence'. This means, roughly speaking, that one should assign and manipulate probabilities so that one cannot be made a loser in betting based on them. He appears to derive the rules of probability theory very easily from this premise.[...] To our knowledge de Finetti does not mention consistency as a desideratum, or test for it. Yet it is consistency - not merely coherence - that is essential here, and we find that, when our rules have been made to satisfy the consistency requirements, then they have automatically (and trivially) the property of coherence.
The Bayes, Laplace, Jeffreys, Jaynes line, which has existed for a longer time period and seems to be based on more general principles, has never seen an objection in evaluating scientific hypotheses other than stating that they can't be evaluated alone. The arguments about applicability to scientific hypotheses go back to at least Harold Jeffreys' book published in 1939.--BenE 15:48, 23 September 2007 (UTC)
Bayesian probability a special case of classical ?
[Please note this section was proposed to be about clarifying and researching different interpretations of ‘probability’ in the literature in contrast with the Bayesian conception, rather than regaling people with one's own conceptions.]
The first paragraph of the 'History' section claims
“[Laplace] instead adhered to the classical definition of probability, a special case of the Bayesian definition.”
But how on earth is Laplace’s classical definition of probability logically a special case of the Bayesian conception ? Surely the two contradict each other if anything ? e.g. classical probability observes the Principle of Indifference but subjective Bayesian probability does not.
Laplace’s classical definition given in Wikipedia is as follows:
“The probability of an event is the ratio of the number of cases favorable to it, to the number of all cases possible when nothing leads us to expect that any one of these cases should occur more than any other, which renders them, for us, equally possible.”
But this article says “ Bayesian probability is an interpretation of the probability calculus which holds that the concept of probability can be defined as the degree to which a person (or community) believes that a proposition is true.”
Thus, for example, whereas the subject matter of the classical conception is said to be EVENTS, that of Bayesian probability is said to be PROPOSITIONS.
I propose the clause “ a special case of the Bayesian definition” be deleted as an elementary error. --Logicus 18:13, 25 September 2007 (UTC)
- Logicus comments: I have now deleted the offending clause. By way of further hopefully helpful comment, one aspect of considerable basic conceptual confusion in this article seems to be that a weaker conception of Bayesian probability than that defined in its very first sentence is also operating in tandem with it, namely that Bayesian probability is just any version of the probability calculus that includes Bayes’ Theorem. (This is apparently why such as Bill Jefferys classifies himself as a Bayesian, in spite of rejecting this article’s definition of Bayesian probability in favour of his utilitarian conception.). It seems to be this logically weaker definition that permitted the inclusion of Laplace as a Bayesian. As for the stronger definition, it equates Bayesianism with subjective epistemic probability (and presumably also tacitly includes the use of Bayes' Theorem?). These confusions may well reflect that in the literature itself, but which should not be reproduced in an encyclopedia in which conceptual clarity is crucially pedagogically important. The most basic task of improving this article remains that of providing some clear historically adequate crisp definitions of all the different conceptions of probability it refers to. --Logicus 14:12, 30 September 2007 (UTC)
- Laplace was never a Bayesian in any sense of the word. However, every conception of probability (i.e., interpretation of probability theory as given by the axioms of Kolmogorov) includes Bayes' Theorem. So a definition of Bayesianism that loose isn't useful at all. iNic 15:06, 30 September 2007 (UTC)
- I agree iNic, at least for the CONDITIONAL probability calculus, but then the other definition seems too strong --Logicus 18:14, 1 October 2007 (UTC)
- Yes it was an error. Good you deleted it. iNic 15:06, 30 September 2007 (UTC)
- Why then does it say in the classical definition article: "The classical definition enjoyed a revival of sorts due to the general interest in Bayesian probability, in which the classical definition is seen as a special case of the more general Bayesian definition."? Moreover, Laplace laid the foundation for today's Bayesian theory. If you read Laplace's book you will see that all his calculations are made using Bayesian techniques. He doesn't explicitly mention the use of Bayesian techniques as there were no other alternatives back then. However his mathematical conclusions always follow posterior probabilities based on a flat prior corresponding to the "when nothing leads us to expect that any one of these cases should occur more than any other," (hence a special case of today's Bayesianism, which uses flat priors but also other uninformative priors). Laplace's "When nothing leads us to expect" is a statement of belief. Also, Laplace is famous for having said: "Probability theory is nothing but common sense reduced to calculation." Laplace's principles (the rule of succession, for example) were much criticised in the 20th century until Bayesians came along and recognised them as the right solution to many problems; they simply needed to be generalised to increase their coverage and usefulness. In fact Laplace is the most cited author in Jaynes' book on Bayesian probability theory, appearing in 21 sections, even surpassing Sir Harold Jeffreys to whom the book is dedicated. For example Jaynes writes:
- "It appears that this result was first found by an amateur mathematician, the Rev. Thomas Bayes (1763). For this reason, the kind of calculations we are doing are called 'Bayesian'. We shall follow this long-established custom, although it is misleading in several respects. The general result is always called 'Bayes' theorem although Bayes never wrote it; and it is really nothing but the product rule of probability theory which had been recognised by others, such as James Bernoulli and A. de Moivre (1718), long before the works of Bayes. Furthermore, it was not Bayes but Laplace (1774) who first saw the result in generality and showed how to use it in real problems of inference."
- On Laplace's rule of succession:
- "This rule occupies a supreme position in probability theory; it has been easily the most misunderstood and misapplied rule in the theory, from the time Laplace first gave it in 1774. In almost any book on probability, this rule is mentioned very briefly, mainly in order to warn the reader not to use it. But we must take the trouble to understand it, because in our design of this robot Laplace's rule is, like Bayes' theorem, one of the most important constructive rules we have. [...] Poor old Laplace has been ridiculed for over a century because he illustrated use of this rule by calculating the probability that the sun will rise tomorrow, given that it has risen every day for the past 5000 years. One obtains a rather large factor (odds of 1826214:1) in favor of the sun rising again tomorrow. With no exception at all as far as we are aware, modern writers in probability have considered this a pure absurdity. Even Keynes (1921) and Jeffreys (1939) find fault with the rule of succession. We have to confess our inability to see anything at all absurd about the rule of succession. We recommend very strongly that you do a little independent literature searching, and read some of the objections various writers have to it. You will see that in every case the same thing happened. Firstly, Laplace was quoted out of context, and secondly, in order to demonstrate the absurdity of the rule of succession, the author applies it to a case where it does not apply [...] but if you go back and read Laplace (1812) himself, you will see that in the very next sentence after this sunrise episode, he warns the reader against just this misunderstanding: "But this number is far greater for him who, seeing in the totality of phenomena the principle regulating the days and seasons, realizes that nothing at the present moment can arrest the course of it" "
- I hope this makes it clear that Laplace saw probability as a measure of belief, which makes the classical definition a special case of the Bayesian definition. The article about the definition shows that I'm obviously not the only one who sees it that way. I propose to undelete this fact. --BenE 14:08, 1 October 2007 (UTC)
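For readers who want to check the sunrise figure quoted above, here is a minimal sketch of the rule of succession in its usual textbook form: after s successes in n trials, with a flat prior on the unknown chance, the probability of success on the next trial is (s + 1)/(n + 2). The day count is an assumption (5000 years at roughly 365.2426 days per year, rounded to a whole number), and the names are illustrative only.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Probability of success on the next trial under a flat prior."""
    return Fraction(successes + 1, trials + 2)


# Assumption: 5000 years at about 365.2426 days per year.
days = 1826213

# The sun rose on every one of those days.
p_sunrise = rule_of_succession(days, days)

# Convert the probability into odds in favour of another sunrise.
odds = p_sunrise / (1 - p_sunrise)

print(p_sunrise)  # 1826214/1826215
print(odds)       # 1826214
```

With no observations at all the same formula gives 1/2, which is the flat-prior assignment the discussion above associates with "when nothing leads us to expect" one case more than another.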
- Well BenE, to claim that Laplace was a Bayesian is like claiming that Jesus was a Marxist or that Archimedes was an Intuitionist. You can perhaps find claims like that in the most propagandistic texts, but that surely doesn't make them more true nor these texts' authors more credible. An absurd anachronism is always an absurd anachronism whoever said it. However, the opposite implication is of course true; some Bayesians view Bayesianism as a kind of revival of the classical definition of probability, that was stone dead at the time some Bayesians tried to dig it up from its grave. That is why it's correct to say that the classical definition enjoyed a revival of sorts due to the general interest in Bayesian probability. Note, however, that the opposite implication is not true. iNic 01:09, 2 October 2007 (UTC)
- Laplace's rule of succession is his attempt to justify inference by induction in a quantitative way. The so called "problem of induction" was considered to be an important problem to solve to justify what was believed to be the correct scientific reasoning at his time. However, it's not considered to be interesting or valid anymore. No one today thinks that science is built on induction. The law of gravity, for example, isn't increasingly more true for every observed apple that is dropped. Induction has been a non-issue for a very long time. That Jaynes thinks that this rule "occupies a supreme position in probability theory" is cute, but makes me wonder in what century he thought he lived. iNic 01:09, 2 October 2007 (UTC)
BenE to iNic: —Preceding unsigned comment added by BenE (talk • contribs) 14:11, 2 October 2007 (UTC)
- "The law of gravity, for example, isn't increasingly more true for every observed apple that is dropped."
You misunderstand how Bayesian principles work. The law itself isn't more true. However our rational belief that the law is valid is strengthened every time we observe it. Bayesian probability is based on making the distinction between truth and rational belief about the truth. It is a measure of rational and objective belief not of truth.
Well, it is true in a way that Laplace wasn't a Bayesian, in that the distinction Bayesian vs Frequentist didn't exist at the time. However, today's Bayesianism is based on a revival of Laplace's methods. Laplace applied his theory to problems of belief, not limiting himself to estimating frequencies. We could almost call today's Bayesianism: Laplacism!
This is how D.S. Sivia introduces Bayesianism in his book "Data Analysis: A Bayesian Tutorial" page 8:
- "... Reverend Thomas Bayes is credited with providing an answer to Bernoulli's question, in a paper published posthumously by a friend (1763). The present-day form of the theorem which bears his name is actually due to Laplace (1812). Not only did Laplace rediscover Bayes' theorem for himself, in far more clarity than did Bayes, he also put it in good use in solving problems in celestial mechanics, medical statistics and even jurisprudence. Despite Laplace's numerous successes, his development of probability theory was rejected by many soon after his death.
- The problem was not really one of substance but concept. To the pioneers such as Bernoulli, Bayes and Laplace, a probability represented a degree-of-belief or plausibility: how much they thought that something was true, based on the evidence at hand. To the 19th century scholars, however, this seemed too vague and subjective an idea to be the basis of a rigorous mathematical theory. So they redefined probability as the long run relative frequency with which an event occurred, given (infinitely) many repeated (experimental) trials. Since frequencies can be measured, probability was now seen as an objective tool for dealing with random phenomena.
- Although the frequency definition appears to be more objective, its range of validity is also far more limited. For example, Laplace used (his) probability theory to estimate the mass of Saturn, given orbital data that were available to him from various astronomical observatories. In essence, he computed the posterior pdf for the mass M, given the data and all the relevant background information I (such as a knowledge of the laws of classical mechanics): prob(M|{data},I); [author points to a figure] To Laplace, the (shaded) area under the posterior pdf curve between m1 and m2 was a measure of how much he believed that the mass of Saturn lay in the range m1 <= M <= m2. As such, the position of the maximum of the posterior pdf represents a best estimate of the mass; its width, or spread, about this optimal value gives an indication of the uncertainty in the estimate. Laplace stated that: '... it is a bet of 11000 to 1 that the error of this result is not 1/100th its value.' He would have won the bet, as another 150 years' accumulation of data has changed the estimate by only 0.63%! According to the frequency definition, however, we are not permitted to use probability theory to tackle this problem. This is because the mass of Saturn is a constant and not a random variable; therefore, it has no frequency distribution and so probability theory cannot be used."
--BenE 14:06, 2 October 2007 (UTC)
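The rule of succession mentioned in this exchange can be made concrete with a minimal sketch. It assumes the textbook form of Laplace's rule (a uniform prior over the unknown success rate, giving (s + 1)/(n + 2) after s successes in n trials); the function name `rule_of_succession` is hypothetical and introduced only for illustration. It shows the degree of belief rising with each confirming observation without ever reaching 1, which is the distinction between truth and belief about truth drawn above.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Probability that the next trial succeeds, assuming a uniform prior
    over the unknown success rate (Laplace's rule of succession)."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# Belief that the next dropped apple falls, after n drops that all fell.
beliefs = [rule_of_succession(n, n) for n in range(6)]
for n, belief in enumerate(beliefs):
    print(f"after {n} confirming observations: {belief}")
```

Exact fractions are used so that the monotone increase is visible without floating-point noise.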
- Logicus further to comments of 30 September: I offer the following further comments aimed at achieving conceptual clarification both of the apparently conceptually confused probabilist literature and also of this article as it reflects that confusion in the matter of explaining its category of 'Bayesian probability'. In short, I propose we investigate the hypothesis that this category is in fact a category mistake or pseudo-category, because there is really no such conception as distinct from non-Bayesian probability, but it has become a misnomer for subjective probability. For the logical order of exposition involved in identifying 'Bayesian probability' in general and for this article surely requires answering (1) What is probability ? (2) What is Bayesian probability ? and (3) What is subjective Bayesian probability ? in that order. But it seems 'Bayesian probability' has somehow been conflated with 'subjective Bayesian probability', but even then only equivocally, and is never defined as a separate more general category that is a sub-species of probability. (For the equivocation in this article see paragraph 2 of the 'Varieties' section, and also equivocations about whether such as logical probability, which is never clearly defined, is Bayesian or not.)
- For certainly inasmuch as Bayes' Theorem is a logical consequence of the Kolmogorov axiomatisation of the ontic conditional probability calculus for events and also of its epistemic counterpart for propositions, then ALL probability is Bayesian. So what is the specificity of the conception of probability referred to by 'Bayesian probability' ? Apparently, I submit for critical discussion, nothing more than subjective epistemic probability. So by way of possible logical clarification I offer the following possible pedagogical ontology of concepts for consideration.
- In answer to the first expository question 'What is probability ?', then first of all I suggest the predicate 'probability' should be interpreted as a primitive undefined term in intension, but whose extension is implicitly defined by the probability calculus in some axiomatisation (e.g. Kolmogorov's). BUT with the following additional historical exposition, namely that there have been two different interpretations of the subject of the 'probability' predicate, namely its ontic interpretation that it is a predicate of events or attributes (e.g. Popper's propensity interpretation or von Mises' relative frequency interpretation), and on the other hand in its propositional interpretations that it is a predicate of propositions OR of some hierarchy of such predicates whose ultimate subject is propositions e.g. in subjective 'epistemic' probability the term 'probability' is defined as 'the STRENGTH OF BELIEF in the TRUTH of a proposition', which involves the predicate 'TRUE' and also the further predicate of a predicate, namely 'STRENGTH OF BELIEF' that is itself a predicate of the propositional predicate 'TRUE' (i.e. P(p) = B(T(p)), where p = any proposition, T(p) = TRUTH of a proposition, B(T) = STRENGTH OF BELIEF in T, and P(p) = Probability of p).
- THEN SECONDLY, the various different conceptions of probability may be defined within this overall pedagogical framework. For example, epistemic probability, based on the notion that knowledge or episteme is TRUE propositions, may be defined as a conception of 'probability' whose ultimate subject is the truth of propositions (h), such as 'P(h) =df the rational degree of certainty in the truth of h' (the classical conception of Bernoulli and Laplace) OR 'P(h) =df the likeliness that h is true' (objective epistemic probability e.g. the likeliness that 'the coin is heads' is true is 0.5) OR 'P(h) =df the strength of belief that h is true' (subjective epistemic probability). This latter conception of probability is what the literature (and this article) tends to identify as 'Bayesian probability', if somewhat equivocally.
- However, where the probability of scientific hypotheses is concerned, anti-epistemic instrumentalist and utilitarian pragmatist philosophies of science maintain such hypotheses are neither true nor false and thus without epistemic value, but only instruments of prediction, or else maintain that their scientific value is just their pragmatic utility for such functions as making predictions, rather than whatever epistemic value they may have. Thus such philosophies engender non-epistemic propositional probability in philosophy of science, such as utilitarian predictive propositional probability, where 'P(h) =df the likeliness that h is useful for predicting novel facts' or its subjectivist counterpart 'P(h) =df the strength of belief that h is useful for predicting novel facts'. (This latter conception is the 'non-Bayesian' (i.e. non-epistemic) conception of probability advocated by such as Bill Jefferys on this Talk page, for example, but it remains to demonstrate such conceptions obey the axioms of the Kolmogorov probability calculus. Note how the fact that Jefferys regards himself as a 'Bayesian probabilist' arguably demonstrates that 'Bayesianism' cannot be identified with subjective epistemic probability which Jefferys rejects. The interesting question here is why Jefferys regards himself as a Bayesian in any more significant sense than that he uses Bayes's Theorem, and which probabilists he thinks are not Bayesians.)
- Thus on this hopefully less confused and confusing framework for conceptualising the different interpretations of 'probability', the category 'Bayesian probability' drops out of the picture as a non-category inasmuch as it is just 'subjective epistemic probability' rather than some additional or different independent category of probability. --Logicus 18:16, 3 October 2007 (UTC)
- BenE on Kolmogorov`s axioms
- "For certainly inasmuch as Bayes' Theorem is a logical consequence of the Kolmogorov axiomatisation of the ontic conditional probability calculus for events and also of its epistemic counterpart for propositions, then ALL probability is Bayesian. So what is the specificity of the conception of probability referred to by 'Bayesian probability' ?"-Logicus
- I think it's the other way around. Bayesians see Bayes' rule as more general than Kolmogorov's axioms: a mathematical generalisation of Aristotelian logic. [See Cox's theorem] (or buy the book on Amazon). Jaynes has this to say about Kolmogorov (for more, see appendix A from the previous link):
- "[Kolmogorov's ] system of probability could hardly be more different from ours in general viewpoint and motivation; yet the final results are identical in several respects. [describes kolmogorov's system and axioms]. But as noted in Chapter 2, a proposition A referring to the real world cannot always be viewed as a disjunction of elementary propositions omega_i from any set OMEGA that has a meaning in the context of our problem; and its denial A_bar may be even harder to interpret as set complementation. The attempt to replace logical operations on the propositions A, B, ... by set operations on the set OMEGA does not change the abstract structure of the theory, but it makes it less general in respects that can matter in applications. Therefore we have sought to formulate probability theory in the wider sense of an extension of Aristotelian logic.[...] Finally, [The axioms] of the probability measure P were stated by Kolmogorov as seemingly arbitrary axioms; and Kolmogorov's system of probability has been criticised for that arbitrariness. But we recognize them as statements, in the context of sets, of just the four properties that we derived in Chapter 2 from requirements of consistency. [...] For all practical purpose, then, our system will agree with [Kolmogorov's] if we are applying it in the set-theory context. But in more general applications, although we have a field of discourse F and probability measure P on F with the same properties, we do not need, and do not always have, any set OMEGA of elementary propositions into which the elements of F can be resolved.[...] In summary, we see no substantive conflict between our system of probability and Kolmogorov's as far as it goes; rather, we have sought a deeper conceptual foundation which allows it to be extended to a wider class of applications, required by current problems of science."
- --BenE 00:15, 4 October 2007 (UTC)
- BenE on the choice of word Bayesian vs Subjective vs Objective
- I think many people assume that absolute truth about the universe is unreachable. Laplace starts his book about probability theory on this assumption (p.4): he describes ultimate knowledge (see Laplace's demon) but says it is always unreachable: "All these efforts in the search for truth tend to lead it back continually to the vast intelligence which we have just mentioned, but from which it will always remain infinitely removed." That it is possible for humans to reach purely epistemic and absolute truth is an illusion. Real truths about the universe are beyond the limit of our senses and cognition. What we can have instead are models which approximate the universe and allow us to make predictions about it. Science is our attempt at making these models as predictive and objective as possible. Bayesian probability theory has the same aim but is defined with more precision than scientific theory through the language of mathematical equations. It is the best way to make predictions as objectively as possible, bringing them as close as possible to the assumed real objective truths which can never be known absolutely. Of course we can reach truths by definition within languages, but these kinds of truths are only intermediate symbols used in the approximate models of reality.
- So is Bayesian probability theory subjective? Well it is in the sense that it is an approximation to reality, existing only as an approximation in someone's head, in a computer simulation, or in a mathematical equation on paper. But it also aims to be the thing which brings humans the closest they will ever be to objective absolute truth by using a criterion of consistency. Since it is the most objective knowledge we have about the universe, shouldn't we reserve the word 'objective' for its description? And since it aims to be as far as possible from the random psychological variations commonly called 'subjective', shouldn't we avoid using the word 'subjective'?
- In order to avoid this debate, we avoid the words 'objective' and 'subjective' completely and instead use the word 'Bayesian', which refers to the mathematical equation which Bayesians believe is (even if subjective in a certain sense of the word) the closest we can ever get to objectivity. We thus avoid splitting hairs on vague English definitions.--BenE 02:03, 4 October 2007 (UTC)
Logicus proposed edit: The qualifier “very broad” in the second sentence of the ‘History’ section is logically mistaken, since as defined here IN THIS ARTICLE ‘Bayesian probability’ as ‘degree of belief in the truth of a proposition’ is rather a very narrow interpretation of ‘probability’. I propose its deletion and replacement by ‘narrow specific’, so
“The term Bayesian, however, came into use only around 1950, and it is not clear that Bayes would have endorsed the very broad interpretation of probability that is associated with his name.”
BECOMES
'The term Bayesian, however, came into use only around 1950, and it is not clear that Bayes would have endorsed the narrow specific interpretation of probability that is associated with his name.' --Logicus 12:59, 5 October 2007 (UTC)
- Yeah, that's probably right, the definition has become more specific, not more general with time. Although the domain of application has probably become more general. Bayes probably never knew his equation could summarise the whole of probability theory.--BenE 14:59, 5 October 2007 (UTC)
- Logicus to BenE: Thanks for this reversion on your remarks of 4 October about the relative generality of Bayesian and of Kolmogorov probability.
- But how can an intensionally narrower conception possibly be broader in extension ? A proper subconception surely defines a proper subset, at least on standard logic and set theory.
- No no, what I meant is that the concept has been specified in meaning, but it is still more general than Kolmogorov's conception in that Kolmogorov's axioms only apply to sets, whereas Cox's theorem makes Bayesian probabilities apply to any propositions, including sets. --BenE 22:31, 11 October 2007 (UTC)
- I do not yet understand Jaynes' argument you quote that his conception of probability is deeper and wider than the Kolmogorov conception, whether or not it is what should be called a 'Bayesian conception', and since in the absence of any simple clear statement of Jaynes's conception of probability there is no reason to think it is, I therefore recommend deletion of the apparently 'mistaken' clause at least until and unless the contrary is satisfactorily proven.
--Logicus 18:10, 11 October 2007 (UTC)
Logicus to BenE: I would be grateful if you would desist from fragmenting my contributory discussions with your critical comments, lectures in philosophy and evangelical Jaynesian fundamentalist philosophy of science. I have reorganised two of your contributions that do this into their chronological date order to prevent them doing so. May I suggest you add to them the relevant quotations from myself or the propositions you think you are criticising, as appropriate. May I also request that before making edits of the article, you first post your proposed edits on the Talk page for critical discussion. --Logicus 18:13, 8 October 2007 (UTC)
Logicus deletes BenE: On 2 October BenE lectured iNic on his ‘error’ about ‘Bayesianism’ as follows:
"BenE to iNic: —Preceding unsigned comment added by BenE (talk • contribs) 14:11, 2 October 2007 (UTC)
"The law of gravity, for example, isn't increasingly more true for every observed apple that is dropped."
You misunderstand how Bayesian principles work. The law itself isn't more true. However our rational belief that the law is valid is strengthened every time we observe it. Bayesian probability is based on making the distinction between truth and rational belief about the truth. It is a measure of rational and objective belief not of truth."
However, the following emboldened editing of the article itself then appeared, which I suspect is BenE's doing, although I have not confirmed:
“Another interpretation is based on an extension of Aristotelian Logic. Bayesian probabilities are then viewed as a the best mechanism to compute the intermediate results between truth and falsehood. Cox's theorem proves it is a unique solution by using a criterion of consistency. Note that using this interpretation, the problem encountered with scientific hypothesis under the betting interpretation are easily solved by only allowing two or more alternative theories to be compared. [3]
[3] see Jaynes 2003: "For such hypothesis, Bayes' theorem tells us this: Unless the observed facts are absolutely impossible on hypothesis H0, it is meaningless to ask how much those facts tend 'in themselves' to confirm or refute H0. Not only the mathematics, but also our innate common sense (if we think about it for a moment) tell us that we have not asked any definite, well-posed question until we specify the possible alternatives to H0. Then, as we saw in Chapter 4, probability theory can tell us how our hypothesis fares relative to the alternatives that we have specified;..." "
But the second emboldened sentence (i) obviously contradicts what BenE said above to iNic on 2 October, that Bayesianism is not about degrees of truth but rather about degrees of belief about truth, (ii) is obviously false on the article’s current definition of Bayesian probability as ‘degree of belief in the truth of a proposition’, and (iii) is unsourced. I have therefore deleted the whole paragraph and put it here for discussion. But I fear it may simply reflect considerable conceptual and philosophical confusion, and be an unacceptable edit. I should also add I do not understand how the problem of the positive undecidability of universal hypotheses is solved by the convention of only allowing two or more hypotheses to be compared. Perhaps BenE could elucidate if it is his unsourced work, or quote the explanation of this solution given by some published source.--Logicus 18:10, 9 October 2007 (UTC)
- Logicus, please read this entry in Andrew Gelman's blog. Gelman is a highly respected Bayesian, and one of the authors of the standard and best available text on Bayesian statistics (amongst other things). You've been complaining about how peculiar you regard my approach towards Bayesianism; you'll find that Gelman's approach is very similar to mine. Bill Jefferys 14:27, 10 October 2007 (UTC)
"But the second emboldened sentence (i) obviously contradicts what BenE said above to iNic on 2 October,that Bayesianism is not about degrees of truth but rather about degrees of belief about truth, (ii) is obviously false on the article’s current definition of Bayesian probability as ‘degree of belief in the truth of a proposition’, and (iii) is unsourced."
BenE's response: How can you have a 'degree of truth that is not a belief?' When we talk about 'Degrees of truth' this automatically implies it is a belief as something can't be half true! This interpretation is made even more obvious by the context of the article and all the preceding talk about belief. Can you explain the distinction between degrees of truth and degrees of belief in truth? You seem to be stretching word meaning up to a point where they don't even carry semantics in order to mislead others into thinking my statements are contradictions. However, it wouldn't be hard to change it and spell everything out in detail, something like the following:
- "Bayesian probabilities are then viewed as a the best mechanism to compute the intermediate estimated results between truth and falsehood within one's belief system or model of reality."
I don't know which part you think lacks references. Apart from this citation from Jaynes, which I keep repeating and which also references Jeffreys(1939):
- "A practical difficulty of this was pointed out by Jeffreys (1939); there is not the slightest use in rejecting any hypothesis H0 unless we can do it in favor of some definite alternative H1 which better fits the facts.
- Of course, we are concerned here with hypotheses which are not themselves statements of observable fact. If the hypothesis H0 is merely that x < y, then a distinct, error-free measurement of x and y which confirms this inequality constitutes positive proof of the correctness of the hypothesis, independently of any alternatives. We are considering hypotheses which might be called 'scientific theories' in that they are suppositions about what is not observable directly; only some of their consequences - logical or causal - can be observed by us.
- For such hypothesis, Bayes' theorem tells us this: Unless the observed facts are absolutely impossible on hypothesis H0, it is meaningless to ask how much those facts tend 'in themselves' to confirm or refute H0. Not only the mathematics, but also our innate common sense (if we think about it for a moment) tell us that we have not asked any definite, well-posed question until we specify the possible alternatives to H0. Then, as we saw in Chapter 4, probability theory can tell us how our hypothesis fares relative to the alternatives that we have specified; it does not have the creative imagination to invent new hypotheses for us."
If you read the quote from the discussion about Kolmogorov's axioms:
- "Therefore we have sought to formulate probability theory in the wider sense of an extension of Aristotelian logic."-Jaynes
But also read the beginning of Chapter 1 here. You may choose any quotation you like in there and add it to the article.
If there is no objection from anybody else, I'll undo Logicus' changes and modify the quoted sentence to be more specific. --BenE 16:49, 10 October 2007 (UTC)
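The point in the Jaynes passage quoted above, that data only bear on H0 relative to specified alternatives, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `posterior`, the hypothesis labels, and the prior and likelihood values are invented and come from no source. It applies Bayes' theorem over an explicit, exhaustive set of alternatives and shows the same likelihood for H0 giving a low or a high posterior depending on what H0 is compared against.

```python
from fractions import Fraction


def posterior(priors: dict, likelihoods: dict) -> dict:
    """Bayes' theorem over an explicit set of mutually exclusive hypotheses.

    priors[h] is P(h); likelihoods[h] is P(data | h). The hypotheses listed
    are assumed to be the only alternatives under consideration."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("data are impossible under every listed hypothesis")
    return {h: j / evidence for h, j in joint.items()}


half = Fraction(1, 2)
# Same data, same P(data | H0) = 1/5, but two different alternatives.
against_h1 = posterior({"H0": half, "H1": half},
                       {"H0": Fraction(1, 5), "H1": Fraction(3, 5)})
against_h2 = posterior({"H0": half, "H2": half},
                       {"H0": Fraction(1, 5), "H2": Fraction(1, 20)})
print(against_h1["H0"], against_h2["H0"])
```

With these illustrative numbers the data count against H0 when the alternative fits them better and for H0 when the alternative fits them worse, although P(data | H0) is unchanged.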
Logicus to BenE: I make two observations on the following rantings of 20 September:
"BenE to Logicus: ... We can actually write down the equations and solve real world problems which is more than I can say for your views. You keep giving examples that are outside the bounds of a useful Bayesian theory. Your perspective is so pure epistemology to the extreme that everything becomes meaningless with regard to the real world. If I used your rational and applied it to simple algebra I would reject this theory on the basis that it can't divide by zero and thus isn't a consistent and complete epistemological system for numbers. Eventually my world-views in general would fall to complete nihilism.
....Maybe you are not expressing yourself very well. Maybe you have actually have a consistent alternative theory which you think is better than bayesianism for science. If so, and if it's a well known theory, please create an alternate Wikipedia page which describes this theory and its benefits over bayesianism as science. Then come back here and create a link to that page so that people are aware of the alternative. But coming here to punch down hypothetical straw men that live in a world isolated from reality while telling us you found Bayesianism's "fundamental problems" is a waste of everybody's time.--BenE 15:33, 20 September 2007 (UTC)"
In the first instance, on their illiteracy, I have never said I have "found Bayesianism's fundamental problems" nor 'punched down any hypothetical straw men', nor indeed even any real straw men, and nor have you documented that I have. Most of the views you rant against are not mine. In fact I have no idea against who or what specific proposition the ranting of most of your texts is directed. This may be largely because they are unintelligible. I have asked your admirer and fellow Jaynesian Jefferys to translate their proposition against me he claims to agree with into simple intelligible English, but he has failed to do. Perhaps your supporter Coppertwig can offer an intelligible English translation of your proposition(s) against mine that he claims to agree with.
Secondly, the issue here is one of philosophy of science in respect of theories of scientific reasoning, that is, theories of scientific conduct in the acceptance and rejection or evaluation of theories that can account for the last three millennia or more of the history of science. As well as also trying to introduce greater conceptual clarity about what the Bayesian conception of probability is as distinct from others into this article, a project to which you seem to contribute nothing but further confusion, the main issue I am dealing with here in respect of the issue of Bayesian philosophy of science, a specific application of Bayesian probability, is that of moderating and qualifying this article's highly biassed pro-Bayesian philosophy of science approach that fails to mention any of its many, many problems and is in severe breach of NPOV. You will no doubt appreciate the irony of the antithesis of your 12 May contribution that complained this article is anti-Bayesian, and based on a straw man.
But I am not here advocating any particular alternative philosophy of science in this article, since that seems inappropriate. Nor do I see putting in links to articles on non-Bayesian and indeed also non-probabilist philosophies of science as appropriate, but maybe I'm wrong about that. For your information, there are of course many, many alternatives to Bayesian philosophy of science, such as the non-probabilist theories of Bachelard, Duhem, Feyerabend, Kuhn, Popper, Lakatos, Hanson, Cohen, Putnam, Glymour, Cartwright, Galison, Suppes, etc. and some of which have Wikipedia articles expounding them if you wish to educate yourself in such, given you seem unaware of anything other than Jaynes's philosophy of science. And note that probabilist philosophy of science is never even mentioned in the Wikipedia article on philosophy of science. Each philosophy of science will of course determine its own account and reconstructions of the history of science and successes and failures to do so. Explaining the heliocentric revolution typically defeats all philosophies of science. (I recall many years ago in the 1970s a chap called Swinburne, who also believes Bayesianism proves the existence of God, failed to achieve a Bayesian reconstruction of it.)
Of particular interest to yourself as a Jaynesian fundamentalist might possibly be Lakatos's philosophy of science for the reason that, as you may be aware, Jaynes was an ardent admirer of Lakatos's fellow Hungarian friend Polya and his theories of non-demonstrative 'plausible reasoning' in mathematics also admired by Lakatos. Jaynes's probabilist theory of scientific method was his attempt to develop Polya's philosophy of mathematical discovery for the empirical sciences as I understand. But whereas Jaynes elected to develop a probabilist theory of scientific plausible reasoning, Lakatos, who believed the probability of all scientific laws to be zero, developed a non-probabilist dialectical theory of scientific method as expounded in his world famous Proofs and Refutations and The Methodology of Scientific Research Programmes. Thus you might like to consider Lakatos's philosophy of science as the antithetical alternative and antidote to Jaynes's probabilist philosophy within 'the Polya tradition' of 'plausible reasoning', and compare the relative achievements of their reconstructions of the history of science to see which you think gives the better account of its various episodes, such as 'the Copernican revolution', the success of Newton's theory of gravity, the Einstein revolution, and the various many historical case studies to be found in Method and Appraisal in the Physical Sciences Howson (ed) and elsewhere.
I hope this wider education in philosophy of science might reduce your posting your educational growing pains in the subject on this page.--Logicus 18:10, 11 October 2007 (UTC)
Logicus to BenE
You said on 10 October:
"BenE's response
How can you have a 'degree of truth that is not a belief?' When we talk about 'Degrees of truth' this automatically implies it is a belief as something can't be half true! This interpretation is made even more obvious by the context of the article and all the preceding talk about belief. Can you explain the distinction between degrees of truth and degrees of belief in truth? You seem to be stretching word meaning up to a point where they don't even carry semantics in order to mislead others into thinking my statements are contradictions."
Rather than revealing my stupidity, as you imply, these comments rather reveal your ignorance of the philosophy of science, the subject on which you elect to regale these pages with your quasi-Jaynesian views on truth, science and the universe. For they reveal that you are unaware of the concept and theories of verisimilitude, that is, of the notion of degrees of the truth-likeness or of the truth-content of a false proposition, whereby one false proposition may be said to be nearer the truth or more truthful or truthlike than another. An intuitive informal answer to your question of how a proposition can be half true is easy. Consider a beast that is half-woman and half-horse. It would be a half-truth to say it is a woman, or to say it is a horse. The notion of 'half truths' is common in everyday parlance, and the stock in trade of politicians. More technically the Finnish philosopher of science Niiniluoto does audits of theories of verisimilitude if you want to get technical and into verisimilitude metrics and degrees of truth between 1 and 0. Here I just give you some simple possible examples of theories of verisimilitude that your astronomer supporter Jefferys may like.
Why was Kepler's 1609 Astronomia Nova theory that the orbits of the 6 known planets and their 5 known satellites were elliptical not accepted, nor its celestial dynamics ? As Curtis Wilson 1989 reports, at the time only the orbits of Mars and Mercury were detectably non-circular. So if verisimilitude were the ratio of unfalsified cases to all cases, then the ellipse hypothesis has verisimilitude 2/11 = 0.182 and a circular orbits hypothesis 9/11 = 0.818. If scientists reject theories they believe to have verisimilitude less than 0.5, then this would explain why Kepler's theory was rejected. And why was Newton's gravitational celestial dynamics not accepted for so long ? Well if verisimilitude were the ratio of unrefuted predictions to total predictions, Newton's were mostly refuted, and thus scientists holding such a theory of verisimilitude would have accredited it very low verisimilitude. Don't attach any great significance to these examples or assume I am advocating such theories; they are illustrating possibilities that refute your assumption of the impossibility of degrees of truth. One of the virtues of verisimilitude is that a series of false theories, thus all with probability 0, may have increasing verisimilitude, which avoids the problem of all theories of confirmation such as probabilism that assign value zero to refuted theories, given scientific theories are always refuted, i.e. have counterexamples or anomalies, and thus exclude crediting positive scientific progress to a series of false theories, e.g. the historical series consisting of Aristotle's, Kepler's, Newton's and Einstein's celestial dynamics --Logicus 18:16, 11 October 2007 (UTC)
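The arithmetic in the Kepler example above can be checked with a short sketch. It takes the toy definition at face value (verisimilitude as the ratio of unfalsified cases to all cases) together with the counts given there (11 bodies, 2 with detectably non-circular orbits) and the 0.5 rejection threshold; the function name `unfalsified_ratio` and the variable names are hypothetical, and no claim is made that this is a metric anyone actually defends.

```python
from fractions import Fraction


def unfalsified_ratio(unfalsified: int, total: int) -> Fraction:
    """Toy verisimilitude score: unfalsified cases divided by all cases."""
    if total <= 0 or not 0 <= unfalsified <= total:
        raise ValueError("need 0 <= unfalsified <= total and total > 0")
    return Fraction(unfalsified, total)


bodies = 6 + 5            # known planets plus known satellites
non_circular = 2          # Mars and Mercury, per the example

ellipse_score = unfalsified_ratio(non_circular, bodies)
circle_score = unfalsified_ratio(bodies - non_circular, bodies)
threshold = Fraction(1, 2)

print(round(float(ellipse_score), 3), ellipse_score < threshold)
print(round(float(circle_score), 3), circle_score < threshold)
```

On these assumptions the ellipse hypothesis falls below the threshold and the circular hypothesis above it, matching the figures quoted in the comment.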
BenE's brief comments you said:
- "I have never said I have "found Bayesianism's fundamental problems"
wasn't it you who wrote in the article?:
- "This problem of the Bayesian philosophy of probability becomes a fundamental problem for the Bayesian philosophy of science"
[Logicus comments: BenE you reveal a serious literacy problem here. What I am denying is that I ever said it was ME who found the problems. It was EdJohnston who attributed the discovery of the fallibilist problem to me. Flattering maybe, but not true so far as I am aware.--Logicus 19:46, 13 October 2007 (UTC)]
It's ironic you mention heliocentrism as Jaynes illustrated the method we should use to evaluate Bayesian theory by recounting how Galileo himself came to be accepted in this paper [Logicus comments: What proposition of Galileo's might it be that Jaynes thinks came to be accepted? I hope not his radically mistaken theory of gravitational free-fall that was never accepted, but which positivist fairy-tales claim was? Or his heliocentric celestial dynamics and kinematics, contra Tycho's geocentric astronomy, that never was accepted. The trouble with Bayesian history of science is that, being a variant of logical positivism, it usually reproduces positivist fairy tales about history of science. --Logicus 19:46, 13 October 2007 (UTC)]
Verisimilitude is not a degree of truth any more than a degree of belief is. It is rather a degree of similarity to truth. In fact, one wonders if it is not just an unsuccessful attempt at creating something not unlike logical Bayesianism. The Bayesian 'degree of belief', by its claim of being the best approximation to truth (as a belief theory or model), would be a good candidate to fill the role of making optimally "verisimilar" theories.
- "Consider a beast that is half-woman and half-horse. It would be half-truth to say it is a woman, or to say it is a horse."
Here, we are getting away from reality and science. "Woman" or "horse" are only concepts, categories or models existing only in our heads or on paper; they are not physical truths. They are only intermediate values used to assert beliefs about the object they categorise. "Horse" by itself doesn't represent anything real. I have been relentless about this: Bayesianism as a science only applies to propositions about real physical things, real physical observable truths. There is never scientific research meant to discover categories of things; rather, we discover properties and use these properties to create useful categories based on similarity. As Aristotle said in Categories: "In the case of secondary substances, one would judge from the form of the appellation that a particular thing was being indicated when one said "man" or "animal". But this is not true; secondary substances, rather, indicate some quality."
Again, when I say truth on this page, I mean physical observable truth or rather truth of propositions that say things about reality. I think this is implicit when we talk about science.
I thus reiterate that, at least for propositions about reality, there can't be half truths. Laplace realised this, which is why he asserted a unique (and unknowable) true universe as a presupposition for his probability theory (on page 3 of a more than 800 page volume!). --BenE 00:18, 12 October 2007 (UTC)
Logicus on The problems of Bayesian Philosophy of Science as distinct from those of the Bayesian Philosophy of Probability
Further to my comments of 11 October above, they do not really belong to this section on the concepts of probability. Maybe some confusion has arisen because BenE's critical comments of 20 September in this section commenced with his assertion that he held the same views as Bill Jefferys. But Jefferys' views and my debate with him concerned the philosophy of science, that is, the nature of scientific reasoning, and whether it is Bayesian probabilist or not. Thus it was reasonable to interpret BenE's statements as about the same issue, rather than about the philosophy of probability, which might be the issue he actually had in mind, even if unclear from his comments. As I have pointed out before to try and clarify this crucial distinction, one may hold a Bayesian philosophy of probability, but be a vehement anti-probabilist and thus anti-Bayesian in the philosophy of science, regarding it as utterly absurd that scientists' beliefs in the truth of theories obey the probability calculus or are even logically consistent.
Anyway, my point here is that the above discussion does not really belong to this section, but rather to a section on the problems of Bayesian philosophy of science, that is, the thesis that scientific reasoning is probabilist and Bayesian, not to be confused with theories about what is the best interpretation of the notion 'probability'. And so I copy it to another section devoted specifically to discussing the problems of Bayesian philosophy of science. This, for example, is the appropriate place to discuss whether the belief that all scientific laws are false, whereby they must be assigned probability zero when 'probability' is defined as 'strength of belief a proposition is true', is recognised as posing a fundamental problem for Bayesian and probabilist theories of scientific reasoning. Or what episodes in the history of science are recognised as being successfully accounted for by a Bayesian probabilist theory of scientific reasoning, such as 'the Copernican revolution', 'the anti-Cartesian Newtonian revolution', 'the Einsteinian revolution'.--Logicus 16:12, 12 October 2007 (UTC)
The confusion of Jefferys & BenE on Bayesian probability
From the 19 Sept and 13 Oct contributions of Jefferys and those of BenE of 20 Sept and 10 Oct, it seems that the conception of probability they have in mind is what some authors call 'logical' or 'objective Bayesian' probability, as distinct from the 'subjective probability' or 'subjective Bayesian probability' that this article identifies with 'Bayesian probability' in its opening sentence. It is the latter that Logicus has been employing in his analyses, but suggested might be a category mistake in his constructive analysis of 3 October. This would explain why Jefferys thinks that if there is only one hypothesis about some domain, it must have probability 1, even though not a tautology, because in this conception of probability, when there are n hypotheses it sets priors for each as 1/n. Thus it seems the gross unfamiliarity of Jefferys and BenE with the literature on Bayesian probability has caused much timewasting rather than getting on with the constructive business of conceptual clarification of definitions of probability this article requires.--Logicus 18:13, 18 October 2007 (UTC)
- You are right, I do mean the logical interpretation. Isn't this the most popular meaning for the word Bayesian? Who are the proponents of this subjective interpretation? I don't think there are many. Also I don't think the authors like Harold Jeffreys, Rudolf Carnap, Richard Cox and E.T. Jaynes used the word "logical" or "objective epistemic", they simply called themselves Bayesians.--BenE 23:03, 18 October 2007 (UTC)
Logicus to BenE on SUBJECTIVE probability: So at least it seems we have cleared up one fundamental misunderstanding. It was evident from your two page very first intervention of 12 May that something was radically wrong in your reading of this article as anti-Bayesian, and it has now become clear that this was because you identify Bayesianism with 'objective' Bayesianism such as of the Jaynesian ilk, and apparently wholly ignore 'subjective' or 'personal' Bayesianism, in terms of which this article defines 'Bayesian probability'. Thus it seems it is rather yourself who has been attacking a straw man in a case of mistaken identity, attacking subjective probability as though it were objective probability.
I think a crucial distinction between the two is that 'objective' probability employs the metaphysical Principle of Indifference for assigning prior probabilities to mutually exclusive and logically exhaustive hypotheses, whereas 'subjective' probability does not, on the ground that it is pseudo-objective. Instead they are assigned in accordance with the subjective strength of belief of individual subjects that the proposition is true, just as the definition states, but which you have apparently misread somehow as being part-determined by observance of the Principle of Indifference in assigning the priors, rather than wholly by unregulated subjective belief. Of course, this arguably overcomes one objection made to objective Bayesianism, namely the argument that if the Principle of Indifference assigns prior equiprobabilities to all mutually exclusive possible competing hypotheses to explain some data, then, since it is a logical triviality that infinitely many such competing hypotheses are always possible, their prior probabilities must all be zero. (I do not presume the validity of this objection here.)
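As an illustration only, the Principle of Indifference rule discussed above (a prior of 1/n for each of n mutually exclusive and exhaustive hypotheses) can be sketched as follows. The function name is hypothetical, and the sketch assumes a finite n; the "infinitely many hypotheses" objection corresponds to what happens to 1/n as n grows without bound.

```python
from fractions import Fraction


def indifference_priors(n: int) -> list[Fraction]:
    """Equal priors over n mutually exclusive, exhaustive hypotheses."""
    if n < 1:
        raise ValueError("need at least one hypothesis")
    return [Fraction(1, n)] * n


for n in (1, 2, 10, 1000):
    priors = indifference_priors(n)
    # The priors always sum to 1, whatever n is.
    assert sum(priors) == 1
    print(f"n = {n:>4}: prior per hypothesis = {priors[0]}")
```

With a single hypothesis the rule gives prior 1, and each prior shrinks toward zero as more competing hypotheses are admitted, which is the limit the objection appeals to.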
If you wish to learn something about distinctly SUBJECTIVE Bayesian probabilist philosophy of science, perhaps you should try the Howson & Urbach 1993 textbook (second edition) cited in the references that has arguably most promoted its radical growth in university philosophy of science over the last decade. It is especially recommended for its decisive crisp refutations of the naive philosophical objectivist dogmas of Jaynes's 'objective probability' and his disciples such as Rosenkrantz. However, if you are logically astute in critical proof-analysis, you should also spot the farcical fatal flaw in the whole book's eliminative proof of its main concluding thesis that subjective Bayesianism is the only theory capable of putting inductive inference on a sound foundation [p438]. It is that all other methods are eliminated as unsound on the meta-criterion that any inductive method that makes evidentially a priori assumptions about the nature of the world, such as the Principle of Indifference does for example, is thereby fatally unsound [e.g. p69-70]. But of course the subjective Bayesianism method itself quintessentially does precisely that in its subjectivist rule for assigning prior probabilities according to the subject's beliefs about the world, and so is also fatally unsound. What Howson & Urbach's book really proves is that all inductive methods they consider are unsound on their negative criterion of sound induction, including, and arguably especially, subjective Bayesianism. 
You should also note that in their desperate attempt to escape the fatal flaw in their subjective Bayesianism posed by the popular radical fallibilist philosophy of science that all scientific laws are false, whereby they must be assigned prior probability zero [p394-5], Howson & Urbach most amusingly only cite Einstein as an example of a scientist who believed their theory to be true, amusing because they quote Einstein's statement that he would still have believed his GTR was correct even if the 1919 eclipse experiment had refuted his prediction. Thus in fact on this testimony Einstein was an example of scientists' reasoning not being in accordance with Bayesian probabilist philosophy of science, according to which scientists adjust the strength of their beliefs in the truth of their theories in response to confirming or refuting evidence, or in the words of this article, the scientific method "requires one to start with initial beliefs about different hypotheses, to collect new information (for example by conducting an experiment), and then to adjust the original beliefs in the light of new information." and according to Bayesian philosophy of science this evidential belief adjustment is done according to Bayesian inference [See 'Applications']. But it seems Einstein had complete certainty in the truth of his GTR and was not prepared to change his opinion in the light of refuting evidence. And moreover to boot, on the testimony of Earman and Glymour 1980, it seems the best outcome of Eddington's celebrated 1919 eclipse experiments of a 1.98" deflection did in fact refute Einstein's prediction of a 1.74" gravitational deflection, as well as the Newtonian 0.87" prediction, thus apparently providing an example of Feyerabend's thesis that the practice of modern physics in (mis)interpreting experimental refutations as confirmations confounds all logical positivist confirmation theory. [p65 Against Method]
It should also be noted that subjective Bayesianism is confounded by instrumentalist philosophy of science according to which scientific laws are neither true nor false, but only instruments of prediction. Clearly in this case there can be no strength of belief a law is true.
As for your question of what is the most popular meaning of the word 'Bayesian', here I have no idea and answering this socio-semantic question would require extensive global social surveys. But with respect to the philosophy of science issue, it seems plausible that the 1990s Bayesian publication explosion from 200 articles per annum in the 1980s to 1400 by 2000 reported in the 2006 third edition of Howson & Urbach (pxi, taken from Williamson & Corfield 2001) was largely of the subjective variety. And certainly subjective Bayesianism now seems extremely popular if not the dominant theory of scientific reasoning in philosophy of science. But it would be interesting to find out exactly what the quantitative facts are.
As for your question of who the proponents of the subjective interpretation of probability are, I understand they also include such as Ramsey, de Finetti, Savage, Lindley, Hacking, Swinburne, Skyrms etc.
Your observation that Jeffreys, Carnap, Cox and Jaynes called themselves Bayesians rather than 'logical' or 'objective epistemic' Bayesians not only raises the issue yet again that this article's definition of 'Bayesian probability' as the subjectivist variety may be too narrow, which seems to be the main virtue of your contributions, but also the problem of how it can possibly be less wide than all probability theory. However, I hope you will find that a key distinguishing principle between 'objective' and 'subjective' Bayesian probability such as I have suggested, namely use of some form of the Principle of Indifference or not to determine priors, is historically adequate in the case of the aforenamed individuals and indeed more widely, although it is not a suggestion I have time to test exhaustively.
- I can agree with a lot of what you said, however I don't think the article as written currently is only about the subjective version of bayesianism, in fact most sections ("Controversy between Bayesian and Frequentist Probability","Applications","Probabilities of probabilities") describe the objective epistemic, logical interpretation. Also you seem to put a lot of emphasis on the philosophy of science issues when the article doesn't even mention philosophy of science. I think most people recognize that there are issues to be resolved before Bayesianism becomes accepted as an all encompassing philosophy of science. I personally think that it is a good candidate (at least the objective epistemic kind), but I wouldn't write this in wikipedia as it is more of an opinion than encyclopedic knowledge. However if you want to add a section at the end of the article entitled "Bayesian [subjective or logical] probability theory as a philosophy of science" along with a criticism subsection, I have no objection.--BenE 22:45, 24 October 2007 (UTC)
Gillies on only fair betting quotient
I have a suggestion: first of all, could Logicus please provide a quote from Gillies about the betting quotient and universal hypotheses. (Sorry if you already did. This page is rather long.) The quote is probably just for us to look at on this talk page. Then, my suggestion is that we have the article say something like "Gillies states that ...", so that it's attributed to Gillies, not to Wikipedia. I still have a number of problems with the statement [3] about the universal hypothesis. How does one win a bet -- by convincing a human judge of the truth of a proposition? Or by it being objectively true? Or what? and, for whom is it fair? It seems to me that it's unfair for the other player, who has to put some money down but may never get it back. Those are on top of the concerns I've raised already, which I'm not convinced have been adequately addressed. --Coppertwig 23:10, 31 August 2007 (UTC)
Logicus to Coppertwig: I don't understand your problem here. There is no attribution to Wikipedia, rather Wikipedia is just reporting a recognised problem in the literature, and gives references for such. And your own original research views or problems with this problem are simply not relevant here. There are far more dubious wholly unsourced claims in this article to pick on than Logicus's genuine additions. But I will look for a quote for you. By the way, there is also the addition about the problem of fallibilism to come. --Logicus 14:53, 1 September 2007 (UTC)
- I appreciate your taking the time to look for a quote for me. If you wish to challenge unsourced claims, I believe the usual method is to put {{fact}} after them in the article, which makes a footnote like this [citation needed], and then wait a period of time -- I think several weeks at least is usually expected -- and then delete them if nobody has provided sources. --Coppertwig 17:21, 1 September 2007 (UTC)
- Coppertwig, if you really must put something from Gillies in the article, I suggest the following footnote: "e.g. see Gillies 2000, p55: "My own view is that betting does give a reasonable measure of the strength of a belief in many cases, but not in all. In particular, betting cannot be used to measure the strength of someone's belief in a universal scientific law or theory." " as a footnote to the very first sentence of my addition.--Logicus 18:09, 4 September 2007 (UTC)
Logicus, why is Gillies' personal view on betting odds, cited from a 2000 publication, relevant to the history of Bayesian probability? If this objection is so critical to Bayesianism's history, surely there would be some citation in an older publication by a more prominent author. -- 158.83.15.85 14:33, 18 September 2007 (UTC)
- Logicus to 158.83.15.85: Thank you for this anonymous mistaken comment. Please note that contrary to your assumption, I have never claimed Gillies' views on betting odds are relevant to "the history of Bayesian probability" nor to "Bayesianism's history", nor that his objection is critical to Bayesianism's history as you variously suggest. The point at issue here is neither about 'Bayesian probability' nor about 'Bayesianism', and nor indeed about their histories. Rather it is ONLY about the Bayesian PHILOSOPHY OF SCIENCE, that is, a specific APPLICATION of Bayesian epistemic probability to the specific domain of scientists' beliefs and reasoning about scientific propositions, an application listed amongst other such specific applications as 'e-mail spam filtering' in that specific section of the article called 'Applications'. As I understand it, this application to philosophy of science is a relatively novel application of Bayesianism that only really took off in the 1990s. SO I REPEAT, contrary to what you and some other Wiki editors such as Coppertwig, Jefferys, BenE etc sometimes mistakenly presume, the subject at issue here is not the much older topic of 'Bayesian probability', that is, a specific interpretation of the meaning of 'probability' in the probability calculus to mean 'strength of belief that a proposition is true', but rather only about the specific application of that general interpretation of probability and the probability calculus to the domain of scientists' beliefs about the propositions of science to try and explain such as their acceptance and rejection of them.
- As for Gillies, his view that the probability of all universal laws is zero is relevant to Bayesian PHILOSOPHY OF SCIENCE at least because (i) he is a professional academic philosopher of science (ii) he was a Cambridge double first maths wrangler, (iii) was a PhD student in the philosophy of probability of one of the most brilliant philosophers of science and of maths of the 20th century, Imre Lakatos (who also incidentally maintained the probability of all scientific laws is zero), (iv) is an ex President of the British Society for the Philosophy of Science and (v) part of his 2000 book on 'Philosophical theories of probability' does deal with Bayesian philosophy of science, which only took off in the previous decade.
- As for your surmise, when appropriately re-interpreted, that if this objection is critical to Bayesian philosophy of science then "surely there would be some citation in an older publication by a more prominent author.", it is indeed correct, as you might have discovered yourself had you bothered to read the article's footnote references for this objection and the literature listed in the article's References and done some thinking before putting pen to paper, or rather fingers to keyboard. For the saleswise more prominent authors Howson & Urbach discussed it on pages 72 and 263-4 in their 1989 'Scientific Reasoning: The Bayesian Approach' listed in the article's References , as mentioned in the article's footnote. Gillies' specific views on this objection were put into the article simply because it was specifically his views Coppertwig requested I provide, as you will see from the heading of this particular Talk section, although I have no idea why Coppertwig picked on Gillies. And nor, I suspect, does Coppertwig.
- If you wish to trawl through the literature in the article's References for further earlier citations of this objection, please feel free to do so. But please note that whether or not particular authors agree or disagree about the validity of this objection is irrelevant to the issue of correcting this article's highly biassed pro-Bayesian viewpoint that signally fails to mention hardly any of the many problems of Bayesianism and Bayesian philosophy of science by at least mentioning some of them, and thus giving it a somewhat more NPOV. --Logicus 14:46, 23 September 2007 (UTC)
- I can't easily look through the literature. For example, I checked the local public library for the Gillies book and it doesn't have it. However, I have several requests. I think you misunderstood an earlier request I made, and that request still stands. I was not asking for a quote from Gillies for the purpose of inserting the quote into the article. Rather, because you want to insert into the article a statement about the only fair betting quotient in certain circumstances being zero, and since I can't easily check the reference you attached to the statement, I asked you to present a quote on this talk page as a substitute for me looking in the book myself. Different people interpret things differently, so I wanted to check that whatever in that book you're interpreting as supporting that statement would also be interpreted by myself (and others) as supporting that statement. Although you've provided a quote from Gillies, it does not, in my opinion, indicate that Gillies believes that the only fair betting quotient in some particular situation is zero; for example, it does not include the words "fair" or "zero" or synonyms of them in my opinion.
- I would like to ask you, Logicus, to do six things: each of the three following requests applied to each of the two following statements you want to insert into the article: the statement about the only fair betting quotient in certain circumstances being zero, and the statement about the probability of scientific laws being zero. For each of these two, would you please:
- Since you presumably have the books at hand and I may not easily be able to access them, would you please provide on this talk page for the convenience of myself and other editors here a quote from the book that makes the statement so that we can verify that, in our opinion, the statement made in the book is essentially the same as the statement being made here. (I'm not proposing that the quote be included in the article. Possibly I or someone else might later propose including the quote in the article, but that is not the purpose of this request.)
- Please explain the relevance of the statement to "Bayesian probability", the topic of this article.
- When inserting the statement in the article, rather than asserting the statement, assert that a certain book has asserted it, perhaps like this: "Gillies (2000) states that ...".
- Thank you for considering my requests. Of course you don't have to do them, but doing them successfully may lessen my opposition to the insertion of those statements.
- Since Gillies is not Bayes, I wonder what the relevance of Gillies' opinion is here. Similarly for Popper. Perhaps statements by these people can only be considered relevant to this article if they mention "Bayes" or "Bayesian" in the context. --Coppertwig 15:32, 23 September 2007 (UTC)
- I'm not convinced that the idea about the only fair betting quotient being zero is a previously-published idea, so I've edited it. Also, it may only be Gillies' opinion that there is a problem, so this page should not state that there is a problem -- maybe state that Gillies says there is a problem, or (as in my edit) that there may be a problem. Perhaps some of the sentences that followed it also need to be deleted or modified for similar reasons. --Coppertwig 01:14, 1 October 2007 (UTC)
- I'm with you here.--BenE 14:16, 1 October 2007 (UTC)
- Logicus to Coppertwig of 1 October: Would you please restore the original text before your edit of 1 October, and instead post your proposed change and its justification here on the Talk page first for critical discussion before any implementation. Also note your edit introduces a fatal omission. Let us see if you can discover what it is by yourself.
- Also note your view that the only fair betting quotient is zero was not previously published before Gillies is mistaken. It was also in Howson & Urbach 1989, the basic teaching text in 'Bayesian' philosophy of science listed in the article’s references, as I pointed out. Please pay attention and stop intervening in issues in a literature and subject, philosophy of science, you are patently unfamiliar with and not competent in. Your personal problems of lack of access to the basic literature are surely sufficient to debar you from commenting. Why should anybody suffer the burden of convincing you of anything ? Are you an employed editor of Wikipedia ?
- However, I should say I am not wholly opposed to the spirit of your proposal of only saying some people say there is a problem, and will consider a modification. --Logicus 12:43, 5 October 2007 (UTC)
- As it seems Gillies is not a Bayesian, and that you are only quoting a general encyclopaedia, this literature has little authority on the subject. I don't have the encyclopaedia here but one of the quotes you put on the page seems patently false:
- See p50-1, Gillies 2000 "The subjective theory of probability was discovered independently and at about the same time by Frank Ramsey in Cambridge and Bruno de Finetti in Italy."
I removed the following, which is in any case too specific. It might go on a page about De Finetti since it only applies to his specific flavor of Bayesianism.
- "This problem of the Bayesian philosophy of probability becomes a fundamental problem for the Bayesian philosophy of science that scientific reasoning is subjective Bayesian probabilist, which thereby seeks to reduce scientific method to gambling, but some regard it as solvable.[1] But it is also noteworthy that by 1981 De Finetti himself came to reject the betting conception of probability.[2]"--BenE 15:45, 5 October 2007 (UTC)
Logicus to Coppertwig of 1 October: How about the following quote as evidence of recognition in the literature that the positive undecidability of universal hypotheses poses a fundamental problem for the degree of belief as betting-quotients interpretation of subjective probability ?
"...critics [of the standard Dutch Book argument] have not been slow to point out that the postulate that degrees of belief entail willingness to bet at the odds based on them is vulnerable to some telling objections. One is that there are hypotheses for which the wise choice of odds bears no relation to your real degree of belief: if 'h' is an unrestricted universal hypothesis over an infinite domain, for example, then while it may in certain circumstances be possible to falsify 'h', it is not possible to verify it. Thus the only sensible practical betting quotient to nominate on 'h' is 0; for you could never gain anything if your betting quotient was positive and 'h' was true, whilst you would lose if 'h' turned out to be false. Yet you might well believe that 'h' stands a non-zero chance of being true. " [p90. Howson & Urbach 1993]
Please don't come back with the standard positivist rap about omniscient oracles as supposedly solving this problem, which is logically irrelevant to the issue of whether it is recognised in the literature as a fundamental problem requiring solution, whether or not you personally believe such alleged solutions are valid or invalid.--Logicus 19:01, 13 October 2007 (UTC)
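A rough sketch of the betting argument in the Howson & Urbach passage quoted above may help those following this thread. The payoff convention here is an assumption made for illustration, not taken from the book: the bettor risks q × stake, loses it if the hypothesis is falsified, and realises nothing if the bet is never settled because a universal hypothesis over an infinite domain cannot be verified. The names `net_gain` and `OUTCOMES` are hypothetical.

```python
# The only two outcomes available for an unverifiable universal hypothesis:
# it is eventually falsified, or the bet is simply never settled.
OUTCOMES = ("falsified", "unsettled")


def net_gain(q: float, stake: float, outcome: str) -> float:
    """Bettor's realised gain at betting quotient q (assumed convention)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("betting quotient must lie in [0, 1]")
    if outcome == "falsified":
        return -q * stake   # the bettor forfeits the amount risked
    if outcome == "unsettled":
        return 0.0          # never verified, so winnings are never paid
    raise ValueError(f"unknown outcome: {outcome}")


for q in (0.0, 0.1, 0.5, 0.9):
    gains = [net_gain(q, 100.0, o) for o in OUTCOMES]
    print(f"q = {q}: best case {max(gains)}, worst case {min(gains)}")
```

Under this convention no positive quotient can ever yield a gain, while every positive quotient risks a loss, which is the sense in which the quoted passage calls 0 the only sensible practical betting quotient. Whether that bears on the bettor's actual degree of belief is the point in dispute, and the sketch takes no side on it.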
The problem of fallibilist philosophy of science for epistemic Bayesian probabilist philosophy of science
The following is proposed as an addition to the second paragraph of the ‘Applications’ section of the article. Its relative length is apparently required to overcome the difficulty some Wikipedia editors have in understanding the point at issue and their objections.
However a fundamental problem for all probabilist philosophy of science is posed by radical fallibilist philosophy of science which maintains all scientific laws are false and will be refuted and replaced by hopefully better false laws that will in turn be refuted and revised again and so on ad infinitum in a potentially endless series of false laws.F1 For insofar as scientists believe this radical fallibilist philosophy, as it seems most do nowadays F2, then according to the canons of the subjectivist Bayesian method according to which probabilities are assigned to propositions in proportion to strength of belief in their truth, they must therefore assign zero prior probability to all scientific laws since they believe them to be false.F3 But this would render probabilist epistemology practically inoperable, since by Bayes' Theorem all evidentially posterior probabilities must therefore also be zero, thus putting all laws on an epistemic par and so eliminating any way of choosing between them epistemically within probabilist epistemology.F4 Thus philosophers of science who maintain scientific reasoning is consistently probabilist must deny most scientists are radical fallibilists, or at the very least show some scientists believe their theories are true, or at least not definitely false, in order for their probabilist theory of scientific reasoning to have any valid domain whatever.F5
F1 [As Duhem expressed the key tenet of this philosophy: "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn." p177, Duhem's The Aim and Structure of Physical Theory, Atheneum 1962]
F2 [Even Bayesian statistician George Box has admitted: "All models are wrong; but some models are useful." (Bill Jefferys, would you please kindly provide the reference here for this Box quotation you gave?)]
F3 [This is because a hypothesis that is refuted and thus falsified must be assigned zero probability in Bayesian epistemic probability theory: 'If a hypothesis h entails a consequence e, then P(h / ~ e) = 0. Interpreted in the Bayesian fashion, this means that h is maximally disconfirmed when it is refuted. Moreover...once a theory is refuted, no further evidence can ever confirm it, unless the refuting evidence or some portion of the background assumptions is revoked.' [p119, Howson & Urbach 1993] ]
F4 [Hence this problem is apparently fatal to the Bayesian theory of scientific method. For as Howson & Urbach admit, if it were correct that the prior probability of all unrestricted universal laws must be zero, "then that would be the end of our enterprise in this book" [p391 Howson & Urbach 1993], which is to demonstrate that scientific reasoning, and most especially its grounds for the acceptance and rejection of hypotheses, is subjective Bayesian probabilist reasoning. [p1] ]
F5 [For an example of some probabilist philosophers of science who do deny all scientists are radical fallibilists, see the desperate appeal to Einstein's (ironic?) dogmatic view on the truth of his GTR by Howson & Urbach on p394 of their 1993 Scientific Reasoning. But of course, one swallow does not a summer make.]
The main stumbling block on the part of some Wikipedia editors such as Jefferys, Johnston, BenE, Coppertwig etc. in understanding that scientists' belief in radical fallibilist philosophy of science is fatal to probabilist philosophy of science seems to be their refusal to accept that, according to the Bayesian philosophy of science literature and thus also in this article, Bayesian epistemic probability interprets 'probability' as 'strength of belief that a proposition is TRUE', whereby if a proposition is believed to be false it must therefore be assigned probability zero. For they do not contest, and indeed apparently agree, that scientists believe all scientific laws are false. But they seek to avoid the conclusion that they must therefore assign them probability zero by doing original research and illegitimately redefining 'probability' as 'strength of belief that a proposition is USEFUL for making novel predictions'. But since this is a non-Bayesian interpretation of probability, if they are right they are in effect demonstrating that scientific reasoning is not Bayesian probabilist.
LOGICUS 16 September 2007 —Preceding unsigned comment added by Logicus (talk • contribs) 14:45, 16 September 2007 (UTC)
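The arithmetic step in the proposal above (a zero prior forces a zero posterior under Bayes' Theorem) can be checked with a minimal sketch. The function name `update` and all numbers are hypothetical, chosen only for illustration; the sketch says nothing about whether scientists actually hold zero priors, which is the point in dispute.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(h|e) by Bayes' Theorem, with P(e) from total probability."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if p_e <= 0.0:
        raise ValueError("cannot condition on evidence of probability zero")
    return p_e_given_h * prior / p_e


# Hypothetical likelihoods: the evidence is much more probable under h.
for prior in (0.0, 0.01, 0.5):
    print(prior, update(prior, p_e_given_h=0.9, p_e_given_not_h=0.2))
```

With prior 0.0 the numerator is zero whatever the likelihoods are, so no amount of favourable evidence moves the posterior off zero; any strictly positive prior can be moved.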
- I think the idea that all scientific laws have probability zero counts as "original research" under WP:NOR. I think that idea is not stated by any of the given sources but is a conclusion reached by Logicus, and furthermore that it is not generally accepted by scientists. Therefore, it should not be stated in the article -- unless a source can be found that states it, and then at most it could be mentioned in a quote or indirect speech, as in "so-and-so says that all such probabilities are zero," not "All such probabilities are zero" as if that's what Wikipedia is asserting. Probabilities are manipulated within mathematical frameworks in which certain sets of scientific laws are "assumed" to be true. Besides, the edit is too long and I disagree with the premise of the argument for including such a long quote. --Coppertwig 17:18, 16 September 2007 (UTC)
- Coppertwig, please stop giving me your baloney! Do be a good fellow and go and read the literature, including the sources I give, where you should discover what you say is nonsense! It is most definitely not Wiki-original research, whereas what you claim is. For example, it is well known that Popper maintained the probability of all laws must be zero, whether or not he was right or wrong. Please stop lecturing me on subjects about which you are either clearly ignorant or logically confused. Best wishes.
Logicus 18:14, 17 September 2007 (UTC)
History needs updating
If someone wants to edit the history section there is a great starting point for their research here. Currently, apart from the mention of Bayes himself, the history starts way too late (1930!). --BenE 01:52, 21 September 2007 (UTC)
The problems of Bayesian Philosophy of Science
The problems of Bayesian Philosophy of Science as distinct from those of the Bayesian Philosophy of Probability
Further to my comments of 11 October above in the 'What is probability' section, they do not really belong to this section on the concepts of probability. Maybe some confusion has arisen because BenE's critical comments of 20 September in this section commenced with his assertion that he held the same views as Bill Jefferys. But Jefferys' views and my debate with him concerned the philosophy of science, that is, the nature of scientific reasoning, and whether it is Bayesian probabilist or not. Thus it was reasonable to interpret BenE's statements as about the same issue, rather than about the philosophy of probability, which might be the issue he actually had in mind, even if unclear from his comments. As I have pointed out before to try and clarify this crucial distinction, one may hold a Bayesian philosophy of probability, but be a vehement anti-probabilist and thus anti-Bayesian in the philosophy of science, regarding it as utterly absurd that scientists' beliefs in the truth of theories obey the probability calculus or are even logically consistent.
Anyway, my point here is that the above discussion does not really belong to this section, but rather to a section on the problems of Bayesian philosophy of science, that is, the thesis that scientific reasoning is probabilist and Bayesian, not to be confused with theories about what is the best interpretation of the notion 'probability'. And so I copy it to another section devoted specifically to discussing the problems of Bayesian philosophy of science. This, for example, is the appropriate place to discuss whether the belief that all scientific laws are false, whereby they must be assigned probability zero when 'probability' is defined as 'strength of belief a proposition is true', is recognised as posing a fundamental problem for Bayesian and probabilist theories of scientific reasoning. Or what episodes in the history of science are recognised as being successfully accounted for by a Bayesian probabilist theory of scientific reasoning, such as 'the Copernican revolution', 'the anti-Cartesian Newtonian revolution', 'the Einsteinian revolution'.--Logicus 16:15, 12 October 2007 (UTC)
BenE's further comments: I'm going to add, since you keep calling me a fundamentalist for believing in Jaynes' theories, that I am far from the only one with these views.
There is an influential group of Bayesians at the Future of Humanity Institute, which is part of the Faculty of Philosophy of Oxford University, which the Philosophical Gourmet Report has recently ranked "the most important ranking of Graduate Programs in Philosophy in the English speaking world." They have a blog which frequently talks about Bayesianism as the probability theory AND as the philosophy of science. One of their contributors, Eliezer Yudkowsky, wrote here:
- "Previously, the most popular philosophy of science was probably Karl Popper's falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning. Karl Popper's idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules; if p(X|A) ~ 1 - if the theory makes a definite prediction - then observing ~X very strongly falsifies A. On the other hand, if p(X|A) ~ 1, and we observe X, this doesn't definitely confirm the theory; there might be some other condition B such that p(X|B) ~ 1, in which case observing X doesn't favor A over B. For observing X to definitely confirm A, we would have to know, not that p(X|A) ~ 1, but that p(X|~A) ~ 0, which is something that we can't know because we can't range over all possible alternative explanations. For example, when Einstein's theory of General Relativity toppled Newton's incredibly well-confirmed theory of gravity, it turned out that all of Newton's predictions were just a special case of Einstein's predictions."
Another good article of his extolling Bayesianism can be found here.
I may be young and naive and I may be suffering from intellectual "growing pains" but at least I am up to date with recent developments. And I doubt Oxford's Faculty of Philosophy is considered a fundamentalist group.--BenE 00:18, 13 October 2007 (UTC)
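The asymmetry described in the quoted passage can be worked through numerically. This is only a sketch: the two-hypothesis setup, the equal priors and the likelihood values are assumptions made for illustration, and `posterior_of_first` is a hypothetical helper name.

```python
def posterior_of_first(priors, likelihoods):
    """Posterior of the first hypothesis among mutually exclusive rivals."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    return joint[0] / sum(joint)


priors = [0.5, 0.5]  # assumed equal priors for rival theories A and B

# Observing X when both theories predict it: p(X|A) ~ 1 and p(X|B) ~ 1.
print(posterior_of_first(priors, [0.99, 0.99]))  # A gains nothing over B

# Observing not-X when A predicted X: p(~X|A) = 0.01, assumed p(~X|B) = 0.5.
print(posterior_of_first(priors, [0.01, 0.5]))   # A is strongly disconfirmed
```

In the first case the posterior of A stays at its prior, because X does not discriminate between A and B; in the second it drops to about 0.02.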
- Logicus continues to misunderstand my position. I am not talking about philosophy of science, but what scientists actually do. For some reason, many philosophers of science have the view that their ruminations about philosophy have something to do with what scientists actually do. This is generally false, since most philosophers of science have never done science and therefore have no notion of what scientists actually do.
- What scientists actually do is to construct models, compare with data, and try to find models that explain the available data well and predict future data well. A Bayesian analysis of a specifically identified set of models is a good way to choose between models that have been identified (even if we do not believe that any of the models in this restricted set is "ultimate truth"), and model-fitting criteria at any particular time will guide us in our quest to invent models that do a better job.
- Such a specifically identified set of models does not have the defect that Logicus claims to be a problem, that is, that you have to put zero prior probability on each. Indeed, once you restrict yourself to a specific set of models, you are required to set priors that add to unity, so not all can have zero prior probability. Whether the "ontologically true" model is within that set is not relevant for model comparison.
- I provided a joke a while ago about the Dean who, when approached by the Physics department chair about an expensive piece of equipment, complained that the mathematicians needed only pencil, paper and a wastebasket, and that the philosophers needed only pencil and paper. Logicus answered with a lame retort supposed to show the superiority of philosophers, that unfortunately completely missed the point, to wit:
- Of course the philosophers' joke on the joke is that the dean was herself a philosopher and correctly believed philosophers are infalible [sic], so don't need wastebaskets.
- Logicus' recent comments continue to prove that he has no clue about what scientists actually do or the way they actually think. I think, also, that it would be useful for anyone following this discussion to read Logicus' recent [rant] (which he later removed, but which, thanks to the WikiPedia gods, is preserved in perpetuity). Bill Jefferys 00:59, 13 October 2007 (UTC)
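The claim above about a restricted set of models (priors over the set must add to unity, so they cannot all be zero) can be sketched as follows. The function name and the log-likelihood values are hypothetical; the sketch assumes the models in the set are treated as mutually exclusive and exhaustive for the purpose of the comparison, which is exactly the assumption under debate.

```python
import math


def model_posteriors(priors, log_likelihoods):
    """Posterior probabilities over a fixed, finite set of models."""
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("priors over the model set must sum to 1")
    # Work in log space; a model with prior 0 keeps posterior 0.
    log_joint = [
        math.log(p) + ll if p > 0 else -math.inf
        for p, ll in zip(priors, log_likelihoods)
    ]
    top = max(log_joint)  # finite, since at least one prior is positive
    weights = [math.exp(lj - top) for lj in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Three candidate models with hypothetical log-likelihoods for some data.
print(model_posteriors([0.2, 0.3, 0.5], [-1.0, -2.0, -0.5]))
# A "set" containing a single model necessarily gets posterior 1.
print(model_posteriors([1.0], [-3.2]))
```

The normalisation is relative to the chosen set: the posteriors say which listed model is favoured, not whether any listed model is true.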
Logicus to Bill: Bill, you seem to have a literacy problem here. I never removed my contribution of 11 October you link up to here. It is still there as above for the literate to see and be enlightened from the Bayesian positivist nightmare. Thanks for advertising it though. Also thanks for regaling us with your philosophy of science yet again. The question you have to answer is this: IF 'probability' means 'strength of belief a proposition is true' and somebody believes a proposition is false, what probability should they assign a false proposition other than zero? In answering this question you must set aside the fact that you yourself reject this conception of probability on which this article is based, but have failed to propose an alternative conception that agrees with the literature referenced at the end of the article. Also remember that, just as fish have no good ideas about hydrodynamics, most scientists haven't a clue about what they actually do, because their heads are usually filled with some ideological philosophy of science view about what they do, i.e. they suffer false consciousness. The task of critical philosophy of science is to analyse what they actually do rather than what they say they do. --Logicus 19:26, 13 October 2007 (UTC)
- If you will click on the [rant], you will find on the left hand side of the page a lot of stuff that you wrote, in red, that does not appear on the right hand side. This is a "diff", which shows what you deleted and what you added. I think you need your reading glasses checked.
- As for what you claim to be the task of critical philosophy of science, you may think anything you wish about what it thinks it does. This does not mean that it does it, and I think you will have a hard time showing that this is what scientists actually do.
- And, as to your claim that I have proposed no alternative conception that agrees with the literature referenced at the end of the article, I have provided citations that support my so-called "alternative" conception about the real role of models in scientific inference. Bill Jefferys 23:36, 13 October 2007 (UTC)
Logicus to Jefferys: If anybody who is literate clicks on the 'rant', providing they are wearing any reading glasses they may need, they should find 'the stuff I wrote in red on the left hand side of that page' also appears on this page above in my 11 October contribution, and hence that I never removed it, contrary to Jefferys' bizarre claim that I did remove it and insinuation that one has to go to this link to find it. And there they should also see that it is not a rant, but helpful advice to Jefferys' fellow 'Bayesian fundamentalist' philosopher of science BenE about the existence of non-probabilist philosophies of science.
But what the literate reader will not find anywhere from Jefferys is a simple concise definition of his conception of 'probability', nor an answer to the key question of what probability a subjective Bayesian probabilist should assign to propositions they believe to be false. His problem here is that of avoiding the appalling unthinkable conclusion that scientific reasoning is not Bayesian probabilist. For if scientists proceed as he claimed they do on 13 October and 15 August, and thus assign non-zero positive prior probabilities to propositions they actually believe to be false, then on the normal view of it they cannot be subjective Bayesians for whom 'probability' means 'strength of belief a proposition is true' whereby a proposition believed to be false is assigned probability zero, that is, no strength of belief it is true simply because it is believed to be false. Thus scientists are not subjective Bayesians on Jefferys' view of scientific practice. Precisely my point. QED.
What Jefferys fails to grasp in this key issue of whether scientific practice is subjective Bayesian probabilist or not is that in this instance it is not his account of scientific practice that is being challenged, but rather whether that practice as he represents it is a subjective Bayesian practice or not on the standard definition of Bayesian probability this article is based upon, that is, 'strength of belief a proposition is true'. And if scientists assign non-zero positive priors to hypotheses they believe to be false as Jefferys claims they do, then clearly it is not. Thus to establish his thesis that scientific practice is Bayesian probabilist, Jefferys must reject this article's definition of that conception of probability and replace it with an alternative definition, such as his own declared conception of it as 'strength of belief that a proposition is likely to be useful for predicting novel facts'. But of course for many reasons he cannot, including the fact that this does not square with even the pro-Bayesian literature, and at least the problem of yet again avoiding the conclusion that scientific reasoning is not probabilist at least because scientists breach Axiom 2 of that calculus in not assigning probability 1 to tautologies or else explaining why completely useless propositions for predicting novel facts such as 'The Moon is the Moon' are not given probability zero. Thus the very learned self-alleged Emeritus Professor Jefferys remains impaled on the horns of the dilemma constituted on the one horn by his equivocation between an instrumentalist idealist philosophy of science that scientific hypotheses are neither true nor false but instruments of prediction and a contrary radical fallibilist realist view that they are all false but possibly useful instruments of prediction, versus his fundamentalist belief on the other horn that scientific reasoning is Bayesian probabilist. 
Hardly surprising that in this unenviable situation he simply protests that he really does have a coherent and empirically adequate Bayesian philosophy of science that Logicus has simply not understood, but demurs presenting it on these pages, claiming he can only present it to Logicus by personal e-mail outside Wikipedia, rather than on the Wikipedia Talk page where anybody can see it. Or else he suddenly becomes quasi-Emeritus and makes excuses that he must rush off to prepare his lessons for the new semester instead of answering Logicus's challenges, put as follows:
Logicus to Jefferys 18 September: On another point, since you yourself at least agree with the radical fallibilist philosophy of science that all scientific laws are false according to your 15 August testimony on this Talk page, then what probability would you assign a scientific law if you assigned probabilities to propositions according to strength of belief in their TRUTH, as subjective Bayesian epistemic probabilists do according to the literature ? (I appreciate you do not accept the subjective Bayesian epistemic interpretation of probability, but have your own utilitarian pragmatic interpretation of it as 'likely usefulness of a hypothesis in making novel predictions', but just imagine you did accept it. What probability would you assign a hypothesis you believe to be definitely false ?)
I would also be grateful to know why you assign tautologies probability 1, and thus why you evaluate them as maximally useful for making novel predictions. For instance, how is 'The Moon is the Moon' useful for making novel astronomical predictions ?
Jefferys to Logicus 19 September: Since the semester started, I have been rather busy, and expect to be so for quite some time, so will answer only the question about the Box quotation, and make one more comment. The other questions will have to wait, weeks probably. I can only say that you profoundly misunderstand my position. I still invite you to contact me directly. Believe me, I do have coherent and justifiable reasons to write what I did.
But the learned now quasi Emeritus Professor manages one last comment by way of trying to teach Logicus his rotten Bayesian statistics before rushing off to spread his gospel wider:
I'll make one more comment. The reason why you have to consider more than one hypothesis is that if only one hypothesis is possible (that is, the universe of hypotheses under consideration consists of that one hypothesis alone), its prior and posterior probabilities are by definition 1. This is well-known, and any decent book on Bayesian statistics should set you straight on this point. Try Jim Berger's book.
But this is of course illogical nonsense. There is no reason in the probability calculus nor in Bayesian probability why a lone hypothesis on any subject matter should be believed to be certainly true or a tautology, and indeed it may be believed to be false because it is believed to have a counterexample somewhere sometime, and thus assigned prior probability zero. Consider the case of there being only one theory of the Moon's constitution, namely that it is made of green cheese. What definition of what concept means it must have prior and posterior probability 1 Jefferys notably does not reveal. --Logicus 18:10, 15 October 2007 (UTC)
BenE to Logicus: As I have written again and again, bayesian probability theory is interpreted as a degree of belief that is calculated through Bayes theorem. Hence the name Bayesian. Your statement that a scientist must assign 0 to scientific theories is nonsensical in this context, as it is not possible to arrive at this probability value through Bayes theorem (unless you pull this zero value out of your ass). We have no clue what the absolute prior is for a scientific theory, for two reasons: first, we don't have any idea of the size of the hypothesis space, and second, we don't even know that the theories are mutually exclusive (e.g. the theory that the human body temperature is between 95-105 degrees does not exclude the theory that it is between 90-110). The only thing we can do is assign a maximum entropy prior between alternative theories and thus represent our state of ignorance. This is equivalent to setting the priors to be equal. When we calculate the odds ratio, the prior terms vanish in the equation, as P(H1)/P(H2)=1, and we never actually need to assign a number to these priors! By following Bayes theorem and the MaxEnt principle we arrive at a ratio based solely on the data, and that is therefore based on how the theory fits the data: how well it predicts the data.
Choosing theories based on 'strength of belief that a proposition is likely to be useful for predicting novel facts' is not a supposition but a result of the Bayesian approach. It's a result of applying Bayes theorem with a maximum entropy (equal) prior.
I provided many citations supporting this. However, why do you even make a fuss? Even if the people posting on this discussion page and the guys in Oxford's philosophy department think it is the best candidate for a theory of science, the article in its present form doesn't even mention the word science once! Nothing in the article should bother you. --BenE 19:50, 15 October 2007 (UTC)
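The odds-ratio point above (with equal priors, P(H1)/P(H2)=1 and the posterior odds reduce to the likelihood ratio) can be illustrated with a small sketch. The coin-bias hypotheses, the data (7 heads in 10 tosses) and the function names are assumptions invented for the example, not anything taken from the cited sources.

```python
import math


def binomial_likelihood(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_odds(prior_1, prior_2, likelihood_1, likelihood_2):
    """Posterior odds of H1 against H2 = prior odds * likelihood ratio."""
    return (prior_1 / prior_2) * (likelihood_1 / likelihood_2)


# Hypothetical data: 7 heads in 10 tosses. H1: p = 0.5, H2: p = 0.7.
l1 = binomial_likelihood(7, 10, 0.5)
l2 = binomial_likelihood(7, 10, 0.7)

print(posterior_odds(0.5, 0.5, l1, l2))  # equal priors: just l1 / l2
print(posterior_odds(0.9, 0.1, l1, l2))  # unequal priors change the odds
```

Only the ratio of the priors enters, so any equal pair (0.5 and 0.5, or 0.01 and 0.01) gives the same odds; the prior ratio does matter as soon as the priors differ.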
- I apologize to Logicus, for he is correct, he did not remove the material I thought he had removed. I was misled by the [diff page], which showed the material deleted on the left (in red, with minus sign indicating deletion) but not restored on the right. There is evidently something that I do not understand about the way the diff is presenting things. But when I searched the version that Logicus edited for the material, it was there. I am sorry.
- I also apologize for using intemperate language. I hope that all of us will change our ways and try to be more respectful of the views of our fellow editors, even when we don't agree with them. Bill Jefferys 22:48, 16 October 2007 (UTC)
- Logicus to Bill: Well thanks for that. I also find diff confusing, and often cannot find history of edits obviously made.--Logicus 18:15, 17 October 2007 (UTC)
- BenE stated "As I have written again and again, bayesian probability theory is interpreted as a degree of belief that is calculated through Bayes theorem." This is not correct. Bayes theory may or may not use a subjective "degree of belief" as an input (it can also use objective inputs) but the degree of belief is not the result of Bayes theorem. ERosa (talk) 07:19, 10 February 2008 (UTC)
Painting a picture of too much conflict?
This article seems very POV to me in that it seems to paint a picture of the Bayesian view of probability being more controversial than it actually is, and also that there is more conflict (see the use of the word "antagonism") between the two "schools of thought" than there actually is. Although people talk about the "schools of thought" I think that the evidence that these schools are as all-encompassing as they are is shaky at best.
For example, compare the book by Casella & Lehmann (Theory of Point Estimation), which arguably takes a heavily frequentist perspective, and the book by J.O. Berger (Statistical Decision Theory and Bayesian Analysis), which is arguably written from a (very strongly) Bayesian perspective. The Berger book, which is about as "rabid" a Bayesian text as you can find, still embraces the frequentist interpretation of probability as one way of looking at things and fully lays out how to use such an interpretation. Similarly, the Lehmann book is very heavily slanted towards the frequentist perspective, but it offers over a full chapter dedicated to Bayesian methods and discusses the philosophical aspects of the Bayesian interpretation of probability at great length.
I think that this article needs to be seriously rewritten to reflect the real state of things. All the modern texts talk about the "controversy" between Bayesian and frequentist methods as something that is more or less historical. People quibble about use of this technique and that, how it is applied, when it is appropriate, but most people agree that each interpretation has a certain domain where it is useful and another where it is not, and furthermore, there is a general consensus that both interpretations can be combined in a given problem for both philosophical and practical reasons. Do people agree with that? Cazort 23:37, 3 December 2007 (UTC)
- Well, both yes and no. If you read just some random section of this talk page you will probably find some strong feelings with heated discussions. If you read philosophy papers about this you will find some strong feelings too. However, if you read mathematical books on probability theory and statistical methods you will find only dry expositions, as math books are in general the wrong forum for debates. (But there are some exceptions here too.) So whether you will find heated debates or not all depends on where you look.
- I think it's good that the article stresses the differences in view that exist, and the ongoing debate. This is the kind of information readers new to the subject/concepts want to know. What could be made a bit more clear, I think, is that there aren't only two views but many. The debate is historical in the sense that it's an old debate (it can easily be traced back to the old debate between materialism and idealism), not in the sense that it's over. iNic (talk) 23:26, 17 December 2007 (UTC)
- No, the article simply perpetuated a point of confusion about the debate. Bayes theorem and Bayesian methods are routinely used by statisticians. Bayes theorem itself is a matter of mathematical proof and is not just subjective. But this article seemed to continue with a common misconception that confuses the philosophical debate about the nature of uncertainty with the methods statisticians actually use. I'm a statistician and I use both methods. When you have prior knowledge then a Bayesian analysis will actually give a better result if you track it over time and compare it to methods that ignore prior knowledge. Often, I've used Bayesian analysis when the prior knowledge was actually derived from other standard sampling methods, not subjective estimates. I've made the changes and, unlike the previous version, inserted specific citations for the claims. Hopefully, the first year stats students who have written the material previously will provide better citations if they want to refute what I just wrote.
- Actually, if the title of the article was "Subjectivist probability" and not "Bayesian probability" it would cause much less confusion. Again, there is nothing inherently subjectivist about Bayes theorem. The only possible tie is that, since Bayes theorem allows the use of prior knowledge - whether subjective or based on observed frequencies - it has sometimes been mistaken as based *only* on subjective probabilities. ERosa (talk) 07:09, 10 February 2008 (UTC)
- Logicus to ERosa: Amen to that! But the remaining problem is what then is the differentiating specificity of 'Bayesian probability', whereby on the one hand it is not necessarily subjectivist as you say, but on the other hand does not include all conditional probability? The article's current definition of Bayesian probability is in fact rather a definition of subjective probability. Defining Bayesian probability is a difficult business and indeed possibly impossible if it is only a historically mistaken pseudo-category. Maybe more later.... --80.6.94.131 (talk) 15:26, 24 February 2008 (UTC) --Logicus (talk) 15:29, 24 February 2008 (UTC)
- I'm not sure I can fully construct the meaning of that first sentence. Are you saying that Bayes Theorem pertains to any conditional probability? If so, then yes. It is another example of an unfortunate naming convention confusing a lot of people. The subjectivist philosophy of probabilistic knowledge has little to do with Bayes Theorem, which is derived mathematically from fundamental axioms of probability theory - the same rules any frequentist would feel subject to. But I don't think it's really that difficult to define. There is really nothing in statistics called a "Bayesian probability". There is Bayes Theorem, which is used to compute a conditional probability, but a conditional probability isn't uniquely Bayesian. It makes no more sense than to talk about a gram as a "digital gram" because it was weighed on a digital scale. Anytime the term "Bayesian probability" is used they really mean "subjectivist view of probability". Bayes theorem gives exactly the right answer for p(x|y) when you give it p(x), p(y) and p(y|x). Any "frequentist" would use the same formula to compute p(x|y). A subjective probability can be used as one input to Bayes Theorem, but like every other formula in math or science, the formula doesn't care HOW we come up with the numbers. It just gives us the answer with the numbers we give it. ERosa (talk) 02:44, 25 February 2008 (UTC)
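The point that the formula "doesn't care how we come up with the numbers" can be sketched with inputs taken from observed frequencies. The counts in the table are invented for illustration and `bayes` is a hypothetical helper; exact fractions are used so that the result can be compared directly with the conditional frequency read off the table.

```python
from fractions import Fraction


def bayes(p_x, p_y, p_y_given_x):
    """Bayes' theorem: p(x|y) = p(y|x) * p(x) / p(y)."""
    return p_y_given_x * p_x / p_y


# Hypothetical joint counts over 100 observations.
counts = {("x", "y"): 30, ("x", "not_y"): 10,
          ("not_x", "y"): 20, ("not_x", "not_y"): 40}
total = sum(counts.values())

p_x = Fraction(counts[("x", "y")] + counts[("x", "not_y")], total)
p_y = Fraction(counts[("x", "y")] + counts[("not_x", "y")], total)
p_y_given_x = Fraction(counts[("x", "y")],
                       counts[("x", "y")] + counts[("x", "not_y")])

# The theorem's answer agrees with the frequency of x among the y cases.
direct = Fraction(counts[("x", "y")],
                  counts[("x", "y")] + counts[("not_x", "y")])
print(bayes(p_x, p_y, p_y_given_x), direct)
```

The same function would accept subjectively assessed inputs without modification, which is the sense in which the theorem itself is neutral between interpretations.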
- The acid test of whether or not one's a Bayesian is not (and never has been) whether or not one believes in Bayes theorem. Everyone does. It's a theorem.
- Rather, the acid test is whether or not one believes it can ever be meaningful to talk about a probability P(x), if x is an event which has already happened, but about which you do not know the outcome. To a Bayesian, this is not only meaningful, it should be the central quantity of inference. To a Frequentist, it is not meaningful, and one should only talk about estimators, confidence limits and so forth; and discuss questions like "unbiasedness", which to a Bayesian can seem wholly misleading.
- That's been the meaning of "Bayesian" since the word was coined, in the 1950s.
- European universities tend to allow their lecturers more flexibility; but I have it on authority that, at least until very recently, there were still U.S. colleges where a lecturer would find themselves barred from teaching the course again, if they ever talked about P(x) in a first year statistics course, where x was to represent an event which had already happened. Jheald (talk) 10:00, 25 February 2008 (UTC)
- Your claim that lecturers were barred from talking about P(x) makes no sense. Every one of the six undergraduate stats text books on my book shelves starts with chapters that use the term "P(x)". Are you saying using P(x) for "probability of x" is somehow uniquely Bayesian? The same term is used throughout statistics whether the author is "frequentist" or not. I agree with ERosa that this entire article confuses Bayesian with subjectivist. As you both pointed out, Bayes Theorem is mathematically proven. But the problem is that the word "Bayesian" has come to mean two very different things. Over the last couple of decades, statisticians have been using Bayes Theorem more often (even though Bayes Theorem is much older than that) to properly incorporate prior known constraints on potential values. We have to separate the philosophical argument, which I think is better labeled frequentist vs. subjectivist, from Bayes entirely. And I think it mischaracterizes the frequentist view to say that p(x) must relate only to a past event. The frequentist view is that p(x) only has meaning as the frequency of x over a large number of trials. But I like the argument presented earlier that "degrees of belief" can also be tested by frequentist means. If, of all the times a person says they are 80% confident, they are right 80% of the time, then you have confirmed their "degree of belief" with the frequency of being right. So, even in a purely philosophical sense, I see no real conflict, much less within the pragmatic use of statistics. ChicagoEcon (talk) 15:03, 25 February 2008 (UTC)
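The calibration idea mentioned above (of all the times a person says they are 80% confident, are they right 80% of the time?) can be sketched as a simple frequency check. The record of statements is made up for illustration and `calibration_table` is a hypothetical name.

```python
from collections import defaultdict


def calibration_table(statements):
    """Map each stated confidence to the observed frequency of being right.

    `statements` is an iterable of (stated_confidence, was_correct) pairs.
    """
    right = defaultdict(int)
    made = defaultdict(int)
    for confidence, was_correct in statements:
        made[confidence] += 1
        right[confidence] += 1 if was_correct else 0
    return {c: right[c] / made[c] for c in made}


# Hypothetical record: ten claims at 80% confidence, eight of them correct,
# and four claims at 50% confidence, one of them correct.
record = [(0.8, True)] * 8 + [(0.8, False)] * 2 \
       + [(0.5, True)] * 1 + [(0.5, False)] * 3

print(calibration_table(record))
```

A well-calibrated assessor has observed frequencies close to the stated confidences; with so few statements per level the comparison is of course very noisy.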
- No, my claim is that a Bayesian will feel free to discuss the probability of x, where X is an event which has already taken place. A frequentist would resist this; and would resist talking about probability even of events in the future, if they could not be related to a frequency over a large number of trials.
- "Bayesian" has been used in this sense, ie usages of probability not related to frequency over a large number of trials, ever since the word was coined, in the 1950s. An out-and-out frequentist may use Bayes theorem; but they are unlikely to describe either themselves, or their calculation, as "Bayesian". Jheald (talk) 15:42, 25 February 2008 (UTC)
- In that case no statisticians are frequentists, since all statistical estimators of means of populations are actually of the form P(a<x<b), where a and b are the bounds on some interval. This P(x) means that if we continued to sample the population until we got every possible member of the population, then the actual population mean has the stated chance of falling within those bounds. But, since it would be absurd that no statisticians are frequentists, and since that would be the logical conclusion from your claim, I would say that your initial characterization of frequentists resisting using P(x) is wrong. A frequentist uses P(x) but holds the position that the only meaning is what it means for the frequency of occurrence over a large number of trials. A subjectivist would say it has another meaning: that it can mean degree of belief. I say a frequentist analysis of degrees of belief also shows that it meets the frequentist criterion (a large number of trials of degree-of-belief statements can be observed with frequentist observations). Both groups use P(x) and neither resists using it in any sense. ERosa (talk) 20:30, 25 February 2008 (UTC)
- I think if you look closer, you will find that that is not the case. A frequentist statistician will not make assertions about the probabilities of a parameter θ of a distribution. Rather, they will make assertions about the probability of an estimator $\hat{\theta}$, and how often it might or might not be an amount different from θ if hypothetically a large number of similar trials were to be carried out.
- A Bayesian will feel free to discuss the probability of θ itself. But for a by-the-book frequentist θ is a fixed parameter, not a random variable; so not something about which they can ever talk about a probability distribution. Jheald (talk) 21:23, 25 February 2008 (UTC)
- Then, as you define it, I've never met a frequentist statistician. And that would be something, since I'm a statistician who has worked, among other places, with the Census Bureau (where there are over 1000 professional statisticians). I've also worked with the statisticians at the EPA and with many academic researchers. My contact list has over 100 people with advanced degrees in statistics. And I've never met anyone who does not talk in terms of the probability of a parameter falling within stated bounds. It's simply the normal language among every statistician I know. And, believe me, I've "looked closely". Now, you should compute the odds that, even with a somewhat biased sample, I would by chance have never met a frequentist statistician if there are any more than a tiny minority. (Hint: You can use Bayes Theorem) ERosa (talk) 15:49, 27 February 2008 (UTC)
- That's interesting. So are these people actually calculating probabilities P(θ|data)? Or are they calculating confidence intervals and then misrepresenting the meaning of their calculation? Jheald (talk) 16:35, 27 February 2008 (UTC)
- If you understand what "confidence interval" means, you know that the "confidence" that the "interval" a to b contains the population parameter x is P(a<x<b|data). It's not a misrepresentation. They are saying that there is a 90% chance that, if we continued to sample the entire population, we would find the mean to be within the 90% CI. In fact, simulations of samples of populations will actually prove that. Where are you learning what you have "learned" about statistics? I honestly can't think of a single professor of stats, text, or PhD researcher who makes these fundamental errors you seem to be making. Can you provide a citation? ERosa (talk) 19:12, 27 February 2008 (UTC)
- If you really want P(θ|data) you do it the Bayesian way: you start with a prior P(θ|I), and update it according to Bayes theorem. Confidence interval calculations don't do that: they calculate P(interval|θ), without any consideration of the priors on θ. As a result, there are cases where frequentist methods can report very high "confidences" in parameter ranges which may nevertheless still actually have rather low probability. Jheald (talk) 20:22, 27 February 2008 (UTC)
- You didn't answer my question about a source. You were surprised that calculating a CI is actually calculating a particular P(X) (in this case, P(a<x<b) where x is a population parameter). Why would you think this is a misrepresentation and what is your source? ERosa (talk) 23:28, 27 February 2008 (UTC)
- But of course it is not calculating a P(X). The notation P(a<θ<b) is misleading, because θ is not a random variable. The interval is the random quantity, and it is fixed so that P(interval contains θ | θ) equals the stated confidence level. It's a very odd calculation, when you actually write it out properly; but it has nothing to do with getting a probability distribution for θ. Jheald (talk) 00:51, 28 February 2008 (UTC)
- But of course it IS and you are seriously misled. Again, provide a citation for your claim. The notation P(a<x<b) is quite standard and what is, in fact, random, is the estimate of x relative to the true population mean of x. A large number of simulations of samples from known populations show that the 90% CI contains the known population mean 90% of the time. By the way, a colleague of mine once wrote for the Journal of Statistics Education, which talks a lot about bizarre misconceptions about statistics. I think you will make an excellent subject. 74.93.87.210 (talk) 04:49, 28 February 2008 (UTC) Forgot to sign in. ERosa (talk) 04:58, 28 February 2008 (UTC)
- By the way, the Wikipedia article on confidence intervals seems to use notation entirely consistent with what I'm saying and contrary to what you say. You should also set out to "correct" that error. And all the errors in every stats text I pick up. You have a lot of work to do. ERosa (talk) 04:58, 28 February 2008 (UTC)
- P(a<θ<b|data), calculated using Bayes theorem, is called a Bayesian credible interval. It coincides with a frequentist confidence interval only if the prior probability P(θ|I) is uniform. Otherwise, as you can verify for yourself, the calculations are different. And it's a well known fact, that if you bet against a Bayesian who has an accurate prior, you will tend to lose. Jheald (talk) 09:51, 28 February 2008 (UTC)
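- The comparison above can be tried on a small case. The sketch below is only an illustration under assumptions that are not in this thread: a normal mean θ with known standard deviation, and a conjugate normal prior whose mean and spread are made-up numbers. The function names `confidence_interval` and `credible_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(xbar, sigma, n, level=0.95):
    """Frequentist interval for a normal mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def credible_interval(xbar, sigma, n, prior_mean, prior_sd, level=0.95):
    """Central posterior interval for theta under a Normal(prior_mean, prior_sd) prior.

    prior_sd = float("inf") gives zero prior precision, i.e. a flat prior.
    """
    data_precision = n / sigma**2
    prior_precision = 1 / prior_sd**2  # evaluates to 0.0 when prior_sd is infinite
    post_precision = data_precision + prior_precision
    post_mean = (data_precision * xbar + prior_precision * prior_mean) / post_precision
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z / sqrt(post_precision)
    return post_mean - half_width, post_mean + half_width


# Hypothetical data summary: sample mean 10.0 from n = 25 draws with sigma = 2.0.
print(confidence_interval(10.0, 2.0, 25))
print(credible_interval(10.0, 2.0, 25, prior_mean=0.0, prior_sd=float("inf")))
print(credible_interval(10.0, 2.0, 25, prior_mean=8.0, prior_sd=0.5))
```

With the flat prior the two intervals agree numerically; with the informative prior the credible interval is shifted toward the prior mean and is narrower.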
- By the way, with regard to the Wikipedia article on confidence intervals, note the confidence intervals#definition is in terms of P(a<θ<b|θ), ie probabilities of the interval given theta.
- Note also the section Meaning and Interpretation:
- "It is very tempting to misunderstand this statement in the following way... The misunderstanding is the conclusion that [the probability that θ lies in the observed interval equals the confidence level], so that after the data has been observed, a conditional probability distribution of θ, given the data, is inferred... This conclusion does not follow from the laws of probability because θ is not a "random variable"; i.e., no probability distribution has been assigned to it."
- (emphasis added). Jheald (talk) 10:01, 28 February 2008 (UTC)
- Wow, this debate has generated a lot of text! Actually, I think the entry in the confidence interval article needs to be corrected if it means that a 90% confidence interval doesn't have a 90% *probability* of containing the true value. And those who insist on the distinction between "credible interval" and "confidence interval" make the same mistake Jheald makes, since the distinction has no bearing on observed outcomes. I believe what ERosa was referring to earlier was the fact that if you take, say, 30 samples from a large population where you already know the mean, compute the 90% confidence interval, and repeat this thousands of times, you will find that 90% of the time the known population mean actually fell between the upper and lower bounds of the 90% confidence interval. This claim is experimentally verifiable. Neither the math nor experimental observations contradict ERosa. This is another example of how people have some strange ideas about probability theory. Hubbardaie (talk) 13:43, 28 February 2008 (UTC)
- I noticed that the section of the confidence interval article that Jheald cites had no citations for its arguments (much like Jheald's arguments in here). So I added fact flags. When I get a chance I will rewrite that fundamentally flawed section. This is the problem when people who barely understand the concepts try to get philosophical. Hubbardaie (talk) 14:18, 28 February 2008 (UTC)
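- The repeated-sampling experiment described above is easy to sketch. The code below is a minimal illustration, assuming a normal population with made-up mean and standard deviation, samples of size 30, and the usual t-based 90% interval; the function name `coverage` is hypothetical. It measures only the long-run fraction of intervals that contain the known mean, which is the claim being described; it does not by itself say anything about the probability attached to one particular interval after the data are seen.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Two-sided 90% critical value of Student's t with 29 degrees of freedom.
T_CRIT_90_DF29 = 1.699


def coverage(true_mean=50.0, true_sd=10.0, n=30, trials=4000, seed=1):
    """Fraction of t-based 90% intervals that contain the known population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = mean(sample)
        half_width = T_CRIT_90_DF29 * stdev(sample) / sqrt(n)
        if centre - half_width < true_mean < centre + half_width:
            hits += 1
    return hits / trials


print(coverage())  # expected to land close to 0.90
```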
Arbitrary section break (Confidence limits)
- Here's a concrete example of the problems you can get into with confidence limits.
- Suppose you have a particle undergoing diffusion in a one degree of freedom space, so the probability distribution for its position x at time t is given by
- $P(x \mid t) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right)$
- Now suppose you observe the position of the particle, and you want to know how much time has elapsed.
- It's easy to show that
- $\hat{t} = x^2$
- gives an unbiased estimator for t, since
- $E[\hat{t} \mid t] = \int x^2 \, P(x \mid t) \, dx = t$
- We can duly construct confidence limits, by considering for any given t what spread of values we would be likely (if we ran the experiment a million times) to see for $\hat{t}$.
- So for example for t=1 we get a probability distribution of
- $P(\hat{t} \mid t=1) = \frac{1}{\sqrt{2\pi \hat{t}}} \exp\left(-\frac{\hat{t}}{2}\right)$
- from which we can calculate lower and upper confidence limits -a and b, such that:
- $P(-a < \hat{t} - t < b \mid t) = 0.95$
- Having created such a table, suppose we now observe x. We then calculate $\hat{t}$, and report that we can state with 95% confidence $\hat{t} - b < t < \hat{t} + a$, or that the "95% confidence range" is $(\hat{t} - b,\ \hat{t} + a)$.
- But does that give a 95% probability range for the likely value of t given x? No, it does not; because we have calculated no such thing.
- The difference becomes perhaps clearest if we think what answer the method above gives, if the data came in that x = 0.
- That gives $\hat{t} = 0$. Now when t=0, the probability distribution for x is a delta-function at zero, as is the distribution for $\hat{t}$. So a and b are both zero, and so we must report a 100% confidence range, t = 0.
- Does that give a 100% probability range for the likely value of t given x? No, because we have made a calculation of no such quantity. The particle might actually have returned to x=0 at any time. The likelihood function, given x=0, is actually
- $P(x=0 \mid t) \propto t^{-1/2}$
- Conclusion: confidence intervals are not probability intervals for θ given the data. Jheald (talk) 15:54, 28 February 2008 (UTC)
Certainly confidence intervals are not probability intervals. Here's a simple example: two independent observations are uniformly distributed on the interval from θ − 1/2 to θ + 1/2. Call the larger of the two observations max and the smaller min. Then the interval from min to max is a 50% confidence interval for θ since P(min < θ < max) = 1/2. But if you observe min = 10.01 and max = 10.02, it would be absurd to say that P(10.01 < θ < 10.02) = 1/2; in fact, by any reasonable standard it would be highly improbable that 10.01 < θ < 10.02 unless you had other information in addition to that given above (e.g. if you happened to know the actual value of θ). And if you observed min = 10.01 and max = 10.99, then it would be similarly absurd to say that P(10.01 < θ < 10.99) = 1/2; again, it would be highly improbable that θ is not in that interval. Michael Hardy (talk) 20:54, 28 February 2008 (UTC)
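The two-observation example can be checked by simulation. The sketch below is a hedged illustration: θ is fixed at an arbitrary value, and the width thresholds 0.1 and 0.5 used to label an interval "narrow" or "wide" are my own choices, not part of the example. The function name `hardy_coverage` is hypothetical.

```python
import random


def hardy_coverage(theta=10.5, trials=100_000, seed=2):
    """Coverage of the (min, max) interval overall and split by observed width."""
    rng = random.Random(seed)
    counts = {"all": [0, 0], "narrow": [0, 0], "wide": [0, 0]}  # [hits, total]
    for _ in range(trials):
        a = rng.uniform(theta - 0.5, theta + 0.5)
        b = rng.uniform(theta - 0.5, theta + 0.5)
        low, high = min(a, b), max(a, b)
        covered = low < theta < high
        width = high - low
        groups = ["all"]
        if width < 0.1:
            groups.append("narrow")
        if width > 0.5:
            groups.append("wide")
        for group in groups:
            counts[group][0] += covered
            counts[group][1] += 1
    return {group: hits / total for group, (hits, total) in counts.items()}


print(hardy_coverage())
```

Over all trials the interval contains θ about half the time. Among the narrow intervals it rarely does, and among intervals wider than 1/2 it always does, because both observations lie within 1/2 of θ.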
I think Hardy and Jheald differ with Hubbardaie and myself in two ways. First, as I've said before, Jheald need only repeat this process in a large number of trials to show that the CI will capture the mean exactly as often as the CI would indicate. In other words, if the 95% CI is a to b, and we compute a large number of intervals a to b based on separate random samples, we will find that the known mean falls within 95% of the computed intervals. Second, Hardy is calling the result absurd because he is taking prior knowledge into account about the distribution. But, again, if this sampling is repeated a large number of times, he will find that only 5% of the computed 95% CIs will fail to contain the answer. If we move away from the anecdotal to the aggregate (where the aggregate is the set of all CIs ever properly computed on any measurement) we find that P(X within interval of Y confidence)=Y. ERosa (talk) 21:40, 28 February 2008 (UTC)
- I did not call it "absurd" because of prior knowledge; I said it's absurd UNLESS you have prior knowledge. It is true that in 50% of cases this 50% confidence interval contains the parameter, but in one of my cases the data themselves strongly indicate that this is one of the OTHER 50%, and in the other one of my cases, the data strongly indicate that this is one of the 50% where θ is covered, so one's degree of confidence in the result would reasonably be far higher than 50%. Michael Hardy (talk) 16:13, 29 February 2008 (UTC)
- Also, Jheald commits a non-sequitur and begs the question. He shows a calculation for a CI and up to the point of that answer, he is doing fine. But then he asks "But does that give a 95% probability range for the likely value of t given x?" and then states "No, it does not; because we have calculated no such thing". You correctly compute a confidence interval, but then make an unfounded leap to what it means or doesn't mean. You have not actually proved that critical step and your claim that you have not computed that is simply repeating the disputed point (i.e. begging the question). ERosa (talk) 21:44, 28 February 2008 (UTC)
- Well, the 95% CI calculation is different to what a calculation of a 95% probability range for the likely value of t given x would look like. But rather than labour the point, surely the coup-de-grace is what follows?
- If you observe x=0 in the example I've given above, the CI calculation gives you a 100% confidence interval for t=0.
- But the likelihood is $P(x=0 \mid t) \propto t^{-1/2}$
- So there is the key ingredient for the probability of t given x (give or take whatever prior you want to combine it with), and it is not concentrated as a delta-function at zero. Jheald (talk) 13:02, 29 February 2008 (UTC)
- But now in your new response I see you are backing off from your original claim that given a particular set of data x, the CI will accurately capture the parameter 95% of the time. Now I see that you are replacing that with the weaker claim that given a particular parameter value, t = t*, the CI will accurately capture the parameter 95% of the time. Alas, this also is not necessarily true.
- What is true is that a confidence interval for the difference $\hat{t} - t$ calculated for a correct value of t would accurately be met 95% of the time.
- But that's not the confidence interval we're quoting. What we're actually quoting is the confidence interval that would pertain if the value of t were $\hat{t}$. But t almost certainly does not have that value; so we can no longer infer that the difference $\hat{t} - t$ will necessarily be in the CI 95% of the time, as it would if t did equal $\hat{t}$.
- If you don't believe me, work out the CIs as a function of t for the diffusion model above; and then run a simulation to see how well it's calibrated for t=1. If the CIs are calculated as above, you will find those CIs exclude the true value of t a lot more than 5% of the time. Jheald (talk) 14:23, 29 February 2008 (UTC)
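- The simulation suggested above can be sketched as follows. It rests on assumptions that are my reading of the diffusion example rather than anything confirmed in the thread: x is normal with mean 0 and variance t, the estimator is t_hat = x², and the limits a(t), b(t) are the central 95% limits on t_hat − t. The names `limits` and `miss_rates` are hypothetical. The "plug-in" interval looks up a and b at t = t_hat, which is one interpretation of "calculated as above"; for comparison, a second interval is built by inverting the pivot t_hat / t, which is chi-squared with one degree of freedom.

```python
import random
from statistics import NormalDist

# If z is standard normal then z**2 is chi-squared(1), so the p-quantile
# of chi-squared(1) is the square of the normal (0.5 + p/2)-quantile.
_inv = NormalDist().inv_cdf
Q_LO = _inv(0.5 + 0.025 / 2) ** 2  # 2.5% quantile of chi-squared(1)
Q_HI = _inv(0.5 + 0.975 / 2) ** 2  # 97.5% quantile of chi-squared(1)


def limits(t):
    """Offsets (a, b) such that P(-a < t_hat - t < b | t) = 0.95."""
    return t * (1 - Q_LO), t * (Q_HI - 1)


def miss_rates(t_true=1.0, trials=50_000, seed=3):
    """Fraction of trials in which each interval fails to contain t_true."""
    rng = random.Random(seed)
    plug_in_miss = 0
    pivot_miss = 0
    for _ in range(trials):
        x = rng.gauss(0.0, t_true ** 0.5)
        t_hat = x * x
        a, b = limits(t_hat)  # table entry looked up at t = t_hat
        if not (t_hat - b < t_true < t_hat + a):
            plug_in_miss += 1
        if not (t_hat / Q_HI < t_true < t_hat / Q_LO):  # pivot inverted exactly
            pivot_miss += 1
    return plug_in_miss / trials, pivot_miss / trials


print(miss_rates())
```

Under these assumptions the plug-in interval misses t = 1 in roughly half of the trials, while the pivot-based interval misses about 5% of the time, so the result depends on how the interval is constructed from the table.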
- I've written in the confidence interval article talk something this entire discussion has been seriously lacking... citations! See the rest there. Hubbardaie (talk) 14:37, 29 February 2008 (UTC)
- By the way, I also made a similar argument to ERosa that, over a large number of trials, 95% of 95% CIs will contain the true mean of a population. I haven't backed off of it and I don't see where ERosa has. In fact, the Student's t-distribution (which I wrote about in my book) was initially empirically derived with this method. So, alas, it IS true that the 95% CI must contain the true value 95% of the time. If you don't believe me, run a simulation on a spreadsheet where you randomly sample from a population and compute a CI over and over again. So much for the coup-de-grace. But this is getting us nowhere. Refer to the citations I provided in the confidence interval article. You also have to provide verifiable citations for anything you say or you run the risk of violating the NOR rule. Hubbardaie (talk) 14:45, 29 February 2008 (UTC)
- But I'm not calculating the mean of a population. I'm trying to get a confidence limit for an unknown time, given a position measurement.
- The CIs I get don't reflect the probability distribution P(t|x) for that unknown time, given the measurement.
- That is sufficient to dispose of the assertion that confidence intervals necessarily reflect the probability distribution for their parameter given the data.
- You might also like to reflect that WP:NOR specifically does not apply to talk pages, and per WP:SCG the creation of crunchy examples and counter-examples is not considered OR. Jheald (talk) 15:47, 29 February 2008 (UTC)
- First, I didn't say NOR applied to talk pages. Of course, knock yourself out and apply all the original research you want in here (that is, one might presume it's original since you never provide a citation). I'm just cautioning you for when and if you decide to modify the actual article. In there you will need citations, so why not show them here, too? And you seem to have backed off of your original position quite a lot. As I review your conversations with me and others over the last couple of weeks, you originally said that a frequentist would resist using P(X) at all. This morphed into a conversation about whether a confidence interval a to b has a probability of P(a<x<b). The fact that William Sealy Gosset derived the first t-stats by empirical methods settles that issue. The citations I showed in the confidence interval page contradict your position. You haven't proven anything. You, again, made an unrelated point followed by an unfounded leap to the original debated assertion. And you have, again, confused situations where a possible but unlikely set of observations can in one situation produce a range that doesn't contain the true value, when the true value is known, with situations where you don't know the true value to begin with and are trying to assess the probability distribution of possible population parameter values. Hubbardaie (talk) 16:03, 29 February 2008 (UTC)
- In the particular case of a Student t-test, the 95% confidence interval does match a Bayesian 95% credible interval. (For a derivation, see eg Jeffreys, Theory of Probability). Student in fact used an inverse probability approach to derive his distribution; similar to Edgeworth, who'd used a full Bayesian approach back in 1883. The reason the two match is (i) we assume that we can adopt a uniform prior for the parameter; (ii) that the function P(θ'|θ) is a symmetric function that depends only on (θ'-θ), with no other dependence on θ itself; and also that (iii) θ' is a sufficient statistic.
- Under those conditions, a 95% confidence interval will match a Bayesian 95% credible interval. But in the general case, ie in other situations, as in the example I gave higher up, the two do not match. Jheald (talk) 18:12, 29 February 2008 (UTC)
- So: does a confidence interval a to b in general have a probability of P(a<x<b|data) = 0.95 ? In general, no. And even when it does (like the case of the t-test), in moving from one to the other, one is (either consciously or unconsciously) making a transition of worldview, from the frequentist to the Bayesian.
- I don't back down from what I said above. The notion of a conditional point probability, or an interval probability, for P(θ|data) is not a Frequentist notion. A proper frequentist would not talk about P(θ) at all. Talk about P(θ|data), where θ is a non-random parameter, is only meaningful in the context of a Bayesian outlook. If somebody does believe in P(θ|data), then they either don't care about Frequentism, or don't understand it. Jheald (talk) 18:28, 29 February 2008 (UTC)
- Look, I respect where you are coming from. You are clearly not a total layman on the topic. But I won't repeat how your claim doesn't address the issue of how, when you have no a priori knowledge of a population's mean or its variance, the CI is meant to mean what the sources I cite say it means. I understand you continue to insist that when someone says a CI is a range that contains the true value with a given probability, they must be wrong or misleading, contrary to what the authoritative sources I cite clearly state. Let's just lay out the citations and show both sides in the article. Hubbardaie (talk) 22:43, 29 February 2008 (UTC)
- ^ See Gillies 'Induction and Probability' Parkinson (ed) An Encyclopedia of Philosophy 1988; p263-4, Howson & Urbach 1989
- ^ He said "...betting strictly speaking does not pertain to probability but to the Theory of Games". See "The role of 'Dutch Books' and 'Proper Scoring Rules' " in British Journal for the Philosophy of Science 32 1981 55-6.