Noninertial Frame: Perspectives of a Liberal-Arts Physicist<br />by T_Timberlake<br /><br />Polanyi's Personal Knowledge: Intellectual Passions (January 9, 2009)<br /><br />OK, I loved this chapter. It really resonated with me (even though I have a few problems with it). In particular I thought the section on Scientific Value was spot on. He defines three criteria for assessing the value of a scientific affirmation: certainty, profundity, and intrinsic interest. He aptly points out that the three are to some extent incompatible: in enhancing one aspect of a theory we may degrade another. In particular he claims that modern science has a tendency to follow the "Laplacean ideal" of strict objectivity, which enhances certainty at the expense of intrinsic interest. I would agree, at least to some extent. I see this in some of the more extreme proponents of a Theory of Everything. High energy physics has achieved tremendous precision, and its practitioners should be praised for that. The Laplacean ideal, in my opinion, isn't BAD. It's just not the ONLY thing. Let's not pretend that high energy physics can possibly give us a theory of "everything". In part this is just a misnomer: most physicists mean by this only (only!) a single theory that accounts for all four known fundamental forces. But some actually claim that achieving such a theory would represent the end of physics. That would only be true IF all we really care about is particle physics. I'm all for particle physics. But I'm for lots of other stuff too that does not, as far as we can tell, reduce to particle physics.
Perhaps it does, but let's be honest about the fact that we can't tell yet.<br /><br />So Polanyi emphasizes the personal component involved in discovery and in the evaluation of a discovery as a "true" discovery. He claims that our anticipation that a new theory will be fruitful leads us to believe that it is true. He admits that this anticipation can be misguided (as he says in the previous chapter, we have to risk saying nonsense in order to say anything at all). He also discusses something very like Kuhn's incommensurability when he states that "Formal operations relying on one framework of interpretation cannot demonstrate a proposition to persons who rely on another framework." (p. 151) And he makes it clear that he thinks there are no formal rules whereby we can mediate such disputes, or determine which facts are of scientific interest and which are not. He does say that a new conception can sometimes reconcile two competing frameworks, but this is not generally how he solves the problem. His solution, like that of Kuhn, seems to be a sociological one. Each scientist is accredited by the scientific community as an expert in a certain area. Each accredited scientist then has the job of accrediting the work of others in that area and closely related areas. By this means the society of scientists polices itself.<br /><br />I find this a little dissatisfying. Polanyi makes such a big deal about the role of personal passions in generating scientific conceptions and the belief that those conceptions are true. But he then claims that we can't sit alone in thinking our new conceptions are true. Our new conceptions must "conquer or die." (p. 150) So we get mired in sociology. But what makes the collective group of scientists better able to judge a NEW conception (rather than work that fits entirely within an accepted framework, which they are specifically trained to judge) than an individual scientist?
Presumably it goes back to Polanyi's assertion that we are competent to judge what is "real": although competence doesn't imply perfection, if you get enough competent people working together you can get it right most of the time. Presumably we can count on each scientist to do a good job policing the whole of science as long as they maintain their "passion for mental excellence" which "believes itself to be fulfilling universal obligations." (p. 174) While I agree with Polanyi that formal rules for judging scientific theories are "doomed to failure" I cannot say that I find his solution to the problem fully acceptable.<br /><br />I liked his discussion of mathematics, especially his claim that math is not just a set of non-contradictory statements but rather a set of INTERESTING (and also non-contradictory) statements. He also discounts the notion that mathematics consists only of tautologies by pointing out that the axioms are not tautologies. And it is important to note that we don't use just any set of axioms, but only those that are interesting or fruitful.<br /><br />Finally, I enjoyed his last section on Dwelling In and Breaking Out. The language he used was almost mystical, but the experience he is trying to describe is real. There is a difference between dwelling in a scientific theory (gaining a deep appreciation for its beauty, internalizing it, making it part of your thinking) and using the theory in a routine manner. His description of discovery also sounds mystical but gets at something real. There is a sense of breaking out or breaking through when a discovery is made (not that I've made any important discoveries, but even the little discoveries carry some of this feeling). I'm with Polanyi that science is ultimately pursued for these moments of personal passion.
That's certainly why I do it.

Polanyi's Personal Knowledge: Articulation (January 9, 2009)<br /><br />This post continues my comments on Polanyi's book.<br /><br />In this chapter Polanyi analyzes various levels of intelligence, starting from primitive animal intelligence and working up to full-scale human language and interpretive frameworks. I found this chapter a little rough going, presumably because of my limited background in linguistics and cognitive psychology. But I'll try to comment on a few things that stood out to me in this chapter.<br /><br />The first is Polanyi's distinction between heuristic and routine stages of learning. It seems to me that this dichotomy fits well with two other dichotomies commonly seen in the philosophy of science. One is the distinction between the context of discovery and the context of justification. The other is Kuhn's distinction between a crisis and normal science. In all three of these dichotomies the former part is where the really interesting (but hard to define) stuff happens, while the latter part is just a matter of working out the details of the ideas generated by the former. This reminds me that one thing I greatly appreciate about Polanyi is his willingness to tackle the context of discovery. Traditionally philosophers of science have refused to touch this issue, concentrating only on the context of justification. Discovery was left as a mystery. Kuhn tackled the context of discovery from one direction, namely the sociological one. He claimed that the processes of normal science would give rise to anomalies and eventually the social pressures to deal with the anomalies would become so great that a crisis would be created.
This may explain why a given discovery happened approximately when it did, but it doesn't say anything about the personal participation of the scientist in discovery. This, of course, is exactly what Polanyi is on about. I particularly like how he points out the irreversible nature of discovery. One cannot un-discover something one has discovered! He also talks about intellectual discomfort as the driving force that shapes new conceptions. This sounds a lot like Kuhn's crisis, but on a personal level.<br /><br />I was interested in his Laws of Poverty and Consistency for language - the idea that you can't have a word for everything (so that words will be repeated) and you must use words consistently for them to have any meaning. Presumably the same laws apply to scientific theories. You don't want a theory for everything: you want to be able to apply a single theory to a wide variety of situations. But you must also use the theory consistently, so it will not generally be freely adaptable to EVERY situation. I particularly like his use of the map analogy (which I have seen elsewhere, in the work of Thomas Brody). A good map must achieve a balance between accuracy and usability. A 1:1 scale map is useless - you may as well just walk the streets. A map with insufficient detail, though easy to use, is also useless if it doesn't provide the information you need. Languages and theories are like that. We want them to be accurate and precise, but they must remain tractable and therefore they must contain SOME imprecision and ambiguity. Frankly, I think that it is in the deft handling of these ambiguities that the beauty of both language and science is seen.<br /><br />I think this ties in well with his point about mathematical formalisms. He claims that mathematics and symbolic logic are tools that simply assist our inarticulate intelligence in working out answers. The formalism does not, and cannot, truly give us anything new.
But there is a point here that Polanyi could make that I didn't get from the reading (although maybe it is there). He talks about the fact that much of our inarticulate knowledge can never be made articulate. But it seems as though a good formalism helps you to articulate more than you otherwise could. After all, without some sort of formalism (like a primitive language) you couldn't articulate anything. Are math and logic the BEST formalisms in this sense? Or are they the best at helping us articulate only certain types of knowledge?<br /><br />Polanyi's big point in this chapter seems to be that all of our articulate knowledge involves a self-assessment, or self-accreditation, of our own act of knowing. We develop and use an interpretive framework, but constantly assess the framework. We even have the expectation that the framework will break down at some point, when we encounter something truly novel, and we are prepared to adapt the framework to this new experience. This adaptation too is subject to our self-appraisal. Polanyi admits that we might very well question our own ability to evaluate our frameworks. He seems to say that the answer to this question involves a leap of faith in which we express confidence in our ability to recognize an objective reality. All adaptations of our interpretive framework are undertaken with the goal of getting closer and closer to reality. He takes it for granted that getting closer to reality is universally satisfying, and therefore also personally satisfying (which is why we can rely on our self-satisfaction, or lack thereof, when judging our theories or utterances or whatever). I agree with him on this, but I'm not sure how well his argument holds up so far.<br /><br />One quote from this chapter bothered me a bit:<br /><br />"Man's whole intellectual life would be thrown away should this interpretive framework be wholly false; he is rational only to the extent to which the conceptions to which he is committed are true.
The use of the word 'true' in the preceding sentence is part of a process of re-defining the meaning of truth, so as to make it truer in its own modified sense." (p. 112)<br /><br />I'm not sure I understand what he is saying in the second sentence and I have not yet figured out how he is re-defining truth.<br /><br />There were a few tidbits in this chapter that were interesting for me as a teacher. He talks about the fact that a problem must be hard, but not too hard, for its solution to be enjoyable for the solver. A problem can only produce intellectual strain in someone who understands the problem, but it will also only produce that strain (the alleviation of which leads to joy) if it is challenging. I agree, and I also agree with his point about needing to work through concrete problems to master a subject like math or physics. But maybe these things are obvious.<br /><br />Finally, I was intrigued by his statement to "look at the known data, but not in themselves, rather as clues to the unknown; as pointers to it and parts of it." (p. 127-128) Just a month ago I was telling my students that Galileo's genius was that he saw the truth as hidden WITHIN experience rather than as hidden BY experience (as Plato and the neo-Platonists saw it) or as equivalent to experience (as the followers of the Aristotelian tradition tended to see it). The Platonists didn't want to look at data at all; they just wanted to think. The Aristotelians wanted to look at the data in themselves, as the actual subject of study, the phenomena to be "saved". But Galileo sought to gain access to a hidden reality not through thought alone, but by thinking ABOUT THE DATA.
I think that this is what Polanyi is getting at, and I think this viewpoint does presuppose that there is an objective reality and that it has some meaningful connection with our sense experience.

Polanyi vs. Kuhn (December 22, 2008)<br /><br />As I am reading Polanyi's <i>Personal Knowledge</i> I can't keep thoughts of Thomas Kuhn's <i>Structure of Scientific Revolutions</i> far from my mind. So much of what Polanyi is saying is similar to what Kuhn says, although I believe they differ on certain key points. Kuhn apparently claimed that his work did not take anything from Polanyi's, but that is hard for me to believe given the fact that several of Kuhn's key theses are stated (in some form) in Polanyi's book, which was published well before <i>Structure</i>.<br /><br />A member of my discussion group sent me a link to an article in the Polanyi Society's periodical by Martin Moleski on "Polanyi vs. Kuhn." The article is available here: <a href="http://www.missouriwestern.edu/orgs/polanyi/TAD%20WEB%20ARCHIVE/TAD33-2/TAD33-2-basic-pg.htm">Polanyi vs. Kuhn</a>. The rest of this post consists of my thoughts on Moleski's article.<br /><br />I found Moleski's article very helpful in disentangling some initial impressions I had about the similarities between Polanyi and Kuhn. I have always been very troubled by Kuhn's concept of incommensurability. I have heard that Kuhn developed many of his ideas about scientific theory change while writing his book on the Copernican Revolution. But the more I have studied the Copernican Revolution the more I have been bothered by Kuhn's notion that nothing can mediate the two sides of the dispute in a "paradigm shift."
It would seem that Polanyi may provide a middle ground between purely empiricist claims that sense experience is always the final arbiter in any scientific dispute (a claim that simply doesn't fit with most of the history of science) and Kuhn's relativist position. I am convinced that there are ways to arbitrate disputes over scientific theories, but that the principles that can be used to arbitrate these disputes are metaphysical rather than purely empirical. One way to say this is that scientists have a commitment to making their theories match the empirical data, but they also judge scientific theories according to a variety of what might be termed "aesthetic" criteria. The extent to which a scientific theory fits a given piece of empirical data may be (but is not always) unproblematic, but the choice of which aesthetic criteria are applicable can lead to major disagreements about which theory is superior.<br /><br />Let me draw an example from the Copernican Revolution (since I have been studying this subject for 6 months now, and since both Polanyi and Kuhn make use of this example). In selecting between the Ptolemaic and Copernican systems, astronomers certainly considered how each theory matched the observed data. But both theories were equally good (or bad) in this regard. It turned out Copernicus' theory was a bit easier to use in some ways, so many astronomers adopted his system for their calculations - but almost none of these seem to have believed in the reality (or truth, if you will) of the Copernican system. Some took what might be described as a positivist approach in which they viewed astronomical theories as ways of generating predictions for planetary positions, and nothing more. But some saw astronomical theories as representations of reality, and these almost invariably committed themselves to the Ptolemaic view. Why? It seems to me it was because of the aesthetic criteria they chose to apply.
Specifically they wanted an astronomical theory that fit with common sense observations (we don't FEEL the Earth moving), with the prevailing Aristotelian physics and cosmology, and that didn't contradict Holy Scripture. In favor of the Copernican system one could bring to bear a different set of aesthetic criteria, criteria that focus on the coherence and order of the theoretical system and the fact that only with the Copernican system could the distances to the planets be determined from observation. These were the features that ultimately convinced a few later astronomers like Kepler and Galileo to adopt the Copernican view.<br /><br />It might seem that this would lead us to Kuhn's relativistic impasse: how are we to judge which set of aesthetic criteria is best? Clearly not by appeal to empirical data. Any principles which help us to select the best set of aesthetic criteria must be metaphysical. I take Kuhn to say that the selection of aesthetic criteria is subjective and is mostly a matter of conforming to tradition. And yet, there is no doubt that aesthetic criteria, like scientific theories, DO change. The Copernican Revolution did not occur because the Copernican theory finally proved itself empirically superior to the Ptolemaic theory (that happened only after Kepler radically reinvented the "Copernican" theory - and Kepler had to first accept the Copernican viewpoint before he could transform it into an empirically superior theory). My sense is that the Revolution occurred instead because of changes in the aesthetic criteria used to judge scientific theories. Kepler, for example, saw coherence and order as the most critical criteria for judging a theory. He was awed by the intellectual beauty of the Copernican theory, in contrast to what he saw as the ugliness of the Ptolemaic theory. Galileo, on the other hand, came to question Aristotle's physics (and cosmology) as well as the "common sense" views about motion.
Both men abandoned some of the aesthetic criteria that had been used against Copernicus and adopted new criteria that came down in favor of the heliostatic theory. They both also tried to find a middle ground on the issue of Scripture, claiming that the Bible is not a science text but a revelation about God that is given from a "human perspective". Neither was very successful in this venture: Kepler's views on interpreting Scripture were basically ignored (like much of his astronomy, for a while) and it was primarily Galileo's attempts at Scriptural interpretation that got him in trouble with the Inquisition.<br /><br />So how were these men able to break from 1400 years of tradition? What motivated their change in aesthetic criteria? I am becoming increasingly convinced that it was their commitment to realism. Both men were trying to get at the truth of the world, rather than an economical description of it. They found the disjunction between astronomy and physics unacceptable. A TRUE theory should account for the motion of the heavens AND the motion of objects on Earth. They sought to unify physics and astronomy because they were convinced that only such a unification could bring them closer to the truth. They went about this unification in different ways, and both made mistakes (from our modern point of view). Kepler retained Aristotle's ideas about motion on the Earth and tried to apply them to the heavenly bodies. Galileo dismantled Aristotle's views on Earthly motions but his replacement was greatly influenced by Aristotle's views about the "natural" circular motions of the heavenly bodies. But both men came to question the traditional aesthetic criteria for judging astronomical theories and began to apply radically new criteria in an effort to find the truth (or so I think).<br /><br />This is, according to Martin Moleski, the dividing line between Kuhn and Polanyi. Kuhn refuses to commit to any ideal of truth or reality. 
Thus he remains mired in relativism and incommensurability. Polanyi is willing to take the bold step of proclaiming that science seeks the truth (not that it can necessarily get there, but it at least SEEKS to get there). This commitment to realism provides, I believe, a way out of relativism and incommensurability. It seems to be the case that a realist viewpoint drives us toward adopting certain criteria for scientific theories while leading us to reject others. Polanyi talks about some of these criteria that lend themselves to a realist perspective: "man's delight in abstract theory" and the idea that "theories may be constructed without regard to one's normal mode of experience". I look forward to reading more about his criteria for objectivity, which I take to be aesthetic criteria for judging scientific theories. My hope is that I will find Polanyi's approach much less troubling than Kuhn's.

Polanyi's Personal Knowledge: Part I (December 22, 2008)<br /><br />I've started reading Michael Polanyi's <i>Personal Knowledge</i> as part of a discussion with a group of scientists and theologians. I intend to record my thoughts on Polanyi to share with the discussion group, and this blog seems like a good place to do so. Without further ado, here are my thoughts on Part I of Polanyi's book:<br /><br />I think I like the general direction that Polanyi is headed in Part I, but I am anxious to see how he resolves some issues that seem like potential problems to me. I am very much in favor of his advocacy of a metaphysical commitment to truth on the part of scientists.
I think a commitment to devising TRUE theories about REALITY is essential to genuine science, but some philosophers of science have dismissed this notion either as meaningless (like the logical positivists) or as a goal to which we can aspire but which we can never know if we have attained, or even come close to (like Popper). But if you look at the history of science it seems as though it was a metaphysical commitment to finding the truth which led to all the great advances in our knowledge. I should be careful to point out that a commitment to finding the truth is quite different from a belief that one's current theory is true.<br /><br />The value of this metaphysical commitment can be made to fit with the views of other philosophers of science (but probably not the positivists). For example, a commitment to devising a true theory (as opposed to an empirically acceptable theory) makes your theory more falsifiable in Popper's sense, because you must accept ALL of the consequences of your theory as legitimate predictions. You can't just take those things the theory was designed to predict and leave out those things it wasn't designed for. Polanyi seems to be talking about this aspect of a theory when he talks about the "indeterminate scope of its true implications". This commitment fits with Lakatos's idea of a hard core of beliefs (those which the scientist believes are TRUE) and a protective belt (those which the scientist is less committed to and thus more willing to change or discard). Further, it fits with Kuhn's idea that from a practical perspective we must accept a certain set of things as true in order to get any work done. But I think Polanyi seems to be reaching for something more than any of these others deliver - I hope he gets us there by the end!<br /><br />I also like Polanyi's claim that in respect to their approach to truth, theories must be judged using what are essentially aesthetic criteria (criteria which he equates with "rationality").
This is very important in his discussion of order, in which he claims that to talk about randomness we must first recognize some kind of distinctive order. This recognition is a personal act and it involves aesthetic considerations. Aesthetic criteria come into play in his description of statistical tests as well: we will require more stringent statistical evidence for hypotheses that we feel are intrinsically unlikely (like his horoscope example), while requiring less stringent evidence (perhaps using Fisher's 5% rule) for a hypothesis we deem not unlikely (like Darwin's cross-fertilization hypothesis). I AM concerned about how much subjectivity this leads to. Different people can hold to different aesthetic criteria. How will Polanyi suggest that we mediate between different sets of criteria, each of which is judged to be a "rational" set by its holder? I look forward to finding out later in the book: at this point I'm still worried by this.<br /><br />To take an example, let me pick on a statement in the book that bothered me. On p. 4 he says "the Copernican system, being more theoretical than the Ptolemaic, is also more objective. Since its picture of the solar system disregards our terrestrial location, it equally commends itself to the inhabitants of Earth, Mars, Venus, or Neptune, provided they share our intellectual values." This statement may be fine for the modern reader. But a 16th century astronomer would not have known what to make of this argument. For him (and it would have been a "him") the idea of inhabitants of Mars or Venus would have seemed absurd (and he wouldn't have known what Neptune was). He would not understand why there was any value to having a system that works just as well from the viewpoint of Mars. Mars was a fundamentally different entity from the Earth (it was an eternal, perfect, celestial body, in contrast to the corrupt and mutable Earth).
He might agree that the Copernican theory was more abstract, but he would have questioned the value of that abstraction. So is Polanyi saying that abstraction is a universal criterion and that the 16th century astronomer is simply wrong (in an objective way) for not choosing the more abstract theory? I'm not sure yet that's what he's saying, and if he is I'm not sure what I think of it. I'm very leery of any attempt to lay down universal objective criteria for choosing scientific theories. All attempts at doing this so far have failed, in my view.<br /><br />A similar point could be made about his statement (on p. 64) that "since every act of personal knowing appreciates the coherence of certain particulars, it implies also submission to certain standards of coherence." I can see how this submission could lead to less subjectivity. But what if two people have different "standards of coherence"? What do we do then? Whose standards do we follow? Kuhn would say we follow the standards of our chosen paradigm and that there is simply no real way to mediate between two opposing paradigms. I am very uncomfortable with this view. I hope that Polanyi will show us how "submission to the compelling claims of what in good conscience I conceive to be true" (p. 65) will help us make at least partially objective choices between competing theories, even if he can't spell out specific rules for how this might work.

Reflections on my own research (December 22, 2008)<br /><br />I don't normally talk about my own research on this blog. Mainly this is because my research is in a fairly esoteric field (quantum chaos), so it's pretty technical and doesn't really fit the general theme of the blog.
But I just finished writing a new paper and while writing it I was struck by the vast difference between the way the research is laid out in the paper and the way it actually happened. So I thought it would be interesting to tell the story of our "discovery" and then contrast it with how I wrote the paper. (I'm not claiming that it is a discovery that is in any way important, even within my field of specialty - although I hope it is considered important enough for the paper to be published - but nonetheless we did find and explain a new phenomenon that has not previously been discussed in the literature.)<br /><br />The project was one I worked on with an undergraduate student. The aim of the project was to examine the relationship between classical mechanics and quantum mechanics in a particular model system. In quantum mechanics a particle trapped in a small region of space can only have certain specific values of energy. These are known as the energy eigenvalues of the system. It turns out that the statistical properties of the differences between consecutive eigenvalues depend on whether the dynamics in the classical version of the system is regular or chaotic. For one-dimensional systems like the one we were studying (which are always regular according to Newtonian physics) the general expectation is that the eigenvalues will be uniformly spaced. But our system has an unusual feature: it has non-Newtonian orbits. I won't go into the details, but these are basically paths that the particle follows in the classical limit of quantum mechanics, but not in Newtonian mechanics. It turns out our system has a lot of these non-Newtonian orbits and we were interested to see if that might have some effect on the spacings between consecutive quantum eigenvalues.<br /><br />It turns out that they do have an effect. My student and I set up the necessary numerical calculations and she did all the hard work.
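The model system from our paper isn't described here, so as a generic illustration of the kind of calculation involved, the sketch below computes consecutive eigenvalue spacings for a textbook one-dimensional system, the infinite square well (E_n proportional to n squared), and "unfolds" them by dividing by the local mean spacing. For a regular 1D system the unfolded spacings tend to one at high energy, which is the baseline behavior against which deviations like ours stand out. The choice of system and normalization here are my own assumptions for illustration, not the ones used in the paper.

```python
import numpy as np

# Hypothetical illustration (NOT the system from the paper): energy
# eigenvalues of a 1D infinite square well, E_n = n^2 in units of
# hbar^2 pi^2 / (2 m L^2).
n = np.arange(1, 1001)
E = n.astype(float) ** 2

# Consecutive spacings, then a simple "unfolding": divide each spacing
# by the local mean spacing dE/dn ~ 2n. For a regular 1D system the
# unfolded spacings approach 1 at high energy.
spacings = np.diff(E)                # E_{n+1} - E_n = 2n + 1
unfolded = spacings / (2 * n[:-1])   # equals 1 + 1/(2n), which -> 1

print(unfolded[0], unfolded[-1])
```

Structure in such a sequence (like the distinct curves described below) is what signals something beyond the regular baseline.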
The figure below shows the spacings she calculated.<br /><br /><img style="display:block; margin:0px auto 10px; width: 320px; height: 210px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgByur4SfBitHhhyphenhyphen_VmPnj7qSSJP9VxvqCG_NhiwsvKRkVjtUhyWyLQ4fjYbr3t5MUNik91eI70dKRDnUqSkMUiGDWkMJ0d8UYEhLb26GEk5lufrraRlMtf0X7kos6A-uP7VYGaFaM1_DfA/s320/Spacings.jpg" alt="Quantum eigenvalue spacings versus n" /><br /><br />Without the non-Newtonian orbits in the classical system we would expect all of the quantum eigenvalue spacings to equal one. Clearly they do not. So we found what I had hoped we would find. We also expected the spacings to get closer to one as we went to higher and higher energies, which they do. But my student's calculations had uncovered something else. Notice that at certain values of <i>n</i> the seemingly random scatter of spacings coalesces into a few distinct curves. In particular, around <i>n</i>=525 or so the sequence of spacings forms three distinct curves. This was totally unexpected and as far as I know it's never been seen before (it's also probably not very important, but let's not dwell on that now).<br /><br />So what were these unexpected features in the spacings sequence? As soon as I saw it I had an idea of what it MUST be. I was sure it was a resonance phenomenon. Resonances show up all over the place in physics, and to me this looked like a resonance. But there can be many different kinds of resonance phenomena - what kind was this?
I immediately focused on two possibilities: either it was a resonance between the periods of two non-Newtonian periodic orbits (so that the ratio of their two periods would form a rational number with a small denominator) or it was a resonance between the actions of two such orbits (action is an important quantity in classical physics, but it's hard to describe without a lot of math). My gut told me it was the period resonance, but I knew that classical actions can play an important role in determining quantum eigenvalues so I couldn't neglect that possibility. I was able to derive a formula for each of these options that would allow me to predict exactly where these resonance features should appear. I checked the formulas against the results shown in the figure above and they BOTH matched - for the particular set of parameters we were using in our calculations they gave the same predictions. However, I could tell that for a different set of parameters the two formulas would give different predictions, so my student did the calculations with the new parameters: the period resonance formula fit perfectly and the action resonance formula failed.<br /><br />At first glance this seems to fit a Popperian model of science. I came up with two competing theories to explain the observed data and subjected both theories to rigorous testing. One theory was falsified, the other theory survived. Now I can publish the successful theory and move on, right? But that's not really the whole story. First of all, how did I come up with these two theories? How did I recognize this odd behavior in a sequence of numbers as a resonance phenomenon? How did I know before doing any testing which of the two theories was better? Was it just a lucky guess? I don't think so, but I can't really articulate how I formed the idea. It just LOOKED like a resonance, and the period resonance idea made more physical sense to me.
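As an aside, the "rational number with a small denominator" condition behind the period-resonance idea is easy to make concrete. Here's a toy sketch (the function name, tolerance, and denominator cutoff are all invented for illustration - this is not the formula from the paper):

```python
from fractions import Fraction

def low_order_resonance(T1, T2, max_den=8, tol=1e-3):
    """Return p/q if the period ratio T1/T2 lies within tol of a
    rational with denominator <= max_den, otherwise None."""
    ratio = T1 / T2
    guess = Fraction(ratio).limit_denominator(max_den)
    return guess if abs(float(guess) - ratio) < tol else None

# Two periods nearly in a 3:2 ratio count as a low-order resonance...
print(low_order_resonance(3.0001, 2.0))        # Fraction(3, 2)
# ...but a golden-ratio pair of periods does not.
print(low_order_resonance(1.6180339887, 1.0))  # None
```

The real calculation, of course, predicts where in the spectrum such period ratios occur, but the small-denominator test itself is no more than this.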
Obviously my training played a part in this (as I said above, resonances are all over physics - so resonances are the kind of thing physicists are trained to spot). But I can't really think of another example of a resonance that looks just like this one. And I find this fascinating, in part because I'm reading Michael Polanyi's <i>Personal Knowledge</i> right now and he makes a big deal of the inarticulate, tacit component of science. I feel as though I saw Polanyi's ideas in action in my own work.<br /><br />I should point out that although some aspects of this work do seem to fit Popper's model, none of it fits the empiricist/positivist induction model. I used nothing more than qualitative knowledge of the resonance features to derive my period resonance theory (and the competing action resonance theory). I didn't pay any attention at all to the actual energies at which these resonances occur until AFTER I had devised a formula (or, rather, two formulas) to predict the energies at which they SHOULD occur (if my theory was right).<br /><br />But that's not all that struck me about this research. As I said, I just finished writing the paper and in the paper you won't see a discussion of these two competing theories. It became unnecessary. You see, once I was absolutely certain that my period resonance theory was correct I figured out how to DERIVE it from something called semiclassical theory. Now this was not a trivial derivation and I honestly don't believe I could possibly have done it without knowing exactly where I wanted to end up. It is not at all obvious (to me, anyway) from the semiclassical theory itself that resonance features should show up in the sequence of eigenvalue spacings.
It took a lot of work to bring those features out, and I wouldn't have known how to do that work (or even that the work should be done) if I hadn't already figured out what the result would be.<br /><br />In my paper, though, you will just find a presentation of the numerical data and some discussion pointing out the unusual features. Then you will find a derivation of the period resonance theory from the semiclassical theory. You'd never know from my paper that I figured out the period resonance theory BEFORE I even started working with the semiclassical theory. I think that this is fairly typical of scientific papers (I know it is typical of my own). Scientific papers rarely describe the actual process of discovery. Some of what is left out is a description of errors and dead ends (that's the case for my current paper too - we spent a long time incorrectly calculating the spacings before we realized what we were doing wrong, but you won't find any discussion of THAT in our paper). But often there are some really interesting aspects of the process of science that get left out. And I think that's a shame. I guess that's why I decided to write this blog entry!<br /><br /><i>Incommensurable Football</i> (2008-12-01)<br /><br />My new approach to the blog (short, frequent posts) didn't last long. So after a four-month hiatus from the blog (spent largely trying, successfully I hope, to figure out how to teach non-science majors about the Copernican Revolution) here's another really long entry...<br /><br />Now that I've re-read Kuhn's <i>The Structure of Scientific Revolutions</i> and read his <i>The Essential Tension</i> (not to mention his <i>The Copernican Revolution</i>) I still find myself troubled by his idea of incommensurability.
I agree with Kuhn's commitment to evaluating scientific theories from the standpoint of those who held them. We must accept that our criteria for judging theories change over time and therefore there will be cases in which one theory is judged superior using a certain set of criteria (adopted by one group of scientists) while another theory is judged superior using a different set of criteria (by a different group of scientists). There is no doubt that such cases have arisen in the history of science. I'm sure it is the case that some of these disagreements were resolved through social or psychological, rather than scientific, means. But I remain convinced that these dilemmas COULD have been resolved by scientific means, eventually, at least in almost all cases. And I've been inspired in my thinking on this topic by, of all things, college football. Bear with me for a moment as I talk football. I'll return to the philosophy of science in a bit, but I've got to set the stage first.<br /><br />I earned my PhD in physics from the University of Texas at Austin and am therefore a fan of the Longhorns. It follows from this that I cannot stand the Oklahoma Sooners. These two teams, along with Texas Tech (about whom I have no strong feelings), are currently embroiled in a controversy over who should be declared the champion of the Big XII South Division. All three teams have identical 11-1 records (7-1 in the conference). Texas beat OU, Texas Tech beat Texas, and OU beat Tech. The conventional criteria (conference record, overall record, head-to-head results) cannot produce a unique winner for the division. This is, I feel, rather like two (or three?) scientific theories that fit the evidence accepted by their proponents equally well.
Perhaps there is a "crucial experiment" in favor of one theory, but there is also a "crucial experiment" in favor of the other theory (I view the crucial experiment as being like the head-to-head matchup - although in football one can always question whether or not the "best team always wins" and the same is likely true of science). The rules of the Big XII provide a solution to this football dilemma, but the solution is a very Kuhnian one: the winner of the division is determined by which team has the highest BCS ranking. Note that the BCS ranking is determined by computer polls (constructed by "experts" who use various statistical and numerical criteria to rank teams against each other) as well as by human votes. In other words, this dilemma is settled through social means. Just like, according to Kuhn, disputes between scientific theories.<br /><br />As it turns out the 'Horns (along with the Red Raiders) drew the short straw and the hated OU Sooners have been declared division champs. In all honesty, they deserve it as much as the Longhorns do (though perhaps not any more). There is a genuine ambiguity here. It seems as though the only possible solution is the social one. Over the last several days I've read innumerable attempts to apply logic to the situation, logic which inevitably shows that the team favored by the logician should be chosen as the division champ. Sounds a lot like debates over phlogiston or the motion of the Earth! The truth is that the usual standards simply fail to supply a clear answer in this case. It's painful for us Longhorn fans, but the truth is that we can't prove that it SHOULDN'T be OU in the title game - except by appealing to the innate superiority of Texas over OU that we all feel deep down in our bones. But I'm sure OU fans feel the same way about their team (assuming OU fans have normal human feelings...).<br /><br />So far my story seems to be heading in a pessimistic direction.
If we can't even figure out which of two football teams is better, how can we hope to do the same for competing scientific theories? But I am convinced that there is a way out. The solution for the football controversy could be easy: just set up a round-robin tournament among these three teams and keep it going until a clear champion emerges. This solution may be impractical but if we REALLY wanted to be sure we could do it. Even then, though, there is a problem. Football teams are transient things. Players get hurt (indeed, Tech star Michael Crabtree likely couldn't play in my proposed tournament). If things go on long enough, some of the players will graduate (yes, some of them DO graduate) and will no longer be eligible to play. So you really aren't always comparing the same three teams.<br /><br />This is where science has the advantage over football: scientific theories may be transient, but they don't NEED to be. Yes, theories come and go, but if we can hold off what I'm calling the "social solution" we can keep a theory in play as long as needed. We can keep finding more head-to-head match-ups, or at least get a better handle on the breadth of problems that can be solved by one theory versus the other (a bit like the "strength of schedule" in the football computer polls). Ambiguities can arise, as they did in the Big XII South this year, but over time those ambiguities can be sorted out if we care to do so. Sorting out the ambiguities and avoiding incommensurability requires, I think, three things: time, effort, and SOME shared criteria for evaluating theories. The proponents of different theories need not share ALL of their criteria, but there must be some overlap. In particular, both groups must have some commitment to empirical validation of their theories. Yes, seemingly contradictory empirical results can always be explained away by tweaking some auxiliary assumptions, etc.
So no one piece of empirical evidence will decide the victor (just as the Texas-OU game did not decide the Big XII South Champion). But with sufficient time and effort enough empirical evidence can be compiled to push us to one of three situations: one theory is clearly better than the other at matching the empirical data, both theories match the empirical data equally well but one theory has been forced to become more complicated to match the data, or the two theories turn out to really be the same theory.<br /><br />Of course my argument doesn't prove anything, but it feels right to me. I am utterly convinced that even with an additional 300+ years of development impetus theory could not compete with Newtonian physics in the efficiency and accuracy with which it predicts the motion of macroscopic objects. I believe that if I could travel back in time armed with my knowledge of Newtonian physics (and a good English-Latin dictionary?) I could convince medieval scholars like Buridan and Oresme to abandon impetus and embrace Newton's ideas. I feel certain of this. But then, I feel certain that Texas is better than OU. Feeling certain counts for little in the philosophy of science, just as in college football.<br /><br /><i>Puzzle Solving and Gestalt Shifts</i> (2008-08-10)<br /><br />I saw something recently that got me thinking about Kuhn's distinction between normal science (which he describes as "puzzle solving") and revolutions (which he likens to Gestalt shifts). The thing I saw was a puzzle (hence the connection to puzzle solving). The puzzle consisted of this: six toothpicks are laid down on a table in groups of three. Each group of three forms an equilateral triangle with a base near the puzzle-solver and the opposite apex pointing away.
Here's the puzzle: move one (and only one) toothpick to form four triangles. If you really want to get into the spirit you should go get yourself six toothpicks and try this yourself before reading any more...<br /><br />... no, seriously, it will help you get what I'm talking about ...<br /><br />OK, so the trick ends up being that you move one of the toothpicks on the left triangle so that it forms a representation of the number 4 (i.e. the numeral 4). (Try it - you just have to slide the right side of the triangle so that it becomes perpendicular to the base.) The other triangle remains a triangle. So the result is 4 triangles. Now, you can argue that it should be "4 triangle" not "4 triangles", but I saw people solve the puzzle so it's not totally off the wall.<br /><br />My point is this: solving that puzzle involves something like a Gestalt shift. Kuhn would probably not disagree. After all he says it is the persistent failure of normally good puzzle-solving strategies to solve a seemingly valid puzzle that leads to crisis and ultimately revolution in science. He might argue that it was only after you had exhausted all possible ways of actually forming 4 separate triangular shapes with the toothpicks that you would make the Gestalt shift to thinking about the numeral 4. Maybe this is right (I wasn't one of the ones who solved the puzzle so I don't know).<br /><br />What strikes me is that this is a Gestalt shift that is taking place on a very low level. The Gestalt shift needed to solve that puzzle is not one that will alter my view of the Universe, or force me to cast out all of my previous notions of puzzle solving. Yes, it may now add a tool to my puzzle-solving arsenal that simply wasn't there before. But if this is a revolution it is a microrevolution. And a revolution on this scale would not lead to any incommensurability.
Actually, visual Gestalt shifts usually don't produce incommensurability - you can still see the rabbit even after you've seen the duck, and you can usually coach others to see what you now see.<br /><br />Kuhn says (in <i>The Structure of Scientific Revolutions</i>) that revolutions take place on many scales. But he always seems to talk about the big ones. His distinction between normal science and revolutions implies that normal science is what takes place without interruption for years and years until BOOM there is a revolution. But if revolutions can take place on ALL scales (from a new way of seeing a highly specialized problem in a particular area of technical research, all the way up to revolutions that involve major cosmological or metaphysical consequences) then revolutions must be happening ALL THE TIME. Granted, the big ones only come around occasionally, but little ones occur almost non-stop. It's scaling-law behavior - think earthquakes (big ones are rare but devastating, little ones happen all the time but may be unnoticed).<br /><br />If this is true then the distinction between normal science and revolutions becomes much less clear. The idea of incommensurability doesn't seem to hold up either (or maybe it only applies to the biggest of revolutions, but I'm not even convinced of that). Something to think about.<br /><br /><i>The Realism of Copernicus</i> (2008-08-01)<br /><br />I've been doing some reading in preparation for teaching a course on the Copernican Revolution this Fall (currently I'm reading Alexandre Koyre's "The Astronomical Revolution"). In the process I've been struck by Copernicus' apparent motivation for developing his new system of astronomy. It wasn't so much that he was trying to devise a system that would match up better with observations (and he didn't).
It wasn't that he really was convinced that the Earth moved independently of astronomical considerations (at the time you would have had to be crazy to believe that). It was that he was absolutely, utterly committed to the REALITY of uniform circular motion in the heavens. I guess this can be attributed to Platonic (or maybe Pythagorean?) influence, but he seems to have believed that uniform circular motion was the only thing that could possibly REALLY be going on up there in the skies. He states clearly that his major motivation for devising his system was to get rid of the equant, which was Ptolemy's great heresy against the Platonic (and Aristotelian) doctrine of uniform circular motion. He was so committed to ridding astronomy of equants that he was willing to consider the absurd notion of a moving Earth!<br /><br />It is interesting to contrast Copernicus' metaphysical commitment to the truth of uniform circular motion with the phenomenalism of Andreas Osiander, who wrote the controversial preface to Copernicus' "De Revolutionibus". Osiander's view, as stated in that preface, is that the job of astronomy is to "save the appearances" and that one should use every mathematical trick available, even one as silly as making the Earth revolve around the Sun, in order to make the calculations match the appearances of the sky. This view is actually quite sophisticated and modern, and is not much different from the logical positivism that dominated philosophy of science (and philosophy generally) during the first half of the 20th Century. But this is obviously not Copernicus' view. He is willing to throw out a very useful mathematical trick (the equant) in order to get back to what he KNOWS is the TRUE motion of the celestial bodies, namely uniform motion in a circle.<br /><br />So Copernicus has the less sophisticated philosophical point of view, as well as a strong metaphysical commitment to a scientific idea that turns out to be totally wrong.
And it is exactly because of this that he, rather than Osiander and others like him, revolutionized astronomy and paved the way for modern science. It turns out he didn't need to be so revolutionary. He could have gotten rid of equants and stayed with a geocentric universe by throwing in a few more epicycles (as Kepler later showed). But either he was unaware of this, or decided to give the heliocentric view a shot and became convinced of its beauty (and thus its truth, since he held that kind of Platonic view).<br /><br />It is also interesting that it was Copernicus' devout commitment to uniform circular motion that led him to the first major breakthrough in astronomy since Ptolemy. But it was Kepler's ability to see past this view and consider non-uniform non-circular motion that led to the next major breakthrough. I doubt very much that Copernicus would have been pleased at what Kepler did to his astronomical system. It just goes to show that sometimes "bad" ideas lead to "good" results.<br /><br /><i>New Approach to the Blog</i> (2008-08-01)<br /><br />Up to this point I've been trying to post relatively well thought-out essays. But as any hypothetical reader could tell, I've kind of dropped the ball recently. It's not that I haven't had things I wanted to write about. I just didn't have time to do a full-fledged essay.<br /><br />So from now on I am going to start writing quick little ideas that pop into my head. I may muster up a genuine essay or two (with actual research, though not as much as if I were really going to publish). But for the most part I will post short descriptions of things I am thinking about regarding the history and philosophy of science, and teaching physics and astronomy.
Mostly these will be questions or ideas I want to explore in greater detail someday (who knows when?). Or they may be just offhand comments that will be carried no further. If I ever have any readers, I hope they will be sympathetic to this approach.<br /><br /><i>Duhem, Brody, Hubble: Approximation and the Scope of Theories</i> (2008-03-23)<br /><br />In this essay I'd like to highlight some similarities between the ideas of two physicists who have written extensively on the philosophy of physics. The first is Pierre Duhem, author of <i>The Aim and Structure of Physical Theory</i>. The second is Thomas Brody, author of <i>The Philosophy Behind Physics</i>. In reading these two books I have been struck by some remarkable similarities that I think are worth pointing out. This is somewhat surprising since Duhem was primarily a phenomenalist (though he makes concessions to realism with his idea that science approaches a "natural classification") while Brody seems to be a realist (but one who makes concessions to phenomenalism in his insistence that science is necessarily approximate). It is also surprising because Brody makes no reference to Duhem's work, even though Duhem's book was published in 1906 (in French; an English translation has been available since at least 1954) and Brody's philosophical work was published mostly in the 1970s and 1980s. These ideas have also given rise to some thoughts about some of Edwin Hubble's work, which I have been studying recently (Hubble's <i>The Realm of the Nebulae</i> is a good introduction to his work).<br /><br />One similarity between these two is their emphasis on approximation in physics. Both Duhem and Brody recognize that all measurements are approximate and all theoretical predictions are approximate as well.
Duhem makes this point the basis for his famous thesis that any experimental measurement outcome is necessarily consistent with an infinite number of theories and any theoretical prediction is consistent with an infinite number of experimental measurement outcomes (a thesis, often known as underdetermination, which was later expanded by W. V. O. Quine, but with somewhat different emphasis). Brody emphasizes the fact that approximations are generally valid in some circumstances and invalid in others. The approximate nature of scientific theories then becomes the basis for his idea of the scope of a theory. The scope of a theory is the range of phenomena for which the theory is valid. The theory is not expected to be valid for phenomena that lie outside its scope. Brody argues that one of the main goals of science is to delimit the scope of theories (as well as to create new theories).<br /><br />The concept of a limited scope for physical theories is common to Duhem and Brody. Indeed, they both find it quite acceptable for a physicist to use two completely incompatible theories in the course of her work. Duhem quotes Poincare to state that one can use logically incompatible theories as long as one takes care not to mix them or to "get to the bottom of things." Brody presents a somewhat subtler view based on his idea of scope. It is acceptable even to mix theories that are logically incompatible provided that one doesn't use any theory to describe phenomena that are outside of its scope. He cites as an example molecular physics in which the nuclei are treated as classical Newtonian point masses, while the electrons are treated as relativistic quantum particles.
Perhaps an even better example he uses is that of studying the influence of the Moon's gravity on a pendulum by first calculating the Moon's orbit (treating Earth as a Newtonian point mass for this purpose) and then treating the gravitational force between the Moon and the pendulum bob as a perturbation on the pendulum's "normal oscillation" (treating Earth as an infinite plane and Earth's gravitational field as uniform). Here, within a single problem, the physicist uses two logically incompatible models of the <i>same object</i>, but for different phases of the problem. Perhaps Poincare would not consider this "mixing" the two models - but the main point is that each model is used to predict a phenomenon that is within the scope of that particular model. I'm well acquainted with this type of work, since my own research has largely focused on the interaction of quantum particles with oscillating classical electric fields, so I mix classical electrodynamics and quantum physics all the time.<br /><br />An important consequence of the fact that scientific theories are approximate and have limited scope is that scientific theories are not about <i>truth</i>. Both Duhem and Brody insist that scientific theories cannot be evaluated on a logical basis. Theories are neither true nor false in a logical sense. The concept of a theory of everything (a theory that would explain all phenomena) would be meaningless for both Brody and Duhem. Duhem would view such a theory as a "cosmology" (in the ancient meaning of this term) and thus not a scientific theory at all. Indeed, his chief goal in <i>Aim and Structure</i> was to separate science from, and make it independent of, cosmology.<br /><br />One more similarity between Duhem and Brody is their insistence on the evolutionary nature of science. Duhem seems to disdain the very idea of scientific revolutions.
In part this is based on his extensive historical study of medieval physics, which illustrates the origins of many of the ideas that eventually reached maturity in Newton's physics. It should be noted that Duhem wrote his book around 1905, so he was unaware of the coming quantum and relativistic "revolutions" (though I doubt these would have changed his views). Brody seems to accept the idea of revolutions, but rejects Kuhn's idea that between revolutions physicists only solve problems using an established paradigm. He points to the extensive development of mechanics after Newton, pointing out that Newton might very well be unable to understand things like Hamilton-Jacobi theory and the geometrical mechanics of Poincare even though these are supposedly the result of "problem solving" within the paradigm that Newton himself created. I intend to write more about evolution versus revolution in science at a later date.<br /><br />For now what I'd like to do is apply the framework of a theory's scope to an idea that is a key part of Edwin Hubble's work, an idea he refers to as "The Principle of the Uniformity of Nature." Now this phrase is often used to denote an essentially metaphysical statement that is supposed to justify induction, but I don't think that's what Hubble means by it. He means something more like a methodological principle, and I think it can be clearly explained in terms of Brody's idea of scope. What Hubble is saying is this: when an empirical law has been found, we should assume the widest possible scope for this law. For example, Henrietta Leavitt found an empirical law relating the apparent brightness (and thus, essentially, the intrinsic brightness) and the period of Cepheid variables in the Large Magellanic Cloud. Hubble applied the Principle by assuming that ALL Cepheid variables (as identified by the shape of their light curves) follow this law, even those in distant galaxies and in various parts of our own galaxy.
This turned out to cause problems because it led to inconsistent results. But Hubble was never trying to say that these empirical laws really did have universal scope, but only that we should assume that they do until we have reason to think otherwise (i.e. until that empirical law leads to contradictions with another empirical law, or with directly observed data). When such contradictions occur the scope of one of the laws involved must be reduced. In the case of Cepheids, the contradictions were resolved by proposing that there are two types of Cepheids with different period-luminosity relations (one type resides in the galactic plane, the other type in the halo).<br /><br />Of course, if two empirical laws contradict each other it may be hard to determine which one should have its scope reduced. In some cases we may be able to carry out an experiment or observation that will clearly favor the modification of one law over the other. But in many cases we may need to guess, and our guess will be guided by how the modification fits with all of our other theories. This view actually ties in well with Duhem's other famous thesis: that we never test a theory in isolation, but rather we test the entire system of current theories. When a prediction is contradicted by a measurement we never know which theory (or assumption, etc.) is to blame, but we must make a choice of what to modify. That choice will be made with a consideration for the impact it will have on our system of theories and its fit to all previously known data. For example, we will be unlikely to modify a foundational theory that explains a wide range of phenomena. Instead, we will probably choose to modify (or delimit the scope of) a theory which is of lesser importance to the entire structure of our theoretical system.
This is essentially Lakatos' idea of modifying the "protective belt" rather than the "hard core" of our theoretical system.<br /><br />Note that the Principle of the Uniformity of Nature is a methodological assumption with no logical basis. Logically we have no reason to suppose that the scope of an empirical law or theory extends beyond the data already known to fit it. It is interesting, though, that Hubble's methodological assumption can be recast in terms of Popper's fundamental methodological assumption to always choose the most falsifiable theory. Certainly, we make any theory more falsifiable by assuming it has a universal scope rather than a limited scope. The difference in Hubble's proposal is that he suggests limiting the scope of the empirical law rather than discarding it as Popper (at least in his early work) would have it. I think Hubble's perspective on astronomy fits in very well with the scheme that seems common to both Brody and Duhem.<br /><br />An analogy that Brody uses can help make sense of all this. He says that science is rather like a map. A map is always approximate. The idea of a map that depicted its subject exactly down to the finest detail (i.e. showing blades of grass in Central Park, and the ant crawling on the blade of grass, and the crumb of bread in the ant's mandibles, etc.) is ridiculous. Not only that, but such a map would be useless. Moreover, as we make our way around a city we may use multiple incompatible maps. For example, we may have a street atlas, a subway map, and a restaurant guide. These maps are not logically compatible because they will indicate different relative distances between supposedly identical locations (subway maps, in particular, are always schematic and do a poor job of depicting geographical relations between stations). However, in going from a hotel on Fifth Avenue (why not?)
to the Statue of Liberty we might make use of a street map to find the nearest subway stop, the subway map to get us to the station closest to the ferry terminal, the street map again to find the ferry terminal, then the ferry map to make sure we get on the correct route. None of these maps embodies the "truth" of New York City, but they all provide a useful depiction of certain structural relations within the "real" New York City. They are incompatible in a logical sense, and yet we can use them together to get where we want to go. In a similar way, none of our scientific theories embody the "truth" of the world, but they do provide useful depictions of certain structural relations within the "real" world. We can use incompatible scientific theories to solve problems and make predictions about the physical world, provided we know which structural relations are accurately depicted by a given theory and which are not.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-5339669472304788842008-01-21T20:03:00.000-08:002008-01-21T20:05:58.576-08:00Science CurriculumI've been reading <i>Science Teaching: The Role of History and Philosophy of Science</i> by Michael Matthews. In an early chapter of that book he gives a brief history of the large-scale science education curricula that have been developed in the last hundred years or so. Reading that has gotten me thinking about the problem of all-encompassing curricular movements and I thought I'd go ahead and jot down my half-formed thoughts. I am certainly no expert on educational theory, and I can't even claim to be much of an expert on science teaching. So take all this with a grain of salt.<br /><br />It seems to me that all the big curricular movements assume that there is a single "best" way to teach science to all students at all stages of development. This basic idea seems flawed to me. 
There are at least two groups of students for whom we have very different goals in science teaching: those who will become scientists and those who will not. Of course, we don't know which are which until very late in the game. But the ideal science education for someone who will never become a professional scientist is likely quite different from what is needed to train a future scientific professional. Furthermore, it seems ridiculous to think that one approach will be ideally suited to all ages of students. Student capabilities change significantly as students age and it may be that what is best for an elementary school student is radically different from what is best for a high school student. At the same time, though, the education of elementary school students and high school students cannot exist entirely independently of each other. High school education must build upon what has been learned in elementary school, while elementary school education should supply students with the background they need for their high school studies.<br /><br />It might seem like the development of a unified curriculum for both types of students at all grade levels is a hopeless task. Maybe it is. But I think there might be some hope. To begin with, my impression is that at the early grade levels there really is no difference between what is best for the future scientists and what is best for others. This is fortunate since it is precisely at these grade levels that one has no chance of distinguishing the members of the two groups. At the elementary school level science teaching should focus on teaching about science, rather than teaching scientific theory. Content is not critical at this stage. Students should probably be given some exposure to the various scientific disciplines, but that exposure should be focused on particular topics that illustrate the nature of scientific inquiry. 
Teaching should be very hands-on, should be clearly relevant to the real world (in a directly perceivable way - so teaching kids about quantum mechanics and saying that it relates to grocery store scanners and computers doesn't cut it), and should be infused with history. Matthews argues for a history-based approach to science teaching that I think would be very well suited to teaching students at this level (his specific example of the history of the study of pendulum motion is excellent).<br /><br />At the elementary, and probably the middle school, level students should not be burdened with the abstract theories that constitute the grand achievements of modern science. Instead, students should be given an opportunity to explore, and to experience the interplay between ideas and facts. They should be led to see that ideas do not spring forth from facts, but rather that ideas often transform the <i>meaning</i> of previously known facts. They should come to see that science deals not directly with the real world but only indirectly, with the idealized world of ideas serving as an intermediary which is not a direct representation of the world but rather a lens through which aspects of the real world can be understood. As a physicist I would be perfectly happy if students at this level were Aristotelian, as long as they were thoughtfully Aristotelian. I am convinced that this approach, although it would not get students to an understanding of modern science, would do a great deal to pave the way for future instruction. After all, college physics professors are now well aware that we must assume that many (if not most) of our students enter our introductory college physics courses with an essentially Aristotelian view of motion (if they have any coherent view at all). So it is hard to see that this approach would do any harm.<br /><br />At the high school level and beyond it becomes more important to distinguish separate tracks for future scientists and others. 
For future scientists, scientific education must include a significant amount of <i>training</i> as well as education. Future scientists must learn how to use the theoretical and experimental tools of modern science and to do this they must be exposed to the abstract formulations of modern scientific disciplines. However, I think that even for these students the transition from learning about science to learning the edifice of modern science should be gradual. Teaching should progress from a purely historical, hands-on, real-world approach to a more discipline-structured, mathematical, abstract approach. At no point should the historical or hands-on elements disappear entirely, but they will need to be less prominent to make room for the more professional elements. Ideally the history and the abstract formulation would be closely tied together. Students could be shown how the abstract ideas were developed historically, but then could go on to make use of these ideas in problem-solving, etc.<br /><br />Students not interested in careers in science will probably still need some exposure to the abstract style of thinking that characterizes modern science, but they need less exposure than the future scientists. What they probably need at this level is a chance to see the connections between science and major social, political, and economic issues. Students at this level have enough awareness of these other areas that it makes sense to connect science to them. This is basically where we want most citizens to end up: they should have some understanding of what science is all about and the role that science plays in today's world. This kind of education would hopefully make them better prepared to participate in the social, political, and economic life of modern civilization and also provide them with the thinking tools they need to resist pseudoscientific claptrap.<br /><br />Perhaps the biggest difference in the two educational tracks will come at the college level. 
Here the goal is to go beyond the basics and dig deeper. For future scientists this means becoming increasingly expert at using the formalisms of modern science. For non-scientists this means engaging in a more sophisticated inquiry into the history and philosophy of science and the relation of science to society. History and philosophy may become add-on components to courses for scientists, as may the hands-on elements (which will typically be separated into lab sessions), while for the non-science major these elements should be infused throughout the course. Breadth of content now becomes important in courses for scientists, while the content of non-science major courses can be narrowly focused and suited to the expertise of the instructor or the interests of students. Graduate education in the sciences would likely continue as it is now, an almost entirely formal training in the concepts and techniques of the modern discipline.<br /><br />I think this approach would be of tremendous benefit to the vast majority of students who have no intention of becoming professional scientists. It would be particularly beneficial for future elementary school teachers who are currently harmed by the formal science education which they receive and then (since it is what they have been taught) pass on to their students who simply aren't ready for it and don't need it. This system does have some disadvantages, mainly for future scientists. It is possible that reducing the amount of formal, abstract science they engage in at an early age will hamper their ability to master this material later in their education. But I'm not convinced that young students gain much from exposure to abstract scientific theory. I think that material is probably not developmentally appropriate for these young students. 
And in any case it is not clear that current teaching which utilizes a more professional approach in early grades does all that much to help prepare students for coursework at the college level.<br /><br />Perhaps the more significant disadvantage for future scientists is that they would miss out on the more sophisticated history and philosophy of science that would be presented to non-science majors at the college level. This really is unfortunate, but again I think my ideas would be better than the status quo in which future scientists receive almost no instruction that involves history and philosophy of science. Perhaps science majors could be encouraged to take general education science courses as electives. I think this is particularly important for future high school science teachers (who will presumably be science majors, but won't become professional scientists and will need to understand the historical and philosophical approaches to teaching science if they are to utilize these approaches as teachers).<br /><br />Again, I'm no expert. I'm not seriously proposing this as a model for a new national curriculum or anything remotely like that. This just represents the state of my current thinking on the subject. I'll continue to read more and probably find out the flaws in my thinking (I've already read more and found out that Ernst Mach came up with much the same line of thinking that I've been bouncing around in my head for the last week or so - and I feel encouraged by that!).T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-79396786972421985152008-01-12T19:51:00.000-08:002008-01-12T20:30:16.798-08:00History of Astronomy with ErrorsThis blog post will be a bit unusual. 
I just wrote a letter to the editor of <a href="http://www.aps.org/apsnews/">APS News</a> pointing out a few errors in a historical piece on Edwin Hubble that was in the January 2008 edition (this will be accessible only to APS members until the next APS News comes out, and then it will be available to all). What I plan to do here is print my letter and give some additional comments. I have no idea if my letter will be published in APS News, but here it is:<br /><br /><blockquote><br /><br />Dear Editor,<br /><br />I always enjoy reading “This Month in Physics History” and the January installment on Hubble’s discoveries was no exception. However, I would like to point out a few minor errors in that piece. Most astronomers in the early 1920s favored the theory that spiral nebulae were “island universes” and in fact believed the Milky Way to be much smaller than we now know it to be. Shapley and a few others favored the idea of a much larger Milky Way which contained the spiral nebulae, but Shapley’s letters indicate that he knew he was in the minority on this issue. Also, it was Henry Norris Russell who presented (on behalf of Hubble) the data on Cepheids in Andromeda at the AAS meeting in January 1925. Most importantly, it is untrue that “Hubble didn’t discuss the implications of what he had found” in his 1929 PNAS paper. In the final paragraph of that paper he says “the velocity-distance relation may represent the de Sitter effect”, referring to the model of the Universe presented by Willem de Sitter in 1917. This model was originally interpreted as a static model, but did predict a redshift that increased with distance because of scattering and an apparent slowing down of distant atomic vibrations. So in 1929 Hubble did not interpret his data as indicating an expanding Universe, but rather as supporting de Sitter’s static model. 
It was only later realized that de Sitter’s model was equivalent via a coordinate transformation to expanding models such as that proposed by Georges Lemaitre in 1927 (Lemaitre’s model was unknown to Hubble and most astronomers until 1930). A detailed account of this history is given in Robert W. Smith’s <i>The Expanding Universe</i> (Cambridge U Press, 1982).<br /><br /></blockquote><br /><br />Now let me add a few comments:<br /><br />My pointing out the second error may be me just being picky. It WAS Hubble's data on Cepheids in Andromeda that was presented at the AAS meeting, even if Russell presented it for him. The piece in APS News implied that Hubble presented it himself, but the wording could be interpreted to fit the facts (though I doubt many readers would interpret it that way). The other errors are more problematic in that they serve to glorify Hubble at the expense of historical accuracy. I seriously doubt that this was the conscious intent of the person who wrote the piece (or the APS News editors), but there it is. Most astronomers were already convinced that there were other galaxies long before Hubble's Cepheid discovery. That discovery, though, put the nail in the coffin. It was a MAJOR discovery, but ultimately what it indicated was what most astronomers thought already. It did win over the few dissenters, some of whom were very important astronomers like Harlow Shapley. The discovery of Cepheids in Andromeda was of immense importance because up to that point the evidence for the extra-galactic nature of the spiral nebulae was circumstantial and conflicting. Hubble found the smoking gun, and subsequently got rid of the conflicting evidence by dismantling Adriaan van Maanen's work on the rotation of spiral nebulae (an interesting story in its own right).<br /><br />It is the third error that I found most surprising. 
Hubble clearly proposes in his 1929 paper that the velocity-distance relation could be evidence that favored de Sitter's model of the Universe (which was a static model). Hubble did not at that time think that he had found evidence for an expanding Universe. In fact, Hubble continued to resist the idea of a non-static Universe for years. I'm guessing that this is where the statement in the APS News article came from. In later years Hubble did refuse to comment on the interpretation of the velocity-distance relation. But this was after de Sitter's model had been invalidated (mainly because the mean density of the Universe was too high for his model to be relevant) and new non-static models (actually old models that nobody had paid attention to, like Lemaitre's and Friedmann's) had become the focus of the discussion. Hubble apparently did not believe that the redshifts he observed were genuine Doppler shifts, due to actual recessional motion. He did not withhold his opinion because he thought interpretation should be left to others (after all, he was quite ready to support de Sitter's model and in fact his work was likely an attempt to test that model directly). But when the only options up for discussion were expanding models he did not want to side with any of them.<br /><br />Again, the importance of Hubble's (and Humason's) work on the velocity-distance relation can hardly be overstated. We NOW recognize it as a crucial piece of evidence for the expansion of the Universe. But it was not recognized as such in 1929 (certainly not by Hubble). I don't intend to fault Hubble for this - after all, he was an observational astronomer and an incredibly good one. And in 1929 astronomers were essentially unaware of the existence of expanding models like Lemaitre's. Given what he had to work with, Hubble made a reasonable suggestion that his data supported de Sitter's model. 
This turned out to be wrong, and from that point on Hubble was reluctant to throw his support behind any particular model. All of this is entirely reasonable behavior on his part. But let's not try to hide the fact that Hubble backed the wrong horse.<br /><br />The errors in the APS News piece were innocent enough. But unfortunately I suspect that such errors are made in many similar cases. They serve to produce an alternate history of science in which our greatest scientists made no mistakes. But this dehumanizes them and makes their accomplishments seem out of reach. Even the greats stumble on occasion. And the achievements of the greats are inevitably built on the work of many who came before (even Einstein was preceded by Lorentz, FitzGerald, Poincare, etc.). A more accurate history of science might actually be more interesting and might help us to see that science really is, of necessity, a community enterprise. Even the great ones need others to lay the groundwork, catch their few mistakes, and follow up on the leads they leave open. We certainly wouldn't want incorrect physics in such a publication - let's try to keep incorrect history out as well.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-31263217905132446512008-01-06T19:02:00.000-08:002008-01-06T19:03:31.843-08:00Philosophy in Astronomy: Unique vs. OrdinaryAs with any science, philosophical notions have played an important role in the development of astronomy. It seems to me that one philosophical notion that has had a tremendous influence on astronomy is the idea that Earth is (or is not) a unique place in the Universe. There is no denying that Earth is special (to us) in that it is the planet from which all of our astronomical observations have been made (well, nearly all, and those that weren't made from Earth were made from relatively nearby). 
But is Earth truly unique in the Universe?<br /><br />In classical astronomy Earth occupied a singular location in the Universe. In Aristotle's cosmology Earth was located at the center of the Universe (which was finite and spherical and therefore had a very well-defined center). As pointed out in a recent <i>Physics Today</i> article, Aristotle didn't think that the center of the Universe was wherever Earth was, but rather that Earth was at the center because all matter fell toward the center and therefore Earth (which was nothing more than a collection of all the matter in the Universe) had to be located there. In a way it is hard to say whether or not Earth really occupied a unique place in Aristotle's cosmology because all the matter in the Universe was part of Earth. Everything else was celestial aether and not base matter at all. Earth was unique because it was everything, in a sense. This idea certainly came to take on philosophical (and later theological) dimensions, but initially it was based on sound observation. All celestial objects can be clearly seen to rotate about Earth, and any attempt to move matter away from Earth just results in that matter falling back again (they couldn't achieve escape velocity in ancient Greece). So it fit the data to consider Earth as the center of it all. Nevertheless, as I said, the concept that Earth occupies the center of the Universe ultimately became a philosophical and theological principle.<br /><br />The Copernican revolution changed all this, but in small steps. Copernicus moved Earth away from the center of the Universe, but put the Sun in its place. Earth's place was no longer unique, but it was still one of only a handful of planets orbiting the Sun, which occupied the center of the (still spherical and finite) Universe. 
Even Kepler (who was willing to consider that there might be life on some of the other planets) still retained the Sun at the center of the Universe and Earth as one of the few privileged planets to orbit it. It was really only after Newton (when there was a physical mechanism for the planets' orbital motion about the Sun, rather than a geometric explanation) that it became easy to think of the Sun as one of many Suns and Earth as one of a potentially very large number of planets. It was no longer necessary that either Earth or Sun be at a unique geometric location.<br /><br />Contemporary astronomy has come to embrace the notion that Earth and Sun are not unique, but are wholly ordinary. Indeed, astronomers become suspicious of any evidence that seems to indicate that Earth or Sun are special. For the most part these suspicions appear to be justified. I've been studying the history of galactic astronomy in the early 20th Century and this issue played an important role. For many years it was thought that the Sun was located very near the center of our galaxy (although the concept of a galaxy was not entirely clear at the time) because statistical studies of stellar distances seemed to place us at the center of all the stars we could observe. It turned out later that this was because the absorption of starlight by interstellar dust limited the distance to which the telescopes of the time could penetrate. In fact, all of the stars that were observed were just a small part of the Milky Way galaxy. At the time, though, nobody thought there was much interstellar absorption and the data putting the Sun at the center of the galaxy seemed rock solid. Still, it was viewed with some concern because it seemed to give the Sun a special location. 
When Shapley studied the distribution of globular clusters and found that the center of the clusters (which was presumably also the center of the galaxy) was far from the Sun, he considered it a triumph on the scale of Copernicus displacing Earth from the center of the Universe.<br /><br />Even with the Sun dislodged from the center of the galaxy, astronomers still struggled against the notion of a unique location. Some astronomers (Shapley included) thought that our galaxy was the only galaxy, and that the so-called "spiral nebulae" were just objects within our enormous galaxy. Even when Hubble's observation of Cepheids in Andromeda showed that Andromeda was a separate star system from the Milky Way galaxy, it was still thought that the Milky Way was vastly larger than any other galaxy including Andromeda. If the spiral nebulae were "island Universes" then the Milky Way was a continent. This was also viewed with suspicion by some astronomers who thought that the Milky Way must surely be very similar to at least the larger and more prominent spiral nebulae (like Andromeda). Later revisions to the diameter of the Milky Way and the distance (and thus the diameter) of Andromeda showed that in fact Andromeda is a bit larger than our Milky Way, so our galaxy is an ordinary galaxy and not even the biggest in the Local Group.<br /><br />In each of these cases observations that seemed to indicate that Earth or the Sun or the Milky Way were unique ended up being erroneous and in fact all three appear to be ordinary members of their respective classes. The assumption of ordinariness was becoming firmly entrenched by the time Hubble carried out his study of the redshifts of spiral nebulae. The data clearly indicated that nearly all galaxies were moving away from the Milky Way with speeds that increased with their distance from the Milky Way. On the surface this would again seem to indicate a special location, and thus a unique status, for the Milky Way. 
But as far as I can tell astronomers never even considered this possibility. This may be due to the fact that General Relativity was already on hand to provide an explanation that did not assume a special location for the Milky Way (in fact, from any point in the Universe the same phenomenon could be observed). One wonders, though, how this data would have been interpreted had Einstein (or Hilbert or Poincare, etc.) not come up with GR. Hubble speculates a bit on this in his book <i>The Realm of the Nebulae</i>.<br /><br />In reflecting on this history what stands out is the distinction between specialness and uniqueness. As I said above, there is no doubt that Earth (and the Sun and the Milky Way) is special, because it is where we are. There is always something special about the observer's location when interpreting data taken by that observer. In many cases that "specialness" may look like "uniqueness", but there is a subtle difference between the two. Special means special only from our point of view. Unique means special in a grander, more objective, more universal sense. The history of astronomy is riddled with instances of specialness being confused with uniqueness. In light of that history astronomers have adopted as (I would say) a philosophical principle the idea that there is nothing unique about our location (Earth, Sun, or Milky Way). We now build theories based on the assumption that Earth is a typical inner planet (who knows?), the Sun is a typical G star (it seems to be), and the Milky Way is a typical galaxy (it seems to be a typical spiral). The validity of these assumptions is rarely questioned. Astronomers have been burned too many times in the past.<br /><br />This assumption of non-uniqueness seems entirely reasonable to me, but there is some danger of it becoming too dogmatic. It is possible that some aspects of our location might be unique, or at least very rare. 
In fact, some proponents of the Strong Anthropic Cosmological Principle argue that we are in a unique Universe, perhaps one that is specially designed to produce intelligent life. Again, most astronomers (and physicists - including myself) view this idea with suspicion. But we must take care to not be closed to the idea of uniqueness, or we will be no better than the classical astronomers who closed themselves to the idea of ordinariness.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-14339069511063289042007-12-24T19:49:00.000-08:002007-12-24T19:51:24.959-08:00Teaching Science as Liberal ArtI've been thinking a lot recently about how to teach science as a liberal art. I've argued elsewhere in this blog that science is a liberal art and has greater claim to that title than many disciplines (say, history for example) that are typically thought of as liberal arts. I still believe this to be true. However, I recognize that the way science is frequently (perhaps usually) taught tends to suck the liberal artness right out of science. Science, the way it is practiced by most scientists, IS a liberal art. It is the activity of a free person, who engages in scientific research for the sheer joy of the intellectual endeavor. It is not something that one does for a paycheck. But many science courses are taught in such a way that only the utility of science is emphasized. Others even forgo highlighting the utility of science and instead present the material as a succession of facts to be memorized because they are "correct". Better science courses help their students learn to think like scientists, to engage at a basic level in the kinds of activities that scientists engage in. But few courses, I think, can really get students wrapped up in the experience of science. 
I don't think I've managed that feat myself, although I hold out hope for the future.<br /><br />I think one reason science courses fail to fully immerse students in the scientific experience is that they cover too much material. Many science courses for non-science majors are what might be called "highlights" courses, which try to cover all of the important topics in the discipline. My physics course for non-science majors is a bit like this, although I have found myself cutting breadth to gain in depth. But the more I think about it the more I think there is a better way to teach science. Instead of giving students the "Cliff's Notes" version of the discipline, give them an excerpt of one of the really good parts. Pick a particular important discovery (or sequence of discoveries) and really delve into it. Present it historically, so that students learn about the errors and false starts as well as the great discoveries. A historical presentation also serves to highlight that science is a human activity, carried out by human beings, not by computers or robots or mindless automatons. <br /><br />I've reached this conclusion as a result of a confluence of several factors. The most important is the development of an astronomy course of this type (focusing on the Copernican Revolution) by a colleague of mine. The second factor is the departure of that same colleague to pursue another career, leaving me to teach his astronomy course. The third factor is that I recently read <i>The Liberal Art of Science</i>, a report from a committee of the AAAS. My departing colleague also taught some more standard astronomy "surveys" and I'm supposed to pick these up as well, but I just don't see myself teaching that type of astronomy course and I don't think such a course really teaches science as a liberal art (at least not as outlined in the AAAS report). In a way, I'm in an ideal situation for innovation. 
I'm an outsider to astronomy (although my undergraduate degree is in physics/astronomy and I did a bit of astronomy research as an undergrad) so I have no commitment to the status quo. I also have no commitment to particular pieces of the discipline. A well-trained professional astronomer probably feels like she is cheating her students if she doesn't teach them such-and-such. But I lack that training and thus those feelings. I'm free to develop a new astronomy course as I see fit. And so I intend to create a new course modeled on the style of my colleague's Copernican Revolution course, but with the discovery of galaxies as my topic (I'll also continue teaching The Copernican Revolution).<br /><br />I may say more about this new course at a later date (when I've actually got some of it figured out), but for now I want to discuss the conclusions I have come to, in the process of thinking about this new astronomy course, about how science should be taught. As I said above I think science courses for non-science majors should focus on a fairly narrow topic, and take a historical approach. But it is essential that the course delve into not only WHAT the scientists discovered but how they discovered it and how others became convinced of their discovery. Students need to see that this process is far from straightforward. In fact, the best examples to present are discoveries that were controversial for years before finally becoming accepted (like the Copernican Revolution, only in that case it was centuries). Students should be given the chance to examine the evidence on both sides. In the process they should see that there are often legitimate objections to controversial new ideas (like Copernicus' idea) but that in some cases these ideas are able to overcome those objections and become part of accepted science. They should see what it takes for a controversial theory to succeed. 
They should be exposed to the problems, the mistakes, and the political maneuvering that plague a controversial hypothesis. And ultimately they should come away with a strong understanding of why the idea won acceptance.<br /><br />To do all of these things students must "get their hands dirty". They must carry out experiments and make observations. Reading the results of someone else's experiment is simply not as compelling as conducting the experiment yourself. Of course, in some cases they will be unable to perform the experiment themselves. Simulations can work well in such situations, but if no simulation is available then students will have to read about it. But whenever possible they should read primary sources. For the astronomy course I am developing I am convinced that my students can handle reading a few articles from the <i>Astrophysical Journal</i>, as well as some more historical material from the publications of the Royal Society of London. Original research articles on the history of science can also be of great use. I intend to have students re-analyze published data (after all, we won't have the Mount Wilson 100-inch telescope to play around with like Hubble did) and try to draw their own conclusions, then compare their findings to those of the original author. <br /><br />Of course, there needs to be some time for discussion and synthesis as well. Even a narrowly defined topic will have many strands of evidence that ultimately braid together to make the case for the new discovery or new theory. Students should delve as deeply as possible into several of these strands, but they also need time to do the braiding and see how the different strands tie together (or fail to tie together in some cases). Ideally there should be some strands of evidence that contradict each other (that is the case for the Copernican Revolution, where evidence from the physics of the time flatly contradicted the idea of a moving Earth). 
Such contradictory evidence creates a tension that must be resolved. Science strives for internal consistency and unity. This aspect of science is often left out of courses, because we never show the students the evidence that turned out to be "wrong".<br /><br />All of these things take time. You can't conduct your own experiments, read primary sources, delve into the history of the discoveries, explore multiple strands of evidence for a theory, and synthesize all of this into a unified whole and still cover every important theory in the discipline in a single semester. But this is the essence of science. Science is not, ultimately, about what we know right now. What we know now will be supplanted in the future. Science is about how we come to know things at all. And students should be encouraged to revel in the fact that we ARE able to know things, things that it might seem would be impossible for us to know. How could we, stuck here on our little planet, ever learn that there are other galaxies composed of billions of stars that are billions of light years away from our own galaxy? How could we ever know that the entire Universe is expanding? Isn't it mind-boggling that we can possibly say we "know" these things? And yet, these great pieces of knowledge are built up out of a series of much smaller, and much more believable, pieces. Students need to see how those small pieces fit together to form the grand (but very incomplete) puzzle of modern science. Surely we would prefer to read a single scene from a Shakespeare play (Hamlet and Ophelia in Act III, scene i, perhaps, or the hysterical play within the play that is Act V of <i>A Midsummer Night's Dream</i>) rather than read a synopsis of the plot. I think the same is true for science. If we want students to really see what science is all about they must be offered a tasty delicacy, not fed fast food.<br /><br />Well, those are my thoughts. Now I need to go ... 
I think I hear hoofs clattering on my rooftop.<br /><br />Posted by T_Timberlake.<br /><br />Kuhn's "Copernican Revolution" and Incommensurability (2007-12-18)<br /><br />It's been ages since my last post. I hit a point in the semester where I was sufficiently far behind so as to preclude any thoughts of essay-writing for this blog. But now the holidays have arrived and I have a backlog of topics to write about. Fortunately my reading did not halt when my blogging did...<br /><br />In the time since my last post I finished reading Thomas Kuhn's "The Copernican Revolution." It's an incredibly good read for anyone interested in intellectual history, and particularly the history of astronomy. I was very motivated to read it because I will be teaching astronomy starting next Fall, and I intend to teach a course developed by a colleague that focuses on the Copernican Revolution. I was also interested in the book because I had heard that Kuhn's work on the Copernican Revolution had ultimately led him to the conclusions about the nature of science that he presents in his "The Structure of Scientific Revolutions." In particular I was interested to see the origins of his idea of incommensurability (the idea that there is no logical way to decide between two competing paradigms because each paradigm has different standards of evidence and makes different fundamental assumptions that cannot be questioned within the paradigm).<br /><br />What struck me most about Kuhn's presentation of Aristotelian cosmology was how sensible ancient science was. Sure, I know that most of it has now been discredited. But Kuhn did a great job of showing how well the Ptolemaic/Aristotelian system explained much of what was "known" at the time (some of what was "known" turned out to be wrong as well, but they couldn't anticipate that then). 
There was also a great deal of internal consistency in ancient science, and in fact it was this internal consistency that produced much of the scientific resistance to Copernicus' proposals. Making Earth a planet did not just change astronomy; it also had an impact that would be felt throughout all of physics as well as in other areas. If ancient science had been a collection of ad hoc ideas then there would have been little resistance to Copernicus, since his ideas would have impacted only the highly specialized area of mathematical astronomy (in which Copernicus was a recognized leader). I was also impressed by how far medieval science advanced beyond the ideas of Aristotle. In particular, Oresme and Buridan were on the verge of the concept of momentum and something like Newton's Second Law. Kuhn also points out that Descartes was the first to clearly formulate a Law of Inertia. This makes the work of Galileo and Newton somewhat less revolutionary than I had thought (though still incredibly revolutionary).<br /><br />Overall I just can't see where Kuhn got the idea of incommensurability from. It just doesn't seem to be there in this book. He goes to great lengths to point out that Copernicus himself was a die-hard Aristotelian in almost all of his thinking except the planetary nature of Earth. Tycho Brahe was of a similar frame of mind. Kepler was not Aristotelian, and his general approach was quite different from that of most of his contemporaries. But Kepler was just one of the first to ride the wave of Neoplatonism. Kuhn readily admits that Kepler's explanation of planetary motion would have won over professional astronomers without any additional evidence. His predictions were simply more accurate than those of anyone else, and this was what counted for professional astronomers. Note that this was a common piece of evidence that both geocentrists and heliocentrists could agree on. There is no incommensurability there. 
Granted, Kepler's work was unlikely to win over the general populace to the heliocentric model. But that is a process that lies beyond the realm of science itself.<br /><br />One example of incommensurability I have often heard cited is the refusal of anti-Copernicans to admit telescopic evidence as valid. This is a disagreement over what constitutes valid evidence, but it is a scientifically legitimate disagreement. Galileo was the first to use a telescope for astronomy, and the science of optics was new on the scene. It is no surprise that some scientists viewed telescopes with suspicion. It was an as yet unproven technology. If those same scientists had lived long enough to see telescopes and other optical devices in common use they doubtless would have conceded that Galileo's evidence was valid. This is not a matter of incommensurable paradigms, but rather an appropriate cautiousness with regard to a completely new technology. Frankly, there were a wide variety of scientific reasons for rejecting Copernicus' system. For one thing, it wasn't any better than Ptolemy's, as Kuhn points out. For another, it required the dismantling of virtually all the physics that was known at the time. It turns out this was a good thing, because that physics was wrong, but it was surely reasonable for Copernicus' contemporaries to hesitate to throw away what they knew of physics for something that would bring them little or no gain. Copernicus himself knew his theory had major problems and expected it to be criticized (which is why he resisted publishing it until just before his death). There was a lot that needed to be worked out before the benefits of the Copernican idea could be reaped.<br /><br />Perhaps a genuine incommensurability lies in how various astronomers judged Copernicus' theory. To those with an empirical, Aristotelian viewpoint it could only be deemed a failure or at best a "nice try." To those with a more Platonic perspective (like Kepler) the theory had much to credit it. 
It was conceptually more economical than Ptolemy's system, even though this conceptual economy had to be covered over with ad hoc additions to make the predictions match the level of accuracy of the Ptolemaic system. But this difference in perspective does not represent an incommensurability between two scientific paradigms. Rather, it seems to be a possible incommensurability between individual scientists who may place different value on different types of evidence. Differences between individual scientists have been around as long as science has. Kuhn is claiming something much larger in "Structure" than that sometimes scientists disagree with each other.<br /><br />I wonder if a similar examination of a smaller-scale scientific revolution would have led Kuhn away from the idea of incommensurability. The Copernican revolution involved many philosophical and theological issues in addition to the scientific issues. Copernicus' idea ultimately overthrew a worldview that had dominated Western thought for millennia. The revolution itself spanned a long period of time (from Copernicus to Newton) and it came at a time when great technical advances were made (though this may be typical of any important scientific revolution). As Kuhn points out, the backlash against Copernicus' ideas was driven in part by the fundamentalism of the new Protestant faith and the need for the Catholic Church to find a target to attack in order to show that it was not lax about biblical authority. The examination of a similar revolution that did not have all of these complicating factors might not lead to the idea of incommensurability. An example that comes to mind (because I've been studying it recently) is the revolution that saw our Sun moved from the center of the Universe to out near the edge of one spiral galaxy among billions. 
There were issues of evidence here as well, particularly in regard to the Cepheid variable period-luminosity relation and van Maanen's measures of the rotation of spiral nebulae, and as a result astronomers disagreed on some major points (such as whether spiral nebulae were inside or outside our galaxy). Ultimately, though, a consensus was reached and the main players on both sides of the debate came to the same conclusions in the end. No incommensurability there, it seems.<br /><br />Posted by T_Timberlake.<br /><br />Time and Length Scales for Scientific Theories (2007-10-08)<br /><br />As noted in an earlier post, I am in the process of reading (and thoroughly enjoying) N. David Mermin's <span style="font-style:italic;">Boojums All the Way Through</span>. In a couple of his essays Mermin emphasizes the incredible successes of quantum mechanics. He mentions the fact that quantum mechanics was born in 1900 (on December 14, exactly 73 years before I was born!) and that even now (he published one of these essays in 1988, but we'll update it to 2007) there are no signs that quantum mechanics is incorrect. So we've had over 100 years to overthrow the theory, or at least build some solid evidence against it, and nothing has happened. Mermin also points out that the quantum theory was developed to explain atomic processes that have a characteristic length scale of 10^-9 meters or so, but that the theory has been extended to the much smaller length scales of subatomic particles (for this we can use the length scale for weak interactions, something like 10^-17 meters). So quantum mechanics has proven successful over length scales spanning 8 orders of magnitude.<br /><br />This got me thinking: Is this really all that impressive? How does it stack up to the run that classical physics had? Not very well, it turns out. 
Classical physics was born, to give a conservative estimate, with the publication of Newton's <span style="font-style:italic;">Principia</span> in 1687. Classical physics was essentially unchallenged until Planck's lecture on December 14, 1900. It wasn't SERIOUSLY challenged until Einstein's "On the Electrodynamics of Moving Bodies" in 1905. That's a span of over 200 years. As for length scales, classical mechanics was primarily devised to account for the motions of the planets in the solar system. The diameter of the solar system is on the order of 10^16 meters. Of course, Newtonian mechanics also turned out to work pretty well for small objects (let's say on the scale of 1 mm - again a conservative estimate since classical physics surely works quite well at the micrometer scale and even somewhat smaller). This conservative estimate gives us length scales spanning 19 orders of magnitude. So it looks like quantum mechanics has a long way to go before it could be considered as "successful" as classical mechanics according to these measures. But eventually we did concede that classical mechanics was not the final correct theory. Why should we then accept quantum mechanics as the final correct theory?<br /><br />Now, this is not to say that I think quantum mechanics is wrong in some specific way. It really has looked pretty good so far. But the inductive argument doesn't work - just because it hasn't failed yet doesn't mean it never will. As scientists we should not consider ANY theory to be final, regardless of its level of success (as measured by time, length, or any other means). There is no doubt that quantum mechanics is impressive. The fact that it has challenged some of our basic notions of the nature of physical reality indicates, I think, that it gets at something very deep. But we also should not refuse to question quantum mechanics when the answers it provides are less than satisfying.<br /><br />There is no doubt that quantum mechanics represents the best theory available right now. 
We should continue to use it, and to extend it in interesting ways (e.g. quantum field theory, quantum gravity if we can manage it, etc.). But the same was true of classical mechanics long ago when it was being extended in interesting ways by the likes of Lagrange, Maupertuis, and Hamilton. Let us not be guilty of the hubris of Lord Kelvin, who declared at the end of the Nineteenth Century that physics was all but finished. Nature may still have some surprises in store for us, and since we cannot know if we will be surprised we must always remain open to the possibility.<br /><br />Posted by T_Timberlake.<br /><br />Creativity in Physics and Literature (2007-10-06)<br /><br />I've been reading N. David Mermin's <span style="font-style:italic;">Boojums All the Way Through</span>, a collection of his essays, articles, and book reviews. One of the book reviews is of a biography of Lev Landau, and one of the nuggets that Mermin extracts for the reader is Landau's logarithmic scale for rating physicists. Einstein apparently received a special rating of 0.5, while the (other) founders of quantum mechanics (Bohr, Heisenberg, Schroedinger) rate a 1. Landau apparently gave himself a 2.5 but later upgraded this to 2. Mermin, later in his book, describes himself as a 4.5. Landau apparently referred to physicists who rate a 5 (the worst score on his scale) as "pathologists".<br /><br />Reading about Landau's rating system has caused me to reflect on creativity in science. After all, what is it that distinguishes a 1 (or even a 0.5!) from a 5 on Landau's scale? I would argue that it is most certainly creativity. It surely is not hard work, for though the great physicists were passionate about their subject and thus undoubtedly worked quite hard, I am certain that many who have worked as hard or harder still rate but a 5. 
It cannot be anything like mathematical ability. Einstein's self-reported troubles with math are well-known, and Bohr was apparently wretched at doing serious calculation. I suppose one might cite physical intuition as the determining factor, but what exactly does that mean? Physics professors usually mean by "physical intuition" a certain level of internalization of the known laws of physics. But the great physicists were great specifically because their thinking was NOT limited by an internalization of the known laws. They were able to see beyond what was known. I don't know what else to call this but creativity.<br /><br />Creativity is an issue that has been much on my mind as I study the philosophy of science. Philosophers of science tend to ignore the creative aspect of science. I think this is for two reasons. First of all, philosophy of science is most often an attempt to rationally reconstruct the actual activities of science. How can one rationally reconstruct an act of creativity? Second, philosophers of science tend to focus on how scientific theories are tested and how the decision to modify a theory comes about. They focus much less on how new theories are constructed or how old theories are modified. The point of the philosophy of science is not so much to explain how scientists construct theories as it is to explain how we can make sure those theories are legitimate (or useful, or not blatantly wrong, or not entirely metaphysical). The creative act of science is thus outside the purview of philosophy of science. The main exception I can think of would be the early inductivists, who saw theories as generalizations from the data. But I don't think anyone would seriously argue that, even when such generalizations do occur (and I think they occur infrequently), this process is straightforward or simple.<br /><br />But even if philosophers of science can't explain scientific creativity, is it possible to classify it in some way? 
It seems to me that scientific creativity is much like creativity in other areas. I'd like to draw some comparisons between physics and literature to illustrate this point. Let me make clear at the outset that I am no expert on English literature, and I am largely ignorant of non-English literature. But hopefully if I stick to things that I have read and that I know are widely acclaimed, I'll muddle through this without offending anyone.<br /><br />Most of science is performed by people who are creative only on a small scale, in a somewhat workaday fashion. These would be Landau's 5's (I would rank myself among them, if I deemed myself worthy of any rank at all). Creating new scientific knowledge of any kind requires a certain level of creativity, in that you are doing something that has not been done before and therefore you cannot follow any sort of template. Probably most of the creativity actually comes in formulating a question to study, or at least choosing some investigation to perform (I rarely have a well-formulated question when I begin my research, but I usually do have some idea of something into which I want to poke my nose). Starting a research project involves a suite of choices that cannot usually be guided by established principles. In any field of science there are an infinite number of factors that can be analyzed, or relations that can be investigated. For most scientists the creative act comes in choosing which of these infinite possibilities will be productive or interesting. From that point on the work may involve only well-trodden pathways. I think this is something like most popular fiction. The author must come up with an idea for a novel or story, and if they are to avoid charges of plagiarism it must be an idea that is new on some level. But the typical work of popular fiction is pretty similar to something that has gone before, and both the prose and the literary conceits are likely to be standard fare. 
If Landau had rated novelists he would probably consider most authors on the NY Times Bestseller list to be "pathologists".<br /><br />Somewhere much farther up the chain come those who extend the boundaries of the field in a significant way. This can be done by breaking new ground, or by finding hitherto unknown connections between disparate areas. In physics this might include those like Dirac or Feynman, who worked within the framework of quantum mechanics but extended that framework into new and unexpected territories (Poincare would be a similar example in the realm of classical physics). For an old school example, Johannes Kepler might fall into this category since he worked within a framework established by Copernicus but made a crucial extension (to elliptical orbits) that turned out to make all the difference. The great unifiers would also fit in this category (here I am thinking mainly of Weinberg, Glashow, and Salam for their unification of electromagnetic and weak interactions, but classical physicists like Lagrange and Hamilton might also fit this category, as might James Clerk Maxwell). In my limited knowledge of literature I would put Vladimir Nabokov in this category. <span style="font-style:italic;">Lolita</span> was not, really, an entirely new type of novel. But it was about things that no novel had been about before. Similarly, <span style="font-style:italic;">Pale Fire</span> turns a poem (and its exegesis) into a novel and thereby creates a connection that had not been exploited before. Note that this categorization deals only with creativity. Dirac and Feynman both possessed incredible technical prowess in addition to their creativity. Similarly, Nabokov's prose is nothing short of breathtaking. Perhaps it is impossible to separate these attributes from creativity, but I am at least not explicitly taking them into account here.<br /><br />At or near the top of the creativity scale are those who change the face of their fields forever. 
In physics this would include Landau's 1's: Bohr, Heisenberg, Schroedinger. I'd add Boltzmann and Faraday. To go way back we could add Galileo and Copernicus in this category. These people gave us a new way of understanding the natural world. They had the vision to see far beyond the existing theories, and the courage and creativity to construct something radically new. I suppose Joyce would be in this category for literature (though I must confess that I have only read <span style="font-style:italic;">Portrait of the Artist as a Young Man</span>, and parts of <span style="font-style:italic;">Dubliners</span> - I'll get to <span style="font-style:italic;">Ulysses</span> someday soon, but I may not be strong enough for <span style="font-style:italic;">Finnegans Wake</span>). I'd like to put Borges in this category as well (you see, I have read some non-English authors) because I think his creativity merits it, even if his prose does not (but then, I didn't read his work in Spanish so I can't really say). These authors wrote works that departed radically from the conventions of the day, and literature has not been the same since.<br /><br />Now, what about that special case of Landau's: Albert Einstein, who merits a 1/2. I would argue that Isaac Newton merits the same special score. What author could merit such a special distinction? William Shakespeare? Fyodor Dostoyevsky? (OK, I admit that I'm trying to atone for my English-language leanings here.) I will let others more literate than I make that call. What is it that sets these people apart from the 1's? Again, I believe it is creativity, but it is a level of creativity that inspires awe. The work of a 5 may lead one to think "I would have thought of that if I had worked on that problem." The work of, say, a 2 might lead one to think "I wish I could have thought of that, but I doubt I would have." In the case of the 1's, we might think "I can't believe they thought of that; they are geniuses." 
For those in this special category words fail us, and we are left to gaze in awe at a mind that operates on a level entirely different from our own.<br /><br />I'll close this essay by pointing out some interesting features of what I have just written. It strikes me as curious that all of the physicists I mention are theorists, not experimentalists (except Galileo and Faraday, who were both). The authors I mention are all prose authors (well, except Shakespeare). The second fact follows directly from my own personal prejudices (I prefer prose to poetry), which in turn have influenced what I have read. But the exclusion of experimentalists seems odd to me in retrospect, and it would be false to claim that I am simply unaware of highly creative experimental work. Millikan was incredibly creative, as was Michelson. James Joule certainly deserves some high marks for his creativity in studying the relation between heat and mechanical energy. In fact, I think one could argue that in recent years experimentalists have demonstrated a higher level of creativity than have theorists. But somehow this seems like a different type of creativity. For one thing, it is highly constrained creativity in that experiments must make use of apparatus that either exists or can be built with a reasonable investment of time and money. Theoretical creativity is largely free from such practical constraints. Furthermore, experimental greatness requires a set of skills that are not specifically intellectual. Perhaps one day I will write an essay comparing the great experimental physicists to the great painters. Could I, then, rate da Vinci in the same category as himself? Probably not - he was much better as a painter than as an experimental physicist, although this was probably not due to a lack of creativity. 
Anyway, until I write that essay I will simply apologize to the experimentalists and try to redirect the blame toward Lev Landau, who got me started thinking about all of this anyway.<br /><br />Posted by T_Timberlake.<br /><br />Interpreting Quantum Mechanics (2007-10-01)<br /><br />I just got my copy of the October <span style="font-style:italic;">American Journal of Physics</span> (the best, though not the most prestigious, physics journal in the world). The Letters to the Editor section contains a letter by Art Hobson, written in response to a book review by N. David Mermin. The book that Mermin reviewed was <span style="font-style:italic;">Quantum Enigma</span> by Rosenblum and Kuttner. I've not read the book myself, but I did read Mermin's review. One of his chief complaints (though not his only complaint, nor was his review wholly critical) was that in discussing various interpretations of quantum mechanics Rosenblum and Kuttner ignore the view that quantum states represent not physical states of a particle but rather states of our knowledge. Hobson rejects this view (as well as the view, evidently emphasized by Rosenblum and Kuttner, that perception of a measurement result by a conscious entity brings about a collapse of the wavefunction).<br /><br />Now I have a great deal of admiration for both of the participants here. I am in the process of reading Mermin's <span style="font-style:italic;">Boojums All the Way Through</span>. Mermin is without question the best prose stylist in physics (and apparently a major contributor in condensed matter physics, though that's not my field so I can hardly judge). Hobson, on the other hand, has been a champion for the social relevance of physics and for the teaching of physics to non-science students. 
I use his <span style="font-style:italic;">Physics: Concepts and Connections</span> textbook for my liberal-arts physics course. While I would have read any letters on interpreting quantum mechanics with interest, the name recognition definitely made these letters stand out to me.<br /><br />Hobson claims the view that quantum states are states of knowledge rather than states of some objective physical reality is an unnecessary extravagance. He argues that the analysis should really be done from the perspective of quantum field theory, and that most physicists certainly believe that quantum fields are objectively real (offering a quote from Weinberg that I have seen him use before). He then explains how decoherence allows a quantum superposition to be transformed, through interaction between the quantum system and its environment, into an incoherent state that can be described with a diagonal density operator. Hobson then declares that these incoherent states are no more mysterious than the proposition that there is a 0.5 probability that a coin flip will come up heads.<br /><br />I find this last comment by Hobson particularly interesting in light of the position he is attacking. He wants to avoid the claim that quantum states are states of knowledge, and yet he is reduced to saying that quantum probabilities are just like the probabilities involved in flipping a coin. But classical probabilities, like those for a coin toss, are invoked exactly because we lack knowledge. The equal probability of getting heads or tails when a coin is flipped does not represent anything objectively real about the state of the coin on a given flip. What it represents is the state of our knowledge about the coin. If we knew a great deal more about the coin's initial state, and about all the forces that act upon the coin, we could determine with near certainty which side of the coin would land up. 
It is only because we are ignorant of all this information that we must resort to probabilities. So Hobson's invocation of decoherence seems to support Mermin's view, rather than refute it. Indeed, decoherence can only be deemed to have fully solved the measurement problem if quantum states are only states of knowledge (because it reduces the quantum superposition to a classical mixture). If we believe that quantum states represent an objective reality then we are left wondering why decoherence fails to produce a single outcome (rather than a classical mixture of various outcomes). Certainly when we do measurements in the lab we get a single outcome each time (though not the same outcome every time we repeat the measurement).<br /><br />I also find Hobson's reliance on quantum field theory to be a little problematic. Not so much for technical reasons as for pedagogical reasons. In fact, I have avoided moving to the new edition of his text in part because of this. It is not clear to me that all of the mysteries of quantum mechanics can be swept under the rug of quantum field theory. Quantum field theory has been very successful in describing a rather limited range of phenomena. But it's not clear to me that quantum field theory, as a model of physical interactions, completely contains and therefore exceeds non-relativistic quantum mechanics. In the same way, I have yet to be fully convinced that quantum mechanics completely contains classical mechanics. The idea that our "most fundamental" theory might not contain all the other "less fundamental" theories is anathema to most physicists, but it doesn't bother me since I don't believe in the idea of a final theory anyway. In any case, from the perspective of a non-science student I think blaming the whole mess on quantum fields is a bit like saying the Wazzleblatchet did it (which might be great for a Dr. 
Seuss tale, but not in my physics class).<br /><br />Now this isn't to say that I side fully with Mermin on this debate. I have some issues with the idea that quantum states are "nothing more" than states of knowledge. As I said above, we use classical probabilities to represent states of knowledge. But clearly there is something different going on in quantum mechanics. So if quantum states are states of knowledge then our knowledge about quantum particles is constrained in some rather odd ways. In this sense, saying that quantum states are states of knowledge does little to dispel the mysteries of quantum mechanics. I'm not sure that is really Mermin's goal. His goal in the review was to combat the idea that consciousness brings about physical changes in some objectively real quantum state. If quantum states are states of knowledge then it is no surprise that the quantum state changes when a conscious entity becomes aware of a measurement result. But even this point of view does not wholly discount the idea that there IS an objective physical reality. I personally view any science (not just quantum mechanics) as the result of an interplay between our minds and an objective physical reality. Neither piece is wholly absent from classical physics, nor from quantum physics.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-59981088567311175432007-09-29T20:30:00.000-07:002007-09-29T20:33:27.396-07:00Physics is a Liberal Art!In my <a href="http://noninertialframe.blogspot.com/2007/09/why-noninertial-frame.html">introductory essay</a> for this blog, I argued that physics is a liberal art. I'd like to spend a little time making a stronger case for that argument. It seems to have become commonplace for people to equate the liberal arts to the humanities (and perhaps even to only certain disciplines within the humanities). 
The seven traditional liberal arts were rhetoric, grammar, logic, geometry, arithmetic, music, and astronomy. Of these it is easy to associate rhetoric and grammar with, say, a major in English (though English professors would cringe at the idea that they primarily teach rhetoric and grammar - of course, their focus is on literary criticism). Similarly, one can associate logic with philosophy and music with the fine arts in general. So there is no doubt that there is a big overlap between the traditional liberal arts and the humanities. But what about geometry, arithmetic, and astronomy? I see little choice but to equate these with the modern study of mathematics and science. Granted, a modern mathematics major will spend little time studying geometry (and hopefully none studying arithmetic, including "college algebra", which they should already know), just as English majors don't spend much time on grammar. Still, there can be no doubt that mathematics and science were very much a part of the traditional liberal arts.<br /><br />Of course, one can argue that the term simply means something different now. But what did the term mean in classical and medieval times? It referred to areas of knowledge that were appropriate for free men, as opposed to more applied areas of knowledge that might be appropriate for slaves or serfs. So if we take the term to mean the same thing today (knowledge appropriate for free persons), then what should the liberal arts be in today's context? I have no easy answer for that, but I am absolutely certain that science must be a part of it. A free person in modern society must have a basic understanding of the methods of science, and at least some rudimentary scientific content knowledge. Why? Because science and its by-products pervade every aspect of modern society. Science drives our economies and has helped produce a worldview that is conducive to the modern democratic state (i.e. with the concept of universal natural laws). 
Those without any knowledge of science in today's world are in a dangerous situation because they can be easily controlled and manipulated by those who do understand science (and often by those who don't - just look at some political rhetoric and advertisements for pharmaceuticals). I hope no one would argue with the notion that all free persons should know how to read, write, and perform basic mathematical manipulations. I would place science right after these on the list of things a free person should know.<br /><br />Now, I don't mean that every free person needs to major in science at the college level. Hardly. But a free person should possess a basic understanding of the methods of science, and some ability to distinguish science from pseudoscience and from that which is simply not science. Unfortunately, students who take science courses at the college level are often given an "introduction to the discipline" that focuses on content rather than methodology. These courses might make it seem like the purpose of studying science is solely to become a scientist. This makes the sciences seem more like applied disciplines than intellectual disciplines appropriate for all free persons. I think science should be taught as the liberal art that it is, rather than as vocational training. This is imperative for non-science majors who may take only one or two science courses. I am becoming increasingly convinced that we should also teach courses for science majors this way, at least at the introductory level. More advanced courses may appropriately take on a more "vocational" or "professional" feel once students have an understanding of the basic methodology of science.<br /><br />One last thought on why science is a liberal art: science is a liberal art because people pursue science for the same reason that people pursue other liberal arts. Most English majors do not study English because they sense it will land them a high-paying job one day. 
Most Fine Arts majors don't view their education as preparation for a lucrative career as a painter, etc. But neither do physics majors study physics because it will get them a good job. Most of us study physics for the same reason that people write poetry: because it brings us joy. Doing physics is fun (at least, it is for me). Physics, like the other sciences, is very intellectually stimulating. Now perhaps this can be said of anything. I've talked with some marketing professors who make the study of marketing sound enjoyable and intellectually stimulating. But most students who study marketing probably do so because they want to get a job in that field. Physics students don't tend to think that way. Many physics majors go on to grad school, but I think this is primarily because they enjoy studying physics and they want to keep doing so. Others get jobs straight away, more often than not outside the field of physics. And that's fine, because they didn't major in physics as preparation for a specific job. They majored in physics because it challenged their mind, deepened their reasoning skills, improved their understanding of nature, and honed their mathematical ability. I believe the same is true for other liberal arts like English or History. They don't really serve to prepare you for a specific career (unless you want to teach), but they provide you with a set of intellectual skills that can enrich your life and make you capable of meeting almost any challenge.<br /><br />I'd like to see the sciences receive recognition as liberal arts. I think the Humanities folks need to acknowledge science's rightful place among the liberal arts. I likewise think that quite a few scientists need to stop scoffing at the liberal arts and start recognizing that their own subject is as much a liberal art as History or Philosophy. That doesn't mean we can't continue to recognize certain boundaries between disciplines. 
Joyce's Ulysses is not science, any more than Einstein's "On the Electrodynamics of Moving Bodies" is literature. But both should be recognized as the great intellectual achievements they are. How much poorer we would be if we had only science, or only literature, but not both!T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com2tag:blogger.com,1999:blog-5701992180533786362.post-83306785320500544142007-09-15T20:12:00.000-07:002007-09-15T20:15:10.052-07:00The Scope of a Scientific TheoryIn this essay I want to follow up my discussion of Lakatos' conception of scientific research programmes by describing Thomas Brody's conception of the scope of scientific theories. Brody's views, as set forth in his <i>The Philosophy Behind Physics</i> (a book which he did not complete before his death, and which includes some essays written by him that were never intended for the book), seem to have been largely ignored by the philosophy of science community. It may be that his ideas are fundamentally flawed, and have been ignored for good reason. I'm not enough of an expert to judge that. However, I find his concept of scope quite compelling. In particular, it seems to me that Brody's approach can be viewed as another way of keeping Popper's basic approach to scientific methodology while simultaneously addressing some of the problems that beset Popper's views (see my <a href="http://noninertialframe.blogspot.com/2007/09/first-response-to-naive.html">previous essay</a> for a brief discussion of some of these problems).<br /><br />Brody's understanding of scientific progress sees the evaluation of scientific theories as divided into two stages. In the first stage, a nascent theory gains support by accumulating confirming evidence (corroboration, in Popper's terminology). A newly proposed theory that fails to be supported by empirical evidence will likely be discarded. 
However, once a theory has survived this early stage it moves on to a phase in which the focus is on trying to find situations in which the theory fails (just as in Popper's falsification approach). The purpose of finding these failures, though, is not to falsify the theory in anything like an absolute sense. The theory will not be discarded simply because a few failures occur. Rather, these failures of the theory are used to delimit the theory's scope.<br /><br />The scope of a theory, as Brody presents it, is something like the set of circumstances in which the theory will produce successful predictions. This definition, though, is probably too vague. If the successes and failures of the theory follow no apparent pattern then it is probably impossible to define the scope of that theory. But a theory's scope can become well defined if we can translate the "circumstances" in which the theory is used into something like a parameter space. If we then find that the theory produces successful predictions for some region of this parameter space, but fails to produce successful predictions outside this region, then the region of success effectively defines the scope of the theory. Brody seems to assume in his writing that we should expect theories to behave in this way. He does not address pathological cases in which the points in parameter space at which the theory is successful are intimately intermixed with the points at which the theory fails. I think this is because his concept of scope is largely derived from his view that all theories are approximations (see my <a href="http://noninertialframe.blogspot.com/2007/09/approximation-in-physics.html">essay on approximations in physics</a> for my take on this). Mathematical approximations (such as truncating a Taylor series, for example) are generally valid for some range of values for the relevant variables and invalid outside of that range. 
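The Taylor-series analogy can be made explicit with a small sketch (my own illustration; the 1% tolerance is an arbitrary choice of mine) that delimits the "scope" of the small-angle approximation sin(x) ≈ x:

```python
import math

def relative_error(x):
    """Relative error of the truncated Taylor series sin(x) ~ x."""
    return abs(math.sin(x) - x) / abs(math.sin(x))

# Scan the "parameter space" 0 < x < pi and keep the region in which
# the approximation succeeds to within a 1% tolerance.
tolerance = 0.01
xs = [i * 0.01 for i in range(1, 315)]          # 0.01 .. 3.14 rad
scope = [x for x in xs if relative_error(x) < tolerance]

print(f"sin(x) ~ x holds to 1% for x up to about {max(scope):.2f} rad")
```

The region of success (here roughly x < 0.24 rad) is the truncation's range of validity: inside it the approximation predicts successfully, outside it the approximation fails, and neither fact by itself makes the approximation "true" or "false."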
Brody seems to think the scope of a theory can be determined in much the same way.<br /><br />Now I find this idea compelling because it avoids the assumption that I have claimed lies at the heart of naive falsificationism, namely that there is one "true" theory that is capable of predicting everything in the Universe. Brody sees scientific theories as human inventions, inventions that can at best approximate reality. Science, from this point of view, is not about pursuing absolute truth but about finding approximate truths and understanding when those approximate truths hold and when they do not. Brody finds it perfectly acceptable to have one theory that describes a phenomenon in a certain parameter range and another <i>logically incompatible</i> theory that describes the same phenomenon in a different parameter range. He discusses the various models used in nuclear physics in this context. It is possible that one could even view wave-particle duality (of photons, electrons, etc.) in this way, although it is not clear how one could define parameters such that wave behavior manifests for one range of parameter values and particle behavior for a different range.<br /><br />Another reason I find Brody's idea compelling is that it seems to reflect some important parts of scientific history. When a well-established theory is falsified, scientists have historically not tossed the theory aside and moved on to something else. Certainly, there is a desire to formulate a new theory that will work where the previous theory failed. But quite often the "falsified" theory is kept on. If one can clearly determine the scope of the theory then it is rational to continue using the old theory in those situations which fall within its scope. Classical Newtonian mechanics is an excellent example of this. 
Newtonian mechanics has been falsified over and over, yet we have not tossed it aside (I teach a two-semester sequence on intermediate classical mechanics and I don't think I am wasting my students' time!). We still use Newtonian mechanics in those situations where we are confident that it will work. There may be a sense in which physicists are convinced that quantum mechanics and relativity are "truer" than Newtonian mechanics, and that we are only still willing to use Newtonian mechanics because it accurately approximates those "truer" theories in certain situations. But in the case of quantum mechanics, showing that the "truer" theory reduces to Newtonian mechanics in the appropriate circumstances has proved to be challenging (particularly in the case of chaotic systems). The same may be true for general relativity, although I know much less about that case. I think Brody would claim that we need not worry so much about this issue. As long as we know when it is okay to use Newtonian mechanics, then it is fine for us to do so. We don't have to convince ourselves that we are really using quantum mechanics and general relativity in an approximate form.<br /><br />Now I think that the concept of scope helps resolve some of the problems associated with naive falsificationism, but it certainly doesn't settle all of them. In particular, it seems to suffer at the hands of the Duhem-Quine thesis. If a theory fails to predict the results of an experiment, how can we be sure that the experiment is outside the theory's scope? It could be that the failure is due to an auxiliary hypothesis (thus indicating that the experiment is outside the scope of that auxiliary hypothesis). So just as we can never falsify a theory in isolation, we can never determine the scope of a theory in isolation. We can only determine the scope of an entire network of theories that are used to predict the results of an experiment (and to interpret the experimental data). 
Another way to state this is that when we test a theory we must inevitably assume the validity of several other theories in the process. This assumption may prove to be correct, or it may not. Whenever we get a negative result, it could be a failure of the theory we are testing or it could be a failure of the assumptions we have made. This makes determining the scope of a theory a complicated process. In practice we must evaluate many theories at once and any failure signifies that we are outside the scope of <i>at least one</i> of the theories. Delimiting the scope of a set of theories thus becomes an endless process of cross-checking. So Brody's view faces some serious challenges - but I think it deserves more attention than it has received.<br /><br />I'd like to close this essay by trying to tease out some similarities between the approaches of Lakatos and Brody. Both seem to build on Popper's basic premise. Both avoid inductivist ideas. Both attempt to defend the rationality of science (contra Kuhn and Feyerabend, etc.). I think one could even reformulate Lakatos' ideas using Brody's language. When we perform an empirical test of a theory we are really testing a whole network of theories and assumptions. However, based on the details of the experiment we may have different levels of confidence in the various theories that compose the network. We may be very confident that we are well within the scope of many of these theories/assumptions, and therefore we would be very unlikely to blame any failure on these parts of the network. The theories or assumptions in this group would form the "hard core" in Lakatos' terminology. On the other hand, we may be less certain about where the experiment falls in relation to the scope of other theories and assumptions in the network. We would be much more likely to blame a failed prediction on one of these theories/assumptions. This group of theories and assumptions then forms the "protective belt". 
This represents a significant change in Lakatos' conception (at least, as far as I understand it) because now theories could move between the hard core and the protective belt depending on the context of the experiment. I think this is a step in the right direction because it provides some much-needed flexibility. In particular, it opens up the door for falsifying (or at least delimiting the scope of) those theories which are part of the hard core. If a theory that is in the hard core is <i>always</i> in the hard core then it would seem to be unfalsifiable, and thus it would become a metaphysical principle or a convention rather than a physical theory. Yet, this idea does allow for the possibility that some theories (or principles, or whatever) could have universal scope and could therefore be "permanent members" of the hard core.<br /><br />I have actually used Brody's concept of scope in teaching students about the nature of science. I have them perform an experiment to determine the relation between the period of a pendulum's oscillations and its length. They consider two mathematical models: one in which the period is a linear function of length, and one in which the square of the period is a linear function of length. They generally find that both models fit their data well. They then use each model to predict the period of a 16-meter pendulum, and then they actually measure the period of such a pendulum (we have a 16-m Foucault pendulum in our lobby). They find that the second model's prediction is reasonably close, while the first model is way off. We could consider this a falsification of the first model, but I try to lead them toward a different conclusion: that we have really just shown that long pendulums lie outside the scope of the first model. In fact, if we made the pendulum VERY long (say, a significant fraction of Earth's radius) then we would find the second model would fail as well. 
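The classroom exercise can be sketched numerically. Here is a minimal illustration, with synthetic "data" generated from the standard result T = 2π√(L/g) rather than real measurements:

```python
import math
import numpy as np

g = 9.81  # m/s^2

# Synthetic "measurements" for short pendulums, 0.2 m to 1.0 m.
L = np.linspace(0.2, 1.0, 9)
T = 2 * math.pi * np.sqrt(L / g)

# Model 1: the period is a linear function of length.
# Model 2: the square of the period is a linear function of length.
model1 = np.polyfit(L, T, 1)
model2 = np.polyfit(L, T**2, 1)

# Extrapolate both models to a 16-meter Foucault pendulum.
L_big = 16.0
T_actual = 2 * math.pi * math.sqrt(L_big / g)    # about 8.0 s
T_model1 = np.polyval(model1, L_big)             # about 22 s -- way off
T_model2 = math.sqrt(np.polyval(model2, L_big))  # about 8.0 s

print(f"actual {T_actual:.2f} s, model 1 {T_model1:.2f} s, "
      f"model 2 {T_model2:.2f} s")
```

Both fits describe the short-pendulum data well, yet the 16 m measurement falls far outside the first model's scope and comfortably inside the second's.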
The basic idea is that all models have a finite scope, so a failure of a model doesn't mean we should discard it or else we would discard everything. However, in choosing between two models we may find that the scope of one completely encloses and extends beyond the scope of the other. In that case we would clearly prefer the model that has the wider scope. On the other hand, if the two models had scopes that overlapped only partially or else did not overlap at all then it would be quite reasonable to keep both models around so that we can use the model which is most appropriate for the prediction we are trying to make.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-67929670013854555522007-09-08T19:14:00.000-07:002007-09-09T05:11:51.667-07:00A First Response to Naive FalsificationismI've just finished reading Karl Popper's <i>Logic of Scientific Discovery</i> (including all of the "starred" appendices), so now is as good a time as any to write about two responses to Popper's ideas that have been on my mind lately. Let me start by giving a caricature of Popper's doctrine of falsificationism. The basic idea is that scientific theories are bold conjectures about the physical world, which must then be subjected to stringent empirical tests. Any theory which fails such a test must be discarded, and a new bold conjecture must be put forth. This notion of the methodology of science avoids many of the pitfalls of inductivist thinking (such as the verificationist approach of the logical positivists), but it suffers from its own set of problems. I'll list some of the more glaring ones here. First of all, it would be unwise to throw out a theory which has failed an empirical test unless there is a comparably good theory available that has yet to fail any empirical tests. In other words, just because a theory is "wrong" doesn't mean it is not better than nothing. 
Secondly, any empirical test will test not just a single theory but rather a whole set of theories (which may or may not be related to each other) along with various other assumptions and approximations that must be used to relate the raw empirical data to the prediction made by the theory. If the prediction fails to match the empirical data, it is not at all clear that this must be the fault of the theory that is supposedly being tested. Finally (for this essay - this is by no means a comprehensive list of problems with Popper's doctrine), naive falsificationism contradicts the history of science. Every theory that we hold today has been "falsified" at some point, and it seems likely that any future theory will suffer the same fate. Clearly we are not willing to throw out all of our current science because a few contradictions have been found.<br /><br />One thing that was interesting for me in reading <i>Logic of Scientific Discovery</i> is that Popper seems to have been quite aware of most, if not all, of these issues. In fact, he really says very little about falsification itself. His main concern is to define falsifiability, and to show how the falsifiability of a theory can be used as a criterion for demarcation between science and pseudoscience. It seems clear from reading the book that Popper understood falsification was not a simple matter. Regardless of what Popper thought, I'd like to analyze a basic assumption that I think would have to underlie the naive falsificationist view (if anyone ever actually held this view). The assumption is this: that there is a universally correct theory, a theory which would deliver predictions that would be correct in all circumstances. If this were true, then any theory which made a prediction that failed would clearly not be that universally correct theory and could thus rationally be discarded in hopes of reaching that "final theory." 
(Of course, this assumes that our empirical test was valid and that all the information that was used, along with our theory, to make the predictions was accurate.) I have argued in a <a href="http://noninertialframe.blogspot.com/2007/09/approximation-in-physics.html">previous essay</a> that I believe such a final theory is impossible. All theories are, I believe, approximations and all approximations are valid only in certain situations.<br /><br />I would like to discuss two attempts to remedy some of the problems with naive falsificationism. (I'm sure there have been attempts other than these, and certainly many philosophers of science have been quite content to discard Popper's ideas altogether, but I'm going to focus on these two attempts to retain Popper's basic approach.) The first approach I want to discuss is well known: Imre Lakatos' "methodology of scientific research programmes." The second is, I think, less well known: Thomas Brody's concept of the "scope" of scientific theories. Actually, I am personally more familiar with Brody's ideas because I have read his work, while I have only read descriptions of Lakatos' ideas written by others (I'll fix that soon, I hope - I have at least a few of Lakatos' key papers in my library). I'll describe Lakatos' idea in the remainder of this essay. In my next essay I'll describe Brody's approach and try to analyze the similarities and differences between these two views.<br /><br />Lakatos conceives of research programmes as conglomerations of theories and assumptions. As noted above, any empirical test cannot test a single theory but rather tests a whole group of theories (this is often referred to as the "Duhem-Quine thesis"). Lakatos realized that in the face of this problem scientists must make a decision about which piece of the collection of theories and assumptions that are relevant to the empirical test gets falsified and which pieces are still considered valid. 
Lakatos proposes that each research programme has a "hard core" that is considered almost sacred. The hard core consists of the fundamental theories that scientists will simply refuse to abandon, even in the face of an apparent falsification. Surrounding this hard core of theories is the "protective belt." The protective belt consists of lesser theories and a wide variety of assumptions to which scientists feel less compelled to cling. If a prediction fails to accord with the empirical test then this failure will be blamed on some element of the protective belt rather than on one of the theories that comprise the hard core. Thus the protective belt serves to shield the hard core from falsification. (Note that this is exactly contrary to what Popper says we ought to do with our theories, since he claims we should NEVER shield them from falsification but rather subject them to the most extreme tests.) This is somewhat similar to Poincare's distinction between principles and laws. Principles, for Poincare, are true by convention and cannot be subject to empirical tests, while laws are empirically testable. <br /><br />Lakatos' approach does appear to solve the problems with naive falsification that were mentioned above. It explains why, for example, we did not abandon Newtonian mechanics and gravitation when the predicted location of the planet Uranus was found to disagree with observations. Newton's theories were part of the hard core (at the time), while our knowledge of the objects that populate the solar system was part of the protective belt. Scientists were more willing to postulate an unknown planet than they were to modify or discard Newtonian theory. This turned out to be a wise choice, and the discovery of Neptune soon followed. Nevertheless, there are some problems with Lakatos' ideas as well. It seems very problematic that we should protect our theories from falsification. 
Indeed, how can theories that reside in the hard core ever be found false (if indeed they are false)? I suppose one can claim that repeated falsifications, even when the protective belt has been modified to account for past problems, might begin to weaken the hard core and eventually some portions of the hard core might "slip" into the protective belt. Perhaps the rise of quantum mechanics can be viewed in this way, although this view does not seem to apply to the development of relativity (Michelson and Morley provided the repeated falsification, but it seems that Einstein did not even consider this in developing his special theory).<br /><br />There is one issue I have wondered about, which Lakatos may have addressed although I am unaware of it. The issue is this: must the hard core and protective belt be fixed at any given time, or can theories and assumptions move from the hard core to the protective belt, and vice versa, depending on what kind of prediction we are making? I discussed in my <a href="http://noninertialframe.blogspot.com/2007/09/approximation-in-physics.html">previous essay</a> how the validity of approximations depends on the use to which they are put. This seems relevant to the idea of defining a hard core as well. In certain contexts we might feel that a theory is to be regarded as unquestionably true, while in another context we might consider the theory subject to falsification. One could look at Newtonian mechanics that way. I think any physicist would be unwilling to accept that a cannonball moved in a way that contradicted Newtonian mechanics. On the other hand, the same physicist would quite willingly admit that Newtonian mechanics is falsified by the motion of an electron. Now if one takes the view that science should strive for a final theory that works in all possible contexts, then this idea wouldn't make much sense. If falsification in any context implies total falsification (i.e. 
we declare the theory "wrong" and discard it) then it would not make sense to consider a theory falsifiable in certain situations but not in others. However, if we accept that scientific theories must always be approximate and provisional then one might be willing to move a theory from the hard core to the protective belt (or vice versa) based on the context of the prediction.<br /><br />I'll leave my discussion of Lakatos' research programmes for now, because I think Brody's concept of scope can illuminate some of these issues. I'll tackle that in my next essay.T_Timberlakehttp://www.blogger.com/profile/04668429381220577615noreply@blogger.com0tag:blogger.com,1999:blog-5701992180533786362.post-59501057583098235482007-09-02T21:06:00.000-07:002007-09-03T06:14:00.237-07:00Approximation in PhysicsScience is all about approximations. The importance of approximation is perhaps most clear in the discipline of physics, which deals with simpler systems than does chemistry or biology. In any case, physics is <i>my</i> discipline so I will focus mainly on approximation in physics. Certainly it is in physics that it is easiest to quantify our approximations, and this may make the role of approximation stand out more clearly in physics than it does in other sciences. Nevertheless, approximation is an important, indeed a fundamental, part of any science.<br /><br />Let's start by distinguishing two ways that approximation shows up in science (for now I'll focus on physics). One way to make use of approximation is <i>within</i> a model. When I use the word model I mean something a bit broader than what is meant when we talk about "the Bohr model of the atom." I would like to use the term "model" in something like the way that it is used in "the Standard Model of Particle Physics", although without any assumption that the model represents anything fundamental. 
(For that matter, I have doubts that the Standard Model represents anything fundamental, but that's another story - sort of.) I can make this clearer by describing a specific example. Let's take a model of the solar system. We might posit that each object (planet, asteroid, Sun, etc.) can be represented by a mathematical point in a three-dimensional (Euclidean) space. Each object is also assigned a mass. We then state that these point masses move in such a way that they obey Newton's three Laws of Motion combined with Newton's Universal Law of Gravitation. Now if we specify the initial conditions for each point mass (position and velocity) at some time t=0, then we can determine the state of our model at any future time. In essence, by "model" I mean a description of a system that includes a list of the basic entities of which the system is composed along with the relevant properties of those entities, as well as any rules or laws that dictate the behavior of those entities. I'll leave it open whether or not the initial conditions constitute part of the model.<br /><br />Now when I talk about using approximations within a model I mean finding the approximate behavior of the entities in the model, rather than the exact behavior <i>according to the rules inherent in the model</i>. For example, in our solar system model we could decide to ignore the gravitational effects of small bodies like asteroids and instead focus only on the gravitational pull of the planets (and perhaps a few larger moons). Note that this approximation can actually take two forms. One way to approach this approximation is to cut the small bodies out of the model entirely. In other words, let's just pretend for the moment that there are no asteroids or small moons in the solar system. This gives us a new, simpler model in which the list of basic entities is different from that of our original model. 
An entirely different approach to this approximation would be to keep the asteroids and small moons, but ignore the gravitational pull of these objects on the planets, etc. (but <i>not</i> the pull of the planets on the asteroids and small moons). This gives us a new model that has the same list of basic entities, but which follows a different set of laws. In particular, violations of Newton's Third Law of Motion (that objects exert equal magnitude forces on each other in opposite directions) and Newton's Universal Law of Gravitation (that all massive bodies exert gravitational forces on all other massive bodies) abound in this model. There is a sense in which this version of our approximation actually leads to a much more complicated model (sometimes the Third Law is satisfied, sometimes not), but in practice it turns out to be easier to make quantitative predictions in this model than in the original one.<br /><br />Note that which of these two approaches you take will depend heavily on what you wish to get from your model. For example, if you are trying to predict the motion of Mercury either version of this approximation might do the job nicely but the first version will probably be easier to use. On the other hand, if you are trying to predict possible asteroid impacts on Earth then clearly the first version of the approximation will not do the trick (I can tell you what THAT version will predict right now!). This raises a critical point about how approximations are used in science. The validity of an approximation cannot generally be evaluated on its own. Rather, it can only be evaluated in light of the goals of the scientist. For some purposes the first version of our approximation might be perfectly valid, but for other purposes it will be completely useless. 
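The two versions of this approximation can be made concrete with a minimal sketch (my own illustration, not a real ephemeris: the array names, the <i>is_small</i> flag, and the <i>one_way</i> switch are all invented for this example, and NumPy is assumed):

```python
import numpy as np

G = 6.674e-11  # Newton's gravitational constant, SI units

def accelerations(pos, mass, is_small, one_way=False):
    """Newtonian gravitational accelerations for point masses.

    pos:      (N, 3) array of positions in meters
    mass:     (N,) array of masses in kilograms
    is_small: (N,) boolean array marking asteroids and small moons
    one_way:  if True, small bodies still feel the planets' pull, but
              exert none in return -- the second version of the
              approximation, which deliberately violates Newton's
              Third Law (small-small interactions are dropped too).
    """
    n = len(mass)
    acc = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if one_way and is_small[j]:
                continue  # ignore the pull exerted *by* small bodies
            r = pos[j] - pos[i]
            acc[i] += G * mass[j] * r / np.linalg.norm(r) ** 3
    return acc

def step(pos, vel, mass, is_small, dt, one_way=False):
    """Advance the model one leapfrog (kick-drift-kick) time step."""
    vel = vel + 0.5 * dt * accelerations(pos, mass, is_small, one_way)
    pos = pos + dt * vel
    vel = vel + 0.5 * dt * accelerations(pos, mass, is_small, one_way)
    return pos, vel
```

The first version of the approximation corresponds to simply deleting the small bodies from the arrays before calling these functions; the <i>one_way</i> flag implements the second version, in which the entity list is unchanged but the laws are.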
This point may not seem all that striking when we are discussing approximation within a model, but it will also be important for our discussion of the second way that approximation is used in science.<br /><br />The second way that approximation is used is that our model itself is an approximation of a real physical system (or at least a possible physical system - as a computational physicist I often employ models that likely don't correspond to anything that currently exists in the real world, but give the atom optics people enough time and they'll probably create one). Our solar system model is presumably meant to be a model of the actual planets and other bodies that orbit our Sun. But even the original version of our model was very much an approximation. To begin with, we ignore every property of the objects in the solar system other than their position (and associated time derivatives) and mass. We ignore their chemical composition, their temperature, their albedo, their rotational angular momentum, and on, and on. Furthermore, we ignore the fact that there is a lot of stuff <i>outside</i> the solar system that could, at least in principle, influence the motion of the objects within the solar system. These approximations amount to making cutoffs in our list of basic entities and cutoffs in our list of the relevant properties of those entities. But that's not where the approximations stop. We also make approximations in regard to the laws or rules of our model. Our solar system model ignores all forces other than gravitation. We also chose to use Newton's Laws of Motion, which we know are only approximately correct. 
Indeed, we completely ignore quantum effects (perhaps justifiably) as well as relativistic effects (that's harder to justify - after all, we know we won't get Mercury's orbit right if we don't use General Relativity).<br /><br />Now after thinking hard about all the approximations that are intrinsic to our model of the solar system, we might begin to question whether the model is actually good for anything. It seemed pretty good at first (after all, we were prepared to keep track of a point mass to represent every object in the solar system). But now it seems like a joke. How can we convince ourselves that it is not? In other words, how can we convince ourselves that this model is valid? Well, we saw that in the case of approximations made within a model the question of validity hinged on our purpose in using the model. The same is true here. Suppose we want to predict what the Moon will look like from Earth on a given night in the future. Well, if all we want to know is the phase of the Moon then our model may do the trick since it can predict (approximately) the location of the Moon relative to Earth and the Sun. If, on the other hand, we want to be able to say something about how bright the Moon will appear then we are lost since that will depend on the nuclear reactions that produce the Sun's luminosity, the albedo of the lunar surface, as well as local atmospheric effects on Earth. None of that is in our model, so we have no hope. So in some cases we can show that the model is invalid for a particular purpose without even making a prediction. For example, if we want to predict properties of objects that are not included in the model (either the properties or the objects or both) then the model will not be valid for that purpose. But if we cannot invalidate the model that way, the best way to test it is to make predictions with the model and see how they stand up against the actual behavior of the physical system we are trying to model.
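To make the Moon-phase example concrete: the lit fraction of the lunar disk depends only on relative positions, which is exactly the information our point-mass model supplies. A minimal sketch (assuming NumPy; the formula is the standard phase-angle relation, where the phase angle is measured at the Moon between the directions to the Sun and to the Earth):

```python
import numpy as np

def illuminated_fraction(sun, moon, earth):
    """Fraction of the Moon's disk that appears lit from Earth.

    Takes only position 3-vectors -- no albedo, luminosity, or
    atmosphere required.  The lit fraction is (1 + cos(phase)) / 2,
    where `phase` is the Sun-Moon-Earth angle: 0 at full moon
    (Sun and Earth on the same side of the Moon), 180 degrees at
    new moon.
    """
    to_sun = sun - moon
    to_earth = earth - moon
    cos_phase = np.dot(to_sun, to_earth) / (
        np.linalg.norm(to_sun) * np.linalg.norm(to_earth))
    return 0.5 * (1.0 + cos_phase)
```

With the Earth at the origin and the Sun along +x, placing the Moon on the far side of the Earth from the Sun gives a fraction of 1 (full), between Earth and Sun gives 0 (new), and at right angles gives roughly 0.5 (quarter), as expected.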
If we wanted to predict Earth's orbit around the Sun for the next one hundred years we would probably find that our original solar system model would do a bang-up job.<br /><br />There may be other ways to validate our approximations, particularly if they can be quantified. For example, we can justify ignoring quantum effects in our solar system model because they simply won't show any noticeable effect at the scale we are examining. We can justify ignoring the gravitational forces exerted on the planets by stars other than the Sun because those forces are negligibly small. Again, though, whether or not these approximations are acceptable will depend on our purpose for using the model (just how precisely do you want to predict Earth's orbit?).<br /><br />Now we're getting close to the main point I want to make. I hope it is clear that approximations pervade everything we do in physics. Approximations are such a fundamental part of doing physics that physicists tend to forget they are there. But <i>all of physics is an approximation</i>. It is simply not possible to do any physics without making some approximations. We must truncate the list of basic entities in our model (otherwise we would have to include everything in the Universe - and I don't just mean the visible Universe). We must truncate the list of properties that the basic entities possess (otherwise we must consider all possible properties - and who can say what unknown properties might be possessed by some object in the Universe?). Ideally we will make use of our best physical laws, but even these are almost certainly approximations. <br /><br />I probably wouldn't get much argument from other physicists about what I've said so far, but now I'd like to claim something more controversial. I am convinced that understanding the role of approximation in physics makes it impossible to believe in a "final theory" (or at least impossible to believe that we will ever find the final theory). 
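As a concrete check on the "negligibly small" claim about other stars, here is a back-of-the-envelope estimate (the numbers below are rough round-figure assumptions, not measured data; I take the nearest star to be roughly one solar mass at about 4.2 light years):

```python
# Rough round numbers (assumptions, not precise data):
M_sun = 2.0e30    # kg, mass of the Sun
M_star = 2.0e30   # kg, nearest star taken to be roughly solar mass
r_sun = 1.5e11    # m, Earth-Sun distance
r_star = 4.0e16   # m, roughly 4.2 light years to the nearest star

# Gravitational force scales as M / r^2, so the ratio of the nearest
# star's pull on Earth to the Sun's pull is:
ratio = (M_star / r_star**2) / (M_sun / r_sun**2)
print(ratio)  # about 1.4e-11 -- utterly negligible for planetary orbits
```

Whether a part in a hundred billion is "negligible" still depends on purpose, of course, but for predicting orbits it clearly is.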
If we are willing to acknowledge that everything we now call physics is, at some level, an approximation then we must accept that any new theories must <i>potentially</i> be approximate as well. How would we be able to determine that a new theory was not an approximation, that it was in fact the "final" theory? Presumably we would use the theory (or model) to make predictions and then see if those predictions match our observations or experimental results (and I'll pretend for the moment that we can do this in some obvious way and actually know whether or not the predictions match the observations - reality is much trickier than this). But even if the match was perfect that would only validate the theory for that prediction (or at best for some class of predictions). We can only test the validity of an approximation in the context of some particular purpose. To show that a theory is universally valid (i.e. that it is a final theory) we would have to show that it is valid <i>for all possible purposes</i>. I simply don't think that it is possible to do this. I'm really not even sure what it would mean to make the claim that a theory is valid for all possible purposes.
But evaluating a theory based on its "logical probability" does not work well (I'll try to write about this later, but Karl Popper has already made the point pretty well). I am convinced that any "Dreams of a Final Theory" will remain just that (with or without the Superconducting Supercollider).<br /><br />P.S. Yes, I've read the book by Steven Weinberg whose title I quote above. No, I am not trying to criticize that book or Weinberg himself. In fact, I had two semesters of quantum field theory from Weinberg when I was a grad student. The man clearly knows a lot more about physics than I do, and he has thought a lot more about philosophy than the vast majority of physicists. Even if, as a pond minnow, I wished to pick a fight with one of the great Leviathans of the sea, I would not choose to pick a fight with Weinberg!<br /><br /><b>Why Noninertial Frame?</b> (2007-09-02)<br /><br />I'll try to answer two versions of the above question. First let me answer this version: why have I created this blog (which happens to be titled "Noninertial Frame")? My reason is fairly simple. I am a professional physicist and a professor at a liberal arts college. Both my undergraduate and doctoral degrees are from research universities, but somewhere along the way I picked up the liberal-arts mindset. I have a deep love for the liberal arts, including physics. Yes, physics is one of the liberal arts. It is part of the Quadrivium (which consists of arithmetic, geometry, music, and astronomy or cosmology - I'm counting this last as including physics). Although physics is my first academic love, it is not my only one. I am particularly interested in philosophy (part of the Trivium which includes grammar, rhetoric, and logic), primarily in philosophy of science.
As I have no professional training or expertise in any area other than physics (and a bit of math), I do not expect to engage in philosophy (or any other discipline except physics) in a professional capacity. But I love thinking about it and I love writing about it (and the writing helps me clarify my thinking). What better forum for unprofessional writing than a blog? With a blog I may actually find a reader or two who is interested in my thoughts, and I'll always know where to find that essay I wrote a few years back...<br /><br />Now for the second version of the above question: why have I chosen to name my blog Noninertial Frame? Well, the term is used in physics for a reference frame that does not obey Newton's Laws of Motion (specifically the Law of Inertia). Generally this means the reference frame is accelerating with respect to frames which do obey Newton's Laws. Why is this apropos for my blog? First of all, it's a physics term and I am a physicist. I expect everything I write to be connected to physics, if not entirely about physics. Second, I am writing from the perspective of a liberal arts physicist for whom the usual rules for physicists (focus on research, don't muck about with philosophy, etc.) don't seem to apply. Finally, I'm hoping that this blog will help me maintain my interest in the connections between physics and the other liberal arts, so that I don't settle into the <i>inertia</i> of a routine career in physics teaching and research.<br /><br />Now, what can you expect to find here? Mostly I will post essays (occasionally lengthy ones, I would guess) about issues related to physics but not strictly within the domain of physics per se. These essays will mostly be philosophical, but I will likely work in some mathematics and maybe a few discussions of literature and music (to round out the seven liberal arts).
I'll keep it all tied to physics, though, because if it has nothing to do with physics then there is probably no reason for anyone to read what I write.<br /><br />I hope this blog will be useful for someone other than me, but if not then I can live with that. Cheers!