One problem for high-altitude balloon projects is the CoCom limit on how high and how fast a GPS will operate. To prevent GPS modules from being used in very fast moving weapons (such as ballistic missiles) GPS receivers are not allowed to operate:
1. At altitudes higher than 60,000 feet
2. When traveling faster than 1,000 knots
The second restriction doesn't matter for GAGA-1, but the first does. GAGA-1 will have a maximum altitude (balloon dependent) of more like 100,000 feet.
Different manufacturers implement the CoCom limit in different ways: some use an AND rule (>60,000 ft and >1,000 knots) and others use an OR rule (>60,000 ft or >1,000 knots). For high-altitude ballooning it's ideal if the GPS uses AND. Unfortunately, this information is shrouded mostly in mystery and it's only through actual flights and testing that people have managed to determine which GPS receivers are AND and which are OR.
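The difference between the two rules is easy to see in a few lines of Python. This is only an illustrative sketch: real receivers implement the limit in firmware, the function names are mine, and the 50 knot balloon speed is a made-up figure.

```python
# Illustrative only: real receivers enforce this in firmware and the exact
# thresholds and comparisons vary by manufacturer.
ALTITUDE_LIMIT_FT = 60_000
SPEED_LIMIT_KNOTS = 1_000


def blocked_and_rule(altitude_ft, speed_knots):
    """Receiver stops reporting only when BOTH limits are exceeded."""
    return altitude_ft > ALTITUDE_LIMIT_FT and speed_knots > SPEED_LIMIT_KNOTS


def blocked_or_rule(altitude_ft, speed_knots):
    """Receiver stops reporting when EITHER limit is exceeded."""
    return altitude_ft > ALTITUDE_LIMIT_FT or speed_knots > SPEED_LIMIT_KNOTS


# A balloon near its peak: very high but slow (the speed is a made-up figure).
balloon = (100_000, 50)
print(blocked_and_rule(*balloon))  # False: the GPS keeps working
print(blocked_or_rule(*balloon))   # True: the GPS stops giving fixes
```

With an AND receiver the balloon never trips the limit because it never gets anywhere near 1,000 knots.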
For GAGA-1 I have two GPS units: one in the Recovery Computer and one in the Flight Computer. The Flight Computer is using a Lassen IQ which is known to work correctly on balloon flights.
The Recovery Computer is using the GM862-GPS which will fail. This is OK because it is used when the balloon has landed to send GPS location via SMS messages. But the failure mode is important.
I've been back and forth with Telit technical support and they claim that the module will simply fail to give me a GPS fix above 60,000 ft but that once the balloon is down again it'll restart automatically. Others claim that code should be included to automatically reset the GPS if it hasn't given a fix for some length of time. I plan to update the code to include an auto-reset after 30 minutes if no fix or no satellites during flight and recovery.
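Here's a rough sketch of what that auto-reset logic might look like. It is not the code in the repository and it is written in ordinary modern Python rather than for the module's interpreter; the class name, the injectable clock and the idea of a separate reset function are all my assumptions.

```python
import time

NO_FIX_TIMEOUT_S = 30 * 60  # 30 minutes, as planned above


class GpsWatchdog:
    """Tracks time since the last good fix and says when to force a reset."""

    def __init__(self, timeout_s=NO_FIX_TIMEOUT_S, clock=time.time):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_good = clock()

    def update(self, has_fix, satellites):
        """Call once per main loop; returns True when a reset is due."""
        now = self.clock()
        if has_fix and satellites > 0:
            self.last_good = now
            return False
        if now - self.last_good >= self.timeout_s:
            # Restart the countdown so a reset isn't forced on every loop.
            self.last_good = now
            return True
        return False
```

The main loop would call update() after each GPS read and, when it returns True, call whatever routine actually resets the GPS (not shown here because it depends on the module).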
2010-11-28
2010-11-27
GAGA-1: Capsule insulation and antenna mounting
A bit of physical stuff on GAGA-1 this weekend after the Recovery Computer software last time. I'd previously painted the capsule for high visibility, but hadn't started cutting it or sticking on parts. After the successful test of the Recovery Computer it's time to put some bits on the box!
The three antennae visible on the box (as with the other components) are hot glued in place. I pierced holes in the box using a long metal skewer and a chop stick.
Here's a close up of the top of the capsule.
The top two antennae are for the two GPS modules (one in the Flight Computer and the other in the Recovery Computer). The long thin antenna is for the GSM connection that's part of the Recovery Computer.
The other two parts are a small red straw and a large black straw. The small red straw is simply there to allow the pressure to equalize between the inside and the outside of the capsule. Since the pressure is very low in the stratosphere it would be dangerous to send the box up completely sealed.
The black straw is sealed at the end with hot glue and will be where the external temperature sensor is placed.
I've further insulated the box by lining the interior with sheets of space blanket. This reflects almost all the heat generated inside the box (by the electronics) and should help keep things warm.
This was very fiddly to do as the space blanket material is very thin. I cut sheets out using a stencil and glued them in place. Placing my hand in the box I can feel warmth: the reflected warmth of my own hand.
Finally, here's an interior shot of the lid of the capsule showing where the cables for the antennae and straws poke through.
2010-11-26
Notes on Kryptos Part 4
Copy of a message I sent to the Kryptos group on Yahoo! for anyone who's working on Kryptos but not in that group.
Given Elonka's notes mentioning that K4 uses a cipher system not known to anyone else I decided to investigate other possible ways of attacking K4. Specifically, I wondered if the BERLIN crib might not be as simple as NYPVTT turning letter for letter into BERLIN.
First I assume that this is something that's breakable by hand as was the rest of Kryptos and thus would simply be based on MOD 26 arithmetic of letters and might involve transposition of characters.
So I went to see if there's a word that could be permuted to create some permutation of BERLIN from NYPVTT. There is: it is SILENT
NYPVTT
ENTSIL
-----
RLINBE
More strikingly, this works if you slide SILENT along from the start of K4: it falls in just the right position to make BERLIN
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
SILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTS
GJVVHHPWRLHETAZPVYTJHJYKNYBTEGYSDWBMOBBWWJKAPOMSOIENXEMLTEJBFNMRLINBEQMYHSHKQDRFENPWAOVYUNSCPOPTJ
Now leading on from this I wonder if the cipher used for K4 consists of permutations of both the key and the ciphertext. Note how BERLIN is permuted within itself and so then I returned to the start of cipher text to see if there's a permutation of SILENT that results in a word (after permutation) starting in position 0. Once again there is:
OBKRUO
ILENTS
------
WMOENG
i.e. the word is WOMEN, assuming that the G is in the word after women. In this case ILENTS is a simple rotate of the word SILENT (just as ENTSIL followed by ENTSIL gives us BERLIN). There are likely other words as well, but this one is strikingly long.
Running through all six possible rotations of SILENT you get:
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
SILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTS
GJVVHHPWRLHETAZPVYTJHJYKNYBTEGYSDWBMOBBWWJKAPOMSOIENXEMLTEJBFNMRLINBEQMYHSHKQDRFENPWAOVYUNSCPOPTJ
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
ILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSI
WMOENGFZKUNDJDSYBXJMASEJDBUCKFOVWFHLEEUFCIADIXSRELXWDDCOMNPAVQFARHDEXZSXXVATWCHIXWVVQROHAMIFIXVSZ
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
LENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSIL
ZFXKMWISTAMTMWBEANMFJYDZGUDIJVROFLGBHXDLBYDWRDRHHEGCCTFHVTOQYJOGQXGXGFRNAOJZVSKBGCULTKXNZCLYRDUIC
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
ENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILE
SODJCZBBZZCWFFHDQQFOPXTCZDJHZYKXLKWEAGJKRBWFXCHKANMBSWYQBSETRSUFGAZGMEHQTXPYLVDKMBKOMTDMPFEHXCKLV
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
NTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILEN
BUCZFSKHYPFPOLGTTJOUONWVIJIXCRTDKAZXJMIAUUFLWSKDJTLRVPHWAIHMAYTVJTIMLUKJCDOOOOMQLRNHVZCCSYNNWSNEE
OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFBNYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR
TSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENTSILENT
HTSCYBQGOSYYUKWWMSUTEQPEOIYAVAZCADSGPLYDNDLKMVDMPSBUOYNVQLAVGXJYCCOLBXDSICERHXSPBUGQBYSFLHTMMVGNK
If you look you'll see various words popping out (in the ILENTS set, second row above, there's WOMEN at the beginning and closer to the end an anagram of WATCH).
Perhaps there's a method to choosing which shift of SILENT to use followed by some sort of transposition.
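For anyone who wants to reproduce the rows above, here's a small Python sketch. It assumes the operation is plain letter-by-letter addition MOD 26 with A=0 (which is what the worked examples above use); the function names are mine.

```python
from string import ascii_uppercase as ALPHABET

K4 = ("OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQSJQSSEKZZWATJKLUDIAWINFB"
      "NYPVTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR")


def add_key(ciphertext, key):
    """Add a repeating key to the text letter by letter, mod 26 (A=0)."""
    out = []
    for i, c in enumerate(ciphertext):
        k = key[i % len(key)]
        out.append(ALPHABET[(ALPHABET.index(c) + ALPHABET.index(k)) % 26])
    return "".join(out)


def rotations(word):
    """All rotations of a word: SILENT, ILENTS, LENTSI, ..."""
    return [word[i:] + word[:i] for i in range(len(word))]


# Print the key used and the resulting text for each rotation of SILENT.
for key in rotations("SILENT"):
    print(key, add_key(K4, key))
```

Swapping in a different candidate word for SILENT is a one-line change, which makes it easy to hunt for other keys that produce a permutation of BERLIN at the NYPVTT position.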
2010-11-24
A proper Dr.
I wrote in my bio for The Geek Atlas that (speaking about myself in the third person) "Because he has a doctorate in computer security he's deeply suspicious of people who insist on being called Dr.". I am very suspicious of people who shove their PhD in your face, or who insist on being called Dr. In fact, like a British surgeon, I would much prefer to be called Mr. (which I suppose is a form of snobbery) mostly because it's what I've done after my doctorate that I'm most proud of.
Which brings me to the case of "Dr." Gillian McKeith. I only became aware of her because of the wonderful Ben Goldacre who has taken her to task about her qualifications and claims.
In an old article Goldacre talks about McKeith's qualifications and her legal threats against people who criticize her. He makes the very good point that it's easy to validate real credentials (e.g. if you want to check that I've really got a DPhil from Oxford you just need to call them up or, unlike McKeith, you can read my thesis).
Another person with a PhD from an unaccredited institution is John Gray (who wrote Men are from Mars, Women are from Venus). He refers to himself as Dr John Gray or John Gray PhD.
A striking similarity between McKeith's and Gray's web sites is that both have special sections explaining their degrees. John Gray has a page on the subject and McKeith explains her degree in detail:
Gillian then spent several years re-training for a Masters and Doctorate (PhD) in Holistic Nutrition from the American Holistic College of Nutrition (USA).
[...]
To obtain a PhD in Holistic Nutrition from the College, it is a pre-requisite that a student must have a Masters Degree and then undertakes study through a number of preliminary courses and a core curriculum, including general nutrition, immune system health, detoxification, herbology, human anatomy, enzymatic nutritional therapy, vitamin and mineral studies, nutrients, relationship of diet and disease, geriatric nutrition and nutrition and mental health. Doctoral students also have to prepare an original and practical dissertation. Gillian studied and completed the PhD course and dissertation over a period of more than 4 years between 1993 and 1997.
The PhD in Holistic Nutrition is a doctorate programme that is approved by the (American) National Association of Nutrition Professionals (NANP), a non-profit organisation which maintains the integrity of the holistic nutrition profession by establishing educational standards, a rigorous code of ethics and registration of nutrition professionals.
Whenever I see these long explanations I can't help remembering the line: "The lady doth protest too much, methinks". Let me, for the record, do a special section describing my qualifications: "John Graham-Cumming, MA (Oxon), DPhil (Oxon)"
Which brings me to the question of what the proper title is for someone who has a PhD that isn't really a PhD (or at least a PhD from an unaccredited institution). Might I suggest Ph. in front of their name for 'Phony'?
2010-11-20
GAGA-1 Recovery Computer Ground Test
Today was the first live test of the GAGA-1 Recovery Computer and it, at least initially, didn't go well. The result is much improved, fully working, code in the repository. This is why I'm obsessed with actual testing of the components of GAGA-1.
First the good news: the module ran on 4 AA batteries for 9 hours without showing any problems caused by power. For over 3 hours the module was getting GPS location information and sending SMS messages. This is very reassuring.
Here's the module sitting on the kitchen table ready to go:
Here's the commit message for the code changes:
Significant changes based on live testing of the code on the Telit GM862-GPS module:
1. The module does not support floating point and so the gps2.get() function has been modified to only use integer arithmetic and thus not do conversion of latitude and longitude. This leaves most of the returned elements as strings. Except altitude which is needed for comparisons.
2. The main loop is wrapped in the try/except that will hide any inflight errors (although they will be logged to the log file). I hope this will never be called.
3. The timeout on SMS message sending has been increased from the default to 30 seconds because it can take a while to send the message and this was corrupting the return from the temperature command.
4. Improved handling of timestamps to make it clearer in the SMS messages when a message occurred.
5. Modified the upload script to delete the .pyo files for uploaded .py files so that the module will recompile and not use the previous version.
The floating point was a pain (and yes, it is documented in the Telit documentation). The SMS timeout was showing up in the log file as follows:
32945: Temperature read returned
+CMGS: 161
OK
#TEMPMEAS: 0,22
OK
The first part (+CMGS: 161) is left over from sending an SMS with the AT+CMGS command and meant that the read buffer hadn't been flushed. Changing the timeout fixes this. The good news here is that my defensive programming style worked well in keeping the module running under this error condition.
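As an illustration of the defensive style, here's a sketch of a parser that picks the temperature out of a buffer even when stale output precedes it. This is not the code in the repository: the function name is mine, and I'm assuming the second field of the #TEMPMEAS line is the temperature in Celsius.

```python
def parse_temperature(response):
    """Return the temperature from a response buffer, or None if absent.

    Stale lines (such as a leftover '+CMGS: 161') are skipped rather than
    being mistaken for the answer.
    """
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("#TEMPMEAS:"):
            fields = line.split(":", 1)[1].split(",")
            try:
                # Assumption: the second field is the temperature in Celsius.
                return int(fields[1])
            except (IndexError, ValueError):
                return None
    return None


# The buffer from the log above, stale +CMGS line included.
buffer = "+CMGS: 161\r\nOK\r\n#TEMPMEAS: 0,22\r\nOK\r\n"
print(parse_temperature(buffer))
```

Only integer parsing is used, in keeping with the module's lack of floating point.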
I ran the module for 208 minutes (3 hours, 28 minutes) sitting motionless in the garden reporting position via SMS every two minutes. Here's a graph showing reported altitude, number of satellites, internal temperature and speed. At the beginning I take the module out from the house (room was at 20C) into the garden (temperature was 7C) and at the end I bring it in again.
The gaps in the temperature record are where the problem in number 3 above occurred. The chart starts just as I put the module down in the garden; at the end you can see the number of satellites drop and the apparent speed increase as the module is brought indoors. The module seems to be running about 3C hotter than ambient temperatures.
Things to do on this part:
1. Shorten the leads on the two antennae and install in the capsule.
2. Run a three hour test in a moving car.
3. I am still very worried about the COCOM limit and am waiting for a response from Telit. In the worst case I'm going to add a watchdog to the code so that if there's been no GPS lock for an hour a complete reset of the GPS module is forced.
2010-11-18
I guess Hacker News doesn't do meta very well
(which is ironic for something written in a Lisp variant)
Recently, I've grown tired of stories about the TSA on Hacker News.
So I posted an item saying that I was taking a break (the title was "Au revoir Hacker News") with text saying that I was fed up with the TSA stories (and in particular the Ron Paul story in the top slot) and that I was going to take a temporary break. It ended saying I'd see everyone in the New Year once things had blown over.
It was at this link but it was nuked by someone. Not made [dead], simply expunged by a moderator.
Feels a bit uncalled for to me, I'd have been happy for the community to shoot me down.
(Actually it is [dead] so it was the community that killed it off)
I guess I'll be back in the New Year.
2010-11-12
The things make got right (and how to make it better)
make is much maligned because people mistake its terse syntax and pickiness about whitespace for signs of being an anachronism. But make's terseness is what makes make fit for purpose, and people who design 'improvements' rarely seem to understand the fundamental zen nature of make.
Here are some things make does well:
1. make's key use is in the expression of dependencies. make has a compact, syntactic cruft-free way of expressing a dependency between a file and other files.
2. Since make is so dependent on handling lists of dependencies it has built-in list processing functionality.
3. Second to dependency management is the need to execute shell commands. make's syntax for including dependencies in shell commands is small which prevents the eye from being distracted from the commands themselves.
4. make is a macro-language not a programming language. The state of a build is determined by the dependency structure and the 'up to dateness' of files. There's no (or little) need for any other internal state.
To see the ways in which make is superior to other similar, more modern, systems this post will compare GNU Make and Rake. I've chosen Rake because I believe it's illustrative of what happens when people create new make-like systems instead of just fixing the things that are broken about make.
Here's a simple Makefile showing the syntax used for updating a file (called target) from a list of dependent files by running a command called update.
target: prereq1 prereq2 prereq3 prereq4
	update $@ $^
(If you are unfamiliar with make then it's helpful to know that $@ is the name of the file to the left of the :, and $^ is the list of files to the right).
Here's the same thing expressed in Rake. The first thing that's obvious is that there's a lot of syntactic noise around the command and the expression of dependencies. What was clear in make now requires more digging to uncover and things like #{t.prerequisites.join(' ')} are long and unnecessarily ugly.
file 'target' => [ 'prereq1', 'prereq2', 'prereq3', 'prereq4' ] do |t|
  sh "update #{t.name} #{t.prerequisites.join(' ')}"
end
The biggest 'problem' that the Rake syntax fixes in make is that the target and prerequisite names can have spaces in them without difficulty. Because a make list is space-separated and there's no escaping mechanism for spaces it's a royal pain to work with paths with spaces in them.
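To see why, here's a tiny Python illustration (it is not make itself, just the same whitespace splitting that a space-separated list implies). The file names are made up.

```python
# Two files were intended: 'main.c' and 'my file.c' (hypothetical names).
prereqs = "main.c my file.c"

# Splitting on whitespace, as a space-separated list implies, gives three
# words, so the path containing a space is no longer one item.
words = prereqs.split()
print(words)
```

Rake sidesteps this because each name is its own quoted string in an array.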
make's terse syntax $@ is replaced by #{t.name} and $^ is #{t.prerequisites.join(' ')}. The great advantage of the terse syntax is that the actual command being executed can be clearly seen. When the command lines are long (with many options) this makes a real difference in debug-ability.
That this terseness is better can be seen in an example taken from the Rake documentation:
task :default => ["hello"]
SRC = FileList['*.c']
OBJ = SRC.ext('o')
rule '.o' => '.c' do |t|
  sh "cc -c -o #{t.name} #{t.source}"
end
file "hello" => OBJ do
  sh "cc -o hello #{OBJ}"
end
# File dependencies go here ...
file 'main.o' => ['main.c', 'greet.h']
file 'greet.o' => ['greet.c']
which rewritten in make syntax is:
SRC := $(wildcard *.c)
OBJ := $(SRC:.c=.o)
all: hello
.c.o:
	cc -c -o $@ $<
hello: $(OBJ)
	cc -o hello $(OBJ)
main.o: main.c greet.h
greet.o: greet.c
If you want to fix make then it's worth considering the following make problems that don't require an entirely new language:
1. Fix the 'spaces in filenames' problem. Not hard, just needs consistent escaping or quoting.
2. make has a concept of a PHONY target which is a target that isn't a file (used for things like clean and all). These are in the same namespace as file targets. This should be fixed.
3. make can't detect changes in the commands used to build targets. It would be better if make could do this. You can hack that into make but it's ugly.
4. make relies on timestamps for 'up to date' information. It would be better if make used hashes (in some situations, such as when files are extracted from a source code management system, timestamps can be unreliable). This can also be hacked into make if needed.
5. Ensure that non-recursive make is handled in an efficient manner.
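To make points 3 and 4 concrete, here's a minimal Python sketch of an 'up to date' check based on content hashes and the build command instead of timestamps. It isn't how make works and it isn't a proposal for a specific implementation; the function names and the per-target state file are assumptions of mine.

```python
import hashlib
import json
import os


def file_hash(path):
    """SHA-256 of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def build_signature(prereqs, command):
    """What a build depended on: the command and each prerequisite's hash."""
    return {"command": command,
            "hashes": {p: file_hash(p) for p in prereqs}}


def needs_rebuild(target, prereqs, command, state_file):
    """True if the target is missing, or the prerequisites' contents or the
    build command differ from what was recorded at the last build."""
    if not os.path.exists(target) or not os.path.exists(state_file):
        return True
    with open(state_file) as f:
        return json.load(f) != build_signature(prereqs, command)


def record_build(prereqs, command, state_file):
    """Save the signature after a successful build."""
    with open(state_file, "w") as f:
        json.dump(build_signature(prereqs, command), f)
```

Touching a prerequisite without changing its contents no longer triggers a rebuild, while changing a compiler flag in the command does.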
Overall I'd urge make reimplementers to do as Paul Graham has done with LISP: his arc language is very LISP-like rather than something brand new.
And one final note: building and maintaining software build systems is inherently hard. Visualizing and getting right the graph of dependencies and handling cross-platform problems isn't easy. If you do come up with something good, please write good documentation for it.
2010-11-05
GAGA-1 Recovery Computer
Finally, got some time to work on the GAGA-1 Recovery Computer that uses a combination of a GPS and a GSM module to send position updates via SMS to a cell phone. The complete code is now in the repository in the gaga-1/recovery/ folder.
The recovery computer itself is a Telit GM862-GPS module mounted on a board that supplies power from four AA batteries. It has two external antennas: one for GPS and one for GSM access. Here's a shot of the computer before installation in the capsule (clearly the cables are going to have to be shortened and the power supply cleaned up before the real flight). The GPS antenna is square and the GSM is the long thin bar.
The GM862-GPS has an integrated Python interpreter so the control software is a set of Python modules that handle getting GPS information (and sundry information like temperature and voltage) and sending SMS messages at appropriate times. Here's the key piece of code for the recovery computer:
# The recovery computer runs through a sequence of simple states that
# determine its behaviour. It starts in the Launch state, transitions
# to the Ascent mode once above a preset altitude, then moves to
# Flight mode once too high for safe SMS usage. Once below the safe
# altitude it transitions to Recovery mode.

state = ''
set_state( 'LAUNCH' )

sms.init()
gps2.init()

# The rules for the states are as follows:
#
# Launch: get GPS position every 1 minute and SMS, check for
# transition to Ascent mode
#
# Ascent: get GPS position every 2 minutes and SMS, check for
# transition to Flight mode
#
# Flight: get GPS position every 5 minutes and check for
# transition to Recovery mode
#
# Recovery: get GPS position every 1 minute and SMS

while 1:
    position = gps2.get()

    if state == 'LAUNCH':
        report_position(position)
        if ( position['valid'] and
             ( position['altitude'] > ascent_altitude ) ):
            set_state( 'ASCENT' )
    elif state == 'ASCENT':
        report_position(position)
        if ( position['valid'] and
             ( position['altitude'] > flight_altitude ) ):
            set_state( 'FLIGHT' )
    elif state == 'FLIGHT':
        if ( position['valid'] and
             ( position['altitude'] < recovery_altitude ) ):
            set_state( 'RECOVERY' )
    elif state == 'RECOVERY':
        report_position(position)

    # delay is the number of minutes to wait before the next reading
    if state == 'LAUNCH' or state == 'RECOVERY':
        delay = 1
    elif state == 'ASCENT':
        delay = 2
    else:
        delay = 5

    # MOD.sleep() takes tenths of a second, so 600 is one minute
    MOD.sleep(delay * 600)
That code can be found in gaga-1.py which is the main module executed automatically by the GM862-GPS. The other important modules are logger.py (logs to the serial port for debugging and to a file in the NVRAM on the GM862-GPS), at.py (simple wrapper for AT command access on the module), sms.py (module for sending SMS messages) and gps2.py (module to get GPS location).
There's a small Makefile that controls building and uploading of the code to the module (upload is achieved using the upload.pl helper program). The main commands are make all to build the code into compiled Python files, make upload to upload to the GM862-GPS and make test to run a flight simulation.
To test the code I've written modules that pretend to be the Telit Python modules (MDM, MOD, GPS, SER, etc.) and respond realistically to API calls and AT commands from my code. Within these modules I've programmed a simulated flight (an ascent, albeit a fast one, followed by descent) and random appearance of errors coming from the module (such as no GPS fix, no GSM access and other errors).
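To give a flavour of the idea, here's a minimal sketch of a stand-in GPS that plays back an ascent followed by a descent and randomly loses its fix. It is not the actual test code from the repository: the FakeGPS class and its parameters are hypothetical, and only the 'valid' and 'altitude' keys are taken from the main loop above.

```python
import random


class FakeGPS:
    """Hypothetical stand-in for the GPS: plays back a simulated flight."""

    def __init__(self, peak=30000.0, step=5000.0, failure_rate=0.1, seed=None):
        self.altitude = 0.0               # metres above the launch site
        self.peak = peak                  # altitude at which the balloon bursts
        self.step = step                  # metres climbed or lost per reading
        self.failure_rate = failure_rate  # chance that a reading has no fix
        self.climbing = True
        self.rng = random.Random(seed)    # seedable so test runs can repeat

    def get(self):
        """Advance the flight by one step and return a position dictionary."""
        if self.climbing:
            self.altitude = min(self.altitude + self.step, self.peak)
            if self.altitude >= self.peak:
                self.climbing = False     # burst: start the descent
        else:
            self.altitude = max(self.altitude - self.step, 0.0)

        # Randomly simulate the loss of the GPS fix
        if self.rng.random() < self.failure_rate:
            return {'valid': False}
        return {'valid': True, 'altitude': self.altitude}
```

Because the state machine only looks at position['valid'] and position['altitude'], an object like this can be swapped in for the real module to exercise every state transition on the desktop.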
Here's a log of a simulated flight. You can see times when failures occurred (loss of GPS, can't send SMS: those lines are in red). I've highlighted the altitude in blue for easy reading.
$ make test
342295541: sms.send(+447...,"Transition to state LAUNCH")
342295541: gps2.get() -> 180541.000,5238.7818N,00211.1238W,
1.2,13.00,3,138.00,0.00,0.00,051110,02
342295541: sms.send(+447...,"52.6464N 2.1854W 13.00m 138.00deg
0.00kph 2sats (1711mV, 28C)")
342295601: gps2.get() -> 180641.000,5238.2410N,00211.3171W,
1.2,2053.03,3,114.00,45.00,24.30,051110,05
342295601: sms.send(+447...,"52.6373N 2.1886W 2053.03m 114.00deg
45.00kph 5sats (1919mV, -51C)")
342295601: sms.send(+447...,"Transition to state ASCENT")
342295721: gps2.get() -> 180841.000,5238.1755N,00211.5866W,
1.2,7093.11,3,241.00,38.00,20.52,051110,03
342295721: sms.send(+447...,"52.6363N 2.1931W 7093.11m 241.00deg
38.00kph 3sats (535mV, -11C)")
342295721: Failed to get SMS prompt
342295721: sms.send(+447...,"Transition to state FLIGHT")
342295721: Failed to get SMS prompt
342296021: gps2.get() -> 181341.000,,,,,0,,,,051110,00
342296321: gps2.get() -> 181841.000,5238.5639N,00211.2198W,
1.2,37093.23,3,46.00,33.00,17.82,051110,06
342296621: gps2.get() -> 182341.000,5238.7426N,00211.8475W,
1.2,48193.23,3,94.00,43.00,23.22,051110,01
342296921: gps2.get() -> 182841.000,5238.8810N,00211.9387W,
1.2,34693.21,3,355.00,5.00,2.70,051110,06
342297221: gps2.get() -> 183341.000,5238.1542N,00211.6911W,
1.2,20293.21,3,26.00,28.00,15.12,051110,04
342297522: gps2.get() -> 183842.000,5238.2393N,00211.8530W,
1.2,8262.62,3,54.00,24.00,12.96,051110,02
342297822: gps2.get() -> 184342.000,5238.1079N,00211.0661W,
1.2,12.00,3,3.00,0.00,0.00,051110,08
342297822: sms.send(+447...,"Transition to state RECOVERY")
342297882: gps2.get() -> 184442.000,5238.6368N,00211.9774W,
1.2,12.00,3,77.00,0.00,0.00,051110,06
342297882: sms.send(+447...,"52.6439N 2.1996W 12.00m 77.00deg
0.00kph 6sats (1444mV, -2C)")
342297942: gps2.get() -> 184542.000,5238.6790N,00211.3624W,
1.2,18.00,3,63.00,0.00,0.00,051110,03
342297942: sms.send(+447...,"52.6446N 2.1894W 18.00m 63.00deg
0.00kph 3sats (1246mV, -22C)")
342298002: gps2.get() -> 184642.000,5238.4941N,00211.8801W,
1.2,17.00,3,256.00,0.00,0.00,051110,01
342298002: sms.send(+447...,"52.6416N 2.1980W 17.00m 256.00deg
0.00kph 1sats (3095mV, -51C)")
342298062: gps2.get() -> 184742.000,,,,,2,,,,051110,00
342298062: sms.send(+447...,"No GPS lock (1045mV, 32C)")
342298122: gps2.get() -> 184842.000,5238.9542N,00211.9596W,
1.2,11.00,3,21.00,0.00,0.00,051110,05
342298122: sms.send(+447...,"52.6492N 2.1993W 11.00m 21.00deg
0.00kph 5sats (2742mV, 48C)")
342298182: gps2.get() -> 184942.000,5238.9607N,00211.1014W,
1.2,14.00,3,167.00,0.00,0.00,051110,08
342298182: sms.send(+447...,"52.6493N 2.1850W 14.00m 167.00deg
0.00kph 8sats (819mV, 6C)")
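The position strings in the log are in degrees and decimal minutes, while the SMS messages use decimal degrees. Here's a minimal sketch of that conversion. The field order is inferred by matching the log lines against the SMS text, not taken from gps2.py, and the function names and dictionary keys (other than 'valid' and 'altitude') are made up for this example.

```python
def to_decimal_degrees(field):
    """Convert 'ddmm.mmmmH' or 'dddmm.mmmmH' to (decimal degrees, hemisphere)."""
    hemisphere = field[-1]
    value = field[:-1]
    point = value.index('.')
    degrees = int(value[:point - 2])    # everything before the minutes
    minutes = float(value[point - 2:])  # two digits of minutes plus fraction
    return degrees + minutes / 60.0, hemisphere


def parse_position(line):
    """Split one comma-separated position line into a dictionary."""
    fields = line.split(',')
    if fields[1] == '' or fields[2] == '':
        return {'valid': False}         # no fix: the position fields are empty
    latitude, north_south = to_decimal_degrees(fields[1])
    longitude, east_west = to_decimal_degrees(fields[2])
    return {'valid': True,
            'latitude': latitude, 'north_south': north_south,
            'longitude': longitude, 'east_west': east_west,
            'altitude': float(fields[4]),   # metres
            'course': float(fields[6]),     # degrees
            'speed': float(fields[7]),      # km/h
            'satellites': int(fields[10])}


def format_position(position):
    """Build the position part of the SMS text seen in the log."""
    if not position['valid']:
        return 'No GPS lock'
    return '%.4f%s %.4f%s %.2fm %.2fdeg %.2fkph %dsats' % (
        position['latitude'], position['north_south'],
        position['longitude'], position['east_west'],
        position['altitude'], position['course'],
        position['speed'], position['satellites'])
```

For the first reading in the log, 5238.7818N is 52 degrees plus 38.7818/60 degrees, which is where the 52.6464N in the SMS comes from.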
There are a few remaining items:
1. Run a complete, real test of the module using fresh batteries in a moving car and ensure that it correctly logs information and sends SMS. Also, see how long it lasts.
2. Get an answer from Telit on the COCOM limits so that I understand how the GPS fails above the 18km altitude line.
3. Cut down the cables and install in the capsule.
Then it'll be on to the flight computer.
2010-11-04
The most common objection to my 'releasing scientific code' post
Is...
"And why dismiss so casually the argument that running the code used to generate a paper's result provides no actual independent verification of that result? How does running the same buggy code and getting the same buggy result help anyone?"
Or as expressed at RealClimate:
"First, the practical scientific issues. Consider, for example, the production of key observational climate data sets. While replicability is a vital component of the enterprise, this is not the same thing as simply repetition. It is independent replication that counts far more towards acceptance of a result than merely demonstrating that given the same assumptions, the same input, and the same code, somebody can get the same result. It is far better to have two independent ice core isotope records from Summit in Greenland than it is to see the code used in the mass spectrometer in one of them. Similarly, it is better to have two (or three or four) independent analyses of the surface temperature station data showing essentially the same global trends than it is to see the code for one of them. Better that an ocean sediment core corroborates a cave record than looking at the code that produced the age model. Our point is not that the code is not useful, but that this level of replication is not particularly relevant to the observational sciences."
This argument strikes me as bogus. It comes down to something like "we should protect other scientists from themselves by not giving them code that they might run; by not releasing code we are ensuring that the scientific method is followed".
Imagine the situation where a scientist runs someone else's code on the data that person released and gets the same result. Clearly, they have done no science. All they have done is the simplest verification that the original scientist didn't screw up in their methods. That person has not used the scientific method, they have not independently verified the results and their work is close to useless.
Is this enough to argue that the code should have been closed in the first place?
I can't see that it is. No one's going to be able to publish a paper saying "I ran X's code and it works"; it would never get through peer review and isn't scientific.
To return to the first quote above, running someone else's buggy code proves nothing. But in hiding the buggy code you've lost the valuable situation where someone can verify that the code was good in the first place. Just look at the effort I went to to discover the code error in CRUTEM (which, ironically, is one of the 'key observational climate data sets' to use RealClimate's words).
The argument from RealClimate can also be stated as 'running someone else's code isn't helpful so there's no point releasing it'. (see comments below to understand why this is struck out) The premise is reasonable, the conclusion not. I say that because there are other reasons to release code:
1. It can be used by others for other work. For example, good code can form part of a library of code that is used to improve or speed up science.
2. The algorithm in a paper can be quickly checked against the implementation to ensure that the results being generated are correct. For example, the CRUTEM error I found could have been quickly eliminated by access to the paper and source code at the same time.
3. Releasing code has a psychological effect which will improve its quality. This will lead to fewer errors on the part of scientists who rely on computer methods.