From jasonic@nomadicsltd.com Wed Nov 1 15:58:37 2000 From: jasonic@nomadicsltd.com (Jason Cunliffe) Date: Wed, 1 Nov 2000 10:58:37 -0500 Subject: [Edu-sig] UNext ? Message-ID: <002e01c0441c$98c208c0$c3090740@megapathdsl.net> You might be all interested to see what Don Norman [ http://www.jnd.org/ ] has been working on: http://www.unext.com/ Welcome to UNext.com, the Internet education company. Our mission is to provide transforming, life-enhancing educational opportunities to people around the world. Thriving in today's rapidly changing, knowledge-driven economy requires continual growth in knowledge and skills, or as we call it, HUMAN CAPITAL. Through the power of the Internet, we are delivering world-class knowledge to people everywhere. Where do you suppose CP4E fits into this picture? - Jason ___________________________________________________________ Jason CUNLIFFE = NOMADICS['Interactive Art and Technology'] From pdx4d@teleport.com Fri Nov 10 18:09:43 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Fri, 10 Nov 2000 20:09:43 +0200 Subject: [Edu-sig] Cryptonomicon Message-ID: <3.0.3.32.20001110200943.00cfa160@pop.teleport.com> I haven't posted in awhile, having my hands full doing heavy duty logistics in South Africa and the Kingdom of Lesotho. My reading of late (just finished) was Neal Stephenson's Cryptonomicon, a science fiction novel (and NYT best- seller) which takes me back to my youth in the Philippine islands (where I went to high school -- and learned to scuba dive from exMarine Gill Gilleland), and to other places. I highly recommend the book. Why this is relevant to edu-sig is there's a lot to write about cryptography, using the Python language. I've been meaning to explore this more, jumping off from my http://www.inetarena.com/~pdx4d/ocn/numeracy2.html wherein I explore prime numbers, and come up against the difficulty of finding the factors of very large numbers. 
If you take a huge number that factors uniquely into two primes, you could think of one as the private key, the other as the public key -- something like that. I'm separated from my eBook on the Standard Library at the moment, but as I recall, Python already ships with at least one crypto algorithm. But there's a lot more we could be doing, to build bridges between the "math through programming" initiative (which I've been spearheading)[1], and the crypto world. Anyway, I invite people to consider a role for Python at this juncture -- could be a great place to do some original and useful curriculum writing, IMO (think elementary school for example -- no need to start in the densest areas -- and do some simple substitution codes (plus there's always using algorithms to try cracking such simple codes (then moving on to the next level)). Kirby from the Kingdom of Lesotho (574 Hoo Hloo, Maseru) [1] http://www.oreillynet.com/pub/a/python/2000/10/04/pythonnews.html PS: another link to Cryptonomicon is the "math teacher as storyteller" thread -- thinking of math teaching as an opportunity to tell stories with math content inter- weaved. Stephenson's book contains many examples of this. From dustin@cs.uchicago.edu Fri Nov 10 19:56:32 2000 From: dustin@cs.uchicago.edu (Dustin Mitchell) Date: Fri, 10 Nov 2000 13:56:32 -0600 (CST) Subject: [Edu-sig] Cryptonomicon In-Reply-To: <3.0.3.32.20001110200943.00cfa160@pop.teleport.com> Message-ID: On Fri, 10 Nov 2000, Kirby Urner wrote: > Anyway, I invite people to consider a role for Python > at this juncture -- could be a great place to do some > original and useful curriculum writing, IMO (think > elementary school for example -- no need to start in > the densest areas -- and do some simple substitution > codes (plus there's always using algorithms to try > cracking such simple codes (then moving on to the > next level)). I think this is an excellent idea. 
Crypto is hard to learn because (a) proving that it's hard to crack requires complex formal mathematical constructions, and (b) any real crypto uses numbers too big to write down, let alone think about. A clear implementation of various crypto algorithms in Python would be a great thing to have around. If we can take its bit-length down to, oh, 9, then the numbers are numbers high-schoolers can crunch in a class period with a calculator, but the computer can take care of a lot of the calculations automatically. There are some *excellent* opportunities for assignments in there, too. And they're virtually impossible to cheat on :) Dustin --------------------------------------------------------------------- | Connection - in an isolating age )O( | --------------------------------------------------------------------- From pdx4d@teleport.com Sat Nov 11 07:36:20 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Sat, 11 Nov 2000 09:36:20 +0200 Subject: [Edu-sig] Cryptonomicon In-Reply-To: References: <3.0.3.32.20001110200943.00cfa160@pop.teleport.com> Message-ID: <3.0.3.32.20001111093620.00cff8dc@pop.teleport.com> >I think this is an excellent idea. Crypto is hard to learn because (a) >proving that it's hard to crack requires complex formal mathematical >constructions, and (b) any real crypto uses numbers too big to write down, >let alone think about. Re big numbers: I think it really open things up for kids that we have big integers easily available now. Even though we're doing all the same ops, there's a psychological bridge that gets crossed when the numbers start taking a couple paragraphs or even pages to write down. For one thing, you really start to appreciate that we have computing machinery, and that opens up the space for a discussion of how that came to be (another thread in Cryptonomicon -- because of course Turing was involved in the war effort to break German and Japanese codes, and it was this push which led to the faster evolution of digital computing). 
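[Editorial sketch] Dustin's suggestion of shrinking the bit-length to about 9 can be tried directly with Python's built-in big integers. The thread does not name an algorithm, so RSA is assumed here, and the primes, exponent and message are illustrative choices rather than anything from the original posts. The sketch needs Python 3.8 or later for the modular inverse via pow().

```python
# Toy RSA with a 9-bit modulus. Assumption: RSA is the scheme being discussed;
# p, q, e and the message are illustrative values, not from the thread.
from math import gcd

p, q = 17, 23            # two small primes
n = p * q                # public modulus: 391, which fits in 9 bits
phi = (p - 1) * (q - 1)  # 352
e = 3                    # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi (Python 3.8+)

def encrypt(m):
    """Encrypt an integer 0 <= m < n with the public key (e, n)."""
    return pow(m, e, n)

def decrypt(c):
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)

message = 42
cipher = encrypt(message)
print(n.bit_length(), d, cipher, decrypt(cipher))
```

Numbers this small can be checked by hand or with a calculator, which is the classroom point; they offer no security at all.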
Java also has a BigInteger class, and an op for spitting out large numbers with a percentage chance that said numbers are prime (you approach a probability of 1 depending on how computationally intensive you want to get -- I haven't studied the source for this, am curious what algorithms are used (would like to see them re-expressed in Python, just for reference)). Probably the Java methods are inherited from some well-known C library and the code is in some numerical recipes book I haven't studied yet. >A clear implementation of various crypto algorithms in Python would be a >great thing to have around. If we can take its bit-length down to, oh, >9, then the numbers are numbers high-schoolers can crunch in a class >period with a calculator, but the computer can take care of a lot of the >calculations automatically. Plus just consider those really simple substitution codes where you make A = J, B = Z -- whatever random assignment. Also, you want to parse your messages into 5 letter chunks: ALSOY OUWAN TTOPA RSEYO URMES SAGES INTO5 LETTE RCHUN KS. You find these simple "club house" codes in books for little kids, with titles like 'I, Spy' and stuff. Of course this is nothing sophisticated, but it's a great opportunity to use Python string ops to: make the 5 letter uppercase chunks out of whatever plaintext input (could be interactive at first, then convert to file i/o); use the dictionary data structure to do the lookup/substitution. >There are some *excellent* opportunities for assignments in there, >too. And they're virtually impossible to cheat on :) > >Dustin > The basic strategy here is to make Python a fun "toy" (in the sense that kids will want to pick it up and play with it spontaneously, not because some teacher is standing over them with a whip). The way to do this is just provide enough interactive experience to suggest a vocabulary of "tinker toys" (components), which kids will then use in synergetic ways to assemble whatever.
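[Editorial sketch] The five-letter chunking mentioned above takes only a few lines of string handling. This is a minimal Python 3 sketch; the function name chunk is hypothetical and is not taken from any code posted in the thread.

```python
# Hypothetical helper, not from the thread: uppercase the text, keep only
# letters and digits, then group the result into fixed-size blocks.
def chunk(text, size=5):
    cleaned = "".join(ch for ch in text.upper() if ch.isalnum())
    blocks = [cleaned[i:i + size] for i in range(0, len(cleaned), size)]
    return " ".join(blocks)

print(chunk("Also, you want to parse your messages into 5 letter chunks"))
# -> ALSOY OUWAN TTOPA RSEYO URMES SAGES INTO5 LETTE RCHUN KS
```

The last block may be shorter than five characters; padding it with filler letters would be a natural follow-up exercise.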
What we want to avoid is the intimidating sense that you're not allowed to start small. Too many educational experiences include some older kids (e.g. teachers) suggesting that your work is nothing special, not interesting, because you don't know "all" there is to know about it. Crypto is a case in point. Regarding crypto, there are lots of segues to random numbers (the Standard Library book re Python keeps warning us that the randoms are "pseudo" and hence potentially an Achilles heel if you're using them as a basis for some encryption algorithm). This whole concept of random numbers generated by machines is especially rich, philosophically. We have to be clear what we mean by "random" -- what are the tests such numbers must pass? Knuth has done a lot of work in this area, and computer science has gotten a big boost from this thread -- which thinking we can recapitulate in curriculum writing geared for those relatively new to the subject. I get into random numbers some in my "Random walks through the matrix" piece, wherein a turtle swallows an n-sided die and then hops in a spatial lattice defined by n degrees of freedom at each turn to play (the so-called isotropic vector matrix features 12 spokes from every hub, to the surrounding closest packed spheres at the corners of a cuboctahedron). Kirby From gherman@darwin.in-berlin.de Sat Nov 11 12:01:43 2000 From: gherman@darwin.in-berlin.de (Dinu C. Gherman) Date: Sat, 11 Nov 2000 13:01:43 +0100 Subject: [Edu-sig] Cryptonomicon References: Message-ID: <3A0D3527.6553B050@darwin.in-berlin.de> Dustin Mitchell wrote: > > A clear implementation of various crypto algorithms in Python would be a > great thing to have around. If we can take its bit-length down to, oh, > 9, then the numbers are numbers high-schoolers can crunch in a class > period with a calculator, but the computer can take care of a lot of the > calculations automatically. I thought the very same thing maybe a year ago, when I implemented MD5 in pure Python.
Unfortunately, I had many other things to do right after that, so I discontinued that. I'd be happy to send my code to anybody who wants to spend a tiny bit of time doing just a bit of debugging in order to make it run correctly for input strings longer than 2**32 (or so). Regards, Dinu -- Dinu C. Gherman ................................................................ "The only possible values [for quality] are 'excellent' and 'in- sanely excellent', depending on whether lives are at stake or not. Otherwise you don't enjoy your work, you don't work well, and the project goes down the drain." (Kent Beck, "Extreme Programming Explained") From jhrsn@pitt.edu Wed Nov 15 22:37:38 2000 From: jhrsn@pitt.edu (Jim Harrison) Date: Wed, 15 Nov 2000 17:37:38 -0500 Subject: [Edu-sig] Programming for non-programmer IT professionals (in healthcare) Message-ID: Hello- I've been lurking on this list for some time and I appreciate the interesting discussions that have occurred over the past months. Much of this discussion has focused on the application of Python to teaching programming at the primary and second school levels, with a bit of introductory programming at the college level thrown in. This is near and dear to my heart--and I'm cheering you on--because I currently have an 8th grader with an interest in programming and I would love to see Python take hold in the schools. There is another application for Python in education that I believe also holds a lot of promise. Currently, a number of graduate programs across the country train individuals for careers in information technology which are not primarily oriented to programming. For example, we have a training program in medical informatics at the University of Pittsburgh School of Medicine. Within this program, we offer MS degrees (2 years), PhD degrees (4 years+) and postdoctoral fellowships (1 or 2 years after an MD or PhD). 
Some of our students are computer-science-trained, but many are physicians or other healthcare workers who have a strong interest in computing and varied computing backgrounds. Some of them go into research and development in which programming plays a central role, but many others enter IT administration and other areas such as standardized vocabulary development or information resource management where they do not themselves carry out code development. For these latter individuals, it is important that they understand certain concepts related to coding such as control flow, the general meaning of object-oriented and modular design, issues related to data parsing, expression of data and program structure with UML, communication between systems, programming for application servers, etc. They also need to be able to discuss program design issues intelligently with programmers. They do not, however, need to program in medium-to-low level languages. There is a programming requirement in the curriculum, which is typically satisfied by taking one semester of introductory C in the computer science curriculum (I think this approach is common in similar settings across the country). I believe this is a disservice to some of these students, firstly because it convinces them that they don't want to program, secondly because few of the higher level topics I mentioned above are addressed at that level of C training, and thirdly because this brief exposure to C does not leave them with useful tools for their personal work. In response to this, I'm beginning to develop a programming course for medical informatics based on Python which will be offered to appropriate students in our program beginning in the fall of 2001.
As I conceive it now, it will consist of about a month of getting up to speed in the basics of the language, followed by a series of problems in medical information processing that extend the understanding of Python and also allow us to explore important high-level issues. I'll carry out most of the curriculum development this winter and it should be pretty well set by spring. I think the use of Python in this way would also be an advantage in similar educational settings outside of medical informatics. The initial month of the course I plan should be fairly generic and might be useful to others (or others might inform me of useful approaches--I plan on looking at Jeff Elkner's Python version of "How to think..." in some detail). If any list members are interested in this type of teaching, or in the application of Python in medical settings, I'd be happy to correspond with them and perhaps share resources. Jim Harrison ________________________________________________________________________ James H. Harrison, Jr., MD, PhD Associate Director of Pathology Informatics, Department of Pathology Faculty Member in Residence, Center for Biomedical Informatics University of Pittsburgh Suite 8084 Forbes Tower Pittsburgh, PA 15213-2582 jhrsn@pitt.edu | voice: 412-647-7113 | fax: 412-647-7190 "If you want sense, you'll have to make it yourself!!"-Norton Juster ________________________________________________________________________ From pdx4d@teleport.com Fri Nov 17 13:18:41 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Fri, 17 Nov 2000 05:18:41 -0800 Subject: [Edu-sig] Programming for non-programmer IT professionals (in healthcare) In-Reply-To: Message-ID: <3.0.3.32.20001117051841.00a7d950@pop.teleport.com> >I think the use of Python in this way would also be an advantage in similar >educational settings outside of medical informatics. 
The initial month of >the course I plan should be fairly generic and might be useful to others (or >others might inform me of useful approaches--I plan on looking at Jeff >Elkner's Python version of "How to think..." in some detail). If any list >members are interested in this type of teaching, or in the application of >Python in medical settings, I'd be happy to correspond with them and perhaps >share resources. > >Jim Harrison Jim -- I'd like to keep in touch with you on this. I've done a lot of database design and programming around outcomes research, collecting data from cath labs and cardiac operating rooms in particular (but work in other areas as well, including group health insurance and data mining in non-cardiac clinical data). Mostly I've found the concepts of relational database management to be front and center, with OOP and RDBMS having many ways of fitting together. Visual FoxPro (VFP), easier and higher level than C++, and focused around data tables, has been one of my bread and butter languages in this regard. Like Python, VFP features an interpreted command line environment -- a big plus when it comes to learning a language (because you can test things quickly). Before I learned Python, I was trying to do my curriculum writing around VFP (instead of the more used VB). But of course Python has many advantages, not the least of which is the open, sharing, royalty-free space it comes in -- and also its non-Windows transportability (to UNIX and Linux in particular). I've mostly worked in data collection and data warehousing, while end-users of the data are more likely to be involved in statistical work, e.g. using SPSS. Python could provide more transparency in this realm, especially if we add some ODBC and SQL to the picture (Python add-ons), plus maybe some Numeric Python. The idea here would be to access some large data tables and do some analysis on them. 
Brainstorming around the idea of an electronic medical record defined using OO principles might be another focus, combined with Python's evolving powers in the realm of XML. The idea here is to open the world of structured information, and around something so complicated as medical histories. How do you structure the relationship between patients, procedures, outcomes? Plus there's the whole financial side (where clinical info sometimes gets mixed in, i.e. ICD9 codes + specific medications). Of course issues of privacy and data encryption might enter at this juncture. I would agree with you that a year of C could give a more warped and less useful approach to medical informatics in that you'd probably get less direct hands-on experience of direct relevance to your field by the end of this year. A curriculum mixing pre-written modules with exercises requiring student programming (in Python) might easily go deeper into the subject area (medical informatics), leaving the intricacies of C programming to a subset of students, those choosing to specialize in this direction. What's important is, as you've indicated in your post, a kind of fluency, an ability to think like a programmer in _some_ language. Python is an excellent language for developing this fluency, gaining this style of thought. From here, it's a hop to Java, and then maybe to C++. But even without jumping to C++, Python already gives you entree into many conversations and experiences of direct relevance to medical informatics. Kirby From delza@antarcti.ca Fri Nov 17 17:47:56 2000 From: delza@antarcti.ca (Dethe Elza) Date: Fri, 17 Nov 2000 09:47:56 -0800 Subject: [Edu-sig] Re: Programming for non-programmer IT professionals References: <20001117170229.EA3E11D2DD@dinsdale.python.org> Message-ID: <004501c050be$84ece5a0$2701010a@dev.antarcti.ca> Kirby & Jim, When analyzing data like you're talking about, you might want to check out Martin Fowler's "Analysis Patterns : Reusable Object Models."
Many of his examples are drawn from health care, and there's a lot of good stuff. ISBN 0201895420. --Dethe Dethe Elza Antarcti.ca Client Lead http://antarcti.ca http://map.net From pdx4d@teleport.com Fri Nov 17 18:30:48 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Fri, 17 Nov 2000 10:30:48 -0800 Subject: [Edu-sig] Re: Programming for non-programmer IT professionals In-Reply-To: <004501c050be$84ece5a0$2701010a@dev.antarcti.ca> References: <20001117170229.EA3E11D2DD@dinsdale.python.org> Message-ID: <3.0.3.32.20001117103048.008875b0@pop.teleport.com> Thanks for the tip. Here's the Amazon URL: http://www.amazon.com/exec/obidos/ASIN/0201895420/ It'd be fun to swap course outlines or lesson plans showing how Python might integrate more tightly into medical informatics. In my own work, I'm exploring the potential links between HL7 and XML. How could we describe our patient-encounter-procedure-subprocedure model in XML, or would that be the best way to go? Having a patient-centric view, with both linear/temporal and current data is one way we want to present info. An aggregate/statistical view, with patients sorted by demographic attributes (for example) is another. These are the generic sorts of problems that lots and lots of people are working on, in many different ways. Links: http://puck.informatik.med.uni-giessen.de/people/messaritakis/hl7xml/ http://www.infoloom.com/gcaconfs/WEB/philadelphia99/alschuler.HTM http://www.info.dsdc.dla.mil/Partners/HL7.html http://medicine.ucsd.edu/f99/D005484.htm I haven't done anything with Python in this area (yet) and would be interested in seeing what others might be doing along these lines. Curriculum writing which spells out aspects of HL7-XML integration, using Python as a teaching language, would I think be a useful component in medical informatics (using Python Standard Library XML features, plus 3rd party enhancements).
One idea here is to anticipate the new and existing component architectures wherein Python modules will be able to participate in harmony with other components written in other languages (COM, .NET etc -- I gather from reading the Python stuff that there's quite a bit of interest in getting .NET and Python working together). See: http://www.microsoft.com/net/default.asp for more on the .NET thing. Kirby At 09:47 AM 11/17/2000 -0800, Dethe Elza wrote: >Kirby & Jim, > >When analyzing data like you're talking about, you might want to check out >Martin Fowler's "Analysis Patterns : Reusable Object Models." Many of his >examples are drawn from health care, and there's a lot of good stuff. ISBN >0201895420. > >--Dethe > >Dethe Elza >Antarcti.ca Client Lead >http://antarcti.ca >http://map.net From jhrsn@pitt.edu Fri Nov 17 18:33:36 2000 From: jhrsn@pitt.edu (Jim Harrison) Date: Fri, 17 Nov 2000 13:33:36 -0500 Subject: [Edu-sig] Programming for non-programmer IT professionals (in healthcare) In-Reply-To: <3.0.3.32.20001117051841.00a7d950@pop.teleport.com> Message-ID: on 11/17/00 8:18 AM, Kirby Urner at pdx4d@teleport.com wrote: ... > Python could provide more transparency in this realm, > especially if we add some ODBC and SQL to the picture > (Python add-ons), plus maybe some Numeric Python. The > idea here would be to access some large data tables and > do some analysis on them. ...not to mention a little CORBA to illustrate intersystem communication and enterprise system architecture, using one of the ORBs that is accessible from Python. > Brainstorming around the idea of an electronic medical > record defined using OO principles might be another focus, > combined with Python's evolving powers in the realm of > XML. The idea here is to open the world of structured > information, and around something so complicated as > medical histories. How do you structure the relationship > between patients, procedures, outcomes? 
Plus there's the > whole financial side (where clinical info sometimes gets > mixed in, i.e. ICD9 codes + specific medications). These are the sorts of things that I'm interested in formulating as problems to be addressed by programming tasks over the second portion of the course. We won't be able to get very deeply into them in an introductory course, but just raising these issues in a simple way in the setting of programming instruction would be a substantial improvement over the current curriculum. > I would agree with you that a year of C could give a > more warpedly and less useful approach... I'm not talking about a year of C. The requirement is one semester. As far as I'm concerned that's almost useless and in some cases detrimental. > [A] curriculum ... mixing pre- > written modules with exercises requiring student > programming (in Python) might easily go deeper into > the subject area... > > What's important is, as you've indicated in your post, > a kind fluency, an ability to think like a programmer > in _some_ language. Python is an excellent language > for developing this fluency, gaining this style of > thought. Just so, and that's the entire point. Though my focus is medical informatics, I think this issue goes beyond any particular subject domain. The concepts in CP4E were conceived with respect to beginning programming in secondary schools and perhaps the first year of undergrad. I see them transposing almost perfectly to upper level undergrad and graduate education in settings where "programming fluency" is necessary but technical skill in a medium or low level language is not. Thanks for the thought-provoking response. Jim Harrison Univ.
of Pittsburgh From delza@antarcti.ca Fri Nov 17 18:58:41 2000 From: delza@antarcti.ca (Dethe Elza) Date: Fri, 17 Nov 2000 10:58:41 -0800 Subject: [Edu-sig] Re: Programming for non-programmer IT professionals References: <20001117170229.EA3E11D2DD@dinsdale.python.org> <3.0.3.32.20001117103048.008875b0@pop.teleport.com> Message-ID: <005301c050c8$67491d20$2701010a@dev.antarcti.ca> > Thanks for the tip. Here's the Amazon URL: > http://www.amazon.com/exec/obidos/ASIN/0201895420/ I generally use ISBNs to reference a book so folks can use their store of choice. Here in Canada I usually use http://chapters.ca or http://indigo.ca to avoid going through customs and paying international shipping. > It'd be fun to swap course outlines or lesson plans showing > how Python might integrate more tightly into medical > informatics. My current work is in 3D on the web, I just happened to have read the book and thought it would apply to your research interest. My medical informatics is limited to what I picked up as a Hospice office secretary/ office manager and an incomplete EMT program. I'm on the list because I'm hugely interested in python, learning, and synergetics. --Dethe Dethe Elza Antarcti.ca Client Lead http://antarcti.ca http://map.net From pdx4d@teleport.com Sat Nov 18 17:40:47 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Sat, 18 Nov 2000 09:40:47 -0800 Subject: [Edu-sig] Re: Programming for non-programmer IT professionals In-Reply-To: <005301c050c8$67491d20$2701010a@dev.antarcti.ca> References: <20001117170229.EA3E11D2DD@dinsdale.python.org> <3.0.3.32.20001117103048.008875b0@pop.teleport.com> Message-ID: <3.0.3.32.20001118094047.009f4e90@pop.teleport.com> At 10:58 AM 11/17/2000 -0800, Dethe Elza wrote: >> Thanks for the tip. Here's the Amazon URL: >> http://www.amazon.com/exec/obidos/ASIN/0201895420/ >I generally use ISBNs to reference a book so folks can use their store of >choice.
Yes, ISBN more useful -- I was being more crassly commercial in referencing Amazon, mostly a convenience for USAers (or for those wanting to see a book cover, read some reviews, get an idea of the price and so on...). >> It'd be fun to swap course outlines or lesson plans showing >> how Python might integrate more tightly into medical >> informatics. > >My current work is in 3D on the web, I just happened to have read the book >and thought it would apply to >your research interest. My medical informatics is limited to what I picked >up as a Hospice office secretary/ >office manager and an incomplete EMT program. > >I'm on the list because I'm hugely interested in python, learning, and >synergetics. > >--Dethe That's a great combo (sharing a bias). I'll try to find that patterns book in a library or even 2nd hand (we're fortunate to have Powell's here in PDX (another commercial)). Speaking of 3D and imaging, that's a whole other way to approach medical topics using Python. Medicine, as much as any discipline, is pushing the boundaries of computerized imagery (storage-retrieval and analysis) and lots of Python is devoted to manipulation of digital files -- plus medical databases maybe include "blobs" i.e. binary data or links to same = cines, MRIs, other kinds of tomography.
Of course 3D (or "4D" in synergetics -- the origin of my company name, "4D Solutions" (commercial :-D)) is close to my heart as well, a primary focus of such Python-related webpages as http://www.inetarena.com/~pdx4d/ocn/pyqvectors.html Kirby Related posts: http://www.deja.com/getdoc.xp?AN=617004981&fmt=text http://www.teleport.com/~pdx4d/videogrammatron.html From pdx4d@teleport.com Sun Nov 19 17:20:12 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Sun, 19 Nov 2000 09:20:12 -0800 Subject: [Edu-sig] Cryptonomicon In-Reply-To: <3.0.3.32.20001111093620.00cff8dc@pop.teleport.com> References: <3.0.3.32.20001110200943.00cfa160@pop.teleport.com> Message-ID: <3.0.3.32.20001119092012.00822690@pop.teleport.com> Earlier I wrote: >Plus just consider those really simple substitution codes where >you make A = J, B = Z -- whatever random assignment. Also, >you want to parse your messages into 5 letter chunks: ALSOY >OUWAN TOPAR SEYOU RMESSA GESIN TO5LE TTERC HUNKS. You find >this simple "club house" codes in books for little kids, with >titles like 'I, Spy' and stuff. Of course this is nothing >sophisticated, but it's a great opportunity to use Python >string ops to: make the 5 letter uppercase chunks out of >whatever plaintext input (could be interactive and first, >then convert to file i/o); use the dictionary data structure >to do the lookup/substition. I've played around with the above, so far minus the 5-letter chunking, to develop a simple "clubhouse" substitution scheme in Python. Here's the kind of file it produces: RUFWVEUWB PLS VBJBL QBPWV PAU UFW RPTGBWV HWUFAGT RUWTG UL TGNV EULTNLBLT P LBY LPTNUL, EULEBNJBS NL CNHBWTQ PLS SBSNEPTBS TU TGB OWUOUVNTNUL TGPT PCC MBL PWB EWBPTBS BIFPC. LUY YB PWB BLAPABS NL P AWBPT ENJNC YPW, TBVTNLA YGBTGBW TGPT LPTNUL UW PLQ LPTNUL VU EULEBNJBS PLS VU SBSNEPTBS EPL CULA BLSFWB. YB PWB MBT UL P AWBPT HPTTCBRNBCS UR TGPT YPW. 
YB GPJB EUMB TU SBSNEPTB P OUWTNUL UR TGPT RNBCS PV P RNLPC WBVTNLA-OCPEB RUW TGUVB YGU GBWB APJB TGBNW CNJBV TGPT TGPT LPTNUL MNAGT CNJB. NT NV PCTUABTGBW RNTTNLA PLS OWUOBW TGPT YB VGUFCS SU TGNV. HFT NL P CPWABW VBLVB, YB EPLLUT SBSNEPTB, YB EPLLUT EULVBEWPTB, YB EPLLUT GPCCUY TGNV AWUFLS. TGB HWPJB MBL, CNJNLA PLS SBPS YGU VTWFAACBS GBWB GPJB EULVBEWPTBS NT RPW PHUJB UFW OUUW OUYBW TU PSS UW SBTWPET. TGB YUWCS YNCC CNTTCB LUTB LUW CULA WBMBMHBW YGPT YB VPQ GBWB, HFT NT EPL LBJBW RUWABT YGPT TGBQ SNS GBWB. ... along with the all-important key file, saved separately: D=Z Z=X Q=Y J=V ... and so on -- all 26 letters have their random substitutes. When you bring the key file and encrypted text back together, you get the original text back (except capitalized). Notice: this primitive scheme keeps punctuation and whitespace as is, simply substitutes for the 26 uppercase alpha letters. But it's still a useful exercise for learning various aspects of Python, as well as starting to think like a cryptographer. For example, coming up with a substitution key is automatic and involves pseudo-randomly choosing from a grab-bag of alpha characters, removing the chosen letter, and choosing again, until all 26 have been chosen at pseudo-random. You then build a dictionary with this, i.e. pair 'A' with the first letter chosen, 'B' with the next, and so on. There's no rule which says you can't pair a letter with itself -- could happen.
Here's some code:

    import string, random

    def permute():
        """
        Randomly permute the uppercase alphabet
        by choosing its letters pseudo-randomly
        """
        alphalist = list(string.uppercase)
        newlist = []
        for i in range(len(alphalist)):
            randchar = random.choice(alphalist)
            alphalist.remove(randchar)
            newlist.append(randchar)
        return newlist

    def mkdict():
        """
        Pair uppercase alphabet with randomly permuted
        version of same
        """
        tuples = zip(string.uppercase,permute())
        codedict = {}
        for pair in tuples:
            codedict[pair[0]]=pair[1]
        return codedict

The rest of clubhouse.py is about file i/o. You feed it filename.txt and get back filename.cpt and filename.key, the encrypted text and deciphering key. Then you reverse the process, getting filename.dcp (decrypted) back out. Looks like this in IDLE:

    >>> clubhouse.encrypt(r"./ocn/sample.txt")
    Writing to ./ocn/sample.cpt
    Saving key as ./ocn/sample.key
    >>> clubhouse.decrypt(r"./ocn/sample.cpt")
    Reading from ./ocn/sample.cpt
    Writing to ./ocn/sample.dcp
    Using key ./ocn/sample.key

FOURSCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION, CONCEIVED IN LIBERTY AND DEDICATED TO THE PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR, TESTING WHETHER THAT NATION OR ANY NATION SO CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF THAT FIELD AS A FINAL RESTING-PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE, WE CANNOT DEDICATE, WE CANNOT CONSECRATE, WE CANNOT HALLOW THIS GROUND. THE BRAVE MEN, LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE NOTE NOR LONG REMEMBER WHAT WE SAY HERE, BUT IT CAN NEVER FORGET WHAT THEY DID HERE.
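[Editorial sketch] The listing above is Python 2 era code; string.uppercase no longer exists in Python 3, where the fixed ASCII alphabet is string.ascii_uppercase. The following is a minimal in-memory sketch of the same substitution scheme for Python 3. The names make_key, encrypt and decrypt are hypothetical: the file-handling functions of clubhouse.py are not shown in the post, so this is not a reconstruction of them.

```python
import random
import string

def make_key(rng=random):
    """Map each uppercase letter to a letter of a shuffled alphabet."""
    shuffled = list(string.ascii_uppercase)
    rng.shuffle(shuffled)  # in-place pseudo-random permutation
    return dict(zip(string.ascii_uppercase, shuffled))

def encrypt(plaintext, key):
    """Uppercase the text and substitute letters; other characters pass through."""
    return "".join(key.get(ch, ch) for ch in plaintext.upper())

def decrypt(ciphertext, key):
    """Invert the key and substitute back."""
    inverse = {v: k for k, v in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext)

key = make_key()
secret = encrypt("Fourscore and seven years ago", key)
print(secret)
print(decrypt(secret, key))
```

As in the original, punctuation and whitespace are left untouched and the decrypted text comes back in capitals. The random module is pseudo-random, so this is suitable for classroom play only.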
I've put the colorized (HTMLized) code for my clubhouse.py
(version 1.0) at

  http://www.inetarena.com/~pdx4d/ocn/clubhouse.html

with non-HTMLized (plaintext) source code at:

  http://www.inetarena.com/~pdx4d/ocn/python/clubhouse.py

The html file has tie-backs to some of our posts in this
thread on edu-sig (including this one).

Kirby

From dyoo@hkn.eecs.berkeley.edu Mon Nov 20 03:32:53 2000
From: dyoo@hkn.eecs.berkeley.edu (Daniel Yoo)
Date: Sun, 19 Nov 2000 19:32:53 -0800 (PST)
Subject: [Edu-sig] Cryptonomicon
In-Reply-To: <3.0.3.32.20001119092012.00822690@pop.teleport.com>
Message-ID:

>   def permute():
>       """
>       Randomly permute the uppercase alphabet
>       by choosing its letters pseudo-randomly
>       """
>       alphalist = list(string.uppercase)
>       newlist = []
>       for i in range(len(alphalist)):
>           randchar = random.choice(alphalist)
>           alphalist.remove(randchar)
>           newlist.append(randchar)
>       return newlist

This looks nice! It might be nice to show another approach to shuffling
the alphabet:

###
from string import uppercase
from random import randint
def permute(L):
    """
    Permute a list by swapping elements randomly.
    """
    newlist = L[:]   # shallow copy
    for i in range(len(L)):
        rand_i = randint(i, len(L)-1)
        newlist[i], newlist[rand_i] = newlist[rand_i], newlist[i]
    return newlist

if __name__ == '__main__':
    print permute(list(uppercase))
###

From dustin@cs.uchicago.edu Mon Nov 20 04:38:50 2000
From: dustin@cs.uchicago.edu (Dustin Mitchell)
Date: Sun, 19 Nov 2000 22:38:50 -0600 (CST)
Subject: [Edu-sig] Cryptonomicon
In-Reply-To:
Message-ID:

On Sun, 19 Nov 2000, Daniel Yoo wrote:

> This looks nice! It might be nice to show another approach to shuffling
> the alphabet:
>
> ###
> from string import uppercase
> from random import randint
> def permute(L):
>     """
>     Permute a list by swapping elements randomly.
>     """
>     newlist = L[:]   # shallow copy
>     for i in range(len(L)):
>         rand_i = randint(i, len(L)-1)
>         newlist[i], newlist[rand_i] = newlist[rand_i], newlist[i]
>     return newlist
>
> if __name__ == '__main__':
>     print permute(list(uppercase))
> ###

It takes a bit more thought to see that this achieves the same amount of
permutation as the original. For instance, if we merely swapped random
elements L times, the result would be different, because unchanged
elements would be much more common. That is:

###
from string import uppercase
from random import randint
def permute(L):
    """
    Permute a list by swapping elements randomly.
    """
    newlist = L[:]   # shallow copy
    for i in range(len(L)/2+1):
        rand_i = randint(0, len(L)-1)
        rand_j = randint(0, len(L)-1)
        newlist[rand_j], newlist[rand_i] = newlist[rand_i], newlist[rand_j]
    return newlist

if __name__ == '__main__':
    print permute(list(uppercase))
###

I can see some interesting discussions about what it means to be random,
and what sorts of characteristics we might want from a 'random' activity
in different situations.

---------------------------------------------------------------------
| Connection - in an isolating age )O( |
---------------------------------------------------------------------

From pdx4d@teleport.com Mon Nov 20 13:35:21 2000
From: pdx4d@teleport.com (Kirby Urner)
Date: Mon, 20 Nov 2000 05:35:21 -0800
Subject: [Edu-sig] Cryptonomicon
In-Reply-To:
References:
Message-ID: <3.0.3.32.20001120053521.008148f0@pop.teleport.com>

>I can see some interesting discussions about what it means to be random,
>and what sorts of characteristics we might want from a 'random' activity
>in different situations.
>

Yes, always a worthwhile thread.

Also fun would be to take some clubhouse crypto texts and
crack them. Although permuting the 26 uppercase letters
yields 403,291,461,126,605,635,584,000,000 possible
arrangements (looks impressive), if we know the plaintext
is in English, then we can always apply some valuable clues,
e.g.
letter frequency, in descending order, tends towards
ETAOINSHRDLU, plus we have commonly occurring letter combos,
like ET EA OU.

And if the encrypted text isn't chunked (e.g. into 5-letter
strings), to obscure word lengths, then we have a lot more
to go on: frequent 3-letter words like AND THE would suggest
substitutions. Indeed, finding the most common letter (say V)
and then finding 3-letter words ending in V would give a good
hypothesis for the T and H substitutions.

I'd be in favor of leaving all these clues intact (e.g. not
even chunking at first) and allowing students to successfully
crack a few clubhouse messages. Then you could go to a next
level of difficulty (e.g. add chunking).

As per usual, the cryptographer and cracker share the same
mind, as do computer security folks and hackers (in the
popular media sense of hacker).

Kirby

From dustin@cs.uchicago.edu Mon Nov 20 16:10:01 2000
From: dustin@cs.uchicago.edu (Dustin Mitchell)
Date: Mon, 20 Nov 2000 10:10:01 -0600 (CST)
Subject: [Edu-sig] Cryptonomicon
In-Reply-To: <3.0.3.32.20001120053521.008148f0@pop.teleport.com>
Message-ID:

On Mon, 20 Nov 2000, Kirby Urner wrote:

> Also fun would be to take some clubhouse crypto texts and
> crack them. Although permuting the 26 uppercase letters
> yields 403,291,461,126,605,635,584,000,000 possible
> arrangements (looks impressive), if we know the plaintext
> is in English, then we can always apply some valuable clues,
> e.g. letter frequency, in descending order, tends towards
> ETAOINSHRDLU, plus we have commonly occurring letter combos,
> like ET EA OU.

I thought that was 'RSTLNE'..
Dustin

---------------------------------------------------------------------
| Connection - in an isolating age )O( |
---------------------------------------------------------------------

From pdx4d@teleport.com Mon Nov 20 17:28:33 2000
From: pdx4d@teleport.com (Kirby Urner)
Date: Mon, 20 Nov 2000 09:28:33 -0800
Subject: [Edu-sig] Cryptonomicon
In-Reply-To:
References: <3.0.3.32.20001120053521.008148f0@pop.teleport.com>
Message-ID: <3.0.3.32.20001120092833.009648d0@pop.teleport.com>

>> e.g. letter frequency, in descending order, tends towards
>> ETAOINSHRDLU, plus we have commonly occurring letter combos,
>> like ET EA OU.
>
>I thought that was 'RSTLNE'..
>
>Dustin

A joke, right? Something to do with the Wheel of Fortune.

BTW, my source for that factoid was:

  http://raphael.math.uic.edu/~jeremy/crypt/freq.html

Here's another post on the topic, which gives some
alternatives (all starting with ET though):

  http://linguistlist.org/~ask-ling/archive-1998.7/msg00388.html

Kirby

From dorothea@impressions.com Tue Nov 21 18:04:08 2000
From: dorothea@impressions.com (Dorothea Salo)
Date: Tue, 21 Nov 2000 12:04:08 -0600
Subject: [Edu-sig] Re: Edu-sig digest, Vol 1 #172 - 5 msgs
References: <20001120170133.363A91D015@dinsdale.python.org>
Message-ID: <11d201c053e5$71965e30$2000c886@ep2>

>   def permute():
>       """
>       Randomly permute the uppercase alphabet
>       by choosing its letters pseudo-randomly
>       """
>       alphalist = list(string.uppercase)
>       newlist = []
>       for i in range(len(alphalist)):
>           randchar = random.choice(alphalist)
>           alphalist.remove(randchar)
>           newlist.append(randchar)
>       return newlist
>
>   def mkdict():
>       """
>       Pair uppercase alphabet with randomly permuted
>       version of same
>       """
>       tuples = zip(string.uppercase,permute())
>       codedict = {}
>       for pair in tuples:
>           codedict[pair[0]]=pair[1]
>       return codedict

The above algorithms, if I'm reading them correctly, allow a character
to be substituted by itself.
I assume this is desirable behavior (since it increases the number of
permutations), but perhaps someone could comment on how often
self-substitution could be expected to happen, and how likely it might
be that self-substitution would cause the "encrypted" result to be
easier to human-decode. (Self-substitution of "e" seems more likely to
ease decryption than self-substitution of "z," but I'm just guessing.)

Of course, most Cryptoquoters assume that self-substitution is
explicitly disallowed, so it might actually be considered a "feature"
in this context...

(I can think of a couple of ways to disallow self-substitution, but I
think the Python is less interesting than the problem.)

Dorothea
--
Dorothea Salo
Impressions Book and Journal Services, Inc.
phone: (608) 244-6218 fax: (608) 244-7050
http://www.impressions.com

From schoen@loyalty.org Tue Nov 21 19:34:28 2000
From: schoen@loyalty.org (Seth David Schoen)
Date: Tue, 21 Nov 2000 11:34:28 -0800
Subject: [Edu-sig] Re: Edu-sig digest, Vol 1 #172 - 5 msgs
In-Reply-To: <11d201c053e5$71965e30$2000c886@ep2>; from dorothea@impressions.com on Tue, Nov 21, 2000 at 12:04:08PM -0600
References: <20001120170133.363A91D015@dinsdale.python.org> <11d201c053e5$71965e30$2000c886@ep2>
Message-ID: <20001121113428.S28746@zork.net>

Dorothea Salo writes:

> >   def permute():
> >       """
> >       Randomly permute the uppercase alphabet
> >       by choosing its letters pseudo-randomly
> >       """
> >       alphalist = list(string.uppercase)
> >       newlist = []
> >       for i in range(len(alphalist)):
> >           randchar = random.choice(alphalist)
> >           alphalist.remove(randchar)
> >           newlist.append(randchar)
> >       return newlist
> >
> >   def mkdict():
> >       """
> >       Pair uppercase alphabet with randomly permuted
> >       version of same
> >       """
> >       tuples = zip(string.uppercase,permute())
> >       codedict = {}
> >       for pair in tuples:
> >           codedict[pair[0]]=pair[1]
> >       return codedict
>
> The above algorithms, if I'm reading them correctly, allow a character
> to be substituted by itself.
> I assume this is desirable behavior (since it
> increases the number of permutations), but perhaps someone could comment on
> how often self-substitution could be expected to happen, and how likely it
> might be that self-substitution would cause the "encrypted" result to be
> easier to human-decode. (Self-substitution of "e" seems more likely to ease
> decryption than self-substitution of "z," but I'm just guessing.)

Self-substitution happens at all with probability 1-(1/e). It might
help human attackers in some circumstances, if they get extraordinarily
lucky. But in general, there's no reason to assume that
self-substitution leaks more information about the message.

If you see that the most common letter in a cyphertext is "M", do you
assume that "M" is "E"?

If you see that the most common letter in a cyphertext is "E", do you
assume that "E" is "E"?

If your answer to the two questions is different, then you're implying
that you think the cyphertext alphabet is not completely arbitrary.
This might be a good assumption in dealing with substitution cyphers
created by humans by hand (in the sense that people are terrible at
generating random numbers), but it's probably not a good assumption in
dealing with an automatically generated cypher.

Remember that by forbidding self-substitution, you eliminate almost 2/3
of the possible permutations, so you more than _double_ the amount of
information available about the decryption process. (OK, that's not
quite fair, because you don't usually have to find the entire alphabet
in order to recover the plaintext.) Why should you give any extra
clues?

I can rephrase this another way. Suppose I have a cyphertext

  "QTQRQ ZCMYO".

So originally this is just saying: there is a phrase of two words of
five letters each written with the Latin alphabet. The first, third,
and fifth letters of the first word are identical to each other, and
different from all other letters in the phrase.
The second letter of the first word is unique (not repeated anywhere
else in the phrase). So is the fourth letter of that word. So is each
letter of the second word.

Do you see that this is _all_ the information given by the cyphertext?

Now specifying a non-self-substitution rule adds the following
additional information, which was not conveyed in any way by the
cyphertext:

The first letter of the first word is not "Q". The second letter of
that word is not "T". The fourth letter of that word is not "R". The
first letter of the second word is not "Z". The second letter of that
word is not "C". The third letter of that word is not "M". The fourth
letter of that word is not "Y". The fifth letter of that word is not
"O".

These are potentially very significant hints! But there was absolutely
no way to infer them directly from the original cyphertext, without
reference to knowledge of any languages that can be written in the
Latin alphabet.

Another way to say this is that if you somehow have the vast file
/usr/share/plaintexts/parsimonious-decryptions, which contains all
possible parsimonious decryptions of every cyphertext :-), and you are
trying to write a program to search this file and print out all
possible parsimonious decryptions for a given cyphertext, the hints you
get from the non-self-substitution rule will potentially make your
program faster and make it print out fewer possible decryptions.

--
Seth David Schoen                     | And do not say, I will study when I
Temp. http://www.loyalty.org/~schoen/ | have leisure; for perhaps you will
down: http://www.loyalty.org/ (CAF)   | not have leisure.
                                      | -- Pirke Avot 2:5

From pdx4d@teleport.com Sat Nov 25 18:09:40 2000
From: pdx4d@teleport.com (Kirby Urner)
Date: Sat, 25 Nov 2000 10:09:40 -0800
Subject: [Edu-sig] Cryptonomicon
In-Reply-To: <3.0.3.32.20001120092833.009648d0@pop.teleport.com>
References: <3.0.3.32.20001120053521.008148f0@pop.teleport.com>
Message-ID: <3.0.3.32.20001125100940.00966800@pop.teleport.com>

I've added some links to my Python-based
http://www.inetarena.com/~pdx4d/ocn/clubhouse.html
including to an URL where Windows users can
download GUI simulators of Enigma machines.

The Sale essay on deciphering the Enigma mentions
how "no letter may encipher to itself" was actually
a weakness of the German system, along with its
bidirectionality, i.e. if A enciphered to J, then
J enciphered to A.

The difference between simple random substitution
ala my clubhouse code algorithm (which allows
self-substitution) and something like Enigma, is
the latter changes the substitution key with each
press of a letter (in the Enigma using a complicated
system of rotors which, like a car odometer,
knocked successive wheels one notch with each
complete revolution of the one before).

Here's some example plaintext and corresponding
ciphertext, from one of the Enigma simulators:

Input (note 5-letter chunking):

  AQUIC KBROW NFOXJ UMPED OVERT HELAZ YDOGW WWWWW
  WWWWW WWWWW WWWWW WWWWW WWWWW WWWWW WW

Output (note how repeated Ws in the input
nevertheless encipher to different letters
below):

  UVWFP ALDFF FMNML SHZLI GTMXM CISQU EIYED FJORN
  OMNRA CZVXL MRBAO JRGRO ZKCAJ NMMLP AO

Also in the news: an Enigma machine stolen from
the Bletchley Park museum was recently recovered,
along with the internal rotors (found separately,
according to newspaper accounts).
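To give the flavor of a key that changes with every keypress, here is a toy sketch in Python 3. It is an illustration built on assumptions, not a model of a real Enigma: the three "wheels" are made-up affine permutations, there is no reflector or plugboard, and the names WHEELS, step and toy_rotor are hypothetical. Only the odometer-style stepping described above is imitated.

```python
import string

ALPHABET = string.ascii_uppercase

# Three hypothetical wheels, each a fixed permutation of 0..25.
# The multipliers 7, 11 and 17 share no factor with 26, so i -> (m*i + b) % 26
# visits every value exactly once.
WHEELS = [[(m * i + b) % 26 for i in range(26)]
          for m, b in ((7, 3), (11, 5), (17, 9))]
INVERSES = [[wheel.index(i) for i in range(26)] for wheel in WHEELS]


def step(positions):
    """Advance the wheels like an odometer.

    Wheel 0 moves on every letter; each later wheel moves only when the
    wheel before it has just completed a full revolution.
    """
    for k in range(len(positions)):
        positions[k] = (positions[k] + 1) % 26
        if positions[k] != 0:
            break


def toy_rotor(text, decrypt=False):
    """Encipher (or decipher) uppercase letters; other characters pass through."""
    positions = [0, 0, 0]
    out = []
    for ch in text:
        if ch not in ALPHABET:
            out.append(ch)
            continue
        x = ALPHABET.index(ch)
        if decrypt:
            # Undo the wheels in reverse order, using the same positions.
            for wheel, p in zip(reversed(INVERSES), reversed(positions)):
                x = (wheel[(x + p) % 26] - p) % 26
        else:
            for wheel, p in zip(WHEELS, positions):
                x = (wheel[(x + p) % 26] - p) % 26
        out.append(ALPHABET[x])
        step(positions)
    return "".join(out)


if __name__ == "__main__":
    secret = toy_rotor("WWWWW WWWWW")
    print(secret)
    print(toy_rotor(secret, decrypt=True))
```

Unlike the real machine, this toy can encipher a letter to itself and is not bidirectional, so deciphering needs the separate decrypt path rather than simply typing the ciphertext back in.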
Another link contains some scans of Turing's
original typed manuscript re the Enigma, plus
there's a virtual tour of Bletchley Park -- all
very reinforcing of the storyline developed by
Neal Stephenson's 'Cryptonomicon', the novel which
originally inspired me to launch this thread.

It'd be highly feasible to write an Enigma simulator
in Python of course, including with a GUI front
end. But in accordance with my "cave painting"
analogy, I think what's important from a pedagogical
point of view is, on first pass, to give just the
flavor, the essential gist, and then move on to
linked topics (e.g. digital circuit design and
the evolution of computing hardware).

Kirby

From delza@antarcti.ca Mon Nov 27 17:52:48 2000
From: delza@antarcti.ca (Dethe Elza)
Date: Mon, 27 Nov 2000 09:52:48 -0800
Subject: [Edu-sig] Re: Cryptonomicon
References: <20001126170128.19C591CE0D@dinsdale.python.org>
Message-ID: <3A229F70.4030206@antarcti.ca>

Bruce Schneier, who wrote the Solitaire algorithm (called Pontifex in
Cryptonomicon), has an excellent page describing how and why it works:

  http://www.counterpane.com/solitaire.html

One of the cool things about it is that it's designed to use playing
cards, so you can have students encrypt and decrypt messages with cards,
getting a tactile feel for encryption, then implement the code. There's
Python code up on the site, too, though they make no claims about its
reliability:

  http://www.counterpane.com/pysol.zip

Anyway, it's much more realistic from an encryption point of view,
without being much more complicated than a clubhouse algorithm. Win-win.
--Dethe

From pdx4d@teleport.com Mon Nov 27 21:23:03 2000
From: pdx4d@teleport.com (Kirby Urner)
Date: Mon, 27 Nov 2000 13:23:03 -0800
Subject: [Edu-sig] Re: Cryptonomicon
In-Reply-To: <3A229F70.4030206@antarcti.ca>
References: <20001126170128.19C591CE0D@dinsdale.python.org>
Message-ID: <3.0.3.32.20001127132303.00988c70@pop.teleport.com>

>Anyway, it's much more realistic from an encryption point of view,
>without being much more complicated than a clubhouse algorithm. Win-win.
>
>--Dethe

RE: SOLITAIRE

I think it's quite a bit more complicated than the clubhouse
algorithm, but highly relevant and useful in any case. I'll
be adding a link specific to that Solitaire page from my
http://www.inetarena.com/~pdx4d/ocn/clubhouse.html as it's so
tied in to the 'Cryptonomicon' thread. Thanks for pointing
this out.

It'd be a fun project for a student to verify the Python
implementation of Solitaire against the test vectors provided
(test vectors show, for example, what ciphertext you should
get out, using what key and plaintext as inputs).

As I recall from the novel, the Perl version makes heavy
use of regular expressions. Probably the Python version
is written the same way (I haven't looked at it yet).
If the tests fail for any reason, then the verification
job could turn into a debugging job.

RE: BLOWFISH

Speaking of complicated, state-of-the-art algorithms, I've
just finished running the test vectors (successfully) for my
pure Python implementation of the Blowfish encryption algorithm.
I haven't written anything about it yet, but the code itself
is at:

  http://www.inetarena.com/~pdx4d/ocn/python/blowfish.html
  http://www.inetarena.com/~pdx4d/ocn/python/blowfish.py

(that's a color-coded .html version for readability, web-linking,
plus a .py version for downloading to run, as per my usual
practice).

A.M. Kuchling has done the sophisticated implementation of
Blowfish for Python, available via
http://web.homeport.org/~adam/crypto/python.phtml -- plus he's
sharing lots more algorithms besides Blowfish, and implementing
them for speed, by extending Python's modules using C.

What I'm doing is helping students learn the algorithm in a way
that assumes familiarity with Python, but not necessarily with
C -- i.e. my implementation is more for pedagogical purposes than
for providing industrial-grade code for a production environment.
It provides a good excuse to play around with bits and bytes,
which in turn reinforces the groundwork for a lot of "math
through programming" topics.

Blowfish is one of those bit-flipper mix-masters which churns
away at a huge haystack of array values, initialized by default
to the hexadecimal digits of PI. Your pin (which can be from
32 to 448 bits, in 32-bit increments), gets churned into that
haystack (by way of 521 encryption operations) and effectively
lost (so don't forget it). This pin-initialized haystack then
encyphers your plaintext in 64-bit blocks, losing it too -- and
yet by simply reversing the 16+ pre-defined encyphering
operations, the plaintext is faithfully recovered -- provided,
of course, said haystack is first initialized using the same pin.

Kirby

From urner@alumni.Princeton.EDU Tue Nov 28 20:55:10 2000
From: urner@alumni.Princeton.EDU (Kirby Urner)
Date: Tue, 28 Nov 2000 12:55:10 -0800
Subject: [Edu-sig] Re: Intro to Crypto (Python, kid/beginner focus)
References:
Message-ID: <05682tc8n7fht3qbgl915ls4u0bfm0asqv@4ax.com>

Kirby Urner wrote:

>A lot of the background reading is on the web pages
>linked from the URL below, which contains mostly
>Python source code.

Actually, not any more -- I've just upgraded this
page (*) to contain a better narrative:code ratio.
It's actually quite readable now, with most of the
source code kept in the back, for those wanting to
read it (and/or copy/execute it).

This is one of several web pages from the Oregon
Curriculum Network, developed in part to assist
homeschoolers with math-related curriculum. But I'm
not exclusively so focussed (e.g. my own daughter goes
to a public school, and I'm thinking of her and her
peers just as much -- went over there recently to
install Python for the iMac).

The Python aspect wasn't part of this site at the
outset -- although using *some* computer language
always was.
I came to Python late in my game, and quickly
recognized it for having the key features I was
looking for in a teaching language, chiefly a command
line interface (CLI) and an easy way of implementing
many of the object-oriented programming concepts (OO).

Kirby

* http://www.inetarena.com/~pdx4d/ocn/clubhouse.html

From pdx4d@teleport.com Wed Nov 29 22:08:33 2000
From: pdx4d@teleport.com (Kirby Urner)
Date: Wed, 29 Nov 2000 14:08:33 -0800
Subject: [Edu-sig] Re: Cryptonomicon
In-Reply-To: <3.0.3.32.20001127132303.00988c70@pop.teleport.com>
References: <3A229F70.4030206@antarcti.ca> <20001126170128.19C591CE0D@dinsdale.python.org>
Message-ID: <3.0.3.32.20001129140833.009ac100@pop.teleport.com>

>It'd be a fun project for a student to verify the Python
>implementation of Solitaire against the test vectors provided
>(test vectors show, for example, what ciphertext you should
>get out, using what key and plaintext as inputs).

Of course Mordy Ovits has provided his own test vectors and
verification routines for his implementation of Solitaire,
so this fun project has already been done -- I shoulda
realized.

>As I recall from the novel, the Perl version makes heavy
>use of regular expressions. Probably the Python version
>is written the same way (I haven't looked at it yet).

It's not -- no regular expressions at all. Solitaire in
Python therefore seems far less cryptic to me than the Perl
implementation -- but that's not surprising, as I'm no Perl
guru.

From Solitaire.py:

# NOTE: the Solitaire encryption algorithm is strong cryptography.
# That means the security it affords is based on the secrecy of the
# key rather than secrecy of the algorithm itself. That also means
# that this program and programs derived from it may be treated as
# a munition for the purpose of export regulation in the United States
# and other countries. You are encouraged to seek competent legal
# counsel before distributing copies of this program.

Fun fun!
Kirby

From dyoo@hkn.eecs.berkeley.edu Thu Nov 30 11:33:55 2000
From: dyoo@hkn.eecs.berkeley.edu (Daniel Yoo)
Date: Thu, 30 Nov 2000 03:33:55 -0800 (PST)
Subject: [Edu-sig] More permutation madness [Cryptonomicon]
Message-ID:

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
Send mail to mime@docserver.cac.washington.edu for more info.

--545289610-2108799327-975584035=:15623
Content-Type: TEXT/PLAIN; charset=US-ASCII

There was a question about ways of generating permutations: is there an
easy way to show that these two functions:

###
def permuteRestricted(L):
    newlist = L[:]   # shallow copy
    for i in range(len(L)):
        rand_i = randint(i, len(L)-1)
        newlist[i], newlist[rand_i] = newlist[rand_i], newlist[i]
    return newlist

def permuteUnrestricted(L):
    newlist = L[:]   # shallow copy
    for i in range(len(L)):
        rand_i = randint(0, len(L)-1)
        rand_j = randint(0, len(L)-1)
        newlist[rand_j], newlist[rand_i] = newlist[rand_i], newlist[rand_j]
    return newlist
###

act differently?

I tried to prove this difference by brute force: the attached program
generates the whole space of permutations, given these two methods. I
have to apologize for the code's ugliness and sluggishness, and if anyone
can simplify or optimize it, I'd be very happy.

I hope this helps!
--545289610-2108799327-975584035=:15623 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="countPerms.py" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="countPerms.py" IiIiVGhpcyBpcyBhIHNtYWxsIHNjcmlwdCB0aGF0IHNob3dzIHRoZSBkaWZm ZXJlbmNlIGJldHdlZW4gdHdvDQpzaHVmZmxpbmcgbWV0aG9kcy4gIEl0IHNo b3VsZCBzaG93IHRoYXQgb25lIGFwcHJvYWNoIGRvZXNuJ3QgZXZlbmx5DQpn ZW5lcmF0ZSBjZXJ0YWluIHR5cGVzIG9mIHNodWZmbGluZy4NCg0KRG8gdGhl IGZvbGxvd2luZyBmdW5jdGlvbnM6DQoNCiMjIw0KZGVmIHBlcm11dGVSZXN0 cmljdGVkKEwpOg0KICAgIG5ld2xpc3QgPSBMWzpdICAgIyBzaGFsbG93IGNv cHkNCiAgICBmb3IgaSBpbiByYW5nZShsZW4oTCkpOg0KICAgICAgICByYW5k X2kgPSByYW5kaW50KGksIGxlbihMKS0xKQ0KICAgICAgICBuZXdsaXN0W2ld LCBuZXdsaXN0W3JhbmRfaV0gPSBuZXdsaXN0W3JhbmRfaV0sIG5ld2xpc3Rb aV0NCiAgICByZXR1cm4gbmV3bGlzdA0KIyMjDQoNCmFuZA0KDQojIyMNCmRl ZiBwZXJtdXRlVW5yZXN0cmljdGVkKEwpOg0KICAgIG5ld2xpc3QgPSBMWzpd ICAgIyBzaGFsbG93IGNvcHkgICAgICAgICAgICANCiAgICBmb3IgaSBpbiBy YW5nZShsZW4oTCkpOiAgICAgICAgICAgICAgICAgICANCiAgICAgICAgcmFu ZF9pID0gcmFuZGludCgwLCBsZW4oTCktMSkNCiAgICAgICAgcmFuZF9qID0g cmFuZGludCgwLCBsZW4oTCktMSkNCiAgICAgICAgbmV3bGlzdFtyYW5kX2pd LCBuZXdsaXN0W3JhbmRfaV0gPSBuZXdsaXN0W3JhbmRfaV0sIG5ld2xpc3Rb cmFuZF9qXQ0KICAgIHJldHVybiBuZXdsaXN0DQojIyMNCg0KZG8gdGhlIHNh bWUgdGhpbmc/DQoNClRoZSBmb2xsb3dpbmcgcHJvZ3JhbSB0cmllcyB0byBz aG93IHRoYXQgdGhlcmUgSVMgYSBkaWZmZXJlbmNlIGJldHdlZW4NCnRoZW06 IHRoZXkgcHJvZHVjZXMgZGlmZmVyZW50IGRpc3RyaWJ1dGlvbnMgb2YgcGVy bXV0YXRpb25zLCB0aGF0IGlzLA0KdW5kZXIgdW5kcmVzdHJpY3RlZCByYW5k b20gc3dhcHBpbmcsIGNlcnRhaW4gcGVybXV0YXRpb25zIGFyZSBtb3JlDQpj b21tb24gdGhhbiBvdGhlcnMhDQoNCldlIGRvIHRoaXMgYnkgY29uc3RydWN0 aW5nIHRoZSB3aG9sZSBzcGFjZSBvZiBwZXJtdXRhdGlvbnMgYWNjb3JkaW5n DQp0byBlYWNoIG1ldGhvZC4gIEFmdGVyIGNvbnN0cnVjdGlvbiwgd2UgY291 bnQgdGhlIGRpc3RyaWJ1dGlvbnMuIiIiDQoNCg0KZnJvbSBvcGVyYXRvciBp bXBvcnQgYWRkDQpkZWYgbWFwcGVuZChmLCBMKToNCiAgICAiIiJBcHBseSBh IGZ1bmN0aW9uIG9uIGVhY2ggZWxlbWVudCBvZiB0aGUgbGlzdCwgYW5kIGFw cGVuZCBhbGwNCiAgICBpdHMgcmVzdWx0cy4gIFVzZWZ1bCB3aGVuIGYoeCkg 
aXRzZWxmIHJldHVybnMgYSBsaXN0IG9mIHZhbHVlcyBmb3INCiAgICBlYWNo IHggaW4gTC4gICIiIg0KICAgIHJldHVybiByZWR1Y2UoYWRkLCBtYXAoZiwg TCkpDQoNCmRlZiBkaXN0SGFzaChMKToNCiAgICAiIiJSZXR1cm4gYSBoYXNo IG9mIHRoZSBkaXN0cmlidXRpb24gb2YgdGhlIGVsZW1lbnRzLiAgRWFjaA0K ICAgIGVsZW1lbnQgbXVzdCBiZSBhbiBpbW11dGFibGUgdGltZTsgb3RoZXJ3 aXNlIHdlIGNhbid0IHVzZSBpdCBhcyBhDQogICAgaGFzaCBrZXkuIiIiDQog ICAgcmVzdWx0ID0ge30NCiAgICBmb3IgeCBpbiBMOg0KICAgICAgICByZXN1 bHRbeF0gPSByZXN1bHQuZ2V0KHgsIDApICsgMQ0KICAgIHJldHVybiByZXN1 bHQNCg0KZGVmIGdldFN3YXBwZWRQZXJtKHBlcm0sIGksIGopOg0KICAgICIi IlJldHVybiB0aGUgcmVzdWx0aW5nIHBlcm11YXRpb24gYWZ0ZXIgb25lIHN3 YXAgYmV0d2VlbiBwZXJtW2ldDQogICAgYW5kIHBlcm1bal0iIiINCiAgICBy ZXN1bHQgPSBwZXJtWzpdDQogICAgcmVzdWx0W2ldLCByZXN1bHRbal0gPSBy ZXN1bHRbal0sIHJlc3VsdFtpXQ0KICAgIHJldHVybiByZXN1bHQNCg0KZGVm IGdlbmVyYXRlVW5yZXN0cmljdGVkKHBlcm0sIGkpOg0KICAgICIiIlJldHVy biBhbiB1bnJlc3RyaWN0ZWQgZXhwYW5zaW9uIG9mIHBlcm11dGF0aW9ucywg Z2l2ZW4gYSBiYXNlDQogICAgcGVybXV0YXRpb24gInBlcm0iLg0KDQogICAg VGhpcyBpbnRlbnNpb25hbGx5IGlnbm9yZXMgdGhlIGJvdW5kYXJ5IGkuDQog ICAgIiIiDQogICAgcmVzdWx0cyA9IFtdDQogICAgZm9yIGogaW4gcmFuZ2Uo bGVuKHBlcm0pKToNCiAgICAgICAgZm9yIGsgaW4gcmFuZ2UobGVuKHBlcm0p KToNCiAgICAgICAgICAgIHJlc3VsdHMuYXBwZW5kKGdldFN3YXBwZWRQZXJt KHBlcm0sIGosIGspKQ0KICAgIHJldHVybiByZXN1bHRzDQoNCmRlZiBnZW5l cmF0ZVJlc3RyaWN0ZWQocGVybSwgaSk6DQogICAgIiIiUmV0dXJuIHRoZSBl eHBhbnNpb24gb2YgcGVybXV0YXRpb25zLCBnaXZlbiBhIGJhc2UgcGVybXV0 YXRpb24NCiAgICAicGVybSIuDQoNCiAgICBUaGUgcGFyYW1ldGVyICdpJyBy ZXN0cmljdHMgd2hhdCBraW5kcyBvZiBwZXJtdXRhdGlvbnMgYXJlDQogICAg YXZhaWxhYmxlIHRvIGdlbmVyYXRlLiIiIg0KICAgIA0KICAgIHJlc3VsdHMg PSBbXQ0KICAgIGZvciBqIGluIHJhbmdlKGksIGxlbihwZXJtKSk6DQogICAg ICAgIHJlc3VsdHMuYXBwZW5kKGdldFN3YXBwZWRQZXJtKHBlcm0sIGksIGop KQ0KICAgIHJldHVybiByZXN1bHRzDQoNCmRlZiBnZXRQZXJtdXRhdGlvbnMo biwgZnVuYyk6DQogICAgIiIiR2VuZXJhdGUgdGhlIHdob2xlIHNwYWNlIG9m IHBlcm11dGF0aW9ucywgZ2l2ZW4gYSBwZXJtdXRpbmcNCiAgICBmdW5jdGlv biAiZnVuYy4iDQogICAgIiIiDQogICAgcGVybVNwYWNlID0gWyBsaXN0KHJh 
bmdlKG4pKSBdICAjIG91ciBpbml0aWFsIHNwYWNlIGNvbnRhaW5zDQogICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICMgdGhlIGlkZW50aXR5IHBl cm11dGF0aW9uLg0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAj IFdlIHdpbGwgZXhwYW5kIHRoaXMuDQogICAgZm9yIGkgaW4gcmFuZ2Uobik6 DQogICAgICAgIHBlcm1TcGFjZSA9IG1hcHBlbmQobGFtYmRhIHAsIGk9aSwg ZnVuYz1mdW5jOiBmdW5jKHAsaSksIHBlcm1TcGFjZSkNCiAgICByZXR1cm4g cGVybVNwYWNlDQoNCg0KZGVmIF9tYWluKG4pOg0KICAgIHJpZ2h0ID0gZ2V0 UGVybXV0YXRpb25zKG4sIGdlbmVyYXRlUmVzdHJpY3RlZCkgICAgICMgbG9h ZGVkIHdvcmRzLi4uDQogICAgcHJpbnQgIlJlc3RyaWN0ZWQgc3dhcHBpbmc6 IiAsIGRpc3RIYXNoKG1hcCh0dXBsZSwgcmlnaHQpKQ0KICAgIA0KICAgIHdy b25nID0gZ2V0UGVybXV0YXRpb25zKG4sIGdlbmVyYXRlVW5yZXN0cmljdGVk KQ0KICAgIHByaW50ICJVbnJlc3RyaWN0ZWQgc3dhcHBpbmc6IiwgZGlzdEhh c2gobWFwKHR1cGxlLCB3cm9uZykpDQoNCiMgRHJpdmVyIGZ1bmN0aW9uLiAg QW55dGhpbmcgbW9yZSB0aGFuIDMgd2lsbCBjYXVzZSBhIExPVCBvZiB3b3Jr Lg0KaWYgX19uYW1lX18gPT0gJ19fbWFpbl9fJzoNCiAgICBwcmludCAiSGVy ZSdzIGEgc3VtbWFyeSBvZiBwZXJtdXRhdGlvbnMgbGVuZ3RoIDM6Ig0KICAg IF9tYWluKDMpDQogICAgcHJpbnQNCiAgICBwcmludCAiSGVyZSdzIGEgc3Vt bWFyeSBvZiBwZXJtdXRhdGlvbnMgbGVuZ3RoIDQ6Ig0KICAgIF9tYWluKDQp DQoNCg0K --545289610-2108799327-975584035=:15623-- From pdx4d@teleport.com Thu Nov 30 17:30:27 2000 From: pdx4d@teleport.com (Kirby Urner) Date: Thu, 30 Nov 2000 09:30:27 -0800 Subject: [Edu-sig] More permutation madness [Cryptonomicon] In-Reply-To: Message-ID: <3.0.3.32.20001130093027.0099a690@pop.teleport.com> At 03:33 AM 11/30/2000 -0800, Daniel Yoo wrote: > >There was a question about ways of generating permutations: is there an >easy way to show that these two functions: <> >act differently? They have different properties. There's only one possible route to ABCD...WXYZ in the restricted method (all letters self-identified) through a narrowing set of options at each turn, for 26 turns. But the unrestricted method gives a great many pathways to this same terminus in 26 iterations, by presenting a non-narrowing set of options at each turn. 
The restricted method is your classic decision tree with
fanning branches, terminating in 26! equally probable
outcomes. Every iteration forces you to the next level,
towards the final outcome level.

The unrestricted method allows you to retrace your steps,
back towards the beginning.

Without trying to compute a lot of specific probabilities,
I think you can see that the two methods, while confined to
the same "state space" of 26! letter sequences, are not
isomorphic (i.e. are not simply two ways of doing the exact
same thing).

Kirby

From tim.one@home.com Thu Nov 30 21:10:00 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 30 Nov 2000 16:10:00 -0500
Subject: [Edu-sig] More permutation madness [Cryptonomicon]
In-Reply-To:
Message-ID:

[Daniel Yoo]
> There was a question about ways of generating permutations: is there an
> easy way to show that these two functions:
>
> ###
> def permuteRestricted(L):
>     newlist = L[:]     # shallow copy
>     for i in range(len(L)):
>         rand_i = randint(i, len(L)-1)
>         newlist[i], newlist[rand_i] = newlist[rand_i], newlist[i]
>     return newlist
>
> def permuteUnrestricted(L):
>     newlist = L[:]     # shallow copy
>     for i in range(len(L)):
>         rand_i = randint(0, len(L)-1)
>         rand_j = randint(0, len(L)-1)
>         newlist[rand_j], newlist[rand_i] = newlist[rand_i], newlist[rand_j]
>     return newlist
> ###
>
> act differently?
> ...

By counting. Let N=len(L). Then in the first version, there are N! (N
factorial) possible outcomes (on the first iteration thru the loop, rand_i
can have any of N values; on the second iteration, any of N-1; and so on).
In the second version, N**(2*N) (on the first iteration thru the loop, the
pair (rand_i, rand_j) can take any of N**2 values; and likewise for all the
other iterations).

Since there are N! possible permutations, that the first version has N!
possible outcomes means it's at least *possible* for each permutation to be
equally likely (the counting argument alone can't prove it's fair, though --
it simply doesn't rule fairness out). But the second version can't possibly
be fair unless N**(2*N) is a multiple of N!. That is, I think, what makes it
so seductive: the second version *is* fair if N happens to be 1 or 2.
People think about the small cases, and leap to the conclusion that it "must
be" fair for larger N too. But for N==3, 6 (3!) does not divide 729 (3**6),
so the second version necessarily favors some permutations.

Note that "the usual" wrong algorithm for generating a random permutation is
a bit simpler than permuteUnrestricted:

def permuteUnrestricted_usual(L):
    newlist = L[:]     # shallow copy
    for i in range(len(L)):
        rand_j = randint(0, len(L)-1)
        newlist[i], newlist[rand_j] = newlist[rand_j], newlist[i]
    return newlist

The same kind of counting argument shows that it can't be fair for
len(L)>2. An exact analysis is quite difficult!

probability-is-full-of-surprises-ly y'rs - tim
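
The counting argument above can be checked by brute force for small N. What
follows is only a sketch, not code from the thread: the function names
(count_restricted, count_unrestricted, count_usual) are made up for
illustration, and it is written for Python 3. Instead of calling randint, it
walks through every sequence of values randint could have returned, applies
the same swaps as the three functions discussed, and tallies how many of
those equally likely paths end at each permutation.

```python
from collections import Counter
from itertools import product


def _tally(n, swap_sequences):
    """Apply each sequence of (a, b) swaps to the identity; count the results."""
    counts = Counter()
    for swaps in swap_sequences:
        perm = list(range(n))
        for a, b in swaps:
            perm[a], perm[b] = perm[b], perm[a]
        counts[tuple(perm)] += 1
    return counts


def count_restricted(n):
    # Step i swaps position i with some position in i..n-1 (permuteRestricted).
    steps = [[(i, r) for r in range(i, n)] for i in range(n)]
    return _tally(n, product(*steps))


def count_unrestricted(n):
    # Each step swaps two arbitrary positions (permuteUnrestricted).
    pairs = list(product(range(n), repeat=2))
    return _tally(n, product(pairs, repeat=n))


def count_usual(n):
    # Step i swaps position i with an arbitrary position
    # (permuteUnrestricted_usual).
    steps = [[(i, j) for j in range(n)] for i in range(n)]
    return _tally(n, product(*steps))


if __name__ == "__main__":
    n = 3  # the number of paths grows very fast; keep n small
    for name, counter in [("restricted", count_restricted(n)),
                          ("unrestricted", count_unrestricted(n)),
                          ("usual", count_usual(n))]:
        print(name, sum(counter.values()), "paths:", dict(counter))
```

For n = 3 the restricted tally has 6 paths and reaches each of the 6
permutations exactly once. The unrestricted tally has 729 paths and the
"usual" one has 27; neither total is a multiple of 6, so neither can split
evenly across the permutations, which is exactly the point of the counting
argument.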