We Need An R5.1RS
January 15, 2009 at 8:49 pm | In idiocy, retracted | 11 Comments | Tags: retracted
Disclaimer: If you absolutely love R6RS, think it is just the greatest thing since sliced bread, and look back on R5RS as the dreary chronicles of an unenlightened dark age, you are guaranteed to disagree with the points brought up herein. But maybe you have your doubts. Maybe you wonder if everything was handled properly, or if maybe tripling the total length of the specification from 50 pages to 161 was a bad move.
And you may be right to doubt. R6RS was only ratified by 65.7% of voters. Of the 13 respondents to a survey of Scheme implementors, only 3 expressed an intention to support R6RS, and 2 registered their outright rejection of the standard, with Felix Winkelmann of Chicken Scheme saying flat out “R6RS must die.”
It’s sad. Scheme is such a beautiful language. I began my Lisp forays with Common Lisp, but was quickly drawn away by the unparalleled purity and simplicity of Scheme. I would code everything in Scheme if I could. And part of what makes Scheme such a wonderful base to build upon is its philosophy of small, orthogonal features that can be composed into novel and interesting shapes.
The Problem
(As I See It)
R6RS strays from this way. It tries to deliver too much, and a lot of what it provides would be better implemented as an optional library on top of the Scheme base. If you want a language that provides you with everything under the sun you can use Python, or if you just can’t bring yourself to give up the parentheses, Common Lisp is that way.
While I agree that Scheme could use a few more features, I believe that R5RS is much more versatile than it was given credit for. Undoubtedly it has a few rough spots, but I think that they can be resolved with far fewer changes than R6RS inflicted upon the language.
Supporting my point, the section of R6RS devoted to libraries is the same length as the entire language overview. Gone are the days when the entirety of Scheme (with the possible exception of continuations, which still give me trouble) could be learned in a day or two. Understanding the semantics of the library system seems like such a daunting task that many neophytes could be turned away altogether.
But all this complexity doesn’t really gain us as much as we might think. As I see it, the main features of R6RS worth getting excited over are:
- A standardized library system
- Unicode support
- The ability to use it in script files
My Proposal
I wonder if maybe we should just add a few small changes to R5RS. Nothing extravagant, just the bare minimum necessary to implement all the goodies that R6RS promised us. We need a point upgrade to R5RS. An R5.1RS.
Goals:
I used several criteria while brainstorming possible modifications to the R5RS Scheme language. I wanted them to be simple to understand, easy to implement, and genuinely useful in a variety of contexts.
- Simplicity – The proposal defines the most minimal set of changes that will allow the desired functionality to be implemented in a portable, standard way.
- Generality – The proposed changes need to be useful in many situations, not just for aping functionality already defined in R6RS.
- Competition – The changes should not dictate any particular way of using them; they should simply make a few things possible that were not before. This allows people to make use of the building blocks in their own ways, in the hopes that the best solution will be the one that ultimately wins out.
Proposed Changes:
My proposal defines four modifications. Taken together, they will allow for the creation of portable, elegant library and module systems, allow Scheme to be used for system scripts in a standard way, and make it easier for people to write Scheme code in languages with alphabets outside of standard ASCII.
- Define all Scheme programs to be UTF-8 encoded Unicode.
  - Rationale: Unicode is becoming more and more prevalent as people all around the world begin to use computers, and our programming languages should reflect this fact. UTF-8 should be the encoding of choice because of its ASCII transparency: existing ASCII source files are already valid UTF-8, while any other encoding would require conversion before they could be edited.
- Define a SCHEME-IMPLEMENTATION function, which returns a pair (SCHEME-NAME . VERSION). SCHEME-NAME will be a symbol representing the name of the currently running implementation, and VERSION will be a number corresponding in some way to the version of the implementation.
  - Rationale: Most implementations of Scheme provide some additional functionality above and beyond what is explicitly required by the R5RS spec. To allow portable libraries to make use of this functionality where it is available, some straightforward mechanism should be made available for libraries to determine what functions are available to them. This approach avoids the difficulties of actually providing functions to determine what extensions are available, at the expense of requiring some amount of manual porting by library creators.
- Define ‘#!’ to begin a comment that runs to the end of the line, when encountered at the beginning of a line of text.
  - Rationale: Scheme can only stand to gain from becoming more useful as a general-purpose scripting language. Allowing Unix shebangs to be placed in the source file will aid this goal, and since many Schemes already allow files to be executed when passed as command line arguments, the modifications needed are minimal. Requiring that ‘#!’ only start a comment at the beginning of a line helps to avoid inadvertent breakage of programs which may use those characters in identifiers, as identifiers are generally preceded either by an opening parenthesis or by whitespace.
- Modify the definition of the LOAD function to state that it shall return a list containing the return values of all top-level forms in the loaded file. This list will of course include, if applicable, the lists returned by other LOAD calls in the file.
  - Rationale: The design of a module system requires that some provision be made to allow it to handle the code inside of modules. The only portable way this can be achieved currently would require the use and modification of a global variable. While it may be possible to account for variations in return values from many existing interpreters, it would be much nicer to be able to expect some sort of standard return values from the function.
Going Forward
Implementation Considerations
The SCHEME-IMPLEMENTATION function should be relatively straightforward to support in any implementation. It simply needs to return a pair of values, and this should be well within the capabilities of any popular implementation. On the other hand, this could be implemented as an external library through some fancy checking of various unstandardized features of the language, but this solution is unappealing in the long term.
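To make the idea concrete, here is a minimal sketch of how a compatibility library might consult SCHEME-IMPLEMENTATION. Everything in it is an assumption for illustration: no existing Scheme provides this procedure, so a stub stands in for it, and the implementation names, version numbers, and the `known-support` table are invented.

```scheme
;; Hypothetical stub. Under the proposal the implementation itself would
;; supply this procedure; the name and version returned here are made up.
(define (scheme-implementation)
  (cons 'example-scheme 1.0))

;; Illustrative table: implementation name -> earliest version assumed to
;; provide some extension. These entries are not real data.
(define known-support
  '((example-scheme . 0.9)
    (other-scheme . 2.0)))

;; Takes a (SCHEME-NAME . VERSION) pair and returns #t only when the
;; implementation is listed and recent enough. Unknown implementations
;; get #f, so the caller can fall back to plain R5RS code.
(define (extension-available? impl)
  (let ((entry (assq (car impl) known-support)))
    (and entry
         (>= (cdr impl) (cdr entry)))))

;; Ask about the running implementation.
(extension-available? (scheme-implementation))
```

A library would branch on this result once, at load time, and hide the difference behind its own procedures, so application code never needs to look at the pair directly.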
The LOAD function should be relatively easy to modify, but this may be rendered difficult by current implementations’ differing usage of the return value. If enough implementations provide some sort of mechanism to retrieve at least one value from the loaded file, that may suffice to program a good module system. As with the SCHEME-IMPLEMENTATION function, though, it would be nice to have a standard behaviour to expect.
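As a rough sketch of what the modified LOAD could enable, the code below emulates it in user code. LOAD-VALUES and MODULE-EXPORTS are hypothetical names, not part of any standard. The sketch also assumes that the optional R5RS procedure INTERACTION-ENVIRONMENT exists, that EVAL accepts definitions in that environment, and that CALL-WITH-OUTPUT-FILE may overwrite the demo file; none of these is guaranteed across implementations.

```scheme
;; Hypothetical stand-in for the proposed LOAD: evaluates every top-level
;; form in FILENAME and returns the list of their values, in order.
;; Forms such as DEFINE contribute an unspecified value to the list.
(define (load-values filename)
  (call-with-input-file filename
    (lambda (port)
      (let loop ((form (read port)) (results '()))
        (if (eof-object? form)
            (reverse results)
            (let ((value (eval form (interaction-environment))))
              (loop (read port) (cons value results))))))))

;; One possible module convention (an assumption, not part of the
;; proposal): the last top-level form of a file evaluates to an
;; association list of exported names. An empty file is an error here.
(define (module-exports filename)
  (car (reverse (load-values filename))))

;; Write a tiny demo module so that the example is self-contained.
(call-with-output-file "demo-module.scm"
  (lambda (port)
    (write '(define (double x) (* 2 x)) port)
    (newline port)
    (write '(list (cons 'double double)) port)
    (newline port)))

;; Load the module and call one of its exports.
(define exports (module-exports "demo-module.scm"))
(define double-from-module (cdr (assq 'double exports)))
(double-from-module 21)
```

Where those assumptions hold, the final expression should evaluate to 42. The point is only that a standard list of return values gives a module system something to work with, without routing everything through a global variable.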
The difficulty of implementing the new comment form may vary, but should be simple enough in all but the most arcane of hand-coded parsers. After all, Scheme already requires that ‘;’ be recognized as the start of a comment, and it shouldn’t be that much harder to define ‘#!’ similarly.
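The ‘#!’ rule can be sketched as a pre-pass over source text, assuming the text has already been read into a string. STRIP-HASH-BANG-LINES is a hypothetical helper, and `some-scheme` is a made-up interpreter name; a real implementation would more likely handle this inside its reader.

```scheme
;; Returns a copy of TEXT in which every line that begins with "#!" is
;; replaced by an empty line, so that line numbers stay the same.
;; A "#!" that appears later in a line is left untouched.
(define (strip-hash-bang-lines text)
  (let loop ((chars (string->list text))
             (at-line-start #t)
             (skipping #f)
             (out '()))
    (cond ((null? chars)
           (list->string (reverse out)))
          ;; A newline ends any comment and starts a new line.
          ((char=? (car chars) #\newline)
           (loop (cdr chars) #t #f (cons #\newline out)))
          ;; Inside a #! comment: drop the character.
          (skipping
           (loop (cdr chars) #f #t out))
          ;; "#!" at the start of a line opens a comment.
          ((and at-line-start
                (char=? (car chars) #\#)
                (pair? (cdr chars))
                (char=? (cadr chars) #\!))
           (loop (cddr chars) #f #t out))
          ;; Anything else is copied through unchanged.
          (else
           (loop (cdr chars) #f #f (cons (car chars) out))))))

;; R5RS string literals have no newline escape, so build one explicitly.
(define nl (string #\newline))

(define script
  (string-append "#!/usr/bin/env some-scheme" nl
                 "(display (quote hello))" nl
                 "(list a #!b)"))

(strip-hash-bang-lines script)
```

Only the first line of `script` is blanked. The ‘#!’ inside the last line survives, which is the behaviour the beginning-of-line restriction is meant to guarantee.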
The Unicode point is possibly the hardest, and also maybe the most worthy of deferring until later. On the one hand, few could say that there would be no use to adding it as an expected feature of the language, but on the other hand, it is hardly necessary to consider it alongside the other more trivial modifications suggested herein. Yet another consideration, of course, is the general desirability of Unicode support to a Scheme implementation. Even if it is not required that Unicode be supported, it is still likely that more and more languages will support it as time progresses.
Building Blocks
The LOAD and SCHEME-IMPLEMENTATION modifications suggested, while individually rather small, together allow significant progress to be made in the areas of portability and versatility for the Scheme language. With the use of the newly-defined return values for the LOAD function, new module systems can be written with ease. An easy to use, standardized mechanism for determining the current implementation allows portable libraries to be written with ease, by allowing them to know ahead of time what additional functions and capabilities are provided to them, without any of the roundabout methods currently required. Adding the new commenting form makes it laughably easy to use Scheme as a scripting language on a level with Python, Perl, or Bash scripts.
Wrapping Up
I realize that my knowledge of Scheme is much more limited than many, so I ask you, Internet, do these recommendations seem like a sensible idea? Are there issues that I simply don’t realize inherent to what I suggest? Do I simply fail to understand the difficulty of implementing these suggestions?
I think that my ideas could work. I really believe that this handful of little changes could ease the way for a whole new level of innovation and exploration with Scheme. With the ability to easily implement new designs for module systems and libraries, I fully expect to see Scheme surpass the meager capabilities of Common Lisp’s ASDF, and then keep going.
I believe that this is worth a shot. Any Scheme implementors out there who would be willing to give it a try? If no one wants to implement it themselves, I fully understand, and have only one further question:
Would you be willing to let me give it a shot?
11 Comments
“Gone are the days when the entirety of Scheme (with the possible exception of continuations, which still give me trouble) could be learned in a day or two.”
This is the claim that annoys me most about people who are upset about the R6RS. The days in which ‘Scheme’ was coterminous with ‘the language used in SICP’ never existed. The original Scheme system could perhaps be understood in a couple of days, but that certainly wasn’t true of R5RS. For example, just understanding the macro system of R5RS is a project of several months, if not years. As you say, continuations are also complicated, and the interaction of continuations, state and the top level is a hopeless nest of tangled interactions. Internal define is also pretty messy.
Comment by Sam TH — January 15, 2009 #
Check out ERR5RS:
http://scheme-punks.cyber-rush.org/wiki/index.php?title=ERR5RS:Charter
Comment by Grant Rettke — January 15, 2009 #
I have looked at ERR5RS, but I take issue with the way that it intends to declare a single library system. I really think that the issue should be left open for a while longer, until we see a truly successful Scheme module system take off. Until then, I believe that it’s too early to precisely define what form a Scheme library system should take.
Comment by Will Donnelly — January 15, 2009 #
How do you define success for a module system?
PLT’s and Chez’s work just fine.
Comment by Grant Rettke — January 16, 2009 #
Have you looked at SRFI-0? Feature based expansion of code is a better idea than explicitly checking the name and version of a Scheme system.
Your proposed solution of retrieving the name and version of the Scheme system in use sounds very similar to browser sniffing, which is frowned upon because it breaks your code whenever an incompatible new version of an existing system is released, or a completely new system is released. I wouldn’t say it isn’t occasionally useful in highly specific situations, but that hardly makes it worth standardising, IMHO.
Comment by Peter Bex (sjamaan) — January 16, 2009 #
Grant:
Can I install a PLT module under Chez Scheme? That’s what I mean by “success”. Anything less is a compromise that the Scheme community shouldn’t have to accept.
Peter:
I like SRFI-0, but it doesn’t do anything to allow the user to use implementation-specific extensions reliably. Specifically, the syntax definition in the SRFI says:
<feature identifier> --> a symbol which is the name or alias of a SRFI
This implies to me that the solution will *only* apply when a SRFI has already been written for that feature set.
Besides that, I take issue with your statement that browser sniffing “breaks your code whenever an incompatible new version … is released”. This should not be true of a properly designed system, which should either degrade gracefully to a known standard featureset when it sees an unknown browser, or should very rightly give an error message if it requires a given nonstandard feature to accomplish its task.
I agree with you that such a system should not be used in every case, but that it is useful in “highly specific situations”, and I submit that this is just such a situation. The whole idea is to allow implementation-specific extensions to be safely used *before* they are standardized. If some such mechanism is not provided, how is the general mass of people to make use of them, and thereby make them popular enough to be standardized? Are we going to force them to only use one Scheme implementation, and leave the users of all the others out in the cold? Or are we to standardize on features before people have had a chance to use them in a variety of real-world situations?
It is a distasteful solution to me too, but it seems to be better than any available choices, including that of inaction. My hope would be that the task of using the information so gained would be mostly taken up by “compatibility” libraries, similar to the way that CFFI unifies the different foreign function interfaces found in Common Lisp-land.
Comment by Will Donnelly — January 16, 2009 #
Sorry about that; I got confused by Chicken’s feature system; it is based on SRFI-0, but it allows users and Chicken itself to register custom feature names that are not SRFI names. This is used for providing backwards compatibility when a procedure is moved to another library, for example, but also to select a macro implementation depending on the macro system that is loaded (syntax-case, syntactic closures, or defmacro style macros, for example).
I think this system, if properly standardized, could fulfill the need you have. It allows you to ask exactly what you want to ask: “does this system support feature X?”, instead of “is this scheme system A, B, C or D of which I happen to know they support feature X right now?”.
You say that version sniffing does not have to be a problem. But consider that now Scheme system E starts supporting this new feature. Does everybody now need to change their code and add system E to the list? What about a Scheme system that decides to drop support for the feature in a future release? Your code will definitely not be anticipating that.
I’ve seen version sniffing break numerous times in web browsers, so I don’t trust it one bit.
Comment by Peter Bex (sjamaan) — January 17, 2009 #
I agree that, if properly standardized, a feature-based system could work. The problem is, who gets to define the features, names, and usage definitions? Once you reach that point, you may as well just have an SRFI anyway.
As an example of my point, look at the way that these different Schemes implement subprocesses:
CHICKEN:
http://chicken.wiki.br/Unit%20posix#process
PLT:
http://docs.plt-scheme.org/reference/subprocess.html#(part._.Simple_.Subprocesses)
Guile:
http://www.gnu.org/software/guile/manual/html_node/Processes.html#Processes
Scheme48:
http://groups.csail.mit.edu/mac/projects/s48/1.8/manual/manual-Z-H-10.html#node_idx_704
Chez Scheme:
http://www.scheme.com/csug7/foreign.html#./foreign:h1
STklos:
http://www.stklos.net/Doc/html/stklos-ref-4.html#Processes
You will notice that most of these implementations are similar, falling into the groups of process-launching or forking. But even among these groups, there are still subtle differences among interfaces. If I call (process "…"), what do I get back? Is it a three- or four-element list? Is it some special “process” object? And if I fork, how am I supposed to talk to the subprocess?
I couldn’t rightly justify supporting any of these solutions as the “definitive” standard over all the rest, so neither can I justify recommending such a feature system until that issue can be worked out.
Possibly some sort of “compatibility list” would be a good idea. That way, the implementor of a new Scheme could simply look out there, and if their semantics are compatible with some other Scheme (as are, for the most part, CHICKEN and Chez Scheme’s PROCESS definitions), they could simply register the feature, for example: 'CHICKEN-PROCESS-COMPATIBLE
I don’t pretend to have any answers here. I’m mostly just trying to think out loud through the consequences of the different choices, and hopefully get other people to think along similar lines as well.
Comment by Will Donnelly — January 17, 2009 #
I think I understand what you mean. Hacking around in applications through testing scheme versions seems like a worse solution to me than simply being unportable. Of course, you might disagree (and I’m sure you do), but I think the Right Thing to do here is propose a real SRFI.
If the end goal is to have truly portable Scheme programs, we should look into more long-term solutions. I’m very afraid that introducing a hook like this just helps maintain the status quo even more.
Comment by Peter Bex (sjamaan) — January 17, 2009 #
One way to help people think through and standardize on an approach is to work on the next standard itself, R7RS.
Comment by Grant Rettke — January 18, 2009 #
Peter:
While in principle I agree with you, in practice it seems that relying on implementations to adhere perfectly to the specifications is fraught with danger.
Have you ever tried running “(scheme-report-environment 5)” on different implementations? MzScheme, Guile, and MIT Scheme all throw errors because the procedure isn’t defined. And that’s just from R5RS, which has had a decade to be implemented.
While, arguably, MzScheme never makes any claims to R5RS compatibility, Guile and MIT Scheme certainly do, and I’m not prepared to just dismiss any of them out of hand for failing to implement a function.
But now in the absence of a decent way to get a Scheme environment, it becomes impossible to call EVAL in a portable fashion.
I think you may misunderstand me a little. I do not propose implementation detection features for the use of arbitrary applications (though undisciplined coders may use them that way), but rather to ease the creation of compatibility libraries that will allow Scheme code to be portable across implementations.
Until I can call EVAL on my top nine implementations without three of them throwing an error, I’m afraid I must still insist that some sort of detection abilities are necessary.
I do agree that we should look into long-term solutions, and I have been convinced inasmuch as I now believe we shouldn’t enshrine such a hack in a standard. Thankfully, it seems that the various unspecified bits of R5RS offer enough variation to reliably detect Scheme implementations in user code.
Comment by Will Donnelly — January 18, 2009 #