"Initialize right away" --> "Make it hard to figure out what your variables are by mixing up code with their declarations."
"Construct new values" --> "Manually do what the first pass of the compiler already does. Except don't use the const keyword."
"Make functions, not procedures" --> "Don't check for errors. Don't even make it possible to check for errors."
"Make your objects immutable" --> "Let's see how much time we can spend in malloc!"
"Use purely functional data structures" --> "Arrays and hash tables are often the optimally performant data structures... but you shouldn't use them."
"When you can't help it" --> "If all your other attempts to sabotage your code fail to provide the required performance degradation, you can always slow your program down by copying data around unnecessarily."
In all seriousness, just like the famous "GOTO considered harmful", the author's "assignment statement is harmful" is an assertion which is valid in some situations -- quite possibly most or all of the situations he personally has to deal with -- but most definitely not valid in all circumstances.
In his book "Zen of Assembly Language", Michael Abrash has a chapter named "Billy, don't be a compiler" and tells the story of a man who coded Assembly like a C compiler.
I apologize in advance, but I can't help but extend to you the same advice, cperciva.
"Make it hard to figure out what your variables are by mixing up code with their declarations."
In the right language, variables are usually bound by a LET construct, and the code within the scope where the variable is visible is indented inward. When you nest LETs you can see the liveness of each variable.
"Construct new values" --> "Manually do what the first pass of the compiler already does. Except don't use the const keyword."
He misspoke. He meant construct new variables. I am sure you would agree with him that a 'promiscuous tmp' is a royal pain in the ass when you're stepping through bad C code.
"Make functions, not procedures" --> "Don't check for errors. Don't even make it possible to check for errors."
Again, you're thinking at an abnormally low level of abstraction. Results are not for error handling; that's what the condition system is for (some erroneously call it an exception handling system).
"Make your objects immutable" --> "Let's see how much time we can spend in malloc!"
That's what the GC is for. When you make objects immutable you're buying yourself the peace of mind that comes with referential transparency.
"Use purely functional data structures" --> "Arrays and hash tables are often the optimally performant data structures... but you shouldn't use them."
There are ways to implement functional arrays and hashtables, but none too apparent without getting into academic wankery. I agree on this point.
Imperative exception handlers, like in C++, are a lot like glorified longjumps. When learning Erlang I was pleased that a more functional language can handle exceptions much more neatly--if a given expression causes an exception, the value of that expression is the exception, and that recurses in a certain way. It's conceptually less like exception handling in an imperative language and much more like an automatically mediated means of returning error codes.
Language specific? Could you specify the languages you have in mind? I used a C/C++/Java syntax, but I meant any language. Or at least the ones with garbage collection.
"Initialize right away" --> "Make it hard to figure out what your variables are by mixing up code with their declarations."
No. Putting all declarations at the top of a function makes it hard to figure out. If you can use C99, you should keep declarations and initializations together and hopefully as close to where they are actually used as possible.
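As a minimal sketch of that C99 style (the `mean` helper is hypothetical, not taken from the article), each variable is declared at the point where it first receives a meaningful value:

```c
#include <stddef.h>

/* Hypothetical helper, not from the article: mean of n samples.
   Each variable is declared where it first gets a meaningful value. */
double mean(const double *samples, size_t n)
{
    if (n == 0)
        return 0.0;                      /* early exit: nothing declared yet */

    double sum = 0.0;                    /* declared and initialized together */
    for (size_t i = 0; i < n; i++)       /* i is scoped to the loop (C99) */
        sum += samples[i];

    const double result = sum / (double)n;   /* never reassigned */
    return result;
}
```

Nothing here is visible before it holds a valid value, so there is no gap between declaration and initialization to search for.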
One thing I'd add about the article: naming a function "move" and then having it return the sum of two vectors is bizarre and worse than the original implementation. At least, move as part of the Point class actually moved the point.
In a 50 line function with 5 variables, I find it much easier to look at 5 lines of variable declarations to figure out how I declared a variable than to scan through up to 50 lines of code.
So, declaring variables at the top of the function makes it easy to find them. Now how do you find the point of initialization?
declare x |
... | ...
initialize x | declare & initialize x
... | ...
use x | use x
In both cases, you will have to scan the whole function, or hit the "search" hotkey. Or maybe sometimes you just need to know where `x` was declared, but not what value it holds?
> One thing I'd add about the article: naming a function "move" and then having it return the sum of two vectors is bizarre and worse than the original implementation. At least, move as part of the Point class actually moved the point.
This makes sense in MVC where the controller is responsible for updating point coordinates and the view is responsible for the actual rendering of the points on the display.
Also, in most graphic systems, the screen refresh is done by a single call made from within the interaction loop or by a designated callback. That's because all display updates are made to a back buffer, which then gets written to the framebuffer at a regular clock-driven interval.
> naming a function "move" and then having it return the sum of two vectors is bizarre and worse than the original implementation.
I kept the name for the sake of consistency. I agree that's a mistake. The problem is that a `Point` should be moved with a `Vector`, and not another point. But introducing a `Vector` class would have lengthened my tutorial. Maybe I should use two `float`s instead of a `Point`.
That example bothered me as well mostly because when the move function is placed outside of the class it is now performing a different action. The new function no longer really moves a point so much as it adds two points (regardless of the point/vector issue).
I think many of the naming conventions commonly used in OOP for mutable objects are not as appropriate for their immutable counterparts.
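One possible naming for the immutable version, as a rough sketch in C: the `Vector` type and the name `point_translated` are assumptions for illustration, since the article only has a `Point` and a `move`.

```c
/* Hypothetical types for illustration; the article only has a Point. */
typedef struct { float x, y; } Point;
typedef struct { float dx, dy; } Vector;

/* Returns a new Point; the argument is passed by value and left untouched.
   The past participle "translated" hints that a new value comes back. */
Point point_translated(Point p, Vector v)
{
    Point result = { p.x + v.dx, p.y + v.dy };
    return result;
}
```

The caller writes `q = point_translated(p, v)` and `p` keeps its old coordinates, which is what the name suggests.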
Error checking can be done in a function-composing style if your functions check for error on their inputs, recognize error-indicating values on their parameters, and consistently output error-indicating values.
There is definitely waste involved in this--when composing functions this way, you have extra function calls and error-code checking on your parameters rather than checking data along the way and bailing out if there's a problem. Your functions have to check the arguments rather than having their preconditions enforced outside of them, which makes the composed functions less elegant. I don't know how practical it is in the end, but pushing most of the error-checking complexity into lower level functions and writing your upper level code in a compositional style sure seemed elegant every time I tried it.
> Error checking can be done in a function-composing style if your functions check for error on their inputs, recognize error-indicating values on their parameters, and consistently output error-indicating values.
True -- but this is only possible if you accept layering violations. To take an extreme example, how is printf() supposed to know that if a string parameter is NULL then it should error out without printing anything?
Your example is wrong. Function parameters should be guaranteed to be error-free in the first place. If you decided that a NULL string was an error condition, you should test for that before calling printf.
If you have to check your return values yet you can't check your arguments, how do you compose functions? foo(bar(x)) requires foo to check its argument, since bar(x) could return an error code and foo() needs to know how to deal with it.
You don't have to check all possible instances of spurious arguments--you should be allowed to specify some preconditions--but you definitely have to check for error codes!
Example: Suppose we're writing a basic filesystem. We have the following primitive functions:
unsigned long getino(char *pathname): For a given designated device and path, get the inode number of the designated file. Return 0 (there is no inode number 0) in case of error.
MINODE *iget(unsigned long ino): For a given designated inode number, return a pointer to a data structure in memory containing the INODE data; allocated if necessary. Return NULL in case of error.
You can either test your return values from these functions, in which case you will do this:
How much argument checking has to go into iget()? Exactly this much:
if (!ino) return NULL;
It seemed like a neat idea at the time, and considering I called that particular composition of functions a dozen times in that project, there was a lot less repeating myself in teaching my functions to understand error codes.
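To make the two styles concrete, here is a sketch with stand-in stubs. The `MINODE` layout, the stub bodies, and the `lookup_checked` / `lookup_composed` names are all assumptions; the real filesystem code is not shown in this thread. The `const` on `pathname` is also an addition.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in type and stubs; the real getino()/iget() are not shown here. */
typedef struct { unsigned long ino; } MINODE;

static MINODE root_inode = { 2 };

/* Stub: only "/" exists. Returns 0 on error, as described above. */
unsigned long getino(const char *pathname)
{
    if (pathname == NULL || strcmp(pathname, "/") != 0)
        return 0;
    return root_inode.ino;
}

/* Stub: understands the error value of getino() and propagates it. */
MINODE *iget(unsigned long ino)
{
    if (!ino)
        return NULL;                 /* the one check discussed above */
    return (ino == root_inode.ino) ? &root_inode : NULL;
}

/* Style 1: test every return value at the call site. */
MINODE *lookup_checked(const char *pathname)
{
    unsigned long ino = getino(pathname);
    if (ino == 0)
        return NULL;
    return iget(ino);
}

/* Style 2: compose; the error value flows through iget(). */
MINODE *lookup_composed(const char *pathname)
{
    return iget(getino(pathname));
}
```

Both lookups return the same pointer for every input; the composed one simply moves the `ino == 0` test from each call site into `iget()`.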
unsigned long getino(char * pathname):
For a given designated device and path, get the
inode number of the designated file. Return 0
(there is no inode number 0) in case of error.
Completely off-topic, but there is a bug in this prototype: Inode numbers have type ino_t, not type unsigned long.
(Ok, two bugs, if you count the fact that there can theoretically be an inode number 0. And three bugs if you count the fact that pathname should be declared as a (const char *). But the ino_t vs. unsigned long bug is really bad if you care about portability.)
You're right on all counts, and if I were writing a real filesystem rather than a class assignment I would have taken these issues more seriously.
(edited to add paragraph:) Nothing I've said in this discussion should be taken as a strongly held opinion about anything--I'm rather shy about sharing these thoughts in the first place, but considered this particular example illustrative of a possibility I take seriously.
You do not want to see the other horrors that particular professor of mine has perpetrated, incidentally.
In general, I prefer not to have to check for arguments. In your case, the right™ way to do it is probably monads: they help you separate the error checking from the "normal" path.
The easiest way is to treat printf() as a procedure rather than a function, and to restrict your use of compositional style to cases where it's still a good idea.
Thank you for your feedback, I will amend my entry.
> "Make functions, not procedures" --> "Don't check for errors. Don't even make it possible to check for errors."
I will make my tutorial clearer on this point: errors can be embedded in results. Option types can do wonders. Of course, this is not easy to do in C.
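For what it's worth, a crude approximation of an option type is possible in C, though nothing forces the caller to look at the flag. This is only a sketch; `OptionInt`, `safe_div`, and `safe_div_opt` are hypothetical names.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical option type: a value plus a flag saying whether it is valid. */
typedef struct {
    bool has_value;
    int  value;
} OptionInt;

static OptionInt some_int(int v) { OptionInt o = { true, v };  return o; }
static OptionInt none_int(void)  { OptionInt o = { false, 0 }; return o; }

/* Integer division that reports failure in its result instead of crashing.
   INT_MIN / -1 is rejected too, because it overflows. */
OptionInt safe_div(int num, int den)
{
    if (den == 0 || (num == INT_MIN && den == -1))
        return none_int();
    return some_int(num / den);
}

/* Accepts an option so calls compose: an empty input yields an empty output. */
OptionInt safe_div_opt(OptionInt num, int den)
{
    if (!num.has_value)
        return num;
    return safe_div(num.value, den);
}
```

A chain such as `safe_div_opt(safe_div(a, b), c)` then needs a single `has_value` test at the end rather than one after each step.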
My opinion is that programming in C means either dealing with legacy code, seeking utmost performance, or being nuts. I think that leaves plenty of room for reasonable programming. I never said that my assertion is valid in all circumstances. Actually I agree with you here. But I would be extremely surprised if my advice couldn't be followed by most programmers, most of the time.
Actually, it's quite simple. The issue is that debugging becomes a bitch, and error codes are hidden even from the developer inside nested function calls, i.e. you can't stop or recover from errors right away.
I can't agree - you don't need results of functions for checking errors - in many higher level languages you can return multiple values, you can use exceptions, even in C++ without exceptions you can do this by in-out parameters, for example:
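A minimal sketch of the out-parameter style, written in C rather than C++ and using a hypothetical `parse_long` helper: the return value carries only success or failure, and the actual result comes back through a pointer.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: the return value reports success or failure,
   the parsed number comes back through the out parameter. */
bool parse_long(const char *text, long *out)
{
    if (text == NULL || out == NULL)
        return false;

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Reject empty input, trailing garbage, and out-of-range values. */
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;

    *out = value;                    /* written only on success */
    return true;
}
```

The caller checks the boolean right at the call site, and `*out` is left untouched when parsing fails.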
I agree. But the little code that I write[1] outside of my day job is not in C. Also, I didn't mean C, but any language with garbage collection. I used C syntax mainly to reach mainstream imperative programmers.
A useful summary (eg. I found my experimental code from last year a lot easier to understand when it used functions rather than procedures) - though I couldn't help laughing when he started saying "mistakes", "correct" and "incorrect".
Ah… Do you think I should change my wording? Because if it made you laugh, it could also discredit my point in the eyes of those who don't agree with me in the first place.
The wording comes across as dogmatic; yet the article isn't actually dogmatic - it sincerely acknowledges cases where the functional approach isn't suitable (even including the really cool algorithm of quicksort - BTW you might mention games as another one where mutable state is convenient). Showing respect for alternatives (where due) makes an article seem objective, impartial, intellectual and truth-seeking (rather than pushing an agenda; or trying to sell something). It would probably be beneficial if the wording reflected your intent.
I did find it distracting, because (to be frank) people don't like to be told what they should be doing; laughter is a way to defuse the affront.
Another approach is to appeal to those (few) people who are keen to learn about techniques that will help them. And I really think that that's what you were going for. Changing the wording would help achieve that aim, in my opinion.
So... yes.
btw: I really liked your explanation of linked lists being cheap to modify; and the article in general is very well done and communicates very effectively - I'd say it's of textbook standard.