Self.Learn(Ruby);


Note: A lot of this is CTRL+C, CTRL+V. I have just tried to consolidate the material into a single article, so that I can come back and read it coherently in one place instead of jumping across a dozen pages and getting lost. I do not claim credit for any of the material here.

Dynamic languages are gaining popularity. In fact, the CLR ‘will support’ / ‘supports’ IronRuby and IronPython via the Dynamic Language Runtime. Why are dynamic languages becoming so popular? Let us try to find the answers in the near future.

First, let us try to set some terminology straight (I am not a computer scientist, so please bear with me).
Type: ‘A type is a label used by a type system to prove some property of the program’s behavior’ [1]. Very helpful, isn’t it? What it really means (or at least how I understood it) is easiest to see in the C# line below.

int i;

Here i has been labelled as an int. The label proves that its value will fall within the range Int32.MinValue to Int32.MaxValue, that only certain operations are valid on it, and so on.

It is also helpful to quote a couple of other definitions of type.
‘Perhaps a type is just a set of possible values. Perhaps it is a set of operations.’ Both are from [1] again.
‘A type is metadata about a chunk of memory that classifies the kind of data stored there. This classification usually implicitly specifies what kinds of operations may be performed on the data.’ [2]
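
The ‘set of operations’ view is easy to poke at in Ruby, where every value can be asked which class labels it and which messages it responds to. This is only a small sketch; the describe method is a name I made up for illustration.

```ruby
# Ask a value which class labels it and which operations that class allows.
# `describe` is a hypothetical helper, not part of Ruby itself.
def describe(value)
  {
    type: value.class,
    can_add: value.respond_to?(:+),
    can_upcase: value.respond_to?(:upcase)
  }
end

p describe(42)      # an Integer responds to + but not to upcase
p describe("ruby")  # a String responds to both + and upcase
```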

Type System:
‘A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute’ [1]. Chris Smith further explains the definition as follows:

  • syntactic method .. by classifying phrases: A type system is necessarily tied to the syntax of the language. It is a set of rules for working bottom up from small to large phrases of the language until you reach the result.
  • proving the absence of certain program behaviors: This is the goal. There is no list of "certain" behaviors, though. The word just means that for any specific type system, there will be a list of things that it proves. What it proves is left wide open. (Later on in the text: "… each type system comes with a definition of the behaviors it aims to prevent.")
  • tractable: This just means that the type system finishes in a reasonable period of time. Without wanting to put words in anyone’s mouth, I think it’s safe to say most people would agree that it’s a mistake to include this in the definition of a type system. Some languages even have undecidable type systems. Nevertheless, it is certainly a common goal; one doesn’t expect the compiler to take two years to type-check a program, even if the program will run for two years.

Types of typing:
When the type checking happens (proving the absence of certain program behaviors)
Static: Compile-time
Dynamic: Run-time
How much the programmer can circumvent the type system (this one is debatable though)
StrongTyping: Values are strongly associated with a type, and this type cannot be changed. An int is an int, no matter what lexical variable points at it. Conversions from one type to another are accomplished by constructing a new data object out of the old one, not by converting the same data value to a new type. Many languages with DynamicTyping use this. Scheme is an example of a language with StrongTyping and DynamicTyping.
WeakTyping: Values are not strongly associated with a type. Usually, they are just values that may be interpreted in a variety of ways depending on the lexical variable that references them.

     int *my_int = malloc( sizeof( int ) ); // Make a memory address on the heap.
     char *my_char = (char *)(my_int); // Note the typecast, a hallmark of this kind of programming
     int **my_ptr_int = (int **)(my_int); // Another type.

All of this is from [3].
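
For comparison, Ruby seems to sit in the same corner as Scheme: the check happens at run time, yet a value keeps its type. Here is a minimal sketch of that, where add_unconverted and add_converted are names I invented for illustration.

```ruby
# Mixing an Integer and a String is rejected, but only when the line runs.
def add_unconverted(number, text)
  number + text
rescue TypeError => e
  "TypeError: #{e.message}"
end

# Conversion builds a new Integer object; the original String is untouched.
def add_converted(number, text)
  number + Integer(text)
end

puts add_unconverted(1, "2")  # prints the TypeError message instead of a sum
puts add_converted(1, "2")    # prints 3
```

Nothing here reinterprets the same chunk of memory, which is the contrast with the C casts above.
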
Whether there is some assumed sub typing?
Nominative Typing [8]: Types of values are differentiated by programmer-specified names (two differently-named types which are arrays of 5 ints are different types).
Structural Typing [8]: Types are defined by the structure of values (two differently-named types which are arrays of 5 ints are the same type).

Duck Typing: Similar to structural typing, but distinct from Structural typing, because in Structural typing schemes type compatibility and equivalence are determined by the type’s structure; whereas in duck typing, type compatibility and equivalence are determined only by that part of a type’s structure that is accessed.[7]
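
A small Ruby sketch of the ‘only the part that is accessed’ idea. The Duck and Robot classes and the greet method are hypothetical, made up just for this example.

```ruby
class Duck
  def speak
    "Quack"
  end

  def swim
    "paddles"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# Only #speak is accessed, so that is the only part of an object's
# structure that has to match. Robot has no #swim and nobody cares.
def greet(thing)
  thing.speak
end

puts greet(Duck.new)   # prints Quack
puts greet(Robot.new)  # prints Beep

begin
  greet(42)            # an Integer has no #speak
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```

A structural type system would compare the whole shape of Duck and Robot; here the mismatch only surfaces if the missing method is actually called.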

Whether each variable declaration must have a type
Manifest/Implicit: Whether each variable must have a type declaration or not. Most static languages have manifest types, and most dynamic languages have implicit types. However, there are exceptions: Haskell and the dialects of ML can infer the type of any variable based on the operations performed on it, with only occasional help from an explicit type. Also, the GNU/NeXT/Apple extensions to Objective-C permit optional manifest typing [4].
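
In Ruby the implicit side looks roughly like this: a variable is never declared with a type, and the class travels with the value instead. The classes_seen method is just a made-up wrapper for the illustration.

```ruby
# No declaration anywhere: the name is simply bound to whatever value
# is assigned, and the value itself knows its class.
def classes_seen
  seen = []
  thing = 42
  seen << thing.class
  thing = "forty-two"
  seen << thing.class
  thing = [4, 2]
  seen << thing.class
  seen
end

p classes_seen  # [Integer, String, Array]
```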

A lot of these classifications are not binary anymore. The distinctions are merging day by day.

Next let us start with the common myths

  • Dynamic languages means dynamic typing:
    The reason this myth exists is that languages do not give you a choice. Either they require static types everywhere or don’t support them at all. In a language that supported static typing but made it optional, you could leave out types when writing new, informal code, but declare types for standard interfaces that are widely used. Would that give you the best of both worlds? The Dylan language seems to have this feature [5]. Microsoft too is working on bridging the static-dynamic divide [6].
  • Dynamic Languages mean weak typing:
    No. Ruby and Python are strongly typed. Static versus dynamic and strong versus weak are orthogonal classifications.

    "The statement made at the beginning of this thread was that many programmers have used dynamically typed languages poorly.  In particular, a lot of programmers coming from C often treat dynamically typed languages in a manner similar to what made sense for C prior to ANSI function prototypes.  Specifically, this means adding lots of comments, long variable names, and so forth to obssessively track the type information of variables and functions.

    Doing this prevents a programmer from realizing the benefits of dynamic typing.  It’s like buying a new car, but refusing to drive any faster than a bicycle.  The car is horrible; you can’t get up the mountain trails, and it requires gasoline on top of everything else.  Indeed, a car is a pretty lousy excuse for a bicycle!  Similarly, dynamically typed languages are pretty lousy excuses for statically typed languages.

    The trick is to compare dynamically typed languages when used in ways that fit in with their design and goals.  Dynamically typed languages have all sorts of mechanisms to fail immediately and clearly if there is a runtime error, with diagnostics that show you exactly how it happened.  If you program with the same level of paranoia appropriate to C – where a simple bug may cause a day of debugging — you will find that it’s tough, and you won’t be actually using your tools.

    (As a side comment, and certainly a more controversial one, the converse is equally true; it doesn’t make sense to do the same kinds of exhaustive unit testing in Haskell as you’d do in Ruby or Smalltalk.  It’s a waste of time.  It’s interesting to note that the whole TDD movement comes from people who prefer dynamically typed languages…)" [1].

  • Static types imply type declarations:
    The thing most obvious about the type systems of Java, C, C++, Pascal, and many other widely-used "industry" languages is not that they are statically typed, but that they are explicitly typed. In other words, they require lots of type declarations. (In the world of less explicitly typed languages, where these declarations are optional, they are often called "type annotations" instead. You may find me using that word.) This gets on a lot of people’s nerves, and programmers often turn away from statically typed languages for this reason.

    This has nothing to do with static types. The first statically typed languages were explicitly typed by necessity. However, type inference algorithms – techniques for looking at source code with no type declarations at all, and deciding what the types of its variables are – have existed for many years now. The ML language, which uses it, is among the older languages around today. Haskell, which improves on it, is now about 15 years old. Even C# is now adopting the idea, which will raise a lot of eyebrows (and undoubtedly give rise to claims of its being "weakly typed" — see definition above). If one does not like type declarations, one is better off describing that accurately as not liking explicit types, rather than static types.

    (This is not to say that type declarations are always bad; but in my experience, there are few situations in which I’ve wished to see them required. Type inference is generally a big win.) [1]

  • Static types imply upfront design or waterfall methods:
    Some statically typed languages are also designed to enforce someone’s idea of a good development process. Specifically, they often require or encourage that you specify the whole interface to something in one place, and then go write the code. This can be annoying if one is writing code that evolves over time or trying out ideas. It sometimes means changing things in several different places in order to make one tweak. The worst form of this I’m aware of (though done partly for pragmatic reasons rather than ideological ones) is C and C++ header files. Pascal has similar aims, and requires that all variables for a procedure or function be declared in one section at the top. Though few other languages enforce this separation in quite the same way or make it so hard to avoid, many do encourage it.

    It is absolutely true that these language restrictions can get in the way of software development practices that are rapidly gaining acceptance, including agile methodologies. It’s also true that they have nothing to do with static typing. There is nothing in serious writings on static type systems that has anything to do with separating interface from implementation, declaring all variables in advance, or any of these other organizational restrictions. They are sometimes carry-overs from times when it was considered normal for programmers to cater to the needs of their compilers. They are sometimes ideologically based decisions by past authoritarian sorts of people, much like anti-sodomy laws. They are not static types.

    If one doesn’t want a language deciding how they should go about designing their code, it would be clearer to say so. Expressing this as a dislike for static typing confuses the issue.

    This fallacy is often stated in different terms: "I like to do exploratory programming" is the popular phrase. The idea is that since everyone knows statically typed languages make you do everything up front, they aren’t as good for trying out some code and seeing what it’s like. Common tools for exploratory programming include the REPL (read-eval-print loop), which is basically an interpreter that accepts statements in the language a line at a time, evaluates them, and tells you the result. These tools are quite useful, and they exist for many languages, both statically and dynamically typed. They don’t exist for Java, C, or C++, which perpetuates the unfortunate myth that they only work in dynamically typed languages. There may be advantages for dynamic typing in exploratory programming (in fact, there certainly are some advantages, anyway), but it’s up to someone to explain what they are, rather than just to imply the lack of appropriate tools or language organization. [1]

  • Dynamically typed languages provide no way to find bugs:
    A common argument leveled at dynamically typed languages is that failures will occur for the customer, rather than the developer. The problem with this argument is that it very rarely occurs in reality, so it’s not very convincing. Programs written in dynamically typed languages don’t have far higher defect rates than programs written in languages like C++ and Java.

    One can debate the reasons for this, and there are good arguments to be had there. One reason is that the average skill level of programmers who know Ruby is higher than those who know Java, for example. One reason is that C++ and Java have relatively poor static typing systems. Another reason, though, is testing. As mentioned in the aside above, the whole unit testing movement basically came out of dynamically typed languages. It has some definite disadvantages over the guarantees provided by static types, but it also has some advantages; static type systems can’t check nearly as many properties of code as testing can. Ignoring this fact when talking to someone who really knows Ruby will basically get you ignored in turn. [1]

  • Static types imply longer code:
    Type inference not only allows you to omit type information when declaring a variable; type information can also be used to determine which constructors to call when creating object graphs. In the following example, the compiler infers from the declaration Button b that it should construct a new Size object, assign the Height and Width fields to the integers 20 and 40 respectively, create a new Button instance b, and assign the Size object to the Size field of b:


    This economy of notation essentially relies on the availability of static type information. In other words, static type information actually allows programs to be more concise than their equivalent dynamically typed counterparts [6].
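
For contrast, here is roughly what a dynamically typed counterpart of that Button and Size construction could look like in Ruby. This is my own sketch, not code from [6]; the two classes are hypothetical stand-ins that only borrow the names and the values 20 and 40 from the description above.

```ruby
# Hypothetical stand-ins for the Size and Button types described above.
class Size
  attr_reader :height, :width

  def initialize(height:, width:)
    @height = height
    @width = width
  end
end

class Button
  attr_reader :size

  def initialize(size:)
    @size = size
  end
end

# Nothing tells Ruby that the size: argument should hold a Size,
# so both constructors have to be spelled out by hand.
def build_button
  Button.new(size: Size.new(height: 20, width: 40))
end

button = build_button
puts button.size.height  # prints 20
puts button.size.width   # prints 40
```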

All of this is presented here just to help you ‘keep your balance’ and not get carried away by one camp. Whether dynamic languages will rule the world is something I cannot predict, but it definitely helps to learn the concepts from them. It is very likely that a lot of these concepts and features will get absorbed into the .NET Framework. Knowing a dynamic language today will help me use .NET better when these features reach mainstream .NET. Today I speak to a lot of folks who code in .NET 2.0 but do not use a single feature of .NET 2.0 (it is basically .NET 1.1 code compiled against the 2.0 framework). I wish the ArrayList class were removed unless you switch on a backward-compatibility mode. This would force people to use List<T> over ArrayList, Dictionary<TKey, TValue> over Hashtable, and so on.

In fact, there are other versions of this, like ‘coding in Java inside of C#’, coding in ASP inside of ASP.NET, and so on. I do not want to be in this boat ever, and that is the reason I am interested in learning Ruby. The ‘.’ and ‘;’ in the title are kinda unintentional :-).

[1] What to know before debating type systems – http://cdsmith.twu.net/types.html Update: Link moved to http://www.pphsg.org/cdsmith/types.html
[2] Typing – Strong vs Weak, Static vs Dynamic – http://www.artima.com/forums/flat.jsp?forum=106&thread=7590
[3] Strongly Typed – http://c2.com/cgi/wiki?StronglyTyped
[4] Strong versus Weak Typing – http://www.artima.com/forums/flat.jsp?forum=32&thread=3572
[5] David Thomas on the benefits of dynamic typing – http://c2.com/cgi/wiki?DavidThomasOnTheBenefitsOfDynamicTyping
[6] Static Typing when possible, Dynamic Typing when needed: The end of cold war between Programming Languages – http://pico.vub.ac.be/~wdmeuter/RDL04/papers/Meijer.pdf
[7] Duck Typing –  http://en.wikipedia.org/wiki/Duck_typing#Comparison_with_Structural_type_systems
[8] Types Of Typing – http://c2.com/cgi/wiki?TypesOfTyping

  1. Paddy3118
    October 14, 2007 at 6:33 pm

    You missed a reference for your Duck typing quote that should be to http://en.wikipedia.org/wiki/Duck_typing#Comparison_with_Structural_type_systems
    Nice to know you find it worthy ;-)
    - Paddy.

  2. Sendhil Kumar
    October 15, 2007 at 8:22 am

    Thanks Paddy. Added a reference to that. I might have missed a lot of other references too. I’ll keep adding them as I go along.
