Tuesday, April 12, 2011

Duck-typing in practice

As a recent Python devotee, I have found that duck-typing can sometimes be harder in practice than in theory. In theory, if it looks like a duck, swims like a duck, and quacks like a duck, then it's a duck. In practice, or at least in my practice, there are ducks incapable of flight. Or wolves in ducks' clothing. All attempted humor aside, it is difficult to reconcile a type-based view of object-oriented programming with the much more laissez-faire approach that Python adopts.

It should be obvious that duck-typing allows a more flexible approach to implementation, and that flexibility was one of the driving factors behind the invention of OOP. Nonetheless, #python on freenode is inundated with questions from inexperienced programmers about the proper way to distinguish a list from a string, and about other similar situations. I will side with the operators of #python and assert that isinstance() is not the correct way. Using this function runs contrary to the idea of duck-typing and, in its positive form, should be avoided at all costs. How, then, does one distinguish a list from a string?

On the surface, there are many similarities between these two datatypes. Both have a length, and both define the __add__, __mul__, and __contains__ methods. One example of a method they don't share is append, which list has and str does not.
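
To make the overlap concrete, here is a small sketch that checks the methods named above on the two built-in types. It assumes Python 3, and the variable names are only illustrative.

```python
# Special methods that both list and str define (Python 3 assumed).
shared = ["__len__", "__add__", "__mul__", "__contains__"]

for name in shared:
    # Both built-in types provide every one of these.
    assert hasattr(list, name) and hasattr(str, name)

# append is where the two part ways.
list_has_append = hasattr(list, "append")
str_has_append = hasattr(str, "append")

# The shared operators behave analogously on both types.
doubled_list = [1, 2] * 2
doubled_str = "ab" * 2

print(list_has_append, str_has_append)  # True False
print(doubled_list, doubled_str)        # [1, 2, 1, 2] abab
```

Checking for append with hasattr() is itself a kind of duck-typing test, although it only tells the two apart when append is actually what the code needs.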

What I am getting at is this: duck-typing ensures that a function, or any bit of code, will work regardless of type, so long as the object has the proper interface. But what happens when two things share the relevant methods and the only thing distinguishing them is irrelevant to the problem at hand? It is easy to dismiss these situations as inconsequential given the nature of duck-typing (if they share the relevant interface, then it SHOULD work). But is it possible to make concessions? Is duck-typing compatible with negative definition by type (a set of "not isinstance()" checks)? After all, if something looks like a duck, swims like a duck, and quacks like a duck, can we at least check to make sure it isn't a wolf in duck's clothing?
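
As a rough illustration of what such a negative check might look like, consider the sketch below. The helper name as_item_list is hypothetical, and the code assumes Python 3, where str is the only text type to rule out. It is one possible shape for the idea, not a recommendation from #python.

```python
def as_item_list(items):
    """Return the elements of `items` as a list, treating a string as one item.

    Hypothetical helper used only to illustrate a negative type check.
    """
    # Negative check: a str iterates just like a list does, but looping
    # over it would yield single characters, which is rarely intended.
    if isinstance(items, str):
        return [items]
    # Anything else is trusted to be iterable (plain duck typing).
    return list(items)


print(as_item_list("abc"))        # ['abc']
print(as_item_list(["a", "b"]))   # ['a', 'b']
print(as_item_list(("a", "b")))   # ['a', 'b']
```

The function never asks whether its argument IS a list; it only rules out the one wolf it knows about and lets every other iterable through.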

If the number of question marks in this post is any indication, I invite discussion on this topic. Perhaps someone more knowledgeable than I am will have more declarative sentences.


  1. I don't worry too much about two things sharing the same methods but not being the same thing -- that doesn't happen that much.

    But I still don't understand the best way to see if something is list-like or string-like. Test for every single publicly declared method in list to see if the object provides it? That seems infeasible. But there are all sorts of useful methods in there that I might want to use, and it's so much easier to know it's a list and know they'll all be there. I end up using isinstance (or the equivalent in Ruby, where I spend a lot more time and which I know better than Python); I don't know any other way to do it.

  2. @bbwild:
    The built-in function dir() returns a list of the attribute names (methods included) defined for a given object. However, I have heard that its results cannot always be depended on in code. I have heard that it is heavily implementation-dependent, which makes sense to me.

  3. "what happens when two things share the relevant methods and the only thing distinguishing them is irrelevant to the problem at hand?"

    Can you think of an example of this, but where the type of the two things would still matter?

    I could see this being a problem if a type 'accidentally' satisfies whatever interface is necessary, or where two methods by the same name have different behavior (such as sorting and returning a new instance versus sorting in place).

  4. The argument that duck typing should always work ignores situations where there are non-trivial constraints that the interface does not express. Consider, for example, multiplication. Is the mul() function commutative? In other words, does a.mul(b) == b.mul(a)? Sometimes it is (multiplication of real numbers), sometimes it is not (matrix multiplication). One could easily imagine an algorithm that depends on multiplication being commutative and would therefore fail when passed a matrix.

  5. One of the biggest problems with Python is the inconsistency in libraries in treating things type-agnostically. There are many functions that will take a single string, a list of strings, or a dict, all on the same parameter in the interest of convenience, when really that defeats the purpose of duck typing and provides awful examples for people trying to learn the language.

    I think maybe a large conceptual problem has to do with trying to write multiple dispatch in Python, when really implementors should absolutely be using separate functions for separate interfaces. The function name then becomes the disambiguation in the absence of type dispatching, and this is probably far more Pythonic than all the examples of "if it's a string, we do this; if a list of strings, we pretend it's not..."
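
The "separate functions for separate interfaces" suggestion in the last comment can be sketched roughly as follows. The function names send_to and send_to_all are made up for illustration and do not come from any real library; Python 3 is assumed.

```python
def send_to(recipient):
    """Handle exactly one recipient (hypothetical example function)."""
    return "To: " + recipient


def send_to_all(recipients):
    """Handle any iterable of recipients by delegating to send_to."""
    return [send_to(r) for r in recipients]


print(send_to("ann"))                # To: ann
print(send_to_all(["ann", "bob"]))   # ['To: ann', 'To: bob']
```

Neither function inspects the type of its argument. The caller picks the behavior by picking the name, so the list-versus-string question never comes up inside the functions themselves.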