Python is a great programming language, elegant and concise. However, it has flaws and shortcomings. Here are some of them.
Version hell
Python 3 is backwards incompatible with Python 2. Though the former was released almost two years ago, popular libraries and web hosts still support only Python 2.x. Google AppEngine uses Python 2.5, which has a very different syntax from Python 3. Django works on Python 2.7, but not on Python 3. And so on.
# Python 2.5
for i in xrange(9):
    print '%02d' % (i / 3),

# Python 3
for i in range(9):
    print( '{:02d}'.format(i // 3), end=' ' )

# Differences:
# no xrange (range must be used instead),
# print is a function,
# format should be used instead of %,
# integer division is now //, not /
If you want to write a script or a web app, you have to choose between the versions. Python 2.x is supported everywhere, but you will have to rewrite your code in future. Python 3 is for "brand new things", as van Rossum says, but it still has little support, so you will be limited in the choice of hosting provider for your web app.
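For small scripts, one way to postpone the choice is to stay inside the subset that both versions share. The following is only a minimal sketch of the loop above written that way; the helper name `thirds` is made up for illustration, and real programs that touch strings, bytes, or the standard library usually need more than this.

```python
import sys


def thirds(n):
    """Return zero-padded strings of i // 3 for i in range(n)."""
    # range() exists in both versions, // is integer division in both,
    # and the % operator still formats strings in Python 3.
    return ['%02d' % (i // 3) for i in range(n)]


# sys.stdout.write() avoids the print statement/function difference.
sys.stdout.write(' '.join(thirds(9)) + '\n')
```

The price is giving up the newer idioms (`print()` with `end=`, `str.format`) in exchange for a single source file.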
Side-by-side installation of Python 2 and 3 under Windows is problematic. It would be much easier if Python 3 scripts had a different file extension, for example, py3 instead of py.
Incomprehensible language reference
Python language reference sounds like the author wrote it for himself. It's hardly usable for an average Python developer. Compare:
Python:

String literals are described by the following lexical definitions:

stringliteral ::= [stringprefix](shortstring | longstring)
stringprefix ::= "r" | "R"
shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem ::= shortstringchar | stringescapeseq
longstringitem ::= longstringchar | stringescapeseq
shortstringchar ::= <any source character except "\" or newline or the quote>
longstringchar ::= <any source character except "\">
stringescapeseq ::= "\" <any source character>

...One syntactic restriction not indicated by these productions is that whitespace is not allowed between the stringprefix or bytesprefix and the rest of the literal...

In plain English: Both types of literals can be enclosed in matching single quotes (') or double quotes ("). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.

PHP:

The simplest way to specify a string is to enclose it in single quotes (the character '). To specify a literal single quote, escape it with a backslash (\). To specify a literal backslash, double it (\\). All other instances of backslash will be treated as a literal backslash: this means that the other escape sequences you might be used to, such as \r or \n, will be output literally as specified rather than having any special meaning.

<?php
echo 'this is a simple string';

echo 'You can also have embedded newlines in
strings this way as it is
okay to do';

// Outputs: Arnold once said: "I'll be back"
echo 'Arnold once said: "I\'ll be back"';

// Outputs: You deleted C:\*.*?
echo 'You deleted C:\\*.*?';
...
PHP documentation does not try to explain several types of strings (single-quoted and triple-quoted) in one paragraph; it provides examples and skips unreadable "lexical definitions".
When using PHP, you can look up the language reference when you don't remember how an obscure language feature works. When using Python, you have to decipher the grammar reference or try to find it in the tutorial (which does not describe all features).
Optimized data types (immutable, ranges and iterators)
Python has subtly different data types (list and tuple, bytes and bytearray), so you often have to think "Do I need the mutable type here?" Pythonists disagree about the proper usage for tuples.
IMHO, a tuple is an optimized particular case of a list. Tuples are faster than usual lists and allow simple hashing, so a tuple can be a dictionary key, but this feature is rarely used. Other programming languages have one data structure where Python has two.
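The hashing difference mentioned above can be seen in a few lines. This is a minimal sketch; the dictionary name `visits` and its contents are made up for the example.

```python
# A tuple is hashable, so it can serve as a dictionary key.
visits = {}
visits[(1, 2)] = 'first point'

# A list is not hashable; using it as a key raises TypeError.
try:
    visits[[1, 2]] = 'second point'
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

# The usual workaround is to convert the list to a tuple first.
coords = [3, 4]
visits[tuple(coords)] = 'third point'

print(sorted(visits))
print('list as key failed:', list_key_error is not None)
```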
Another "optimized" data type is the iterator, a "lazy" data structure. In Python 2, the map function returns a list; in Python 3, it returns an iterator object, which behaves like a list in some (but not all) aspects. It cannot be printed or indexed without converting to a list. You can pass it to the min or max function, but only once, so you cannot find both minimum and maximum without coercion to a list:
a = map(int, '1 2 3'.split())
print(a)         # prints: <map object at 0xABCD>, not [1, 2, 3]
a[2]             # TypeError: 'map' object is not subscriptable
print( min(a) )  # prints 1
print( max(a) )  # ValueError, because the iterator was already "iterated" (exhausted)
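The one-pass limitation shown above has two common workarounds, sketched here: materialize the iterator into a list once, or compute both values in a single pass. The helper `min_max` is a hypothetical name used for illustration, not a standard library function.

```python
def min_max(iterable):
    """Return (minimum, maximum) of an iterable in a single pass."""
    iterator = iter(iterable)
    try:
        lowest = highest = next(iterator)
    except StopIteration:
        raise ValueError('min_max() arg is an empty iterable')
    for item in iterator:
        if item < lowest:
            lowest = item
        elif item > highest:
            highest = item
    return lowest, highest


# Option 1: convert to a list once, then print, index and reuse it freely.
a = list(map(int, '1 2 3'.split()))
print(a, a[2], min(a), max(a))

# Option 2: keep the lazy map object and consume it exactly once.
print(min_max(map(int, '1 2 3'.split())))
```

Option 1 costs memory proportional to the input; option 2 keeps the laziness but only answers the one question it was written for.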
In essence, Python burdens the programmer with choosing the fastest data structure. Instead, the interpreter could always use lists. The programmer would not have to focus on the implementation details and remember the peculiar syntax of tuples or the non-intuitive behavior of iterators. A more sophisticated language implementation (such as RPython) could analyze the program and choose the optimized types automatically, without introducing new types to the language.
Conclusion
Python is definitely better planned than other scripting languages, but you should choose between the versions, find a good book, and keep in mind the diverse data types.
Additional reading
- Python Warts, the things for which people have criticised Python.
- Python flaws: scope and shadows by Luis Artola.
26 comments
(Feel free to add examples of other languages, but note that everything starts at least in 1956 with Fortran I, which had to run on the machine with 8K memory!)
1956: Fortran I:
PRINT 1, X
1 FORMAT (F10.2)
1980: C
printf("%10.2f", x);
1988: C++
cout << setw(10) << setprecision(2) << showpoint << x;
1996: Java
java.text.NumberFormat formatter = java.text.NumberFormat.getNumberInstance();
formatter.setMinimumFractionDigits(2);
formatter.setMaximumFractionDigits(2);
String s = formatter.format(x);
for (int i = s.length(); i < 10; i++) System.out.print(' ');
System.out.print(s);
2004: Java
System.out.printf("%10.2f", x);
2008: Python 3
print( '{:10.2f}'.format( x ) )
2008: Scala and Groovy
printf("%10.2f", x)
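For comparison, Python 3 still accepts the older printf-style % operator alongside str.format, so both spellings below produce the same ten-character field. The value of `x` is an arbitrary example.

```python
x = 3.14159

old_style = '%10.2f' % x            # printf-style formatting
new_style = '{:10.2f}'.format(x)    # str.format, as in the Python 3 entry

print(repr(old_style))
print(repr(new_style))
```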
A commenter from Hacker News said:
You should not hide that the strings part in the tutorial explains them pretty good: http://docs.python.org/py3k/tutorial/introduction.html#strings
You have completely missed the point about iterators. They're *lazy* data structures, similar to what Haskell uses, not merely an optimised list. Use an iterator when you want to generate sequence items just in time (when needed, and not before), or when you don't have any need to keep used items around.
I've been using Python for about 15 years now, and I'd sooner give up lists than iterators.
(Actually either would be stupid, since they do different things for different purposes, but if I had to lose one, I'd give up lists.)
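The laziness described in this comment can be illustrated with a generator that never builds the whole sequence in memory. This is only a minimal sketch; the function name `squares` is made up for the example.

```python
from itertools import islice


def squares():
    """Yield 0, 1, 4, 9, ... forever, computing each value on demand."""
    n = 0
    while True:
        yield n * n
        n += 1


# Only the five requested items are ever computed; a list could not
# represent this infinite sequence at all.
first_five = list(islice(squares(), 5))
print(first_five)
```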
"Python language reference sounds like the author wrote it for himself"
Err... Have you actually read the "Java language reference"? These are language *references*, something that you can use to write a conforming implementation, not tutorials. It's like criticizing source code for not being proper end-user documentation.
"you often have to think "Do I need the mutable type here?""
If you do have to think about it then you have a bigger problem to solve IMHO. FWIW, while they _can_ be used as such, tuples are NOT immutable lists.
Thank you for the comments. Python tutorial is nice, I wish that the language reference was written in the same style.
About iterators: I understand that they are "lazy" and how they work. However, I often find that I need to convert them back to lists or tuples, so the "laziness" (which is, IMHO, an optimization for avoiding unnecessary memory usage) is not very useful.
About Java: try to read PHP language reference. About tuples: as I mentioned in the article, pythonists disagree about the proper usage for them.
A commenter from Reddit said:
The language reference is a grammar reference not a tutorial. Would you prefer it didn't exist?
When using PHP, you can look up the language reference when you don't remember how an obscure language feature works. When using Python, you have to decipher the grammar reference or try to find it in the tutorial (which does not describe all features). I would prefer that the language reference didn't exist in its current form: it's useless and confusing for 99% of Python programmers, who don't want to implement their own Python parser.
I prefer a small core language with lots of different datatypes. I can start being productive keeping things simple using more common types but switch to more powerful ones when I have a special need. If I wanted a really simple language I would probably go for LUA. It is small and has a good JIT.
If every machine had infinite memory, and linear processes could be executed in 0 time, then using lists for everything would work. However, in the real world, loading a complete list may not be possible, and processing the list must be done bit by bit. Iterators are ideal for this, and at the same time, can be coerced into random-access lists should the need arise.
However, in real life, I have only rarely used random-access lists, and even then they were just compact dictionaries/maps/hashes. Most problems can be reduced to list processing, item by item, hence the popularity of languages such as Lisp. Python brings in several neat features that make list processing easy, while maintaining the Algol-style recipe syntax that we all naturally understand.
Peter, I'm afraid you are making an apples to oranges comparison between the PHP "Language Reference" and the Python "Language Reference". Instead, you should be comparing those PHP docs to the Python Library Reference (http://docs.python.org/library/). The Python Language Reference is meant as a technical guide on the syntax of Python, while the Library Reference is meant to be the user manual. This is why the description of the Library Reference says, "Keep this under your pillow." I rarely visit the Language Reference, but visit the Library Reference almost daily.
http://wiki.python.org/moin/Python2orPython3 -- this describes the arguments for and against switching to Python 3. Python 3 being backwards incompatible (in some ways) isn't a bad thing at all; the cruft has been eliminated.
The Python language reference is, for me, quite easy to read as I am familiar with RFC-style definitions. "One syntactic restriction not indicated by these productions is that whitespace is not allowed between the stringprefix or bytesprefix and the rest of the literal…" actually, it -is- clearly defined:
Notice the lack of whitespace between "[stringprefix]" and "(shortstring…"
http://rgruet.free.fr/PQR26/PQR2.6.html#Strings is a good "quick" reference for strings, demonstrating their different forms in a clear way. It's also a good quick reference for everything else in Python.
All of your problems appear, to me, to be invalid or imaginary.
Use lists when every item is similar and what you do with the n'th member is similar to what you might do with the m'th.
Use tuples when position of an item has meaning. You should not use a list to hold the x and y coordinates for a point in 2D space - use a tuple.
You shouldn't use a tuple to hold successive lines from a file - use a list.
Chris, the problem is that Python doesn't have an official language reference that is readable for an average programmer. PHP has such reference, and it's useful for finding information on obscure features.
Paddy3118, thank you for the rules. I would add that you have to coerce list to tuple when you want to use it as a dictionary key (my two-stage table script contains an example).
Ian Bicking said via Buzz:
Peter, thanks for your reply. I'm afraid we still don't understand each other. I'm trying to say that for the example you gave of comparing PHP docs on strings to Python docs on strings, the reference you should have used is http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange which is in the "Library Reference" docs, and if you are absolutely, completely new to the language, you should be working through the Tutorial docs, in which you would have come across this: http://docs.python.org/tutorial/introduction.html#strings
Now, I do think a fair criticism of the documentation is that it's not completely obvious which of the documentation pages (Language, Library, Tutorial) you should be looking through, particularly as a beginner. If this is your point, then I agree there, but once you know, "Oh, I should be looking through Library Reference for my answers," I think most of us find the documentation quite good, even outstanding in many cases.
Chris, this chapter from Library reference describes operations on strings, not the syntax of string, escape sequences, etc. The PHP and Python examples above are about the syntax.
The tutorial does not explain everything (just because it's a tutorial, not reference). Please actually read the PHP and Python reference pages and compare them. Hope you understand me now.
IMHO, PHP manual is much better written than Python docs. If you are looking in a wrong place, PHP manual directs you to the right place. See delete function as an example. The string syntax page also contains links to “String operators” and “String functions”.
Peter, the grammar reference is not intended for beginners. If you think it is incomprehensible for beginners, that is because it is intended for Python experts who want to know the precise semantics of the constructs, not for beginners learning the ropes. If you're expecting a reference intended for experts to be a tutorial intended for beginners, it's just a silly rant. The reference manual itself describes itself as:
"""This reference manual describes the syntax and “core semantics” of the language. It is terse, but attempts to be exact and complete.""" """This reference manual describes the Python programming language. It is not intended as a tutorial."""
If you want a fair comparison with PHP's Manual, Python's Tutorial is written in the same style as PHP's Manual http://docs.python.org/py3k/tutorial/introduction.html#strings. The tutorial does explain everything about string syntax, escape sequences, etc. On the other hand, PHP's reference is misleading; it is a tutorial mislabeled as a reference. Please actually read the tutorial and compare them.
"""It would be much easier if Python 3 scripts had a different file extension, for example, py3 instead of py."""
You can do that easily if you want to, just associate that file extension with the correct interpreter version.
If you think tuple is optimized list, then it proves you totally do not understand python. There is virtually no difference in the speed of tuple and lists, if you chose tuple instead of list because of speed concern then you're using the wrong language. If you think tuple is just optimized or immutable list, then it's no wonder you're confused. There is no disagreement of what tuple is used for, and in fact when properly used as Guido intended, there is nothing in common between the use cases for list and tuples.
Python's tuple is intended to be a lightweight struct. Whenever you want to use a struct/class in C/C++, but do not want to bear the complexity of the syntax, you use tuple in python. A tuple is NOT a container, nor it is an immutable container, nor an optimized version of lists.
I would agree that there IS a common use cases between tuples and objects/class; however, there is nothing in common between the use case for tuples and lists.
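As an illustration of the "lightweight struct" reading, the standard library's collections.namedtuple (not mentioned in the comment above, and brought in here only as an example) gives tuple fields names while keeping tuple behaviour. The `Point` type is hypothetical.

```python
from collections import namedtuple

# A record with two named fields; still an ordinary, immutable tuple.
Point = namedtuple('Point', ['x', 'y'])

p = Point(x=2, y=1)
print(p.x, p.y)        # access by field name
print(p[0], p[1])      # access by position still works
print(p == (2, 1))     # compares equal to a plain tuple

try:
    p.x = 5            # fields cannot be reassigned
    reassigned = True
except AttributeError:
    reassigned = False
    print('Point is immutable')
```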
"""PHP documentation does not try to explain several types of strings (single-quoted and triple-quoted) in one paragraph"""
Because PHP's different strings syntaxes have different semantics, different variable substitution rule, and different sets of acceptable escape sequences. Python's string literal all have the same semantic (any characters except the quoting character(s) are allowed), there is no variable substitution rule, and all have the same set of acceptable escape sequences (raw string have different escape sequence, however it's explained in a different paragraph, just as you wanted it to be). Trying to explain single- and double-quote in two separate paragraphs would make it really redundant and confuse people into thinking that single- and double- quote are different, distinction matters only when it matters.
The tutorial does not contain the list of escape sequences. It also does not explain byte literals.
I wrote "single-quoted and triple-quoted". Triple-quoted strings have different rules for unescaped newlines and quotes.
You have to coerce list to tuple when you want to use it as a dictionary key (my two-stage table script contains an example).
A list can be easily used for the same purpose. However, a tuple cannot be always used instead of a list. So, from implementation point of view, tuple is an optimized particular case of a list.
According to these benchmarks, constructing a small tuple is ≈6 times faster than constructing the equivalent list.
> It also does not explain byte literals.
Byte literal syntax is new in Python 3.x and as of now not very well-documented anywhere, you're welcome to contribute a documentation if you'd like to.
>> Trying to explain single- and double-quote in two separate paragraphs would make it really redundant...
> I wrote "single-quoted and triple-quoted". Triple-quoted strings have slightly different rules for unescaped newlines and quotes.
You wrote "PHP documentation does not try to explain several types of strings ... in one paragraph"; in the tutorial triple-quoted string is explained five paragraphs below the paragraph explaining single-quoted string. It's true the language reference explains them all in one paragraph, but if you're reading the language reference, you should already be familiar with how the various string literal works and the first paragraph is a mundane item that you'd want to skip through as fast as possible. Again, language reference (of any language) is not a beginner's tutorial; PHP mislabeled their tutorial as a language reference (notably lacking in PHP's "language reference" is a normative, formal description of the string literal syntax).
>> There is virtually no difference in the speed of tuple and lists...
> According to these benchmarks, constructing a small tuple is ≈6 times faster than constructing the equivalent list.
0.413 usec - 0.0602 usec = 0.3528 usec savings and the particular optimizations that makes this possible is only valid for a tuple containing constant items. I wouldn't bother with that even when microoptimizing a CPU-bound code.
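Readers who want to reproduce this kind of measurement can use the standard timeit module. The sketch below compares a tuple and a list built from constants and from variables; absolute numbers depend on the machine and the interpreter version, so none are quoted here, and the repetition count `N` is an arbitrary choice.

```python
import timeit

N = 200000  # number of repetitions per measurement (arbitrary choice)

# Literals made only of constants: the case discussed above.
tuple_const = timeit.timeit('(1, 2, 3)', number=N)
list_const = timeit.timeit('[1, 2, 3]', number=N)

# Literals built from variables, so nothing can be precomputed.
setup = 'a, b, c = 1, 2, 3'
tuple_vars = timeit.timeit('(a, b, c)', setup=setup, number=N)
list_vars = timeit.timeit('[a, b, c]', setup=setup, number=N)

for label, seconds in [('tuple of constants', tuple_const),
                       ('list of constants', list_const),
                       ('tuple of variables', tuple_vars),
                       ('list of variables', list_vars)]:
    # Convert total seconds to microseconds per construction.
    print('%-20s %.4f usec' % (label, seconds / N * 1e6))
```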
>> Python's tuple is intended to be a lightweight struct.
> A list can be easily used for the same purpose. However,
> a tuple cannot be always used instead of a list. So,
> from implementation point of view, tuple is an optimized
> particular case of a list.
No it can't. A list does not carry the same semantic. A table in a relational database is precisely described as "a list of tuples", it is not a "list of lists" nor is it a "tuple of lists" nor is it a "tuple of tuples". A "tuple" has the same semantic as an "entity" or "instance" or "object", which is not the same as "collection" (list, arrays, dictionaries, etc). That a tuple is implemented as an array similar to list is just an implementation coincidence, it's just like saying that you don't need struct since you can just malloc a memory area, treat it as a char array, and use pointer arithmetic to read/write bytes at the specified offset. Do they compile to the same thing? Yes. Do tuple/struct and list/char-array have the same semantic? Hell, no!
In a list mylist = [2,1], if you call mylist.sort() you get [1,2]. If 2,1 is a coordinate point, then sorting it to 1,2 is a REALLY BAD THING and could lead to really bad bugs.
OTOH: mytuple = 1,2 and you call mytuple.sort(), you get an error because (1,2) is not sortable. This is a GOOD THING, leading it to be usable as a hash (key) in a dict, and for identifying specific things.
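The two behaviours described in this comment can be checked directly. A minimal sketch, using the point (2, 1) for both containers:

```python
mylist = [2, 1]
mylist.sort()              # sorts in place, silently reordering the "point"
print(mylist)

mytuple = (2, 1)
try:
    mytuple.sort()         # tuples have no sort() method
    tuple_sort_error = None
except AttributeError as exc:
    tuple_sort_error = exc
    print('AttributeError:', exc)

# sorted() accepts a tuple but returns a new list; the tuple is unchanged.
print(sorted(mytuple), mytuple)
```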
A list has no business being coerced to a tuple in order to act as a key in a dict. I think that's an abuse of list. You'd be better off thinking of tuples as a record in a database. "213 1st Street" means nothing when sorted in a list ['1st', '213', 'Street'], but means everything when left as a tuple ('213', '1st', 'Street') as it should be. Tuples are great for data - when the position *means* something, not just an "ordering". People ask, "why don't you just have named fields". A phone number doesn't really need named fields for the 800 to mean something and for it to be important that it not have its order rearranged. Take the following two tuples: (408,555,1212) and (555,408,1212). If you treat them just as lists, you could end up sorting them and they'd be "identical". But they're not identical - they're phone numbers for completely different parts of the country and the structure is meaningful. Which means, tuples would be appropriate here, and that structure is what makes them good as keys.
Version hell.
There is no such thing with python.
Python 2.0 was a major step as 3.0 is. There is a clear path outlined
to get from 2.7 (the latest release in 2-Series) to 3, and even a tool
to help you with that (http://docs.python.org/library/2to3.html).
The quote "python 3 is for brandnew things" relates to new features of the 3-Series, not for new projects to implement in python.
Yes, support for 3 is fairly thin, but there have been troubles (and a moratorium for language changes), and some major external (read again, external) libs are not yet available.
Heading 3.2 the world will get better - and 2.7 will be supported for a while (http://docs.python.org/dev/whatsnew/2.7.html#the-future-for-python-2-x).
Incomprehensible language reference
This is where you are IMHO *absolutely* wrong.
The core language reference is in BNF, which *is* comprehensible.
The library reference *is* comprehensible, and full of examples.
That the PHP crowd mixes a syntax reference with a function reference
is not python's fault (try to get a valid BNF for PHP from their site ...).
But as always, the question is, to whom it is comprehensible.
Perhaps not to you.
BNF is a context-free language to describe grammars.
The PHP reference is definitely not context free.
Optimized data types (immutable, ranges and iterators)
IMHO the differentiation between lists, tuples and also sets
originates from math.
Yes, Python burdens a programmer to choose the right data type
for his particular case.
But what is wrong with that ?
There are always two dimensions for optimizations: time and space.
If you have only one datatype (PHP's array), you have to sacrifice
one or the other anytime soon.
Or create a super datatype, which optimizes itself during runtime.
An iterator is no datatype. Period.
When you say you like to program in C++, you should have come
across the stdlib, which has plenty of them.
May 2011, py3 is still nowhere... At the last evaluation for script lang to use in our company - we chose Ruby, simply because of the more reliable standard compared to Py, even Lua/Powershell were discussed over Python.
PHP documentation is very poorly written. It is useful for basic topics but once you want to go deeper, it's hard to find information. For some reason the PHP people try to hide language details. Some details are missing in the PHP reference and left for the user to guess. The confusing parts and specific cases are left unexplained pretty often.
In contrast, for example the C manual pages are very detailed.
Also, in my opinion these haystack/needle argument names are just silly. Often the role of the argument is hard to figure out from such a name.
"Looking at download statistics for the Python Package Index, we can see that Python 3 represents under 2% of package downloads. Worse still, almost no code is written for Python 3."
http://alexgaynor.net/2013/dec/30/about-python-3/