Problems with PHP

After having used PHP for over 4 years now, its problems have really started to become apparent. At first, they were just annoyances, and I didn’t want to think that the problem was anything other than my own inexperience, being new to the language. But over time, they’ve grown from mere annoyances to full-blown banging-your-head-on-the-desk frustrations. This is my attempt to document these frustrations, as part of my PHP therapy.

This is not to say that I don’t still use PHP. It does have its place, and after you get beyond some of the questionable design decisions and learn to stop thinking like a programmer and start thinking like a web designer, and learn that many of the user-contributed examples on php.net are useless, it does fulfill a need and can be the right tool for the job.

All languages and environments have their issues and problems. Usually, the way something is implemented makes sense once you understand it or reach a corner case. This list is not supposed to include those kinds of things, but rather is intended to enumerate the inexplicable.

This is a work in progress.

  • going to be like C, or not?
    • odd and confusing inconsistencies. SPL’s Iterator implements rewind, but the actual function named rewind is for file access (like in C’s stdio). For arrays, this function is named reset.
    • have to use putenv to set environment variables (and getenv to read them), but environment variables can also be read from the $_ENV array. why isn’t this array magical to allow you to do things a “php way”?
  • order of arguments for backward compatibility, like “join” and “explode”.
  • the “magicness” of $_SESSION
  • “exceptions” caused by legacy means are errors, not exceptions, and can not be caught; multiple methods have to be employed to handle all errors/exceptions, and some, like E_ERROR, can not be handled by functions registered with set_error_handler. is there some other exception class that can be used to catch engine generated errors?
  • bad examples on the website (the custom session handler example doesn’t implement all the same semantics that the built-in one does, i.e. locking)
  • some of the var_dump/var_export/print_r family of functions are unable to gracefully handle data structures with references to themselves, and they don’t act consistently. This is busted with respect to how references are suggested to be used in the documentation
  • I can’t believe that the PHP4 object model is/was so busted that objects are always passed by value rather than by reference. This entirely defeats the purpose of objects, and leaks implementation details through to the language. Consider Michael Tsai’s blog entry on references and objects in PHP4 and PHP5. There are way too many reverse engineered implementation details in the discussion of this. While this is not necessarily bad (knowing the internals of a language can help you use that language more effectively), one should not have to know the internals to make sense of something.
  • What the fuck is this? From func_get_args:
    Note: Because this function depends on the current scope to determine parameter details, it cannot be used as a function parameter. If you must pass this value, assign the results to a variable, and pass the variable.

    How the hell is PHP written? Apparently, this “function” does not push its result onto the stack, or function calling does not read arguments from the stack (which also explains the lack of an array-splice operator), or maybe PHP is not stack based. I fail to see why a called (internal) function can not access the stack frame of its caller. This is a matter of the implementation leaking through to

  • And what’s with things that look like functions actually being statements/keywords/language constructs? array() is forgivable, but isset() and family are not. These are presumably not functions in order to avoid auto-vivification of variables and array members (perl seems to handle this okay with exists()). It seems that these constructs have their own language rules between the parens, like inside array() and foreach() being the only places => makes sense (if it were allowed in any list construction, like function args, named arguments would be easy and obvious). On top of that, only simple expressions are allowed: expressions that resolve to array elements are allowed, general expressions are not, and isset(1==1) fails. Even isset(CONSTANT) fails.
  • There’s no syntax for obvious things like passing an array as the arguments to a function; you have to use the verbose and convoluted call_user_func_array() function, making the code harder to read. Yet call_user_func() exists, which takes the arguments to use as arguments itself rather than as an array. You can’t even emulate named arguments using call_user_func_array(); the array indexes are lost in the result of func_get_args().
  • debug_backtrace()’s return value seems counter-intuitive for some of the elements. A single element does not contain a single stack frame; a single stack frame is split between elements n and n+1. The element ‘function’ is not the name of the function that was executing in a stack frame, it’s the name of the function called by that stack frame. I suppose this could be useful if you wrote code that looked like a(b(), c(), d()), because then you’d know which function call generated the next stack frame, except this is reduced to uselessness with spaghetti code like a(b(b(), b()), b(), b(b(b()))), so there’s no win. To find out the name of the function being executed, you have to look at the next element’s ‘function’ element, which contains the name of the function that the next outer scope called. Because of this, the outermost scope has a ‘function’ element, which at first glance does not make it obvious that it is the outer scope.
    The value of ‘function’ could be implied when using SPL interfaces. If you implement the Iterator interface and then use foreach, debug_backtrace in, for example, your custom ::current() will imply that a function call to “current” exists when in fact there is no explicit call to current on the line that contains the foreach. This could be improved by creating an additional stack frame for the execution of the foreach (this might make sense considering that foreach() looks like a function call, see above), but that has its own problems. Since there is no way to detect when an SPL interface is being called, and there is no easy way to avoid namespace collisions between the interfaces and your own custom methods, you can’t implement warnings/errors when you incorrectly call an interface’s methods outside the context of the interface. For example, I have an object that implements Iterator and has a check for “validity” that is unrelated to array access. At first blush, I’d like to name this method ::valid, but that already has meaning in the Iterator interface. So to keep myself from creating hard to track down bugs, I could put a check in ::valid to make sure that it is only being called through the interface and not directly. No go, no way to detect that (that I can find).
  • the configure script takes --with-whateverlib options in the normal autoconf format. You can specify --with-whateverlib=DIR to build the PHP module whateverlib, or --with-whateverlib=shared,DIR. In this case, using ‘shared’ has a different meaning than (at least in my experience) with other autoconf configure scripts. If you use shared here, then PHP will build the whateverlib PHP module as a .so. Every other time I’ve seen --with-whateverlib=shared,DIR, it means to dynamically link against whateverlib (use libwhatever.so) rather than statically (libwhatever.a). There doesn’t seem to be a way to generate a built-in PHP module and specify how you want the library linked.
  • you can’t use the result of every function in an expression, or array indexing operations are not expressions. get_defined_constants(true)['user'] is a syntax error, despite the fact that get_defined_constants() returns an array
  • is_callable does not seem to take into account the scope of the caller; it seems to return false when given a protected function and called from the same object
  • get_class() returns the class name of the current class if called without any arguments in PHP5. But the __CLASS__ constant also exists. Is there a reason to use one over the other?
  • If you overload member access using __get and __set, and it happens to return NULL; that is, if you do this:
    $A->Amember->Bmember = "some new value"
    (that is, $A->Amember returns an instance of class B, and B->Bmember is being assigned to), you get this:
    PHP Notice: Indirect modification of overloaded property A::$Amember has no effect
    If you treat NULL as an object:
    $x = NULL; $x->something = "hello";
    it promotes $x to a stdClass object and does the assignment. Neither of these is ideal, and together they are woefully inconsistent.
  • instanceof does not seem to be able to take an expression as the value for its right-side, making “instanceof get_class($this)” impossible
  • http://bugs.php.net/bug.php?id=12622 - incorrect behavior is not a bug if it was implemented.
  • some operators only take either literals or variables or simple expressions. “$x=NULL; $x instanceof CLASS” is valid and returns false. “NULL instanceof CLASS” generates a weird run-time fatal error “Invalid opcode 138/1/1”. obviously this expression doesn’t make sense, but the instanceof operator only works on variables, not on values?
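
On the question in the list above about catching legacy errors: one workaround I know of is to convert them into ErrorException from inside an error handler. This is only a minimal sketch, assuming PHP 5.1 or later (where ErrorException exists); the handler name errors_to_exceptions is made up for illustration. It does nothing for fatal errors like E_ERROR, which never reach the handler at all.

```php
<?php
// Hypothetical handler name: turns legacy-style errors into ErrorException.
function errors_to_exceptions($severity, $message, $file, $line)
{
    throw new ErrorException($message, 0, $severity, $file, $line);
}

set_error_handler('errors_to_exceptions');

$caught = null;
try {
    // trigger_error() raises a legacy warning instead of throwing.
    trigger_error('legacy warning', E_USER_WARNING);
} catch (ErrorException $e) {
    // The warning arrives here as an ordinary exception.
    $caught = $e->getMessage();
}

// Put back whatever handler was installed before.
restore_error_handler();

echo $caught, "\n"; // prints "legacy warning"
```

So it is possible to funnel warnings and notices through try/catch, but it still takes two mechanisms (a handler plus exceptions) to cover what one should.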

Then there’s this site http://tnx.nl/php.

There are some bug reports for PHP that give the impression that PHP devs do not want to (or are unable to) modify the language grammar to fix things, like http://bugs.php.net/bug.php?id=33463

this rant is just getting started…

One Response to “Problems with PHP”

  1. on 23 Apr 2007 at 12:31 am bg

    “get_class() returns the class name of the current class if called without any arguments in PHP5. But the __CLASS__ constant also exists. Is there a reason to use one over the others”

    There is a reason:
    consider a class A with method x() and a class B which extends A.
    If you do
    $b = new B;
    and then call
    $b->x();
    and within x you output get_class() and __CLASS__ you get “B” for the first and “A” for the second. __CLASS__ is to be used like __FILE__ or __LINE__, mostly for debugging purposes: to find out where an error really happened.
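
The scenario in the comment above can be sketched roughly like this. One assumption on my part: the sketch passes $this to get_class() explicitly, because as far as I can tell from the manual, the argument-less form names the class the method is defined in, the same as __CLASS__. The class names A and B and the method x() come from the comment.

```php
<?php
class A
{
    public function x()
    {
        // get_class($this) reports the runtime class of the object;
        // __CLASS__ is fixed to the class this code is written in.
        return array(get_class($this), __CLASS__);
    }
}

class B extends A
{
}

$b = new B;
$names = $b->x();

echo $names[0], ' ', $names[1], "\n"; // prints "B A"
```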
