Happy 25th Anniversary, PHP!

"Faster, better, easier". That's how the release announcement of a new major software version usually reads. For an Open Source project such as PHP, which has to get by without polished and legally secured professional communication, the announcement usually reads like this: "PHP 8 will be released on November 26, 2020".

In the changelog you will then find rather dry words about what has changed, what has been added, and, especially relevant for PHP 8, what has been removed. This is when one begins to wonder whether the joy of the syntactic sugar, or of an up to 10-fold increase in performance thanks to a just-in-time compiler, will outweigh the pain of the change. Will the entire migration effort, including the necessary code changes, spoil the party mood?

In fact, PHP 8 doesn't just bring a few new features; it is an important milestone in a fundamental redesign and expansion of the language that began years ago. As the engine behind two-thirds of all websites worldwide, PHP is not only one of the most widely used programming languages, but also one of the most criticized.

There are only two kinds of [programming] languages: the ones people complain about and the ones nobody uses.

Bjarne Stroustrup

PHP is 25 years old, which is middle-aged for a programming language. PHP was created in 1995, the same year as Java and JavaScript. Many programming languages that are very popular today, such as Go, Rust, or TypeScript, are much younger, on average about 10 years old. PHP, on the other hand, has matured, and its Sturm und Drang years are over. A strong professionalization has taken place both in the PHP project itself and on the PHP application development side.

The range of ways in which PHP is used is enormous. On the one hand, PHP is a programming language that is very approachable for beginners. It is also often used in a Perl-like way for the automation of processes. Then, numerous widely used standard applications and frameworks such as Drupal, TYPO3, Joomla, WordPress, or Magento are written in PHP. Last but not least, there are many self-developed and sometimes very complex web-based enterprise applications.

Who pays for PHP?

It is a peculiarity of PHP that there is no company or foundation behind the project. PHP is simply a genuine Open Source software project, with all the advantages and disadvantages that entails. Many people think that Zend is "the PHP Company". It is undisputed that Zend founders Andi Gutmans and Zeev Suraski, as authors of the PHP runtime environment, made major contributions to the PHP project. But in the long run, Zend's involvement in the PHP project was actually limited to allowing a developer to work part-time on the PHP core. Zend was acquired by Rogue Wave in 2015, which only a few years later discontinued Zend's involvement in the PHP Open Source project. Rogue Wave was then itself acquired by Perforce in 2019. It remains to be seen how much Perforce will be involved in the PHP project in the long run.

There are a few more well-known companies that have been involved in the PHP project, either temporarily or over a longer period of time. MySQL/Sun/Oracle have always provided good database connectivity in PHP. Some years ago IBM had the idea to create a PHP runtime environment running on the JVM (WebSphere sMash). To ensure that their runtime behaved in accordance with the original PHP runtime, IBM paid several developers to write tests for PHP, which still benefits the language to this day.

Microsoft paid two developers more or less full-time during the development of PHP 7 to make PHP, which was traditionally developed mainly for and under Linux, run with full functionality on all current Windows versions.

You may find it hard to believe, but for several years the most important "corporate sponsor" for the PHP project was Facebook. The Facebook code base was originally written in PHP. Facebook has, for reasons and with consequences that will not be discussed here, grown to an almost unbelievable size. Anecdotally, there are about 30 million lines of PHP code and over a million servers running Facebook. Anyone operating on such a scale naturally has completely different problems than the "little store" around the corner with a few hundred servers.

Facebook scale, Facebook problems

One of Facebook's problems was power consumption. Because PHP executes bytecode on a virtual machine at runtime, multiple machine instructions are required to execute each bytecode instruction. After several engineering teams at Facebook had spent quite some time thinking about and experimenting with how Facebook could run PHP applications more efficiently, they created HHVM, an alternative PHP runtime environment that could run PHP much more power-efficiently thanks to a just-in-time compiler. As a pleasant side effect, HHVM was also much faster than PHP. As a result, some large PHP installations such as Wikipedia switched to HHVM for a while. At that time, quite a few PHP core developers were on Facebook's payroll and spent part of their work time on PHP improvements.

Competition is good, and the PHP project did not want PHP to fall behind HHVM. This motivated the significant performance gains of PHP 5.5 and PHP 5.6. With the release of PHP 7, which was much faster and used less memory thanks to massive cleanup work under the hood, PHP's performance was at least on par with HHVM and in most cases even better.

Although HHVM is also Open Source, it has never been a true community-driven Open Source project. For Facebook, it is merely a technology stack that is controlled by its own development team.

A key lesson for Facebook engineers was that much more efficient machine code can be generated when more type information is available. Facebook therefore began to enrich the source code with type information using specially created tools to further promote compiler optimization. This is how their own programming language Hack, a fork of PHP, came to be. When Hack was released, it offered an extended range of language features compared to PHP, especially with regard to the type system. It would be a lie to say that the emergence of Hack did not significantly influence the further development of PHP. But one of the great advantages of Open Source software is that you can also learn from the experiences of other teams and projects.

How much type safety do you need today?

A type determines which kind of values a variable can have, which operations are allowed on it, and how these operations are to be performed. The type system of early PHP versions was exclusively implicit, dynamic, and weak:

  • Types of parameters, return values, and properties could not be declared explicitly.
  • A type was only associated with a value when it was assigned.
  • Type safety rules were enforced at runtime, if at all.
  • There were hardly any restrictions on mixing types, and an (almost always useful) automatic conversion between types was performed.
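
The properties listed above can be illustrated with a minimal sketch. The variable and function names are hypothetical and chosen only for illustration; the results in the comments are those of PHP 8.

```php
<?php
// Dynamic: the type belongs to the value, not to the variable.
$value = 42;
$typeBefore = gettype($value);   // "integer"
$value = 'forty-two';
$typeAfter = gettype($value);    // "string"

// Weak: a numeric string is converted automatically in arithmetic.
$sum = '5' + 3;                  // int 8

// Implicit: no types are declared anywhere in this (hypothetical) function.
function add($a, $b)
{
    return $a + $b;
}

$result = add('1.5', 2);         // float 3.5

var_dump($typeBefore, $typeAfter, $sum, $result);
```

The same variable holds an integer and then a string, and strings that look like numbers take part in arithmetic without any explicit conversion.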

Under the hood, the PHP runtime environment stores all scalar values in a structure called zval. Depending on the context, such a zval is interpreted as bool, int, float, or string. What may seem strange at first glance works surprisingly well in practice, with a few exceptions and edge cases. As a web language, PHP mostly processes HTTP requests, in other words text or strings, and generates HTTP responses, which are also strings. It is therefore not unreasonable to consider variables primarily to be strings.

Since version 5, PHP has allowed optional type declarations in the source code. With PHP 8, all parameters, return values, and properties can have an explicitly declared type; only local variables cannot. At the same time, the type system has continuously been extended with new types. For example, PHP 7 introduced declarations for the scalar types bool, int, float, and string. Since PHP 7, it has also been possible to declare a type for the return value of functions and methods.
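
As a sketch of what such declarations look like, the hypothetical class below declares a type for a property, for parameters, and for return values. The class name Price and its methods are assumptions made purely for illustration.

```php
<?php
final class Price
{
    // Typed property (available since PHP 7.4).
    private int $cents;

    // Scalar parameter type (available since PHP 7.0).
    public function __construct(int $cents)
    {
        $this->cents = $cents;
    }

    // Parameter type and return type declaration.
    public function multipliedBy(int $factor): Price
    {
        return new Price($this->cents * $factor);
    }

    // The declared return type guarantees that callers receive a float.
    public function asFloat(): float
    {
        return $this->cents / 100;
    }
}

$price = (new Price(250))->multipliedBy(2);
$amount = $price->asFloat();

try {
    // A non-numeric string cannot be converted to int, so the call is rejected.
    new Price('not a number');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($amount, $rejected);
```

The rejected constructor call shows that a declared type is enforced by the runtime: the mismatch surfaces as a TypeError at the call site instead of propagating into the object.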

PHP 8 extends the type system with union types. These allow type declarations such as bool|int, so that one of two different types can be used depending on the situation. One can argue a lot about the sense and nonsense of union types, but there are some use cases for them.
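
A minimal sketch of a union type declaration follows. The function name twice is hypothetical, and int|float is used here instead of bool|int simply because it makes for a shorter example.

```php
<?php
// Accepts either an int or a float; the result is one of the two as well.
function twice(int|float $number): int|float
{
    return $number * 2;
}

$fromInt = twice(21);      // stays an int
$fromFloat = twice(1.5);   // stays a float

try {
    // An array is neither int nor float, so the call is rejected.
    twice([]);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($fromInt, $fromFloat, $rejected);
```

Before PHP 8, a function like this had to leave the parameter undeclared and describe the accepted types in a comment only.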

The type mixed, also introduced in PHP 8, allows developers to make it clear in the code that the type cannot be specified more precisely, but that they have thought about the type. Without such an explicit declaration, it would be unclear whether the type declaration was merely forgotten.
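
The difference between an explicit mixed and a missing declaration can be sketched as follows. The function names describe and untyped are hypothetical; get_debug_type() and the reflection API are part of PHP 8.

```php
<?php
// "mixed" documents a deliberate decision: any value is acceptable here.
function describe(mixed $value): string
{
    return get_debug_type($value);
}

// No declaration at all: forgotten, or intentional? The code does not say.
function untyped($value)
{
    return $value;
}

$kinds = [describe(42), describe('text'), describe(null), describe([1, 2])];

// Reflection can tell the two cases apart.
$declared = (new ReflectionFunction('describe'))->getParameters()[0]->hasType();
$missing = (new ReflectionFunction('untyped'))->getParameters()[0]->hasType();

var_dump($kinds, $declared, $missing);
```

Both functions accept any value at runtime, but only the first one tells readers and static analysis tools that this was a conscious choice.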


"How to get ready for PHP 8?", a presentation by Sebastian Bergmann

PHP 8's Just-in-Time Compiler (JIT)

When Rasmus Lerdorf started working on what was to become PHP 25 years ago, he didn't think about implementing business logic in PHP. For him, both the original "Personal Home Page Tools" and their successor PHP/FI were merely a template engine that was supposed to integrate functionality written in C into a web server.

Up to and including version 3, PHP executed programs line by line. This was very slow, but nevertheless more and more functionality was implemented in PHP 3 itself instead of in C as originally intended. At that time the first libraries and frameworks, such as PHPLIB, were created. PHP 4 took the first step away from line-based interpretation of the program code. Instead, a compiler performed an implicit compilation step to convert the source code into bytecode, which was then executed in the next step. The first tools to optimize PHP bytecode and cache it between requests were developed during this time.

Support for real object-orientation was introduced in PHP 5.0. The implementation of namespaces was also originally planned for this version. However, shortly before the release, namespaces were deemed to be ill-conceived and thus they were removed from the language without further ado. Similarly, the support for scalar types in the declaration of parameters was removed. Namespaces reappeared in PHP 5.3, scalar type declarations in PHP 7.0. It is therefore not surprising that well-known projects like Drupal or WordPress, which were developed with PHP 4 at the time, still struggle to shed their strongly procedural past.

Programming languages such as C, C++, C# or Java, which require an explicit compilation step, do not generate an executable binary if the type system rules are violated. Because PHP's compilation step is implicit, such type errors lead to runtime errors in PHP. However, since PHP can also execute program code without type information, type-specific bytecode cannot always be generated. After all, the actual type is only known at runtime and can change at any time when the same unit of code is run again. What was true for HHVM also applies here: the more type information is available, the better the bytecode execution can be optimized.

PHP 8 goes one step further at this point and provides an optional tracing just-in-time compiler (Tracing JIT). This approach allows the PHP virtual machine to optimize the execution of a program at runtime. To do this, a linear sequence of frequently executed operations is recorded and translated into native machine code. From then on, these operations are executed directly by the CPU and no longer by the PHP interpreter. This saves system resources, of course.

PHP 8 is therefore even faster than PHP 7, but the performance gain of the Tracing JIT compiler is mainly seen with computationally intensive programs. The display of a Mandelbrot set is a good example: the computations required for this are performed ten times faster by PHP 8 compared to PHP 7. From a web programming point of view, however, Mandelbrot should be regarded as a synthetic benchmark. Nevertheless, this shows how a tracing just-in-time compiler can accelerate code execution. It is expected that further noticeable improvements will follow in future PHP versions.
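
To try this out, a computationally intensive workload along the following lines can be used. This is only a sketch and not the benchmark referred to above: the function names, the grid size, and the iteration limit are assumptions. The idea is to run the script once with the JIT disabled and once with it enabled, for example with the OPcache settings opcache.enable_cli=1, opcache.jit=tracing, and a non-zero opcache.jit_buffer_size, and to compare the timings on your own machine.

```php
<?php
// Counts how many iterations a point needs before it escapes (Mandelbrot test).
function mandelbrotIterations(float $re, float $im, int $maxIterations): int
{
    $x = 0.0;
    $y = 0.0;

    for ($i = 0; $i < $maxIterations; $i++) {
        if ($x * $x + $y * $y > 4.0) {
            return $i;
        }

        $next = $x * $x - $y * $y + $re;
        $y = 2.0 * $x * $y + $im;
        $x = $next;
    }

    return $maxIterations;
}

// Samples a square grid and sums the iteration counts as a simple workload.
function workload(int $size, int $maxIterations): int
{
    $total = 0;

    for ($row = 0; $row < $size; $row++) {
        for ($col = 0; $col < $size; $col++) {
            $re = -2.0 + 3.0 * $col / $size;
            $im = -1.5 + 3.0 * $row / $size;
            $total += mandelbrotIterations($re, $im, $maxIterations);
        }
    }

    return $total;
}

$start = hrtime(true);                       // nanoseconds, monotonic clock
$total = workload(200, 500);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("total iterations: %d, elapsed: %.1f ms\n", $total, $elapsedMs);
```

The inner loop consists of a tight, linear sequence of floating-point operations, which is the kind of code a tracing JIT is designed to record and translate into machine code.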

In the past, when such computationally intensive algorithms were needed in a PHP application, they had to be implemented in C as an extension for the PHP interpreter. The PHP project thus shows that software engineering as a discipline is today in a stage of development in which higher-level programming languages can achieve a performance that is comparable to that of low-level programming languages simply thanks to technical measures such as compiler optimizations.

Professionalization of the PHP Project

PHP was originally developed as a hobby project. This was at a time when HTML forms were not even standardized. Today, it is hard to imagine that the composition of an HTTP request from form data to be processed on the server side was once bleeding edge technology! Around the year 2000, PHP spread rapidly, although today it is hard to say whether PHP fueled the dot-com boom or vice versa. In the course of the dot-com boom and the following years, numerous applications and solutions written in PHP were developed. Taking into account the language's feature set at the time, the existing tooling, the infrastructure, and last but not least the developer know-how, PHP regularly pushed the limits of what was possible at that time a little further. Examples are Yahoo, Wikipedia or WordPress.com.

The PHP project itself has become extremely professional since the release of PHP 5. Back then, technical discussions on the mailing list were regularly very emotional, and essential features such as namespaces or scalar types were implemented "just like that" and, in case of doubt, removed from the software shortly before the release. There was no reliable release planning. Instead, work was done according to the "it's ready when it's ready" principle.

As a result, new versions of PHP found their way into the field only very slowly. The PHP community still suffers today from the fact that Red Hat shipped PHP version 5.1 in 2007 and provided patches and security fixes over the entire ten-year life cycle of Red Hat Enterprise Linux (RHEL) 5. By the way, the Extended Support for RHEL 5 is only now coming to an end. Not everything is as short-lived in today's IT as a front-end framework! While you may smile about it today, in times of containers, it was common practice a few years ago to have long and unproductive discussions with system administrators who simply refused to install newer PHP versions for liability reasons, because they were not officially supported by the operating system vendor.

Since version 5.4, PHP has had a predictable release cycle. In addition, there are well-documented timelines for active support and security support for each PHP version. As a result, new PHP versions get into the field much faster today, which will hopefully lead to more PHP users keeping their installations up to date in the near future.

New features and changes to the language are proposed and discussed via a defined RFC ("Request for Comments") process. In addition to the motivation for a change or extension, the expected effects, for example in the form of breaks in backward compatibility, must be explained and assessed. The members of the PHP project then publicly vote on these RFCs.

The Big Clean Up

As we have already mentioned, there is neither a requirement in PHP to declare types, nor an explicit compilation step that can detect problems before they become runtime errors. Before there were IDEs and testing tools for PHP, programming was like this: after making a change in a source file in the document root of a web server, you switched from the editor to a browser window where you pressed F5 and could see the result of the program execution directly.

Especially for career changers, who often get to PHP programming via web design, HTML and CSS, this is a pleasingly low entry hurdle. You don't have to worry about data types and special cases, but you have a first feeling of success. Both this way of working and such an entry into programming are not bad per se, but of course in the long run they do not meet the demands that are made on professional software development today.

Professional software development with PHP looks, of course, different. In the IDE, thanks to static code analysis, entire classes of errors are detected early on, even before the code is even executed. The first execution of the code takes place – as in all other programming languages – exclusively in the context of automated tests, of course. Yes, we smiled when we wrote this sentence.

PHP 8 supports this trend towards clean code in the PHP ecosystem because many problematic constructs are simply no longer allowed. Although PHP has issued various warnings ("Deprecation", "Notice", or "Warning") in the past, many teams have simply ignored them for years. Such teams won't be able to migrate to PHP 8 without effort, because the new runtime throws an exception in places where previously only warnings were issued. Developers who have always attached importance to an empty PHP logfile without warnings as well as automated tests can in most cases migrate to the new PHP version almost effortlessly.
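
Three such places can be sketched as follows. The helper thrownBy is hypothetical and exists only to capture the class of the exception. In PHP 7, each of the three operations merely emitted a warning and execution continued; in PHP 8, each one throws.

```php
<?php
// Runs a callable and reports the class of the exception it throws, if any.
function thrownBy(callable $operation): ?string
{
    try {
        $operation();
    } catch (Throwable $e) {
        return get_class($e);
    }

    return null;
}

$notAString = [];
$zero = 0;
$notNumeric = 'abc';

$results = [
    // Wrong argument type for an internal function.
    'strlen(array)' => thrownBy(fn () => strlen($notAString)),
    // Division by zero with the / operator.
    'division by zero' => thrownBy(fn () => 1 / $zero),
    // Arithmetic on a string that is not numeric.
    'non-numeric string' => thrownBy(fn () => $notNumeric * 2),
];

var_dump($results);
```

Code that relied on execution continuing after such a warning therefore has to be fixed before the migration, which is where an empty logfile and a test suite pay off.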

Like PHP 7, PHP 8 brings some changes in syntax and semantics. Even if the reason for this is not always obvious, these are not unplanned side effects of changes, but rather a necessary prerequisite for technical innovations and improvements to the runtime environment and are well documented as such. As a developer or maintainer of a PHP application that is still running on PHP 5 or even PHP 4, you cannot expect to get the migration to a current version of the language "for free" today.

On the other hand, the PHP project can be credited with the fact that despite the huge installed base, a clear migration path exists. What was sometimes seen from the outside as sticking to outdated principles and design decisions has in reality always been a very responsible approach to backwards compatibility. For other programming languages, Perl 6 and Python 3 being examples, the big migration did not go nearly as well.

Division through innovation?

In a way, the PHP community is split. On the one hand, the language itself stands for innovation and consistent development. Some well-known projects follow this path and synchronize their scheduling with the release cycles and support commitments of the PHP project.

On the other hand, there are vendors of operating systems, frameworks, and widely used applications that have a different perspective and focus on continuity for their users, for example by offering long-term support versions.

It is often claimed that such a situation, which is by no means specific to the PHP project, would lead to a fork in the long run: a split into two or more development branches that grow further and further apart over time. In fact, such forks have already existed several times in the form of alternative runtime environments such as WebSphere sMash from IBM or HHVM from Facebook. However, these have never prevailed against the original PHP.

PHP is often said to be a language that is not innovative and only copies and adapts ideas and features of other languages. It is certainly true that PHP learns from other languages, but the reverse is also true, and this is simply a manifestation of knowledge sharing as it happens in an Open Source world without patents. We believe that we have seen an increasing convergence of programming languages in recent years. The original hard boundaries, such as compiled versus interpreted languages or different type system philosophies, are increasingly being softened.

"Serverless" was already standard in the PHP world before this term was even coined. The deployment of a monolithic binary, which had to be countered by a whole microservice movement in other languages, has never been a problem in the PHP world. And PHP's often criticized Shared Nothing architecture has become commonplace for every developer with the spread of cloud computing for independent processing of HTTP requests.

Software users generally tend to focus their attention on just a few "eye candy" features. The sometimes immense efforts that have gone into the infrastructure are not seen. Although PHP 8, just like PHP 7, has some useful and helpful features, these are ultimately just syntactic sugar, which allows application developers to make their program code a bit more compact. This is the "eye candy" on the surface, which alone would justify an upgrade to PHP 8, but we have deliberately refrained from presenting it in this article.

PHP 8 is an important milestone in the development history of PHP, not only because of some nice features on the surface. An Open Source project that started as a hobby 25 years ago has completed the consistent redesign of both its technical foundation and its development processes. Even fundamental changes and innovations can now be implemented in the PHP language with low risk, even for new core contributors without years of project knowledge.