Hacking PHP 7

Recently, I have taken part in some screencasts with my good friends at 3devs.
 The subject of the screencasts is extension development for, and hacking, PHP 7 (Part 1, Part 2).
 Screencasting is a medium I haven’t mastered, or had very much practice at.
 While I’m trying to plan the content for the show, I can’t help but be reminded of every lecturer and tutor that stood in front of me repeating facts, sometimes literally reading from a book, or their own notes.
 As a rule of thumb, if someone says something to me, and my life does not depend on my retention of the information contained in the statement, I will immediately, and without prejudice to the speaker, forget what was said.

One of the first things we learn as children is our multiplication tables, and we first remember those which have a pattern, or some trick, to describe or determine the sequence of numbers. We don’t remember the sequence until we have uttered it at least a few hundred times. The same goes for the alphabet; we roughly remember a kind of melody, and get most of the letters in the second half of the alphabet wrong for quite some time.
 As education progresses, perhaps for reasons of logistical expediency, the focus shifts from inspiring us, with the beauty of the rainbow, to learn how the rainbow works, to barking facts at us about the past wars and rulers of our respective countries, and having us try to memorize various tables of information.
 This sucks, it sucks so hard: just when our minds become fully primed, and developed enough to really respond to inspiration, the inspiration is almost completely squashed out of our education process.
 Some may be lucky enough to have a better experience of education, but broadly speaking, until higher education anyway, this is how we were, and our children still are, “taught”.
 It is the job of the teacher to inspire listeners to learn for themselves; it is not the job of a teacher to bark facts. I’ve really tried to convey that in the material we prepared.
 Screencasting is a difficult medium, however, so this blog post is here to accompany the screencast.
 Writing extensions is fun, but it’s not as fun as hacking PHP. So we’re going to focus on hacking: we’re going to imagine that we are introducing some new language feature, by RFC.
 Without focusing on the RFC process itself, you need to know which parts of PHP you have to change in order to introduce new language features.
 You also need to know how PHP 7 works, about each stage of turning text into Zend opcodes …

In the Beginning: Lexing

When the interpreter is instructed to execute a PHP file, the first thing that happens is lexing, or if you prefer, lexical analysis.
 The lexer accepts a stream of characters as input, and emits a stream of tokens as output.
 The input to the lexer is the characters of the code. The output is those sequences that the lexer recognizes, identified in a useful way.
 The following function illustrates what a lexer, or lexical analysis, does:

function lexer($bytes /* , ... */) {
    // switch (true) picks the first case whose expression evaluates to true
    switch (true) {
        case substr($bytes, 0, 2) == "if":
            return TOKEN_IF;
    }
}

A lexer doesn’t actually work like that, but it’s enough to understand what it does; you can read the associated documentation, or source code, to discover how it does it.
 The lexer function itself is generated by software known as a “lexer generator”; the one that PHP uses is named re2c.
 The input file to the lexer generator is a file which contains a set of “rules” in a specific format.
 I’ll just take one illustrative excerpt from the input for the lexer generator:

<ST_IN_SCRIPTING>"if" { RETURN_TOKEN(T_IF); }

This can be roughly translated to “if we are in a scripting state, and find the sequence ‘if’, return the identifier T_IF”.
 What “scripting state” means becomes clear when you remember that PHP can be embedded in HTML: not all of a file the interpreter reads is executable PHP code.
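
To make the idea of a state concrete, here is a minimal hand-written sketch in C, the language the engine itself is written in. It is not the code re2c generates, and it is nowhere near what PHP’s real scanner does. The TOK_* and STATE_* names, scanner_t, next_token and count_if_tokens are all made up for illustration, and the open tag handling is deliberately simplified. All it shows is that the same two characters, “if”, only become an “if” token once the scanner has seen an open tag and switched into its scripting state.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token identifiers, loosely named after PHP's own. */
typedef enum {
    TOK_END, TOK_INLINE_HTML, TOK_OPEN_TAG, TOK_CLOSE_TAG, TOK_IF, TOK_OTHER
} token_t;

/* The only two states this sketch knows about. */
typedef enum { STATE_INITIAL, STATE_IN_SCRIPTING } state_t;

typedef struct {
    const char *cursor; /* next unread character */
    state_t state;      /* decides which rules apply */
} scanner_t;

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Return the next token and move the cursor past it. */
token_t next_token(scanner_t *s) {
    if (*s->cursor == '\0') return TOK_END;

    if (s->state == STATE_INITIAL) {
        /* Simplification: any "<?php" counts as an open tag. */
        if (strncmp(s->cursor, "<?php", 5) == 0) {
            s->cursor += 5;
            s->state = STATE_IN_SCRIPTING;
            return TOK_OPEN_TAG;
        }
        /* Everything up to the next open tag is inline HTML. */
        const char *open = strstr(s->cursor, "<?php");
        s->cursor = open ? open : s->cursor + strlen(s->cursor);
        return TOK_INLINE_HTML;
    }

    /* From here on we are in the scripting state. */
    while (isspace((unsigned char)*s->cursor)) s->cursor++;
    if (*s->cursor == '\0') return TOK_END;

    if (strncmp(s->cursor, "?>", 2) == 0) {
        s->cursor += 2;
        s->state = STATE_INITIAL;
        return TOK_CLOSE_TAG;
    }
    if (is_ident_char(*s->cursor)) {
        const char *start = s->cursor;
        while (is_ident_char(*s->cursor)) s->cursor++;
        size_t len = (size_t)(s->cursor - start);
        /* Only the exact word "if" is the keyword; "iffy" is not. */
        return (len == 2 && strncmp(start, "if", 2) == 0) ? TOK_IF : TOK_OTHER;
    }
    s->cursor++; /* any other single character */
    return TOK_OTHER;
}

/* Count how many "if" tokens a source string produces. */
int count_if_tokens(const char *source) {
    scanner_t s = { source, STATE_INITIAL };
    int count = 0;
    for (token_t t = next_token(&s); t != TOK_END; t = next_token(&s)) {
        if (t == TOK_IF) count++;
    }
    return count;
}
```

Feed it a string such as `<p>if</p><?php if ($a) { } ?>` and the “if” inside the paragraph tag is swallowed as inline HTML, while the one after the open tag comes out as an “if” token. That is the whole job of the state in this sketch: it selects which rules are allowed to match.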

Consuming Tokens: Parsing

Originally published at www.laravelfeed.com.
