The first step I took on was what the overall layout of the network was going to look like. Some of the questions I had to think through:
One of the first decisions I made here was that I wanted the network to be as wired as possible. I wanted the reliability, speed, and upgradability of having ethernet throughout the house and in most rooms. That dictated a lot of the decisions that follow, since I'd need to be running ethernet cabling.
The next problem to solve was how to run the cables through the 2-story house with an attic and a basement. Do I run everything from the basement to the rest of the house? Do I have a networking station on each floor? Do I put most of it in the attic? After looking at the floor plan considerably, there weren't really many first- and second-floor walls that both lined up well with each other and were in a good space to access from either above or below. In fact, I only found one wall that would be "easy" to access from all 4 floors (including basement and attic).
This settled me on a “tiered” network architecture. I would separate the upstairs and the downstairs into two distinct segments, and connect them with a “backbone” network layer. I then picked the location of the two network wiring closets. For the downstairs segment, I chose a wall that was adjacent to the garage, which would allow easy wiring to the garage and good access to the rest of the first floor. For the upstairs segment, I selected a tall closet in our future office. This had easy access to the attic, as well as good power and a central location.
So my overall layout was set. I’d have a network rack in the basement which housed my router, a switch for the downstairs network drops, and a switch in the office closet upstairs for the upstairs network drops.
The one final layout problem to solve was where to put the WiFi access points. This has more to do with the layout of the house than anything, so I won't go into much detail here. The one thing I did was to get a 100-foot ethernet cable and test different locations for each access point (using a mobile phone app). Once I settled on the locations, wiring them was very similar to wiring ethernet jacks to the network closets, so I won't go into much more detail here.
Now that I had my backbone physically located, I needed to decide how to connect the backbone. I could run ethernet between them, but I wanted to future-proof it as much as possible. Given that it wasn't a short run (about 150 feet, or 45 meters), I made the decision to run fiber for the backbone. I went through a few options, and settled on running two different fiber cables: one multi-mode OM4 and one single-mode OS2. Why not just go with two multi-mode or two single-mode? I couldn't decide, so I figured: why not one of each? I'd only really be using one at any point in time (leaving the other as a backup). It should also be sufficiently future-proof (OS2 with the right networking transceivers is capable of 400gbps+).
For the rest of the house, I wanted to be as future-proof as I could reasonably be, while not going totally overboard (so not running fiber to every room). So I settled on CAT6a cabling in a F/UTP configuration (an outer foil shield, but each pair is unshielded). This is a bit harder to work with (a thicker and less flexible cable), but should be solid for 10gbps ethernet if I ever want to upgrade to it. I made this decision because, at the end of the day, it's a ton of work to pull all that cable and I'd rather only do it once. I could have gone for CAT7, but the added expense didn't seem worth it considering that at the lengths I'm running, the bandwidth would likely be very similar (both 10gbps), with CAT7 only maybe theoretically supporting 100gbps on the shorter runs. So I got two 1000-foot spools of CAT6a F/UTP, and a bunch of keystone jacks.
For the patch panels, I found that there were few that supported CAT6a, and those that did were bulky and cumbersome. Then I found one by Trendnet. This turned out to work really well, so I got 2 and was off to the races.
In prior homes (and offices) I have had really good luck with UniFi networking gear, so I started with the assumption that I would use that for this house. I settled on the UniFi Dream Machine Pro for the "router", a set of 10gbps switches for the backbone (UniFi Switch 16XG), and PoE (Power over Ethernet) 1gbps switches for the rest.
I did look at the Mikrotik hardware, and while the price was amazing, I really wanted the central management ability of UniFi.
Before I talk through how I pulled cable, etc, I think it’s important to touch on what sorts of tools were needed for this. Here’s a rough list of the tools I used and would suggest you get as well:
As far as consumables, I won't list everything here, but these are the things that I found to be worthwhile over the "common" things you'll find:
Before we talk about the actual pulling, I think it's important to start with a simple rule: never pull just one cable. Pulling cable is a pain in the neck, and if you think you need one cable, pull two. If you think you need two, pull four. And if you think you need four, pull six. It's so much easier to pull while you're set up than to come back in a few years because a cable failed or your needs changed.
One technique that I accidentally found is to basically always go from the area of least access to the area of most access (or from hard to easy). For most of the first floor runs, that often meant going from the first floor down into the basement (as the basement was more open). Similarly with most of the upstairs runs (going up from the upstairs into the attic). However, there were some runs that were odd (that required going from hard to easy to hard, or easy to hard to easy). In those cases, I would often pull a nylon pull line in the "normal" direction (from hard to easy), do the same with the other part of the run (with a new nylon line), then tie the two together. This allowed the run to be continuous and still navigate some significant challenges.
For "normal" runs through walls, I always started with the stud finder. I found an approximate area on the wall where I wanted the network jacks. Then I would find where the studs were on either side, and mark them with masking tape (so as not to damage the paint). If I was going up (to the attic, for example), I would also run the stud finder up to the ceiling to see if there was a fire-block (a piece of wood blocking off the wall). If so, I would usually switch to a different stud bay and try to find one without a block. If you can't find another, then you'll have to drill through.
There are a ton of YouTube videos on drilling and running wires through walls, so I won’t go too in-depth here. A few pieces of advice though:
Take your time. You’re drilling blind, and there are multiple things behind the drywall that don’t react well to drills (ducts, electrical wiring, water pipes, etc). Before you cut into the wall, do your best to check for air ducts and electrical wires. Once you cut into the wall, use the borescope to look inside the wall before every step you take. Sometimes you’ll find pre-existing holes you can re-use if you get the angle just right (the fish rod comes in real handy with that). When you do drill, go in bursts and inspect whenever you feel a breakthrough. One of the times I did, I found a set of copper water lines running right under the hole. Had I kept pushing, I’d have had a flooded house…
Leverage gravity whenever you can. Going from top down is often FAR easier than trying to go bottom up (though not always). One trick I found really handy here is to drill a tiny hole in the ceiling next to the wall where I want the cable to pass into the attic. Then, I’d push a piece of small wire (like 1/8” or 3mm in diameter) up. That way, when I find the wire in the attic, I am able to locate where to drill (after offsetting to the wall). Then I can drill my normal 3/4 inch hole in the top of the wall, and push the fish rod down into the wall and grab it from the hole.
Plan. Plan. And plan some more. Find reference points on each floor that you can measure from, so that you can understand where you are operating on each floor and visualize the run in 3D. For example, in the basement I would find air vents that go up into the wall, and then use the position of the vent to measure offsets.
Label your cables before you pull them. I pulled two cables at a time from 2 spools. I labeled one of the spools "A" and the other "B". Before pulling, I'd label the end of each cable with the room name and cable number for that room (ex: Bedroom 1 and Bedroom 2). Then after pulling, when I cut the cable from the spool, I'd label the cut end. It's easy to keep track of, because the "A" spool always has an odd cable number ("Bedroom 1", "Bedroom 3", etc), and the "B" spool always has an even cable number. Label when you punch the cables into the patch panel as well.
Test as you go. When you run a cable, punch it in and terminate it at both sides, then test it. If you test every one as you install it, you’ll never have an untested cable (and risk damaging equipment you plug in).
Sometimes you will need two people to complete a run. Some cable runs are just too tricky to do by yourself, so having someone feeding cable up who can pull/push as needed can make life a lot easier.
When using a spool of cable, always rotate the spool to feed cable off it; if you just pull cable off the end, you'll put a twist in the cable and end up with kinks. Get a wooden rod or some other method to unspool the cable. Initially I 3D printed some rollers, but they weren't the best. Eventually I just moved to using an old closet rod to unspool.
Always leave yourself slack on both ends when pulling cable. At the wall end, I normally would leave enough for a small loop inside the wall, to allow me to work a foot or two outside the wall when installing the keystones, and in case I made a mistake and had to cut and re-terminate.
Neatness counts. Get yourself a lot of velcro strips, and bundle the cables together as you go. It’s much easier to build up a neat bundle as you go rather than after the fact. Here’s an example of how the upstairs network closet came out:
Finally, don’t get frustrated. If you encounter a major obstacle, take some time and reset. Think through the problem and find a solution. Don’t be afraid to change your plans. There’s no need to get too flustered. As an example, I ran into heavily insulated interior walls when I was wiring the upstairs. It was such a pain in the neck I almost stopped and called a company to come finish. After thinking through it for a bit, I decided to try to come down from the attic and while it wasn’t trivial it was way easier.
One REALLY important note: every time you drill a new hole in the floor or ceiling stud header, you potentially compromise the fire rating of the wall. Know your local building code and how to keep safe. For mine, it involved sealing those holes with fire block (a sort of clay that you mold to take the space).
Setting up the UniFi system is fairly straightforward, and I won't go into it too much here. I do want to touch on the VLAN setup though. When I designed the network, I wanted to be prepped for multiple levels of trust for the devices on the network. So I decided on a multi-VLAN setup to segregate the network as much as possible. This won't be the full setup, but should give you an idea:
VLAN ID | IP Subnet | WiFi? | Purpose |
---|---|---|---|
1 | 10.1.0.0/24 | No | Network Services (DNS, etc) |
10 | 10.10.0.0/24 | No | All network management traffic, switches, etc |
20 | 10.20.0.0/23 | Yes | All "trusted" devices (desktops, laptops, phones, etc) |
30 | 10.30.0.0/24 | No | Servers, network file storage, etc |
40 | 10.40.0.0/24 | Yes | IoT Control devices (IoT devices which need to access other IoT devices) |
50 | 10.50.0.0/23 | Yes | IoT Devices (IoT devices which don't need to access other IoT devices) |
60 | 10.60.0.0/23 | Yes | Guest network |
Why so many VLANs? Well, I wanted to set up aggressive firewall rules. Each of these VLANs has separate firewall rules to limit what it can see to the minimum necessary to function.
For example, there’s no reason a smart light bulb should ever need to talk to any other device on the network. Therefore I have a firewall rule for VLAN 50 that the only allowed traffic is to the internet. They can’t see or ping the local network at all.
Some IoT devices do need to see others on the network to function well, so those are on the IoT Control VLAN. This allows them to see each other, and all devices on the IoT VLAN, but nothing else.
Servers can see all IoT devices, but nothing else.
All trusted devices can see everything and access everything.
And so on.
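To make the intent concrete, here's a sketch of the rule set for the isolated IoT VLAN. These are illustrative pseudo-rules (not UniFi's actual rule syntax), and the ordering is my own:

```
# VLAN 50 (isolated IoT) - rules evaluated top to bottom
allow  src 10.50.0.0/23  dst WAN            # internet access only
allow  established/related return traffic   # replies to flows allowed above
deny   src 10.50.0.0/23  dst 10.0.0.0/8     # block everything on the local VLANs
```

The other VLANs follow the same pattern, with progressively wider "allow" lists as the trust level goes up.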
Once the physical and the virtual networks were setup, then came a huge bit of work to get the services setup to monitor, manage, and automate everything I wanted to. This will need to be another post.
Since I'm going to be talking a lot about compilers and components in this post, I figure it's good to start with a primer on how they work, and how the different types behave.
Let's start by talking about the 3 main categories of how programs are executed (there are definitely some blurred lines here, and you'll hear people using these labels to refer to multiple different things, but these are the definitions for the purposes of this post):
Interpreted: The vast majority of dynamic languages use a Virtual Machine of some sort. PHP, Python (CPython), Ruby, and many others are interpreted using a Virtual Machine.
A VM is - at its most abstract level - a giant switch statement inside of a loop. The language parses and compiles the source code into a form of Intermediary Representation often called Opcodes or ByteCode.
The prime advantage of a VM is that it’s simpler to build for dynamic languages, and removes the “waiting for code to compile” step.
Compiled: The vast majority of what we think of as static languages are “Ahead Of Time” (AOT) Compiled directly to native machine code. C, Go, Rust, and many many others use an AOT compiler.
AOT basically means that the full compilation process happens as a whole, ahead of when you want to run the code. So you compile it, and then some time later you can execute it.
The prime advantage of AOT compilation is that it can generate very efficient code. The (prime) downside is that it can take a long time to compile code.
Just In Time (JIT): JIT is a relatively recently popularized method to get the best of both worlds (VM and AOT). Lua, Java, JavaScript, Python (via PyPy), HHVM, PHP 8, and many others use a JIT compiler.
A JIT is basically just a combination of a VM and an AOT compiler. Instead of compiling the full program at once, it instead runs the code on a Virtual Machine for a while. It does this for two reasons: to figure out which parts of the code are “hot” (and hence most useful to be in machine code), and to collect some runtime information about the code (what types are commonly used, etc). Then, it pauses execution for a moment to compile just that small bit of code to machine code before resuming execution. A JIT runtime will bounce back and forth between interpreted code and native compiled code.
The prime advantage of JIT compilation is that it balances the fast deployment cycle of a VM with the potential for AOT-like performance for some use-cases. But it is also insanely complicated since you’re building 2 full compilers, and an interface between them.
Another way of saying this is that an Interpreter runs code, whereas an AOT compiler generates machine code which the computer then runs. And a JIT compiler runs the code, but every once in a while translates some of the running code into machine code and then executes it.
I just used the word "Compiler" a lot (along with a ton of other words), but each of these words has many different meanings, so it's worth talking a bit about that:
Compiler: The meaning of “Compiler” changes depending on what you’re talking about:
When you’re talking about building language runtimes (aka: compilers), a Compiler is a program that translates code from one language into another with different semantics (there’s a conversion step, it isn’t just a representation). It could be from PHP to Opcode, it could be from C to an Intermediary Representation. It could be from Assembly to Machine Code, it could be from a regular expression to machine code. Yes, PHP 7.0 includes a compiler to compile from PHP source code to Opcodes.
When you’re talking about using language runtimes (aka: compilers), a Compiler is usually implied to be a specific set of programs that convert the original source code into machine code. It’s worth noting that a “Compiler” (like gcc for example) is normally made up of several smaller compilers that chain together to transform the source code.
Yes, it’s confusing…
Virtual Machine (VM): I mentioned above that a VM is a giant switch statement inside of a loop. To understand why it’s called a “Virtual” machine, let’s talk for a second about how a real physical CPU works.
A real machine executes instructions that are encoded as 0's and 1's. Those instructions can be represented as assembly code:

```
incq %rsi
addq $2, %rsi
```

This basically adds 1 to the `rsi` register, then adds 2 to it.
Compare this to the PHP opcodes for the “same” operations:
```
POST_INC !0
ASSIGN_ADD !0, 2
```
Aside from naming conventions, they are basically conceptually the same. The PHP OpCodes are the building block instructions for the PHP VM, just like assembly is the building block instructions for a CPU.
The difference is that assembly instructions are very low level and there are relatively few of them, whereas PHP's VM OpCode instructions have more logic built in. An example: the `incq` assembly instruction expects its argument to be an integer, while PHP's `POST_INC` instruction contains all of the logic necessary to convert the argument to an integer first. There's a LOT more logic in the PHP VM, which is what makes PHP (and any interpreted language) possible, and why interpreted languages often use one.
Parser: A parser is very similar to a compiler, but it doesn't translate the source code; it just changes the representation. This can be from text (the source code that you write) into an internal data structure (such as a tree or a graph).
Abstract Syntax Tree (AST): An AST is an internal data structure that represents the source code of a program as a tree. So instead of `$a = $b + $c;` you get something like `Assign($a, Add($b, $c))`. The key property that makes it a tree is that every node has exactly one parent. PHP internally parses from the source file into an AST before compiling to Opcodes.
Given the following code:
```php
$a = $b + $c;
echo $a;
```
We could expect an AST to look something like:
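(The original post shows this as a diagram; as a rough text rendering, with node names that are illustrative rather than PHP's actual internal AST node names, it might look like this:)

```
StmtList
├── Assign
│   ├── Var($a)
│   └── Add
│       ├── Var($b)
│       └── Var($c)
└── Echo
    └── Var($a)
```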
Control Flow Graph (CFG): A CFG looks a lot like an AST, but instead of each node having one parent, nodes can have multiple. This is because a CFG includes edges for loops, etc., such that you can see all the possible ways control can flow through code. PHP Opcache's Optimizer uses a CFG internally.
Given the following PHP code:
```php
function (int $a): int {
    $value = 1;
    if ($a > 0) {
        $value = $value + 1;
    }
    $result = $a + $value;
    return $result;
}
```
We could expect a CFG to look something like:
In this case, `long` basically means a PHP integer, `numeric` means either an integer or float, and `jumpz` means goto a different instruction based on whether `bool_21` is `0` or not.
Notice how we can see the different paths the code can take. This is the same reason a compiler internally uses a CFG. But instead of an image, it’s done using data structures.
Intermediary Representation (IR): An IR is basically a programming language that lives entirely within a compiler. You never write in an IR, instead letting it be generated for you. The reason for the IR though, is so that the compiler can manipulate it (for example, to implement optimizations) as well as keeping components of the compiler separate (and hence easier to maintain). The AST and CFG structures above are forms of IR.
My first attempt at running PHP on top of PHP was with PHPPHP way back in 2013. The project attempted to “translate” php-src from C into PHP. It was never designed to run “fast” (fast in quotes, since it’s about 200x slower than PHP and had no real way of going faster). It was done for fun, and mainly as a joke/interesting teaching toy.
About a year and a half later I built Recki-CT. It used a different model. Rather than re-implementing PHP in PHP, I built a multi-stage compiler. It would parse PHP into an AST, convert the AST into a CFG, perform some optimizations, and then emit code using a backend. I built two primary backends for it, one to compile to a PECL extension, and one using JitFu to execute it directly, compiling just in time and executing as native machine code. This approach worked quite well, but wasn’t really practical for a few reasons.
A few years later, I picked the idea back up, and instead of building a single monolithic project, decided to build out a series of related projects for parsing and analyzing PHP. PHP-CFG implemented the CFG parsing. PHP-Types implemented a type inference system. PHP-Optimizer implemented a basic set of optimizations on top of the CFG. These tools were designed to be incorporated into other projects for different usages. For example, Tuli was an early static analyzer for PHP code. And PHP-Compiler was a poor attempt at compiling PHP into lower-level code, and never really went anywhere.
The biggest challenge I faced to making a useful, low level compiler was the availability (or lack) of a suitable backend. libjit (which JitFu used) was good and fast, but it couldn’t generate binaries. I could have written a c extension binding to LLVM (which is what HHVM used, among many many others), but that’s a TON of work and I didn’t feel like going down those paths. So on the shelf the projects went.
No. PHP 7.4 is not out yet. It won’t be out for likely at least 6 months. But a few months ago, a little RFC was accepted to incorporate an FFI Extension into PHP. I decided to start playing around with it to see how it worked.
After a bit of playing, I remembered my old compiler projects. And I started wondering how hard it would be to pull libjit in to PHP. But then I remembered the fact it couldn’t generate executable files. And so I started searching to see what else was out there. I stumbled upon libgccjit. And then the rabbit hole went down and down and down.
Let’s take a look at all of the new projects I’ve been working on over the past few months:
My first step was to code generate a wrapper around libgccjit. FFI requires a C declaration file similar to a header file, but can’t handle C pre-processor macros. If that sentence doesn’t make sense, just know that every library comes with one or more “header” files which describe the functions, and FFI needs a cut-down version of that header file.
I didn’t feel like hand-editing a few hundred function declarations, and a bunch of type code. So I decided to build a library to do it for me.
Enter FFIMe.
The project started as a C pre-processor to “compile” header files into a form good enough for FFI. This got me started.
After a month or so of work, I realized I needed more. I couldn’t just pre-process the headers, I needed to actually parse them. So after some significant refactoring, FFIMe can now code-generate helpers for use with FFI. It’s by no means perfect and complete, but it is plenty good enough for my purposes so far.
```php
$ffime = new FFIMe\FFIMe('/usr/lib/x86_64-linux-gnu/libgccjit.so.0');
$ffime->include('libgccjit.h');
$ffime->codegen('libgccjit\\libgccjit', __DIR__ . '/libgccjit.php');
```
Basically, it takes a path to a Shared Object file, and then one or more `#include` directives. It parses the resulting C, eliminates any code not really compatible with FFI, and then code-generates a class (well, a lot of them). The generated file can now be committed (here's the one from the above example).
If you take a peek at that file, you'll see a BUNCH of code (nearly 5000 lines). Included are all numeric `#define`s from the C headers as class constants, all `enum`s as class constants, all functions, and wrapper classes around all of the underlying C types. It also includes all other headers recursively (hence why the above header has some seemingly unrelated file functions).
Usage is pretty straightforward (ignore what the library is doing, just focus on the types and the call, and compare to the equivalent C code):
```php
$lib = new libgccjit\libgccjit;

// gcc_jit_context *context = gcc_jit_context_acquire();
$context = $lib->gcc_jit_context_acquire();

// gcc_jit_context_release(context);
$lib->gcc_jit_context_release($context);
```
Now, we can work with C libraries in PHP, just like they were C! Woot!
It's worth noting that while I did run into a few rough edges with FFI (which have all since been fixed), it's fairly straightforward to work with. Definitely easier than some other dark corners of PHP (cough Streams cough). Dmitry did a nice job with it.
When I did the refactoring of FFIMe, I decided to build a full-blown C parser. This is basically the same thing as Nikita's PHPParser, but for C instead of PHP.
Not all C syntax is supported yet, but it does use a standard C grammar, so it’s theoretically able to parse everything.
It does this by first running a C pre-processor on the included files. This will resolve all normal directives like `#include`, `#define`, `#ifdef`, etc. From there, it parses the code into an AST (inspired by CLANG's).
So, for example, the following C code:
#include "includes_and_typedefs.h"#ifdef TEST_FLAGtypedef int A;#elsetypedef int B;#endif
And `includes_and_typedefs.h`:

```c
#define TEST_FLAG
typedef int TEST;
```
Will result in the following Abstract Syntax Tree:
```
TranslationUnitDecl
  declarations: [
    Decl_NamedDecl_TypeDecl_TypedefNameDecl_TypedefDecl
      name: "TEST"
      type: Type_BuiltinType
        name: "int"
    Decl_NamedDecl_TypeDecl_TypedefNameDecl_TypedefDecl
      name: "A"
      type: Type_BuiltinType
        name: "int"
  ]
```
The CamelCase names are the classnames of the objects, and the lowercase ones are the names of the properties of those objects. So the outer object here is a `PHPCParser\Node\TranslationUnitDecl` object, which has an array property `declarations`. Etc…
It’s probably rare that people will need to parse C code in PHP, so I imagine the uses of this library are going to be pretty constrained to FFIMe. But if you have a use for it, run with it!
I picked back up the PHP-Compiler project, and ran with it. This time, I was adding a few stages to the compiler. Rather than compiling directly from the CFG to native code, I decided to implement a Virtual Machine interpreter (which is basically how PHP works). This is the approach that I took with PHPPHP but WAY more mature. But instead of stopping there, I also built a compiler that can take the virtual machine opcodes and generate out native machine code. This enables true JIT (Just In Time) Compilation.
But beyond JIT, it also enables AOT (Ahead of Time) Compilation. So not only can I run or compile while running, but I can also give it a codebase and have it generate out a native machine code binary.
This means that I can (in theory right now) eventually compile the compiler itself to native code. Which has a shot at making the interpreter side reasonably fast (no idea if it will be anywhere close to PHP7’s speed, but I can hope). And as long as the compilation step can be reasonably quick, this has a shot at not only implementing PHP in PHP, but also being insanely fast while doing it.
I started building PHP-Compiler on top of libgccjit, and the initial results are more than promising. A simple set of benchmarks taken from PHP’s own benchmark suite show that while there’s a LOT of overhead right now, the compiled code can really shine.
The following benchmarks compare PHP-Compiler to PHP 7.4 (with and without OpCache / Zend Optimizer) and to PHP 8 with its experimental JIT enabled and disabled.
Test Name | 7.4 (s) | 7.4.NO.OPCACHE (s) | 8.JIT (s) | 8.NOJIT (s) | bin/jit.php (s) | bin/compile.php (s) | compiled time (s) |
---|---|---|---|---|---|---|---|
Ack(3,10) | 1.1752 | 1.9196 | 0.6796 | 1.1634 | 0.5025 | 0.2939 | 0.2127 |
Ack(3,8) | 0.0973 | 0.1215 | 0.0534 | 0.0853 | 0.3053 | 0.2943 | 0.0148 |
Ack(3,9) | 0.3018 | 0.3730 | 0.1776 | 0.3010 | 0.3458 | 0.2937 | 0.0540 |
array_access | 2.5958 | 2.6941 | 1.6697 | 2.6075 | 0.5495 | 0.2936 | 0.2685 |
fibo(30) | 0.0760 | 0.1035 | 0.0429 | 0.0743 | 0.3065 | 0.2946 | 0.0110 |
mandelbrot | 0.0434 | 0.1090 | 0.0323 | 0.0440 | 0.3186 | 0.3075 | 0.0146 |
simple | 0.0650 | 0.0866 | 0.0391 | 0.0673 | 0.3094 | 0.2988 | 0.0120 |
As you can see, the startup penalty is really heavy (it’s all in PHP remember). The compiled code though (both in the JIT and AOT modes) is significantly faster than 8 with JIT compilation for extremely heavy use-cases.
It’s worth noting, that this is absolutely not an apples-to-apples comparison, and I wouldn’t expect the same numbers in a production-ready system. But it does give an indication as to the promise of such an approach…
Currently, there are 4 commands that you can use:
- `php bin/vm.php` - Run code in a VM
- `php bin/jit.php` - Compile all code, and then run it
- `php bin/compile.php` - Compile all code, and output a `.o` file
- `php bin/print.php` - Compile and output the CFG and the generated OpCodes (useful for debugging)

And it runs just like PHP on the command line:
```
me@local:~$ php bin/jit.php -r 'echo "Hello World\n";'
Hello World
```
Yes, the `echo "Hello World\n";` is running as native machine code there. Overkill? Definitely. Fun? Amazing!
You can see more here in the readme.
I paused building because of a question: is it worth continuing with libgccjit, or would I be better off with LLVM?
Well, there’s only one way to find out…
As you’ve likely seen, I’m not great at naming things…
PHP-Compiler-Toolkit is an abstraction layer on top of libjit, libgccjit, and llvm.
Basically, you "build" C-like code into a custom Intermediary Representation using a PHP native interface. For example (note: `long long` is a 64-bit integer, just like PHP's `int` type):
```c
long long add(long long a, long long b) {
    return a + b;
}
```
Could be built as:
```php
use PHPCompilerToolkit\Context;
use PHPCompilerToolkit\Builder\GlobalBuilder;
use PHPCompilerToolkit\IR\Parameter;

$context = new Context;
$builder = new GlobalBuilder($context);

// First, let's get a reference to the type we want to use:
$type = $builder->type()->long_long();

// Next, we need to create the function:
$func = $builder->createFunction(
    'add',                     // The function's name
    $type,                     // The return type of the function
    false,                     // Is the function variadic?
    new Parameter($type, 'a'), // Argument 0
    new Parameter($type, 'b')  // Argument 1
);

// We need a block in the function (blocks contain code)
$main = $func->createBlock('main');

// Now, we add the two arguments
$result = $main->add($func->arg(0), $func->arg(1));

// We want the block to return the result of addition of the two args:
$main->returnValue($result);
```
This “describes” the code. From there, we can pass the context into a Backend to compile:
```php
use PHPCompilerToolkit\Backend;

$libjit = new Backend\LIBJIT;
$libgccjit = new Backend\LIBGCCJIT;
$llvm = new Backend\LLVM;

// Compile using libjit with full optimizations: -O3
$result = $libjit->compile($context, Backend::O3);
```
And then just grab a callable:
```php
$cb = $result->getCallable('add');
var_dump($cb(1, 2)); // int(3)
```
And that’s pure native code.
This allows me to build the frontend (PHP-Compiler) against this abstraction, and then swap out backends for testing.
It turns out, it was a good idea to test, because the initial looks show how slow libgccjit is in this setup. Compilation times:
Backend | Compile Time | RunTime (1,000,000 runs) |
---|---|---|
libjit | 0.000611066818237 | 0.12596678733826 |
libgccjit | 0.026333808898926 | 0.12308621406555 |
llvm | 0.000663995742797 | 0.12417387962341 |
So while all three have comparable runtimes, libgccjit’s compile time is off the charts. This suggests there may be some truth to preferring LLVM instead…
Oh, and for such a simple function, the overhead of FFI is substantial: a PHP version of the same code runs in about 0.02524 seconds.
But to demonstrate that it’s potentially much faster than PHP, imagine a benchmark like:
```php
function add(int $a, int $b): int {
    return $a + $b;
}

function add100(int $a, int $b): int {
    $a = add($a, $b);
    $a = add($a, $b);
    $a = add($a, $b);
    // ... snip, 100 of these in total
    $a = add($a, $b);
    return $a;
}
```
In native PHP, that would take approximately 2.5 seconds to run 1 million times. Not exactly slow, but not insanely fast either. Using PHP-Compiler however, we see:
| Backend | Compile Time | Runtime (1,000,000 runs) |
| --- | --- | --- |
| libjit | 0.000905990600585 | 0.31614589691162 |
| libgccjit | 0.036949872970581 | 0.34037208557129 |
| llvm | 0.000712156295776 | 0.26515483856201 |
So with that contrived example we can see a 10x performance boost over native PHP 7.4.
You can see this example, as well as the compiled code via the examples folder of php-compiler-toolkit
The PHP-LLVM project was created after PHP-Compiler-Toolkit. The Toolkit experiments showed there’s no real benefit to libgccjit over LLVM, while LLVM brings both performance and feature advantages, so I decided to switch PHP-Compiler straight to LLVM.
So rather than using the LLVM C API directly, I built a thin layer on top of it. This does two things. First, it presents a more “object oriented” API (to get the type of a value, call `$value->typeOf()` rather than `LLVMGetType($value)`). Second, it allows me to abstract over different versions of LLVM. That way, ideally, support can be added for multiple LLVM versions, with capability checking to determine what each supports.
Finally, due to a few bugs in LLVM, I needed a way to see which symbols were actually compiled into LLVM. That meant inspecting the shared object file (`.so`) containing the compiled LLVM library. To do that, I built PHP-ELF-SymbolResolver, which parses ELF-format files and extracts the symbols they declare.
I doubt there will be much need for this project outside of FFIMe, but maybe someone else will need to decode a native OS library in PHP someday. If so, here’s your lib!
While porting PHP-Compiler to use PHP-LLVM, it became apparent that code generation using a “builder” API gets verbose quickly. It becomes write-only code. For example, take the relatively “simple” builtin function `__string__alloc`, which allocates a new internal string structure. Using a builder API, it would look something like:
```php
$fn = $this->context->context->addFunction(
    '__string__alloc',
    $this->context->context->functionType(
        $this->context->getTypeFromString('__string__*'),
        false,
        $this->context->getTypeFromString('int64') // size
    )
);
$this->context->functions['__string__alloc'] = $fn;
$block = $fn->appendBasicBlock('main');
$this->context->builder->positionAtEnd($block);
$size = $fn->getParam(0);
$allocSize = $this->context->builder->addNoSignedWrap($size, $size->typeOf()->constInt(1, false));
$type = $this->context->getTypeFromString('__string__');
$struct = $this->context->memory->mallocWithExtra($type, $size);
$offset = $this->context->structFieldMap[$struct->typeOf()->getElementType()->getName()]['length'];
$this->context->builder->store(
    $size,
    $this->context->builder->structGep($struct, $offset)
);
$offset = $this->context->structFieldMap[$struct->typeOf()->getElementType()->getName()]['value'];
$char = $this->context->builder->structGep($struct, $offset);
$this->context->intrinsic->memset(
    $char,
    $this->context->context->int8Type()->constInt(0, false),
    $allocSize,
    false
);
$ref = $this->context->builder->pointerCast(
    $struct,
    $this->context->getTypeFromString('__ref__virtual*')
);
$typeinfo = $this->context->getTypeFromString('int32')->constInt(
    Refcount::TYPE_INFO_TYPE_STRING | Refcount::TYPE_INFO_REFCOUNTED,
    false
);
$this->context->builder->call(
    $this->context->lookupFunction('__ref__init'),
    $typeinfo,
    $ref
);
$this->context->builder->returnValue($struct);
$this->context->builder->clearInsertionPosition();
```
That’s a wall of garbage. It’s somewhat readable, but good luck actually working with it.
So instead, I built a macro system using PreProcess.io and Yay. So now, the same code looks like:
```
declare {
    inline function __string__alloc(int64): __string__*;
}
compile {
    function __string__alloc($size) {
        $allocSize = $size + 1;
        $struct = malloc __string__ $size;
        $struct->length = $size;
        $char = &$struct->value;
        memset $char 0 $allocSize;
        $ref = (__ref__virtual*) $struct;
        $typeinfo = (int32) Refcount::TYPE_INFO_TYPE_STRING | Refcount::TYPE_INFO_REFCOUNTED;
        __ref__init($typeinfo, $ref);
        return $struct;
    }
}
```
Way more readable. It’s a mix of C and PHP syntax, and highly tailored to the needs of PHP-Compiler.
The macro language is semi-documented here.
And if you’re curious about the implementation, check out src/macros.yay.
If you’re worried about performance, you should be. These macros take a while to process (about 1 second per file). However, there are two methods to combat this.
First, it will only pre-process at all if you’ve installed PHP-Compiler with dev dependencies (via Composer). Otherwise, it will just load the already-compiled PHP files.
Second, even with dev dependencies installed, it will only pre-process “on the fly” if you’ve changed a `.pre` file.
So in the end, the overhead is light in dev mode and nonexistent in production mode.
First, install PHP 7.4 with the FFI extension enabled. There are no releases yet as far as I’m aware (and it’ll be quite some time until there is).
For FFIMe, declare it as a Composer dev-dependency (`"ircmaxell/ffime": "dev-master"`), and run the code generator via a `rebuild.php`-style file. For example, the `rebuild.php` that PHP-Compiler-Toolkit uses contains something like this:
```php
<?php

require __DIR__ . '/../vendor/autoload.php';

$ffi = new FFIMe\FFIMe('/opt/lib/libjit.so.0');
$ffi->include('/opt/include/jit/jit.h');
$ffi->include('/opt/include/jit/jit-dump.h');
$ffi->codegen('libjit\\libjit', __DIR__ . '/libjit.php');

// Or with a fluent interface:
(new FFIMe\FFIMe('/usr/lib/x86_64-linux-gnu/libgccjit.so.0'))
    ->include('libgccjit.h')
    ->codegen('libgccjit\\libgccjit', __DIR__ . '/libgccjit.php');
```
Then commit the generated files. I suggest including the generated file via Composer’s `files` keyword instead of autoloading, because it’ll generate a TON of classes into that single file.
Replace the `...so.0` string with the path to the shared library you want to load, and the `.h` file with the header(s) you want to parse (you can call `->include()` multiple times).
I’d suggest playing with it, and opening Github issues for anything that you don’t understand or like. I won’t release it as stable until more people than me use it (and there are some tests/CI set up).
PHP-Compiler is in a really fluid state right now. So expect things to break. With that said:
First, install the dependencies (you can use LLVM 4.0, 7, 8, or 9):
```shell
me@local:~$ sudo apt-get install llvm-4.0-dev clang-4.0
me@local:~$ composer install
```
Now you’re set, just run it. You can pass code on the CLI via the `-r` argument:
```shell
me@local:~$ php bin/jit.php -r 'echo "Hello World\n";'
Hello World
```
And you can specify a file:
me@local:~$ php bin/vm.php test.php
When compiling using `bin/compile.php`, you can also specify an “output file” with `-o` (this defaults to the input file with `.php` removed). This will generate an executable binary on your system, ready to run:
```shell
me@local:~$ php bin/compile.php -o other test.php
me@local:~$ ./other
Hello World
```
Or, using the default:
```shell
me@local:~$ php bin/compile.php test.php
me@local:~$ ./test
Hello World
```
As far as what’s supported, that’s going to be changing pretty rapidly. Code that works today may not work next week. And the subset of supported PHP is really limited today…
For convenience, two Docker images are published for PHP-Compiler. Both are currently based on an older version of Ubuntu (16.04) due to some issues with PHP-C-Parser that I haven’t gotten around to yet. But you can download and play with them:
- `ircmaxell/php-compiler:16.04` - A fully functioning compiler, installed and configured with everything you’d need to run it.
- `ircmaxell/php-compiler:16.04-dev` - The development dependencies only. This is designed to work with your own checkout of PHP-Compiler, so that you can develop it in a consistent environment.
To run some code:
```shell
me@local:~$ docker run ircmaxell/php-compiler:16.04 -r 'echo "Hello World\n";'
Hello World
```
This will run with `bin/jit.php` by default. If you want to run with a different entrypoint, you can change it:
```shell
me@local:~$ docker run --entrypoint php ircmaxell/php-compiler:16.04 bin/print.php -r 'echo "Hello World\n";'
Control Flow Graph:
Block#1
    Terminal_Echo
        expr: LITERAL<inferred:string>('Hello World
')
    Terminal_Return
OpCodes:
block_0:
    TYPE_ECHO(LITERAL('Hello World
'), null, null)
    TYPE_RETURN_VOID(null, null, null)
```
Oh, and if you want to “ship” compiled code, then you can do that by extending the dockerfile. For example:
```dockerfile
FROM ircmaxell/php-compiler:16.04
WORKDIR app
COPY index.php /app/index.php
RUN php /compiler/bin/compile.php -o /app/index /app/index.php
ENTRYPOINT '/app/index'
CMD ''
```
When you run `docker build`, it will compile the code in `index.php` and generate a native machine-code binary at `/app/index`. That binary will then be executed when you run `docker run ...` (note: this isn’t designed for production use, since the container ships with a ton of extra tooling; it’s more a demonstration of how the process could work).
Now that PHP-Compiler supports LLVM, work can continue building out more support for the language. There’s still a bunch to do (like Arrays, Objects, untyped variables, error handling, standard library, etc), so :D. There’s a ton that needs to be done in PHP-CFG and PHP-Types as well, including support for exceptions and references as well as fixing a couple of bug cases.
Oh, and tests are needed. Like a lot of them. And testers. Try it out, break it (it’s easy, I promise), and then submit an issue.
Can has tests pls?
```html
<script type="text/javascript">
    var FOO = "<%= raw whatever %>";
    ReactDOM.render(<Blah foo={window.FOO} />, document.getElementById('some_place'));
</script>
```
This is a pretty straightforward vulnerability: passing `"; alert(1); "` for `whatever` will result in the code being rendered as `var FOO = ""; alert(1); "";`, which isn’t good.
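The failure mode is plain string interpolation with no context-aware escaping. A minimal Python sketch of what the template effectively does (the function name is illustrative):

```python
def render_naive(whatever: str) -> str:
    # `raw` disables escaping, so the value is pasted straight into the JS source.
    return 'var FOO = "' + whatever + '";'

# A benign value renders as expected:
print(render_naive("hello"))           # var FOO = "hello";

# An attacker-controlled value breaks out of the string literal:
print(render_naive('"; alert(1); "'))  # var FOO = ""; alert(1); "";
```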
The fix isn’t so simple. I’ve searched high and low and couldn’t find a single source with the correct solution to the problem. So here it is…
The first “fix” would be to switch to using Rails’s built-in HTML escaping:

```erb
var FOO = "<%= whatever %>";
```

Note that this is what Rails does by default. This would theoretically protect against most XSS vulnerabilities here, but there are still two problems:
First, if the input contains newlines, it will render invalid JavaScript, resulting in a syntax error (and in some circumstances potentially a vulnerability). If we pass in `\nalert(1); //`, the rendered code will be:

```javascript
var FOO = "
alert(1); //"
```

And since JavaScript doesn’t support multi-line strings, that’s a syntax error. Not good.
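Rails’s `html_escape` is roughly analogous to Python’s `html.escape`, which makes it easy to see why entity encoding alone doesn’t help here:

```python
import html

payload = "\nalert(1); //"
escaped = html.escape(payload)

# Entity escaping handles <, >, &, and quotes...
assert html.escape('<script>') == '&lt;script&gt;'

# ...but passes newlines straight through, so the JS string literal
# still gets split across two lines (a syntax error at best):
assert "\n" in escaped
print('var FOO = "' + escaped + '";')
```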
The second problem (and the reason `raw` was added in the first place) is that we’re using the value in a data context in JavaScript, which means HTML entities won’t be decoded. So passing in `This & That` will get rendered as:

```javascript
var FOO = "This &amp; That";
```
But since React treats the variable as data (just like jQuery’s `.text()` method), it won’t decode that entity, resulting in a bug where you display the literal entity to the user. Not good.
Another fix that’s often cited is to just use `escape_javascript`. So:

```erb
var FOO = "<%= escape_javascript whatever %>";
```

It’s also aliased as `j`, since it’s used so often:

```erb
var FOO = "<%= j whatever %>";
```
This solves the first issue from before, where newlines would break the JavaScript and cause syntax errors. So that’s good.
The second problem is still there though: the output is still HTML-encoded. But how?
If we look at the docs for `escape_javascript`, we’ll see nothing about HTML encoding. But if we look closely at the example, we can see what’s happening:

```erb
$('some_element').replaceWith('<%= j render 'some/element_template' %>');
```
Notice the jQuery function being called: `replaceWith`. It takes an HTML string and inserts it. Meaning the string rendered with `escape_javascript` in this case is HTML, not data. So what’s actually happening is that Rails escapes for JS and then escapes for HTML before rendering, because `escape_javascript` doesn’t mark its result as HTML-safe.
But there’s a really subtle problem with all of this: the decoding order doesn’t mirror the encoding order the way proper nesting requires.
When encoding (Rails is rendering), the value is processed in this order:

1. `escape_javascript` (resulting in `\` being added before `\`, `</`, `\r`, `\n`, `'`, and `"`)
2. `html_escape` (resulting in `&`, `>`, `<`, `"` and `'` being converted to HTML entities)

When the browser decodes this data, it does it in the same order:

1. JavaScript string parsing (interpreting `\"`, escaped newlines, etc.)
2. HTML entity decoding (when the value is later rendered as HTML)

Is this safe? In theory, no. One of the rules for escaping for multiple contexts is that each context should form a “wrapper” around the inner ones, and escaping/unescaping should happen in reverse sequence. In this case, since we decode as JavaScript then HTML, we should encode as HTML then as JavaScript, so that the outermost encoding is the first one to be decoded. Think of these like “shells”, where JavaScript “contains” the HTML.
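To make the ordering concrete, here is a small Python model. The `js_escape` function is a simplified stand-in for Rails’s `escape_javascript` (it implements only a few of its rules), and `html.escape` stands in for `html_escape`:

```python
import html

def js_escape(s: str) -> str:
    # Simplified escape_javascript: backslash-escape \, ", ', and newlines.
    return (s.replace('\\', '\\\\')
             .replace('"', '\\"')
             .replace("'", "\\'")
             .replace('\n', '\\n'))

value = 'He said "hi" & left'

# What Rails actually does: JS-escape first, then HTML-escape.
rendered = html.escape(js_escape(value))
print(rendered)  # He said \&quot;hi\&quot; &amp; left

# The JS engine only interprets the backslash escapes; the HTML
# entities survive verbatim into the data, which is the rendering bug.
```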
In practice though, is this safe? The short answer is: kind of. There are two ways we could in theory break it: breaking out of the JS string, or injecting something unsafe into the resulting HTML.
Breaking out of the JS string could be possible if `html_escape` restored or altered something that `escape_javascript` fixed. In this specific case, that doesn’t happen, since every character `html_escape` changes becomes an entity, which is safe.
Breaking into the resulting HTML isn’t possible either, since the only way to get an HTML-significant character into the JS without hitting `html_escape` would be via an escape sequence (`\x3C` would become `<`, for example). But `escape_javascript` automatically escapes any free `\` characters in the string.
So it appears it actually is safe. Though I want to stress: it’s not safe by design, but simply by the specifics of the implementation. A change to either implementation (`escape_javascript` or `html_escape`) could open new vulnerabilities due to this design issue.
Since we’re later going to use the value as data, the proper solution is to mark the value as `html_safe` so Rails won’t run `html_escape` on it:

```erb
var FOO = "<%= escape_javascript whatever.html_safe %>";
```
This will prevent double-encoding of entities, and will be safe from XSS.
However, if we later use `FOO` in an HTML context (such as via `$(blah).html(FOO)`), it can be injected. Instead, we need to ensure that every usage treats it as data and not as HTML (in jQuery, switching to `$(blah).text(FOO)`).
This is a bit counter-intuitive, since the value isn’t really `html_safe`. What’s really happening is that we don’t want Rails to encode for HTML, because that will be done later by another process. There’s no other way to indicate that to Rails that would still pass “XSS” checks.
The other way we could do it (functionally equivalent):

```erb
var FOO = "<%= raw escape_javascript whatever %>";
```

The only problem with this is that Brakeman will still flag it as XSS, since the majority of the time you use `raw` in a template, it is XSS. It’s hard for the tool to know the subtle nuances of this usage, and as such it has to default to sane rules. Though perhaps that’s something that can be fixed in Brakeman in the future (allowing `raw escape_javascript` in quoted script contexts).
What matters at the end of the day, is that you know where your data is going, and what expectations surround it. Are you rendering HTML to be interpreted by the browser? Are you rendering data that will be interpreted by another application and escaped later?
This is just another example where “just let the framework handle it” leads to either a sub-par result (rendered HTML entities) or an insecure one. In order to be able to effectively use a tool, you need to understand what it’s doing under the hood so that you can use it appropriately.
Oh, and security scanners like Brakeman are amazing (though definitely not perfect). Why aren’t you using one?
The foundations of this vulnerability were reported via HackerOne on September 20th, 2017.
This post will detail the technical vulnerability as well as how to mitigate it. There is another post which deals with the background and time-lines.
Simply upgrade to 4.8.3 and update any plugins that override `$wpdb` (like HyperDB, LudicrousDB, etc.). That should be enough to prevent these sorts of issues.
Upgrade wp-db.php
for clients.
There may be some firewall rules you could implement in the meantime (such as blocking `%s` and other `sprintf()`-style placeholders in request parameters), but your mileage may vary.
To prevent this issue? Nothing, it’s been mitigated at the WP layer.
In general however, go through and remove all user input from the `$query` side of `->prepare()`. NEVER pass user input to the query side. Meaning, never do this (in any form):
```php
$where = $wpdb->prepare(" WHERE foo = %s", $_GET['data']);
$query = $wpdb->prepare("SELECT * FROM something $where LIMIT %d, %d", 1, 2);
```
This is known as “double-preparing” and is not a good design.
Also, don’t do this:
```php
$where = "WHERE foo = '" . esc_sql($_GET['data']) . "'";
$query = $wpdb->prepare("SELECT * FROM something $where LIMIT %d, %d", 1, 2);
```
This is also conceptually unsafe.
Instead, build your queries and arguments separately, and then prepare in one shot:
```php
$where = "WHERE foo = %s";
$args = [$_GET['data']];
$args[] = 1;
$args[] = 2;
$query = $wpdb->prepare("SELECT * FROM something $where LIMIT %d, %d", $args);
```
Let’s look at why:
Many months ago, a vulnerability was reported dealing with how `WPDB` internally prepares queries. Let’s talk about the original vulnerability.
To understand it, you first need to understand the internals of `WPDB::prepare`. Let’s look at the source (before 4.8.2):
```php
public function prepare( $query, $args ) {
    if ( is_null( $query ) )
        return;

    // This is not meant to be foolproof -- but it will catch obviously incorrect usage.
    if ( strpos( $query, '%' ) === false ) {
        _doing_it_wrong( 'wpdb::prepare', sprintf( __( 'The query argument of %s must have a placeholder.' ), 'wpdb::prepare()' ), '3.9.0' );
    }

    $args = func_get_args();
    array_shift( $args );

    // If args were passed as an array (as in vsprintf), move them up
    if ( isset( $args[0] ) && is_array($args[0]) )
        $args = $args[0];

    $query = str_replace( "'%s'", '%s', $query ); // in case someone mistakenly already singlequoted it
    $query = str_replace( '"%s"', '%s', $query ); // doublequote unquoting
    $query = preg_replace( '|(?<!%)%f|' , '%F', $query ); // Force floats to be locale unaware
    $query = preg_replace( '|(?<!%)%s|', "'%s'", $query ); // quote the strings, avoiding escaped strings like %%s
    array_walk( $args, array( $this, 'escape_by_ref' ) );
    return @vsprintf( $query, $args );
}
```
Notice three things. First, it uses `vsprintf` (which is basically identical to `sprintf`) to replace placeholders with values. Second, it uses `str_replace` to quote placeholders properly (even unquoting first to prevent double quoting). Third, if passed a single argument and that argument is an array, it will use the values of that array as the arguments. Meaning that calling `$wpdb->prepare($sql, [1, 2])` is identical to calling `$wpdb->prepare($sql, 1, 2)`. This will be important later.
The original reported vulnerability (months ago, not by me) relied on the following theoretical server-side code (well, many plugins had this pattern):

```php
$items = implode(", ", array_map([$wpdb, '_real_escape'], $_GET['items']));
$sql = "SELECT * FROM foo WHERE bar IN ($items) AND baz = %s";
$query = $wpdb->prepare($sql, $_GET['baz']);
```
The original report relied on a sneaky feature of `vsprintf` that allows you to “absolutely reference” arguments. Let’s look at an example:

```php
vsprintf('%s, %d, %s', ["a", 1, "b"]);   // "a, 1, b"
vsprintf('%s, %d, %1$s', ["a", 2, "b"]); // "a, 2, a"
```
Notice that `%n$s` does not read the next argument, but the one at the position specified by `n`.
We can use this fact to inject into the original query. Imagine we instead pass the following in the request:

```php
$_GET['items'] = ['%1$s'];
$_GET['baz'] = "test";
```
Now the query becomes `SELECT * FROM foo WHERE bar IN ('test') AND baz = 'test'`. Not good (we’ve successfully changed the meaning of the query), but also not incredibly bad on the surface.
There’s one other key piece of information the original report used to turn this into full-blown SQL injection: `sprintf` also accepts the `%c` specifier, which acts like `chr()` and converts an integer into the corresponding character. So now the attacker can do this:
```php
$_GET['items'] = ['%1$c) OR 1 = 1 /*'];
$_GET['baz'] = 39;
```
Checking an ASCII table, `39` is the code for `'` (a single quote). So now our rendered query becomes:
```sql
SELECT * FROM foo WHERE bar IN ('') OR 1 = 1 /*' AND baz = 'test';
```
Which means that it’s injected.
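Python’s `%` operator supports the same `%c` specifier, so the mechanics are easy to reproduce outside of PHP (this sketches only the format-string trick, not WPDB itself, and uses plain `%c` rather than PHP’s positional `%1$c`):

```python
# An attacker who controls the format string but can only supply an
# integer argument can still conjure a quote character via %c:
payload = "%c) OR 1 = 1 /*"
rendered = payload % 39  # chr(39) == "'"
print(rendered)  # ') OR 1 = 1 /*

query = "SELECT * FROM foo WHERE bar IN ('" + rendered + "')"
print(query)
```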
This sounds like a long shot; it requires passing attacker-controlled input into the `query` parameter of prepare. But as it turns out, exactly that exists in core, in `/wp-includes/meta.php`:
```php
if ( $delete_all ) {
    $value_clause = '';
    if ( '' !== $meta_value && null !== $meta_value && false !== $meta_value ) {
        $value_clause = $wpdb->prepare( " AND meta_value = %s", $meta_value );
    }

    $object_ids = $wpdb->get_col( $wpdb->prepare( "SELECT $type_column FROM $table WHERE meta_key = %s $value_clause", $meta_key ) );
}
```
When 4.8.2 was released, it included a “fix” for the above issue. The “fix” was entirely contained in `WPDB::prepare()` and was basically the addition of a single line:
```php
$query = preg_replace( '/%(?:%|$|([^dsF]))/', '%%\\1', $query );
```
This does two fundamental things. First, it neutralizes any `sprintf` token other than `%d`, `%s` and `%F`, which should nullify the original vulnerability, since it relied on `%c` (or so it seemed). Second, it removes the ability to do positional substitutions (meaning `%1$s` is no longer valid).
This caused a massive outrage. WordPress originally (years ago) documented that you should only use `%d`, `%s` and `%F`. In fact, here’s the quote from their docs:
> This function only supports a small subset of the `sprintf` syntax; it only supports %d (integer), %f (float), and %s (string). Does not support sign, padding, alignment, width or precision specifiers. Does not support argument numbering/swapping.
Even though it was documented as unsupported, several million queries in third-party code (millions of lines of affected code) used the numbered-placeholder syntax (securely, I may add).
WordPress’s response to the outrage was “won’t fix, sorry”. They cited security as the reason and refused to elaborate.
Looking at the fix, something felt wrong. To me, the vulnerability was in passing user input to the `query` side of prepare, even if it passed through an “escaper” first.
The original proof-of-concept I shared was the following. Given the formerly secure query:

```php
$db->prepare("SELECT * FROM foo WHERE name= '%4s' AND user_id = %d", $_GET['name'], get_current_user_id());
```
With the change made in 4.8.2, the `%4s` will be rewritten to `%%4s` (in other words, a literal `%` followed by a literal `4s`; no substitution will be done). This means the `%d` is rebound to `$_GET['name']`, giving the attacker control over the user id. This could be used for privilege escalation, etc.
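The rebinding is easy to see by applying 4.8.2’s regex by hand (a Python sketch; `re.sub` has the same pattern semantics as `preg_replace` for this expression):

```python
import re

query = "SELECT * FROM foo WHERE name= '%4s' AND user_id = %d"

# The 4.8.2 hardening pass: escape any % not followed by d, s, or F.
rewritten = re.sub(r'%(?:%|$|([^dsF]))', r'%%\1', query)
print(rewritten)
# SELECT * FROM foo WHERE name= '%%4s' AND user_id = %d

# Only one live placeholder (%d) remains, so the FIRST argument,
# the attacker-supplied name, is what now binds to user_id.
assert rewritten.count('%d') == 1
```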
The response (a day later) was thank you followed by a close as “we don’t support that”. I replied a few times and got no response.
So I went back and crafted a different proof of concept, leveraging another key fact, to prove that the vulnerability wasn’t `%1$s` but in fact passing user input to the query side of prepare.
Given the `meta.php` code cited before:

```php
if ( $delete_all ) {
    $value_clause = '';
    if ( '' !== $meta_value && null !== $meta_value && false !== $meta_value ) {
        $value_clause = $wpdb->prepare( " AND meta_value = %s", $meta_value );
    }

    $object_ids = $wpdb->get_col( $wpdb->prepare( "SELECT $type_column FROM $table WHERE meta_key = %s $value_clause", $meta_key ) );
}
```
Given input of:

```php
$meta_value = ' %s ';
$meta_key = ['dump', ' OR 1=1 /*'];
```
It will generate the following query:

```sql
SELECT type FROM table WHERE meta_key = 'dump' AND meta_value = '' OR 1=1 /*'
```
And there we have it: we have successfully injected core. (It’s worth noting that both `$meta_value` and `$meta_key` come directly from user input.)
This works because the value clause will be generated as:

```sql
 AND meta_value = ' %s '
```

Remember that unquoted `%s` placeholders are replaced by quoted `'%s'` inside prepare. So the second call to `->prepare()` turns the clause into ` AND meta_value = ' '%s' '` and enables the injection.
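A tiny Python model of prepare’s quoting step shows why double-preparing is so dangerous (this mimics only the `'%s'`-quoting and substitution behavior, nothing else of WPDB; all names are illustrative):

```python
import re

def prepare(query: str, args: list) -> str:
    """Toy model of WPDB::prepare: quote bare %s placeholders, then substitute."""
    query = query.replace("'%s'", "%s")         # unquote, in case caller already quoted
    query = re.sub(r"(?<!%)%s", "'%s'", query)  # re-quote every bare %s
    escaped = [str(a).replace("'", "\\'") for a in args]
    return query % tuple(escaped)

# First prepare: the user-supplied meta_value is the literal string ' %s '.
clause = prepare(" AND meta_value = %s", [" %s "])
assert clause == " AND meta_value = ' %s '"

# Second prepare: the already-prepared clause is pasted into a new query,
# so the smuggled %s becomes a live placeholder and gets quoted again.
query = prepare(
    "SELECT t FROM tbl WHERE meta_key = %s" + clause,
    ["dump", " OR 1=1 /*"],
)
print(query)
# SELECT t FROM tbl WHERE meta_key = 'dump' AND meta_value = ' ' OR 1=1 /*' '
```

The `OR 1=1` lands outside any string literal, and the trailing `/*` comments out the rest: the injection succeeds even though every individual value was “escaped”.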
I stress that this vulnerability cannot be fixed in `WPDB::prepare()`; it is a problem in `meta.php`. Yes, you could mitigate it by preventing “double prepare” calls, but you wouldn’t fix the original issue (which didn’t use prepare, but `_real_escape()`).
The simple fix is to not pass user input to the `$query` parameter of `WPDB::prepare()` in `meta.php`.
Passing user input to `$query` is always wrong. Full stop.
The next step would be to somehow “quote” placeholders in prepared queries and then restore the placeholders before executing the query. This patch also exists.
Basically, the fix modifies `WPDB::prepare()` (and all of the escape functions, such as `_real_escape()`) to replace any `%` character with a random placeholder string. Something like:

```php
$query = str_replace('%', "{$this->placeholder_escape}", $query );
```
Then, in `WPDB::_do_query()`, the placeholder is swapped back out to restore the original user input.
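The mitigation can be sketched in a few lines of Python (the token generation and function names are illustrative, not WPDB’s actual implementation):

```python
import secrets

# A random, per-connection token that user input cannot predict.
PLACEHOLDER = "{" + secrets.token_hex(8) + "}"

def escape_input(value: str) -> str:
    # Any % coming from user data is replaced by the random token,
    # so it can never be interpreted as a printf placeholder later.
    return value.replace("%", PLACEHOLDER)

def restore(query: str) -> str:
    # Just before the query is sent to the server, put the literal % back.
    return query.replace(PLACEHOLDER, "%")

user_input = "100% legit' OR '%s"
safe = escape_input(user_input)
assert "%" not in safe               # no live placeholders survive escaping
assert restore(safe) == user_input   # round-trips losslessly
```

Because the token is random, an attacker can’t pre-insert it to smuggle a `%` through the escaping step.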
This “works” by preventing this specific vector.
I still stand by that passing user input to the query side of prepare is potentially dangerous and fundamentally unsafe, even if “escaped”. And double-preparing a string (by passing the output of one “prepare” into another) is extremely dangerous and will always be unsafe, even if you may mitigate known vulnerabilities.
Note: this looks similar to my original suggestion of adding a check in prepare to detect and reject double-prepares (using a comment marker to indicate prior prepares). The important difference is that I suggested bailing out and showing the developer an error when a double-prepare is detected, rather than “trying to make it work”.
This is precisely how 4.8.3 “fixes” the vulnerability. It still doesn’t address the root issue though (passing user input to the query side of prepare)…
The correct fix is to ditch this whole prepare mechanism (which returns a string SQL query). Do what basically everyone else does: return a statement/query object, or execute the query directly. That way you can’t double-prepare a string.
It’s worth saying that this would be a major breaking change for WP, but one that many other platforms have pulled off successfully (phpBB did this exact thing and went from massive SQL injection vulnerabilities to almost none).
It doesn’t need to happen (and in practice shouldn’t happen) overnight: they can do it in parallel with the existing API, deprecating the old one and removing it in time. But it does need to happen.
The current system is insecure-by-design. That doesn’t mean it’s always hackable, but it means you have to actively work to make it not attackable. It’s better to switch to a design that’s secure-by-default and make the insecure the exceptional case.
The best path forward would be to switch to PDO/MySQLi and use real prepared statements rather than emulating them in PHP land.
But if that’s not acceptable, then at least move to a statement-object style system, where `prepare` returns an object which is then executed. And for the love of god, get rid of `escape_by_ref`/`esc_sql`, as well as the still-existent `_weak_escape` (which calls `addslashes()` and has been “deprecated” for 4 years yet somehow still exists)…
These changes won’t prevent misuse, but they will make it far harder. They make the default usage secure, forcing developers to go out of their way to make it insecure (where today it is precisely the opposite).
It’s also worth noting that with this mitigation technique, support for positional placeholders was added back in (though a subset of what was possible, it should cover the vast majority of use-cases).
The foundations of this vulnerability were reported via HackerOne on September 20th, 2017.
This post will detail the background on the vulnerability as well as why I publicly threatened to Fully Disclose. There is another post which deals with the technical vulnerability.
In short, the WordPress team released a “fix” in 4.8.2 that broke a LOT of sites. It was shown that the fix didn’t actually fix the root issue (but just a narrow subset of the potential exploits).
I reported a new vulnerability the day after the 4.8.2 was released. It was ignored for several weeks. Finally when I got the attention of the team, they wanted to fix a subset of the issue I reported.
It became clear to me that releasing a partial fix was worse than no fix (for many reasons). So I decided the only way to make the team realize the full extent was to fully disclose the issue. I started the process of going public by asking hosts and plugin developers to reach out to me so that we could coordinate the release.
During the planning steps of the FD, the WP team started constructive discussions again.
The 4.8.3 patch mitigates the extent of the issues I could find, and I believe is the second best way to fix the issue (with the first being a much more complex and time consuming change that still needs to happen).
September 19th - WordPress releases 4.8.2 with a “fix”
This fix doesn’t actually fix the vulnerability, but breaks a metric ton of third-party code and sites in the process (an estimated 1.2 million lines of code affected)
September 20th - I file a security vulnerability report and notify them the fix isn’t a fix and suggest they should revert and fix properly (with included details on how to fix)
September 25th - I request public disclosure since ticket remains closed with no apparent resolution
1 hour later I get a response saying “Sorry for the delay and confusion, we’re still looking into this”
In this span, I posted 4 times adding information and requesting this be escalated, as well as addressing the root issue of the vulnerability.
October 16th - I announce my intention to go public on the 19th, barring any additional contact.
Same day I receive a reply “I’ll get you an update tomorrow”
October 17th - WP replies with a full history of the vulnerability, but still without acknowledging anything I said, nor any indication that they would fix anything. In fact, their exact reply for next steps was:

> Publishing the details of the issue will be the next step. Once that’s done, #41925 can be reopened and work can be done to add in support for numbered placeholders if there are people that want to do so and if it can be done without re-creating any vulnerabilities.
Note that this doesn’t actually admit to any of the issues found so far.
October 18th - I reply talking about the fact that the vulnerability still wasn’t fixed, and demonstrating a black-and-white proof-of-concept as to the break.
WP replies stating that they are “triaging” the bug (a month into this ordeal) and will fix the `meta.php` issue…
So far, they haven’t admitted the root issue (that their prepare system is poorly designed and must be fixed conceptually), but only the surface-level repercussions of it (and even then only a few of them).
Also receive a patch to remove the “double prepare” from meta.php
October 20th - Receive a reply saying “we’re working on it” and discussing some details of the fix. Specifically, that they are going to try to implement something I had suggested earlier (using a “comment marker” to indicate if a string has been through WPDB::prepare() before).
It’s worth noting that the reply indicated some hesitance to fix this because:
Regarding broken plugins – I’m sure there are plugins that double prepare things as well. Going this route we’re likely to break plugins as well, it will probably just be a different set of plugins.
I also announced my intention to disclose on October 25th (1 month since I originally requested disclosure).
October 22nd - Receive a patch to review that implements the “mitigation” fix noted above. The WP team says they aren’t comfortable with the 25th as it’s too soon given external factors.
October 23rd - I reply with a few issues with the patch (most minor, one major).
I also ask for an intended date of release so that I can adjust the disclosure time-line accordingly (I mentioned that given the history of this interaction I don’t feel comfortable making it open-ended).
I receive a reply stating that “Several people are traveling and they won’t be able to get a date to me by Wednesday the 25th”. This also includes a new “mitigation patch” that addresses none of the concerns I raised beforehand.
October 24th - I reply with two significant comments about the proposed patch. I also recommend a different approach (erroring if detecting a double-prepare/etc).
I also attempt to recognize team constraints and offer to wait until Friday October 27th to negotiate a release date:
I get that there’s timing. Get back to me as soon as you can with a projected date. But I won’t wait around arbitrarily. I’ll push back the deadline for a date until Friday. But by doing that I’m stating I’m not willing to push disclosure past next Wednesday (which will be 6 weeks from the breaking release and the initial report).
October 26th - Receive a reply from the WP security team. The following quote was included:
One of our struggles here, as it often is in security, is how to secure things while also breaking as little as possible.
There was also significant push-back to solving the _real_escape side of the issue. The proposed patch at this point in time only mitigated double-prepare, but purposely didn’t address when esc_sql() was passed into the query side of prepare (which happens a fair bit thanks to things like get_meta_sql() and the posts_where hook).
This appeared to me to be the worst-case scenario. A partial fix which left the main portion of the vulnerability there, but also gave instructions on how to execute it. Seeing as it appeared to hit an impasse, I felt the only alternative I had was to push for FD.
I replied, announcing my intentions to FD on the 27th if the discussion didn’t meaningfully change (if they didn’t acknowledge the extent of the vulnerability).
Publicly, I announced my intention to FD “soon” and asked hosts and plugin authors to contact me to start working out a responsible FD roll-out giving people enough of a head start to cleanly rectify issues.
October 27th - I receive a reply from the WP security team. A security team member who hadn’t yet participated in the thread went back to the beginning of the thread and re-read every post. He (correctly I may add) summarized the entirety of the issues, as well as asked a few clarifying questions. He also asked for a little more time but gave me a target of Tuesday, October 31st so it wasn’t wide open.
This was the response I was looking for the entire time. We were on the same page finally. The discussion immediately turned constructive from all sides and we worked together on a better solution. Turn around time was good (sometimes 2-3 replies per day, not always). But more importantly I think we both got confidence of the release.
I replied with some comments on the patch/clarifications. We had about 5 replies back and forth that day, all moving solidly in a good direction.
October 28th - The first version of the patch that I felt good about was created. There was still some work to be done, but here is when it seemed like the problem would be properly (IMO) fixed.
I also announced that I was scrapping plans for FD as I felt the current patch and time-line were appropriate (barring anything major changing before the 31st).
October 29th-30th - A bit of refinement and back and forth on the issue.
October 31st - Release, and these posts were published.
The early experience was troubling. I wrote a few years ago about one issue, and honestly, while it was easier for me to send things to them (via HackerOne), the overall experience hadn’t improved (if anything, it got worse in some aspects).
It took literally 5 weeks to even get someone to consider the actual vulnerability. From there, it took me publicly threatening Full Disclosure to get the team to acknowledge the full scope of the issue (though they did start to engage deeper prior to the FD threat).
Once the issue was understood, we got to a really good place. If the entire interaction was like Oct 27 - Oct 31, I would have been ecstatic. Even if on a different time-line (the good part wasn’t the speed of the replies, but the content of the conversation).
Security reports should be treated “promptly”, but that doesn’t mean every second counts (usually). I get that there are competing priorities. But show attention. Show that you’ve read what’s written. And if someone tells you it seems like you don’t understand something, stop and get clarification.
And ask for help.
Overall, I hope the WP security team moves forward from this. I do honestly see hope.
I get that. I really do. And I don’t blame them for this.
The miss IMHO isn’t that a team of volunteers isn’t living up to my expectations, but that a platform that powers 25%+ of the Internet (or at least the CMS-powered Internet) isn’t staffed with full-time security personnel. Volunteers are amazing, but they can only do so much. At some point, the companies making money off of the platform while not staffing its security are ultimately the biggest problem…
The core issue is mitigated. My perspective of the interaction was frustrating at first, but got far better towards the end.
I was disappointed for a good part of the past 6 weeks. I’m now cautiously hopeful.
]]>We have a habit of talking about “code smells” to indicate patterns and practices that our experience has shown can be problematic. Many of these “smells” are backed by a lot of data and really are legitimate problems to avoid. These are constructs and tools that often have few legitimate uses. But many so called “smells” really aren’t significantly bad. Let’s dive into some of the nuance here and talk a bit about why our word choice matters.
Before we get into code smells, let’s talk for a brief second about the etymology behind the term. In the real world, smells provide our brains with important data points about our environment. Our brain interprets smells into a few main categories (10 actually). Each of these categories gives us an important piece of information about our surroundings. When we smell a particular food source, we can get an idea if it’s safe to eat or not.
The really important thing to note is that the way our brain interprets scent is heuristic. The vast majority of the time you smell something sharp or pungent, it’s something you need to avoid because it likely is bad (or contains bad bacteria). I say it’s a heuristic because it’s not always bad for you. It could simply be Limburger cheese.
The smell alone isn’t enough to firmly know whether something is bad; the vast majority of the time, you need other information from other senses to know for sure. But those smells are useful to us because the vast majority of the time we encounter a bad one, it’s because the object is bad.
Some of the items we identify as “code smells” are legitimate sources of problems. That’s not to say they are always problematic, but the vast majority of the time they would be. goto is an awesome example, but let’s go with a few more explicit ones:
Using eval()
While eval() does have some valid uses, the vast majority of its uses are actually huge problems. The reason is simply that eval() is really hard to use without creating a huge security vulnerability. And that’s without mentioning that there are usually better, easier, or more robust tools that can do the job.
Some things are literally impossible without some form of dynamic evaluation. Meta-programming, for one, is quite difficult without some ability to dynamically execute code. In fact, some languages blur that line so much that data becomes code (which is quite similar to eval(), but beyond the scope of these thoughts).
Duplicated Code
Writing the same thing over and over again is quite often seen as a bad practice. The reasoning behind this is that when a bug is discovered in one copy (or a new feature is added), the chances are quite significant that other copies won’t be updated.
One of the downsides of refactoring out duplicated code is that by doing so you create artificial coupling between different pieces of code. The vast majority of times this coupling is less of a burden than the original duplication was. But every once in a while you wind up in a situation where the coupling makes life so difficult that copy/pasting is not just easier, it’s cleaner.
Global Variables In Modularized Code
Specifically inside of modular code, global variables tend to be sources of bugs and difficult-to-understand complexity. The problem isn’t the global variable itself, but that the nature of modular code is that you don’t control the life-cycle or access patterns of the code. The code is being used by another system and will be called in all sorts of weird ways. Having global state makes it far harder for interacting code to know when a change will clash with other areas of the system.
For example, imagine a logging class that used a global variable for “prefix”. One part of the code sets the prefix to A, and another to B later in the execution. Because it’s a true global variable, those clashes will cause weird side-effects.
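A minimal sketch of that clash (the logger and prefix variable here are hypothetical, purely for illustration):

```php
<?php
// Hypothetical logger that reads its prefix from true global state.
$GLOBALS['log_prefix'] = '';

function logMessage(string $message): string
{
    return $GLOBALS['log_prefix'] . $message;
}

// One part of the code sets the prefix for its own output...
$GLOBALS['log_prefix'] = '[billing] ';

// ...but code it calls into sets its own prefix later in execution:
$GLOBALS['log_prefix'] = '[mailer] ';

// The first part's next log line is now silently mislabeled.
echo logMessage('charge failed'); // [mailer] charge failed
```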
These (and way more) are legitimate “smells”. They tell our fight-or-flight response to be ready to jump, pending confirmation from another sense. The key thing to realize here is that the false-positive rate is exceedingly low (valid uses are perhaps fewer than 1 in 100), meaning they are few and far between.
Some “smells” are hinting at legitimate issues, but the false-positive rate is significantly higher at reasonable thresholds. Perhaps a valid:invalid ratio of 5:1 or 10:1.
Too Many Method/Function Parameters
This one has more preference baked in, but you’ll usually find few people advocating for hundreds of parameters to a single function.
The nuance here is where the line is. In some designs and systems, having more than one or two parameters results in a system complicated to the point of being problematic. For other systems and usages, that number is higher.
There is a limit where too many parameters becomes an obvious smell that something’s wrong. The point though is that limit (and hence the value of the heuristic) only provides a strong signal that there’s a problem at quite high values…
Excessively Long Methods
This is another one where you need to know more about what’s going on to really judge whether there’s a problem or not. In some cases, having a 10 line method is exceedingly long and results in a huge amount of complexity. In other cases, a 50 line method may really do one thing and may not be worth refactoring (depending on what’s being done at least).
Again, the smell is only significantly valuable outside of reasonable limits. And even then it only hits our “few false positive” test at relatively extreme limits (more than many developers would consider sane for most cases).
There are plenty more, but I want to get into the third class of “smells”:
The majority of “smells” you will see people talk about seem perfectly reasonable on the surface, but only really make sense inside of a context. To get what I mean, let’s take a few examples of some “smells”:
Using Conditionals
Yes, some programming camps suggest that having an if statement inside of your code is an anti-pattern and a “smell”. From their perspective, it’s hard-coding a decision that should be inverted (a reflection of the Tell-Don’t-Ask principle). Let’s look at an example:
```php
if ($user->canBuyAlcohol()) {
    $shoppingCart->checkout();
}
```
That could be rewritten as:
```php
$user->purchaseAlcohol($shoppingCart);
```
The key benefit there is that complex logic can be encapsulated way more than a simple boolean can. For example, in some European countries the age depends upon location or percent of the alcohol. By removing the conditional you can leverage polymorphism to truly encapsulate the change and make for a more flexible design.
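As a sketch of what that polymorphic encapsulation might look like (all class names and age rules here are mine, purely illustrative):

```php
<?php
// Illustrative only: the rule varies by jurisdiction, and each User
// subtype encapsulates its own rule instead of exposing a boolean.
final class ShoppingCart
{
    public bool $checkedOut = false;
    public function checkout(): void { $this->checkedOut = true; }
}

abstract class User
{
    public function __construct(protected int $age) {}

    // Tell, don't ask: the user decides; the caller never branches.
    abstract public function purchaseAlcohol(ShoppingCart $cart): void;
}

final class UsUser extends User
{
    public function purchaseAlcohol(ShoppingCart $cart): void
    {
        if ($this->age >= 21) {
            $cart->checkout();
        }
    }
}

final class GermanUser extends User
{
    public function purchaseAlcohol(ShoppingCart $cart): void
    {
        // Beer and wine at 16 in Germany; spirits would need more nuance.
        if ($this->age >= 16) {
            $cart->checkout();
        }
    }
}

// Caller code is identical regardless of jurisdiction:
$cart = new ShoppingCart();
(new GermanUser(17))->purchaseAlcohol($cart);
var_dump($cart->checkedOut); // bool(true)
```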
If you buy into this way of designing this should seem obvious. If you don’t buy into it, this likely seems contrived and overly dogmatic.
Depending On A Service Locator
I’ve personally written about this one before, and I do not like service locators for OO code. The reason is they create hidden dependencies and act akin to global variables.
But that really depends on what you’re building. If you’re building a large application or reusable modules, those hidden dependencies cause significant complexity and problems. But for some corners of the codebase, and for many applications, the trade-offs don’t fall as hard (to the point where in some cases the alternatives are worse).
To know if a Service Locator is the right choice or not really depends on what you’re building and the trade-offs you need to account for…
These smells are those that are good to know about, but don’t give you a whole lot of signal to noise. Meaning that they are signs of good code as often as bad code.
Avoid Static Function/Methods
This is another one of those “it depends” cases. A static function/method is one that’s not polymorphic, meaning that it’s referenced by class name or global variable. In PHP, that’s ClassName::method() (most of the time). In Ruby, it’s ClassName.method() (which is really just a method call on a global variable). In Java, it’s ClassName.method(). And so on.
The problem with static calls is the same as with calling new and hard-coding an object class: it’s not polymorphic (meaning it isn’t dynamic based on other objects), and hence it’s not OO. Take the following code:
```php
class User
{
    public static function getCurrentUser() { /* ... */ }
}
```
What’s the difference between that and get_current_user() as a function? The answer is: nothing.
If you’re writing imperative, functional, or really any code other than OO, this is a non-issue. But when looking at composition, hard-coding the call does reduce flexibility (no matter what paradigm you’re using).
So to know if this is really a problem or not you need to not only know a lot about the style of the application you’re looking at, but also know the trade-offs involved and specific points that may change…
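To make the trade-off concrete, here is a sketch (the interface and class names are hypothetical) of the composable alternative: injecting the collaborator instead of hard-coding a static call like User::getCurrentUser():

```php
<?php
// Hypothetical interface; any current-user lookup can stand behind it.
interface CurrentUserProvider
{
    public function getCurrentUser(): string;
}

final class SessionUserProvider implements CurrentUserProvider
{
    public function getCurrentUser(): string
    {
        return $_SESSION['user'] ?? 'anonymous';
    }
}

// A fixed stub makes the collaborating code trivially swappable in tests.
final class StubUserProvider implements CurrentUserProvider
{
    public function getCurrentUser(): string
    {
        return 'test-user';
    }
}

final class Greeter
{
    // The dependency is injected, not hard-coded as a static call.
    public function __construct(private CurrentUserProvider $users) {}

    public function greet(): string
    {
        return 'Hello, ' . $this->users->getCurrentUser();
    }
}

$greeter = new Greeter(new StubUserProvider());
echo $greeter->greet(); // Hello, test-user
```

With the static call, that swap is impossible without touching Greeter itself, which is the flexibility cost being described.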
Any smell that says “always” or “never”
Any so-called “smell” that declares an absolute like “always do X” or “never do Y” is likely to be unreliable. Look for the context, and look at the trade-offs.
I urge you to be cautious about how you talk about code smells, and how you talk about “good vs bad” ways to write code. Realize that programming is an insanely complicated endeavour with a huge amount of nuance.
Be wary of anyone who uses the terms “always” or “never” when it comes to writing code, unless they specifically qualify the statement with the context needed to justify it.
Most of all, think critically about your code and others’ code. Don’t just look at it in isolation; understand the context in which it was developed.
After all, that context is the most important part of programming that we talk the least about…
]]>Here are the specs of the current machine:
There are also several architectural decisions and limits which are designed to change over time, so here are the theoretical maximums and the current implementations:
This project has been something I’ve wanted to do for a very long time, but in a slightly different form. I’ve always been fascinated with computers, way back to my early childhood. I remember when I was 5 or 6 years old, going for long walks with my father talking about how computers really worked. I remember being fascinated that all of the functionality that we see in a computer really just being caused by a large number of switches turning on and off really fast.
Think about that for a second. The device that you’re reading this post on is really just a collection of switches. Billions of them (a number that I know I can’t really comprehend). All dancing their choreographed dance. Turning on and off in the precise order to render this next pixel. There’s simultaneously something magical yet poetic about that.
I’ve always wanted to ditch the transistor and make a mechanically operated computer. My original idea was to do it with electronic relays. You can find a few videos on YouTube of people who’ve done just that, and there’s an awesome rhythmic sound to hearing the computer actuate. There is a limiting factor there in that it takes a LOT of relays to do anything. Plus if I was going to go that far, I’d want to keep the RAM using the same technology, and so I never wound up doing it.
My next idea was to use Lego pneumatics actuating switches. Make a huge array of them, each building logic gates, which build components which build the computer. I prototyped a few gates, but ultimately ran into the same problem: it would take thousands of these actuators to build the computer. And that doesn’t even cover the problem of pressure, of speed (think cycles per minute, not per second), etc. And at $5 per actuator, yeah…
So I tabled the project. For years. Every once in a while I would get a spark and think about it, but it died for one reason or another. Then, in the beginning of January, I stumbled upon a YouTube video by Ben Eater. He built an 8-bit computer on breadboards. That was all the inspiration I needed. It wasn’t exactly what I thought of years ago, but it was close enough.
Some components I based directly on his (the 8-bit registers are nearly identical for example). Some components are loosely based (the ALU shares some similarity at the core). And others are really far off (the entire address bus and memory system).
In the process of building and designing the computer, I’ve learned several really important lessons that apply to normal software projects and life itself. I’d like to end this post with a few things that I really learned so far.
Test early, test often, test small
We talk about the importance of Unit Testing all the time. I’ve been a huge proponent for automated testing and advocate for it for a while. However, it took building a computer by hand to really appreciate the value.
Imagine this situation: You built an entire computer, thousands and thousands of wires and individual connections, chips upon chips and tons of hours of work. Then you try to add 1 + 1. You get 90 as the result (yes, this really did happen to me). Now what…
Instead, what I started doing was testing each component as I built it. I would build Arduino programs to cycle through all valid (and invalid) input and control states to test a component. Only once it passed would I integrate it into the computer. Now, once I got 1+1=90, I knew it wasn’t a flawed component. I knew the flaw was somewhere in the integration layer. Now I only have about 100 possible sources for the error instead of 100,000, just because I decided to test at the unit level.
Prototype and prove your designs before you build them
Most of us are guilty of assuming our design will work just through logical analysis. We put tons of work thinking, drawing and analyzing, and then once we’re done we’re reasonably confident that it’ll work. So we jump forward to actually building it. Well, what happens when we miss something in the design and there’s a flaw?
In reality, this happens all the time. Good architects and engineers will notice the flaw and recover, modifying as they go to fix it. But what happens when that modification requires undoing 100 individual wires and re-routing a component you just spent the past 8 hours wiring… You get frustrated. And look for a better way.
So what I started to do was to prototype. Take the design, lay it out, connect the minimum number of wires necessary to prove the design. Don’t wire in permanently, but use jumper wire to “get it working”. This is messy, but takes 5 minutes versus 5 hours to fully wire. Then, once you’ve proven the design, then go into production mode.
Decomposition is incredibly powerful
Looking at a completed project really is daunting. There’s a ton of work and a lot to design. And too much to understand. Yet by breaking down the project into many small ones (build a register, build a bus, build another register, build an ALU function, etc), it became manageable to both design and build.
There is a flip side to that though, which is that later builds can show design flaws of earlier components. So it’s not without risk…
Have a “big picture” in mind
It’s easy when decomposing to miss the forest for the trees. When I started this project, I drew an architecture diagram that has helped keep me in check and provided guidance for the smaller design components.
OMG is it satisfying to see those LEDs blink in pattern
After all those hours, the feeling of seeing those LEDs pulse in their pattern. Wow. Just wow. (click the image below to get what I’m talking about :)
In later posts I’ll likely go into the design. Talk about why I did something the way I did, the trade-offs I made and how they affected the result. I haven’t quite decided yet. What I do know is that I’m having way too much fun so far, and will keep going on the project.
If there’s something specific you’d like to read about, let me know and I’ll try to do a post on it. Thanks for reading!
]]>Let’s look at the simple(ish) case of driving. There’s an extraordinary amount of trust required. When you’re behind the wheel, you’re not only trusting every single engineer who designed every component of your vehicle, but those of every surrounding vehicle as well. You’re trusting the quality assurance processes of each manufacturer. You’re trusting the assembly line technician. You’re trusting the mechanics. The part manufacturers. Not to mention the surrounding infrastructure (no one wants a road to collapse while they’re driving on it).
You also need to trust the drivers around you. You trust that the person driving next to you on the highway isn’t going to randomly jerk the wheel into you and cause an accident. You trust that they aren’t drunk or otherwise impaired. That they are also driving in the shared best interest of safety.
Yet we all know that blind trust in any of those items is bad. Driving schools teach “Defensive Driving” techniques, to prepare you if someone does drive poorly around you. You take your car to mechanics to be inspected so you can verify that the parts that the engineer said were going to last, are going to be safe and last.
Each and every one of these “Trusts” are leaky, meaning that we all know of instances that each and every one of them have failed (remember the Ford Pinto?). Yet it’s far more useful to us to trust than not to, even though we all know that trust isn’t perfect.
As a society, we’ve built mechanisms to balance the benefits of reducing the necessary trust against the costs that the measures take to implement.
Compare driving a car to flying a plane. The regulations are far stiffer for the pilot in pretty much all cases. You probably know that it’s far harder to get a pilot’s license than a driver’s license. A basic pilot’s license takes 40-80 hours of training plus a comprehensive exam, plus 6 hours of training every 2 years. And there are many more licenses beyond that.
What you may not know though is that airplanes have mandatory maintenance schedules that require inspections and service as few as every 25 hours of operation. Compare that to automobiles which typically get serviced once per year if the owner is diligent; FAR less in many cases. The stakes are simply far lower with a car.
Another great example is in the case of air traffic control. Airspace (in the USA at least) is divided into 7 “classes”, with 3 of them being “controlled” (you’re not allowed to enter without talking to someone first).
Away from metropolitan areas, you can fly in uncontrolled airspace without talking on a radio. You’re flying all by yourself using “Visual Flight Rules” (you need to see where you’re going). You trust that other pilots are looking for you, are following rules as well as established conventions. In areas with more traffic, the rules increase.
As with any rule increase, the overhead and therefore costs increase dramatically. Instead of being able to take a quick 10 minute flight to a neighboring airport, you now need to file a flight plan and put yourself at the mercy of a controller. What could have taken 20 minutes including planning now takes over an hour. This added cost increases safety and reliability dramatically.
No matter what situation you’re in, there will always be some amount of required trust. Take AWS for example. You’re trusting that Amazon really doesn’t have the keys to the server they built for you. You’re trusting that they are going to give you the uptime they promise (or in Amazon’s case, that they imply). You’re trusting that the machine is what they say it is. You’re trusting them for a whole lot more than that in reality, but there’s always trust required.
But as with driving, blind trust is also not always the best. So we can install measures to verify that trust and ensure that nothing too bad happens without us at least knowing about it.
Tools like Tripwire are perfect examples of this concept. You need to give someone access to your production machines (or access to the machines that have access, or access to the machines that have access to the machines that have access, etc). How do you ensure that they are doing what they say they are, not modifying files directly in prod? A tool such as tripwire that helps verify that the trust wasn’t broken.
Could you prevent the possibility in the first place? Not in the generic case. Someone effectively needs access to the root key (either directly, or through another machine). You can lock your engineers down or you can give them total freedom. Or, you could do the smart thing and give them enough freedom so they can do their jobs effectively, and implement “verification” systems to detect breach of trust (even for “honorable” reasons).
This is what many companies do. Trust, but verify.
Security is always a trade-off against usability. The only way to make a perfectly secure system is to not make the system in the first place. The trick isn’t making the system perfectly secure, the trick is balancing the security requirements with the costs (monetary, time and usability) to implement those requirements.
It’s important to note that I’m not talking about security requirements that are about diligence in construction, such as SQL Injection, XSS, etc. Those costs are trivial compared to the cost of doing it wrong, provided you do it right from the beginning. So the investment in education and training on those types of attacks is vital and should never be traded away. There’s no reason to go out to sea in a boat full of holes in the hull.
The costs I’m talking about come in terms of UX tradeoffs, in terms of application-level security (requiring approvals, multi-factor authentication, etc) and costs associated with controlling access (authentication, access control, scopes and federation, etc).
It’s easy to find an attack vector (they always exist). The key is weighing the cost of exploit against the cost of mitigation against the cost of breach (the damage it will do).
Effective security doesn’t end with all boxes checked. Effective security comes from the right boxes being checked. This is a continuing, dynamic question. One that changes over time.
After all, you’re going to trust a janitor in a supermarket a lot more than you’re going to trust a janitor in a nuclear power plant. Both require trust, but you must tailor the limitations and mitigations to the problem at hand.
]]>*Note: All code that will be used in this post is real-world code found in the wild (and linked to), with one exception (X-Powered-By).
The current proposal includes a single interface (the return-type was added by me for clarity):
```php
interface MiddlewareInterface
{
    public function __invoke(
        RequestInterface $request,
        ResponseInterface $response,
        callable $next
    ): ResponseInterface;
}
```
This is not really a new idea. The Slim Framework uses this exact signature. And a number of frameworks/libraries use similar interfaces: mindplay/middleman, relay/relay, zendframework/zend-stratigility among others.
It’s important to note that StackPHP and Laravel use a different approach: they do not pass the response in as a parameter to the middleware. In fact, many middleware implementations in the ecosystem use this approach (including the original: Rack, with Ruby on Rails). I will go more into why and what makes this approach both technically and non-technically superior towards the end of this post.
First, let’s take an example of real world code that uses this approach. Let’s look at the AccessLog Middleware. This is really straight forward and demonstrates the concept well.
```php
public function __invoke(
    ServerRequestInterface $request,
    ResponseInterface $response,
    callable $next
) {
    if (!self::hasAttribute($request, ClientIp::KEY)) {
        throw new RuntimeException(
            'AccessLog middleware needs ClientIp executed before'
        );
    }
    $response = $next($request, $response);
    $message = $this->combined
        ? self::combinedFormat($request, $response)
        : self::commonFormat($request, $response);
    if (
        $response->getStatusCode() >= 400
        && $response->getStatusCode() < 600
    ) {
        $this->logger->error($message);
    } else {
        $this->logger->info($message);
    }
    return $response;
}
```
Note here that there are really two things that this middleware is doing. First, it validates that the request is valid, meaning that it has the additional ClientIP address added by a prior middleware. The second step is that it generates a log message and then decides how to execute the log based on the status code of the response.
Note here how the $next() handler is called in the middle of the method. Behavior that needs to change based on the request should happen before the call. Behavior that changes based on the response needs to happen after this call. Overall, it should be simple.
Let’s take another simple example to really demonstrate this concept. Let’s build a middleware that adds an X-Powered-By header to the response:
```php
public function __invoke(
    ServerRequestInterface $request,
    ResponseInterface $response,
    callable $next
) {
    $response = $response->withHeader('X-Powered-By', 'This Blog');
    return $next($request, $response);
}
```
That’s one approach (henceforth “Pre-Modifying”). Another approach (henceforth “Post-Modifying”):
```php
public function __invoke(
    ServerRequestInterface $request,
    ResponseInterface $response,
    callable $next
) {
    $response = $next($request, $response);
    return $response->withHeader('X-Powered-By', 'This Blog');
}
```
Note that there’s an important distinction between them. The first modifies the response, and then passes the response to further middleware. The second executes the inner middleware and then modifies the returned response.
The fundamental problem with this interface is that it passes a response in to the middleware, rather than letting the inner middleware define the response. On the surface this may not seem like a big deal, because through discipline you can avoid the pitfalls associated with such an approach. However, it really is a fundamental problem that is better solved with a different interface.
The root of the problem is this:
What does $response mean inside of the middleware?
The proponents of this style interface have said many times that it is an “instance that middleware should modify should they need to generate a response”.
The problem is that the actual meaning of the instance passed in depends on what outer middleware (middleware that was called before it) decided the meaning should be. This means that no middleware can actually trust what $response means.
Let me give an example of why this is a real problem, with actual code. Here is a cut-down version of the Cache Middleware, which basically adds cache-control headers:
public function __invoke(
    RequestInterface $request,
    ResponseInterface $response,
    callable $next
) {
    $key = $this->getCacheKey($request);
    $item = $this->cache->getItem($key);

    // If it's cached
    if ($item->isHit()) {
        $headers = $item->get();
        foreach ($headers as $name => $header) {
            $response = $response->withHeader($name, $header);
        }
        if ($this->cacheUtil->isNotModified($request, $response)) {
            return $response->withStatus(304);
        }
        $this->cache->deleteItem($key);
    }

    $response = $next($request, $response);

    // Add Cache-Control header
    if (
        $this->cacheControl
        && !$response->hasHeader('Cache-Control')
    ) {
        $response = $this->cacheUtil->withCacheControl(
            $response,
            $this->cacheControl
        );
    }

    // Add Last-Modified header
    if (!$response->hasHeader('Last-Modified')) {
        $response = $this->cacheUtil->withLastModified(
            $response,
            time()
        );
    }

    // Save in the cache
    if ($this->cacheUtil->isCacheable($response)) {
        $item->set($response->getHeaders());
        $item->expiresAfter(
            $this->cacheUtil->getLifetime($response)
        );
        $this->cache->save($item);
    }

    return $response;
}
Now, let’s walk through what this function is doing. First, it looks up the item in cache. If it finds the item in the cache, it gets the headers and sets all of the cached headers on the response. Then it looks to see if the cache is still valid (the item isn’t modified). If and only if the item isn’t modified is the 304 response returned to the client.
But if the item was modified, things change. The next middleware is called. NOTE: the cached headers still exist on $response, including the old Cache-Control and Last-Modified headers. So if an inner middleware returns an error, the $response is no longer a prototype; it has cache headers attached to it. Any HTTP error generated will therefore carry cache-control headers. Which is normally not a good thing…
The solution here would be to not reuse the $response when adding the headers, and hence avoid the problem altogether.
But that’s not really the cause of the error. There are plenty of middleware that write to the $response before calling the inner middleware. Some set headers. Some set bodies. Some modify status codes.
What this means is that by definition you cannot trust the meaning of $response.
Now, you could make the argument that this is just bad code, and that it’s not a fundamental flaw of the proposal. And indeed, good code will not have these issues. The reason good code won’t have these issues is that good code won’t modify $response before it’s returned from an inner middleware. An outer middleware cannot possibly know anything about the response prior to it being handled. So why would it modify the response before looking at it?
If best practice is to only modify the $response after calling $next(), then why bother passing it in at all?
An argument that’s being made for passing the $response in as a parameter is that it acts as a form of Dependency Inversion. On the surface, this is legitimate. It allows middleware that wants to return a response directly (rather than modifying one created further down the pipe) to not have to depend on a concrete implementation of PSR-7.
This prevents a potential explosion of PSR-7 implementations inside of an application, where 5 middleware each bring in a different PSR-7 implementation.
This is a false tradeoff.
There are several reasons this is a false tradeoff. First, passing the $response as a parameter is not the only (or easiest) way of solving this dependency inversion problem. The easiest would be to not solve it at all, and let individual middleware authors use normal DI techniques (a constructor parameter for the prototype, a use() clause in a closure, etc.).
Another solution would be to pass a factory to create responses into the middleware.
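A sketch of that factory approach, with stand-in types for illustration (the interface names here are hypothetical, not from the proposal): the injected factory always hands out empty prototypes, so the middleware never has to guess what a passed-in response means.

```php
<?php
// Stand-in response type (not real PSR-7).
final class Response
{
    public function __construct(public int $status = 200, public array $headers = []) {}
}

// Hypothetical factory interface: every instance it creates is, by
// contract, an empty prototype -- unambiguous context.
interface ResponseFactory
{
    public function createResponse(int $status = 200): Response;
}

final class PlainResponseFactory implements ResponseFactory
{
    public function createResponse(int $status = 200): Response
    {
        return new Response($status); // always a clean prototype
    }
}

// A middleware that short-circuits with a 404 built from the factory.
final class NotFoundMiddleware
{
    public function __construct(private ResponseFactory $factory) {}

    public function handle(string $path, callable $next): Response
    {
        if ($path === '/missing') {
            return $this->factory->createResponse(404);
        }
        return $next($path);
    }
}

$mw = new NotFoundMiddleware(new PlainResponseFactory());
$ok      = $mw->handle('/home', fn(string $p) => new Response(200));
$missing = $mw->handle('/missing', fn(string $p) => new Response(200));
```

`$ok` comes back from the inner handler with status 200, while `$missing` is a guaranteed-clean 404 from the factory, with no inherited headers.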
The reason both of these solutions are far preferable to passing a $response parameter is that both impart context onto the injected instance: the context being that it is an empty prototype, not a pre-filled, partially-completed response.
One of the arguments used to justify including $response as a parameter is that it’s easy to adapt from the parameter style to another style. For example:
class Adapter implements MiddlewareInterface
{
    private $otherMiddleware;

    public function __construct($other)
    {
        $this->otherMiddleware = $other;
    }

    public function __invoke(
        ServerRequestInterface $request,
        ResponseInterface $response,
        callable $next
    ) {
        return $this->otherMiddleware->handle(
            $request,
            function ($request) use ($next, $response) {
                return $next($request, $response);
            }
        );
    }
}
This seems simple and straightforward. And the flexibility this buys is huge, right? It allows both “formats” to be supported.
Except it’s not.
By definition, any modification to $response (and indeed the default response itself) will be thrown away. The reason is that the other middleware will create its own separate response and return it, since it doesn’t have access to the outer $response.
So this breaks the contract that has been provided by passing the response as a parameter in the first place. Which is going to be an interoperability nightmare.
Can you use the $response parameter method effectively? Absolutely. Hands down. Can you use it wrongly? Absolutely, 100%. The design actively encourages poor usage by providing a response to modify. Many tutorials show that you should end the middleware with a call to return $next($request, $response);. That further complicates things when the author wants to modify the response; it becomes incredibly confusing. What happens if an inner middleware resets the response to a new instance (or clears it)?
It’s like comparing a straight razor with a safety razor. When used perfectly, both give almost identical results. But when you make a tiny mistake with a safety razor, you don’t end up in the hospital (or worse).
Don’t take my word for it. Redditor /u/renang compiled a list of middleware that modify $response prior to calling $next(). The interesting point is that the majority of these actually have significant bugs and inconsistencies due to this effect:
- One middleware calls $next after building the full response, meaning that a later middleware (further in) can completely overwrite the response.
- One calls $next after adding all of the metering headers, meaning that a later middleware can remove all of the headers, rendering the middleware completely ineffective.
- One calls $next after building the full response from the application.
- One calls $next, when a cache item is modified, after modifying the response with all cached headers. This means that cache-control headers will erroneously propagate to all responses that reuse the passed-in response.
- One calls $next after adding the Content-Type header to the response, meaning that error conditions or other responses may have an erroneous content type added.
- One calls $next after setting HSTS headers, meaning that if later middleware write over the response or reset it, the header will be lost.
- One calls $next after setting headers, allowing later middleware to overwrite or reset the response, and may write the header in a response that is actually incorrect.

If $response wasn’t passed in to the middleware, none of these issues would exist. It’s not that this middleware proposal can’t work. It’s that it’s REALLY easy to screw up. And that makes it a bad design from the ground up.
There are several other issues with the proposal that really boil down to more “academic” or “style” points, but are worth mentioning:
The usage of __invoke rather than a named method presents an interesting problem. It was chosen because it allows compatibility with anonymous functions, and hence backwards compatibility with a lot of pre-existing middleware. However, this also prevents any implementing middleware from using __invoke for other purposes.
But further, it also prevents distinguishing between client and server middleware. Since both use the same root interface, it forces the distinction to happen at runtime inside of the implementation. This is mentioned explicitly in the proposal, which says the middleware should throw an InvalidArgumentException if the wrong type is passed.
Using a named method would allow this distinction to occur at the interface level. We could define two interfaces, one for Client and one for Server, and push that error checking up a level.
The current proposal defines the following:
Middleware consumers (e.g. frameworks and middleware stacks) MUST type-hint any method accepting middleware components as arguments formally as callable, and informally as Psr\Http\Middleware\MiddlewareInterface, e.g. using php-doc tags:
/**
 * @param MiddlewareInterface $middleware
 */
public function push(callable $middleware)
{
    // ...
}
This means that by definition no application that implements the proposed middleware is allowed to use the middleware interface as type information. Which means that static analysis will not work, autocompletion will not work, and you will get no help from the engine (or your IDE) with type checking.
The fact that the $next parameter is simply a callable suffers from the same problem. It means there’s no longer any enforcement, or any ability to auto-complete or check types.
Instead, $next should be a formal interface, which would allow for type validation.
All of the above issues can be rectified with a few simple patterns. The first is to rename the method. handle() sounds good, so let’s start there:
interface Middleware
{
    public function handle(
        RequestInterface $request,
        ResponseInterface $response,
        callable $next
    ): ResponseInterface;
}
Next, let’s remove the response as a parameter which will solve the fundamental problem with the proposal that I detailed above:
interface Middleware
{
    public function handle(
        RequestInterface $request,
        callable $next
    ): ResponseInterface;
}
Next, let’s change $next from a callable to a formal interface:
interface Middleware
{
    public function handle(
        RequestInterface $request,
        Frame $frame
    ): ResponseInterface;
}

interface Frame
{
    public function next(
        RequestInterface $request
    ): ResponseInterface;
}
This is all we need to do. It’s really simple. Let’s take our X-Powered-By example from above and see how it looks here:
public function handle(
    RequestInterface $request,
    Frame $frame
): ResponseInterface {
    $response = $frame->next($request);
    return $response->withHeader('X-Powered-By', 'This Blog');
}
Basically the same as before, but without the ability to screw up the response.
Let’s say we want to return a 404 from a middleware. What would we do in this case? We have three options:
Take it as a constructor parameter:
class MyMiddleware implements Middleware
{
    private $response;

    public function __construct(ResponseInterface $response)
    {
        $this->response = $response;
    }

    public function handle(
        RequestInterface $request,
        Frame $frame
    ): ResponseInterface {
        return $this->response->withStatus(404);
    }
}
Bind to a specific instance of PSR-7
class MyMiddleware implements Middleware
{
    public function handle(
        RequestInterface $request,
        Frame $frame
    ): ResponseInterface {
        return new GuzzleHttp\Psr7\Response(404);
    }
}
Modify our original $frame to include a factory:
class MyMiddleware implements Middleware
{
    public function handle(
        RequestInterface $request,
        Frame $frame
    ): ResponseInterface {
        return $frame->factory()->createResponse(404);
    }
}
All three solve the “DI” problem. The first is the most flexible for authors. The second is the most flexible for framework authors. The third is a good mix between the two.
So our final interfaces become:
interface Middleware
{
    public function handle(
        RequestInterface $request,
        Frame $frame
    ): ResponseInterface;
}

interface Frame
{
    public function next(
        RequestInterface $request
    ): ResponseInterface;

    public function factory(): Factory;
}

interface Factory
{
    public function createRequest( /* snip */ ): RequestInterface;
    public function createServerRequest( /* snip */ ): ServerRequestInterface;
    public function createResponse( /* snip */ ): ResponseInterface;
    public function createStream( /* snip */ ): StreamInterface;
    public function createUri( /* snip */ ): UriInterface;
    public function createUploadedFile( /* snip */ ): UploadedFileInterface;
}
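To make the Frame idea concrete, here is a self-contained sketch of how a dispatcher could implement it. The types are simplified stand-ins (no PSR-7, no factory() for brevity, and the `Runner` name is mine): each Frame wraps the remaining middleware queue, and next() pops the head.

```php
<?php
// Simplified stand-in request/response types (not real PSR-7).
final class Request  { public function __construct(public string $path = '/') {} }
final class Response { public function __construct(public array $headers = [], public string $body = '') {} }

interface Middleware
{
    public function handle(Request $request, Frame $frame): Response;
}

interface Frame
{
    public function next(Request $request): Response;
}

// A Frame implementation: holds the remaining queue plus an innermost handler.
final class Runner implements Frame
{
    /** @param Middleware[] $queue */
    public function __construct(private array $queue, private \Closure $last) {}

    public function next(Request $request): Response
    {
        if ($this->queue === []) {
            return ($this->last)($request); // innermost handler produces the response
        }
        $head = $this->queue[0];
        $rest = new self(array_slice($this->queue, 1), $this->last);
        return $head->handle($request, $rest); // hand the rest of the pipe to the head
    }
}

// The X-Powered-By example, in Frame style. Note: no response parameter to misuse.
final class PoweredBy implements Middleware
{
    public function handle(Request $request, Frame $frame): Response
    {
        $response = $frame->next($request);
        $response->headers['X-Powered-By'] = 'This Blog';
        return $response;
    }
}

$runner   = new Runner([new PoweredBy()], fn(Request $r) => new Response([], 'hello'));
$response = $runner->next(new Request('/'));
```

The innermost handler creates the response; `PoweredBy` only ever sees a response that actually came back up the pipe.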
Easy And Simple
One of the arguments used for the proposed syntax is that it’s simple to add new middleware with closures rather than requiring objects for everything.
In reality, this could be trivially solved by creating an adapter:
class CallableServerMiddleware implements ServerMiddlewareInterface
{
    private $callback;

    public function __construct(callable $callback)
    {
        $this->callback = $callback;
    }

    public function handle(
        ServerRequestInterface $request,
        ServerFrameInterface $frame
    ): ResponseInterface {
        return ($this->callback)($request, $frame);
    }
}
Now, it’s worth noting that frameworks can optionally allow callables to be registered directly by using this adapter:
public function append($middleware)
{
    if (!$middleware instanceof ServerMiddlewareInterface) {
        $middleware = new CallableServerMiddleware($middleware);
    }
    // append here
}
Really simple. And since it’s unrelated to dispatching, it’s out of context for the proposal.
Another frequently cited justification for the proposed interface is that it’s backwards compatible with a set of middleware that already exists for PSR-7.
My assertion here is that the correctness gains that we can have by formalizing the interface far outweigh any compatibility issues. This is especially true when you consider that the current interface has such serious flaws.
Should we standardize something broken because it’s used, or should we standardize something robust? Especially when many of the existing usages are broken and incorrect already.
I have released a proof-of-concept package called Tari based on these APIs so that you can try them yourself. The names are a little different (ServerMiddlewareInterface, ServerFrameInterface, and FactoryInterface), but the concept is identical.
This is a far more robust middleware interface set that solves a lot of very significant problems with existing middleware. Note that this isn’t new either; it’s basically identical to StackPHP, Laravel Middleware, Ruby’s Rack, and many others.
I strongly encourage PHP-FIG to recognize the problems with the existing proposal and move to a more robust interface design. One that encourages and can support arbitrary interoperability, not just “works if you get lucky”.
I want to share some of what I’ve been thinking about along those lines. What follows is a collection of some of my evolving thoughts relating to change and complexity. Let me know your thoughts in the comments.
A few years ago, Rich Hickey (the creator of Clojure) gave a talk called “Simple Made Easy”. It’s an excellent talk, and I highly encourage you to watch it. In it, he discusses the difference between simple and easy: one is a measure of complexity, the other of effort.
One of the really interesting takeaways is that we need to measure complexity separately from the effort it takes to produce. The reason is that effort is highly coupled to both skill and tacit knowledge about a problem. Familiarity with a code base can significantly reduce the effort required to work on it, but it doesn’t affect the code base’s complexity.
Quite often in life though, we tend to equate the easy solution with the simple one. “One line of code is all it takes to do X”. That seems simple enough, right? Well, if that one line of code causes a significant amount of complexity “behind the scenes”, then no, it’s not simple. It’s easy, but not simple.
Why is this an important distinction? There are a few reasons, but one of the most significant is the Law of Leaky Abstractions. All abstractions leak. That one line of code is by definition a leaky abstraction. So what looks like one line of code today, may in the future require you to know what’s going on behind the scenes because it’s an imperfect abstraction.
Another significant distinction is that all abstractions require assumptions. And when the abstraction is “easy”, it’s easy because many of those assumptions are hidden away from you. This leads to coupling your code base to hidden assumptions which often will have very significant maintenance overhead in the future.
All change introduces risk. Period. The amount of risk that is introduced does vary wildly with each change. Are you changing well understood code? Are you changing simple code? Are you changing well tested code? Are the changes well peer-reviewed?
Change is so risky, that decades ago we started writing software in a manner that reduced the amount of change we would have to do. We built plugin systems. We built configuration files. We built hooks and event systems to try to isolate change to minimize risk. We made principles like Open/Closed to encourage building for minimal change. All in an effort to minimize risk.
All of these practices increase complexity significantly. In many cases, it’s nearly impossible to understand exactly how these pluggable systems work due to this unbounded complexity. The Drupal community even built plugins to help understand the complexity of the plugin system (there’s a bit of irony for you).
There is a better way of handling risk that actually results in reduced complexity instead of increased complexity. It’s something that many people have been advocating for years. It’s something that many people practice today. It’s called: Testing. Specifically Unit Testing, but all forms of testing are a risk reduction tool.
With a well tested code base, the risks associated with making a change are drastically reduced. The fear of change is all but eliminated. If you practice TDD, it even encourages you to change your code at every step in the way.
So why do we fear testing? Many of us are quick to say “testing? we don’t have time for that!”. Yet we spend countless hours building pluggable and over-abstracted systems. Why do we waste our time building event-based and pluggable systems that are MUCH harder to understand rather than starting off with a simple system and testing it?
I think the reason is that we’ve trained ourselves to optimize for Easy. If you look at frameworks and libraries, they try as hard as possible to make themselves easy to use. They use the term “easy” as a tag-line. And they do so for good reason (marketing).
When you optimize for “easy” it becomes trivial to say “well, I don’t need a test here, the code is too hard to test”. When you optimize for “easy” you can fall into the trap of saying “this is just glue code, it doesn’t need a test”. And if you let yourself get away with not testing one part of a system, it gets easier to justify not testing another part.
All the while, the complexity introduced by all of these hidden assumptions keeps building up (after all, it wouldn’t be easy if it made those assumptions apparent). And building up. And building up. Until you realize that you can’t make the change you want because some hidden assumption prevents it.
Or worse, one of those abstractions leaks in the wrong way, and all of a sudden you’re left with a code base that’s failing and you don’t know why.
And worst of all, you’ve let yourself get away without testing everything properly and therefore can’t change the situation to make it better without taking on yet more risk. It’s a vicious cycle.
When you optimize for simple, clarity is king. Magic is the enemy. That doesn’t mean there isn’t complexity, but it means that complexity is never introduced to enable change. Instead, we allow change, we embrace change. But we do that responsibly through the use of testing.
This is one of the fundamental problems that I have with modern web application frameworks. They focus on the easy/hard distinction. They very rarely focus on the simple/complex distinction.
A simple example is the way many PHP frameworks have adopted “Dependency Injection” (not to be confused with actual Dependency Injection, which has almost nothing to do with what they are doing). Almost every single framework today has a “DI Container” and some way of configuring that container.
Why change code when you need to change a dependency? Just change a YAML file and regenerate the container! Easy, right?
But what if instead of using this complex system, we just created a series of functions? Real code that you can debug and understand. Simple:
function makeFoo(): Foo
{
    return new Foo();
}
Need a dependency? Then wire it up:
function makeBar(?Foo $foo = null): Bar
{
    return new Bar($foo ?: makeFoo());
}
Want to share instances? Then only call that function once!!! Simple. We were able to avoid a few thousand lines of code and an amazing amount of complexity by simply trusting each other.
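One common idiom for the “only call it once” part is a static local inside the maker function; a minimal, self-contained sketch (the class and function names are just from the example above):

```php
<?php
final class Foo {}
final class Bar
{
    public function __construct(public Foo $foo) {}
}

// Memoize the instance in a static local: created on first call,
// shared on every call after that. Plain code you can step through.
function makeFoo(): Foo
{
    static $foo = null;
    return $foo ??= new Foo();
}

function makeBar(?Foo $foo = null): Bar
{
    return new Bar($foo ?? makeFoo());
}

$a = makeBar();
$b = makeBar();
// $a->foo === $b->foo: both Bars share the single memoized Foo.
```

Still no container, still debuggable, and the sharing decision lives in exactly one place.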
Many people will look at the last statement I wrote and say “but I don’t trust the other programmers to do it correctly, I need my container to prevent them from making multiple instances”.
Stop doing that.
Defensive programming is just another way of introducing complexity to avoid change and reduce its risks.
Instead, empower developers. Give them the tools to embrace change. Give them the tools to do it safely. Focus on Simple, don’t fall into the trap of Easy. After all, change is our friend. Without it, we can never move forward.