Dependencies in Disguise

Yesterday I gave a presentation at the Bulgaria PHP Conference (a great event, by the way).

Following an ad-hoc workshop that I gave as part of the hallway track and an entertaining hackathon, I decided it was too late to join the party and went back to the hotel with some other speakers.

Checking out how the day was reflected in social media, I contributed a few more tweets to a conversation that had started earlier in the day (here are the slides of my talk that people are referring to).

I am writing this post to clarify my point and to make the argument easier to follow.

These days, we all know that dependency injection is a best practice that we should use whenever an object has a collaborator. Instead of creating the collaborator in place (where we would have to deal with its dependencies), we use dependency injection:

<?php
class Something
{
    private $collaborator;

    public function __construct(Collaborator $collaborator)
    {
        $this->collaborator = $collaborator;
    }

    // ...
}

We delegate the problem of creating the collaborator to somewhere else. This means that we will not have to deal with creating the collaborating object or its dependencies.

This way, we can move pretty much all object creation to one place: the factory. There are some exceptions where you do not want, or need, the factory to create an object. Value objects and domain objects are commonly created in place.
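
To illustrate the exception, here is a minimal sketch of a value object that is created in place. The `EmailAddress` and `RegistrationForm` classes are hypothetical and only serve as an example; the point is that an object without collaborators gives the factory nothing to wire.

```php
<?php
// Hypothetical value object: it has no collaborators, so there is nothing to inject.
class EmailAddress
{
    private $address;

    public function __construct($address)
    {
        // filter_var() returns false when the string is not a valid address.
        if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException("Invalid email address '$address'");
        }

        $this->address = $address;
    }

    public function asString()
    {
        return $this->address;
    }
}

// Hypothetical caller that creates the value object in place.
class RegistrationForm
{
    public function emailFrom(array $input): EmailAddress
    {
        return new EmailAddress($input['email']);
    }
}

$email = (new RegistrationForm())->emailFrom(['email' => 'user@example.com']);
```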

Now we have a factory that can create all our objects. It has one method for each object it can create:

<?php
class Factory
{
    public function createSomething()
    {
        return new Something($this->createCollaborator());
    }

    private function createCollaborator()
    {
        return new Collaborator();
    }

    // ...
}

To create an object, the factory will have to supply its dependencies. In the example above, the Collaborator object needs to be supplied when a Something object is to be created. This works really well when there is one factory. If you create multiple factories, you need to deal with factories that depend on other factories, because they need them to create objects. This gets very complex and frustrating, so please don't do it. Create one factory. (There are ways of breaking down a factory without the mentioned drawbacks, but this is beyond the scope of this post.)
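
To make the multiple-factory problem concrete, here is a minimal sketch. The split into `CollaboratorFactory` and `SomethingFactory` is hypothetical, as is the `getCollaborator()` accessor, which exists only so the result can be inspected. As soon as one factory needs an object that another factory creates, the factories themselves have dependencies that somebody has to wire.

```php
<?php
// Stand-ins for the classes used in the examples above.
class Collaborator
{
}

class Something
{
    private $collaborator;

    public function __construct(Collaborator $collaborator)
    {
        $this->collaborator = $collaborator;
    }

    // Accessor added for this sketch only.
    public function getCollaborator(): Collaborator
    {
        return $this->collaborator;
    }
}

// Hypothetical split into two factories, for illustration only.
class CollaboratorFactory
{
    public function createCollaborator(): Collaborator
    {
        return new Collaborator();
    }
}

class SomethingFactory
{
    private $collaboratorFactory;

    // The factory itself now has a dependency that somebody must supply.
    public function __construct(CollaboratorFactory $collaboratorFactory)
    {
        $this->collaboratorFactory = $collaboratorFactory;
    }

    public function createSomething(): Something
    {
        return new Something($this->collaboratorFactory->createCollaborator());
    }
}

// Who creates the factories? We are back to wiring objects by hand.
$somethingFactory = new SomethingFactory(new CollaboratorFactory());
$something = $somethingFactory->createSomething();
```

With two factories this is still manageable, but every additional factory adds more of this wiring.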

This approach works really well when all objects can be created up front. A real application, however, will have to make some runtime decisions about which objects to create. Typical examples are selecting a command handler (or controller, if you insist) to execute, or a view to render. Assuming that your application supports at least a solid two-digit number of command handlers, it is obvious that we cannot create all of them up front, and inject them, for example, into a router.

The problem that we need to solve is object creation that is based on certain parameters that only become available "mid-runtime", maybe because they are derived from an HTTP request.

<?php
class HttpPostRequestRouter
{
    public function route(HttpPostRequest $request)
    {
        switch ($request->getParameter('command')) {
            case 'createAccount':
                return new CreateAccountCommandHandler(/* ... */);
        }

        // ...
    }
}

This example is simplified, but illustrates the point. But wait: we cannot create the command handler here, because then we would have to deal with its dependencies, which might include an AccountRepository, for example. So, do we need to inject the factory?

<?php
class HttpPostRequestRouter
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    public function route(HttpPostRequest $request)
    {
        switch ($request->getParameter('command')) {
            case 'createAccount':
                return $this->factory->createCreateAccountCommandHandler(
                    /* ... */
                );
        }

        // ...
    }
}

If we did this, we would have delegated the responsibility to create the CreateAccountCommandHandler and would not have to deal with its dependencies. But passing around the factory is widely considered a bad practice. The reason is simple: when you have access to the factory, you can create any object. This is simply too much power for a single developer. You do not have access to the database? Well, just have the factory create another connection for you, and off you go.

But it gets worse. Instead of passing around the factory, many developers have started to pass around a service locator that looks something like this:

<?php
class ServiceLocator
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    public function getService($identifier)
    {
        switch ($identifier) {
            case 'createAccountCommandHandler':
                return $this->factory->createCreateAccountCommandHandler();

            // ...
        }
    }
}

This is an implicit API. To fetch a "service" you need to pass a string. The getService() method cannot have a sane return type (annotation), because it can return arbitrary objects (as long as they have been declared as services). This means: no auto-completion in the IDE, unless your IDE does some serious magic behind the scenes.
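
The practical consequence can be shown with a reduced version of the string-based locator. This is only a sketch: the one-method `Factory` is a stand-in, and throwing an `InvalidArgumentException` for unknown identifiers is an assumption, since the locator above does not say what happens in that case.

```php
<?php
// Stand-ins for the classes used in the examples above.
class CreateAccountCommandHandler
{
}

class Factory
{
    public function createCreateAccountCommandHandler()
    {
        return new CreateAccountCommandHandler();
    }
}

class ServiceLocator
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    // No return type is possible here: any service may come back.
    public function getService($identifier)
    {
        switch ($identifier) {
            case 'createAccountCommandHandler':
                return $this->factory->createCreateAccountCommandHandler();
        }

        // Assumed error handling for identifiers that are not known.
        throw new InvalidArgumentException("Unknown service '$identifier'");
    }
}

$locator = new ServiceLocator(new Factory());

// Works, but nothing in the signature tells us what $handler is.
$handler = $locator->getService('createAccountCommandHandler');

// A typo in the identifier is only discovered when this line executes.
try {
    $locator->getService('createAcountCommandHandler');
    $typoDetected = false;
} catch (InvalidArgumentException $e) {
    $typoDetected = true;
}
```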

It would be much better to make the API explicit, and have multiple methods, each one having exactly one given return type:

<?php
class ServiceLocator
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    public function getCreateAccountCommandHandler()
    {
        return $this->factory->createCreateAccountCommandHandler();
    }

    // ...
}

This avoids the long, ugly switch statement and helps the IDE to offer auto-completion, because it is clear what each method returns. The downside: there are many methods now, which makes for a big public API. This is how we would use the locator:

<?php
class HttpPostRequestRouter
{
    private $serviceLocator;

    public function __construct(ServiceLocator $serviceLocator)
    {
        $this->serviceLocator = $serviceLocator;
    }

    public function route(HttpPostRequest $request)
    {
        switch ($request->getParameter('command')) {
            case 'createAccount':
                return $this->serviceLocator->getCreateAccountCommandHandler(
                    /* ... */
                );
        }

        // ...
    }
}

The service locator decouples our router further from the object creation that the factory does. But this solution still suffers from the same problem: the API is too big. You can locate (and thus create) any service. And it is far too easy to just make pretty much every object a service.

Undoubtedly, selecting the command handler has to be a runtime decision for the router. But the service locator's API is just too big to pass an instance of it around. It is also a violation of the Interface Segregation Principle (the "I" in SOLID), because it forces the router to depend on quite a few API methods it does not use.

A service locator with an implicit API has significant drawbacks: it is hiding dependencies. I call this Dependency Disguise, and it is an antipattern. If we make those dependencies explicit, we end up with a service locator that has many methods. But why should we only create one service locator? A smaller one would do for the router:

<?php
class HttpPostRequestRouter
{
    private $commandHandlerLocator;

    public function __construct(CommandHandlerLocator $commandHandlerLocator)
    {
        $this->commandHandlerLocator = $commandHandlerLocator;
    }

    public function route(HttpPostRequest $request)
    {
        return $this->commandHandlerLocator->locateCommandHandlerFor(
            $request->getParameter('command')
        );
    }
}

As we can see, the router is a special case: in this example, there is really nothing left in the router. All functionality has been moved to the locator. The router is a locator: it selects (locates) the command handler based on a request. In real life, the router would still do other things, for example make sure that the user is allowed to execute the command.
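
The post only shows how the CommandHandlerLocator is used, not what it looks like inside. One possible implementation, as a sketch: the `CommandHandler` interface and the exception for unknown commands are assumptions made for this example, and the one-method `Factory` is a stand-in.

```php
<?php
// Hypothetical common interface, so the locator can declare a return type.
interface CommandHandler
{
}

class CreateAccountCommandHandler implements CommandHandler
{
}

// Stand-in for the one factory, reduced to what the sketch needs.
class Factory
{
    public function createCreateAccountCommandHandler(): CreateAccountCommandHandler
    {
        return new CreateAccountCommandHandler();
    }
}

// Assumed implementation of the locator that the router depends on.
class CommandHandlerLocator
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    public function locateCommandHandlerFor($command): CommandHandler
    {
        // The locator selects, the factory creates.
        switch ($command) {
            case 'createAccount':
                return $this->factory->createCreateAccountCommandHandler();
        }

        // Hypothetical error handling for commands that are not known.
        throw new InvalidArgumentException("No command handler for '$command'");
    }
}

$locator = new CommandHandlerLocator(new Factory());
$handler = $locator->locateCommandHandlerFor('createAccount');
```

The factory never leaves the locator, so whoever holds a CommandHandlerLocator can obtain command handlers and nothing else.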

Let us look at a different example. Somewhere we have to select the view:

<?php
class Something
{
    private $viewLocator;

    public function __construct(ViewLocator $viewLocator)
    {
        $this->viewLocator = $viewLocator;
    }

    public function doWork()
    {
        // ...

        $this->viewLocator->locateViewFor($result);

        // ...
    }
}

For the purposes of this example, we do not care about what $result is. It may be a result object passed back from a command handler.

All this works because we are strictly separating concerns, following the Single Responsibility Principle (the "S" in SOLID): the locator selects objects and the factory creates objects. We can break down the one big service locator that many developers think they need into smaller (and harmless) locators. We cannot do this with the factory itself, because that would get us into a factory-requires-other-factory dependency problem.
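
How the pieces could be wired together is sketched below. This is an assumption about one possible arrangement, not something shown in the talk: the one factory creates the router and hands itself to the locator, so the locator is the only object that ever sees the factory. The `createRouter()` method, the array-backed `HttpPostRequest`, and the exception for unknown commands are made up for this example.

```php
<?php
// Minimal stand-in for the request class used in the examples above.
class HttpPostRequest
{
    private $parameters;

    public function __construct(array $parameters)
    {
        $this->parameters = $parameters;
    }

    public function getParameter($name)
    {
        return $this->parameters[$name] ?? null;
    }
}

class CreateAccountCommandHandler
{
}

class CommandHandlerLocator
{
    private $factory;

    public function __construct(Factory $factory)
    {
        $this->factory = $factory;
    }

    public function locateCommandHandlerFor($command)
    {
        switch ($command) {
            case 'createAccount':
                return $this->factory->createCreateAccountCommandHandler();
        }

        // Hypothetical error handling for commands that are not known.
        throw new InvalidArgumentException("No command handler for '$command'");
    }
}

class HttpPostRequestRouter
{
    private $commandHandlerLocator;

    public function __construct(CommandHandlerLocator $commandHandlerLocator)
    {
        $this->commandHandlerLocator = $commandHandlerLocator;
    }

    public function route(HttpPostRequest $request)
    {
        return $this->commandHandlerLocator->locateCommandHandlerFor(
            $request->getParameter('command')
        );
    }
}

class Factory
{
    // Hypothetical method name; only the locator receives the factory.
    public function createRouter(): HttpPostRequestRouter
    {
        return new HttpPostRequestRouter(new CommandHandlerLocator($this));
    }

    public function createCreateAccountCommandHandler(): CreateAccountCommandHandler
    {
        return new CreateAccountCommandHandler();
    }
}

$router = (new Factory())->createRouter();
$handler = $router->route(new HttpPostRequest(['command' => 'createAccount']));
```

The router has no way to reach the factory or any unrelated service; it can only ask its locator for a command handler.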

It all boils down to the question of when you can decide which objects to instantiate.
