Application of the Poka-Yoke Principle in Programming by using PHP

Hello! I’m Paul Brook, a developer at SmartSpate. At SmartSpate we always try to write code that is easy to maintain, extend and reuse, because these qualities determine how quickly and efficiently we can implement any feature. One way to achieve this is to write code that simply does not let you make a mistake. A maximally strict interface will not let you call it in the wrong order, and a minimal number of internal states guarantees predictable results. The other day I came across an article that describes how applying these methods makes life easier for developers.

  • When working on code in a medium or large team, it sometimes becomes difficult to understand and use someone else’s code. There are various solutions to this problem.
  • For example, you can agree to follow certain coding standards or use a framework that the whole team knows. However, this is often not enough, especially when you need to fix a bug or add a new feature to old code. It is difficult to remember what specific classes were designed for and how they should work, both separately and together. At such times, you can accidentally introduce side effects or bugs without even realizing it.

These errors can be detected during testing, but there is a real chance that they will slip into production. And even if they are identified, it may take quite some time to roll back the code and fix it.

So, how can we prevent this? With the help of the poka-yoke principle.

What is poka-yoke?

Poka-yoke is a Japanese term that is translated into English roughly as “mistake-proofing”. This concept originated in lean manufacturing, where it refers to any mechanism that helps the equipment operator avoid mistakes.

  • Beyond manufacturing, poka-yoke is often used in consumer electronics. Take, for example, a SIM card that, due to its asymmetric shape, can only be inserted into the adapter the right way round.
SIM card
  • The opposite example (without the poka-yoke principle) is the PS/2 port, which has the same connector shape for both the keyboard and the mouse. They can be distinguished only by color and are therefore easily confused.
PS/2 port
  • The concept of poka-yoke can also be used in programming. The idea is to make the public interfaces of our code as simple and understandable as possible and to raise errors as soon as the code is misused. This may seem obvious, but in fact we often encounter code that does not do this.

Please note that poka-yoke is not intended to prevent intentional abuse. The goal is only to avoid accidental errors, not to protect the code from malicious use. In any case, as long as someone has access to your code, they can always bypass the safeguards if they really want to.

Before discussing specific measures that make code more resistant to errors, it is important to know that the mechanisms of poka-yoke can be divided into two categories:

  • Error prevention;
  • Error detection;

The mechanisms for preventing errors are useful for eliminating errors at an early stage. By simplifying interfaces and behavior, we make sure that no one can accidentally use our code incorrectly (remember the example with a SIM card).

  • Error detection mechanisms, on the other hand, live outside our code. They monitor our applications to track possible errors and warn us about them. An example might be software that determines whether the device connected to a PS/2 port is of the correct type and, if not, tells the user why it does not work. Such software could not prevent the error, since the connectors are the same, but it can detect it and report it.

Next, we’ll look at several methods that can be used to both prevent and detect errors in our applications. Keep in mind that this list is just a starting point. Depending on the application, additional measures can be taken to make the code more resistant to errors. In addition, it is important to be aware of the cost of poka-yoke and to make sure it is worth it for your project: depending on the complexity and size of your application, some measures may be too costly compared to the potential cost of errors. It is up to you and your team to decide which measures suit you best.

Examples of error prevention

Type Declaration

  • Formerly known as type hints in PHP 5, type declarations are a simple way to prevent errors when calling functions and methods in PHP 7. By declaring the types of a function’s arguments, we make it harder to pass the arguments in the wrong order when calling it.

For example, let’s consider a notification that we can send to the user:

<?php

    class Notification {
        private $userId;
        private $subject;
        private $message;

        public function __construct(
            $userId,
            $subject, 
            $message
        ) {
            $this->userId = $userId;
            $this->subject = $subject;
            $this->message = $message;
        }

        public function getUserId()
        {
            return $this->userId;
        }

        public function getSubject()
        {
            return $this->subject;
        }

        public function getMessage()
        {
            return $this->message;
        }
    }
  • Without type declarations, you can accidentally pass variables of the wrong type, which can break the application. For example, we might assume that $userId should be a string, while in fact it has to be an int.
  • If we pass the wrong type to the constructor, the error will probably go unnoticed until the application tries to do something with this notification. At that point we will most likely get a cryptic error message in which nothing points to the place in our code where we passed a string instead of an int. It is therefore usually preferable to make the application fail as early as possible, so that such errors are detected early in development.

In this particular case, you can simply add type declarations: PHP will stop immediately with a fatal error as soon as we try to pass a parameter of the wrong type:

<?php

    declare(strict_types=1);

    class Notification {
        private $userId;
        private $subject;
        private $message;

        public function __construct(
            int $userId,
            string $subject, 
            string $message
        ) {
            $this->userId = $userId;
            $this->subject = $subject;
            $this->message = $message;
        }

        public function getUserId() : int
        {
            return $this->userId;
        }

        public function getSubject() : string
        {
            return $this->subject;
        }

        public function getMessage() : string
        {
            return $this->message;
        }
    }
  • Note that by default PHP will try to coerce incorrect arguments to their expected types. To avoid this and get a fatal error instead, it is important to enable strict typing (strict_types). Because of this, declaring scalar types is not an ideal form of poka-yoke, but it is a good starting point for reducing the number of errors. Even with strict typing disabled, a type declaration still documents which type is expected for the argument.
  • In addition, we declared return types for our methods. This makes it easier to see what values we can expect when calling a particular function.
  • Clearly defined return types also help to avoid piles of switch statements around return values: without explicitly declared return types, our methods can return different types, so anyone using them has to check which type came back in each case. Such checks are easy to forget, which leads to errors that are hard to detect. These errors become much less common once the return type of a function is declared.
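
As a minimal sketch of the difference strict typing makes, the snippet below calls a function with a numeric string where an int is declared. The function formatUserId() is a hypothetical helper used only for this illustration; it is not part of the Notification class above.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, used only for this illustration.
function formatUserId(int $userId): string
{
    return 'user-' . $userId;
}

try {
    // With strict_types enabled, a numeric string is NOT coerced to int:
    // the call fails immediately with a TypeError.
    formatUserId('17466');
    $result = 'no error';
} catch (TypeError $e) {
    $result = 'TypeError';
}

echo $result, PHP_EOL; // prints "TypeError"
```

Without the declare(strict_types=1) line, the same call would be accepted and the string silently converted to the integer 17466.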

Value Objects

A problem that type declarations cannot solve is that a function with several arguments still lets you mix up their order in the call.

  • When arguments have different types, PHP can warn us about the violation of the order of arguments, but this does not work if we have several arguments with the same type.

To avoid errors, in this case, we could wrap our arguments into value objects:

class UserId {
        private $userId;

        public function __construct(int $userId) {
            $this->userId = $userId;
        }

        public function getValue() : int
        {
            return $this->userId;
        }
    }

    class Subject {
        private $subject;

        public function __construct(string $subject) {
            $this->subject = $subject;
        }

        public function getValue() : string
        {
            return $this->subject;
        }
    }

    class Message {
        private $message;

        public function __construct(string $message) {
            $this->message = $message;
        }

        public function getValue() : string
        {
            return $this->message;
        }
    }

    class Notification {
        /* ... */

        public function __construct(
            UserId $userId,
            Subject $subject, 
            Message $message
        ) {
            $this->userId = $userId;
            $this->subject = $subject;
            $this->message = $message;
        }

        public function getUserId() : UserId { /* ... */ }

        public function getSubject() : Subject { /* ... */ }

        public function getMessage() : Message { /* ... */ }
    }

Since our arguments now have a very specific type, they are almost impossible to confuse.

  • An additional advantage of value objects over scalar type declarations is that we no longer need to enable strict typing in each file. And if we do not need to remember it, we cannot forget it.
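
The following sketch illustrates the point with two value objects of the same underlying type (string). The function describe() is a hypothetical stand-in for the Notification constructor; swapping its arguments is rejected even though strict typing is not enabled.

```php
<?php

final class Subject
{
    private $value;

    public function __construct(string $value) { $this->value = $value; }

    public function getValue(): string { return $this->value; }
}

final class Message
{
    private $value;

    public function __construct(string $value) { $this->value = $value; }

    public function getValue(): string { return $this->value; }
}

// Hypothetical function standing in for the Notification constructor.
function describe(Subject $subject, Message $message): string
{
    return $subject->getValue() . ': ' . $message->getValue();
}

$subject = new Subject('Demo notification');
$message = new Message('Hello!');

$ok = describe($subject, $message);

try {
    describe($message, $subject); // arguments swapped by accident
    $swapped = 'accepted';
} catch (TypeError $e) {
    $swapped = 'rejected';
}

echo $ok, PHP_EOL;      // prints "Demo notification: Hello!"
echo $swapped, PHP_EOL; // prints "rejected"
```

With two plain string arguments, the swapped call would have been accepted without complaint.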

Validation

  • When working with value objects, we can encapsulate the validation of our data inside the objects themselves. This prevents the creation of a value object with an invalid state, which could lead to problems later in other layers of our application.
  • For example, we may have a rule that any UserId must always be positive. We could of course check this whenever we receive a UserId as input, but the check is easily forgotten in one place or another. And even if that omission leads to an actual error in another layer of our application, it may be difficult to understand from the error message what actually went wrong, which complicates debugging.

To prevent such errors, we could add some validation to the UserId constructor:

class UserId {
        private $userId;

        public function __construct($userId) {
            if (!is_int($userId) || $userId <= 0) {
                throw new \InvalidArgumentException(
                    'UserId should be a positive integer.'
                );
            }

            $this->userId = $userId;
        }

        public function getValue() : int
        {
            return $this->userId;
        }
    }

Thus, we can always be sure that when working with the UserId object, it has the correct state. This saves us from having to constantly check the data at different levels of the application.

  • Note that we could add a scalar type declaration here instead of using is_int, but that would force us to enable strict typing wherever UserId is used. Without it, PHP will try to cast other types to int whenever they are passed to UserId. This can be a problem, since we could, for example, pass a float, which is probably a mistake because user IDs are not normally floats. In other cases, for example when working with a Price object, disabling strict typing can lead to rounding errors, because PHP automatically converts float values to int.
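
To see the validating constructor at work, here is a small self-contained sketch. It repeats the UserId class (assuming that valid IDs start at 1) and adds a hypothetical helper, isValidUserId(), that only reports whether a raw value is accepted.

```php
<?php

final class UserId
{
    private $userId;

    public function __construct($userId)
    {
        // Assumption for this sketch: valid IDs start at 1.
        if (!is_int($userId) || $userId <= 0) {
            throw new InvalidArgumentException(
                'UserId should be a positive integer.'
            );
        }

        $this->userId = $userId;
    }

    public function getValue(): int
    {
        return $this->userId;
    }
}

// Hypothetical helper: reports whether a raw value is accepted as a UserId.
function isValidUserId($raw): bool
{
    try {
        new UserId($raw);
        return true;
    } catch (InvalidArgumentException $e) {
        return false;
    }
}

var_dump(isValidUserId(17466)); // bool(true)
var_dump(isValidUserId(-5));    // bool(false): not positive
var_dump(isValidUserId(17.5));  // bool(false): a float, not an int
var_dump(isValidUserId('42'));  // bool(false): a string, not an int
```

Because the check lives in the constructor, the float and the string are rejected regardless of whether the calling file enables strict typing.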

Immutability

  • By default, objects in PHP are effectively passed by reference (strictly speaking, what is passed is a handle to the same object). This means that when we change an object in one place, the change is immediately visible everywhere else that holds it.

Although this approach has its advantages, it has some drawbacks. Let’s consider an example of a notification sent to a user via SMS and e-mail:

interface NotificationSenderInterface
    {
        public function send(Notification $notification);
    }

    class SMSNotificationSender implements NotificationSenderInterface
    {
        public function send(Notification $notification) {
            $this->cutNotificationLength($notification);

            // Send an SMS...
        }

        /**
         * Makes sure the notification does not exceed the length of an SMS.
         */
        private function cutNotificationLength(Notification $notification)
        {
            $message = $notification->getMessage();
            $messageString = substr($message->getValue(), 0, 160);
            $notification->setMessage(new Message($messageString));
        }
    }

    class EmailNotificationSender implements NotificationSenderInterface
    {
        public function send(Notification $notification) {
            // Send an e-mail ...
        }
    }

    $smsNotificationSender = new SMSNotificationSender();
    $emailNotificationSender = new EmailNotificationSender();

    $notification = new Notification(
        new UserId(17466),
        new Subject('Demo notification'),
        new Message('Very long message ... over 160 characters.')
    );

    $smsNotificationSender->send($notification);
    $emailNotificationSender->send($notification);
  • Because both senders receive the same Notification object, we get an unintended side effect. When SMSNotificationSender shortened the message, the Notification object was updated for the entire application, so the message was also truncated when it was later passed to EmailNotificationSender.

To fix this, we make the Notification object immutable. Instead of providing set-methods for changing it, we add with-methods that make a copy of the original Notification before applying the change:

class Notification {
        public function __construct( ... ) { /* ... */ }

        public function getUserId() : UserId { /* ... */ }

        public function withUserId(UserId $userId) : Notification {
            $c = clone $this;
            $c->userId = clone $userId;
            return $c;
        }

        public function getSubject() : Subject { /* ... */ }

        public function withSubject(Subject $subject) : Notification {
            $c = clone $this;
            $c->subject = clone $subject;
            return $c;
        }

        public function getMessage() : Message { /* ... */ }

        public function withMessage(Message $message) : Notification {
            $c = clone $this;
            $c->message = clone $message;
            return $c;
        }
    }

Now, when we change a Notification, for example by shortening its message, the change no longer affects the rest of the application, which prevents this kind of side effect.

  • However, note that in PHP it is very difficult (if not impossible) to make an object truly immutable. But to make our code more resistant to errors, it is enough to provide “immutable” with-methods instead of set-methods, since users of the class no longer need to remember to clone an object before making changes.
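
A minimal, self-contained sketch of how the SMS truncation could look with a with-method. The Notification class is reduced to its message, and cutToSmsLength() is a hypothetical helper that mirrors cutNotificationLength() but returns a new object instead of modifying its argument.

```php
<?php

final class Message
{
    private $value;

    public function __construct(string $value) { $this->value = $value; }

    public function getValue(): string { return $this->value; }
}

// Reduced to a single field to keep the sketch short.
final class Notification
{
    private $message;

    public function __construct(Message $message) { $this->message = $message; }

    public function getMessage(): Message { return $this->message; }

    public function withMessage(Message $message): Notification
    {
        $copy = clone $this;
        $copy->message = $message;
        return $copy;
    }
}

// Hypothetical helper: returns a shortened copy, leaves the original alone.
function cutToSmsLength(Notification $notification): Notification
{
    $text = substr($notification->getMessage()->getValue(), 0, 160);
    return $notification->withMessage(new Message($text));
}

$original = new Notification(new Message(str_repeat('x', 200)));
$forSms = cutToSmsLength($original);

echo strlen($forSms->getMessage()->getValue()), PHP_EOL;   // prints 160
echo strlen($original->getMessage()->getValue()), PHP_EOL; // prints 200
```

The e-mail sender can keep using $original and still sees the full message.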

Return null objects

  • Sometimes we come across functions and methods that can return either a value or null. These null return values can be a problem, because we almost always need to check for null before we can do anything with the value. Again, this is easy to forget.

To get rid of the need to check the return values, we could return null objects instead. For example, we can have a ShoppingCart with or without a discount:

interface Discount {
        public function applyTo(int $total);
    }

    interface ShoppingCart {
        public function calculateTotal() : int;

        public function getDiscount() : ?Discount;
    }

When calculating the final cost of a ShoppingCart, we now always need to check whether getDiscount() returned null or a discount before calling the applyTo method:

 $total = $shoppingCart->calculateTotal();

    if ($shoppingCart->getDiscount()) {
        $total = $shoppingCart->getDiscount()->applyTo($total);
    }

If we skip this check, calling applyTo() on null will stop the script with a fatal error whenever getDiscount() returns null.

On the other hand, these checks can be avoided if we return a null object when the discount is not provided:

class ShoppingCart {
        public function getDiscount() : Discount {
            return !is_null($this->discount) ? $this->discount : new NoDiscount();
        }
    }

    class NoDiscount implements Discount {
        public function applyTo(int $total) {
            return $total;
        }
    }
  • Now when we call getDiscount(), we always get a Discount object, even if there is no discount. Thus, we can apply the discount to the total even when none exists, and we no longer need the if statement.
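
The calling code without the if statement might look like the sketch below. To keep it self-contained, it repeats the Discount types and adds two assumptions that are not in the text above: a FixedDiscount class and a ShoppingCart constructor that takes the total and an optional discount.

```php
<?php

interface Discount
{
    public function applyTo(int $total): int;
}

final class NoDiscount implements Discount
{
    public function applyTo(int $total): int { return $total; }
}

// Hypothetical discount type, for illustration only.
final class FixedDiscount implements Discount
{
    private $amount;

    public function __construct(int $amount) { $this->amount = $amount; }

    public function applyTo(int $total): int
    {
        return max(0, $total - $this->amount);
    }
}

final class ShoppingCart
{
    private $total;
    private $discount;

    // Assumed constructor; the article does not show one.
    public function __construct(int $total, ?Discount $discount = null)
    {
        $this->total = $total;
        $this->discount = $discount;
    }

    public function calculateTotal(): int { return $this->total; }

    public function getDiscount(): Discount
    {
        return $this->discount ?? new NoDiscount();
    }
}

// No null check needed: getDiscount() always returns a Discount.
function totalWithDiscount(ShoppingCart $cart): int
{
    return $cart->getDiscount()->applyTo($cart->calculateTotal());
}

echo totalWithDiscount(new ShoppingCart(1000)), PHP_EOL;                         // prints 1000
echo totalWithDiscount(new ShoppingCart(1000, new FixedDiscount(250))), PHP_EOL; // prints 750
```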

Optional Dependencies

  • For the same reasons that we want to avoid null return values, we want to get rid of optional dependencies, simply by making all dependencies mandatory.

Take, for example, the following class:

class SomeService implements LoggerAwareInterface {
        public function setLogger(LoggerInterface $logger) { /* ... */ }

        public function doSomething() {
            if ($this->logger) {
                $this->logger->debug('...');
            }

            // do something

            if ($this->logger) {
                $this->logger->warning('...');
            }

            // etc...
        }
    }

There are two problems:

  • We constantly need to check for the presence of a logger in our doSomething() method.
  • When the SomeService class is configured in our service container, someone may forget to configure the logger, or may not even know that the class supports one.

We can simplify the code by making LoggerInterface a mandatory dependency:

class SomeService {
        public function __construct(LoggerInterface $logger) { /* ... */ }

        public function doSomething() {
            $this->logger->debug('...');

            // do something

            $this->logger->warning('...');

            // etc...
        }
    }
  • Now our public interface has become less cumbersome, and whenever someone creates a new instance of SomeService, they know that the class requires an instance of LoggerInterface, so they cannot forget to provide it.
  • In addition, we got rid of the need to constantly check for the presence of a logger, which makes doSomething() easier to understand and less susceptible to errors whenever someone changes it.

If we wanted to use SomeService without a logger, we could apply the same logic as with the return of the null object:

$service = new SomeService(new NullLogger());
  • As a result, this approach has the same effect as using the optional setLogger() method, but it simplifies our code and reduces the probability of an error in the dependency injection container.
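
The idea behind a null logger can be sketched without any library. The Logger interface below is a hypothetical, one-method stand-in for a real logger interface such as LoggerInterface, and SilentLogger and MemoryLogger are illustrative names, not existing classes.

```php
<?php

// Hypothetical, minimal stand-in for a real logger interface.
interface Logger
{
    public function debug(string $message): void;
}

// Null object: satisfies the dependency but does nothing.
final class SilentLogger implements Logger
{
    public function debug(string $message): void
    {
        // Intentionally empty.
    }
}

// Collects messages in memory so the sketch can show what was logged.
final class MemoryLogger implements Logger
{
    public $messages = [];

    public function debug(string $message): void
    {
        $this->messages[] = $message;
    }
}

final class SomeService
{
    private $logger;

    public function __construct(Logger $logger)
    {
        $this->logger = $logger;
    }

    public function doSomething(): string
    {
        // No "if ($this->logger)" check: a logger is always present.
        $this->logger->debug('doing something');
        return 'done';
    }
}

$quiet = new SomeService(new SilentLogger());
$memoryLogger = new MemoryLogger();
$verbose = new SomeService($memoryLogger);

$quiet->doSomething();
$verbose->doSomething();

echo count($memoryLogger->messages), PHP_EOL; // prints 1
```

Both services run exactly the same code path; only the injected logger decides whether anything is recorded.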

Public methods

  • To make the code easier to use, it’s better to limit the number of public methods in classes. The code then becomes less confusing, and we are less likely to break backward compatibility when refactoring.

To keep the number of public methods to a minimum, it helps to think of public methods as transactions. Consider, for example, the transfer of money between two bank accounts:

$account1->withdraw(100);
$account2->deposit(100);
  • Although a database transaction can roll back the withdrawal if the deposit cannot be made (or vice versa), it cannot prevent us from forgetting to call either $account1->withdraw() or $account2->deposit(), which will result in an incorrect balance.

Fortunately, we can easily fix this by replacing two separate methods with one transactional one:

$account1->transfer(100, $account2);

As a result, our code becomes more reliable, because it is much harder to make the mistake of completing only part of the transaction.
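
One possible shape of such an Account class is sketched below. The constructor, getBalance() and the validation rules are assumptions made for this illustration; the key point is that withdraw() and deposit() are private, so callers can only move money through transfer().

```php
<?php

final class Account
{
    private $balance;

    // Assumed constructor: the opening balance in whole currency units.
    public function __construct(int $balance)
    {
        $this->balance = $balance;
    }

    public function getBalance(): int
    {
        return $this->balance;
    }

    // The only public way to move money: both steps happen or neither does.
    public function transfer(int $amount, Account $to): void
    {
        if ($amount <= 0) {
            throw new InvalidArgumentException('Amount must be positive.');
        }
        if ($amount > $this->balance) {
            throw new RuntimeException('Insufficient funds.');
        }

        $this->withdraw($amount);
        $to->deposit($amount);
    }

    private function withdraw(int $amount): void
    {
        $this->balance -= $amount;
    }

    private function deposit(int $amount): void
    {
        $this->balance += $amount;
    }
}

$account1 = new Account(500);
$account2 = new Account(0);

$account1->transfer(100, $account2);

echo $account1->getBalance(), PHP_EOL; // prints 400
echo $account2->getBalance(), PHP_EOL; // prints 100
```

In a real system the two balance updates would additionally be wrapped in a database transaction; this sketch only shows the narrowed public interface.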

Examples of error detection

  • Error detection mechanisms are not designed to prevent errors. They only alert us to problems once they are discovered. Most of the time they live outside our application and check the code at certain intervals or after specific changes.

Unit tests

Unit tests can be a great way to ensure that new code works correctly. They also help to make sure that the code still works correctly after someone has refactored part of the system.

  • Since someone can forget to run the unit tests, it is recommended to run them automatically on every change using services such as Travis CI and GitLab CI. Thanks to them, developers are notified when something breaks, which also helps to make sure that the changes work as intended.

In addition to detecting errors, unit tests are excellent examples of using specific parts of the code, which in turn prevents errors when someone else uses our code.

Code Coverage Reports and Mutation Testing

  • Since we can forget to write enough tests, it is useful to generate code coverage reports automatically when the tests run, using services such as Coveralls. Whenever our code coverage drops, Coveralls sends us a notification, and we can add the missing tests. Thanks to Coveralls, we can also see how code coverage changes over time.
  • Another way to ensure that we have enough unit tests is to use mutation tests, for example with Humbug. As the name suggests, they check whether our code is really covered by tests by slightly changing (mutating) the source code and then running the unit tests, which should then fail because of the changes made.

Using code coverage reports and mutation tests, we can verify that our unit tests are sufficient to prevent errors.

Static code analyzers

  • Code analyzers can detect errors in our application early in the development process. For example, IDEs such as PhpStorm use code analyzers to warn us about errors and give hints while we write the code. The problems they report range from simple syntax errors to duplicated code.
  • In addition to the analyzers built into most IDEs, third-party and even custom analyzers can be included in the build process of our applications to identify specific problems. An incomplete list of analyzers suitable for PHP projects can be found on GitHub.

There are also online solutions, for example, SensioLabs Insights.

Logging

Unlike most other error detection mechanisms, logging can help detect errors in the application when it works in production.

  • Of course, this requires that the code writes to the log whenever something unexpected happens. Even when our code supports loggers, you can forget about them when setting up the application. Therefore, optional dependencies should be avoided (see above).
  • Although most applications keep at least some kind of log, the information recorded there becomes really interesting when it is analyzed and monitored with tools such as Kibana or Nagios. They can give an idea of which errors and warnings occur in our application when people actively use it, rather than when it is being tested.

Do not suppress errors

  • Even when errors are logged, it happens that some of them are suppressed. PHP has a tendency to carry on when a “recoverable” error occurs. However, errors can be useful when developing or testing new features, since they can point to bugs in the code. That’s why most code analyzers warn you when they find that you use @ to suppress errors, as this can hide errors that will inevitably reappear as soon as the application is in use.
  • As a general rule, it is better to set PHP’s error_reporting level to E_ALL to receive even the smallest warnings. However, do not forget to log these messages somewhere and hide them from users, so that no confidential information about your application architecture or potential vulnerabilities is exposed to end users.
  • In addition to error_reporting, it is important to always enable strict_types so that PHP does not attempt to automatically cast function arguments to their expected type, as this can lead to hard-to-find errors (for example, rounding errors when casting a float to int).
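
As a small sketch, the reporting settings described above can be applied at runtime as shown below. In production these values would more typically be set in php.ini; the runtime calls are used here only to keep the example self-contained.

```php
<?php

// Report every kind of error, warning and notice.
error_reporting(E_ALL);

// Do not show messages to end users...
ini_set('display_errors', '0');

// ...but make sure they are written to the error log instead.
ini_set('log_errors', '1');
```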

Usage Outside PHP

Because poka-yoke is a concept rather than a specific technique, it can also be used in non-PHP areas.

Infrastructure

At the infrastructure level, many errors can be prevented by creating a common development environment identical to the production environment, using tools such as Vagrant.

Automating the deployment of an application using build servers, such as Jenkins and GoCD, can help prevent errors when deploying changes to an application since this process can include many steps, some of which are easy to forget.

REST API

  • When creating the REST API, you can implement poka-yoke to simplify the use of the API. For example, we can make sure that we return an error whenever an unknown parameter is passed to the URL or the body of the request. This may seem strange since we obviously want to avoid “breaking” our API clients, but it’s usually better to warn developers using our API as soon as possible about incorrect usage so that bugs are fixed at an early stage of the development process.
  • For example, we can have the color parameter in the API, but someone who uses our API can accidentally use the colour parameter. Without any warning, this error can easily get into production, until it is noticed by end users.
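
A minimal sketch of such a check is shown below. The helper findUnknownParameters(), the allowed parameter names and the shape of the $response array are all assumptions made for this illustration; a real API would turn the result into an HTTP 400 response in whatever framework it uses.

```php
<?php

// Hypothetical helper: returns the names of parameters the endpoint does not know.
function findUnknownParameters(array $params, array $allowed): array
{
    return array_values(array_diff(array_keys($params), $allowed));
}

$allowed = ['color', 'size'];

// The client misspelled "color" as "colour".
$request = ['colour' => 'red', 'size' => 'xl'];

$unknown = findUnknownParameters($request, $allowed);

if ($unknown !== []) {
    // Fail loudly instead of silently ignoring the parameter.
    $response = [
        'status' => 400,
        'error'  => 'Unknown parameter(s): ' . implode(', ', $unknown),
    ];
} else {
    $response = ['status' => 200];
}

echo $response['status'], PHP_EOL; // prints 400
echo $response['error'], PHP_EOL;  // prints "Unknown parameter(s): colour"
```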

To learn more about building APIs that are pleasant to work with, read Build APIs You Won’t Hate.

Application Configuration

  • Virtually all applications need some kind of configuration. Most often, developers provide as many default settings as possible, which simplifies configuration. However, as in the example with color and colour, it is easy to mistype a configuration parameter, which causes the application to silently fall back to the default value.

Such mistakes are difficult to track down because the application does not raise an error as such. The best way to be notified of a misconfiguration is simply not to provide any default values and to raise an error as soon as a configuration option is missing.
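
A configuration wrapper without defaults could be sketched as follows. The Config class and its get() method are hypothetical names chosen for this illustration.

```php
<?php

// Hypothetical configuration wrapper that deliberately has no default values.
final class Config
{
    private $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->values)) {
            // Fail immediately instead of falling back to a default.
            throw new OutOfBoundsException(
                "Missing configuration option: $key"
            );
        }

        return $this->values[$key];
    }
}

// The option was misspelled as "colour" in the configuration.
$config = new Config(['colour' => 'red']);

try {
    $color = $config->get('color');
    $error = null;
} catch (OutOfBoundsException $e) {
    $error = $e->getMessage();
}

echo $error, PHP_EOL; // prints "Missing configuration option: color"
```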

Preventing user errors

The concept of poka-yoke can also be used to prevent or detect user errors. For example, in accounting software, the account number entered by the user can be verified with a check digit algorithm. This catches most typos in the account number as soon as it is entered.
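
The text does not say which check digit algorithm is meant; as one well-known example, the sketch below uses the Luhn algorithm. The function name passesLuhnCheck() and the sample numbers are chosen for this illustration only.

```php
<?php

// Returns true if the digit string passes the Luhn check.
function passesLuhnCheck(string $number): bool
{
    if ($number === '' || !ctype_digit($number)) {
        return false;
    }

    $sum = 0;
    $double = false; // the rightmost digit (the check digit) is not doubled

    // Walk the digits from right to left, doubling every second one.
    for ($i = strlen($number) - 1; $i >= 0; $i--) {
        $digit = (int) $number[$i];

        if ($double) {
            $digit *= 2;
            if ($digit > 9) {
                $digit -= 9; // same as adding the two digits of the product
            }
        }

        $sum += $digit;
        $double = !$double;
    }

    return $sum % 10 === 0;
}

var_dump(passesLuhnCheck('79927398713')); // bool(true)
var_dump(passesLuhnCheck('79927398710')); // bool(false): last digit mistyped
var_dump(passesLuhnCheck('7992739871x')); // bool(false): not all digits
```

Luhn detects every single-digit typo and most swaps of two adjacent digits, but not every possible error, so it complements rather than replaces other validation.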

Conclusion

  • Although poka-yoke is only a concept and not a specific set of tools, there are different principles that we can apply to the code and development process to prevent errors or detect them at an early stage. Very often, these mechanisms will be specific to the application itself and its business logic, but there are a few simple methods and tools that you can use to make any code more reliable.
  • The main thing to remember is that although we want to avoid errors in production, they can be very useful during development, and we should not be afraid to raise them as early as possible so that they are easier to track down. These errors can be raised either by the code itself or by separate processes that run outside the application and monitor it.

To further reduce the number of errors, we must strive to ensure that the public interfaces of our code are as simple and straightforward as possible.