This is an excerpt from my book: Drupal 8 module development - second edition. Do check out the rest of the book/chapter to see how you can write automated tests in Drupal 8.

In order to really understand how entity data is modelled, we need to understand the TypedData API. Unfortunately, this API still remains quite a mystery for many. But you're in luck because, in this section, we're going to get to the bottom of it.

Why TypedData?

It helps to understand things better if we first talk about why there was the need for this API. It all has to do with the way PHP as a language is, compared to others, and that is, loosely typed. This means that in PHP it is very difficult to use native language constructs to rely on the type of certain data or understand more about that data.

The difference between the string "1" and integer 1 is a very common example. We are often afraid of using the === sign to compare them because we never know what they actually come back as from the database or wherever. So, we either use == (which is not really good) or forcefully cast them to the same type and hope PHP will be able to get it right.

In PHP 7, we have type hinting for scalar values in function parameters which is good, but still not enough. Scalar values alone are not going to cut it if you think of the difference between 1495875076 and 2495877076. The first is a timestamp while the second is an integer. Even more importantly, the first has meaning while the second one does not. At least seemingly. Maybe I want it to have some meaning because it is the specific formatting for the IDs in my package tracking app.

Drupal was not exempt from the problems this loosely typed nature of PHP can create. Drupal 7 developers know very well what it meant to deal with field values in this way. But not anymore because we now have the TypedData API in Drupal.

What is TypedData?

The TypedData API is a low-level and generic API that essentially does two things from which a lot of power and flexibility is derived.

First, it wraps "values" of any kind of complexity. More importantly, it forms "values". This can be a simple scalar value to a multidimensional map of related values of different types that together are considered one value. Let's take, for example, a New York license plate: 405-307. This is a simple string but we "wrap" it with TypedData to give it meaning. In other words, we know programmatically that it is a license plate and not just a random PHP string. But wait, that plate number can be found in other states as well (possibly, I have no idea). So, in order to better define a plate, we need also a state code: NY. This is another simple string wrapped with TypedData to give it meaning—a state code. Together, they can become a slightly more complex piece of TypedData: US license plate, which has its own meaning.

Second, as you can probably infer, it gives meaning to the data that it wraps. If we continue our previous example, the US license plate TypedData now has plenty of meaning. So, we can programmatically ask it what it is and all sorts of other things about it, such as what is the state code for that plate. And the API facilitates this interaction with the data.

As I mentioned, from this flexibility, a lot of power can be built on top. Things like data validation are very important in Drupal and rely on TypedData. As we will see later in this chapter, validation happens at the TypedData level using constraints on the underlying data.

Check out the book for a getting a deeper understanding on how this API is used to model the entity system in Drupal.