A Pragmatic Architecture I
While working on Symfony applications for the last few years, I have had a lot of fun but also my fair share of headaches, especially when maintaining them. Naturally, I am interested in keeping those headaches to a minimum, and in my experience one of their biggest sources is the architecture.
I am always striving to find the best possible architecture for a project (as probably all of you do) and read a lot of articles on related topics. But I seldom find an example that shows how to put all the different ideas and concepts together in a real-world application. Until a few months ago it always felt like a piece was missing somewhere, or the result felt very impractical.
What I want to show you in this article is not a revolutionary new concept, but how to put some good concepts and ideas together to build a Symfony application that is reliable, maintainable, extendable, testable and easy to grasp, with as few dependencies as possible. To get a first impression of what we are going to build, have a look at the image below. We are going to build a SPA using REST, CQRS, messaging and some Symfony components to create custom objects from requests. The example application used in this article can also be found in the repository here. Feel free to check it out and use it to follow along.
There is already a lot of very good literature on topics like REST, CQRS and hexagonal architecture out there, so I won’t explain them here again. Please read up on those topics if you are not familiar with them, as I am going to use the concepts and some of the terminology in the rest of this article. You might find the following links useful to get started:
Well, almost. Before we get into the details, I want to give you some context on what the applications I worked on looked like, so that the decisions are comprehensible. The next parts will be way more practical - pinky-swear! Feel free to skip this part if you are only interested in the technical details.
The other parts (coming soon):
- Part II - REST
- Part III - Argument value resolvers
- Part IV - CQRS and Messaging
- Part V - Wrapping things up
Our usual projects are single page applications with REST APIs and go way beyond simple CRUD applications. Quite a few years ago most of the code was placed in the controllers. Later we structured the applications in a Controller-Service(s)-Repository-Entity way. That is not bad, but not good either: testability, maintenance and comprehensibility are still pain points.
As mentioned above, we had and still have REST APIs, but not only simple CRUD ones. Trying to stick with REST as much as possible, we used generic update requests to update only a very specific part of a model, e.g. disabling a user through a request to PATCH /api/users/1 with a matching payload. This means that on the backend it is both hard and fragile to find out what the real intent of such a generic update request was, and whether the user is even allowed to do this.
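To make the problem more tangible, here is a minimal sketch in plain PHP of what the backend ends up doing with such a generic request. The function name guessIntents and the payload keys are hypothetical and only serve as an illustration; they are not taken from the projects described here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: tries to infer what a generic PATCH /api/users/{id}
 * request was meant to do by looking at the keys in the payload.
 *
 * @param array<string, mixed> $payload decoded JSON body
 * @return list<string> guessed intents, possibly none or more than one
 */
function guessIntents(array $payload): array
{
    $intents = [];

    if (array_key_exists('enabled', $payload)) {
        // Anything other than a strict boolean false counts as "enable".
        $intents[] = $payload['enabled'] === false ? 'disable_user' : 'enable_user';
    }
    if (array_key_exists('email', $payload)) {
        $intents[] = 'change_email';
    }
    if (array_key_exists('roles', $payload)) {
        $intents[] = 'change_roles';
    }

    return $intents;
}

// One request, two intents, and each needs its own permission check.
echo implode(', ', guessIntents(['enabled' => false, 'roles' => ['ROLE_ADMIN']])), PHP_EOL;
```

Every new field adds another branch, and a single request can carry several intents at once, which is exactly what makes the permission question so awkward.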
The APIs were rarely documented and although we are working in small teams this was the cause of a lot of back and forth between developers from time to time. Furthermore, the models in the client application may diverge unnoticed from the ones in the backend because there is no simple way to compare them in an automated way.
Validation was done differently in every project, from request validation with JSON schema to Symfony validation and probably quite a few things in between. JSON schema is in general not bad, but the code and the schema can drift apart unnoticed when refactoring unless this is covered by a test case. It would be nice to keep those things together and in the code instead of in multiple separate files.
Usually, the FOSRestBundle and the JMSSerializerBundle were used for doing the API stuff, but only a very small part of both bundles (generating routes and returning JSON responses) was used and needed in those projects. By that I don’t want to say that those bundles are bad in any way - we just neither used nor needed most of what they offer.
Let’s take a look at a simplified code example.
Here we have a patch action for updating a customer. It takes all data from the request, passes it on to a service (the manager) as an array and gets an updated entity as a response. The response will be wrapped in a DTO and serialized based on serialization groups.
There is not much code here, but still, there are some things that can be improved:
- Passing on a generic array with data instead of a typed object: No one knows what exactly is inside this array, what its structure looks like and what types can be expected (unless you have a look at the JSON schema file in this case). Also, some information like the project id has to be added manually. All in all this is quite error-prone, and a lot of tests are needed to cover all cases for the manager.
- Using serialization groups may work for quite a while, but sooner or later it gets very complex when only one DTO exists per entity and everything has to be resolved with different groups.
- The generic patch endpoint might be needed, but it also prevents us from telling what the real intent of this request was. Was a comment added or did the order get canceled? It would be nice to have specific endpoints whenever possible, as it would make our lives as developers, and the code, tremendously easier.
The service which handles the call from the controller does a lot of things. But mostly it processes the data from the request and applies all the data provided to the entity. What happens in detail is not clear at this point. The only thing one knows is that some or all data will be changed.
It seems that the service CustomerManager separates the business logic and database stuff from the controller and prevents the business logic from being mixed with serialization stuff. But there are other problems (the list is not exhaustive):
- setData is a generic function to update the data of a customer, so it’s not clear what the intent was. Everything or nothing could be changed, and taking specific actions based on the data will be difficult and produce rather bad code.
- There seems to exist some sort of inheritance, which is probably the reason why an Identifiable $customer instead of a Customer $customer is declared in the parameters of the setData function. This seems to be the reason for the first condition concerning the passed argument.
- The service has 400 lines and is still growing. Fully understanding what this service is doing takes quite a bit of time.
- Instead of having an object, the array with the data from the request is passed on to the service which requires us to use those strings for the array keys in the service. Changing a property name somewhere will probably break the application here. Tests might uncover this issue but why take chances?
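As a rough illustration of the points above, here is a minimal sketch of such a generic setter in plain PHP. It is not the original service: apart from CustomerManager, setData, Identifiable and Customer, all names are hypothetical, and the instanceof check is only my guess at what a first condition on the passed argument could look like.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the entity hierarchy mentioned above.
interface Identifiable
{
    public function getId(): int;
}

final class Customer implements Identifiable
{
    public string $name = '';
    public string $email = '';

    public function __construct(private int $id)
    {
    }

    public function getId(): int
    {
        return $this->id;
    }
}

final class CustomerManager
{
    /**
     * Generic update: applies whatever known keys happen to be present.
     *
     * @param array<string, mixed> $data raw request data
     */
    public function setData(Identifiable $customer, array $data): Identifiable
    {
        // The broad parameter type forces a runtime type check.
        if (!$customer instanceof Customer) {
            throw new InvalidArgumentException('Expected a Customer.');
        }

        // String keys couple this method to the request format.
        if (isset($data['name'])) {
            $customer->name = (string) $data['name'];
        }
        if (isset($data['email'])) {
            $customer->email = (string) $data['email'];
        }

        return $customer;
    }
}

$manager = new CustomerManager();
$customer = new Customer(1);

// The misspelled key "emial" is silently ignored: no error, no update.
$manager->setData($customer, ['name' => 'Jane', 'emial' => 'jane@example.com']);
```

After the call the name is set but the email is still empty, and nothing in the code points at the typo.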
Another thing that is not visible in this code example at first glance (and is purely cosmetic) is that all controllers, all managers, all repositories etc. are each located in their own directory. But you will seldom be changing only controllers or only managers, so you will constantly be jumping from one end of the file tree to the other, which is a bit of a pain in the ass.
Also, it’s not clear from looking at this code that the request gets validated at all, nor which JSON schema files are used to validate it. That’s hidden somewhere deep in a configuration file.
Those are for sure not all things that could be improved, but I think we identified enough problems to get started and improve quite a few things. And as a wise man once told me:
There are no problems, just challenges.
As the title already suggests, the approach will be a pragmatic one. The focus is on providing the maximum value for developers and the customer. We try to stick to concepts as much as possible, but we won’t do it just for the sake of it. If following a concept brings no additional value or even makes things more complicated, we won’t do it.
- Intentions should be clear from the beginning.
- Following requests through the application should be like reading a short story.
- Everything should be typed.
- Things that belong together should be together but every unit should be as concise as possible.
- The documentation for the API should be generated from the existing code as far as possible.
- The code should be as easy as possible to comprehend, maintain, test and extend.
- As few extensions as possible should be used to keep maintenance and upgrades as painless as possible.
Let’s have a look at the basic architecture with the major building blocks again.
In our case we have two possible starting points which are either a request or a console command. Independently of where it started, a command or query should be created and put onto the message bus. From there the messages (which can be a command, a query or an event) will be distributed to their handlers. The handlers themselves interact with your domain in some way (e.g. domain services), but we are not going to look into this part of the application in this article.
Command and query handlers should not create other commands and queries directly, but they can create an event that is again put onto the message bus. An event handler can then decide what needs to be done in reaction to the event, which might be dispatching a second command, such as sending an email after user registration. There might also be multiple event handlers for one event.
The most important thing here is that the handlers do just the one thing they are supposed to do. For example, handling a single command should not send two emails. The reason is that if sending the second mail fails, processing the command will usually be retried and therefore the first mail will be sent again.
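The flow described above can be sketched with a toy, synchronous bus in plain PHP (8.1 or newer). This is not the Symfony Messenger API; the MessageBus class, its subscribe and dispatch methods and the three message classes are hypothetical and only show how a command leads to an event, and the event to a second command.

```php
<?php
declare(strict_types=1);

// Hypothetical messages: a command, the event it leads to, a follow-up command.
final class RegisterUser
{
    public function __construct(public readonly string $email)
    {
    }
}

final class UserRegistered
{
    public function __construct(public readonly string $email)
    {
    }
}

final class SendWelcomeMail
{
    public function __construct(public readonly string $email)
    {
    }
}

// Toy synchronous bus: maps a message class to its handlers.
final class MessageBus
{
    /** @var array<string, list<callable>> */
    private array $handlers = [];

    public function subscribe(string $messageClass, callable $handler): void
    {
        $this->handlers[$messageClass][] = $handler;
    }

    public function dispatch(object $message): void
    {
        foreach ($this->handlers[$message::class] ?? [] as $handler) {
            $handler($message);
        }
    }
}

$bus = new MessageBus();
$log = [];

// Command handler: does its one job, then announces what happened.
$bus->subscribe(RegisterUser::class, function (RegisterUser $command) use ($bus, &$log): void {
    $log[] = 'stored ' . $command->email;
    $bus->dispatch(new UserRegistered($command->email));
});

// Event handler: decides on the reaction and dispatches a second command.
$bus->subscribe(UserRegistered::class, function (UserRegistered $event) use ($bus): void {
    $bus->dispatch(new SendWelcomeMail($event->email));
});

// Second command handler: exactly one mail, nothing else.
$bus->subscribe(SendWelcomeMail::class, function (SendWelcomeMail $command) use (&$log): void {
    $log[] = 'welcome mail to ' . $command->email;
});

$bus->dispatch(new RegisterUser('jane@example.com'));
```

Because the mail is its own message with its own handler, a transport that retries failed messages would only have to repeat that one step instead of the whole registration.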
Let’s have a look at it in more detail and label the concepts used in the architecture above.
One of our goals is to capture as much as possible of the intent to make our code easier to understand and better in various other ways (e.g. no parsing of generic update requests). Following one request along in the code should be like reading a short story. There might be some side-stories (e.g. when X happened, Y should be done as well) but the main thing should be very easy to grasp and to follow along.
A good match for that is the CQRS pattern. Creating commands and queries for every request and keeping them as specific as possible will do just that. For every command and query we are going to have a specific handler that should do its one and only job - handling the command or query. It should not become a new “I-can-do-it-all-in-here” place.
In a real-world application we still might have a few generic (update) requests here and there but we can use the same structure as for every other specific request.
Bringing those queries, commands and also events together with their handlers is a perfect job for the Symfony Messenger component. But this component can do way more than just that. Several middlewares exist, for example for Doctrine transaction handling (no more manual flushing), and you can integrate your own as well. The component also makes it easy to do things asynchronously, and of course we want to process as much as possible asynchronously, e.g. sending emails.
As stated above, we want to get rid of untyped and generic data arrays. But creating objects manually from a request or a generic data array is a tedious job - at least if we have to do it for every controller and API. So why not automate this and have some logic that maps the requests to objects?
Ideally those objects would already be our commands and queries, and they would also get validated after creation. Symfony has a feature called argument value resolvers which enables us to accomplish just that in combination with the Symfony Validator and the Symfony Serializer components. No manual work is needed and it’s just Symfony doing a wonderful job with some small additions from us.
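To show what is being automated, here is a hand-written version of that mapping and validation step in plain PHP (8.1 or newer). It deliberately does not use the Symfony components, so that it stays self-contained; the ChangeCustomerEmail command and the createChangeCustomerEmail function are hypothetical names.

```php
<?php
declare(strict_types=1);

// Hypothetical command for one specific use case.
final class ChangeCustomerEmail
{
    public function __construct(
        public readonly int $customerId,
        public readonly string $email,
    ) {
    }
}

/**
 * Hand-written stand-in for the mapping and validation step.
 *
 * @param array<string, mixed> $routeParams values taken from the URL
 * @param string               $body        raw JSON request body
 */
function createChangeCustomerEmail(array $routeParams, string $body): ChangeCustomerEmail
{
    // Throws a JsonException on malformed JSON.
    $payload = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($payload) || !is_string($payload['email'] ?? null)) {
        throw new InvalidArgumentException('Field "email" must be a string.');
    }
    if (filter_var($payload['email'], FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('Field "email" is not a valid address.');
    }

    // From here on the rest of the application only sees a typed object.
    return new ChangeCustomerEmail((int) $routeParams['id'], $payload['email']);
}

$command = createChangeCustomerEmail(['id' => '1'], '{"email": "jane@example.com"}');
```

Writing this by hand for every endpoint is exactly the tedious part. The idea is that a resolver does it once, generically, and the controller simply receives the finished command as an argument.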
At this point we have everything to process requests except for a REST API. To my knowledge there are two approaches to make REST and CQRS work together nicely. Either the API is very unspecific (CRUD), and so are the queries and commands, or the API has to change in some places (less REST, more RESTish) and get more specific about what is happening.
So it’s not so much a technical as a conceptual thing that one has to get accustomed to when moving from one generic endpoint PUT /api/customers/1 to multiple endpoints, for example POST /api/orders/1/changeType etc. To bring the concepts together, it might help to see e.g. /activate not as an action but as a resource of activation commands to which a consumer can add new elements.
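A minimal sketch of that idea in plain PHP (8.1 or newer): every specific endpoint maps to exactly one command, so the intent is known as soon as the route matches. The commandFor function, both command classes and the full /api/users/{id}/activate path are assumptions made for this illustration; in a Symfony application the routing component would take care of the matching.

```php
<?php
declare(strict_types=1);

// Hypothetical commands, one per intent.
final class ActivateUser
{
    public function __construct(public readonly int $userId)
    {
    }
}

final class ChangeOrderType
{
    public function __construct(
        public readonly int $orderId,
        public readonly string $type,
    ) {
    }
}

/**
 * Toy route table: each specific endpoint yields exactly one command.
 *
 * @param array<string, mixed> $payload decoded request body
 */
function commandFor(string $method, string $path, array $payload): object
{
    if ($method === 'POST' && preg_match('#^/api/users/(\d+)/activate$#', $path, $m) === 1) {
        return new ActivateUser((int) $m[1]);
    }
    if ($method === 'POST' && preg_match('#^/api/orders/(\d+)/changeType$#', $path, $m) === 1) {
        return new ChangeOrderType((int) $m[1], (string) ($payload['type'] ?? ''));
    }

    throw new RuntimeException("No command for $method $path");
}

$command = commandFor('POST', '/api/orders/1/changeType', ['type' => 'express']);
```

Compared to a generic PUT, there is nothing left to guess: the class of the returned object is the intent.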
We want a solution that makes it possible to accomplish both (specific and unspecific APIs) depending on the use case and the consumers. If possible one should always be as specific as possible though.
Documenting is seldom a fun task and usually nobody is too keen to do it. In a perfect world it would be generated from the existing code. An additional benefit would be that it also produces a file to validate the frontend models against the backend models. By generating the documentation from our code we could ensure that the documentation stays up to date and is less error-prone.
Thankfully this is possible by using our command and query classes, the routing information and some response objects in combination with the NelmioApiDocBundle. The bundle will also take our validation annotations into account to generate the docs.
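The underlying idea, that typed classes already contain most of what the documentation needs, can be sketched with plain PHP reflection (8.1 or newer). This is not how the NelmioApiDocBundle is used or implemented; the CancelOrder class and the describeFields function are hypothetical and only show that field names, types and nullability can be read from the code.

```php
<?php
declare(strict_types=1);

// Hypothetical command class that doubles as the source of truth.
final class CancelOrder
{
    public function __construct(
        public readonly int $orderId,
        public readonly string $reason,
        public readonly ?string $comment = null,
    ) {
    }
}

/**
 * Derives a simple field description from the typed public properties.
 *
 * @param class-string $class
 * @return array<string, array{type: string, nullable: bool}>
 */
function describeFields(string $class): array
{
    $fields = [];

    foreach ((new ReflectionClass($class))->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $type = $property->getType();

        $fields[$property->getName()] = [
            // Union and intersection types are simplified to "mixed" here.
            'type' => $type instanceof ReflectionNamedType ? $type->getName() : 'mixed',
            'nullable' => $type === null || $type->allowsNull(),
        ];
    }

    return $fields;
}

echo json_encode(describeFields(CancelOrder::class), JSON_PRETTY_PRINT), PHP_EOL;
```

A description like this can be regenerated on every build, which is what keeps it from drifting away from the code.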
It seems like we have all the things we need and each is very well documented. But how can they be combined? Let’s take a look at this challenge in the next few articles.
In the next part, we will take the first step to make our API more expressive and prepare it to work together with the other building blocks.