Back in the day X was a great protocol that reflected the needs of the time.
Applications asked it to draw some lines and text.
It sent input events to applications.
People also wanted to customize how their windows were laid out more flexibly. So the window manager appeared. This would move all of your windows around for you and provide some global shortcuts for things.
Then graphics got more complicated. All of a sudden the simple drawing primitives of X weren’t sufficient. Other than lines, text and rectangles applications wanted gradients, rounded corners and to display rich graphics. So now instead of using all of these fancy drawing APIs they were just uploading big bitmaps to the X server. At this point 1/3 of what the X server was previously doing became obsolete.
Next people wanted fancy effects and transparency (like drop shadows). So window managers started compositing the display. This is great but now they need more control than just moving windows around on the display in case they are warped, rendered somewhere slightly differently or on a different workspace. So now all input events go first from X to the window manager, then back to X, then to the application. Also output needs to be processed by the window manager, so it is sent from the client to X, then to the window manager, then the composited output is sent to X. So another 1/3 of what X was doing became obsolete.
So now what is the X server doing:
Outputting the composited image to the display.
Receiving input from input devices.
Shuffling messages and graphics between the window manager and applications.
It turns out that 1 and 2 have got vastly simpler over the years, and can now basically be solved by a few libraries. 3 is just overhead (especially if you are trying to use X over a network because input and output need to make multiple round-trips each).
So 1 and 2 turned into libraries and 3 was just removed. Basically this made the X server disappear. Now the window manager just directly read input and displayed output usually using some common libraries.
Now removing the X server is a breaking change, so it was a great time to rethink a lot of decisions. Some of the highlights are:
Accessing other applications information (output and input capture) requires explicit permission. This is a key piece to sandboxing applications.
Organize the system around frames to avoid tearing except for when desired (X doesn’t really have the concept of a frame).
Remove lots of basically unused APIs like fonts, drawing and many others.
So the future is great. Simpler, faster, more secure and more extensible. However getting there takes time.
This was also slowed down by some people trying to resist some features that X had (such as applications being able to position themselves). And with a few examples like that it can be impossible to make a nice port of an application to Wayland. However over time these features are being added and these days most applications have good Wayland support.
Back in the day X was a great protocol that reflected the needs of the time.
People also wanted to customize how their windows were laid out more flexibly. So the window manager appeared. This would move all of your windows around for you and provide some global shortcuts for things.
Then graphics got more complicated. All of a sudden the simple drawing primitives of X weren’t sufficient. Other than lines, text and rectangles applications wanted gradients, rounded corners and to display rich graphics. So now instead of using all of these fancy drawing APIs they were just uploading big bitmaps to the X server. At this point 1/3 of what the X server was previously doing became obsolete.
Next people wanted fancy effects and transparency (like drop shadows). So window managers started compositing the display. This is great but now they need more control than just moving windows around on the display in case they are warped, rendered somewhere slightly differently or on a different workspace. So now all input events go first from X to the window manager, then back to X, then to the application. Also output needs to be processed by the window manager, so it is sent from the client to X, then to the window manager, then the composited output is sent to X. So another 1/3 of what X was doing became obsolete.
So now what is the X server doing:
It turns out that 1 and 2 have got vastly simpler over the years, and can now basically be solved by a few libraries. 3 is just overhead (especially if you are trying to use X over a network because input and output need to make multiple round-trips each).
So 1 and 2 turned into libraries and 3 was just removed. Basically this made the X server disappear. Now the window manager just directly read input and displayed output usually using some common libraries.
Now removing the X server is a breaking change, so it was a great time to rethink a lot of decisions. Some of the highlights are:
So the future is great. Simpler, faster, more secure and more extensible. However getting there takes time.
This was also slowed down by some people trying to resist some features that X had (such as applications being able to position themselves). And with a few examples like that it can be impossible to make a nice port of an application to Wayland. However over time these features are being added and these days most applications have good Wayland support.
Saving this for later – what a fantastic breakdown. Thanks for this!