TL;DR: Firefox used to have a great extension mechanism based on XUL and XPCOM. This mechanism served us well for a long time. However, it came at an ever-growing maintenance cost for both Firefox developers and add-on developers. On one side, this growing cost progressively killed any effort to make Firefox secure, fast or to try new things. On the other side, it progressively killed the community of add-on developers. Eventually, after spending years trying to protect this old add-on mechanism, Mozilla made the hard choice of removing it and replacing it with the less powerful but much more maintainable WebExtensions API. Thanks to this choice, Firefox developers can once again make the changes needed to improve security, stability and speed.

During the past few days, I’ve been chatting with Firefox users, trying to separate fact from rumor regarding the consequences of the August 2020 Mozilla layoffs. One of the topics that came back a few times was the removal of XUL-based add-ons during the move to Firefox Quantum. I was very surprised to see that, years after it happened, some community members still felt hurt by this choice.

And then, as someone pointed out on reddit, I realized that we still haven’t taken the time to explain in-depth why we had no choice but to remove XUL-based add-ons.

So, if you’re ready for a dive into some of the internals of add-ons and Gecko, I’d like to take this opportunity to try and give you a bit more detail.

Dismantling a clockwork scaffolding, as interpreted by MidJourney

About Promiscuous Add-Ons, JetPack and WebExtensions

For a very long time, Firefox was composed of a very small core on top of which everything was implemented as extensions. Many of these extensions were written in C++, others in JavaScript and many involved the XUL interface language and the XBL binding language. C++ and JavaScript code were connected thanks to a technology called XPCOM. Whenever an extension developer wished to customize Firefox, it was simple and extremely powerful, as the exact same building blocks used to power Firefox could be used to customize it.

This is how Session Restore (the technology that lets you resume Firefox where you left it the last time, even in case of crash) or the Find Bar were first implemented in Firefox, among other features. This is the technology that powers Firefox and Thunderbird. This is how tools such as Songbird (an open-source iTunes competitor) or Instantbird (a chat client) were developed. This is also how I customized Firefox to become an eBook reader a long time ago. And this is how thousands of Firefox add-ons were developed.

Many people call this extension mechanism “XUL-based Add-Ons”, or sometimes “XPCOM-based Add-Ons”, and I’ll use both terms in this blog entry, but I often think of this as the “Promiscuous Extension Mechanism”, for several reasons.

Note: Having read in comments that some users apparently do not care about security, let me add that being secure is a really, really important point for Mozilla and has been since the first day. Regardless of add-ons, not having security means that an exploit is eventually going to show up that will steal users’ passwords and use them to raid their bank accounts – and that exploit will get sold around and will soon show up everywhere. Firefox developers fight this threat daily by all sorts of means, including code reviews, defensive programming, crash scene investigations, several types of sandboxing, static analysis, memory-safe languages, … Consequently, for Mozilla, if a feature prevents us from achieving great security, we always pick security over features.

I’ll return to these points in more detail later. For the moment, suffice it to say that it had been clear to Firefox developers for a long time (at least since 2010) that this situation was untenable. So Mozilla came up with a backup plan called the Firefox Jetpack.

Firefox Jetpack was a very different manner of extending Firefox. It was much cleaner. It finally had a permissions mechanism (something that had been suggested even before Firefox was called Firefox and that was generally considered too hard to implement). Out of the box, add-ons could not break each other or Firefox (I seem to remember that it was still sometimes possible by exploiting the observer service, but you had to work hard at it). It made extensive use of async programming (which was great to achieve a feeling of high performance). And thanks to the fact that it had a finite API, it could be tested, which meant that when Firefox developers broke add-ons, they knew about it immediately and could fix the breakage! These were several enormous steps forward. They came at the cost of a more limited API but in most cases, the tradeoff seemed worth it.

Unfortunately, it turned out that there was an unexpected incompatibility between the design of Jetpack and some of the major changes that were needed in Firefox. I’m not entirely clear about what this incompatibility was but this meant that we had to abandon Jetpack. Instead, we introduced WebExtensions. Overall, WebExtensions had similar objectives to Jetpack-based add-ons, with a similarly restricted API and the added bonus that they could be made to work on both Chromium-based browsers and Firefox.

If you needed very advanced APIs, switching from the promiscuous extension mechanism to Jetpack or WebExtensions was not always possible, but for most extensions, the transition was simple – in my personal experience, it was even pleasant.

Firefox introduced WebExtensions in time for Firefox Quantum because this is when the promiscuous add-on model was scheduled to break.

At this stage, we’re done with the historical overview. I hope you’re ready for a more technical dive because that’s how I’m going to explain to you exactly which problems were solved as we switched from the promiscuous extension model to WebExtensions.

Let’s talk XPCOM!

How it started

XPCOM, the Cross-Platform Component Object Model, is perhaps the feature of Firefox that can best be described as the core (for people who know Gecko in-depth, I’m counting XPConnect and the Cycle Collector as part of XPCOM), alongside SpiderMonkey, our JavaScript Virtual Machine.

XPCOM is a technology that lets you write code in two languages and have each call the other. The code of Firefox is full of C++ calling JavaScript, JavaScript calling C++ and, a long time ago, we had projects that added Python and .Net to the mix. This piece of machinery is extremely complicated because languages do not share the same definitions (what’s a 64-bit integer in JavaScript? what’s a JavaScript exception in C++?) or the same memory model (how do you handle a JavaScript object holding a reference to a C++ object that C++ might wish to delete from memory?) or the same concurrency model (JavaScript workers share nothing while C++ threads share everything).

Gecko itself was originally designed as thousands of XPCOM components that could each be implemented in C++ or in JavaScript, tested individually, plugged, unplugged or replaced dynamically and it worked. In addition, the XPCOM architecture made for much cleaner C++ programming than was available at the time, worked on dozens of platforms, and let us combine the convenience of writing code in JavaScript and the raw speed permitted by C++.

To write an XPCOM component, you typically define an interface, then write the implementation in either C++ or JavaScript (or Rust, nowadays, and maybe soon Wasm). Some boilerplate is needed, but hey, it works.
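To give you an idea of the contract involved, here is a deliberately simplified sketch in plain JavaScript – not real Gecko code, and `nsIGreeter` is a made-up interface – of the two halves of an XPCOM component: an interface describing the methods, and an implementation that advertises which interfaces it supports via `QueryInterface`.

```javascript
// Hypothetical interface definition, standing in for an .idl file.
const nsIGreeter = { name: "nsIGreeter", methods: ["greet"] };

// A JavaScript implementation of that interface.
const greeterComponent = {
  greet(who) {
    return `Hello, ${who}!`;
  },
  // QueryInterface is how callers (C++ or JS) ask a component
  // whether it supports a given interface.
  QueryInterface(iface) {
    if (iface.methods.every((m) => typeof this[m] === "function")) {
      return this;
    }
    throw new Error(`Component does not implement ${iface.name}`);
  },
};

const greeter = greeterComponent.QueryInterface(nsIGreeter);
console.log(greeter.greet("world")); // "Hello, world!"
```

In real XPCOM, the interface lives in an XPIDL file and the glue is generated for you, but the shape is the same: callers only ever see the interface, never the implementation.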

When early Firefox developers decided to open the platform to extensions, XPCOM was immediately picked as the base technology for add-ons. Firefox just had to let add-on authors plug anywhere within the code and they would have tremendous power at their disposal.

And add-on developers (including myself) certainly did and had lots of fun with it!

…the era of immutable XPCOM

Unfortunately, problems progressively started to creep up.

When you’re developing a large application, you need to change things: to fix bugs, to add new features, or to improve performance. In the XPCOM world, this meant changing XPCOM components. Sometimes to add new features to a component. Sometimes to remove one entirely because its design had been replaced with a better one.

In the first era of the XPCOM-based extension mechanism, this was often forbidden. If an XPCOM component was used by add-ons, it simply could not be changed in incompatible ways. This was great for add-on developers but it quickly became a nightmare for Firefox developers, because every single change had to be made in a backwards-compatible way, both externally (for web developers) and internally (for add-on developers). This meant that each XPCOM component nsISomething was quickly accompanied by a nsISomething2, which was the better component – and both needed to be made to work alongside each other. One case was even more complicated to handle: XPCOM-based add-ons could replace any existing XPCOM component. Needless to say, this was a very good way to break Firefox in ways that puzzled Firefox crash investigators.

This meant that development became slower and slower, as we needed to check each new feature or improvement not only against current features, but also against past/deprecated features or simply old ways of working that had been obsolete for years. For a time, this development tax was acceptable. After all, the main competitor of Firefox was Internet Explorer, which had even worse architectural issues, and there was an apparently unlimited number of open-source contributors helping out. Also, the feature set of the web was much smaller, so it was still possible.

…the era of keeping add-on developers in every loop

However, as the web grew, it became apparent that these choices made it simply impossible to fix some issues, in particular performance issues. For instance, at some point around 2008, Firefox developers realized that the platform had simply too many XPCOM components and that this hurt performance considerably because XPCOM components prevented both the JIT and the C++ compiler from optimizing code and required too many conversions of data. Thus started the deCOMtamination, which was about rewriting performance-critical sections of the code without XPCOM components. Which meant breaking add-ons.

In this second era of the XPCOM-based extension mechanism, Firefox developers were allowed to remove or refactor XPCOM components, provided they got in touch with the add-on developers and worked with them on how to fix their add-ons. This unblocked development, but the development tax had somehow managed to grow even higher, as it sometimes meant weeks of brainstorming with external developers before we could land even simple improvements. Thus also began a heavy add-on maintenance tax, as some add-on developers needed to rework their add-ons time after time after time. In some cases, the Firefox developers and the add-on developers enjoyed a very good relationship, which sometimes led to add-on developers designing APIs used within Firefox. In other cases, the add-on developers grew tired of this maintenance burden and gave up, sometimes switching to the nascent Chrome ecosystem.

…the era of Snappy

Around this time, Mozilla started paying serious attention to Chrome. Chrome had started with very different design guidelines than Firefox.

It had been clear to Mozilla for years that Firefox needed to switch to a multi-process design. In fact, there were demos of multiprocess Firefox circulating roughly when Chrome 1.0 was unveiled. The project was called Electrolysis, or e10s for short. We’ll come back to it.

At the time, Mozilla decided to pause e10s, which was bound to use much more memory than what many of our users had, and concentrate on a new project called Snappy (disclosure: I was one of the developers of Project Snappy). Snappy was about using the same design tricks as Chrome to make Firefox feel faster, hopefully without having to refactor everything.

The reason for which Firefox felt slower than Chrome is that we were doing pretty much everything on a single thread. When Firefox was writing a file to disk, this blocked visual refreshes, so the number of frames per second dropped. When Firefox was collecting the list of tabs to save them in case of crash, this blocked refreshes, with the same result. When Firefox was cleaning up cookies, this blocked refreshes, etc.

There were two solutions to this, which we both used:

  1. whenever possible, instead of executing code on the main thread, we moved it to another thread;
  2. whenever that was impossible, we had to split the work into small chunks that we could somehow guarantee would execute within a few milliseconds, then manually interleave these chunks with the rest of the execution of the main thread.

Both solutions helped us ensure that we did not drop frames and that we could immediately respond when the user clicked. This meant that the user interface felt fast. The former had the advantage that it benefited from the help of the operating system, while the latter required a considerable amount of measurement and fine-tuning. Both solutions were tricky to pull off, because we were suddenly faced with concurrency issues, which are notoriously hard to debug. Both solutions were also hard on add-on developers, as they required changing entire features from synchronous to asynchronous, which often meant that the entire add-on needed to be rewritten from scratch.

For me this is a trip down memory lane, as that’s when Irakli, Paolo and myself (am I forgetting someone?) introduced Promise and what is now known as async functions in the Firefox codebase. These experiments (which were in no way the only experiments around the topic) served as a testbed for later introducing these features to the web. More immediately, they made writing asynchronous code much easier, both for Firefox developers and add-on developers. Despite these improvements (and others that haven’t quite made it into standards), writing and debugging asynchronous code remained very complicated.
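To show the kind of shift this enabled, here is a toy before/after (the session-reading functions are invented for illustration, and the asynchronous work is simulated with a timer): the same operation written callback-style, then as a Promise consumable with `await`, where errors propagate through ordinary `try`/`catch`.

```javascript
// Before: callback-style asynchronous code, with success and error
// paths threaded by hand through every caller.
function readSessionLegacy(onSuccess, onError) {
  setTimeout(() => onSuccess({ tabs: 3 }), 0); // simulated async read
}

// After: the same operation as a Promise...
function readSession() {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ tabs: 3 }), 0);
  });
}

// ...which lets callers read almost like synchronous code.
async function restoreSession() {
  const session = await readSession();
  return `Restoring ${session.tabs} tabs`;
}
```

Multiply the first style by every file read, preference lookup and message round-trip in an add-on, and you can see why the second style mattered so much.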

So, once again, there was a new maintenance tax for add-on developers, one that quickly became really complicated. In terms of extension power, the situation was even worse when we moved code off the main thread, because it was typically moved to C++ threads, which are entirely separate from JavaScript threads. XPCOM components were vanishing here and there, losing extensibility power for add-on developers.

I believe that it is at that stage that many add-on developers started seriously complaining about this maintenance tax – or more often just stopped updating their add-ons. And they were right. As an add-on developer, by then, I had long given up on maintaining my add-ons, it was just too damn time-consuming. The maintenance tax was burning out our add-on developer community.

And Mozilla got serious about finding a new way to write add-ons that would considerably decrease the tax on both Firefox developers and add-on developers. At the time, the solution was Jetpack and it was pretty great! For a while, two extension mechanisms coexisted: the cleaner Jetpack and the older promiscuous model.

…the era of Electrolysis/Quantum

Snappy got us pretty far but it could never solve all the woes of Firefox.

As mentioned above, Firefox developers had known for a long time that we would eventually need to move to a multi-process model. This was better for safety and security, this made it easier to ensure that frames kept updating smoothly and this felt natural. By then, Mozilla developers had been experimenting for a while with multi-process Firefox but two things had prevented Mozilla from moving forward with multi-process Firefox (aka Electrolysis or e10s):

  1. the fact that having multiple processes required considerable amounts of RAM;
  2. the fact that having multiple processes required rewriting pretty much every single add-on – and that some could never be ported at all.

As RAM got cheaper and as we optimized memory usage (Project Memshrink), problem 1. progressively stopped being a blocker. Problem 2, on the other hand, could not be solved.

Let us consider the simple case of an add-on that somehow needs to interact with the content of a page. For instance, an add-on designed to increase contrast. Prior to e10s, this was some JavaScript code that lived in the main window of Firefox and could directly manipulate the DOM of individual pages. This was quite simple to write. With e10s, this add-on needed to be rewritten to work across processes. The parent process could only communicate with child processes by exchanging messages, and child processes could be stopped at any moment, including while processing a message, either by the operating system (in case of crash) or by the end user (by closing a tab). This add-on could be ported to e10s because the processes that dealt with the contents of web pages also ran JavaScript and because the e10s team had exposed APIs for add-ons to send and receive messages and to load themselves within these content processes.
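Here is a toy model of that constraint – not the real Firefox message-manager API, just a minimal simulation with an in-memory channel and a fake document: the parent side can no longer touch the page’s DOM directly, so it sends a named message, and the content side does the actual work when the message arrives.

```javascript
// A tiny asynchronous message channel standing in for the real
// cross-process messaging machinery.
function makeChannel() {
  const listeners = new Map();
  return {
    addMessageListener(name, fn) {
      listeners.set(name, fn);
    },
    sendAsyncMessage(name, data) {
      // Deliver asynchronously, like a real cross-process message.
      setTimeout(() => {
        const fn = listeners.get(name);
        if (fn) fn({ name, data });
      }, 0);
    },
  };
}

const channel = makeChannel();

// Content-process side: owns the (simulated) page DOM.
const fakeDocument = { style: {} };
channel.addMessageListener("increase-contrast", (msg) => {
  fakeDocument.style.filter = `contrast(${msg.data.level})`;
});

// Parent-process side: can only ask, never reach into the page.
channel.sendAsyncMessage("increase-contrast", { level: 1.5 });
```

Everything that used to be a direct function call became a message like this, with the added burden that the other side might have died before answering.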

However, not all processes could afford to play so nicely with add-ons. For instance, processes dedicated to protecting Firefox from the notorious crashes of the Flash plug-in didn’t have a JavaScript virtual machine. Processes dedicated to interacting with the GPU didn’t have a JavaScript virtual machine. All of this for (good) reasons of performance and safety – and the fact that adding a JavaScript virtual machine to these processes would have made them much more complex.

And yet, there was no choice for Mozilla. Every day that Mozilla didn’t ship Electrolysis was one day that Chrome was simply better in terms of architecture, safety and security. Unfortunately, Mozilla delayed Electrolysis by years, in part because of the illusion that Firefox’s benchmark results were good enough that it wouldn’t matter for a long time, in part because we decided to test-drive Electrolysis with FirefoxOS before Firefox itself, but mostly because we did not want to lose all these add-ons.

When Mozilla finally committed to the transition towards Electrolysis, some add-on developers ported their add-ons – devoting considerable time to that task – but many didn’t, either because it was impossible for their add-on or, more often, because we had lost them to the add-on maintenance tax. And unfortunately, even the add-ons that had been ported to Electrolysis broke, one by one, for all the other reasons mentioned in this article.

In the end, Mozilla decided to introduce WebExtensions and finally make the jump towards e10s as part of the Quantum Project. We had lost years of development that we would never recover.

Performance was improving. Add-ons needed to be rewritten. Add-on power had irredeemably decreased. The XPCOM-based extension mechanism was mostly dropped (we still use it internally).

…the future is Rust, Wasm, Fission…

Today, we live in a post-Quantum era. Whenever possible, new features are implemented in Rust. I don’t remember whether we have shipped features implemented in Wasm yet, but that is definitely in the roadmap.

Out of the box, Rust doesn’t play nice with XPCOM. Rust has a different mechanism for interacting with other languages, which works very well, but it doesn’t speak XPCOM natively. Writing or using XPCOM code in Rust is possible and is progressively reaching the stage at which it works without too much effort, but for most of the existence of Rust code in Firefox, it was simply too complicated to do. Consequently, even if we had left the XPCOM-based extension mechanism available, most of the features implemented in Rust would simply not have been accessible to add-ons unless we had explicitly published an API for them – possibly as a WebExtension API.

While I’ve not been closely involved with Wasm-in-Gecko, I believe that the story will play out similarly. No XPCOM at first. Later, progressively some kind of XPCOM support, but only upon explicit decision to do so. Possibly exposed as a WebExtension API, later.

Also, after Project Electrolysis comes Project Fission. This is another major refactoring of Firefox that goes much further in the direction first taken by Electrolysis by further splitting Firefox into processes to increase safety and security and hopefully later performance. While it doesn’t directly affect XPCOM, it does mean that any add-on using the XPCOM-based mechanism that had been ported to Project Electrolysis would need to be entirely rewritten yet again.

All these reasons confirm that the choice to get rid of XPCOM-based add-ons could, at best, have been delayed to prolong the agony of the technology, but that the conclusion was foregone.

The problems with XUL

But wait, that’s not all! So far, we have only mentioned XPCOM, but XPCOM only represented half of the technology behind Firefox add-ons.

How it started

XUL, the XML User interface Language, was one of the revolutions pioneered by Firefox: for the first time, a declarative language for user interfaces that just worked. Imagine placing buttons, styling them with CSS, connecting them with a little bit of JavaScript, adding a little metadata to achieve accessibility, localization, stored preferences… There was also a great mechanism called XUL overlays, which let you easily plug components into existing interfaces, for instance to extend a menu or add a new button, without having to change the other document.
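The core trick of overlays was merging by element id. Here is a simplified simulation of that idea in plain JavaScript – not actual XUL, just documents modeled as arrays of nodes – showing how an overlay contributes children to whichever element of the base document shares its id:

```javascript
// Merge an overlay into a base document: nodes with a matching id
// contribute their children to the existing element; others are added
// as new top-level elements.
function applyOverlay(baseDoc, overlay) {
  for (const node of overlay) {
    const target = baseDoc.find((n) => n.id === node.id);
    if (target) {
      target.children.push(...node.children); // merge into existing element
    } else {
      baseDoc.push(node);
    }
  }
  return baseDoc;
}

// Base document: a menu with one item.
const base = [{ id: "tools-menu", children: ["Options"] }];
// An add-on's overlay extends that same menu without touching its source.
const overlay = [{ id: "tools-menu", children: ["My Add-on…"] }];
applyOverlay(base, overlay);
```

This is what made extending Firefox so pleasant – and, as we’ll see below, also what made it so easy for add-ons to step on each other.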

During most of the existence of Mozilla, this is how the user interface of Firefox was written – and of course how the user interface of add-ons was written. And it worked like a charm. As an add-on developer who had gotten fed up with writing user interfaces in Gtk, Win32 and Swing, for the first time I could develop a UX that looked way better than anything I had ever achieved with any of these toolkits. It only took me a few seconds instead of hours, I could simply reload to see what it looked like instead of needing to rebuild the app first and, unlike with Gtk and Win32, I didn’t have to fear crashes because my code was written in a memory-safe language.

If you’re thinking that sounds very much like HTML5 (and possibly Electron) with the right libraries/frameworks, you are entirely right. XUL was developed at the time of HTML4, when web specifications were stuck in limbo, and was designed largely as a successor of HTML dedicated to applications instead of documents. Almost twenty years ago, Mozilla released a first version of XULRunner, which was basically an earlier version of Electron using XUL instead of HTML (HTML could also be inserted within XUL).

… a few problems of XUL

If you were writing XUL-based add-ons, you quickly realized that preventing them from breaking stuff was… complicated. Several add-ons could modify the same part of the user interface, with odd results. Several add-ons could accidentally inject JavaScript functions with the same name, or with the same name as an existing function, causing all sorts of breakage.
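The name-collision problem is easy to demonstrate. In this minimal illustration (the `sharedScope` object and `updateToolbar` name are invented), two add-ons loaded into the same shared scope both define a helper with the same name, and the second silently clobbers the first:

```javascript
// All XUL-based add-ons in a window effectively shared one scope.
const sharedScope = {};

// Add-on A installs its helper...
sharedScope.updateToolbar = () => "A's toolbar";

// ...then add-on B, loaded later, overwrites it without any warning.
sharedScope.updateToolbar = () => "B's toolbar";

// Add-on A now calls what it believes is its own function
// and gets B's behavior instead.
console.log(sharedScope.updateToolbar()); // "B's toolbar"
```

Neither add-on is malicious here; the bug only appears when both are installed, which made these breakages maddening to diagnose from user reports.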

Similarly, writing a malicious add-on was very simple. You could easily log all the keys pressed by the user, catching all their passwords, but why would you take the trouble when you could simply patch the password manager to send all the passwords to a distant website? And, to be clear, that’s not science-fiction. While I’m not aware of any extension that went quite that far, I know of installers for otherwise unrelated products (at least two of them market leaders that shall remain unnamed in this post) that silently installed invisible Firefox add-ons that took control of key features and compromised privacy.

Proposals had been made to improve security, but it was very clear that, as long as we kept XUL (and XPCOM), there was nothing we could do to prevent a malicious add-on from doing anything it wanted.

… the era of HTML5

Once the work on HTML5 started, Mozilla enthusiastically contributed the features of XUL. Progressively, HTML5 got storage, editable content, drag & drop, components (which were known as XBL in XUL), history manipulation, audio, cryptography… as well as sufficient support to let client libraries implement accessibility and internationalization.

While HTML5 still doesn’t quite have all the features of XUL, it eventually reached a stage at which many of the features that had been implemented in Gecko to support XUL needed to be also implemented in Gecko to support HTML5. As was the case with XPCOM, this translated into a development tax as every new feature in HTML5 needed to be implemented in such a way that it didn’t break any of the features in XUL, even if add-on developers somehow attempted to combine them.

This considerably slowed down the development of Gecko and increased the number of bugs that accidentally killed XUL-based add-ons, hence increasing add-on developers’ maintenance tax.

By then, Mozilla had long decided to stop improving XUL and rather concentrate on HTML5. As the performance of HTML5 got better than that of XUL for many tasks, Firefox developers started porting components from XUL to HTML5. The result was nicer (because the design was new) and faster (because Gecko was optimized for HTML5), but extending it from XUL-based add-ons became more complicated. Also, it became easier to hire and train Firefox front-end developers because suddenly, they could use their HTML5 knowledge and their HTML5 frameworks within Firefox.

It is roughly at that time that Jetpack first entered the picture. Jetpack made it possible for add-on authors to write their extensions with HTML5 instead of XUL. This was better for most add-on authors, as they could use their HTML5 knowledge and frameworks. Of course, since HTML5 was missing some features of XUL, this didn’t work for everyone.

… the push towards Servo

In parallel, work had started at Mozilla on the Servo rendering engine. While the engine was not feature-complete (and still isn’t to this day), extremely cool demos had demonstrated how Servo (or at least pieces of Servo) could one day replace Gecko (or at least pieces of Gecko) with code that was both easier to read and modify, safer and much faster.

There was a catch, of course: the Servo team didn’t have the resources to also reimplement XUL, especially since Mozilla had decided long ago to stop working on this technology. In order to be able to eventually replace Gecko (or parts thereof) with Servo, Mozilla first needed to migrate the user interface of Firefox to HTML5.

Within Firefox, this continued the virtuous circle: decreasing the amount of XUL meant that Gecko could be simplified, which decreased the Gecko development tax and let Gecko developers improve the speed of everything HTML5. It also meant that the user interface and add-on developers could use technologies that were better optimized and existing web libraries/frameworks.

Again, it also meant that using XUL for user interfaces made less and less sense and it decreased the number of things that XUL-based add-ons could hope to achieve.

… the era of Electrolysis/Quantum/Fission

And then came Project Quantum. As discussed above, every day that Mozilla spent without shipping Electrolysis was one day that Firefox spent being worse than Chrome in terms of security and crashes.

When XUL had been designed, multi-threading was still considered a research topic, Linux and System 7 (the ancestor of macOS) didn’t even have proper multithreading, and few people outside of academia seriously considered that any user-facing application would ever need any kind of concurrency. So XUL was designed for a single process and a single thread.

When the time came to implement Electrolysis for Firefox, many of the features of XUL proved unusable. It is possible that a new version of XUL could have been designed to better support this multi-process paradigm, but this would have increased the development tax once again, without decreasing add-on developers’ maintenance tax, as they would have needed to port their add-ons to a new XUL regardless. So the choice was made that features that needed to be rewritten during the push for Electrolysis would be rewritten in HTML5.

After Project Electrolysis comes Project Fission. Looking at how we need to refactor Firefox code for Fission, it seems very likely that Project Fission would have required a XUL3 or at the very least, broken all add-ons yet again.

While XUL would not be entirely removed from Firefox or Gecko for several more years, the era of XUL was officially over. Again, the choice to prolong the XUL-based extension mechanism could have been altered to prolong the agony but the conclusion was long foregone.

What now?

Well, for Firefox add-on developers, the present and the foreseeable future is called WebExtensions.

By design, WebExtensions is more limited than the promiscuous extension mechanism. By design, it also works better. Most of the Firefox development tax has disappeared, as only the WebExtensions API needs to be protected, rather than the entire code of Firefox. Most of the maintenance tax has disappeared, as the WebExtensions APIs are stable (there have unfortunately been a few exceptions). It is also much simpler to use, lets add-on developers share code between Firefox and Chromium add-ons and should eventually make it easier to write extensions that work flawlessly on Desktop and Mobile.
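To make the contrast concrete, here is what the entire skeleton of a WebExtension looks like: a declarative manifest listing the permissions the add-on requests and the scripts it injects (the add-on name and the `boost-contrast.js` file name are invented for this example; the manifest keys themselves are the standard documented ones).

```json
{
  "manifest_version": 2,
  "name": "Contrast Booster",
  "version": "1.0",
  "description": "Example add-on: raises the contrast of every page.",
  "permissions": ["<all_urls>"],
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["boost-contrast.js"]
    }
  ]
}
```

The `boost-contrast.js` content script would then just manipulate the page it is injected into, e.g. by setting a CSS contrast filter on the document. Note how everything the add-on can do is visible up front: that is precisely the permissions mechanism that the promiscuous model could never have.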

The main drawback is, of course, that WebExtensions do not give add-on developers as much power as the promiscuous extension mechanism. I hope that all the power that add-on developers need can eventually be added to WebExtensions API but that’s something that takes engineering power, a critical resource that we’re sorely lacking.

While I realize that all the actions outlined above have had a cost for add-on developers and add-on users, it is my hope that I have convinced you that these choices made by Mozilla were necessary to preserve Firefox and keep it in the race against Chrome.

Edits