The OECD Data Portal
The Organisation for Economic Co-operation and Development (OECD) is an intergovernmental body that collects data and publishes studies on behalf of its member states. Its fields of work include, among others, the economy, environmental issues, well-being and education.
The OECD Data Portal is the central hub for this statistical data. It helps researchers, journalists and policymakers find meaningful data and visualize it quickly with different chart types. It connects with the OECD iLibrary, which hosts the publications, and OECD.Stat, which stores the full datasets.
The OECD is funded by its member states, and thus ultimately by taxpayers like you and me. Using cost-effective, sustainable technologies was one of the requirements so that the code can be maintained in the long term.
The Data Portal is a joint work by OECD staff as well as external developers and designers. The initial design and prototyping came from Moritz Stefaner and Raureif. 9elements developed the production front-end code and still maintains it.
We started working on the Data Portal in 2014. Since then, there has been no “big rewrite”, only new features, gradual improvements and refactoring. Recently, for the OECD Economic Outlook in December 2020, we added several features, including four new chart types. We also refactored the codebase significantly.
In this article, I am going to describe how we maintained the code for so long and how we improved it step by step. I will also point out things that did not work well.
Boring mainstream technology
Back in 2014, these technologies were the safest, most compatible choices available. Only CoffeeScript was something of a venture: we chose it because it made us more productive and helped us write reliable code, but we were aware that it posed a liability.
The ravages of time
Without doubt, jQuery has lost its dominance for non-trivial DOM scripting tasks. As mentioned, we would choose React or Preact for a project like the Data Portal today.
The second library, D3, remains the industry standard for data visualization in the browser. It has existed since 2010 and still leads the field. While several major releases have changed its structure and API significantly, it remains an outstanding work of engineering.
The Backbone library is not as popular, but has other qualities. Backbone is a relatively simple library. You can read the source code in one morning and could re-implement the core parts yourself in one afternoon. Backbone is still maintained, but more importantly it is feature-complete.
From today’s perspective, only CoffeeScript poses a significant technical debt. CoffeeScript was developed because of blatant deficits in ECMAScript 5. Later, many of its ideas were incorporated into the ECMAScript 6 (2015) and ECMAScript 7 (2016) standards. Since then, there has been no compelling reason to use CoffeeScript.
From CoffeeScript to TypeScript
Using the decaffeinate tool, we converted the CoffeeScript code to ECMAScript 6 (2015). We still wanted to support the same browsers, so we now use the Babel compiler to produce backwards-compatible ECMAScript 5.
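To illustrate the kind of transformation involved, here is a hypothetical example (not from the Data Portal codebase) of what decaffeinate produces: CoffeeScript classes become ES6 classes, and CoffeeScript's shorthand operators become explicit JavaScript.

```javascript
// Hypothetical CoffeeScript input:
//   class ChartView
//     constructor: (@model) ->
//     title: -> @model.name ? 'Untitled'
//
// Decaffeinated ES6 output (lightly cleaned up by hand):
class ChartView {
  constructor(model) {
    this.model = model;
  }

  // CoffeeScript's existential operator `?` becomes an explicit null check
  title() {
    return this.model.name != null ? this.model.name : 'Untitled';
  }
}

console.log(new ChartView({ name: 'GDP' }).title()); // → 'GDP'
console.log(new ChartView({}).title()); // → 'Untitled'
```

The generated code is usually correct but not always idiomatic, which is why a manual cleanup pass after the automated conversion pays off.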
All in all, this migration went smoothly. But we did not want to stop there.
For the Data Portal, we wanted to have the development benefits of TypeScript without converting the codebase to fully typed TypeScript.
This way, the development experience in Visual Studio Code improved greatly with little effort. While there is no strict type checking, code editing feels as good as in an average TypeScript project.
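In practice, this works by letting the TypeScript compiler check plain JavaScript files via JSDoc annotations (the `checkJs` compiler option or a per-file `// @ts-check` comment). A minimal sketch, with a made-up function:

```javascript
// @ts-check
// With `checkJs` (or this @ts-check comment), the TypeScript compiler
// reads JSDoc annotations in plain JavaScript and reports type errors.

/**
 * Formats an observation value with its display unit.
 * (Hypothetical helper, for illustration only.)
 * @param {number} value - the raw observation value
 * @param {string} unit - the display unit, e.g. '%'
 * @returns {string}
 */
function formatValue(value, unit) {
  return `${value.toFixed(1)} ${unit}`;
}

console.log(formatValue(3.14159, '%')); // → '3.1 %'
// formatValue('oops', '%') would be flagged by the editor/compiler.
```

The file stays ordinary JavaScript and runs unchanged; only the tooling gets smarter.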
By combining a rather boring but rock-solid technology with the latest TypeScript compiler, we could add new features and refactor the code safely and easily.
Documentation and code comments
On the surface, coding is a conversation between you and the computer: You tell the computer what it should do.
But more importantly, coding is a conversation between you and the reader of the code. It is a well-known fact that code is written once but read again and again. First and foremost, the reader is your future self.
The Data Portal codebase contains many comments, and almost all of them proved valuable during the last six years. Obviously, code should be structured to help human readers understand it. But I do not believe in “self-descriptive” or “self-documenting” code.
Before we switched to JSDoc, we had human-readable type annotations that documented function parameters and return values. We also documented the main data types, including complex nested object structures.
These human-readable comments proved really helpful six years later: we translated them into machine-readable JSDoc annotations and type declarations for the TypeScript compiler.
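As a sketch of what such a translation looks like (the data shape here is invented, not the Data Portal's actual types), a prose comment describing a nested structure becomes a JSDoc `@typedef` the compiler can check:

```javascript
// Before: a human-readable comment describing a nested structure, e.g.
//   "A series has an id, a label, and a list of [year, value] points."
//
// After: the same information as a machine-readable JSDoc typedef.
// (Hypothetical shape, for illustration only.)

/**
 * @typedef {Object} Series
 * @property {string} id
 * @property {string} label
 * @property {Array<[number, number]>} points - [year, value] pairs
 */

/** @type {Series} */
const gdpGrowth = {
  id: 'gdp-growth',
  label: 'GDP growth',
  points: [[2019, 1.6], [2020, -4.5]],
};

console.log(gdpGrowth.points.length); // → 2
```

The prose comment carried the knowledge across six years; the typedef now lets the compiler enforce it.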
Things will break – have a test suite
The project has only a few automated unit tests, but more than 50 test pages that demonstrate all Data Portal pages, components, data query interfaces, chart types and chart configuration options. They test against live or staging data, but also against fabricated data.
These test pages serve the same purpose as automated tests: If we fix a bug, we add the scenario to the corresponding test page first. If we develop a new feature, we create a comprehensive test page simultaneously.
Before a release, we check all test pages manually and compare them to the last release – both visually and functionally. This is time-consuming, but it lets us find regressions quickly.
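The idea can be sketched in a few lines (all names here are hypothetical, not the Data Portal's actual test-page code): each test page is essentially a registry of scenarios, each pairing a chart configuration with fabricated or live data, which a human reviewer renders and compares against the previous release.

```javascript
// Minimal sketch of a test page: a registry of named scenarios.
const scenarios = [];

function addScenario(name, config) {
  // A real test page would render the chart here; we only register it.
  scenarios.push({ name, config });
}

// Fabricated data covers edge cases live data may not exhibit.
addScenario('bar chart, single series', {
  type: 'bar',
  data: [{ label: 'A', value: 1 }, { label: 'B', value: 2 }],
});
addScenario('bar chart, empty data', { type: 'bar', data: [] });

console.log(scenarios.length); // → 2
```

When a bug is fixed, the failing scenario is added first, so the page doubles as a regression checklist.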
I don’t think an automated test suite would serve us better. It is almost impossible to test interactive data visualizations in the browser in an automated way. Visual regression testing is a valuable tool in general, but would produce too many false positives in our case.
Backward and forward compatibility
In 2014, the Data Portal had to work with Internet Explorer 9. Today, Internet Explorer is largely irrelevant when developing a dynamic, in-browser charting engine.
Still, we decided to keep compatibility with older browsers. The Data Portal is an international platform, so users visit from all over the world, and not all of them have the latest hardware and newest browsers.
Your abstractions will bite you
The technology stack was not the limit we faced in this project over the years. Rather, it was the abstractions we created ourselves that got in our way.
The idea of Backbone’s model-view separation is to have the model as the single source of truth. The DOM should merely reflect the model data. All changes should originate from the model. Modern frameworks like React, Vue and Angular enforce the convention that the UI is “a function of the state”, meaning the UI is derived from the state deterministically.
We violated this principle and sometimes made the DOM the source of truth. This led to confusion with code that treated the model as authoritative.
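The principle itself is simple, and can be sketched without Backbone (this is a stand-in for Backbone's `Model`/`View` event wiring, not its actual API): the model holds the data and emits change events, and views only re-render in response, never storing state of their own.

```javascript
// Backbone-style sketch of the model as the single source of truth.
class Model {
  constructor() {
    this.attributes = {};
    this.listeners = [];
  }
  set(key, value) {
    this.attributes[key] = value;
    this.listeners.forEach((fn) => fn()); // notify subscribed views
  }
  get(key) {
    return this.attributes[key];
  }
  on(fn) {
    this.listeners.push(fn);
  }
}

const model = new Model();
let rendered = '';
// The "view" derives its output from the model on every change.
model.on(() => { rendered = `Country: ${model.get('country')}`; });

model.set('country', 'France');
console.log(rendered); // → 'Country: France'
```

Once some code reads the DOM instead of the model, two sources of truth exist, and they inevitably drift apart.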
For the charts, we chose yet another approach. We created chart classes not based on the view class described above.
D3 itself is functional. A chart is typically created and updated with a huge render function that calls other functions. The chart data is the input for this large function. More state is held in specific objects.
This makes D3 tremendously expressive and flexible. But D3 code is hard to read since there are few conventions regarding the structure of a chart.
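The functional style can be illustrated without D3 itself (the `scaleLinear` below is a hand-rolled stand-in for `d3.scaleLinear`, and `renderBars` is an invented example): a scale is a function built from configuration, and the render function simply maps data through it.

```javascript
// Hand-rolled stand-in for d3.scaleLinear: a scale is itself a function.
function scaleLinear([d0, d1], [r0, r1]) {
  return (x) => r0 + ((x - d0) / (d1 - d0)) * (r1 - r0);
}

// The chart data is the sole input to the render function.
function renderBars(data, height = 100) {
  const y = scaleLinear([0, Math.max(...data)], [0, height]);
  return data.map((d) => Math.round(y(d)));
}

console.log(renderBars([0, 5, 10])); // → [ 0, 50, 100 ]
```

Everything is functions calling functions on plain data, which is powerful but imposes no particular structure on a chart.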
Folks at Bocoup, Irene Ros and Mike Pennisi, invented d3.chart, a small library on top of D3 that introduced class-based OOP. Its main goal was to structure and reuse charting code. These charts are made of layers. A layer renders and updates a specific part of the DOM using D3. Charts can have other charts attached.
A general rule of OOP is to “favor composition over inheritance”. Unfortunately, we used a weird mix of composition and inheritance to share chart behavior.
We should have used functions or simple classes instead of complex class hierarchies. People still wrap D3 in class-based OOP today, but no class-based solution has prevailed against D3’s functional structure.
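A sketch of the alternative we have in mind (the behaviors here are invented, for illustration): instead of deriving chart classes from a hierarchy, each behavior is a small function, and a chart is assembled from exactly the behaviors it needs.

```javascript
// Each behavior is a plain function that augments a chart description.
const withLegend = (chart) => ({ ...chart, legend: true });
const withTooltip = (chart) => ({ ...chart, tooltip: true });

// Composition: fold the chosen behaviors over a base chart.
function makeChart(type, ...behaviors) {
  return behaviors.reduce((chart, apply) => apply(chart), { type });
}

const barChart = makeChart('bar', withLegend, withTooltip);
console.log(barChart); // → { type: 'bar', legend: true, tooltip: true }
```

No hierarchy to navigate, no surprising inherited behavior: what a chart does is visible at its construction site.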
Instead of rendering string-based HTML templates and updating the DOM manually, UI components are declarative nowadays. You simply update the state and the framework updates the DOM accordingly. This unidirectional data flow eliminates a whole class of bugs.
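The declarative idea fits in a few lines (a toy sketch of the pattern, not any particular framework's API): the UI is a pure function of the state, so to change the page you change the state and re-render, never the output directly.

```javascript
// The UI as a pure function of the state (hypothetical state shape).
function render(state) {
  return `<h1>${state.title}</h1><p>${state.count} datasets</p>`;
}

let state = { title: 'OECD Data', count: 0 };
console.log(render(state)); // → '<h1>OECD Data</h1><p>0 datasets</p>'

// To update the UI, produce new state and render again.
state = { ...state, count: 3 };
console.log(render(state)); // → '<h1>OECD Data</h1><p>3 datasets</p>'
```

Real frameworks add efficient DOM diffing on top, but the contract is the same: state in, UI out.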
The technologies we picked in 2014 either stood the test of time or offered a clear migration path. You could say we were lucky, but together with the client, we also chose long-lasting technologies deliberately.
For each client project, we seek the right balance between well-established, low-risk technologies and innovative technologies that help us deliver an outstanding product on time.
Thanks to Susanne Nähler, designer at 9elements, for creating the teaser illustration.
Thanks to the kind folks at OECD for the great collaboration over the course of the last six years.