Building a geographic information system for fun and non-profit

Posted on 11.12.2024 in computing • 3887 words • 19 minute read

Tags: computing, projects, polyjapan

I have been a member of the PolyJapan student association since I was in my second year at EPFL (2017), and during that time I have built a few more or less useful computing systems for them.

One of these systems is internally called Cartographe, the french word for Cartographer. As its name implies, it’s the system we use to design and visualize the plans for our event. But before we dive in, let’s do a bit of background.

PolyJapan and Japan Impact

As a student association, PolyJapan’s goal is to promote Japanese Culture. While we organize anime projections during the semester, our biggest event by far is Japan Impact, a full blown convention on Japanese culture that happens over a weekend every year at EPFL.

The event is more than 15 years old now, and was visited in 2024 by about 12'000 persons over two days, which is a very respectable size for a student-organized event in Switzerland. Compared to the biggest convention in french-speaking Switzerland, PolyManga, we pride ourselves with our vast amount of traditional cultural activities, which you can not easily find in most other events. Since we are organized at a school, the look of the convention is also quite different from a typical event organized at a convention center.

In any case, organizing a big event like this as a non profit requires a big comitee of volunteers with work an entire year to put that event together. And one of the most important documents all these volunteers work on during the year is, of course, the floor plan of the event.

How we did plans before: Inkscape

Before any of the tools I will describe in the rest of this article existed in our organization (it will come, I promise), the workflow to make the plans was based on Inkscape, a free vector-graphics editor.

Each year, we would go to the school map and take screenshots (or use PDF exports) of the different buildings we were planning to use for our event. Then, we would put each of these screenshots in a separate SVG file with Inkscape, using the school map as a base layer on top of which we would draw the different elements of the event. We would typically end-up with 6 to 12 SVG files, named something like Plan-CO-1.svg (more or less Plan-{CE,CM,CO}-{0,1,2}.svg, with CE,CM,CO the three main buildings in which our event happened, and 0,1,2 the floors). We would then upload these files to our shared Google Drive, and voilà!

To reduce the visual mess, each theme had its own layer(s): layers for exhibitors, layers for rooms usage, layers for security, layers for decoration, layers for electrical systems, … If you are for example working in the decoration and don’t care about the placement of electrical cables, you just hide the electrical layer and do what you have to do.

One of the Inkscape plans for the last year where we used it

The first issue with this system was that Inkscape didn’t really like our big raster layers (the school PDF base), and it was quite hard to make it run on a standard laptop by the end of the year.

The second issue, much more important, was that of concurrent changes. Indeed, if two persons downloaded the same plan from the Drive, worked on it in parallel, then uploaded it back, one of them would overwrite the other’s work. For that reason, we ended up developping some rudimentary form of shared lock: whenever one person wanted to work on the plans, they would send a message on the shared Telegram group to say they were locking one specific plan. They would then download the latest version, do their changes, push the new version, and finally unlock by sending a new message. This worked, and produced quite amusing short stories about the plans, but was still cumbersome and prevented multiple people from working at the same time.

Another big issue, which did not trouble most people except me, was that the data was not structured. A stand on the plan was nothing more than a rectangle, and possibly some text, on some layer in an SVG file. If you wanted to indicate logistical needs (which we call TCP, for Tables Chairs Panels), you’d just write it in a text box on top of the stand. Wanted to count the number of tables for a laugh? Tough, you’d better hope the excel spreadsheet we also have for stand is up to date, otherwise you’ll just have to sum by hand.

Clearly, something better was possible. Which is when the project began, helped in no small part by 2 year interuption in our event caused by an the new coronavirus.

The new thing: QGIS and PostGIS

It turns out that a lot of people have problems very similar to ours, and some of these people are also open-source contributors who build apps to solve their problems. Bless them. ¹

The two most important apps for this story are PostGIS and QGIS. Let’s dive into what they accomplish.

PostGIS: the data storage tool

As you may have guessed by its name, PostGIS is actually an extension to the very popular PostgreSQL database management system. This extension adds support for storing geometric/geographic data. In practice, it adds a new class of data types, called geometry. A geometry is parameterized by the type of geometry to represent (Point, MultiLine, Polygon, and variations of these). These types can then be used as any other type in PostgreSQL, including as column types.

Here is an example of a table creation command in PostGIS, corresponding to one of the tables in our database (translated to english).

create table stands_1
(
    id                     serial primary key,
    geom                   geometry(Polygon, 2056),
    internal_id            varchar(15),
    stand_provider         varchar(250),
    stand_type             varchar(40),
    price                  smallint,
    nb_tables              smallint,
    nb_chairs              smallint,
    nb_panels              smallint,
    comments               text,
    electricity_required   integer,
    sheet_link             varchar(255) default NULL::character varying,
);

This is of course identical to your typical PostgreSQL table, except for the geom column, whose geometry(Polygon, 2056) type is provided by PostGIS.

The interesting thing about using PostGIS as a storage layer for our plans is that anything we put on our plan is a row in a SQL table. This is of course a huge benefit if you want to use the data in other ways than just displaying polygons on top of a map. This also means that we can build systems that modify the metadata of features of our map: if we want to alter the number of tables required by a stand, we can just issue a standard UPDATE stands_1 SET nb_tables = 2 WHERE id = 123 with any PostgreSQL client, and it will work just fine.

QGIS: the visualization and edition tool

QGIS (which probably stands for QtGIS?) is a free and open source software that is, I think, mostly used to do surveying on the ground.

The typical user for QGIS is someone working for some public agency that wants to have better data for what is located in their jurisdiction. Maybe they want to count trees, and so they create a Point layer for trees, with some additional attributes, such as their height or their liveliness. Then, they go on the ground, count the trees, and note them on their laptop (or on the QField app) and update the map. Once they come back to the lab, they can use QGIS’ extensive toolbox to do analytics on the features they surveyed on the ground, and to present the data in multiple ways that can then help take better decisions.

Of course, this is not the only usecase for QGIS. If you look online, you will find tutorials for people organizing public events in the street, and that organize everything in QGIS. This is of course a much closer usecase to what we are doing.

An important advantage of QGIS is that it supports different backends for storing the features in each layer, and one of them is PostGIS. So you could imagine QGIS as a software that enables you to visualize and edit the data that you store in PostGIS, a kind of graphical database manager bundled with a map view.

Just like Inkscape, QGIS is organized in layers. Each layer is connected to a PostGIS table, and comes with additional metadata that tells QGIS how each feature of that layer should be displayed on the map. Layers are aranged in groups, which constitute a tree. In our case, we have one layer group for each floor of the event, which contain the thematic layer groups (Security, Electricity, Base layer, …), which themselves contain multiple layers. For the “base” layers (the background of the map), we use the GIS server of our school, which uses standard protocols, so we can ensure that our base layers are always up to date, because they come directly from the school data.

A small caveat however: QGIS is not really built for multi-floor mapping, so we had to cheat a little bit. For each interesting layer (such as stands), we actually have one table (and therefore one layer) per floor: stands_0, stands_1, stands_2. We then use a big layer group for each floor, and make these top level groups mutually exclusive (so you cannot visualize multiple floors at once). It’s not the most elegant solution, but it works for us, and has less risks of bugs than using a custom plugin, for example.

The workflow in the software is a bit different from Inkscape, however. In Inkscape, all layers are always editable: you just have to select one and draw you rectangle, or your line, or write your text. If you used the wrong layer, you can always cut and paste on the correct layer. This is a bit different in QGIS, where by default the entire map is read-only. You can look around, find features by attributes or just select some, but you cannot alter them. To add new features or modify existing one, you need to select a layer and enable the Edit mode. Then, you can make your changes, which you must then save separately (which QGIS can also do when you leave edit mode). Since each layer is stored in its own table, you have to save each layer separately, and it’s only when you save that the data will be pushed to the database. This is easy to do, but requires some training for people that are less used to working with computers, and one of the many things that make the software a bit confusing for inexperienced users. The advantage of this system is that as soon as you click Save and the table is updated, all the other users working on the map will immediately see your changes.

Since QGIS is built to work with geographical data, it also comes supercharged with a lot of very powerful tools and goodies. For example, you can always measure distances acurately on your plan, both as an independent tool (the ruler), and when adding features (using a dedicated toolbox). You can also use geometry functions in formulas, such as $length or $area (that describe the length and area of the current feature), and you can then use these formulas for aggregates (I want to know the sum of $length of all the barriers in this layer is an easy question to answer with QGIS) or in the label of the features, which I mention just after.

QGIS also comes with a lot of tools that facilitate visualization and exportation of its data.

First, visualization. On any layer, you can configure the look of the features and how they are labeled. Both of these things can be pretty arbitrarily complex, and can be configured using formulas. As an example, on a single layer that represents barriers and fences, you can have multiple styles that depend on the type of barriers you have put on the map. You can decide to have a label that uses the $length of the line to display the approximate number of barriers required on top of the line on the map, taking the type of barrier into account (as each barrier type has a different base length).

Together with the powerful styling, QGIS provides themes, which are a way to have multiple alternative styles for a single layer, with a simple way to switch between those styles. You work in logistics and want all the details of the stands to be written on them? Just load the logistics theme! You work in stands and would like to export a simple visualization to show to a provider, where stands have a color based on wether or not they are still free? Just load the Exhibitors them. This enables each person in the organization to tailor the map to only show the information they need. These themes can also be stored separately, then reused for other layers, for example.

Second, exportation. To handle exporting its plans, QGIS has a system of print layouts. When you launch QGIS, you are in a view where you can freely move around the map, and zoom in/out. This is obviously the view you want to have when you are working on a map, but this does not help you when you want to print a map or send it to someone. Print layouts allow you to design documents that can leverage your map, including adding map views and other elements that are linked to the map (such as a compass, a scale, a legend, …). You can then easily export that view to an image or a PDF document. The feature is very complete, and enables us among other things to export a complete plan which combines all floors into a single page, which we can then print on A0 paper and hang in our HQ while setting up the event.

This feature is even more powerful when used in the atlas mode, which lets you automatically generate one plan per feature in a layer. For example, we use it to generate automatically a location map for each stand of our event. See below an example of what it looks like. Keep in mind, QGIS automatically generates this for each of our stands!

An atlas

QGIS is a very complete software, and we probably only use 10% of its features. Of course, it was not the easiest tool to handle for everyone in the association, and the administration of the tool (layers management, …) is still the responsibility of the IT team at PolyJapan. But still, I am convinced it works well better than Inkscape, solves the problems we had with it, and enables a ton of usecases we could not even dream of in the Inkscape days.

Another visualization tool: the Cartographe web-app

QGIS is a great tool. However, it’s quite heavy, and it takes some time to boot-up. Also, it’s not very usable on mobile, and you cannot use it directly as an information channel for the public.

For all of these reasons, we needed to build a secondary visualization tool. A web-app that would be used only to view the maps, but never to edit them. After all, we’ve put all our geo data in a database - why not use that data and put it on a website?

This web-app was built in a couple of days, using a Scala backend with an Angular frontend.

The Backend, as I mentioned, uses Scala, together with the PlayFramework framework. There is no particular reason for this technology choice: it does the job, and I like the language. This backend is mostly responsible for authenticating users, and of course loading the data from the PostGIS database and converting it into GeoJSON, the format expected by the frontend libraries.

The Frontend is an Angular (Typescript) application, and it relies heavily on the OpenLayers framework to display the map and its various layers. It is basically a giant OpenLayers component, with all the piping required to provide its data, and a few additional components for logging in, switching map themes, enabling/disabling layers, and obtaining informations on a selected feature. This frontend also has a compile-time option that enables us to compile both plan.japan-impact.ch (the public plan, for our guests) and cartographe.japan-impact.ch (the internal service, with our plans with more “internal” information) from the same code.

The Cartographe backend uses a quick-and-dirty but sadly probably-permanent solution for the layers definition. Indeed, it turns out that parsing our QGIS file to generate the webmap is both very annoying and complex (the XML document that represents a QGIS project is quite complex), and not necessarily what we want, as we may want to have multiple plans (public plan for visitor, internal plan with only stands, internal plans with everything…). So, layers are defined in a big Scala object, and when the Backend is queried, it just checks your permissions and returns the layers you are allowed to see, filling them with features from the database in the process. This is an inelegant solution, but it works for now, and so it will likely stay. The main issue is that when we start a new year, we need to push an update to these layer definitions by updating the backend (instead of an administrative interface). An example of layer definitions is found below.

    // ...

    val PrivateJsonLayers: List[LayerGroup[MultiDimensionLayer]] = List(
      LayerGroup("Sécurité", List(
        MultiDimensionLayer("barrieres", List("type", "commentaires"), Some("Barrières"), dbStyle = Some("barrieres")),
      )),
      LayerGroup("Pro", List(
        MultiDimensionLayer(Tables.Tables("stands"), prettyName = Some("Stands"), dimensionName = "Étages", style = None, dbStyle = Some("stands"), None),
        MultiDimensionLayer("salles", List("nom_public", "type", "commentaires", "puissance_elec_requise", "max_personnes"), dbStyle = Some("salles"))
      ))
      // ...
    )


    val ProJsonLayers: List[LayerGroup[MultiDimensionLayer]] = List(
      LayerGroup("Plan Pro", List(
      MultiDimensionLayer("stands", List("id_pj", "exposant", "type", "link"), style = Some(
        UnionStyle(List(ColorFillStyle(Expression.fromCode("""if ("exposant" is not null or "type" = 'Bar' or "type" = 'Accueil', 'red', 'green')""")),
          LabelTextStyle(Expression.fromCode("\"id_pj\" || if (\"exposant\" is not null or \"type\" = 'Bar' or \"type\" = 'Accueil', '', '\\n* Libre *')"), font = "12px \\'Open Sans\\', sans-serif")))
      ))
    ))
    )
    // ...

An interesting feature enabled by our database-based system is the ability to open the map focused on a specific feature, for example a stand. This enables us to put a direct link to the plan for all the stands on our website, and when the guest clicks on one of these links, the plan loads directly at the correct spot, with the stand highlighted. Very neat, and very easy to do once your data is properly layed out.

The Transpiler

A very nice QGIS feature that I already mentioned before is the ability to write custom formulas, for labels or other things. For example, you can write a label as CONCAT("internal_id", '\n', "nb_tables", 't', "nb_chairs", 'c') to easily visualize the ID of a stand and its logistical needs. You can also use these formulas to decide which theme to apply on a specific feature: should it be represented as a yellow square or a red one?

OpenLayers supports a very similar idea. When you style a layer, instead of passing a fixed style, you can also pass a JavaScript function that takes a feature as parameter and returns the style that should apply to that specific feature.

I think you are starting to see where this is going: we have some QGIS expression on one side, and we have a dynamic styling layer in the frontend that wants to see some JavaScript. Yup! Let’s write a transpiler for QGIS expressions to JavaScript, and serve the JS file! You can see the result for transpiling the current available layers on https://cartographe.japan-impact.ch/api/8/expressions.js. As you can see, the output is very simple. But this is enough to handle most of our formulas!

This is not a complicated “compiler” by any standard, but I always have lots of fun in writing small transpilers. This one was built using Scala’s Parser Combinators, a very elegant way to write parsers. The total code for the transpiler ends up at around 1000 lines of code, so it’s definitely not very complex, but it still took about a weekend to make it work.

The API

In addition to providing a nice way for visitors to access our plans during the event, Cartographe provides an API that enables us to read and modify the attributes (except the geometry) of features in certain layers. This API is of course protected by API tokens, as we don’t want anyone to modify our maps or access confidential data.

Why do we do that?

Well. Even though popular wisdom says it’s a very bad idea, we run a lot of our event based on partially automated Google Sheets.

One of these sheets, a very important one, is the sheet in which we put all the stands. Each stand has its identifier on the map, and information about the provider that occupies the stand, logistical needs, … The people who work with the exhibitors spend 99% of their time in this documentm and if everytime they put a stand on the map they had to open QGIS (or Inkscape, as it was the case in the past), they would go insane.

Thankfully, because everything is ultimately in a database, as long as the stand was already put on the map, changing who occupies it or its logistical needs no longer requires opening the plan. Instead, the information can be written in the row corresponding to the stand in the Google Sheet, and then a synchronization routine updates the information in the database, using the Cartographe API as an intermediate.

Conclusion

As most projects in a student association, this is likely a little bit overkill. But, when you join an association, you are doing so to enjoy it. And one of the ways I have enjoyed my time at PolyJapan was to build tools to help us be more efficient. These tools are rough around the edges, only work 95% of the time, but when they work, they are extremely useful.

As a little anecdote, the Cartographe part of this project was initially a quick and dirty proof-of-concept. I had convinced everyone we had to switch to QGIS/PostGIS, so the plan was being done there, but we did not have the web visualization nor the automatic Google Sheets sync. And over a weeekend, I wrote the first prototype for Cartographe (most of its code is of course still running), including the Sheets synchronization. The prototype evolved a little bit, made it into a public release with our public plans for guests, but for me it was still only a quick-and-dirty prototype, not something critical. The year after that, I took some time to update it to work with the next edition of our plans, and it hugely impacted the efficicacy of the people working with the plans. Suddenly, they had to go edit the plan everytime any information changed. When I finally did the update, one of the people working with exhibitors gifted me a bottle of wine to thank me, because it was so useful to them. So this is how a quick and dirty proof of concept became in a few months so essential that we cannot live without it anymore. And as I now occupy the seat of exhibitors manager, I can only understand why.

I learned later that quite a lot of these people work for different public offices and work on the tools they have deployed at work to improve them. Public money, public code, yay! ↩︎

Ceyal's Log

Notes from a Computer Engineer