Shapefile Importer

Motivation

This project was a collaborative effort with Prof. Robert Benkoczi of the University of Lethbridge, Alberta, as part of an initiative to expose high school students to academic research by partnering them with university professors. It resulted in the paper Importing Data from Shapefiles and Pathfinding along Generated Nodes, published by The Journal of Student Science and Technology in 2017. The goal of the project was to use pathfinding techniques to optimize the transit system in the City of Lethbridge, Alberta.

[Image: Shapefile screenshot]

Research Pilot

When I was in Grade 11, my high school started a pilot program intended to spark students' interest in research by partnering them with university professors across Canada to work on a project together. This sounded like a unique opportunity, and naturally I signed up. I submitted my resume and was matched with (or selected by; I am not sure exactly what the process looked like on the other end) Prof. Robert Benkoczi.

Once we were partnered up, we started discussing projects that would strike a good balance between learning and getting something done. Since by this point I had already built quite complex (if not well-architected) programs such as Pew Pew Pew Game, we decided to attempt a tool that could read the GIS shapefiles that mapped the City of Lethbridge and use them to optimize the city's bus transit system. We imagined this working by having users enter a series of waypoints (bus stations) that would need to be hit, and then running an A* pathfinding algorithm to create the final routes. All of this had to be completed in a single semester.
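The waypoint idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `Node`, `Edge`, `Graph`, `aStar` and `routeThrough` names are all hypothetical, the road network is assumed to already be an adjacency list with non-negative edge costs, and straight-line distance is used as the A* heuristic (which only guarantees shortest paths if no edge is cheaper than the straight line between its endpoints).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Node { double x, y; };                  // hypothetical road-network node
struct Edge { std::size_t to; double cost; };  // hypothetical directed road segment
using Graph = std::vector<std::vector<Edge>>;  // adjacency list indexed by node

inline double straightLine(const Node& a, const Node& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// A* from start to goal. Returns the node indices along the path,
// or an empty vector if the goal cannot be reached.
inline std::vector<std::size_t> aStar(const std::vector<Node>& nodes, const Graph& graph,
                                      std::size_t start, std::size_t goal) {
    assert(start < nodes.size() && goal < nodes.size());
    const double inf = std::numeric_limits<double>::infinity();
    const std::size_t none = nodes.size();  // sentinel meaning "no parent"
    std::vector<double> best(nodes.size(), inf);
    std::vector<std::size_t> parent(nodes.size(), none);

    using Entry = std::pair<double, std::size_t>;  // (cost so far + heuristic, node)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    best[start] = 0.0;
    open.push({straightLine(nodes[start], nodes[goal]), start});

    while (!open.empty()) {
        const std::size_t current = open.top().second;
        open.pop();
        if (current == goal) break;
        // Stale queue entries are simply re-processed; they cannot lower any cost.
        for (const Edge& e : graph[current]) {
            const double candidate = best[current] + e.cost;
            if (candidate < best[e.to]) {
                best[e.to] = candidate;
                parent[e.to] = current;
                open.push({candidate + straightLine(nodes[e.to], nodes[goal]), e.to});
            }
        }
    }

    std::vector<std::size_t> path;
    if (best[goal] == inf) return path;
    for (std::size_t at = goal; at != none; at = parent[at]) path.push_back(at);
    std::reverse(path.begin(), path.end());
    return path;
}

// Chain A* legs through the waypoints in the order given.
// Returns an empty route if any leg is unreachable.
inline std::vector<std::size_t> routeThrough(const std::vector<Node>& nodes, const Graph& graph,
                                             const std::vector<std::size_t>& waypoints) {
    std::vector<std::size_t> route;
    for (std::size_t i = 0; i + 1 < waypoints.size(); ++i) {
        const std::vector<std::size_t> leg = aStar(nodes, graph, waypoints[i], waypoints[i + 1]);
        if (leg.empty()) return {};
        // Skip the first node of later legs so shared waypoints are not duplicated.
        route.insert(route.end(), leg.begin() + (route.empty() ? 0 : 1), leg.end());
    }
    return route;
}
```

Note that this visits the waypoints in the order the user entered them; choosing the best order of stops is a separate (and harder) problem that the sketch does not attempt.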

All of the students who signed up for this program had a dedicated timeslot in our schedules for it, in addition to working on it in our own time. Prof. Benkoczi was awesome enough to give me lots of free rein over the project, and acted primarily as a supervisor and point of contact throughout the period.

The Problem Doubles

At first, progress was smooth. I was already quite familiar with SFML and used it for all of the rendering I needed. The shapefile specification didn't look overly complicated, and I only needed a subset of it for the Lethbridge data, so I opted to write a parser from scratch as a learning opportunity (and Prof. Benkoczi approved). This was a great experience in serialization, because it was a binary format, where small mistakes in how the bytes are read are unforgiving.

However, the project was of course not without problems. The first was the need for a smaller dataset: the entire Lethbridge dataset was too unwieldy to step through while debugging, and too complex for me to really grasp what was going on. On top of that, for privacy/security reasons relating to the data, I couldn't use any online GIS tools as a reference, so I never saw the correct output until I got my own parser working.

After several iterations of running into a problem and solving it, I finally met my match. No matter what I tried, the map ended up looking like a garbled piece of abstract art, with lines going all over the place. There weren't any errors popping up, or any other indication of what I had done wrong. Eventually, after over a week of trying to debug the issue, I made some progress. It turned out that the values in the GIS files were stored with a particular endianness (byte order), which differed from the one my C++ reading code assumed. Hence, when I read the raw bytes and tried to convert them into a double, the bytes were interpreted in the wrong order and I got this jumbled mess instead of a beautiful map.
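To illustrate the byte-order issue, here is a minimal sketch of reading values with an explicit endianness rather than relying on whatever the host machine uses. It is not the original project code: the function and struct names are hypothetical, and the header offsets are assumptions based on the published shapefile header layout (a 100-byte header whose file code is a big-endian integer, while the version, shape type and bounding-box doubles are little-endian).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "sketch assumes 64-bit IEEE 754 doubles");

// Build a 32-bit integer from 4 bytes stored most-significant byte first.
inline std::int32_t readInt32BigEndian(const unsigned char* p) {
    const std::uint32_t v = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                            (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
    return static_cast<std::int32_t>(v);
}

// Build a 32-bit integer from 4 bytes stored least-significant byte first.
inline std::int32_t readInt32LittleEndian(const unsigned char* p) {
    const std::uint32_t v = (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
                            (std::uint32_t(p[1]) << 8) | std::uint32_t(p[0]);
    return static_cast<std::int32_t>(v);
}

// Assemble the 64 bits in a known order, then copy the bit pattern into a double.
// Shifting makes the result independent of the host's integer byte order.
inline double readDoubleLittleEndian(const unsigned char* p) {
    std::uint64_t bits = 0;
    for (int i = 7; i >= 0; --i) {
        bits = (bits << 8) | p[i];
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Hypothetical subset of the main file header; offsets are assumptions (see above).
struct ShapefileHeader {
    std::int32_t fileCode;   // big-endian, at byte 0
    std::int32_t version;    // little-endian, at byte 28
    std::int32_t shapeType;  // little-endian, at byte 32
    double xMin, yMin, xMax, yMax;  // little-endian doubles, from byte 36
};

inline ShapefileHeader parseHeader(const unsigned char* header, std::size_t size) {
    assert(size >= 100);  // the main file header is assumed to be 100 bytes long
    (void)size;           // avoid an unused-parameter warning when asserts are disabled
    ShapefileHeader h;
    h.fileCode = readInt32BigEndian(header + 0);
    h.version = readInt32LittleEndian(header + 28);
    h.shapeType = readInt32LittleEndian(header + 32);
    h.xMin = readDoubleLittleEndian(header + 36);
    h.yMin = readDoubleLittleEndian(header + 44);
    h.xMax = readDoubleLittleEndian(header + 52);
    h.yMax = readDoubleLittleEndian(header + 60);
    return h;
}
```

Checking a known constant first, such as the file code, is a cheap way to catch a byte-order mistake before any coordinates are drawn.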

Publication

After overcoming that hurdle, the rest of the project progressed well, though sadly there wasn't enough time to fully implement everything we had set out to do. Fortunately, there was still enough to be useful: the A* implementation was for the most part able to properly traverse roads (although it sometimes made illegal turns at bridges because of how roads are represented in the shapefile), and the result was interesting enough to write a paper on.

Writing the paper was an interesting experience in and of itself, as it was a project with deadlines outside of school, and ones that weren't always communicated the same way school deadlines had been. For example, there were submission deadlines and internal review deadlines, and everything had to be reviewed several times before the final approval for publication. The whole process took much longer than I expected: I had already started university by the time I got the notice that the paper was published (although I had confirmation much earlier that it would be).

Current Thoughts & Learnings

At the time I was really excited to share my parser with the world, and hoped that others would find it useful. In reality, I think the value of this initiative lay much more in the experience it gave students like me than in the projects we worked on. With the benefit of several more years of experience, I see many ways I could have done better during this research project, both from a coding and a personal point of view. That said, I have absolutely no regrets about signing on for this, and Prof. Benkoczi was fantastic to work with. I also think the goal of exposing high school students to the research side of academia was a great success: one of the other students ended up choosing to go more towards research at university.