Sure. Thanks for having me on the podcast, by the way. So I'm Brendan, I'm the CTO at Bunting Labs, and that means that I have been building an AI autocomplete for QGIS for the past year. I wasn't originally a GIS or geospatial person; I actually got my start in open source software.
I originally got involved with the Node.js project, which is an open source platform for running JavaScript, normally on servers, and that's where I initially learned to engineer software. That really gave me my spark in computer science. It was only later that I went to MIT and ended up studying physics and computer science, where I got more into machine learning, and I think you can see that Bunting Labs sits at the intersection of many of these different interests of mine. My background in machine learning, and specifically the Bayesian inference I studied at MIT, led me to have this geospatial moment towards the later end of my college career.
That's when I was working on a consulting project with a real estate developer, where we were more or less summarizing different neighborhoods in terms of their characteristics, with statistics and with neural networks. It was at that moment that I realized geospatial data is extremely powerful, and I was kind of confused as to why other people weren't using geospatial data more. That really set me on this path to discover geospatial data and the intersection between geospatial data and machine learning, and that's where Bunting Labs came from. I guess the big unanswered question for me is: why QGIS, then? Why did you decide to build something on top of the platform that is QGIS? My familiarity has always really been with open source software.
So when I think of software that I can contribute to, and software that I can build an ecosystem around or contribute to the ecosystem of, I think of GitHub and I think of repositories that everyone can see, learn from, contribute to and build around. QGIS is in many ways closer to my comfort zone than many of the other GIS software packages that are available out there.
It's extremely easy to get started with, so we built the first version of this plugin in about two weeks, and I could do that by learning from all of the other open source plugins that are available. QGIS is not just this one app that you download to your computer; it's really an ecosystem that everyone is contributing software to and enjoying as a result, and I think that's why I first got started with the QGIS plugin.
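To give a sense of how low that barrier to entry is, here is a minimal sketch of what a QGIS plugin's entry point looks like in Python. This is not Bunting Labs' actual plugin; the class and toolbar action are illustrative, and a real plugin also ships a metadata.txt file describing itself to the plugin manager.

```python
# __init__.py: QGIS discovers a plugin through this factory function.
def classFactory(iface):
    from .plugin import MinimalPlugin
    return MinimalPlugin(iface)
```

```python
# plugin.py: a minimal plugin that adds a single toolbar action.
from qgis.PyQt.QtWidgets import QAction

class MinimalPlugin:
    def __init__(self, iface):
        self.iface = iface  # handle to the running QGIS application

    def initGui(self):
        # Called when the plugin is loaded: register UI elements here.
        self.action = QAction("Say hello", self.iface.mainWindow())
        self.action.triggered.connect(self.run)
        self.iface.addToolBarIcon(self.action)

    def unload(self):
        # Called when the plugin is disabled: clean up after yourself.
        self.iface.removeToolBarIcon(self.action)

    def run(self):
        self.iface.messageBar().pushMessage("Hello from a minimal QGIS plugin")
```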
And so my next question is a two-part question. Explain to me and to the listeners of this podcast what autocomplete for digitizing vectors means for you and QGIS, and why that was a problem you wanted to solve.
So autocomplete for vectorizing in QGIS, or another way to describe it, an autocomplete for tracing maps, is an addition to the GIS pen tool. If you are presented with raster imagery of some sort, whether that's a satellite image of an area or a rendered PDF of some construction or architecture asset somewhere in the world, and you need to convert that into a vector file with a projection and some level of accuracy, you would use the pen tool to digitize it into its vector representation. In many conversations with GIS professionals, we found that they participated in this workflow extremely frequently. They would be presented with, oftentimes, a PDF that doesn't have any kind of projection attached. They would need to georeference that PDF, oftentimes to sub-meter accuracy, and that's literally locating it in aerial imagery, or maybe on an OpenStreetMap basemap, and adding control points until the raster layer perfectly lines up with its location in the real world. Then they would get out their pen tool and digitize out lines, polygons, maybe even points that represent semantic shapes from that map, and then they would add metadata to those shapes. These aren't just lines in your local state plane; these are buried utility lines, so you would add metadata for the diameter of that pipe. Or maybe these are literally painted lines on a highway, and you are adding metadata that says these were last painted in August 2014 and they will need to be painted again in August 2024.
This kind of workflow, where you would take rasters and produce a meaningful digital file format equivalent of them, we saw all the time. So given my background, we started imagining what this workflow would look like with AI, and this is more or less the product of that.
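To make the georeference-then-digitize workflow concrete, here is a minimal sketch of the control-point step using GDAL's Python bindings. The file names and coordinates are made up for illustration.

```python
# Sketch: georeference a scanned map with ground control points (GCPs)
# using GDAL, then warp it so it lines up with the real world.
from osgeo import gdal

# Each GCP ties a pixel/line position in the scan to a map coordinate:
# gdal.GCP(map_x, map_y, elevation, pixel_column, pixel_row)
gcps = [
    gdal.GCP(-71.0589, 42.3601, 0, 1024, 768),
    gdal.GCP(-71.0620, 42.3655, 0, 210, 95),
    gdal.GCP(-71.0533, 42.3629, 0, 1810, 400),
]

# Attach the GCPs (here in EPSG:4326), then warp into a georeferenced GeoTIFF.
gdal.Translate("with_gcps.tif", "scanned_map.tif", GCPs=gcps, outputSRS="EPSG:4326")
gdal.Warp("georeferenced.tif", "with_gcps.tif", dstSRS="EPSG:4326")
```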
You did a great job answering both questions with one answer; that was brilliant. So given your skill set, your experience and your interests, if I were you, I would look at this and think: convert every line that I can see in that rendered image to a vector output, instead of trying to trace lines individually. Is that a thought that crossed your mind when you were thinking about solving this problem?
Yeah, and I think it's really funny, because my conception of how to solve this map digitization problem is totally different now than it was a year ago when I first started looking at it. When Michael and I decided that map digitization was a problem we would tackle, we basically went to the whiteboard and I threw up my design, my attempted solution for what digitization with AI would look like, and I said it's probably really easy to convert an entire map into its vector representation. In fact, there's an entire academic field dedicated to solving it. If you go online and Google vectorization, especially of images, there's a lot of published research on how to actually accomplish that task. I said, surely this is a solved problem and I can build this in two weeks. I said that probably 13 or 14 months ago, so I totally take that back. The first six months of me building this were actually very similar to your suggested solution. I did a literature search and found the best performing machine learning models for converting an image into its vector representation, and there is some really high quality research available online for doing that. If you take pretty much any image, we're talking especially logos or images of the real world, and convert it into its vector format, to your eye it looks fantastic. It near-perfectly recreates the entire image as an SVG, for example.
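For a flavor of what blanket vectorization looks like in practice, here is a sketch using GDAL's polygonize. This is a much cruder technique than the research models being described, but it illustrates the convert-everything approach and its core weakness: every pixel region becomes geometry, with no notion of which features matter.

```python
# Sketch: blanket raster-to-vector conversion with GDAL. Every connected
# region of equal pixel value becomes a polygon, regardless of meaning.
from osgeo import gdal, ogr, osr

src = gdal.Open("map.tif")
band = src.GetRasterBand(1)

driver = ogr.GetDriverByName("GPKG")
out_ds = driver.CreateDataSource("everything.gpkg")
srs = osr.SpatialReference(wkt=src.GetProjection())
layer = out_ds.CreateLayer("polygons", srs=srs)
layer.CreateField(ogr.FieldDefn("value", ogr.OFTInteger))

# Field index 0 receives each region's pixel value; no semantics attached.
gdal.Polygonize(band, None, layer, 0)
out_ds = None  # close to flush features to disk
```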
But when we went to our original design partners and showed them this technology, we realized that it kind of missed the mark. When you see a map, it's difficult to determine a priori which of the assets in that map are important to have a vector representation of, and so we realized that if we wanted to productionize this, we actually had to build a semantic understanding of all of the features in these maps. That all of a sudden was a much more difficult problem, because that's akin to machine reading and understanding of maps, and when we were looking at this a year ago, that seemed impossible. And even worse, when you have extremely important assets that you want a high quality vector representation of, you don't actually trust the output of a machine learning model on its first go. If you were to hand me a GeoTIFF and I handed you back a shapefile, you would actually load it into your GIS software of choice and review it, such that it's perfect, such that it's up to your standards. Realizing that took about six months. After six or seven months of working on that approach, I went on a walk with Michael, and I essentially came to the conclusion that I was not on a trajectory of success.
If I extrapolated my progress out, I would not be successful in one or two years, and so I essentially had to go back to the drawing board. It was only then, with inspiration from these more recent generative AI models like ChatGPT, that I considered an AI autocomplete. This was never originally going to be a plugin that you run in your desktop GIS software, but we realized that that was actually the modality that people wanted: they wanted to be able to have access to this and immediately review its output. I think what you're getting at is that they wanted to be in charge. They wanted something that just makes them more efficient in the work they're already doing; that way, they'll trust the output more. Could you talk to me a little bit about how this is different from Segment Anything, for example? Because that is kind of: you give it an image and it segments everything in the image, and people seem to love that, and it seems to work really, really well. It reminds me of your original approach: give me an image and I will vectorize the whole thing. Could you explain to me and the listeners what makes these two things different, and why? I think that would be really helpful.
So Segment Anything is extremely powerful, but it's powerful along a few specific axes. The first is that it's grounded in text descriptions of what you're segmenting. If you take a very large satellite image, for example, and I think the best use case for Segment Anything really is satellite and aerial imagery, and you can textually describe the geometry that you are extracting, then I think Segment Anything, and segment-geospatial, the package that implements it specifically for geospatial work, is probably the best tool you can have. If you load drone imagery that you just took and ask it to segment out the lake boundary, that's pretty much the best thing you can get. Where our AI autocomplete is preferable to Segment Anything is when you are digitizing geometries that you have a semantic understanding of but that are difficult to explain textually, or you are digitizing lines that are difficult to extract semantically from that map. For example, we often see extremely low quality maps: maybe there are a lot of JPEG artifacts on the map, maybe the resolution is pretty low because it was originally rendered really low, or there are artifacts from the original scan. Those are the kinds of maps where Segment Anything struggles. It's much more difficult to get Segment Anything to extract a pressure sewer line where the line style is dashed, with "PS" intermittently interrupting it, whereas our AI autocomplete, because it looks at the line that you have already started, excels at just completing it.
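For the text-grounded case Brendan describes, a sketch along these lines is possible with the segment-geospatial (samgeo) package. The image path and prompt are made up, and the exact method names and keyword arguments should be treated as an assumption, since the package's API has evolved across versions.

```python
# Sketch: text-prompted segmentation with segment-geospatial (samgeo).
# LangSAM pairs a text-grounding model with SAM: the text finds candidate
# boxes, SAM segments them. Paths and the prompt are illustrative.
from samgeo.text_sam import LangSAM
from samgeo.common import raster_to_vector

sam = LangSAM()
sam.predict(
    "drone_image.tif",        # imagery you just captured
    "lake",                   # textual description of the target geometry
    box_threshold=0.24,
    text_threshold=0.24,
    output="lake_mask.tif",   # assumed kwarg for saving the mask raster
)
raster_to_vector("lake_mask.tif", "lake_boundary.gpkg")  # mask to polygons
```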
Okay, "the line that you have already started". In the same way autocomplete works for, say, ChatGPT, which some people describe as a fancy autocomplete: it says, okay, your question was this, so the next logical token is this. Or the way we see autocomplete in Google Docs: it looks like the next logical word is this. So when you say autocomplete, you mean I take my pen, I start tracing the line, and it jumps ahead of me,
as in: it looks like you're going along this line here, and it follows it along. Am I correct? That's exactly right. The way our AI autocomplete works is that it's a drop-in replacement for the pen tool. If you activate our plugin in QGIS and begin digitizing some geometry, let's say for our example it's a utility line on a raster map, then once you begin digitizing that line by drawing two segments, a small piece of that map will be sent to our inference server, and it will autocomplete the next 50 vertices of that line. These vertices are output by a neural network that we've specifically trained for this job, so that neural network is literally looking at the pixels, and at what you have already drawn, to predict which parts of the map are semantically the same feature and which pixels it should choose as continuing vertices. I've realized that this problem is actually much more complex than I originally imagined. When you and I talk about digitizing maps, it's extremely broad, and it can be difficult to imagine what those maps actually are. But the magic in AI autocomplete is not that it works on certain maps; it's that it works on your map. When someone loads into QGIS a map that our AI has never seen, and it is still able to generalize based on the hundreds of maps that we've trained it on, that I think is the really impressive thing.
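Bunting Labs' inference API is not public, so the following is a purely hypothetical sketch of the round trip just described: the endpoint URL, payload shape, and field names are all invented for illustration.

```python
# Hypothetical sketch of the autocomplete round trip: send the segments the
# user has drawn plus the surrounding pixels, get candidate vertices back.
import json
import requests

def autocomplete(map_crop_png: bytes, drawn_vertices: list[tuple[float, float]]):
    """Ask an (invented) inference endpoint to continue a partially
    digitized line; returns a list of predicted (x, y) vertices."""
    resp = requests.post(
        "https://inference.example.com/v1/complete",    # invented endpoint
        files={"crop": ("crop.png", map_crop_png, "image/png")},
        data={"vertices": json.dumps(drawn_vertices)},  # what the user drew
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["predicted_vertices"]            # invented response field

# In the plugin, the returned vertices would be drawn as a preview the user
# can accept and keep digitizing from, or delete if the model guessed wrong.
```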
I think the example you have on your website is amazing, and one reason I think it's amazing is that you're tracing a dashed line. Just to describe this for the listeners: there's a video, a screen recording I guess, of someone tracing a dashed line in QGIS, which is incredible, and you can see the autocomplete working, staying ahead of the pen. But I think the really incredible thing is that adjacent to that line there is an identical line, and crossing over that line is another identical line, and yet the autocomplete knows that I want to follow this particular one of those identical lines. To me, to the layperson, this seems like an incredibly hard problem to solve.
So I think early on I underestimated that difficulty, and I'm actually very fortunate that my background let me push through all of the complexities that came with solving this problem. If you follow our journey on social media, it might seem as though we built this in a couple of weeks and all of a sudden there's AI in QGIS, but really that's a total simplification of this journey I've had in understanding what map digitization really means, both at a semantic level for the people that do it professionally, but also at a technical level: how you could teach a computer to truly learn the semantic features that these people are working with, so that it can actually autocomplete ahead of them. To dive into what is surprising about this: as someone who has literally looked at probably a thousand different maps over the course of building this AI autocomplete, I've actually run nearly 2,000 machine learning experiments to get to the current one. Experiment one, I promise you, did not perform well at all. If you are one of our users and you download QGIS, download our plugin and try it out, you are actually running model 1,854, and the 1,853 models that came before it were bad in many ways, but the main thing is that they weren't able to generalize to the map you're looking at. I've realized that lines carry so much semantic meaning that allows you to disambiguate between two lines, say, when they intersect, and encoding that into the model has been pretty much the most important task that I've completed. So hats off to you for continuing. I think if I got a no, what did you say, 1,800 times, I would be tempted to give up, but you didn't. And finally you got a yes. My guess is that during that process you could see that you were getting better and better, closer and closer to the goal, so my guess is you had encouragement along the way; it wasn't a hard no, hard no, hard no that many times and then a yes. But anyway, that is incredible. Well done, well done, hats off to you. When I think about this, digitizing lines is very impressive, but it's one step in the journey, right? Georeferencing is another part of that workflow. Now, I've got my lines. Not to throw any cold water, not to rain on your parade or anything, but now I've got my lines, you've done a great job of that, and now they need to land somewhere specific in the real world in order to be useful to me, and then I need to extract some metadata about them as well. Is there any part of this, the model that you've built today, the experience that you've had, the learnings along the way, that overlaps with maybe the next step of georeferencing?
That's a great question. So I've been working simultaneously on both of these models. The AI autocomplete for vectorization is something that has crested this threshold of usefulness; for most of its life it was not useful at all. And I also have a machine learning model on my computer that can georeference maps, as-built documents from architecture and construction firms, you name it, automatically, but it hasn't yet crested that threshold where it can actually save someone time. It's interesting, because these two problems, while they're both grounded in maps, are actually totally different. The AI autocomplete is really about understanding the semantic nature of maps and the features that are depicted on them, whereas georeferencing is actually a search problem. When you're presented with a map and you don't know where it came from, and this is extremely common for consultants, if you are asked to georeference that map to sub-meter accuracy, you often don't know where it is at all. So you embark on this interesting search, whether you're using Google Maps, aerial and satellite imagery, maybe OpenStreetMap, and all of these other data sources that can help you locate where a particular map is. And the possibilities for that are astronomical: we're talking a number comparable to the number of possible chess game positions for the ways a map could be georeferenced to sub-meter accuracy in the world. So it's actually extremely difficult, especially when the map only has a little bit of information. If the map is just a picture of a building and two cross streets, I can guarantee you there are a thousand roads in the United States with those street names. What people actually do is go through and page through all of these possible permutations of where this map could be, and then they line up exactly where the building is. So if you imagine an AI model for georeferencing, it has to do all of that, plus more. You're synthesizing all of these different datasets: all of the roads in the world, not just the US obviously, all of the satellite imagery, or even high resolution aerial imagery that you can use to georeference an exact building outline or maybe even the curb of a road. These are challenges that I've had while hand-digitizing and hand-georeferencing maps, and building an AI model to do that same thing is surprisingly difficult, but I think it's something we'll be able to do in, hopefully, three months. It's interesting to hear you say "surprisingly difficult", because, no, it doesn't surprise me at all, especially after what you just said, which makes a lot of sense. But the good news is we have a bunch of filters. Normally on a map you have a name,
you might have street names, you might have metadata, you might have some coordinates in there, you might have something like that, and we can filter it down to: okay, I know I'm looking in this particular town, or around this address, just as an example. And I guess that not all datasets are great base maps to reference against. If you've got a PDF as-built of some building, looking at really grainy satellite data is not going to be super helpful; it's not going to get you where you want to go. Maybe OpenStreetMap would be better, or another building layer would be better, where we're talking about discrete objects as opposed to pixels. So we can do things to filter that down. At least that's my sort of experience; is that also your experience, or can you put some other words around that?
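To make the cross-streets filter concrete, here is a sketch that queries OpenStreetMap's public Overpass API for every intersection of two named streets in the United States. The street names are made up; each returned node is one candidate location to page through.

```python
# Sketch: the "two cross streets" filter via the Overpass API. Find every
# node shared by a way named "Main Street" and a way named "Oak Avenue";
# each hit is a candidate location for the scanned map.
import requests

query = """
[out:json][timeout:120];
area["ISO3166-1"="US"][admin_level=2]->.us;
way(area.us)["highway"]["name"="Main Street"]->.a;
way(area.us)["highway"]["name"="Oak Avenue"]->.b;
node(w.a)(w.b);
out;
"""
resp = requests.post(
    "https://overpass-api.de/api/interpreter", data={"data": query}, timeout=180
)
resp.raise_for_status()
candidates = [(el["lat"], el["lon"]) for el in resp.json()["elements"]]
print(f"{len(candidates)} candidate intersections to page through")
```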
As someone who's tried to make this model, and is also continuously improving it: there's even more complexity than you could imagine. We've done a few projects on bulk georeferencing of maps, and one of the really interesting parts of this challenge is that things change over time. We've gotten basically the equivalent of a truckload of PDF maps and been asked to georeference them, and the complexity of that is that highway names change over time, road names change over time, buildings disappear and appear, rivers even change course as they flow. So when you're being asked to locate a map, the map is actually a description of how a place was at a certain point in time, and that's another complexity that you're dealing with. Yeah, that's a really good point; trust me, I hadn't thought about that either. But you also said that you're hoping to have something working in three months' time, which just seems insane. What are your expectations for three months' time? What do you think it is that will be working, and how will it work?
I think it's really about this threshold of human usability. For the AI autocomplete, we realized that the bar was that the autocompleted vertices are easy to delete in case they're wrong, but easy to continue from in case they're right. We find that a lot of our users are actually okay with mistakes, because they can instantly repair them, and designing that into our QGIS plugin has been a great boon for its usability. We're looking for the same thing with this georeferencing model: it should be of no cost to you to try to georeference something, acknowledge, even if you get an error message, that the georeferencing is not as accurate as you wanted, and then go in there and georeference it yourself. So you can imagine a georeferencing button in your desktop GIS software, where you load a raster, and perhaps it doesn't have a CRS, perhaps the ground control points are wrong, and it automatically locates it in the real world, rotates and aligns that image, and gives you ground control points that ground it in its actual location. That experience is what we're essentially trying to emulate with an AI georeferencer.
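No such API exists publicly, so the following is a hypothetical sketch of that button's flow: an invented suggest_gcps() stands in for the model being described, while the GDAL calls that apply its output are real.

```python
# Hypothetical sketch of the "georeferencing button": a model proposes
# ground control points, GDAL applies them, and the user reviews the result.
from osgeo import gdal

def ai_georeference(raster_path: str, output_path: str) -> None:
    gcps = suggest_gcps(raster_path)  # invented placeholder for the model
    if not gcps:
        # Failing cheaply is part of the design: the user just does it by hand.
        raise RuntimeError("Could not locate this map; georeference manually.")
    tagged = gdal.Translate("/vsimem/tagged.tif", raster_path,
                            GCPs=gcps, outputSRS="EPSG:4326")
    gdal.Warp(output_path, tagged, dstSRS="EPSG:4326")
    # The user still reviews the output; bad GCPs are cheap to delete and redo.
```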
Wow. That's a really interesting insight. Something that did that would save a ton of time for people and be a massive value add for them, right? And because they come into this task with the expected cost of georeferencing, because georeferencing something accurately can take upwards of 20 minutes, and I've had maps that take longer, but let's say 20 minutes as a good number, because you know it can take that long, the cost of actually waiting 10 seconds to have an AI model try it first is pretty good. That's what I have in mind as I go about building this model. Now let's jump back to your autocomplete for vectorization, now that it's working and has crossed that threshold of usability that you keep mentioning. Does that mean that as people use it and create these results, they are in a way creating labels and adding to the model as they do it? Yes, this vertex was right; no, that one over here is wrong. Are they improving the model over time? So, not by default. Our plugin doesn't collect any unnecessary telemetry,
so we obviously know when you've requested an autocomplete, but we don't track whether or not you keep that result, we don't track whether or not you delete the resulting completion, and we don't track whether or not you cut that completion at a particular point. All of these are potential outcomes that could happen in the plugin, and we don't collect any telemetry about them, so we don't automatically add any maps to our training dataset. We do occasionally have conversations with our users where they request that we add their maps to our training set, and on occasion we will have them hand-digitized; we have GIS staff that help us with that. But by default, that's not happening. Is that in the interest of privacy, or is it because that data would not add any particular value to the autocomplete model in the background? It is definitely in the interest of privacy. Okay, so what you're saying here is that it would be better for the model, we could make a better model, if we collected that data, but in the interest of privacy you do not. I realize earlier on you said that these are two different problems to solve: the autocomplete is a semantic problem, and the georeferencing is a search problem. Assuming we didn't have to think about privacy issues here, if you also collected data for the georeferencing that people were doing, could you also improve that over time? I think you could. As with most ML models that are deployed on the internet, seeing how the model is actually being used in the real world is valuable information, but at least for us, we've found that we can collect really high quality data without having to go into our users' data, and I think we've found ourselves in a great position to be in. Yeah, absolutely. So, you sound incredibly
talented, and I'm an optimist, so let's say that in the next few hours you've solved both of these problems perfectly and you're ready to move on to one of the next steps in the process, which is adding metadata. Now that you've solved the autocomplete problem and you've got a model that works 100% of the time, and you've got a georeferencing model that works 100% of the time, how will you start work on collecting metadata? What will that look like? What will be the steps involved, and the potential output? The metadata is really interesting, because I think it brings the semantic understanding of the map that someone is digitizing to a totally new level. When you see people extracting all the relevant metadata for a particular project, it's really not about a particular map; it's about how the map fits into your overall project and the goal that you're trying to accomplish.
So if you are doing construction in an area, and you want to have a complete understanding of all the buried utility lines, such that in the case that you need to dig around them you can do what's called daylighting them, which is when you dig exactly around them so that you can see them with your own eyes, then all of the metadata that would help you accomplish that daylighting process is actually your goal, and the metadata extraction is just how you get there. I think this is really where the application of multimodal models can accelerate this process, because when we combine the intelligence that we can generate about a map with the visual and textual understanding that an LLM can provide, it's only with that larger context that metadata extraction begins to make sense. If you go into GPT-4 with Vision, or some of the other multimodal models that are available right now, and ask it to describe a particular feature in an image, it is totally incapable of doing that. And that's really not surprising: this data is not in these models' training datasets, so it's not surprising that by default you can't get this kind of behavior from them. That being said, it's something that I know is possible, and I'm really excited to be a part of the future that makes it so.
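As a concrete illustration, here is a minimal sketch of asking a stock multimodal model about a map feature, using the OpenAI Python SDK. The image, question, and label are made up, and, as Brendan notes, today's general-purpose models tend to do poorly on exactly this kind of request.

```python
# Sketch: asking a general-purpose multimodal model about a map feature.
# Assumes OPENAI_API_KEY is set; the image and question are illustrative.
import base64
from openai import OpenAI

client = OpenAI()

with open("utility_map.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "What diameter is the pipe on the dashed line labeled "
                     "'PS' near the top left of this map?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```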
Yeah, I guess what I'm trying to understand is what kinds of attributes you would be able to collect. What kinds of metadata? For me, it would make sense that maybe you could calculate the length of that line, just as an example. Maybe you could look at the color of the line that you have just autocompleted and say: well, I know that matches this thing over here in the legend of my map, if it has one, and there might be some data associated with that. It might say water pipe, sewage pipe, I don't know, gas pipe, that kind of thing; that would be valuable data to have in there. Maybe I've digitized other lines, and maybe it could say: well, this pipe is so close to that other pipe. Maybe you could start building up this database based on the metadata that you're collecting about each object, based on its physical characteristics, based on whatever else is in the map. That is the way I would think of it, but I'm sure you have other ideas.
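A sketch of the first half of that suggestion in PyQGIS: measure each digitized line's real-world length and write it, along with a hypothetical legend-derived class, into the layer's attribute table. The layer and field names are made up.

```python
# Sketch: stamp computed metadata onto digitized features in QGIS.
# Assumes a loaded line layer named "digitized_pipes" with fields
# "length_m" and "pipe_class" already present; all names are made up.
from qgis.core import QgsProject, QgsDistanceArea

layer = QgsProject.instance().mapLayersByName("digitized_pipes")[0]

d = QgsDistanceArea()
d.setSourceCrs(layer.crs(), QgsProject.instance().transformContext())
d.setEllipsoid("WGS84")  # measure on the ellipsoid, not in planar units

layer.startEditing()
length_idx = layer.fields().indexFromName("length_m")
class_idx = layer.fields().indexFromName("pipe_class")
for feat in layer.getFeatures():
    layer.changeAttributeValue(feat.id(), length_idx,
                               d.measureLength(feat.geometry()))
    # A legend-matching model would set this; hard-coded here for illustration.
    layer.changeAttributeValue(feat.id(), class_idx, "water")
layer.commitChanges()
```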
So, in terms of metadata that a user could plausibly extract from these maps automatically, a really simple example is data in a legend, but that's not really the exciting application. Data in a legend is pretty easy to extract, so it's not at the level where you're saving a GIS analyst or a GIS technician a lot of time. It's really about metadata that's hidden in these maps, and I'm going to continue with this subsurface utility example even though it's kind of misleading. If you were to daylight buried utility lines, it would be a more complicated process than that, and you wouldn't do it with just raster images, raster maps. But hypothetically, really important metadata to know if you were daylighting a particular buried utility line is the depth of that line, and also the type of soil it's buried in. These are important things for construction crews to know if they are going to delicately dig around a gas line without a dangerous utility strike, so that's a good example in the utility space. We also have users outside of architecture, engineering and construction. Another good example is mining and geology: if you are a geologist and you are dealing with an older geologic map that describes the deposits in an area, the metadata associated with those deposits can be pretty complex, especially the legend associated with it. It's not even about describing a solid-color region of the map, but rather matching the exact stylistic look of a polygon to a tiny swatch in the legend that says this is a sediment deposit of some type. It's metadata like that that we would want to extract automatically and actually embed into shapefiles or GeoPackages, such that these projects can be accomplished more quickly. Well, it looks like you have got your hands full for the next little while. More generally, are you surprised at the lack of AI being used in QGIS, for example, or in GIS tools, especially when we think of desktop tools?
For me, this is one of the few examples that I've seen out there in the world where AI is being embedded into the actual GIS tool itself, but maybe you've seen more, maybe you've seen less, I don't know. I'd really like to hear your thoughts about that. As an ML-oriented person, I think there's a big gap between what is possible in desktop GIS software and what is already happening. When people see large language models like ChatGPT and GPT-4, and a lot of the other major developments that have been happening, especially in video generation and audio processing, it's clear that pretty much all complex problems will eventually be solved by artificial intelligence, unless they relate to scarcity. Unless you're talking about material scarcity, literally the amount of food that's available on Earth, a lot of these problems can and will be solved by AI and ML. And I think it's only because we're at the beginning of this renaissance in machine learning that everyone's talking about it and almost nobody is implementing it. I think that's the best way to imagine how this renaissance is happening right now. Obviously OpenAI and the larger AI research companies are driving most of the innovation in terms of where AI is going.
Meta is actually a great example, having created Segment Anything and released a lot of this groundbreaking research publicly. But I think most of the opportunity that exists in machine learning today is actually in building domain-specific models and embedding them into professionals' workflows, and I think GIS is a really great example of this. You can take an LLM that has memorized the entirety of the QGIS documentation, plus every mention of GIS online ever, and that LLM will often still not immediately accelerate a professional's workflow. But once you understand that these large models are doing much more than generating text, that they're actually generating a semantic understanding of what you are asking them to do and then completing that task as a result; once you see that semantic understanding is what's really being accomplished here, you can embed semantic understanding into all programs. And then a program can understand your overall task just as much as you do. I think we will see that trend more generally within GIS, within CAD, within all of these desktop software packages. It's not just about text completion; it's really about augmentation.
If you weren't working on these problems of autocomplete for digitization, of georeferencing, of automatic extraction of metadata, and thinking again about working within and on top of the QGIS platform and solving problems for geospatial professionals who use it as their main tool, what problem would you work on? That's a great question. Pretty much all of my work so far has been around mapping the built world, or even the natural world. But an interesting direction for the geospatial world to move towards is less about mapping its current state and more about imagining what's possible. There are a couple of examples of startups actually doing this, where you take your conception of how the world is now, synthesize a way that the world could look in the future, and evaluate whether or not that version is better.
There are some really great advancements in how people are doing this to, for example, make the world a greener place, because geospatial software can literally find the optimal place to put a wind farm or a solar farm once you consider all the complexity that's associated with that. But I think it goes much deeper than real estate development. It's actually about evaluating, on a map, the ways in which the world is worse than it could be, and changing that in the future. And that's really magical.
A fun example of that is when you map out the street trees in a neighborhood. This is especially true in the United States, where you have large roads and not much shade, which don't really encourage pedestrian-friendly neighborhoods. You can actually use machine learning and statistics on geospatial data to find the best spots to plant large trees, create more shade, create a better place to walk around, and basically improve nearby residents' lives. So I think if I wasn't building AI autocomplete for this georeferencing, vectorizing and metadata extraction workflow, I would be working on changing what's actually in the map. That was a great answer. I think I'm going to need a few minutes to walk around the house after this interview and think about that. I appreciate it; you've really given me food for thought. This is also probably a great time to wrap up the conversation, and thank you very much for your time.
It's much appreciated. And thank you very much for your work. I think it's fascinating, I think it's really interesting, and above all, I think it's going to be incredibly helpful to a lot of people. So if people want to check out what you're doing, where is the best place they can go? Can they reach out to you? Is there a website? Where can we send them? Yeah, thank you, Daniel. If you're interested in seeing what we're up to, you can go to our website, which is buntinglabs.com, that's B-U-N-T-I-N-G labs dot com, and you can also follow us on Twitter at the same handle. Thank you so much for the opportunity to share what we're up to. No problem, any time. I appreciate your work, and I wish you all the best in the future. Cheers. Thanks.