You're listening to a Stage Talk titled Digging for Gold in Import-Export Data. In this episode, we're joined by my colleague and Bellingcat research consultant, Catherine de Tolly, who brings a year's worth of investigative learnings from exploring the hidden stories buried within global trade databases. Catherine walks us through key
insights she's uncovered, explains why this kind of data can be both powerful and perplexing, and breaks down a real investigation to highlight common gaps, blind spots, and pitfalls researchers are likely to encounter. You can find links to all the resources mentioned in the talk in the podcast description. This talk was hosted by me, Charlotte Ma, on Thursday the 4th of December 2025 in the Bellingcat Discord server. Hi, hello all. Thank you for coming to our final Stage
Talk of 2025. If this is your first time listening, you can find all previous episodes on our RSS feed or on podcast platforms by searching Stage Talks with Bellingcat. I'm sure one of our lovely mods can pop that in the chat. We talk about many different topics associated with open source research, from covering conflict and mapping environmental damage to building ship tracking tools. Today, though, we're diving into financial
investigations. Catherine de Tolly, Bellingcat consultant and general amazing researcher, is here to share her tips and tricks for sifting through import and export trade data. From finding the elusive information to sorting out what really matters in all of the numbers, listen in as she
shares her tried and tested know-how. Catherine has worked on major financial investigations: uncovering who is behind one of the world's largest deepfake porn sites, and tracking online ads for dangerous drugs to Chinese trade market sites and social media. Whilst we talk, you can place your questions in the chat, accessible on the right-hand corner of your screen. But please remember that this is being audio recorded for
the podcast. So if you don't want me to mention your username, please add that as a note in your question. You may have noticed that our cameras are off for this particular talk as well. Please don't panic. That's on request of the speaker. But we will be sharing screens, so just be prepared to also follow along as Catherine shares her slides. Okay, Catherine, over to you. Fantastic. Hello, everyone. It's wonderful to see so many
people here. I didn't know that so many people were going to be interested in this obscure area of trade data. And certainly I found when I started using trade data that it was quite hard to find people that knew about it. So I'm hoping that some of you might be experts, and you're welcome to pop notes in the chat while I'm talking if you think I'm talking rubbish or if something isn't clear, because this year has really been an area of exploration for me and
as always I'm learning. So I am going to switch to sharing my screen and I will start my presentation. So I'm talking about what you can dig for in trade data. So trade data is really import-export data, and it's pretty obvious that when goods leave a country or come into a country, then
data is collected. For instance, because of the kind of legal processes around shipping, who's liable for what, when, because of customs duties and taxation, because governments need to know, you know, when goods have left a country, that kind of thing. So that's really what trade data is. What I'm showing on my screen now is a generic version of what's called a bill of lading. So bills of lading have been around for donkey's
years, many, many, many decades. And they're really used to capture data related to a particular shipment. So there'll be things like the name of the exporter, the name of the consignee or the importer. There'll be a description of the goods, the value of the goods, which country the goods are going into. These bills of lading are not necessarily standardized between countries, but they do tend to capture wherever it's going
to. So I want to really explain to you why use it, because I felt like a bit of an outlier at Bellingcat when I started using trade data, because nobody else really used it, and I thought, okay, well, let's look at other examples of investigations where trade data has been used. This is one of my favorite stories and it really inspired me and I would encourage you to go and look it up.
It's from the BBC. What they did was they had a tip-off that Indian pharma companies were sending a very dangerous opioid across to West Africa.
And what they did was they looked at trade data, so publicly available export data, and they were able to find a particular company, Aveo Pharmaceuticals. There were other companies, but Aveo was sending a lot of these drugs across, and they then sent an undercover journalist in to interview this charming man who explained his, let's say, lack of ethics around exporting
dangerous opioids to West Africa. So the trade data here was really useful for them because they were able to see, well, who in India is exporting these drugs in bulk to West Africa? And then they could use that information to deepen their investigation, to take the next step. Another example is the New York Times, and they used trade data along with other sets of data. They had a tip-off that Boeing parts were making their way to sanctioned Russian airlines, which they
should not have been. And they were able to use trade data and this other data to then track where these goods were going, because they were obviously not going straight from the US to Russia. They were going from the US to the UAE to somewhere else. They were using corporate registries, a whole bunch of different data, but they were able to show that effectively sanctions were being busted with Boeing parts, which is, you know, that's pretty big. The next one is using
another set of data, which is Comtrade. That's a UN system that countries all submit their own data to. So they submit their annual trade data to the UN, who then makes it available in a system called Comtrade. It's different from the kind of trade data that I'm talking about that I'll show you in a minute. This is more categorized data. If you want to be technical,
it's HS code level data. So you won't be able to see specific shipments, but you'll be able to see that a whole bunch, in this case, it was really cigarettes going into Mali. And they would then compare that. They could see on Comtrade the volumes that were going into Mali, and they compared that to local demand for cigarettes and local production of cigarettes. And then they could see that a lot of cigarettes were being sent from Mali into the Sahel region, and
this was not legal. But again, they used the trade data to help them kind of kick off their story and quantify things. But it wasn't the end point. And I'll keep saying this, that the data is not the story. The trade data helps you to get insights into a problem that you're investigating, but it's obviously not going to tell you everything. So what I'm going to move on to is to talk about
an investigation that I've been busy with. And what happened was I started looking at health and Africa, which is obviously a very broad topic, but I'm from South Africa and I really wanted to find an Africa-relevant story. And then I happened to read a UN report which was on drug trafficking, and it mentioned a particular drug that they said was taking over from other drugs that were now more heavily regulated by India. And I thought, oh, that's pretty interesting.
They were saying, hmm, it looks like the exporters, the Indian companies might be shifting to a different drug. And I thought, okay, well, let me have a look. So I went and I used free trade data. So the providers that make this data available, like Import Genius, Export Genius, Volza, there's so many of them. Some of them make little bits of free data available online. And I was then able to, I mean, this was literally manual copy and pasting, but it was a way for me to learn
and see, okay, what the UN is saying about this particular drug, tapentadol: is it actually being exported from India to West Africa? And I thought, okay, just use free trade data. And then I could pick up some patterns. Like I could see supplier names and I could dig into them a bit. I could see buyer names. I then went and did some other research and I was like, oh, well, that's very interesting. This drug is not legal in Ghana. How can it be that it's being exported
from India to Ghana, yet it's not legal. So I started with free trade data and manual copy and paste. Now, as I mentioned, there are many providers. They have vastly different prices. So some of the top-end ones like Panjiva, I think, are like $12,000 a year. Others will provide it free. Like ImportYeti, if you are doing US data, you can get free access to data if you're a researcher, like an open source journalist. Import
Genius provides free data as well. We didn't know this at the time that I started needing data, so we bought access to 52WMB for $1,000 a year. So I'm going to just quickly show you, so that it looks a little bit more real, what I'm talking about when I talk about using a trade data source. You can see this is a pretty standard kind of UI where you choose your country, you choose that you want export or import data, you choose your date period, and then here I
happen to put the drug that I wanted. You can also search by supplier name or buyer name, depending on the country, because, as I said, different countries capture different data on the bills of lading. So what I did was I then did searches within 52WMB and I downloaded the data for tapentadol, and it ended up looking something like this. So this was where I consolidated tapentadol exports from India to West African countries.
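
As a rough illustration of the consolidation step described above, here is a minimal Python sketch that merges several CSV downloads into one list of rows and drops exact duplicates from overlapping searches. The column names (date, exporter, destination, quantity) and all values are invented for the example; they are assumptions, not the actual 52WMB export format.

```python
import csv
import io


def consolidate(csv_texts):
    """Merge several CSV downloads into one list of rows, dropping exact duplicates."""
    seen = set()
    rows = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # Strip stray whitespace from headers and values.
            clean = {k.strip(): (v or "").strip() for k, v in row.items()}
            key = tuple(sorted(clean.items()))
            if key not in seen:
                seen.add(key)
                rows.append(clean)
    return rows


# Two hypothetical downloads with overlapping date ranges (made-up values).
download_a = """date,exporter,destination,quantity
2024-01-05,Example Pharma Ltd,Ghana,1000
2024-02-10,Sample Exports,Sierra Leone,500
"""
download_b = """date,exporter,destination,quantity
2024-02-10,Sample Exports,Sierra Leone,500
2024-03-15,Example Pharma Ltd,Nigeria,250
"""

rows = consolidate([download_a, download_b])
print(len(rows))
```

One caution with this design: dropping exact duplicates would also drop two genuinely separate shipments that happen to have identical fields, so where a provider supplies a bill of lading ID or declaration number, deduplicating on that field is probably safer.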
And being a bit of a data monster, this made me really, really happy because it was, yeah, I'll show you my favorite picture. The cookie monster got her data. And finally, instead of manually copying and pasting into a sheet, which was never going to scale, I could actually see proper amounts of data and I could start to play with it. So as always, and I'm sure many of you know this, when you get access to data, it is not perfect. That is not the nature of data.
So I had to do things like clean the data and it was quite basic things like standardizing names, for instance, company names. Because a lot of the data is actually captured manually, the bills of lading are paper, and then some poor clerk at a customs office at a port of exit will be capturing the data from this bill of lading. They can make mistakes, or the data on the bill of lading could just be written wrong. The company name could have a little mistake
in it. So I had to do things like that to standardize names so that I could then do analysis on the data, and I'll show you the analysis in a minute. I had to do things like, you know, tidying up column formats, which sometimes threw my numbers off, but it's okay. We all know how to do these
things if you know anything about data. And what it enabled me to then do was to make statistics, and this is where the cookie monster gets really happy, because this is when things get interesting. Because to me data talks. Data tells you a story if you are able to go in and do some analysis and have a look at what the data says. So I used what I think are pretty basic formulas. You can probably see here, it's a simple SUMIF. I did
the odd SUMIFS with multiple criteria. I used COUNTUNIQUE, COUNTUNIQUEIFS, that kind of thing. So really not rocket science. It might be that some kind of AI could have done this for me. I'm just kind of a little bit traditional with my data and I had to do it myself because I sometimes found as I applied my formulas and thought about how I was going to do things that I would get other ideas and so on. One day, I will get my head around using AI for data analysis.
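
For anyone who prefers a script to a spreadsheet, the cleaning and the SUMIFS / COUNTUNIQUEIFS style calculations described above might look roughly like this in Python. It is a minimal sketch: the field names, the list of legal suffixes to strip, and the sample rows are all assumptions made for illustration, not the data or the exact formulas used in the investigation.

```python
import re


def normalize_name(name):
    """Collapse case, punctuation and common legal suffixes so name variants match."""
    s = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    words = [w for w in s.split() if w not in {"ltd", "limited", "pvt", "private"}]
    return " ".join(words)


def parse_number(text):
    """Turn '1,200.50' or ' 300 ' into a float; blank cells become 0.0."""
    text = text.replace(",", "").strip()
    return float(text) if text else 0.0


def sum_ifs(rows, field, **criteria):
    """Rough equivalent of SUMIFS: sum `field` over rows matching every criterion."""
    return sum(r[field] for r in rows if all(r[k] == v for k, v in criteria.items()))


def count_unique_ifs(rows, field, **criteria):
    """Rough equivalent of COUNTUNIQUEIFS: distinct values of `field` in matching rows."""
    return len({r[field] for r in rows if all(r[k] == v for k, v in criteria.items())})


# Made-up rows with the kind of inconsistencies manual data capture produces.
rows = [
    {"exporter": "Example Pharma Pvt. Ltd.", "destination": "Ghana", "value_usd": "1,200.50"},
    {"exporter": "EXAMPLE PHARMA PRIVATE LIMITED", "destination": "Ghana", "value_usd": "800"},
    {"exporter": "Sample Exports", "destination": "Sierra Leone", "value_usd": " 300 "},
]

for row in rows:
    row["exporter_clean"] = normalize_name(row["exporter"])
    row["value"] = parse_number(row["value_usd"])

ghana_total = sum_ifs(rows, "value", destination="Ghana")
ghana_exporters = count_unique_ifs(rows, "exporter_clean", destination="Ghana")
print(ghana_total, ghana_exporters)
```

Stripping legal suffixes is a blunt instrument: it can merge two genuinely different companies with similar names, so any automatic match is worth checking by eye before it feeds into totals.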
So when I applied my formulas to the data that I downloaded and consolidated, the story started to emerge. So for instance, you can see here that, okay, looking at the percentages, over 80% of the exports of this particular opioid from India to West Africa were going to Ghana and Sierra Leone. Those were noted down as the destination countries. So that's interesting.
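
The percentage breakdown by destination described above can be sketched in a few lines of Python. The field names and quantities below are invented purely to show the calculation; they are not figures from the investigation.

```python
from collections import defaultdict


def shares_by(rows, key, value):
    """Return each group's percentage of the grand total, largest first."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r[value]
    grand = sum(totals.values())
    if grand == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {k: 100.0 * v / grand for k, v in ranked}


# Hypothetical shipments (made-up quantities).
shipments = [
    {"destination": "Ghana", "quantity": 500.0},
    {"destination": "Sierra Leone", "quantity": 300.0},
    {"destination": "Nigeria", "quantity": 150.0},
    {"destination": "Ghana", "quantity": 50.0},
]

shares = shares_by(shipments, "destination", "quantity")
for country, pct in shares.items():
    print(f"{country}: {pct:.1f}%")
```

The same function works for a top-exporters table by passing the cleaned exporter name as the key instead of the destination.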
The data is then telling you, okay, you might really want to look at Ghana and Sierra Leone, because those are kind of two of the, in inverted
commas, problem countries. When I did sums for the top exporters, you can see here that, okay, there are some bigger exporters. We decided to highlight the top three because, you know, there were so many different exporters from India that I wasn't going to go through and look at all of them. So we said, okay, let's look at the top three. And again, the data tells you, I feel, where to go and look next. I hope that makes sense. So that's, yeah, that's really what
I, the little cookie monster, did with the data: downloaded, cleaned, and analyzed. And again, the data is not the story. The data is telling you where to go in your story. It's suggesting to you areas that you might want to investigate next, but it's not the whole story. I do just want to point out something else, and this is where we start getting into the mess of trade data. I was using 52WMB data
and then we got access. We found out that in fact we could get free access from Import Genius. So Import Genius gave us access to another data set and you'll see that there are some pretty different numbers here. And I got pretty nervous. Like, you can see here, this column is my calculated differences between Import Genius's data and 52WMB's data. And at that point, I got pretty scared. And our editor also got pretty scared because he was like, wait a second, you can't
keep going with this story. This data is terrible. I really then had to look into, well, how can this be? I mean, part of me instinctually knows that when you're dealing with databases with millions, if not billions of records, if you think of all of the shipments globally over, I don't know how many years, that's a lot of data. Things are going to get chaotic because data is never perfect. I think that's the nature
of data. But also, I had to really look into, well, how can it be that I've got two different sources with different numbers? And then luckily, I made contact with William, who is the head of research at ImportGenius. And he was explaining to me that, look, the nature of this data is that it's imperfect. And what you need to do is you need to make sure that the data is basically
going in the same direction. So if, for instance, 52WMB had been showing me all these numbers, yet Import Genius had nothing or very, very little, then there's likely to be a problem and I need to look into it further. So these pieces of data were different enough, but they weren't such that one needed to say, okay, this has to stop. So to be honest, I'm still a little bit uncomfortable with this, but looking into it further, it became clear that this is just the nature of the data.
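
One way to make the "same direction" check described above a little more systematic is to compare per-country totals from the two providers and flag only the large gaps. This is a minimal sketch: the 50% tolerance is an arbitrary assumption (no threshold was given in the talk), the ranking check is just one possible reading of "same direction", and the numbers are invented.

```python
def compare_sources(a, b, tolerance=0.5):
    """Compare per-key totals from two providers.

    Returns tuples of (key, total_a, total_b, relative_difference, flagged).
    The relative difference is measured against the larger of the two totals,
    so a key missing from one source scores 1.0.
    """
    report = []
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key, 0.0), b.get(key, 0.0)
        larger = max(va, vb)
        rel = abs(va - vb) / larger if larger else 0.0
        report.append((key, va, vb, rel, rel > tolerance))
    return report


def same_ranking(a, b):
    """True if both sources order the shared keys the same way, largest first."""
    shared = set(a) & set(b)

    def rank(totals):
        return sorted(shared, key=lambda k: totals[k], reverse=True)

    return rank(a) == rank(b)


# Hypothetical totals per destination from two providers (made-up values).
provider_a = {"Ghana": 1000.0, "Sierra Leone": 600.0, "Nigeria": 200.0}
provider_b = {"Ghana": 800.0, "Sierra Leone": 550.0, "Nigeria": 20.0}

report = compare_sources(provider_a, provider_b)
flagged = [key for key, _, _, _, is_flagged in report if is_flagged]
print(flagged, same_ranking(provider_a, provider_b))
```

In this made-up example the two sources agree on the overall picture but disagree badly on one country, which is the kind of gap that would warrant a closer look before quoting any number.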
And what I'm going to do now is actually take a step back and say, well, where does this data come from? So I've shown you a picture of a bill of lading, and basically the government bodies, like customs offices, they capture the data in the bills of lading into some kind of IT system. And in a lot of countries in the world, that data is still, the bills of lading are still paper. They're not digital. India has actually gone quite a long way to digitizing. I think
Singapore has completely digitized. The US is starting to digitize, and I think it's at various ports. There are other countries too. The EU has a system. I don't know how well it's been implemented. The UN also has a system that they make available to various countries where it has been implemented. But basically what I'm saying is that the data is in kind of different formats, different databases
for different places. Then what happens is that online trade data providers buy it as well as
brokers. So the online trade data providers are like Import Genius or Volza or, you know, any of those other providers, Panjiva, etc. But then there are also these brokers, and those brokers and providers combine the sources where they get all the data, they clean it and present it online. And this is where I used AI, to try and present some of what I was perceiving to be the chaos around trade data. So you can see at the top here that there are different
data sources, and those data sources will be like a particular port office, or a particular country, or a particular province, their trade data. They will then sell this to data brokers. These are often regional data brokers, like you get Asian data brokers, American data brokers, et cetera. And then you get the online data providers like Panjiva, Import Genius, et cetera, who either get the data from data brokers or sometimes they
buy it from the original sources as well. Now at each of these stages, the data gets cleaned and it gets put into databases. And to me that partially explains why there are differences, and I could see it. And when I say differences, I mean differences between sources. I could see that sometimes one provider or one source would round numbers and another wouldn't round numbers. Or they use different exchange rates, for instance, when they convert data into US
dollars. So there's a whole bunch of things going on with the data, which I think explains well enough that there's going to be differences between your sources. But I will say that we've been quite hesitant to quote specific numbers. And William of Import Genius did say to me that, look, if you are going to write a story based on one trade that you can see in the shipping data, he really recommends that you go and you try and corroborate that one trade somehow through
interviews, through other means, I'm not sure
what, to make sure that that trade did happen. You know, when you aggregate, it's not as much of a problem. But when you're looking at a single trade you're going to really need to be much more careful, because of what I'm showing here, the sort of chaotic nature of things. So in summary... just by the way, on my story: we're still busy writing it and I think it'll come out probably in early January. But in my story, the trade data was really
key because it told us where to look. We could see that the trade was happening, that huge quantities of this opioid that is not legal in West African countries are being exported from India. And we could see that it was a whole range of countries that were receiving it. It was actually great. I had a conversation with the journalist who was one of the lead researchers on that fantastic
story that I showed you from the BBC. And she told me that when she was writing her story, she was looking at the trade data and she was researching and she thought, I'm going crazy. I must be going crazy. This can't be happening. How can it be that an opioid that is not legal in West Africa is being exported from India to West Africa and yet it's being shown in the trade data? And it was really good to hear that from another researcher from the BBC, because I've
also been feeling like I must be nuts. How can this be happening? But it is being shown in the data. So in summary, as I said, trade data records imports and exports. It's messy, but you can find things in there. There are some free sources. Import Genius will give you access if you are, like, an open source researcher or a journalist. They won't just give you access to the whole database. They tend to want to narrow it down a bit, which I understand,
but it's great and they're very helpful. It can be the start of a story and it can inform your story, but you're going to have to clean, you're going to have to analyze, but that can be really, really useful in investigations. If you want to talk some more, please contact me. As I said at the beginning of the talk, I'm learning. I'm not an expert. I'm learning about this. I found it really useful in my investigation, but I know that there are people out there who probably
know a lot more than me. So yeah, there's my email, Katherine at consultant.bellingcat.com. And please get hold of me if you'd like. Thank you, Catherine. And done. That was amazing. Thank you. One, thank you so much for taking us through the mess that is trade data. And two, thank you for using a live example as well to really demonstrate the point. We've got a few questions already in. I wanted to first ask you, because you started
to clarify it a little bit. People were asking, well, how do we know that this data is real, and how to trust the platform which is providing it? You started to explain that and how you came to the same kind of realization. But how do you first come across those platforms? How do you know that what you're on, for example, 52WMB, is a legitimate trade platform and not something where there's just a bunch of numbers? How did you initially come across those platforms
first? Yeah, it's a good question. You know, we wanted to use somebody fancy like Panjiva or Lloyd's or S&P, but we just could not afford it. So we had to go, kind of in inverted commas, the cheap route. We just didn't have a choice. And then when we got access to the Import Genius data, that showed us that even though the numbers weren't exactly the same, and sometimes they differed quite a bit, the data was essentially saying the same thing. So I think that if at
all possible, you try and get two sources. And I don't think it's in anybody's interests to just make up a whole bunch of trade data and sell it, because somebody is going to figure out that you're just selling rubbish. Like, there's too many fields in there and there's
too much data. I can't see how anybody would just make it up, even though it's imperfect. But as I said, two sources, I think, is a really good idea, and it could be that for the second source you use the free data like I was showing you, which was where I started. You might have to do that. Yeah, I think for many people listening who tend to be freelancers or people who are working in research as side jobs, that's something that probably will be the step
to take. Tied to that, along the same lines, somebody asked about the bill of lading and whether it was scanned. You mentioned that in some countries it's digital and some countries it's not. Within the trade databases, do you have access to the bill of lading or is it just something that you have to accept exists in the background and you don't actually see it? You kind of have to accept that it exists. Can I share my screen again quickly? Okay. So let me
do that. So I'm just going to show you in 52WMB, you don't get to see the original bill of lading. But for instance here, when I'm clicking on one particular shipment, it's showing me the different fields that I'll see in the bill of lading. But this is basically what you'll see when you download the data. So you'll see this same data in sheet form, which is what I got when I consolidated everything into this sheet. You'll basically see that when you download, but you'll see it
all consolidated in a CSV or an XLS. You'll see bill of lading IDs, although I found some providers give you that, some don't, but I don't know of any place where you can go and double check bill of lading IDs against some sort of central place where you can check the bill of lading IDs. I saw one mention, and I honestly can't remember which provider it was. I saw one that said that they provide the original bill of lading. Yeah,
one of them said that. I know that Import Genius said to me that in the US, where they had to digitize, where the data wasn't digitized yet and everything was still on paper, they were actually sending their staff to the port with a scanner to scan bills of lading, which sounds like the worst job in the world, but somebody's doing it. After the fact, you can probably find an example of a bill of lading online or I'll attach an image
example to the podcast description. Regarding the bill of lading, that's a legal document, right? You mentioned that sometimes the bill ID isn't there. Is that just based on the database's preferences, or do you often find that sometimes the bill of lading isn't filled out completely and there's data gaps there? No, I mean, look, I'm not an expert to say whether bills of lading ever don't have
IDs. I'm sure they must have to have an ID, but I found that different providers would provide the bill of lading ID or not. And like 52WMB, when you download the data, in fact, the bill of lading ID isn't there, but the declaration number is there. I don't know why. I don't know what the declaration number is. But yeah, there are weird things going on that I don't really
understand. William from Import Genius did at one point say to me, well, you know, to analyze trade data, it really helps if you've got like the level of knowledge of a customs officer, which quite obviously I don't have, but I've literally gone and read the user manual of the EDI system, which is the system that they use in India because they've gone quite far into
digitizing. So that system is used at some customs houses and I've read that user manual because I was just trying to understand the data that I'm using that the Indian port authorities then sell on to brokers, on to data providers. Where is it being captured? What does the system look like where this data is being captured? Just to try and make myself feel a little bit more reassured about what it was that I was looking
at. Yeah, I had a similar issue when I looked into tobacco companies and was investigating that. A lot of their reports used a kind of lexicon and words that weren't familiar to me from outside of the industry. So I ended up reading through their entire catalog of company reports and company training, so that I could understand the vague references in their data, which, you know, sometimes you have to do, and just commit to it. Gclef asks,
and this Fabian, sorry, go ahead. Oh no, I just wanted to mention that for any of the other data heads out there, I dreamt of finding a data provider that had a data dictionary for their data, which then showed you, you know, what are the definitions for each field? What's the field format? And I couldn't find that anywhere. And that really frustrates me because I feel like as a data provider, they should do basic things like, you know, at least explain what I should expect to find in
the data. But anyway, that's my dream. I didn't find it. If anyone's listening, maybe that's a task to do. Just for Catherine's sanity, at least. Gclef asked, does the shipping data show the ship carrying the goods? I'd be curious to see if the ship actually went to where it claimed or ended up somewhere else. Is that kind of data available? I didn't see that. Sorry, I'm interrupting. I didn't see that anywhere. It would have been
lovely. Didn't see that. That's a shame, because our ship tracking friends in the server would have loved that. Yeah, I just want to add a little story there and it's kind of tangentially related. I was talking with a Ghanaian pharmacist and I was telling him about the data and he was saying to me, well, what are the shipping dates? Give me the most recent dates that you're seeing because he knew how long shipments took from India to
Ghana. And then I told him some of the most recent shipments of tapentadol, and he was like, okay, great, that's good. Okay, I'm going to tell the authorities that they need to look out for that shipment. And that didn't include the shipper name, but he could at least tell them that goods had left, a certain quantity of goods had left India on a particular day and that they needed
to look out for these drugs. And I can't tell you if anything ever came of that, but I found it quite an interesting use case for the data. Yeah. Yeah, that is interesting. Quite a few people in the chat have been asking, quite surprised, that the trade data explicitly identifies a drug that's being shipped to a location where it is illegal to do so. That is 100% the case. It totally blew my mind, it made no sense, but that's exactly what the BBC journalist
was saying. She said she thought she was going nuts, that there must be something that she's missing. With the drug that I've been looking into, tapentadol, as I said, it is legal in some places, and it's legal in India, for instance, but in its instant release dose it's legal up to 100 milligrams. But in the trade data, the dosages being sent over were like 200, 250, 300 milligrams. So those were even dosages that weren't allowed in India. And
it says that in the product description. So I can't explain to you why this is all being disclosed in the trade data, but it's there. And with the BBC story, in fact, that combination of tapentadol and carisoprodol wasn't legal anywhere in the world. Yet, if you looked at the trade data, all of that was in the data. Someone's asked for a link to the BBC story. I'll pop it in the chat in a second. How did no one else sound the alarm bells on this if it's so blatantly visible
that it's illegal? Some of the comments coming through. Yeah, it's a surprising story. Hopefully Catherine gets the opportunity to write it up. Yeah, I don't know. I've really had to delve deep into the Indian regulatory environment because there could be some little regulatory glitch that is allowing this to happen. It should not happen. Drugs should not be sent to countries where they're not registered. But there could be some kind of glitch in the system where we're
not sure. I wanted to ask, because you mentioned that obviously it's legal in some countries and not others. With the trade databases, let's just step away from the story itself a little bit: do they cover all global trade or are there some restrictions, and are countries comparable as well? As you mentioned, lots of different countries have different ways of filing things, so is the country trade information also comparable? You mentioned different currencies, for example,
and things like that to be aware of. Yeah, look, I will start by saying that I've focused very heavily on India, so I know India the best. What I have found though, looking at the different data providers, is that their coverage of countries is very different. And a lot of them provide what's called mirror data. So mirror data is where they can't buy the actual, from the source,
the trade data. So then what they do is they say, okay, we can't get the trade data for Western Sahara. But what we're going to do is we're going to look at all the other countries for which we have data and look at where they've exported to Western Sahara, or where they've received goods from Western Sahara. And then they kind of build up the data that way. But you won't find any provider that can give you every country's
data. It just doesn't happen. And in fact, not all of them disclose it. I saw an article the other day, actually by one of the trade data providers, who said that 200 countries make their trade data available. But I have not found a provider that provides all of that data. And certainly, Import Genius says that they make available all of the data that actually can be put online that they could get their hands on.
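
The mirror data idea described a moment ago, building up a picture of a country's trade from its partners' records, can be sketched as follows. This is only an illustration of the concept under simple assumptions: the record fields (origin, destination, value) and the countries and values are hypothetical, and real providers' methods are not documented in the talk.

```python
def mirror_imports(export_records, country):
    """Estimate a country's imports from other countries' export declarations.

    Returns total declared value per exporting (origin) country.
    """
    totals = {}
    for rec in export_records:
        if rec["destination"] == country:
            totals[rec["origin"]] = totals.get(rec["origin"], 0.0) + rec["value"]
    return totals


def mirror_exports(import_records, country):
    """Estimate a country's exports from other countries' import declarations."""
    totals = {}
    for rec in import_records:
        if rec["origin"] == country:
            totals[rec["destination"]] = totals.get(rec["destination"], 0.0) + rec["value"]
    return totals


# Hypothetical export declarations filed in countries whose data is available.
export_records = [
    {"origin": "India", "destination": "Country X", "value": 100.0},
    {"origin": "India", "destination": "Country Y", "value": 40.0},
    {"origin": "Spain", "destination": "Country X", "value": 60.0},
    {"origin": "India", "destination": "Country X", "value": 25.0},
]

print(mirror_imports(export_records, "Country X"))
```

A mirror estimate can only be as complete as the set of partner countries whose data is available, so trade with partners that also do not publish data stays invisible.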
So to be honest, I don't know which countries make their data available and which data is good enough quality to be made available. But all you can do is look at the different providers and see which countries they cover. We've got somebody in the chat who said, our team traced the dark fleet of oil and built an AI for compliance for this to support Ukraine. The architecture of how the routes were set up in the crypto systems connecting
to port authority was fascinating. And then someone else has put, I work in international trade and from my experience, customs officers do not necessarily check whether something is legal to import. But they rather focus on the accuracy of the documents, like whether the actual weight matches the documents and whether all items are declared properly. However, this is limited to
Eastern Europe. This is something that we found when looking at the sanctions on Russian ships out of occupied Crimea when we were covering the grain shipping. We also found that ships had gone through checkpoints, but what was actually in the ship wasn't necessarily checked as thoroughly as it should have been.
Thanks for those personal stories. Please keep them coming in and if you have any tips for Katherine as well as she mentioned, because she's diving into this newly as well, please do pop them in the chat if you've worked with them previously and thank you to the person who shared the BBC article in the chat as well. Somebody's asked, have you found that the choice of HS codes makes
a difference to the accuracy of data? When a new product is created or when a product enters a new market, at the exporter there's usually a process of figuring out exactly which HS code is appropriate. There is some leeway in choosing which HS code applies, so things can be miscategorized. Have you found that the choice of HS codes makes a difference? No, you know, because I don't know if you remember when I showed you the 52WMB UI where I did my searches. I didn't have to use
an HS code. I could search by the opioid's name. So in fact, it didn't matter what HS code it had been allocated to. When I looked through the data, I found that it was very consistent. It happened to be that the HS code that it was allocated to was always the same one, but the HS code did not affect my investigation. When I showed you that Mali one, the cigarettes one that was using Comtrade data, there you have to do it at HS code level because that data is
aggregated to HS code. But there I think cigarettes must fit under one HS code quite neatly. With my story, if I could only get access to HS code data, I actually wouldn't have been able to do it, because there are so many different drugs and medications, you know, that are put under one HS code that it wouldn't have been viable. Thanks for that. We've got a few more minutes for questions. So if you want to
pop some in the chat, please do. I wanted to ask specifically about project management because as soon as you showed that spreadsheet and went through all of the data, I know you said that you love it and that it's so much fun, but actually I put in the chat that it gives me a headache
and makes me want to cry. For anyone who gets a little bit overwhelmed by large data sets, have you got any tips or tricks just to get in the right mindset to kind of deal with that amount of data when you're trying to sift for a needle in a haystack, for example? Oh, if you're asking me that question, I... I mean, literally that blue Cookie Monster that I showed you, I've had a T-shirt made with that Cookie Monster and it says, me wants the data. That's how much I
love data. So I feel like I don't necessarily have tips because I'm a bit of a crazy person. So I'm probably not like other people. I think just don't be scared of data. It's basically bits of text and numbers and rows and columns. Particularly if you're using Google Sheets, if you're using the filters, the filters that you can apply in that top row, even just getting to know the data through the filters and you get to understand the data and what's there,
that can be a good place to start. Thank you. I appreciate that. Someone has put in the chat, the answer is give it to someone else. And maybe the answer is give it to Katherine at this point. It's interesting that you love data that much. I can see the value in the numbers, I swear. Someone asked earlier in the chat, when you were talking through your Excel spreadsheets, about whether you've used coding, whether you've used
Python to sort through data before. Because obviously you mentioned AI, there might be an AI tool out there, but you like to do things manually. Have you ever experimented with using code to also kind of sift through data, large pieces of data? No. I tried to learn to program in R and I started a course twice and I was just so rubbish. It was embarrassing. And I think I feel really comfortable in Sheets because I've been using them for years. At some point I'm going to spend the time to
learn OpenRefine to do data cleaning. I didn't need to use OpenRefine, or I felt that I didn't need to use OpenRefine, to clean this data because it wasn't actually that bad. And I could basically use search and replace to clean my data. But at some point, especially with a much messier and bigger data set, I would need to learn something like OpenRefine to clean my data. But we also have coders in Bellingcat who dream
in Python, whereas I don't. I probably have nightmares in Python. So I know I could always just rely on them to do that for me. We've got a few people in the comments saying just that: I've written Python scripts that can do the cleansing super fast, as well as deduplicating. And somebody else put, LibreOffice and Microsoft Excel now support automating commands through Python, and that functionality can come in very handy
for this task. So there you go. Maybe you should dive away from Google Sheets for a second and explore Microsoft Excel. I think Owen, I need to grow up. As we're shifting towards the final parts of the stage talk, I wanted to ask you, because we've mentioned repeatedly that data isn't the story, right? The data is a starting point. It's a place to realise the story maybe.
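As an editorial aside on the cleaning discussion a moment ago, here is a minimal Python sketch of the search-and-replace style of cleaning plus the deduplication that listeners mention in the chat. Everything in it is an assumption for illustration: the rows, the company names, the column layout and the helper names `normalise` and `dedupe` are hypothetical and are not taken from the investigation's data.

```python
import re

# Hypothetical rows: (exporter name, product description, quantity).
rows = [
    ("ACME PHARMA PVT. LTD.", "Tapentadol  tablets 250mg", 1000),
    ("Acme Pharma Pvt Ltd", "tapentadol tablets 250 mg", 1000),
    ("Other Exports Ltd", "Paracetamol tablets 500mg", 500),
]


def normalise(text):
    """Apply the same kind of search-and-replace fixes you might do in a sheet."""
    text = text.lower()
    text = re.sub(r"[.,]", "", text)  # drop punctuation that varies between filings
    text = re.sub(r"(\d)\s+(mg|ml|g)\b", r"\1\2", text)  # "250 mg" becomes "250mg"
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated spaces


def dedupe(records):
    """Keep the first row for each normalised (exporter, product, quantity) key."""
    seen = set()
    unique = []
    for exporter, product, qty in records:
        key = (normalise(exporter), normalise(product), qty)
        if key not in seen:
            seen.add(key)
            unique.append((exporter, product, qty))
    return unique


cleaned = dedupe(rows)
print(f"{len(rows)} rows in, {len(cleaned)} rows out")
```

One caution with this approach: two genuinely separate shipments can look identical on a few columns, so in practice the key would probably need a date or a bill number before any rows are dropped.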
What would be your next steps if you find, as you said, you find this data that is a little confusing in terms of consistency, but tells a coherent narrative. Where do you go from there? Are you now looking at doing those interviews that you mentioned to corroborate it? Do you have to step away from the OSINT angle or are there OSINT methods that you can take it further with? You know, for me, I did more. I wouldn't even say it was... I don't know if it's necessarily
OSINT. I basically just researched. Like I've got quite a background in healthcare, so it made sense to me, when I'm seeing an opioid that's being exported from one country to another, to go and look at the regulatory environment in the importing countries and try and understand, okay, is this actually legal to export there? And so there wasn't really other OSINT research
to do. There was just kind of basic desktop research to try and make sense of what looked like something that's impossible. Because we could see, like if you look, for instance, in Ghana, tapentadol, and then it's had different opioid forms. There have been other opioids that have been used before. On the street, the drug is called Red. And then I could see on social media that there were posts about it. There they call it
War Name Red. And there was a song, War Name Red. And I could see videos like on TikTok and on YouTube or on Facebook of users and the packs of the pills. And then I could see other local journalists' investigations of War Name Red and them talking about the problem of the addictions in their country. I would say that's... I guess that's OSINT research, it's using social media, but it was really to try and understand what I would call the demand side of the problem.
So the countries in which the drugs are being used, and social media was very useful for that. Yeah, you're looking at the impact of that trade and why it matters that you're even reporting on these numbers as well. Cybers put in the comments, that's a good point, linking social media content along with the trade info. You could also, right, dig into the suppliers that you found and check whether they also supply legitimate things as well. I'm guessing they do. I'm guessing that they're
large suppliers in India at the minute. Yeah, in fact, we did do that. We're just not sure whether we can publish it or not. There we did use actually some more traditional OSINT kind of digital footprint tracing to try and figure out some of the suppliers and turned up all kinds of interesting stuff that I can't really talk about now because I don't know, you know, whether we can publish on it or not. But it was very interesting looking at the suppliers, a whole
range of them. And yes, some of them do actually supply perfectly normal drugs to countries, you know, because India, they call themselves the pharmacy of the world. And it's not like the whole of the Indian pharma industry is bad. It's just certain companies are wantonly exporting drugs that are ruining lives in Africa. And you said you focused on Africa because you wanted to cover the impact local to you, but have you also found that these shipments are going
to Europe or to the US? Are there ways to track on these databases if the same shipments are going to other locations as well, or are you stuck searching one or two regions at a time? No, well here I was looking at India export data. So everywhere that India was exporting this particular opioid to, I could see, and there is more to be done. Yeah, there are other countries
that I would love to look into. It can be quite difficult with some of the other countries, though, depending on how much information they make available online. Although, in fact, it was possible with West Africa. Most of the countries are quite good about disclosing, for instance, which drugs are legal in their country. Other countries do
not. So I haven't yet looked through the data to look at whether I can do further investigations, but I can see that strong dosages of this particular opioid are being shipped all over the place. Some people say tapentadol, like tramadol, which was one of the opioids before, is like the fentanyl of West Africa. It does need that level of attention. So I'm sure there are other countries. There were a bunch that I could see that I would love to dive into at some point.
Wow. That's a big statement as well. The fentanyl of West Africa. Amazing that you're spending some time diving into this. We're coming to the end now. So I wanted to ask lastly, unless anybody else has any quick questions
that they want to squeeze in, if you have any advice. You've gone through lots of advice today in the talk, but do you have any lasting tips for anyone who perhaps is completely new to financial investigations and now, because of this talk, maybe wants to delve into drug data or tobacco data or any suspicious goods? What would be your main tips for people who are
first starting out? Are there any resources, any guides that you would shout out for people to read, any people that they should speak to who are a little bit wiser on this subject, perhaps that you found useful speaking to? Yeah, basic tips for people who are beginning their journey in this space. You know, sadly, I kept on thinking I would do a lovely Google search and I would find the guide to trade data and how to use it and I didn't find it. It's actually something
that I'd like to write myself. And I was meeting with a researcher from another NGO in Europe and she had talked to a bunch of investigators from other organizations and there was a similar cry of people saying, we know there's trade data out there. We can see how useful it is. It's tricky and we need to learn how to use it. How can we do that? So one of my ambitions for 2026
is to at least start the guide, to write it and to have people add to it, disagree, agree, whatever they want to do, but at least get something down so that we as open source researchers can start to kind of share our knowledge so that we can all make better use of this data. Because I was literally flailing around blind. I'd never used this data before. And I've just kind of had to
learn as I went along. And I ended up interviewing the guy who was the head of the training institute in India for customs officials. I don't know why he decided to speak to me, but he did. But I had to go to that level to try and understand the trade data because I couldn't find guides online. So I wish I could say to you, it was there, but I would love to write it. Watch this space is what Katherine's saying basically. There will
be a guide soon, I'm sure. Thank you so much, Katherine, for your time today. It has been so fascinating to find out what you found out over the last few months and really dig into the data. Thank you for taking us through it. Do you want to remind people of how they can reach out to you again? Just quickly before we end. Oh yeah, you're welcome to email me. It's Katherine with a K, K-A-T-H-E-R-I-N-E at consultant.bellingcat.com.
All right. Pop Katherine a message if you have any tips for her, but also if you feel like you want to chat a little bit further. Within this space, we do have spaces to chat about financial investigations. Hashtag money is the place to go to for that. You can talk about trade data to your heart's content in that particular channel. So please feel free to go in there. Obviously respecting any rules
of the server as you're chit chatting. I can see a subtle knife is currently typing, which is probably reminding you all of that. But anyway, wrapping up, thank you so much, Katherine, again for today. And yeah, we will be back not in two weeks' time, but in the new year with a very special stage talk from Eliot Higgins. But for now, thank you for listening, and we'll be back in January. Thank you all. Thank you for listening
to the stage talk. If you'd like to catch a stage talk live where you can ask the guest questions, join the Bellingcat Discord server by visiting www.discord.gg slash Bellingcat. The music you've heard is titled Dawn by Newer Self and is courtesy of Artlist.
