Jonny: I'm so glad that you're here with us, Dorothea, because I'm always interested in your perspective on this, having lived in the library world of linked data for so long. On the other end, living in programmer world, sometimes I still get both the persnickety, you know, purist side and the people that are trying to make it work. But very few say, "actually, this doesn't even come close to meeting my needs or resemble my work style at all."
Jay: I remember, it was so funny, Scott Carlson helped edit or write that linked data in libraries book, and then like two days later was like, "linked data's dead," and then became a programmer. I love Scott.
Dorothea: I think he was stuck in a deeply shitty workplace.
Jay: And I agree. It happens to us, and then we get out of them. Hooray.
Jonny: Proud of you.
Jay: Yay. Okay.
Justin: I'm Justin. I'm a scholcomm librarian, my pronouns are he and they.
Sadie: I'm Sadie, I work IT at a public library, and my pronouns are they/them.
Jay: I'm Jay, and I'm no longer a music librarian. I'm finally fucking a cataloging librarian again, for the first time, seven years after finishing grad school.
Jay: And I won't say where. And my pronouns are he/him.
Justin: Just post the address this time around.
Jay: If you're in the Discord, you know.
Justin: Okay. And we have guests. Would you like to introduce yourselves?
Dorothea: Sure, I'll start. I'm Dorothea Salo, pronouns she/her. And I teach at the University of Wisconsin-Madison Information School.
Jonny: I'm Jonny Saunders, they/them. I guess I do various forms of information-based work at UCLA.
Justin: Thank you.
Jonny: Yeah. For the belated applause. I was waiting for that.
Dorothea: Thank you. Very kind.
Justin: Welcome. Welcome. I still have my reorganized soundboard, so I still only have like 10 sounds. Oh. No copyright law in the universe is going to stop me.
Jay: I started making Justin watch It's Always Sunny, and it was a bad decision, because now the soundboard has the It's Always Sunny theme on it.
Jonny: It's got to be the full-length version, too. No soundboard is complete without the one that keeps going for an hour.
Jay: Because that's just a piece of, like, public domain music. It's not even written for the show, I'm pretty sure.
Justin: Sweet. I think I had just the full Soviet Union anthem. Yeah, I was like, this is anime 1, you piece of shit. Yeah, that one keeps going.
Justin: So this was an episode we came up with because Sadie wanted us to explain linked open data, and I think I probably know the second least, so I figured it would be funnest for me to start and try to explain what linked open data is. This is all from what I remember in grad school, which is the last time I ever had to interact with it that I'm aware of, besides, you know, the parts of linked data that are used by Google. Primarily, you can think about it as triples: everything is one item linked to another item. So Hamlet is a character in Hamlet the book. Those are two separate URIs. And then...
Jay: It's a play.
Justin: Well, it's in book form.
Jay: Okay.
Justin: And then Shakespeare is the author of Hamlet, and so there's an "is the author of" statement, and each of those has a URI. These three things can chain together forever, and that way you would have something that's both machine-readable and human-readable, and somehow that makes data boxes in Google work.
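The triples Justin describes can be sketched in a few lines of plain Python. This is only an illustration: the `example.org` URIs and the predicate names are made up, the Macbeth statement is added just to show a second link, and no real RDF library or vocabulary is used.

```python
# Hypothetical URIs under example.org; none of these are real identifiers.
EX = "https://example.org/"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "Hamlet_character", EX + "isCharacterIn", EX + "Hamlet_play"),
    (EX + "Shakespeare", EX + "isAuthorOf", EX + "Hamlet_play"),
    (EX + "Shakespeare", EX + "isAuthorOf", EX + "Macbeth_play"),
]

def objects(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]

# Chaining: which works is Hamlet (the character) in, and who wrote them?
for work in objects(EX + "Hamlet_character", EX + "isCharacterIn"):
    print(work, "written by", subjects(EX + "isAuthorOf", work))
```

Note that getting something as simple as "a list of Shakespeare's works" already means filtering a pile of individual statements, which is the awkwardness that comes up next.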
Dorothea: Or certain extremely non-human-readable forms of human-readable.
Justin: Right. So once you're trying to organize it in other ways, like, say, make a list of things, suddenly it doesn't work anymore, because now you have to see a series of statements.
Jonny: Yep.
Sadie: I'm just chin in hands here, waiting for all of these super smart people.
Jay: Literally, this is "we explain linked data to Sadie."
Jonny: There's the triple explanation, and then immediately you fall off the cliff of ideology and 25 years of some of the most prickly and opinionated people in the world making claims on reality that you truly can't believe until you see them. So it's like, you know, we got talking about technology and beliefs. And then also, for a lot of people, a huge amount of wasted time, trauma, or success, depending on if you work for Amazon or Google or not.
Jay: Yeah. My experience with linked data is that I took ontology development in grad school with Dave Dubin, shouts out Dave Dubin. We learned RDF, and we mainly wrote in Turtle, I think, but we learned all the others, like N-Triples and N3 and all that. But I think he liked Turtle the best, if I'm remembering.
Jonny: The only one that worked.
Jay: Yeah. As a class we collectively created an ontology together. Each of us had our own specific section of it that we had to create, and mine's still on my GitHub and everything. It still theoretically is a working RDF ontology.
Jonny: Is this the origin of the Homosaurus?
Jay: Yes. No. But I'm also on the Homosaurus, which is actually linked data, but none of us on the board actually interact with that part so much. We have a software dude who does that, but we all know about it to some degree. And then I've also done some Wikidata. I did a Wikidata training. I went through one of those trainings one summer, and that was cool.
Jay: And I submitted a proposal for a paper on thinking of Wikidata and linked data as a cyborg kind of thing, but interrogating that. And I submitted this to the Code4Lib Journal issue that ended up being the one that everyone yelled at. So I'm glad it got rejected. Literally, that issue, the one with the bad data practices, that was the one I had submitted to. So I'm glad I got rejected now.
Jonny: Narrow miss, narrow miss. Dorothea, weren't you the one that blew the whistle on that? Or is that different? It's like a different time.
Jay: It was, like, you and Becky, right?
Dorothea: Well, I mean, if we blew the whistle over anything, it wasn't over linked data. It was over privacy. It's a thing. You might want to let people keep it.
Jay: Yeah.
Dorothea: Yeah.
Jay: Yeah, I just happened to be writing about linked data for the thing I was writing about. Right. So I'm very glad that my goofy little high theory article got rejected.
Justin: So I actually never ended up using Turtle. I think I learned it in N3 notation. It was very not hands-on, the way I learned about it. And so it was never clear how it worked, except for the aspects that kind of pulled from Wikidata, and that explained a little bit, but I never got an in-depth explainer for how Wikidata works. So it was very theoretical, and my metadata teacher was very on the theoretical side of things, so I never got to see a lot of practical applications of a lot of the stuff we talked about in class.
Dorothea: So that is not how I teach metadata.
Jonny: Yeah.
Dorothea: If you're not doing that, one step away, yeah.
Jonny: Exactly. And that's one of the major cultural fissures: is it supposed to be something that you touch, or is it something that is supposed to be a true artifact of the world and needs to be done once and never touched again? So the division between the teaching styles is reflective of the entire system of belief that goes into linked open data as well. I'm curious about hearing people's origin stories with linked data, because, Dorothea, you've been doing this for a while in libraries and stuff like that. I'm curious what your origins really are.
Dorothea: I mean, you know, I got into it the same way a lot of people did, as it started to be talked about as potentially where libraries move from MARC.
Dorothea: And, you know, that's a really awkward question when you think about it. Sticking with the homegrown, if you will, MARC encoding, which we made up from scratch in the 1960s, Lord bless Henriette Avram, she was awesome. Right. But it doesn't map cleanly onto any of the dominant data structures, data models, that we have today.
Dorothea: It's pulling teeth to try to stuff MARC into a relational database such that you can actually do anything with it. You can kind of do it in XML, but XML is really squishy that way. And I don't mean that in a bad way. XML squishiness is actually quite useful. If you look at, for example, EAD, Encoded Archival Description, some of EAD is what you and I, Jonny, would probably think of as data.
Dorothea: But a lot of EAD is narrative, right? It's storytelling. And you know what? Databases are shit at storytelling. You can't represent Hamlet in a database. Linked data is shit at storytelling. One of the things that really pissed me off about the very early days of linked data was some of its boosters going around and just bragging on it as something where you could literally represent anything, right?
Dorothea: If you could put it in a computer, you could put it in linked data. And my retort to that is, as it has always been, and this is pure coincidence, but I kind of love it: all right, express Hamlet in RDF and get back to me, okay? You can't do it. And I was reading through some of the stuff in the show notes for today, and I happened on one of the Tim Berners-Lee pages.
Dorothea: Let me see if I can find that. Ah, yes. And Tim Berners-Lee on this particular page talks about a semantic web, or sorry, a magical artificial intelligence. He's talking about AI. And he says this: "The concept of machine-understandable documents does not imply some magical artificial intelligence which allows machines to comprehend human mumblings." That's literally what he says. Human mumblings. Excuse you, Tim Berners-Lee.
Dorothea: Excuse you. Language is one of the most magnificent things we have as human beings. And you are calling it mumblings. Excuse you very much. Sorry, that was my rant.
Jonny: No, well felt. I mean, yeah, his relationship to the fuzziness of language is one of the most fascinating parts of the early outlooks on what linked data could be. On the one hand, there's the romanticism of language, the fluidity of language as being something to embrace. But then almost immediately that becomes squished out. The thing that's almost immediately excluded is the ability for people to actually express ambiguity, uncertainty, and so on, yeah.
Jay: Right. I think last time you were on, Jonny, or maybe this was in, oh no, this is when we were watching it together, but we talked about the Ted Nelson versus the Tim Berners-Lee view of the interconnected internet and data.
Jonny: Right. Let's see if I can find...
Dorothea: Interesting dude, Ted Nelson. I actually did get to meet him once.
Jonny: Um.
Dorothea: I was, like, wiped out at the time, unfortunately, but yeah, I will always treasure that. He was, and I think still is, an interesting dude.
Jonny: Yeah.
Jay: The chad Ted Nelson.
Jonny: So the story that I don't have a good hold on is what happened. And this probably relates to some of the stuff that we talk about all the time in the cybersecurity screaming channel, and what you may have to deal with as well: the state of the technologies that go into libraries, and how they're not actually under any of our control, and we sort of do the best we can to exist on whatever scraps IT wants to feed us. And so I imagine that's one of the intertwined stories of why linked data did not happen all the way at libraries, sort of related to the institutional inertia as well.
Dorothea: Yeah, that's part of it. And, you know, getting back to my point about the question of getting off MARC: relational databases weren't going to work. XML wasn't going to work, and was in kind of a little bit of a decline as we were asking ourselves this question. So what was left? I remember a blog post by Jonathan Rochkind, who hates linked data. Wow, does he hate RDF. And, you know, he backs it up. He's not just a random hater. But he was like, we cannot move to this. And I'm like, okay, what's the alternative, right?
Dorothea: And there are things about RDF that are attractive ideologically, but also practically, to libraries. The idea of the "open" in linked open data: we can really, truly share, and OCLC can't stop us. Oops, did I say that out loud? Wow.
Dorothea: I mean, you know, really, the elephant in the room is OCLC and its enclosure of MARC and MARC cataloging for its own corporate, and I am going to call them corporate, I don't care that they're legally a non-profit, for their own corporate benefit. So linked data, to some of us, looked like a possible way out of that. And, you know, I can't fault anybody for that. It's definitely a goal worth pursuing.
Dorothea: So why didn't it get as far as we might have wanted it to? Part of it is that RDF was not built, and Jonny can speak to this more because he's read more of the STS and sociology literature around it than I have, but it was not really built for practicality or computability, right? I, as a complete SPARQL duffer, and SPARQL, if you haven't run into it, is the query language for linked data, it is to RDF what SQL is to relational databases, I can make a typo in a SPARQL query and knock a server over dead. It's not even hard.
Dorothea: So the brittleness of just being able to ask a question without killing a server, this was not a consideration for the early designers of the semantic web. And how do you build a library infrastructure on a foundation that is that technologically brittle?
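One plausible way a small typo gets expensive, sketched in plain Python rather than real SPARQL: if a variable name is misspelled, two patterns that were meant to join no longer share a variable, and the result becomes every combination of their matches. The toy data and the `match` and `query` helpers below are hypothetical. This illustrates one failure mode under that assumption, not how any particular triple store evaluates queries.

```python
# A toy triple store: 100 books, each with one author and one title.
triples = []
for i in range(100):
    triples.append((f"book{i}", "author", f"person{i}"))
    triples.append((f"book{i}", "title", f"Title {i}"))

def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):          # a variable
            if new.setdefault(term, value) != value:
                return None               # conflicts with an earlier binding
        elif term != value:               # a constant must match exactly
            return None
    return new

def query(patterns):
    """Naive join: match each pattern in turn against every triple."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

# Intended query: both patterns share ?book, so they join on it.
intended = query([("?book", "author", "?who"), ("?book", "title", "?t")])
# Typo: ?bok is a brand-new variable, so nothing ties the patterns together.
typo = query([("?book", "author", "?who"), ("?bok", "title", "?t")])
print(len(intended), len(typo))
```

With 100 books the intended query returns 100 rows and the typo returns 10,000. The growth is quadratic, so the same slip against a few million records is the kind of thing that can take a server down.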
Dorothea: And the answer is you can't. You really, really can't. Another, I'm not going to say this is a problem, actually. I actually think it was good. But it's a situation that does not commend itself to libraries, to librarians, right?
Dorothea: We tend to be very orderly people, and catalogers as much as anybody and more than some. So in the aughts, right, well, no, not the aughts, in the teens, I guess, particularly in Europe, there was just this flowering of experimentation with: how are we going to represent the things in the library universe, like books and maps and musical scores and movies and all that good stuff? How are we going to represent this in RDF? Lots of experimentation. A lot of it was fantastic. Europeana is great. Yeah. There's a lot of really good thinking, very practical thinking, going into this. But there were models, data models, RDF models, ontologies if you will, springing up all over the place. And so if you're an average cataloger, you're looking at this and going, well, which one do I learn, and which one are we going to use? And when is there a tool that's going to work with any of this?
Jonny: Yeah.
Dorothea: And the answer is there wasn't. Now, what seems to have fallen out of that is that BIBFRAME, for all of its faults, and it has many, it is not my favorite bibliographic ontology, seems to be kind of taking over the world and muscling out a lot of that European experimentation.
Dorothea: And that frankly makes me sad, because there's several countries in Europe that just plain kicked BIBFRAME's ass as far as modeling quality. And I hate that they're getting plowed under, basically, by this crappy American juggernaut. But why is this happening? Because there's finally tooling. There are finally cataloging tools that, as much as any RDF-based tool can, fail to suck.
Jay: Yeah, I know in Alma you can do BIBFRAME stuff.
Dorothea: Yeah, but you can look at Sinopia and you can look at Marva and you can imagine an actual person using these, right, and making them work and getting good records out of them, which we didn't have for at least a literal actual decade after BIBFRAME happened.
Dorothea: So when Tim Berners-Lee calls human language mumbling, I think it's a symptom of the contempt that so many linked data people have for human beings. And I yelled at the Semantic Web in Libraries conference in like 2014, a decade ago, about exactly that. Stop dissing human beings. You can't do that if you actually want linked data. But nobody listened, and here we are. Right. Yeah.
Jay: Another idea, and this was also something I think I talked with Jonny about, another idea for a goofy, high-minded theory paper I had was thinking of linked data as this attempt to do a reverse confusion of tongues, like a pre-Tower of Babel divine language that ignores the actual...
Jay: The reason that linked data is cool is that it has the potential for everyone to have their own way of doing it, and it'll talk together and intermingle. Instead it's just turned into this, like, nope, everything looks this way now, this almost mechanized version of language taking over. It doesn't care about being human-readable, actually.
Jonny: Right. And so it's this tension that was there from the origin of it. And actually the dawn of the term "linked data," as opposed to "the semantic web," is a part of this same thing.
Jonny: I feel like we need to at least nod to, because we talked about this at length last time I was on here, the Lindsay Poirier piece, "A Turn for the Scruffy," which we both called out as being one of the seminal works on understanding the culture of the semantic web.
Jonny: And that points to, and it's there too in TimBL's website, that the separation of linked data and linked open data from the semantic web was about reclaiming stuff that worked as opposed to stuff that was perfect. It's about trying to make a bunch of separate ontologies, whereas the initial idea was that there's one graph, one global graph, where everything is always linked together, and there should be one URI that represents each unique concept, and only one.
Jonny: And to the point where there's these sort of absurd blog posts. One of the things that's always amazing about web history is that a lot of it is still there and still up, at least on archive.org. This blog post, I think this is 2009, I put this in the links as well, but they apparently took down the comment section on it.
Jonny: It's someone from the semantic web world in this era, posting about the first time that the New York Times had linked data in their web version of the product. And what they'd done is they'd made some article that was about Barack Obama and the, quote unquote, you know, the racist controversy, like, "Barack Obama is a Muslim," whatever. So it was an article about that controversy.
Jonny: And so there was an RDF claim that was like, Barack Obama related to Muslim, or something like that, which is just trying to describe the contents of this piece of writing.
Jonny: But then people immediately were like, this is messed up, because that's now a claim on reality. It's not just "someone says this," it's "this is a fact." And that was something that the RDF group had specifically designed it to be doing. And so the model of the world that people keep trying to escape from, but keep having to return to, is what it means when you make a statement in RDF. There's a difference in the way that the language and the syntax and the systems designers thought about it. There are some really remarkable quotes in the W3C archives. I was trying to pull this up earlier, but there's this one quote from Brian McBride, 2001.
Jonny: So this would have been only a couple of years after the project formally launched at W3C. It's like, "RDF is not just a data model. The RDF specs should define a semantics so that an RDF statement on the web is interpreted as an assertion of that statement so that its author would be responsible in law as if it had been published in a newspaper."
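The gap between "this is a fact" and "someone says this" can be sketched by comparing a bare triple with the same triple carrying a fourth element that names its source. RDF did later gain named graphs for roughly this purpose. The sketch below uses plain Python tuples rather than a real RDF library, and every `ex:` identifier and both helper functions are hypothetical placeholders.

```python
from collections import defaultdict

# A bare triple: read on its own, it looks like a claim about the world.
triple = ("ex:PersonA", "ex:relatedTo", "ex:TopicB")

# The same triple as a quad: the fourth element names the graph (source)
# the statement belongs to, so it can be read as "this source says...".
quads = [
    ("ex:PersonA", "ex:relatedTo", "ex:TopicB", "ex:graph/article-index"),
    ("ex:PersonA", "ex:holdsOffice", "ex:OfficeC", "ex:graph/reference-data"),
]

def statements_by_graph(quads):
    """Group (s, p, o) statements under the graph that asserts them."""
    grouped = defaultdict(list)
    for s, p, o, graph in quads:
        grouped[graph].append((s, p, o))
    return dict(grouped)

def global_graph(quads):
    """Drop the fourth element: every statement becomes an unattributed claim."""
    return {(s, p, o) for s, p, o, _ in quads}

by_source = statements_by_graph(quads)
print(by_source["ex:graph/article-index"])
print(triple in global_graph(quads))
```

Merging everything into one global graph is exactly the step that throws away who said what: an index entry describing an article and a reference fact become indistinguishable.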
Jonny: So they're supposed to be like legally binding documents in this way, where there is no such thing as an author, no "someone says this." When in reality, everything has an author. Everything has a point of view and a perspective, and was said by or written by somebody. But it took a while for even that notion to be encoded in the language at all as an expressible thing, period: adding the fourth item to the triples, being able to say that this doesn't belong to the global graph of everything but in fact is my local system of meaning.
Jonny: But then you have to keep escaping that, because it doesn't actually work. The thing that I always come back to is, imagine if language worked this way, where I want to use a word and I have to use Jonny's version of this word. So I have to go to, like, jonny.net slash this word, and now I'm referring to that one, and there's no way that I can make my own copy of this word. In the way that language works, we have these parallel representations of ideas and concepts and words and phrases that are not the same at all, not even close to the same, from person to person or even utterance to utterance. And yet we're trying to express a system of meaning where there is one version of each of these things. Simply no one would do it. If I had to go to the dictionary every time and look up each person's unique word and use that, or else it was meaningless, then it just doesn't work.
Jonny: And it's, I don't want to trail off forever here, but it's intimately related to the tooling problem, where theoretically, and so one of the authors of SKOS, the Simple Knowledge Organization System, the ontology and modeling system for modeling relatedness and similarity.
Dorothea: It's how you do controlled vocabularies in RDF, and it's actually quite functional, quite useful. And if I'm not wrong, I think Homosaurus is actually based on it. That's your underlying, how you're modeling this stuff.
Jay: Yeah, it's SKOS, yeah.
Jonny: Yeah, it works pretty well. And so you'd imagine that a tool like that, where you're able to say that something is a similar match, or this is exactly the same as this other thing, would enable this kind of expressive system.
Jonny: And it doesn't, because doing all of those queries and lookups is preposterously expensive, because of the way that it's encoded as URIs, i.e. URLs, i.e. I need to hit a web server every time to actually retrieve this item, as opposed to... yeah, there's any number of different web architectural models that that could take, but that's the form it took.
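To make the cost concrete, here is a sketch that follows SKOS-style match links across vocabularies while counting how many lookups it takes. `skos:exactMatch` and `skos:closeMatch` are real SKOS properties. Everything else is made up for illustration: the example URIs, the `FAKE_WEB` dictionary standing in for remote servers, and the `fetch` and `equivalents` helpers. In a real deployment each `fetch` would be an HTTP request.

```python
# Stand-in for the web: URI -> the mapping statements published at that URI.
FAKE_WEB = {
    "https://example.org/vocab/a/term1": [
        ("skos:exactMatch", "https://example.net/vocab/b/term9"),
        ("skos:closeMatch", "https://example.com/vocab/c/term4"),
    ],
    "https://example.net/vocab/b/term9": [
        ("skos:exactMatch", "https://example.com/vocab/c/term5"),
    ],
    "https://example.com/vocab/c/term4": [],
    "https://example.com/vocab/c/term5": [],
}

lookups = 0

def fetch(uri):
    """Pretend to dereference a URI; each call models one network round trip."""
    global lookups
    lookups += 1
    return FAKE_WEB.get(uri, [])

def equivalents(start, properties=("skos:exactMatch", "skos:closeMatch")):
    """Collect every concept reachable from `start` through match links.

    Simplification: SKOS defines only exactMatch as transitive, so chaining
    through closeMatch like this is looser than the spec intends.
    """
    seen, queue = {start}, [start]
    while queue:
        for prop, target in fetch(queue.pop()):
            if prop in properties and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen - {start}

found = equivalents("https://example.org/vocab/a/term1")
print(len(found), "matches found using", lookups, "lookups")
```

Even this four-concept toy needs one round trip per concept visited. Caching or bulk downloads can reduce that, but those are choices layered on top of the one-URI-one-request model rather than part of it.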
Jonny: And so as a result, yeah, it's intimately related to the tooling as well as the implementation of the technology, in the same way that it is a reflection of the ideas behind it.
Dorothea: Right on.
Jonny: Yeah.
Dorothea: So how are we doing, Sadie? Clear as mud?
Sadie: Yeah, just about. I think the thing that gets me about linked data, and, like, I haven't gone to library school, I have just the barest knowledge of cataloging and that kind of thing, is, I'm a very practical, hands-on person. So I have to dig into a system to really see how it works.
Dorothea: Oh, yeah, totally.
Sadie: Every time I have tried to do that, to even think about open linked data, I'm like, I don't see how this is usable. So yeah, like you talked about, there need to be tools to be able to use it. It sounds like the heart of the problem with a lot of library technology, where, I keep saying this, there's a very small selection of vendors that have a very large amount of control, and they just keep conglomerating together. So there's like three now.
Sadie: And somehow libraries, who are the ones using the tools, are the most powerless people in the whole ecosystem, right? So a big topic at my work lately, and maybe a tangent here, is why the fuck are we still using SIP2?
Dorothea: Can't blame you on that one.
Sadie: I don't know if you're familiar with SIP2, Jonny. It's basically a protocol. So integrated library systems, ILSs, are the biggest software that libraries use to keep track of all of their stuff. SIP2 is basically the protocol that passes information between these systems, right? So a lot of vendors use SIP. Like OverDrive: OverDrive has to know what you already have checked out to be able to enforce your limits, like, you can only have five books checked out. So it uses SIP to query that information from your library system, right?
Sadie: It is entirely unencrypted, clear text, and has been its entire life. And SIP2 is different from the IT SIP, which is a VoIP protocol, which causes no end of confusion every time we have to talk to a vendor's IT to figure out how to set something up.
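To show what "clear text" means in practice, here is a sketch of how little work it takes to read patron fields out of a captured SIP2-style response. The payload is invented and simplified: real SIP2 messages also carry fixed-width header fields that are omitted here, and the patron is fictional. The two-letter codes follow the SIP2 convention of pipe-delimited fields (`AA` patron identifier, `AE` personal name, `BD` home address). The helper name `parse_fields` is hypothetical.

```python
# Invented, simplified payload in the pipe-delimited style of SIP2 fields.
captured = b"AA21234000012345|AEJane Q. Patron|BD123 Example St, Anytown|"

FIELD_NAMES = {
    "AA": "patron identifier",
    "AE": "personal name",
    "BD": "home address",
}

def parse_fields(raw: bytes) -> dict:
    """Split a captured payload into {field name: value}; no key is needed."""
    fields = {}
    for chunk in raw.decode("ascii").split("|"):
        if len(chunk) > 2:
            code, value = chunk[:2], chunk[2:]
            fields[FIELD_NAMES.get(code, code)] = value
    return fields

for name, value in parse_fields(captured).items():
    print(f"{name}: {value}")
```

Nothing here involves breaking anything. Anyone on the network path sees these bytes as-is unless the connection is wrapped in an encrypted tunnel.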
Sadie: If a single one of my coworkers is listening to this, I just absolutely gave myself away, because I've had this conversation so many times. But yeah, it's been in use for so long, and all of these interlibrary... it's the only one that is actually usable, like actually, what's the word I'm looking for? Agnostic, system-agnostic.
Sadie: So it's starting to be replaced by a lot of APIs, but each API for each system is its own thing.
Sadie: So you have to wait for others, you know, "oh, we could do this API, we could do," I don't know if this is true, "we could do Alma's API, but we can't do Sierra/Millennium's API." So it's just like, why the fuck are we still using this? And then we talk to people like vendors and they're just like, "well, what's the problem?" And we're like, it's completely clear text and requires extra tunneling to be able to actually keep our patron data not readable over the internet.
Dorothea: And passed it all over the entire internet.
Sadie: For anybody. And looking at the strings, it's literally library card number, name, full name, address, you know, number of checkouts. It's so ridiculous. And people are still just like, "well, I don't understand what the problem is," until you talk to an IT person and you say it's in clear text, it's completely unencrypted, and they go, "oh, that's bad."
Sadie: But no libraries have the power to go to these freaking vendors and be like, you have to figure something else out, something has to be worked out. But it's going to end up being, you know, OCLC who does that kind of stuff, or something like that. And then, yeah.
Dorothea: It'd be nice, right?
Sadie: Yeah.
Dorothea: And they're vendor patsies. That's all we are.
Sadie: In a lot of ways, yeah.
Justin: Yeah, what was it Bree said in the scholcomm Discord? ACAB includes NISO.
Jonny: Yeah.
Justin: Yeah.
Dorothea: Absolutely.
Sadie: So I still don't think I understand entirely what linked data is, but I do think that I can start to get to it, if you know what I mean, because, yeah, it's a system. It's a system to connect data to other data in meaningful ways, and it once had the promise to actually help libraries figure shit out, and it has completely kind of shit the bed on that. Is that accurate?
Dorothea: That's completely accurate. I still have tiny little sparks of hope. Oh, I do.
Jay: Did we describe why it's called the semantic web?
Dorothea: Oh, I don't think we did. Jonny, I'll leave you that one.
Jonny: It's a really simple story. It's like, the web happened, right? And so the web is documents with links between them. But those links are meaningless. They're just the relationship from one page to another.
Jonny: And it's hard to imagine, in retrospect, a web without search engines or without any sort of overlay to it, because basically the way that everyone interacts with the web now is either through search or through some mediating discovery mechanism. You don't just go on the web and then go to a URL and be like, well, I'm here now, I've found the internet.
Jonny: So that's the way that the web was sort of designed, and the way that it was supposed to work is that it would be self-organizing. Literally, if you go back to the founding, it was like, we will just have people that have lists of links on their personal websites, and they will link everything together, and then people will find their way from these local nodes of meaning. And the imagination there was always that the web would be super easy for the average person to make a website on, and that everyone would basically have one.
Jonny: And that didn't work at all, not even close, not even from the very beginning. It was the case that the ultra nerds that were on the internet at the very first part of it still, you know, gravitated towards mediating platforms like bulletin board systems, etc. So the semantic web was supposed to be a way of encoding computer-readable information into the protocols of the web, and specifically into HTML documents that are, you know, a dialect of XML. I don't even know how to describe the relationship between HTML and XML.
Jonny: But so that it would be possible to both annotate a given page and then also be able to link them together, so that you'd have this sort of coexistence between documents that have human-readable text, and then embedded within that and embedded between that are sort of, "in this paragraph, I'm talking about this person."
Jonny: And then I can sort of, say, go to that page and theoretically go and find backlinks to all the times that that person was mentioned, or something like that. And so that's why it's called the semantic web: we're adding semantics to the web, which formerly was just sort of naked links and documents. Yep.
Jay: Like, the computer could understand that Jonny is a person because it knows what those URIs are and what they point to, and it can then tell what the relationship between those is. Not in a way where it knows what a person is, but it knows what this URI is, and if you use this URI, then it sees other things that have that URI and knows that they're people too.
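As a rough illustration of what Jay describes here, the sketch below stores a handful of subject-predicate-object triples as plain Python tuples. The `example.org` URIs, the `mentions` predicate, and the helper names are all hypothetical and exist only for this example; only the `rdf:type` URI is a real RDF term. The point is that the program never knows what a person is. It only matches URIs.

```python
# All example.org URIs below are hypothetical, made up for illustration.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # real RDF term
PERSON = "https://example.org/vocab/Person"
MENTIONS = "https://example.org/vocab/mentions"
JONNY = "https://example.org/people/jonny"
JAY = "https://example.org/people/jay"
EPISODE = "https://example.org/episodes/linked-data"

# A tiny graph: each entry is (subject, predicate, object).
triples = {
    (JONNY, RDF_TYPE, PERSON),
    (JAY, RDF_TYPE, PERSON),
    (EPISODE, MENTIONS, JONNY),
}


def instances_of(class_uri, graph):
    """Return every subject that is typed with class_uri."""
    return sorted(s for (s, p, o) in graph if p == RDF_TYPE and o == class_uri)


def backlinks(target_uri, graph):
    """Return (subject, predicate) pairs for everything that points at target_uri."""
    return sorted((s, p) for (s, p, o) in graph if o == target_uri)


people = instances_of(PERSON, triples)
links_to_jonny = backlinks(JONNY, triples)
```

Here `people` lists both person URIs, and `links_to_jonny` includes the episode that mentions Jonny, which is the "backlinks" idea from a moment earlier in the conversation.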
Jonny: There's a certain amount of magical thinking that like, because language sort Jonny: of works this way, that it's like entirely relational and metaphor based and like, Jonny: you know, the meaning of a word is only sensible in context of surrounding meanings Jonny: and contrast with similar, you know, that just like meaning would emerge.
Jonny: And like again like that's sort Jonny: of true like there's like like language does work like that just like Jonny: sort of local negotiations over meaning and indigent but like you need to have Jonny: the people there negotiating in order for it to work and that never really existed Jonny: so just like so like there's and it sort of like points to one of the salient Jonny: features that is both like it's like you know, eerily prescient,
Jonny: but also just like another one of these like critical pieces where we're talking Jonny: about just like the missing tools, Jonny: is like from the very beginning, like there's this 1999 piece in Scientific Jonny: American that Tim Berners-Lee, that was like sort of like the public announcement Jonny: of like, you know, the existence of the semantic web as a problem. Dorothea: I remember reading that. I was at work. I remember reading it.
Jonny: And so is this wonderful document and just like that like is like this very Jonny: pie in the sky kind of system of, you know, release about just like what it could be. Jonny: And like, there's a bunch of just like really basic and obvious things that Jonny: like, wow, we should really have the computers work like that. Jonny: We're just like, you know, like the, the, the idea that I have a calendar appointment or whatever.
Jonny: Why can't my computer know that, like, I also have a photo that was taken on that day. Jonny: So I can just like say, computer, find me the photos that were taken during Jonny: this appointment on my calendar or something like that. Jonny: So like a sort of universal acid for this data, where just like, Jonny: I can just relate, you know, totally heterogeneous systems between one another.
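The calendar-and-photos example can be sketched in a few lines. This is only an illustration under stated assumptions: the two record lists stand in for two unrelated systems, and the field names (`start`, `end`, `taken`) and the helper `photos_during` are invented for this sketch. The only thing the two systems share is that they both describe moments in time.

```python
from datetime import datetime

# Hypothetical records from two unrelated systems: a calendar and a photo library.
appointments = [
    {"id": "appt-1", "title": "Dentist",
     "start": datetime(2024, 3, 5, 9, 0), "end": datetime(2024, 3, 5, 10, 0)},
]
photos = [
    {"id": "photo-1", "taken": datetime(2024, 3, 5, 9, 30)},
    {"id": "photo-2", "taken": datetime(2024, 3, 6, 12, 0)},
]


def photos_during(appointment, photo_records):
    """Return ids of photos whose timestamp falls inside the appointment window."""
    return [
        p["id"]
        for p in photo_records
        if appointment["start"] <= p["taken"] <= appointment["end"]
    ]


matches = photos_during(appointments[0], photos)
```

The join itself is trivial. The hard part the conversation keeps returning to is getting both systems to expose their data in a shared form in the first place.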
Jonny: But the part that's like really, like, you know, come to be, Jonny: we all like thinking about just like Like AI is like, you know, Jonny: this year and this last year being like, it was always going to be dependent on compute. Jonny: That it's just like, there's metadata there.
Jonny: But even from the very beginning, you need what Tim Berners-Lee was talking about as agents, like little bots, little scripts or whatever, that are running around getting all of this metadata. And this is like around the time when Google and the first algorithmic search engines were starting to exist.
Jonny: So this idea of crawlers ingesting this information and making sense of it was a relatively new one, especially at a mass scale like this. But that's always been the tension, like, say, just talking about: what is it, where do I touch it, how am I supposed to use that? That was sort of always the intention with that, just like
Jonny: you would have like a little computer butler thing that would just like be Jonny: going out and you have your own set of commands to just sort of like, Jonny: go get this for me, go fetch this for me. Jonny: But again, it's never really materialized just because with what infrastructure Jonny: does the average person have a constantly running bot that goes out and scrapes Jonny: the web for them all the time?
Jonny: And so even from, yeah, there are a couple of moments in the history of the semantic web of times when Google basically bought it. That happens actually several times. Where just like this sort of domestication of this process, where, like, now, when you think about it, like, where does it exist? How does it exist?
Jonny: Pretty much the only way that people usually interact with it is, like, the metadata, the Open Graph metadata (well, Open Graph is slightly different), but, like, the JSON-LD document that you'll have at the top of your website header that is just, like, using schema.org terms to say that this is a website about an organization or an event or whatever.
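For readers who have never looked at one, here is a minimal sketch of the kind of JSON-LD block Jonny means. `@context`, `@type`, and the `Organization` type with `name` and `url` are real schema.org and JSON-LD terms; the function name, the library name, and the URL are placeholders made up for this example, and a real page would usually carry more properties.

```python
import json


def jsonld_script_tag(name, url):
    """Build a <script> tag holding a schema.org Organization description.

    Simplification: this does not guard against values containing "</script>".
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values, not a real organization.
tag = jsonld_script_tag("Example Library", "https://library.example.org")
print(tag)
```

The resulting tag goes in the page's HTML head, where a crawler can read it without parsing the visible text at all.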
Jonny: And like as Justin was saying in the beginning, just like sometimes it makes Jonny: the Google info boxes work. Jonny: And like that's pretty much the most concrete realization that the average person Jonny: has for linked data on the everyday. Jonny: And that's because who owns the crawler? Google owns the crawler. Jonny: And so it becomes something where you make metadata available to be crawled Jonny: by Google in this very constrained, commercially focused context.
Jonny: But it's not a system of expression. And, like, just one more thing: there's these other technologies, like RDFa, this dialect of RDF, which is supposed to be the thing that goes embedded in documents, where, as I'm writing, I will tag a particular paragraph with some, you know, semantic web tag or something like that.
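To make the idea of tagging text inside a document concrete, the sketch below marks up one sentence with RDFa-style attributes and pulls the tagged value back out using Python's standard-library HTML parser. `vocab`, `typeof`, and `property` are real RDFa attribute names, but this is deliberately not a conforming RDFa processor: it ignores vocabularies, prefixes, and nesting, and the class name and sample sentence are invented for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements that carry a `property` attribute.

    Simplification: one tagged element at a time, no nested same-name tags,
    and no handling of vocab/prefix/typeof as real RDFa requires.
    """

    def __init__(self):
        super().__init__()
        self.found = []
        self._current = None  # (tag name, property value) while inside a tagged element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop and self._current is None:
            self._current = (tag, prop)
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            self.found.append((self._current[1], "".join(self._buffer).strip()))
            self._current = None


snippet = (
    '<p vocab="https://schema.org/" typeof="Person">'
    'In this paragraph I mention <span property="name">Ada Lovelace</span>.</p>'
)
collector = PropertyCollector()
collector.feed(snippet)
```

After feeding the snippet, `collector.found` holds the single tagged name. The human-readable sentence and the machine-readable annotation live in the same markup, which is the appeal Jonny is describing.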
Jonny: That's arguably one of the most, like, attempts at making a human linked data interface for that, where just like you could imagine I have document editing software or something like that, and I can highlight a sentence and add a tag to it or whatever, you know, just actually embedding this in documents that people actually use. That is actually no longer supported by the main, like, RDF parsing
Jonny: library, rdflib in Python, because it's complicated to parse, but also it's just sort of like, that's not really the important one. It's like, you know, for all these mushy positional document tags and stuff like that, and people don't really want to know the information in context. They want it all split out into, like, you know, something where I can do an HTTP request and just get the headers and that's it.
Jonny: And so like, it's like, it's just one of these mutating landscape of technology Jonny: always ratchets more and more towards, Jonny: it's intended for doing the Jonny: big web of open data that you're not a part of but Jonny: you get to experience through platforms and a Jonny: lot of platforms are in fact powered by linked data Jonny: at least if not rdf knowledge graph Jonny: tm derivatives of that idea where Jonny: like it is an extremely powerful set of Jonny: of ideas but not for you
Jonny: so if you but if you are a company that exists as a giant conglomeration of Jonny: data sets that you've bought by acquiring smaller companies over time it is Jonny: an incredibly powerful system for integrating all of that information being Jonny: able to do complex queries across them so in that piece for tim berners-lee. Jay: Not for thee.
Jonny: Exactly and increasingly for Jonny: the surveillance state and just like the people who Jonny: have this nightmarish multi-sided market Jonny: of selling your data to insurance Jonny: providers at the same time as selling it to police at the same time as selling Jonny: you back a little slice of it as well so like it's yeah the way that it exists Jonny: now is largely in the shadows and that's by no means passive effort there's
Jonny: an active corralling and an active domestication of this set of ideas. Dorothea: And to bring it back to tooling for just a second, Dorothea: Some of the more pro-social, I guess I will use that word, experiments in this Dorothea: space, like Wikidata, for example, Dorothea: are already running up against the absolute limits of what you can do with linked Dorothea: data if you're not, like, Google.
Dorothea: They've already, and the technical details here completely escape me, but Wikidata has gotten too big for its britches. The infrastructure literally cannot cope with it anymore, so they're sharding it, is my understanding. They're kind of splitting it down the middle and figuring out how to get the two shards to talk to one another, which I'm sure is really exciting technically, but wow, that's not great for those of us who are not Google but are interested in this technology stack.
Jonny: Did you see the cause of this issue? It's the underlying database software, Blazegraph, that it's running on. Amazon hired away all of the engineers, so they're... Oh, great. Yeah. So, all right, typical. So again, this is like, the big company is literally buying the underlying technologies, where just like, you know, the software needs maintenance, you know, it needs maintenance and these constant
Jonny: improvements and just like to be able to handle an ever-growing stack of triples like Wikidata, Jonny: you need to have active maintenance workers. And like, Jonny: Who pays for open source work? Like, if I'm a software developer and Amazon Jonny: says, here's, you know, 250K a year to make the, do the thing you were already Jonny: doing for free, then it's like, sure, I have a family. Jonny: You know, I, you know, I'd like to have, like, you know, go on vacation sometimes.
Jonny: And so, like, yeah, it's just like, yeah, actively, that, that was another moment Jonny: of, like, yeah, actively poaching away the talent so that, like, Jonny: the underlying technology can. Dorothea: And I will say, for all that we are cultural heritage organizations founded Dorothea: on the idea that culture should persist, Dorothea: we're very bad in libraries and archives at admitting that software needs maintenance, Dorothea: that standards need maintenance, right?
Dorothea: That's the SIP2 problem in a nutshell, though that was proprietary, actually. So Ruth Kitchin Tillman and I wrote an article, got published about a year ago, about the ethics of linked data sustainability. You can find it open access online. And we took a pot shot, actually. Okay, "we." I took a pot shot. This one was mine.
Dorothea: At information scientists. Okay? Because there are too many information scientists Dorothea: who are serial project and standard abandoners. Dorothea: They get grant money to do this fancy-dancy thing, and they get as far as it Dorothea: being implemented in libraries. Dorothea: And then they just wander away to write the next grant application and do the Dorothea: next fancy-dancy thing. And then it rots.
Jonny: Totally. Dorothea: Right, whatever they built, it rots, because inevitably they didn't build it right in the first place, and I'm totally thinking about OAI-PMH here, since we have some scholcomm folks in the room, but SIP2 is another beautiful example. Gosh, we are so bad at versioning stuff. It's a really basic idea. You gotta version stuff. You can never get it right the first time.
Dorothea: So yeah, I, in that article, took a pot shot at serial project abandoners and Dorothea: said, funders, stop funding them. Dorothea: Ask what happened to their last Dorothea: three projects. And if they're dead in the water, add some black mark. Jonny: For real. Yeah, this is a general issue in any sort of publicly funded tooling space.
Jonny: Is that just like... I was allegedly on some review panel for some funding agency Jonny: that is theoretically talking about software sustainability. Jonny: And that was a completely novel concept that just like what we want to do is Jonny: we want to fund sustainable software ecosystems.
Jonny: That just like we're not trying to start a new project. We're not trying to Jonny: like, you know, fund the new feature, but just like, these are the already existing Jonny: things that are happening in open source. Jonny: And let's just keep that going, like paying for like, like stuff like documentation Jonny: and like making the tests work and like, you know, years and years of technical debt.
Jonny: And like security audits yeah totally yeah Jonny: and please yeah and so this is like this is Jonny: one thing like this one of one of my entry points into thinking about semantic Jonny: web and thinking about just like linked open data was just like initially Jonny: thinking about because i was like living with someone who is like working in Jonny: metadata in a library at the time and there was this like increasing cry of
Jonny: just like the we all know the journal system is broken and like there's this Jonny: recurring strain of papers that are just sort of like let's just like make the Jonny: libraries do it you You know, just like that, Jonny: just like we can sort of like get libraries to host a bunch of journal like things, Jonny: journal like overlays or whatever, completely ignoring the reality of work and Jonny: the reality of bureaucracy in libraries that just like.
Jonny: And and and so like, you know, you wonder who I'm talking about. Dorothea: Oh, I don't have to wonder. I let him talk it out.
Jonny: Yeah and and so Jonny: like that just like this is where like Jonny: and on the one hand it seems like an obvious thing where just like of Jonny: course like it seems like libraries in general Jonny: in the abstract should be invested in just like you know maintaining some Jonny: their catalogs at least but just Jonny: like also the all the other things that just like you know that are being archived Jonny: and cataloged and just like you know exist in libraries and just
Jonny: like making that as available as a public catalog on... like, sure, surely they're already doing stuff like that, so it shouldn't be that much additional effort to have an institutional repository that acts like a journal and can link together these things. But as y'all know, yeah... Dorothea: I keep coming back to tooling. Jonny: Yeah. Dorothea: Tooling was shit. The tooling for open access is and always has been shit.
Jonny: Right. Yeah. And so it's just a matter of like, that like, there's there is Jonny: this universe of like, we're like, okay, we could get sort of some of these things aligned, Jonny: like funding priorities for maintaining sustainable software. Okay. Jonny: If we can then like, get some sort of like IT consortium to help out with like, Jonny: maybe, you know, quote, unquote, public cloud. Jonny: So it's not the case that just like every library needs to have like an on prem IT team.
Jonny: That just like there are some of these things that could like Jonny: lock into place that just could theoretically make some of Jonny: this work but just like that's just not the way academic work is Jonny: done generally and just like that's just not the way it's structured to make Jonny: these sort of like long lasting infrastructural efforts like as you say that Jonny: these are just like grant cycle to grant cycle let's just like ride to the next
Jonny: thing and even within so like part of my role in the last Jonny: six months of work it's like i'm working with actually a lovely group of people who i who i i. Jonny: Like and they have welcomed me and so i'm not trying to speak ill of Jonny: them at all but just like this is a linked open data project and Jonny: basically what i've been trying to do for the last like six months is like pay Jonny: down technical debt we're just like there's this like really good idea
Jonny: of this like this way of having authorable Jonny: linked data schemas doesn't require you to be Jonny: part of the priesthood to be able to describe what exists in Jonny: your reality but it's just like i didn't Jonny: really work it's just sort of like they're Jonny: just like that it's just like the people that are concerned with the Jonny: modeling part about the the like Jonny: what you know what is this kind of thing do we Jonny: put it in this category like this like are not usually
Jonny: the same people who are just like going to be able to write a really good implementation Jonny: of that and so like trying to figure Jonny: out how to make those collaborations happen as well because Jonny: this is another point where like i i don't see Jonny: this as a thing that really could exist or come Jonny: from any sort of startup like rest in Jonny: peace to the solid project which i have been trying to find for several years
Jonny: and I keep seeing little promising scraps of it. But this is, like, so Solid was the thing that Tim Berners-Lee was like, this will be the semantic web, like the thing that we're trying to do. So it's like he has, like, a crisis of conscience, like, actually the web sort of sucks.
Jonny: Like, I think around 2015 and 2016, and, you know, starting to be just like, okay, let's try and make Solid as a way for people to do the more, like, vernacularist dream of the semantic web where I have my... Like, now they're talking about, like, ActivityPods. Like, I have my little unit of my semantic web, like, graph, an information graph.
Jonny: But that quickly got bogged down in the academic cycle. No one could manage a project. Jonny: Then they spun that off into a startup. And wouldn't you know it, Jonny: once that happened, then it became owning your own data was a bug, not a feature. Jonny: And so now you're supposed to be pushed on to renting a cloud server for it and so on and so forth. Jonny: So I think that this doesn't come from startups or from any sort of company.
Jonny: It also doesn't come from the scattered wastes of open source world. Jonny: They're just like, you can't just like ask people to do it for free. Jonny: And it also doesn't come from this like local efforts of like trying to make Jonny: tools for like an individual institution.
Jonny: And so just like what's left is like, you know, we need to use some sort of Jonny: public funding and try and rally public funding in a way that it's not designed Jonny: to be allocated in order to like make these kinds of technologies. Jonny: And also the belief that there should be these technologies in the first place Jonny: in order to make that real.
Jonny: And so like, that's, this is like this unending knot of like, Jonny: who do we, who is the next little thread that we need to pull in order to make this large tapestry?
Jonny: But then like you you're dealing Jonny: with 25 years of baggage at the Jonny: same time so it's like a lot of the people that are still in Jonny: that space either have distanced Jonny: themselves from it and i have and look back on it Jonny: with this chain of mixed emotion mixed emotional Jonny: memories but i don't want to touch that anymore or they're Jonny: like in some way still true believers that just Jonny: like what do you you mean nothing is actually broken it's totally
Jonny: fine and like you just need to learn how Jonny: to do it good and so yeah so like Jonny: like and so this is like one of the reasons why Jonny: i'm just like like we were talking about this earlier today just being like Jonny: that in some ways like talking about like Jonny: serial project abandoners protocol abandoners that just like there Jonny: needs to be like a break in a way that's like backwards compatible we bring
Jonny: the past with us or like or have some way to like carry it through with us but Jonny: we're not beholden by all of this baggage that and and so i don't know like Jonny: talking about just like what happens in the future i guess i don't know if we've even gotten past the.
Jonny: Expository part of what even are we talking about yet but like maybe i'm jumping Jonny: the gun there but like yeah just last last thoughts on that idea is like that's Jonny: another part like the twin, Jonny: entry points for me into this whole line of thinking or just like thinking about Jonny: just like what could be an alternative to scholarly communication and publishing.
Jonny: It just like, it shouldn't be possible for me to throw stuff up on the web and Jonny: then have it be part of this sort of like blob of information without like a Jonny: lot of gatekeepers in the way. Jonny: The other part of it is that it's like, even long before I got interested in Jonny: it, I keep coming across these various like graveyards of things that are just like.
Jonny: This is a really cool idea, like a browser extension that like everywhere I Jonny: go, I can make sort of personal annotations and not just like bookmarks, Jonny: but just like I highlight this section, Jonny: and then I can relate it and share it to my friends like, oh, Jonny: actually, that extension was for like Netscape 6.0. Jonny: And like, was abandoned 20 years ago. And like, no one has thought about this ever since.
Jonny: And just like this long string of just like dead projects that are Jonny: that are exactly like this because again like Jonny: didn't you imagine like the kinds of open source projects that work and Jonny: like are sustainable are usually ones that have some Jonny: material tangible benefit for the Jonny: people that use them day to day like this is a tool i have active use for or Jonny: their baseline behind the scenes infrastructural work that like a lot of companies
Jonny: that will just like sort of rely on them like the but the types of like this Jonny: niche of technology just like what what you have to have in order to use it are a website. Jonny: So that rules out 99% of all people. Jonny: And then be like a website where you are deeply in control of the HTML that goes on that page. Jonny: And that rules out 80% of the remaining 1%.
Jonny: And so like, that just like, there just, yeah, there never was a time when it Jonny: had like an actual practical use. Jonny: And this is something that just like gets called out as early as the The earliest Jonny: I've seen of people saying, what is the point of all this was like in 2005 and 2006, Jonny: where just like there's a series of these blog posts of just like abandoning the semantic web. Jonny: It's like, no one actually figured out why we're doing this at all.
Jonny: Like, there's one interesting example of like music annotation, Jonny: where just like it's sort of like a peer to peer ish music system. Jonny: And then that's it. Like the rest of it is totally pointless.
Jonny: Like why would i ever do this in the first like invest Jonny: all this time into learning these incredibly complicated parts of Jonny: it because like one of the things that we're missing in the exposition stack Jonny: is the exposition section is like the sort of Jonny: stack of things that that the data is Jonny: like you have the triples part which we talked about but then Jonny: you also have like ontologies and schemas and just like the way that these things
Jonny: all sort of relate together. It took me a year to even figure out what these meant and what they look like and why they existed, and just like, why is a schema different than an ontology? That seems like the same sort of thing, but there's, like, different roles in the ecosystem and also definitely different... Just to say that... Dorothea: Why does neither of them have a record constraint language?
Jay: Ontology means that your professor goes on tangents about first-order logic when you're learning it. Dorothea: That's right. Jay: Yeah. Justin: And schemas are on schema.org. Jonny: Exactly. Justin: That's how you know they're schemas. Jay: Also, was the music project you were talking about Linked Jazz? Jonny: I will look up this. It's in this blog post, Abandoning the Semantic Web. I'll see if I can find it. Jay: Linked Jazz rules.
Dorothea: Yeah, that's a great little site. I love it. Jay: That was, like, the first I ever heard of linked data. I was, like, an undergrad still working in a music library, sure, and my, like, mentor, not professor, my mentor, like, boss, was like, this is the coolest thing I've ever seen in my life.
Dorothea: Well, and music in particular in a library context is actually a really wonderfully subversive place for linked data to get a foothold, because MARC for music is so... Jay: So bad. Oh, it's terrible. Music cataloging, like music copyright, is something that even seasoned professionals will not touch. Yeah, music cataloging has its own rules. I mean, heaven. Dorothea: But wow, MARC was just not designed for that, and it shows.
Jay: Oh, it shows. It shows. Yeah. Justin: Back to the explaining part of things as well: one of the main benefits always sold about linked data is that since the web is sort of a page- or document-focused sharing of information, this would allow subsets of information to be pulled, like Jonny said, pulling all the headers from an article with a request. The thing is that, like, without... like, I could pull 9,000,
Justin: I don't know, 500 fields from a MARC record. What do I need that for? Because I don't know anything about the context of it without the full document. Plus, I'm guessing that's probably why it's so computationally heavy: everything has to be done through servers, whereas documents can be retained locally, and it's mostly just text files, right? So it's sort of the same
Justin: problem blockchain had where everything had to be done computationally And that's Justin: why it took 20 minutes to buy a donut because it had to get pushed out to like 20 ledgers. Justin: And instead, this is like, if I want to query information, it has to go through Justin: different servers, which I think was kind of the idea of websites that heal. I have it pulled up. Justin: It's a John Rhodes blog post.
Justin: But when Johnny was talking about bots, I think that was the idea was websites Justin: like link rot would happen between them.
Justin: And eventually bots would just kind of communicate server to server constantly Justin: and then just fix links and they would heal themselves and that was kind of Justin: the idea and that blog post ended with if anyone wants to write this i'll help Justin: but until then but that's the thing is like it's very difficult to do that because Justin: if you've ever worked with like government websites.
Justin: Particularly like healthcare websites every presidential administration stuff moves entire Justin: divisions of the government and so they're on completely different domains and Justin: that's why government websites always break and like really important ones and that's also why the the, Justin: government tends to do a lot of like dot coms now where it's just like healthcare Justin: healthcare.com okay just go there and we'll point it wherever it ends up because
Justin: trying to keep, because I was an allied health librarian, and trying to keep those pages about, like, the Affordable Care Act up to date in LibGuides. I mean, thank God it has a very good link checker, but I constantly had to run that link checker because those things broke all the time.
Jay: They don't even keep their PURLs or whatever it is that they use, because one year in grad school I was the gov docs librarian graduate assistant, and half of my job was just, like, going through SuDoc stuff and then also, like, checking the PURLs or whatever permalink system that government websites and online gov docs use, and just finding all of the broken ones, which was all of them. They don't even maintain their
Jay: permalinks. Yeah. Which is the point of permalinks, so that the back... the URL itself can change. Dorothea: Well, if I... Can I rag on OCLC again? Jay: Yeah, always.
Dorothea: Yeah, that was actually another example that Ruth and I wrote about in our piece, was OCLC and purl.org, which was not originally OCLC's. It was a grassroots little thing for, okay, here's a place where you can mint permalinks, and we'll keep the database of where they went to, and everything will just work, and we'll have a happy permalink utopia. And then, with absolutely no warning, some years after OCLC took
Dorothea: over purl.org and made a very loud statement about how it was very important and they were going to maintain it, definitely, it broke. They broke it. I don't know the details. I think the person who had been maintaining it left, retired, who even knows, but purl.org just completely broke. OCLC, of course, didn't give a fuck, and it remained broken for, like, several years, and now the Internet Archive
Dorothea: eventually took it over, and they don't give a fuck, so you can't actually get any support for it. And a bunch of innocent third parties who believed OCLC's lies and gleefully minted all kinds of PURLs because they thought that infrastructure was going to stick around got burned. Right? This idea that Justin, I believe, was talking about, of self-healing websites. Right, that is nonsense.
Dorothea: That is garbage. The world does not work that way. The world needs maintenance.
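The permalink arrangement Dorothea and Jay describe comes down to a lookup table that somebody has to keep current. The sketch below is a hypothetical illustration, not how purl.org or any real resolver is implemented: the paths, the `example.gov` targets, and the function names are all invented, and the liveness check is a stand-in so the example runs without network access.

```python
# Hypothetical permalink table: stable path -> wherever the document lives today.
purl_table = {
    "/docs/report-2019": "https://agency.example.gov/reports/2019.pdf",
    "/docs/aca-overview": "https://old-division.example.gov/aca",
}


def resolve(purl_path, table):
    """Return the current target for a permalink path, or None if it was never minted."""
    return table.get(purl_path)


def stale_entries(table, is_alive):
    """Return permalink paths whose current target fails the is_alive check."""
    return sorted(path for path, target in table.items() if not is_alive(target))


def fake_is_alive(url):
    """Stand-in for a real HTTP check; pretends the reorganized division's site is gone."""
    return "old-division" not in url


needs_fixing = stale_entries(purl_table, fake_is_alive)
```

Finding the stale entry is the easy, automatable half. Deciding where the document moved to and updating the table is the maintenance work a person has to do, which is the point being made here.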
Jonny: Yeah, and so, like, there's this whole nest of ideas about roads not taken in the internet with a lot of this, because I have the same feeling about permanent IDs as I do in general when I see, like, yet another platform for scholarly communication, or like, we're going to fix the ills of academia by making yet another platform, which is that this is intrinsically a political one
Jonny: where and it puts And it's one where you are putting power in the hands of a Jonny: specific organization that just like, Jonny: and the longevity of that is strictly social. Jonny: We're just like, it's the same way just like permalinks exist as long as the organization exists.
Jonny: And so I have, in general, more faith than average that archive.org will continue to exist in the next year, although they are sort of damaging that reputation lately. Anyway, we won't go there.
Jonny: I think that they have good longevity plans for their archive of the web, okay. And I also in general think that the DOI system is probably not going anywhere, but that's largely because it's one of the mechanisms for extracting billions of dollars from public funding every year. So there are social reasons why these things persist.
Jonny: But there's the major road that was not taken. As you're saying, the reason the web doesn't work in such a way that it would be possible to do self-healing websites or self-healing links is that it's designed to be client to server: you go to a place and get something that someone else controls entirely. And you're not actually supposed to have any agency in this world. And there's good reasons for that, don't get me wrong.
Jonny: But this is one of the true things about linked open data: it needs to be peer to peer. The way that it could conceivably work is as a peer-to-peer system where it's possible to do efficient querying and caching between a bunch of different peers.
Jonny: So it's designed to be distributing labor in this way, instead of, every time someone updates a link or makes a new record, everyone has to go and hit this one server to get this one URI that represents this core concept or whatever. And so as long as that doesn't exist, there's this duality in the beautiful idea of basing semantic web and linked data on URIs.
Jonny: There's the elegant simplicity of this idea that the identifier is actually a location, that location and identity are the same thing. And when I go to that location, I'm supposed to get something useful from it, and then that allows me to go to the next thing. That's a wonderful, wonderful idea. But in reality, it doesn't work at all, because identity and location are not the same thing.
Jonny: One reason is that identities change. There's this classic thing that everyone always references on the web, "Cool URIs don't change." That's another Tim Berners-Lee classic. It's like, actually, all URIs change all the time. And when you have a polemic trying to force something to behave in a way that it doesn't, rather than adapting to the reality of that thing, then yes, you buy yourself an infinite failure.
Jonny: And so, like, one of the, there's this... Raising your hand. Jay: I just want to jump in. Yeah, we do the raise-hand thing so that you can keep going, and then when you're done, Sadie will say something.
Jonny: But also just, like, interrupt. I actually would start trying to make some notes to organize this thought, 'cause this is a long idea. So, like, yeah. Sadie: Oh, I've been thinking a lot about "the purpose of a system is what it does." Jonny: Completely. Sadie: Right. Not what it was designed to do, because we all know how design goes awry. But yeah, the purpose of a system is what it does. Dorothea: Right on.
Sadie: I don't remember where I saw that. I love systems theory. Jay: Yeah, right. Sadie: Right. Jonny: So if anyone has ever maintained a website or any sort of web technology: if the intention of this thing is to be liberating and freeing, it certainly doesn't feel that way. You know what it would take to actually maintain a URL forever.
Jonny: Like, if that's the way the web is supposed to be, that the purpose of the web is to put these documents on the web, it doesn't do that. So yeah, exactly.
Jonny: The purpose of the system is different. And again, thinking about all the ways that technical development has been stunted by the commercialization of the web, which precluded a lot of these things from existing: it's not an accident.
Jonny: So one of the ways that linked data is working en masse right now, in a pretty invisible way, is the fediverse. This is what we were talking about the last time I was on here, so I won't belabor the point, but that's built on linked data, at least in the abstract, and this is a sort of fascinating realization of that. For example, Mastodon, the largest implementation of that, does not actually use linked data as its internal data model.
Jonny: That's all a Postgres database that it then sort of synthesizes JSON-LD out of. There's benefits and trade-offs, where as a result it sort of doesn't do all of the linked data parts of what ActivityPub was supposed to do. But one other major alternative to this is Pleroma and Akkoma, the fork of Pleroma that is based on a graph database.
Jonny: And that can do a bunch of really interesting things. But it also is, like, always crashing all the time and sort of hard to work with, too. Because social networks are networks, they're easily modeled by a graph. And so there's this notion of containers and ordered collections and stuff like that in ActivityPub.
Jonny: And I have, you know, obviously lots of feelings about this particular spec, but one of them is this notion of who I'm addressing my message to. I should be able to address it to whoever I want to. I can address it to this one controlled ontology term, Public, and that's just "I'm sending it to the world." But it should also be possible for me to have collections of people, and I can address it to this collection of people.
Jonny: In that way I have a graph, and all the relationships are modeled within ActivityPub as being "I'm allowed to send it to these people, and I want to send it to this subset of them in this particular case." And so you can do stuff like that in Akkoma and Pleroma. The UI for it is a little less than what could be desired. But that's not something you can do in Mastodon, where each one of those addressing features has to be carefully architected as a database query.
Jonny: So there's this tension: okay, we try and do it the semantic web way, which has the beautiful possibilities, but it's really hard to implement. And one of the things that's hardest, that was an extremely big reach and was really only made to work by the sheer hegemony of Mastodon as the thing that, if it does something, everyone else has to adapt around it, is implementing editing. I have a post, I want to edit that post, and that means I have to propagate that new version out to everybody else.
Jonny: And so, thinking about what it would take to have these sort of self-healing websites, or just the ability for the web to adapt to change: you need to have that expectation that, for everything that I know about, I should be able to receive changes and be able to propagate those among the people, in the same way that rumors and horizontal information transfer work generally.
Jonny: Oh, I heard that this new thing happened, and I tell my friends about it. And doing so in a way that's actually safe and that is resistant to counterfeiting is a remarkably hard thing to retrofit into a system. Jay: How do we make the web actually rhizomatic?
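An editorial sketch of the addressing and editing ideas Jonny describes, assuming ActivityStreams-style `to`/`cc` fields and the special Public collection IRI from the ActivityStreams vocabulary. The function names, the actor and collection IRIs, and the `collections` lookup table are hypothetical illustrations; this is not code from Mastodon, Pleroma, or Akkoma, and it ignores signing and delivery entirely.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def expand_recipients(activity, collections):
    """Return the set of IRIs an activity is addressed to.

    `collections` maps a collection IRI to its member actor IRIs
    (a hypothetical local lookup). Public is kept as a marker.
    """
    recipients = set()
    for field_name in ("to", "cc"):
        for target in activity.get(field_name, []):
            if target in collections:
                # Addressing a collection means addressing its members.
                recipients.update(collections[target])
            else:
                # A single actor, or the special Public collection.
                recipients.add(target)
    return recipients


def make_update(note, new_content):
    """Wrap an edited copy of `note` in an Update activity.

    The edit goes to the same audience as the original, which is the
    propagation step described above.
    """
    edited = dict(note, content=new_content)
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "actor": note["attributedTo"],
        "object": edited,
        "to": list(note.get("to", [])),
        "cc": list(note.get("cc", [])),
    }
```

Every server that received the original has to accept and apply the `Update` for the edit to actually take effect everywhere, which is the hard, social part.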
Jonny: Yeah. And again, it goes back to the dawn of the web browser and what it is as a technology: this idea of the read-write web, where it should be just as easy to write as it is to read on the web, obviously controlled by permissions in some way. That experiment died, basically, when Netscape won in the early browser wars, but then it persisted in the form of wikis and this notion of soft security.
Jonny: How do we make that work? We allow stuff to happen, but then make it so it can't damage the system in some profound way. If someone does something they're not supposed to do, you know, someone goes and vandalizes a Wikipedia page or whatever, then sure, the next person that goes and loads that page might see a bunch of vandalism, and that's bad. But it doesn't ruin the page; it doesn't break it forever and completely. It's possible for me to revert to the old version of it, and so on and so forth.
Jonny: And that's a radically different political vision than most of the web stack that we're familiar with.
Jonny: Ultimately, for this technology to work, it needs to be constructed on a different set of political primitives that include other people existing and being able to do stuff, in a way that is very uncomfortable for most of the people who design web technology nowadays, who think of it as "I'm going to design a platform that I administer for other people."
Jonny: And so instead, think about it as being stuff that is designed so you get out of the way. The most successful technology that would enable semantic web stuff is one that no longer requires the developer to be there and allows people to actually have autonomy on computers. But again, there's no percentage in that; it's in fact anti-profitable. And so it's a very difficult thing to organize that kind of not only technical vision, but social vision as well.
Jonny: Yeah. I always end up back in wiki world. It's some of the most lovely parts of the web, as far as I'm concerned. I'm still curious if I can find this linked data music project, 'cause that also is something I'm interested in. Oh, so, I don't know. I feel like the thing I think about is survivable web technology.
Jonny: I always return to pirate networks as being the things that can exist and do survive on the web. What are the longest-lived things on the internet? It's the W3C website; they just sort of win. But other than that, pirate networks are the other major answer. Some of those MP3s that were released on Kazaa or something like that are still floating around.
Jonny: Compare that to the extreme adversarial conditions, with the entire global intellectual property regime bearing down on you, and still it happens. Why does that work? To some degree it's a technological question, but it's also a social question: it's because people take it as their responsibility. I see myself as an active participant in this system, and so when my pirate site gets shut down, I go to the next one and put everything back up.
Jonny: So, yeah, anyway, you've got to love the pirates, although there's a huge amount of power and political problems in those circles as well. Jay: Librarians need to read that how-to-form-an-affinity-group zine and go from there, see what happens. Justin: I mean, I was.
Dorothea: It's as likely to work as anything, really. Justin: Yeah. I think one of the practical reasons linked open data is always difficult is that all files are local files, in the same way that all history is local history, because it's always local to somewhere.
Justin: Anytime I try and think of, you know, particularly when you mentioned EADs, there used to be a lot of stuff in the EAD literature about, like, why does no one share their local authority files? Like, you know, John Fox Smith donated to the library, and we have his name authority file in, like, our decks, but he doesn't have a Library of Congress name authority because he wasn't famous enough.
Justin: Right. So everyone's got theirs. Right. He just had a bunch of money. And so we have all of these people who are local, in our local name authority files, and they never, ever get shared, and they always stay siloed.
Justin: And yes, there is almost no solution to it, because of the amount of labor it would take to disambiguate the names of people who have common names, and, you know, is this the same person? And then who's going to do it, too? Because they barely have enough staff in special collections anyway, so who cares if every local donor is going to get their own name authority file?
Justin: And I think another thing is, like Jonny mentioned, having the way Jonny uses a word have to go to a URI. It's kind of like when we were talking about taxonomy last week (that episode hasn't come out yet), sort of like the issues with taxonomy for animals and everything. You need smaller sets of words, not bigger ones, in order to actually make it useful for humans.
Justin: So when I was working with the bird working group, it was like, everyone keeps using too many different words. All we need to solve this problem is a short list, and then we can use that as user-submitted metadata and tags. That's really all we need: to agree between us humans that we're going to use the word paleo-ornithology instead of archaeo-ornithology.
Justin: And that's all we had to do, kind of get people to agree to that. There's not really a technical solution, because, you know, the entire bird working group of paleo-ornithologists is, like, if they were all on a boat and it sank, there wouldn't be a bird working group. Right. So it's not an impossible political solution.
Justin: And what I always keep thinking about is, we have all these documents. Yeah. And it would be nice to break things up into data and share it as linked data. But as an organization, you don't really need to, depending on the size and scale. And so that's why so many libraries have their own.
Justin: When I think of how a library is organized, ultimately, you know, the reason why MARC is like that is its access points. It's what we always default back to: what's the access point for this?
Justin: And I don't really care, semantically, how the data works, as long as, like, this is a subject area, this is the title, this is the author. How do I get to the information in the quickest possible steps? And I feel like that's where the disconnect has always been for me with linked open data: when is this going to help my users in my library?
Justin: It's like, well, you can get stuff out into the world. And it's easy for me as a scholcomm person, because I'm the only person who's like, no, I want this out everywhere in the world. I want everyone to look at this. But everything else in the library is categorically organized around how people in here find the stuff that they're looking for.
Justin: And I'm the only one who has to flip that and try and say, how do we get what's in here out to the world with no barriers and restrictions and logins?
Jay: Yeah. Was it last year, maybe a couple of years ago, I was part of the PCC ad hoc group that put out the final decision about, like, hey, maybe don't put gender in name authority files. Because there was the initial one, and then a lot of people got mad at that one, and then I was part of the ad hoc "hey, let's revisit this, thank you for your service" group. And one of the final sticking points...
Jay: Because most of us were on board with, like, maybe let's just don't. It's too complicated to think of any consistent language, any ways to do this ethically that are not going to hurt people, and trans people were mainly who we were thinking of. But, like, there's other reasons why you might put gender. Some of the reasons were like, "But with Asian names, sometimes it's hard to disambiguate." And I'm like, that's racist.
Jay: Like, that's just lazy and racist. But the big one, the final sticking point, where we were like, maybe there's a point here, but ultimately no, we don't care, was: well, in a linked data environment, people could query books about XYZ written by trans authors. For example, you can do a SPARQL query with Wikidata where you pull all of the towns that currently have female mayors, or whatever is usually the example that they use when they tell you what SPARQL can do with Wikidata.
Jay: Like, what if you could do that with a library catalog? Whoa. And we had to be like, yeah, but no. Discovery layers like Primo don't even do that yet. No discovery layer right now that's popularly used by academic or public libraries has that capability. They might have linked data in the records, and they might have APIs exposed if you have a developer who can do neat shit, but ultimately that's not how those searches work right now.
Jay: So maybe it is available in the future, but for right now we don't care, and that's not the purpose of name authority files. So, right, yeah, like.
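An editorial sketch of the kind of query Jay is referring to. The SPARQL below is modeled on the well-known Wikidata "cities with female mayors" example and assumes the usual Wikidata identifiers (P31 instance of, P279 subclass of, P6 head of government, P21 sex or gender, P582 end time, Q515 city, Q6581072 female). The helper name `build_request_url` is hypothetical; the code only builds the request URL for the public Wikidata query service and does not send it.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

FEMALE_MAYORS_QUERY = """
SELECT DISTINCT ?city ?cityLabel ?mayor ?mayorLabel WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .              # instance of (a subclass of) city
  ?city p:P6 ?statement .                        # head-of-government statement
  ?statement ps:P6 ?mayor .
  ?mayor wdt:P21 wd:Q6581072 .                   # sex or gender: female
  FILTER NOT EXISTS { ?statement pq:P582 ?end }  # no end date, so still in office
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def build_request_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL asking the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})
```

The query only works because someone recorded a gender statement on each person's item, which is exactly the data the ad hoc group decided name authority files should not carry.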
Jonny: The question of, like, what is it for, what is the point of it, why would I do it if there's no use, is also ultimately really just beliefs about how things are supposed to be designed.
Jonny: Is the goal of it to be able to get an exhaustive and true answer of all of the, you know, cities that have a woman as a mayor? Is that the point of what we should be doing with semantic web, to make the correct information exist in a unified vocabulary?
Jonny: And, spoiler alert, I don't think so. Well, because there's no such thing as the authoritative and complete true archive of all knowledge.
Jonny: But it's also just, thinking about it, well, that's an impressive technical feat that I could put on some sort of tech specs document, that my query engine can produce 10 billion triples in one second. But, yeah, what's the point of that? And thinking about it in the context of language, it's also related to the notion of ontology curation: how do we come to know the terms that are the one term to use?
Jonny: That's only an important question if the goal is to make everything be totally uniform, and also if the act of searching is relatively precious and hard to do, and I can only do one of these or something like that, so that this is not an iterative process of exploration.
Jonny: The thing about the way that this works with language is that it doesn't ever work that way with language. Say a new phenomenon exists in the world: "We need to get the council of languages together to agree on the one word for that, and then everyone from then on has to agree to only use that word to refer to that phenomenon." That's never how it has happened, and it never will be.
Jonny: Instead there's this sort of local interpretation of what's happening in my immediate reality. You try and use this word, and is it effective when I say it in this way? "Oh, what I'm talking about is this."
Jonny: "Oh, I know it as this." It's this sort of negotiation over what things mean, and in what context, and to whom. And being able to have your personal vocabulary and ontology, as your history of your browsing: I've come to know that these terms are the same terms, or, when I am in this neighborhood of semantic space, I use this word instead of this word.
Jonny: Then you can imagine the collective power of something like that. Okay, all of my friends know these words as being the same. And so in general I can ask around and say, "I'm looking for this. Does anyone know how I would refer to that?" And being able to make sense of it as an iterative and a social and an interactive process.
Jonny: Not one that's done once as if it were a database query with a very controlled database schema that's known in advance.
Jonny: And so it changes our expectations for what technology should look like. I don't go to the vast impersonal search engine that indexes the whole web; instead I have to actively cultivate a set of nodes and friends and relationships and prior acquaintances with this kind of thing, and then expect it to take a little bit of time to find stuff.
Jonny: That sounds sort of counterintuitive. I'm not saying it to create exclusion or create inefficiency. But the goal of the system isn't to produce maximally true, maximally numerous, and maximally cleanly organized data all the time.
Jonny: Just talking about why everybody doesn't share their local... actually, I'm not familiar with this term, authority file. I assume that's, like, a local reference, like subject... Jay: Headings. Or, like, if you publish a book, your name, how that's in the Library of Congress: it's an authority file.
Jonny: Gotcha. Yeah, it's also just, who gets to do that? It's the same problem as libraries and museums being the sites of pillaged cultural artifacts. It's sort of not your job and not your role to be the purveyor of this information; it's about this person. And it becomes your role because they have no means of doing so themselves.
Jonny: These systems aren't ones that can be touched by the average person. I can't deposit a book myself in the Library of Congress.
Jonny: I need some intermediary force. And so there's another part of why it doesn't happen and why it doesn't work, which is, on the other end, who is it for, and should we even do that at all? Same thing with what happens when you need to change your deadname in all the bibliometric records. How does that happen?
Jonny: Yeah, I freak all my software friends out when I talk about eventually needing to write the anti-performance performance manifesto. Someone who is a friend on the fediverse, and we talk all the time, is just sort of horrified: "What do you mean? Software should be delightful to run." And, yeah, that's not exactly what I'm referring to, though.
Jonny: It's "we need to get page load time down to two milliseconds or life will be lost and meaningless as we know it," as a set of ideological commitments rather than making stuff usable by people, that is the thing I'm talking about. Oh my god, I'm opening this. You have an authority file. You have an official URI. Jay: I do. I have a URI, and I'm part of the problem.
Jonny: We all have many URIs, yeah. Jay: I helped write a book in, like, 2018, during my first job.
Jonny: Hell yeah. So one of the interesting things that I think Bluesky and the AT Protocol have done is make it so that domains are sort of meaningful as identity. Yeah, that's cool. I have a domain, and control over a domain, and that gives me a source of identity even if it doesn't give me control over the computers that host the thing. You know, whatever, we can talk about that a different time.
Jonny: But it's very interesting that that has resurged and is actually genuinely useful. And I think one of the best ideas to come out of it is actually using those, you know, URIs and URLs, as just, literally, this can be my name.
Dorothea: Yeah. Because it's language independent, human language independent. And things like deadnaming, which we have to deal with in the authority file environment because it is predicated on names: if it's just a URI, you don't have to do that. You can attach any name you want to it.
Jay: So there's definitely, that's the good thing about URIs: it allows the flexibility for trans names or any other kind of name that might change. Absolutely, that's the good part about them. Love URIs. Dorothea: That's one thing that I want to keep out of all this nonsense: URIs as identifiers was genuinely a clever and useful idea.
Jay: Yeah, it was a big deal when the Homosaurus moved from having the terms be the URIs to having alphanumeric URIs, so that we could change terms as language use changed. Dorothea: Yeah, love it. Justin: Did they ever tell you, don't put semantic information into a URI? Everyone does it. It's so stupid. Jay: We're queer. We don't listen. Fuck you. Um, doi.org, doi.org. Justin: Slash my journal, volume one, and.
Dorothea: It's like, yeah. If you ever meet Geoffrey Bilder, who's a wonder, he works at Crossref, wonderful human being, he has many, many, many rants about publishers coming to Crossref wanting to change a DOI prefix because they merged with another publisher, or a journal changed publishers, or whatever the hell. And he's like, no, that's not the point.
Justin: They have a suffix generator now. It's literally just a spreadsheet that generates a suffix, but they're like, use this, idiots. Yes. Jay: Please. Is that, like, half your job, Justin, just being like... Justin: No, I mean, I don't mint DOIs manually, usually. But the thing that always bugged me was OJS used to put semantic information into the automated strings that it would create.
Justin: So it would say, like, V and then the volume number, and then N and then the article number. And I was like, don't do that. Just put random numbers. Just a random number generator, that's all you need to do. But they didn't do it until the latest update. So now they do it properly. Dorothea: Or you can do what every single baby relational database administrator knows to do, and just count.
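An editorial sketch of the "no semantics in the identifier" idea from the last few turns. The `Vocabulary` class, the `mint_suffix` helper, the alphabet, the example terms, and the base URI are all hypothetical; this is not the Homosaurus data model, the Crossref suffix generator, or OJS behavior. It only shows that when the identifier is an opaque string, the human-readable label can change without the URI changing.

```python
import secrets

# Lowercase letters and digits, minus look-alikes such as 0/o and 1/l/i.
ALPHABET = "abcdefghjkmnpqrstuvwxyz23456789"


def mint_suffix(length=8):
    """Return a random opaque suffix carrying no meaning about the item."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class Vocabulary:
    """Toy vocabulary: opaque URI -> current preferred label."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.labels = {}

    def add_term(self, label):
        suffix = mint_suffix()
        while suffix in self.labels:  # re-roll on the (unlikely) collision
            suffix = mint_suffix()
        self.labels[suffix] = label
        return self.base_uri + suffix

    def relabel(self, uri, new_label):
        # The URI stays put; only the label attached to it changes.
        suffix = uri[len(self.base_uri):]
        if suffix not in self.labels:
            raise KeyError(uri)
        self.labels[suffix] = new_label

    def label_for(self, uri):
        return self.labels[uri[len(self.base_uri):]]
```

Dorothea's alternative, a plain counter, gives the same stability; the trade-off is that counters can be enumerated by anyone who guesses the pattern.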
Jay: I don't know how to count. I'm gay, as we've learned from the Homosaurus. Justin: I do have an Excel sheet of... manuscripts and databases, and it's just zero zero zero zero zero. What is this? Yeah. Jay: What happens when you go beyond the capacity for how many zeros you picked with them? Justin: Doesn't matter.
Jonny: Okay, and all of these things have their times and applications and usages and everything like that. Just do all of them, and make them all point to, you know, the same thing, different things, et cetera. Because I think, you know, sequential numbering identifiers work. There are times when you don't want to use it, like when you have potentially personally identifying information, where you don't want someone to be able to enumerate over all possible things and find all the stuff on the server.
Jonny: And, spoiler alert, university IT does a terrible job at this, and frequently will just have very sensitive documents hanging out that can be publicly enumerated on their public web.
Jonny: But, you know, it's super useful when designing some systems, in the same way that having totally anonymous strings is super useful in PID space. But then you want to have semantic URIs in some other context. Just do all of these things. And the other one is content hashing, where the identifier is intrinsically based on the content of the thing.
Jonny: So if I have the thing, I know what it would be called everywhere in the world. That has its own benefits and trade-offs. That is one of those dangerous ideological territories where you get pirates and also cryptocurrency zealots in the same room, and it becomes this maelstrom of the same idea meaning completely different things to different people.
Jonny: But, yeah, we're not going to solve the identification problem. Basically, it's the rigidity, and being only able to use one thing, that is the problem to me. Justin: Yeah. Now, I don't have a Library of Congress name authority file, though.
Justin: Someone from Florida with my name born same year as me does, which is confusing. Justin: There's so many people with my I went to high school with someone with my name. Justin: It's very confusing. It doesn't seem like it should be that common. Jonny: It makes you harder to dox, though, so that's like passive self-defense. Justin: It is really good. I have successfully scrubbed my information off the web several times.
Justin: It's not hard. Or one time I couldn't do it, so I just redirected it to another dude with my name. Justin: And so I just changed my information to, I changed my address to his.
Jonny: And I feel like this would be something that Dorothea would probably have stronger thoughts about. It's the notion of privacy when it comes to linked open data and stuff like that, the fact that we don't want all the world's information to be public. We don't want the Justin authority record that includes your, you know, Social Security number and phone number and everything like that. Like, limits to openness. There needs to be some amount of fungibility. And yeah.
Dorothea: I'll actually give you a real-world example. If you go and look at my Wikidata page, and you can just go to wikidata.org and look up Dorothea Salo, I'm the only one as far as I know that has ever existed, so what you find will be me. Although I'm cis female, that is how I identify, that's who I am, my Wikidata page actually says no gender, no gender recorded. And the reason for that is that Wikipedia, with which I have a very vexed relationship, runs through Wikidata every now and again to do things like make lists of people who maybe should have Wikipedia entries but don't. And of course they do this for minoritized and underrepresented populations, and of course Wikipedia is well known for having a huge gender problem, a gender disparity coverage problem. So I get sucked up into those lists, and nobody asked me. I do not actually want a Wikipedia page, thank you very much, and I would rather not be on one. So I changed my gender that is listed on Wikidata. I did not actually change my gender.
Jonny: That's dope, like anti-bot action. Like, you just, like...
Dorothea: Yeah, it did seem to be the only option for saying, no, don't make me a Wikipedia entry.
Sadie: Trans for the privacy of it.
Dorothea: Pretty much.
Justin: Gender opsec.
Jay: My gender is fuck off.
Justin: Get this gender working for me. Yeah, no, that's why I also like ORCID iDs too, because it's a very nice system that you get to control, and you get to write your name how you want it. You can write it in multiple scripts, and it's just an ORCID, and it just will point to whatever you tell it, so you can change it whenever you want. That's what I really like about it. You know, that would be something that would be very nice to use for local archiving and stuff like that. But the thing is, no one's going to bother to do that. Nerds will do that, but I couldn't even get faculty to do it, even when this would save them time in the long run, or it would solve headaches. Like, if they have a double-barrel first name and people keep putting their second first name as their last name, it would solve this problem for them. But, you know, they don't go sign up for an ORCID.
Jay: I was actually, when I was cited in the ethics in name authority files book, one of the chapters, and they asked how I wanted to be cited, I was like, I would like my ORCID. Because they were citing one of my articles or my thesis or something, they had my dead name on it, and I was like, I want you to do it this way, and I want you to have my ORCID in there so that it's collocated, like it properly links back to all of my stuff, right? And I think it was Brie, actually, who then went on to write an article and talk about how I asked to be cited in that book, as, like, using ORCIDs and URIs and linked data as a way to help trans people who maybe have published under dead names, if they don't want to go back and ask for it to be changed, which I don't. But this way I can have people cite me and just use my first initial, and it points back to my current stuff and everything I've done with my current name, while also still being like, but I'm also the person that wrote that. Yeah, it's not that hard.
Justin: Yeah, especially if you use initials, because I use my initial a lot, because I do have a very common name. But I used to write my full middle name and I don't do that anymore, so it's nice to be able to be like, okay, I published my thesis with my full name, but now I only like using my middle initial. Yeah, and now I'm at an institution where I'm the only one of me, so I don't even have a number after my name. I was very excited when I got my email assigned to me, because there is now someone else at my university with my name, so there is like a zero one now, and I'm like, ah, finally got there first. I used to get detention because some dude had my name.
Jonny: Are you serious?
Justin: I'd get his detention, yeah. They used to put out a roll with the names at the beginning of the period. Teachers had to check them, and if you were on the list you had to go to the cafeteria. So I kept getting called into the cafeteria because it wouldn't disambiguate my name.
Sadie: I had that happen to me too. My birth last name, I changed my last name when I got married, my birth last name is Johnson. So not only are there 70 billion S. Johnsons out there, but I have a cousin who has almost the same exact name. We were born almost the same exact person, practically, right? We have the same first name, same last name. Neither of us use our middle name, right? Yeah. So I got told I was supposed to go to detention a couple of times in high school because there was another person with my name. It's common.
Jonny: But that's, you know, free bad-kid social currency. Just like, hell yeah, I'm going to detention, baby. You don't even have to do it, so you get the best of both worlds.
Justin: Well, I used that. What they said to me was, well, if they don't put your middle initial, it's not you. And I used that excuse for the next four years, even though that dude was a senior when I was a freshman.
Jonny: Said no middle initial.
Justin: It's not me.
Jonny: It's like...
Sadie: Can't make me do it.
Jonny: That's just social engineering, you know, just in the real world. People just intuitively do it.
Sadie: There is no difference between social engineering and con artistry.
Jonny: Hell yeah. Yeah.
Sadie: I will die on that hill.
Jonny: Yeah, a good friend of mine is having a crisis of direction in life, and I'm like, okay, so your strengths: you are super good at infiltrating unfriendly organizations and groups of people and taking on roles and shit, and did you know that that is a job? And so I'm trying to turn this person on to it. Totally a job. And a lot of the people that do it sort of accidentally find themselves, you know, seeing it the first time, like, holy shit, you can do that, and then suddenly becoming really good at it. Anyway.
Sadie: I feel like the alternate of that fork is improv comedian.
Jonny: Their...
Jay: Their true destiny is they just become podcasters. Improv people are good at doing podcasts. All my favorite podcasts, I've learned the people did improv.
Sadie: I have no idea what I'm doing here.
Jay: Yeah.
Jonny: We did improv that one episode? What, you did improv games, or what are you talking about?
Justin: We had Srsly Wrong on. We did skits, and those were...
Jonny: Oh yeah.
Justin: I dipped. I was bad at it.
Jonny: We were very...
Justin: Bad at it, but they're very good at editing.
Jay: They're so good at editing, my god. When I finally listened to the episode I was like, oh, wow, they made something out of this. Yes.
Justin: But yeah, the only thing that we didn't mention that I wanted to maybe mention is kind of what we talked about last time, which was that whoever controls the nodes of a graph can control the graph. And so I was also thinking about that as a security problem with linked open data. You know, when we were talking about all of the privatization happening, if someone buys a certain node of the graph, then it's the same problem Sadie was describing with everyone having their own API. If you're controlling this graph, even though it's open, and you control the write permissions, then, I don't know, I assume that's a problem that's going on. Because OCLC has Meridian now, and I assume that it only exists because it will make money.
Jonny: If you control...
Jay: The spice, you control the universe.
Jonny: Yeah. Is that an animal?
Justin: This is a very cranky...
Jonny: And just, like, desirous animal. It's like, my turn. I'm sure I haven't heard about this, this Meridian thing. Today was the first time I heard about it. It says May 2024. I assume it's, is it that new?
Dorothea: I hadn't known about it until today either, for what it's worth.
Jay: OCLC just loves to do shit.
Justin: Our metadata librarian at my job is currently on a committee for, I think, what is the organization, the Program for Cooperative Cataloging, and they're working on a task group for URIs in MARC implementation. So I guess they're going to have separate types of handle-based permalinks or something, I don't know, that are going to be in MARC. But they were also talking about how they had a demonstration of Meridian. And I think it's just the linked data they've made out of WorldCat.
Jonny: So they're using an entry for Octavia Butler as the demo data. And I'm like, that's an interesting person and body of work to evoke in your corporate platform. That's just like, yeah.
Justin: The don't-build-this-machine.
Jonny: Yeah.
Sadie: The Torment Nexus.
Dorothea: Thank you. Don't create the Torment Nexus.
Sadie: Wouldn't it be terrible if we created the Torment Nexus? Creates the Torment Nexus anyways.
Dorothea: So here's a guess. And this is totally off the cuff, because again, I only heard about this today. I think it is clear to OCLC that their WorldCat monopoly is not long for this world, one way or another. Whether it's a customer revolt or we finally find a way to do this with linked data without getting sued out of existence, that's not going to last. So how can OCLC come up with a linked data store that they can fence around and limit to their customers, the same way that they've done with WorldCat? That's what I think Meridian is.
Justin: Probably.
Jonny: Probably. I mean, as you're saying, they're doing it because it makes money somehow, and I think that's a pretty good bet. And it's continuous with the way that the rest of linked open data has worked. That's what Wikidata is, to some degree: it's basically a captive labor pool. Like, who funds Wikidata is largely Google.
Jonny: And so Google bought Freebase, the predecessor to it. You know, they did their attempts at cleaning it up and everything like that, and then basically shunted that into Wikidata, and they profit from it immensely by it being a clean, corporate-friendly, like, there's no swearing on Wikidata, you know, way of concentrating a bunch of labor so that then they can mine it and make derivative profits from it. And the people that work on Wikidata are genuinely true believers in the beneficence of cataloging the world's data. They're not corporate stooges. They view themselves as being like, we're just trying to do the same mission as Wikipedia, which is, yeah, make a global information store, but not really evaluating the, like, why would Google want us to do this? And so that sort of peer production as captive labor model is one of those biggest red-pilling moments for information people: what if it's actually bad to have these sort of crowdsourced information platforms?
Jonny: So when we were watching Lo and Behold, it had one of the examples of the beauty of the internet. And again, every time I think about this, this is a movie that was released in 2016, which is not that long ago, and yet it feels like a completely different universe. This is one of the promising things about it, where you had this chemical reaction crowdsourced thing, where the wisdom of the crowds, lots of people playing this game about protein folding or whatever, was able to do something that, you know, the best scientists in the world couldn't do. And it's just like, cool, but were any of those people on the paper that got published from all of that work? If it's just a thing where you farm out other people's labor and time, or in this case farm out all of the cataloging labor that happens in libraries into curating this collection of information, in the same way that, I don't know the politics of WorldCat, I assume it's a similar kind of way where everyone is required to use this but we don't actually have much control over it, kind of thing. Yeah, that is a massive extraction vector sort of hiding in plain sight under the guise of pro-social technologies.
Justin: Yeah, and this is probably more of the same, which is to make that data usable and useful to AI products, I would assume. Particularly, it's interesting that they mentioned incorporating ORCID and ROR, which are scholcomm-specific things, really. Especially ROR is a weird one to throw in there, because that's research organizations, right? To make sure that those are disambiguated, because journals are really, really bad at disambiguating, like, the biology department of this university, because departments change all the time and also people abbreviate them. So there's no one identity, and that causes all kinds of problems, even just getting the university right. Half the time it's wrong. So ROR is kind of like ORCID for organizations. And so that's a very specific thing, and I find that very strange. Like, do they want regular cataloging librarians to fix the scholcomm metadata problems that are out there?
Dorothea: They do. Like OAIster.
Justin: Yeah, that, like, Clarivate can fix.
Dorothea: Scooped that up back in the day. What's that? Oh, it was a union search engine for institutional and sometimes disciplinary repositories, is what it was. Basically, there were always problems with it. But the problems go back to OAI-PMH being complete garbage, such that one of the things it does not allow you to say is, is there a full text associated with this item? And so one of the reasons OAIster became completely useless is that it was choked with metadata-only records, which really disappointed end users because they couldn't click on it and get to the thing. Right.
Jonny: And that's definitely why I auto-embed Sci-Hub links in all of my writing, because it's just like, what use is it to someone else for me to cite something if they can't actually see it?
Justin: I wonder how they scrape the full text information now when stuff gets pulled from OAI-PMH, because it still does. Because OAI-PMH is how we push out to CORE, but it definitely does know if we've got full text.
Dorothea: I have to think they implemented a check, which is fascinating because they would have had to implement such a check for pretty much every single repository and repository design in existence.
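A rough sketch of the kind of check being guessed at here, assuming a harvester that receives Dublin Core (`oai_dc`) records over OAI-PMH. Since the record has no field that says whether full text exists, the hypothetical `looks_like_full_text` helper just scans `dc:identifier` values for a URL ending in `.pdf`. The two sample records, the `repository.example.edu` URLs, and the heuristic itself are invented for illustration; nothing here reflects how CORE or any real aggregator actually does it.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the oai_dc container and the Dublin Core elements.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample: a record that happens to expose a file URL.
WITH_PDF = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>A thesis</dc:title>
  <dc:identifier>https://repository.example.edu/handle/1234/5678</dc:identifier>
  <dc:identifier>https://repository.example.edu/bitstream/1234/5678/1/thesis.pdf</dc:identifier>
</oai_dc:dc>"""

# Invented sample: a metadata-only record with just a landing page.
METADATA_ONLY = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>A citation with no file attached</dc:title>
  <dc:identifier>https://repository.example.edu/handle/1234/9999</dc:identifier>
</oai_dc:dc>"""


def looks_like_full_text(record_xml: str) -> bool:
    """Guess whether a record points at a file by looking for a .pdf URL."""
    root = ET.fromstring(record_xml)
    for elem in root.findall("dc:identifier", NS):
        value = (elem.text or "").strip().lower()
        if value.startswith(("http://", "https://")) and value.endswith(".pdf"):
            return True
    return False


if __name__ == "__main__":
    print(looks_like_full_text(WITH_PDF))
    print(looks_like_full_text(METADATA_ONLY))
```

A guess like this is brittle by design: a repository that serves files from URLs without a `.pdf` suffix, or that only lists the landing page, would be misclassified, which is why a real harvester would need per-platform rules.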
Dorothea: Like, you're literally looking for a link that says PDF or something.
Justin: Yeah.
Dorothea: Wow. All because Herbert Van de Sompel is complete crap at building protocols and things that will be useful and last. All right. I said the name.
Jonny: This is obscure beef.
Dorothea: Oh, you know, Herbert Van de Sompel, when I say serial project abandoner, he is the paradigm example. He totally did that with OAI-PMH. He totally did it with Memento. There are probably six other projects of his that I could also... Right? Memento.
Justin: Remember Memento?
Dorothea: Yeah. And I'm just like, funders, stop giving this guy money. It never turns out well.
Justin: We got more obscure beef than a wagyu farm. Heck yeah. Don't look at me like that.
Jay: I'll look at you however I want to.
Justin: All right, I was very proud of that.
Sadie: It's good.
Justin: Well done. Thank you. I think we should wrap up.
Jonny: Yeah, yes. I've got sleepy bitch disease. Did we clarify what the hell's going on, or still cloudy?
Sadie: I think I've got a pretty good gist, actually. And you know what, knowing the beef actually helps. It does. So good.
Dorothea: And you know, I do teach this stuff, Sadie. You know my email address. You can totally ask me questions.
Sadie: That's true, yeah. That's true.
Jonny: And one of the things I have come to love in this world, you know, the few things that you can love in it, is that every time you get close to something, you realize that it's all just people. All these things that are these immutable features of the world, one day you might just come face to face with, like, oh, that was you. And then all of a sudden it makes sense, where it's like, I get why it is that way. Knowing the beef and knowing the people is the way to know the thing. Yep.
Sadie: It all makes sense now.
Dorothea: Oh, glad to hear it. Thanks, y'all. I, as always, love being on the podcast.
Justin: Yeah, oh, thank you so much for coming on. Yeah, thanks, and I'm glad we got to do this.
Jonny: Yep, yes, good to see you yet again. Let's find time to watch a movie sometime soon. It's been a while.
Justin: Oh yeah, I need to do more movies in the Discord, which I was about to plug, because Dorothea, you've also been answering questions in the Discord. It's very helpful, and we appreciate it.
Jay: It's just us shitposting and you being helpful, yeah.
Dorothea: Well, I mean, you know, that's the inverse of the way it usually is. Everybody else is being helpful, and I'm shitposting. So, hey!
Sadie: Even the score.
Justin: Good night.
