
135 - The Once and Future Linked Open Data feat. Dorothea and Jonny

Aug 28, 2024 · 1 hr 56 min · Ep. 137

Transcript

Jonny: I'm so glad that you're here with us, Dorothea, just because, like, I'm always interested in your perspective on this, having, like, lived in the library world of linked data for so long. Because on the other end, living in programmer world, sometimes I still get, like, both the persnickety, you know, purist side and the people that are trying to make it work.

Jonny: But, like, very few "actually, this doesn't even come close to meeting my needs or resemble my work style at all."

Jay: I remember, it was so funny, like, Scott Carlson helped edit or write that, like, linked data in libraries book, and then, like, two days later was like, "linked data's dead," and then became, like, a programmer. I love Scott.

Dorothea: I think he was stuck in a deeply shitty workplace.

Jay: I agree. It happens to us, and then we get out of them. Hooray.

Jonny: Proud of you.

Jay: Yay. Okay.

Justin: I'm Justin. I'm a scholcomm librarian, and my pronouns are he and they.

Sadie: I'm Sadie. I work IT at a public library, and my pronouns are they/them.

Jay: I'm Jay, and I'm no longer a music librarian. Finally fucking a cataloging librarian again for the first time, seven years after finishing grad school.

Jay: And I won't say where. And my pronouns are he, him.

Justin: Just post the address this time around.

Jay: If you're in the Discord, you know.

Justin: Okay. And we have guests. Would you like to introduce yourselves?

Dorothea: Sure, I'll start. I'm Dorothea Salo, pronouns she, her. And I teach at the University of Wisconsin-Madison Information School.

Jonny: I'm Jonny Saunders, they, them. I guess I do various forms of, like, information-based work at UCLA.

Justin: Thank you.

Jonny: Yeah. For the belated applause. I was waiting for that.

Dorothea: Thank you. Very kind.

Justin: Welcome. Welcome. I still haven't reorganized my soundboard, so I still only have like 10 sounds. Oh. No copyright law in the universe is going to stop me.

Jay: I started making Justin watch It's Always Sunny, and it was a bad decision, because now the soundboard has the It's Always Sunny theme on it.

Jonny: It's got to be the full-length version too. No soundboard is complete without the one that keeps going for an hour.

Jay: Because that's just a piece of, like, public domain music. It's not even, like, written for the show, I'm pretty sure.

Justin: Sweet. I think I had just the full Soviet Union anime. Yeah, I was like, this is anime 1, you piece of shit. Yeah, that one keeps going.

Justin: So this was an episode we came up with because Sadie wanted us to explain linked open data, and I think I probably know the second least, so I figured it would be funnest for me to start and try and explain what linked open data is, which is all from what I remember in grad school, which is the last time I ever had to interact with it that I'm aware of, besides, like, you know, the parts of linked data that are used by Google. It's primarily, you can think about it as triples, and everything is one item linked to another item. So Hamlet is a character in Hamlet the book. Those are two separate URIs. And then...

Jay: It's a play.

Justin: Well, it's in book form.

Jay: Okay.

Justin: And then Shakespeare is the author of Hamlet, and so there's an "is the author of" statement, and each has a URI, and these three things can chain together forever, and that way you would have something that's both machine-readable and human-readable, and somehow that makes data boxes in Google work.
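For readers following along, the triples Justin describes can be sketched in a few lines of Python. This is only an illustration: the `example.org` URIs, the predicate names, and the helper functions are made up for this sketch, and real RDF tooling would use a proper library and published vocabularies.

```python
# Illustrative only: every URI below is invented for this sketch.
EX = "https://example.org/"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "HamletCharacter", EX + "isCharacterIn", EX + "HamletPlay"),
    (EX + "Shakespeare", EX + "isAuthorOf", EX + "HamletPlay"),
]


def objects(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def authors_of_works_featuring(character):
    """Chain two statements together: character -> work -> author."""
    found = []
    for work in objects(character, EX + "isCharacterIn"):
        found.extend(subjects(EX + "isAuthorOf", work))
    return found
```

Chaining statements like this is the "link" in linked data: the character and the author never appear in the same triple, but both point at the same URI for the work.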

Dorothea: Or certain extremely non-human-readable forms of human-readable.

Justin: Right. So once you're trying to organize it in other ways, like, say, make a list of things, suddenly it doesn't work anymore, because now you have to see a series of statements.

Jonny: Yep.

Sadie: I'm, like, just chin-hands here, waiting for all of these super smart people.

Jay: Literally, this is "we explain linked data to Sadie."

Jonny: There's, like, the triples explanation, and then immediately you fall off the cliff of ideology and 25 years of some of the most prickly and opinionated people in the world making, like, claims on reality that you truly can't believe until you see them. So it's like, you know, we've got to talk about technology and beliefs.

Jonny: And then also, for a lot of people, a huge amount of, like, wasted time, trauma, or success, depending on if you work for Amazon or Google or not.

Jay: Yeah. Like, my experience with linked data is that I took ontology development in grad school with Dave Dubin, shouts out Dave Dubin. And we learned RDF, and we mainly wrote in Turtle, I think. But we learned, like, all the others, like N-Triples and N3 and all that, but I think he liked Turtle the best, if I'm remembering.

Jonny: The only one that worked.

Jay: Yeah. As a class, we collectively created an ontology together. Each of us had our own specific section of it that we had to create, and, like, mine's still on my GitHub and everything. Like, it still theoretically is a working RDF, like, ontology.

Jonny: Is this the origin of the Homosaurus?

Jay: Yes. No. But I'm also on the Homosaurus, which is actually linked data, but none of us on the board actually interact with that part so much. Like, we have, like, a software dude who does that, but we all know about it to some degree. And then I've also done some, like, Wikidata. I went through one of those trainings one summer, and that was cool.

Jay: And I submitted a proposal for a paper on thinking of Wikidata and linked data as a cyborg kind of thing, but interrogating that. And I submitted this to the Code4Lib Journal issue that ended up being the one that everyone yelled at. So I'm glad it got rejected. Like, literally, that issue was the one that I submitted to, the one with, like, the bad data practices. So I'm glad I got rejected now.

Jonny: Narrow miss, narrow miss. Dorothea, weren't you the one that blew the whistle on that? Or is that different? It's, like, a different time.

Jay: It was, like, you and Becky, right?

Dorothea: Well, I mean, if we blew the whistle over anything, it wasn't over linked data. It was over privacy. It's a thing. You might want to let people keep it.

Jay: Yeah.

Dorothea: Yeah.

Jay: Yeah, I just happened to be writing about linked data for the thing I was writing about. Right. Yeah, so I'm very glad that my goofy little high theory article got rejected.

Justin: So I actually never ended up using Turtle. I think I learned it in N3 notation. It was very not hands-on, the way I learned about it. And so it was never clear how it worked, except for the aspects that kind of pulled from Wikidata, and that explained a little bit, but I never got an in-depth explainer for how Wikidata works. So it was very theoretical, and my metadata teacher was very on the theoretical side of things, so I never got to see a lot of practical applications of a lot of the stuff we talked about in class.

Dorothea: So that is not how I teach metadata.

Jonny: Yeah.

Dorothea: If you're not doing that... one step away, yeah.

Jonny: Exactly. And that's, like, one of the major cultural fissures: is it supposed to be something that you touch, or is it something that is supposed to be, like, a true artifact of the world that needs to be done once and never touched again? So, like, the division between the teaching styles is reflective of the entire system of belief that goes into linked open data as well.

Jonny: I'm curious, like, hearing people's origin stories with linked data, because, Dorothea, you've been doing this for a while, in libraries and stuff like that. I'm curious what your origins really are.

Dorothea: I mean, you know, I got into it the same way a lot of people did, as it started to be talked about as potentially where libraries move from MARC.

Dorothea: And, you know, that's a really awkward question when you think about it. Sticking with the homegrown, if you will, MARC encoding, which we made up from scratch in the 1960s, Lord bless Henriette Avram, she was awesome. But it doesn't map cleanly onto any of the dominant data structures, data models, that we have today.

Dorothea: It's pulling teeth to try to stuff MARC into a relational database such that you can actually do anything with it. You can kind of do it in XML, but XML is really squishy that way. And I don't mean that in a bad way. XML squishiness is actually quite useful. If you look at, for example, EAD, Encoded Archival Description, some of EAD is what you and I, Jonny, would probably think of as data.

Dorothea: But a lot of EAD is narrative, right? It's storytelling. And you know what? Databases are shit at storytelling. You can't represent Hamlet in a database. Linked data is shit at storytelling. One of the things that really pissed me off about the very early days of linked data was some of its boosters going around and just bragging on it as something where you could literally represent anything, right?

Dorothea: If you could put it in a computer, you could put it in linked data. And my retort to that is, as it has always been, and this is pure coincidence, but I kind of love it: all right, express Hamlet in RDF and get back to me, okay? You can't do it. And I was reading through some of the stuff in the show notes for today, and I happened on one of the Tim Berners-Lee pages.

Dorothea: Let me see if I can find that. Ah, yes. And Tim Berners-Lee on this particular page talks about a semantic web, or sorry, a magical artificial intelligence. He's talking about AI. And he says this: "The concept of machine-understandable documents does not imply some magical artificial intelligence which allows machines to comprehend human mumblings." That's literally what he says. Human mumblings. Excuse you, Tim Berners-Lee.

Dorothea: Excuse you. Language is one of the most magnificent things we have as human beings. And you are calling it mumblings? Excuse you very much. Sorry, that was my rant.

Jonny: No, well felt. I mean, yeah, his relationship to this sort of, like, you know, the fuzziness of language is one of the most fascinating parts of the early outlooks on what linked data could be. Because on the one hand there's sort of the romanticism of language, and the fluidity of language as being something to embrace, but then almost immediately that becomes squished out. The thing that's almost immediately excluded is the ability for people to actually express ambiguity, uncertainty, and so on.

Jay: Yeah, right. I think last time you were on, Jonny, or maybe this was in, like... oh no, this is when we were watching it together, but we talked about, like, the Ted Nelson versus the Tim Berners-Lee view of, like, the interconnected internet and data.

Jonny: Right. Let's see if I can find...

Dorothea: Interesting dude, Ted Nelson. I actually did get to meet him once. I was, like, wiped out at the time, unfortunately, but yeah, I will always treasure that. He was, and is, I think, still is, an interesting dude.

Jonny: Yeah.

Jay: The chad Ted Nelson.

Jonny: So the story that I don't have a good hold on is just, like, what happened. And this probably relates to, you know, some of the stuff that we talk about all the time in, like, the cybersecurity screaming channel, and what you may have to deal with as well, of just, like, the state of technologies that go into libraries, and how they're not actually under any of our control, and we sort of do the best we can to exist on whatever scraps that IT wants to feed us and stuff. And so I imagine that's the intertwined stories of why did linked data not happen all the way at libraries, sort of related to the institutional inertia as well.

Dorothea: Yeah, that's part of it. And, you know, getting back to my point about the question of getting off MARC, relational databases weren't going to work.

Dorothea: XML wasn't going to work, and was in kind of a little bit of a decline as we were asking ourselves this question. So what was left? I remember a blog post by Jonathan Rochkind, who hates linked data. Why does he hate RDF? And, you know, he backs it up. He's not just a random hater. But he was like, we can't, we cannot move to this. And I'm like, okay, what's the alternative, right?

Dorothea: And there are things about RDF that are attractive ideologically, but also practically, to libraries. The idea of the open in linked open data. We can really, truly share, and OCLC can't stop us. Oops, did I say that out loud? Wow.

Dorothea: I mean, you know, really, the elephant in the room is OCLC and its enclosure of MARC and MARC cataloging for its own corporate, and I am going to call them corporate, I don't care that they're legally a non-profit, for their own corporate benefit. So linked data, to some of us, looked like a possible way out of that. And, you know, I can't fault anybody for that. It's definitely a goal worth pursuing.

Dorothea: So why didn't it get as far as we might have wanted it to? Part of it is that RDF was not built, and Jonny can speak to this more, because he's read more of the STS and sociology literature around it than I have, but it was not really built for practicality or computability, right? I, as a complete SPARQL duffer, and SPARQL, if you haven't run into it, is the query language for linked data.

Dorothea: It is to linked data, to RDF, what SQL is to relational databases. I can make a typo in a SPARQL query and knock a server over dead. It's not even hard. So, like, the brittleness of just being able to ask a question without killing a server, this was not a consideration for the early designers of the semantic web. And, like, how do you build a library infrastructure on a foundation that is that technologically brittle?
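One way a single typo can blow up a graph query is sketched below in Python. This is a toy pattern matcher written for this note, not SPARQL and not how any particular triple store plans queries; the data and the `match` and `join` helpers are invented. The point it illustrates is that when a misspelled variable stops two patterns from sharing anything, the join degenerates into every row paired with every other row.

```python
from itertools import product

# Made-up data: 200 books, each with one title and one author.
triples = []
for i in range(200):
    triples.append((f"book{i}", "title", f"Title {i}"))
    triples.append((f"book{i}", "author", f"Author {i}"))


def match(pattern):
    """Match one (s, p, o) pattern; terms starting with '?' are variables."""
    rows = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it appears.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            rows.append(binding)
    return rows


def join(left, right):
    """Keep pairs of rows that agree on every variable they share."""
    joined = []
    for a, b in product(left, right):
        if all(a[k] == b[k] for k in a.keys() & b.keys()):
            joined.append({**a, **b})
    return joined


# Intended query: both patterns share the variable ?book.
intended = join(match(("?book", "title", "?t")), match(("?book", "author", "?a")))

# Same query with a one-letter typo (?bok): the patterns no longer share
# a variable, so every title row pairs with every author row.
typo = join(match(("?book", "title", "?t")), match(("?bok", "author", "?a")))
```

Here the intended query yields one row per book, while the typo version yields the square of that. On a dataset with millions of statements, the same slip asks the server for trillions of rows.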

Dorothea: And the answer is you can't. You really, really can't. Another, I'm not going to say this is a problem, actually. I actually think it was good. But it's a situation that does not commend itself to libraries, to librarians, right?

Dorothea: We tend to be very orderly people, and catalogers as much as anybody and more than some. So in the aughts, well, no, not the aughts, in the teens, I guess, particularly in Europe, there was just this flowering of experimentation with how are we going to represent the things in the library universe, like books and maps and musical scores and movies and all that good stuff. How are we going to represent this in RDF? Lots of experimentation. A lot of it was fantastic. European is great, yeah.

Dorothea: There's a lot of really good thinking, very practical thinking, going into this. But there were models, data models, RDF models, ontologies, if you will, springing up all over the place. And so if you're an average cataloger, you're looking at this and going, well, which one do I learn, and which one are we going to use? And when is there a tool that's going to work with any of this?

Jonny: Yeah.

Dorothea: And the answer is there wasn't. Now, what seems to have fallen out of that is that BIBFRAME, for all of its faults, and it has many, it is not my favorite bibliographic ontology, seems to be kind of taking over the world and muscling out a lot of that European experimentation.

Dorothea: And that frankly makes me sad, because there's several countries in Europe that just plain kicked BIBFRAME's ass as far as modeling quality. And I hate that they're getting plowed under, basically, by this crappy American juggernaut. But why is this happening? Because there's finally tooling. There are finally cataloging tools that, as much as any RDF-based tool can, fail to suck.

Jay: Yeah, like, I know in Alma you can do BIBFRAME stuff.

Dorothea: Yeah, but you can look at Sinopia and you can look at Marva, and you can imagine an actual person using these, right, and making them work and getting good records out of them, which we didn't have for at least a literal actual decade after BIBFRAME happened.

Dorothea: So when Tim Berners-Lee calls human language mumbling, I think it's a symptom of the contempt that so many linked data people have for human beings. And I yelled at the Semantic Web in Libraries conference in, like, 2014, a decade ago, about exactly that. Stop dissing human beings. You can't do that if you actually want linked data. But nobody listened, and here we are, right?

Jay: Yeah. Like, another idea, and this was also something I think I talked with Jonny about, another idea for a goofy, high-minded theory paper I had was thinking of linked data as this attempt to do a reverse confusion of tongues, like a pre-Tower of Babel divine language that ignores the actual...

Jay: The reason that linked data is cool is that it has the potential for everyone to have their own way of doing it, and it'll talk together and intermingle. Instead, it's just turned into this, like, "nope, everything looks this way now," this almost mechanized version of language taking over. Like, it doesn't care about being human-readable, actually.

Jonny: Right. And so it's this tension that was there from the origin of it. And actually the dawn of the term linked data, as opposed to the semantic web, is a part of this same thing.

Jonny: I feel like we need to at least nod to, because we talked about this at length last time I was on here, the Lindsay Poirier piece, "A Turn for the Scruffy," which we both called out as being, like, this seminal work on understanding the culture of the semantic web.

Jonny: And that points to, and it's there too in TimBL's website, that the separation of linked data and linked open data from the semantic web was about reclaiming stuff that worked as opposed to stuff that was perfect. Like, we're about trying to make a bunch of separate ontologies. So the initial idea being there's one graph, like one global graph, where everything is always linked together, and there should be one URI that represents each unique concept, and only one.

Jonny: And to the point where there's these sort of absurd blog posts. And one of the things that's always amazing about web history is that a lot of it is still there, still up, at least on archive.org. But there's this blog post that I think is from 2009, I put this in the links as well, though they apparently took down the comment section on it.

Jonny: But someone that was from, like, the semantic web in this era posted a blog post about the first time that the New York Times had linked data in their web version of the product. And so what they'd done is they'd made some, you know, article that was about Barack Obama and the, quote unquote, you know, the racist controversy, like, "Barack Obama is a Muslim," whatever. So it was an article about that controversy.

Jonny: And so there was an RDF claim that was like, Barack Obama related to Muslim, or something like that. This is just trying to describe the contents of this piece of writing.

Jonny: But then people immediately were like, this is messed up, because that's now a claim on reality. It's not just "someone says this"; it's "this is a fact." And that was something that the RDF group had specifically designed it to be doing. And so the model of the world that people keep trying to escape from, but then have to return to, is that when you make a statement in RDF...

Jonny: There's a difference between the way that, like, the language and the syntax and the systems designers thought about it as being literally... like, there are some really remarkable quotes in the W3C archives. And I was trying to pull it up earlier, but there's this one quote from Brian McBride, 2001.

Jonny: So this would have been only a couple of years after the project formally launched at W3C. It's like, "RDF is not just a data model. The RDF specs should define a semantics so that an RDF statement on the web is interpreted as an assertion of that statement so that its author would be responsible in law as if it had been published in a newspaper."
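The gap between "this is a fact" and "someone says this" can be sketched in Python. A bare triple carries no information about who asserted it. Later RDF work (named graphs, serialized as quads) adds a fourth element naming the graph a statement belongs to. The identifiers below are placeholders invented for this sketch, and plain tuples stand in for a real RDF library.

```python
# Placeholder identifiers, invented for this sketch.
# A bare triple: nothing records who is making the claim.
triple = ("ex:PersonA", "ex:relatedTo", "ex:TopicB")

# Quads add a fourth element: the graph (context) each statement lives in.
quads = [
    ("ex:PersonA", "ex:relatedTo", "ex:TopicB", "ex:graph/article-123"),
    ("ex:PersonA", "ex:bornIn", "ex:PlaceC", "ex:graph/reference-work"),
]


def statements_in(graph):
    """Return only the statements asserted inside one named graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph]


def graphs_asserting(statement):
    """Return every graph in which a given statement appears."""
    return [g for s, p, o, g in quads if (s, p, o) == statement]
```

With the fourth element in place, a consumer can ask "which source says this?" instead of treating every statement as part of one global set of facts.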

Jonny: So these are, like, they're supposed to be, like, legally binding documents in this way, where there is no such thing as an author, no "someone says this." When in reality...

Jonny: Everything has an author. Everything has a point of view and a perspective, and was said by or written by somebody. But, you know, it took a while for even that notion to be encoded in the language at all as an expressible thing, period: adding the fourth item to the triples, like being able to say that this doesn't belong to the global graph of everything, but in fact is my, like, local system of meaning.

Jonny: But then you have to keep escaping that, because it doesn't actually work. The thing that I always come back to is, imagine if language worked this way, where I want to use a word and I have to use Jonny's version of this word, and so I have to go to, like, johnny.net slash this word, and now I'm referring to that one, and there's no way that I can make my own copy of this word. The way that language works is, you know, we have these sort of parallel representations of ideas and concepts and words and phrases that are not the same at all, not even close to the same, from person to person, or even utterance to utterance. And yet we're trying to express a system of meaning where there is one version of each of these things. Like, simply no one would do it. If I had to go to the dictionary every time and look up each person's unique word and use that, or else it was meaningless, then it just doesn't work.

Jonny: And it's, I don't want to just trail off forever here, but it's intimately related to the tooling problem. Like, theoretically, and so, like, one of the authors of SKOS, the Simple Knowledge Organization System, the ontology and modeling system for modeling relatedness and similarity...

Dorothea: It's how you do controlled vocabularies in RDF, and it's actually quite functional, quite useful. And if I'm not wrong, I think Homosaurus is actually based on it. That's your underlying, how you're modeling this stuff.

Jay: Yeah, it's SKOS, yeah.

Jonny: Yeah, it works pretty well. And so you'd imagine that a tool like that, where you're able to say that something is a similar match, or this is exactly the same as this other thing, would enable this kind of expressive system.

Jonny: And it doesn't, because doing all of those queries and lookups is preposterously expensive, because of the way that it's encoded as URIs, i.e. URLs, i.e. I need to hit a web server every time to actually retrieve this item. There's any number of different web architectural models that that could take, but that's the form it took.
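The cost Jonny describes can be sketched in Python. SKOS does define mapping properties such as `skos:exactMatch` and `skos:closeMatch`. Everything else here is an assumption made for illustration: the vocabulary URIs are invented, and the `dereference` function is a stand-in for what would be an HTTP request per concept.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Invented vocabularies. In practice each key would live on a different
# web server, and reading its links would mean one HTTP request.
REMOTE = {
    "https://vocab-a.example/term/1": [
        (SKOS + "exactMatch", "https://vocab-b.example/t/9"),
    ],
    "https://vocab-b.example/t/9": [
        (SKOS + "exactMatch", "https://vocab-c.example/c/4"),
        (SKOS + "closeMatch", "https://vocab-d.example/x/7"),
    ],
    "https://vocab-c.example/c/4": [],
    "https://vocab-d.example/x/7": [],
}


def dereference(uri):
    """Stand-in for fetching a concept's description over the network."""
    return REMOTE.get(uri, [])


def exact_equivalents(uri):
    """Follow skos:exactMatch links transitively from `uri`.

    Returns the set of equivalent concepts and the number of lookups
    (simulated network requests) it took to find them.
    """
    seen, todo, lookups = set(), [uri], 0
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        lookups += 1
        for prop, target in dereference(current):
            if prop == SKOS + "exactMatch":
                todo.append(target)
    return seen - {uri}, lookups
```

Even this tiny chain needs one lookup per concept visited. A mapping that hops across many vocabularies pays that price on every hop, which is one reason real systems tend to cache or bulk-load vocabularies rather than resolve URIs live.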

Jonny: And so as a result, yeah, it's intimately related to the tooling as well as the implementation of the technology, in the same way that it is a reflection of the ideas behind it.

Dorothea: Right on.

Jonny: Yeah.

Dorothea: So how are we doing, Sadie? Clear as mud?

Sadie: Yeah, just about. Like, I think the thing that gets me about linked data, and, like, I haven't gone to library school. I have just, like, the barest knowledge of cataloging and that kind of thing. I'm a very practical, hands-on person. So I have to dig into a system to really see how it works.

Dorothea: Oh, yeah, totally.

Sadie: Every time I have tried to do that, to even think about open linked data, I'm like, I don't see how this is usable.

Sadie: So yeah, like you talked about, there need to be tools to be able to use it. It sounds like the heart of the problem with a lot of library technology, where, I keep saying this, there's a very small selection of vendors that have a very large control, and they just keep conglomerating together. So there's, like, three now.

Sadie: And somehow libraries, who are the ones who are using the tools, are the most powerless people in the whole ecosystem of it, right? So a big topic at my work lately, and maybe a tangent here, is why the fuck are we still using SIP2?

Dorothea: Can't blame you on that one.

Sadie: I don't know if you're familiar with SIP2, Jonny.

Sadie: It's basically a protocol. So integrated library systems, ILSs, are the biggest software that libraries use to keep track of all of their stuff. SIP2 is basically the protocol that passes information between these systems, right? So a lot of vendors use SIP. Like OverDrive, you know, OverDrive has to know what you already have checked out to be able to enforce your limits.

Sadie: Like, you can only have five books checked out. So it uses SIP to query that information from your library system, right? It is entirely unencrypted, clear text, and has been its entire life. And SIP2 is different from the IT SIP, which is a VoIP protocol, which causes no end of confusion every time we have to talk to a vendor's IT to figure out how to set something up.

Sadie: I just totally gave myself, if a single one of my coworkers is listening to this, I just absolutely gave myself away, because I've had this conversation so many times. But yeah, it's been in use for so long, and all of these inter-library... it's the only one that is actually usable, like, what's the word I'm looking for? Agnostic, system-agnostic.

Sadie: So it's starting to be replaced by a lot of APIs, but each API for each system is its own thing.

Sadie: So you have to wait for others, like, you know, "oh, we could do this API," we could do, I don't know if this is true, we could do Alma's API but we can't do Sierra/Millennium's API. So it's just like, why the fuck are we still using this? And then we talk to people, like vendors, and they're just like, "well, what's the problem?" And we're like, it's completely clear text and requires extra tunneling to be able to actually keep our patron data not readable over the internet. And I've asked...

Dorothea: It all over the entire internet.

Sadie: For anybody. And, like, looking at the strings, it's literally, like, library card number, full name, address, you know, number of checkouts. It's so ridiculous. And people are still just like, "well, I don't understand what the problem is," until you talk to an IT person and you say it's in clear text, it's completely unencrypted, and they go, "oh, that's bad."

Sadie: But no libraries have, like, the power to go to these freaking vendors and just be like, you have to figure something else out, something has to be worked out. But it's going to end up being, you know, OCLC who does that kind of stuff, or something like that. And then, yeah.

Dorothea: It'd be nice, right? And...

Sadie: Yeah.

Dorothea: They're vendor patsies. That's all we are.

Sadie: In a lot of ways, yeah.
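To make Sadie's point about the strings concrete, here is a Python sketch of how little work it takes to read patron details out of a clear-text, pipe-delimited message. The message is invented, not a real capture, and it leaves out the fixed-width header a real SIP2 message carries. The two-letter field codes follow my understanding of the SIP2 specification and should be treated as an assumption.

```python
# Invented example message: NOT a real capture, and the fixed-width header
# that a real SIP2 message starts with is deliberately left out.
RAW = b"AOmain-library|AA21234000123456|AEJane Example|BD123 Sample St, Anytown|"

# Field codes as I understand them from the SIP2 spec (treat as an assumption).
FIELD_NAMES = {
    "AO": "institution id",
    "AA": "patron identifier",
    "AE": "personal name",
    "BD": "home address",
}


def parse_fields(raw):
    """Split pipe-delimited fields; the first two characters are the code."""
    fields = {}
    for chunk in raw.decode("ascii").split("|"):
        if len(chunk) >= 2:
            code, value = chunk[:2], chunk[2:]
            fields[FIELD_NAMES.get(code, code)] = value
    return fields
```

Nothing in this sketch decrypts anything, because there is nothing to decrypt. Anyone who can see the bytes on the wire can do the same split, which is why the traffic has to be wrapped in a tunnel such as TLS or a VPN.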

Justin: Yeah, what was it Bree said in the Skullcom Discord? ACAB includes NISO.

Jonny: Yeah.

Justin: Yeah.

Dorothea: Absolutely.

Sadie: So, like, I still don't think I understand entirely what linked data is, but I do think that I can start to get to it, if you know what I mean, because, yeah, it's a system.

Sadie: It's a system to connect data to other data in meaningful ways, and it once had the promise to actually help libraries figure shit out, and it has completely kind of shit the bed on that. Is that accurate?

Dorothea: That's completely accurate. I still have tiny little sparks of hope. Oh, I do.

Jay: Did we describe why it's called the semantic web?

Dorothea: Oh, I don't think we did. Jonny, I'll leave you that one.

Jonny: It's a really simple story. It's like being like web happened, Jonny: right? And so web is documents with links between them. Jonny: But those links are meaningless. Jonny: They're just the relationship from one page to another.

Jonny: And it's hard to imagine this in retrospect, Jonny: a web without search engines Jonny: or without any sort of like overlay to them, because Jonny: like basically the way that everyone interacts with the web now Jonny: is either through search or through some mediating discovery Jonny: mechanism. Like you don't just like go on the web and then Jonny: go to a URL and then just be like, well, Jonny: I'm here now, and just like, I've found the internet.

Jonny: and like it said yeah so like that's like Jonny: the way that the web was sort of designed and like Jonny: the way that it's supposed to work is it just like it would be self-organizing where Jonny: the like the literally like if you go back to like Jonny: the founding I was like we will just have people that have Jonny: lists of links on their personal websites and they will link everything together Jonny: and then just like people will find their way from these like local nodes of

Jonny: meaning. Like, and the imagination there was always that just like the web would Jonny: be super easy for the average person to make a website on and that just like Jonny: everyone would basically have one.

Jonny: And that didn't work at all, not even close, Jonny: not even from the very beginning, where just like, you know, Jonny: this, it was the case where just like the ultra nerds that were on the internet Jonny: at the very first part of it, still, you know, gravitated towards sort of like Jonny: mediating platforms like bulletin board systems, and etc. Jonny: So the semantic web was supposed to be a way of encoding computer readable information

Jonny: into the protocols of the web, and specifically into HTML documents that are, Jonny: you know, that are XML, a dialect of XML. Jonny: I don't even know how to describe the relationship between HTML and XML.

Jonny: But like, so that it would be possible to both annotate a given page and then Jonny: also just like be able to link them together so that you'd have this sort of like, Jonny: you know, coexistent between documents that people are on that have like, Jonny: you know, human readable text, and then embedded within that and embedded between Jonny: that are just sort of like, in this paragraph, I'm talking about this person.

Jonny: And like, then I can sort of like, say, go to that page and theoretically go Jonny: and find backlinks to all the time that that person was mentioned or something like that. Jonny: And so that's like why it's called like the semantic web is we're adding semantics Jonny: to the web which formerly was just sort of like naked links and documents yep.

Jay: Like the computer could understand that Jay: Jonny is a person because Jay: it knows what those URIs are and what they point to, and it then can tell Jay: what the relationship between those is. Not in a way where it knows what a person Jay: is, but it knows what this URI is, and if you use this URI then it sees other Jay: things that have that URI and knows that they're people too. And.
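To make Jay's description concrete, here is a minimal sketch in plain Python of data stored as subject, predicate, object triples. It is only an illustration under stated assumptions: every `example.org` URI, the `mentions` predicate, and the helper names `subjects_of_type` and `backlinks` are hypothetical. Only the `rdf:type` predicate URI is a real identifier.

```python
# The real RDF predicate meaning "is an instance of".
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URIs, invented for illustration only.
PERSON = "https://example.org/vocab/Person"
MENTIONS = "https://example.org/vocab/mentions"
JONNY = "https://example.org/people/jonny"
DOROTHEA = "https://example.org/people/dorothea"
EPISODE_PAGE = "https://example.org/pages/episode"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (JONNY, RDF_TYPE, PERSON),
    (DOROTHEA, RDF_TYPE, PERSON),
    (EPISODE_PAGE, MENTIONS, JONNY),
]


def subjects_of_type(triples, type_uri):
    """Everything declared to have the given type URI."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == type_uri)


def backlinks(triples, target):
    """Subjects that point at `target` with any predicate other than rdf:type."""
    return sorted(s for s, p, o in triples if o == target and p != RDF_TYPE)


people = subjects_of_type(triples, PERSON)
pages_mentioning_jonny = backlinks(triples, JONNY)
```

The program never "knows" what a person is. It only matches URIs, which is the point Jay is making: two things that share the same type URI are treated as the same kind of thing.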

Jonny: There's a certain amount of magical thinking that like, because language sort Jonny: of works this way, that it's like entirely relational and metaphor based and like, Jonny: you know, the meaning of a word is only sensible in context of surrounding meanings Jonny: and contrast with similar, you know, that just like meaning would emerge.

Jonny: And like again like that's sort Jonny: of true like there's like like language does work like that just like Jonny: sort of local negotiations over meaning and indigent but like you need to have Jonny: the people there negotiating in order for it to work and that never really existed Jonny: so just like so like there's and it sort of like points to one of the salient Jonny: features that is both like it's like you know, eerily prescient,

Jonny: but also just like another one of these like critical pieces where we're talking Jonny: about just like the missing tools, Jonny: is like from the very beginning, like there's this 1999 piece in Scientific Jonny: American that Tim Berners-Lee, that was like sort of like the public announcement Jonny: of like, you know, the existence of the semantic web as a problem. Dorothea: I remember reading that. I was at work. I remember reading it.

Jonny: And so it's this wonderful document, and just like, it's like this very Jonny: pie in the sky kind of, you know, release about just like what it could be. Jonny: And like, there's a bunch of just like really basic and obvious things that Jonny: like, wow, we should really have the computers work like that. Jonny: Where just like, you know, like the idea that I have a calendar appointment or whatever.

Jonny: Why can't my computer know that, like, I also have a photo that was taken on that day. Jonny: So I can just like say, computer, find me the photos that were taken during Jonny: this appointment on my calendar or something like that. Jonny: So like a sort of universal acid for this data, where just like, Jonny: I can just relate, you know, totally heterogeneous systems between one another.

Jonny: But the part that's like really, like, you know, come to be, Jonny: we all like thinking about just like Like AI is like, you know, Jonny: this year and this last year being like, it was always going to be dependent on compute. Jonny: That it's just like, there's metadata there.

Jonny: But even from the very beginning, you need what Tim Berners-Lee was talking about Jonny: as agents, like just little bots, little scripts or whatever that Jonny: are running around getting all of this metadata. Jonny: And this is like around the time when Google and like the first algorithmic Jonny: search engines were starting to exist.

Jonny: So like this idea of crawlers and Jonny: ingesting this information and making sense of it was like Jonny: a relatively new one, especially like at Jonny: a mass scale like this. And like, that's always been the tension, where Jonny: just like, like, say, just like talking about like, what is it, where do I touch Jonny: it, like how am I supposed to use that? Just like, that was sort of always the Jonny: intention with that, just like

Jonny: you would have like a little computer butler thing that would just like be Jonny: going out and you have your own set of commands to just sort of like, Jonny: go get this for me, go fetch this for me. Jonny: But again, it's never really materialized just because with what infrastructure Jonny: does the average person have a constantly running bot that goes out and scrapes Jonny: the web for them all the time?

Jonny: And so even from, yeah, there are a couple of moments in the history of the Jonny: semantic web of times when Google basically bought it. Jonny: That happens actually several times. Jonny: Where just like this sort of domestication of this process where like, Jonny: now like when you think about it, like, where does it exist? How does it exist?

Jonny: Pretty much the only way that people usually interact with it is like the metadata, Jonny: the Open Graph metadata, and well, Open Graph is slightly different, Jonny: but like the JSON-LD document that you'll have at the top of your website header Jonny: that is just like, using schema.org terms to say that this is a website about Jonny: an organization or an event or whatever.

Jonny: And like as Justin was saying in the beginning, just like sometimes it makes Jonny: the Google info boxes work. Jonny: And like that's pretty much the most concrete realization that the average person Jonny: has for linked data on the everyday. Jonny: And that's because who owns the crawler? Google owns the crawler. Jonny: And so it becomes something where you make metadata available to be crawled Jonny: by Google in this very constrained, commercially focused context.
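As a rough illustration of the JSON-LD header Jonny describes, the sketch below builds a small schema.org description of an event and wraps it in the `<script type="application/ld+json">` tag that crawlers look for. The event details and the helper name `jsonld_script_tag` are made up for this example. `Event`, `name`, and `startDate` are real schema.org terms, but what any given crawler does with them is outside what the episode covers.

```python
import json


def jsonld_script_tag(data):
    """Serialize a dict as JSON-LD inside an HTML script tag."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script tag early;
    # "<\/" is still valid JSON and decodes back to "</".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical event, described with schema.org vocabulary.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example Library Book Sale",
    "startDate": "2024-09-01",
}

tag = jsonld_script_tag(event)
print(tag)
```

Note how little expression this allows: the page author fills in a form-like structure that someone else's crawler has agreed to read, which is the constraint Jonny is pointing at.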

Jonny: But it's not a system of expression. Jonny: And like, just one more thing is like, there's like these other technologies, Jonny: like RDFa, like this dialect of RDF, which is supposed to be like the thing Jonny: that goes embedded in documents where like, as I'm writing, Jonny: I will tag a particular paragraph, you know, with some, you know, Jonny: semantic web tag or something like that.

Jonny: That's like arguably one of the most, like, Jonny: attempts at making a human linked Jonny: data, like, interface for that, where just like you could imagine I Jonny: have like a document editing software or something Jonny: like that and I can highlight a sentence and add a tag to it or Jonny: whatever, you know, just like actually embedding this in documents that people Jonny: actually use. That is actually no longer supported by the main, like, RDF parsing

Jonny: library, rdflib in Python, because it's complicated to parse, Jonny: but also it's just sort of like, that's not really the important one. Jonny: It's like, you know, for all these like mushy positional document tags and stuff Jonny: like that, and people don't really want to know the information in context. Jonny: They want it all split out into like, you know, something where I can do an Jonny: HTTP request and just get the headers and that's it.
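For a sense of what "tagging a sentence" looks like in RDFa, here is a small sketch using only Python's standard-library `html.parser`. It is not a conformant RDFa processor: it just collects the text of elements that carry a `property` attribute, and it ignores nesting, `about`, `resource`, and vocabulary resolution. The HTML snippet, the name "Ada Example", and the `PropertyCollector` class are invented for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements with a `property` attribute.

    Simplification: properties nested inside other properties are ignored, and
    a same-named tag nested inside a tagged element would end capture early.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._prop = None   # property currently being captured, if any
        self._tag = None    # tag name that opened the capture
        self._text = []

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop and self._prop is None:
            self._prop, self._tag, self._text = prop, tag, []

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._prop is not None and tag == self._tag:
            self.pairs.append((self._prop, "".join(self._text).strip()))
            self._prop = self._tag = None


html = (
    '<p vocab="https://schema.org/" typeof="Person">'
    'In this paragraph I am talking about '
    '<span property="name">Ada Example</span>.</p>'
)

collector = PropertyCollector()
collector.feed(html)
pairs = collector.pairs
```

Even this toy version has to track where it is in the document, which hints at why positional, in-context markup is harder to parse than a standalone metadata block.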

Jonny: And so like, it's like, it's just one of these mutating landscape of technology Jonny: always ratchets more and more towards, Jonny: it's intended for doing the Jonny: big web of open data that you're not a part of but Jonny: you get to experience through platforms and a Jonny: lot of platforms are in fact powered by linked data Jonny: at least if not rdf knowledge graph Jonny: tm derivatives of that idea where Jonny: like it is an extremely powerful set of Jonny: of ideas but not for you

Jonny: so if you but if you are a company that exists as a giant conglomeration of Jonny: data sets that you've bought by acquiring smaller companies over time it is Jonny: an incredibly powerful system for integrating all of that information being Jonny: able to do complex queries across them so in that piece for tim berners-lee. Jay: Not for thee.

Jonny: Exactly and increasingly for Jonny: the surveillance state and just like the people who Jonny: have this nightmarish multi-sided market Jonny: of selling your data to insurance Jonny: providers at the same time as selling it to police at the same time as selling Jonny: you back a little slice of it as well so like it's yeah the way that it exists Jonny: now is largely in the shadows and that's by no means passive effort there's

Jonny: an active corralling and an active domestication of this set of ideas. Dorothea: And to bring it back to tooling for just a second, Dorothea: Some of the more pro-social, I guess I will use that word, experiments in this Dorothea: space, like Wikidata, for example, Dorothea: are already running up against the absolute limits of what you can do with linked Dorothea: data if you're not, like, Google.

Dorothea: They've already, and the technical details here completely escape me, Dorothea: but Wikidata has gotten too big for its britches. Dorothea: The infrastructure literally cannot cope with it anymore, so they're sharding Dorothea: it, is my understanding. Dorothea: They're kind of splitting it down the middle and figuring out how to get the Dorothea: two shards to talk to one another, which I'm sure is really exciting technically, Dorothea: but wow, that's not great.

Dorothea: For those of us who are not Google, but are interested in this technology stack.

Jonny: Did you see the cause of this issue? Jonny: It's that like, the underlying database Jonny: software, Blazegraph, that it's running on, Amazon hired Jonny: away all of the engineers, so they're... oh Jonny: great, yeah, so, all right, typical. So again, this is like the big company is literally Jonny: buying the underlying technologies, where just like, you know, the software needs Jonny: maintenance, you know, that like, it needs maintenance and these constant

Jonny: improvements and just like to be able to handle an ever-growing stack of triples like Wikidata, Jonny: you need to have active maintenance workers. And like, Jonny: Who pays for open source work? Like, if I'm a software developer and Amazon Jonny: says, here's, you know, 250K a year to make the, do the thing you were already Jonny: doing for free, then it's like, sure, I have a family. Jonny: You know, I, you know, I'd like to have, like, you know, go on vacation sometimes.

Jonny: And so, like, yeah, it's just like, yeah, actively, that, that was another moment Jonny: of, like, yeah, actively poaching away the talent so that, like, Jonny: the underlying technology can. Dorothea: And I will say, for all that we are cultural heritage organizations founded Dorothea: on the idea that culture should persist, Dorothea: we're very bad in libraries and archives at admitting that software needs maintenance, Dorothea: that standards need maintenance, right?

Dorothea: That's the SIP2 problem in a nutshell, though that was proprietary, actually. Dorothea: So Ruth Kitchin Tillman and I wrote an article, got published about a year ago, Dorothea: about the ethics of linked data sustainability. Dorothea: You can find it open access online. Dorothea: And we took a pot shot, actually. Okay, we. I took a pot shot. This one was mine.

Dorothea: At information scientists. Okay? Because there are too many information scientists Dorothea: who are serial project and standard abandoners. Dorothea: They get grant money to do this fancy-dancy thing, and they get as far as it Dorothea: being implemented in libraries. Dorothea: And then they just wander away to write the next grant application and do the Dorothea: next fancy-dancy thing. And then it rots.

Jonny: Totally. Dorothea: Right, whatever they built, it rots, because inevitably, they didn't build Dorothea: it right in the first place, and I'm totally thinking about OAI-PMH here, Dorothea: since we have some scholcomm folks in the room, but SIP2 is another beautiful example. Dorothea: Gosh, we are so bad at versioning stuff. Dorothea: It's a really basic idea. You gotta version stuff. You can never get it right the first time.

Dorothea: So yeah, I, in that article, took a pot shot at serial project abandoners and Dorothea: said, funders, stop funding them. Dorothea: Ask what happened to their last Dorothea: three projects. And if they're dead in the water, add some black mark. Jonny: For real. Yeah, this is a general issue in any sort of publicly funded tooling space.

Jonny: Is that just like... I was allegedly on some review panel for some funding agency Jonny: that is theoretically talking about software sustainability. Jonny: And that was a completely novel concept that just like what we want to do is Jonny: we want to fund sustainable software ecosystems.

Jonny: That just like we're not trying to start a new project. We're not trying to Jonny: like, you know, fund the new feature, but just like, these are the already existing Jonny: things that are happening in open source. Jonny: And let's just keep that going, like paying for like, like stuff like documentation Jonny: and like making the tests work and like, you know, years and years of technical debt.

Jonny: And like security audits yeah totally yeah Jonny: and please yeah and so this is like this is Jonny: one thing like this one of one of my entry points into thinking about semantic Jonny: web and thinking about just like linked open data was just like initially Jonny: thinking about because i was like living with someone who is like working in Jonny: metadata in a library at the time and there was this like increasing cry of

Jonny: just like the we all know the journal system is broken and like there's this Jonny: recurring strain of papers that are just sort of like let's just like make the Jonny: libraries do it you You know, just like that, Jonny: just like we can sort of like get libraries to host a bunch of journal like things, Jonny: journal like overlays or whatever, completely ignoring the reality of work and Jonny: the reality of bureaucracy in libraries that just like.

Jonny: And and and so like, you know, you wonder who I'm talking about. Dorothea: Oh, I don't have to wonder. I let him talk it out.

Jonny: Yeah and and so Jonny: like that just like this is where like Jonny: and on the one hand it seems like an obvious thing where just like of Jonny: course like it seems like libraries in general Jonny: in the abstract should be invested in just like you know maintaining some Jonny: their catalogs at least but just Jonny: like also the all the other things that just like you know that are being archived Jonny: and cataloged and just like you know exist in libraries and just

Jonny: like making that as available as a public catalog on like sure surely they're Jonny: already doing stuff like that so it shouldn't be that much of additional effort Jonny: to have an institutional repository that acts like a journal and like can link Jonny: together these things but as y'all know yeah i keep. Dorothea: Coming back to tooling. Jonny: Yeah tooling. Dorothea: Was shit the tooling for open access is and always has been shit.

Jonny: Right. Yeah. And so it's just a matter of like, that like, there's there is Jonny: this universe of like, we're like, okay, we could get sort of some of these things aligned, Jonny: like funding priorities for maintaining sustainable software. Okay. Jonny: If we can then like, get some sort of like IT consortium to help out with like, Jonny: maybe, you know, quote, unquote, public cloud. Jonny: So it's not the case that just like every library needs to have like an on prem IT team.

Jonny: That just like there are some of these things that could like Jonny: lock into place that just could theoretically make some of Jonny: this work but just like that's just not the way academic work is Jonny: done generally and just like that's just not the way it's structured to make Jonny: these sort of like long lasting infrastructural efforts like as you say that Jonny: these are just like grant cycle to grant cycle let's just like ride to the next

Jonny: thing and even within so like part of my role in the last Jonny: six months of work it's like i'm working with actually a lovely group of people who i who i i. Jonny: Like and they have welcomed me and so i'm not trying to speak ill of Jonny: them at all but just like this is a linked open data project and Jonny: basically what i've been trying to do for the last like six months is like pay Jonny: down technical debt we're just like there's this like really good idea

Jonny: of this like this way of having authorable Jonny: linked data schemas doesn't require you to be Jonny: part of the priesthood to be able to describe what exists in Jonny: your reality but it's just like i didn't Jonny: really work it's just sort of like they're Jonny: just like that it's just like the people that are concerned with the Jonny: modeling part about the the like Jonny: what you know what is this kind of thing do we Jonny: put it in this category like this like are not usually

Jonny: the same people who are just like going to be able to write a really good implementation Jonny: of that and so like trying to figure Jonny: out how to make those collaborations happen as well because Jonny: this is another point where like i i don't see Jonny: this as a thing that really could exist or come Jonny: from any sort of startup like rest in Jonny: peace to the solid project which i have been trying to find for several years

Jonny: and I keep seeing little promising scraps of it. But this is like, so Solid was Jonny: like the thing that Tim Berners-Lee was like, this will be the semantic web, like Jonny: the thing that we're trying to like do. So it's like he has like a Jonny: crisis of conscience, like actually the web sort of sucks.

Jonny: Like, like I think around like 2015 and 2016 and like, you know, Jonny: starting to be just like, okay, let's try and make solid as like a way for people Jonny: to do the, like the, the more like vernacularist dream of the semantic web where I have my. Jonny: Like this, now they're talking about like activity pods. Jonny: Like I have my little unit of my semantic web, like graph and information graph.

Jonny: But that quickly got bogged down in the academic cycle. No one could manage a project. Jonny: Then they spun that off into a startup. And wouldn't you know it, Jonny: once that happened, then it became owning your own data was a bug, not a feature. Jonny: And so now you're supposed to be pushed on to renting a cloud server for it and so on and so forth. Jonny: So I think that this doesn't come from startups or from any sort of company.

Jonny: It also doesn't come from the scattered wastes of open source world. Jonny: They're just like, you can't just like ask people to do it for free. Jonny: And it also doesn't come from this like local efforts of like trying to make Jonny: tools for like an individual institution.

Jonny: And so just like what's left is like, you know, we need to use some sort of Jonny: public funding and try and rally public funding in a way that it's not designed Jonny: to be allocated in order to like make these kinds of technologies. Jonny: And also the belief that there should be these technologies in the first place Jonny: in order to make that real.

Jonny: And so like, that's, this is like this unending knot of like, Jonny: who do we, who is the next little thread that we need to pull in order to make this large tapestry?

Jonny: But then like you you're dealing Jonny: with 25 years of baggage at the Jonny: same time so it's like a lot of the people that are still in Jonny: that space either have distanced Jonny: themselves from it and i have and look back on it Jonny: with this chain of mixed emotion mixed emotional Jonny: memories but i don't want to touch that anymore or they're Jonny: like in some way still true believers that just Jonny: like what do you you mean nothing is actually broken it's totally

Jonny: fine and like you just need to learn how Jonny: to do it good and so yeah so like Jonny: like and so this is like one of the reasons why Jonny: i'm just like like we were talking about this earlier today just being like Jonny: that in some ways like talking about like Jonny: serial project abandoners protocol abandoners that just like there Jonny: needs to be like a break in a way that's like backwards compatible we bring

Jonny: the past with us or like or have some way to like carry it through with us but Jonny: we're not beholden by all of this baggage that and and so i don't know like Jonny: talking about just like what happens in the future i guess i don't know if we've even gotten past the.

Jonny: Expository part of what even are we talking about yet but like maybe i'm jumping Jonny: the gun there but like yeah just last last thoughts on that idea is like that's Jonny: another part like the twin, Jonny: entry points for me into this whole line of thinking or just like thinking about Jonny: just like what could be an alternative to scholarly communication and publishing.

Jonny: That just like, it should be possible for me to throw stuff up on the web and Jonny: then have it be part of this sort of like blob of information without like a Jonny: lot of gatekeepers in the way. Jonny: The other part of it is that it's like, even long before I got interested in Jonny: it, I keep coming across these various like graveyards of things that are just like.

Jonny: This is a really cool idea, like a browser extension that like everywhere I Jonny: go, I can make sort of personal annotations and not just like bookmarks, Jonny: but just like I highlight this section, Jonny: and then I can relate it and share it to my friends like, oh, Jonny: actually, that extension was for like Netscape 6.0. Jonny: And like, was abandoned 20 years ago. And like, no one has thought about this ever since.

Jonny: And just like this long string of just like dead projects that are Jonny: that are exactly like this because again like Jonny: didn't you imagine like the kinds of open source projects that work and Jonny: like are sustainable are usually ones that have some Jonny: material tangible benefit for the Jonny: people that use them day to day like this is a tool i have active use for or Jonny: their baseline behind the scenes infrastructural work that like a lot of companies

Jonny: that will just like sort of rely on them like the but the types of like this Jonny: niche of technology just like what what you have to have in order to use it are a website. Jonny: So that rules out 99% of all people. Jonny: And then be like a website where you are deeply in control of the HTML that goes on that page. Jonny: And that rules out 80% of the remaining 1%.

Jonny: And so like, that just like, there just, yeah, there never was a time when it Jonny: had like an actual practical use. Jonny: And this is something that just like gets called out as early as the The earliest Jonny: I've seen of people saying, what is the point of all this was like in 2005 and 2006, Jonny: where just like there's a series of these blog posts of just like abandoning the semantic web. Jonny: It's like, no one actually figured out why we're doing this at all.

Jonny: Like, there's one interesting example of like music annotation, Jonny: where just like it's sort of like a peer to peer ish music system. Jonny: And then that's it. Like the rest of it is totally pointless.

Jonny: Like why would I ever do this in the first place, like invest Jonny: all this time into learning these incredibly complicated parts of Jonny: it? Because like one of the things that we're missing in the Jonny: exposition section is like the sort of Jonny: stack of things that the data is. Jonny: Like you have the triples part, which we talked about, but then Jonny: you also have like ontologies and schemas and just like the way that these things

Jonny: all sort of relate together. It took me a year to even figure out what these Jonny: meant and what they look like and why they existed, and just like, why is a schema Jonny: different than an ontology? Jonny: That seems like the same sort of thing but there's like different roles in the Jonny: ecosystem and also definitely different... Jonny: Just to say that... Dorothea: Why does neither of them have record constraint language?

Jay: Ontology means that your professor goes on tangents about first-order logic Jay: when you're learning it. Dorothea: That's right. Jay: Yeah. Justin: And schemas are on schema.org. Jonny: Exactly. Justin: That's how you know they're schemas. Jay: Also, was the music project you were talking about Linked Jazz? Jonny: I will look up this. It's in this blog post, Abandoning the Semantic Web. I'll see if I can find it. Jay: Linked Jazz rules. Yeah.

Dorothea: That's a great little site. I love it. Jay: That was like the first I ever heard of linked data. I was like an undergrad still working Jay: in a music library, sure, and my like mentor, not professor, Jay: my mentor, like boss, was like, this is the coolest thing I've ever seen in my life.

Dorothea: Well, and music in particular in a library context is actually a really Dorothea: wonderfully subversive place for linked data to get a foothold, because MARC for music is... Jay: So bad. Oh, it's terrible. Music cataloging, like music copyright, is something that Jay: even seasoned professionals will not touch. Um, yeah, music cataloging Jay: has its own rules. I mean, heaven. Dorothea: But wow, MARC was just not designed for that, and it shows.

Jay: Oh, it shows, it shows. Justin: Yeah, back to the explaining part of things as well. One of Justin: the main benefits always sold about Justin: linked data is that since the web Justin: is sort of a page- or document-focused sharing of information, this would allow Justin: subsets of information to be pulled, like Jonny said, pulling like all the headers Justin: from an article with a request. The thing is that, like, I could pull 9,000,

Justin: I don't know, 500 fields from a MARC record. What do I Justin: need that for? Because I don't know anything about the context Justin: of it without the full document. Plus, Justin: I'm guessing that's probably why it's so computationally heavy, Justin: is that everything has to be done through servers, whereas documents can be retained Justin: locally and it's mostly just text files, right? So it's sort of the same

Justin: problem blockchain had where everything had to be done computationally And that's Justin: why it took 20 minutes to buy a donut because it had to get pushed out to like 20 ledgers. Justin: And instead, this is like, if I want to query information, it has to go through Justin: different servers, which I think was kind of the idea of websites that heal. I have it pulled up. Justin: It's a John Rhodes blog post.

Justin: But when Johnny was talking about bots, I think that was the idea was websites Justin: like link rot would happen between them.

Justin: And eventually bots would just kind of communicate server to server constantly Justin: and then just fix links and they would heal themselves and that was kind of Justin: the idea and that blog post ended with if anyone wants to write this i'll help Justin: but until then but that's the thing is like it's very difficult to do that because Justin: if you've ever worked with like government websites.

Justin: Particularly like healthcare websites every presidential administration stuff moves entire Justin: divisions of the government and so they're on completely different domains and Justin: that's why government websites always break and like really important ones and that's also why the the, Justin: government tends to do a lot of like dot coms now where it's just like healthcare Justin: healthcare.com okay just go there and we'll point it wherever it ends up because

Justin: trying to keep, because I was an allied health librarian, and trying to keep those Justin: pages about like the Affordable Care Act up to Justin: date in LibGuides, I mean, thank god it has a very good link checker, but I constantly Justin: had to run that link checker because those things broke all the time. They.

Jay: Don't even keep their PURLs or whatever it is that they use, because like Jay: one year in grad school I was the gov docs librarian graduate assistant Jay: and half of my job was just like going through SuDoc stuff and then also, Jay: like, checking the PURLs or whatever permalink system that government websites Jay: and online GovDocs use. Jay: And just finding all of the broken ones, which was all of them. Jay: They don't even maintain their

Jay: permalinks. Yeah. Which is the point of permalinks, is so that the back... Jay: The URL itself can change. Dorothea: Well, if I... Can I rag on OCLC again? Jay: Yeah, always.
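The link checking Justin and Jay describe can be sketched in a few lines. This is not how the LibGuides link checker works (the episode does not say), only a minimal illustration. The function name `find_broken_links`, the example URLs, and the status lookup table are all hypothetical; in real use `get_status` would make an HTTP request, while here it is a stand-in dictionary so the example runs offline.

```python
def find_broken_links(links, get_status):
    """Return links whose status is an HTTP error (>= 400) or unreachable (None).

    `get_status` is any callable that maps a URL to an HTTP status code,
    or to None when the host cannot be reached at all.
    """
    broken = []
    for url in links:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken


# Hypothetical results standing in for real network requests.
fake_statuses = {
    "https://example.gov/aca/overview": 200,
    "https://example.gov/old-division/aca": 404,
    "https://example.gov/moved-again": 301,
}

links = list(fake_statuses) + ["https://gone.example.gov/page"]
broken = find_broken_links(links, fake_statuses.get)
```

A checker like this only reports the rot. Someone still has to find where each page went and update the link, which is the recurring labor both speakers are describing.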

Dorothea: Yeah, that was actually another example that Ruth and I wrote about in our piece, was OCLC and purl.org, Dorothea: which was not originally OCLC's, it was a grassroots little thing for, okay, Dorothea: here's a place where you can mint permalinks, and we'll keep the database of Dorothea: where they went to, and everything will just work, Dorothea: and we'll have happy permalink utopia. And then, Dorothea: with absolutely no warning, some Dorothea: years after OCLC took

Dorothea: over purl.org and made a very loud Dorothea: statement about how it was very important and Dorothea: they were going to maintain it, and definitely, uh, it Dorothea: broke. They broke it. The person, I Dorothea: don't know the details, I think Dorothea: the person who had been maintaining it left, retired, Dorothea: who even knows, but purl.org just completely broke. OCLC of course didn't give Dorothea: a fuck and it remained broken for like several years, and now the Internet Archive

Dorothea: eventually took it over and they don't give a fuck so you can't actually get any support for it. Dorothea: And a bunch of innocent third parties who believed OCLC's lies and gleefully Dorothea: minted all kinds of PURLs because they thought that infrastructure was going Dorothea: to stick around, got burned. Dorothea: Right? This idea that Justin, I believe, was talking about of self-healing websites. Dorothea: Right, that is nonsense.

Dorothea: That is garbage. The world does not work that way. The world needs maintenance.
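
[Editor's sketch: the permalink idea discussed above comes down to a lookup table that someone has to keep updating. The Python below is a toy illustration, not how purl.org is actually implemented; the class name, paths, and example.org-style URLs are all made up for the example.]

```python
from dataclasses import dataclass, field


@dataclass
class PurlResolver:
    """Toy persistent-link table: stable path -> current target URL (hypothetical)."""

    targets: dict[str, str] = field(default_factory=dict)

    def mint(self, path: str, target: str) -> None:
        # Minting fixes the public identifier; only the target may change later.
        if path in self.targets:
            raise ValueError(f"{path} is already minted")
        self.targets[path] = target

    def retarget(self, path: str, new_target: str) -> None:
        # The maintenance step: a person has to notice the move and record it.
        if path not in self.targets:
            raise KeyError(path)
        self.targets[path] = new_target

    def resolve(self, path: str) -> str:
        # A real resolver would answer with an HTTP redirect to this URL.
        return self.targets[path]


resolver = PurlResolver()
resolver.mint("/gov/aca-overview", "https://agency-a.example/aca")
# The page moves to another agency's domain; nothing updates this by itself.
resolver.retarget("/gov/aca-overview", "https://agency-b.example/health/aca")
print(resolver.resolve("/gov/aca-overview"))
```

[Nothing in the table heals itself: if nobody calls `retarget`, or the resolver service goes unmaintained, every minted link breaks at once, which is the failure described above.]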

Jonny: Yeah and and so like there's like this whole nest Jonny: of ideas about like roads not taken in the internet with a Jonny: lot of this because it's like i have the same feeling about just like Jonny: permanent ideas and like and as Jonny: i do about just like in general when i see like a yet another platform for scholarly Jonny: communication or like we're going to fix the ills of like academia by making Jonny: yet another platform is that just like this is intrinsically a political one

Jonny: where and it puts And it's one where you are putting power in the hands of a Jonny: specific organization that just like, Jonny: and the longevity of that is strictly social. Jonny: We're just like, it's the same way just like permalinks exist as long as the organization exists.

Jonny: And so like I have in general sort of like more faith than average that archive.org Jonny: will continue to exist in the next year, Jonny: although they are sort of like damaging that reputation lately to sort of like, Jonny: like, just like, you know, anyway, Jonny: we won't go there just being sort of like.

Jonny: I think that they have good longevity plans for their Jonny: archive of the web okay but and i Jonny: also in general think that like the doi system Jonny: is probably not going anywhere but that's largely because Jonny: it's like you know one of the mechanisms for extracting billions Jonny: of dollars from public funding every year then just like so there's Jonny: like social reasons why these things persist but it's like there's the major

Jonny: thing that was not taken like why the like as you're saying just like the web Jonny: doesn't work in such a way where it would be possible to do self-healing websites Jonny: or self-healing links is because it's designed to be a client to server, Jonny: you go to a place and get something that someone else controls entirely. Jonny: And like, you're not actually supposed to have any agency in this world. Jonny: And like, there's good reasons for that. Don't get me wrong.

Jonny: But just like, this is like one of the true things about linked open data is that just like, Jonny: it needs to be peer to peer, The way that it could conceivably work is as a Jonny: peer-to-peer system where it's possible to do efficient querying and caching Jonny: between a bunch of different peers.

Jonny: So it's designed to be distributing labor in this way instead of every time Jonny: someone updates a link or makes a new record, Jonny: everyone has to go and hit this one server to get this one URI that represents Jonny: this core concept or whatever. Jonny: And so as long as that doesn't exist, there's this duality of this beautiful idea of, Jonny: of basing semantic web and linked data on URIs?

Jonny: Is that just like, okay, and elegant simplicity of this idea that the identifier Jonny: is actually a location, that like location and identity are the same thing. Jonny: And when I go to that location, I'm supposed to get something useful from it. Jonny: And then that allows me to go to the next thing. That's like a wonderful, wonderful idea. Jonny: But in reality, it doesn't work at all because like identity and location are not the same thing.

Jonny: That like i didn't and because you know Jonny: for one one reason is identities change and like Jonny: that like that like and so like there's this Jonny: like you know classic thing that Jonny: everyone always reference on the web is that it's like cool uris don't change Jonny: that's another tim berners-lee classic it's like actually all Jonny: uris change all the time and like and for that to be something where just like

Jonny: you You have a polemic trying to force something to behave in a way that it Jonny: doesn't rather than adapting to the reality of that thing than just like, Jonny: yes, you buy yourself in an infinite failure. Jonny: And so like one of the there's this. Jonny: Raising your hand. Jay: I just want to jump in. Yeah, we do the raise hand thing to like you can keep Jay: going. And then when you're done, Sadie will say something.

Jonny: But also just like interrupt. I actually would start trying to make some notes Jonny: to organize this thought, cause this is a long idea. So like, I, but like, yeah. Sadie: Oh, I've been thinking a lot about the purpose of a system is what it does. Jonny: Completely. Sadie: Right. Not what it thinks, not what it was designed to do, because we all know how design goes awry. Sadie: But yeah, the purpose of a system is what it does. Dorothea: Right on.

Sadie: I don't remember where I saw that. I love systems theory. Jay: Yeah, right. Sadie: Right. Jonny: So if you, if anyone has ever maintained a website or any sort of web technology, Jonny: we're just like, if the intention of this thing is to be liberating and freeing, Jonny: it certainly doesn't feel that way. Jonny: That just like that, like, you know what it would take to actually maintain a URL for forever.

Jonny: Like if that's the way the web is supposed to be, that the purpose of the web Jonny: is to like put these documents on the web. Jonny: Like it didn't, it doesn't do that. So yeah, exactly.

Jonny: That just like the purpose of the system is different. We're just like, and like, Jonny: again like thinking about just like all the ways that the technical Jonny: development has been stunted by the you know commercialization of Jonny: the web that just like precluded a lot of these things from existing is Jonny: like it's not an accident it's so like so Jonny: one of like one of the ways the ways that linked data is working en masse right

Jonny: now in a pretty invisible way is the fediverse and this is like what we were Jonny: talking about the last time i was on here so i won't belabor the point but it's Jonny: just like that that's built on linked data at least in the abstract and this is sort Jonny: of fascinating like realization of that we're just like Jonny: like for example like Mastodon like the largest implementation of Jonny: that does not actually use linked data as its internal data model that's all

Jonny: like a Postgres database that then it's sort of just like synthesizes JSON-LD Jonny: out of and like as like there's benefits and trade-offs that we're just like Jonny: as a result it sort of doesn't do all Jonny: of the linked data parts of what ActivityPub was supposed to do. Jonny: But there's the other, like, one other major alternative to this is Pleroma Jonny: and Akkoma, like the fork of Pleroma that is based on a graph database.

Jonny: And that can do a bunch of really interesting things. Jonny: But it also is, like, always crashing all the time and, like, Jonny: sort of hard to, too, because it's like, you know, think about just like, Jonny: because social networks are networks, it's like easily modeled by a graph. Jonny: And, and so doing something as simple as just like, there's this notion of like Jonny: these containers and these ordered collections and stuff like that in ActivityPub.

Jonny: And one ends like this, I have, you know, obviously lots of feelings about this, Jonny: this particular spec, but like, one of them is I have.

Jonny: A this notion of who i'm addressing my message to Jonny: and i should be able to address it to whoever Jonny: i want to that i have i can address it to this one controlled Jonny: ontology term public and that's just like i'm sending it to Jonny: the world but also it should be possible for me to have collections Jonny: of people and like i can address it to Jonny: this collection of people and so it's like in that way i Jonny: have a graph and then that graph is.

Jonny: Modeled like and all the relationships are modeled within ActivityPub as Jonny: being like i'm allowed to send it to these people and i Jonny: want to send it to this subset of them in this particular case Jonny: and so you can do stuff like that in Jonny: Akkoma and Pleroma like i like the ui for it is a little less than Jonny: what could be desired but that's not something you can do in Mastodon where Jonny: each one of those addressing features has to be carefully

Jonny: architected from like as a as a database query Jonny: so like there's a this this tension of Jonny: just like okay we try and do it the semantic web way has the Jonny: beautiful possibilities but it's like really hard to implement and one Jonny: of the things that's hardest that was extremely like big reach and Jonny: was really only like done and made work Jonny: by just the sheer hegemony of mastodon as Jonny: like you know the the thing that if it does something everyone

Jonny: else has to adapt around it is like implementing editing Jonny: like you know thinking about just like i have Jonny: a post i want to edit that post that means i have have to propagate Jonny: that new version out to everybody else and so Jonny: like thinking about just like what it would take to have like these Jonny: sort of self-healing websites or just like the ability for the Jonny: web to adapt to change is like you need to have that expectation that just like

Jonny: everything that i know about i should be able to receive changes and be able Jonny: to propagate those among the people in the same way that just like that's how Jonny: rumors and horizontal information transfer works generally is that just like.

Jonny: Oh, I heard that this new thing happened, Jonny: and I tell my friends about it, and just like, you know, maybe and doing so Jonny: in a way that's like actually safe, and, Jonny: that is resistant to counterfeiting is a remarkably hard thing to retrofit into Jonny: a system and so like that's like like. Jay: How do we make the web actually rhizomatic.
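
[Editor's sketch: the addressing model Jonny describes, where a post goes either to the special public collection or to a named collection of people, can be illustrated in a few lines of Python. The `to`/`cc` fields and the `#Public` URI come from the ActivityStreams vocabulary; the actors, the "librarians" list, and the `recipients` helper are hypothetical, and real servers also have to fetch remote collections and handle `to` given as a single string.]

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"

# Hypothetical collections owned by one actor on a made-up server.
collections = {
    "https://example.social/users/jonny/followers": {
        "https://a.example/users/dorothea",
        "https://b.example/users/jay",
        "https://c.example/users/sadie",
    },
    "https://example.social/users/jonny/lists/librarians": {
        "https://a.example/users/dorothea",
        "https://b.example/users/jay",
    },
}

# A note addressed to a subset of followers rather than to the world.
note = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Note",
    "attributedTo": "https://example.social/users/jonny",
    "to": ["https://example.social/users/jonny/lists/librarians"],
    "content": "Cool URIs change all the time.",
}


def recipients(obj, known_collections):
    """Expand the addressing fields into individual actor URIs."""
    found = set()
    for field_name in ("to", "cc"):
        for target in obj.get(field_name, []):
            if target == PUBLIC:
                continue  # marks visibility; there is no inbox to deliver to
            # A known collection expands to its members; anything else is one actor.
            found |= known_collections.get(target, {target})
    return found


def is_public(obj):
    """True when the object is addressed to the public collection."""
    return PUBLIC in obj.get("to", []) + obj.get("cc", [])


print(sorted(recipients(note, collections)))
print(is_public(note))
```

[When addressing is a walk over a graph like this, "send to this subset" needs no special feature. An edit to the note would have to travel to the same expanded set of recipients, which is the propagation problem raised above.]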

Jonny: Yeah and yeah and this is Jonny: like again it goes back to the like the dawn of Jonny: the web browser and what it is as a technology is Jonny: like this idea of the read write web we're just like it Jonny: should be just as easy to write as it is to read Jonny: on the web and like you know obviously controlled by Jonny: permissions in some way but like this that Jonny: experiment died basically when netscape Jonny: won in the early browser wars but then

Jonny: it persisted in the form of wikis and this notion Jonny: of soft security where just like how do Jonny: we make that work is we make it so that doing Jonny: this kind of like you know we allow stuff stuff Jonny: to happen but then make it so it can't damage the Jonny: system in some profound way we're just like if someone Jonny: does something they're not supposed to do you know someone goes and Jonny: vandalizes a wikipedia page or whatever then like sure the next person that

Jonny: goes and loads that page might see a bunch of vandalism and that's bad but like Jonny: it's not it doesn't ruin the page it doesn't break it forever and completely Jonny: like it's possible for me to revert the old version of it and and so on and so forth. Jonny: So like, and that's a radically different political vision than the, Jonny: most of the web stack that we're familiar with. So just like that, it's like that.

Jonny: Ultimately, for this technology to work, it Jonny: needs to be constructed on a different set of political Jonny: primitives that include other people existing and being able to do stuff in Jonny: a way that just like is very uncomfortable for like most of the people who design Jonny: web technology nowadays to think of that as being I'm going to design a platform Jonny: that I administer for other people.

Jonny: And so instead like thinking about Jonny: it as being stuff that is designed so you get Jonny: out of the way like the most successful technology that Jonny: would enable like semantic web stuff is that no longer requires Jonny: the developer to be there and allows people to actually have autonomy on computers Jonny: but again there's no percentage in that it's in fact anti-profitable and so Jonny: like that's it's a very difficult thing to organize that kind of not only a

Jonny: technical vision, but social vision as well. Jonny: Yeah. I always end up just like back in wiki world. Jonny: It's just like some of the most, some of the most lovely parts of the web, as far as I'm concerned. Jonny: I'm still curious if I can find this, this like linked data music project. Jonny: Cause that also is something I'm interested in. Jonny: Oh, so like, I don't know. I feel like the thing I think about is like survivable web technology.

Jonny: Always just like return to like pirate networks being sort of like the things Jonny: that can exist and do survive on the web we're just like what are the longest Jonny: lived things on the internet and it's like the w3c website just sort of they Jonny: win by the hell but like but like, Jonny: other than that like pirate networks like that is the other major answer that just like some of Jonny: those like mp3s that were

Jonny: like released on Kazaa or something like that are still floating Jonny: around and that just like you Jonny: compare that to the extreme adversarial Jonny: conditions by which the entire global intellectual property regime is bearing Jonny: down on you and still it happens like why does that work and like you know to some Jonny: degree it's a technological question but it's also a social question of just Jonny: being like because people take it as their responsibility that it's like i see

Jonny: myself as an active participant in this system. Jonny: And so when my pirate site gets shut down, I go to the next one and put everything back up. Jonny: So, yeah, that's anyway, you've got to love the pirates, although there's a Jonny: huge amount of power and political problems in those circles as well. Jay: Librarians need to read that, like how to form an affinity group zine and like Jay: go from there, see what happens. Justin: I mean, I was.

Dorothea: It's as likely to work as anything, really. Justin: Yeah. I think one of the practical reasons also linked open data is always Justin: difficult is that kind of all files are local files in the same way that like Justin: all history is local history because it's always local to somewhere.

Justin: Anytime I try and think of, you know, particularly like when you mentioned EADs, Justin: there used to be a lot of stuff in the EAD literature about like, Justin: why does no one share their local authority files? Justin: Like, you know, like John Fox Smith donated to the library and we have his name Justin: authority file in like our decks, but he doesn't have like a library of Congress Justin: name authority because he wasn't famous enough.

Justin: Right. So everyone's got there. Right. Right. He just had a bunch of money. Justin: Right. And so, so, so we have all of these people who are local in our local Justin: name authority files and they never, ever get shared and they always stay siloed.

Justin: And yes there is almost no Justin: solution to it because the amount Justin: of labor it would take to like disambiguate the Justin: names people who have common names and you Justin: know is this the same person and then who's going to do Justin: it too because like they barely have enough staff Justin: and special collections anyway so who cares if like every local Justin: donor is going to get their own name authority file while and like I think another

Justin: thing is like like Jonny mentioned having like the way Jonny uses a word would Justin: have to go to a URI it's kind of when we were talking about taxonomy last week Justin: and that episode hasn't come out yet but I, Justin: Sort of like the issues with like taxonomy for animals and everything. Justin: You need like smaller sets of words, not bigger ones in order to actually make it useful for humans.

Justin: So when I was working with the bird working group, it was like everyone keeps Justin: using too many different words. Justin: We need to just all we need to solve this problem is like a short list. Justin: And then we can use that as like user submitted metadata and tags. Justin: And that's really all we need is just to agree between us humans, Justin: we're going to use the word paleo-ornithology instead of archaeo-ornithology.

Justin: And, like, that's all we had to do is, like, kind of get people to agree to that. Justin: There's not really, like, a technical solution because, you know, Justin: the entire birdworking group of paleo-ornithologists is, like, Justin: if they were all on a boat and it sank, there wouldn't be a birdworking group. Justin: Right. So it's, it's not too difficult to like, it's, it's not an impossible Justin: like political solution.

Justin: And it's what I always keep kind of thinking about is like, we have all these documents. Justin: Yeah. And there it's, it would be nice to break things up into data and share it as linked data. Justin: But as an organization, you don't really need to depending on the size and scale. Justin: And so that's why like so many libraries have their own.

Justin: When I think of like how a library is organized, it is ultimately you know the Justin: reason why MARC is like that is its access points and it's kind of what we always Justin: default back to is what's the access point for this and i don't really care.

Justin: Semantically like how the data works as long as like this is a subject area Justin: this is the title this is the author how do i get to the information like the Justin: quickest possible steps and then that And that leads to, Justin: I feel like that's where always the disconnect has been for me with linked open data of like, Justin: when is this going to help my users in my library?

Justin: It's like, well, you can get stuff out into the, and it's easy for me as a scholcomm Justin: person, because it's like, I'm the only person who's like, no, Justin: I want this out everywhere in the world. Justin: I want everyone to look at this. But everything else in the library is categorically Justin: organized around how do people in here find the stuff that we're looking for?

Justin: And I'm the only one who has to flip that and try and say, how do we get what's Justin: in here out to the world with no barriers and restrictions and logins?

Jay: Yeah like was it last year maybe a couple years ago i was part of the like pcc ad hoc, Jay: group that put out the final decision about like hey maybe don't put gender in name authority files, Jay: because there was the initial one and then a lot of people got mad at that one Jay: and then i was part of the ad hoc hey let's revisit this thank you for your Jay: service and one of the final sticking points.

Jay: Like, cause most of us were on board with like, maybe let's just don't like, Jay: it's too complicated to think of any ways to like put consistent language ways to do this ethically. Jay: That's not going to hurt like trans people was mainly who we were thinking of, Jay: but like, there's other reasons why you might put gender. Jay: I'm like, some of the reasons were like, but with like Asian names, Jay: sometimes it's hard to disambiguate. And I'm like, that's racist.

Jay: Like, that's just lazy and racist. But the Jay: big one like the final kind of sticking point where we were Jay: like maybe there's a point here but ultimately no we Jay: don't care was well in a Jay: linked data environment people could query Jay: books about xyz written Jay: by trans authors or for Jay: example like you can do a SPARQL query with Wikidata where Jay: you can be like pull all of the towns that currently have Jay: female mayors or whatever is usually the example that

Jay: they use when they tell you what SPARQL can do with Wikidata Jay: like what if you could do that with a library Jay: catalog whoa and we had Jay: to be like yeah but no not discovery Jay: layers like Primo doesn't even Jay: do that yet like no discovery layer right now Jay: that's like popularly used by academic or Jay: public libraries has that capability they might Jay: have linked data in the records and they might have APIs exposed if you have

Jay: a developer who can do neat shit but ultimately that's not how those searches Jay: work right now so maybe it is available in the future but for right now we don't Jay: care and that's not the purpose of name authority files so right, Jay: yeah like.
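
[Editor's sketch: for readers who have not seen the "female mayors" demo Jay mentions, this is roughly what such a Wikidata query looks like, wrapped in Python that only builds the request URL and does not send it. The property and item IDs (P31 instance of, P279 subclass of, P6 head of government, P21 sex or gender, P582 end time, Q515 city, Q6581072 female) are Wikidata's; the `build_request_url` helper and the `format=json` parameter are illustrative assumptions, and the query is an approximation of the well-known example rather than a copy of it.]

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT DISTINCT ?city ?cityLabel ?mayor ?mayorLabel WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .         # instance of (a subclass of) city
  ?city p:P6 ?term .                        # head-of-government statement
  ?term ps:P6 ?mayor .
  ?mayor wdt:P21 wd:Q6581072 .              # sex or gender: female
  FILTER NOT EXISTS { ?term pq:P582 ?end }  # no end date, so the term is current
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
LIMIT 20
"""


def build_request_url(query: str) -> str:
    """Return a GET URL for the query; sending it is left to the caller."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)
print(url[:60] + "...")
```

[The query only works because someone recorded a gender statement on each person's item, which is exactly the data the group decided name authority records should not carry.]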

Jonny: The the question of just like what is it for like what is the point of it you Jonny: know why Why would I do it if there's no use is like also ultimately really Jonny: just like, like beliefs about like how things are supposed to be designed.

Jonny: We're just like, is the goal of it to be able to get an exhaustive and true answer Jonny: of all of the, you know, cities that have a woman as a mayor, Jonny: you know, is that, that the point of what we should be doing with semantic web Jonny: is to like make the correct information exist in a unified vocabulary.

Jonny: And like, I don't, I like, spoiler alert I don't think so that just like well Jonny: because there's no such thing as like the authoritative and complete true archive Jonny: of all knowledge but it's also just like.

Jonny: Thinking about is like well that's like an impressive technical Jonny: feat that i could put on like some sort of like tech Jonny: specs document that just like my query engine Jonny: can produce 10 billion triples in like one one Jonny: second but like yeah like what's Jonny: the point of that and just like thinking about it like in the context of language Jonny: we're just like it's also related to the notion of like Jonny: ontology curation about just like how do we come to

Jonny: like know the terms that are the one term to Jonny: use is like that's only an important question if the Jonny: goal of it is to like make everything be totally uniform Jonny: and also that that act Jonny: of searching is like relatively precious and hard Jonny: to do and like i can only do one Jonny: of these or something like that that just like this is not an iterative process Jonny: of exploration and ultimate and

Jonny: also that just like you're not able to so like the thing about just like the way Jonny: that this works with language we're just like it doesn't ever work with language like Jonny: say new phenomenon exists in the Jonny: world like we need to get the council of languages Jonny: together to agree on the one word for Jonny: that and then everyone from then on has to agree to only use that word to refer Jonny: to that phenomenon it's like that never how it has happened and it never will

Jonny: be and just like instead just like this sort of local interpretation of what's Jonny: happening in my immediate reality and just like you try and use this word and Jonny: is this effective with it when when I say it in this way. Jonny: Oh, what I'm talking about is this.

Jonny: And Oh, I know it as this. And just like this sort of negotiation over what Jonny: things mean and in what context and to who, and like being able to have your Jonny: personal vocabulary and ontology where just like, as your history of your browsing. Jonny: It's like, I've come to know that these terms are the same terms or just like Jonny: when I am in this neighborhood of semantic space, Jonny: I use this word instead of this word. And like.

Jonny: Then you can imagine like the collective power of something like that. Jonny: We're just like, okay, all of my friends know these words as being the same. Jonny: And so just like in general, I can ask around and say who I'm looking for this. Jonny: Does anyone know how I would refer to that? And just like being able to, Jonny: you know, make sense of just like as like as like an iterative and a social Jonny: and an interactive process.

Jonny: Not one that's done once as if it were like a database query with a very controlled Jonny: database schema that's like known in advance.

Jonny: And so like it just it changes our Jonny: expectations for what technology should look like that just Jonny: like i don't go to the vast impersonal search Jonny: engine that indexes the whole web but instead Jonny: i have to actively cultivate sort of like Jonny: a set of nodes and and friends and like relationships and Jonny: like prior acquaintances with this kind of thing and Jonny: then expect it to take a little bit of Jonny: time to find stuff you know

Jonny: that just like that and like i that sounds Jonny: sort of counterintuitive we're just like i'm not saying it in create exclusion Jonny: or create inefficiency but like Jonny: that just like the goal of the system isn't to produce Jonny: maximally true maximally numerous and maximally Jonny: cleanly organized data all the time and like Jonny: it's just like it it's i can imagine like thinking about just like what happens

Jonny: you know just like like just talking about just like why doesn't everybody share Jonny: their their like local i actually i'm not familiar with this term like authority Jonny: file i assume that's like you know like a local like reference like subject. Jay: Headings or like if you publish a book like your name how that's in the library Jay: of congress it's an authority file.

Jonny: Gotcha yeah they're just like they're it's also just like one of the things Jonny: who gets to do that you know that like the same problem with just like, Jonny: you know, libraries and museums being the sites of just like pillaged cultural artifacts. Jonny: It's just sort of like not your job and not your role to be the purveyor of Jonny: this information like it's about this person. Jonny: And it becomes your role because like they have no means of doing so themselves.

Jonny: Like there's just like these systems aren't ones that can be touched by the Jonny: average person. Like I can't like deposit a book myself in Library of Congress.

Jonny: I need some intermediary force and so like Jonny: that's just like that like there's another part just like Jonny: why doesn't it happen and why doesn't it work is because like on the Jonny: other end just like who is it for and should we even do Jonny: that at all because like same thing of just like what happens when Jonny: you need to change your dead name in in the Jonny: all the bibliometric records like how does Jonny: that happen yeah i

Jonny: freak all my software friends out when i talk about eventually Jonny: needing to write the anti-performance performance manifesto that Jonny: just like sort of like that just like like Jonny: and someone who is like a friend on Jonny: the fediverse and it's like we talk all the time just sort of like horrified just Jonny: like what do you mean software should be delightful to run and like just Jonny: like yeah yeah that's not exactly what i'm referring

Jonny: to though just being sort of like that like the we need to get page load time Jonny: down to two milliseconds or life will be lost and meaningless as we know it Jonny: as just like a set of ideological commitments rather than making stuff be usable Jonny: by people is the thing I'm talking about. Jonny: Oh my god i'm opening this i'm opening this you have an authority file you have an official uri i. Jay: Do i have a uri and i'm part of the problem.

Jonny: We all have many uris yeah i helped. Jay: I helped write a book in like 2018 in my during my first job hell.

Jonny: Yeah so like one of Jonny: the interesting things that i think that Bluesky and Jonny: AT Protocol has done is like make it so that Jonny: like domains are sort of meaningful as identity we're just like yeah that's Jonny: cool yeah that just like i have a domain and like control over a domain and Jonny: that gives me a source of identity even if it doesn't give me control over the Jonny: computers that host the thing that you know whatever like we talk about that

Jonny: different time but just being like it's very Jonny: interesting that just like that has resurged and actually genuinely useful. Jonny: And I think one of the best ideas to come out of it is like actually using those, Jonny: like I, you know, URIs and URLs has just literally, this can be my name.

Dorothea: Yeah. Because it's language independent, human language independent and things Dorothea: like deadnaming, which we have to deal with in the authority file environment Dorothea: because it is predicated on names. Dorothea: It's just a URI you don't have to do that you can attach any name you want to it so there's.

Jay: Definitely that's the good thing about URIs is it allows the flexibility for Jay: trans names or any other kind of name that might change absolutely that's the Jay: good part about them love URIs. Dorothea: That's one thing that I want to keep out of all this nonsense: URIs as identifiers Dorothea: was genuinely a clever and useful idea.

Jay: Yeah, it was a big deal when the Homosaurus moved from having the terms be the Jay: URIs to having alphanumeric URIs so that we could change terms as language use changed. Dorothea: Yeah, love it. Justin: Did they ever tell you don't put semantic information into URI? Justin: Everyone does it. It's so stupid. Jay: We're queer. We don't listen. Fuck you. Jay: Um doi.org doi.org. Justin: Slash my journal volume one and.

Dorothea: It's like yeah if you ever if you Dorothea: ever meet Geoff Bilder who's a wonder he works at Crossref wonderful human being Dorothea: he has many many many rants about publishers coming to Crossref wanting to change Dorothea: a DOI prefix because they merged with another publisher or internal change publishers Dorothea: or whatever the hell and he's like no that's not the point.

Justin: They um they have a a suffix generator now it's just it's literally just a spreadsheet Justin: that generates a suffix but they're like use this idiots yes. Jay: Please is that like half your job Justin is just being like. Justin: No i don't mint i mean i don't mint DOIs manually usually but the thing Justin: that always bugged me was OJS used to put semantic information into the automated Justin: strings that it would create.

Justin: So it would create, it would say like V and then the volume number and then Justin: N and then the article number. Justin: And I was like, don't do that. Just put random numbers. Just put random numbers. Justin: Just general, just random number generator. That's all you need to do. Justin: But they didn't do it until the latest update. So now they do it properly. Dorothea: Where you can do what every single baby relational database administrator knows to do, and just count.

Jay: I don't know how to count. I'm gay, as we've learned from the Homosaurus. Justin: I do have an Excel sheet of... Justin: Manuscripts and databases and it's just zero zero zero zero zero what is this yeah. Jay: What happens when you go beyond the capacity for how many zeros you picked with them. Justin: Doesn't matter.
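
[Editor's sketch: the "just put random numbers" advice amounts to generating a suffix that carries no volume, issue, or title information. The Python below is a minimal illustration and not Crossref's or OJS's actual generator; the function names are made up, the alphabet choice is an assumption, and 10.5555 is used only as a placeholder prefix.]

```python
import secrets

# Lowercase letters and digits, leaving out i, l, o, u to avoid lookalikes
# and accidental words. This particular alphabet is an assumption.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def opaque_suffix(length: int = 8) -> str:
    """Random suffix that says nothing about journal, volume, or article."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_doi(prefix: str, existing: set) -> str:
    """Draw suffixes until one is unused, then record and return the DOI."""
    while True:
        candidate = f"{prefix}/{opaque_suffix()}"
        if candidate not in existing:
            existing.add(candidate)
            return candidate


registered = set()
print(mint_doi("10.5555", registered))
print(mint_doi("10.5555", registered))
```

[A plain counter, as Dorothea suggests, is equally free of semantics. The difference is that counters can be enumerated and random strings cannot.]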

Jonny: Okay and like it's like it's like all of these things like have their times Jonny: and applications and usages and everything like that we're just like just do all of them and make them Jonny: all point to you know the same thing different things etc Jonny: that just like like because i think like you know Jonny: sequential numbering identifiers work you Jonny: know there are times when you don't want to use Jonny: it like we're just like you have like potentially personally identifying

Jonny: information where you don't want someone to be able to enumerate over all possible Jonny: things and find all the stuff on the server and spoiler alert is like university Jonny: IT does a terrible job at this and frequently will just have like very sensitive Jonny: documents hanging out that can be publicly enumerated on their public web.

Jonny: But like, you know, so it's like super useful when designing some systems in Jonny: the same way that just like having totally anonymous strings is super useful in like PID space. Jonny: But then want to have semantic URIs and some other content that just like do all of these things. Jonny: And like the other one is like the content hashing where just like the identifier

Jonny: is like intrinsically based on the content of the thing. So if I have the thing, Jonny: I know how it would be called everywhere in the world, like has its own benefits and trade offs. Jonny: That's like, that is one of those dangerous ideological territories where just Jonny: like you get pirates and also cryptocurrency zealots in the same room.
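[Editor's sketch] The content-hashing idea Jonny describes, where the identifier is derived from the bytes themselves, can be illustrated in a few lines. This is only a sketch assuming SHA-256 and a made-up `sha256:` label; real content-addressed systems use their own encodings and chunking rules.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier purely from the content.

    Anyone holding the same bytes computes the same identifier,
    with no registry or minting authority involved.
    """
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical usage: two copies of the same bytes agree, an edited copy does not.
original = b"The Once and Future Linked Open Data"
copy_elsewhere = bytes(original)
edited = original + b"!"

same = content_id(original) == content_id(copy_elsewhere)
different = content_id(original) != content_id(edited)
```

The trade-off raised in the conversation follows directly: the identifier cannot be enumerated or guessed without the content, but any change to the content, however small, produces a different identifier.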

Jonny: And it's just sort of like, like, it becomes this maelstrom of just like, Jonny: the same idea, meaning completely different things to different people. Jonny: But like, yeah, yeah, we're not going to solve the identification problem, Jonny: but basically just like, you know, it's the rigidity and being only able to Jonny: use one thing that like is the problem to me. Justin: Yeah. Now, I don't have Library of Congress name authority file, though.

Justin: Someone from Florida with my name born same year as me does, which is confusing. Justin: There's so many people with my I went to high school with someone with my name. Justin: It's very confusing. It doesn't seem like it should be that common. Jonny: It makes you harder to dox, though, so that's like passive self-defense. Justin: It is really good. I have successfully scrubbed my information off the web several times.

Justin: It's not hard. Or one time I couldn't do it, so I just redirected it to another dude with my name. Justin: And so I just changed my information to, I changed my address to his.

Jonny: And I feel like this would be something that just like, Jonny: like Dorothea would probably have stronger thoughts about, Jonny: it's like the notion of privacy and like when Jonny: it comes to like linked open data and stuff like that, where just like, the Jonny: fact that just like we don't want all the world's information to be public. Jonny: We don't want like the Justin authority record that includes your, you know, social

Jonny: security number and you know phone number and everything like that like like Jonny: like limits to openness you know that just like needs to be some amount of like Jonny: fungibility and yeah i'll.

Dorothea: Actually give you a real world example. If you Dorothea: go and look at my Wikidata page, and you Dorothea: can just go to wikidata.org and look up Dorothea Salo, I'm Dorothea: the only one as far as I know that has ever existed, so what you find will be Dorothea: me. Although I identify, like, I'm cis female, that is how I identify, Dorothea: that's who I am, my Wikidata page actually says no gender, no gender recorded. Dorothea: And the reason for that is that Wikipedia,

Dorothea: with which I have a very vexed relationship, Dorothea: runs through Wikidata every now and again Dorothea: to do things like make lists of people Dorothea: who maybe should have Wikipedia entries but don't. Dorothea: And of course they do this for minoritized and Dorothea: underrepresented populations, and of course Wikipedia Dorothea: is well known for having a huge gender problem, Dorothea: gender disparity coverage problem. So I get sucked up into those lists, and nobody

Dorothea: asked me. I do not actually want a Wikipedia page, thank you very much, and Dorothea: I would rather not be. So I changed my gender that is listed on Wikidata. Justin: I did not actually change. Dorothea: My gender that's. Jonny: Dope, like anti-bot action, like you just like. Dorothea: Yeah, it did seem to be the only option for saying, no, don't make me a Wikipedia entry transfer. Sadie: The privacy of it. Dorothea: Pretty. Justin: Much gender opsec. Jay: My gender is.

Justin: Fuck off, Justin: get this gender working for me, Justin: yeah no that's why I also like ORCID iDs too because it's a very nice system Justin: that you get to control and you get to you get to write your name how you want Justin: it you can write it in multiple scripts, Justin: and it's just an ORCID and it just will point to whatever you tell it so you Justin: can change it whenever you want and that's what I really like about Justin: it is you know that that would

Justin: be something that would be very nice to use Justin: for like local archiving and stuff like that Justin: but the reason why is like no one's going to bother to do that nerds will Justin: do that but like i couldn't even get like faculty to do it even when this would Justin: save them time in the long run or it would make right or it would solve headaches Justin: like if they don't if they have a double barrel first name and people keep putting

Justin: their second first name as their last name, it would solve them this problem, Justin: but they, you know, they don't go sign up for an ORCID. I. Jay: Was actually, when I was cited in the ethics in name authority files book, one Jay: of the chapters, and then they asked like how I wanted to be cited, I was like, I would like my ORCID, Jay: because they were citing one of my articles or my thesis or something they had

Jay: my dead name on it and I was like, I want you to do it this way and I want you Jay: to have my ORCID in there so that it's collocated, like properly links back Jay: to like all of my stuff, right? And I think it was brie actually then went on Jay: to write an article and talk about, Jay: like how I asked to be cited Jay: in that book as like Jay: using ORCIDs and URIs and linked data as Jay: a way to help trans people who maybe have Jay: published under dead names um and if

Jay: they don't want to go back and change like ask Jay: for it to be changed which i don't but this way i can have people cite me and Jay: just use my like first initial and it point back to my current stuff and everything Jay: i've done with my current name while also still being like but i'm also the Jay: person that wrote that yeah it's not that hard.
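[Editor's sketch] The collocation Jay describes, where works published under different name forms still resolve to one person, amounts to grouping on the identifier rather than the name string. The following is an illustrative sketch only; the `Citation` class, the sample works, and the fallback key are hypothetical, and the iD shown is a placeholder rather than any speaker's real ORCID iD.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Citation:
    name_as_published: str        # whatever string appears on the work
    title: str
    orcid: Optional[str] = None   # persistent identifier, if the author supplied one


def collocate(citations: Iterable[Citation]) -> Dict[str, List[Citation]]:
    """Group works by ORCID iD when present, falling back to the name string."""
    groups: Dict[str, List[Citation]] = defaultdict(list)
    for c in citations:
        key = c.orcid if c.orcid else "name:" + c.name_as_published
        groups[key].append(c)
    return dict(groups)


# Placeholder iD for illustration only.
PLACEHOLDER_ORCID = "https://orcid.org/0000-0002-1825-0097"

works = [
    Citation("Jordan Q. Doe", "A thesis under an earlier name form", PLACEHOLDER_ORCID),
    Citation("J. Doe", "A later article", PLACEHOLDER_ORCID),
    Citation("J. Doe", "Someone else with the same initials"),
]

groups = collocate(works)
```

Here the first two works land in the same group despite the differing name strings, while the third, which carries no identifier, stays separate instead of being merged on the name alone.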

Justin: Yeah especially if you like use initials because Justin: i use my initial a lot because i do have a very common name so i think Justin: but i used to write my full middle name and i don't do that Justin: anymore so it's nice to be able to be like okay i Justin: published my thesis with my full name but now i only Justin: like using my middle initial yeah and now i'm Justin: at an institution where i'm the only one of me so i don't even have a number

Justin: after my name i was very excited when i got my email signed to me because there Justin: is now someone else at my university with my name so there is like a zero one Justin: now and i'm like ah finally got there first i used to get detention because Justin: of some dude had my name are.

Jonny: You serious i. Justin: Get his detention yeah they used to put out a roll with the names at the beginning Justin: of the period teachers had to check them and if you were on the list you had Justin: to go to the cafeteria so i kept getting called into the cafeteria because Justin: it wouldn't disambiguate my name. Sadie: I had that happen to me too. I had my birth last name, which is, Sadie: I changed my last name when I got married. Sadie: My birth last name is Johnson.

Sadie: So there's like, not only are there 70 billion S Johnsons out there, Sadie: but I have a cousin who has almost the same exact name as, we were born almost Sadie: the same exact person practically, right? Sadie: We have the same name, the same first name, same last name. Sadie: Neither of us use our middle name, right? Yeah. So- Sadie: I got told I was supposed to go to detention a couple of times in high school Sadie: because there was another person with my name. It's common.

Jonny: But like that's a, you know, like free bad kid social currency, Jonny: you know, just like, hell yeah, I'm going to detention, baby. Jonny: Like that's like, you don't even have to do it. So you get the best of both worlds. Justin: Well, I used that. What they said to me was, well, if they don't put your middle initial, it's not you. Justin: And I use that excuse for the next four years, even though that dude was a senior when I was a freshman. Jonny: Said no middle initial.

Justin: It's not me. Jonny: It's like can't. Sadie: Make me do it. Jonny: That's like just social engineering you know just in the real world you know Jonny: just people just intuitively do it there. Sadie: Is no difference between social engineering and con artistry. Jonny: Hell yeah yeah. Sadie: I will die i will die on that hill.

Jonny: Yeah a good friend Jonny: of mine is having a crisis of like direction Jonny: in life and i'm like okay so Jonny: your strengths you are super good at like Jonny: infiltrating unfriendly organizations and Jonny: groups of people and like taking on roles and shit Jonny: and did you know that that is a job and like Jonny: um and so like trying to like yeah turn this person totally a job like it's Jonny: like and a lot of the people that do it sort of accidentally find themselves

Jonny: you know like like you know seeing it the first time like holy shit you can Jonny: do that and then just like suddenly becoming really good at it anyway i. Sadie: Feel like the the alternate of that fork is improv comedian. Jonny: Their their. Jay: True their true destiny is they just become podcasters improv people are good Jay: at doing podcasts like all my favorite podcasts i've learned like the people Jay: did improv i have no idea what. Sadie: I'm doing here.

Jay: Yeah. Jonny: That's like something we did improv that one episode what you did like improv Jonny: games or like what what are you talking about we. Justin: Had we'd seriously wrong on we did skits and those. Jonny: Were oh yeah i i. Justin: Dipped i was bad at it we. Jonny: Were very. Justin: Bad at it but they very good at editing. Jay: They're so good at editing my god when i finally listened to the episode I Jay: was like, oh, wow, they made something out of this. Yes.

Justin: But, yeah, the only thing that we didn't mention that I wanted to maybe mention Justin: is kind of what we talked about last time was whoever controls the nodes of Justin: a graph can control the graph.

Justin: And so I was also thinking about that as a security problem with linked open Justin: data is, you know, when we were talking about like all of the privatization Justin: happening, if someone buys a certain node of the graph, Justin: then the same problem Sadie was saying with everyone having their own API is Justin: like, if you're controlling this graph, Justin: even though it's open, and you control like the right permissions,

Justin: then like, I don't know, assume that's a problem that's going on. Justin: Because oclc has meridian now and i assume that that it only exists because it will make money if. Jonny: You control. Jay: The spice you control the universe. Jonny: Yeah is that a animal this is a. Justin: Very cranky.

Jonny: And just like desirous animal it's like my turn like i'm sure i haven't heard Jonny: about this this meridian thing was the first time i heard about this today is Jonny: this just like a it says may 2024 is it like i assume it's is it that new i. Dorothea: Hadn't known about it until today either for when it's worth oh. Jay: Clc just loves to do shit.

Justin: Our metadata librarian at my job is currently on Justin: like a committee for, I think, what is the organization, the Program for Justin: Cooperative Cataloging, and they're working on a task group for like Justin: URIs in MARC implementation. Justin: So I guess like they're going to have separate types of like handle based permalinks or something.

Justin: I don't know that are going to be in MARC, but they were also talking about Justin: how they had like a demonstration of Meridian. Justin: And I don't, I think it's just the linked data they've made out of WorldCat. Jonny: So they're, they're, they're using an entry for Octavia Butler as the demo data. Jonny: And I'm like, that's like an interesting, interesting, like person and body Jonny: of work to evoke in your like corporate platform. Jonny: Like that's just like, yeah.

Justin: The don't build this machine. Jonny: Yeah. Sadie: The Torment Nexus. Dorothea: Thank you. Don't create the Torment Nexus. Sadie: Wouldn't it be terrible if we created the Torment Nexus? Creates the Torment Nexus anyways. Dorothea: So here's a guess. And this is totally off the cuff just because, Dorothea: again, I only heard about this today. Dorothea: I think it is clear to OCLC that their WorldCat monopoly is not long for this Dorothea: world, one way or another.

Dorothea: Whether it's a customer revolt or we finally find a way to do this with linked Dorothea: data without getting sued out of existence, that's not going to last. Dorothea: So how can OCLC come up with a linked data store that they can fence around, Dorothea: limit to their customers the same way that they've done with WorldCat? Dorothea: That's what I think Meridian is.

Justin: Probably. Jonny: Probably. I mean, as you're saying, like, Jonny: they're doing it because it makes money Jonny: somehow, and like I think that's a pretty good bet. I mean, and it's like continuous Jonny: with the way that the rest of like linked open data has worked, where just Jonny: like that's like what Wikidata is to some degree, is that it's like basically Jonny: a captive labor pool. Like, and so it's like, who funds Wikidata is largely Google.

Jonny: And so like Google bought Freebase, like the predecessor to it, Jonny: you know, they did their attempts at cleaning it up and everything like that. Jonny: And then basically like shunted that into Wikidata and they profit from it immensely Jonny: by being clean, corporate friendly. Jonny: Like there's no like swearing on Wikidata, you know, and, and.

Jonny: Way of concentrating a bunch of labor so that Jonny: then they can mine it and make derivative profits from it. And Jonny: like, where just like the people that work on Wikidata are like Jonny: genuinely true believers in like the Jonny: beneficence of cataloging the Jonny: world's data. They're just like, they're like not corporate stooges, Jonny: they like view themselves as being like, we're just Jonny: trying to do the same mission as Wikipedia,

Jonny: which is just like, yeah, make a global information store, but Jonny: not really evaluating the like, why would Google want Jonny: us to do this, you know? And like, and so Jonny: just like that sort of pure production Jonny: as captive labor model is one Jonny: of those biggest sort of like, you know, red pilling moments Jonny: for like information people, is Jonny: that just like, what if it's actually bad to have like Jonny: these sort of like crowdsourced information platforms that

Jonny: just like so when we were watching when we Jonny: were watching Lo and Behold like one of the Jonny: like examples of just like the beauty of the internet and so Jonny: it's like again like every time i think about this is like this is a Jonny: movie that was released in 2016 which is not that long Jonny: ago but yet and yet it feels like a completely different Jonny: universe where just like this is like one of.

Jonny: The promising things about it where you had this like chemical reaction crowdsourced Jonny: thing where just like the wisdom of the crowds, Jonny: lots of people playing this game about like protein Jonny: folding or whatever, was able to do something Jonny: that, you know, the best scientists in the world couldn't do. And it's just like, cool, Jonny: but were any of those people on the paper that got published from that and from

Jonny: all of that work and like we're just like If it's just a thing where you farm Jonny: out other people's labor in time. Jonny: Or just in this case, farm out all Jonny: of the cataloging labor that happens in libraries into curating this...

Jonny: Collection of information in the same way that i i don't know Jonny: the politics of world cat i assume it's the similar kind of way we're just like Jonny: everyone is required to use this but we don't actually have much control over Jonny: it kind of thing and just like yeah like that is a a massive extraction vector Jonny: sort of hiding in plain sight under the guise of pro-social technologies.

Justin: Yeah, and this is probably more of the same, which Justin: is to make that data then usable Justin: and useful to AI products, I would Justin: assume. Particularly, it's interesting that they Justin: mentioned like incorporating ORCID and ROR, Justin: which are like scholcomm specific things really. Especially ROR Justin: is like a weird one to throw in there because that's like research organizations, Justin: right, to make sure that those are disambiguated, because journals are really

Justin: really bad at disambiguating like the biology department of this university, Justin: because departments change all the time and also people abbreviate them. And, you know, Justin: so there's no, there's no like one identity and that causes all kinds of problems, Justin: even just like getting the university right half the time. It's like, it's wrong. Justin: So ROR is kind of like ORCID for organizations. And so that's a very specific thing.

Justin: And I find that very strange. Like, do they want like regular, Justin: like cataloging librarians, like fix the scholcomm metadata thing? Justin: Problems that are out there they. Dorothea: Do like OAIster yeah. Justin: That like clarivate.

Dorothea: Can fix scoop that up back in Dorothea: the day what's that oh it was a Dorothea: union search engine for institutional and sometimes disciplinary repositories, Dorothea: is what it was. It's basically, there were always problems with. But the problems Dorothea: go back to OAI-PMH being complete garbage, Dorothea: such that you couldn't, for one of the things it does not allow you to say is, Dorothea: is there a full text associated with this item?

Dorothea: And so one of the reasons OAIster became completely useless is that it was choked Dorothea: with metadata-only records, which really disappointed end users because they Dorothea: couldn't click on it and get to the thing. Right. Jonny: And that's definitely why I auto-embed Sci-Hub links in all of my writing, Jonny: because it's just like, what use is it to someone else for me to cite something Jonny: if they can't actually see it?

Justin: I wonder how they scrape the full text information now when stuff gets pulled Justin: from OAI-PMH, because it still does. Justin: Because OAI-PMH is how we push out to CORE, but it definitely does know if we've got full text. Dorothea: I have to think they implemented a check, which is fascinating because they Dorothea: would have had to implement such a check for pretty much every single repository Dorothea: and repository design in existence.
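[Editor's sketch] A per-record heuristic of the kind being speculated about here might look like the following. This is a guess at the general shape and not how CORE or any other aggregator actually works; the function name, the sample records, and the rule (a `dc:format` of `application/pdf`, or a `dc:identifier`/`dc:relation` URL ending in `.pdf`) are all assumptions, and real repositories vary enough that a single rule would miss many cases.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace as used inside oai_dc metadata payloads.
DC = "{http://purl.org/dc/elements/1.1/}"


def looks_like_full_text(record_xml: str) -> bool:
    """Heuristically guess whether an oai_dc record points at a full-text file."""
    root = ET.fromstring(record_xml)
    for el in root.iter():
        text = (el.text or "").strip().lower()
        if el.tag == DC + "format" and text == "application/pdf":
            return True
        if el.tag in (DC + "identifier", DC + "relation"):
            if text.startswith("http") and text.endswith(".pdf"):
                return True
    return False


# Hypothetical sample records; the URLs are placeholders.
SAMPLE_WITH_PDF = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>An article with a file</dc:title>"
    "<dc:identifier>https://repository.example.edu/bitstream/1/paper.pdf</dc:identifier>"
    "</oai_dc:dc>"
)
SAMPLE_METADATA_ONLY = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>A citation with no file</dc:title>"
    "<dc:identifier>https://repository.example.edu/handle/1/2</dc:identifier>"
    "</oai_dc:dc>"
)
```

The second sample is the metadata-only case: nothing in the record itself says whether following the landing-page link leads to a file, which is why a harvester would be left guessing from link text or file extensions.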

Dorothea: Like, you're literally looking for a link that says PDF or something. Justin: Yeah. Dorothea: Wow. All because Herbert Van de Sompel is complete crap at building protocols Dorothea: and things that will be useful and last. All right. I said the name. Jonny: This is obscure beef. Dorothea: Oh, I, you know, Dorothea: Herbert Van de Sompel, when I say serial project abandoner, he is the paradigm example. Dorothea: He totally did that with OAI-PMH. He totally did it with Memento.

Dorothea: There are probably six other projects of his that I could also... Right? Memento. Justin: Remember Memento? Dorothea: Yeah. And I'm just like, funders, stop giving this guy money. It never turns out well. Justin: We got more obscure beef than a wagyu farm heck yeah don't look at me like that. Jay: I'll look at you however I want to. Justin: Alright I was very proud of that. Sadie: It's good. Justin: Well done thank you I think we should wrap up.

Jonny: Yeah yes I've got sleepy bitch disease. Sadie: Did we clarify what the hell's. Jonny: Going on or still cloudy. Sadie: I i think i've got a pretty good gist actually and you know what knowing the Sadie: beef actually helps it it it does so good that's. Dorothea: Like and you know i do teach this stuff sadie you know my email address you Dorothea: can totally ask me questions. Sadie: That's true yeah That's true.

Jonny: And like, like one of the things I have come to love in this world, Jonny: you know, the few things that you can love in it. Jonny: It's just like, every time you get close to something, like you just like realize Jonny: that it's all just people. Jonny: And that's just like all these things that are these immutable features of the world. Jonny: One day you might just come face to face with like, Oh, that was you.

Jonny: And then just be able to be just like, like that just like yeah all of a sudden Jonny: it makes sense where it's like i get why it is that way that just like you know Jonny: you knowing the beef and knowing the people is the way to know the thing yep. Sadie: It all makes sense now. Dorothea: Oh glad to hear it thanks y'all i as always love being on the on the podcast. Justin: Yeah oh thank you so so much for coming on yeah thanks and i'm glad we got to do this.

Jonny: Yep yes good to see you yet again let's let's find time to watch a movie sometime Jonny: soon it's been a while yes. Justin: Oh yeah i need to do more i need to do more movies in the in the discord which Justin: i was about to plug because Dorothea you've also been answering questions in Justin: the discord it's very helpful yes and we appreciate it it's. Jay: Just us shit posting and you being helpful yeah. Dorothea: Well i mean you know that's in reverse the way it usually is.

Dorothea: Everybody else is being helpful, and I'm shit-posting. So, hey! Sadie: Even the score. Justin: Good night.

Transcript source: Provided by creator in RSS feed