
Multi-Model Data Stores with Ted Neward

Jul 20, 2023 · 1 hr 1 min

Episode description

Databases continue to evolve! Carl and Richard talk to Ted Neward about multi-model data stores - which, these days, are most databases! Ted talks about how SQL and NoSQL are not that different - it's only a query engine. But how do you store your data? Today multi-model databases store data with multiple storage engines, and so can store your data in the most appropriate form. There are lots of choices, and it's worth digging deeper into your existing data stores, as well as the new ones available!

Transcript

How'd you like to listen to .NET Rocks with no ads? Easy: become a patron. For just five dollars a month you get access to a private RSS feed where all the shows have no ads. Twenty dollars a month will get you that and a special .NET Rocks patron mug. Sign up now at patreon.dotnetrocks.com. Hey, Carl and Richard here. As you may have heard, NDC is back offering their incredible in-person conferences around the world, and we'd like to tell you about them. NDC Copenhagen is

happening August twenty seventh through the thirty first. Go to ndccopenhagen.com for more information. NDC Porto is happening October sixteenth through the twentieth. The early bird discount for NDC Porto ends July twenty first. Go to ndcporto.com to register, and check out the full lineup of conferences at ndcconferences.com. Hey there, this is Jeff Fritz, the purple blazer guy from Microsoft, letting you in on a little secret about my friend Carl

Franklin. You know, the guy who started .NET Rocks, the first podcast about .NET, in two thousand and two. The guy who's been teaching Blazor on YouTube since twenty twenty. Yeah, that Carl Franklin. Well, Carl's joined up with the folks from Code in a Castle to teach a week-long, hands-on Blazor class at, are you ready to get this? At a castle slash villa in Tuscany. It's sort of a luxury vacation with Blazor learning

built in. Carl's calling it the Blazor Master Class. You'll learn Blazor from the ground up, finishing the week with the ability to build and deploy Blazor applications. Since the training happens for only four hours in the morning over six days, you can bring your significant other, your partner, with you, and you should, right? This part of Italy is absolutely beautiful. There's so much to see and do, and in Larion Marco from Code in a Castle are organizing daily activities

both at the castle and in the area. The castle is in the Maremma, a less touristed region of Tuscany, offering both classic Tuscan hill country as well as easy access to the Etruscan Riviera, with sublime local food, wine, and olive oil around every corner. Breakfast is included every day. There will be two communal dinners at the castle bookending the experience, and most other meals and all activities are included. And did I mention you'll learn Blazor in

person from Carl Franklin? Listen, space is limited, and for very good reason. This is quality training in a beautiful setting. Go to codeinacastle.com/blazor2023, that's B-L-A-Z-O-R two zero two three, to take advantage of this amazing opportunity to join Carl in Tuscany for an unforgettable week of la dolce vita while advancing your programming skills in this important new technology. Hey, guess what, it's .NET Rocks. Welcome back.

I'm Carl Franklin. And I'm Richard Campbell. And our old friend Ted Neward is here. We're going to introduce him in a few minutes. But first, dude, what's up? All sorts of things happening in your neck of the woods, particularly your neighborhood. What happened? Well, yeah, well, you know, we were getting to a weird place with where do you want to live, what do you want to do. You know, what I realized in hindsight is that the old dog passed away about a year ago, right,

and so it's just been the two of us in the house. We went from dog, cat, and two daughters in February of twenty twenty to just us by March of twenty twenty two. And then there was two daughters and a boyfriend. Right. Well, they weren't here for very long. They went off into their own places. Now husband, right. They're all done. But it's like, the two of us in this house. It's a

house built to raise a family in. Yeah, and as much as we love it and we built it like it fits us in a lot of ways, it's a lot of house. And as soon as we started talking in terms of not having this house, immediately we're talking about building another one. So so you didn't think about, you know, taking one of the rooms and making it all stainless steel and putting a villain chair in it, and you know, having multiple monitors. Well, to be clear, we had

a room for restoring looms, like, that's the place we were at. There was a loom room, which, by the way, had its own en suite in case we needed to go to the bathroom. So yeah, we have too much space. The house is amazing. It is a great house, and the buyers are a young family with two teenagers. Like, it's a perfect replacement. We're moving to the coast full time. I suspect we'll build something eventually, but yeah, we're ready to simplify a little and OK with it.

I'm really happy. And then my neighbors are quite annoyed with us, But that's okay because well it's a tight knit neighborhood. As you know, you've been here. I remember the parties and stuff, and I actually attended one

where all the neighbors came over. And that's a really cool thing, you know, because not everybody has neighbors that just get together for, you know, special events like sunsets. Yep. Um, you know, and all through the pandemic we'd all come out on our driveways at six o'clock with a glass of wine. Yeah, you know, that's nice. Toast each other and go back in. Heck, I love this place, and now you can

do that with the otters. Yeah. I would also say there's a couple of my neighbors literally in the cul-de-sac that have places on the coast as well, so it's entirely possible they'll be retiring up there themselves sometime in the future. Not that I'm retiring, or arguably I am retired. This is me goofing off. What is this word you speak of? Retire? Right? I don't even know what to tell you anymore. Right, I don't know what you're talking about. That's nonsense anyway.

But you know you would think by getting rid of a house my project list would get shorter. You would be incorrect, sir. So it's like, yeah, now you got to clean it all up, and there's just not enough automation up on the coast. There's some automation on the coast, but there could be much more automation on the game. All right, well, enough of that fascinating chit chat. Let's get to uh that's laughing. Let's get to better know a framework. Boom awesome. All right, man.

You know, as if I didn't have enough to do, I started a new freaking YouTube show. Are you out of control or what? I'm out of my mind. I don't know. Everybody's like, hey, is your marriage okay? You're, like, spending more time making shows than you are with your wife. And she might tell you that's true, but I don't think so. But anyway, I mean, the thing is you're recording

from your house these days. Yeah, it's not like you're gone. It's, yeah, you're just in a little box full of foam. That's right, and a microphone. Some foam and a microphone, little webcam. But, um, yeah. So the thing is this, and I've got this stuff down to a science. It's all automatic, as much as I, yeah, I know what I'm doing now. When a good topic comes along, such as AI bots, and I have friends that have been into it since the very

beginning and they've done some amazing things, a show happens. So, yeah, it's The AI Bot Show with me and Brian McKay. And Brian McKay is one of our App vNext guys. He's extremely smart and extremely resourceful. Um, yeah. So anyway, it's at aibotshow.com. There

should be at least three episodes up there. An introduction to ChatGPT, and then in the second one we talked about prompt engineering, and we used the GPT API playground to write a little three-act play, basically, or a three-act screenplay. Everything from it helping us find the topic and the characters and the conflicts. And, you know, we basically started with a one-paragraph story premise and then, you know, got from there to the different acts and the beats and the

scenes. And I mean, that's sort of a classic writer's brainstorming technique. Exactly. Yeah. Here you're storming with a piece of software, right? And he basically has this idea where you take the key ideas that you get: you say, give me five ideas, you take the one you want, you add it to the system prompt, which is sort of the backstory, and then it continues generating from there. It's just fascinating to me. Yeah, that's awesome stuff. aibotshow.com. Know it, learn to

love it. Who's talking to us today, Richard Campbell? You're gonna love this. I grabbed a comment off of show ten eighty five, so that's January of twenty fifteen. It is disturbing to me to say twenty fifteen and say eight years ago, but geez. And this is a show we did with David Simons, which I just called Different Databases, right, because it was really, you know, early in a lot of those conversations too about whether it's SQL or NoSQL, like, who cares, and it generated a ton of comments at

the time. I've read many of them, but this one I haven't read. And this is from Thomas Jansen, and Thomas Jansen says: this is close to a talk I did at NDC London (which makes me think we should probably have Thomas on the show one of these days), Polyglot Heaven. And considering we have Ted around today, like, you know,

there's a line he's used before too. Specifically, in the talk he dug into using multiple data stores in a given application based on what's appropriate: what's the shape of the data, what's the sensitivity, and so forth. And in his particular demo for his talk he was using Elasticsearch

and Neo4j. He goes on to say: one of the things that Richard mentioned was to store objects as they were received, but a better approach is to store the events as a result of the action the user was taking. I think this is a little bit of semantics, but it's good semantics. By storing the event, you get a complete audit log of what has happened, but not necessarily the state of the object, and this is a

good thing. The reason it's a good thing is that the state could mean many things in a system depending on how you choose to read the events, but the fact that something has happened can't change. Also, these events you store can trigger other parts of the system, executing long-running processes or processes that just update different read views that could be stored in SQL, Neo4j, Mongo, Elasticsearch, or whatever. And another thing that Carl was

worried about in this particular episode was eventual consistency. And to be clear, every time someone takes an action, they are almost always doing it on data that could be old. The important part, if you're using an event store and event sourcing, is that when you're executing the action, the action will be directed against the event store database, which is consistent. The inconsistent part is that the user or some other system might have triggered an action on old data. That

is something that's really hard to work yourself around. Really love the show. But also kind of a rare occurrence if you do it right. Yeah, you know, and it makes sense to optimize for the majority of cases and then deal with exceptions, and exceptions occur, right? I mean, you think about every single website you ever go to to look at airfares. Yeah, they're all lies. It's just that, only because most people never actually buy the

airfares, so what do you care that it's a lie? And then when you actually go to buy it once in a while, it's gonna go: hey, remember that airfare we showed you? Yeah, we lied. Sorry. Yeah, here's the new airfare. Somebody was faster than you. Yeah. It's just a cache failure, that's all it is. But it's best compensated with a happy dialogue. Yeah, yeah, here's this one instead. Yeah, the one that costs more. Hey, Thomas, thanks so much for your comment, and a

copy of Music to Code By is on its way to you. And if you'd like a copy of Music to Code By, write a comment on the website at dotnetrocks.com or on Facebook. We publish every show there, and if you comment there and we read it on the show, we'll send you a copy of Music to Code By. And definitely follow us on Twitter if you want. But the cool kids, they're on Mastodon. I'm @carlfranklin@techhub.social. And I'm @richcampbell@mastodon.social. And is it

really the cool club? Because I think Threads is the cool club now. Threads sucks. I'm sorry. It's, you know, maybe it has the potential once it uses federation and all that stuff, which it doesn't yet, but the app itself, I hate it. Let me say this: one hundred million people disagree with you. Yeah, that's because they're sheep. It's a very big number. That is like a quarter or maybe even a third of all of Twitter, in a week. Yeah, that's true. Network effect is

important, friend. Well, it's because everybody on Instagram said, okay, I'll try that. Yeah. Well, this is the brilliant part. I mean, listen, Zuckerberg screwed a lot of stuff up for quite a while. This is a pretty slick move. Yeah. So now he just has to fix the app so that it actually has functionality. Well, he's got the network effect. Maybe he can. Actually, he's already spent billions on an app that doesn't work. Maybe he can spend some money on an app that does work.

That's right. Hey, Zuck, call me, I'll fix it for you. It'll be fine. Everything's fine. Yeah, everything's fine. We're all fine here, how are you? Let's get to work. Yeah, all right, let's introduce Ted. Ted Neward, how long has it been since you were on this show? It's a long time. I'd say twenty fifteen? All right, I'm wrong. I think it's going to be even longer than that. Twenty fourteen. He was on the panel that we did at the

NServiceBus conference. Remember that? Oh yeah, boondoggle in Brooklyn. Boy, that was a riot. That was fun, all that panel. The last full show on his own: twenty twelve, ten years ago. Oh my god. So your bio has changed. It's a lot more general. But you've done so much work. I'm just going to read what's on your site here. Ted Neward is an industry professional with twenty plus years

(I put in the plus there) experience. He speaks at conferences all over the world and writes regularly for a variety of publications across the Java, .NET, and other ecosystems. He currently resides in the Pacific Northwest with his wife, two sons, three cats, twelve laptops, seven tablets, nine phones, and a rather large utility bill. I like that one. That's a great one. I actually, since we were speaking of domestic situations, I need

to revise that because one son has moved out. There you go. Yeah, so now it's just one son, three cats, and all the rest of the paraphernalia. Well, I remember meeting you somewhere. You were a teacher, a trainer at DevelopMentor and a consultant there, and when .NET first started, you were intrigued because you were a Java guy. And then C#, it turns out, was pretty damn good. Well, so the interesting story there is, you know, I was at DevelopMentor teaching Java when, you know,

Microsoft, all the .NET stuff was still very, very hush-hush. Nobody knew anything about it. But Don knew about it. Don Box? Don Box, and Chris Sells probably. I think only Don to start, because they kept the, you know, the circle of trust very, very tight back in those days. And Don was part of the SOAP spec guys, he and Dave Winer. They contributed to the SOAP spec, so

they were naturally pulled into the process. Yeah, and so we were down in LA for an instructor retreat, and Don pulled a bunch of us, you know, basically put a piece of paper in front of us and said, sign this or you need to leave the room. And it was an extension of the NDA to allow Don to tell us about this forthcoming .NET thing, which back then, they still hadn't quite figured out the names, so it was like Next Generation services. Yeah, well, the first version was Windows Services.

They, you know, consent decree. Web services. Yeah, oh, they went through them. That was actually one of the t-shirts that we had at one point: all of the different names, COM3, COM+, all of them X'ed out, NGWS, URT, Universal Runtime, X'ed out. Yeah. And so, you know, it's part of the reality that .NET was an unintentional product. Like, the Universal Runtime was addressing a Windows problem and had nothing to do with needing an alternative to Java and had nothing to do with, wow,

our web tech sucks. Like, the fact that they all came together is kind of miraculous, although there's very much a Conway's Law effect, in that there were many teams involved, and a lot of the architectural decisions are a bit odd because the team with the idea had to be the team to implement it, even though it was in the wrong place. Anyway, that was your introduction to .NET, and, uh, we

interviewed you early, early on in .NET Rocks. Oh yeah, Mark Dunn. Yeah, I think I've actually done a .NET Rocks with all of your co-hosts. I did one with Mark, I did one with Rory, uh, and Richard. I've done several with Richard. That's just pretty special. Yeah, I'm kind of surprised you've done several with me. Uh, well, I just like you, Richard. I don't know if that's true either.

Man. There are very few people, very, very few people in the world that I invited to my fortieth birthday party, and only one that I asked to MC the roast. There's that. Yeah, that's mostly because, I don't think, that's almost kind of a punishment, actually. Although I didn't spend time with your sister, and your sister is awesome. Well, but here's the thing that I know about you, Ted, and, you know, your bio doesn't really explain it. You are like the polyglot.

You know all of these languages on top of the JVM. You know a lot of languages on top of .NET, some that aren't even on .NET. You know a lot of different languages, a lot of different technologies. You were sort of, you were the anti-ORM guy back when you and

Oren Eini had that smackdown show in Canada. Yeah, of course. But I mean, you were the guy who's like, stored procedures, stored procedures, and Oren's like, no. And then, you know, that debate raged on for a while, and now you're back here to talk about Couchbase. Well, so part of the thing is I've always looked at technologies, and particularly new stuff that's coming down the pipe. But more importantly, I think the way to describe me, the polite way is to say philosopher,

the more accurate way is to say contrarian. So back during, you know, the two thousands, particularly in the Java space, everybody was going hog wild over Enterprise JavaBeans and object-relational mapping tools, and I was trying to point out that there are significant costs to using these things, and to say don't,

don't just take it by default. And so, yeah, that's when I penned the, you know, "object-relational mapping is the Vietnam of computer science," for which I am, that is still the thing I am most widely known for. That actually earned me an entry in Wikipedia. But that was a point in time, like you're pointing out, right? Things change and things evolve, and some things that weren't working so well back then got fixed, and,

you know, new technologies came out around it. Well, part of the thing is we have to be a little bit careful because yes, we made changes, but there are also certain properties of the universe that are immutable, such as the speed of light. Right. The fallacies of distributed computing are always going to be there because as soon as you try to go from one machine to another, it gets very very expensive in time and energy and bandwidth

and so forth, compared to within the machine. I mean, exactly, exactly. Often we're only slicy seconds here, but you're going from nanoseconds to milliseconds in a lot of cases, and as soon as you change orders of magnitude like that, you know. And the big concern was that with a lot of the ORMs at the time, you know, lazy loading was the word of the day, and so it'd be like, oh, well, let's

go fetch one property at a time. So you're making all these trips back and forth between these two nodes and the network and suddenly wondering why your system is slow. This holds, by the way, whether we're talking about, you know, databases, whether we're talking about web services, whether we're talking about, you know, grid computing, if you remember that phrase from way back when, or, you know, cloud. It's all just the same stuff. Bingo,

right, these things still hold. And so, you know, part of what I've always tried to do is to look at things and say, all right, yes, there is a mainstream view of things, but, you know, let's get out of the mainstream for a second, because it turns out the eighty twenty rule still holds. Eighty percent of what anybody does can probably fit in the mainstream. But

then there's the twenty percent around the outside edges that doesn't fit. And so this is where, you know, particularly as I started looking into databases, of course, I was looking at all the NoSQL at the time, right, and there are tremendously useful things about a NoSQL database. You know, you

look at MongoDB for a second, the document shape of things. Before we get into that, I want to go back to a point where you were talking about how certain, you know, constants like the speed of light and being chatty and all that. And I think that IAsyncEnumerable changed a lot of things, don't you? When we could do streaming either locally or over the internet, that did. I mean, you know, it's sort of a nice answer to lazy loading when you're talking about data, that

it changes things. I'm not necessarily going to go so far as to say it, you know, solved all problems, but certainly, you know, it changes where things occur and how things occur. Right. Yeah, you know, a lot of this is borne out of the whole reactive programming space, and reactive, you know, in many respects says we're still going to pay the cost of time in terms of data moving over the network, but we're

going to pay it differently. Your users aren't going to notice it. Let's put it that way, in a perfect world, ideally, right? But I mean, you know, an interesting experiment that I often do, because one of the things I've done since we last talked is I actually am a guest lecturer at the University of Washington, so I'm actually talking to undergrads on a regular basis, and spring quarter I was teaching distributed systems. And so one of the things I was pointing out is, you know, yes, there's the

cost of sending all this stuff across the network. But let's do an interesting experiment. Bring up the Amazon.com home page. But don't look at the actual center of the browser. Look at the bottom, look at the status bar, right, because it flickers as it loads various images and so forth. And let's count how long it takes to actually fully load the Amazon.com home page. And it's measured in seconds, and it's not single-digit seconds.

But what they do is they load the first, you know, first page, if you will. They load the part that you can see very quickly, yes, and then they go fill in all this other stuff asynchronously. Right. By the time you scroll down, bingo, you'll see it. And you would think that it's been there all along, when in fact it just got there. Right, right. And in many respects, it's a

brilliant use of asynchronous, you know, loading, to say most people aren't ever going to go below the fold anyway, so let's load all that stuff after the fact. But let's make sure that above the fold, the first, I don't know, eight hundred thousand something pixels, that loads immediately, right,

that loads as fast as we can possibly get it. And there are so many ways that you can approach the design of a system, the design of a page, of your user interaction, whatever you want to call it, to, you know, reflect that. Right. That is clearly somebody who sat down and looked very, very much not just at the numbers in the system, but at the way people were interacting with the system. All right, good talk. Next. Now, you were talking. You started talking about

MongoDB and all that stuff, and I interrupted you. I apologize. I mean, the big thing was, you know, when NoSQL came around, right, originally the idea was to release us from some of the things we were doing with respect to transactions and to be able to get to some of that eventual consistency that you were talking about earlier. But along the way, you know, somebody said, okay, we're going to break away from the relational database. Here's an opportunity. Let's really break away from

the relational database. And so they started changing up the data. You know,

the data shapes, the data models, right. So now instead of it always being relations and relvars, let's look at documents, let's look at graphs, let's look at, and what I'm seeing now personally, looking at some of these different databases, and this is kind of where I got into conversations with Couchbase, and I'm working with them to kind of, you know, introduce them into more of the university setting as part of my stuff at UW.

You know, different databases are now actually allowing us to do a certain amount of combination of various things. I mean, before it was: you got relations and relvars, you had transactional, and you didn't even see the storage engine. That was just, you know, a baked-in, opaque property of the database.

Today we are seeing databases being able to plug in the storage engine, you know, so, like, MySQL will let you choose which storage engine to use. We can decide whether we want to run in the cloud or on prem. We can, you know, we can make all of these interesting decisions either

between products or within products. But more and more of them are starting to go the whole multi-model route, right, which is one of the reasons why Couchbase is interesting, because early NoSQL said, yeah, okay, we'll give you a document model, but you're going to have to write a lot of code to query it, right? You've got to do this query-by-example stuff. And, you know, Couchbase (I have to be careful, because there's Couchbase and there's CouchDB, and they are

very, very different things), okay, the Couchbase guys said, no, no, no, we can give you your SQL, or a SQL-like dialect. We can give that to you. N1QL. Yeah, exactly. And you can go, yeah, go get it, go query to your heart's content. And, you know, that in many ways addresses

some of the concerns people have. You know, Couchbase supports, you know, some of the traditional relational view of things, as well as, you know, they've got some degree of graph. I don't know that I would go so far as to call it a full graph database. There's also a LINQ to Couchbase tool out there, so you could use your good old familiar LINQ

syntax, which is very SQL-like and very functional. Yeah, yeah, you know, so, I mean, to me this is interesting simply because it opens up a whole lot more in the way of opportunities to choose things. You know, twenty five years ago, if we were talking, the only choice you had with respect to where you were going to store your data was which relational database were you going to use, right?
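The contrast drawn a moment ago, between early document stores where you "write a lot of code to query it" and a declarative SQL-like dialect, can be pictured with a small sketch. This is only an illustration of the query-by-example idea in plain Python, not Couchbase's or MongoDB's actual API; the sample documents, field names, and the helper names `matches` and `find_by_example` are all made up for the example.

```python
# Hypothetical sample documents; the fields are invented for illustration only.
DOCS = [
    {"type": "airline", "name": "Example Air", "country": "US"},
    {"type": "airline", "name": "Sample Wings", "country": "CA"},
    {"type": "airport", "name": "Demo Field", "country": "US"},
]


def matches(doc, example):
    """Return True when every key in `example` exists in `doc` with an equal value."""
    return all(key in doc and doc[key] == value for key, value in example.items())


def find_by_example(docs, example):
    """Query by example: keep the documents that look like the partial `example` document."""
    return [doc for doc in docs if matches(doc, example)]


# The caller hands over a partial document instead of writing a query string.
us_airlines = find_by_example(DOCS, {"type": "airline", "country": "US"})
print([doc["name"] for doc in us_airlines])
```

A SQL-like dialect moves the same filter into a single declarative statement (a `SELECT` with a `WHERE` clause over the document fields) and leaves it to the database, rather than application code, to decide how to evaluate it.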

Was it SQL Server? Was it Oracle? Or maybe, if you were really heretical, you chose one of the open source ones, right, MySQL or Postgres, right? Today... Oh, you didn't even say DB2. I didn't. What's wrong with dBase? Oh well, you know, IBM. IBM just decided that they were going to charge for Red Hat Enterprise Linux. So now they are dead to me. That's what's

going on there. Um, you know, I mean, the options available, and frankly, that's very intimidating to a lot of people, right? There are a significant number of developers and architects out there who find that choice intimidating, because it's like, what if I choose the wrong one? Yeah, because we all know how changing the database sucks. Like, that's a brutal prospect. I don't know how often they get to even choose. We're

paying for SQL Server licenses. You will store your data in SQL Server. There is that too. Yeah, yeah, there is that. Although, the joys of microservices, right? Now persistence is all, you know, an internal detail, right? It's encapsulated away because it's all part of my microservice. So, you know. Yeah, well, that's for those that are doing microservices. There's

a lot of us out there that aren't. Well, and I would also argue it's one of the reasons that the open source database took off, because the license hammer is defeated by "but it's free." Right, right, right. I can't argue with that. Yeah. So, you know, I mean, in the case of Couchbase, you know, I like some of the decisions they've made. Um, you know, I like some of the people that I've talked to. There are a lot of other, you know, viable

tools to use. And as a matter of fact, one of the things: spring quarter I was also teaching a data and databases course to the kids at UW, and teaching them relational databases, I used SQLite because, well, it's free, and frankly, it's local, and so suddenly it opens up a whole lot of opportunities to avoid making those round trips. And I saw a great little meme that said, every time somebody says this meeting could have been an email, there

is somebody else saying this database could have been a SQLite. Yeah. Right. Well, you know, if your database has one user, and, you know, so many databases have exactly one user: you're holding it in your hand. The phone, the phone. Yeah, well, you're holding it. Well, I was holding it, yes, but, you know, I'll pick mine up too, just to be, there you go. I'm just running

down sort of the features of Couchbase. Obviously, it's document-oriented. Distributed, so there's a distributed architecture, so it can scale horizontally, right, depending on where you're running it and on what hardware, or if it's a cloud thing; all the cloud providers have services for it. In-memory optimization.

Right. So you could think of it like, oh, load a JSON file into a dictionary, there's your database, right? And then there's this N1QL ("nickel") query language, which is kind of like SQL but designed for JSON data. Synchronization: Couchbase Mobile, so a mobile database, like you were talking about with SQLite, but this is Couchbase Lite, and a synchronization gateway so that you can do that. And then

of course it's got integrations and SDKs and all sorts of stuff. And I also mentioned before this LINQ to Couchbase, which is good for .NET developers if you're used to LINQ, right? So yeah, it's pretty

cool. The other thing I know that the Couchbase folks are talking a great deal about and, you know, offering is Capella, which is their cloud-hosted flavor. I mean, you can certainly get Couchbase to run in your favorite AWS, Azure, or whatever, but, you know, what is the practical difference between you figuring out which version to install and just letting Couchbase deal with it in their cloud, right? Well, and just that option to just

go, hey, I don't have to decide now. We're just running this on the local rig while we're figuring stuff out. But when it comes to scaling, it's like, do I really want to build out this infrastructure? Can I just shift to the services model? And it's a database as a service. Pretty awesome. And gentlemen, and I'm gonna use that term very loosely today, I need to interrupt for one moment for this very important message,

and we're back. It's .NET Rocks. I'm Richard Campbell. That's Carl Franklin, and our old friend Ted Neward, who... Careful with that old. Careful with that emphasis on the word old. His first show was fifty-nine, which I think is actually Rory at that point, but you know, he's pretty close in that ballpark. And we're talking a little bit about the modern data landscape. I think more than anything, just that, you know,

different ways to store. I really like this multi-model mindset, just that: what engine would you like to use? What form would you like to store? I mean, even SQL Server's headed down this path. Yeah, you know, now that you say it, it's like, wow, that's really what's happening with data stores these days. It's like, well, what do you got? Let me hold it for you, let

me index it for you. And this is where, you know, some knowledge of database internals will help, right? Because one of the things that we saw in terms of... well, let me back up for a second, because earlier you talked about the comment from Thomas, and I can't remember his last name, sorry, and he was talking about polyglot persistence and storing data in different databases. One of the drawbacks to that approach is the fact that in many cases,

trying to do so atomically across these different databases was tricky. And so if I were storing something in my document that also needed to modify something in the graph, okay, in one case I'm doing it to a CouchDB, in the other case I'm doing it to a Neo4j, and making sure that both of those got updated or neither did was always a little problematic, because you're going across databases, and we didn't in the early days

really have distributed transactions across different NoSQL databases. When the database vendors started to embrace this multi-model thing, now it became possible to do that atomically, because it's all inside of one transaction manager, the one database, and still have some of your multi-model. But you have to be a little bit careful, because how they store things internally is going to really affect some of the performance. Right. Like, for example, there's no date or

time types in Couchbase, right? Right. Oops. I just think you quietly walked past the real statement, which is: life is too short for distributed transactions. Well, there is that, there is that. Once upon a time, you know, I crushed my soul trying to make DTC work, truly tried to make it work. Oh, we all did. We all did, because truthfully, you know, that was what all of us believed and taught everybody else. And what was the, I mean, the canonical example

for transactions? Banking, right? Yeah, I'm going to make a deposit to you. So we're going to take out a lock and we're going to make sure that we remove the money from my account and deposit it to your account. And the great lie behind all of that was that's not actually how banking works. It never has. It's never been a thing. Exactly, exactly. Oh, and by the way, if you're using JavaScript for those floating-point data types, you

got another problem. Well, there's gonna be some funny-looking pennies showing up in real time. Now, there are those problems that we inherit, and there are those problems we make for ourselves. And if you're using floating point in any language, in any language, for money, we have a... yeah, seriously. Experienced tip: money is not floating point. Pro tip for all you kids, right? Don't count dollars with fractions of a dollar. Count

pennies. It's all pennies, right? One dollar is one hundred money units, you know. Yeah, I mean, going past the whole transactions thing, distributed transactions thing, right, especially because when we needed to scale: contention is the enemy of scale, right? You know, just repeat that to

yourself over and over and over again. And you know, the multi-model databases allow us to, you know, work with all these different shapes of data, but deep down, you know, obviously they have to store data somehow, and this is why a lot of the early multi-models were built on top of key-value stores. Right: here's a primary key, and the rest of it is a binary blob, and you can interpret that however you want. But every time you fetched that, it had to be re-pounded into shape

before it gave it back to you. Now, some of the databases seem to be experimenting with more in the way of, you know, actually storing it differently, with indexes in weird places. So it's not just one storage engine. They're actually using multiple storage engines depending upon what you're storing underneath there. And then a unified indexing strategy, right? Right, right, right,

right. So it's really just a performance thing. Just: hey, let's take the shape of the data and store it in the closest thing to its natural shape. I think that's where you get the best performance, generally speaking, right? Because the customer doesn't care. They just expect us to store the data and be able to retrieve it, and sooner would be better. And, right, cheapest possible, please, thank you very much. Yeah, cheap is

also handy. Yeah, yeah. Well, and I think, you know, all of this is still relative, right? Because so much of the performance of your query could be drowned out by the length of time it takes that web request to arrive at the page and then get back to the browser. So again, go back to what

we talked about earlier, the Amazon story. Right, In many cases, let me just fetch an initial set of data, get it back to you, and then we can go get more later if you need it, right, yeah, and do so asynchronously in that sort of reactive model that Carl

talked about. But part of you know, part of what any system designer, part of what any architect needs to do is figure out where we need to be really responsive and where we can do that asynchronous loading, and in many cases with databases, we can do some of that by saying, let me query what I'm looking for, but then let me also pull back some

additional data, but pull that back asynchronously or what have you. And this applies not just to databases, obviously: web services and JSON services and blah blah blah. You know, do we still love JSON in twenty twenty-three? It's just a data format. I mean, I never fell out of love with XML. I never fell out of love with CSV. I love hard. We have issues, man. You're just a love-them-and-leave-them kind of guy to me. Yeah,

you know, but JSON, I never had a problem, ever. Okay, no, I have had problems with JSON. I have had problems where by default a serializer or a deserializer changes the case of properties to camel case when that's not the case, that's not what they are, and therefore it fails. Yeah, I have a problem with that. Well, see, here's the fun thing, right? Because XML could actually account for that for you, because XML has something that JSON doesn't, which is the opportunity to

capture metadata directly in the format. Right. And I find it absolutely telling that, you know what people are working on right now? Schema for JSON. Oh god, no. Wait, stop, stop. How many times have we seen this? This is the path to hell. You know what would be even more fun? How about schema for XML? Could we do that? That would be really fun. Let's do that, hey man,

Schema for XML. You know, it was incredibly... I'm not saying schema for XML didn't buy me a car, because it did. Right, right. But it doesn't mean it was ever a good idea. Right. It's self-defining. But here's the thing, right? First of all, the idea of schema itself is not bad, because it allows machines to be able to do a bunch of the work we would normally have to be verifying. Yeah, right, exactly: validation.
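
To make the validation point concrete, here is a minimal Python sketch of a machine checking a document against a declared schema. The schema layout and the `validate` helper are hypothetical, made up for illustration; this is not JSON Schema, XML Schema, or any real library.

```python
# Hypothetical mini-schema: field name -> (expected type, required?).
# Illustrative only; not JSON Schema, XML Schema, or any real library.
BLOG_ENTRY_SCHEMA = {
    "title": (str, True),
    "author": (str, True),
    "tags": (list, False),
}


def validate(doc, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if field not in doc:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(doc[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


good = {"title": "Multi-model", "author": "Ted", "tags": ["data"]}
bad = {"title": 42}

print(validate(good, BLOG_ENTRY_SCHEMA))
print(validate(bad, BLOG_ENTRY_SCHEMA))
```

The schema here is plain data that lives outside the documents it describes, so the checking is done by the machine instead of by hand-written verification code in every consumer.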

And where the XML community, the committee, I think got XML Schema wrong is they tried to write XML Schema in XML. To be fair, it's because they wanted to do everything in XML all the time. We just kind of started, you know, believing our own stuff. Now I need a schema for my schema. Well, and you know who else did that is the SQL committee, right? Because they wanted schema defined within the SQL language itself. So they just created a whole bunch of new keywords

and so forth to do it. You know, so this notion of trying to get kind of this linguistic closure, that we can define the language inside the language. I see what you did there with that language reference there. I saw that. I got you. I'm onto you, man. You know, it's an interesting idea, and I know a lot of language designers find it to be really interesting and elegant. I don't know how much that elegance actually

pays off in the long run. Yeah, you know, but at the end of the day, they still go after it. And you know, JSON-LD, I think it is, is the schema for JSON. I think they're trying to do it in JSON at the same time. And it's like, you know, maybe it's not bad to just have a different definition, but please, please, please, please, please don't throw YAML at this particular problem, because that's just... Oh God, I was waiting for you to bring

up that awesome, awesome language I cannot stand. YAML loves you. But these are all just representations of data, right? And you know, I really want us to stop, like, you know, reinventing new data formats to do the same thing and then arguing about it. That's what I'd really love us to do. Can't we all just get along? We could have cured cancer by now if we had taken all this energy around data formats and just put it towards, you know, identifying the human genome. Anyway, I'll

get off my soapbox. Okay, yeah. So, this Couchbase cloud database as a service, Capella: I think it's available on all the cloud platforms, as well as the, you know, Couchbase cloud thing, isn't it? I don't know officially. I do not work for Couchbase. They do not, you know, give me benefits or anything like that. So I'm not

a salesperson. I don't know the full details, and I don't have their website up in front of me. I would believe it if you told me, though, because I know they've been pretty comprehensive. Yeah. Okay, so here's a story on couchbase.com from January twenty twenty-one: Couchbase Cloud now available on Microsoft Azure. Yeah, okay. So I think part of the idea of Capella is you are treating Couchbase as your cloud, but you could probably specify which cloud provider you're in,

but don't quote me on that. I don't know that for sure. Yeah, it looks that way. So: run Couchbase multi-cloud across AWS, Azure, and GCP, so you could do that. There you go. And I think, by what you've said and what I remember from conversing with Matthew Groves, who is one of Couchbase's more public figures, I believe part of what they're trying to do there is automatically incorporate some of the multi-cloud

strategy that people often talk about for greater reliability. Right. So yeah, I think so too. Yeah. I always question that one, because the cloud's out-relying all of our data centers. Like, come on, it's a pretty big overhead to take on. But, I mean, to me, having the checkbox there at least is a good idea. Whether or not you actually want to pay for the implementation is another question entirely. Well, it kind of

goes back to two things, right? Number one, the reliability argument, which, given how reliable all three of the big cloud players are already, I'm not sure if it's worth it. But the other is vendor neutrality, right? So that if AWS decides to pull an Oracle and bump their prices up by, you know, ten thousand percent, you can say, yeah, okay, we'll just flip over here to Azure, flip the switch and we're there.

Yeah. But again, like you said, you know, you pay a price for that, right? And vendor neutrality is frequently... we got the Java folks in trouble when we tried to be vendor neutral, right? I mean, that was part of the whole Oren Eini discussion way back when: stored procedures were intrinsically tied to a particular database, so you automatically weren't vendor neutral. So therefore stay away, said the Java community. But

also, I mean, he wanted to have SQL in his app. He wanted to generate that SQL dynamically, you know, in the ORM, because that's how they work. Yeah, and you know, if he has to go through stored procedures, that adds a level of complexity and blah blah blah. Yeah, and it violates encapsulation, right? You know, just to go back and put my little whipped cream and a cherry on that, because if you... well, I mean, I still remember back in nineteen ninety-seven here

from another planet. You know, I was good till the whipped cream showed up. Everything was... Come on. You are not so naive as that, Richard Campbell. You know, have you ever been sitting in your cubicle and had somebody call you up and start yelling at you because you broke their mission-critical app? And you responded with, I'm sorry, who

are you? And they're a VP from like four organizations over, because your boss's boss's boss decided to give away the schema to your database, and then you changed it because you needed to. It's encapsulation. My line was always: I'm sorry, who are you? And why shouldn't I tell you to get stuffed? Yeah, that's why I don't make it in a cubicle, on that second part of it. Yes, well, yeah, there are very few people in the world who are less tactful than me,

Richard Campbell. I'm not saying you're one of them. I'm often tactful. But you know... All right, so what the hell are we talking about with the whipped cream and the cherry you're gonna put on top of that? I was just gonna say that whenever you have a relational database, your schema is what you seek to encapsulate. Right. That was part of the whole reason for stored procs: they act as an encapsulation layer.

And you know, if you don't care to encapsulate your relational database, that's fine. Just live with the consequences of VPs calling you up and, you know, you resisting the urge to tell them to... how did Richard put it? Get stuffed. That's right, get stuffed. That, by the way, is the polite version for a podcast. Yes, there you go. There are several other versions I'm sure the listener can contemplate. Exactly. So what if you're new to the whole... I can't imagine you're new to it, but what if you've never

used a document database before? What are sort of the reasons that these things have become, you know, really popular? So part of the thing is the document model, the document data model, is that it frequently lines up with some of the things that we're doing. So, for example, the canonical example for a document store is a blog, right? You're maintaining a

weblog. Each entry is a document in the blog, and there's a bunch of information around that particular entry: who wrote it, when they wrote it, you know, the tags that are for that particular entry, etc. But once I've found the blog entry, I want everything about the blog entry, and so I can fetch that back as a singular document, right? And there are a lot of cases where, when I'm looking for a thing, I'm looking for that one thing. So, frequently, a customer's order history.
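
A minimal sketch of the blog example in Python, assuming a toy in-memory store: the `store` dictionary, the `put` and `get` helpers, and the sample entry are all hypothetical and stand in for whatever document database is actually used. The point is only that one lookup by key returns everything about the entry.

```python
import json

# Hypothetical in-memory "document store": one JSON document per blog entry.
# A real document database would persist and index these; this is only a sketch.
store = {}


def put(doc_id, doc):
    """Store the whole entry as a single serialized document."""
    store[doc_id] = json.dumps(doc)


def get(doc_id):
    """One lookup by key returns everything about the entry."""
    return json.loads(store[doc_id])


put("entry-1", {
    "title": "Hello, documents",   # sample values, made up for illustration
    "author": "someone",
    "written": "2023-07-20",
    "tags": ["databases", "documents"],
    "body": "Each entry is a document.",
})

entry = get("entry-1")
print(entry["author"], entry["tags"])
```

In a normalized relational design the same read would typically touch an entries table, an authors table, and a tags join table; here the author, date, and tags travel with the entry.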

Right. When I'm looking at the order history, I'm often not looking across the entirety of all customers to figure out which customers have a history of ordering toilet paper. I'm looking at this particular customer's order history, because I want to, you know, look at their list of purchases because they want to dispute one, or something similar. And so if sort of the natural data model that you're working with is one that has

kind of a central piece to it with some ancillary data. Yeah, a document data model would work. If you've ever built a star schema in a relational database. I would argue that you are actually building a document model in a relational database, and in the end, as a developer, you get back a list of objects which are essentially documents in and of themselves, right,

I mean, it's just another way to think of it. But you know, we're using JSON all the time, right, to persist and serialize and deserialize, and so, you know, actually taking something in JSON and stuffing it into a relational database in twenty twenty-three seems kind of strange if you think about it. It takes time. Yeah, right. It's not just the aesthetics of it. It's the fact that you have to do,

like you were talking about earlier, I have to serialize that. I have to take this object model, which could be cyclical, by the way, which JSON does not deal with well, any more than XML did. I have to take this object model, pound it into the document model of JSON, and then ship that. When it comes back, I have to somehow rehydrate it back to my object model, when it would, you know, in many cases make a lot more sense to take this and store it au naturel. It's one of the reasons why

I loved object databases back when they were popular. Yeah, because writing code in Java or C#, my schema is my class. We're done here. Right, right. And so it saves you time and energy and effort, both at runtime as well as during development, if you can store the thing in its natural state. So I remember talking to Oren Eini back, and Richard probably does too, about RavenDB and all the questions that I had. Like, you know,

one of the benefits of SQL Server is indexes. You can index fields that you're going to do queries on, and then they create these separate databases or separate tables called indexes, so that they can be looked up quickly. And you know, I quickly found out how useful an index is when I did not index a date field in the old .NET Rocks, you know, thing,

and then I wanted to do a query. And this was like on requests. Like, I had web requests in a database, right, with a date, and I wanted to do a query to see how many, you know, hits we got, how many downloads we got, between this date and that date. And it was like, you know, sitting there waiting and waiting and waiting. So I asked Oren, you know, how

does the document database deal with that? And I don't know if Couchbase and Mongo do this, but RavenDB basically looks at what you're doing and what you're querying and builds indexes on a low-priority background thread. Do they all do that now? So different databases will do different things, right? And a lot of this is up to the database designer. And as a counterexample, I will give you CouchDB,

which again is different from Couchbase. World's worst naming decision, by the way, Couchbase, yeah, you know, naming yourself after something else. But in CouchDB, what we would think about in terms

of indexes and so forth, they call views, and they build them ahead of time, and every time you insert or remove data, they modify the view. And so there, they are choosing to optimize for reads, as opposed to doing that lookup later, you know, because it takes time to do that when you write something into the database, obviously, right? So it really is one of those things that the database designers can do, you know,

based on the decisions they want to make. Right. So, in the case of RavenDB, if they do that indexing on a low-priority thread, their assumption is that you're not going to immediately query along these indices once you've put something in, or they're willing to take the performance hit, you

know, while we're doing this. Other databases, you know, particularly a lot of the relational databases, because they wanted to maintain those ACID-level semantics, they would do that indexing during the query and not return to you until all the indexes were updated and everything was correct. And I'm like, well, it depends on what you want. It kind of defeats the purpose of having an index, right? Oh, the first time you actually query

with these fields, yeah, you're gonna have to wait on that. We'll be back to you in a few minutes. Thanks. Well, and that's why the RDBMS said, we'll update them all while you're putting the data in. We'll make sure that the index is always ready to go. So you come back and issue a query right after doing that insert, update, delete, we're ready for you. We're good to go. You just paid for it, though,

while you were waiting for that initial data modification to okay. Different databases will make different decisions, and that's, you know... just like there are certain things that are fun from a language designer's perspective, this is part of the fun for a database designer: to say, I'm going to choose, you know, or in some cases, I'm going to make that slider available to you, the DBA, to be able to decide where we want the trade-off.
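
The trade-off described above can be sketched in a few lines of Python. `TinyStore` and its `eager` flag are hypothetical, and the lazy path here simply catches up at query time instead of using a background thread, so it is a simplification of what any real database does. It only shows who pays for the index: the writer or the first reader.

```python
from collections import defaultdict


class TinyStore:
    """Toy store with a date index; `eager` is the made-up 'slider'."""

    def __init__(self, eager):
        self.eager = eager
        self.rows = []
        self.index = defaultdict(list)  # date -> positions in self.rows
        self.indexed_upto = 0           # how many rows the index covers

    def _catch_up(self):
        # Index every row written since the index was last brought up to date.
        for pos in range(self.indexed_upto, len(self.rows)):
            self.index[self.rows[pos]["date"]].append(pos)
        self.indexed_upto = len(self.rows)

    def insert(self, row):
        self.rows.append(row)
        if self.eager:
            self._catch_up()  # the writer pays: index is current on return

    def find_by_date(self, date):
        if self.indexed_upto < len(self.rows):
            self._catch_up()  # the first reader pays for deferred indexing
        return [self.rows[pos] for pos in self.index.get(date, [])]


eager_store = TinyStore(eager=True)
lazy_store = TinyStore(eager=False)
for s in (eager_store, lazy_store):
    s.insert({"date": "2023-07-19", "hits": 10})
    s.insert({"date": "2023-07-20", "hits": 25})

print(eager_store.indexed_upto, lazy_store.indexed_upto)
print(lazy_store.find_by_date("2023-07-20"))
```

Both stores return the same answers; they differ only in when the indexing work happens, which is the decision a database designer either makes for you or exposes as a setting.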

So you know, obviously, a lot about these different things. Is there anything in Couchbase that you wish they would have done differently or better or other? You know, I have become enamored of graph databases in general. I think that the graph model is one that is actually probably the hardest to replicate in other data models. And how's that different from the document? Instead of pointing from one document to another, you duplicate it and have a tree

for every record or what? So a graph database is something that will support cyclical references very easily. So the canonical example here is, let's say that we want to represent, you know, the six... the three of us and our respective spouses in a document. And so we have person, right? I'll use XML. We have an XML tag called

person: first name Carl, last name Franklin. Now, you're married, so we'll have a spouse field there, which has, you know, first name, last name, spouse, which has you: first name, last name, spouse, which has her. And suddenly we get into this deep infinite recursion. Right. So we can't really represent persons the way we think of ourselves, which is: I have a spouse that is a property of me, the person. Whereas in a

graph database, fundamentally the data structure is nodes and arcs. So we have a node, first name, you know, Richard, last name Campbell, and then we have an arc to another node, first name Stacy, last name Holt, and that arc can in fact have data associated with it, such as beginning of the marriage, you know, and if they get divorced, end of the marriage. So the arc is sort of like a relationship, right? It's a connection between these two nodes. Right, got it.
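
Here is a small Python sketch of both halves of that argument, using placeholder people and made-up years. The nested version builds the "spouse is a property of me" shape and shows that Python's standard `json` module refuses to serialize the cycle; the graph version keeps flat nodes plus arcs that carry their own data. The `nodes`, `arcs`, and `spouses_of` names are illustrative assumptions, not any graph database's API.

```python
import json

# 1) Nested-document attempt: each person holds their spouse as a property.
person_a = {"first": "Person", "last": "A"}
person_b = {"first": "Person", "last": "B", "spouse": person_a}
person_a["spouse"] = person_b  # the cycle: A -> B -> A -> ...

try:
    json.dumps(person_a)
    nested_ok = True
except ValueError:  # json raises ValueError on circular references
    nested_ok = False

# 2) Graph shape: flat nodes, plus arcs that carry their own data.
nodes = {
    1: {"first": "Person", "last": "A"},
    2: {"first": "Person", "last": "B"},
}
arcs = [
    # begin/end years are placeholders, made up for illustration
    {"from": 1, "to": 2, "type": "married_to", "begin": 2001, "end": None},
]


def spouses_of(node_id):
    """Follow married_to arcs in either direction from the given node."""
    found = []
    for arc in arcs:
        if arc["type"] != "married_to":
            continue
        if arc["from"] == node_id:
            found.append(arc["to"])
        elif arc["to"] == node_id:
            found.append(arc["from"])
    return found


print(nested_ok)
print(spouses_of(1), spouses_of(2))
print(json.dumps({"nodes": nodes, "arcs": arcs}))  # no cycle, serializes fine
```

Because the relationship lives on the arc and not inside either person, a divorce is an update to `end`, and a remarriage is simply another arc.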

And so if you've ever tried to model genealogy data, if you ever looked at a, you know, genealogy database, particularly because, you know, in some cases, not only can two people have been married and then got divorced, but they could get married again, right? And so those would just be two separate arcs, or one arc with a begin and end, begin and end. It's like going to the fridge and saying, oh, this milk is spoiled. Well, maybe tomorrow it'll be fresh, right? Yeah, exactly. We'll leave

people's, you know, life choices off to the side. We still need to be able to model it. And so graph databases do that. They handle those cyclical relationships very, very well, much better than JSON. And this is funny, because JSON and XML are very similar in the sense that they are both kind of hierarchical and must form a tree. Yeah, right, exactly, exactly. So that's the one thing: I wish Couchbase had, you know, I wish they had a little bit better in the way of graph

support. But that is one of those things that's very very tricky to store, and particularly to do so distributed. And so you know, there are certain points at which you say, if I want to be everything to everybody, I'm probably going to be not very good to anybody. I'm trying to

think of how I would approach that relationship thing with spouses. You would probably have like a groups, you know, table with person IDs basically, right? So you'd have a group, and then this group name is spouse or whatever, spouses, and then you'd have the people that are... because now you get into, right, what if you're polyamorous? What if you have two spouses, which

is legal in some states. Yeah, I guess since we were talking about data, sooner or later the many-to-many conversation was going to come up. Well, you know, yeah, I mean, there have been ways to be able to model graphs in, for example, a relational database model, and Joe Celko has a book that is about as thick as your head, and that should tell you how to do it. Right, exactly. You'll be reading this for a while. Yes, exactly, exactly. Well, friends... Yeah, I guess that's a conversation.
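
One common way to model that many-to-many relationship relationally is a separate table of relationship rows, sketched below with Python's built-in `sqlite3` module. The table names, columns, people, and dates are all made up for illustration, and this is not taken from the book mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per marriage: the relational stand-in for an arc with data on it.
CREATE TABLE marriage (
    person_a INTEGER NOT NULL REFERENCES person(id),
    person_b INTEGER NOT NULL REFERENCES person(id),
    began    TEXT NOT NULL,
    ended    TEXT            -- NULL while the marriage is current
);
""")

conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Person A"), (2, "Person B"), (3, "Person C")],
)
conn.executemany(
    "INSERT INTO marriage (person_a, person_b, began, ended) VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2001-01-01", "2005-01-01"),  # married, then divorced
        (1, 2, "2010-01-01", None),          # married again: a second row
        (1, 3, "2012-01-01", None),          # more than one current spouse
    ],
)


def current_spouses(person_id):
    """Names of everyone currently married to person_id, in either column."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM marriage m
        JOIN person p
          ON p.id = CASE WHEN m.person_a = ? THEN m.person_b ELSE m.person_a END
        WHERE (m.person_a = ? OR m.person_b = ?) AND m.ended IS NULL
        ORDER BY p.name
        """,
        (person_id, person_id, person_id),
    ).fetchall()
    return [name for (name,) in rows]


print(current_spouses(1))
print(current_spouses(2))
```

This works, but every traversal is a join, and multi-hop questions (ancestors, in-laws) need recursive queries, which is the part a graph database handles natively.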

I missed you guys. We need to do this. Absolutely, sure. I missed you too, man. So, well, thanks, and come back soon. I will be happy to come back. What's your schedule look like tomorrow? Come on over, next day I got a barbecue. There you go. All right, there's bound to be some smoked meat around here,

something. Absolutely. And we'll see you next time on .NET Rocks. .NET Rocks is brought to you by Franklins.Net and produced by PWOP Studios, a full-service audio, video, and post-production facility located physically in New London, Connecticut, and of course in the cloud online at pwop.com. Visit our website at dotnetrocks.com for RSS feeds, downloads, mobile apps, comments, and access to the full archives going back

to show number one, recorded in September two thousand and two. And make sure you check out our sponsors. They keep us in business. Now go write some code. See you next time.

Transcript source: Provided by creator in RSS feed