Databases underpin almost every user experience on the web, but scaling a database is one of the most fundamental infrastructure challenges in software development. PlanetScale offers a MySQL platform that is managed and highly scalable. Sam Lambert is the CEO of PlanetScale and he joins the show to talk about why he started the platform, The post Hyperscaling SQL with Sam Lambert appeared first on Software Engineering Daily ....
Jul 04, 2024
Apache Iceberg is an open source high-performance format for huge data tables. Iceberg enables the use of SQL tables for big data, while making it possible for engines like Spark and Hive to safely work with the same tables, at the same time. Iceberg was started at Netflix by Ryan Blue and Dan Weeks, and The post Iceberg at Netflix and Beyond with Ryan Blue appeared first on Software Engineering Daily ....
Mar 07, 2024•48 min
Starburst is a data lake analytics platform. It’s designed to help users work with structured data at scale, and is built on the open source platform, Trino. Adam Ferrari is the SVP of Engineering at Starburst. He joins the show to talk about Starburst, data engineering, and what it takes to build a data lake. The post Building a Data Lake with Adam Ferrari appeared first on Software Engineering Daily ....
Feb 06, 2024•46 min
Building scalable software applications can be complex and typically requires dozens of different tools. The engineering often involves handling many arcane tasks that are distant from actual application logic. In addition, a lack of a cohesive model for building applications can lead to substantial engineering costs. Nathan Marz is the creator of Rama, which is The post Rama with Nathan Marz appeared first on Software Engineering Daily ....
Dec 28, 2023•45 min
SurrealDB is the result of a long-time collaboration between brothers Tobie and Jaime Morgan Hitchcock. The project has modest origins and started merely to support other projects the brothers were working on. However, over time the project grew and in 2021 they started working on it full-time. Since then the project has gained serious adoption. The post Bonus Episode: SurrealDB with Tobie Morgan Hitchcock appeared first on Software Engineering Daily ....
Dec 25, 2023•57 min
Maritime logistics is the process of organizing the movement of goods across the ocean. Historically, this has been a challenging problem because of the multinational nature of shipping, as well as piracy, smuggling, and legacy technology. It’s also profoundly important for security reasons, and because 90% of what we buy travels over the oceans. Ocean vessels The post Tracking Drug Smugglers and Migrating Databases with Benny Keinan and Lior Resisi appeared first on Software Engineering Daily ....
Dec 07, 2023•51 min
Data breaches at major companies are now so common that they hardly make the news. The Wikipedia page on data breaches lists over 350 between 2004 and 2023. The Equifax breach in 2017 was especially notable because over 160 million records were leaked, and much of the data was acquired by Equifax without individuals’ knowledge The post The Right to Be Forgotten with Gal Ringel appeared first on Software Engineering Daily ....
Nov 29, 2023•48 min
If you’re a sports fan and like to track sports statistics and results, you’ve probably heard of Sofascore. The website started in 2010 and ran on a modest single server. It now has 25 million monthly active users, covers 20 different sports, 11,000 leagues and tournaments, and is available in over 30 languages. Josip The post Sofascore with Josip Stuhli appeared first on Software Engineering Daily ....
Nov 28, 2023•50 min
Cloud-based software development platforms such as GitHub Codespaces continue to grow in popularity. These platforms are attractive to enterprise organizations because they can be managed centrally with security controls. However, many, if not most, developers prefer a local IDE. Daytona is aiming to bridge that gap. It’s a layer between a local IDE and a The post Daytona with Ivan Burazin appeared first on Software Engineering Daily ....
Nov 23, 2023•48 min
Knowledge graphs are an intuitive way to define relationships between objects, events, situations, and concepts. Their ability to encode this information makes them an attractive database paradigm. Hume is a graph-based analysis solution developed by GraphAware. It represents data as a network of interconnected entities and provides analysis capabilities to extract insights from the data. The post GraphAware with Luanne Misquitta appeared first on Software Engineering Daily ....
Nov 22, 2023•58 min
Observability software helps teams to actively monitor and debug their systems, and these tools are increasingly vital in DevOps. However, it’s not uncommon for the volume of observability data to exceed the amount of actual business data. This creates two challenges – how to analyze the large stream of observability data, and how to keep The post Chronosphere with Martin Mao appeared first on Software Engineering Daily ....
Nov 09, 2023•48 min
The importance of data teams is undeniable. Most companies today use data to drive decision-making on anything from software feature development to product strategy, hiring and marketing. In some companies data is the product, which can make data teams even more vital. But there’s a common problem – analyzing data is hard and time consuming. The post Streamlit with Amanda Kelly appeared first on Software Engineering Daily ....
Oct 24, 2023•47 min
Today it’s estimated there are over 1 billion websites on the internet. Much of this content is optimized to be viewed by human eyes, not consumed by machines. However, creating systems to automatically parse and structure the web greatly extends its utility, and paves the way for innovative solutions and applications. The industry of web The post Modern Web Scraping with Erez Naveh appeared first on Software Engineering Daily ....
Oct 18, 2023•57 min
There are hundreds of observability companies out there, and many ways to think about observability, such as application performance monitoring, server monitoring, and tracing. In a production application, multiple tools are often needed to get proper visibility on the application. This creates some challenges. Applications can produce lots of different observability data, but how The post Observability with Eduardo Silva appeared first on Software Engineering Daily ....
Oct 12, 2023•45 min
It’s now clear that the adoption of AI will continue to increase, with nearly every industry working to rapidly incorporate it into their systems and applications to provide greater value to their users. Business analytics is a key domain that promises to be radically reshaped by AI. Alembic is an AI platform that integrates web The post AI and Business Analytics with John Adams appeared first on Software Engineering Daily ....
Oct 05, 2023•30 min
ScyllaDB is a fast and highly scalable NoSQL database designed to provide predictable performance at a massive cloud scale. It can handle millions of operations per second at a scale of gigabytes or petabytes. It’s also designed to be compatible with Cassandra and DynamoDB APIs. Scylla is used by Zillow, Comcast, and for Discord’s 350M+ The post Highly Scalable NoSQL with Dor Laor appeared first on Software Engineering Daily ....
Sep 07, 2023•36 min
Database caching is a fundamental challenge in database management and there are hundreds of techniques to satisfy different caching scenarios. PolyScale is a fully automated database cache. It offers an innovative approach to database caching, leveraging AI and automated configuration to simplify the process of determining what should and should not be cached. Ben Hagan The post Database Caching with Ben Hagan appeared first on Software Engineering Daily ....
Aug 08, 2023•36 min
Companies have high hopes for machine learning and AI to support real-time product offerings, prevent fraud and drive innovation. But there is a catch: training models requires labeled data that machines can digest. As data volumes increase, the opportunity to get great ML results rises, but so does the problem of labeling all the The post Data-Centric AI with Alex Ratner appeared first on Software Engineering Daily ....
Jul 20, 2023•50 min
RudderStack is a warehouse-native customer data platform (CDP) that helps businesses collect, unify, and activate customer data from all their different sources. In today’s episode, we’re talking to Soumyadeb Mitra, the founder and CEO of RudderStack. We discuss the importance of activating all your data, how RudderStack can help you activate your data, the challenges The post Making Data-Driven Decisions with Soumyadeb Mitra appeared first on Software Engineering Daily ....
Jul 11, 2023•51 min
The state of data inside most companies is chaotic. It takes significant time and investment to tame this chaos. When you are a platform provider, you are gathering tons of data from the developers using your platform. These developers building products on your platform need insight into that data to better understand how their application The post Customer-facing Analytics with Tyler Wells appeared first on Software Engineering Daily ....
Jun 30, 2023•52 min
As companies depend more on data to improve digital products and make informed decisions, it’s crucial that the data they use be accurate and reliable. Monte Carlo, the data reliability company, is the creator of the industry’s first end-to-end data observability platform. Barr Moses and Lior Gavish are the founders of Monte Carlo and they join The post Data Reliability with Barr Moses and Lior Gavish appeared first on Software Engineering Daily ....
Jun 12, 2023•56 min
In this podcast episode, we take a look at the intricacies of low-code data pipelines with Raj Bains, the founder of Prophecy.io. Raj shares valuable insights into how performant low-code data pipelines are revolutionizing industries and transforming everyday operations. Raj discusses the founding story of Prophecy.io, the company’s mission, and its approach to democratizing the creation The post Low-Code SQL on dbt Core with Raj Bains from Prophecy appeared first on Software Engineering Daily ....
May 26, 2023•54 min
Chroma is an open source embedding database that is designed to make it easy to build large language model applications by making knowledge, facts and skills pluggable. Anton Troynikov is the co-founder of Chroma and he is our guest today. This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and The post Open-Source Embedding Database with Anton Troynikov appeared first on Software Engineering Daily ....
Apr 20, 2023•33 min
Data Activation is the method of unlocking the knowledge stored within your data warehouse, and making it actionable by your business users in the end tools that they use every day. In doing so, Data Activation helps bring data people toward the center of the business, directly tying their work to business outcomes. Hightouch is The post Data Activation with Tejas Manohar appeared first on Software Engineering Daily ....
Apr 13, 2023•41 min
A data catalog provides an index into the data sets and schemas of a company. Data teams are growing in size, and more companies than ever have a data team, so the market for data catalogs is larger than ever. Mark is the CEO of Stemma and the co-creator of Amundsen, a data catalog that came The post Self-Service Data Culture with Stemma’s Mark Grover appeared first on Software Engineering Daily ....
Apr 07, 2023•46 min
Streaming analytics refers to the process of analyzing real-time data that is generated continuously and rapidly from various sources, such as sensors, applications, social media, and other internet-connected devices. Streaming analytics platforms enable organizations to extract business value from data in motion, similar to how traditional analytics tools derive insights from data at rest. DeltaStream The post Streaming Analytics with Hojjat Jafarpour appeared first on Software Engineering Dail...
Apr 06, 2023•47 min
Distributed databases are necessary for storing and managing data across multiple nodes in a network. They provide scalability, fault tolerance, improved performance, and cost savings. By distributing data across nodes, they allow for efficient processing of large amounts of data and redundancy against failures. They can also be used to store data across multiple locations The post Turso: Globally Replicated SQLite with Glauber Costa appeared first on Software Engineering Daily ....
Apr 03, 2023•51 min
DataSet is a log analytics platform provided by SentinelOne that helps DevOps, IT engineering, and security teams get answers from their data across all time periods, both live streaming and historical. It’s powered by a unique architecture that uses a massively parallel query engine to provide actionable insights from the data available. John Hart The post Observability Trends with John Hart appeared first on Software Engineering Daily ....
Mar 20, 2023•27 min
There are many types of early-stage funding available, from friends and family to seed to Series A. Some firms invest across a wide set of technologies and seek only to provide capital. Others are in it for the long haul – they focus on specific areas of technology and develop both long term relationships The post Data Investing and the MAD with Matt Turck appeared first on Software Engineering Daily ....
Mar 10, 2023•51 min
The Presto/Trino project makes distributed querying easier across a variety of data sources. As the need for machine learning and other high volume data applications has increased, the need for support, tooling, and cloud infrastructure for Presto/Trino has increased with it. Starburst helps your teams run fast queries on any data source. With Starburst you The post Accessing Data at Scale with Justin Borgman appeared first on Software Engineering Daily ....
Nov 11, 2022•46 min