👀Python Poetry
Python dependency management and packaging made easy. ... Poetry comes with all the tools you might need to manage your projects in a deterministic way
In computer networks, rate limiting is used to control the rate of requests sent or received by a network interface controller. It can be used to prevent DoS attacks and limit web scraping
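One common way to implement rate limiting is a token bucket. The sketch below is only an illustration, not something taken from the episode: the `TokenBucket` class, its `rate` and `capacity` parameters, and the injectable `clock` are all hypothetical names chosen for this example.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.tokens = float(capacity)    # start with a full bucket
        self.clock = clock               # injectable so the example can be tested
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed now, False if it is rate limited."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token for this request
            return True
        return False
```

A server would call `allow()` once per incoming request and reject or delay the request when it returns `False`.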
dbt - Transform data in your warehouse. What is dbt? dbt is a development framework that combines modular SQL with software engineering best practices to make data transformation reliable, fast,
HashAggregateExec is a unary physical operator (i.e. with one child physical operator) for hash-based aggregation that is created (indirectly through AggUtils.
The Spark SQL shuffle is a mechanism for redistributing or re-partitioning data so that the data is grouped differently across partitions
We have two types of automatic backups in DynamoDB: one is point-in-time recovery and the other is snapshots.
System design
Giorgia Meloni is an Italian politician and journalist. A member of the Chamber of Deputies in Italy since 2006, she has led the Brothers of Italy political party since 2014, and has been the president of the European Conservatives and Reformists Party since 2020.
Apache Spark unit tests
📚 Book - The Psychology of Money - https://www.amazon.com/Psychology-Money-Timeless-lessons-happiness/dp/0857197681#?&_encoding=UTF8&tag=planetizer0c-20&linkCode=ur2&linkId=c18c78f3d241db79ce045de652b93722&camp=1789&creative=9325
How can we pivot in Spark?
Why a single file in your repository that describes the project could make life much easier for newcomer programmers
DataFrame.createOrReplaceTempView - Creates or replaces a local temporary view with this DataFrame. The lifetime of this temporary table is tied to the SparkSession that was used to create this DataFrame.
Spark SQL is a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. It enables unmodified Hadoop Hive queries to run up to 100x faster on existing deployments and data.
We will describe how to use Apache Spark to get the top 100 words from a file!
Apache Spark paired RDDs are RDDs containing key-value pairs (KVP), each consisting of two linked data items. In most cases, the key is an identifier and the value is the data corresponding to that key. Furthermore, Apache Spark operations work on RDDs that contain any objects.
What is behind the name Delta Lake? Why delta? What is the core benefit of using it over a standard data lake? What are its disadvantages?
Comparing Hive, AWS Glue, and additional data catalogs
In this episode we will discuss what the Go language is and how it differs from other computer languages
SageMaker
What is AWS DynamoDB DAX and how does it relate to ElastiCache?
Java's ForkJoinPool uses a work-stealing algorithm to make better use of the threads that we have
In AWS DynamoDB, one write capacity unit is one write per second for an item of up to 1 KB in size
DynamoDB read capacity units and write capacity units
Why does reserve currency matter so much in today's economy, and what are the alternatives and investment opportunities in this area?
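The capacity-unit arithmetic can be sketched in a few lines. The write rule (one unit per write per second for items up to 1 KB) comes from the description above. The read rule used here (one unit per strongly consistent read per second for items up to 4 KB, half that for eventually consistent reads) and the rounding up to the next size step are assumptions based on how DynamoDB provisioned capacity is usually documented. The function names are hypothetical.

```python
import math


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate WCUs: each write costs one unit per 1 KB of item size, rounded up."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return units_per_write * writes_per_second


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate RCUs: each strongly consistent read costs one unit per 4 KB, rounded up."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Assumption: eventually consistent reads cost half as much.
        total = math.ceil(total / 2)
    return total
```

For example, under these assumptions, writing a 3.5 KB item ten times per second needs 4 units per write.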
Common table expressions are very useful in constructing SQL queries and are a great next milestone in learning SQL
What is the core of doing SEO, did it change over the years, and what would bring you to the top of the search results in the search engines?
Here we discuss some of the most interesting product interview questions in the data world
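As a small illustration of a common table expression, the sketch below runs a `WITH` query against an in-memory SQLite database. The `orders` table, its columns, and the sample rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 30.0), ("ana", 70.0), ("bob", 20.0), ("cy", 150.0)],
)

# The CTE names an intermediate result (totals per customer) so the
# final SELECT can filter on it without a nested subquery.
query = """
WITH customer_totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM customer_totals
WHERE total >= 100
ORDER BY customer
"""
big_spenders = conn.execute(query).fetchall()
```

Naming the intermediate step is the main benefit: longer queries read top to bottom as a sequence of named results.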
An intro to blockchain technology
What does term frequency mean, and what is its relationship with inverse document frequency, which we use to identify categories of documents, to find sentiment, and in many more applications?
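The relationship between the two can be sketched as follows. This is a minimal, unsmoothed variant in which the inverse document frequency is log(N / document frequency); libraries often use smoothed formulas instead. The function names and the pre-tokenized input format are assumptions made for this example.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Share of the document's tokens taken up by each term."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(documents):
    """log(N / df): terms that appear in fewer documents get a higher weight."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(documents):
    """One {term: score} dict per document, where score = tf * idf."""
    idf = inverse_document_frequency(documents)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in documents
    ]
```

A term that appears in every document gets an inverse document frequency of zero, so only terms that distinguish a document from the others end up with a high score.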