
💥 Spark DataFrame Cache

Jan 11, 2023•5 min

Episode description

In Apache Spark, you can cache a DataFrame using the cache() or persist() method. For DataFrames, cache() is shorthand for persist() with the default storage level, which is MEMORY_AND_DISK (MEMORY_ONLY is the default for RDDs, not for DataFrames). Use persist() to specify a different storage level if desired, such as MEMORY_ONLY or DISK_ONLY, or MEMORY_ONLY_SER in the Scala and Java APIs.