Darshini

What Happens When an Action is Executed in Apache Spark?

Darshini January 30, 2025 No Comments

In Apache Spark, an action is an operation that triggers the execution of a Spark job. Actions produce a result or output, and they cause…

Spark Architecture

Darshini January 30, 2025 No Comments

Apache Spark is a distributed data processing framework designed for speed and scalability. It works internally through a combination of cluster computing, in-memory processing, and DAG execution. Below is a…

Date functions

Darshini January 29, 2025 No Comments

SQL provides various date functions to manipulate and extract information from date and time values. These functions vary slightly across different databases (MySQL, PostgreSQL, SQL Server, Oracle), but the core…

Accumulator

Darshini January 29, 2025 No Comments

An accumulator in PySpark is a shared, mutable variable used for aggregating information across tasks. It allows workers to increment or add values to a shared variable…

Broadcast Variable

Darshini January 28, 2025 17 Comments

A broadcast variable in PySpark is a mechanism for efficiently sharing read-only data across all nodes in a cluster. It is especially useful when you have data that needs to…

Aggregations

Darshini January 27, 2025 13 Comments

Aggregations in PySpark involve performing summary computations on data, such as calculating sums, averages, counts, or other statistical measures. These operations are often used to gain insights from datasets, such…

Handling NULL in join condition

Darshini January 18, 2025 10 Comments

In SQL, the behavior of joins involving NULL values in the join conditions depends on the type of join used. Here’s a breakdown: 1. INNER JOIN Behavior: Rows with NULL…

REVOKE

Darshini January 17, 2025 11 Comments

The SQL REVOKE command is used to withdraw permissions that were previously granted to users or roles with the GRANT command. It is an essential tool for managing…

GRANT

Darshini January 17, 2025 11 Comments

The GRANT command in SQL is used to assign permissions to users or roles, enabling them to perform specific operations on database objects. These permissions are essential for managing access…

SELECT

Darshini January 16, 2025 12 Comments

The SQL SELECT command is one of the most fundamental and frequently used operations in relational database management. It retrieves data from one or more tables, enabling you to query…

Concepts