Spark LocalFS

Bases: SparkFileDFConnection

Spark connection to local filesystem. |support_hooks|

Based on `Spark Generic File Data Source <https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html>`_.

.. warning::

    To use the SparkLocalFS connector, PySpark must be installed (or injected into ``sys.path``)
    BEFORE creating the connector instance.

    See the :ref:`install-spark` installation instructions for more details.

.. warning::

    Currently supports only Spark sessions created with option ``spark.master: local``.
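
The restriction above can be checked before constructing the connection. The sketch below is an illustration only: ``is_local_master`` is a hypothetical helper, not part of onETL, and it assumes that the connector accepts the local-mode master URLs documented by Spark (``local``, ``local[N]``, ``local[*]``). The master of an existing session can be read from ``spark.sparkContext.master``.

.. code:: python

    def is_local_master(master: str) -> bool:
        """Return True if a Spark master URL denotes local mode.

        Hypothetical helper, not part of onETL. Assumes local mode means
        "local" or "local[...]" (for example "local[4]" or "local[*]").
        """
        # "local-cluster[...]" is deliberately rejected: it starts separate
        # worker processes and is not plain local mode.
        return master == "local" or master.startswith("local[")


    # Values a session might report via spark.sparkContext.master
    for candidate in ("local", "local[*]", "yarn", "spark://host:7077"):
        print(candidate, is_local_master(candidate))

Whether the connector accepts every ``local[...]`` variant is an assumption here; the documented requirement is only ``spark.master: local``.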

.. note::

    Supports only reading files as a Spark DataFrame and writing a DataFrame to files.

    Does NOT support file operations such as create, delete, or rename.

.. versionadded:: 0.9.0

Parameters:

  • spark (:class:`pyspark.sql.SparkSession`) –

    Spark session

Examples:

.. code:: python

    from onetl.connection import SparkLocalFS
    from pyspark.sql import SparkSession

    # create Spark session
    spark = SparkSession.builder.master("local").appName("spark-app-name").getOrCreate()

    # create connection
    local_fs = SparkLocalFS(spark=spark).check()