File Filter (legacy)

Bases: BaseFileFilter, FrozenModel

Filter files or directories by their path.

.. deprecated:: 0.8.0

Use :obj:`Glob <onetl.file.filter.glob.Glob>`, :obj:`Regexp <onetl.file.filter.regexp.Regexp>`
or :obj:`ExcludeDir <onetl.file.filter.exclude_dir.ExcludeDir>` instead.

Parameters:

  • glob (str | None, default: `None` ) –

    Pattern (e.g. ``*.csv``) which a file path must match. Only file paths are checked, not directories.

    .. warning::

        Mutually exclusive with ``regexp``
    
  • regexp (str | Pattern | None, default: `None` ) –

    Regular expression (e.g. ``\d+\.csv``) which a file path must match. Only file paths are checked, not directories.

    If the input is a string, the regular expression will be compiled using the ``re.IGNORECASE`` and ``re.DOTALL`` flags.

    .. warning::

        Mutually exclusive with ``glob``
    
  • exclude_dirs (list[PathLike | str], default: `[]` ) –

    List of directories which must not be a part of a file or directory path.
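The effect of the two flags applied to a string ``regexp`` can be illustrated with the standard ``re`` module alone. This is a minimal sketch, not onetl code: the sample paths are made up, and the use of ``search`` (rather than ``match`` or ``fullmatch``) is an assumption made only for this illustration.

```python
import re

# A string pattern compiled with the two flags named in the parameter description.
pattern = re.compile(r"\d+\.csv", re.IGNORECASE | re.DOTALL)

# re.IGNORECASE: letters in the pattern match either case.
upper_match = pattern.search("/data/2023.CSV") is not None

# Without the flag the same pattern is case-sensitive.
strict = re.compile(r"\d+\.csv")
strict_match = strict.search("/data/2023.CSV") is not None

# re.DOTALL only changes ".", letting it match a newline as well.
dotall_match = re.compile(r"a.b", re.DOTALL).search("a\nb") is not None
plain_match = re.compile(r"a.b").search("a\nb") is not None
```

Passing an already compiled ``Pattern`` presumably keeps whatever flags it was compiled with, so case-sensitive matching would need a precompiled pattern rather than a string.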

Examples:

Create exclude_dir filter:

.. code:: python

    from onetl.core import FileFilter

    file_filter = FileFilter(exclude_dirs=["/export/news_parse/exclude_dir"])

Create glob filter:

.. code:: python

    from onetl.core import FileFilter

    file_filter = FileFilter(glob="*.csv")

Create regexp filter:

.. code:: python

    from onetl.core import FileFilter

    file_filter = FileFilter(regexp=r"\d+\.csv")

    # or pass a precompiled pattern

    import re

    # raw string avoids invalid escape sequences such as "\d"
    file_filter = FileFilter(regexp=re.compile(r"\d+\.csv"))

Not allowed:

.. code:: python

    from onetl.core import FileFilter

    FileFilter()  # raises ValueError: at least one argument must be passed

match(path)

Returns ``False`` if ``path`` does not match the pattern used to select files.
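The matching rules described on this page can be approximated with the standard library. The sketch below is a hypothetical stand-in, not onetl's implementation: the function name ``sketch_match``, the explicit ``is_file`` argument, the use of ``PurePosixPath.match`` for globs, and ``search`` for regular expressions are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath


def sketch_match(path, is_file, glob=None, regexp=None, exclude_dirs=()):
    """Hypothetical approximation of the documented rules, not onetl code."""
    path = PurePosixPath(path)

    # exclude_dirs applies to files and directories alike:
    # reject the path if it is, or lies inside, an excluded directory.
    for excluded in map(PurePosixPath, exclude_dirs):
        if path == excluded or excluded in path.parents:
            return False

    # glob and regexp are checked against file paths only.
    if not is_file:
        return True
    if glob is not None:
        return path.match(glob)
    if regexp is not None:
        if isinstance(regexp, str):
            # String patterns get the two documented flags.
            regexp = re.compile(regexp, re.IGNORECASE | re.DOTALL)
        return regexp.search(str(path)) is not None
    return True


csv_ok = sketch_match("/export/news_parse/2023.csv", True, glob="*.csv")
txt_ok = sketch_match("/export/news_parse/readme.txt", True, glob="*.csv")
excluded_ok = sketch_match(
    "/export/news_parse/exclude_dir/1.csv",
    True,
    glob="*.csv",
    exclude_dirs=["/export/news_parse/exclude_dir"],
)
```

Here ``csv_ok`` is ``True``, while ``txt_ok`` and ``excluded_ok`` are ``False``: the first rejected path fails the glob, the second sits inside an excluded directory.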