Writing to MongoDB using DBWriter

For writing data to MongoDB, use DBWriter.

Warning

Please take into account MongoDB types

Examples

from onetl.connection import MongoDB
from onetl.db import DBWriter

mongodb = MongoDB(...)

df = ...  # data is here

writer = DBWriter(
    connection=mongodb,
    target="schema.table",
    options=MongoDB.WriteOptions(
        if_exists="append",
    ),
)

writer.run(df)

Write options

DBWriter accepts MongoDB.WriteOptions through its options argument, as in the example above.

MongoDBWriteOptions

Bases: GenericOptions

Writing options for MongoDB connector.

.. warning::

    Options ``uri``, ``database``, ``collection`` are populated from connection attributes,
    and cannot be overridden by the user in ``WriteOptions`` to avoid issues.

.. versionadded:: 0.7.0

Examples:

.. note::

    You can pass any value
    `supported by connector <https://www.mongodb.com/docs/spark-connector/current/batch-mode/batch-write-config/>`_,
    even if it is not mentioned in this documentation. **Option names should be in** ``camelCase``!

    The set of supported options depends on connector version.

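.. code:: python

    from onetl.connection import MongoDB

    # maxBatchSize and ordered are write options of the MongoDB Spark connector,
    # passed through as-is in camelCase
    options = MongoDB.WriteOptions(
        if_exists="append",
        maxBatchSize=1024,
        ordered=False,
    )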
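
The two naming rules above (reserved ``uri``/``database``/``collection`` and ``camelCase`` option names) can be checked before building the options object. The following is a minimal pure-Python sketch, not part of onETL: the helper ``check_connector_options``, the ``RESERVED`` set and the ``CAMEL_CASE`` pattern are hypothetical names introduced here for illustration, and the pattern is only an assumption about what a camelCase connector option looks like (dotted names such as ``writeConcern.w`` are allowed).

```python
import re

# Populated from the connection attributes, so they must not be passed by the user.
RESERVED = {"uri", "database", "collection"}

# Assumed shape of a connector option name: camelCase, optionally dotted.
CAMEL_CASE = re.compile(r"[a-z][A-Za-z0-9]*(\.[A-Za-z][A-Za-z0-9]*)*")


def check_connector_options(options):
    """Return a list of human-readable problems found in option names."""
    problems = []
    for name in options:
        if name in RESERVED:
            problems.append(f"{name}: set by the connection, cannot be overridden")
        elif not CAMEL_CASE.fullmatch(name):
            problems.append(f"{name}: expected camelCase")
    return problems


# Illustrative input: one valid name, one snake_case name, one reserved name.
for problem in check_connector_options(
    {"maxBatchSize": 1024, "max_batch_size": 1024, "uri": "mongodb://..."}
):
    print(problem)
```

The check only looks at names. Whether a given option is actually supported still depends on the connector version.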

if_exists = Field(default=(MongoDBCollectionExistBehavior.APPEND), alias=(avoid_alias('mode'))) class-attribute instance-attribute

Behavior when writing data into an existing collection.

Possible values:

  • append (default) Adds new objects to the existing collection.

    .. dropdown:: Behavior in detail

    * Collection does not exist
        Collection is created using options provided by user
        (``shardkey`` and others).
    
    * Collection exists
        Data is appended to a collection.
    
        .. warning::
    
            This mode does not check whether collection already contains
            objects from dataframe, so duplicated objects can be created.
    
  • replace_entire_collection The existing collection is dropped and recreated with the dataframe content.

    .. dropdown:: Behavior in detail

    * Collection does not exist
        Collection is created using options provided by user
        (``shardkey`` and others).
    
    * Collection exists
        Collection content is replaced with dataframe content.
    
  • ignore Ignores the write operation if the collection already exists.

    .. dropdown:: Behavior in detail

    * Collection does not exist
        Collection is created using options provided by user
    
    * Collection exists
        The write operation is ignored, and no data is written to the collection.
    
  • error Raises an error if the collection already exists.

    .. dropdown:: Behavior in detail

    * Collection does not exist
        Collection is created using options provided by user
    
    * Collection exists
        An error is raised, and no data is written to the collection.
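
The four modes listed above can be summarized with a small in-memory model. This is only a sketch of the documented semantics, not onETL or connector code: the database is assumed to be a plain dict mapping collection names to lists of documents, and ``IfExists`` and ``write_collection`` are hypothetical names introduced for illustration.

```python
from enum import Enum


class IfExists(str, Enum):
    """The four documented values of ``if_exists``."""

    APPEND = "append"
    REPLACE_ENTIRE_COLLECTION = "replace_entire_collection"
    IGNORE = "ignore"
    ERROR = "error"


def write_collection(database, name, documents, if_exists="append"):
    """Write ``documents`` into the in-memory ``database`` (a dict of lists).

    Returns True if documents were written, False if the write was skipped.
    """
    mode = IfExists(if_exists)  # an unknown mode raises ValueError
    documents = list(documents)

    if name not in database:
        # In every mode a missing collection is created and filled.
        database[name] = documents
        return True

    if mode is IfExists.APPEND:
        # No duplicate check: the same objects can be appended twice.
        database[name].extend(documents)
        return True
    if mode is IfExists.REPLACE_ENTIRE_COLLECTION:
        database[name] = documents
        return True
    if mode is IfExists.IGNORE:
        return False
    raise RuntimeError(f"Collection {name!r} already exists")


db = {}
rows = [{"name": "a"}, {"name": "b"}]
write_collection(db, "events", rows, "error")   # collection is missing, so it is created
write_collection(db, "events", rows, "append")  # the same rows are appended again
print(len(db["events"]))  # 4
```

The final count illustrates the warning for ``append``: running the same write twice doubles the objects, so deduplicate the dataframe beforehand if that matters.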
    

.. versionchanged:: 0.9.0
    Renamed ``mode`` to ``if_exists``