Clickhouse connection
Bases: JDBCConnection
Clickhouse JDBC connection. |support_hooks|
Based on Maven package `com.clickhouse:clickhouse-jdbc:0.7.2 <https://mvnrepository.com/artifact/com.clickhouse/clickhouse-jdbc/0.7.2>`_
(`official Clickhouse JDBC driver <https://github.com/ClickHouse/clickhouse-jdbc>`_).
.. seealso::

    Before using this connector, please take into account :ref:`clickhouse-prerequisites`

.. versionadded:: 0.1.0
Parameters:

- host (str) – Host of Clickhouse database. For example: ``test.clickhouse.domain.com`` or ``193.168.1.11``
- port (int, default: ``8123``) – Port of Clickhouse database
- user (str) – User who has proper access to the database. For example: ``some_user``
- password (str) – Password for database connection
- database (str) – Database (==schema) in Clickhouse.
- spark (:obj:`pyspark.sql.SparkSession`) – Spark session.
- extra (dict, default: ``None``) – One or more extra parameters that clients can use to connect to the instance.
  For example: ``{"continueBatchOnError": "false"}``.

  See:

  - `Clickhouse JDBC driver properties documentation <https://clickhouse.com/docs/en/integrations/java#configuration>`_
  - `Clickhouse core settings documentation <https://clickhouse.com/docs/en/operations/settings/settings>`_
  - `Clickhouse query complexity documentation <https://clickhouse.com/docs/en/operations/settings/query-complexity>`_
  - `Clickhouse query level settings <https://clickhouse.com/docs/en/operations/settings/query-level>`_

Examples:
Create and check Clickhouse connection:
.. code:: python

    from onetl.connection import Clickhouse
    from pyspark.sql import SparkSession

    # Create Spark session with Clickhouse driver loaded
    maven_packages = Clickhouse.get_packages()
    spark = (
        SparkSession.builder.appName("spark-app-name")
        .config("spark.jars.packages", ",".join(maven_packages))
        .getOrCreate()
    )

    # Create connection
    clickhouse = Clickhouse(
        host="database.host.or.ip",
        user="user",
        password="*****",
        extra={"continueBatchOnError": "false"},
        spark=spark,
    ).check()

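
As a rough illustration of how the ``host``, ``port``, ``database`` and ``extra`` parameters relate to a ClickHouse JDBC URL, here is a minimal sketch. The helper ``build_clickhouse_jdbc_url`` is hypothetical and not part of onETL. Placing ``extra`` options in the query string is an assumption made for illustration; the connector may pass them to the driver as connection properties instead.

```python
from typing import Optional
from urllib.parse import urlencode


def build_clickhouse_jdbc_url(
    host: str,
    port: int = 8123,
    database: Optional[str] = None,
    extra: Optional[dict] = None,
) -> str:
    """Hypothetical helper (not an onETL API): compose a ClickHouse JDBC URL."""
    url = f"jdbc:clickhouse://{host}:{port}"
    if database:
        url += f"/{database}"
    if extra:
        # Assumption: extra options go into the query string.
        # Keys are sorted so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(extra.items()))
    return url


url = build_clickhouse_jdbc_url(
    host="database.host.or.ip",
    database="default",
    extra={"continueBatchOnError": "false"},
)
print(url)
```

The default port in the sketch mirrors the documented default of ``8123``; the database name ``default`` is only a placeholder.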
get_packages(package_version=None, apache_http_client_version=None)
classmethod
Get package names to be downloaded by Spark. |support_hooks|
Allows specifying custom JDBC and Apache HTTP Client versions.
.. versionadded:: 0.9.0
Parameters:

- package_version (str, default: ``None``) – Version of the ClickHouse JDBC client packages. Defaults to ``0.7.2``.
  Versions 0.8.0-0.9.2 are not supported, see `issue #2625 <https://github.com/ClickHouse/clickhouse-java/issues/2625>`_.

  .. versionadded:: 0.11.0

- apache_http_client_version (str, default: ``None``) – Version of the Apache HTTP Client package. Defaults to ``5.4.2``.
  Used only if ``package_version`` is in range ``0.5.0-0.7.0``.

  .. versionadded:: 0.11.0

Examples:
.. code:: python

    from onetl.connection import Clickhouse

    Clickhouse.get_packages()
    Clickhouse.get_packages(package_version="0.7.2")
    Clickhouse.get_packages(package_version="0.6.0", apache_http_client_version="5.4.2")
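
The version rules documented for ``get_packages`` can be sketched in plain Python. This is a hypothetical re-implementation for illustration only, not onETL's actual code. The function name ``sketch_get_packages``, the choice to raise ``ValueError`` for unsupported versions, the inclusive range bounds, and the Apache HTTP Client Maven coordinate are all assumptions; the real method may also return additional packages.

```python
from typing import List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "0.7.2" into (0, 7, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def sketch_get_packages(
    package_version: Optional[str] = None,
    apache_http_client_version: Optional[str] = None,
) -> List[str]:
    """Hypothetical sketch of the documented rules (not onETL's implementation)."""
    jdbc_version = package_version or "0.7.2"  # documented default
    http_version = apache_http_client_version or "5.4.2"  # documented default
    parsed = parse_version(jdbc_version)

    # Documented as unsupported: 0.8.0-0.9.2 (bounds assumed inclusive).
    # Raising ValueError is this sketch's choice, not confirmed behavior.
    if (0, 8, 0) <= parsed <= (0, 9, 2):
        raise ValueError(f"ClickHouse JDBC {jdbc_version} is not supported")

    packages = [f"com.clickhouse:clickhouse-jdbc:{jdbc_version}"]

    # Documented: the HTTP client version is used only for 0.5.0-0.7.0
    # (bounds assumed inclusive). The Maven coordinate is an assumption.
    if (0, 5, 0) <= parsed <= (0, 7, 0):
        packages.append(
            f"org.apache.httpcomponents.client5:httpclient5:{http_version}"
        )
    return packages


print(sketch_get_packages())
print(sketch_get_packages("0.6.0", "5.4.2"))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as ``"0.10.0" < "0.9.2"``.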