Spark connector options
The parameters required to connect to the ADB database, as well as the read/write options, are set as key-value pairs.
The connector supports the following options:
Key | Description | Mode | Required (default) |
---|---|---|---|
`spark.adb.url` | Database connection string | Read/Write | Yes |
`spark.adb.dbschema` | The name of the database schema to which the table belongs | Read/Write | Yes (`public`) |
`spark.adb.dbtable` | The name of the database table | Read/Write | Yes |
`spark.adb.driver` | The full path to the JDBC driver when a custom driver is used | Read/Write | No (`org.postgresql.Driver`) |
`spark.adb.user` | The user (ADB role) name | Read/Write | Yes |
`spark.adb.password` | The user's password in ADB | Read/Write | No |
`spark.adb.server.usehostname` | Whether to use the Spark executor host name as the gpfdist server address | Read/Write | No (`false`) |
`spark.adb.server.env.name` | The name of the environment variable whose value determines the host name or IP address of the Spark executor node on which the gpfdist server process runs | Read/Write | No |
`spark.adb.server.port` | A port number or port range for the gpfdist server process on the Spark executor node | Read/Write | No |
`spark.adb.pool.maxsize` | The maximum number of connections in the connection pool | Read/Write | No (`4`) |
`spark.adb.pool.timeoutms` | The time in milliseconds after which an inactive connection is considered idle | Read/Write | No (`10000`) |
`spark.adb.pool.minidle` | The minimum number of idle connections maintained in the connection pool | Read/Write | No (`0`) |
`spark.adb.debugmode` | Enables event logging mode | Read/Write | No (`false`) |
`spark.adb.partition.column` | The name of the table column used for partitioning in Spark. The column must have an integer or date/time data type | Read | No |
`spark.adb.partition.count` | The number of partitions in Spark. Can be specified on its own or together with `spark.adb.partition.column` | Read | No |
`spark.adb.partition.hash` | An expression used as a partitioning key when reading data into Spark. Specified together with `spark.adb.partition.count` | Read | No |
`spark.adb.batch.enable` | Enables batch mode: when reading data from ADB into Spark, the ColumnarBatch data structure is used | Read | No (`false`) |
`spark.adb.batch.memorymode` | The type of data structure used to organize the ColumnarBatch: a JVM in-memory array or off-heap memory | Read | No |
`spark.adb.table.truncate` | Used when writing in Overwrite mode. If `true`, the target table is truncated (TRUNCATE TABLE); otherwise it is dropped and re-created | Write | No (`false`) |
`spark.adb.create.table.with` | Used when writing in Overwrite and ErrorIfExists modes. Storage parameters applied when creating the table (WITH clause) | Write | No |
`spark.adb.create.table.distributedby` | Used when writing in Overwrite and ErrorIfExists modes. The distribution key used when creating the target table (DISTRIBUTED BY clause) | Write | No (`RANDOMLY`) |
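
The options above are passed to the standard Spark DataFrame reader/writer as key-value pairs. A minimal sketch of a partitioned read followed by an Overwrite-mode write (the format name `"adb"`, the connection URL, credentials, and the table names here are illustrative assumptions, not values defined by this document):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object AdbConnectorExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("adb-connector-example")
      .getOrCreate()

    // Read: partitioned scan of a table in the "public" schema.
    // The URL, user, and table names below are placeholders.
    val df = spark.read
      .format("adb")
      .option("spark.adb.url", "jdbc:postgresql://adb-master:5432/demo")
      .option("spark.adb.user", "spark_user")
      .option("spark.adb.password", "***")
      .option("spark.adb.dbschema", "public")
      .option("spark.adb.dbtable", "sales")
      .option("spark.adb.partition.column", "id") // integer or date/time column
      .option("spark.adb.partition.count", "8")
      .load()

    // Write in Overwrite mode: truncate the target table instead of
    // dropping and re-creating it.
    df.write
      .format("adb")
      .option("spark.adb.url", "jdbc:postgresql://adb-master:5432/demo")
      .option("spark.adb.user", "spark_user")
      .option("spark.adb.password", "***")
      .option("spark.adb.dbschema", "public")
      .option("spark.adb.dbtable", "sales_copy")
      .option("spark.adb.table.truncate", "true")
      .mode(SaveMode.Overwrite)
      .save()

    spark.stop()
  }
}
```

Note that read-only options such as `spark.adb.partition.*` apply only on the read path, while `spark.adb.table.truncate` and the `spark.adb.create.table.*` options take effect only on write.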