ADB ClickHouse connector installation
Currently, ADB ClickHouse connector can be installed only manually. You should install RPM packages and create corresponding extensions in the ADB database. Follow the steps:
-
Make sure the PXF service is installed. For information on how to add services to ADB clusters, see Add services.
-
Get tkh-connector, gppxf-fdw, and tkh-fdw RPM packages from the Arenadata support team. Package names should contain a specific version number, in the example below they are listed without versions for simplification. Install RPM packages on all cluster hosts. Below is an example of installation via the YUM package manager:
-
tkh-connector. If you already use Tkhemali connector 1.X (the Tkhemali service is installed in your cluster), you do not need to install this package.
$ sudo yum install -y tkh-connector.rpm
-
gppxf-fdw
$ sudo yum install -y gppxf-fdw.rpm
-
tkh-fdw
$ sudo yum install -y tkh-fdw.rpm
-
-
Add the new custom profile
Tkh_clickhouse:text
to the PXF configuration file /var/lib/pxf/conf/pxf-profiles.xml on Master. To edit the file, you can use thevi
orvim
command:$ sudo vi /var/lib/pxf/conf/pxf-profiles.xml
The content of the changed file can look as follows. In our case, the file does not contain other profiles:
<profiles> <profile> <name>Tkh_clickhouse:text</name> <description>Clickhouse</description> <plugins> <accessor>io.arenadata.tkh.pxf.TkhFdwAccessor</accessor> <resolver>io.arenadata.tkh.pxf.TkhResolver</resolver> </plugins> </profile> </profiles>
NOTEThe directory, where PXF configuration files are located, is defined by the
$PXF_BASE
environment variable. The default value is /var/lib/pxf. The path may differ in your environment. -
Login under the default user name
gpadmin
. All the commands listed in the subsequent steps should be performed on Master:$ sudo su - gpadmin
-
Synchronize the PXF configuration to all hosts in the ADB cluster:
$ pxf cluster sync
The result:
Syncing PXF configuration files from master host to standby master host and 2 segment hosts... PXF configs synced successfully on 3 out of 3 hosts
-
Restart PXF:
$ pxf cluster restart
The result:
Restarting PXF on master host, standby master host, and 2 segment hosts... PXF restarted successfully on 4 out of 4 hosts
-
Connect to the database via psql (or any other client program):
$ psql adb
-
Create two new extensions:
-
pxf_fdw
CREATE EXTENSION pxf_fdw;
-
tkh_fdw
CREATE EXTENSION tkh_fdw;
In both cases, the result should be similar to:
CREATE EXTENSION
-
-
To check the existence of new extensions, you can use the psql meta-command
\dx
:\dx
The output contains new extensions
pxf_fdw
andtkh_fdw
:List of installed extensions Name | Version | Schema | Description -----------+---------+------------+--------------------------------------------------------------------- dblink | 1.1 | public | connect to other PostgreSQL databases from within a database diskquota | 2.2 | public | Disk Quota Main Program gptkh | 0.14 | public | gptkh kadb_fdw | 0.16 | kadb | Kafka-ADB foreign data wrapper plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language postgis | 2.5.4 | public | PostGIS geometry, geography, and raster spatial types and functions pxf | 2.0 | public | Extension which allows to access unmanaged data pxf_fdw | 1.0 | public | PXF Foreign Data Wrapper for Greenplum tkh_fdw | 1.0 | public | ClickHouse Foreign Data Wrapper for Greenplum (9 rows)
-
You can also check that the new foreign data wrapper
tkh_fdw
is available using the following SQL query against thepg_catalog.pg_foreign_data_wrapper
system table:SELECT * FROM pg_catalog.pg_foreign_data_wrapper;
The result:
fdwname | fdwowner | fdwhandler | fdwvalidator | fdwacl | fdwoptions ---------------+----------+------------+--------------+--------+-------------------------------------------------------------------------------------------- dblink_fdw | 10 | 0 | 16480 | | kadb_fdw | 10 | 20308 | 20310 | | {"mpp_execute=all segments",k_initial_offset=0,k_automatic_offsets=true,k_latency_ms=2000} jdbc_pxf_fdw | 10 | 20377 | 20378 | | {protocol=jdbc,"mpp_execute=all segments"} hdfs_pxf_fdw | 10 | 20377 | 20378 | | {protocol=hdfs,"mpp_execute=all segments"} hive_pxf_fdw | 10 | 20377 | 20378 | | {protocol=hive,"mpp_execute=all segments"} hbase_pxf_fdw | 10 | 20377 | 20378 | | {protocol=hbase,"mpp_execute=all segments"} s3_pxf_fdw | 10 | 20377 | 20378 | | {protocol=s3,"mpp_execute=all segments"} gs_pxf_fdw | 10 | 20377 | 20378 | | {protocol=gs,"mpp_execute=all segments"} adl_pxf_fdw | 10 | 20377 | 20378 | | {protocol=adl,"mpp_execute=all segments"} wasbs_pxf_fdw | 10 | 20377 | 20378 | | {protocol=wasbs,"mpp_execute=all segments"} file_pxf_fdw | 10 | 20377 | 20378 | | {protocol=file,"mpp_execute=all segments"} tkh_fdw | 10 | 20392 | 20391 | | {protocol=tkh_clickhouse,"mpp_execute=all segments",format=text} (12 rows)