Integrating a Vertica data source catalog into Doris

2025/10/22 12:05:30 / Source: https://www.cnblogs.com/lisacumt/p/19157512

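Preparation:

  1. Linux build environment: Docker must be installed, because the build runs inside Docker. The Linux version of IDEA can run on the same machine.
    IDEA official site
    The Community edition is enough. Copy it to any directory in the Linux virtual machine and launch it.
    Maven must also be configured on the Linux machine.

  2. GitHub
    To compile Doris, follow the GitHub repository and the official documentation:
    Doris compilation
    Official build guide 1:
    github-doris
    Official build guide 2:
    compilation-with-docker

  3. Download the source code.
    Note:

Be sure to download from the matching branch: find the release for your major version. Do not use the source archive from doris.apache.org. Many directory names were changed when it was archived, so it does not compile.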

git clone -b branch-2.0 https://github.com/apache/doris.git

Also check whether the hardware of the server you deploy to supports AVX2, and choose the matching version.
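One way to run that check from Java is sketched below. It is a minimal sketch, assuming a Linux x86 host where /proc/cpuinfo lists CPU features on its "flags" lines; the class name Avx2Check and the method hasAvx2 are hypothetical helpers and not part of Doris.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper (not part of Doris): looks for the avx2 CPU flag. */
final class Avx2Check {

    /** Returns true if any "flags" line of the given cpuinfo text lists avx2. */
    static boolean hasAvx2(String cpuinfo) {
        return Arrays.stream(cpuinfo.split("\n"))
                .filter(line -> line.startsWith("flags"))
                .anyMatch(line -> Arrays.asList(line.trim().split("\\s+")).contains("avx2"));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Linux host; /proc/cpuinfo does not exist on other systems.
        String text = new String(Files.readAllBytes(Paths.get("/proc/cpuinfo")));
        System.out.println(hasAvx2(text)
                ? "AVX2 supported"
                : "AVX2 not supported: use the no-avx2 image and USE_AVX2=0");
    }
}
```

If the flag is missing, the no-avx2 build image and the USE_AVX2=0 setting used in the commands further down are the ones to pick.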

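  1. Install Docker
    Choose the Docker build-environment image that matches your source version. Pay attention to AVX2: use the no-avx2 image if the CPU lacks it.

  2. Start Docker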

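# -v host directory:directory inside the container
# Both the source tree and the Maven repository must be mapped into the container
docker run -it --network=host --name dorisbuild -v /opt/maven_repository:/root/.m2/repository -v /opt/pjs/doris-3.1.0-src:/root/doris-3.1.0-src/ apache/doris:build-env-for-3.1-no-avx2

# Start the container
docker start dorisbuild
# Open a bash shell inside the container
docker exec -it dorisbuild /bin/bash

# Build the unmodified source first; do not skip this step:
# -j sets the parallelism; it matters when building BE
USE_AVX2=0 DISABLE_JAVA_CHECK_STYLE=ON ./build.sh --fe --be -j 6
USE_AVX2=0 DISABLE_JAVA_CHECK_STYLE=ON ./build.sh --be -j 6

# Building BE is very slow. If it fails, retry a few times. After one successful build, later C++ builds are incremental and much faster.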

Remember to make a git commit before modifying the source.

Source changes:
For the directory location, refer to JdbcPostgreSQLClient.

JdbcVerticaClient

package org.apache.doris.datasource.jdbc.client;

import com.google.common.collect.Lists;
import org.apache.doris.catalog.ArrayType;
import org.apache.doris.catalog.ScalarType;
import org.apache.doris.catalog.Type;
import org.apache.doris.common.util.Util;
import org.apache.doris.datasource.jdbc.util.JdbcFieldSchema;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.sql.*;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JdbcVerticaClient extends JdbcClient {
    private static final Logger LOG = LogManager.getLogger(JdbcVerticaClient.class);

    // Element types accepted inside a Vertica array column
    private static final String[] supportedInnerType = new String[]{
            "Boolean", "Integer", "Float", "Numeric",
            "Date", "Time", "Timestamp", "TimestampTz",
            "Char", "Varchar", "Long Varchar",
            "Varbinary", "Long Varbinary", "Uuid"
    };

    protected JdbcVerticaClient(JdbcClientConfig jdbcClientConfig) {
        super(jdbcClientConfig);
    }

    private static boolean supportInnerType(String innerType) {
        if (Arrays.asList(supportedInnerType).contains(innerType)) {
            return true;
        } else if (innerType.startsWith("Array[")) {
            return true;
        } else if (innerType.startsWith("Interval ")) {
            return true;
        } else {
            return false;
        }
    }

    // Not called yet: jdbcTypeToDoris currently maps arrays to STRING (see below).
    private Type createArrayType(JdbcFieldSchema fieldSchema) {
        // TODO: TBL,DB acquire?
        String remoteDbName = null;
        String remoteTableName = null;
        String innerTypeName = fieldSchema.getDataTypeName().orElse(null);
        if (innerTypeName == null || innerTypeName.isEmpty()) {
            LOG.warn(String.format("vertica array type not support inner, inner type must in %s",
                    Arrays.toString(supportedInnerType)));
            return Type.UNSUPPORTED;
        }
        boolean isSupported = supportInnerType(innerTypeName);
        if (!isSupported) {
            LOG.warn(String.format("vertica array type not support inner, inner type must in %s",
                    Arrays.toString(supportedInnerType)));
            return Type.UNSUPPORTED;
        }
        Connection conn = null;
        Statement stmt = null;
        ResultSet arrElementRs = null;
        JdbcFieldSchema innerSchema = null;
        try {
            conn = getConnection();
            stmt = conn.createStatement();
            // Select one element with an always-false predicate just to read its metadata
            arrElementRs = stmt.executeQuery(String.format("SELECT %s[0] ITEM FROM %s.%s WHERE 1=0",
                    fieldSchema.getColumnName(), remoteDbName, remoteTableName));
            innerSchema = new JdbcFieldSchema(arrElementRs.getMetaData(), 1);
        } catch (SQLException ex) {
            LOG.warn("Failed to get item type for column {}: {}",
                    fieldSchema.getColumnName(), Util.getRootCauseMessage(ex));
        } finally {
            close(arrElementRs, null);
            if (stmt != null) {
                try {
                    stmt.close();
                } catch (SQLException ex) {
                    LOG.warn("Failed to close statement: {}", Util.getRootCauseMessage(ex));
                }
            }
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException ex) {
                    LOG.warn("Failed to close connection: {}", Util.getRootCauseMessage(ex));
                }
            }
        }
        // The metadata query failed: avoid passing null to jdbcTypeToDoris
        if (innerSchema == null) {
            return Type.UNSUPPORTED;
        }
        Type arrayInnerType = jdbcTypeToDoris(innerSchema);
        Type arrayType = ArrayType.create(arrayInnerType, true);
        return arrayType;
    }

    @Override
    public List<JdbcFieldSchema> getJdbcColumnsInfo(String remoteDbName, String remoteTableName) {
        Connection conn = null;
        ResultSet rs = null;
        List<JdbcFieldSchema> tableSchema = Lists.newArrayList();
        try {
            conn = getConnection();
            DatabaseMetaData databaseMetaData = conn.getMetaData();
            String catalogName = getCatalogName(conn);
            rs = getRemoteColumns(databaseMetaData, catalogName, remoteDbName, remoteTableName);
            while (rs.next()) {
                JdbcFieldSchema fieldSchema = new JdbcFieldSchema(rs, 0);
                tableSchema.add(fieldSchema);
            }
            // Debug dump of the fetched schema; consider lowering the level from ERROR
            Iterator<JdbcFieldSchema> it = tableSchema.iterator();
            LOG.error(String.format("print table schema: %s.%s", remoteDbName, remoteTableName));
            while (it.hasNext()) {
                LOG.error(String.format("FieldSchema: %s", it.next().toString()));
            }
        } catch (SQLException e) {
            throw new JdbcClientException("failed to get jdbc columns info for remote table `%s.%s`: %s",
                    remoteDbName, remoteTableName, Util.getRootCauseMessage(e));
        } finally {
            close(rs, conn);
        }
        return tableSchema;
    }

    @Override
    protected String[] getTableTypes() {
        return new String[]{"TABLE", "PARTITIONED TABLE", "VIEW", "MATERIALIZED VIEW", "FOREIGN TABLE"};
    }

    @Override
    protected Type jdbcTypeToDoris(JdbcFieldSchema fieldSchema) {
        String verticaType = fieldSchema.getDataTypeName().orElse("unknown");
        switch (verticaType) {
            case "Integer":
                return ScalarType.BIGINT;
            case "Numeric":
            case "Decimal":
            case "Number": {
                int precision = fieldSchema.getColumnSize().orElse(0);
                int scale = fieldSchema.getDecimalDigits().orElse(0);
                return createDecimalOrStringType(precision, scale);
            }
            case "Real":
            case "Float":
            case "Float8":
            case "Double Precision":
                return ScalarType.DOUBLE;
            case "Timestamp":
            case "TimestampTz": {
                // vertica can support microsecond
                int scale = fieldSchema.getDecimalDigits().orElse(0);
                if (scale > 6) {
                    scale = 6;
                }
                return ScalarType.createDatetimeV2Type(scale);
            }
            case "Date":
                return ScalarType.DATEV2;
            case "Boolean":
                return ScalarType.BOOLEAN;
            case "Char":
            case "Character":
            case "Varchar":
            case "Character Varying":
            case "Long Varchar":
            case "Uuid":
            case "Binary":
            case "Varbinary":
            case "Binary Varying":
            case "Long Varbinary":
            case "Raw":
            case "Bytea":
            case "Time":
            case "TimeTz":
                return ScalarType.createStringType();
            default:
                if (verticaType.startsWith("Interval")) {
                    return ScalarType.createStringType();
                } else if (verticaType.startsWith("Array")) {
                    // Arrays are exposed as strings for now
                    LOG.warn(String.format("got array type of Vertica: %s", verticaType));
                    return ScalarType.createStringType();
                    // return createArrayType(fieldSchema);
                }
                LOG.error(String.format("got unsupported type of Vertica: %s", verticaType));
                return Type.UNSUPPORTED;
        }
    }
}
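The type mapping in jdbcTypeToDoris can be tried out without a Doris build. The sketch below is a simplified stand-in: it returns Doris type names as plain strings instead of ScalarType objects, covers only part of the cases, and the class VerticaTypeMappingSketch with its method toDorisTypeName is hypothetical. Integer maps to BIGINT because Vertica integers are 64-bit.

```java
/** Hypothetical stand-in for JdbcVerticaClient.jdbcTypeToDoris, using type names as strings. */
final class VerticaTypeMappingSketch {

    /** Maps a Vertica type name (plus fractional-second scale) to a Doris type name. */
    static String toDorisTypeName(String verticaType, int scale) {
        switch (verticaType) {
            case "Integer":
                return "BIGINT"; // Vertica integers are 64-bit
            case "Real":
            case "Float":
            case "Double Precision":
                return "DOUBLE";
            case "Timestamp":
            case "TimestampTz":
                // DATETIMEV2 keeps at most microseconds, so the scale is capped at 6
                return "DATETIMEV2(" + Math.min(scale, 6) + ")";
            case "Date":
                return "DATEV2";
            case "Boolean":
                return "BOOLEAN";
            case "Char":
            case "Varchar":
            case "Long Varchar":
            case "Uuid":
            case "Time":
                return "STRING";
            default:
                // Interval and array types are exposed as strings
                if (verticaType.startsWith("Interval") || verticaType.startsWith("Array")) {
                    return "STRING";
                }
                return "UNSUPPORTED";
        }
    }

    public static void main(String[] args) {
        System.out.println(toDorisTypeName("Timestamp", 9));
        System.out.println(toDorisTypeName("Array[Int8]", 0));
    }
}
```

Matching on prefixes in the default branch is what lets parameterized names such as "Interval Day to Second" reach the STRING fallback without listing every variant.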

For the directory location, refer to PostgreSQLJdbcExecutor.
Add the class VerticaJdbcExecutor:


package org.apache.doris.jdbc;

import com.google.common.collect.Lists;
import org.apache.doris.common.jni.vec.ColumnType;
import org.apache.doris.common.jni.vec.ColumnType.Type;
import org.apache.doris.common.jni.vec.ColumnValueConverter;
import org.apache.doris.common.jni.vec.VectorTable;
import org.apache.log4j.Logger;

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.math.BigDecimal;
import java.sql.Date;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VerticaJdbcExecutor extends BaseJdbcExecutor {
    private static final Logger LOG = Logger.getLogger(VerticaJdbcExecutor.class);

    public VerticaJdbcExecutor(byte[] thriftParams) throws Exception {
        super(thriftParams);
    }

    @Override
    protected void initializeBlock(int columnCount, String[] replaceStringList, int batchSizeNum,
            VectorTable outputTable) {
        for (int i = 0; i < columnCount; ++i) {
            if (outputTable.getColumnType(i).getType() == Type.DATETIME
                    || outputTable.getColumnType(i).getType() == Type.DATETIMEV2) {
                block.add(new Object[batchSizeNum]);
            } else if (outputTable.getColumnType(i).getType() == Type.STRING
                    || outputTable.getColumnType(i).getType() == Type.ARRAY) {
                block.add(new Object[batchSizeNum]);
            } else {
                block.add(outputTable.getColumn(i).newObjectContainerArray(batchSizeNum));
            }
        }
    }

    @Override
    protected Object getColumnValue(int columnIndex, ColumnType type, String[] replaceStringList)
            throws SQLException {
        switch (type.getType()) {
            case BOOLEAN:
                return resultSet.getObject(columnIndex + 1, Boolean.class);
            case TINYINT:
            case SMALLINT:
            case INT:
                return resultSet.getObject(columnIndex + 1, Integer.class);
            case BIGINT:
                return resultSet.getObject(columnIndex + 1, Long.class);
            case FLOAT:
                return resultSet.getObject(columnIndex + 1, Float.class);
            case DOUBLE:
                return resultSet.getObject(columnIndex + 1, Double.class);
            case DECIMALV2:
            case DECIMAL32:
            case DECIMAL64:
            case DECIMAL128:
                return resultSet.getObject(columnIndex + 1, BigDecimal.class);
            case DATE:
            case DATEV2:
                java.sql.Date date = resultSet.getObject(columnIndex + 1, java.sql.Date.class);
                if (date == null) {
                    return null;
                } else {
                    return date.toLocalDate();
                }
            case DATETIME:
            case DATETIMEV2:
                java.sql.Timestamp ts = resultSet.getObject(columnIndex + 1, java.sql.Timestamp.class);
                if (ts == null) {
                    return null;
                } else {
                    return ts.toLocalDateTime();
                }
            case CHAR:
            case VARCHAR:
            case STRING:
                Object data = resultSet.getObject(columnIndex + 1);
                if (data == null) {
                    return null;
                } else if (data instanceof java.sql.Array) {
                    // Vertica arrays are mapped to STRING, so render them as text
                    java.sql.Array array = (java.sql.Array) data;
                    Object arr = array.getArray();
                    if (arr instanceof long[]) {
                        return Arrays.toString((long[]) arr);
                    } else if (arr instanceof Long[]) {
                        return Arrays.toString((Long[]) arr);
                    } else if (arr instanceof String[]) {
                        return Arrays.toString((String[]) arr);
                    } else if (arr instanceof double[]) {
                        return Arrays.toString((double[]) arr);
                    } else if (arr instanceof Double[]) {
                        return Arrays.toString((Double[]) arr);
                    } else if (arr instanceof char[]) {
                        return Arrays.toString((char[]) arr);
                    } else if (arr instanceof Character[]) {
                        return Arrays.toString((Character[]) arr);
                    } else if (arr instanceof BigDecimal[]) {
                        return Arrays.toString((BigDecimal[]) arr);
                    } else {
                        LOG.error(String.format("unsupported array type of class: %s",
                                arr.getClass().getName()));
                        return Arrays.toString(new String[]{});
                    }
                } else if (data instanceof java.sql.Time) {
                    return timeToString((java.sql.Time) data);
                } else if (data instanceof byte[]) {
                    return new String((byte[]) data);
                } else {
                    return data.toString();
                }
            case ARRAY:
                java.sql.Array array = resultSet.getArray(columnIndex + 1);
                return array == null ? null : convertArrayToList(array.getArray());
            case BINARY:
                try (InputStream is = resultSet.getBinaryStream(columnIndex + 1);
                        ByteArrayOutputStream bs = new ByteArrayOutputStream()) {
                    // getBinaryStream returns null for SQL NULL
                    if (is == null) {
                        return null;
                    }
                    byte[] buffer = new byte[4096];
                    int bytesRead;
                    while ((bytesRead = is.read(buffer)) != -1) {
                        bs.write(buffer, 0, bytesRead);
                    }
                    return bs.toByteArray();
                } catch (Exception ex) {
                    // Report the failure instead of swallowing it and falling through to default
                    throw new SQLException("Failed to read binary column " + (columnIndex + 1), ex);
                }
            default:
                throw new IllegalArgumentException("Unsupported Runtime column type: " + type.getType());
        }
    }

    @Override
    protected ColumnValueConverter getOutputConverter(ColumnType columnType, String replaceString) {
        switch (columnType.getType()) {
            case DATETIME:
            case DATETIMEV2:
                return createConverter(input -> {
                    if (input instanceof Timestamp) {
                        return ((Timestamp) input).toLocalDateTime();
                    } else if (input instanceof OffsetDateTime) {
                        return ((OffsetDateTime) input).toLocalDateTime();
                    } else {
                        return input;
                    }
                }, LocalDateTime.class);
            case CHAR:
                return createConverter(input -> trimSpaces(input.toString()), String.class);
            case VARCHAR:
            case STRING:
                return createConverter(input -> {
                    if (input instanceof java.sql.Time) {
                        return timeToString((java.sql.Time) input);
                    } else if (input instanceof byte[]) {
                        return verticaByteArrayToHexString((byte[]) input);
                    } else {
                        return input.toString();
                    }
                }, String.class);
            case ARRAY:
                return createConverter(
                        (Object input) -> convertArray((List<?>) input, columnType.getChildTypes().get(0)),
                        List.class);
            default:
                return null;
        }
    }

    // Renders binary data as "\x" followed by lowercase hex digits
    private static String verticaByteArrayToHexString(byte[] bytes) {
        StringBuilder hexString = new StringBuilder("\\x");
        for (byte b : bytes) {
            hexString.append(String.format("%02x", b & 0xff));
        }
        return hexString.toString();
    }

    // Copies a Java array of any component type into a List via reflection
    private List<Object> convertArrayToList(Object array) {
        if (array == null) {
            return null;
        }
        int length = java.lang.reflect.Array.getLength(array);
        List<Object> list = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            Object element = java.lang.reflect.Array.get(array, i);
            list.add(element);
        }
        // Debug output; consider lowering the level from ERROR
        LOG.error(String.format("got array size: %d", list.size()));
        return list;
    }

    private List<?> convertArray(List<?> array, ColumnType type) {
        if (array == null) {
            return null;
        }
        // Debug output; consider lowering the level from ERROR
        LOG.error(String.format("starting to convert nonempty array:  %s", array));
        switch (type.getType()) {
            case DATE:
            case DATEV2: {
                List<LocalDate> result = new ArrayList<>();
                for (Object element : array) {
                    if (element == null) {
                        result.add(null);
                    } else {
                        result.add(((Date) element).toLocalDate());
                    }
                }
                return result;
            }
            case DATETIME:
            case DATETIMEV2: {
                List<LocalDateTime> result = new ArrayList<>();
                for (Object element : array) {
                    if (element == null) {
                        result.add(null);
                    } else {
                        if (element instanceof Timestamp) {
                            result.add(((Timestamp) element).toLocalDateTime());
                        } else if (element instanceof OffsetDateTime) {
                            result.add(((OffsetDateTime) element).toLocalDateTime());
                        } else {
                            result.add((LocalDateTime) element);
                        }
                    }
                }
                return result;
            }
            case ARRAY:
                // Nested arrays are converted recursively
                List<List<?>> resultArray = Lists.newArrayList();
                for (Object element : array) {
                    if (element == null) {
                        resultArray.add(null);
                    } else {
                        List<?> nestedList = convertArrayToList(element);
                        resultArray.add(convertArray(nestedList, type.getChildTypes().get(0)));
                    }
                }
                return resultArray;
            default:
                return array;
        }
    }
}
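Two helpers in the executor are pure functions and can be checked in isolation: the hex rendering of binary values and the reflection-based array copy. The sketch below repeats their logic in a standalone class; the class name VerticaValueSketch and the method names toHexString and arrayToList are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical standalone copy of two VerticaJdbcExecutor helpers. */
final class VerticaValueSketch {

    /** Renders bytes as "\x" followed by two lowercase hex digits per byte. */
    static String toHexString(byte[] bytes) {
        StringBuilder hex = new StringBuilder("\\x");
        for (byte b : bytes) {
            // Mask with 0xff so negative bytes do not sign-extend
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Copies any Java array (primitive or object) into a List, boxing primitives. */
    static List<Object> arrayToList(Object array) {
        int length = java.lang.reflect.Array.getLength(array);
        List<Object> list = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            list.add(java.lang.reflect.Array.get(array, i));
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(toHexString(new byte[]{0x01, (byte) 0xab}));
        System.out.println(arrayToList(new long[]{1L, 2L, 3L}));
    }
}
```

Reflection is used for the copy because the element type of the driver's array is not known at compile time: it may be long[], String[], or any other component type.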

Modify the class org.apache.doris.catalog.JdbcResource

    public static final String JDBC_VERTICA = "jdbc:vertica";
    public static final String VERTICA = "VERTICA";

    public static String parseDbType(String url) throws DdlException {
        if (url.startsWith(JDBC_MYSQL) || url.startsWith(JDBC_MARIADB)) {
            return MYSQL;
        }
        // ... existing branches for other databases ...
        // Add the Vertica URL check
        else if (url.startsWith(JDBC_VERTICA)) {
            return VERTICA;
        }
        throw new DdlException("Unsupported jdbc database type, please check jdbcUrl: " + url);
    }
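The prefix check above decides which client a catalog gets, so the JDBC URL has to start with jdbc:vertica. A minimal standalone sketch of that dispatch follows; the class JdbcUrlDispatchSketch is hypothetical, it throws IllegalArgumentException instead of Doris's DdlException to stay self-contained, and the host and database names in the example URL are made up.

```java
/** Hypothetical standalone version of the URL-prefix dispatch in JdbcResource.parseDbType. */
final class JdbcUrlDispatchSketch {
    static final String JDBC_MYSQL = "jdbc:mysql";
    static final String JDBC_VERTICA = "jdbc:vertica";

    /** Returns the database type name for a JDBC URL, or throws if the prefix is unknown. */
    static String parseDbType(String url) {
        if (url.startsWith(JDBC_MYSQL)) {
            return "MYSQL";
        } else if (url.startsWith(JDBC_VERTICA)) {
            return "VERTICA";
        }
        throw new IllegalArgumentException(
                "Unsupported jdbc database type, please check jdbcUrl: " + url);
    }

    public static void main(String[] args) {
        // Made-up host and database; 5433 is Vertica's default port
        System.out.println(parseDbType("jdbc:vertica://vertica-host:5433/analytics"));
    }
}
```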

Modify the class org.apache.doris.datasource.jdbc.client.JdbcClient

    public static JdbcClient createJdbcClient(JdbcClientConfig jdbcClientConfig) {
        String dbType = parseDbType(jdbcClientConfig.getJdbcUrl());
        switch (dbType) {
            case JdbcResource.MYSQL:
                return new JdbcMySQLClient(jdbcClientConfig);
            // ... existing cases for other databases ...
            // Add the Vertica data source
            case JdbcResource.VERTICA:
                return new JdbcVerticaClient(jdbcClientConfig);
            default:
                throw new IllegalArgumentException("Unsupported DB type: " + dbType);
        }
    }

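Modify the class org.apache.doris.jdbc.JdbcExecutorFactory

    public static String getExecutorClass(TOdbcTableType type) {
        switch (type) {
            case MYSQL:
            case OCEANBASE:
                return "org/apache/doris/jdbc/MySQLJdbcExecutor";
            // ... existing cases for other databases ...
            // Add the class name of the Vertica executor
            case VERTICA:
                return "org/apache/doris/jdbc/VerticaJdbcExecutor";
            default:
                throw new IllegalArgumentException("Unsupported jdbc type: " + type);
        }
    }

The class org.apache.doris.thrift.TOdbcTableType is source code generated by Thrift, as its annotation shows: @javax.annotation.Generated(value = "Autogenerated by Thrift Compiler (0.16.0)", date = "2025-10-16"). The VERTICA value used above therefore has to be added to the Thrift template file the class is generated from, not to the generated Java source.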

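Modify the class org.apache.doris.datasource.jdbc.source.JdbcScanNode

    private String getJdbcQueryStr() {
        // ... existing code that builds the query ...
        // Other DataBase use limit do top n
        if (shouldPushDownLimit()
                && (jdbcType == TOdbcTableType.MYSQL
                // ... existing database types ...
                // Add Vertica
                || jdbcType == TOdbcTableType.VERTICA)) {
            sql.append(" LIMIT ").append(limit);
        }
        // ... rest of the method unchanged ...
    }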
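The effect of this change on the generated query can be illustrated on its own. The sketch below is a simplified model and not the Doris implementation: the enum DbType is a hypothetical stand-in for TOdbcTableType, and the method withLimit is invented for the example. It appends LIMIT only for types that accept that syntax and leaves other types, such as Oracle, untouched.

```java
/** Hypothetical model of the LIMIT pushdown branch in JdbcScanNode.getJdbcQueryStr. */
final class LimitPushdownSketch {

    /** Stand-in for TOdbcTableType; only three values are modeled here. */
    enum DbType { MYSQL, VERTICA, ORACLE }

    /** Appends " LIMIT n" when the remote database accepts that syntax. */
    static String withLimit(String baseSql, DbType type, long limit) {
        StringBuilder sql = new StringBuilder(baseSql);
        // A negative limit means "no limit" in this sketch
        if (limit >= 0 && (type == DbType.MYSQL || type == DbType.VERTICA)) {
            sql.append(" LIMIT ").append(limit);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(withLimit("SELECT id FROM public.orders", DbType.VERTICA, 10));
    }
}
```

Without the added VERTICA branch the limit would not be appended to the remote query, so Vertica would return the full result set and Doris would have to discard the extra rows itself.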
