Config

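1: HugeGraph Configuration
2: HugeGraph Configuration Options
3: HugeGraph Built-in User Permissions and Extended Permission Configuration and Usage
4: Configuring HugeGraphServer to Use HTTPS
5: HugeGraph-Computer Configuration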

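1 - HugeGraph Configuration

1 Overview

The configuration files live in the hugegraph-release/conf directory; all configuration for the service and for the graphs themselves is located there.

The main configuration files are gremlin-server.yaml, rest-server.properties, and hugegraph.properties.

HugeGraphServer integrates GremlinServer and RestServer internally; gremlin-server.yaml and rest-server.properties configure these two servers respectively.

GremlinServer: accepts Gremlin statements from users, parses them, and then calls the Core code.
RestServer: provides the RESTful API and calls the corresponding Core API for each HTTP request. If the request body is a Gremlin statement, it is forwarded to GremlinServer, which performs the operation on the graph data.

The three configuration files are described one by one below.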

2 gremlin-server.yaml

The default content of gremlin-server.yaml is as follows:

# host and port of gremlin server, need to be consistent with host and port in rest-server.properties
#host: 127.0.0.1
#port: 8182

# Timeout for Gremlin queries (in milliseconds)
evaluationTimeout: 30000

channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
# Do not set graphs here; this will be handled once dynamically adding graphs is supported
graphs: {
}
scriptEngines: {
  gremlin-groovy: {
    staticImports: [
      org.opencypher.gremlin.process.traversal.CustomPredicates.*,
      org.opencypher.gremlin.traversal.CustomFunctions.*
    ],
    plugins: {
      org.apache.hugegraph.plugin.HugeGraphGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {
        classImports: [
          java.lang.Math,
          org.apache.hugegraph.backend.id.IdGenerator,
          org.apache.hugegraph.type.define.Directions,
          org.apache.hugegraph.type.define.NodeRole,
          org.apache.hugegraph.traversal.algorithm.CollectionPathsTraverser,
          org.apache.hugegraph.traversal.algorithm.CountTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizedCrosspointsTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.FusiformSimilarityTraverser,
          org.apache.hugegraph.traversal.algorithm.HugeTraverser,
          org.apache.hugegraph.traversal.algorithm.JaccardSimilarTraverser,
          org.apache.hugegraph.traversal.algorithm.KneighborTraverser,
          org.apache.hugegraph.traversal.algorithm.KoutTraverser,
          org.apache.hugegraph.traversal.algorithm.MultiNodeShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.NeighborRankTraverser,
          org.apache.hugegraph.traversal.algorithm.PathsTraverser,
          org.apache.hugegraph.traversal.algorithm.PersonalRankTraverser,
          org.apache.hugegraph.traversal.algorithm.SameNeighborTraverser,
          org.apache.hugegraph.traversal.algorithm.ShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SingleSourceShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SubGraphTraverser,
          org.apache.hugegraph.traversal.algorithm.TemplatePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.steps.EdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.RepeatEdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.WeightedEdgeStep,
          org.apache.hugegraph.traversal.optimize.ConditionP,
          org.apache.hugegraph.traversal.optimize.Text,
          org.apache.hugegraph.traversal.optimize.TraversalUtil,
          org.apache.hugegraph.util.DateUtil,
          org.opencypher.gremlin.traversal.CustomFunctions,
          org.opencypher.gremlin.traversal.CustomPredicate
        ],
        methodImports: [
          java.lang.Math#*,
          org.opencypher.gremlin.traversal.CustomPredicate#*,
          org.opencypher.gremlin.traversal.CustomFunctions#*
        ]
      },
      org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {
        files: [scripts/empty-sample.groovy]
      }
    }
  }
}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
metrics: {
  consoleReporter: {enabled: false, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: ./metrics/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}
}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false
}

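There are many options above, but for now only two need attention: channelizer and graphs.

graphs: the graphs that GremlinServer opens at startup. This option is a map whose key is the graph name and whose value is the path to that graph's configuration file;
channelizer: GremlinServer can communicate with clients in two ways, WebSocket and HTTP (the default). With WebSocket, users can quickly try out HugeGraph's features through Gremlin-Console, but large-scale data import is not supported. HTTP is recommended, since HugeGraph's peripheral components are all built on HTTP;

By default GremlinServer serves on localhost:8182. To change this, set host and port:

host: the hostname or IP of the machine where GremlinServer is deployed. HugeGraphServer does not currently support distributed deployment, and GremlinServer is not exposed directly to users;
port: the port on which GremlinServer listens on that machine;

The matching option gremlinserver.url=http://host:port must also be added to rest-server.properties.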
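
Because host and port in gremlin-server.yaml have to agree with gremlinserver.url in rest-server.properties, a small check can catch mismatches before the server is started. The following is a minimal sketch in Python; the helper names gremlin_url and url_matches are hypothetical and not part of HugeGraph, and the values are passed in directly rather than read from the files.

```python
from urllib.parse import urlsplit


def gremlin_url(host: str, port: int) -> str:
    """Build the value expected for gremlinserver.url (hypothetical helper)."""
    return f"http://{host}:{port}"


def url_matches(url: str, host: str, port: int) -> bool:
    """Return True when the URL points at the given host and port."""
    parts = urlsplit(url)
    return parts.hostname == host and parts.port == port


if __name__ == "__main__":
    # Values as they would appear in gremlin-server.yaml (host, port)
    # and in rest-server.properties (gremlinserver.url).
    print(gremlin_url("127.0.0.1", 8182))
    print(url_matches("http://127.0.0.1:8182", "127.0.0.1", 8182))
```

The comparison is purely textual on the host name, so localhost and 127.0.0.1 are treated as different values in this sketch.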

3 rest-server.properties

The default content of rest-server.properties is as follows:

# bind url
# could use '0.0.0.0' or specified (real)IP to expose external network access
restserver.url=http://127.0.0.1:8080
#restserver.enable_graphspaces_filter=false
# gremlin server url, need to be consistent with host and port in gremlin-server.yaml
#gremlinserver.url=http://127.0.0.1:8182

graphs=./conf/graphs

# The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0
batch.max_write_ratio=80
batch.max_write_threads=0

# configuration of arthas
arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=127.0.0.1
arthas.disabled_commands=jad

# authentication configs
# choose 'org.apache.hugegraph.auth.StandardAuthenticator' or
# 'org.apache.hugegraph.auth.ConfigAuthenticator'
#auth.authenticator=

# for StandardAuthenticator mode
#auth.graph_store=hugegraph
# auth client config
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897

# for ConfigAuthenticator mode
#auth.admin_token=
#auth.user_tokens=[]

# TODO: Deprecated & removed later (useless from version 1.5.0)
# rpc server configs for multi graph-servers or raft-servers
#rpc.server_host=127.0.0.1
#rpc.server_port=8091
#rpc.server_timeout=30

# rpc client configs (like enable to keep cache consistency)
#rpc.remote_url=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093
#rpc.client_connect_timeout=20
#rpc.client_reconnect_period=10
#rpc.client_read_timeout=40
#rpc.client_retries=3
#rpc.client_load_balancer=consistentHash

# raft group initial peers
#raft.group_peers=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093

# lightweight load balancing (beta)
server.id=server-1
server.role=master

# slow query log
log.slow_query_threshold=1000

# jvm(in-heap) memory usage monitor, set 1 to disable it
memory_monitor.threshold=0.85
memory_monitor.period=2000

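restserver.url: the URL on which RestServer provides service; adjust it to your environment. If the server cannot be reached from other IP addresses, try setting it to a specific address, or to http://0.0.0.0 to listen for requests from any IP address. The latter is convenient, but pay attention to which networks can then reach the service;
graphs: RestServer also needs to open graphs at startup. This option is a map whose key is the graph name and whose value is the path to that graph's configuration file;

Note: both gremlin-server.yaml and rest-server.properties contain a graphs option, and the init-store command initializes the graphs listed under graphs in gremlin-server.yaml.

gremlinserver.url: the URL at which GremlinServer serves RestServer. It defaults to http://localhost:8182; if you change it, it must match host and port in gremlin-server.yaml;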

4 hugegraph.properties

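hugegraph.properties is a class of files: if the system holds multiple graphs, there will be multiple similar files. This file configures parameters related to graph storage and queries. Its default content is as follows:

# gremlin entrance to create graph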
gremlin.graph=org.apache.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=100000
# vertex-cache default is 10,000,000 entries, 10min expired
#vertex.cache_capacity=10000000
#vertex.cache_expire=600
# edge-cache default is 1,000,000 entries, 10min expired
#edge.cache_capacity=1000000
#edge.cache_expire=600

# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph

raft.mode=false
raft.safe_read=false
raft.use_snapshot=false
raft.endpoint=127.0.0.1:8281
raft.group_peers=127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000

# if use 'ikanalyzer', need download jar from 'https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar' to lib directory
search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

# rocksdb backend config
#rocksdb.data_path=/path/to/disk
#rocksdb.wal_path=/path/to/disk

# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3

# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64

# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.ssl_mode=false

# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=

# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32

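Focus on the uncommented options:

gremlin.graph: the startup entry point for GremlinServer; users should not modify this option;
backend: the backend store to use; available values are memory, cassandra, scylladb, mysql, hbase, postgresql, and rocksdb;
serializer: mainly for internal use; it serializes schema, vertices, and edges to the backend. Available values are text, cassandra, scylladb, and binary. (Note: for the rocksdb backend the value must be binary; for the other backends, backend and serializer must have the same value, e.g. hbase for the hbase backend.)
store: the name of the database in which the graph is stored in the backend; in cassandra and scylladb this is the keyspace name. The value is unrelated to the graph name used in GremlinServer and RestServer, but using the same name is recommended for clarity;
cassandra.host: only meaningful when backend is cassandra or scylladb; the seeds of the cassandra/scylladb cluster;
cassandra.port: only meaningful when backend is cassandra or scylladb; the native port of the cassandra/scylladb cluster;
rocksdb.data_path: only meaningful when backend is rocksdb; the data directory of rocksdb;
rocksdb.wal_path: only meaningful when backend is rocksdb; the log (WAL) directory of rocksdb;
admin.token: a token used to fetch the server's configuration, for example: http://localhost:8080/graphs/hugegraph/conf?token=162f7848-0b6d-4faf-b557-3a0797869c55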
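
The pairing rule between backend and serializer described in the note on serializer can be written down as a small lookup. This is only a sketch of that rule in Python, not HugeGraph's own validation: the function names are hypothetical, and the memory backend is deliberately left out because the note does not spell out its serializer.

```python
from typing import Optional

# Backends whose serializer carries the same name as the backend,
# following the note on the serializer option.
SAME_NAME_BACKENDS = {"cassandra", "scylladb", "mysql", "hbase", "postgresql"}


def expected_serializer(backend: str) -> Optional[str]:
    """Return the serializer the note implies, or None if not covered here."""
    if backend == "rocksdb":
        return "binary"
    if backend in SAME_NAME_BACKENDS:
        return backend
    return None  # e.g. memory: not covered by this sketch (assumption)


def check_pair(backend: str, serializer: str) -> bool:
    """True when the backend/serializer pair agrees with the note."""
    expected = expected_serializer(backend)
    return expected is not None and serializer == expected


if __name__ == "__main__":
    print(check_pair("rocksdb", "binary"))
    print(check_pair("mysql", "mysql"))
    print(check_pair("rocksdb", "text"))
```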

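5 Multi-Graph Configuration

The system can hold multiple graphs, and each graph can use a different backend. For example, of the graphs hugegraph_rocksdb and hugegraph_mysql, hugegraph_rocksdb uses RocksDB as its backend and hugegraph_mysql uses MySQL.

The configuration is straightforward:

[Optional]: modify rest-server.properties

Set the directory that holds the graph configuration files through the graphs option in rest-server.properties. The default is graphs=./conf/graphs; to use another directory, change the graphs option, for example to graphs=/etc/hugegraph/graphs. The default entry looks like this:

graphs=./conf/graphs

Under conf/graphs, create hugegraph_mysql_backend.properties and hugegraph_rocksdb_backend.properties by modifying copies of hugegraph.properties.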

The modified part of hugegraph_mysql_backend.properties is as follows:

backend=mysql
serializer=mysql

store=hugegraph_mysql

# mysql backend config
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.url=jdbc:mysql://127.0.0.1:3306
jdbc.username=root
jdbc.password=123456
jdbc.reconnect_max_times=3
jdbc.reconnect_interval=3
jdbc.ssl_mode=false

The modified part of hugegraph_rocksdb_backend.properties is as follows:

backend=rocksdb
serializer=binary

store=hugegraph_rocksdb

Stop the Server, run init-store.sh to initialize (this creates the databases for the new graphs), and restart the Server:

$ ./bin/stop-hugegraph.sh

$ ./bin/init-store.sh

Initializing HugeGraph Store...
2023-06-11 14:16:14 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
2023-06-11 14:16:14 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_rocksdb_backend.properties
...
2023-06-11 14:16:15 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_rocksdb' has been initialized
2023-06-11 14:16:15 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_mysql_backend.properties
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_mysql' has been initialized
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Close graph standardhugegraph[hugegraph_rocksdb]
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.HugeFactory - HugeFactory shutdown
2023-06-11 14:16:16 [hugegraph-shutdown] [INFO] o.a.h.HugeFactory - HugeGraph is shutting down
Initialization finished.

$ ./bin/start-hugegraph.sh

Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)...OK
Started [pid 21614]

List the created graphs:

curl http://127.0.0.1:8080/graphs/

{"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}

View the information of a specific graph:

curl http://127.0.0.1:8080/graphs/hugegraph_mysql_backend

{"name":"hugegraph_mysql","backend":"mysql"}

curl http://127.0.0.1:8080/graphs/hugegraph_rocksdb_backend

{"name":"hugegraph_rocksdb","backend":"rocksdb"}
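
The same check can be scripted instead of run by hand with curl. The sketch below, in Python with only the standard library, assumes the server listens on http://127.0.0.1:8080 and that GET /graphs returns JSON of the shape shown above; the function names are hypothetical. Parsing is kept separate from the HTTP call so it can be exercised without a running server.

```python
import json
from typing import List
from urllib.request import urlopen


def parse_graph_names(body: str) -> List[str]:
    """Extract the graph names from a /graphs response body."""
    return list(json.loads(body)["graphs"])


def list_graphs(base_url: str = "http://127.0.0.1:8080") -> List[str]:
    """Fetch the graph list from a running RestServer (needs a live server)."""
    with urlopen(f"{base_url}/graphs") as resp:
        return parse_graph_names(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with the sample response from this section.
    sample = '{"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}'
    print(parse_graph_names(sample))
```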

2 - HugeGraph Configuration Options

Gremlin Server Options

Corresponding configuration file: gremlin-server.yaml

config option	default value	description
host	127.0.0.1	The host or ip of Gremlin Server.
port	8182	The listening port of Gremlin Server.
graphs	hugegraph: conf/hugegraph.properties	The map of graphs with name and config file path.
evaluationTimeout	30000	The timeout for gremlin script execution (in milliseconds).
channelizer	org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer	Indicates the protocol which the Gremlin Server provides service.
authentication	authenticator: org.apache.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties}	The authenticator and config(contains tokens path) of authentication mechanism.

Rest Server & API Options

Corresponding configuration file: rest-server.properties

config option	default value	description
graphs	[hugegraph:conf/hugegraph.properties]	The map of graphs’ name and config file.
server.id	server-1	The id of rest server, used for license verification.
server.role	master	The role of nodes in the cluster, available types are [master, worker, computer]
restserver.url	http://127.0.0.1:8080	The url for listening of rest server.
ssl.keystore_file	server.keystore	The path of server keystore file used when https protocol is enabled.
ssl.keystore_password		The password of the path of the server keystore file used when the https protocol is enabled.
restserver.max_worker_threads	2 * CPUs	The maximum worker threads of rest server.
restserver.min_free_memory	64	The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value.
restserver.request_timeout	30	The time in seconds within which a request must complete, -1 means no timeout.
restserver.connection_idle_timeout	30	The time in seconds to keep an inactive connection alive, -1 means no timeout.
restserver.connection_max_requests	256	The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited.
gremlinserver.url	http://127.0.0.1:8182	The url of gremlin server.
gremlinserver.max_route	8	The max route number for gremlin server.
gremlinserver.timeout	30	The timeout in seconds of waiting for gremlin server.
batch.max_edges_per_batch	500	The maximum number of edges submitted per batch.
batch.max_vertices_per_batch	500	The maximum number of vertices submitted per batch.
batch.max_write_ratio	50	The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0.
batch.max_write_threads	0	The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads.
auth.authenticator		The class path of authenticator implementation. e.g., org.apache.hugegraph.auth.StandardAuthenticator, or org.apache.hugegraph.auth.ConfigAuthenticator.
auth.admin_token	162f7848-0b6d-4faf-b557-3a0797869c55	Token for administrator operations, only for org.apache.hugegraph.auth.ConfigAuthenticator.
auth.graph_store	hugegraph	The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator.
auth.user_tokens	[hugegraph:9fd95c9c-711b-415b-b85f-d4df46ba5c31]	The map of user tokens with name and password, only for org.apache.hugegraph.auth.ConfigAuthenticator.
auth.audit_log_rate	1000.0	The max rate of audit log output per user, default value is 1000 records per second.
auth.cache_capacity	10240	The max cache capacity of each auth cache item.
auth.cache_expire	600	The expiration time in seconds of auth cache.
auth.remote_url		If the address is empty, it provides the auth service; otherwise it is an auth client and also provides the auth service through rpc forwarding. The remote url can be set to multiple addresses, which are separated by ‘,’.
auth.token_expire	86400	The expiration time in seconds after the token is created.
auth.token_secret	FXQXbJtbCLxODc6tGci732pkH1cyf8Qg	Secret key of HS256 algorithm.
exception.allow_trace	false	Whether to allow exception trace stack.
memory_monitor.threshold	0.85	The threshold of JVM(in-heap) memory usage monitoring, 1 means disabling this function.
memory_monitor.period	2000	The period in ms of JVM(in-heap) memory usage monitoring.
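
The interaction between batch.max_write_threads, batch.max_write_ratio, and restserver.max_worker_threads in the table above can be illustrated with a short calculation. This Python sketch assumes the ratio is a percentage (so 50 means 50%) and that the result is truncated to an integer; both points are assumptions, and the function name is hypothetical.

```python
import os
from typing import Optional


def effective_write_threads(max_write_threads: int,
                            max_write_ratio: int,
                            max_worker_threads: Optional[int] = None) -> int:
    """Estimate the batch-writing thread count from the three options."""
    if max_worker_threads is None:
        # Table default for restserver.max_worker_threads: 2 * CPUs.
        max_worker_threads = 2 * (os.cpu_count() or 1)
    if max_write_threads > 0:
        # An explicit thread count takes precedence over the ratio.
        return max_write_threads
    # Assumption: the ratio is a percentage of the worker threads.
    return max_worker_threads * max_write_ratio // 100


if __name__ == "__main__":
    print(effective_write_threads(0, 50, 16))
    print(effective_write_threads(0, 80, 16))
```

With 16 worker threads, a ratio of 50 gives 8 write threads under these assumptions, and a ratio of 80 gives 12.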

Basic Options

The basic options and the backend options correspond to the configuration file {graph-name}.properties, e.g. hugegraph.properties

config option	default value	description
gremlin.graph	org.apache.hugegraph.HugeFactory	Gremlin entrance to create graph.
backend	rocksdb	The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql].
serializer	binary	The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql].
store	hugegraph	The database name like Cassandra Keyspace.
store.connection_detect_interval	600	The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time.
store.graph	g	The graph table name, which store vertex, edge and property.
store.schema	m	The schema table name, which store meta data.
store.system	s	The system table name, which store system data.
schema.illegal_name_regex	.*\s+$|~.*	The regex that specifies the illegal format for schema names.
schema.cache_capacity	10000	The max cache size(items) of schema cache.
vertex.cache_type	l2	The type of vertex cache, allowed values are [l1, l2].
vertex.cache_capacity	10000000	The max cache size(items) of vertex cache.
vertex.cache_expire	600	The expire time in seconds of vertex cache.
vertex.check_customized_id_exist	false	Whether to check the vertices exist for those using customized id strategy.
vertex.default_label	vertex	The default vertex label.
vertex.tx_capacity	10000	The max size(items) of vertices(uncommitted) in transaction.
vertex.check_adjacent_vertex_exist	false	Whether to check the adjacent vertices of edges exist.
vertex.lazy_load_adjacent_vertex	true	Whether to lazy load adjacent vertices of edges.
vertex.part_edge_commit_size	5000	Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled.
vertex.encode_primary_key_number	true	Whether to encode number value of primary key in vertex id.
vertex.remove_left_index_at_overwrite	false	Whether to remove left index at overwrite.
edge.cache_type	l2	The type of edge cache, allowed values are [l1, l2].
edge.cache_capacity	1000000	The max cache size(items) of edge cache.
edge.cache_expire	600	The expiration time in seconds of edge cache.
edge.tx_capacity	10000	The max size(items) of edges(uncommitted) in transaction.
query.page_size	500	The size of each page when querying by paging.
query.batch_size	1000	The size of each batch when querying by batch.
query.ignore_invalid_data	true	Whether to ignore invalid data of vertex or edge.
query.index_intersect_threshold	1000	The maximum number of intermediate results to intersect indexes when querying by multiple single index properties.
query.ramtable_edges_capacity	20000000	The maximum number of edges in ramtable, include OUT and IN edges.
query.ramtable_enable	false	Whether to enable ramtable for query of adjacent edges.
query.ramtable_vertices_capacity	10000000	The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity.
query.optimize_aggregate_by_index	false	Whether to optimize aggregate query(like count) by index.
oltp.concurrent_depth	10	The min depth to enable concurrent oltp algorithm.
oltp.concurrent_threads	10	Thread number to concurrently execute oltp algorithm.
oltp.collection_type	EC	The implementation type of collections used in oltp algorithm.
rate_limit.read	0	The max rate(times/s) to execute query of vertices/edges.
rate_limit.write	0	The max rate(items/s) to add/update/delete vertices/edges.
task.wait_timeout	10	Timeout in seconds for waiting for the task to complete, such as when truncating or clearing the backend.
task.input_size_limit	16777216	The job input size limit in bytes.
task.result_size_limit	16777216	The job result size limit in bytes.
task.sync_deletion	false	Whether to delete schema or expired data synchronously.
task.ttl_delete_batch	1	The batch size used to delete expired data.
computer.config	/conf/computer.yaml	The config file path of computer job.
search.text_analyzer	ikanalyzer	Choose a text analyzer for searching the vertex/edge properties, available types are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer]. If using ‘ikanalyzer’, download the jar from ‘https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar’ into the lib directory.
search.text_analyzer_mode	smart	Specify the mode for the text analyzer, the available mode of analyzer are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}.
snowflake.datecenter_id	0	The datacenter id of snowflake id generator.
snowflake.force_string	false	Whether to force the snowflake long id to be a string.
snowflake.worker_id	0	The worker id of snowflake id generator.
raft.mode	false	Whether the backend storage works in raft mode.
raft.safe_read	false	Whether to use linearly consistent read.
raft.use_snapshot	false	Whether to use snapshot.
raft.endpoint	127.0.0.1:8281	The peerid of current raft node.
raft.group_peers	127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283	The peers of current raft group.
raft.path	./raft-log	The log path of current raft node.
raft.use_replicator_pipeline	true	Whether to use the replicator pipeline; when it is turned on, multiple logs can be sent in parallel, and the next log doesn’t have to wait for the ack message of the current log to be sent.
raft.election_timeout	10000	Timeout in milliseconds to launch a round of election.
raft.snapshot_interval	3600	The interval in seconds to trigger snapshot save.
raft.backend_threads	current CPU v-cores	The thread number used to apply task to backend.
raft.read_index_threads	8	The thread number used to execute reading index.
raft.apply_batch	1	The apply batch size to trigger disruptor event handler.
raft.queue_size	16384	The disruptor buffers size for jraft RaftNode, StateMachine and LogManager.
raft.queue_publish_timeout	60	The timeout in seconds when publishing an event into the disruptor.
raft.rpc_threads	80	The rpc threads for jraft RPC layer.
raft.rpc_connect_timeout	5000	The rpc connect timeout for jraft rpc.
raft.rpc_timeout	60000	The rpc timeout for jraft rpc.
raft.rpc_buf_low_water_mark	10485760	The ChannelOutboundBuffer’s low water mark of netty; when the buffer size is less than this size, the method ChannelOutboundBuffer.isWritable() will return true, which means low downstream pressure or a good network.
raft.rpc_buf_high_water_mark	20971520	The ChannelOutboundBuffer’s high water mark of netty; only when the buffer size exceeds this size will the method ChannelOutboundBuffer.isWritable() return false, which means that the downstream pressure is too great to process the request or the network is very congested, and upstream needs to limit its rate at this time.
raft.read_strategy	ReadOnlyLeaseBased	The linearizability of read strategy.

RPC Server Options

config option	default value	description
rpc.client_connect_timeout	20	The timeout(in seconds) of rpc client connect to rpc server.
rpc.client_load_balancer	consistentHash	The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is ‘consistentHash’, means forwarding by request parameters.
rpc.client_read_timeout	40	The timeout(in seconds) of rpc client read from rpc server.
rpc.client_reconnect_period	10	The period(in seconds) of rpc client reconnect to rpc server.
rpc.client_retries	3	Failed retry number of rpc client calls to rpc server.
rpc.config_order	999	Sofa rpc configuration file loading order; the larger the value, the later it is loaded.
rpc.logger_impl	com.alipay.sofa.rpc.log.SLF4JLoggerImpl	Sofa rpc log implementation class.
rpc.protocol	bolt	Rpc communication protocol; client and server must be set to the same value.
rpc.remote_url		The remote urls of rpc peers; it can be set to multiple addresses, which are separated by ‘,’, and an empty value means not enabled.
rpc.server_adaptive_port	false	Whether the bound port is adaptive, if it’s enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts.
rpc.server_host		The hosts/ips bound by rpc server to provide services, empty value means not enabled.
rpc.server_port	8090	The port bound by rpc server to provide services.
rpc.server_timeout	30	The timeout(in seconds) of rpc server execution.

Cassandra Backend Options

config option	default value	description
backend		Must be set to `cassandra`.
serializer		Must be set to `cassandra`.
cassandra.host	localhost	The seeds hostname or ip address of cassandra cluster.
cassandra.port	9042	The seeds port address of cassandra cluster.
cassandra.connect_timeout	5	The cassandra driver connect server timeout(seconds).
cassandra.read_timeout	20	The cassandra driver read from server timeout(seconds).
cassandra.keyspace.strategy	SimpleStrategy	The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy.
cassandra.keyspace.replication	[3]	The keyspace replication factor of SimpleStrategy, like ‘[3]’. Or replicas in each datacenter of NetworkTopologyStrategy, like ‘[dc1:2,dc2:1]’.
cassandra.username		The username to use to login to cassandra cluster.
cassandra.password		The password corresponding to cassandra.username.
cassandra.compression_type	none	The compression algorithm of cassandra transport: none/snappy/lz4.
cassandra.jmx_port	7199	The port of JMX API service for cassandra.
cassandra.aggregation_timeout	43200	The timeout in seconds of waiting for aggregation.
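
The two forms of cassandra.keyspace.replication, ‘[3]’ and ‘[dc1:2,dc2:1]’, correspond to the two shapes of a CQL keyspace replication map. The Python sketch below shows one way the option values could be turned into such a map; the function names are hypothetical and this is not HugeGraph's own parsing code.

```python
from typing import Dict, List, Union


def parse_list(value: str) -> List[str]:
    """Split a bracketed option value such as '[dc1:2,dc2:1]' into items."""
    inner = value.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


def replication_map(strategy: str,
                    replication: List[str]) -> Dict[str, Union[str, int]]:
    """Build a CQL-style replication map from strategy and replication."""
    result: Dict[str, Union[str, int]] = {"class": strategy}
    if strategy == "SimpleStrategy":
        # A single replication factor, e.g. '[3]'.
        result["replication_factor"] = int(replication[0])
    elif strategy == "NetworkTopologyStrategy":
        # One 'datacenter:replicas' entry per datacenter.
        for item in replication:
            datacenter, replicas = item.split(":")
            result[datacenter.strip()] = int(replicas)
    else:
        raise ValueError(f"unsupported strategy: {strategy}")
    return result


if __name__ == "__main__":
    print(replication_map("SimpleStrategy", parse_list("[3]")))
    print(replication_map("NetworkTopologyStrategy", parse_list("[dc1:2,dc2:1]")))
```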

ScyllaDB Backend Options

config option	default value	description
backend		Must be set to `scylladb`.
serializer		Must be set to `scylladb`.

All other options are the same as for the Cassandra backend.

RocksDB Backend Options

config option	default value	description
backend		Must be set to `rocksdb`.
serializer		Must be set to `binary`.
rocksdb.data_disks	[]	The optimized disks for storing data of RocksDB. The format of each element: `STORE/TABLE: /path/disk`. Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap]
rocksdb.data_path	rocksdb-data/data	The path for storing data of RocksDB.
rocksdb.wal_path	rocksdb-data/wal	The path for storing WAL of RocksDB.
rocksdb.allow_mmap_reads	false	Allow the OS to mmap file for reading sst tables.
rocksdb.allow_mmap_writes	false	Allow the OS to mmap file for writing.
rocksdb.block_cache_capacity	8388608	The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache.
rocksdb.bloom_filter_bits_per_key	-1	The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter.
rocksdb.bloom_filter_block_based_mode	false	Use block based filter rather than full filter.
rocksdb.bloom_filter_whole_key_filtering	true	True if place whole keys in the bloom filter, else place the prefix of keys.
rocksdb.bottommost_compression	NO_COMPRESSION	The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.bulkload_mode	false	Switch to the mode to bulk load data into RocksDB.
rocksdb.cache_index_and_filter_blocks	false	Indicating if we’d put index/filter blocks to the block cache.
rocksdb.compaction_style	LEVEL	Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO.
rocksdb.compression	SNAPPY_COMPRESSION	The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.compression_per_level	[NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION]	The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd.
rocksdb.delayed_write_rate	16777216	The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind.
rocksdb.log_level	INFO	The info log level of RocksDB.
rocksdb.max_background_jobs	8	Maximum number of concurrent background jobs, including flushes and compactions.
rocksdb.level_compaction_dynamic_level_bytes	false	Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended.
rocksdb.max_bytes_for_level_base	536870912	The upper-bound of the total size of level-1 files in bytes.
rocksdb.max_bytes_for_level_multiplier	10.0	The ratio between the total size of level (L+1) files and the total size of level L files for all L.
rocksdb.max_open_files	-1	The maximum number of open files that can be cached by RocksDB, -1 means no limit.
rocksdb.max_subcompactions	4	The value represents the maximum number of threads per compaction job.
rocksdb.max_write_buffer_number	6	The maximum number of write buffers that are built up in memory.
rocksdb.max_write_buffer_number_to_maintain	0	The total maximum number of write buffers to maintain in memory.
rocksdb.min_write_buffer_number_to_merge	2	The minimum number of write buffers that will be merged together.
rocksdb.num_levels	7	Set the number of levels for this database.
rocksdb.optimize_filters_for_hits	false	This flag allows us to not store filters for the last level.
rocksdb.optimize_mode	true	Optimize for heavy workloads and big datasets.
rocksdb.pin_l0_filter_and_index_blocks_in_cache	false	Indicating if we’d put index/filter blocks to the block cache.
rocksdb.sst_path		The path for ingesting SST file into RocksDB.
rocksdb.target_file_size_base	67108864	The target file size for compaction in bytes.
rocksdb.target_file_size_multiplier	1	The size ratio between a level L file and a level (L+1) file.
rocksdb.use_direct_io_for_flush_and_compaction	false	Enable the OS to use direct read/writes in flush and compaction.
rocksdb.use_direct_reads	false	Enable the OS to use direct I/O for reading sst tables.
rocksdb.write_buffer_size	134217728	Amount of data in bytes to build up in memory.
rocksdb.max_manifest_file_size	104857600	The max size of manifest file in bytes.
rocksdb.skip_stats_update_on_db_open	false	Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics.
rocksdb.max_file_opening_threads	16	The max number of threads used to open files.
rocksdb.max_total_wal_size	0	Total size of WAL files in bytes. Once the WALs exceed this size, RocksDB starts forcing flushes of the related column families; 0 means no limit.
rocksdb.db_write_buffer_size	0	Total size of write buffers in bytes across all column families, 0 means no limit.
rocksdb.delete_obsolete_files_period	21600	The periodicity in seconds when obsolete files get deleted, 0 means always do full purge.
rocksdb.hard_pending_compaction_bytes_limit	274877906944	The hard limit to impose on pending compaction in bytes.
rocksdb.level0_file_num_compaction_trigger	2	Number of files to trigger level-0 compaction.
rocksdb.level0_slowdown_writes_trigger	20	Soft limit on number of level-0 files for slowing down writes.
rocksdb.level0_stop_writes_trigger	36	Hard limit on number of level-0 files for stopping writes.
rocksdb.soft_pending_compaction_bytes_limit	68719476736	The soft limit to impose on pending compaction in bytes.
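
As a worked example of how rocksdb.max_bytes_for_level_base and rocksdb.max_bytes_for_level_multiplier interact, the sketch below computes the target size of each level from the defaults in the table. It assumes static level sizing (rocksdb.level_compaction_dynamic_level_bytes=false, the default), where each level's target is the previous level's target times the multiplier. The function name level_target_sizes is hypothetical and is not part of HugeGraph or RocksDB.

```python
# Defaults copied from the RocksDB option table above.
MAX_BYTES_FOR_LEVEL_BASE = 536870912    # rocksdb.max_bytes_for_level_base (512 MiB)
MAX_BYTES_FOR_LEVEL_MULTIPLIER = 10.0   # rocksdb.max_bytes_for_level_multiplier
NUM_LEVELS = 7                          # rocksdb.num_levels (L0 .. L6)


def level_target_sizes(base, multiplier, num_levels):
    """Return {level: target size in bytes} for levels 1 .. num_levels - 1.

    Level 0 is left out: it is governed by the file-count triggers
    (level0_file_num_compaction_trigger and friends), not by a byte target.
    Assumes static sizing, i.e. dynamic level bytes is disabled.
    """
    return {
        level: int(base * multiplier ** (level - 1))
        for level in range(1, num_levels)
    }


sizes = level_target_sizes(
    MAX_BYTES_FOR_LEVEL_BASE, MAX_BYTES_FOR_LEVEL_MULTIPLIER, NUM_LEVELS
)
for level, size in sizes.items():
    print(f"L{level}: {size / 2**30:.1f} GiB")
```

With the defaults, every level is ten times larger than the one above it, so raising the base or the multiplier mainly changes how much data the lower levels are expected to hold.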

HBase backend config options

config option	default value	description
backend		Must be set to `hbase`.
serializer		Must be set to `hbase`.
hbase.hosts	localhost	The hostnames or ip addresses of HBase zookeeper, separated with commas.
hbase.port	2181	The port address of HBase zookeeper.
hbase.threads_max	64	The max threads num of hbase connections.
hbase.znode_parent	/hbase	The znode parent path of HBase zookeeper.
hbase.zk_retry	3	The recovery retry times of HBase zookeeper.
hbase.aggregation_timeout	43200	The timeout in seconds of waiting for aggregation.
hbase.kerberos_enable	false	Whether Kerberos authentication is enabled for HBase.
hbase.kerberos_keytab		The HBase keytab file for Kerberos authentication.
hbase.kerberos_principal		The HBase principal for Kerberos authentication.
hbase.krb5_conf	etc/krb5.conf	Kerberos configuration file, including KDC IP, default realm, etc.
hbase.hbase_site	/etc/hbase/conf/hbase-site.xml	The HBase configuration file.
hbase.enable_partition	true	Whether pre-split partitions are enabled for HBase.
hbase.vertex_partitions	10	The number of partitions of the HBase vertex table.
hbase.edge_partitions	30	The number of partitions of the HBase edge table.

MySQL & PostgreSQL backend config options

config option	default value	description
backend		Must be set to `mysql`.
serializer		Must be set to `mysql`.
jdbc.driver	com.mysql.jdbc.Driver	The JDBC driver class to connect database.
jdbc.url	jdbc:mysql://127.0.0.1:3306	The url of database in JDBC format.
jdbc.username	root	The username to login database.
jdbc.password	******	The password corresponding to jdbc.username.
jdbc.ssl_mode	false	The SSL mode of connections with database.
jdbc.reconnect_interval	3	The interval(seconds) between reconnections when the database connection fails.
jdbc.reconnect_max_times	3	The reconnect times when the database connection fails.
jdbc.storage_engine	InnoDB	The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL.
jdbc.postgresql.connect_database	template1	The database used to connect when init store, drop store or check store exist.

PostgreSQL backend config options

config option	default value	description
backend		Must be set to `postgresql`.
serializer		Must be set to `postgresql`.

All other options are the same as for the MySQL backend.

For the PostgreSQL backend, driver and url should be set to:
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/

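3 - HugeGraph built-in user permissions and extended permissions: configuration and usage

Overview

To support authentication in different user scenarios, HugeGraph ships with a full-featured StandardAuthenticator mode. It supports multi-user authentication and fine-grained access control, and uses a 4-layer "user - user group - operation - resource" design to flexibly control user roles and permissions (multiple GraphServers are supported).

Core design points of the StandardAuthenticator mode:

A super administrator (admin) user is created at initialization. The admin then creates other users, and a newly created user that has been granted sufficient permissions can create or manage further users.
Users, user groups, and resources can be created dynamically, and permissions can be granted or revoked dynamically.
A user can belong to one or more user groups. Each user group can hold operation permissions on any number of resources; operation types include read, write, delete, and execute.
A "resource" describes data in the graph database, such as vertices that meet a certain condition. Each resource consists of three elements: type, label, and properties. There are 18 types, and any label and any properties can be combined to form a resource. The conditions inside one resource are ANDed together, while the conditions of multiple resources are ORed.

For example:

// Scenario: a user may only read data for the Beijing region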
user(name=xx) -belong-> group(name=xx) -access(read)-> target(graph=graph1, resource={label: person, city: Beijing})
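
The AND/OR rule for resources can be illustrated with a minimal sketch. This is not HugeGraph's implementation: the dictionary layout and the function names resource_matches and is_allowed are assumptions made for illustration, based only on the type, label, and properties elements described above.

```python
def resource_matches(resource, element):
    """Conditions inside ONE resource are ANDed: every one must hold."""
    if "type" in resource and resource["type"] != element["type"]:
        return False
    if "label" in resource and resource["label"] != element["label"]:
        return False
    wanted = resource.get("properties", {})
    actual = element.get("properties", {})
    return all(actual.get(key) == value for key, value in wanted.items())


def is_allowed(resources, element):
    """Several resources are ORed: matching any one of them is enough."""
    return any(resource_matches(resource, element) for resource in resources)


# Hypothetical resource mirroring the Beijing scenario above.
beijing_people = {"type": "VERTEX", "label": "person",
                  "properties": {"city": "Beijing"}}
all_software = {"type": "VERTEX", "label": "software"}

person_bj = {"type": "VERTEX", "label": "person",
             "properties": {"city": "Beijing", "name": "marko"}}
person_sh = {"type": "VERTEX", "label": "person",
             "properties": {"city": "Shanghai", "name": "vadas"}}
software = {"type": "VERTEX", "label": "software",
            "properties": {"name": "lop"}}

print(is_allowed([beijing_people], person_bj))
print(is_allowed([beijing_people], person_sh))
print(is_allowed([beijing_people, all_software], software))
```

Granting a second resource widens access (OR), while adding a property to an existing resource narrows it (AND).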

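Configuring user authentication

User authentication is currently disabled by default in HugeGraph and must be enabled by editing the configuration files. (Note: in production or on a public network, use Java 11 and enable authentication to avoid security risks.)

The built-in StandardAuthenticator mode supports multi-user authentication and fine-grained permission control. Developers can also implement the HugeAuthenticator interface to integrate with their own permission system.

All authentication uses HTTP Basic Authentication: when sending an HTTP request, select Basic as the authorization type and enter the username and password. The request looks like this in plain form (on the wire, the credentials after Basic are the Base64 encoding of username:password):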

GET http://localhost:8080/graphs/hugegraph/schema/vertexlabels
Authorization: Basic admin xxxx
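
The Authorization header that an HTTP client actually sends can be built as in the sketch below. The helper name basic_auth_header is hypothetical, and the admin / xxxx credentials are the placeholders from the request above, not real values.

```python
import base64


def basic_auth_header(username, password):
    """Build the HTTP Basic Authorization header for the given credentials.

    HTTP Basic joins the two values with a colon and Base64-encodes them.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials taken from the example request above.
headers = basic_auth_header("admin", "xxxx")
print(headers["Authorization"])
```

Base64 is an encoding, not encryption, so Basic credentials are only protected in transit when the server is reached over https.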

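Warning: HugeGraph-Server versions before 1.5.0 have a JWT-related security risk in authentication mode. Be sure to use a newer version or change the secretKey of the JWT token yourself.

To change it, override auth.token_secret in the rest-server.properties configuration file (from 1.5.0 on, a random value is generated by default, so no configuration is needed):

# A 32-character string made of a-z, A-Z, and 0-9
auth.token_secret=XXXX

The same can be done with the following commands: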

RANDOM_STRING=$(head /dev/urandom | tr -dc A-Za-z0-9 | head -c 32)
echo "auth.token_secret=${RANDOM_STRING}" >> rest-server.properties
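
On systems without /dev/urandom, an equivalent secret can be produced with a short script. The sketch below is only an alternative to the shell commands above; the function name generate_token_secret is hypothetical, and it prints the property line instead of appending it to rest-server.properties.

```python
import secrets
import string

# Same alphabet as the shell command: a-z, A-Z and 0-9.
ALPHABET = string.ascii_letters + string.digits


def generate_token_secret(length=32):
    """Return a random string drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


token_secret = generate_token_secret()
print(f"auth.token_secret={token_secret}")
```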

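StandardAuthenticator mode

The StandardAuthenticator mode supports user authentication and permission control by storing user information in the database backend. It authenticates against the usernames and passwords stored in the database (passwords are encrypted) and controls permissions at fine granularity based on user roles. The configuration steps are as follows (a service restart is required for them to take effect):

In gremlin-server.yaml, configure the authenticator and the path to the rest-server file: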

authentication: {
  authenticator: org.apache.hugegraph.auth.StandardAuthenticator,
  authenticationHandler: org.apache.hugegraph.auth.WsAndHttpBasicAuthHandler,
  config: {tokens: conf/rest-server.properties}
}

In rest-server.properties, configure the authenticator and graph_store:

auth.authenticator=org.apache.hugegraph.auth.StandardAuthenticator
auth.graph_store=hugegraph

# auth client config
# If GraphServer and AuthServer are deployed separately, also set the option below to the IP:RPC port of the AuthServer
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897

Here, graph_store specifies which graph stores the user information; if there are multiple graphs, any one of them can be used.

In hugegraph{n}.properties, configure gremlin.graph:

gremlin.graph=org.apache.hugegraph.auth.HugeFactoryAuthProxy

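For detailed permission API calls and descriptions, see the Authentication-API documentation.

Custom user authentication system

If you need a more flexible user system, you can extend it with a custom authenticator: implement the org.apache.hugegraph.auth.HugeAuthenticator interface, then point the authenticator option in the configuration file at that implementation.

Starting in authentication mode

Once authentication is configured, enter the admin password on the command line the first time you run init-store.sh (for non-Docker deployments).

If you deploy from the Docker image, or HugeGraph has already been initialized and needs to be switched to authentication mode, you must delete the related graph data and restart HugeGraph. If the graph already holds business data, it cannot currently be switched to authentication mode directly (hugegraph versions <= 1.2.0).

This has been improved in the latest release (available in Docker latest); see PR 2411. With that change, the switch is seamless.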

# stop the hugeGraph firstly
bin/stop-hugegraph.sh

# delete the store data (here we use the default path for rocksdb)
# Note: no need to delete data in the latest code (fixed in https://github.com/apache/incubator-hugegraph/pull/2411)
rm -rf rocksdb-data/

# init store again
bin/init-store.sh

# start hugeGraph again
bin/start-hugegraph.sh

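Enabling authentication mode with Docker

For hugegraph/hugegraph images of version 1.2.0 or later, authentication mode can be enabled when the Docker container is started.

The steps are as follows:

1. Using docker run

Add the environment variable PASSWORD=123456 (the password can be chosen freely) to docker run to enable authentication mode: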

docker run -itd -e PASSWORD=123456 --name=server -p 8080:8080 hugegraph/hugegraph:1.5.0

2. Using docker-compose

With docker-compose, set PASSWORD=123456 in the environment variables:

version: '3'
services:
  server:
    image: hugegraph/hugegraph:1.5.0
    container_name: server
    ports:
      - 8080:8080
    environment:
      - PASSWORD=123456

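3. Re-enabling authentication mode from inside the container

First enter the container:

docker exec -it server bash
# Quickly modifies the configuration; the files as they were before the change are saved in the conf-bak folder
bin/enable-auth.sh

Then follow the steps in "Starting in authentication mode".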

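4 - Configuring HugeGraphServer to use HTTPS

Overview

HugeGraphServer uses HTTP by default. If request security is a concern, it can be configured to use HTTPS.

Server configuration

Edit the conf/rest-server.properties configuration file and change the scheme part of restserver.url to https.

# Set the protocol to https
restserver.url=https://127.0.0.1:8080
# Path to the server keystore file. This default takes effect automatically when the protocol is https; change it as needed
ssl.keystore_file=conf/hugegraph-server.keystore
# Password of the server keystore file. This default takes effect automatically when the protocol is https; change it as needed
ssl.keystore_password=******

The conf directory of the server already contains a keystore file, hugegraph-server.keystore, whose password is hugegraph. Both are the defaults used when HTTPS is enabled. You can generate your own keystore file and password, then change ssl.keystore_file and ssl.keystore_password accordingly.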

Client configuration

Using HTTPS in HugeGraph-Client

Pass the HTTPS settings when constructing the HugeClient. Example:

String url = "https://localhost:8080";
String graphName = "hugegraph";
HugeClientBuilder builder = HugeClient.builder(url, graphName);
// Path to the client truststore file
String trustStoreFilePath = "hugegraph.truststore";
// Password of the client truststore
String trustStorePassword = "******";
builder.configSSL(trustStoreFilePath, trustStorePassword);
HugeClient hugeClient = builder.build();

Note: before version 1.9.0, HugeGraph-Client was created directly with new and did not support HTTPS. From 1.9.0 on, it is created through a builder and supports HTTPS configuration.

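Using HTTPS in HugeGraph-Loader

When starting an import task, add the following options on the command line:

# https
--protocol https
# Path to the client certificate file. When --protocol is https, the default conf/hugegraph.truststore takes effect automatically; change it as needed
--trust-store-file {file}
# Password of the client certificate file. When --protocol is https, the default hugegraph takes effect automatically; change it as needed
--trust-store-password {password}

The conf directory of hugegraph-loader already contains a default client certificate file, hugegraph.truststore, whose password is hugegraph.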

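Using HTTPS in HugeGraph-Tools

When running a command, add the following options on the command line:

# Path to the client certificate file. When the url uses https, the default conf/hugegraph.truststore takes effect automatically; change it as needed
--trust-store-file {file}
# Password of the client certificate file. When the url uses https, the default hugegraph takes effect automatically; change it as needed
--trust-store-password {password}
# For the migrate command: when --target-url uses https, the default conf/hugegraph.truststore takes effect automatically; change it as needed
--target-trust-store-file {target-file}
# For the migrate command: when --target-url uses https, the default hugegraph takes effect automatically; change it as needed
--target-trust-store-password {target-password}

The conf directory of hugegraph-tools already contains a default client certificate file, hugegraph.truststore, whose password is hugegraph.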

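How to generate certificate files

This part gives an example of generating certificates. Skip it if the default certificates are sufficient or you already know how to generate them.

Server

Generate the server private key and import it into the server keystore file. server.keystore is for the server and holds its own private key.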

keytool -genkey -alias serverkey -keyalg RSA -keystore server.keystore

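Fill in the descriptive information as needed during the process. The default certificate uses the following:

First and last name: hugegraph
Organizational unit: hugegraph
Organization: hugegraph
City or locality: BJ
State or province: BJ
Country code: CN

Export the server certificate from the server private key: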

keytool -export -alias serverkey -keystore server.keystore -file server.crt

server.crt is the server certificate.

Client

keytool -import -alias serverkey -file server.crt -keystore client.truststore

client.truststore is for the client and holds the trusted certificates.

5 - HugeGraph-Computer configuration

Computer Config Options

config option	default value	description
algorithm.message_class	org.apache.hugegraph.computer.core.config.Null	The class of message passed when compute vertex.
algorithm.params_class	org.apache.hugegraph.computer.core.config.Null	The class used to transfer the algorithm's parameters before the algorithm is run.
algorithm.result_class	org.apache.hugegraph.computer.core.config.Null	The class of vertex’s value, the instance is used to store computation result for the vertex.
allocator.max_vertices_per_thread	10000	Maximum number of vertices per thread processed in each memory allocator
bsp.etcd_endpoints	http://localhost:2379	The end points to access etcd.
bsp.log_interval	30000	The log interval(in ms) to print the log while waiting bsp event.
bsp.max_super_step	10	The max super step of the algorithm.
bsp.register_timeout	300000	The max timeout to wait for master and workers to register.
bsp.wait_master_timeout	86400000	The max timeout(in ms) to wait for master bsp event.
bsp.wait_workers_timeout	86400000	The max timeout to wait for workers bsp event.
hgkv.max_data_block_size	65536	The max byte size of hgkv-file data block.
hgkv.max_file_size	2147483648	The max number of bytes in each hgkv-file.
hgkv.max_merge_files	10	The max number of files to merge at one time.
hgkv.temp_file_dir	/tmp/hgkv	This folder is used to store temporary files, temporary files will be generated during the file merging process.
hugegraph.name	hugegraph	The graph name to load data and write results back.
hugegraph.url	http://127.0.0.1:8080	The hugegraph url to load data and write results back.
input.edge_direction	OUT	The data of the edge in which direction is loaded, when the value is BOTH, the edges in both OUT and IN direction will be loaded.
input.edge_freq	MULTIPLE	The frequency with which edges can exist between a pair of vertices, allowed values: [SINGLE, SINGLE_PER_LABEL, MULTIPLE]. SINGLE means only one edge can exist between a pair of vertices, identified by sourceId + targetId; SINGLE_PER_LABEL means one edge per edge label can exist between a pair of vertices, identified by sourceId + edgelabel + targetId; MULTIPLE means many edges can exist between a pair of vertices, identified by sourceId + edgelabel + sortValues + targetId.
input.filter_class	org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter	The class to create the input-filter object; the input-filter is used to filter vertex edges according to user needs.
input.loader_schema_path		The schema path of loader input, only takes effect when the input.source_type=loader is enabled
input.loader_struct_path		The struct path of loader input, only takes effect when the input.source_type=loader is enabled
input.max_edges_in_one_vertex	200	The maximum number of adjacent edges allowed to be attached to a vertex, the adjacent edges will be stored and transferred together as a batch unit.
input.source_type	hugegraph-server	The source type to load input data from, allowed values: ['hugegraph-server', 'hugegraph-loader']. 'hugegraph-loader' means hugegraph-loader is used to load data from HDFS or files; in that case, also configure 'input.loader_struct_path' and 'input.loader_schema_path'.
input.split_fetch_timeout	300	The timeout in seconds to fetch input splits
input.split_max_splits	10000000	The maximum number of input splits
input.split_page_size	500	The page size for streamed load input split data
input.split_size	1048576	The input split size in bytes
job.id	local_0001	The job id on Yarn cluster or K8s cluster.
job.partitions_count	1	The partitions count for computing one graph algorithm job.
job.partitions_thread_nums	4	The number of threads for partition parallel compute.
job.workers_count	1	The workers count for computing one graph algorithm job.
master.computation_class	org.apache.hugegraph.computer.core.master.DefaultMasterComputation	Master-computation is computation that can determine whether to continue next superstep. It runs at the end of each superstep on master.
output.batch_size	500	The batch size of output
output.batch_threads	1	The threads number used to batch output
output.hdfs_core_site_path		The hdfs core site path.
output.hdfs_delimiter	,	The delimiter of hdfs output.
output.hdfs_kerberos_enable	false	Is Kerberos authentication enabled for Hdfs.
output.hdfs_kerberos_keytab		The Hdfs’s key tab file for kerberos authentication.
output.hdfs_kerberos_principal		The Hdfs’s principal for kerberos authentication.
output.hdfs_krb5_conf	/etc/krb5.conf	Kerberos configuration file.
output.hdfs_merge_partitions	true	Whether merge output files of multiple partitions.
output.hdfs_path_prefix	/hugegraph-computer/results	The directory of hdfs output result.
output.hdfs_replication	3	The replication number of hdfs.
output.hdfs_site_path		The hdfs site path.
output.hdfs_url	hdfs://127.0.0.1:9000	The hdfs url of output.
output.hdfs_user	hadoop	The hdfs user of output.
output.output_class	org.apache.hugegraph.computer.core.output.LogOutput	The class to output the computation result of each vertex. Be called after iteration computation.
output.result_name	value	The value is assigned dynamically by #name() of instance created by WORKER_COMPUTATION_CLASS.
output.result_write_type	OLAP_COMMON	The result write-type to output to hugegraph, allowed values are: [OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE].
output.retry_interval	10	The retry interval when output failed
output.retry_times	3	The retry times when output failed
output.single_threads	1	The threads number used to single output
output.thread_pool_shutdown_timeout	60	The timeout seconds of output threads pool shutdown
output.with_adjacent_edges	false	Output the adjacent edges of the vertex or not
output.with_edge_properties	false	Output the properties of the edge or not
output.with_vertex_properties	false	Output the properties of the vertex or not
sort.thread_nums	4	The number of threads performing internal sorting.
transport.client_connect_timeout	3000	The timeout(in ms) of client connect to server.
transport.client_threads	4	The number of transport threads for client.
transport.close_timeout	10000	The timeout(in ms) of close server or close client.
transport.finish_session_timeout	0	The timeout(in ms) to finish session, 0 means using (transport.sync_request_timeout * transport.max_pending_requests).
transport.heartbeat_interval	20000	The minimum interval(in ms) between heartbeats on client side.
transport.io_mode	AUTO	The network IO mode, one of 'NIO', 'EPOLL', 'AUTO'; 'AUTO' means the proper mode is selected automatically.
transport.max_pending_requests	8	The max number of client unreceived ack, it will trigger the sending unavailable if the number of unreceived ack >= max_pending_requests.
transport.max_syn_backlog	511	The capacity of SYN queue on server side, 0 means using system default value.
transport.max_timeout_heartbeat_count	120	The maximum number of heartbeat timeouts on the client side; if the number of consecutive timeouts waiting for a heartbeat response exceeds max_timeout_heartbeat_count, the channel is closed from the client side.
transport.min_ack_interval	200	The minimum interval(in ms) of server reply ack.
transport.min_pending_requests	6	The minimum number of client unreceived ack, it will trigger the sending available if the number of unreceived ack < min_pending_requests.
transport.network_retries	3	The number of retry attempts for network communication if the network is unstable.
transport.provider_class	org.apache.hugegraph.computer.core.network.netty.NettyTransportProvider	The transport provider, currently only supports Netty.
transport.receive_buffer_size	0	The size of socket receive-buffer in bytes, 0 means using system default value.
transport.recv_file_mode	true	Whether to enable receive buffer-file mode; if enabled, received buffers are written from the socket to a file using zero-copy.
transport.send_buffer_size	0	The size of socket send-buffer in bytes, 0 means using system default value.
transport.server_host	127.0.0.1	The server hostname or ip to listen on to transfer data.
transport.server_idle_timeout	360000	The max timeout(in ms) of server idle.
transport.server_port	0	The server port to listen on to transfer data. The system will assign a random port if it’s set to 0.
transport.server_threads	4	The number of transport threads for server.
transport.sync_request_timeout	10000	The timeout(in ms) to wait response after sending sync-request.
transport.tcp_keep_alive	true	Whether enable TCP keep-alive.
transport.transport_epoll_lt	false	Whether enable EPOLL level-trigger.
transport.write_buffer_high_mark	67108864	The high water mark for write buffer in bytes, it will trigger the sending unavailable if the number of queued bytes > write_buffer_high_mark.
transport.write_buffer_low_mark	33554432	The low water mark for write buffer in bytes, it will trigger the sending available if the number of queued bytes < write_buffer_low_mark.
transport.write_socket_timeout	3000	The timeout(in ms) to write data to socket buffer.
valuefile.max_segment_size	1073741824	The max number of bytes in each segment of value-file.
worker.combiner_class	org.apache.hugegraph.computer.core.config.Null	Combiner can combine messages into one value for a vertex, for example page-rank algorithm can combine messages of a vertex to a sum value.
worker.computation_class	org.apache.hugegraph.computer.core.config.Null	The class to create worker-computation object, worker-computation is used to compute each vertex in each superstep.
worker.data_dirs	[jobs]	The directories separated by ‘,’ that received vertices and messages can persist into.
worker.edge_properties_combiner_class	org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner	The combiner can combine several properties of the same edge into one properties at inputstep.
worker.partitioner	org.apache.hugegraph.computer.core.graph.partition.HashPartitioner	The partitioner that decides which partition a vertex should be in, and which worker a partition should be in.
worker.received_buffers_bytes_limit	104857600	The limit in bytes on buffers of received data; the total size of all buffers can't exceed this limit. If the received buffers reach this limit, they will be merged into a file.
worker.vertex_properties_combiner_class	org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner	The combiner can combine several properties of the same vertex into one properties at inputstep.
worker.wait_finish_messages_timeout	86400000	The max timeout(in ms) message-handler wait for finish-message of all workers.
worker.wait_sort_timeout	600000	The max timeout(in ms) message-handler wait for sort-thread to sort one batch of buffers.
worker.write_buffer_capacity	52428800	The initial size of write buffer that used to store vertex or message.
worker.write_buffer_threshold	52428800	The threshold of write buffer, exceeding it will trigger sorting, the write buffer is used to store vertex or message.
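
The three input.edge_freq values differ only in which fields identify an edge. The sketch below illustrates that rule using the field combinations listed in the table; the function edge_identity and the tuple layout are assumptions for illustration, not the HugeGraph-Computer implementation.

```python
def edge_identity(edge_freq, source_id, target_id, edge_label=None, sort_values=()):
    """Return the tuple that identifies an edge under the given input.edge_freq."""
    if edge_freq == "SINGLE":
        return (source_id, target_id)
    if edge_freq == "SINGLE_PER_LABEL":
        return (source_id, edge_label, target_id)
    if edge_freq == "MULTIPLE":
        return (source_id, edge_label, tuple(sort_values), target_id)
    raise ValueError(f"unknown edge_freq: {edge_freq}")


# Four hypothetical edges between the same pair of vertices:
# (source, target, label, sort values)
edges = [
    ("A", "B", "knows", ()),
    ("A", "B", "likes", ()),
    ("A", "B", "knows", ("2020",)),
    ("A", "B", "knows", ("2021",)),
]

distinct = {}
for mode in ("SINGLE", "SINGLE_PER_LABEL", "MULTIPLE"):
    keys = {edge_identity(mode, s, t, label, sv) for s, t, label, sv in edges}
    distinct[mode] = len(keys)
    print(mode, distinct[mode])
```

Edges that share an identity under the chosen mode are treated as the same edge, so a stricter mode keeps fewer distinct edges between a pair of vertices.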

K8s Operator Config Options

NOTE: These options are set through environment variables, with the option name converted accordingly, e.g. k8s.internal_etcd_url => INTERNAL_ETCD_URL

config option	default value	description
k8s.auto_destroy_pod	true	Whether to automatically destroy all pods when the job is completed or failed.
k8s.close_reconciler_timeout	120	The max timeout(in ms) to close reconciler.
k8s.internal_etcd_url	http://127.0.0.1:2379	The internal etcd url for operator system.
k8s.max_reconcile_retry	3	The max retry times of reconcile.
k8s.probe_backlog	50	The maximum backlog for serving health probes.
k8s.probe_port	9892	The value is the port that the controller bind to for serving health probes.
k8s.ready_check_internal	1000	The time interval(ms) of check ready.
k8s.ready_timeout	30000	The max timeout(in ms) of check ready.
k8s.reconciler_count	10	The max number of reconciler thread.
k8s.resync_period	600000	The minimum frequency at which watched resources are reconciled.
k8s.timezone	Asia/Shanghai	The timezone of computer job and operator.
k8s.watch_namespace	hugegraph-computer-system	The namespace in which custom resources are watched; other namespaces are ignored. '*' means all namespaces are watched.
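
The name conversion in the NOTE above this table can be sketched as follows. The rule (drop the k8s. prefix, then upper-case the rest) is inferred from the single example k8s.internal_etcd_url => INTERNAL_ETCD_URL and should be treated as an assumption; the helper name option_to_env_var is hypothetical.

```python
def option_to_env_var(option):
    """Convert an operator option name to its assumed environment variable name."""
    prefix = "k8s."
    if not option.startswith(prefix):
        raise ValueError(f"not a k8s operator option: {option}")
    # Assumed rule: strip the prefix and upper-case what is left.
    return option[len(prefix):].upper()


for name in ("k8s.internal_etcd_url", "k8s.watch_namespace", "k8s.timezone"):
    print(f"{name} => {option_to_env_var(name)}")
```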

HugeGraph-Computer CRD

CRD: https://github.com/apache/hugegraph-computer/blob/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml

spec	default value	description	required
algorithmName		The name of algorithm.	true
jobId		The job id.	true
image		The image of algorithm.	true
computerConf		The map of computer config options.	true
workerInstances		The number of worker instances; it overrides the 'job.workers_count' option.	true
pullPolicy	Always	The pull-policy of image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy	false
pullSecrets		The pull-secrets of Image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod	false
masterCpu		The CPU limit of the master; the unit can be 'm' or omitted. For details, see: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu	false
workerCpu		The CPU limit of the worker; the unit can be 'm' or omitted. For details, see: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu	false
masterMemory		The memory limit of the master; the unit can be one of Ei, Pi, Ti, Gi, Mi, Ki. For details, see: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory	false
workerMemory		The memory limit of the worker; the unit can be one of Ei, Pi, Ti, Gi, Mi, Ki. For details, see: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory	false
log4jXml		The content of log4j.xml for computer job.	false
jarFile		The jar path of computer algorithm.	false
remoteJarUri		The remote jar uri of computer algorithm, it will overlay algorithm image.	false
jvmOptions		The java startup parameters of computer job.	false
envVars		Please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/	false
envFrom		Please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/	false
masterCommand	bin/start-computer.sh	The run command of master, equivalent to ‘Entrypoint’ field of Docker.	false
masterArgs	["-r master", "-d k8s"]	The run args of master, equivalent to 'Cmd' field of Docker.	false
workerCommand	bin/start-computer.sh	The run command of worker, equivalent to ‘Entrypoint’ field of Docker.	false
workerArgs	["-r worker", "-d k8s"]	The run args of worker, equivalent to 'Cmd' field of Docker.	false
volumes		Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/	false
volumeMounts		Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/	false
secretPaths		The map of k8s-secret name and mount path.	false
configMapPaths		The map of k8s-configmap name and mount path.	false
podTemplateSpec		Please refer to: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-template-v1/#PodTemplateSpec	false
securityContext		Please refer to: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/	false
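
Before a job is submitted, the required column of the table above can be checked with a few lines. This is a minimal sketch: the helper missing_required_fields is hypothetical, the spec is modelled as a plain dictionary, and the sample values are illustrative ones borrowed from defaults elsewhere on this page rather than a tested job definition.

```python
# Fields marked required=true in the CRD spec table above.
REQUIRED_SPEC_FIELDS = (
    "algorithmName",
    "jobId",
    "image",
    "computerConf",
    "workerInstances",
)


def missing_required_fields(spec):
    """Return the required spec fields that are absent from the given spec."""
    return [field for field in REQUIRED_SPEC_FIELDS if field not in spec]


# Illustrative spec; the values are assumptions for the example only.
spec = {
    "algorithmName": "pageRank",
    "jobId": "local_0001",
    "image": "hugegraph/hugegraph-computer:latest",
    "computerConf": {"job.partitions_count": "1"},
    "workerInstances": 1,
}

print(missing_required_fields(spec))
print(missing_required_fields({"algorithmName": "pageRank"}))
```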

KubeDriver Config Options

config option	default value	description
k8s.build_image_bash_path		The path of command used to build image.
k8s.enable_internal_algorithm	true	Whether enable internal algorithm.
k8s.framework_image_url	hugegraph/hugegraph-computer:latest	The image url of computer framework.
k8s.image_repository_password		The password for login image repository.
k8s.image_repository_registry		The address for login image repository.
k8s.image_repository_url	hugegraph/hugegraph-computer	The url of image repository.
k8s.image_repository_username		The username for login image repository.
k8s.internal_algorithm	[pageRank]	The name list of all internal algorithm.
k8s.internal_algorithm_image_url	hugegraph/hugegraph-computer:latest	The image url of internal algorithm.
k8s.jar_file_dir	/cache/jars/	The directory to which the algorithm jar is uploaded.
k8s.kube_config	~/.kube/config	The path of k8s config file.
k8s.log4j_xml_path		The log4j.xml path for computer job.
k8s.namespace	hugegraph-computer-system	The namespace of hugegraph-computer system.
k8s.pull_secret_names	[]	The names of pull-secret for pulling image.