This section describes how to configure HugeGraph-Server, including:
- Server startup guide - understand the structure of the configuration files and basic configuration methods
- Complete Server configuration manual - the full list of configuration options and their descriptions
- Permission configuration - user authentication and authorization
- HTTPS configuration - enabling the HTTPS protocol
The configuration directory is hugegraph-release/conf; all configuration of the service and of the graphs themselves lives in this directory.
The main configuration files are gremlin-server.yaml, rest-server.properties and hugegraph.properties.
HugeGraphServer integrates GremlinServer and RestServer internally, and gremlin-server.yaml and rest-server.properties are used to configure these two servers respectively.
The three configuration files are introduced one by one below.
The default content of gremlin-server.yaml is as follows:
# host and port of gremlin server, need to be consistent with host and port in rest-server.properties
#host: 127.0.0.1
#port: 8182
# timeout of gremlin queries (in milliseconds)
evaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
# don't set graphs here; this will be handled once dynamically adding graphs is supported
graphs: {
}
scriptEngines: {
  gremlin-groovy: {
    staticImports: [
      org.opencypher.gremlin.process.traversal.CustomPredicates.*,
      org.opencypher.gremlin.traversal.CustomFunctions.*
    ],
    plugins: {
      org.apache.hugegraph.plugin.HugeGraphGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {
        classImports: [
          java.lang.Math,
          org.apache.hugegraph.backend.id.IdGenerator,
          org.apache.hugegraph.type.define.Directions,
          org.apache.hugegraph.type.define.NodeRole,
          org.apache.hugegraph.traversal.algorithm.CollectionPathsTraverser,
          org.apache.hugegraph.traversal.algorithm.CountTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizedCrosspointsTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.FusiformSimilarityTraverser,
          org.apache.hugegraph.traversal.algorithm.HugeTraverser,
          org.apache.hugegraph.traversal.algorithm.JaccardSimilarTraverser,
          org.apache.hugegraph.traversal.algorithm.KneighborTraverser,
          org.apache.hugegraph.traversal.algorithm.KoutTraverser,
          org.apache.hugegraph.traversal.algorithm.MultiNodeShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.NeighborRankTraverser,
          org.apache.hugegraph.traversal.algorithm.PathsTraverser,
          org.apache.hugegraph.traversal.algorithm.PersonalRankTraverser,
          org.apache.hugegraph.traversal.algorithm.SameNeighborTraverser,
          org.apache.hugegraph.traversal.algorithm.ShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SingleSourceShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SubGraphTraverser,
          org.apache.hugegraph.traversal.algorithm.TemplatePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.steps.EdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.RepeatEdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.WeightedEdgeStep,
          org.apache.hugegraph.traversal.optimize.ConditionP,
          org.apache.hugegraph.traversal.optimize.Text,
          org.apache.hugegraph.traversal.optimize.TraversalUtil,
          org.apache.hugegraph.util.DateUtil,
          org.opencypher.gremlin.traversal.CustomFunctions,
          org.opencypher.gremlin.traversal.CustomPredicate
        ],
        methodImports: [
          java.lang.Math#*,
          org.opencypher.gremlin.traversal.CustomPredicate#*,
          org.opencypher.gremlin.traversal.CustomFunctions#*
        ]
      },
      org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {
        files: [scripts/empty-sample.groovy]
      }
    }
  }
}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
metrics: {
  consoleReporter: {enabled: false, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: ./metrics/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}
}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false
}
There are many options above, but for now only a few need attention: channelizer and graphs.
By default GremlinServer serves at localhost:8182; to change this, configure host and port.
At the same time, add the corresponding option gremlinserver.url=http://host:port to rest-server.properties, as in the sketch below.
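A minimal sketch of keeping the two files consistent, assuming GremlinServer is moved to port 8183 on the same machine:

# conf/gremlin-server.yaml
host: 127.0.0.1
port: 8183

# conf/rest-server.properties
gremlinserver.url=http://127.0.0.1:8183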
The default content of rest-server.properties is as follows:
# bind url
# could use '0.0.0.0' or specified (real)IP to expose external network access
restserver.url=http://127.0.0.1:8080
#restserver.enable_graphspaces_filter=false
# gremlin server url, need to be consistent with host and port in gremlin-server.yaml
#gremlinserver.url=http://127.0.0.1:8182
graphs=./conf/graphs
# The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0
batch.max_write_ratio=80
batch.max_write_threads=0
# configuration of arthas
arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=127.0.0.1
arthas.disabled_commands=jad
# authentication configs
# choose 'org.apache.hugegraph.auth.StandardAuthenticator' or a custom implementation
#auth.authenticator=
# for StandardAuthenticator mode
#auth.graph_store=hugegraph
# auth client config
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897
# TODO: Deprecated & removed later (useless from version 1.5.0)
# rpc server configs for multi graph-servers or raft-servers
#rpc.server_host=127.0.0.1
#rpc.server_port=8091
#rpc.server_timeout=30
# rpc client configs (like enable to keep cache consistency)
#rpc.remote_url=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093
#rpc.client_connect_timeout=20
#rpc.client_reconnect_period=10
#rpc.client_read_timeout=40
#rpc.client_retries=3
#rpc.client_load_balancer=consistentHash
# raft group initial peers
#raft.group_peers=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093
# lightweight load balancing (beta)
server.id=server-1
server.role=master
# slow query log
log.slow_query_threshold=1000
# jvm(in-heap) memory usage monitor, set 1 to disable it
memory_monitor.threshold=0.85
memory_monitor.period=2000
restserver.url is the URL on which RestServer listens; it can be set to http://0.0.0.0 to accept requests from any IP address. This is convenient, but keep in mind the network range from which the service then becomes reachable. Note: both gremlin-server.yaml and rest-server.properties contain a graphs option; the init-store command initializes the graphs whose configuration files sit in the directory given by the graphs option of rest-server.properties.
The gremlinserver.url option is the URL at which GremlinServer provides service to RestServer. It defaults to http://localhost:8182; if changed, it must match the host and port in gremlin-server.yaml.
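As a sketch of exposing the service externally, assuming the server host is reachable at the hypothetical address 10.0.0.5: bind RestServer to all interfaces, then verify from another machine with the version API:

# conf/rest-server.properties
restserver.url=http://0.0.0.0:8080

# from another machine
curl http://10.0.0.5:8080/apis/version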
hugegraph.properties is one of a family of files: if the system hosts multiple graphs, there is one such file per graph. It configures the parameters related to graph storage and queries. The default content of the file is as follows:
# gremlin entrance to create graph
gremlin.graph=org.apache.hugegraph.HugeFactory
# cache config
#schema.cache_capacity=100000
# vertex-cache default is 10 million, expires in 10 minutes
#vertex.cache_capacity=10000000
#vertex.cache_expire=600
# edge-cache default is 1 million, expires in 10 minutes
#edge.cache_capacity=1000000
#edge.cache_expire=600
# schema illegal name template
#schema.illegal_name_regex=\s+|~.*
#vertex.default_label=vertex
backend=rocksdb
serializer=binary
store=hugegraph
raft.mode=false
raft.safe_read=false
raft.use_snapshot=false
raft.endpoint=127.0.0.1:8281
raft.group_peers=127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000
# if using 'ikanalyzer', download the jar from 'https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar' to the lib directory
search.text_analyzer=jieba
search.text_analyzer_mode=INDEX
# rocksdb backend config
#rocksdb.data_path=/path/to/disk
#rocksdb.wal_path=/path/to/disk
# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3
# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64
# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.ssl_mode=false
# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=
# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32
Pay attention to the uncommented options above: gremlin.graph, backend, serializer, store, the raft.* options, and search.text_analyzer / search.text_analyzer_mode.
The system can host multiple graphs, and each graph can use a different backend. For example, hugegraph_rocksdb and hugegraph_mysql could use RocksDB and MySQL as their backends respectively.
The configuration is straightforward:
[Optional]: modify rest-server.properties
Set the directory of graph configuration files via the graphs option in rest-server.properties. The default is graphs=./conf/graphs; to use a different directory, adjust the graphs option accordingly, e.g. graphs=/etc/hugegraph/graphs. The default looks like this:
graphs=./conf/graphs
Under conf/graphs, create hugegraph_mysql_backend.properties and hugegraph_rocksdb_backend.properties based on hugegraph.properties.
The modified parts of hugegraph_mysql_backend.properties are as follows:
backend=mysql
serializer=mysql
store=hugegraph_mysql
# mysql backend config
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.url=jdbc:mysql://127.0.0.1:3306
jdbc.username=root
jdbc.password=xxx
jdbc.reconnect_max_times=3
jdbc.reconnect_interval=3
jdbc.ssl_mode=false
The modified parts of hugegraph_rocksdb_backend.properties are as follows:
backend=rocksdb
serializer=binary
store=hugegraph_rocksdb
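After these two steps, the graphs directory contains one properties file per graph (the file names are the ones chosen in this example):

$ ls conf/graphs
hugegraph_mysql_backend.properties
hugegraph_rocksdb_backend.properties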
Stop the Server, run init-store.sh (which creates the databases for the new graphs), then restart the Server:
$ ./bin/stop-hugegraph.sh
$ ./bin/init-store.sh
Initializing HugeGraph Store...
2023-06-11 14:16:14 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
2023-06-11 14:16:14 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_rocksdb_backend.properties
...
2023-06-11 14:16:15 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_rocksdb' has been initialized
2023-06-11 14:16:15 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_mysql_backend.properties
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_mysql' has been initialized
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Close graph standardhugegraph[hugegraph_rocksdb]
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.HugeFactory - HugeFactory shutdown
2023-06-11 14:16:16 [hugegraph-shutdown] [INFO] o.a.h.HugeFactory - HugeGraph is shutting down
Initialization finished.
$ ./bin/start-hugegraph.sh
Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)...OK
Started [pid 21614]
List the created graphs:
curl http://127.0.0.1:8080/graphs/
{"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}
Get the information of a specific graph:
curl http://127.0.0.1:8080/graphs/hugegraph_mysql_backend
{"name":"hugegraph_mysql","backend":"mysql"}
curl http://127.0.0.1:8080/graphs/hugegraph_rocksdb_backend
{"name":"hugegraph_rocksdb","backend":"rocksdb"}
The complete configuration manual follows. Corresponding configuration file: gremlin-server.yaml
| config option | default value | description |
|---|---|---|
| host | 127.0.0.1 | The host or ip of Gremlin Server. |
| port | 8182 | The listening port of Gremlin Server. |
| graphs | hugegraph: conf/hugegraph.properties | The map of graphs with name and config file path. |
| evaluationTimeout | 30000 | The timeout for gremlin script execution (milliseconds). |
| channelizer | org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer | Indicates the protocol which the Gremlin Server provides service. |
| authentication | authenticator: org.apache.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties} | The authenticator and config(contains tokens path) of authentication mechanism. |
Corresponding configuration file: rest-server.properties
| config option | default value | description |
|---|---|---|
| graphs | [hugegraph:conf/hugegraph.properties] | The map of graphs’ name and config file. |
| server.id | server-1 | The id of rest server, used for license verification. |
| server.role | master | The role of nodes in the cluster, available types are [master, worker, computer] |
| restserver.url | http://127.0.0.1:8080 | The url for listening of rest server. |
| ssl.keystore_file | server.keystore | The path of server keystore file used when https protocol is enabled. |
| ssl.keystore_password | | The password of the server keystore file used when the https protocol is enabled. |
| restserver.max_worker_threads | 2 * CPUs | The maximum worker threads of rest server. |
| restserver.min_free_memory | 64 | The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value. |
| restserver.request_timeout | 30 | The time in seconds within which a request must complete, -1 means no timeout. |
| restserver.connection_idle_timeout | 30 | The time in seconds to keep an inactive connection alive, -1 means no timeout. |
| restserver.connection_max_requests | 256 | The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited. |
| gremlinserver.url | http://127.0.0.1:8182 | The url of gremlin server. |
| gremlinserver.max_route | 8 | The max route number for gremlin server. |
| gremlinserver.timeout | 30 | The timeout in seconds of waiting for gremlin server. |
| batch.max_edges_per_batch | 2500 | The maximum number of edges submitted per batch. |
| batch.max_vertices_per_batch | 2500 | The maximum number of vertices submitted per batch. |
| batch.max_write_ratio | 70 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. |
| batch.max_write_threads | 0 | The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads. |
| auth.authenticator | | The class path of authenticator implementation, e.g. org.apache.hugegraph.auth.StandardAuthenticator, or a custom implementation. |
| auth.graph_store | hugegraph | The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator. |
| auth.audit_log_rate | 1000.0 | The max rate of audit log output per user, default value is 1000 records per second. |
| auth.cache_capacity | 10240 | The max cache capacity of each auth cache item. |
| auth.cache_expire | 600 | The expiration time in seconds of each auth cache item. |
| auth.remote_url | | If the address is empty, this node provides the auth service; otherwise it acts as an auth client and also provides the auth service through rpc forwarding. Multiple addresses can be set, concatenated by ','. |
| auth.token_expire | 86400 | The expiration time in seconds after a token is created. |
| auth.token_secret | FXQXbJtbCLxODc6tGci732pkH1cyf8Qg | Secret key of HS256 algorithm. |
| exception.allow_trace | true | Whether to allow exception trace stack. |
| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring, 1 means disabling this function. |
| memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. |
| log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. |
Corresponding configuration file: rest-server.properties (PD and meta service options)
| config option | default value | description |
|---|---|---|
| pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). |
| meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. |
Basic options and backend options correspond to the configuration file {graph-name}.properties, e.g. hugegraph.properties.
| config option | default value | description |
|---|---|---|
| gremlin.graph | org.apache.hugegraph.HugeFactory | Gremlin entrance to create graph. |
| backend | rocksdb | The data store type. For version 1.7.0+: [memory, rocksdb, hstore, hbase]. Note: cassandra, scylladb, mysql, postgresql were removed in 1.7.0 (use <= 1.5.x for legacy backends). |
| serializer | binary | The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql]. |
| store | hugegraph | The database name like Cassandra Keyspace. |
| store.connection_detect_interval | 600 | The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time. |
| store.graph | g | The graph table name, which store vertex, edge and property. |
| store.schema | m | The schema table name, which store meta data. |
| store.system | s | The system table name, which store system data. |
| schema.illegal_name_regex | .*\s+$\|~.* | The regex specified the illegal format for schema name. |
| schema.cache_capacity | 10000 | The max cache size(items) of schema cache. |
| vertex.cache_type | l2 | The type of vertex cache, allowed values are [l1, l2]. |
| vertex.cache_capacity | 10000000 | The max cache size(items) of vertex cache. |
| vertex.cache_expire | 600 | The expire time in seconds of vertex cache. |
| vertex.check_customized_id_exist | false | Whether to check the vertices exist for those using customized id strategy. |
| vertex.default_label | vertex | The default vertex label. |
| vertex.tx_capacity | 10000 | The max size(items) of vertices(uncommitted) in transaction. |
| vertex.check_adjacent_vertex_exist | false | Whether to check the adjacent vertices of edges exist. |
| vertex.lazy_load_adjacent_vertex | true | Whether to lazy load adjacent vertices of edges. |
| vertex.part_edge_commit_size | 5000 | Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled. |
| vertex.encode_primary_key_number | true | Whether to encode number value of primary key in vertex id. |
| vertex.remove_left_index_at_overwrite | false | Whether to remove the left (residual) index when overwriting. |
| edge.cache_type | l2 | The type of edge cache, allowed values are [l1, l2]. |
| edge.cache_capacity | 1000000 | The max cache size(items) of edge cache. |
| edge.cache_expire | 600 | The expiration time in seconds of edge cache. |
| edge.tx_capacity | 10000 | The max size(items) of edges(uncommitted) in transaction. |
| query.page_size | 500 | The size of each page when querying by paging. |
| query.batch_size | 1000 | The size of each batch when querying by batch. |
| query.ignore_invalid_data | true | Whether to ignore invalid data of vertex or edge. |
| query.index_intersect_threshold | 1000 | The maximum number of intermediate results to intersect indexes when querying by multiple single index properties. |
| query.ramtable_edges_capacity | 20000000 | The maximum number of edges in ramtable, include OUT and IN edges. |
| query.ramtable_enable | false | Whether to enable ramtable for query of adjacent edges. |
| query.ramtable_vertices_capacity | 10000000 | The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity. |
| query.optimize_aggregate_by_index | false | Whether to optimize aggregate query(like count) by index. |
| oltp.concurrent_depth | 10 | The min depth to enable concurrent oltp algorithm. |
| oltp.concurrent_threads | 10 | Thread number to concurrently execute oltp algorithm. |
| oltp.collection_type | EC | The implementation type of collections used in oltp algorithm. |
| rate_limit.read | 0 | The max rate(times/s) to execute query of vertices/edges. |
| rate_limit.write | 0 | The max rate(items/s) to add/update/delete vertices/edges. |
| task.wait_timeout | 10 | Timeout in seconds for waiting for the task to complete, such as when truncating or clearing the backend. |
| task.input_size_limit | 16777216 | The job input size limit in bytes. |
| task.result_size_limit | 16777216 | The job result size limit in bytes. |
| task.sync_deletion | false | Whether to delete schema or expired data synchronously. |
| task.ttl_delete_batch | 1 | The batch size used to delete expired data. |
| computer.config | /conf/computer.yaml | The config file path of computer job. |
| search.text_analyzer | ikanalyzer | Choose a text analyzer for searching the vertex/edge properties, available types are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer]. If 'ikanalyzer' is used, download the jar from https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar to the lib directory. |
| search.text_analyzer_mode | smart | Specify the mode for the text analyzer; the available modes are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}. |
| snowflake.datecenter_id | 0 | The datacenter id of snowflake id generator. |
| snowflake.force_string | false | Whether to force the snowflake long id to be a string. |
| snowflake.worker_id | 0 | The worker id of snowflake id generator. |
| raft.mode | false | Whether the backend storage works in raft mode. |
| raft.safe_read | false | Whether to use linearly consistent read. |
| raft.use_snapshot | false | Whether to use snapshot. |
| raft.endpoint | 127.0.0.1:8281 | The peerid of current raft node. |
| raft.group_peers | 127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283 | The peers of current raft group. |
| raft.path | ./raft-log | The log path of current raft node. |
| raft.use_replicator_pipeline | true | Whether to use the replicator pipeline; when enabled, multiple logs can be sent in parallel, and the next log doesn't have to wait for the ack of the current log before being sent. |
| raft.election_timeout | 10000 | Timeout in milliseconds to launch a round of election. |
| raft.snapshot_interval | 3600 | The interval in seconds to trigger snapshot save. |
| raft.backend_threads | current CPU v-cores | The thread number used to apply task to backend. |
| raft.read_index_threads | 8 | The thread number used to execute reading index. |
| raft.apply_batch | 1 | The apply batch size to trigger disruptor event handler. |
| raft.queue_size | 16384 | The disruptor buffers size for jraft RaftNode, StateMachine and LogManager. |
| raft.queue_publish_timeout | 60 | The timeout in seconds when publishing events into the disruptor. |
| raft.rpc_threads | 80 | The rpc threads for jraft RPC layer. |
| raft.rpc_connect_timeout | 5000 | The rpc connect timeout for jraft rpc. |
| raft.rpc_timeout | 60000 | The rpc timeout for jraft rpc. |
| raft.rpc_buf_low_water_mark | 10485760 | The ChannelOutboundBuffer's low water mark of netty; when the buffer size is less than this value, ChannelOutboundBuffer.isWritable() returns true, which means low downstream pressure or a good network. |
| raft.rpc_buf_high_water_mark | 20971520 | The ChannelOutboundBuffer's high water mark of netty; only when the buffer size exceeds this value does ChannelOutboundBuffer.isWritable() return false, which means the downstream pressure is too great to process requests or the network is very congested, so the upstream needs to limit its rate. |
| raft.read_strategy | ReadOnlyLeaseBased | The linearizability of read strategy. |
RocksDB backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to rocksdb. |
| serializer | | Must be set to binary. |
| rocksdb.data_disks | [] | The optimized disks for storing data of RocksDB. The format of each element: STORE/TABLE: /path/disk. Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap]. |
| rocksdb.data_path | rocksdb-data/data | The path for storing data of RocksDB. |
| rocksdb.wal_path | rocksdb-data/wal | The path for storing WAL of RocksDB. |
| rocksdb.option_path | | The YAML file for configuring ToplingDB/RocksDB parameters. |
| rocksdb.open_http | false | Whether to start ToplingDB HTTP service. Security: enable only in trusted networks and restrict access (firewall/ACL); the port and document_root are configured in the YAML (http.listening_ports/document_root). |
| rocksdb.allow_mmap_reads | false | Allow the OS to mmap file for reading sst tables. |
| rocksdb.allow_mmap_writes | false | Allow the OS to mmap file for writing. |
| rocksdb.block_cache_capacity | 8388608 | The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache. |
| rocksdb.bloom_filter_bits_per_key | -1 | The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter. |
| rocksdb.bloom_filter_block_based_mode | false | Use block based filter rather than full filter. |
| rocksdb.bloom_filter_whole_key_filtering | true | True if place whole keys in the bloom filter, else place the prefix of keys. |
| rocksdb.bottommost_compression | NO_COMPRESSION | The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.bulkload_mode | false | Switch to the mode to bulk load data into RocksDB. |
| rocksdb.cache_index_and_filter_blocks | false | Indicating if we’d put index/filter blocks to the block cache. |
| rocksdb.compaction_style | LEVEL | Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO. |
| rocksdb.compression | SNAPPY_COMPRESSION | The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.compression_per_level | [NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION] | The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.delayed_write_rate | 16777216 | The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind. |
| rocksdb.log_level | INFO | The info log level of RocksDB. |
| rocksdb.max_background_jobs | 8 | Maximum number of concurrent background jobs, including flushes and compactions. |
| rocksdb.level_compaction_dynamic_level_bytes | false | Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended. |
| rocksdb.max_bytes_for_level_base | 536870912 | The upper-bound of the total size of level-1 files in bytes. |
| rocksdb.max_bytes_for_level_multiplier | 10.0 | The ratio between the total size of level (L+1) files and the total size of level L files for all L. |
| rocksdb.max_open_files | -1 | The maximum number of open files that can be cached by RocksDB, -1 means no limit. |
| rocksdb.max_subcompactions | 4 | The value represents the maximum number of threads per compaction job. |
| rocksdb.max_write_buffer_number | 6 | The maximum number of write buffers that are built up in memory. |
| rocksdb.max_write_buffer_number_to_maintain | 0 | The total maximum number of write buffers to maintain in memory. |
| rocksdb.min_write_buffer_number_to_merge | 2 | The minimum number of write buffers that will be merged together. |
| rocksdb.num_levels | 7 | Set the number of levels for this database. |
| rocksdb.optimize_filters_for_hits | false | This flag allows us to not store filters for the last level. |
| rocksdb.optimize_mode | true | Optimize for heavy workloads and big datasets. |
| rocksdb.pin_l0_filter_and_index_blocks_in_cache | false | Indicating if we’d put index/filter blocks to the block cache. |
| rocksdb.sst_path | | The path for ingesting SST file into RocksDB. |
| rocksdb.target_file_size_base | 67108864 | The target file size for compaction in bytes. |
| rocksdb.target_file_size_multiplier | 1 | The size ratio between a level L file and a level (L+1) file. |
| rocksdb.use_direct_io_for_flush_and_compaction | false | Enable the OS to use direct read/writes in flush and compaction. |
| rocksdb.use_direct_reads | false | Enable the OS to use direct I/O for reading sst tables. |
| rocksdb.write_buffer_size | 134217728 | Amount of data in bytes to build up in memory. |
| rocksdb.max_manifest_file_size | 104857600 | The max size of manifest file in bytes. |
| rocksdb.skip_stats_update_on_db_open | false | Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics. |
| rocksdb.max_file_opening_threads | 16 | The max number of threads used to open files. |
| rocksdb.max_total_wal_size | 0 | Total size of WAL files in bytes. Once WALs exceed this size, we will start forcing the flush of column families related, 0 means no limit. |
| rocksdb.db_write_buffer_size | 0 | Total size of write buffers in bytes across all column families, 0 means no limit. |
| rocksdb.delete_obsolete_files_period | 21600 | The periodicity in seconds when obsolete files get deleted, 0 means always do full purge. |
| rocksdb.hard_pending_compaction_bytes_limit | 274877906944 | The hard limit to impose on pending compaction in bytes. |
| rocksdb.level0_file_num_compaction_trigger | 2 | Number of files to trigger level-0 compaction. |
| rocksdb.level0_slowdown_writes_trigger | 20 | Soft limit on number of level-0 files for slowing down writes. |
| rocksdb.level0_stop_writes_trigger | 36 | Hard limit on number of level-0 files for stopping writes. |
| rocksdb.soft_pending_compaction_bytes_limit | 68719476736 | The soft limit to impose on pending compaction in bytes. |
Corresponding configuration file: rest-server.properties (K8s options)
| config option | default value | description |
|---|---|---|
| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. |
| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. |
| k8s.kubeconfig | | Path to the kubeconfig file. |
Corresponding configuration file: rest-server.properties (Arthas options)
| config option | default value | description |
|---|---|---|
| arthas.telnet_port | 8562 | Arthas telnet port. |
| arthas.http_port | 8561 | Arthas HTTP port. |
| arthas.ip | 0.0.0.0 | Arthas bind IP. |
RPC options. Corresponding configuration file: rest-server.properties
| config option | default value | description |
|---|---|---|
| rpc.client_connect_timeout | 20 | The timeout(in seconds) of rpc client connect to rpc server. |
| rpc.client_load_balancer | consistentHash | The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is ‘consistentHash’, means forwarding by request parameters. |
| rpc.client_read_timeout | 40 | The timeout(in seconds) of rpc client read from rpc server. |
| rpc.client_reconnect_period | 10 | The period(in seconds) of rpc client reconnect to rpc server. |
| rpc.client_retries | 3 | Failed retry number of rpc client calls to rpc server. |
| rpc.config_order | 999 | Sofa rpc configuration file loading order; the larger the value, the later it is loaded. |
| rpc.logger_impl | com.alipay.sofa.rpc.log.SLF4JLoggerImpl | Sofa rpc log implementation class. |
| rpc.protocol | bolt | Rpc communication protocol, client and server need to be specified the same value. |
| rpc.remote_url | | The remote urls of rpc peers; multiple addresses can be set, concatenated by ','. An empty value means not enabled. |
| rpc.server_adaptive_port | false | Whether the bound port is adaptive, if it’s enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts. |
| rpc.server_host | | The hosts/ips bound by rpc server to provide services; an empty value means not enabled. |
| rpc.server_port | 8090 | The port bound by rpc server to provide services. |
| rpc.server_timeout | 30 | The timeout(in seconds) of rpc server execution. |
HBase backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to hbase. |
| serializer | | Must be set to hbase. |
| hbase.hosts | localhost | The hostnames or ip addresses of HBase zookeeper, separated with commas. |
| hbase.port | 2181 | The port address of HBase zookeeper. |
| hbase.threads_max | 64 | The max threads num of hbase connections. |
| hbase.znode_parent | /hbase | The znode parent path of HBase zookeeper. |
| hbase.zk_retry | 3 | The recovery retry times of HBase zookeeper. |
| hbase.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
| hbase.kerberos_enable | false | Is Kerberos authentication enabled for HBase. |
| hbase.kerberos_keytab | | The HBase's key tab file for kerberos authentication. |
| hbase.kerberos_principal | | The HBase's principal for kerberos authentication. |
| hbase.krb5_conf | etc/krb5.conf | Kerberos configuration file, including KDC IP, default realm, etc. |
| hbase.hbase_site | /etc/hbase/conf/hbase-site.xml | The HBase's configuration file. |
| hbase.enable_partition | true | Is pre-split partitions enabled for HBase. |
| hbase.vertex_partitions | 10 | The number of partitions of the HBase vertex table. |
| hbase.edge_partitions | 30 | The number of partitions of the HBase edge table. |
The following backends are no longer supported in version 1.7.0+ and are available only in 1.5.x and earlier.
Cassandra backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to cassandra. |
| serializer | | Must be set to cassandra. |
| cassandra.host | localhost | The seeds hostname or ip address of cassandra cluster. |
| cassandra.port | 9042 | The seeds port address of cassandra cluster. |
| cassandra.connect_timeout | 5 | The cassandra driver connect server timeout(seconds). |
| cassandra.read_timeout | 20 | The cassandra driver read from server timeout(seconds). |
| cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. |
| cassandra.keyspace.replication | [3] | The keyspace replication factor of SimpleStrategy, like '[3]'. Or replicas in each datacenter of NetworkTopologyStrategy, like '[dc1:2,dc2:1]'. |
| cassandra.username | | The username used to log in to the cassandra cluster. |
| cassandra.password | | The password corresponding to cassandra.username. |
| cassandra.compression_type | none | The compression algorithm of cassandra transport: none/snappy/lz4. |
| cassandra.jmx_port | 7199 | The port of JMX API service for cassandra. |
| cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
ScyllaDB backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to scylladb. |
| serializer | | Must be set to scylladb. |
Other options are the same as for the Cassandra backend.
MySQL backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to mysql. |
| serializer | | Must be set to mysql. |
| jdbc.driver | com.mysql.jdbc.Driver | The JDBC driver class to connect database. |
| jdbc.url | jdbc:mysql://127.0.0.1:3306 | The url of database in JDBC format. |
| jdbc.username | root | The username to login database. |
| jdbc.password | ****** | The password corresponding to jdbc.username. |
| jdbc.ssl_mode | false | The SSL mode of connections with database. |
| jdbc.reconnect_interval | 3 | The interval(seconds) between reconnections when the database connection fails. |
| jdbc.reconnect_max_times | 3 | The reconnect times when the database connection fails. |
| jdbc.storage_engine | InnoDB | The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL. |
| jdbc.postgresql.connect_database | template1 | The database used to connect when init store, drop store or check store exist. |
PostgreSQL & CockroachDB backend options. Corresponding configuration file: {graph-name}.properties
| config option | default value | description |
|---|---|---|
| backend | | Must be set to postgresql. |
| serializer | | Must be set to postgresql. |
Other options are the same as for the MySQL backend.
For the PostgreSQL backend, the driver and url should be set to:
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/
To make authentication convenient across different usage scenarios, HugeGraph ships with a complete StandardAuthenticator permission mode. It supports multi-user authentication and fine-grained access control, using a 4-layer "user - group - operation - resource" design to flexibly control user roles and permissions (multiple GraphServers are supported).
Core designs of the StandardAuthenticator mode:
- A super administrator (admin) user is created during initialization; further users are created by the super administrator, and a newly created user, once granted sufficient permissions, can create or manage more users.
- A resource consists of three elements: type, label and properties. There are 18 resource types, and any label and any properties can be combined into a resource. The conditions inside a single resource are AND-related, while the conditions across multiple resources are OR-related. For example:
// Scenario: a user only has read permission for data in the Beijing region
user(name=xx) -belong-> group(name=xx) -access(read)-> target(graph=graph1, resource={label: person, city: Beijing})
HugeGraph does not enable user authentication by default; it must be enabled by modifying the configuration files. (Note: when running in production or on a public network, use Java 11 and enable authentication to avoid security risks.)
The StandardAuthenticator mode is built in; it supports multi-user authentication and fine-grained permission control. In addition, developers can implement the HugeAuthenticator interface to integrate with their own permission system.
User authentication uses HTTP Basic Authentication: when sending an HTTP request, choose Basic in the Authorization settings and enter the corresponding username and password. The corresponding HTTP plaintext looks like this:
GET http://localhost:8080/graphs/hugegraph/schema/vertexlabels
Authorization: Basic admin xxxx
Warning: HugeGraph-Server versions before 1.5.0 have a JWT-related security risk in authentication mode. Be sure to use a newer version, or change the secretKey of the JWT token yourself.
To change it, override auth.token_secret in the rest-server.properties configuration file (versions from 1.5.0 on generate a random value by default, so no configuration is needed):
auth.token_secret=XXXX  # a 32-character string composed of a-z, A-Z and 0-9
It can also be generated with the following commands:
RANDOM_STRING=$(head /dev/urandom | tr -dc A-Za-z0-9 | head -c 32)
echo "auth.token_secret=${RANDOM_STRING}" >> rest-server.properties
The StandardAuthenticator mode supports user authentication and permission control by storing user information in the backend database: it authenticates against the username and password stored in the database (the password is encrypted) and controls permissions at a fine granularity based on the user's role. The detailed configuration steps are as follows (restart the service for them to take effect):
Configure the authenticator and its rest-server file path in gremlin-server.yaml:
authentication: {
  authenticator: org.apache.hugegraph.auth.StandardAuthenticator,
  authenticationHandler: org.apache.hugegraph.auth.WsAndHttpBasicAuthHandler,
  config: {tokens: conf/rest-server.properties}
}
Configure the authenticator and its graph_store option in rest-server.properties:
auth.authenticator=org.apache.hugegraph.auth.StandardAuthenticator
auth.graph_store=hugegraph
# auth client config
# If GraphServer and AuthServer are deployed separately, also set the following option to the AuthServer addresses (IP:RPC port)
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897
The graph_store option specifies which graph is used to store user information; if multiple graphs exist, any one of them can be chosen.
Configure the gremlin.graph option in hugegraph{n}.properties:
gremlin.graph=org.apache.hugegraph.auth.HugeFactoryAuthProxy
For detailed permission API calls and explanations, refer to the Authentication-API documentation.
If a more flexible user system is required, the authenticator can be extended: implement the interface org.apache.hugegraph.auth.HugeAuthenticator, then point the authenticator option in the configuration files to that implementation.
After authentication is configured, the admin password must be entered on the command line the first time init-store.sh is executed (in non-docker deployments).
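Once the server is restarted in authentication mode, every request must carry credentials. As a minimal check with the built-in admin user (xxxx stands for the password entered during init-store):

curl -u admin:xxxx http://127.0.0.1:8080/graphs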
If you deployed from the docker image, or HugeGraph has already been initialized and needs to be switched to authentication mode, the relevant graph data must be deleted and HugeGraph restarted. If the graph already contains business data, it cannot be converted to authentication mode directly (for HugeGraph versions <= 1.2.0).
An improvement for this has been released in the latest version (available in Docker latest); see PR 2411, with which the switch is seamless.
# stop the hugeGraph firstly
bin/stop-hugegraph.sh
# delete the store data (here we use the default path for rocksdb)
# Note: no need to delete data in the latest code (fixed in https://github.com/apache/hugegraph/pull/2411)
rm -rf rocksdb-data/
# init store again
bin/init-store.sh
# start hugeGraph again
bin/start-hugegraph.sh
For hugegraph/hugegraph images of version 1.2.0 or later, authentication mode can be enabled when starting the docker image.
The steps are as follows:
Add the environment variable PASSWORD=xxx (the password can be chosen freely) to docker run to enable authentication mode:
docker run -itd -e PASSWORD=xxx --name=server -p 8080:8080 hugegraph/hugegraph:1.5.0
When using docker-compose, simply set PASSWORD=xxx in the environment variables:
version: '3'
services:
  server:
    image: hugegraph/hugegraph:1.5.0
    container_name: server
    ports:
      - 8080:8080
    environment:
      - PASSWORD=xxx
Alternatively, enable authentication inside a running container. First enter the container:
docker exec -it server bash
# for quickly modifying the configuration; the files before modification are backed up in the conf-bak directory
bin/enable-auth.sh
Then follow the steps in Starting with authentication mode above.
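With either docker approach, you can verify that authentication is active: an unauthenticated request should be rejected, while the built-in admin user with the PASSWORD you set should succeed:

curl http://127.0.0.1:8080/graphs              # expect 401 Unauthorized
curl -u admin:xxx http://127.0.0.1:8080/graphs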
HugeGraphServer uses the http protocol by default. If secure requests are required, https can be configured instead.
Modify the conf/rest-server.properties configuration file and change the schema part of restserver.url to https.
# set the protocol to https
restserver.url=https://127.0.0.1:8080
# server keystore file path; this default takes effect automatically when the protocol is https, modify it as needed
ssl.keystore_file=conf/hugegraph-server.keystore
# server keystore file password; this default takes effect automatically when the protocol is https, modify it as needed
ssl.keystore_password=******
A keystore file hugegraph-server.keystore, whose password is hugegraph, is already provided in the server's conf directory.
Both values are the defaults used when the https protocol is enabled; users can generate their own keystore file and password, then change the values of ssl.keystore_file and ssl.keystore_password.
Pass the https-related configuration when constructing the HugeClient. Code example:
String url = "https://localhost:8080";
String graphName = "hugegraph";
HugeClientBuilder builder = HugeClient.builder(url, graphName);
// client truststore file path
String trustStoreFilePath = "hugegraph.truststore";
// client truststore password
String trustStorePassword = "******";
builder.configSSL(trustStoreFilePath, trustStorePassword);
HugeClient hugeClient = builder.build();
Note: before version 1.9.0, HugeGraph-Client was created directly with new and did not support the https protocol; since version 1.9.0 it is created with a builder and supports configuring https.
When starting an import task, add the following options on the command line:
# https
--protocol https
# client certificate (truststore) file path; when --protocol is https, the default value conf/hugegraph.truststore takes effect automatically, modify as needed
--trust-store-file {file}
# client certificate (truststore) file password; when --protocol is https, the default value hugegraph takes effect automatically, modify as needed
--trust-store-password {password}
A default client certificate file hugegraph.truststore, whose password is hugegraph, is already provided in hugegraph-loader's conf directory.
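Putting it together, a hypothetical hugegraph-loader invocation over https (the graph name and the -f input descriptor below are placeholders for your own task):

bin/hugegraph-loader.sh -g hugegraph -f example/struct.json \
    --protocol https \
    --trust-store-file conf/hugegraph.truststore \
    --trust-store-password hugegraph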
When executing commands, add the following options on the command line:
# client certificate (truststore) file path; when the url uses the https protocol, the default value conf/hugegraph.truststore takes effect automatically, modify as needed
--trust-store-file {file}
# client certificate (truststore) file password; when the url uses the https protocol, the default value hugegraph takes effect automatically, modify as needed
--trust-store-password {password}
# for migration commands, when --target-url uses the https protocol, the default value conf/hugegraph.truststore takes effect automatically, modify as needed
--target-trust-store-file {target-file}
# for migration commands, when --target-url uses the https protocol, the default value hugegraph takes effect automatically, modify as needed
--target-trust-store-password {target-password}
A default client certificate file hugegraph.truststore, whose password is hugegraph, is already provided in hugegraph-tools' conf directory.
This part gives an example of generating a certificate; if the default certificate is sufficient, or you already know how to generate one, skip it.
keytool -genkey -alias serverkey -keyalg RSA -keystore server.keystore
Fill in the description information as prompted; the description of the default certificate is as follows:
First and last name: hugegraph
Organizational unit name: hugegraph
Organization name: hugegraph
City or locality name: BJ
State or province name: BJ
Country code: CN
keytool -export -alias serverkey -keystore server.keystore -file server.crt
server.crt is the server's certificate.
keytool -import -alias serverkey -file server.crt -keystore client.truststore
client.truststore is for the client; it holds the trusted certificates.