HugeGraph-Server Configuration

This section covers HugeGraph-Server configuration, including the server startup guide and the complete configuration manual.

1 - Server Startup Guide

1. Overview

The directory for the configuration files is hugegraph-release/conf, and all the configurations related to the service and the graph itself are located in this directory.

The main configuration files include gremlin-server.yaml, rest-server.properties, and hugegraph.properties.

The HugeGraphServer integrates the GremlinServer and RestServer internally, and gremlin-server.yaml and rest-server.properties are used to configure these two servers.

  • GremlinServer: GremlinServer accepts Gremlin statements from users, parses them, and then invokes the Core code.
  • RestServer: It provides a RESTful API that, based on different HTTP requests, calls the corresponding Core API. If the user’s request body is a Gremlin statement, it will be forwarded to GremlinServer to perform operations on the graph data.

Now let’s introduce these three configuration files one by one.

2. gremlin-server.yaml

The default content of the gremlin-server.yaml file is as follows:

# host and port of gremlin server, need to be consistent with host and port in rest-server.properties
#host: 127.0.0.1
#port: 8182

# timeout in ms of gremlin query
evaluationTimeout: 30000

channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
# don't set graph at here, this happens after support for dynamically adding graph
graphs: {
}
scriptEngines: {
  gremlin-groovy: {
    staticImports: [
      org.opencypher.gremlin.process.traversal.CustomPredicates.*,
      org.opencypher.gremlin.traversal.CustomFunctions.*
    ],
    plugins: {
      org.apache.hugegraph.plugin.HugeGraphGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
      org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {
        classImports: [
          java.lang.Math,
          org.apache.hugegraph.backend.id.IdGenerator,
          org.apache.hugegraph.type.define.Directions,
          org.apache.hugegraph.type.define.NodeRole,
          org.apache.hugegraph.traversal.algorithm.CollectionPathsTraverser,
          org.apache.hugegraph.traversal.algorithm.CountTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizedCrosspointsTraverser,
          org.apache.hugegraph.traversal.algorithm.CustomizePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.FusiformSimilarityTraverser,
          org.apache.hugegraph.traversal.algorithm.HugeTraverser,
          org.apache.hugegraph.traversal.algorithm.JaccardSimilarTraverser,
          org.apache.hugegraph.traversal.algorithm.KneighborTraverser,
          org.apache.hugegraph.traversal.algorithm.KoutTraverser,
          org.apache.hugegraph.traversal.algorithm.MultiNodeShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.NeighborRankTraverser,
          org.apache.hugegraph.traversal.algorithm.PathsTraverser,
          org.apache.hugegraph.traversal.algorithm.PersonalRankTraverser,
          org.apache.hugegraph.traversal.algorithm.SameNeighborTraverser,
          org.apache.hugegraph.traversal.algorithm.ShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SingleSourceShortestPathTraverser,
          org.apache.hugegraph.traversal.algorithm.SubGraphTraverser,
          org.apache.hugegraph.traversal.algorithm.TemplatePathsTraverser,
          org.apache.hugegraph.traversal.algorithm.steps.EdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.RepeatEdgeStep,
          org.apache.hugegraph.traversal.algorithm.steps.WeightedEdgeStep,
          org.apache.hugegraph.traversal.optimize.ConditionP,
          org.apache.hugegraph.traversal.optimize.Text,
          org.apache.hugegraph.traversal.optimize.TraversalUtil,
          org.apache.hugegraph.util.DateUtil,
          org.opencypher.gremlin.traversal.CustomFunctions,
          org.opencypher.gremlin.traversal.CustomPredicate
        ],
        methodImports: [
          java.lang.Math#*,
          org.opencypher.gremlin.traversal.CustomPredicate#*,
          org.opencypher.gremlin.traversal.CustomFunctions#*
        ]
      },
      org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {
        files: [scripts/empty-sample.groovy]
      }
    }
  }
}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
      config: {
        serializeResultToString: false,
        ioRegistries: [org.apache.hugegraph.io.HugeGraphIoRegistry]
      }
  }
metrics: {
  consoleReporter: {enabled: false, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: ./metrics/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}
}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false
}

There are many configuration options mentioned above, but for now, let’s focus on the following options: channelizer and graphs.

  • graphs: This option specifies the graphs that need to be opened when the GremlinServer starts. It is a map structure where the key is the name of the graph and the value is the configuration file path for that graph.
  • channelizer: The GremlinServer supports two communication modes with clients: WebSocket and HTTP (default). If WebSocket is chosen, users can quickly experience the features of HugeGraph using Gremlin-Console, but it does not support importing large-scale data. It is recommended to use HTTP for communication, as all peripheral components of HugeGraph are implemented based on HTTP.

By default, the GremlinServer serves at localhost:8182. If you need to modify it, configure the host and port settings.

  • host: The hostname or IP address of the machine where the GremlinServer is deployed. Currently, HugeGraphServer does not support distributed deployment, and GremlinServer is not directly exposed to users.
  • port: The port number of the machine where the GremlinServer is deployed.

Additionally, you need to add the corresponding configuration gremlinserver.url=http://host:port in rest-server.properties.
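As a rough illustration of keeping the two files in sync, the sketch below prints the gremlinserver.url line that matches a given gremlin-server.yaml. The function name gremlin_url_line is hypothetical, and the sketch assumes host and port appear as top-level "host: ..." and "port: ..." lines; when they are commented out or missing, it falls back to 127.0.0.1 and 8182.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HugeGraph): print the gremlinserver.url
# line that matches the host/port found in a gremlin-server.yaml file.
# Assumption: host and port are top-level "key: value" lines; commented-out
# or missing entries fall back to 127.0.0.1 and 8182.
gremlin_url_line() {
  local yaml="$1" host port
  # Take the first uncommented value, dropping trailing spaces and comments.
  host="$(sed -n 's/^host:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$yaml" | head -n 1)"
  port="$(sed -n 's/^port:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$yaml" | head -n 1)"
  printf 'gremlinserver.url=http://%s:%s\n' "${host:-127.0.0.1}" "${port:-8182}"
}
```

Calling `gremlin_url_line conf/gremlin-server.yaml` prints a line that can be compared with, or pasted into, rest-server.properties.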

3. rest-server.properties

The default content of the rest-server.properties file is as follows:

# bind url
# could use '0.0.0.0' or specified (real)IP to expose external network access
restserver.url=http://127.0.0.1:8080
#restserver.enable_graphspaces_filter=false
# gremlin server url, need to be consistent with host and port in gremlin-server.yaml
#gremlinserver.url=http://127.0.0.1:8182

graphs=./conf/graphs

# The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0
batch.max_write_ratio=80
batch.max_write_threads=0

# configuration of arthas
arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=127.0.0.1
arthas.disabled_commands=jad

# authentication configs
# choose 'org.apache.hugegraph.auth.StandardAuthenticator' or a custom implementation
#auth.authenticator=

# for StandardAuthenticator mode
#auth.graph_store=hugegraph
# auth client config
#auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897

# TODO: Deprecated & removed later (useless from version 1.5.0)
# rpc server configs for multi graph-servers or raft-servers
#rpc.server_host=127.0.0.1
#rpc.server_port=8091
#rpc.server_timeout=30

# rpc client configs (like enable to keep cache consistency)
#rpc.remote_url=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093
#rpc.client_connect_timeout=20
#rpc.client_reconnect_period=10
#rpc.client_read_timeout=40
#rpc.client_retries=3
#rpc.client_load_balancer=consistentHash

# raft group initial peers
#raft.group_peers=127.0.0.1:8091,127.0.0.1:8092,127.0.0.1:8093

# lightweight load balancing (beta)
server.id=server-1
server.role=master

# slow query log
log.slow_query_threshold=1000

# jvm(in-heap) memory usage monitor, set 1 to disable it
memory_monitor.threshold=0.85
memory_monitor.period=2000

  • restserver.url: The URL at which the RestServer provides its services. Modify it to suit your environment. If the server cannot be reached from other IP addresses, bind it to a specific IP address, or use http://0.0.0.0 to listen on all network interfaces. The latter is convenient, but make sure that only trusted networks can reach the server.
  • graphs: The RestServer also needs to open graphs when it starts. This option specifies the directory that holds the graph configuration files (./conf/graphs by default); each properties file in that directory configures one graph.

Note: Both gremlin-server.yaml and rest-server.properties contain the graphs configuration option, and the init-store command initializes based on the graphs specified in the graphs section of gremlin-server.yaml.

The gremlinserver.url configuration option is the URL at which the GremlinServer provides services to the RestServer. By default, it is set to http://localhost:8182. If you need to modify it, it should match the host and port settings in gremlin-server.yaml.

4. hugegraph.properties

hugegraph.properties is a per-graph configuration file: if the system has multiple graphs, there is one such file for each graph. It configures parameters related to graph storage and querying. The default content of the file is as follows:

# gremlin entrance to create graph
gremlin.graph=org.apache.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=100000
# vertex-cache default is 1000w, 10min expired
#vertex.cache_capacity=10000000
#vertex.cache_expire=600
# edge-cache default is 100w, 10min expired
#edge.cache_capacity=1000000
#edge.cache_expire=600

# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph

raft.mode=false
raft.safe_read=false
raft.use_snapshot=false
raft.endpoint=127.0.0.1:8281
raft.group_peers=127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283
raft.path=./raft-log
raft.use_replicator_pipeline=true
raft.election_timeout=10000
raft.snapshot_interval=3600
raft.backend_threads=48
raft.read_index_threads=8
raft.queue_size=16384
raft.queue_publish_timeout=60
raft.apply_batch=1
raft.rpc_threads=80
raft.rpc_connect_timeout=5000
raft.rpc_timeout=60000

# if use 'ikanalyzer', need download jar from 'https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar' to lib directory
search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

# rocksdb backend config
#rocksdb.data_path=/path/to/disk
#rocksdb.wal_path=/path/to/disk

# cassandra backend config
cassandra.host=localhost
cassandra.port=9042
cassandra.username=
cassandra.password=
#cassandra.connect_timeout=5
#cassandra.read_timeout=20
#cassandra.keyspace.strategy=SimpleStrategy
#cassandra.keyspace.replication=3

# hbase backend config
#hbase.hosts=localhost
#hbase.port=2181
#hbase.znode_parent=/hbase
#hbase.threads_max=64

# mysql backend config
#jdbc.driver=com.mysql.jdbc.Driver
#jdbc.url=jdbc:mysql://127.0.0.1:3306
#jdbc.username=root
#jdbc.password=
#jdbc.reconnect_max_times=3
#jdbc.reconnect_interval=3
#jdbc.ssl_mode=false

# postgresql & cockroachdb backend config
#jdbc.driver=org.postgresql.Driver
#jdbc.url=jdbc:postgresql://localhost:5432/
#jdbc.username=postgres
#jdbc.password=

# palo backend config
#palo.host=127.0.0.1
#palo.poll_interval=10
#palo.temp_dir=./palo-data
#palo.file_limit_size=32

Pay attention to the following uncommented items:

  • gremlin.graph: The entry point for GremlinServer startup. Users should not modify this item.
  • backend: The backend storage used, with options including memory, cassandra, scylladb, mysql, hbase, postgresql, and rocksdb.
  • serializer: Mainly for internal use, used to serialize schema, vertices, and edges to the backend. The corresponding options are text, cassandra, scylladb, and binary (Note: The rocksdb backend should have a value of binary, while for other backends, the values of backend and serializer should remain consistent. For example, for the hbase backend, the value should be hbase).
  • store: The name of the database used for storing the graph in the backend. In Cassandra and ScyllaDB, it corresponds to the keyspace name. The value of this item is unrelated to the graph name in GremlinServer and RestServer, but for clarity, it is recommended to use the same name.
  • cassandra.host: This item is only meaningful when the backend is set to cassandra or scylladb. It specifies the seeds of the Cassandra/ScyllaDB cluster.
  • cassandra.port: This item is only meaningful when the backend is set to cassandra or scylladb. It specifies the native port of the Cassandra/ScyllaDB cluster.
  • rocksdb.data_path: This item is only meaningful when the backend is set to rocksdb. It specifies the data directory for RocksDB.
  • rocksdb.wal_path: This item is only meaningful when the backend is set to rocksdb. It specifies the log directory for RocksDB.
  • admin.token: A token used to retrieve server configuration information. For example: http://localhost:8080/graphs/hugegraph/conf?token=162f7848-0b6d-4faf-b557-3a0797869c55
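
The backend/serializer rule described above can be checked mechanically. The following is a minimal sketch, not a HugeGraph tool: the function names expected_serializer and check_graph_conf are hypothetical, and the mapping only encodes the rule as stated in this section (rocksdb uses binary, the other listed persistent backends use a serializer with the same name as the backend). The memory backend is left out because the text does not state its serializer.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the backend/serializer rule stated above.
expected_serializer() {
  case "$1" in
    rocksdb) echo binary ;;                                   # rocksdb -> binary
    cassandra|scylladb|mysql|hbase|postgresql) echo "$1" ;;   # same name as backend
    *) echo "no rule stated for backend '$1'" >&2; return 1 ;;
  esac
}

# Read backend= and serializer= from a graph properties file and compare them.
check_graph_conf() {
  local file="$1" backend serializer expected
  backend="$(sed -n 's/^backend=//p' "$file" | head -n 1)"
  serializer="$(sed -n 's/^serializer=//p' "$file" | head -n 1)"
  expected="$(expected_serializer "$backend")" || return 2
  if [ "$serializer" = "$expected" ]; then
    echo "OK: backend=$backend serializer=$serializer"
  else
    echo "MISMATCH: backend=$backend expects serializer=$expected, found $serializer"
    return 1
  fi
}
```

For example, `check_graph_conf conf/graphs/hugegraph.properties` reports OK for the default rocksdb/binary pair.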

5. Multi-Graph Configuration

Our system can have multiple graphs, and the backend of each graph can be different, such as hugegraph_rocksdb and hugegraph_mysql, where hugegraph_rocksdb uses RocksDB as the backend, and hugegraph_mysql uses MySQL as a backend.

The configuration method is simple:

[Optional]: Modify rest-server.properties

You can change the directory that holds the graph configuration files through the graphs option of rest-server.properties. The default is graphs=./conf/graphs; to use another directory, change the value, e.g. graphs=/etc/hugegraph/graphs. The default setting looks like this:

graphs=./conf/graphs

Under the conf/graphs path, create hugegraph_mysql_backend.properties and hugegraph_rocksdb_backend.properties as modified copies of hugegraph.properties.

The modified part of hugegraph_mysql_backend.properties is as follows:

backend=mysql
serializer=mysql

store=hugegraph_mysql

# mysql backend config
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.url=jdbc:mysql://127.0.0.1:3306
jdbc.username=root
jdbc.password=xxx
jdbc.reconnect_max_times=3
jdbc.reconnect_interval=3
jdbc.ssl_mode=false

The modified part of hugegraph_rocksdb_backend.properties is as follows:

backend=rocksdb
serializer=binary

store=hugegraph_rocksdb
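
One possible way to produce such per-graph files is to copy the base file and override the three keys that differ. The sketch below assumes backend, serializer, and store appear uncommented at the start of a line in the base file; derive_graph_conf is a hypothetical name, not a script shipped with HugeGraph.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a base properties file and override the
# backend, serializer, and store keys. All other lines are kept unchanged.
derive_graph_conf() {
  local base="$1" out="$2" backend="$3" serializer="$4" store="$5"
  sed -e "s|^backend=.*|backend=${backend}|" \
      -e "s|^serializer=.*|serializer=${serializer}|" \
      -e "s|^store=.*|store=${store}|" \
      "$base" > "$out"
}
```

Usage might look like `derive_graph_conf conf/graphs/hugegraph.properties conf/graphs/hugegraph_rocksdb_backend.properties rocksdb binary hugegraph_rocksdb`. For the MySQL graph, the jdbc.* lines still have to be uncommented and filled in by hand.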

Stop the server, execute init-store.sh (to create a new database for the new graph), and restart the server.

$ ./bin/stop-hugegraph.sh
$ ./bin/init-store.sh

Initializing HugeGraph Store...
2023-06-11 14:16:14 [main] [INFO] o.a.h.u.ConfigUtil - Scanning option 'graphs' directory './conf/graphs'
2023-06-11 14:16:14 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_rocksdb_backend.properties
...
2023-06-11 14:16:15 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_rocksdb' has been initialized
2023-06-11 14:16:15 [main] [INFO] o.a.h.c.InitStore - Init graph with config file: ./conf/graphs/hugegraph_mysql_backend.properties
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Graph 'hugegraph_mysql' has been initialized
2023-06-11 14:16:16 [main] [INFO] o.a.h.StandardHugeGraph - Close graph standardhugegraph[hugegraph_rocksdb]
...
2023-06-11 14:16:16 [main] [INFO] o.a.h.HugeFactory - HugeFactory shutdown
2023-06-11 14:16:16 [hugegraph-shutdown] [INFO] o.a.h.HugeFactory - HugeGraph is shutting down
Initialization finished.
$ ./bin/start-hugegraph.sh

Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)...OK
Started [pid 21614]
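
The stop, init-store, and start steps can also be chained so that a failure in one step prevents the next from running. This is only a sketch: reinit_graphs is a hypothetical wrapper, and it assumes each of the three scripts exits with status 0 on success, which is not stated in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run stop, init-store, and start in order from a
# HugeGraph bin directory, stopping at the first step that fails.
# Assumption: each script returns exit status 0 on success.
reinit_graphs() {
  local bin_dir="${1:-./bin}" script
  for script in stop-hugegraph.sh init-store.sh start-hugegraph.sh; do
    echo "==> ${script}"
    bash "${bin_dir}/${script}" || { echo "failed: ${script}" >&2; return 1; }
  done
}
```

Run it from the release directory as `reinit_graphs ./bin`.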

Check out created graphs:

curl http://127.0.0.1:8080/graphs/

{"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}

Get the details of each graph:

curl http://127.0.0.1:8080/graphs/hugegraph_mysql_backend

{"name":"hugegraph_mysql","backend":"mysql"}

curl http://127.0.0.1:8080/graphs/hugegraph_rocksdb_backend

{"name":"hugegraph_rocksdb","backend":"rocksdb"}
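
When scripting against the graph listing, the names can be pulled out of the response without extra tools. The sketch below assumes the flat response shape {"graphs":["a","b"]} shown in the listing example and graph names without commas or quotes; graph_names is a hypothetical helper, and a real JSON parser is the safer choice for anything beyond a quick check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract graph names from the JSON body of GET /graphs.
# Assumption: the body has the flat shape {"graphs":["a","b"]}.
graph_names() {
  printf '%s\n' "$1" \
    | sed -e 's/^.*"graphs":\[//' -e 's/].*$//' \
    | tr ',' '\n' \
    | tr -d '"'
}

# Demo with the sample response from this guide: prints one name per line.
graph_names '{"graphs":["hugegraph_rocksdb","hugegraph_mysql"]}'
```

Against a running server this could be used as `graph_names "$(curl -s http://127.0.0.1:8080/graphs)"`.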

2 - Server Complete Configuration Manual

Gremlin Server Config Options

Corresponding configuration file gremlin-server.yaml

| config option | default value | description |
|---|---|---|
| host | 127.0.0.1 | The host or ip of Gremlin Server. |
| port | 8182 | The listening port of Gremlin Server. |
| graphs | hugegraph: conf/hugegraph.properties | The map of graphs with name and config file path. |
| scriptEvaluationTimeout | 30000 | The timeout for gremlin script execution(millisecond). |
| channelizer | org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer | Indicates the protocol which the Gremlin Server provides service. |
| authentication | authenticator: org.apache.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties} | The authenticator and config(contains tokens path) of authentication mechanism. |

Rest Server & API Config Options

Corresponding configuration file rest-server.properties

| config option | default value | description |
|---|---|---|
| graphs | [hugegraph:conf/hugegraph.properties] | The map of graphs’ name and config file. |
| server.id | server-1 | The id of rest server, used for license verification. |
| server.role | master | The role of nodes in the cluster, available types are [master, worker, computer]. |
| restserver.url | http://127.0.0.1:8080 | The url for listening of rest server. |
| ssl.keystore_file | server.keystore | The path of server keystore file used when https protocol is enabled. |
| ssl.keystore_password | | The password of the server keystore file used when the https protocol is enabled. |
| restserver.max_worker_threads | 2 * CPUs | The maximum worker threads of rest server. |
| restserver.min_free_memory | 64 | The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value. |
| restserver.request_timeout | 30 | The time in seconds within which a request must complete, -1 means no timeout. |
| restserver.connection_idle_timeout | 30 | The time in seconds to keep an inactive connection alive, -1 means no timeout. |
| restserver.connection_max_requests | 256 | The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited. |
| gremlinserver.url | http://127.0.0.1:8182 | The url of gremlin server. |
| gremlinserver.max_route | 8 | The max route number for gremlin server. |
| gremlinserver.timeout | 30 | The timeout in seconds of waiting for gremlin server. |
| batch.max_edges_per_batch | 2500 | The maximum number of edges submitted per batch. |
| batch.max_vertices_per_batch | 2500 | The maximum number of vertices submitted per batch. |
| batch.max_write_ratio | 70 | The maximum thread ratio for batch writing, only takes effect if batch.max_write_threads is 0. |
| batch.max_write_threads | 0 | The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads. |
| auth.authenticator | | The class path of authenticator implementation. e.g., org.apache.hugegraph.auth.StandardAuthenticator, or a custom implementation. |
| auth.graph_store | hugegraph | The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator. |
| auth.audit_log_rate | 1000.0 | The max rate of audit log output per user, default value is 1000 records per second. |
| auth.cache_capacity | 10240 | The max cache capacity of each auth cache item. |
| auth.cache_expire | 600 | The expiration time in seconds of vertex cache. |
| auth.remote_url | | If the address is empty, the server provides the auth service itself; otherwise it acts as an auth client and also provides the auth service through rpc forwarding. The remote url can be set to multiple addresses, separated by ‘,’. |
| auth.token_expire | 86400 | The expiration time in seconds after token created. |
| auth.token_secret | FXQXbJtbCLxODc6tGci732pkH1cyf8Qg | Secret key of HS256 algorithm. |
| exception.allow_trace | true | Whether to allow exception trace stack. |
| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring, 1 means disabling this function. |
| memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. |
| log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. |
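
To make the batch.max_write_threads rule concrete, here is a small worked calculation. It rests on two assumptions that the table does not state: that batch.max_write_ratio is a percentage between 0 and 100, and that the product is truncated to an integer. The function name effective_write_threads is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical calculator for the effective number of batch write threads.
# Assumptions (not stated in the table): the ratio is a percentage (0-100)
# and the result is truncated to an integer.
effective_write_threads() {
  local cpus="$1" ratio="$2" configured="${3:-0}" workers
  workers="$(( 2 * cpus ))"   # default restserver.max_worker_threads = 2 * CPUs
  if [ "$configured" -gt 0 ]; then
    echo "$configured"        # an explicit batch.max_write_threads wins
  else
    echo "$(( workers * ratio / 100 ))"
  fi
}
```

Under these assumptions, `effective_write_threads 8 70` gives 11 on an 8-CPU machine (16 worker threads at 70%), while `effective_write_threads 8 70 5` returns the explicitly configured 5.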

PD/Meta Config Options (Distributed Mode)

Corresponding configuration file rest-server.properties

| config option | default value | description |
|---|---|---|
| pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). |
| meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. |

Basic Config Options

Basic Config Options and Backend Config Options correspond to the configuration file {graph-name}.properties, such as hugegraph.properties.

| config option | default value | description |
|---|---|---|
| gremlin.graph | org.apache.hugegraph.HugeFactory | Gremlin entrance to create graph. |
| backend | rocksdb | The data store type. For version 1.7.0+: [memory, rocksdb, hstore, hbase]. Note: cassandra, scylladb, mysql, postgresql were removed in 1.7.0 (use <= 1.5.x for legacy backends). |
| serializer | binary | The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql]. |
| store | hugegraph | The database name like Cassandra Keyspace. |
| store.connection_detect_interval | 600 | The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time. |
| store.graph | g | The graph table name, which stores vertex, edge and property. |
| store.schema | m | The schema table name, which stores meta data. |
| store.system | s | The system table name, which stores system data. |
| schema.illegal_name_regex | .*\s+$\|~.* | The regex that specifies the illegal format for schema names. |
| schema.cache_capacity | 10000 | The max cache size(items) of schema cache. |
| vertex.cache_type | l2 | The type of vertex cache, allowed values are [l1, l2]. |
| vertex.cache_capacity | 10000000 | The max cache size(items) of vertex cache. |
| vertex.cache_expire | 600 | The expire time in seconds of vertex cache. |
| vertex.check_customized_id_exist | false | Whether to check the vertices exist for those using customized id strategy. |
| vertex.default_label | vertex | The default vertex label. |
| vertex.tx_capacity | 10000 | The max size(items) of vertices(uncommitted) in transaction. |
| vertex.check_adjacent_vertex_exist | false | Whether to check the adjacent vertices of edges exist. |
| vertex.lazy_load_adjacent_vertex | true | Whether to lazy load adjacent vertices of edges. |
| vertex.part_edge_commit_size | 5000 | Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled. |
| vertex.encode_primary_key_number | true | Whether to encode number value of primary key in vertex id. |
| vertex.remove_left_index_at_overwrite | false | Whether to remove left index at overwrite. |
| edge.cache_type | l2 | The type of edge cache, allowed values are [l1, l2]. |
| edge.cache_capacity | 1000000 | The max cache size(items) of edge cache. |
| edge.cache_expire | 600 | The expiration time in seconds of edge cache. |
| edge.tx_capacity | 10000 | The max size(items) of edges(uncommitted) in transaction. |
| query.page_size | 500 | The size of each page when querying by paging. |
| query.batch_size | 1000 | The size of each batch when querying by batch. |
| query.ignore_invalid_data | true | Whether to ignore invalid data of vertex or edge. |
| query.index_intersect_threshold | 1000 | The maximum number of intermediate results to intersect indexes when querying by multiple single index properties. |
| query.ramtable_edges_capacity | 20000000 | The maximum number of edges in ramtable, include OUT and IN edges. |
| query.ramtable_enable | false | Whether to enable ramtable for query of adjacent edges. |
| query.ramtable_vertices_capacity | 10000000 | The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity. |
| query.optimize_aggregate_by_index | false | Whether to optimize aggregate query(like count) by index. |
| oltp.concurrent_depth | 10 | The min depth to enable concurrent oltp algorithm. |
| oltp.concurrent_threads | 10 | Thread number to concurrently execute oltp algorithm. |
| oltp.collection_type | EC | The implementation type of collections used in oltp algorithm. |
| rate_limit.read | 0 | The max rate(times/s) to execute query of vertices/edges. |
| rate_limit.write | 0 | The max rate(items/s) to add/update/delete vertices/edges. |
| task.wait_timeout | 10 | Timeout in seconds for waiting for the task to complete, such as when truncating or clearing the backend. |
| task.input_size_limit | 16777216 | The job input size limit in bytes. |
| task.result_size_limit | 16777216 | The job result size limit in bytes. |
| task.sync_deletion | false | Whether to delete schema or expired data synchronously. |
| task.ttl_delete_batch | 1 | The batch size used to delete expired data. |
| computer.config | /conf/computer.yaml | The config file path of computer job. |
| search.text_analyzer | ikanalyzer | Choose a text analyzer for searching the vertex/edge properties, available types are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer]. If you use ‘ikanalyzer’, download the jar from https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar to the lib directory. |
| search.text_analyzer_mode | smart | Specify the mode for the text analyzer, the available modes of each analyzer are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}. |
| snowflake.datacenter_id | 0 | The datacenter id of snowflake id generator. |
| snowflake.force_string | false | Whether to force the snowflake long id to be a string. |
| snowflake.worker_id | 0 | The worker id of snowflake id generator. |
| raft.mode | false | Whether the backend storage works in raft mode. |
| raft.safe_read | false | Whether to use linearly consistent read. |
| raft.use_snapshot | false | Whether to use snapshot. |
| raft.endpoint | 127.0.0.1:8281 | The peerid of current raft node. |
| raft.group_peers | 127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283 | The peers of current raft group. |
| raft.path | ./raft-log | The log path of current raft node. |
| raft.use_replicator_pipeline | true | Whether to use the replicator pipeline; when turned on, multiple logs can be sent in parallel, and the next log doesn’t have to wait for the ack message of the current log to be sent. |
| raft.election_timeout | 10000 | Timeout in milliseconds to launch a round of election. |
| raft.snapshot_interval | 3600 | The interval in seconds to trigger snapshot save. |
| raft.backend_threads | current CPU v-cores | The thread number used to apply task to backend. |
| raft.read_index_threads | 8 | The thread number used to execute reading index. |
| raft.apply_batch | 1 | The apply batch size to trigger disruptor event handler. |
| raft.queue_size | 16384 | The disruptor buffers size for jraft RaftNode, StateMachine and LogManager. |
| raft.queue_publish_timeout | 60 | The timeout in seconds when publishing an event into disruptor. |
| raft.rpc_threads | 80 | The rpc threads for jraft RPC layer. |
| raft.rpc_connect_timeout | 5000 | The rpc connect timeout for jraft rpc. |
| raft.rpc_timeout | 60000 | The rpc timeout for jraft rpc. |
| raft.rpc_buf_low_water_mark | 10485760 | The ChannelOutboundBuffer’s low water mark of netty; when the buffer size is less than this size, the method ChannelOutboundBuffer.isWritable() will return true, which means low downstream pressure or a good network. |
| raft.rpc_buf_high_water_mark | 20971520 | The ChannelOutboundBuffer’s high water mark of netty; only when the buffer size exceeds this size will the method ChannelOutboundBuffer.isWritable() return false, which means the downstream pressure is too great to process the request or the network is very congested, and upstream needs to limit its rate at this time. |
| raft.read_strategy | ReadOnlyLeaseBased | The linearizability of read strategy. |

RocksDB Backend Config Options

| config option | default value | description |
|---|---|---|
| backend | | Must be set to rocksdb. |
| serializer | | Must be set to binary. |
| rocksdb.data_disks | [] | The optimized disks for storing data of RocksDB. The format of each element: STORE/TABLE: /path/disk. Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap] |
| rocksdb.data_path | rocksdb-data/data | The path for storing data of RocksDB. |
| rocksdb.wal_path | rocksdb-data/wal | The path for storing WAL of RocksDB. |
| rocksdb.option_path | | The YAML file for configuring ToplingDB/RocksDB parameters. |
| rocksdb.open_http | false | Whether to start ToplingDB HTTP service. Security: enable only in trusted networks and restrict access (firewall/ACL); the port and document_root are configured in the YAML (http.listening_ports/document_root). |
| rocksdb.allow_mmap_reads | false | Allow the OS to mmap file for reading sst tables. |
| rocksdb.allow_mmap_writes | false | Allow the OS to mmap file for writing. |
| rocksdb.block_cache_capacity | 8388608 | The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache. |
| rocksdb.bloom_filter_bits_per_key | -1 | The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter. |
| rocksdb.bloom_filter_block_based_mode | false | Use block based filter rather than full filter. |
| rocksdb.bloom_filter_whole_key_filtering | true | True if place whole keys in the bloom filter, else place the prefix of keys. |
| rocksdb.bottommost_compression | NO_COMPRESSION | The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.bulkload_mode | false | Switch to the mode to bulk load data into RocksDB. |
| rocksdb.cache_index_and_filter_blocks | false | Indicating if we’d put index/filter blocks to the block cache. |
| rocksdb.compaction_style | LEVEL | Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO. |
| rocksdb.compression | SNAPPY_COMPRESSION | The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.compression_per_level | [NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION] | The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
| rocksdb.delayed_write_rate | 16777216 | The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind. |
| rocksdb.log_level | INFO | The info log level of RocksDB. |
| rocksdb.max_background_jobs | 8 | Maximum number of concurrent background jobs, including flushes and compactions. |
| rocksdb.level_compaction_dynamic_level_bytes | false | Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended. |
| rocksdb.max_bytes_for_level_base | 536870912 | The upper-bound of the total size of level-1 files in bytes. |
| rocksdb.max_bytes_for_level_multiplier | 10.0 | The ratio between the total size of level (L+1) files and the total size of level L files for all L. |
| rocksdb.max_open_files | -1 | The maximum number of open files that can be cached by RocksDB, -1 means no limit. |
| rocksdb.max_subcompactions | 4 | The value represents the maximum number of threads per compaction job. |
| rocksdb.max_write_buffer_number | 6 | The maximum number of write buffers that are built up in memory. |
| rocksdb.max_write_buffer_number_to_maintain | 0 | The total maximum number of write buffers to maintain in memory. |
| rocksdb.min_write_buffer_number_to_merge | 2 | The minimum number of write buffers that will be merged together. |
| rocksdb.num_levels | 7 | Set the number of levels for this database. |
| rocksdb.optimize_filters_for_hits | false | This flag allows us to not store filters for the last level. |
| rocksdb.optimize_mode | true | Optimize for heavy workloads and big datasets. |
| rocksdb.pin_l0_filter_and_index_blocks_in_cache | false | Indicating if we’d put index/filter blocks to the block cache. |
| rocksdb.sst_path | | The path for ingesting SST file into RocksDB. |
| rocksdb.target_file_size_base | 67108864 | The target file size for compaction in bytes. |
| rocksdb.target_file_size_multiplier | 1 | The size ratio between a level L file and a level (L+1) file. |
| rocksdb.use_direct_io_for_flush_and_compaction | false | Enable the OS to use direct read/writes in flush and compaction. |
| rocksdb.use_direct_reads | false | Enable the OS to use direct I/O for reading sst tables. |
| rocksdb.write_buffer_size | 134217728 | Amount of data in bytes to build up in memory. |
| rocksdb.max_manifest_file_size | 104857600 | The max size of manifest file in bytes. |
| rocksdb.skip_stats_update_on_db_open | false | Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics. |
| rocksdb.max_file_opening_threads | 16 | The max number of threads used to open files. |
| rocksdb.max_total_wal_size | 0 | Total size of WAL files in bytes. Once WALs exceed this size, we will start forcing the flush of column families related, 0 means no limit. |
| rocksdb.db_write_buffer_size | 0 | Total size of write buffers in bytes across all column families, 0 means no limit. |
| rocksdb.delete_obsolete_files_period | 21600 | The periodicity in seconds when obsolete files get deleted, 0 means always do full purge. |
rocksdb.hard_pending_compaction_bytes_limit274877906944The hard limit to impose on pending compaction in bytes.
rocksdb.level0_file_num_compaction_trigger2Number of files to trigger level-0 compaction.
rocksdb.level0_slowdown_writes_trigger20Soft limit on number of level-0 files for slowing down writes.
rocksdb.level0_stop_writes_trigger36Hard limit on number of level-0 files for stopping writes.
rocksdb.soft_pending_compaction_bytes_limit68719476736The soft limit to impose on pending compaction in bytes.
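
As a rough illustration of how rocksdb.max_bytes_for_level_base and rocksdb.max_bytes_for_level_multiplier interact, the sketch below computes the per-level target sizes implied by the default values listed above. It assumes rocksdb.level_compaction_dynamic_level_bytes stays at false and treats the multiplier as an integer; the variable and function names are illustrative only.

```shell
# Defaults taken from the option list above.
base=536870912   # rocksdb.max_bytes_for_level_base (512 MiB)
multiplier=10    # rocksdb.max_bytes_for_level_multiplier (integer part)
num_levels=7     # rocksdb.num_levels, i.e. L0..L6

# Print the target size in bytes of level $1 (>= 1):
# base * multiplier^(level - 1).
level_target_bytes() {
  size=$base
  i=1
  while [ "$i" -lt "$1" ]; do
    size=$((size * multiplier))
    i=$((i + 1))
  done
  echo "$size"
}

# Print the target size of every level from L1 up to the last level.
level=1
while [ "$level" -lt "$num_levels" ]; do
  echo "L${level} target: $(level_target_bytes "$level") bytes"
  level=$((level + 1))
done
```

With these defaults each level is ten times larger than the one before it, so raising either option quickly increases how much data the lower levels are expected to hold.
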
K8s Config Options (Optional)

Corresponding configuration file rest-server.properties

| config option | default value | description |
| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. |
| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. |
| k8s.kubeconfig | | Path to kubeconfig file. |
Arthas Diagnostic Config Options (Optional)

Corresponding configuration file rest-server.properties

| config option | default value | description |
| arthas.telnetPort | 8562 | Arthas telnet port. |
| arthas.httpPort | 8561 | Arthas HTTP port. |
| arthas.ip | 0.0.0.0 | Arthas bind IP. |
HBase Backend Config Options
| config option | default value | description |
| backend | | Must be set to hbase. |
| serializer | | Must be set to hbase. |
| hbase.hosts | localhost | The hostnames or IP addresses of HBase ZooKeeper, separated by commas. |
| hbase.port | 2181 | The port of HBase ZooKeeper. |
| hbase.threads_max | 64 | The maximum number of threads for HBase connections. |
| hbase.znode_parent | /hbase | The znode parent path of HBase ZooKeeper. |
| hbase.zk_retry | 3 | The number of recovery retries for HBase ZooKeeper. |
| hbase.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
| hbase.kerberos_enable | false | Whether Kerberos authentication is enabled for HBase. |
| hbase.kerberos_keytab | | The HBase keytab file for Kerberos authentication. |
| hbase.kerberos_principal | | The HBase principal for Kerberos authentication. |
| hbase.krb5_conf | etc/krb5.conf | Kerberos configuration file, including KDC IP, default realm, etc. |
| hbase.hbase_site | /etc/hbase/conf/hbase-site.xml | The HBase configuration file. |
| hbase.enable_partition | true | Whether pre-split partitions are enabled for HBase. |
| hbase.vertex_partitions | 10 | The number of partitions of the HBase vertex table. |
| hbase.edge_partitions | 30 | The number of partitions of the HBase edge table. |

≤ 1.5 Version Config (Legacy)

The following backend stores are no longer supported in version 1.7.0+ and are only available in version 1.5.x and earlier:

Cassandra Backend Config Options
| config option | default value | description |
| backend | | Must be set to cassandra. |
| serializer | | Must be set to cassandra. |
| cassandra.host | localhost | The seed hostname or IP address of the Cassandra cluster. |
| cassandra.port | 9042 | The seed port of the Cassandra cluster. |
| cassandra.connect_timeout | 5 | The Cassandra driver's timeout, in seconds, for connecting to the server. |
| cassandra.read_timeout | 20 | The Cassandra driver's timeout, in seconds, for reading from the server. |
| cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of the keyspace; valid values are SimpleStrategy or NetworkTopologyStrategy. |
| cassandra.keyspace.replication | [3] | The keyspace replication factor for SimpleStrategy, like ‘[3]’, or the replicas in each datacenter for NetworkTopologyStrategy, like ‘[dc1:2,dc2:1]’. |
| cassandra.username | | The username used to log in to the Cassandra cluster. |
| cassandra.password | | The password corresponding to cassandra.username. |
| cassandra.compression_type | none | The compression algorithm of the Cassandra transport: none/snappy/lz4. |
| cassandra.jmx_port | 7199 | The port of the JMX API service for Cassandra. |
| cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
ScyllaDB Backend Config Options
| config option | default value | description |
| backend | | Must be set to scylladb. |
| serializer | | Must be set to scylladb. |

Other options are consistent with the Cassandra backend.

MySQL & PostgreSQL Backend Config Options
| config option | default value | description |
| backend | | Must be set to mysql. |
| serializer | | Must be set to mysql. |
| jdbc.driver | com.mysql.jdbc.Driver | The JDBC driver class used to connect to the database. |
| jdbc.url | jdbc:mysql://127.0.0.1:3306 | The URL of the database in JDBC format. |
| jdbc.username | root | The username used to log in to the database. |
| jdbc.password | ****** | The password corresponding to jdbc.username. |
| jdbc.ssl_mode | false | The SSL mode of connections to the database. |
| jdbc.reconnect_interval | 3 | The interval, in seconds, between reconnection attempts when the database connection fails. |
| jdbc.reconnect_max_times | 3 | The number of reconnection attempts when the database connection fails. |
| jdbc.storage_engine | InnoDB | The storage engine of the backend store database, like InnoDB/MyISAM/RocksDB for MySQL. |
| jdbc.postgresql.connect_database | template1 | The database to connect to when initializing, dropping, or checking the existence of the store. |
PostgreSQL Backend Config Options
| config option | default value | description |
| backend | | Must be set to postgresql. |
| serializer | | Must be set to postgresql. |

Other options are consistent with the MySQL backend.

The driver and url of the PostgreSQL backend should be set to:

  • jdbc.driver=org.postgresql.Driver
  • jdbc.url=jdbc:postgresql://localhost:5432/

3 - Built-in User Authentication and Authorization Configuration and Usage in HugeGraph

Overview

To support authentication in different user scenarios, HugeGraph provides a built-in authentication mode, StandardAuthenticator, which supports multi-user authentication and fine-grained access control. It adopts a four-layer “User-UserGroup-Operation-Resource” design to flexibly control user roles and permissions (multiple GraphServers are supported).

Some key designs of the StandardAuthenticator mode include:

  • During initialization, a super administrator (admin) user is created. Subsequently, other users can be created by the super administrator. Once newly created users are assigned sufficient permissions, they can create or manage more users.
  • It supports dynamic creation of users, user groups, and resources, as well as dynamic allocation or revocation of permissions.
  • Users can belong to one or multiple user groups. Each user group can have permissions to operate on any number of resources. The types of operations include read, write, delete, execute, and others.
  • “Resource” describes the data in the graph database, such as vertices that meet certain criteria. Each resource consists of three elements: type, label, and properties. There are 18 types in total, with the ability to combine any label and properties. The internal condition of a resource is an AND relationship, while the condition between multiple resources is an OR relationship.

Here is an example to illustrate:

// Scenario: A user only has data read permission for the Beijing area
user(name=xx) -belong-> group(name=xx) -access(read)-> target(graph=graph1, resource={label: person, city: Beijing})

Configure User Authentication

By default, HugeGraph does not enable user authentication; it must be enabled by modifying the configuration files. (Note: in a production environment or when the server is exposed to the internet, use Java 11 and enable the auth system to avoid security risks.)

HugeGraph provides a built-in authentication mode, StandardAuthenticator, which supports multi-user authentication and fine-grained permission control. Developers can also implement the HugeAuthenticator interface themselves to integrate with an existing authentication system.

HugeGraph authentication modes adopt HTTP Basic Authentication. In simple terms, when sending an HTTP request, set the Authorization header to the Basic scheme and provide the corresponding username and password. On the wire, the credentials are sent as the Base64 encoding of username:password; the example below shows them unencoded for readability:

GET http://localhost:8080/graphs/hugegraph/schema/vertexlabels
Authorization: Basic admin xxxx
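
As a minimal sketch of what a client actually sends, the snippet below builds the Basic header value by Base64-encoding username:password. The credentials are the placeholder values from the example above, and the variable names are illustrative; the curl calls are shown only as comments because they need a running server.

```shell
# Placeholder credentials, matching the example above.
HG_USER="admin"
HG_PASS="xxxx"

# HTTP Basic credentials are base64("username:password").
BASIC_TOKEN=$(printf '%s:%s' "$HG_USER" "$HG_PASS" | base64)
AUTH_HEADER="Authorization: Basic ${BASIC_TOKEN}"
echo "$AUTH_HEADER"

# Against a running server, either form should send the same header:
#   curl -H "$AUTH_HEADER" http://localhost:8080/graphs/hugegraph/schema/vertexlabels
#   curl -u "$HG_USER:$HG_PASS" http://localhost:8080/graphs/hugegraph/schema/vertexlabels
```

Because Base64 is an encoding rather than encryption, the password is effectively sent in clear text unless the connection uses HTTPS.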

Warning: Versions of HugeGraph-Server prior to 1.5.0 have a JWT-related security vulnerability in the Auth mode. Users are advised to upgrade to a newer version or to set the JWT token’s secret key manually, using the auth.token_secret option in the rest-server.properties file:

# should be a 32-character string consisting of A-Z, a-z and 0-9
auth.token_secret=XXXX

You can also generate it with the following command:

RANDOM_STRING=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32)
echo "auth.token_secret=${RANDOM_STRING}" >> rest-server.properties

StandardAuthenticator Mode

The StandardAuthenticator mode supports user authentication and permission control by storing user information in the database backend. This implementation authenticates users based on their names and passwords (encrypted) stored in the database and controls user permissions based on their roles. Below is the specific configuration process (requires service restart):

Configure the authenticator and its rest-server file path in the gremlin-server.yaml configuration file:

authentication: {
  authenticator: org.apache.hugegraph.auth.StandardAuthenticator,
  authenticationHandler: org.apache.hugegraph.auth.WsAndHttpBasicAuthHandler,
  config: {tokens: conf/rest-server.properties}
}

Configure the authenticator and graph_store information in the rest-server.properties configuration file:

auth.authenticator=org.apache.hugegraph.auth.StandardAuthenticator
auth.graph_store=hugegraph

# Auth Client Config
# If GraphServer and AuthServer are deployed separately, you also need to specify the following configuration. Fill in the IP:RPC port of AuthServer.
# auth.remote_url=127.0.0.1:8899,127.0.0.1:8898,127.0.0.1:8897

In the above configuration, the graph_store option specifies which graph to use for storing user information. If there are multiple graphs, you can choose any of them.

In the hugegraph{n}.properties configuration file, configure the gremlin.graph information:

gremlin.graph=org.apache.hugegraph.auth.HugeFactoryAuthProxy

For detailed API calls and explanations regarding permissions, please refer to the Authentication-API documentation.
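
Since the three settings live in three different files, a quick sanity check before restarting can be helpful. The sketch below is one hypothetical way to do that: check_auth_config is not part of HugeGraph, and the location of the graph properties file is passed as an argument because it differs between deployments.

```shell
# check_auth_config CONF_DIR GRAPH_PROPERTIES_FILE
# Returns 0 only if the settings described above are present and uncommented.
check_auth_config() {
  conf_dir="$1"
  graph_props="$2"

  grep -Eq '^[[:space:]]*authenticator:[[:space:]]*org\.apache\.hugegraph\.auth\.StandardAuthenticator' \
    "$conf_dir/gremlin-server.yaml" \
    || { echo "gremlin-server.yaml: authenticator is not set" >&2; return 1; }

  grep -Eq '^auth\.authenticator=org\.apache\.hugegraph\.auth\.StandardAuthenticator' \
    "$conf_dir/rest-server.properties" \
    || { echo "rest-server.properties: auth.authenticator is not set" >&2; return 1; }

  grep -Eq '^auth\.graph_store=.+' "$conf_dir/rest-server.properties" \
    || { echo "rest-server.properties: auth.graph_store is not set" >&2; return 1; }

  grep -Eq '^gremlin\.graph=org\.apache\.hugegraph\.auth\.HugeFactoryAuthProxy' \
    "$graph_props" \
    || { echo "$graph_props: gremlin.graph is not HugeFactoryAuthProxy" >&2; return 1; }

  echo "auth configuration looks complete"
}

# Example call (paths are assumptions about the local layout):
#   check_auth_config conf conf/hugegraph.properties
```

The check only looks for the expected lines; it does not validate that the server accepts the configuration.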

Custom User Authentication System

If you need to support a more flexible user system, you can customize the authenticator for extension. Simply implement the org.apache.hugegraph.auth.HugeAuthenticator interface with your custom authenticator, and then modify the authenticator configuration item in the configuration file to point to your implementation.

Switching authentication mode

After the authentication configuration is complete, enter the admin password on the command line when running init-store.sh for the first time (non-Docker mode).

If HugeGraph is deployed from a Docker image, or has already been initialized and needs to be switched to authentication mode, the relevant graph data must be deleted and HugeGraph restarted. If the graph already contains business data, the authentication mode cannot currently be switched directly (version <= 1.2.0).

Improvements for this feature have been included in the latest release (available in the latest docker image), please refer to PR 2411. Seamless switching is now available.

# stop HugeGraph first
bin/stop-hugegraph.sh

# delete the store data (here we use the default path for rocksdb)
# there is no need to delete in the latest version (fixed in https://github.com/apache/hugegraph/pull/2411)
rm -rf rocksdb-data/

# init store again
bin/init-store.sh

# start HugeGraph again
bin/start-hugegraph.sh

Use docker to enable authentication mode

For versions of the hugegraph/hugegraph image equal to or greater than 1.2.0, you can enable authentication mode while starting the Docker image.

The steps are as follows:

1. Use docker run

To enable authentication mode, add the environment variable PASSWORD=xxx (you can freely set the password) in the docker run command:

docker run -itd -e PASSWORD=xxx --name=server -p 8080:8080 hugegraph/hugegraph:1.5.0

2. Use docker-compose

Use docker-compose and set the environment variable PASSWORD=xxx:

version: '3'
services:
  server:
    image: hugegraph/hugegraph:1.5.0
    container_name: server
    ports:
      - 8080:8080
    environment:
      - PASSWORD=xxx

3. Enter the container to enable authentication mode

Enter the container first:

docker exec -it server bash
# Quickly modify the config; the modified files are saved in the conf-bak folder
bin/enable-auth.sh

Then follow the steps in the “Switching authentication mode” section above.

4 - Configuring HugeGraphServer to Use HTTPS Protocol

Overview

By default, HugeGraphServer uses the HTTP protocol. However, if you have security requirements for your requests, you can configure it to use HTTPS.

Server Configuration

Modify the conf/rest-server.properties configuration file and change the scheme part of restserver.url to https.

# Set the protocol to HTTPS
restserver.url=https://127.0.0.1:8080
# Server keystore file path. This default takes effect automatically when HTTPS is used, and you can modify it as needed.
ssl.keystore_file=conf/hugegraph-server.keystore
# Server keystore file password. This default takes effect automatically when HTTPS is used, and you can modify it as needed.
ssl.keystore_password=******

The server’s conf directory already includes a keystore file named hugegraph-server.keystore, and the password for this file is hugegraph. These are the default values when enabling the HTTPS protocol. Users can generate their own keystore file and password, and then modify the values of ssl.keystore_file and ssl.keystore_password.
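
If you prefer to script these changes rather than edit the file by hand, a small helper along the following lines can set a single key in a properties file. This is only a sketch: set_prop is a hypothetical helper rather than a HugeGraph tool, it moves the updated key to the end of the file, and the keystore path in the example calls is a placeholder.

```shell
# set_prop FILE KEY VALUE
# Removes any existing uncommented "KEY=..." line, then appends KEY=VALUE.
set_prop() {
  file="$1"
  key="$2"
  value="$3"
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with exactly "KEY=".
  awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file's permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example calls (run from the server directory):
#   set_prop conf/rest-server.properties restserver.url https://127.0.0.1:8080
#   set_prop conf/rest-server.properties ssl.keystore_file conf/server.keystore
```

A restart of the server is presumably needed afterwards for the new values to be picked up.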

Client Configuration

Using HTTPS in HugeGraph-Client

When constructing a HugeClient, pass the HTTPS-related configurations. Here’s an example in Java:

String url = "https://localhost:8080";
String graphName = "hugegraph";
HugeClientBuilder builder = HugeClient.builder(url, graphName);
// Client truststore file path
String trustStoreFilePath = "hugegraph.truststore";
// Client truststore password
String trustStorePassword = "******";
builder.configSSL(trustStoreFilePath, trustStorePassword);
HugeClient hugeClient = builder.build();

Note: Before version 1.9.0, HugeGraph-Client was created directly using the new keyword and did not support the HTTPS protocol. Starting from version 1.9.0, it changed to use the builder pattern and supports configuring the HTTPS protocol.

Using HTTPS in HugeGraph-Loader

When starting an import task, add the following options in the command line:

# HTTPS
--protocol https
# Client certificate file path. When specifying --protocol as https, the default value conf/hugegraph.truststore is automatically used, and you can modify it as needed.
--trust-store-file {file}
# Client certificate file password. When specifying --protocol as https, the default value hugegraph is automatically used, and you can modify it as needed.
--trust-store-password {password}

Under the conf directory of hugegraph-loader, there is already a default client certificate file named hugegraph.truststore, and its password is hugegraph.

Using HTTPS in HugeGraph-Tools

When executing commands, add the following options in the command line:

# Client certificate file path. When using the HTTPS protocol in the URL, the default value conf/hugegraph.truststore is automatically used, and you can modify it as needed.
--trust-store-file {file}
# Client certificate file password. When using the HTTPS protocol in the URL, the default value hugegraph is automatically used, and you can modify it as needed.
--trust-store-password {password}
# When executing migration commands and using the --target-url with the HTTPS protocol, the default value conf/hugegraph.truststore is automatically used, and you can modify it as needed.
--target-trust-store-file {target-file}
# When executing migration commands and using the --target-url with the HTTPS protocol, the default value hugegraph is automatically used, and you can modify it as needed.
--target-trust-store-password {target-password}

Under the conf directory of hugegraph-tools, there is already a default client certificate file named hugegraph.truststore, and its password is hugegraph.

How to Generate Certificate Files

This section provides an example of generating certificates. If the default certificate is sufficient or if you already know how to generate certificates, you can skip this section.

Server

  1. Generate the server’s private key and import it into the server’s keystore file. The server.keystore is for the server’s use and contains its private key.
keytool -genkey -alias serverkey -keyalg RSA -keystore server.keystore

During the process, fill in the description information according to your requirements. The description information for the default certificate is as follows:

First and Last Name: hugegraph
Organizational Unit Name: hugegraph
Organization Name: hugegraph
City or Locality Name: BJ
State or Province Name: BJ
Country Code: CN
  2. Export the server certificate based on the server’s private key.
keytool -export -alias serverkey -keystore server.keystore -file server.crt

server.crt is the server’s certificate.

Client

  1. Import the server’s certificate into the client’s truststore file.
keytool -import -alias serverkey -file server.crt -keystore client.truststore

client.truststore is for the client’s use and contains the trusted certificate.
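
The keytool commands above prompt interactively for passwords and description fields. As a minimal sketch, assuming a JDK's keytool is on the PATH, the same steps can be run non-interactively. The function name generate_certs, the password changeit, and the output directory argument are placeholders rather than project defaults; the newer command names -genkeypair, -exportcert, and -importcert are used in place of -genkey, -export, and -import.

```shell
# Placeholder password for illustration; keytool requires at least 6 characters.
STORE_PASS="changeit"
# Same description fields as the default certificate shown above.
DNAME="CN=hugegraph, OU=hugegraph, O=hugegraph, L=BJ, ST=BJ, C=CN"

# generate_certs OUT_DIR
# Writes server.keystore, server.crt and client.truststore into OUT_DIR.
generate_certs() {
  out_dir="$1"
  command -v keytool >/dev/null 2>&1 \
    || { echo "keytool not found (a JDK is required)" >&2; return 1; }

  # Generate the server key pair inside server.keystore.
  keytool -genkeypair -alias serverkey -keyalg RSA -keysize 2048 -validity 365 \
    -dname "$DNAME" -keystore "$out_dir/server.keystore" \
    -storepass "$STORE_PASS" -keypass "$STORE_PASS" || return 1

  # Export the server certificate from the keystore.
  keytool -exportcert -alias serverkey -keystore "$out_dir/server.keystore" \
    -storepass "$STORE_PASS" -file "$out_dir/server.crt" || return 1

  # Import the certificate into the client truststore without prompting.
  keytool -importcert -noprompt -alias serverkey -file "$out_dir/server.crt" \
    -keystore "$out_dir/client.truststore" -storepass "$STORE_PASS" || return 1
}

# Example call:
#   generate_certs ./certs
```

The resulting server.keystore and its password would then go into ssl.keystore_file and ssl.keystore_password, while client.truststore is what the client-side trust-store options point to.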