HugeGraph 配置项
Gremlin Server 配置项
对应配置文件gremlin-server.yaml
config option | default value | description |
---|---|---|
host | 127.0.0.1 | The host or ip of Gremlin Server. |
port | 8182 | The listening port of Gremlin Server. |
graphs | hugegraph: conf/hugegraph.properties | The map of graphs with name and config file path. |
scriptEvaluationTimeout | 30000 | The timeout for gremlin script execution(millisecond). |
channelizer | org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer | Indicates the protocol which the Gremlin Server provides service. |
authentication | authenticator: org.apache.hugegraph.auth.StandardAuthenticator, config: {tokens: conf/rest-server.properties} | The authenticator and config(contains tokens path) of authentication mechanism. |
Rest Server & API 配置项
对应配置文件rest-server.properties
config option | default value | description |
---|---|---|
graphs | [hugegraph:conf/hugegraph.properties] | The map of graphs’ name and config file. |
server.id | server-1 | The id of rest server, used for license verification. |
server.role | master | The role of nodes in the cluster, available types are [master, worker, computer] |
restserver.url | http://127.0.0.1:8080 | The url for listening of rest server. |
ssl.keystore_file | server.keystore | The path of server keystore file used when https protocol is enabled. |
ssl.keystore_password | The password of the path of the server keystore file used when the https protocol is enabled. | |
restserver.max_worker_threads | 2 * CPUs | The maximum worker threads of rest server. |
restserver.min_free_memory | 64 | The minimum free memory(MB) of rest server, requests will be rejected when the available memory of system is lower than this value. |
restserver.request_timeout | 30 | The time in seconds within which a request must complete, -1 means no timeout. |
restserver.connection_idle_timeout | 30 | The time in seconds to keep an inactive connection alive, -1 means no timeout. |
restserver.connection_max_requests | 256 | The max number of HTTP requests allowed to be processed on one keep-alive connection, -1 means unlimited. |
gremlinserver.url | http://127.0.0.1:8182 | The url of gremlin server. |
gremlinserver.max_route | 8 | The max route number for gremlin server. |
gremlinserver.timeout | 30 | The timeout in seconds of waiting for gremlin server. |
batch.max_edges_per_batch | 500 | The maximum number of edges submitted per batch. |
batch.max_vertices_per_batch | 500 | The maximum number of vertices submitted per batch. |
batch.max_write_ratio | 50 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. |
batch.max_write_threads | 0 | The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads. |
auth.authenticator | The class path of authenticator implementation. e.g., org.apache.hugegraph.auth.StandardAuthenticator, or org.apache.hugegraph.auth.ConfigAuthenticator. | |
auth.admin_token | 162f7848-0b6d-4faf-b557-3a0797869c55 | Token for administrator operations, only for org.apache.hugegraph.auth.ConfigAuthenticator. |
auth.graph_store | hugegraph | The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator. |
auth.user_tokens | [hugegraph:9fd95c9c-711b-415b-b85f-d4df46ba5c31] | The map of user tokens with name and password, only for org.apache.hugegraph.auth.ConfigAuthenticator. |
auth.audit_log_rate | 1000.0 | The max rate of audit log output per user, default value is 1000 records per second. |
auth.cache_capacity | 10240 | The max cache capacity of each auth cache item. |
auth.cache_expire | 600 | The expiration time in seconds of vertex cache. |
auth.remote_url | If the address is empty, it provide auth service, otherwise it is auth client and also provide auth service through rpc forwarding. The remote url can be set to multiple addresses, which are concat by ‘,’. | |
auth.token_expire | 86400 | The expiration time in seconds after token created |
auth.token_secret | FXQXbJtbCLxODc6tGci732pkH1cyf8Qg | Secret key of HS256 algorithm. |
exception.allow_trace | false | Whether to allow exception trace stack. |
memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring , 1 means disabling this function. |
memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. |
基本配置项
基本配置项及后端配置项对应配置文件:{graph-name}.properties,如hugegraph.properties
config option | default value | description |
---|---|---|
gremlin.graph | org.apache.hugegraph.HugeFactory | Gremlin entrance to create graph. |
backend | rocksdb | The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql]. |
serializer | binary | The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql]. |
store | hugegraph | The database name like Cassandra Keyspace. |
store.connection_detect_interval | 600 | The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time. |
store.graph | g | The graph table name, which store vertex, edge and property. |
store.schema | m | The schema table name, which store meta data. |
store.system | s | The system table name, which store system data. |
schema.illegal_name_regex | .\s+$|~. | The regex specified the illegal format for schema name. |
schema.cache_capacity | 10000 | The max cache size(items) of schema cache. |
vertex.cache_type | l2 | The type of vertex cache, allowed values are [l1, l2]. |
vertex.cache_capacity | 10000000 | The max cache size(items) of vertex cache. |
vertex.cache_expire | 600 | The expire time in seconds of vertex cache. |
vertex.check_customized_id_exist | false | Whether to check the vertices exist for those using customized id strategy. |
vertex.default_label | vertex | The default vertex label. |
vertex.tx_capacity | 10000 | The max size(items) of vertices(uncommitted) in transaction. |
vertex.check_adjacent_vertex_exist | false | Whether to check the adjacent vertices of edges exist. |
vertex.lazy_load_adjacent_vertex | true | Whether to lazy load adjacent vertices of edges. |
vertex.part_edge_commit_size | 5000 | Whether to enable the mode to commit part of edges of vertex, enabled if commit size > 0, 0 means disabled. |
vertex.encode_primary_key_number | true | Whether to encode number value of primary key in vertex id. |
vertex.remove_left_index_at_overwrite | false | Whether remove left index at overwrite. |
edge.cache_type | l2 | The type of edge cache, allowed values are [l1, l2]. |
edge.cache_capacity | 1000000 | The max cache size(items) of edge cache. |
edge.cache_expire | 600 | The expiration time in seconds of edge cache. |
edge.tx_capacity | 10000 | The max size(items) of edges(uncommitted) in transaction. |
query.page_size | 500 | The size of each page when querying by paging. |
query.batch_size | 1000 | The size of each batch when querying by batch. |
query.ignore_invalid_data | true | Whether to ignore invalid data of vertex or edge. |
query.index_intersect_threshold | 1000 | The maximum number of intermediate results to intersect indexes when querying by multiple single index properties. |
query.ramtable_edges_capacity | 20000000 | The maximum number of edges in ramtable, include OUT and IN edges. |
query.ramtable_enable | false | Whether to enable ramtable for query of adjacent edges. |
query.ramtable_vertices_capacity | 10000000 | The maximum number of vertices in ramtable, generally the largest vertex id is used as capacity. |
query.optimize_aggregate_by_index | false | Whether to optimize aggregate query(like count) by index. |
oltp.concurrent_depth | 10 | The min depth to enable concurrent oltp algorithm. |
oltp.concurrent_threads | 10 | Thread number to concurrently execute oltp algorithm. |
oltp.collection_type | EC | The implementation type of collections used in oltp algorithm. |
rate_limit.read | 0 | The max rate(times/s) to execute query of vertices/edges. |
rate_limit.write | 0 | The max rate(items/s) to add/update/delete vertices/edges. |
task.wait_timeout | 10 | Timeout in seconds for waiting for the task to complete,such as when truncating or clearing the backend. |
task.input_size_limit | 16777216 | The job input size limit in bytes. |
task.result_size_limit | 16777216 | The job result size limit in bytes. |
task.sync_deletion | false | Whether to delete schema or expired data synchronously. |
task.ttl_delete_batch | 1 | The batch size used to delete expired data. |
computer.config | /conf/computer.yaml | The config file path of computer job. |
search.text_analyzer | ikanalyzer | Choose a text analyzer for searching the vertex/edge properties, available type are [word, ansj, hanlp, smartcn, jieba, jcseg, mmseg4j, ikanalyzer]. # if use ‘ikanalyzer’, need download jar from ‘https://github.com/apache/hugegraph-doc/raw/ik_binary/dist/server/ikanalyzer-2012_u6.jar' to lib directory |
search.text_analyzer_mode | smart | Specify the mode for the text analyzer, the available mode of analyzer are {word: [MaximumMatching, ReverseMaximumMatching, MinimumMatching, ReverseMinimumMatching, BidirectionalMaximumMatching, BidirectionalMinimumMatching, BidirectionalMaximumMinimumMatching, FullSegmentation, MinimalWordCount, MaxNgramScore, PureEnglish], ansj: [BaseAnalysis, IndexAnalysis, ToAnalysis, NlpAnalysis], hanlp: [standard, nlp, index, nShort, shortest, speed], smartcn: [], jieba: [SEARCH, INDEX], jcseg: [Simple, Complex], mmseg4j: [Simple, Complex, MaxWord], ikanalyzer: [smart, max_word]}. |
snowflake.datecenter_id | 0 | The datacenter id of snowflake id generator. |
snowflake.force_string | false | Whether to force the snowflake long id to be a string. |
snowflake.worker_id | 0 | The worker id of snowflake id generator. |
raft.mode | false | Whether the backend storage works in raft mode. |
raft.safe_read | false | Whether to use linearly consistent read. |
raft.use_snapshot | false | Whether to use snapshot. |
raft.endpoint | 127.0.0.1:8281 | The peerid of current raft node. |
raft.group_peers | 127.0.0.1:8281,127.0.0.1:8282,127.0.0.1:8283 | The peers of current raft group. |
raft.path | ./raft-log | The log path of current raft node. |
raft.use_replicator_pipeline | true | Whether to use replicator line, when turned on it multiple logs can be sent in parallel, and the next log doesn’t have to wait for the ack message of the current log to be sent. |
raft.election_timeout | 10000 | Timeout in milliseconds to launch a round of election. |
raft.snapshot_interval | 3600 | The interval in seconds to trigger snapshot save. |
raft.backend_threads | current CPU v-cores | The thread number used to apply task to backend. |
raft.read_index_threads | 8 | The thread number used to execute reading index. |
raft.apply_batch | 1 | The apply batch size to trigger disruptor event handler. |
raft.queue_size | 16384 | The disruptor buffers size for jraft RaftNode, StateMachine and LogManager. |
raft.queue_publish_timeout | 60 | The timeout in second when publish event into disruptor. |
raft.rpc_threads | 80 | The rpc threads for jraft RPC layer. |
raft.rpc_connect_timeout | 5000 | The rpc connect timeout for jraft rpc. |
raft.rpc_timeout | 60000 | The rpc timeout for jraft rpc. |
raft.rpc_buf_low_water_mark | 10485760 | The ChannelOutboundBuffer’s low water mark of netty, when buffer size less than this size, the method ChannelOutboundBuffer.isWritable() will return true, it means that low downstream pressure or good network. |
raft.rpc_buf_high_water_mark | 20971520 | The ChannelOutboundBuffer’s high water mark of netty, only when buffer size exceed this size, the method ChannelOutboundBuffer.isWritable() will return false, it means that the downstream pressure is too great to process the request or network is very congestion, upstream needs to limit rate at this time. |
raft.read_strategy | ReadOnlyLeaseBased | The linearizability of read strategy. |
RPC server 配置
config option | default value | description |
---|---|---|
rpc.client_connect_timeout | 20 | The timeout(in seconds) of rpc client connect to rpc server. |
rpc.client_load_balancer | consistentHash | The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is ‘consistentHash’, means forwarding by request parameters. |
rpc.client_read_timeout | 40 | The timeout(in seconds) of rpc client read from rpc server. |
rpc.client_reconnect_period | 10 | The period(in seconds) of rpc client reconnect to rpc server. |
rpc.client_retries | 3 | Failed retry number of rpc client calls to rpc server. |
rpc.config_order | 999 | Sofa rpc configuration file loading order, the larger the more later loading. |
rpc.logger_impl | com.alipay.sofa.rpc.log.SLF4JLoggerImpl | Sofa rpc log implementation class. |
rpc.protocol | bolt | Rpc communication protocol, client and server need to be specified the same value. |
rpc.remote_url | The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ‘,’, empty value means not enabled. | |
rpc.server_adaptive_port | false | Whether the bound port is adaptive, if it’s enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts. |
rpc.server_host | The hosts/ips bound by rpc server to provide services, empty value means not enabled. | |
rpc.server_port | 8090 | The port bound by rpc server to provide services. |
rpc.server_timeout | 30 | The timeout(in seconds) of rpc server execution. |
Cassandra 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to cassandra . | |
serializer | Must be set to cassandra . | |
cassandra.host | localhost | The seeds hostname or ip address of cassandra cluster. |
cassandra.port | 9042 | The seeds port address of cassandra cluster. |
cassandra.connect_timeout | 5 | The cassandra driver connect server timeout(seconds). |
cassandra.read_timeout | 20 | The cassandra driver read from server timeout(seconds). |
cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. |
cassandra.keyspace.replication | [3] | The keyspace replication factor of SimpleStrategy, like ‘[3]’.Or replicas in each datacenter of NetworkTopologyStrategy, like ‘[dc1:2,dc2:1]’. |
cassandra.username | The username to use to login to cassandra cluster. | |
cassandra.password | The password corresponding to cassandra.username. | |
cassandra.compression_type | none | The compression algorithm of cassandra transport: none/snappy/lz4. |
cassandra.jmx_port=7199 | 7199 | The port of JMX API service for cassandra. |
cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
ScyllaDB 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to scylladb . | |
serializer | Must be set to scylladb . |
其它与 Cassandra 后端一致。
RocksDB 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to rocksdb . | |
serializer | Must be set to binary . | |
rocksdb.data_disks | [] | The optimized disks for storing data of RocksDB. The format of each element: STORE/TABLE: /path/disk .Allowed keys are [g/vertex, g/edge_out, g/edge_in, g/vertex_label_index, g/edge_label_index, g/range_int_index, g/range_float_index, g/range_long_index, g/range_double_index, g/secondary_index, g/search_index, g/shard_index, g/unique_index, g/olap] |
rocksdb.data_path | rocksdb-data/data | The path for storing data of RocksDB. |
rocksdb.wal_path | rocksdb-data/wal | The path for storing WAL of RocksDB. |
rocksdb.allow_mmap_reads | false | Allow the OS to mmap file for reading sst tables. |
rocksdb.allow_mmap_writes | false | Allow the OS to mmap file for writing. |
rocksdb.block_cache_capacity | 8388608 | The amount of block cache in bytes that will be used by RocksDB, 0 means no block cache. |
rocksdb.bloom_filter_bits_per_key | -1 | The bits per key in bloom filter, a good value is 10, which yields a filter with ~ 1% false positive rate, -1 means no bloom filter. |
rocksdb.bloom_filter_block_based_mode | false | Use block based filter rather than full filter. |
rocksdb.bloom_filter_whole_key_filtering | true | True if place whole keys in the bloom filter, else place the prefix of keys. |
rocksdb.bottommost_compression | NO_COMPRESSION | The compression algorithm for the bottommost level of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
rocksdb.bulkload_mode | false | Switch to the mode to bulk load data into RocksDB. |
rocksdb.cache_index_and_filter_blocks | false | Indicating if we’d put index/filter blocks to the block cache. |
rocksdb.compaction_style | LEVEL | Set compaction style for RocksDB: LEVEL/UNIVERSAL/FIFO. |
rocksdb.compression | SNAPPY_COMPRESSION | The compression algorithm for compressing blocks of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
rocksdb.compression_per_level | [NO_COMPRESSION, NO_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION, SNAPPY_COMPRESSION] | The compression algorithms for different levels of RocksDB, allowed values are none/snappy/z/bzip2/lz4/lz4hc/xpress/zstd. |
rocksdb.delayed_write_rate | 16777216 | The rate limit in bytes/s of user write requests when need to slow down if the compaction gets behind. |
rocksdb.log_level | INFO | The info log level of RocksDB. |
rocksdb.max_background_jobs | 8 | Maximum number of concurrent background jobs, including flushes and compactions. |
rocksdb.level_compaction_dynamic_level_bytes | false | Whether to enable level_compaction_dynamic_level_bytes, if it’s enabled we give max_bytes_for_level_multiplier a priority against max_bytes_for_level_base, the bytes of base level is dynamic for a more predictable LSM tree, it is useful to limit worse case space amplification. Turning this feature on/off for an existing DB can cause unexpected LSM tree structure so it’s not recommended. |
rocksdb.max_bytes_for_level_base | 536870912 | The upper-bound of the total size of level-1 files in bytes. |
rocksdb.max_bytes_for_level_multiplier | 10.0 | The ratio between the total size of level (L+1) files and the total size of level L files for all L. |
rocksdb.max_open_files | -1 | The maximum number of open files that can be cached by RocksDB, -1 means no limit. |
rocksdb.max_subcompactions | 4 | The value represents the maximum number of threads per compaction job. |
rocksdb.max_write_buffer_number | 6 | The maximum number of write buffers that are built up in memory. |
rocksdb.max_write_buffer_number_to_maintain | 0 | The total maximum number of write buffers to maintain in memory. |
rocksdb.min_write_buffer_number_to_merge | 2 | The minimum number of write buffers that will be merged together. |
rocksdb.num_levels | 7 | Set the number of levels for this database. |
rocksdb.optimize_filters_for_hits | false | This flag allows us to not store filters for the last level. |
rocksdb.optimize_mode | true | Optimize for heavy workloads and big datasets. |
rocksdb.pin_l0_filter_and_index_blocks_in_cache | false | Indicating if we’d put index/filter blocks to the block cache. |
rocksdb.sst_path | The path for ingesting SST file into RocksDB. | |
rocksdb.target_file_size_base | 67108864 | The target file size for compaction in bytes. |
rocksdb.target_file_size_multiplier | 1 | The size ratio between a level L file and a level (L+1) file. |
rocksdb.use_direct_io_for_flush_and_compaction | false | Enable the OS to use direct read/writes in flush and compaction. |
rocksdb.use_direct_reads | false | Enable the OS to use direct I/O for reading sst tables. |
rocksdb.write_buffer_size | 134217728 | Amount of data in bytes to build up in memory. |
rocksdb.max_manifest_file_size | 104857600 | The max size of manifest file in bytes. |
rocksdb.skip_stats_update_on_db_open | false | Whether to skip statistics update when opening the database, setting this flag true allows us to not update statistics. |
rocksdb.max_file_opening_threads | 16 | The max number of threads used to open files. |
rocksdb.max_total_wal_size | 0 | Total size of WAL files in bytes. Once WALs exceed this size, we will start forcing the flush of column families related, 0 means no limit. |
rocksdb.db_write_buffer_size | 0 | Total size of write buffers in bytes across all column families, 0 means no limit. |
rocksdb.delete_obsolete_files_period | 21600 | The periodicity in seconds when obsolete files get deleted, 0 means always do full purge. |
rocksdb.hard_pending_compaction_bytes_limit | 274877906944 | The hard limit to impose on pending compaction in bytes. |
rocksdb.level0_file_num_compaction_trigger | 2 | Number of files to trigger level-0 compaction. |
rocksdb.level0_slowdown_writes_trigger | 20 | Soft limit on number of level-0 files for slowing down writes. |
rocksdb.level0_stop_writes_trigger | 36 | Hard limit on number of level-0 files for stopping writes. |
rocksdb.soft_pending_compaction_bytes_limit | 68719476736 | The soft limit to impose on pending compaction in bytes. |
HBase 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to hbase . | |
serializer | Must be set to hbase . | |
hbase.hosts | localhost | The hostnames or ip addresses of HBase zookeeper, separated with commas. |
hbase.port | 2181 | The port address of HBase zookeeper. |
hbase.threads_max | 64 | The max threads num of hbase connections. |
hbase.znode_parent | /hbase | The znode parent path of HBase zookeeper. |
hbase.zk_retry | 3 | The recovery retry times of HBase zookeeper. |
hbase.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
hbase.kerberos_enable | false | Is Kerberos authentication enabled for HBase. |
hbase.kerberos_keytab | The HBase’s key tab file for kerberos authentication. | |
hbase.kerberos_principal | The HBase’s principal for kerberos authentication. | |
hbase.krb5_conf | etc/krb5.conf | Kerberos configuration file, including KDC IP, default realm, etc. |
hbase.hbase_site | /etc/hbase/conf/hbase-site.xml | The HBase’s configuration file |
hbase.enable_partition | true | Is pre-split partitions enabled for HBase. |
hbase.vertex_partitions | 10 | The number of partitions of the HBase vertex table. |
hbase.edge_partitions | 30 | The number of partitions of the HBase edge table. |
MySQL & PostgreSQL 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to mysql . | |
serializer | Must be set to mysql . | |
jdbc.driver | com.mysql.jdbc.Driver | The JDBC driver class to connect database. |
jdbc.url | jdbc:mysql://127.0.0.1:3306 | The url of database in JDBC format. |
jdbc.username | root | The username to login database. |
jdbc.password | ****** | The password corresponding to jdbc.username. |
jdbc.ssl_mode | false | The SSL mode of connections with database. |
jdbc.reconnect_interval | 3 | The interval(seconds) between reconnections when the database connection fails. |
jdbc.reconnect_max_times | 3 | The reconnect times when the database connection fails. |
jdbc.storage_engine | InnoDB | The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL. |
jdbc.postgresql.connect_database | template1 | The database used to connect when init store, drop store or check store exist. |
PostgreSQL 后端配置项
config option | default value | description |
---|---|---|
backend | Must be set to postgresql . | |
serializer | Must be set to postgresql . |
其它与 MySQL 后端一致。
PostgreSQL 后端的 driver 和 url 应该设置为:
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/
Last modified November 1, 2024: doc: add JVM(in-heap) memory monitor config desc (#376) (b7d45478)