What is Apache HugeGraph?
Apache HugeGraph is an easy-to-use, efficient, and general-purpose open-source full-stack graph system (GitHub), covering three major areas: Graph Database (OLTP real-time queries), Graph Computing (OLAP large-scale analysis), and Graph AI (GraphRAG / Graph Machine Learning).
HugeGraph supports fast storage and querying of tens of billions of vertices and edges, with strong OLTP performance. Its graph engine is compliant with the Apache TinkerPop 3 framework and supports both the Gremlin and Cypher (OpenCypher standard) query languages.
Typical Application Scenarios: Deep relationship exploration, association analysis, path search, feature extraction, community detection, knowledge graphs, etc.
Applicable Fields: Network security, telecom anti-fraud, financial risk control, personalized recommendations, social networks, intelligent Q&A, etc.
Ecosystem Overview
┌──────────────────────────────────────────────────────────────┐
│          Apache HugeGraph - Full-Stack Graph System          │
├──────────────────┬────────────────────┬──────────────────────┤
│ Graph DB (OLTP)  │ Graph Compute      │ Graph AI             │
│ HugeGraph        │ Vermeer (Memory)   │ HugeGraph-AI         │
│ Server           │ Computer (Dist.)   │ GraphRAG/GNN/Py      │
├──────────────────┴────────────────────┴──────────────────────┤
│                     HugeGraph Toolchain                      │
│     Hubble | Loader | Client(Java/Go/Py) | Spark | Tools     │
└──────────────────────────────────────────────────────────────┘
Core Components
🗄️ HugeGraph Server — Graph Engine (OLTP)
The core module of the HugeGraph project, providing high-performance graph data storage and real-time query capabilities:
- Core Engine: Supports Property Graph modeling, including complete Schema management for VertexLabel, EdgeLabel, PropertyKey, and IndexLabel.
- Dual Query Languages: Fully compatible with Gremlin (TinkerPop 3) and Cypher (OpenCypher).
- REST API: Built-in REST Server, providing RESTful graph operation interfaces.
- Multi-type Indexes: Exact query, range query, and complex condition combination queries.
- Pluggable Storage Backends: For 1.7.0 and later, supports RocksDB (standalone default), HStore (distributed), HBase, and Memory; for 1.5.x or earlier, supports MySQL/PostgreSQL/Cassandra, etc.
Submodules:
- Core: Graph engine implementation, connecting downwards to Backend and upwards to API.
- Backend: Adapter layer for multiple backend storages.
- API: RESTful access layer, compatible with Gremlin/Cypher queries.
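As a rough illustration of how a client might talk to the REST layer, the sketch below builds a Gremlin query request using only the Python standard library. The `/gremlin` path, the payload field names, and the `__g_<graph>` alias convention are assumptions modelled on the TinkerPop Gremlin Server HTTP format; `build_gremlin_request` and `run_gremlin` are hypothetical helper names, not part of any HugeGraph SDK. Check the REST API reference for your server version before relying on them.

```python
import json
import urllib.request

# Assumed base URL; the port matches the Docker port mapping (8080).
BASE_URL = "http://localhost:8080"


def build_gremlin_request(script, graph="hugegraph", base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a Gremlin script.

    The endpoint path and payload fields are assumptions, see the text above.
    """
    payload = {
        "gremlin": script,
        "bindings": {},
        "language": "gremlin-groovy",
        "aliases": {"graph": graph, "g": f"__g_{graph}"},
    }
    return urllib.request.Request(
        url=f"{base_url}/gremlin",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_gremlin(script, **kwargs):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_gremlin_request(script, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

With a server running locally, `run_gremlin("g.V().limit(3)")` would return the decoded JSON response. Separating request construction from sending keeps the payload easy to inspect and test offline.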
📊 Graph Computing Engine (OLAP)
Provides two complementary graph analysis engines:
- Vermeer (Recommended): High-performance pure in-memory graph computing engine, simple to deploy, fast response, suitable for small to medium-scale graph analysis and quick onboarding.
- HugeGraph-Computer: Distributed OLAP engine based on the Pregel model; it runs on Kubernetes / YARN clusters and is suited to very large-scale graph algorithm tasks.
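To give a feel for the Pregel model mentioned above, here is a toy, single-process sketch of vertex-centric computation: in each superstep a vertex reads its incoming messages, updates its value, and sends messages to its neighbours, and the run ends when no vertex changes. It labels connected components by propagating the smallest vertex id. This is not HugeGraph-Computer's API; the function name `pregel_min_label` and the edge-list input are illustrative assumptions.

```python
def pregel_min_label(edges, max_supersteps=50):
    """Label each vertex with the smallest vertex id in its connected component."""
    # Build an undirected adjacency list from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Superstep 0: every vertex starts with its own id and sends it to its neighbours.
    label = {v: v for v in neighbors}
    inbox = {v: [] for v in neighbors}
    for v in neighbors:
        for n in neighbors[v]:
            inbox[n].append(label[v])

    for _ in range(max_supersteps):
        outbox = {v: [] for v in neighbors}
        active = False
        for v, messages in inbox.items():
            if not messages:
                continue  # no messages: the vertex stays halted
            best = min(messages)
            if best < label[v]:
                # The value improved: update it and notify the neighbours.
                label[v] = best
                for n in neighbors[v]:
                    outbox[n].append(best)
                active = True
        if not active:
            break  # every vertex has voted to halt
        inbox = outbox
    return label
```

For example, `pregel_min_label([(1, 2), (2, 3), (4, 5)])` assigns label 1 to vertices 1, 2, 3 and label 4 to vertices 4, 5. A distributed engine partitions the vertices across workers and exchanges the messages over the network, but the per-vertex logic has the same shape.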
🤖 HugeGraph-AI — Graph AI Ecosystem
An independent AI component of HugeGraph, bridging graphs with Large Language Models (LLMs):
- GraphRAG: Graph-based Retrieval-Augmented Generation, enabling LLM-powered intelligent Q&A.
- Knowledge Graph Construction: Automatically extracting entities and relationships from unstructured text to build knowledge graphs.
- Graph Neural Networks: Supports training and inference of GNN models.
- 20+ Graph Machine Learning Algorithms: Built-in rich graph analysis algorithms, continuously updated.
- Python Client: Convenient Python SDK for AI applications.
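The general idea behind GraphRAG can be sketched in a few lines: find the graph facts related to the entities in a question, then pass them to the LLM as context. The example below is a deliberately simplified, in-memory illustration. It does not use HugeGraph-AI's API. The triple format, the naive substring matching, and the function names `retrieve_facts` and `build_prompt` are all assumptions made for the sketch; a real pipeline would retrieve facts with a graph query and send the prompt to a model.

```python
def retrieve_facts(question, triples):
    """Return (subject, predicate, object) triples whose subject or object
    is mentioned in the question (case-insensitive substring match)."""
    q = question.lower()
    return [t for t in triples if t[0].lower() in q or t[2].lower() in q]


def build_prompt(question, triples):
    """Format the retrieved facts as context ahead of the question."""
    facts = retrieve_facts(question, triples)
    context = "\n".join(f"- {s} {p} {o}" for s, p, o in facts)
    return f"Known facts:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge-graph content used only for illustration.
TRIPLES = [
    ("Alice", "knows", "Bob"),
    ("Bob", "works_at", "Acme"),
    ("Carol", "knows", "Dave"),
]

prompt = build_prompt("Where does Bob work?", TRIPLES)
```

Here `prompt` contains only the two facts that mention Bob, so the model receives relevant graph context rather than the whole graph.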
🛠️ HugeGraph Toolchain
A complete tool ecosystem surrounding the graph system (toolchain repository):
| Tool | Description |
|---|---|
| Hubble | Web visualization platform: one-stop operation for data modeling → batch importing → online/offline analysis. |
| Loader | Data import tool: supports multiple data sources like local files, HDFS, MySQL, and formats like TXT/CSV/JSON. |
| Client | Multi-language SDKs: Java / Python / Go. |
| Spark-connector | Spark integration: supports batch graph data read/write via Spark, suitable for big data offline processing. |
| Tools | Command-line operational tools: graph management, backup/restore, Gremlin execution, etc. |
Deployment Modes
HugeGraph supports two primary deployment modes:
| Mode | Core Components | Suitable Scenarios | Data Scale | High Availability (HA) |
|---|---|---|---|---|
| Standalone | Server + RocksDB | Development, testing, single-node production | < 4TB | Basic |
| Distributed | Server + PD (3-5 nodes) + Store (3+ nodes) | Production environments, horizontal scaling | < 1000TB | ✅ |
Docker Quick Experience:
docker run -itd --name=hugegraph -p 8080:8080 hugegraph/hugegraph
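The container may take a few seconds to start serving requests. A small polling helper like the one below can wait for it. This is a sketch under assumptions: the `/versions` path is assumed to be a lightweight endpoint that answers with HTTP 200 once the server is up, and `http_probe` and `wait_until_ready` are hypothetical helper names. Substitute any URL you know your server version exposes.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url="http://localhost:8080/versions", probe=http_probe,
                     attempts=30, delay=1.0):
    """Poll until the probe succeeds.

    Returns the 1-based attempt number that succeeded, or None if all
    attempts failed.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # pause between attempts, not after the last one
    return None
```

Calling `wait_until_ready()` after the `docker run` command blocks until the server responds or the attempts run out. The `probe` parameter is injectable so the retry logic can be tested without a running server.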
Quick Start Navigation
| I want to… | Start Here |
|---|---|
| 🚀 Quick Experience | Docker Deployment |
| 🔍 Run Graph Queries (OLTP) | Server Quick Start |
| 📈 Large-scale Graph Computing (OLAP) | Vermeer / Computer |
| 🤖 Build AI/RAG Applications | HugeGraph-AI |
| 📥 Batch Import Data | HugeGraph-Loader |
| 🖥️ Visual Management | Hubble Web UI |
System Features
- Easy to Use: Dual Gremlin/Cypher query languages, a RESTful API, and a comprehensive toolchain make it easy to get started.
- Efficient: Deeply optimized graph storage and queries, millisecond-level response, supports thousands of concurrent online operations, fast import of billions of data records.
- Universal: Supports both OLTP and OLAP modes, seamlessly integrates with Apache Hadoop, Spark, and Flink big data ecosystems.
- Scalable: Distributed storage, multi-replica data, horizontal scaling, flexible expansion through pluggable backends.
- Open: Apache 2.0 License, fully open-source; community contributions are welcome.
Contact Us
- GitHub Issues: Feedback on usage issues and functional requirements (Recommended)
- Email: dev@hugegraph.apache.org (How to subscribe)
- Security: security@hugegraph.apache.org (Report security issues)
- WeChat Public Account: Apache HugeGraph
