Security Best Practices¶
Grafeo is an embedded database without built-in authentication. Security depends on how it is deployed and used.
Understanding Grafeo's Security Model¶
Grafeo is designed as an embedded library, not a network-accessible server:
- No authentication - Anyone with access to the application can access the database
- No network protocol - No TCP/HTTP ports to secure
- No encryption at rest - Database files are not encrypted
- File-based access control - Security relies on filesystem permissions
This model is appropriate for:
- Single-user applications
- Microservices with internal graph state
- Data science environments
- Applications that implement their own access control
Securing a Deployment¶
1. File System Permissions¶
Protect database files with restrictive permissions. On Windows, using PowerShell:
# Create directory
New-Item -ItemType Directory -Path "C:\ProgramData\MyApp\Data"
# Set permissions (restrict to current user)
$acl = Get-Acl "C:\ProgramData\MyApp\Data"
$acl.SetAccessRuleProtection($true, $false)
$rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
$env:USERNAME, "FullControl", "ContainerInherit,ObjectInherit", "None", "Allow"
)
$acl.AddAccessRule($rule)
Set-Acl "C:\ProgramData\MyApp\Data" $acl
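On Linux and macOS the same restriction can be expressed with POSIX permission bits. The following is a minimal sketch in Python. The helper names `create_private_dir` and `is_private` are illustrative assumptions, not part of Grafeo:

```python
import os
import stat


def create_private_dir(path: str) -> str:
    """Create a directory (if needed) that only the owning user can access."""
    os.makedirs(path, exist_ok=True)
    # The mode argument of makedirs is filtered by the umask,
    # so set the permission bits explicitly afterwards.
    os.chmod(path, 0o700)
    return path


def is_private(path: str) -> bool:
    """Return True if group and others have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

A deployment script could call `create_private_dir` once at install time and check `is_private` at startup before opening the database.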
2. Input Validation¶
Always use parameterized queries to prevent injection:
# DANGEROUS - query injection risk (user input becomes part of the query text)
user_input = request.form["name"]
db.execute(f"MATCH (n:Person {{name: '{user_input}'}}) RETURN n") # DON'T DO THIS
# SAFE - Parameterized query
user_input = request.form["name"]
db.execute("MATCH (n:Person {name: $name}) RETURN n", {"name": user_input}) # DO THIS
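To see why interpolation is risky, it helps to look at the query text it produces. The sketch below uses only string handling and does not touch a database. The payload is a hypothetical example, and whether it would actually execute depends on the query dialect. The point is that the input changes the structure of the query:

```python
def build_unsafe_query(name: str) -> str:
    """Interpolate user input directly into the query text (the unsafe pattern)."""
    return f"MATCH (n:Person {{name: '{name}'}}) RETURN n"


# Hypothetical malicious input that closes the string literal early
payload = "x'}) RETURN n UNION MATCH (n) RETURN n //"

unsafe = build_unsafe_query(payload)
print(unsafe)  # the text now contains a second MATCH clause

# With parameters, the query text is fixed and the payload travels as data
safe_query = "MATCH (n:Person {name: $name}) RETURN n"
safe_params = {"name": payload}
```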
3. Validate Property Values¶
Sanitize data before storing:
def sanitize_string(value: str, max_length: int = 1000) -> str:
    """Sanitize string input."""
    if not isinstance(value, str):
        raise ValueError("Expected string")
    # Limit length
    value = value[:max_length]
    # Remove null bytes
    value = value.replace("\x00", "")
    return value

def create_user(db, name: str, email: str):
    """Create user with validated input."""
    name = sanitize_string(name, max_length=100)
    email = sanitize_string(email, max_length=255)
    # Validate email format
    if "@" not in email or "." not in email:
        raise ValueError("Invalid email format")
    return db.create_node(["User"], {"name": name, "email": email})
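The `"@"` and `"."` check above accepts values such as `a.b@c` or `@.`. A slightly stricter check can be sketched with a regular expression. The pattern and the name `is_plausible_email` are assumptions for illustration. This is a plausibility test only, not full address validation:

```python
import re

# Deliberately simple: one "@", no whitespace, and a dot inside the domain part.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_plausible_email(value: str) -> bool:
    """Return True if the value looks like local@domain.tld."""
    return len(value) <= 255 and _EMAIL_RE.fullmatch(value) is not None
```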
4. Limit Query Complexity¶
Prevent denial-of-service via expensive queries:
def safe_execute(db, query: str, params: dict = None, max_results: int = 10000):
    """Execute query with result limit."""
    # Add LIMIT if not present
    if "LIMIT" not in query.upper():
        query = f"{query} LIMIT {max_results}"
    return db.execute(query, params)

# Usage
result = safe_execute(db, "MATCH (n) RETURN n")  # Limited to 10000 results
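A plain substring test for `LIMIT` also matches identifiers such as a property named `limit`, and it does not cap a limit that the caller set too high. The sketch below tightens the check with a regular expression anchored at the end of the query. It is still a heuristic and assumes the dialect places `LIMIT <integer>` last. The name `enforce_limit` is hypothetical:

```python
import re

# Matches a LIMIT clause with an integer at the very end of the query
_TRAILING_LIMIT = re.compile(r"\bLIMIT\s+(\d+)\s*;?\s*$", re.IGNORECASE)


def enforce_limit(query: str, max_results: int = 10000) -> str:
    """Return the query with a trailing LIMIT no larger than max_results."""
    match = _TRAILING_LIMIT.search(query)
    if match is None:
        # No trailing LIMIT: append one
        return f"{query.rstrip().rstrip(';')} LIMIT {max_results}"
    if int(match.group(1)) > max_results:
        # Existing LIMIT is too large: replace it with the cap
        return f"{query[:match.start()]}LIMIT {max_results}"
    return query
```

A parser-based check would be more robust than pattern matching if the database exposes one.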
5. Audit Logging¶
Log database operations for security auditing:
import logging
from datetime import datetime
from functools import wraps

logger = logging.getLogger("grafeo.audit")

def audit_query(func):
    """Decorator to audit database queries."""
    @wraps(func)
    def wrapper(self, query: str, params: dict = None, *args, **kwargs):
        start = datetime.now()
        try:
            result = func(self, query, params, *args, **kwargs)
            logger.info(
                "QUERY",
                extra={
                    "query": query[:500],  # Truncate long queries
                    "params": str(params)[:200] if params else None,
                    "duration_ms": (datetime.now() - start).total_seconds() * 1000,
                    "result_count": len(result) if hasattr(result, "__len__") else None,
                }
            )
            return result
        except Exception as e:
            logger.error(
                "QUERY_ERROR",
                extra={
                    "query": query[:500],
                    "error": str(e),
                }
            )
            raise
    return wrapper
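Fields passed through `extra` are attached to the log record, but the default formatter does not print them. One way to make them visible is a formatter that serializes every non-standard record attribute as JSON. This is a sketch using only the standard library. The names `ExtraJsonFormatter` and `configure_audit_logger` are assumptions:

```python
import json
import logging

# Attributes every LogRecord carries; anything else was passed via `extra`
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class ExtraJsonFormatter(logging.Formatter):
    """Format records as JSON, including fields passed through `extra`."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "event": record.getMessage()}
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        # default=str keeps non-JSON types (e.g. datetimes) from raising
        return json.dumps(payload, default=str)


def configure_audit_logger(stream=None) -> logging.Logger:
    """Attach a JSON stream handler to the "grafeo.audit" logger."""
    audit_logger = logging.getLogger("grafeo.audit")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(ExtraJsonFormatter())
    audit_logger.addHandler(handler)
    audit_logger.setLevel(logging.INFO)
    return audit_logger
```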
Sensitive Data Handling¶
Don't Store Secrets in Properties¶
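# BAD - Storing plaintext password
db.create_node(["User"], {"email": "user@example.com", "password": "secret123"})
# BETTER - Store only a salted hash from a slow key-derivation function.
# A single unsalted SHA-256 pass is fast to brute-force and is not suitable for passwords.
import hashlib
import os
salt = os.urandom(16)
password_hash = hashlib.pbkdf2_hmac("sha256", b"secret123", salt, 600_000).hex()
db.create_node(["User"], {"email": "user@example.com", "password_hash": password_hash, "salt": salt.hex()})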
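A stored hash is only useful with a matching verification step. The sketch below pairs hashing and verification using salted PBKDF2 from the standard library. The iteration count and the function names are assumptions and should be tuned for the deployment. A dedicated password-hashing library is a reasonable alternative:

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> tuple[str, str]:
    """Return (salt_hex, hash_hex) using PBKDF2-HMAC-SHA256 with a random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt.hex(), digest.hex()


def verify_password(password: str, salt_hex: str, hash_hex: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), hash_hex)
```

Both the salt and the hash would be stored as node properties. Neither allows the password to be recovered directly.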
Mask Sensitive Data in Logs¶
def mask_sensitive(data: dict, sensitive_keys: set = {"password", "token", "secret"}):
    """Mask sensitive values in dictionaries."""
    return {
        k: "***MASKED***" if k.lower() in sensitive_keys else v
        for k, v in data.items()
    }

# Usage in logging
logger.info(f"Creating user: {mask_sensitive(user_data)}")
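`mask_sensitive` only inspects top-level keys, so a password nested inside another dictionary or a list would still reach the log. A recursive variant might look like the sketch below. `mask_nested` and `SENSITIVE_KEYS` are hypothetical names:

```python
from typing import Any

SENSITIVE_KEYS = frozenset({"password", "token", "secret"})


def mask_nested(data: Any, sensitive_keys: frozenset = SENSITIVE_KEYS) -> Any:
    """Return a copy of data with sensitive values masked at any depth."""
    if isinstance(data, dict):
        return {
            k: "***MASKED***"
            if str(k).lower() in sensitive_keys
            else mask_nested(v, sensitive_keys)
            for k, v in data.items()
        }
    if isinstance(data, list):
        return [mask_nested(item, sensitive_keys) for item in data]
    # Scalars and other types are returned unchanged
    return data
```

The function builds new containers instead of modifying its input, so the original data stays intact for the actual database write.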
Consider Encryption for Sensitive Properties¶
from cryptography.fernet import Fernet
# Generate key (store securely!)
key = Fernet.generate_key()
cipher = Fernet(key)
def encrypt_value(value: str) -> str:
    return cipher.encrypt(value.encode()).decode()

def decrypt_value(encrypted: str) -> str:
    return cipher.decrypt(encrypted.encode()).decode()
# Store encrypted
ssn_encrypted = encrypt_value("123-45-6789")
db.create_node(["Person"], {"name": "Alix", "ssn_encrypted": ssn_encrypted})
# Retrieve and decrypt
node = db.get_node(node_id)
ssn = decrypt_value(node.properties["ssn_encrypted"])
Network Security¶
If exposing Grafeo through an API:
1. Add Authentication Layer¶
import hmac
import os
from functools import wraps
from flask import Flask, request, jsonify
from grafeo import GrafeoDB

app = Flask(__name__)
db = GrafeoDB("./mydb")

def require_api_key(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        api_key = request.headers.get("X-API-Key", "")
        # Constant-time comparison avoids leaking key contents through timing
        if not hmac.compare_digest(api_key.encode(), os.environ["API_KEY"].encode()):
            return jsonify({"error": "Invalid API key"}), 401
        return f(*args, **kwargs)
    return decorated

@app.route("/query", methods=["POST"])
@require_api_key
def query():
    data = request.json
    result = db.execute(data["query"], data.get("params"))
    return jsonify(result.to_list())
2. Use HTTPS¶
Always use TLS when exposing the API over a network, for example by terminating HTTPS at a reverse proxy or in the application server.
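As one possible illustration, a server-side TLS context can be built with Python's standard `ssl` module. The function name and the certificate paths are assumptions. In production, TLS is often terminated by a reverse proxy or WSGI server instead:

```python
import ssl


def make_server_tls_context(certfile=None, keyfile=None) -> ssl.SSLContext:
    """Build a server-side TLS context that refuses versions below TLS 1.2."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile is not None:
        # Paths to the PEM certificate chain and private key
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

Flask's development server accepts such a context through `app.run(ssl_context=context)`. That server is intended for local testing only.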
3. Rate Limiting¶
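from flask_limiter import Limiter

# Pass key_func and app as keyword arguments; their positional order differs between Flask-Limiter versions
limiter = Limiter(key_func=lambda: request.headers.get("X-API-Key"), app=app)

@app.route("/query", methods=["POST"])
@limiter.limit("100/minute")
@require_api_key
def query():
    ...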
Backup Security¶
Secure Backup Storage¶
import os

def secure_backup(db, backup_path: str):
    """Create a backup of an open database and restrict access to it."""
    # Create backup
    db.save(backup_path)
    # Set restrictive permissions (owner read/write only)
    os.chmod(backup_path, 0o600)
    # Optionally encrypt
    # gpg --encrypt --recipient admin@example.com backup_path
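Calling `chmod` after the backup is written leaves a short window in which the file may exist with default permissions. On POSIX systems, one way to narrow that window is to tighten the process umask while the file is created. This is a sketch with a hypothetical helper name. Note that the umask is process-wide, so it affects other threads for the duration of the block:

```python
import os
from contextlib import contextmanager


@contextmanager
def restrictive_umask(mask: int = 0o077):
    """Temporarily set the process umask so new files are owner-only."""
    previous = os.umask(mask)
    try:
        yield
    finally:
        # Always restore the previous umask, even if the body raises
        os.umask(previous)
```

The backup call would then be wrapped as `with restrictive_umask(): db.save(backup_path)`.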
Secure Backup Transfer¶
# Encrypt before transfer
gpg --encrypt --recipient admin@example.com backup.db
# Transfer encrypted file
scp backup.db.gpg backup-server:/backups/
Security Checklist¶
Before deploying:
- Database files have restricted permissions (700 for directories, 600 for files)
- All queries use parameterization (no string interpolation)
- Input validation on all user-provided data
- Query results are limited to prevent DoS
- Sensitive data is encrypted or hashed
- Audit logging is enabled
- API endpoints require authentication
- HTTPS is enabled for network access
- Rate limiting is configured
- Backups are encrypted and access-controlled
- Error messages don't expose internal details
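The last checklist item can be sketched as follows: log the full exception on the server under a random identifier, and return only a generic message plus that identifier to the client. The logger name `grafeo.errors` and the function `to_public_error` are assumptions for illustration:

```python
import logging
import uuid

error_logger = logging.getLogger("grafeo.errors")


def to_public_error(exc: Exception) -> dict:
    """Log full details server-side and return a generic, client-safe payload."""
    error_id = uuid.uuid4().hex
    # The exception text (paths, query fragments) stays in the server log
    error_logger.error("Unhandled error %s: %r", error_id, exc)
    return {"error": "Internal error", "error_id": error_id}
```

The identifier lets an operator find the matching log entry when a user reports a failure.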
Reporting Security Issues¶
To report a security vulnerability:
- Do not open a public GitHub issue
- Email security concerns to security@grafeo.dev
- Include steps to reproduce
- Allow time for a fix before public disclosure
Security issues are taken seriously and will receive a prompt response.