How to Troubleshoot Teranode (Kubernetes Operator)¶
Last modified: 6-March-2025
Index¶
- Health Checks and System Monitoring
- Service Status
- Detailed Container/Pod Health
- Configuring Health Checks
- Viewing Health Check Logs
- Monitoring System Resources
- Viewing Global Logs
- Viewing Logs for Specific Microservices
- Useful Options for Log Viewing
- Checking Logs for Specific Teranode Microservices
- Redirecting Logs to a File
- Check Services Dashboard
- Recovery Procedures
Health Checks and System Monitoring¶
Service Status¶
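Run kubectl get pods to list all pods in the current namespace, showing their status and readiness. Add -n teranode-operator to target the namespace used by the log commands later in this guide.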
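To spot unhealthy pods quickly, the pod listing can be filtered. The following is a minimal sketch, assuming the teranode-operator namespace used elsewhere in this guide. The helper name unready_pods and the TERANODE_NS variable are illustrative and not part of Teranode.

```shell
# Hypothetical helper: list pods that are not fully ready.
# TERANODE_NS is an assumed override for the namespace.
unready_pods() {
  # Columns without headers: NAME READY STATUS RESTARTS AGE
  kubectl get pods -n "${TERANODE_NS:-teranode-operator}" --no-headers |
    awk '$3 != "Completed" { split($2, r, "/"); if (r[1] != r[2] || $3 != "Running") print $1, $2, $3 }'
}
```

Calling unready_pods prints the name, readiness count, and status of each pod that needs attention, and prints nothing when every pod is Running and fully ready.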
Detailed Container/Pod Health¶
Run kubectl describe pod <pod-name> for detailed information about a pod, including its current state, recent events, and readiness probe results.
Configuring Health Checks¶
In your Deployment or StatefulSet specification:
spec:
  template:
    spec:
      containers:
        - name: teranode-blockchain
          ...
          readinessProbe:
            httpGet:
              path: /health
              port: 8087
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /health
              port: 8087
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
            initialDelaySeconds: 40
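After applying a probe configuration like the one above, it can help to confirm what the running pod actually uses. This is a sketch only: the helper name show_probes and the TERANODE_NS variable are hypothetical, and the output assumes HTTP probes as in the example.

```shell
# Hypothetical helper: print the HTTP probe path and port of each container in a pod.
# $1 = pod name; TERANODE_NS is an assumed override for the namespace.
show_probes() {
  kubectl get pod "$1" -n "${TERANODE_NS:-teranode-operator}" \
    -o jsonpath='{range .spec.containers[*]}{.name}{" readiness="}{.readinessProbe.httpGet.path}{":"}{.readinessProbe.httpGet.port}{" liveness="}{.livenessProbe.httpGet.path}{":"}{.livenessProbe.httpGet.port}{"\n"}{end}'
}
```

For example, show_probes <pod-name> prints one line per container. Empty values suggest that the container has no HTTP probe of that kind.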
Viewing Health Check Logs¶
Health check results are typically recorded in the pod events, which appear in the Events section of kubectl describe pod <pod-name> and in kubectl get events.
Look for events related to readiness and liveness probes.
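Failed probes are reported by the kubelet as events with the reason Unhealthy, so the event list can be narrowed to them. A minimal sketch, assuming the teranode-operator namespace; the helper name probe_failures and the TERANODE_NS variable are illustrative.

```shell
# Hypothetical helper: list probe-failure events, oldest first.
# TERANODE_NS is an assumed override for the namespace.
probe_failures() {
  kubectl get events -n "${TERANODE_NS:-teranode-operator}" \
    --field-selector reason=Unhealthy --sort-by=.lastTimestamp
}
```

Each returned event names the affected pod and states whether the readiness or the liveness probe failed.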
Monitoring System Resources¶
- Use kubectl top to view resource usage (for example, kubectl top pods or kubectl top nodes).
- Consider setting up Prometheus and Grafana for more comprehensive monitoring.
- Look for services consuming unusually high resources.
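To find the heaviest consumers, the output of kubectl top can be sorted. The sketch below assumes the Kubernetes Metrics Server is installed in the cluster (kubectl top depends on it) and that pods run in the teranode-operator namespace. The helper name top_pods and the TERANODE_NS variable are hypothetical.

```shell
# Hypothetical helper: show pod resource usage, heaviest first.
# $1 = sort key: cpu (default) or memory
top_pods() {
  kubectl top pods -n "${TERANODE_NS:-teranode-operator}" --sort-by="${1:-cpu}"
}
```

For example, top_pods memory lists the pods ordered by memory usage.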
Viewing Global Logs¶
kubectl logs -n teranode-operator -l app.kubernetes.io/part-of=teranode-operator
kubectl logs -n teranode-operator -f -l app.kubernetes.io/part-of=teranode-operator # Follow logs in real-time
kubectl logs -n teranode-operator --tail=100 -l app.kubernetes.io/part-of=teranode-operator # View only the most recent logs
Viewing Logs for Specific Microservices¶
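Logs for a single microservice can be read from its pod. The following is a sketch under assumptions: the teranode-operator namespace matches the commands elsewhere in this guide, and the helper name pod_logs and the TERANODE_NS variable are illustrative only.

```shell
# Hypothetical helper: print logs for one pod.
# $1 = pod name; any further arguments are passed straight to kubectl logs.
pod_logs() {
  pod="$1"
  shift
  kubectl logs -n "${TERANODE_NS:-teranode-operator}" "$pod" "$@"
}
```

For example, pod_logs <pod-name> -f follows the logs in real time, and pod_logs <pod-name> --previous shows the logs of the previous container instance, which is useful after a crash or restart.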
Useful Options for Log Viewing¶
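- Show timestamps: add --timestamps
- Limit output: add --tail=100 (or another line count)
- Since time: add --since=1h (or another duration such as 30m)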
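These options can be combined in a single call. A minimal sketch, assuming the teranode-operator namespace; the helper name recent_logs, its defaults, and the TERANODE_NS variable are hypothetical.

```shell
# Hypothetical helper: timestamped logs from a recent time window, capped in length.
# $1 = pod name, $2 = time window (default 1h), $3 = maximum number of lines (default 100)
recent_logs() {
  kubectl logs -n "${TERANODE_NS:-teranode-operator}" "$1" \
    --timestamps --since="${2:-1h}" --tail="${3:-100}"
}
```

For example, recent_logs <pod-name> 30m 50 prints at most the last 50 lines logged during the past 30 minutes, each prefixed with its timestamp.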
Checking Logs for Specific Teranode Microservices¶
Replace [service_name] or <pod-name> with the appropriate service or pod name:
- Propagation Service (service name: propagation)
- Blockchain Service (service name: blockchain)
- Asset Service (service name: asset)
- Block Validation Service (service name: block-validator)
- P2P Service (service name: p2p)
- Block Assembly Service (service name: block-assembly)
- Subtree Validation Service (service name: subtree-validator)
- Miner Service (service name: miner)
- RPC Server (service name: rpc)
- Block Persister Service (service name: block-persister)
- UTXO Persister Service (service name: utxo-persister)
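The service names above can be used with a label selector to read the logs of every pod belonging to one microservice. This sketch assumes the pods carry an app=[service_name] label, which this guide does not confirm; check the actual labels with kubectl get pods -n teranode-operator --show-labels first. The helper name service_logs and the TERANODE_NS variable are hypothetical.

```shell
# Hypothetical helper: logs for all pods of one Teranode microservice.
# ASSUMPTION: pods are labelled app=<service_name>; adjust the selector if yours differ.
# $1 = service name; any further arguments are passed straight to kubectl logs.
service_logs() {
  svc="$1"
  shift
  kubectl logs -n "${TERANODE_NS:-teranode-operator}" -l "app=$svc" "$@"
}
```

For example, service_logs propagation --tail=100 prints the most recent lines from each propagation pod.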
Redirecting Logs to a File¶
kubectl logs -n teranode-operator -l app.kubernetes.io/part-of=teranode-operator > teranode_logs.txt
kubectl logs -n teranode-operator <pod-name> > pod_logs.txt
Remember to replace placeholders like [service_name], <pod-name>, and label selectors with the appropriate values for your Teranode setup.
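When collecting logs for a bug report, it can be convenient to save one file per pod instead of a single combined file. The following is a sketch, assuming the teranode-operator namespace; the helper name dump_pod_logs, the default directory teranode_logs, and the TERANODE_NS variable are illustrative.

```shell
# Hypothetical helper: save each pod's logs to its own file.
# $1 = output directory (default: teranode_logs)
dump_pod_logs() {
  outdir="${1:-teranode_logs}"
  ns="${TERANODE_NS:-teranode-operator}"
  mkdir -p "$outdir" || return 1
  # "-o name" prints entries such as "pod/blockchain-0".
  for pod in $(kubectl get pods -n "$ns" -o name); do
    kubectl logs -n "$ns" "$pod" --all-containers --timestamps \
      > "$outdir/${pod#pod/}.log"
  done
}
```

Running dump_pod_logs ./logs writes files such as ./logs/<pod-name>.log, one per pod in the namespace.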
Check Services Dashboard¶
Check your Grafana TERANODE Service Overview dashboard:
- Check that there are no blocks in the queue (Queued Blocks in Block Validation). We expect little or no queueing, and the count should not creep up. Even 3 queued blocks are already a concern.
- Check that the propagation instances are handling roughly the same load, so that the load is evenly distributed among all the propagation servers. See the Propagation Processed Transactions per Instance diagram.
- Check that the cache follows a sustainable pattern rather than growing "exponentially" (see both the Tx Meta Cache in Block Validation and Tx Meta Cache Size in Block Validation diagrams).
- Check that goroutines (Goroutines graph) are not creeping up or reaching excessive levels.
Recovery Procedures¶
Third Party Component Failure¶
Teranode depends heavily on third-party components. Postgres, Kafka and Aerospike are critical for Teranode operations, and the node cannot work without them.
If a third-party service fails, you must restore its functionality. Once it is back, restart Teranode cleanly following the instructions in the How to Start and Stop Teranode in Kubernetes guide.
Should you encounter a bug, please report it following the instructions in the Bug Reporting section.