Scope: cmd/server, cmd/bridge, cmd/decoder, cmd/location and all packages they import.
Date: 2025-03-05.
The codebase has a solid structure and consistent patterns across the four services, but security, reliability, and operational hardening are not yet at production level. With the changes suggested below, it can be brought to a 7–8/10 for a production deployment.
| Dimension | Grade | Notes |
|---|---|---|
| Structure & readability | 7/10 | Clear app lifecycle (New/Run/Shutdown), good package layout, some naming/duplication issues. |
| Reliability | 5/10 | Graceful shutdown and Kafka cleanup are good; missing produce retries, commit semantics, and DB pool/timeout settings. |
| Security | 3/10 | No API auth, DB and outbound client TLS weakened, CORS permissive, no rate limiting. |
| Observability | 6/10 | slog, request ID, logging middleware, health/ready; no metrics/tracing. |
| Correctness | 6/10 | Path vs body ID bugs in update endpoints, context.Background() in hot paths. |
Strengths:

- New(cfg) → optional Init(ctx) (server only) → Run(ctx) → Shutdown(), with signal.NotifyContext for graceful shutdown.
- sync.WaitGroup used for consumer goroutines.
- slog with JSON handler and file + stderr; request logging with method, path, status, duration, bytes.
- /health (liveness) and /ready (DB ping) for the server.
- KafkaManager with RWMutex, separate readers/writers, group IDs per service.
- AppState in common/appcontext is thread-safe (RWMutex) and used consistently.
- Config loaders (LoadServer, LoadBridge, etc.) and getEnvPanic for required vars.
- response.JSON, Error, BadRequest, InternalError, NotFound with consistent JSON shape.
- api/openapi.yaml (OpenAPI 3.0), which helps readability and contract clarity.

No authentication or authorization on the HTTP API
All server routes (/reslevis/*, /configs/beacons, etc.) are unauthenticated. Anyone who can reach the server can read/update/delete gateways, zones, trackers, parser configs, settings, alerts, and tracks.
Database connection uses sslmode=disable
In internal/pkg/database/database.go, DSN is built with sslmode=disable. In production, DB connections should use TLS and sslmode=verify-full (or equivalent) with CA verification.
TLS verification disabled for outbound HTTP
- internal/pkg/apiclient/updatedb.go: TLSClientConfig: &tls.Config{InsecureSkipVerify: true}.
- internal/pkg/location/inference.go: same, and NewDefaultInferencer(skipTLSVerify bool) ignores the parameter and always uses InsecureSkipVerify: true.

CORS defaults to *
In internal/pkg/api/middleware/cors.go, when origins is nil/empty, origins = []string{"*"}. Production should restrict origins to known front-end origins.
Logger file mode 0666
In internal/pkg/logger/logger.go, os.OpenFile(..., 0666) makes the log file world-readable and -writable. Prefer 0600 or 0640.
No rate limiting or request body size limits
No protection against abuse or large-body DoS; consider middleware for max body size and rate limiting (per IP or per key).
Recommendations:
- Add authentication middleware for the API; keep /health and optionally /ready public.
- Make DB TLS configurable via env (e.g. DB_SSLMODE, DB_SSLROOTCERT) and use sslmode=verify-full in production.
- Respect cfg.TLSInsecureSkipVerify (or equivalent) for all outbound HTTP clients; fix NewDefaultInferencer to respect the parameter.
- Set the log file mode to 0600 (or 0640 if a group needs read).
- Limit request body size (e.g. http.MaxBytesReader) and consider rate limiting for API routes.

Kafka consumer: decode errors and commit semantics
In internal/pkg/kafkaclient/consumer.go, when json.Unmarshal fails, the code logs and continues without committing. Depending on reader config, this can cause repeated redelivery of bad messages or ambiguous semantics. Production should either skip and commit, or send to a dead-letter path and commit.
Answer: because the readers use consumer groups, messages are auto-committed; a failed unmarshal therefore still commits, since the message was technically read.
No retries on Kafka produce
Event loops (server, bridge, decoder, location) call WriteMessages once; transient Kafka errors are not retried. Consider retry with backoff (and optional circuit breaker) for critical topics.
Answer: the Writer already carries default timeout, backoff, and retry settings, but some extra configuration was added on top.
Database: no explicit pool or timeouts
database.Connect uses GORM defaults. For production, set MaxOpenConns, MaxIdleConns, and connection/timeout settings (e.g. SetConnMaxLifetime) on the underlying *sql.DB.
UpdateDB and Init: failures only logged
In internal/app/server/app.go, apiclient.UpdateDB errors are only logged; Init continues. Consider failing Init (or marking “degraded”) if sync is required for correct operation, or add retries/backoff.
Use of context.Background() in async paths
e.g. internal/pkg/bridge/handler.go and internal/pkg/decoder/process.go use context.Background() for Kafka writes. Prefer passing the request/event context (or a derived timeout) so shutdown and timeouts propagate.
Recommendations:
- Set pool sizes and timeouts in database.Connect (and optionally make them configurable via config).
- Propagate request/event contexts instead of context.Background().

Update endpoints: path id vs body id
- Gateway update (internal/pkg/controller/gateways_controller.go): uses mux.Vars(r)["id"] only to check existence with First(..., "id = ?", id), then decodes the body into gateway and calls Save(&gateway). The updated record is identified by gateway.ID from the body, not the path. A client can send a different ID in the body and update another resource.
- updateZone (no {id} in path): uses zone.ID from the body only. If the API contract expects path-based IDs, this is inconsistent.
- Recommended pattern: take the path id, decode the body into a DTO or partial struct, then update only allowed fields for that id (e.g. selective updates, or merge then update by path id).

TrackerUpdateController
Uses body tracker.ID for lookup and save; route has no {id} in path. If other update endpoints use path {id}, align behavior and documentation.
Recommendations:
- Identify the update target by the path {id} only (the body has no id, or it must match the path), or document that the body id is canonical and ensure there is no IDOR.
- Replace fmt.Println in internal/pkg/apiclient/auth.go and any internal/pkg/apiclient/updatedb.go / inference paths with slog (or a structured logger) so logs are consistent and configurable.
- DefaultInferencer caches the token in a struct field with no expiry or refresh; the token may be used after expiry. Use the token expiry from the auth response and refresh when needed.
- In appcontext, BeaconExists(id string) is called with a MAC (e.g. in the bridge handler). Rename the parameter to mac (or the method to LookupIDByMAC) to avoid confusion.
- The bridge, decoder, and location services expose no /health or /ready endpoints. For orchestration (e.g. Kubernetes), consider a small HTTP server or at least a process-level health check so the platform can restart unhealthy instances.
- go.mod is clear; consider auditing indirect deps and keeping them updated (e.g. go list -m -u all and Dependabot/Renovate).

Add API authentication/authorization
Protect all non-health routes (e.g. JWT or API key middleware); document required claims/scopes if using JWT.
Enable and verify DB TLS
Make sslmode (and optional root cert) configurable; use verify-full (or equivalent) in production.
Respect TLS config for outbound HTTP
Use config (e.g. TLSInsecureSkipVerify) for apiclient and location inferencer; fix NewDefaultInferencer(skipTLSVerify bool) to use the parameter.
Fix update controllers
Use path id as source of truth for update (and optionally delete); ensure body cannot override resource id in an unsafe way.
Tighten CORS and log file permissions
Explicit allowed origins from config; set log file mode to 0600 (or 0640).
Kafka consumer error policy
Define and implement skip+commit or DLQ for bad messages; avoid infinite redelivery of poison messages.
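The decision logic can be isolated from the Kafka plumbing, which also makes it testable. Below is a sketch of the DLQ branch; the Event shape, the events.dlq topic name, and the publishDLQ signature are assumptions, with the real producer (e.g. kafka-go's Writer.WriteMessages) injected behind the function value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a stand-in for the consumed message payload.
type Event struct {
	TrackerID string `json:"tracker_id"`
}

type outcome string

const (
	processed  outcome = "processed"
	deadletter outcome = "dead-letter"
)

// handleRecord decodes one record and classifies the result. With consumer
// groups and auto-commit, a bad record is committed anyway, so the useful
// choice is where it goes: process it, or divert it to a dead-letter topic
// so it is preserved for inspection instead of being silently dropped.
func handleRecord(value []byte, publishDLQ func(topic string, value []byte) error) (outcome, error) {
	var ev Event
	if err := json.Unmarshal(value, &ev); err != nil {
		if dlqErr := publishDLQ("events.dlq", value); dlqErr != nil {
			return deadletter, dlqErr // DLQ publish itself failed
		}
		return deadletter, nil
	}
	// ... real processing of ev would happen here ...
	return processed, nil
}

func main() {
	noopDLQ := func(topic string, value []byte) error { return nil }
	res, _ := handleRecord([]byte(`{"tracker_id":"t1"}`), noopDLQ)
	fmt.Println(res)
	res, _ = handleRecord([]byte(`not-json`), noopDLQ)
	fmt.Println(res)
}
```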
DB connection pool and timeouts
Set MaxOpenConns, MaxIdleConns, ConnMaxLifetime, and timeouts in database.Connect.
Request body size limit
Middleware (e.g. http.MaxBytesReader) for API routes to prevent large-body DoS.
Replace fmt.Println with slog
In apiclient and any remaining places; ensure response body is closed after read (e.g. defer res.Body.Close() in auth/data clients if not already).
Retries for Kafka produce
Retry with backoff (and optionally circuit breaker) for critical WriteMessages calls.
Context propagation
Pass request/event context (or bounded context) into Kafka writes instead of context.Background().
Token refresh in Inferencer
Use expiry from auth response and refresh token before inference calls.
Health for bridge/decoder/location
Add minimal health/readiness (HTTP or signal file) for orchestration and load balancers.
Metrics and tracing
Add metrics (e.g. request duration, Kafka lag, error counts) and optional distributed tracing (e.g. OTel).
Rate limiting
Per-IP or per-token rate limiting on API routes.
Structured validation
Use a validator (e.g. go-playground/validator) for request bodies and path params (IDs, limits).
Documentation
Short runbooks for deploy, config env vars, and dependency on Kafka/DB/MQTT and external auth/API.
Files reviewed:

- cmd/server/main.go, cmd/bridge/main.go, cmd/decoder/main.go, cmd/location/main.go
- internal/app/server/*, internal/app/bridge/app.go, internal/app/decoder/app.go, internal/app/location/app.go
- internal/pkg/config/config.go
- internal/pkg/database/database.go, internal/pkg/kafkaclient/manager.go, internal/pkg/kafkaclient/consumer.go, internal/pkg/logger/logger.go
- internal/pkg/api/handler/health.go, internal/pkg/api/middleware/*, internal/pkg/api/response/response.go, internal/app/server/routes.go
- internal/pkg/controller/*.go (gateways, trackers, zone, parser, settings, alerts, tracks, trackerzones)
- internal/pkg/service/beacon_service.go, internal/pkg/apiclient/*.go, internal/pkg/bridge/mqtt.go, internal/pkg/bridge/handler.go, internal/pkg/decoder/process.go, internal/pkg/location/inference.go, internal/pkg/common/appcontext/context.go

The microservices are well-structured and readable, with a consistent lifecycle and shutdown. The main gaps are security (no API auth, weak TLS usage) and reliability (Kafka and DB tuning, retries, context usage). Addressing the P0 and P1 items above would bring the system much closer to production grade (around 7–8/10); adding P2/P3 would further improve operability and resilience.