GameDuo
DEV Team Platform Part Leader 2026.04 — Present
DEV Team Server Developer 2025.01 — 2026.03
2025.01 — Present
Mobile game development and publishing company. Shared backend platform design for 12 live services, data pipeline engineering, and common package system operations
Highlights
DEV Team Platform Part Leader
- Newly formed platform team lacked a task-tracking and sizing framework — Designed a 7-stage workflow (BACKLOG→SIZE REVIEW→DONE), handling rules for 4 task types, transition validators, and a ticket template library, Deployed the standard operating framework immediately upon team formation, with task-type handling rules and decision criteria established from day one
- Existing sizing criteria did not reflect AI-assisted development time savings with Claude Code — Established a 5-dimension scoring framework and per-domain Claude Code time-discount rules (test writing ~80%, pattern replication ~70%), Reached team consensus on the discount rules, ensuring equitable performance measurement before and after Claude Code adoption
DEV Team Server Developer
- Marketing data query cost surge (full-scan billing model) — Evaluated partitioning but rejected due to Write IOPS increase; adopted gRPC distributed reads (Storage Read API), Cut costs 82% ($6.25→$1.1 per TB)
- Batch collection limited to 60 days — Migrated from batch processing to an EDA-based on-demand pipeline, distributing peak load and enabling throughput scaling, Expanded metrics range 6x (60→360 days) and cut processing time 2h→5min
- Code duplication and convention inconsistency across 7 projects — Replaced manual code sync with shared NestJS package system (10 utility modules + Game Server Kit, auto-deploy), Automated upgrades cutting deployment time 3hrs→15min
- Marketing query latency (18s) from fragmented ad platform data (Google Ads/Meta/TikTok) — Normalized read path + index tuning to avoid full-scan on 100-column table, Performance improved 97% (18s→0.5s)
- Regulatory risk from game probability disclosure requirements — Integrated DynamicModule-based probability package across multi-game projects + CDK-based audit log pipeline, Achieved 94%+ integration test coverage (12 suites, 115 tests)
Projects
Marketing Platform Audit Log System
2025.01 ~ 2025.04
Resolved delayed responses to balance issues caused by the inability to track data change history during game operations
- Inability to track data change history across environments and projects — Designed Git-like version control system with UUID-based cross-environment/project entity tracking, Ensured data consistency across 6 games
- Manual entity change recording prone to omission — Applied Event Sourcing-based change tracking with Auditable decorator + TypeORM Subscriber pattern, Automated entity change recording pipeline
- Multi-environment version conflicts during data merge — Developed 3-Way Merge Engine with parent/child entity conflict detection and unique constraint handling, Enabled reliable version merging across 6 games
- Expensive full-snapshot comparison for every version diff — Designed Version Diff Engine with dual strategy: incremental comparison and snapshot comparison based on Base Audit availability, Established cost-efficient version diffing
- Entity tracking failures due to PK dependency during migration and merge — Introduced shared identifier-based entity tracking decoupled from PK dependency, Ensured accurate entity tracking during migration, comparison, and merge
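As an illustration of the merge-engine bullet above, here is a minimal 3-Way Merge sketch in TypeScript, reduced to flat records with hypothetical names; the production engine additionally handles parent/child entity hierarchies and unique constraints.

```typescript
// Minimal 3-way merge sketch (flat records; names are illustrative).
// Per field: a side that did not change from base yields to the other side;
// if both sides changed the field to different values, record a conflict.
type Rec = Record<string, unknown>;

interface MergeResult {
  merged: Rec;
  conflicts: string[]; // field names where both sides diverged from base
}

function threeWayMerge(base: Rec, ours: Rec, theirs: Rec): MergeResult {
  const merged: Rec = {};
  const conflicts: string[] = [];
  const keys = new Set([...Object.keys(base), ...Object.keys(ours), ...Object.keys(theirs)]);
  for (const key of keys) {
    const b = base[key];
    const o = ours[key];
    const t = theirs[key];
    if (o === t) merged[key] = o;        // both agree (or neither changed)
    else if (o === b) merged[key] = t;   // only theirs changed: take theirs
    else if (t === b) merged[key] = o;   // only ours changed: take ours
    else {
      merged[key] = o;                   // true conflict: keep ours, flag for review
      conflicts.push(key);
    }
  }
  return { merged, conflicts };
}
```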
Tech Rationale
Adopted TypeORM EntitySubscriberInterface after dedicated analysis of subscriber behavior and constraints. Designed AOP-based approach combining Auditable decorator + Subscriber to automatically collect entity changes into a standardized audit pipeline.
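The decorator-plus-subscriber idea can be sketched without TypeORM as follows; all names here are hypothetical, and the real implementation hooks `EntitySubscriberInterface` events rather than being called manually.

```typescript
// Sketch of the Auditable-decorator idea: marked entity classes are registered,
// and a subscriber-style hook diffs before/after snapshots into audit records.
const auditableClasses = new Set<Function>();

function Auditable(): ClassDecorator {
  return (target) => {
    auditableClasses.add(target);
  };
}

interface AuditRecord {
  entity: string;
  field: string;
  before: unknown;
  after: unknown;
}

const auditLog: AuditRecord[] = [];

// Stand-in for an afterUpdate subscriber event: only @Auditable entities are tracked.
function afterUpdate(
  instance: object,
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): void {
  if (!auditableClasses.has(instance.constructor)) return;
  for (const field of Object.keys(after)) {
    if (before[field] !== after[field]) {
      auditLog.push({ entity: instance.constructor.name, field, before: before[field], after: after[field] });
    }
  }
}

class Item {
  constructor(public price = 0) {}
}
Auditable()(Item); // applied as a plain call to avoid relying on decorator compiler flags
```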
Marketing Integrated Platform
2025.02 ~ Present
Unified marketing platform managing Google Ads/Meta/TikTok campaigns, creatives, and metrics in a single system, built on a separated, dedicated marketing DB
- Fragmented ad platform management across Google Ads/Meta/TikTok — Delivered unified campaign automation API for creation, deployment, modification, and retention metrics in a single platform, Consolidated multi-platform campaign management into one system
- NAS-S3 sync reliability and restore failures — Separated SQS server-worker flow, automated outbox events, tracked asset_nas_sync state, Secured restore reliability with state-tracked sync pipeline
- Meta asset sync performance bottleneck — Converted ORM single-row saves to bulk processing, Image sync reduced 72% (25.7s→7.1s), DB transactions reduced 95% (10~16s→0.3~0.5s)
- High creative processing Lambda costs on x86 architecture — Migrated Lambda to ARM64 Graviton2 architecture, Reduced creative processing Lambda costs by 20%
- High BigQuery query costs for marketing data reads — Adopted BigQuery Storage Read API with gRPC streaming-based high-performance data reading, Reduced BigQuery costs by 82% ($6.25→$1.1 per TB)
- Marketing query latency (18s) from 100+ column denormalized table — Normalized table into main/time-series/prediction, tuned indexes + cursor pagination, Performance improved 97% (18s→0.5s)
- The Asset table view suffered severe jank — 30 rows × 13 columns (390 cells) re-rendering on every dropdown open or row selection, combined with an infinite useEffect render loop — Applied React.memo with custom comparator to TableRow, extracted per-row cells with isolated hook subscriptions, reduced columns useMemo dependencies from 12 to 2, and guarded controlled-mode useEffect to break the loop, Eliminated unnecessary re-renders and resolved the infinite render loop; shared Table component propagated the fix to 36 downstream tables
- Marketing RCP data was only viewable in spreadsheets, and the existing BigQuery aggregation table combined multiple sources, creating data-source divergence that blocked unified console queries — Switched the BigQuery source table from the multi-source aggregate to a single-source table, introduced Criteria-Result pattern field-metadata/filter-options APIs, and implemented an FE multi-select filter with a TOTAL row aggregation, Structurally resolved the divergence via single-source migration, embedded unified RCP queries in the console, and replaced hardcoded currency/decimal display with BE metadata-driven formatting
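The cursor-pagination part of the latency fix can be sketched as keyset pagination; this is a simplified in-memory model with illustrative names, where the real read path translates to `WHERE id > :cursor ORDER BY id LIMIT :size`.

```typescript
// Keyset (cursor) pagination sketch over an id-sorted collection.
interface Row {
  id: number;
  name: string;
}

function pageAfter(
  rows: Row[],
  cursor: number | null,
  size: number,
): { items: Row[]; nextCursor: number | null } {
  const sorted = [...rows].sort((a, b) => a.id - b.id);
  // Seek past the cursor instead of OFFSET-scanning: cost stays flat per page.
  const items = sorted.filter((r) => cursor === null || r.id > cursor).slice(0, size);
  const nextCursor = items.length === size ? items[items.length - 1].id : null;
  return { items, nextCursor };
}
```

Unlike OFFSET pagination, each page seeks directly to the last-seen key, which is what keeps deep pages from re-scanning earlier rows.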
Tech Rationale
Reduced BigQuery costs 82% via Storage Read API migration. Designed hybrid GCP Pub/Sub → AWS Lambda/SQS pipeline for event-driven processing with Outbox pattern for non-blocking async event publishing.
AWS Lambda Migration & Event-Driven Architecture
2025.06 ~ 2025.08
Resolved batch-job limitations from the marketing metrics 60-day → 360-day expansion and transitioned batch processing to serverless
- Full serverless migration risked operational stability — Designed hybrid architecture keeping API server on EC2 while separating batch/job processing to Lambda, Attained workload-optimized resource usage without disrupting live services
- Batch processing limited to 60-day collection range with 2-hour runtime — Established Event-Driven flow with SQS+Lambda+EventBridge, Reduced batch time 2h→5min, expanded collection 6x (60→360 days)
- Data consistency risk during event publishing — Adopted Transactional Outbox Pattern for scheduled and delayed event publishing, Ensured data consistency across distributed event processing
- Lambda throttling, high log costs, and build OOM issues — Applied Batch Size bulk processing, optimized CloudWatch logging, and remediated the build OOM, Stabilized Lambda operations and reduced operational costs
- DB connection exhaustion during massive Lambda execution — Introduced RDS Proxy connection pooling, Resolved connection exhaustion and stabilized database access
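The Transactional Outbox bullet above can be sketched with in-memory stores; everything here is illustrative (in production the business row and outbox row share one DB transaction, and the relay is a scheduled publisher).

```typescript
// Transactional Outbox sketch: the business write and the outbox row are
// committed together, and a separate relay publishes pending rows, so a
// failed publish is retried rather than lost.
interface OutboxRow {
  id: number;
  payload: string;
  published: boolean;
}

const orders: string[] = [];
const outbox: OutboxRow[] = [];
let nextId = 1;

function placeOrder(name: string): void {
  // One atomic step stands in for a single DB transaction.
  orders.push(name);
  outbox.push({ id: nextId++, payload: `order-created:${name}`, published: false });
}

function relay(publish: (payload: string) => boolean): number {
  let sent = 0;
  for (const row of outbox) {
    if (!row.published && publish(row.payload)) {
      row.published = true; // marked only after a successful publish
      sent++;
    }
  }
  return sent;
}
```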
Tech Rationale
Chose Lambda to decouple batch/event workloads bound to the monolith server, enabling independent deployment. Adopted SQS for async processing to resolve scaling limitations under traffic fluctuation.
Cloud Data Sync System
2025.08 ~ Present
Built S3-based sync and automated DDL management system to resolve dynamic game data inconsistency across environments
- Dynamic game data inconsistency across environments — Built S3-based cross-environment data synchronization across development/staging/production, Unified game data state across all environments
- Manual DDL schema management causing sync failures — Created automated DDL management engine with dynamic PK column type resolution, column type mismatch detection with MODIFY, and automatic index creation/RENAME, Automated schema sync across environments
- Large-scale Cloud Data ingestion and S3 upload bottlenecks — Analyzed and optimized the ingestion and upload pipeline, Resolved large-scale processing delays
- Sync job instability causing operational issues — Implemented job separation, transitioned scheduling approach, tuned timeouts, and introduced non-blocking processing, Secured operational reliability for sync pipeline
- Four CloudData bugs: Redis cache invalidation, DDL SKIP metadata, copy key deletion, excludeCloudData propagation — Diagnosed and fixed all four issues systematically, Stabilized CloudData operations
- Unintended full Cloud Data deletion risk from missing option forwarding — Added excludeCloudData option forwarding to four POST migration paths, Blocked unintended data deletion risk
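The column-type-mismatch part of the DDL engine can be sketched as a schema diff that emits `ALTER ... MODIFY` statements; the shapes and table names below are hypothetical, and the real engine also resolves PK column types and manages indexes.

```typescript
// Schema-drift detection sketch: compare the desired schema against the live
// schema and emit ADD for missing columns, MODIFY for type mismatches.
type Schema = Record<string, string>; // column name -> SQL type

function diffColumns(table: string, desired: Schema, live: Schema): string[] {
  const stmts: string[] = [];
  for (const [col, type] of Object.entries(desired)) {
    if (!(col in live)) {
      stmts.push(`ALTER TABLE ${table} ADD COLUMN ${col} ${type}`);
    } else if (live[col] !== type) {
      stmts.push(`ALTER TABLE ${table} MODIFY COLUMN ${col} ${type}`);
    }
  }
  return stmts;
}
```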
Tech Rationale
Applied S3 Lifecycle policies (30d Glacier IR, 90d expiry) for cost optimization. Switched from event-triggered to scheduled execution to reduce sync miss risk.
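The lifecycle policy described above roughly corresponds to a rule of the following shape (prefix and rule ID are placeholders), as passed to S3's `PutBucketLifecycleConfiguration`:

```typescript
// Sketch of the lifecycle rule shape: 30-day transition to Glacier Instant
// Retrieval, 90-day expiry. Names are illustrative placeholders.
const lifecycleConfiguration = {
  Rules: [
    {
      ID: "sync-artifact-tiering",
      Status: "Enabled",
      Filter: { Prefix: "cloud-data-sync/" },
      Transitions: [{ Days: 30, StorageClass: "GLACIER_IR" }],
      Expiration: { Days: 90 },
    },
  ],
};
```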
Internal Common Library System
2025.07 ~ Present
Development and operation of a NestJS utility library (10 modules) + Game Server Kit (2 packages) for multi-project code consistency
- Common code duplication across services increasing maintenance cost — Designed 10-module library system: core, repository, cache, lock, slack, crypto, smb, hash, type, iac, Unified shared code across 7 projects with standardized dependency management
- Repository module lacking bulk operations and audit logging capabilities — Enhanced Repository module with Bulk/Audit Log/TypeORM type narrowing using overloading + TypeScript generics; refactored 2,000+ lines along SRP boundaries, Improved module extensibility and maintainability
- Concurrency conflicts in multi-instance environments — Engineered distributed lock module with ElastiCache (Redis)-based distributed lock decorator, Enabled safe concurrency control across multi-instance deployments
- Manual library upgrades taking 3 hours across 7 projects — Added workflow_dispatch+matrix and changed-package CI tests on GitHub Packages, Reduced 7-project upgrade runtime 3h→15min
- Game server shared code tightly coupled in monolithic package — Extracted sheet processing module + 5 submodules into independent package, resolved conflicts via 5-branch divergence analysis, Realized independent package versioning and deployment
- Slow CI pipeline (15m47s) due to test infrastructure inefficiency — Enabled ts-jest isolatedModules and added explicit Entity types, Cut CI time 61% (15m47s→6m06s); passed 81 suites/978 tests
Tech Rationale
Implemented ElastiCache (Redis) distributed lock as AOP decorator to separate lock logic from business code. Evaluated 5 options for Jest 30 VM isolation and adopted poolSize=2.
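The lock decorator's acquire/run/release flow can be sketched with an in-memory stand-in for Redis `SET NX PX`; all names are illustrative, and the real module wraps this in an AOP decorator over ElastiCache.

```typescript
// In-memory stand-in for the Redis lock: key -> expiry timestamp (ms).
const locks = new Map<string, number>();

function tryAcquire(key: string, ttlMs: number, now = Date.now()): boolean {
  const expiry = locks.get(key);
  if (expiry !== undefined && expiry > now) return false; // held and not expired
  locks.set(key, now + ttlMs); // mirrors SET key NX PX ttl
  return true;
}

function release(key: string): void {
  locks.delete(key);
}

// Mirrors what the decorator does around a method: acquire, run, always release.
function withLock<T>(key: string, ttlMs: number, fn: () => T): T {
  if (!tryAcquire(key, ttlMs)) throw new Error(`lock busy: ${key}`);
  try {
    return fn();
  } finally {
    release(key); // released even if fn throws
  }
}
```

The TTL is the safety net: if a holder crashes before releasing, the lock expires instead of deadlocking other instances.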
In-Game Multi-Language Translation System
2026.02 ~ Present
Redesigned the in-game notification translation domain model — decoupling AI translation from library-based conversion into a 3-stage pipeline
- Simplified Chinese was modeled inside the AI translation settings domain despite not being AI-supported, and settings storage/retrieval depended on the Traditional Chinese AI translation state — making independent toggling impossible without a breaking change — Extracted the derived-conversion module from the AI translation module into 9 use cases with dedicated entity/repository/scheduler, and applied the global-inheritance pattern for effective settings, Clarified domain boundaries and enabled independent toggling of derived conversion rules
- Decoupling required a breaking API change — Simultaneously deployed FE/BE/Gateway with role permission migration for zero-downtime transition, Completed the breaking API change without incident and cut effective-settings DB queries by 50% (4→2)
- The detect cron lacked version priority, delaying translation of the latest version; 30-second interval further slowed response — Added version ID-based priority ordering to detect SQL, shortened the processing interval from 30s to 10s, and raised batch limit from 100 to 200, Eliminated latest-version translation lag; progress UI plus total-count caching improved user visibility
Tech Rationale
Materialized the responsibility boundary between AI translation and library-based derived conversion inside the code structure. Split the Detection → Processing (AI) → Conversion (library) pipeline into independent modules and secured concurrent request integrity with a 2-stage race guard plus a pessimistic write lock.
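The 2-stage race guard can be sketched in-memory as follows; names and statuses are illustrative, with the second stage standing in for the pessimistic `SELECT ... FOR UPDATE` re-check in the real pipeline.

```typescript
// 2-stage race guard sketch: stage 1 is a fast atomic claim, stage 2
// re-checks the persisted status before doing any work.
type Status = "PENDING" | "PROCESSING" | "DONE";

const claims = new Set<number>();           // stage 1: in-process claim
const statuses = new Map<number, Status>(); // stage 2: persisted job status

function tryProcess(id: number, work: () => void): boolean {
  if (claims.has(id)) return false; // stage 1: already claimed by this process
  claims.add(id);
  try {
    if (statuses.get(id) !== "PENDING") return false; // stage 2: another worker won
    statuses.set(id, "PROCESSING");
    work();
    statuses.set(id, "DONE");
    return true;
  } finally {
    claims.delete(id); // claim always released, even on failure
  }
}
```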
Probability Calculation & Audit Log Analytics Pipeline
2026.02 ~ Present
Built probability calculation package and CDK-based audit log analytics infrastructure to address regulatory risk from lack of game probability verification
- No reusable probability calculation module across game projects — Packaged NestJS DynamicModule with 5 probability functions + Kinesis logging as shared probability package, Integrated into 12 games with standardized probability calculation
- No audit log analytics infrastructure for probability verification — Codified CDK analytics pipeline: Kinesis→Firehose (Dynamic Partitioning+Parquet)→S3→Glue→Athena, Unlocked end-to-end probability audit log querying
- Mock-based tests lacking regression confidence for infrastructure code — Replaced mocks with LocalStack + Testcontainers integration tests, Secured 94%+ coverage (12 suites/115 tests)
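As a sketch of what one of the shared probability functions might look like, here is a weighted draw over a reward table; the table, names, and injectable rng are illustrative assumptions, not the package's actual API.

```typescript
// Weighted draw sketch: each entry's share of the total weight is its
// probability. rng is injectable so the draw is testable deterministically.
interface Entry {
  item: string;
  weight: number;
}

function weightedDraw(table: Entry[], rng: () => number = Math.random): string {
  const total = table.reduce((sum, e) => sum + e.weight, 0);
  let roll = rng() * total; // uniform point in [0, total)
  for (const e of table) {
    roll -= e.weight;
    if (roll < 0) return e.item; // landed inside this entry's band
  }
  return table[table.length - 1].item; // guard against floating-point edge
}
```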
Tech Rationale
Codified Kinesis → Firehose (Dynamic Partitioning + Parquet) → S3 → Glue → Athena pipeline with CDK. Chose Parquet columnar format for Athena SQL cost optimization. Replaced mock-based tests with LocalStack Testcontainers for integration testing.